- Pantera Capital (San Francisco, CA)
- Location: San Francisco. Employment Type: Full time. Location Type: Hybrid. Department: AI. We are looking for an AI Inference Engineer to join our growing ... learning models for real-time inference. Responsibilities: Develop APIs for AI inference that will be used by both internal and external customers; Benchmark…
- Capital One (Fredericksburg, VA)
- Lead AI Engineer (FM Hosting, LLM Inference). Overview: At... AI and ML algorithms or technologies (e.g., LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python, ... Salary: $211,000 - $240,800 for Lead AI Engineer in New York, NY; $211,000 ... in San Francisco, CA; $211,000 - $240,800 in San Jose, CA…
- Virtue AI (San Francisco, CA)
- An innovative AI security company in San Francisco is seeking an Inference Engineer who will be pivotal in optimizing ML model inference. The role ... deep knowledge of serving LLMs and experience in designing inference APIs. Candidates should be comfortable in a fast-paced... presents an opportunity to work at the cutting edge of AI security with competitive compensation and growth potential.
- Pantera Capital (San Francisco, CA)
- A financial technology firm in San Francisco is seeking an experienced AI Inference Engineer to develop APIs for AI inference used by both internal and external customers. Candidates should have experience with machine learning systems and deep learning frameworks like PyTorch, and familiarity with LLM architectures. The role supports a hybrid work environment and offers a competitive salary,…
- San Francisco Compute Co. (San Francisco, CA)
- A cutting-edge technology firm in San Francisco is seeking an engineer for Large Scale Inference. You will build and scale software systems to optimize compute for inference workloads. The ideal candidate enjoys software craftsmanship, is a strong communicator, and has an appreciation for reliable systems. The role…
- Menlo Ventures (San Francisco, CA)
- …public benefit corporation in San Francisco seeks a skilled software engineer to join the inference team. This role involves building systems that power AI models like Claude, focusing on maximizing efficiency and enabling groundbreaking research. Ideal candidates have a background in distributed systems,…
- Databricks Inc. (San Francisco, CA)
- Staff Software Engineer - GenAI Inference (P-1285). About This Role: As a staff software engineer for GenAI inference, you will lead the architecture, development, and optimization of the inference engine that powers the Databricks Foundation Model API. You'll... Intelligence Platform to unify and democratize data, analytics, and AI. Databricks is headquartered in San Francisco,…
- San Francisco Compute Co. (San Francisco, CA)
- …that's what we make: a liquid market for GPU offtake. About the Role: As an engineer working on Large Scale Inference, you will build and scale software systems ... application layer companies sign multi-year contracts for compute and inference, but sell to customers on monthly subscriptions. If... by selling it back to the market? Otherwise, as AI scales, compute only becomes available to folks who…
- Menlo Ventures (San Francisco, CA)
- About This Role: As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks' ... and efficient. Your work will touch the full GenAI inference stack - from kernels and runtimes to orchestration... Intelligence Platform to unify and democratize data, analytics, and AI. Databricks is headquartered in San Francisco,…
- OpenAI (San Francisco, CA)
- A leading AI research company in San Francisco seeks an engineer to optimize their powerful AI models for high-volume production environments. The ideal candidate has over 5 years of software engineering experience, strong familiarity with ML architectures, and experience with distributed systems. This role involves collaboration with researchers and a focus on performance optimization. Compensation ranges from…
- Arcade (San Francisco, CA)
- A pioneering technology company in San Francisco seeks a Software Engineer to build scalable backend systems and enhance generative AI workflows. The ideal ... design efficient architecture for model execution and collaborate closely with product and AI teams. This role offers the opportunity to innovate in a fast-paced…
- OpenAI (San Francisco, CA)
- …build the load balancer that will sit at the very front of our research inference stack - routing the world's largest AI models with millisecond precision and ... About the Team: Our Inference team brings OpenAI's most capable research and... and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they've…
- Baseten (San Francisco, CA)
- A technology startup in San Francisco is seeking a skilled individual to enhance the API infrastructure supporting AI models. The role involves designing and optimizing backend services, focusing on performance and reliability. Candidates should have over 3 years of experience with distributed systems and be comfortable debugging complex systems. This unique opportunity includes a competitive compensation package…
- Mvp VC (San Francisco, CA)
- …aerospace company in San Francisco is seeking a skilled software engineer to optimize and integrate the Ultimate Edge SDK for embedded platforms. Key responsibilities include collaborating on performance tuning and ensuring efficient deployment on NVIDIA hardware. Required qualifications include a Master's in Computer Engineering, expertise in C++/Python, and familiarity with containerization technologies.
- Loft Orbital Solutions (San Francisco, CA)
- A leading space technology company in San Francisco is seeking a skilled engineer to contribute to the development and optimization of the Ultimate Edge SDK. The role focuses on integrating ONNX-based runtimes and optimizing performance across embedded platforms. Candidates should have a master's degree and solid experience in C++ or Python, along with familiarity with embedded systems. This position offers a salary of…
- OpenAI (San Francisco, CA)
- About the Team: Our Inference team brings OpenAI's most capable research and technology to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they've... to before. We focus on performant and efficient model inference, as well as accelerating research progression via model…
- OpenAI (San Francisco, CA)
- …on delivering a world-class developer experience while pushing the boundaries of what AI can do. We're expanding into multimodal inference, building the ... About the Team: OpenAI's Inference team powers the deployment of our most... research. About the Role: We're looking for a software engineer to help us serve OpenAI's multimodal models at…
- Nutanix (San Diego, CA)
- A global technology leader in San Diego is seeking an AI Software Engineer to develop on-device software solutions using Python and C/C++. The ideal ... will work in a dynamic environment, collaborating with researchers to advance Gen AI technology. A strong background in software engineering and experience with deep…
- Capital One (San Francisco, CA)
- A leading financial services provider in San Francisco is seeking a Technical Specialist to develop AI and ML solutions. You'll need a strong foundation in ... 4 years of experience programming in Python and deploying AI on cloud platforms. The ability to optimize solutions and a passion for AI research are essential. This role offers a competitive…
- Menlo Ventures (San Francisco, CA)
- …researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. About the role: Our Inference team is responsible ... Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be... by serving our models via the industry's largest compute-agnostic inference deployments. We are responsible for the entire stack…