ML Inference Engineer Systems Jobs

277 jobs (page 1)

Staff ML Engineer, Inference…

General Motors (Sunnyvale, CA)

…the business. **This job is eligible for relocation assistance.** **About the Team:** The ML Inference Platform is part of the AI Compute Platforms organization ... efficiency. **About the Role:** We are seeking a Staff ML Infrastructure engineer to help build and...shaping the architecture, roadmap and user-experience of a robust ML inference service supporting real-time, batch, and… more

General Motors (10/21/25)
Software Development Engineer - AI/…

Amazon (Seattle, WA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (12/31/25)
Senior Software Development Engineer, AI/…

Amazon (Seattle, WA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (01/06/26)
Software Development Engineer AI/ML…

Amazon (Cupertino, CA)

…applications. Key job responsibilities * Architect and lead the design of distributed ML serving systems optimized for generative AI workloads * Drive technical ... the boundaries of what's possible in large-scale ML serving. Recent shares: https://github.com/aws-neuron/upstreaming-to-vllm/releases/tag/2.25.0 https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd- inference… more

Amazon (12/21/25)
Machine Learning Engineer, AWS Neuron…

Amazon (Seattle, WA)

…the Trn2 and future Trn3 servers that use them. This role is for a software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This ... enables and performance tunes building blocks for all key ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek...Llama3, GPT OSS, Qwen3, DeepSeek and beyond. The Neuron Inference Technology team works side by side with the… more

Amazon (12/24/25)
Software Engineer -AI/ML, AWS…

Amazon (Seattle, WA)

…Trainium cloud-scale machine learning accelerators. This role is for a senior software engineer in the Machine Learning Inference Applications team. This role is ... for development and performance optimization of core building blocks of LLM Inference - Attention, MLP, Quantization, Speculative Decoding, Mixture of Experts, etc.… more

Amazon (12/21/25)
Senior Software Engineer, AI…

NVIDIA (Santa Clara, CA)

…highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You'll architect ... that pushes the Pareto frontier for the field of ML Systems; survey recent publications and find...theories. + Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines… more

NVIDIA (01/10/26)
Lead Engineer, Inference Platform

MongoDB (Palo Alto, CA)

…in multi-tenant environments + 1+ years of experience serving as TL for a large-scale ML inference or training platform SW project **Nice to Have** + Prior ... We're looking for a Lead Engineer, Inference Platform to join our...of experience in managing a technical team focused on ML inference or training infrastructure **Why Join… more

MongoDB (12/27/25)
Senior Software Engineer, Inference…

MongoDB (Palo Alto, CA)

…for developer-first experiences. As a Senior Engineer, you'll focus on building core systems and services that power model inference at scale. You'll own key ... **About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic… more

MongoDB (01/08/26)
Senior Software Engineer, AI…

NVIDIA (CA)

…benchmarking, automation, and documentation processes to ensure low-latency, robust, and production-ready inference systems on GPU clusters. What we need to see: ... systems, including Rust-based runtime components, for large-scale AI inference workloads. + Implement inference scheduling and deployment solutions… more

NVIDIA (11/29/25)
Senior Principal Machine Learning Engineer…

Red Hat (Boston, MA)

…collaborate with our team to tackle the most pressing challenges in scalable inference systems and Kubernetes-native deployments. Your work with distributed ... bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI… more

Red Hat (01/08/26)
Lead AI Engineer (FM Hosting, LLM…

Capital One (San Francisco, CA)

Lead AI Engineer (FM Hosting, LLM Inference) **Overview** At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For ... cost, latency, throughput - of large scale production AI systems. + Contribute to the technical vision and the...and supporting AI services + Experience developing AI and ML algorithms or technologies (e.g. LLM Inference,… more

Capital One (11/04/25)
Senior GenAI Algorithms Engineer - Model…

NVIDIA (Santa Clara, CA)

…open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal and ... as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from quantization, speculative decoding, sparsity,… more

NVIDIA (01/10/26)
Machine Learning Engineer, AWS Neuron…

Amazon (Cupertino, CA)

…and the Trn1 and Inf1 servers that use them. This role is for a software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This role ... for development, enablement and performance tuning of a wide variety of ML model families, including massive scale large language models like Llama2, GPT2,… more

Amazon (01/14/26)
Senior System Software Engineer - Dynamo…

NVIDIA (CA)

…including debugging, performance analysis, and test design. + Experience with high scale distributed systems and ML systems Ways to stand out from the ... We are now looking for a Senior System Software Engineer to work on Dynamo & Triton Inference...crowd: + Prior work experience improving performance of AI inference systems. + Background with deep learning… more

NVIDIA (01/08/26)
Machine Learning Engineer, vLLM…

Red Hat (Raleigh, NC)

…the power of open-source LLMs and vLLM to every enterprise. The Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... optimize, and scale LLM deployments. As a Machine Learning Engineer focused on vLLM, you will be at the...you. Join us in shaping the future of AI Inference! **What You Will Do** + Write robust Python… more

Red Hat (12/31/25)
Senior Engineer -AI Inference

Bank of America (Addison, TX)

Senior Engineer -AI Inference Addison, Texas; Plano, Texas; Newark, Delaware; Charlotte, North Carolina; Kennesaw, Georgia **To proceed with your application, you ... must be at least 18 years of age.** Acknowledge (https://ghr.wd1.myworkdayjobs.com/Lateral-US/job/Addison/Senior-Engineer-AI-Inference_25029879) **Job Description:** At Bank of America,… more

Bank of America (12/22/25)
Senior Principal Machine Learning Engineer…

Red Hat (Boston, MA)

…bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... (https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/#the-top-open-source-projects-by-contributors) on Github. As a Machine Learning Engineer focused on vLLM, you will be… more

Red Hat (01/08/26)
AI / ML Engineer

Guidehouse (Huntsville, AL)

…to 10% **Clearance Required** **:** Active Top Secret (TS) Guidehouse is seeking a Lead AI/ML Engineer to join our Technology / AI and Data team, supporting ... AI solutions that leverage large language models (LLMs), retrieval systems, and secure, scalable inference pipelines to...You Will Do** **:** + Serves as the lead AI/ML engineer responsible for developing, optimizing, and… more

Guidehouse (01/01/26)
Sr. Software Engineer - AI/ML, AWS…

Amazon (Seattle, WA)

…in Python and ML framework internals * Strong understanding of distributed systems and ML optimization * Passion for performance tuning and system ... AI accelerators Inferentia and Trainium. As a Senior Software Engineer in our Machine Learning Applications team, you'll be...architects to influence the future of AI hardware. Our systems handle millions of inference calls daily,… more

Amazon (10/31/25)