  • Senior GenAI Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    …streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative ... and diffusion models. In this role, you will design, implement, and productionize model optimization algorithms for inference and deployment on NVIDIA's latest…
    NVIDIA (01/10/26)
  • Senior Software Development Engineer

    Amazon (Seattle, WA)
    …with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team ... culture. The team works closely with customers on their model enablement, providing direct support and optimization expertise to ... models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side…
    Amazon (01/06/26)
  • Senior Software Development Engineer

    Amazon (Cupertino, CA)
    …with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team ... culture. The team works closely with customers on their model enablement, providing direct support and optimization expertise to ... models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side…
    Amazon (11/05/25)
  • Senior Software Engineer

    MongoDB (Palo Alto, CA)
    **About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic ... with Atlas and designed for developer-first experiences. As a Senior Engineer, you'll focus on building core systems and services that power model inference at scale. You'll own key…
    MongoDB (01/08/26)
  • Senior Software Engineer, AI…

    NVIDIA (CA)
    …how you can make a lasting impact on the world. We are now looking for a Senior System Software Engineer to work on user facing tools for Dynamo Inference ... data scientists. What you'll be doing: + Build and maintain distributed model management systems, including Rust-based runtime components, for large-scale AI …
    NVIDIA (11/29/25)
  • Senior Principal Machine Learning…

    Red Hat (Boston, MA)
    …bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... the vLLM and LLM-D projects, and inventors of state-of-the-art techniques for model quantization and sparsification, our team provides a stable platform for…
    Red Hat (01/08/26)
  • Senior DL Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    …leads the AI revolution. What you will be doing: + Implement language and multimodal model inference as part of NVIDIA Inference Microservices (NIMs). + ... We are now looking for a Senior DL Algorithms Engineer! NVIDIA is ... bugs and deliver production code to TRT-LLM, NVIDIA's open-source inference serving library. + Profile and analyze bottlenecks across…
    NVIDIA (01/08/26)
  • Senior Deep Learning Software…

    NVIDIA (Santa Clara, CA)
    NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and ... frameworks, which are at the forefront of efficient large-scale model serving and inference. You will play ... are growing fast. If you're a creative and autonomous engineer with a genuine passion for technology, we want…
    NVIDIA (12/07/25)
  • Senior Deep Learning Software…

    NVIDIA (Santa Clara, CA)
    NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and ... SGLang and vLLM, which are at the forefront of efficient large-scale model serving and inference. You will play a central role in improving these platforms,…
    NVIDIA (12/05/25)
  • Senior Engineer -AI Inference

    Bank of America (Addison, TX)
    Senior Engineer -AI Inference Addison, Texas; Plano, Texas; Newark, Delaware; Charlotte, North Carolina; Kennesaw, Georgia **To proceed with your application, ... must be at least 18 years of age.** Acknowledge (https://ghr.wd1.myworkdayjobs.com/Lateral-US/job/Addison/ Senior - Engineer -AI- Inference \_25029879) **Job Description:** At Bank…
    Bank of America (12/22/25)
  • Senior Software Engineer - vLLM…

    Red Hat (Boston, MA)
    …for enterprises to build, optimize, and scale LLM deployments. We are seeking an experienced Senior ML Ops engineer to work closely with our product and research ... open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings ... deep learning products and software. As an ML Ops engineer, you will work closely with our technical and…
    Red Hat (12/06/25)
  • Senior Principal Machine Learning…

    Red Hat (Boston, MA)
    …bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... maintainers of the vLLM project, and inventors of state-of-the-art techniques for model quantization and sparsification, our team provides a stable platform for…
    Red Hat (01/08/26)
  • AI Inference Engineer

    quadric.io, Inc (Burlingame, CA)
    model deployment for efficient inference; [3] profile and benchmark the model performance. This senior technical role demands deep knowledge of AI ... conventional C++ DSP and control code. Role: The AI Inference Engineer in Quadric is the key ... Electric Engineering. + 5+ years of experience in AI/LLM model inference and deployment frameworks/tools + experience…
    quadric.io, Inc (11/25/25)
  • Machine Learning Engineer, AWS Neuron…

    Amazon (Seattle, WA)
    …The Neuron Inference Technology team works side by side with the Inference Model Enablement, compiler runtime engineers to create, build and tune ... that use them. This role is for a software engineer in the Machine Learning Applications (ML Apps) team ... and performance tunes building blocks for all key ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek and…
    Amazon (12/24/25)
  • Machine Learning Engineer, AWS Neuron…

    Amazon (Seattle, WA)
    …and the Trn1 and Inf1 servers that use them. This role is for a software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This role is ... and performance tuning of a wide variety of ML model families, including massive scale large language models like ... and runtime engineers to create, build and tune distributed inference solutions with Trn1. Experience optimizing inference …
    Amazon (12/13/25)
  • Senior Deep Learning Inference

    NVIDIA (Durham, NC)
    Senior Deep Learning Inference Performance Architect! NVIDIA is seeking a Senior Performance Architect - a creative engineer who loves to squeeze out ... to extend the state of the art in AI Inference performance and efficiency + Model, analyze and prototype key deep learning algorithms and applications…
    NVIDIA (01/10/26)
  • Software Engineer -AI/ML, AWS Neuron…

    Amazon (Seattle, WA)
    …cloud-scale machine learning accelerators. This role is for a senior software engineer in the Machine Learning Inference Applications team. This role is ... for development and performance optimization of core building blocks of LLM Inference - Attention, MLP, Quantization, Speculative Decoding, Mixture of Experts, etc.…
    Amazon (12/21/25)
  • Senior Software Engineer

    Red Hat (Raleigh, NC)
    …serve models, and deliver innovative apps. The OpenShift AI team seeks a Software Engineer with Kubernetes and Model Inference Runtimes experience to join ... packaging, such as PyPI libraries + Solid understanding of the fundamentals of model inference architectures + Experience with Jenkins, Git, shell scripting, and…
    Red Hat (12/06/25)
  • Senior DL Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    …and model post-training. + Deep understanding of distributed systems for large-scale model inference and serving. Your base salary will be determined based ... We are now looking for a Senior DL Algorithms Engineer! We are ... programming skills in Python and C++. + Experience with model quantization and modern inference optimization techniques…
    NVIDIA (11/06/25)
  • Senior Software Engineer, Machine…

    Google (Sunnyvale, CA)
    …like PyTorch or JAX. + 3 years of experience in software development for machine learning model inference or machine learning model training, and 1 year of ... Senior Software Engineer, Machine Learning, Kernel ... experience with ML model inference and training optimization on modern GPU/TPU architectures. **Preferred…
    Google (12/27/25)