Senior TensorRT LLM Engineer Jobs

23 jobs (page 1)

Senior Software Development Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... core backend software for LLM inference. + Improve the usability of the TensorRT-LLM library and build systems (CMake) What we need to see: + Masters or… more

NVIDIA (01/10/26)
Senior Deep Learning Software…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities… more

NVIDIA (11/25/25)
Principal Software Engineer - Large-Scale…

NVIDIA (Santa Clara, CA)

…deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... large-scale LLM inference. + Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a… more

NVIDIA (01/10/26)
Software Engineer II - AI/ML, AWS Neuron,…

Amazon (Cupertino, CA)

…responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama ... we're building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews.… more

Amazon (11/27/25)
Senior GenAI Algorithms Engineer…

NVIDIA (Santa Clara, CA)

…and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative LLMs, ... (TensorRT Model Optimizer, Megatron-LM, Megatron-Bridge, Nvidia-NeMo, NeMo-AutoModel, TensorRT-LLM) and open-source frameworks (PyTorch, Hugging Face, vLLM,… more

NVIDIA (12/18/25)
Senior AI Engineer, NeMo Retriever…

NVIDIA (Santa Clara, CA)

…on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection...deployments, etc. + Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM. + Excellent… more

NVIDIA (01/10/26)
Senior DL Algorithms Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang. + Understand, analyze, profile, and… more

NVIDIA (11/06/25)
Senior Deep Learning Algorithm…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... inference. + Convert and deploy models using frameworks such as TensorRT and TensorRT-LLM + Understand, analyze, profile, and optimize performance of… more

NVIDIA (11/06/25)
Senior Staff Machine Learning…

NVIDIA (Santa Clara, CA)

…Today, we are increasingly known as "the AI computing company." We are seeking a Senior Staff Machine Learning Engineer to join our Enterprise AI team and build ... frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus. + Proven track record to take a significant… more

NVIDIA (01/12/26)
Senior Performance Engineer - AI…

Red Hat (Boston, MA)

**About the Job** The Red Hat Performance and Scale Engineering team is seeking a Senior Performance Engineer to join our PSAP (Performance and Scale for AI ... for example. This is a dynamic role for a seasoned engineer with a growth mindset who handles and adapts...PyTorch Profiler, among others + Hands-on experience with modern LLM inference server stacks (e.g., vLLM, TensorRT -… more

Red Hat (01/05/26)
AI Senior Staff Systems Engineer

Cadence Design Systems, Inc. (San Jose, CA)

…quantization, distillation, and using high-performance serving frameworks (e.g., vLLM, TGI, TensorRT-LLM) to maximize inference throughput and minimize latency. + ... implementing CI/CD pipelines for AI model development. + Advanced LLM Deployment & Optimization: Lead the deployment, serving, and...AI infrastructure. Proven track record as a Principal or Senior Staff Engineer. + Expert-level knowledge of… more

Cadence Design Systems, Inc. (12/29/25)
Lead Senior Software Engineer…

NVIDIA (Santa Clara, CA)

…and see how you can make a lasting impact on the world. NVIDIA is seeking a Senior Software Engineer to serve as a Tech Lead, driving the design and delivery of ... Experience developing for GPU platforms and familiarity with NVIDIA technologies (e.g., CUDA, TensorRT, Triton, NeMo) and LLM serving frameworks (e.g., Dynamo, vLLM,… more

NVIDIA (01/13/26)
Senior Applied Agent Research…

NVIDIA (Santa Clara, CA)

…agents, or ML algorithms. + Experience with NVIDIA AI platforms: NeMo, NIMs, TensorRT-LLM, RAPIDS. NVIDIA offers competitive salaries and a generous benefits ... KGMON (Kaggle Grandmasters of NVIDIA) team is seeking a Senior Applied Research Scientist passionate about AI agents with...crowd: + Kaggle Grandmaster status with gold medals in NLP/LLM contests, or several top-10 placements in prominent ML… more

NVIDIA (01/10/26)
Senior Software Engineer, AI…

NVIDIA (CA)

…you can make a lasting impact on the world. We are now looking for a Senior System Software Engineer to work on user-facing tools for Dynamo Inference Server! ... the crowd: + Experience with inference-serving frameworks (e.g., Dynamo Inference Server, TensorRT, ONNX Runtime) and deploying/managing LLM inference pipelines at… more

NVIDIA (11/29/25)
Senior Software Development Engineer…

Amazon (Seattle, WA)

…responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama ... will: * work with state-of-the-art LLMs, open-source and internal LLM families, large-scale performance and benchmark evaluations etc., * develop and performance… more

Amazon (01/06/26)
Senior ML Software Engineer…

Microsoft Corporation (Redmond, WA)

…develop novel quantization and numerics kernels to enable efficient deployment of LLM inference and training in Microsoft's Azure production environments. + Drive ... of quantized models. + Analyze performance bottlenecks in quantized state-of-the-art LLM architectures and drive performance improvements. + Prototype and evaluate… more

Microsoft Corporation (11/26/25)
Sr Principal Machine Learning Engineer…

Palo Alto Networks (Santa Clara, CA)

…while ensuring a formidable security posture from development through runtime. As a Senior Principal Machine Learning Engineer, you will drive research on ... mechanisms and related knowledge is a plus. + Demonstrated expertise with modern LLM inference engines (e.g., vLLM, SGLang, TensorRT-LLM) is required.… more

Palo Alto Networks (01/13/26)
Senior Technical Marketing Engineer…

NVIDIA (Santa Clara, CA)

…full-stack software ecosystem to power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our growing accelerated computing product ... TensorFlow, JAX), and inference-specific frameworks & optimizations (Triton Inference Server, TensorRT-LLM, vLLM, SGLang). + Market Awareness - Experience… more

NVIDIA (11/06/25)
LLMOps Engineer

Steampunk (Mclean, VA)

…** to design, implement, and maintain production-grade large-language-model (LLM) pipelines, deployment architectures, and monitoring systems across enterprise ... environments. The Senior LLMOps Engineer will play a critical...models using frameworks such as Hugging Face Transformers, vLLM, TensorRT-LLM, or similar. + Proficiency in Python… more

Steampunk (11/18/25)
Senior Machine Learning Engineer

CAE USA INC (Arlington, TX)

…and having fun! Summary We are seeking a highly skilled and experienced Machine Learning Engineer to join our growing AI & Data Science team in R&D. This role is ... scalable and efficient model serving infrastructure using tools like ONNX, TensorRT, DeepSpeed, or vLLM. + Implement retrieval-augmented generation (RAG) pipelines… more

CAE USA INC (12/11/25)