Senior TensorRT-LLM Engineer Jobs

29 jobs (page 1)

Principal Software Engineer - Large-Scale…

NVIDIA Corporation (Santa Clara, CA)

Principal Software Engineer - Large-Scale LLM Memory and Storage Systems ... of any single GPU, this platform enables efficient, resilient deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the… more

job goal (01/14/26)
Principal Software Engineer - Large-Scale…

NVIDIA (Santa Clara, CA)

…deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... support large-scale LLM inference. Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with… more

job goal (01/14/26)
Machine Learning Engineer, LLM…

First Soft Solutions LLC (San Jose, CA)

Machine Learning Engineer, LLM Fine‑Tuning. We are actively hiring for a Machine Learning Engineer focused on LLM fine‑tuning for Verilog/RTL ... model invocation where it fits, and/or low‑latency self‑hosted inference (vLLM/TensorRT‑LLM), autoscaling, and canary/blue‑green rollouts. Build an evaluation… more

job goal (01/14/26)
Senior Software Engineer - AI/ML…

GEICO (Palo Alto, CA)

…Great Rewards and Great Careers. GEICO AI ML Infrastructure team is seeking an exceptional Senior ML Platform Engineer to build and scale our machine learning ... maintain feature stores for ML model training and inference pipelines + Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and… more

job goal (01/14/26)
Senior Software Engineer, Model…

Apple Inc. (San Francisco, CA)

Senior Software Engineer, Model Inference San Francisco Bay Area, California, United States Software and Services Join Apple Maps to help build the best map in ... measurable results at global scale. Description As a Software Engineer on the Apple Maps team, you will lead ... and Speculative Decoding. Skilled in GPU optimization (e.g., CUDA, TensorRT-LLM, cuDNN) to accelerate inference tasks. Skilled… more

job goal (01/14/26)
Machine Learning Engineer, 6+ Years…

Carlsbad Tech (San Francisco, CA)

Machine Learning Engineer, 6+ Years Experience job at Twelve Labs. San Francisco, CA. Who We Are At Twelve Labs, we are pioneering the development of frontier ... worldwide innovation. About The Role As a Machine Learning Engineer at Twelve Labs, you will drive our ML ... the role. This role is a perfect fit for senior engineers who get excited by the prospect of… more

job goal (01/14/26)
Senior Software Development Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... core backend software for LLM inference. + Improve the usability of the TensorRT-LLM library and build systems (CMake) What we need to see: + Masters or… more

NVIDIA (01/10/26)
Senior Deep Learning Software…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities… more

NVIDIA (11/25/25)
Principal Software Engineer - Large-Scale…

NVIDIA (Santa Clara, CA)

…deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... large-scale LLM inference. + Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a… more

NVIDIA (01/10/26)
Software Engineer II - AI/ML, AWS Neuron,…

Amazon (Cupertino, CA)

…responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama ... we're building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews.… more

Amazon (11/27/25)
Senior GenAI Algorithms Engineer…

NVIDIA (Santa Clara, CA)

…and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative LLMs, ... (TensorRT Model Optimizer, Megatron-LM, Megatron-Bridge, Nvidia-NeMo, NeMo-AutoModel, TensorRT-LLM) and open-source frameworks (PyTorch, Hugging Face, vLLM,… more

NVIDIA (12/18/25)
Senior AI Engineer, NeMo Retriever…

NVIDIA (Santa Clara, CA)

…on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection ... deployments, etc. + Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM. + Excellent… more

NVIDIA (01/10/26)
Senior DL Algorithms Engineer…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang. + Understand, analyze, profile, and… more

NVIDIA (11/06/25)
Senior Deep Learning Algorithm…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... inference. + Convert and deploy models using frameworks such as TensorRT and TensorRT-LLM + Understand, analyze, profile, and optimize performance of… more

NVIDIA (11/06/25)
Senior Staff Machine Learning…

NVIDIA (Santa Clara, CA)

…Today, we are increasingly known as "the AI computing company." We are seeking a Senior Staff Machine Learning Engineer to join our Enterprise AI team and build ... frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus. + Proven track record to take a significant… more

NVIDIA (01/12/26)
Senior Performance Engineer - AI…

Red Hat (Boston, MA)

About the Job: The Red Hat Performance and Scale Engineering team is seeking a Senior Performance Engineer to join our PSAP (Performance and Scale for AI ... for example. This is a dynamic role for a seasoned engineer with a growth mindset who handles and adapts ... PyTorch Profiler, among others + Hands-on experience with modern LLM inference server stacks (e.g., vLLM, TensorRT-… more

Red Hat (01/05/26)
AI Senior Staff Systems Engineer

Cadence Design Systems, Inc. (San Jose, CA)

…quantization, distillation, and using high-performance serving frameworks (e.g., vLLM, TGI, TensorRT-LLM) to maximize inference throughput and minimize latency. + ... implementing CI/CD pipelines for AI model development. + Advanced LLM Deployment & Optimization: Lead the deployment, serving, and ... AI infrastructure. Proven track record as a Principal or Senior Staff Engineer. + Expert-level knowledge of… more

Cadence Design Systems, Inc. (12/29/25)
Lead Senior Software Engineer…

NVIDIA (Santa Clara, CA)

…and see how you can make a lasting impact on the world. NVIDIA is seeking a Senior Software Engineer to serve as a Tech Lead, driving the design and delivery of ... Experience developing for GPU platforms and familiarity with NVIDIA technologies (e.g., CUDA, TensorRT, Triton, NeMo) and LLM serving frameworks (e.g., Dynamo, vLLM,… more

NVIDIA (01/13/26)
Senior Applied Agent Research…

NVIDIA (Santa Clara, CA)

…agents, or ML algorithms. + Experience with NVIDIA AI platforms: NeMo, NIMs, TensorRT-LLM, RAPIDS. NVIDIA offers competitive salaries and a generous benefits ... KGMON (Kaggle Grandmasters of NVIDIA) team is seeking a Senior Applied Research Scientist passionate about AI agents with ... crowd: + Kaggle Grandmaster status with gold medals in NLP/LLM contests, or several top-10 placements in prominent ML… more

NVIDIA (01/10/26)
Senior Software Engineer, AI…

NVIDIA (CA)

…you can make a lasting impact on the world. We are now looking for a Senior System Software Engineer to work on user-facing tools for Dynamo Inference Server! ... the crowd: + Experience with inference-serving frameworks (e.g., Dynamo Inference Server, TensorRT, ONNX Runtime) and deploying/managing LLM inference pipelines at… more

NVIDIA (11/29/25)