- Meta (Menlo Park, CA)
- …open source, cutting-edge, and industry-leading. **Required Skills:** ML Framework Software Engineer (PhD) Responsibilities: 1. Develop the PT2 compiler (e.g., ... PT2 adoption through direct engagements with OSS and industry users. The PyTorch Compiler team is dedicated to making PyTorch run faster and more resource-efficient…
- Amazon (Cupertino, CA)
- …and Trainium ML accelerators. This comprehensive toolkit includes an ML compiler, runtime, and application framework that seamlessly integrates with popular ML ... layers - from frameworks and kernels and collaborate with compiler to runtime and collectives. We not only optimize ... - Familiar with syntax and tile-level semantics similar to Triton. - Experience with online/offline inference serving with vLLM,…
- Amazon (Cupertino, CA)
- …and Trainium ML accelerators. This comprehensive toolkit includes an ML compiler, runtime, and application framework that seamlessly integrates with popular ML ... ML inference and training performance. As part of the broader Neuron Compiler organization, our team works across multiple technology layers - from frameworks…
- NVIDIA (Santa Clara, CA)
- …across multi-GPU, multi-node, and multi-cloud environments. You'll collaborate across inference, compiler, scheduling, and performance teams to push the frontier of ... + Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization;…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate about ... deployment algorithms and optimizations using TensorRT LLM, vLLM, SGLang, Triton and CUDA kernels. Work and collaborate with a ... Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation + Prior…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior Deep Learning Software Engineer, PyTorch-TensorRT Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... NVIDIA accelerators, from datacenter GPUs to edge SoCs. Implement graph compiler algorithms, frontend operators and code generators across the PyTorch,…
- NVIDIA (Austin, TX)
- …workloads and models. We are looking for an outstanding Senior Software Engineer who can architect and implement these highly scalable solutions to different ... of the team, you will develop new innovative workflows, work on compiler- or runtime-driven solutions that accelerate critical workloads, generate optimal code…
- quadric.io, Inc (Burlingame, CA)
- …NN graph code and conventional C++ DSP and control code. Role: The AI Kernel Engineer at Quadric plays a key role in enabling a large number of AI kernels/operators ... to run efficiently on the Quadric platform. The AI Kernel Engineer at Quadric will [1] develop a highly efficient Quadric kernel library for a variety of AI/LLM…
- Google (Sunnyvale, CA)
- Staff Software Engineer, TPU Performance, CoreML. Google, Sunnyvale, CA, USA. **Advanced** Experience owning outcomes and decision making, ... cross-functional, or cross-business projects. + 3 years of experience with compiler optimization, code generation, and runtime systems for GPU architectures…
- NVIDIA (Santa Clara, CA)
- …automation. + Experience integrating AI with GPU computing workflows (e.g., CUDA, PyTorch, Triton, or compiler toolchains). + Knowledge of planning algorithms or ... Learning Safety team is looking for a Senior Software Engineer to build intelligent, autonomous software for the next ... Learning Safety team is looking for a Senior Software Engineer to design and implement AI-driven agents that autonomously…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior Deep Learning Software Engineer, FlashInfer. NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for ... Ways to stand out from the crowd: + Background in domain-specific compiler and library solutions for LLM inference and training (e.g., FlashInfer, Flash Attention)…