  • Software Engineer III - Search…

    JPMorgan Chase (Chicago, IL)
    …contribute your skills in enabling enterprise-wide content discovery. As a Software Engineer III at JPMorgan Chase within the Employee Platforms, Enterprise Search ... and emerging technologies **Required qualifications, capabilities, and skills** + Formal training or certification on software engineering concepts and 3+ years… more
    JPMorgan Chase (01/10/26)
  • Software Engineer II - AI/ML, AWS Neuron,…

    Amazon (Cupertino, CA)
    …ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team is at the forefront ... to work at the intersection of machine learning, high-performance computing, and distributed architectures, where you'll help shape the future of AI acceleration… more
    Amazon (11/27/25)
  • Lead Software Engineer - Full…

    JPMorgan Chase (Plano, TX)
    **Lead Software Engineer - Python/Java/AWS/Cloud - 603** **Organization Description** Our Consumer & Community Banking division serves our Chase customers through a ... an accommodation. We are seeking a highly skilled and innovative Lead Software Engineer with a strong focus on automation and AI solutioning. The ideal candidate… more
    JPMorgan Chase (12/12/25)
  • Senior Principal Machine Learning Engineer

    Red Hat (Boston, MA)
    …optimize, and scale LLM deployments. As a Machine Learning Engineer focused on distributed vLLM (https://github.com/vllm-project/) infrastructure in the ... to GenAI deployments. As leading developers, maintainers of the vLLM and LLM-D projects, and inventors of state-of-the-art techniques for model quantization and… more
    Red Hat (01/08/26)
  • Sr. Software Engineer - AI/ML, AWS Neuron…

    Amazon (Seattle, WA)
    …as well as Stable Diffusion, Vision Transformers (ViT) and many more. The ML Distributed Training team works side by side with chip architects, compiler ... accelerators. This role is for a Senior Machine Learning Engineer in the Distributed Training team for ... engineers and runtime engineers to create, build and tune distributed training solutions with Trainium instances. Experience… more
    Amazon (12/31/25)
  • Sr. Software Engineer - AI/ML, AWS Neuron…

    Amazon (Cupertino, CA)
    …as well as Stable Diffusion, Vision Transformers (ViT) and many more. The ML Distributed Training team works side by side with chip architects, compiler ... accelerators. This role is for a Senior Machine Learning Engineer in the Distributed Training team for ... engineers and runtime engineers to create, build and tune distributed training solutions with Trainium instances. Experience… more
    Amazon (12/19/25)
  • Forward Deployed Engineer, AI Inference…

    Red Hat (Sacramento, CA)
    …directly with the engineering teams at our customer to deploy, optimize, and scale distributed Large Language Model (LLM) inference systems. You will solve " ... developer to join our team as a **Forward Deployed Engineer**. In this role, you will not just ... -D engineering team. **What You Will Do** + **Orchestrate Distributed Inference**: Deploy and configure LLM-D… more
    Red Hat (01/08/26)
  • Software Engineer, SystemML - Scaling…

    Meta (Menlo Park, CA)
    …to improve the full-stack distributed ML reliability and performance (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network ... NCCL has been integrated into PyTorch and is on the critical path of multi-GPU distributed training. In other words, nearly every distributed GPU-based ML… more
    Meta (12/20/25)
  • Senior Research Engineer, Foundation Model…

    NVIDIA (Santa Clara, CA)
    …product roadmaps. What you will be doing: + Design and maintain large-scale distributed training systems to support multi-modal foundation models for robotics. + ... and AI infrastructure; + Proven experience designing and optimizing distributed training systems with frameworks like PyTorch, ... conception to deployment; + Strong experience at building large-scale LLM and multimodal LLM training… more
    NVIDIA (12/05/25)
  • Senior Software Engineer, AI Platform

    LinkedIn (Mountain View, CA)
    …and resolve issues in popular libraries like Huggingface, Horovod and PyTorch, enable distributed training over 100s of billions of parameter models, debug and ... Online Learning and Serving performance optimizations across billions of user queries. Model Training Infrastructure: As an engineer on the AI Training more
    LinkedIn (12/05/25)
  • Software Engineer, AI Platform

    LinkedIn (Mountain View, CA)
    …and resolve issues in popular libraries like Huggingface, Horovod and PyTorch, enable distributed training over 100s of billions of parameter models, debug and ... Serving performance optimizations across billions of user queries Model Training Infrastructure: As an engineer on the ... performance applications serving very large & complex models across LLM and Personalization models. As an engineer,… more
    LinkedIn (10/21/25)
  • Senior Performance Engineer - AI Platforms

    Red Hat (Boston, MA)
    …The Red Hat Performance and Scale Engineering team is seeking a Senior Performance Engineer to join our PSAP (Performance and Scale for AI Platforms) team. In this ... role, you will drive the performance and scalability of distributed inference for Large Language Models (LLMs) as part ... for example. This is a dynamic role for a seasoned engineer with a growth mindset who handles and adapts… more
    Red Hat (01/05/26)
  • Principal Staff Software Engineer, AI…

    LinkedIn (Mountain View, CA)
    …GNNs, Incremental Learning, Online Learning, and advanced LLM Agents work for Training infrastructure. As a Principal Staff Software Engineer on the AI ... problems. + Designing, implementing, and optimizing the performance of large-scale distributed training for personalized recommendation as well as large… more
    LinkedIn (12/25/25)
  • Sr. Staff Software Engineer, AI Infra

    LinkedIn (Mountain View, CA)
    …and resolve issues in popular libraries like Huggingface, Horovod and PyTorch, enable distributed training over 100s of billions of parameter models, debug and ... Serving performance optimizations across billions of user queries. Model Training Infrastructure: As an engineer on the ... performance applications serving very large & complex models across LLM and Personalization models. As an engineer,… more
    LinkedIn (12/27/25)
  • Senior Research Engineer/Scientist

    ServiceNow, Inc. (Santa Clara, CA)
    …and developing LLM-based features + Experience with methods of training and fine-tuning large language models, such as distillation, supervised fine-tuning and ... sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how ... phases of Large Language Models development, including data curation, training, and evaluation. Our goal is to consistently enhance… more
    ServiceNow, Inc. (12/17/25)
  • Research Engineer, Language - Generative…

    Meta (Topeka, KS)
    …is seeking a Research Engineer to join our Large Language Model (LLM) Research team. We conduct focused research and engineering to build state-of-the-art LLMs, ... experience in areas like language model evaluation; data processing for pre-training and fine-tuning; responsible LLMs; LLM alignment; reinforcement learning… more
    Meta (12/20/25)
  • Senior HPC and AI Networking Performance Research…

    NVIDIA (Santa Clara, CA)
    …from the crowd: + In-depth knowledge and experience with AI workloads and benchmarking for distributed LLM training. + Knowledge in CUDA and NCCL libraries. ... Computing (HPC) and AI Networking Performance Research and Analysis Engineer to join our Performance group. In this exciting ... workloads on large GPUs and CPUs scale clusters for distributed Deep Learning LLM training… more
    NVIDIA (12/03/25)
  • Software Engineer III, Mobile, Android,…

    Google (Mountain View, CA)
    Software Engineer III, Mobile, Android, Google Maps Platform, Google, Mountain View, CA, USA **Mid** Experience driving progress, solving ... to SDKs or APIs for developers. + Experience working with Large Language Models (LLMs) or applied AI. + Experience with mapping technologies (e.g., Google Maps SDK,… more
    Google (12/18/25)
  • Generative AI Engineer

    The MITRE Corporation (McLean, VA)
    …for expertise in areas such as LLMs, machine learning, model training and deployment, model evaluation, retrieval augmented generation (RAG) systems, GraphRAG, ... The position involves researching, developing, evaluating, and integrating GenAI and LLM capabilities. Specific responsibilities will include: + Work in and provide… more
    The MITRE Corporation (01/12/26)
  • Senior ML Engineer (Internet Security)

    Palo Alto Networks (Santa Clara, CA)
    …**Your Career** You will build machine learning models and develop big data and distributed systems that use the models to analyze and categorize an enormous amount ... a security-sensitive environment. + Own the end-to-end lifecycle of ML and LLM components, from problem formulation and model development to production deployment,… more
    Palo Alto Networks (01/10/26)