  • OpenAI (San Francisco, CA)
    …to distribute their benefits widely. About the Role: As a Distributed Systems/ML engineer, you will work on improving the training throughput for our internal ... of supercomputers. We're looking for people who love optimizing performance, understanding distributed systems, and who cannot stand having bugs in their code. This…
    job goal (01/13/26)
  • Periodic Labs (Menlo Park, CA)
    …experience with: training on clusters with ≥5,000 GPUs; 5D parallel LLM training; distributed training frameworks such as Megatron-LM, FSDP, DeepSpeed, ... About the role: You will optimize, operate, and develop large-scale distributed LLM training systems that power AI scientific research. You will work closely…
    job goal (01/12/26)
  • Periodic Labs (Menlo Park, CA)
    …to open-source frameworks. Ideal candidates will have expertise in GPU clusters, parallel training, and distributed training frameworks. Join a rapidly ... in California seeks an experienced professional to optimize and develop large-scale distributed LLM training systems. This role involves working with researchers…
    job goal (01/12/26)
  • IMC (Chicago, IL)
    A global trading firm is seeking a Machine Learning Engineer to develop large-scale training pipelines and optimize real-time predictions. Ideal candidates have ... 5+ years in ML, strong programming skills in Python or C++, and experience with GPU programming. This role offers a competitive salary range of $175,000 - $250,000. Join a collaborative environment where your work will influence trading strategies and…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    …is seeking a Senior Machine Learning Engineer focused on AWS Neuron distributed training. The role demands strong programming skills, experience in machine ... learning, and leadership capabilities. Candidates should possess a bachelor's degree in computer science and have over 5 years of software development experience. This position offers a competitive salary and various benefits.
    job goal (01/12/26)
  • Amazon (Seattle, WA)
    Engineer to develop solutions for AI/ML applications. This role involves building distributed training support and tuning ML models to maximize performance on ... AWS infrastructure. Candidates should have a strong software development background and experience in deep learning. An inclusive work culture that promotes work-life balance and career growth opportunities is offered.
    job goal (01/12/26)
  • Genesis AI (San Carlos, CA)
    …AI technology firm in California is seeking a skilled professional to optimize distributed training systems using PyTorch. The ideal candidate will have over ... 8 years of experience in distributed systems and high-performance computing, with a strong command of Python and low-level performance optimizations using CUDA.…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Software Engineer - AI/ML, AWS Neuron Distributed Training. Annapurna Labs designs silicon and software that accelerate innovation. Customers choose us to ... learning accelerators. This role is for a Senior Machine Learning Engineer in the Distributed Training team for AWS Neuron, responsible for development,…
    job goal (01/13/26)
  • Amazon (San Francisco, CA)
    Engineer for their Machine Learning Applications team. This role focuses on distributed training, performance tuning, and support for numerous large-scale ML ... models. Candidates should have strong software development skills and experience with Python. The position offers competitive compensation, support for career growth, and a commitment to work-life balance.
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    A leading AI research firm in San Francisco is seeking a Distributed Systems engineer. You will develop powerful APIs for managing large-scale data operations ... across distributed systems, ensuring stability and performance. Ideal candidates will have experience with distributed systems and strong software engineering skills, particularly in…
    job goal (01/12/26)
  • Boeing (Hazelwood, MO)
    …future with us. The Boeing Company is currently seeking a highly motivated Software Engineer (Experienced or Senior) to join the Training Systems - Battlespace ... control). BSM is responsible for the design, development, manufacture, and maintenance of training devices for a wide variety of commercial and military aircraft -…
    job goal (01/12/26)
  • Capital One (Charlottesville, VA)
    Lead Machine Learning Engineer. As a Capital One Machine Learning Engineer (MLE), you'll be part of an Agile team dedicated to productionizing machine learning ... including choice of model, data, and feature selection, model training, hyperparameter tuning, dimensionality, bias/variance, and validation). Solve complex problems…
    job goal (01/12/26)
  • Capital One (Petersburg, VA)
    Senior Machine Learning Engineer. Job Description: As a Capital One Machine Learning Engineer (MLE), you'll be part of an Agile team dedicated to productionizing ... including choice of model, data, and feature selection, model training, hyperparameter tuning, dimensionality, bias/variance, and validation). Solve complex problems…
    job goal (01/12/26)
  • Capital One (Washington, DC)
    Senior Lead Machine Learning Engineer (Big Data and Machine Learning). As a Capital One Machine Learning Engineer (MLE), you'll be part of an Agile team dedicated ... including choice of model, data, and feature selection, model training, hyperparameter tuning, dimensionality, bias/variance, and validation). Solve complex problems…
    job goal (01/12/26)
  • Amazon (Seattle, WA)
    Software Engineer - AI/ML, AWS Neuron Distributed Training. Do you love decomposing problems to develop products that impact millions of people around the ... equivalent. Preferred: previous software engineer expertise with PyTorch/Jax/TensorFlow, Distributed libraries and Frameworks, End‑to‑end Model Training. The…
    job goal (01/12/26)
  • Rubrik, Inc. (Palo Alto, CA)
    Software Engineer, Atlas Distributed Systems. Rubrik Atlas is the core data path for all Rubrik products, whether in the data center, at the edge, or in the ... our Atlas platform. We are looking for an experienced distributed systems engineer to guide us through ... factors, including job‑related skills, experience, and relevant education or training. US Pay Range: $158,000 - $237,000 USD. Join…
    job goal (01/13/26)
  • Google Inc. (Sunnyvale, CA)
    Senior Software Engineer, Google Distributed Cloud, Kubernetes. Google, Sunnyvale, CA, USA. Bachelor's degree in Computer Science, ... year of experience with software design and architecture for distributed systems. Preferred qualifications: Master's degree or PhD in ... on and is growing every day. As a software engineer, you will work on a specific project critical…
    job goal (01/13/26)
  • Databricks Inc. (Mountain View, CA)
    Staff Software Engineer - Distributed Data Systems. Mountain View, California. P-186. At Databricks, we are obsessed with enabling data teams to solve the world's ... capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will be building the next generation distributed data storage and processing systems that can outperform…
    job goal (01/13/26)
  • Scribd (San Francisco, CA)
    …solutions in production. Role Overview: We're seeking a Senior Software Engineer with deep experience building event-driven, distributed, and scalable ... We work at the intersection of machine learning, data engineering, and distributed systems, collaborating closely with applied research and product teams to deploy…
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    …We work on building robust, scalable, high performance components to support our distributed training workloads. Our priorities are to maximize the productivity ... accelerating progress towards AGI. About the Role: As a Distributed Systems engineer, you will work to ... responding to the dynamic and evolving needs of our training systems architectures. This role is based in San…
    job goal (01/13/26)