  • AI and ML HPC Cluster…

    NVIDIA (Santa Clara, CA)
    …that power some of the world's most advanced computing workloads. NVIDIA is looking for an AI/ML HPC Cluster Engineer to join our MARS team. You will provide ... be doing: + Support day-to-day operations of production on-premises and multi-cloud AI/HPC clusters, ensuring system health, user satisfaction, and efficient…
    NVIDIA (01/10/26)
  • Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …foundational improvements and automation to improve engineer's productivity. As a Site Reliability Engineer, you are responsible for the big picture of how ... fueled by great technology-and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU…
    NVIDIA (01/13/26)
  • HPC Sr. Scientific Software Engineer

    Johns Hopkins University (Baltimore, MD)
    …and Design + Develop and refine deployment strategies for scientific software on HPC and AI systems. + Design computational workflows, selecting optimal ... AI Agents). _Performance Optimization_ + Analyze and optimize the performance of AI models and HPC applications, focusing on GPU-enabled computing. +…
    Johns Hopkins University (11/21/25)
  • HPC Scientific Software Engineer

    Johns Hopkins University (Baltimore, MD)
    …and Design + Develop and refine deployment strategies for scientific software on HPC and AI systems. + Design computational workflows, selecting optimal ... _Performance Optimization_ + Analyze and optimize the performance of AI models and HPC applications, focusing on...fields, with advanced training in scientific computing. Classified Title: HPC Scientific Software Engineer Job Posting Title…
    Johns Hopkins University (12/04/25)
  • HPC / AI Platform Engineering

    Lilly (Indianapolis, IN)
    …Bold - You will bring a high learning agility and Infrastructure availability and reliability Engineer skills to help us enable the Lilly Technology strategy, ... the world. Come help us unlock the power of HPC and AI based POGPU and Accelerated...Additionally, you would advise with our senior Linux platform engineer directing the global Linux strategy for on-premises private…
    Lilly (11/27/25)
  • Staff Software Engineer, HPC

    Google (Kirkland, WA)
    Staff Software Engineer, HPC Solutions Google Kirkland, WA, USA **Advanced** Experience owning outcomes and decision making, solving ... future of scientific computing by leading the convergence of AI and HPC. The AI ...Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability…
    Google (12/20/25)
  • Senior HPC and Quantum Systems…

    NVIDIA (Westford, MA)
    …how you can make a lasting impact on the world. We are seeking a Senior HPC & Quantum Systems Engineer to help architect, deploy, and operate a first-of-its-kind ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An...is not a pure research role nor a traditional HPC admin role-it is a systems engineering position dedicated…
    NVIDIA (01/10/26)
  • Software Engineer, HPC, Platform…

    Google (Kirkland, WA)
    Software Engineer, HPC, Platform Readiness, Workload Performance Google Kirkland, WA, USA **Advanced** Experience owning outcomes and ... on and is growing every day. As a software engineer, you will work on a specific project critical...Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability…
    Google (12/18/25)
  • Staff Engineer, Systems HPC

    Micron Technology, Inc. (Richardson, TX)
    …intelligence, inspiring the world to learn, communicate and advance faster than ever. As an HPC Staff Engineer at Micron, you will join a diverse team of ... You will play a key part in maintaining the reliability and efficiency of Micron's data environment. **Responsibilities** +...from candidates as consideration for their employment with Micron. AI alert: Candidates are encouraged to use…
    Micron Technology, Inc. (12/09/25)
  • Senior GPU and HPC Infrastructure…

    NVIDIA (Santa Clara, CA)
    NVIDIA is hiring engineers to scale up its AI Infrastructure. We expect you to have a strong programming background, knowledge of datacenter hardware, operations, ... and planning abilities. Experience working with High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, Infiniband, RoCE) are strongly…
    NVIDIA (01/08/26)
  • Staff Quality and Reliability

    Google (Sunnyvale, CA)
    …architecture and its integration within AI/ML-driven systems. As a Quality and Reliability Engineer for Google Cloud, you will lead the development of ... Staff Quality and Reliability Engineer, Google Cloud Google...Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability…
    Google (12/30/25)
  • Senior Software Engineer - AI

    Bloomberg (New York, NY)
    …for overseeing the ongoing monitoring, support, and maintenance of our HPC/AI clusters, ensuring peak performance and reliability. **We'll trust you to:** ... Senior Software Engineer - AI Hardware Location New York...ongoing monitoring, support, and maintenance of our HPC/AI clusters, ensuring peak performance and reliability…
    Bloomberg (12/18/25)
  • MTS - Site Reliability Engineer

    Microsoft Corporation (Redmond, WA)
    …so that everyone can realize its benefits. We're looking for an experienced **Site Reliability Engineer (SRE)** to join our infrastructure team. In this role, ... **Overview** As Microsoft continues to push the boundaries of AI, we are on the lookout for passionate individuals to work with us on the most interesting and…
    Microsoft Corporation (12/17/25)
  • Principal Mechanical Reliability

    Dell Technologies (Austin, TX)
    **Principal Mechanical Reliability Engineer** Mechanical Engineering leads and delivers the development of innovative and compliant mechanical design solutions, ... make a profound social impact as a **Principal Mechanical Reliability Engineer** on our Mechanical **Engineering** Team...be instrumental in delivering advanced liquid cooling solutions for AI, HPC, and enterprise server markets. Your…
    Dell Technologies (11/19/25)
  • Sr Principal Software Engineer, Networking…

    Oracle (Cheyenne, WY)
    AI Infrastructure Innovation team is pioneering the creation of next-generation AI/HPC networking for GPU superclusters at massive scale. Our mission is ... system design, and implementation for high-performance RDMA solutions across OCI's AI/HPC platforms, including frontend and backend fabrics. + Innovate…
    Oracle (12/20/25)
  • AI/ML Infrastructure Engineer

    Oracle (Lincoln, NE)
    …solutions across Oracle's enterprise customers. We are seeking a highly skilled **AI/ML Infrastructure Engineer** to design, build, and support the systems, ... troubleshooting, and best practices. + Stay current with emerging trends in AI infrastructure, agent frameworks, HPC systems, and cloud-native technologies;…
    Oracle (01/13/26)
  • Principal Network Engineer - DC…

    NVIDIA (Santa Clara, CA)
    …a passionate engineer who will solve networking problems for scalable AI clusters. This is a hands-on network engineering position focused on the architecture, ... and deployment of global-scale DCs inter-connects and fabric for HPC, AI, and GPU computing clusters. +...reliability. + Partner with system, OS, GPU, and HPC teams to deliver scalable, highly available networks for…
    NVIDIA (01/10/26)
  • Senior Principal Software Development…

    Oracle (Springfield, IL)
    …Forward Deployed Engineer (FDE) team is hiring a Senior Principal Software Development Engineer - AI Data Platform to help global customers unlock the full ... to streamline the adoption of Oracle AI Data Platform and Gen AI services. + Optimize performance, scalability, and reliability of distributed data/AI…
    Oracle (01/11/26)
  • Consulting Member of Technical Staff - AI

    Oracle (Santa Clara, CA)
    …and debug software programs for databases, applications, tools, networks etc. As an AI/ML Infrastructure Engineer on the GPU Strategic Customers Engineering team, ... or Scala + Proven experience designing, implementing, and managing infrastructure for AI/ML or HPC workloads. + Understanding machine learning frameworks and…
    Oracle (12/05/25)
  • Senior Principal Software Engineer

    Oracle (Austin, TX)
    …automation, and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like ... looking for a highly skilled and motivated distributed systems engineer who can architect solutions to scale and optimize...to scale and optimize Monitoring and Repair solutions for AI infrastructure components like GPU control plane and GPU…
    Oracle (01/03/26)