Senior HPC Cluster Engineer Jobs | Juju

Senior HPC Cluster…

NVIDIA (Santa Clara, CA)

…make a lasting impact on the world. We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters ... + Provide leadership and strategic mentorship on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop…

NVIDIA (12/10/25)

Senior AI-HPC Cluster…

NVIDIA (Santa Clara, CA)

…+ Provide leadership and strategic mentorship on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop ... and operating large scale compute infrastructure. + Experience with AI/HPC job schedulers and orchestrators, such as Slurm, K8s or LSF. Applied experience with AI/HPC workflows that use MPI and NCCL. + Proficient…

NVIDIA (10/30/25)

Senior AI and ML HPC Cluster…

NVIDIA (Santa Clara, CA)

…Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and implementation of ground ... + Provide leadership and strategic guidance on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop…

NVIDIA (10/19/25)

Senior HPC Engineer

Texas A&M University System (College Station, TX)

Job Title: Senior HPC Engineer; Agency: Texas A&M University; Department: Technology Services - IT Enterprise Operations; Proposed Minimum Salary: Commensurate; Job ... members' faculty and staff providing cutting-edge research and super computing needs. As a Senior High Performance Computing Engineer (HPC), you will provide…

Texas A&M University System (10/03/25)

Senior GPU and HPC Infrastructure…

NVIDIA (Santa Clara, CA)

…and planning abilities. Experience working with High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, InfiniBand, RoCE) are strongly ... will be harnessing multiple data streams, ranging from GPU hardware diagnostics to cluster and network telemetry. + Work on software that manages NVLINK topography…

NVIDIA (10/09/25)

Senior Systems Engineer…

NVIDIA (Santa Clara, CA)

Join the NVIDIA Deep Learning Frameworks Infrastructure team as a Senior Systems Engineer focusing on High-Performance AI & Networking Applications, committed to ... for internal teams and external partners on standard methodologies in HPC networking deployments. + Share insights on improving networking strategies for…

NVIDIA (11/11/25)

Senior Site Reliability Engineer…

NVIDIA (Santa Clara, CA)

…artificial intelligence. Join our team at NVIDIA as a Senior Site reliability engineer focused on HPC storage and play a crucial role in designing, ... software + Experience with RDMA (InfiniBand or RoCE) fabrics + Background with HPC cluster management tools such as Slurm, PBS, LSF, etc. + Passionate and…

NVIDIA (11/19/25)

Senior Network Development Engineer

Oracle (Des Moines, IA)

…Description: The AI2NE Org strives to be global leaders in the RDMA cluster networking domain and enable seamless, accelerated High-Performance Compute (HPC), ... of state-of-the-art RDMA clusters tailored specifically for AI, ML, HPC workloads. We strive to be the go-to experts in RDMA cluster architecture, leveraging our deep understanding of the unique…

Oracle (11/25/25)

Senior Software Engineer, AI…

NVIDIA (Santa Clara, CA)

We are now looking for a Senior Software Engineer for AI Resiliency. At NVIDIA, we are pushing the boundaries of what's possible in AI. We are currently seeking ... a Senior Software Engineer to lead the development ... GPUs. Your expertise will be crucial in driving down cluster downtime towards zero, ensuring that our AI systems…

NVIDIA (10/15/25)

Senior Research Engineer…

NVIDIA (Santa Clara, CA)

NVIDIA is searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training in the ... to support multi-modal foundation models for robotics. + Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets. +…

NVIDIA (12/05/25)

Senior Firmware Engineer - CSP…

NVIDIA (Santa Clara, CA)

NVIDIA is seeking a Senior Firmware Engineer to join our CSP Engagements team, focusing on system software for Datacenter products such as GB200. This role ... see: + Deep expertise in data center server architectures, HPC systems, and hardware-software co-design. + Deep expertise in ... out from the crowd: + Knowledge of cloud and cluster level deployment and management systems. + Experience with…

NVIDIA (10/01/25)

Senior Research Engineer…

NVIDIA (Santa Clara, CA)

…for AVs capable of running on thousands of GPUs; + Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets; + Implement ... curriculum learning. + Deep understanding of GPU acceleration, CUDA programming, and cluster management tools like Kubernetes. + Strong programming skills in Python…

NVIDIA (10/08/25)

Senior Software Engineer - Storage

NVIDIA (Santa Clara, CA)

…power some of the world's most advanced computing workloads. We are seeking a Software Engineer to join our MARS team at NVIDIA. In this role, you will help design, ... experience developing and operating large-scale distributed systems, infrastructure platforms, or HPC environments. + Strong programming skills in C++, Python, or…

NVIDIA (12/02/25)

Senior Software SDET Test Development…

NVIDIA (Santa Clara, CA)

…GPU Computing. We are passionate about markets include gaming, automotive, vision, HPC, datacenters and networking in addition to our traditional OEM business. ... Linux experience, reliability testing with various telemetries, scale out cluster, test plan development, track record in developing AI ... are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want…

NVIDIA (11/05/25)

Senior Systems Software Security…

NVIDIA (CO)

…server architecture. In-depth understanding of the different deployment models for GPUs (e.g., HPC, AI cluster, single- or multi-GPU servers). + Experience in Data ... NVIDIA is searching for a highly motivated, creative engineer with experience in system software security to join the Data Center Systems Software team. In this…

NVIDIA (10/22/25)

Senior Systems Administrator…

Mount Sinai Health System (New York, NY)

…and implements backup policies. + Assist in the management and maintenance of HPC cluster and data center work, including troubleshooting for resolving system ... data warehouse team and a research data services team. The Senior Systems Administrator/Engineer, as a member of the Scientific Computing and Data group, is…

Mount Sinai Health System (09/22/25)

Sr Advanced IC CAD Engineer

Honeywell (Phoenix, AZ)

You will report directly to the Senior Engineering Manager and you'll work at our Plymouth, MN location on a Hybrid work schedule. (Other allowed Honeywell Aerospace ... KEY RESPONSIBILITIES: + Work with IC Design EDA Applications, High Performance Compute cluster staff, and IC Design engineers to craft and maintain optimized EDA…

Honeywell (11/14/25)

Senior Solutions Architect, Financial…

NVIDIA (NY)

…other Engineering fields (or equivalent experience) + 12+ years experience as an ML/Software Engineer with a proven track record in writing code in Python, C++ + ... models at scale on public cloud computing and/or on-prem HPC clusters in production Ways To Stand Out From ... of MLOps technologies such as containers, data center deployments, cluster management software, etc. + Experience working with enterprise…

NVIDIA (10/15/25)

Sr. Research Analytics Scientist

Stanford University (Stanford, CA)

…full-stack applications + Optimizing Slurm scripts for effective utilization of cluster resources + Automated web scraping + Crowdsourcing pipelines In addition ... resources. Solutions Development: + Formulate innovative technical strategies and engineer them to completion to achieve unique research objectives, using…

Stanford University (10/16/25)