- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...workloads that expects a loss-less fabric interconnect. To improve performance of these systems we constantly look…
- Meta (Menlo Park, CA)
- …hardware and software components, co-design 15. Experience in developing or debugging AI/HPC systems, performance optimizations, including familiarity ... or supporting production hardware at scale 9. Experience in deploying and productionizing AI/HPC systems and/or related components at scale 10. Experience in…
- Ford Motor Company (Dearborn, MI)
- …and maintaining our HPC infrastructure and user-facing tooling, ensuring optimal performance and reliability for our critical AI/ML applications. This role ... Troubleshoot and resolve complex technical issues related to Linux systems, networking, storage, and AI/ML HPC applications. Develop and maintain documentation…
- NVIDIA (Santa Clara, CA)
- …designing and operating large scale storage infrastructure. + Experience analyzing and tuning performance for a variety of AI/HPC workloads. + Experience ... join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership...solutions to enable runs of demanding deep learning, high performance computing, and computationally intensive workloads. We seek an…
- NVIDIA (Santa Clara, CA)
- …group at NVIDIA has openings for software architects in the field of AI and high-performance networking and system software. We research, develop, and ... be doing + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), new…
- The MITRE Corporation (Bedford, MA)
- …+ Provide Linux systems administration support for MITRE's HPC systems to ensure the availability, performance, and security of systems. ... Computing to MITRE research organizations. Job Description: We are seeking an experienced Linux HPC Systems engineer to join our team! This is an exciting…
- The MITRE Corporation (Colorado Springs, CO)
- …Manager. + Familiarity with NVIDIA DGX systems, including DGX H100, and integration into HPC and AI workflows. + Desired experience as a team lead or similar ... Computing to MITRE research organizations. Job Description: We are seeking an experienced Linux HPC Systems Administrator to join our team as a Group Lead for…
- General Dynamics Information Technology (Fairfax, VA)
- …Regular **Clearance Level Must Be Able to Obtain:** None **Job Family:** Systems Engineering **Skills:** High-Performance Computing (HPC) Systems ... are our differentiator. Our work depends on a Senior HPC Systems Engineer joining our team to...some travel required. WCOSS provides NOAA the operational High Performance Computing (HPC) resources essential to process…
- Meta (Menlo Park, CA)
- …Meta and externally. **Required Skills:** Research Scientist, Systems ML and HPC - SW/HW Co-Design Responsibilities: 1. Apply High-Performance Computing ( ... Performance team is dedicated to maximizing training performance of Generative AI and recommendation models...HPC) algorithms and techniques to optimize large-scale AI workloads 2. Analyze, benchmark, and optimize large-scale workloads…
- Amazon (Arlington, VA)
- …computing and its potential to overcome some of the biggest challenges in High Performance Computing (HPC)? Do you have a unique combination of deep technical ... C++, Python, CUDA, Bash - Deep GPU knowledge in HPC and/or AI/ML frameworks. Preferred Qualifications -...life sciences or related discipline. - Working knowledge of HPC schedulers and distributed/parallel file systems, underlying…
- NYU Rory Meyers College of Nursing (New York, NY)
- Position Summary The Senior HPC Specialist supports New York University's High-Performance Computing (HPC) and Research Cloud services by partnering with ... HPC policies, while working closely with the HPC and Research Cloud Systems team on...protocols, tools and utilities. Experience with monitoring and improving HPC application performance. Demonstrated experience in diagnosing…
- General Dynamics Information Technology (Phoenix, AZ)
- …Regular **Clearance Level Must Be Able to Obtain:** None **Job Family:** Systems Administration **Skills:** High-Performance Computing (HPC) Systems ... our differentiator. Our work depends on an On Site HPC Systems Admin joining our team to...the Phoenix, AZ. WCOSS provides NOAA the operational High Performance Computing (HPC) resources essential to process…
- NVIDIA (Santa Clara, CA)
- …the world. We are looking for an outstanding engineer for a Senior HPC Systems Engineer role for at scale AI system performance and datacenter ... develop new, leading differentiated solutions. You will interact with HPC, OS, CPU and GPU compute, and systems...debugging and resolving critical software issues for the best AI workload performance at scale. + Specific…
- General Dynamics Information Technology (Fairmont, WV)
- …to maintain the operational readiness of the client's High Performance Computing (HPC) environment. HOW A SYSTEMS ADMINISTRATOR ADVISOR WILL MAKE AN IMPACT * ... Obtain:** None **Public Trust/Other Required:** NACI (T1) **Job Family:** Systems Administration **Skills:** Computer Servers, HPC, Problem Solving, Troubleshooting **Experience:**…
- Meta (Burlingame, CA)
- …in multiple locations. **Required Skills:** Software Engineer, Systems ML - HPC Responsibilities: 1. Apply relevant AI and machine learning techniques to ... **Summary:** Meta is seeking an AI Software Engineer to join our Research &...on the web. Some aspects of this role as an HPC specialist will include using lower precision numeric formats…
- NVIDIA (Santa Clara, CA)
- …vision? What you will be doing: + Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems. + Design and ... implement new communication technologies to accelerate AI and HPC workloads. + Explore innovative solutions in HW and SW for our next generation platforms as…
- Amazon (Cupertino, CA)
- …HPC network fabric or machine learning accelerator cluster systems. Also applicable is experience high-frequency trading networking, high-speed wireless ... team focuses on building networking solutions that for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS. Working at Annapurna Labs…
- NVIDIA (CA)
- …and to power data centers. Join the team building many of the largest and fastest AI/HPC systems in the world! NVIDIA is looking for someone with the ... and internal teams to analyze, define, and implement large-scale AI/HPC projects. These efforts include a combination...as they roll out some of the most sophisticated systems in the world! + Provide feedback to internal…
- NVIDIA (Santa Clara, CA)
- …improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialist to architect, develop ... parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is...looking for an outstanding hands-on architect/engineer for a Senior HPC architect role to support deployment and bringup of…
- NVIDIA (Santa Clara, CA)
- …long term maintenance strategy. What you'll be doing: + Design highly available and scalable systems to meet the demands of our HPC clusters + Evaluate new and ... graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI and enabled the next era of computing. NVIDIA is a "learning…