- NVIDIA (Santa Clara, CA)
- …drive foundational improvements and automation to improve engineer's productivity. As a Site Reliability Engineer, you are responsible for the big ... be doing: + Troubleshoot incoming support requests in a large-scale HPC environment. + Contribute enhancements to existing deployment automation, configuration…
- SpaceX (Hawthorne, CA)
- Site Reliability Engineer, GNC (Falcon)...maintain virtual and physical servers + Work with SpaceX HPC team to monitor and maintain a 4000+ thread ... the ultimate goal of enabling human life on Mars. SITE RELIABILITY ENGINEER, GNC (FALCON)...HPC cluster + Closely collaborate with GNC software engineers…
- Microsoft Corporation (Redmond, WA)
- …- so that everyone can realize its benefits. We're looking for an experienced **Site Reliability Engineer (SRE)** to join our infrastructure team. In ... workflows. **Qualifications** **Required Qualifications** + 4+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles.…
- SLAC National Accelerator Laboratory (Menlo Park, CA)
- Senior High Performance Computing Engineer Job ID 6383 Location SLAC - Menlo Park, CA Full-Time Regular **SLAC Job Postings** **About SLAC:** The SLAC National ... the nature of this position, SLAC is open to on-site and hybrid work options.** **Position Overview:** As a Senior High Performance Computing Engineer in the Scientific Computing Services Division of the…
- BAE Systems (Annapolis Junction, MD)
- …Integration Engineer to play a critical role in ensuring the stability and reliability of our High-Performance Computing (HPC) systems. As a key member of ... be available based on position level and/or job specifics. **Software Integration Engineer IV** **119553BR** EEO Career Site Equal Opportunity Employer.…
- Insight Global (New York, NY)
- …with a competitive salary, equity package, and comprehensive benefits. The Network Operations Engineer will serve as the Regional Site Lead, taking ownership of ... Job Description Insight Global is seeking a highly skilled Network Operations Engineer to join a rapidly growing AI cloud and computing organization with datacenter…
- NVIDIA (Westford, MA)
- …for a Senior DevOps Engineer to join our team, although Developer Experience Engineer, Site Reliability Engineer, Build and Release Engineer ... and engineering. We're looking for someone with strong integrity, reliability, persistence, problem-solving ability, and skills in Linux, scripting, debugging,…
- Amazon (Cupertino, CA)
- Description We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... Experience with embedded systems is valued, and experience with high-speed networking or HPC interconnects is valued highly. If you like solving hard problems, want…
- BAE Systems (Annapolis Junction, MD)
- …critical role in ensuring the high availability and scalability of High-Performance Computing (HPC) systems. As a DevOps Software Engineer, you'll be responsible ... may be available based on position level and/or job specifics. **Senior DevOps Engineer** **119556BR** EEO Career Site Equal Opportunity Employer. Minorities.…
- Synergy ECP (Fort Meade, MD)
- …Helm to deploy Kubernetes applications | Experience using GitLab CI/CD pipelines | Familiar with Site Reliability Engineering (SRE) principles and applications ... Software Integration Engineer 3 Ft. Meade, MD (http://maps.google.com/maps?q=Ft.+Meade+MD+USA+20146) Job Type...of software performance. . Provide software product ownership for HPC tools. . Working knowledge of CM tools, web…
- Cadence Design Systems, Inc. (San Jose, CA)
- …Job Description: We are looking for a highly skilled and motivated 3DIC Design Flow Engineer to implement system planning and integration of complex HPC and AI ... AI applications. This is a challenging and rewarding opportunity for a highly motivated engineer with a passion for innovation and a proven track record of success…
- Amazon (Seattle, WA)
- …Adapter (EFA) network card work for Machine Learning (ML) and High-Performance Computing (HPC) customers on AWS. Across multiple projects written in C, our team ... every day on the hottest companies doing AI and HPC today. Key job responsibilities You will write the...2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience…
- Amazon (Cupertino, CA)
- …language experience - 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience - 5+ years of ... full software development experience - Expertise in accelerator architectures for ML or HPC such as GPUs, CPUs, FPGAs, or custom architectures - Experience with GPU…
- Amazon (Seattle, WA)
- …Adapter (EFA) network card work for Machine Learning (ML) and High-Performance Computing (HPC) customers on AWS. Across multiple projects written in C, our team ... every day on the hottest companies doing AI and HPC today. Key job responsibilities You will help lead...5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience…
- Autodesk (Raleigh, NC)
- **Job Requisition ID #** 26WD94805 **Senior Principal Machine Learning Engineer, Foundational Models** **Position Overview** Autodesk is transforming the ... AutoCAD, Revit, Construction Cloud, and Forma. As a Senior Principal _Machine Learning_ Engineer, you will act as a technical leader and delivery owner for complex,…
- BAE Systems (Annapolis Junction, MD)
- …be available based on position level and/or job specifics. **Software Integration Engineer III** **119551BR** EEO Career Site Equal Opportunity Employer. ... that values collaboration, excellence, and innovation. We're seeking a skilled Software Integration Engineer to play a critical role in ensuring the reliability,…
- Samsung SDS America (Plano, TX)
- …seamlessly manage both on-premises and cloud environments, ensuring performance, scalability, and reliability. This is a hands-on, on-site position where your ... SDS America is seeking a Senior Linux Data Center Engineer to join our dynamic team in Plano, TX....with deep expertise in Linux systems and high-performance computing (HPC). You will be at the forefront of designing,…
- Amazon (Austin, TX)
- …operate next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing the limits of performance, ... complex problems. You will decompose big difficult server system testability, reliability and diagnosis problems into straightforward tasks, components or features…
- Global Foundries (Malta, NY)
- …enabling next-generation optical interconnects and heterogeneous integration (HI) for AI, HPC, and data center applications. With a global manufacturing footprint ... in a SiPh Flip Chip assembly. Focus on product and module reliability, package risk factors, packaging design rules, materials selection criteria, definition of…
- Broadcom (San Jose, CA)
- …**Job Description:** Broadcom is seeking an experienced IC package-design engineer for complex flip-chip-BGA packages for industry-leading ASICs with high-speed ... package designs for ASICs for artificial intelligence (AI), networking, high-performance computing (HPC), and 5G base stations. These designs include SerDes at 224G…