- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... workloads that expect a loss-less fabric interconnect. To improve performance of these systems we constantly look…
- Meta (Menlo Park, CA)
- …and host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC System Performance Engineer Responsibilities: 1. Lead ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look…
- Meta (Menlo Park, CA)
- …These workloads expect a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look for opportunities across ... host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineering Manager Responsibilities: 1. Manage engineers…
- IBM (San Jose, CA)
- …technical areas in the context of hybrid cloud, AI systems, networking, security, high-speed networked-storage, accelerators, and HPC principles. The ... focuses on the next generation Hybrid Cloud infrastructure for AI, Storage, HPC and Quantum applications. The ... Experience with GPU Systems * Familiarity with HPC system performance evaluation. * Familiarity with…
- Meta (Menlo Park, CA)
- …on existing accelerator systems and guiding the future of models and AI HW at Meta. This drives improved performance, new model architectures and ... the following areas: Accelerators/GPU architectures, High Performance Computing (HPC), Machine Learning Compilers, Training/Inference ML Systems, Model…
- Meta (Menlo Park, CA)
- …AI product introductions and AI operations initiatives supporting Meta's growing AI/HPC infrastructure for our Family of Apps. They will be responsible ... deliver on shared goals 10. The ideal candidate will have experience in AI/HPC product development and operations, demonstrated experience in the Network…
- Deloitte (San Francisco, CA)
- …cancer detection, drug discovery, optimizing population health and clinical trials, autonomous systems and edge AI, and renewable energy. Key responsibilities: + ... in the cloud or on prem + Adopt best engineering practices in automation, HPC and AI/GenAI infrastructure and design patterns + Define and lead technology…
- IBM (San Jose, CA)
- …technical areas in the context of hybrid cloud, AI systems, networking, security, high-speed networked-storage, accelerators, and HPC principles. The ... focuses on the next generation Hybrid Cloud infrastructure for AI, Storage, HPC and Quantum applications. The ... experience with Git * HPC: experience running HPC workloads on HPC systems…
- Deloitte (San Francisco, CA)
- …building secure networks and modern data centers, to enabling the adoption of AI or high-performance computing (HPC), you'll gain firsthand experience with ... organizations through Data Center and infrastructure transformation journeys, such as adopting AI, deploying high-performance computing (HPC) or edge…
- Meta (Menlo Park, CA)
- …following machine learning/deep learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, ... large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU communication stack. Currently, one of the…
- Cisco (Milpitas, CA)
- …team engaged in the design, development and execution of tests to qualify network performance for AI/ML capability. In this role you'll have the opportunity to: + ... the next generation infrastructure to meet the needs of AI/ML workloads and continuously increasing internet users and application. ... Quality of Service (QoS) policies to ensure optimal network performance + Exposure to RDMA, HPC networks…
- Cisco (Milpitas, CA)
- …agile team engaged in the design, development and execution of tests to qualify network performance for AI/ML capability. You will be a part of our solutions ... a customer-facing environment + Previous experience leading teams + Exposure to network operating systems, preferably SONiC + Exposure to RDMA, HPC networks +…
- Meta (Menlo Park, CA)
- …levels 9. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: 10. GPU/ASIC-based kernel development and ... systems for our fleet 4. Technical management 5. Experience in systems architecture, performance, workload-analysis and large scale distributed systems…
- Micron Technology, Inc. (San Jose, CA)
- …position in the Artificial Intelligence (AI), Machine Learning (ML) and High Performance Computing (HPC) business segments. You will be working on innovative ... you will be charged with defining and accomplishing the strategy for a High Performance Memory product portfolio that will further fortify Micron's leadership…
- Micron Technology, Inc. (San Jose, CA)
- …in growing the Artificial Intelligence (AI), Machine Learning (ML) and High-Performance Computing (HPC) business segments. You will be working on innovative ... of Work (SOWs), business term sheets, and other customer-facing documents for high-performance memory products. + Represent the Product Management team in Product…
- Meta (Menlo Park, CA)
- …10. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: GPU/ASIC-based kernel development and ... ROCm), distributed systems for large scale training and serving, and systems architecture and performance 11. Accelerator (GPU/ASIC) kernel development and…
- Broadcom (San Jose, CA)
- …compiler toolchains. + Experience analyzing and tuning performance for a variety of AI/ML and HPC workloads. + Deep knowledge of Linux kernel and Linux ... Description:** **Job Description** Ethernet NIC product portfolio is designed for high performance computing and networking applications including AI and ML.…
- Meta (Menlo Park, CA)
- …networks, powering our global data centers and supporting cutting-edge technologies like AI, Generative AI, Recommendation engines, and Metaverse. Our network ... to join our teams and help build scalable distributed systems, develop innovative solutions to our challenges, and ship ... firmware, and software for network devices, transport stacks, and AI workloads 2. Debug complex system-level issues and lead…