- Meta (Menlo Park, CA)
- …fabric and host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineering Manager Responsibilities: ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like ... responsible for design, model, develop, test, deploy and operate AI/HPC Networks at scale 2. Provide continual…
- NVIDIA (Santa Clara, CA)
- …and usable. + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), new ... runtime designs, and new network hardware features. + Research, design and implement features for AI and HPC communication middleware (NCCL, Open MPI, UCX,…
- Meta (Menlo Park, CA)
- …5. Work with cross functional teams and provide guidance on the AI network architecture including topologies, transport, congestion control techniques **Minimum ... host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC System Performance Engineer Responsibilities: 1. Lead…
- NVIDIA (Santa Clara, CA)
- …from the crowd: + Experience troubleshooting, debugging, and solving problems in large-scale HPC network environments + Experience as a developer and/or support ... A significant part of the role is interacting with engineering, marketing, and support teams regularly. What you will ... customers installing our products with a focus on next-generation AI, and HPC server technologies. + Own…
- NVIDIA (Santa Clara, CA)
- …to stand out from the crowd: + Experience in solving problems in large-scale HPC network environments with overlay technologies (BGP, OSPF, VXLAN, EVPN), RoCE ... part of the role is also to interact with Engineering, Marketing, and Support teams regularly. What you will ... installing our products with a focus on Infiniband, next-generation AI, and HPC server technologies. + Own…
- NVIDIA (Santa Clara, CA)
- …challenges and provide outstanding HPC solutions. + Collaborate closely with hardware engineering, CUDA engineering, and AI research groups to apply the ... healthcare by harnessing the power of GPU computing and AI to redefine data analysis in fields such as ... integrating genomic solutions into mainstream healthcare. As a healthcare HPC engineer, you will join a dynamic development team…
- NVIDIA (Santa Clara, CA)
- NVIDIA is hiring engineers to scale up its AI Infrastructure. We expect you to have a strong programming background, knowledge of datacenter hardware, operations, ... and planning abilities. Experience working with High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, Infiniband, RoCE) are strongly…
- Meta (Menlo Park, CA)
- …operate in a multi-organization landscape. **Required Skills:** Technical Program Manager, AI Network Infra Responsibilities: 1. Lead technical program ... and AI operations initiatives supporting Meta's growing AI/HPC infrastructure for our Family of Apps ... matrix organization covering a range of areas (Data Center, Network, Hardware Systems, Infrastructure Engineering, Software…
- NVIDIA (Santa Clara, CA)
- …Familiarity with datacenter automation, advanced network protocols, and supporting large HPC or AI clusters in production environments. + Understanding of ..., or related field, or equivalent experience. + 8+ years of proven experience in AI/HPC Infrastructure. + Familiarity with AI/HPC job schedulers and…
- NVIDIA (Santa Clara, CA)
- …networking problems for scalable AI clusters. This is a hands-on network engineering position focused on the architecture, design, development and deployment ... We are seeking a highly skilled Principal Network Engineer to join our dynamic team to ... and deployment of global-scale DCs inter-connects and fabric for HPC, AI, and GPU computing clusters. +…
- NVIDIA (Santa Clara, CA)
- NVIDIA's AI Factories are built to accelerate AI and HPC workloads. At their core the Digital Twin (physics-based model used to design, validate, and operate ... be shaping the digital and physical foundation of NVIDIA's AI Factories, engineering virtual replicas that not ... to stand out from the crowd + Background in AI/HPC data center cooling, including immersion and…
- Deloitte (San Francisco, CA)
- …Maintain up-to-date knowledge of advancements in AI, cloud computing, network technologies, infrastructure automation, and trends in HPC infrastructure, ... and modern data centers, to enabling the adoption of AI or high-performance computing (HPC), you'll gain ... path that can evolve toward ML engineering, AI architecture, cloud adoption, network…
- Meta (Menlo Park, CA)
- …many aspects of the system from models and runtime all the way to the AI hardware, optimizing across compute, network and storage. The team invests significantly ... develop and help productionize high performance software & hardware technologies for AI at datacenter scale. We achieve this via concurrent design and optimization…
- Meta (Menlo Park, CA)
- **Summary:** In this role, you will be a member of the AI Networking Software team and part of the bigger DC networking organization. The team develops and owns the ... Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on…
- quadric.io, Inc (Burlingame, CA)
- …bridge between development engineering and hands-on users in the field. The AI Application Engineer will [1] integrate Quadric product and software stack into ... co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of ... + Bachelor's or Master's in Computer Science and/or Electronics Engineering field. + 5+ years experience with AI…
- Amazon (Cupertino, CA)
- …for the entire AI industry. You'll join a diverse AWS Hardware Engineering team of software, hardware, and network engineers, supply chain specialists, ... design, deliver, and operate next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing…
- Amazon (Cupertino, CA)
- …design, deliver, and operate next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing ... Do you want to shape the future of Generative AI at AWS? Join the team building the foundation ... You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers,…
- Amazon (Cupertino, CA)
- …and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. Utility Computing (UC) AWS Utility Computing (UC) ... Do you want to build the backbone of Generative AI cloud at AWS? Do you want to build ... You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers,…
- NVIDIA (Santa Clara, CA)
- NVIDIA's AI Factories are built to accelerate AI and HPC workloads. At their core the Digital Twin (physics-based model used to design, validate, and operate ... We Need to See: + Pursuing PhD in Mechanical Engineering, Thermal/Fluids Engineering, or similar area. + ... with CFD tools (ANSYS Fluent, Cadence, STAR-CCM+) and/or flow network modeling (Flownex). + Hands-on experience with lab equipment,…