AI HPC Network Engineer Jobs in San Jose, CA

17 jobs (page 1)

AI / HPC Network…

Meta (Menlo Park, CA)

…fabric, host networking, communication libraries, and scheduling infrastructure. **Required Skills:** AI / HPC Network Engineer Responsibilities: 1. ... software, leveraging software defined networking principles. 14. Understanding of AI technologies and associated network technologies (IB/RDMA/RoCE) **Preferred… more

Meta (12/03/24)
AI / HPC Systems Performance…

Meta (Menlo Park, CA)

…fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Lead ... 5. Work with cross functional teams and provide guidance on the AI network architecture including topologies, transport, congestion control techniques. **Minimum… more

Meta (10/27/24)
AI / HPC Systems Performance…

Meta (Menlo Park, CA)

…fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like… more

Meta (10/31/24)
Senior Product Architect, HPC and AI

NVIDIA (Santa Clara, CA)

…harness your infrastructure expertise to create reference designs for the world's most powerful AI clusters. As an AI / HPC Product Architect at NVIDIA, you'll ... experience with benchmarking systems and analyzing performance bottlenecks in large-scale AI / HPC infrastructure + Exceptional communication skills, with the… more

NVIDIA (10/25/24)
Senior Software Architect, AI…

NVIDIA (Santa Clara, CA)

…be doing + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), new ... runtime designs, and new network hardware features. + Research, design and implement features for AI and HPC communication middleware (NCCL, Open MPI, UCX,… more

NVIDIA (10/29/24)
Sr. Software Development Engineer…

Amazon (Cupertino, CA)

…have extensive experience in low-latency networking and collective operations, such as HPC network fabric or machine learning accelerator cluster systems. Also ... Description We are seeking an experienced software engineer with low-level latency networking or interconnect expertise to optimize customer experience by designing… more

Amazon (12/20/24)
Senior Technical Marketing Engineer…

NVIDIA (Santa Clara, CA)

…how you can make a lasting impact on the world. As a Senior Technical Marketing Engineer for AI Infrastructure, you will join a dedicated team that is passionate ... equivalent experience. + 5+ years of experience. + Proficiency in Python and C++ for AI and HPC applications. + Experience using large scale multi node GPU… more

NVIDIA (12/03/24)
Software Engineer, SystemML - AI…

Meta (Menlo Park, CA)

…space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Enabling reliable ... this role, you will be a member of the AI Networking Software team and part of the bigger...Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and… more

Meta (12/20/24)
Sr. Hardware Dev Engineer (AWS Generative…

Amazon (Cupertino, CA)

…and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. AWS Infrastructure Services owns the design, planning, ... Do you want to build the backbone of Generative AI cloud at AWS? Do you want to build...You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers,… more

Amazon (12/24/24)
Production Systems Engineer, Sustaining

Meta (Menlo Park, CA)

…hardware requirements and specifications (e.g., configuring hardware components, GPU, memory, network for AI / HPC workloads) **Public Compensation:** ... **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP)...Responsibilities: 1. Develop robust, industry leading practices for supporting AI / HPC infrastructure at scale 2. Interface with… more

Meta (10/24/24)
AI Networking Software Developer

NVIDIA (Santa Clara, CA)

…and TensorFlow + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An...new runtime designs, and new network hardware features. What we need to see: +… more

NVIDIA (11/04/24)
Senior System Software Engineer, NCCL…

NVIDIA (Santa Clara, CA)

…test design + Experience working with engineering or academic research community supporting HPC or AI + Practical experience with high performance networking: ... runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner...applications. We are looking for a motivated Partner Enablement Engineer to guide our key partners and customers with… more

NVIDIA (10/22/24)
Senior Staff Engineer, TPU Machine…

Google (Sunnyvale, CA)

…AI accelerators, which are optimized for training and inference of large AI models. As a Software Engineer in the TPU Accelerator Software team, ... and hardware operations. + 5 years of experience building Network Architecture along with routing algorithms and topologies. +...+ 5 years of experience with High Performance Computing (HPC). + 3 years of experience working in a… more

Google (12/18/24)
Senior System Software Engineer…

NVIDIA (Santa Clara, CA)

…Ways to stand out from the crowd: + Have built, deployed and operated AI platforms on HPC clusters. Have built, deployed and operated cloud native system ... We are seeking a Sr System Software Engineer to help us build out our scientific...computing cloud platform enables Physics based Numerical Simulation Solvers, AI based Training, Inference and Visualization workflow for physical… more

NVIDIA (12/10/24)
Senior Linux Systems Engineer

NVIDIA (Santa Clara, CA)

…fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU ... lasting impact on the world. We are looking for a Senior Linux Software Engineer to join the NVIDIA Applied Systems Engineering group. The work environment is… more

NVIDIA (11/23/24)
Senior Software QA Test Development…

NVIDIA (Santa Clara, CA)

…GPU Computing. We are passionate about markets include gaming, automotive, vision, HPC, datacenters and networking in addition to our traditional OEM business. ... NVIDIA is also well positioned as the 'AI Computing Company', and NVIDIA GPUs are the brains...passion for automation. + Strong experience in FW, BMC/OpenBMC, Network protocol, internal/external enterprise storage devices, PCIe buses and… more

NVIDIA (12/17/24)
Principal Product Manager Tech, dbrown Team

Amazon (Sunnyvale, CA)

…digital transformation across several customer workloads including AI/ML, generative AI, databases, Big Data analytics, SAP, HPC, Edge, and more. ... across a range of EC2 products across compute, storage, network and accelerated computing. You will be responsible for...will help each team member develop into a better-rounded engineer and enable them to take on more complex… more

Amazon (11/16/24)
