  • AI / HPC Network

    Meta (Menlo Park, CA)
    … fabric, host networking, communication libraries, and scheduling infrastructure. **Required Skills:** AI / HPC Network Engineer Responsibilities: 1. ... software, leveraging software defined networking principles. 14. Understanding of AI technologies and associated network technologies (IB/RDMA/RoCE) **Preferred… more
    Meta (12/03/24)
  • Solutions Architect - HPC AI

    NVIDIA (CA)
    …can make a lasting impact on the world. NVIDIA Infrastructure Specialists team seeks an HPC / AI Infiniband Network Engineer to help customers realize ... to the world's largest and most sophisticated data centers and supercomputers. HPC Network Engineers deliver the technologies, solutions and services customers… more
    NVIDIA (01/09/25)
  • AI / HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Lead ... 5. Work with cross functional teams and provide guidance on the AI network architecture including topologies, transport, congestion control techniques. **Minimum… more
    Meta (10/27/24)
  • AI / HPC Systems Performance…

    Meta (Austin, TX)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like… more
    Meta (12/03/24)
  • AI Infrastructure Engineer

    Cisco (Research Triangle Park, NC)
    …and communicate advanced technical concepts. A talented and passionate engineer comfortable working in high-pressure, large-scale enterprise environments. What You ... and managing the internal NVIDIA DGX and Cisco-UCS based AI platforms at Cisco. You will provide leadership in...* 7+ years of previous experience deploying and administrating HPC clusters * Familiar with GPU resource scheduling managers… more
    Cisco (11/17/24)
  • Senior Product Architect, HPC and AI

    NVIDIA (Santa Clara, CA)
    …harness your infrastructure expertise to create reference designs for the world's most powerful AI clusters. As an AI / HPC Product Architect at NVIDIA, you'll ... experience with benchmarking systems and analyzing performance bottlenecks in large-scale AI / HPC infrastructure + Exceptional communication skills, with the… more
    NVIDIA (10/25/24)
  • Senior Software Architect, AI

    NVIDIA (Santa Clara, CA)
    …be doing + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), new ... runtime designs, and new network hardware features. + Research, design and implement features for AI and HPC communication middleware (NCCL, Open MPI, UCX,… more
    NVIDIA (10/29/24)
  • Senior HPC Systems Engineer

    General Dynamics Information Technology (Fairfax, VA)
    …Description:** At GDIT, people are our differentiator. Our work depends on a Senior HPC Systems Engineer joining our team to support the National Oceanic and ... Obtain:** None **Job Family:** Systems Engineering **Skills:** High-Performance Computing (HPC) Systems, Linux System Administration, Systems Management **Certifications:** None - N/A… more
    General Dynamics Information Technology (01/11/25)
  • Sr. Software Development Engineer

    Amazon (Cupertino, CA)
    …have extensive experience in low-latency networking and collective operations, such as HPC network fabric or machine learning accelerator cluster systems. Also ... Description We are seeking an experienced software engineer with low-level latency networking or interconnect expertise to optimize customer experience by designing… more
    Amazon (12/20/24)
  • HPC Systems Administrator

    The MITRE Corporation (McLean, VA)
    …Technology Division provides multiple corporate-wide services including High Performance Computing (HPC), Enterprise PC and Mobile Solutions, Network Services, ... organizations. Job Description: We are seeking an experienced Linux HPC Systems engineer to join our team!...intelligence (AI), and advanced computing to deliver AI and HPC services to MITRE organization… more
    The MITRE Corporation (01/01/25)
  • Senior Software Engineer - AI

    Bloomberg (New York, NY)
    …maintaining system software that enables communication between GPUs, CPUs, and storage in scale-out AI and HPC systems. This role will also be responsible for ... **The Role:** We are seeking an engineer to join our hardware management team. This...overseeing the ongoing monitoring, support, and maintenance of our HPC / AI clusters, ensuring peak performance and reliability.… more
    Bloomberg (01/11/25)
  • Senior Technical Marketing Engineer

    NVIDIA (Santa Clara, CA)
    …how you can make a lasting impact on the world. As a Senior Technical Marketing Engineer for AI Infrastructure, you will join a dedicated team that is passionate ... equivalent experience. + 5+ years of experience. + Proficiency in Python and C++ for AI and HPC applications. + Experience using large scale multi node GPU… more
    NVIDIA (12/03/24)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Enabling reliable ... this role, you will be a member of the AI Networking Software team and part of the bigger...Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and… more
    Meta (12/20/24)
  • Sr. Hardware Dev Engineer (AWS Generative…

    Amazon (Cupertino, CA)
    …and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. AWS Infrastructure Services owns the design, planning, ... Do you want to build the backbone of Generative AI cloud at AWS? Do you want to build...You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers,… more
    Amazon (12/24/24)
  • Production Systems Engineer, Sustaining

    Meta (Austin, TX)
    …hardware requirements and specifications (e.g., configuring hardware components, GPU, memory, network for AI / HPC workloads) **Public Compensation:** ... **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP)...Responsibilities: 1. Develop robust, industry leading practices for supporting AI / HPC infrastructure at scale 2. Interface with… more
    Meta (01/07/25)
  • Site Reliability Engineer

    Cisco (Research Triangle Park, NC)
    …(AWS, GCP, Azure). * Familiarity with network performance tuning in HPC environments and large-scale AI workloads. * Familiarity with DevOps practices ... We are seeking a highly skilled and experienced Senior Engineer to join our team, focusing on the design...roles. * 3+ years of experience in high-performance computing (HPC) or AI/ML environments. Preferred Qualifications: *… more
    Cisco (11/14/24)
  • AI Networking Software Developer

    NVIDIA (Santa Clara, CA)
    …and TensorFlow + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An...new runtime designs, and new network hardware features. What we need to see: +… more
    NVIDIA (11/04/24)
  • Senior System Software Engineer, NCCL…

    NVIDIA (Santa Clara, CA)
    …test design + Experience working with engineering or academic research community supporting HPC or AI + Practical experience with high performance networking: ... runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer to guide our key partners and customers with… more
    NVIDIA (10/22/24)
  • Integrated Circuit Design Engineer

    Actalent (Fort Collins, CO)
    …team developing high-performance package designs for ASICs for artificial intelligence (AI), networking, high-performance computing (HPC), and 5G base stations. ... Job Title: Integrated Circuit Design Engineer Job Description We are seeking an experienced...to develop high-performance package designs for ASICs used in AI, networking, HPC, and 5G base stations.… more
    Actalent (01/09/25)
  • Software Engineer - XR Codec Interactions…

    Meta (Pittsburgh, PA)
    …in full-body interactive avatars, social AI for codec avatars, and generative AI for codec avatars. **Required Skills:** Software Engineer - XR Codec ... with working on the frontiers of research. In this software engineer role on the XRCIA Compute team, you will...cause analysis through multiple data engineering layers (compute, storage, network) for GPU clusters and act as a final… more
    Meta (12/19/24)