• Network Engineer, HPC

    Meta (Menlo Park, CA)
    …our network engineering teams is for you! **Required Skills:** Network Engineer, HPC Systems Network Strategy Responsibilities: 1. Design, ... you will be responsible for conceiving, developing, and deploying software, hardware and network systems and tools that improve reliability and efficiency in our…
    Meta (08/20/24)
  • Senior HPC Cloud Engineer

    Stanford University (Stanford, CA)
    Senior HPC Cloud Engineer **Doerr School of...drives new policy and technology solutions through a worldwide network of partners who work with our teams to ... help our school expand and scale our research computing (HPC) resource portfolio in the cloud to meet the...including distributed filesystems and an emphasis on object storage systems and data lifecycle management. + **Infrastructure as Code…
    Stanford University (09/29/24)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like...a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: …
    Meta (08/01/24)
  • Production Systems Engineer

    Meta (Menlo Park, CA)
    **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the ... and life cycle of servers in production. **Required Skills:** Production Systems Engineer, Sustaining Responsibilities: 1. Develop robust, industry leading…
    Meta (07/19/24)
  • Research Data Center Facility Engineer

    Stanford University (Stanford, CA)
    …researchers from a variety of Stanford and SLAC organizations. The majority of the HPC systems are hosted in the Stanford Research Computing Facilities (SRCF1 ... Research Data Center Facility Engineer **Business Affairs: University IT (UIT), Stanford, California,...Stanford Research Computing. Research Computing offers High Performance Computing (HPC) hosting services, computational and data systems,…
    Stanford University (09/24/24)
  • Software Engineer, SystemML - AI…

    Meta (Menlo Park, CA)
    …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network communication layer. And we are seeking engineers to work on the…
    Meta (09/04/24)
  • Software Engineer - Datacenter networking

    Meta (Menlo Park, CA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (08/15/24)
  • Software Engineer - Datacenter networking

    Meta (Menlo Park, CA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (07/19/24)
  • Field Application Engineer (Machine…

    quadric.io, Inc (Burlingame, CA)
    …battery operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the ... co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of...C++ DSP and control code. Role: The Field Application Engineer (FAE) will work closely with Business Development, Product,…
    quadric.io, Inc (08/06/24)
  • Principal Software Engineer, AI Platform…

    General Motors (Mountain View, CA)
    …This role will involve working across various areas, from enhancing underlying HPC infrastructure to optimizing Kubernetes and Kubeflow setups, as well as refining ... teams to understand requirements and implement solutions. + Troubleshoot complex HPC infrastructure issues and implement effective resolutions with partner team. +…
    General Motors (07/12/24)
  • Technical Lead/Manager - AI/ML Infrastructure…

    Cisco (San Jose, CA)
    …a leader and a technologist at heart, and come with: * Experience with Network Operating Systems and in System and Software Qualification * Diligent with ... and traffic generators (commercial & open-source) * Exposure to network operating systems, preferably SONiC * Knowledge...VXLAN, Segment Routing and/or MPLS * Exposure to RDMA, HPC networks * Knowledge of RoCE and InfiniBand…
    Cisco (09/12/24)