• Network Engineer, HPC

    Meta (Menlo Park, CA)
    …our network engineering teams is for you! **Required Skills:** Network Engineer, HPC Systems Network Strategy Responsibilities: 1. Design, ... you will be responsible for conceiving, developing, and deploying software, hardware and network systems and tools that improve reliability and efficiency in our…
    Meta (08/20/24)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like ... a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: …
    Meta (10/31/24)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Lead ... deal with on a daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like GPUs together. In addition, we…
    Meta (10/27/24)
  • Production Systems Engineer

    Meta (Menlo Park, CA)
    **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the ... and life cycle of servers in production. **Required Skills:** Production Systems Engineer, Sustaining Responsibilities: 1. Develop robust, industry leading…
    Meta (10/24/24)
  • Software Engineer, SystemML - AI…

    Meta (Menlo Park, CA)
    …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... (e.g. Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network communication layer. And we are seeking engineers to work on the…
    Meta (10/18/24)
  • Software Engineer - Datacenter networking

    Meta (Menlo Park, CA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (10/18/24)
  • Principal Software Engineer, AI Platform…

    General Motors (Mountain View, CA)
    …This role will involve working across various areas, from enhancing underlying HPC infrastructure to optimizing Kubernetes and Kubeflow setups, as well as refining ... teams to understand requirements and implement solutions. + Troubleshoot complex HPC infrastructure issues and implement effective resolutions with partner team. +…
    General Motors (10/11/24)
  • Technical Marketing Engineer, Server…

    Cisco (San Jose, CA)
    …servers and AI systems. What you'll do: As a Technical Marketing Engineer (TME) you will collaborate with engineering teams on product development and ... us as a highly motivated and driven Technical Marketing Engineer to define, validate, and drive compute & AI ... Microsoft Hyper-V, and KVM. You have knowledge of storage systems, including SAN, NAS, and NVMe performance, and experience…
    Cisco (10/07/24)
  • Tech Lead - AI/ML Infrastructure Engineer

    Cisco (San Jose, CA)
    …if you consider yourself a technologist at heart and with: * Experience with Network Operating Systems and in System and Software Qualification * Diligent with ... and traffic generators (commercial & open-source) * Exposure to network operating systems, preferably SONiC * Knowledge ... VXLAN, Segment Routing and/or MPLS * Exposure to RDMA, HPC networks * Knowledge of RoCE and InfiniBand…
    Cisco (11/12/24)