• Software Engineer, SystemML - AI…

    Meta (Seattle, WA)
    …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network communication layer. And we are seeking engineers to work on the…
    Meta (11/08/24)
  • Software Engineer - Datacenter networking

    Meta (Bellevue, WA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (10/18/24)
  • Sr. Hardware Dev Engineer (AWS Generative…

    Amazon (Seattle, WA)
    …cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. AWS Infrastructure Services owns the design, planning, delivery, and ... to help. You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital…
    Amazon (10/12/24)
  • Systems Development Eng (AWS Generative AI…

    Amazon (Seattle, WA)
    …SRE (Site Reliability Engineering), or Resilience Engineering - 5+ years of SysDE (Systems Development Engineer) or equivalent experience - 5+ years of server ... that enable high performance and scalability in AI/ML and HPC workloads. You are intrigued by the continuous release ... have tremendous interest in cloud scale and curious how systems and software decisions impact the user. You insist…
    Amazon (10/18/24)