• AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like…
    Meta (08/01/24)
  • Network AI Hardware Engineer

    Meta (Menlo Park, CA)
    …data center applications and modern AI infrastructure. **Required Skills:** Network AI Hardware Engineer Responsibilities: 1. Architectural exploration ... **Summary:** Meta is hiring a Network Hardware Engineer within our ... modeling for architectural concepts and product delivery of network ASIC interconnects for AI Systems. 3.…
    Meta (10/25/24)
  • Software Engineer, SystemML - AI

    Meta (Menlo Park, CA)
    …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Enabling reliable ... this role, you will be a member of the AI Networking Software team and part of the bigger...Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and…
    Meta (10/18/24)
  • Technical Lead/Manager - AI/ML…

    Cisco (San Jose, CA)
    …agile team engaged in the design, development and execution of tests to qualify network performance for AI/ML capability. In this role you'll have opportunity ... to build the next generation infrastructure to meet the needs of AI/ML workloads and continuously increasing internet users and applications. We are uniquely…
    Cisco (09/12/24)
  • Principal Software Engineer, AI

    General Motors (Mountain View, CA)
    …This role will involve working across various areas, from enhancing underlying HPC infrastructure to optimizing Kubernetes and Kubeflow setups, as well as refining ... teams to understand requirements and implement solutions. + Troubleshoot complex HPC infrastructure issues and implement effective resolutions with partner team. +…
    General Motors (10/11/24)
  • Production Systems Engineer, Sustaining

    Meta (Menlo Park, CA)
    …hardware requirements and specifications (e.g., configuring hardware components, GPU, memory, network for AI/HPC workloads) **Public Compensation:** ... **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP)...Responsibilities: 1. Develop robust, industry leading practices for supporting AI/HPC infrastructure at scale 2. Interface with…
    Meta (10/24/24)
  • Technical Marketing Engineer

    Cisco (San Jose, CA)
    …and driven Technical Marketing Engineer to define, validate, and drive compute & AI performance on UCS. Who you are: You have over 4 years of expertise in ... and PowerTool, and have experience in benchmarking for compute, network, and AI/ML. You are flexible, enthusiastic,...range of products and solutions * Understand high-performance computing (HPC), GPU workloads, and other AI infrastructure…
    Cisco (10/07/24)