• Network Engineer, HPC

    Meta (Menlo Park, CA)
    …our network engineering teams is for you! **Required Skills:** Network Engineer, HPC Systems Network Strategy Responsibilities: 1. Design, ... you will be responsible for conceiving, developing, and deploying software, hardware and network systems and tools that improve reliability and efficiency in our…
    Meta (08/20/24)
  • Senior HPC Cloud Engineer

    Stanford University (Stanford, CA)
    Senior HPC Cloud Engineer **Doerr School of...drives new policy and technology solutions through a worldwide network of partners who work with our teams to ... help our school expand and scale our research computing (HPC) resource portfolio in the cloud to meet the...including distributed filesystems and an emphasis on object storage systems and data lifecycle management. + **Infrastructure as Code…
    Stanford University (09/29/24)
  • Senior HPC Performance Engineer

    NVIDIA (Santa Clara, CA)
    …UCX for Deep Learning and HPC. We are looking for a motivated Performance engineer to influence the roadmap of our communication libraries. The DL and HPC ... are even higher at huge scales! This is an outstanding opportunity for someone with HPC and performance background to advance the state of the art in this space. Are…
    NVIDIA (10/24/24)
  • Software Development Engineer, HPC

    Amazon (Cupertino, CA)
    …experience in low-latency networking and collective operations, such as HPC network fabric or machine learning accelerator cluster systems. Also applicable ... networking or interconnect expertise to optimize customer experience by designing systems that enable scaling network-intensive workloads over thousands of…
    Amazon (11/12/24)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like...a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: …
    Meta (10/31/24)
  • AI/HPC Systems Performance…

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Lead ... deal with on a daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like GPUs together. In addition, we…
    Meta (10/27/24)
  • Senior Product Architect, HPC and AI

    NVIDIA (Santa Clara, CA)
    …and complex network topologies + Extensive experience with benchmarking systems and analyzing performance bottlenecks in large-scale AI/HPC infrastructure + ... to create reference designs for the world's most powerful AI clusters. As an AI/HPC Product Architect at NVIDIA, you'll be the linchpin in transforming ideas into…
    NVIDIA (10/25/24)
  • Senior Software Architect, AI and HPC

    NVIDIA (Santa Clara, CA)
    …environments, and system software to make current and future high-end computer systems more performant, scalable, and usable. As an NVIDIAN, you'll be immersed ... proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), new runtime designs, and new…
    NVIDIA (10/29/24)
  • Production Systems Engineer

    Meta (Menlo Park, CA)
    **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the ... and life cycle of servers in production. **Required Skills:** Production Systems Engineer, Sustaining Responsibilities: 1. Develop robust, industry leading…
    Meta (10/24/24)
  • Senior Network Engineer (Ashburn, VA)

    LinkedIn (Sunnyvale, CA)
    …networks. We develop tools and automate processes to support our hyper-growth. As a Senior Network Engineer, you'll play a pivotal role as a technical leader and ... competing priorities. In addition to leadership acumen, a successful Senior Network Engineer should demonstrate sufficient proficiency in both networking…
    LinkedIn (08/24/24)
  • Senior System Software Engineer, NCCL…

    NVIDIA (Santa Clara, CA)
    …We deliver communication runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer ... guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking...Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP,…
    NVIDIA (10/22/24)
  • Senior Linux Systems Engineer

    NVIDIA (Santa Clara, CA)
    …a lasting impact on the world. We are looking for a Senior Linux Software Engineer to join the NVIDIA Applied Systems Engineering group. The work environment is ... of environments. + Crafting, developing and enhancing components of the compute, network, storage security, and management software elements in support of HPC…
    NVIDIA (08/24/24)
  • Research Data Center Facility Engineer

    Stanford University (Stanford, CA)
    …researchers from a variety of Stanford and SLAC organizations. The majority of the HPC systems are hosted in the Stanford Research Computing Facilities (SRCF1 ... Research Data Center Facility Engineer **Business Affairs: University IT (UIT), Stanford, California,...Stanford Research Computing. Research Computing offers High Performance Computing (HPC) hosting services, computational and data systems,…
    Stanford University (09/24/24)
  • Senior Software Engineer, GPU…

    NVIDIA (Santa Clara, CA)
    …wave of artificial intelligence. We are looking for a highly motivated senior software engineer for an exciting role in our communication libraries and network ... crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep…
    NVIDIA (10/14/24)
  • Software Engineer, Engineering…

    NVIDIA (Santa Clara, CA)
    …Our collaborative team plays a critical role in NVIDIA's high performance computing (HPC) products. We build the Network Operating System software that powers ... NVIDIA is hiring a Software Engineer experienced in DevOps, build/release and software configuration...software-defined to meet the exploding growth in AI and HPC. What you'll be doing: + Managing, monitoring, and…
    NVIDIA (11/10/24)
  • Software Engineer, SystemML - AI…

    Meta (Menlo Park, CA)
    …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network communication layer. And we are seeking engineers to work on the…
    Meta (10/18/24)
  • Software Engineer - Datacenter networking

    Meta (Menlo Park, CA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (10/18/24)
  • Field Application Engineer (Machine…

    quadric.io, Inc (Burlingame, CA)
    …battery operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the ... co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of...C++ DSP and control code. Role: The Field Application Engineer (FAE) will work closely with Business Development, Product,…
    quadric.io, Inc (11/13/24)
  • Principal Software Engineer, AI Platform…

    General Motors (Mountain View, CA)
    …This role will involve working across various areas, from enhancing underlying HPC infrastructure to optimizing Kubernetes and Kubeflow setups, as well as refining ... teams to understand requirements and implement solutions. + Troubleshoot complex HPC infrastructure issues and implement effective resolutions with partner team. +…
    General Motors (10/11/24)
  • Technical Marketing Engineer, Server…

    Cisco (San Jose, CA)
    …servers and AI systems. What you'll do: As a Technical Marketing Engineer (TME) you will collaborate with engineering teams on product development and ... us as a highly motivated and driven Technical Marketing Engineer to define, validate, and drive compute & AI...Microsoft Hyper-V, and KVM. You have knowledge of storage systems, including SAN, NAS, and NVMe performance, and experience…
    Cisco (10/07/24)