  • Datacenter Engineering

    Amazon (Boardman, OR)
    …in our facilities operations organization in Eastern Oregon as a Data Center Engineering Operations (DCEO) Cluster Manager. A DCEO Cluster Manager ... years of relevant experience managing data center operations, facility engineering operations, critical environment facilities, datacenter build-outs and…
    Amazon (10/10/24)
  • Senior Solution Architect - AI cluster

    Lenovo (Morrisville, NC)
    Senior Solution Architect - AI cluster direction **General Information** Req # WD00074559 Career area: Software Engineering Country/Region: United States of ... the solution defined. **Location:** Morrisville, NC **What You'll Do** + Design AI/HPC cluster topology + Design and develop test software solutions for GPU …
    Lenovo (11/21/24)
  • Manager III, Tech Ops Eng

    Amazon (Canton, MS)
    …or relevant industry experience. 10+ years of relevant management experience in datacenter operations, facility engineering operations, information ... Description The Cluster Operations Manager is responsible for ... organizations are composed of two primary functions: Data Center engineering operations (DCEO). A physical security organization,…
    Amazon (10/22/24)
  • Member of Technical Staff, Pre-training Platform…

    Microsoft Corporation (Mountain View, CA)
    …AI needs to track closely all aspects of infrastructure including cluster acquisition, capacity deployment, large scale GPU buildout, high-speed fabric buildout ... to support and accelerate our model pre-training, post-training and fine-tuning operations. We are an interdisciplinary team of engineers and scientists, learning…
    Microsoft Corporation (11/07/24)
  • Senior Site Reliability Engineer - AI Research…

    NVIDIA (Santa Clara, CA)
    …include: + Design and implement state-of-the-art GPU compute clusters. + Optimize cluster operations for maximum reliability, efficiency, and performance. + ... NVIDIA is the leader in AI, machine learning and datacenter acceleration. NVIDIA is expanding that leadership into ... to see: + Bachelor's degree in Computer Science, Electrical Engineering or related field or equivalent experience with a…
    NVIDIA (09/25/24)
  • Senior Technical Program Manager - GPU Clusters

    NVIDIA (Santa Clara, CA)
    …continuous improvement, finding new opportunities across tooling, automation and processes to scale cluster operations and management + Guide a diverse set of ... to lead the strategy and execution of programs to support the bringup, operations and automation of GPU infrastructure. The GPU infrastructure we build and operate…
    NVIDIA (11/01/24)
  • Systems Analyst Sr Advisor- Hybrid Remote

    General Dynamics Information Technology (Eagan, MN)
    …a **Systems Analyst Sr. Advisor** resource joining our team to provide engineering support for Middleware platforms which support many key revenue generating, mail ... and provide remediation + Provide Tier 3 support for issues escalated from operations team + Work with vendors to troubleshoot and implement new solutions…
    General Dynamics Information Technology (11/23/24)
  • Senior System Software Engineer - Scientific…

    NVIDIA (Santa Clara, CA)
    …Training, Inference and Visualization workflow for physical science and engineering problems. Those applications include Weather prediction, Climate modeling, ... data driven scientific workflows + Design novel algorithms and actively engaged with operations to increase overall system performance, it spans across the stack e.g.…
    NVIDIA (09/10/24)