- Amazon (Boardman, OR)
- …in our facilities operations organization in Eastern Oregon as a Data Center Engineering Operations (DCEO) Cluster Manager. A DCEO Cluster Manager ... years of relevant experience managing data center operations, facility engineering operations, critical environment facilities, datacenter build-outs and…
- Lenovo (Morrisville, NC)
- Senior Solution Architect - AI cluster direction **General Information** Req # WD00074559 Career area: Software Engineering Country/Region: United States of ... the solution defined. **Location:** Morrisville, NC **What You'll Do** + Design AI/HPC cluster topology + Design and develop test software solutions for GPU …
- Amazon (Canton, MS)
- …or relevant industry experience. 10+ years of relevant management experience in datacenter operations, facility engineering operations, information ... Description The Cluster Operations Manager is responsible for...organizations are composed of two primary functions: Data Center engineering operations (DCEO). A physical security organization,…
- Microsoft Corporation (Mountain View, CA)
- …AI needs to track closely all aspects of infrastructure including cluster acquisition, capacity deployment, large scale GPU buildout, high-speed fabric buildout ... to support and accelerate our model pre-training, post-training and fine-tuning operations. We are an interdisciplinary team of engineers and scientists, learning…
- NVIDIA (Santa Clara, CA)
- …include: + Design and implement state-of-the-art GPU compute clusters. + Optimize cluster operations for maximum reliability, efficiency, and performance. + ... NVIDIA is the leader in AI, machine learning and datacenter acceleration. NVIDIA is expanding that leadership into ...to see: + Bachelor's degree in Computer Science, Electrical Engineering or related field or equivalent experience with a…
- NVIDIA (Santa Clara, CA)
- …continuous improvement, finding new opportunities across tooling, automation and processes to scale cluster operations and management + Guide a diverse set of ... to lead the strategy and execution of programs to support the bringup, operations and automation of GPU infrastructure. The GPU infrastructure we build and operate…
- General Dynamics Information Technology (Eagan, MN)
- …a **Systems Analyst Sr. Advisor** resource joining our team to provide engineering support for Middleware platforms which support many key revenue generating, mail ... and provide remediation + Provide Tier 3 support for issues escalated from operations team + Work with vendors to troubleshoot and implement new solutions…
- NVIDIA (Santa Clara, CA)
- …Training, Inference and Visualization workflow for physical science and engineering problems. Those applications include Weather prediction, Climate modeling, ... data driven scientific workflows + Design novel algorithms and actively engaged with operations to increase overall system performance, it spans across the stack eg…