- NVIDIA (Santa Clara, CA)
- … analysis, optimization, and modeling to define the architecture and design of NVIDIA's DGX Cloud clusters. The ideal candidate will have a deep understanding of ... the methodology to conduct end to end performance analysis of critical AI applications running on large...will work closely with the multi-functional teams to define DGX Cloud cluster architecture for different CSPs,… more
- NVIDIA (Santa Clara, CA)
- … analysis, optimization, and modeling to define the architecture and design of NVIDIA's DGX Cloud clusters. The ideal candidate will have a deep understanding of ... the methodology to conduct end to end performance analysis of critical AI applications running on large...work closely with the cross-functional teams to define DGX Cloud cluster architecture for different CSPs,… more
- NVIDIA (Santa Clara, CA)
- Joining NVIDIA's DGX Cloud AI Efficiency Team means contributing to the infrastructure that powers our innovative AI research. This team focuses on optimizing ... Engineers to design and develop tools for AI application performance analysis. Your work will enable AI researchers to...to work efficiently with a wide variety of DGXC cloud AI systems as they seek out opportunities for… more
- NVIDIA (Santa Clara, CA)
- …highly motivated, creative engineer with strong experience in system software to join the DGX Cloud Software Team. You will lead the architecture, design and ... implementation of our next generation DGX cloud clusters using latest technologies. On...stack deployment including hardware architecture, workload orchestration and application performance tuning. Are you ready to change the next… more
- NVIDIA (Santa Clara, CA)
- …outstanding, passionate, and talented Senior AI Infrastructure Engineer to join our DGX Cloud group. This engineering role will design, build and maintain ... cloud enabling technologies like Kubernetes and OpenStack. DGX Cloud SRE at NVIDIA ensures that...AI training and Inferencing platform built on top of cloud infrastructure + Conduct in-depth performance characterization… more
- NVIDIA (Santa Clara, CA)
- …workload isolation, Zero Trust). + Ability to partner effectively across central security, and DGX Cloud teams. Ways To Stand Out From The Crowd: + Expertise ... who will design and implement security best practices for on-premise and cloud access, keeping in mind boundaries that securely enable NVIDIA business verticals… more
- NVIDIA (Santa Clara, CA)
- …possess expertise in different domains, such as storage architecture, high-performance distributed storage, data management, systems, networking, coding, database ... planning, continuous delivery and deployment, as well as open-source cloud-enabling technologies like Kubernetes, containers, and virtualization. Their responsibilities… more
- NVIDIA (Santa Clara, CA)
- NVIDIA is seeking a Senior Systems Software Engineer to build cloud-native platform software harnessing open-source container runtimes and Kubernetes. You will ... + Automate and optimize build, test, integration, and release pipelines for cloud-native services. + Diagnose and improve performance, reliability, and security… more
- NVIDIA (Santa Clara, CA)
- We are seeking a highly skilled Senior Network Automation Architect to design, implement, and oversee end-to-end automation frameworks for provisioning Baremetal and ... Kubernetes clusters across hybrid and multi-cloud environments. This role blends deep networking expertise with...logging, alerting, and self-healing workflows to improve resilience and performance. + Act as the technical authority for network… more
- NVIDIA (Santa Clara, CA)
- …+ Contribute to architecture, integration, and alignment with both on-prem and cloud-native platforms. + Optimize system performance and reliability through ... storage services! Services that will need to meet extreme performance and scalability demands! We have crafted a team...block storage solution for the world's first AI factory, cloud computing company. What you'll be doing: + 100%… more
- NVIDIA (Santa Clara, CA)
- …database, capacity management, continuous delivery and deployment and open source cloud enabling technologies like Kubernetes and OpenStack. SRE at NVIDIA ensures ... that our internal and external facing GPU cloud services run maximum reliability and uptime as promised...planning while keeping an eye on capacity, latency and performance. SRE is also a mindset and a set… more
- NVIDIA (Santa Clara, CA)
- NVIDIA is looking for a Senior Software Engineer in Object Storage to design, implement, and extend the capabilities of our internal object storage system. This ... - 10k+ nodes, exabytes of data + Analyzing and improving system performance at all levels + Automating storage infrastructure end-to-end including provisioning,… more
- NVIDIA (Santa Clara, CA)
- We are looking for a Senior AI Infrastructure Engineer (AI Tooling) to design and build the backend systems and infrastructure powering our internal AI tools and ... writing clear design documentation + Deep experience with Kubernetes and cloud-native infrastructure + Experience working with building, deploying, and maintaining… more
- NVIDIA (Santa Clara, CA)
- NVIDIA is seeking a Senior Software Engineer to build a worldwide network of fast, efficient, and reliable data transfer systems. The goal is to enable NVIDIA AI ... building high-scale distributed systems such as distributed databases, storage systems, or cloud services NVIDIA is leading the way in groundbreaking developments in… more
- NVIDIA (Santa Clara, CA)
- …and excellent communication and planning abilities. Experience working with High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, ... that automates GPU asset provisioning, configuration, and lifecycle management across cloud providers. You'll contribute to this platform to build end-to-end… more
- NVIDIA (Santa Clara, CA)
- …complex cluster configurations including Slurm and Kubernetes orchestrators for performance, scalability and resilience, ensuring they meet real-world customer ... networking Ways to stand out from the crowd: + Experience with C++, high-performance computing, Kubernetes and/or system administration would be an asset + Previous… more
- NVIDIA (Santa Clara, CA)
- …the most advanced storage services! Services that will need to meet extreme performance and scalability demands! We have crafted a team of extraordinary people ... related discipline (or equivalent experience). + 15+ years of experience as a senior developer, preferably in a storage company + Comprehension of large and… more
- NVIDIA (Santa Clara, CA)
- …a rapidly expanding ecosystem of data center platform designs. From single node HGX/DGX systems all the way up to large multi-node NVLink domain rack architectures. ... These designs have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. Each brings together the full power of NVIDIA GPUs, NVIDIA… more
- NVIDIA (Santa Clara, CA)
- …that measure and improve product and solutions adoption, ecosystem growth, and DGX software performance; use these metrics to communicate progress clearly ... is a small, dynamic, and highly motivated team behind DGX systems and DGX SuperPOD - the platforms...technology with product management or engineering experience on high-tech, cloud, AI/ML, technologies. + BS or MS in engineering,… more
- NVIDIA (Santa Clara, CA)
- …drive alignment, predictability, and accountability across AI platform initiatives. + Represent DGX Cloud programs to senior leadership, clearly articulating ... The DGX Cloud organization builds and operates...NVIDIA's research and product innovation by delivering a resilient, high-performance AI platform that seamlessly integrates hardware, orchestration, and… more