• Cluster Deployment Operations

    NVIDIA (Santa Clara, CA)
    …the first people to make them operational in production? We are seeking a dedicated Cluster Deployment Operations Engineer to support product deployments ... team, acting as the link between engineering and the NVIS field team for cluster deployment and management solutions! We bridge the gap between product roadmaps… more
    NVIDIA (12/18/25)
  • Senior Software Engineer - Market Data…

    Bloomberg (New York, NY)
    Senior Software Engineer - Market Data Platform, Cluster Management Location New York Business Area Engineering and CTO Ref # 10046371 **Description & ... in it for you:** As a Market Data Platform engineer, you will: + Get hands-on experience working on...for monitoring load and latency. Our platform enables self-service operations and supports Incident Response. + Design - We… more
    Bloomberg (11/15/25)
  • Principal/Sr Principal HPC Systems Engineer

    Northrop Grumman (Jessup, MD)
    …+ Oversee design, deployment, and lifecycle operation of a high-performance compute cluster + Lead team of HPC Systems Administrators + Assess and respond to ... and risks by performing trade studies of technological function, value proposition, and deployment timeline + Assess and report on cluster operational risks and… more
    Northrop Grumman (12/05/25)
  • Senior Cloud Engineer - Kubernetes

    TP-Link North America, Inc. (Irvine, CA)
    …lifestyle. OVERVIEW We are looking for an experienced Senior Cloud Engineer specializing in Kubernetes development and enhancing underlying system capabilities to ... the underlying architecture, integrating opensource solutions, and improving Kubernetes cluster capabilities. You will directly contribute to the development of… more
    TP-Link North America, Inc. (10/16/25)
  • Principal, Software Engineer - Cloud…

    Walmart (Sunnyvale, CA)
    …remediation. **Automation & Observability** + Build and standardize automation for cluster deployment, expansion, and monitoring using Ansible, Terraform, and ... **Position Summary** We are seeking a highly skilled Principal Engineer (Ceph/Scale-Out Storage) with 10+ years of deep technical experience in distributed storage… more
    Walmart (11/20/25)
  • Senior Enterprise Cloud Operations

    Mastercard (O'Fallon, MO)
    …governments realize their greatest potential._ **Title and Summary** Senior Enterprise Cloud Operations Engineer Job Summary We are seeking an experienced Senior ... Cloud Engineer to join our Cloud Operations team....candidate should be able to automate daily activities, support deployment pipelines, develop observability for the cloud platforms. This… more
    Mastercard (11/19/25)
  • MTS V, Network Engineer - PaaS

    Panasonic Avionics Corporation (Beaverton, OR)
    …**Responsibilities** **The Position:** The K3s Network Engineer will focus on **networking for K3s ... ARM, accelerators). The role involves designing, implementing, and maintaining cluster networking that integrates with external systems. This includes **writing… more
    Panasonic Avionics Corporation (10/07/25)
  • Site Reliability Engineer, GNC (Falcon)

    SpaceX (Hawthorne, CA)
    Site Reliability Engineer, GNC (Falcon) Hawthorne, CA Apply SpaceX was founded under the belief that a future where humanity is out exploring the stars is ... goal of enabling human life on Mars. SITE RELIABILITY ENGINEER, GNC (FALCON) SpaceX is looking for a Site...and vehicle simulation and participates in recurring mission-critical launch operations. This position will work with the GNC team… more
    SpaceX (10/16/25)
  • Container as a Service Engineer - OpenShift

    Truist (Charlotte, NC)
    …introducing new capabilities. This includes improving/developing automation for cluster installation, system upgrades, patch management/compliance, and monitoring ... of the overall environment. Other important aspects of this role will be cluster capacity management and providing level two operational support. CAAS support… more
    Truist (12/11/25)
  • DevOps Engineer (Onsite)

    Cognizant (Bridgewater, NJ)
    …Develop automation scripts in Shell and Python + Build internal tools to streamline cluster operations and observability. **Work model:** At Cognizant, we strive to ... As a **DevOps Engineer** you will make an impact by administering...Retail team. **In this role, you will:** + Perform cluster lifecycle operations including upgrades, patching, node… more
    Cognizant (12/09/25)
  • Regional Chief Engineer, Operations

    Amazon (Mesa, AZ)
    …Regional Chief Engineer (CE) to join our Data Center Engineering Operations (DCEO) Team. This committed group works to maintain the critical physical ... software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You'll collaborate with people… more
    Amazon (12/26/25)
  • AI Senior Staff Systems Engineer

    Cadence Design Systems, Inc. (San Jose, CA)
    …world of technology. We are seeking a highly skilled and experienced AI Systems Engineer to join our team. This is a hands-on, senior individual contributor role ... that will be pivotal in leading the development, operations, and support of our entire AI infrastructure. You...services on both GCP and Azure. + Hands-on GPU Cluster Management: Take a leadership role in the configuration,… more
    Cadence Design Systems, Inc. (12/29/25)
  • Senior GPU and HPC Infrastructure Engineer

    NVIDIA (Santa Clara, CA)
    operations, and networking, familiarity with software testing and deployment, familiarity with distributed systems, and excellent communication and planning ... management systems (Kubernetes, SLURM). Hands-on experience in Machine Learning Operations. Hands-on experience with Bright Cluster Manager. + Hands-on… more
    NVIDIA (10/09/25)
  • Systems Engineer - Platform - active TS/SCI…

    V2X (Springfield, VA)
    …Google GKE, with demonstrated proficiency in the following container areas: cluster management, deployment and automation, monitoring and logging, security,… more
    V2X (12/12/25)
  • Platform System Engineer (Kubernetes)

    CACI International (Fort Bragg, NC)
    …experience. Extensive experience with Kubernetes: You should be comfortable with cluster management, deployment strategies, and general Kubernetes concepts. ... Platform System Engineer (Kubernetes) Job Category: Engineering Time Type: Full...systems. You'll work at the intersection of development and operations, focusing on automation and tooling to improve the… more
    CACI International (10/09/25)
  • Big Data Support Engineer - Assistant Vice…

    Citigroup (Irving, TX)
    …that have completed the development stage and are running in the daily operations of the firm. + Manages, maintains and supports applications and their operating ... requirements. + Participate in application releases, from development, testing and deployment into production. + Engages in post implementation analysis to ensure… more
    Citigroup (12/30/25)
  • Senior MLOps Engineer, GenAI Framework

    NVIDIA (Santa Clara, CA)
    …Artifactory, Jira) in hybrid on-premise and cloud environments. + Assist with cluster operations and system administration (managing: servers, team accounts, ... dedicated and motivated senior build and continuous integration (CI/CD) engineer for its GenAI Frameworks (Megatron-LM (https://github.com/NVIDIA/Megatron-LM) and NeMo… more
    NVIDIA (10/15/25)
  • Senior BizOps Engineer

    Mastercard (O'Fallon, MO)
    …to design, build, implement, and support technology services. A business operations engineer will ensure operational criteria like system availability, ... Operations (BizOps) team is seeking a Business Operations Site Reliability Engineer (SRE). The role...capacity, performance, monitoring, self-healing, and deployment automation are implemented throughout the… more
    Mastercard (10/08/25)
  • Senior Software Development Engineer

    NVIDIA (Santa Clara, CA)
    We are looking for a Senior Software Development Engineer in Test (SDET) to join our New GPU Integration (NPI) team for NVIDIA's Enterprise Compute SWQA team. Are you ... to have your skills on the team! As an engineer on this New Platform GPU Integration team, you...tools to significantly enhance our testing capabilities and streamline operations for more efficient and accurate results. + Improve… more
    NVIDIA (12/12/25)
  • Cloud Engineer 3

    Huntington Ingalls Industries (Suffolk, VA)
    …Our capabilities range from C5ISR, AI and Big Data, cyber operations and synthetic training environments to fleet sustainment, environmental remediation and ... Job Description Mission Technologies, a division of HII, is seeking a Cloud Engineer 3 to support the Joint Training Synthetic Environment (JTSE) Joint Staff J7(JS… more
    Huntington Ingalls Industries (11/07/25)