• The Voleon Group (Berkeley, CA)
    …a multibillion‑dollar asset manager, and we have ambitious goals for the future. As a Senior Cluster Site Reliability Engineer (SRE), you will help ... scale our research compute cluster to meet our growing needs, and you will ... leverage engineering skills to ensure high degrees of uptime, reliability, and robustness. Our research clusters are at the… more
    job goal (01/12/26)
  • The Voleon Group (Berkeley, CA)
    A leading technology firm in Berkeley is seeking a Senior Cluster Site Reliability Engineer to ensure high uptime and manage operational issues for their ... research compute cluster. Candidates should have extensive SRE experience, knowledge of HPC frameworks, and scripting skills. The role emphasizes collaboration with… more
    job goal (01/13/26)
  • Lawrence Berkeley National Laboratory (Berkeley, CA)
    …Lab's (LBNL) Information Technology Division (IT) has an opening for a Senior HPC Cluster Systems Administrator to join their ScienceIT Team! In this ... by building, integrating, and maintaining Linux-based resources, high-performance computing cluster systems, and Kubernetes clusters. This role provides extensive… more
    job goal (01/13/26)
  • NVIDIA Corporation (Santa Clara, CA)
    …take great pride in providing excellent, comprehensive support to our customers! Sr Site Reliability Engineer in this role will significantly impact and ... experience in Computer Science or related field. 8+ years of experience in site reliability engineering and/or software development roles. Fluency in Python… more
    job goal (01/13/26)
  • Fluidstack (San Francisco, CA)
    …regions". Building internal tooling to decrease deployment time and increase cluster reliability, including automation where the customer benefits clearly ... join us in building what's next. About the Role Senior / Staff SREs at Fluidstack sit at the ... working across software, hardware, and operations to ensure the reliability and performance of our global GPU cloud. They… more
    job goal (01/12/26)
  • Pantera Capital (Palo Alto, CA)
    …knowledge with their teammates. About the Role We are seeking a highly skilled Senior Site Reliability Storage Engineer to join our mission-driven team, ... with up to 25% travel required. Required Qualifications 5+ years of experience as a Site Reliability Engineer or similar role, with a focus on building and… more
    job goal (01/13/26)
  • Boson AI (Palo Alto, CA)
    About The Role We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around-our Toronto datacenter packed ... as we continue to scale. Responsibilities Manage and optimize HPC cluster operations Deploy and maintain infrastructure‑as‑code solutions Support ML/research teams… more
    job goal (01/13/26)
  • Google Inc. (Sunnyvale, CA)
    …the ability to build consensus across organizational boundaries. About the job Site Reliability Engineering (SRE) combines software and systems engineering to ... Senior Staff Software Engineer, SRE, ML Fleet Systems...Understanding of resource management systems (eg, Borg, Kubernetes, Flex), cluster management, and scheduling algorithms. Familiarity with Machine Learning… more
    job goal (01/13/26)
  • Pantera Capital (Palo Alto, CA)
    A leading tech firm in Palo Alto is seeking a Senior Site Reliability Storage Engineer to design and optimize Kubernetes clusters. The ideal candidate will ... Kubernetes orchestration and distributed systems. Responsibilities include managing system reliability, developing software for cluster provisioning, and… more
    job goal (01/13/26)
  • Invisible Technologies, Inc. (Olympia, WA)
    Senior Software Engineer, Forward Deployed (US Public Sector) Washington DC-Baltimore - Hybrid About Invisible Invisible Technologies makes AI work. Our end-to-end ... the enterprise and to advance our platform technology. About The Role As a Senior Forward Deployed Engineer (FDE) for our US Public Sector team at Invisible, you'll… more
    job goal (01/13/26)
  • Roberts Recruiting, LLC (Boston, MA)
    …migrations, orchestration of ML utility services Optimize applications for reliability and scalability, addressing complex technical challenges Add monitoring, ... React and Python, hosted in a scalable and distributed multi‑cluster Kubernetes environment. We don't expect everyone to have ... AI technologies and methods Hybrid work - remote and on‑site co‑working space in Downtown Boston Paid lunch at… more
    job goal (01/13/26)
  • Teamblind, Inc. (New York, NY)
    …Salary Range $135,000.00 - $175,000.00 Etsy's Services Infrastructure group is looking for a Site Reliability Engineer II to join us in our mission of building ... SRE you will drive the adoption of containers and Kubernetes, improve reliability, automating the operations and providing a self-service runtime platform to… more
    job goal (01/13/26)
  • Absolute Business Solutions Corp. (Bethesda, MD)
    …adaptive team that values innovation, collaboration, and professional development. As a Senior Elasticsearch Engineer, you will play a pivotal role in designing, ... search needs. You will collaborate with cross-functional teams to ensure the reliability, performance, and scalability of our Elasticsearch clusters. While most work… more
    job goal (01/12/26)
  • Emerald AI, Inc. (Boston, MA)
    Location Bay Area, Boston, Washington DC Employment Type Full time Location Type On-site About Emerald AI We're at a pivotal moment for AI and energy. Demand for ... We're looking for talent ranging from mid-level to highly seasoned senior contributors with expertise in backend systems. Key Responsibilities Build scalable… more
    job goal (01/13/26)
  • Senior Site Reliability

    Insight Global (Blue Ash, OH)
    Job Description An employer is looking for a Senior SRE in the Cincinnati area for an onsite role to support and enhance our on‑premises infrastructure ecosystem. ... clusters using Rancher, automating configuration with Ansible, and ensuring reliability across self‑managed environments. You'll also provide technical leadership… more
    Insight Global (01/09/26)
  • Senior Platform Engineer/Kubernetes SME

    KBR (Beavercreek, OH)
    Title: Senior Platform Engineer/Kubernetes SME Belong. Connect. Grow. with KBR! KBR's National Security Solutions team provides high-end engineering and advanced ... the future of space defense. Role summary KBR is seeking a highly experienced Senior Platform Engineer to join our team in Beavercreek, OH. The ideal candidate will… more
    KBR (01/08/26)
  • Senior SRE Engineer

    Realtor (Austin, TX)
    …and building confidence through expert guidance. About the Role We are seeking a Senior Site Reliability Engineer to join our newly formed Operations ... our platform infrastructure serving millions of users. As a Senior SRE, you will be a strong technical contributor ... You'll Bring Experience & Expertise + 5+ years in Site Reliability Engineering, DevOps, or Infrastructure Engineering… more
    Realtor (11/25/25)
  • AI Senior Staff Systems Engineer

    Cadence Design Systems, Inc. (San Jose, CA)
    …experienced AI Systems Engineer to join our team. This is a hands-on, senior individual contributor role that will be pivotal in leading the development, operations, ... solutions, and networking to ensure optimal performance, scalability, and reliability for all our AI workloads. + Cloud AI...services on both GCP and Azure. + Hands-on GPU Cluster Management: Take a leadership role in the configuration,… more
    Cadence Design Systems, Inc. (12/29/25)
  • Senior Operations & Maintenance Support

    Applied Research Solutions (Washington, DC)
    …in a government environment. This role combines Tier 1 and Tier 2 Site Reliability Engineering (SRE) support responsibilities with deep technical expertise in ... Description We are seeking a Senior Operations & Maintenance Support Specialist to provide ... and containerized application support + SRE Practices: Understanding of Site Reliability Engineering principles including monitoring, incident… more
    Applied Research Solutions (01/13/26)
  • Senior BizOps Engineer

    Mastercard (O'Fallon, MO)
    …everything you can? The Business Operations (BizOps) team is seeking a Business Operations Site Reliability Engineer (SRE). The role of BizOps is to be the ... alerting strategy and create the framework to achieve zero downtime during deployment. Site Reliability Engineering: * Serve as the primary contact responsible… more
    Mastercard (01/06/26)