  • Senior Manager, SRE - Infrastructure…

    NVIDIA (Santa Clara, CA)
    In this role, the Senior SRE manager will be leading the SRE functions with a diverse team of systems engineers in close collaboration with SRE teams to ... Lead and grow a team of systems engineers and SRE developing and maintaining key infrastructure services used across...across NVIDIA. + Transform teams of systems engineers into SRE teams. + Build roadmaps for the next generation… more
    NVIDIA (01/15/25)
  • US E-Consulting-CBO-CE-Cloud Native SRE

    Deloitte (San Jose, CA)
    …Job Summary: We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a strong ... you will do: + Monitoring & Performance Management using SRE principles: + Set up and manage monitoring tools...for your service. + Reduce MTTD & MTTR using SRE principles. + Scripting & Automation: + Develop and… more
    Deloitte (12/21/24)
  • Senior SRE

    LiveRamp (San Francisco, CA)
    …to build and maintain products operational documentation and setting up product SRE practices + Support Security and Compliance governance support in production ... environments + Work in close collaboration with SRE team members and Engineering organizations based in California,...and 5+ years of experience in the fields of SRE, DevOps or production engineering + Experience in Infrastructure… more
    LiveRamp (01/23/25)
  • Director, SRE

    LinkedIn (Mountain View, CA)
    …We are seeking a strategic, hands-on Director to lead the Grid and Streaming SRE team. In this leadership role, you will collaborate closely with development teams ... and monitoring platforms. Lead, mentor, and develop a high-performing team of SRE engineers specialized in data processing, compute and storage for LinkedIn's… more
    LinkedIn (01/22/25)
  • Principal Engineer - Core Engineering (Full Stack…

    Palo Alto Networks (Santa Clara, CA)
    …experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission ... target (for sales/commissioned roles) is expected to be between $147,000 and $237,500 per year. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here (http://benefits.paloaltonetworks.com/)… more
    Palo Alto Networks (01/22/25)
  • Sr Director, Production Operations (SASE, Access,…

    Palo Alto Networks (Santa Clara, CA)
    …all win with precision. **Your Career** We're looking for a Technical SRE Leader who has experience supporting large-scale distributed systems. Technology stack ... mindset to operations, monitoring, alerting and remediation. As an SRE Technical Leader, you will be responsible for the...SASE as well as Cloud-NGFW managed services and their SRE operations teams. You will be expected to lead… more
    Palo Alto Networks (12/13/24)
  • Staff Site Reliability Engineer

    Abbott (Pleasanton, CA)
    …solutions for our customers. You will be responsible for implementing SRE improvement processes, procedures and influencing change within the organization. You ... environment and have DevOps or formal test automation, load testing or SRE experience. You will need extensive technical knowledge in the development, delivery,… more
    Abbott (11/17/24)
  • Senior Production Engineer - Storage

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) is an engineering discipline that involves designing, building, and maintaining large-scale production systems with high ... software and systems engineering practices, storage, data management, and services. SRE professionals are highly specialized and possess expertise in different… more
    NVIDIA (01/19/25)
  • Software Engineer III, Site Reliability…

    Google (Mountain View, CA)
    …analyzing, and troubleshooting large-scale distributed systems. Site Reliability Engineering (SRE) combines software and systems engineering to build and run ... large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services-both our internally critical...customer's needs and a fast rate of improvement. Additionally, SREs will keep an ever-watchful eye on our systems… more
    Google (11/11/24)
  • Senior Manager - Storage Production Engineering

    NVIDIA (Santa Clara, CA)
    As a Sr Manager in Site Reliability Engineering (SRE), you will lead a team dedicated to the design, construction, and maintenance of expansive production systems, ... software and systems engineering, cloud-scale storage, data management, and services. SRE Senior Managers bring specialized expertise in areas such as systems,… more
    NVIDIA (12/12/24)
  • Senior Software Developer, Site Reliability…

    Google (San Francisco, CA)
    …+ Master's degree in Computer Science or Engineering. Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, ... massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services-both our internally critical...customer's needs and a fast rate of improvement. Additionally, SREs will keep an ever-watchful eye on our systems… more
    Google (01/04/25)
  • Sr. AWS Cloud & DevOps Architect - Remote

    McAfee, Inc. (San Jose, CA)
    …monitoring, logging, and alerting solutions to maintain system health and security. **SRE Leadership:** + Drive SRE practices by implementing strategies that ... and guide junior engineers in cloud architecture, DevOps, and SRE best practices. + Act as a subject matter...subject matter expert on AWS cloud solutions, DevOps, and SRE practices within the organization. **Documentation & Reporting:** +… more
    McAfee, Inc. (01/21/25)
  • Senior Site Reliability Engineer - Observability…

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... open source cloud enabling technologies like Kubernetes and OpenStack. SRE at NVIDIA ensures that our internal and external...while keeping an eye on capacity, latency and performance. SRE is also a mindset and a set of… more
    NVIDIA (01/23/25)
  • Site Reliability Engineer

    LiveRamp (San Francisco, CA)
    …to build and maintain products operational documentation and setting up product SRE practices + Support Security and Compliance governance support in production ... environments + Work in close collaboration with SRE team members and Engineering organizations based in California,...+ 3+ years of experience in the fields of SRE, DevOps or production engineering + Experience in Infrastructure… more
    LiveRamp (01/23/25)
  • Sr./Lead Site Reliability Engineer

    Federal Reserve Bank (San Francisco, CA)
    …scaling and operational consistency + Implement/leverage observability, monitoring, and SRE principles (e.g., error budgets, proactive incident management) to enhance ... + Guide engineering teams, fostering standard processes in cloud engineering, SRE, and automation + Adopt security standard processes within cloud infrastructure… more
    Federal Reserve Bank (01/01/25)
  • Sr Staff DevOps Engineer

    Palo Alto Networks (Santa Clara, CA)
    …ones needed for this role. **Your Impact** + Contribute to the success of SRE and DevOps + Develop expertise in new technologies + Work with developers, researchers, ... + Orchestrate end-to-end monitoring and alerting + Participate with SRE and Dev teams in the on-call rotation +...critical business and production issues + Mentor and champion SRE culture + Participate in design reviews **Your Experience**… more
    Palo Alto Networks (01/24/25)
  • Sr Staff Cloud Escalation Engineer (Networking)

    Palo Alto Networks (Santa Clara, CA)
    …assist in escalations. While this role is similar to a Site Reliability Engineer (SRE) and lives in the same organization, here you will provide more opportunities ... in engineering troubleshooting roles in fields like Support, QA, Dev and SRE for an Enterprise-sized product delivery + Knowledge/Understanding in scripting and… more
    Palo Alto Networks (01/16/25)
  • Principal Site Reliability Engineer (Cortex Cloud…

    Palo Alto Networks (Santa Clara, CA)
    …insights into our systems' performance and health. **Your Impact** As a Senior Staff SRE with the Cortex Cloud Security Posture Management team, you will: + Cloud ... incident and alerts management in Site Reliability Engineering + DevOps/SRE Expertise - 5+ years of experience as a... Expertise - 5+ years of experience as a DevOps/SRE engineer with a passion for technology and a… more
    Palo Alto Networks (01/14/25)
  • Senior Site Reliability Engineer, AI…

    NVIDIA (Santa Clara, CA)
    …and scalability across global public and private clouds. + Implement SRE fundamentals, including incident management, monitoring, and performance optimization, while ... or related field, or equivalent experience with 12+ years in Software Development, SRE, or Production Engineering. + Proficiency in Python and at least one other… more
    NVIDIA (01/11/25)
  • AI K8s Infrastructure Generalist

    NVIDIA (Santa Clara, CA)
    …There is an excellent opportunity to architect and drive advancements in the SRE automation on the largest NVIDIA GPU clusters in the cloud! Please apply ... doing: + As part of Maglev AI infrastructure and SRE team you will propose and craft new ways...crowd: + Previous experience with building sophisticated tooling and SRE automation on the large 100+ nodes GPU/CPU clusters… more
    NVIDIA (01/10/25)