- NVIDIA (Santa Clara, CA)
- …and drive foundational improvements and automation to improve researchers' productivity. As a Site Reliability Engineer, you are responsible for the big ... and operating large scale compute infrastructure + Proven experience in site reliability engineering for high-performance computing environments with operational…
- NVIDIA (Santa Clara, CA)
- …of our global scale PaaS and SaaS offering! We are seeking a highly motivated Senior Site Reliability Engineer to join our Omniverse Infrastructure ... also make feature development faster and safer. As a Senior Omniverse Cloud SRE, you will architect solutions to...working for us. Are you a creative and autonomous Site Reliability Engineer, who loves…
- Palo Alto Networks (Santa Clara, CA)
- …environment where we all win with precision. **Your Career** We are looking for an exceptional Sr Site Reliability Engineer to enhance our ATP Infra ... tools, and processes that will ensure the highest levels of availability and reliability of all our applications. We need creative and innovative problem solvers who…
- Palo Alto Networks (Santa Clara, CA)
- …and Alerts Management - Clear understanding of incident and alerts management in Site Reliability Engineering + DevOps/SRE Expertise - 5+ years of experience ... our systems' performance and health. **Your Impact** As a Senior Staff SRE with the Cortex Observability team, you...influence the operability of the product and ensure the reliability and availability of our services **Your Experience** +…
- Palo Alto Networks (Santa Clara, CA)
- …Alto Networks runs a large infrastructure and is one of the largest GCP customers. As a Senior Staff DevOps Engineer for the CDL/SLS team, you will be part of a ... This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab…
- NVIDIA (Santa Clara, CA)
- …of study. + 12+ years of experience in Software Development and/or Site Reliability Engineering/Production Engineering. + Strong software development using ... make a lasting impact on the world. As a Sr Staff Engineer, you will drive the...operating our existing infrastructure to the highest level of reliability and security. You will work side by side…
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Kubernetes clusters with focus…
- NVIDIA (Santa Clara, CA)
- …intelligence. We are seeking a highly skilled and experienced Staff Software Engineer to lead the design, deployment, and management of our large-scale GPU ... clusters. These clusters will power AI workloads across multiple teams and projects, making a significant impact on the future of machine learning and artificial intelligence at NVIDIA. Join our engineering team and collaborate with researchers, AI engineers,…
- Palo Alto Networks (Santa Clara, CA)
- …as our support teams to assist in escalations. While this role is similar to a Site Reliability Engineer (SRE) and lives in the same organization, here you ... in the digital age. This team utilizes the experience of senior network and security engineers to build environments where...our customers and employees thrive. In this Cloud Escalation Engineer - Prisma Access role, you will act as…
- Google (Sunnyvale, CA)
- …Preferred qualifications: + Master's degree in Computer Science or Engineering. Site Reliability Engineering (SRE) combines software and systems engineering ... Google Cloud's services-both our internally critical and our externally-visible systems-have reliability, uptime appropriate to customer's needs and a fast rate of…
- General Motors (Mountain View, CA)
- …the scale, availability, and operations for our applications. As a staff engineer, you'll be creating patterns for reliability, implementing reliability ... a week, at minimum **The Role:** This is a senior technical leadership role, and we are looking for...Our engineers have a passion for quality, efficiency, and reliability to help us accelerate and innovate in this…
- SpaceX (Sunnyvale, CA)
- …These positions cover a variety of areas ranging from Developer Operations, to Site Reliability and managing our Kubernetes environment. You will develop ... Sr. Software Infrastructure Engineer (Starlink) at...and 5+ years of professional experience in systems administration, site reliability engineering, or DevOps; OR 7+…
- Amazon (East Palo Alto, CA)
- …of innovation and delivery for Redshift organization. We are searching for a Sr. Software Development Engineer with passion for building large-scale, highly ... as codified in Amazon's leadership principles (https://www.amazon.jobs/en/principles). As a Sr. Software Development Engineer, in Redshift Builder Experience,…
- Amazon (Sunnyvale, CA)
- …General Intelligence (AGI) team is looking for a passionate, talented, and inventive Sr. Software Development Engineer (Sr. SDE)/Machine Learning Engineer ... multi-modal and multi-lingual Large Language Models (LLM). As our Sr. SDE/MLE superstar, you'll have the power to lead...5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience…
- Amazon (Sunnyvale, CA)
- Description We are looking for a driven Sr. Software Development Engineer with a growth mindset who is adept at a variety of skills, especially with design and ... of a new product space within Amazon. As a Sr. Software Development Engineer, you will design,...Integrate ML models into production systems and ensure their reliability and efficiency. - Deploy applications on devices and…
- Amazon (Santa Clara, CA)
- …fast iteration in a start-up like environment. We are looking for a Sr. Software Development Engineer obsessed with customer success, passionate about solving ... and delivering new products at AWS scale. As a Sr. Software Development Engineer, you will be...5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience…
- Amazon (Sunnyvale, CA)
- …collaborative and at the top of their respective fields. We are looking for talented Senior Machine Learning Engineer who are adept at a variety of skills, that ... models, large language models (LLM), generative audio (music and speech synthesis), computer vision (CV), reinforced learning (RL) and...bar within the team. Key job responsibilities As a Sr. MLE in the team you will: Work closely…
- Amazon (Sunnyvale, CA)
- Description As a Sr. SDET within SSG team, you will own automation framework development and execution of test strategy, tooling and automated performance test to ... SDMs, QAMs and engineers across the SSG organization, and regularly communicate with senior leaders, and stakeholders at all levels. You will amplify your impact…
- Amazon (East Palo Alto, CA)
- …application level down to the code and hardware level. 10031 Key job responsibilities A Sr. Performance Engineer at Amazon Redshift needs to be able to diagnose ... Redshift Performance Engineering team is looking for an experienced performance engineer who is passionate about database and distributed systems performance. Join…
- Amazon (Cupertino, CA)
- …creating a toolchain that will provide a quantum leap in performance. You: As a Sr. Machine Learning Compiler Engineer III on the AWS Neuron Compiler team, you ... we're building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews.…