- Crusoe Energy Systems LLC (San Francisco, CA)
- …in California is seeking a Site Reliability Engineer to optimize their AI-optimized cloud infrastructure. The role involves building automation tools, driving ... high-performance storage systems. Candidates should have strong experience in SRE, distributed storage systems, and programming languages. The position…
- SherlockTalent (San Francisco, CA)
- …cleaning, and transformation for quality assurance. Exposure to Data Science Manage cloud persistence for database and storage solutions for efficient data ... Overview Job Title: SRE & Data Engineer Location: Bay Area, CA,...and data ingestion processes. Set up and maintain GCP cloud native infrastructure and manage pipelines, exposure to client…
- Hispanic Alliance for Career Enhancement (Scottsdale, AZ)
- …and resolution in collaboration with business and technology partners Champion modern cloud, edge, and AI‑driven monitoring solutions for store technology ... (Dynatrace, AppDynamics, Prometheus, Splunk, Grafana, etc.) Strong understanding of cloud infrastructure components (compute, storage, networking, security)…
- harvey.ai (San Francisco, CA)
- …services operate - not incrementally, but end-to-end. By combining frontier agentic AI, an enterprise-grade platform, and deep domain expertise, we're reshaping how ... at Harvey, you will ensure the reliability, scalability, and performance of our legal AI platform. You'll join a high-leverage team that sits at the intersection of…
- Lenovo (Morrisville, NC)
- …of Platform Services, Core Intelligence, Experience Engineering, Cloud & Platform, and SRE & Delivery. 5. Partner Across the AI and Device Ecosystem Work ... AI-optimized devices (PCs, workstations, smartphones, tablets), infrastructure (server, storage, edge, high performance computing and software defined infrastructure),…
- LGBT Great (New York, NY)
- …This includes object storage systems (e.g., S3, Azure Blob, GCP Cloud Storage) and database technologies (relational and NoSQL managed databases) on ... (IaC) tools, and implementing agentic AI (autonomous AI agents) to optimize cloud operations. The...GCP, with a focus on PaaS components like object storage and database services. Ensure solutions are scalable, highly…
- Syneos Health, Inc. (Bridgewater, MA)
- …AWS, Oracle Cloud and on-prem systems, ensuring interoperability. Oversee cloud infrastructure and services, aligning with AI/ML strategies and adopting ... culture, attract and develop talent, and collaborate across Engineering, Security, Data/AI, and Finance. Communicate cloud strategy progress to executives.…
- NCBiotech (Morrisville, NC)
- …AWS, Oracle Cloud and on-prem systems, ensuring interoperability. Oversee cloud infrastructure and services, aligning with AI/ML strategies and adopting ... culture, attract and develop talent, and collaborate across Engineering, Security, Data/AI, and Finance. Communicate cloud strategy progress to executives.…
- PriorLabs GmbH (San Francisco, CA)
- …security within the engineering team. Qualifications 3+ years of professional experience in a cloud engineering, data platform, or SRE role, with a proven track ... accelerating. We're now building the next generation of models that combine AI advancements with specialized architectures for structured data. What's Next: With €9M…
- NetImpact Strategies (Bethesda, MD)
- …optimizing server configuration and managing complex applications within Microsoft Azure Cloud environments. Your expertise will be crucial in implementing and ... maintaining efficient cloud-based solutions, including backup and recovery strategies using Azure...Azure services such as Azure Backup, Site Recovery (ASR), Storage Redundancy options, VM Backup, Blob Storage…
- Boson AI (Palo Alto, CA)
- …Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage, terabit networking, and hundreds of servers. You'll be hands‑on with the full ... teams with cluster usage optimization Operate, troubleshoot and optimize Ceph storage clusters Develop automation and tooling Minimum Qualifications 5+ years of…
- Aldea Inc (San Francisco, CA)
- …Level: Mid-Level / Senior About Aldea Aldea is a multi-modal foundational AI company reimagining the scaling laws of intelligence. We believe today's architectures ... challenges including networking (BGP, ECMP), load balancing (MetalLB/Kube‑VIP), and storage orchestration (CSI/Rook‑Ceph) for stateful workloads. 2. Observability & …
- Crusoe Energy Systems LLC (San Francisco, CA)
- …infrastructure. About This Role: Crusoe is building the most reliable, energy-efficient, AI-optimized cloud platform - and operational excellence is at the ... powers a world where people can create ambitiously with AI - without sacrificing scale, speed, or sustainability. Be...Bring to the Team: 5+ years of experience in cloud operations, SRE, or related roles Understanding…
- Pantera Capital (Palo Alto, CA)
- About xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly ... the Role As a Data Center Site Reliability Engineer (SRE) at xAI, you will play a pivotal role...infrastructure, including the Colossus supercluster in Memphis, the world's largest AI training cluster with over 100,000 liquid‑cooled Nvidia GPUs…
- Qualcomm (San Diego, CA)
- …architecture with resiliency. Drive business projects with clear objectives to the SRE team that align with business roadmaps, task delegation, and implement Agile ... using JIRA agile methodology. Ensure high-quality documentation, including detailed SRE operations, configuration steps, custom procedures/scripts, testing/validation processes, and…
- CATHEXIS (Honolulu, HI)
- …innovative and trusted results. We are looking for a dynamic Site Reliability Engineer (SRE) to join our team at Joint Base Pearl Harbor-Hickam. The Site Reliability ... Engineer (SRE) will manage, monitor, and optimize clusters on Kubernetes....transformation through the building and deployment of data-driven, scalable AI solutions. The ideal candidate will have a deep…
- Pathway Genomics Corporation (Palo Alto, CA)
- …& CEO Zuzanna Stamirowska, a complexity scientist who created a team consisting of AI pioneers, including CTO Jan Chorowski who was the first person to apply ... infrastructure that powers our ML training and inference workloads across multiple cloud providers, from bare‑bones Linux to container orchestration and CI/CD. You…
- Epoch Biodesign (San Francisco, CA)
- …500 companies to power their most advanced AI applications. Crusoe is redefining AI cloud infrastructure, with a mission to align the future of computing ... Cloud Engineering Crusoe is building the World's Favorite AI-first Cloud infrastructure company. We're pioneering vertically integrated, purpose-built…
- Rethink recruit (Palo Alto, CA)
- …experience. Deep expertise with Google Cloud Platform (compute, networking, storage, and security). Proven AI/ML infrastructure integration experience in ... About AI Growth Labs AI Growth Labs...Architect, automate, and scale secure, high-performance infrastructure on Google Cloud Platform (GCP). Build and maintain CI/CD pipelines…