- NVIDIA (Santa Clara, CA)
- …Observability is at the heart of this transformation. We are looking for a Senior AI & HPC Observability Engineer to design and build the next-generation ... observability platform for large-scale AI workloads, GPU clusters, and high-performance computing environments. This...The Crowd: + Proven experience designing and scaling full-stack observability platforms for large-scale AI, GPU, or… more
- NVIDIA (Santa Clara, CA)
- …to do their best work. We are looking for a highly skilled Principal Software Engineer to design and develop AIOps & Observability platforms at NVIDIA. The ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An...of engineers, product managers, and partners to define the observability strategy, roadmap, and standard methodologies for NVIDIA. You… more
- NVIDIA (Santa Clara, CA)
- NVIDIA's Observability team is seeking a Senior/Staff Engineer to compose and build the next-generation, multi-region observability platform. This platform ... powers our rapidly expanding AI, Data, and Observability ecosystem, operating at an immense scale: trillions of metrics, hundreds of terabytes of logs, and… more
- Microsoft Corporation (Redmond, WA)
- …potential of AI to create intelligent, adaptive, and transformative software. The Observability group is seeking a **Software Engineer II - Observability ... **Overview** Core AI is at the forefront of Microsoft's mission...performance. We are seeking a passionate and skilled software engineer to join the Observability platform team.… more
- MongoDB (New York, NY)
- …VictoriaMetrics, Splunk, QuickWit, Jaeger, Fluentbit, and Vector. In addition to owning our observability infrastructure, as an Engineer on the team, you'll also ... **Team and Role Overview** The SRE Observability team is part of the larger Platform...the market. We have redefined the database for the AI era, enabling innovators to create, transform, and disrupt… more
- Cisco (Milpitas, CA)
- Senior Full Stack Engineer - Cloud-Native Observability Platform We are the Catalyst Center Platforms and Capabilities team, responsible for delivering scalable, ... innovation. One of our key initiatives is a cloud-native observability platform purpose-built for Cisco Catalyst Center deployments-bridging on-premises network… more
- MongoDB (Seattle, WA)
- The Networking & Observability Team builds infrastructure for low-overhead observability and communication between MongoDB Server nodes, clients, and other ... core components for data processing systems + Familiarity with observability ecosystem and best practice + Excellent verbal and...the market. We have redefined the database for the AI era, enabling innovators to create, transform, and disrupt… more
- Robert Half Technology (Austin, TX)
- …Half is actively partnering with an Austin-based client to hire a Senior Observability Engineer to design and implement scalable observability solutions ... enhance monitoring practices, enable rapid incident response, and drive innovation in observability tooling and strategy. This position is located in Austin, Texas.… more
- Walmart (Sunnyvale, CA)
- **Position Summary** **What you'll do** As an Observability Distinguished Engineer, you will be a key researcher and technical lead expert in the architecture ... and development of cloud native observability designs, managed services, and real-time telemetry software systems. You will use your depth of engineering and… more
- NVIDIA (Santa Clara, CA)
- …Intelligence: Real world experience applying model development, RAG, MCP, and Agentic AI technical solutions to the problem of observability data analytics, ... at NVIDIA, you will own the development of DGX Cloud strategy for observability, monitoring, and remediation across all layers of infrastructure, IaaS, platforms and… more
- ServiceNow, Inc. (Orlando, FL)
- It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of...user experiences, and a culture of continuous improvement. Every engineer here plays a key role in shaping the… more
- MongoDB (Seattle, WA)
- Join and be a part of leading the MongoDB Networking Observability team, helping build the core of a distributed database! Our team focuses on creating and enhancing ... make these processes, and their communication, easily observable. Networking Observability's responsibilities include improving MongoDB networking, improving the efficiency… more
- NVIDIA (Santa Clara, CA)
- …NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're looking for a strong technical architect to ... can perceive and understand the world. Today, we are increasingly known as "the AI computing company." We're looking to grow our company and establish teams with the… more
- Vanguard (Malvern, PA)
- …experience operate within a complex and rapidly evolving resiliency landscape. As an Application Engineer within the ChAI (Chat & AI) team, you will contribute ... build, and support application-level capabilities that improve reliability, performance, and observability for AI and Generative AI workloads. You will also… more
- Google (Sunnyvale, CA)
- Senior Engineering Manager, ML Optimization Tools and Observability, Google, Sunnyvale, CA, USA **Advanced** Experience owning outcomes and ... of the following: ML performance, debugging, optimization, profiling, or observability. + 5 years of experience in a people...Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers have… more
- LinkedIn (Mountain View, CA)
- …Serving performance optimizations across billions of user queries. Model Training Infrastructure: As an engineer on the AI Training Infra team, you will play a ... the process for model training and serving. As an engineer in the team, you will explore and innovate...MLOps and experimentation systems across LinkedIn. From Ramping to Observability, this org powers the AI products… more
- LinkedIn (Mountain View, CA)
- …Serving performance optimizations across billions of user queries. Model Training Infrastructure: As an engineer on the AI Training Infra team, you will play a ... the process for model training and serving. As an engineer in the team, you will explore and innovate...MLOps and experimentation systems across LinkedIn. From Ramping to Observability, this org powers the AI products… more
- Avispa Technology (Stanford, CA)
- AI-Ops Engineer 1464463 * Hourly pay: $60/hr * Worksite: Leading university (Stanford, CA 94305 - Hybrid, Must be onsite 2-3 days on campus) * W2 Employment, ... 12 Month Assignment, Possible extension or conversion A leading university seeks an AI-Ops Engineer. The successful candidate will be responsible for evolving… more
- Microsoft Corporation (Redmond, WA)
- …our lives and society will be impacted? We are seeking an experienced **Principal Software Engineer - AI Safety and Security** to join a high impact team sitting ... at the intersection of cybersecurity and generative AI. As a Principal Software Engineer ...everyone can thrive at work and beyond._ **Responsibilities** + **AI Logging and Observability**: Develop company-wide… more
- The Walt Disney Company (Glendale, CA)
- …product teams to ensure AI is delivered using shared AI capabilities, governance, observability, and enterprise-grade quality. **Responsibilities and Duties ... Safety & Observability** + Establish standards for guardrails, evaluation, and observability of AI-driven workflows. + Partner with security and governance… more