- Meta (Menlo Park, CA)
- … production issue triage, rolling out new features in FW/Driver. **Required Skills:** Production Systems Engineer, AI Systems Responsibilities: ... **Summary:** Meta is seeking a Systems Engineer to join our Release to Production (RTP) team working on AI/ML initiatives supporting large scale AI … more
- Meta (Menlo Park, CA)
- …platforms, all the way to mass production and deployment. **Required Skills:** Production Systems Engineer, AI Systems Responsibilities: ... **Summary:** Meta is seeking a Systems Engineer to join our Release to Production...Inference Accelerator (MTIA) program as a part of the AI/ML initiatives supporting large scale AI Training… more
- Meta (Menlo Park, CA)
- …health and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Responsibilities: 1. Interface ... **Summary:** Meta is seeking a Production Systems Engineer to...systems issues. 15. 2+ years of experience supporting AI or HPC systems and/or related … more
- Meta (Menlo Park, CA)
- …health and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Responsibilities: 1. Interface ... **Summary:** Meta is seeking an experienced Production Systems Engineer to... hyperscale environments, engineering varying solutions to wide-reaching, at-scale systems issues. 21. Experience supporting AI/HPC … more
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Systems Engineer to join our Release to Production (RTP) team working on AI/ML initiatives supporting large scale AI ... services, and data center operations teams to enable new systems that will be deployed in our production...Silicon hyperscalar bring up and validation. **Required Skills:** Hardware Systems Engineer, NPI AI Responsibilities:… more
- Meta (Menlo Park, CA)
- … based approach to the new product introduction (NPI) phase. **Required Skills:** Hardware Systems Engineer, AI NPI Responsibilities: 1. Drive and execute ... services, and data center operations teams to enable new systems that will be deployed in our production...strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications. 2.… more
- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: network… more
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Partner Engineer to join Meta's AI Partner Engineering team, a highly technical team that works with strategic partners, machine ... evangelize Meta's AI design patterns and best practices. **Required Skills:** Partner Engineer, Generative AI Responsibilities: 1. Apply relevant AI and… more
- Charles Schwab (San Francisco, CA)
- … Incubation and Enablement team is looking for a talented, technical, hands-on Senior Engineer to drive the development of innovative AI solutions. This position ... iterative software development using Large Language Models. The Senior Engineer on the AI Incubation and Enablement...building complex products from scratch and running them in production. + 3+ years of experience building applications… more
- Amazon (San Francisco, CA)
- …team, you'll be instrumental in transforming cutting-edge research into high-performance production systems. You'll collaborate directly with scientists to ... Peter Chen to make breakthrough foundation models run at production scale. As a Senior Machine Learning Engineer...We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems… more
- Cisco (San Francisco, CA)
- …learning technologies. The ideal candidate will help build and maintain scalable AI systems while ensuring robust deployment and operational excellence. ... part of our journey! **Role** As the Machine Learning Engineer, AI Platform in the Splunk ...Engineers and Applied Scientists to build efficient model serving systems + Monitor system performance and implement improvements for… more
- Cisco (San Jose, CA)
- …(LLM). + Experience developing large-scale, complex models and deploying them in production systems. + Experience large-scale data processing and parallel ... and executing the technical roadmap for the team, as we develop the core AI/ML capabilities to power the entire Splunk product portfolio and help our customers to… more
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are ... cycle of servers in production. **Required Skills:** Production Systems Engineer, Sustaining Responsibilities:...hardware at scale 9. Experience in deploying and productionizing AI/HPC systems and/or related components at scale… more
- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineer Responsibilities: 1. Design, develop, test and ... operate networking systems to support large scale AI training...more. 5. Be oncall to learn from real world production challenges and take the lessons to improve current… more
- Amazon (San Francisco, CA)
- …team, you'll be instrumental in transforming novel research into high-performance production systems. You'll collaborate directly with scientists to optimize ... where you'll contribute to breakthrough foundation models run at production scale. As a Software Development Engineer ...We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems… more
- Meta (Menlo Park, CA)
- …validation, supporting customer deployment, production issue triage. **Required Skills:** Production Systems Engineer, Cooling & Power Responsibilities: ... **Summary:** Meta is seeking a Systems Engineer to join our Release to Production...scaling and deployment challenges requires us to take a systems based approach to AI system bring… more
- IBM (San Francisco, CA)
- …challenging problems? If so, let's talk. **Your role and responsibilities** As a Staff AI/MLOps Development Engineer at Apptio, you will work closely with the ... in hybrid cloud environments. You will help design and engineer efficient and resilient MLOps platforms and software products...the machine learning system and its components interact with systems around it. * Develop and deploy AI… more
- Cisco (San Jose, CA)
- …or if a sufficient number of applications are received. Who We Are The Cisco Security AI team delivers AI products and platform for all Cisco Secure products and ... customers secure by simplifying security with zero compromise using AI and Machine Learning. Who You Are You are...Who You Are You are a passionate Machine Learning Engineer who is building their career through successfully building,… more
- Meta (Menlo Park, CA)
- …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - AI Networking Responsibilities: 1. Enabling reliable ... learning/deep learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, performance optimizations,… more