- Meta (Menlo Park, CA)
- … production issue triage, rolling out new features in FW/Driver. **Required Skills:** Production Systems Engineer, AI Systems Responsibilities: ... **Summary:** Meta is seeking a Systems Engineer to join our Release to Production (RTP) team working on AI/ML initiatives supporting large-scale AI…
- Meta (Menlo Park, CA)
- …platforms, all the way to mass production and deployment. **Required Skills:** Production Systems Engineer, AI Systems Responsibilities: ... **Summary:** Meta is seeking a Systems Engineer to join our Release to Production...Inference Accelerator (MTIA) program as a part of the AI/ML initiatives supporting large-scale AI Training…
- Meta (Menlo Park, CA)
- …health and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Responsibilities: 1. Interface ... **Summary:** Meta is seeking an experienced Production Systems Engineer to... hyperscale environments, engineering varying solutions to wide-reaching, at-scale systems issues. 21. Experience supporting AI/HPC…
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Systems Engineer to join our Release to Production (RTP) team working on AI/ML initiatives supporting large-scale AI ... services, and data center operations teams to enable new systems that will be deployed in our production...Silicon hyperscalar bring up and validation. **Required Skills:** Hardware Systems Engineer, NPI AI Responsibilities:…
- Meta (Menlo Park, CA)
- … based approach to the new product introduction (NPI) phase. **Required Skills:** Hardware Systems Engineer, AI NPI Responsibilities: 1. Drive and execute ... services, and data center operations teams to enable new systems that will be deployed in our production...strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications. 2.…
- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Lead ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...interconnect with minimal latency. To improve performance of these systems we constantly look for opportunities across the stack: network…
- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...a lossless fabric interconnect. To improve performance of these systems we constantly look for opportunities across the stack: network…
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Partner Engineer to join Meta's AI Partner Engineering team, a highly technical team that works with strategic partners, machine ... evangelize Meta's AI design patterns and best practices. **Required Skills:** Partner Engineer, Generative AI Responsibilities: 1. Apply relevant AI and…
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Partner Engineer to join Meta's AI Partner Engineering team, a highly technical team that works with partners, machine learning ... and taking Large Language Models (LLMs) from research to production. In this role, you will engage with some... design patterns and best practices. **Required Skills:** Partner Engineer, Generative AI Responsibilities: 1. Apply relevant…
- Walmart (Sunnyvale, CA)
- …**RAG frameworks** to lead the design, development, and deployment of advanced AI systems. This role involves architecting scalable solutions, integrating ... redefine customer experiences. We are seeking a **Principal, Software Engineer** with deep expertise in **Generative AI**.... 2. **Architecture & Scalability:** + Architect scalable, distributed AI systems with a focus on performance,…
- NVIDIA (Santa Clara, CA)
- …and blameless postmortems + Be part of an on-call rotation to support production systems + Write and review code, develop documentation and capacity plans, ... automation to improve researcher productivity. As a Site Reliability Engineer, you are responsible for the big picture of...Deployment, BCM, Terraform. + Understanding of fast, distributed storage systems like Lustre and GPFS for AI/HPC…
- NVIDIA (Santa Clara, CA)
- …can connect to enterprise data sources and power search, chatbots and other gen AI applications + Develop platform and systems enabling unified experience across ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An...and products that improve business efficiency and productivity. This engineer is expected to be familiar with concepts of…
- NVIDIA (Santa Clara, CA)
- We are looking for a Principal Machine Learning Engineer to join our growing team focused on Enterprise AI! NVIDIA's invention of the GPU in 1999 sparked the ... parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as...building and deploying ML models and architectures in a production environment. + Solid knowledge of NLP and deep…
- LinkedIn (Sunnyvale, CA)
- …Machine Learning and Artificial Intelligence. Preferred Qualifications: Experience in bringing large-scale AI systems to production. PhD in Computer Science, ... within FAIT and across the company to realize these AI innovations. As a Principal Staff Engineer ...define the bar for quality and efficiency of software systems while balancing business impact, operational impact and cost…
- LinkedIn (Mountain View, CA)
- …with a wide range of technologies such as deep learning, generative AI, large language models, recommender systems, ranking, search, advertising, auction ... engineering and infrastructure teams to build the next generation AI-first product experience for our members. As a Principal Engineer, you will be one of the overall technical…
- Palo Alto Networks (Santa Clara, CA)
- …all win with precision. **Your Career** We are seeking an experienced and innovative AI Engineer to join the Worldwide Shared Services (WWSS) organization. In ... to inform impactful AI solutions + Design, prototype, and implement AI-driven systems that automate workflows, optimize business processes, and solve complex…
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior High-Performance AI Training Engineer: NVIDIA is seeking senior engineers who are obsessed with performance analysis and ... help us squeeze every last clock cycle out of AI training, the workload driving the design and construction of the largest and most powerful compute systems in the world. This role offers the opportunity…
- eightfold.ai (Santa Clara, CA)
- …for Machine Learning & AI + Implement best practices for building AI-enabled products + Develop AI-based systems for Natural Language Processing ... is NOT a remote position) About Eightfold: Eightfold AI is the industry leader in AI-powered...frameworks (scikit-learn, tensorflow, torch, etc.) + Experience with implementing production machine learning systems and working with…
- NVIDIA (Santa Clara, CA)
- Join NVIDIA as a Machine Learning Engineer and contribute to Product Security, Content Safety, ML Fairness, and Robustness efforts for LLMs in our research and ... production engineering teams. In this role you'll have the...and explainability. Our LLMs are a growing area of AI products including models and services, and we are…
- Meta (Fremont, CA)
- …world-class manufacturing and quality processes for the next generation of disruptive AI hardware products **Required Skills:** Product Quality Engineer, AI/ ... a comprehensive product quality strategy and process for next-generation AI hardware systems, ensuring timely delivery and...action for failures that occur during server integration, rack production and in the Data Center 5. Early engagement…