- Meta (New York, NY)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially ... workloads that expect a lossless fabric interconnect. To improve performance of these systems we constantly look…
- Mount Sinai Health System (New York, NY)
- …Computing and Data's computational and data science ecosystem. This ecosystem includes high-performance computing and data systems, Data Ark data commons, ... translational science research. To achieve these aims, we support a high-performance computing and data ecosystem along with MD/PhD-level support for researchers.…
- Amazon (New York, NY)
- …modernizing customer requirements to the cloud - Practical experience in High Performance Computing (HPC) and/or distributed training, performance profiling ... Description Are you passionate about Generative AI (GenAI)? Do you want to help define ... services to power their businesses. We're continuously raising our performance bar as we strive to become Earth's Best…
- Amazon (New York, NY)
- …experience - 5+ years building or optimizing computational applications for large scale HPC systems (e.g., physics-based simulations) to take advantage of high ... of Go to Market (GTM) at AWS using generative AI (GenAI)? AWS Sales, Marketing, and Global Services (SMGS) ... years building or optimizing computational applications for large scale HPC systems (e.g., physics-based simulations) to…
- Bloomberg (New York, NY)
- …or familiarity in MLOps platforms & Machine Learning toolkits + Applied experience optimizing HPC workloads for AI & ML + Experience with capacity planning, ... to shape and execute the vision and roadmap for the next-generation Bloomberg Generative AI platform. As a Technical Product Manager, you will have ownership over a…
- Meta (New York, NY)
- …in Python, C++ or CUDA programming. 10. Research or industry experience in ML systems, ML accelerators, HPC, GPU performance, and similar. 11. Currently ... PyTorch. 14. Expert knowledge in GPU performance and writing high-performance communication libraries and fault tolerance distributed systems. 15. Proven…