• IT InfiniBand /GPU -Sr Staff Systems…

    Cadence Design Systems, Inc. (San Jose, CA)
    …of HPC infrastructure and troubleshooting and supports technical roles supporting HPC, InfiniBand, and GPU at our San Jose location! The successful candidate will ... years overall experience in technical roles supporting GPU Infrastructure setup using InfiniBand + Experience with interconnections between InfiniBand & GPUs +…
    Cadence Design Systems, Inc. (07/03/24)
  • IT - InfiniBand GPU, Sr Systems Engineer…

    Cadence Design Systems, Inc. (San Jose, CA)
    …equipment. + Customer deployments and ensure on-time bring-up of GPU Servers. + InfiniBand fabric bring-up, configuration, and subnet management on the IB switch. + ... drivers, loading + Experience with GPU end to end testing in a cluster with InfiniBand + Experience with setup of GPU servers in a cluster. + Need experience in…
    Cadence Design Systems, Inc. (07/06/24)
  • Staff Computer Systems Analyst

    Northrop Grumman (Northridge, CA)
    …(HPC) system operated under a classified government contract. HPC infrastructure includes InfiniBand network, Lustre parallel file system, and NVIDIA GPUs. * Able to ... file systems (eg Lustre, GPFS), high speed interconnect fabrics (eg Infiniband, Omni-Path), and HPC batch scheduling software suites (eg PBSPro, SLURM)…
    Northrop Grumman (07/13/24)
  • Senior Software Engineer

    Microsoft Corporation (Redmond, WA)
    …understand the details of how everything works across cutting edge storage, network and Infiniband, this could be the role for you. Microsoft's mission is to empower ... Cloud Systems and Cloud Networks / SDN networks + Experience with Infiniband operation, including troubleshooting + Knowledge in HPC schedulers, eg SLURM, SUNK,…
    Microsoft Corporation (07/13/24)
  • Senior Scientific Computing Support Engineer

    Penn Medicine (Philadelphia, PA)
    …scripting languages such as Python, networking and network interconnects such as InfiniBand, and be proficient working primarily in a Linux command line interface. ... Abilities** + TECHNOLOGY: Experience managing Linux-based compute clusters, preferably Infiniband-based + TECHNOLOGY: Understands, maintains and supports large-scale high…
    Penn Medicine (07/05/24)
  • Senior High Performance Computing (HPC) Architect

    Insight Global (Rockville, MD)
    …to include designing new clusters or expanding existing components such as storage, InfiniBand, and compute * Monitor and report on cluster performance and generate ... architecture design experience with HPC to include storage, file system, InfiniBand, security, authentication, and compute architectures * Experience with Slurm job…
    Insight Global (07/02/24)
  • Senior System/High Performance Computing…

    Integration Innovation, Inc. (i3) (Vicksburg, MS)
    …and security + Maintain hardware infrastructure supporting the HPC including infiniband data and Ethernet network switches, different node types. + Monitoring ... + Proficiency with managing the following network technologies: DNS, CephFS, Infiniband. + Experience with systems backup and recovery methodologies. + Ability to…
    Integration Innovation, Inc. (i3) (06/14/24)
  • Senior HPC Architect

    General Dynamics Information Technology (Fairfax, VA)
    …to include designing new clusters or expanding existing components such as storage, InfiniBand, and compute + Monitor and report on cluster performance and generate ... architecture design experience with HPC to include storage, file system, InfiniBand, security, authentication, and compute architectures + Experience with Slurm job…
    General Dynamics Information Technology (06/09/24)
  • Research Systems Engineer

    University of Oregon (Eugene, OR)
    …design and implementation of hardware and cluster software subsystems including Infiniband networking, GPFS parallel file systems HPC queuing systems. The Research ... and other systems and peripherals, including advanced filesystems and InfiniBand networks. Responsibilities include maintaining and modifying large-scale cyberinfrastructure,…
    University of Oregon (05/31/24)
  • Distinguished Software Architect - Deep Learning…

    NVIDIA (Santa Clara, CA)
    …(eg. NVLink, PCIe) within a node and with high-speed networking (eg. Infiniband, Ethernet) across the nodes. Communication performance between the GPUs has a ... high performance networking from prior work experience: network technologies (Infiniband, Ethernet), network design, network topologies, network debug and…
    NVIDIA (05/25/24)
  • Software Engineering Manager - GPU Communications…

    NVIDIA (Santa Clara, CA)
    …(eg. NVLink, PCIe) within a node and with high-speed networking (eg. Infiniband, Ethernet) across the nodes. Communication performance between the GPUs has a ... OpenACC, pthreads. + Background with RDMA, high-performance networking technologies (InfiniBand, RoCE, Ethernet, EFA), network architecture and network topologies.…
    NVIDIA (05/17/24)
  • Senior HPC Performance Engineer

    NVIDIA (Santa Clara, CA)
    …(eg. NVLink, PCIe) within a node and with high-speed networking (eg. Infiniband, Ethernet) across the nodes. Communication performance between the GPUs has a ... Ways to stand out from the crowd: + Practical experience with Infiniband/Ethernet networks in areas like RDMA, topologies, congestion control + Experience debugging…
    NVIDIA (05/04/24)
  • Solutions Architect, Spectrum-X - Switch-Centric

    NVIDIA (Santa Clara, CA)
    …distributed collection of NVIDIA GPUs inter-connected by networking solutions such as InfiniBand, Ethernet, or RoCE (RDMA over Converged Ethernet) we make powerful ... AI workflows + Linux Environment and Linux Networking + Familiarity with (NVIDIA) Ethernet/Infiniband switches, RoCE, and RDMA concepts NVIDIA is leading the way in…
    NVIDIA (05/02/24)
  • Nvidia Dgx Infrastructure Architect

    World Wide Technology (St. Louis, MO)
    …GPU architectures, and performance optimization techniques. + Experience with InfiniBand networking technology, including switches, adapters, and fabric management. ... join a dynamic team and work on cutting-edge DGX infrastructure solutions, including InfiniBand networking and Base Command Manager, we would love to hear from you.…
    World Wide Technology (04/30/24)
  • Software Engineer II

    Microsoft Corporation (Redmond, WA)
    …+ Past experience with high-performance networking, including working with NVLink, InfiniBand and/or RDMA protocols (eg, RoCEv2, iWARP), MPI, libfabric (OFI) + ... Past experience with FPGA-based development and hardware. + Familiarity with hardware development and RTL (eg, System Verilog). + Growth mindset/curiosity for learning and deploying new technologies. Software Engineering IC3 - The typical base pay range for…
    Microsoft Corporation (07/16/24)
  • Digital Verification Engineer

    Digital Prospectors (Lexington, MA)
    …FPGA and embedded software development is a plus. + Familiarity with Ethernet, InfiniBand, and/or PCIe protocols, testing, tools, and debugging is a plus. + ... Familiarity with RDMA and RoCE protocols is a plus. + Experience developing, designing, and programming logic for signal processing applications is preferred. + Experience programming in Python is preferred. + **Due to the nature of the work, an Interim…
    Digital Prospectors (07/16/24)
  • Senior Software Engineer

    Microsoft Corporation (Redmond, WA)
    …hardware. + Experience with high-performance networking, including working with NVLink, InfiniBand and/or RDMA protocols (eg, RoCEv2, iWARP), MPI, libfabric (OFI), ... and collective communication libraries (eg, NCCL, RCCL, MSCCL). + Growth mindset/curiosity for learning and deploying new technologies. Software Engineering IC4 - The typical base pay range for this role across the US is USD $117,200 - $229,200 per year. There…
    Microsoft Corporation (07/16/24)
  • Senior HPC Systems Engineer

    General Dynamics Information Technology (Fairfax, VA)
    …(eg, Lustre, NFS). + Network interconnect configuration and monitoring experience (eg, Infiniband, Ethernet). + Programming or scripting in at least two languages ... (eg, Bash, Perl, Python, C). + Strong writing skills for technical documents, system procedures, user wikis and FAQs. + Experience developing regression tests (eg pavilion, ReFrame) + Ability to work both independently and as part of a team. The likely salary…
    General Dynamics Information Technology (07/14/24)
  • Group Lead HPC Services

    The MITRE Corporation (Mclean, VA)
    …Familiarity with HPC-specific resources and technologies, such as GPUs, FPGAs, MPI, Infiniband + Experience with resource managers and schedulers for HPC clusters ... such as Slurm + Experience with Docker, Singularity, or another container technology. + Experience with logging and monitoring tools such as Splunk. + Candidates holding current / active US Government security clearance(s) are preferred. + Experience with…
    The MITRE Corporation (07/13/24)
  • Software Engineer, SystemML - Scaling…

    Meta (Menlo Park, CA)
    …5. Experience with NCCL and distributed GPU reliability/performance improvement on RoCE/Infiniband 6. Experience working with DL frameworks like PyTorch, Caffe2 or ... TensorFlow 7. Experience with both data parallel and model parallel training, such as Distributed Data Parallel, Fully Sharded Data Parallel (FSDP), Tensor Parallel, and Pipeline Parallel 8. Experience in AI framework and trainer development on accelerating…
    Meta (07/12/24)