Senior HPC Cluster Engineer
Nebius (10 months ago)
About this role
A Senior HPC Cluster Engineer at Nebius will contribute to the development and optimization of the company’s hyperscaler cloud platform for AI workloads. The role is part of the GPU & InfiniBand team and centers on delivering high performance, security, and reliability in multi-GPU HPC environments while enabling new hardware support. The position works closely with hardware virtualization and infrastructure teams to scale and automate GPU and InfiniBand-based systems.
Required Skills
- GPU Computing
- InfiniBand
- KVM/QEMU
- Kubernetes
- Linux Systems
- System Software
- Performance Tuning
- Hardware Integration
- Automation
- PCIe
+10 more
About Nebius
nebius.com
Nebius is a cloud platform for AI explorers that provides GPU‑accelerated infrastructure to build, tune, and run machine learning models and applications. It offers access to top‑tier NVIDIA GPUs and tooling designed to maximize efficiency and performance for training, fine‑tuning, and inference. Nebius focuses on simplifying ML workflows so researchers, developers, and teams can iterate faster without managing hardware.
Similar Jobs
Senior HPC Developer - GPU and Networking
Clockwork.io (7 days ago)
Senior Systems Engineer - AI Infrastructure
Clockwork.io (6 days ago)
Senior HPC Operations Engineer
Lambda (2 months ago)
Site Reliability Engineer, AI/ML Infrastructure
Boson AI (8 days ago)
Site Reliability Engineer, AI/ML Infrastructure
Boson AI (1 day ago)