GPU Cluster Architect
Nebius (5 months ago)
About this role
The GPU Cluster Architect at Nebius will drive the design and architecture of next-generation AI infrastructure, making end-to-end decisions across compute, networking, and storage to meet the demands of large-scale AI workloads. This hands-on, high-impact role involves defining how tens of thousands of GPUs are interconnected, cooled, powered, and optimized across multiple data center sites. The position requires close collaboration with site reliability, networking, storage, and data center engineering teams to operationalize scalable, high-performance, and reliable platforms.
Required Skills
- Cluster Design
- Performance Modeling
- Network Architecture
- Storage Integration
- Monitoring
- Collaboration
- Scripting
- GPU Architecture
- HPC Interconnects
- Systems Architecture
About Nebius
nebius.com
Nebius is a cloud platform for AI explorers that provides GPU‑accelerated infrastructure to build, tune, and run machine learning models and applications. It offers access to top‑tier NVIDIA GPUs and tooling designed to maximize efficiency and performance for training, fine‑tuning, and inference. Nebius focuses on simplifying ML workflows so researchers, developers, and teams can iterate faster without managing hardware.