Senior Site Reliability Engineer, BCM - DGX Cloud
NVIDIA
About this role
A Senior Site Reliability Engineer at NVIDIA supports the Base Command Manager product to ensure reliable operation of GPU-based AI data center platforms used by customers and internal teams. The role focuses on enabling scalable, resilient cluster infrastructure and contributing to the success of NVIDIA’s AI computing initiatives while collaborating across engineering and operations groups.
Skills
Qualifications
About NVIDIA
nvidia.com
NVIDIA invented the GPU and drives advances in AI, HPC, gaming, creative design, autonomous vehicles, and robotics.
Salary
$168k – $334k per year