NVIDIA

Senior Product Manager - Observability and Resilience

NVIDIA (1 month ago)

Hybrid | Full Time | Senior | $208,000 - $327,750 | Product Management

About this role

This Product Manager role at NVIDIA leads development of foundational resiliency and observability tools for large-scale accelerated computing platforms. The role focuses on system diagnostics, performance monitoring, and automated recovery to maximize uptime and efficiency for AI training and inference workloads. The position also supports the deployment and operation of AI infrastructure across customers and partners.


Required Skills

  • Resiliency
  • Observability
  • GPU Observability
  • Telemetry
  • Reliability
  • Kubernetes
  • Containerization
  • Cloud
  • Networking
  • HPC

+9 more

Qualifications

  • BS in Computer Science or related
  • MS in Computer Science or related

About NVIDIA

nvidia.com

NVIDIA invented the GPU and drives advances in AI, HPC, gaming, creative design, autonomous vehicles, and robotics.

