Reliability Lead, Common Services
CoreWeave (1 day ago)
About this role
The Reliability Lead, Common Services will establish and lead the SRE and production operations practice for CoreWeave’s Common Services organization, defining its reliability strategy, processes, and standards. The role partners with engineering and product teams to ensure CoreWeave’s shared platforms are reliable, observable, and operable at scale.
Required Skills
- Site Reliability
- Production Engineering
- Incident Management
- Observability
- SLOs/SLIs
- Linux
- Kubernetes
- Terraform
- Automation
- Capacity Planning
About CoreWeave
coreweave.com
CoreWeave is a cloud provider purpose-built for GPU-accelerated AI and high-performance compute workloads, positioning itself as "The Essential Cloud for AI." It offers on-demand and dedicated GPU infrastructure (bare metal, virtual machines, and Kubernetes), high-performance networking and storage, and managed services to support large-scale training, inference, and graphics rendering. CoreWeave emphasizes performance, cost-efficiency, and operational support so enterprises and research teams can deploy and scale AI workloads with predictable performance and security.