Engineering Manager, Kernel Reliability
Cerebras Systems (26 days ago)
About this role
A hands-on engineering leader heading Cerebras' on-field Kernel Reliability team, focused on improving the reliability of advanced compute clusters and production services. The role sets the technical vision while staying close to code and works cross-functionally with software and hardware teams to scale systems and tooling for growing production demands. The position also includes building and growing a high-performing engineering team.
Required Skills
- Software Reliability
- Hardware Reliability
- Debugging
- Diagnostic Tools
- Failure Analysis
- Parallel Programming
- Distributed Programming
- Kernel Development
- Monitoring
- Incident Response
+4 more
About Cerebras Systems
cerebras.ai
Cerebras builds purpose‑built AI compute systems centered on its wafer‑scale processors to accelerate training and inference of large neural networks. Its integrated hardware‑and‑software platform delivers high throughput, low latency, and very large on‑chip memory/interconnect to shorten time‑to‑train for demanding AI workloads. Cerebras targets research labs and enterprises that need to scale experiments and deploy large models more quickly, pairing systems, tooling, and support to simplify large‑model development.