Cerebras Systems

Performance Engineer - Inference

Cerebras Systems (posted 13 days ago)

Toronto, Ontario, Canada · Onsite · Full Time · Mid-level · $109,106 – $146,700 (estimated) · Engineering

About this role

An engineer on the Inference Performance team at Cerebras works at the intersection of hardware and software to improve ML model inference speed and throughput on the Wafer Scale Engine. The role focuses on performance modeling, system-level analysis, and building tooling for performance projection and diagnostics. Team members contribute to advancing state-of-the-art inference capabilities for large-scale ML applications.


Required Skills

  • Performance Modeling
  • Kernel Optimization
  • Compiler Algorithms
  • Runtime Debugging
  • Tooling Development
  • Performance Profiling
  • System Analysis
  • Computer Architecture
  • Simulator Experience
  • LLM Math


Qualifications

  • Bachelor's in Electrical Engineering
  • Master's in Electrical Engineering
  • PhD in Electrical Engineering
  • Bachelor's in Computer Science
  • Master's in Computer Science
  • PhD in Computer Science

About Cerebras Systems

cerebras.ai

Cerebras builds purpose‑built AI compute systems centered on its wafer‑scale processors to accelerate training and inference of large neural networks. Its integrated hardware‑and‑software platform delivers high throughput, low latency, and very large on‑chip memory and interconnect bandwidth to shorten time‑to‑train for demanding AI workloads. Cerebras targets research labs and enterprises that need to scale experiments and deploy large models quickly, pairing systems, tooling, and support to simplify large‑model development.

