Head of Inference Kernels

Etched

4 months ago

San Jose, CA

Onsite

Full Time

Director

About this role

The Head of Inference Kernels at Etched leads a high-performance team that develops optimized kernels and inference stacks for state-of-the-art transformer models, targeting a more than 10x performance improvement over existing benchmarks. The role covers architecting best-in-class inference performance, co-designing algorithmic improvements, and keeping cross-functional teams aligned. The ideal candidate has extensive experience designing GPU kernels, a deep understanding of transformer architectures, and a demonstrated track record of managing effective engineering teams.

About Etched

www.etched.com

Etched builds Sohu, which it describes as the world's first transformer ASIC. By etching the transformer architecture directly into silicon, Etched delivers servers that provide dramatically faster and more cost-effective AI model inference than traditional GPU-based systems. Its technology is designed to optimize the performance of AI applications and to support next-generation models.