Cohere

Member of Technical Staff, Model Efficiency

4 months ago
New York, NY
Hybrid
Full Time
Senior
1 applicant

About this role

A Member of Technical Staff for Model Efficiency at Cohere improves the performance of large language models (LLMs) by implementing optimizations that reduce inference latency and increase throughput. The role involves deep technical work across the inference stack: diagnosing bottlenecks and collaborating with modeling and systems teams to deploy performance improvements. Candidates should have strong programming skills in C++ or Python, experience with LLM inference ecosystems, and a background in performance optimization, particularly on GPUs and distributed systems.

About Cohere

cohere.com

Cohere is an AI company that builds large language models and enterprise AI platforms for businesses and developers.

Headquarters

San Francisco, CA

Company Size

201-500 employees

Founded

2018

Industry

Technology

Glassdoor Rating

4.2 / 5

Leadership Team

Sarah Johnson

Chief Executive Officer

Michael Chen

Chief Technology Officer

Emily Williams

VP of Engineering

David Rodriguez

VP of Product

Jessica Thompson

Chief Financial Officer

Andrew Park

VP of Sales


© All Rights Reserved. ApplyBlast.com