Staff Research Engineer, Model Efficiency
Cohere
About this role
The Staff Research Engineer, Model Efficiency at Cohere is responsible for improving the inference efficiency of Large Language Models (LLMs). The role involves developing and deploying novel techniques that optimize model architecture, decoding algorithms, and software/hardware co-design for GPU acceleration. Candidates should have a PhD in Machine Learning, significant experience with model efficiency techniques, strong software engineering skills, and a record of research publications.
About Cohere
cohere.com
Cohere is an AI company that builds large language models and enterprise AI platforms for businesses and developers.
Recent company news
Nvidia-Backed Cohere Forms AI Alliance With Telecom Firm BCE
15 hours ago
Enterprise AI startup Cohere tops revenue target as momentum builds to IPO: Investor memo
1 month ago
Cohere joins Aston Martin Aramco as Official Generative AI Partner to help accelerate AI innovation
2 weeks ago
The AI Model Race May Have Slowed Down for Cohere
Nov 17, 2025
Cohere Technologies drives ahead with innovation vision
1 week ago
About Cohere
Headquarters
San Francisco, CA
Company Size
201-500 employees
Founded
2019
Industry
Technology
Glassdoor Rating
4.2 / 5
Salary
$183k – $244k per year