Senior Research Engineer, LLM Evaluation and Behavioral Analysis

Together AI (1 month ago)

Remote | Full Time | Senior | $220,000 - $270,000 | Research

About this role

This role sits in Together AI’s Turbo organization as a bridge between cutting-edge model research and production behavioral reliability. The person will define and drive how model quality and reliability are measured across releases, influencing datasets, evaluation standards, and model improvements. The position is research-driven and collaborates closely with training, inference, product, and infrastructure teams to ensure models behave consistently in real-world settings.

Required Skills

Python
Evaluation Tooling
Distributed Workflows
LLMs
Model Evaluation
Red Teaming
Reasoning
Experiment Design
Dataset Design
Function Calling

About Together AI

together.ai

Together AI is an "AI Native Cloud" that helps teams reliably build, deploy, and scale AI-native applications. It combines cutting‑edge research with a complete developer experience and infrastructure optimized for high price‑performance. Together provides hosted model training and inference, APIs/SDKs, and tooling to move projects from experimentation into production. Customers pick it for scalability, cost efficiency, and faster time-to-production for AI applications.