Anthropic

Research Engineer, Reward Models Training

Posted 10 months ago

Hybrid · Full Time · Senior · $350,000 - $500,000 · Research Engineering

About this role

This role owns the end-to-end engineering of reward model training: building the infrastructure to train, evaluate, and deploy the reward models that align AI behavior with human values. The engineer will scale training pipelines to large model sizes, incorporate diverse sources of human feedback, and partner closely with researchers to productionize novel techniques. The work directly impacts the safety, helpfulness, and honesty of Anthropic's models.
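
The core technique behind this role, training a reward model from pairwise human preferences, is easy to sketch. Below is a minimal, illustrative PyTorch example of the standard Bradley-Terry pairwise loss. Everything in it (the toy linear scoring head, the random placeholder features, the batch size) is an assumption for illustration only and does not reflect Anthropic's actual models, data, or infrastructure.

```python
# Minimal sketch of pairwise reward-model training (Bradley-Terry loss).
# All names and shapes are illustrative placeholders, not Anthropic's code.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Toy scalar reward head over a fixed-size feature representation."""
    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        # In practice this head would sit on top of a large pretrained LM;
        # a single linear layer stands in for the whole stack here.
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # One scalar reward per example in the batch.
        return self.score(features).squeeze(-1)

model = RewardModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Fake batch: features for the human-preferred ("chosen") and
# dispreferred ("rejected") responses in each comparison pair.
chosen = torch.randn(32, 768)
rejected = torch.randn(32, 768)

# Bradley-Terry objective: maximize the log-probability that the
# chosen response outscores the rejected one.
optimizer.zero_grad()
loss = -nn.functional.logsigmoid(model(chosen) - model(rejected)).mean()
loss.backward()
optimizer.step()
```

At the scale this role describes, the same objective would run over a large language-model backbone with the loop sharded across many accelerators, which is where the distributed-training, scalability, and fault-tolerance skills listed below come in.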

Required Skills

  • Python
  • PyTorch
  • Machine Learning
  • Distributed Training
  • Data Pipelines
  • ML Infrastructure
  • Scalability
  • Fault Tolerance
  • Model Evaluation
  • Human Feedback

Qualifications

  • Bachelor's Degree in Related Field

About Anthropic

anthropic.com

Anthropic is an AI safety and research company focused on building reliable, interpretable, and steerable AI systems. It develops large language models (branded as Claude) and offers APIs and enterprise products that let organizations integrate conversational AI with safety-focused controls, moderation, and privacy features. The company prioritizes interpretability and alignment research, publishes technical work, and engages with policymakers to reduce risks from advanced AI. Customers choose Anthropic for its safety-first approach, controllability tools, and research-driven models.
