Senior Data Scientist - LLM Evaluation (Medhub)
EvolutionIQ (29 days ago)
About this role
The Senior Data Scientist on the Medhub team is responsible for architecting the company’s LLM evaluation framework and ensuring AI products meet rigorous scientific standards. The role emphasizes establishing objective, statistically sound evaluation processes to guarantee model safety, accuracy, and reliability prior to deployment. It involves partnering with domain experts and data teams to translate qualitative expertise into validated evaluation datasets and scalable labeling strategies.
Required Skills
- LLM Evaluation
- Experimental Design
- Statistical Analysis
- Hypothesis Testing
- Regression Analysis
- Benchmarking
- Prompt Auditing
- Scorecard Design
- Labeling Optimization
- Inter-Rater Reliability
Qualifications
- Master's or PhD in Statistics, Mathematics, or Related Quantitative Field
About EvolutionIQ
evolutioniq.com
EvolutionIQ is an AI-driven claims guidance platform for property & casualty insurers that uses machine learning and advanced analytics to identify, prioritize, and guide high-value claims so adjusters can make faster, more accurate decisions. The platform surfaces actionable recommendations (triage, subrogation and recovery opportunities, fraud flags, and optimal settlement strategies), helping carriers accelerate recoveries and reduce leakage. Integrated into existing workflows, EvolutionIQ focuses on measurable outcomes like faster recoveries, lower costs, and improved operational efficiency, and is chosen for its carrier-focused models and explainable, scalable insights.