Reddit

Staff Research Engineer, Pre-training Data

Reddit (4 days ago)

Remote | Full Time | Senior | $230,000 - $322,000 | AI Engineering
About this role

Staff Research Engineer for Pre-training Data at Reddit will define the technical strategy and architecture for data curriculum pipelines that power Reddit-native foundational LLMs. The role focuses on transforming Reddit’s large multimodal conversational corpus into high-quality training signals and building scalable infrastructure to feed distributed training clusters. This position supports Reddit’s AI products across Safety, Moderation, Search, Ads, and next-generation user experiences.

Required Skills

  • Python
  • Distributed Processing
  • Ray Data
  • Spark
  • Data Sampling
  • Curriculum Learning
  • PII Redaction
  • Graph Data
  • Multimodal Data
  • Rust

+3 more

About Reddit

redditinc.com

Reddit is a social news and discussion platform where users submit links, posts, and media into topic-based communities called subreddits. Content is surfaced and ranked by user voting and threaded discussions, and the site hosts popular formats like AMAs (Ask Me Anything) and community-driven events. Reddit provides moderation tools, mobile apps, and advertising products for brands and agencies. It’s known for its passionate niche communities and outsized influence on internet culture and trends.
