Research Scientist – Speech and Audio Understanding (Large Models & Multimodal Systems)

Tencent (3 months ago)

Bellevue, WA | Onsite | Full Time | Senior | $122,500 - $229,700 | Research

About this role

A research role on a core team developing large-scale multimodal model systems that integrate vision, audio, and text to improve perception and understanding of the physical world. The position focuses on advancing speech and audio capabilities within those multimodal systems and contributing to model and dataset development.

Required Skills

Speech Recognition
Speech Synthesis
Audio Understanding
Representation Learning
Multimodal Alignment
Transformer Architecture
PyTorch
TensorFlow
Distributed Training
Data Annotation

Qualifications

Ph.D. in Computer Science, Electrical Engineering, Artificial Intelligence, Linguistics, or a related field
Master's degree with several years of relevant experience

About Tencent

tencent.com

Founded in November 1998, Tencent is an Internet-based platform company that uses technology to enrich the lives of Internet users and support the digital upgrade of enterprises. Our mission is "Value for Users, Tech for Good".
