Research Engineer (LLM Training and Performance)
JetBrains (3 months ago)
About this role
JetBrains is seeking a Research Engineer specializing in large language model training pipelines. The role involves optimizing training speed, stability, and efficiency across large-scale multi-GPU clusters. Responsibilities include profiling, architecture design, custom kernel programming, and maintaining robust training workflows.
Required Skills
- PyTorch
- CUDA
- NCCL
- Megatron-LM
- DeepSpeed
- Profiling
- GPU Programming
- Distributed Computing
- TensorFlow
- Kubernetes
About JetBrains
jetbrains.com
JetBrains is a software vendor that builds developer tools and platforms, best known for the IntelliJ IDEA Java IDE and the Kotlin programming language. It offers a family of intelligent, productivity-focused IDEs (including PyCharm, WebStorm, CLion, Rider, PhpStorm, DataGrip and ReSharper) alongside team and DevOps products like TeamCity, YouTrack and Space. JetBrains emphasizes smart code completion, refactorings, deep language support, and extensibility via plugins, and it increasingly integrates AI-assisted features to speed development. Its products serve individual developers and teams across platforms with both commercial and free/community editions.
Similar Jobs
Senior System Software Engineer - AI Performance and Efficiency Tools
NVIDIA (1 month ago)
Senior High-Performance LLM Training Engineer
NVIDIA (1 month ago)
Senior System Software Engineer - AI Performance and Efficiency Tools
NVIDIA (8 months ago)
AI Test Architect
NVIDIA (2 months ago)
Member of Technical Staff (GPU Engineer)
Reka (1 month ago)
Senior Deep Learning Performance Engineer - Training at Scale
NVIDIA (2 months ago)