Boson AI

Site Reliability Engineer, AI/ML Infrastructure

Boson AI (1 day ago)

Toronto, Canada | Onsite | Full Time | Senior | $171,744 - $231,759 (estimated) | Engineering

About this role

This role is for a Senior Site Reliability Engineer who will operate and scale a high-performance GPU cluster in Toronto featuring NVIDIA H100/A100 GPUs, large Ceph storage, and terabit networking. The role covers the full lifecycle of HPC infrastructure (planning, building, testing, deploying, and ensuring reliable operations) and involves collaborating with engineering and science teams. The engineer will also contribute to capacity planning and technology evaluations as the cluster grows.

Required Skills

  • Linux
  • Kubernetes
  • Ceph
  • Python
  • Bash
  • Ansible
  • Terraform
  • GitOps
  • Helm
  • ArgoCD

+9 more

About Boson AI

boson.ai

Boson AI builds conversational and audio-generation AI focused on making interaction with machines "as easy, natural and fun as talking to a human." Its platform offers high‑fidelity, open‑source voice synthesis and multi‑speaker dialog generation, plus promptable audio (including sound effects) and emotional voice rendering. Boson AI provides APIs, demos, and developer tools so teams can embed natural spoken interfaces into their products. The company targets developers and businesses building conversational experiences across products and platforms.
