Research Scientist, Multimodal
Research | Bangalore, India | Full Time
Push the frontier of multimodal video understanding — vision, audio, speech, and text fused into a single model.
What you'll do
- Design and run experiments on large-scale multimodal models.
- Publish or open-source impactful results where appropriate.
- Collaborate with engineering to ship research into production.
What we're looking for
- PhD (or equivalent experience) in ML, CV, NLP, or speech.
- Track record of strong publications or production-shipped models.
- Deep experience with large-scale model training.
Nice to have
- Experience with video foundation models or long-context architectures.