Research Scientist, Multimodal

Research | Bangalore, India | Full-time

Push the frontier of multimodal video understanding: vision, audio, speech, and text fused into a single model.

What you'll do

  • Design and run experiments on large-scale multimodal models.
  • Publish or open-source impactful results where appropriate.
  • Collaborate with engineering to ship research into production.

What we're looking for

  • PhD (or equivalent experience) in ML, CV, NLP, or speech.
  • Track record of strong publications or production-shipped models.
  • Deep experience with large-scale model training.

Nice to have

  • Experience with video foundation models or long-context architectures.
