Microsoft takes on Google and OpenAI with its own AI models

Microsoft just shipped its own AI models, and they’re coming for OpenAI and Google. The company has publicly released three proprietary models: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2. The models are available via the Microsoft Foundry platform and the MAI Playground.

We’re bringing our growing MAI model family to every developer in Foundry, including …

· MAI-Transcribe-1, most accurate transcription model in world across 25 languages
· MAI-Voice-1, natural, expressive speech generation
· MAI-Image-2, our most capable image model yet

Start…

— Satya Nadella (@satyanadella) April 2, 2026

So, what can Microsoft’s AI models actually do?

The trio covers three capabilities: listening, speaking, and seeing. MAI-Transcribe-1, for instance, handles speech-to-text across 25 languages and is 2.5 times faster than Microsoft’s own Azure Fast offering. Notably, the audio model was built by a team of 10 people.