
Microsoft just shipped its own AI models, and they’re coming for OpenAI and Google. The company has publicly released three proprietary models: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2. The models are available via the Microsoft Foundry platform and the MAI Playground.
We’re bringing our growing MAI model family to every developer in Foundry, including …
· MAI-Transcribe-1, most accurate transcription model in world across 25 languages
· MAI-Voice-1, natural, expressive speech generation
· MAI-Image-2, our most capable image model yet
Start…
— Satya Nadella (@satyanadella) April 2, 2026
So, what can Microsoft’s AI models actually do?
The trio covers three modalities: listening, speaking, and seeing. MAI-Transcribe-1, for instance, handles speech-to-text across 25 languages and is 2.5 times faster than Microsoft’s own Azure Fast offering. Notably, the audio model was built by a team of just 10 people.
MAI-Voice-1 can produce 60 seconds of natural-sounding
...Keep reading this article on Digital Trends.