Meta has unveiled SeamlessM4T, a new AI system that translates and transcribes both written text and speech across almost 100 languages.
The system can also handle a range of dialects. According to TechCrunch, the company has also released an open-source translation dataset called SeamlessAlign.
Meta describes the release as a significant step forward in AI-driven speech-to-speech and speech-to-text technology.
According to ANI News, Meta says its new SeamlessM4T model offers on-demand translation that makes it easier for people who speak different languages to communicate.
SeamlessM4T can identify the source language on its own, without a separate language identification model.
SeamlessM4T can be seen as a successor to Meta's Universal Speech Translator, one of the few direct speech-to-speech translation systems to support Hokkien.
It also builds on Meta's "No Language Left Behind" initiative, a text-to-text machine translation effort.
Meta is not alone in this space: Amazon, Microsoft, OpenAI, and several startups have all introduced their own speech translation and transcription tools.
Mozilla, meanwhile, has led the development of Common Voice, a large multilingual collection of voice recordings used to train automatic speech recognition systems.
SeamlessM4T is one of the most ambitious efforts so far to combine translation and transcription in a single AI model.