November 07, 2023

Whisper V3 by OpenAI Goes Open Source, Expanding Voice Recognition Across Languages

In Brief

OpenAI announced the open-source release of Whisper large-v3 (Whisper V3), a state-of-the-art model for speech recognition in multiple languages.

OpenAI Unveils Whisper V3: Revolutionizing Voice Recognition Across Languages

Artificial intelligence (AI) research company OpenAI has taken a significant step in speech recognition by open-sourcing its state-of-the-art model, Whisper large-v3, during its Developer Day event.

This latest iteration of the Whisper model demonstrates a remarkable ability to understand and transcribe voice in a multitude of languages, broadening its applicability beyond the English-centric models of the past.

Whisper large-v3 is designed to work in diverse conditions and to handle a wide range of language inputs. According to OpenAI, English-only models such as tiny.en and base.en tend to perform better in English-only applications, while the accuracy of Whisper large-v3 varies depending on the language being transcribed.

Whisper first launched in September 2022 with multilingual support, and an updated large-v2 model followed in December 2022.

Whisper large-v3, available under the permissive MIT license on GitHub, lets users transcribe many kinds of audio content with high accuracy. The model also outputs timestamps alongside the text, which could simplify subtitle generation for video platforms like YouTube.
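To illustrate how timestamped output could feed subtitle generation, here is a minimal sketch that turns timestamped segments into SubRip (SRT) text. It assumes each segment is a dictionary with "start", "end" (in seconds) and "text" keys, which mirrors the segment shape returned by the open-source whisper Python package; the function names and the sample data are hypothetical and chosen only for illustration.

```python
def format_srt_time(seconds: float) -> str:
    """Convert a time in seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render timestamped segments as numbered SRT subtitle cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)


# Hypothetical sample data standing in for real transcription output.
sample_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
    {"start": 2.5, "end": 6.04, "text": " Today we look at speech recognition."},
]

print(segments_to_srt(sample_segments))
```

The printed text can be saved with a .srt extension and loaded by most video players; real subtitle pipelines would likely also split long segments and cap line length.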

Source: OpenAI

OpenAI’s Multilingual Speech Recognition Breakthrough

Whisper large-v3 processes audio by first splitting it into 30-second chunks. Each chunk is converted into a log-Mel spectrogram and passed to an encoder, and a decoder then generates the output text.
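The 30-second segmentation step can be sketched in a few lines of plain Python. This is a simplified illustration, not the library's own windowing logic: it assumes mono audio held as a list of samples at 16,000 samples per second, cuts fixed-length chunks, and zero-pads the last one. The function and constant names are hypothetical.

```python
SAMPLE_RATE = 16_000   # assumed sampling rate, in samples per second
CHUNK_SECONDS = 30     # chunk length described in the article


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Split a sequence of samples into fixed-length chunks, zero-padding the last."""
    chunk_len = sample_rate * chunk_seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = list(samples[start:start + chunk_len])
        # Pad a short final chunk with silence so every chunk has equal length.
        chunk.extend([0.0] * (chunk_len - len(chunk)))
        chunks.append(chunk)
    return chunks


# 70 seconds of silence as stand-in audio.
audio = [0.0] * (SAMPLE_RATE * 70)
chunks = split_into_chunks(audio)
print(len(chunks), len(chunks[0]))
```

Each chunk would then be turned into a spectrogram and handed to the encoder; that part depends on the model code and is left out here.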

These components work together to predict the text of the spoken words. Beyond transcription, Whisper large-v3 can identify the language being spoken and translate non-English speech into English.
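As a rough sketch of how language identification, transcription and translation might be combined, the helper below calls a model's transcribe method twice: once in the default mode and once with task="translate". It assumes a model object that behaves like the one returned by whisper.load_model in the open-source whisper package (installed with pip install openai-whisper, plus ffmpeg), whose result dictionary includes "text" and "language". The helper names and the file name are hypothetical.

```python
def transcribe_and_translate(model, audio_path):
    """Return the detected language, the original transcript, and an English translation."""
    # Default task: transcribe in the spoken language (language is auto-detected).
    original = model.transcribe(audio_path)
    # task="translate" asks the model to emit English text instead.
    english = model.transcribe(audio_path, task="translate")
    return {
        "language": original["language"],
        "transcript": original["text"].strip(),
        "english": english["text"].strip(),
    }


def load_whisper_model(name="large-v3"):
    """Load a checkpoint; requires the openai-whisper package to be installed."""
    import whisper  # imported lazily so the helper above has no hard dependency
    return whisper.load_model(name)


# Example call (not run here; "interview.mp3" is a hypothetical file):
# result = transcribe_and_translate(load_whisper_model(), "interview.mp3")
# print(result["language"], result["english"])
```

Running the model twice is the simplest approach but doubles the compute; for English audio the translation pass adds nothing and could be skipped.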

While initial plans suggested integration with ChatGPT to enable direct voice interaction with the chatbot, OpenAI has opted to release Whisper large-v3 openly. The current target audience for Whisper, however, is primarily researchers rather than the general public.

OpenAI’s commitment to advancing robust speech processing is evident in its decision to open-source Whisper large-v3. The organization says its objective is to foster the development of practical applications and further research in this field.

Whisper was originally trained on 680,000 hours of supervised (labeled) audio gathered from the internet, including a substantial share of non-English audio. By releasing the model openly, OpenAI aims to fuel innovation and broaden the reach of speech recognition technology worldwide.

About The Author

Nik is an accomplished analyst and writer at Metaverse Post, specializing in delivering cutting-edge insights into the fast-paced world of technology, with a particular emphasis on AI/ML, XR, VR, on-chain analytics, and blockchain development. His articles engage and inform a diverse audience, helping them stay ahead of the technological curve. Possessing a Master's degree in Economics and Management, Nik has a solid grasp of the nuances of the business world and its intersection with emergent technologies.

Nik Asti