YouTube launches Hindi auto-captioning feature after 13 years

Automated subtitles could open up millions of Hindi language videos to viewers who are hearing impaired; the feature was first launched for English in 2010

December 26, 2023 05:35 pm | Updated 07:54 pm IST - NEW DELHI

One major challenge for Hindi speech recognition has been the common use of English in everyday speech. File | Photo Credit: AP

YouTube has started rolling out automatic captions for Hindi videos, a much-delayed expansion of its speech recognition-aided subtitles, which first launched in 2010. The automated subtitles could open up millions of Hindi language videos to viewers who are hearing impaired.

Hindi subtitles have been available on the platform on videos where creators have specifically chosen to add them, but YouTube hasn’t offered a convenient way to automatically caption Hindi videos. Since creators on YouTube have to pay for professionally created and timed subtitles, many do not commission them.

It is unclear precisely when Hindi captioning started becoming available. Transcription of Hindi has been available on Google Translate and other products by the search giant. But the inclusion of Hindi auto-captioning as a widely available feature is a signal that Google has now gathered and processed enough Hindi speech data to feel it can offer sufficient accuracy on most videos in the language. By extension, that means that data availability for Indian languages is expanding.

Well before the generative Artificial Intelligence boom, firms like YouTube have been using voice recognition for accessibility purposes. But that’s easier said than done for languages that are not heavily represented online. “In the speech to text problem, you need a lot of speech in Hindi, and a corresponding correct transcript, which is fed to [AI] models that learn by looking at this data,” Mayuresh Nirhali, a senior executive at Reverie, which works on solving problems related to Indian languages on the Internet, said.

Developing AI-enabled services like speech recognition for Indian languages is particularly difficult due to several foundational challenges, including inconsistent encoding of text online, as well as regional variations in spelling and pronunciation, Mr. Nirhali said. Now that more data appears to be available — at least to big tech firms — the situation is improving. A YouTube spokesperson did not respond to queries on the launch of auto-captioning in Hindi.

Mimicking the style of closed captioning for television viewers in countries like the United States, where it is mandatory for the small screen, YouTube’s captions show up as blocks of words as and when they are spoken, with little punctuation. While captions for news broadcasts are generally created in real time for professional TV channels, AI-enabled speech recognition allows automatic captioning to be timed more precisely, letting viewers pick up pauses and other cues of speech.

But accuracy and quality issues linger. Even in auto-generated English captions, for which YouTube has been perfecting its technology for over a decade, mistakes are common, and words are often mistranscribed. Hindi captions are no different, The Hindu found in some videos. Many lines that are not clearly articulated by speakers, even in single-speaker contexts like stand-up comedy videos, are simply omitted, while other words are transcribed as similar-sounding words.

YouTube has by default censored offensive terms and swear words in automatic captioning for several years. When a prohibited word or term is used in English, for instance, YouTube transcribes it as an underscore in square brackets ([_]). This does not seem to be the case in Hindi yet, and expletives show up as similar-sounding words.

One major challenge for Hindi speech recognition has been the common use of English in everyday speech. For the moment, YouTube is simply ‘devanagarising’ English words in Hindi sentences, displaying them without switching to English script, while skipping over English-only sentences entirely. “The expectation for anybody building AI models [for speech and text] is that more colloquial and realistic ground root data is included, so that the model learns the nuances of mixing languages,” Mr. Nirhali said. 

“Languages are such, their spread is so wide that you’ll always have different translations or transcriptions than the models have understood,” he added. “There’s never a line you can draw and say, ‘that’s it, I’m done’.” 
