DUK researchers discover error in OpenAI’s Whisper

Team says OpenAI’s accuracy-checking process fails to account for vowel signs used in Indian languages, including Malayalam

Published - November 15, 2024 08:42 pm IST - Thiruvananthapuram

The Hindu Bureau

Researchers at Digital University Kerala (DUK) have uncovered a major error in OpenAI’s Whisper, an artificial intelligence system designed to transcribe speech into text.

AI companies typically test the accuracy of their models, usually with automated checks, before releasing them for public use, to assess how precisely the systems handle spoken language. However, owing to an error in this quality-check process, OpenAI has reported the accuracy of its AI systems for Indian languages, including Malayalam, as higher than it actually is.

This discovery was made by researchers at DUK’s Virtual Resource Centre for Language Computing (VRCLC), led by Elizabeth Sherly, with team members Kavya Manohar and Leena G. Pillai. Their research paper documenting the problem has been selected for presentation at the ongoing international conference on Empirical Methods in Natural Language Processing 2024 in Florida, U.S. The Association for Computational Linguistics (ACL) has also awarded the team a grant to support their presentation.

During their investigation, the researchers found that OpenAI’s accuracy-checking process fails to account for vowel signs used in Indian languages. For instance, when the Malayalam phrase for ‘Digital University’ is stripped of its vowel signs and ‘chandrakkala’ (virama), it becomes ‘ഡ ജ റ റ ൽ യ ണ വ ഴ സ റ റ,’ losing its readability. Because these signs are excluded before the transcript is compared with the reference text, errors in them go undetected, leading to an overestimated transcription accuracy. The team also observed that similar issues are present in Meta’s evaluations of its own AI models.
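The effect the researchers describe can be illustrated with a minimal Python sketch. It is not OpenAI's evaluation code: the helper names `strip_marks` and `word_error_rate` are hypothetical, and the normalisation step is assumed, for illustration only, to drop every Unicode combining mark, the class that Malayalam vowel signs and the virama belong to. The two test strings differ only in one vowel sign.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Assumed normalisation: drop every Unicode combining mark (Mn, Mc, Me)."""
    return "".join(c for c in text if not unicodedata.category(c).startswith("M"))


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    The reference is assumed to contain at least one word.
    """
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Two illustrative one-word strings that differ only in a vowel sign.
reference = "\u0d15\u0d1f\u0d3f"         # KA, TTA, vowel sign I
hypothesis = "\u0d15\u0d41\u0d1f\u0d3f"  # KA, vowel sign U, TTA, vowel sign I

raw_wer = word_error_rate(reference, hypothesis)
stripped_wer = word_error_rate(strip_marks(reference), strip_marks(hypothesis))
print(raw_wer, stripped_wer)
```

Compared as written, the two strings count as a mismatch. Once the marks are removed, both reduce to the same bare consonants and the mismatch disappears from the score, which is how an error rate can come out lower than it should.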

AI computing for regional languages differs vastly from that for languages such as English. DUK’s VRCLC is committed to developing systems that account for these linguistic nuances, enhancing the accuracy and usability of Indian languages in AI applications, a statement from the university said.
