Facebook's AI model can translate 100 languages without relying on English

For years, AI researchers have been working toward building a single universal model that can understand all languages across different tasks.

October 21, 2020 07:48 pm | Updated 07:53 pm IST

Facebook used novel mining strategies to create translation data, building a data set with 7.5 billion sentences for 100 languages.



Facebook on Monday introduced M2M-100, the first open-source multilingual machine translation model that can translate between any pair of 100 languages without relying on English data.

Previous models relied on English translation data because it was widely available. Facebook's model translates directly from one language to another to better preserve meaning, the company said in a statement.

The model outperforms English-centric systems by 10 points on the BLEU metric, which is used to evaluate machine translations.

M2M-100 is trained on a total of 2,200 language directions. It will improve the quality of translations for billions of people, especially those who speak low-resource languages, the company added.
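
For a sense of scale, the short Python sketch below counts how many translation directions exist among 100 languages and compares that with the 2,200 directions cited above. It assumes that a "language direction" means an ordered source-to-target pair, which is an interpretation and not something the company's statement spells out; the function name `count_directions` is purely illustrative.

```python
def count_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs among num_languages languages.

    Assumption: a "direction" is an ordered pair of two different languages,
    so A-to-B and B-to-A are counted separately.
    """
    return num_languages * (num_languages - 1)


total_directions = count_directions(100)
trained_directions = 2200  # figure reported in the article

print(f"Possible directions among 100 languages: {total_directions}")
print(f"Share covered by training data: {trained_directions / total_directions:.1%}")
```

Under that assumption, the training data covers only a fraction of all possible directions, even though the model is described as able to translate between any pair.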

Typical translation systems require building separate AI models for each language and task, but this approach doesn’t scale effectively on Facebook, where people post content in more than 160 languages across billions of posts. Advanced multilingual systems can process multiple languages at once, but compromise on accuracy by relying on English data to bridge the gap between the source and target languages.


Facebook used novel mining strategies to create translation data, building a data set with 7.5 billion sentences for 100 languages.

For years, AI researchers have been working toward building a single universal model that can understand all languages across different tasks. A single model that supports all languages, dialects, and modalities will help serve more people, keep translations up to date, and create new experiences for billions of people equally.
