Eliminating translation bugs

The Paribhashika software, jointly developed by C-DAC and Kerala Bhasha Institute, can translate complex English sentences into Malayalam with a high degree of accuracy.

November 02, 2013 02:00 pm | Updated 02:03 pm IST - Kozhikode:

Ever since Malayalam computing took off on a larger scale around the turn of the millennium, the lack of an accurate English-to-Malayalam translator has been a persistent gap.

Giants such as Google have found the script and the complicated syntax challenging, and have yet to include the language in their list of supported translations. Other, smaller-scale attempts have been unsuccessful.

With the release on Friday of ‘Paribhashika,’ a translator jointly developed by the Centre for Development of Advanced Computing (C-DAC) and the Kerala Bhasha Institute, this scenario is set to change. It is equipped to translate complex sentences with multiple structures. Entire books can be processed with the software, which is touted as the first of its kind.

The idea of translators for regional languages was mooted six years ago by the Central Government’s Department of Information Technology. Since then, a dedicated team of eight at the C-DAC office has been working on the syntax and the various inflections of root words.

“We started working with ‘Anglobharathi,’ an engine for Indian languages developed by R.M.K. Sinha, a Professor with IIT, Kanpur, as the base. The language rules were provided by the Bhasha Institute. Later, we sought the help of linguists such as V.R. Prabodhachandran Nair. Malayalam did not have a computational grammar, which we had to develop,” says P.K. Bhadran, Associate Director, C-DAC.

A particular challenge for the team was the structure of the sentences. In English, the order is subject-verb-object, but in Malayalam it is subject-object-verb. The language is also peculiar in that a single root word can take many inflections: ‘aval,’ meaning ‘she/her,’ for example, has the inflections ‘avalkku,’ ‘avalude,’ ‘avale,’ ‘avalum’ and so on.
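
The two points above can be illustrated with a minimal Python sketch. It is not how Paribhashika works internally; the function names `inflect` and `svo_to_sov` are hypothetical, the grammatical labels attached to each suffix are assumptions added for readability, and only the root ‘aval’ and its four inflected forms come from the article.

```python
# Illustrative sketch only; not the Paribhashika implementation.

ROOT = "aval"  # 'she/her', as quoted in the article

# Suffixes read off the inflected forms quoted in the article.
# The case labels used as keys are assumptions for readability.
SUFFIXES = {
    "dative": "kku",
    "genitive": "ude",
    "accusative": "e",
    "conjunctive": "um",
}


def inflect(root, case):
    """Attach the suffix for the given (assumed) case label to a root word."""
    return root + SUFFIXES[case]


def svo_to_sov(subject, verb, obj):
    """Reorder an English-style (subject, verb, object) triple into
    the subject-object-verb order that Malayalam uses."""
    return (subject, obj, verb)


# All inflected forms of the root, in the order listed in SUFFIXES.
forms = [inflect(ROOT, case) for case in SUFFIXES]
print(forms)

# Placeholder tokens stand in for real words; only the ordering matters here.
print(svo_to_sov("SUBJECT", "VERB", "OBJECT"))
```

A real translator has to do far more than this, since the choice of suffix depends on the role the word plays in the sentence, which is why the team needed a computational grammar rather than a lookup table.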

“The usual translators such as Google work with a statistical translation formula whereby the frequency of usage of a particular word is a gauge for its accuracy. These translations had a structural ambiguity. But we worked with knowledge-based translation, which makes use of the Interlingua approach. This process has source language analysis and target language generation,” says Mr. Bhadran.

To illustrate the complexities involved in digital translation, he cites the classic example of the English sentence ‘My body is weak, spirit is willing,’ which, when translated into Russian and then back into English, came out as ‘My flesh is rotten, spirit is good.’

The makers say that the translator is 75 per cent accurate as of now, and that it is a work in progress whose accuracy is improving with every passing day. It can translate science, law, and official texts efficiently. Literary texts, where writers’ imaginations run wild, remain a challenge, one that spurs the team to keep improving the software. They have already successfully translated a book on the Raman Effect using the software.
