Date: 2022-03-07

Computer-based automatic translation is an evolving field that is expected to improve in the coming years.

Translation has always been a difficult task. While translating literary works is a creative endeavour, translation for administration, governance and tourism is a mundane but important necessity. With advances in computer technology, translation has become easier and almost immediate. For example, Google Translate, a popular online product that you might already have used, translates billions of words per day.

Translation techniques

Computer-based translation dates from the 1950s. The early idea was to look up words in the target language and to apply rules for grammar, syntax and idioms that were keyed into the software. In the 1980s, Statistical Machine Translation (SMT) became popular: it took as input documents that had already been translated by humans (for example, from the UN and from government departments) and estimated how likely each word, phrase or sentence in the target language is to appear as the translation of the corresponding input in the source language. SMT then uses these probabilities to predict the most likely translation of a new input sentence. This technique was widely used and is still a workable solution. When Google Translate was introduced in 2006, it used this technique.
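The statistical idea can be illustrated with a minimal sketch. It assumes a tiny, made-up English-French corpus and estimates word translation probabilities simply by counting which words appear together in aligned sentence pairs. Real SMT systems use far larger corpora and more careful alignment models; the corpus and the function names here are hypothetical and only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (English, French) sentence pairs.
# Real systems learn from very large collections of human translations.
PARALLEL = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
    ("a car", "une voiture"),
]


def translation_probabilities(pairs):
    """Estimate P(target word | source word) from sentence-level co-occurrence."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        for s in source.split():
            for t in target.split():
                counts[s][t] += 1
    probs = {}
    for s, counter in counts.items():
        total = sum(counter.values())
        probs[s] = {t: c / total for t, c in counter.items()}
    return probs


def best_translation(word, probs):
    """Pick the target word with the highest estimated probability."""
    candidates = probs[word]
    return max(candidates, key=candidates.get)


probs = translation_probabilities(PARALLEL)
for english_word in ("house", "car", "the", "a"):
    print(english_word, "->", best_translation(english_word, probs))
```

Even with four sentence pairs, "house" co-occurs with "maison" more often than with any other French word, so the counts alone point to the right translation.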

Another successful technique applies mathematics to represent words and sentences as arrays of numbers. This is called “word embedding”, and it enables an Artificial Neural Network (ANN) to identify clusters of words with semantic relations. For example, when many sentences are given as input, word embedding can form clusters of related items such as birds, nationalities or financial phrases. This is possible because similar words are used in similar contexts: names of wild animals, for instance, tend to appear alongside words such as forest and zoo, and so a cluster of animal nouns can be formed. This idea was made popular in the 1950s by the British linguist John Rupert Firth, who said, “You shall know a word by the company it keeps”. A similar use of context can be seen in auto-completion: when you type a few words into a search box (for example in Google), several likely completions are suggested.
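A rough sketch of "knowing a word by the company it keeps" is shown below. It does not train a neural network: as a simplifying assumption, each word is represented by counts of the other words that share a sentence with it, and two words are compared with cosine similarity. The sentences, the stop-word list and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy sentences; learned embeddings need far more text.
SENTENCES = [
    "the tiger lives in the forest",
    "the lion lives in the forest",
    "the tiger sleeps in the zoo",
    "the lion sleeps in the zoo",
    "the bank lends money to customers",
    "the bank holds money for customers",
]

# Assumed list of very common words that carry little meaning here.
STOPWORDS = {"the", "in", "to", "for"}


def context_vectors(sentences):
    """Represent each word by counts of the words sharing a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = [w for w in sentence.split() if w not in STOPWORDS]
        for i, word in enumerate(words):
            for j, neighbour in enumerate(words):
                if i != j:
                    vectors[word][neighbour] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


vectors = context_vectors(SENTENCES)
print("tiger vs lion:", cosine(vectors["tiger"], vectors["lion"]))
print("tiger vs bank:", cosine(vectors["tiger"], vectors["bank"]))
```

Because "tiger" and "lion" appear in the same contexts, their vectors point in the same direction, while "tiger" and "bank" share no context words in this toy data.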

Word embedding, i.e., the mathematical representation of words and sentences as arrays of numbers to identify semantic relations, contributed to advances in Neural Machine Translation (NMT). In NMT, an Artificial Neural Network (ANN) performs the operations of “encode”, “attention” and “decode”. Here, to “encode” means to continuously refine the meaning based on the words read so far. This can be understood better with an example. Assume we assign a value of 4 to the word “bank” when it occurs beside a river but a value of 6 to the word “bank” when it is a financial institution. Now consider the input “He crossed the river to reach the bank”. When NMT reads the word “crossed”, it starts to focus on clusters like bridge, river, channel and sea. When NMT reads “river”, it keeps its focus on words associated with “river” and typical places around it, gleaned from past examples. Finally, when NMT reaches “bank”, it knows that we are talking about a river bank (and not about a financial institution) and it assigns the value 4 instead of 6. This value of 4 is carried forward to the next sentence, and thus context is preserved. This assigning of values to each word based on context is the “encode” operation.

The encoded sentence is then used to predict the first translated word, say T1, by calculating probabilities of suitable potential output words. This assigning of probability weights is called the “attention” process. During the “decode” operation, each previously translated word and the encoded input are used, along with the probability weights, to generate one output word in each step.
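The weighting step can be sketched in a few lines. The example assumes one common formulation, dot-product attention, in which each encoded input word is scored against the decoder's current state and the scores are turned into weights that sum to 1. The two-dimensional vectors and the names used are made up for illustration and do not come from a trained model.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoded):
    """Return attention weights and the weighted average of the encoded words."""
    scores = [sum(q * h for q, h in zip(query, vec)) for vec in encoded]
    weights = softmax(scores)
    context = [
        sum(w * vec[d] for w, vec in zip(weights, encoded))
        for d in range(len(query))
    ]
    return weights, context


# Hypothetical encodings: "bank" has been encoded close to "river".
WORDS = ["river", "reach", "bank"]
ENCODED = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
QUERY = [1.0, 0.0]  # hypothetical decoder state while producing one output word

weights, context = attend(QUERY, ENCODED)
for word, weight in zip(WORDS, weights):
    print(f"{word}: {weight:.2f}")
```

In this toy setting the decoder pays most attention to “river” and “bank” and least to “reach”, which is the kind of weighting the “attention” step is meant to produce.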

Understanding complexities

Yet another recent advance in Neural Machine Translation (NMT) is the “Transformer”, which introduces additional intermediate steps by repeating the “attention” step several times to fine-tune the probability weights of the potential output words.

Popular examples are sentences such as “The animal didn’t cross the river as it was wide” and “The animal didn’t cross the river as it was tired”. The word “it” means different things in the two sentences (the river in the first, the animal in the second), and with Transformers NMT can resolve even such ambiguities accurately.

With the adoption of NMT, computers can translate between many languages. For example, Google Translate can handle more than 100 languages and can translate between any pair of languages in this group. In commercial applications, such as mobile phones and software products, NMT is currently the preferred choice for translation. Automatic translation is an evolving field, and more improvements are expected in the coming years.

S. Varahasimhan is a senior employee at a software product MNC in Chennai.