Indus script does encode a language

An Indus seal from Mohenjodaro. Photo: Special Arrangement

A. Srivathsan & T.S. Subramanian

New study reported in Science shows it was no mere ‘chain of symbols’

Chennai: Computational science, information theory, and machine learning have now vindicated Indus Valley scholars, providing a new type of “quantitative evidence for the existence of linguistic structure in the Indus script, complementing other arguments that have been made explicitly or implicitly in favour of the linguistic hypothesis.” This quantitative evidence comes from the results of a statistical study published online recently in the journal Science.

Drawing from multiple disciplines, using rigorous equations, and through scientific number crunching, a team of scientists — including the well-known Indus script scholar, Iravatham Mahadevan — have demonstrated that the Indus script encodes a language and is not a mere “chain of symbols,” as an article published in 2004 claimed.

The seals and tablets of the Indus civilisation that flourished between 2500 and 1900 B.C. carry examples of what has long been understood to be writing in an unknown language. Despite many attempts, the script, known for 130 years, has not been deciphered. The 2004 article, published in the Electronic Journal of Vedic Studies, challenged the idea that the Indus script encoded language and suggested that it might have been a non-linguistic symbol system like the Vinča inscriptions of southeastern Europe and the Near Eastern emblem systems.

The new statistical study compared the pattern of symbols found on Indus Valley artifacts to five types of natural linguistic systems (the Sumerian logo-syllabic system, the Old Tamil alpha-syllabic system, the Rig Vedic Sanskrit alpha-syllabic system, English words, and English characters), four types of non-linguistic systems (including human DNA sequences and bacterial protein sequences), and the artificially created computer programming language, Fortran.

The decisive finding was that “the conditional entropy of Indus inscriptions closely matches those of linguistic systems and remains far from non-linguistic systems…The similarity in conditional entropy to Old Tamil, a Dravidian language, is especially interesting in light of the fact that many of the prominent decipherment efforts to date…have converged upon a proto-Dravidian hypothesis for the Indus script.”

The study is the collaborative work of Rajesh P.N. Rao, a University of Washington computer scientist; Nisha Yadav and Mayank N. Vahia of the Department of Astronomy & Astrophysics at the Tata Institute of Fundamental Research, Mumbai; Hrishikesh Joglekar, a software engineer from Mumbai; Ronojoy Adhikari, Faculty Fellow at the Institute of Mathematical Sciences, Chennai; and Mr. Mahadevan at the Indus Research Centre, Chennai.

Dr. Adhikari, who specialises in Novel Applications of Statistical Mechanics, has no doubt that the Indus script was part of a structured language. Opening his Nokia mobile phone, he types the letters H and A one after the other. The messaging service automatically fills the next two slots with V and E. “This,” he says, “is a simple algorithm the mobile phone uses to help you complete a word quickly. It works on the principle of correlation. In English, when you use the alphabet Q, the next one that follows is often U. Every language has a probability or flexibility of what token would come after another. A token could be an alphabet or punctuation or any component of the linguistic system. We have used the idea of entropy to measure the non-randomness in a linguistic system including the Indus script.”
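The correlation idea behind the phone example can be illustrated with a minimal Python sketch. It counts which letter most often follows another in a small word list. The word list and the function names (build_bigram_counts, most_likely_next) are assumptions made for illustration; this is not the algorithm of any actual phone, nor the method used in the study.

```python
from collections import Counter, defaultdict


def build_bigram_counts(words):
    """Count, for each letter, how often each other letter follows it."""
    counts = defaultdict(Counter)
    for word in words:
        # Pair every letter with the letter immediately after it.
        for current, following in zip(word, word[1:]):
            counts[current][following] += 1
    return counts


def most_likely_next(counts, letter):
    """Return the most frequent follower of `letter`, or None if unseen."""
    followers = counts.get(letter)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy word list, chosen only for illustration (hypothetical data).
SAMPLE_WORDS = ["queen", "quick", "quiet", "have", "hat", "hand"]
counts = build_bigram_counts(SAMPLE_WORDS)

print(most_likely_next(counts, "q"))
print(most_likely_next(counts, "h"))
```

In this toy list every Q is followed by U and every H by A, so the counts alone are enough to guess the next letter. A real predictive-text system would draw on far larger word lists and longer contexts.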

When Dr. Adhikari and his collaborators compared the conditional entropy of the Indus script with the conditional entropies of the various linguistic and non-linguistic systems, the results provided “quantitative evidence for the existence of linguistic structure in the Indus script.” “The Indus script,” he explains, “comes close to the entropy value of Old Tamil and lends credence to the debate that the Indus script is connected with the Dravidian language.”
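To give a rough sense of what conditional entropy measures, the following Python sketch estimates it from adjacent token pairs in a sequence, using the textbook definition H(next | current) = -Σ p(current, next) log2 p(next | current). It is a simple counting estimate under assumed toy data; the function name conditional_entropy and the sample sequences are hypothetical, and the article does not describe the study's own estimation procedure, which may differ.

```python
import math
import random
from collections import Counter


def conditional_entropy(tokens):
    """Estimate H(next | current) in bits from adjacent token pairs."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    first_counts = Counter(first for first, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (first, _), n in pair_counts.items():
        p_pair = n / total                        # p(current, next)
        p_next_given_first = n / first_counts[first]  # p(next | current)
        entropy -= p_pair * math.log2(p_next_given_first)
    return entropy


# A rigid sequence: the next token is fully determined by the current one.
rigid = list("ab" * 50)

# A random sequence over four symbols (seeded so the run is repeatable).
random.seed(0)
scrambled = [random.choice("abcd") for _ in range(5000)]

rigid_entropy = conditional_entropy(rigid)
scrambled_entropy = conditional_entropy(scrambled)
print(rigid_entropy, scrambled_entropy)
```

The rigid sequence gives zero bits, since nothing about the next token is uncertain, while the random four-symbol sequence comes out close to the maximum of two bits. Sequences with partial structure, such as natural-language text, would be expected to fall between these two extremes.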

The use of statistical methods is not new to research on the Indus script. The point of departure in the new study is the use of rigorous correlation techniques, a significant methodological advance.

Work on the Indus script continues. The temporal and spatial analysis of the script has been completed and awaits publication. There is scope to compare the Indus script with systems like Chinese pictograms and Egyptian hieroglyphs. Dr. Adhikari believes that all these efforts “are taking us closer to understanding the Indus script.”

