June 23, 2023 04:43 pm | Updated 04:44 pm IST

Among non-English Wikipedias in languages spoken in India, Urdu, Hindi and Tamil have the most articles. A non-English Wikipedia is not a translation of the English one. It is self-sustaining: active users and moderators create and moderate content in their own languages. Among languages that are mostly confined to a State, Tamil leads by a wide margin, with 1.6 times more articles than the second-placed Marathi, followed by Malayalam and Telugu.

Understandably, when all the global languages are considered, English leads the list with 66,71,236 articles (Chart 1).

Chart 1 | The chart lists the 320 languages in which Wikipedia articles are available. The bigger the bubble, the greater the number of articles.

Interestingly, Cebuano, a regional language spoken widely in the Philippines, has the second-highest number of articles on Wikipedia (61,23,197). The Cebuano entries are written in the Latin script. However, news reports show that many of the Cebuano entries were created by a bot.

German (around 28.1 lakh), Swedish (25.6 lakh), French (25.3 lakh) and Dutch (21.2 lakh) are the other prominent languages in which a considerable number of Wikipedia articles are maintained. There are relatively few articles in Chinese and Cantonese (13.6 lakh articles and 1.3 lakh, respectively) despite the fact that many more people speak these languages.

Chart 2 | The chart lists the 23 languages spoken in India in which Wikipedia articles are available.

Urdu, Hindi, and Tamil lead with 1.5 lakh-2 lakh articles each, followed by Bangla, spoken widely in West Bengal and Bangladesh, with 1.4 lakh articles. Among other languages confined to a State, Marathi, Malayalam, Telugu, and Punjabi dominate, with 0.5 lakh-1 lakh articles each. There are around 12,000 articles in Sanskrit and around 15,000 in Sindhi.

There are no Wikipedia articles in two of the 22 languages in the Eighth Schedule of the Constitution: Bodo and Dogri. On the other hand, Bhojpuri, Bishnupriya, and Tulu (with just 1,884 articles and featuring last) are the non-scheduled languages in which Wikipedia articles are available. Of them, interestingly, there are over 25,000 articles in Bishnupriya, which had 79,646 recorded speakers as per the 2011 Census. The number of articles in Bishnupriya is just 5,000 fewer than the number in Gujarati and Kannada.

Chart 3 | The chart shows the number of Wikipedia administrators available in each language, who can delete and undelete pages, block users, edit protected pages, and grant rights to others. They have been given extra editing privileges by the Wikipedia community.

English-language administrators dominate (898), while German and French are a distant second and third (Chart 3). Among the Indian languages, Tamil leads with 35 administrators, followed by Malayalam (15) and Bangla (14). Hindi has six administrators and Sanskrit, three.

Chart 4 | The chart shows the number of Wikipedia users. A user is one who has created an account on the site.

Those who browse Wikipedia without registering are not considered users. English dominates with over 4.5 crore users, while all the other languages have fewer than 1 crore users each (Chart 4). Among the Indian languages, Hindi dominates with 7.6 lakh users, and among languages mostly confined to a State, Tamil leads with 2.2 lakh.

Chart 5 | The chart shows the number of active Wikipedia users.

An active user is a registered user who has performed an action in the past month, such as editing an article or taking part in page discussions. The languages that dominate among active users are similar to those that dominate among registered users.

Source: Wikimedia Statistics and Census of India

