March 27, 2011 10:30 pm | Updated November 17, 2021 01:11 am IST

Several new applications relating to freely available ‘open data’ are being released across the Net. This edition of NetSpeak examines this emerging trend and discusses the significance of such applications.

The open data movement, based on the concept of making all kinds of data machine-readable with unrestricted access, is spreading like wildfire all over the Net. As mentioned in the past, many governments across the globe have adopted this idea to share their data.

The Canadian government's ‘Open Data Pilot Project’ is an instance of this ongoing trend. It is widely felt that such attempts by governments will enhance transparency in public life.

Along with the data, one should provide the means to make it comprehensible and useful to stakeholders. The data should be published in such a manner that a tech entrepreneur can create valuable applications out of it with ease. Opening up the data fire-hose offers new opportunities to the IT industry as well. For companies that utilise ‘open data’, it would be almost like functioning in an industry segment where raw materials are available for free. The software developer's challenge is to produce innovative applications that are useful for the layperson, utilising the enormous data at their disposal. As mentioned in an earlier column, the World Bank opened up its data in 2010. As part of its endeavour to generate innovative and useful applications that utilise its data, the World Bank floated a competition called ‘Apps for development’.

This project could attract many software developers and development practitioners worldwide and lead to a variety of applications that can be used by researchers, development economists, the teaching fraternity and so on.

Worldstat/wb, an educational application created using World Bank data, is a good instance of this trend. While teaching development issues, teachers have to present statistics on a variety of development parameters (such as population, mortality rate, teledensity and so on). In a fast-developing world, many such indicators change constantly, and unless you have the latest data, interpretations could go haywire. Worldstat/wb attempts to plug this educational bottleneck by presenting the latest data from the World Bank development indicators database in a comprehensible fashion. The tables and graphs shown in the application are generated directly from World Bank data and so get updated automatically, which helps teachers discuss issues using the latest facts and figures.
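For readers who write code, here is a minimal sketch of how an application might pick the latest available figure for an indicator from World Bank data. It is not how Worldstat/wb itself works, which the column does not describe. The base URL, the indicator code SP.POP.TOTL and the response shape (a two-item list of paging metadata followed by records with "date" and "value" fields) are assumptions, and the sample figures are made up.

```python
from urllib.parse import urlencode

# Assumed base URL of the World Bank's public data API (not given in the column).
API_BASE = "https://api.worldbank.org/v2"


def indicator_url(country, indicator):
    """Build the request URL for one country and one indicator."""
    query = urlencode({"format": "json", "per_page": 100})
    return f"{API_BASE}/country/{country}/indicator/{indicator}?{query}"


def latest_value(payload):
    """Return (year, value) for the most recent year that has a value.

    Assumes the payload is a two-item list: paging metadata followed by
    a list of records, each with "date" and "value" keys.
    """
    records = payload[1] or []
    dated = [(int(r["date"]), r["value"]) for r in records if r["value"] is not None]
    return max(dated) if dated else None


# Hand-written sample in the assumed response shape; the figures are made up.
sample = [
    {"page": 1, "pages": 1, "per_page": 100, "total": 3},
    [
        {"date": "2010", "value": None},
        {"date": "2009", "value": 120.5},
        {"date": "2008", "value": 118.0},
    ],
]

print(indicator_url("IN", "SP.POP.TOTL"))
print(latest_value(sample))
```

In a real application the URL would be fetched over the network and the JSON reply passed to latest_value; the sample here stands in for that reply so the sketch runs offline. Skipping years with no value matters because the most recent years are often still unreported.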

Another application that could come in handy for a teacher is the “World Bank Data Mapper”. This application can map thousands of development indicators for countries across the globe. The World Bank widget is yet another application with educational value. The widget, which can be embedded on a web page, shows how a country performs against the world average on different targets of the Millennium Development Goals. The Chrome extension Economic Data Finder is yet another attempt at exploiting the open database of the World Bank.

While reading a web article that deals with economic data, you may come across several development indicators.

Now, if you wish to obtain additional information on the indicators discussed in that piece, you may find the Economic Data Finder extension useful. The extension lets you grab the relevant information from the World Bank database with a couple of mouse clicks.

Google ngram viewer

Data analysis is not limited to slicing and dicing quantitative data; one can obtain new insights by analysing text data as well. A huge corpus of books in digital form is now available in the public domain through book scanning projects such as Google Books, Project Gutenberg and the like. With millions of books available in digital form, one can easily analyse the text data. This opens up immense possibilities for exploring cultural trends, spotting changes in speech structure and word usage over time, and the like.

Let us explore this idea with an example. Romance is an emotion that enthralls everyone irrespective of age or race or creed. Though the basic emotion of love has not changed, the words used by lovers to express their feelings keep on changing.

Which word is more popular now: darling, honey or sweetie? And how has the popularity of these words changed over time? One way to capture the different words used by lovers is to analyse the words used in various books during intense romantic moments. If you are curious about this theme, the Google Books Ngram viewer, as described in a blog post, could be of use.

Ngram viewer is a Google tool that shows, as a graph, how frequently words or phrases have been used in books over time. For this, the service scans millions of books stored in Google's book database. The tool can be used to compare the changing popularity of different words. For instance, if you compare the words ‘fantastic’ and ‘awesome’, you will find that the popularity of ‘fantastic’ peaked during the 1960s and that ‘awesome’ is picking up now.
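The underlying idea can be illustrated with a few lines of code. The sketch below is a simplification and not Google's actual method: it assumes a corpus of (publication year, text) pairs, splits the text on whitespace, and reports the share of each year's words that match a given word. The function name and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def yearly_frequency(corpus, word):
    """Return {year: share of that year's words that equal `word`}."""
    hits = Counter()            # occurrences of the word per year
    totals = defaultdict(int)   # total words seen per year
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(word.lower())
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}


# Tiny made-up corpus of (publication year, text) pairs.
corpus = [
    (1960, "what a fantastic day a fantastic view"),
    (1960, "an awesome storm"),
    (2010, "that was awesome truly awesome"),
    (2010, "a fantastic idea"),
]

for word in ("fantastic", "awesome"):
    print(word, yearly_frequency(corpus, word))
```

Dividing by the total number of words in each year, rather than reporting raw counts, is what makes different years comparable, since far more books are available for some years than for others.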

Q&A service

As discussed in an earlier column, the service Stack Exchange acts as a platform for creating ‘Question & Answer’ services. Though a majority of the Q&A services launched from this platform cover technical subjects (such as Stack Overflow for programmers, Statistical Analysis, TeX and so on), some non-technical Q&A services are also in place. The Q&A service meant for ‘English Language and Usage’ is a good example. The Q&A services meant for cooking and for writers are other examples in this genre.

