The ‘open data’ movement that strives to make digital data freely available to the public is creating waves across cyberspace. This edition of NetSpeak discusses the features of this fascinating movement.
Almost all aspects of our life are getting computerised and an offshoot of this IT-enabled life process is the generation of digital data. Though for a layperson this data amounts to a few facts and figures, for a researcher this could be a valuable raw material.
For instance, a digitised hospital can capture complete patient data with ease. If all hospitals keep such data in a standard format, it becomes a goldmine for a health researcher who has access to it.
Likewise, if a magazine publisher creates a database of its institutional subscribers, an information scientist could use this database for listing out the libraries/institutions that subscribe to this magazine.
This will be helpful for researchers who wish to read that magazine.
So the bottom line is: if data is stored in a standard format and made easily accessible, the community will find umpteen ways to create value out of it.
Though digital systems produce an immense amount of data, for various reasons much of it cannot be used for further research: organisations that create the data often keep it to themselves. The advantage of digital data is that it can be processed rather easily.
However, for this to happen, the data should come in a standard format with the necessary documentation. This is the context in which the open data movement assumes significance.
The open data movement strives to make all data freely available to the public. Like its open source software counterpart, open data can be freely downloaded/shared without any restrictions and it can be mixed with other similar data sources.
In addition, for the data to be truly in the open data realm, its owner should offer the necessary software tools (an API, or application programming interface) for accessing it programmatically.
This will enable programmers to build new applications using the data set.
For instance, a real estate owner can create an application for helping home buyers choose a proper house by extracting data from diverse data sets such as the hospital database, educational institutions database and the like. One possible use for such an application could be the search facility that enables the client to obtain a list of houses near certain educational institutions.
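The search facility described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the house and school records, their field names and the one-kilometre radius are all hypothetical, and a real application would load the records from published open data sets.

```python
import math

# Hypothetical sample records; a real application would load these
# from published open data sets instead.
HOUSES = [
    {"id": "H1", "lat": 12.9716, "lon": 77.5946},
    {"id": "H2", "lat": 12.9352, "lon": 77.6245},
    {"id": "H3", "lat": 13.0827, "lon": 80.2707},
]
SCHOOLS = [
    {"name": "School A", "lat": 12.9719, "lon": 77.5937},
    {"name": "School B", "lat": 13.0800, "lon": 80.2700},
]


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    radius = 6371.0  # mean Earth radius in kilometres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def houses_near_schools(houses, schools, max_km=1.0):
    """Return (house id, school name) pairs lying within max_km of each other."""
    matches = []
    for house in houses:
        for school in schools:
            d = distance_km(house["lat"], house["lon"],
                            school["lat"], school["lon"])
            if d <= max_km:
                matches.append((house["id"], school["name"]))
    return matches
```

Calling `houses_near_schools(HOUSES, SCHOOLS)` pairs each house with every school inside the radius. The point is that two independently published data sets can be combined only because both carry coordinates in a common, documented form.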
Yet another advantage of having data in the public domain is that it helps search engines index it.
For instance, Google indexes public data available on the Net and presents it in a user-friendly fashion.
If you invoke a search on Google with some data-related parameter (like ‘gdp of India’), it immediately produces the pertinent figure, along with an interactive graph. By clicking on this graph one can compare the GDP of one country with other world economies instantaneously. In addition, Google allows you to visualise certain public data in multiple ways: as a line graph, bar chart, bubble chart and the like.
The concept of open data is gaining ground with many organisations publishing their data in open mode.
The recent announcement of the World Bank opening up its huge database to the public with a user-friendly interface is an excellent instance of this trend. The service enables you to access several key development indicators (pertaining to all countries) with a few mouse clicks. In addition, the WB offers APIs that enable software developers to create web applications based on its data.
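As a rough sketch of how a developer might consume such an API, the Python snippet below builds a request for one indicator and converts the reply into a year-to-value table. It assumes the World Bank's present-day `v2` endpoint layout, the indicator code `NY.GDP.MKTP.CD` (GDP in current US dollars) and annual data; the function names are our own, not part of any World Bank library.

```python
import json
import urllib.request

# Assumed base address of the World Bank's public indicators API.
BASE_URL = "https://api.worldbank.org/v2"


def indicator_url(country, indicator):
    """Build the request URL for one country and one indicator, as JSON."""
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?format=json"


def parse_indicator(payload):
    """Convert a [metadata, records] reply into {year: value}.

    Records without a value are skipped. Assumes annual data, so that
    each record's "date" field is a plain year.
    """
    if not isinstance(payload, list) or len(payload) < 2 or not payload[1]:
        return {}
    return {
        int(record["date"]): record["value"]
        for record in payload[1]
        if record["value"] is not None
    }


def fetch_indicator(country, indicator):
    """Download and parse one indicator (needs a network connection)."""
    with urllib.request.urlopen(indicator_url(country, indicator)) as response:
        return parse_indicator(json.load(response))
```

A call such as `fetch_indicator("IN", "NY.GDP.MKTP.CD")` would then return a dictionary keyed by year, ready for charting or for mixing with other data sets.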
Besides public institutions like the World Bank, several governments across the world are opening up their data. Governments of countries such as the U.S. and the U.K. are spearheading this trend. Check out the web page here that lists out links to a huge collection of open data sets on a variety of subjects.
To get a feel of the potential of open data and the latest developments in this arena, listen to this presentation by Tim Berners-Lee.
In India too the open data movement is slowly gaining momentum.
The RBI's database on the Indian economy and the Election Commission of India's data are instances of this trend. But most of this data is available only as PDF files and in similar forms. Though a human reader can consume such data easily, the lack of APIs makes it hard to build new applications on top of it.
As mentioned earlier, open data becomes more valuable when one can build innovative applications by mixing different databases. This is the context in which OpenCivic, a movement to convert data from non-machine-friendly forms into machine-readable, remixable formats, comes in.
OpenCivic provides a set of APIs that lets developers build engaging applications using civic participation data, observes Akshay Surve, the brain behind this innovative venture.
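To give a feel for what moving data from a human-friendly to a machine-readable form involves, here is a minimal Python sketch. It is not OpenCivic's method: the sample text, the column names and the rule of splitting on two or more spaces are all assumptions made for illustration, standing in for text copied out of a PDF table.

```python
import csv
import io
import re

# Hypothetical lines, as they might appear after copying a table out of a PDF.
RAW_TEXT = """\
Constituency      Electors   Votes Polled
North District    120,500    84,350
South District     98,200    61,874
"""


def parse_table(text):
    """Split fixed-width text rows on runs of two or more spaces."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[1:]:
        cells = re.split(r"\s{2,}", line.strip())
        record = dict(zip(header, cells))
        # Turn figures such as "120,500" into integers.
        for key, value in list(record.items()):
            digits = value.replace(",", "")
            if digits.isdigit():
                record[key] = int(digits)
        rows.append(record)
    return rows


def to_csv(rows):
    """Serialise the parsed records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Once the rows exist as records with typed values, they can be exported as CSV, served through an API or joined with other data sets. Real PDF tables are rarely this tidy, which is why such conversion work is valuable.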
The concept of open data is fast gaining acceptance and is all set to reach critical mass. It has assumed the shape of a movement and has given rise to an open-data ecosystem that consists of open data publishers, software developers, data analysis tools, researchers and end-users. The open data movement certainly puts a wide variety of data at your fingertips.
However, making the data available to the public is just one small step in the process of unlocking information from it. The open data movement becomes meaningful only if we liberate/strengthen the data analysis infrastructure as well. Of course, the community of software developers and analysts is also rising to the occasion; the surging popularity of the open source data analysis package ‘R’, discussed in the past, is an instance of this trend.
A download tool
Rapidshare and Megaupload are some of the online storage locations where you can find a wealth of digital resources.
However, downloading the resources via a browser in the usual manner could be time-consuming.
One way to avoid this browser-based download drudgery is to rope in a download tool that can fetch resources from these services automatically. The advantage of such a tool is that you just need to paste the download links into it; the tool will take care of the rest of the download process.
The free software Mipony is one such program worth a test.
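The paste-the-links idea behind such tools can be sketched in Python as below. This is not how Mipony works internally: the function names are hypothetical, and the sketch assumes plain, direct download links, whereas file-hosting services usually add waiting pages and other steps that a dedicated tool handles for you.

```python
import os
import re
import urllib.request
from urllib.parse import urlparse


def extract_links(pasted_text):
    """Return the unique http(s) links found in pasted text, in order."""
    links = []
    for link in re.findall(r"https?://\S+", pasted_text):
        if link not in links:
            links.append(link)
    return links


def filename_for(url):
    """Derive a local file name from the last part of the URL path."""
    name = os.path.basename(urlparse(url).path)
    return name or "download.bin"


def download_all(pasted_text, target_dir="."):
    """Fetch every pasted link into target_dir, one after another."""
    for url in extract_links(pasted_text):
        destination = os.path.join(target_dir, filename_for(url))
        urllib.request.urlretrieve(url, destination)
```

Passing a block of pasted text to `download_all` queues up every link in it, which is the chore a download manager takes off your hands.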