The problem with cherry-picking data

If it’s the government’s case that NSSO figures are suspect, what has it based policy decisions on?

April 24, 2019 12:02 am | Updated 12:02 am IST

Minister of State for Housing and Urban Affairs Hardeep Singh Puri said last week, “we definitely have a data crisis,” and blamed academics for creating a “false narrative”. Yet, at the heart of the data crisis in India is the Central government, which has been holding back important data. Most recently, it did not announce the data on employment created by the ‘Mudra’ scheme. Earlier, the National Sample Survey Office (NSSO) data on employment were withheld. Data on farm suicides have not been available since 2016. Data are being withheld precisely where experts have flagged problems, such as on employment, farmers’ crisis and economic growth.

Clashing with reality

The NSSO data (which have not been released officially) undermine the National Democratic Alliance government’s claims on job creation. In fact, they showed massive unemployment. Demonetisation and the implementation of the goods and services tax, both of which undermined the unorganised sector, which employs 94% of the workforce, have hurt employment. Data from the Centre for Monitoring Indian Economy (CMIE) and others have confirmed the loss of jobs. The NSSO and CMIE data are based on household surveys, which capture any additional employment created by Mudra loans, taxi aggregators, e-commerce, etc. In other words, even after counting these new jobs, more jobs are being lost than created, so the net effect is a decline in employment.

The government had promised to double farm incomes by 2022. But farmers’ incomes have come under pressure due to falling farm produce prices and rising input costs. This was aggravated by demonetisation, with cash shortages in rural areas compelling farmers to sell to traders at lower prices to get cash. Data on farmer suicides have not been released on schedule even though the National Crime Records Bureau (NCRB) collects them annually.

The government has implicitly admitted that there is a crisis in the farming and unorganised sectors, and due to that in employment generation. That is why it announced an annual ₹6,000 support to farmers owning up to five acres of land and promised insurance to workers in the unorganised sector. It has also increased allocations for the Mahatma Gandhi National Rural Employment Guarantee Scheme (MGNREGS) from ₹55,000 crore to ₹60,000 crore. This allocation is inadequate, but it does indicate that the government is forced to acknowledge the crisis facing the poor.

To counter the argument of a crisis facing large segments of the population, the government first tried to discredit alternative arguments and then changed its stance to say that data on the unorganised sector employment were bad. In the process, it discredited its own agency’s data.

But the data on employment put out by the NSSO have long been used. Were they all incorrect too? If so, policy formulation based on those faulty (or non-existent) data was also incorrect. That would render suspect not only the policies of past governments, but even this government’s.

Most critically, if data on unorganised sector employment are not reliable or are non-existent, then GDP data are also not credible. After all, such data must factor in the contribution of the unorganised sector. The implication is that the government is only estimating growth on the basis of the organised sector of the economy, a point repeatedly made by this writer in the last two years.

In brief, the government has tied itself up in knots by saying that no one has credible employment data. So, GDP data also become suspect and so does the claim of 7% rate of growth. If the GDP calculation is inaccurate, can the budget figures based on this data be relied upon? Using the organised sector growth to represent the unorganised sector growth is somewhat acceptable only if the two components are moving in the same direction. But that has not been true post-demonetisation, which decimated the unorganised sector. So, the official figures represent only the organised sector. If the alternative data on unorganised sector growth are included, then the rate of growth would turn out to be less than 1%. This would be consistent with the crisis of the unorganised sector, agriculture and employment. A 7% growth rate of the economy is not consistent with this crisis.
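The arithmetic behind this point can be sketched as a share-weighted average of the two sectors' growth rates. The snippet below is only an illustration: the function name, the 50% organised-sector share and both unorganised-sector growth rates are hypothetical placeholders chosen to show the mechanism, not figures from this article or from any official source.

```python
def aggregate_growth(organised_share, organised_growth, unorganised_growth):
    """Share-weighted growth rate of the whole economy, in percent.

    organised_share is the organised sector's share of output (0 to 1);
    the unorganised sector is assumed to make up the remainder.
    """
    if not 0.0 <= organised_share <= 1.0:
        raise ValueError("organised_share must lie between 0 and 1")
    unorganised_share = 1.0 - organised_share
    return (organised_share * organised_growth
            + unorganised_share * unorganised_growth)


# Hypothetical inputs, NOT taken from the article or any official data.
ORGANISED_SHARE = 0.5    # assumed share of output
ORGANISED_GROWTH = 7.0   # assumed organised-sector growth, percent

# Case 1: the unorganised sector is assumed to grow like the organised one,
# which is what using the organised sector as a proxy amounts to.
proxy = aggregate_growth(ORGANISED_SHARE, ORGANISED_GROWTH, 7.0)

# Case 2: the unorganised sector is assumed to be shrinking instead.
diverging = aggregate_growth(ORGANISED_SHARE, ORGANISED_GROWTH, -5.0)

print(proxy, diverging)
```

With these placeholder inputs the proxy assumption reproduces the organised sector's rate for the whole economy, while a contracting unorganised sector pulls the aggregate far below it. How far depends entirely on the actual shares and growth rates, which the sketch does not attempt to estimate.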

The situation becomes even worse when quarterly GDP growth figures are relied on. They are based only on the corporate sector data and not even the organised sector. Thus, they are even less representative of growth of the economy.

The government cites World Bank and International Monetary Fund figures in support of its contention of 7% growth. But these agencies do not collect independent data and only reiterate the official data. So, their figures are not independent endorsements of government data.

To be fair, all governments in the past have manipulated data. The budgetary figures (fiscal deficit, expenditures and revenues) are fudged. Critics point to creative accounting in the budget every year. Data on education and health are also manipulated to show better performance. Whenever a change in the GDP base has been announced, experts have pointed to flaws. Inflation measured by the wholesale price index has been criticised for not representing the services sector, and hence understating inflation.

Risk of arbitrariness

What is new is the complete denial of data collected by official agencies. If the government wishes to revamp data collection, it cannot be done arbitrarily. Expert committees must be appointed to work on the modification of methodology and the database. Even this would not account for the substantial black economy.

In brief, the present government is denying the data of its own agencies or modifying data arbitrarily. This opens the door for future governments to do the same. Tomorrow, when the inflation rate rises, a government can claim that data on prices are faulty. If so, the bottom falls out of the calculation of dearness allowance for the organised sector, and budget formulation is affected. Further, the calculations of the Reserve Bank of India (RBI) also go wrong since it is supposed to target inflation. If the data on both growth and prices are denied, what would the RBI target? No one says that data cannot be improved, but denying the existing official data only creates problems for policy and its credibility.

Arun Kumar is at the Institute of Social Sciences, New Delhi and author of ‘Ground Scorching Tax’, 2019
