How many people died in India as a result of the COVID-19 pandemic? This question has become the subject of a heated argument after the World Health Organization (WHO) estimated India’s pandemic excess deaths at around 4.7 million. The Government of India issued a strongly worded response, and media houses and editors waded in. It is almost as though life and death themselves are now matters of opinion.
Here are some basic observations. First, we will never know precisely how many excess deaths occurred in India during the novel coronavirus pandemic. Second, all mortality studies, including the latest from WHO, involve choices about what data to include, how to fill gaps, and how to deal with uncertainty; there is always room for debate and disagreement about these choices. Third, uncertainty does not mean total ignorance: even the most optimistic reading of the data puts excess deaths at six or seven times official COVID-19 deaths.
The current dispute has been noisier than usual, but is not new. Several studies, most putting India’s pandemic excess deaths at between three and five million, have been met by strident Government “rebuttals”. These rebuttals have highlighted the uncertainties (which is valid), and then jumped — without justification — to claiming that there are no excess deaths beyond recorded COVID-19 deaths. The rebuttals are also littered with irrelevant, confused and absurd points.
The latest response is well summarised in its title: “India strongly objects to the use of mathematical models for projecting excess mortality estimates in view of the availability of authentic data”. The “authentic data” in question is mortality data from the Civil Registration System (CRS), and there are two implications: that researchers have ignored CRS data, and that CRS data does not support estimates of high pandemic mortality.
Both are false. Estimates of pandemic mortality, including those of WHO, are largely data-driven, and the main data-source is — you guessed it — the CRS. This data strongly supports estimates of high pandemic mortality. The “modelling” that the Government objects to is largely simple data analysis and techniques for filling gaps in the data, entirely unavoidable if we are to use CRS data to estimate excess mortality.
Civil registration data
To make sense of all this, we need to consider what data is available and what it shows. In 2021, journalists managed to obtain monthly death registrations at the State or city level; these crucial efforts led to the first hard evidence that official COVID-19 deaths were only the tip of the iceberg. Although valuable, the data is patchy: not all States and regions are covered; it often comes from online systems which do not capture all registrations; and it often misses the latter part of India’s devastating second wave. Nevertheless, a clear picture emerges: there was a gradual surge in deaths during the second half of 2020, which subsided and was followed by a tsunami of deaths during April-June 2021.
Although the data comes from local government records, the Health Ministry objects that it is “non-official”. However, there is no official CRS report for 2021, and only very recently (on May 3, 2022) was the 2020 CRS report made available. This report does not give monthly registrations, so it is hard to cross-check with earlier data. The yearly totals show some discrepancies; nevertheless, for the States whose data we had used in our own estimates, we found that the 2020 totals in the report were broadly in line with the earlier data.
With everyone in agreement on the value of CRS data, how does the Government propose to explain away the pandemic surge in deaths? We find the answer in a bizarre assertion: during 2020, the Government claims, 99.9% of all deaths in India were registered. The message is that what appears to be a rise in mortality in 2020 actually reflects a sharp improvement in registration. Note that this could never explain the bulk of excess deaths, which came later, during 2021. But is the claim of complete death registration in 2020 plausible?
On the contrary, it is absurd. Consider the data from Uttar Pradesh. The government Sample Registration System tells us to expect around 1.5 million deaths in Uttar Pradesh every year. But during 2020 only 0.87 million deaths were registered, around 60% of the expected toll. If registration was complete, then 2020 saw a huge, unexplained, drop in deaths in the State!
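The arithmetic behind the Uttar Pradesh example can be checked in a few lines. This is only a sketch using the two rounded figures quoted above; the variable names are ours, and the calculation takes the Sample Registration System figure as a fair guide to expected deaths in a normal year.

```python
# Figures quoted in the text for Uttar Pradesh, in millions of deaths.
expected_deaths = 1.5   # annual deaths suggested by the Sample Registration System
registered_2020 = 0.87  # deaths registered in the State during 2020

# Registered deaths as a share of the expected toll.
share_registered = registered_2020 / expected_deaths

# If registration really were complete, this is the fall in deaths
# that would have to be explained.
implied_drop = 1 - share_registered

print(f"Registered as a share of expected: {share_registered:.0%}")
print(f"Fall in deaths implied by complete registration: {implied_drop:.0%}")
```

On these rounded figures, registered deaths come to about 58% of the expected toll, so complete registration would imply a fall in deaths of roughly 42% in a pandemic year.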
Consider, also, Andhra Pradesh, where freely available CRS data tells a startling story: during 15 months from April 2020 to June 2021, over 50% more deaths were registered than expected. Could this reflect an improvement in registration? No. According to the 2019 CRS report, there was no room for improvement as death registration in the State was already complete before the pandemic. This is probably an overstatement; but however we look at it, Andhra Pradesh’s huge mortality surge cannot be explained via increased registration coverage.
It is possible that in some States, registration coverage improved during the pandemic. But, overall, registration probably dropped during 2020. Data from the Government’s latest National Family Health Survey suggests that deaths that occurred in 2020 were less likely to be registered than deaths in 2019. Birth registration data from the CRS points in the same direction: after increasing by 5% during 2017-18 and 7% during 2018-19, birth registrations fell by 2.5% in 2020.
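To see how sharp the reversal in birth registrations was, the quoted percentages can be chained into a simple index. This is a rough sketch: the base of 100 for 2017 is an arbitrary choice of ours, and we assume each percentage is the change from the previous calendar year.

```python
# Year-on-year changes in birth registrations quoted in the text.
# Assumption: each figure is the change relative to the previous year.
changes = {"2018": 0.05, "2019": 0.07, "2020": -0.025}

# Assumption: 2017 is set to an arbitrary base of 100.
index = {"2017": 100.0}
previous = "2017"
for year, change in changes.items():
    index[year] = index[previous] * (1 + change)
    previous = year

for year, value in index.items():
    print(year, f"{value:.2f}")
```

On this index, registrations rise from 100 to about 112 by 2019 and then slip back to about 109.5 in 2020, a clear break from two years of steady growth.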
Disruption to registration could have been especially severe in marginalised communities and in States where registration is anyway weak. In Uttar Pradesh, for example, both birth and death registrations fell sharply during 2020. Assuming that registration held steady during the pandemic, as we and many others have done, risks underestimating the mortality surge.
India was badly hit. A year ago, tragic stories of overflowing hospitals and oxygen shortages filled the news as the virus swept through the country. There is now a weight of evidence — not just from the CRS, but from surveys too — telling us that many millions died. Data is still emerging, and estimating pandemic mortality will be an ongoing effort; but this effort is undermined by the shrill, incoherent response from the Government following each study.
All the estimates come with uncertainty and depend on choices. For example, the WHO estimate drops from 4.7 million to 4.4 million if we consider the pandemic period to span April 2020-July 2021 rather than January 2020-December 2021. Acknowledging the uncertainties and debating the choices is natural, but is very different from dismissing the estimates.
Strengthen the CRS
The tragedy has been huge; but in the global context, India is not an outlier. Parts of the developing world and eastern Europe saw similarly high pandemic mortality. Historical weaknesses and deliberate dishonesty, well-documented by journalists, mean that India recorded only 10%-15% of its pandemic deaths. In this too, India is not alone. India’s all-cause mortality data is imperfect — but in many Asian and African countries, the data is even sparser. The current state of affairs highlights both the value of India’s CRS data, and the need to strengthen the CRS.
What is most troubling — and makes India stand out — is the relentless Government hostility towards every attempt to understand the pandemic. If the objections were made in good faith, the Government could accelerate the release of data, for example from the CRS for 2021 or from the Sample Registration System. Ultimately, the rift is not about science, data or methodology; the basic question is whether we wish to pursue the truth or not.
Aashish Gupta is a David E. Bell Fellow at Harvard University. Murad Banaji is a U.K.-based mathematician who has closely tracked India’s pandemic data.