With close to 4,00,000 cases being added every day, many scientists are asking whether a government-backed model called SUTRA, built to forecast the rise and ebb of the COVID-19 pandemic, may have had an outsized role in creating the perception that a catastrophic second wave was unlikely in India.
An official connected with the COVID-19 management exercise said, on condition of anonymity, that the SUTRA model input was “an important one, but not unique or determining”.
The SUTRA group had presented its views to Dr. V.K. Paul, who chaired a committee that got inputs from several modellers and sources. “The worst case predictions from this ensemble were used by the National Empowered Group on Vaccines and the groups headed by Dr. Paul to take measures. However, the surge was several times what any of the modellers had predicted,” the official said.
On May 2, the SUTRA group put out a statement, carried by the Press Information Bureau, saying that the government had solicited its inputs, in which it had said a “second wave” would peak by the third week of April and stay around 1 lakh cases. “Clearly the model predictions in this instance were incorrect,” the group noted.
Past its peak
SUTRA (Susceptible, Undetected, Tested (positive), and Removed Approach) first came to public attention when one of its expert members announced in October that India was “past its peak”. After new cases reached 97,000 a day in September, there was a steady decline, and one of the scientists associated with the model's development, M. Vidyasagar, said at a press conference then that the model showed the COVID burden was expected to be capped at 10.6 million symptomatic infections by early 2021, with fewer than 50,000 active cases from December. At that time, in October, there were 7.4 million confirmed cases, of which about 7,80,000 were active infections.
Computational biologist Mukund Thattai, of the National Centre for Biological Sciences, Bengaluru, summarised in a Twitter thread instances of the SUTRA forecasts being far out of bounds of the actual case load. “The so-called Covid ‘supermodel’ commissioned by the Govt of India is fundamentally flawed,” he tweeted. “Based on Prof. Agrawal’s [Manindra Agrawal of IIT-Kanpur] own posts, it was quite clear that the predictions of the SUTRA model were too variable to guide government policy. Many models got things wrong but the question is why the government continued to rely on this model, than consult epidemiologists and public health experts,” he told The Hindu.
Mr. Agrawal was one of the scientists involved in developing the model. In an email to The Hindu, Mr. Agrawal admitted that the model, which had multiple purposes, didn’t work well on a metric of “predicting the future under different scenarios”.
He said that, unlike many epidemiological models that extrapolated cases based on the existing number of cases, the behaviour of the virus and the manner of spread, the SUTRA model chose a “data centric approach”. The equation that gave estimates of the number of future infections, and of when a peak might occur, needed certain ‘constants’. These numbers kept changing, and their values relied on the number of infections being reported at various intervals. However, the equation couldn’t tell when a constant changed, so a rapid acceleration of cases couldn’t be predicted in advance.
Too many parameters
Rahul Siddharthan, a computational biologist at the Institute of Mathematical Sciences, in an email said no model, without external input from real-world data, could have predicted the second wave. However, the SUTRA model was problematic as it relied on too many parameters, and recalibrated those parameters whenever its predictions “broke down”. “The more parameters you have, the more you are in danger of ‘overfitting’. You can fit any curve over a short time window with 3 or 4 parameters. If you keep resetting those parameters, you can literally fit anything,” Mr. Siddharthan said.
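Mr. Siddharthan's point about resetting parameters can be illustrated with a toy example. The sketch below is not the SUTRA model: it assumes a plain two-parameter exponential curve, and the daily counts, growth rates and function names are all hypothetical. It fits the curve to a declining window, then refits it to a later surge. Each refit matches its own window almost perfectly, yet the parameters from the first window badly miss what follows.

```python
import math

def fit_exponential(counts):
    """Least-squares fit of log(count) = log(a) + r * t over one window."""
    n = len(counts)
    ts = list(range(n))
    logs = [math.log(c) for c in counts]
    t_mean = sum(ts) / n
    y_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, logs))
             / sum((t - t_mean) ** 2 for t in ts))
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), slope  # (a, r)

def predict(a, r, t):
    """Value of the fitted curve a * exp(r * t) at time t."""
    return a * math.exp(r * t)

# Hypothetical daily case counts: 20 days of decline, then 20 days of surge.
decline = [20000 * math.exp(-0.03 * t) for t in range(20)]
surge = [decline[-1] * math.exp(0.08 * (t + 1)) for t in range(20)]
series = decline + surge

# Parameters fitted to the first window only, and then refitted to the second.
a1, r1 = fit_exponential(decline)
a2, r2 = fit_exponential(surge)

actual_day39 = series[39]
forecast_day39 = predict(a1, r1, 39)   # extrapolating the old parameters
refit_day39 = predict(a2, r2, 19)      # hindsight fit after resetting them

print(f"actual {actual_day39:.0f}, forecast {forecast_day39:.0f}, refit {refit_day39:.0f}")
```

In this toy setup the refitted curve explains the surge only after the surge has been observed, which is the sense in which repeatedly reset parameters can “fit anything” without having forecast it.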
According to Mr. Agrawal, one of the main reasons for the model not gauging an impending, exponential rise was that a constant indicating contact between people and populations went wrong. “We assumed it can at best go up to pre-lockdown value. However, it went well above that due to new strains of virus,” he said.
Further, the model was ‘calibrated’ incorrectly. The model relied on a serosurvey conducted by the ICMR in May that said 0.73% of India’s population may have been infected at that time. “I have strong reasons to believe now that the results of the first survey were not correct (actual infected population was much lower than reported). This calibration led our model to the conclusion that more than 50% population was immune by January. In addition, there is also the possibility that a good percentage of immune population lost immunity with time,” Mr. Agrawal said.
In the SUTRA approach, the factor by which reported cases differ from actual ones is a parameter in the model that could be estimated from reported data alone (covid19india.org), according to Mr. Agrawal. “I understand it may appear a bit mysterious, but the math shows how. This, in fact, is one of our central contributions,” he told The Hindu. This has been described in a preprint research paper that has been available online since January.
The modelling study, called the “COVID-19 India National Supermodel”, was the result of analysis by an expert committee consisting of mathematicians and epidemiologists, though the research paper explaining how the model worked has three authors: Mr. Agrawal; M. Vidyasagar, a professor of electrical engineering at the Indian Institute of Technology, Hyderabad; and Madhuri Kanitkar, paediatric nephrologist and Deputy Chief, Integrated Defence Staff (Medical) in the Army.
While many groups of epidemiologists, disease experts and mathematicians had developed several kinds of models to predict the outcome of the pandemic, this group was facilitated by the Department of Science and Technology and was the only one among several forecast groups whose numbers were relayed using the government’s publicity channels.
Until February, the model seemed more or less right: the curve was declining and, as of mid-February, 10,000-12,000 new cases were being added daily while the overall numbers were close to 10 million.
In an interview with this newspaper published on February 27, Mr. Agrawal asserted that a “second wave was unlikely” though a slight pick-up, to about 15,000 cases a day, had begun. India’s overall caseload wouldn’t extend beyond mid-March, and only 3,00,000-5,00,000 new confirmed infections were expected over the next 10 weeks, which would bring the overall load to 11.3 or 11.5 million infections by April 2021. This was premised partly on 60% of the population having been exposed to the virus.
On April 2, he told the Press Trust of India that the new cases would “peak” by April 15-20 — in line with the SUTRA team’s public statement.
On April 23, he again reported a new peak at May 11-15 with 3.3-3.5 million total ‘active’ cases and a decline by the end of May. India is currently at about 3.4 million active cases.
Gautam Menon, a modeller and Professor, Ashoka University, Sonepat, Haryana, who also worked on estimating the spread of COVID-19, disagreed with the approach on the grounds that it was “somewhat simplistic and insufficiently informed by epidemiological data and expertise”.
At best, the SUTRA model could be used along with an ‘ensemble’ — where results from various scenarios were grouped. “The use of machine learning to forecast epidemic spread is a relatively recent advance. Some of those models do quite well. But the problems with those methods is that you can’t really figure out what they are doing and how sensitive they are to simply bad data. I would use those models, if we had them, along with an ensemble of other models, but would not repose utter faith in them.”
The SUTRA model’s validity was also undermined by what it left out: the behaviour of the virus; the fact that some people were bigger transmitters of the virus than others (say, a barber or a receptionist more than someone who worked from home); and social or geographic heterogeneity. Nor did it stratify the population by age, so it did not account for contacts between different age groups.
Mr. Agrawal, who now regularly tweets on the evolution of the pandemic in States and districts, responded that new variants showed up in the SUTRA model as an increase in the value of a parameter called ‘beta’ (that estimated contact rate). “As far as the model is concerned, it is observing changes in parameter values. It does not care about what is the reason behind the change. And computing new beta value is good enough for the model to predict the new trajectory well.”
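The role of a contact-rate parameter such as ‘beta’ can be sketched with a textbook compartmental model. The code below is a generic susceptible-infected-removed (SIR) simulation, not the SUTRA equations, and every number in it (population, recovery rate `gamma`, the beta values and the day of the jump) is a hypothetical choice made for illustration. It compares a run where beta stays constant with one where beta jumps partway through, as might happen with a more transmissible variant.

```python
def simulate_sir(beta_schedule, gamma=0.1, population=1_000_000,
                 initial_infected=1000, days=120):
    """Daily-step SIR simulation; beta_schedule(day) gives that day's contact rate."""
    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # currently infected
    r = 0.0                                   # removed (recovered or dead)
    infected = []
    for day in range(days):
        beta = beta_schedule(day)
        new_infections = beta * s * i / population
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        infected.append(i)
    return infected

# Scenario 1 (hypothetical): beta stays below gamma, so infections keep falling.
constant = simulate_sir(lambda day: 0.09)

# Scenario 2 (hypothetical): same start, but beta jumps on day 60.
jump = simulate_sir(lambda day: 0.09 if day < 60 else 0.25)

print(f"day 60 infected: {jump[59]:.0f}")
print(f"peak with constant beta: {max(constant):.0f}")
print(f"peak after beta jump: {max(jump):.0f}")
```

Up to the day of the jump the two runs are identical, so nothing in the earlier data distinguishes them. Once the new beta is known the trajectory can be recomputed, which is consistent with both Mr. Agrawal's description and the critics' point that the change itself was not forecast.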
He conceded that a combination of good epidemiologists, data-centric modelling like SUTRA and time-series models worked best. “Time-series based predictions are good at detecting changes in data patterns. So they can flag, early on, phase changes. SUTRA-type data-centric models can explain the past very well [and in studying what was the effect of policy actions, leading to a better knowledge base for the future]. They are also very good at predicting future trajectory assuming phase does not change.”
In 2002, Mr. Agrawal and two of his students developed a mathematical test, called the AKS primality test, that could efficiently determine whether a big number was prime, which won them global accolades. He used a computer science approach to solve a problem of pure math. “This is the second time I am entering a domain as a complete outsider. First was when I proved primality theorem. Mathematicians all over the world welcomed a computer scientist in their fold, and in fact went out of their way to celebrate it. Our paper was not written in standard math style, however, experts quickly shut down anyone who questioned the presentation or minor errors in the paper. In contrast, I am experiencing a hostile reaction from epidemiologists, at least in India,” he said.