## Margins of error aren't actually hard to understand. If you're having trouble, blame pollsters who are being deliberately opaque.

I reported last week on a television sting operation that appeared to show polling agencies offering to manipulate their results for money, at the request of undercover reporters posing as political consultants. I cannot really offer an opinion on its authenticity – especially having been given access to edited footage only – so I'd suggest instead that you watch it and make up your own minds. The channel has put up all the links here (note: most of it is in Hindi).

But there's one part of it that's of particular significance, and that is 'margins of error'. In the sting, several pollsters appear to be offering to manipulate the margins of error in their surveys, which they tell the undercover reporters will allow them to show more seats in the tally of the political party they are shilling for. One pollster offers to raise the margin of error from 3 to 5%, and another memorably tells the undercover reporter: "What does Uttar Pradesh's public understand of margins of error?"

UP's – and most other states' – publics might not understand much about margins of error at the moment, but that's because of the criminal opacity of most polls (for starters, more than half of the 11 pollsters I tried to contact didn’t have websites), and the complicity of the media, which often presents findings without disclosing the methodology. It really shouldn't be that complicated.

At its essence, the margin of error is a statistical estimate of the uncertainty introduced by the fact that only a small proportion of the whole is being sampled. So if I were surveying a class of 100 students and spoke to all of them, my margin of error would be 0. If I spoke to only a proportion of them, my margin of error would be calculated from a mathematical equation. As the size of the sample rises, the margin of error falls. However, the relationship is not linear: doubling the sample does not halve the margin of error.
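For readers who want to see the equation at work, here is a minimal sketch of the standard textbook formula for a simple random sample. It assumes the worst case of a 50-50 split in opinion and a 95% confidence level (z = 1.96); the function name and parameters are mine, not any pollster's. Real surveys that use clustering and weighting will usually report larger margins than this formula gives.

```python
import math

def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Margin of error (as a fraction) for a simple random sample of size n.

    p is the assumed share giving an answer (0.5 is the worst case) and
    z = 1.96 corresponds to a 95% confidence level. Both are assumptions.
    """
    moe = z * math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: shrinks the margin as the sample
        # approaches the whole population, reaching 0 when n == population.
        moe *= math.sqrt((population - n) / (population - 1))
    return moe

# The class of 100 students: speak to everyone and the margin is zero.
print(margin_of_error(100, population=100))

# Large populations: quadrupling the sample only halves the margin.
for n in (1000, 4000):
    print(n, round(margin_of_error(n) * 100, 1))
```

The last loop is the non-linearity in action: going from 1,000 to 4,000 respondents takes the margin from roughly 3.1% to roughly 1.5%, because the margin shrinks with the square root of the sample size.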

In India, a perfectly representative sample of 1,200 people, whether for a state or for the country, yields a margin of error of about +/- 4%, and a sample of 2,000-3,000 people yields a margin of error of around +/- 3% (at a confidence level of 95%, but ignore that for now), Sanjay Kumar, the well-respected electoral expert and director of the Centre for the Study of Developing Societies (CSDS), told me. This would mean that if 20% of a sample of 2,000-3,000 people say that they are going to vote for the Bahujan Samaj Party, the actual result should be in the range of 17-23%. That’s what a margin of error means.
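The arithmetic behind that range is simple enough to sketch. The helper below is hypothetical and only illustrates the BSP example; it also shows what happens when a pollster stretches the margin from 3 to 5 points, as one offered to do in the sting.

```python
def poll_range(share_pct, margin_pct):
    """Return the (low, high) range implied by a poll share and its margin of error.

    Both arguments are in percentage points; the range is clamped to 0-100.
    """
    low = max(0.0, share_pct - margin_pct)
    high = min(100.0, share_pct + margin_pct)
    return low, high

# The BSP example: 20% support with a 3-point margin.
print(poll_range(20, 3))

# The same 20% with the margin stretched to 5 points.
print(poll_range(20, 5))
```

With a 3-point margin the pollster is claiming 17-23%; with a 5-point margin the same finding covers 15-25%, which leaves far more room to credit a favoured party with extra seats.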

### Fixing for biases

But it’s important to note that we’re talking here of perfectly representative samples. If a pollster has disproportionately more urban people, too few women or too many upper castes in his sample, that introduces its own element of error. The statistician Nate Silver, who founded the blog FiveThirtyEight, writes in his book The Signal and the Noise that his model, which essentially relied on aggregating the electoral forecasts of pollsters, calculated its own margin of error for each pollster based, among other things, on the pollster’s past performance (a working example of Bayesian priors, for the statistically minded among you). So even if a pollster stated his margin of error as 3.4%, if Mr. Silver found that the pollster’s results tended to be biased towards Republicans, he adjusted the margin of error further himself.

From covering pollsters for a few years now, I have come to the conclusion that many are simply faking the 3% margin of error that it has become de rigueur to plant in parentheses at the end of a poll. Mr. Kumar said much the same to me.

When Pew Research, probably the world’s leading pollster, conducted an opinion poll in India over December 2013-January 2014, its margin of error was 3.8%, a number I have literally never seen in an Indian poll. Scott Keeter, director of survey research at Pew, told me that the 3% is a convention that matches a simple random sample of approximately 1,000 in the US, while for telephone surveys the margin of error for a sample of 1,000 tends to be around +/- 3.6%. (The 95% confidence level is also a pretty arbitrary convention, Mr. Keeter said.)

For the faking, there isn’t a clear solution. There is nothing in theory to stop a pollster from making up or hyper-extending the margin of error, Pew’s Mr. Keeter told me. “But this is really the same question as ‘why don’t pollsters just make up their results to suit their own opinion or that of their client?’ The principal constraint on public pollsters is that we all have to maintain our credibility, and if we are proven wrong by the results of the election, it can badly damage that credibility. Similarly, for private or campaign pollsters, they have their livelihood to consider. If their polling is proven to be inaccurate, they will lose their clients,” he said. There are no laws or regulations on polling in the US either, Mr. Keeter said, even though associations of pollsters lay down their own codes of ethics.

Perhaps the best we can do is vote with our feet, on pollsters and not just at the polls.