Research papers are getting harder to read, comprehend

Charles Darwin’s book “On the Origin of Species” was accessible to biologists and even lay people.

One of the most important scientific research publications of all time is Charles Darwin’s “On the Origin of Species”, published in 1859. It was not a paper but a whole book. It was readable and understandable not only by biologists, but by mathematicians, philosophers, historians and the “lay public” as well. Alas, today’s scientific reports are increasingly becoming unreadable and incomprehensible even to the authors’ peers. “The readability of scientific texts is decreasing over time”, writes a group of neuroscientists from the Karolinska Institute, Stockholm, Sweden (Plaven-Sigray et al. eLife 2017; 6:e27725).

The group analysed the language of the abstracts of as many as 7,09,577 papers published over 135 years, between 1880 and 2015. Note that the analysis centred on the abstracts of the research papers rather than on their whole texts. An abstract describes in a nutshell the main message of the paper: what the question is, what methods were used to address it, what results were obtained, and what the salient conclusions are. Thus, an abstract is meant not for specialists in the same field alone, but for non-specialists and interested readers as well. It, in effect, offers the reader the “take home” message. Only specialists and fellow researchers interested in the area of research read the whole paper in all its sections. It is thus vital that the abstract be readable and understandable by all.

How to measure this?

How does one measure and quantify “readability”? Way back in 1948, Dr. R. Flesch published a ‘yardstick’ of readability for English-language texts. It is based on (a) the number of syllables per word and (b) the number of words per sentence. The Flesch Reading Ease (FRE) score is between 90 and 100 for text that a typical 5th-grade schoolchild in the US can read. (The sentence “A cat sat on a mat” has an FRE of about 116, and is easily understood by a primary schoolchild.) The magazine Reader’s Digest has an FRE of 65, understood by high school students and beyond. The “Harvard Business Review” (with its complex and specialised technical language), on the other hand, has an FRE of 30. Thus, the lower the FRE, the harder the text is to read.
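For readers who would like to try the yardstick themselves, here is a minimal sketch in Python. It assumes the commonly quoted form of the Flesch formula, FRE = 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). The function names are hypothetical, and the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup, so scores for real abstracts will only be approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption): count runs of vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent "e" (as in "made") as non-syllabic,
    # but keep endings such as "-le" (as in "table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease from words per sentence and syllables per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


easy = "A cat sat on a mat."
dense = "Readability quantification necessitates sophisticated computational methodologies."
print(round(flesch_reading_ease(easy), 1))
print(flesch_reading_ease(dense) < flesch_reading_ease(easy))
```

The first sentence has six one-syllable words in a single sentence, so the formula gives 206.835 - 6.09 - 84.6, or roughly 116. The second, packed with long words, scores far lower, which is exactly the effect the yardstick is meant to capture.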

Likewise, the FRE of scientific abstracts published way back in 1880 was found to be around 30. But, over the years, it has fallen to as low as 10 today. Worryingly, as many as 1.6 lakh abstracts (close to 20% of all journal articles) have an FRE of zero (0). An M.Sc. graduate may not be able to understand them. This appears to be true not just of specialised journals (with their specialised terms and jargon) but even of “general” journals such as Nature or Science. The mean number of syllables per word has also shot up almost twofold, and the proportion of difficult words (as counted in the New Dale-Chall, or NDC, readability formula) has gone up from 35% to over 50%, particularly during the last 60 years, making the texts increasingly difficult to read.

Why has readability declined? The authors suggest two possibilities. One is that the number of co-authors has gone up with time. Indeed, we seldom see a single-author paper (only perhaps in mathematics?). Many of the co-authors want their say in the text: the classic case of too many cooks spoiling the broth. The other appears to be a general increase in scientific (and linguistic) jargon, and hence a vocabulary that has become a language in itself (they call it “science-ese”; I see a similarity here with “legal-ese”). Interestingly, it is not only scientific jargon; other words such as “novel”, “robust”, “significant”, “distinct”, “underlying”, and “suggestive” are also used increasingly these days.

Some conclusions drawn by the Karolinska group are worth quoting. They write: “Lower readability implies less accessibility, particularly for non-specialists, such as journalists, policy makers and the wider public... scientific credibility can sometimes suffer when reported by journalists... further, amidst concerns that modern societies are becoming less stringent with actual truths, replaced with true-sounding “post-facts”... science should be advancing our most accurate knowledge. One suggestion from the field is to create accessible “lay summaries.” Another proposal is to make scientific communication a necessary part of undergraduate and graduate education.” (This last suggestion is particularly relevant for India, where mastery over English, the lingua franca of today’s science, badly needs to be improved.)

Finally, the authors did a self-analysis of their own paper and found that it has an FRE score of 49, and its abstract a score of 40. I hope my own report here fares higher!

Seminar presentations

It is already difficult to read and comprehend a published paper. One would think listening to it in a seminar might make it easier. Alas, no. These days the speaker uses the modern device called PowerPoint, which makes it worse. Each slide is filled from top to bottom with words and pictures. More often than not, they are ‘copy and paste’ jobs from the paper. Given that the lights are dimmed, each slide is brimful and the speaker drones on and on, the whole thing is soporific. Just as FRE and NDC measure the readability of text, factors such as aspect ratio, font size, number of lines per slide, and colour contrast determine how attractive a PowerPoint presentation is. And just as we want courses and workshops in scientific writing, we need to have classes and workshops on oral presentations using audiovisual aids. If this does not happen, do not blame us if we fall asleep during seminars.

