Peer Review: The Worst Way to Judge Research, Except for All the Others

Aaron Carroll

November 9, 2018

Even before the recent news that a group of researchers managed to get several ridiculous fake studies published in reputable academic journals, people have been aware of problems with peer review.

Throwing out the system, which judges whether research is robust and worth publishing, would do more harm than good. But it makes sense to be aware of peer review’s potential weaknesses.

Reviewers may be overworked and underprepared. Although they’re experts in the subject they are reading about, they get no specific training to do peer review, and are rarely paid for it. With 2.5 million peer-reviewed papers published annually worldwide — and more that are reviewed but never published — it can be hard to find enough people to review all the work.

There is evidence that reviewers are not always consistent. A 1982 paper describes a study in which two researchers selected 12 articles already accepted by highly regarded journals, swapped the real names and academic affiliations for false ones, and resubmitted the identical material to the same journals that had already accepted them in the previous 18 to 32 months. Only 8 percent of editors or reviewers noticed the duplication, and three papers were detected and pulled. Of the nine papers that continued through the review process, eight were turned down, with 89 percent of reviewers recommending rejection.

Peer review may be inhibiting innovation. It takes significant reviewer agreement to have a paper accepted. One potential downside is that important research bucking a trend or overturning accepted wisdom may face challenges surviving peer review. In 2015, a study published in P.N.A.S. tracked more than 1,000 manuscripts submitted to three prestigious medical journals. Of the 808 that were published at some point, the 2 percent that were most frequently cited had been rejected by the journals.

An even bigger issue is that peer review may be biased. Reviewers can usually see the names of the authors and their institutions, and multiple studies have shown that reviewers preferentially accept or reject articles based on a number of demographic factors. In a study published in eLife last year, researchers created a database consisting of more than 9,000 editors, 43,000 reviewers and 126,000 authors whose work led to about 41,000 articles in 142 journals in a number of domains. They found that women made up only 26 percent of editors, 28 percent of reviewers and 37 percent of authors. Analyses showed that this was not because fewer women were available for each role.

A similar study focusing on earth and space science journals found that women made up only about a quarter of first authors and about 20 percent of reviewers. Women had higher acceptance rates than men, though.

In 2012, the journal Nature undertook an internal review of its peer review process, finding balance in its editors and reporters but disparities elsewhere. In 2011, women made up only 14 percent of the more than 5,500 peer reviewers for papers. Only 18 percent of the 34 researchers profiled in 2011-12 were women, and only 19 percent of the articles written for the “Comment and World View” section were by women.

It’s possible women declined opportunities to review, but studies have documented that male editors tend to favor male reviewers. This year, Nature reported that it had increased participation of women in the “Comment and World View” section to 34 percent, while the share of reviewers who were women had climbed only to 16 percent.

Unesco estimates that women make up 29 percent of the worldwide science work force.

But there are also data to support the value of peer review. A 1994 study, published in Annals of Internal Medicine, reviewed the quality of papers submitted to the journal before and after the peer review and editorial system. Researchers used a tool that assessed the manuscript’s quality on 34 items, and their work showed that all but one got better. The biggest improvements were in the discussion of a study’s limitations, its generalizations, its use of confidence intervals and the tone of the conclusions. Probably none of these would have occurred without the nudge of peer review.

Ideas for Improving Peer Review

How then to improve the existing system?

For starters, more formal training might improve quality and speed. Given how hard it is to recruit good reviewers, journal editors could consider better incentives, such as paying reviewers for their time. The unpaid costs of peer review were estimated at 1.9 billion pounds (almost $3.5 billion) in 2008. Or journals could offer, without promise of acceptance, quicker turnaround for a reviewer’s future papers. Academia might offer more formal recognition for review work as well.

A number of journals have moved toward fully blinded reviews, in which reviewers don’t know the authors or institutions of papers they’re judging. This could eliminate some biases. It’s hard to do this, though, because papers often refer to prior work or to where the research occurred. It also doesn’t solve the relative lack of women in the editorial and review process in general.

One way to detect problems with research earlier would be to let researchers post manuscripts online before submission, for public judgment before formal peer review. This is already common in some sciences, such as physics. Medical journals would probably resist this, however, because it could reduce their ability to get press and attention once the research was fully published.

A significant improvement would require a change in attitude. Too often, we think that once a paper gets through peer review, it’s “truth.” We’d do better to accept that everything, even published research, needs to be reconsidered as new evidence comes to light, and subjected to more thorough post-publication review.

As an author of papers, and as a writer who comments on papers in the news media, I’ve seen how the peer review process can fail. But I’m also an editor at the journal JAMA Pediatrics. There, as at many journals, a paper’s first gatekeeper is an editor. Those getting past that hurdle are sent out to a few experts in the field who are asked to read and offer their views to the editor. This informs what might happen next: acceptance, rejection or a chance to respond to reviewer comments before a decision is made.

Each week we meet by teleconference to discuss papers we are considering for publication. We talk about the reviews, and ultimately decide what few studies make the cut. I’m always impressed by the quality of the discussion and the seriousness with which people take their charge. We also follow papers we turn down to see if we made mistakes in deciding to reject. We use that data to review and improve our process. I’m sure other journals do the same. And I’m sure we make our share of bad calls, as other journals do.

Peer review is still better than the alternatives. It might make more sense, though, to see it (and publication) as steps on the road to assurance, not a final stamp of approval.

Aaron Carroll

Aaron E. Carroll, MD, MS is co-Editor-in-Chief of The Incidental Economist and tweets at @aaronecarroll. He is the Chief Health Officer at Indiana University. He is also a Distinguished Professor of Pediatrics and Associate Dean for Research Mentoring at the Indiana University School of Medicine. In addition to contributing regularly to The New York Times and the Atlantic, he is the author of four books, most recently The Bad Food Bible.
