Reproducibility in research

One of the things we try to cling to here at TIE is evidence, the best of which comes from rigorous peer-reviewed research. David Banks has an interesting paper in the journal Statistics, Politics and Policy on the reproducibility of research, motivated by recent high-profile cases of research found to be at a minimum non-reproducible, and at worst fraudulent. Banks (who is a professor at Duke) quotes Roger Peng to frame the issue:

The replication of scientific findings using independent investigators, methods, data, equipment, and protocols has long been, and will continue to be, the standard by which scientific claims are evaluated. However, in many fields of study there are examples of scientific investigations that cannot be fully replicated because of a lack of time or resources. In such a situation, there is a need for a minimum standard that can fill the void between full replication and nothing. One candidate for this minimum standard is “reproducible research”, which requires that data sets and computer code be made available to others for verifying published results and conducting alternative analyses.

The problem is that the scientific ideal runs into scarce resources. A middle ground is to make available the actual data sets used in papers, along with the statistical code used to analyze them, so that other investigators can reproduce the results and/or do new analyses with the same data. This offers no protection against faked data, but it would be a step toward reproducibility. This sounds simple and straightforward, but Banks, who was editor of the Journal of the American Statistical Association, knows that the definition of reproducible will remain hazy:

As a former editor of the Journal of the American Statistical Association, my own sense is that very few applied papers are perfectly reproducible.

Most of the reasons are innocuous. A step in creating a variable is forgotten, or in the editing process the text of a paper is shortened in a way that makes variable construction ambiguous. Or a very detailed description of how an explanatory variable was coded may be provided, but the detail that the few cases with missing or nonsensical values were dropped from the analysis is omitted. It is typically not that important to the cocktail party story of a paper… unless, of course, it is. Banks notes that only a handful of the most important studies can be deeply checked by an emerging field of ‘forensic statistics’ and says that in the current system co-authors are the best means of maintaining strong, rigorous data safeguards. Banks provides three suggestions:

  • increase the viability of publishing null results in journals; he says that the inability to publish null results tacitly enables non-reproducible studies, as authors may be tempted to ‘torture the data until it confesses’
  • expand funding for forensic statistics and/or reproducibility studies; the argument goes that if a study was worth funding when it was speculative, then once it is done, don’t you want to be certain it is correct?
  • develop a continuous measure of reproducibility, to give a sense of how careful researchers have been in preparing their data and communicating their methods in published papers

For people who care about the evidence, this is a big issue. Getting the incentives straight seems likely to be the key. (h/t Bill Gardner via Twitter @Bill_Gardner)
