...In medicine, the quest to distinguish between distinct conditions with similar symptoms is called 'differential diagnosis'. By analogy, three questions can guide a diagnosis of apparent irreproducibility: (1) Was the replication attempt backed by the requisite expertise? (2) Does a systematic comparison reveal a basis for discordant results? (3) Could the original result be wrong?
To illustrate the first question, consider the following example: extensive evidence exists that, in his prime, Tiger Woods could consistently hit a golf ball more than 260 metres straight down the fairway using his driver (the largest club in the bag). I too play golf; I have achieved credible results with my own driver (well, some of the time) and I am roughly the same height and weight as Tiger. I would very much like to be able to reproduce his results with my own hands.
Over the years, the golf industry has made untold sums of money from many golfers (myself included), who aspire to reproduce Tiger Woods' results. However, only a small percentage of us can pull this off (and I am not one of them). Does this mean that Tiger Woods' results were 'wrong', or that the remarkable physics he seemingly exemplified does not truly stand up to independent scrutiny? Of course not; it means that reproducing such results consistently requires a level of mastery that the typical golfer does not possess.
I do not believe that experimental skill is as elusive as that necessary to win the green jacket. I just want to underscore that we cannot assume that any given scientist — even a very good one — is properly equipped to reproduce an experiment if it involves new reagents, systems or biological context. Before even attempting to reproduce an 'index' experiment, a lab that lacks the needed experience often sends a trainee to another lab to work with a scientist who regularly performs the experiment. This is so, even if both labs are recognized experts.
If the question of reproducibility persists once the requisite expertise is established, the answers often reside in subtle differences in methods, cell-line properties or reagents that become apparent only after scrutiny. Two acclaimed cancer researchers required more than a year to harmonize techniques to get similar measures for experiments; success depended on cross-country visits and on reconciling minor differences in how cells were prepared.
If these steps do not work, we must consider whether the original result was really correct. And we must be prepared to accept the brutal facts, make the required corrections and move on. Great scientists are always willing to embrace the truth with humility and grace — even when it hurts....
I find this passage odd, mostly because I'm an organic chemist, and the ten or twenty variables that we can control for are measurable with the myriad tools that we have to characterize physical and/or chemical properties. Biologists have many more variables, so that's probably where the difference lies. (I wonder if this means there is room to understand and communicate biological experimental technique better in biomedical science?)
An incomplete thought, but one that I ponder a lot, especially as the Reproducibility Wars rage on...