Saturday, May 24, 2014

All hypotheses in biology are true

Scarcely a moment goes by in my scientific life without hearing the words "hypothesis-driven science". Now that scientists have been driven mad by the poor funding situation and the inane NIH guidelines for hypotheses in specific aims, you hear it everywhere you go. I think the notion that it's important to have a hypothesis has somehow replaced the more fundamental premise that you should think carefully about what you are doing and why. These two views are not equivalent: it is possible to have a stupid hypothesis, and it is possible to have well-thought-out discovery-based experiments. Now that data is easy to generate, there is more room for the latter, and I believe the excessive focus on hypotheses is at least partly why most NIH grants have become so boring and conservative. Yes, part of biomedical scientific training is learning to develop an idea and test it experimentally, but the strict hypothesis-based approach often promulgated in grants, thesis committees and the like is, in my opinion, a fairly narrow and inaccurate description of how real science moves forward.

Whatever, lots of people have written about hypothesis-driven vs. discovery-based science, and to sum up my thoughts, I think it would be far more useful to spend our time discussing good science vs. bad science. Here, I want to point out another issue with hypothesis-driven science: now that our measurement tools are better, every hypothesis is true. Yes, that's an exaggeration, but let me explain. I feel like when you study cells, everything you do affects everything else. If I knock down expression of gene A, expression of any randomly chosen gene B is pretty likely to change. Perhaps not by much, but now that we can measure things so well, you can quantify that change, and with enough measurements it will be statistically significant. So let's say you have data saying Protein X binds to Protein Y, and because of this you somehow formulate the hypothesis that expression of Gene A affects expression of Gene B. If you do enough RT-PCR or RNA FISH or whatever, you will find an effect. So your hypothesis is true. In the old days, if you had such a hypothesis that led to a small effect size, you probably would not have detected it, and so you would have failed to reject the null hypothesis and concluded there was nothing there. But nowadays, with RNA-seq and RNA FISH and so forth, all these small effects are detectable. I think a strict hypothesis-driven approach is ill-equipped to deal with this issue, whereas a discovery-based approach can actually be powerful. You can say: well, Protein X binds to Protein Y, and so I expect that expression of Gene A affects something. Then use RNA-seq to find out what that something is! This approach has its problems and limitations as well, but it's not inherently wrong.
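
To make the "measure enough and the effect becomes significant" point concrete, here is a minimal simulation sketch. It is not real data: the effect size (a shift of 0.05 standard deviations in a hypothetical Gene B after knocking down Gene A), the Gaussian noise model, and the function names are all assumptions chosen for illustration. The test is a simple Welch-style z-test using the normal approximation, which is crude for the smallest sample sizes.

```python
import math
import random
import statistics


def two_sample_p_value(a, b):
    """Two-sided p-value for a difference in means (Welch-style z-test,
    normal approximation; only a rough guide when samples are small)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.fmean(b) - statistics.fmean(a)) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n, effect=0.05, seed=0):
    """Simulate n control and n knockdown measurements of a hypothetical Gene B.

    `effect` is the assumed true shift, in units of the measurement's
    standard deviation. Returns (observed shift, p-value).
    """
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    knockdown = [rng.gauss(effect, 1.0) for _ in range(n)]
    diff = statistics.fmean(knockdown) - statistics.fmean(control)
    return diff, two_sample_p_value(control, knockdown)


if __name__ == "__main__":
    # Same tiny true effect every time; only the number of measurements changes.
    for n in (10, 100, 1_000, 10_000, 100_000):
        diff, p = simulate(n)
        print(f"n={n:>7}  observed shift={diff:+.3f}  p={p:.3g}")
```

The true effect is identical in every row; only the number of measurements grows. With a handful of measurements the shift is typically buried in noise, while with very many the p-value becomes tiny even though the effect is no larger and no more biologically meaningful than before.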

I think this "every hypothesis is true" effect is partly why there are more and more irreproducible (or perhaps more accurately, inconsequential) papers out there, with lots of hypotheses that are technically true but whose biological meaning is unclear. It would be more useful to think about how and why different data fit together. This is hard work, harder, I believe, than just formulating hypotheses. It requires careful reasoning, not just about controls for simple tests, but about alternative interpretations of the data as a whole, and also some amount of inspiration as to why studying this is meaningful in the first place. To me, that is what makes something good science, not just the fact that you managed to write down a hypothesis.
