An Index of the “Disruptiveness” of Research: Are We Using Numbers to Study a Phenomenon, Or Are We Just Studying Numbers?
A colleague brought to my attention a paper recently published in the journal Nature titled “Papers and patents are becoming less disruptive over time,” which apparently has generated some buzz. I have been puzzling over it for a little over a week and have decided I shouldn’t spend more time on it.
The authors introduce their topic by stating that concerns about a decline in innovative discoveries have been expressed by writers from many sectors of science. Although various explanations have been proposed for this decrease, little research has been done to explain, let alone quantify, it. Their purpose was to use a “new” index created by the last author to fill this knowledge gap.
They measure innovation by using a concept called “disruptiveness,” which means that the findings of the paper “disrupt existing knowledge, rendering it obsolete, and propelling science and technology in new directions.” The new index, called CD for “consolidation” or “disruptiveness,” is based on the idea that if a paper is disruptive (innovative), then authors of future papers who cite the disruptive paper will not also cite the references listed by the disruptive paper, because the information from those references is no longer relevant. Consolidation, by contrast, refers to a paper that confirms or extends existing knowledge, thus keeping scientific research on the same track. The authors developed a formula to capture this tendency toward disruptiveness or consolidation. The maximum CD index value is 1 (maximum disruptiveness) and the minimum is -1 (maximum consolidation). They have a nice Figure 1 which explains the CD index.
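To make the construction concrete, here is a minimal Python sketch of that logic as I read it from Figure 1; the function name and inputs are mine, not the authors’, so treat it as an illustration rather than their actual formula.

```python
def cd_index(focal_refs, forward_citers):
    """Toy sketch of the CD-index logic (my illustration, not the authors' code).

    focal_refs     -- set of IDs of the works the focal paper cites
    forward_citers -- one tuple (cites_focal, refs_cited) per later paper that
                      cites the focal paper and/or its references, where
                      refs_cited is that later paper's own reference set

    Returns a value in [-1, 1]: +1 when every later paper cites the focal work
    but none of its references (disruptive); -1 when every later paper cites
    both (consolidating).
    """
    if not forward_citers:
        return 0.0
    score = 0
    for cites_focal, refs_cited in forward_citers:
        f = 1 if cites_focal else 0                 # cites the focal paper
        b = 1 if (focal_refs & refs_cited) else 0   # cites the focal paper's references
        score += -2 * f * b + f                     # +1 disruptive, -1 consolidating, 0 otherwise
    return score / len(forward_citers)


# Three later papers cite the focal work but ignore its references, one cites
# both: CD = (1 + 1 + 1 - 1) / 4 = 0.5
print(cd_index({"r1", "r2"}, [(True, {"x"}), (True, set()), (True, {"y"}), (True, {"r1"})]))
```

Restricting forward_citers to papers published within five years of the focal paper gives the CD5 variant described next.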
The authors apply the CD index to papers listed in the Web of Science (WoS), a global citation index. Every article in the WoS (and there are many millions) is linked to its citations, so we essentially have a complete record of all the citation-based links among the papers in the collection. The WoS also keeps track of all authors for each paper, and the science papers and citations go back to the year 1900. After calculating the CD index for each paper based on the five years after it was published (called CD5), they plot the average CD5 index over time, starting with 1945. You can choose any time window for the CD index, but the authors verified that, for their data, the choice of window does not meaningfully change the results.
The main findings can be seen in Figure 2. The CD5 index declines from 1945 through 2010 for all of the major fields of science they studied. (The results also apply to patents, which they studied using a patent database with records from 1980 onward.) To the authors’ credit, they explore various alternative explanations for this decline in disruptiveness, and the alternatives don’t seem to account for what they observe. These explanations include: 1) a decline in the disruptiveness of “high quality research” (discussed further below); 2) differences between the Web of Science and other citation databases that could have been used; 3) changes in publication or citation practices. It does appear they controlled for the volume of publications by year, which would be important.
Here are my thoughts in an order that is not particularly coherent. First, the authors are social scientists and faculty at the Carlson School of Business at the University of Minnesota. From what I have read over the years, social scientists have a different orientation than that of scientists in biological and physical fields. That is, social scientists see research as “knowledge creation through social processes or shaped by its social construct.” (See my May 2021 post, “Epistemology. You keep using that word …”) My orientation aligns with the idea that research helps to discover knowledge through careful thoughts, careful observations, and verification. In other words, there is an objective reality and it is “discovered” through rigorous scientific research. So that sets up a difficulty for me in accepting (or following) their work.
The authors are faculty at a school of business. In business, (positive) disruption is the key to success. A business that is the first in a new market, that provides customers with a product they didn’t even know they wanted, thrives and profits. My question is, if you accept the idea that scientific knowledge (i.e., understanding of the natural world) accumulates using the “careful thoughts, careful observations” orientation, would we expect the same rate of disruption over time as the knowledge base grows? If we use the authors’ orientation and are regularly saying, “oh my god, everything I thought before about ‘X’ is wrong,” is there any point to scientific research? It seems if science and technology were continually being “pushed in new directions” we would end up going in circles. The crude example I thought of is infectious disease. Hundreds of years ago, if someone got sick, people would say, “it’s an imbalance in the four humors,” “he was cursed by witches,” “it’s bad air,” “God is punishing him,” or other explanations. Because of scientific research, we have realized that infectious disease is caused by pathogens like bacteria and viruses. We’ve studied them and verified them, and we don’t have to go back to the well for new ideas. Our field of research is narrowed. (But there is certainly plenty more to study about them.)
But let’s assume knowledge is created through “intuition” and social processes and look at the paper. In the Introduction the authors make assertions which they support with references to non-peer-reviewed literature. For example, “The Rise and Fall of American Growth” (Gordon JR), “Innovation and Its Discontents: How Our Broken Patent System is Endangering Innovation” (Jaffe AB), and “The End of Science: Facing the Limits of Knowledge in the Twilight of the Scientific Age” (Horgan J) are all books sold on Amazon. The negative reviews of Horgan’s book are especially entertaining. Horgan is a science journalist who “has a window on contemporary science unsurpassed in all the world.” (Did Horgan write that?) Apparently he has a chapter at the end describing how he was lying on someone’s lawn and had a “mystical” experience where he perceived the mind of God and understood His fear of death. From what I could tell on Amazon, the books are essentially policy positions in which the authors select the evidence they feel fits their position. There are also a few unsupported assertions in the paper whose validity I question. For example, “The gap between the year of discovery and the awarding of a Nobel Prize has also increased (23,24), suggesting that today’s contributions do not measure up to the past.” I don’t question the first part of the sentence, but is the second part necessarily so? Of course, in business, taking a long time to develop a disruptive strategy would be undesirable.
Methods. One of my red flags when evaluating articles is, “the paper confuses you.” As loquacious as these authors are, I found myself unsure of either what they did or why they did certain things. For example, they gave a rationale for starting their timeline for the main analyses at 1945 (“the scale and social organization of science shifted markedly in the post-war era”). Okay. Then, when they used alternative datasets to see if it would change their results, they started the timeline at 1930 (Extended Data Figure 6). Why? Further, when they look at “high quality” literature only (i.e., Nobel prize-winning papers and papers published in the prominent journals Nature, Proceedings of the National Academy of Sciences and Science), they start the timeline at 1900 (Figure 5). The Web of Science starts at 1900, so is the very high CD5 index at 1900 an artifact of there being no papers before 1900 to include in a paper’s references? Or is it due to the Web of Science data not being “as reliable” prior to 1945, as stated in the legend under Figure 5? Furthermore, how did they decide on their three journals (the inset plot)? Since one of the scientific fields was medicine, shouldn’t they have included the New England Journal of Medicine (NEJM)? It started publication in 1812. NEJM’s impact factor in 2010 was 53.484, compared to Nature’s, which was 36.101 (the analysis ended in 2010). Was it because the authors submitted their paper to Nature…?
Speaking of trends in the disruptiveness (CD5 index) of “high quality research,” the authors start with an assumption that “Declining rates of disruptive activity are unlikely to be caused by the diminishing quality of science and technology.” They support this assumption by referencing the Jaffe book and one of my least favorite authors, John Ioannidis. I did a search in Ioannidis’ paper (“Why most published research findings are false”) for the terms “innovation,” “innovate,” “disrupt,” “rigorous” and “quality.” I found no comments about the relationship between the quality of research and its level of innovation or “disruptiveness” (the only “hit” was in relation to high-quality randomized trials in meta-analyses). So it is hard to determine whether Ioannidis meant to weigh in on this. (Another red flag of mine is, “the authors support an assertion with a reference that doesn’t support the assertion.”)
Based on this assumption, the authors theorize that high quality research should show no trend, or at least a diminished decline, in its CD5 index if the overall decline were due to “diminishing quality of science and technology.” This theory seems to depend on “high quality research” being predominantly disruptive. However, the authors acknowledge that some Nobel prize-winning research is highly consolidating (extreme negative values of CD5). So the distribution of CD5 values for high quality research may actually be what statisticians call “heavy-tailed,” i.e., a higher proportion of its values occur at the extremes than is the case for “not high quality” research papers.
The whole line of thinking around what should be happening with “high quality research” and how it would explain trends in the CD5 index seems muddled. They focus on a very small subset of all the published research, and assert that if we see the same trend with that small subset then it sheds light on the overall trend among all the research papers. Even if their theory is correct, it doesn’t really address their primary question. If “not high quality” research is less disruptive, and “not high quality” research becomes a larger and larger proportion of all published research, then the declining trend in the overall CD5 index can be explained at least in part by “the diminishing quality of science and technology.” In addition, if you look at Figure 5 starting at the year 1945, the trend in the CD5 index among those Nobel prize-winning papers does appear to be less dramatic, although it’s hard to say because of the different ranges of the Y-axes in Figures 2 and 5. (By starting the trendline in 1900, the authors make it look like a dramatic fall from a starting value of about 0.8.) In Figure 2, three of the four trend lines start above 0.35 and fall to close to zero. In Figure 5, if we eyeball where the curve starts at 1945, it falls from just above 0.2 to close to zero. Ideally the authors would statistically test whether the slopes of the lines I just described differ, in order to support or refute their assertion, as sketched below. Regardless, at least from what is presented, the comparison of patterns in Figures 2 and 5 appears to support the idea that the disruptiveness of “high quality research” has been declining at a slower rate.
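Such a test is not hard to set up: fit a single regression to both trend lines and examine the year-by-group interaction term, which estimates the difference in slopes. The sketch below uses made-up yearly averages purely to show the mechanics; none of the numbers come from the paper.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
years = np.arange(1945, 2011)

# Hypothetical yearly average CD5 values (illustrative only): a steeper
# decline for all papers, a shallower one for the "high quality" subset.
frames = []
for group, start, slope in [("all", 0.38, -0.005), ("high_quality", 0.22, -0.003)]:
    frames.append(pd.DataFrame({
        "year": years,
        "cd5": start + slope * (years - 1945) + rng.normal(0, 0.02, len(years)),
        "group": group,
    }))
data = pd.concat(frames, ignore_index=True)

# The year:C(group) interaction coefficient is the difference in slopes; its
# p-value tests whether the two declines differ.
fit = smf.ols("cd5 ~ year * C(group)", data=data).fit()
print(fit.summary())
```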
Regarding verifying their results by conducting analyses on alternative samples, I think there is a lot of overlap between their alternative datasets (JSTOR, the American Physical Society corpus, Microsoft Academic Graph and PubMed) and the Web of Science. If there is a lot of overlap, then it is no surprise that you get the same results when you use many of the same data points.
Remember to pay attention to the range of values on the Y-axis. Trends can be made to look more dramatic when the Y-axis range shrinks, or less dramatic by expanding the Y-axis range. In particular, Figure 6 depicts a “decline in the diversity of work cited.” Detour: how is the diversity of work cited measured? The authors use a measure called “normalized entropy.” This is an information-theoretic statistic (Shannon entropy) that indicates the magnitude of uncertainty, or spread, of a variable, based on its underlying probability distribution. It is “normalized” because it is converted to a scale of 0 to 1, with 1 meaning maximal diversity as represented by “distribution of citations to a wider range of existing work.” So they included this derived summary statistic as a variable in their regression models. Is this conceptually sound? Of course, you can insert any numeric variable into a regression model. Back to the Y-axis: the authors provide Figure 6 and show on the graph that the trend in the diversity of the work being cited is decreasing over time. Egads! Except that the range on the Y-axis is 0.94 to 0.99, and the lowest point reached by any of the lines is about 0.96. So the dramatic drop is actually unremarkable considering that the changes occur in a very small range on the scale.
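For readers who want the mechanics, normalized entropy is simple to compute: the Shannon entropy of the citation distribution divided by its maximum possible value. The sketch below is my illustration, not the authors’ code, and using the journal of each cited work as the category is my assumption.

```python
import math
from collections import Counter

def normalized_entropy(cited_categories):
    """Normalized Shannon entropy of a paper's reference list.

    cited_categories -- one label per cited work (e.g., the journal it
                        appeared in; the choice of category is an assumption)

    Returns a value in [0, 1]: 1 when citations are spread evenly across
    categories (maximal diversity), 0 when they all fall in one category.
    """
    counts = Counter(cited_categories)
    if len(counts) <= 1:
        return 0.0
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))


print(normalized_entropy(["Nature", "Cell", "PNAS", "Science"]))   # 1.0, maximally diverse
print(normalized_entropy(["Nature"] * 9 + ["Cell"]))               # ~0.47, concentrated
```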
One might also ask, “what is a meaningful difference in the CD index?” As many of us know, a “statistically significant difference” can be shown, yet the magnitude of that difference may be trivial (this can happen when you have a large number of observations). I’m not sure a meaningful difference has been established. (Although, given their scale, perhaps it would be a change of at least 0.2 or 0.3.) Furthermore, is “disruption” of a continuous or binary nature? Recall that their definition of disruptiveness is “disrupt(ing) existing knowledge, rendering it obsolete, and propelling science and technology in new directions.” Can something be “slightly” obsolete? Can you change the direction of science and technology by 10 degrees, 30 degrees, 180 degrees…?
I am used to the biomedical literature, where results are reported in the Results section rather than the Methods section. But some of the more interesting results were in the Methods section. Extended Data Figure 5 shows the relative contributions of authors, year and field to the variability in the CD5 index. By far the most important influence on a paper’s CD5 is the author (roughly 80% of the variation in the CD5 index that they could explain). So basically all this talk about how innovation and disruptiveness are declining over time is not really about time; it’s the author that matters. What does that mean? I wish I knew more about how they did the regression. There are more authors than papers, but the CD5 index is based on the paper, so did they have repeated measures that they (hopefully) accounted for?
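If they did not, one standard way to handle it is a mixed-effects model with a random intercept per author, so that papers by the same author are not treated as independent observations. The sketch below runs on simulated data; the variable names and numbers are mine, and this is not a claim about what the authors actually fit.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for paper-level records: each row is one paper with a
# CD5 value, a publication year, and an author who appears on several papers.
rng = np.random.default_rng(0)
rows = []
for a in range(50):                              # 50 authors
    author_shift = rng.normal(0, 0.1)            # author-level tendency
    for _ in range(10):                          # 10 papers each
        year = int(rng.integers(1945, 2011))
        cd5 = 0.3 - 0.004 * (year - 1945) + author_shift + rng.normal(0, 0.05)
        rows.append({"cd5": cd5, "year": year, "author_id": f"a{a}"})
papers = pd.DataFrame(rows)

# Random intercept per author accounts for the repeated measures when
# estimating the time trend in CD5.
model = smf.mixedlm("cd5 ~ year", data=papers, groups=papers["author_id"])
print(model.fit().summary())
```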
Discussion. I was amused by the fact that the only limitation they could think of is that the CD index is a relatively new metric (even though they state it has been “validated extensively” based on two studies) and so more work is needed to explore its properties. I was especially amused by the fact that they interpreted their results to mean that professors should be more often awarded paid year-long sabbaticals, when their findings equally suggest that if we want more innovation then we need World War 3.
In spite of my criticisms, I think some of the authors’ ideas are reasonable. For example, the likelihood of innovative research findings may be increased through the cross-pollination of ideas from different fields. My lack of comfort is with their approach to investigating the questions they posed. Actually, I think the authors have executed a master stroke by studying things which can neither be proven nor disproven.
The null hypothesis is that I just don’t understand the paper, which is certainly possible.
As always, an outstanding review and commentary! A very interesting way the authors approached this paper, and your comments are important if we are to learn anything from science.
Thank you Carl! Thanks for reading the article review.