CATEGORY: Climate

Largest study of peer-reviewed literature to date finds overwhelming climate disruption consensus (UPDATED)

Public perception of the consensus among scientists on the human-driven nature of climate disruption vs. the measured consensus by Cook et al 2013

A new peer-reviewed study has once again confirmed that there is an overwhelming consensus on the human-driven cause of climate disruption. The study, Quantifying the consensus on anthropogenic global warming in the scientific literature by John Cook and a large number of contributors to the website Skeptical Science (Cook et al 2013), looked at 11,944 papers over a 21-year period and assigned each to one of three categories on the basis of the paper’s abstract: endorse, reject, or take no position on the consensus. Of the papers that either endorsed or rejected the consensus, 97.1% of the papers and 98.4% of the papers’ authors endorsed the consensus. In addition, authors of the analyzed papers were contacted and asked to self-rate their own papers for level of endorsement, and 1,200 responded. Of the self-rated papers that either endorsed or rejected the consensus, 97.2% of the papers and 96.4% of the authors endorsed the consensus.

Cook et al 2013 represents the largest study to date of the consensus among the scientific community regarding the industrial nature of climate disruption (where human activity, primarily the burning of fossil fuels, is the dominant cause of the observed global warming). Prior studies such as Doran and Zimmerman 2009 and Anderegg et al 2010 had found that approximately 97% of climate experts and “super-experts” agreed that climate disruption was caused by human activity. However, some critics had attacked the studies for small sample sizes (Doran and Zimmerman 2009) or for using Google Scholar (Anderegg et al 2010) instead of the “official” scientific database, the ISI Web of Science. Cook et al 2013 addresses both criticisms by using a large sample of 11,944 papers from 1980 different journals and by using only peer-reviewed papers identified in the ISI Web of Science.

Cook et al 2013 Figure 1b – Percentage of endorsement, rejection, and no position/undecided abstracts. Uncertain abstracts comprise 0.5% of the no position category.

Figure 1b from Cook et al 2013 shows how the percentages of abstracts rated as “no position,” “endorse,” and “reject” changed during the study period of 1991 to 2012. Note that the percentage of abstracts rejecting the consensus stayed flat at nearly 0% over the entire period, while the percentage of papers endorsing declined slightly and the percentage of papers expressing no position increased. Overall, 32.6% of the abstracts endorsed the consensus, 66.4% took no position, 0.7% rejected the consensus, and 0.3% were uncertain.
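As a sanity check, the headline consensus figure can be recomputed from these category shares. Here is a minimal sketch in Python; the shares are the rounded percentages quoted above, and treating “uncertain” abstracts as position-taking follows the paper’s totals, so the result lands near, rather than exactly at, 97.1%:

```python
# Consensus share among position-taking abstracts, using the rounded
# category percentages quoted above.
endorse, no_position, reject, uncertain = 32.6, 66.4, 0.7, 0.3

# "Uncertain" abstracts took a position without endorsing or rejecting,
# so they count toward the denominator along with endorse and reject.
position_taking = endorse + reject + uncertain
consensus_share = 100 * endorse / position_taking

print(f"{consensus_share:.1f}% of position-taking abstracts endorse")  # ~97.0%
```

The small gap from the paper’s 97.1% comes entirely from rounding the category shares to one decimal place.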

Cook et al 2013 explains why this result is expected. When a controversial subject has been settled and is no longer controversial, scientists move on to other subjects and no longer feel the need to explicitly endorse the consensus position. For example, scientists no longer argue about the general accuracy of the law of gravity, so there’s no point in restating why they think gravitation applies except in unusual cases. Add to that the fact that abstracts are usually strictly limited in length, and spending a few extra words to explicitly endorse the scientific consensus on climate disruption becomes a luxury most abstracts can’t afford.

Cook et al 2013 Figure 2b – Percentage of self-rated endorsement, rejection, and no position papers.

In addition, Cook et al 2013 contacted 8,547 authors of the papers and asked them to self-rate their own papers. 1,200 authors responded, and Figure 2b from Cook et al 2013 shows how they rated their papers as endorsing, rejecting, or having no position on the consensus. Overall, 62.7% of the papers endorsed the consensus, 35.5% took no position, and 1.8% rejected the consensus.

The authors who responded to the request to self-rate their papers provide additional clarity to the abstract-only ratings performed by Cook et al 2013. First, the authors made their ratings based on the entire paper, not just the abstract, and so they are better positioned to say whether or not their paper endorses the consensus. Second, the self-ratings also provide a way to measure how much effect rating only the abstract has on the results, and the impact is significant. Cook et al 2013 compared the self-rated papers directly with the abstract-rated papers and found that the share of endorsing papers increased from 36.9% in the abstract-only ratings to 62.7% in the author self-ratings (see Cook et al 2013 Table 5 for more information).

And third, the self-rated papers provide some evidence that the large number of papers categorized as “no position” are categorized that way because the consensus position is no longer controversial. If the position that human activity is the dominant driver of climate disruption were still controversial among scientists, it would be more likely to be stated explicitly in abstracts.

There are a few main areas of uncertainty in Cook et al 2013. The first is the aforementioned issue with short abstracts, but as mentioned above, the self-rating process minimizes this concern. The second is that a “crowdsourcing” methodology with predefined categories is still ultimately subjective and could be influenced by the biases of the reviewers. However, this effect was minimized by using multiple reviewers per abstract and by the self-rating scheme. A possible bias toward the consensus position is ruled out by the fact that self-rated papers were more likely, not less likely, to endorse the consensus, while a possible bias by the abstract reviewers toward the “no position” category was analyzed and found to have minimal effect on the final results.

The third and final uncertainty is whether or not the papers selected are representative of the overall sample. The large sample size (11,944 papers) is suggestive of representativeness (the larger the sample, the more likely it is to be representative), but doesn’t guarantee it. As Cook et al 2013 points out, there are nearly 130,000 papers with the keyword “climate” in the ISI Web of Science.

However, the highly skewed results of Cook et al 2013 strongly suggest that the results are broadly applicable. The more skewed the results are, the smaller the sample size needs to be in order to accurately deduce the opinions of a population. As I demonstrated in this response to Joe Bast, President of The Heartland Institute, the results of Doran & Zimmerman 2009 had a margin of error of only 3.5% (for a hypothetical sample size of 100,000 scientists). Alternatively, Doran & Zimmerman 2009 could have statistically deduced a 97% consensus using only 39 respondents, not the 79 they actually had.

The results of Cook et al 2013 are even stronger because the sample size is so much larger: 98.4% of the authors of the 4,014 papers that either endorsed or rejected the consensus endorsed it. That’s 10,188 authors vs. 168. If we assume that there are 100,000 authors publishing on climate disruption topics globally, then the results of Cook et al 2013 have, at a confidence level of 99.9%, a margin of error of +/- 0.48%. Increasing the assumed number of climate authors to 1 million raises the margin of error at the 99.9% confidence level to +/- 0.51%.
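For readers who want to check this kind of calculation, here is a minimal sketch of the standard margin-of-error formula for a sample proportion with a finite-population correction. The inputs (z ≈ 3.29 for 99.9% two-sided confidence, a hypothetical population of 100,000 authors) are illustrative assumptions; the exact inputs behind the post’s ±0.48% figure aren’t specified, so this sketch won’t reproduce that number exactly:

```python
import math

def margin_of_error(p, n, population, z):
    """Margin of error for a sample proportion p with sample size n,
    drawn from a finite population, at the confidence level implied by z."""
    standard_error = math.sqrt(p * (1 - p) / n)
    # Finite-population correction: shrinks the error when the sample
    # is a non-trivial fraction of the whole population.
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc

# Illustrative inputs: 98.4% endorsement among 10,356 authors, a
# hypothetical population of 100,000 publishing climate authors,
# z = 3.29 for ~99.9% confidence (assumptions, not the paper's own inputs).
moe = margin_of_error(p=0.984, n=10_356, population=100_000, z=3.29)
print(f"margin of error: +/- {100 * moe:.2f} percentage points")  # ~0.38
```

The key qualitative point survives any reasonable choice of inputs: with a sample this large and results this skewed, the margin of error is a small fraction of a percentage point.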

Every serious survey of the expert opinion of climate scientists regarding the causes of climate disruption has found the same thing: an overwhelming majority of climate scientists agree that climate disruption is dominated by human causes. Cook et al 2013 won’t be the final word on the subject by any means, but if “it’s not over until the fat lady sings,” we can fairly say that Cook et al 2013 indicates that she’s started to inhale.

UPDATE

I’ve been thinking about this paper a bit more and I have a few more thoughts about it that I didn’t include above.

First, in the discussion of sources of uncertainty in the analysis, Cook et al 2013 discusses the representativeness of the sample. But something I can’t find discussed in the paper or the Supplementary Information is the representativeness of the paper authors who responded to requests to self-rate their own papers. Generally speaking, the people who respond to polls are those most energized by the questions being asked, so we could reasonably expect that the scientists who responded would be the most likely to either endorse or reject the consensus. But it’s a relatively minor point.

Second, I feel that there was insufficient explanation of the 66.4% of abstracts that were rated “no position.” I would have preferred a few more sentences explaining why scientists don’t explicitly endorse or reject a consensus position, or some attempt on the part of the authors to estimate the degree of consensus among the “no position” abstracts. For example, an analysis could have cross-referenced authors of the “endorsing” abstracts with co-authors of the “no position” abstracts and in the process developed a subcategory of “endorsement via co-authorship.” Or a bit more time could have been spent on the Shwed and Bearman 2010 study, which Cook et al 2013 references but doesn’t explain in much detail.

Shwed and Bearman 2010 looked at five historical (20th century) cases, including industrial climate disruption, in which a scientific consensus developed, and analyzed citation networks among peer-reviewed studies over time. What they found was that, as a consensus developed, more and more papers cited a common core of studies that formed the nucleus of the consensus. In addition, Shwed and Bearman 2010 found that consensus leads to a dramatic increase in the number of publications even as the number of references to the seminal studies remains constant. They describe the rationale as follows:

If consensus was obtained with fragile evidence, it will likely dissolve with growing interest…. If consensus holds, it opens secondary questions for scrutiny.

Essentially, once a consensus on the “big questions” is reached, scientists are free to dive into the details and argue over those instead.

The Shwed and Bearman 2010 analysis found that industrial climate disruption hit this consensus point sometime around 1991, by the way.
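The core-citation pattern Shwed and Bearman describe can be illustrated with a toy sketch. Every paper name and citation edge below is invented purely for illustration; the idea is simply that a consensus “core” shows up as a few papers collecting a disproportionate share of citations:

```python
from collections import Counter

# Hypothetical citation edges (citing paper -> cited paper); all names invented.
citations = [
    ("p1", "core_a"), ("p2", "core_a"), ("p2", "core_b"),
    ("p3", "core_a"), ("p3", "core_b"), ("p4", "core_b"),
    ("p4", "p2"), ("p5", "core_a"), ("p5", "p3"),
]

# In-degree = how often each paper is cited. A consensus "core" appears
# as a handful of papers with far higher in-degree than the rest.
in_degree = Counter(cited for _, cited in citations)
core = [paper for paper, count in in_degree.most_common(2)]
print(core)  # the two most-cited papers: ['core_a', 'core_b']
```

The actual Shwed and Bearman analysis is far more sophisticated (they track the modular structure of the whole network over time), but the intuition is the same: consensus shows up as convergence of citations onto a common nucleus.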

There is a lot of work that could be done still with the Cook et al 2013 dataset. I look forward to reading more about it.


29 comments on “Largest study of peer-reviewed literature to date finds overwhelming climate disruption consensus (UPDATED)”

  1. 97% Study Falsely Classifies Scientists’ Papers, according to the scientists that published them

    Link to PopTech’s post

The paper, Cook et al. (2013) ‘Quantifying the consensus on anthropogenic global warming in the scientific literature’, searched the Web of Science for the phrases “global warming” and “global climate change”, then categorized these results by their alleged level of endorsement of AGW. These results were then used to allege a 97% consensus on human-caused global warming.

To get to the truth, I emailed a sample of scientists whose papers were used in the study and asked them if the categorization by Cook et al. (2013) is an accurate representation of their paper. Their responses are eye-opening and evidence that the Cook et al. (2013) team falsely classified scientists’ papers as “endorsing AGW”, apparently believing they know more about the papers than their authors.

    “It would be incorrect to claim that our paper was an endorsement of CO2-induced global warming.” – Craig D. Idso

    • Oh, this is hilarious.

So Craig Idso is complaining that the abstract he wrote for his paper doesn’t accurately represent the “skeptical” content of the paper itself. Reading the abstract (available here), I would have personally rated it as “no position,” but I can see how someone else might have considered it an “implicit endorsement.” There is certainly nothing in the abstract to indicate that Idso’s paper is an “implicit rejection.”

      Nicolas Scafetta’s abstract (paper available here – abstract is the first paragraph in the left column) states

      We estimate that the sun contributed as much as 45–50% of the 1900–2000 global warming, and 25–35% of the 1980–2000 global warming. (emphasis added)

Cook et al 2013 rated this as an “explicit endorsement with quantification of 50%+”, the strongest endorsement category, and it’s easy to see why: Scafetta’s abstract attributes no more than 50% of the observed 1900–2000 warming to solar influences. The logical alternative source for the rest of the observed warming is greenhouse gases, which would thus account for greater than 50%. So this appears to be an example of an abstract that misrepresents the contents of the paper.

Nir J. Shaviv’s abstract (available here) says toward the bottom that solar activity and cosmic ray flux contributed “0.37 +/- 0.13 K” of the last century’s global increase in temperature, while the rest “should be mainly attributed to anthropogenic causes.” The following GISTEMP graph shows that the global temperature has increased by about 0.8 K over that same period:
      GISTEMP graph of Temperature Anomalies
In other words, if solar and cosmic ray flux account for 0.37 K of the 0.8 K, then that’s less than half, and Shaviv’s own abstract says the remaining 0.43 K “should be mainly attributed to anthropogenic causes” (emphasis added). Since the word “mainly” is in there, I totally understand how Cook et al 2013 would have identified this abstract as “explicitly endorses but does not quantify or minimize.”

      Cook et al 2013 knows that their abstract ratings could be biased, either toward a “no position” or even to an “endorse” position. After all, you can only get so much information jammed into an abstract, and why bother mentioning the causes of the observed climate disruption if they’re not applicable or of only secondary importance to the topic of your paper? That’s why they asked authors to self-rate their papers as well.

And when the authors rated their papers, a bias toward “no position” was discovered. Both the percentage of endorsing papers and the percentage of rejecting papers increased from the abstract ratings (36.9% to 62.7% endorsement and 0.6% to 1.8% rejection).

      In summary, Idso, Scafetta, and Shaviv had a chance to self-rate their papers and correct the record. Hopefully they did so. But the fact that their abstracts possibly misrepresented the content of their papers is not the fault of Cook and his co-authors. Any errors of that type are strictly the responsibility of Idso, Scafetta, and Shaviv.

      • Are you claiming to know more about the paper than their authors? It is most certainly the fault of the Cook et al. authors if they misrepresented a scientist’s paper.

        Dr. Scafetta explicitly used the words, “might have” and Shaviv used “mainly”.

        The Cook et al. authors generated data and the data is wrong – period.

        • No, I’m not claiming that I know the authors’ papers better than they do, and neither did Cook et al 2013. I’m saying that the abstracts to the papers may have misrepresented the contents of the paper. And I’m saying that the ratings of the abstracts, as written, are reasonable.

You even quote Shaviv as admitting that he had to be circumspect in his abstract because of the refereeing.

          Stop confusing ratings of the abstracts for ratings of the papers, PT.

Why don’t you ask Idso, Scafetta, and Shaviv if they participated in the self-rated portion of the survey? If they did, and if they corrected the bias in the abstract ratings toward “no position,” then you and they really have no right to complain. The consensus results are the same regardless of whether the abstract ratings or the paper self-ratings are used.

          And if they didn’t participate in the self-ratings, then they have no-one else to blame but themselves for not correcting the record when they had a chance to do so.

          Either way, you’re wrong, and they’re crying crocodile tears.

        • I said, and I quote: “I’m saying that the ratings of the abstracts, as written, are reasonable.”

          I think I’ve been pretty clear, as has Cook et al 2013 – they set out to examine paper abstracts and then let the authors of those papers correct the rating if necessary as a means to control the experiment and remove biases. And the control method was successful. You just don’t happen to like the results.

As an aside, I find your defense of the right of authors to determine the fate of their papers to be admirable. I presume you’ll be paring down your own list of papers in response to the many authors who have publicly stated that you’re misrepresenting their conclusions. Anything less would be hypocritical on your part, after all.

That is semantics, so please stop making excuses.

          By rating the abstracts are they implying the rating to the papers?

So again, do you think Cook et al. accurately rated these scientists’ papers?

          You really should learn to do better research and not believe everything you Google. Name the paper on my list that is misrepresented and tell me why it was listed. My list is only classifying if a paper can be used to support a skeptic argument (which may have nothing to do with the author) not what the author’s position is on their own paper.

        • This is most definitely not semantics, PT. It’s a critically important point that you’re ignoring because it doesn’t fit your intended narrative.

An abstract is a very short summary of the contents of a paper. It is supposed to accurately represent the paper’s critical conclusions, but by its very nature no abstract can fully represent the contents of a paper that is 10x longer, and it is possible to write an abstract that misrepresents the paper itself, intentionally (as Shaviv admitted to) or otherwise.

Cook et al 2013 rated the abstracts to the best of their ability. They may have gotten (and clearly did get) some ratings wrong. They controlled for those mistakes by asking the authors to rate the complete papers, and a bias in the abstract ratings was discovered in the process.

          But none of that means that Cook et al 2013 misrepresented the papers in any way. They didn’t rate the papers, only the abstracts.

          As for an answer to your other questions, I’ll simply point you to your discussions on Ars Technica, Greenfyre’s site, and Skeptical Science, which I have read, yes. I don’t feel the need to research in greater detail, however, as the various takedowns of your views and list are extensive.

          If you’ll excuse the term, the consensus appears to be that your list is laughable.

        • Brian,

          By rating the abstracts are they implying the rating to the papers?

Please name the argument you feel is still valid about my list that I have not rebutted in extensive detail.

          Rebuttal to Greenfyre – “Poptart gets burned again, 900 times”

          Google Scholar Illiteracy at Skeptical Science

          I will be happy to refute it in extensive detail for you here. I suggest reading the Rebuttals to Criticism section of my list before continuing on this line of argumentation.

          It is quite embarrassing you believe the Denominator post to be valid. Do you not understand how Google Scholar works too or are you computer illiterate as well?

        • To answer your question, no, the Cook et al 2013 ratings of the abstracts do not imply ratings of the papers.

          Which is what I’ve been saying and explaining, in great detail, and what you’ve apparently been failing to understand, for several comments now.

  2. Brian, this is incorrect. From their own paper,

“For both abstract ratings and authors’ self-ratings, the percentage of endorsements among papers expressing a position on AGW marginally increased over time. Our analysis indicates that the number of papers rejecting the consensus on AGW is a vanishingly small proportion of the published research.”

    • You’re ignoring all the caveats that Cook et al 2013 has in their paper, I’m afraid, and thus quoting out of context (badly). The context of Cook et al 2013 is that they acknowledge that their abstract ratings may be biased, both due to the inherent limitations of the abstract and due to the volunteers’ own biases. Cook et al 2013 points out that addressing these biases is part of why they asked authors to self-rate their own papers based on the entire content of the paper, not just the abstracts.

      I suppose I could adjust my prior answer slightly to “Ratings of the abstracts are ratings of the paper, but with larger error bands than the self-ratings.” But as Scafetta’s abstract clearly demonstrated (and as Shaviv admitted on your site), the rating of an abstract need not be suggestive of a rating based on the details of the paper.

I am not ignoring anything. Acknowledging that it might be wrong does not excuse it from being wrong. The author self-ratings simply confirmed that their ratings were wrong, as they were not identical for the 14% of responses they received.

        So is it acceptable to partially read a paper and misrepresent it?

        • I don’t think you understand the implications of what you’re saying. You’re essentially saying that we should ignore the results of Cook et al 2013 unless their abstract ratings are 100% identical to the full paper self-ratings. That’s a completely unreasonable and unrealizable criterion.

          Do you reject every other scientific study because they have a margin of error? If you’re using this 100% criterion, you’d have to, because any uncertainty, any mistake would automatically render the entire study meaningless.

If Cook et al 2013 only mis-rated 4 papers, that would be an amazingly high success rate: 4/11,944 = 0.0335%, or a 99.9665% success rate. Of course they’re going to make mistakes, and of course there’s uncertainty. The fact of the matter is that Cook et al 2013 openly discussed the potential for mistakes in their discussion of uncertainty, and they controlled for it. That was, after all, the ethical thing to do.

          If anyone has a right to complain about errors and biases in ratings, it’s the authors who had the endorsing papers. Cook et al 2013 misidentified nearly 600 of their papers, not just a couple of dozen.

I understand that you don’t like Cook et al 2013’s conclusions, PT, but that doesn’t excuse your use, so far, of the black-or-white fallacy and, previously, of quoting out of context.

        • I reject studies where the methods do not produce accurate results such as Cook et al. (2013).

          It was not 4 papers because Dr. Tol found 5 himself.

          As I said the errors identified from comparing the self-ratings should have been enough to reject the paper during the peer-review process.

        • Their methodology is imperfect. They admit it. I suspect that if you asked Cook, Nuccitelli, or any of the other co-authors, they would have preferred to base their assessments on complete readings of 11,944 papers instead of merely 11,944 abstracts. That would certainly have been an improved methodology, but at the cost of a massive amount of additional time on a project that took something like a year as it was. Often in science (and my own field, engineering) you have to move forward even though your data is imperfect. Not having time to do a “proper” job of data gathering and analysis is, in my experience, more common than being given the time and budget to do the job “right.”

          One of the facts of data analysis is that even horribly biased data is useful if the biases can be removed. Cook et al 2013 was able to remove most of the bias. That makes their abstract ratings usable. If you want an example of biased data that is still useful, you have only to look at the global satellite data from UAH and RSS. Multiple biases have been detected and corrected, and as a result the data is still very useful.

Cook et al 2013’s methodology is sound precisely because they controlled for rating bias. If they hadn’t controlled for bias, then it would be unsound. But they did. None of your protestations to the contrary change that basic fact.

          There are probably a number of fair and reasonable criticisms of this study, PT. But this one is a dead end.

        • You are seriously comparing the mathematical error corrections from RSS and UAH to this “study”?

          Failure to get accurate ratings for these papers is an unacceptable excuse.

          So arguments like Dr. Tol’s are baseless? …good luck selling that one. Keep believing this is a dead end …good luck selling that one too.

        • You’ve clearly never done serious data analysis, or research involving people, psychology, biological sciences, or polling. If you had, you’d not have made such a statement (unless you were being intentionally argumentative, that is).

          People can be considered measurement instruments just like a thermometer or a microwave sounding unit. They are biased and error prone, but neither factor renders human measurements unusable.

          So yes, I’m seriously comparing the mathematical corrections made by RSS and UAH to this study, because they are not so different as you think. The study was designed well enough to make good comparisons within the time constraints that Cook and his co-authors had.

If you don’t like Cook et al 2013’s results, you’re welcome to replicate their work with a large enough group of volunteers to rate the complete content of 12,000 papers. I wish you good luck on that endeavor.

  3. @Poptech
You don’t seem to understand how science works. You would not be replicating flawed methods; you would be correcting them by carrying out a new study that addresses the supposed flaws of this one. If you are really sure that Cook et al’s conclusions are wrong, publish an analysis in the journal highlighting the flawed methods and what you’ve done to correct them. It’ll mean having to do some real work, though, as opposed to publishing nonsensical tittle-tattle on the internet.

      • Not even close, PT, but nice try.

        By including the word “global” in front of “climate change,” Cook et al 2013 reproduced the results of Oreskes 2004, showing that the results of Oreskes 2004 were accurate even when using a much larger sample size.

        And you and Tol are both ignoring the fact that Cook et al 2013 specifically addresses representativeness:

        A Web of Science search for ‘climate change’ over the same period yields 43 548 papers, while a search for ‘climate’ yields 128 440 papers.

Put another way, Tol finding evidence that the search might not have been representative is not the same as proving that the search is not representative. For all we know, Tol’s search could be less representative of the 128,440 “climate” papers than the Cook et al 2013 search.

        The graphs are pretty, but they don’t prove – or disprove – anything. You’re over-reaching badly, PT, and so is Tol.

        • The graphs are devastating and demonstrate how un-representative Cook et al. is of the peer-reviewed literature.

Brian, you seem to be unaware that Cook et al. was not science but marketing propaganda:

          link

        • Repeatedly asserting something doesn’t make it true. You (or Tol, since they’re his graphs) have to prove that the results are not representative. The graphs do not do that. It takes analysis to prove it, and there is no analysis in those graphs.

          If this was a court of law, those graphs would be circumstantial at best.

Come back when you or Tol has proven that Cook et al 2013 is sufficiently unrepresentative to invalidate the results.

Brian, the only thing Cook et al. is representative of is papers whose authors chose to use the keywords “global warming” and “global climate change” in their abstracts. It is in no way representative of the climate science peer-reviewed literature.

  4. Popular Technology: Please familiarize yourself with S&R’s comment policy, specifically the portions about “meaningful engagement” and how “commenters exhibiting bad faith behavior will have their offending comments deleted.”

    Your comments to date have been tolerated as a courtesy since Brian chose to engage with your first comment. However, your comments have recently veered away from meaningful engagement and toward bad faith behavior.

    Further deviations from the comment policy may result in our refusing to pass your comments out of moderation. You will not be warned again.

This is an incredibly bizarre warning which includes requirements that are impossible to meet. What is considered “meaningful engagement” is purely subjective and can be used to dismiss my comments for any reason whatsoever.

If we wanted to dismiss your comments for any reason whatsoever, we’d have done so long ago. We value intelligent discussion, and we recognize that a) people can disagree in good faith and b) legitimate data and research are subject to dispute. However, there is a difference between the honest pursuit of knowledge and a predetermined argument that has less to do with science than it does with ideology.

        That comment policy emerged from a great deal of deliberation on the part of the staff, and while it might be in some ways subjective, it is in no way arbitrary. We’re not better as a publication if we chase away commenters. We are better when we hold everyone to a high standard of evidence and reason.

  5. Pingback: Climate Illogic: industrial climate disruption is not a popularity contest | Scholars and Rogues
