The term fake news can refer to a variety of different phenomena. Here, we largely adopted the usage suggested in (25): knowingly false or misleading content created largely for the purpose of generating ad revenue. Given the difficulty of establishing a commonly accepted ground-truth standard for what constitutes fake news, our approach was to build on the work of both journalists and academics who documented the prevalence of this content over the course of the 2016 election campaign. In particular, we used a list of fake news domains assembled by Craig Silverman of BuzzFeed News, the primary journalist covering the phenomenon as it developed (7). As a robustness check, we constructed alternate measures using a list curated by Allcott and Gentzkow (2), who combined multiple sources across the political spectrum (including some used by Silverman) to generate a list of fake news stories specifically debunked by fact-checking organizations.

The Silverman list is based on the most-shared web domains during the election campaign as determined by the analytics service BuzzSumo. Silverman and his team followed up their initial results with in-depth reporting to confirm whether a domain appeared to have the hallmark features of a fake news site: lacking a contact page, featuring a high proportion of syndicated content, being relatively new, etc. We took this list and removed all domains classified as “hard news” via the supervised learning technique used by Bakshy et al. (23), so as to focus specifically on fake news domains rather than the more contested category of “hyperpartisan” sites (such as Breitbart). (The authors used section identifiers in article URLs shared on Facebook that are associated with hard news, such as “world” and “usnews,” to train a machine learning classifier on text features. They ultimately produced a list of 495 domains with both mainstream and partisan websites that produce and engage with current affairs.) The resulting list contains 21 mostly pro-Trump domains, including well-known purveyors such as the Denver Guardian and Ending the Fed. In analyses using this list, we counted any article from one of these domains as a fake news share. (See below for details on these coding procedures and a list of domains in what we refer to as our main BuzzFeed-based list.)
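The domain-level coding rule can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the authors' code: all domain names are placeholders, the helper names (`normalize_domain`, `build_fake_domain_list`, `is_fake_news_share`) are hypothetical, and the hard news classifier is represented only by its output, a set of domains.

```python
from urllib.parse import urlparse


def normalize_domain(url: str) -> str:
    """Return the lowercased host of a URL, without port or leading 'www.'.

    Assumes the URL includes a scheme (http:// or https://).
    """
    host = urlparse(url).netloc.lower().split(":")[0]
    return host[4:] if host.startswith("www.") else host


def build_fake_domain_list(candidates, hard_news):
    """Remove every domain classified as hard news from the candidate list."""
    return {d.lower() for d in candidates} - {d.lower() for d in hard_news}


def is_fake_news_share(url: str, fake_domains) -> bool:
    """Code a shared link as fake news if its domain is on the list."""
    return normalize_domain(url) in fake_domains


# Placeholder inputs (hypothetical; not the domains used in the study).
candidate_domains = {
    "fake-site-a.example",
    "fake-site-b.example",
    "partisan-news.example",
}
hard_news_domains = {"partisan-news.example", "mainstream.example"}

fake_domains = build_fake_domain_list(candidate_domains, hard_news_domains)

shared_urls = [
    "https://www.fake-site-a.example/story?id=1",
    "http://FAKE-SITE-B.example/post",
    "https://partisan-news.example/world/article",
    "https://mainstream.example/usnews/x",
]

# Count of shares coded as fake news under the domain-level rule.
n_fake = sum(is_fake_news_share(u, fake_domains) for u in shared_urls)
```

Under this rule, every article from a listed domain counts as a fake news share, regardless of the content of the individual article.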

The Allcott and Gentzkow list begins with 948 fact checks of false stories from the campaign. We retrieved the domains of the publishers originating the claims and again removed all hard news domains as described above. Then, we coded any article from this set of domains as a fake news article. For robustness, in table S9, we used only exact URL matches to any of the 948 entries in the Allcott and Gentzkow list. This is a more restrictive definition of fake news, but one that does not require assuming that every article from a “fake news domain” should be coded as fake news. Since the list contains the researchers’ manual coding of the slant of each article, we also presented models that analyze only pro-Trump or only pro-Clinton fake news sharing activity.
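The difference between the domain-level rule and the more restrictive exact-URL rule can be sketched as follows. This is a hedged illustration only: the URLs and slant labels are placeholders, the names `canonical_url` and `code_share` are hypothetical, and the URL normalization shown (dropping the scheme, `www.`, query string, and trailing slash) is an assumption, since the text does not specify how exact matches were determined.

```python
from urllib.parse import urlparse


def canonical_url(url: str) -> str:
    """Return host plus path, lowercased host, no 'www.', query, or trailing slash.

    This normalization is an assumption made for the sketch.
    """
    parts = urlparse(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def code_share(url: str, debunked: dict, fake_domains: set):
    """Return (domain_match, exact_match, slant) for one shared link.

    slant is the manually coded slant of the debunked story, or None
    when the link is not an exact match.
    """
    canon = canonical_url(url)
    domain = canon.split("/", 1)[0]
    return (domain in fake_domains, canon in debunked, debunked.get(canon))


# Placeholder list of debunked stories with a manually coded slant.
debunked = {
    canonical_url("http://fake-site-a.example/story-1"): "pro-Trump",
    canonical_url("http://fake-site-c.example/story-9"): "pro-Clinton",
}
# Domains of the publishers originating the debunked claims.
fake_domains = {key.split("/", 1)[0] for key in debunked}

results = [
    code_share(u, debunked, fake_domains)
    for u in [
        "https://www.fake-site-a.example/story-1/?utm=x",
        "https://fake-site-a.example/other-story",
        "https://mainstream.example/politics",
    ]
]
```

In this sketch, the second link is coded as fake news under the domain-level rule but not under the exact-URL rule, which is the distinction the robustness check turns on.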
