Epistemology. You keep using that word. I don’t think it means what you think it means. A discussion of “anti-mask science”

By Lynnepi

Abstract for this post:  The article that is the subject of this post was not written by someone who said, “I wonder …”

The paper for this review is “Viral Visualizations: How Coronavirus Skeptics Use Orthodox Data Practices to Promote Unorthodox Science Online.” I’m going to assume this paper was intended to describe a scientific study.  When I read through it the first time, I found it to be tedious and incoherent.  I have spent a good deal of time with it and now believe I can explain why I found it tedious and incoherent.

The paper is not written in a standard format for scientific articles, so it was somewhat confusing to work through.  Due to the authors’ grand verbiage, my brain occasionally lapsed into a vision of eagles soaring over windswept mountains.  The authors would say my problem is that I’m a sheeple scientist who naively conforms to the mainstream scientific establishment.  One advantage of using the standard format is that it encourages describing the important parts of a scientific inquiry, and the reader knows where to find this information.  In this case, the authors left out some chunks of information that are important to know when assessing their work.

Ultimately, to achieve any understanding of the paper I had to summarize each paragraph.  I’ve included that summary at the end of this post in case it is at all helpful to anyone.  Through that exercise I learned the fundamental problem of this paper:  the authors want two essentially mutually exclusive conditions to be true: 1) the anti-masker communities employ scientifically rigorous thinking and methods to arrive at their conclusions and present their findings; and 2) they operate through distrust and resentment.  It is extremely difficult to engage in rigorous scientific inquiry if you are motivated by “the answer you already know” which you will use to defeat your enemies.  Curiosity is a much better fuel for the scientific process.

That’s not to say that good scientists don’t have a theory they hope is true.  But they start with the default position that the theory is not true and then proceed by asking questions which can lead to evidence that is either consistent or not consistent with that theory.  These questions require identifying alternative explanations for what they observe, and their process involves “checking” whether these alternative explanations are reasonably likely, or when possible, eliminating alternative explanations through study design. 

The authors appear to attempt to bridge the gap between conditions one and two by asserting that all data and data analyses are biased and conditional upon the social context of the people doing the analysis.  According to the authors, knowledge is “created” based on the social constructs of whoever is collecting, analyzing and presenting the data.  (This made me wonder if knowledge is “created” or “discovered”?)

There are instances where who is collecting the data and how they interpret and present it can “create” a “reality” that is highly dependent on social context.  For example, there is the history of Nazis measuring physical features of people in order to separate them into “Aryans” and “Jews.”  Putting such instances aside, it may be true that “all” data collection methods and data analyses are biased and/or contain errors.  The pertinent question is how much bias and/or error are present?  Are they enough to cause us to make an incorrect inference?

Unlike the authors, I don’t accept the idea that all knowledge is subject to the social construct under which it was obtained.  My position is that there is an objective reality of some sort.  My husband is an electrician.  I could accuse him of being a white male chauvinist pig (the paper sometimes discusses feminists) and that he can’t oppress me with his version of how electricity works.  However, if I hold a hot wire in one hand and a grounded metal object in the other, I will be badly injured regardless of whose social construct is predominant.

If you’ve gotten this far you’re probably wondering, “What does the paper say?”  Overall, the study is about “activists” who the authors claim are adept at using data visualizations which support interpretations that are at odds with public health measures designed to combat COVID-19.  The statement that most closely resembles a study objective is, “This paper investigates how these activist networks use rhetorics of scientific rigor to oppose these public health measures.”  The authors refer to this group as “anti-maskers,” although at the bottom of page four in section 3.2.3 they mention that they actually mean anyone who is skeptical of the COVID-19 pandemic and not specifically people who oppose wearing masks.

The authors undertook two approaches to this investigation.  First, they extracted tweets from a dataset of tweets related to COVID-19 assembled by a non-author.  The authors used certain keywords to find tweets involving data visualizations (e.g., “bar,” “line,” “trend,” “data”).  “Unfortunately, this strategy yielded more noise than signal as most images included were memes and photographs.”  So they did a second search using keywords like “chart,” “map,” and “dashboard.”  They didn’t comment on whether this produced a dataset closer to what they were looking for, but it is the one they used.  In fact, they never gave inclusion and exclusion criteria specifying which tweets belonged in this portion of the study, as would normally be done in a scientific paper.  I wondered if the second search might represent a selection bias.  That is, if the tweeters thought they were engaged in data visualization, then shouldn’t the tweet be included?  Did the authors eliminate such tweets in order to produce a dataset that would support their conclusions?
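For concreteness, here is a minimal sketch of the kind of keyword filter the authors describe.  The two keyword lists are quoted from the paper; everything else (the field names, the toy tweets) is hypothetical, since the paper doesn’t share its code.

```python
import re

# Keyword lists quoted in the paper; the rest of this example is hypothetical.
FIRST_PASS = ["bar", "line", "trend", "data"]
SECOND_PASS = ["chart", "map", "dashboard"]

def matches(text, keywords):
    """True if any keyword appears as a whole word in the tweet text."""
    return any(re.search(rf"\b{re.escape(k)}\b", text, re.IGNORECASE)
               for k in keywords)

tweets = [
    {"id": 1, "text": "New CHART of cases per 100K by county"},
    {"id": 2, "text": "Crossing the finish line! #marathon"},  # noise
    {"id": 3, "text": "Check the state dashboard for today's data"},
]

first = [t["id"] for t in tweets if matches(t["text"], FIRST_PASS)]
second = [t["id"] for t in tweets if matches(t["text"], SECOND_PASS)]
print(first)   # [2, 3] -- "line" and "data" match far too broadly
print(second)  # [1, 3] -- narrower, more visualization-specific
```

The toy example illustrates the noise problem the authors report: generic words like “line” match marathon photos as easily as line charts, which is presumably why the second, narrower keyword list was needed.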

The authors describe in a quantitative way the characteristics of the tweets and identify the top six tweeting networks based on the number of users.  The fourth largest network was the anti-maskers.  The other network groups were: “American politics and media (blue),” “American politics and right-wing media (red),” “British news media,” “New York Times centric,” and “World Health Organization.”  The various tweeting groups all used charts with lines, bars, images, maps, tables, symbols, pie charts, and dashboards.  The authors stated that there was “no significant difference in the kinds of visualizations that the communities on Twitter are using to make drastically different arguments about coronavirus.”  They do not explain in the Methods or anywhere else how they made that comparison.  They provide a few examples of tweets, some from anti-maskers and some from other groups.
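The authors never say what test (if any) produced the phrase “no significant difference.”  One standard way to compare chart-type usage across communities would be a chi-square test on a contingency table of counts; to be clear, this is my assumption about how such a comparison could be made, not their documented method, and the counts below are invented.

```python
from scipy.stats import chi2_contingency

# Invented counts of chart types posted by two communities;
# rows = communities, columns = [line, bar, map, table].
counts = [
    [120, 80, 40, 10],  # e.g., the anti-mask network
    [130, 70, 45, 15],  # e.g., a mainstream-media network
]

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
# A large p-value would mean no detectable difference in the *mix* of
# chart types -- which says nothing about the content of the charts.
```

Even if they ran something like this, it would only compare the distribution of chart forms, not what the charts claim, a point I return to below.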

The authors also did a qualitative analysis through “deep lurking” of five Facebook groups that are involved in data visualizations related to COVID-19.  It’s not stated how many potential Facebook groups could have been followed, or how the authors decided they would follow five of them and which five (although there are five authors …?).  The only “inclusion” type statement about the Facebook groups is that they have 10,000 – 300,000 followers.  So the reader has no way of assessing how representative these Facebook groups are.

The authors’ qualitative analysis revealed the following to them: “Far from ignoring scientific evidence to argue for individual freedom, antimaskers often engage deeply with public datasets and make what we call ‘counter-visualizations’—visualizations using orthodox methods to make unorthodox arguments—to challenge mainstream narratives that the pandemic is urgent and ongoing.”  Another quotation:  “…anti-maskers invoke data and scientific reasoning to support policies like re-opening schools and businesses.”  The reader gets the impression that the authors are enthralled with the anti-maskers.  Almost every statement made about the anti-maskers is favorable.  A few of many examples:

  • “…many of the visualizations shared by anti-mask Twitter users employ visual forms that are relatively similar to charts that one might encounter at a scientific conference.”
  • “…these groups leverage the language of scientific rigor—being critical about data sources, explicitly stating analytical limitations of specific models, and more…”
  • “…coronavirus skeptics champion science as a personal practice that prizes rationality and autonomy; for them, it is not a body of knowledge certified by an institution of experts.”
  • “…anti-mask science has extended the traditional tools of data analysis by taking up the theoretical mantle of recent critical studies of visualization.” [Emphasis added.]
  • “Indeed, anti-maskers often reveal themselves to be more sophisticated in their understanding of how scientific knowledge is socially constructed than their ideological adversaries, who espouse naive realism about the ‘objective’ truth of public health data.” [Emphasis added.]

The one negative statement about anti-maskers is that their effectiveness has led to “horrifying ends.” However, there is no evidence presented in the paper that the anti-maskers have motivated any destructive actions.  It’s possible they have simply reinforced one another’s pre-existing opinions within their own group.

The authors speak of anti-maskers as a uniform population, all of whom are sophisticated people who “practice a form of data literacy in spades.”  (The authors decline to define “data literacy,” a term they appear to use interchangeably with “media literacy” and “science literacy.”)  They don’t draw any distinctions between the Twitter network and the Facebook groups.  I doubt these groups are identical and as homogeneous as the authors suggest.

Now that we’ve built up the anti-maskers as, to borrow a term from David Gorski of Science-Based Medicine, “brave maverick data visualizers,” let’s take a look at a couple of data visualizations offered by the authors as examples of sophisticated, polished work using the rhetoric of scientific rigor (the authors only provided tweets, so we have no examples from the Facebook groups).

The Georgia Map

A tweet (dated 7/17/2020) shows two maps of Georgia with the counties outlined and color-coded according to their incidence of COVID-19 relative to other Georgia counties.  The metric chosen is “cases per 100K” and appears to be cumulative cases.  The maps represent 7/2/2020 and 7/17/2020. Underneath is a legend which shows the range of incidence rates according to color.  The incidence rates have increased from July 2 to July 17, but the color coding remains the same.  The tweeter states, “In just 15 days the total number of @COVID19 cases in Georgia is up 49%, but you wouldn’t know it from looking at the state’s data visualization map of cases.  The first map is July 2.  The second is today.  Do you see a 50% case increase?  Can you spot how they’re hiding it?”

Snarky me thought “so is it 49% or 50%?” and “wouldn’t it be funny if the authors were the ones who tweeted that?” (how many people use the term “data visualization”?)

The phrase “Can you spot how they’re hiding it?” implies that it’s difficult to figure out that cases have increased, but all you have to do is look at the legends under the maps.  Some quick, rough arithmetic shows that the lower-incidence counties have increased cases by about 50% and the higher-incidence counties by about 27%.  So it is not difficult to see that cases have increased, though it’s also not clear from the tweet how the 49% figure was determined.

The Georgia Department of Public Health has a PDF document available from their website that defines the metrics they use for the various graphs on display at their website (e.g., “confirmed COVID-19 cases,” “ICU admissions”).  There is also a section that describes the different options for viewing the Georgia county map.  For example, you can view total cases, cases per 100K (the choice of the tweeter), cases in the last two weeks (!), etc.  For the county map the description includes this statement: “The color scale is based on the distribution of county-level case counts or rates, outlier values are removed from the scale calculation. The scale will change as needed to accommodate increasing or decreasing case counts and maintain distinctions between counties, and will be calculated based on the current data.” [Emphasis added.]  The purpose of the map is to rank the counties according to their COVID-19 incidence.  If you want to know how recent cases are increasing or decreasing, they have a useful graph of daily cases which very clearly shows that daily cases accelerated during the first two weeks of July 2020.  The Georgia DPH is not hiding anything.  The tweeter either knows this or is stupid.  But the tweeter is not “creating new knowledge” using “anti-mask science” or “an epistemology in conflict with orthodox science.”  One caveat: I don’t know exactly what data the Georgia DPH presented in July 2020.  But if they were hiding anything then, they have since stopped.
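To see concretely how a map legend “calculated based on the current data” behaves, here is a minimal sketch of a relative (quantile-based) color scale.  The binning scheme and county values are invented for illustration; I don’t know exactly how the Georgia DPH computes its scale, only that the quoted description says it is recalculated from the current data.

```python
import numpy as np

rng = np.random.default_rng(0)
july2 = rng.uniform(200, 2000, size=159)  # invented cases per 100K, Georgia's 159 counties
july17 = july2 * 1.5                      # every county up 50%

def color_bins(rates, n_bins=5):
    """Assign each county a color bucket using quantiles of the *current* data."""
    edges = np.quantile(rates, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(rates, edges)

print((color_bins(july2) == color_bins(july17)).all())
# True: every county keeps the same color even though the legend's
# numbers (the bin edges) have all risen by 50%.
```

Because the bins are recomputed from each snapshot, the colors encode each county’s rank among its peers, not its absolute rate — exactly why the two maps look alike while the legend numbers climb.  The “hidden” increase is sitting in plain sight in the legend.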

The “Governors Chart”

Another tweet (dated 7/19/2020) shows a “stylized” bar chart using governors’ heads sized according to the death rate from COVID-19 (“fatalities per million”) in their states.  The chart doesn’t indicate per million what, but I assume it’s population.  The states with the five highest death rates are New Jersey, New York, Connecticut, Massachusetts, and Rhode Island.  The other states shown, which have rates approximately one-fifth or less of those in the high-fatality states, are Arizona, Florida, California, and Texas.  The tweet reads, “Hey Fauci…childproof chart!  Even a 4-year-old can figure this one out!”  I’m forced to confess that I didn’t figure it out.  I assumed that the governors from the high-fatality states were Democrats and those from the low-fatality states were Republicans.  However, the governor of Massachusetts is a Republican and the governor of California is a Democrat, so that didn’t fit.  Since the tweeter is using governors’ likenesses it must have something to do with the actions or policies of those governors.  I can’t easily find information on their masking and social distancing policies prior to July 2020.

For the sake of brevity, consider New York and Arizona.  Eyeballing it, New York’s COVID-19 fatality rate looks like about 1,600 per 1,000,000 while Arizona’s looks like about 300 per 1,000,000.  COVID-19 cases in New York state are influenced by New York City (NYC), which through 5/28/2021 accounted for 65.1% of the state’s cases.  The NYC population (8,336,817) represents 42.9% of the state’s population.  In 2018, about 13.6 million international travelers came to NYC.  The population density of NYC is 27,000 people per square mile – the most densely populated city in the United States.  Think of people riding the subway and other forms of mass transit during their 41.5-minute commute to work (56% of NYC residents use public transportation).  Basically, NYC itself is a superspreader event.  The population is 32.1% white, non-Hispanic, 24% black, 29% Hispanic or Latino and 14% Asian.  COVID-19 was first detected in NYC (and also in New York state) on 3/1/2020 and spread quickly after that, accumulating 20,875 cases (statewide) by 3/23/2020 and 373,040 cases by 6/2/2020.  This is important in that physicians were confronted in March with a disease they did not have experience treating.

In Arizona, Maricopa county (home to Phoenix) accounts for 62.2% of the state’s cases.  In 2019, about 2.2 million international travelers passed through Sky Harbor Airport in Phoenix (I did not find information on visitors to Phoenix).  The population density of Phoenix is 2,800 people per square mile.  The average commute is approximately 26.2 minutes, and I’m guessing that Phoenix is more like my city, where most people drive to work in their car (“less than 10% of the population currently commutes by walking, biking or transit.”)  About 43% of Phoenix’s citizens are white, non-Hispanic, 43% are Hispanic or Latino, 7% black, 10% “other,” 4% Asian and 2% Native American.  Although the first COVID-19 case was detected on 1/26/2020, cases accumulated much more slowly, so that by 6/2/2020 the state had only 21,250 cases (by 3/23/2020 New York state had already accumulated about that many).

My point in comparing NYC and Phoenix is that by using some basic epidemiologic principles, I developed alternative plausible explanations for the differences in COVID-19 fatality rates besides what the governors of New York and Arizona may have done or not done.  That is, population density, interaction with the international community, method and length of commuting, demographic composition and timing of how the pandemic unfolded most likely play a role in the differences in the fatality rates recorded for New York and Arizona.  Phoenix is not a counterfactual representation of New York City.
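As a quick worked example of the arithmetic involved, using only the figures quoted above: NYC’s share of the state’s cases versus its share of the state’s population already implies a large within-state disparity that has nothing to do with the governor.

```python
# Figures quoted above: NYC = 65.1% of NY state's cases, 42.9% of its population.
nyc_case_share, nyc_pop_share = 0.651, 0.429

# Case rate in NYC relative to the rest of the state (a simple rate ratio):
rate_ratio = (nyc_case_share / nyc_pop_share) / \
             ((1 - nyc_case_share) / (1 - nyc_pop_share))
print(f"{rate_ratio:.1f}x")  # ~2.5x: NYC residents were recorded as cases at
                             # roughly 2.5 times the rate of other New Yorkers

# And the eyeballed fatality rates from the tweet: ~1,600 vs. ~300 per million.
print(f"{1600 / 300:.1f}x")  # ~5.3x gap between New York and Arizona
```

If density and exposure can produce a 2.5x rate gap inside a single state under a single governor, it is not hard to believe they contribute to the 5.3x gap between states.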

Bottom line, the tweet examples above do not indicate that the anti-maskers use “orthodox” scientific methods to make inferences.  This appears to go back to the authors’ apparent assertion that all knowledge is created through social construct.  With “regular” science (I’m not sure what the best descriptor is), there is a body of objective knowledge that exists and that we draw from in order to view and interpret observations.  I didn’t mention it above, but I used the knowledge that SARS-CoV-2 is a virus that is transmitted by droplets.  That led to considering how population density and likelihood of being in crowded places might influence transmission of the virus and patient outcomes.  Based on the authors’ description (everything we “know” about the anti-maskers is delivered through the filter of the authors’ narrative), anti-maskers take one metric, create a graph or chart and devise an explanation of what it means as guided by their “social construct.”

That is my response to the authors’ expressed confusion over “a fundamental epistemological conflict between maskers and anti-maskers, who use the same data but come to different conclusions.”  We’re not using the same data.

The authors pay lip service to the idea that the data visualization activities of the anti-maskers need to be confronted.  Examples: “This paper shows that more critical approaches to visualization are necessary, and that the frameworks used by these researchers (e.g., critical race theory, gender analysis, and social studies of science) are crucial to disentangling how antimask groups mobilize visualizations politically to achieve powerful and often horrifying ends,” and “Understanding how these groups skillfully manipulate data to undermine mainstream science requires us to adjust the theoretical assumptions in HCI research about how data can be leveraged in public discourse.”  The second statement also illustrates what I referred to as the “incoherence” of the paper.  The authors say the anti-maskers engage in scientific inquiry, but “skillfully manipulating” data is not the aim or process of science.

A word about the language used in this paper.  The authors continually refer to “orthodox” science and research methods.  I have never heard that term used by a scientist.  The authors refer to “mainstream” science, the scientific establishment, scientific elites.  Meanwhile, the anti-maskers are members of “effervescent” communities who teach each other how to analyze, interpret and present data and who understand that “science is a process, and not an institution.”  They are more “sophisticated” and “scientifically rigorous” than the “naïve” sheeple scientists, all of whom accept any data they receive as perfect without question.  The brave maverick anti-masker data visualizers won’t be cowed by some “hierarchical” scientific monolith, which insults them by calling them “data illiterates.”  The language used to describe anti-maskers is almost always glowingly positive, while the language used to describe “orthodox” scientists is always derogatory.  The authors say, “To be clear, we are not promoting these views.”  I think maybe they are.

This brings me to the last piece of the puzzle. Why did the authors need to “talk smack” about the “regular” scientists, who are not the subject of their study? This is my theory: the authors describe an “effervescent,” high-functioning community of brave maverick anti-mask data visualizers who create new knowledge through their socially-constructed, alternative epistemology but nevertheless endanger the world with their effective charts which lead people astray. The naive, craven sheeple scientists who don’t dare venture from their mainstream hierarchy are clueless about the anti-maskers and what to do to stop them. Whatever shall we do? Have no fear, the social scientists understand the complexities of the anti-maskers and are the only ones who can rescue humanity! Why do I think this? Because of sentences like this: “This paper shows that more critical approaches to visualization are necessary, and that the frameworks used by these researchers (e.g., critical race theory, gender analysis, and social studies of science) are crucial to disentangling how antimask groups mobilize visualizations politically to achieve powerful and often horrifying ends.”

A comment about epistemology, a word the authors use repeatedly.  They claim the anti-maskers have developed a rival epistemology and this has resulted in a “crisis.”  I tried to do some reading on the epistemology of science but … eagles soaring … it was mostly over my head, although I found this paper helpful.  It opens with: “The aim of science is the generation of scientific knowledge. Science issues in propositional outputs that we seek to support with sufficient evidence that they are worthy of belief.”  Certainly “regular” scientists and anti-maskers have different ideas about what constitutes “sufficient evidence.”  My bias is that there is no “complementary and alternative” science and I find the anti-maskers’ evidence “not worthy.”

Summary of the Article

Section 1.  “Introduction” but really a semi-summary of the paper.

2nd paragraph – the closest thing to an objective, although written as if the investigation was retrofitted to confirm this assertion: “This paper investigates how these activist networks use rhetorics of scientific rigor to oppose these public health measures.”  The rest of the paragraph goes on to describe their results.

3rd paragraph starts with a summary of their methods and ends with more results description.

4th paragraph has more methods summary.

5th paragraph – description of results

6th paragraph – commercial for coming attractions “As we shall see …”

7th paragraph – more description of results

8th paragraph – implications of results (discussion?)

Section 2. “Related Work”

Section 2.1

1st paragraph – others have investigated what data literacy is and how we know whether someone has it.  Ends with Peck, who uses some flowery words about how someone’s beliefs and preferences influence how they create a graph, chart or image.  Authors: we need to understand data visualizers’ social and political context.

2nd paragraph – Literacy is more than just understanding messages. Literacy has multiple meanings(?) and it all depends on how the local community views it.  The important factor is how you leverage literacy during local social interactions.  So there is no normative definition of data literacy.  Data literacy is whatever the local community thinks it is.

3rd paragraph – promoting education to improve data literacy is the wrong thing to do, and can backfire as it increases skepticism of the data and the people who report it, such as government.  Then something about fake news being a collective crisis.  Fake news doesn’t cause people to vote for Trump; they vote for Trump because they see logical inconsistencies in how the mainstream media portrays Trump.  Improving media literacy is futile.

4th paragraph – Climate skeptics. Fischer says fact-checking isn’t enough, have to investigate “what is behind” climate skepticism. Discussions about COVID datasets represent “political questions about the role of science in public life.”

Section 2.2

1st paragraph – [Data] visualizations are not “objective” representations of “knowledge,” but representations of power!! Feminists have shown that data science and design are infused with power dynamics and inequalities.  It’s important to identify what is missing from the data and to develop alternative methods for “analyzing and presenting data based on anti-oppressive practices.” It’s all based on your social context, man.

2nd paragraph – “Critical, reflexive” studies of data visualization must be done to erase social inequalities created by “computational systems.” Other people have been studying COVID visualizations and the authors will add to this literature by investigating the epistemological crisis that results in divergent conclusions about mask-wearing.

3rd paragraph – The feminists and anti-maskers say that the US government is inappropriately wielding political power through the datasets it releases.  The data collection method is non-neutral (according to the authors, isn’t it impossible to be neutral?  It’s all about your social context, man).  According to feminists and anti-maskers the government is authoritarian and uses science as a weapon.  A vague statement about the need for critical approaches to data visualization.  How anti-maskers use data visualization to achieve their political “and often horrifying” goals will be revealed through the social sciences.  [Note: there isn’t any evidence presented in the paper that the anti-maskers have affected behavior regarding masking and social distancing.  It’s possible they’re talking to each other and reinforcing each other’s predispositions to not follow mask-wearing requirements.]

Section 3.  Methods

“qualitative analysis of comments to identify changes in online dialogue over time [104],” [Note: there is no discussion in the Results of how the dialogue changed over time.]

“visualization research that reverse-engineers and classifies chart images [88, 104].”

Used a dataset of tweets created by a non-author who was studying tweets about COVID-19.  They initially searched this dataset using keywords associated with data analysis (e.g., “bar,” “line,” “trend,” “data”).  “Unfortunately, this strategy yielded more noise than signal as most images included were memes and photographs.” So they did a second search using keywords like “chart,” “map,” “dashboard.”  They didn’t comment on whether this produced a dataset closer to what they were looking for, but it is the one they used.

Quantitative methods.  To classify the images, the authors used the Poco and Heer model, but it was only able to categorize 30% of the images.  So the authors categorized the rest using a more manual approach which I don’t understand but accept.  The other quantitative method was network analysis, which involved identifying nodes (people who tweeted) and edges (which seem to involve communication among the nodes).  They used the Louvain method to delineate communities.  Again, I don’t understand much of this but accept it.
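For readers as unfamiliar with network analysis as I was: the Louvain method takes a graph of who interacts with whom and greedily groups nodes to maximize modularity (dense connections within groups, sparse between).  Here is a minimal sketch using networkx; the paper doesn’t say what software the authors used, and the graph below is invented.

```python
import networkx as nx

# Invented interaction graph: nodes are users, edges are retweets/replies.
G = nx.Graph()
G.add_edges_from([
    ("alice", "bob"), ("bob", "carol"), ("carol", "alice"),  # one tight cluster
    ("dave", "erin"), ("erin", "frank"), ("frank", "dave"),  # another
    ("carol", "dave"),                                       # a lone bridge
])

# Louvain community detection: merges nodes to maximize modularity.
communities = nx.community.louvain_communities(G, seed=42)
print(communities)  # two communities, split across the bridge
```

The output is a partition of users into communities, which is presumably how the authors arrived at groupings like the “anti-mask network (teal)” in their Figure 2.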

Qualitative methods. The authors used “deep lurking” (digital ethnography) to document the behaviors / discussions of the online communities.  They followed five Facebook groups and used a case-study approach.  There is no description of how the Facebook groups were selected, except that they had 10,000-300,000 followers.

Data collection and analysis.  They “printed out” posts as PDFs and “tagged” them with qualitative software.  They did not say which software they used.  They used grounded theory to extract themes, which is a known method in qualitative research.  The authors make a couple of puzzling statements about how digital ethnography, a qualitative research method, doesn’t produce quantifiable results.  They say it’s a limitation and the reason they also had to use quantitative methods. Qualitative research typically doesn’t involve quantitative results and doesn’t need to.  So I’m not sure why they spent text on that, or why that necessitated quantitative methods when earlier they stated that the quantitative and qualitative results complement each other.

On the bottom of page 4, in section 3.2.3, they briefly mention that when they say “anti-masker,” they really mean it as a general term for people with a variety of objections to the pandemic and public health recommendations for combatting it.  That would have been nice to know on page 1.  Typically, the first time you use a term differently from how it would be commonly understood, you would mention the meaning that you are assigning to it.  They explain they didn’t want to use the term “anti-science” because the “anti-maskers” have a different perspective on “science.”  The authors could have used the term “pandemic skeptics” or something similar, but I imagine that wouldn’t grab attention in the way that “anti-masker” does.  At least, that’s the reality of my social context, and it’s perfectly legitimate, thank you.

The next section is titled “Case Study” but this seems to be the detailed presentation of the results.

The first paragraph presents the results of the quantitative Twitter analysis and I realized I had not previously understood that the tweets are from “mainstream” and “anti-masker” individuals.  The authors spend so much time in the first three sections on anti-maskers that it did not occur to me that non-anti-maskers were also involved.  I’m used to papers which delineate the inclusion and exclusion criteria for whose data is included in the study, which this paper did not have.  Anti-maskers had the fourth largest network.

They compared the types of data visualizations tweeted by anti-maskers and the mainstream networks and found “no significant difference in the kinds of visualizations that the communities on Twitter are using to make drastically different arguments about coronavirus.”  They do not explain in the Methods or anywhere else how they made that comparison.  The statement reads as though they are surprised by this, but not all line graphs, or maps, or charts, are identical.  Couldn’t they be using different sets of charts and graphs?   “… how can opposing groups of people use similar methods of visualization and reach such different interpretations of the data?”  I’m very confused by their confusion.  They’re saying “types” and “methods” of visualization, not that they matched specific charts among the different groups.  If they took a specific set of charts or graphs and found that both anti-mask and “mainstream” networks commented on the same set in qualitatively different ways, I would understand their confusion better.  However, there is no indication that this is what was done.

“We approach this problem by ethnographically studying interactions within a community of anti-maskers on Facebook to better understand their practices of knowledge-making and data analysis, and we show how these discussions exemplify a fundamental epistemological rift about how knowledge about the coronavirus pandemic should be made, interpreted, and shared.”

On pages 6-7, the authors describe in a narrative fashion the characteristics of the various data visualizations that comprise their dataset (e.g., line graphs, bar charts, maps, area charts).  Figure 1 is a “UMAP” visualization with some examples of each type of visualization.  The UMAP uses different colors to represent the different types.

Page 7 shows Figure 2 which describes the Twitter networks, with a color assigned to each network.  They provide more detail about the “top 6 network communities,” which appears to be determined based on volume (of nodes? Tweet volume?).  They are:

  1. American politics and media (blue).  Center-left, left, “mainstream” media, and affiliated politicians and actors with a lot of followers.
  2. American politics and right-wing media (red).  The Trump administration, right-wing media personalities (e.g., Tucker Carlson), but also some “mainstream” media organizations like CNN and NBC News because of how often they mention the President.  Interestingly, Barack Obama has the most followers in this community.  Is this a “community”?
  3. British news media (orange).  “…news media in the UK, with a significant proportion of engagement targeted at the Financial Times’ successful visualizations by reporter John Burn-Murdoch…” Unfortunately I’m unfamiliar with John Burn-Murdoch or why his visualizations are “successful.”  He is mentioned in several sections of the paper.
  4. Anti-mask network (teal).  A group of 2,500 users, of whom Alex Berenson (former NY Times reporter), Ethical Skeptic and Justin Hart play a major role.  Elon Musk (a vocal anti-masker) has the most followers.  The anti-maskers hate Governor Mike DeWine and The Atlantic because of their public health policies so these are also part of the “community.”  “These dynamics of intertextuality and citation within these networks are especially important here, as anti-mask groups often post screenshots of graphs from “lamestream media” organizations (e.g., New York Times) for the purpose of critique and analysis.”  Are they saying the other groups don’t do this? 
  5. New York Times-centric (green).  The authors describe this as an “artifact” of one viral Tweet from Andy Slavitt (former acting CMMS Administrator), which announced that the NY Times was suing the CDC and showed a bar chart of the racial disparity in COVID cases.
  6. World Health Organization (WHO) and related organizations (purple). 

Figure 3 on page 8 shows a series of what I can describe as “color splashes” which indicate the types and quantities of data visualizations by network community.  The figure title states, “While every community has produced at least one viral tweet, anti-mask users (group 6) receive higher engagement on average.” In looking at the engagement scores listed for each group, the anti-maskers rank 3rd (average engagement = 65), after American politics and media (blue) at 131 and British media at 94.

Page 9 includes Figure 4, which provides “sample counter-visualizations” of the anti-masker community.  The title includes the statement, “…While there are meme-based visualizations, antimaskers on Twitter adopt the same visual vocabulary as visualization experts and the mainstream media.” 

Table 1 (shown on page 8 and discussed on page 10) shows that the anti-masker network ranks second in re-tweets (indicating “insularity” according to the authors) and third in original tweets.

In section 4.1.4, at the bottom left of page 10, the authors state that Figure 3 shows that “…there is little variance between the types of visualizations that users in each network share: almost all groups equally use maps or line, area, and bar charts.”  In the same section at the top right of page 10 they state that anti-maskers “…use the most area/line charts and the least images across the six communities…”  I don’t follow how these two statements are both true.

In section 4.1.5, they repeat that the anti-maskers share data visualization forms that would commonly be seen at scientific conferences.  They seem very impressed by this.  But doesn’t content matter as well as form?

Section 4.2 describes the qualitative results of the “deep lurking” the authors did in the Facebook communities.  It’s clear that the authors feel the anti-maskers are engaging in “scientific reasoning.” In subsequent sections they describe the categories of how anti-maskers discuss COVID-19 data sources.

4.2.1 (page 11).  The authors state that anti-maskers prioritize using original data to make their own graphs, because they do not trust graphs created by academia and the mainstream media.  Apparently the anti-maskers are able to obtain pandemic-related data for smaller geographic regions that is not easily available to the public.  They use it to create their own graphs and charts.  From the authors’ statements it sounds like some anti-maskers want “township” level data.  [Note: a public health department may not want to provide data at this level in order to avoid identifying a COVID patient.]

4.2.2. (page 11) Anti-maskers often criticize the data collection methods of their sources.  They feel the government is not providing metrics that are important, and sometimes they argue about which metrics are important.  The anti-maskers feel the government is “manipulating” and/or “deliberately withholding” the data.  As in 4.2.1, it appears some anti-maskers want very detailed data that might reveal a patient’s identity (e.g., deaths per day in a county or township).  Given the authors’ comments, apparently the anti-maskers tried to get data from local health departments.  The anti-maskers question how the data are cleaned and coded, and state this is being done subjectively.  The authors’ statements indicate that the anti-maskers have assumed that the government is being purposefully deceitful.

[Parenthetically, as described by the authors, the anti-maskers never seem to consider that if the pandemic was over, the public health employees could stop working 16 hours a day, seven days a week for no additional pay (which they did for months on end in my state).  Has anyone thought of fatigue as a possible contributor to error in the data?]

4.2.3 (page 12) The anti-maskers discuss the best ways to visualize data and recognize that how it is presented will influence how someone interprets it.  They argue over whether it is better to present raw counts or proportions / number “per capita.” 
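The raw-counts-versus-per-capita argument is one place where the choice genuinely changes the picture, so a toy illustration (with invented numbers) may help:

```python
# Invented numbers: raw counts and per-capita rates can flip a comparison.
states = {
    "big_state":   {"cases": 50_000, "pop": 20_000_000},  # 2,500 per million
    "small_state": {"cases": 5_000,  "pop": 1_000_000},   # 5,000 per million
}
for name, s in states.items():
    per_million = s["cases"] / s["pop"] * 1_000_000
    print(f"{name}: {s['cases']:,} cases, {per_million:,.0f} per million")
# big_state has 10x the raw cases, but small_state has 2x the per-capita rate.
```

Neither view is “the” correct one; they answer different questions, which is presumably why the groups argue about it.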

4.2.4 (page 12).  The authors state that anti-maskers are self-aware and recognize their own biases (well, they give an example of one person who said that).  The anti-maskers also state that “big pharma” spins the data to increase their profits.

4.2.5 (page 12).  Some anti-maskers tout their academic credentials in order to convince others they have the appropriate expertise to create data visualizations and “criticize the scientific community.”

4.2.6 (pages 12-13).  The title of this section is “Developing expertise and processes of critical engagement.”  The authors state, “The goal of many of these groups is ultimately to develop a network of well-informed citizens engaged in analyzing data in order to make measured decisions during a global pandemic.”  They follow this with a quotation from a community member who says, “The other side says that they use evidence-based medicine to make decisions but the data and the science do not support current actions.”  The authors discuss how longer-tenured members teach new members how to create and analyze graphs, leading me to wonder whether the blind are leading the blind.  The authors state that the main importance of these efforts is to build social unity (not engage in scientific reasoning?)

This statement made me laugh out loud: “Some questions and comments would not be out of place at all at a visualization research poster session: ‘This doesn’t make sense. What do the colors mean? How does this demonstrate any useful information?’”  Is that what attendees say to the authors when they present their work at a data visualization conference? 

4.2.7 (page 13). This section discusses how some anti-masker groups have used their data visualizations to do things like sue the Ohio Department of Health and initiate an investigation in Texas.  In the Texas case, there was a “surge in positive test rates,” and it turned out this was due to a backlog of unaudited tests.  Once again, I can’t help but think of how public health departments are chronically underfunded and understaffed.

In reviewing section 4.2, there is little information presented by the authors that indicates how the anti-maskers are engaging in scientific reasoning, and if they are, whether they are doing it well.  Sections 4.2.2 and 4.2.3 come the closest, in that the anti-maskers express awareness that how data are collected, formatted and coded will influence the quality of the data.  Yet after criticizing the poor quality of the data from the government sources, they apparently use that same data to create their own graphs and charts.  If the data itself is highly questionable, how does creating another version of a graph or chart make the information “valid”?

Section 5: Discussion

The Discussion is essentially a paean to the “brave maverick anti-masker data visualizers” (to borrow Dr. Gorski’s phrase). Visualizationists?  Usually in a scientific paper you would briefly state your overall findings, present how these findings fit or don’t fit with other similar investigations, outline the strengths and weaknesses of your study, and then conclude with some potential next steps for your study’s research topic. (Sometimes these sections are in a slightly different order.)

The reader gets the sense that the “brave maverick anti-masker data visualizers” understand that scientific endeavor is a process while the “orthodox scientific establishment” does not.  Also, that there is no objective reality. 

On page 15, under section 6 (Implications and Conclusion) there is a paragraph on the right hand side which makes some sense to me.  They state: “Understanding how these groups skillfully manipulate data to undermine mainstream science requires us to adjust the theoretical assumptions in HCI research about how data can be leveraged in public discourse.”  The authors spent the whole paper talking about the scientific rigor of the anti-maskers, and how they critically evaluate their data sources, etc., etc., and now they are stating that the anti-maskers are manipulating data.