By Andy Podgurski and Sharona Hoffman
The COVID-19 pandemic presents special challenges to even well-informed and well-intentioned promulgators and consumers of medical research findings, including the legal community.
The stakes in the debate about handling the pandemic are extremely high in terms of lives, jobs, wealth, and political power. In addition, researchers who influence the debate stand to gain considerable attention and prominence.
All this means that perverse incentives exist to publicize initial scientific findings that are dubious, poorly vetted, and possibly dangerous to public welfare. The risk of promulgating false or misleading scientific claims is substantial, even when they are made by well-respected scientists affiliated with prestigious institutions. Government authorities must be extremely cautious about basing public policy decisions on inadequately vetted findings, no matter how much hype they get.
A case in point is a preprint of an unpublished paper by Stanford Professor Eran Bendavid and colleagues titled “COVID-19 Antibody Seroprevalence in Santa Clara County, California.” The study estimates the share of individuals in Santa Clara County, CA with antibodies to SARS-CoV-2 (the virus causing COVID-19) by administering antibody tests to a sample of Santa Clara County residents.
The unadjusted prevalence of antibodies to SARS-CoV-2 in the sample was 1.5%. The authors also reported a population prevalence estimate derived by reweighting the data to account for possible under-sampling or over-sampling of groups defined by sex, race, and zip code. This weighted prevalence estimate was 2.81%.
Since the antibody test kits used in the study were unlikely to be perfect, the authors also considered three scenarios for test performance (as characterized by sensitivity and specificity), obtaining population prevalence estimates that ranged from 2.49% to 4.16%. These estimates imply that between 48,000 and 81,000 people in Santa Clara County had been infected by SARS-CoV-2 by early April, which is 50 to 85 times the number of confirmed cases.
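To see why assumptions about test performance matter so much, consider a minimal sketch of one standard correction for imperfect tests, the Rogan-Gladen estimator. This is not necessarily the procedure the authors used, and the sensitivity and specificity pairs below are hypothetical values chosen for illustration, not the study's three scenarios. Only the 2.81% weighted rate comes from the preprint as described above.

```python
def rogan_gladen(observed_rate, sensitivity, specificity):
    """Adjust an observed positive rate for imperfect test accuracy.

    Standard Rogan-Gladen correction; the result is clipped to [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must perform better than chance")
    adjusted = (observed_rate + specificity - 1.0) / denom
    return min(max(adjusted, 0.0), 1.0)


# Weighted prevalence estimate reported in the preprint.
WEIGHTED_RATE = 0.0281

# ASSUMPTION: hypothetical (sensitivity, specificity) pairs for illustration
# only. They are not the scenarios analyzed by Bendavid et al.
ILLUSTRATIVE_SCENARIOS = [(0.80, 0.995), (0.80, 0.990), (0.80, 0.975)]

if __name__ == "__main__":
    for sens, spec in ILLUSTRATIVE_SCENARIOS:
        estimate = rogan_gladen(WEIGHTED_RATE, sens, spec)
        print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
              f"adjusted prevalence={estimate:.2%}")
```

Under these made-up inputs, lowering the assumed specificity by two percentage points shrinks the adjusted estimate several-fold, which illustrates how sensitive such estimates are to the accuracy figures plugged in.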
The authors argue that the population prevalence of SARS-CoV-2 antibodies in Santa Clara County implies that the infection is much more common than is indicated by the number of confirmed cases.
If this is true and if the results generalize to other regions, then the death rate due to COVID-19 is likely to be much lower than has been suggested by many experts. Naturally, this could have major implications for how governments should handle COVID-19 going forward.
These conclusions have been widely publicized in the media. It has been less well publicized, however, that scientists and statisticians have raised serious questions about the methodology and assumptions underlying the findings of Bendavid et al.
One issue concerns the representativeness of the sample of Santa Clara County residents used to estimate the population prevalence of antibodies to SARS-CoV-2. Instead of using a random sample of residents, the investigators recruited participants using Facebook ads. Those who responded to the ads and traveled to the testing sites may not be representative of the county population with regard to SARS-CoV-2 antibody prevalence.
Moreover, given the difficulty of getting tested for COVID-19, it is likely that individuals who suspected that they had contracted the illness were more likely to seek to participate in the study than other residents. Such self-selection by research subjects could seriously distort the prevalence estimates.
Critics of the study also note inconsistencies in the authors’ statistical analysis of the uncertainty of their estimates. These inconsistencies suggest that the results may be due to the antibody test kits misclassifying some individuals as having SARS-CoV-2 antibodies when in fact they do not. There are known problems with the accuracy of such tests.
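The concern about misclassification can be made concrete with simple arithmetic. The sketch below computes the positive rate one would expect from a test with a given sensitivity and specificity, and the share of positives that would be false. The accuracy and prevalence figures are hypothetical and are not taken from the study; only the 1.5% unadjusted rate is from the preprint as described above.

```python
def expected_positive_rate(prevalence, sensitivity, specificity):
    """Fraction of tests expected to come back positive."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives + false_positives


def false_positive_share(prevalence, sensitivity, specificity):
    """Fraction of positive results expected to be false positives."""
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    total = expected_positive_rate(prevalence, sensitivity, specificity)
    return false_positives / total if total > 0 else 0.0


# Unadjusted sample prevalence reported in the preprint.
RAW_RATE = 0.015

if __name__ == "__main__":
    # ASSUMPTION: hypothetical test with 98.5% specificity, used in a
    # population in which nobody actually has antibodies.
    rate = expected_positive_rate(0.0, 0.80, 0.985)
    print(f"expected positive rate with zero true prevalence: {rate:.2%}")
    print(f"raw rate observed in the sample: {RAW_RATE:.2%}")

    # ASSUMPTION: hypothetical 1% true prevalence and 99% specificity.
    share = false_positive_share(0.01, 0.80, 0.99)
    print(f"share of positives that are false: {share:.0%}")
```

In the first hypothetical case, a false positive rate of 1.5% would by itself produce a raw positive rate matching the one observed, even if no one in the sample had antibodies. This is why a small amount of uncertainty about specificity can dominate the result when true prevalence is low.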
The conclusions of the study by Bendavid et al. will no doubt be confirmed or refuted by other studies. However, this example illustrates the importance of maintaining some degree of skepticism about reported medical research findings (and scientific findings generally). We discuss the need for this approach in an article entitled “Big Bad Data: Law, Public Health, and Biomedical Databases.” To be considered reliable, such findings should both survive peer review and be widely discussed, replicated, and accepted by the relevant research communities.
As public pressure to reopen the country mounts and governors contemplate easing restrictions, they must not be enticed by convenient but unconfirmed research findings. Naively embracing encouraging findings without adequate scrutiny and critique could cost many lives.
Andy Podgurski is a professor in the department of computer and data sciences at Case Western Reserve University.
Sharona Hoffman is the Edgar A. Hahn Professor of Law, a professor of bioethics, and Co-Director of the Law-Medicine Center at Case Western Reserve University School of Law. For more information see https://sharonahoffman.com/.