This is the second post in Bill of Health’s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. We’ll have more contributions throughout the week, extending at least into early next week. Background on the symposium is here. You can call up all of the symposium contributions by clicking here (or by clicking on the “Re-Identification Symposium” category link at the bottom of any symposium post).
Please note that Bill of Health continues to have problems receiving some comments. If you post a comment to any symposium piece and do not see it within half an hour or so, please email your comment to me at mmeyer @ law.harvard.edu and I will post it. —MM
By Jen Wagner, J.D., Ph.D.
Before I actually discuss my thoughts on the re-identification demonstrations, I think it would be useful to provide a brief background on my perspective.
Identification≠identity
My genome is an identifier. It can be used in lieu of my name, my visible appearance, or my fingerprints to describe me sufficiently for legal purposes (e.g. a “Jane Doe” search or arrest warrant specifying my genomic sequence). Nevertheless, my genome is not me. It is not the gist of who I am, past, present, or future. In other words, I do not believe in genetic essentialism.
My genome is not my identity, though it contributes to my identity in varying ways (directly and indirectly; consciously and subconsciously; discretely and continuously). Not every individual defines himself or herself the way I do. There are genomophobes who may shape their identity in the absence of their genomic information, and even in denial of or contradiction to it. Likewise, there are genomophiles who may shape their identity with considerable emphasis on their genomic information, in the absence of non-genetic information (such as genealogies and origin beliefs), and even in denial of or contradiction to it.
My genome can tell you probabilistic information about me, such as my superficial appearance, health conditions, and ancestry. But it won’t tell you how my phenotypes have developed over my lifetime or how they may have been altered (e.g. the health benefits I noticed when I became vegetarian, the scar I earned when I was a kid, or the dyes used to hide the grey hairs that seem proportional to time spent on the academic job market). I do not believe in genetic determinism. My genomic data is of little research value without me (i.e. a willing, able, and honest participant), my phenotypic information (e.g. anthropometric data and health status), and my environmental information (e.g. data about my residence, community, and life exposures). Quite simply, I make my genomic data valuable.
As a PGP participant, I did not detach my name from the genetic data I uploaded into my profile. In many ways, I feel that the value of my data is maximized and the integrity of my data is better ensured when my data is humanized.
Delusions of De-identification
Linguistically, the prefix “re” indicates an action happening again or with backward motion (e.g. repeat, return, restore, regenerate, reproduce, retrace, redefine). To discuss the potential re-identification of data (genomic or otherwise) and the corresponding risks, we must first think critically about the possibility of de-identification.
The Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) lists 18 identifiers that can be stripped to “de-identify” the data. Genomes, however, are themselves identifiers and cannot be de-identified: the sequences are unique to the individuals from whom they were derived. Genomic data can be unlinked, that is, detached from other identifiers such as an individual’s name, and that unlinking may make the individual more comfortable participating in the PGP and other research endeavors. By and large, though, de-identification is a delusion.
I attended the GET conference this year and visited the Data Privacy Lab’s re-identification table. I was not surprised at all to learn that simply providing my sex, birthdate, and zip code would allow anyone to identify me easily. (As previous blog posts have explained, no PGP participant should be surprised by this.) Again, I have made no attempts to detach identifiers from my PGP profile, so the risk of “re-identification” is a non sequitur. Moreover, the identifying feature of genomes is not inherently negative: identification applications have a number of benefits that are not yet being realized to their fullest potential (e.g. missing persons, human trafficking, family reunifications).
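To make the point about sex, birthdate, and zip code concrete, here is a minimal Python sketch of how one might measure how many records in a dataset are unique on that combination. It is an illustration only, not the Data Privacy Lab’s method: the records are invented toy values, and the names `records` and `unique_fraction` are hypothetical.

```python
from collections import Counter

# Hypothetical toy records: (sex, birthdate, zip code).
# Every value here is invented for illustration.
records = [
    ("F", "1980-03-14", "16801"),
    ("F", "1980-03-14", "16801"),
    ("M", "1975-11-02", "02138"),
    ("F", "1990-07-21", "02138"),
    ("M", "1975-11-02", "16801"),
]


def unique_fraction(rows):
    """Return the share of rows whose combination of fields appears exactly once."""
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)


fraction = unique_fraction(records)
print(f"{fraction:.0%} of toy records are unique on sex, birthdate, and zip code")
```

A record that is unique on these three fields can be matched to any other list that pairs the same three fields with a name, which is why detaching a name alone offers so little protection.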
On the Re-identification Demonstrations
If the purpose of the re-identification demonstrations was to highlight the ease of identifying members of the PGP, that purpose could have been fulfilled without specifically attempting to identify members. The PGP consent process, which is thorough and documented, addresses confidentiality and anonymity (or lack thereof) in Article IX. The relevant part to consider for these re-identification demonstrations is Item 9.2 (emphasis added):
9.2 Association of Your Name With Your Data. The PGP will not intentionally associate your name with your genomic or trait data or other information that is published to the PGP’s public website and database or otherwise intentionally identify you as a participant in the PGP without your prior consent. However, as described above, because of the identifiable nature of the information you provide to the PGP, as well as the nature of the data and analyses generated by the PGP, it is possible that one or more third parties may identify you as a participant in the study. This may result in the association of your published data and other information with your name or other information that you have not provided to the PGP and may not have wished to be publicly disclosed.
The demonstrations were unmistakably intentional attempts to associate participants with their data; however, the demonstration results themselves did not show names or PGP participant IDs. Moreover, each visitor to the demonstration table gave implicit consent to the identification attempt by voluntarily entering his or her sex, birthdate, and zip code. Technically, in my opinion, the re-identification demonstrations have not gone beyond the informed consent given by PGP participants and the subsequent implicit consent given by those who participated at the demonstration table.
Rather than focusing on whether it was controversial or unethical for the demonstrations to occur, I would be more interested in exploring the necessity and efficacy of such demonstrations in conveying just how delusional the concept of “de-identification” is in a research context using any type of data. Could broad recognition of the delusion of de-identification facilitate meaningful reform in human research protection policies?