South Africa’s Pragmatic Turn: Pseudonymized Data and the Future of Health Research
A recent South African court judgment on pseudonymized school examination results has quietly opened a new and unexpected front in global debates about data protection and health research. While the case had nothing to do with medicine or genomics, its reasoning departs in important ways from dominant U.S. and European approaches to identifiability — with potentially far-reaching consequences for how pseudonymized health data may be shared. To appreciate why this matters, it is worth briefly situating South Africa’s emerging approach alongside the two paradigms that currently dominate global health data governance.
The United States’ Health Insurance Portability and Accountability Act (HIPAA) of 1996 exemplifies a sector-specific, rule-based paradigm, focusing on “de-identification” of protected health information (PHI). Under HIPAA, data ceases to be PHI if it meets one of two tests: the Safe Harbor method, which mandates the removal of 18 specific identifiers (e.g., names, dates finer than year, geographic details below state level, and unique codes like Social Security numbers), or the Expert Determination method, where a qualified statistician certifies a “very small” risk of re-identification using scientific principles. This prescriptive approach provides clear compliance pathways but is limited to health data, reflecting America’s fragmented, industry-tailored privacy regime.
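To make the rule-based paradigm concrete, the sketch below shows what Safe Harbor-style scrubbing can look like in code. It is a minimal illustration under stated assumptions, not a compliance tool: the field names are hypothetical, it covers only a handful of the 18 identifier categories, and real de-identification would require handling all of them (or an expert determination).

```python
# Minimal sketch of a HIPAA Safe Harbor-style scrubber.
# Field names and the identifier subset are illustrative assumptions only.

from datetime import date

# A few of the 18 Safe Harbor identifier categories, mapped to hypothetical field names.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "phone", "email", "ssn",
    "medical_record_number", "health_plan_id", "device_serial",
}

def safe_harbor_scrub(record: dict) -> dict:
    """Drop direct identifiers and coarsen dates to year only."""
    scrubbed = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # remove the identifier outright
        if isinstance(value, date):
            scrubbed[field] = value.year  # keep only the year, per Safe Harbor
        else:
            scrubbed[field] = value
    return scrubbed

if __name__ == "__main__":
    patient = {
        "name": "Jane Doe",
        "ssn": "123-45-6789",
        "date_of_birth": date(1984, 7, 3),
        "diagnosis_code": "E11.9",
        "province": "Gauteng",
    }
    print(safe_harbor_scrub(patient))
    # {'date_of_birth': 1984, 'diagnosis_code': 'E11.9', 'province': 'Gauteng'}
```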
In contrast, the European Union’s General Data Protection Regulation (GDPR), effective since 2018, embodies a broader, principles-based framework applicable to all personal data, including health information classified as “special category” data. Here, “anonymization” renders data non-personal if it no longer relates to an identifiable natural person, assessed via a contextual, risk-based lens: whether identification is possible through all means reasonably likely to be used by the controller or third parties, factoring in time, cost, and technology (Recital 26). UK Information Commissioner’s Office guidance operationalizes identifiability through the “motivated intruder” test, which explicitly assumes a diligent actor seeking to re-identify individuals.
EU jurisprudence has, however, begun to acknowledge that identifiability may be relative to the recipient. In litigation between the Single Resolution Board and the European Data Protection Supervisor, the Court of Justice of the European Union clarified that pseudonymized data may not qualify as personal data for a recipient if re-identification is not reasonably likely for them, taking into account technical, organizational, and legal constraints. Although the case was ultimately withdrawn, it signaled an important (if limited) softening of Europe’s traditionally stringent position.
As Edgcumbe et al. show in their recent analysis in the Journal of Law and the Biosciences, HIPAA and the GDPR represent two distinct paradigms for rendering health data non-identifiable: HIPAA’s rule-based, ex ante certainty versus the GDPR’s risk-based assessment of identifiability. South Africa’s Protection of Personal Information Act (POPIA), enacted in 2013 and fully effective in 2021, clearly belongs to the same legislative family as the GDPR. Like European data protection law, POPIA is structured around conditions for lawful processing, and it defines “de-identification” as the deletion of information that identifies a person by a reasonably foreseeable method (section 1). This phrasing superficially echoes the GDPR’s “reasonably likely” standard.
This apparent alignment has now been tested in practice. In Minister of Basic Education v Information Regulator (High Court of South Africa, Gauteng Division, Pretoria, decided December 12, 2025), a full court unanimously overturned an enforcement notice from South Africa’s Information Regulator. The dispute arose from the longstanding practice of publishing Grade 12 results in newspapers, previously under learners’ names and later under exam numbers following an earlier court order. The Regulator argued that these numbers remained identifiable, especially within school communities where peers might deduce identities.
The court rejected this argument. It held that pseudonymized exam results do not constitute personal information under POPIA, dismissing the Regulator’s re-identification scenarios as “fanciful” and lacking real-world empirical support. The implicit test distilled from the judgment is pragmatic and low-threshold: data is personal only if an ordinary observer can identify a person without particular diligence or contrived assumptions, an inquiry grounded in everyday plausibility rather than theoretical possibility. In effect, the court excluded diligence-based reasoning from the foreseeability analysis altogether.
Although the court did not engage in comparative analysis, its approach is clearly incompatible with the motivated intruder test used to operationalize the GDPR’s identifiability standard. Identification must be possible “without more,” the court emphasized, rendering pseudonymized data non-personal and exempt from POPIA’s application. (True pseudonymization in the technical sense usually involves separate key storage; the exam-number scenario differs, but the legal question of identifiability is structurally similar.)
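For readers less familiar with the mechanics, the sketch below illustrates what pseudonymization with separate key storage typically involves. The function, field names, and token scheme are illustrative assumptions rather than a description of any particular study’s pipeline; the point is simply that shareable records carry random pseudonyms while the re-identification key stays with the data custodian.

```python
# Minimal sketch of pseudonymization with separate key storage.
# Names, fields, and file layout are illustrative assumptions, not a prescribed design.

import secrets

def pseudonymize(records: list[dict], id_field: str = "patient_id"):
    """Replace direct identifiers with random pseudonyms.

    Returns the pseudonymized records (to be shared) and the key table
    (to be held separately, under access control, by the data custodian).
    """
    key_table = {}  # pseudonym -> original identifier; never leaves the custodian
    shared = []
    for record in records:
        pseudonym = secrets.token_hex(8)
        key_table[pseudonym] = record[id_field]
        shared.append({**record, id_field: pseudonym})
    return shared, key_table

if __name__ == "__main__":
    cohort = [
        {"patient_id": "ZA-000123", "variant": "rs334", "phenotype": "sickle cell trait"},
        {"patient_id": "ZA-000456", "variant": "rs73885319", "phenotype": "CKD risk"},
    ]
    shareable, key = pseudonymize(cohort)
    # 'shareable' goes to collaborators; 'key' stays with the originating site.
```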
The Information Regulator has since applied for leave to appeal this ruling to the Supreme Court of Appeal. If leave is granted, the Supreme Court of Appeal could overturn or refine the decision. For now, however, the judgment binds single-judge benches of the High Court across South Africa and has immediate doctrinal consequences. By treating pseudonymized data as non-personal absent effortless, real-world identifiability, the judgment materially lowers barriers to secondary use and data sharing.
For health research — especially genomics, where pseudonymization is standard from the point of collection — this is potentially significant. While the case did not concern health data, its reasoning provides a framework for arguing that pseudonymized health datasets shared between researchers fall outside POPIA’s scope, provided re-identification would not occur in the ordinary course of research. This matters in Africa, where data sharing is essential for addressing diseases such as HIV and tuberculosis and for correcting global underrepresentation in genomic datasets.
The key point is not that South Africa has solved the problem of identifiability once and for all, but that it is moving decisively in the direction of a workable solution. This judicial move materially lowers barriers to sharing pseudonymized health data, at least for now, and offers a genuinely interesting contrast to dominant U.S. and European approaches. By focusing on re-identification risks in the ordinary course of business rather than on scenarios requiring particular diligence, such as the motivated intruder test, South Africa positions itself as a more permissive, innovation-friendly jurisdiction. Whether this approach endures on appeal remains to be seen. But for health researchers watching the global regulatory landscape, it is a development worth paying close attention to.