Artificial Intelligence

Channeling the Power of Generative Health AI: Implications for Health Care, Research, and Governance (Part II)

In Part I of our blog post, we explored how powerful new AI models, most recently exemplified by DeepSeek R1, are creating unprecedented opportunities for health care research and innovation. These systems process information much as human intelligence does. However, they lack the ability to simulate and test scenarios before acting, or to evaluate potential outcomes against complex ethical and social considerations. This distinction creates both extraordinary potential and significant challenges for health care implementation.

In this piece, we draw on these insights about the complementary nature of human and AI intelligence to identify three key lessons for effectively integrating AI into health care practice and policy, using examples from recent research to illustrate their impact.

1. Domain Experts as AI Pioneers

Generative AI models are not just predefined tools; they exhibit complex emergent capabilities that even their developers cannot fully predict. Since these models encode layers of human knowledge and preferences, their real-world potential can only be uncovered through hands-on exploration. No AI developer can definitively tell medical professionals what their model can achieve in specific health care contexts. Medical professionals are the true domain experts: they deeply understand the challenges and needs of their field and can identify valuable applications. This makes AI a powerful, untapped resource for clinicians, bioethicists, and researchers. Those ready to explore its possibilities can become pioneers in discovering and developing AI applications in their fields, regardless of their technical background.

Harnessing AI’s potential requires developing new practical skills distinct from both medical expertise and traditional programming. Our own research highlights this: We initially set out to train language models to write and reason like individual scholars. While the concept seemed straightforward, implementation revealed two critical insights. First, the system needs a minimum number of articles to work effectively: our model trained on 60 articles produced excellent results, while a model trained on only eight articles generated mostly gibberish. Second, output quality varied significantly. Initial results were often mediocre, and if we had relied on these early outputs to judge the project’s viability, we might have abandoned it. This would have been a shame, since with continued iteration and variations in prompting we found that we could produce remarkably sophisticated analyses that captured not just the writing style, but also the reasoning patterns of the authors.
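
To make this concrete, the sketch below illustrates one common way a set of articles can be turned into fine-tuning examples. The folder name, prompt text, and chunking scheme are purely illustrative, and the chat-style JSONL format is just one widely used convention; this is not a description of our actual pipeline.

```python
# A minimal, hypothetical sketch of turning a scholar's articles into
# fine-tuning examples. Paths, prompts, and chunk sizes are illustrative only.
import json
from pathlib import Path

ARTICLES_DIR = Path("articles")      # hypothetical folder of plain-text articles
OUTPUT_FILE = Path("finetune_data.jsonl")
CHUNK_CHARS = 4000                   # rough chunk size; real pipelines tune this

SYSTEM_PROMPT = (
    "You are a writing assistant that emulates the style and reasoning "
    "of a specific scholar."
)

def chunk_text(text: str, size: int) -> list[str]:
    """Split an article into roughly equal character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_examples() -> list[dict]:
    """Create chat-format training examples, one per article chunk."""
    examples = []
    for article in sorted(ARTICLES_DIR.glob("*.txt")):
        for chunk in chunk_text(article.read_text(encoding="utf-8"), CHUNK_CHARS):
            examples.append({
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": "Continue writing in the scholar's voice."},
                    {"role": "assistant", "content": chunk},
                ]
            })
    return examples

if __name__ == "__main__":
    examples = build_examples()
    with OUTPUT_FILE.open("w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex, ensure_ascii=False) + "\n")
    print(f"Wrote {len(examples)} training examples to {OUTPUT_FILE}")
```

Running a script like this produces a dataset file that can be inspected before any training is attempted; as noted above, the size and quality of the underlying corpus matter far more than the mechanics of formatting it.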

This means that, while we can generally envision how AI might help streamline IRB review, improve clinical trial design, facilitate regulatory compliance, and predict patient preferences, the key question isn’t whether AI can help with these challenges, but rather what specific conditions must be met for it to do so effectively. Each application will likely have its own practical (and ethical) requirements and constraints that can only be discovered through careful testing and continuous iteration.

2. Understanding Human-AI Interaction Dynamics and Relational Norms

A second important lesson is that empirical research must examine not just the models themselves, but also the psychological and social dynamics shaping how humans interact with them. As discussed in Part I, AI systems and humans have complementary strengths: AI excels at processing vast amounts of information quickly, while humans provide the contextual understanding and value judgments that AI currently lacks.

Given this relationship, understanding the human-AI interface becomes critical for multiple reasons. First, different configurations of this interface (varying in how much autonomy is granted to the AI, how human judgment is incorporated, and how decisions are ultimately made) can significantly affect both the quality of outputs and their reliability. Second, these different configurations create specific challenges regarding responsibility, credit, and accountability. For instance: Who bears responsibility for an adverse outcome when an AI suggests one treatment but a clinician chooses another? How should intellectual credit be distributed when research findings emerge from human-AI collaboration? How can we ensure accountability in rapidly evolving AI-assisted care models?

The human psychological response to AI — how clinicians, researchers, and patients understand, trust, and interact with these systems — directly impacts their effectiveness in health care. This understanding is essential for everyone involved: for practitioners who need to effectively use these systems, for developers designing interfaces that appropriately balance AI capabilities with human judgment, for administrators implementing these systems in health care settings, and for policymakers developing governance frameworks.

In a recent paper on relational norms in human-AI interaction, we argue that social roles deeply shape these dynamics. The study suggests that humans will likely evaluate AI behavior through relationship-specific cooperative functions — including care, transaction, hierarchy, and mating — with different norms applying to different roles. For instance, an AI system providing emotional support would be evaluated differently when acting as a friend-like companion (where non-contingent care norms dominate) versus a professional therapist (where care is provided on a fee-for-service basis, introducing transaction norms alongside care norms). This demonstrates why interface design must account not just for the AI’s processing capabilities, but for how well it supports humans in applying appropriate relationship-specific world models and value systems.
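
To show how such relationship-specific norms could be made explicit in interface design or evaluation tooling, the toy sketch below encodes hypothetical role profiles as weights over cooperative functions. The roles, weights, and helper names are illustrative assumptions, not part of the cited paper.

```python
# A toy illustration of representing relationship-specific cooperative functions
# when configuring or evaluating an AI system's role. Weightings are hypothetical.
from dataclasses import dataclass
from enum import Enum

class CooperativeFunction(Enum):
    CARE = "care"
    TRANSACTION = "transaction"
    HIERARCHY = "hierarchy"
    MATING = "mating"

@dataclass
class RoleProfile:
    """Relative weight of each cooperative function for a given AI role."""
    role: str
    norm_weights: dict[CooperativeFunction, float]

# Hypothetical profiles: a companion is evaluated mainly against care norms,
# while a professional therapist adds transaction norms (fee-for-service care).
COMPANION = RoleProfile(
    role="friend-like companion",
    norm_weights={CooperativeFunction.CARE: 0.9, CooperativeFunction.TRANSACTION: 0.1},
)
THERAPIST = RoleProfile(
    role="professional therapist",
    norm_weights={CooperativeFunction.CARE: 0.6, CooperativeFunction.TRANSACTION: 0.4},
)

def dominant_norm(profile: RoleProfile) -> CooperativeFunction:
    """Return the cooperative function that most shapes evaluation of this role."""
    return max(profile.norm_weights, key=profile.norm_weights.get)

if __name__ == "__main__":
    for profile in (COMPANION, THERAPIST):
        print(f"{profile.role}: dominant norm = {dominant_norm(profile).value}")
```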

3. Developing Informed and Adaptive Governance Frameworks

The third key lesson stems from human-AI complementarity: While AI excels at information processing, humans provide world models and value systems. Governance and compliance frameworks, such as the EU AI Act, should actively enhance this complementarity through mechanisms like regulatory sandboxes (controlled environments that allow for supervised experimentation) or AI-driven compliance solutions that integrate human judgment with AI capabilities.

As outlined in recent guidelines on the ethical use of generative AI in scholarship, effective governance must balance two fundamental considerations: outcome goods, relating to primary objectives like scientific progress, and process goods, concerning how those outcomes are achieved and credited. This distinction helps identify essential criteria for ethical and legally compliant AI deployment — including meaningful human oversight, substantial intellectual contribution beyond basic prompting, and appropriate transparency about AI use. These requirements take on added significance when we understand that humans aren’t simply checking AI work, but rather contributing essential cognitive capabilities that AI systems fundamentally lack.

Moreover, as our research on relational norms suggests, these governance frameworks must adapt not only to evolving technical capabilities but also to the varying social and relational contexts in which AI systems operate. Equally important is ensuring compliance with emerging legal frameworks such as the EU AI Act, FDA regulations, and other jurisdiction-specific requirements that govern AI in health care. The rapidly developing nature of the technology, combined with its deployment across different types of human-AI relationships, means that governance approaches must be both principled and flexible.

Conclusion

These three lessons — empowering domain experts, understanding interaction dynamics, and developing adaptive governance — are essential for health care and research, where both the potential benefits and risks are profound. The key to success lies not in replacing human judgment with AI, but in integrating AI’s processing power into well-designed systems that enhance, rather than diminish, the expertise, judgment, and values that are critical to ethical and effective decision-making.


Acknowledgment: This article was made possible through the generous support of the Novo Nordisk Foundation (NNF) via a grant for the scientifically independent Collaborative Research Program in Bioscience Innovation Law (Inter-CeBIL Program – Grant No. NNF23SA0087056).