How to Govern AI in Healthcare: Safety, Agents, & Roles

Governing AI in Healthcare Is No Longer Optional

Healthcare organizations have spent the last two years asking whether AI is ready for clinical use. That is now the wrong question.

A more urgent one is this: Do your governance, safety, and operating models match the reality of modern AI?

In a recent discussion hosted by BMJ Future Health, physician and digital health leader Dr. Keith Grimes argued that AI in healthcare has already moved beyond the phase of curiosity and pilot enthusiasm. The real challenge is now operational: understanding how these tools behave, how they fail, who owns the risk, and what organizations must do before adoption scales.

For healthcare and cybersecurity leaders, this matters far beyond model performance. AI systems increasingly touch patient documentation, triage, translation, coding, workflow orchestration, and decision support. That means they also affect clinical safety, privacy, resilience, compliance, and trust.

This article distills the most important ideas from that discussion and adds context for healthcare delivery organizations, digital leaders, and risk owners trying to govern AI responsibly.

Key Takeaways

Prompt engineering is no longer the main differentiator. Context, data access, memory, and workflow design now matter more.
Governance cannot be outsourced. Even if a vendor supplies the tool, the deploying organization still owns local safety, workflow, and risk decisions.
AI should be governed like other sources of harm. Dr. Grimes frames this through four domains: clinicians, drugs, devices, and digital systems.
Agentic AI changes the risk profile. Tools that can plan, call other systems, and execute tasks create more value, but also more pathways to failure.
Clinical safety capability is uneven. The discussion cited UK evidence suggesting that many deployed digital health tools lacked documented compliance with relevant safety standards.
Shadow AI is a governance issue, not just a policy issue. If clinicians can build or use tools on their own, organizations need guardrails that acknowledge reality.
AI may expose existing operational weaknesses. Problems in triage, care navigation, translation, and inconsistent workflows can all be amplified by automation.
The right benchmark is not perfection. It is whether the AI-supported process is safer or better than the current state.
Practical education matters. Staff do not need to become AI engineers, but they do need to know how a tool works, how it fails, and how to escalate issues.
Start with low-risk use cases and structured oversight. Learn in bounded environments before moving into high-consequence clinical workflows.

The First Big Shift: Prompting Matters Less Than Context

One of the clearest points from the discussion was that common advice about AI has already aged.

Early in the generative AI era, a great deal of attention went to prompting: how to phrase instructions, assign roles, break down tasks, and coax better answers from large language models. That advice was useful at the time. But as models have improved, prompting has become less central.

What matters more now is context.

In practice, that means the quality of the information surrounding the model’s task:

the documents it can access
the memory or prior interaction history it can draw from
the workflow it sits inside
the data sources connected to it
the organizational constraints it must respect

For healthcare leaders, this is a crucial governance insight. Many AI risks do not stem from the base model alone. They come from the system around the model: what information it sees, what tools it can call, which outputs downstream users trust, and whether the context supplied is accurate and current.

This shift also has cybersecurity implications. Once organizations move from stand-alone chat interfaces to AI systems connected to repositories, EHR-adjacent tools, internal knowledge bases, or external APIs, the attack surface changes. Access control, data minimization, logging, and dependency mapping become far more important than clever prompt wording.
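
To make data minimization and logging concrete, here is a minimal Python sketch of a context filter that sits between a record store and a model. Everything named here is an assumption for illustration, not something described in the discussion: the task names, the field names, the ALLOWED_FIELDS table, and the minimize_context function are all hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai_context")

# Hypothetical allow-list: which record fields each AI task may receive.
ALLOWED_FIELDS = {
    "discharge_summary": {"age", "diagnoses", "medications"},
    "appointment_reminder": {"first_name", "appointment_time"},
}


def minimize_context(task: str, record: dict) -> dict:
    """Return only the fields the named task is approved to receive."""
    allowed = ALLOWED_FIELDS.get(task)
    if allowed is None:
        # Unknown tasks get nothing: deny by default.
        raise PermissionError(f"No approved context policy for task {task!r}")
    context = {k: v for k, v in record.items() if k in allowed}
    # Log field names only, never values, so the audit trail holds no PHI.
    log.info(
        "task=%s fields_sent=%s fields_withheld=%s",
        task, sorted(context), sorted(set(record) - allowed),
    )
    return context
```

The design choice worth noting is deny-by-default: a task with no approved policy raises an error instead of receiving the full record, and the log captures what was sent and withheld without storing the values themselves.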

The Second Big Shift: Governance Is Not Someone Else’s Problem

A central argument in the conversation was that healthcare has treated digital systems differently from other regulated sources of risk.

Clinicians understand governance around professional practice. They are also familiar with the controls surrounding drugs and, to a lesser extent, medical devices. But digital tools often arrive with an implicit assumption: if the software is available, somebody else must already have made it safe.

That assumption does not hold.

Dr. Grimes described a practical framing using four areas of healthcare governance:

Doctors and other licensed professionals
Drugs
Devices
Digital

The key point is not that all four should be managed identically. It is that digital systems deserve the same seriousness when they can contribute to patient harm, workflow failure, or loss of service.

This is especially relevant in AI deployments. A tool may not qualify as a regulated medical device in every jurisdiction, yet it can still create clinical risk if it:

inserts incorrect content into records
creates misleading summaries
fails silently during triage or translation
encourages overreliance by busy staff
routes sensitive data to inappropriate environments
degrades availability of core systems

For CISOs and CIOs, this is where cyber and clinical governance begin to converge. A resilient AI program is not just about secure procurement. It requires shared accountability across clinical leaders, digital teams, privacy officers, and security operations.

Clinical Safety Is Becoming a Front-Door Requirement

The discussion focused heavily on the UK’s clinical safety model, particularly the role of a Clinical Safety Officer and the use of digital clinical risk management standards. While the specifics referenced were UK-based, the underlying lesson is broader: AI governance needs named responsibility, structured hazard analysis, and ongoing monitoring.

Dr. Grimes described clinical safety in straightforward terms:

What does the technology do?
Where will it be used?
How could it go wrong?
What would the consequences be?
What controls reduce the risk?
How will you know if the assumptions stop being true?

That is a useful operating model for any healthcare organization, even where local regulation differs.
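
One way to keep those six questions from staying abstract is to record each identified hazard in a structured form. The Python sketch below is an assumption-laden illustration, not the format of any particular standard: the Hazard class, its field names, and the 1 to 5 severity and likelihood scales are all hypothetical, and a real program would substitute its own risk matrix.

```python
from dataclasses import dataclass, field


@dataclass
class Hazard:
    what_it_does: str        # What does the technology do?
    setting: str             # Where will it be used?
    failure_mode: str        # How could it go wrong?
    consequence: str         # What would the consequences be?
    severity: int            # illustrative scale: 1 (minor) to 5 (catastrophic)
    likelihood: int          # illustrative scale: 1 (very unlikely) to 5 (very likely)
    controls: list = field(default_factory=list)  # What controls reduce the risk?
    monitoring_signal: str = ""  # How will we know the assumptions stopped being true?

    def score(self) -> int:
        """Simple severity x likelihood product; the scale is an assumption."""
        return self.severity * self.likelihood

    def gaps(self) -> list:
        """List the questions this entry has not yet answered."""
        missing = []
        if not self.controls:
            missing.append("no controls recorded")
        if not self.monitoring_signal:
            missing.append("no monitoring signal defined")
        return missing


hazard = Hazard(
    what_it_does="Drafts clinic notes from recorded consultations",
    setting="Outpatient clinic",
    failure_mode="Summary omits a stated drug allergy",
    consequence="Allergy missing from the record at next prescribing decision",
    severity=4,
    likelihood=3,
    controls=["Clinician reviews and signs every note"],
)
print(hazard.score(), hazard.gaps())
```

An entry with a non-empty gaps() result is a prompt for the review meeting: the hazard has been named, but the organization cannot yet say how it would notice the control failing.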

Why this matters beyond the UK

Many U.S. health systems already have mature structures for patient safety, device review, third-party risk, and change management. But AI can fall into the gaps between them.

For example:

A documentation assistant may not be treated like a device
A workflow copilot may not go through the same review path as clinical software
A departmental AI prototype may bypass enterprise architecture and security review
A vendor update may materially change model behavior without changing the product name

These are governance failures waiting to happen.

The discussion cited a UK study reporting that a large share of deployed digital health technologies lacked evidence of compliance with the relevant safety standards. Even if that does not prove the tools were unsafe, it does signal something serious: organizations often cannot demonstrate the rigor of their own deployment decisions.

That should concern boards, regulators, insurers, and incident responders alike.

What a Clinical Safety Officer Represents in Broader Terms

The discussion described a Clinical Safety Officer as a registered healthcare professional trained in clinical risk management who helps evaluate whether a digital tool is safe for its intended setting and monitors whether it remains safe over time.

Even if your organization does not use that title, the role translates well into a broader governance need:

someone with clinical credibility
someone who understands risk analysis
someone who can challenge unsafe assumptions
someone who bridges vendor claims and operational reality
someone accountable for ongoing surveillance, not just go-live approval

For healthcare organizations in the U.S., this role might sit with a CMIO office, digital quality leader, patient safety function, or multidisciplinary AI governance committee. The title is less important than the function.

The operational question is simple: Who, exactly, owns clinical risk when an AI-enabled workflow fails?

If the answer is vague, governance is immature.

Agentic AI Will Raise Both the Value Ceiling and the Risk Ceiling

Another major theme was the rise of agents.

The discussion defined agents as AI systems that do more than generate text. They can use tools, break tasks into steps, maintain state or memory, and sometimes coordinate multiple sub-agents to complete a goal.

That matters because healthcare is full of multi-step processes:

prior authorization workflows
inbox management
care coordination
documentation follow-up
scheduling and referrals
chart review and abstraction
revenue cycle tasks
patient navigation

Agentic AI can be powerful in these environments because it can move from answering questions to doing work.

But that creates a different category of risk.

How agents change governance requirements

A traditional chatbot may produce a bad answer. An agent may:

query the wrong source
mis-sequence a workflow
take an action based on stale assumptions
propagate an error across several connected systems
escalate privileges through tool access
create opaque chains of execution that are hard to audit

In cybersecurity terms, agentic systems are not just content generators. They are emerging as operational actors.

That means organizations need stronger controls around:

identity and access management
action authorization
least-privilege tool connections
auditability and replay
testing of edge cases and failure modes
rollback and human intervention points
vendor change notification and model drift monitoring
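
Several of these controls (action authorization, least-privilege tool connections, auditability, and human intervention points) can be sketched as a single gateway that every agent tool call passes through. The Python below is a minimal illustration under stated assumptions: the agent name, the tool names, the POLICIES table, and the ToolGateway class are all hypothetical and do not describe any specific product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ToolPolicy:
    needs_human_approval: bool


# Hypothetical policy table: agent -> tool -> policy. Anything absent is denied.
POLICIES = {
    "referral_agent": {
        "read_schedule": ToolPolicy(needs_human_approval=False),
        "book_appointment": ToolPolicy(needs_human_approval=True),
    }
}


@dataclass
class ToolGateway:
    audit_log: list = field(default_factory=list)

    def authorize(self, agent: str, tool: str, approved_by: Optional[str] = None) -> bool:
        """Decide whether the agent may call the tool, and record the decision."""
        policy = POLICIES.get(agent, {}).get(tool)
        if policy is None:
            decision = "denied"              # least privilege: not listed, not allowed
        elif policy.needs_human_approval and approved_by is None:
            decision = "pending_approval"    # human intervention point
        else:
            decision = "allowed"
        # Every decision is logged, including refusals, to support later audit.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "tool": tool,
            "approved_by": approved_by,
            "decision": decision,
        })
        return decision == "allowed"
```

In this sketch, calling authorize("referral_agent", "book_appointment") returns False until a named person is supplied as approved_by, and a request for any tool missing from the table is refused and still logged. Replay, rollback, and drift monitoring would need additional mechanisms that this example does not cover.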

If copilots were the first wave of AI in healthcare, agents may become the second. Many organizations are not yet set up to govern them.

Healthcare Should Stop Comparing AI to an Imagined Perfect Process

One of the most useful arguments in the discussion concerned evaluation.

Healthcare often rejects AI tools because they are not perfect. But many current workflows are neither consistent nor well measured. If an AI system performs a task correctly 90% of the time, the real comparison should not be to an idealized human process. It should be to the actual quality of the current state.

That is uncomfortable, because healthcare organizations often do not have strong baseline measurements for messy operational processes such as:

translation quality in time-pressured situations
referral routing consistency
call-center triage accuracy
note quality under workload stress
patient message prioritization
care navigation for patients with multimorbidity

This creates a blind spot. AI can expose underperformance that organizations have tolerated because it was manual, distributed, or hidden inside workflow variability.

A better evaluation lens

Instead of asking, "Is the AI flawless?" ask:

What is the current process performance?
Where does harm or delay occur today?
What kinds of errors does the AI introduce?
Which errors are more detectable or recoverable?
What controls make the new process safer than the old one?

This does not lower the bar. It makes the comparison honest.
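
A short worked example shows why detectability belongs in the comparison alongside raw error rates. The Python sketch below uses invented numbers purely for illustration; none of the figures come from the discussion, and the residual_error_rate function is a hypothetical simplification that treats error and detection as independent.

```python
def residual_error_rate(error_rate: float, detection_rate: float) -> float:
    """Share of cases where an error occurs and is NOT caught before it matters."""
    if not (0 <= error_rate <= 1 and 0 <= detection_rate <= 1):
        raise ValueError("rates must be between 0 and 1")
    return error_rate * (1 - detection_rate)


# Illustrative, made-up figures for a message-prioritization workflow.
current = residual_error_rate(error_rate=0.15, detection_rate=0.40)
with_ai = residual_error_rate(error_rate=0.10, detection_rate=0.70)

print(f"current process: {current:.1%} of cases carry an uncaught error")
print(f"AI-supported:    {with_ai:.1%} of cases carry an uncaught error")
# prints 9.0% for the current process and 3.0% for the AI-supported one
```

The point of the arithmetic is that a process can win or lose on either factor. An AI-supported workflow with a similar error rate but much better detection may still be safer, and one with a lower error rate but errors that are harder to spot may not be. Neither conclusion is available without a measured baseline.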

For governance teams, that means AI review should include not just validation of the tool, but assessment of the legacy process it may replace or augment.

AI Will Amplify Existing Weaknesses Unless Workflows Improve First

A participant in the discussion raised a critical point: AI tends to amplify existing strengths and weaknesses.

That observation should shape every healthcare AI roadmap.

If your current triage logic is inconsistent, automating it can spread inconsistency faster. If your translation process is already unsafe, AI may either improve it or worsen it depending on controls. If your documentation workflow rewards speed over verification, ambient AI may magnify that tension rather than solve it.

This is why AI governance should not begin with software selection alone. It should begin with workflow scrutiny.

Ask:

What problem are we actually solving?
Is the current process well understood?
Where are the unsafe workarounds?
What assumptions do frontline staff make under time pressure?
Which parts of the workflow are stable enough to automate?
Where does human judgment remain essential?

The discussion also warned against merely digitizing current dysfunction. That is a common enterprise mistake. AI should do more than help organizations run the same broken process faster.

Shadow AI and Builder Culture Are Now Part of the Risk Landscape

One of the most pragmatic parts of the discussion addressed clinicians and staff who are already experimenting with AI tools or building lightweight applications themselves.

This is not hypothetical.

Modern AI has lowered the technical barrier to creating calculators, bots, workflow helpers, and internal tools. That is exciting because frontline staff often understand operational pain better than anyone else. But it also means healthcare organizations face a new reality: innovation can now happen outside traditional procurement and SDLC channels.

That creates several risks:

unapproved use of patient data
insecure API connections
unknown hosting environments
poor validation of outputs
no version control or change oversight
informal tool sharing between departments
absent incident reporting pathways

From a cybersecurity and compliance perspective, shadow AI should be treated similarly to shadow IT, but with added concern for clinical interpretation and patient harm.

What leaders should do instead of issuing blanket bans

Total prohibition rarely works when tools are cheap, accessible, and obviously useful. A better strategy is controlled enablement:

provide approved sandbox environments
publish clear rules for acceptable use
prohibit real patient data in unapproved tools
create lightweight review pathways for prototypes
support secure low-code/no-code experimentation
define when a prototype becomes a governed application
require documentation of intended use and known limitations

This approach preserves innovation while reducing hidden risk.

Regulation Is Maturing, but It Still Lags the Pace of Change

The discussion acknowledged substantial uncertainty in AI regulation, particularly around device status, cross-border deployment, and how quickly frameworks can adapt to changing models.

That uncertainty has real consequences. Vendors may delay or avoid certain markets. Health systems may hesitate to invest. Useful tools may remain inaccessible because the rules are not yet clear enough for manufacturers or buyers to act confidently.

Still, there was cautious optimism in the discussion: regulatory approaches appear to be becoming more adaptive, more proportionate, and more focused on post-deployment monitoring.

For healthcare leaders, that implies two things.

1. Compliance alone will not be enough

A tool can be legally marketable and still be poorly integrated, overtrusted, or insecure in your environment. Governance needs to cover the full deployment lifecycle, not just pre-purchase review.

2. Post-market oversight becomes more important

As AI models and features evolve quickly, static approval snapshots lose value. Organizations need mechanisms to monitor:

model or feature updates
changes in intended use
shifts in output quality
adverse events and near misses
user behavior changes
access pattern anomalies
downstream workflow effects

This is where AI governance starts to resemble cyber resilience: continuous monitoring matters more than one-time assurance.
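
As one hedged illustration of continuous monitoring, the Python sketch below compares a current reporting period against an agreed baseline and lists reasons to trigger a review. The WeeklySnapshot class, the review_triggers function, the use of clinician correction rate as a quality proxy, and the 0.05 tolerance are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class WeeklySnapshot:
    model_version: str       # version string reported by the vendor
    outputs_reviewed: int    # AI outputs checked by staff this period
    outputs_corrected: int   # of those, how many needed correction

    @property
    def correction_rate(self) -> float:
        if self.outputs_reviewed == 0:
            return 0.0
        return self.outputs_corrected / self.outputs_reviewed


def review_triggers(baseline: WeeklySnapshot, current: WeeklySnapshot,
                    tolerance: float = 0.05) -> list:
    """Return the reasons, if any, that this period should prompt a safety review."""
    triggers = []
    if current.model_version != baseline.model_version:
        triggers.append("vendor model version changed")
    if current.outputs_reviewed == 0:
        triggers.append("no outputs reviewed this period")
    elif current.correction_rate > baseline.correction_rate + tolerance:
        triggers.append("correction rate above baseline tolerance")
    return triggers


baseline = WeeklySnapshot("v1", outputs_reviewed=200, outputs_corrected=10)
this_week = WeeklySnapshot("v2", outputs_reviewed=180, outputs_corrected=27)
print(review_triggers(baseline, this_week))
```

Note that a period with no reviewed outputs is itself treated as a trigger. A silent drop in human checking can hide degraded quality just as effectively as a model change can cause it.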

What Clinicians and Frontline Staff Actually Need to Know

A valuable practical theme in the discussion was that not every healthcare professional needs deep technical expertise. But they do need enough fluency to use AI safely.

That minimum threshold includes:

knowing what the tool does
understanding its intended use
recognizing common failure modes
verifying outputs before acting on them
explaining the tool to patients when appropriate
understanding consent implications
knowing how and where to report problems

This is a manageable training agenda. It is also one many health systems have not yet formalized.

For CIOs, CISOs, and Chief AI Officers, this suggests that AI readiness should include workforce safety training, not just technical implementation planning.

A clinician using an ambient documentation tool, for example, should not simply be told it will save time. They should know what kinds of transcription or summarization errors to look for, what must be verified, and what to do when the tool performs unexpectedly.

A Practical Starting Point for Organizations

Near the end of the discussion, Dr. Grimes offered a simple recommendation: learn AI first in lower-risk, nonclinical settings.

That advice is more strategic than it may sound.

People understand AI best through use. But learning directly in high-consequence clinical environments creates avoidable risk. A safer path is to build familiarity through administrative, educational, research-adjacent, or personal productivity tasks before applying the same concepts to patient-facing workflows.

For organizations, the equivalent is a phased maturity model:

Phase 1: Safe Familiarization

Use AI in bounded, low-risk tasks such as summarizing policies, drafting internal documents, or organizing nonclinical information.

Phase 2: Controlled Operational Use

Move into governed administrative workflows with strong human oversight, such as back-office assistance, coding support, or staff-facing search.

Phase 3: Clinical Augmentation

Adopt AI in patient-adjacent scenarios where verification is clear and human review remains central.

Phase 4: Higher-Autonomy Workflows

Consider agentic or semi-autonomous systems only when governance, monitoring, and recovery processes are mature.

This phased approach aligns well with both patient safety and cyber resilience principles.

Questions Every Healthcare AI Governance Committee Should Be Asking

To operationalize the themes from the discussion, healthcare leaders should pressure-test their programs against a few core questions:

Governance and ownership

Who owns AI safety after procurement?
Who can stop deployment if risk controls are weak?
Is there a named clinical risk owner?

Security and privacy

What data can the tool access, retain, or transmit?
Are tool-to-tool connections controlled and logged?
What prevents inadvertent PHI leakage?

Operational design

What current workflow is being changed?
What baseline performance are we comparing against?
Where must human review remain mandatory?

Monitoring and incident response

How do users report AI-related near misses?
How are vendor changes tracked and assessed?
Can the organization detect drift, misuse, or degraded outputs?

Workforce readiness

Do frontline users understand failure modes?
Can they explain the tool’s role to patients?
Have they been trained on escalation procedures?

If these questions do not yet have clear answers, scaling AI will outpace governance.

Conclusion: The Real AI Challenge in Healthcare Is Organizational, Not Just Technical

The most important message from this discussion is that AI governance in healthcare is not primarily a model selection problem. It is an organizational competence problem.

Healthcare now needs stronger ways to:

assign responsibility
evaluate risk proportionately
monitor digital tools continuously
train users realistically
govern builders as well as buyers
compare AI against actual, not idealized, care processes

AI will keep improving. Agents will become more capable. Regulation will continue to evolve. But none of that removes the local responsibilities of the healthcare organization deploying the tool.

The institutions that benefit most from AI are unlikely to be the ones that move fastest without controls. They will be the ones that build repeatable, cross-functional governance that treats patient safety, cyber resilience, workflow design, and clinical accountability as part of the same system.

That is the real work now.

Source: "AI in Healthcare 2026: Governance, Clinical Safety & AI Agents Explained" - BMJ Future Health, YouTube, May 14, 2026 - https://www.youtube.com/watch?v=evQg-IVRZ1o