
Post-Market Surveillance for Health AI

Post-market surveillance (PMS) for health artificial intelligence (AI) represents a fundamental shift in how medical safety and performance are ensured throughout the lifecycle of a device. Unlike traditional hardware medical devices, which may remain stable for years once manufactured, health AI systems are characterized by adaptive behavior, continuous learning, and potential concept drift. This reality necessitates a surveillance framework that is as dynamic as the technology it governs. For professionals developing, deploying, or maintaining AI systems in the European healthcare ecosystem, understanding the interplay between the Medical Device Regulation (MDR), the AI Act, and national competent authority practices is not merely a compliance exercise; it is a prerequisite for safe clinical integration.

The regulatory landscape in Europe is currently defined by the transition from the Medical Device Directive (MDD) to the Medical Device Regulation (MDR, Regulation (EU) 2017/745), which became fully applicable in May 2021, and by the Artificial Intelligence Act (AI Act, Regulation (EU) 2024/1689), which entered into force in August 2024 and whose obligations for high-risk AI systems apply in stages through 2026 and 2027. Health AI typically falls under the definition of a medical device (or an accessory to one) and, depending on its risk profile, will likely be classified as a high-risk AI system. Consequently, manufacturers face a dual-layered surveillance obligation: one rooted in the specific safety requirements of medical devices and another in the horizontal governance of AI risks.

The Regulatory Foundation: MDR and AI Act Synergy

To understand PMS for health AI, one must first map the regulatory scope. The MDR governs the safety and performance of medical devices throughout their lifecycle. The AI Act regulates AI systems based on their risk to health, safety, and fundamental rights. For health AI, these two frameworks overlap significantly.

Under the MDR, PMS is defined as a systematic and proactive process to collect and analyze data. It is not merely reactive. The manufacturer must establish, document, and implement a PMS system tailored to the device. This is distinct from the older MDD approach, where PMS was often less formalized. Under the MDR, the level of PMS activity must be proportional to the class of the device (Class I, IIa, IIb, III). Most software as a medical device (SaMD) and health AI tools that diagnose, monitor, or treat will fall into Class IIa, IIb, or III, triggering rigorous PMS requirements.

The AI Act complements this by mandating a Post-Market Monitoring (PMM) system for high-risk AI systems. While the terminology is similar, the focus differs slightly. The MDR focuses on device safety and performance (e.g., technical features, clinical suitability), whereas the AI Act’s PMM focuses on the AI system’s compliance with the requirements set out in the Act (e.g., human oversight, robustness, cybersecurity) and the mitigation of risks that might emerge from the AI’s specific characteristics, such as bias or automation bias.

Defining the Scope: What Constitutes Health AI?

In practice, regulatory teams must first classify the product. A health AI system that analyzes medical images to assist in diagnosis is a medical device. Because such devices generally require Notified Body involvement under the MDR, they also qualify as high-risk AI systems under Article 6(1) of the AI Act. The PMS strategy must address both.

It is crucial to recognize that “health AI” is not a monolith. It ranges from clinical decision support systems (CDSS) to robotic surgery planning software and wearable monitoring algorithms. The PMS requirements will scale with the risk associated with the intended purpose. For instance, an AI predicting hospital readmission risks (indirect impact) may face a different PMS intensity than an AI controlling an infusion pump (direct, life-sustaining impact).

Post-Market Surveillance under the MDR: The Technical File Reality

The MDR requires the manufacturer to draw up a Periodic Safety Update Report (PSUR) for Class IIa, IIb, and III devices. This document is the cornerstone of PMS. It summarizes the results and conclusions of the analysis of PMS data, drawing on sources such as user feedback, scientific literature, and complaint data.

The PMS Plan and PMS Data

Before the PSUR exists, the manufacturer must have a PMS Plan. This plan details what data will be collected and how. For health AI, this data collection is complex. It includes:

  • Real-World Performance: How does the AI perform in the clinical environment compared to the validation dataset?
  • Software Updates: Every update to the AI model (even minor patches) must be evaluated for its impact on safety.
  • Input Data Quality: Drifts in the quality of input data (e.g., changes in scanner protocols for imaging AI) can degrade performance.

The MDR mandates that PMS shall be proportionate to the device’s risk class. However, for AI, “proportionality” does not mean “minimal.” It means the surveillance must be statistically powered to detect performance degradation.
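
To make "statistically powered" concrete, the sketch below flags degradation only when the confidence interval around observed field sensitivity falls entirely below the validated baseline. The baseline value, the review workflow, and the Wilson-interval approach are illustrative assumptions, not values from any specific PMS plan.

```python
"""Minimal sketch of a statistically grounded degradation check (assumes a
binary-output AI with retrospective ground truth on a sample of cases; the
baseline sensitivity of 0.92 is a hypothetical pre-market validation figure)."""
from math import sqrt

BASELINE_SENSITIVITY = 0.92   # hypothetical value from the validation report
Z = 1.96                      # ~95% confidence

def wilson_bounds(successes: int, trials: int, z: float = Z) -> tuple[float, float]:
    """Wilson score confidence interval for a proportion."""
    if trials == 0:
        return 0.0, 1.0
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    centre = p_hat + z ** 2 / (2 * trials)
    margin = z * sqrt(p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2))
    return (centre - margin) / denom, (centre + margin) / denom

def degradation_signal(true_positives: int, positives_reviewed: int) -> bool:
    """Flag when even the upper confidence bound on field sensitivity sits
    below the validated baseline, i.e. degradation is statistically credible."""
    _, upper = wilson_bounds(true_positives, positives_reviewed)
    return upper < BASELINE_SENSITIVITY
```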

Vigilance and Incident Reporting

Under the MDR, manufacturers have a strict obligation to report Serious Incidents to the National Competent Authorities (such as BfArM in Germany or ANSM in France). A serious incident is any incident that directly or indirectly led, might have led, or might lead to the death of a patient, user, or other person, a serious deterioration of a person's state of health, or a serious public health threat.

For health AI, defining a “serious incident” requires specific expertise. It is not just a hardware failure. It includes:

  • False Negatives/Positives: An AI missing a critical diagnosis (e.g., intracranial hemorrhage) that results in patient harm.
  • Automation Bias: A clinician over-relying on the AI’s output, leading to a wrong treatment decision. The manufacturer must assess if the system design contributed to this.
  • Cybersecurity Breaches: If a vulnerability in the AI system allows unauthorized access or manipulation that could affect the safety or performance of the device, this may be a reportable incident (and may separately trigger GDPR breach notification duties).

The timelines are tight: initial reports must be made immediately and no later than 2 days after awareness for a serious public health threat, no later than 10 days in the event of death or an unanticipated serious deterioration in health, and no later than 15 days for other serious incidents. This requires robust internal processes to triage software bugs versus safety incidents.
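
As an illustration of the triage such processes need, the sketch below maps an internal incident classification to the Article 87 reporting clocks described above. The class names and workflow are assumptions for illustration; the legal qualification of each event remains a regulatory and clinical judgment.

```python
"""Sketch of an internal triage helper mapping an event classification to
the MDR Article 87 reporting clocks. Category names are illustrative."""
from datetime import datetime, timedelta
from enum import Enum

class IncidentClass(Enum):
    SERIOUS_PUBLIC_HEALTH_THREAT = "serious public health threat"       # <= 2 days
    DEATH_OR_SERIOUS_DETERIORATION = "death or serious deterioration"   # <= 10 days
    OTHER_SERIOUS_INCIDENT = "other serious incident"                   # <= 15 days
    NON_SERIOUS = "non-serious (trend monitoring only)"                 # no Art. 87 clock

REPORTING_WINDOW_DAYS = {
    IncidentClass.SERIOUS_PUBLIC_HEALTH_THREAT: 2,
    IncidentClass.DEATH_OR_SERIOUS_DETERIORATION: 10,
    IncidentClass.OTHER_SERIOUS_INCIDENT: 15,
}

def reporting_deadline(awareness: datetime, classification: IncidentClass) -> datetime | None:
    """Latest submission date for the initial vigilance report, or None if
    the event is handled through trend analysis instead of Article 87."""
    days = REPORTING_WINDOW_DAYS.get(classification)
    return awareness + timedelta(days=days) if days is not None else None
```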

Post-Market Monitoring under the AI Act: The New Paradigm

The AI Act introduces the concept of Post-Market Monitoring (PMM). While the MDR looks at the device, the AI Act looks at the system’s behavior in the wild. The AI Act Article 72 requires providers of high-risk AI systems to establish a system for actively collecting information on the performance of the high-risk AI system throughout its lifecycle.

Monitoring for Algorithmic Drift and Bias

The AI Act explicitly addresses risks arising from bias and from systems whose behavior or operating environment changes after deployment. In a clinical setting, this is critical. A health AI model trained on data from one demographic or hospital system may perform poorly when deployed in another; this is a form of dataset (distribution) shift. When the relationship between inputs and outcomes itself changes over time, the phenomenon is known as concept drift.
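
A common, lightweight way to watch for shifts in input data is the Population Stability Index (PSI) between the validation reference and a recent window of live inputs. The sketch below is a minimal version of this idea; the monitored feature, the window, and the 0.2 alert threshold (a widely used rule of thumb) are assumptions, not regulatory values.

```python
"""Minimal PSI drift check for one monitored input feature (assumed to be a
continuous summary statistic such as image intensity or a lab value)."""
import numpy as np

def population_stability_index(reference: np.ndarray, live: np.ndarray, bins: int = 10) -> float:
    """PSI of the live distribution against quantile bins of the reference."""
    edges = np.unique(np.quantile(reference, np.linspace(0, 1, bins + 1)))
    live = np.clip(live, edges[0], edges[-1])     # keep out-of-range values in the edge bins
    ref_freq = np.histogram(reference, bins=edges)[0] / len(reference)
    live_freq = np.histogram(live, bins=edges)[0] / len(live)
    ref_freq = np.clip(ref_freq, 1e-6, None)      # avoid log(0)
    live_freq = np.clip(live_freq, 1e-6, None)
    return float(np.sum((live_freq - ref_freq) * np.log(live_freq / ref_freq)))

# Rule-of-thumb usage: PSI > 0.2 on any monitored feature opens a PMS investigation.
```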

PMM under the AI Act requires the manufacturer to monitor:

  • Unexpected behavior (hallucinations in Generative AI for health).
  • Changes in the prevalence of the target condition (e.g., during a pandemic); a minimal check for this is sketched after this list.
  • Feedback loops where the AI’s predictions influence the data it receives in the future.
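
For the prevalence point above, a minimal check is to compare the rate of positive AI outputs in a recent window against the rate seen during validation; a large deviation suggests the case mix has changed and the original performance estimates may no longer hold. The expected rate, window, and z-score threshold below are illustrative assumptions.

```python
"""Sketch of a prevalence-shift check using a normal approximation."""
from math import sqrt

def prevalence_shift_z(positives: int, n_cases: int, expected_rate: float) -> float:
    """Z-score of the observed positive-output rate versus the expected rate."""
    observed = positives / n_cases
    se = sqrt(expected_rate * (1 - expected_rate) / n_cases)
    return (observed - expected_rate) / se

# e.g. abs(prevalence_shift_z(410, 2_000, 0.15)) > 3 -> review before trusting
# downstream performance estimates, since the label mix has likely changed.
```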

Unlike the MDR’s PSUR, which is submitted periodically (annually for Class IIb/III, unless otherwise specified), the AI Act’s PMM system is a continuous internal process. However, the results of this monitoring feed into the Technical Documentation and must be available for audit by the Notified Body or market surveillance authorities.

The Role of the Notified Body vs. Market Surveillance Authorities

Under the AI Act, high-risk AI systems that are also medical devices undergo a conformity assessment procedure involving a Notified Body. This is a key distinction from most other high-risk AI use cases (such as those in employment or education), which generally rely on internal control procedures. For health AI, the Notified Body designated under the MDR will assess compliance with the AI Act requirements as part of the existing MDR procedure.

The Notified Body will scrutinize the PMS/PMM documentation to ensure the manufacturer is detecting and addressing algorithmic risks. If the manufacturer changes the AI model (e.g., retraining on new data), this may constitute a significant change under the MDR (or a substantial modification under the AI Act), potentially requiring a new conformity assessment.

Practical Implementation: The Monitoring Loop

For a manufacturer, setting up a compliant PMS/PMM system involves operationalizing data pipelines. It is not sufficient to wait for users to report complaints. The system must be active.

Data Sources for Health AI Surveillance

A robust surveillance strategy integrates multiple data streams:

  1. Clinical User Feedback: Structured channels for radiologists or doctors to flag “near misses” or “disagreements” with the AI.
  2. System Logs: Telemetry data showing inference times, confidence scores, and error rates (a minimal record schema is sketched after this list).
  3. External Data: Monitoring scientific literature for new evidence that might contradict the AI’s intended use (e.g., a new study showing a biomarker is no longer predictive).
  4. Cybersecurity Monitoring: Continuous scanning for vulnerabilities (CVEs) in the software stack.
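
One practical building block for streams 1 and 2 is a structured per-inference log record that the surveillance pipeline can aggregate. The sketch below shows one possible schema; all field names are assumptions chosen for illustration.

```python
"""Sketch of a per-inference telemetry record for PMS aggregation. The point
is that version, timing, confidence and override information are captured in
a structured, queryable form; field names are illustrative."""
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class InferenceLogRecord:
    case_id_pseudonym: str        # pseudonymised identifier, never raw patient data
    model_version: str            # ties the output to a specific certified model
    timestamp: datetime
    inference_ms: float           # latency, relevant for time-critical workflows
    confidence: float             # model score behind the displayed output
    output_label: str             # what the clinician actually saw
    clinician_override: bool      # disagreement signal feeding near-miss analysis
    input_quality_flags: tuple[str, ...] = ()   # e.g. unusual scanner protocol
```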

Handling “Near Misses” and “Close Calls”

In aviation safety, “near misses” are reported and analyzed to prevent accidents. Health AI requires a similar culture. While the MDR mandates reporting of serious incidents, the AI Act encourages the monitoring of near misses to identify systemic risks before they cause harm.

Manufacturers should establish an internal threshold for “near misses.” For example, if an AI suggests a contraindicated drug interaction that is caught by a pharmacist 99% of the time, the 1% where it is not caught is a risk. Analyzing these near misses helps refine the model or the user interface (UI) to reduce risk.
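
A minimal way to operationalize such a threshold is to track, per risk scenario, how often the hazardous suggestion is caught downstream and alert once the uncaught fraction exceeds a pre-defined limit. The minimum sample size and the 1% limit below are illustrative assumptions.

```python
"""Sketch of a near-miss catch-rate tracker for one risk scenario (e.g. a
contraindicated interaction suggested by the AI but caught by a pharmacist)."""
from collections import Counter

class NearMissTracker:
    def __init__(self, minimum_events: int = 50, max_uncaught_rate: float = 0.01):
        self.counts = Counter()
        self.minimum_events = minimum_events
        self.max_uncaught_rate = max_uncaught_rate

    def record(self, caught_downstream: bool) -> None:
        self.counts["caught" if caught_downstream else "uncaught"] += 1

    def needs_investigation(self) -> bool:
        total = sum(self.counts.values())
        if total < self.minimum_events:          # avoid alerting on statistical noise
            return False
        return self.counts["uncaught"] / total > self.max_uncaught_rate
```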

Corrective Actions and Field Safety Corrective Actions (FSCA)

When PMS or PMM identifies a risk, the manufacturer must act. This is where the regulatory process becomes tangible.

Field Safety Corrective Actions (FSCA)

If a device poses a risk to health, the manufacturer must initiate a Field Safety Corrective Action. This is not just a software update pushed over the air; it is a formal regulatory action involving a Field Safety Notice (FSN) sent to customers and authorities.

For health AI, an FSCA might involve:

  • Remote Deactivation: Disabling a specific algorithmic feature that is found to be biased.
  • Model Retraining: Requiring all instances of the AI in the field to be updated with a new model version.
  • Labeling Changes: Updating the Instructions for Use (IFU) to warn against specific input data types.

The distinction between a “software update” and a “corrective action” is vital. A corrective action addresses a safety defect. Under the MDR, if a manufacturer identifies a statistically significant increase in the frequency or severity of incidents that are individually not serious (trend reporting under Article 88), it must investigate and may need to issue an FSCA.
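
A simple trend check in this spirit compares the rate of non-serious incidents per use in the current reporting period against a baseline period and escalates when the ratio exceeds a pre-defined bound. The ratio threshold and minimum-count guard below are placeholders for the statistically justified method that belongs in the PMS plan.

```python
"""Sketch of a trend check in the spirit of MDR Article 88; thresholds are
assumptions, not regulatory values."""

def trend_signal(baseline_incidents: int, baseline_uses: int,
                 current_incidents: int, current_uses: int,
                 ratio_threshold: float = 1.5, min_incidents: int = 5) -> bool:
    """True when the current incident rate per use exceeds the baseline rate
    by more than the pre-defined ratio, given enough events to be meaningful."""
    if current_incidents < min_incidents or baseline_uses == 0 or current_uses == 0:
        return False
    baseline_rate = baseline_incidents / baseline_uses
    current_rate = current_incidents / current_uses
    if baseline_rate == 0:
        return True                      # incidents appearing where there were none
    return current_rate / baseline_rate > ratio_threshold
```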

Corrective Actions under the AI Act

The AI Act requires providers to take corrective actions to ensure compliance. If a market surveillance authority finds that a high-risk AI system is non-compliant (e.g., it exhibits bias not reflected in the technical documentation), it can require corrective action within a specific timeline. If the provider fails to act, the authority can restrict or prohibit the system’s availability, or require its withdrawal from the market or its recall.

For health AI, this is a high-stakes scenario. A withdrawal of a diagnostic tool can disrupt hospital workflows significantly. Therefore, the PMS system must be sensitive enough to detect compliance gaps before the authorities do.

National Nuances: The European Patchwork

While EU Regulations (MDR and AI Act) are directly applicable, their enforcement and the infrastructure for reporting vary across Member States.

Reporting Portals and Competent Authorities

Every Member State has a National Competent Authority (NCA). For PMS and Vigilance, manufacturers must report incidents to the NCAs. However, the technical interfaces differ.

  • Germany (BfArM): Incident reports are submitted electronically via the German medical device information and database system (DMIDS). Germany is known for its rigorous interpretation of the MDR, particularly regarding clinical evaluation.
  • France (ANSM): Has a strong focus on real-world data (RWD) and encourages proactive data collection from healthcare institutions.
  • Italy (Ministry of Health): The Ministero della Salute is the competent authority for device vigilance, with scientific support from the Istituto Superiore di Sanità (ISS).
  • The Netherlands (IGJ): The Health and Youth Care Inspectorate (Inspectie Gezondheidszorg en Jeugd) handles medical device vigilance reporting.

For a manufacturer deploying health AI across Europe, the PMS system must be capable of aggregating data centrally but formatting reports according to the specific requirements of each NCA. This is a significant operational burden that is often underestimated.
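
Architecturally, this often means one internal incident model plus a registry of per-country formatters, as sketched below. The country codes, field names, and mappings shown are placeholders, not real NCA submission formats.

```python
"""Sketch of the 'aggregate centrally, format per NCA' pattern. All mappings
below are hypothetical placeholders."""
from typing import Callable

Formatter = Callable[[dict], dict]
_formatters: dict[str, Formatter] = {}

def register_formatter(country_code: str):
    """Decorator registering a country-specific report formatter."""
    def decorator(func: Formatter) -> Formatter:
        _formatters[country_code] = func
        return func
    return decorator

@register_formatter("DE")
def format_for_de(incident: dict) -> dict:
    # Placeholder field mapping, not an actual national submission schema.
    return {"incident_class": incident["incident_class"], "ref": incident["internal_id"]}

def build_national_report(incident: dict, country_code: str) -> dict:
    try:
        return _formatters[country_code](incident)
    except KeyError:
        raise ValueError(f"No formatter registered for {country_code}") from None
```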

Interplay with National Health Data Strategies

Many European countries are building national health data spaces (e.g., Germany’s “Gesundheitsdatennutzungsgesetz” – GDNG). PMS data often resides within these hospital systems. Manufacturers need to establish Data Processing Agreements (DPAs) with hospitals to legally access the performance data needed for PMS. This is complicated by GDPR and national data protection laws.

In some countries, like Estonia or Finland, the integration of PMS data into national registries is streamlined. In others, it requires navigating complex hospital procurement and privacy committees.

Continuous Evaluation: The Lifecycle Approach

The concept of “continuous evaluation” is the bridge between PMS and clinical evidence. Under the MDR, the Clinical Evaluation Report (CER) is a living document. It must be updated regularly with PMS data.

The Feedback Loop

The process looks like this:

  1. Design Phase: Risk analysis (ISO 14971) identifies potential AI failures.
  2. Market Release: PMS collects real-world data.
  3. Analysis: Data is analyzed for trends (e.g., performance drop in specific patient subgroups).
  4. Update: The AI model is retrained or the risk management file is updated.
  5. Re-certification: If the change is substantial, the Notified Body reviews the update.

This loop must be documented. The AI Act reinforces this by requiring that the technical documentation demonstrates how the provider ensures the system is monitored to detect risks.

Managing “Software as a Medical Device” (SaMD) Updates

Health AI often updates frequently. The regulatory framework allows for “minor updates” that do not affect the intended purpose or safety. However, determining what constitutes a “minor” update is tricky.

If an AI model is retrained on new data to improve accuracy, is that minor? Usually, yes, if the intended purpose remains the same. But if the retraining changes the sensitivity/specificity significantly, it might require a new clinical evaluation. The PMS data provides the evidence to make this determination.
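
One way to make that determination repeatable is to pre-specify a performance envelope in the change-management procedure and check every retrained candidate against it, escalating anything outside the envelope for regulatory review. The bounds below are hypothetical examples of such pre-specified criteria.

```python
"""Sketch of a pre-specified acceptance check for retrained models; the
numeric bounds are hypothetical examples of pre-agreed criteria."""
from dataclasses import dataclass

@dataclass(frozen=True)
class PerformanceEnvelope:
    min_sensitivity: float = 0.90        # hypothetical pre-specified floor
    min_specificity: float = 0.85
    max_sensitivity_drop: float = 0.02   # allowed drop vs. the certified model

def within_envelope(cert_sens: float, new_sens: float,
                    cert_spec: float, new_spec: float,
                    env: PerformanceEnvelope | None = None) -> bool:
    """True if the retrained model meets the absolute floors and does not trade
    away sensitivity beyond the pre-agreed margin; False means the change is
    escalated for regulatory assessment before deployment."""
    env = env or PerformanceEnvelope()
    return (new_sens >= env.min_sensitivity
            and new_spec >= env.min_specificity
            and (cert_sens - new_sens) <= env.max_sensitivity_drop)
```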

Operational Challenges for Manufacturers

Implementing a compliant PMS/PMM system for health AI presents distinct technical and organizational challenges.

1. The “Black Box” Problem

Many advanced AI models (Deep Learning) are opaque. If an incident occurs, it can be difficult to explain why the AI made a specific decision. PMS must include tools for explainability (XAI). Regulators will increasingly expect manufacturers to provide logs or explanations for adverse events. If you cannot explain why the AI failed, you cannot prove you have mitigated the risk.
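
As a minimal illustration of explanation logging for a flagged case, the sketch below uses a crude occlusion-style attribution: each feature is replaced by its training mean and the change in the model's output probability is recorded. This is an illustrative stand-in rather than a validated XAI method, and the function and parameter names are assumptions.

```python
"""Library-free, occlusion-style local attribution for a flagged case."""
import numpy as np

def occlusion_attributions(predict_proba, case: np.ndarray,
                           feature_means: np.ndarray, positive_index: int = 1) -> np.ndarray:
    """predict_proba: callable returning class probabilities for a 2-D batch.
    Returns, per feature, how much replacing it with its training mean lowers
    the probability of the flagged output."""
    baseline = predict_proba(case.reshape(1, -1))[0, positive_index]
    deltas = np.zeros(case.shape[0])
    for i in range(case.shape[0]):
        perturbed = case.copy()
        perturbed[i] = feature_means[i]            # "remove" feature i
        deltas[i] = baseline - predict_proba(perturbed.reshape(1, -1))[0, positive_index]
    return deltas   # large positive delta -> feature pushed the model toward the output
```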

2. Data Privacy and GDPR

PMS requires data. Often, this data is personal health data. To analyze a serious incident, a manufacturer might need to look at the specific patient data that triggered the error. This requires a legal basis under the GDPR (often Article 9(2)(i), which covers processing in the public interest to ensure high standards of quality and safety of health care and of medical devices) and strict data minimization. Manufacturers must have technical measures (like federated learning or differential privacy) to analyze PMS data without unnecessarily exposing patient identities.
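
As one small example of such a measure, the sketch below releases a site-level aggregate (a count of clinician overrides) with Laplace noise calibrated for epsilon-differential privacy. The epsilon value and the surrounding workflow are assumptions for illustration.

```python
"""Sketch of an epsilon-differentially-private release of an aggregate PMS
count; epsilon and the aggregate chosen are illustrative assumptions."""
import numpy as np

def dp_count(true_count: int, epsilon: float = 1.0,
             rng: np.random.Generator | None = None) -> float:
    """Counting queries have sensitivity 1, so Laplace noise with scale
    1/epsilon yields an epsilon-DP release of the count."""
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)
```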

3. Resource Allocation

PMS is not a one-time task; it is a continuous operational cost. It requires:

  • Data scientists to analyze performance drift.
  • Clinical experts to assess the medical impact of incidents.
  • Regulatory experts to manage reporting obligations.
  • Software engineers to deploy corrective actions.

Small and Medium-sized Enterprises (SMEs) often struggle with these resource requirements. The regulatory framework does not offer exemptions for SMEs regarding safety, though the AI Act does mention support for innovation (e.g., regulatory sandboxes).

Future Outlook: The Convergence of Standards

We are moving toward a harmonized European approach where PMS is not just about “bugs” but about “safety”.
