The Harness: the deployment layer for autonomous medical AI.

By Pablo Díaz

Co-founder, Superposition Labs, Inc.

Published Apr 15, 2026

What is the harness, exactly?

The harness is a category of infrastructure - the complete set of things that have to exist around autonomous medical AI for it to reach a patient. Five components: clinical integrations, liability architecture, data standards, regulatory scaffolding, and trust infrastructure.

The analogy we keep returning to: the harness is to autonomous medical AI what roads, traffic law, insurance, and licensing are to automobiles. The AI is the car. The harness is everything else. Henry Ford shipped the Model T in 1908; the Federal-Aid Highway Act passed in 1956. Forty-eight years of infrastructure-building between the invention and the system it could run on. Medicine does not have forty-eight years. The WHO projects an 11 million health worker shortfall by 2030. The workforce crisis is already here, and the pressure it creates on health systems is accelerating the timeline for everything downstream.

When we say “the harness,” we mean the entire deployment layer: the unglamorous, high-liability connective tissue between a foundation model that can diagnose and a patient who receives that diagnosis in a clinical setting, with legal standing, through a compliant workflow, backed by an audit trail that holds up in court. Every piece has to exist simultaneously. A model that achieves 95% diagnostic accuracy in a lab is a research milestone. A model that achieves 95% diagnostic accuracy and reaches a patient through an integrated EHR workflow, with appropriate liability coverage, under a regulatory framework that permits its use, and with sufficient physician trust to act on its output - that is a deployed medical AI. The distance between those two states is the harness.

The term is deliberate. A harness channels force. The frontier labs generate the clinical intelligence; the harness ensures it reaches the patient safely, legally, and in a form the health system can absorb. Without the harness, autonomous medical AI remains a set of impressive benchmarks. With it, autonomous medical AI becomes medicine.

Why won’t frontier labs build it themselves?

Google, OpenAI, and Anthropic have the models. Med-Gemini is at 91.1% on MedQA. Claude and GPT-4 are being fine-tuned on clinical datasets by dozens of research groups. The raw clinical intelligence exists and is improving on a curve that shows no signs of flattening. The question is not whether AI will surpass human diagnostic capability - that question is settled on benchmarks and approaching settlement in practice. The question is who builds the deployment layer.

Frontier labs will not build the harness because it is structurally unattractive to them. Their business model is model-as-a-service: sell API access, sell enterprise licenses, sell inference at scale. The harness requires the opposite operational profile - deep local regulatory knowledge, health system-by-health system integration, liability assumption, and multi-year relationship-building with hospital administrators who are constitutionally skeptical of technology companies telling them how to practice medicine.

Consider what the harness demands. In the United States alone, medical AI deployment touches FDA SaMD classification, state medical board licensing, CMS reimbursement policy, HIPAA data handling, OIG anti-kickback compliance, and malpractice liability allocation - and that is before you integrate with a single EHR. Each health system runs its own Epic or Cerner instance with custom workflows. Each state has its own licensing regime. Each payer has its own reimbursement logic for AI-assisted care.

A frontier lab that tries to build this is building a healthcare services company inside a research company. The talent profiles are different. The sales cycles are different. The liability posture is different. Google Health has been through three restructurings in five years. Microsoft shuttered its health AI division and rebuilt it twice. These are structural mismatches. The harness requires a company built specifically to deliver it.

This is the deployment gap in concrete terms. The gap is institutional, regulatory, and operational. And it is precisely the kind of gap that creates durable companies - because filling it requires sustained, specialized work that incumbents on both sides (model labs and health systems) are poorly positioned to do themselves.

What are its five components?

The harness decomposes into five interlocking layers. None is sufficient alone; all five must exist for autonomous medical AI to operate in a clinical setting. We treat them as infrastructure primitives - each can be built incrementally, but the system only functions when all five reach minimum viability for a given clinical surface.

Clinical integrations. The pipes between the AI and the clinical environment. EHR connectors (Epic, Cerner, Athenahealth), FHIR adapters for structured data exchange, clinical decision support (CDS) hooks that embed AI output into existing physician workflows, PACS integration for imaging, and HL7 v2 interfaces for legacy systems that will not migrate for a decade. The integration layer has to be bidirectional: the AI reads from the clinical record and writes back into it, with full provenance tracking. This is not an API call - it is a stateful, auditable transaction that must survive interruption, version changes, and regulatory inspection.
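
One way to picture a write-back that survives interruption is an idempotent transaction keyed on its own content, so a retry after a dropped connection never produces a duplicate entry in the record. The Python sketch below is illustrative only: `WriteBackLedger`, its field names, and the in-memory store are hypothetical stand-ins, not an EHR vendor API, and a real integration would persist the ledger durably.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class WriteBackLedger:
    """In-memory stand-in for a durable store of completed write-backs (hypothetical)."""
    completed: dict = field(default_factory=dict)

    def write_back(self, patient_id: str, finding: dict,
                   model_version: str, source_refs: list) -> dict:
        # Derive a deterministic key from the content, so a request retried
        # after an interruption maps to the same transaction.
        payload = json.dumps(
            {"patient": patient_id, "finding": finding,
             "model": model_version, "sources": sorted(source_refs)},
            sort_keys=True,
        )
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()

        # Already applied: return the stored record instead of writing twice.
        if key in self.completed:
            return self.completed[key]

        record = {
            "idempotency_key": key,
            "patient_id": patient_id,
            "finding": finding,
            # Provenance: which model produced the finding and which
            # source records it was derived from.
            "provenance": {
                "model_version": model_version,
                "derived_from": sorted(source_refs),
            },
        }
        self.completed[key] = record
        return record
```

Calling `write_back` twice with identical arguments returns the same record and leaves one ledger entry; changing the model version yields a new key, which keeps version changes visible in the provenance.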

Liability architecture. The legal scaffolding that determines who is responsible when an autonomous AI makes a clinical decision. Indemnity contracts between the AI deployer and the health system. Malpractice carve-outs that define whether AI-assisted decisions fall under the physician's existing coverage or require new policy structures. Algorithmic audit trails that reconstruct the AI's reasoning chain for every clinical recommendation - not just the output, but the input data, model version, confidence score, and any human override. This is what makes the liability question tractable: not by eliminating liability but by making it assignable and insurable. The Utah model is the first framework that attempts this at scale - requiring a licensed physician supervisor, an FDA-cleared or exempt algorithm, and a defined scope of autonomous action.
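
To make the audit-trail idea concrete, here is a minimal Python sketch of a tamper-evident log. Each entry carries the fields named above (input data, model version, output, confidence score, human override) and is chained to the previous entry by hash. Hash chaining is our illustrative choice, not a requirement stated by any regulator, and the function and field names are hypothetical.

```python
import hashlib
import json
from typing import Optional

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same body always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_audit_entry(trail: list, *, input_digest: str, model_version: str,
                       output: str, confidence: float,
                       human_override: Optional[str] = None) -> dict:
    """Append one recommendation to the trail, linked to the previous entry."""
    body = {
        "input_digest": input_digest,      # hash of the input data, not the data itself
        "model_version": model_version,
        "output": output,
        "confidence": confidence,
        "human_override": human_override,  # None when the clinician did not override
        "prev_hash": trail[-1]["entry_hash"] if trail else GENESIS_HASH,
    }
    entry = {**body, "entry_hash": _digest(body)}
    trail.append(entry)
    return entry


def verify_trail(trail: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True
```

The point of the design is that `verify_trail` fails if any stored field is altered after the fact, which is the property that matters when the record has to be defended to an insurer or a court.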

Data standards. Existing health data standards were built for human-generated clinical data. HL7 FHIR handles physician notes, lab results, imaging orders. It does not handle autonomous AI outputs: diagnostic confidence intervals, multi-model consensus scores, reasoning traces, or consent records specific to AI-generated care. The harness requires extensions to FHIR that represent AI clinical outputs as first-class resources - with their own provenance, versioning, and consent semantics. Consent frameworks must address a question that did not exist five years ago: does a patient consent to AI-generated care the same way they consent to physician-directed care? The answer is no, and the data standards have to encode that distinction.
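
As a rough illustration of what representing an AI output alongside its provenance and consent could look like, the Python sketch below builds a FHIR-style Observation as a plain dictionary and attaches AI-specific fields through FHIR's generic extension mechanism. The extension URLs under `example.org` are hypothetical placeholders, not a published FHIR profile, and the choice of Observation as the carrier resource is an assumption made for brevity.

```python
# Hypothetical base URL for illustration; no such profile is published.
EXT_BASE = "https://example.org/fhir/StructureDefinition"


def ai_observation(patient_id: str, code_text: str, finding_text: str, *,
                   model_version: str, confidence: float, consent_ref: str) -> dict:
    """Build a FHIR-style Observation dict carrying AI-specific extensions."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return {
        "resourceType": "Observation",
        "status": "preliminary",  # not yet reviewed by a clinician
        "code": {"text": code_text},
        "subject": {"reference": f"Patient/{patient_id}"},
        "valueString": finding_text,
        "extension": [
            # Which model version produced this output.
            {"url": f"{EXT_BASE}/ai-model-version", "valueString": model_version},
            # The model's confidence in the finding.
            {"url": f"{EXT_BASE}/ai-confidence", "valueDecimal": confidence},
            # Pointer to the consent record covering AI-generated care.
            {"url": f"{EXT_BASE}/ai-consent", "valueReference": {"reference": consent_ref}},
        ],
    }
```

A call such as `ai_observation("123", "Chest X-ray read", "Possible nodule", model_version="m-1.0", confidence=0.87, consent_ref="Consent/456")` yields a resource in which the consent for AI-generated care is an explicit, separately referenced record rather than an implied one.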

Regulatory scaffolding. The regulatory landscape for autonomous medical AI is moving fast and unevenly. The FDA's SaMD framework was designed for locked algorithms that do not learn; the agency finalized guidance on Predetermined Change Control Plans (PCCPs) in December 2024, but that pathway covers pre-specified modifications, not open-ended learning. State-by-state licensing creates a patchwork: Utah permits autonomous prescribing under supervision; most states have no framework at all. Post-market surveillance for AI differs fundamentally from device surveillance - the “device” changes with every model update. The regulatory scaffolding component of the harness is not lobbying; it is the operational infrastructure to navigate, comply with, and adapt to a regulatory environment that will change materially every twelve to eighteen months for the next decade.

Trust infrastructure. Hospital administrators do not adopt clinical AI because it performs well on benchmarks. They adopt it when they see peer institution evidence, validated clinical studies, physician endorsement, and a risk profile they can present to their board. Trust infrastructure is the systematic production of that evidence: clinical validation studies designed for the specific patient populations a health system serves, evidence packages formatted for hospital board review, physician education programs that build fluency rather than resistance, and transparent reporting of AI performance in production - including failures. Trust is not a marketing exercise. It is an engineering discipline with its own deliverables, timelines, and quality metrics.

How do you build infrastructure when regulation isn’t settled?

The objection we hear most often: how can you build deployment infrastructure for a regulatory regime that does not yet exist? The answer is that this is exactly what telemedicine companies did before COVID, and it is the reason they were ready when the regulation arrived overnight.

Telemedicine platforms like Teladoc, Amwell, and MDLive spent years building the technical infrastructure, credentialing systems, and payer integrations for virtual care delivery. They operated under restrictive state-by-state licensing, limited reimbursement, and widespread physician skepticism. Then in March 2020, CMS waived geographic restrictions and expanded telehealth coverage in a matter of days. The companies that had built the infrastructure were the ones that scaled. The ones that waited for regulatory certainty before building were left behind by years.

The pattern is consistent: workforce pressure forces regulatory movement; infrastructure has to be ready when it does. The workforce crisis in healthcare is more acute and more structural than the COVID-era access problem. The WHO's 11 million worker shortfall is not a pandemic spike - it is a demographic trend that gets worse every year. Rural emergency departments are closing. Wait times for specialists are measured in months across most developed countries. The political pressure to allow autonomous clinical AI will come from the same place the telehealth waivers came from: necessity.

The regulatory signals are already visible. Utah moved in January 2026 with HB0249, creating the first state framework for AI-assisted prescribing. It is narrow - approximately 190 chronic medications, physician supervision required - but it establishes the legal principle that an AI system can participate in clinical decision-making with defined autonomy. China did not wait; autonomous clinical AI is deployed across 260+ hospitals in 93.5% of provinces. ARPA-H is funding the first agentic-AI clinical pilots in the United States. The regulatory direction is clear even if the timeline is uncertain. Building the harness now is not speculative - it is positioning.

Our operating assumption: the regulation arrives in waves, not all at once. Radiology first (narrow scope, existing digital infrastructure, lower liability exposure), then pathology, then chronic care management, then acute care. Each wave needs a different harness configuration. The companies that have built and tested the harness components in early waves will be the ones that deploy in later, higher-stakes waves. The harness is infrastructure; infrastructure rewards being early.

How does Superposition ship the harness without building all of it at once?

A company cannot be built on a decade-long thesis alone. You need revenue, customers, and operational learning - and you need them while the regulatory environment is still forming. This is the base camp strategy.

Each base camp is a product. It solves a real problem for real customers, earns its own revenue, and leaves a piece of the harness behind. The base camps are sequenced by clinical risk: we start where the liability exposure is lowest and the operational learning is highest, then move toward higher-risk clinical surfaces as we accumulate the integrations, relationships, and regulatory credibility each subsequent surface requires.

SignatureAPI is base camp one. It is document infrastructure for healthcare - electronic signatures, consent management, document workflows for clinical and administrative processes. It does not touch clinical decisions. It solves an acute problem (healthcare document workflows are fractured, non-compliant, and expensive) and earns revenue from day one. But it also does something else: it teaches us healthcare enterprise sales, HIPAA-grade infrastructure operations, and clinical workflow patterns at scale. Every SignatureAPI integration maps a piece of a health system's operational topology. Every consent workflow we process informs the consent architecture that autonomous AI will eventually require. What SignatureAPI leaves behind is trust infrastructure and the clinical workflow mapping that clinical integrations depend on - the two pieces of the harness that are hardest to build without being inside the health system.

Radiology is the thesis-fit next candidate. It is the clinical surface where autonomous AI is closest to deployment readiness: narrow scope (specific read types), existing digital infrastructure (PACS, DICOM), a well-defined clinician-in-the-loop workflow (the radiologist reviews AI-flagged findings), and an acute workforce shortage (the average radiologist reads 50+ studies per day, and the number of radiologists per capita is declining in most markets). The harness for radiology requires all five components but at lower liability intensity than prescribing or acute care. It is the right next surface to build on.

The base camp sequence is not arbitrary. Each camp is selected because it maximizes the ratio of harness infrastructure deposited to clinical risk assumed. SignatureAPI: zero clinical risk, high workflow learning. Radiology: moderate clinical risk, high integration learning. Prescribing: high clinical risk, full harness deployment. The harness grows with each product, and each product is viable independently. This is how you build infrastructure before the market fully arrives - the same way AWS built cloud infrastructure by selling commodity compute before the market for cloud-native applications existed.

What does the harness look like in radiology, in prescribing, in documentation?

The harness is not a monolith. Each clinical surface requires a different configuration of the five components, weighted by the specific regulatory, liability, and integration demands of that domain. Three surfaces illustrate the range.

Radiology. The narrowest, most tractable harness configuration. Clinical integrations center on PACS and DICOM - mature, standardized imaging infrastructure. The AI reads imaging studies and flags findings; the radiologist reviews and confirms. Liability architecture is simpler because the AI operates as a decision support tool under a radiologist's final authority - existing malpractice frameworks can be extended rather than reinvented. Data standards benefit from DICOM's existing structure; AI outputs (confidence scores, region-of-interest annotations) fit naturally into structured reporting. Regulatory scaffolding operates under the FDA's existing SaMD framework, which has already cleared over 800 AI/ML-enabled medical devices, the majority in radiology. Trust infrastructure is furthest ahead here - radiologists have been working with CAD systems for two decades and have a framework for evaluating AI-assisted reads. Radiology is the proving ground: if the harness works here, it establishes the operational patterns for every subsequent surface.

Prescribing. The deepest harness configuration. Prescribing is where autonomous AI takes on its most consequential clinical role - recommending or initiating pharmacological treatment. Clinical integrations must connect to pharmacy benefit managers, drug interaction databases, formulary systems, and e-prescribing networks (Surescripts). Liability architecture requires the most sophisticated design: who is liable when an AI prescribes a medication that causes an adverse event? The Utah model provides the first legal framework - requiring a physician supervisor, an FDA-cleared algorithm, and a defined medication scope - but the liability contracts, indemnity structures, and audit trail requirements are substantially more complex than radiology. Data standards must extend to capture prescribing rationale, contraindication checks, patient consent specific to AI-generated prescriptions, and longitudinal outcome tracking. Regulatory scaffolding here is the most uncertain and the highest-leverage: the state that gets prescribing right creates the template. Trust infrastructure requires the deepest investment because prescribing touches every patient directly. Physician resistance will be highest, board scrutiny most intense, and the evidence bar for adoption most demanding. The harness for prescribing is the full expression of what we are building.
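
The three conditions attributed to the Utah model above (a physician supervisor, an FDA-cleared or exempt algorithm, a defined medication scope) can be read as a pre-flight gate that runs before any autonomous action. The Python sketch below shows that shape under our own assumptions: the type and function names are hypothetical, the medication set in the usage note is a placeholder rather than the actual Utah list, and nothing here is a legal interpretation of the statute.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PrescribingRequest:
    medication: str
    supervising_physician_id: Optional[str]  # None if no supervisor is attached
    algorithm_cleared_or_exempt: bool


def autonomy_gate(request: PrescribingRequest, permitted_medications: set) -> tuple:
    """Return (allowed, reasons); every failed condition is reported, not just the first."""
    reasons = []
    if not request.supervising_physician_id:
        reasons.append("no supervising physician on record")
    if not request.algorithm_cleared_or_exempt:
        reasons.append("algorithm is neither FDA-cleared nor exempt")
    if request.medication.lower() not in permitted_medications:
        reasons.append("medication outside the defined scope")
    return (not reasons, reasons)
```

For example, with a placeholder scope of `{"metformin", "lisinopril"}`, a request for metformin with a supervisor and a cleared algorithm passes, while a request for an out-of-scope drug with no supervisor is refused with both reasons listed. Returning the full list of reasons is deliberate: it gives the audit trail something specific to record for each refusal.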

Documentation. The lowest-risk, highest-adoption harness configuration. Clinical documentation - visit notes, procedure summaries, discharge instructions - is where AI has the lowest clinical risk (errors in documentation are correctable before they affect care) and the highest adoption velocity (physicians hate documentation and will adopt anything that reduces it). Clinical integrations focus on EHR note entry and voice-to-text pipelines. Liability is minimal - documentation errors are subject to correction workflows, not malpractice claims in most jurisdictions. Data standards are well-established for clinical notes. Regulatory requirements are the lightest. Trust is the easiest to build because the physician reviews every output before it enters the record. SignatureAPI lives at this layer: document infrastructure is the administrative substrate of clinical documentation. The documentation surface is where the harness meets the health system with the least friction and the most immediate value.

These three surfaces are ordered by harness complexity, not strictly by calendar time. Work on all three proceeds in parallel but at different depths: documentation ships first (via SignatureAPI), radiology is next (thesis-fit candidate), and prescribing is on the horizon (full harness deployment). The infrastructure we build for documentation carries forward to radiology. The infrastructure we build for radiology carries forward to prescribing. The harness compounds.

This is what Superposition builds. The layer between AI and regulation - the layer that turns a research capability into a clinical reality.