
How to Protect High-Stakes AI Workflows From Model Vendor Turbulence

You built agents on a frontier model. You tested it, tuned your prompts around its quirks, and trained your team to interpret its outputs. You committed to it. Then the model provider sends an email. Deprecation. End-of-life. Migration deadline in 90 days. The government requires a shutdown. Think it can't happen to you? Think again.

By Kevin Riley

13 min read · June 16, 2026

THE RENTING PROBLEM

Renting Intelligence Works — Until It Doesn't

There is a useful analogy that has been circulating since the Fable 5 shutdown: renting versus owning intelligence. It is worth sitting with, because it captures something that the cost-and-capability conversation around AI models has mostly missed.

Renting works great. Frontier model APIs are extraordinary products. They are move-in ready. Someone else handles the maintenance, the training runs, the safety evaluation, and the infrastructure. That is exactly why almost every AI deployment in the last three years started there — including ours.

But renting comes with constraints that are easy to ignore until they are not. The landlord can raise the rent. They can change what modifications you are allowed to make. They can change the rules entirely. And every once in a while — for reasons that have nothing to do with you — they can tell you it is time to leave. You did not do anything wrong. You are just operating on someone else's property.

For a consumer application, this is a painful inconvenience. For a healthcare organization running prior authorization decisions, revenue cycle operations, or clinical documentation, it is an unacceptable operational risk. You cannot tell a payer that your prior auth workflow is offline because your model provider had a call with a government official. You cannot tell a regulator that your audit trail is inaccessible because a model you built on top of was deprecated.

The Mythos shutdown made something concrete that was previously abstract. A company built on intelligence it did not control found itself exposed to decisions it could not influence. In healthcare, the consequences of that exposure are not a bad week. They are regulatory findings, claims backlogs, and patient care delays.

THE OWNERSHIP QUESTION

The Right Thing to Own Is Not the Model

The natural response to the renting problem is to ask whether you should own your models instead — build your own, fine-tune on your data, post-train on your workflows. It is a compelling argument, and for organizations with the engineering depth to do it well, a model tuned on your proprietary data can match frontier quality at a fraction of the cost on the tasks that matter most.

But owning a model is not the same thing as owning your AI roadmap. And for most healthcare organizations, it is not the right layer to own.

Here is the distinction that matters: a model encodes what intelligence knows. An agent harness encodes what intelligence does — in your specific workflow, under your specific compliance requirements, with your specific governance rules and escalation paths and audit obligations. Those are different things, and one of them is far more valuable and far more portable than the other.

Owning the Model

Intelligence that is yours

You control the weights. You trained on your data. The model reflects your domain expertise. No provider can take it away.

But: the workflow still lives somewhere. The governance layer still lives somewhere. The audit trail still lives somewhere. If those are built around a specific model's behavior, swapping the model still costs you everything you built on top of it.

Owning the Architecture

Workflows that are yours

The domain logic, compliance rules, tool configurations, escalation paths, and governance substrate are yours — independent of any model. Every frontier model becomes an option, not a dependency.

When the Brain changes, nothing else does. The workflow keeps running. The audit trail stays intact. The compliance posture is unchanged. That is what portability looks like.

The organizations that win in healthcare AI will not necessarily be the ones with the biggest models or even the most precisely tuned ones. They will be the ones that turned intelligence into something they own — at the layer that actually carries value across model generations, provider changes, and regulatory turbulence.

KORA'S ANSWER

When a Model Goes Offline, What Exactly Goes Offline With It?

For most enterprise AI deployments, the answer is: almost everything. When the model is the architecture — when prompts are written to exploit its specific behaviors, tools are configured around its API format, and outputs are parsed based on its response structure — the workflow and the model are inseparable. Swap the model and you have rebuilt the workflow.

For KORA customers, the answer is: the Brain. Nothing else.

KORA's Three-Layer Architecture

Brain

The swappable frontier model. The only layer that changes in a migration. Plugs in. Plugs out. Any major provider or open-source alternative.

Hands

The typed, governed tool sandbox — your integrations, APIs, decision points, workflow steps, domain rules. Model-agnostic. This is what you own.

Session

The append-only audit log. Every action, every decision, every human approval. Model-agnostic. Preserved across every migration.

The domain knowledge — workflow logic, tool configurations, compliance rules, escalation paths — lives in the Hands layer. The audit trail lives in the Session layer. Neither is coupled to the Brain. When Anthropic deprecates a model, or OpenAI releases a new generation, or a better-performing open-source model becomes available, KORA customers swap the Brain. The rest of the agent is unchanged.

That is not a feature. That is the foundational design principle of an architecture built to outlast any individual model — and any individual provider's decisions.
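To make the separation concrete, here is a minimal sketch of the three layers in Python. It is an illustration under stated assumptions, not KORA's actual code: the `Brain`, `Hands`, `Session`, and `Agent` classes, the `decide` method, and the tool names are all hypothetical, and the two stand-in Brains simply pick a tool without calling any real model.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Brain(Protocol):
    """Hypothetical interface: any model that can pick the next tool."""

    def decide(self, task: str, tools: list[str]) -> str: ...


@dataclass
class Session:
    """Append-only audit log: entries can be added and read, never edited."""

    _entries: list[dict] = field(default_factory=list)

    def append(self, actor: str, action: str, detail: str) -> None:
        self._entries.append({"actor": actor, "action": action, "detail": detail})

    def entries(self) -> tuple[dict, ...]:
        # Return copies so callers cannot rewrite history.
        return tuple(dict(e) for e in self._entries)


@dataclass
class Hands:
    """Tool registry owned by the organization, independent of any model."""

    tools: dict[str, Callable[[str], str]]

    def run(self, name: str, task: str) -> str:
        return self.tools[name](task)


@dataclass
class Agent:
    brain: Brain
    hands: Hands
    session: Session

    def step(self, task: str) -> str:
        tool = self.brain.decide(task, sorted(self.hands.tools))
        result = self.hands.run(tool, task)
        self.session.append(type(self.brain).__name__, tool, result)
        return result


# Stand-in Brains; real ones would wrap a provider API call.
class BrainA:
    def decide(self, task: str, tools: list[str]) -> str:
        return tools[0]


class BrainB:
    def decide(self, task: str, tools: list[str]) -> str:
        return tools[-1]


hands = Hands({"check_eligibility": lambda t: f"eligible:{t}",
               "fetch_policy": lambda t: f"policy:{t}"})
session = Session()
agent = Agent(BrainA(), hands, session)
agent.step("case-1")

agent.brain = BrainB()  # the migration: only the Brain changes
agent.step("case-2")
```

After the swap, `agent.hands` and `agent.session` are the same objects as before, and the log holds one entry from each Brain. Nothing in `Hands` or `Session` references a specific model, which is the property the three-layer design depends on.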

WHY THE MODEL ALONE IS NOT ENOUGH

We Tested Every Major Frontier Model on Real Healthcare Workflows. Here Is What We Found.

There is a convenient assumption baked into most AI adoption conversations: if you get a sufficiently powerful model, the rest will follow. Deploy the best model, and the workflow will work. That assumption is wrong in healthcare — and we can prove it.

CHI-Bench is the world's first long-horizon healthcare benchmark for AI agents, built with 20+ hospitals and universities including Johns Hopkins Medicine, Wellstar Health System, and Yale School of Medicine. It evaluates agents across prior authorization, utilization management, and care management — the exact workflows where healthcare AI is being deployed today. Every agent configuration is tested as a harness × model combination. That pairing is intentional. Because in healthcare, the model is not the product. The model plus the harness built for the workflow is the product.

28%

Best pass@1 across all 75 tasks, achieved by the top-performing harness paired with Claude Opus 4.6. The model alone does not reach this. The harness makes the difference.

0%

End-to-end prior authorization automation rate at pass@1, across every frontier model tested. Without the right harness, even the most capable model cannot complete a full PA case.

60–80

Agent steps per trial, across 4–6 distinct role-composed stages. Healthcare workflows are not a single prompt. They are long-horizon decision chains that require structured harness support at every stage.

1,279

Pages in the medical policy corpus that governs each task: site-of-service rules, evidence grounding requirements, consent scripts, and clinical criteria that the harness must navigate correctly at every decision point.
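For readers unfamiliar with the metric, here is a small worked sketch of what a pass@1 figure means. It assumes pass@1 is simply the fraction of tasks whose first attempt passes. This post does not spell out CHI-Bench's exact scoring procedure, so treat that as an assumption. The outcomes below are synthetic and chosen only to reproduce the arithmetic.

```python
def pass_at_1(first_attempt_passed: list[bool]) -> float:
    """Fraction of tasks whose first (and only scored) attempt passed."""
    if not first_attempt_passed:
        raise ValueError("need at least one task")
    return sum(first_attempt_passed) / len(first_attempt_passed)


# Synthetic outcomes, not real benchmark data: 21 passes out of 75 tasks.
outcomes = [True] * 21 + [False] * 54
score = pass_at_1(outcomes)
print(f"{score:.0%}")  # 28%
```

Under that assumption, 28% on 75 tasks corresponds to roughly 21 tasks completed correctly on the first try.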

The reason frontier models score zero on end-to-end prior authorization is not that they are unintelligent. Claude Opus 4.6 and GPT-5.5 are extraordinarily capable. The reason is that healthcare workflows have three properties that general-purpose agents are not built to handle:

01 — Long-Horizon

State that commits and cannot retry

A PA case runs 60–80 steps across intake, case construction, documentation review, and final disposition. Once a stage commits, it cannot be retried. One wrong site-of-service decision cascades into four scorecard failures. General-purpose agents are not trained for this kind of stateful consequence management.

02 — Role-Composed

One agent, many seats

A single agent must play clinical intake clerk, PA coordinator, clinical reviewer, and submitter — each with different permissions, different tools, and different artifact obligations. Without a harness that structures these role boundaries, the model collapses them.

03 — Policy-Driven

1,279 pages of criteria the model must navigate

Medical policy criteria, payer-specific coverage rules, consent scripts, and evidence-grounding requirements are checked by deterministic rubrics — not summarized by a language model. The harness must retrieve, interpret, and apply the right policy at the right step. A general-purpose model cannot do this reliably without the harness providing that structure.
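The three properties above can be sketched together in a few lines of Python. This is a simplified illustration, not the CHI-Bench harness or KORA's implementation. The `Stage`, `CaseRun`, and `rubric` names, the tool names, and the pairing of each stage with a role are assumptions made for the example, and the intake output stands in for a site-of-service decision.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stage:
    name: str
    role: str                  # the "seat" the agent occupies in this stage
    allowed_tools: frozenset   # tools this role may call (hypothetical names)


@dataclass
class CaseRun:
    stages: list
    committed: dict = field(default_factory=dict)
    position: int = 0

    def current(self) -> Stage:
        if self.position >= len(self.stages):
            raise RuntimeError("every stage has already committed")
        return self.stages[self.position]

    def check_tool(self, tool: str) -> None:
        stage = self.current()
        if tool not in stage.allowed_tools:
            raise PermissionError(f"{stage.role} may not call {tool}")

    def commit(self, output: str) -> None:
        # Committing advances the case; there is no method that moves back.
        self.committed[self.current().name] = output
        self.position += 1


def rubric(committed: dict) -> dict:
    """Deterministic checks: plain predicates, no model in the loop."""
    return {
        "site_of_service_valid": committed.get("intake") in {"inpatient", "outpatient"},
        "disposition_recorded": "final_disposition" in committed,
    }


run = CaseRun([
    Stage("intake", "intake_clerk", frozenset({"read_referral"})),
    Stage("case_construction", "pa_coordinator", frozenset({"search_policy"})),
    Stage("documentation_review", "clinical_reviewer", frozenset({"read_chart"})),
    Stage("final_disposition", "submitter", frozenset({"submit_case"})),
])

run.check_tool("read_referral")      # allowed for the intake clerk
run.commit("outpatient")             # intake is now fixed for this case

try:
    run.check_tool("read_referral")  # the PA coordinator seat cannot use it
    denied = ""
except PermissionError as err:
    denied = str(err)

checks = rubric(run.committed)
```

The harness, not the model, decides which tools each seat may use, when a stage is final, and whether the committed artifacts pass the checks. A different Brain can drive the same `CaseRun` without any of that changing.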

The benchmark result is the argument. The best frontier model in the world — Claude Opus 4.6, tested in the CHI-Bench leaderboard configuration — reaches 28% pass@1 with the right harness and 0% on end-to-end prior authorization without it. The model is necessary. It is not sufficient. In healthcare, the harness built for the workflow is what turns a capable model into a deployable agent.

This is the deeper reason why model independence matters in healthcare specifically. When a model goes offline, you need to be able to swap it without rebuilding the harness. When a new model becomes available, you need to be able to test it against the harness without starting over. The harness is where your healthcare-specific value lives. The Brain is the component you need to be able to change.

THERE ISN'T ONE FRONTIER

The Frontier Is Not a Single Model. It Is a Moving Target You Should Be Able to Track.

One underappreciated consequence of the Fable 5 shutdown is what it revealed about the structure of the AI landscape. The future of AI does not depend on a single model winning. There are multiple frontiers — and the most interesting thing happening is not that one model is getting smarter. It is that intelligence is becoming increasingly customizable, distributable, and composable.

Frontier Model

The best general-purpose reasoning available from Anthropic, OpenAI, Google, or Meta. The right Brain for complex, open-ended clinical reasoning tasks.

Specialized Model

A model post-trained on healthcare workflows, prior auth criteria, or claims patterns. Outperforms frontier models on narrow, high-volume tasks at a fraction of the cost.

Open-Source Model

Where data residency requirements or cost constraints make proprietary APIs unsuitable. Deployable on your own infrastructure. No provider dependency.

Routed Ensemble

Different models for different steps in the same workflow — the right Brain for each decision point, coordinated by a governance layer that remains constant.

KORA is designed for exactly this landscape. The governance layer — the Hands and Session — stays constant. The Brain is a variable. You can run different models for different agents within the same workflow, evaluate a new release without committing to it, or hold a stable version while a next-generation model matures. The architecture does not force a choice. It preserves all of them.
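One way to picture "the Brain is a variable" is a routing table that maps workflow steps to models. The sketch below is hypothetical: the step names, the `BrainRef` type, and the model identifiers are placeholders invented for illustration, not KORA configuration or real model names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BrainRef:
    kind: str      # "frontier", "specialized", or "open_source"
    model_id: str  # placeholder identifier, not a real model name


ROUTES = {
    "clinical_reasoning": BrainRef("frontier", "frontier-model-x"),
    "criteria_matching": BrainRef("specialized", "pa-tuned-model"),
    "chart_summarization": BrainRef("open_source", "self-hosted-model"),
}
DEFAULT = BrainRef("frontier", "frontier-model-x")


def route(step: str, overrides: Optional[dict] = None) -> BrainRef:
    """Pick the Brain for a workflow step.

    `overrides` lets a trial run point one step at a candidate model
    without editing the production table.
    """
    table = {**ROUTES, **(overrides or {})}
    return table.get(step, DEFAULT)


production = route("criteria_matching")
trial = route("criteria_matching",
              {"criteria_matching": BrainRef("frontier", "candidate-model")})
fallback = route("unlisted_step")
```

Because the override is passed per call, evaluating a new release leaves `ROUTES` untouched, which is what lets a team hold a stable version in production while a candidate is tested alongside it.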

BUILD · TEST · DEPLOY

A Model Migration Is a Test Run, Not a Rebuild

Most organizations experience model migrations as reconstruction projects. Prompts rebuilt from scratch. Outputs re-tested. Tool integrations reconfigured. Compliance behavior re-validated. Timeline: weeks to months. Risk: high, because new models surface behavioral differences only in production edge cases.

For KORA customers, a model migration is a test run:

Build

Point to the new Brain

Update the model reference in the agent configuration. The workflow, tool definitions, and governance rules are unchanged. Nothing is rebuilt.

Test

Run your existing test suite

Execute the same test cases against the new model. Compare outputs. Identify behavioral differences. Tune where needed — not rebuild from scratch.

Deploy

Release with confidence

Governance layer, audit trail, and human-in-the-loop gates are exactly as they were. Compliance posture is preserved. Migration complete.

That three-step process collapses a months-long migration into a test cycle measured in days. Not because the new model is easier to work with. Because the architecture does not require the workflow to be rebuilt every time the Brain changes.
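The Test step can be pictured as a diff between two Brains over the same cases. The following is a minimal sketch under loud assumptions: a Brain is reduced to a function from a case description to a decision string, both Brains are stand-ins with hard-coded rules, and exact string comparison replaces whatever richer evaluation a real test suite would use.

```python
from typing import Callable

Brain = Callable[[str], str]  # hypothetical: case description in, decision out


def compare_brains(current: Brain, candidate: Brain, cases: list[str]) -> list[dict]:
    """Run the same cases through both Brains and list where they disagree."""
    diffs = []
    for case in cases:
        old, new = current(case), candidate(case)
        if old != new:
            diffs.append({"case": case, "current": old, "candidate": new})
    return diffs


# Stand-in Brains; real ones would call a model behind the same interface.
def current_brain(case: str) -> str:
    return "approve" if "routine" in case else "escalate"


def candidate_brain(case: str) -> str:
    return "approve" if "routine" in case or "follow-up" in case else "escalate"


cases = ["routine imaging", "follow-up imaging", "experimental therapy"]
diffs = compare_brains(current_brain, candidate_brain, cases)
for d in diffs:
    print(d["case"], d["current"], "->", d["candidate"])
```

Here the two Brains disagree on a single case, "follow-up imaging". That one difference is what the team reviews and tunes for. The test cases, tools, and governance rules stay exactly as they were.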

OWN YOUR ROADMAP

When the Intelligence Is Yours, Nobody Gets to Pull the Floor Out

There is a deeper issue underneath every forced model migration. When your AI roadmap is tightly coupled to a single provider, your provider's product decisions become your product decisions. They ship a new model; you migrate whether you are ready or not. They change pricing; your unit economics shift without warning. They receive a phone call from a government official; your production workflows go dark.

Owning your architecture means your roadmap is yours. You choose the best-performing Brain for each workflow. You evaluate new model releases without committing to them. You take advantage of open-source models where data residency or cost requirements demand it. You stay on a stable version while a new one matures. These are business decisions that belong to you — not concessions you make to whoever supplies your current model.

The cloud generation made this mistake first. Organizations that built on a single vendor's proprietary infrastructure spent a decade negotiating from weakness. The ones that built on portable, standards-aligned infrastructure kept the leverage to move, to negotiate, and to adopt whatever came next without paying a migration tax on everything they had already built.

The pattern is repeating in AI. The organizations making the right choice now are not necessarily owning their models. They are owning the layer where the actual value lives: the workflow logic, the domain expertise, the compliance substrate, the governance architecture, and the compound knowledge that builds up in production over time.

In healthcare, that compound knowledge is especially precious — and especially irreplaceable. The escalation patterns your agents have learned from real cases. The edge cases your compliance team flagged and corrected. The policy interpretations that your workflows have internalized over thousands of prior authorization decisions. That is not in the model. That is in the harness. And it is yours.

THE ACTAVA ANSWER

Build Once. Own It. Deploy on Your Terms.

KORA is model-agnostic by design. Every major frontier model — Anthropic, OpenAI, Google, Meta, and leading open-source alternatives — connects through a consistent interface that abstracts the Brain from the workflow. The domain knowledge, governance layer, and audit trail are built once and persist regardless of which model powers the agent at any given moment.

When a model goes offline, our customers notice the email. They don't notice the migration.

When a better Brain becomes available, they evaluate it against their existing workflows without rebuilding them. When a new model is faster or cheaper but slightly less precise on a specific task, they make a deliberate trade-off — not a forced one. When the next round of deprecations arrives — and it will — their agents keep running, their audit trails stay intact, and their roadmap remains theirs to set.

The lesson from the Fable 5 shutdown is not that frontier models are too risky to use. They are extraordinary, and they are becoming infrastructure. But infrastructure and ownership are different things. You can use any model while still owning the architecture that makes it productive in healthcare. The companies that figure that out are the ones that turn intelligence into something uniquely theirs — something no provider shutdown, deprecation notice, or government phone call can take away.

The model is a Brain you rent or own. The harness is the business you are building. In healthcare, only one of those two can actually do the work.

Learn more about KORA's model-agnostic architecture at actava.ai/why-kora, and explore the data behind the harness at actava.ai/benchmarks.



Written by

Kevin Riley

CEO & Co-Founder
