Claims adjudication is the highest-volume decision environment in healthcare insurance. A large commercial health plan adjudicates hundreds of millions of claims annually across medical, pharmacy, and ancillary benefit lines. Each claim requires a sequence of determinations: is the member eligible, is the service a covered benefit, is the clinical coding accurate, what does the provider contract require, and what is the correct payment amount. The aggregate outcome of those determinations is the plan’s medical spend — the largest single line in the income statement and the primary driver of Medical Loss Ratio.

The economics of adjudication error are straightforward and material. Payment errors average between 3 and 7 percent of total claims spend across the industry, combining overpayments, underpayments, coordination of benefits errors, and contractual misapplications. On a $1 billion claims book, the 5 percent midpoint of that range represents $50 million in payment inaccuracy annually. Each one-percentage-point improvement in payment accuracy recovers $10 million. Those numbers do not require a sophisticated ROI model. They require a baseline measurement and a programme designed to close the gap.
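The arithmetic above can be reproduced in a few lines. This is a minimal sketch using only the figures quoted in the text; the function and variable names are illustrative, not part of any adjudication system.

```python
def payment_error_dollars(claims_book: int, error_rate_pct: float) -> float:
    """Dollar value of payment inaccuracy for an error rate given in percent."""
    return claims_book * error_rate_pct / 100


CLAIMS_BOOK = 1_000_000_000   # $1 billion claims book, as in the text
LOW_PCT, HIGH_PCT = 3, 7      # industry error-rate range, as in the text

midpoint_pct = (LOW_PCT + HIGH_PCT) / 2
low = payment_error_dollars(CLAIMS_BOOK, LOW_PCT)
mid = payment_error_dollars(CLAIMS_BOOK, midpoint_pct)
high = payment_error_dollars(CLAIMS_BOOK, HIGH_PCT)
per_point = payment_error_dollars(CLAIMS_BOOK, 1)  # one percentage point

print(f"range: ${low:,.0f} to ${high:,.0f}, midpoint: ${mid:,.0f}")
print(f"value of one percentage point: ${per_point:,.0f}")
```

Substituting a plan's own claims spend and measured error rate for the constants gives the baseline that the rest of the programme is measured against.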

Where the adjudication decision breaks down

The adjudication process fails at predictable points. Eligibility and benefits verification fails when the member’s coverage edge cases (coordination of benefits, benefit period limits, plan design exceptions) fall outside the rules that standard adjudication logic can resolve. Clinical editing fails when the code combinations on the claim are internally inconsistent, when unbundling or upcoding is present but not captured by existing edit rules, or when new coding patterns emerge that the rules library has not yet addressed. Contract application fails when provider contract complexity (tiered rates, carve-outs, capitation arrangements, value-based contract overlays) requires logic that the adjudication engine cannot fully automate.

Each failure pushes the claim into manual review. Auto-adjudication rates below 80 to 85 percent are widely cited as a signal of model or rules gaps. A plan with a 70 percent auto-adjudication rate processing 10 million claims annually is routing 3 million claims to manual review at a cost of $15 to $25 per touch. At one touch per claim, that is between $45 million and $75 million in largely avoidable administrative expense annually, before accounting for the payment accuracy impact of claims that are resolved incorrectly in the manual process.
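The manual review figures can be checked, and rerun at other auto-adjudication rates, with a short calculation. This sketch assumes one manual touch per routed claim. The 85 percent comparison rate is an illustrative target taken from the upper end of the threshold cited above, not a figure from any specific plan.

```python
def manual_review_cost(claims: int, auto_rate_pct: int, cost_per_touch: int) -> int:
    """Annual manual review cost in dollars, assuming one touch per manual claim."""
    manual_claims = claims * (100 - auto_rate_pct) // 100
    return manual_claims * cost_per_touch


CLAIMS = 10_000_000          # annual claims volume, as in the text
COST_LOW, COST_HIGH = 15, 25 # dollars per manual touch, as in the text
CURRENT_RATE_PCT = 70        # current auto-adjudication rate, as in the text
TARGET_RATE_PCT = 85         # illustrative target (assumption)

current_low = manual_review_cost(CLAIMS, CURRENT_RATE_PCT, COST_LOW)
current_high = manual_review_cost(CLAIMS, CURRENT_RATE_PCT, COST_HIGH)
target_low = manual_review_cost(CLAIMS, TARGET_RATE_PCT, COST_LOW)
target_high = manual_review_cost(CLAIMS, TARGET_RATE_PCT, COST_HIGH)

print(f"at {CURRENT_RATE_PCT}%: ${current_low:,} to ${current_high:,}")
print(f"at {TARGET_RATE_PCT}%: ${target_low:,} to ${target_high:,}")
print(f"difference: ${current_low - target_low:,} to ${current_high - target_high:,}")
```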

The payment accuracy problem and the manual review problem are related but not identical. Some manual review produces correct payments on genuinely complex claims. Some auto-adjudicated claims produce incorrect payments because the rules or model resolved them confidently but wrongly. A productive diagnostic separates three populations: claims that are manually reviewed because they are genuinely complex, claims that are manually reviewed because the adjudication model cannot handle them, and claims that are auto-adjudicated but paying incorrectly.
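One way to make that diagnostic concrete is to label an audited sample of claims and count the segments. The sketch below is illustrative only: the `AuditedClaim` fields are hypothetical audit labels (a reviewer's complexity judgement and a payment audit result), and the sample records are invented to exercise the logic.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditedClaim:
    claim_id: str
    auto_adjudicated: bool
    genuinely_complex: bool  # hypothetical label from reviewer judgement
    paid_correctly: bool     # hypothetical label from a payment audit


def diagnose(claim: AuditedClaim) -> str:
    """Assign a claim to one diagnostic segment."""
    if claim.auto_adjudicated:
        return "auto_paid_correctly" if claim.paid_correctly else "auto_paid_incorrectly"
    return "manual_genuinely_complex" if claim.genuinely_complex else "manual_model_gap"


# Invented sample records, for illustration only.
sample = [
    AuditedClaim("C1", auto_adjudicated=True, genuinely_complex=False, paid_correctly=True),
    AuditedClaim("C2", auto_adjudicated=True, genuinely_complex=False, paid_correctly=False),
    AuditedClaim("C3", auto_adjudicated=False, genuinely_complex=True, paid_correctly=True),
    AuditedClaim("C4", auto_adjudicated=False, genuinely_complex=False, paid_correctly=True),
    AuditedClaim("C5", auto_adjudicated=False, genuinely_complex=False, paid_correctly=False),
]

segments = Counter(diagnose(c) for c in sample)
for name, count in sorted(segments.items()):
    print(f"{name}: {count}")
```

The `manual_model_gap` and `auto_paid_incorrectly` segments are the ones a rules or model improvement can address. The `manual_genuinely_complex` segment is work that manual review is expected to keep.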

The two economic leakages

Overpayments are the most visible leakage. Duplicate payments, coordination of benefits errors where the plan pays as primary when it should pay as secondary, pricing errors against complex provider contracts, and clinical editing failures that allow upcoded or unbundled claims to pay at face value all contribute to a post-payment overpayment burden. The plan then has to recover that money, at a recovery cost that can consume 20 to 40 cents per dollar recovered before any net benefit is realised. Pre-payment accuracy eliminates the recovery cost entirely.
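The effect of recovery cost on net benefit is simple to illustrate. The sketch below assumes a hypothetical $1 million of overpayments that is fully recovered, and it ignores the cost of operating the pre-payment controls themselves. Only the 20 to 40 cent range comes from the text.

```python
def net_recovery(recovered_dollars: int, cost_cents_per_dollar: int) -> int:
    """Net benefit of post-payment recovery after recovery cost, in dollars."""
    return recovered_dollars * (100 - cost_cents_per_dollar) // 100


OVERPAID = 1_000_000                 # hypothetical overpayment amount (assumption)
COST_LOW_CENTS, COST_HIGH_CENTS = 20, 40  # recovery cost per dollar, as in the text

net_best = net_recovery(OVERPAID, COST_LOW_CENTS)    # cheapest recovery
net_worst = net_recovery(OVERPAID, COST_HIGH_CENTS)  # most expensive recovery
prevented = OVERPAID  # pre-payment: the money never leaves, so no recovery cost

print(f"post-payment net benefit: ${net_worst:,} to ${net_best:,}")
print(f"pre-payment benefit:      ${prevented:,}")
```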

Underpayments create a different problem. A provider who is systematically underpaid relative to their contract will dispute, appeal, and withhold cooperation from the plan’s utilisation management and care management programmes. Provider relations cost, legal expense, and the indirect cost of a deteriorated provider relationship are the consequence of systematic contract misapplication in the underpayment direction. The plan that measures payment accuracy only by overpayment recovery yield is seeing half the problem.

What AI-assisted adjudication looks like

The highest-value applications of AI in claims adjudication are in the cases that fall outside the clean resolution path: the edge cases and exception patterns that current rules cannot handle. A gradient boosting or deep learning model trained on the full historical claims adjudication record can identify the patterns that predict payment error and flag cases for enhanced review before payment rather than after.
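The pre-payment flagging step can be sketched independently of the model behind it. In the example below, `toy_score` is a hand-written stand-in for a trained model's predicted probability of payment error. The claim fields, the weights, and the 0.5 threshold are all hypothetical. In practice the score would come from whatever gradient boosting or deep learning model the plan has trained.

```python
from typing import Callable

Claim = dict  # hypothetical claim record, e.g. {"claim_id": ..., "billed_amount": ...}


def route_claims(
    claims: list[Claim],
    score_fn: Callable[[Claim], float],
    threshold: float,
) -> tuple[list[Claim], list[Claim]]:
    """Split claims into (pay, enhanced_review) by predicted error risk."""
    pay, review = [], []
    for claim in claims:
        (review if score_fn(claim) >= threshold else pay).append(claim)
    return pay, review


def toy_score(claim: Claim) -> float:
    """Stand-in for a trained model; the weights are invented for illustration."""
    score = 0.05
    if claim["billed_amount"] > 10_000:
        score += 0.40
    if claim["has_modifier"]:
        score += 0.30
    return min(score, 1.0)


claims = [
    {"claim_id": "A", "billed_amount": 250, "has_modifier": False},
    {"claim_id": "B", "billed_amount": 18_000, "has_modifier": True},
    {"claim_id": "C", "billed_amount": 12_000, "has_modifier": False},
]

pay, review = route_claims(claims, toy_score, threshold=0.5)
print("pay:", [c["claim_id"] for c in pay])
print("enhanced review:", [c["claim_id"] for c in review])
```

Keeping the routing function separate from the scoring function means the threshold can be tuned against review capacity without retraining the model.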

Clinical editing AI improves on rule-based clinical edit libraries by identifying code combination patterns that are statistically inconsistent with the documented clinical scenario — catching novel upcoding and unbundling patterns that the static rules library has not yet encoded. Contract application AI handles the combinations of contract terms, member benefit design, and claim characteristics that static rules cannot fully cross-reference, reducing the volume of claims that require manual contract lookup.

The case for pre-payment accuracy is the same as the case for pre-payment fraud detection: stopping an incorrect payment before it leaves the plan is five to ten times more cost-effective than recovering it afterwards. The same model infrastructure supports both objectives.

The technology dimension

Large health plan claims adjudication platforms run at scale, and at many large US payers the underlying infrastructure includes IBM Z. Claims data, provider contract data, member eligibility, and benefits configuration are held in the same operational environment. Deploying AI-assisted adjudication logic on IBM Z via IBM Machine Learning for z/OS keeps the inference within the adjudication transaction, applying the model at the point of adjudication decision rather than in a separate analytical environment. For claims volumes in the hundreds of millions annually, the throughput and latency characteristics of on-platform inference are the difference between a model that can run on every claim and one that can run only on a sampled subset.

What success looks like

The metrics are auto-adjudication rate; payment error rate, separated into overpayment and underpayment components; manual review cost per claim; and net payment accuracy improvement as a dollar figure against the claims book baseline. The programme should baseline each of these before deployment, set improvement targets, and measure both the model performance and the financial outcome. The financial outcome is the only metric that justifies the investment to a CFO, so it should be reported as the primary result at every review.
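These metrics can be computed from a handful of period totals. The sketch below is one possible formulation, not a standard. The `PeriodStats` fields are hypothetical, and the baseline and current figures are invented to match the worked examples earlier in the article. The improvement figure rescales the current error to the baseline claims book so that changes in volume do not distort the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodStats:
    total_claims: int
    auto_adjudicated_claims: int
    total_paid: int           # dollars
    overpaid: int             # dollars
    underpaid: int            # dollars
    manual_review_spend: int  # dollars


def scorecard(s: PeriodStats) -> dict:
    """Compute the programme metrics for one reporting period."""
    manual_claims = s.total_claims - s.auto_adjudicated_claims
    return {
        "auto_adjudication_rate": s.auto_adjudicated_claims / s.total_claims,
        "overpayment_rate": s.overpaid / s.total_paid,
        "underpayment_rate": s.underpaid / s.total_paid,
        "manual_review_cost_per_claim": s.manual_review_spend / manual_claims,
    }


def accuracy_improvement_dollars(baseline: PeriodStats, current: PeriodStats) -> int:
    """Reduction in payment error dollars, scaled to the baseline claims book."""
    baseline_error = baseline.overpaid + baseline.underpaid
    current_error = current.overpaid + current.underpaid
    scaled_current = current_error * baseline.total_paid // current.total_paid
    return baseline_error - scaled_current


# Invented illustrative figures (assumptions, not measured data).
baseline = PeriodStats(10_000_000, 7_000_000, 1_000_000_000,
                       35_000_000, 15_000_000, 60_000_000)
current = PeriodStats(10_000_000, 8_000_000, 1_000_000_000,
                      28_000_000, 12_000_000, 40_000_000)

for name, value in scorecard(current).items():
    print(f"{name}: {value:.4f}")
print(f"improvement vs baseline: ${accuracy_improvement_dollars(baseline, current):,}")
```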