When AI Decides Who Gets Healthcare: Kenya’s PMT Experiment and the Cost of Getting the Model Wrong
Executive summary: Kenya’s Social Health Authority (SHA) replaced large parts of its old insurance system with a predictive, asset‑based model that assigns premiums through proxy means testing (PMT). The rollout misclassified millions, often marking the poorest as wealthier than they are, pushing families into unaffordable premium tiers, eroding hospital cashflows, and provoking protests. Independent warnings from IDinsight and investigative audits by Africa Uncensored, Lighthouse Reports and The Guardian were ignored before national deployment. The failure is both technical and political: a brittle targeting method, stale data, and a design trade‑off that prioritized avoiding undercharging wealthier households over protecting the poorest. This is a governance problem as much as an engineering one.
At a glance
- SHA nationwide launch: October 2024
- Target population: largely informal workforce (about 83% of workers)
- Registrations: more than 20 million people registered; ~5 million regularly paying premiums
- Core method: proxy means testing (PMT) implemented with a predictive machine‑learning model
- Key independent voices: IDinsight (pre‑deployment warning); investigations by Africa Uncensored, Lighthouse Reports and The Guardian
What happened — the human cost
Grace Amani* (name changed), a community volunteer collecting household data, watched a family she knows well move from a safe premium bracket into one they couldn’t afford. “People are dying, people are suffering. They thought it was something that would help,” she said. Across Kenya, similar stories multiplied: single parents were quoted premiums that consumed 10–20% of their limited incomes, critically ill people skipped care because the algorithm assigned them unaffordable contributions, and hospitals were left waiting for reimbursements that never arrived.
How the SHA system worked — plain language explanation
Proxy means testing (PMT) estimates household income from observable assets and living conditions — things like roof material, phone ownership, or access to electricity. SHA’s model used data collected by volunteers on these attributes, then a predictive machine‑learning algorithm looked for patterns and guessed each household’s ability to pay. This is not a large language model like ChatGPT; it’s a statistical model that infers likely incomes from asset patterns.
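To make the mechanics concrete, here is a minimal, hypothetical sketch in Python. Everything in it (the asset features, weights, model choice, and tier cutoffs) is invented for illustration; SHA’s actual design is not public. The pattern is the same, though: fit a regressor that maps asset indicators to income on survey data, then convert predicted income into a premium tier.

```python
# Toy PMT pipeline (illustrative only, not SHA's actual model).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic survey: 6 binary asset indicators per household
# (e.g. iron roof, phone, electricity, ...) plus a log-income target.
n = 5_000
assets = rng.integers(0, 2, size=(n, 6))
weights = np.array([0.5, 0.3, 0.4, 0.2, 0.1, 0.3])  # assumed asset-income link
log_income = 1.0 + assets @ weights + rng.normal(0, 0.4, n)

# Fit the asset-to-income mapping.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(assets, log_income)

def premium_tier(household_assets, cutoffs=(1.5, 2.0, 2.5)):
    """Map predicted log-income to a contribution tier (0 = lowest premium)."""
    predicted = model.predict(np.atleast_2d(household_assets))[0]
    return int(np.searchsorted(cutoffs, predicted))

print(premium_tier([1, 0, 1, 0, 0, 1]))  # tier assigned to one household
```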
Why this can fail: the algorithm assumes stable, representative relationships between assets and actual income. If the training data are outdated, non‑representative, or if local economies have shifted, the model’s guesses drift away from reality. Seasonal work, remittances, informal enterprise fluctuations, and recent economic shocks all weaken the asset‑to‑income link, producing systematic errors.
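A small simulation shows how that link breaks. In this hypothetical sketch (all numbers invented), a model is trained on an older survey where assets tracked income well; after a shock cuts the poorest households’ incomes while their assets stay put, the model systematically overstates their ability to pay, which is the misclassification pattern reported in Kenya.

```python
# Illustrative drift simulation: a model trained on stale data overstates
# poor households' incomes after an economic shock (numbers invented).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 10_000
assets = rng.integers(0, 2, size=(n, 4))
weights = np.array([0.6, 0.4, 0.3, 0.5])

# Older survey: assets track income closely.
income_then = 1.0 + assets @ weights + rng.normal(0, 0.2, n)
model = LinearRegression().fit(assets, income_then)

# Today: the poorest half lost 30% of income, but their assets are unchanged.
income_now = income_then.copy()
poorest = income_then < np.median(income_then)
income_now[poorest] *= 0.7

# The stale model keeps predicting the old incomes from the same assets.
overestimate = model.predict(assets)[poorest] - income_now[poorest]
print(f"mean income overestimate for the poorest half: {overestimate.mean():.2f}")
```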
Where it broke — technical, data and political failures
Three failures collided.
- Bad fit between model and reality: IDinsight warned the model was “inequitable, particularly for low‑income households,” noting the model relied on stale or insufficiently representative data. In practice the algorithm often marked the poorest households as wealthier than they are, moving them into higher premium tiers.
- Governance and accountability gaps: There was limited public documentation, no robust appeals channel, and weak transparency about how scores were calculated. Misclassified households had little recourse to challenge results or get timely corrections.
- Political trade‑offs in design: Officials reportedly accepted a design choice: prioritize accurately catching richer households so the state didn’t undercharge those who could pay, even if that raised errors for the poorest. That’s a fiscal and political calculation with clear human costs.
“If you identify a richer person as poor and therefore ask him to pay less, this person will never own up and say, ‘I’m actually supposed to be paying more.’” — David Khaoya, health economist
The operational fallout was predictable. More than 20 million registered with SHA but only around 5 million are regularly paying premiums. That shortfall left hospitals with cashflow problems: fewer paying members meant delayed reimbursements and service strain. Public anger, protests, and forecasts of institutional collapse followed; a prominent former deputy president predicted SHA would “collapse in another six months.”
Evidence and global precedents
Independent investigations by Africa Uncensored, Lighthouse Reports and The Guardian documented the misclassification patterns and human harms. IDinsight’s pre‑deployment review flagged representativeness and fairness concerns that were not acted upon. These outcomes are not unique to Kenya: PMT schemes elsewhere have produced large exclusion errors — for example, testing in Indonesia and Rwanda showed very high error rates in practice. Development economists like Stephen Kidd have long warned that asset‑based targeting can feel arbitrary to households and produce large exclusion of the needy.
“It feels like a lottery.” — Stephen Kidd, development economist
The governance gap: what authorities didn’t put in place
Four institutional safeguards were missing or weak:
- Transparent model documentation and public model cards explaining variables, thresholds and known failure modes.
- Independent external validation tied to deployment decisions — not just advisory reports but conditional rollout criteria.
- An accessible and fast appeals process so households could contest scores and have rechecks without catastrophic delays.
- Operational contingency plans: cashflow bridging for hospitals while enrollment and payments stabilize, and explicit vendor liability clauses.
SHA and government spokespeople defended the reform as a scalable way to reach the informal sector and argued the system would be improved over time. That may be true, but the pause point should have been before national scale: when a predictive model determines access to essential services, the threshold for proof of safety must be high.
Key questions leaders ask (quick Q&A)
Does the SHA system fairly classify households?
No — audits and reporting found the model often scored the poorest households as wealthier than they are (so they were overcharged) while underestimating higher earners, producing unfair premium assignments.
What exactly is the technical approach?
Proxy means testing (PMT) implemented with a predictive machine‑learning model that uses household asset and living‑condition data gathered by volunteers to estimate ability to pay — not an LLM like ChatGPT.
Were risks flagged before rollout?
Yes — IDinsight produced a pre‑deployment review warning the model was inequitable and that the training data were stale and unrepresentative. Officials deployed the model anyway.
What tangible harms have occurred?
Reported harms include premium increases of 10–20% of household income for some families, skipped care, cashflow problems for hospitals, and public protests.
Practical checklist — what leaders must do now
- Immediate relief: Pause enforcement of contested premium assignments and implement an emergency waiver or temporary subsidy for households who can demonstrate financial hardship.
- Independent audit and remediation: Commission a public, independent audit of the model’s performance, including confusion‑matrix metrics such as false positives and false negatives broken down by region and demographic (see the sketch after this list), and publish the results and a corrective timeline.
- Appeals and rechecks: Launch a fast, low‑cost appeals process with verified re‑assessment options and community‑based grievance handlers.
- Data and model governance: Require vendors to publish model documentation, training data provenance, and a roadmap for continuous recalibration funded by government/donors.
- Pilots and thresholded rollouts: Tie national rollouts to pre‑agreed performance thresholds (a maximum allowable exclusion error; the sketch below includes such a gate) and use randomized audits during pilots.
- Consider alternatives: Evaluate phased universal subsidies for essential services, hybrid targeting (community verification + PMT), or smaller universal blocks for the most critical care.
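A hypothetical sketch of the audit metrics and rollout gate named above. The column names, grouping variable, and the 0.15 ceiling are illustrative assumptions; ground truth (“truly_poor”) would come from a consumption survey, as pre‑deployment reviews recommend.

```python
# Illustrative audit: exclusion/inclusion error rates per subgroup,
# plus a pre-agreed rollout gate (all names and thresholds assumed).
import pandas as pd

def subgroup_error_rates(df: pd.DataFrame, group_col: str) -> pd.DataFrame:
    """Expects boolean columns 'truly_poor' (survey ground truth)
    and 'scored_poor' (the model's classification)."""
    def rates(g: pd.DataFrame) -> pd.Series:
        poor, nonpoor = g[g.truly_poor], g[~g.truly_poor]
        return pd.Series({
            # Exclusion error: poor households the model scored as non-poor.
            "exclusion_error": (~poor.scored_poor).mean() if len(poor) else float("nan"),
            # Inclusion error: non-poor households scored as poor.
            "inclusion_error": nonpoor.scored_poor.mean() if len(nonpoor) else float("nan"),
            "n": len(g),
        })
    return df.groupby(group_col).apply(rates)

def rollout_gate(errors: pd.DataFrame, max_exclusion: float = 0.15) -> bool:
    """Deploy only if every subgroup's exclusion error is under the ceiling."""
    return bool((errors["exclusion_error"] <= max_exclusion).all())

audit = pd.DataFrame({
    "region":      ["arid_north"] * 3 + ["urban"] * 3,
    "truly_poor":  [True, True, False, False, True, False],
    "scored_poor": [False, True, False, False, True, True],
})
errors = subgroup_error_rates(audit, "region")
print(errors)
print("safe to roll out:", rollout_gate(errors))  # False: arid_north fails
```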
Methods note — what an independent pre‑deployment review would (and did) look for
IDinsight’s review focused on representativeness of training data and fairness metrics; it recommended more recent, regionally stratified data, and stronger testing for exclusion errors among the poorest. The kinds of tests that would have caught major problems include held‑out regional validation, randomized audits that compare predicted scores to consumption surveys, and fairness checks across urban/rural, gender and informal‑sector subgroups.
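As a sketch of the first of those tests, held‑out regional validation fits in a few lines: train with one region excluded, score that region, and compare errors across regions. The model choice and metric here are assumptions; a real review would use the deployed model and consumption‑survey targets.

```python
# Illustrative leave-one-region-out validation (model and metric assumed).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

def leave_one_region_out(X, y, regions):
    """Per-region error when that region is held out of training.
    X: (n, k) asset features; y: income targets; regions: n region labels."""
    regions = np.asarray(regions)
    scores = {}
    for region in np.unique(regions):
        held_out = regions == region
        model = RandomForestRegressor(random_state=0).fit(X[~held_out], y[~held_out])
        scores[region] = mean_absolute_error(y[held_out], model.predict(X[held_out]))
    # A region with a much larger error than the rest is a red flag: the
    # asset-to-income relationship there differs from the training data.
    return scores
```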
Why this matters for AI for public services and AI governance
AI for public services and algorithmic targeting promise efficiency and scale, especially where administrative data are thin. But efficiency without fairness becomes a policy harm. The SHA case shows the consequences of: choosing a cheap, scalable targeting tool; ignoring independent technical warnings; and failing to build transparency and recourse into the system. Those are governance decisions, not inevitable technical failures.
“No Kenyan will be left behind.” — President William Ruto
Final verdict and takeaways for leaders
Technology amplified a policy decision. The choice to use PMT, the choice to prioritize avoiding undercharging wealthier households, and the choice to roll out nationally despite warnings are political and managerial choices with measurable human costs. For executives, policy teams, and donors advising on AI deployments: insist on rigorous pre‑deployment testing, transparent model documentation, public grievance channels, and metrics that matter for human outcomes — not just cost savings on paper.
Algorithms can improve public services, but they are tools within a policy ecosystem. When that ecosystem is weak — stale data, perverse incentives, weak accountability — the algorithm magnifies the weakness. The immediate job is remediation: protect vulnerable households, fix the model governance, and rebuild trust before the damage becomes irreversible.
Sources: reporting and audits by Africa Uncensored, Lighthouse Reports, The Guardian, and a pre‑deployment review by IDinsight. Background on PMT and donor history cited from public World Bank materials on targeting and social protection.