Project Maven: How the Pentagon Turned AI into a Nervous System

Project Maven started as a narrow computer‑vision experiment and, over a few intense years, grew into a deployed operational platform that stitches sensor data, runs models at scale, and accelerates targeting and interdiction workflows. Its commercial spine—the Palantir‑built Maven Smart System—now supports thousands of users across multiple combatant commands and domestic agencies. That speed and breadth demonstrate what AI agents and large language models (LLMs) can do in high‑tempo operations — and why governance, training, and accountability often trail capability.

Fast facts — TL;DR

  • What it is: Maven Smart System is a Palantir-built platform that fuses sensor feeds, surfaces automatic target recognition (ATR) detections, and integrates model-driven automation.
  • Scale: NGA reported roughly 1 billion AI detections in its vision data store; roughly 25,000 U.S. personnel were reported to be using Maven as adoption climbed.
  • Procurement: Pentagon ceiling for Maven Smart System raised to about $1.3 billion (through 2029); an Army ceiling reportedly near $480 million; NGA awarded a roughly $708 million data‑labeling contract to Enabled Intelligence.
  • Operational footprint: Deployed across CENTCOM, SOUTHCOM, NORAD/NORTHCOM, used for strike support, maritime interdictions, missile/launcher detection, and homeland defense monitoring.
  • Risk highlights: Model hallucinations, data poisoning, common‑mode failures from a single “common view,” and gaps in weapons‑level training and auditability.

How Maven moved from demo to program of record

Project Maven began as a Pentagon effort to apply computer vision to the flood of intelligence, surveillance, and reconnaissance (ISR) imagery. The program became a flashpoint in 2018 when Google employees protested the company’s involvement in military AI work. Inside the Defense Department, evangelists and skeptics sparred as operational demands — wars, crises, and an appetite for faster decision cycles — drove adoption.

Drew Cukor, a Marine colonel widely credited as an early architect and advocate, pushed the work into theater. Palantir’s CEO later referred to him with a mix of admiration and irreverence, calling him “crazy Cukor” while crediting his role. Vice Admiral Frank “Trey” Whitworth moved from initial skepticism to institutional endorsement after scrutinizing how the system would stand up under congressional and legal review — especially following concerns about mistaken strikes and record‑keeping.

By November 2023, the program was declared a program of record with recurring funding. The Pentagon’s January 2026 “AI‑First” posture accelerated procurement and encouraged agentic AI — systems that plan and act across multiple steps — to be woven into workflows.

What the Maven Smart System actually does (plain English)

At its core, Maven Smart System fuses sensor feeds (satellite, airborne, maritime, and other ISR sources), runs AI models (including ATR — automatic target recognition — and later LLMs), and presents analysts and commanders with a unified operational picture. A “detection” is when a model flags an object or event of interest (e.g., a truck, a rocket launcher, a small boat). The system can pair those detections to “effectors” — meaning it can nominate options to act, such as cueing a strike, alerting interdiction forces, or assigning a task to an analyst.

Two important terms:

  • ATR (automatic target recognition): Computer‑vision models that detect objects in imagery automatically.
  • LLM (large language model): A text‑based AI used for summarization, reasoning, or automating multi‑step workflows; when embedded in operations, LLMs can assemble context, recommend actions, or script automation chains.

Think of Maven as a nervous system: sensors are the senses, models are reflexes and pattern detectors, and the dashboard is the brain’s control panel. The crucial difference from a biological nervous system is that humans remain part of the loop — at least nominally — but the loop is getting shorter and faster.
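Maven's actual architecture is classified, so the "detections paired to effectors, with a human in the loop" idea above can only be illustrated abstractly. Here is a minimal Python sketch in which every name, threshold, and action is a hypothetical assumption, not a description of the real system:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """A model-flagged object of interest (illustrative fields only)."""
    object_type: str   # e.g. "small boat", "rocket launcher"
    confidence: float  # model confidence, 0.0-1.0
    sensor_id: str

def nominate(detection, approve):
    """Pair a detection with a candidate action, but gate it on an
    explicit human decision (the `approve` callback). All options and
    the 0.8 threshold are invented for illustration."""
    options = {
        "small boat": "cue interdiction forces",
        "rocket launcher": "cue strike cell review",
    }
    action = options.get(detection.object_type, "assign to analyst queue")
    if detection.confidence < 0.8:
        action = "assign to analyst queue"  # low confidence: no direct nomination
    # Nothing executes without the human callback saying yes.
    return action if approve(detection, action) else "held: human veto"

d = Detection("small boat", 0.92, "feed-12")
print(nominate(d, approve=lambda det, act: True))
print(nominate(d, approve=lambda det, act: False))
```

The design point is the last line of `nominate`: the automation can only propose; a human callback decides, which is the "loop" the article says is getting shorter.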

Operational footprint and measurable impacts

Usage and throughput climbed rapidly once the system was fielded. The National Geospatial‑Intelligence Agency (NGA) reported roughly 1 billion AI detections in its vision store. CENTCOM reportedly operated 179 live feeds with thousands of accounts in its region; NORAD and NORTHCOM reached daily user counts in the low thousands by 2025. Across services and agencies, adoption figures were reported in the tens of thousands of personnel.

Throughput improvements are often the headline metric: where manual workflows once processed fewer than 100 target developments per day, Maven’s automation moved that into the hundreds and, after LLM and agentic integrations, into reported ranges of roughly 1,000, with claims of up to 5,000 targets per day in specific contexts. Those gains translated into faster strike cycles, more timely maritime interdictions in SOUTHCOM, and quicker detection of launchers and asymmetric threats across multiple theaters.

“Every commander is using the AI.”

— Joe O’Callaghan, NGA AI director (paraphrase)

Supply chain, procurement, and political dynamics

Palantir is the primary commercial spine of Maven, and the platform has been extended into partner discussions with NATO and, reportedly, the U.K. (a £750 million deal surfaced during a 2025 state visit). The Pentagon raised the Maven Smart System contract ceiling to approximately $1.3 billion, and the Army set a related ceiling of about $480 million. NGA’s roughly $708 million data‑labeling award went to Enabled Intelligence — a company noted for hiring neurodiverse workers for pattern‑labeling tasks — rather than a higher‑profile vendor such as Scale AI.

At least three dozen companies are involved across modeling, sensors, integration, and labeling. That breadth reduces single‑vendor risk but introduces complex vendor management, potential lock‑in around Palantir’s architecture, and questions about long‑term costs and exit strategies.

Where the tech and doctrine still fall short

The operational benefits are real; the gaps are equally real.

  • Hallucinations and brittleness. Models sometimes produce false positives or confident but incorrect assessments. Brig. Gen. John Cogbill gave the system a cautious grade (roughly a “C+”) because hallucinations can lead operators to wrong conclusions.
  • Data poisoning and adversary deception. Training data and sensor feeds can be manipulated. If adversaries find ways to inject misleading signals, automated systems will amplify errors.
  • Common‑mode failure. A single, shared operational picture is efficient — and dangerous. An NGA official warned that if the common view is wrong, multiple commands can fail together.
  • Record‑keeping and accountability. Whitworth pressed the question of how logs, audit trails, and decision records would hold up under congressional or legal scrutiny after a mistaken strike.
  • Training and doctrine. Operators need weapons‑system‑level training, argued Emelia Probasco, meaning rigorous certification, exercises, and well‑defined handover points for human decision authority.

“He pressed to know how the system would stand up to congressional scrutiny after a mistaken strike—questioning record‑keeping and accountability.”

— Vice Admiral Frank “Trey” Whitworth (paraphrase)

Model governance: what’s being tried

NGA began issuing model assessment cards and accredited a small number of models by late 2025 as part of early governance work. Accreditation and assessment cards are attempts to document model performance, limitations, and intended use—similar in spirit to industry model cards or assessment checklists. These steps matter, but accreditation alone does not replace doctrine, independent red‑teaming, continuous monitoring for drift, or legally robust audit trails.
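NGA's real card format is not public. As a hedged sketch of what an assessment card could capture, in the spirit of industry model cards, here is a minimal Python structure; every field name and value is an assumption, not NGA's schema:

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelAssessmentCard:
    """Illustrative assessment card; fields are assumptions, not a real schema."""
    model_name: str
    version: str
    intended_use: str
    out_of_scope_uses: list = field(default_factory=list)
    tested_metrics: dict = field(default_factory=dict)
    known_failure_modes: list = field(default_factory=list)
    mitigations: list = field(default_factory=list)
    accreditation_status: str = "pending"

    def to_json(self) -> str:
        # Serialize so the card can travel with the model artifact.
        return json.dumps(asdict(self), indent=2)

card = ModelAssessmentCard(
    model_name="atr-vehicle-detector",       # hypothetical model
    version="2.3.1",
    intended_use="Cue analysts to candidate vehicles in overhead imagery",
    out_of_scope_uses=["autonomous engagement decisions"],
    tested_metrics={"precision": 0.91, "recall": 0.84},
    known_failure_modes=["confuses trucks with launchers in low light"],
    mitigations=["human review required before nomination"],
    accreditation_status="accredited",
)
print(card.to_json())
```

The value of such a card is less the data structure than the discipline: out-of-scope uses and known failure modes must be written down before deployment, not reconstructed after an incident.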

Key takeaways and questions

  • What does the Maven Smart System actually do?

    Maven fuses sensor feeds, runs ATR and other models, surfaces detections to analysts and commanders, and can link those detections to options for action—thereby shortening target development cycles.

  • How widespread and influential is it?

    Very. NGA reports roughly 1 billion detections; adoption reached tens of thousands of users across commands like CENTCOM, NORAD/NORTHCOM, and SOUTHCOM.

  • Are safeguards keeping pace?

    No. While model assessment and accreditation are emerging, training, doctrine, and legally robust audit trails lag the operational rollout.

  • Biggest operational risks?

    Model hallucination, data poisoning, exposure of electronic signatures, common‑mode failures from a single common view, and compressed human oversight as agentic AI speeds decisions.

Seven practical lessons for executives deploying AI agents

The military’s Maven experience is a cautionary blueprint for any organization adopting AI agents or centralized model‑driven views for mission‑critical decisions.

  1. Build immutable audit trails. Log model inputs, outputs, user interactions, and timestamps so decisions can be reconstructed under scrutiny.
  2. Require independent assessment and red‑teaming. Third‑party testing and adversarial exercises expose hallucinations, brittleness, and attack vectors before deployment.
  3. Diversify models and suppliers. Avoid monoculture. Multiple models and vendors reduce common‑mode risk and give you escape routes if one model goes off the rails.
  4. Train operators like safety‑critical system users. Scenario‑based drills, certification, and explicit veto authority are not optional—treat the tool as a system that can harm if misused.
  5. Segment and canary critical inputs. Use canary datasets, holdout sensors, and segmented feeds to detect poisoning or drift early.
  6. Design for human execution vetoes. Make it easy and practiced for humans to pause automation; exercise that veto under pressure so it’s not theoretical.
  7. Align procurement with governance. Contracts should include exit clauses, audit rights, and obligations for model explainability and ongoing validation.
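Lesson 1 (immutable audit trails) can be sketched with a hash-chained, append-only log, where each record embeds the hash of its predecessor so later tampering is detectable. This is a minimal illustration, not a production design; a real system would add cryptographic signing, replication, and tamper-evident storage:

```python
import hashlib
import json

class AuditLog:
    """Append-only, hash-chained decision log (illustrative sketch).
    Rewriting any earlier record breaks the chain and fails verification."""

    def __init__(self):
        self.records = []
        self._last_hash = "0" * 64  # genesis value

    def append(self, model_id, inputs, output, user, ts):
        record = {
            "ts": ts,
            "model_id": model_id,
            "inputs": inputs,
            "output": output,
            "user": user,
            "prev_hash": self._last_hash,
        }
        payload = json.dumps(record, sort_keys=True).encode()
        record["hash"] = hashlib.sha256(payload).hexdigest()
        self.records.append(record)
        self._last_hash = record["hash"]
        return record["hash"]

    def verify(self):
        prev = "0" * 64
        for rec in self.records:
            body = {k: v for k, v in rec.items() if k != "hash"}
            if rec["prev_hash"] != prev:
                return False
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if digest != rec["hash"]:
                return False
            prev = rec["hash"]
        return True

log = AuditLog()
log.append("atr-v2", {"image_id": "img-001"},
           {"detection": "small boat", "conf": 0.87}, "analyst-17", ts=1)
log.append("atr-v2", {"image_id": "img-002"},
           {"detection": "none"}, "analyst-17", ts=2)
print(log.verify())                          # chain intact
log.records[0]["output"]["conf"] = 0.99      # tamper with history
print(log.verify())                          # chain broken
```

The point is reconstruction under scrutiny: because each hash covers the previous one, an after-the-fact edit to any decision record is detectable, which is exactly the record-keeping concern the program's own leadership raised.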

Operational mitigations and a practical starter checklist

Start by operationalizing the lessons above:

  • Deploy a model‑assessment card for each production model that lists intended use, failure modes, tested performance, and mitigation steps.
  • Implement logging and tamper‑evident storage for decision trails; ensure logs survive system outages and can be audited independently.
  • Run regular red‑team exercises that include data‑poisoning scenarios, adversary deception, and supply‑chain compromises.
  • Define clear human‑machine handoffs and mandate weapons‑system‑level training analogues for staff operating critical AI systems.
  • Use diverse training datasets, holdout canaries, and runtime monitors for model drift and performance decay.
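The canary and drift items above can be sketched as a simple monitor that re-scores a fixed held-out set each cycle and alerts when scores shift. The threshold and summary statistic here are illustrative assumptions; a production monitor would use proper statistical tests and per-class breakdowns:

```python
import statistics

class CanaryMonitor:
    """Drift detector over a fixed canary set (illustrative sketch).
    The model is re-run on the same held-out canary inputs each cycle;
    a large shift in mean score versus the accepted baseline raises an alert."""

    def __init__(self, baseline_scores, max_mean_shift=0.05):
        self.baseline_mean = statistics.mean(baseline_scores)
        self.max_mean_shift = max_mean_shift  # assumed tolerance

    def check(self, current_scores):
        shift = abs(statistics.mean(current_scores) - self.baseline_mean)
        return {"mean_shift": round(shift, 4),
                "drifted": shift > self.max_mean_shift}

monitor = CanaryMonitor(baseline_scores=[0.91, 0.88, 0.93, 0.90])
print(monitor.check([0.90, 0.89, 0.92, 0.91]))  # small shift: no alert
print(monitor.check([0.71, 0.65, 0.70, 0.68]))  # large drop: drift alert
```

Because the canary inputs never change, any score movement isolates change in the model or its upstream data, which is how poisoning or silent retraining regressions surface early.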

Why speed without accountability becomes a brittle advantage

Faster targeting, better maritime interdictions, and improved situational awareness are defensible operational wins. But when speed outruns accountability, organizations create systemic risk. Maven’s lesson is blunt: a shared AI nerve center multiplies both advantage and vulnerability. For commercial leaders, the takeaway is less about military policy and more about the architectural and governance choices that shape enterprise risk.

“Maven is a movement. We’ve drunk the Kool‑Aid.”

— Joe O’Callaghan, NGA AI director (paraphrase)

That cultural buy‑in is powerful. It drives adoption and hardens capabilities. It also normalizes reliance on a single view. The prudent path is to keep the performance gains while engineering for failure: independent checks, diverse models, practiced human oversight, and procurement terms that preserve agency and auditability.

Next step for leaders

If your organization is rolling out AI agents or centralized model views, begin with three pragmatic moves this week: (1) mandate immutable logging for all model decisions, (2) schedule an adversarial red‑team exercise against a high‑impact model, and (3) require operational drills where humans must exercise veto authority. These actions start you toward a posture where capability and accountability move together.

Project Maven shows how quickly AI can become a nervous system. The upside is clear; the price of neglect is not. Build for both.