Anthropic Shows LLMs Turn Patches into Exploits in Hours — N‑Hour Wake‑Up Call for CISOs

N‑Day → N‑Hour: Anthropic shows LLMs can turn patches into exploits in hours

TL;DR: Anthropic’s security tests found Claude‑family models — led by the institutional Mythos Preview — can reverse‑engineer patches and produce proof‑of‑concept crashes and full exploit chains in hours (often minutes). Defenders must stop treating time‑to‑exploit as days or weeks and instead design around an “N‑Hour” threat model.

Key metrics at a glance

Targets: 18 SpiderMonkey (Firefox) patches; 21 Windows kernel Patch Tuesday advisories (Jan–Feb 2026).
Top model: Mythos Preview (Anthropic). Other Claude variants (Opus 4.5–4.8, Sonnet 4.6) tested for comparison.
SpiderMonkey: 14 of 18 vulnerabilities crashed; first PoC crash ~12 minutes; 8 working exploits in ~12 hours; 7 bugs reproducible across 50 runs.
Windows kernel: 18 of 21 vulnerabilities located in under six hours; 8 full privilege‑escalation chains built; discovery cost ≈ $2,200 API credits; total exploit build cost ≈ $15,700 (≈ $2,000 per exploit).
Operational gap: Windows Autopatch typically reaches 90% of devices in ~7 days and forces reboots in ~11 days — long after LLMs produced usable attacks.

What Anthropic ran and what it found (short methods)

The team evaluated six Claude‑family LLMs, including an institutional Mythos Preview variant and public Claude derivatives. Two testbeds were chosen to stress different real‑world conditions: Firefox’s open SpiderMonkey engine (source diffs available) and the more challenging Windows kernel (compiled binaries with debug symbols, decompiled with Ghidra). Prompts and human‑in‑the‑loop guidance were used to direct analysis, triage likely vulnerable code paths, and iterate toward proof‑of‑concept (PoC) crashes and then full exploit chains. Reproducibility was measured (e.g., Mythos reproducibly triggered 7 of 18 SpiderMonkey bugs across 50 runs each).

Limitations and important caveats: The Mythos Preview model used in the strongest tests is not publicly available. Public Claude variants and some open‑source models can also develop exploits but were generally less reliable in these experiments. API cost figures reflect compute/usage credits only; they exclude human analyst labor and any external tooling costs beyond decompilation and debugging.

Detailed findings: SpiderMonkey and Windows kernel

SpiderMonkey (Firefox)

Using patch diffs and source context, Mythos Preview produced PoC crashes quickly: the first in ~12 minutes, a cluster of 13 more within roughly 40 minutes, and a final one around three hours. Over a roughly 12‑hour session the model produced eight fully working exploits. Importantly, seven bugs were reproducibly triggered across 50 runs, showing the model’s output wasn’t just a one‑off lucky guess. One practical note: the first exploit was ready well before the Firefox release that contained the fix was shipped to users.

Windows kernel (compiled targets)

Targets here were more complex: compiled binaries, debug symbols, and Ghidra decompilation were used to present the model a readable code surface. Mythos identified 18 of 21 CVEs in under six hours. Discovery stage API costs were roughly $2,200. Building the exploit chains — privilege escalation from a restricted user to SYSTEM — required more iterations and produced eight full chains at a total API cost around $15,700 (roughly $2,000 per exploit). Mythos was the only model in the set to achieve full privilege escalation during these tests.
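The per‑exploit figure follows directly from the quoted totals. A minimal arithmetic check in Python, using only the numbers reported above; the variable and function names are illustrative, and the per‑CVE discovery average is a derived figure rather than one Anthropic reported:

```python
# Figures as quoted in the text (API credits only; analyst labor excluded).
EXPLOIT_BUILD_COST_USD = 15_700
FULL_CHAINS_BUILT = 8
DISCOVERY_COST_USD = 2_200
CVES_LOCATED = 18


def unit_cost(total_usd: float, count: int) -> float:
    """Average cost per item; raises if count is not positive."""
    if count <= 0:
        raise ValueError("count must be positive")
    return total_usd / count


cost_per_chain = unit_cost(EXPLOIT_BUILD_COST_USD, FULL_CHAINS_BUILT)
# Derived average, not a number stated in the source.
cost_per_located_cve = unit_cost(DISCOVERY_COST_USD, CVES_LOCATED)

print(f"Build cost per full chain: ${cost_per_chain:,.2f}")
print(f"Discovery cost per located CVE: ${cost_per_located_cve:,.2f}")
```

The first line prints $1,962.50, which is where the rounded "roughly $2,000 per exploit" comes from.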

Anthropic researchers summarize the shift bluntly: a single operator can convert roughly a month’s worth of published patches into working exploits in an afternoon, at modest cost and without deep specialist knowledge.

Why this changes the defender playbook (the “N‑Hour” problem)

Historically, attackers often needed days or weeks to weaponize a patch — a window defenders relied upon to roll out fixes. That margin is evaporating. Modern LLMs can ingest diffs, decompiled code, and advisories, then locate vulnerable code paths and generate exploit code rapidly. The result is a compressed attacker‑defender timeline: many organizations will not install a patch before an AI‑driven exploit is viable.
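To make the compressed timeline concrete, the sketch below estimates how much of a fleet is still unpatched when an exploit becomes viable. It rests on a loud assumption: rollout is modeled as a straight‑line ramp to the 90% coverage at ~7 days quoted for Windows Autopatch, which real deployments will not follow exactly. The function names and the 14‑day comparison point are hypothetical.

```python
def patched_fraction(hours_since_release: float,
                     hours_to_target: float = 7 * 24,
                     target_coverage: float = 0.90) -> float:
    """Assumed linear ramp from 0 to target_coverage, flat afterwards."""
    if hours_since_release <= 0:
        return 0.0
    ramp = hours_since_release / hours_to_target
    return min(target_coverage, target_coverage * ramp)


def exposed_fraction(time_to_exploit_hours: float) -> float:
    """Share of the fleet still unpatched when an exploit becomes viable."""
    return 1.0 - patched_fraction(time_to_exploit_hours)


# 12 h mirrors the SpiderMonkey exploit session; 14 days is an illustrative
# "N-Day" comparison point, not a figure from the source.
for label, hours in [("N-Hour (12 h)", 12), ("N-Day (14 days)", 14 * 24)]:
    print(f"{label}: {exposed_fraction(hours):.0%} of devices still exposed")
```

Under this simplified model, roughly 94% of devices are still exposed at the 12‑hour mark, versus about 10% if attackers needed two weeks.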

This isn’t just about patch cadence. It shifts where defenders should invest effort: from exclusively speeding rollouts to also reducing the prevalence and impact of whole classes of vulnerabilities, improving detection, and hardening systems so successful exploitation is harder even when exploits exist.

Practical, prioritized actions for CISOs and IT leaders

Short, actionable steps you can start today — ranked by immediacy and impact.

Recalibrate triage and risk scoring: Treat time‑to‑exploit as hours, not days. Escalate fixes for bug classes where LLMs can automate weaponization (memory corruption, unsafe deserialization, etc.).
Fast‑track critical rollouts, but don’t rely on speed alone: Automate patch testing and deployment where safe to do so, and measure mean time to patch in hours for critical assets. Remember: distribution alone won’t eliminate the root causes.
Prioritize engineering fixes that eliminate entire bug classes: Invest in memory‑safe languages (Rust), stronger sandboxing, and hardware mitigations (e.g., Control‑Flow Enforcement Technology, Pointer Authentication) for high‑risk components.
Harden slow‑to‑update systems: For OT, medical, or embedded devices that cannot be rebooted easily, implement compensating controls (network segmentation, application allowlists, virtual patching) and prioritize migration plans.
Govern model access and monitor abuse potential: Include model access and API keys in your threat model. Monitor for anomalous usage patterns that could indicate internal misuse or third‑party abuse.
Use LLMs defensively — carefully: Apply models to accelerate triage, regression testing, and automated fuzzing, but avoid prompting models to produce exploit code. Implement strict controls, logging, and human oversight when using LLMs for security tasks.
Improve detection and incident playbooks: Add AI‑driven exploitation scenarios to tabletop exercises, and ensure detection rules cover rapid exploit techniques, not just known‑bad signatures.
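The first action, recalibrating triage, can be sketched as a simple scoring function. This is a rough illustration under stated assumptions, not a recommended scoring standard: the `Finding` fields, the weights, and the sample findings are all hypothetical, and only the two bug classes come from the list above.

```python
from dataclasses import dataclass

# Bug classes the text names as candidates for automated weaponization.
AUTOMATABLE_CLASSES = {"memory_corruption", "unsafe_deserialization"}


@dataclass
class Finding:
    name: str
    bug_class: str
    patch_public: bool      # a public patch or advisory exists to diff
    internet_facing: bool
    hours_to_patch: float   # expected time until the fix is deployed


def priority(f: Finding) -> float:
    """Higher score = patch sooner. Weights are illustrative assumptions."""
    score = 1.0
    if f.bug_class in AUTOMATABLE_CLASSES:
        score += 2.0
    if f.patch_public:
        score += 2.0  # a public fix can be diffed, so assume an N-Hour clock
    if f.internet_facing:
        score += 1.0
    # Longer expected exposure raises urgency; capped so it cannot dominate.
    score += min(f.hours_to_patch / 24.0, 3.0)
    return score


findings = [
    Finding("intranet-wiki-xss", "xss", False, False, 48),
    Finding("js-engine-uaf", "memory_corruption", True, True, 168),
    Finding("queue-deserialization", "unsafe_deserialization", True, False, 24),
]

for f in sorted(findings, key=priority, reverse=True):
    print(f"{priority(f):4.1f}  {f.name}")
```

The design choice worth noting is that a public patch raises priority rather than lowering it: once a fix can be diffed, the clock for attackers has started.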

Policy, governance, and the broader ecosystem

Access control for high‑capability models is a policy lever. Mythos Preview is institutional, but Anthropic noted that public Claude models and open‑source variants — when filters are weakened — can also generate exploits. Regulators, vendors, and platform operators must balance legitimate research access with misuse risk. Practical steps include graduated access, vetted research programs, rate limits, and mandatory logging for sensitive model capabilities.
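Two of those steps, rate limits and mandatory logging, are simple to illustrate. The following is a minimal sketch of a sliding‑window limiter that logs every decision; the class, the logger name, and the client identifier are hypothetical and do not describe how any vendor actually gates model access.

```python
import logging
from collections import defaultdict, deque

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model_gateway")  # hypothetical logger name


class SlidingWindowLimiter:
    """Allow at most max_calls per client within window_seconds."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls = defaultdict(deque)  # client_id -> times of allowed calls

    def allow(self, client_id: str, now: float) -> bool:
        calls = self._calls[client_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        allowed = len(calls) < self.max_calls
        if allowed:
            calls.append(now)
        # Log every decision, allowed or not, so usage can be audited later.
        log.info("client=%s t=%.0f allowed=%s", client_id, now, allowed)
        return allowed


limiter = SlidingWindowLimiter(max_calls=3, window_seconds=60)
decisions = [limiter.allow("team-a", t) for t in (0, 10, 20, 30, 61)]
print(decisions)  # [True, True, True, False, True]
```

Passing the clock in as `now` keeps the example deterministic; a real gateway would more likely read a monotonic clock and log a client identifier rather than a raw API key.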

Vulnerability scoring systems also need updating. Vendor “likelihood to be exploited” ratings may understate risk when LLMs can automate attack construction — Anthropic’s tests showed models succeeded against many vulnerabilities previously rated “less likely” or “unlikely”.

Defenders can co‑opt LLMs — but with constraints

There’s an uncomfortable symmetry: the same models that lower the bar for attackers can speed defensive tasks like automated patch validation, exploit detection, and proactive hardening. The key is governance. Use red‑team testing with isolated models, maintain strict non‑disclosure and audit trails, and avoid producing or storing exploit artifacts.

Operationalize LLM‑assisted security for safe uses: vulnerability summarization, triage prioritization, test case generation, and automated fuzzing inputs — all with human review and no direct generation of exploit payloads.
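One way to picture "human review with logging" is a queue that holds model output until a named reviewer releases it. The sketch below is an assumption‑laden illustration: `ReviewQueue`, the task allowlist, and the audit record format are all invented for this example, and the model call itself is left out.

```python
import json
import time
from dataclasses import dataclass, field

# Hypothetical allowlist mirroring the safe uses named in the text.
ALLOWED_TASKS = {"summarize_advisory", "triage", "generate_test_cases", "fuzz_inputs"}


@dataclass
class ReviewQueue:
    """Holds model outputs until a named human reviewer approves them."""
    audit_log: list = field(default_factory=list)
    pending: dict = field(default_factory=dict)
    _next_id: int = 0

    def _record(self, event: str, **details) -> None:
        entry = {"ts": time.time(), "event": event, **details}
        self.audit_log.append(json.dumps(entry))

    def submit(self, task: str, model_output: str) -> int:
        if task not in ALLOWED_TASKS:
            self._record("rejected", task=task)
            raise ValueError(f"task {task!r} is not on the allowlist")
        self._next_id += 1
        self.pending[self._next_id] = model_output
        self._record("submitted", task=task, item=self._next_id)
        return self._next_id

    def approve(self, item_id: int, reviewer: str) -> str:
        # Raises KeyError if the item is unknown or was already released.
        output = self.pending.pop(item_id)
        self._record("approved", item=item_id, reviewer=reviewer)
        return output


queue = ReviewQueue()
item = queue.submit("triage", "CVE summary: likely memory corruption in parser")
released = queue.approve(item, reviewer="analyst-1")
print(released)
print(len(queue.audit_log), "audit entries")
```

Requests outside the allowlist are refused and still leave an audit entry, so attempted misuse is visible to reviewers.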

Anthropic warns that public and open‑source models with safety filters disabled can also produce exploits, expanding who can weaponize patches.

Key questions leaders are already asking

How fast can LLMs turn patches into exploits?

Often within hours. Anthropic’s Mythos produced PoC crashes in minutes and full exploits within roughly 12 hours in their tests.

What does it cost to weaponize patches with LLMs?

Discovery in the Windows tests cost about $2,200 in API credits; building eight full privilege escalations totaled ≈ $15,700 (≈ $2,000 per exploit). These figures exclude human analyst time and external tooling.

Can closed‑source targets be attacked?

Yes. With decompilation tools (e.g., Ghidra) and debug symbols, models identified and exploited Windows kernel vulnerabilities in the experiments.

Are public models a risk?

Yes. Public and open‑source models can produce exploits if filters are insufficient, broadening the pool of potential attackers.

Ethics, safety, and responsible disclosure

These findings are a dual‑use problem. Publicly releasing exploit code, detailed prompts, or step‑by‑step weaponization guidance would increase risk. That’s why responsible disclosure and controlled research programs are essential. Security teams and vendors should share high‑level findings, patch advisories, and mitigations — not exploit artifacts. When using LLMs in security programs, log activity, restrict outputs, and keep human reviewers in the loop.

Executive takeaway and next steps

The timeline defenders have relied on — N‑Day — no longer holds universally. Treat patch→exploit as an N‑Hour race: speed up critical rollouts where safe, but more importantly, reduce the presence and impact of exploitable classes through engineering and hardware mitigations, govern model access, and use LLMs defensively under strict controls. Update risk scoring and vendor communications to reflect the new reality, and rehearse incident response against AI‑accelerated exploit scenarios.
