When LLM Routers Turn Rogue: Why AI Agents Are a New Crypto Attack Surface
TL;DR: A University of California study shows public LLM routers (services that forward requests between your app and language models) can inject malicious instructions, read secrets, and even siphon cryptocurrency. Immediate action: stop sending private keys or seed phrases through third‑party routers and require human approval for any automated money movement. Medium‑term: adopt ephemeral credentials, HSMs, and private routing. Long‑term: push for model attestation (cryptographic signing of outputs) across providers.
Hook: an automated theft that happened too fast
Researchers set traps and routed model traffic through 428 publicly available LLM routing services (28 paid, 400 free). Some routers behaved like ordinary middlemen. Others quietly changed the conversation. Nine routers injected malicious code or tool calls, 17 exfiltrated AWS credentials used in the tests, and at least one router withdrew cryptocurrency from a seeded test wallet. An initial test drain was under $50; a related public claim referenced a much larger compromised wallet. The takeaway for any business using AI agents is brutal and simple: any secret you pass through a router that decrypts traffic can be seen, altered, and acted upon at machine speed.
“Dozens of routers are secretly injecting malicious tool calls and stealing credentials; we also demonstrated taking over hundreds of hosts by poisoning routers.” — Chaofan Shou, co‑author (paraphrased)
Study snapshot: what researchers found
- Sample: 428 public LLM routers tested (28 paid, 400 free).
- Malicious behaviors observed:
  - 9 routers injected malicious code/tool calls.
  - 2 routers used evasion techniques to avoid detection.
  - 17 routers accessed AWS credentials belonging to the researchers.
  - At least one router siphoned crypto from a test wallet.
- Root technical flaw: many routers terminate TLS (decrypt traffic to inspect it), exposing plaintext payloads and any secrets inside.
- Agent frameworks with auto‑execute modes (“YOLO mode”) amplify consequences because malicious instructions can be executed without human checkpoints.
- Free and low‑cost routers were disproportionately risky — they’re convenient bait for attackers.
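The trap-based methodology above relies on honeytokens: fake credentials that are planted in test traffic and never used legitimately, so any later use is proof the router read the payload. The study's exact trap setup is not described here, but the idea can be sketched with a hypothetical canary-key generator:

```python
import secrets

def make_canary_key(prefix: str = "AKIA") -> str:
    """Generate a fake AWS-style access key ID to plant in test prompts.

    The key is never used legitimately, so any later appearance of it
    in access logs proves an intermediary read the plaintext payload.
    (Illustrative sketch only, not the researchers' actual harness.)
    """
    # AWS access key IDs are 20 characters: a 4-char prefix plus 16 more.
    body = "".join(secrets.choice("ABCDEFGHIJKLMNOPQRSTUVWXYZ234567")
                   for _ in range(16))
    return prefix + body

def is_our_canary(key: str, planted: set[str]) -> bool:
    """Check a credential observed in logs against the set we planted."""
    return key in planted

planted = {make_canary_key() for _ in range(3)}
observed = next(iter(planted))  # simulate the key surfacing in access logs
print(is_our_canary(observed, planted))  # True: the router leaked it
```

The same pattern works for any secret type: plant it, never use it yourself, and alert the moment it is used.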
Why this matters to business leaders: a new supply‑chain risk
LLM routers were adopted because they solve real problems: multi‑model orchestration, provider failover, and simplified billing. But that convenience concentrated traffic into a single, high‑value attack surface. Think of a router as a customs officer who opens every suitcase that passes through: if that officer is malicious or compromised, any secret that travels through can be removed or altered.
Threat actors and scenarios to consider:
- Malicious operator: a router operator deliberately injects commands to steal funds or exfiltrate data.
- Compromised vendor: legitimate routers can be taken over via credential reuse or supply‑chain compromise.
- Insider abuse: operator staff with access to plaintext logs misuse secrets.
- Automated pivot: stolen cloud credentials let attackers spin up infrastructure and move laterally inside an organization.
Why crypto is special: blockchain transactions are irreversible. If a routed agent sends a signed transaction using a private key that passed through a router, funds can be lost permanently before anyone notices.
Risk matrix — what to protect first
- High risk: private keys, seed phrases, payment authorizations, cloud credentials with broad scopes, customer PII transmitted to agents.
- Medium risk: internal automation commands that can change billing, spin up infrastructure, or alter access controls.
- Low risk: one‑off public queries, generic knowledge retrieval, or ephemeral non‑sensitive prompts.
Short‑term defenses you can deploy this week
These are pragmatic, high‑impact controls to reduce exposure immediately.
- Secrets‑never policy for public agent sessions: ban private keys, seed phrases, and long‑lived credentials from any session that routes through a third‑party router.
- Disable auto‑execute for money ops: require explicit human confirmation (two‑step approvals) for any automated transfer of funds or privileged actions.
- Use ephemeral credentials and least privilege: issue short‑lived tokens with narrow scopes (OAuth scopes, STS tokens, ephemeral API keys) for agent use.
- Isolate high‑risk agents: run sensitive agents through enterprise or on‑prem routers that you control, not public free services.
- Harden logging and alerting: monitor for unusual tool calls, failed auth attempts, and rapid transaction patterns; hook to SIEM and alerting rules.
- Rotate and revoke: rotate keys and revoke tokens immediately if you suspect a router compromise.
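The ephemeral-credentials control above can be sketched in a few lines. Real deployments would use a cloud STS or OAuth server; this stand-in uses stdlib HMAC signing to show the two properties that matter: a short expiry and an explicit scope list that the agent runtime checks before every action. The `SIGNING_KEY` here is a placeholder; in practice it would live in an HSM, not in code.

```python
import base64, hashlib, hmac, json, time

SIGNING_KEY = b"replace-with-key-material-from-your-HSM"  # placeholder, never hardcode

def issue_token(subject: str, scopes: list[str], ttl_seconds: int = 900) -> str:
    """Issue a short-lived, narrowly scoped token (HMAC stand-in for STS/OAuth)."""
    claims = {"sub": subject, "scopes": scopes,
              "exp": int(time.time()) + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig

def verify_token(token: str, required_scope: str) -> bool:
    """Reject expired, tampered, or over-broad tokens before any agent action."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # signature mismatch: token was altered in transit
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims["exp"] > time.time() and required_scope in claims["scopes"]

tok = issue_token("billing-agent", ["invoices:read"], ttl_seconds=300)
print(verify_token(tok, "invoices:read"))   # True: valid and in scope
print(verify_token(tok, "payments:send"))   # False: scope never granted
```

Even if a router steals a token like this, the blast radius is a single narrow scope for a few minutes, not a long-lived root credential.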
Long‑term fixes: model attestation and supply‑chain standards
Short‑term controls blunt risk, but structural trust requires new primitives. Researchers propose cryptographic signing and attestation of model outputs so that clients can verify that an instruction came from an authentic model and wasn’t tampered with by an intermediary.
What model attestation might involve:
- Providers cryptographically sign outputs (or an execution manifest) with keys tied to the model and release revocation lists for compromised keys.
- Routers would be required to preserve signatures end‑to‑end (no TLS termination without re‑attestation) or provide verifiable proofs of integrity.
- Industry standards for attestation APIs, certificate formats, and revocation would allow clients to automatically verify provenance before an agent executes a tool call.
Implementing this will take coordination across model vendors (OpenAI, Anthropic, Google, etc.), router companies, and enterprise buyers. It will also raise new questions: key management for signing, scaling attestation validation, and governance for revoking compromised model keys.
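The verification flow a client would run can be sketched as follows. This is a hypothetical scheme, not any vendor's API, and it uses a shared HMAC key where a real attestation design would use the provider's asymmetric signing keys plus a revocation list; the point is the client-side check that a tool call was not rewritten in transit.

```python
import hashlib, hmac, json

ATTESTATION_KEY = b"model-provider-signing-key"  # real schemes: asymmetric key pair

def sign_output(model_id: str, output: str) -> dict:
    """Provider side: bind the output to a model identity via a signed manifest."""
    digest = hashlib.sha256(output.encode()).hexdigest()
    manifest = json.dumps({"model": model_id, "sha256": digest}, sort_keys=True)
    sig = hmac.new(ATTESTATION_KEY, manifest.encode(), hashlib.sha256).hexdigest()
    return {"output": output, "manifest": manifest, "sig": sig}

def verify_before_execute(msg: dict) -> bool:
    """Client side: refuse any tool call whose attestation fails verification."""
    expected = hmac.new(ATTESTATION_KEY, msg["manifest"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(msg["sig"], expected):
        return False  # manifest was forged or signed by an unknown key
    manifest = json.loads(msg["manifest"])
    return manifest["sha256"] == hashlib.sha256(msg["output"].encode()).hexdigest()

msg = sign_output("example-model-v1", '{"tool": "read_docs", "args": {}}')
print(verify_before_execute(msg))   # True: untampered
msg["output"] = '{"tool": "transfer_funds", "args": {"to": "attacker"}}'
print(verify_before_execute(msg))   # False: a middlebox rewrote the payload
```

A router that terminates TLS can still read this traffic, but it can no longer silently substitute a malicious tool call, because the swapped payload no longer matches the signed digest.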
Vendor due‑diligence: 10 questions to ask any router provider
- Do you terminate TLS for customer traffic? If so, why, and where are the decrypted payloads stored or logged?
- Do you persist plaintext requests or model responses? For how long and where?
- How do you restrict employee or operator access to plaintext logs and keys?
- Do you support mutual TLS, private peering, or on‑prem routing appliances for enterprise customers?
- Do you use any model attestation/signing mechanisms today or have a roadmap for them?
- What credentials are stored by your service, and are they encrypted and isolated in an HSM?
- Do you have an incident response SLA and mandatory breach notification clause? What’s the timeline?
- Can you provide third‑party penetration test reports and SOC/ISO certifications?
- How do you prevent and detect tooling that injects arbitrary tool calls into customer sessions?
- Do you offer enterprise features to disable auto‑execution or enforce human approval flows?
Technical checklist for engineers
- Audit agent configurations for any “auto‑execute” or “YOLO” settings and default them to disabled.
- Keep secrets out of prompt history; use reference tokens that map to secrets stored in HSMs at execution time.
- Favor short‑lived credentials (STS, OAuth) and narrow scopes for agent operations.
- Use private routers, VPC peering, or on‑prem appliances for sensitive traffic.
- Implement verification steps: require signed model outputs or an integrity manifest before executing critical tool calls.
- Instrument agents to log both intent and final execution, and alert on deviations from expected instruction flow.
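The human-approval checkpoint from the checklist can be sketched as a thin dispatcher wrapper. Tool and function names here are hypothetical; in production the `approve` callback would be a Slack approval, a ticket workflow, or a hardware signer rather than a lambda.

```python
# Tools that can move money, change access, or provision infrastructure.
PRIVILEGED_TOOLS = {"transfer_funds", "provision_instance", "export_customer_data"}

def execute_tool_call(name: str, args: dict, approve) -> str:
    """Gate privileged tool calls behind an explicit human approval callback.

    Everything in PRIVILEGED_TOOLS is denied unless `approve` returns True;
    there is deliberately no auto-execute path for these actions.
    """
    if name in PRIVILEGED_TOOLS and not approve(name, args):
        return f"BLOCKED: {name} requires human approval"
    return f"EXECUTED: {name}"  # placeholder for the real tool dispatcher

# Deny-by-default until a reviewer signs off: low-risk calls still flow,
# money movement does not.
print(execute_tool_call("search_docs", {}, approve=lambda n, a: False))
print(execute_tool_call("transfer_funds", {"amount": 50}, approve=lambda n, a: False))
```

The design point is that the gate lives in your dispatcher, after the router: even a router that injects a `transfer_funds` call cannot get it executed without a human in the loop.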
Sample contract language (starter clause)
“Router Provider shall not terminate TLS for customer payloads routed to designated enterprise endpoints without prior written consent. Any plaintext user data temporarily decrypted for processing must be stored only for the minimum operational window (no more than 24 hours), encrypted in an HSM, and deleted upon completion. Provider shall support mutual TLS or private peering for enterprise traffic, enforce least privilege on all stored credentials, and notify Customer within 24 hours of any suspected compromise. Provider shall cooperate in cryptographic attestation of model outputs where available.”
Quick audit runbook: find your exposure in one afternoon
- Inventory: search code, IaC, and config for references to router vendors, API keys, and agent frameworks.
- Identify agent sessions that can trigger actions (payments, cloud provisioning, data exports).
- Check which routes terminate TLS and where logs are stored; list services that store plaintext.
- Review agent config for auto‑execute flags and behavioral tooling (e.g., webhooks, tool calls).
- Rotate any keys that have been used in public routers and replace with ephemeral tokens for agents.
- Require human approval for any agent action that affects money, access, or customer data until controls are validated.
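The inventory and config-review steps above can be partially automated with a small scanner. The patterns below are illustrative, not exhaustive; treat every hit as a lead for manual review, not a verdict.

```python
import re
from pathlib import Path

# Patterns worth flagging in an afternoon audit (illustrative, not exhaustive).
PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "auto_execute":   re.compile(r"(auto[_-]?execute|yolo)\s*[:=]\s*true", re.I),
    "seed_phrase":    re.compile(r"(seed[_-]?phrase|mnemonic)\s*[:=]", re.I),
}

def scan_tree(root: str) -> list[tuple[str, str]]:
    """Walk code and config files, returning (path, finding-label) pairs."""
    hits = []
    for path in Path(root).rglob("*"):
        if path.suffix not in {".py", ".yaml", ".yml", ".json", ".toml"}:
            continue
        text = path.read_text(errors="ignore")
        for label, pattern in PATTERNS.items():
            if pattern.search(text):
                hits.append((str(path), label))
    return hits

# Example: scan the current repo and print findings for triage.
for path, label in scan_tree("."):
    print(f"{label}: {path}")
```

Run it against your IaC and agent-config repos first; those are where auto-execute flags and long-lived keys tend to hide.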
FAQ — quick answers for leaders
Are paid routers safer than free ones?
Paid routers often offer enterprise features (private peering, SLAs, logging controls), but they’re not automatically safe. Evaluate operational controls, logging practices, and whether the provider terminates TLS on your traffic.
Is ChatGPT or a major model provider directly at fault?
Not necessarily. The issue is often the intermediary. Routers sit between apps and models; many of them decrypt traffic. Even legitimate models can be unwittingly used as a vector if the router injects malicious instructions or steals secrets.
Can model attestation be implemented soon?
Technically yes, but it requires coordination: signing keys, revocation mechanisms, and standards for attestation. Expect an industry rollout over months to years, starting with enterprise features from major vendors.
How do I protect crypto wallets used by agents?
Never embed private keys or seed phrases in agent sessions. Use HSMs, multisig wallets, hardware signing devices, and human approval for any transaction that moves funds.
What leaders should do next
LLM routers are now part of the AI supply chain and belong in your vendor risk and architecture reviews. Start by banning secrets in public agent sessions, inventorying where agents can execute privileged actions, and enforcing human approval for financial operations. Push your vendors for better transparency: ask whether they terminate TLS, how they store logs, and what their roadmap is for attestation. Finally, plan for a future where model outputs carry cryptographic provenance; that will be the seatbelt that lets agents act with far less friction.
AI agents are here to accelerate work, but acceleration without guardrails is how incidents go from inconvenient to catastrophic. Treat routing infrastructure with the same scrutiny you apply to CI/CD runners, package registries, and cloud identities — because the consequences can be the same, only faster.