Secure AI agents on Amazon Bedrock AgentCore with Cedar policies and Lambda interceptors
TL;DR: Use Cedar policies for fast, auditable allow/deny decisions and Lambda REQUEST/RESPONSE interceptors for runtime validation, token vending, and payload transformation. Combine them to enforce dynamic controls (like data-residency) without sacrificing auditability or introducing unnecessary latency.
Why agent governance matters
AI agents—LLM-powered assistants that pick tools and arguments at runtime—are moving from experiments to business-critical workflows. That flexibility creates a new failure mode: one misdirected tool call can leak sensitive data, misuse credentials, or break compliance rules such as data residency. A governance strategy that treats agents like users or services misses the key difference: agents make unpredictable, contextual calls during execution. The right approach blends deterministic policy with runtime context enrichment.
“Policies provide the immutable firewall; interceptors are the field agents that gather extra facts before the firewall makes its call.”
Core components (quick definitions)
- Amazon Bedrock AgentCore Gateway: the gateway that front-doors tool invocations from agents and exposes policy and interceptor hooks.
- Cedar: a declarative policy language and engine that evaluates allow/deny rules over principal, action, resource, and contextual attributes. Decisions are deterministic and auditable.
- AWS Lambda interceptors (REQUEST / RESPONSE): serverless hooks that run before and after policy evaluation to enrich requests, call external services, exchange tokens, or redact responses.
- MCP (Model Context Protocol) server: backend exposing tools (for example, query_claims, get_claim_details) that agents invoke through the Gateway.
- STS and Cognito: Cognito issues JWTs for authentication; Lambda interceptors can call STS:AssumeRole to issue short-lived, tenant-scoped credentials.
- Control plane & observability: Cedar policy decisions are recorded as audit entries in Amazon CloudWatch; interceptors should log their own lookups and token operations so the trace is complete.
Problem vignette: an insurance lakehouse gone wrong
An agent asked by a claims adjuster to “summarize claims in Q1” accidentally called get_claim_details and returned PII for EU customers. The root causes were twofold: the agent selected the wrong tool at runtime, and downstream services received broad credentials allowing unrestricted data access. The fix is layered: stop bad calls with policy, prevent credential misuse with token vending, and ensure policies can act on runtime attributes (like geography) by enriching the request in-flight.
Design patterns for securing AI agents
Split responsibilities: Cedar policies are ideal for static, high-frequency checks that need immediate enforcement and auditability. Lambda interceptors are ideal for dynamic enrichment, external lookups, and token vending. Use one of three patterns depending on requirements:
1. Policy only
Move static, high-frequency rules into Cedar policies. Examples: forbid the “policyholders” group from calling get_claims_summary; block any call to admin tools from non-admin principals.
- Pros: near-zero latency (typically sub-millisecond), automatic audit logs to CloudWatch, immediate emergency forbids via the control plane.
- Cons: cannot call external services, cannot mutate payloads, cannot vend tokens.
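To make the policy-only pattern concrete, the sketch below models the decision semantics Cedar uses: a request is denied unless at least one permit matches, and any matching forbid overrides every permit. It is a toy model in Python, not the Cedar engine. The `Rule` type, the `decide` helper, the request shape, and the "adjusters" group are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, str]  # hypothetical shape: {"group": ..., "action": ...}


@dataclass(frozen=True)
class Rule:
    effect: str  # "permit" or "forbid"
    matches: Callable[[Request], bool]


def decide(rules: List[Rule], request: Request) -> str:
    """Default deny; any matching forbid overrides all matching permits."""
    matched = [rule.effect for rule in rules if rule.matches(request)]
    if "forbid" in matched:
        return "DENY"
    return "ALLOW" if "permit" in matched else "DENY"


RULES = [
    # Hypothetical baseline: both groups may call tools.
    Rule("permit", lambda r: r["group"] == "adjusters"),
    Rule("permit", lambda r: r["group"] == "policyholders"),
    # The static rule from the text: policyholders may not call get_claims_summary.
    Rule("forbid", lambda r: r["group"] == "policyholders"
         and r["action"] == "get_claims_summary"),
]

print(decide(RULES, {"group": "policyholders", "action": "get_claims_summary"}))  # DENY
print(decide(RULES, {"group": "policyholders", "action": "query_claims"}))        # ALLOW
print(decide(RULES, {"group": "unknown", "action": "query_claims"}))              # DENY
```

Because the outcome depends only on the rules and the request, the same input always yields the same decision, which is what makes these checks easy to audit.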
2. Interceptor only
Let a REQUEST interceptor vet inputs, perform token exchange (Cognito JWT → sts:AssumeRole), and inject a tenant-scoped credential and metadata into the outbound call. This addresses the confused-deputy problem, in which a privileged service is tricked into using its own broad permissions on behalf of a less-privileged caller.
- Pros: can do external lookups, transform payloads, and issue scoped credentials.
- Cons: introduces Lambda latency and operational overhead; auditability requires explicit logging in the interceptor code.
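The token-vending step might look like the following sketch. It assumes the caller's JWT has already been validated and that `sts_client` is an STS client such as `boto3.client("sts")`; the client is passed in so the function can be exercised with a stub. The function name, the `agent-` session prefix, and the `tenant_id` tag key are hypothetical.

```python
import re
from typing import Any, Dict


def vend_tenant_credentials(sts_client: Any, role_arn: str,
                            tenant_id: str, user_id: str) -> Dict[str, Any]:
    """Exchange a validated identity for short-lived, tenant-tagged credentials."""
    # RoleSessionName accepts only [\w+=,.@-] and at most 64 characters,
    # so replace anything else and truncate.
    session_name = re.sub(r"[^\w+=,.@-]", "-", f"agent-{user_id}", flags=re.ASCII)[:64]
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=900,  # 15 minutes, the shortest lifetime STS accepts
        Tags=[{"Key": "tenant_id", "Value": tenant_id}],  # hypothetical tag key
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

Passing session tags requires the role's trust policy to allow `sts:TagSession`. Downstream IAM or Lake Formation rules can then key off the tag, so a credential vended for one tenant is of no use against another tenant's data.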
3. Policy + Interceptor (recommended for many production scenarios)
Use the REQUEST interceptor to fetch dynamic attributes (tenant geography, compliance flags, recent policy changes) and inject them into the request context. Cedar then evaluates forbids/permits using that enriched context. RESPONSE interceptors can redact or filter outputs before they return to the agent.
- This combines the best of both worlds: dynamic checks and deterministic, auditable enforcement.
- Remember: the REQUEST interceptor executes before Cedar evaluation; the RESPONSE interceptor runs after the policy decision and before the result returns to the agent.
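The redaction half of this pattern can be small. The sketch below walks a JSON-like tool result and masks a fixed set of fields; the field names in `SENSITIVE_KEYS` and the `[REDACTED]` marker are assumptions for illustration, not part of the demo.

```python
from typing import Any, FrozenSet

# Hypothetical field names; a real deployment would derive these from a data catalog.
SENSITIVE_KEYS: FrozenSet[str] = frozenset({"ssn", "date_of_birth", "email"})


def redact(value: Any, sensitive: FrozenSet[str] = SENSITIVE_KEYS) -> Any:
    """Return a copy of a JSON-like value with sensitive fields masked."""
    if isinstance(value, dict):
        return {key: "[REDACTED]" if key in sensitive else redact(item, sensitive)
                for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    return value  # scalars pass through unchanged


tool_result = {"claims": [{"claim_id": "C-1", "email": "a@example.com",
                           "holder": {"ssn": "000-00-0000"}}]}
print(redact(tool_result))
```

The function returns a new structure instead of mutating its input, so the original payload stays available for comparison while debugging. Log which fields were masked, never their values.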
Insurance lakehouse demo — architecture and numbered request flow
Core components used in the demo:
- Storage: Amazon S3 Tables (Apache Iceberg)
- Query: Amazon Athena
- Fine-grained access: AWS Lake Formation
- Auth: Amazon Cognito (JWTs)
- Metadata/mappings: Amazon DynamoDB
- Gateway: Amazon Bedrock AgentCore (Cedar + interceptors)
- Observability: AgentCore Observability + Amazon CloudWatch
Request flow (numbered):
1. Agent issues a tool call (e.g., get_claim_details) to AgentCore Gateway with caller JWT.
2. REQUEST interceptor extracts the subject from the JWT, looks up tenant and geography in DynamoDB, and optionally calls STS:AssumeRole to get tenant-scoped credentials.
3. Interceptor injects enriched context fields (for example, geography: EU) and tenant-scoped credentials into the tool request.
4. Cedar policy evaluates allow/deny rules over principal, action, resource, and the enriched context (geography).
5. If allowed, Gateway forwards the request to the MCP server/tool with scoped credentials; Athena and Lake Formation apply fine-grained row/column controls at query time.
6. RESPONSE interceptor can redact sensitive fields or filter the response before returning to the agent.
7. All events (JWT validation, interceptor logs, Cedar decisions, and tool outputs) are emitted to CloudWatch for traceability.
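The ordering in this flow can be sketched as a plain function. This is a model of the sequence described above, not the Gateway's implementation: `handle_tool_call`, the request shape, and the stage callables are all hypothetical, and the `trace` list stands in for the CloudWatch events.

```python
from typing import Any, Callable, Dict, List

Request = Dict[str, Any]
Result = Dict[str, Any]


def handle_tool_call(request: Request,
                     request_interceptor: Callable[[Request], Request],
                     policy: Callable[[Request], bool],
                     tool: Callable[[Request], Result],
                     response_interceptor: Callable[[Result], Result],
                     trace: List[str]) -> Result:
    """Run the stages in the order the flow describes, recording each one."""
    enriched = request_interceptor(request)   # enrichment happens first
    trace.append("request_interceptor")
    allowed = policy(enriched)                # policy sees the enriched context
    trace.append("policy:allow" if allowed else "policy:deny")
    if not allowed:
        return {"error": "denied by policy"}  # the tool is never invoked
    result = tool(enriched)
    trace.append("tool")
    filtered = response_interceptor(result)   # last stage before the agent
    trace.append("response_interceptor")
    return filtered


trace: List[str] = []
out = handle_tool_call(
    {"action": "get_claim_details", "context": {}},
    request_interceptor=lambda r: {**r, "context": {**r["context"], "geography": "EU"}},
    policy=lambda r: r["context"].get("geography") != "EU",
    tool=lambda r: {"claim_id": "C-1"},
    response_interceptor=lambda res: res,
    trace=trace,
)
print(out, trace)  # {'error': 'denied by policy'} ['request_interceptor', 'policy:deny']
```

The denied path returns before the tool runs, so a forbidden call never reaches the data layer.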
Concrete artifacts
Short Cedar policy example (anonymized):
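// Deny individual claim details when the interceptor-injected geography is "EU".
// Entity type names (Group, Action) are illustrative; adapt them to your schema.
forbid (principal in Group::"policyholders", action == Action::"get_claim_details", resource)
when { context has geography && context.geography == "EU" };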
REQUEST interceptor pseudocode (logic outline):
1. Validate incoming Cognito JWT and extract user_id.
2. Look up the caller's tenant and geography: (tenant, geography) = DynamoDB.get(user_id).
3. If tenant needs scoped creds: call STS:AssumeRole(role_arn_for_tenant).
4. Inject context: request.context.geography = geography; request.headers.Authorization = temp_creds.
5. Forward enriched request to policy engine / MCP tool.
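The outline above could be fleshed out roughly as follows. This is a sketch under several assumptions: the request shape (`jwt`, `context`, `credentials` keys) is hypothetical and is not AgentCore's actual interceptor contract; the JWT signature is assumed to have been verified upstream, so the code only decodes the payload; and the DynamoDB lookup and STS call are injected as plain callables (`lookup_tenant`, `vend_credentials`) so the logic can be tested without AWS.

```python
import base64
import json
from typing import Any, Callable, Dict, Optional


def jwt_claims(token: str) -> Dict[str, Any]:
    """Decode a JWT payload WITHOUT verifying the signature.

    Assumption: signature, issuer, and expiry were already checked upstream.
    """
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def enrich_request(
    request: Dict[str, Any],
    lookup_tenant: Callable[[str], Optional[Dict[str, str]]],
    vend_credentials: Optional[Callable[[str, str], Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Steps 1-4 of the outline: identify the caller, look up, vend, inject."""
    user_id = jwt_claims(request["jwt"])["sub"]
    record = lookup_tenant(user_id)  # e.g. a DynamoDB GetItem keyed by user_id
    if record is None:
        # Fail closed: an unknown caller gets no context and no credentials.
        raise PermissionError("no tenant mapping for caller")
    enriched = dict(request)
    enriched["context"] = {**request.get("context", {}),
                           "tenant_id": record["tenant_id"],
                           "geography": record["geography"]}
    if vend_credentials is not None:
        enriched["credentials"] = vend_credentials(record["tenant_id"], user_id)
    return enriched
```

Failing closed on a missing mapping matters here: if the lookup silently returned no geography, a rule that forbids on geography == "EU" would never match and the call would go through.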
Example CloudWatch entries to log from the interceptor:
- JWT validation success/failure with user_id (no PII)
- DynamoDB lookup result: tenant_id, geography
- STS AssumeRole success/failure (role_arn, temp creds metadata)
- Request enrichment events and any payload mutations
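One way to emit entries like these is a single JSON line per event with an explicit field allowlist, so secrets and PII cannot leak into logs by accident. In the sketch below, the field names, the `audit_record` helper, and the `request_id` correlation key are assumptions. In a Lambda function, output written through the standard `logging` module ends up in CloudWatch Logs.

```python
import json
import logging
from typing import Any, Dict

logger = logging.getLogger("interceptor")  # hypothetical logger name
logger.setLevel(logging.INFO)

# Only these keys are ever written; anything else is silently dropped.
ALLOWED_FIELDS = frozenset({"user_id", "tenant_id", "geography", "role_arn",
                            "outcome", "credential_expiration", "mutated_fields"})


def audit_record(event: str, request_id: str, **fields: Any) -> str:
    """Build one JSON log line, keeping only allowlisted fields."""
    record: Dict[str, Any] = {"event": event, "request_id": request_id}
    record.update({k: v for k, v in fields.items() if k in ALLOWED_FIELDS})
    return json.dumps(record, sort_keys=True, default=str)


logger.info(audit_record("dynamodb_lookup", "req-123",
                         tenant_id="tenant-a", geography="EU"))
logger.info(audit_record("sts_assume_role", "req-123",
                         role_arn="arn:aws:iam::123456789012:role/tenant-a",
                         outcome="success",
                         secret_access_key="never-logged"))  # dropped by the allowlist
```

Carrying the same `request_id` through every entry is what lets you join interceptor logs with the policy decision for the same call.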
Tradeoffs and operational considerations
Balance these factors when designing agent governance:
- Latency: push high-frequency, simple checks into Cedar (near-zero latency). Keep heavy external lookups in interceptors only when necessary. Consider provisioned concurrency for Lambda hot-paths.
- Auditability: Cedar decisions are automatically logged; interceptors must be instrumented explicitly to produce a complete audit trail.
- Emergency control: Cedar rules can be toggled or updated via the control plane for immediate forbids. Interceptor changes require code deploys and are slower to update.
- Complexity and maintainability: too many interceptors or ad-hoc logic fragments increase operational risk. Favor declarative policies where possible and keep interceptor behavior small and well-tested.
- Security: protect temporary credentials, enforce least privilege, rotate roles and secrets, and avoid logging sensitive PII. Threat-model interceptors as critical security components.
Operational playbook: rollout, testing, and governance
Practical steps to move from pilot to production:
- Policy-first mindset: express as many rules as you can in Cedar. Use policy for high-frequency, safety-critical checks.
- LOG_ONLY staging: run Cedar in LOG_ONLY to observe how rules behave in production traffic and iterate until logs show expected decisions.
- Interceptor minimalism: limit interceptors to attribute enrichment, token vending, and response redaction. Avoid embedding complex business logic.
- Policy lifecycle: implement namespaces, change reviews, automated tests that assert expected allow/deny outcomes, and staged rollout: dev → staging (LOG_ONLY) → prod (ENFORCE).
- Monitoring & SLOs: track average interceptor latency, percent-denied calls, cold-start rate, and audit event completeness. Set alert thresholds and an error budget for interceptor failures.
- Incident playbook: have a rollback mechanism (control-plane forbid toggle) and a runbook for token revocation and emergency policy changes.
- Cost planning: measure Lambda execution frequency and duration; push cheap checks into Cedar to save on Lambda costs.
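The monitoring figures named in this playbook are simple to compute once each call is recorded. The sketch below assumes a hypothetical `CallRecord` extracted from your logs; in practice these numbers would more likely come from CloudWatch metrics, so treat it as a statement of the definitions, not a monitoring implementation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class CallRecord:  # hypothetical per-call record parsed from logs
    interceptor_ms: float
    denied: bool
    cold_start: bool


def summarize(records: List[CallRecord]) -> Dict[str, float]:
    """Average interceptor latency, percent of calls denied, cold-start rate."""
    if not records:
        raise ValueError("no records to summarize")
    total = len(records)
    return {
        "avg_interceptor_ms": mean(r.interceptor_ms for r in records),
        "percent_denied": 100.0 * sum(r.denied for r in records) / total,
        "cold_start_rate": sum(r.cold_start for r in records) / total,
    }


sample = [
    CallRecord(10.0, denied=False, cold_start=True),
    CallRecord(20.0, denied=True, cold_start=False),
    CallRecord(30.0, denied=False, cold_start=False),
    CallRecord(40.0, denied=False, cold_start=False),
]
print(summarize(sample))
```

A sudden rise in `percent_denied` after a policy change is a useful early signal that a rule is broader than intended.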
Key questions and quick answers
- How should I block unauthorized tool calls from an AI agent?
Use Cedar policies for deterministic, auditable allow/deny decisions. Reserve interceptors for what policies can’t do—external lookups, token exchange, and payload mutation. Combine both when you need dynamic context with strong auditability.
- When is an interceptor necessary?
When you need act-on-behalf token exchange (Cognito JWT → sts:AssumeRole), external metadata lookups (tenant geography), payload enrichment, or response redaction. Policies cannot perform external calls or mutate requests.
- Can I audit every decision?
Yes. Cedar emits structured decisions to CloudWatch. Interceptors must log their own lookups, token exchanges, and mutations; correlate these logs with policy decisions for end-to-end traceability.
- How do I implement data-residency controls like “EU users cannot access individual claim records”?
Have a REQUEST interceptor fetch geography from a trusted store (for example, DynamoDB), inject geography into the request context, and write a Cedar forbid rule that denies the action when geography == “EU”. Start in LOG_ONLY to validate before enforcing.
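The difference between the two modes can be stated in a few lines. The sketch below is a model of the behavior this article describes (LOG_ONLY records what the policy would have decided but lets the call through; ENFORCE blocks on a deny). The `apply_decision` helper and the log record shape are hypothetical.

```python
from typing import Dict, List

MODES = ("LOG_ONLY", "ENFORCE")


def apply_decision(mode: str, policy_allows: bool, log: List[Dict[str, str]]) -> bool:
    """Return True if the call should proceed, logging the decision either way."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    log.append({"mode": mode, "decision": "ALLOW" if policy_allows else "DENY"})
    if mode == "ENFORCE":
        return policy_allows
    return True  # LOG_ONLY: the decision is recorded but never blocks the call


log: List[Dict[str, str]] = []
print(apply_decision("LOG_ONLY", False, log))  # True: would-be deny is only logged
print(apply_decision("ENFORCE", False, log))   # False: the deny is enforced
print(log)
```

Reviewing the would-be DENY entries collected under LOG_ONLY shows which real calls the rule will start blocking once you switch to ENFORCE.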
Pre-launch checklist for agent governance
- Start Cedar policies in LOG_ONLY; review CloudWatch logs to validate behavior.
- Move static checks into Cedar to reduce latency and cost.
- Use REQUEST interceptors only for lookups, token vending, or necessary mutations.
- Instrument interceptors extensively: log lookups, role assumptions, and payload changes (avoid PII in logs).
- Establish policy namespaces, CI tests for policy behavior, and a staged deployment pipeline.
- Monitor latency, deny rates, cold-starts, and audit completeness; set SLOs.
- Have an emergency control-plane procedure that can put a forbid policy into effect immediately.
Resources and next steps
- AgentCore samples on GitHub — sample code, MCP server, and CDK deployment scripts.
- Cedar policy documentation — syntax and policy patterns.
- AWS Lake Formation — row/column level access control for lakehouses.
- Amazon CloudWatch — centralized observability and logging.
- Amazon Cognito and AWS STS — authentication and temporary credential patterns.
Layering Cedar policies and Lambda interceptors gives teams a practical path to production-ready agent governance: policies are the fast, auditable guardrails; interceptors provide the dynamic context and token handling agents need to act on behalf of users safely. Start with policy-first design and add interceptors where external context or least-privilege credentialing is required. Run policies in LOG_ONLY while you test, instrument interceptors for observability, and adopt a staged rollout model to minimize surprises.
Authors and contributors behind the demo: Bharathi Srinivasan (Generative AI Data Scientist), Subha Kalia (Senior Technical Account Manager), and Renya Kujirada (AI/ML Specialist Solutions Architect) — bringing a responsible, operational lens to agent governance and deployment.
Ready to try it? Clone the samples repository, deploy the CDK demo, and run the demo workflow. Use LOG_ONLY mode to iterate on Cedar rules, add a minimal REQUEST interceptor to inject geography, and watch the CloudWatch logs to confirm decisions before flipping to ENFORCE.