Databricks Omnigent: Open‑Source Meta‑Harness to Compose, Govern and Share AI Agents

TL;DR: Omnigent is an Apache 2.0‑licensed meta‑harness from Databricks that sits above individual agent SDKs so teams can compose multi‑model workflows, enforce stateful governance (cost caps, approvals, sandboxing), and share live sessions across terminal, web, and mobile. You continue to provide and pay for your own models and inference.

Why engineering teams need a meta‑harness

Modern engineering teams treat AI agents like specialized tools: one for code generation, another for retrieval‑augmented answers, a third for automation. That diversity is powerful but messy—sessions get split across different SDKs, security teams struggle to enforce consistent policies, and product teams lose visibility into cross‑agent behavior. Omnigent aims to be the padded tray that holds the glassware: it doesn’t replace the tools, it keeps them from shattering together.

What Omnigent is (simple definition)

Omnigent is a coordination and governance layer that standardizes agent sessions so disparate agent harnesses (Claude Code, Codex, Pi, OpenAI Agents, etc.) behave like interchangeable components. It runs sandboxed agent sessions, exposes a uniform API for interaction, and provides a server‑side control plane for policies, sharing, and stateful enforcement.

“One orchestrator. Multiple harnesses. One governed session.”

Core architecture in one line

A sandboxed Runner executes agent sessions with a uniform API, while a Server manages policies, sharing, and session state—clients connect via CLI, local web UI, or synchronized terminal/mobile views.

Key features: composition, governance, and collaboration

Agent orchestration: Compose multiple agent harnesses into a single session so you can route tasks across models and SDKs without losing context.
Standardized interface: Sessions expose the same input/output pattern—messages and files in; streamed responses and tool calls out—making harnesses swappable.
Stateful governance: The Server enforces policies that remember session history—cumulative spend, approval history, and paused actions—rather than relying only on prompts.
OS & secret sandboxing (Omnibox): Lock down OS and network access; only inject secrets on approved egress requests to reduce exfiltration risk.
Live sharing: Sessions sync across terminal, local web UI (served by default at localhost:6767), and mobile for collaborative debugging and design.
Cloud sandboxes: Run Runners in cloud providers such as Modal or Daytona so teammates without local setup can join safely.
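
The standardized interface in the list above (messages and files in, streamed responses and tool calls out) can be pictured as a small protocol. This is a minimal sketch under stated assumptions, not Omnigent's actual API: Harness, Event, EchoHarness, ListFilesHarness, and run_session are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol


@dataclass
class Event:
    """One streamed item: a text chunk or a tool call (hypothetical shape)."""
    kind: str      # "text" or "tool_call"
    payload: str


class Harness(Protocol):
    """Hypothetical uniform contract: messages and files in, events out."""
    name: str

    def run(self, messages: list[str], files: dict[str, bytes]) -> Iterator[Event]: ...


@dataclass
class EchoHarness:
    """Stand-in harness that echoes the last message back."""
    name: str = "echo"

    def run(self, messages, files):
        yield Event("text", f"[{self.name}] {messages[-1]}")


@dataclass
class ListFilesHarness:
    """Stand-in harness that emits a tool call, then lists the file names."""
    name: str = "lister"

    def run(self, messages, files):
        yield Event("tool_call", "list_files")
        yield Event("text", ", ".join(sorted(files)))


def run_session(harness: Harness, messages, files) -> list[Event]:
    """Drive any harness through the same call and collect its streamed events."""
    return list(harness.run(messages, files))
```

Because both stand-ins satisfy the same contract, run_session does not need to know which one it is driving. That is the sense in which harnesses become swappable.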

Practical example patterns shipped with Omnigent

Omnigent’s repo includes alpha tools and examples to illustrate common patterns. Two are particularly useful for business owners and engineering leads:

Polly — parallel coding orchestrator

Polly divides a large refactor into parallel sub‑tasks, spins up subagents to work in separate git worktrees, and then merges results under human supervision. For a team refactoring a multi‑module repo, Polly can shorten turnaround by running codegen and unit test updates concurrently, trading extra inference spend for faster cycle time. The governance plane can require human approval before any push (for example, a push that follows an npm install) and cap spend per PR.
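
The fan‑out and supervised merge pattern can be sketched in a few lines. This is an illustration of the pattern only, not Polly's implementation: fake_subagent, fan_out, merge_with_approval, and the worktree labels are hypothetical, and no real git or model calls are made.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubtaskResult:
    worktree: str   # hypothetical label for the git worktree a subagent used
    summary: str


def fake_subagent(module: str) -> SubtaskResult:
    """Placeholder for a real subagent call; returns a canned result."""
    return SubtaskResult(worktree=f"wt-{module}", summary=f"refactored {module}")


def fan_out(modules: list[str],
            worker: Callable[[str], SubtaskResult]) -> list[SubtaskResult]:
    """Run one worker per module concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(worker, modules))


def merge_with_approval(results: list[SubtaskResult],
                        approve: Callable[[SubtaskResult], bool]) -> list[str]:
    """Keep only the worktrees that a reviewer approves for merging."""
    return [r.worktree for r in results if approve(r)]


results = fan_out(["auth", "billing", "search"], fake_subagent)
# The lambda stands in for a human reviewer who rejects one result.
merged = merge_with_approval(results, approve=lambda r: r.worktree != "wt-billing")
print(merged)  # ['wt-auth', 'wt-search']
```

The approval callback is where a governance gate would sit: nothing reaches the merge step unless it returns True.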

Debby — two‑head model comparison

Debby sends the same prompt to two models and displays side‑by‑side answers. That’s useful when evaluating vendor models or performing A/B testing of prompts: product managers see how model styles differ, and engineers can route follow‑ups to the most promising model. Side‑by‑side debate also surfaces hallucinations and divergent tool usage patterns.
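
A two‑model comparison of this kind reduces to sending one prompt to two callables and laying out the answers in columns. The sketch below assumes each model is a plain function from prompt to answer; Model, compare, and side_by_side are hypothetical names, not part of Debby.

```python
import textwrap
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest
from typing import Callable

Model = Callable[[str], str]  # hypothetical: prompt in, answer out


def compare(prompt: str, left: Model, right: Model) -> tuple[str, str]:
    """Send the same prompt to both models concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        a = pool.submit(left, prompt)
        b = pool.submit(right, prompt)
        return a.result(), b.result()


def side_by_side(left: str, right: str, width: int = 30) -> str:
    """Render two answers as two columns, wrapping each to `width` characters."""
    left_lines = textwrap.wrap(left, width) or [""]
    right_lines = textwrap.wrap(right, width) or [""]
    rows = zip_longest(left_lines, right_lines, fillvalue="")
    return "\n".join(f"{a:<{width}} | {b}" for a, b in rows)


# Two toy "models" stand in for real endpoints.
answers = compare("hi", str.upper, lambda p: p[::-1])
print(side_by_side(*answers, width=10))
```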

Governance examples and a short pseudo‑policy

Policies live at the Server layer and can trigger actions such as require approval, pause session, or block network egress. Here’s a plain‑text pseudo policy to illustrate the shape:

name: spend-control
type: spend
soft_limit: $3.00
hard_limit: $5.00
on_soft_limit: require-approval
on_hard_limit: block-session

This form shows how policies can be both human‑centric and stateful: the Server keeps track of cumulative spend and can prompt for approval or halt actions when thresholds are reached.
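
To make the stateful part concrete, here is a minimal sketch of how such a spend policy could be evaluated. It is not Omnigent's policy engine: the SpendPolicy class, its approved flag, the "allow" action, and the choice to trigger at greater‑than‑or‑equal are all assumptions made for illustration. Only the limit values and the two action names come from the pseudo policy above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class SpendPolicy:
    soft_limit: Decimal
    hard_limit: Decimal
    spent: Decimal = Decimal("0")   # cumulative spend remembered across calls
    approved: bool = False          # set once a human approves passing the soft limit

    def record(self, cost: Decimal) -> str:
        """Add a charge to the running total and return the resulting action."""
        self.spent += cost
        if self.spent >= self.hard_limit:
            return "block-session"
        if self.spent >= self.soft_limit and not self.approved:
            return "require-approval"
        return "allow"


policy = SpendPolicy(soft_limit=Decimal("3.00"), hard_limit=Decimal("5.00"))
actions = [policy.record(Decimal("2.00")), policy.record(Decimal("1.50"))]
policy.approved = True  # a human approves continuing past the soft limit
actions += [policy.record(Decimal("1.00")), policy.record(Decimal("1.00"))]
print(actions)  # ['allow', 'require-approval', 'allow', 'block-session']
```

The decision depends on the running total and the earlier approval, not on any single request, which is what separates a stateful policy from a prompt‑level instruction.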

Deployment, cost, and operational notes

Bring your own models: Omnigent coordinates model endpoints and gateways, but you supply and pay for them—Databricks does not host your inference by default.
Install prerequisites: Designed for modern dev stacks (Python 3.12+, Node.js 22 LTS, tmux). A one‑line installer is available for quick experimentation.
Alpha state: The repo is an alpha release. Roadmap items like an always‑on Omnigent Server MCP, enterprise RBAC, and hardened audit trails are planned but not yet shipped.
Cost management: Parallel subagents increase inference cost. Mitigate with model routing, warm pools, or strict spend policies and per‑session caps.

Limitations, risks, and what still needs work

Omnigent fills a real operational gap, but it’s not a turnkey enterprise control plane yet. Key considerations before production rollout:

Security hardening: Omnibox reduces secret exposure, but enterprises will want RBAC, SSO/identity integration, and tamper‑proof audit logs—features slated for future work.
Operational responsibility: Someone must run the always‑on server and maintain Runners. Decide whether this sits with central engineering, platform teams, or a hosted vendor.
Opaque models: Orchestrating third‑party LLMs can expose you to hidden behaviors; policies help, but model governance and legal review remain necessary.
Scalability and cost attribution: Large-scale parallelization requires careful infra planning to avoid runaway spend and to properly attribute costs to teams or projects.

How to evaluate Omnigent for a pilot

Recommended pilot scope: one non‑PII internal team (developer productivity, codegen, or docs automation) for 4–8 weeks. Measure time‑to‑task, cost per task, and frequency of human approvals. Use the pilot to validate policy triggers, secret handling via Omnibox, and the user experience of shared sessions.
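
The three pilot metrics are simple averages over per‑task records. The sketch below assumes you log one record per completed task; TaskRecord, its fields, and summarize are hypothetical names, and the sample numbers are made up.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRecord:
    minutes: float    # wall-clock time to finish the task
    cost_usd: float   # inference spend attributed to the task
    approvals: int    # human approvals requested during the task


def summarize(records: list[TaskRecord]) -> dict[str, float]:
    """Average time, cost, and approval count per task."""
    return {
        "mean_minutes": mean(r.minutes for r in records),
        "cost_per_task": sum(r.cost_usd for r in records) / len(records),
        "approvals_per_task": sum(r.approvals for r in records) / len(records),
    }


sample = [TaskRecord(10.0, 1.0, 0), TaskRecord(20.0, 3.0, 2)]
print(summarize(sample))
```

Tracking the same three numbers before and during the pilot gives a like‑for‑like baseline.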

Checklist for a quick pilot

Identify a single use case (e.g., PR refactors or design brainstorming)
Prepare model endpoints and a budget cap for the pilot
Deploy a Runner (local or cloud sandbox) and connect the Server
Install sample agents (Polly or Debby) and test end‑to‑end workflows
Define 2–3 policies: spend cap, approval gate before external network or git push, and secret injection rules
Log and review sessions for security and performance signals
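
For the secret injection rule in the checklist, the underlying idea is that a credential is attached only when an outbound request targets an approved host. The sketch below illustrates that idea and is not Omnibox's mechanism: APPROVED_HOSTS, SECRETS, prepare_egress, the host names, and the token are all hypothetical.

```python
from urllib.parse import urlsplit

APPROVED_HOSTS = {"api.example.com"}            # hypothetical allowlist
SECRETS = {"api.example.com": "s3cr3t-token"}   # hypothetical secret store


def prepare_egress(url: str, headers: dict[str, str]) -> dict[str, str]:
    """Return headers for an outbound request, adding a secret only for approved hosts.

    Raises PermissionError for any host that is not on the allowlist.
    """
    host = urlsplit(url).hostname
    if host not in APPROVED_HOSTS:
        raise PermissionError(f"egress to {host!r} is not approved")
    # Copy the caller's headers so the secret never leaks back into agent state.
    return {**headers, "Authorization": f"Bearer {SECRETS[host]}"}


agent_headers = {"Accept": "application/json"}
sent = prepare_egress("https://api.example.com/v1/items", agent_headers)
print("Authorization" in sent, "Authorization" in agent_headers)  # True False
```

The agent side only ever holds agent_headers. The credential exists solely in the copy built at the egress boundary, which is the property to verify during the pilot.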

Decision Q&A

What problem does Omnigent solve?

It standardizes and composes multiple agent harnesses into a single orchestrated session with a governance layer, reducing operational friction when teams mix models and SDKs.

How do teams interact with sessions?

Via a CLI (omnigent / omni), a local web UI (default localhost:6767), and synchronized terminal/mobile views for live collaboration.

What governance controls exist today?

Stateful policies (soft/hard spend caps, human approval gates), OS/network sandboxing with Omnibox, and controlled secret injection; enterprise RBAC and audit features are planned.

Is Omnigent production ready for regulated environments?

Not yet. It’s alpha: perform internal pilots first and wait for RBAC, SSO, and hardened audit trails before rolling into regulated workloads.

Do I have to provide models and pay for inference?

Yes. Omnigent coordinates and governs your endpoints; model hosting and inference costs remain your responsibility.

Final thoughts and next steps

Omnigent is a practical experiment in making AI agents composable, observable, and controllable at scale. For teams that want to mix best‑of‑breed models without fragmenting governance, it’s a promising glue layer: not a vendor‑managed model host, but a coordination and policy plane meant to keep multi‑model workflows auditable and safe.

Try the quickstart on the GitHub repo (omnigent‑ai/omnigent) or visit omnigent.ai to review docs and examples, then use the pilot checklist above to get started with policies for spend control, secret rules, and code‑push approvals.