Mira Murati Returns: Thinking Machines Unveils Tinker API for Real‑Time Multimodal AI Agents

Mira Murati’s Return Signals a Shift to Real‑Time, Multimodal AI for Business

Executive TL;DR

Mira Murati, now CEO of Thinking Machines Lab, is pitching “interaction models” and the Tinker API — tools for continuous multimodal AI agents that listen, watch and respond in real time.
These systems process audio, text and video in ~200‑millisecond slices to capture conversational texture — a shift from turn‑based ChatGPT‑style prompts to real‑time AI automation for sales, support and meetings.
Boards should treat governance, vendor strategy and talent economics as linked priorities: pilot a focused use case, require interoperability and prepare controls for rapid scaling.

Why this matters to business leaders

Mira Murati’s public reappearance puts a spotlight on a practical pivot: AI is moving from discrete chat prompts to continuous, multimodal interactions that can augment live human workflows. For sales, customer service and compliance teams, that means AI agents that can interject mid‑conversation, surface relevant documents as a call unfolds, and enforce policy in real time — not after the fact.

What Thinking Machines Lab is building

Thinking Machines Lab has been operating quietly: raising capital, hiring engineers and shipping the Tinker API — a platform for fine‑tuning open‑source foundation models. Murati told Bloomberg that the lab is focused on a new class of “interaction models.”

Interaction models treat conversation as a continuous stream of audio, text and video rather than discrete question/answer turns. Instead of waiting for a typed prompt, these models process inputs in roughly 200‑millisecond windows, fast enough to pick up interruptions, mid‑sentence corrections and the pauses that carry meaning in human talk. Two hundred milliseconds is roughly the length of the typical gap between speakers' turns in everyday conversation, so a response on that timescale feels immediate to participants.
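
To make the windowing idea concrete, here is a minimal Python sketch that cuts a buffer of audio samples into 200‑millisecond slices. It is an illustration only and does not describe how Thinking Machines implements its models; the function name slice_stream and the 16 kHz sample rate are assumptions chosen for the example.

```python
from typing import Iterator, Sequence

WINDOW_MS = 200  # slice length cited in the article


def slice_stream(samples: Sequence[int], sample_rate_hz: int,
                 window_ms: int = WINDOW_MS) -> Iterator[Sequence[int]]:
    """Yield consecutive fixed-length windows from a buffer of audio samples.

    The final window may be shorter if the buffer does not divide evenly.
    """
    window_len = sample_rate_hz * window_ms // 1000  # samples per window
    if window_len <= 0:
        raise ValueError("window must contain at least one sample")
    for start in range(0, len(samples), window_len):
        yield samples[start:start + window_len]


# One second of silent audio at an assumed 16 kHz sample rate.
one_second = [0] * 16_000
windows = list(slice_stream(one_second, sample_rate_hz=16_000))
print(len(windows), len(windows[0]))  # 5 3200
```

A production system would read from a live capture device and align audio, video and text windows on a shared clock; this sketch covers only the slicing arithmetic.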

“My decisions during the November 2023 crisis were guided by protecting the mission and the team, which made those choices feel obvious even amid chaos,” Murati told Bloomberg.

How this changes AI agents and AI automation

Turn‑based chat is great for many tasks, but it flattens real conversation. Continuous multimodal agents can do things turn‑based systems can’t: interrupt politely with a legal caveat, queue a product demo at the exact moment a prospect expresses interest, or mute a microphone when a sensitive number appears on screen. For businesses, that unlocks a new category of automation — real‑time augmentation instead of post‑hoc analysis.

Practical implications:

Sales copilots that surface contracts, pricing comparisons and objection‑handling lines during live calls.
Contact centers that reduce handle time by detecting signs of escalation and routing calls to specialists before customers have to ask.
Compliance systems that flag and redact sensitive information on the fly during video conferences.
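
As a rough illustration of on‑the‑fly redaction, the Python sketch below masks card‑like digit runs in a chunk of transcript text. The pattern, the placeholder string and the function name redact are all hypothetical; a real compliance system would rely on vetted detectors and would also need to handle numbers that straddle two chunks.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
# It starts and ends on a digit so surrounding whitespace is left untouched.
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

PLACEHOLDER = "[REDACTED]"  # assumed masking token


def redact(chunk: str) -> str:
    """Replace card-like digit runs in one transcript chunk with a placeholder."""
    return CARD_LIKE.sub(PLACEHOLDER, chunk)


print(redact("my card is 4111 1111 1111 1111 thanks"))
# my card is [REDACTED] thanks
```

Short numbers such as order IDs pass through unchanged because they fall below the 13‑digit minimum.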

Market context: talent, stacks and consolidation

The industry is consolidating while competition for engineers and researchers intensifies. Thinking Machines positions itself in the open‑source fine‑tuning niche, arguing that companies want control over model behavior without being locked into proprietary stacks.

Compensation has ballooned at the frontier. Reports of multi‑million and even nine‑figure packages have become part of recruiting dynamics; Murati described departures from her team as the kind of concentrated churn that happens when frontier labs scale quickly. She framed it as compressed volatility rather than systemic failure, noting that money alone rarely explains every exit.

Other labs — OpenAI, Anthropic, and ventures associated with high‑profile founders — continue to define competitive benchmarks. As players differentiate on features, data governance and prebuilt integrations, vendors will compete on interoperability as much as raw model performance.

Governance: power, checks and institutional design

“Too many consequential decisions are concentrated in too few hands; governance structures need attention beyond individual character,” Murati told Bloomberg.

Her point is strategic: businesses and boards should build systems that make responsible decisions resilient to personnel changes. Relying on the virtue of executives or star engineers is brittle when models can alter markets and customer outcomes in minutes. Effective governance should include delegation frameworks, incident response playbooks, independent audits and red‑team testing before production rollout.

Risks and mitigations

Latency and infrastructure: Continuous multimodal agents require low‑latency compute and potentially edge deployment to hit sub‑200 ms responsiveness. Mitigation: prototype with hybrid cloud/edge and measure real latency across real networks.
Privacy and compliance: Real‑time audio and video raise data residency and consent issues. Mitigation: default to data minimization, encrypt data in transit, and apply real‑time redaction where required.
Vendor lock‑in: Fine‑tuning via APIs can still create dependencies. Mitigation: insist on model provenance, exportable checkpoints and exit clauses in contracts.
Human trust: Agents that interrupt or act autonomously risk alienating customers or employees. Mitigation: design for graceful, reversible interventions and clear disclosure when AI intercedes.
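
For the latency mitigation above, a pilot team could summarize measured round‑trip times against the 200 ms target along these lines. This is a sketch under stated assumptions: the sample values are invented for illustration, and the helper names percentile and latency_report are hypothetical.

```python
import math
from typing import Dict, Sequence

BUDGET_MS = 200.0  # responsiveness target discussed in the article


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_report(samples_ms: Sequence[float],
                   budget_ms: float = BUDGET_MS) -> Dict[str, float]:
    """Summarize median, tail latency and the share of samples over budget."""
    over = sum(1 for s in samples_ms if s > budget_ms)
    return {
        "p50_ms": percentile(samples_ms, 50),
        "p95_ms": percentile(samples_ms, 95),
        "over_budget_share": over / len(samples_ms),
    }


# Invented measurements, in milliseconds, standing in for real network traces.
measured = [120, 150, 180, 190, 210, 140, 160, 170, 450, 130]
report = latency_report(measured)
print(report)  # {'p50_ms': 160, 'p95_ms': 450, 'over_budget_share': 0.2}
```

Reporting the tail (p95) alongside the median matters here because a single slow response in a live conversation is more noticeable than a slightly slower average.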

Concrete pilot roadmap: 90 days to signal value

Objective: Validate a continuous multimodal use case (sales enablement or contact center assist) with measurable impact on handle time and customer satisfaction.
Scope: One team, one workflow, low regulatory exposure. Limit initial rollout to internal users or opt‑in customers.
Data & privacy checklist: Confirm consent, retention rules, encryption and data deletion processes. Ensure PII redaction paths exist.
Resources: Product manager, ML engineer, security lead, and an integration engineer. Budget for cloud/edge compute and modest third‑party licensing.
Metrics to track: Time saved per call, first‑contact resolution rate, CSAT lift, false‑positive intervention rate, and model drift indicators.
Estimated timeline & cost: 90 days; small pilot budgets typically start in the mid‑to‑high five‑figure range and scale with compute needs. Edge‑heavy scenarios will push costs higher.
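
The metrics in the roadmap above can be computed from per‑call records with very little code. The Python sketch below assumes a hypothetical CallRecord layout and invented sample data; it covers time saved per call, first‑contact resolution, CSAT lift and the false‑positive intervention rate, and leaves model drift indicators out because they depend on the model being piloted.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class CallRecord:  # hypothetical record layout for illustration
    handle_seconds: float
    resolved_first_contact: bool
    csat: int              # survey score, assumed 1-5 scale
    interventions: int     # times the assistant stepped in
    false_positives: int   # interventions the agent judged unnecessary


def summarize(pilot: List[CallRecord], baseline: List[CallRecord]) -> Dict[str, float]:
    """Compare pilot calls against a baseline group on the roadmap metrics."""
    total_interventions = sum(c.interventions for c in pilot)
    total_false = sum(c.false_positives for c in pilot)
    return {
        "time_saved_per_call_s": mean(c.handle_seconds for c in baseline)
                                 - mean(c.handle_seconds for c in pilot),
        "first_contact_resolution": mean(1.0 if c.resolved_first_contact else 0.0
                                         for c in pilot),
        "csat_lift": mean(c.csat for c in pilot) - mean(c.csat for c in baseline),
        # Avoid dividing by zero when the assistant never intervened.
        "false_positive_rate": total_false / total_interventions
                               if total_interventions else 0.0,
    }


# Invented sample data: two baseline calls and two pilot calls.
baseline = [CallRecord(600, True, 4, 0, 0), CallRecord(500, False, 3, 0, 0)]
pilot = [CallRecord(480, True, 4, 3, 1), CallRecord(520, True, 5, 1, 0)]
summary = summarize(pilot, baseline)
print(summary)
```

With real data, the baseline group should come from the same queue and time period as the pilot so that differences are not driven by call mix.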

Practical checklist for executives

Pilot one continuous multimodal workflow with explicit success metrics.
Require exportable model checkpoints and interoperability terms in vendor contracts.
Ask the board for an AI governance playbook with delegated authorities and incident response plans.
Pair every autonomous intervention with a human‑in‑the‑loop rollback mechanism.
Monitor talent risk: maintain a retention strategy beyond compensation (career paths, mission clarity, governance alignment).

Questions your board should ask

Who has authority to push models into production, and what approvals are required?
Ensure explicit delegation and audit trails for model deployment decisions.
How will we detect and respond to harmful or biased outputs in real time?
Require red‑team schedules, monitoring dashboards and incident SLAs.
What contractual rights do we have if a vendor changes model behavior or pricing?
Negotiate exit clauses, checkpoint exports and audit rights.
How are data residency and consent handled for live audio/video processing?
Demand documented flows and compliance assessments.

“Neither utopia nor dystopia is inevitable; the current period determines the path and humans should not ‘take their hands off the wheel,’” Murati told Bloomberg.

A short vignette: a contact center pilot (illustrative)

A financial services firm pilots a continuous multimodal assistant for one inbound queue. The assistant listens and transcribes, detects escalation language and surfaces policy scripts for agents. In the pilot, supervisors see faster interception of compliance breaches and agents report fewer after‑call tasks because the assistant drafts follow‑up notes. Success criteria: 10–15% reduction in wrap‑up time and improved compliance audit scores. If governance and privacy controls are solid, the pilot scales to other queues.

Takeaways

Thinking Machines’ push toward interaction models and the Tinker API exemplifies how the next wave of AI will focus on interfaces that move with human tempo. The technical novelty — processing multimodal input in ~200‑millisecond slices — matters because it enables AI agents to act during conversations, not after them.

For leaders, the choice isn’t binary between closed stacks and open source. The right approach combines a practical vendor strategy, strong governance and staged pilots that surface infrastructure and human factors early. Start small, measure impact, and codify controls so that fast‑moving innovation doesn’t outpace accountability.

Action now: Start a 90‑day pilot for a single continuous multimodal workflow, require exportable checkpoints and a governance playbook, and prepare the board to ask the hard operational questions that keep AI systems aligned with business and compliance goals.