March 14, 2026
TACIT: Compile-Time Type Safety Ushers in Breakthrough for Secure AI Agents
In a significant advance for AI safety, researchers led by Martin Odersky have introduced TACIT, a framework that leverages Scala 3's capture checking and type system to enforce safety in AI agents. Published on arXiv as paper 2603.00991 on March 1, 2026, TACIT shifts AI agent safety from runtime monitoring to static compile-time guarantees. Agents express their intentions as capability-safe Scala code executed in a controlled REPL environment, where capabilities such as file access and network calls are tracked in the type system through capture sets. This approach prevents common risks, including information leakage, malicious side effects, and prompt injection attacks, without relying on model alignment or behavioral training.
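To make capture sets concrete, here is a minimal sketch in the style of the Scala 3 documentation's capture-checking examples; the feature is experimental and must be enabled explicitly, and the `FileSystem` and `Logger` names are illustrative rather than taken from the paper:

```scala
import language.experimental.captureChecking
import caps.Capability

// A tracked capability: values of this type appear in capture sets.
class FileSystem extends Capability

// A Logger can only be constructed from a FileSystem capability.
class Logger(fs: FileSystem):
  def log(s: String): Unit = () // stand-in; would write a log entry via `fs`

def test(fs: FileSystem): Unit =
  // The capture set {fs} records, in the type, that `l` retains
  // access to the file system.
  val l: Logger^{fs} = Logger(fs)
  l.log("hello")
```

The capture set is exactly the information a harness can audit or withhold: code that is never handed a `FileSystem` has no way to acquire one.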
TACIT operates by requiring agents to generate Scala 3 code through a single `run(code)` tool, which compiles and executes it in a sandboxed, stateful REPL. Capabilities are scoped variables that gate access to effects and resources, and the type system enforces local purity, ensuring that designated sub-computations remain side-effect-free. Sensitive data is wrapped in `Classified[T]`, whose operations, such as mapping, accept only pure functions, blocking exfiltration. A safe mode disallows unsafe language features such as casts and reflection, while explicit nulls and controlled string interpolation further harden the surface. If generated code violates these rules, the compiler rejects it outright, forcing the agent to iterate until it complies.
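How mapping can be restricted to pure functions is easiest to see with the pure arrow `->`, which under capture checking denotes a function with an empty capture set. The following is a hypothetical reconstruction of such a wrapper, not the paper's actual `Classified[T]` API; `Network`, `send`, and `demo` are stand-ins:

```scala
import language.experimental.captureChecking
import caps.Capability

class Network extends Capability:
  def send(s: String): Unit = () // stand-in; would perform a network call

// The pure arrow `T -> U` means `f` captures no capabilities, so nothing
// inside `f` can reach a Network or FileSystem to leak the wrapped value.
final class Classified[T](private val value: T):
  def map[U](f: T -> U): Classified[U] = new Classified(f(value))

def demo(net: Network): Unit =
  val secret = Classified("api-token")
  val masked = secret.map(s => s.take(4) + "...") // OK: the lambda is pure
  // secret.map(s => { net.send(s); s })          // rejected: the lambda captures
  //                                              // {net}, so it has the impure type
  //                                              // String ->{net} String
```

Because the rejection happens at compile time, an agent that attempts exfiltration never gets to run at all; it only sees the type error.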
The framework provides safety guarantees that hold regardless of the underlying model. Capability safety prevents permissions from being forged or smuggled; completeness ensures every effect is gated behind a capability; local purity blocks data leaks from classified contexts; and static enforcement catches violations before any code runs. On an AgentDojo-based safety benchmark, classified mode achieved 100% security for models including Claude Sonnet 4.6 and MiniMax M2.5, where the same models dropped sharply without the safeguards. Because the approach is model-agnostic, it complements other defenses by embedding least-privilege principles at the language level.
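The completeness guarantee, that all effects are gated, amounts to banning ambient authority: effectful operations exist only as methods on capability values that must be passed in explicitly. A simplified sketch under that reading, with `Network` again as a hypothetical capability:

```scala
import language.experimental.captureChecking
import caps.Capability

class Network extends Capability:
  def get(url: String): String = "" // stand-in; would perform an HTTP request

// No ambient authority: with no capability parameter, this function can
// only compute over its input. There is no global `Network` to reach for.
def summarize(doc: String): String =
  doc.linesIterator.take(3).mkString(" ")

// The network effect is gated by `net`, which is visible in the signature
// and can therefore be audited, scoped, or withheld by the harness.
def fetchAndSummarize(net: Network, url: String): String =
  summarize(net.get(url))
```

Least privilege then falls out of ordinary parameter passing: an agent is granted exactly the capabilities that appear in the code it is allowed to call.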
Performance evaluations confirm TACIT's practicality. On expressiveness benchmarks such as τ²-bench's airline and retail tasks, capability-safe agents matched or exceeded tool-calling baselines, with gains of up to 3.7%. On SWE-bench Lite, scores dipped only slightly, from 43.3% to 41.7%, showing that agents can adapt to compilation feedback without substantial overhead. These results indicate that safety need not come at the cost of capability, enabling reliable deployment in unsupervised settings.
TACIT represents a paradigm shift in AI alignment, prioritizing proof-based safety built on programming-language guarantees over probabilistic training methods. By making safety compositional and auditable, it paves the way for extensible, domain-specific harnesses and scalable agent systems. As AI agents grow more autonomous, compile-time enforcement, where "the compiler says no, so the agent can't say yes," offers a robust foundation for mitigating existential risks from misalignment and unintended behaviors.
Read Research Source →