Building the frontier of Agentic Law

Integrated AI and legal research advancing how law is encoded into AI, and how AI agents can facilitate AI governance.

Our Mission

AI agents are becoming autonomous actors. We are building agentic law to align them with societal values and enable safe, high-stakes deployments.

Research Areas

  1. Legal Turing Test

    We develop Turing tests grounded in real, high-stakes legal workflows. Norm will continue to build out this suite of benchmarks, which only a technology company powering a full-service law firm can build.

  2. As Intelligence Becomes Cheap, Trust Is the New Bottleneck

    As AI drives the cost of intelligence toward zero, the economy's bottleneck shifts to assurance of agentic legal systems. How do we ensure that AI systems act in ways that are legal, trustworthy, and enforceable?

  3. Legal Infrastructure for the Agentic Economy

    Legal systems were built for human actors. As AI agents become economic and societal actors, law is their real-time alignment infrastructure. We investigate the systems by which AI agents will transact, be governed, and be held liable.

  4. Benchmarking Legal Reasoning

    Legal reasoning spans rule extraction, statutory interpretation, analogical reasoning, judgment under ambiguity, and more. Most academic benchmarks do not measure reasoning. We build the evaluation infrastructure for getting to the right answers the right way.

Latest Research

Figure: Answer Consistency Across Turns (%), by model. Opus 4.6: 94%; Opus 4: 88%; GPT-5.4: 89%; GPT-5: 90%; Sonnet 4.6: 90%; Sonnet 4: 80%; GPT-5.4 Mini: 84%; GPT-5 Mini: 88%.

Even at 90% consistency, frontier models still contradict themselves at scale

Why the latest generation of models is not yet production-ready for high-stakes legal work.

The latest generation of frontier models reaches the same conclusion on a given legal question roughly 90% of the time. At scale, the remaining 10% still yields contradictory answers to the same question every single week.
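To make the arithmetic behind that claim concrete, here is a minimal Python sketch. It rests on two assumptions that are not stated on this page: consistency is taken to mean the share of repeated answers that match the most common answer, and the weekly question volume of 1,000 is a purely hypothetical figure. The function names are illustrative and do not describe Norm's actual evaluation code.

```python
from collections import Counter


def consistency_rate(answers):
    """Share of answers matching the most common answer (an assumed definition)."""
    if not answers:
        raise ValueError("answers must be non-empty")
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


def expected_inconsistent(consistency, questions_per_week):
    """Expected number of weekly answers that deviate from the usual conclusion."""
    return (1.0 - consistency) * questions_per_week


# Hypothetical data: the same legal question asked ten times.
answers = ["liable"] * 9 + ["not liable"]
rate = consistency_rate(answers)

# Hypothetical volume: 1,000 questions per week.
weekly = expected_inconsistent(rate, 1000)
print(f"consistency: {rate:.0%}, expected deviating answers per week: {weekly:.0f}")
```

Under these assumptions, a 90% consistency rate at 1,000 questions per week implies on the order of 100 answers that depart from the model's usual conclusion. The expected count is never zero at any realistic volume.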

Publications