u/ISeeThings404

Harvey's (or Anyone Else's) World Model Claims Make No Sense

(Disclosure: I work at a legal AI competitor, Irys, but this post has no selling. It's simply an assessment of the costs of world models.)

I saw a few people hyping up Harvey's plan for legal world models, which makes sense because world models are THE hyped thing in AI right now. But their plans are all hype and no substance for a simple reason: world models are too expensive to build (especially in a domain like legal). As of now, any attempt to build them would be more narrative than results.

If you want to know the true cost of a world model, look at Meta’s Code World Model (CWM). Code is the friendliest possible sandbox for this architecture. It has deterministic state, explicit transitions, and perfect, cheap verification (you can see if it runs or not). However, even here, Meta had to sink a ton of capital into:

- 35,000 isolated Docker environments,

- 120 million execution traces, and

- 3 million fully verified agent trajectories.

To generate a single verified data point, the system had to fail tests, apply a patch, rerun tests, and prove the fix. That’s hundreds of millions of Docker-minutes, which is an eight-figure infrastructure bill just to establish ground truth before training even begins.
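A back-of-envelope sketch of that arithmetic follows. Both inputs are placeholder assumptions chosen to match "hundreds of millions of Docker-minutes", not figures reported by Meta, and the function names are made up for illustration.

```python
# Back-of-envelope infrastructure cost. Both inputs are hypothetical
# placeholders, not numbers reported by Meta.
ASSUMED_DOCKER_MINUTES = 300_000_000   # "hundreds of millions" (assumed value)
ASSUMED_USD_PER_MINUTE = 0.05          # assumed all-in cost per container-minute


def infra_cost_usd(docker_minutes: int, usd_per_minute: float) -> float:
    """Total spend = container-minutes times cost per container-minute."""
    return docker_minutes * usd_per_minute


def figure_count(amount_usd: float) -> int:
    """Number of digits in the whole-dollar amount (8 means 'eight figures')."""
    return len(str(int(round(amount_usd))))


cost = infra_cost_usd(ASSUMED_DOCKER_MINUTES, ASSUMED_USD_PER_MINUTE)
print(f"${cost:,.0f} ({figure_count(cost)} figures)")
```

Swap in your own per-minute rate. Under these assumptions the total stays in eight figures for any rate between roughly $0.04 and $0.33 per container-minute.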

The cost of training and inference adds orders of magnitude to this (and that's for one setup; don't forget to account for retraining, etc.).

And what did that buy them? It does well in the training setup, but change the harness or the edit format and performance collapses by double digits. In other words, it didn't learn general reasoning so much as overfit to its environment.

Doing this for law is exponentially harder. There is a lot of ambiguous language, competing interpretations, and jurisdictional chaos. There is no compiler for an M&A contract. You can't write unit tests for a litigation strategy.

If you want to build a true “world model” for legal—one that explicitly simulates state transitions—how do you verify the trajectories? You need expert legal judgment to confirm that a specific redline achieves the desired state without breaking anything else. Generating millions of verified legal trajectories with human-in-the-loop oversight wouldn't cost eight figures; it would cost billions.
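To see how the human-in-the-loop bill could reach billions, here is a rough sketch. The trajectory count mirrors the CWM number above. The review time and hourly rate are purely illustrative assumptions, not data from any firm or vendor.

```python
# Rough cost of expert-verified legal trajectories.
# Every constant is a hypothetical assumption for illustration.
ASSUMED_TRAJECTORIES = 3_000_000    # mirrors the CWM trajectory count above
ASSUMED_REVIEW_HOURS_EACH = 1.0     # assumed expert time to verify one trajectory
ASSUMED_USD_PER_HOUR = 500.0        # assumed blended rate for expert review


def expert_verification_cost_usd(
    trajectories: int, hours_each: float, usd_per_hour: float
) -> float:
    """Total cost = trajectories x review hours per trajectory x hourly rate."""
    return trajectories * hours_each * usd_per_hour


total = expert_verification_cost_usd(
    ASSUMED_TRAJECTORIES, ASSUMED_REVIEW_HOURS_EACH, ASSUMED_USD_PER_HOUR
)
print(f"${total:,.0f}")
```

With these inputs the total lands above a billion dollars, and that is before any rework, disagreement between reviewers, or multi-jurisdiction coverage.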

If Meta couldn't build a robust, generalizable world model in a perfectly deterministic sandbox with free verification, then no player in legal is going to crack world models anytime soon. There is a lot of groundwork to be done here.

That's all. I have my thesis on what will work in legal and what won't, but this isn't a vendor pitch so I won't rattle on here. Just wanted to flag that law isn't ready for world models, and a lot of the conversation around them in law is about chasing buzzwords.

https://preview.redd.it/quanb9nylevg1.jpg?width=598&format=pjpg&auto=webp&s=8abd1d99c4a28e839c213fbb60b6d553a60be140

u/ISeeThings404 — 8 days ago