
NURL (Neural Unified Representation Language) v0.1.0 - designing a small LLVM-backed language with LLM token economics as the primary constraint
For the last few months I've been building NURL — a small,
self-hosted, LLVM-backed language whose syntax is shaped by a single
hypothesis: existing languages were optimised for human ergonomics,
and that's a poor fit for code generated token-by-token by an LLM.
Keywords, punctuation, and indentation exist for human eyes; an LLM
pays for every redundant token in both context and inference cost.
v0.1.0 just went public, source MIT/Apache-2.0:
https://github.com/nurl-lang/nurl. Site + browser playground:
https://nurl-lang.org.
I'd love feedback from this sub, especially on the design trade-offs
below. Genuine criticism welcome — I'm not married to the choices.
The design constraints
I tried to make these explicit and falsifiable rather than vibes:
- **Token efficiency.** Every syntactic construct minimises tokens without information loss. Single characters can carry full semantic meaning (`@` = function, `^` = return, `~` = loop, `-` = mutability prefix, `??` = pattern match, …).
- **Regular grammar.** No exceptions, no "this works here but not there". LL(1) parser with ≤4-token lookahead; the full EBNF fits on a page (`spec/grammar.ebnf`, currently v1.7).
- **Local semantics.** A token's meaning is derivable from ≤8 tokens of preceding context. No long-range dependencies that break mid-generation.
- **Deterministic compiler.** Same source → byte-identical IR. The self-hosted compiler must reproduce its own IR on the bootstrap's second pass or the build is rejected.
- **LLVM all the way down.** Codegen is delegated to clang; native Linux/Windows/macOS and `wasm32-wasi` all work. The compiler itself also builds to wasm.
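The "deterministic compiler" constraint boils down to a byte comparison between the first and second bootstrap passes. A minimal sketch of the idea in Python — the function name and placeholder IR are mine for illustration, not the actual `build.sh` logic:

```python
# Sketch of the byte-identical bootstrap gate. The real check lives in
# build.sh; the function name and the placeholder IR are illustrative.

def bootstrap_gate(stage1_ir: bytes, stage2_ir: bytes) -> None:
    """Reject the build unless the self-compiled compiler reproduces
    its own IR byte-for-byte on the second pass."""
    if stage1_ir != stage2_ir:
        raise SystemExit("non-deterministic IR: bootstrap gate failed")

# Two passes that agree (placeholder IR) pass the gate silently.
ir = b"define i64 @main() {\n  ret i64 0\n}\n"
bootstrap_gate(ir, ir)
print("bootstrap reproducible")
```

A byte-level `!=` is deliberately blunt: it catches nondeterminism sources like unordered map iteration leaking into codegen order, which a semantic diff would happily forgive.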
What it looks like
Everything is prefix notation, one shape: `OP ARG1 ARG2 …`.

```
@ add i a i b → i { ^ + a b }   // i = i64
( add 3 4 )                     // → 7
```
Algebraic data types and pattern matching:

```
: | Expr {
  Num i
  Add *Expr *Expr
  Mul *Expr *Expr
}

@ eval *Expr e → i {
  ^ ?? . e 0 {
    Num n → n
    Add l r → + ( eval l ) ( eval r )
    Mul l r → * ( eval l ) ( eval r )
  }
}
```
Closures carry a function-type literal (`@ ret_ty arg_tys`):

```
: (@ i i) square \ i x → i { * x x }
( square 7 )   // → 49
```
Strings live between backticks (`` `hello` ``) so single/double quotes can stay free for other syntax. The grammar deliberately reuses every character it can — there's no `for`/`while`/`if`/`fn` keyword in the language.
Token economy — a quick check
Hand-counted on a "sum 1..N" toy:
| Language | Tokens | Runtime | Targets |
|---|---|---|---|
| Python | ~46 | interp. | host |
| C | ~30 | native | many (per port) |
| NURL | ~13 | native | any LLVM target |
This isn't rigorous — it's just a sanity check that the design is pulling in the right direction. The real metric would be something
like expected tokens for an LLM to produce a correct program across
a corpus, which I haven't measured yet.
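For the curious, "hand-counted" means something like the crude splitter below (my own sketch — a model's BPE tokenizer merges and splits differently, which is one reason the table's numbers are approximate and won't match this exactly):

```python
import re

def rough_tokens(src: str) -> int:
    """Crude proxy for token count: identifiers, numbers, and individual
    punctuation characters. A real BPE vocabulary behaves differently,
    so treat this as a sanity check, not a measurement."""
    return len(re.findall(r"[A-Za-z_]\w*|\d+|[^\w\s]", src))

# Hypothetical "sum 1..N" programs for two of the table's rows.
python_sum = (
    "def sum_to(n):\n"
    "    total = 0\n"
    "    for i in range(1, n + 1):\n"
    "        total += i\n"
    "    return total"
)
c_sum = "long sum_to(long n) { long t = 0; for (long i = 1; i <= n; i++) t += i; return t; }"

print(rough_tokens(python_sum), rough_tokens(c_sum))
```

The real comparison would run an actual model tokenizer (e.g. tiktoken) over equivalent programs in each language rather than a regex split.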
Toolchain bits
- Python bootstrap → self-hosted `nurlc.nu` → re-compiles to byte-identical IR (hard gate in `build.sh`).
- Stdlib: option/result/errors, string (`Vec[u8]`-backed, NUL-tolerant), int/float/time, lazy iter chains, cmp + sort, `HashMap[K V]`, `Vec[A]`, JSON, HTTP (libcurl + SSE streaming), CSV reader/writer (RFC 4180), POSIX/Win32 process spawning, SHA-256 HMAC + base64.
- Memory: default-immutable bindings, compiler-inserted auto-drop for owned strings, slices, and selected struct fields. No GC, no borrow checker — the auto-drop pass is conservative, and the type system tracks ownership transfer through return values.
- Hosted MCP server (`/mcp`) exposes the entire compiler to MCP clients (Claude Desktop, Cursor, Windsurf, Zed) — they can browse the stdlib, fetch examples, and build native/wasm binaries on the user's behalf.
Honest rough edges
- No fixed-width int types yet (`i8`, `u32`, `f64`, …) — the lexer splits `i8` into `i` + `8`. Workaround: cast with `#`. This is the most-asked-for feature.
- No borrow checker. Auto-drop covers common ownership patterns, but nested owned struct fields and arm-local bindings that fall through without `^` can leak.
- Generic instantiation is text-level — type parameters don't propagate through generic functions the way Rust/Haskell readers expect. Documented gotcha.
- Single-letter `[T]` parameter collides with the boolean literal `T`. Use `[E]` or `[A]` until I find a less hacky fix.
Where I'd love your input
- Is "tokens per program" the wrong metric? My gut says grammar regularity (no exceptions, predictable next-token distribution) is doing more of the work than raw token count when an LLM is generating code. Anyone seen actual measurements?
- Byte-identical bootstrap as a hard gate — too strict, or exactly the right paranoia level for a young self-hosted compiler?
- Pattern matching without a coverage checker yet. I'm leaning toward implementing exhaustiveness at the IR-gen layer rather than the type-check layer (lets me share code with switch lowering). Sane?
- Auto-drop vs. explicit ownership annotations — the conservative auto-drop pass is fine for "the 80% case" but leaks in nested-struct + control-flow-fallthrough corners. Has anyone tried a similar approach and stayed sane?
- LLM-first language design generally — is this a real constraint worth optimising for, or is the right take "frontier models will learn whatever syntax you throw at them, so optimise for humans anyway"?
Try it
- Browser playground (compiles to `wasm32-wasi`, runs locally in the tab — no server-side execution): https://play.nurl-lang.org
- Grammar (EBNF v1.7): https://github.com/nurl-lang/nurl/blob/main/spec/grammar.ebnf
- Gotchas / current rough edges: https://github.com/nurl-lang/nurl/blob/main/docs/GOTCHAS.md
- Roadmap: https://github.com/nurl-lang/nurl/blob/main/ROADMAP.md
Thanks for reading — happy to dig into any of this in the comments.