r/Compilers

Compiling with sequent calculus
▲ 32 r/ProgrammingLanguages+1 crossposts


Long time lurker here. I've seen a number of posts about using IRs based on sequent calculus and decided to have a go at it myself. My prototype compiler can be found here, for those of you interested in this niche: https://github.com/August-Alm/sequent

The paper that influenced me the most was https://se.cs.uni-tuebingen.de/publications/schuster25compiling.pdf, which has been highlighted on this subreddit before. It defines a low-level IR that corresponds to focused/normalized terms of a depolarised sequent calculus and explains how to transpile it to traditional assembly. I copied this pretty much wholesale.

One abstraction level above it, I have a polarised sequent calculus with generalised algebraic data/codata types, higher kinded types, quantitative type theory-style usage tracking, polymorphism and automatic data kinds in the vein of Haskell's DataKinds extension, and primitive "destination" types for type-safe memory writes in destination passing style (in the style of https://arxiv.org/pdf/2503.07489).

Above that sits the functional programming language that users are meant to write. It supports the same type-level features, but it is not polarised. It has (generalised algebraic) data and codata types, but they share the same kind "type" ("*") -- polymorphism is higher rank and ranges over all types, not over data or codata separately.

The compilation pipeline was pleasantly easy to put together. I only really faced two big conundrums:

  1. The surface functional language is not polarised, but the core sequent calculus is. So, I shift all constructions in the surface language into the positive & producer fragment of the core. Since, e.g., function types in sequent calculus are canonically polarised as ((A:+) -> (B:-)):-, this involves quite a bit of shifting to get right. Similarly, the low-level focused/normalised IR is again unpolarised but still has a chirality division of terms into producers and consumers. The compilation of the core sequent calculus to the focused form is done in such a way that "focused chirality = core chirality + core polarity mod 2". That is, the chirality of terms of negative types flips. Again, the transformations are not really difficult; the difficulty was realising how it needed to be done. I'm sure it has been written about in the research literature but, not being an academic, I had to reinvent the wheel for myself.

  2. The low-level, focused IR does not support polymorphism. In fact, the focusing/normalisation of the core sequent calculus into the focused IR does not support polymorphism either, or at least I couldn't figure out how to do it. The issue is that this focusing/normalisation relies crucially on eta-expansion of cuts, in a way that depends on knowing the type of the cut. A cut at a polymorphic type variable cannot be eta-expanded. To get around this, I monomorphise the core sequent terms before normalisation, based on https://dl.acm.org/doi/epdf/10.1145/3720472. That paper does not work in the setting of a sequent calculus, but its technique translated very nicely into my setting and allowed me to fully monomorphise higher-rank polymorphism with very little effort.
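Both conundrums can be condensed into tiny sketches. The Python below is purely illustrative -- none of these names exist in the actual compiler: the first function is the chirality/polarity bookkeeping from point 1, the second a toy version of the instantiation-collecting monomorphisation from point 2.

```python
# Point 1: "focused chirality = core chirality + core polarity mod 2".
# Encodings are arbitrary; the point is that negative-type terms flip sides.
POSITIVE, NEGATIVE = 0, 1   # polarity of a type
PRODUCER, CONSUMER = 0, 1   # chirality (side) of a term

def focused_chirality(core_chirality: int, polarity: int) -> int:
    return (core_chirality + polarity) % 2

assert focused_chirality(PRODUCER, POSITIVE) == PRODUCER  # positive: unchanged
assert focused_chirality(PRODUCER, NEGATIVE) == CONSUMER  # negative: flipped

# Point 2: monomorphise by emitting one specialised copy of each polymorphic
# definition per distinct tuple of concrete type arguments it is used at.
def monomorphise(defs: dict, uses: list):
    specialised = {}
    for name, tys in uses:                   # uses: [(name, (type, ...))]
        key = name + "_" + "_".join(tys)     # e.g. "id_i32"
        specialised.setdefault(key, (defs[name], tys))
    return specialised

out = monomorphise({"id": "x => x"},
                   [("id", ("i32",)), ("id", ("bool",)), ("id", ("i32",))])
assert sorted(out) == ["id_bool", "id_i32"]  # duplicate instantiations collapse
```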

The compiler "frontend" is very underdeveloped. There are no source code positions in error messages, the syntax is a bit crude, and many type annotations that could be inferred must still be written out. But for what it is -- a prototype of compilation with sequent calculus-based IRs -- I feel it has achieved its goal. It supports a high-level, feature-rich functional programming language and emits surprisingly fast Arm64 assembly, with no external dependencies apart from lex/yacc for parsing.

u/KeyDue7848 — 20 hours ago
I built a self-hosting x86-64 toolchain from scratch. Part 1: The compiler


A while ago I wrote about the overall development journey of building a self-hosting toolchain from scratch. Today I want to get into the details of the compiler specifically — how it works, what it produces, and some metrics on how it performs. Here you can take a look at the source code if you're interested: bjornc

What it is

bjornc is a single-pass, IR-free compiler for Björn, a statically typed systems language I designed alongside the toolchain. It takes .bjo source files and emits x86-64 assembly. No LLVM, no libc, no intermediate representation. The assembly goes to my own assembler, which produces a custom object format (.cub), which goes to my own linker, which produces an ELF executable. But that's for future posts — today is about the compiler.

Single pass, no IR

Most serious compilers go source → IR → optimised IR → assembly. The IR is where all the interesting work happens — dead code elimination, register coalescing, loop transformations, inlining.

bjornc goes source → AST → assembly. One pass, direct emission, nothing in between.

The obvious question is: why? Partly philosophy — I wanted to understand the direct relationship between source constructs and machine instructions without any abstraction layer in the way. Partly practicality — the compiler was co-designed alongside the assembler and linker, and the language evolved constantly as I built those tools. An IR schema that needed updating every time I added a language feature would have slowed everything down.

The cost is real though. No IR means no optimisation pipeline. The two optimisations I have — constant folding and addressing mode optimisation — are applied opportunistically during code generation, limited to what's locally visible at each AST node.

The source-to-assembly expansion factor for a non-trivial program is roughly 4-5:1. That's in the same ballpark as GCC at -O0 for comparable C code. At -O2 GCC closes that gap significantly. I don't.

Register allocation without live ranges

This is the part that required the most original thinking.

Graph coloring and linear scan — the two standard approaches — both require an IR to compute live ranges. No IR means no live ranges, which means neither algorithm was available to me.

What I ended up with is a tree-scoped pinning strategy. When the code generator needs a register for an AST node, it requests one from a central allocator, which marks it as pinned. The register stays pinned for the duration of that subtree and is released on exit — either manually once the value has been consumed, or automatically at a safe exit point. Knowing whether an AST node is a safe point at which to free a register is a problem in its own right.

When no registers are available, one is spilled to the stack, and a spill record is created and linked to the AST node that pushed the register, ensuring every AST node cleans up after itself. The live value being computed is moved between safe registers as needed, so that the calculation proceeds correctly, no register gets clobbered, and we are left with the final computed value instead of stale intermediate numbers.
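As a rough model of that discipline (names are made up; the real allocator tracks far more state), the pin/release cycle might look like:

```python
# Toy model of tree-scoped register pinning: a register stays pinned while a
# subtree is generated; when the pool is empty, a spill record tied to the
# requesting AST node is created so that node can restore the register later.
class PinningAllocator:
    def __init__(self, regs):
        self.free = list(regs)
        self.spill_count = 0

    def pin(self, node):
        if not self.free:
            self.spill_count += 1       # would emit a push here
            return ("spilled", node)    # spill record linked to the AST node
        return ("reg", self.free.pop())

    def release(self, handle):
        kind, val = handle
        if kind == "reg":
            self.free.append(val)       # register returns to the pool
        # for "spilled": would emit a pop, restoring the spilled register

alloc = PinningAllocator(["rcx", "rdx"])
a, b = alloc.pin("lhs"), alloc.pin("rhs")
c = alloc.pin("call_arg")               # pool empty: this request spills
assert alloc.spill_count == 1
```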

To put some numbers on this — I analysed the 31K LOC assembly output for the assembler source itself:

Register Spill Count
rdi 388
rsi 362
rax 98
rbx 16

864 total spills across 31K LOC — roughly one per 36 lines. Two things are important to address:

  1. rdi and rsi dominate because they're the argument registers: they get pinned at every call site and spilled when function calls are nested or chained. Around 80-90% of rdi and rsi spills come from the assembler source's use of the builder pattern to create the mnemonic-encoding lookup table. Code such as new()->func1(a1)->func2(b1)->func3(c1); may push rdi 3 times, which likely explains the metrics.

  2. rax and rbx aren't spilled often, but their spilling frequency could be decreased by either using more registers — my compiler uses only 4 general purpose registers (rax,rbx,rcx,rdx) besides the 6 System V ABI parameter registers — or having some sort of heuristics to avoid using rax for general computing when we are at a division node (rax is explicitly needed for division as per x86-64 requirements).

I decided to stick to 4 registers to stress-test my allocation algorithm: if it works under these constraints, it'll work with more registers. Once I got it working, I never bothered to change it. It is important to note that not all of those 864 spills are unnecessary — a significant fraction are genuine saves across boundaries where the register is actually live. A liveness-aware allocator could eliminate the redundant ones, but that requires an IR I don't have.

What the compiler actually produces

Here's a concrete example. This Björn function:

func int32 add(int32 a, int32 b){
    return a + b;
}

Produces this assembly:

_add_i32_i32:
    push rbp
    mov rbp, rsp
    sub rsp, 8
    mov dword [rbp - 4], edi      ; save a
    mov dword [rbp - 8], esi      ; save b
    mov eax, dword [rbp - 4]      ; load a
    mov ebx, dword [rbp - 8]      ; load b
    add eax, ebx                  ; a + b, result in eax
.ret_from_add_i32_i32:
    add rsp, 8
    pop rbp
    ret

The name mangling — _add_i32_i32 — incorporates the parameter types, which is how function overloading is resolved at compile time without runtime dispatch.
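A guess at the scheme behind that symbol (illustrative only, not the actual compiler code): an underscore prefix, then one suffix per parameter type, so each overload gets a unique symbol and calls resolve by exact name lookup.

```python
# Hypothetical reconstruction of the mangling implied by `_add_i32_i32`.
def mangle(name, param_types):
    return "_" + name + "".join("_" + t for t in param_types)

assert mangle("add", ["i32", "i32"]) == "_add_i32_i32"
# Distinct overloads get distinct symbols, so no runtime dispatch is needed:
assert mangle("add", ["f64", "f64"]) != mangle("add", ["i32", "i32"])
```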

Field access folds the offset directly into the memory operand:

mov eax, dword [rax + 4] ; v->y where y is at offset 4

No intermediate register needed. This was a deliberate optimisation — the naive approach materialises the offset in a register and adds it, which wastes a register and an instruction.

Compilation performance

I measured compilation time against AST node count rather than raw lines of code — a single line can contain a trivially simple expression or a deeply nested one, so LOC is a poor proxy for compiler workload. Six files from the assembler source were selected across a wide range of AST counts, measured with hyperfine over 300 runs:

File AST Nodes Compilation Time (ms)
arena.bjo 558 1.1 ± 0.4
assembler_ops.bjo 1,364 2.6 ± 0.8
register.bjo 1,701 2.1 ± 0.4
main.bjo 2,347 4.8 ± 0.6
ast.bjo 4,283 5.9 ± 0.6
analyzer.bjo 9,249 23.3 ± 2.0

Linear regression gives R² = 0.9537 and p-value = 0.000815 — approximately linear behaviour, statistically significant.

The notable outlier is analyzer.bjo. That file contains deeply nested template builder calls and nested foreach loops whose lowering to for requires AST copying, generating disproportionately complex AST structures relative to raw node count. It's not a random spike — it's exactly the file you'd expect to be slow given what it does.

The full assembler — 6228 LOC of Björn across 10 files — compiles in roughly 47ms. The single-pass IR-free design means there's no IR to construct, no optimisation passes to run, and no backend lowering phase. It's a single tree traversal and it shows.

Closing thoughts

It was never my intention to create a production-ready, optimised, competitive compiler. It was, however, my intention to learn how to develop a compiler, own every line of code, ponder design decisions, and face challenges that required solutions of my own. Along the way, I also wanted to get more familiar with x86-64 assembly. At the end of the day, I did this out of curiosity and personal drive rather than grinding for a project entry on my CV — which probably explains how I was able to stick with this project for 1.5 years, of which roughly 10 months were spent working on the compiler. Curiosity was also the reason I avoided the obvious shortcuts — LLVM for the compiler backend, Bison for parsing, NASM for assembling, GNU ld for linking, libc for the runtime. Every one of those would have been the sensible choice. None of them would have taught me what I wanted to know. To me this was like a show you've wanted to watch for a long time: sure, you can look up highlights, a synopsis, and a couple of clips online, and you kind of get the whole idea of the plot and how it ends. But if you really wanted to watch it, you just would have. I was in for the ride, not the destination.

Next post will be the runtime libraries — malloc, printf, variadic arguments, all implemented from scratch on top of direct Linux syscalls with no libc. After that, I'll post about the assembler, the custom binaries, and my own linker. If you have questions about anything here, happy to go into more detail.

u/Soft_Honeydew_4335 — 2 hours ago
A small update on Nore: first public release and thanks
▲ 27 r/ProgrammingLanguages+1 crossposts


A while ago I posted here asking for feedback on Nore, a language idea I've been exploring around data-oriented design.

At that point, one of the main things I was unsure about was whether the idea was actually strong enough to carry a real compiler project, or whether it mostly sounded interesting in theory. The feedback I got here helped me keep pushing on that question instead of backing away from it.

So I wanted to post a small follow-up and say thanks: I've now published the first public release, v0.1.0.

If anyone wants to take a look, the repo is here: Nore

Since that first post, a big part of the work has gone into two things:

  • building a small standard library from scratch
  • getting the language self-hosted

When I first posted, there really wasn't a stdlib yet. I've tried to keep it intentionally small so the language has to carry its own weight, instead of hiding weak spots behind a large library too early.

The self-hosting part mattered even more to me. My earlier post was mostly about whether this fairly opinionated language model could really express a non-trivial systems project. Getting Nore to the point where it can implement its own compiler feels like the first meaningful validation that the core idea is worth continuing.

A lot of the suggestions from that first discussion were genuinely useful, and I've kept track of them. But this release hasn't really started addressing most of those bigger future ideas yet. I felt it was more important first to get Nore into a somewhat more stable state before taking on more ambitious work.

I definitely don't see this as "the language is done" or anything close to that. There's still a lot to improve in the language, tooling, stdlib, and general ergonomics. But it does feel like an important milestone, and I honestly don't think I would have pushed it this far without the feedback I got here.

So mostly: thanks. This community helped me turn a language idea I wasn't fully sure about into a first public release I feel good enough about to share.

And if anyone wants to take a look, I'd still love feedback, especially on:

  • the data-oriented design direction itself
  • whether self-hosting changes how convincing the language feels
u/jumpixel — 8 hours ago
Prysma: Anatomy of an LLVM Compiler Built from Scratch in 8 Weeks
▲ 5 r/cpp+2 crossposts


Prysma: https://github.com/prysma-llvm/prysma

This is a compiler development project I started about 8 weeks ago. I’m a CEGEP student, and since systems engineering of this scale isn’t taught at my level, I decided to build my own low-level ecosystem from scratch. Prysma isn’t just a student project; it’s a complete language and a modular infrastructure designed with the constraints of industrial production tools in mind. This document is a technical dissection of the architecture, my engineering choices, and the effort invested in the project.

1. Meta-generation and automation of the frontend

Developing a compiler normally requires manually coding hundreds of classes for the Abstract Syntax Tree (AST) and its visitors, which generates a lot of technical debt. To avoid this, I created a compiler generator in Python.
Prysma’s grammar is defined in an ast.yaml file. My Python engine (engine_generation.py), which uses Jinja2, reads this specification and generates all the C++ code for the frontend (classes, virtual methods, interfaces). This strategy is inspired by LLVM’s TableGen. It allows me to add a new operator in 30 seconds. Without this technique, it would take me about an hour to add a single node, because I would have to manually modify the token, the lexer, the parser, and the visitors, with a high risk of errors. Now, everything is handled by automated templates.
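The pipeline in miniature, with the stdlib's `string.Template` standing in for Jinja2 and an inline dict standing in for ast.yaml (all names here are hypothetical, not taken from the repo):

```python
from string import Template

# One node specification, as it might look after parsing ast.yaml.
node = {"name": "BinaryExpr",
        "fields": [("Expr*", "lhs"), ("Expr*", "rhs"), ("Token", "op")]}

# The C++ class template; the real engine renders many such templates.
tmpl = Template(
    "struct $name : AstNode {\n"
    "$members"
    "  void accept(Visitor &v) override { v.visit(*this); }\n"
    "};\n")

members = "".join(f"  {ty} {fname};\n" for ty, fname in node["fields"])
cpp = tmpl.substitute(name=node["name"], members=members)

assert "struct BinaryExpr : AstNode" in cpp
assert "  Token op;\n" in cpp
```

Adding a new node kind then means adding one spec entry and regenerating, rather than touching the lexer, parser, and every visitor by hand.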

2. Parallel Orchestration with llvm::ThreadPool

A modern compiler needs to be fast, so I architected the orchestrator around llvm::ThreadPool. Prysma processes files in parallel for the lexing, parsing, and IR generation phases. The technical challenge was that LLVM contexts are not thread-safe. I had to isolate each compilation unit in its own context and memory module before the final merging by the linker. Managing race conditions on global symbols required strict adherence to the object lifecycle.

3. Native Object Model and V-Tables

Prysma implements a class model directly in LLVM IR, including encapsulation (public, private, protected). Implementing polymorphism was one of the most complex aspects. I modeled navigation in virtual method tables (V-Tables) at the binary level using LLVM’s opaque struct types (llvm::StructType). Call resolution is handled at runtime with GetElementPtr (GEP) instructions to retrieve function pointers. Because a single-byte offset error causes segfaults, this part of the compiler is still unstable.
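The mechanism in miniature, with Python standing in for the IR (invented names, not Prysma's code): each class has a table of function pointers, each object carries a reference to its class's table, and a virtual call indexes the table -- the GEP step -- before calling through the retrieved pointer.

```python
SPEAK = 0                                  # slot index, fixed at compile time

animal_vtable = [lambda self: "..."]
dog_vtable    = [lambda self: "woof"]      # derived class overrides slot 0

def vcall(obj, slot):
    return obj["vtable"][slot](obj)        # load table, index slot, call

dog = {"vtable": dog_vtable}
assert vcall(dog, SPEAK) == "woof"
assert vcall({"vtable": animal_vtable}, SPEAK) == "..."
```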

4. Memory Management: Arena and Heap

Memory allocation is crucial for speed. For the AST nodes, I use a memory arena (llvm::BumpPtrAllocator). The compiler reserves a massive block and simply advances a pointer for each allocation in O(1). Everything is freed at once at the end, as in Clang.
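Bump-pointer allocation in miniature (a sketch, not BumpPtrAllocator itself): allocation is a pointer increment, and deallocation is resetting the whole arena at once.

```python
class Arena:
    def __init__(self, size):
        self.buf = bytearray(size)   # the one big reserved block
        self.ptr = 0                 # bump pointer

    def alloc(self, n):
        off = self.ptr
        self.ptr += n                # O(1): just advance the pointer
        if self.ptr > len(self.buf):
            raise MemoryError("arena exhausted")
        return off                   # offset into buf; no per-object free

    def reset(self):
        self.ptr = 0                 # everything "freed" at once

a = Arena(1024)
assert (a.alloc(16), a.alloc(8)) == (0, 16)
a.reset()
assert a.alloc(4) == 0
```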

For the Prysma language itself, I implemented dynamic allocation with the new and delete keywords, which communicate with libc’s malloc and free. Loops also manage their stack via LLVM’s alloca instruction.

5. Unit and Functional Testing System

To ensure the reliability of the backend, I implemented a robust pipeline. I use Catch2 for C++ tests of the AST and register handling. I also developed a test orchestrator in Python (orchestrator_test.py) that uses templates to compile and execute hundreds of files simultaneously. This allows testing recursion, variable shadowing, and thread collisions. Deployment is blocked by GitHub Actions if a single test fails.

6. Execution Volume and Work Methodology

Systems engineering demands a significant amount of execution time. To make this much progress in 8 weeks, I worked 14 hours a day, 7 days a week. Designing an LLVM backend requires reading thousands of pages of documentation and debugging complex memory errors.

AI was a great help in understanding this complexity. My method was iterative: I generated LLVM IR (version 18) from C++ code to inspect and understand each line. I combined Doxygen’s technical documentation with questions posed to the AI to master everything. To maintain this pace, I managed my fatigue with caffeine (a maximum of three times a week to avoid upregulation), accepting the impact on my mental health to achieve my goals. I was completely absorbed by the project.

7. Data-Oriented Design (Work by Félix-Olivier Dumas)

Félix-Olivier Dumas joined the Prysma team to restructure the project’s algorithmic foundation. He implemented a Data-Oriented Design (DOD) architecture for managing the AST, which is more efficient.

In his system (currently being finalized), a node is a simple integer (node_id_t). Data (name, type) is stored in sparse sets as flat arrays. The goal is to maximize L1/L2 cache usage: by traversing aligned arrays, the CPU can prefetch data and avoid cache misses. He also uses tag dispatching in C++ to link components at no runtime cost (zero-cost abstraction), without v-tables or switch statements.
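A dense toy version of that layout (hypothetical names, and skipping the sparse/dense index pair a real sparse set maintains): a node is just an index into parallel flat arrays.

```python
# Node ids are plain integers; each component lives in its own flat array.
# A pass that only needs one component walks one contiguous, cache-friendly
# array instead of chasing pointers through heap-allocated node objects.
names, types = [], []

def new_node(name, ty):
    names.append(name)
    types.append(ty)
    return len(names) - 1        # the "node" is just this index (node_id_t)

a = new_node("x", "int32")
b = new_node("f", "float")
assert (names[a], types[b]) == ("x", "float")
assert b == 1
```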

8. Current State of the Language

Prysma is currently a functional language with stable capabilities:

Syntax: Primitive types (int32, float, bool), full arithmetic, and operator precedence.

Structures: If-else conditions and while loops.

Functions: Recursion support and passing arguments by value.

Memory & OOP: Native arrays, classes, inheritance, and heap allocation.

Tools: Error diagnostics (row/column), Graphviz export of the AST, and a VS Code extension for syntax highlighting.

9. Roadmap and Future Vision

The project is evolving, and here are the planned objectives:

Short term (v1.1): Development of the Standard Library (lists, stacks, queues) and an import system for linking C libraries.
Medium term (v1.2): Support for Generics (templates), addition of Namespaces, and stricter semantic analysis for type checking.

Long term: Just-In-Time (JIT) compilation, integration of the inline assembler (asm {}), and custom SSA optimization passes.

The project is open source, and anyone interested in LLVM or Data-Oriented Design can contribute to the project on GitHub. The code is the only judge.

Prysma: https://github.com/prysma-llvm/prysma

u/Any-Perspective1933 — 2 hours ago
I forced a data-oriented language to carry its own compiler before letting it grow


I've just gotten Nore, a data-oriented systems language I've been working on, to a first public release and a self-hosted compiler.

The part I'm actually interested in discussing here is this: how much should self-hosting change how seriously we take a language design?

For me, getting there was less about the milestone itself and more about trying to answer a specific question. I wanted to know whether this fairly opinionated language model could really carry a non-trivial compiler codebase, or whether it was mostly compelling at the idea level.

So I deliberately tried not to hide behind too much infrastructure. I started with basically no stdlib, only grew a small one, kept a trusted C stage0 compiler around, used C as the backend IR, and pushed toward self-hosting before taking on bigger language ideas.

Now that it's there, it feels like meaningful validation, but not closure. It tells me the core model is capable of expressing and sustaining a compiler. It does not tell me the language is mature, or that the current design is the right final shape.

A lot of bigger future ideas are still untouched on purpose. I felt it was more important to stabilize the current compiler/language shape a bit before taking on more ambitious changes.

Repo: Nore lang

I'd be very interested in how compiler people here think about a few things:

  • Does self-hosting materially change how credible a language feels to you, or is it overrated?
  • How long would you keep a C backend and trusted C seed around after self-hosting?
  • If you were in this position, would you stabilise first, or start tackling the bigger queued language ideas right away?
u/jumpixel — 7 hours ago
▲ 3 r/Compilers+1 crossposts

Built a complete out-of-tree LLVM backend for a custom 32-bit SIMT GPU ISA

GitHub: github.com/Deepesh1024/NVMirror

NVMirror compiles LLVM IR all the way down to custom GPU assembly: instruction selection, register allocation, and instruction scheduling, built from scratch as an out-of-tree LLVM backend.


The scheduler's job is simple: don't let the GPU sit idle waiting 20 cycles for memory. It does this by finding independent instructions and filling that wait window with useful work. On matrix multiply, this eliminates 47.6% of all cycles. On vector add, where there's almost no independent work to fill the window, only 31.7%. The numbers tell you exactly where ILP exists and where it doesn't.
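The idea in miniature (a greedy sketch with invented instruction strings, not NVMirror's actual scheduler): after issuing a long-latency load, pull later instructions that don't depend on the loaded value forward into the wait window instead of stalling.

```python
def fill_wait_window(slots, program, depends_on_load):
    """Greedily hoist up to `slots` load-independent instructions forward."""
    window, rest = [], []
    for ins in program:
        if len(window) < slots and not depends_on_load(ins):
            window.append(ins)       # useful work issued during the load
        else:
            rest.append(ins)         # must wait for the loaded value
    return window + rest

# r0 is the register the pending load writes to.
prog = ["mul r3, r0", "add r1, r2", "sub r4, r0", "xor r5, r6"]
sched = fill_wait_window(2, prog, lambda i: "r0" in i)
assert sched[:2] == ["add r1, r2", "xor r5, r6"]
```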

One design question I'd love input on: I used Linear Scan over Graph Coloring for register allocation. With 256 physical registers, spills almost never happen, so the compile-time cost of Graph Coloring never felt justified. Has anyone actually benchmarked this tradeoff on a large-register-file GPU backend?

u/Remarkable_Garage_40 — 4 hours ago

Is "Crafting Interpreters" enough before jumping into academic papers in the compiler field?

As the title suggests, I'm wondering whether I need to read other prerequisite books like "Engineering a Compiler" or the "dragon book" first, or whether I can go straight to papers.

u/FinishExtension1375 — 10 hours ago
Week