Ok, for context: inside a bigger project I've been developing a mathematical-expression-DAG-to-register-file compiler whose output then runs in a custom VM tight loop.
My question is: when developing a compiler, how do you know which part is impacting performance the most: the input (i.e. the quality of the DAG), the compiler's optimization/code generation, or the VM loop itself?
I want a balance between runtime performance and code generation time, but I don't know when I should stop optimizing certain parts.
I know where the DAG generation that feeds the compiler is lacking, and that's a point I'll address after I finish another part of the project.
But with the compiler itself, I always feel like there's something in there that could probably squeeze out a few more nanoseconds or microseconds.
At the moment the compiler pipeline is (rough sketches of the main passes follow the list):
- GVN
- VIR fusion (transforming sin and cos pairs into a sin_cos instruction, generating muladds, etc.)
- A greedy scheduler based on a Sethi-Ullman heuristic
- VIR dead code elimination
Then, not an optimization but register allocation, followed by:
- Passes that rewrite some simple instructions into more complex ones that are more efficient on the target machine
- Another DCE pass
- Another fusion pass
- Another DCE pass
- Compaction of the constants pool
And lastly, encoding into a dense flat bytecode, which the VM consumes in the dispatch loop sketched below.
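To make the passes concrete, here's roughly the shape of the GVN step, stripped down to a minimal sketch; `Op`, `Key`, and `Gvn` are made-up illustration names, not my actual VIR types:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Op {
    Add,
    Mul,
    Sin,
    Const(u32), // index into the constants pool
}

// A node boiled down to (opcode, operand value-numbers); unary ops pass
// u32::MAX as the unused second operand.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct Key {
    op: Op,
    a: u32,
    b: u32,
}

struct Gvn {
    table: HashMap<Key, u32>,
    next_id: u32,
}

impl Gvn {
    // Structurally identical subexpressions map to the same value number,
    // so duplicated subtrees in the DAG collapse to a single instruction.
    fn number(&mut self, op: Op, mut a: u32, mut b: u32) -> u32 {
        // Canonicalize commutative ops so a+b and b+a share one entry.
        if matches!(op, Op::Add | Op::Mul) && a > b {
            std::mem::swap(&mut a, &mut b);
        }
        let key = Key { op, a, b };
        if let Some(&id) = self.table.get(&key) {
            return id; // already numbered: reuse, emit nothing new
        }
        let id = self.next_id;
        self.next_id += 1;
        self.table.insert(key, id);
        id
    }
}
```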
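The fusion passes are essentially peephole rewrites over VIR. A toy version of just the muladd rule, assuming adjacent instructions and precomputed use counts (the real pass has to be smarter about where the mul result actually gets used):

```rust
#[derive(Clone, Copy, PartialEq)]
enum Instr {
    Mul { dst: u32, a: u32, b: u32 },
    Add { dst: u32, a: u32, b: u32 },
    MulAdd { dst: u32, a: u32, b: u32, c: u32 }, // dst = a * b + c
    Nop,
}

// Fuse `mul` followed by an `add` of its result into one muladd, but only
// when the mul result has exactly one use, so the rewrite is safe.
fn fuse_muladd(code: &mut [Instr], use_count: &[u32]) {
    for i in 0..code.len().saturating_sub(1) {
        if let (Instr::Mul { dst, a, b }, Instr::Add { dst: d, a: x, b: y }) =
            (code[i], code[i + 1])
        {
            if use_count[dst as usize] == 1 && (x == dst || y == dst) {
                let c = if x == dst { y } else { x };
                code[i] = Instr::Nop; // the follow-up DCE pass sweeps these
                code[i + 1] = Instr::MulAdd { dst: d, a, b, c };
            }
        }
    }
}
```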
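The Sethi-Ullman part is the classic labeling: each node gets the number of registers its subtree needs, and the scheduler emits needier children first so fewer values stay live at once. A toy labeling function; it's recursive for brevity (at 1.3M nodes you'd want an explicit stack), and the memoization is exactly why it's only a heuristic on a DAG rather than optimal like it is on trees:

```rust
struct Node {
    children: Vec<usize>, // indices into a node arena
}

// Label = number of registers needed to evaluate the subtree rooted here.
fn label(nodes: &[Node], id: usize, labels: &mut [u32]) -> u32 {
    if labels[id] != 0 {
        return labels[id]; // memoized; shared DAG nodes are counted once
    }
    let need = match nodes[id].children.as_slice() {
        [] => 1, // leaf: one register for the value itself
        &[c] => label(nodes, c, labels),
        &[a, b] => {
            let (la, lb) = (label(nodes, a, labels), label(nodes, b, labels));
            // Equal needs force one extra register to hold the first result.
            if la == lb { la + 1 } else { la.max(lb) }
        }
        cs => {
            // n-ary: with children run needier-first, the k-th child
            // executes while k earlier results are still live.
            let mut ls: Vec<u32> =
                cs.iter().map(|&c| label(nodes, c, labels)).collect();
            ls.sort_unstable_by(|x, y| y.cmp(x));
            ls.iter().enumerate().map(|(k, &l)| l + k as u32).max().unwrap()
        }
    };
    labels[id] = need;
    need
}
```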
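The DCE passes are each a single backward liveness sweep over the straight-line code. A sketch, written generically so I don't have to spell out the whole instruction set; `dst_of`/`srcs_of` are hypothetical accessors:

```rust
// Walk backwards, keep an instruction only if its destination is still
// live (or it has no destination, i.e. it exists for side effects).
fn dce<I: Copy>(
    code: &mut Vec<I>,
    num_regs: usize,
    outputs: &[u32],
    dst_of: impl Fn(I) -> Option<u32>,
    srcs_of: impl Fn(I) -> Vec<u32>,
) {
    let mut live = vec![false; num_regs];
    for &r in outputs {
        live[r as usize] = true; // program results are always live
    }
    let mut keep = vec![false; code.len()];
    for i in (0..code.len()).rev() {
        let ins = code[i];
        let needed = match dst_of(ins) {
            Some(d) => live[d as usize],
            None => true,
        };
        if needed {
            keep[i] = true;
            if let Some(d) = dst_of(ins) {
                live[d as usize] = false; // this def kills liveness above it
            }
            for s in srcs_of(ins) {
                live[s as usize] = true; // operands must be computed
            }
        }
    }
    let mut k = keep.iter();
    code.retain(|_| *k.next().unwrap());
}
```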
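And on the evaluation side, the tight loop is the usual register-VM shape. The instruction word layout here is hypothetical (not my actual encoding), just to show what the loop looks like:

```rust
// One instruction packed into a u64: [8-bit opcode | 3 x 18-bit regs].
const OP_ADD: u8 = 0;
const OP_MUL: u8 = 1;
const OP_MULADD: u8 = 2;
const OP_SINCOS: u8 = 3;

fn eval(code: &[u64], regs: &mut [f64]) {
    const M: u64 = 0x3_FFFF; // 18-bit field mask
    for &w in code {
        let op = (w >> 56) as u8;
        let d = ((w >> 36) & M) as usize;
        let a = ((w >> 18) & M) as usize;
        let b = (w & M) as usize;
        match op {
            OP_ADD => regs[d] = regs[a] + regs[b],
            OP_MUL => regs[d] = regs[a] * regs[b],
            // muladd in a*b + d form so it fits three operand fields
            OP_MULADD => regs[d] = regs[a].mul_add(regs[b], regs[d]),
            // the fused pair writes sin to d and cos to the next register
            OP_SINCOS => {
                let (s, c) = regs[a].sin_cos();
                regs[d] = s;
                regs[d + 1] = c;
            }
            _ => unreachable!("unknown opcode"),
        }
    }
}
```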
For reference, at the moment it compiles a DAG that is basically just polynomials, 1,312,149 nodes, in 350-400 ms into 105,152 instructions over 52,883 registers, and each evaluation takes around 550 µs.
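Doing the arithmetic on that: 550 µs / 105,152 instructions ≈ 5.2 ns per instruction, i.e. roughly 16-21 cycles on a 3-4 GHz core, so dispatch overhead alone doesn't seem to be hiding an order-of-magnitude win.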
My question is: how do you know when to stop optimizing the compiler itself? I've been analyzing the output and the flame graph, but ~90% of the time is already spent on actual math, so the usual profiling signals aren't pointing anywhere obvious. I'm feeling a bit stuck on where to go next, or on whether I should just stop and work on other parts.