u/watman12

▲ 10 r/golang

A regression in code I didn't touch

CPU data cache associativity issues are relatively well known. Instruction cache associativity issues, less so.

While working on Go code, I investigated a surprising performance regression that turned out to be caused by L1 instruction cache associativity, in code I hadn’t even changed.

The investigation used the Go toolchain, but the underlying issue is mostly language-agnostic.
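
For anyone unfamiliar with the associativity part, here is a rough sketch of how addresses map to cache sets. The geometry (32 KiB, 8-way, 64-byte lines) is an assumption picked for illustration, not something taken from the post, and `cacheGeometry` and the base address are made-up names and values. Real hardware may index differently, so treat this as a mental model only.

```go
package main

import "fmt"

// cacheGeometry describes a set-associative cache. The values used in main
// are illustrative assumptions, not measurements from the post.
type cacheGeometry struct {
	sizeBytes int // total capacity
	ways      int // associativity (lines per set)
	lineBytes int // cache line size
}

// sets returns the number of sets in the cache.
func (g cacheGeometry) sets() int {
	return g.sizeBytes / (g.ways * g.lineBytes)
}

// setIndex returns the set that addr maps to under simple modulo indexing.
func (g cacheGeometry) setIndex(addr uintptr) int {
	return int(addr/uintptr(g.lineBytes)) % g.sets()
}

func main() {
	l1i := cacheGeometry{sizeBytes: 32 << 10, ways: 8, lineBytes: 64}

	// Addresses this many bytes apart land in the same set.
	stride := uintptr(l1i.sets() * l1i.lineBytes)

	// Hypothetical code address; ways+1 blocks at this stride all compete
	// for the same set, which only holds `ways` lines.
	base := uintptr(0x401040)
	for i := 0; i <= l1i.ways; i++ {
		addr := base + uintptr(i)*stride
		fmt.Printf("%#x -> set %d\n", addr, l1i.setIndex(addr))
	}
}
```

Under these assumptions, hot code blocks that happen to sit a multiple of the stride apart can evict each other even when the cache as a whole is far from full. A change elsewhere in the binary can shift code into that situation without the affected function being edited.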

blog.andr2i.com
u/watman12 — 20 hours ago
▲ 9 r/golang

I've been polishing this for about a month, and the last two weeks turned into an unhealthy benchmark-tuning spiral. I have to ship it now or I never will.

https://github.com/molecule-man/go-brrr

Features:

  • Pure Go Brotli (RFC 7932), no cgo
  • Byte-compatible with the C reference implementation
  • Full support for compound dictionaries

Performance:

  • Faster than andybalholm/brotli at every quality level
  • ~59% faster compression (geomean)
  • ~70% faster streaming decompression (geomean)

I also compared it against klauspost/compress zstd.
From Brotli q5 upward, the ratio-vs-speed tradeoff starts beating pure-Go zstd at its highest level.
At lower quality settings, zstd is still generally the better choice.

README has the benchmark tables/charts.

Bug reports very welcome.

u/watman12 — 22 days ago