u/Objective_Method_822

Built an open-source kdb+ alternative on weekends — 5.52M ticks/sec, standard SQL

I worked on quant infra for two years. Two things drove me crazy:

  1. The kdb+ license. ~$100K/core/year for production. Hard to justify when you're not at a top-5 fund.
  2. The q language. Every new hire spent 2 months learning it before shipping anything. That's expensive in engineer-time, and it locked our codebase into a tiny hiring pool.

I tried the obvious alternatives before building anything.

ClickHouse is great for analytics, but it doesn't have ASOF JOIN. If you've never used ASOF JOIN, it's the SQL operator that lets you do tick-by-tick correlation across feeds — joining a trade with the most recent quote at or before its timestamp. You can fake it with correlated subqueries but it's slow and ugly.
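If you've never seen the semantics, they fit in a few lines of Python. This is a toy sketch over sorted lists (nothing like a real columnar implementation — `asof_join` and the sample data are mine, just to show what the operator computes):

```python
from bisect import bisect_right

def asof_join(trades, quotes):
    """For each trade, find the most recent quote at or before its timestamp.

    trades: list of (ts, price); quotes: list of (ts, bid), sorted by ts.
    Returns a list of (trade_ts, trade_price, matched_bid_or_None).
    """
    quote_ts = [ts for ts, _ in quotes]
    out = []
    for ts, price in trades:
        i = bisect_right(quote_ts, ts) - 1  # rightmost quote with quote_ts <= ts
        out.append((ts, price, quotes[i][1] if i >= 0 else None))
    return out

trades = [(105, 10.0), (205, 10.1)]
quotes = [(100, 9.9), (200, 10.0), (300, 10.2)]
print(asof_join(trades, quotes))
# [(105, 10.0, 9.9), (205, 10.1, 10.0)]
```

The correlated-subquery workaround computes exactly this, but as an O(n·m) rescan per trade instead of one merge pass over two sorted streams.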

InfluxDB chokes above ~500K events/sec per series. TimescaleDB is fine for slower workloads but not for tick data.

So I started writing my own thing in C++ on weekends. It became ZeptoDB.

**What it does**

- Standard SQL with ASOF JOIN, Window JOIN, xbar (kdb+-style time bucketing), VWAP, EMA — the financial functions you actually use
- 5.52M ticks/sec sustained single-node ingest (8 cores, x86)
- 272µs filter on 1M rows, 248µs GROUP BY
- FIX (350ns/msg), NASDAQ ITCH (250ns/msg), Kafka, MQTT, OPC-UA native consumers
- Python zero-copy bridge — DataFrame in, DataFrame out, no serialization
- Source-available (BSL-1.1, becomes Apache-2.0 in 2030), self-host, K8s Helm chart included
- x86 and ARM/Graviton both supported (test matrix runs on both)
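For anyone who hasn't used kdb+: xbar just rounds timestamps down to a bucket boundary, and VWAP is volume-weighted average price per bucket. A plain-Python sketch of the semantics (my function names, not ZeptoDB's SQL surface):

```python
from collections import defaultdict

def xbar(ts, width):
    """kdb+-style xbar: round a timestamp down to the start of its bucket."""
    return ts - ts % width

def vwap_by_bucket(ticks, width):
    """VWAP per time bucket. ticks: iterable of (ts, price, size)."""
    notional = defaultdict(float)
    volume = defaultdict(float)
    for ts, price, size in ticks:
        b = xbar(ts, width)
        notional[b] += price * size
        volume[b] += size
    return {b: notional[b] / volume[b] for b in notional}

ticks = [(0, 10.0, 100), (30, 10.2, 300), (65, 10.1, 200)]
print({b: round(v, 4) for b, v in vwap_by_bucket(ticks, 60).items()})
# {0: 10.15, 60: 10.1}
```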

**What surprised me building it**

The wins came from places I didn't expect.

- Highway SIMD on window aggregates: 11x over scalar
- LLVM JIT on filter predicates kept us within kdb+'s range on most queries
- Per-(table, symbol, hour) partition keys gave a 2–50x speedup on multi-table workloads. We started with a symbol-only key, and it caused weird cross-table data leaks that took a while to track down.
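The partition-key bug is easy to picture: with a symbol-only key, "AAPL" rows from two different tables hash to the same partition. A toy sketch of the two schemes (hypothetical names and partition count, not ZeptoDB's actual hashing):

```python
from hashlib import blake2b

N_PARTITIONS = 64  # hypothetical partition count

def partition_symbol_only(symbol: str) -> int:
    # The buggy scheme: "AAPL" from *any* table lands in the same partition.
    h = blake2b(symbol.encode(), digest_size=8).digest()
    return int.from_bytes(h, "big") % N_PARTITIONS

def partition_composite(table: str, symbol: str, epoch_hour: int) -> int:
    # The fixed scheme: key on (table, symbol, hour), so tables can't
    # collide and a hot symbol spreads across hour buckets.
    key = f"{table}\x00{symbol}\x00{epoch_hour}".encode()
    h = blake2b(key, digest_size=8).digest()
    return int.from_bytes(h, "big") % N_PARTITIONS
```

The `\x00` separator matters: without a delimiter, `("ab", "c")` and `("a", "bc")` would produce the same key bytes.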

The thing that took longest wasn't performance. It was distributed cluster correctness — split-brain defense, FencingToken in the RPC header, K8s Lease integration, online partition rebalancing. Tick data needs strong correctness guarantees and most of the engineering effort went there, not into making queries fast.
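The fencing-token invariant itself is tiny — the hard part is the wiring (leases, RPC headers, rebalancing). A minimal sketch of just the invariant, with hypothetical names:

```python
class Replica:
    """Reject writes from stale leaders by comparing fencing tokens.

    Each leader election hands out a strictly larger token; a replica
    remembers the highest token it has seen and refuses anything older.
    """
    def __init__(self):
        self.highest_token = -1

    def write(self, token: int, record: str) -> bool:
        if token < self.highest_token:
            return False          # stale leader: fenced off
        self.highest_token = token
        # ... apply the write ...
        return True

r = Replica()
assert r.write(token=1, record="tick A")      # leader, epoch 1
assert r.write(token=2, record="tick B")      # new leader, epoch 2
assert not r.write(token=1, record="tick C")  # old leader wakes up: rejected
```

This is why a paused or partitioned old leader can't corrupt the log after a new election — its token is simply too old, no clocks involved.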

**What it's not (yet)**

Things I'd rather you know up front than hit in production:

- No JDBC/ODBC drivers. Tableau works through a ClickHouse protocol shim, Excel doesn't.
- No managed cloud. Self-host only for now.
- Window functions over virtual tables aren't supported.
- One query (VWAP 1M p50) has a ~7% gap vs my best baseline due to a clang register-spill issue. Documented in the devlog if you care.

**Where it ended up**

Started for quants. The same engine now runs in semiconductor fabs (10kHz OPC-UA sensor data), game backends (Kafka telemetry, anti-cheat analytics), and physical AI sensor fusion (ASOF JOIN across LiDAR + camera + IMU). Different verticals, same workload shape.

Happy to answer questions — the kdb+ comparison, why C++ over Rust, why I didn't just put q on top of a free DB, anything.

GitHub: https://github.com/ZeptoDB/ZeptoDB
Site: https://zeptodb.com
