u/Beginning-Fruit-1397

Hello everyone,

For the past two months I've been working on belugas:
https://github.com/OutSquareCapital/belugas

It's a Python dataframe library aiming to provide a polars-like API to build and execute queries on a duckdb backend.

Compared to Ibis or narwhals, the goal is not to be a general multi-backend tool, but rather a specialist one, covering as much polars and duckdb functionality as possible (currently 700+ expression functions, selectors, unpivot, join_asof, geometry datatypes, and more).

Disclaimer:

I'm just a uni student, and I have no affiliation at all with DuckDB or polars.

It's not an "official" project showcase, because even though it works and can already do a lot, it's still WIP.

There's no documentation website yet, no PyPI package, no proper docstrings on all expressions, no full test coverage (1000+ tests currently, but it should be way more), etc.

I will do a proper showcase at some point in the future, though!

Hence, the motivation behind my post is to ask the following question:

If you were to use this library, would you rather have duckdb-aligned results after computation, or polars-aligned ones?

They can diverge quite a lot.

For example, the `millisecond` function doesn't give the same results at all, the `len` function doesn't have the same meaning, column names can differ after a join, etc.
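
To make the first two divergences concrete, here is a plain-Python sketch of the behaviors as I understand them from each library's docs. It does not call duckdb, polars, or belugas; the helper names are made up for illustration, and the assumptions are that duckdb's `millisecond` returns sub-minute milliseconds while polars' `dt.millisecond()` returns the millisecond within the current second, and that duckdb's `len` is a per-row string length while polars' `Expr.len()` is a row count.

```python
from datetime import datetime


def millisecond_duckdb_style(ts: datetime) -> int:
    """Assumed duckdb semantics: sub-minute milliseconds (seconds * 1000 + ms)."""
    return ts.second * 1000 + ts.microsecond // 1000


def millisecond_polars_style(ts: datetime) -> int:
    """Assumed polars semantics: millisecond within the current second (0-999)."""
    return ts.microsecond // 1000


ts = datetime(2024, 1, 1, 12, 34, 56, 789000)
duckdb_like = millisecond_duckdb_style(ts)
polars_like = millisecond_polars_style(ts)
print(duckdb_like, polars_like)  # 56789 789

# `len` on a string column, with a null in it (hypothetical data).
names = ["ab", "cde", None]
# Assumed duckdb semantics: one string length per row, null stays null.
len_duckdb_style = [None if s is None else len(s) for s in names]
# Assumed polars semantics: a single count of rows, nulls included.
len_polars_style = len(names)
print(len_duckdb_style, len_polars_style)  # [2, 3, None] 3
```

So the same method name can return a different number, or even a different shape (one value per row vs. a single value), depending on which side belugas aligns with.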

I see pros and cons for both.

## DuckDB aligned results

- Best performance-wise whenever matching polars would require an extra coalesce, null handling, or a custom column name resolution implementation

- Aligned with duckdb documentation and backend expectations

- Simpler to implement internally

## Polars aligned results

- Way easier to test (simply check equality with polars computations)

- I expect polars users to be more interested in belugas than duckdb users, so polars-aligned results are what most users would expect

- The possibility to switch seamlessly from `bl.col()` to `pl.col()` in a data pipeline when one tool is preferable to the other (which is a huge point IMO)

u/Beginning-Fruit-1397 — 7 days ago