u/Dieriba

Review of My C Library

Review of My C Library

Hi fellow C programmers,

I’ve been building my own C library as a way to go deeper into systems programming, API design, memory ownership, and testing.

The library has grown quite a bit, and currently includes:

  • Dynamic string (d_dyn_string)
  • String views (d_string_view)
  • Dynamic arrays (d_dyn_array)
  • Stack (d_stack)
  • Queue (d_queue)
  • Deque (d_de_queue)
  • Unordered map (d_unordered_map)
  • Hash set (d_hash_set)

I’d really love feedback from experienced C programmers:

  • API design: anything that feels awkward or unsafe?
  • File/project structure: does it scale well?
  • Ownership semantics: anything unclear or non-idiomatic?
  • What mature C libraries would you recommend studying?

Repository:

c_library

Any brutally honest feedback is welcome :)

u/Dieriba — 6 days ago

Choosing a hash function for a Swiss-table implementation in C

I’m implementing a Swiss-table style hash table in C and I’m trying to pick the right hash function.

For people who’ve implemented their own Swiss table (or robin hood / SIMD probing tables):

What hash function did you choose, and why?

Would love to hear your thoughts, benchmarks, or mistakes to avoid.

reddit.com
u/Dieriba — 7 days ago

Hi everyone,

I am implementing my own generic unordered_map in C inspired by SwissTable, and I am thinking through the memory layout.

Right now my rough structure looks like this:

typedef struct Table
{
    u8 *metadata;
    void *groups;
} Table;

typedef struct DUnorderedMap
{
    Table table;
    HashFn hash_fn;
    FreeFn free_fn;
    usize elem_size;
} DUnorderedMap;

The metadata would store the control bytes, similar to SwissTable. My question is about the layout of the actual key/value storage.

One idea I had was to make groups one contiguous array of pointers, split into two halves:

keys:   k1 k2 k3 k4 ...
values: v1 v2 v3 v4 ...

Instead of storing entries interleaved like:

[k1, v1][k2, v2][k3, v3]...

Would the split layout be better for cache locality when probing/comparing keys, since the keys are packed together? Or is the interleaved key/value layout usually better because once a key matches, the value is needed immediately?

My second question is about whether a generic C map should store only pointers to keys and values, or whether it is better to store key/value bytes inline inside each slot.

For fixed-size types, inline storage seems better because it avoids extra allocations and pointer indirection. But for types like C strings, the length is dynamic. If I tried to store the actual string bytes inline, then entries would become variable-sized, and indexing into the table would become complicated. I would also need to store the key length somewhere and handle alignment manually.

My understanding is that this would also make the SwissTable-style layout worse for string keys. With fixed-size entries, once the metadata tells me that slot i is a candidate, I can immediately compute the address of the key/value with something like:

entry = entries + i * entry_size;

But if keys are stored inline with dynamic length, I cannot directly compute the offset of slot i. I would need to know the length of previous keys, or maintain a separate offset table.

So for strings, maybe the better approach is to store a fixed-size object inside the slot, for example a char *, or a string view like:

typedef struct DStringView {
    const char *data;
    size_t len;
} DStringView;

Then the slot size remains fixed, and the string data lives elsewhere.

Does this sound like the right direction for a generic SwissTable-style map in C? Are there any common layout mistakes I should avoid?

reddit.com
u/Dieriba — 14 days ago