u/AvvYaa


A visual explanation of DeepSeek v4.

Compressed Sparse Attention (CSA)

Heavily Compressed Attention (HCA)

Sliding Window Attention (SWA)

DeepSeek Sparse Attention (DSA)

And more!

u/AvvYaa — 13 days ago