
▲ 6 r/RISCV
sse2rvv: An MMX/SSE/AES-NI C Intrinsics to RVV C Intrinsics Translator
github.com · u/omasanori · 6 hours ago

The designer says the project is for educational purposes, but why simplify things this much?
https://github.com/Grubre/smol-gpu
These are the simplifications I noticed:
Sequential warp scheduling
No warp-level parallelism within a core
No cache hierarchy
Separated program and data memory
No shared memory / scratchpad
No barrier / synchronization primitives
No reconvergence stack in hardware
and many more.
Is there any reasoning behind these simplifications?
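To make the first two items concrete, here is a toy cycle-count model of what I mean by sequential warp scheduling versus letting another warp issue while one waits on memory. This is only a sketch under my own assumptions, not a model of the smol-gpu RTL: `MEM_LATENCY`, the `"alu"`/`"mem"` op names, and both function names are made up for illustration.

```python
MEM_LATENCY = 4  # assumed load latency in cycles; illustrative only


def sequential_cycles(warps):
    """Run each warp to completion before starting the next one."""
    cycles = 0
    for program in warps:
        for op in program:
            # The core stalls for the full latency of every instruction.
            cycles += MEM_LATENCY if op == "mem" else 1
    return cycles


def interleaved_cycles(warps):
    """Issue at most one instruction per cycle from the lowest-numbered ready warp."""
    pc = [0] * len(warps)        # next instruction index per warp
    ready_at = [0] * len(warps)  # cycle at which each warp may issue again
    cycle = 0
    while any(pc[w] < len(warps[w]) for w in range(len(warps))):
        for w in range(len(warps)):
            if pc[w] < len(warps[w]) and ready_at[w] <= cycle:
                op = warps[w][pc[w]]
                pc[w] += 1
                ready_at[w] = cycle + (MEM_LATENCY if op == "mem" else 1)
                break  # only one issue slot per cycle
        cycle += 1
    # Account for the last outstanding operation finishing.
    return max(cycle, max(ready_at))


if __name__ == "__main__":
    warps = [["mem", "alu"], ["mem", "alu"]]
    print("sequential :", sequential_cycles(warps))
    print("interleaved:", interleaved_cycles(warps))
```

With two warps that each do one load followed by one ALU op, the sequential version pays the load latency twice back to back, while the interleaved version overlaps the two loads. That latency hiding is what I would expect a more realistic design to have.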
I also checked the RTL and found a few possible race conditions. Is this repo even a legitimate baseline for building a more advanced GPU on top of it?