I built a log-structured filesystem with CoW functionality in Rust and the heatmaps are... interesting.
I am a 2nd year CSE student and I have been building a custom filesystem called RoseFS. I am building my own OS and I basically wanted the performance and flash-lifespan of f2fs but with the userspace CoW snapshotting features of btrfs. I need that for a virtual A/B root system.
I looked at bcachefs and others but they all have their own ups and downs. I figured if I build it myself I actually own the code and can optimize it exactly for my OS and ecosystem.
The Benchmark: I ran 30,000 ops on a 64MB loopback device to see how each filesystem handles extreme pressure.
- ext4: Looks clean because it updates in place, but it hammers the same blocks over and over (the hottest block was written 1821 times).
- f2fs: I do not know what it is doing, but 4.3 million host writes for 30k ops is crazy. That could have melted the NAND chips on mobile devices (UFS storage on Android phones). However, because it writes so much, it doesn't hit the same block again and again, so technically it does protect the lifespan of the hardware!
- RoseFS: It looks messy because it is log-structured and spreads writes for wear leveling. The write count is 286k, which is roughly 15x lower than the 4.3 million from f2fs.
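
For anyone who wants to compare the numbers the same way, here is a rough sketch of how the per-filesystem figures can be reduced to a few summary values. It assumes the heatmap data is just a slice of per-block write counts; `WearStats`, `wear_stats` and `writes_per_op` are names made up for this post and are not part of RoseFS or any tracing tool.

```rust
/// Summary of a per-block write trace. Every name here is illustrative.
#[derive(Debug, PartialEq)]
struct WearStats {
    /// Sum of all block writes seen during the run.
    total_writes: u64,
    /// Write count of the single most-written block.
    hottest_count: u64,
    /// Number of blocks written at least once.
    touched_blocks: usize,
}

/// `counts[i]` is how many times block `i` was written during the run.
fn wear_stats(counts: &[u64]) -> WearStats {
    WearStats {
        total_writes: counts.iter().sum(),
        hottest_count: counts.iter().copied().max().unwrap_or(0),
        touched_blocks: counts.iter().filter(|&&c| c > 0).count(),
    }
}

/// Average device writes per filesystem operation, in whatever unit
/// the trace counts (blocks, sectors, ...).
fn writes_per_op(total_writes: u64, ops: u64) -> f64 {
    total_writes as f64 / ops as f64
}
```

Plugging in the figures above, `writes_per_op(4_300_000, 30_000)` comes out at about 143 for f2fs and `writes_per_op(286_000, 30_000)` at about 9.5 for RoseFS.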
Current Feature Set:
- Grid packing (8 inodes per 4k page).
- Inline directory entries and tail packing for small files.
- CoW batching (metadata updates are delayed and batched to reduce the write amplification factor, WAF).
- Modern iomap-based direct I/O (DIO) path.
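
To make the grid packing item concrete, here is a minimal sketch of the address arithmetic. It assumes the 8 inodes split a 4k page into equal 512-byte slots and that inode numbers are dense starting from 0. The real on-disk layout may differ, and `inode_location` is a hypothetical helper, not the actual RoseFS function.

```rust
const PAGE_SIZE: usize = 4096;
const INODES_PER_PAGE: usize = 8;
/// Assumed: equal-sized slots, so 4096 / 8 = 512 bytes per inode.
const SLOT_SIZE: usize = PAGE_SIZE / INODES_PER_PAGE;

/// Map an inode number to (page index within the inode area,
/// byte offset of the inode's slot inside that page).
fn inode_location(ino: u64) -> (u64, usize) {
    let page = ino / INODES_PER_PAGE as u64;
    let slot = (ino % INODES_PER_PAGE as u64) as usize;
    (page, slot * SLOT_SIZE)
}
```

Under these assumptions inode 9 lands in page 1 at byte offset 512, and anything that does not fit into a 512-byte slot (after the inode header) has to live outside the inode.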
Current Issues: I am hitting a capacity wall. On these tiny 64MB disks I can only get about 57% usable space before ENOSPC. The B+ tree CoW overhead is just too high for small volumes. Also the NAT (Node Address Table) is still a big hotspot (the pink line at the bottom).
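
To put the 57% figure in block terms, here is the back-of-the-envelope arithmetic. It assumes "64MB" means 64 MiB and that the block size is 4 KiB (matching the 4k page size above). Both are assumptions, and the function names are made up for illustration.

```rust
const VOLUME_BYTES: u64 = 64 * 1024 * 1024; // assumed: 64 MiB
const BLOCK_SIZE: u64 = 4096;               // assumed: 4 KiB blocks

/// Total number of blocks on the volume.
fn total_blocks() -> u64 {
    VOLUME_BYTES / BLOCK_SIZE
}

/// Blocks available for data at a given usable percentage (rounded down).
fn usable_blocks(total: u64, usable_percent: u64) -> u64 {
    total * usable_percent / 100
}

/// Blocks lost to metadata, reserves and CoW overhead.
fn overhead_blocks(total: u64, usable_percent: u64) -> u64 {
    total - usable_blocks(total, usable_percent)
}
```

With these assumptions the volume has 16384 blocks, 9338 of them usable at 57%, which leaves 7046 blocks (about 27.5 MiB) going to metadata and CoW overhead.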
Looking for Advice: I want to get in touch with some experienced FS maintainers. I want to learn more about proper automated test suites and better ways to handle metadata compression or B+ tree bloat.
Full disclosure: I use AI to help me build stuff faster because I am still a student and I am definitely not on par with the great kernel devs yet. But I do understand the underlying concepts, and I am trying to understand every line that goes into the module.
Any feedback on the heatmap or how to handle CoW bloat on small volumes would be appreciated.