r/zfs

New release candidate 10 for OpenZFS on Windows 2.4.1
▲ 8 r/zfs


https://github.com/openzfsonwindows/openzfs/releases
https://github.com/openzfsonwindows/openzfs/issues

rc10

  • Add FileCompressionInformation to enable query of on-disk compressed size
  • Assorted performance fixes
  • Hardlink deletion would hide all other hardlinks
  • Fix deadlock in write path
  • Prioritise HarddiskXPartitionY paths over hack path
  • Add import --fix-gpt to correct NumPartitions=9 to NumPartitions=128.
  • Fix up condvar and mutex
  • Use User credentials, enabling zfs allow to work. Mix Unix and Windows permissions and hope for the best
  • OpenZVOL unload bug fixes
  • Fix spl_panic() call print and stack

GPT partition tables created on Unix use gpt.NumPartitions=9, which Windows does not accept: Windows computes gpt.checksum as if gpt.NumPartitions==128, so the checksum mismatches and the partition table is ignored.

This is why OpenZFS uses path encoding of #partition_offset#partition_length#/path/to/device, saved into vdev->vdev_physpath.

This continues to work.

We added a new zpool import --fix-gpt which will rewrite gpt.NumPartitions=128 and recompute gpt.checksum. Since libefi already reads in the full GPT partition table, we need not change anything else before writing it back out. This is left as a user option, as there could be partition usage I am unaware of. Who knows if some legacy archs can only use fewer partitions? Or store microcode in the back half.

If GPT is written with gpt.NumPartitions=128, Windows will recognise the partitions, and create //?/HarddiskXPartitionY device objects, so we can import those directly, no need for special path. Success. We prioritise //?/HarddiskXPartitionY over #partition_offset#partition_length#/path/to/device - but it will try both.
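
The repair flow described above, as a sketch (the pool name is hypothetical; --fix-gpt is the new opt-in flag from this release):

```shell
# Scan-only: lists importable pools, writes nothing
zpool import

# Opt-in repair on import: rewrites gpt.NumPartitions=9 -> 128 and
# recomputes gpt.checksum, as described above
zpool import --fix-gpt tank
```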

Let's check for regression in this release.

Evaluate and report issues

u/_gea_ — 5 hours ago
▲ 9 r/zfs

ZFS Encryption Key vs Passphrase

I am not a TrueNAS user but I watched:

https://www.youtube.com/watch?v=RmJMqacoPw4

and in that video, it's mentioned that TrueNAS gives you the option to unlock encrypted datasets with either a passphrase or a key.

When installing Proxmox, IIRC I set both the passphrase and the key. When I boot Proxmox, I input the key to unlock the data. What I can't find anywhere is whether ZFS has the same two options of key and passphrase or is it different to TrueNAS and needs both? Or how does it work?

I'm trying to figure out whether I need to do the key step and back the key up or if I can just use a passphrase and generate a key at a later date if necessary?
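
For context, native OpenZFS encryption has one wrapping key per encryption root, and the keyformat property decides whether you supply it as a passphrase or a key file; zfs change-key can switch between them later. A sketch with hypothetical pool/dataset names:

```shell
# Passphrase-wrapped dataset (you are prompted on zfs load-key):
zfs create -o encryption=on -o keyformat=passphrase tank/private

# Raw-key-wrapped dataset (32 random bytes in a file; back this file up):
dd if=/dev/urandom of=/root/tank.key bs=32 count=1
zfs create -o encryption=on -o keyformat=raw \
    -o keylocation=file:///root/tank.key tank/private2

# Switch an existing dataset from passphrase to a key file later:
zfs change-key -o keyformat=raw \
    -o keylocation=file:///root/tank.key tank/private
```

So in plain OpenZFS it is either/or, not both: a dataset has one wrapping key, supplied in whichever form keyformat specifies, and you can move from a passphrase to a generated key file at a later date with change-key.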

u/Jastibute — 1 day ago
▲ 6 r/zfs

ZFS pool offline after power outage - unable to open rootbp, cant_write=1, metaslab space map crash

My external ZFS pool went offline after a power outage. The drive is connected via USB enclosure. I've tried recovery on both TrueNAS 25.04 and Ubuntu ZFS 2.2.2 with no success. Data is irreplaceable (no backup) so looking for any recovery options before going to professional recovery.

Drive Info

  • Single disk pool, no redundancy
  • Drive reads fine with dd at 200+ MB/s, no read errors
  • SMART test passes

Pool Label (zdb -l /dev/sde1)

name: 'external_backup'
state: 0
txg: 2893350
pool_guid: 5614369720530082003
txg from uberblock: 2894845

zdb -e -p /dev/sde1 on TrueNAS shows

vdev.c: disk vdev '/dev/sde1': probe done, cant_read=0 cant_write=1
spa_load: LOADED successfully
then crashes at: ASSERT at cmd/zdb/zdb.c:6621
loading concrete vdev 0, metaslab 765 of 1164, space_map_load failed

All Import Attempts Fail With

cannot import 'external_backup': I/O error unable to open rootbp in dsl_pool_init [error=5]

What I've Tried

  • zpool import -f
  • zpool import -F -f (recovery mode)
  • zpool import -F -f -o readonly=on
  • zpool import -f -T 2893350 — gives different error: "one or more devices is currently unavailable" instead of I/O error
  • zdb -e -p — pool loads but crashes at metaslab 765 space map verification
  • Tried on TrueNAS 25.04 and Ubuntu ZFS 2.2.2/2.3.4

Key Observations

  • cant_write=1 appears on TrueNAS but not on Ubuntu
  • zdb actually loaded the pool successfully on TrueNAS before crashing at metaslab verification
  • -T 2893350 (older txg from label) gives a different error suggesting that txg may be accessible
  • partuuid symlink exists and matches label

Any suggestions on next steps before going to professional recovery?
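
One avenue not in the list above, sketched (the <txg> placeholder is a value you pick from the uberblock list; everything here is readonly except the last tunable):

```shell
# List every uberblock on the label; each carries a txg and timestamp you
# can try rewinding to, not just the single txg printed by zdb -l:
zdb -ul /dev/sde1

# Readonly import at a chosen txg; -N skips mounting datasets:
zpool import -o readonly=on -f -N -T <txg> external_backup

# Last-resort kernel tunable that relaxes some assertions during import:
echo 1 > /sys/module/zfs/parameters/zfs_recover
```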

reddit.com
u/chiefrussian — 1 day ago
▲ 4 r/zfs

Curious about thoughts on vdev layouts?

I have been able to get very lucky and scrape together a system that is quite solid. I have 64GB of RAM, 8x12TB used enterprise drives, 2x1.92TB SATA SSDs, 2x256GB SATA SSDs (likely for the OS), and 2x1TB NVMe drives.

As I have only used ZFS in a basic capacity, what I would like to ask is: what would be the safest and most efficient way to lay out the vdevs?

The large capacity will mostly be used for media files, photo backups, and file backups/backups in general.

The way I understand it my most useful options are listed below:

  • One big raidz2 or 3, with or w/o a special vdev
  • 2 raidz1 vdevs, with or w/o a special vdev
  • 4 mirror vdevs, with or w/o a special vdev
  • Everything in its own pool, a big raidz2 or 3 and mirrors for the respective ssds

Just looking for thoughts. I would like to prioritize safety and efficiency; some capacity loss is OK to a point, but I would like to reduce it as much as possible.
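
For reference, the first option above (one big raidz2 plus a mirrored special vdev) would be created roughly like this; device names are placeholders:

```shell
# 8-wide raidz2 data vdev plus a mirrored special vdev for metadata:
zpool create tank \
    raidz2 sda sdb sdc sdd sde sdf sdg sdh \
    special mirror nvme0n1 nvme1n1

# Optionally route small blocks (not just metadata) to the special vdev:
zfs set special_small_blocks=64K tank
```

The special vdev is mirrored because losing it loses the pool; it carries pool metadata, not redundant copies of data.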

u/ComatoseCow — 2 days ago
▲ 13 r/zfs+1 crossposts

Free Webinar: ZFS 101 (Basics + Practical Design Tips)

We’re hosting a free session on the fundamentals of ZFS for anyone looking to better understand how it actually works under the hood.

📅 Apr 22, 2026
⏰ 3–4 PM AST

We’ll cover:

  • RAID types and how they affect performance/resiliency
  • How the ZFS volume manager works
  • ZFS storage hierarchy (vdevs, pools, datasets, etc.)
  • How ZFS ties filesystem + storage together
  • Managing ZFS with the 45Drives Houston UI

This is aimed at beginners/intermediate users who want a clearer mental model of ZFS.

If that sounds useful, you can register here:
https://ow.ly/eHTx50YN3l0

u/45drives — 1 day ago
▲ 2 r/zfs

How to benchmark ZFS?

I'm building a NAS and want to benchmark my pool. It is a 2x2TB HDD mirror; I have 64GB of DDR4 RAM and an i3-14100.

I want to check how it performs and compare to ext4, but I'm afraid having this amount of memory will cloud the results.

I'm thinking of allocating a 50GB file in a tmpfs, with random data from /dev/urandom. Would this be enough to trigger I/O to be flushed to disk frequently?

What else can I tune to not have RAM impacting the results too much?

Also, what fun benchmarks to run? I'm thinking of fio, pgbench, copying small/medium/large files. What else would be cool?
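
One common way to keep RAM from clouding the results is to cap the ARC and size the working set well past it; a sketch, with example paths and sizes:

```shell
# Cap ARC at 4 GiB for the duration of the test (requires root):
echo $((4 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_arc_max

# fio with a 50G working set, random mixed I/O, time-based run:
fio --name=randrw --directory=/tank/bench --rw=randrw --bs=4k \
    --size=50G --numjobs=4 --iodepth=16 --ioengine=io_uring \
    --runtime=120 --time_based --group_reporting
```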

u/hpb42 — 6 days ago
▲ 18 r/zfs+1 crossposts

bzfs v1.20.0 is out


This release has a few changes I'm pretty excited about if you use ZFS replication in more demanding setups:

  • New --r2r support for efficient remote-to-remote bulk data transfers
  • --bwlimit now also applies to mbuffer, not just pv
  • A Docker image with a corresponding replication example
  • Better validation and hardening around SSH config files, recv options, file permissions, and incompatible remote shells
  • A new bzfs_jobrunner --repeat-if-took-more-than-seconds option

The headline item is probably --r2r. If you have source and destination on different remote hosts and want the data path to be more efficient, this release makes that workflow more natural and efficient.

I also tightened up a few safety checks. bzfs is the sort of tool people use for backups, disaster recovery, and automation, so I'd rather be conservative than "flexible" in ways that can go wrong later.

If you want the full changelog: https://github.com/whoschek/bzfs/blob/main/CHANGELOG.md

If you're using bzfs for local replication, push/pull over SSH, remote-to-remote, or scheduled jobrunner setups, I'd be interested in hearing what your setup looks like and where it still feels rough.

u/werwolf9 — 6 days ago
▲ 0 r/zfs

I got tired of parsing FIO JSON logs manually, so I built a simple web-based visualizer with AI help

Hi everyone,

I’ve been doing a lot of HDD benchmarking lately using FIO to test different configurations and ashift values. As much as I love FIO, comparing latency and bandwidth from raw JSON or log files isn't practical. Tools like matplotlib or fio-plot are great, but if you don’t use them for a few weeks you forget how they work.

I couldn't find a simple, no-fuss online tool that just takes the JSON/Log output and turns it into clean graphs, so I decided to build one with AI help: https://raidzfscalculator.com/en/fio-results-viewer/

What it does:

  • Visualizes FIO JSON output (IOPS summary).
  • Plots Bandwidth, IOPS, and Latency time-series from .log files.
  • No registration, no data tracking, no ads (there is a "Clear Session" button to wipe your logs from the server).

It’s still a work in progress, but it has helped me a lot to visualize clat spikes quickly. I thought some of you might find it useful for your own tests instead of spending hours figuring out how other plotting software works.

Any feedback on what metrics I should add next?
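
For anyone wanting to try it, the two kinds of input the tool consumes can be produced with fio's standard output and logging options:

```shell
# JSON summary plus per-second bandwidth/IOPS/latency time-series logs:
fio --name=test --rw=randread --bs=4k --size=1G \
    --output-format=json --output=result.json \
    --write_bw_log=test --write_iops_log=test --write_lat_log=test \
    --log_avg_msec=1000
```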

u/No_Signature4127 — 2 days ago
▲ 6 r/zfs

Struggling to understand zfs dRAID (calculator)

I'm adding 12x8TB drives to my server. I'm looking at two dRAID configs - one with a bigger safety net than the other. But I'm not understanding the configs. The configs would be:

Config 1:
draid1:10d:12c:1s
I'd expect this to have 10x8TB(ish) space - 80TB usable, 8TB for parity and 8TB for Spare.

Config 2:
draid2:8d:12c:2s
I'd expect this to have 8x8TB(ish) space - 64TB usable, 16TB for parity and 16TB for Spare.

But that's not what the graph shows at all - Config1 shows ~70TiB usable with 8 Data Disks and capacity drops to ~55TiB if I have 10 data disks. This doesn't make sense to me since 8x8TB disks would never fit 70TiB's worth of data...

Config 2 looks more like I'd expect it - around ~55TiB with 8 data disks since I'm using about 4 disks' worth for redundancy.

What am I doing wrong?
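
A likely source of the confusion is TB vs TiB on top of dRAID's stripe math: usable space is roughly (children − spares) × d/(d+p) × disk size, and 80 TB is only ~72.8 TiB, which is close to the calculator's ~70 TiB once ZFS overhead comes off. A quick check of both configs (the formula is an approximation; real pools lose a bit more to metadata):

```shell
# Approximate dRAID usable capacity: (c - s) * d/(d + p) * disk_size
draid_usable() { # args: data parity children spares disk_tb
  awk -v d="$1" -v p="$2" -v c="$3" -v s="$4" -v t="$5" \
    'BEGIN { tb = (c - s) * d / (d + p) * t
             printf "%.1f TB (%.1f TiB)\n", tb, tb * 1e12 / 2^40 }'
}
draid_usable 10 1 12 1 8   # draid1:10d:12c:1s -> 80.0 TB (72.8 TiB)
draid_usable 8  2 12 2 8   # draid2:8d:12c:2s  -> 64.0 TB (58.2 TiB)
```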

u/4chanisforbabies — 6 days ago
▲ 0 r/zfs

3 drive Mirror or 3 drive z2 - data security ONLY

OK, so the premise as always: you need an odd number of drives in a mirror, with a majority working, to validate against data loss/corruption. I want to see a comparison based on how ZFS actually works.

A 2-drive mirror can lose 1 drive but can NOT validate whether data is corrupted or bit-rotted.

A 3 (or 4) drive mirror, while physically possible, has the same problem as a 2-drive mirror.
A 5-drive mirror can lose 2 drives with no data loss and can validate against bit rot and corruption (as can all odd-numbered mirrors beyond that).

Now, if you do not care about speed but ONLY data security: can a Z2 with 3 drives do what is needed the way a 5-way mirror does, or a Z3 with 4 drives the way a 7-way mirror does?

Other than using fewer drives, per the ZFS code this seems to be correct. Am I misunderstanding, or, due to the striped nature, are the additional drives of the mirror better, and why?

Note this thread only cares about data redundancy; it does not care about speed. It is a given that the Z2 and Z3 will be slower due to additional writes.

u/Financial-Issue4226 — 8 days ago
▲ 3 r/zfs

One or more devices has experienced an error resulting in data corruption. Applications may be affected

Hello. First off I would like to apologize for my lack of knowledge. While there are some things I know when it comes to PCs, I don’t know everything. So some of my terminology may not be correct. I’m simply someone who wants to have a simple NAS on a budget. I know very little of Linux, and I’m willing to learn more so I can help maintain this system.

I have set up a NAS with a Thinkcentre M910Q. There is a 2.5" SSD where the OS is installed, as well as a 1TB M.2 drive; that is where my apps, files, and datasets are. The installed apps I have are Nextcloud, Cloudflare Tunnel, Tailscale, and Jellyfin. It’s set up for simple file sharing and media streaming, not necessarily file backups. Although I hope to expand to something better later, so that I can use this as data backup.

I’m frequently experiencing an issue. Now the first thing I want to mention is that the M.2 is not being held down properly. And yes, I am already taking measures to try and fix this. The mini PC that I have is not meant for a standoff and screw. I have ordered a plastic push-pin which will be arriving soon and will hopefully stop this issue from occurring. And yes, I do realize that this could very well be causing all these errors and what I’m experiencing. I understand that all of this may be redundant given this. I am doing what I can for now, and until I have what I need to properly secure my M.2, here is the issue.

I have alerts set up to my email. Pretty much every day, I’ll get the error “Pool “my pool name” state is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected.” Ever since I got the message the first time, I logged into the web UI to see that the CPU averages a high ~95% usage. I would reboot to see if all of my files were corrupted. Rebooting or shutting down via the web UI wouldn’t do anything. I would forcefully shut it off, reboot, and find that all my files are safe. A notification pops up saying that all of the previous errors have been cleared.

Today that error has occurred multiple times. Seemingly with no cause, not even any heavy work loads. On top of a new error. “Pool “my pool name” state is SUSPENDED: One or more devices are faulted in response to IO failures. The following devices are not healthy: ”My M.2 Drive”.

I ran zpool status -v during one of the times the error occurred, with this as the output:

Permanent errors have been detected in the following files:
/var/db/system/update/update.sqsh
/mnt/.ix-apps/app_mounts/jellyfin/config/data/jellyfin.db-shm

Another instance of having and error and running the same command resulted in this:
(Some of the characters are not exact and I apologize for that)

Permanent errors in:

/mnt/.ix-apps/docker/containers/85e8175a59bb209e7c361214b6f5ded968f387a3deb5c0c6bb46b5b42c7a729e/85e8175a59bb209e7c361214b6f5ded968f387a3deb5c0c6bb46b5b42c7a729e-json.log

/var/db/system/netdata/dbengine/datafile-1-000000094.ndf

/var/db/system/netdata/journalfile-1-000000094.njf

/mnt/.ix-apps/app_mounts/jellyfin/config/data/jellyfin.db-shm

But it’s worth noting that I’ve had the first error happen to me many times without any apps even installed, simply using the SMB service. I have never run zpool status before today, and it’s my first time noticing the files affected. So I’m confused to see files referenced from Jellyfin. It makes me concerned about what the actual problem may be.

It has been a cycle ever since. I have seen a few people online mentioning the possibility of faulty RAM, so currently I’m running MemTest86. I have previously put my M.2 in a portable drive enclosure on my main PC and ran CrystalDiskInfo. The drive was reportedly healthy. Not entirely sure if using only that software was the right move or conclusive enough to determine that.

u/Mr_Esuoh — 8 days ago
▲ 1 r/zfs+2 crossposts

From Celeron Optiplex to dual-node Proxmox with RAIDZ3, VLANs, and hardened cameras — 15+ years of homelab evolution

I see a lot of "where do I start?" posts in the homelab world. I started with hardware I didn't understand and broke things until they worked. No formal IT background — just practice, reading docs, and more recently letting AI compress the feedback loop. Hopefully the "Skills Developed" tags help anyone wondering what they'll actually learn by tackling each piece.    

The Physical Setup: From IKEA to Rack                                                                                      

My first "rack" was a 19" IKEA LACK table with a switch and a Celeron Optiplex sitting on top. Then a second LACK on top of that. Then a third. By the time I had four stacked with gear spilling out of every shelf, it was time for a real rack. Now everything lives in a dedicated 11U rack on wheels — clean cabling, proper airflow, and I can actually find things at 2am. The gear is almost entirely off-lease enterprise equipment. Decommissioned rack servers, surplus managed switches, enterprise SAS enclosures — stuff companies dump after 3-5 years when warranties expire. A server that cost $15K new goes for a few hundred dollars. Even an empty SAN frame with a handful of drives gives you a real enterprise storage interface to learn on —same CLI a storage admin uses in production, for the price of a dinner out. This was before and during the SaaS boom, but the premise still holds: cheap enterprise hardware lets you learn at home, a little bit every day. Thirty minutes every evening compounds. A year of that is 180+ hours of hands-on practice no certification course can replicate. The gear doesn't need to be current — it needs to be real.

Skills Developed: Physical rack planning, cable management, airflow design, enterprise hardware sourcing, evaluating off-lease equipment   

Virtualization: Celeron 600 → ESXi → Proxmox

First server was that Optiplex — Celeron 600, maxed RAM, pair of VelociRaptor 10K RPM drives. Ran CentOS 5 with SSH exposed to the internet for tunneling home from school — an L3 proxy bypass via SSH SOCKS (ssh -D 1080). That machine taught me more about Linux than any course.                                                                                               

Built a couple of custom 4U rackmount Ubuntu servers after that — first experience with hot-swap bays, server motherboards, and IPMI. Big, loud, power-hungry, educational. Moved to VMware ESXi free tier, then vSphere Enterprise through their educational program (~$200/year) — vMotion, HA, distributed switches. Ran that for years until the Broadcom acquisition pushed me to Proxmox. Now running two Proxmox nodes. Primary hosts everything: virtualized firewall, DNS, media stack (the full *arr suite), network monitoring, vulnerability scanning, web frontends. Secondary is dedicated to backup via PBS. 

Skills Developed: Linux administration (CentOS/Ubuntu), SSH tunneling and SOCKS proxies, server hardware selection, IPMI/out-of-band management, ESXi/vSphere administration, vMotion and HA clustering, Proxmox VE, KVM vs LXC decision-making                                

Storage & Backup: RAIDZ3 + Dedicated Backup Bond 

The backup node runs a 12-disk RAIDZ3 pool (~20TB usable) — three simultaneous drive failures before data loss. Already exercised this replacing 3 faulted drives via hot-swap. The procedure: map ZFS vdev IDs to physical SAS addresses with sg_ses, light the locate LED, swap the drive. The RAID controller doesn't auto-configure hot-swapped drives as JBOD, so you hit the out-of-band management API to set JBOD mode and rescan the SCSI bus. Took a full day to figure out the first time. Backups run daily via PBS with 30-day retention and zstd compression. Traffic runs over a dedicated 3x1Gb bonded link (balance-xor, MTU 9000) on a separate subnet, keeping it off the production LAN.    

Skills Developed: ZFS RAIDZ3 administration, hot-swap drive replacement, SAS enclosure management (sg_ses), out-of-band management APIs, NIC bonding, MTU tuning, PBS configuration, backup retention policies                 
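
The vdev-to-slot mapping and locate-LED step described above, sketched (the enclosure device and slot index are examples for illustration):

```shell
# Full device paths for each vdev in the pool:
zpool status -P
# Pair block devices with their SCSI generic enclosure devices:
lsscsi -g

# Light the locate LED on slot 7 of the enclosure at /dev/sg3:
sg_ses --index=7 --set=locate /dev/sg3

# After swapping the drive, rescan the SCSI bus so the new disk appears:
echo "- - -" > /sys/class/scsi_host/host0/scan
```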

Network: Virtualized Firewall + VLANs                 

Firewall runs as a VM with two NICs — clean LAN and dirty WAN. A 52-port managed switch handles Layer 2 segmentation: clean LAN, dirty VLAN (ISP uplink + firewall WAN only), and a planned camera isolation VLAN. The VLAN setup solved a real problem: the ISP router's DHCP was bleeding offers into the clean network through a shared broadcast domain. Before VLANs, I had ebtables rules filtering DHCP by MAC — fragile. VLAN isolation fixed it at Layer 2. The switch requires legacy SSH crypto (diffie-hellman-group1-sha1, aes256-cbc) and shows a non-standard prompt, so automation needs expect scripts from a dedicated management container. Wi-Fi is split across three SSIDs: clean, bridged internal, and an untrusted guest network on the ISP router.                                                                         

Skills Developed: Firewall virtualization, VLAN design and trunk configuration, Layer 2 isolation, DHCP debugging, managed switch CLI, legacy SSH crypto, expect scripting, Wi-Fi segmentation                                                                                                                              

Security: Cameras, IDS, and Vuln Scanning         

Four IP cameras feed into recording software on a Windows Server VM. All cameras hardened: default gateways removed (no internet route), DNS cleared, NTP pointed to the firewall, UPnP disabled. Scheduled for dedicated VLAN isolation so cameras can only reach the recording server.     

Zeek IDS monitors both clean and dirty bridges with ip_forward=0 — passive only. I'm building an HTML5 dashboard to review connection pairings in real time. OpenVAS / Greenbone runs vulnerability scans. Pi-hole handles DNS filtering for the LAN.

Skills Developed: IP camera hardening, RTSP/HTTP API integration, Zeek IDS deployment, passive network monitoring, vulnerability scanning (OpenVAS/Greenbone), DNS sinkholing

The AI Angle                              

Most recent evolution: using AI as a hands-on homelab partner. Not for basic Googling — for real operational work. Writing camera API automation, debugging ZFS issues by reasoning about drive serials and SAS addresses, documenting network topology, planning VLAN migrations, managing the switch over SSH with its weird legacy prompts. AI doesn't replace learning, it compresses the feedback loop. Instead of 4 hours reading forum posts about RAID controller JBOD passthrough, 30 minutes working through the management API with an AI that holds the entire hardware context. I still learned how it works — just got there faster.                                                                                                                                                   

Skills Developed: AI-assisted systems administration, documentation-as-code, prompt engineering for infrastructure tasks                                                                   

So: Start with one box, break it, fix it, keep notes. Everything above started with a Celeron Optiplex and a LACK table. Happy to answer questions about any of this. 

u/HandsomeNomad — 7 days ago
▲ 5 r/zfs

Postgres workload - SLOG Disk vs WAL Disk

English isn’t my first language, so please excuse any awkward phrasing.

With the setup shown below, I’m unsure whether it would be better to use one Optane mirror set for SLOG, or dedicate it exclusively for WAL.

I’ll be running an API server and various services on a Proxmox host, along with a PostgreSQL database.

https://preview.redd.it/cd81hcujkgvg1.png?width=626&format=png&auto=webp&s=2f917eb5f0ec20550b1fe2215a0fdb0bbf52cf58

Disk     Capacity (TB)  File System  Purpose
P4800X   0.4            ZFS          WAL Mirror vs SLOG Mirror
P4800X   0.4            ZFS          WAL Mirror vs SLOG Mirror
P4800X   0.4            ZFS          Special VDEV Mirror
P4800X   0.4            ZFS          Special VDEV Mirror
PM1733   3.84           ZFS          OS/VM/Etc... Mirror
PM1733   3.84           ZFS          OS/VM/Etc... Mirror
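
For context on the trade-off: a SLOG serves every sync write in the pool, while a dedicated WAL disk serves only Postgres. The SLOG variant would look roughly like this (pool/dataset names and device paths are hypothetical):

```shell
# Mirrored SLOG from the two Optanes:
zpool add tank log mirror \
    /dev/disk/by-id/nvme-P4800X-a /dev/disk/by-id/nvme-P4800X-b

# Match PostgreSQL's 8K page size on the database dataset:
zfs set recordsize=8K tank/pgdata
# Default, but relevant here: route sync writes through the SLOG
zfs set logbias=latency tank/pgdata
```
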
u/Best-Condition-5784 — 7 days ago
▲ 44 r/zfs

ZFS instant clones for Kubernetes node provisioning — under 100ms per node

I've been using ZFS copy-on-write clones as the provisioning layer for Kubernetes nodes and wanted to share the results.

The setup: KVM VMs running on ZFS zvols. Build one golden image (cloud image + kubeadm + containerd + Cilium), snapshot it, then clone per-node. Each clone is metadata-only — under 100ms to create, near-zero disk cost until the clone diverges.

Some numbers from a 6-node cluster on a single NVMe:

- Golden image: 2.43G

- 5 worker clones: 400-1200M each (COW deltas only)

- Total disk for 6 nodes: ~8G instead of ~15G if full copies

- Clone time: 109-122ms per node

- Rebuild entire cluster: ~60 seconds (destroy + re-clone)

Each node gets its own ZFS datasets underneath:

- /var/lib/etcd — 8K recordsize (matches etcd page size)

- /var/lib/containerd — default recordsize

- /var/lib/kubelet — default recordsize

Sanoid handles automated snapshots — hourly/daily/weekly/monthly per node. Rolling back a node is instant (ZFS rollback on the zvol). Nodes are cattle — drain, destroy the zvol, clone a fresh one from golden, rejoin the cluster.
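
The clone/rollback cycle described above is plain ZFS; a sketch with hypothetical names (the post uses zvols, for which the same commands apply):

```shell
# Build the golden image once, snapshot it, clone per node.
# Clones are metadata-only and near-instant:
zfs snapshot tank/golden@v1
zfs clone tank/golden@v1 tank/node-worker1

# Roll a node back to an earlier snapshot:
zfs rollback tank/node-worker1@pre-upgrade

# Cattle workflow: destroy the node and re-clone fresh from golden:
zfs destroy tank/node-worker1
zfs clone tank/golden@v1 tank/node-worker1
```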

The ZFS snapshot-restore pipeline also works through Kubernetes via OpenEBS ZFS CSI — persistent volumes backed by ZFS datasets with snapshot and clone support.

Built this into an open source project if anyone wants to look at the implementation: https://github.com/kldload/kldload

Demo showing the full flow: https://www.youtube.com/watch?v=egFffrFa6Ss
6 nodes, 15 mins.

Curious if anyone else is using ZFS clones for VM provisioning at this scale?

u/anthony-kldload — 10 days ago
▲ 3 r/zfs

Using 15TB+ NVMe with full PLP for ZFS — overkill SLOG or finally practical L2ARC?

Mods let me know if this crosses any lines — happy to adjust.

I’ve been working on a deployment recently using some high-capacity enterprise NVMe (15.36TB U.2, full power loss protection, ~1 DWPD endurance), and it got me thinking about how these fit into ZFS setups beyond the usual small, low-latency devices.

A few things I’ve been considering:

SLOG

- Clearly overkill from a capacity standpoint, but with full PLP and solid write latency, they’re about as safe as it gets for sync-heavy workloads

- Curious if anyone here is actually running larger NVMe for SLOG just for endurance + reliability headroom

L2ARC:

- At this capacity, L2ARC starts to feel more viable again, especially for large working sets

- Wondering how people are thinking about ARC:L2ARC ratios when drives are this big
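
On the ARC:L2ARC ratio question: each L2ARC record pins a header in ARC, commonly estimated at roughly 70 bytes per record (the exact figure varies by OpenZFS version), so the overhead depends heavily on average record size. A rough calculator under that assumption:

```shell
# ARC header overhead for a given L2ARC size, assuming ~70 bytes/record:
l2arc_overhead() { # args: l2arc_bytes avg_record_bytes
  awk -v l2="$1" -v rec="$2" \
    'BEGIN { printf "%.1f GiB of ARC\n", l2 / rec * 70 / 2^30 }'
}
l2arc_overhead $((15 * (1 << 40))) $((128 * 1024))  # 15 TiB of 128K records
l2arc_overhead $((15 * (1 << 40))) $((32 * 1024))   # 15 TiB of 32K records
```

With large records the header cost of even a 15 TiB L2ARC is modest; with small records it starts eating meaningfully into ARC.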

All-flash pools:

- With ~15TB per drive, you can get into meaningful capacity with relatively few devices

- Tradeoff seems to be fewer drives (capacity density) vs more vdevs (IOPS + resiliency)

Other considerations:

- ashift alignment and sector size behavior on these newer enterprise drives

- Real-world latency vs spec sheet under mixed workloads

- Whether endurance (1 DWPD) is enough for heavy cache-tier usage long-term

We ended up with a few extra from that deployment, so I’ve been especially curious how folks here would actually use drives like this in a ZFS context.

Would love to hear real-world configs or any lessons learned.

u/AshleshaAhi — 10 days ago
▲ 9 r/zfs

What would happen if I use hdparm to change the logical sector size of my HDD to 4096 bytes?

I have four 8TB HDDs all with Physical Sector Size 4096 bytes and Logical Sector Size 512 bytes, according to `hdparm` and `lsblk`. They're in a raidz1 zpool with ashift=12. Also lsblk says their minimum IO size is 4096 bytes.

What would happen if I used hdparm to change one disk's logical sector size to 4096 bytes? I assume all data on disk would be lost and ZFS would resilver the drive. After the resilver, would the on-disk data be laid out differently? Would writes happen differently? Would there be any effect on performance?
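
For what it's worth, the commands involved look like this (read-only checks first; the hdparm step erases the drive, which is why it requires the deliberately scary flag — device and pool names are placeholders):

```shell
# What ZFS and the kernel currently see:
zdb -C yourpool | grep ashift
cat /sys/block/sdX/queue/logical_block_size   # currently 512
cat /sys/block/sdX/queue/physical_block_size  # 4096

# DESTRUCTIVE: switch the drive to 4Kn logical sectors (all data is lost;
# a resilver is required afterwards):
hdparm --set-sector-size 4096 --please-destroy-my-drive /dev/sdX
```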

u/rileywbaker — 13 days ago
▲ 0 r/zfs

A guy use Claude Code recovered nearing 90TB of corrupted ZFS pool that rejected by data recovery companies

Originally posted by @shujueitsao on Threads

The following is the post content, translated by ChatGPT.

Note: the "dollars" mentioned are TWD.


Last May, the company NAS crashed. 90TB of animation project files — three years of hard work. The hard drives weren’t broken, but the ZFS metadata was corrupted. The system couldn’t recognize the data. I contacted several data recovery companies. I was ready to pay the price of a domestic car. In the end, no one was willing to take the job — all said it was unrecoverable. It sat there for almost a year. One day, while playing with Claude, I suddenly thought: Could I just try to fix it myself? I spent less than 5,000 dollars to buy an HBA card. Connected 12 hard drives to the computer. Installed Claude Code and let the AI issue commands on its own. It tried ZFS historical restore points one by one. Found a point in time that wasn’t corrupted. Mounted the entire data pool back. Recovered everything. Directory structure intact, filenames intact — three years of projects. I had absolutely no Linux experience. AI guided me through the entire process. Today, I saved the cost of a domestic car 😮‍💨


The following is ChatGPT's summary of the chat messages with the recovery companies.


Recovery is difficult; ECC errors cannot currently be corrected.

They will return the hard drives first, but backups are retained.

No recommended contacts are available locally; recovery in Taiwan is not feasible.

Consultation was done with overseas experts. They connected remotely and wrote several test programs; additional tests were also written locally.

Conclusion: recovery is unlikely.


The following is the Chinese content from the Claude Code screenshot on Ubuntu, translated.


All background tasks have completed; data integrity is confirmed.

Main data is located in zpool1/zfs19 (88.6TB). How would you like to proceed?

Option 1: Extract selectively (using zdb)

Suitable for recovering a small number of important files

Slow; impractical for 88TB

Option 2: VM approach (recommended)

Run a QuTS hero VM via KVM/QEMU on this machine

Pass through all 12 drives to the VM

Inside the VM, use zpool import -o readonly=on to mount the pool fully

Then export data via SMB/NFS

Which option do you want to use? If you choose option 2, do you have another set of drives or enough space to store the recovered data?

u/XLioncc — 16 days ago
▲ 20 r/zfs

OpenZFS tuning for torrents?

Hey everyone,

I've been recently thinking about migrating from btrfs to ZFS, mainly because I'd like to learn how it works, use send/recv for backups, and to improve my skills with system administration. I have a 14TB drive which I use to store personal data, documents, and linux isos.

Due to the way torrents work, it can cause fragmentation on copy-on-write mountpoints. Usually in btrfs, I'd just make a new subvolume with NODATACOW, and set it as the unfinished downloads directory in my torrent client.

I did read through the documentation for Workload Tuning, and it does mention the fragmentation issue, and it suggests the same copy method I use on btrfs. Am I able to just set chattr +C /mnt/nodatacow on my nodatacow dataset's mountpoint and call it a day?

Also, if you have any other tuning recommendations for torrents, please let me know! :) If it helps, I'm using openZFS 2.3.6 on Gentoo Linux. Thanks for reading!
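
One caveat worth knowing before migrating: chattr +C is a btrfs/ext4 mechanism, and OpenZFS has no nodatacow mode, so that part doesn't carry over. The usual approach is two datasets, one for in-progress downloads and one for finished files, with the client's move-on-complete doing the defragmenting rewrite (dataset names and recordsize choices are examples, not settled best practice):

```shell
# Random ~16K piece writes while a torrent is downloading:
zfs create -o recordsize=16K tank/torrents/incomplete
# Large sequential reads once complete; the move rewrites contiguously:
zfs create -o recordsize=1M tank/torrents/complete
```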

Since you've been reading this far, I'll slip in another question I've been having.

There's a lot of debate around whether single-disk ZFS is worth it. Is it? I'm interested in trying ZFS, but as a somewhat broke highschooler with only one 3.5-inch disk slot in my PC, I'd need to do some big upgrades to make use of mirrors and multi-disk zpools.

u/cometomypartyyy — 19 days ago
▲ 7 r/zfs

Pool takes forever to mount (making system broken) and no further errors

I had a disaster last night on my Proxmox server. Fortunately I had daily replication and lost just half a day.

The system started to act weird. For example, it was very unresponsive; processes (incl. KVM VMs) showed as stopped but were still running and couldn't even be killed with -9.

Reboot took forever and the next reboot revealed the culprit was my ZFS pool:

https://preview.redd.it/us1dxpo600ug1.png?width=2114&format=png&auto=webp&s=7b94edda1c13a7445c08dce6589f99e656b2cf00

After about 30 minutes it managed to boot, with some services timing out on start. From then on some parts of the system worked, some were extremely laggy. Access to all ZFS data worked flawlessly.

No issues are shown with zpool or zfs. No suspicious dmesg messages. Nothing.

It just seems that accessing the pool has become so slow that the system basically no longer operates properly.

The pool is a mirror between an NVMe and a 2.5" SATA SSD.

Which options do I have to figure out what the heck is even going on and how to recover?

EDIT: This is concerning as the issue seems not perfectly reproducible and intermittent. When I rebooted again this morning, all worked as expected. Just after a short while, the same issues (whole system lagging) re-appeared.

EDIT2: No outputs look suspicious to me in SMART. Short+Long tests are successful for the internal SSD (Crucial BX500 4TB, 6 months old). The NVMe does not support self-test. SMART outputs here for reference: https://pastebin.com/ceLyB5DK, https://pastebin.com/y6U1T9c8

EDIT3: All of a sudden I see a small number of failed writes:

# zpool status
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 02:27:11 with 0 errors on Sun Mar  8 03:51:12 2026
config:

        NAME                                                  STATE     READ WRITE CKSUM
        rpool                                                 ONLINE       0     0     0
          mirror-0                                            ONLINE       0     0     0
            nvme-HP_SSD_EX900_Plus_2TB_HBSE54170100735-part3  ONLINE       0     0     0
            ata-CT4000BX500SSD1_2529E9C69BDE-part3            ONLINE       0     8     0

Maybe this evening I'll try removing and re-attaching it to see whether it's the connector or the disk?

And if it's the disk, does it make sense to purposefully degrade my array (remove the SSD) and confirm the issue disappears? Until I replace and resilver, I am aware that I am living dangerously without redundancy ...
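
Before pulling hardware, per-device latency usually points at the culprit, and offlining the suspect SSD is reversible (a sketch, using the device names from the zpool status above):

```shell
# Per-vdev latency stats every 5s; a dying SATA disk typically shows one
# side of the mirror with much larger wait times:
zpool iostat -v -l rpool 5

# Check the SATA link itself for resets and CRC errors:
dmesg | grep -iE 'ata[0-9]|link|reset'
smartctl -A /dev/disk/by-id/ata-CT4000BX500SSD1_2529E9C69BDE | grep -i -i crc

# Reversible test: take the SSD offline, observe, then bring it back:
zpool offline rpool ata-CT4000BX500SSD1_2529E9C69BDE-part3
zpool online rpool ata-CT4000BX500SSD1_2529E9C69BDE-part3
```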

u/segdy — 15 days ago