censorious-floridapompano6219

qnap-zfs-rescue

"Rescue data from QNAP QuTS Hero NAS with ZFS metadata corruption (ZFS-8000-72). Use this skill whenever the user mentions QNAP ZFS rescue, QuTS Hero pool recovery, ZFS metadata corrupted on QNAP, zpool FAULTED with all disks ONLINE, ZFS-8000-72 error on QNAP, or any situation involving a QNAP NAS running QuTS Hero where the ZFS pool cannot be imported despite healthy disks. This skill covers the complete rescue process including diagnosing QNAP's modified ZFS (QZFS), extracting QNAP native tools, and using QEMU to mount the pool with QNAP's own drivers."


Install

npx skillscat add censorious-floridapompano6219/qnap-zfs-rescue

Install via the SkillsCat registry.

SKILL.md

QNAP QuTS Hero ZFS Pool Rescue

Overview

QNAP's QuTS Hero uses a heavily modified version of ZFS ("QZFS") that is incompatible with standard OpenZFS. When metadata corruption occurs, standard tools and even professional data recovery services may fail because they cannot recognize QNAP's proprietary format.

This skill guides you through a complete rescue process that works by booting QNAP's own kernel and ZFS drivers in a QEMU virtual machine.

Key Technical Facts

  • QNAP uberblock magic: 0x00bab10b (standard OpenZFS: 0x00bab10c)
  • QNAP uses proprietary indirect_layout=1 for block indexing
  • QNAP's zdb tool is stripped of -r/-O export options
  • QNAP's CDDL source release is incomplete (missing zib.h)

Prerequisites Check

Before starting, verify:

  1. All original disks are available and physically healthy
  2. A running QNAP NAS with QuTS Hero is accessible (for extracting tools/kernel)
  3. The rescue machine runs Ubuntu 22.04+ with enough SATA/SAS ports
  4. Network connectivity between rescue machine and target NAS for data transfer

Rescue Procedure

Step 1: Diagnose

sudo apt update && sudo apt install -y zfsutils-linux qemu-system-x86 bridge-utils

# Scan for the pool
sudo zpool import -d /dev/disk/by-id

# Expected output: pool FAULTED, all disks ONLINE
# Error: "The pool metadata is corrupted" (ZFS-8000-72)

Verify it's a QNAP magic number issue:

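# Pick any disk's ZFS partition (usually part3)
DISK=$(ls /dev/disk/by-id/*-part3 | head -n 1)
# Dump one sector at offset 128 KiB (512 B x 256), where the first label's uberblock array begins
sudo dd if="$DISK" bs=512 skip=256 count=1 2>/dev/null | xxd | head -5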

# Look for: 0b b1 ba 00 (QNAP) vs 0c b1 ba 00 (standard)

If you see 0b b1 ba 00, this is confirmed as a QNAP QZFS issue. Standard OpenZFS tools will NOT work. Proceed to Step 2.
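The byte comparison above can also be scripted so it is easy to repeat across disks. The following is a minimal sketch, not part of any QNAP or OpenZFS tooling: the function names classify_magic and read_magic_bytes are made up for illustration, and it assumes the first uberblock slot is populated and stored little-endian, as in the hex dump above.

```shell
# Classify a ZFS uberblock magic from its first four on-disk bytes.
# Input: hex bytes such as "0b b1 ba 00" (spaces optional, case-insensitive).
classify_magic() {
  bytes=$(printf '%s' "$1" | tr -d ' \n' | tr 'A-F' 'a-f')
  case "$bytes" in
    0bb1ba00) echo "qnap" ;;      # 0x00bab10b, little-endian on disk
    0cb1ba00) echo "openzfs" ;;   # 0x00bab10c, little-endian on disk
    *)        echo "unknown" ;;   # empty slot, big-endian pool, or not a ZFS label
  esac
}

# Print the first four bytes at offset 128 KiB of a partition as hex.
# Needs read access to the device, so run it as root.
read_magic_bytes() {
  dd if="$1" bs=512 skip=256 count=1 2>/dev/null | od -An -tx1 -N4
}

# Example (hypothetical usage, run as root):
#   for d in /dev/disk/by-id/*-part3; do
#     echo "$d: $(classify_magic "$(read_magic_bytes "$d")")"
#   done
```

An "unknown" result on one disk is not conclusive on its own; check the other disks before drawing a conclusion.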

Step 2: Extract QNAP Tools and Boot Files

SSH into the running QNAP NAS and extract everything needed:

# Create working directories
mkdir -p ~/qnap_tools ~/qnap_libs ~/qnap_boot

# SSH to NAS and identify files
ssh admin@[NAS_IP] "ldd /sbin/zdb"

# Copy ZFS tools
scp admin@[NAS_IP]:/sbin/zdb ~/qnap_tools/
scp admin@[NAS_IP]:/sbin/zpool ~/qnap_tools/
scp admin@[NAS_IP]:/sbin/zfs ~/qnap_tools/

# Copy every shared library listed in the ldd output into ~/qnap_libs/
# Copy the ld-linux dynamic linker (also listed by ldd) into ~/qnap_tools/, where Step 3 expects it

# Copy boot files - find them first:
ssh admin@[NAS_IP] "find / -name 'bzImage' -o -name 'initrd*' -o -name 'rootfs*' 2>/dev/null"

# Copy kernel and initrd
scp admin@[NAS_IP]:/path/to/bzImage ~/qnap_boot/
scp admin@[NAS_IP]:/path/to/initrd.boot ~/qnap_boot/
scp admin@[NAS_IP]:/path/to/rootfs2.bz ~/qnap_boot/
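Copying the shared libraries by hand is error-prone, so the ldd output can be turned into a list of paths first. This is a rough sketch under stated assumptions: ldd_paths is a hypothetical helper name, and it assumes the NAS prints ldd output in the usual glibc layout ("name => /path (address)").

```shell
# Print the absolute paths named in `ldd` output (read from stdin), one per line.
# Handles "libfoo.so => /path/libfoo.so (0x...)" and "/lib64/ld-linux... (0x...)" lines.
# Entries without a path (for example linux-vdso or "not found") are skipped.
ldd_paths() {
  awk '
    $2 == "=>" && $3 ~ /^\// { print $3; next }
    $1 ~ /^\//               { print $1 }
  ' | LC_ALL=C sort -u
}

# Example (hypothetical usage; [NAS_IP] is the same placeholder as above):
#   ssh admin@[NAS_IP] "ldd /sbin/zdb" | ldd_paths > /tmp/zdb_libs.txt
#   while IFS= read -r lib; do
#     scp "admin@[NAS_IP]:$lib" ~/qnap_libs/ < /dev/null
#   done < /tmp/zdb_libs.txt
```

Run it once per binary (zdb, zpool, zfs), since each may pull in different libraries. The dynamic linker appears in the same list and still needs a copy in ~/qnap_tools/ for Step 3.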

Step 3: Verify QNAP Tools Work

# Test QNAP zdb on the rescue machine
sudo ~/qnap_tools/ld-linux-x86-64.so.2 \
  --library-path ~/qnap_libs \
  ~/qnap_tools/zdb -l /dev/disk/by-id/[ANY_DISK]-part3

# Should show valid uberblock and label data

Step 4: Prepare QEMU VM

# Extract initrd
mkdir -p /tmp/qnap_initrd
cd /tmp/qnap_initrd
zcat ~/qnap_boot/initrd.boot | cpio -idmv
# Also extract rootfs if present
bzcat ~/qnap_boot/rootfs2.bz | cpio -idmv 2>/dev/null || true

Create a custom init script at /tmp/qnap_initrd/sbin/init:

cat > /tmp/qnap_initrd/sbin/init << 'INITEOF'
#!/bin/sh
export PATH=/sbin:/bin:/usr/sbin:/usr/bin
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev
sleep 3

# Load VirtIO drivers
for mod in virtio virtio_ring virtio_pci failover net_failover virtio_net virtio_blk; do
  find /lib/modules -name "${mod}.ko" -exec insmod {} \; 2>/dev/null
done

# Load QNAP ZFS drivers (ORDER MATTERS)
find /lib/modules -name "lpl.ko" -exec insmod {} \; 2>/dev/null
find /lib/modules -name "icp.ko" -exec insmod {} \; 2>/dev/null
find /lib/modules -name "zfs.ko" -exec insmod {} \; 2>/dev/null
sleep 2

# Import pool READ-ONLY under the new name "oldpool"
# (replace zpool1 if the scan in Step 1 reported a different pool name)
if zpool import -d /dev -o readonly=on -R /oldpool -f zpool1 oldpool; then
  echo "=== POOL IMPORTED SUCCESSFULLY ==="
  zfs list
else
  echo "=== POOL IMPORT FAILED ==="
  zpool import -d /dev
fi

# Network setup - EDIT THESE VALUES
ip addr add [VM_IP]/[NETMASK] dev eth0
ip link set eth0 up
ip route add default via [GATEWAY]

# Mount NFS target - EDIT THESE VALUES
mkdir -p /mnt/nas
mount -t nfs [NAS_IP]:[NFS_SHARE_PATH] /mnt/nas

echo "=== Ready for data transfer ==="
echo "Use: cp -av /oldpool/[dataset]/ /mnt/nas/"
exec /bin/sh
INITEOF

chmod +x /tmp/qnap_initrd/sbin/init
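The init script contains bracketed placeholders such as [VM_IP] that must be edited before the initrd is repacked; a VM booted with them left in place will fail at the network or NFS step. A small check like the one below can catch that early. It is only a sketch: find_placeholders is a hypothetical helper, and it looks only for upper-case tokens in square brackets, so the lower-case [dataset] hint in the final echo line is deliberately ignored.

```shell
# List unreplaced upper-case [PLACEHOLDER] tokens left in a file.
# Prints "line:token" for each hit; returns 0 if none remain, 1 otherwise.
find_placeholders() {
  if grep -n -o '\[[A-Z][A-Z_]*\]' "$1"; then
    return 1
  fi
  return 0
}

# Example (hypothetical usage):
#   find_placeholders /tmp/qnap_initrd/sbin/init && echo "init script is ready to pack"
```

Shell test expressions such as [ $? -eq 0 ] are not reported, because the pattern requires an upper-case letter directly after the opening bracket.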

Repack initrd:

cd /tmp/qnap_initrd
find . | cpio -o -H newc | gzip > /tmp/custom_initrd.gz

Step 5: Setup Network Bridge

# Replace eno1 with your actual network interface.
# Run this from a local console: flushing the address on $IFACE drops any SSH session using it.
IFACE="eno1"
HOST_IP="[HOST_IP]"
GATEWAY="[GATEWAY]"
NETMASK="24"

sudo ip link add br0 type bridge
sudo ip link set $IFACE master br0
sudo ip addr flush dev $IFACE
sudo ip addr add ${HOST_IP}/${NETMASK} dev br0
sudo ip link set br0 up
sudo ip route add default via $GATEWAY

sudo ip tuntap add tap0 mode tap
sudo ip link set tap0 master br0
sudo ip link set tap0 up

Step 6: Launch QEMU

Build the QEMU command with all disks as read-only virtio-blk devices:

# List all ZFS partitions
ls /dev/disk/by-id/*-part3

# Build QEMU command - add a -drive line for EACH disk
sudo qemu-system-x86_64 \
  -m 8G -smp 4 \
  -kernel ~/qnap_boot/bzImage \
  -initrd /tmp/custom_initrd.gz \
  -append "console=ttyS0" \
  -nographic \
  -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
  -device virtio-net-pci,netdev=net0 \
  -drive file=/dev/disk/by-id/[DISK1]-part3,format=raw,if=virtio,readonly=on \
  -drive file=/dev/disk/by-id/[DISK2]-part3,format=raw,if=virtio,readonly=on
# ... add one -drive line per remaining disk; every line except the last ends with a backslash
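With many disks, typing one -drive line per partition invites typos. The options can be generated from the partition list instead. This is a minimal sketch: build_drive_args is a hypothetical helper, and it assumes the partition paths contain no whitespace or commas, which holds for typical /dev/disk/by-id names.

```shell
# Print one read-only virtio -drive option per partition path given as an argument.
build_drive_args() {
  for part in "$@"; do
    printf '%s\n' "-drive file=$part,format=raw,if=virtio,readonly=on"
  done
}

# Example (hypothetical usage): review the generated options first.
#   build_drive_args /dev/disk/by-id/*-part3
# Then splice them into the command; the unquoted $(...) is intentional so that
# "-drive" and its value become separate arguments.
#   sudo qemu-system-x86_64 ... $(build_drive_args /dev/disk/by-id/*-part3)
```

Disks are attached in glob order. ZFS identifies pool members by their on-disk labels, so the order should not matter for the import.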

Step 7: Transfer Data

Once the VM boots and the pool is imported:

# Inside VM - list available datasets
ls /oldpool/

# Start copying (prioritize important data first)
cp -av /oldpool/[important_dataset]/ /mnt/nas/

# Monitor progress from host
watch -n 60 'du -sh /path/to/nas/mount/'

Troubleshooting

Pool import fails in VM

  • Check ZFS module loaded: lsmod | grep zfs
  • Check disks visible: ls /dev/vd*
  • Try without force: zpool import -d /dev -o readonly=on
  • Check dmesg for errors: dmesg | tail -50

Network not working in VM

  • Verify bridge setup on host: ip addr show br0
  • Check tap device: ip link show tap0
  • Try a different IP range
  • Verify NFS export on NAS side

Slow transfer speed

  • Expected: 30-40 MB/s via NFS over QEMU
  • At that rate a single copy stream needs roughly four to five weeks for 90 TB (90 TB / 35 MB/s ≈ 30 days)
  • Prioritize critical data first
  • Consider running multiple cp processes for different directories
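To plan the transfer, it helps to estimate its duration from the pool size and the observed throughput. The sketch below uses integer shell arithmetic and decimal units (1 TB = 1,000,000 MB); transfer_days is a hypothetical helper name, and the result is rounded up to whole days.

```shell
# Estimate the whole days needed to copy SIZE_TB terabytes at RATE_MBS megabytes
# per second, using decimal units. The result is rounded up.
transfer_days() {
  size_tb=$1
  rate_mbs=$2
  seconds=$(( size_tb * 1000000 / rate_mbs ))
  echo $(( (seconds + 86399) / 86400 ))
}

# Example:
#   transfer_days 90 35    # prints 30
```

Running several copies in parallel only shortens this if the bottleneck is per-stream rather than the network link or the disks, so measure the combined rate before relying on it.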

Important Warnings

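  1. NEVER write to the original disks: always use readonly=on
  2. NEVER attempt zpool import -F (uppercase, recovery mode) on QNAP pools: rewinding the pool by discarding recent transactions may corrupt data further. This is not the lowercase -f (force) flag used in Step 4.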
  3. Keep all disks safe until transfer is 100% verified
  4. Verify transferred data by spot-checking file integrity after transfer
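Spot-checking can be made systematic by comparing checksums of a sample directory on both sides while the pool is still mounted read-only. The following is a sketch only: verify_copy is a hypothetical helper, it assumes sha256sum, find, and mktemp are available where it runs (not guaranteed inside the QNAP initrd), and it does not handle file names containing newlines.

```shell
# Compare SHA-256 checksums of every regular file under SRC with the file at the
# same relative path under DST. Prints each path that is missing or different.
# Returns 0 only if every file matches.
verify_copy() {
  src=$1
  dst=$2
  status=0
  list=$(mktemp)
  (cd "$src" && find . -type f) > "$list"
  while IFS= read -r rel; do
    if [ ! -f "$dst/$rel" ]; then
      echo "MISSING $rel"
      status=1
    else
      a=$(sha256sum < "$src/$rel")
      b=$(sha256sum < "$dst/$rel")
      if [ "$a" != "$b" ]; then
        echo "DIFFERS $rel"
        status=1
      fi
    fi
  done < "$list"
  rm -f "$list"
  return $status
}

# Example (hypothetical usage inside the VM, on one sample directory):
#   verify_copy /oldpool/[dataset]/some_dir /mnt/nas/[dataset]/some_dir
```

Reading every file twice is slow on a large pool, so run it on a representative subset rather than the whole dataset.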