Friday 1 April 2022

Replacing a disk in a ZFS pool (RAID mirrored) of a Proxmox node

I am just putting this here for my notes, but it may also help someone!

The disk to be replaced contained 3 partitions, including the boot partition.


#Check the status of the ZFS pool (see the UNAVAIL disk and note its ID)
zpool status
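
If the pool has many devices, the failed one can be picked out of the status output with a small filter. This is only a sketch: the function name failed_vdevs is made up for these notes, and it assumes the usual column layout of zpool status (device name first, state second).

```shell
# Hypothetical helper: reads `zpool status` output on stdin and prints the
# name (first column) of every device whose STATE column marks it as failed.
failed_vdevs() {
    awk '$1 != "state:" && $2 ~ /^(UNAVAIL|FAULTED|REMOVED|OFFLINE)$/ { print $1 }'
}

# Usage (not run here): zpool status rpool | failed_vdevs
```

The printed name is the ID to note for the zpool replace step later on.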
#Check the partitioning on the healthy disk (sda) and the new disk (sdb in this case; the new disk should be empty)
fdisk -l /dev/sda
fdisk -l /dev/sdb
#Copy (replicate) the partition table from sda to sdb and check with fdisk (sda is the source; the disk after -R is the target and gets overwritten)
sgdisk /dev/sda -R /dev/sdb
fdisk -l /dev/sdb
#Because we have copied the partition table, we need to generate unique random GUIDs for sdb
sgdisk -G /dev/sdb
#Clone the first two (boot) partitions from sda to sdb
dd if=/dev/sda1 of=/dev/sdb1 bs=1M
dd if=/dev/sda2 of=/dev/sdb2 bs=1M
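
Before moving on, the copies can be checked byte for byte. A minimal sketch, assuming nothing writes to the source partitions between the dd and the check; verify_clone is a name made up for these notes.

```shell
# Hypothetical helper: compare a source and its clone byte for byte.
# cmp -s prints nothing and exits 0 only when both inputs are identical.
verify_clone() {
    if cmp -s "$1" "$2"; then
        echo "OK: $2 matches $1"
    else
        echo "MISMATCH: $2 differs from $1" >&2
        return 1
    fi
}

# Usage (not run here):
#   verify_clone /dev/sda1 /dev/sdb1
#   verify_clone /dev/sda2 /dev/sdb2
```

This only makes sense right after the dd step, since the UUID of sdb2 is changed later and the two partitions will differ from then on.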
#For the 3rd (Proxmox ZFS) partition, check the ID of the new partition (sdb3)
ls -alh /dev/disk/by-id/
#Use zpool replace to replace the old ID with the new ID (old first, new second), then use zpool status to check the progress of resilvering
zpool replace rpool /dev/disk/by-id/ata-<old-disk-ID>-part3 /dev/disk/by-id/ata-<new-disk-ID>-part3
zpool status
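
Rather than re-running zpool status by hand, the wait can be scripted. A rough sketch, assuming the status output contains the text "resilver in progress" while the resilver is running; resilver_running is a made-up name.

```shell
# Hypothetical helper: reads `zpool status` output on stdin and succeeds
# (exit 0) only while a resilver is still reported as in progress.
resilver_running() {
    grep -q 'resilver in progress'
}

# Usage (not run here): poll once a minute until the resilver has finished.
#   while zpool status rpool | resilver_running; do sleep 60; done
```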
#When resilvering has finished, check the partitions with lsblk. The UUID of the boot partition needs changing since it was copied from sda. Check the correct UUIDs with cat (they may be in the format XXXX-XXXX)
lsblk -f
cat /etc/kernel/proxmox-boot-uuids
#Use mtools to set the UUID of sdb2 back to its Proxmox original (install mtools if needed; remove the hyphen from the serial)
apt install mtools
mlabel -N XXXXXXXX -i /dev/sdb2 ::
lsblk -f
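
The hyphen removal for mlabel can be done with tr instead of by hand. A small sketch; fat_serial is a made-up name and 1A2B-3C4D is only an example value, not a real UUID from this system.

```shell
# Hypothetical helper: turn a UUID as shown by lsblk (XXXX-XXXX) into the
# hyphen-free serial that mlabel -N expects.
fat_serial() {
    printf '%s\n' "$1" | tr -d '-'
}

# fat_serial 1A2B-3C4D prints 1A2B3C4D
# Usage (not run here): mlabel -N "$(fat_serial XXXX-XXXX)" -i /dev/sdb2 ::
```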
#Partprobe to update the partition info, and check with lsblk
partprobe /dev/sdb
lsblk -f
#Check the boot partitions using the Proxmox boot tool, then refresh to copy and generate the GRUB files on the new partition
proxmox-boot-tool status
proxmox-boot-tool refresh

Done.