Creating a Proxmox NFS Root on a
ZFS-Backed File Server...
No DHCP or TFTP Needed
Background
I decided I wanted to make my homelab mostly diskless, consolidating my storage onto a single machine. This has a number of benefits.
- It's cheaper. Not having underutilized drives across several machines makes storage more efficient.
- Diskless booting lets you load different images at the boot screen...even install operating systems remotely. (The latter requires DHCP/TFTP.)
- I can manage the files of every server from one server--backups, snapshots, rollbacks are a snap. This also means that I can manage and modify configuration files and easily push them to my servers without even crossing file system lines.
- With ZFS I get a free root overlay. I can clone servers, run the clone, and then easily see any changes with zfs diff. If I want to look for changes to /bin or /etc I can do this with a single command...and reverse them with another.
However, some of my hardware doesn't play well with PXE booting and Proxmox would tend to wear out a flash drive. So I am going to show how I installed grub and the boot files on a flash drive but used an NFS root for my Proxmox servers. This also means I don't need to use a DHCP or TFTP server.
I relied heavily upon these sources, as well as others. I encourage you to look into them because their authors are more knowledgeable than I am.
- https://github.com/zfsonlinux/zfs/wiki/Debian-Stretch-Root-on-ZFS
- https://www.debian.org/releases/stable/amd64/apds03.html.en
- https://wiki.archlinux.org/index.php/EFI_system_partition
- https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch
- https://www.linuxquestions.org/questions/blog/isaackuo-112178/diskless-pxe-netboot-how-to-for-debian-8-jessie-37169/
My NFS server is a Debian Stretch machine backed by ZFS on Linux. If you have a different OS or file system then many of these steps are irrelevant or will have to be substantially modified.
My zpool, tank, is mounted at /srv/tank.
1. Prep the Server
Create a dataset for your remote client files and enable NFS sharing
# zfs create tank/nodes
# zfs set sharenfs="rw=@192.168.195.0/24,no_root_squash,no_subtree_check,sync" tank/nodes
I opted for sync (for now). Async is usually faster but could cause issues in the event of power loss.
async
This option allows the NFS server to violate the NFS protocol and reply to requests before any changes made by that request have been committed to stable storage (e.g. disc drive). Using this option usually improves performance, but at the cost that an unclean server restart (i.e. a crash) can cause data to be lost or corrupted.
Create a dataset to hold your root
# zfs create tank/nodes/deb-j41-1
Check NFS sharing, etc. Make sure any settings you configured are properly inherited.
# zfs get all tank/nodes/deb-j41-1
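Because the client will mount this export as its root file system, no_root_squash in particular needs to survive inheritance. Here is a minimal sketch of a check. The function name has_export_opt is my own invention, and it only inspects the option string that zfs prints for the sharenfs property.

```shell
#!/bin/sh
# has_export_opt OPTIONS NAME
# Succeeds if the comma-separated OPTIONS string contains NAME exactly.
# (has_export_opt is an illustrative helper, not part of ZFS or NFS.)
has_export_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example use on the file server (assumes the dataset from this guide):
#   opts=$(zfs get -H -o value sharenfs tank/nodes/deb-j41-1)
#   has_export_opt "$opts" no_root_squash || echo "no_root_squash is missing"
```

Wrapping the string in commas keeps root_squash from matching inside no_root_squash, and sync from matching inside async.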
You can create a bunch of datasets in the target root. This allows you to selectively disable snapshots as well as modify zfs options individually. I opted for a simpler setup, but still created a tmp dataset. You could also include var/tmp, var/spool, var/log, etc.
Set up a file system.
# zfs create -o com.sun:auto-snapshot=false \
-o setuid=off \
tank/nodes/deb-j41-1/tmp
# chmod 1777 /srv/tank/nodes/deb-j41-1/tmp
2. Install Debian
Install the basic file system using debootstrap. If you are installing from a Debian host you could simply copy that host into the dataset.
# apt install --yes debootstrap
# debootstrap stretch /srv/tank/nodes/deb-j41-1/
Now we need the flash drive. I opted to use a UEFI boot process, so I partitioned my drive using gdisk.
# apt install --yes gdisk
# gdisk /dev/sdX
# ...create partitions. Hint: the partition type in gdisk for EFI is EF00.
After creating the partitions, format them.
# mkfs.fat -F32 /dev/sdX1
# mkfs.ext4 /dev/sdX2
This is what it looked like when I finished.
# lsblk -o name,size,uuid /dev/sdX
NAME SIZE UUID
sdX 14.6G
├─sdX1 550M 8CHA-R1ID
└─sdX2 14.1G 1234what-ever-your-uuid-is5678901234
Now chroot into your new system. Mounting proc and dev like this works but it is recommended to reboot when you get a chance to make sure everything is clean. Be warned, while working on this I overwrote the host grub on two occasions despite being in chroot and naming the correct drives.
Warning:
Many of the following commands are executed in the chroot environment. If these commands are run on the host damage will occur. When you `chroot` make sure you are in the correct directory.
# cd /srv/tank/nodes/deb-j41-1
# mount /dev/sdX2 ./mnt
# mkdir -p ./mnt/boot/efi
# mount /dev/sdX1 mnt/boot/efi
# mount -t proc /proc proc/
# mount --rbind /sys sys/
# mount --rbind /dev dev/
# chroot . /bin/bash --login
3. Set up the client system
Edit /etc/apt/sources.list to look something like the following.
deb http://deb.debian.org/debian stretch main contrib non-free
deb http://deb.debian.org/debian-security/ stretch/updates main contrib non-free
deb http://deb.debian.org/debian stretch-updates main contrib non-free
Get your basic system setup
# apt update && apt upgrade
# dpkg-reconfigure tzdata
# apt install locales
# locale-gen en_US.UTF-8
# dpkg-reconfigure locales
Install the kernel
# apt search linux-image
Then install the kernel package of your choice using its package name. For example:
# apt install linux-image-4.9.0-8-amd64
If your underlying file system is ZFS, you are going to need these packages to configure grub (without them you get "grub-install: error: failed to get canonical path of tank/nodes/deb-j41-1") and again when mounting our file system.
# dpkg-reconfigure spl-dkms
# apt install dpkg-dev zfs-dkms zfs-initramfs
# ln -s /bin/rm /usr/bin/rm
# modprobe zfs
So we can get into our server
# apt install openssh-server
Edit /etc/ssh/sshd_config and change "#PermitRootLogin prohibit-password" to
PermitRootLogin yes
Add root password for login
# passwd
Needed this for my Realtek NICs:
# apt install firmware-realtek
Install other network packages as needed
# apt install bridge-utils
# apt install vlan
# modprobe 8021q
Optional
Install some standard packages:
# tasksel install standard
Lastly, clean up /var/cache/apt/archives/
# apt clean
Now some network configuration:
# echo 'deb-j41-1' > /etc/hostname
Edit /etc/network/interfaces. Make sure there is no hot-swap or auto line and use the correct interface name. You can also use dhcp if you want. Just make sure that your use of dhcp or static and the interface name are both consistent with grub.cfg later.
iface enp2s0 inet static
address 192.168.195.11
netmask 255.255.255.0
Modify /etc/fstab:
/dev/nfs / nfs tcp,nolock 0 0
proc /proc proc defaults 0 0
none /media tmpfs defaults 0 0
none /var/run tmpfs defaults 0 0
none /var/lock tmpfs defaults 0 0
# Persistent storage on the flash drive
UUID=1234what-ever-your-uuid-is5678901234 /local ext4 noatime 0 0
Optional
Mount these in RAM rather than NFS. I kept mine on the NFS server. This uses the network but files and logs are persistent. A good option would be to mount /var/log in tmpfs and use a log server. If you have plenty of memory feel free to put tmp into RAM. You can also cap the size of your tmpfs if you want to keep tmp from using too much.
none /tmp tmpfs defaults 0 0
none /var/tmp tmpfs defaults 0 0
none /var/log tmpfs defaults 0 0
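To cap a tmpfs, add the size= mount option to the fstab entry. The helper below is a small sketch that prints such lines for review. The name tmpfs_fstab_line and the sizes in the examples are placeholders I picked, not recommendations.

```shell
#!/bin/sh
# tmpfs_fstab_line MOUNTPOINT [SIZE]
# Prints one fstab line for a tmpfs mount. SIZE (for example 512m) is passed
# to the tmpfs "size=" mount option; without it, only "defaults" is used.
# (tmpfs_fstab_line is an illustrative helper name.)
tmpfs_fstab_line() {
    if [ -n "${2:-}" ]; then
        printf 'none %s tmpfs defaults,size=%s 0 0\n' "$1" "$2"
    else
        printf 'none %s tmpfs defaults 0 0\n' "$1"
    fi
}

# Example: print capped entries, then paste them into /etc/fstab by hand.
#   tmpfs_fstab_line /tmp 512m
#   tmpfs_fstab_line /var/tmp 256m
```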
4. Make the boot files
Enable NFS in the initial ramdisk image configuration file by editing /etc/initramfs-tools/initramfs.conf to add:
BOOT=nfs
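If you would rather script that edit than open an editor, something like the following should work. It is only a sketch: set_boot_nfs is a name I made up, and it assumes the file either already has a BOOT= line to replace or has none at all.

```shell
#!/bin/sh
# set_boot_nfs FILE
# Rewrites an existing BOOT= line in FILE to BOOT=nfs, or appends the line
# if none exists. FILE would normally be /etc/initramfs-tools/initramfs.conf.
# (set_boot_nfs is an illustrative helper name.)
set_boot_nfs() {
    conf=$1
    tmp=$(mktemp) || return 1
    if grep -q '^BOOT=' "$conf"; then
        sed 's/^BOOT=.*/BOOT=nfs/' "$conf" > "$tmp"
    else
        { cat "$conf"; echo 'BOOT=nfs'; } > "$tmp"
    fi
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example use inside the chroot:
#   set_boot_nfs /etc/initramfs-tools/initramfs.conf
```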
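Create the image and save it in the boot folder. Pass the kernel version explicitly, because inside the chroot mkinitramfs otherwise defaults to the version of the host's running kernel. The menu entries below refer to the image as /boot/initrd.img-deb-today, so either save it under that name or adjust the initrd lines to match.
# mkinitramfs -d /etc/initramfs-tools -o /boot/initrd.img-4.9.0-8-amd64 4.9.0-8-amd64
Install grub on the flash drive.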
# apt-get install grub-efi-amd64
# grub-install --target=x86_64-efi --recheck --removable --efi-directory=/mnt/boot/efi --boot-directory=/mnt/boot
Optional
Download ISO images for alternative or rescue booting.
# mkdir /mnt/boot/iso/
# cd /mnt/boot/iso/
Then download. For example:
# wget http://releases.ubuntu.com/18.04/ubuntu-18.04.1-desktop-amd64.iso
Save the old grub scripts
# cp -r /etc/grub.d /etc/grub.old
Get rid of the OS scripts
# rm /etc/grub.d/{1*,2*,3*}
Edit /etc/grub.d/40_custom and add something like the following. Don't erase the existing contents of the file.
Important
Make sure that the linux line is all one line.
# Make the boot location persistent by setting root by UUID
# alternatively, use the (hd0,1) or similar notation if you don't plan to have any other storage.
insmod search_fs_uuid
search --no-floppy --set=root --fs-uuid 1234what-ever-your-uuid-is5678901234
menuentry "Debian deb-j41-1 4.9.0-8-amd64" {
set client_ip='192.168.195.11'
set server_ip='192.168.195.100'
set gw_ip=''
set netmask='255.255.255.0'
set hostname='deb-j41-1'
set domain='.caiuscorvus.net'
set device='enp2s0'
set server_root='/srv/tank/nodes/'
linux /boot/vmlinuz-4.9.0-8-amd64 root=/dev/nfs ip=$client_ip:$server_ip:$gw_ip:$netmask:$hostname$domain:$device nfsroot=$server_ip:$server_root$hostname rw quiet
initrd /boot/initrd.img-deb-today
}
menuentry "Ubuntu 18.04 (LTS) Live Desktop amd64" --class ubuntu {
set isofile='/boot/iso/ubuntu-18.04.1-desktop-amd64.iso'
loopback loop $isofile
linux (loop)/casper/vmlinuz boot=casper img_dev=$root iso-scan/filename=$isofile quiet splash
initrd (loop)/casper/initrd.lz
}
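The ip= and nfsroot= arguments are easy to get subtly wrong, so it can help to print the string the first menuentry should expand to and compare it with /proc/cmdline on the client after booting. This is a sketch only. nfs_cmdline is a hypothetical helper that mirrors the variable order used in the entry above.

```shell
#!/bin/sh
# nfs_cmdline CLIENT_IP SERVER_IP GW_IP NETMASK HOSTNAME DOMAIN DEVICE SERVER_ROOT
# Prints the kernel arguments that the Debian menuentry assembles from its
# variables. (nfs_cmdline is an illustrative helper name.)
nfs_cmdline() {
    printf 'root=/dev/nfs ip=%s:%s:%s:%s:%s%s:%s nfsroot=%s:%s%s rw quiet\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$2" "$8" "$5"
}

# Example with the values from this guide (the gateway is left empty):
#   nfs_cmdline 192.168.195.11 192.168.195.100 '' 255.255.255.0 \
#       deb-j41-1 .caiuscorvus.net enp2s0 /srv/tank/nodes/
```

Note that the nfsroot path is server_root followed directly by the hostname, so server_root needs its trailing slash.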
When you have finished modifying the file, commit the changes to grub.cfg
# update-grub
Make sure all the files are where they are supposed to be. In particular, make sure the kernel, initrd, grub.cfg, and EFI are on the flash drive.
# cp -r /boot/* /mnt/boot/
# exit
# umount --recursive .
Insert the flash drive in the client machine. Test everything out, look around, then save your progress. If you built your root with extra datasets then make sure the snapshot is recursive (i.e. zfs snapshot -r ...). If you are only using a root dataset, then recursion isn't necessary.
nfsserver# zfs snapshot tank/nodes/deb-j41-1@today-debianinstalled
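Once a snapshot exists, the zfs diff trick from the background section becomes available. Its -H flag prints tab-separated output, which is easy to filter down to one directory. The filter below is a sketch under that assumption, and changes_under is a name I made up.

```shell
#!/bin/sh
# changes_under PREFIX
# Reads `zfs diff -H` output on stdin (tab-separated: change type, path,
# and for renames a second path) and prints only the lines whose first
# path starts with PREFIX. (changes_under is an illustrative helper name.)
changes_under() {
    awk -F '\t' -v prefix="$1" 'index($2, prefix) == 1'
}

# Example use on the file server (assumes the snapshot taken above):
#   zfs diff -H tank/nodes/deb-j41-1@today-debianinstalled tank/nodes/deb-j41-1 \
#       | changes_under /srv/tank/nodes/deb-j41-1/etc/
```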
5. Install Proxmox
There are two ways to do this. One, you can connect the USB device to the server, chroot into your client, update, and copy the new boot files over as before. Two, you can just update it on the client device. I am opting for the latter just to demonstrate how kernel upgrades go when a remote root is involved.
The problem is that grub will be unable to find the canonical path to root, so we will be unable to run update-grub. This has two primary effects. The first is that the Proxmox installer will complain, repeatedly. As far as I can tell you can ignore this. The other is that we will need to manually move the kernel, create the initial ramdisk, and modify grub.cfg directly. Every grub tutorial and response on the internet says not to do the latter, but that is because grub.cfg is overwritten every time you run update-grub, which happens whenever you install something that modifies your kernel. Since update-grub will not be able to run on the client, we are fairly safe.
Warning
This means that whenever you add a package that you need reflected in the initrd you will need to make a new image and move it to your usb/boot. Examples of this include using zfs, bridges, or vlans at boot time and installing these packages after you last updated the initrd image.
However, when installing software while chrooted on the server (with the flash drive connected) you will lose any modifications made to grub.cfg. So if you want to update-grub in the future, make sure any and all changes are reflected in the 40_custom file. Since the way I have installed grub you have to manually copy the config file to the flash drive, you have to really want to overwrite your config in order to lose it.
Furthermore, you will need to keep the EFI files up-to-date with later kernel updates. Failure to do so could result in an unbootable system. (So reads the Arch Wiki.) As for how to do this on the client without being able to run grub-install, I don't have an answer. So if you update the kernel and are unable to boot, I would recommend moving the flash drive (or another one) to the root server, mounting the EFI partition in the client's .../boot/efi/, and running grub-install while chrooted. Remember to copy your grub.cfg modifications to the 40_custom file first.
On the server
First, let's clone our Debian root. This keeps Debian as a bootable option while we install Proxmox. And if there are any problems with installation then no harm no foul.
Note
Creating a clone is not creating a copy. A ZFS clone uses the same blocks as the original dataset. This saves space but if you decide to keep both the clone and the origin for a long time they will diverge while still being inextricably linked. That is, you cannot destroy one without the other. So use clones when you are testing a new configuration or creating an ephemeral dataset--not when you want to create a persistent dataset.
nfsserver# zfs clone tank/nodes/deb-j41-1@today-debianinstalled tank/nodes/pve-j41-1
Cloning may not keep all the same options, so check and make sure sharenfs and other options are on:
nfsserver# zfs get all tank/nodes/pve-j41-1
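To see the clone-to-origin link described in the note above, the origin property is the thing to look at. It is "-" for ordinary datasets and names the source snapshot for clones. The sketch below filters the tab-separated listing; list_clones is a hypothetical helper name.

```shell
#!/bin/sh
# list_clones
# Reads `zfs list -H -o name,origin` output on stdin (tab-separated) and
# prints "clone <- origin snapshot" for every dataset that has an origin.
# Datasets that are not clones report "-" in the origin column.
# (list_clones is an illustrative helper name.)
list_clones() {
    awk -F '\t' '$2 != "-" { print $1 " <- " $2 }'
}

# Example use on the file server:
#   zfs list -H -o name,origin -r tank/nodes | list_clones
```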
On the Client
There are two ways to do this. You could mount the clone via NFS to /mnt and chroot into that environment. I am going to modify grub to load the new root, reboot, and install normally.
Configure grub to load the new root dataset
Make the changes inside the 40_custom section so you can easily copy them back to the 40_custom file if/when needed. I am adding the new one to the top (above the other menuentrys but below search --no-floppy...) because the first item is the default. (This is configurable before you update-grub by modifying /etc/default/grub or by finding the setting earlier in the file.) I retained the old entry so that we can still boot the Debian kernel if there are any issues.
Notice
We will use the same Debian kernel and initrd image for now.
Important
Make sure that the linux line is all one line.
Edit /local/boot/grub/grub.cfg and insert something like:
menuentry "Proxmox pve-j41-1 4.15.18-9-pve" {
set client_ip='192.168.195.11'
set server_ip='192.168.195.100'
set gw_ip=''
set netmask='255.255.255.0'
set hostname='pve-j41-1'
set domain='.caiuscorvus.net'
set device='enp2s0'
set server_root='/srv/tank/nodes/'
linux /boot/vmlinuz-4.9.0-8-amd64 root=/dev/nfs ip=$client_ip:$server_ip:$gw_ip:$netmask:$hostname$domain:$device nfsroot=$server_ip:$server_root$hostname rw quiet
initrd /boot/initrd.img-deb-today
}
Modify the hostname to match the new entry
# echo 'pve-j41-1' > /etc/hostname
Now reboot and you should be on the clone. If you want to confirm this, modify a file and look for it on the server.
# touch /IAMHERE
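Another way to confirm which root you booted, without touching the server, is to read the nfsroot= argument back out of the kernel command line. A minimal sketch, with nfsroot_of being a name I made up:

```shell
#!/bin/sh
# nfsroot_of CMDLINE
# Prints the value of the nfsroot= argument found in a kernel command line,
# or returns non-zero if there is none. (nfsroot_of is an illustrative name.)
nfsroot_of() {
    # $1 is deliberately unquoted so it splits into individual arguments.
    for arg in $1; do
        case "$arg" in
            nfsroot=*) printf '%s\n' "${arg#nfsroot=}"; return 0 ;;
        esac
    done
    return 1
}

# Example use on the client:
#   nfsroot_of "$(cat /proc/cmdline)"
```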
Add an /etc/hosts entry for your IP address
127.0.0.1 localhost.localdomain localhost
192.168.195.11 pve-j41-1.caiuscorvus.net pve-j41-1 pvelocalhost
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Modify the file /etc/kernel/postinst.d/zz-update-grub and comment out `exec update-grub`. This will help quiet some of the errors you get when you update your kernel. This may need to be repeated with future kernel updates.
### exec update-grub
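Since this edit may need repeating after future kernel updates, here is a sketch of a small function that does it and is safe to rerun. The name disable_update_grub is hypothetical, and the pattern assumes the hook calls update-grub on a line beginning with "exec update-grub", as described above.

```shell
#!/bin/sh
# disable_update_grub FILE
# Prefixes any uncommented "exec update-grub" line in FILE with "### ".
# Lines that are already commented are left alone, so it is safe to rerun.
# (disable_update_grub is an illustrative helper name.)
disable_update_grub() {
    hook=$1
    tmp=$(mktemp) || return 1
    sed 's/^\([[:space:]]*\)exec update-grub/\1### exec update-grub/' "$hook" > "$tmp"
    # Overwrite in place so the hook keeps its executable bit.
    cat "$tmp" > "$hook"
    rm -f "$tmp"
}

# Example use on the client:
#   disable_update_grub /etc/kernel/postinst.d/zz-update-grub
```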
Add the Proxmox VE repository:
# echo "deb http://download.proxmox.com/debian/pve stretch pve-no-subscription" > /etc/apt/sources.list.d/pve-free-repo.list
Add the Proxmox VE repository key:
# wget http://download.proxmox.com/debian/proxmox-ve-release-5.x.gpg -O /etc/apt/trusted.gpg.d/proxmox-ve-release-5.x.gpg
Update your repository and system by running:
# apt update && apt dist-upgrade
Like before, let's look at the kernels and select one. This time search for pve-kernel.
# apt search pve-kernel
Then install the kernel package of your choice using its package name when you install Proxmox. For example:
# apt install proxmox-ve pve-firmware pve-kernel-4.15.18-9-pve
# apt install postfix open-iscsi
Proxmox may add the enterprise repo. If you will be using the community version of Proxmox, feel free to remove it.
# rm /etc/apt/sources.list.d/pve-enterprise.list
Clean up
# apt remove os-prober
# apt clean
6. Prepare the New Boot Files
The grub issue leaves us without a pve initrd image. So, let's make one.
# update-initramfs -c -v -k 4.15.18-9-pve
Copy the new image and the new kernel to the flash drive
# cp /boot/*pve /local/boot/
# ls /local/boot
Lastly, we need to modify our grub entry to reflect the new kernel and initrd image. Just change the two lines in the Proxmox menuentry in /local/boot/grub/grub.cfg
linux /boot/vmlinuz-4.15.18-9-pve ...
initrd /boot/initrd.img-4.15.18-9-pve
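A menu entry that points at a file missing from the flash drive will not boot, so before rebooting it may be worth checking that both files named in the entry actually made it across. This is a sketch; boot_files_present is a name I made up.

```shell
#!/bin/sh
# boot_files_present DIR VERSION
# Succeeds only if DIR holds both vmlinuz-VERSION and initrd.img-VERSION;
# prints the name of each missing file to stderr.
# (boot_files_present is an illustrative helper name.)
boot_files_present() {
    status=0
    for f in "vmlinuz-$2" "initrd.img-$2"; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example use on the client before rebooting:
#   boot_files_present /local/boot 4.15.18-9-pve && echo "ready to reboot"
```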
Now reboot and if everything works feel free to take another snapshot on the server
nfsserver# zfs snapshot -r tank/nodes/pve-j41-1@today-proxmoxinstalled
Additionally, unless you really want to keep the Debian system around, let's get rid of it. Be careful to not destroy the origin before you promote the clone.
nfsserver# zfs promote tank/nodes/pve-j41-1
nfsserver# zfs destroy -r tank/nodes/deb-j41-1
Another thing I did on the server was create a dataset with files I want to push to all Proxmox clients--like the hosts file.
This is the header I include in those files. Note the command I use to push updates.
# *** Warning! ***
# This file is updated on the root server. Changes made here will be
# overwritten by files updated there.
#
# Run the following command on the server to update all pve clients
#
# echo /srv/tank/nodes/pve*/etc/hosts | \
# xargs -n 1 cp -v /srv/tank/nodes/pve-common/hosts
#
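The one-liner in that header relies on the glob matching only client roots. As an alternative sketch, a loop can skip the common directory explicitly and ignore anything without an etc directory. The function name push_common is hypothetical; the paths in the example are the ones from the header.

```shell
#!/bin/sh
# push_common BASE COMMON_DIR NAME
# Copies COMMON_DIR/NAME to etc/NAME inside every BASE/pve* client root,
# skipping COMMON_DIR itself and anything without an etc directory.
# (push_common is an illustrative helper name.)
push_common() {
    for root in "$1"/pve*; do
        [ "$root" = "$2" ] && continue
        [ -d "$root/etc" ] || continue
        cp -v "$2/$3" "$root/etc/$3"
    done
}

# Example use on the file server:
#   push_common /srv/tank/nodes /srv/tank/nodes/pve-common hosts
```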
7. Creating a second client
To make a new client, send|recv the dataset, set up a new flash drive, update grub, copy files, and update files like hostname and postfix. For example:
# zfs send -RDp tank/nodes/pve-j41-1@today-proxmoxinstalled | \
zfs recv tank/nodes/pve-j41-2
When creating the new usb, you can dd the whole thing or just copy the files. I prefer to keep the UUIDs the same so you don't have to modify any UUIDs in grub.cfg. You can do this when you format the new partitions:
# mkfs.fat -F32 -i 8CHAR1ID /dev/sdX1
# mkfs.ext4 -U 1234what-ever-your-uuid-is5678901234 /dev/sdX2
# mount /dev/sdX2 /mnt
# cp -r /srv/tank/nodes/pve-j41-2/boot /mnt
Modify /mnt/boot/grub/grub.cfg to reflect the new hostname and root
menuentry "Proxmox pve-j41-2 4.15.18-9-pve" {
set client_ip='192.168.195.12'
set server_ip='192.168.195.100'
set gw_ip=''
set netmask='255.255.255.0'
set hostname='pve-j41-2'
set domain='.caiuscorvus.net'
set device='enp2s0'
Note
You will have to pull the EFI directory from an existing USB drive, or copy it from a client's flash drive into its NFS-mounted directory and grab it from there. The alternative would be to chroot into the new system and run grub-install.
# umount /mnt
# mount /dev/sdX1 /mnt
# cp -r /???/EFI /mnt
# umount /mnt
To finish, modify your hostname, postfix, and any other config files with the old hostname or IP. If you copied a client which was already in a cluster you will need to make a number of other changes.
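To find the files that still mention the old hostname, a recursive grep over the new client's etc directory is a reasonable starting point. This is a sketch, stale_hostname_files is a hypothetical name, and it only covers etc, so anything stored elsewhere needs a separate look.

```shell
#!/bin/sh
# stale_hostname_files ROOT OLD_NAME
# Lists files under ROOT/etc that still mention OLD_NAME (fixed string).
# Returns non-zero when nothing matches, which is grep's usual behaviour.
# (stale_hostname_files is an illustrative helper name.)
stale_hostname_files() {
    grep -rlF -- "$2" "$1/etc"
}

# Example use on the file server:
#   stale_hostname_files /srv/tank/nodes/pve-j41-2 pve-j41-1
```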