An Uphill Battle Getting Packer to Play Nicely with Proxmox
Running a homelab is a great way to be introduced to a number of new technologies and methodologies with minimal risk.
One of the downsides of running a homelab, however, is maintenance. In an ever-evolving world of security threats and bad actors, it is important to keep your homelab updated and running recent software and hardware.
Since homelabs are generally educational ventures, it is unlikely that anyone is being paid to maintain them. To keep things running smoothly, automation is key to maintaining infrastructure hygiene.
In my lab I am using Proxmox and Packer to generate updated VM images, which I then deploy using OpenTofu. This setup worked fine while I was using Oracle Linux for the majority of my VMs at home, but I made the decision to standardize on Rocky Linux since it is more widely supported by the community. Hopping distros came with several challenges, which I will outline below.
Unmaintained Packer Plugins
As it turns out, the Proxmox plugin for Packer is currently unmaintained, with the latest release being over two years old at this point. Worse yet, the plugin has a bug where the cpu_type parameter is not respected. (This bug supposedly goes back as far as the 1.2.0 release according to the GitHub issues page.) This makes it impossible to create a template based on Red Hat derivatives newer than 9.1, as the default qemu64/kvm64 CPU type is deprecated and causes a kernel panic on boot. (You can see more details on the Red Hat issue tracker[^1].)
The solution came from the community. The Badsectorlabs fork is under active development and actually works with current Proxmox releases. Switching required minimal configuration changes, as it is a drop-in replacement for the official plugin:
packer {
  required_plugins {
    proxmox = {
      version = "1.2.3"
      source  = "github.com/badsectorlabs/proxmox"
    }
    ansible = {
      version = "1.0.3"
      source  = "github.com/hashicorp/ansible"
    }
  }
}
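With the fork in place, the cpu_type parameter is actually honored, which is the whole reason for the switch. As a rough sketch (the node name, resource sizes, and omitted ISO/disk settings here are placeholders, not my exact config):

# Hypothetical excerpt of a proxmox-iso source block.
source "proxmox-iso" "rocky9" {
  node     = "pve"        # placeholder Proxmox node name
  vm_id    = var.vmid
  cpu_type = "host"       # passes the host CPU through; RHEL 9 derivatives need x86-64-v2 or newer
  cores    = 2
  memory   = 4096
  # ... ISO, disk, and network settings omitted ...
}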
Kickstart Configuration Chaos
Rocky 9 and Oracle Linux 9 both claim RHEL compatibility. Their Kickstart implementations disagree.
Key differences that broke my builds:
Network Configuration: Rocky 9 simplified to:
network --bootproto=dhcp --device=link --activate
OL9 still wants explicit device names:
network --bootproto=dhcp --device=eth0 --onboot=yes --activate
Boot Commands: Rocky 9 needs a little massaging when it comes to the boot command. After many hours of trial and error, I arrived at a boot command that neither ended in a kernel panic nor immediately loaded the graphical Anaconda installer environment.
# Definition of the boot command in my .locals file.
# Note the trailing spaces: Packer concatenates these strings verbatim.
locals {
  uefi_boot_command = [
    "<wait3s>c<wait3s>",
    "linuxefi /images/pxeboot/vmlinuz ",
    "inst.stage2=hd:LABEL=Rocky-9-6-x86_64-dvd ",
    "inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ks-rocky9.cfg ",
    "ip=dhcp<enter>",
    "initrdefi /images/pxeboot/initrd.img<enter>",
    "boot<enter>"
  ]
  boot_command = local.uefi_boot_command
}
It’s important to note that, moving from RHEL 8 to RHEL 9 compatible distros, the inst. prefix must be prepended to all of the installer boot parameters.
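To make that concrete, here are the same two parameters written both ways (the server address is a placeholder):

# RHEL 8 style; the unprefixed forms were removed in RHEL 9:
ks=http://192.168.1.10:8080/ks-rocky9.cfg stage2=hd:LABEL=Rocky-9-6-x86_64-dvd
# RHEL 9 style, with the mandatory inst. prefix:
inst.ks=http://192.168.1.10:8080/ks-rocky9.cfg inst.stage2=hd:LABEL=Rocky-9-6-x86_64-dvd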
I also had to make a few modifications to the Kickstart config to get it to play nicely with Rocky:
# Required for text mode installs
text
# Mandatory url if using the minimal base image
url --url="http://download.rockylinux.org/pub/rocky/9/BaseOS/x86_64/os/"
lang en_US.UTF-8
keyboard us
timezone America/Edmonton
rootpw --plaintext packer
sshkey --username=root "<SSH_KEY_GOES_HERE>"
network --bootproto=dhcp --device=link --activate --onboot=on
# Clear all existing partitions
clearpart --all --initlabel
part /boot/efi --fstype=efi --size=200
part /boot --fstype=xfs --size=2048
# Initialize the disk for LVM
part pv.01 --fstype=lvmpv --size=0 --grow
# Create a volume group named 'vg01'
volgroup vg01 pv.01
# Root partition
logvol / --fstype=xfs --name=root --vgname=vg01 --size=10240
# /home partition
logvol /home --fstype=xfs --name=home --vgname=vg01 --size=8192 --grow
# /var partition
logvol /var --fstype=xfs --name=var --vgname=vg01 --size=12288
# /var/log partition
logvol /var/log --fstype=xfs --name=var_log --vgname=vg01 --size=2048
# /srv partition
logvol /srv --fstype=xfs --name=srv --vgname=vg01 --size=2048
# /tmp partition
logvol /tmp --fstype=xfs --name=tmp --vgname=vg01 --size=1024
# Swap partition
logvol swap --fstype=swap --name=swap --vgname=vg01 --size=2048
skipx
firstboot --disable
selinux --enforcing
%packages
@^minimal-environment
%end
reboot
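One piece of glue worth calling out: the inst.ks URL in the boot command only resolves because Packer runs a small HTTP server during the build. A minimal sketch, assuming the Kickstart file sits in an http/ directory beside the template:

# Hypothetical excerpt of the same source block.
source "proxmox-iso" "rocky9" {
  # ... other settings ...
  # Serves ks-rocky9.cfg and populates {{ .HTTPIP }} / {{ .HTTPPort }}
  # in the boot command above.
  http_directory = "http"
}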
Ansible Over Cloud-Init
Cloud-init promises simplicity. In practice, debugging YAML embedded in YAML while troubleshooting why your user data didn’t apply correctly erodes that promise quickly.
Switching to Ansible provided immediate benefits:
Version Control: Playbooks live in Git, not scattered across template metadata.
Testing: I can run playbooks in a test environment before deploying them to VMs.
Modularity: Common tasks become roles. User creation, package installation, and security hardening get reused across different image types. I no longer have to write multiple cloud-init configs for different edge cases.
The Packer configuration simplified to:
build {
  sources = ["source.proxmox-iso.rocky9"]

  # Using Ansible playbooks to configure Rocky Linux 9
  provisioner "ansible" {
    playbook_file = "../ansible/packer.yml"
    use_proxy     = false
    # Required for RHEL and derivatives.
    sftp_command = "/usr/libexec/openssh/sftp-server -e"
    user         = "root"
    galaxy_file  = "../ansible/collections/requirements.yml"
    roles_path   = "../ansible/roles"
    ansible_env_vars = [
      "ANSIBLE_HOST_KEY_CHECKING=False",
      "ANSIBLE_SSH_ARGS='-o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s'",
      "ANSIBLE_NOCOLOR=True"
    ]
    extra_arguments = [
      "--extra-vars", "ansible_ssh_pass=packer",
      # Render the role list as a JSON-style array for Ansible to consume.
      "--extra-vars", "roles_to_install=[%{for role in var.ansible_roles_to_install}\"${role}\",%{endfor~}]",
      "--extra-vars", "vm_id=${var.vmid}"
    ]
  }
}
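The playbook side can stay equally small. Here is a minimal sketch of what an entry-point packer.yml might look like; the task layout is an assumption, though the roles_to_install variable matches the extra-vars above:

# Hypothetical ../ansible/packer.yml
- name: Configure the template image
  hosts: all
  become: true
  tasks:
    - name: Apply the roles requested by the Packer build
      ansible.builtin.include_role:
        name: "{{ item }}"
      loop: "{{ roles_to_install }}"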
This approach trades cloud-init’s runtime flexibility for build-time predictability. For a homelab where I control the entire stack, it’s the right tradeoff. It’s also worth noting that I can use this same provisioning setup for any OS, not just Rocky 9 or cloud-init-compatible OSes.
The Result
A reproducible build pipeline that generates Rocky 9 templates in under 20 minutes. More importantly, the next person (probably future me) can understand what’s happening and why.