I'm trying to set up a cluster with OpenStack. I previously deployed Train on this cluster and it worked fine. I'm now trying to install the latest version, Antelope, with MAAS and a Juju bundle, following the directions provided here. (I've also tried the individual charm deployment guide, with the same issues.)
I've provisioned and configured my machines with MAAS (commissioned the machines, set up br-ex as an Open vSwitch bridge on all nodes, configured partitions for ceph-osd), bootstrapped Juju, and set up Vault and Placement. Juju reports that everything is OK:
Accessing OpenStack via the CLI or Horizon works fine, and I've set up an image, security group, flavor, public network, subnets, etc. Creating an instance reports no errors in OpenStack, but the instance fails to start. (I've tried both a CirrOS image and an Ubuntu Jammy image.) From the console, I see this error:
I've checked /var/log/nova on the hypervisor, and it shows no errors. I've also checked the glance, keystone, ceph and cinder logs on various machines, and I'm not seeing anything that looks even possibly related. I've re-downloaded the images from OpenStack and they match the images I uploaded, and I've verified that my downloads match the images in the official repos.
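The image comparison described above can be repeated with a short script. This is only a sketch: it assumes the expected digest is copied by hand from the image's `checksum` field (MD5) or `os_hash_value` field (SHA-512 by default) as shown by `openstack image show`, and the file path in the usage note is a hypothetical placeholder.

```python
import hashlib


def file_digest(path, algo="sha512", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.new(algo)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex, algo="sha512"):
    """Compare a file's digest with a hex string copied from the image metadata."""
    # Normalise whitespace and case so a pasted value compares cleanly.
    return file_digest(path, algo) == expected_hex.strip().lower()
```

For example, `matches_expected("/tmp/cirros.img", "<value of os_hash_value>")` would return `True` only if the local file is byte-for-byte the image that Glance stored; pass `algo="md5"` when comparing against the `checksum` field instead.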
Where else can I check for errors, or what additional information does anyone need to help debug what's going on? Thanks!
UPDATE: After logging into the hypervisor and inspecting the image files used to start the instance, it appears that nova has not downloaded the image file for the instance properly, although it shows no errors in the log. I checked this by mounting the base image file, and it contains little more than boot information:
root@shen35:/var/lib/nova/instances/_base# file 615313348ae2e8ff2099cc01b35e30cf6e754d3d
jdh_img_test: DOS/MBR boot sector; GRand Unified Bootloader, stage1 version 0x3, 1st sector stage2 0x10c22, extended partition table (last)
root@shen35:/var/lib/nova/instances/_base# fdisk -l 615313348ae2e8ff2099cc01b35e30cf6e754d3d
Disk 615313348ae2e8ff2099cc01b35e30cf6e754d3d: 112 MiB, 117440512 bytes, 229376 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F05EDE64-4EFD-477A-B01E-B37CCD5D3EB4
Mounting the two partitions in there shows normal-looking grub files:
ls -R jdh_mount
jdh_mount:
EFI
jdh_mount/EFI:
BOOT ubuntu
jdh_mount/EFI/BOOT:
bootx64.efi
jdh_mount/EFI/ubuntu:
grub.cfg
root@shen35:/var/lib/nova/instances/_base# fdisk -l jdh_img_test
Disk jdh_img_test: 112 MiB, 117440512 bytes, 229376 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F05EDE64-4EFD-477A-B01E-B37CCD5D3EB4
Device Start End Sectors Size Type
jdh_img_test1 18432 229342 210911 103M Linux filesystem
jdh_img_test15 2048 18431 16384 8M EFI System
Partition table entries are not in disk order.
root@shen35:/var/lib/nova/instances/_base# ls -R
.:
615313348ae2e8ff2099cc01b35e30cf6e754d3d ephemeral_20_40d1d2c jdh_img_test jdh_mount jdh_mount_linux
./jdh_mount:
./jdh_mount_linux:
boot initrd.img lost+found vmlinuz
./jdh_mount_linux/boot:
config-5.3.0-26-generic grub initrd.img-5.3.0-26-generic vmlinuz-5.3.0-26-generic
./jdh_mount_linux/boot/grub:
e2fs_stage1_5 menu.lst stage1 stage2
./jdh_mount_linux/lost+found:
I am surprised that there are apparently no operating system files in there, just boot files. I checked the other base file, which is the ephemeral disk attached to the instance. It's fine, but also completely empty. So I think I can confirm that the instance isn't booting because none of the image files actually contain an OS, despite grub being set up OK. So why isn't nova-compute getting the images right?
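To be sure which Glance image a given file under `_base` was cached from, the file name can be matched against the image IDs. The sketch below rests on an assumption about Nova's libvirt image cache, namely that it names cached base files with the SHA-1 hex digest of the image UUID string; the function names are hypothetical, and the image IDs would come from `openstack image list`.

```python
import hashlib


def nova_base_name(image_id):
    """Cache file name for an image ID.

    Assumption: Nova's libvirt image cache uses the SHA-1 hex digest of the
    image UUID string as the file name under _base.
    """
    return hashlib.sha1(image_id.encode("utf-8")).hexdigest()


def find_image_for_base_file(base_name, image_ids):
    """Return the image ID whose cache name equals base_name, or None."""
    for image_id in image_ids:
        if nova_base_name(image_id) == base_name:
            return image_id
    return None
```

Calling `find_image_for_base_file` with the `_base` file name and the list of image IDs returns the matching ID, or `None` if no image in the list produced that file.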
Update 2: I retried the previous experiment with Ubuntu Jammy, and it seems that nova is in fact downloading the image OK. The small partitions in the CirrOS image match the image I downloaded before sending it to OpenStack. But Jammy won't boot either:
So now I think the images are downloaded properly, but QEMU isn't running them properly. Checking how qemu is run with ps, I'm seeing this nightmare of a command:
/usr/bin/qemu-system-x86_64 -name guest=instance-00000004,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-4-instance-00000004/master-key.aes"} -machine pc-i440fx-6.2,usb=off,dump-guest-core=off,memory-backend=pc.ram -accel kvm -cpu Broadwell-IBRS,vme=on,ss=on,vmx=on,pdcm=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaveopt=on,pdpe1gb=on,abm=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 2 -object {"qom-type":"memory-backend-ram","id":"pc.ram","size":2097152} -overcommit mem-lock=off -smp 1,sockets=1,dies=1,cores=1,threads=1 -uuid d4920f5a-0491-4e1f-b1fc-875660d11eda -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=27.0.0,serial=d4920f5a-0491-4e1f-b1fc-875660d11eda,uuid=d4920f5a-0491-4e1f-b1fc-875660d11eda,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=39,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -blockdev {"driver":"file","filename":"/var/lib/nova/instances/_base/56f431310d4bce927e45584a70e34f29141f40af","node-name":"libvirt-4-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-4-format","read-only":true,"discard":"unmap","cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-4-storage"} -blockdev {"driver":"file","filename":"/var/lib/nova/instances/d4920f5a-0491-4e1f-b1fc-875660d11eda/disk","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"discard":"unmap","cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-2-storage","backing":"libvirt-4-format"} -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=libvirt-2-format,id=virtio-disk0,bootindex=1,write-cache=on -blockdev {"driver":"file","filename":"/var/lib/nova/instances/_base/ephemeral_20_40d1d2c","node-name":"libvirt-3-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-3-format","read-only":true,"discard":"unmap","cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-3-storage"} -blockdev {"driver":"file","filename":"/var/lib/nova/instances/d4920f5a-0491-4e1f-b1fc-875660d11eda/disk.eph0","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-3-format"} -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=libvirt-1-format,id=virtio-disk1,write-cache=on -netdev tap,fd=42,id=hostnet0,vhost=on,vhostfd=44 -device virtio-net-pci,host_mtu=1500,netdev=hostnet0,id=net0,mac=fa:16:3e:dd:7e:73,bus=pci.0,addr=0x3 -add-fd set=3,fd=41 -chardev pty,id=charserial0,logfile=/dev/fdset/3,logappend=on -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -audiodev {"id":"audio1","driver":"none"} -vnc 10.246.117.211:2,audiodev=audio1 -device virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -object {"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"} -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 -device vmcoreinfo -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
Nothing obviously jumps out as wrong, though. Still lost as to what's going wrong.
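A command line of this length is easier to review with one option per line. The helper below is a rough sketch, not a real QEMU argument parser: it assumes that every option starts with whitespace followed by `-` and a letter, which holds for the command above but would break on a value that itself contains such a sequence.

```python
import re


def split_qemu_options(cmdline):
    """Split a QEMU command line, as printed by ps, into (option, value) pairs."""
    # Heuristic: a new option begins at whitespace followed by '-' and a letter.
    parts = re.split(r"\s+(?=-[A-Za-z])", cmdline.strip())
    binary = parts[0]
    options = []
    for part in parts[1:]:
        name, _, value = part.partition(" ")
        options.append((name, value.strip()))
    return binary, options


def print_options(cmdline):
    """Print the binary, then each option and its value on a separate line."""
    binary, options = split_qemu_options(cmdline)
    print(binary)
    for name, value in options:
        print(f"  {name} {value}".rstrip())
```

Passing the full `ps` output to `print_options` puts values such as `-m`, `-smp` and the `-blockdev` file names on their own lines, where they can be compared against the flavor and the files under /var/lib/nova/instances.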