Running VMs On FreeBSD using QEMU with VDE
I decided to run some VMs on my FreeBSD server. Checking over bhyve, apparently it’s being developed in collaboration with NetApp. The project is currently supported in FreeBSD 10 and I was running FreeBSD 9. So after all of my research, I decided to give Qemu/KQemu a try.
Now what exactly is the difference between Qemu and KVM? From the KVM FAQ:
QEMU uses emulation; KVM uses processor extensions (HVM) for virtualization.
And what about KQemu? From the Qemu Wikipedia page:
QEMU (short for “Quick EMUlator”) is a free and open-source software product that performs hardware virtualization.
QEMU is a hosted virtual machine monitor: It emulates central processing units through dynamic binary translation and provides a set of device models, enabling it to run a variety of unmodified guest operating systems. It also provides an accelerated mode for supporting a mixture of binary translation (for kernel code) and native execution (for user code), in the same fashion VMware Workstation and VirtualBox do. QEMU can also be used purely for CPU emulation for user level processes, allowing applications compiled for one architecture to be run on another.
KQEMU was a Linux kernel module, also written by Fabrice Bellard, which notably sped up emulation of x86 or x86-64 guests on platforms with the same CPU architecture. This was accomplished by running user mode code (and optionally some kernel code) directly on the host computer’s CPU, and by using processor and peripheral emulation only for kernel mode and real mode code.
Unlike KVM, for example, KQEMU could execute code from many guest OSes even if the host CPU did not support hardware virtualization.
Lastly, from Qemu’s main page:
QEMU is a generic and open source machine emulator and virtualizer.
When used as a machine emulator, QEMU can run OSes and programs made for one machine (e.g. an ARM board) on a different machine (e.g. your own PC). By using dynamic translation, it achieves very good performance.
When used as a virtualizer, QEMU achieves near native performances by executing the guest code directly on the host CPU. QEMU supports virtualization when executing under the Xen hypervisor or using the KVM kernel module in Linux. When using KVM, QEMU can virtualize x86, server and embedded PowerPC, and S390 guests.
So to re-phrase my initial sentence: I will be virtualizing/emulating an OS within the FreeBSD kernel and using the KQemu acceleration kernel module to improve performance/execution.
Now let’s get started with the setup. First let’s install Qemu; instructions can be found here:
[email protected]:~> cd /usr/ports/emulators/qemu
[email protected]:/usr/ports/emulators/qemu> make showconfig
===> The following configuration options are available for qemu-0.11.1_11:
     ADD_AUDIO=off: Emulate more audio hardware (experimental!)
     ALL_TARGETS=on: Also build non-x86 targets
     CDROM_DMA=on: IDE CDROM DMA
     CURL=on: libcurl dependency (remote images)
     GNS3=off: gns3 patches (udp, promiscuous multicast)
     GNUTLS=on: gnutls dependency (vnc encryption)
     KQEMU=on: Build with (alpha!) accelerator module
     PCAP=on: pcap dependency (networking with bpf)
     RTL8139_TIMER=off: allow use of re(4) nic with FreeBSD guests
     SAMBA=off: samba dependency (for -smb)
     SDL=on: SDL/X dependency (graphical output)
===> Use 'make config' to modify these settings
[email protected]:/usr/ports/emulators/qemu> sudo make install clean
After the compile finishes, you can enable the KQemu module to be loaded on boot. This is done by adding the following to the /etc/rc.conf file:
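A minimal sketch of that entry, assuming the rc script installed with the port is switched on through a `kqemu_enable` variable (the name follows the usual rc.conf convention and is an assumption; the port’s install message is the authority):

```shell
# /etc/rc.conf
# Assumption: the rc script installed by the port is controlled by kqemu_enable.
kqemu_enable="YES"
```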
This will load the KQemu module on start up and it will also check if the aio module is loaded. If the aio module is not loaded, it will also load it.
Before rebooting, load them manually and make sure they load properly:
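Something along these lines should do it. This is a sketch, not the exact commands from the original session; it assumes the modules are named aio and kqemu, as in the kldstat listing further down, and it skips a module that is already loaded:

```shell
# Load aio and kqemu by hand, skipping any module that is already loaded.
# kldstat and kldload are FreeBSD tools; elsewhere the loop only reports.
for mod in aio kqemu; do
    if ! command -v kldstat >/dev/null 2>&1; then
        echo "kldstat not found, skipping ${mod}"
    elif kldstat -n "${mod}.ko" >/dev/null 2>&1; then
        echo "${mod} is already loaded"
    else
        sudo kldload "${mod}" && echo "${mod} loaded"
    fi
done
```

Running kldstat afterwards should list both aio.ko and kqemu.ko.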
That looks good. As a quick test, let’s create an img file:
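A sketch of the image creation, assuming a 10G disk (the size is only an illustration) and the rhel2.img name used in the commands that follow:

```shell
# Create a raw disk image for the VM and show its details.
# Assumptions: the 10G size is illustrative; rhel2.img matches the name used later.
IMG="${IMG:-rhel2.img}"
SIZE="${SIZE:-10G}"
if command -v qemu-img >/dev/null 2>&1; then
    qemu-img create -f raw "$IMG" "$SIZE"
    qemu-img info "$IMG"
else
    echo "qemu-img not found; install the qemu port first"
fi
```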
The ‘-f’ option specifies the format of the image. From the man page, here are the options:
“raw” Raw disk image format (default). This format has the advantage of being simple and easily exportable to all other emulators. If your file system supports holes (for example in ext2 or ext3 on Linux or NTFS on Windows), then only the written sectors will reserve space. Use “qemu-img info” to know the real size used by the image or “ls -ls” on Unix/Linux.
“qcow2” QEMU image format, the most versatile format. Use it to have smaller images (useful if your filesystem does not support holes, for example on Windows), optional AES encryption, zlib based compression and support of multiple VM snapshots.
“qcow” Old QEMU image format. Left for compatibility.
“cow” User Mode Linux Copy On Write image format. Used to be the only growable image format in QEMU. It is supported only for compatibility with previous versions. It does not work on win32.
“vmdk” VMware 3 and 4 compatible image format.
“cloop” Linux Compressed Loop image, useful only to reuse directly compressed CD-ROM images present for example in the Knoppix CD-ROMs.
Since I will be running RHEL on the VM (that will use an ext filesystem), I decided to use the raw format. Now to start the VM using that img file as the hard drive:
[email protected]:/data/vms> qemu -cdrom ../rhel-server-5.5-i386-dvd.iso -hda rhel2.img -m 256 -boot d -kernel-kqemu -vnc :0 -localtime
Now to test connectivity, from a Fedora machine I installed a VNC client:
moxz:~> sudo yum install tigervnc
After it’s installed, I tried to connect to the VM:
moxz:~> vncviewer 192.168.1.101:5900
A VNC window popped up without any issues, and I saw the RHEL CD boot up. Now if we want our VM to have access to the network then we have a couple of options. From the Qemu Networking page, we can use “User Mode Networking”:
User Mode Networking – In this mode, the QEMU virtual machine automatically starts up an internal DHCP server on an internal network address - 10.0.2.2/24. This is internal to the guest environment and is not visible from the host environment. If the guest OS is set up for DHCP, the guest will get an IP address from this internal DHCP server. The QEMU virtual machine will also gateway packets onto the host network through 127.0.0.1. In this way, QEMU can provide an automatic network environment for the QEMU user without any manual configuration.
Here is a good diagram of how it works:
Second we can use “TUN/TAP Network Interface”. From this page:
TUN/TAP Network Interface – In this mode, the QEMU Virtual Machine opens a pre-allocated TUN or TAP device on the host and uses that interface to transfer data to the guest OS. This method involves the standard manual configuration of the guest OS interface using the ifconfig command. This method doesn’t involve the DHCP server. However, note that the server is still running, it’s just not used by an interface using this method.
Lastly we can use the “VDE” (Virtual Distributed Ethernet) method:
The VDE networking backend uses the Virtual Distributed Ethernet infrastructure to network guests. From the VDE page:
VDE switch Like a physical ethernet switch, a VDE switch has several virtual ports where virtual machines, applications, virtual interfaces, connectivity tools and - why not? - other VDE switch can be virtually plugged in.
vde_switch The vde_switch is a virtual switch provided with the vde networking architecture. As vde_switch can interconnect several virtual networking devices multiple vde_switches can be connected together with vde_cables.
Since Jarret already did a post on the bridged/tap networking with KVM, I decided to try out the VDE setup. First let’s install the VDE Package:
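Assuming the port lives in net/vde2, as the prompt in the config listing below suggests, the install can be sketched like this:

```shell
# Build and install VDE from the ports tree.
# Assumption: the port directory matches the prompt in the config listing.
PORT_DIR=/usr/ports/net/vde2
if [ -d "$PORT_DIR" ]; then
    cd "$PORT_DIR" && sudo make install clean
else
    echo "ports tree not found at $PORT_DIR"
fi
```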
Here is the config I used:
[email protected]:/usr/ports/net/vde2> make showconfig
===> The following configuration options are available for vde2-2.3.2:
     PYTHON=on: Python bindings
===> Use 'make config' to modify these settings
After that is installed we should create a tap interface to act as an uplink for our VDE-switch. Instructions for this can be found here. First let’s create the bridge:
[email protected]:~> sudo ifconfig bridge0 create
Then let’s create the tap interface:
[email protected]:~> sudo ifconfig tap0 create
Now let’s bridge our physical interface with our tap interface:
[email protected]:~> sudo ifconfig bridge0 addm em0 addm tap0 up
Now checking the settings for both interfaces:
[email protected]:~> ifconfig bridge0
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:84:95:2c:24:00
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 8 priority 128 path cost 2000000
        member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 20000
Both interfaces (tap0 and em0) are listed as members of the bridge. And here is our tap0 interface:
[email protected]:~> ifconfig tap0
tap0: flags=8902<BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 00:bd:e4:d8:f1:00
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Since regular users will be connecting to the tap device, let’s allow non-root users to open it:
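One way to do that is with the tap sysctls. The sketch below assumes net.link.tap.user_open (lets non-root users open the device) and net.link.tap.up_on_open (marks the interface up when it is opened) are the settings in question:

```shell
# Assumption: these two tap(4) sysctls are the settings meant here.
TAP_SYSCTLS="net.link.tap.user_open=1 net.link.tap.up_on_open=1"
for setting in $TAP_SYSCTLS; do
    if [ "$(uname -s)" = "FreeBSD" ]; then
        sudo sysctl "$setting"
    else
        echo "skipping $setting (FreeBSD only)"
    fi
done
```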
Also let’s fix the permissions on the file for the tap0 device:
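A sketch of the ownership and mode change, using the root:elatov owner and 660 mode that the listing below confirms:

```shell
# Give the elatov group read/write access to the tap device node.
TAP_DEV=/dev/tap0
if [ -e "$TAP_DEV" ]; then
    sudo chown root:elatov "$TAP_DEV"
    sudo chmod 660 "$TAP_DEV"
else
    echo "$TAP_DEV does not exist yet; create tap0 first"
fi
```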
Now confirming the permissions:
[email protected]:~> ls -l /dev/tap0
crw-rw---- 1 root elatov 0, 109 Feb 3 16:14 /dev/tap0
That looks good. Now let’s create a VDE-Switch and add tap0 as our uplink:
[email protected]:~> sudo vde_switch -d -s /tmp/vde1 -M /tmp/mgmt1 -tap tap0 -m 660 -g elatov --mgmtmode 660 --mgmtgroup elatov
Here are all the arguments explained:
- -d tells vde_switch to run as a daemon (background process).
- -s is the complete path to the data socket for the switch.
- -M specifies where to create the management socket for the switch.
- -t (or -tap, as used above) specifies the tap interface that is connected to the switch.
- -m specifies the mode of the data socket.
- -g specifies the group owner of the data socket.
- --mgmtmode specifies the mode of the management socket.
- --mgmtgroup specifies the group owner of the management socket.
Now let’s log in to the virtual switch:
[email protected]:~> unixterm /tmp/mgmt1
VDE switch V.2.3.2
(C) Virtual Square Team (coord. R. Davoli) 2005,2006,2007 - GPLv2
vde$ ds/showinfo
0000 DATA END WITH '.'
ctl dir /tmp/vde1
std mode 0660
.
1000 Success
vde$ port/allprint
0000 DATA END WITH '.'
Port 0001 untagged_vlan=0000 ACTIVE - Unnamed Allocatable
 Current User: NONE Access Control: (User: NONE - Group: NONE)
  -- endpoint ID 0007 module tuntap : tap0
.
1000 Success
vde$ vlan/allprint
0000 DATA END WITH '.'
VLAN 0000
 -- Port 0001 tagged=0 active=1 status=Forwarding
.
1000 Success
Most of the information regarding the VDE was taken from here; ‘unixterm’ was good enough for me :)
Now starting up the VM and connecting it to our VDE-Switch:
[email protected]:~>vdeqemu -hda rhel2.img -m 256 -kernel-kqemu -vnc :0 -localtime -no-acpi -net vde,sock=/tmp/vde1 -net nic,model=e1000
Now checking out the switch ports:
[email protected]:~>unixterm /tmp/mgmt1
VDE switch V.2.3.2
(C) Virtual Square Team (coord. R. Davoli) 2005,2006,2007 - GPLv2
vde$ port/allprint
0000 DATA END WITH '.'
Port 0001 untagged_vlan=0000 ACTIVE - Unnamed Allocatable
 Current User: NONE Access Control: (User: NONE - Group: NONE)
  -- endpoint ID 0007 module tuntap : tap0
Port 0002 untagged_vlan=0000 ACTIVE - Unnamed Allocatable
 Current User: elatov Access Control: (User: NONE - Group: NONE)
  -- endpoint ID 0003 module unix prog : vdeqemu user=elatov PID=98624 SSH=192.168.1.102
.
1000 Success
We can see that our VM is now connected to the VDE-Switch. Since all the traffic is going through our tap0 interface, we can run tcpdump on it to see what traffic our VM is sending. Let’s ping a machine on the local subnet from the VM and watch the tap0 interface. Here is the capture from the FreeBSD host while the ping is running:
[email protected]:~> sudo tcpdump -i tap0 -n host 192.168.1.110 and icmp
tcpdump: WARNING: tap0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap0, link-type EN10MB (Ethernet), capture size 65535 bytes
17:40:01.353744 IP 192.168.1.110 > 192.168.1.102: ICMP echo request, id 12298, seq 1, length 64
17:40:01.353957 IP 192.168.1.102 > 192.168.1.110: ICMP echo reply, id 12298, seq 1, length 64
In case you didn’t guess it, the VM’s IP is 192.168.1.110. You can also monitor traffic using file sockets. In-depth instructions are laid out here.
Since we don’t want to keep recreating the above settings, let’s set all of them up to be configured automatically on boot. We already enabled the kqemu module to be loaded on boot, so now let’s set up the bridge and tap interfaces to be created on boot. Add the following to the /etc/rc.conf file:
cloned_interfaces="tap0 bridge0"
ifconfig_bridge0="addm em0 addm tap0 up"
Now let’s enable the sysctl options. Add the following to the /etc/sysctl.conf file:
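A sketch of those entries, assuming the settings in question are net.link.tap.user_open and net.link.tap.up_on_open. It stages them in a scratch file (a hypothetical path) so they can be reviewed before being appended to /etc/sysctl.conf:

```shell
# Stage the sysctl.conf entries in a scratch file for review.
# Assumptions: the two tap sysctls below are the intended settings,
# and the scratch file name is only an illustration.
STAGE="${TMPDIR:-/tmp}/sysctl.conf.tap"
cat > "$STAGE" <<'EOF'
net.link.tap.user_open=1
net.link.tap.up_on_open=1
EOF
cat "$STAGE"
# When satisfied, append as root:
#   sudo sh -c "cat '$STAGE' >> /etc/sysctl.conf"
```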
Next let’s set up the appropriate permissions for our tap0 device. Add the following to the /etc/devfs.conf file:
own tap0 root:elatov
perm tap0 660
Lastly, create the VDE-Switch on boot. Add the following to the /etc/rc.local file:
/usr/local/bin/vde_switch -d -s /tmp/vde1 -M /tmp/mgmt1 -tap tap0 -m 660 -g elatov --mgmtmode 660 --mgmtgroup elatov
So I rebooted the FreeBSD host to make sure all the settings looked good. First I checked that my bridge0 and tap0 interfaces were set up:
[email protected]:~> ifconfig tap0
tap0: flags=8942<BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 00:bd:90:1a:00:00
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
and the bridge0 interface:
[email protected]:~> ifconfig bridge0
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:84:95:2c:24:00
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 2000000
        member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 2000000
That looked good. Next I wanted to make sure my VDE-Switch was created:
[email protected]:~> unixterm /tmp/mgmt1
VDE switch V.2.3.2
(C) Virtual Square Team (coord. R. Davoli) 2005,2006,2007 - GPLv2
vde$ ds/showinfo
0000 DATA END WITH '.'
ctl dir /tmp/vde1
std mode 0660
.
1000 Success
vde$ port/allprint
0000 DATA END WITH '.'
Port 0001 untagged_vlan=0000 ACTIVE - Unnamed Allocatable
 Current User: NONE Access Control: (User: NONE - Group: NONE)
  -- endpoint ID 0007 module tuntap : tap0
.
1000 Success
That also looked good; we even see the tap0 device connected. Then I wanted to make sure the permissions on my tap0 device were correct:
[email protected]:~> ls -l /dev/tap0
crw-rw---- 1 root elatov 0, 102 Feb 3 18:00 /dev/tap0
and I wanted to make sure the sysctl settings looked good as well:
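A quick way to check, assuming the two tap sysctls (net.link.tap.user_open and net.link.tap.up_on_open) are the ones of interest. The loop prints each value, or "unavailable" on a system that does not have them:

```shell
# Print the current value of each tap sysctl (assumed names, see above).
for name in net.link.tap.user_open net.link.tap.up_on_open; do
    value=$(sysctl -n "$name" 2>/dev/null) || value="unavailable"
    echo "$name = $value"
done
```

Both values should read 1 if the sysctl.conf entries were applied at boot.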
And lastly I wanted to make sure the kernel modules were loaded:
[email protected]:~>kldstat
Id Refs Address    Size     Name
 1   13 0xc0400000 e9ec64   kernel
 2    1 0xc62de000 5000     if_tap.ko
 3    1 0xc6302000 9000     if_bridge.ko
 4    1 0xc630b000 6000     bridgestp.ko
 5    1 0xc647c000 e000     fuse.ko
 6    1 0xc64ef000 8000     aio.ko
 7    1 0xc64fa000 21000    kqemu.ko
Everything looked good. Jarret’s web page mentions a number of different tools for managing VMs. Checking over the FreeBSD ports, I only found the following:
All of the above (except aqemu) depend on the libvirt libraries. After further investigation it turned out that libvirt and VDE don’t work together… yet. From here, a snippet:
For generic situations libvirt and virt-manager are useful tools to help manage VM clusters. ubuntu-vm-builder is useful for creating VMs and adding them to libvirt hosts.
VDE and libvirt Don’t Play Together
Unfortunately libvirt doesn’t currently support VDE networks, although it is possible for someone to implement a VDE interface using the libvirt network API.
I was planning on only running 2 VMs, so this wasn’t a big deal for me. Running the following two commands to start my VMs is pretty easy:
[email protected]:~>vdeqemu -hda rhel1.img -m 512 -kernel-kqemu -vnc :0 -localtime -no-acpi -net nic,model=e1000,macaddr=52:54:00:12:34:56 -net vde,sock=/tmp/vde1 &
[email protected]:~>vdeqemu -hda rhel2.img -m 256 -kernel-kqemu -vnc :1 -localtime -no-acpi -net nic,model=e1000,macaddr=52:54:00:12:34:57 -net vde,sock=/tmp/vde1 &
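With more than two VMs, typing these out gets tedious. Here is a sketch of a small helper that prints the command line for one VM using the same options as above. The names vm_cmd and VDE_SOCK are hypothetical and not part of VDE or Qemu:

```shell
# Hypothetical helper: print the vdeqemu command line for one VM.
# Arguments: image file, memory in MB, VNC display number, MAC address.
# VDE_SOCK is an assumed override for the switch socket path.
vm_cmd() {
    printf 'vdeqemu -hda %s -m %s -kernel-kqemu -vnc :%s -localtime -no-acpi -net nic,model=e1000,macaddr=%s -net vde,sock=%s' \
        "$1" "$2" "$3" "$4" "${VDE_SOCK:-/tmp/vde1}"
}

# Print the two command lines so they can be reviewed before running.
vm_cmd rhel1.img 512 0 52:54:00:12:34:56; echo
vm_cmd rhel2.img 256 1 52:54:00:12:34:57; echo
```

Each VM needs its own VNC display number and its own MAC address, which is why those are arguments and not fixed values.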