> In this setup, UML is essentially a userspace process that cleverly employs concepts like files and sockets to launch a new Linux kernel instance capable of running its own processes. The exact mapping of these processes to the host — specifically, how the CPU is virtualized — is something I’m not entirely clear on, and I’d welcome insights in the comments. One could envision an implementation where guest threads and processes map to host counterparts but with restricted system visibility, akin to containers, yet still operating within a nested Linux kernel.
At least in the first generation of UML, the guest processes are in fact host processes. The guest kernel (a userland process) essentially runs them under ptrace() and catches all of the system calls made by the guest process and rewires them so they do operations inside of the guest kernel. They otherwise run like host processes on host CPU, though.
Completing the illusion, however, the guest kernel also skillfully rewires the guest ptrace() calls so you can still use strace or gdb inside of the guest!
It's good enough that you can go deeper and run UML inside of UML.
> What’s the real-world utility here? Is UML suitable for running isolated workloads? My educated guess is: probably not for most production scenarios.
Back in the day there were hosts offering UML VMs for rent. This is actually how Linode got its start!
The second generation was "skas" for Separate Kernel Address Space, some more background here: https://user-mode-linux.sourceforge.net/old/skas.html
The host kernel patch for skas was never merged, probably for good reason, but that, together with Xen and hardware virtualization support, meant UML stopped making sense.
Do you know why people stopped? It would seem to be a potentially useful middle ground between docker containers and KVM VMs
It's slow for many of the things people want to use it for.
Performance, mostly.
I worked for a hosting company that sold UML-based virtual machines; we trialed Xen as the successor before moving to KVM instead.
But also KVM supported things like live-migration and virtio drivers which made custom interfaces and portability easier to deal with.
Why do they initialize a disk image with /dev/urandom instead of /dev/zero? Given it's not an encrypted disk container, I don't see any valid reason to do so, but perhaps I'm not seeing something?
Probably to avoid zero-write optimizations. This forces actual allocation of disk space for the data, instead of the filesystem or image tooling pretending to do so.
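One such optimization is easy to demonstrate with GNU coreutils' sparse-file handling: an all-zero image can be stored as holes occupying no disk blocks, while random data cannot. A quick sketch (filenames are just examples):

```shell
# Two 1 MiB images: one zero-filled, one filled from urandom.
dd if=/dev/zero    of=zero.img bs=1M count=1 2>/dev/null
dd if=/dev/urandom of=rand.img bs=1M count=1 2>/dev/null

# Tools that detect all-zero blocks can turn them into holes,
# so the zero-filled copy consumes (almost) no disk space.
cp --sparse=always zero.img zero-sparse.img
cp --sparse=always rand.img rand-sparse.img

du -k zero-sparse.img rand-sparse.img
```

The zero-filled copy reports far fewer allocated kilobytes than the urandom-filled one, which is exactly what pre-filling with urandom prevents.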
In case you wonder how UML is currently used: https://netdevconf.info/0x14/pub/slides/8/UML%20Time%20Trave...
It's testing. Using time-travel mode you can skip sleeps and speed up your unit tests massively.
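For reference, the mode is enabled on the UML kernel command line; a sketch of an invocation, assuming a UML binary named `linux` and example memory/rootfs arguments, per the kernel's time-travel documentation:

```shell
# Boot UML with "infinite CPU speed" time travel: the clock jumps
# forward whenever the system is idle, so sleeps and timeouts in
# tests complete in virtual time rather than wall-clock time.
./linux mem=256M root=/dev/root rootfstype=hostfs time-travel=inf-cpu
```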
It was great. I remember trying it about twenty years ago. The very first time I fired it up, I just typed "linux" at a prompt, and a kernel booted - right there in the terminal.
And then panicked, because it had no root. But hey, I've got a root filesystem right here!
So the second time I typed "linux root=/dev/hda1" (because we had parallel ATA drives back then).
It booted, mounted root, and of course that was the root filesystem the host was booted off.
Anyway it recovered after a power cycle and I didn't need to reinstall, and most importantly I learned not to do THAT again, which is often the important thing to learn.
I used this quite some time ago; it's sad that it only supports one CPU, which prevents some SMP bugs from emerging.
I wonder if it would be hard to make it SMP. If too many places use something like #ifdef CONFIG_ARCH_IS_UM to tell whether it is single-CPU, it might be hard.
SMP has been implemented recently and is queued for the next release.
Interesting
That's giving very Firecracker vibes.