notes.nitwit.se

Adventures in minimalism with Arch Linux

By Mark Dixon

Intro

How small can you go? Adventures in finding a minimal Linux for VMs.

A hundred years ago or more, back when Linux was still largely the undiscovered wild west, you would install Linux by hand from a handful of 1.44MB floppies painstakingly copied from a friend: kernel, bootloader, gcc, gdb, Emacs, Ghostscript, the lot.

A decade later there came a period of minimalist distros such as Tiny Core Linux - aimed at getting X up on your screen in as few bytes as possible. Sadly(?) the need for such distros seems to have waned; Tiny hasn’t received an update since 2008.

I recently needed to spin up a tiny server VM to test something (some SQL injection, don’t ask). On my Mac, UTM seemed like the smoothest option with its gallery of pre-built VM images: a 500+MB installer download followed by a short install process, leaving me with a 2GB virtual disk image. Hardly tiny, nor core - what if I want to share my VMs? Storage is cheap, but bandwidth still costs. Thus was born my brief obsession with the thought: how small can you go for a shareable Linux VM?

TLDR: don’t go with Arch or any mainstream distro. As shall be seen later, there are still a few distros around aimed at true minimalism. I can get Arch down to a circa 850MB file - but e.g. Alpine Linux weighs in at a fraction of that (50MB).


Installing QEMU

Shareable means targeting the lowest common denominator: the x86-64 architecture. That doesn’t leave many options for virtualisation - for a Mac user it is full system emulation or nothing, which means QEMU. UTM is really just a wrapper around QEMU - so for the sake of transparency and better control, let’s stick to QEMU.

MacOS

QEMU is easily installed with Homebrew (follow that link first if you don’t have it installed).

brew install qemu 

Grab a ☕.

Not MacOS

Your mileage may vary - but QEMU shouldn’t be that hard to get your hands on.

Ubuntu

sudo apt-get install qemu-system-x86 qemu-utils

Bootstrapping QEMU

Step 1: virtual disk image

To keep things as small as possible outside the VM, we need a sparse disk image. QEMU has a built-in tool for creating and converting disk images:

qemu-img create -f qcow2 vdisk1 100G

100GB is a bit bigger than we need - but since it is a sparse image, the file should never get that big.

qcow2 is QEMU’s own file format for disk images, with support for sparse imaging: aligning nicely with our goal.

Now you should have a single file vdisk1 that is roughly 200KB.

Step 2: download the Arch ISO installer

If we want to get small, we probably don’t want to use an off-the-shelf Arch VM image: we want to do this hardcore mode and that means starting with an ISO image. Find your convenient mirror from the list at https://archlinux.org/download/ and download with:

curl --fail --remote-name --location --continue-at - \
   https://ftp.acc.umu.se/mirror/archlinux/iso/2023.04.01/archlinux-2023.04.01-x86_64.iso
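
Before booting a downloaded ISO it is worth checking that it arrived intact. The following is a minimal sketch, assuming you copy the expected SHA-256 value by hand from the checksums published alongside the ISO; verify_sha256 is a helper name made up for this note, not an existing tool.

```shell
#!/bin/sh
# Sketch: compare a file's SHA-256 against an expected value.
# verify_sha256 is an illustrative helper name; the expected hash must come
# from the checksums published with the ISO.
verify_sha256() {
    file=$1
    expected=$2
    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    else
        # macOS ships shasum rather than sha256sum.
        actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
    fi
    # The function succeeds only when the two hashes are identical.
    [ "$actual" = "$expected" ]
}

# Example:
#   verify_sha256 archlinux-2023.04.01-x86_64.iso "<published hash>" && echo OK
```

The function prints nothing itself and only sets its exit status, so it can be chained in front of the QEMU command.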

Going Arch

Now we can run QEMU with our newly created sparse disk image and the ISO image we downloaded.

Booting the installer ISO

Note: taking the lazy approach to networking (-netdev vmnet-shared) means having to run QEMU as root, allowing QEMU to automatically set up bridging for you.

sudo qemu-system-x86_64  -m 2048 \
     -hda ./vdisk1 -cdrom ./archlinux-2023.04.01-x86_64.iso  \
     -device virtio-net-pci,netdev=net0 \
     -netdev vmnet-shared,id=net0 \
     -name ArchLinux -device virtio-rng-pci -boot menu=on

If you don’t feel comfortable running your VM as root (on MacOS) then it looks like you might be out of luck for now: the Arch installer will want a network connection.
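
One possible way around root, as an untested sketch: QEMU's user-mode networking (-netdev user) needs no privileges. This assumes your QEMU build includes user-mode (slirp) support. Ping may not get through this backend even when ordinary TCP traffic does, so a failing ping test later on is not necessarily a broken network. The DISK and ISO variables and the function name are placeholders of my own.

```shell
#!/bin/sh
# Sketch: same VM as above, but with user-mode networking so no sudo is needed.
# DISK, ISO and qemu_user_net_cmd are illustrative names, not part of QEMU.
# Assumes the two paths contain no spaces.
DISK=${DISK:-./vdisk1}
ISO=${ISO:-./archlinux-2023.04.01-x86_64.iso}

# Print the full command line instead of running it, so it can be inspected first.
qemu_user_net_cmd() {
    echo qemu-system-x86_64 -m 2048 \
        -hda "$DISK" -cdrom "$ISO" \
        -device virtio-net-pci,netdev=net0 \
        -netdev user,id=net0 \
        -name ArchLinux -device virtio-rng-pci -boot menu=on
}

qemu_user_net_cmd
```

Once the printed command looks right, `qemu_user_net_cmd | sh` launches the VM.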

After a couple of minutes you should have a root prompt:

[Screenshot: the installer’s root prompt]

Preparing for installation

First a few small details to make the rest smoother.

Enable the correct keymap for your keyboard. The official Arch installation instructions recommend looking at all the available keymaps with:

ls /usr/share/kbd/keymaps/**/*.map.gz

If you are running MacOS then you can go straight to the relevant Mac files:

ls /usr/share/kbd/keymaps/mac/all

On my Swedish Macbook Pro this gives me apple-a1243-sv as my desired keymap.

Install with:

loadkeys apple-a1243-sv

Next, check you have a working network interface:

ip link

should reveal an ethernet device, probably with the device name ens3

[Screenshot: ip link output]

If you see the green “UP” status then you should have a working internet connection. The final test is to verify that DNS is working and that you can ping out:

ping archlinux.org

Partitioning the disk

For anyone who did install Slackware from floppies, it should be somewhat reassuring to know that things really haven’t changed much in 30 years.

Your virtual disk should be available at /dev/sda and fdisk will let you set up your partitions ready for an install.

Double check the disk is there with:

fdisk -l

[Screenshot: fdisk output listing the disk]

The standard Arch installation documentation will now recommend using fdisk to partition your disk. However, since we are targeting a simple and minimal virtual machine image, let’s skip that and go straight to what works: gdisk.

We are going to create three partitions: a BIOS boot partition, a minimal swap partition (this really should be optional, but will depend largely on how much memory you are prepared to allocate to your VM), and the remaining space for the filesystem.

BIOS boot partition

This will prepare the disk for a GPT boot process (and a protective MBR). From the Arch wiki: MBR (master boot record) dates back to the DOS era and has a number of limitations. GPT is the more modern and more flexible partitioning scheme. Unless you are using a legacy BIOS and want to dual-boot with Windows, GPT should suffice. For our VM we’ll go BIOS/GPT, which means we need to reserve 1MB at the start of our disk for GRUB.

gdisk /dev/sda
o
y
n
1
2048
+1M
ef02

Swap

Still in gdisk we now create the swap partition (type 8200):

n
2
4096
+512M
8200

Filesystem

And the filesystem (type 8300):

n
3
1052672
209713151
8300

Confirm and write

Now it is time to make sure things look right before writing the partition table. Use gdisk’s print command p:

p

which should show your three partitions with the correct types:

Number  Start (sector)    End (sector)  Size       Code   Name
  1              2048           4095   1024.0 KiB  EF02   BIOS boot partition
  2              4096        1052671   512.0 MiB   8200   Linux swap
  3           1052672      209713151   99.5 GiB    8300   Linux filesystem
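
The sector numbers typed into gdisk are plain arithmetic. A small sketch, assuming 512-byte logical sectors (the variable and function names are made up for illustration), reproduces the boundaries in the table:

```shell
#!/bin/sh
# Sketch: reproduce the start/end sectors from the gdisk session,
# assuming 512-byte logical sectors (all names here are illustrative).
SECTOR=512
MIB=$((1024 * 1024))

# Number of sectors needed for a size given in MiB.
mib_to_sectors() {
    echo $(( $1 * MIB / SECTOR ))
}

bios_start=2048
bios_end=$(( bios_start + $(mib_to_sectors 1) - 1 ))
swap_start=$(( bios_end + 1 ))
swap_end=$(( swap_start + $(mib_to_sectors 512) - 1 ))
root_start=$(( swap_end + 1 ))

echo "BIOS boot: $bios_start-$bios_end"
echo "swap:      $swap_start-$swap_end"
echo "root from: $root_start"
```

Each partition starts one sector after the previous one ends, which is why the end sector is always start plus length minus one.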

And finally write to disk and exit:

w
y
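
If you want the partitioning to be repeatable, the interactive session could probably be collapsed into a single call to sgdisk, the scriptable sibling of gdisk from the same gptfdisk package. This is a sketch rather than what I ran: partition_disk is a made-up helper name, and partition 3 simply takes all remaining space, so its end sector will differ slightly from the one typed in above.

```shell
#!/bin/sh
# Sketch: the gdisk session above as one non-interactive sgdisk call.
# WARNING: this wipes the partition table of the disk you pass in.
# partition_disk is an illustrative helper name.
partition_disk() {
    disk=$1
    if [ -z "$disk" ]; then
        echo "usage: partition_disk /dev/sdX" >&2
        return 2
    fi
    # --new=NUM:START:END, where 0 means "first/last available sector".
    sgdisk --clear \
        --new=1:2048:+1M --typecode=1:ef02 \
        --new=2:0:+512M  --typecode=2:8200 \
        --new=3:0:0      --typecode=3:8300 \
        --print "$disk"
}

# Example (inside the installer): partition_disk /dev/sda
```

The function refuses to run without an explicit disk argument, so sourcing the file by itself changes nothing.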

Initialising the disk

Now we should have three partitions available as /dev/sda1, /dev/sda2 and /dev/sda3 - if for some reason they haven’t turned up, the simplest solution is to reboot your VM:

reboot

Formatting

Now it is time to format the partitions:

mkswap /dev/sda2
mkfs.ext4 /dev/sda3

Mount

And mount them ready for installation:

mount /dev/sda3 /mnt
swapon /dev/sda2

Install Arch

Three short commands prep for the installation process: essentially bootstrapping the package manager onto our newly mounted partition (/mnt), generating the fstab file for the partitions and finally chroot-ing onto the new filesystem for installing.

pacstrap -K /mnt base linux #linux-firmware
genfstab -U /mnt >> /mnt/etc/fstab
arch-chroot /mnt

Running pacstrap should take a few minutes, so another ☕.

Networking

We skipped installing linux-firmware with pacstrap above, in keeping with a minimalist install. However, this means that certain key packages for networking will be missing. Install those manually:

pacman -S dhcp

Editor

Install your favourite editor, such as vi or nano:

pacman -S nano

Timezone

ln -sf /usr/share/zoneinfo/Europe/Stockholm /etc/localtime

Where Europe and Stockholm can be swapped out with your own region and city.

Localization

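Uncomment the locales you need (for example en_US.UTF-8 UTF-8) in /etc/locale.gen, then generate them:

locale-gen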

Create a file called /etc/locale.conf using your favourite editor with:

LANG=en_US.UTF-8

And another file /etc/vconsole.conf with your keyboard keymap from way up above:

KEYMAP=apple-a1243-sv
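
Since an editor is one more package to install, the two files above could also be written straight from the shell. A minimal sketch, where write_locale_config is a helper name invented here and the first argument is the root of the target system (empty inside the chroot):

```shell
#!/bin/sh
# Sketch: write the two config files without opening an editor.
# write_locale_config is an illustrative helper; the first argument is the
# root of the target system ("" inside the chroot, /mnt from the installer).
write_locale_config() {
    root=$1
    lang=$2
    keymap=$3
    mkdir -p "$root/etc"
    printf 'LANG=%s\n' "$lang" > "$root/etc/locale.conf"
    printf 'KEYMAP=%s\n' "$keymap" > "$root/etc/vconsole.conf"
}

# Example (inside the chroot):
#   write_locale_config "" en_US.UTF-8 apple-a1243-sv
```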

Networking

Create the file /etc/hostname to define your new VM’s network name:

mycoolvm

And since we are networked up, let’s make sure we have a password on root:

passwd

Boot loader

This step is crucial - or we won’t be able to boot into our shiny new kernel. In keeping with the minimalist aim, let’s go GRUB on GPT:

pacman -S grub
grub-install /dev/sda
grub-mkconfig -o /boot/grub/grub.cfg

Reboot!

Now it is time to leave the chroot and reboot the system into your new Arch installation:

exit
shutdown -r now

Remember to press escape at the first boot menu in order to pick the virtual harddisk image to boot from.

Final Arch setup

Now that we have booted into our VM image, there are a few last steps to getting a usable (but minimal) environment.

Networking

Check that your network interface was found; it is most likely named ens3:

ip link

Create the wired network config file /etc/systemd/network/20-wired.network using your favourite editor, with the following content:

[Match]
Name=ens3

[Network]
DHCP=yes
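
The same file could also be generated from the shell, which helps if the interface name differs between machines. A sketch, assuming a made-up helper called write_wired_config; the Name= field of a systemd .network file accepts shell-style globs, so something like en* should match ens3 as well as other wired names.

```shell
#!/bin/sh
# Sketch: generate the wired network config from the shell.
# write_wired_config is an illustrative helper; pass the interface name
# (or a glob such as "en*") and optionally a root directory prefix.
write_wired_config() {
    iface=$1
    root=${2:-}
    mkdir -p "$root/etc/systemd/network"
    cat > "$root/etc/systemd/network/20-wired.network" <<EOF
[Match]
Name=$iface

[Network]
DHCP=yes
EOF
}

# Example: write_wired_config ens3
```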

And now activate networking and DNS resolution:

rm /etc/resolv.conf
ln -rsf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf
systemctl enable systemd-networkd.service
systemctl start systemd-networkd.service
systemctl enable systemd-resolved.service
systemctl start systemd-resolved.service

At this point you should have working networking which you can test with ping:

ping archlinux.org

Cleanup

Our goal was a minimal install - so now it is time to see how well we did:

df -H

which should reveal that we landed at circa 1.4GB of used space:

[Screenshot: df -H output]

So not exactly minimal - which raises the question, what can we trim?

There is a useful utility called ncdu that will help us explore the filesystem and see what is taking up space.

pacman -S ncdu
cd /
ncdu

[Screenshot: ncdu output for /]

So mostly files in /usr and /var.
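
If installing yet another package on a minimal system feels wrong, a rough equivalent of the ncdu overview can be had from du and sort, which are already there. A sketch, with biggest_dirs being a helper name of my own:

```shell
#!/bin/sh
# Sketch: list the largest entries directly under a directory, in KiB.
# biggest_dirs is an illustrative helper name.
biggest_dirs() {
    dir=${1:-/}
    count=${2:-10}
    # -x stays on one filesystem, -k reports KiB, -s summarises each entry.
    du -xks "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Example: biggest_dirs / 5
```

Run it again on whichever directory tops the list to drill down, much like navigating in ncdu.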

Cleaning /var

Digging deeper into /var shows that all the space is the cache for package management. We can safely wipe most of this - exit ncdu and run:

rm /var/cache/pacman/pkg/*

And now re-run ncdu:

[Screenshot: ncdu output after clearing the package cache]

Cleaning /usr

So far so good, that just leaves two offenders: /usr/lib and /usr/share. The latter is mostly locale data. Using ncdu, navigate down to /usr/share/locale and remove any locale folder you aren’t using (in my case only en_GB and en_US are needed) - pressing d on any folder will prompt you to delete it.
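
Deleting dozens of locale folders by hand gets tedious, so here is a sketch of doing it in a loop. prune_locales is a made-up helper; it deletes for good, so it is worth trying on a copy of the directory first.

```shell
#!/bin/sh
# Sketch: delete every locale directory except the ones listed.
# prune_locales is an illustrative helper; it removes files for good,
# so try it on a copy first.
prune_locales() {
    locale_dir=$1
    shift
    for path in "$locale_dir"/*/; do
        # Skip anything that is not a directory (and an unmatched glob).
        [ -d "$path" ] || continue
        name=$(basename "$path")
        keep=no
        for wanted in "$@"; do
            if [ "$name" = "$wanted" ]; then keep=yes; fi
        done
        if [ "$keep" = no ]; then rm -rf "$path"; fi
    done
}

# Example: prune_locales /usr/share/locale en_GB en_US
```

Plain files in the directory are left alone; only subdirectories not named on the command line are removed.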

Finally, digging into /usr/lib shows that the last large offender is the kernel modules for drivers: networking and media. Our VM only uses a single network driver (virtio_net for the virtio-net-pci device in the QEMU command above, or e1000 if you boot with QEMU’s default network card), so we should be able to safely remove the other drivers (especially the wireless ones).

Just be careful not to delete the drivers you need - here I could do some more experimentation, I’m guessing 99% of the kernel modules can go.
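
One way to find out which drivers you need is to ask the running VM which modules it has loaded. A small sketch (loaded_modules is a name I made up) that turns lsmod output into a sorted list to keep next to you while deleting. Treat the list as a lower bound only: a module that is not loaded right now may still be needed at boot or when hardware changes.

```shell
#!/bin/sh
# Sketch: turn `lsmod` output into a plain list of module names, i.e. the
# modules the running VM currently has loaded.
# loaded_modules is an illustrative helper name; it reads lsmod output on stdin.
loaded_modules() {
    # Skip the header line, keep the first column, sort for easy comparison.
    awk 'NR > 1 { print $1 }' | sort
}

# Example: lsmod | loaded_modules > /root/modules-in-use.txt
```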

Small fry

Again, more experimentation could be done here to get the build down even further: e.g. /usr/share/man is consuming 25MB - how often do you run man?

Checking the final result with:

df -H

now shows:

[Screenshot: df -H output after cleanup]

which means we are now landing under 1GB. For a clean Arch install this is probably about as good as we can get without chasing thousands of small files. But depending on how small you want to get you could spend more time here.

Final cleanup

Reboot to make sure your system is still (net)working!

The final task we can do is to get our sparse image as small as possible. But for that to work we need to zero out the free disk space (since sparse imaging is based on optimising for unused sectors).

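# Fill the free space on the root filesystem with zeros. dd stops with
# "No space left on device" once the disk is full, which is expected here.
# /tmp is a RAM-backed tmpfs on a default Arch install, so write to / instead.
dd if=/dev/zero of=/zeros bs=1M
rm /zeros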
sync
poweroff

Once we exit our VM and check the size of vdisk1, you’ll probably see that it is a bit bigger than 873MB - so we can do a final compression back to a minimalist sparse image using qemu-img convert:

qemu-img convert -O qcow2 vdisk1 vdisk1.minimal
gzip vdisk1.minimal
du -sm *

Now we should have a compressed sparse image landing in at circa 850MB - not great, but not bad either. This size VM could be shared without too much hassle.
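
A possible variation on the gzip step: qemu-img convert has a -c flag that compresses the qcow2 image itself, so the result can be booted directly without unpacking first. I have not measured how the size compares with the gzip route, so treat this as a sketch to experiment with; shrink_image is a helper name invented for this note.

```shell
#!/bin/sh
# Sketch: write a compressed qcow2 copy of an image in one step.
# shrink_image is an illustrative helper name.
shrink_image() {
    src=$1
    dst=$2
    if [ ! -f "$src" ]; then
        echo "no such image: $src" >&2
        return 1
    fi
    # -c compresses the output clusters; the result is still a usable qcow2.
    qemu-img convert -c -O qcow2 "$src" "$dst"
}

# Example: shrink_image vdisk1 vdisk1.compressed
```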


Conclusion

So 850MB compressed images can theoretically be distributed using GitHub releases - but if the goal is truly minimalist then we can do much better with another distro. For example Alpine Linux, or even programmatically with Buildroot: topics for future notes.

One idea for minimalist / shareable VMs was to create a series of VMs for training ethical hacking techniques. So why not take the Arch image above and install nginx, php and postgresql and make a fake website with SQL injection vulnerabilities to play around with: https://yonkov.github.io/post/how-to-make-a-website-from-scratch-using-php-and-mysql/

Thanks for liking and commenting!

#Arch #Linux #VM #QEMU #Minimalism