I’ve been doing a lot of work on pmbootstrap recently. It’s the backbone of postmarketOS, and in my opinion its extensive capabilities and ease of use are one of the hallmarks of our success as a community.
This is just a thread about the refactoring and some of the cool new features I’m proud of.
pmb has been around since the beginning: @ollieparanoid wrote the first 4k lines of it back in 2017, 7 years ago (almost to the month), and it’s been worked on ever since as the de facto way to get things done in the project.
If you haven’t heard of it before, it basically boils down all of the core
workflows that you need to contribute to pmOS into a nice simple command
line interface, requiring only a recent Python installation and sudo.
You can use it to…
pmbootstrap init
pmbootstrap build, including from your local git clone with --src=
pmbootstrap repo_bootstrap
pmbootstrap ci (run this in the pmaports git repo and see what happens)
pmbootstrap install
pmbootstrap build --envkernel
pmbootstrap flasher
pmbootstrap netboot (see Netboot)
… And much more, really it does do everything.
With our decision to adopt #systemd in postmarketOS, Ollie and I set about
trying to make it possible to properly handle multiple source repositories,
since we need packages like gnome-shell to be built for openrc AND systemd (and
we decided that having gnome-shell-systemd would bring too much complexity to
the packaging). However, it quickly became apparent that this wasn’t really
gonna fly – in its 7 years pmb has gained a whole lot of new features, while the
codebase itself has largely stayed the same. The end result is that every new
feature means more and more if/else blocks littered around the place to handle
every possibility, and little-to-no full abstractions.
I had previously done some fairly substantial hacking while first developing a systemd PoC, but it wasn’t really suitable for merging.
I decided (maybe a little impulsively) that I just didn’t want to fight it anymore. Something has to give.
From a bird’s-eye view, pmbootstrap is architected as follows:
There is a work directory (by default ~/.local/var/pmbootstrap) where
everything lives. Usually it looks something like this:
The chroot_* directories are the various root filesystems it uses via chroot.
These have a very standard naming scheme (native, buildroot_$ARCH,
rootfs_$DEVICE, etc… now documented in code under pmb/core/chroot.py). Every
function that needed to operate on a chroot (or called a function that needed
to operate on a chroot) accepted a suffix parameter, a string specifying which
chroot to work on (native, rootfs_qemu-amd64, etc).
Then to actually use the chroot root, one would simply build a path:
This is certainly a fine approach, it’s simple and gets the job done. The issues only arise here at scale:
These are not difficult questions to answer, and the codebase has many answers.
But it has different answers depending on where you look… And that’s really
the problem here: there is no way to abstract away the implementation details
of a chroot (which for the sake of simplicity is an object I have intentionally
left undefined).
There are functions to handle most of these things, and a recent venture to vastly improve the documentation in pmbootstrap (check out https://docs.postmarketos.org/pmbootstrap/), but I think this really calls for just a little OOP. Not /stateful/ OOP, but just a nice way to abstract away the implementation details and make it easy to find what you need.
To this end, there is now a nice, simple Chroot class (in pmb/core/chroot.py)
which encompasses all the common usage + some minimal runtime validation into a
nice little package. Yay!
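As a rough illustration of the idea (not the actual class from pmb/core/chroot.py), here is a minimal sketch of wrapping the old string suffix in a small immutable type. The names ChrootKind, kind, name and path() are assumptions made up for this example. Only the directory naming scheme (native, buildroot_$ARCH, rootfs_$DEVICE under chroot_*) comes from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class ChrootKind(Enum):
    """Kinds of chroot, following the naming scheme described in the post."""
    NATIVE = "native"
    BUILDROOT = "buildroot"
    ROOTFS = "rootfs"


@dataclass(frozen=True)
class Chroot:
    """Hypothetical stand-in for the real Chroot class."""
    kind: ChrootKind
    name: str = ""  # arch for buildroot_*, device for rootfs_*, empty for native

    def __post_init__(self) -> None:
        # Minimal runtime validation: native takes no name, the others need one.
        if self.kind is ChrootKind.NATIVE and self.name:
            raise ValueError("the native chroot does not take a name")
        if self.kind is not ChrootKind.NATIVE and not self.name:
            raise ValueError(f"a {self.kind.value} chroot needs a name")

    @property
    def suffix(self) -> str:
        """The old-style string suffix, e.g. 'rootfs_qemu-amd64'."""
        if self.kind is ChrootKind.NATIVE:
            return "native"
        return f"{self.kind.value}_{self.name}"

    def path(self, workdir: Path) -> Path:
        """Directory of this chroot inside the work directory."""
        return workdir / f"chroot_{self.suffix}"
```

With something like this, the knowledge of how a chroot maps to a directory lives in one place, and a typo such as an empty device name fails at construction time rather than deep inside a shell command.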
In much the same vein, many many parts of the codebase need to work with
architectures (like x86, x86_64, aarch64, riscv64 and such). This is
thankfully much simpler than the Chroot situation, and there are some great
utilities in pmb/parse/arch.py for converting from the Alpine/APKBUILD
representation of an arch to the kernel representation (aarch64 -> arm64
for example) + a few others.
This has also been refactored into pmb/core/arch.py, similarly offering some
convenience methods like .is_native() and runtime validation.
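A minimal sketch of what such a type could look like, assuming an Enum-based design (the real code in pmb/core/arch.py may be structured differently). The names from_str() and kernel() are made up for this example. The aarch64 -> arm64 pair comes from the text above. The riscv64 -> riscv entry is added here because that is the directory name the Linux kernel uses.

```python
import platform
from enum import Enum

# Alpine/APKBUILD name -> Linux kernel name, for the arches that differ.
_KERNEL_NAMES = {"aarch64": "arm64", "riscv64": "riscv"}


class Arch(Enum):
    """Hypothetical sketch of an architecture type with runtime validation."""
    x86 = "x86"
    x86_64 = "x86_64"
    aarch64 = "aarch64"
    riscv64 = "riscv64"

    @staticmethod
    def from_str(name: str) -> "Arch":
        """Validate a string at the boundary instead of passing it around."""
        try:
            return Arch(name)
        except ValueError:
            raise ValueError(f"unsupported architecture: {name!r}") from None

    def kernel(self) -> str:
        """Kernel representation of this arch, e.g. aarch64 -> arm64."""
        return _KERNEL_NAMES.get(self.value, self.value)

    def is_native(self) -> bool:
        # Simplification: compares against platform.machine() verbatim, which
        # reports e.g. "i686" rather than "x86" on 32-bit Intel hosts.
        return self.value == platform.machine()
```

A typo like "86_64" then fails loudly in from_str() instead of silently producing a path or package name that does not exist.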
Path building is a dangerous game, and getting it wrong can certainly cause
issues… During the early days of my refactor I decided to add calls to
input() before each shell command, just to be safe :D
Python’s pathlib is a little controversial I think, but honestly it makes life so much easier for us here. It does some very cheeky operator overloading so you can build paths with:
chroot_path = ...
repositories = chroot_path / "etc/apk/repositories"
This ensures you don’t accidentally miss a path divider, and if you’re like me,
you’ll also add compatible overloads for the / operator on any types where it
makes sense… Like remember the mark_in_chroot() example before? Here it is
now:
Perhaps it’s taking it too far, but the new Arch class also has support for the
/ operator, since building paths where the arch makes up a directory is pretty
common, so you can do:
arch = Arch.aarch64
apkindex = workdir / "packages" / channel / arch / "APKINDEX.tar.gz"
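For the curious, this kind of overload can be sketched with the \_\_truediv\_\_ and \_\_rtruediv\_\_ hooks. This is an illustration under assumptions rather than the real Arch class: it is trimmed to two members and uses PurePosixPath so the result is the same on any host. It relies on pathlib returning NotImplemented for operand types it doesn't know, at which point Python falls back to the right-hand operand's \_\_rtruediv\_\_.

```python
from enum import Enum
from pathlib import PurePosixPath


class Arch(Enum):
    """Trimmed-down, hypothetical Arch used only to show the / overloads."""
    aarch64 = "aarch64"
    x86_64 = "x86_64"

    def __str__(self) -> str:
        return self.value

    def __truediv__(self, other: object) -> PurePosixPath:
        # Handles: Arch / "APKINDEX.tar.gz"
        if isinstance(other, (str, PurePosixPath)):
            return PurePosixPath(self.value) / other
        return NotImplemented

    def __rtruediv__(self, other: object) -> PurePosixPath:
        # Handles: some_path / Arch. pathlib cannot join an Arch itself and
        # returns NotImplemented, so Python calls this method instead.
        if isinstance(other, PurePosixPath):
            return other / self.value
        if isinstance(other, str):
            return PurePosixPath(other) / self.value
        return NotImplemented


workdir = PurePosixPath("/work")  # made-up work directory
channel = "edge"
apkindex = workdir / "packages" / channel / Arch.aarch64 / "APKINDEX.tar.gz"
```

The arch never has to be stringified by hand at the call site, which is exactly the sort of place where a stray str() or missing divider tends to sneak in.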
What is args? All at once now:
args is an argparse Namespace object
args is the pmbootstrap.cfg config data
args has the parsed deviceinfo data
Hmm, I see…
In and of itself, args is /fine/, we have almost 100 lines of argparse configuration in pmb (that’s a whole other topic…), and we need to deal with these flags sometimes deep inside the call stack. As a result of this, and for the sake of simplicity, args is passed to almost every single function in the entire codebase.
The problem with args (and really with argparse) is that it doesn’t have type hinting. At all. Zilch. You just have to… know… what’s there.
Here’s some stats from when I started this project (pre any refactoring):
# Args has 138 properties
$ rg --vimgrep "((^|\s)args\.\w+)" --only-matching | cut -d"." -f3 | sort | uniq | wc -l
138
# It's referenced over 2000 times (including comments)
$ rg "args" pmb | wc -l
2471
# With 900 of them being usage (ignoring any getattr() magic)
$ rg "args\." pmb | wc -l
897
# And it's the first parameter to over 350 functions
$ rg "^def.+\(args" pmb | wc -l
368
These are obviously pretty fuzzy stats, relying on some common usage patterns like args always being the first parameter to the functions where it’s used (consistency with this stuff is definitely a strong point in pmbootstrap).
With the current state of my refactor, here’s the new stats:
# Down 29 properties
$ rg --vimgrep "((^|\s)args\.\w+)" --only-matching | cut -d"." -f3 | sort | uniq | wc -l
109
# Wayyy down to <800 references
$ rg "args" pmb | wc -l
797
# Down to <400 uses
$ rg "args\." pmb | wc -l
359
# And less than 100 functions use it
$ rg "^def.+\(args" pmb | wc -l
95
Most of the places it’s still used are for the more involved things like building the rootfs image or handling user configuration/prompts. The goal isn’t to get these numbers to 0 – we need to pass arguments around – but certainly not everywhere :D.
Worth noting, the refactor makes almost no changes to the fundamental layout of pmbootstrap. Very few functions are removed or added, so it’s pretty fair to compare uses in function calls directly.
Args has definitely been the biggest pet peeve of mine, and it’s great to be able to start using pmbootstrap’s internal machinery without having to initialize it.
I have introduced a new Context type and (ehhhhhhh) global state, so you can
call get_context() from anywhere and then do what you gotta do. This is
fiiiiine but a total PITA to mock for pytest, and not the makings of a good API.
My hope here is that we eventually replace this with per-module context as we
start to pick apart the API and build more concrete abstractions. At the very least it limits what stuff is available.
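The pattern looks roughly like the sketch below. Only the names Context and get_context() come from the post; the fields, set_context() and the error handling are assumptions for illustration, not pmbootstrap's actual implementation.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Context:
    """Typed replacement for the handful of values that used to ride on args."""
    work: Path            # hypothetical field: the work directory
    quiet: bool = False   # hypothetical field: a command line flag


# Module-level global state. Simple to use, awkward to mock in tests.
_context: Optional[Context] = None


def set_context(context: Context) -> None:
    """Install the context once, early in program startup."""
    global _context
    if _context is not None:
        raise RuntimeError("context has already been set")
    _context = context


def get_context() -> Context:
    """Fetch the context from anywhere, without threading it through calls."""
    if _context is None:
        raise RuntimeError("context has not been set yet")
    return _context
```

Unlike an argparse Namespace, a type checker and an editor know exactly which fields exist here, so get_context().wrok is caught before the code ever runs.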
Pmbootstrap is capable of building packages, and maintaining your local binary repo (so you can build a device image out of the packages you built!). This means it has to deal with both the source repository “format” (pmaports in this case) and the binary repos (apk’s APKINDEX flat database). As with everything else, this is done very imperatively and on demand.
The pmaports git repo contains all of the package sources for postmarketOS.
These take the form of an APKBUILD file (the Alpine package build configuration)
+ extra sources, patches, etc. This repo is usually managed by pmbootstrap
for you (try pmbootstrap pull and pmbootstrap status) since as a first time
contributor you ought not have to deal with a bunch of manual setup and configs.
It’s cloned to ~/.local/share/pmbootstrap/cache_git/pmaports, I have a symlink
in ~/pmos/pmaports (and a pma alias in my shell to jump there).
Since we support both the rolling edge branch and stable releases, there is a
pmaports.cfg file in the repo which describes (among other things) which
branch of pmaports you should be on (the channel). Pmbootstrap will make
sure you’re on the right channel (the one you picked during pmbootstrap init)
before letting you do stuff, and it will put the packages you build into
a different local binary repo as appropriate.
Unfortunately, the systemd repository kinda breaks all of these assumptions. I
originally set up a new package repo when doing bringup which required teaching
pmbootstrap to support multiple source repositories. We have since decided to
keep it in the same git repository as this makes the maintainership much simpler
(so we don’t have to reinvent the repo tool).
But, as I alluded to above, we still more or less need to treat these as
different repositories. And as a result, the pkgrepo API is born:
In addition to handling multiple source repos, to properly integrate systemd, we need to handle multiple binary repos too (since the systemd variants of packages in the systemd repo must take precedence over the non-systemd ones).
This also means handling the separate mirror URL (now handled via pmbootstrap
config mirrors) to operate depending on which repos are enabled. To get to the
point, this is how it all works now:
If you’re thinking “but why is this important”, I’ll try to answer: I think friction is often the biggest hurdle to getting people to try something new. I mean, capitalism certainly leans into this ideology pretty heavily. So while I don’t think we should try to simplify our UX to a dangerously slippery shine, we’ve always aimed for a good balance with postmarketOS.
Our tooling should be easy enough that anyone with an old phone and a spare weekend can get tangible results (something I feel safe in saying is a situation thousands of people who have passed through our issue tracker or matrix rooms have found themselves in). But we can’t hide away all the complexity since it would become totally unmaintainable, and unusable for developers who are trying to customise things to suit their workflow.
Given just how far we’ve come with pmbootstrap in its current state, I feel pretty excited about how much an additional layer of abstraction will do for us – that’s a whole new layer for folks to play with and build upon.
According to the diffstat:
121 files changed, 6592 insertions(+), 4297 deletions(-)
The vast majority of this is just stripping out args, migrating over to the new
Chroot and Arch types, and adopting pathlib Paths. The way that the config file is parsed and used has also been redone to use a proper Config type now (rather than merging it into args), with support for more than just a flat key/value store (for config.mirrors).
We have a strict no-dependencies rule in pmbootstrap, which is definitely a bit annoying when it comes to (de)serializing config files, but overall I’m pretty happy with the new solution and how it handles choice-type config options in a generic way with enums:
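The general idea can be sketched with nothing but the standard library, in keeping with the no-dependencies rule. This is not the code from the MR: the Verbosity option, the field names and the [pmbootstrap] section name are all assumptions picked for the example. The point it shows is that any field whose default is an Enum member automatically accepts exactly that enum's values, with no per-option parsing code.

```python
import configparser
from dataclasses import dataclass, fields
from enum import Enum


class Verbosity(Enum):
    """Hypothetical choice-type option."""
    QUIET = "quiet"
    NORMAL = "normal"
    VERBOSE = "verbose"


@dataclass
class Config:
    """Hypothetical typed config; defaults double as the type information."""
    device: str = "qemu-amd64"
    jobs: int = 1
    verbosity: Verbosity = Verbosity.NORMAL


def load_config(text: str) -> Config:
    """Parse INI text into a Config, converting values by field type."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["pmbootstrap"] if parser.has_section("pmbootstrap") else {}
    config = Config()
    for field in fields(Config):
        if field.name not in section:
            continue  # keep the default
        raw = section[field.name]
        default = getattr(config, field.name)
        if isinstance(default, Enum):
            # Generic handling: look the raw string up in the field's enum.
            try:
                value = type(default)(raw)
            except ValueError:
                choices = ", ".join(member.value for member in type(default))
                raise ValueError(
                    f"{field.name}: {raw!r} is not one of: {choices}"
                ) from None
        elif isinstance(default, int):
            value = int(raw)
        else:
            value = raw
        setattr(config, field.name, value)
    return config
```

Adding a new choice option in this scheme means declaring an Enum and one dataclass field, and the list of valid choices for prompts or error messages falls out of the enum for free.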
And of course, it wouldn’t be a refactor without a bunch of new features…
The package building logic has been largely rewritten, it used to recursively call itself to build package dependencies (if needed), which was probably not great for performance and certainly left a lot to be desired in the UX (since we never actually knew how many packages we needed to build).
The new approach descends into package dependencies and queues everything up, then tells you what’s up before actually building everything, yay.
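A minimal sketch of that queue-first approach, assuming dependencies are available as a plain dict (the package names below are made up, and the real implementation also has to decide which packages actually need rebuilding). It walks the dependency graph depth-first and emits each package after its dependencies, so the full list is known before any build starts.

```python
from typing import Dict, List, Set


def build_queue(package: str, depends: Dict[str, List[str]]) -> List[str]:
    """Return packages in build order, dependencies before dependents."""
    queue: List[str] = []
    done: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(pkg: str) -> None:
        if pkg in done:
            return
        if pkg in in_progress:
            raise ValueError(f"dependency cycle involving {pkg}")
        in_progress.add(pkg)
        for dep in depends.get(pkg, []):
            visit(dep)
        in_progress.discard(pkg)
        done.add(pkg)
        queue.append(pkg)  # all dependencies are already queued at this point

    visit(package)
    return queue


# Hypothetical dependency data, just for illustration.
depends = {"app": ["libfoo", "libbar"], "libfoo": ["libbar"], "libbar": []}
queue = build_queue("app", depends)
print(f"Building {len(queue)} packages: {', '.join(queue)}")
```

Because the queue exists up front, the tool can print a summary (or bail out on a cycle) before spending any time compiling.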
Here it is in action building packages for a feature branch automatically as part of building a rootfs image:
You now get a nice summary of what’s happening before it happens. The log output could definitely be further improved, so MRs welcome!
Alpine’s abuild tool is pretty strict when it comes to package sources,
everything has a sha512sum which is verified, and if it doesn’t match then
your package won’t be built. This is pretty important, especially for a public
binary repo. But when you’re just hacking on something it’s not the most ideal
thing.
Since abuild is written in sh, and sources the APKBUILD file as part of the
build process, we can pretty easily override any part of it just by making
pmbootstrap append an override to the APKBUILD for the package we’re building,
something it now does for the checksum verify() function (unless you build
with --strict). Now instead of a failure, the default behaviour will just be a
warning:
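The override mechanism itself is simple enough to sketch. The helper below is hypothetical, and so is the shell text it appends: it is a simplified stand-in that skips verification with a warning, not the override pmbootstrap actually writes. It relies only on what is described above, namely that abuild sources the APKBUILD, so a function defined at the end of that file replaces the earlier definition.

```python
from pathlib import Path

# Illustrative shell snippet, not pmbootstrap's real override text.
VERIFY_OVERRIDE = """
# --- appended by the build tool (illustrative) ---
verify() {
	echo "WARNING: checksum verification skipped for local build" >&2
	return 0
}
"""


def append_verify_override(apkbuild: Path, strict: bool = False) -> bool:
    """Append the verify() override unless a strict build was requested.

    Returns True if the APKBUILD copy was modified.
    """
    if strict:
        return False
    with apkbuild.open("a", encoding="utf-8") as handle:
        handle.write(VERIFY_OVERRIDE)
    return True
```

In practice this would be applied to the temporary copy of the APKBUILD used for the build, never to the file tracked in pmaports.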
For a while now pmbootstrap has supported building packages with the
--src="$PWD" flag. This does ungodly things to the APKBUILD for the package
you specify, teaching it to copy the package sources from the directory you
specified and run the build function against them.
This worked using a combination of bindmounts and rsync (rsync running inside
the chroot to avoid relying on host tools…). And since abuild always cleans up
build artifacts when it’s done (and rsync is called with --delete), this
feature, while super neat, had kinda limited utility.
This has now all been adjusted so that build artifacts are kept around. And thanks to how rsync works, back to back builds of a package will now have incremental builds!
pmbootstrap chroot --image
Last on my list, and to finish up this post, is the new --image flag to
pmbootstrap chroot. This is definitely on the hackier side of things, but in
short it allows you to access a rootfs image you built just as you would any
other chroot. This is especially useful for testing in QEMU since the image can
be directly booted there. It’s been super useful for me while working on changes
to the initramfs. Here’s a quick demo of this workflow:
I also have to say, pmbootstrap qemu is really such a useful tool. I have
generally avoided it (after all, why run pmOS if not on a phone :P) due to a few
rough edges, but I’m coming around to it (and fixing them).
Pmbootstrap is a fairly bespoke tool, it isn’t just used by our users to build images, it’s used by device maintainers, kernel developers, our CI, and even https://build.postmarketos.org/.
Tools like mkosi are great for developing software and building customised images of many different distros for your specific software/hardware configuration, on the other hand pmbootstrap is a plug-and-play tool for building one specific distro for lots of different hardware.
Nothing really comes close to providing a replacement.
With that being said, mkosi and friends (systemd-repart, bubblewrap, system
extensions, etc) are all doing a lot of very interesting things. And we at
postmarketOS have some catching up to do (don’t get me started on mount
namespaces!).
While other distros generally have similar capabilities, I think pmbootstrap is pretty unique in the scope that it covers, and that it doesn’t require you to be running the distro you’re building for or manually set up a chroot (ahem Debian).
It handles cross compiling and building images for other architectures (though the build system will run in QEMU, this is something to improve on), and is very willing to hold your hand.
All of the changes I talked about here are a part of this MR, it also includes a summary of changes. It isn’t merged yet at the time of writing (but hopefully will be very soon), but testing and feedback would be appreciated!
If you’ve never used pmbootstrap before but find this kinda stuff interesting, I’d totally recommend giving it a whirl. You can get booted up to GNOME shell in QEMU just like this:
git clone https://gitlab.com/postmarketOS/pmbootstrap.git -b caleb/typed-suffix
mkdir -p ~/.local/bin; ln -sf "$PWD/pmbootstrap/pmbootstrap.py" ~/.local/bin/pmbootstrap
pmbootstrap init # Whack enter for every option, or adjust to taste
pmbootstrap config ui gnome
pmbootstrap install --password 147147
pmbootstrap qemu --efi
Exit with ctrl+C (or ctrl+A x if you set pmbootstrap config qemu_redir_stdio).
Then you can, for example, adjust the kernel cmdline to disable the bootsplash by entering the image:
pmbootstrap chroot --image
And uncommenting the deviceinfo_kernel_cmdline_append line in
/etc/deviceinfo. Run mkinitfs to re-generate the systemd-boot config, exit
with ctrl+D and boot up QEMU again.