Handbook · Chapter 2 of 12
Anatomy of the image repo
A bootc distro is, at its core, one git repo that produces one OCI image. For Margine that repo is margine-image. This chapter walks its layout, the Containerfile, the staged build scripts, and the build-time write rules that bootc/ostree impose.
2.1 Lineage: ublue-os/image-template
Margine descends from the Universal Blue image-template pattern (github.com/ublue-os/image-template), the same skeleton behind Bluefin, Bazzite and Aurora customizations. The contract is minimal:
- a Containerfile whose FROM is an existing bootc base image;
- a build_files/ directory holding everything needed during the build but not wanted inside the final image;
- a single RUN invocation (or a few) that bind-mounts build_files/ and runs a build.sh;
- a final lint that proves the result is still a valid bootc container;
- CI that builds, signs and pushes to a registry on every commit.
Margine credits this explicitly (/var/home/daniel/dev/margine-image/README.md):
- Bluefin — base image and source of most of what Margine ships.
- Universal Blue — image-template, CI patterns, `uupd`.
- Origami Linux — reference for the MOK-signing kernel script.
- hhd-dev/rechunk — ostree rechunking action.
Repo top level:
margine-image/
├── Containerfile # the whole OS definition
├── build_files/ # build-time scripts + system_files overlay
├── installer/ # Anaconda installer-image context (Flatpak BAKE)
├── disk_config/ # bootc-image-builder TOML (qcow2, anaconda-iso)
├── live-env/ # Titanoboa live-ISO layer
├── docs/ # repo-local postmortems and plans
└── .github/workflows/ # build, disk, smoke-boot, ISO publish
installer/, disk_config/ and live-env/ are consumed by later chapters; everything that defines the booted OS lives in Containerfile + build_files/.
2.2 The Containerfile, stage by stage
The ctx scratch stage: build inputs that never ship
# /var/home/daniel/dev/margine-image/Containerfile
# ----- Build context: scripts that should NOT end up in the final image -----
FROM scratch AS ctx
COPY build_files /
# Make installer/flatpaks-base reachable from build.sh at
# /ctx/installer-flatpaks-base. Single source of truth for the BAKE
# Flatpak list (audit §3.5: drop the duplicate here-doc in build.sh).
COPY installer/flatpaks-base /installer-flatpaks-base
Practical effect: scripts live in a throwaway scratch stage and reach the real build only through an ephemeral --mount=type=bind. Nothing in build_files/ can leak into a shipped layer, and editing a script does not invalidate the base layer cache. The extra COPY installer/flatpaks-base makes one file the single source of truth for both the OCI image and the Anaconda installer (chapter on ISOs).
FROM bluefin-dx and pinning
# /var/home/daniel/dev/margine-image/Containerfile
# ----- Base: Bluefin DX (Fedora 44 track, "stable" tag) -----
FROM ghcr.io/ublue-os/bluefin-dx:stable
Margine pins to the floating :stable tag, not a digest. Trade-off: every weekly rebuild silently absorbs whatever Bluefin shipped (good: free maintenance of GNOME, drivers, dev tooling; bad: an upstream regression lands without a diff to review). The mitigations are downstream: a CI asset validator and a QEMU smoke-boot gate must pass before anything is promoted to Margine's own :stable (chapter on CI). The stricter alternative — digest pinning with Renovate/Dependabot bump PRs — is what several uBlue community images do; it buys reviewability at the cost of merge churn.
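The digest-pinned alternative amounts to rewriting one FROM line whenever upstream publishes a new image. The following is a minimal sketch of that rewrite, not something margine-image ships: the function name pin_base_image is hypothetical, and the digest is assumed to be supplied by the caller (in practice a bump bot or a registry lookup would provide it).

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a floating-tag FROM line into a digest pin.
# Usage: pin_base_image <containerfile> <image:tag> <sha256:digest>
pin_base_image() {
    local file="$1" ref="$2" digest="$3"
    local repo="${ref%:*}"      # drop the tag, e.g. ghcr.io/ublue-os/bluefin-dx

    # Refuse anything that does not look like a sha256 digest.
    [[ "$digest" =~ ^sha256:[0-9a-f]{64}$ ]] || {
        echo "bad digest: $digest" >&2
        return 1
    }

    # Literal (non-regex) match on the exact "FROM <image:tag>" line,
    # so other stages such as "FROM scratch AS ctx" stay untouched.
    awk -v ref="$ref" -v pinned="${repo}@${digest}" '
        $0 == "FROM " ref { print "FROM " pinned; next }
        { print }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Each bump then shows up as a one-line diff in the Containerfile, which is the reviewability the paragraph above refers to.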
RUN --mount anatomy
Each build stage uses the same mount set:
# /var/home/daniel/dev/margine-image/Containerfile
RUN --mount=type=bind,from=ctx,source=/,target=/ctx \
--mount=type=cache,dst=/var/cache \
--mount=type=cache,dst=/var/log \
--mount=type=tmpfs,dst=/tmp \
--mount=type=secret,id=mok-key,target=/tmp/certs/MOK.key \
--mount=type=secret,id=mok-cert,target=/tmp/certs/MOK.pem \
/ctx/custom-kernel/install.sh
- type=bind,from=ctx: scripts visible at /ctx, gone after the RUN.
- type=cache on /var/cache and /var/log: dnf metadata and logs persist across builds but never enter a layer. This doubles as a guard: anything written there cannot ship, which is exactly what ostree wants for /var (see §2.5).
- type=tmpfs on /tmp: scratch space, guaranteed empty in the image.
- type=secret: the MOK private key, certificate and enrollment password exist only for the duration of this one RUN. No COPY of key material, no credentials in layer history.
The four RUN stages
1. /ctx/custom-kernel/install.sh: swap the Fedora kernel for CachyOS from COPR, sign vmlinuz + every module with the MOK secrets, rebuild the initramfs (chapter 3).
2. /ctx/build.sh: the orchestrator over all numbered NN-*/install.sh stages (§2.3).
3. /ctx/build-margine-extensions.sh: bake GNOME Shell extensions system-wide into /usr/share/gnome-shell/extensions/. The Containerfile comment records why this is a separate stage: it replaces a racy per-user first-login installer, copying the Bluefin/Bazzite practice of build-time system-wide extensions.
4. bootc container lint: final validation (§2.6).
Stage granularity matters for iteration speed: a change to a GNOME default re-runs stages 2-4 but reuses the cached (expensive, COPR-fetching, module-signing) kernel layer.
2.3 The build orchestrator and numbered stages
build.sh is deliberately boring — a 1416-line monolith was decomposed into per-area scripts (documented in /var/home/daniel/dev/margine-image/docs/build-sh-decomposition.md):
# /var/home/daniel/dev/margine-image/build_files/build.sh
set -euo pipefail
. /ctx/00-common.sh
log "==== Margine build orchestrator: starting ===="
# Run every sub-script in lexicographic order. Globs expand
# deterministically because we name dirs <NN>-<area>.
for d in /ctx/[1-9][0-9]-*/install.sh; do
log "==> running $d"
bash "$d"
done
Practical effect: adding a build concern = adding a directory. Ordering is encoded in the name, the glob is deterministic, and set -euo pipefail plus bash "$d" (not source) means one failing stage kills the build without leaking state into the next.
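The ordering and fail-fast behaviour can be tried outside a container build. The sketch below is a hypothetical harness (run_stages is not a function from the repo) that applies the same glob to a throwaway directory instead of /ctx; the 00-common directory exists only to show that the [1-9][0-9] pattern skips names starting with 0.

```shell
#!/usr/bin/env bash
# Hypothetical harness: run NN-*/install.sh stages under a given root,
# in the same glob order build.sh uses, stopping at the first failure.
run_stages() {
    local root="$1" d
    for d in "$root"/[1-9][0-9]-*/install.sh; do
        [[ -e "$d" ]] || continue       # the glob matched nothing
        echo "==> running ${d#"$root"/}"
        bash "$d" || return 1           # child process: no state leaks into the next stage
    done
}

# Demo: stage dirs created out of order, plus one the glob must skip.
demo="$(mktemp -d)"
for name in 20-flatpaks 10-os-identity 60-ujust-services 00-common; do
    mkdir -p "$demo/$name"
    printf 'echo "stage %s"\n' "$name" > "$demo/$name/install.sh"
done
run_stages "$demo"
```

The stages run as 10, 20, 60 regardless of creation order, and replacing any install.sh with a failing script stops the loop before later stages start.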
Shared state lives in one sourced file:
# /var/home/daniel/dev/margine-image/build_files/00-common.sh
log() { printf '[margine-build] %s\n' "$*"; }
# retry_curl <url> <out> — 5 attempts, 30-150s backoff (COPR/raw.githubusercontent brownouts)
# retry_curl_strict <url> <out> — same, but aborts the build on missing/empty asset
export FEDORA_VER="${FEDORA_VER:-$(rpm -E %fedora 2>/dev/null || echo 44)}"
export BUILD_DATE="${BUILD_DATE:-$(date -u +%Y%m%d)}"
export MARGINE_REPO="${MARGINE_REPO:-https://raw.githubusercontent.com/daniel-g-carrasco/margine-fedora-atomic}"
export MARGINE_REF="${MARGINE_REF:-main}"
retry_curl_strict exists because a silently-failed asset download shipped user-visible regressions twice (missing welcome logo, missing About-panel logo); for assets the image is broken without, fail-loud beats a quiet placeholder.
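The bodies of retry_curl and retry_curl_strict are not reproduced in this chapter, so the following is only a sketch of the fail-loud idea under stated assumptions. The names fetch_strict, FETCH, RETRY_ATTEMPTS and RETRY_SLEEP are hypothetical, and the linear backoff schedule is an illustration rather than the repo's actual timing.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a strict retrying download.
# FETCH, RETRY_ATTEMPTS and RETRY_SLEEP are illustrative knobs,
# not variables defined by 00-common.sh.
FETCH="${FETCH:-curl -fsSL -o}"          # invoked as: $FETCH <out> <url>
RETRY_ATTEMPTS="${RETRY_ATTEMPTS:-5}"
RETRY_SLEEP="${RETRY_SLEEP:-30}"         # base delay in seconds (assumed linear growth)

fetch_strict() {
    local url="$1" out="$2" i
    for ((i = 1; i <= RETRY_ATTEMPTS; i++)); do
        # Success needs both a zero exit status and a non-empty file.
        if $FETCH "$out" "$url" && [[ -s "$out" ]]; then
            return 0
        fi
        echo "attempt $i/$RETRY_ATTEMPTS failed for $url" >&2
        (( i < RETRY_ATTEMPTS )) && sleep $(( RETRY_SLEEP * i ))
    done
    echo "FATAL: missing or empty asset: $url" >&2
    return 1    # with `set -e` in the calling stage, this aborts the build
}
```

The non-empty check is the part that matters for the regressions described above: a download that exits 0 but writes an empty file is treated exactly like a failed one.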
The stages:
| Dir | Concern |
|---|---|
| 10-os-identity/ | os-release rewrite, /etc/passwd+/etc/group factory seed, system_files/ overlay copy |
| 20-flatpaks/ | BAKE list → /usr/share/margine/, DEFER list → /usr/share/flatpak/preinstall.d/ |
| 30-gnome-defaults/ | zz1-margine.gschema.override (10 enabled extensions, favorites, accent), dconf keyfiles in /etc/dconf/db/distro.d/ |
| 40-spec-scripts/ | fetch configure-*/validate-* helpers + declarations.yaml from the spec repo into /usr/bin |
| 45-wsf/ | build wayland-scroll-factor, install LD_PRELOAD drop-in for org.gnome.Shell@.service |
| 50-branding/ | logo, wallpaper, Plymouth theme, offline docs, GDM background, strip Bluefin branding |
| 60-ujust-services/ | 60-custom.just recipes, mask systemd-remount-fs, skel defaults |
The boot-time passwd re-seed unit, staleness/upgrade notifiers, and first-boot autostarts no longer have a build stage of their own: their payloads ship as tracked files under build_files/system_files/ (libexec scripts + systemd units), copied wholesale into the rootfs by stage 10-os-identity through the system_files overlay described in §2.4.
One detail in 60-ujust-services generalizes to any Bluefin derivative: the recipe file must be named 60-custom.just.
# /var/home/daniel/dev/margine-image/build_files/60-ujust-services/install.sh
# Bluefin's /usr/share/ublue-os/just/00-entry.just hardcodes the list
# of imported recipe files. The ONLY one declared as optional is
# 60-custom.just (via `import?`) — that's the documented extension
# point for downstream distros. Files dropped under any other name
# (e.g. 99-margine.just) are simply ignored by `ujust --list`.
install -Dm0644 /ctx/60-custom.just /usr/share/ublue-os/just/60-custom.just
2.4 The system_files/ overlay
Static files (units, libexec scripts, tuned profiles, icons, autostart entries) do not get heredoc'd in scripts — they live under build_files/system_files/ in a tree that mirrors their final path, and stage 10 overlays the whole thing onto /:
# /var/home/daniel/dev/margine-image/build_files/10-os-identity/install.sh
# The whole tree gets rsync'd into the rootfs at "/" so file paths in
# the repo mirror their final installed location. Same pattern as
# Bluefin's system_files/shared/.
if [[ -d /ctx/system_files ]]; then
log "Copying /ctx/system_files/ → / (overlaying base rootfs)"
cp -a /ctx/system_files/. /
# Set executable bit on libexec scripts (cp -a preserves mode but
# git may have flagged them differently across platforms).
find /usr/libexec /usr/bin -type f \( \
-path '*/margine-*' -o \
-path '/usr/libexec/margine/*' \
\) -exec chmod 0755 {} \;
fi
Practical effect: git log build_files/system_files/usr/lib/systemd/system/margine-docs-refresh.service is the change history of that exact file on disk. The current tree ships almost exclusively into /usr (units in /usr/lib/systemd/system/, scripts in /usr/libexec/margine/, tuned profiles in /usr/lib/tuned/profiles/), plus one /etc/xdg/autostart entry — consistent with the write rules below.
Stage 10 also rewrites OS identity. The non-obvious part is which fields a derivative may change:
# /var/home/daniel/dev/margine-image/build_files/10-os-identity/install.sh
NAME="Margine"
ID=fedora # bootc-image-builder fails "could not find def file for
ID_LIKE=bluefin # distro margine-44" if ID=margine; BIB does NOT fall
VARIANT_ID=margine # back to ID_LIKE. Discriminate on VARIANT_ID instead.
...
printf '%s\n' "$OS_RELEASE_CONTENT" > /usr/lib/os-release
ln -sf ../usr/lib/os-release /etc/os-release # canonical Fedora layout
NAME/PRETTY_NAME/VARIANT* are the branding surface; ID is an ecosystem contract (tooling does exact ID-VERSION_ID lookups). Fedora's own spins (Silverblue, Kinoite) follow the identical ID=fedora + distinct VARIANT_ID pattern.
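For scripts that need to tell Margine apart from other Fedora-based systems, the consequence is to test VARIANT_ID rather than ID. A minimal sketch follows, assuming only the field values shown above; the helper name is_margine is hypothetical and not part of the image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an os-release file describes Margine.
# The file is sourced in a subshell so its variables do not leak to the caller.
is_margine() {
    local file="${1:-/etc/os-release}"
    (
        ID="" VARIANT_ID=""
        # os-release is specified as shell-compatible KEY=value assignments.
        . "$file" || exit 2
        # ID stays "fedora" for tooling; the derivative is identified by VARIANT_ID.
        [[ "$ID" == "fedora" && "$VARIANT_ID" == "margine" ]]
    )
}
```

A caller would use it as `if is_margine; then ...; fi`. A Silverblue os-release (ID=fedora, VARIANT_ID=silverblue) makes the function return non-zero.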
Lesson: os-release symlink vs switch-root.
Symptom: first VM boots failed with "Failed to switch root: ... os-release file is missing", despite the file existing in the deployment.
Root cause: two stacked issues. The initramfs lacked the ostree dracut module (so /sysroot was never pivoted to the deployment view), and the image pushed by plain buildah was not ostree-canonical, so composefs was not mounted over /usr when systemd's switch-root check did openat(fd, "etc/os-release", O_NOFOLLOW); the /etc/os-release → ../usr/lib/os-release symlink dangled.
Fix: short-term, ship os-release as a regular file in both places ("Fix A"); proper fix ("Fix B"), add dracut --add ostree in the kernel stage and wire hhd-dev/rechunk into CI so the published image is re-committed in ostree-canonical form, after which the canonical symlink was restored (the ln -sf above). Full writeups: margine-fedora-atomic/docs/lessons-learned/2026-05-28-initramfs-and-bootc-labels.md and .../2026-06-03-rechunk-and-fixb.md.
2.5 What may write where at build time
The rule set every script in this repo obeys:
- /usr: yes. The immutable payload. Binaries, units, schemas, extensions, kernels (/usr/lib/modules/<kver>/vmlinuz + initramfs.img), even the passwd factory (/usr/lib/passwd).
- /etc: yes, but it becomes the factory. At commit/rechunk time /etc content is captured as /usr/etc; on each deployment ostree 3-way-merges it with the machine's live /etc. Writes here are defaults, not state.
- /var: no. /var is machine-local and reset/merged per deployment; content baked into it is dead weight at best and a lint error at worst. The Containerfile makes this structural: /var/cache and /var/log are cache mounts, so dnf can do its job without the result ever entering a layer.
- /tmp: tmpfs mount, guaranteed not to ship.
- /opt, /usr/local: symlinks into /var on Fedora/ostree; same prohibition applies.
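These rules can be checked before a build even starts, for example against an overlay tree like build_files/system_files/. The sketch below is a hypothetical pre-flight check (check_overlay_paths is not a script from the repo, and it is far weaker than bootc container lint): it flags every file whose mirrored path would land outside /usr and /etc.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for an overlay tree that mirrors the rootfs.
# Prints every entry that violates the write rules; returns 1 if any exist.
check_overlay_paths() {
    local tree="$1" path rel bad=0
    while IFS= read -r -d '' path; do
        rel="/${path#"$tree"/}"     # path as it would appear in the image
        case "$rel" in
            /usr/local/*)   echo "forbidden (symlink into /var): $rel"; bad=1 ;;
            /usr/*|/etc/*)  ;;      # payload or factory defaults: allowed
            *)              echo "forbidden: $rel"; bad=1 ;;   # /var, /opt, /tmp, ...
        esac
    done < <(find "$tree" \( -type f -o -type l \) -print0)
    return "$bad"
}
```

Run as `check_overlay_paths build_files/system_files` from the repo root (path taken from the layout above), it prints nothing and returns 0 for a tree that only targets /usr and /etc.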
Some tooling assumes a writable, persistent /var and has to be tricked. akmods is the canonical offender:
# /var/home/daniel/dev/margine-image/build_files/custom-kernel/install.sh
# akmodsbuild on bootc images skips signing if /var isn't writable; patch
# it out so akmods proceeds inside the container build.
disable_akmodsbuild() {
_ak="/usr/sbin/akmodsbuild"
cp -p "$_ak" "$_ak.backup"
sed '/if \[\[ -w \/var \]\] ; then/,/fi/d' "$_ak" > "$_ak.tmp"
mv "$_ak.tmp" "$_ak"
chmod +x "$_ak"
}
The patch is reverted (restore_akmodsbuild) before the layer is committed — temporary mutations of /usr must be cleaned up by the same script that made them.
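The patch-then-restore discipline generalizes to any temporary mutation of /usr. The following sketch wraps it in one function so the restore cannot be forgotten; with_patched_file is a hypothetical name, and the repo's own disable_akmodsbuild/restore_akmodsbuild pair may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical generalization of the disable/restore pair: patch a file with
# a sed expression, run a command, and restore the original even on failure.
# Usage: with_patched_file <file> <sed-expression> <command> [args...]
with_patched_file() {
    local file="$1" sed_expr="$2"
    shift 2
    local rc=0
    cp -p "$file" "$file.backup"
    # Redirecting into the existing file keeps its mode and ownership.
    sed "$sed_expr" "$file.backup" > "$file"
    "$@" || rc=$?
    mv "$file.backup" "$file"       # restore before the layer is committed
    return "$rc"
}
```

With the expression from the snippet above, a call would look like `with_patched_file /usr/sbin/akmodsbuild '/if \[\[ -w \/var \]\] ; then/,/fi/d' <build command>`, where the build command is whatever needs the patched script.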
A second class of "build-time write" bug: transient dnf installs. The extensions stage refuses them entirely after an autoremove/Requires:-cascade incident:
# /var/home/daniel/dev/margine-image/build_files/build-margine-extensions.sh
# NO transient dnf installs. Lesson learned the hard way 2026-06-04:
# dnf5 -y remove jq # STILL broke things: scx-tools-git declares
# # Requires: jq → removal cascades through
# # scx-tools-git → scx-scheds → 16 packages.
# Robust fix: don't add or remove dnf packages here at all. Use
# Python stdlib (always present) for JSON parsing + zip extraction.
Lesson: rechunk strips the /etc factory.
Symptom: after rebasing a Bluefin machine to Margine, boot spews dozens of Failed to resolve group 'audio'/'kvm'/'tty'; TPM unlock and audio break.
Root cause: Bluefin ships a near-empty /etc/passwd (sysusers populates it at boot). The build-time seed (stage 10) fills it, and CI confirmed 65 entries post-build, but rechunk's re-commit stripped /etc/passwd and /etc/group from the /usr/etc factory, so ostree's 3-way merge on the rebased machine kept only root plus the human user.
Fix: belt and suspenders. Keep the build-time seed and ship an idempotent boot-time oneshot that re-merges from /usr/lib/{passwd,group} whenever /etc/passwd drops below 20 entries:
# build_files/system_files/usr/lib/systemd/system/margine-seed-etc-passwd.service
[Unit]
DefaultDependencies=no
# DO NOT add After=local-fs.target: it creates an ordering cycle through
# systemd-tmpfiles-setup-dev.service → /dev/disk/by-uuid never populated
# → boot times out into emergency mode (incident 2026-06-01).
After=local-fs-pre.target
Before=systemd-sysusers.service systemd-tmpfiles-setup.service sysinit.target
The comment is its own sub-lesson: the first version of this unit ordered itself After=local-fs.target and systemd resolved the resulting dependency cycle by disabling systemd-tmpfiles-setup-dev, pushing every boot into emergency.target (.../lessons-learned/2026-06-01-systemd-ordering-cycle-and-rechunk-storage.md).
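The libexec script behind that oneshot is not shown in this chapter. As an illustration of what "idempotent re-merge below a threshold" can mean, here is a minimal sketch under stated assumptions: the function name seed_passwd is hypothetical, the files are passed as parameters so it can be tried on copies, and matching is by entry name only (field 1).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the re-seed logic; not the shipped script.
# Usage: seed_passwd <live-file> <factory-file> [threshold]
# Both files must exist and use the colon-separated passwd/group format.
seed_passwd() {
    local live="$1" factory="$2" threshold="${3:-20}" count tmp
    count="$(grep -c '' "$live")" || true     # line count; grep exits 1 on zero lines
    (( count >= threshold )) && return 0      # healthy file: nothing to do

    tmp="$(mktemp)"
    # Collect names from the live file, then print factory entries whose
    # name is absent. FILENAME is used instead of NR==FNR so an empty
    # live file is handled correctly.
    awk -F: 'FILENAME == ARGV[1] { seen[$1] = 1; next } !($1 in seen)' \
        "$live" "$factory" > "$tmp"
    cat "$tmp" >> "$live"
    rm -f "$tmp"
}
```

Running it twice in a row appends nothing the second time, because every factory name is already present; that property is what makes it safe to run on every boot.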
2.6 Commit and lint
The image must end as something bootc can deploy. Margine's Containerfile finishes with:
# /var/home/daniel/dev/margine-image/Containerfile
# ----- Lint: verify final image is a valid bootc container -----
RUN bootc container lint
bootc container lint checks the invariants this chapter described: no content baked into /var, valid kernel/initramfs layout under /usr/lib/modules/, sane /etc and composefs-compatible structure. It fails the build, so a violating commit never reaches the registry.
Two related mechanisms in the same family:
- ostree container commit: the older uBlue/image-template idiom, appended to each RUN to clean /var and verify the layer (RUN /ctx/build.sh && ostree container commit). bootc-era templates replace it with the final bootc container lint; Margine never carried the old form.
- rechunk (hhd-dev/rechunk, in CI, post-build): re-commits the OCI image as an ostree-canonical tree with size-balanced layers. For Margine it is not just a bandwidth optimization: it is what made composefs come up early enough for the os-release symlink (Lesson above). The trade-off, that it rewrites /usr/etc aggressively, produced the passwd-stripping Lesson.
Alternatives & other distros
Repo/build skeleton
- ublue-os/image-template (Bluefin/Bazzite/Aurora customs, Margine): Containerfile + build_files/ + GitHub Actions; lowest-friction entry.
- BlueBuild: declarative recipe.yml compiled to a Containerfile; less bash, less control over stage ordering.
- Fedora rpm-ostree treefiles (Silverblue/Kinoite proper): YAML/JSON compose on Fedora infra; not container-native, no RUN step.
- NixOS: full system from a Nix expression; maximal reproducibility, entirely different ecosystem, no OCI base reuse.
- Vanilla OS (Vib + ABRoot): modular YAML recipe → OCI image, A/B partition deployment instead of ostree.
- openSUSE MicroOS/Aeon: built with KIWI on OBS; btrfs-snapshot atomicity (transactional-update), not image-based delivery.
Base pinning
- Floating tag (bluefin-dx:stable; Margine, most uBlue customs): zero maintenance, regressions absorbed silently; compensate with CI gates.
- Digest pin + Renovate bumps: reviewable upstream diffs, constant PR churn.
- Build-from-source base (Bazzite, Bluefin themselves build from ublue-os/main/Fedora base): full control, full maintenance burden.
Script staging
- Numbered NN-*/install.sh dirs (Margine) ≈ Bluefin's build_files/shared/*.sh: deterministic, diff-friendly.
- Single build.sh (stock image-template): fine until ~300 lines.
- One RUN per concern in the Containerfile (Bazzite, dozens of layers): better layer caching per concern, but registry layer-count bloat, which is exactly why rechunk exists.
Config overlay
- system_files/ mirror-tree copied to / (Margine, Bluefin, Bazzite): file paths == repo paths.
- Heredocs in scripts (Margine uses these for generated files only): content next to logic, but unreviewable past a screenful.
- Nix modules / Vib modules: typed config instead of file trees; ecosystem lock-in.
Validation
- bootc container lint (Margine, current uBlue): in-build, blocking.
- ostree container commit (legacy uBlue): per-layer cleanup + check.
- External smoke boot in QEMU before tag promotion (Margine's smoke-boot.yml, Bazzite's CI): catches what static lint cannot. The passwd and switch-root Lessons above were both runtime-only failures.