Reproducible OS images (was: Irregular status update about reproducible Debian live ISO images)
Vagrant Cascadian
vagrant at reproducible-builds.org
Fri Mar 28 02:04:45 UTC 2025
On 2025-03-27, John Gilmore wrote:
> Roland Clobus <rclobus at rclobus.nl> wrote:
>> I understand that you are uncomfortable with the claim '100%
>> reproducible live images for bookworm', which could perhaps better be
>> rephrased to 'all live images for bookworm are reproducible from their
>> Debian packages'.
>
> There's a word missing there: from their Debian *binary* packages.
Fair.
> Ian was right to call this out. It's a little surprising to me that
> it's an accomplishment to be able to unpack binary packages into a
> binary file-system image reproducibly.
install != unpack
Debian packages (RPM as well? Other package formats?) have scripts that
run as part of the installation and some most definitely do things
non-deterministically...
The work on reproducible Debian live images is fixing or working around
some of those problems.
> I mean, if the Debian tools couldn't already do that, then they
> couldn't reliably install Debian binary packages onto *anybody's*
> system; "apt-get install foo" would produce results that varied from
> system to system.
That is, unfortunately, the status quo.
> Yes, I get it that some of the tools used to copy stuff into live images
> had bugs or system dependencies, and those had to be worked out.
> Depending on the running kernel's file system layout code was also
> problematic. Fixing the dependencies in those tools is a useful step,
> but can hardly be called making Debian reproducible from its source
> code.
Some of those things may be relatively inconsequential, e.g. the user or
group ids assigned to various system users. Finding those
issues and their ilk is what is being done, and is actual, real work,
believe it or not!
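
To make the user/group id case concrete, here is a toy model of how
install order alone can change the result. It is only a sketch: the
"first free id in a range" policy, the 100-999 range and the user names
are assumptions for illustration, not a description of what any
particular maintainer script does.

```python
# Toy model: each maintainer script asks for a system user and receives
# the first free id in a range. Range and user names are assumptions.
def assign_system_uids(install_order, first_uid=100, last_uid=999):
    uids = {}
    next_uid = first_uid
    for user in install_order:
        if user in uids:
            continue  # user already exists, nothing to allocate
        if next_uid > last_uid:
            raise RuntimeError("no free system uid left")
        uids[user] = next_uid
        next_uid += 1
    return uids


# Two installs of the same package set, processed in a different order.
run_a = assign_system_uids(["messagebus", "avahi", "sshd"])
run_b = assign_system_uids(["sshd", "messagebus", "avahi"])

print(run_a)
print(run_b)
# Same users, different ids: every file owned by them now differs
# between the two images.
print(run_a == run_b)
```

Pinning the ids, or fixing the order in which they get allocated, is the
kind of fix or workaround this implies.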
My "modest proposal" for Debian would be to drop maintainer scripts
entirely, but that... is not likely to ever happen. It would make
reproducible installations of Debian orders of magnitude easier,
I suspect.
> If "reproducible" doesn't mean "reproducible from its source
> code", then it becomes meaningless: a marketing term, not a technical
> one.
Reproducible under conditions X, Y and Z is still a useful point of
reference, and can detect compromises injecting altered, removed, or
additional packages...
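
As a rough sketch of that kind of check: given a manifest of package
checksums from a trusted rebuild and one from the image under test,
altered, removed and additional packages fall out of a simple
comparison. The package names and contents here are made up, and a real
manifest would hash the actual .deb files.

```python
import hashlib


def manifest(packages):
    """Map package name -> SHA-256 hex digest of its content (bytes)."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in packages.items()}


def compare_manifests(expected, actual):
    """Report packages that were removed, added or altered."""
    exp, act = set(expected), set(actual)
    return {
        "removed": sorted(exp - act),
        "added": sorted(act - exp),
        "altered": sorted(n for n in exp & act
                          if expected[n] != actual[n]),
    }


# Hypothetical data standing in for real package files.
expected = manifest({"bash": b"bash-1",
                     "coreutils": b"coreutils-1",
                     "dpkg": b"dpkg-1"})
actual = manifest({"bash": b"bash-1",
                   "coreutils": b"coreutils-tampered",
                   "backdoor": b"x"})

report = compare_manifests(expected, actual)
print(report)
```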
But sure, Reproducible Builds from source code is an important gold
standard, and there is some warranted worry about weakening the meaning
of the term.
Regardless, the work being done on both individual packages and the
live image generation tooling is necessary to ever actually get to
reproducible live images and reach the long-term goals...
> PS: 6 years ago I posted a proposed definition of a "reproducible OS
> distribution build", to this list. Here's another copy of that proposal.
> I don't think we ever formally decided that this is the definition, though
> as far as I recall, nobody else posted any competing definitions either.
I liked that definition quite a bit!
> How close are we now to accomplishing this in Debian?
Sadly, Debian is very far away from it. While many of the remaining 5-10%
of packages in Debian that do not build reproducibly are inconsequential
leaf packages, I do not think GCC and binutils are reproducible in
Debian (there are workarounds that improve that significantly)... and
that pretty much limits serious efforts at a self-bootstrapping
bit-for-bit identical system like you describe.
Systems like Guix are much closer to this being feasible, in that they
rebuild everything whenever any of the (recursive) dependencies changes,
and have similarly high reproducibility (~90%?), so the current state is
always essentially already able to build itself. They also do not do
anything comparable to Debian's maintainer scripts; a package is
essentially just a bunch of files, and system configuration is
(generally? largely? fully?) deterministic.
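
The "rebuild whenever any recursive dependency changes" model can be
sketched by giving each package an identity that covers its own source
plus the identities of everything it depends on. This is a simplified
illustration under my own assumptions (a hypothetical four-package
graph, SHA-256 over strings, no cycles), not how Guix actually computes
its store hashes.

```python
import hashlib


def input_hash(name, graph, cache=None):
    """Hash a package's source together with its dependencies' hashes.

    graph maps name -> (source text, list of dependency names) and is
    assumed to be acyclic.
    """
    if cache is None:
        cache = {}
    if name in cache:
        return cache[name]
    source, deps = graph[name]
    h = hashlib.sha256()
    h.update(source.encode())
    for dep in sorted(deps):
        h.update(input_hash(dep, graph, cache).encode())
    cache[name] = h.hexdigest()
    return cache[name]


# Hypothetical package graph, for illustration only.
graph = {
    "gcc": ("gcc-src-v1", []),
    "libc": ("libc-src-v1", ["gcc"]),
    "hello": ("hello-src-v1", ["libc"]),
    "notes": ("notes-v1", []),
}

before = {n: input_hash(n, graph) for n in graph}
graph["gcc"] = ("gcc-src-v2", [])  # change the compiler's source
after = {n: input_hash(n, graph) for n in graph}

# Everything that depends on gcc, directly or not, gets a new identity.
needs_rebuild = sorted(n for n in graph if before[n] != after[n])
print(needs_rebuild)
```

A change to the compiler changes the identity of everything built with
it, so nothing in the archive is ever left over from an older toolchain.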
> I propose a definition for whether a bootable OS distro is reproducible.
> (If what you're building is not a whole distro that can self-compile,
> this definition doesn't apply.)
>
> Our initial goal would be to produce a bootable binary release (DVD or
> USB stick) and a source release (ditto). The source release would
> include the script that allows the binary release to recompile the
> source release to a new binary release that ends up bit-for-bit
> identical. Such a binary/source release pair would be called
> "reproducible".
>
> That's useful: If you have to fix a bug in it, you can make the mods you
> need in the source tree, rebuild the world, and out will come a release
> with just that one change in the binaries, verifiably identical except
> where it matters. And developers can use such a release to detect what
> changes matter to whom, such as: when you alter a system include file,
> which binaries change?
>
> During development, the code would be built by some earlier release's
> tools, built piecemeal, etc, like current build processes do. Anytime
> before release, the developers can test whether a draft source release
> builds into a binary release that itself can build the sources into the
> same binary release. And fix any discrepancies, ideally long before
> release.
>
> This is similar to what GCC does to test itself, or what Cygnus did to
> test the whole toolchain for cross-compiling. But applied to the
> entire OS release.
And I think your "But" there is exactly the challenge.
Doing this for a specific software project is one thing, doing this for
a hugely complicated intersection of interdependent build
dependencies... it would be nice, but not sure Debian is up to the
challenge in a reasonable timeframe. We do not even necessarily get
every package rebuilt every ~2 year release cycle, as packages are
maintained by a handful of volunteers...
Even if we narrowed it down to a minimal Debian base system, that is
still well over 5000 packages when you account for build dependencies
alone... multiplied across 9+ architectures (some of which are a bit
sluggish)... *sigh*
That said, it sure would be good to try!
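
For what it is worth, the test at the heart of that definition is small
when written down; the hard part is everything hidden behind build().
A sketch, where both build functions are stand-ins I made up to show a
passing and a failing case:

```python
import hashlib


def is_reproducible_pair(binary_release, source_release, build):
    """True if the binary release rebuilds the source into itself."""
    rebuilt = build(binary_release, source_release)
    return (hashlib.sha256(rebuilt).digest()
            == hashlib.sha256(binary_release).digest())


def deterministic_build(tools, source):
    # Stand-in: output depends only on the source.
    return b"BIN:" + hashlib.sha256(source).digest()


def leaky_build(tools, source):
    # Stand-in: output also depends on the tools that produced it.
    return b"BIN:" + hashlib.sha256(tools + source).digest()


source = b"all the source"
bootstrap = b"some earlier release"

# Draft releases, built piecemeal by an earlier release's tools.
release_ok = deterministic_build(bootstrap, source)
release_bad = leaky_build(bootstrap, source)

print(is_reproducible_pair(release_ok, source, deterministic_build))
print(is_reproducible_pair(release_bad, source, leaky_build))
```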
> We haven't even accomplished a basic paired binary/source reproducible
> release yet, for any major release -- or have we?
I suspect Guix comes quite close... maybe also Nix, from which Guix got
many ideas and has a similar model... the fundamental design is a lot
closer to doing this sort of thing out of the box.
From that old thread, sounded like Yocto also was possibly pretty close.
I wonder about Gentoo, though the culture there is for everyone to
customize their own builds, so might be close in theory, but in actual
practice everyone has their own customized builds...
The common thread I am getting at is, basically, distros which focus on
building from source, rather than building and distributing binaries,
are in a much better position to, well, rebuild everything from source!
live well,
vagrant