Arch Linux minimal container userland 100% reproducible - now what?

John Gilmore gnu at toad.com
Fri Mar 22 17:52:47 UTC 2024


Congratulations on closing in on Arch Linux reproducibility!!!

kpcyrd <kpcyrd at archlinux.org> wrote:
> Specifically what I mean - given a line like this:
> 
> FROM archlinux@sha256:2dbd72d1e5510e047db7f441bf9069e9c53391b87e04e5bee3f379cd03cec060
> 
> I want to reproduce the artifact(s) that are pulled in by this, with
> the packages our Arch Linux rebuilders have reproduced from source
> code. From what I understand this hash points to a json manifest that
> is not contained in the container image itself and was generated by
> the registry (should we archive them?), and this manifest then points
> to the sha256 of the tar containing the filesystem (I'm possibly
> missing an indirection here).
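
The chain of hashes described in the quoted text can be sketched roughly as
follows. This is a minimal illustration, assuming the OCI image layout, in
which a pinned digest is the sha256 of the manifest bytes exactly as the
registry serves them, and in which the possibly missing indirection is an
image index listing one manifest per platform. The helper names
(digest_of, referenced_digests) and all of the toy data are made up for
this example; nothing here talks to a real registry.

```python
import hashlib
import json


def digest_of(data: bytes) -> str:
    """Content address in the 'sha256:<hex>' form used by registries."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def referenced_digests(manifest_bytes: bytes) -> list[str]:
    """List the digests that a manifest or an image index points to."""
    doc = json.loads(manifest_bytes)
    if "manifests" in doc:
        # Image index: one manifest entry per platform (the extra indirection).
        return [m["digest"] for m in doc["manifests"]]
    # Image manifest: one config blob plus the filesystem layer blobs.
    return [doc["config"]["digest"]] + [layer["digest"] for layer in doc["layers"]]


# Hypothetical toy data standing in for real blobs.
layer = b"pretend this is the (compressed) filesystem tar"
config = b"{}"
manifest = json.dumps({
    "schemaVersion": 2,
    "config": {"digest": digest_of(config)},
    "layers": [{"digest": digest_of(layer)}],
}).encode()
index = json.dumps({
    "schemaVersion": 2,
    "manifests": [{"digest": digest_of(manifest)}],
}).encode()

# The value after '@' in a pinned FROM line would be a digest like this one.
pinned = digest_of(index)
print(pinned, "->", referenced_digests(index), "->", referenced_digests(manifest))
```

Because the digest covers the exact bytes, re-serializing the same JSON with
different whitespace or key order yields a different digest, so the manifest
bytes would have to be archived or regenerated byte for byte. Layer digests
cover the blob as stored, which is usually compressed, so the compression
output has to match as well.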

I have no experience with Arch -- am just reading what's on their
website.  From a quick glance at their docs, the Arch distribution
*only* distributes binary packages.  They only offer URLs for source
code, requiring that users depend on a working Internet connection and
what could be a large, arbitrary set of HTTPS servers that in theory
contain the matching source code.  See:

  https://wiki.archlinux.org/title/Arch_build_system

(I'm not sure how that even meets the requirements of the GPL for
binary distributors to make the matching source code available to
recipients of the binaries.)

It seems to me that the next step in making the Arch release ISOs
reproducible is to have the Arch release engineering team create a
source-code release ISO that matches each binary release ISO.  Then you
(or anyone) could test the reproducibility of the release by having
merely those two ISO images and a bare amd64 computer (without even an
Internet connection).  (Someone other than their releng team could do
this shortly after the binary release, hoping that none of the URLs
becomes inaccessible in the meantime.  But the right time to gather the
full source code for reproducibility is when they themselves pull in the
source code to BUILD those binary packages that they will put in their
release ISO.)
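
To make the idea of gathering sources at build time concrete, here is a
minimal sketch, not a description of any existing Arch tooling. It assumes
the sources have already been fetched into a local directory; it records a
sha256 for each file at build time and later checks an archived copy against
that record with no network access. The function names (record_sources,
verify_sources) and the example file name are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large source archives need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def record_sources(src_dir: Path) -> dict[str, str]:
    """At build time: map each source file (relative path) to its sha256."""
    return {
        p.relative_to(src_dir).as_posix(): sha256_file(p)
        for p in sorted(src_dir.rglob("*"))
        if p.is_file()
    }


def verify_sources(src_dir: Path, recorded: dict[str, str]) -> list[str]:
    """Later, offline: return the names that are missing or have changed."""
    problems = []
    for name, want in recorded.items():
        p = src_dir / name
        if not p.is_file() or sha256_file(p) != want:
            problems.append(name)
    return problems


# Toy demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "foo-1.0.tar.gz").write_bytes(b"example source")
    recorded = record_sources(root)
    clean = verify_sources(root, recorded)   # nothing missing or changed
    (root / "foo-1.0.tar.gz").write_bytes(b"tampered")
    dirty = verify_sources(root, recorded)   # the modified file is reported

print(clean, dirty)
```

A record like this only helps if the source files themselves are archived
alongside it; the hashes alone cannot bring back a URL that has gone away.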

Making users reproduce an ISO full of binary packages by downloading the
sources from all over the Internet seems highly prone to fail -- in the
first few months, let alone five or ten years later.

Even Arch's binary releases are only available from Arch for three
(monthly) release cycles.  Then you're on your own if you want to find a
copy of what they released, like the one that was current last
Christmas.  See:

  https://archlinux.org/releng/releases/

Arch may do great release engineering (I hope they do!), but it's
apparently not *archival* release engineering.

	John

