Reproducible Builds Verification Format

Tue May 12 21:14:52 UTC 2020

On Tue, 2020-05-12 at 11:00 -1000, Paul Spooren wrote:
> at the RB Summit 2019 in Marrakesh there were some intense discussions about
> *rebuilders* and a *verification format*. While first discussed only with
> participants of the summit, it should now be shared with a broader audience!
> 
> A quick introduction to the topic of *rebuilders*: Open source projects usually
> offer compiled packages, which is great in case I don't want to compile every
> installed application. However, it raises the question of whether distributed
> packages are what they claim to be. This is where *reproducible builds* and
> *rebuilders* join the stage. The *rebuilders* try to recreate offered binaries
> following the upstream build process as closely as necessary.
> 
> To make the results accessible, store-able and create tools around them, they
> should all follow the same schema, hello *reproducible builds verification
> format* (rbvf). The format tries to be as generic as possible to cover all open
> source projects offering precompiled binaries. It stores the rebuilder
> results of what is reproducible and what is not.
> 
> Rebuilders should publish those files publicly and sign them. Tools then collect
> those files and process them for users and developers.
> 
> Ideally multiple institutions spin up their own rebuilders so users can trust
> those rebuilders and only install packages verified by them.
> 
> The format is just a draft, please join in and share your thoughts. I'm happy to
> extend, explain and discuss all the details. Please find it here[0].
> 
> As a proof of concept, there is already a *collector* which compares upstream
> provided packages of Archlinux and OpenWrt with the results of rebuilders.
> Please see the frontend here[1].
> 
> If you already perform any rebuilds of your project, please contact me about how to
> integrate the results in the collector!

I'm not sure how relevant this is but I can mention what the Yocto
Project is doing. We're not a traditional distro in that we don't ship
binaries. We do however care a lot about getting consistent results.

Whilst we don't ship binaries, we do cache build artefacts in our
"sstate". This means something can be reused if it's in the cache rather
than building it again.

We thought long and hard about how to prove reproducibility and we
ended up adding a new "selftest" to our autobuilder:

http://git.yoctoproject.org/cgit.cgi/poky/tree/meta/lib/oeqa/selftest/cases/reproducible.py

This takes one set of artefacts from our "sstate" binary cache (if
available) and builds another set locally, both in different build
directories. It then compares the build results and flags up
differences, saving the diffoscope html output and the differing
binaries somewhere we can analyse them.
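
As a rough illustration of that comparison step (this is not the actual
selftest code; the function names and the returned dictionary layout are
hypothetical, and it assumes the two sets of artefacts are plain
directory trees), it could be sketched in Python as:

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map each file's path relative to root onto its digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_builds(reference_dir, rebuilt_dir):
    """Compare two artefact trees (hypothetical helper, not Yocto API).

    Returns sorted lists of relative paths that are identical, that
    differ, and that exist in only one of the two trees.
    """
    reference = index_tree(reference_dir)
    rebuilt = index_tree(rebuilt_dir)
    common = reference.keys() & rebuilt.keys()
    return {
        "same": sorted(n for n in common if reference[n] == rebuilt[n]),
        "different": sorted(n for n in common if reference[n] != rebuilt[n]),
        "missing": sorted(reference.keys() ^ rebuilt.keys()),
    }
```

Anything listed under "different" would then be the candidate to run
through diffoscope and to save for later analysis.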

The sstate can be built on, and come from, any worker in our cluster, so
it covers many different host distros and arbitrary build paths.

We have these tests passing for our deb and ipk package backends for
the 'core-image-minimal', 'core-image-sato' and
'core-image-full-cmdline' images. Sato is an X11 based desktop target.

I'd say these count as "rebuilds", even if we are testing them against
ourselves and not worrying about signing (sstate can be signed but it's
not particularly relevant here as we trust ourselves). Not sure they're
useful from a statistics perspective but we are running this quite
heavily, day in, day out.

Cheers,

Richard