"Reproducible build" definition in OpenSSF glossary

Simon Josefsson simon at josefsson.org
Mon Jun 30 06:59:12 UTC 2025


Ismael Luceno <ismael at iodev.co.uk> writes:

> On 29 June 2025 13:58:24 UTC, Leo Wandersleb <Leo at LeoWandersleb.de> wrote:
>>Hi Ismael,
>>
>>I think we're talking past each other. Even in the OS world, binaries
>> are distributed - through apt, snap stores, flatpak, etc. When a
>> maintainer uploads a .deb or someone publishes a snap, those
>> binaries need verification.
>
> A wider definition threatens to start a chasing game we don't want to play with upstream authors.
>
> We want people to, ideally, fix their buildsystems, and maintain that support forward.
>
> At some point it can be made a requirement. We don't expect to do any reverse engineering, and we don't want it to be an afterthought in the future.
>
> A narrow definition keeps those problems at bay as inherently out of scope.
>
> Binary distributions should aim for the same experience source-based
> distributions have been providing for 25 years; binary packages should
> act, more or less, as an optimisation that skips the build.
>
> So it isn't about verifying the work of any single maintainer, but ideally a distributed check on the whole ecosystem.
>
> Does that make sense?

What I think is missing from this sub-thread is the answer to "what
source code?".

When (re-)building the Debian Live CD, the "source code" is mostly
previously built binary packages.

That's not what some people mean when they refer to "source code", hence
there is ambiguity.

I think there are at least two useful concepts here which there ought to
be distinct terms for:

1) "reproduced build" - reproducing a bit-by-bit identical binary based on
some set of instructions and a set of opaque "source input" files, which
may include previously built binaries, without any requirement that
those previously built binaries can be rebuilt or are even free software.

An example of 1) is the Debian Live CD situation: it is reproducibly
built mostly from previous binaries, and for some of those binaries we
don't have source code and they are not freely licensed.
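
As a minimal sketch of what the check in 1) amounts to: the inputs are
treated as opaque, and only the output is compared bit-by-bit.  The
function names below are hypothetical, and SHA-256 is just an assumed
choice of digest; comparing the raw bytes directly would do as well.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproduced(original: Path, rebuilt: Path) -> bool:
    """True if the rebuilt artifact is bit-by-bit identical to the original.

    Nothing is assumed about where the build inputs came from: they may be
    previously built binaries with no available source (concept 1 above).
    """
    return sha256_of(original) == sha256_of(rebuilt)
```

Note that this says nothing about whether the inputs themselves can be
rebuilt, which is exactly the gap between 1) and 2).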

2) "reproducible build" (for lack of a better specific term) -
(re-)building some binary based on its source code and a set of "build
dependencies" (tools, shared libraries, etc.) which are NOT allowed to be
embedded statically into the output (unless those static objects were
previously reproducibly built from real source code).  I think this is
what most of reproduce.debian.net is about: recompiling Debian packages
from their packaging sources, assuming they don't embed static
information unless hints like Built-Using: are used.

I don't think the reproducible build definitions of "source code" and
"build inputs" match what we usually mean by those words when talking
about licensing requirements for the GPL etc.  These are subtly different
concepts.  I think this is one reason people attach different
meanings to these words, and I realize I'm one of those who have been
biased by the license-related definition of "source code" and have tried
to apply it to reproducible build concepts.

I don't think 2) necessarily requires the recursive transitive closure of
the same requirement on all of the build inputs.  There are at least two
terms covering that additional requirement: A) "bootstrappable build",
which recursively rebuilds things bit-by-bit identically back to a small
seed using earlier versions of software, and B) "idempotent rebuild",
which recursively rebuilds things bit-by-bit identically using the latest
version of all involved tools.  Guix has proved A) is possible, but I'm
not aware of any proof that B) is possible with any modern non-trivial
OS.
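
The difference between A) and B) can be sketched with a toy model.
Everything here is an assumption made for illustration: a "rebuild" is
modelled as a function from the current set of binaries to a new set,
and the names (is_idempotent_rebuild, bootstrap, clean_rebuild,
leaky_rebuild) are hypothetical, not taken from any real tool.

```python
import hashlib
from typing import Callable, Dict, List

Artifacts = Dict[str, bytes]                 # package name -> binary contents
Rebuild = Callable[[Artifacts], Artifacts]   # rebuild everything using these binaries


def bootstrap(seed: Artifacts, stages: List[Rebuild]) -> Artifacts:
    """A) Start from a small seed and apply each (older) build stage in order."""
    tools = dict(seed)
    for stage in stages:
        tools = stage(tools)
    return tools


def is_idempotent_rebuild(rebuild: Rebuild, current: Artifacts) -> bool:
    """B) Rebuilding with the latest binaries must yield those same binaries."""
    return rebuild(current) == current


def clean_rebuild(tools: Artifacts) -> Artifacts:
    # Toy build whose output depends only on the (fixed) source.
    return {"cc": b"cc built from source v2"}


def leaky_rebuild(tools: Artifacts) -> Artifacts:
    # Toy build that embeds a digest of the compiler that performed it,
    # so the output changes whenever the building compiler changes.
    stamp = hashlib.sha256(tools["cc"]).hexdigest().encode()
    return {"cc": b"cc built from source v2 by " + stamp}
```

In this model clean_rebuild is a fixed point of itself, while
leaky_rebuild never is, because each rebuild records a different
builder.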

/Simon


More information about the rb-general mailing list