"Reproducible build" definition in OpenSSF glossary
Timo Pohl
pohl at cs.uni-bonn.de
Wed Apr 23 12:39:15 UTC 2025
I have been thinking about this for a bit myself. At least from the
point of view of someone trying to verify the reproducibility of a
certain build, I would say the environment is definitely necessary, and
if anything the current definition is not specific enough to allow
generic verification.
> What matters is that someone is able to get the same bit-by-bit
> identical output.
I would argue that what's relevant is that *everyone* is able to get the
same bit-by-bit identical output. Without specifying the relevant parts
of the environment, you quickly get into a situation where some
rebuilders happen to have an environment where the output artifact is
the same, while others with a different environment get a different
artifact. In my opinion, the inputs for a "reproducible build" should be
specific enough so that anyone adhering to these inputs gets the same
artifact out, and you don't need to get lucky, or get additional
information from somewhere else to successfully reproduce the same artifact.
> However, I believe that achieving "verified reproducible builds" does
> not require reproducing the build environment bit-by-bit identically.
I agree! And I see that the given definition in the OpenSSF glossary may
be a little misleading in this regard. However, the source definition on
reproducible-builds.org further explains that it is about the *relevant
attributes of the build environment*, which are up to the project
maintainer to determine. That way, they ideally don't overspecify the
build environment, and rebuilders don't have to bit-by-bit reproduce the
whole original build environment. They just have to reproduce the parts
that the maintainer has deemed relevant for the build to be reproducible.
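
To illustrate the idea of "relevant attributes", here is a minimal sketch
in Python. It is not taken from the glossary or from
reproducible-builds.org; the attribute names (compiler, LC_ALL,
SOURCE_DATE_EPOCH, hostname, kernel) and the helper function are
hypothetical examples. A rebuilder's environment is checked only against
the attributes the maintainer declared, and everything else is ignored.

```python
from typing import Dict, List

Env = Dict[str, str]


def mismatched_attributes(declared: Env, rebuilder_env: Env) -> List[str]:
    """Return the declared attributes that the rebuilder's environment
    does not match. Attributes that were not declared are ignored."""
    return [
        name
        for name, value in declared.items()
        if rebuilder_env.get(name) != value
    ]


# Hypothetical set of attributes a maintainer has deemed relevant.
declared = {
    "compiler": "gcc-14.2.0",
    "LC_ALL": "C",
    "SOURCE_DATE_EPOCH": "1700000000",
}

# Differs in hostname and kernel, which were not declared as relevant.
rebuilder_a = {**declared, "hostname": "builder-a", "kernel": "6.1"}

# Differs in a declared attribute and lacks another one.
rebuilder_b = {"compiler": "gcc-13.3.0", "LC_ALL": "C", "hostname": "builder-b"}

print(mismatched_attributes(declared, rebuilder_a))
print(mismatched_attributes(declared, rebuilder_b))
```

Under this sketch, rebuilder_a counts as having reproduced the
environment even though it is far from bit-by-bit identical to the
original one.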
In a recent pre-publication [1] I proposed a more formal definition of
reproducibility:
> A tuple (source code, build instructions, build environment,
> artifacts) is considered reproducible, if executing the build
> instructions on the source code within the build environment always
> produces the same artifacts when compared via bit-by-bit equality.
But as this is designed as a more precise version for academic use, I
can see that it may feel a little... clunky for a general-purpose glossary.
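
As a rough sketch of how the tuple definition could be checked in
practice (the function names and the toy builds below are hypothetical
and only for illustration): the build instructions are modeled as a
Python function of source code and environment, and the claimed
artifacts are compared byte for byte against repeated runs. Note that a
finite number of runs can only refute "always", never prove it.

```python
from itertools import count
from typing import Callable, Dict

Artifacts = Dict[str, bytes]
Env = Dict[str, str]
Build = Callable[[bytes, Env], Artifacts]


def is_reproducible(build: Build, source: bytes, env: Env,
                    artifacts: Artifacts, runs: int = 3) -> bool:
    """Check the tuple (source, build, env, artifacts) by rebuilding
    `runs` times and comparing each result byte for byte."""
    return all(build(source, env) == artifacts for _ in range(runs))


def stable_build(source: bytes, env: Env) -> Artifacts:
    """Toy build whose output depends only on source and environment."""
    stamp = env["SOURCE_DATE_EPOCH"].encode()
    return {"out.bin": source + b"@" + stamp}


_build_counter = count()


def leaky_build(source: bytes, env: Env) -> Artifacts:
    """Toy build that embeds hidden state (a counter standing in for
    something like the current time), so each run differs."""
    return {"out.bin": source + b"@" + str(next(_build_counter)).encode()}


env = {"SOURCE_DATE_EPOCH": "0"}
source = b"src"

# Take the first output of each build as the claimed artifacts.
print(is_reproducible(stable_build, source, env, stable_build(source, env)))
print(is_reproducible(leaky_build, source, env, leaky_build(source, env)))
```

The first check succeeds and the second fails, because the leaky build
depends on an input that is not part of the tuple.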
Best
Timo
On 23.04.25 11:07, Simon Josefsson via rb-general wrote:
> "David A. Wheeler via rb-general"
> <rb-general at lists.reproducible-builds.org> writes:
>
>> The OpenSSF is building a "glossary" set (so we consistently use the
>> same meaning for the same term), and I drafted a definition for "reproducible build"
>> based on this group:
>>
>> https://glossary.openssf.org/reproducible-build/
> Thanks. I think the "same source code, build environment and build
> instructions" part may lead people the wrong way.
>
> Others may have different goals, but for me the point of supporting
> reproducible builds is so that we can get to "verified reproducible
> builds", which to me is what matters for end-users. It is great that
> you mention this goal above! It seems this goal is often forgotten.
>
> With that goal in mind, I don't think it matters what the build input or
> build environment is.
>
> What matters is that someone is able to get the same bit-by-bit
> identical output.
>
> Where I think people may go wrong with the text above is that you are
> led to believe that there is a one-to-one mapping involved for the build
> environment.
>
> However, I believe that achieving "verified reproducible builds" does
> not require reproducing the build environment bit-by-bit identically.
>
> For example, if I'm able to independently rebuild Debian's version of
> Firefox using the same Firefox source code but some other build
> environment, I would still count the Firefox binary as a "verified
> reproducible build". Does anyone disagree with that? Why?
>
> Here is my attempt at clarification:
>
> OLD:
>
> A build is reproducible if given the same source code, build
> environment and build instructions, any party can recreate bit-by-bit
> identical copies of all specified artifacts.
>
> NEW:
>
> A build is reproducible if given the same source code, any party can
> recreate bit-by-bit identical copies of all specified artifacts.
> Information about the build environment and build instructions is
> usually needed to achieve that.
>
> What do you think?
>
> Btw, I recently wrote about verifying reproducible source tarballs:
>
> https://blog.josefsson.org/2025/04/17/verified-reproducible-tarballs/
>
> Turns out I was not able to reproduce any upstream-published tarballs
> that I looked at. Does anyone know of any earlier systematic efforts to
> verify reproducibility of source tarballs in a similar way? Is anyone
> interested in working on this, for a couple of high-profile packages to
> see if we are able to reproduce them?
>
> /Simon
--
Timo Pohl
Department of Computer Science IV Room: 1.018
University of Bonn Phone: +49 228 73-54246
Friedrich-Hirzebruch-Allee 8 E-Mail: pohl at cs.uni-bonn.de
53115 Bonn PGP key id: 0x4872A6DD1019A4D8
Germany