"Reproducible build" definition in OpenSSF glossary

David A. Wheeler dwheeler at dwheeler.com
Wed May 7 22:36:23 UTC 2025



> On May 7, 2025, at 5:25 PM, Simon Josefsson <simon at josefsson.org> wrote:
> 
> "David A. Wheeler via rb-general"
> <rb-general at lists.reproducible-builds.org> writes:
> 
>> My thanks to the many who commented on the need to update the
>> definition of "reproducible builds".
>> 
>> I created a merge request that *attempted* to address all the comments:
>> https://salsa.debian.org/reproducible-builds/reproducible-website/-/merge_requests/178/diffs
> 
> I read it and I'm happy with everything except this part:
> 
>  A build is **reproducible** if given the same build inputs, any party
>                                           ^^^^^^^^^^^^^^^^^
>  can recreate bit-by-bit identical copies of all specified build
>  artifacts by generating them from the build inputs.
> 
> First, the term "build inputs" is not defined (as far as I can tell), so
> I'm not sure exactly what you want it to mean?

Fair enough.
Originally this was "source code" but we're trying to deal with the case
where people are building whole container images / ISO images / etc.
Our effort to generalize things unintentionally created some confusion.

So, how about adding this?

Build inputs: Data that the build environment (including its tools) uses
and processes to produce the output build artifacts.
The build inputs are often the source code being built.

Build environment: The set of hardware and software that performs a build,
accepting the build inputs and generating the artifacts.
The build environment often includes a compiler, run-time library,
operating system, and the hardware used to execute them.
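
As a rough illustration of the split between the two terms, here is a
minimal Python sketch. Every name in it (BuildInputs, BuildEnvironment,
digest, the example file and tool strings) is hypothetical and exists only
for illustration; none of it comes from the merge request or from any tool.
It assumes the inputs can be modelled as a set of named files.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass
class BuildInputs:
    """Data the build processes, e.g. source files (path -> contents)."""
    files: Dict[str, bytes]

    def digest(self) -> str:
        # Hash paths and contents in sorted order so the digest does not
        # depend on dict insertion order. Length prefixes keep the
        # boundaries between entries unambiguous.
        h = hashlib.sha256()
        for path in sorted(self.files):
            for chunk in (path.encode("utf-8"), self.files[path]):
                h.update(len(chunk).to_bytes(8, "big"))
                h.update(chunk)
        return h.hexdigest()


@dataclass(frozen=True)
class BuildEnvironment:
    """Hardware and software that perform the build; NOT a build input."""
    compiler: str
    runtime_library: str
    operating_system: str
    hardware: str


inputs = BuildInputs(files={"hello.c": b"int main(void) { return 0; }\n"})
env_x = BuildEnvironment("gcc X", "libc", "some OS", "amd64")
env_y = BuildEnvironment("gcc Y", "libc", "some OS", "amd64")

# The two environments differ, yet both parties hold the same build inputs.
same_inputs = inputs.digest() == BuildInputs(dict(inputs.files)).digest()
print(same_inputs, env_x == env_y)
```

In this model the compiler version lives in BuildEnvironment, so two
parties can use different environments while still starting from the
same build inputs.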

> Second, I don't think we want to give the impression that the exact same
> build inputs are required for a reproducible build.  What I believe
> matters are the outputs: if I compile a binary using GCC version X and
> get the same bit-by-bit identical output as someone with GCC version Y,
> then I would count that as a success.

If X & Y aren't exceedingly close, I would also count that as a miracle :-).
I *DO* agree with you that compiling with slightly different tool versions
and getting the same result is fine. It often doesn't happen, but it's fine!

However, I would consider the tools used during a build as
part of the build *environment*, and NOT as the build inputs.
The definition of "relevant attributes of the build environment" was already
there and I believe hinted at that. But we can do better than *hinting* at it.

See my definitions above. Hopefully they clarify all that.
I *think* the problem disappears with these terms more clearly defined.

> Therefore I suggest changing the
> above into:
> 
>  A build is **reproducible** if any party can recreate bit-by-bit
>  identical copies of all specified build artifacts.

I'm wary of that change because the word "recreate" is doing a lot of hidden
heavy lifting. In addition, this version downplays the importance of generating
results from the inputs.

I think we'd agree that "recreating" artifacts by copying them from
some other website doesn't count :-). I'd like to make it
clear that you have to rebuild from the *inputs*.
In *practice* you have to know a lot about the inputs, so I think it's
valuable to make it clear that they're important.
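
To show what "rebuild from the inputs" means in practice, here is a small
Python sketch of a check that regenerates the artifacts and compares
SHA-256 digests, rather than fetching the artifacts from anywhere. The
function names (digests, rebuild_matches, toy_build,
toy_build_with_timestamp) and the toy "build" are assumptions made up for
this example; a real build would invoke actual tools.

```python
import hashlib
import itertools
from typing import Callable, Dict

Artifacts = Dict[str, bytes]


def digests(artifacts: Artifacts) -> Dict[str, str]:
    """Map each artifact name to the SHA-256 of its exact bytes."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in artifacts.items()}


def rebuild_matches(build: Callable[[Dict[str, bytes]], Artifacts],
                    inputs: Dict[str, bytes],
                    published: Dict[str, str]) -> bool:
    """Regenerate the artifacts from the inputs and compare bit-for-bit."""
    rebuilt = build(inputs)  # generated here, never copied from elsewhere
    return digests(rebuilt) == published


def toy_build(inputs: Dict[str, bytes]) -> Artifacts:
    # Deterministic stand-in for a build: concatenate inputs in sorted order.
    return {"out.bin": b"".join(inputs[name] for name in sorted(inputs))}


_fake_clock = itertools.count()  # stands in for an embedded timestamp


def toy_build_with_timestamp(inputs: Dict[str, bytes]) -> Artifacts:
    # Same build, but it embeds a value that changes on every run.
    stamp = str(next(_fake_clock)).encode()
    return {"out.bin": toy_build(inputs)["out.bin"] + stamp}


inputs = {"a.txt": b"hello ", "b.txt": b"world"}

published = digests(toy_build(inputs))
ok = rebuild_matches(toy_build, inputs, published)

published_ts = digests(toy_build_with_timestamp(inputs))
not_ok = rebuild_matches(toy_build_with_timestamp, inputs, published_ts)

print(ok, not_ok)
```

The check only ever sees the inputs and the published digests, so the
only way to pass is to generate matching artifacts from those inputs.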

--- David A. Wheeler


