"Reproducible build" definition in OpenSSF glossary
Simon Josefsson
simon at josefsson.org
Mon May 12 07:31:38 UTC 2025
"David A. Wheeler" <dwheeler at dwheeler.com> writes:
>> First, the term "build inputs" is not defined (as far as I can tell), so
>> I'm not sure exactly what you want it to mean?
>
> Fair enough.
> Originally this was "source code" but we're trying to deal with the case
> where people are building whole container images / ISO images / etc.
> Our effort to generalize things unintentionally created some confusion.
>
> So - how about adding this?:
>
> Build inputs: Data used and processed by the build environment (including
> the tools in the build environment) to produce the output build artifacts.
> The build inputs are often the source code being built.
Oh! I sometimes use the term 'build inputs' to refer to 'build
dependencies', which is different from your meaning. I'm happy to try
to re-learn here though.
> See my definitions above. Hopefully they clarify all that.
> I *think* the problem disappears with these terms more clearly defined.
I agree. Your proposal is an improvement to the current version.
I still find the overall use of terms and meanings confusing and
inconsistent, and some concepts don't have clear terms. A more
verbose "Taxonomy of Reproducible Build Terminology" would be useful.
>> Therefore I suggest changing the
>> above into:
>>
>> A build is **reproducible** if any party can recreate bit-by-bit
>> identical copies of all specified build artifacts.
>
> I'm wary of that change because the word "recreate" is doing a lot of hidden
> heavy lifting. In addition, this version downplays the importance of generating
> results from the inputs.
>
> I think we'd agree that "recreating" artifacts by copying the artifacts from
> some other website doesn't count :-). I'd like to make it
> clear that you have to rebuild from the *inputs* - it doesn't count if
> you sneakily copy artifact results from somewhere else.
> In *practice* you have to know a lot about the inputs, so I think it's
> valuable to make it clear that they're important.
Given that we should support reproducible builds of AI training models,
I think this aspect will become even more important.
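
To make "bit-by-bit identical copies of all specified build artifacts"
concrete, here is a minimal sketch of how a verifier might compare two
independent rebuilds. It is an illustration only: the function names
(artifact_digests, is_reproduced) and the idea of passing the "specified"
artifacts as a list of relative paths are my assumptions, not part of any
proposed glossary text. Note that it only checks the outputs; it cannot
tell whether the artifacts were really rebuilt from the inputs or copied
from somewhere else, which is exactly David's point above.

```python
import hashlib
from pathlib import Path


def artifact_digests(build_dir):
    """Map each file's path (relative to build_dir) to its SHA-256 hex digest."""
    root = Path(build_dir)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def is_reproduced(build_a, build_b, specified):
    """Return True if every specified artifact exists in both output
    directories with identical content (compared via SHA-256 digests).

    Files that are not listed in `specified` (logs, caches, ...) are ignored.
    """
    a = artifact_digests(build_a)
    b = artifact_digests(build_b)
    return all(
        name in a and name in b and a[name] == b[name]
        for name in specified
    )
```

Two parties would each build from the same inputs into their own output
directory and then call, for example, is_reproduced("out-alice", "out-bob",
["pkg.tar"]), where the directory and file names are placeholders.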
One thing I'm missing from all these definitions is whether online
services may be used during a build. My preference is for the definition
to permit only completely isolated builds without any network connection,
but alas this is not the reality today, and I fear the trend is moving
towards more dependence on online services during builds.
/Simon