Introducing: Semantically reproducible builds

Wed May 31 14:36:26 UTC 2023

I tend to think about reproducible builds in this generalizable way:

A build is reproducible if equivalent inputs (source, build tools, build
tool invocation, etc.) to the build result in equivalent outputs.

The question then becomes: how are input equivalence and output equivalence
defined?

Output equivalence has traditionally been defined as the output being byte
for byte identical.  This definition has many wonderful characteristics,
including but not limited to:

1. Everyone agrees on what byte for byte identical means
2. It's simple to verify
3. It results in identical hashes/signatures for all equivalent artifacts
- i.e., equivalent outputs result in equivalent (identical) hashes.

#3 above turns out to be very significant and often overlooked.

You can, of course, have alternate definitions of output equivalence.  But
it might be worth understanding what characteristics we want in output
equivalence, and looking at alternate proposals through that lens.
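
To make point 3 concrete, here is a rough Python sketch.  Everything in it
is made up for illustration: the artifact contents, the "built-at:" field,
and the semantically_equal() rule are assumptions, not anyone's actual
proposal.  It only shows that byte-for-byte equality carries over to hashes
for free, while a looser equivalence does not unless the artifacts are
normalized before hashing.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def byte_identical(a: bytes, b: bytes) -> bool:
    """Traditional output equivalence: every byte matches."""
    return a == b


def semantically_equal(a: bytes, b: bytes) -> bool:
    """Hypothetical looser equivalence: ignore 'built-at:' lines.

    This rule is an assumption made up for the example.
    """
    def strip(data: bytes) -> list:
        return [
            line for line in data.splitlines()
            if not line.startswith(b"built-at:")
        ]
    return strip(a) == strip(b)


# Three made-up build outputs; build_2 differs only in its timestamp.
build_1 = b"name: demo\nbuilt-at: 2023-05-28T14:27:00Z\npayload: 0123\n"
build_2 = b"name: demo\nbuilt-at: 2023-05-31T14:36:26Z\npayload: 0123\n"
build_3 = bytes(build_1)

# Byte-identical outputs always have identical hashes.
assert byte_identical(build_1, build_3)
assert sha256_hex(build_1) == sha256_hex(build_3)

# Outputs that are only equivalent under the looser rule do not.
assert semantically_equal(build_1, build_2)
assert not byte_identical(build_1, build_2)
assert sha256_hex(build_1) != sha256_hex(build_2)
```

Under the looser rule, two people comparing published hashes of build_1
and build_2 would see a mismatch even though the rule calls the outputs
equivalent, so they would also have to agree on the normalization step.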

Ed

On Sun, May 28, 2023 at 2:27 PM James Addison via rb-general <
rb-general at lists.reproducible-builds.org> wrote:

> Hi David,
>
> Thanks for sharing this.
>
> I think that the problems with this idea and name are:
>
> - That it does not allow two or more people to share and confirm that
> they have the same build of some software.
> - That it does not allow tests to fail early, catching and preventing
> reproducibility regressions (semantic or otherwise).
> - That the naming is easily conflated with true reproducible builds,
> therefore creating the potential for misunderstanding among consumers.
>
> Cheers,
> James
>