Introducing: Semantically reproducible builds

Fri Jun 2 15:10:20 UTC 2023

> A project build is `semantically equivalent` if its build results can be
either recreated exactly (a bit for bit [reproducible build](
https://en.wikipedia.org/wiki/Reproducible_builds)), or if the differences
between the release package and a rebuilt package are not expected to
produce functional differences in normal cases.

This is why I was attempting to bring things back to desired
characteristics.  Intrinsic in what I suggested, talking about equivalent
outputs, is that you have a definition of equivalence that is sufficiently
precise that any two independent parties given the definition of
equivalence and a pair of outputs will always produce the same answer as to
whether the outputs are equivalent.  I don't think that "the differences
between the release package and a rebuilt package are not expected to
produce functional differences in normal cases" meets that basic criterion.
I could easily imagine two independent parties having differing
opinions about whether two outputs are equivalent under that definition.
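
As an illustration only, here is a minimal sketch of what a definition
precise enough to meet that criterion might look like.  It is an assumption
made up for this example, not anything OSSGadget or oss-reproducible is
known to do: two zip archives are called equivalent if and only if they
contain the same member names with the same content bytes, ignoring
timestamps, member order, and compression settings.  The names `manifest`
and `equivalent` are hypothetical.

```python
import hashlib
import io
import zipfile


def manifest(archive_bytes: bytes) -> dict:
    """Map each file member's name to the SHA-256 of its contents.

    Hypothetical rule for this sketch: timestamps, member order, and
    compression settings are ignored; names and content bytes count.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {
            info.filename: hashlib.sha256(zf.read(info)).hexdigest()
            for info in zf.infolist()
            if not info.is_dir()
        }


def equivalent(a: bytes, b: bytes) -> bool:
    """Return True iff both archives have identical manifests."""
    return manifest(a) == manifest(b)
```

Whether this particular rule is the right one is a separate question.  The
point is that any two parties running it on the same pair of outputs get
the same answer, which "not expected to produce functional differences in
normal cases" cannot promise.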

Say I have an executable with a typo in its 'usage'.  I fix the typo in the
'usage' and decide to assert that the executable is semantically equivalent
to the previous executable.  From my point of view, it's not functionally
different, because I don't consider 'usage' to be functional.  Someone else
may be *parsing* the usage, and does consider that to be a 'functional'
difference.
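
A small sketch of that second party, under invented assumptions: the tool
name `frob`, its flag, both usage strings, and the helper
`supported_flags` are hypothetical.  A consumer that discovers supported
flags by parsing the usage text sees different behavior before and after
the typo fix.

```python
import re

# Hypothetical usage strings before and after the typo fix.
OLD_USAGE = "usage: frob [--verbsoe] FILE"
NEW_USAGE = "usage: frob [--verbose] FILE"


def supported_flags(usage: str) -> set:
    """Extract long option names from a usage line."""
    return set(re.findall(r"--[a-z-]+", usage))


# The downstream parser gets a different answer from each build, so for
# this consumer the change is a functional difference.
flags_differ = supported_flags(OLD_USAGE) != supported_flags(NEW_USAGE)
```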

It's incredibly easy to convince yourself that the differences between two
executables are "not expected to produce functional differences in
normal cases".  You have "expected", "functional differences", and "normal
cases" to work with as blank spaces into which almost any definition can be
pushed.

Please don't get me wrong, the OSSGadget folks may be doing *really* good
work.  My complaint is that the definition of "Semantically Reproducible"
is effectively unusable as written above.  Can it be tightened up to
something that at the very least meets this characteristic:

- Two independent parties given the definition of equivalence and a pair of
outputs will always produce the same answer as to whether the outputs are
equivalent.

Ed

On Fri, Jun 2, 2023 at 9:46 AM David A. Wheeler <dwheeler at dwheeler.com>
wrote:

>
>
> > On May 31, 2023, at 10:36 AM, Ed Warnicke <hagbard at gmail.com> wrote:
> >
> > I tend to think about reproducible builds in this generalizable way:
> >
> > A build is reproducible if equivalent inputs (source, build tools, build
> > tool invocation, etc) to the build result in equivalent outputs.
>
> Fair enough. The immediate issue is to reduce confusion.
>
> The OSSGadget developers have decided to switch to the term "semantic
> equivalency" and "semantically equivalent":
> > A project build is `semantically equivalent` if its build results can be
> > either recreated exactly (a bit for bit [reproducible build](
> > https://en.wikipedia.org/wiki/Reproducible_builds)), or if the
> > differences between the release package and a rebuilt package are not
> > expected to produce functional differences in normal cases.
>
> Links:
> https://github.com/microsoft/OSSGadget/issues/426
> https://github.com/microsoft/OSSGadget/pull/429
>
> Their oss-reproducible tool, part of OSSGadget, uses a variety of steps to
> determine if a build is semantically equivalent.
>
> --- David A. Wheeler
>
>