Introducing: Semantically reproducible builds

Fri Jun 2 21:36:09 UTC 2023

> On Jun 2, 2023, at 12:26 PM, Andreas Enge <andreas at enge.fr> wrote:
> 
> Hello,
> 
> Am Fri, Jun 02, 2023 at 11:39:42AM -0400 schrieb David A. Wheeler:
>> I think the OSSGadget folks aren't fussed about this, because they're merely using this definition to explain what they're doing.
> 
> my first reaction also was "this is not a definition"! They are a few vague
> words describing a property that, defined strictly (for instance something
> like: "Two binaries are semantically equivalent if run in any environment
> with any input they produce identical outputs and side effects"), is
> certainly undecidable - it requires being able to decide the halting problem.

The proof of that is charmingly short. Here's the summary from my 2009 PhD dissertation
<https://dwheeler.com/trusting-trust/dissertation/html/wheeler-trusting-trust-ddc.html> section 5.6.1:

"Note that equality is a stricter relationship than equivalence. Two executables may be considered equivalent in an environment if they always produce equal outputs given equal inputs, even if their internal structure and/or values are different. Two executables that are equal are always equivalent, but equivalent executables need not be equal. Unfortunately, determining if two executables E1 and E2 are equivalent is undecidable in the general case. This is because if there was any decision procedure D capable of determining equivalence, it could be invoked by E1 and E2. If found equivalent they could perform different operations, and if found different they could act the same [Cohen1984, part 4]. Even in very special cases it is often difficult to determine the equivalence of two unequal executables. Instead of focusing on the difficult-to-determine equivalence relationship, we will instead focus on the stricter equality relationship, which is a far easier and more practical test to perform. Proof #2 and proof #3 will show that under certain common conditions, two executables will be equal (not just equivalent), so limiting proof #1 to equality does not significantly limit its practical utility."

> Reproducible builds are easily decidable and imply what one really wants,
> semantic equivalence.
> 
> Of course one can then pile up any number of heuristics and one-sided tests
> and conclude that one has not found a proof that two different binaries are
> semantically inequivalent, but this is not a proof that they are semantically
> equivalent (just as no number of tests proves that some code is correct).
> 
>> I think their answer on "are these semantically equivalent" would be, "run the current version of OSSGadget and the answer is what it says". It's a very straightward operational definition once put that way :-).
> 
> Then they could call it "OSSGadget-equivalent"...

I don't see the need for that. Practically all other security measures are heuristic estimates,
and we simply require information on who did the work & why they believe their claim is correct.

In this case, "We believe the posted package BAR is semantically equivalent to a regenerated package
(and thus is semantically equivalent to what the BAR source code says it will do)
because we organization ORG ran OSSGadget version FOO <with hash HASH>, and the tool
determined that they were semantically equivalent." We use tools to make assertions all the time.
If the tool gets it wrong, we try to fix the tool, and that updated tool has a new version number
to let us identify précisely what it does and does not do.

This si similar to statements like "We (organization ORG) use process XYZ to
evaluate the security of software BAR; we found vulnerabilities <list> and once those are
fixed we believe the software is adequately secure." I love the press towards proof and
exactitude, but remember where we're starting from :-).

--- David A. Wheeler