Why is not everything reproducible yet?

Bernhard M. Wiedemann bernhardout at lsmod.de
Thu Feb 15 04:55:50 UTC 2024



On 14/02/2024 16.19, Santiago Torres-Arias wrote:
> 1. can we study the conflicting interests (i.e., above) that stop
> 	reproducibility from happening.

Yes, that should be possible. The above summarizes my experience from
the 1000 patches and bug reports I submitted and from the interactions
with various upstreams.
The links are public and recorded in the monthly reports
https://salsa.debian.org/reproducible-builds/reproducible-website/-/tree/master/_reports
and earlier weekly posts
https://salsa.debian.org/reproducible-builds/reproducible-website/-/tree/master/_blog/posts

I can probably provide more input for such a study.

> 2. Are misunderstandings about reproducibility getting in the way of
> 	pushing for it (e.g., the notion that docker containers are
> 	inherently reproducible). Is the perfect the enemy of the good?
> 	what notions of reproducibility exist and how can we build a
> 	roadmap from the weak to the strong?

There are some.
One is the confusion with what we started to call "repeatable builds",
i.e. the ability to do a second build with the same explicit inputs.
SBOMs help with repeatable builds, but if they become embedded in the
build output, they can even hinder some side-benefits of reproducible
builds, because every minor change in inputs now causes a change in
output.
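
To illustrate the SBOM point, here is a minimal Python sketch. It is a
toy model, not a real build system: the build() function, the tool
names and the way the SBOM is appended to the artifact are all made up
for illustration. It only shows that a version bump in an input that
does not influence the payload still changes the artifact hash once the
input list is embedded.

```python
import hashlib
import json


def build(source: bytes, inputs: dict, embed_sbom: bool) -> bytes:
    """Toy stand-in for a build: the payload depends only on the source."""
    artifact = b"BIN:" + hashlib.sha256(source).digest()
    if embed_sbom:
        # Hypothetical embedding: append the list of inputs to the artifact.
        artifact += json.dumps(inputs, sort_keys=True).encode()
    return artifact


def digest(artifact: bytes) -> str:
    return hashlib.sha256(artifact).hexdigest()


source = b"int main(void) { return 0; }\n"
old_inputs = {"compiler": "13.2.0", "docs-tool": "1.0"}
# Minor bump of an input that has no effect on the payload (assumed).
new_inputs = {"compiler": "13.2.0", "docs-tool": "1.1"}

detached_same = (digest(build(source, old_inputs, False))
                 == digest(build(source, new_inputs, False)))
embedded_same = (digest(build(source, old_inputs, True))
                 == digest(build(source, new_inputs, True)))

print("SBOM kept separate, output identical:", detached_same)
print("SBOM embedded, output identical:", embedded_same)
```

With the SBOM stored next to the artifact instead of inside it, the
output stays bit-identical across such a bump.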

The other is
https://web.archive.org/web/20200807033032/https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducible-builds.html
which gained some anti-r-b mindshare, even though it neglected several
important aspects. E.g. it mentions the risk of source code being
stolen, which obviously does not apply to FLOSS.


> 3. What other uses of r-b exist beyond the malicious toolchain example?
> 	can we use them as leverage to increase interest in the space?

At a past r-b summit we collected
https://reproducible-builds.org/docs/buy-in/
E.g. in openSUSE we always pushed for some level of binary equivalence
to do build-tree pruning in our Open Build Service. That saves build
power, shortens rebuild time and saves bandwidth for mirrors and for
users, who do not need to update unchanged packages.
We also publish updates as delta-rpm packages, which would probably be
more compact with fewer random variations.
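
The pruning idea can be sketched roughly like this in Python. This is
not how the Open Build Service implements it; the package names, the
dependents map and the rebuild_with_pruning() helper are hypothetical.
The point is only that dependents get rebuilt when the output hash of a
package changed, which never prunes anything if every rebuild produces
different bits.

```python
import hashlib
from collections import deque


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def rebuild_with_pruning(changed, dependents, build, known_hashes):
    """Rebuild `changed`, then walk dependents only while outputs differ.

    Returns the list of packages that were actually rebuilt.
    """
    queue = deque([changed])
    seen = set()
    rebuilt = []
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        rebuilt.append(pkg)
        new_hash = sha256_hex(build(pkg))
        if new_hash == known_hashes.get(pkg):
            continue  # identical output: prune everything below this package
        known_hashes[pkg] = new_hash
        queue.extend(dependents.get(pkg, ()))
    return rebuilt


# Toy data: build outputs are looked up in a dict instead of compiling.
outputs = {"libfoo": b"libfoo-v1", "app": b"app-v1", "image": b"image-v1"}
dependents = {"libfoo": ["app"], "app": ["image"]}
known_hashes = {name: sha256_hex(data) for name, data in outputs.items()}

# Case 1: libfoo is rebuilt but its output is bit-identical.
unchanged_run = rebuild_with_pruning("libfoo", dependents,
                                     outputs.__getitem__, known_hashes)

# Case 2: the libfoo output really changed; app is rebuilt, and because
# its output stays identical in this toy, the image is pruned.
outputs["libfoo"] = b"libfoo-v2"
changed_run = rebuild_with_pruning("libfoo", dependents,
                                   outputs.__getitem__, known_hashes)

print(unchanged_run)
print(changed_run)
```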

The page also lists the QA aspect. I found a dozen corruption bugs
that had gone unnoticed for years.

e.g. https://gitlab.gnome.org/GNOME/libxslt/-/issues/37 had this 
memorable quote from upstream:
> This was caused by an interesting bug in libxml2's streaming XPath engine. I'm still puzzled why it took so long to discover this issue. 

So for your study, you could find this link in _reports/2020-04.md

Another corruption bug is in
_reports/2023-10.md:    * [`OpenRGB`](https://gitlab.com/CalcProgrammer1/OpenRGB/-/issues/3675) ([corruption-related issue](https://gitlab.com/CalcProgrammer1/OpenRGB/-/merge_requests/2103))


One benefit not listed is that with r-b it is possible to say "version 
1.2.3 has hash abcdef" and you can provide a signature of the file, 
without uploading the file itself. With content-addressable storage such 
as IPFS, you can then also link to such an artifact and anyone else can 
provide the correct file.

E.g. for
http://bafybeiezodttpdsrhy7gj7zuzklbs3exh42a4ezorsepnn74ar2gkicujy.ipfs.cf-ipfs.com/
if we had reproducible ISOs, I could build and sign them in a
low-bandwidth place, but build and upload them from another.
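
A small Python sketch of that workflow, under the assumption that the
build is bit-for-bit reproducible. The byte strings stand in for real
ISO images, and accept_artifact() is a hypothetical helper. The actual
signature over the digest (e.g. with OpenPGP) is left out here; only
the hash comparison is shown.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def accept_artifact(candidate: bytes, published_digest: str) -> bool:
    """Accept a file from any source if it matches the published digest."""
    return hmac.compare_digest(sha256_hex(candidate), published_digest)


# Built where bandwidth is scarce: only the digest (and a signature
# over it, not shown) needs to leave this machine.
iso_from_signer = b"reproducible ISO contents"
published_digest = sha256_hex(iso_from_signer)

# Independent rebuild on a well-connected machine, which does the upload.
iso_from_uploader = b"reproducible ISO contents"
# A modified copy offered by some untrusted mirror.
iso_tampered = b"reproducible ISO contents + backdoor"

good = accept_artifact(iso_from_uploader, published_digest)
bad = accept_artifact(iso_tampered, published_digest)

print("rebuild accepted:", good)
print("tampered copy accepted:", bad)
```

Because the check only depends on the content, it does not matter who
serves the file, which is what makes content-addressable storage a
good fit.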



Ciao
Bernhard M.