Introducing: Semantically reproducible builds
David A. Wheeler
dwheeler at dwheeler.com
Mon May 29 03:25:49 UTC 2023
On Sun, 28 May 2023 08:02:18 +0200, "Bernhard M. Wiedemann via rb-general"
<rb-general at lists.reproducible-builds.org> wrote:
> I agree, that it is good to give it a name (I have called it
> semi-reproducible before), but we should be clear on communicating the
> disadvantages.
Agreed.
> However, while working with the tool, I already found three (3) bugs in
> build-compare that made it report packages with significant differences
> as 'identical'.
Obviously that's bad. However, my current alternative is
"hoping for the best when downloading from PyPI" and I'm not
a fan of that process either.
If you have tips on common, likely errors, please post them; I think
that would be of interest to many.
> And if you don't rely on such tools, you need expensive manual reviews
> every time that cannot be automated and might also miss issues.
Impractical. In 2019 it was reported that a new application
created using "create-react-app" version 2.1.5 had 1,568 dependencies.
<https://news.ycombinator.com/item?id=19195148>.
React is really popular in the JavaScript ecosystem.
Yes, this is "create-react-app" not React itself, and there are many
caveats, but the *reality* is that often users are trying to deal with
thousands of dependencies & we need *some* way to flag the
most concerning ones.
> Another disadvantage of such binaries is that you don't have a single
> correct SHAsum that can be signed, communicated and compared easily.
> You always need the full binary to compare to your rebuild.
Agreed, that's a problem. To be fair, usually there's a
"canonical binary package" that people use & whose hash they can
check. In fact, many package managers let you specify a hash.
The problem is that there are few ways for people to gain confidence that
this package *is* generated from the putative source code.
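
To illustrate the hash pinning mentioned above, here is a minimal sketch in Python. The function names are hypothetical and not taken from any package manager. It only shows that the bytes received match a previously recorded SHA-256 digest; as noted, it says nothing about whether those bytes were built from the claimed source.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_hash(data: bytes, pinned_hex: str) -> bool:
    """Check downloaded bytes against a pinned SHA-256 hex digest.

    The pinned value is assumed to come from a lock file or similar
    record; upper- or lowercase hex is accepted.
    """
    return hmac.compare_digest(sha256_hex(data), pinned_hex.strip().lower())
```

A caller would read the downloaded package file as bytes and refuse to install it when `matches_pinned_hash` returns False.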
> The cleaner way is to use strip-nondeterminism to remove all these
> insignificant bits during build and make the resulting bit-reproducible
> output the official binary.
For a Linux *distributor* this makes sense. If you have control over the
build process, a more rigorous build process is great, and hardening that
build process against attacks is a wonderful idea (e.g., OpenSSF's SLSA).
As a *recipient* who has no control over the build process used by
someone else to create their package, I need some workable
alternatives to estimate risk.
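
One rough sketch of such an alternative, assuming the package is a zip-based archive (as Python wheels are) and assuming that member timestamps, member order, and compression settings are the only differences treated as insignificant. The function names are hypothetical. The sketch also ignores file permissions, which could be significant, so it is exactly the kind of tool that can wrongly report 'identical' and should be treated as a risk estimate rather than proof.

```python
import hashlib
import io
import zipfile


def archive_contents(data: bytes) -> dict[str, str]:
    """Map each file name in a zip archive to the SHA-256 of its content.

    Timestamps, member order, compression level and permissions are
    deliberately left out of the result.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            info.filename: hashlib.sha256(zf.read(info)).hexdigest()
            for info in zf.infolist()
            if not info.is_dir()
        }


def semantically_equal(a: bytes, b: bytes) -> bool:
    """True when both archives hold the same file names with the same bytes."""
    return archive_contents(a) == archive_contents(b)
```

A recipient could rebuild the package from the putative source, pass the bytes of the rebuilt and the published archives to `semantically_equal`, and flag the package for closer review when the result is False.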
Thanks!
--- David A. Wheeler