"Reproducible build" definition in OpenSSF glossary

Fri Jul 11 23:04:31 UTC 2025

On 6/29/25 22:07, Ismael Luceno wrote:
> Isn't upstream not caring enough of a red flag?

Oh, I wish. It's such an uphill battle. When I started 5 years ago, only one or
so mobile Bitcoin wallet was functionally reproducible (reproducible except for
the zip compression and the signature). Now we track thousands, and the bulk are
custodial (which is worse in the sense of your Bitcoins not being safe, but these
are all also closed source, so they might do whatever on your phone prior to
stealing your coins). We try to meet the providers where they are, as most are
just not familiar with the concept of reproducible builds, and many of the open
source projects genuinely start caring after we make them aware of the issue.
But the Play Store ecosystem builds on signed binaries, so "bit by bit"
reproducibility is unachievable for the "executable" if you consider the full
apk file on Android as the artifact. If you consider it as a zip file and check
only its content without the signature, then many open source apps are
reproducible. As there are libraries that introduce short random strings (UUIDs),
with the library providers not willing to fix this and the app providers not
willing to drop these libraries, we still attest "functional reproducibility"
in these cases.
That said, I wish devs and users would care so we can stop doing these extra 
steps when reproducing binaries we find on Google Play.
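
To illustrate the "zip content without the signature" check described above,
here is a minimal sketch. It is not our actual tooling. The function names are
made up for this example, and it assumes the only signing residue inside the
zip is the JAR-style (v1) set of files under META-INF/. Signatures stored
outside the zip entries are never read by this approach, so they drop out on
their own.

```python
import hashlib
import zipfile

# Assumption: JAR-style (v1) signing adds these files under META-INF/.
SIGNATURE_SUFFIXES = (".RSA", ".DSA", ".EC", ".SF")


def is_signature_entry(name):
    """Return True for zip entries that signing adds or rewrites."""
    upper = name.upper()
    if not upper.startswith("META-INF/"):
        return False
    return upper.endswith(SIGNATURE_SUFFIXES) or upper == "META-INF/MANIFEST.MF"


def content_digests(apk_path):
    """Map entry name -> SHA-256 of the *uncompressed* entry content."""
    digests = {}
    with zipfile.ZipFile(apk_path) as zf:
        for info in zf.infolist():
            if info.is_dir() or is_signature_entry(info.filename):
                continue
            # Hashing the decompressed bytes ignores zip compression settings.
            digests[info.filename] = hashlib.sha256(zf.read(info)).hexdigest()
    return digests


def diff_apks(official_path, rebuilt_path):
    """Return the sorted names of entries that differ or exist on one side only."""
    official = content_digests(official_path)
    rebuilt = content_digests(rebuilt_path)
    names = official.keys() | rebuilt.keys()
    return sorted(n for n in names if official.get(n) != rebuilt.get(n))
```

An empty list from diff_apks() means the two files agree on everything but
compression and signature. A non-empty list still needs a human to look at
each entry, which is exactly the part that does not scale.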
Google Play is also not the only platform where we find signed binaries. Firmware
for Bitcoin hardware wallets also usually comes with embedded signatures
that we have to disregard. Each vendor uses its own format, usually with
some header that contains the signatures at a fixed offset. It makes sense to
bundle the signature to make "trust on first use" work, but it makes it a mess to
call something reproducible then. We still try, as the goal is to provide
transparency. But it's a slippery slope: we can't just call something functionally
reproducible when the diff looks benign - to an expert - after 3h of analysis, as
those 3 expert hours won't scale for what we are trying to do.
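
For the firmware case, "disregarding" a signature at a fixed offset can be
sketched roughly like this. The offset and length are parameters because every
vendor format differs. The function name and the numbers in the usage note are
hypothetical and do not describe any real device.

```python
import hashlib


def digest_without_signature(image: bytes, sig_offset: int, sig_length: int) -> str:
    """SHA-256 of a firmware image with the signature region zeroed out.

    sig_offset and sig_length are assumptions that have to come from the
    vendor's documented header layout; nothing here detects them.
    """
    if sig_offset < 0 or sig_length < 0 or sig_offset + sig_length > len(image):
        raise ValueError("signature region lies outside the image")
    # Replace the signature bytes with zeros so only the rest is compared.
    blanked = (
        image[:sig_offset]
        + bytes(sig_length)
        + image[sig_offset + sig_length:]
    )
    return hashlib.sha256(blanked).hexdigest()
```

With a made-up layout of a 64-byte signature at offset 8, the official and the
rebuilt image match if digest_without_signature(official, 8, 64) equals
digest_without_signature(rebuilt, 8, 64). Anything beyond such a mechanical
rule is where the slippery slope starts.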