Reproducible tarballs on GitHub?

Julien Lepiller julien at lepiller.eu
Sat Oct 23 19:21:31 UTC 2021


On 23 October 2021 15:13:21 GMT-04:00, "David A. Wheeler" <dwheeler at dwheeler.com> wrote:
>
>
>> On Oct 23, 2021, at 3:01 PM, Paul Spooren <mail at aparcar.org> wrote:
>> 
>> 
>> 
>>> On 23. Oct 2021, at 08:55, Bernhard M. Wiedemann <bernhardout at lsmod.de> wrote:
>>> 
>>> 
>>> 
>>>> On 23/10/2021 20.14, David A. Wheeler wrote:
>>>> 
>>>> A given version of tar should produce deterministic results. However, if
>>>> tar is updated, it’s not really reasonable to expect that the result will
>>>> be identical.
>>> 
>>>> It’s reasonable for GitHub to change its default tar implementation. What would you suggest as an alternative?
>> 
>> Can’t we reach out to GitHub/Microsoft and request that they fix their implementation? As a source distribution it should be their priority to keep user trust high.
>
>Sure. I know some folks who might be willing to help.
>But it’s not clear to me that they’ll view this as “fixing their implementation”.
>
>Complaining isn’t enough; you need to give them something easy and concrete to *do*.
>They use a toolset to create tarballs, and they *will* occasionally update the tools in that toolset.
>It’s useless to ask them to never update. That’d be a bad plan anyway.
>“Never compress” seems like a bad plan too; that’d make huge files.
>
>If there’s a flag or process that could force determinism for all future time without much loss, that’s a possibility; *is* there one?
>I don’t see one offhand, but I haven’t looked very seriously either.
>
>Usually determinism is for a specific version of a tool suite. What you’re asking for is much broader:
>you’re asking for determinism *regardless* of the underlying archive tool version.
>It’s not wrong to ask, but first you’d need to figure out how to reasonably accomplish it.
>
>--- David A. Wheeler
>

We've hit this issue in Guix before, so now we simply ignore the autogenerated tarball and use either a release tarball made by the author (not by GitHub) or clone the repository directly at the required commit.

I believe you could also checksum the contents of the archive instead of the archive file itself. The contents should always be the same, whatever compression or archiving method is used.
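
To illustrate the content-checksum idea, here is a minimal Python sketch. It is not what Guix does; content_digest is a hypothetical helper, and the choice of what counts as "content" (entry type, executable bit, path, file bytes or symlink target) is my own assumption. Timestamps, ownership, member order and the compression format are deliberately left out of the hash.

```python
import hashlib
import tarfile


def content_digest(archive_path):
    """Return a SHA-256 hex digest of an archive's contents (hypothetical helper)."""
    digest = hashlib.sha256()
    # "r:*" lets tarfile auto-detect the compression (none, gzip, bzip2, xz).
    with tarfile.open(archive_path, "r:*") as archive:
        # Sort by name so the order of members in the archive does not matter.
        for member in sorted(archive.getmembers(), key=lambda m: m.name):
            if member.isfile():
                kind, payload = b"file", archive.extractfile(member).read()
            elif member.issym():
                kind = b"symlink"
                payload = member.linkname.encode("utf-8", "surrogateescape")
            elif member.isdir():
                kind, payload = b"dir", b""
            else:
                continue  # assumption: hard links, devices, etc. are ignored
            executable = b"x" if member.mode & 0o100 else b"-"
            name = member.name.encode("utf-8", "surrogateescape")
            for field in (kind, executable, name, payload):
                # Length-prefix each field so field boundaries are unambiguous.
                digest.update(len(field).to_bytes(8, "big"))
                digest.update(field)
    return digest.hexdigest()
```

Calling content_digest("project-1.0.tar.gz") (a made-up file name) should then give the same value for a .tar, .tar.gz or .tar.xz of the same tree, as long as the paths inside the archive are identical. Note that it reads each file fully into memory, which is fine for a sketch but not for very large archives.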

