[rb-general] Reproducing tarballs under various toolchains
Daniel Shahaf
danielsh at apache.org
Wed Sep 19 01:40:37 CEST 2018
I've been looking into how I might create reproducible *.tar.(gz|bz2|xz)
artifacts.
https://reproducible-builds.org/docs/archives/ recommends:
.
# requires GNU Tar 1.28+
$ tar --sort=name \
      --mtime="@${SOURCE_DATE_EPOCH}" \
      --owner=0 --group=0 --numeric-owner \
      -cf product.tar build
.
which is sort of an answer: even on BSD one can install GNU tar as gtar
and arrange, with some PATH manipulation, for it to be called. This
makes gtar the gold standard: one reproduces a tarball by reproducing
gtar's behaviour bug for bug or, more likely, by running gtar.
That's nice, but it also makes gtar a single point of failure. If
there's a bug in gtar then everyone who uses gtar to achieve
reproducible tarballs would be affected. This is particularly jarring
because BSD tar _does_ have equivalent functionality for at least some
of these options, just under different names (e.g. s/--owner/--uid/
and s/--group/--gid/).
So I suppose what I'm saying is:
One, it would be nice to be able to reproduce a tarball without having
to use exactly the same toolchain. (If I had to market this I would
say, "There's more to reproducibility than being deterministic.")
Two, GNU tar and BSD tar have an instance of xkcd.com/927/ in the names
of their option flags. It's hard to patch upstream tarball-rolling
scripts to be reproducible when doing so would make them unportable.
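To make the flag-naming problem concrete, here is a minimal sketch of how a
tarball-rolling script might pick the ownership flags based on which tar it
finds. The helper name tar_owner_flags is hypothetical, and the sketch
assumes the implementation can be recognised from the first line of
`tar --version` ("GNU tar" or "bsdtar"). It covers ownership only;
--sort and --mtime would need separate handling, and even with matching
flags the two implementations may still differ in default archive format.

```sh
#!/bin/sh
# Hypothetical helper: print the ownership flags for the tar
# implementation identified by the version string passed as $1.
tar_owner_flags() {
    case "$1" in
        *"GNU tar"*)
            # GNU tar spelling, as in the recommendation quoted above
            echo "--owner=0 --group=0 --numeric-owner"
            ;;
        *bsdtar*)
            # bsdtar (libarchive) spelling of the same settings
            echo "--uid 0 --gid 0 --numeric-owner"
            ;;
        *)
            # Unknown implementation: refuse rather than guess
            echo "unrecognised tar: $1" >&2
            return 1
            ;;
    esac
}

# Possible usage (not executed here):
#   flags=$(tar_owner_flags "$(tar --version | head -n 1)") || exit 1
#   tar $flags -cf product.tar build
```

Something like this keeps a script portable, but it only papers over the
naming difference; it does not by itself show that the two tars produce
identical bytes.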
Cheers,
Daniel
[1] https://man.freebsd.org/tar
[2] https://manpages.debian.org/tar