[rb-general] Reproducing tarballs under various toolchains

John Gilmore gnu at toad.com
Wed Sep 19 20:07:15 CEST 2018

When I wrote the original pdtar in 1985 (which later became GNU Tar), I
compared its output to that of the Unix tar command running on Sun
workstations.  There were various bugs or inconsistencies, including
whether particular subfields of the per-file header were terminated by
spaces or nulls (the checksum field was different).  I made pdtar match
what Unix tar created, byte for byte.
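
To make the field-termination point concrete, here is a minimal sketch
that writes a one-file ustar archive in memory and pulls out two header
fields.  It uses Python's tarfile module as a stand-in writer, which is
an assumption for illustration only: it shows how that module
terminates the fields, not what pdtar or Sun's tar emitted.  The helper
name first_header and the sample file name are hypothetical.

```python
import io
import tarfile


def first_header(name="hello.txt", data=b"hi\n"):
    """Return the 512-byte header block of a one-file ustar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name)  # hypothetical sample member
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()[:512]


header = first_header()

# Field offsets follow the classic tar header layout:
# mode occupies bytes 100-107, the checksum bytes 148-155.
mode_field = header[100:108]
chksum_field = header[148:156]

# repr() makes the trailing NUL and space bytes visible.
print("mode  :", repr(mode_field))
print("chksum:", repr(chksum_field))
```

Comparing the tail bytes of the two fields shows the kind of
difference described above: the checksum field ends differently from
the other numeric fields.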

Of course, in the decades since I stopped maintaining it, it may have
strayed from that goal.

I don't know the provenance of the free BSDs' tar.

In order to make tarballs reproducible with both GNU and BSD tools,
using the same arguments, I recommend submitting a change to BSD tar
that would make the required option (--owner) a synonym for --uid.

If tested and verified as working, this change would make the latest BSD
tar a complete replacement for GNU tar in generating reproducible builds
that produce tarballs.  This should be attractive to the BSD
maintainers, as it's an easy change with low likelihood of creating
problems, with an upside for the uptake of BSD and its tools in the
broader community.

It appears that the --sort=name option is relatively new in GNU tar
(added only two releases back, in v1.28), so an alternative would be to propose
adding --uid to GNU tar.  This has the disadvantage of requiring
builders to use the latest GNU tar to make reproducible builds with
consistent command line options.  But if the GNU maintainers are quicker
to accept the change, it may be the best option.
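
The option mismatch discussed above can be sketched as two argument
lists.  This is only an illustration: the function names, the "bsdtar"
binary name (on the BSDs it is usually installed as plain "tar"), and
the choice of uid/gid 0 are assumptions, and the code builds the
command lines without running them.  It covers only the ownership
options; other sources of variation, such as timestamps, are left out.

```python
def gnu_tar_args(archive, paths, uid=0, gid=0):
    """Argument list for GNU tar (1.28 or later, because of --sort=name)."""
    return [
        "tar",
        "--sort=name",        # stable member order, GNU tar >= 1.28
        f"--owner={uid}",     # GNU spelling of the ownership override
        f"--group={gid}",
        "--numeric-owner",
        "-cf", archive,
        *paths,
    ]


def bsd_tar_args(archive, paths, uid=0, gid=0):
    """Argument list for BSD tar, which spells the override --uid/--gid."""
    return [
        "bsdtar",             # assumed binary name
        "--uid", str(uid),
        "--gid", str(gid),
        "-cf", archive,
        *paths,
    ]


if __name__ == "__main__":
    print(" ".join(gnu_tar_args("out.tar", ["src"])))
    print(" ".join(bsd_tar_args("out.tar", ["src"])))
```

Until one tool accepts the other's spelling, a build script has to
carry both variants and pick one according to which tar it finds.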

