Three bytes in a zip file
larry at doolittle.boa.org
Thu Apr 6 08:28:17 UTC 2023
I'm trying to set up a process that generates byte-for-byte reproducible zip files.
I got the contents to be identical, including timestamps and permissions.
But three bytes at the 98.08% mark (bytes 5543078 to 5543081,
out of a file size of 5651451) differ between my run and a friend's run.
Velocity-dependent? His was done on a train. ;-)
try.diffoscope.org is no help.
"Format-specific differences are supported for ZIP archives but no file-specific differences were detected; falling back to a binary diff."
I can get the same info as provided by diffoscope with
$ diff <(hexdump marble-ea2bb52c-mb-fab.zip) <(hexdump marble-ea2bb52c-ld-fab.zip)
< 05494a0 0300 ca68 642c 73cf 642e 7875 000b 0401
> 05494a0 0300 ca68 642c ca68 642c 7875 000b 0401
That is, 73cf642e becomes ca68642c.
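One possible reading of those words, offered as a guess rather than a diagnosis: hexdump's default output prints 16-bit words in host byte order, so on a little-endian machine "73cf 642e" would be the 32-bit value 0x642e73cf and "ca68 642c" would be 0x642c68ca. Assuming (and nothing in the output confirms this) that the field is a Unix timestamp, GNU date can decode both values. The helper name decode_ts is made up for this sketch.

```shell
# Assumption: the differing field is a little-endian 32-bit Unix timestamp.
# decode_ts is a hypothetical helper; it needs GNU date for the -d "@N" form.
decode_ts() {
    # $(( )) turns the hex literal into decimal seconds; -u prints UTC.
    date -u -d "@$(($1))" '+%Y-%m-%d %H:%M:%S'
}

decode_ts 0x642c68ca   # value that appears in both files
decode_ts 0x642e73cf   # value that appears in only one file
```

If that reading is right, the first value falls on 2023-04-04 and the second on the morning of 2023-04-06 (UTC), which would make the second one look like a time recorded during one of the runs rather than something derived from the sources.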
The diff is so small, it seems silly to post both files, but I'll
do that anyway.
Are there any zip file format experts here who can explain where this comes from?
And, more importantly, suggest how to fix the environment to prevent it?
The script making this file is at
but because I got the _contents_ to match already, I assert
the only important lines for the purposes of this question are
touch --date="@$SOURCE_DATE_EPOCH" fab/*
TZ=UTC zip --latest-time "$zipfile" fab/*
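One check that might help narrow this down is to look at both the access time and the modification time of the inputs right after the touch step, and again just before zip runs. The following is a minimal sketch, assuming GNU coreutils; the demo directory, the file name, the epoch value, and the show_times helper are placeholders and are not taken from my script. Whether zip records the access time at all is an assumption I have not verified.

```shell
# Placeholders for illustration only: a scratch "demo" directory and an
# arbitrary epoch value stand in for fab/ and the real SOURCE_DATE_EPOCH.
SOURCE_DATE_EPOCH=1680632010
workdir="$(mktemp -d)"
mkdir "$workdir/demo"
echo "placeholder" > "$workdir/demo/a.txt"

# Same touch invocation as in the script; without -a or -m it sets
# both the access time and the modification time.
touch --date="@$SOURCE_DATE_EPOCH" "$workdir"/demo/*

# GNU stat: %X = access time, %Y = modification time (seconds since epoch).
show_times() {
    stat -c '%X %Y %n' "$@"
}
show_times "$workdir"/demo/*
```

If the two columns still match immediately before zip runs on one machine but not on the other, that would point at something in the environment touching the access time between the two steps.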
Side note: the "ea2bb52c" in the file names above refers
to the commit ID in the GitHub repo.