[rb-general] lzip/plzip alternatives

Vagrant Cascadian vagrant at debian.org
Wed Apr 19 22:58:38 CEST 2017


On 2017-04-19, Sylvain wrote:
> On Wed, Apr 19, 2017 at 06:53:24PM +0000, Peter Stuge wrote:
>> On Tue, Apr 18, 2017 at 08:43:05PM +0000, Daniel Shahaf wrote:
>> > Sylvain wrote on Tue, Apr 18, 2017 at 21:50:23 +0200:

>> > It's perfectly fine for lzip and plzip not to produce identical output
>> > to each other; whichever of them was *actually used* by the build should
>> > be recorded in the .buildinfo file, which would enable reproducing that
>> > particular build.  (That's exactly analogous to a "gzip 1.3 and gzip 1.4
>> > produce different outputs for identical inputs" situation.)
>
> .buildinfo is nice but this is specific to Debian packaging AFAICS.

That's certainly not the intention; there is some work done already to
support RPM based systems as well.

A .buildinfo file is simply a codified assertion: "I built software X
with build environment Y, and got result Z."

That should be all the information (X and Y) you need to verify that you
also can get the same result (Z). If you don't get the same result,
that's also useful information.
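
To make that assertion concrete, here is a minimal sketch in Python. It is
not the real .buildinfo format: the field names and the make_record/verify
helpers are hypothetical, and the "environment" is reduced to a plain
dictionary of tool versions purely for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def make_record(source: bytes, environment: dict, artifact: bytes) -> dict:
    """Codify "I built X with environment Y and got Z" (hypothetical layout)."""
    return {
        "source_sha256": sha256_hex(source),      # X
        "environment": dict(environment),         # Y, e.g. tool versions
        "artifact_sha256": sha256_hex(artifact),  # Z
    }


def verify(record: dict, source: bytes, environment: dict,
           rebuilt_artifact: bytes) -> str:
    """Compare a rebuild against a record and say where it diverges."""
    if sha256_hex(source) != record["source_sha256"]:
        return "different source"
    if dict(environment) != record["environment"]:
        return "different environment"
    if sha256_hex(rebuilt_artifact) != record["artifact_sha256"]:
        return "not reproduced"
    return "reproduced"
```

Either outcome of verify() tells you something: "reproduced" adds one more
independent confirmation, and anything else points at what to look into.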


> I personally welcome features that allow reproducibility by default

You can't prove that something is reproducible; you can only prove that
it's not. At best, you can demonstrate that it's reproducible in a
multitude of varied configurations... if you can document it.

So tools that produce output reproducibly are a prerequisite for
reproducible builds...


> - I mean, the front-page example at https://reproducible-builds.org/
> is Volkswagen firmware.

Without the source or the build instructions, independent verification
isn't really possible...


> About gzip:
>
> $ ls -lh a
> -rw-r--r-- 1 me me 81M Apr 19 21:23 a
> $ gzip -c a > a.gz16
> $ gzip-1.4/gzip -c a > a.gz14
> $ gzip-1.3.13/gzip -c a > a.gz13
> $ diff a.gz13 a.gz14
> [no differences]
> $ diff a.gz13 a.gz16
> [no differences]

It's great that multiple versions of gzip happen to not change the
output, but without documenting which version was used, when gzip
inevitably *does* change behavior (security fixes, optimizations, code
refactoring, etc.), you don't have the information needed to track down
why a build was unreproducible.

You can prove all past versions of gzip produce output reproducibly, but
can you prove that all future versions will (or even should)?

And many or even most tools used in building software aren't going to be
as stable across versions as gzip, notably compilers. I would expect a
newer version of a compiler to produce better code, and that seems
perfectly reasonable.

So, even if gzip *never* breaks backwards compatibility, you need to
track other software used in the build system somehow (e.g. .buildinfo
or other method)... so you may as well track the gzip version as well.
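
As a rough illustration of why tracking the version helps, the sketch
below records tool versions alongside a compressed output and uses them to
explain a mismatch. It uses Python's gzip module (backed by zlib) as a
stand-in for the gzip command-line tool; the record layout and the
compress_with_provenance/explain_mismatch helpers are hypothetical.

```python
import gzip
import hashlib
import platform
import zlib


def compress_with_provenance(data: bytes) -> dict:
    """Compress data and note which tool versions produced the result."""
    # mtime=0 keeps the current time out of the gzip header.
    out = gzip.compress(data, compresslevel=9, mtime=0)
    return {
        "output_sha256": hashlib.sha256(out).hexdigest(),
        "tools": {
            "python": platform.python_version(),
            "zlib": zlib.ZLIB_RUNTIME_VERSION,
        },
    }


def explain_mismatch(a: dict, b: dict) -> list:
    """List the tools whose versions differ when two outputs differ."""
    if a["output_sha256"] == b["output_sha256"]:
        return []  # same output, nothing to explain
    names = sorted(set(a["tools"]) | set(b["tools"]))
    changed = [n for n in names if a["tools"].get(n) != b["tools"].get(n)]
    return changed or ["unknown: recorded tools match, look elsewhere"]
```

Without the "tools" part of the record, a differing hash only tells you
that something changed, not where to start looking.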

I find it *interesting* if build tools produce artifacts reproducibly
across versions, but that doesn't seem like something to expect for most
tools.


live well,
  vagrant