[rb-general] Regarding "Zero Install" manifests

Ximin Luo infinity0 at debian.org
Fri May 5 20:46:00 CEST 2017


Sorry for the delayed reply.

Anders Björklund:
> Ximin Luo wrote:
>> Hi Anders, do you have a specific proposal to suggest for us here?
>> The ideas all sound good but I'm not sure what the overall system
>> you're suggesting should be.
>>
>> At the moment we already *have* buildinfo files (i.e. signed
>> manifests), and the next step is to figure out what sorts of logic we
>> should add to say, `apt-get` so that users get a good sense of "how
>> reproducible" the packages that they're installing are.
> 
> Oh, when I asked my question I got the impression that there was no
> standardized output format (that would contain any checksums etc)
> 
> Looking at the docs, I saw only generic explanations but no formats:
> https://reproducible-builds.org/docs/checksums/
> https://reproducible-builds.org/docs/embedded-signatures/
> So that is why I gave an example of such a format that does exist ?
> 
> 
> Looking at https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles
> it seemed rather specific to Debian and I didn't see any contents ?
> 

Ah right, I understand now. Debian buildinfo files currently do not contain checksums of the build-dependencies, nor of the packages installed at build time, for unfortunate technical reasons. I've been meaning to figure out how to do this properly and then file a bug against dpkg; thanks for the reminder.

(Of course, they contain checksums of the actual source and binaries being built.)
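For concreteness, here is a minimal Python sketch of checking built artifacts against such checksums. It assumes a `Checksums-Sha256` field whose continuation lines are "digest size filename"; the sample text, the package name and the helper names `parse_checksums` and `verify` are made up for illustration, and this is not how dpkg or apt actually do it.

```python
import hashlib


def parse_checksums(buildinfo_text):
    """Return {filename: (sha256_hex, size)} from a Checksums-Sha256 field."""
    entries = {}
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
        elif in_field and line.startswith(" "):
            # Continuation line: "<digest> <size> <filename>"
            digest, size, name = line.split()
            entries[name] = (digest, int(size))
        else:
            # Any other field ends the checksum list.
            in_field = False
    return entries


def verify(entries, artifacts):
    """True if every listed file is present with matching size and digest."""
    return all(
        name in artifacts
        and len(artifacts[name]) == size
        and hashlib.sha256(artifacts[name]).hexdigest() == digest
        for name, (digest, size) in entries.items()
    )


# Hypothetical artifact and a made-up buildinfo-like text describing it.
artifacts = {"hello_1.0-1_amd64.deb": b"pretend package contents"}
data = artifacts["hello_1.0-1_amd64.deb"]
sample = (
    "Format: 1.0\n"
    "Source: hello\n"
    "Checksums-Sha256:\n"
    f" {hashlib.sha256(data).hexdigest()} {len(data)} hello_1.0-1_amd64.deb\n"
    "Build-Architecture: amd64\n"
)

entries = parse_checksums(sample)
ok = verify(entries, artifacts)

# Changing a single byte of the artifact makes verification fail.
tampered = {"hello_1.0-1_amd64.deb": data + b"!"}
tampered_ok = verify(entries, tampered)
```

The missing piece described above would be the same kind of entry for each package installed at build time, not only for the build outputs.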

However, adding checksums is not a sophisticated or complex concept in itself, so I am not sure that studying how other people have done it is worth the time (given the other things that need to be done). Or did you spot any particular "gotchas" or insightful implementation tricks that you could tell us about?

> The idea was for a single format that would describe the binaries.
> Wouldn't hurt if it was something like how git describes the code ?
> 

It wouldn't hurt, no. Eventually it would be good to try to standardise and unify the different packaging standards. This is a very hard task though, and in the meantime it's not clear to me what the benefit is if we have unified buildinfo files but they are still produced and consumed by different distros' incompatible packaging systems.

> [..]
> 
>>> It is much better to checksum the binaries (than e.g. the
>>> tarballs), because then the _same_ files can be distributed in lots
>>> of ways...
>>
>> Distributing the same files in different arrangements, could in
>> theory activate a backdoor in one arrangement but not the other.
>> 0install has a hash for each whole-arrangement as well, so there is
>> no security issue there. However, I'm just pointing out that the only
>> benefit of "hashing each file" would be for potential deduplication,
>> not security.
> 
> Keeping the checksum of the files rather than the archive they came in,
> offers the possibility to verify this checksum. I think it is handy ?
> 
> And yeah, the possibility for deduplication and caching is also nice.
> Even for distributed variants of those, or delta updates, or whatever.
> 

Concrete proposals are welcome. :) It's hard for me to reply without one. These are all potentially nice ideas, but it's not obvious whether they are worth the cost of adding them to the various systems.

We've already established that one must verify the *whole package* for the most confidence; what benefit do you see in users verifying each individual file in addition to the *whole package*? If you are talking about the kernel making sure that system files haven't been tampered with, that is a separate issue, and it can be done at installation time after verifying the whole package, without needing to carry checksums for specific files.
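To illustrate the per-file versus whole-arrangement point, here is a minimal Python sketch. The manifest scheme in `arrangement_digest` is made up for illustration (it is not 0install's or Debian's actual format), and the paths and contents are hypothetical.

```python
import hashlib


def file_digest(data):
    """SHA-256 of one file's contents."""
    return hashlib.sha256(data).hexdigest()


def arrangement_digest(tree):
    """Hash a whole arrangement: every path together with its content hash."""
    h = hashlib.sha256()
    for path in sorted(tree):
        h.update(path.encode("utf-8") + b"\0")
        h.update(file_digest(tree[path]).encode("ascii") + b"\n")
    return h.hexdigest()


# The same two files, placed at different paths.
a = {"bin/tool": b"tool", "lib/plugin.so": b"plugin"}
b = {"bin/tool": b"tool", "lib/autoload/plugin.so": b"plugin"}

# Per-file hashes cannot tell the two arrangements apart...
same_files = sorted(map(file_digest, a.values())) == sorted(map(file_digest, b.values()))

# ...but a hash over the whole arrangement can.
same_arrangement = arrangement_digest(a) == arrangement_digest(b)

print(same_files, same_arrangement)  # prints: True False
```

So the per-file hashes are useful for deduplication and caching, but only the whole-arrangement hash tells you that you got the package you expected.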

> 
> But nowadays I'm mostly using Docker images anyway, I suppose that
> is _another_ standard for describing the binaries... (e.g. the OCI)
> 
> They just recently changed to a content-addressable format, though.
> https://github.com/moby/moby/wiki/Engine-v1.10.0-content-addressability-migration
> 

I'm not familiar with Docker; could you explain this in some more detail?

Buildinfo files do not only describe a binary, they also describe *how it was built*. This is the important part; it means people can try to reproduce the build. I'm not aware that Docker has an equivalent. Also, in this field (i.e. with related tools that try to recreate "images") the term "reproducible" is often used in a slightly different (weaker) way: it means a semi-reproducible build environment, but the images may or may not be bitwise-identical if you run the build twice. The reproducibility of the build environment is seen as the primary positive characteristic.

By contrast, we use "reproducible builds" to mean a semi-variable build environment that, *despite the variations*, generates bitwise-identical end results. The primary positive characteristic is the bitwise-identical *end result*.
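As a toy Python sketch of that definition: run the "build" under deliberately varied environments and require that every output hashes the same. The two build functions and the environment keys (`time`, `host`) are hypothetical stand-ins; a real check would vary much more and compare real build outputs.

```python
import hashlib


def build_leaky(source, env):
    """Toy build that embeds the build time and hostname in its output."""
    return source + f"\nbuilt {env['time']} on {env['host']}".encode("utf-8")


def build_clean(source, env):
    """Toy build whose output depends only on the source."""
    return source + b"\nbuilt from source only"


def is_reproducible(build, source, envs):
    """True if the output is bitwise-identical across all environments."""
    digests = {hashlib.sha256(build(source, env)).hexdigest() for env in envs}
    return len(digests) == 1


source = b"int main(void) { return 0; }"
envs = [
    {"time": "2017-05-05T20:46", "host": "builder-a"},
    {"time": "2017-05-06T09:00", "host": "builder-b"},
]

leaky_result = is_reproducible(build_leaky, source, envs)
clean_result = is_reproducible(build_clean, source, envs)

print(leaky_result, clean_result)  # prints: False True
```

Pinning the environment would make the leaky build look fine; varying it is what exposes the leak.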

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

