[rb-general] Reproducible builds and distributed CI
liw at liw.fi
Wed Jun 19 10:29:04 UTC 2019
On Sun, May 19, 2019 at 01:09:40PM +0300, Lars Wirzenius wrote:
> [various things]
I fear I may have been unclear in my previous message, sorry. I'll try
to clarify below. Also, it turns out I'm more than a little busy, so
it's taken a month to respond, sorry again.
* I'm building a new CI system, for general purpose use, not for the
reproducible builds project, although it'd be excellent if it
benefited RB as well. The aim is for those who develop software to
use the new CI system as they would use, say, Jenkins or Travis, for
their own benefit.
* One of the things I'm exploring is ways to have a "distributed CI",
where CI build workers can be provided by anyone. This opens up
questions of trust. If a worker produces some output, how do we know
the output can be trusted? It might be intentionally misbuilt, or even
have malware embedded in it. It's not possible to prove that it
isn't, I think, but I got the idea that if N workers do the same
build, and they produce the same output, then it's OK. In other
words, each worker would do a build in the way reproducible builds
does, to allow bitwise identical outputs.
* I don't know if that's actually feasible in practice. On the one
hand, how difficult is it to build something reproducibly? On the
other hand, it would require massively over-provisioning workers.
For the former, I need help and advice. For the latter, I'm hoping
to (in the long term) grow a community that provides workers for
free and open source projects via a distributed CI system. Anyone
who can run a VM or Docker container on their desktop or laptop, at
least occasionally, could "donate" build time for FOSS.
* The "donated worker build time" idea may be too crazy to become
reality. On the other hand, the idea of writing an operating system
kernel was crazy in 1991, but Linux happened. The idea of hobbyists
building a world-class operating system from free components was
crazy in 1992 and 1993, but Slackware and Debian happened. The idea
of making an entire operating system build reproducibly was crazy
just a few years ago. I'm a big believer in aiming high. If I don't
aim high, I end up only shooting the ground.
Another way to think about this is "peer to peer CI". Given millions
of people did and still do use BitTorrent, having a few thousand
software developers provide a slice of their computing resources to
help with CI builds does not sound impossible to me.
* If my to-be future distributed CI system can do this, it will pretty
much require all projects that use it to build reproducibly. I think
this will be a good thing, even if it doesn't become hugely popular.
Making more developers aware of what they need to do to build
software reproducibly can't be a bad thing. I realize this might
mean it's too difficult to use my CI system; we'll see how this
* My CI system will also support more traditional use, where the
software developer provides the build workers, and only their
workers will be used for their project. Trust is much easier in this
case, and this bypasses any problems in getting builds to be
reproducible. But it also won't be distributed the way I'm
* I've made a note of the links to other projects doing something
similar to my vision, and will be investigating them later.
* Thank you for your patience and any further feedback you may have.
My questions are:
* Is it feasible for workers to provide an environment for
reproducible builds? From what I've read and heard over the years,
there are a lot of details, but it's mostly not difficult as such,
only a bit of work.
* Is the approach of at-least-N bitwise identical builds sensible,
assuming sufficient build workers are available? Or are there
security aspects and risks there that I am missing?
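
To make the at-least-N idea concrete, here is a minimal sketch in
Python. It assumes each worker reports a SHA-256 digest of its build
output and that the controller accepts a digest only if at least N
workers reported it. The names artifact_digest and accepted_digest,
the worker IDs, and the choice of SHA-256 are assumptions made for
illustration, not part of any existing CI system.

```python
import hashlib
from collections import Counter
from typing import Dict, Optional


def artifact_digest(data: bytes) -> str:
    """Hex SHA-256 of a build output (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def accepted_digest(reports: Dict[str, str], n: int) -> Optional[str]:
    """Return the digest reported by at least n workers, else None.

    reports maps a hypothetical worker ID to the digest it reported.
    If more than one digest reaches n reports, the result is treated
    as ambiguous and nothing is accepted.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    counts = Counter(reports.values())
    winners = [digest for digest, votes in counts.items() if votes >= n]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    good = artifact_digest(b"bitwise identical output")
    bad = artifact_digest(b"tampered output")
    reports = {"worker-a": good, "worker-b": good, "worker-c": bad}
    print(accepted_digest(reports, 2) == good)
```

The sketch only counts matching reports. It says nothing about
whether the agreeing workers are independent of each other, which is
one of the things the second question is asking about.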
I want to build worthwhile things that might last. --joeyh