[Git][reproducible-builds/reproducible-website][master] Migrate 'history' from the Debian wiki.

Chris Lamb (@lamby) gitlab at salsa.debian.org
Sat Jul 31 13:21:48 UTC 2021



Chris Lamb pushed to branch master at Reproducible Builds / reproducible-website


Commits:
1b5838fe by Chris Lamb at 2021-07-31T14:21:01+01:00
Migrate 'history' from the Debian wiki.

- - - - -


2 changed files:

- _data/docs.yml
- + _docs/history.md


Changes:

=====================================
_data/docs.yml
=====================================
@@ -1,6 +1,7 @@
 - title: Introduction
   docs:
   - definition
+  - history
   - buy-in
   - plans
   - test-bench


=====================================
_docs/history.md
=====================================
@@ -0,0 +1,354 @@
+---
+title: History
+layout: docs
+permalink: /docs/history/
+---
+
+A history of *reproducible builds* in and around Debian.
+
+## An old idea
+
+The idea of *reproducible builds* is not very new. It was implemented
+for GNU tools in [the early 1990s](https://lists.reproducible-builds.org/pipermail/rb-general/2017-January/000309.html)
+(which we learned much later, in 2017). In the Debian world, it was
+first mentioned in
+[2000](https://lists.debian.org/debian-devel/2000/11/msg01758.html),
+and then more explicitly in
+[2007](https://lists.debian.org/debian-devel/2007/09/msg00746.html)
+on `debian-devel`: "*I think it would be really cool if the Debian policy
+required that packages could be rebuild bit-identical from source.*" The
+reactions were unfortunately not very enthusiastic either time.
+
+## Private property + Snowden effect
+
+Interest in reproducible builds picked up again with Bitcoin. Bitcoin
+users needed a way to trust that they were not downloading corrupted
+software. Initial versions of [Gitian](http://gitian.org/) were written in 2011
+to solve the problem. It drives builds using virtual machines and Git.
+
+The [global surveillance disclosures](https://en.wikipedia.org/wiki/Global_surveillance_disclosures_%282013%E2%80%93present%29)
+in 2013 raised the interest even further. Mike Perry worked on [making the Tor Browser build reproducibly](https://blog.torproject.org/blog/deterministic-builds-part-two-technical-details)
+in [fear](https://blog.torproject.org/blog/deterministic-builds-part-one-cyberwar-and-global-compromise)
+of a "*malware that attacks the software development and build processes
+themselves to distribute copies of itself to tens or even hundreds of millions
+of machines in a single, officially signed, instantaneous update*".
+
+## Kick-off
+
+The success of making such a large piece of software build reproducibly
+proved that it was feasible for other projects. This prompted Lunar to
+organize a [discussion at
+DebConf13](http://penta.debconf.org/dc13_schedule/events/1063.en.html)
+happening July 2013. Although scheduled at the last minute, it still drew
+about thirty very interested attendees, amongst them
+members of the [technical
+committee](https://www.debian.org/devel/tech-ctte) and a few
+other core teams.
+[Minutes](attachment:dc13-bof-reproducible-builds.txt) are
+available.
+
+After some more research during the conference, a [wiki
+page](https://wiki.debian.org/ReproducibleBuilds?action=recall&rev=1)
+was created. The initial approach was to get Debian to "buy-in" on the
+idea by making five packages from different maintainers build
+reproducibly. However, it quickly became apparent that, without first
+fixing issues in the toolchain, it would not be possible to make even a
+single package reproducible.
+
+## First mass-rebuilds
+
+Lunar came up with the first patches for `dpkg` in August
+2013. This enabled `hello` to build reproducibly. The [first
+large scale rebuild](ReproducibleBuilds/Rebuild20130907) was
+performed soon after by David Suárez, with variations on time and build
+path. 24% of 5240 source packages were identified as reproducible. The
+[first version of a "smart" comparison
+script](https://wiki.debian.org/ReproducibleBuilds?action=recall&rev=83#Crude_bash_script_to_compare_two_binary_packages)
+was written to help review differences.
+
+A [second mass rebuild](ReproducibleBuilds/Rebuild20140126)
+was made before the [presentation in the distro devroom at FOSDEM'14](https://fosdem.org/2014/schedule/event/reproducibledebian/).
+It used a slightly different approach regarding build paths and had
+`binutils` built in deterministic mode. 67% of 6887 source
+packages were found reproducible, a result applauded by the FOSDEM
+crowd.
+
+The presentation sparked interest and revived the
+[mailing-list](https://lists.alioth.debian.org/mailman/listinfo/reproducible-builds)
+created some months earlier. Tomasz Buchert wrote a [lintian check for gzip
+files](http://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20140210/000038.html).
+Stéphane Glondu worked on sorting logs and [experimenting with
+alternatives for build path
+issues](http://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20140217/000053.html).
+
+## .buildinfo control files
+
+In parallel, several approaches to where and how to record the build
+environment were considered. The first idea was to use the [.changes control file](https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-debiansourcecontrolfiles)
+through a substitution variable (bug #719854). Instead, Guillem Jover
+suggested adding new fields by passing
+`--changes-option="-DBuild-Env=..."` to `dpkg-buildpackage`. As for
+the value, we discovered `dh-buildinfo` written by Yann Dirson,
+described as a "debhelper addon to track package versions used to build
+a package". Fit for reproducible builds!
+
+[What happened for a year](https://summit.debconf.org/debconf14/meeting/78/reproducible-builds-for-debian/)
+was presented at [DebConf14](http://debconf14.debconf.org/).
+The reception was unexpectedly good and the follow-up BoF truly
+productive. For one thing, a suitable way to record the build
+environment was sketched out.
+
+One issue with using `.changes` files is that they are not kept in the
+archive. So to be used as a way to record the environment, they would need to
+be distributed with the archive. But this would be a misunderstanding of their
+purpose. As their name implies, `.changes` control files represent *changes* to
+the archive. They were inherently designed to be transient.
+
+So instead, we had the idea of a new `.buildinfo` control file which
+would be added to the archive alongside binary packages, and be
+uploaded by being referenced in `.changes`. We quickly drafted a
+[specification](https://wiki.debian.org/ReproducibleBuilds/BuildinfoSpecification),
+and a couple of days later Niko Tyni came up with [an addition to
+debhelper](https://anonscm.debian.org/cgit/reproducible/debhelper.git/tree/dh_genbuildinfo?h=pu/reproducible_builds_2014&id=1543ea2535160bf9578149c681eb7ff324901471)
+which created a `.buildinfo` using the output of the aforementioned
+`dh-buildinfo`.
+
+## strip-nondeterminism
+
+Before DebConf14, an explicit *timestamp* was given during rebuilds,
+extracted from the `.changes` file. However, during the discussions,
+there was a consensus that the date of the latest entry in the
+[debian/changelog](https://www.debian.org/doc/debian-policy/ch-source.html#s-dpkgchangelog)
+file could be used as the reference timestamp when needed.
+
+This helped with another idea: a generic tool that would post-process
+different file formats to remove timestamps or other sources of
+non-determinism. Andrew Ayer took on the task of creating
+`strip-nondeterminism`. The first released version handled gzip, Zip,
+Jar, and Javadoc files, as well as `.a` archives.
+
+## Giving up on build paths
+
+Initially we thought that variations caused by building the package
+from different build paths should be eliminated. This has proven
+difficult. The main problem identified is that full paths
+to source files are written into the debug symbols of ELF files.
+
+The first attempt used the
+[-fdebug-prefix-map](https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Debugging-Options.html)
+option, which allows mapping the current directory to a canonical one in
+what gets recorded. But compiler options get written to the debug information as
+well, so it has to be combined with `-gno-record-gcc-switches` to be
+usable for reproducibility. The [first large scale
+rebuild](ReproducibleBuilds/Rebuild20130907) showed that
+it was also hard to determine accurately what the actual build path
+had been.
+
+The second attempt used `debugedit`, which is used by Fedora and
+others to change the source paths to a canonical location after the
+build. Unfortunately, `gcc` writes debug strings in a *hashtable*.
+`debugedit` will not reorder the table after patching the strings, so
+the result is still unreproducible. Adding this feature to `debugedit`
+looked difficult. We can still make the approach work by passing
+`-fno-merge-debug-strings`, but this is expensive in space. The [second
+large scale rebuild](ReproducibleBuilds/Rebuild20140126) used
+the latter approach. It was still difficult to guess the initial build
+path properly. Stéphane Glondu was the first to [suggest using a
+canonical build
+path](http://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20140217/000065.html)
+to solve the issue.
+
+During [discussions at
+DebConf14](http://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20140901/000198.html),
+we revisited the idea, and felt it was indeed appropriate to decide on a
+canonical build path. It has an added benefit of making it easier to use
+debug packages: one simply has to unpack the source in the right place,
+no extra configuration required.
+
+Finally, it was
+[agreed](http://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20141117/000560.html)
+to add a `Build-Path` field to `.buildinfo`, as it makes it easier to
+reproduce the initial build should the canonical build location
+change.
+
+## Improved comparison tool
+
+After the initial upload of `strip-nondeterminism` and the integration of some
+more changes discussed during DebConf14 into `dpkg` and
+`debhelper`, Lunar [experimented with 172 core
+packages](http://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20140915/000441.html).
+30% were reproduced without further modifications.
+
+As the current tools to understand differences between builds were slow
+and hard to read, Lunar wrote `debbindiff`. It
+[replaced](http://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20140929/000492.html)
+inefficient shell scripts with structured Python producing HTML output.
+
+## Continuous integration
+
+At the end of September 2014, Holger Levsen started to work on extending
+[jenkins.debian.net](Services/jenkins.debian.net) to perform
+continuous integration for build reproducibility. Packages from *sid*
+started to be built and rebuilt. This initially introduced variations
+for time and file ordering, and was extended later on to also use
+different users, groups, hostnames, and locales.
+
+The results were visible through a new
+[reproducible.debian.net](https://reproducible.debian.net/)
+website. The process of analyzing reproducibility failures could now be
+more easily shared. New contributors indeed showed up and started
+sorting out common issues and providing patches.
+
+In July of 2015, [Vagrant began hosting ARM
+boards](https://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20150727/002492.html)
+for reproducibility testing on the armhf architecture. They were added to
+Jenkins in August of 2015, and by December, nearly all packages on armhf
+had been tested at least once.
+
+## dpkg-genbuildinfo
+
+The turn of 2015 saw the replacement of the prototype `.buildinfo`
+generator by a new implementation suitable for proper inclusion in
+`dpkg`. Previously, only packages using `dh` could generate
+`.buildinfo` and could thus be considered reproducible. After updating
+the experimental toolchain, the change made it possible to reach the mark of [80%
+of source packages being
+reproducible](https://people.debian.org/~lunar/blog/posts/eighty_percent/).
+
+## FOSDEM 2015 and aftermath
+
+The presentation [Stretching out for trustworthy reproducible
+builds](https://fosdem.org/2015/schedule/event/stretching_out_for_trustworthy_reproducible_builds/)
+was well received at FOSDEM 2015 and was followed up by:
+
+* [tracker.debian.org](https://tracker.debian.org) inclusion, see [739497](https://bug.debian.org/739497)
+* [Debian Developer's Packages Overview](https://qa.debian.org/developer.php) (DDPO) inclusion
+* debbindiff gained .rpm support
+* [Debian Maintainer Dashboard](https://udd.debian.org/dmd/) inclusion
+
+Finally, not even two weeks after FOSDEM 2015, a mail with the
+subject "[Reproducible Builds --- proof of concept successful for 83%
+of all sources in
+main](https://lists.debian.org/debian-devel-announce/2015/02/msg00007.html)"
+was sent to
+[debian-devel-announce@lists.debian.org](https://lists.debian.org/debian-devel-announce/),
+officially announcing the project to the Debian developer community at
+large.
+
+## To be sorted out
+
+* 2015-03-26: [`binutils` `2.25-6` is built with `--enable-deterministic-archives`](https://tracker.debian.org/news/675691)
+* testing `testing` and `experimental` now, package sets available too.
+* 2015-05-27: [`iceweasel` `38.0.1-5`](https://tracker.debian.org/news/687110) is reproducible.
+
+## Google Summer of Code 2015
+
+During the summer of 2015 akira and Dhole will be working on moving
+forward reproducible builds as a Google Summer of Code project. Follow
+the links to see the accepted applications: [akira's
+application](SummerOfCode2015/StudentApplications/MariaValentinaMarinRodrigues)
+and [Dhole's
+application](SummerOfCode2015/StudentApplications/EduardSanou).
+Dhole also made a [blog post about how Dhole got into GSoC
+2015](https://dhole.github.io/post/reproducible_builds_debian_gsoc2015/).
+
+## CCCamp 2015
+
+Short mention of Lunar's talk to be written here. Add links.
+
+## DebConf15
+
+To be written: the first real-life meeting of the Debian team. Talk
+given, roundtable discussion, hacking session. Mentioned in several
+talks, including the DPL keynote. `SOURCE_DATE_EPOCH` was invented around this
+time too.
+
+## Continuous tests for Coreboot, OpenWrt, NetBSD, FreeBSD, Arch Linux and Fedora
+
+To be written: tests for these six projects have been added between June
+and December 2015...
+
+## Reproducible World Summit, December 1-3, 2015, Athens, Greece
+
+To be written: maybe some photos to be shared, pointers to reports, new
+mailing lists, new IRC channel; an even wider community has started to
+grow,
+[website](https://reproducible-builds.org/events/athens2015/).
+
+## 2016 and 2017
+
+These years are largely missing here; we should fix this sooner rather than later.
+
+In January 2017 we learned that John Gilmore [wrote an interesting
+mail about how Cygnus.com worked on reproducible builds in the early
+1990s](https://lists.reproducible-builds.org/pipermail/rb-general/2017-January/000309.html).
+It's eye-opening to see how they dealt with basically the very same
+problems we're dealing with today, how they solved them, and then to
+realize that most of this has been forgotten and bit-rotted in the last
+20 years. How will we prevent history repeating itself here?
+
+On August 21st 2017, reproducible builds were first mentioned in Debian
+Policy, in version 4.1.0.
+
+## Archive wide rebuilds
+
+* [2013-09-07](ReproducibleBuilds/Rebuild20130907) by David Suárez. 24% of 5240 source packages reproducible. Variations: time, build path.
+* [2014-01-26](ReproducibleBuilds/Rebuild20140126) by David Suárez. 67% of 6887 source packages reproducible. Variations: time, build path.
+* [2014-09-19](ReproducibleBuilds/RebuildCore20140919) by Lunar. 30% of 172 core source packages reproducible. Variations: time, file order.
+* [Updated daily since 2014-09-28](https://reproducible.debian.net/userContent/reproducible.html) by jenkins.debian.net. On 2014-11-11, 13213 (61.4%) out of 21448 packages are reproducible.
+
+## Publicity
+
+[Publicity](https://wiki.debian.org/ReproducibleBuilds/About#Publicity)
+
+## Contributors
+
+* akira (Maria Valentina Marin)
+* Alexis Bienvenüe
+* Andrew Ayer
+* Asheesh Laroia
+* Ceridwen
+* Chris Lamb
+* Chris West
+* Christoph Berg
+* Daniel Kahn Gillmor
+* Daniel Shahaf
+* David Suárez
+* Dhole
+* Dmitry Bogatov
+* Drew Fisher
+* Esa Peuha
+* Fabian Wolff
+* Guillem Jover
+* Hans-Christoph Steiner
+* Helmut Grohne
+* Holger Levsen
+* HW42
+* Intrigeri
+* Jelmer Vernooij 
+* josch (Johannes Schauer)
+* Juan Picca
+* Lunar (Jérémy Bobbio)
+* Mathieu Bridon
+* Mattia Rizzolo
+* Nicolas Boulenguez 
+* Niels Thykier
+* Niko Tyni
+* Paul Gevers
+* Paul Wise
+* Peter De Wachter
+* Philip Rinn
+* Reiner Herrmann
+* hefee (Sandro Knauß)
+* Sascha Steinbiss
+* Satyam Zode
+* Scarlett Clark
+* Santiago Vila
+* Stefano Rivera
+* Stéphane Glondu
+* Steven Chamberlain
+* Tom Fitzhenry
+* Valerie Young
+* Valentin Lorentz
+* Wookey
+* Ximin Luo



View it on GitLab: https://salsa.debian.org/reproducible-builds/reproducible-website/-/commit/1b5838fe4ceee4ffd7b1ea8b8f71995c16acc408
