[Git][reproducible-builds/reproducible-website][master] history: include minutes from dc13-bof

Holger Levsen (@holger) gitlab at salsa.debian.org
Wed Nov 15 17:32:15 UTC 2023



Holger Levsen pushed to branch master at Reproducible Builds / reproducible-website


Commits:
0dd0ada4 by Holger Levsen at 2023-11-15T18:31:20+01:00
history: include minutes from dc13-bof

downloaded from https://wiki.debian.org/ReproducibleBuilds/History?action=AttachFile&do=get&target=dc13-bof-reproducible-builds.txt

thanks to iyanmv for pointing out the link was not working.

Signed-off-by: Holger Levsen <holger at layer-acht.org>

- - - - -


2 changed files:

- + _docs/history-dc13-minutes.txt
- _docs/history.md


Changes:

=====================================
_docs/history-dc13-minutes.txt
=====================================
@@ -0,0 +1,379 @@
+apt-get install gobby-infinote
+gobby -c gobby.debian.org -n
+debconf13/bof/reproducible-builds
+
+Byte-for-byte identical reproducible builds?
+============================================
+
+BoF at DebConf13 / Vaumarcus, Switzerland; chair: Lunar
+
+Abstract:
+
+    The Bitcoin client and the upcoming Tor Browser Bundle 3.0 series
+    are using a build system that produces “deterministic builds” —
+    packages which are byte-for-byte identical no matter who actually
+    builds them, or what hardware they use. The idea is that current
+    popular software development practices simply cannot survive
+    targeted attacks of the scale and scope that we are seeing today.
+    With “deterministic builds”, any individual can use an anonymity
+    network to download publicly signed and audited source code and
+    reproduce the builds exactly, without being subject to such
+    targeted attacks. If they notice any differences, they can alert
+    the public builders/signers, hopefully anonymously.
+
+    Are such ideas applicable to Debian? To what extent? What would be
+    the first stones to pave the way toward reproducible builds of
+    Debian packages?
+
+Foreword
+--------
+
+Huge, huge thanks to Asheesh for helping me prepare this BoF.
+
+Agenda
+------
+
+    “Good news everyone! We are going to get pwned!”
+                                         — Professor Farnsworth
+
+1. Go around: why do you care? (5-10 min.)
+2. Mike Perry's work on the Tor Browser Bundle (5 min.)
+3. Asheesh's experiments (5 min.)
+4. On the technical side, there are two aspects to the problem:
+   a. at the package level: How do we guarantee that given the same
+      source package and the same build environment, we get the same
+      binary results? (5-10 min.)
+   b. at the archive level: How to record the build environment of a
+      package (and enable its reproduction at a later time)?
+      (5 min.)
+5. What's next? (15 min.)
+
+Experience from making the Tor Browser Bundle builds reproducible
+-----------------------------------------------------------------
+
+Mike Perry worked on making the Tor Browser Bundle builds
+reproducible. That's hard work: Tor Browser is based on Firefox
+(huge code base) and is built for Linux, Mac OS X and Windows.
+
+  - How:
+    - Uses Gitian from Bitcoin
+      - Thin layer around Ubuntu virtualization tools
+      - Spins up an Ubuntu VM with fixed hostname, username,
+        path, and fake timestamps (via faketime)
+      - List packages and architecture
+      - Runs a bash script you specify
+      - Cross compiles for Windows (mingw-w64) and Mac (toolchain4)
+    - Took about 3-4 days per OS to write a working descriptor set
+      for Tor, Firefox and bundling/localization
+    - 2 weeks after starting, I was producing matching repeat builds
+      on my own hardware
+      - Issues:
+        - FIPS-140 mode has non-deterministic sigs on Linux
+        - Millisecond timestamps encoded by Firefox
+        - Mystery 3 bytes of randomness on Windows. Bitstomped
+    - 6 more weeks of work to get the builds to match externally
+      - Filesystem reordering
+        - Affects Zip, Tar, .a, and even aspects of Firefox scripts
+          - created wrappers for archives
+          - Firefox ordering enforced via sorting inputs in Firefox scripts
+      - Localization LC_ALL leaks
+        - Alters sort order
+      - Permissions differences
+        - Even though I set umask...
+
+To sum it up: the key things that need to be controlled are the hostname,
+username, build path, OS locale, uname output, toolchain version,
+and time. We can either make everything deterministic or record on first build and then replay on subsequent builds.
+
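The archive-related fixes listed above (sorted inputs, fixed timestamps, normalized ownership and permissions) can be sketched in Python. This is only an illustration, not the wrappers written for Gitian: the function name `deterministic_tar` and the choice of mtime 0 and mode 0644 are assumptions made for the example.

```python
import io
import tarfile


def deterministic_tar(files, mtime=0):
    """Return tar bytes for a {name: bytes} mapping, independent of input order.

    Hypothetical helper: every member gets the same mtime, owner and mode,
    and members are written in sorted order.
    """
    buf = io.BytesIO()
    # Plain USTAR keeps the headers minimal (no pax extended records).
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):          # fixes filesystem/dict ordering
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime              # fixed timestamp
            info.mode = 0o644               # does not depend on the umask
            info.uid = info.gid = 0         # numeric owner 0
            info.uname = info.gname = ""    # no user/group names
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    first = deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
    second = deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
    print(first == second)
```

Two calls with the same contents in a different insertion order give the same bytes, which is the property the archive wrappers are after.
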
+Results from Asheesh's experiments
+----------------------------------
+
+Asheesh jumped on the idea and played with the hello package.
+Rebuilt using faketime on top of fakeroot.
+
+* When you rebuild that way, the data.tar.gz of the built Debian
+  package has the same contents
+* Same with control.tar.gz
+
+However, the data.tar.gz and control.tar.gz files themselves *both* fail
+to match between builds. This is because of a semi-bug in dpkg; we need to
+convince dpkg to fix the 'not calling gzip -n' issue.
+
+* ELF binaries like /usr/bin/hello in the "hello" package
+  contain *no* timestamp that needs to be stripped.
+* gzip files need '-n' to be passed to gzip for avoiding embedding a
+  timestamp.
+* xz and bzip2 don't have this problem. I'm too pressed for time to
+  write a test script, but I did test it.
+* dedup.debian.net can be used to detect duplicates, especially if
+  we hack it to detect files that change between uploads of a package,
+  rather than just between packages.
+  - future work: ssdeep hashes, which could be useful for finding files
+    that should be duplicates but aren't
+
+NOTE that this might instead be because of the *timestamps* of files within
+control.tar.gz and data.tar.gz. I have not finished testing this theory,
+sadly, but here is a shell script I use to set up a lab:
+
+http://rose.makesad.us/~paulproteus/tmp/extract_both.sh
+ - please provide an index for the PTS :)
+
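To illustrate the `gzip -n` point above, here is a minimal Python sketch. It shows that two gzip streams of the same data differ only in the 4-byte MTIME header field, and that pinning it to 0 makes them identical. The helper name `gzip_bytes` is made up for this example; `gzip.GzipFile` with `mtime=0` and an empty filename is used as a stand-in for what `gzip -n` does, not as a description of dpkg's code.

```python
import gzip
import io


def gzip_bytes(data, mtime):
    """Compress data into a gzip stream with an explicit header timestamp."""
    buf = io.BytesIO()
    # filename="" keeps the original-name field out of the header,
    # mtime sets the MTIME field (bytes 4..7 of the stream).
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


if __name__ == "__main__":
    payload = b"hello, world\n" * 100
    early = gzip_bytes(payload, mtime=1)
    late = gzip_bytes(payload, mtime=2)
    print(early == late)                                   # timestamps leak in
    print(gzip_bytes(payload, 0) == gzip_bytes(payload, 0))  # pinned timestamp
```
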
+Package level issues
+--------------------
+
+### time
+
+ * Remove/strip the timestamps for build results.
+ * Use faketime (reports faked system time to programs). Time could be
+   automatically set to the time of the last debian/changelog entry.
+ * Base timestamps on timestamps of the source code, which should be unchanged
+ * Record time on first build and replay them later (see below).
+
+(In most cases, recording the time of the build is actually
+wrong. For documentation, what matters is the time of the last
+change in the source package and not the time of the build
+itself.)
+
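The idea of taking the build time from the last debian/changelog entry can be sketched as follows. This is an assumption-laden example, not an existing tool: the function name `last_changelog_timestamp`, the regular expression and the sample changelog (package name, maintainer, date) are all invented for illustration. It relies on the changelog trailer line using an RFC 2822 style date, which Python's `email.utils` can parse.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Trailer line of a changelog entry: " -- Name <email>  date"
TRAILER = re.compile(r"^ -- .+ <[^>]*>  (?P<date>.+)$", re.MULTILINE)

# Invented sample; the newest entry comes first in a changelog.
SAMPLE_CHANGELOG = (
    "example-package (1.0-1) unstable; urgency=low\n"
    "\n"
    "  * Hypothetical entry.\n"
    "\n"
    " -- Jane Doe <jane@example.org>  Thu, 15 Aug 2013 10:00:00 +0200\n"
)


def last_changelog_timestamp(changelog_text):
    """Return the Unix timestamp of the first (newest) changelog trailer."""
    match = TRAILER.search(changelog_text)
    if match is None:
        raise ValueError("no changelog trailer line found")
    return int(parsedate_to_datetime(match.group("date")).timestamp())


if __name__ == "__main__":
    stamp = last_changelog_timestamp(SAMPLE_CHANGELOG)
    print(datetime.fromtimestamp(stamp, tz=timezone.utc).isoformat())
```

The resulting value could then be handed to faketime or used to clamp timestamps in build results.
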
+### Build path
+
+ * Debian buildds use per-build temporary path names, so that any paths accidentally embedded in binaries do not exist on end-user systems (potential security issue).
+ * Stripping the path with debugedit (???)
+ * Correct solution: patch out where path appears -> use paths relative to the builddir
+   instead of having a common build directory for everyone.
+   (Because having encoded paths can hide real bugs, anyway.)
+
+### OS locale
+
+ * Use LANG=C.UTF-8 ? -> LC_ALL=C.UTF-8
+ * Let's make dpkg-buildpackage export this value
+   (or another wrapper? because dpkg-buildpackage is not
+   the policy canonical way to build all packages;
+   but debian/rules is painful)
+   Let's make this an option so that users see translated messages
+   and the buildds all build with English
+ * Change the policy to make dpkg-buildpackage be the canonical
+   way to build packages.
+
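A wrapper that pins the locale before running the build could look roughly like this. It is a sketch only: `reproducible_env` is a hypothetical name, and dropping `LANG`, `LANGUAGE` and the other `LC_*` variables is an extra precaution assumed here, not something decided at the BoF.

```python
import os


def reproducible_env(base):
    """Return a copy of an environment mapping with the locale pinned."""
    env = {
        key: value
        for key, value in base.items()
        # Drop every other locale-related variable to avoid ambiguity.
        if key not in ("LANG", "LANGUAGE") and not key.startswith("LC_")
    }
    env["LC_ALL"] = "C.UTF-8"   # value suggested in the notes above
    return env


if __name__ == "__main__":
    env = reproducible_env(os.environ)
    print(env["LC_ALL"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the build command, leaving the user's own shell locale untouched.
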
+### hostname, uname output, username
+
+liblietome?
+
+But kernel version is part of the build environment, so
+we might need to record that somewhere else. Are kernels used on buildds always available? Or are some using non-standard kernels?
+
+### toolchain version
+
+ * part of the system state and build info
+
+### file ordering issues
+
+Need to patch the build systems to add proper `sort` calls.
+
+### Randomisation
+
+ * Define seed?
+ * ASLR?
+
+### pid numbers
+
+Let's patch that out if needed.
+
+### Other issues?
+
+
+Archive level issues
+--------------------
+
+Not all packages are built on the buildds so the build environment isn't going to be the same (for now).
+
+.changes files are not currently kept except on mailing lists.
+
+We want .changes files: they are signed by the maintainer.
+
+If we keep .changes files, we can add a `XC-Built-Environment` field.
+It would add to the .changes files something like:
+
+Built-Environment:
+ apt (= 0.9.9.4), aptitude (= 0.6.8.2-1), aptitude-common (= 0.6.8.2-1),
+ base-files (= 7.2), base-passwd (= 3.5.26), bash (= 4.2+dfsg-1),
+ binutils (= 2.23.52.20130727-1), bsdutils (= 1:2.20.1-5.5),
+ build-essential (= 11.6), bzip2 (= 1.0.6-4), ccache (= 3.1.9-1),
+ coreutils (= 8.21-1), cpp (= 4:4.8.1-2), cpp-4.6 (= 4.6.4-4),
+ cpp-4.7 (= 4.7.3-6), cpp-4.8 (= 4.8.1-8), dash (= 0.5.7-3),
+ debconf (= 1.5.50), debconf-i18n (= 1.5.50),
+ debian-archive-keyring (= 2012.4), debianutils (= 4.4),
+ diffutils (= 1:3.2-8), dpkg (= 1.17.1), dpkg-dev (= 1.17.1),
+ e2fslibs (= 1.42.8-1), e2fsprogs (= 1.42.8-1), fakeroot (= 1.19-2),
+ findutils (= 4.4.2-6), g++ (= 4:4.8.1-2), g++-4.6 (= 4.6.4-4),
+ g++-4.8 (= 4.8.1-8), gcc (= 4:4.8.1-2), gcc-4.4-base (= 4.4.7-4),
+ gcc-4.5-base (= 4.5.4-1), gcc-4.6 (= 4.6.4-4), gcc-4.6-base (= 4.6.4-4),
+ gcc-4.7 (= 4.7.3-6), gcc-4.7-base (= 4.7.3-6), gcc-4.8 (= 4.8.1-8),
+ gcc-4.8-base (= 4.8.1-8), gnupg (= 1.4.14-1), gpgv (= 1.4.14-1),
+ grep (= 2.14-2), gzip (= 1.6-1), hostname (= 3.13),
+ initscripts (= 2.88dsf-43), insserv (= 1.14.0-5), less (= 458-2),
+ libacl1 (= 2.2.52-1), libapt-pkg4.12 (= 0.9.9.4), libasan0 (= 4.8.1-8),
+ libatomic1 (= 4.8.1-8), libattr1 (= 1:2.4.47-1), libblkid1 (= 2.20.1-5.5),
+ libboost-iostreams1.49.0 (= 1.49.0-4), libbz2-1.0 (= 1.0.6-4),
+ libc-bin (= 2.17-92), libc-dev-bin (= 2.17-92), libc6 (= 2.17-92),
+ libc6-dev (= 2.17-92), libcap2 (= 1:2.22-1.2),
+ libclass-isa-perl (= 0.36-5), libcloog-isl4 (= 0.18.0-2),
+ libcloog-ppl1 (= 0.16.1-3), libcomerr2 (= 1.42.8-1),
+ libcwidget3 (= 0.5.16-3.4), libdb5.1 (= 5.1.29-6), libdpkg-perl (= 1.17.1),
+ libept1.4.12 (= 1.0.9), libfile-fcntllock-perl (= 0.14-2),
+ libgcc-4.7-dev (= 4.7.3-6), libgcc-4.8-dev (= 4.8.1-8),
+ libgcc1 (= 1:4.8.1-8), libgdbm3 (= 1.8.3-12), libgmp10 (= 2:5.1.2+dfsg-2),
+ libgmpxx4ldbl (= 2:5.1.2+dfsg-2), libgomp1 (= 4.8.1-8),
+ libgpm2 (= 1.20.4-6.1), libisl10 (= 0.11.2-1), libitm1 (= 4.8.1-8),
+ liblocale-gettext-perl (= 1.05-7+b1), liblzma5 (= 5.1.1alpha+20120614-2),
+ libmount1 (= 2.20.1-5.5), libmpc2 (= 0.9-4), libmpc3 (= 1.0.1-1),
+ libmpfr4 (= 3.1.1-1), libncurses5 (= 5.9+20130608-1),
+ libncursesw5 (= 5.9+20130608-1), libpam-modules (= 1.1.3-9),
+ libpam-modules-bin (= 1.1.3-9), libpam-runtime (= 1.1.3-9),
+ libpam0g (= 1.1.3-9), libpcre3 (= 1:8.31-2), libppl-c4 (= 1:1.0-7),
+ libppl12 (= 1:1.0-7), libquadmath0 (= 4.8.1-8),
+ libreadline6 (= 6.2+dfsg-0.1), libselinux1 (= 2.1.13-2),
+ libsemanage-common (= 2.1.10-2), libsemanage1 (= 2.1.10-2),
+ libsepol1 (= 2.1.9-2), libsigc++-2.0-0c2a (= 2.2.10-0.2),
+ libslang2 (= 2.2.4-15), libsqlite3-0 (= 3.7.17-1),
+ libss2 (= 1.42.8-1), libstdc++-4.8-dev (= 4.8.1-8),
+ libstdc++6 (= 4.8.1-8), libstdc++6-4.6-dev (= 4.6.4-4),
+ libswitch-perl (= 2.16-2), libtext-charwidth-perl (= 0.04-7+b1),
+ libtext-iconv-perl (= 1.7-5), libtext-wrapi18n-perl (= 0.06-7),
+ libtimedate-perl (= 1.2000-1), libtinfo5 (= 5.9+20130608-1),
+ libtsan0 (= 4.8.1-8), libusb-0.1-4 (= 2:0.1.12-23.2),
+ libustr-1.0-1 (= 1.0.4-3), libuuid1 (= 2.20.1-5.5),
+ libxapian22 (= 1.2.15-2), linux-libc-dev (= 3.10.3-1),
+ login (= 1:4.1.5.1-1), lsb-base (= 4.1+Debian12),
+ make (= 3.81-8.2), mawk (= 1.3.3-17), mount (= 2.20.1-5.5),
+ multiarch-support (= 2.17-92), ncurses-base (= 5.9+20130608-1),
+ ncurses-bin (= 5.9+20130608-1), passwd (= 1:4.1.5.1-1), patch (= 2.7.1-3),
+ perl (= 5.14.2-21),
+ perl-base (= 5.14.2-21), perl-modules (= 5.14.2-21),
+ readline-common (= 6.2+dfsg-0.1), screen (= 4.1.0~20120320gitdb59704-9),
+ sed (= 4.2.2-2), sensible-utils (= 0.0.9), sysv-rc (= 2.88dsf-43),
+ sysvinit (= 2.88dsf-43), sysvinit-utils (= 2.88dsf-43),
+ tar (= 1.26+dfsg-6), tzdata (= 2013d-1), ucf (= 3.0027+nmu1),
+ util-linux (= 2.20.1-5.5), vim (= 2:7.3.923-3), vim-common (= 2:7.3.923-3),
+ vim-runtime (= 2:7.3.923-3), xz-utils (= 5.1.1alpha+20120614-2),
+ zlib1g (= 1:1.2.8.dfsg-1)
+
+   (Example naively generated by taking all packages installed
+    by pbuilder when building the `hello` package.)
+
+ * Do we want to trim this list? How?
+    -> use the access time to files in the various packages
+       to determine what was used or not (or another mechanism
+       to be notified of packages that matter)
+ * Do we want to include arch (eg. `:amd64`) in there? Yes - multiarch means we can have cross-arch deps (but not yet - britney needs work)
+
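For later tooling, a field like the one above has to be turned back into a list of exact package versions. Here is a minimal parsing sketch; the function name `parse_built_environment` and the regular expression are assumptions, and the optional `:arch` qualifier is handled only because it is discussed above.

```python
import re

# "name (= version)" with an optional ":arch" qualifier after the name.
ENTRY = re.compile(
    r"^(?P<name>[a-z0-9][a-z0-9+.\-]*)"
    r"(?::(?P<arch>[a-z0-9\-]+))?"
    r" \(= (?P<version>[^\s)]+)\)$"
)


def parse_built_environment(value):
    """Parse a Built-Environment value into (name, version, arch) tuples."""
    entries = []
    for raw in value.replace("\n", " ").split(","):
        item = raw.strip()
        if not item:
            continue
        match = ENTRY.match(item)
        if match is None:
            raise ValueError("cannot parse entry: %r" % item)
        entries.append(
            (match.group("name"), match.group("version"), match.group("arch"))
        )
    return entries


if __name__ == "__main__":
    sample = " make (= 3.81-8.2), gzip (= 1.6-1),\n g++ (= 4:4.8.1-2)"
    for name, version, arch in parse_built_environment(sample):
        print(name, version, arch)
```
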
+Then, the good news: snapshot.debian.org keeps binary packages! but not .changes
+
+make (= 3.81-8.2)
+  => http://snapshot.debian.org/package/make-dfsg/3.81-8.2/#make_3.81-8.2
+
+Is there an easy way to script installing a specific set of
+binary packages from snapshot? Yes - use a specific date in your sources.list:
+
+deb     http://snapshot.debian.org/archive/debian/20091004T111800Z/ lenny main
+deb-src http://snapshot.debian.org/archive/debian/20091004T111800Z/ lenny main
+deb     http://snapshot.debian.org/archive/debian-security/20091004T121501Z/ lenny/updates main
+deb-src http://snapshot.debian.org/archive/debian-security/20091004T121501Z/ lenny/updates main
+
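Generating such sources.list lines for an arbitrary point in time is easy to script. The sketch below only reproduces the URL layout of the example above; `snapshot_sources` is a hypothetical helper, and it does not check that snapshot.debian.org actually has an archive run at the given timestamp.

```python
from datetime import datetime, timezone

ARCHIVE_URL = "http://snapshot.debian.org/archive/{archive}/{stamp}/"


def snapshot_sources(when, suite, archive="debian", component="main"):
    """Return deb and deb-src lines for a timezone-aware datetime."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    url = ARCHIVE_URL.format(archive=archive, stamp=stamp)
    return [
        "deb     %s %s %s" % (url, suite, component),
        "deb-src %s %s %s" % (url, suite, component),
    ]


if __name__ == "__main__":
    when = datetime(2009, 10, 4, 11, 18, 0, tzinfo=timezone.utc)
    print("\n".join(snapshot_sources(when, "lenny")))
```
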
+What's next?
+------------
+
+ * Do we have a “Champion”?… looks like not. :(
+ * Fill up a page on the wiki
+ * Who wants to have their package build reproducible?
+   - Asheesh: alpine
+   - Lunar: haveged
+   - pabs: iotop (python based)
+   - joeyh: debhelper :D
+   - lindi: magit
+ * [Asheesh] Convince dpkg to fix the 'not calling gzip -n' issue.
+ * Another change needed in dpkg: tar --numeric-owner --owner=0
+ * [Asheesh, Helmut] Attempt to code a downstream version of dedup.debian.net
+   that lets us detect when files change between uploads of a package,
+   and then run it on the archive.
+ * Automated archive-wide testing of this issue and export to the PTS
+ * [rbalint, lindi] libfaketime updates?
+   advancing time in faketime with each time() call: https://github.com/wolfcw/libfaketime/pull/20
+   [rbalint] replaying timestamp needs bigger changes in faketime, I'm working on those
+ * [fil] talk to Ganneff about keeping .changes - hash chain from the Release files needed
+ * Script to transform the "Built-Environment" list to
+   links to files in the snapshot archives.
+ * pbuilder-like script that installs all the packages in a
+   chroot and rebuilds the package there.
+ * How about a sprint‽ Yes!
+   Together with Multi-Arch friends? Sponsorship from ARM?
+
+Other ideas:
+
+ * Research other distros (NixOS?)
+ * Research
+   https://build.opensuse.org/package/show/openSUSE:Factory/build-compare
+ * Deterministic virtual machines
+   "ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay" http://www.eecs.umich.edu/virtual/papers/dunlap02.pdf (HTTP 403 currently :-()
+   "Debugging operating systems with time-traveling virtual machines" http://www.eecs.umich.edu/virtual/papers/king05_1.pdf (HTTP 403 currently :-()
+   "A Particular Bug Trap: Execution Replay Using Virtual Machines" http://arxiv.org/pdf/cs.DC/0310030
+   "ReTrace: Collecting Execution Trace with Virtual Machine Deterministic Replay"
+   "Execution Replay for Multiprocessor Virtual Machines" http://www.eecs.umich.edu/~pmchen/papers/dunlap08.slides.ppt
+
+
+
+More post-BoF experiments
+-------------------------
+
+diff --git a/debian/control b/debian/control
+index 1ef9ccd..50b5221 100644
+--- a/debian/control
++++ b/debian/control
+@@ -7,6 +7,7 @@ Standards-Version: 3.9.4
+ Homepage: http://www.issihosts.com/haveged/
+ Vcs-Git: git://git.debian.org/git/collab-maint/haveged.git
+ Vcs-Browser: http://git.debian.org/?p=collab-maint/haveged.git
++XC-Build-Environment: ${misc:Build-Environment}
+ 
+ Package: haveged
+ Architecture: linux-any
+diff --git a/debian/rules b/debian/rules
+index 04d6fcc..cb2cdf3 100755
+--- a/debian/rules
++++ b/debian/rules
+@@ -15,3 +15,10 @@ override_dh_auto_configure:
+ 
+ override_dh_strip:
+        dh_strip --dbg-package=libhavege1-dbg
++
++override_dh_gencontrol:
++       COLUMNS=999 dpkg -l | awk ' \
++                       BEGIN { printf "misc:Build-Environment=" } \
++                       /^ii/ { ORS=", "; print $$2 " (= " $$3 ")" }' | \
++               sed -e 's/, $$//' >> debian/substvars
++       dh_gencontrol
+
+
+This does not work as `dpkg-genchanges` does not substitute
+the variable before adding the field in debian/changes! :(
+  — Lunar
+
+But it is a trivial patch against dpkg:
+
+diff --git a/scripts/dpkg-genchanges.pl b/scripts/dpkg-genchanges.pl
+index 0b004c7..13cedd6 100755
+--- a/scripts/dpkg-genchanges.pl
++++ b/scripts/dpkg-genchanges.pl
+@@ -516,4 +516,5 @@ for my $f (keys %remove) {
+     delete $fields->{$f};
+ }
+ 
+-$fields->output(\*STDOUT); # Note: no substitution of variables
++$fields->apply_substvars($substvars);
++$fields->output(\*STDOUT);
+
+
+
+--------------------------------------------------------
+
+-----------------------------------------------------------
+


=====================================
_docs/history.md
=====================================
@@ -44,7 +44,7 @@ about thirty attendees who were very much interested, amongst them
 members of the [technical
 committee](https://www.debian.org/devel/tech-ctte) and a few
 other core teams.
-[Minutes](attachment:dc13-bof-reproducible-builds.txt) are
+[Minutes](../history-dc13-minutes.txt) are
 available.
 
 After some more research during the conference, a [wiki



View it on GitLab: https://salsa.debian.org/reproducible-builds/reproducible-website/-/commit/0dd0ada424c3207f66dcb8e010337d1b62a6d3d7
