[rb-general] Reproducible Java builds with Maven
Eric Myhre
hash at exultant.us
Mon Nov 26 16:21:33 CET 2018
On 26.11.2018 03:00, Bernhard M. Wiedemann wrote:
> Hi Hervé,
>
> thanks for raising this topic.
>
> On 26/11/2018 09.08, Hervé Boutemy wrote:
>> Anybody interested in working together?
> With openSUSE we are doing all builds offline to ensure that we can
> repeat builds later (without worry about offline or hacked servers), but
> for maven this often meant we had to download 300 MB of someone else's
> binaries to use in the build.
I love all the reproducibility issues of jars enumerated in this wiki page.
However... another +1 to this issue raised by Bernhard and Julien. One
of the biggest practical hurdles in working with Maven comes before any
of that: there's no clear separation of "download time" vs "resolve
time" vs "build time".
Maven seems to intermix downloads and execution operations fairly freely
(e.g. plugin download, now plugin eval, now dep download -- download and
execution are interleaved). This makes it very, very difficult to
ensure all the needed dependencies can be identified and downloaded (and
saved locally) in advance.
Some distributions and build environments prefer to completely disable
the network during builds, to be certain that no uncaptured information
sources or dependencies are downloaded at build time. That is how we
rigorously satisfy our core definition of reproducible: "given the same
source code, [and] build environment".
I'd love to work on making Maven as compatible with this goal as possible.
Even some features for more explicit/pre-build-phase dependency
enumeration would be a big help in this area. I chatted with some other
Maven-enthusiast folk at our last summit, and while we found ways to
instruct Maven to yield a list of resolved dependencies, this still
didn't cover a lot of critical ground: the output was human-readable,
but not easily machine-parsable; and if I recall correctly it
covered dependencies but not plugins, making it somewhat incomplete. An
API for these operations would be incredibly useful. (And then ideally,
perhaps we'd like a way to take our resolved list of dependencies and
automatically write out a new pom file with either those fixed versions
or a fixed reference to everything needed to perform an identical
resolution process offline in the future; but that's a next step.
Sounds like Guix has a tool for that; it'd be nice if such a tool was in
mainline Maven itself.)
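
To make the "human-readable but not machine-parsable" complaint concrete,
here is a minimal sketch of the kind of scraping one ends up writing. It
assumes a resolved-dependency listing (for instance from the
maven-dependency-plugin's dependency:list goal) whose lines look like
"[INFO]    group:artifact:type[:classifier]:version:scope". That line
shape is an assumption on my part, not a documented contract, and the
function name and sample coordinates are purely illustrative. Note that it
says nothing about plugins, which is exactly the gap described above.

```python
import re

# Assumed line shape (not a documented contract):
#   [INFO]    group:artifact:type[:classifier]:version:scope
# Five or six colon-separated fields, nothing else on the line before them.
COORD = re.compile(r"^\[INFO\]\s+([\w.\-]+(?::[\w.\-]+){4,5})\b")


def parse_dependency_lines(text):
    """Extract dependency coordinates from a log-style listing.

    Returns a sorted list of dicts; lines that do not look like
    coordinates (banners, status lines) are silently skipped.
    """
    deps = []
    for line in text.splitlines():
        match = COORD.match(line)
        if not match:
            continue
        parts = match.group(1).split(":")
        if len(parts) == 5:
            group, artifact, packaging, version, scope = parts
            classifier = None
        else:
            group, artifact, packaging, classifier, version, scope = parts
        deps.append({
            "group": group,
            "artifact": artifact,
            "type": packaging,
            "classifier": classifier,
            "version": version,
            "scope": scope,
        })
    # Sort so the result does not depend on the order of the log lines.
    return sorted(
        deps,
        key=lambda d: (d["group"], d["artifact"], d["classifier"] or "", d["version"]),
    )


# Made-up sample input, for illustration only.
SAMPLE = """\
[INFO] --- maven-dependency-plugin:3.1.1:list (default-cli) @ demo ---
[INFO] The following files have been resolved:
[INFO]    org.slf4j:slf4j-api:jar:1.7.25:compile
[INFO]    junit:junit:jar:4.12:test
[INFO]    io.netty:netty-transport-native-epoll:jar:linux-x86_64:4.1.30.Final:compile
[INFO] BUILD SUCCESS
"""

if __name__ == "__main__":
    for dep in parse_dependency_lines(SAMPLE):
        print(dep)
```

A regex over log output like this is fragile by construction, which is
the argument for a real API rather than an argument for this script.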
Of course if I'm misspeaking and there are more features for dependency
enumeration and separating download/resolve/build phases -- I love being
wrong :) -- then this whole email can instead be: I'd love to round up
some documentation about these features and add it to these wiki pages
about reproducibility :)
---
https://github.com/signalapp/gradle-witness might be interesting in
relation to this topic. It is a Gradle plugin to add hash checks to
downloads.
It ran into a few issues that seem likely to arise again:
- It's very opt-in; you can't apply it to a project without modifying
the pom^H^H^H build.gradle file, and this limits its usefulness to folk
from the distro perspective
- As the readme mentions, it has something of a bootstrapping problem
(it can't fetch *itself* by hash...)
- IIUC, it doesn't work for Maven/Gradle plugins, only for the project
dependencies... which means it doesn't cover the whole build
environment.
- It only applies the checks to dependencies listed in the
configuration; if transitive resolution somehow adds a new
dependency, it goes unchecked (and this does come up: for example,
if building on a different architecture, the dependency resolution
may yield different results *even when* all versions are pinned),
and so again, it's not complete coverage.
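
To illustrate what "complete coverage" would mean for the points above,
here is a minimal sketch that audits everything found in a local artifact
directory against a pinned hash list, rather than only checking the
artifacts the configuration happens to name. This is not how
gradle-witness works; the function names, the directory layout, and the
"relative path -> sha256" pin format are all assumptions made up for the
example.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(repo_dir, pinned):
    """Compare every file under repo_dir with a pinned hash list.

    pinned maps a POSIX-style relative path to an expected sha256 hex
    digest (a hypothetical format chosen for this sketch).

    Returns three sorted lists:
      mismatched: file present, pinned, but the hash differs
      unlisted:   file present but not pinned (e.g. pulled in transitively)
      missing:    pinned but not present on disk
    """
    root = Path(repo_dir)
    seen = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            seen[path.relative_to(root).as_posix()] = sha256_of(path)
    return {
        "mismatched": sorted(k for k in seen if k in pinned and seen[k] != pinned[k]),
        "unlisted": sorted(k for k in seen if k not in pinned),
        "missing": sorted(k for k in pinned if k not in seen),
    }


def is_clean(report):
    """True only when nothing is mismatched, unlisted, or missing."""
    return not any(report.values())
```

The "unlisted" bucket is the important one here: an artifact that showed
up without being pinned is treated as a failure instead of going
unchecked.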
In general, the lesson here seems to be that when trying to get a
complete view of the sources and build environment, tools built into the
core can really shine a lot brighter; when trying to do it in
plugins, then things like (ironically) plugins seem to end up very
difficult to handle.
---
Cheers! Very excited for the gathering of effort.