[rb-general] Reproducible Java builds with Maven

Hervé Boutemy hboutemy at apache.org
Tue Nov 27 00:12:13 CET 2018


In a multi-module context, every module is fully built before the next one
starts: one consequence is that downloads and execution are interleaved, which
is not an issue in general.

The typical way to do a global download before building offline is to use
dependency:go-offline:
https://maven.apache.org/plugins/maven-dependency-plugin/go-offline-mojo.html
Did you try it? Is there an issue for your use case?
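
A minimal sketch of that two-step flow (prefetch, then build with the network
disabled), assuming `mvn` is on the PATH. The function names and the default
`package` goal are illustrative only, and the sketch does not check whether
go-offline really fetched everything a given build needs:

```python
import subprocess


def offline_build_commands(goal="package"):
    """Return the two Maven invocations: prefetch first, then build offline."""
    return [
        # Step 1: resolve and download into the local repository.
        ["mvn", "dependency:go-offline"],
        # Step 2: run the build in offline mode (same as "mvn -o <goal>").
        ["mvn", "--offline", goal],
    ]


def run_offline_build(project_dir, goal="package"):
    """Run both steps in project_dir, stopping at the first failure."""
    for command in offline_build_commands(goal):
        subprocess.run(command, cwd=project_dir, check=True)
```

Calling run_offline_build(".") from the root of a multi-module project runs
both steps; any download attempted during the second step makes Maven fail
instead of silently reaching the network.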

On download vs resolve, I suppose that you want to choose your own
dependency versions.
In a well-designed pom.xml for multi-module builds, version choices should be
made in the root pom.xml's dependencyManagement+pluginManagement (and no
version should be forced in individual dependency or plugin declarations).
And you can define your own profile to set these versions without changing
the default (no-profile) versions.
But all this is quite theoretical, since there are many projects that don't 
use dependencyManagement+pluginManagement consistently...
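
To see how far a given project is from that convention, here is a rough
sketch that lists dependency and plugin declarations carrying their own
<version> outside dependencyManagement/pluginManagement. It assumes the POM
uses the standard POM 4.0.0 namespace; the function name is hypothetical, and
declarations inside profiles are not inspected:

```python
import xml.etree.ElementTree as ET

# Namespace used by standard Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def explicit_versions(pom_xml):
    """List (kind, "groupId:artifactId", version) for declarations that set
    a version directly instead of inheriting a managed one."""
    root = ET.fromstring(pom_xml)
    found = []
    # Paths are relative to <project>, so the managed sections
    # (dependencyManagement, pluginManagement) are not matched.
    for kind, path in (
        ("dependency", "m:dependencies/m:dependency"),
        ("plugin", "m:build/m:plugins/m:plugin"),
    ):
        for node in root.findall(path, POM_NS):
            version = node.findtext("m:version", namespaces=POM_NS)
            if version is None:
                continue  # version comes from a managed section
            group = node.findtext("m:groupId", default="", namespaces=POM_NS)
            artifact = node.findtext("m:artifactId", default="", namespaces=POM_NS)
            found.append((kind, f"{group}:{artifact}", version.strip()))
    return found
```

Running it on the contents of each module's pom.xml gives a quick list of the
places where a profile in the root POM would not be able to take over the
version.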


Last topic: checking hashes.
Did you try "mvn --strict-checksums"?
I must confess I don't really use this...
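
For checking an already-downloaded artifact by hand, a small sketch that
compares a file with the ".sha1" file stored next to it. The sidecar layout
(same name plus ".sha1", hex digest as the first token) is an assumption about
how the repository stores checksums, and the function name is illustrative:

```python
import hashlib
from pathlib import Path


def sha1_matches(artifact):
    """Return True if the artifact's SHA-1 equals the digest in its
    '<name>.sha1' sidecar file."""
    artifact = Path(artifact)
    sidecar = artifact.with_name(artifact.name + ".sha1")
    # Some sidecars hold "digest  filename"; keep only the first token.
    expected = sidecar.read_text().split()[0].lower()
    digest = hashlib.sha1()
    with artifact.open("rb") as handle:
        # Read in chunks so large jars do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

This only compares against a checksum obtained from the same place as the
artifact, so it catches corrupted downloads rather than a tampered repository;
pinning hashes in the project itself is a separate question.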

Regards,

Hervé

On Monday 26 November 2018, 16:21:33 CET Eric Myhre wrote:
> On 26.11.2018 03:00, Bernhard M. Wiedemann wrote:
> > Hi Hervé,
> > 
> > thanks for raising this topic.
> > 
> > On 26/11/2018 09.08, Hervé Boutemy wrote:
> >> Anybody interested in working together?
> > 
> > With openSUSE we are doing all builds offline to ensure that we can
> > repeat builds later (without worry about offline or hacked servers), but
> > for maven this often meant we had to download 300 MB of someone else's
> > binaries to use in the build.
> 
> I love all the reproducibility issues of jars enumerated in this wiki page.
> 
> However... another +1 to this issue raised by Bernhard and Julien. One
> of the biggest practical hurdles in working with Maven comes before any
> of that: there's no clear separation of "download time" vs "resolve
> time" vs "build time".
> 
> Maven seems to intermix downloads and execution operations fairly freely
> (e.g. plugin download, now plugin eval, now dep download -- download and
> execution are interleaved).  This makes it very, very difficult to
> ensure all the needed dependencies can be identified and downloaded (and
> saved locally) in advance.
> 
> Some distributions and build environments prefer to completely disable
> the network during builds in order to make certain that there aren't
> uncaptured information sources or dependencies being downloaded at build
> time -- in order to make rigorously sure we satisfy our core definition
> of reproducible: "given the same source code, [and] build environment". 
> I'd love to work on making Maven as compatible with this goal as possible.
> 
> Even some features for more explicit/pre-build-phase dependency
> enumeration would be a big help in this area. I chatted with some other
> Maven-enthusiastic folk at our last summit, and while we found ways to
> instruct Maven to yield a list of resolved dependencies, this still
> didn't cover a lot of critical ground: the output was human-readable,
> but not very easily machine-parsable; and if I recall correctly it
> covered dependencies but not plugins, making it somewhat incomplete.  An
> API for these operations would be incredibly useful.  (And then ideally,
> perhaps we'd like a way to take our resolved list of dependencies and
> automatically write out a new pom file with either those fixed versions
> or a fixed reference to everything needed to perform an identical
> resolution process offline in the future; but that's a next step. 
> Sounds like Guix has a tool for that; it'd be nice if such a tool was in
> mainline Maven itself.)
> 
> Of course if I'm misspeaking and there are more features for dependency
> enumeration and separating download/resolve/build phases -- I love being
> wrong :) -- then this whole email can instead be: I'd love to round up
> some documentation about these features and add it to these wiki pages
> about reproducibility :)
> 
> ---
> 
> https://github.com/signalapp/gradle-witness might be interesting in
> relation to this topic.  It is a Gradle plugin to add hash checks to
> downloads.
> 
> It ran into a few issues that seem likely to arise again:
> 
> - It's very opt-in; you can't apply it to a project without modifying
> the pom^H^H^H build.gradle file, and this limits its usefulness to folk
> from the distro perspective
> 
> - As the readme mentions, it has something of a bootstrapping problem
> (it can't fetch *itself* by hash...)
> 
> - IIUC, it doesn't work for Maven/Gradle plugins, only for the project
> dependencies... which means it's not a complete coverage of the build
> environment.
> 
>     - It only applies the checks to dependencies listed in the
>     configuration; if transitive resolution somehow adds a new
>     dependency, it goes unchecked (and this does come up: for example,
>     if building on a different architecture, the dependency resolution
>     may yield different results *even when* all versions are pinned),
>     and so again, it's not complete coverage.
> 
> 
> In general, the lesson here seems to be that when trying to get a
> complete view of the sources and build environment, tools built into the
> core can really shine a lot brighter; when trying to do it in
> plugins, then things like (ironically) plugins seem to end up very
> difficult to handle.
> 
> ---
> 
> Cheers!  Very excited for the gathering of effort.





