rebuilding Maven Central Repository artifacts: welcome reproducible-central

Hervé Boutemy hboutemy at apache.org
Fri Apr 3 06:03:14 UTC 2020


On Sunday, 29 March 2020 13:41:25 CEST, you wrote:
> >I'll need community here to define what can be done next and how (and I
> >mean
> >done, not dream: I have a lot of dreams that seem inaccessible...)
> 
> Maybe not too difficult: track reproducibility across dependencies. As a
> developer it's nice if my package is reproducible, but if my dependencies
> are not, it's a bit of a wasted effort.

Sadly, this is exactly the ultimate dream :)
Note that it is a good dream; it is my own dream too.
But I'll explain the issues I run into when trying to do this; perhaps we'll
find solutions.

Getting the dependency list is not hard: "mvn dependency:list"
The big question is: where is the database that tells whether a binary 
artifact is reproducible?
Whom should one trust for such a database? Based on what proof?
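
To make the gap concrete, here is a minimal sketch in Python of the easy half
(reading the "mvn dependency:list" output) next to the hard half (the lookup).
It assumes the usual "[INFO]    group:artifact:type:version:scope" line layout.
The known_reproducible set is purely hypothetical: it stands in for the trusted
database that does not exist yet.

```python
import re

# A dependency line looks like:
#   [INFO]    org.apache.commons:commons-lang3:jar:3.9:compile
# or, with a classifier:
#   [INFO]    group:artifact:type:classifier:version:scope
DEP_LINE = re.compile(r"^\[INFO\]\s+([\w.\-]+(?::[\w.\-]+){4,5})(?:\s.*)?$")


def parse_dependency_list(output):
    """Return 'group:artifact:version' for each dependency line found."""
    coords = []
    for line in output.splitlines():
        match = DEP_LINE.match(line)
        if not match:
            continue  # banner, summary, or other non-dependency line
        parts = match.group(1).split(":")
        if len(parts) == 5:
            group, artifact, _type, version, _scope = parts
        else:
            group, artifact, _type, _classifier, version, _scope = parts
        coords.append(f"{group}:{artifact}:{version}")
    return coords


def split_by_status(coords, known_reproducible):
    """Split coordinates into (known reproducible, unknown).

    known_reproducible is a HYPOTHETICAL set of coordinates; who maintains
    it, and based on what proof, is exactly the open question.
    """
    ok = [c for c in coords if c in known_reproducible]
    unknown = [c for c in coords if c not in known_reproducible]
    return ok, unknown
```

With an empty known_reproducible set, every dependency lands in the "unknown"
list, which is roughly where we are today.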

reproducible-central is a very first step towards such a database, but it has 
many known limitations:
- it won't scale to the target of millions of binary artifacts
- it does not really provide an easy lookup to know whether an artifact is 
reproducible
- should you trust me and just me = the guy who published the results?
- should everybody use only the exact recipe I published, with my personally 
chosen Docker images? Shouldn't other people test with different environments, 
to prove that my Docker image choice was not compromised?

And a final aspect of this dream: currently, there are so few reproducible JVM 
artifacts on Central Repository that I can already code the algorithm that 
will be right for 99.99% of checks: if you have any dependencies, they are not 
reproducible :)


That's why I see this Markdown table of reproducibility results, with its 
automation scripts, published to the reproducible-central Git repository, as a 
first step to make visible the effective reproducibility status of JVM 
artifacts published in the wild.
For now, few JVM projects are working on improving their builds to be 
reproducible by default: I'm trying to provide them with PRs when they are 
using Maven. But I need help:
- other people to provide PRs: perhaps we could track this on 
reproducible-central to give us group confidence?
- PRs for builds with other build tools: I'm working with my own build tool. I 
say that my work is not Maven-centric, but without help from experts in other 
build tools, my work is de facto Maven-centric :(

During the last week, thinking about what I could do next, I know that I will 
write some scripts to check every Apache Maven build and detect whether the 
current master head is reproducible or not: this will improve the chances that 
the next official release will be reproducible.
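
These scripts are not written yet, so the following is only a sketch of one
way such a check could work, not the actual implementation: build the same
commit twice into two separate output directories, then compare the SHA-256
of each produced jar. The function names, the two-directory layout and the
"*.jar" pattern are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums(build_dir, pattern="*.jar"):
    """Map each matching file (path relative to build_dir) to its digest."""
    root = Path(build_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob(pattern)
    }


def compare_builds(first_dir, second_dir, pattern="*.jar"):
    """Return {relative path: True if both builds produced identical bytes}.

    A file present in only one of the two builds is reported as False.
    """
    first = checksums(first_dir, pattern)
    second = checksums(second_dir, pattern)
    report = {}
    for name in sorted(set(first) | set(second)):
        report[name] = name in first and first.get(name) == second.get(name)
    return report
```

A build would then count as reproducible only if every value in the report is
True; any False entry points at an artifact worth inspecting more closely.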

I need help from diverse people to really expand effective JVM 
reproducibility for the real artifacts that everybody downloads.

Regards,

Hervé




More information about the rb-general mailing list