rebuilding Maven Central Repository artifacts: welcome reproducible-central

Julien Lepiller julien at lepiller.eu
Fri Apr 3 11:06:36 UTC 2020


On 3 April 2020 02:03:14 GMT-04:00, "Hervé Boutemy" <hboutemy at apache.org> wrote:
>On Sunday 29 March 2020 13:41:25 CEST, you wrote:
>> >I'll need the community here to define what can be done next and how (and I
>> >mean done, not dream: I have a lot of dreams that seem inaccessible...)
>> 
>> Maybe not too difficult: track reproducibility across dependencies. As a
>> developer it's nice if my package is reproducible, but if my dependencies
>> are not, it's a bit of a wasted effort.
>
>Sadly, this is exactly the ultimate dream :)
>Note that it is a good dream, and it is my own dream too.
>But I'll explain the issues I run into when trying to do this; perhaps we'll
>find solutions.
>
>Getting the dependency list is not hard: "mvn dependency:list"
>The big question is: where is the database that tells whether a binary
>artifact is reproducible?
>Who should one trust for such a database? Based on what proof?

I see, I didn't realise :)
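
To make the gap concrete, here is a minimal Python sketch (an illustration only, not something taken from reproducible-central). It parses the text printed by "mvn dependency:list" and looks each dependency up in a table of known results. The known_results dict and the org.example coordinates are hypothetical stand-ins for the database that does not exist yet; anything missing from it comes back as None, meaning "unknown".

```python
import re

# Matches lines such as
# "[INFO]    org.apache.commons:commons-lang3:jar:3.9:compile"
# i.e. 5 colon-separated fields, or 6 when a classifier is present.
_DEP_LINE = re.compile(r"^\[INFO\]\s+([\w.\-]+(?::[\w.\-]+){4,5})\b")


def parse_dependency_list(output):
    """Extract (groupId, artifactId, version) tuples from the plugin output."""
    deps = []
    for line in output.splitlines():
        match = _DEP_LINE.match(line)
        if not match:
            continue  # banner, header, or other log line
        parts = match.group(1).split(":")
        if len(parts) == 5:
            group, artifact, _type, version, _scope = parts
        else:
            # 6 fields: a classifier sits between type and version
            group, artifact, _type, _classifier, version, _scope = parts
        deps.append((group, artifact, version))
    return deps


def reproducibility_report(deps, known_results):
    """Map each dependency to True/False from known_results, or None if unknown.

    known_results is a hypothetical {(groupId, artifactId, version): bool}
    table; who maintains it, and on what proof, is the open question.
    """
    return {dep: known_results.get(dep) for dep in deps}


# Sample output in the plugin's usual layout (coordinates are illustrative).
SAMPLE_OUTPUT = """\
[INFO] The following files have been resolved:
[INFO]    org.apache.commons:commons-lang3:jar:3.9:compile
[INFO]    org.example:native-lib:jar:linux-x86_64:1.2.0:runtime
[INFO]    junit:junit:jar:4.13:test
"""
```

Calling reproducibility_report(parse_dependency_list(SAMPLE_OUTPUT), {}) with an empty table marks every dependency as unknown, which is the current situation for almost all of Central.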

>
>reproducible-central is a very first step towards such a database, but it has
>many known limitations:
>- it won't scale to the target millions of binary artifacts
>- it does not really provide an easy lookup to know whether an artifact is
>reproducible
>- should you trust me and just me, the guy who published the results?
>- should everybody use only the exact recipe I published, with my personally
>chosen Docker images? Shouldn't other people test with different
>environments, to prove that my Docker image choice was not compromised?
>
>And a final aspect of this dream: currently, there are so few reproducible JVM
>artifacts on Central Repository that I can code the algorithm that will get
>99.99% of checks right: if you have any dependencies, they are not
>reproducible :)
>
>
>That's why I consider this markdown table of reproducibility results, with its
>automation scripts, published to the reproducible-central Git repository, a
>first step towards a visible, effective reproducibility status for JVM
>artifacts published in the wild.
>For now, few JVM projects are working on improving their builds to provide
>reproducible builds by default: I'm trying to provide them PRs when they are
>using Maven. But I would need help:
>- other people to provide PRs: perhaps we could track them on
>reproducible-central to give us group confidence?
>- PRs for builds with other build tools: I'm working with my own build tool,
>and I say that my work is not Maven centric, but with no help from people who
>are experts in other build tools, my work is de facto Maven centric :(
>
>During the last week, thinking about what I could do next, I know that I will
>write some scripts to check every Apache Maven build and detect whether the
>current master head is reproducible or not: this will improve the chances that
>the next official release will be reproducible

That's pretty cool :)

>
>I need help from diverse people to really expand the JVM effective 
>reproducibility for the real artifacts that everybody downloads

To be honest I'm already busy with the situation in Guix, which requires that Maven artifacts are reproducible *and* bootstrapped (the real hard part for us). I managed to bootstrap the Maven build system in a personal repository and will prepare patches shortly (it involved a partial Java parser and manipulating different XML files). I know you don't count plugins as dependencies, and plugins are the hard bootstrapping problems I just solved, so my work will sound silly to you ^^.

Of course your work is useful to me: if you can ensure artifacts are reproducible on your side, then I have more confidence that Guix artifacts will be reproducible from within the Guix framework.

I don't know if it's a goal for you, but something that could help Guix is a relation groupid/artifactid -> source. Artifactids refer to binaries, which I think makes your work harder, since you have to find the sources before you can try to reproduce. I know Maven is not built to support that, so I'm probably just dreaming :)
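
Such a relation could look like the following Python sketch. Everything in it is an assumption made for illustration: the SourceLocation type, the find_source helper, and the example.org entry are hypothetical, and no such index is provided by Maven today.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SourceLocation:
    """Where the sources for one released version live (hypothetical type)."""
    repository: str  # URL of the source repository
    revision: str    # tag or commit the release was built from


# (groupId, artifactId) -> {version -> SourceLocation}
SourceIndex = Dict[Tuple[str, str], Dict[str, SourceLocation]]


def find_source(index: SourceIndex, group_id: str, artifact_id: str,
                version: str) -> Optional[SourceLocation]:
    """Return the recorded source for a coordinate, or None if unknown."""
    return index.get((group_id, artifact_id), {}).get(version)


# Illustrative entry only; the coordinates and URL are made up.
EXAMPLE_INDEX: SourceIndex = {
    ("org.example", "demo-lib"): {
        "1.0.0": SourceLocation("https://example.org/demo-lib.git", "v1.0.0"),
    },
}
```

Keying on groupid/artifactid and then on version keeps every release of one artifact together, so a packager can see at a glance which versions have a known source and which do not.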

>Regards,
>
>Hervé

Thanks for your work on Maven!

