[rb-general] [jvm] How to share rebuilder attestations

Hervé Boutemy hboutemy at apache.org
Wed Jan 9 08:52:19 CET 2019


On Monday, 7 January 2019, 18:24:58 CET, Eli Schwartz wrote:
> On 1/7/19 3:26 AM, Hervé Boutemy wrote:
> > this scenario is highly hypothetical if the buildinfo is thought as a
> > recording of current build environment, then generated
> > 
> > I understand that you expect the vast majority to be 1 buildinfo per
> > rebuilder, which seems reasonable from a rebuilder perspective: it's the
> > attestation sharing solution that will have to deal with the risk of
> > many rebuilders each needing to publish many buildinfo files for 1
> > artifact (let's be optimistic and say that there will be thousands of
> > rebuilders: not for now, but that would be a sign of success)
> 
> In Arch Linux, we've settled on embedding the buildinfo into the binary
> artifact (packaged tarball), which solves several issues for us:
> 
> - we only focus on attesting that a build can be repeated, not on how it
>   can be smoketested with variations
> - ensures the buildinfo is always available, no exceptions
> - append your PGP signature for the package and sign both at once
> - no proliferation of buildinfo
> 
> It also integrates nicely with existing packaging infrastructure we
> have, and there are plans in the works to leverage this to guarantee
> that all packages are built in chroots containing only officially
> published packages.
I see the advantages of this scenario, but I also see one key drawback: the
buildinfo itself has to be reproducible. For JVM artifacts in public
repositories like Maven Central, this could be really problematic, since
every publisher has their own build platform, with their own JDK patch level
and their own OS (usually one of Windows/Linux/Mac, to keep the list short,
but I'm sure it's even more diverse).
I fear that you can do that only because of the strict environment control
that a Linux distro has: this cannot be the same with the public JVM repos.

Can you give me a pointer to an Arch Linux JVM artifact (preferably built
with Maven...) that I could try to reproduce myself, please?

> 
> > To be precise, I'm trying to figure out how the workflow will happen for
> > binary artifacts published in JVM public repositories (like Maven Central
> > or Android Maven Repository): that may influence some ideas.
> > That said, the case of Linux distros rebuilding JVM artifacts is really a
> > mix that I don't yet manage to picture: if someone can show me how JVM
> > artifacts are rebuilt for Debian, for example, I'm interested...
> 
> In Arch Linux we have a mixture of packages that just rebundle an
> existing binary, and packages that just run gradle/mvn in the source
> directory. As far as handling dependencies goes, I'm not sure we do much
> if any of that at all -- I'm not even sure how that would work for java.
> 
> So I'd be interested in more perspectives on this as well -- especially
> if they have a reliable answer to the question of offline builds.
Yes, I don't know how Linux distros manage JVM dependencies when rebuilding:
do you use Maven Central, or only distro-specific rebuilds that are available
somewhere (where?)? Is the strategy the same for every Linux distro?

> 
> As I understand it, the usual build workflow involves looking up
> dependent artifacts by querying an actual server instance and then
> downloading them.
Yes, downloading dependencies from an external server is the default
behaviour, but anyone who wants to use their own artifact repository instead
can override it with parameters (at least with Maven, but I suppose every
build tool can).
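
For Maven, one way to do this override is a settings.xml with a catch-all
mirror, passed to mvn with --settings. Here is a minimal sketch in Python that
generates such a file. The mirror id and the repository URL are placeholders I
made up for illustration; only the settings/mirrors/mirror/mirrorOf elements
and the --settings option come from Maven itself.

```python
import xml.etree.ElementTree as ET


def mirror_settings(repo_url, mirror_id="rebuilder-repo"):
    """Return a Maven settings.xml document (as a string) whose single
    catch-all mirror redirects every repository lookup to repo_url.

    mirror_id is a hypothetical name, chosen only for this example.
    """
    settings = ET.Element("settings")
    mirrors = ET.SubElement(settings, "mirrors")
    mirror = ET.SubElement(mirrors, "mirror")
    ET.SubElement(mirror, "id").text = mirror_id
    ET.SubElement(mirror, "url").text = repo_url
    # "*" tells Maven to use this mirror for all repositories
    ET.SubElement(mirror, "mirrorOf").text = "*"
    return ET.tostring(settings, encoding="unicode")


def maven_command(settings_path, goals=("clean", "verify")):
    """Build the mvn command line that uses the given settings file."""
    return ["mvn", "--settings", settings_path, *goals]
```

A rebuilder would write the result of mirror_settings(...) to a file and run
the command returned by maven_command(...) in the source directory.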

> 
> >> - Is a rebuild expected to reproduce the buildinfo file verbatim?
> > 
> > clearly not, since the buildinfo intent is to record the environment used
> > with sometimes too precise data (like for example the patch level of the
> > JVM used, which in general is not really important AFAIK, or the OS that
> > in general should not be important in the JVM case)
> 
> If the information is important enough to record, it might be important
> enough to make a difference in the outcome. If varying the JVM version
> is still desirable, I'd see this as mostly useful for the testing
> process, i.e. checking reproducibility issues and using diffoscope to
> see what changed.
Let's dig into the JVM requirement.
From experience, bytecode produced by different major JVM versions is really
different (tested with JDK 7, 8, 9, 10 and 11), but bytecode produced by
different patch levels of the same major version is not.
Since what we record easily is the full JDK version (major version + patch
level), we mix a strong requirement (major version) with something that is
not that important (patch level) and on which we would like to accept
variation. I already have 5 JDKs on my computer for 5 major versions: if I
needed a strict patch level, I would end up with hundreds, since once again I
want to rebuild every artifact from Maven Central, each of which has been
built by anybody in their own personal environment.
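
To make the major version / patch level split concrete, here is a minimal
sketch of how a rebuilder could check only the major version of a recorded
JDK version string. The function names are mine, and I assume the recorded
value looks like the java.version system property, in either the legacy
"1.8.0_191" form or the newer "11.0.1" form.

```python
import re


def jdk_major(version):
    """Extract the major release from a JDK version string.

    Handles the legacy scheme ('1.8.0_191' means major 8) and the scheme
    used since JDK 9 ('11.0.1' means major 11).
    """
    parts = re.findall(r"\d+", version)
    if not parts:
        raise ValueError(f"unrecognised JDK version: {version!r}")
    # Legacy versions start with '1.' and carry the major in second position
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    return int(parts[0])


def same_major(recorded, local):
    """True when both JDK versions share a major release, whatever the
    patch level."""
    return jdk_major(recorded) == jdk_major(local)
```

With this, a rebuild using 1.8.0_191 would be accepted for an artifact
recorded with 1.8.0_181, while a rebuild using 11.0.1 would not.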

> 
> >> - What exactly gets PGP-signed?  (The binary artifact?  The buildinfo?
> >> 
> >>   If the latter, how does one then establish trust in the binary
> >>   artifact?)
> > 
> > good question:
> > the rebuilder's buildinfo, for sure, gets signed by the rebuilder.
> > Signing the binary artifact could make sense, but the workflow for that
> > may not be easy...
> > Signing the original buildinfo file does not really make sense to me: if
> > we sign an existing file, IMHO it's better to go with the binary artifact
> 
> +1
> 
> The buildinfo doesn't need trust, it's the source recipe
It's the source recipe + a few environment details in the JVM case (like JDK
patch level, OS). This second part is what hurts, and it would not be that
easy to remove: automatically recording the build environment with the usual
useful data is easy, while expecting every developer to manually define what
is significant and what is more flexible is harder.
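
One possible way around this, sketched below under my own assumptions, is to
let the comparison tool decide what is significant instead of the developer:
the buildinfo records everything, and the rebuilder treats some fields as
strict and the others as informational. The field names here are purely
hypothetical, not an existing buildinfo format.

```python
# Hypothetical field names: fields a rebuild must match exactly.
STRICT_KEYS = frozenset({"source", "build-tool", "java.major"})


def compare_buildinfo(reference, rebuild, strict_keys=STRICT_KEYS):
    """Compare two buildinfo records given as dicts.

    Returns (blocking, informational): the names of differing fields that
    are strict requirements, and the names of differing fields that are
    recorded only for information (patch level, OS, ...).
    """
    blocking, informational = [], []
    for key in sorted(set(reference) | set(rebuild)):
        if reference.get(key) != rebuild.get(key):
            if key in strict_keys:
                blocking.append(key)
            else:
                informational.append(key)
    return blocking, informational
```

A rebuild would then be considered a valid attempt when the blocking list is
empty, and the informational list would only be reported.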

> and you can
> prove it's the correct one by using it yourself to rebuild the artifact.
> Signing the build artifact tells you that the recipe worked, and it also
> makes it more straightforward to check the validity.
+1: downloading 1 .asc file that contains thousands of signatures is easier
than downloading thousands of .asc files, each one containing 1 signature

> 
> (It drives me nuts as a distro packager to see upstream projects that
> provide a sha512sums file and PGP-sign the checksums, but don't PGP-sign
> the binaries.
:)

> It's completely unfeasible to support the many ways this
> file can be created in a purely programmatic way, and the proliferation
> of signing data is difficult to standardize. If the source artifact
> itself is not PGP-signed, we simply don't support it and the package
> does not get PGP verification at all.)





