[rb-general] buildinfo content for JVM based build

Hervé Boutemy hboutemy at apache.org
Sat Dec 22 07:22:49 CET 2018


After Arnout's excellent PoC [1], I'd like to discuss the buildinfo content, based on a review of the current example:

> name=stamina-core
> group_id=com.scalapenos
> artifact_id=stamina-core_2.12
> version=0.1.5-SNAPSHOT
ok, same meaning as in the usual POM, IIUC

> build_architecture=all
why the "all" value?
to me, buildinfo is more a record of the conditions under which the build was done than a definition of the supported conditions
For example, it could be useful to record whether the build was done on Windows or on a Unix system, because of newlines

We'll also have to decide whether recording info like platform encoding or locale is useful: relying on platform encoding is bad practice, so I'd prefer to avoid recording it
but locale may be useful to record for generated documentation (or we assume that it should be reproducible as English...)
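
To make the "record of build conditions" idea concrete, here is a minimal sketch in Python of what such a record could contain. All key names are hypothetical and not part of any agreed format; a JVM build tool would more likely read system properties such as os.name and line.separator. Platform encoding is left out on purpose, following the preference above.

```python
import locale
import os
import platform


def build_environment():
    """Describe the machine the build ran on.

    Every key name below is hypothetical: nothing here is an agreed format.
    """
    lang, _encoding = locale.getlocale()
    return {
        # Operating system family, e.g. "Linux", "Windows", "Darwin"
        "build.os.name": platform.system(),
        # Native newline convention, relevant for generated text files
        "build.line-separator": "CRLF" if os.linesep == "\r\n" else "LF",
        # Locale language, possibly relevant for generated documentation
        "build.locale": lang or "unknown",
    }


if __name__ == "__main__":
    for key, value in build_environment().items():
        print(f"{key}={value}")
```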

> source=com.scalapenos:stamina-core_2.12
> binary=com.scalapenos:stamina-core_2.12
> package=com.scalapenos:stamina-core_2.12
I don't get the meaning of source vs binary vs package
And as I said in my previous email, if the source value is a groupId:artifactId coordinate, I expect to find a ${artifactId}-${version}-source-release.zip file in the repository

> java.version=1.8.0_191
ok, this is the exact version of the JVM used to run the build tool, and we expect that any 1.8.0 JDK will allow the same rebuild

> sbt.version=1.0.4
> scala.version=2.12.4
> scala.binary-version=2.12
I suppose these are fields specific to the SBT build tool: I know neither Scala nor SBT, so I am not really able to judge whether these 3 fields are strictly necessary
But from an outsider's perspective, this seems like a lot of fields: do you install Scala independently from SBT? And is binary-version also installed/defined separately?

Perhaps we should add a basic record: build_tool=sbt
to avoid guessing the build tool from the visible properties :)
and then have a defined list of known build tools and, for each tool, the list of build-tool-specific variables to define
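
A rough Python sketch of how such a per-tool list could be checked is shown below. The registry content is an assumption for illustration only: the sbt entry simply reuses the three fields from the example, and "mvn" with "mvn.version" is a made-up placeholder.

```python
# Hypothetical registry: build tool name -> tool-specific keys it must record.
REQUIRED_KEYS = {
    "sbt": ["sbt.version", "scala.version", "scala.binary-version"],
    "mvn": ["mvn.version"],  # placeholder, not an agreed key
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the tool-specific keys that the declared build tool lacks."""
    tool = props.get("build_tool")
    if tool not in REQUIRED_KEYS:
        raise ValueError(f"unknown or missing build_tool: {tool!r}")
    return [key for key in REQUIRED_KEYS[tool] if key not in props]


if __name__ == "__main__":
    sample = "build_tool=sbt\nsbt.version=1.0.4\nscala.version=2.12.4\n"
    print(missing_keys(parse_properties(sample)))
```

With the sample above, the only key reported as missing is scala.binary-version.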

> date=1545329140000
date of what?
should we record source-date-epoch [2]?
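
For comparison: the 13-digit value looks like milliseconds since the Unix epoch (an assumption, since the example does not say), whereas SOURCE_DATE_EPOCH is expressed in seconds. A small Python sketch of the conversion, with hypothetical function names:

```python
from datetime import datetime, timezone


def millis_to_source_date_epoch(millis):
    """Convert epoch milliseconds to the whole seconds SOURCE_DATE_EPOCH uses."""
    return millis // 1000


def epoch_to_iso(seconds):
    """Render epoch seconds as a UTC timestamp for human review."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    seconds = millis_to_source_date_epoch(1545329140000)
    print(seconds)                # 1545329140
    print(epoch_to_iso(seconds))  # 2018-12-20T18:05:40Z
```

Under that assumption, the recorded date is 2018-12-20 18:05:40 UTC, which does not by itself say whether it is the build time or a source timestamp.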

> checksums_sha256.0.filename=stamina-core_2.12-0.1.5-SNAPSHOT.pom
> checksums_sha256.0.length=1824
> checksums_sha256.0.checksum=f0c46fef74b6b27d50dbbeae7b8db88f978985be6ce865cb58353e76919b7b67
> checksums_sha256.1.filename=stamina-core_2.12-0.1.5-SNAPSHOT.jar
> checksums_sha256.1.length=132425
> checksums_sha256.1.checksum=2c48a2359e7db58308debb5d30fee53f1023f04aabc71db29bd629f79e7ab9bb
IIUC, these are the checksums of the main build results: the POM and the binary jar
I don't know if the -javadoc.jar and -sources.jar checksums should also be provided, since they are also build results, but less critical
Perhaps it's something that can be chosen by the release manager: not every published bit has to be reproducible, I suppose
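
As an illustration of how a rebuilder could use these output checksums, here is a minimal Python sketch. It assumes the properties have already been loaded into a dict and that the rebuilt files sit in one directory; the function names are hypothetical, and only the checksums_sha256.N.* key layout comes from the example.

```python
import hashlib
import re
from pathlib import Path

# Matches keys such as checksums_sha256.0.filename
ENTRY = re.compile(r"^checksums_sha256\.(\d+)\.(filename|length|checksum)$")


def checksum_entries(props):
    """Group checksums_sha256.N.* properties into one dict per index N."""
    entries = {}
    for key, value in props.items():
        match = ENTRY.match(key)
        if match:
            index, field = int(match.group(1)), match.group(2)
            entries.setdefault(index, {})[field] = value
    return [entries[index] for index in sorted(entries)]


def verify(entry, directory):
    """Check that a rebuilt file has the recorded length and SHA-256."""
    data = (Path(directory) / entry["filename"]).read_bytes()
    return (
        len(data) == int(entry["length"])
        and hashlib.sha256(data).hexdigest() == entry["checksum"]
    )
```

A release manager who only wants some artifacts to be reproducible would simply list fewer entries; files without an entry are not checked.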

During the Reproducible Builds event in Paris [3], there was an idea of recording checksums for the effective dependencies used.
I'm not personally convinced this is useful, since it's part of the reproducible process.
But I wanted at least to share the *input* checksums vs *output* checksums classification
If anybody wants to discuss input checksums...

And of course there will be the question of where to store the result of our work



[1] https://oss.sonatype.org/content/repositories/snapshots/com/scalapenos/stamina-core_2.12/0.1.5-SNAPSHOT/

[2] https://reproducible-builds.org/docs/source-date-epoch/

[3] https://reproducible-builds.org/events/paris2018/
