[rb-general] buildinfo content for JVM based build

Hervé BOUTEMY herve.boutemy at free.fr
Sat Dec 22 19:13:19 CET 2018

On Saturday, 22 December 2018, 11:58:37 CET, Arnout Engelen wrote:
> On Sat, Dec 22, 2018 at 7:23 AM Hervé Boutemy <hboutemy at apache.org> wrote:
> > After Arnout's excellent PoC [1], I'd like to discuss the buildinfo
> > content based on reviewing current example:
> > > name=stamina-core
> > > group_id=com.scalapenos
> > > artifact_id=stamina-core_2.12
> > > version=0.1.5-SNAPSHOT
> > 
> > ok, same meaning as in the usual POM, IIUC
> Yes, I think following the maven/ivy conventions here makes sense.
> > > build_architecture=all
> > 
> > why "all" value?
> > to me, buildinfo is more a record of the conditions when the build was
> > done than a definition of supported conditions
> Agreed. If anything we could record the `-target` level, but since
> AFAIK it is rare to publish the same artifact with different `-target`
> levels, let's drop this field.
In Maven, the target level is defined in the build file, not on the command
line, so there is no need to record it.

> I do think we should include the
> 'classifier' field, if any, though.
What do you mean by "classifier"?

> > For example, it could be useful to record if the build was done on Windows
> > or any Unix system, for newlines
> That might be interesting. I wonder if we should just include most of
> the commonly defined JVM system properties
> (https://docs.oracle.com/javase/tutorial/essential/environment/sysprop.html)
OK to name the fields like this, but not to add everything: many of these
properties are definitely not relevant, or else the build recipe depends far
too much on the environment.
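
To make "some properties, not everything" concrete, here is a minimal sketch
of recording only a short list of JVM system properties. The class name and
the list itself are my assumptions for illustration; which properties really
belong in the list is exactly what remains to be decided. "file.encoding" is
not in the documented standard set, so the code tolerates its absence.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BuildEnvRecorder {
    // Hypothetical short list, not an agreed-upon set of buildinfo fields.
    static final List<String> RECORDED = Arrays.asList(
            "java.version", "os.name", "os.arch", "line.separator", "file.encoding");

    // Collects the listed properties, sorted by key, skipping undefined ones.
    static Map<String, String> record() {
        Map<String, String> out = new TreeMap<>();
        for (String key : RECORDED) {
            String value = System.getProperty(key);
            if (value != null) {
                out.put(key, escape(value));
            }
        }
        return out;
    }

    // Makes CR and LF visible, so a Windows "\r\n" differs from a Unix "\n".
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        record().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Escaping line.separator is what would let a reader see from the buildinfo
whether the build ran with Windows or Unix newlines.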

> .
> > We'll have to define also if recording info like platform encoding or
> > locale is useful: relying on platform encoding is a bad practice, then
> > I'd prefer avoiding but locale may be something useful to record for
> > generated documentation (or assume it should be reproducible as
> > english...)
> I agree it would be useful to include those: they shouldn't affect the
> build, but including them in the buildinfo may make it easier to spot
> when they accidentally do (combined with comparing diffoscope output
> of course).
Note that we could imagine a verbose buildinfo for investigation, while the
normal buildinfo would stay short.

> > > source=com.scalapenos:stamina-core_2.12
> > > binary=com.scalapenos:stamina-core_2.12
> > > package=com.scalapenos:stamina-core_2.12
> > 
> > I don't get the meaning of source vs binary vs package
> I agree - these mostly came from when I was still much closer to the
> Debian format, but I don't think they make much sense in the JVM
> context. Perhaps let's just remove all 3 until we find a need for
> them?
source is useful, since it is the location from which to get the sources.
It should support several options: artifact coordinates, URL of a tarball,
VCS tag.

> > And as told on my previous email, if source value is a groupId:artifactId
> > coordinate, I expect to find a
> > ${artifactId}-${version}-source-release.zip file in the repository
> Perhaps we should promote including 4 'scm' fields corresponding to
> https://maven.apache.org/pom.html#SCM?
Definitely fewer than the 4 fields: developerConnection and "url" (which
points to an HTML rendering) are not useful.

> > > java.version=1.8.0_191
> > 
> > ok, this is the exact version of JVM used to run the build tool, and we
> > expect that any 1.8.0 JDK will permit same rebuild
> I'm not sure that is safe to expect, but in any case we should record it ;).
We'll need to share experience on how sensitive the build result is to the JDK
version: I hope that neither the patch version nor the JDK distribution
matters, otherwise an effective rebuild will require a complex, specific setup
in each case.
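
If the hope holds that any 1.8.0 JDK gives the same result, a rebuilder could
compare the recorded java.version with its own while ignoring the update
number. A minimal sketch, assuming the legacy "1.8.0_191" naming scheme only
(the class and method names are mine, and newer JDK version strings have no
such suffix, so they are compared as-is):

```java
final class JdkVersionCheck {
    // Drops the "_191" update suffix of the legacy scheme: "1.8.0_191" -> "1.8.0".
    static String withoutUpdate(String javaVersion) {
        int idx = javaVersion.indexOf('_');
        return idx < 0 ? javaVersion : javaVersion.substring(0, idx);
    }

    // True when both versions agree once the update number is ignored.
    static boolean sameRelease(String recorded, String current) {
        return withoutUpdate(recorded).equals(withoutUpdate(current));
    }

    public static void main(String[] args) {
        String recorded = "1.8.0_191"; // value taken from the example buildinfo
        String current = System.getProperty("java.version");
        System.out.println("same release: " + sameRelease(recorded, current));
    }
}
```

Whether this looser comparison is actually safe is the experience we still
need to gather.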

> > > sbt.version=1.0.4
> > > scala.version=2.12.4
> > > scala.binary-version=2.12
> > 
> > I suppose these are SBT build tool specific fields: I don't know Scala nor
> > SBT, then I am not really able to judge if these 3 fields are strictly
> > necessary. But from an outsider perspective, this seems like many fields:
> > do you install scala independently from SBT?
> Yes, possibly.
From the test I did, yes: scala-version is used on the command line to
rebuild, so it has to be recorded.

> > And is binary-version also installed/defined separately?
> No, the binary-version of scala 2.12.4 is always 2.12, so strictly
> speaking the binary-version field is redundant.
> > Perhaps we should add a basic record: build_tool=sbt
> > to avoid guessing build tool from visible properties :)
> Agreed. Let's make it 'build-tool=sbt' since hyphens seem to be a bit
> more common than underscores in properties files?
I didn't want to start the debate on hyphens vs underscores, but yes, I prefer
hyphens; I only used underscores to match your initial choice :)

> > then have a defined list of known build tools and for each tool the list
> > of build-tool-specific variable to define
> I'm not sure we need to standardize this strictly, but sure.
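Even without strict standardization, a per-tool list of expected fields is
cheap to check. A minimal sketch, assuming the 'build-tool' key discussed
above; the "mvn" entry and its key are purely hypothetical, and the sbt entry
just reuses the field names from the example:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

final class BuildToolFields {
    // Known build tools and the tool-specific fields each one should define.
    static final Map<String, List<String>> REQUIRED = new HashMap<>();
    static {
        REQUIRED.put("sbt", Arrays.asList("sbt.version", "scala.version"));
        REQUIRED.put("mvn", Arrays.asList("mvn.version")); // hypothetical key
    }

    // Parses buildinfo text written in the java.util.Properties format.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Returns the expected keys that are absent; unknown tools require nothing.
    static List<String> missing(Properties buildinfo) {
        List<String> result = new ArrayList<>();
        String tool = buildinfo.getProperty("build-tool");
        if (tool == null) {
            result.add("build-tool");
            return result;
        }
        for (String key : REQUIRED.getOrDefault(tool, Collections.<String>emptyList())) {
            if (buildinfo.getProperty(key) == null) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties p = parse("build-tool=sbt\nsbt.version=1.0.4\n");
        System.out.println("missing: " + missing(p));
    }
}
```
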
> > > date=1545329140000
> > 
> > date of what?
> This was the build date.
If the build date is needed to get the same result, I consider the build not
reproducible: in theory we can reproduce the build date, but that's really
not convenient.

> I wasn't sure if 'seconds since epoch' is a
> great format, on the other hand SOURCE_DATE_EPOCH also uses it and
> it's easy to parse.
What matters about this value is not its format but its semantics.

> > should we record source-date-epoch [2]?
> I think including the date of the latest commit is useful - not sure
> if it should necessarily come from a SOURCE_DATE_EPOCH environment
> variable or could be determined otherwise.
OK, let's change our strategy: do you need a date to reproduce your sbt build?
If you don't, let's not record any date whose expected semantics we don't
know.
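
For reference, SOURCE_DATE_EPOCH is defined as a number of seconds since the
Unix epoch, while the 13-digit value in the example looks like milliseconds,
which is one more reason to pin down semantics before recording anything. A
minimal sketch of reading the variable, if it ever turns out to be needed (the
class name is mine; a non-numeric value simply throws NumberFormatException):

```java
import java.time.Instant;
import java.util.Optional;

final class SourceDateEpoch {
    // Interprets the raw value as whole seconds since 1970-01-01T00:00:00Z.
    static Optional<Instant> parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Instant.ofEpochSecond(Long.parseLong(raw.trim())));
    }

    public static void main(String[] args) {
        // Prints Optional.empty when the variable is not set.
        System.out.println(parse(System.getenv("SOURCE_DATE_EPOCH")));
    }
}
```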

> > IIUC, that's the checksum of main build result: POM and binary jar
> Yes
> > I don't know if -javadoc.jar and -sources.jar checksums should also be
> > provided, since they are also build result, but less critical
> Good question. For now I'm not taking those into account (and am not
> making any effort towards making those reproducible as well).
perfect choice for me
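
For the outputs that are covered (POM and binary jar), computing a checksum
to record or to compare takes only a few lines. A minimal sketch: the digest
algorithm is my assumption (SHA-256), since the one used by the PoC is not
stated in this thread, and the class name is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ArtifactChecksum {
    // Returns the lowercase hex SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform implementation.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ArtifactChecksum path/to/artifact.jar
        for (String path : args) {
            byte[] content = Files.readAllBytes(Paths.get(path));
            System.out.println(sha256Hex(content) + "  " + path);
        }
    }
}
```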

> > Perhaps it's something that can be chosen by the release manager: not
> > every published bit has to be reproducible, I suppose
> Agreed
> > During Reproducible Builds event in Paris [3], there was an idea of
> > recording checksum for effective dependencies used. I'm not personally
> > convinced this is useful, since it's part of reproducible process.
> I'm not convinced yet either, but I haven't heard the motivation.
perfect for me :)



> Kind regards,
> Arnout
