Reproducibility for Java
Magnus Ihse Bursie
magnus.ihse.bursie at oracle.com
Tue Nov 12 14:33:26 UTC 2024
Hi,
I'm working with the Build Group on OpenJDK. We have been working to
make the build of the JDK reproducible, as a long-term project spanning
several years. This has, at times, included changes to Java itself (such
as a way to write .properties files without a timestamp comment).
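To illustrate the .properties case: Properties.store() always writes a
comment line with the current date, which makes the output differ from
run to run. The sketch below is not the JDK change itself; it is a
portable workaround that drops the comment lines and sorts the entries.
The class and method names (StableProperties, storeWithoutTimestamp) are
made up for this example. As far as I recall, JDK 18 and later can also
replace the date comment through the java.properties.date system property
given on the command line, but please verify that against the
Properties.store() documentation.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the JDK.
public class StableProperties {

    /** Serializes props without comment lines and with entries sorted. */
    static String storeWithoutTimestamp(Properties props) throws IOException {
        StringWriter out = new StringWriter();
        // null means "no user comment"; the date comment is still written.
        props.store(out, null);
        return out.toString().lines()
                // store() escapes a leading '#' in keys, so only real
                // comment lines (including the date) start with '#'.
                .filter(line -> !line.startsWith("#"))
                // Hashtable iteration order is unspecified, so sort.
                .sorted()
                .collect(Collectors.joining("\n", "", "\n"));
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("b", "2");
        props.setProperty("a", "1");
        System.out.print(storeWithoutTimestamp(props));
    }
}
```

Running this twice at different times should give byte-identical output,
which is the property a reproducible build needs.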
The builds we produce in the OpenJDK project, on Linux, are fully
reproducible as far as I am aware. I must admit I did not follow all
links in your mail to fully understand the problem you are seeing, but
it seems that the gist is "the problem is in our distribution, but the
fix must lie within Java itself".
If indeed a fix is needed in Java to enable reproducibility, I can
probably help to try and drive such a change. However, I need to
understand the problem better first, to determine if I agree that a fix
in the JDK itself is appropriate.
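For illustration only, and not as a proposal: the kind of change
discussed in the quoted thread below would presumably replace a
'new Date()' call with a timestamp that honors SOURCE_DATE_EPOCH
(seconds since the Unix epoch) when it is set. A minimal sketch follows,
assuming hypothetical names (ReproducibleClock, resolve, now); none of
this is existing JDK code.

```java
import java.time.DateTimeException;
import java.time.Instant;
import java.util.Date;

// Hypothetical helper, not part of the JDK.
public class ReproducibleClock {

    /**
     * Interprets the given SOURCE_DATE_EPOCH value as seconds since the
     * Unix epoch. Returns the fallback if the value is unset, blank,
     * malformed, or outside the range Instant can represent.
     */
    static Instant resolve(String sourceDateEpoch, Instant fallback) {
        if (sourceDateEpoch == null || sourceDateEpoch.isBlank()) {
            return fallback;
        }
        try {
            return Instant.ofEpochSecond(Long.parseLong(sourceDateEpoch.trim()));
        } catch (NumberFormatException | DateTimeException e) {
            return fallback;
        }
    }

    /** Stand-in for 'new Date()': fixed if SOURCE_DATE_EPOCH is set. */
    static Date now() {
        return Date.from(resolve(System.getenv("SOURCE_DATE_EPOCH"), Instant.now()));
    }

    public static void main(String[] args) {
        System.out.println(now());
    }
}
```

Whether a production runtime should react to SOURCE_DATE_EPOCH at all is
exactly the open question, so this only shows the mechanics.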
Can you indulge me by writing down a clear problem statement in a mail
to this list, on why reproducibility fails in your build, what change
you want to make in the JDK, and why you think this will solve the
problem? I'm sorry to put the burden on you, but the Build Group is
unfortunately understaffed and I do not have as much time to work on
"non-essential" stuff (like reproducibility) as I'd like.
/Magnus
On 2024-11-12 12:41, Roland Clobus wrote:
> Hello Chris, list,
>
> On 08/11/2024 20:50, Chris Lamb wrote:
>> Roland Clobus wrote:
>>
>>> Should production runtime environments
>>> be sensitive to SOURCE_DATE_EPOCH (instead of during building)?
>>
>> Starting with this bit first: given the context of your question is in
>> relation to building live images, I assume that installing the
>> ca-certificates-java package is the source of the non-reproducibility…
>> presumably via the postinst script doing some processing.
>>
>> Indeed, yes, this seems to be the case:
>>
>> https://salsa.debian.org/java-team/ca-certificates-java/-/blob/master/debian/ca-certificates-java.postinst
>>
>> If so, then I agree in general and specific terms: I believe that
>> preinst and postinst scripts should be deterministic, and I've filed
>> many such bugs in the past to make them so, chiefly in the process of
>> getting Tails reproducible.
>
> Indeed, the non-reproducible file comes from the postinst step of
> ca-certificates-java. However, none of the code in that package needs
> adjustment; the fix needs to be applied to OpenJDK (as the 'new Date()'
> calls are in OpenJDK itself).
>
>>> What strategy would you propose? [The package] embeds timestamps for
>>> 'now' in /etc/ssl/certs/java/cacerts.
>>
>> Hm. Given that the codebase calls 'new Date()' in a bunch of places,
>> then I think that any of the options that propose changes to Java are
>> not going to be visible in the short- or medium-term because of the
>> time it would take for those changes to filter down to Debian. They
>> are, of course, worth pursuing; but I suspect you would also like a
>> stopgap solution.
>
> Changing Java (just for Debian) could have undesired, unmaintainable
> side effects that I cannot foresee, so such changes need to be tested
> and approved by upstream.
>
> But a stopgap is required too, to get a reproducible live image sooner.
>
>> Using faketime would of course 'work', but are you proposing that the
>> maintainer of the ca-certificates-java package patch their postinst
>> to always use faketime? Otherwise, I am not sure how you would ensure
>> that this bit was called within the faketime environment only when
>> building your live image.
>
> After the regular postinst has run, I can run the postinst step again,
> this time with faketime active.
> There is sufficient control over the order of the installation of
> packages, that it is possible to ensure that the faketime-based
> version will be embedded in the live image.
> This will fulfil the need in the live image, but will not be a general
> solution.
>
> Generally, it would be possible to use faketime in the postinst, with
> the same value as SOURCE_DATE_EPOCH.
> I would propose faketime as a 'Suggests:' dependency for the package.
> Faketime would then be applied only when _both_ faketime is installed
> _and_ SOURCE_DATE_EPOCH is set.
>
>> Now I do like your idea of not shipping the unreproducible file, and
>> it would be especially elegant if the package worked with or without
>> the file(s) being present. But I don't think that is the case: the
>> very point is that it generates these files in a known place on the
>> filesystem so that other programs can access them.
>>
>> Similarly, I don't think this package has any broader concept of a
>> 'first run' in which it could be generated if it doesn't exist. You
>> can't even be 100% sure that these files will only be accessed by a
>> Debian-shipped Java runtime.
>>
>> But I do note that the update_cacerts method is called in that
>> postinst when a new Java runtime is installed. The very fact that this
>> is abstracted out is promising. I wonder if you could: (a) remove the
>> offending file as you outline; and then (b) call this very method
>> during the live script's boot, perhaps by manually invoking the dpkg
>> trigger that is meant to be for when a new JRE is installed?
>
> The package live-config is the package to use for such first-run steps
> as (re)generating files.
>
> As the stopgap method, I'll go with the faketime version of the
> postinst step, as that will cause no additional startup-delay for the
> live image. [1]
>
>> Lastly: the package's maintainers may have a more elegant solution,
>> so it might be worth looping them in.
>
> I'll send a mail to the maintainers and debian-java later.
>
> Thanks for thinking along,
> With kind regards,
> Roland
>
> [1] https://salsa.debian.org/live-team/live-build/-/merge_requests/385
More information about the rb-general mailing list