Tools respecting SOURCE_DATE_EPOCH (was Re: Reproducible XFS Filesystems Builds for VMs)

Vagrant Cascadian vagrant at reproducible-builds.org
Sun Apr 13 16:51:50 UTC 2025


On 2025-04-13, Simon Josefsson wrote:
> Vagrant Cascadian <vagrant at reproducible-builds.org> writes:
>>> FWIW, I ran into a similar problem with 'help2man' which behaves
>>> differently depending on SOURCE_DATE_EPOCH setting and there is no way
>>> except to use datefudge/faketime to make help2man use another time than
>>> current time. Or to perform post-processing of the output files.  Or
>>> give up and hard-code SOURCE_DATE_EPOCH when calling help2man (which is
>>> what I settled with), but that invalidates the caller's ability to pass
>>> in another SOURCE_DATE_EPOCH value and this nests badly.
>>
>> I am not seeing an actual problem here... so perhaps I am missing
>> something. Why do you need multiple different SOURCE_DATE_EPOCH values
>> as part of a single build process? If you have multiple build processes,
>> why are they getting ... an undesired SOURCE_DATE_EPOCH value?
>>
>> In my opinion, help2man is a build tool, and if it embeds any sort of
>> date, respecting SOURCE_DATE_EPOCH is the correct thing to do. At least,
>> that is the intention of how SOURCE_DATE_EPOCH is supposed to work...
>
> I'm not using help2man as a build tool when building a Debian package,
> I'm using help2man as a tool to create man pages during preparation of
> an upstream project, which are included in 'make dist' tarballs.  I want
> to control the output of that manpage.  Recall the (excellent!) spec for
> SOURCE_DATE_EPOCH:
>
> https://reproducible-builds.org/specs/source-date-epoch/
>
> In particular this quote:
>
>    This specification therefore defines a distribution-agnostic standard
>    for upstream build processes to consume this timestamp from packaging
>    systems. The intended result is a build where the output looks as if
>    the build had happened instantly at the time specified in that
>    timestamp.
>
> That situation isn't applicable for my situation, yet SOURCE_DATE_EPOCH
> is being used by help2man to influence the output in my situation.

So, my interpretation here is that "make dist" is just another packaging
system. You take some source, follow a build process, and output one or
more artifacts... (e.g. a source tarball).


> The embedded timestamp is causing reproducibility issues for my 'make
> dist' tarballs, so I would prefer to either 1) remove the timestamp from
> the manpages

There's no time like ... no time! Yeah, that would be my ideal for
reproducible builds, though I seem to recall man pages having problems
with no date last I tried. I would love to be shown otherwise!


> , or 2) set it to something predictable that I control in
> upstream.  Help2man has this code:
>
> my $epoch_secs = time;
> if (exists $ENV{SOURCE_DATE_EPOCH} and $ENV{SOURCE_DATE_EPOCH} =~ /^(\d+)$/)
> {
>     $epoch_secs = $1;
>     $ENV{TZ} = 'UTC0';
> }
>
> So either help2man will use current time, which I could influence by
> using datefudge/faketime, or it uses SOURCE_DATE_EPOCH which I can set
> during build.  I could also choose to post-process the output.  I've
> settled with using this:
>
> gsasl.1: $(top_srcdir)/src/gsasl.c $(top_srcdir)/src/gsasl.ggo \
> 		$(top_srcdir)/.version
> 	$(AM_V_GEN)env SOURCE_DATE_EPOCH=$(SOURCETIME_HELP2MAN) $(HELP2MAN) \
>         ...

That seems reasonable on the surface to me...
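
For illustration, the effect of the "env SOURCE_DATE_EPOCH=..." prefix in
that rule can be sketched in Python: the child process sees the pinned
value no matter what the caller exported. This is only a sketch; the
helper name run_with_pinned_epoch and the example value are hypothetical
and not part of gsasl or help2man.

```python
import os
import subprocess
import sys


def run_with_pinned_epoch(argv, epoch):
    """Run argv with SOURCE_DATE_EPOCH forced to `epoch` (hypothetical helper).

    Any SOURCE_DATE_EPOCH inherited from the caller is overridden, which is
    exactly the trade-off discussed above: the caller can no longer pass in
    a different value for this one tool.
    """
    env = dict(os.environ)                      # copy, do not mutate the parent
    env["SOURCE_DATE_EPOCH"] = str(int(epoch))  # pinned value wins
    result = subprocess.run(
        argv, env=env, check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a date-embedding tool: it just echoes the variable.
    child = [sys.executable, "-c",
             "import os; print(os.environ['SOURCE_DATE_EPOCH'])"]
    print(run_with_pinned_epoch(child, 1700000000), end="")
```

Swapping the stand-in child for the real tool invocation would mirror the
make rule, at the cost of ignoring whatever the outer build exported.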


> I derive a suitable SOURCETIME_HELP2MAN during ./configure time to a
> date that I want to be in the manpages (mapping to the last git commit,
> or release date from NEWS file when not building from git, although I
> may have bugs in this code since I'm still iterating on the solution).

Barring bugs of course, that sounds reasonable.
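
As a rough sketch of that derivation (last git commit, falling back to a
release date in NEWS), something like the following Python could work.
It is not the actual gsasl configure logic: the function names are
hypothetical, and the assumption that NEWS contains a YYYY-MM-DD date
near the top is mine.

```python
import re
import subprocess
from datetime import datetime, timezone


def epoch_from_git(srcdir):
    """Return the committer timestamp of the last commit, or None."""
    try:
        out = subprocess.run(
            ["git", "-C", srcdir, "log", "-1", "--format=%ct"],
            check=True, capture_output=True, text=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, or srcdir is not a git checkout
    return int(out) if out.isdigit() else None


def epoch_from_news(news_text):
    """Return midnight UTC of the first YYYY-MM-DD date found, or None.

    Assumption: the newest release date appears first in NEWS.
    """
    match = re.search(r"\b([0-9]{4})-([0-9]{2})-([0-9]{2})\b", news_text)
    if match is None:
        return None
    year, month, day = map(int, match.groups())
    try:
        stamp = datetime(year, month, day, tzinfo=timezone.utc)
    except ValueError:
        return None  # looked like a date but was not a valid one
    return int(stamp.timestamp())


def derive_source_time(srcdir, news_text):
    """Prefer the git commit time; fall back to the NEWS release date."""
    epoch = epoch_from_git(srcdir)
    return epoch if epoch is not None else epoch_from_news(news_text)
```

Using the committer date and midnight UTC are both arbitrary choices
here; what matters for reproducibility is that the result depends only
on the source tree, not on when or where the tarball is built.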


> I'm pretty sure that the timestamp I want to use is different from what
> SOURCE_DATE_EPOCH would be when building gsasl in a binary distribution.
> For Debian I think it is the date timestamp in debian/changelog?  I
> don't want that timestamp to end up in my man pages, because then we
> would have ten different manpages for the same version depending on when
> different distributions packaged this.  That's a reproducibility
> concern, although of course not an important one, but all small
> differences add up, eating auditors' time.

Well, if a distribution patches the manpage in some way, then it should
still show a different timestamp (or, in the case of man pages, a date,
if I recall correctly).

So, you can either be fastidious and figure out the source change
timestamp on each man page individually ... or slightly sloppy and
accept that the date on the distribution packages and source tarballs
differ.

If a distribution ships the manpage unchanged from the upstream tarball,
obviously this would not be a problem.

Although there is considerable worry these days about the source tarball
containing any generated artifacts (and I think included man pages count
as pregenerated artifacts), and so distributions should generally
regenerate everything anyways, at which point, it can be argued that the
distribution's timestamps are at least as appropriate as upstream's...


> I hope this explains the problem I'm facing, and why I believe it is a
> bad idea for tools to use SOURCE_DATE_EPOCH as an end-user facing
> environment variable in contexts that aren't building binaries for
> distribution.

I think it explains it, though I may disagree on your conclusions...


> I would have preferred to be able to use an approach like this, which would
> override both current time and SOURCE_DATE_EPOCH:
>
> gsasl.1: $(top_srcdir)/src/gsasl.c $(top_srcdir)/src/gsasl.ggo \
> 		$(top_srcdir)/.version
> 	$(AM_V_GEN)$(HELP2MAN) --timestamp $(SOURCETIME_HELP2MAN) \
>         ...
>
> Do you have any other suggested solution in this situation?  Reflection
> on where I make some bad assumption or go wrong in my thinking?

I really don't see a significant difference with "--timestamp
$(SOURCETIME_HELP2MAN)" vs. "SOURCE_DATE_EPOCH=$(SOURCETIME_HELP2MAN)"
... except the latter is supported upstream and the former is not yet
supported upstream... :)
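
To make the comparison concrete, here is a minimal Python sketch of the
precedence a tool could apply if it had both knobs: an explicit value
first, then SOURCE_DATE_EPOCH, then the current time. The "explicit"
argument stands in for the proposed --timestamp option, which help2man
does not support today; the function name resolve_epoch is hypothetical.

```python
import os
import re
import time


def resolve_epoch(explicit=None, environ=None, now=None):
    """Pick a timestamp: explicit value, then SOURCE_DATE_EPOCH, then now.

    `explicit` models a hypothetical --timestamp option; `environ` and
    `now` are injectable only to keep the sketch easy to test.
    """
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return int(explicit)
    value = environ.get("SOURCE_DATE_EPOCH", "")
    # Same idea as the quoted Perl check: accept only a plain run of digits.
    if re.fullmatch(r"[0-9]+", value):
        return int(value)
    return int(time.time() if now is None else now)
```

With only the environment variable available, the first branch
disappears, and the remaining two are the behaviour of the quoted Perl
snippet.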


> Or would you simply disagree that ending up with a predictable identical
> man page timestamp across distributions is a worthy goal?  I don't think
> that is a goal many people seem to care about, so it is fine to have
> disagreement here.

I think it might be a worthy ideal... but perhaps not worth the effort to
handle all the possible permutations? Especially in the light of an
ideal of not shipping tarballs with pregenerated contents...

I mean, if it is easy, sure, go for it, but in the grander scheme of
things, I am much more concerned about consistency within a distro than
across distros... with built artifacts.


live well,
  vagrant

