[rb-general] GNU coding standards discussion

Ian Jackson ijackson at chiark.greenend.org.uk
Fri Dec 2 15:55:04 CET 2016


Ximin Luo writes ("Re: [rb-general] GNU coding standards discussion"):
> Ian Jackson:
> > gnu-proc-disc is not a public list, and it contains a number of quite
> > difficult-to-talk-to people :-), so you may find it easier to use me
> > as a go-between.
> 
> Thanks for taking on this role!

YW.  Someone needs to do it and I seem to have enough feet in the
various camps that I might even make a useful difference...

> Ian Jackson:
> >   For filenames and contents `identical' means an identical sequence
> >   of bytes.  So it does not mean only semantically equivalent
> >   contents, such as an equivalent but different sequence of unicode
> >   codepoints.  (If the build/install is done in a non-UTF-8 locale,
> >   filenames and contents may depend on the locale.)
> 
> Why are they putting this in? This contradicts the definition of
> "input" above. If the output may "depend on the locale" then the
> locale is part of the definition of "input". We'd prefer not to
> define "input" like this...

Those are my words.  They may be a bad idea.  I will try to explain my
thinking:

Packages often produce documentation, and it is not unusual for them
to need to produce documentation containing non-ASCII characters.  If
the LC_CTYPE in force for the build is not UTF-8, but some other
multibyte encoding, then the documentation should be produced in that
other encoding, surely?

Inevitably this means that reproducibility of a source build in a
non-UTF-8 locale might depend on the locale.
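
To make the byte-level point concrete, here is a minimal sketch in
Python.  The sample text and the choice of EUC-JP as the "other
multibyte encoding" are assumptions made purely for illustration;
they are not taken from any real package.

```python
# The same documentation text, as a build might write it out under
# two different LC_CTYPE settings.  Sample text and encodings are
# illustrative assumptions only.
TEXT = "\u65e5\u672c\u8a9e"  # three CJK codepoints


def encode_for_locale(text, encoding):
    """Return the bytes a build would emit for `text` in `encoding`."""
    return text.encode(encoding)


utf8_bytes = encode_for_locale(TEXT, "utf-8")
eucjp_bytes = encode_for_locale(TEXT, "euc_jp")

# Both decode back to the same codepoints, so the contents are
# semantically equivalent, but the byte sequences differ, so they
# are not `identical' in the sense of the proposed wording.
same_meaning = utf8_bytes.decode("utf-8") == eucjp_bytes.decode("euc_jp")
same_bytes = utf8_bytes == eucjp_bytes
```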

In practice I think all widely-distributed binaries should (and will)
be built with a UTF-8 LC_CTYPE, so this is not a problem for third
party verification of distro binaries.

Whether to handle this case, in the GNU coding standards, by a
definition of "input" or by restricting the set of aspects of the
output which are supposed to be identical, is just a difference in
wording, not a difference in meaning.  The GNU coding standards
necessarily have a more complicated definition of `output' than a
distro build.

> >   Any symlinks have identical targets, and the hardlink structure (if
> >   any) will be identical.
> > 
> >   If the build/install is done with the same password and group
> >   databases, the files will also have identical ownerships.
> 
> Again, this contradicts the definition of "input" above.

Can you reframe that as a practical problem?

I think this kind of exception is necessary because a build system's
"make install" target might need to do something like this:

   install -m 2755 -o root -g plugdev blah-helper $(DESTDIR)/usr/bin/

Obviously the resulting ownership of $(DESTDIR)/usr/bin/blah-helper
will depend on what the prevailing group database thinks about
`plugdev'.  (In the worst case, the plugdev group might not exist at
all.  I'm not sure what install(8) does in that case...)
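
A rough sketch of why the group database matters, in Python.  The
three "hosts" and their gid numbers are hypothetical; a real lookup
would go through the system database (for instance Python's
grp.getgrnam, which raises KeyError for an unknown group).

```python
def resolve_gid(group_db, name):
    """Look up `name` in a group database modelled as {name: gid}.

    Returns None when the group does not exist.
    """
    return group_db.get(name)


# Hypothetical group databases on three build hosts.
host_a = {"root": 0, "plugdev": 46}
host_b = {"root": 0, "plugdev": 1004}
host_c = {"root": 0}  # no plugdev group at all

# The same `-g plugdev' yields a different numeric owner on each host.
gids = [resolve_gid(db, "plugdev") for db in (host_a, host_b, host_c)]
```

With identical databases the lookup gives identical results, which is
all the quoted wording promises.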

> >   The modification timestamp (mtime) of each file or directory is
> >   either:
> >    (i) In each build/install, no earlier than the start of that build;
> >    (ii) Identical across all builds/installs, and no later than the
> >        age of the youngest input.
> >   The ctime and atime are of no concern and may vary.
> 
> Again, this contradicts the definition of "input" above.

Again, can you reframe that as a practical problem?  As a matter of
standards exegesis it obviously can't contradict the definition of
`input' because it does not try to define `input' or anything that
`input' depends on.

My goal was to legitimise the common pattern of using install(8) to
install source files directly into $DESTDIR.
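
For what it is worth, the mtime rule quoted above can be written
down as a small check.  This is only a sketch of one reading of the
wording: timestamps are plain numbers, and "the age of the youngest
input" is taken to mean the mtime of the most recently modified
input.  The function and parameter names are made up for the example.

```python
def mtime_ok(mtimes, build_starts, youngest_input):
    """Check one output file against the proposed mtime rule.

    mtimes[i]       mtime of the file as produced by build i
    build_starts[i] start time of build i
    youngest_input  mtime of the most recently modified input
    """
    # Case (i): in each build, the mtime is no earlier than that
    # build's start.
    fresh = all(m >= s for m, s in zip(mtimes, build_starts))
    # Case (ii): the mtime is identical across all builds and no
    # later than the youngest input.
    fixed = len(set(mtimes)) == 1 and mtimes[0] <= youngest_input
    return fresh or fixed
```

A file copied with a fresh timestamp in every build passes under (i);
a file whose source timestamp is preserved passes under (ii).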

> I think a simpler option would be to avoid making comments about
> $DESTDIR and the filesystem structure, but instead to make comments
> about the tarball that would be created if you did that, probably
> with --clamp-mtime set to the last entry of the ChangeLog - GNU
> packages are all supposed to have this, right?

This might be a good alternative approach.
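
A minimal sketch of the tarball idea, using Python's tarfile module
rather than GNU tar itself.  It loosely mirrors the effect of
tar's --mtime together with --clamp-mtime: timestamps newer than the
clamp value are pulled back to it, older ones are left alone.  The
helper name, the fixed ownership and mode, and the in-memory file
map are all assumptions made for the example.

```python
import io
import tarfile


def build_tarball(files, clamp):
    """Return tar bytes for `files`, a map {name: (data, mtime)}.

    Members are written in sorted order with fixed ownership and
    mode, and every mtime later than `clamp` is reduced to `clamp`.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):
            data, mtime = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = min(mtime, clamp)
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = "root"
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Two hypothetical builds of the same tree, run at different times.
build_1 = {"usr/bin/blah-helper": (b"binary", 2000),
           "usr/share/doc/README": (b"text", 500)}
build_2 = {"usr/bin/blah-helper": (b"binary", 3000),
           "usr/share/doc/README": (b"text", 500)}

tar_1 = build_tarball(build_1, clamp=1000)
tar_2 = build_tarball(build_2, clamp=1000)
```

Comparing tar_1 and tar_2 byte for byte then stands in for comparing
the two installed trees.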

> For more theoretical discussion on what reproducibility "means" you
> and/or the other GNU folks might like this doc I wrote:

I have encountered that before, yes, thanks.

Ian.
