[rb-general] GNU coding standards discussion
Ian Jackson
ijackson at chiark.greenend.org.uk
Fri Dec 2 15:55:04 CET 2016
Ximin Luo writes ("Re: [rb-general] GNU coding standards discussion"):
> Ian Jackson:
> > gnu-proc-disc is not a public list, and it contains a number of quite
> > difficult-to-talk-to people :-), so you may find it easier to use me
> > as a go-between.
>
> Thanks for taking on this role!
YW. Someone needs to do it and I seem to have enough feet in the
various camps that I might even make a useful difference...
> Ian Jackson:
> > For filenames and contents `identical' means an identical sequence
> > of bytes. So it does not mean only semantically equivalent
> > contents, such as an equivalent but different sequence of unicode
> > codepoints. (If the build/install is done in a non-UTF-8 locale,
> > filenames and contents may depend on the locale.)
>
> Why are they putting this in? This contradicts the definition of
> "input" above. If the output may "depend on the locale" then the
> locale is part of the definition of "input". We'd prefer not to
> define "input" like this...
Those are my words. They may be a bad idea. I will try to explain my
thinking:
Packages often produce documentation, and it is not unusual for them
to need to produce documentation containing non-ASCII characters. If
the LC_CTYPE in force for the build is not UTF-8, but some other
multibyte encoding, then the documentation should be produced in that
other encoding, surely ?
Inevitably this means that reproducibility of a source build in a
non-UTF-8 locale might depend on the locale.
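To make the point concrete, here is a minimal sketch (not taken from any
real build system) of a documentation step that writes its output in
whatever encoding the build locale asks for. The function name
render_doc and the choice of EUC-JP as the "other multibyte encoding"
are assumptions made purely for illustration.

```python
def render_doc(text, encoding):
    """Return the bytes a doc generator would write for `text`.

    `encoding` stands in for the charset implied by the build's LC_CTYPE.
    (Hypothetical helper, for illustration only.)
    """
    return text.encode(encoding)


# The same three Japanese characters, spelled with escapes to keep this ASCII.
doc = "\u65e5\u672c\u8a9e"

utf8_bytes = render_doc(doc, "utf-8")
eucjp_bytes = render_doc(doc, "euc_jp")

# Same codepoints, different byte sequences: the outputs are semantically
# equivalent but not byte-for-byte identical.
print(len(utf8_bytes), len(eucjp_bytes))
print(utf8_bytes == eucjp_bytes)
```

Both outputs decode back to the same text, which is exactly the
"semantically equivalent but not identical" situation the draft wording
tries to carve out.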
In practice I think all widely-distributed binaries should (and will)
be built with a UTF-8 LC_CTYPE, so this is not a problem for third
party verification of distro binaries.
Whether to handle this case, in the GNU coding standards, by a
definition of "input" or by restricting the set of aspects of the
output which are supposed to be identical, is just a difference in
wording, not a difference in meaning. The GNU coding standards
necessarily have a more complicated definition of `output' than a
distro build.
> > Any symlinks have identical targets, and the hardlink structure (if
> > any) will be identical.
> >
> > If the build/install is done with the same password and group
> > databases, the files will also have identical ownerships.
>
> Again, this contradicts the definition of "input" above.
Can you reframe that as a practical problem ?
I think this kind of exception is necessary because a build system's
"make install" target might need to do something like this:
  install -m 2755 -o root -g plugdev blah-helper $(DESTDIR)/usr/bin/
Obviously the resulting ownership of $(DESTDIR)/usr/bin/blah-helper
will depend on what the prevailing group database thinks about
`plugdev'. (In the worst case, the plugdev group might not exist at
all. I'm not sure what install(8) does in that case...)
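As a rough illustration of why the group database matters, the lookup
that turns a name like `plugdev' into a numeric gid can be sketched as
below. This is only a sketch using Python's standard grp module; the
helper name lookup_gid is made up, and it says nothing about what
install(8) itself does when the group is missing.

```python
import grp


def lookup_gid(group_name):
    """Return the numeric gid for `group_name` on this machine.

    Returns None when the local group database has no such group.
    (Hypothetical helper, for illustration only.)
    """
    try:
        return grp.getgrnam(group_name).gr_gid
    except KeyError:
        return None


# The answer depends entirely on the machine running the build: two
# systems may map the same name to different gids, or not know it at all.
print(lookup_gid("plugdev"))
```

Two builders whose databases disagree about `plugdev' would therefore
record different numeric ownerships for the same installed file.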
> > The modification timestamp (mtime) of each file or directory is
> > either:
> > (i) In each build/install, no earlier than the start of that build;
> > (ii) Identical across all builds/installs, and no later than the
> > age of the youngest input.
> > The ctime and atime are of no concern and may vary.
>
> Again, this contradicts the definition of "input" above.
Again, can you reframe that as a practical problem ? As a matter of
standards-exegesis it obviously can't contradict the definition of
`input' because it does not try to define `input' or anything that
`input' depends on.
My goal was to legitimise the common pattern of using install(8) to
install source files directly into $DESTDIR.
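For what it's worth, the two mtime cases in the draft text could be
checked roughly as follows. This is a sketch under my own reading of the
wording: I take "the age of the youngest input" to mean the mtime of the
most recently modified input, and the function and parameter names are
invented for the example.

```python
def mtimes_acceptable(mtimes, build_starts, newest_input_mtime):
    """Check one output file's mtimes across several builds.

    mtimes[i]          -- the file's mtime as produced by build i
    build_starts[i]    -- the time at which build i started
    newest_input_mtime -- mtime of the most recently modified input
    """
    # Case (i): in each build, the file is no older than that build's start.
    fresh = all(m >= s for m, s in zip(mtimes, build_starts))
    # Case (ii): the same mtime everywhere, and no later than the newest input.
    fixed = len(set(mtimes)) == 1 and mtimes[0] <= newest_input_mtime
    return fresh or fixed


# A source file copied with its timestamp preserved: case (ii).
print(mtimes_acceptable([100, 100], [500, 900], 100))
# A generated file stamped during each build: case (i).
print(mtimes_acceptable([510, 905], [500, 900], 100))
# Differing mtimes that predate their builds: neither case.
print(mtimes_acceptable([300, 400], [500, 900], 100))
```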
> I think a simpler option would be to avoid making comments about
> $DESTDIR and the filesystem structure, but instead to make comments
> about the tarball that would be created if you did that, probably
> with --clamp-mtime set to the last entry of the ChangeLog - GNU
> packages are all supposed to have this, right?
This might be a good alternative approach.
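To show what comparing tarballs rather than filesystem trees might look
like, here is a small sketch using Python's tarfile module. It imitates
the effect of clamping mtimes by hand; it is not GNU tar, and the helper
name build_tarball and the in-memory file mapping are assumptions for
the example.

```python
import io
import tarfile


def build_tarball(files, clamp_to):
    """Build a tarball in memory from `files`.

    files    -- mapping of member name -> (content bytes, mtime)
    clamp_to -- mtimes later than this are replaced by this value
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):           # fixed member order
            data, mtime = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = min(mtime, clamp_to)  # clamp, do not overwrite
            info.mode = 0o644
            info.uid = info.gid = 0            # no builder-specific owner
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Two builds that stamped the file at different times, both after the clamp.
first = build_tarball({"README": (b"hello\n", 2000)}, clamp_to=1000)
second = build_tarball({"README": (b"hello\n", 3000)}, clamp_to=1000)
print(first == second)
```

Fixing the member order and ownership fields as well as the mtimes is
what makes the two archives comparable byte for byte.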
> For more theoretical discussion on what reproducibility "means" you
> and/or the other GNU folks might like this doc I wrote:
I have encountered that before, yes, thanks.
Ian.