[rb-general] rb-prefix-map spec: don't be as democratic to consumers

Daniel Shahaf danielsh at apache.org
Mon Mar 6 18:01:53 CET 2017


Ximin Luo wrote on Sun, Mar 05, 2017 at 19:30:00 +0000:
> Daniel Shahaf:
> > Hi,
> > 
> > https://github.com/infinity0/rb-prefix-map/blob/master/spec-main.rst#applying-the-decoded-structure
> > currently says:
> > 
> >> Consumers SHOULD implement one of the following algorithms:
> >>
> >> 1. …
> >>
> >> 2. …
> > 
> > The two algorithms, #1 and #2, have different semantics.
> > 
> > I think specifications MUST NOT allow consumers leeway to choose between
> > different semantics.  (Why?  Just imagine a world in which gcc
> > implemented #1 and clang implemented #2; in such a world, gcc and clang
> > wouldn't be interchangeable.)
> > 
> > So, I propose:
> > 
> > a. Specify precisely which semantics consumers MUST implement.  The spec
> > MAY recommend a particular algorithm, but MUST NOT give consumers choice
> > of semantics.
> > 
> > b. I have no opinion as to _which_ semantics consumers should implement;
> > whether the #1 semantics, the #2 semantics, or possibly something else.
> > 
> > Cheers,
> > 
> > Daniel
> > 
> > P.S. Use-case: imagine a compression program that links two different
> > libxz versions and allows choosing between them at runtime.  The source
> > of that program might include /foo/xz-42/main.c and /foo/xz-43/main.c
> > and its build system might then define BUILD_PATH_PREFIX_MAP="/foo/xz=xz".
> > If the build system then started using a different C compiler, the
> > program would no longer be reproducible.
> > 
> > P.P.S. Sorry for not catching this earlier, I was a bit busy for the
> > past few weeks.
> 
> Hey, no worries and thanks for the input.
> 
> I did think about this, but decided in the end against it because
> I didn't think it was necessary, and might hinder adoption in case
> a consumer really thought the specific algorithm was not so good -

Letting consumers choose semantics would hinder adoption by producers:
it would burden producers with figuring out how to set BUILD_PATH_PREFIX_MAP
to values that would be interpreted correctly regardless of which semantics
the consumer chooses.
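
To make that burden concrete, here is a minimal Python sketch using the
P.S. example quoted above.  The spec's two algorithms are elided in the
quote, so the two functions below are NOT taken from the spec: they are
hypothetical stand-ins that assume the semantics differ in whether
a prefix may match in the middle of a path component.  All function
names are invented for illustration, and the single map entry
"/foo/xz=xz" is passed as an already-split (src, dst) pair rather than
decoded from the variable.

```python
# Hypothetical sketch; neither mapping function is copied from the spec.

def map_string_prefix(path, src, dst):
    """Assumed semantics A: plain string-prefix match."""
    if path.startswith(src):
        return dst + path[len(src):]
    return path


def map_component_prefix(path, src, dst):
    """Assumed semantics B: the match must end at a '/' boundary."""
    if path == src or path.startswith(src + "/"):
        return dst + path[len(src):]
    return path


def interpreted_consistently(paths, src, dst):
    """True if both assumed semantics map every path identically."""
    return all(
        map_string_prefix(p, src, dst) == map_component_prefix(p, src, dst)
        for p in paths
    )


if __name__ == "__main__":
    paths = ["/foo/xz-42/main.c", "/foo/xz-43/main.c"]
    for p in paths:
        print(p,
              map_string_prefix(p, "/foo/xz", "xz"),
              map_component_prefix(p, "/foo/xz", "xz"))
    print(interpreted_consistently(paths, "/foo/xz", "xz"))
```

Under these assumptions the two functions disagree on both source
paths, so a producer would have to run a check along the lines of
interpreted_consistently() against every path in its tree, for every
pair of semantics the spec permits, before it could trust its own
setting.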

> and TBH I personally think (2) is better than (1) which GCC
> implements.  (2) was originally suggested in a Rust debuginfo thread
> that I had been participating in.

I also prefer (2) to (1), but that's beside the point.

(There are a couple of interesting bikesheds to have here regarding
paths that contain '.' or '..' elements or symlinks or doubled slashes
in the middle, but I'm not going to broach them. ;-))

> I think it's not necessary, because in general we don't expect two
> tools to generate the same output, nor even one tool at different
> versions to do that. So GCC and Clang implementing different
> algorithms don't matter, as long as they both read the variable in the
> same way, and their output is reproducible given the same reading of
> the same variable and the same build paths being used. Similarly with
> two different versions of libxz.
> 
> Does that make sense? Happy to talk about it more though, if you don't
> think my reasoning just now was good.

Let's just say I'm not convinced yet. :)

I think the use-case you're considering is that of somebody trying to
reproduce a particular binary.  Such a person would be using .buildinfo
to use the exact same tools and versions, since she needs bug-for-bug
compatibility in the compiler and surrounding tools.  So, as you say, in
this situation it doesn't matter whether clang and gcc are interchangeable.
In fact, I'd go further and say that in this situation, it doesn't
matter at all what the spec says — precisely because the tools and
versions are fixed.

I was thinking of a different use-case.  One use-case, the "two libxz's"
example, is that of an upstream project that wants to allow multiple
downstreams to build it reproducibly, regardless of which compiler those
downstreams use.  Each individual downstream would use either gcc or
clang, but the upstream project needs to support both.

The general pattern here is producers and consumers that are decoupled.
Another example of this is the package-agnostic build systems of various
OSes.  Taking FreeBSD for example, I can envision its package build
system, ports(7), setting BUILD_PATH_PREFIX_MAP to a value that depends
on $(WRKDIRPREFIX)¹ but doesn't depend on the particular port/package
being built, nor on whether $(CC) is gcc or clang.  The alternative
would be for ports(7) to have two different codepaths for setting
BUILD_PATH_PREFIX_MAP, and for each individual port to be responsible
for choosing the right codepath.  This alternative would have higher
ongoing maintenance costs (both in ports(7) and in the individual
ports), and moreover, I'm not at all sure what can be done when a single
port's build has two consumers — say, $(CC) and doxygen — that consume
the envvar using different interpretations.

The raison d'être of standards is interoperability.  The standard
should allow any producer to work with any consumer (including consumers
that didn't exist when the producer was written).

Makes sense?

Cheers,

Daniel

¹ $(WRKDIRPREFIX) is a build-time knob that sets $(SOURCE_ROOT_DIR).


