[rb-general] BUILD_PATH_PREFIX_MAP format spec, draft #1

Damien Martin-guillerez dmarting at google.com
Mon Jan 23 15:24:15 CET 2017

Voicing up since I am in that mailing-list :)

Yes, for Bazel we prefer to not have specification that restrict the
allowed paths. We are already facing difficulties with legacy reason that
prevent us from supporting random characters in path names and I would
rather not implement something that will add other restrictions. Like an
estimate colleague of mine said: make general case simple and corner cases
possible. The current draft with hex encode sounds good to me.

On Mon, Jan 23, 2017 at 3:17 PM Ximin Luo <infinity0 at debian.org> wrote:

> Ximin Luo:
> > Chris Lamb:
> >> Ximin Luo wrote:
> >>
> >>>         case ':':
> >>>         case '=':
> >>>           return 0; // invalid, should have been escaped
> >>>         case '%':
> >>>           if (!(x[0] = *++src) || !(x[1] = *++src))
> >>>             return 0; // invalid, past end of string
> >>>           sscanf(x, "%2hhx", &c); // could be more efficient but my
> C-foo is low
> >>>           if (errno != 0)
> >>>             return 0; // invalid, not valid hex
> >>>           *dest = c;
> >>>           break;
> >>>         default:
> >>>           *dest = *src;
> >>
> >> I'm afraid I'm still extremely bullish on this encoding question to the
> >> point where it's even difficult to analyse the rest of the draft. :/
> >>
> >> Whilst I can see the technical merit of allowing colons, requiring such
> >> code just seems like an *awfully* bad smell for a specification we want
> >> people to actually implement.
> >>
> >
> > Nobody has come up with even slightly-detailed explanations of why this
> is the case, and I don't accept vague opinions as holding any weight.
> >
> > Given everything else that needs to be done to implement the rest of the
> spec, this really does not increase the overall code complexity by much.
> The rest of the C code I didn't paste is 150+ lines to define the data
> structure, apply a splitting, and parse it for example.
> >
> > I'm not going to change this part of the spec - at least unless GCC
> absolutely refuse to implement it. So please everyone stop repeating the
> same thing. Say something different at least.
> >
> > ':' occurs in Windows drive letters and defining the separator character
> as os.pathsep (';' on Windows) would be annoying as well, especially when
> it comes to trying to transfer these values between platforms.
> >
> Daniel asked me to explain this in some more detail.
> So far we've heard from a few people they'd prefer a "simpler" spec. I'm
> perfectly happy ignoring these opinions. Why? Because it won't matter.
> On this mailing list we have more people on the "producer" side of things,
> they work on distributions and higher-level build tools that want to
> reproduce things. If you don't believe that {%=:} will appear in your
> paths, then simply don't .replace().replace().replace() when you're setting
> this envvar. There will be two situations:
> 1. your path actually doesn't contain those chars
>    simple-split: works
>    hexencode: works
> 2. your path does contain these chars
>    simple-split: stuff will break silently
>    hexencode: fails early, it's an invalid encoding
> So the hexencode case is strictly better, and I don't particular care that
> these people "think it's too complex". Sorry if this is you.
> OTOH, at RWS and elsewhere we have heard from several potential
> *consumers* of this variable (Basel and rust) that they'd prefer to avoid
> this restriction. They aren't on this mailing list to voice their opinions.
> However these are two data points that give me extra information. Note that
> these consumers are *already* benevolent towards reproducible builds. So I
> extrapolate, consumers that might be more skeptical or hostile towards R-B
> would probably have even stronger preferences towards avoiding these
> restrictions. As for GCC, Matthias Klose told me that they seem relatively
> neutral on the topic as long as the code is fine.
> Of course to really test this, I have to submit actual real patches to
> consumers. So I am still open towards changing the encoding based on their
> feedback, but further discussion on this mailing list is realy not going to
> help me do that.
> Also I would appreciated it if people stopped abusing the term
> "consensus". 3 people on a mailing list saying a few sentences without even
> going to the lengths to think through different cases in detail (which I
> have had the misfortune of spending most of my past week doing) is not
> "consensus". OTOH I do believe that everyone wants something "roughly like"
> this to pass.
> So I'm going to continue with this, write up the code, tidy up the bad
> wording around "encoding" as Daniel pointed out, and go submit some actual
> patches to consumers.
> X
> --
> GPG: ed25519/56034877E1F87C35
> GPG: rsa4096/1318EFAC5FBBDBCE
> https://github.com/infinity0/pubkeys.git
> _______________________________________________
> rb-general at lists.reproducible-builds.org mailing list
> To change your subscription options, visit
> https://lists.reproducible-builds.org/listinfo/rb-general.
> To unsubscribe, send an email to
> rb-general-unsubscribe at lists.reproducible-builds.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.reproducible-builds.org/pipermail/rb-general/attachments/20170123/25f31df9/attachment.html>

More information about the rb-general mailing list