[rb-general] BUILD_PATH_PREFIX_MAP format spec, draft #1

Ximin Luo infinity0 at debian.org
Mon Jan 23 15:17:00 CET 2017

Ximin Luo:
> Chris Lamb:
>> Ximin Luo wrote:
>>>         case ':':
>>>         case '=':
>>>           return 0; // invalid, should have been escaped
>>>         case '%':
>>>           if (!(x[0] = *++src) || !(x[1] = *++src))
>>>             return 0; // invalid, past end of string
>>>           sscanf(x, "%2hhx", &c); // could be more efficient but my C-foo is low
>>>           if (errno != 0)
>>>             return 0; // invalid, not valid hex
>>>           *dest = c;
>>>           break;
>>>         default:
>>>           *dest = *src;
>> I'm afraid I'm still extremely bullish on this encoding question to the
>> point where it's even difficult to analyse the rest of the draft. :/
>> Whilst I can see the technical merit of allowing colons, requiring such
>> code just seems like an *awfully* bad smell for a specification we want
>> people to actually implement.
> Nobody has come up with even slightly-detailed explanations of why this is the case, and I don't accept vague opinions as holding any weight.
> Given everything else that needs to be done to implement the rest of the spec, this really does not increase the overall code complexity by much. The rest of the C code I didn't paste is 150+ lines to define the data structure, apply a splitting, and parse it for example.
> I'm not going to change this part of the spec - at least unless GCC absolutely refuse to implement it. So please everyone stop repeating the same thing. Say something different at least.
> ':' occurs in Windows drive letters and defining the separator character as os.pathsep (';' on Windows) would be annoying as well, especially when it comes to trying to transfer these values between platforms.

Daniel asked me to explain this in some more detail.

So far we've heard from a few people they'd prefer a "simpler" spec. I'm perfectly happy ignoring these opinions. Why? Because it won't matter.

On this mailing list we have more people on the "producer" side of things, they work on distributions and higher-level build tools that want to reproduce things. If you don't believe that {%=:} will appear in your paths, then simply don't .replace().replace().replace() when you're setting this envvar. There will be two situations:

1. your path actually doesn't contain those chars
   simple-split: works
   hexencode: works

2. your path does contain these chars
   simple-split: stuff will break silently
   hexencode: fails early, it's an invalid encoding

So the hexencode case is strictly better, and I don't particular care that these people "think it's too complex". Sorry if this is you.

OTOH, at RWS and elsewhere we have heard from several potential *consumers* of this variable (Basel and rust) that they'd prefer to avoid this restriction. They aren't on this mailing list to voice their opinions. However these are two data points that give me extra information. Note that these consumers are *already* benevolent towards reproducible builds. So I extrapolate, consumers that might be more skeptical or hostile towards R-B would probably have even stronger preferences towards avoiding these restrictions. As for GCC, Matthias Klose told me that they seem relatively neutral on the topic as long as the code is fine.

Of course to really test this, I have to submit actual real patches to consumers. So I am still open towards changing the encoding based on their feedback, but further discussion on this mailing list is realy not going to help me do that.

Also I would appreciated it if people stopped abusing the term "consensus". 3 people on a mailing list saying a few sentences without even going to the lengths to think through different cases in detail (which I have had the misfortune of spending most of my past week doing) is not "consensus". OTOH I do believe that everyone wants something "roughly like" this to pass.

So I'm going to continue with this, write up the code, tidy up the bad wording around "encoding" as Daniel pointed out, and go submit some actual patches to consumers.


GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE

More information about the rb-general mailing list