[rb-general] BUILD_PATH_PREFIX_MAP format spec, draft #1

Daniel Shahaf danielsh at apache.org
Sun Jan 22 19:11:22 CET 2017


Ximin Luo wrote on Sat, Jan 21, 2017 at 20:49:00 +0000:
> Daniel Shahaf:
> > [..] It would be better for data that is not encodeable in the
> > envvar's value to be transmitted out-of-band and the envvar
> > reconstructed to a conforming value by the recipient.
> 
> I don't understand what you mean by "data that is not encodeable in
> the envvar's value" nor what you mean by "transmitted out-of-band".
> Environment variable values don't get "transmitted" anywhere, you have
> to expressly read the value and turn it into a string. In which case
> you are not transmitting an envvar but a string. Then you should
> "postprocess" this string (to reuse my earlier terminology) if you
> expect it contains characters that aren't suited to your transmission
> protocol or your recipient.

Like I said, the phrasing was ambiguous in that it wasn't clear whether
the encoded value is passed in-band (in the envvar's value) or
out-of-band.  You've now clarified [in the trimmed portion of the quote]
it's out-of-band, so we're in agreement about what the behaviour should
be.

> >> On the other hand, if you expect that your paths do *not* contain such
> >> characters, e.g. if they only contain printable ASCII characters, then you
> >> could transmit the value of BUILD_PATH_PREFIX_MAP as-is.
> >>
> >> Rejected options
> >> ================
> >>
> >> - Any variant of backslash-escape, because it is annoying to implement in
> >>   higher-level languages. Backslash-escape is an encoding that is optimised for
> >>   being typed manually by humans, but I don't expect that will be a major
> >>   use-case for this encoding.
> > 
> > Both of these are your subjective opinions, not objective properties of
> > backslash escaping.
> > 
> > Regarding interactive use, I'd say that backslash encoding is superior to
> > URL-encoding since it doesn't involve looking up byte values, but that
> > neither of them is the holy grail of UX.
> > 
> > Regarding ease of implementation...
> > 
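> >     import re
> > 
> >     def decode(s):
> >         "Decode a $PATH-with-backslash-escaping-encoded value into a list"
> >         # Each match is one character, optionally preceded by a backslash;
> >         # re.DOTALL lets '.' match newlines too, so they are not dropped.
> >         return \
> >             "".join(
> >                 '\0' if x == ':' else x[-1]
> >                 for x in re.compile(r'[\\]?.', re.DOTALL).findall(s)
> >             ).split('\0')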
> > 
> 
> Sure, but I didn't want to make this dependent on regex either, since
> every language does those very slightly differently. It makes it more
> time-consuming to verify that these are exactly following a spec
> including handling all the error conditions, and that it's behaving
> the same way as another implementation in a different language.

I did not propose anything that would require comparing regexp
libraries.
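
To illustrate, here is a minimal sketch of backslash-escape decoding
written as a plain character loop, with no regexp library involved.
The names decode() and encode(), and the choice to keep a trailing lone
backslash as a literal backslash, are assumptions made for this example;
they are not taken from any draft of the spec.

```python
def decode(s):
    """Split a colon-separated string into a list, honouring backslash escapes."""
    items = []
    current = []
    chars = iter(s)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally.  A trailing lone backslash
            # is kept as a literal backslash (an assumption of this sketch).
            current.append(next(chars, "\\"))
        elif ch == ":":
            # An unescaped colon ends the current item.
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
    items.append("".join(current))
    return items


def encode(items):
    """Inverse of decode(): escape backslashes and colons, then join with ':'."""
    return ":".join(
        item.replace("\\", "\\\\").replace(":", "\\:") for item in items
    )
```

Porting this to another language only needs string iteration and
concatenation, so there are no regexp dialects to compare.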

Daniel

