[rb-general] BUILD_PATH_PREFIX_MAP code examples and test cases

Daniel Shahaf danielsh at apache.org
Tue Jan 31 04:08:50 CET 2017


Ximin Luo wrote on Mon, Jan 30, 2017 at 16:24:00 +0000:
> I made the change because one of the aims is to make decode() easy to
> implement both in C as well as higher-level languages. The previous
> mapping had one major issue:
> 
> % -> %%
> = -> %+
> : -> %;
> 
> This means that "the easy way" of doing it in C (single-pass
> left-to-right parsing) gives different results from "the easy way" of
> doing it in higher-level languages, for the string sequence "%%+".
> This is not a possible output of a good encoder; however I don't want
> to take chances and even "invalid" encodings should still be decoded
> in the same way by all implementations.
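
To make the divergence concrete, here is a minimal Python sketch of the
two approaches applied to "%%+" under the old mapping. The function names
are hypothetical, and the chained-replace order is only one assumed
version of "the easy way" in a higher-level language; it is not taken
from the rb-prefix-map repository.

```python
# Old mapping from the quoted message: % -> %%, = -> %+, : -> %;
OLD_ESCAPES = {"%%": "%", "%+": "=", "%;": ":"}


def decode_single_pass(s):
    """Left-to-right scan, roughly what a simple C loop would do."""
    out = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in OLD_ESCAPES:
            # Consume both characters of a recognised escape.
            out.append(OLD_ESCAPES[pair])
            i += 2
        else:
            # Anything else is copied through unchanged.
            out.append(s[i])
            i += 1
    return "".join(out)


def decode_chained_replace(s):
    """Chained str.replace calls, unescaping '%%' last (assumed order)."""
    return s.replace("%;", ":").replace("%+", "=").replace("%%", "%")


print(decode_single_pass("%%+"))      # prints "%+"
print(decode_chained_replace("%%+"))  # prints "%="
```

The single pass consumes "%%" first and leaves "+" alone, whereas the
chained replace matches "%+" inside the string before it ever looks at
"%%", so the two decoders disagree on the same input.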

I agree that we should avoid '%%' (and '%:' and '%=').  +1 to %p/%e/%c
then.

I also agree that all implementations should be required to behave the
same way on strings that can't be produced by the encoder, such as '%%+'
or '%x'.  However, my first preference would be to specify that
consumers <rfc2119>must</rfc2119> reject such strings with an error
message.  That is: I would require consumers to be strict in what they
accept.
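
A strict consumer along those lines could look like the sketch below. It
assumes the %p/%e/%c proposal means %p -> %, %e -> = and %c -> :, which
is my reading of the letters and not something spelled out in this
thread; decode_strict is a hypothetical name, and the function handles a
single already-split component only.

```python
# Assumed meaning of the %p/%e/%c proposal (not confirmed in this thread):
# %p -> '%', %e -> '=', %c -> ':'
ESCAPES = {"p": "%", "e": "=", "c": ":"}


def decode_strict(s):
    """Decode one component, raising ValueError on any invalid escape."""
    out = []
    i = 0
    while i < len(s):
        ch = s[i]
        if ch != "%":
            # Ordinary character: copy through.
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(s):
            raise ValueError("dangling '%' at end of input")
        code = s[i + 1]
        if code not in ESCAPES:
            # Strings such as '%%+' or '%x' end up here.
            raise ValueError("invalid escape %r at offset %d" % ("%" + code, i))
        out.append(ESCAPES[code])
        i += 2
    return "".join(out)
```

With this, decode_strict("a%pb%ec%cd") returns "a%b=c:d", while "%%+",
"%x" and a trailing "%" all raise ValueError instead of being decoded
in some implementation-specific way.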

Have we run this encoding past Windows folks?  The character '%' is
special to batch scripts, so it would be good to ensure Windows build
scripts can handle the encoding scheme we're discussing here.  (Or if
they can't, to use «\» or «@» or whatever else instead of «%».)  We
don't have to check this right now, but we should check this before
releasing v1.0 of the spec.

> https://github.com/infinity0/rb-prefix-map/tree/master/consume/

I see there are test cases there.  It would be great if we could
incorporate them into the 1.0 specification.

Cheers,

Daniel

