digitalmars.D - splitter semantics

"monarch_dodra" <monarchdodra gmail.com> writes:
I was trying to improve std.algorithm's "splitter" functions.

This was mostly relaxing/documenting the restrictions, and adding 
a new implementation that supports simple forward ranges and/or 
infinite ranges (the current implementation is limited to ranges 
with random access + hasLength + hasSlicing).

Anyways, I noticed there is an (undocumented) splitter(pred) 
inside std.algorithm. Problem: its semantics are currently 
unlike those of any other splitter.

Contrary to the other std.algorithm.splitter overloads, it joins 
runs of terminators together (avoiding empty tokens). Also, 
contrary to the other std.algorithm.splitter overloads, if the 
last element satisfies isTerminator, it does not create a 
trailing empty token. It *does*, however, create a leading empty 
token.

It turns out that one of the only uses of this function is to 
implement std.array.splitter(someString), by simply calling 
std.algorithm.splitter!isWhite.

What makes this even stranger is that the lazy 
"std.array.splitter(someString)" *ALSO* deviates from the eager 
"std.array.split(someString)" (which it is supposed to emulate), 
in that split will simply NOT create any empty tokens, not even 
in front.

TLDR:
1. " hello  ".split() => ["hello"]; //std.string
2. " hello  ".splitter() => ["", "hello"];         //std.string
3. " hello  ".splitter!isWhite() => ["", "hello"]; //std.algo
4. " hello  ".splitter(' ') => ["", "hello", "", ""]; //std.algo

As you can see, 2 and 3 use semantics found nowhere else. 
These semantics (IMO) make no sense.
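
To make the difference between cases 1 and 4 concrete, here is a 
minimal sketch of the two consistent behaviors, written by hand so 
that it does not depend on what any particular Phobos release does. 
The names splitKeepEmpty and splitSkipEmpty are hypothetical helpers 
invented for this illustration, not Phobos functions, and the sketch 
only handles ASCII terminators.

```d
import std.ascii : isWhite;

// Hypothetical helper (not in Phobos): every terminator closes a token,
// so leading, trailing and repeated terminators all yield empty tokens.
// This mirrors case 4 above. ASCII terminators only.
string[] splitKeepEmpty(alias isTerminator)(string s)
{
    string[] tokens;
    size_t start = 0;
    foreach (i, char c; s)
    {
        if (isTerminator(c))
        {
            tokens ~= s[start .. i];
            start = i + 1;
        }
    }
    tokens ~= s[start .. $]; // the final token, possibly empty
    return tokens;
}

// Hypothetical helper (not in Phobos): empty tokens are never emitted,
// whether leading, trailing or in the middle. This mirrors case 1 above.
string[] splitSkipEmpty(alias isTerminator)(string s)
{
    string[] tokens;
    size_t start = 0;
    foreach (i, char c; s)
    {
        if (isTerminator(c))
        {
            if (i > start)
                tokens ~= s[start .. i];
            start = i + 1;
        }
    }
    if (start < s.length)
        tokens ~= s[start .. $];
    return tokens;
}

unittest
{
    assert(splitKeepEmpty!isWhite(" hello  ") == ["", "hello", "", ""]);
    assert(splitSkipEmpty!isWhite(" hello  ") == ["hello"]);
}
```

The behavior in cases 2 and 3 is a mix of the two: it skips empty 
tokens in the middle and at the end like splitSkipEmpty, but keeps 
the leading one like splitKeepEmpty.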

I'd like to propose a clear split between std.algorithm's 
splitter and std.array's:
* std.algorithm.splitter!pred should have the same behavior as 
the other splitters in the module.
* std.array.splitter(string) should have the same behavior as 
std.array.split(string).

The way I see it: std.algorithm.splitter!predicate is not 
documented, so we can do anything we want with it.

As for std.array.splitter(string): Its behavior just doesn't 
make any sense. So we might as well fix it to parallel 
std.array.split's.
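
As a rough illustration of what "parallel std.array.split's" could 
mean for the lazy version, here is a sketch of a range that yields 
whitespace-separated words and never produces an empty token. 
LazyWords is a hypothetical name made up for this example, it is 
not the Phobos implementation, and it assumes ASCII whitespace 
only.

```d
import std.ascii : isWhite;

// Hypothetical lazy range (not Phobos code): walks the string one word
// at a time without allocating, and never yields an empty token.
struct LazyWords
{
    private string rest; // unconsumed input; starts at a word or is empty
    private size_t len;  // length of the current word inside `rest`

    this(string s)
    {
        rest = s;
        advance();
    }

    @property bool empty() { return rest.length == 0; }
    @property string front() { return rest[0 .. len]; }

    void popFront()
    {
        rest = rest[len .. $];
        advance();
    }

    // Skip leading whitespace, then measure the next word.
    private void advance()
    {
        size_t i = 0;
        while (i < rest.length && isWhite(rest[i]))
            ++i;
        rest = rest[i .. $];
        len = 0;
        while (len < rest.length && !isWhite(rest[len]))
            ++len;
    }
}

unittest
{
    string[] got;
    foreach (w; LazyWords(" hello  world "))
        got ~= w;
    assert(got == ["hello", "world"]);
}
```

Because whitespace is skipped before each word is measured, a 
string that is empty or all whitespace gives an empty range, which 
matches what split() does in case 1.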

Thoughts? Feedback?

--------
Related note: Why are:
"auto splitter(C)(C[] s) if(isSomeString!(C[]))"
"S[] split(S)(S s) if (isSomeString!S)"

inside std.array anyways? Their behavior is string specific, with 
strictly no raw array equivalent, which places them squarely in 
std.string, no? Along with splitLines et al, for example.
Oct 22 2012
"bearophile" <bearophileHUGS lycos.com> writes:
monarch_dodra:

 1. " hello  ".split() => ["hello"]; //std.string
 2. " hello  ".splitter() => ["", "hello"];         //std.string
 3. " hello  ".splitter!isWhite() => ["", "hello"]; //std.algo
 4. " hello  ".splitter(' ') => ["", "hello", "", ""]; //std.algo

There is a bug report on this stuff (8013). Please model them on 
the output of std.string where possible.

Bye,
bearophile
Oct 22 2012