digitalmars.D.learn - RFC on SlidingSplitter Range

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (17/17) Oct 03 2014 As a follow up to

John Colvin (5/22) Oct 03 2014 First thing to to do with any range, once written, it to get it
monarch_dodra (15/16) Oct 03 2014 As a quick comment, your definition of "moveFront" is not what

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (4/13) Oct 03 2014 Ok. Fixed know.

monarch_dodra (92/105) Oct 03 2014 Well, yes and no. You are still providing a moveFront that does

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (25/63) Oct 03 2014 Ok, I removed it and added a test using std.range.moveFront

monarch_dodra (29/55) Oct 03 2014 The idea is to try to keep as much code in common as possible.

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (6/20) Oct 03 2014 Superb!

monarch_dodra (6/9) Oct 03 2014 Be it C, C++ or D, "pre increment is maybe faster, and is never

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (2/7) Oct 03 2014 Ok. Thanks.

Phil (10/13) Oct 03 2014 I assume the reason is the same as in C++. As users can provide

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (21/23) Oct 03 2014 Could you please take a look again at

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (14/15) Oct 03 2014 I made another update at

monarch_dodra (2/17) Oct 03 2014 It must be resolving to std.range.stride. You want std.utf.stride.

monarch_dodra (10/30) Oct 03 2014 Sounds like your range isn't a range. Did you check that:

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (31/32) Oct 03 2014 std.utf.stride solved all but one thing...namely that

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (6/7) Oct 03 2014 I cracked it :)

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (2/3) Oct 03 2014 That's a really cool syntax, btw. I got to remember that.

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

As a follow up to

http://forum.dlang.org/thread/dndicafxfubzmndehzux forum.dlang.org

I've begun implementing a new range, I currently call, 
SlidingSplitter at

https://github.com/nordlow/justd/blob/master/range_ex.d#L12

I would now very much like comment/reflections especially with 
regards to

- naming of range
- mutability (inout)
- and how to correct implement support for foreach which 
currently doesn't work
- how to infer range-powers of SlidingSplitter from template 
parameter R
- and other possible range primitives I may have forgotten :)
- if enough people are interested I might try adding this to 
std.range

Destroy, please!

Oct 03 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Friday, 3 October 2014 at 15:22:06 UTC, Nordlöw wrote:
 As a follow up to

 http://forum.dlang.org/thread/dndicafxfubzmndehzux forum.dlang.org

 I've begun implementing a new range, I currently call, 
 SlidingSplitter at

 https://github.com/nordlow/justd/blob/master/range_ex.d#L12

 I would now very much like comment/reflections especially with 
 regards to

 - naming of range
 - mutability (inout)
 - and how to correct implement support for foreach which 
 currently doesn't work
 - how to infer range-powers of SlidingSplitter from template 
 parameter R
 - and other possible range primitives I may have forgotten :)
 - if enough people are interested I might try adding this to 
 std.range

 Destroy, please!

First thing to to do with any range, once written, it to get it 
to pass the compile-time checks in std.range wherever 
appropriate. isInputRange, isForwardRange, isRandomAccessRange 
etc.

Oct 03 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 3 October 2014 at 15:22:06 UTC, Nordlöw wrote:
 Destroy, please!

As a quick comment, your definition of "moveFront" is not what 
phobos understands with "moveFront":
https://github.com/D-Programming-Language/phobos/blob/7914fa31cb3b53f4e50421399f2b99d2012e8031/std/range.d#L8267

Namelly, the idea of moveFront, is simply that it takes the front 
and std.algorithm.move's it.

If anything, I'd have expected you to provide something returns 
the popped element. What you do pops an element, and then returns 
the *next* one. What good is that?

Also, what you want to check is "hasSlicing", which is more 
powerful than "isRA". Speaking of which, if you don't have 
"isRA", it looks like your range errors out (no "front").

Your sliding splitter does not handle narrow strings. I'd have 
thought that was the original goal. Well, it is better to not 
support it at all of course, rather than doing it wrong :)

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 16:32:24 UTC, monarch_dodra wrote:
 If anything, I'd have expected you to provide something returns 
 the popped element. What you do pops an element, and then 
 returns the *next* one. What good is that?

My mistake. It's fixed now.

 Also, what you want to check is "hasSlicing", which is more 
 powerful than "isRA". Speaking of which, if you don't have 
 "isRA", it looks like your range errors out (no "front").

Ok. Fixed know.

 Your sliding splitter does not handle narrow strings. I'd have 
 thought that was the original goal. Well, it is better to not 
 support it at all of course, rather than doing it wrong :)

That's part of episode 2, for now ;)

Oct 03 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 3 October 2014 at 17:06:41 UTC, Nordlöw wrote:
 On Friday, 3 October 2014 at 16:32:24 UTC, monarch_dodra wrote:
 If anything, I'd have expected you to provide something 
 returns the popped element. What you do pops an element, and 
 then returns the *next* one. What good is that?

 My mistake. It's fixed now.

Well, yes and no. You are still providing a moveFront that does 
not conform to what the range interface expects. EG:

auto app = appender();
auto myRange = slidingSplitter([1, 2, 3]);
for ( ; !myRange.empty ; myRange.popFront() )
   app.put(myRange.moveFront());

As you can see from this code snippet, moveFront is not expected 
to pop. At all. As a matter of fact, this will error out during 
runtime, as the loop will attempt a pop on an empty range.

IMO, you should just leave it out.

 Also, what you want to check is "hasSlicing", which is more 
 powerful than "isRA". Speaking of which, if you don't have 
 "isRA", it looks like your range errors out (no "front").

 Ok. Fixed know.

There's still the issue that if you don't have slicing, then your 
range doesn't have "front" at all. You might as well just make it 
a general template restriction.

 Your sliding splitter does not handle narrow strings. I'd have 
 thought that was the original goal. Well, it is better to not 
 support it at all of course, rather than doing it wrong :)

 That's part of episode 2, for now ;)

Cool.

More things I saw:
Don't make opIndex const. You have no guarantee R.opindex is 
const.
Also, "opSlice()" is not a range primitive. You can implement it, 
it's basically just a no-op. The one that's important is 
"opSlice(lhs, rhs)".

Also, no point in returning an "auto ref" from "slidingSplitter", 
you *know* it's an rvalue.

If your implementation use two ranges that you slice "on the 
fly", then you can trivially support strings, thanks to popFront.

I threw this together. I left out checks for infinity in favor of 
brevity:

//----
import std.range, std.stdio;
import std.typecons: Tuple, tuple;

struct SlidingSplitter(R)
if (hasSlicing!R || isSomeString!R)
{
     this(R)(R data)
     {
         _original = data;
         _right = data.save;
     }

     auto front()  property
     {
         return tuple(_original[0 .. $ - _right.length], _right);
     }

     void popFront()
     {
         if (_right.empty)
             _original = _right;
         else
             _right.popFront();
     }

     bool empty() const { return _original.empty; }

     static if (hasSlicing!R)
     {
         auto opIndex(size_t i)
         {
             return tuple(
                 _original[0 .. i + $ - _right.length],
                 _right[i .. $]);
         }

         size_t length() { return _right.length + 1; }
     }

     private R _original;
     private R _right;
}

auto slidingSplitter(R)(R data)
{
     return SlidingSplitter!R(data);
}

void main()
{
     {
         auto r = slidingSplitter([1, 2, 3]);
         writefln("r.length: %s", r.length);
         writefln("r[1]: %s", r[1]);
         writefln("r[2]: %s", r[2]);
     }
     {
         writefln("%(%s\n%)", slidingSplitter("Nordlöw"));
     }
}
//----

Output:

r.length: 4
r[1]: Tuple!(int[], int[])([1], [2, 3])
r[2]: Tuple!(int[], int[])([1, 2], [3])
Tuple!(string, string)("", "Nordlöw")
Tuple!(string, string)("N", "ordlöw")
Tuple!(string, string)("No", "rdlöw")
Tuple!(string, string)("Nor", "dlöw")
Tuple!(string, string)("Nord", "löw")
Tuple!(string, string)("Nordl", "öw")
Tuple!(string, string)("Nordlö", "w")
Tuple!(string, string)("Nordlöw", "")

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
 My mistake. It's fixed now.

 Well, yes and no. You are still providing a moveFront that does 
 not conform to what the range interface expects. EG:

 auto app = appender();
 auto myRange = slidingSplitter([1, 2, 3]);
 for ( ; !myRange.empty ; myRange.popFront() )
   app.put(myRange.moveFront());

 As you can see from this code snippet, moveFront is not 
 expected to pop. At all. As a matter of fact, this will error 
 out during runtime, as the loop will attempt a pop on an empty 
 range.

 IMO, you should just leave it out.

Ok, I removed it and added a test using std.range.moveFront 
instead.

 Also, what you want to check is "hasSlicing", which is more 
 powerful than "isRA". Speaking of which, if you don't have 
 "isRA", it looks like your range errors out (no "front").



Ok, I use hasSlicing instead.

 There's still the issue that if you don't have slicing, then 
 your range doesn't have "front" at all. You might as well just 
 make it a general template restriction.

 Your sliding splitter does not handle narrow strings. I'd 
 have thought that was the original goal. Well, it is better 
 to not support it at all of course, rather than doing it 
 wrong :)

 That's part of episode 2, for now ;)

 Cool.

 More things I saw:
 Don't make opIndex const. You have no guarantee R.opindex is 
 const.
 Also, "opSlice()" is not a range primitive. You can implement 
 it, it's basically just a no-op. The one that's important is 
 "opSlice(lhs, rhs)".

 Also, no point in returning an "auto ref" from 
 "slidingSplitter", you *know* it's an rvalue.

A typo. I know about the difference between auto and auto ref ;)

 If your implementation use two ranges that you slice "on the 
 fly", then you can trivially support strings, thanks to 
 popFront.

Very clever. That's what I wanted.

 I threw this together.

Superb!

It could be motivated to use

static if (isRandomAccessRange!R)
{
     // my solution for member functions...
     private R _data;
     private size_t _index;
else // assumes R is either string or wstring
{
     // your solution for member functions...
     private R _original;
     private R _right;
}

to save space in the RA-case, right?

Is isRandomAccessRange!R the correct way to check for this?

Nice unittest ;)

 I left out checks for infinity in favor of brevity:

Last thing for now:
- What do you mean by this?
- Do I have to indicate that this range is not infinite?

Oct 03 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 3 October 2014 at 19:12:54 UTC, Nordlöw wrote:
 On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
 If your implementation use two ranges that you slice "on the 
 fly", then you can trivially support strings, thanks to 
 popFront.

 Very clever. That's what I wanted.

 I threw this together.

 Superb!

 It could be motivated to use

 static if (isRandomAccessRange!R)
 {
     // my solution for member functions...
     private R _data;
     private size_t _index;
 else // assumes R is either string or wstring
 {
     // your solution for member functions...
     private R _original;
     private R _right;
 }

 to save space in the RA-case, right?

The idea is to try to keep as much code in common as possible. 
You can keep your version, provided you write this for popFront:

     void popFront()
     {
         if (_index < _data.length)
         {
             static if (isNarrowString!R)
                 _index += stride(_data, _index)
             else
                 ++_index;
         }
     }

 Is isRandomAccessRange!R the correct way to check for this?

Either that or isNarrowString. given the restrictions, both are 
correct.

 Nice unittest ;)

Thanks ;)

 I left out checks for infinity in favor of brevity:

 Last thing for now:
 - What do you mean by this?

I mean that this would not work if you passed in a "repeat(5)" 
even though repeat verifies RA and hasSlicing. That's because 
there's a point where you check for length.

The "correct" restrictions should have been:
if (isSomeString || hasSlicing && !isInfinite)

Note that this works regardless of operator precedence.

 - Do I have to indicate that this range is not infinite?

You indicate a range is infinite with an "enum empty = false;". 
By having a run-time "empty", you are implicitly stating you are 
not infinite.

...


That said...
You could perfectly well support infinite ranges, with the 
correct static ifs. You'd produce an infinite sliding splitter.

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 19:31:30 UTC, monarch_dodra wrote:
 The idea is to try to keep as much code in common as possible. 
 You can keep your version, provided you write this for popFront:

     void popFront()
     {
         if (_index < _data.length)
         {
             static if (isNarrowString!R)
                 _index += stride(_data, _index)
             else
                 ++_index;
         }
     }

Superb!

Is prefix ++ preferred in D because of some specific reason? I 
recall it, for some containers/iterators, gives smaller/faster 
codegen in C++?

 You could perfectly well support infinite ranges, with the 
 correct static ifs. You'd produce an infinite sliding splitter.

That's for episode 3 ;)

Oct 03 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 3 October 2014 at 19:46:10 UTC, Nordlöw wrote:
 Is prefix ++ preferred in D because of some specific reason? I 
 recall it, for some containers/iterators, gives smaller/faster 
 codegen in C++?

Be it C, C++ or D, "pre increment is maybe faster, and is never 
slower".

So as a rule of thumb, unless you should *specifically* require 
post increment, pre-increment is to be prefered, though in 95% of 
the cases, it results in the same code.

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 19:57:31 UTC, monarch_dodra wrote:
 Be it C, C++ or D, "pre increment is maybe faster, and is never 
 slower".

 So as a rule of thumb, unless you should *specifically* require 
 post increment, pre-increment is to be prefered, though in 95% 
 of the cases, it results in the same code.

Ok. Thanks.

Oct 03 2014

"Phil" <phil.j.ellison gmail.com> writes:

 Is prefix ++ preferred in D because of some specific reason? I 
 recall it, for some containers/iterators, gives smaller/faster 
 codegen in C++?

I assume the reason is the same as in C++. As users can provide 
their own implementations of pre and postfix incrementing, the 
compiler can't assume that x++ always means "copy x then 
increment x and return the copy". It could have arbitrary side 
effects. Therefore for user defined types x++ can't be replaced 
with ++x by the compiler even if you're not using the return 
value. The compiler _can_ do this for primitive types, but as 
monarch_dodra says, if you get in the habit of using ++x unless 
you explicitly need x++ then you're never going to suffer and can 
sometimes get a speed win.

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
 I threw this together. I left out checks for infinity in favor 
 of brevity:

Could you please take a look again at

https://github.com/nordlow/justd/blob/master/range_ex.d#L15

I added all the stuff we talked about.

Note that I had to tweak empty() a bit. I don't know if I got 
right. Could you check?

Further I can't get string support to work:

     writefln("%(%s\n%)", slidingSplitter("Nordlöw"));

errors as

std.format.FormatException /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos
std/format.d(2591): 
Expected '%s' format specifier for type 'SlidingSplitter!string'

and

     foreach (e; name)
     {
         version(show) writeln(e);
     }

errors as

range_ex.d(132,5): Error: invalid foreach aggregate name, define 
opApply(), range primitives, or use .tupleof

which diagnostics I btw is responsible for :)

Could you please check?

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 20:15:33 UTC, Nordlöw wrote:
 Could you please take a look again at

I made another update at

https://github.com/nordlow/justd/blob/master/range_ex.d#L15

I had forgotten to move front and popFront out of static if 
hasSlicing!R

Now

     auto name = slidingSplitter("Nordlöw");

errors

range_ex.d(41,24): Error: incompatible types for ((this._index) 
+= (stride(this._data, this._index))): 'ulong' and 'Result'
range_ex.d(88,12): Error: template instance 
range_ex.SlidingSplitter!string error instantiating
range_ex.d(131,32):        instantiated from here: 
slidingSplitter!string

Oct 03 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 3 October 2014 at 20:28:24 UTC, Nordlöw wrote:
 On Friday, 3 October 2014 at 20:15:33 UTC, Nordlöw wrote:
 Could you please take a look again at

 I made another update at

 https://github.com/nordlow/justd/blob/master/range_ex.d#L15

 I had forgotten to move front and popFront out of static if 
 hasSlicing!R

 Now

     auto name = slidingSplitter("Nordlöw");

 errors

 range_ex.d(41,24): Error: incompatible types for ((this._index) 
 += (stride(this._data, this._index))): 'ulong' and 'Result'
 range_ex.d(88,12): Error: template instance 
 range_ex.SlidingSplitter!string error instantiating
 range_ex.d(131,32):        instantiated from here: 
 slidingSplitter!string

It must be resolving to std.range.stride. You want std.utf.stride.

Oct 03 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 3 October 2014 at 20:15:33 UTC, Nordlöw wrote:
 Note that I had to tweak empty() a bit. I don't know if I got 
 right. Could you check?

Sounds about right, but I didn't really look.

 Further I can't get string support to work:

     writefln("%(%s\n%)", slidingSplitter("Nordlöw"));

 errors as

 std.format.FormatException /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos
std/format.d(2591): 
 Expected '%s' format specifier for type 'SlidingSplitter!string'

 and

     foreach (e; name)
     {
         version(show) writeln(e);
     }

 errors as

 range_ex.d(132,5): Error: invalid foreach aggregate name, 
 define opApply(), range primitives, or use .tupleof

 which diagnostics I btw is responsible for :)

 Could you please check?

Sounds like your range isn't a range. Did you check that:
isInputRange!(SlidingSplitter!string)
?

As mentioned in the first reply, it's a very important check, as 
it's very easy to get wrong.

For example, if you forget to mark front and empty as properties, 
then you don't have the correct range interface I believe.

 On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
        writefln("%(%s\n%)", slidingSplitter("Nordlöw"));

 That's a really cool syntax, btw. I got to remember that.

Yup. Love it. Use it.

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 21:17:54 UTC, monarch_dodra wrote:
 Sounds about right, but I didn't really look.

std.utf.stride solved all but one thing...namely that

https://github.com/nordlow/justd/blob/master/range_ex.d#L131

prints all but last

Tuple!(string, string)("", "Nordlöw")
Tuple!(string, string)("N", "ordlöw")
Tuple!(string, string)("No", "rdlöw")
Tuple!(string, string)("Nor", "dlöw")
Tuple!(string, string)("Nord", "löw")
Tuple!(string, string)("Nordl", "öw")
Tuple!(string, string)("Nordlö", "w")

I'm guessing empty() is the problem but changing

     _data.length == _index

to

     _data.length < _index

instead leads to infinite iteration

Tuple!(string, string)("", "Nordlöw")
Tuple!(string, string)("N", "ordlöw")
Tuple!(string, string)("No", "rdlöw")
Tuple!(string, string)("Nor", "dlöw")
Tuple!(string, string)("Nord", "löw")
Tuple!(string, string)("Nordl", "öw")
Tuple!(string, string)("Nordlö", "w")
Tuple!(string, string)("Nordlöw", "")
Tuple!(string, string)("Nordlöw", "")
Tuple!(string, string)("Nordlöw", "")
Tuple!(string, string)("Nordlöw", "")
Tuple!(string, string)("Nordlöw", "")
Tuple!(string, string)("Nordlöw", "")

...

Why?

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 21:34:33 UTC, Nordlöw wrote:
 Why?

I cracked it :)

See: https://github.com/nordlow/justd/blob/master/range_ex.d#L36

Thanks for all your help!

Now I know a lot more about ranges.

Do you think anybody is interested in including this in Phobos?

Oct 03 2014

=?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:

On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
         writefln("%(%s\n%)", slidingSplitter("Nordlöw"));

That's a really cool syntax, btw. I got to remember that.

Oct 03 2014

D Programming

C/C++ Programming

Other

digitalmars.D.learn - RFC on SlidingSplitter Range