www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - RFC on SlidingSplitter Range

reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
As a follow up to

http://forum.dlang.org/thread/dndicafxfubzmndehzux forum.dlang.org

I've begun implementing a new range, I currently call, 
SlidingSplitter at

https://github.com/nordlow/justd/blob/master/range_ex.d#L12

I would now very much like comment/reflections especially with 
regards to

- naming of range
- mutability (inout)
- and how to correct implement support for foreach which 
currently doesn't work
- how to infer range-powers of SlidingSplitter from template 
parameter R
- and other possible range primitives I may have forgotten :)
- if enough people are interested I might try adding this to 
std.range

Destroy, please!
Oct 03 2014
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 3 October 2014 at 15:22:06 UTC, Nordlöw wrote:
 As a follow up to

 http://forum.dlang.org/thread/dndicafxfubzmndehzux forum.dlang.org

 I've begun implementing a new range, I currently call, 
 SlidingSplitter at

 https://github.com/nordlow/justd/blob/master/range_ex.d#L12

 I would now very much like comment/reflections especially with 
 regards to

 - naming of range
 - mutability (inout)
 - and how to correct implement support for foreach which 
 currently doesn't work
 - how to infer range-powers of SlidingSplitter from template 
 parameter R
 - and other possible range primitives I may have forgotten :)
 - if enough people are interested I might try adding this to 
 std.range

 Destroy, please!
First thing to to do with any range, once written, it to get it to pass the compile-time checks in std.range wherever appropriate. isInputRange, isForwardRange, isRandomAccessRange etc.
Oct 03 2014
prev sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 3 October 2014 at 15:22:06 UTC, Nordlöw wrote:
 Destroy, please!
As a quick comment, your definition of "moveFront" is not what phobos understands with "moveFront": https://github.com/D-Programming-Language/phobos/blob/7914fa31cb3b53f4e50421399f2b99d2012e8031/std/range.d#L8267 Namelly, the idea of moveFront, is simply that it takes the front and std.algorithm.move's it. If anything, I'd have expected you to provide something returns the popped element. What you do pops an element, and then returns the *next* one. What good is that? Also, what you want to check is "hasSlicing", which is more powerful than "isRA". Speaking of which, if you don't have "isRA", it looks like your range errors out (no "front"). Your sliding splitter does not handle narrow strings. I'd have thought that was the original goal. Well, it is better to not support it at all of course, rather than doing it wrong :)
Oct 03 2014
parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 16:32:24 UTC, monarch_dodra wrote:
 If anything, I'd have expected you to provide something returns 
 the popped element. What you do pops an element, and then 
 returns the *next* one. What good is that?
My mistake. It's fixed now.
 Also, what you want to check is "hasSlicing", which is more 
 powerful than "isRA". Speaking of which, if you don't have 
 "isRA", it looks like your range errors out (no "front").
Ok. Fixed know.
 Your sliding splitter does not handle narrow strings. I'd have 
 thought that was the original goal. Well, it is better to not 
 support it at all of course, rather than doing it wrong :)
That's part of episode 2, for now ;)
Oct 03 2014
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 3 October 2014 at 17:06:41 UTC, Nordlöw wrote:
 On Friday, 3 October 2014 at 16:32:24 UTC, monarch_dodra wrote:
 If anything, I'd have expected you to provide something 
 returns the popped element. What you do pops an element, and 
 then returns the *next* one. What good is that?
My mistake. It's fixed now.
Well, yes and no. You are still providing a moveFront that does not conform to what the range interface expects. EG: auto app = appender(); auto myRange = slidingSplitter([1, 2, 3]); for ( ; !myRange.empty ; myRange.popFront() ) app.put(myRange.moveFront()); As you can see from this code snippet, moveFront is not expected to pop. At all. As a matter of fact, this will error out during runtime, as the loop will attempt a pop on an empty range. IMO, you should just leave it out.
 Also, what you want to check is "hasSlicing", which is more 
 powerful than "isRA". Speaking of which, if you don't have 
 "isRA", it looks like your range errors out (no "front").
Ok. Fixed know.
There's still the issue that if you don't have slicing, then your range doesn't have "front" at all. You might as well just make it a general template restriction.
 Your sliding splitter does not handle narrow strings. I'd have 
 thought that was the original goal. Well, it is better to not 
 support it at all of course, rather than doing it wrong :)
That's part of episode 2, for now ;)
Cool. More things I saw: Don't make opIndex const. You have no guarantee R.opindex is const. Also, "opSlice()" is not a range primitive. You can implement it, it's basically just a no-op. The one that's important is "opSlice(lhs, rhs)". Also, no point in returning an "auto ref" from "slidingSplitter", you *know* it's an rvalue. If your implementation use two ranges that you slice "on the fly", then you can trivially support strings, thanks to popFront. I threw this together. I left out checks for infinity in favor of brevity: //---- import std.range, std.stdio; import std.typecons: Tuple, tuple; struct SlidingSplitter(R) if (hasSlicing!R || isSomeString!R) { this(R)(R data) { _original = data; _right = data.save; } auto front() property { return tuple(_original[0 .. $ - _right.length], _right); } void popFront() { if (_right.empty) _original = _right; else _right.popFront(); } bool empty() const { return _original.empty; } static if (hasSlicing!R) { auto opIndex(size_t i) { return tuple( _original[0 .. i + $ - _right.length], _right[i .. $]); } size_t length() { return _right.length + 1; } } private R _original; private R _right; } auto slidingSplitter(R)(R data) { return SlidingSplitter!R(data); } void main() { { auto r = slidingSplitter([1, 2, 3]); writefln("r.length: %s", r.length); writefln("r[1]: %s", r[1]); writefln("r[2]: %s", r[2]); } { writefln("%(%s\n%)", slidingSplitter("Nordlöw")); } } //---- Output: r.length: 4 r[1]: Tuple!(int[], int[])([1], [2, 3]) r[2]: Tuple!(int[], int[])([1, 2], [3]) Tuple!(string, string)("", "Nordlöw") Tuple!(string, string)("N", "ordlöw") Tuple!(string, string)("No", "rdlöw") Tuple!(string, string)("Nor", "dlöw") Tuple!(string, string)("Nord", "löw") Tuple!(string, string)("Nordl", "öw") Tuple!(string, string)("Nordlö", "w") Tuple!(string, string)("Nordlöw", "")
Oct 03 2014
next sibling parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
 My mistake. It's fixed now.
Well, yes and no. You are still providing a moveFront that does not conform to what the range interface expects. EG: auto app = appender(); auto myRange = slidingSplitter([1, 2, 3]); for ( ; !myRange.empty ; myRange.popFront() ) app.put(myRange.moveFront()); As you can see from this code snippet, moveFront is not expected to pop. At all. As a matter of fact, this will error out during runtime, as the loop will attempt a pop on an empty range. IMO, you should just leave it out.
Ok, I removed it and added a test using std.range.moveFront instead.
 Also, what you want to check is "hasSlicing", which is more 
 powerful than "isRA". Speaking of which, if you don't have 
 "isRA", it looks like your range errors out (no "front").
Ok, I use hasSlicing instead.
 There's still the issue that if you don't have slicing, then 
 your range doesn't have "front" at all. You might as well just 
 make it a general template restriction.

 Your sliding splitter does not handle narrow strings. I'd 
 have thought that was the original goal. Well, it is better 
 to not support it at all of course, rather than doing it 
 wrong :)
That's part of episode 2, for now ;)
Cool. More things I saw: Don't make opIndex const. You have no guarantee R.opindex is const. Also, "opSlice()" is not a range primitive. You can implement it, it's basically just a no-op. The one that's important is "opSlice(lhs, rhs)". Also, no point in returning an "auto ref" from "slidingSplitter", you *know* it's an rvalue.
A typo. I know about the difference between auto and auto ref ;)
 If your implementation use two ranges that you slice "on the 
 fly", then you can trivially support strings, thanks to 
 popFront.
Very clever. That's what I wanted.
 I threw this together.
Superb! It could be motivated to use static if (isRandomAccessRange!R) { // my solution for member functions... private R _data; private size_t _index; else // assumes R is either string or wstring { // your solution for member functions... private R _original; private R _right; } to save space in the RA-case, right? Is isRandomAccessRange!R the correct way to check for this? Nice unittest ;)
 I left out checks for infinity in favor of brevity:
Last thing for now: - What do you mean by this? - Do I have to indicate that this range is not infinite?
Oct 03 2014
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 3 October 2014 at 19:12:54 UTC, Nordlöw wrote:
 On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
 If your implementation use two ranges that you slice "on the 
 fly", then you can trivially support strings, thanks to 
 popFront.
Very clever. That's what I wanted.
 I threw this together.
Superb! It could be motivated to use static if (isRandomAccessRange!R) { // my solution for member functions... private R _data; private size_t _index; else // assumes R is either string or wstring { // your solution for member functions... private R _original; private R _right; } to save space in the RA-case, right?
The idea is to try to keep as much code in common as possible. You can keep your version, provided you write this for popFront: void popFront() { if (_index < _data.length) { static if (isNarrowString!R) _index += stride(_data, _index) else ++_index; } }
 Is isRandomAccessRange!R the correct way to check for this?
Either that or isNarrowString. given the restrictions, both are correct.
 Nice unittest ;)
Thanks ;)
 I left out checks for infinity in favor of brevity:
Last thing for now: - What do you mean by this?
I mean that this would not work if you passed in a "repeat(5)" even though repeat verifies RA and hasSlicing. That's because there's a point where you check for length. The "correct" restrictions should have been: if (isSomeString || hasSlicing && !isInfinite) Note that this works regardless of operator precedence.
 - Do I have to indicate that this range is not infinite?
You indicate a range is infinite with an "enum empty = false;". By having a run-time "empty", you are implicitly stating you are not infinite. ... That said... You could perfectly well support infinite ranges, with the correct static ifs. You'd produce an infinite sliding splitter.
Oct 03 2014
parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 19:31:30 UTC, monarch_dodra wrote:
 The idea is to try to keep as much code in common as possible. 
 You can keep your version, provided you write this for popFront:

     void popFront()
     {
         if (_index < _data.length)
         {
             static if (isNarrowString!R)
                 _index += stride(_data, _index)
             else
                 ++_index;
         }
     }
Superb! Is prefix ++ preferred in D because of some specific reason? I recall it, for some containers/iterators, gives smaller/faster codegen in C++?
 You could perfectly well support infinite ranges, with the 
 correct static ifs. You'd produce an infinite sliding splitter.
That's for episode 3 ;)
Oct 03 2014
next sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 3 October 2014 at 19:46:10 UTC, Nordlöw wrote:
 Is prefix ++ preferred in D because of some specific reason? I 
 recall it, for some containers/iterators, gives smaller/faster 
 codegen in C++?
Be it C, C++ or D, "pre increment is maybe faster, and is never slower". So as a rule of thumb, unless you should *specifically* require post increment, pre-increment is to be prefered, though in 95% of the cases, it results in the same code.
Oct 03 2014
parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 19:57:31 UTC, monarch_dodra wrote:
 Be it C, C++ or D, "pre increment is maybe faster, and is never 
 slower".

 So as a rule of thumb, unless you should *specifically* require 
 post increment, pre-increment is to be prefered, though in 95% 
 of the cases, it results in the same code.
Ok. Thanks.
Oct 03 2014
prev sibling parent "Phil" <phil.j.ellison gmail.com> writes:
 Is prefix ++ preferred in D because of some specific reason? I 
 recall it, for some containers/iterators, gives smaller/faster 
 codegen in C++?
I assume the reason is the same as in C++. As users can provide their own implementations of pre and postfix incrementing, the compiler can't assume that x++ always means "copy x then increment x and return the copy". It could have arbitrary side effects. Therefore for user defined types x++ can't be replaced with ++x by the compiler even if you're not using the return value. The compiler _can_ do this for primitive types, but as monarch_dodra says, if you get in the habit of using ++x unless you explicitly need x++ then you're never going to suffer and can sometimes get a speed win.
Oct 03 2014
prev sibling next sibling parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
 I threw this together. I left out checks for infinity in favor 
 of brevity:
Could you please take a look again at https://github.com/nordlow/justd/blob/master/range_ex.d#L15 I added all the stuff we talked about. Note that I had to tweak empty() a bit. I don't know if I got right. Could you check? Further I can't get string support to work: writefln("%(%s\n%)", slidingSplitter("Nordlöw")); errors as std.format.FormatException /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos std/format.d(2591): Expected '%s' format specifier for type 'SlidingSplitter!string' and foreach (e; name) { version(show) writeln(e); } errors as range_ex.d(132,5): Error: invalid foreach aggregate name, define opApply(), range primitives, or use .tupleof which diagnostics I btw is responsible for :) Could you please check?
Oct 03 2014
next sibling parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 20:15:33 UTC, Nordlöw wrote:
 Could you please take a look again at
I made another update at https://github.com/nordlow/justd/blob/master/range_ex.d#L15 I had forgotten to move front and popFront out of static if hasSlicing!R Now auto name = slidingSplitter("Nordlöw"); errors range_ex.d(41,24): Error: incompatible types for ((this._index) += (stride(this._data, this._index))): 'ulong' and 'Result' range_ex.d(88,12): Error: template instance range_ex.SlidingSplitter!string error instantiating range_ex.d(131,32): instantiated from here: slidingSplitter!string
Oct 03 2014
parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 3 October 2014 at 20:28:24 UTC, Nordlöw wrote:
 On Friday, 3 October 2014 at 20:15:33 UTC, Nordlöw wrote:
 Could you please take a look again at
I made another update at https://github.com/nordlow/justd/blob/master/range_ex.d#L15 I had forgotten to move front and popFront out of static if hasSlicing!R Now auto name = slidingSplitter("Nordlöw"); errors range_ex.d(41,24): Error: incompatible types for ((this._index) += (stride(this._data, this._index))): 'ulong' and 'Result' range_ex.d(88,12): Error: template instance range_ex.SlidingSplitter!string error instantiating range_ex.d(131,32): instantiated from here: slidingSplitter!string
It must be resolving to std.range.stride. You want std.utf.stride.
Oct 03 2014
prev sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 3 October 2014 at 20:15:33 UTC, Nordlöw wrote:
 Note that I had to tweak empty() a bit. I don't know if I got 
 right. Could you check?
Sounds about right, but I didn't really look.
 Further I can't get string support to work:

     writefln("%(%s\n%)", slidingSplitter("Nordlöw"));

 errors as

 std.format.FormatException /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos
std/format.d(2591): 
 Expected '%s' format specifier for type 'SlidingSplitter!string'

 and

     foreach (e; name)
     {
         version(show) writeln(e);
     }

 errors as

 range_ex.d(132,5): Error: invalid foreach aggregate name, 
 define opApply(), range primitives, or use .tupleof

 which diagnostics I btw is responsible for :)

 Could you please check?
Sounds like your range isn't a range. Did you check that: isInputRange!(SlidingSplitter!string) ? As mentioned in the first reply, it's a very important check, as it's very easy to get wrong. For example, if you forget to mark front and empty as properties, then you don't have the correct range interface I believe.
 On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
        writefln("%(%s\n%)", slidingSplitter("Nordlöw"));
That's a really cool syntax, btw. I got to remember that.
Yup. Love it. Use it.
Oct 03 2014
parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 21:17:54 UTC, monarch_dodra wrote:
 Sounds about right, but I didn't really look.
std.utf.stride solved all but one thing...namely that https://github.com/nordlow/justd/blob/master/range_ex.d#L131 prints all but last Tuple!(string, string)("", "Nordlöw") Tuple!(string, string)("N", "ordlöw") Tuple!(string, string)("No", "rdlöw") Tuple!(string, string)("Nor", "dlöw") Tuple!(string, string)("Nord", "löw") Tuple!(string, string)("Nordl", "öw") Tuple!(string, string)("Nordlö", "w") I'm guessing empty() is the problem but changing _data.length == _index to _data.length < _index instead leads to infinite iteration Tuple!(string, string)("", "Nordlöw") Tuple!(string, string)("N", "ordlöw") Tuple!(string, string)("No", "rdlöw") Tuple!(string, string)("Nor", "dlöw") Tuple!(string, string)("Nord", "löw") Tuple!(string, string)("Nordl", "öw") Tuple!(string, string)("Nordlö", "w") Tuple!(string, string)("Nordlöw", "") Tuple!(string, string)("Nordlöw", "") Tuple!(string, string)("Nordlöw", "") Tuple!(string, string)("Nordlöw", "") Tuple!(string, string)("Nordlöw", "") Tuple!(string, string)("Nordlöw", "") ... Why?
Oct 03 2014
parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 21:34:33 UTC, Nordlöw wrote:
 Why?
I cracked it :) See: https://github.com/nordlow/justd/blob/master/range_ex.d#L36 Thanks for all your help! Now I know a lot more about ranges. Do you think anybody is interested in including this in Phobos?
Oct 03 2014
prev sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Friday, 3 October 2014 at 17:46:18 UTC, monarch_dodra wrote:
         writefln("%(%s\n%)", slidingSplitter("Nordlöw"));
That's a really cool syntax, btw. I got to remember that.
Oct 03 2014