digitalmars.D.learn - How to make rsplit (like in Python) in D

reply Uranuz <neuranuz gmail.com> writes:
How can I do an rsplit (like in Python) in D without extra 
allocation, using only the standard library? And why are there no 
algorithms (or parameters in existing algorithms) to process a 
range from the back? Are `back` and `popBack` somehow worse than 
`front` and `popFront`?

I've tried to write something that would work without allocation, 
but failed.
I searched the forum and found this thread:
https://forum.dlang.org/post/bug-10309-3 http.d.puremagic.com%2Fissues%2F

I tried to use `findSplitBefore` with `retro`, but it doesn't 
compile:

import std.stdio;
import std.algorithm;
import std.range;
import std.string;

void main()
{
	string str = "Human.Engineer.Programmer.DProgrammer";
	
	writeln( findSplitBefore(retro(str), ".")[0].retro );
}

Compilation output:
/d153/f534.d(10): Error: template std.range.retro cannot deduce 
function from argument types !()(Result), candidates are:
/opt/compilers/dmd2/include/std/range/package.d(198):        
std.range.retro(Range)(Range r) if 
(isBidirectionalRange!(Unqual!Range))


Why do I have to write such strange things to do such a common 
operation? I use Python at work, and there are many cases where 
I use rsplit. So it seems very strange to me that the D library 
has a lot of `advanced` algorithms that are not very commonly 
used, but no rsplit.

Maybe I'm missing something, so please give me some advice)
Oct 01 2016
next sibling parent reply Uranuz <neuranuz gmail.com> writes:
On Saturday, 1 October 2016 at 16:45:11 UTC, Uranuz wrote:
 [...]
Sorry for the noise. It was easy enough:

import std.stdio;
import std.algorithm;
import std.range;
import std.string;

void main()
{
	string str = "Human.Engineer.Programmer.DProgrammer";
	
	writeln( splitter(str, '.').back );
}

But I'm still interested in why the code above doesn't compile, and how to do `rfind` or an indexOf from the right in D. I think even if we don't have algorithms with exactly these names, we could provide some examples of how to *emulate* the behaviour of standard functions from other popular languages)
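On the `rfind` question: one possible approach is `std.string.lastIndexOf`, which searches from the right and returns -1 when nothing is found. The sketch below emulates Python's `rsplit(sep, 1)` by slicing around that index, so both pieces are slices of the original string rather than new allocations. The helper name `rsplitOnce` is hypothetical (it is not a Phobos function), and it assumes a single-code-unit (ASCII) separator.

```d
import std.stdio : writeln;
import std.string : lastIndexOf;
import std.typecons : Tuple, tuple;

// Hypothetical helper, not part of Phobos.
// Splits `s` once at the last occurrence of `sep`.
// Assumes `sep` is ASCII, so it occupies exactly one code unit.
Tuple!(string, string) rsplitOnce(string s, char sep)
{
    immutable idx = s.lastIndexOf(sep); // -1 if `sep` does not occur
    if (idx < 0)
        return tuple("", s); // no separator: everything is the tail
    return tuple(s[0 .. idx], s[idx + 1 .. $]);
}

void main()
{
    auto parts = rsplitOnce("Human.Engineer.Programmer.DProgrammer", '.');
    writeln(parts[0]); // head: everything before the last '.'
    writeln(parts[1]); // tail: everything after the last '.'
}
```

Returning an empty head when the separator is missing is a design choice made here for illustration; Python instead returns a one-element list in that case.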
Oct 01 2016
next sibling parent reply Uranuz <neuranuz gmail.com> writes:
On Saturday, 1 October 2016 at 17:23:16 UTC, Uranuz wrote:
 [...]
But this example fails. Oops. Looks like a bug(

import std.stdio;
import std.algorithm;
import std.range;
import std.string;

void main()
{
	string str = "";
	
	writeln( splitter(str, '.').back );
}

core.exception.AssertError@std/algorithm/iteration.d(3132): Assertion failure
----------------
??:? _d_assert [0x43dd1f]
??:? void std.algorithm.iteration.__assert(int) [0x4432b0]
??:? pure @property @safe immutable(char)[] std.algorithm.iteration.splitter!("a == b", immutable(char)[], char).splitter(immutable(char)[], char).Result.back() [0x43b8d6]
??:? _Dmain [0x43ae41]
??:? _D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ6runAllMFZ9__lambda1MFZv [0x43e33e]
??:? void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) [0x43e288]
??:? void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll() [0x43e2fa]
??:? void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) [0x43e288]
??:? _d_run_main [0x43e1f9]
??:? main [0x43d049]
Oct 01 2016
parent reply Uranuz <neuranuz gmail.com> writes:
On Saturday, 1 October 2016 at 17:32:59 UTC, Uranuz wrote:
 On Saturday, 1 October 2016 at 17:23:16 UTC, Uranuz wrote:
 [...]
But this example fails. Oops. Looks like a bug( import std.stdio; import std.algorithm; import std.range; import std.string; [...]
I created bug report on this: https://issues.dlang.org/show_bug.cgi?id=16569
Oct 01 2016
parent reply pineapple <meapineapple gmail.com> writes:
On Saturday, 1 October 2016 at 17:55:08 UTC, Uranuz wrote:
 [...]
This isn't a bug. It's illegal to access the front or back of an empty range. (If anything is a bug, it's the nondescriptiveness of the error.) You should write this instead:

import std.stdio;
import std.algorithm;

void main()
{
	string str = "";
	auto split = str.splitter('.');
	// Only touch .back after confirming the range is not empty.
	if(!split.empty) writeln(split.back);
}
Oct 01 2016
parent reply Uranuz <neuranuz gmail.com> writes:
On Saturday, 1 October 2016 at 18:55:54 UTC, pineapple wrote:
 [...]
When I pass an empty string to a splitter in most languages, I expect to get a list with 1 item (an empty string) as the result, but here I get an error instead. I also see an inconsistency in that .front behaves normally but .back does not. Usually I access the front of a range directly, without any check, when I expect it to have exactly 1 item. But in this case that doesn't work, which is very strange.
Oct 03 2016
parent pineapple <meapineapple gmail.com> writes:
On Monday, 3 October 2016 at 19:25:59 UTC, Uranuz wrote:
 [...]
Hm, if front works but not back that is probably a bug. I think checking whether the range is empty before accessing the member should be a viable workaround.
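The workaround suggested above can be wrapped in a small helper so call sites don't repeat the check. This is only a minimal sketch: `lastFieldOr` is a hypothetical name, and it assumes (as the assertion failure earlier in the thread suggests) that splitting an empty string yields an empty range. Returning a caller-chosen fallback in that case is an illustrative choice, not Phobos behaviour.

```d
import std.algorithm : splitter;
import std.stdio : writeln;

// Hypothetical helper: the last field of `s`, or `fallback`
// when the splitter yields no fields at all.
string lastFieldOr(string s, char sep, string fallback = "")
{
    auto parts = s.splitter(sep);
    // Check `empty` first; `back` is only valid on a non-empty range.
    return parts.empty ? fallback : parts.back;
}

void main()
{
    writeln(lastFieldOr("Human.Engineer.Programmer.DProgrammer", '.'));
    writeln(lastFieldOr("", '.', "<none>")); // intended to take the fallback path
}
```

With the default fallback of `""`, the helper gives the "one empty item" result that the previous post expected for an empty input.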
Oct 03 2016
prev sibling parent cym13 <cpicard openmailbox.org> writes:
On Saturday, 1 October 2016 at 17:23:16 UTC, Uranuz wrote:
 [...]
I'm glad to see you found your way on your own. Just a note about your remark on backward range processing: an important thing for understanding many peculiarities of Phobos is that ranges are assumed to be possibly infinite until proven otherwise. In practice that means that bidirectional ranges will let you work backward, but since the algorithms are generic they generally aren't designed specifically for those ranges.
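As an illustration of working backward on a bidirectional range: the range returned by `splitter(str, '.')` exposes `back` (as used earlier in the thread), so it can be wrapped in `retro` to walk the fields lazily from last to first. This is a sketch that assumes the same overload used above, a `string` split on a single `char`.

```d
import std.algorithm : equal, splitter;
import std.range : retro, take;
import std.stdio : writeln;

void main()
{
    string str = "Human.Engineer.Programmer.DProgrammer";

    // Fields from right to left; each field is a slice of `str`.
    auto fromBack = str.splitter('.').retro;

    // Take the last two fields without building an array of all fields.
    writeln(fromBack.take(2));

    assert(fromBack.take(2).equal(["DProgrammer", "Programmer"]));
}
```

Nothing here is specific to strings: any range that offers `back` and `popBack` can be passed to `retro` in the same way.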
Oct 01 2016
prev sibling parent reply TheFlyingFiddle <kurtyan student.chalmers.se> writes:
On Saturday, 1 October 2016 at 16:45:11 UTC, Uranuz wrote:
 [...]
There are two reasons why this does not compile. The first has to do with how retro() (and indeed most functions in std.range) work with UTF-8 strings (i.e. the string type). When working on strings as ranges, the ranges internally change the type of ".front" from 'char' into 'dchar'. This is done to ensure that algorithms working on strings do not violate UTF-8.

See related thread:
http://forum.dlang.org/post/mailman.384.1389668512.15871.digitalmars-d-learn puremagic.com

The second reason has to do with findSplitBefore called with bidirectional ranges.

Now to why your code does not compile. retro takes as input a bidirectional or random access range. Because of how strings are handled, the type "string" is not characterized as a random access range but as a bidirectional range. So the output from retro is a bidirectional range. Calling findSplitBefore with a bidirectional range unfortunately does not return results that are also bidirectional ranges. Instead, findSplitBefore returns forward ranges when bidirectional ranges are given as input. I am not entirely sure why.

The chain of calls has the following types:

string is (bidirectional range) -> retro(str) is (bidirectional range) -> findSplitBefore is (forward range)

Now the last call to retro is made with a forward range, and retro needs a bidirectional or random access range as input.

If you change the line:
 string str = "Human.Engineer.Programmer.DProgrammer";
into:
 dstring str = "Human.Engineer.Programmer.DProgrammer";

then the code will compile. The reason for this is:

dstring is (random access range) -> retro(str) is (random access range) -> findSplitBefore is (random access range)

Now the last call to retro gets a random access range as input and everyone is happy.
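The range categories described above can be checked at compile time with the traits from std.range. The sketch below states them as `static assert`s and then applies the `dstring` change suggested in this post; it is expected to compile per that explanation, but treat it as an illustration rather than a guarantee across Phobos versions.

```d
import std.algorithm : equal, findSplitBefore;
import std.range;
import std.stdio : writeln;

// With autodecoding, `string` is iterated as a range of `dchar`.
static assert(is(ElementType!string == dchar));
static assert(isBidirectionalRange!string);
static assert(!isRandomAccessRange!string);

// `dstring` holds one code point per element, so it can be indexed directly.
static assert(isRandomAccessRange!dstring);

void main()
{
    dstring str = "Human.Engineer.Programmer.DProgrammer";

    // Search the reversed text for the first '.', then flip the piece back.
    auto last = findSplitBefore(retro(str), ".")[0].retro;
    writeln(last);
    assert(last.equal("DProgrammer"));
}
```

Note that converting an existing `string` to `dstring` at run time allocates, so this mainly helps when the data is already stored as UTF-32.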
Oct 01 2016
parent Uranuz <neuranuz gmail.com> writes:
On Saturday, 1 October 2016 at 18:33:02 UTC, TheFlyingFiddle 
wrote:
 On Saturday, 1 October 2016 at 16:45:11 UTC, Uranuz wrote:
 [...]
There are two reasons why this does not compile. The first has to do with how retro() (and indeed most function in std.range) work with utf-8 strings (eg the string type). When working on strings as ranges, the ranges internally change the type of ".front" from 'char' into 'dchar'. This is done to ensure that algorithms working on strings do not violate utf-8. [...]
Thanks for the clarification. It seems that some day I'll write my own string wrapper that returns a slice of the `string` pointing to the source multibyte sequence as .front, use that, and be happy. The more I look at other languages (like Python), the more convinced I am that working with a UTF-8 string as an array of single bytes is not a very good idea.
Oct 03 2016