digitalmars.D.learn - splitting numbers from a test file
- Craig Dillabaugh (35/35) Sep 18 2012 Hello I am trying to read in a set of numbers from a text file.
- bearophile (15/30) Sep 18 2012 Here to!string() is probably unnecessary, it's a wasted
- Craig Dillabaugh (6/37) Sep 18 2012 Thanks very much.
- Ali Çehreli (15/22) Sep 18 2012 No, parts is a lazy range, which is ready to serve its elements as
- Jonathan M Davis (44/83) Sep 18 2012 The docs do not show that splitter returns an array, because it doesn't....
- Craig Dillabaugh (19/120) Sep 18 2012 Thanks, a few others have pointed that out to me too. But as a D
- Jonathan M Davis (19/36) Sep 18 2012 The documentation says that it returns a range. Presumably then, the pro...
- Craig Dillabaugh (27/47) Sep 18 2012 From:
- Ali Çehreli (11/22) Sep 18 2012 It is unfortunate that there is also the other splitter, which at least
- Craig Dillabaugh (8/35) Sep 19 2012 Ali and Johnathan:
- Ali Çehreli (8/11) Sep 19 2012 Thank you. :)
- Jonathan M Davis (10/12) Sep 18 2012 Ah. I was looking at std.algorithm.splitter (which operates on generic r...
Hello,

I am trying to read in a set of numbers from a text file. The file in question looks something like this:

    35 2 0 1 0 0.49463548699999998 0.88077994719999997 0 1 0.60672109949999997 0.2254208717 0

After each line I want to check how many numbers were on the line I just read. My code to read this file looks like:

     1 import std.stdio;
     2 import std.conv;
     3
     4 int main( string[] argv ) {
     5   real[] numbers_read;
     6   size_t line_count=1;
     7
     8   auto f = std.stdio.File("test.txt", "r");
     9   foreach( char[] s; f.byLine() ) {
    10     string line = std.string.strip( to!string(s) );
    11     auto parts = std.array.splitter( line );
    12     writeln("There are ", parts.length, " numbers in line ", line_count++);
    13     foreach(string p; parts) {
    14       numbers_read ~= to!real(p);
    15     }
    16   }
    17   f.close();
    18   return 0;
    19 }

When I try to compile this I get an error:

    test.d(12): Error undefined identifier 'length;

However, shouldn't splitter be returning an array (that's what the docs seem to show)? What is the type of 'parts'? (I tried using std.traits to figure this out, but that just generated more syntax errors for me.)

Cheers,
Craig
Sep 18 2012
Craig Dillabaugh:

>  8   auto f = std.stdio.File("test.txt", "r");
>  9   foreach( char[] s; f.byLine() ) {
> 10     string line = std.string.strip( to!string(s) );
> 11     auto parts = std.array.splitter( line );
> 12     writeln("There are ", parts.length, " numbers in line ", line_count++);
> 13     foreach(string p; parts) {
> 14       numbers_read ~= to!real(p);
> 15     }
> 16   }
> 17   f.close();
> 18   return 0;
> 19 }
>
> When I try to compile this I get an error:
> test.d(12): Error undefined identifier 'length;

Here to!string() is probably unnecessary; it's a wasted allocation.

splitter() returns a lazy range that doesn't know its length. To solve your problem there are two main solutions: use split() instead of splitter(), or use walkLength() on the range given by splitter(). In theory splitter() should be faster, but in practice this isn't always true.

Keep in mind that "real" is usually more than 64 bits long, and it's not so fast.

Maybe nowadays there are other ways to load that data; I don't know if readfln("%(%f %)%") or something similar works.

Bye,
bearophile
Sep 18 2012
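The two fixes bearophile mentions can be sketched side by side. This is a minimal illustration, not code from the thread: the function names countFieldsEager and countFieldsLazy are made up for the example, and it assumes a Phobos version in which std.algorithm.splitter has a whitespace-splitting overload for strings (the thread itself uses std.array.splitter).

```d
import std.algorithm : splitter;
import std.array : split;
import std.range : walkLength;

/// Eager variant: split() allocates an array of fields,
/// so .length is available immediately.
size_t countFieldsEager(const(char)[] line)
{
    return split(line).length;
}

/// Lazy variant: walkLength() iterates the range returned by
/// splitter() and counts its elements without building an array.
size_t countFieldsLazy(const(char)[] line)
{
    return walkLength(splitter(line));
}

unittest
{
    assert(countFieldsEager("35 2 0 1") == 4);
    assert(countFieldsLazy("35 2 0 1") == 4);
}
```

Which one is faster probably depends on line length and on whether the fields are needed again afterwards, so it seems worth measuring rather than assuming.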
On Wednesday, 19 September 2012 at 02:58:33 UTC, bearophile wrote:

> Here to!string() is probably unnecessary, it's a wasted allocation.

Thanks very much. I tried the strip() without to!string and got a syntax error when I tried to compile.

Cheers,
Craig
Sep 18 2012
On 09/18/2012 07:50 PM, Craig Dillabaugh wrote:

> 11 auto parts = std.array.splitter( line );
> 12 writeln("There are ", parts.length, " numbers in line ",
>
> When I try to compile this I get an error:
> test.d(12): Error undefined identifier 'length;

That is a very common confusion with ranges.

> However, shouldn't splitter be returning an array (thats what the
> docs seem to show)?

No, parts is a lazy range, which is ready to serve its elements as needed. If you want to convert its elements to an array eagerly, you can call std.array.array:

    import std.array;
    // ...
    writeln("There are ", array(parts).length, " numbers in line ", line_count++);

> What is the type of 'parts'?

    writeln(typeid(parts));

or

    writeln(typeof(parts).stringof);

Ali

--
D Programming Language Tutorial: http://ddili.org/ders/d.en/index.html
Sep 18 2012
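Ali's two suggestions (converting with array() and inspecting the type) can also be checked at compile time. The following is only a sketch under assumptions: fieldsAsArray is a hypothetical helper name, and std.algorithm.splitter's whitespace overload stands in for the std.array.splitter used in the thread.

```d
import std.algorithm : splitter;
import std.array : array;
import std.range : hasLength;

// Prints the compiler's name for the range type during compilation.
pragma(msg, typeof(splitter("a b")).stringof);

// The lazy result is not a string[] and exposes no .length property.
static assert(!is(typeof(splitter("a b")) == string[]));
static assert(!hasLength!(typeof(splitter("a b"))));

/// Eagerly copies the lazily produced fields into a string[].
string[] fieldsAsArray(string line)
{
    return array(splitter(line));
}

unittest
{
    auto fields = fieldsAsArray("0 1 2");
    assert(fields.length == 3);
    assert(fields == ["0", "1", "2"]);
}
```

The exact type name printed by pragma(msg) is an implementation detail of Phobos and may differ between releases.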
On Wednesday, September 19, 2012 04:50:45 Craig Dillabaugh wrote:

> When I try to compile this I get an error:
> test.d(12): Error undefined identifier 'length;
>
> However, shouldn't splitter be returning an array (thats what the
> docs seem to show)? What is the type of 'parts'? (I tried using
> std.traits to figure this out, but that just generated more syntax
> errors for me).

The docs do not show that splitter returns an array, because it doesn't. It returns a lazy range type which finds each successive element as you iterate over it. It doesn't have a length property, because its length isn't known until you iterate over it. You have three options:

1. Use std.array.split, which returns an array (so, it's eager and requires additional memory allocations to create the array, but you'll have its length without having to iterate over it multiple times).

2. Use std.range.walkLength to get the length of the range. If a range has a length property, then walkLength just returns that, otherwise it iterates over the whole range and counts its elements. So, you won't get extra memory allocations, but you'll have to iterate over the range twice.

3. Simply count up the number of elements as you iterate over them and _then_ print out the length.

Also, there's no need to convert s to a string like that. If you were saving the string or needed an actual string instead of char[], then that would make sense, but you're just splitting it and then converting it to a number. char[] will work just fine for that.

So, something like this would probably be better:

    import std.array;
    import std.conv;
    import std.stdio;
    import std.string;

    void main()
    {
        real[] numbers_read;
        size_t line_count = 0;

        auto f = std.stdio.File("test.txt", "r");
        foreach(line; f.byLine())
        {
            // strip and splitter both work directly on char[]
            line = strip(line);
            auto parts = std.array.splitter(line);

            // count the fields while converting them (option 3)
            size_t length = 0;
            foreach(p; parts)
            {
                numbers_read ~= to!real(p);
                ++length;
            }
            writeln("There are ", length, " numbers in line ", ++line_count);
        }
    }

If you aren't familiar with ranges, then read this:

http://ddili.org/ders/d.en/ranges.html

Ranges are used quite heavily in Phobos, so you should be familiar with them if you intend to use D.

- Jonathan M Davis
Sep 18 2012
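Option 2 (walkLength, then a second pass) might look roughly like the sketch below. The helper name parseLine and its out parameter are assumptions made for illustration, double is used instead of real, and std.algorithm.splitter's whitespace overload is assumed to be available and to return a forward range (so that .save can be called).

```d
import std.algorithm : splitter;
import std.conv : to;
import std.range : walkLength;
import std.string : strip;

/// Parses one line of whitespace-separated numbers.
/// `count` receives the number of fields found on the line.
double[] parseLine(const(char)[] line, out size_t count)
{
    // strip() is templated on the character type, so a char[] slice is fine.
    auto parts = splitter(strip(line));

    // First pass: count on a saved copy so `parts` itself is not consumed.
    count = walkLength(parts.save);

    // Second pass: convert each field to a number.
    double[] numbers;
    numbers.reserve(count);
    foreach (p; parts)
        numbers ~= to!double(p);
    return numbers;
}

unittest
{
    size_t n;
    auto nums = parseLine("  35 2 0 1  ", n);
    assert(n == 4);
    assert(nums == [35.0, 2.0, 0.0, 1.0]);
}
```

Knowing the count up front allows reserve() to size the array once, at the cost of walking the line twice; option 3 in the post avoids the second walk.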
On Wednesday, 19 September 2012 at 03:12:21 UTC, Jonathan M Davis wrote:

> The docs do not show that splitter returns an array, because it
> doesn't. It returns a lazy range type which finds each successive
> element as you iterate over it.

Thanks, a few others have pointed that out to me too. But as a D newbie how would I have any clue what splitter returns since the return type is auto?

There is an example in the docs:

    auto a = " a bcd ef gh ";
    assert(equal(splitter(a), ["", "a", "bcd", "ef", "gh"][]));

I guessed that since the return of splitter was equal to

    ["", "a", "bcd", "ef", "gh"][]

it was returning some sort of 2D array!

When a function returns an 'auto' in Phobos, is this generally indicative of the return value being a range?

> Also, there's no need to convert s to a string like that. [...]
> char[] will work just fine for that.

I think my problem was that I was trying to call strip on it first to remove leading/trailing whitespace and I was getting syntax errors when I called strip() on the char[]. Just calling split works as you say.
Sep 18 2012
On Wednesday, September 19, 2012 05:36:36 Craig Dillabaugh wrote:

> Thanks, a few others have pointed that out to me too. But as a D
> newbie how would I have any clue what splitter returns since the
> return type is auto?

The documentation says that it returns a range. Presumably then, the problem is that you're not familiar with ranges, and that needs to be handled better. We really need a proper article/tutorial on the main site which explains them, and we don't. But I don't know what we'd do differently in the documentation for functions in general. Ranges are a concept that is used quite heavily in Phobos, and it wouldn't make sense to try and explain them for every function that uses them.

> There is an example in the docs.
>
> auto a = " a bcd ef gh ";
> assert(equal(splitter(a), ["", "a", "bcd", "ef", "gh"][]));

It would have used == if it were an array. equal operates on ranges, so if it's used, odds are that the types on the right and left sides are different.

> I guessed that since the return of splitter was equal to :
> ["", "a", "bcd", "ef", "gh"][]
> it was returning some sort of 2D array! When a function returns an
> 'auto' in the Phobos is this generally indicative of the return
> value being a range?

That's the most common, but it's not always the case. It will usually say in the documentation though (and if you're familiar with ranges, it's generally fairly obvious if the return type is a range just based on what the function is doing), and in this case it does.

> I think my problem was that I was trying to call strip on it first
> to remove leading/trailing whitespace and I was getting syntax
> errors when I called strip() on the char[]. Just calling split works
> as you say.

strip works just fine on a char[]. I don't know why you were having problems with it. Maybe you're using an older release of the compiler and strip used to take a string rather than being templated on character type? I don't know. If you're on 2.060 though, strip should work just fine with char[].

- Jonathan M Davis
Sep 18 2012
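The point about equal versus == can be illustrated with a small sketch. The helper sameFields is a hypothetical name introduced here, and the input deliberately has no leading or trailing whitespace, since how the splitter treats those appears to have changed between Phobos releases.

```d
import std.algorithm : equal, splitter;

/// True if the whitespace-separated fields of `line` match `expected`
/// element by element. equal() walks both ranges, so the left side can
/// be a lazy range while the right side is a plain array.
bool sameFields(const(char)[] line, const(string)[] expected)
{
    return equal(splitter(line), expected);
}

unittest
{
    assert(sameFields("a bcd ef gh", ["a", "bcd", "ef", "gh"]));
    assert(!sameFields("a bcd", ["a"]));
}
```

Seeing equal() rather than == in a documentation example is therefore a hint, though not a guarantee, that the two sides have different types.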
On Wednesday, 19 September 2012 at 04:03:44 UTC, Jonathan M Davis wrote:

> On Wednesday, September 19, 2012 05:36:36 Craig Dillabaugh wrote:
>> Thanks, a few others have pointed that out to me too. But as a D
>> newbie how would I have any clue what splitter returns since the
>> return type is auto?
>
> The documentation says that it returns a range. Presumably then, the
> problem is that you're not familiar with ranges, and that needs to
> be handled better.

clip ....

From:

http://dlang.org/phobos/std_array.html#splitter

The documentation (copied and pasted) for splitter reads:

    auto splitter(C)(C[] s);
    Splits a string by whitespace.

    Example:
    auto a = " a bcd ef gh ";
    assert(equal(splitter(a), ["", "a", "bcd", "ef", "gh"][]));

I have this awful feeling that I am missing something blatantly obvious here, and that by posting this reply I am leaving a permanent testament to my stupidity on the internet, but I really want to understand this ...

I just want to figure out how you can explicitly say "the documentation says it returns a range" based on that! Is it simply because you recognize the range from the assert statement in the example?

I am sure the Phobos developers have better things to do than writing documentation that coddles newbies, but could the documentation not say:

    auto splitter(C)(C[] s);
    Splits a string by whitespace. Returns an InputRange of all substrings.

Or something to that effect.

Thanks again for your time.
Sep 18 2012
On 09/18/2012 09:56 PM, Craig Dillabaugh wrote:

> On Wednesday, 19 September 2012 at 04:03:44 UTC, Jonathan M Davis wrote:
>> The documentation says that it returns a range.
>
> From:
> http://dlang.org/phobos/std_array.html#splitter
>
> The documentation (copied and pasted) for splitter reads:
>
> auto splitter(C)(C[] s);
> Splits a string by whitespace.
>
> Example:
> auto a = " a bcd ef gh ";
> assert(equal(splitter(a), ["", "a", "bcd", "ef", "gh"][]));

It is unfortunate that there is also the other splitter, which at least implies ranges: :-/

http://dlang.org/phobos/std_algorithm.html#splitter

Yes, the documentation can be much better. For example, the documentation for the second splitter above looks exactly like the other one, except that one says "using an element as a separator." while the other one says "using another range as a separator".

I think it is a ddoc limitation: Template constraints are not included in documentation yet.

Ali
Sep 18 2012
On Wednesday, 19 September 2012 at 06:09:38 UTC, Ali Çehreli wrote:

> Yes, the documentation can be much better. For example, the
> documentation for the second splitter above looks exactly like the
> other one, except that one says "using an element as a separator."
> while the other one says "using another range as a separator".
>
> I think it is a ddoc limitation: Template constraints are not
> included in documentation yet.

Ali and Jonathan:

Thank you for your help. Also Ali thanks for your book, motivated by this little problem I've started reading your chapter on ranges. It is very helpful.

Cheers,
Craig
Sep 19 2012
On 09/19/2012 10:22 AM, Craig Dillabaugh wrote:

> Thank you for your help. Also Ali thanks for your book, motivated by
> this little problem I've started reading your Chapter on ranges. It
> is very helpful.

Thank you. :) Obviously, I am aware of its shortcomings. Especially, the difference between a container and a range must be stressed. The chapter touches on that idea at different places but I don't think it is spelled out sufficiently. To be improved some time in the future... :)

Ali
Sep 19 2012
On Wednesday, September 19, 2012 06:56:23 Craig Dillabaugh wrote:

> From:
> http://dlang.org/phobos/std_array.html#splitter

Ah. I was looking at std.algorithm.splitter (which operates on generic ranges and separators) which _does_ explicitly say that it returns a range.

Yeah. The documentation on std.array.splitter is incredibly sparse. It doesn't even state the result is lazy (though if it did, it would be bound to say that it was a lazy range, which would then mean that it was stating that the return type was a range), making the difference between it and split not at all obvious. That should be fixed.

Internally, it just does

    return std.algorithm.splitter!(std.uni.isWhite)(s);

- Jonathan M Davis
Sep 18 2012
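The forwarding call Jonathan quotes can be tried directly. This is a rough sketch only: countWithPredicate is an invented name, and the behaviour shown is that of std.algorithm.splitter with a predicate as far as it is documented; whether std.array.splitter still forwards this way (or exists at all) in a given Phobos release is not something the thread establishes.

```d
import std.algorithm : splitter;
import std.range : walkLength;
import std.uni : isWhite;

/// Counts fields using the predicate form of splitter: every character
/// for which isWhite() returns true ends the current field.
size_t countWithPredicate(const(char)[] s)
{
    return walkLength(splitter!isWhite(s));
}

unittest
{
    // Single spaces between fields: four fields are produced.
    assert(countWithPredicate("35 2 0 1") == 4);
}
```

Since the predicate form treats each whitespace character as its own terminator, runs of whitespace may yield empty fields; stripping the line first, as in the earlier posts, sidesteps leading and trailing cases.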