digitalmars.D.learn - Schroedinger's Ranges
- vacuum_tube (97/97) Jun 02 2021 I've been trying to make a struct for CSV parsing and
- Paul Backus (6/34) Jun 02 2021 `File.byLine` overwrites the previous line's data every time it
- kdevel (22/26) Jun 03 2021 a) What is the rationale behind not making byLineCopy the default?
- Mike Parker (3/5) Jun 03 2021 byLine was the original implementation. byLineCopy was added
- Mike Parker (3/9) Jun 03 2021 See:
- kdevel (5/12) Jun 03 2021 THX. BTW byLineCopy defaults to immutable char. That's why one
- Steven Schveighoffer (7/25) Jun 03 2021 I was going to suggest use byLineCopy!(char, char), because the second
- WebFreak001 (9/18) Jun 03 2021 additionally to the other comment, you probably want to use
I've been trying to make a struct for CSV parsing and manipulating. The code was as follows:

```
import std.algorithm : map;
import std.array : array, split;
import std.stdio : File, writeln;

struct CSVData(bool HeaderFromFirstLine)
{
    char[][] header = [];
    char[][][] rest = [];

    this(string filename)
    {
        auto tmp = File(filename).byLine();

        if(HeaderFromFirstLine)
        {
            this.header = CSVData.parseCSV(tmp.front()).array;
            tmp.popFront();
        }

        this.rest = tmp.map!(e => parseCSV(e)).array;
    }

    static char[][] parseCSV(char[] str)
    {
        char[][] tmp = split(str, ",");
        return tmp;
    }

    void print()
    {
        writeln(this.header);
        foreach(e; this.rest)
            writeln(e);
    }
}

void main()
{
    auto data = CSVData!true("testdata");
    data.print();
}
```

The "testdata" text file looked like this:

```
10,15,Hello world
stuff,,more stuff
```

And the output from running it looked like this:

```
["st", "ff", ",more stuff"]
["stuff", "", "more stuff"]
```

As you can see, the `header` field is not printing correctly. In an attempt to debug, I added several `writeln`s to the constructor:

```
this(string filename)
{
    auto tmp = File(filename).byLine();

    if(HeaderFromFirstLine)
    {
        this.header = CSVData.parseCSV(tmp.front()).array;
        tmp.popFront();
        writeln(this.header);
    }

    this.rest = tmp.map!(e => parseCSV(e)).array;
    writeln(this.header);
}
```

This produced the following output:

```
["10", "15", "Hello world"]
["st", "ff", ",more stuff"]
["st", "ff", ",more stuff"]
["stuff", "", "more stuff"]
```

I then tried commenting out the offending line (the one with the `map`) and got the expected result:

```
["10", "15", "Hello world"]
["10", "15", "Hello world"]
["10", "15", "Hello world"]
```

Finally, I replaced the offending line and called a different function on `tmp`:

```
writeln(tmp.front);
```

And got the following result:

```
["10", "15", "Hello world"]
stuff,,more stuff
["st", "ff", ",more stuff"]
["st", "ff", ",more stuff"]
```

So it appears that observing or modifying `tmp` somehow modifies `header`, despite not interacting with it in any visible way. What is the reason for this?

I'm guessing it either has to do with the internals of ranges, or that the arrays were messing up somehow, but I'm not sure. Thanks in advance!
Jun 02 2021
On Thursday, 3 June 2021 at 00:39:04 UTC, vacuum_tube wrote:
> I've been trying to make a struct for CSV parsing and manipulating. The code was as follows:
> ```
> struct CSVData(bool HeaderFromFirstLine)
> {
>     char[][] header = [];
>     char[][][] rest = [];
>
>     this(string filename)
>     {
>         auto tmp = File(filename).byLine();
>
>         if(HeaderFromFirstLine)
>         {
>             this.header = CSVData.parseCSV(tmp.front()).array;
>             tmp.popFront();
>         }
>
>         this.rest = tmp.map!(e => parseCSV(e)).array;
>     }
> ```

[...]

> The "testdata" text file looked like this:
> ```
> 10,15,Hello world
> stuff,,more stuff
> ```
> And the output from running it looked like this:
> ```
> ["st", "ff", ",more stuff"]
> ["stuff", "", "more stuff"]
> ```

`File.byLine` overwrites the previous line's data every time it reads a new line. If you want to store each line's data for later use, you need to use [`byLineCopy`][1] instead.

[1]: https://phobos.dpldocs.info/std.stdio.File.byLineCopy.1.html
Jun 02 2021
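A minimal sketch of the difference described above, using the two sample lines from the thread. The scratch file name and the `makeSample` helper are assumptions made for illustration; only `byLine` and `byLineCopy` come from the discussion. With `byLine`, the slice saved in `first` points into a buffer that the range may reuse for the next line, so its contents can change once the next line is read. With `byLineCopy!(char, char)`, each element is a separate mutable copy.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

// Hypothetical helper: writes the thread's two-line sample to a scratch file.
string makeSample()
{
    auto path = buildPath(tempDir, "schroedinger_testdata.csv");
    write(path, "10,15,Hello world\nstuff,,more stuff\n");
    return path;
}

void main()
{
    auto path = makeSample();
    scope (exit) remove(path);

    {
        // byLine: front is a slice of an internal buffer that gets reused.
        auto lines = File(path).byLine();
        char[] first = lines.front;
        lines.popFront();
        auto second = lines.front; // reading line 2 may overwrite what `first` refers to
        writeln("byLine:     first = ", first, " | second = ", second);
    }

    {
        // byLineCopy!(char, char): every front is a freshly allocated char[].
        auto lines = File(path).byLineCopy!(char, char)();
        char[] first = lines.front;
        lines.popFront();
        auto second = lines.front;
        writeln("byLineCopy: first = ", first, " | second = ", second);
    }
}
```

In the second block `first` keeps the text of the first line, which is the behaviour the original struct needed for its `header` field.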
On Thursday, 3 June 2021 at 01:22:14 UTC, Paul Backus wrote:
>> auto tmp = File(filename).byLine();
>
> `File.byLine` overwrites the previous line's data every time it reads a new line. If you want to store each line's data for later use, you need to use [`byLineCopy`][1] instead.

a) What is the rationale behind not making byLineCopy the default?

b) Does not compile:

    csv.d(17): Error: function csv.CSVData!true.CSVData.parseCSV(char[] str) is not callable using argument types (string)
    csv.d(17):        cannot pass argument tmp.front() of type string to parameter char[] str
    csv.d(21): Error: function csv.CSVData!true.CSVData.parseCSV(char[] str) is not callable using argument types (string)
    csv.d(21):        cannot pass argument e of type string to parameter char[] str
    [...]/../../src/phobos/std/algorithm/iteration.d(525):        instantiated from here: MapResult!(__lambda2, ByLineCopy!(immutable(char), char))
    csv.d(21):        instantiated from here: map!(ByLineCopy!(immutable(char), char))
    csv.d(40):        instantiated from here: CSVData!true

c) Reminds me of the necessity to add dups here and there. And reminds me of "helping the compiler" [1]?

[1] <https://wiki.c2.com/?HelpingTheCompilerIsEvil>
Jun 03 2021
On Thursday, 3 June 2021 at 10:18:25 UTC, kdevel wrote:
> a) What is the rationale behind not making byLineCopy the default?

byLine was the original implementation. byLineCopy was added later after the need for it became apparent.
Jun 03 2021
On Thursday, 3 June 2021 at 10:30:24 UTC, Mike Parker wrote:
> On Thursday, 3 June 2021 at 10:18:25 UTC, kdevel wrote:
>> a) What is the rationale behind not making byLineCopy the default?
>
> byLine was the original implementation. byLineCopy was added later after the need for it became apparent.

See: https://forum.dlang.org/post/lg4l7s$11rl$1@digitalmars.com
Jun 03 2021
>>> a) What is the rationale behind not making byLineCopy the default?
>>
>> byLine was the original implementation. byLineCopy was added later after the need for it became apparent.
>
> See: https://forum.dlang.org/post/lg4l7s$11rl$1@digitalmars.com

THX. BTW byLineCopy defaults to immutable char. That's why one has to use

    auto tmp = File(filename).byLineCopy!(char, char);

or

    auto tmp = File(filename).byLine.map!dup;
Jun 03 2021
On 6/3/21 9:00 AM, kdevel wrote:
>>>> a) What is the rationale behind not making byLineCopy the default?
>>>
>>> byLine was the original implementation. byLineCopy was added later after the need for it became apparent.
>>
>> See: https://forum.dlang.org/post/lg4l7s$11rl$1@digitalmars.com
>
> THX. BTW byLineCopy defaults to immutable char. That's why one has to use
>
>     auto tmp = File(filename).byLineCopy!(char, char);
>
> or
>
>     auto tmp = File(filename).byLine.map!dup;

I was going to suggest using `byLineCopy!(char, char)`, because the second option with `map` makes a copy every time you call `front`.

And, my goodness, that is backwards for the template parameters. The terminator type should be determined by IFTI, it should never have been the first template parameter!

-Steve
Jun 03 2021
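The point about `map` copying on every call to `front` can be checked without any file I/O. This is a small sketch, not code from the thread: `countFrontCalls` is a hypothetical helper, and the one-element array stands in for the lines a file range would produce. It relies on `map` being lazy, so the mapped function runs each time `front` is read.

```d
import std.algorithm : map;
import std.stdio : writeln;

// Hypothetical helper: counts how often the mapped function runs
// when front is read twice without advancing the range.
int countFrontCalls()
{
    int calls = 0;
    char[] buffer = "stuff".dup;

    // Lazy range: the lambda (and its .dup) runs on every access to front.
    auto copies = [buffer].map!((char[] line) { ++calls; return line.dup; });

    auto a = copies.front;
    auto b = copies.front;

    assert(a == b);          // same contents
    assert(a.ptr !is b.ptr); // but two separate allocations
    return calls;
}

void main()
{
    writeln(countFrontCalls()); // prints 2
}
```

Reading `front` once per element, as `.array` does, costs one copy per line, so the extra allocations only matter when code reads `front` repeatedly before calling `popFront`.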
On Thursday, 3 June 2021 at 00:39:04 UTC, vacuum_tube wrote:
> I've been trying to make a struct for CSV parsing and manipulating. The code was as follows:
> ```
> struct CSVData(bool HeaderFromFirstLine)
> {
>     char[][] header = [];
>     char[][][] rest = [];
> ```
> [...]

In addition to the other comments, you probably want to use `string` (`immutable(char)[]`) instead of `char[]` here, since you want your data to stay the same and not be modified after assignment.

If you replace them with `string` and make your code `@safe`, the compiler will tell you where you try to assign `char[]` data that may be modified; in those cases you would want to call `.idup` to duplicate the data and make it persistent.
Jun 03 2021
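A sketch of how the struct might look with `string` fields, as suggested above. This is one possible rewrite under stated assumptions, not the original poster's final code: it uses `byLineCopy` with its default template arguments (which yields a `string` per line), the `print` method is dropped for brevity, and the scratch file in `main` is made up for the demonstration. With `byLine` instead, each line would need an explicit `.idup` before being stored.

```d
import std.algorithm : map;
import std.array : array, split;
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

struct CSVData(bool headerFromFirstLine)
{
    string[] header;
    string[][] rest;

    this(string filename)
    {
        // Default byLineCopy: each line is an independent immutable copy.
        auto lines = File(filename).byLineCopy();

        if (headerFromFirstLine && !lines.empty)
        {
            header = parseCSV(lines.front);
            lines.popFront();
        }

        rest = lines.map!(line => parseCSV(line)).array;
    }

    // Naive split on commas; quoted fields are not handled.
    static string[] parseCSV(string line)
    {
        return line.split(",");
    }
}

void main()
{
    // Hypothetical scratch file with the sample data from the thread.
    auto path = buildPath(tempDir, "schroedinger_strings.csv");
    write(path, "10,15,Hello world\nstuff,,more stuff\n");
    scope (exit) remove(path);

    auto data = CSVData!true(path);
    writeln(data.header);
    foreach (row; data.rest)
        writeln(row);
}
```

Because the stored slices now refer to immutable data that nothing else writes to, reading further lines cannot change `header` after the fact.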