digitalmars.D - csvReader read file byLine()?
- Jens Mueller (14/14) Jun 21 2012 Hi,
- Timon Gehr (2/16) Jun 21 2012 You might make use of std.algorithm.joiner.
- Jens Mueller (10/33) Jun 21 2012 You mean like
- Jens Mueller (4/27) Jun 21 2012 The problem is that csvParser expects a range with elements of type
- Jesse Phillips (21/34) Jun 21 2012 It requires a dchar range so that Unicode support is enforced. It
- Jens Mueller (19/60) Jun 22 2012 auto file = File("test.csv", "r");
- Jesse Phillips (21/30) Jun 22 2012 Yes, and it seems .joiner isn't as lazy as I'd have thought.
- Jens Mueller (6/40) Jun 22 2012 Thanks. That works. But this should be either mentioned in the
- travert phare.normalesup.org (Christophe Travert) (12/54) Jun 22 2012 Yes, and that increases GC usage a lot.
- Jesse Phillips (4/7) Jun 22 2012 I'd say start by filing a bug that joiner does not work with
- Jonathan M Davis (3/12) Jun 22 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8085
- Andrei Alexandrescu (3/11) Jun 27 2012 No, it should be easily fixable.
- Andrei Alexandrescu (3/7) Jun 27 2012 Yah, it's a bug in joiner that Walter also found.
Hi,

I used std.csv for reading a CSV file. Thanks a lot to Jesse for writing and adding std.csv to Phobos. Using it is fairly straightforward but I miss one thing. Very commonly you need to read a CSV file. With std.csv that boils down to

    auto records = csvReader!(Record)(readText(filename));

But csvReader won't parse from File(filename, "r").byLine() even though that is an InputRange, isn't it? That means I always have to use readText. All IO happens that very moment. Shouldn't the csvReader support lazy reading from a file like this

    auto file = File(filename, "r");
    auto records = csvReader!(Record)(file.byLine());

Am I missing something? Was this left out for a reason or an oversight?

Jens
Jun 21 2012
On 06/21/2012 02:17 PM, Jens Mueller wrote:
> Shouldn't the csvReader support lazy reading from a file like this
>
>     auto file = File(filename, "r");
>     auto records = csvReader!(Record)(file.byLine());
>
> Am I missing something? Was this left out for a reason or an oversight?

You might make use of std.algorithm.joiner.
Jun 21 2012
Timon Gehr wrote:
> You might make use of std.algorithm.joiner.

You mean like

    auto file = File(filename, "r");
    auto records = csvReader!(Record)(joiner(file.byLine(KeepTerminator.yes)));

Then a CSVException is raised. I don't know why yet; I have to investigate. Thanks for the pointer.

BTW, std.stdio should publicly import std.string.KeepTerminator, shouldn't it? Otherwise you have to import it yourself.

Jens
Jun 21 2012
Timon Gehr wrote:
> You might make use of std.algorithm.joiner.

The problem is that csvParser expects a range with elements of type dchar. Any idea why that is required for CSV parsing?

Jens
Jun 21 2012
On Thursday, 21 June 2012 at 20:30:07 UTC, Jens Mueller wrote:
> The problem is that csvParser expects a range with elements of type dchar. Any idea why that is required for CSV parsing?

It requires a dchar range so that Unicode support is enforced. It is the same reason char[] is a range of dchar.

You'll have to give me some example code, my test has no issue using joiner with byLine.

    import std.stdio;
    import std.algorithm;
    import std.csv;

    void main() {
        struct Record {
            string one, two, three;
        }
        auto filename = "file.csv";
        auto file = File(filename, "r");
        auto records = csvReader!Record(file.byLine().joiner("\n"));
        foreach(r; records) {
            writeln(r);
        }
    }
Jun 21 2012
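The element types discussed in the post above can be checked at compile time. This is a minimal sketch, assuming a Phobos version in which narrow strings are auto-decoded (as in the thread); the alias name `Lines` is made up for this illustration and is not part of Phobos.

```d
import std.range : ElementType, isInputRange;
import std.stdio : File;

// Hypothetical alias for the type returned by File.byLine().
alias Lines = typeof(File.init.byLine());

// byLine is an input range, but each element is a whole line (char[]) ...
static assert(isInputRange!Lines);
static assert(is(ElementType!Lines == char[]));

// ... while a single char[] or string, viewed as a range, yields dchar.
static assert(is(ElementType!(char[]) == dchar));
static assert(is(ElementType!string == dchar));

void main() {}
```

So byLine() on its own is a range of lines rather than a range of characters, which is why it has to be flattened (for example with joiner) before csvReader will accept it.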
Jesse Phillips wrote:
> You'll have to give me some example code, my test has no issue using joiner with byLine.

    auto file = File("test.csv", "r");
    auto records = csvReader!double(file.byLine().joiner("\n"));
    writeln(records);

The last line throws a CSVException due to a conversion error, 'Floating point conversion error for input "".', for the attached input. If you change the input to

    3.0
    4.0

you get no exception but a wrong output of [[4], [4]].

Using readText or

    auto records = csvReader!Record(["3.00", "4.0"].joiner("\n"));

works as expected. Can you reproduce the issue? I'm running dmd2.059 on Linux.

Thanks.

Jens
Jun 22 2012
On Friday, 22 June 2012 at 08:12:59 UTC, Jens Mueller wrote:
> The last line throws a CSVException due to some conversion error 'Floating point conversion error for input "".' for the attached input. If you change the input to 3.0 4.0 you get no exception but a wrong output of [[4], [4]].

Yes, and it seems .joiner isn't as lazy as I'd have thought. byLine() reuses its buffer so it will overwrite previous lines in the file. This can be resolved by mapping a dup to it.

    import std.stdio;
    import std.algorithm;
    import std.csv;

    void main() {
        struct Record {
            double one;
        }
        auto filename = "file.csv";
        auto file = File(filename, "r");
        // Copy each line so it no longer aliases byLine's reused buffer.
        auto input = map!(a => a.idup)(file.byLine()).joiner("\n");
        auto records = csvReader!Record(input);
        foreach(r; records) {
            writeln(r);
        }
    }
Jun 22 2012
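The buffer reuse described in the post above can be observed without std.csv. This is a minimal sketch, assuming a writable working directory; the file name and the helper names `linesAliased` and `linesCopied` are hypothetical and chosen only for this illustration.

```d
import std.algorithm : map;
import std.array : array;
import std.file : remove, write;
import std.stdio : File, writeln;

// Stores the slices handed out by byLine as-is. Each slice may alias
// byLine's internal buffer, so earlier entries can be overwritten later.
char[][] linesAliased(string path)
{
    char[][] result;
    foreach (line; File(path, "r").byLine())
        result ~= line;
    return result;
}

// Copies every line before storing it, as in the map!(a => a.idup) workaround.
string[] linesCopied(string path)
{
    return File(path, "r").byLine().map!(a => a.idup).array;
}

void main()
{
    // Hypothetical scratch file, removed again on exit.
    enum path = "byline_demo.csv";
    write(path, "3.0\n4.0\n");
    scope (exit) remove(path);

    writeln(linesAliased(path)); // contents depend on buffer reuse
    writeln(linesCopied(path));
    assert(linesCopied(path) == ["3.0", "4.0"]);
}
```

The copy costs one allocation per line. Later Phobos releases also provide File.byLineCopy, which returns a fresh copy of each line and may be a tidier way to express the same thing.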
Jesse Phillips wrote:
> Yes, and it seems .joiner isn't as lazy as I'd have thought. byLine() reuses its buffer so it will overwrite previous lines in the file. This can be resolved by mapping a dup to it.

Thanks. That works. But this should be either mentioned in the documentation or fixed. I would prefer a fix because the code above looks like a workaround. Probably byLine or joiner then need some fixing. What do you think?

Jens
Jun 22 2012
Jens Mueller, in message (digitalmars.D:170448), wrote:
> Thanks. That works. But this should be either mentioned in the documentation or fixed. I would prefer a fix because the code above looks like a workaround. Probably byLine or joiner then need some fixing. What do you think?

Yes, and that increases GC usage a lot.

Looking at the implementation, joiner has a behavior that is incompatible with ranges that reuse a buffer: joiner immediately calls popFront on the range of ranges after having taken its front range. Instead, it should wait until it is necessary before calling popFront (at least until all the data has been read by the next tool in the chain).

Fixing this should not be very hard. Is there an issue preventing this change?

--
Christophe
Jun 22 2012
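The ordering problem described in the post above can be modelled with a toy range. This is only a sketch under stated assumptions: `ReusingLines`, `eager` and `deferred` are hypothetical names, the 64-byte buffer is an arbitrary limit for the demo, and none of this is the actual Phobos implementation of joiner or byLine.

```d
import std.stdio : writeln;

// Hypothetical range that, like File.byLine, hands out slices of one
// shared buffer and overwrites that buffer on every popFront.
struct ReusingLines
{
    private string[] source;
    private char[] buffer; // shared storage; demo assumes lines <= 64 bytes
    private size_t len;

    this(string[] lines)
    {
        source = lines;
        buffer = new char[](64);
        fill();
    }

    @property bool empty() const { return source.length == 0; }
    @property char[] front() { return buffer[0 .. len]; }
    void popFront() { source = source[1 .. $]; fill(); }

    private void fill()
    {
        if (source.length == 0) return;
        len = source[0].length;
        buffer[0 .. len] = source[0][];
    }
}

// Eager order: take front, pop immediately, then read the slice.
string[] eager(ReusingLines r)
{
    string[] result;
    while (!r.empty)
    {
        auto line = r.front;
        r.popFront();        // the buffer behind `line` is overwritten here
        result ~= line.idup; // so this reads the following line's bytes
    }
    return result;
}

// Deferred order: finish reading the slice, then pop.
string[] deferred(ReusingLines r)
{
    string[] result;
    while (!r.empty)
    {
        result ~= r.front.idup;
        r.popFront();
    }
    return result;
}

void main()
{
    writeln(eager(ReusingLines(["3.0", "4.0"])));
    writeln(deferred(ReusingLines(["3.0", "4.0"])));
    assert(eager(ReusingLines(["3.0", "4.0"])) == ["4.0", "4.0"]);
    assert(deferred(ReusingLines(["3.0", "4.0"])) == ["3.0", "4.0"]);
}
```

In this model the eager order yields the second line twice, which resembles the [[4], [4]] result reported earlier in the thread, while deferring popFront until the slice has been consumed preserves both lines.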
On Friday, 22 June 2012 at 15:11:14 UTC, travert phare.normalesup.org (Christophe Travert) wrote:
> Fixing this should not be very hard. Is there an issue preventing this change?

I'd say start by filing a bug that joiner does not work with File.byLine()
Jun 22 2012
On Friday, June 22, 2012 19:33:39 Jesse Phillips wrote:
> I'd say start by filing a bug that joiner does not work with File.byLine()

http://d.puremagic.com/issues/show_bug.cgi?id=8085

- Jonathan M Davis
Jun 22 2012
On 6/22/12 11:11 AM, Christophe Travert wrote:
> Looking at the implementation, joiner has a behavior that is incompatible with ranges that reuse a buffer: joiner immediately calls popFront on the range of ranges after having taken its front range. Instead, it should wait until it is necessary before calling popFront (at least until all the data has been read by the next tool in the chain).
>
> Fixing this should not be very hard. Is there an issue preventing this change?

No, it should be easily fixable.

Andrei
Jun 27 2012
On 6/22/12 10:44 AM, Jens Mueller wrote:
> Thanks. That works. But this should be either mentioned in the documentation or fixed. I would prefer a fix because the code above looks like a workaround. Probably byLine or joiner then need some fixing. What do you think?

Yah, it's a bug in joiner that Walter also found.

Andrei
Jun 27 2012