digitalmars.D.learn - Multidimensional dynamic array of strings initialized with split()
- Ludovit Lucenic (23/23) Sep 04 2013 Hello friends,
- H. S. Teoh (11/41) Sep 04 2013 [...]
- Ludovit Lucenic (4/14) Sep 04 2013 Thank you so much for your explanation.
- Ludovit Lucenic (2/2) Sep 05 2013 I have created a wiki on this one.
- Ali Çehreli (30/32) Sep 05 2013 Compiling with "DMD64 D Compiler v2.064-devel-52cc287" produces the
- Ludovit Lucenic (6/36) Sep 22 2017 Thank you for pointing out the errors, Ali.
Hello friends,

with the following code

    import std.stdio;
    import std.array;

    auto file71 = File(argv[2], "r");
    string[][] buffer;
    foreach (line; file71.byLines) {
        buffer ~= split(line, "\t");
    }

I am trying to cut the lines from the file with tab as delimiter to pre-fetch the content of a file before further processing.

Each split() call gives correct string[] values in and of itself. But when I try to read buffer, after the loop, I got corrupted data, like this:

    [ ["-", "_Unit226", "constructor", "sub_00BE896C\t1\t?:?\t\t//con", "t", "uc...

Obviously the concatenation is doing no good, since there are tabs in the values...

What am I missing here ? Is it that split() allocated memory that gets overwritten in the loop and the ~= just copies the subarrays not copying the subsubarrays ? How to overcome this ?

Thank you very much,
Ludovit
Sep 04 2013
On Thu, Sep 05, 2013 at 12:57:34AM +0200, Ludovit Lucenic wrote:
> Hello friends,
>
> with the following code
>
>     import std.stdio;
>     import std.array;
>
>     auto file71 = File(argv[2], "r");
>     string[][] buffer;
>     foreach (line; file71.byLines) {
>         buffer ~= split(line, "\t");
>     }
>
> I am trying to cut the lines from the file with tab as delimiter to pre-fetch the content of a file before further processing.
>
> Each split() call gives correct string[] values in and of itself. But when I try to read buffer, after the loop, I got corrupted data, like this:
>
>     [ ["-", "_Unit226", "constructor", "sub_00BE896C\t1\t?:?\t\t//con", "t", "uc...
>
> Obviously the concatenation is doing no good, since there are tabs in the values...
>
> What am I missing here ? Is it that split() allocated memory that gets overwritten in the loop and the ~= just copies the subarrays not copying the subsubarrays ? How to overcome this ?
[...]

The problem is that File.byLine() reuses its buffer for efficiency, and split is optimized to return slices into that buffer instead of copying each substring. So after every iteration the buffer (and therefore the slices into it) gets overwritten.

Replace the loop body with the following and it should work:

    buffer ~= split(line.dup, "\t");

T

--
Dogs have owners ... cats have staff. -- Krista Casada
Sep 04 2013
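The aliasing described above can be reproduced without a file. The following is a minimal sketch, not code from the thread: `lineBuffer` is only a stand-in for the buffer that byLine reuses, and the function names `splitWithoutCopy` and `splitAfterCopy` are made up for illustration. It uses `.idup` rather than `.dup` so that the copied fields have type `string[]`, which is what a `string[][]` table expects.

```d
import std.algorithm : equal;
import std.array : split;

/// Splits a mutable buffer, then overwrites the buffer in place, roughly the
/// way byLine does when it reads the next line. The returned slices still
/// point into that buffer.
char[][] splitWithoutCopy()
{
    char[] lineBuffer = "a\tb\tc".dup;          // stand-in for byLine's buffer
    char[][] fields = lineBuffer.split("\t");   // slices into lineBuffer, no copy

    // Simulate the next line being read into the same memory.
    lineBuffer[0] = 'x';
    lineBuffer[2] = 'y';
    lineBuffer[4] = 'z';
    return fields;
}

/// Same sequence of steps, but the line is copied before it is split.
string[] splitAfterCopy()
{
    char[] lineBuffer = "a\tb\tc".dup;
    string[] fields = lineBuffer.idup.split("\t"); // slices into a private copy

    lineBuffer[0] = 'x';
    lineBuffer[2] = 'y';
    lineBuffer[4] = 'z';
    return fields;
}

unittest
{
    // Without a copy, the stored fields follow the overwritten buffer.
    assert(splitWithoutCopy().equal(["x", "y", "z"]));
    // With a copy, the stored fields keep the original content.
    assert(splitAfterCopy().equal(["a", "b", "c"]));
}
```

One way to run the checks is `dmd -unittest -main -run demo.d`, where `demo.d` is whatever file the sketch is saved in.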
On Wednesday, 4 September 2013 at 23:06:10 UTC, H. S. Teoh wrote:
> The problem is that File.byLine() reuses its buffer for efficiency, and split is optimized to return slices into that buffer instead of copying each substring. So after every iteration the buffer (and therefore the slices into it) gets overwritten.
>
> Replace the loop body with the following and it should work:
>
>     buffer ~= split(line.dup, "\t");
>
> T

Thank you so much for your explanation. Helped me a lot to understand things and works actually :-)

LL
Sep 04 2013
I have created a wiki on this one.

http://wiki.dlang.org/Read_table_data_from_file
Sep 05 2013
On 09/05/2013 01:14 AM, Ludovit Lucenic wrote:
> I have created a wiki on this one.
>
> http://wiki.dlang.org/Read_table_data_from_file

Compiling with "DMD64 D Compiler v2.064-devel-52cc287" produces the following errors:

* You had byLines in your original code as well. Shouldn't it be byLine?

* You are missing the closing brace of the foreach loop as well.

* "Error: cannot append type char[][] to type string[][]"

  I have to replace .dup with .idup

The following version is lazy:

    import std.stdio;
    import std.array;
    import std.algorithm;

    auto readInData(File inputFile, string fieldSeparator)
    {
        return inputFile
            .byLine
            .map!(line => line
                  .idup
                  .split("\t"));
    }

The caller can either use the result lazily:

    import std.range;

    void main()
    {
        auto file = File("deneme.txt");
        writeln(readInData(file, "\t").take(2));
    }

Or call .array on the result to consume the range eagerly:

    auto table = readInData(file, "\t").array;

Ali
Sep 05 2013
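Putting the three corrections together (byLine, the closing brace, and `.idup`), the eager loop and the lazy pipeline might look like the sketch below. This is an illustration under assumptions rather than the wiki's actual code: the names `readTable` and `readRowsLazily` and the scratch file `read_table_demo.tsv` are hypothetical. Unlike the version quoted in the thread, the sketch passes `fieldSeparator` through to split instead of hard-coding "\t".

```d
import std.algorithm : map;
import std.array : array, split;
import std.stdio : File;

/// Eager variant (hypothetical helper): reads the whole file into rows of fields.
string[][] readTable(string path, string fieldSeparator = "\t")
{
    string[][] rows;
    foreach (line; File(path, "r").byLine)
    {
        // idup copies the reused line buffer into an immutable string, so the
        // slices produced by split stay valid after the next line is read.
        rows ~= line.idup.split(fieldSeparator);
    }
    return rows;
}

/// Lazy variant (hypothetical helper): each row is built only when the
/// caller iterates over the returned range.
auto readRowsLazily(File input, string fieldSeparator = "\t")
{
    return input.byLine.map!(line => line.idup.split(fieldSeparator));
}

unittest
{
    import std.file : remove, write;

    enum path = "read_table_demo.tsv"; // hypothetical scratch file
    write(path, "a\tb\tc\n1\t2\t3\n");
    scope (exit) remove(path);

    auto expected = [["a", "b", "c"], ["1", "2", "3"]];
    assert(readTable(path) == expected);
    // Calling .array consumes the lazy range eagerly.
    assert(readRowsLazily(File(path)).array == expected);
}
```

The lazy variant avoids holding the whole table in memory when the caller only needs a few rows, while the eager one is simpler when every row is needed anyway. Newer Phobos releases also provide `File.byLineCopy`, which hands out a fresh copy of each line and would make the explicit `.idup` unnecessary.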
On Thursday, 5 September 2013 at 16:22:46 UTC, Ali Çehreli wrote:
> Compiling with "DMD64 D Compiler v2.064-devel-52cc287" produces the following errors:
>
> * You had byLines in your original code as well. Shouldn't it be byLine?
>
> * You are missing the closing brace of the foreach loop as well.
>
> * "Error: cannot append type char[][] to type string[][]"
>
>   I have to replace .dup with .idup

Thank you for pointing out the errors, Ali. I have updated the example.

> The following version is lazy:
>
>     import std.stdio;
>     import std.array;
>     import std.algorithm;
>
>     auto readInData(File inputFile, string fieldSeparator)
>     {
>         return inputFile
>             .byLine
>             .map!(line => line
>                   .idup
>                   .split("\t"));
>     }
>
> The caller can either use the result lazily:
>
>     import std.range;
>
>     void main()
>     {
>         auto file = File("deneme.txt");
>         writeln(readInData(file, "\t").take(2));
>     }
>
> Or call .array on the result to consume the range eagerly:
>
>     auto table = readInData(file, "\t").array;
>
> Ali

Thank you for the alternative approaches. This thread is linked from the Credits section of the wiki page, in case someone coming from the wiki wants to find out more on the topic.
Sep 22 2017