digitalmars.D.learn - csvReader: how to read only selected columns while the class Layout

mw <mingwu gmail.com> writes:
Hi,

I'm following the example on

https://dlang.org/phobos/std_csv.html

```
     class Layout
     {
         int value;
         double other;
         string name;
         // int extra_field;  // un-comment to see the error
     }

void main()
{
     import std.csv;
     import std.stdio: write, writeln, writef, writefln;
     import std.algorithm.comparison : equal;
     string text = "a,b,c\nHello,65,2.5\nWorld,123,7.5";

     auto records =
         text.csvReader!Layout(["b","c","a"]);  // Read only these columns
     foreach (r; records) writeln(r.name);
}
```

This works fine so far, but if I un-comment the extra_field line, 
I get a runtime error:

```
core.exception.ArrayIndexError@/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d(1209): index [3] is out of bounds for array of length 3
----------------
??:? _d_arraybounds_indexp [0x5565b4b974d1]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:1209 pure @safe void std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader.prime() [0x5565b4b73ed2]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:1154 pure @safe void std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader.popFront() [0x5565b4b73c80]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:1069 pure ref @safe std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader.__ctor(immutable(char)[], immutable(char)[][], dchar, dchar, bool) [0x5565b4b73ae8]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:366 pure @safe std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader std.csv.csvReader!(onlineapp.Layout, 1, immutable(char)[], immutable(char)[][], char).csvReader(immutable(char)[], immutable(char)[][], char, char, bool) [0x5565b4b735f3]
./onlineapp.d:18 _Dmain [0x5565b4b72ca4]
```

I'm just wondering how to work around this?

Thanks.
Oct 02 2022
mw <mingwu gmail.com> writes:
```
         text.csvReader!Layout(["b","c","a"]);  // Read only these columns
```

The intention is very clear: only read the selected columns from 
the csv, and for any other fields of class Layout, just ignore 
(with the default D .init value).
Oct 02 2022
rassoc <rassoc posteo.de> writes:
On 10/2/22 21:48, mw via Digitalmars-d-learn wrote:
 ```
          text.csvReader!Layout(["b","c","a"]);  // Read only these columns
 ```
 
 The intention is very clear: only read the selected columns from the csv, and
for any other fields of class Layout, just ignore (with the default D .init
value).
 
Here's why it's not currently working:

"An optional header can be provided. The first record will be read in as the header. If Contents is a struct then the header provided is expected to correspond to the fields in the struct."

"expected to correspond" means that the number of fields in the content struct can't exceed the header element count, as you can see in the actual code [1]:

```
foreach (ti, ToType; Fields!(Contents))
{
    if (indices[ti] == colIndex) // indices.length depends on passed in colHeaders.length
    ...
}
```

The current index exception is bad; this needs an assert in the constructor with a nicer error message.

But say, I'm curious, what's the purpose of adding an optional/useless contents field? What's the use-case here?

[1] https://github.com/dlang/phobos/blob/8e8aaae5080ccc2e0a2202cbe9778dca96496a95/std/csv.d#L1209
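
Until the library reports this more clearly, a caller-side guard can turn the out-of-bounds index into a readable error. The following is only a sketch: `checkedCsvReader` is a hypothetical wrapper (not part of Phobos), and it assumes the rule described above, namely that the contents type must not have more fields than the header list has names.

```d
import std.csv : csvReader;
import std.exception : enforce;
import std.format : format;
import std.stdio : writeln;
import std.traits : Fields;

class Layout
{
    int value;
    double other;
    string name;
    int extra_field; // not present in the CSV header list
}

// Hypothetical wrapper, not a Phobos API: checks the field count against
// the header list before handing over to std.csv.csvReader.
auto checkedCsvReader(Contents)(string input, string[] header)
{
    enum fieldCount = Fields!Contents.length;
    enforce(fieldCount <= header.length,
        format("%s has %s fields but only %s header names were given",
               Contents.stringof, fieldCount, header.length));
    return input.csvReader!Contents(header);
}

void main()
{
    string text = "a,b,c\nHello,65,2.5\nWorld,123,7.5";
    try
    {
        foreach (r; checkedCsvReader!Layout(text, ["b", "c", "a"]))
            writeln(r.name);
    }
    catch (Exception e)
    {
        // Layout has four fields, so the guard fires before any parsing.
        writeln("rejected: ", e.msg);
    }
}
```

This does not make the extra field work; it only fails early with a message that names the mismatch.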
Oct 02 2022
mw <mingwu gmail.com> writes:
On Sunday, 2 October 2022 at 21:03:40 UTC, rassoc wrote:

 But say, I'm curious, what's the purpose of adding an 
 optional/useless contents field? What's the use-case here?
We have a class/struct for a data record, some of its data fields need to be saved/loaded from CSV files; while there are other helper fields which are useful for various computation tasks (e.g. caching some intermediate computation results), these fields do not need to be saved/loaded from the csv files. A CSV library should consider all the use cases, and allow users to ignore certain fields.
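
One possible way to get this behaviour with the current std.csv, sketched under assumptions: parse into a narrow struct that holds only the CSV-backed fields, then copy into the full record. `CsvRow` and `loadLayouts` are hypothetical names introduced for illustration; `Layout` and the header list are the ones from the first post.

```d
import std.csv : csvReader;
import std.stdio : writeln;

// Full record: three CSV-backed fields plus a helper field.
class Layout
{
    int value;
    double other;
    string name;
    int extra_field; // never read from the CSV; stays at its .init value
}

// Hypothetical helper type: only the fields stored in the file, declared
// in the same order as the header names passed to csvReader below.
struct CsvRow
{
    int value;    // column "b"
    double other; // column "c"
    string name;  // column "a"
}

// Hypothetical helper: parse with the narrow struct, then copy each row
// into a freshly allocated Layout.
Layout[] loadLayouts(string text)
{
    Layout[] result;
    foreach (row; text.csvReader!CsvRow(["b", "c", "a"]))
    {
        auto rec = new Layout;
        rec.value = row.value;
        rec.other = row.other;
        rec.name = row.name;
        result ~= rec;
    }
    return result;
}

void main()
{
    string text = "a,b,c\nHello,65,2.5\nWorld,123,7.5";
    foreach (r; loadLayouts(text))
        writeln(r.name, " ", r.value, " ", r.other, " ", r.extra_field);
}
```

The cost is a second type that has to be kept in sync with the CSV-backed part of `Layout`; the helper fields are simply never touched by the reader.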
Oct 02 2022
rassoc <rassoc posteo.de> writes:
On 10/2/22 23:18, mw via Digitalmars-d-learn wrote:
 A CSV library should consider all the use cases, and allow users to ignore
certain fields.
Filed issue: https://issues.dlang.org/show_bug.cgi?id=23383

Let's see what others have to say.
Oct 02 2022
jmh530 <john.michael.hall gmail.com> writes:
On Sunday, 2 October 2022 at 21:18:43 UTC, mw wrote:
 [snipping]

 A CSV library should consider all the use cases, and allow 
 users to ignore certain fields.
In R, you have to force `NULL` for `colClasses` for the other columns. In other words, the user has to know the number of columns of the csv file in order to be able to skip them.

https://stackoverflow.com/questions/29332713/how-to-skip-column-when-reading-csv-file
Oct 03 2022
Salih Dincer <salihdb hotmail.com> writes:
On Sunday, 2 October 2022 at 19:48:52 UTC, mw wrote:
 ```
         text.csvReader!Layout(["b","c","a"]);  // Read only these columns
 ```

 The intention is very clear: only read the selected columns 
 from the csv, and for any other fields of class Layout, just 
 ignore (with the default D .init value).
Why don't you do this? For example, you can try the following:

```d
import std.csv, std.math.algebraic : abs;

    string str = "a,b,c\nHello,65,63.63\n➊➋➂❹,123,3673.562";
    struct Layout
    {
        int value;
        double other;
        string name;
    }

    auto records = csvReader!Layout(str, ["b","c","a"]);

    Layout[2] ans;
    ans[0].name = "Hello";
    ans[0].value = 65;
    ans[0].other = 63.63;
    ans[1].name = "➊➋➂❹";
    ans[1].value = 123;
    ans[1].other = 3673.562;

    int count;
    foreach (record; records)
    {
        assert(ans[count].name == record.name);
        assert(ans[count].value == record.value);
        assert(abs(ans[count].other - record.other) < 0.00001);
        count++;
    }
    assert(count == ans.length);
```

SDB 79
Oct 03 2022
mw <mingwu gmail.com> writes:
On Monday, 3 October 2022 at 18:02:51 UTC, Salih Dincer wrote:
 On Sunday, 2 October 2022 at 19:48:52 UTC, mw wrote:
 ```
         text.csvReader!Layout(["b","c","a"]);  // Read only these columns
 ```

 The intention is very clear: only read the selected columns 
 from the csv, and for any other fields of class Layout, just 
 ignore (with the default D .init value).
 Why don't you do this? For example, you can try the following:

 ```d
 import std.csv, std.math.algebraic : abs;

     string str = "a,b,c\nHello,65,63.63\n➊➋➂❹,123,3673.562";
     struct Layout
     {
         int value;
         double other;
         string name;
     }
 ```

You didn't get my question. Please add:

```
int extra_field; // un-comment to see the error
```

to the struct; then you will see the error.
Oct 03 2022