digitalmars.D.learn - CSV Data to Binary File
- TJB (46/46) Aug 07 2014 I am trying to read data in from a csv file into a struct, and
- TJB (14/60) Aug 07 2014 Some of the code got messed up when I pasted. Should be:
- Marc Schütz <schuetzm gmx.net> (58/70) Aug 07 2014 (You forgot to include the error. For other readers: It fails to
- TJB (4/62) Aug 07 2014 Thanks Marc. Not sure what to do here. I need the binary data
- Marc Schütz <schuetzm gmx.net> (39/43) Aug 07 2014 Well, in your CSV data, they don't have the right length, so you
- Marc Schütz <schuetzm gmx.net> (21/24) Aug 07 2014 Something else: The `align(1)` on your type definition specifies
- Marc Schütz <schuetzm gmx.net> (4/8) Aug 07 2014 Sorry, should have been `ex` and `mmid`. I've posted my question
- Era Scarecrow (9/13) Aug 07 2014 I can't help but think somehow that as long as the data is
I am trying to read data in from a csv file into a struct, and then turn around and write that data to binary format. Here is my code:

    import std.algorithm;
    import std.csv;
    import stdio = std.stdio;
    import std.stream;

    align(1) struct QuotesBin
    {
        int qtim;9
        int bid;
        int ofr;
        int bidsiz;
        int ofrsiz;
        short mode;
        char[1] ex;
        char[4] mmid;
    }

    void main()
    {
        string infile = "temp.csv";
        string outfile = "temp.bin";
        Stream fin = new BufferedFile(infile);
        Stream fout = new BufferedFile(outfile, FileMode.Out);

        foreach(ulong n, char[] line; fin)
        {
            auto record = csvReader!QuotesBin(line).front;
            fout.writeExact(&record, QuotesBin.sizeof);
        }

        fin.close();
        fout.close();
    }

Here is a snippet of my csv data:

    34220, 370000, 371200, 1, 1, 12, N,
    34220, 369000, 372500, 1, 11, 12, P,
    34220, 370000, 371200, 1, 2, 12, N,
    34220, 370000, 371100, 1, 33, 12, N,
    34220, 369400, 371100, 6, 3, 12, P,
    34220, 370000, 371200, 1, 2, 12, N,
    34220, 369300, 371200, 9, 2, 12, N,
    34220, 369300, 371200, 5, 2, 12, N,
    34220, 368900, 371200, 13, 2, 12, N,
    34220, 368900, 371100, 13, 1, 12, N,

For some reason this fails miserably. Can anyone help me out as to why? What do I need to do differently?

Thanks,
TJB
Aug 07 2014
On Thursday, 7 August 2014 at 15:11:48 UTC, TJB wrote:
> align(1) struct QuotesBin
> {
>     int qtim;9
>     int bid;
>     int ofr;
>     int bidsiz;
>     int ofrsiz;
>     short mode;
>     char[1] ex;
>     char[4] mmid;
> }

Some of the code got messed up when I pasted. Should be:

    align(1) struct QuotesBin
    {
        int qtim;
        int bid;
        int ofr;
        int bidsiz;
        int ofrsiz;
        short mode;
        char[1] ex;
        char[4] mmid;
    }

Thanks!
Aug 07 2014
On Thursday, 7 August 2014 at 15:14:00 UTC, TJB wrote:
> align(1) struct QuotesBin
> {
>     int qtim;
>     int bid;
>     int ofr;
>     int bidsiz;
>     int ofrsiz;
>     short mode;
>     char[1] ex;
>     char[4] mmid;
> }
>
> Thanks!

(You forgot to include the error. For other readers: It fails to compile with "template std.conv.toImpl cannot deduce function from argument types !(char[4])(string)" and similar error messages.)

This is caused by the two `char` arrays. `std.conv.to` cannot convert strings to fixed-size char arrays, probably because it's not clear what should happen if the input string is too long or too short. Would it be a good idea to support this?

As a workaround, you could declare a second struct with the same members, but `ex` and `mmid` as strings, read your data into these, and assign it to the first structure:

    import std.algorithm;
    import std.csv;
    import stdio = std.stdio;
    import std.stream;

    align(1) struct QuotesBinDummy
    {
        int qtim;
        int bid;
        int ofr;
        int bidsiz;
        int ofrsiz;
        short mode;
        string ex;
        string mmid;
    }

    align(1) struct QuotesBin
    {
        int qtim;
        int bid;
        int ofr;
        int bidsiz;
        int ofrsiz;
        short mode;
        char[1] ex;
        char[4] mmid;
    }

    void main()
    {
        string infile = "temp.csv";
        string outfile = "temp.bin";
        Stream fin = new BufferedFile(infile);
        Stream fout = new BufferedFile(outfile, FileMode.Out);

        foreach(ulong n, char[] line; fin)
        {
            auto temp = csvReader!QuotesBinDummy(line).front;
            QuotesBin record;
            record.tupleof = temp.tupleof;
            fout.writeExact(&record, QuotesBin.sizeof);
        }

        fin.close();
        fout.close();
    }

The line "record.tupleof = temp.tupleof;" will however fail with your example data, because the `ex` field includes a space in the CSV, and the last field is empty, but needs to be 4 chars long.
Aug 07 2014
Thanks Marc. Not sure what to do here. I need the binary data to be exactly the number of bytes specified by the struct. How do I handle the conversion from string to char[]?

TJB
Aug 07 2014
On Thursday, 7 August 2014 at 16:08:01 UTC, TJB wrote:
> Thanks Marc. Not sure what to do here. I need the binary data to be exactly the number of bytes specified by the struct. How do I handle the conversion from string to char[]?

Well, in your CSV data, they don't have the right length, so you have to decide how to handle that. The easiest way would be to set the length. This truncates the string if it is too long and fills it up if it is too short (note that the fill value is `char.init`, which is 0xFF in D, not "\0"):

    align(1) struct QuotesBin
    {
        int qtim;
        int bid;
        int ofr;
        int bidsiz;
        int ofrsiz;
        short mode;
        char[1] ex;
        char[4] mmid;

        this(const QuotesBinDummy rhs)
        {
            this.qtim = rhs.qtim;
            this.bid = rhs.bid;
            this.ofr = rhs.ofr;
            this.bidsiz = rhs.bidsiz;
            this.ofrsiz = rhs.ofrsiz;
            this.mode = rhs.mode;
            string tmp;
            tmp = rhs.ex;
            tmp.length = this.ex.length;
            this.ex = tmp;
            tmp = rhs.mmid;
            tmp.length = this.mmid.length;
            this.mmid = tmp;
        }
    }

    ...
    auto temp = csvReader!QuotesBinDummy(line).front;
    QuotesBin record = temp;
    fout.writeExact(&record, QuotesBin.sizeof);
    ...

But of course, whether this is correct depends on whether your binary format allows it, or requires all chars to be non-zero ASCII values.
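Another way to handle the length mismatch is to trim each CSV field and copy it into a fixed-size buffer with an explicitly chosen pad byte. The following is only a minimal sketch under stated assumptions: `toFixed` and `parseQuote` are hypothetical helper names that do not appear in the thread, `'\0'` as the pad byte is an assumption about the binary format, and the line is split by hand with `std.array.split` instead of `csvReader` so that the spaces after the commas can be stripped before conversion.

```d
import std.algorithm : min;
import std.array : split;
import std.conv : to;
import std.exception : enforce;
import std.string : strip;

align(1) struct QuotesBin
{
align(1):
    int qtim;
    int bid;
    int ofr;
    int bidsiz;
    int ofrsiz;
    short mode;
    char[1] ex;
    char[4] mmid;
}

/// Hypothetical helper: copy `s` (whitespace trimmed) into a char[N],
/// truncating if it is too long and filling the rest with `pad` if too short.
char[N] toFixed(size_t N)(const(char)[] s, char pad = '\0')
{
    char[N] result = pad;                 // every element starts out as `pad`
    const trimmed = s.strip();
    immutable n = min(trimmed.length, N);
    result[0 .. n] = trimmed[0 .. n];     // copy at most N characters
    return result;
}

/// Hypothetical helper: parse one line such as
/// "34220, 370000, 371200, 1, 1, 12, N," into a QuotesBin.
QuotesBin parseQuote(const(char)[] line)
{
    auto f = line.split(",");             // a trailing comma yields an empty last field
    enforce(f.length >= 7, "expected at least 7 comma-separated fields");

    QuotesBin q;
    q.qtim   = f[0].strip.to!int;
    q.bid    = f[1].strip.to!int;
    q.ofr    = f[2].strip.to!int;
    q.bidsiz = f[3].strip.to!int;
    q.ofrsiz = f[4].strip.to!int;
    q.mode   = f[5].strip.to!short;
    q.ex     = toFixed!1(f[6]);
    q.mmid   = toFixed!4(f.length > 7 ? f[7] : "");  // missing or empty field becomes all pad bytes
    return q;
}
```

With such a helper, the loop body from the earlier code could become `auto record = parseQuote(line); fout.writeExact(&record, QuotesBin.sizeof);`. Passing a different `pad` (for example `' '`) is a one-argument change if the format turns out to require it.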
Aug 07 2014
On Thursday, 7 August 2014 at 16:08:01 UTC, TJB wrote:
> Thanks Marc. Not sure what to do here. I need the binary data to be exactly the number of bytes specified by the struct.

Something else: The `align(1)` on your type definition specifies the alignment of the entire struct, but has no effect on the alignment of its fields relative to the beginning. You probably want this:

    align(1) struct QuotesBin
    {
    align(1):
        int qtim;
        int bid;
        int ofr;
        int bidsiz;
        int ofrsiz;
        short mode;
        char[1] ex;
        char[4] mmid;
    }

This aligns the struct as a whole, and all its fields, at byte boundaries. Without the second `align(1)`, there should be a gap between `mode` and `ex`. Strangely enough, when I test it, there's none. Will have to ask...
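One way to see what the compiler actually does with the layout is to assert on `.offsetof`, `.sizeof` and `.alignof` at compile time. The sketch below assumes the packed definition with both `align(1)` attributes; the expected numbers are simply the running sum of the field sizes (five 4-byte `int`s, one 2-byte `short`, then 1 + 4 `char`s, 27 bytes in total).

```d
align(1) struct QuotesBin
{
align(1):
    int qtim;
    int bid;
    int ofr;
    int bidsiz;
    int ofrsiz;
    short mode;
    char[1] ex;
    char[4] mmid;
}

// Packed layout: every field starts right after the previous one.
static assert(QuotesBin.qtim.offsetof   == 0);
static assert(QuotesBin.bid.offsetof    == 4);
static assert(QuotesBin.ofr.offsetof    == 8);
static assert(QuotesBin.bidsiz.offsetof == 12);
static assert(QuotesBin.ofrsiz.offsetof == 16);
static assert(QuotesBin.mode.offsetof   == 20);
static assert(QuotesBin.ex.offsetof     == 22);
static assert(QuotesBin.mmid.offsetof   == 23);

// No trailing padding either: the record is exactly 27 bytes.
static assert(QuotesBin.sizeof  == 27);
static assert(QuotesBin.alignof == 1);
```

For this particular field order, each offset already happens to be a multiple of the field's natural alignment (`char` has an alignment of 1), which would be consistent with seeing no gap between the fields in a test.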
Aug 07 2014
On Thursday, 7 August 2014 at 17:12:35 UTC, Marc Schütz wrote:
> This aligns the struct as a whole, and all its fields, at byte boundaries. Without the second `align(1)`, there should be a gap between `mode` and `ex`. Strangely enough, when I test it, there's none. Will have to ask...

Sorry, should have been `ex` and `mmid`. I've posted my question here:
http://forum.dlang.org/post/bkearrybmwguqrliexsw forum.dlang.org
Aug 07 2014
On Thursday, 7 August 2014 at 15:11:48 UTC, TJB wrote:
> Here is a snippet of my csv data:
>
> 34220, 370000, 371200, 1, 1, 12, N,
> 34220, 369000, 372500, 1, 11, 12, P,
> 34220, 370000, 371200, 1, 2, 12, N,

I can't help but think that, as long as the data is numbers or words, scanf would be useful even if it's a C function...

Someone mentioned the final empty field; this makes me scratch my head...

And a struct of exactly 27 bytes... I'd probably pad that to 28 or 32 if possible, which allows you to expand your definition later, not to mention being 32-bit aligned (if speed becomes important).
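The padding idea could be sketched as a second record type with reserved bytes at the end. This is only an illustration: the name `QuotesBinPadded`, the `reserved` field and the 32-byte target are assumptions, and it changes the on-disk format, so it only applies if the format is not already fixed at 27 bytes.

```d
struct QuotesBinPadded
{
    int qtim;
    int bid;
    int ofr;
    int bidsiz;
    int ofrsiz;
    short mode;
    char[1] ex;
    char[4] mmid;
    ubyte[5] reserved;   // explicit padding: 27 data bytes + 5 = 32
}

// The record size is now a multiple of 4, so consecutive records in a
// buffer keep their int fields 32-bit aligned.
static assert(QuotesBinPadded.reserved.offsetof == 27);
static assert(QuotesBinPadded.sizeof == 32);
static assert(QuotesBinPadded.sizeof % int.alignof == 0);
```

Spelling the padding out as a named field, rather than leaving it to the compiler, keeps the file layout visible in the source and leaves room to give some of those bytes a meaning later.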
Aug 07 2014