digitalmars.D.learn - A bit of binary I/O
- Heinz (58/58) Jan 20 2007 Hi guys, i'm having great fun writing and reading binary files. It's my ...
- Frank Benoit (keinfarbton) (11/17) Jan 20 2007 09 00 00 00 00 00 00 00 // the ulong with value 9
- Heinz (2/26) Jan 20 2007 I get it, but if i'm actually writing the address of my data and not the...
- Frank Benoit (keinfarbton) (4/5) Jan 20 2007 Hehe, this works because the string is still in memory. And then you
- Heinz (5/33) Jan 20 2007 I think i'm getting it, the data retrieved are addresses to the start of...
- Jarrett Billingsley (61/115) Jan 20 2007 You're writing the string wrong. All you're doing is writing the length...
- Heinz (5/162) Jan 20 2007 Wow, that covers all, thanks for your reply.
- janderson (6/12) Jan 20 2007 You have to use some form of encryption. XOR encryption is one of the
- Chris Nicholson-Sauls (64/224) Jan 20 2007 Well technically it will write it as UTF8, which is as near to ASCII as ...
- Jarrett Billingsley (8/11) Jan 20 2007 Yeah, that's perfectly fine as long as the structure doesn't contain any...
- Heinz (2/17) Jan 20 2007 The base64 in phobos also could be useful.
- Heinz (4/4) Jan 20 2007 In C++ you can write an entire structure to a binary file:
- Chris Nicholson-Sauls (4/10) Jan 20 2007 Sure, and it will work between instances of the program so long as none ...
- Heinz (6/18) Jan 20 2007 So, you mean i can't have this structure because i has an array?
- janderson (33/55) Jan 20 2007 Right, because these arrays are essentially a pointer to data somewhere
- Heinz (2/14) Jan 20 2007 What about classes can they be written under the same rules?
- Jarrett Billingsley (4/6) Jan 20 2007 Class instances == object variables. All instances of classes are
- Heinz (8/31) Jan 20 2007 Sorry, i pressed tab and then enter and posted before i typed the struct...
- Frank Benoit (keinfarbton) (3/3) Jan 20 2007 You can take a look at the source of a serialisation library. E.g. see
- Christian Kamm (5/8) Jan 25 2007 Up to date versions of that library are found at
Hi guys, I'm having great fun writing and reading binary files. It's my first time doing this and I've got a few questions in mind. I write the same data (1 ulong and 1 string, I call them primitives) in 3 different ways and I get a different output for one of them. I create 1 file per method. If you open the created files with a hex editor you can see this.

The first way is to write the primitives manually, one by one:

    // primitive way
    ulong i = 9;
    char[] s = "hello world";
    myFile.writeExact(&i, i.sizeof);
    myFile.writeExact(&s, s.sizeof);

Reading data:

    // Is done by reading each primitive.
    ulong i2;
    char[] s2;
    myFile.readExact(&i2, i2.sizeof);
    myFile.readExact(&s2, s2.sizeof);

The second way is to write a structure with all the primitives as members:

    // struct way
    struct t
    {
        ulong i;
        char[] s;
    }
    t mt;
    mt.i = 9;
    mt.s = "hello world";
    myFile.writeExact(&mt, mt.sizeof);

Reading data:

    // We read the entire struct.
    t mt2;
    myFile.readExact(&mt2, mt2.sizeof);

And the third way is to write a class with all the primitives as members:

    // class way
    class tt
    {
        ulong i;
        char[] s;
    }
    tt mtt = new tt();
    mtt.i = 9;
    mtt.s = "hello world";
    ResFile.writeExact(&mtt, mtt.sizeof);

Reading data:

    // We read the entire class.
    tt mtt2;
    myFile.readExact(&mtt2, mtt2.sizeof);

All of these methods work perfectly. I'm able to retrieve values from all of them. Now let's look at the outputs:

    // Primitive
    09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00
    // Structure
    09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00
    // Class
    C0 3F 91 00

My questions are:

1) What's the best method to write data (in terms of data protection/encryption against reversion)? The class way seems to me at first look the most secure way.
2) Which method is the fastest at retrieving data?
3) How the hell does this work? I mean, the string s is 10 chars long but the first 2 methods use only 8 bytes to store the string, and most of them are 0. Even more interesting, look at the class method: it uses only 4 bytes to store about 18 bytes of real data! WTF. I'm really confused.

This is a very interesting subject to me and if someone could clear my mind I would appreciate it very much. Thank you very much in advance.

Heinz
Jan 20 2007
> // Primitive
> 09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00

    09 00 00 00 00 00 00 00 // the ulong with value 9
    0B 00 00 00             // array size 11
    A0 C7 41 00             // pointer value to the start of data

> // Structure
> 09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00

same here

> // Class
> C0 3F 91 00

the first 4 bytes of your class. mtt.sizeof is the size of the reference, not the size of the object itself.

s.ptr is the pointer to the array data. &s is the address of the struct that holds the array length and the pointer to the data.

To write the string, you might want to try this:

    myFile.writeExact( s.ptr, s.length );
Jan 20 2007
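A note for readers trying this today: the difference between `&s` and `s.ptr` described above can be made visible without touching a file. This is only a minimal sketch, assuming a current D compiler; `sliceHeaderBytes` and `payloadBytes` are names made up for illustration, not part of any library. On a 64-bit target the slice variable is 16 bytes rather than the 8 bytes seen in the hex dump above, which appears to come from a 32-bit build.

```d
import std.stdio : writeln;

// Hypothetical helper: the bytes of the slice variable itself
// (length + pointer), i.e. what writeExact(&s, s.sizeof) would see.
const(ubyte)[] sliceHeaderBytes(ref char[] s)
{
    return (cast(const(ubyte)*) &s)[0 .. s.sizeof];
}

// Hypothetical helper: the bytes of the characters the slice refers to,
// i.e. what writeExact(s.ptr, s.length) would see.
const(ubyte)[] payloadBytes(char[] s)
{
    return cast(const(ubyte)[]) s;
}

void main()
{
    char[] s = "hello world".dup;

    // Two machine words: the length and the pointer.
    writeln("header bytes:  ", sliceHeaderBytes(s).length);

    // One byte per character of the actual text.
    writeln("payload bytes: ", payloadBytes(s).length);
    writeln(cast(const(char)[]) payloadBytes(s));
}
```

Only the payload bytes mean anything to another process; the header bytes contain an address that is valid only while this program is running.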
Frank Benoit (keinfarbton) Wrote:
> To write the string, you might want to try this: myFile.writeExact( s.ptr, s.length );

I get it, but if I'm actually writing the address of my data and not the data itself, then why am I able to retrieve the data even if it's not there?
Jan 20 2007
> I get it, but if i'm actually writing the address of my data and not the data itself then why i'm able to retrieve the data even if it's not there?

Hehe, this works because the string is still in memory. You read the pointer address back from the file and overwrite the other array's data pointer with it. Now s2 points to the data of s.

If you do the read in a second program run, it will probably not work.
Jan 20 2007
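The "it only works because the string is still in memory" effect can be reproduced in a few lines. The following is a sketch under assumptions: a small byte buffer stands in for the file, `memcpy` stands in for writeExact/readExact, and `readBackAliasesOriginal` is a made-up name.

```d
import core.stdc.string : memcpy;

// Hypothetical demo: "write" the slice variable, "read" it back into a
// second slice, and check that both now refer to the same characters.
bool readBackAliasesOriginal()
{
    enum headerSize = (char[]).sizeof;
    char[] s = "hello world".dup;

    ubyte[headerSize] fileContents;            // stands in for the file
    memcpy(fileContents.ptr, &s, s.sizeof);    // stores length + pointer only

    char[] s2;
    memcpy(&s2, fileContents.ptr, s2.sizeof);  // restores length + pointer

    // Changing s is visible through s2: no characters were ever copied.
    s[0] = 'J';
    return s2.ptr is s.ptr && s2 == "Jello world";
}

void main()
{
    assert(readBackAliasesOriginal());
}
```

In a second program run the stored pointer would refer to memory that program never set up, which is why the read-back is not expected to work there.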
Frank Benoit (keinfarbton) Wrote:
> If you do the read in a second program run, it will probably not work.

I think I'm getting it: the data retrieved are addresses to the start of the data, but in my RAM, so if I take this file to another computer the data received should be different. Am I right? To solve this and write the real data you suggest using .ptr. Is this property available in every object?

I'm sorry to bother you so much, Frank: I'm interested in your opinion about the other 2 questions. Really, thanks man, you rule.
Jan 20 2007
"Heinz" <billgates microsoft.com> wrote in message news:eou69k$8tf$1 digitaldaemon.com...
> The first way is to write primitives manually one by one:
> [...]
> myFile.writeExact(&s, s.sizeof);
> [...]
> myFile.readExact(&s2, s2.sizeof);

You're writing the string wrong. All you're doing is writing the length and pointer of the array data, without actually writing the data. The Stream class (and by extension, the File class) provides functions for writing out every basic type:

    ulong i = 9;
    char[] s = "hello world";
    myFile.write(i);
    myFile.write(s);
    ...
    ulong i2;
    char[] s2;
    myFile.read(i2);
    myFile.read(s2);

> The second way is to write a structure with all the primitives as members:
> [...]
> myFile.writeExact(&mt, mt.sizeof);
> [...]
> myFile.readExact(&mt2, mt2.sizeof);

Again, you're just writing out the array reference without writing its contents. You have to write out each member individually. If there were no reference types in the struct, this would work fine.

> And the third way is to write a class with all the primitives as members:
> [...]
> ResFile.writeExact(&mtt, mtt.sizeof);
> [...]
> myFile.readExact(&mtt2, mtt2.sizeof);

This is incorrect, and is only working because of how you've written your program. You're not writing the data out at all, you're writing a class reference. The 00913FC0 is just the memory address of the class instance that mtt points to, and when you read that address back in, you're just looking at the data in memory. This program wouldn't work if you wrote the file, exited, then had another program read the data. You'd end up with a memory access violation, and none of the data in the class is actually written out.

If you want to write a class out to a file, a common way is to have some kind of generic "serialize" and "unserialize" functions for the class:

    class C
    {
        ulong i;
        char[] s;

        // The stream parameter is named st so it does not shadow the member s.
        void serialize(Stream st)
        {
            st.write(i);
            st.write(s);
        }

        static C unserialize(Stream st)
        {
            C c = new C();
            st.read(c.i);
            st.read(c.s);
            return c;
        }
    }
    ...
    C c = new C();
    c.i = 5;
    c.s = "foo";
    c.serialize(myFile);
    ...
    C c2 = C.unserialize(myFile);

> 1) What's the best method to write data (in terms of data protection/encryption against reversion). The class way seems to me at first look the most secure way.

As explained before, the class method is wrong, and there is no encryption going on here. It's just a memory address, and you should never, ever write memory addresses to a file. That being said, the best way is probably to just use the primitive .read and .write methods of File. Just never, ever write pointers or references of any kind to a file.

> 2) Wich method is the faster in retrieving data?

If you implement them correctly, all three sample programs should produce the exact same output file using the same number of writes (and read it in the same number of reads), so they are all the same in terms of performance.
Jan 20 2007
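For readers on a current D compiler, where (as far as I know) the std.stream module used in this thread is no longer shipped with Phobos, the same "write each member, and write the string's length before its characters" idea can be sketched with plain byte buffers. This is an illustration under assumptions, not the Stream class's actual file format: the `Record` type, the little-endian layout, and the 32-bit length prefix are all choices made for the example.

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;

// Hypothetical record mirroring the ulong + string pair from the thread.
struct Record
{
    ulong i;
    char[] s;
}

// Layout assumed for this sketch: 8 bytes of i, 4 bytes of string length,
// then the string's characters, all little-endian.
ubyte[] serialize(const Record r)
{
    ubyte[] buf;
    ubyte[8] iBytes = nativeToLittleEndian(r.i);
    ubyte[4] lenBytes = nativeToLittleEndian(cast(uint) r.s.length);
    buf ~= iBytes[];
    buf ~= lenBytes[];
    buf ~= cast(const(ubyte)[]) r.s;   // the characters, not the pointer
    return buf;
}

Record deserialize(const(ubyte)[] buf)
{
    Record r;
    ubyte[8] iBytes = buf[0 .. 8];
    r.i = littleEndianToNative!ulong(iBytes);
    ubyte[4] lenBytes = buf[8 .. 12];
    const len = littleEndianToNative!uint(lenBytes);
    r.s = cast(char[]) buf[12 .. 12 + len].dup;  // fresh copy of the text
    return r;
}

void main()
{
    auto bytes = serialize(Record(9, "hello world".dup));
    auto back = deserialize(bytes);
    assert(back.i == 9 && back.s == "hello world");
}
```

The resulting buffer can then be handed to whatever file API is in use, for example `std.file.write` and `std.file.read`. Nothing in it depends on memory addresses, so it can be read back by a different run of the program.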
Jarrett Billingsley Wrote:
> That being said, the best way is probably to just use the primitive .read and .write methods of File.

Wow, that covers it all, thanks for your reply. But can I still write an entire structure with writeExact()? Or do you suggest writing each member of the structure with write()?

Another question: does writing a char[] with write() write the string as ASCII? If so, then it is a legible string. How can I protect that data?

Thanks man
Jan 20 2007
Heinz wrote:
> Another question: Writting a type char[] with write() writes string as ASCII? if so then is a legible string, how can i protect that data? Thanks man

You have to use some form of encryption. XOR encryption is one of the simplest, although not the most secure. Here's a C doc about it: http://www.cprogramming.com/tutorial/xor.html

Maybe there's already an encryption library in D?

-Joel
Jan 20 2007
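To make the XOR suggestion concrete, here is a minimal sketch in D. The function name `xorWithKey` and the repeating-key scheme are assumptions chosen for illustration; this kind of XOR only hides text from a casual look in a hex editor and should not be mistaken for real protection.

```d
// Hypothetical helper: XOR every byte with a repeating key.
// Applying it twice with the same key gives back the original bytes.
ubyte[] xorWithKey(const(ubyte)[] data, const(ubyte)[] key)
{
    assert(key.length > 0, "key must not be empty");
    auto result = new ubyte[data.length];
    foreach (i, b; data)
        result[i] = cast(ubyte)(b ^ key[i % key.length]);
    return result;
}

void main()
{
    auto key = cast(const(ubyte)[]) "k3y";
    auto plain = cast(const(ubyte)[]) "hello world";

    auto scrambled = xorWithKey(plain, key);
    auto restored = xorWithKey(scrambled, key);

    assert(scrambled != plain);
    assert(restored == plain);
}
```

The scrambled bytes would be written to the file in place of the string's characters, and passed through the same function again after reading.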
Heinz wrote:
> Another question: Writting a type char[] with write() writes string as ASCII? if so then is a legible string, how can i protect that data?

Well, technically it will write it as UTF8, which is as near to ASCII as makes no nevermind. If you don't want it readable (and this is a binary file anyway) you could just use some simple reversible encryption algorithm. Something like this, for a silly random example.

<code>
module silly;

import tango.io.Stdout;

struct SillyCrypt
{
    alias process opCall;

    static const CHUNK_SIZE = 32_U;
    static const ROT        = 16_U;
    static const XOR        = 24_U;

    // Split the input into chunks and scramble each one.
    static char[] process (char[] src)
    {
        char[] result;
        foreach (ch; chunks(src)) {
            result ~= mutate(ch);
        }
        return result;
    }

    private static char[][] chunks (char[] x)
    {
        char[]   source = x;
        char[][] result;
        while (source.length >= CHUNK_SIZE) {
            result ~= source[0 .. CHUNK_SIZE];
            source  = source[CHUNK_SIZE .. $];
        }
        if (source.length) {
            result ~= source;
        }
        return result;
    }

    // Rotate the chunk by ROT characters, then XOR every character.
    // Note: a final chunk of 17 to 31 characters is rotated too, but a
    // second pass will not rotate it back to the original order.
    private static char[] mutate (char[] x)
    {
        char[] result;
        if (x.length > ROT) {
            result = x[ROT .. $] ~ x[0 .. ROT];
        }
        else {
            result = x.dup;
        }
        foreach (inout c; result) {
            c ^= XOR;
        }
        return result;
    }
}

const SOURCE = "I would say hello to you, but you couldn't read it even if I did."c;

void main ()
{
    auto enc = SillyCrypt(SOURCE);
    auto dec = SillyCrypt(enc);

    Stdout
        ("Source  -> "c)(SOURCE).newline()
        ("Encrypt -> "c)(enc   ).newline()
        ("Decrypt -> "c)(dec   ).newline()
    .flush;
}
</code>

The output when I tried it was this:

    Source -> I would say hello to you, but you couldn't read it even if I did.
    Encrypt -> w8lw8awm48zml8awQ8owmt|8kya8p}ttql8}n}v8q~8Q8|q|m8{wmt|v?l8j}y|86
    Decrypt -> I would say hello to you, but you couldn't read it even if I did.

I know I don't personally know anyone who can read "w8lw8awm48zml8awQ8owmt|8kya8p}ttql8}n}v8q~8Q8|q|m8{wmt|v?l8j}y|86" at all. :)

-- Chris Nicholson-Sauls
Jan 20 2007
"Heinz" <billgates microsoft.com> wrote in message news:eoualo$1rcv$1 digitaldaemon.com...
> Wow, that covers all, thanks for your reply. But, can i still write an entire structure with writeExact()? or you suggest writting each member of the structure with write()?

Yeah, that's perfectly fine as long as the structure doesn't contain any reference members (pointers, class references, dynamic arrays). Binary files often have some kind of standard header which can be written or read in one big chunk, which is possible to do with a structure. But if the structure contains any reference members, writing it out with writeExact will not work, and you'll have to write out the members manually.
Jan 20 2007
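As a sketch of the "standard header written in one big chunk" case on a current compiler, the following uses `std.stdio.File.rawWrite` and `rawRead` in place of the thread's writeExact/readExact. The `Header` layout, the helper names, and the temporary file name are all made up for the example, and the bytes on disk follow the platform's own byte order and struct layout, so this is not a portable file format.

```d
import std.exception : enforce;
import std.file : remove, tempDir;
import std.path : buildPath;
import std.stdio : File;

// Hypothetical fixed-size header: no pointers, slices, or class references.
struct Header
{
    char[4] magic;
    uint version_;
    ulong recordCount;
}

void writeHeader(string path, Header h)
{
    auto f = File(path, "wb");
    f.rawWrite((&h)[0 .. 1]);      // the struct's own bytes, in one write
}

Header readHeader(string path)
{
    Header[1] buf;
    auto f = File(path, "rb");
    auto got = f.rawRead(buf[]);   // returns the part of buf that was filled
    enforce(got.length == 1, "file too short for a header");
    return buf[0];
}

void main()
{
    Header h;
    h.magic[] = "DEMO";
    h.version_ = 1;
    h.recordCount = 3;

    auto path = buildPath(tempDir(), "header_demo.bin");
    scope (exit) remove(path);

    writeHeader(path, h);
    assert(readHeader(path) == h);
}
```

This round-trips because every byte of the struct is data; adding a `char[]` member would put an address into the file again.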
janderson Wrote:
> You have to use some form of encryption. XOR encryption is one of the simplest, although not the most secure.

The base64 in phobos also could be useful.
Jan 20 2007
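For reference, a small sketch of what the Phobos base64 module does, assuming the current `std.base64` API. Base64 is an encoding, not encryption: it makes bytes printable and hides nothing from anyone who recognises the format.

```d
import std.base64 : Base64;
import std.stdio : writeln;

void main()
{
    // The bytes of the string used throughout this thread.
    const(ubyte)[] raw = cast(const(ubyte)[]) "hello world";

    char[] text = Base64.encode(raw);    // printable representation
    ubyte[] back = Base64.decode(text);  // original bytes again

    writeln(text);
    assert(text == "aGVsbG8gd29ybGQ=");
    assert(back == raw);
}
```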
In C++ you can write an entire structure to a binary file: http://www.gamedev.net/reference/articles/article1127.asp http://www.codersource.net/cpp_file_io_binary.html Can you do the same in D?
Jan 20 2007
Heinz wrote:
> In C++ you can write an entire structure to a binary file: http://www.gamedev.net/reference/articles/article1127.asp http://www.codersource.net/cpp_file_io_binary.html Can you do the same in D?

Sure, and it will work between instances of the program so long as none of the structure's members are references: pointers, object variables, arrays.

-- Chris Nicholson-Sauls
Jan 20 2007
Chris Nicholson-Sauls Wrote:
> Sure, and it will work between instances of the program so long as none of the structure's members are referances: pointers, object variables, arrays.

So, you mean I can't have this structure because it has an array?

    struct h { }

Could you post an example please?
Jan 20 2007
Heinz wrote:
> So, you mean i can't have this structure because i has an array? struct h { } Could you post an example please?

Right, because these arrays are essentially a pointer to data somewhere else; they don't exist in the same block of memory. To do it automatically you would need some form of metadata (which would identify pointers) or something like serialization (which handles each element on its own).

In D and C++ you can read a block like the one below in one go:

    struct h
    {
        int x;
        int y;
        char a;
        char b[100]; // Note: because the size is constant, it is included in this block.
    };

However, you can't write something like this in D or C++:

    struct h
    {
        int x;
        int y;
        char a;
        char* b; // This is pointing elsewhere in memory. You'll need to fix this pointer up when you read it in.
    };

Since D dynamic arrays are really:

    struct Darray
    {
        size_t length;
        T* type; // Pointer to some location
    };

you can't save these out inside a struct. You need to save the data it points to as well.

-Joel
Jan 20 2007
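On a current compiler the "does this struct contain anything that points elsewhere?" question can be asked at compile time. The sketch below assumes `std.traits.hasIndirections`; the struct names and the `isBlittable` shorthand are invented for the example, with `WithSlice` modelled on the int/char[]/bool struct asked about elsewhere in this thread.

```d
import std.traits : hasIndirections;

// Fixed-size members only: the whole struct is one block of data.
struct Fixed
{
    int x;
    int y;
    char a;
    char[100] b;
}

// Contains a raw pointer to memory outside the struct.
struct WithPointer
{
    int x;
    int y;
    char a;
    char* b;
}

// Contains a dynamic array, i.e. a length plus a pointer.
struct WithSlice
{
    int i;
    char[] s;
    bool b;
}

// Hypothetical shorthand: true when a raw byte dump of T holds only data.
enum isBlittable(T) = !hasIndirections!T;

static assert( isBlittable!Fixed);
static assert(!isBlittable!WithPointer);
static assert(!isBlittable!WithSlice);

void main()
{
}
```

A generic "write the whole struct" helper could carry `if (isBlittable!T)` as a template constraint, so that passing a struct with a `char[]` member fails to compile instead of silently writing an address.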
Chris Nicholson-Sauls Wrote:
> Sure, and it will work between instances of the program so long as none of the structure's members are referances: pointers, object variables, arrays.

What about classes? Can they be written under the same rules?
Jan 20 2007
"Heinz" <billgates microsoft.com> wrote in message news:eoug19$22to$1 digitaldaemon.com...
> What about classes can they be written under the same rules?

Class instances == object variables. All instances of classes are references (pointers) implicitly.
Jan 20 2007
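The point that a class variable is only a reference shows up directly in the sizes the compiler reports. A minimal sketch, assuming a current D compiler; the class mirrors the one from the first post, and the two helper functions are made up for illustration.

```d
import std.stdio : writeln;

class TT
{
    ulong i;
    char[] s;
}

// Size of the variable: one pointer, whatever the class contains.
size_t referenceSize()
{
    TT mtt = new TT;
    return mtt.sizeof;
}

// Size of the object the variable refers to (hidden fields plus members).
size_t instanceSize()
{
    return __traits(classInstanceSize, TT);
}

void main()
{
    writeln("mtt.sizeof:        ", referenceSize());
    writeln("classInstanceSize: ", instanceSize());
    assert(referenceSize() == (void*).sizeof);
    assert(instanceSize() > referenceSize());
}
```

So writing `mtt.sizeof` bytes starting at `&mtt` can only ever store the address, which matches the 4-byte file seen earlier in the thread.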
Heinz Wrote:
> So, you mean i can't have this structure because i has an array? struct h { } Could you post an example please?

Sorry, I pressed tab and then enter and posted before I typed the struct. Here it goes:

    struct h
    {
        int i = 4;
        char[] s = "hello";
        bool b = false;
    }
Jan 20 2007
You can take a look at the source of a serialisation library. E.g. see this thread: "serialization library" in the group D.announce on 8th Nov 2006
Jan 20 2007
On Sun, 21 Jan 2007 03:15:38 +0100, Frank Benoit (keinfarbton) <benoit tionex.removethispart.de> wrote:
> You can take a look at the source of a serialisation library. E.g. see this thread: "serialization library" in the group D.announce on 8th Nov 2006

Up to date versions of that library are found at http://www.dsource.org/projects/serialization

Christian
Jan 25 2007