digitalmars.D.bugs - [Issue 4572] New: std.file.read return type
- d-bugmail puremagic.com (22/22) Aug 02 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (14/14) Aug 03 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (20/20) Aug 03 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (6/6) Aug 03 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (9/9) Aug 03 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (30/34) Aug 04 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (19/25) Aug 04 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (14/40) Aug 04 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (20/24) Aug 04 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (29/38) Aug 04 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (28/61) Aug 04 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (14/14) Aug 06 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (7/7) Oct 15 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4572
- d-bugmail puremagic.com (10/10) Jan 09 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4572
http://d.puremagic.com/issues/show_bug.cgi?id=4572 Summary: std.file.read return type Product: D Version: D2 Platform: All OS/Version: All Status: NEW Severity: enhancement Priority: P2 Component: Phobos AssignedTo: nobody puremagic.com ReportedBy: bearophile_hugs eml.cc I am unsure about this, but I think that it is better for std.file.read() to return something like a ubyte[] instead of a void[]. It's a problem to perform a cast(ubyte[]) in SafeD: auto data = cast(ubyte[])std.file.read(); Or maybe in SafeD I have to use other safer functions to load binary data, like slurp() or something similar. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 02 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 David Simcha <dsimcha yahoo.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |dsimcha yahoo.com I agree, and here's another reason to add to the list: void[] implies an array that may contain pointers. I just looked at the impl. of std.file and was surprised to learn that it allocates the array as not containing pointers via GC.malloc. However, when you do a new void[N], you get an array that may contain pointers. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 03 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 nfxjfg gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |nfxjfg gmail.com I say using the void[] type is a design error in this case. void[] is just a safer void*. void[] should only be used in cases when you have to reference a specific region of memory. You use it because the type system is not expressive enough to describe the actual type (see Variant and so on). But in this case, it's very clear what the actual data type is: it's an array of bytes. File contents are nothing else than untyped bag of bytes. Thus the return type should be ubyte[]. Note that reinterpret casting ubyte[] slices to other arrays or structs is not clean: there are issues of byte order and padding. This too doesn't apply to void[]. There's a clear difference. Tange makes the same error. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 03 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 Thank you David and nfxjfg :-) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 03 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 One can add that .dup-ing a void[] will make all the precaution in std.file.read of allocating the void[] with GC.BlkAttr.NO_SCAN useless. The dup'ed array won't have the NO_SCAN flag set. It really shows that void[] is not the natural type to use here. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 03 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 Steven Schveighoffer <schveiguy yahoo.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |schveiguy yahoo.com 05:24:44 PDT ---One can add that .dup-ing a void[] will make all the precaution in std.file.read of allocating the void[] with GC.BlkAttr.NO_SCAN useless. The dup'ed array won't have the NO_SCAN flag set. It really shows that void[] is not the natural type to use here.any array operation which copies a block to another copies the flags from the original array block. So the NO_SCAN flag should persist. If it doesn't, that's a bug (looking at it, dup is the only function that doesn't copy the block attributes, I'll address this on the mailing list). I think the void[] type is more consistent than ubyte. Consider two things. First, a read where the array is provided by the caller. If this function is typed: ubyte[] read(ubyte[] into); then, you must cast your data to a ubyte[]. But void[] can be implicitly cast to from anything, so void[] wins here. Second, a write operation: int write(ubyte[] from); Again, cast is necessary where void[] would not require it. To be sure std.file.read() could return a ubyte[], and unless you intend to actually deal with it as a ubyte[], you need to cast to the type you need. But shouldn't it be consistent with other operations? But we can forgo all this stuff if we add a template parameter to read, so you can specify exactly the type you want. If you know your file is a bunch of int's, you could do: std.file.read!(int)(); -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 04 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 Thank you for your answer Steven.then, you must cast your data to a ubyte[]. But void[] can be implicitly cast to from anything, so void[] wins here.If you compile this program with dmd 2.047: void main() { auto v = new void[10]; ubyte a1, a2; a1 = v; v = a2; } It produces the errors: test.d(4): Error: cannot implicitly convert expression (v) of type void[] to ubyte test.d(5): Error: cannot implicitly convert expression (a2) of type ubyte to void[]But we can forgo all this stuff if we add a template parameter to read, so you can specify exactly the type you want. If you know your file is a bunch of int's, you could do: std.file.read!(int)();It seems a good idea. So I presume std.file.read!(int[])(); reads a matrix. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 04 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 06:02:21 PDT ---Thank you for your answer Steven.Yes, D does not convert non-arrays to void[]. You can do something like this: v = (&a2)[0..1]; which is typically what is done. ubyte[] provides no automatic conversion from anything, including other array types.then, you must cast your data to a ubyte[]. But void[] can be implicitly cast to from anything, so void[] wins here.If you compile this program with dmd 2.047: void main() { auto v = new void[10]; ubyte a1, a2; a1 = v; v = a2; } It produces the errors: test.d(4): Error: cannot implicitly convert expression (v) of type void[] to ubyte test.d(5): Error: cannot implicitly convert expression (a2) of type ubyte to void[]Well, I would assume it would return an int[][], which probably would mean nothing since arrays are pointer/length values, and any pointer/length values read from a file would be bogus. I'd say read should reject reading elements that have references in them. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------But we can forgo all this stuff if we add a template parameter to read, so you can specify exactly the type you want. If you know your file is a bunch of int's, you could do: std.file.read!(int)();It seems a good idea. So I presume std.file.read!(int[])(); reads a matrix.
Aug 04 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 Steven Schveighoffer:Well, I would assume it would return an int[][], which probably would mean nothing since arrays are pointer/length values, and any pointer/length values read from a file would be bogus. I'd say read should reject reading elements that have references in them.Yes, you are right, I probably meant a static array, so this reads a dynamic array of ints: std.file.read!(int[])(); A static array of two ints: std.file.read!(int[2])(); A static array of static array of ints (the memory of a fixed-size matrix is contiguous in D): std.file.read!(int[2][5])(); The problem is that such syntax doesn't read the items of the static array in-place, the result is a static array that gets copied. So you need an argument by ref: int[2][5] m; std.file.read(T)(ref T m); I don't like this a lot. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 04 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572any array operation which copies a block to another copies the flags from the original array block. So the NO_SCAN flag should persist. If it doesn't, that's a bug (looking at it, dup is the only function that doesn't copy the block attributes, I'll address this on the mailing list).Concatenating arrays doesn't preserve the bit either. And if the user allocates the void[], the NO_SCAN bit won't be set anyway. The user must remember to allocate a ubyte[] array to get it right. How many users will even understand why? (Also if you copy one array into another, flags won't be preserved. If you allocate the void[] with C-malloc, lifetime.d won't do the right thing either.) Let me repeat: the only reason why you want to use void[] here is because void[] implicitly converts to any other array type. But doing that is bogus anyway. If you convert it to an int[], you implicitly assume the data stored on the disk is in the same endian as your machine's. If it's an array of struct, you assume there are no padding issues (or further byte order issues). If it's a char[], you assume it's valid utf-8. No really, there should be specialized functions for proper I/O with certain types. In no case should the user be encouraged to do the false thing. void[] encourages to do 2 false things: 1. allocating the incorrect type, and 2. reading/writing data in a machine specific format. I think it's a good thing to remind the user of 2. by forcing him to explicitly cast the array. Yes, dear user, you ARE reinterpret casting your data to a bunch of bytes! I guess Andrei will remove the usage of void[] anyway because of his beloved Java subset of D. void[] would allow aliasing pointers with integers, which is a no-no in SafeD.But we can forgo all this stuff if we add a template parameter to read, so you can specify exactly the type you want. If you know your file is a bunch of int's, you could do: std.file.read!(int)();I guess nothing can stop the devs from killing Phobos with even more template bloat. Your function call signature forgets anything about endians, though. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 04 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 10:31:17 PDT ---Yes, that should be addressed too, thanks for pointing it out!any array operation which copies a block to another copies the flags from the original array block. So the NO_SCAN flag should persist. If it doesn't, that's a bug (looking at it, dup is the only function that doesn't copy the block attributes, I'll address this on the mailing list).Concatenating arrays doesn't preserve the bit either.And if the user allocates the void[], the NO_SCAN bit won't be set anyway. The user must remember to allocate a ubyte[] array to get it right. How many users will even understand why? (Also if you copy one array into another, flags won't be preserved.You're splitting hairs here. ubyte and void are both just as nebulous, it's just that void[] requires you to cast to something before reading it. It means "I don't know what the type is." ubyte means "this is an array of unsigned bytes", which is somewhat as nebulous, and one could argue more useful. But wouldn't it be best to just ask for the type you wish? What if you want to read a utf-8 file, wouldn't it be better for file.read() to return a char[]? I think the template solution works best.If you allocate the void[] with C-malloc, lifetime.d won't do the right thing either.)Who's allocating a void[] with c-malloc? I thought std.file.read allocated the data with gc_malloc?Let me repeat: the only reason why you want to use void[] here is because void[] implicitly converts to any other array type.Well, it's for consistency. void[] does not implicitly convert to another array type, rather another array type can implicitly convert to void[]. So as far as returning void[], it forces you to cast, even if you expect the data to be ubyte[].But doing that is bogus anyway. If you convert it to an int[], you implicitly assume the data stored on the disk is in the same endian as your machine's. If it's an array of struct, you assume there are no padding issues (or further byte order issues). If it's a char[], you assume it's valid utf-8.So? Is one not allowed to assume the contents of a file are in a format they expect? Typically one inserts magic numbers or something similar in a fie to ensure the file is correct, but I don't see why casting makes the user more keenly aware that the file might be incorrect.I think it's a good thing to remind the user of 2. by forcing him to explicitly cast the array. Yes, dear user, you ARE reinterpret casting your data to a bunch of bytes!And that is what a void[] return does... not sure what you are getting at here.I guess Andrei will remove the usage of void[] anyway because of his beloved Java subset of D. void[] would allow aliasing pointers with integers, which is a no-no in SafeD.hehe, I don't think you know how close Andrei and Java are.All it does is apply a type to a common function's return. Endianness is an implementation issue. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------But we can forgo all this stuff if we add a template parameter to read, so you can specify exactly the type you want. If you know your file is a bunch of int's, you could do: std.file.read!(int)();I guess nothing can stop the devs from killing Phobos with even more template bloat. Your function call signature forgets anything about endians, though.
Aug 04 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 05:13:42 PDT --- After discussion on the druntime mailing list, it was agreed that dup should preserve the bits, but we decided to leave concatenation as-is. This shouldn't be a huge issue, because one doesn't often concatenate void[]s together. There's no clear choice in concatenation because multiple arrays are being concatenated together, and none of them is going to "own" the data afterwards. So I think in light of dup now preserving the scan bit, the issue of using void[] as the return type of std.file.read (non-templated version) is moot. But I still think being able to apply a type to the return data array via a template parameter would be a good enhancement. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 06 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 10:39:39 PDT --- To me void[] stinks a lot and it's not a big deal to avoid it completely. It's an abomination from trying to evolve C concepts for D. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Oct 15 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4572 Andrei Alexandrescu <andrei metalanguage.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED CC| |andrei metalanguage.com AssignedTo|nobody puremagic.com |andrei metalanguage.com -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jan 09 2011