digitalmars.D - Dynamic arrays allocation size
- "Luís Marques" (31/31) Mar 25 2013 Hi,
- Ali Çehreli (6/13) Mar 25 2013 On a tangent, despite appearances, the type of the right-hand side is
- "Luís Marques" (9/14) Mar 26 2013 Ahh, right. If you do
- renoX (6/14) Mar 26 2013 I don't know why there is this behaviour but I fully agree with
- bearophile (5/7) Mar 26 2013 It's an unacceptable trap for a modern language:
- "Luís Marques" (2/4) Mar 26 2013 Thank you all for your feedback. I've added my vote to bug 4835.
- "Luís Marques" (18/20) Mar 26 2013 I agree, but apparently, for what it's worth, Java doesn't
- "Luís Marques" (8/8) Mar 26 2013 BTW, as far as I can see the overflow/underflow behavior never
- Steven Schveighoffer (14/22) Mar 26 2013 On Tue, 26 Mar 2013 13:56:35 -0400, Luís Marques
- "Luís Marques" (9/13) Mar 26 2013 You say not an error as meaning the language definition does not
- Steven Schveighoffer (11/15) Mar 26 2013 On Tue, 26 Mar 2013 14:17:16 -0400, Luís Marques
- "Luís Marques" (17/21) Mar 26 2013 Thanks Steve!
- bearophile (14/17) Mar 26 2013 I have used similar tests and it's not very costly, not
- Steven Schveighoffer (19/34) Mar 26 2013 Array bounds tests are removed for release code. And an array bounds te...
- Steven Schveighoffer (5/13) Mar 26 2013 Let me clarify that constants that are combined by default (even without...
- Johannes Pfau (5/9) Mar 26 2013 I think this is way more annoying though if the overflow happens in
- Steven Schveighoffer (6/15) Mar 26 2013 Yes, I agree there. The OP's code should either error out, or
- Andrei Alexandrescu (5/13) Mar 26 2013 D obeys two's complement overflow rules for its signed and unsigned
- "Luís Marques" (7/10) Mar 26 2013 I guess my searches for "overflow", "underflow", "modulus", etc
- Andrei Alexandrescu (4/7) Mar 26 2013 I think we're a bit biased toward x86, but I also think C's cavalier
- Brad Roberts (5/12) Mar 26 2013 The bias towards x86 is less than the bias towards standard integer
- "Luís Marques" (19/23) Mar 26 2013 Brad, if the overflow/underflow was undefined behavior you could
- "Luís Marques" (31/31) Mar 26 2013 BTW, in platforms (defined not just by the hardware, but the OS,
- Brad Roberts (12/32) Mar 26 2013 If and could. Yes, you're right. However, that would be making a trade...
- "Luís Marques" (17/19) Mar 26 2013 Sure, I was not arguing for changing that. I just wanted to
Hi,

There seems to be a bug allocating large dynamic arrays in a 64-bit aware dmd (v2.062). Apparently, the size argument makes a trip through 32-bit ptrdiff_t land or something like that:

    unittest
    {
        immutable size_t size = 3 * 1024 * 1024 * 1024;
        auto data = new byte[size];
        // compiler error:
        // file.d(line): Error: negative array index 18446744072635809792LU
    }

    unittest
    {
        immutable size_t size = 4 * 1024 * 1024 * 1024;
        auto data = new byte[size]; // fails silently, zero length array
        assert(data.length != 0);   // assert error
    }

Have you seen this before? I can open a bug, but just checking.

In any case, I don't understand why the compiler doesn't complain about overflows at compile time:

    unittest
    {
        size_t s1 = uint.max + 1;  // shouldn't it complain with -m32 flag? it does not.
        assert(s1 != 0);           // fails for -m32, as expected
        uint s2 = 0xFFFFFFFF + 1;  // shouldn't it complain? it does not.
    }

Regards,
Luís
Mar 25 2013
On 03/25/2013 07:23 PM, Luís Marques wrote:

> Hi, There seems to be a bug allocating large dynamic arrays in a
> 64-bit aware dmd (v2.062). Apparently, the size argument makes a trip
> through 32-bit ptrdiff_t land or something like that:
>
>     immutable size_t size = 3 * 1024 * 1024 * 1024;

On a tangent, despite appearances, the type of the right-hand side is
int with the value of -1073741824. It is according to the arithmetic
conversion rules:

http://dlang.org/type.html

Ali
Mar 25 2013
On Tuesday, 26 March 2013 at 05:38:41 UTC, Ali Çehreli wrote:

> On a tangent, despite appearances, the type of the right-hand side is
> int with the value of -1073741824. It is according to the arithmetic
> conversion rules: http://dlang.org/type.html

Ahh, right. If you do

    auto data = new byte[3 * 1024 * 1024 * 1024L];

with the L suffix, then it works, of course. But this is crazy! :-)
Really, something needs to be rethought here: do you really want your
constant folding to overflow at 32 bits by default? Is this because of
CTFE?

Luís
Mar 26 2013
On Tuesday, 26 March 2013 at 13:56:26 UTC, Luís Marques wrote:
[cut]

> Really, something needs to be rethought here: do you really want your
> constant folding to overflow at 32 bits by default? Is this because
> of CTFE?

I don't know why there is this behaviour, but I fully agree with you
that this is a bug. It should at least trigger a warning.

renoX
Mar 26 2013
Luís Marques:

> Really, something needs to be rethought here: do you really want your
> constant folding to overflow at 32 bits by default?

It's an unacceptable trap for a modern language:
http://d.puremagic.com/issues/show_bug.cgi?id=4835

Bye,
bearophile
Mar 26 2013
On Tuesday, 26 March 2013 at 14:04:48 UTC, bearophile wrote:

> It's an unacceptable trap for a modern language:
> http://d.puremagic.com/issues/show_bug.cgi?id=4835

Thank you all for your feedback. I've added my vote to bug 4835.
Mar 26 2013
On Tuesday, 26 March 2013 at 14:04:48 UTC, bearophile wrote:

> It's an unacceptable trap for a modern language:
> http://d.puremagic.com/issues/show_bug.cgi?id=4835

I agree, but apparently, for what it's worth, Java doesn't complain
either:

    class Test
    {
        public static void main(String[] args)
        {
            long a = 3 * 1024 * 1024 * 1024;
            long b = 3 * 1024 * 1024 * 1024L;
            assert(a < 0);
            assert(b > 0);
        }
    }

    $ javac test.java
    $ java Test
    $ (no error)

On the other hand, if this is fixed (err, improved?) then you have one
more reason to say that D is better than Java ;-)
Mar 26 2013
BTW, as far as I can see the overflow/underflow behavior never got
specified by the language, in any case:

1) http://forum.dlang.org/thread/jo2c0a$31hh$1 digitalmars.com <-- no conclusion
2) Andrei's book doesn't seem to mention the topic.

If it is specified somewhere please do tell. Whatever the behavior
should be (unspecified, modulus for unsigned integers, etc) there
really should be an official stance.
Mar 26 2013
On Tue, 26 Mar 2013 13:56:35 -0400, Luís Marques wrote:

> BTW, as far as I can see the overflow/underflow behavior never got
> specified by the language, in any case:
>
> 1) http://forum.dlang.org/thread/jo2c0a$31hh$1 digitalmars.com <-- no conclusion
> 2) Andrei's book doesn't seem to mention the topic.
>
> If it is specified somewhere please do tell. Whatever the behavior
> should be (unspecified, modulus for unsigned integers, etc) there
> really should be an official stance.

The official stance is, it's not an error. If we treated it as an
error, then it would be very costly to implement: every operation
would have to check for overflow. The CPU does not assist in this.

You can construct an "overflow-detecting" integer type that should be
able to do what you want.

-Steve
Mar 26 2013
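[Editor's note: the "overflow-detecting" integer type Steve suggests can be sketched with a struct and operator overloading. The sketch below is illustrative only; the name CheckedUint is invented here, error handling is minimal, and most operators and the signed variant are left out.]

```d
// Sketch of an overflow-checking unsigned integer type.
struct CheckedUint
{
    uint value;

    CheckedUint opBinary(string op : "+")(CheckedUint rhs) const
    {
        uint sum = value + rhs.value;
        if (sum < value)  // unsigned addition wrapped around
            throw new Exception("integer overflow");
        return CheckedUint(sum);
    }

    CheckedUint opBinary(string op : "*")(CheckedUint rhs) const
    {
        uint prod = value * rhs.value;
        if (rhs.value != 0 && prod / rhs.value != value)
            throw new Exception("integer overflow");
        return CheckedUint(prod);
    }
}

unittest
{
    bool threw;
    try { auto b = CheckedUint(uint.max) + CheckedUint(1); }
    catch (Exception) { threw = true; }
    assert(threw);  // the wraparound is caught instead of silent
}
```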
On Tuesday, 26 March 2013 at 18:04:25 UTC, Steven Schveighoffer wrote:

> The official stance is, it's not an error. If we treated it as an
> error, then it would be very costly to implement, every operation
> would have to check for overflow. The CPU does not assist in this.

You say "not an error" as meaning the language definition does not
guarantee checking for overflows/underflows and throwing an exception
if one occurs. But my point is even more simple: is there a stance on
what the overflow/underflow semantics are? E.g., are they undefined
(might wrap, might saturate, might have one's complement behavior,
etc.), defined only for unsigned integers (like C and C++), etc.?
Mar 26 2013
On Tue, 26 Mar 2013 14:17:16 -0400, Luís Marques wrote:

> But my point is even more simple: is there a stance on what the
> overflow/underflow semantics are? E.g., are they undefined (might
> wrap, might saturate, might have one's complement behavior, etc.),
> defined only for unsigned integers (like C and C++), etc.?

http://dlang.org/expression.html#AddExpression

"If both operands are of integral types and an overflow or underflow
occurs in the computation, wrapping will happen. That is, uint.max + 1
== uint.min and uint.min - 1 == uint.max."

-Steve
Mar 26 2013
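[Editor's note: the wrapping rule quoted from the spec can be checked directly. A minimal sketch of the documented semantics:]

```d
unittest
{
    // D guarantees modular (two's complement) wrapping for all
    // integral types, signed and unsigned alike.
    uint u = uint.max;
    assert(u + 1 == uint.min);         // wraps to 0
    assert(uint.min - 1 == uint.max);  // wraps back to the top

    int i = int.max;
    assert(i + 1 == int.min);  // signed overflow is defined in D,
                               // unlike in C/C++
}
```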
On Tuesday, 26 March 2013 at 18:24:39 UTC, Steven Schveighoffer wrote:

> http://dlang.org/expression.html#AddExpression
>
> "If both operands are of integral types and an overflow or underflow
> occurs in the computation, wrapping will happen. That is, uint.max +
> 1 == uint.min and uint.min - 1 == uint.max."

Thanks Steve!

Do you know if there ever was a (public?) discussion about this,
before it was defined this way? I wanted to see what trade-offs were
considered, etc.

(For instance, one disadvantage I see with this definition is that it
exacerbates the potential problems with D's well-defined integral type
sizes. Imagine I'm programming some microcontroller with unusual word
or register sizes, say 10-bit bytes instead of the usual 8-bit bytes.
In C there would not be any performance penalty even for unsigned
char, which mandates wrapping, because the wrapping would occur at
2^10. In D you would have to put in extra checks, because a
well-defined size plus well-defined wrapping would not allow just
using the native arithmetic instructions alone, which presumably would
not guarantee wrapping at 8-bit widths.)
Mar 26 2013
Steven Schveighoffer:

> If we treated it as an error, then it would be very costly to
> implement, every operation would have to check for overflow.

I have used similar tests and they're not very costly, not
significantly more costly than array bounds tests. In the meantime
Clang has introduced similar run-time tests for C/C++ code. So C/C++
are now better (more modern, safer) than the D language/official
compiler in this regard.

(And Issue 4835 is about compile-time constants. CTFE is already
plenty slow, mostly because of memory allocations. Detecting overflow
in constants is not going to significantly slow down compilation, and
it has no effect on the runtime. Even GCC 4.3.4 performs such
compile-time tests.)

> The CPU does not assist in this.

The X86 CPUs have overflow and carry flags that help.

Bye,
bearophile
Mar 26 2013
On Tue, 26 Mar 2013 14:20:30 -0400, bearophile wrote:

> I have used similar tests and they're not very costly, not
> significantly more costly than array bounds tests.

Array bounds tests are removed for release code. And an array bounds
test is unequivocally an error. In many cases, overflowing integers
are not a problem, easily proven not to occur, or are expected. Such
designs would have to fight the compiler to get efficient code if the
compiler insisted on checking overflows and possibly throwing errors.

> In the meantime Clang has introduced similar run-time tests for C/C++
> code. So C/C++ are now better (more modern, safer) than the D
> language/official compiler in this regard. (And Issue 4835 is about
> compile-time constants. CTFE is already plenty slow, mostly because
> of memory allocations. Detecting overflow in constants is not going
> to significantly slow down compilation, and it has no effect on the
> runtime. Even GCC 4.3.4 performs such compile-time tests.)

If CTFE did something different than real code, that would be a
problem. Again, you should be able to construct the needed types with
a struct, to use in both CTFE and real code.

> The X86 CPUs have overflow and carry flags that help.

What I mean is that the cost is not free, the way null pointer checks
are free. For code that is specifically designed to be very fast and
is properly designed not to experience overflow, it would be
needlessly penalized. The simple for loop:

    for(int i = 0; i < 10; ++i)

would now have to deal with uselessly checking i for overflow. This
could add up quickly.

-Steve
Mar 26 2013
> (And Issue 4835 is about compile-time constants. CTFE is already
> plenty slow, mostly because of memory allocations. Detecting overflow
> in constants is not going to significantly slow down compilation, and
> it has no effect on the runtime. Even GCC 4.3.4 performs such
> compile-time tests.)
>
> If CTFE did something different than real code, that would be a
> problem. Again, you should be able to construct the needed types with
> a struct, to use in both CTFE and real code.

Let me clarify: constants that are combined by default (even without
optimizations enabled), like the OP's example of 1024 * 1024 * 1024,
should be able to emit an error at compile time on overflow, or
automatically upgrade the type. I agree with that.

-Steve
Mar 26 2013
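[Editor's note: until the compiler diagnoses such constant-folding overflow, a constant computation can be guarded by hand. A sketch, forcing 64-bit folding with an L suffix and checking the result at compile time:]

```d
// The L suffix makes the whole expression fold at 64 bits instead of
// overflowing silently at 32 bits; the static assert then verifies
// the constant at compile time, at no runtime cost.
enum long size = 3L * 1024 * 1024 * 1024;
static assert(size > 0 && size <= size_t.max);
```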
On Tue, 26 Mar 2013 14:04:25 -0400, Steven Schveighoffer wrote:

> The official stance is, it's not an error. If we treated it as an
> error, then it would be very costly to implement, every operation
> would have to check for overflow. The CPU does not assist in this.

I think this is way more annoying though if the overflow happens in
constant folding, such as in the original example. In that case
checking "only" adds overhead at compile time, so a warning would be
nice.
Mar 26 2013
On Tue, 26 Mar 2013 14:30:21 -0400, Johannes Pfau wrote:

> I think this is way more annoying though if the overflow happens in
> constant folding, such as in the original example. In that case
> checking "only" adds overhead at compile time, so a warning would be
> nice.

Yes, I agree there. The OP's code should either error out, or
auto-promote to long. But not in the general case.

-Steve
Mar 26 2013
On 3/26/13 1:56 PM, Luís Marques wrote:

> BTW, as far as I can see the overflow/underflow behavior never got
> specified by the language, in any case:
>
> 1) http://forum.dlang.org/thread/jo2c0a$31hh$1 digitalmars.com <-- no conclusion
> 2) Andrei's book doesn't seem to mention the topic.
>
> If it is specified somewhere please do tell. Whatever the behavior
> should be (unspecified, modulus for unsigned integers, etc) there
> really should be an official stance.

D obeys two's complement overflow rules for its signed and unsigned
arithmetic. TDPL defines a checked integer type as an example of
operator overloading.

Andrei
Mar 26 2013
On Tuesday, 26 March 2013 at 18:48:27 UTC, Andrei Alexandrescu wrote:

> D obeys two's complement overflow rules for its signed and unsigned
> arithmetic. TDPL defines a checked integer type as an example of
> operator overloading.

I guess my searches for "overflow", "underflow", "modulus", etc.
missed that :-)

BTW, Andrei, what do you think the impact of this is for embedded
systems with unusual word lengths (combined with D's well-defined type
sizes)?
Mar 26 2013
On 3/26/13 2:53 PM, Luís Marques wrote:

> BTW, Andrei, what do you think the impact of this is for embedded
> systems with unusual word lengths (combined with D's well-defined
> type sizes)?

I think we're a bit biased toward x86, but I also think C's cavalier
approach to data sizes and operational semantics ain't better.

Andrei
Mar 26 2013
On 3/26/13 12:01 PM, Andrei Alexandrescu wrote:

> I think we're a bit biased toward x86, but I also think C's cavalier
> approach to data sizes and operational semantics ain't better.

The bias towards x86 is less than the bias towards standard integer
sizes. D explicitly ignores platforms with odd sizes. D does NOT
support a byte outside the range -128..127. Etc.

Brad
Mar 26 2013
On Tuesday, 26 March 2013 at 21:00:43 UTC, Brad Roberts wrote:

> The bias towards x86 is less than the bias towards standard integer
> sizes. D explicitly ignores platforms with odd sizes. D does NOT
> support a byte outside the range -128..127. Etc.

Brad, if the overflow/underflow was undefined behavior you could
easily map D types onto the machine's weird native types with little
performance loss. So, in that sense, D would support "platforms with
odd sizes".

Even as is, I'm sure D *can* support those unconventional platforms;
it just has a performance penalty to assure that the exact semantics
are followed, because there isn't a completely direct map between the
native instructions/registers and D's type model. Just because a D
byte is mapped onto a 10-bit register does not mean the language is
supporting bytes outside of the [-128, +127] range. The question is
whether the compiler has to add extra instructions to ensure that the
overflow behavior of the registers/CPU instructions matches the
language overflow behavior. If the overflow behavior was undefined,
then the 10-bit register would be a direct implementation of D's byte.
Since it isn't undefined, and the register presumably wraps at 10
bits, the compiler has to emit extra code to model the behavior of an
8-bit two's-complement variable in a 10-bit register.
Mar 26 2013
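[Editor's note: the "extra code" described above amounts to re-truncating the wide result after each operation. A hypothetical sketch in D of what a compiler for such a machine would effectively have to emit, with int standing in for the wider native register:]

```d
// Model 8-bit two's-complement wrapping on top of a wider register.
// The byte operands are promoted to int for the addition (the "wide"
// register); the cast truncates back to 8 bits and sign-extends,
// which is the masking step a 10-bit machine would need.
byte wrappingAdd(byte a, byte b)
{
    return cast(byte)(a + b);
}

unittest
{
    assert(wrappingAdd(127, 1) == -128);  // wraps at 8 bits, not at
                                          // the register width
}
```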
BTW, in platforms (defined not just by the hardware, but the OS, etc.)
where at least one of the C types did not exactly match any of D's
types, there would be an interesting problem. In core.stdc.config the
C types are defined as aliases of D types, but if you had, say, a
256-bit long long then you'd be up for trouble :-). You couldn't do as
is done currently with the aliases:

    version( Windows )
    {
        alias int c_long;
        alias uint c_ulong;
    }
    else
    {
        static if( (void*).sizeof > int.sizeof )
        {
            alias long c_long;
            alias ulong c_ulong;
        }
        else
        {
            alias int c_long;
            alias uint c_ulong;
        }
    }

An interesting idea would be to have the standard types defined at the
current sizes but to allow other sizes, other overflow/underflow
behaviors (unspecified, exception, wrapping, saturation...), etc. I
don't expect that to happen, but just saying, it would be cool :-)
Mar 26 2013
On 3/26/13 3:49 PM, Luís Marques wrote:

> Brad, if the overflow/underflow was undefined behavior you could
> easily map D types onto the machine's weird native types with little
> performance loss. So, in that sense, D would support "platforms with
> odd sizes". Even as is, I'm sure D *can* support those unconventional
> platforms; it just has a performance penalty to assure that the exact
> semantics are followed, because there isn't a completely direct map
> between the native instructions/registers and D's type model. Just
> because a D byte is mapped onto a 10-bit register does not mean the
> language is supporting bytes outside of the [-128, +127] range. The
> question is whether the compiler has to add extra instructions to
> ensure that the overflow behavior of the registers/CPU instructions
> matches the language overflow behavior. If the overflow behavior was
> undefined, then the 10-bit register would be a direct implementation
> of D's byte. Since it isn't undefined, and the register presumably
> wraps at 10 bits, the compiler has to emit extra code to model the
> behavior of an 8-bit two's-complement variable in a 10-bit register.

"If" and "could". Yes, you're right. However, that would be making a
trade-off in an odd direction. It'd add undefined behavior to add a
capability for integer sizes to vary on platforms that most developers
don't use or test on. The result is what you see in C: the chances of
any given C app actually working correctly on these platforms is
fairly close to 0. So, D has explicitly defined the sizes on purpose,
to make correct, working code easier to create, at the expense of
making those extremely rare architectures do extra work if they want
to support D. I think it's the right trade-off.

Either way, it's the trade-off that's been made, and it's not likely
to change.

Brad
Mar 26 2013
On Wednesday, 27 March 2013 at 00:32:12 UTC, Brad Roberts wrote:

> Either way, it's the trade-off that's been made, and it's not likely
> to change.

Sure, I was not arguing for changing that. I just wanted to clarify
that when you say "D explicitly ignores platforms with odd sizes",
that does not mean D cannot be implemented on those other machines,
only that there might be a performance penalty (as had to be the case,
given Turing et al...), depending on the exact circumstances.

What might actually be cooler would be being able to define your own
types (though I don't expect that idea to be adopted soon, either),
with their own properties, such as ints that saturate instead of
wrapping (like MMX), with different numbers of bits, etc. On a good
compiler some of those alternative types would allow exploiting nice
machine properties, and would complement the benefits of having the
standard types, the same way pointers complement arrays. And you could
actually define the C types on platforms where they don't match the D
types, as I pointed out earlier in this thread.
Mar 26 2013