
digitalmars.D - Dynamic arrays allocation size

reply "Luís Marques" <luismarques gmail.com> writes:
Hi,

There seems to be a bug allocating large dynamic arrays in a 
64-bit aware dmd (v2.062). Apparently, the size argument makes a 
trip through 32-bit ptrdiff_t land or something like that:

unittest
{
     immutable size_t size = 3 * 1024 * 1024 * 1024;
     auto data = new byte[size]; // compiler error:
     // file.d(line): Error: negative array index 18446744072635809792LU
}

unittest
{
     immutable size_t size = 4 * 1024 * 1024 * 1024;
     auto data = new byte[size]; // fails silently: zero-length array
     assert(data.length != 0); // assert error
}

Have you seen this before? I can open a bug, but just checking.

In any case, I don't understand why the compiler doesn't complain 
about overflows at compile time:

unittest
{
     size_t s1 = uint.max + 1; // shouldn't it complain with the -m32 flag? it does not.
     assert(s1 != 0); // fails for -m32, as expected
     uint s2 = 0xFFFFFFFF + 1; // shouldn't it complain? it does not.
}

Regards,
Luís
Mar 25 2013
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 03/25/2013 07:23 PM, "Luís Marques" <luismarques gmail.com> wrote:
 Hi,

 There seems to be a bug allocating large dynamic arrays in a 64-bit
 aware dmd (v2.062). Apparently, the size argument makes a trip through
 32-bit ptrdiff_t land or something like that:

 unittest
 {
 immutable size_t size = 3 * 1024 * 1024 * 1024;
On a tangent, despite appearances, the type of the right-hand side is
int, with the value -1073741824. That follows from the usual
arithmetic conversion rules:

  http://dlang.org/type.html
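A quick way to see this (illustrative only; the static asserts assume
the compiler folds the constant silently, as in the report above):

unittest
{
    // All operands are int literals, so the product is computed as
    // int and wraps at 32 bits: 3 * 2^30 does not fit in int.
    static assert(is(typeof(3 * 1024 * 1024 * 1024) == int));
    static assert(3 * 1024 * 1024 * 1024 == -1073741824);

    // Promoting any operand to long avoids the wrap.
    static assert(is(typeof(3 * 1024 * 1024 * 1024L) == long));
    static assert(3 * 1024 * 1024 * 1024L == 3221225472L);
}

Ali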
Mar 25 2013
parent reply "Luís Marques" <luismarques gmail.com> writes:
On Tuesday, 26 March 2013 at 05:38:41 UTC, Ali Çehreli wrote:
 On a tangent, despite appearances, the type of the right-hand 
 side is int with the value of -1073741824. It is according to 
 the arithmetic conversion rules:

   http://dlang.org/type.html

 Ali
Ahh, right. If you do

    auto data = new byte[3 * 1024 * 1024 * 1024L];

with the L suffix, then it works, of course. But this is crazy! :-)

Really, something needs to be rethought here. Do you really want your
constant folding to overflow at 32 bits by default? Is this because of
CTFE?

Luís
Mar 26 2013
next sibling parent "renoX" <renozyx gmail.com> writes:
On Tuesday, 26 March 2013 at 13:56:26 UTC, Luís Marques wrote:
[cut]
 Ahh, right. If you do

     auto data = new byte[3 * 1024 * 1024 * 1024L];

 with the L suffix, then it works, of course. But this is crazy! 
 :-)

 Really, something needs to be rethought here. Do you really want your
 constant folding to overflow at 32 bits by default? Is this because
 of CTFE?

 Luís
I don't know why there is this behaviour, but I fully agree with you
that this is a bug. It should at least trigger a warning.

renoX
Mar 26 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Luís Marques:

 Really, something needs to be rethought here. Do you really want your
 constant folding to overflow at 32 bits by default?
It's an unacceptable trap for a modern language:

http://d.puremagic.com/issues/show_bug.cgi?id=4835

Bye,
bearophile
Mar 26 2013
next sibling parent "Luís Marques" <luismarques gmail.com> writes:
On Tuesday, 26 March 2013 at 14:04:48 UTC, bearophile wrote:
 It's an unacceptable trap for a modern language:

 http://d.puremagic.com/issues/show_bug.cgi?id=4835
Thank you all for your feedback. I've added my vote to bug 4835.
Mar 26 2013
prev sibling parent reply "Luís Marques" <luismarques gmail.com> writes:
On Tuesday, 26 March 2013 at 14:04:48 UTC, bearophile wrote:
 It's an unacceptable trap for a modern language:

 http://d.puremagic.com/issues/show_bug.cgi?id=4835
I agree, but apparently, for what it's worth, Java doesn't complain
either:

class Test
{
    public static void main(String[] args)
    {
        long a = 3 * 1024 * 1024 * 1024;
        long b = 3 * 1024 * 1024 * 1024L;
        assert(a < 0);
        assert(b > 0);
    }
}

$ javac test.java
$ java Test
$ (no error)

On the other hand, if this is fixed (err, improved?) then you have one
more reason to say that D is better than Java ;-)
Mar 26 2013
parent reply "Luís Marques" <luismarques gmail.com> writes:
BTW, as far as I can see the overflow/underflow behavior never 
got specified by the language, in any case:

1) http://forum.dlang.org/thread/jo2c0a$31hh$1 digitalmars.com 
<-- no conclusion

2) Andrei's book doesn't seem to mention the topic.

If it is specified somewhere please do tell. Whatever the 
behavior should be (unspecified, modulus for unsigned integers, 
etc) there really should be an official stance.
Mar 26 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 13:56:35 -0400, Luís Marques
<luismarques gmail.com> wrote:

 BTW, as far as I can see the overflow/underflow behavior never got
 specified by the language, in any case:

 1) http://forum.dlang.org/thread/jo2c0a$31hh$1 digitalmars.com <-- no
 conclusion

 2) Andrei's book doesn't seem to mention the topic.

 If it is specified somewhere please do tell. Whatever the behavior
 should be (unspecified, modulus for unsigned integers, etc) there
 really should be an official stance.

The official stance is, it's not an error. If we treated it as an
error, then it would be very costly to implement: every operation
would have to check for overflow. The CPU does not assist in this.

You can construct an "overflow-detecting" integer type that should be
able to do what you want.
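For example, a minimal sketch of what such a type could look like
(hypothetical and incomplete; only addition is shown):

struct CheckedInt
{
    int value;

    CheckedInt opBinary(string op : "+")(CheckedInt rhs) const
    {
        // Do the add in 64 bits; a result outside int's range means
        // the 32-bit add would have overflowed.
        immutable long wide = cast(long) value + rhs.value;
        if (wide < int.min || wide > int.max)
            throw new Exception("integer overflow");
        return CheckedInt(cast(int) wide);
    }
}

unittest
{
    auto a = CheckedInt(int.max);
    bool caught = false;
    try { a = a + CheckedInt(1); }
    catch (Exception e) { caught = true; }
    assert(caught);
}

-Steve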
Mar 26 2013
next sibling parent reply "Luís Marques" <luismarques gmail.com> writes:
On Tuesday, 26 March 2013 at 18:04:25 UTC, Steven Schveighoffer 
wrote:
 The official stance is, it's not an error.  If we treated it as 
 an error, then it would be very costly to implement, every 
 operation would have to check for overflow.  The CPU does not 
 assist in this.
You say "not an error", meaning the language definition does not
guarantee checking for overflows/underflows and throwing an exception
if one occurs. But my point is even simpler: is there a stance on what
the overflow/underflow semantics are? E.g., are they undefined (might
wrap, might saturate, might have one's-complement behavior, etc.),
defined only for unsigned integers (like C and C++), etc.?
Mar 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 14:17:16 -0400, Luís Marques
<luismarques gmail.com> wrote:

 But my point is even simpler: is there a stance on what the
 overflow/underflow semantics are? E.g., are they undefined (might
 wrap, might saturate, might have one's-complement behavior, etc.),
 defined only for unsigned integers (like C and C++), etc.?

http://dlang.org/expression.html#AddExpression

"If both operands are of integral types and an overflow or underflow
occurs in the computation, wrapping will happen. That is,
uint.max + 1 == uint.min and uint.min - 1 == uint.max."
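A quick check that matches the spec's wording (a sketch; these are
runtime values, so nothing is folded away at compile time):

unittest
{
    uint u = uint.max;
    assert(u + 1 == uint.min); // wraps to 0
    u = uint.min;
    assert(u - 1 == uint.max); // wraps to 4294967295

    // Signed types wrap with two's-complement semantics as well.
    int i = int.max;
    assert(i + 1 == int.min);
}

-Steve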
Mar 26 2013
parent "Luís Marques" <luismarques gmail.com> writes:
On Tuesday, 26 March 2013 at 18:24:39 UTC, Steven Schveighoffer 
wrote:
 http://dlang.org/expression.html#AddExpression

 "If both operands are of integral types and an overflow or 
 underflow occurs in the computation, wrapping will happen. That 
 is, uint.max + 1 == uint.min and uint.min - 1 == uint.max."
Thanks Steve! Do you know if there was ever a (public?) discussion
about this before it was defined this way? I wanted to see what
trade-offs were considered, etc.

(For instance, one disadvantage I see with this definition is that it
exacerbates the potential problems with D's well-defined integral type
sizes. Imagine I'm programming some microcontroller with unusual word
or register sizes, for instance 10-bit bytes instead of the usual
8-bit bytes. In C there would not be any performance penalty, even for
unsigned char, which mandates wrapping, because the wrapping would
occur at 2^10. In D you would have to add extra checks, because a
well-defined size plus well-defined wrapping would not allow just
using the native arithmetic instructions alone, which presumably would
not guarantee wrapping at 8-bit widths.)
Mar 26 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 If we treated it as an error, then it would be very costly to 
 implement, every operation would have to check for overflow.
I have used similar tests and they are not very costly, not
significantly more costly than array bounds tests.

In the meantime Clang has introduced similar run-time tests for C/C++
code. So C/C++ are now better (more modern, safer) than the D
language/official compiler in this regard.

(And Issue 4835 is about compile-time constants. CTFE is already
plenty slow, mostly because of memory allocations. Detecting overflow
in constants is not going to significantly slow down compilation, and
it has no effect on the runtime. Even GCC 4.3.4 performs such
compile-time tests.)
 The CPU does not assist in this.
The x86 CPUs have overflow and carry flags that help.

Bye,
bearophile
Mar 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 14:20:30 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

 If we treated it as an error, then it would be very costly to  
 implement, every operation would have to check for overflow.
 I have used similar tests and they are not very costly, not
 significantly more costly than array bounds tests.
Array bounds tests are removed for release code, and a failed array
bounds test is unequivocally an error. In many cases, overflowing
integers are not a problem: they are easily proven not to occur, or
are expected. Such designs would have to fight the compiler to get
efficient code if the compiler insisted on checking overflows and
possibly throwing errors.
 In the meantime Clang has introduced similar run-time tests for C/C++  
 code. So C/C++ are now better (more modern, safer) than the D  
 language/official compiler in this regard.

 (And Issue 4835 is about compile-time constants. CTFE is already plenty  
 slow, mostly because of memory allocations. Detecting overflow in  
 constants is not going to significantly slow down compilation, and it  
 has no effect on the runtime. Even GCC 4.3.4 performs such compile-time  
 tests.)
If CTFE did something different than real code, that would be a problem. Again, you should be able to construct the needed types with a struct, to use in both CTFE and real code.
 The CPU does not assist in this.
The x86 CPUs have overflow and carry flags that help.
What I mean is that the check is not free, unlike null pointer checks,
which are free. Code that is specifically designed to be very fast,
and properly designed not to experience overflow, would be needlessly
penalized. The simple for loop:

for(int i = 0; i < 10; ++i)

would now have to deal with uselessly checking i for overflow. This
could add up quickly.
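To make the cost concrete, here is roughly what every increment would
have to become if the language checked (a hypothetical expansion, for
illustration only):

// A checked ++i: an extra compare and branch on every iteration.
int checkedIncrement(int i)
{
    if (i == int.max) // the increment below would overflow
        throw new Exception("integer overflow");
    return i + 1;
}

void main()
{
    for (int i = 0; i < 10; i = checkedIncrement(i))
    {
        // loop body
    }
}

-Steve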
Mar 26 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
 (And Issue 4835 is about compile-time constants. CTFE is already plenty  
 slow, mostly because of memory allocations. Detecting overflow in  
 constants is not going to significantly slow down compilation, and it  
 has no effect on the runtime. Even GCC 4.3.4 performs such compile-time  
 tests.)
If CTFE did something different than real code, that would be a problem. Again, you should be able to construct the needed types with a struct, to use in both CTFE and real code.
Let me clarify: constants that are folded by default (even without
optimizations enabled), like the OP's 3 * 1024 * 1024 * 1024 example,
should be able to emit an error at compile time on overflow, or
automatically upgrade the type. I agree with that.

-Steve
Mar 26 2013
prev sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Tue, 26 Mar 2013 14:04:25 -0400
schrieb "Steven Schveighoffer" <schveiguy yahoo.com>:

 
 The official stance is, it's not an error.  If we treated it as an
 error, then it would be very costly to implement, every operation
 would have to check for overflow.  The CPU does not assist in this.
I think this is way more annoying, though, if the overflow happens in
constant folding, such as in the original example. In that case
checking "only" adds overhead at compile time, so a warning would be
nice.
Mar 26 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 14:30:21 -0400, Johannes Pfau <nospam example.com>  
wrote:

 Am Tue, 26 Mar 2013 14:04:25 -0400
 schrieb "Steven Schveighoffer" <schveiguy yahoo.com>:

 The official stance is, it's not an error.  If we treated it as an
 error, then it would be very costly to implement, every operation
 would have to check for overflow.  The CPU does not assist in this.
 I think this is way more annoying, though, if the overflow happens in
 constant folding, such as in the original example. In that case
 checking "only" adds overhead at compile time, so a warning would be
 nice.
Yes, I agree there. The OP's code should either error out or
auto-promote to long. But not in the general case.

-Steve
Mar 26 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/26/13 1:56 PM, "Luís Marques" <luismarques gmail.com> wrote:
 BTW, as far as I can see the overflow/underflow behavior never got
 specified by the language, in any case:

 1) http://forum.dlang.org/thread/jo2c0a$31hh$1 digitalmars.com <-- no
 conclusion

 2) Andrei's book doesn't seem to mention the topic.

 If it is specified somewhere please do tell. Whatever the behavior
 should be (unspecified, modulus for unsigned integers, etc) there really
 should be an official stance.
D obeys two's complement overflow rules for its signed and unsigned
arithmetic. TDPL defines a checked integer type as an example of
operator overloading.

Andrei
Mar 26 2013
parent reply "Luís Marques" <luismarques gmail.com> writes:
On Tuesday, 26 March 2013 at 18:48:27 UTC, Andrei Alexandrescu 
wrote:
 D obeys two's complement overflow rules for its signed and 
 unsigned arithmetic. TDPL defines a checked integer type as an 
 example of operator overloading.
I guess my searches for "overflow", "underflow", "modulus", etc.
missed that :-)

BTW, Andrei, what do you think the impact of this is for embedded
systems with unusual word lengths (combined with D's well-defined type
sizes)?
Mar 26 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/26/13 2:53 PM, "Luís Marques" <luismarques gmail.com> wrote:
 BTW, Andrei, what do you think the impact of this is for embedded
 systems with unusual word lengths (combined with D's well-defined type
 sizes)?
I think we're a bit biased toward x86, but I also think C's cavalier
approach to data sizes and operational semantics ain't better.

Andrei
Mar 26 2013
parent reply Brad Roberts <braddr puremagic.com> writes:
On 3/26/13 12:01 PM, Andrei Alexandrescu wrote:
 On 3/26/13 2:53 PM, "Luís Marques" <luismarques gmail.com> wrote:
 BTW, Andrei, what do you think the impact of this is for embedded
 systems with unusual word lengths (combined with D's well-defined type
 sizes)?
 I think we're a bit biased toward x86, but I also think C's cavalier
 approach to data sizes and operational semantics ain't better.

 Andrei
The bias towards x86 is less than the bias towards standard integer
sizes. D explicitly ignores platforms with odd sizes. D does NOT
support byte where byte is outside the range -128..127. Etc.

Brad
Mar 26 2013
parent reply "Luís Marques" <luismarques gmail.com> writes:
On Tuesday, 26 March 2013 at 21:00:43 UTC, Brad Roberts wrote:
 The bias towards x86 is less than the bias towards standard 
 integer sizes.  D explicitly ignores platforms with odd sizes.  
 D does NOT support byte where byte is outside the range 
 -128..127.  Etc.
Brad, if overflow/underflow were undefined behavior you could easily
map D types onto the machine's weird native types with little
performance loss. So, in that sense, D would support "platforms with
odd sizes".

Even as is, I'm sure D *can* support those unconventional platforms;
it just pays a performance penalty to ensure that the exact semantics
are followed, because there isn't a completely direct map between the
native instructions/registers and D's type model.

Just because a D byte is mapped onto a 10-bit register does not mean
the language is supporting bytes outside of the [-128, +127] range.
The question is whether the compiler has to add extra instructions to
ensure that the overflow behavior of the registers/CPU instructions
matches the language's overflow behavior. If the overflow behavior
were undefined then the 10-bit register would be a direct
implementation of D's byte. Since it isn't undefined, and the register
presumably wraps at 10 bits, the compiler has to emit extra code to
model the behavior of an 8-bit two's-complement variable in a 10-bit
register.
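To illustrate, here is a portable model of the fix-up code such a
compiler would have to emit after arithmetic on a D byte (a sketch;
the 10-bit hardware is imaginary, this just shows the masking and
sign extension involved):

// Truncate a wider intermediate result to 8 bits, then sign-extend,
// emulating an 8-bit two's-complement wrap on a wider register.
byte wrapToByte(int raw)
{
    int low8 = raw & 0xFF; // keep the low 8 bits
    if (low8 & 0x80)       // sign bit set?
        low8 -= 0x100;     // map 128..255 onto -128..-1
    return cast(byte) low8;
}

unittest
{
    assert(wrapToByte(127 + 1) == byte.min);  // 128 wraps to -128
    assert(wrapToByte(-128 - 1) == byte.max); // -129 wraps to 127
}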
Mar 26 2013
next sibling parent =?UTF-8?B?Ikx1w61z?= Marques" <luismarques gmail.com> writes:
BTW, on platforms (defined not just by the hardware, but the OS, etc.)
where at least one of the C types did not exactly match any of D's
types, there would be an interesting problem. In core.stdc.config the
C types are defined as aliases of D types, but if you had, say, a
256-bit long long then you'd be in for trouble :-). You couldn't do as
is done currently with the aliases:

version( Windows )
{
    alias int   c_long;
    alias uint  c_ulong;
}
else
{
    static if( (void*).sizeof > int.sizeof )
    {
        alias long  c_long;
        alias ulong c_ulong;
    }
    else
    {
        alias int   c_long;
        alias uint  c_ulong;
    }
}

An interesting idea would be to keep the standard types defined at
their current sizes while also allowing other sizes and other
overflow/underflow behaviors (unspecified, exception, wrapping,
saturation...), etc.

I don't expect that to happen, but just saying, it would be cool 
:-)
Mar 26 2013
prev sibling parent reply Brad Roberts <braddr puremagic.com> writes:
On 3/26/13 3:49 PM, "Luís Marques" <luismarques gmail.com> wrote:
 On Tuesday, 26 March 2013 at 21:00:43 UTC, Brad Roberts wrote:
 The bias towards x86 is less than the bias towards standard integer
 sizes.  D explicitly ignores platforms with odd sizes. D does NOT
 support byte where byte is outside the range -128..127.  Etc.
 Brad, if overflow/underflow were undefined behavior you could easily
 map D types onto the machine's weird native types with little
 performance loss. So, in that sense, D would support "platforms with
 odd sizes".

 Even as is, I'm sure D *can* support those unconventional platforms;
 it just pays a performance penalty to ensure that the exact semantics
 are followed, because there isn't a completely direct map between the
 native instructions/registers and D's type model.

 Just because a D byte is mapped onto a 10-bit register does not mean
 the language is supporting bytes outside of the [-128, +127] range.
 The question is whether the compiler has to add extra instructions to
 ensure that the overflow behavior of the registers/CPU instructions
 matches the language's overflow behavior. If the overflow behavior
 were undefined then the 10-bit register would be a direct
 implementation of D's byte. Since it isn't undefined, and the
 register presumably wraps at 10 bits, the compiler has to emit extra
 code to model the behavior of an 8-bit two's-complement variable in a
 10-bit register.
"If" and "could". Yes, you're right. However, that would be making a
trade-off in an odd direction: it'd add undefined behavior to gain the
ability for integer sizes to vary on platforms that most developers
don't use or test on. The result is what you see in C, where the
chances of any given C app actually working correctly on these
platforms are fairly close to 0.

So, D has explicitly defined the sizes on purpose, to make correct,
working code easier to create, at the expense of making those
extremely rare architectures do extra work if they want to support D.
I think it's the right trade-off. Either way, it's the trade-off
that's been made, and it's not likely to change.

Brad
Mar 26 2013
parent "Luís Marques" <luismarques gmail.com> writes:
On Wednesday, 27 March 2013 at 00:32:12 UTC, Brad Roberts wrote:
 Either way, it's the trade off that's been made, and it's not 
 likely to change.
Sure, I was not arguing for changing that. I just wanted to clarify
that when you say "D explicitly ignores platforms with odd sizes",
that does not mean D cannot be implemented on those other machines,
only that there might be a performance penalty (as has to be the case,
given Turing et al.), depending on the exact circumstances.

What might actually be cooler would be being able to define your own
types (though I don't expect that idea to be adopted soon, either),
with their own properties, such as ints that saturate instead of
wrapping (like MMX), with different numbers of bits, etc. On a good
compiler some of those alternative types would allow exploiting nice
machine properties, and they would complement the benefits of having
the standard types, the same way pointers complement arrays. And you
could actually define the C types on platforms where they don't match
the D types, as I pointed out earlier in this thread.
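For instance, a minimal sketch of a saturating type built with today's
operator overloading (hypothetical; only addition is shown):

struct SaturatingInt
{
    int value;

    SaturatingInt opBinary(string op : "+")(SaturatingInt rhs) const
    {
        long wide = cast(long) value + rhs.value;
        // Clamp to int's range instead of wrapping.
        if (wide > int.max) wide = int.max;
        if (wide < int.min) wide = int.min;
        return SaturatingInt(cast(int) wide);
    }
}

unittest
{
    auto a = SaturatingInt(int.max);
    assert((a + SaturatingInt(1)).value == int.max); // saturates
}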
Mar 26 2013