digitalmars.D - Signed integer overflow undefined behavior or not?

=?UTF-8?Q?Ali_=c3=87ehreli?= (28/28) Nov 12 2015 I searched but I could not find a definitive answer. I am pretty sure

deadalnix (1/1) Nov 12 2015 Signed overflow are defined as well, as wraparound.

Don (35/36) Nov 13 2015 Can we please, please, please not have that as official policy

Kagamin (4/9) Nov 13 2015 In C allowed undefined behavior resulted in questionable
John Colvin (9/31) Nov 13 2015 I don't understand what you think is so complicated about it?

deadalnix (11/12) Nov 13 2015 It is not that it is complicated, but that signed wraparound is

Don (8/22) Nov 13 2015 Complicated in the sense that: when are those semantics useful?

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (4/8) Nov 13 2015 That would be a silly restriction that nobody would need to care
John Colvin (4/12) Nov 13 2015 They are 99% useless, I agree. The only good argument for them I

Kagamin (3/7) Nov 13 2015 What about unsigned integers? Most of the time they are used as

Matthias Bentrup (10/12) Nov 13 2015 I guess you meant mod 2^n in both cases...

John Colvin (3/7) Nov 13 2015 haha, yes, sorry.

Walter Bright (2/3) Nov 13 2015 I'd be happy if you contributed the precise wording we need!
Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (10/12) Nov 13 2015 Well, negative overflow for unsigned probably should be illegal

Walter Bright (4/6) Nov 12 2015 Yes. I see no reason to support 1's complement.

=?UTF-8?Q?Ali_=c3=87ehreli?= (6/13) Nov 12 2015 Since it's UB in C and C++, I've heard that both clang and gcc do remove...

deadalnix (3/8) Nov 12 2015 Clang does it, but LLVM IR defines flags for overflow behavior
Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (6/10) Nov 13 2015 The question you want to ask is probably if modular arithmetics

=?UTF-8?Q?Ali_=c3=87ehreli?= (6/16) Nov 13 2015 On the other hand, in order to define the wrapping behavior at all, one

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (22/27) Nov 13 2015 That's my interpretation of what Walter has said before too. So a

Iain Buclaw via Digitalmars-d (5/11) Nov 13 2015 overflow is

Walter Bright (2/3) Nov 13 2015 Good!

David Nadlinger (3/5) Nov 13 2015 Signed types will wrap around correctly for LDC.

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

I searched but I could not find a definitive answer. I am pretty sure 
this thread will turn into yet another about what it should be, but I 
need an answer soon before updating my book to be review by Russel 
Winder, who will not give it a good mark before I get this part right. :)

Quoting from the following link:

   http://dlang.org/expression.html#AddExpression

"If both operands are of integral types and an overflow or underflow 
occurs in the computation, wrapping will happen. That is, uint.max + 1 
== uint.min and uint.min - 1 == uint.max."

Since it does not say "unsigned integral type", one might think it 
includes signed integral types as well. However, the rest of that quote 
is about the "wrap" behaviour of unsigned types and the fact that it 
conveniently uses an unsigned type in the only example makes me think 
that the spec means unsigned types there.

So the question is, do we support twos complement only, hence signed 
overflow is defined as wrap, or do we consider it as undefined behaviour?

Ali

P.S. The quote above has a misconception, which I've become aware of 
just recently myself: Contrary to what it may convey, underflow is not 
"having a value less than .min". For integral types, that is still 
called overflow[1]. Underflow is for floating point types only and it 
means "smaller in magnitude (that is, closer to zero) than the smallest 
value representable as a normal floating point number"[2]. So, underflow 
would not take a floating point to -.infinity; rather, towards less than 
.min_normal.

[1] https://en.wikipedia.org/wiki/Arithmetic_overflow (Note "greater in 
magnitude" there; it covers negative values as well.)

[2] https://en.wikipedia.org/wiki/Arithmetic_underflow

Nov 12 2015

deadalnix <deadalnix gmail.com> writes:

Signed overflow are defined as well, as wraparound.

Nov 12 2015

Don <prosthetictelevisions teletubby.medical.com> writes:

On Friday, 13 November 2015 at 05:47:03 UTC, deadalnix wrote:
 Signed overflow are defined as well, as wraparound.

Can we please, please, please not have that as official policy 
without carefully thinking through the implications?

It is undefined behaviour in C and C++, so we are not constrained 
by backwards compatibility with existing code.

I have never seen an example where signed integer overflow 
happened, which was not a bug. In my opinion, making it legal is 
an own goal, an unforced error.

Suppose we made it an error. We'd be in a much better position 
than C. We could easily add a check for integer overflow into 
CTFE. We could allow compilers and static analysis tools to 
implement runtime checks for integer overflow, as well.
Are we certain that we want to disallow this?

At the very least, we should change the terminology on that page. 
The word "overflow" should not be used when referring to both 
signed and unsigned types. On that page, it is describing two 
very different phenomena, and gives the impression that it was 
written by somebody who does not understand what they are talking 
about.
The usage of the word "wraps" is sloppy.

That page should state something like:
For any unsigned integral type T, all arithmetic is performed 
modulo (T.max + 1).
Thus, for example, uint.max + 1 == 0.
There is no reason to mention the highly misleading word 
"overflow".

For a signed integral type T, T.max + 1 is not representable in 
type T.
Then, we have a choice of either declaring it to be an error, as 
C does; or stating that the low bits of the infinitely-precise 
result will be interpreted as a two's complement value. For 
example, T.max + 1 will be negative.

(Note that unlike the unsigned case, there is no simple 
explanation of what happens).

Please let's be precise about this.

Nov 13 2015

Kagamin <spam here.lot> writes:

On Friday, 13 November 2015 at 09:09:33 UTC, Don wrote:
 Suppose we made it an error. We'd be in a much better position 
 than C. We could easily add a check for integer overflow into 
 CTFE. We could allow compilers and static analysis tools to 
 implement runtime checks for integer overflow, as well.
 Are we certain that we want to disallow this?

In C allowed undefined behavior resulted in questionable 
aggressive optimizations forced on everyone. That's what's 
disallowed.

Nov 13 2015

John Colvin <john.loughran.colvin gmail.com> writes:

On Friday, 13 November 2015 at 09:09:33 UTC, Don wrote:
 At the very least, we should change the terminology on that 
 page. The word "overflow" should not be used when referring to 
 both signed and unsigned types. On that page, it is describing 
 two very different phenomena, and gives the impression that it 
 was written by somebody who does not understand what they are 
 talking about.
 The usage of the word "wraps" is sloppy.

 That page should state something like:
 For any unsigned integral type T, all arithmetic is performed 
 modulo (T.max + 1).
 Thus, for example, uint.max + 1 == 0.
 There is no reason to mention the highly misleading word 
 "overflow".

 For a signed integral type T, T.max + 1 is not representable in 
 type T.
 Then, we have a choice of either declaring it to be an error, 
 as C does; or stating that the low bits of the 
 infinitely-precise result will be interpreted as a two's 
 complement value. For example, T.max + 1 will be negative.

 (Note that unlike the unsigned case, there is no simple 
 explanation of what happens).

 Please let's be precise about this.

I don't understand what you think is so complicated about it?

It's just circular boundary conditions. Unsigned has the 
boundaries at 0 and 2^n - 1, signed has them at -2^(n-1) and 
2^(n-1) - 1.

Less straightforwardly, but if you like modular arithmetic:
After arithmetic operations f is applied
unsigned: f(v) = v mod 2^n - 1
signed: f(v) = ((v + 2^(n-1)) mod (2^n - 1)) - 2^(n-1)

Nov 13 2015

deadalnix <deadalnix gmail.com> writes:

On Friday, 13 November 2015 at 09:33:51 UTC, John Colvin wrote:
 I don't understand what you think is so complicated about it?

It is not that it is complicated, but that signed wraparound is 
almost always a bug. In C/C++, that result in very questionable 
optimizations. But defining the thing as wraparound is also 
preventing it to become an error. On the other hand, detection 
the overflow is expensive on most machines.

I think Don has a point and the spec should say something like :
signed integer overflow is defined as being a runtime error. For 
performance reasons, the compiler may choose to not emit error 
checking code and use wraparound semantic instead.

Or something along these lines.

Nov 13 2015

Don <prosthetictelevisions teletubby.medical.com> writes:

On Friday, 13 November 2015 at 09:37:41 UTC, deadalnix wrote:
 On Friday, 13 November 2015 at 09:33:51 UTC, John Colvin wrote:
 I don't understand what you think is so complicated about it?


 After arithmetic operations f is applied
 signed: f(v) = ((v + 2^(n-1)) mod (2^n - 1)) - 2^(n-1)


Complicated in the sense that: when are those semantics useful? 
The answer of course, is, pretty much never. They are very 
bizarre.

 It is not that it is complicated, but that signed wraparound is 
 almost always a bug. In C/C++, that result in very questionable 
 optimizations. But defining the thing as wraparound is also 
 preventing it to become an error. On the other hand, detection 
 the overflow is expensive on most machines.

 I think Don has a point and the spec should say something like :
 signed integer overflow is defined as being a runtime error. 
 For performance reasons, the compiler may choose to not emit 
 error checking code and use wraparound semantic instead.

 Or something along these lines.

Oh, I like that! That does seem to be the best of both worlds. 
Then, as a QOI issue, the compiler can try to detect the error. 
If it does not detect the error, it MUST provide the two's 
complement result. It is not allowed to do any weird stuff.

Nov 13 2015

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:

On Friday, 13 November 2015 at 10:20:53 UTC, Don wrote:
 Oh, I like that! That does seem to be the best of both worlds. 
 Then, as a QOI issue, the compiler can try to detect the error. 
 If it does not detect the error, it MUST provide the two's 
 complement result. It is not allowed to do any weird stuff.

That would be a silly restriction that nobody would need to care 
about.  If the user cannot assume wrapping then compiler vendors 
will make more aggressive optimizations available.

Nov 13 2015

John Colvin <john.loughran.colvin gmail.com> writes:

On Friday, 13 November 2015 at 10:20:53 UTC, Don wrote:
 On Friday, 13 November 2015 at 09:37:41 UTC, deadalnix wrote:
 On Friday, 13 November 2015 at 09:33:51 UTC, John Colvin wrote:
 I don't understand what you think is so complicated about it?


 After arithmetic operations f is applied
 signed: f(v) = ((v + 2^(n-1)) mod (2^n - 1)) - 2^(n-1)


 Complicated in the sense that: when are those semantics useful? 
 The answer of course, is, pretty much never. They are very 
 bizarre.

They are 99% useless, I agree. The only good argument for them I 
can think of is that it's a faithful mapping to the underlying 
machine.

Nov 13 2015

Kagamin <spam here.lot> writes:

On Friday, 13 November 2015 at 09:37:41 UTC, deadalnix wrote:
 It is not that it is complicated, but that signed wraparound is 
 almost always a bug. In C/C++, that result in very questionable 
 optimizations. But defining the thing as wraparound is also 
 preventing it to become an error.

What about unsigned integers? Most of the time they are used as 
positive numbers, positive number overflow is the same bug.

Nov 13 2015

Matthias Bentrup <matthias.bentrup googlemail.com> writes:

On Friday, 13 November 2015 at 09:33:51 UTC, John Colvin wrote:
 unsigned: f(v) = v mod 2^n - 1
 signed: f(v) = ((v + 2^(n-1)) mod (2^n - 1)) - 2^(n-1)

I guess you meant mod 2^n in both cases...

If you look at how Mathematics deals with this issue, there is 
simply no signed or unsigned arithmetic modulo n, because they 
are exactly the same. There are only separate types in 
programming languages because the comparison operators are 
defined differently on them.

Mathematicians don't define comparison on modular rings, because 
it is not possible to do so in a way that is consistent with the 
usual rules anyway (e.g. x+1 > x is always false for some x).

Nov 13 2015

John Colvin <john.loughran.colvin gmail.com> writes:

On Friday, 13 November 2015 at 12:06:43 UTC, Matthias Bentrup 
wrote:
 On Friday, 13 November 2015 at 09:33:51 UTC, John Colvin wrote:
 unsigned: f(v) = v mod 2^n - 1
 signed: f(v) = ((v + 2^(n-1)) mod (2^n - 1)) - 2^(n-1)

 I guess you meant mod 2^n in both cases...

haha, yes, sorry.

Nov 13 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 11/13/2015 1:09 AM, Don wrote:
 Please let's be precise about this.

I'd be happy if you contributed the precise wording we need!

Nov 13 2015

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:

On Friday, 13 November 2015 at 09:09:33 UTC, Don wrote:
 (Note that unlike the unsigned case, there is no simple 
 explanation of what happens).

Well, negative overflow for unsigned probably should be illegal 
too. Ada got this right by having:

32 bit signed integers monotonic
31 bit unsigned integers monotonic

That way you can transition between unsigned and signed without 
having negative values turned into positive ones and vice versa 
and have violations detected by verifier.

In addition Ada also provides explicit modular integers in user 
specified ranges.

Nov 13 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 11/12/2015 4:43 PM, Ali Çehreli wrote:
 So the question is, do we support twos complement only, hence signed overflow
is
 defined as wrap,

Yes. I see no reason to support 1's complement.

It's worth checking how LDC and GDC deal with this deep in their optimizer - is 
it considering it undefined behavior?

Nov 12 2015

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

On 11/12/2015 10:00 PM, Walter Bright wrote:
 On 11/12/2015 4:43 PM, Ali Çehreli wrote:
 So the question is, do we support twos complement only, hence signed
 overflow is
 defined as wrap,

 Yes. I see no reason to support 1's complement.

It's official! :)

 It's worth checking how LDC and GDC deal with this deep in their
 optimizer - is it considering it undefined behavior?

Since it's UB in C and C++, I've heard that both clang and gcc do remove 
code branches if they can prove that there will be signed overflow. I 
don't know how or whether that optimization is turned off for D.

Ali

Nov 12 2015

deadalnix <deadalnix gmail.com> writes:

On Friday, 13 November 2015 at 06:46:37 UTC, Ali Çehreli wrote:
 Since it's UB in C and C++, I've heard that both clang and gcc 
 do remove code branches if they can prove that there will be 
 signed overflow. I don't know how or whether that optimization 
 is turned off for D.

 Ali

Clang does it, but LLVM IR defines flags for overflow behavior 
and it is up to the frontend to choose which one it want to use.

Nov 12 2015

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:

On Friday, 13 November 2015 at 06:46:37 UTC, Ali Çehreli wrote:
 Since it's UB in C and C++, I've heard that both clang and gcc 
 do remove code branches if they can prove that there will be 
 signed overflow. I don't know how or whether that optimization 
 is turned off for D.

The question you want to ask is probably if modular arithmetics 
is legal D code for all integers, signed and unsigned. Can the 
programmer assume that wrapping is legal D code?

If so, then D does not have any kind of integer overflow at all, 
by definition.

Nov 13 2015

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

On 11/13/2015 12:30 AM, Ola Fosheim Grøstad wrote:
 On Friday, 13 November 2015 at 06:46:37 UTC, Ali Çehreli wrote:
 Since it's UB in C and C++, I've heard that both clang and gcc do
 remove code branches if they can prove that there will be signed
 overflow. I don't know how or whether that optimization is turned off
 for D.

 The question you want to ask is probably if modular arithmetics is legal
 D code for all integers, signed and unsigned. Can the programmer assume
 that wrapping is legal D code?

I understood Walter's response to be so.

 If so, then D does not have any kind of integer overflow at all, by
 definition.

On the other hand, in order to define the wrapping behavior at all, one 
must speak of overflow first. Wrapping is the solution for the condition 
of overflow, which D must have to begin with, no? :)

Ali

Nov 13 2015

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:

On Friday, 13 November 2015 at 08:51:27 UTC, Ali Çehreli wrote:
 I understood Walter's response to be so.

That's my interpretation of what Walter has said before too. So a 
D compiler cannot prevent compilation of a statically detected 
wrapping (overflow). As a result D-integers are circular 
enumerations.

 On the other hand, in order to define the wrapping behavior at 
 all, one must speak of overflow first.

For educational purposes, probably :-)

 Wrapping is the solution for the condition of overflow, which D 
 must have to begin with, no? :)

For definition, not really. Signed integers are often defined as 
three functions (other definitions are possible):

Zero: 0
Successor of X: S
Predecessor of X: P

For a 2 bit signed modular arithmetics you would get the complete 
normalized set:

PP0, P0, 0, S0

with the defined rewrites

SPX = X
PSX = X
PPP0 = S0
SS0 = PP0

(implies PPPPX = X and SSSSX = X)

then you define the operators on the set (+,- etc) using 
relations such as PSX = X etc.

Nov 13 2015

Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 13 Nov 2015 7:05 am, "Walter Bright via Digitalmars-d" <
digitalmars-d puremagic.com> wrote:
 On 11/12/2015 4:43 PM, Ali =C3=87ehreli wrote:
 So the question is, do we support twos complement only, hence signed


overflow is
 defined as wrap,


 Yes. I see no reason to support 1's complement.

 It's worth checking how LDC and GDC deal with this deep in their

optimizer - is it considering it undefined behavior?

We are not.  For gdc, the fwrapv flag is enabled by default.

Nov 13 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 11/13/2015 1:10 AM, Iain Buclaw via Digitalmars-d wrote:
 We are not.  For gdc, the fwrapv flag is enabled by default.

Good!

Nov 13 2015

David Nadlinger <code klickverbot.at> writes:

On Friday, 13 November 2015 at 06:00:08 UTC, Walter Bright wrote:
 It's worth checking how LDC and GDC deal with this deep in 
 their optimizer - is it considering it undefined behavior?

Signed types will wrap around correctly for LDC.

  — David

Nov 13 2015

D Programming

C/C++ Programming

Other

digitalmars.D - Signed integer overflow undefined behavior or not?