digitalmars.D - A strange div bug on Linux x86_64, (both dmd & ldc2): long -5000 /

reply mw <mingwu gmail.com> writes:
I post here because I think this bug's impact may be pretty wide:
it's a div bug on Linux x86_64 (both dmd & ldc2). I'm
interested in knowing what caused this bug.

On Windows, I only tested dmd.exe, it correctly outputs -2500.


size_t: because I was taking array length, maybe many people do 
that too.


https://issues.dlang.org/show_bug.cgi?id=21151

import std.stdio;

void main() {
   long   a = -5000;
   size_t b = 2;
   long   c = a / b;
   writeln(c);
}


$ dmd divbug.d
$ ./divbug
9223372036854773308


$ ldc2 divbug.d
$ ./divbug
9223372036854773308


x86_64 x86_64 x86_64 GNU/Linux

$ dmd --version
DMD64 D Compiler v2.092.0

$ ldc2 --version
LDC - the LLVM D compiler (1.21.0):
   based on DMD v2.091.1 and LLVM 10.0.0
   built with LDC - the LLVM D compiler (1.21.0)
   Default target: x86_64-unknown-linux-gnu
   Host CPU: bdver2
Aug 13 2020
prev sibling next sibling parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Thursday, 13 August 2020 at 07:22:18 UTC, mw wrote:
 I post here because I think this bug's impact maybe pretty 
 wide, it's a div bug  on Linux x86_64, (both dmd & ldc2).  I'm 
 interested in knowing what caused this bug.

 On Windows, I only tested dmd.exe, it correctly outputs -2500.


 size_t: because I was taking array length, maybe many people do 
 that too.


 https://issues.dlang.org/show_bug.cgi?id=21151

 import std.stdio;

 void main() {
   long   a = -5000;
   size_t b = 2;
   long   c = a / b;
   writeln(c);
 }


 $ dmd divbug.d
 $ ./divbug
 9223372036854773308


 $ ldc2 divbug.d
 $ ./divbug
 9223372036854773308


 x86_64 x86_64 x86_64 GNU/Linux

 $ dmd --version
 DMD64 D Compiler v2.092.0

 $ ldc2 --version
 LDC - the LLVM D compiler (1.21.0):
   based on DMD v2.091.1 and LLVM 10.0.0
   built with LDC - the LLVM D compiler (1.21.0)
   Default target: x86_64-unknown-linux-gnu
   Host CPU: bdver2
"Not a bug." Or rather, same semantics as in C. The division reinterprets a as unsigned, and the result is within the positive range of a long, so it stays positive post-division.

#include <stdio.h>

int main(void) {
    long long int a = -5000;
    long long unsigned int b = 2;
    // the expression is unsigned, and reinterpreted as positive signed with the assignment
    long long int c = a / b;
    printf("%lli\n", c); // crazy number
    return 0;
}

Whether that is correct is a whole other question. Personally I think it's insane.
Aug 13 2020
parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 07:51:06 UTC, FeepingCreature 
wrote:
 "Not a bug." Or rather, same semantics as in C. The division 
 reinterprets a as unsigned, and the result is within the 
 positive range of a long, so it stays positive post-division.
https://issues.dlang.org/show_bug.cgi?id=21151

Let me add all my posts together:

It can NOT silently do this; there should at least be a warning.

BTW, on Windows, dmd correctly outputs -2500.

Just because C/C++ did it doesn't mean it's correct. And D is supposed to be an improvement on C++.

OK, let me write it this way to show its impact:

==================================
import std.algorithm;
import std.stdio;

void main() {
  long[] a = [-5000, 0];
  long c = sum(a) / a.length;
  writeln(c);
}
==================================

$ ./divbug
9223372036854773308
Aug 13 2020
next sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Thursday, 13 August 2020 at 07:56:31 UTC, mw wrote:
 On Thursday, 13 August 2020 at 07:51:06 UTC, FeepingCreature 
 wrote:
 "Not a bug." Or rather, same semantics as in C. The division 
 reinterprets a as unsigned, and the result is within the 
 positive range of a long, so it stays positive post-division.
 https://issues.dlang.org/show_bug.cgi?id=21151

 Let me add all my posts together:

 It can NOT silently do this; there should at least be a warning.

 BTW, on Windows, dmd correctly outputs -2500.

 Just because C/C++ did it doesn't mean it's correct. And D is supposed to be an improvement on C++.

 OK, let me write it this way to show its impact:

 ==================================
 import std.algorithm;
 import std.stdio;

 void main() {
   long[] a = [-5000, 0];
   long c = sum(a) / a.length;
   writeln(c);
 }
 ==================================

 $ ./divbug
 9223372036854773308

The Windows version is probably different because you are building a 32-bit exe and therefore size_t is uint, which will be promoted to long in the division.

If you do a bit of a search on the forums you'll find this issue having come up many times before and you might find those discussions informative.
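
The difference between the two targets can be checked with a small sketch. The function names below are made up for illustration, and the divisor types are spelled out as uint and ulong (rather than size_t) so that both cases can be seen on any one machine:

```d
import std.stdio;

// With a 32-bit size_t the divisor is a uint: the common type of long and
// uint is long, so the division stays signed.
long divByUint(long a, uint b)
{
    auto q = a / b;
    static assert(is(typeof(q) == long));
    return q;
}

// With a 64-bit size_t the divisor is a ulong: the common type of long and
// ulong is ulong, so `a` is converted to unsigned before dividing.
ulong divByUlong(long a, ulong b)
{
    auto q = a / b;
    static assert(is(typeof(q) == ulong));
    return q;
}

void main()
{
    writeln(divByUint(-5000, 2));  // -2500
    writeln(divByUlong(-5000, 2)); // 9223372036854773308
}
```

So the same source line gives different answers depending only on the width of size_t for the target.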
Aug 13 2020
parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 08:07:56 UTC, John Colvin wrote:
 The windows version is probably different because you are 
 building a 32 bit exe and therefore size_t is uint, which will 
 be promoted to long in the division.
$ /mnt/c/project/dmd2/windows/bin64/dmd.exe --version
DMD64 D Compiler v2.092.0-dirty
Copyright (C) 1999-2020 by The D Language Foundation, All Rights Reserved written by Walter Bright
 If you do a bit of a search on the forums you'll find this 
 issue having come up many times before and you might find those 
 discussions informative.
Sigh, if D continues to behave like this, it can hardly be called a C++ improvement.
Aug 13 2020
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Thursday, 13 August 2020 at 08:23:30 UTC, mw wrote:
 On Thursday, 13 August 2020 at 08:07:56 UTC, John Colvin wrote:
 The windows version is probably different because you are 
 building a 32 bit exe and therefore size_t is uint, which will 
 be promoted to long in the division.
 $ /mnt/c/project/dmd2/windows/bin64/dmd.exe --version
 DMD64 D Compiler v2.092.0-dirty
 Copyright (C) 1999-2020 by The D Language Foundation, All Rights Reserved written by Walter Bright

The compiler's own architecture doesn't imply the architecture of the generated code. E.g. try compiling your file with -m32 or -m64 or -m32mscoff (see https://dlang.org/dmd-windows.html#switch-m32)
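
One way to see which kind of code is actually being generated is to let the program report it. This is only a sketch; `divisionFlavour` is a hypothetical helper name, and the file can be built once with -m32 and once with -m64 to compare:

```d
import std.stdio;

// Reports how `long / size_t` is computed for the target this file was
// compiled for (not for the architecture of the compiler binary itself).
string divisionFlavour()
{
    static if (is(size_t == ulong))
        return "64-bit target: long / size_t is computed as ulong";
    else
        return "32-bit target: long / size_t is computed as long";
}

void main()
{
    writeln("size_t.sizeof = ", size_t.sizeof);
    writeln(divisionFlavour());

    long   a = -5000;
    size_t b = 2;
    long   c = a / b;
    writeln(c);
}
```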
 If you do a bit of a search on the forums you'll find this 
 issue having come up many times before and you might find 
 those discussions informative.
 Sigh, if D continues to behave like this, it can hardly be called a C++ improvement.
D has many differences with C/C++, many of which I would consider an improvement. However, it does not address all of the things that cause difficulty with C/C++. Personally I would love to be able to disallow implicit signed-to-unsigned conversions in almost all cases, but that would be a big breaking change.
Aug 13 2020
next sibling parent bachmeier <no spam.net> writes:
On Thursday, 13 August 2020 at 09:25:07 UTC, John Colvin wrote:

 Personally I would love to be able to disallow implicit 
 signed-to-unsigned conversions in almost all cases, but that 
 would be a big breaking change.
This was broken without even an announcement or discussion (maybe I missed it):

foreach(int ii, val; [1, 2, 3, 4, 5]) {}

Upon compilation, you get a message

Deprecation: foreach: loop index implicitly converted from size_t to int

I had to go back and fix lots of previously working code, even though there was no reason at all that it would ever fail to work correctly. Then there is the move to safe by default. Surely if we can make those changes, we can change the behavior in this case. The output of this program is a serious WTF:

import std;
void main()
{
    foreach(int ii, val; [1, 2, 3, 4, 5]) {
        writeln(ii);
    }
    writeln(-5000/[1, 2, 3, 4].length);
}
Aug 13 2020
prev sibling parent mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 09:25:07 UTC, John Colvin wrote:

 Personally I would love to be able to disallow implicit 
 signed-to-unsigned conversions in almost all cases, but that 
 would be a big breaking change.
At least I want a warning message; even one behind a command-line switch is fine. I personally would turn it on all the time. Silently performing these conversions is horrible.

I spent half of the night yesterday tracing down the issue.

https://run.dlang.io/is/je8L4r
==================================
import std.algorithm;
import std.stdio;

void main() {
  long[] a = [-5000, 0];
  long c = sum(a) / a.length;
  writeln(c);  // output 9223372036854773308
}
==================================

In contrast, there was a compiler warning message for the loop index:

https://run.dlang.io/is/4Sqc5n
onlineapp.d(4): Deprecation: foreach: loop index implicitly converted from size_t to int
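
Until the compiler offers such a diagnostic, something similar can be approximated in user code. The following is a sketch only: `sameSignDiv` is a hypothetical helper, not part of Phobos, and it simply refuses to compile when the two operands differ in signedness:

```d
import std.traits : isIntegral, isSigned;

// Hypothetical helper: the template constraint rejects mixed
// signed/unsigned operands, so the caller has to add an explicit cast.
auto sameSignDiv(A, B)(A a, B b)
    if (isIntegral!A && isIntegral!B && (isSigned!A == isSigned!B))
{
    return a / b;
}

void main()
{
    long[] a = [-5000, 0];
    long total = -5000;

    // long / size_t does not satisfy the constraint, so this call is rejected.
    static assert(!__traits(compiles, sameSignDiv(total, a.length)));

    // With the cast spelled out the division is signed.
    assert(sameSignDiv(total, cast(long) a.length) == -2500);
}
```

The obvious cost is that every division has to go through the helper, which is why a compiler switch would be nicer.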
Aug 13 2020
prev sibling parent FeepingCreature <feepingcreature gmail.com> writes:
On Thursday, 13 August 2020 at 07:56:31 UTC, mw wrote:
 On Thursday, 13 August 2020 at 07:51:06 UTC, FeepingCreature 
 wrote:
 "Not a bug." Or rather, same semantics as in C. The division 
 reinterprets a as unsigned, and the result is within the 
 positive range of a long, so it stays positive post-division.
 https://issues.dlang.org/show_bug.cgi?id=21151

 Let me add all my posts together:

 It can NOT silently do this; there should at least be a warning.

 BTW, on Windows, dmd correctly outputs -2500.

 Just because C/C++ did it doesn't mean it's correct. And D is supposed to be an improvement on C++.

 OK, let me write it this way to show its impact:

 ==================================
 import std.algorithm;
 import std.stdio;

 void main() {
   long[] a = [-5000, 0];
   long c = sum(a) / a.length;
   writeln(c);
 }
 ==================================

 $ ./divbug
 9223372036854773308
Yes, hence: not a compiler bug, but a spec bug. Implicit conversions should never throw away data, and silently converting from signed to unsigned, especially for a sign-sensitive operation like division, is insane. I'm not saying it's not an issue, I'm saying it should be raised against the spec, not the compiler, which faithfully implements it.
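
For reference, the arithmetic the compiler is faithfully carrying out can be spelled out step by step. This is a sketch with made-up function names, and ulong is written out instead of size_t so it behaves the same on every target:

```d
import std.stdio;

// The division as written in the report: `a` is converted to ulong first,
// and the whole expression has type ulong.
ulong mixedDiv(long a, ulong b)
{
    auto q = a / b;
    static assert(is(typeof(q) == ulong));
    return q;
}

// Same operands, but the divisor is cast so the division stays signed.
long signedDiv(long a, ulong b)
{
    return a / cast(long) b;
}

void main()
{
    // -5000 converted to ulong is 2^64 - 5000.
    assert(cast(ulong) -5000L == 18_446_744_073_709_546_616UL);

    // Half of that still fits in a long, so the implicit ulong -> long
    // conversion on assignment leaves it as a large positive number.
    long c = mixedDiv(-5000, 2);
    writeln(c);                    // 9223372036854773308
    writeln(signedDiv(-5000, 2));  // -2500
}
```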
Aug 13 2020
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 07:22:18AM +0000, mw via Digitalmars-d wrote:
[...]
 void main() {
   long   a = -5000;
   size_t b = 2;
   long   c = a / b;
   writeln(c);
 }
You're mixing signed and unsigned values. That's generally dangerous territory where integer promotion rules inherited from C/C++ take over and cause sometimes weird effects, like here.

Changing integer promotion rules will probably never happen now, because it will cause massive *silent* breakage of existing code.

So, in the spirit of defensive programming, I recommend avoiding mixing signed/unsigned values in this way.


T

-- 
It said to install Windows 2000 or better, so I installed Linux instead.
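
As one possible way to follow that advice when an array length is involved, the length can be converted with std.conv.to, which checks the value at run time instead of reinterpreting it. A minimal sketch; `average` is a hypothetical function and it assumes a non-empty array:

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// Hypothetical helper: integer average with the length converted to a
// signed type before dividing. Assumes `values` is not empty.
long average(long[] values)
{
    import std.algorithm : sum;

    // to!long throws ConvOverflowException if the length does not fit.
    return sum(values) / values.length.to!long;
}

void main()
{
    assert(average([-5000, 0]) == -2500);

    // Unlike a plain cast, to!long refuses values outside the range of long.
    assertThrown!ConvOverflowException(ulong.max.to!long);
}
```

A plain cast(long) would also keep the division signed; the checked conversion just makes an out-of-range length loud instead of silent.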
Aug 13 2020
next sibling parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Thursday, 13 August 2020 at 10:10:34 UTC, H. S. Teoh wrote:
 On Thu, Aug 13, 2020 at 07:22:18AM +0000, mw via Digitalmars-d 
 wrote: [...]
 void main() {
   long   a = -5000;
   size_t b = 2;
   long   c = a / b;
   writeln(c);
 }
 You're mixing signed and unsigned values. That's generally dangerous territory where integer promotion rules inherited from C/C++ take over and cause sometimes weird effects, like here.

 Changing integer promotion rules will probably never happen now, because it will cause massive *silent* breakage of existing code.

 So, in the spirit of defensive programming, I recommend avoiding mixing signed/unsigned values in this way.

 T
Changing integer promotion rules to disallow promotion of signed to unsigned for division will not cause massive silent breakage. - But it will cause massive visible breakage; Phobos uses this all over.
Aug 13 2020
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/13/20 6:13 AM, FeepingCreature wrote:
 On Thursday, 13 August 2020 at 10:10:34 UTC, H. S. Teoh wrote:
 On Thu, Aug 13, 2020 at 07:22:18AM +0000, mw via Digitalmars-d wrote: 
 [...]
 void main() {
   long   a = -5000;
   size_t b = 2;
   long   c = a / b;
   writeln(c);
 }
 You're mixing signed and unsigned values. That's generally dangerous territory where integer promotion rules inherited from C/C++ take over and cause sometimes weird effects, like here.

 Changing integer promotion rules will probably never happen now, because it will cause massive *silent* breakage of existing code.

 So, in the spirit of defensive programming, I recommend avoiding mixing signed/unsigned values in this way.

 T

 Changing integer promotion rules to disallow promotion of signed to unsigned for division will not cause massive silent breakage. - But it will cause massive visible breakage; Phobos uses this all over.
I wonder if such a change would uncover any bugs in phobos.
Aug 17 2020
parent mw <mingwu gmail.com> writes:
On Monday, 17 August 2020 at 11:11:22 UTC, Andrei Alexandrescu 
wrote:
 On 8/13/20 6:13 AM, FeepingCreature wrote:
 Changing integer promotion rules to disallow promotion of 
 signed to unsigned for division will not cause massive silent 
 breakage. - But it will cause massive visible breakage; Phobos 
 uses this all over.
I wonder if such a change would uncover any bugs in phobos.
... On Monday, 17 August 2020 at 11:24:08 UTC, Andrei Alexandrescu wrote:
 On 8/14/20 5:29 PM, Walter Bright wrote:
 On 8/14/2020 6:32 AM, Adam D. Ruppe wrote:
 just I don't care, I want 16 bit operations here.
The design considerations are mutually incompatible: 1. performance
...
 14. should work like Python
Heh, this looks like a list of things CheckedInt is supposed to help with.
That's why I did this study: use checkedint as a drop-in replacement of native long

https://forum.dlang.org/thread/omskttpjhwmmhifymiha forum.dlang.org

and logged this bug:

https://issues.dlang.org/show_bug.cgi?id=21169

If we can have these enhancements fixed faster, we will be able to quickly test the replacement in any library (even better with a compiler command line option to activate the switch).
Aug 17 2020
prev sibling next sibling parent reply bachmeier <no spam.net> writes:
On Thursday, 13 August 2020 at 10:10:34 UTC, H. S. Teoh wrote:

 So, in the spirit of defensive programming, I recommend 
 avoiding mixing signed/unsigned values in this way.
This is not a practical solution. [1, 2, 3, 4].length returns ulong, which guarantees this type of mixing goes on without people even realizing it's in their code base.

Imagine a new user to the language wanting to compute the mean of an array of numbers:

import std;
void main()
{
    long sum = 0;
    long[] vec = [-112, 2, 23, -4];
    foreach(val; vec) {
        sum += val;
    }
    writeln(sum/vec.length);
}

This is inexcusable.
Aug 13 2020
parent reply jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 13 August 2020 at 10:40:33 UTC, bachmeier wrote:
 
 [snip]

 Imagine a new user to the language wanting to compute the mean 
 of an array of numbers:

 import std;
 void main()
 {
     long sum = 0;
     long[] vec = [-112, 2, 23, -4];
     foreach(val; vec) {
         sum += val;
     }
     writeln(sum/vec.length);
 }

 This is inexcusable.
It's certainly annoying, but if there were an equivalent length function in C, then it would have the same behavior. Unfortunately, this is one of those things that were carried over from C.

Also, note that the true mean of vec above is -22.75, which wouldn't even be the result of your function if length returned a signed variable, because you would be doing integer division. A person who comes to D without ever having programmed before would get tripped up by that too.

A new user to the language can use mir's mean function:

import mir.math.stat: mean;

void main()
{
    long[] vec = [-112, 2, 23, -4];
    assert(vec.mean == -22.75);
}
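
Without any extra dependency, the two separate issues (signedness and integer division) can be seen side by side using only Phobos. This is a sketch; `intMean` and `realMean` are made-up names and both assume a non-empty array:

```d
import std.algorithm : sum;
import std.stdio;

// Signed integer division: truncates toward zero.
long intMean(long[] vec)
{
    return sum(vec) / cast(long) vec.length;
}

// Floating-point division: converting the length to double first avoids
// both the unsigned conversion and the truncation.
double realMean(long[] vec)
{
    return sum(vec) / cast(double) vec.length;
}

void main()
{
    long[] vec = [-112, 2, 23, -4];   // the elements add up to -91
    writeln(intMean(vec));            // -22
    writeln(realMean(vec));           // -22.75
}
```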
Aug 13 2020
parent reply bachmeier <no spam.net> writes:
On Thursday, 13 August 2020 at 11:40:59 UTC, jmh530 wrote:
 On Thursday, 13 August 2020 at 10:40:33 UTC, bachmeier wrote:
 
 [snip]

 Imagine a new user to the language wanting to compute the mean 
 of an array of numbers:

 import std;
 void main()
 {
     long sum = 0;
     long[] vec = [-112, 2, 23, -4];
     foreach(val; vec) {
         sum += val;
     }
     writeln(sum/vec.length);
 }

 This is inexcusable.
It's certainly annoying, but if there were an equivalent length function in C, then it would have the same behavior. Unfortunately, this is one of those things that were carried over from C.
The source of wrong behavior is vec.length having type ulong. It would be very unusual for someone to even think about that.
 Also, note that the true mean of vec above is -22.75, which 
 wouldn't even be the result of your function if length returned 
 a signed variable, because you would be doing integer division. 
 A person who comes to D without ever having programmed before 
 would get tripped up by that too.
That's why that behavior needs to be changed as well. It's horrible to implicitly cast from int to double when doing so results in obviously wrong behavior. Hopefully there won't be any more talk about safe by default as long as the language has features like this that are obviously broken and trivially fixed.
Aug 13 2020
next sibling parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 That's why that behavior needs to be changed as well. It's 
 horrible to implicitly cast from int to double when doing so 
 results in obviously wrong behavior. Hopefully there won't be 
 any more talk about safe by default as long as the language has 
 features like this that are obviously broken and trivially 
 fixed.
It's not trivially fixed. :-( I added a check for this case in DMD, just to see, and it breaks Phobos all over. Anything that interfaces to C with more complicated struct types does division with mixed signs. There'd need to be a lot of casts added as a result of changing this.
Aug 13 2020
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/13/20 9:40 AM, FeepingCreature wrote:
 On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 That's why that behavior needs to be changed as well. It's horrible to 
 implicitly cast from int to double when doing so results in obviously 
 wrong behavior. Hopefully there won't be any more talk about safe by 
 default as long as the language has features like this that are 
 obviously broken and trivially fixed.
It's not trivially fixed. :-( I added a check for this case in DMD, just to see, and it breaks Phobos all over. Anything that interfaces to C with more complicated struct types does division with mixed signs. There'd need to be a lot of casts added as a result of changing this.
Maybe a solution would be to only reject code with integrals statically known to be negative that are converted to unsigned integrals. Most of those are arguably bugs.
Aug 17 2020
prev sibling next sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 
 Also, note that the true mean of vec above is -22.75, which 
 wouldn't even be the result of your function if length 
 returned a signed variable, because you would be doing integer 
 division. A person who comes to D without ever having 
 programmed before would get tripped up by that too.
That's why that behavior needs to be changed as well. It's horrible to implicitly cast from int to double when doing so results in obviously wrong behavior.
I'm a little confused by this. My point was that the way D works now is that there is no error in the below code.

assert(-91 / 4 == -22);

even though the result is not the mathematically correct one (-22.75). There is no implicit cast to double in this case. Modifying it to

assert(-91.0 / 4 == -22.75);

gives the right result, but if you want to be more explicit and start from a signed variable and an unsigned variable, you would do

assert(cast(double) -91 / cast(double) 4u == -22.75);
Aug 13 2020
prev sibling next sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 8/13/20 6:33 AM, bachmeier wrote:

 The source of wrong behavior is vec.length having type ulong.
I remember watching C++ experts on an "ask us anything" kind of session at a conference. They were agreeing that it was a mistake that std::vector::size returns unsigned. (Herb Sutter: "We were wrong.")

Ali
Aug 13 2020
prev sibling parent reply matheus <matheus gmail.com> writes:
On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 ...
 The source of wrong behavior is vec.length having type ulong. 
 It would be very unusual for someone to even think about that.
May I ask what type should it be? Matheus.
Aug 13 2020
parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 18:40:40 UTC, matheus wrote:
 On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 ...
 The source of wrong behavior is vec.length having type ulong. 
 It would be very unusual for someone to even think about that.
May I ask what type should it be?
Signed (size_t, the length of the machine's address space)

Just as in Java (as an improvement of C++):

https://stackoverflow.com/questions/211311/what-is-the-data-type-for-length-property-for-java-arrays

It is an int. See the Java Language Specification, section 10.7.

Initially, Java didn't have an unsigned integer type; support was added as late as Java 8, and when it was added, they also added extra methods to handle unsigned values properly:

https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html

"""
int: By default, the int data type is a 32-bit signed two's complement integer, which has a minimum value of -2^31 and a maximum value of 2^31-1. In Java SE 8 and later, you can use the int data type to represent an unsigned 32-bit integer, which has a minimum value of 0 and a maximum value of 2^32-1. Use the Integer class to use int data type as an unsigned integer. See the section The Number Classes for more information. Static methods like compareUnsigned, divideUnsigned etc have been added to the Integer class to support the arithmetic operations for unsigned integers.
"""

In contrast, D does the potentially harmful conversion *silently*.
Aug 13 2020
next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 13 August 2020 at 18:51:09 UTC, mw wrote:
 [snip]

 It is an int. See the Java Language Specification, section 10.7.


 Initially, Java don't have unsigned integer type; it's added as 
 late as Java 8, and when it's added, they also added the extra 
 methods to properly handle them:

 https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
 [snip]
In other words, it was not added until 2014, and even then done in a backwards compatible way that doesn't let you actually declare unsigned ints, just to call some methods on them assuming they are unsigned.
Aug 13 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2020 12:03 PM, jmh530 wrote:
 In other words, it was not added until 2014, and even then done in a backwards
 compatible way that doesn't let you actually declare unsigned ints, just to call
 some methods on them assuming they are unsigned.
I view it as an admission of failure at doing away with unsigned integers.
Aug 13 2020
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 02:40:46PM -0700, Walter Bright via Digitalmars-d wrote:
 On 8/13/2020 12:03 PM, jmh530 wrote:
 In other words, it was not added until 2014, and even then done in a
 backwards compatible way that doesn't let you actually declare
 unsigned ints, just to call some methods on them assuming they are
 unsigned.
I view it as an admission of failure at doing away with unsigned integers.
Yeah, in spite of all the problems, unsigned values *are* needed for certain things. Java not having it was a *big* turnoff for me, because certain things that ought to be simple become needlessly convoluted (like parsing unsigned output from a C program, for example -- to prevent silent data corruption you had to treat everything as strings, which is a royal pain in Java).

Then again, a lot of things are needlessly convoluted in Java, so it's not saying very much. :-P


T

-- 
People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
Aug 13 2020
parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 22:07:11 UTC, H. S. Teoh wrote:
 Then again, a lot of things are needlessly convoluted in Java, 
 so it's not saying very much. :-P
Some of D's competitors:

C#:
----------------------------------------------------------------------------
using System;
class test{
  public static void Main(string[] args) {
    long a = -5000;
    ulong b = 2;
    long c = a / b;  // Operator '/' is ambiguous on operands of type 'long' and 'ulong'
  }
}
----------------------------------------------------------------------------
$ mcs div.cs
div.cs(6,14): error CS0019: Operator `/' cannot be applied to operands of type `long' and `ulong'
Compilation failed: 1 error(s), 0 warnings


Rust:
----------------------------------------------------------------------------
fn main() {
  let a: i64 = -5000;
  let b: u64 = 2;
  let c: i64 = a / b;
}
----------------------------------------------------------------------------
$ rustc div.rs
error[E0308]: mismatched types
 --> div.rs:4:20
  |
4 |   let c: i64 = a / b;
  |                    ^ expected `i64`, found `u64`

error[E0277]: cannot divide `i64` by `u64`
 --> div.rs:4:18
  |
4 |   let c: i64 = a / b;
  |                  ^ no implementation for `i64 / u64`
  |
  = help: the trait `std::ops::Div<u64>` is not implemented for `i64`

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0277, E0308.
For more information about an error, try `rustc --explain E0277`.
Aug 13 2020
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 11:07:23PM +0000, mw via Digitalmars-d wrote:
[...]

 ----------------------------------------------------------------------------
[...]
 div.cs(6,14): error CS0019: Operator `/' cannot be applied to operands of
 type `long' and `ulong'
[...]
 
 Rust:
[...]
 error[E0277]: cannot divide `i64` by `u64`
[...]

Honestly, I'd be happy if we turned these implicit sign conversions to errors. The cases where you *want* a/b to convert to unsigned are limited; if you really want to do it, you could just write a cast. It does make the code much clearer:

	ulong x = ...;
	long y = ...;
	auto z = x / cast(ulong) y; // see? now it's completely clear

And before somebody tells me this is too verbose: we already have to do this for short ints, no thanks to the recent change that arithmetic involving anything smaller than int will implicitly promote to int first:

	ubyte x;
	ubyte y;
	//ubyte z = x + y;		// NG
	ubyte z = cast(ubyte)(x + y);	// OK

Yes, it's *that* ugly.

Welcome to the Dungeon of D's Dark Corners, where you see the ugly side of D that people don't want to talk about. We hope you enjoy your stay. (Or not.) :-D


T

-- 
Caffeine underflow. Brain dumped.
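
Filled in with concrete values, the two casts look roughly like this. It is only a sketch; the function names are invented for the example:

```d
import std.stdio;

// Narrow ints: the sum is computed as int, so the truncation back to ubyte
// has to be requested explicitly.
ubyte addTruncating(ubyte x, ubyte y)
{
    static assert(is(typeof(x + y) == int)); // promoted to int
    return cast(ubyte)(x + y);
}

// Mixed signs: the cast states that the unsigned interpretation of `y`
// is really what is wanted.
ulong divUnsigned(ulong x, long y)
{
    return x / cast(ulong) y;
}

void main()
{
    writeln(addTruncating(200, 100)); // 44, i.e. 300 modulo 256
    writeln(divUnsigned(10, 3));      // 3
    writeln(divUnsigned(10, -1));     // 0, since -1 becomes ulong.max
}
```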
Aug 13 2020
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 13 August 2020 at 23:28:43 UTC, H. S. Teoh wrote:
 And before somebody tells me this is too verbose: we already 
 have to do this for short ints, no thanks to the recent change 
 that arithmetic involving anything smaller than int will 
 implicitly promote to int first:
That's not actually a recent change; it has always been like that, inherited from C. D's difference is that C will implicitly convert back to the small type and D won't without an explicit cast. Drives me nuts and very rarely actually catches real mistakes.

The most recent change here is that even something like `-a` will trigger the error. It always did the promotion, but it used to cast right back and now will error instead. Ugh.
Aug 13 2020
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 11:38:03PM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Thursday, 13 August 2020 at 23:28:43 UTC, H. S. Teoh wrote:
 And before somebody tells me this is too verbose: we already have to
 do this for short ints, no thanks to the recent change that
 arithmetic involving anything smaller than int will implicitly
 promote to int first:
That's not actually a recent change; it has always been like that, inherited from C. D's difference is that C will implicitly convert back to the small type and D won't without an explicit cast. Drives me nuts and very rarely actually catches real mistakes. The most recent change here is even like `-a` will trigger the error. It always did the promotion but it used to cast right back and now will error instead. Ugh.
Ah, yeah, that last one was what really annoyed me, and it was pretty recent. I never understood the logic of that. I mean the starting value is already `byte`, why does its negation have to auto-promote to int?! But thus sez the rulez, so who am I to question it amirite? :-(

It annoyed me so much that I wrote a wrapper module just to work around it. Basically you append a `.np` to any of your narrow int values, and it wraps it in an infectious struct that auto-truncates arithmetic operations. The presence of the `.np` is a kind of self-documentation that we're flouting the usual int promotion rules, so I'm reasonably happy with this hack. :-P

Now somebody just has to write a wrapper for signed/unsigned conversions... ;-)


T

-- 
It is not the employer who pays the wages. Employers only handle the money. It is the customer who pays the wages. -- Henry Ford


-----------------------------------snip---------------------------------------
module nopromote;

enum isNarrowInt(T) = is(T : int) || is(T : uint);

/**
 * A wrapper around a built-in narrow int that truncates the result of
 * arithmetic operations to the narrow type, overriding built-in int promotion
 * rules.
 */
struct Np(T)
    if (isNarrowInt!T)
{
    T impl;
    alias impl this;

    /**
     * Truncating binary operator.
     */
    Np opBinary(string op, U)(U u)
        if (is(typeof((T x, U y) => mixin("x " ~ op ~ " y"))))
    {
        return Np(cast(T) mixin("this.impl " ~ op ~ " u"));
    }

    /**
     * Truncating unary operator.
     */
    Np opUnary(string op)()
        if (is(typeof((T x) => mixin(op ~ "cast(int) x"))))
    {
        return Np(cast(T) mixin(op ~ " cast(int) this.impl"));
    }

    /**
     * Infectiousness: any expression containing Np should automatically use Np
     * operator semantics.
     */
    Np opBinaryRight(string op, U)(U u)
        if (is(typeof((T x, U y) => mixin("x " ~ op ~ " y"))))
    {
        return Np(cast(T) mixin("u " ~ op ~ " this.impl"));
    }
}

/**
 * Returns: A lightweight wrapped type that overrides built-in arithmetic
 * operators to always truncate to the given type without promoting to int or
 * uint.
 */
auto np(T)(T t)
    if (isNarrowInt!T)
{
    return Np!T(t);
}

// Test binary ops
@safe unittest
{
    ubyte x = 1;
    ubyte y = 2;
    auto z = x.np + y;
    static assert(is(typeof(z) : ubyte));
    assert(z == 3);

    byte zz = x.np + y;
    assert(zz == 3);

    x = 255;
    z = x.np + y;
    assert(z == 1);
}

@safe unittest
{
    byte x = 123;
    byte y = 5;
    auto z = x.np + y;
    static assert(is(typeof(z) : byte));
    assert(z == byte.min);

    byte zz = x.np + y;
    assert(zz == byte.min);
}

@safe unittest
{
    import std.random;
    short x = cast(short) uniform(0, 10);
    short y = 10;
    auto z = x.np + y;
    static assert(is(typeof(z) : short));
    assert(z == x + 10);

    short s = x.np + y;
    assert(s == x + 10);
}

// Test unary ops
@safe unittest
{
    byte b = 10;
    auto c = -b.np;
    static assert(is(typeof(c) : byte));
    assert(c == -10);

    ubyte ub = 16;
    auto uc = -ub.np;
    static assert(is(typeof(uc) : ubyte));
    assert(uc == 0xF0);
}

version(unittest)
{
    // These tests are put here as actual module functions, to force optimizer
    // not to discard calls to these functions, so that we can see the actual
    // generated code.
    byte byteNegate(byte b) { return -b.np; }
    ubyte ubyteNegate(ubyte b) { return -b.np; }

    byte byteTest1(int choice, byte a, byte b)
    {
        if (choice == 1)
            return a.np + b;
        if (choice == 2)
            return a.np / b;
        assert(0);
    }

    short shortAdd(short a, short b) { return a.np + b; }

    // Test opBinaryRight
    byte byteRightTest(byte a, byte c)
    {
        auto result = a + c.np;
        static assert(is(typeof(result) : byte));
        return result;
    }

    unittest
    {
        assert(byteRightTest(127, 1) == byte.min);
    }

    short multiTest1(short x, short y)
    {
        return short(2) + 2*(x - y.np);
    }

    unittest
    {
        // Test wraparound semantics.
        assert(multiTest1(32767, 16384) == short.min);
    }

    short multiTest2(short a, short b)
    {
        short x = a;
        short y = b;
        return (2*x + 1) * (y.np/2 - 1);
    }

    unittest
    {
        assert(multiTest2(1, 4) == 3);
    }
}
-----------------------------------snip---------------------------------------
Aug 13 2020
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 06:51:09PM +0000, mw via Digitalmars-d wrote:
 On Thursday, 13 August 2020 at 18:40:40 UTC, matheus wrote:
 On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 ...
 The source of wrong behavior is vec.length having type ulong. It
 would be very unusual for someone to even think about that.
May I ask what type should it be?
Signed (size_t, the length of the machine's address space)
[...] size_t is unsigned, because the address space of a 64-bit machine is 2^64, but a signed value would only be able to address half of that space (2^63). T -- Bomb technician: If I'm running, try to keep up.
Aug 13 2020
next sibling parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 19:03:38 UTC, H. S. Teoh wrote:
 On Thu, Aug 13, 2020 at 06:51:09PM +0000, mw via Digitalmars-d 
 wrote:
 On Thursday, 13 August 2020 at 18:40:40 UTC, matheus wrote:
 On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 ...
 The source of wrong behavior is vec.length having type 
 ulong. It
 would be very unusual for someone to even think about that.
May I ask what type should it be?
Signed (size_t, the length of the machine's address space)
[...] size_t is unsigned, because the address space of a 64-bit machine is 2^64, but a signed value would only be able to address half of that space (2^63).
Yes, I know that, that's why I put it in the brackets. But for practical purposes, half that space is large/good enough: 2^63 = 9,223,372,036,854,775,808. Are you sure your machine has that much memory installed? (very roughly, 9 billion GB?)
Aug 13 2020
next sibling parent reply Tove <tove fransson.se> writes:
On Thursday, 13 August 2020 at 19:11:24 UTC, mw wrote:
 On Thursday, 13 August 2020 at 19:03:38 UTC, H. S. Teoh wrote:
 On Thu, Aug 13, 2020 at 06:51:09PM +0000, mw via Digitalmars-d 
 wrote:
 On Thursday, 13 August 2020 at 18:40:40 UTC, matheus wrote:
 On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier 
 wrote:
 ...
 The source of wrong behavior is vec.length having type 
 ulong. It
 would be very unusual for someone to even think about 
 that.
May I ask what type should it be?
Signed (size_t, the length of the machine's address space)
[...] size_t is unsigned, because the address space of a 64-bit machine is 2^64, but a signed value would only be able to address half of that space (2^63).
Yes, I know that, that's why I put it in the brackets. But for practical purpose: half that space is large/good enough, 2^63 = 9,223,372,036,854,775,808, you sure your machine have that much memory installed? (very roughly, 9G of GB?)
One should always use unsigned whenever possible as it generates better code. Many believe dividing by 2 is simply a shift, but that is not so for signed values.

ssize_t fun_slow(ssize_t x) { return x/2; }
size_t fun_fast(size_t x) { return x/2u; }

fun_slow(long)
    mov rax, rdi
    shr rax, 63
    add rax, rdi
    sar rax
    ret
fun_fast(unsigned long)
    mov rax, rdi
    shr rax
    ret
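The extra instructions in the signed version exist because signed division truncates toward zero, while an arithmetic shift rounds toward negative infinity. A minimal D sketch of that difference follows; the helper names `halfByDivision`, `halfByShift` and `halfUnsigned` are made up for illustration.

```d
import std.stdio;

// Signed division truncates toward zero.
long halfByDivision(long x) { return x / 2; }

// >> on a signed operand is an arithmetic shift: it rounds toward
// negative infinity, so it differs from x / 2 for odd negative values.
long halfByShift(long x) { return x >> 1; }

// For unsigned operands, division by two and a shift always agree.
ulong halfUnsigned(ulong x) { return x / 2; }

void main()
{
    foreach (long x; [7L, -7L, -5000L])
        writeln(x, ": x/2 = ", halfByDivision(x), ", x>>1 = ", halfByShift(x));

    assert(halfByDivision(-7) == -3);
    assert(halfByShift(-7) == -4);
    assert(halfUnsigned(7) == (7UL >> 1));
}
```

For even inputs such as -5000 both forms give the same answer; only odd negative values expose the rounding difference.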
Aug 13 2020
parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 19:24:11 UTC, Tove wrote:
 One should always use unsigned whenever possible as it 
 generates better code, many believe factor 2 is simply a shift, 
 but not so on signed.
I'm fine with that. In many areas of language design, we need to make a choice between correctness vs. raw performance.

But at least we also need an *explicit*, visible warning message after we've made that choice, especially warnings about *correctness* when the choice was made favoring performance. If the choice was made favoring correctness, the user will notice the performance when the program runs.

Personally, I will favor correctness over performance in my program design decisions: make it correct first, and faster later; you never know beforehand where your program's bottleneck is.

I'm sure you know the famous quote: "Premature optimization is the root of all evil!"
Aug 13 2020
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 07:40:28PM +0000, mw via Digitalmars-d wrote:
 On Thursday, 13 August 2020 at 19:24:11 UTC, Tove wrote:
 One should always use unsigned whenever possible as it generates
 better code, many believe factor 2 is simply a shift, but not so on
 signed.
I'm fine with that. In many area of the language design, we need to make a choice between: correctness v.s. raw performance. But at least we also need *explicit* visible warning message after we've made that choice:
I agree that the compiler should at least warn or prohibit implicit conversions between signed/unsigned. It has been the source of quite a number of frustrating bugs over the years -- frustrating mostly because implicit conversion yields unexpected results yet due to code breakage it's unlikely to ever change. Unfortunately I don't see the situation changing anytime soon, unless somebody comes up with a *really* convincing argument that can win Walter over. After the flop with the recent bool != int DIP, I've kinda given up hope that this area of D (int promotion rules, including implicit conversion) will ever improve. I don't agree with making array length signed, though. The language should not whitewash the harsh reality of the underlying hardware, even if we make concessions in the way of warning the user of potentially unexpected/unwanted semantics, such as when there's implicit conversion between signed/unsigned values. T -- The irony is that Bill Gates claims to be making a stable operating system and Linus Torvalds claims to be trying to take over the world. -- Anonymous
Aug 13 2020
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 13.08.20 21:40, mw wrote:
 On Thursday, 13 August 2020 at 19:24:11 UTC, Tove wrote:
 One should always use unsigned whenever possible as it generates 
 better code, many believe factor 2 is simply a shift, but not so on 
 signed.
I'm fine with that. In many area of the language design, we need to make a choice between:  correctness v.s. raw performance.
This is not such a case. There is no good reason to round signed integer division towards zero.
Aug 13 2020
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2020 12:11 PM, mw wrote:
 But for practical purpose: half that space is large/good enough, 2^63 = 
 9,223,372,036,854,775,808, you sure your machine have that much memory 
 installed? (very roughly, 9G of GB?)
Now think about 32 bit address spaces, where arrays larger than 2Gb do happen. For example, I noticed that many 32 bit programs that operated on files, such as compressors, would corrupt files and do mysterious ugly things when given files larger than 2Gb.
Aug 13 2020
prev sibling next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 13 August 2020 at 19:03:38 UTC, H. S. Teoh wrote:
 size_t is unsigned, because the address space of a 64-bit 
 machine is 2^64, but a signed value would only be able to 
 address half of that space (2^63).
The address bus on existing processors only uses like 48 bits and even there the lower three are reserved cuz of alignment. But besides, even if you wanted it all, the signed negative value has the same bit pattern as the high bit set anyway so it isn't like the cpu would care, assuming it was actually mapped. On 32 bit it makes a little more sense to say unsigned but even there the same bit pattern logic applies anyway.
Aug 13 2020
prev sibling parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Thursday, 13 August 2020 at 19:03:38 UTC, H. S. Teoh wrote:
 On Thu, Aug 13, 2020 at 06:51:09PM +0000, mw via Digitalmars-d 
 wrote:
 On Thursday, 13 August 2020 at 18:40:40 UTC, matheus wrote:
 On Thursday, 13 August 2020 at 13:33:19 UTC, bachmeier wrote:
 ...
 The source of wrong behavior is vec.length having type 
 ulong. It
 would be very unusual for someone to even think about that.
May I ask what type should it be?
Signed (size_t, the length of the machine's address space)
[...] size_t is unsigned, because the address space of a 64-bit machine is 2^64, but a signed value would only be able to address half of that space (2^63).
While the rationale makes sense and I'm definitely in the camp of unsigned size_t, signed addresses can without problem access the whole address range. The other half will then be addressed with negative numbers. The last address 0xFFFFFFFFFFFFFFFF is -1L (that's what was used on Apple II integer basic, which only had signed 16 bit integers as variable type, that's why entering monitor was done with CALL -151 and not CALL 65385).
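The bit-pattern equivalence can be checked directly. This is only a sketch; the helper names `asSigned16`, `asUnsigned16` and `asSigned64` are invented for the example.

```d
import std.stdio;

// Reinterpret a 16-bit address as a signed value, the way a BASIC with
// only signed 16-bit integers would have to express it.
short asSigned16(ushort address) { return cast(short) address; }

// And back again: the bits are unchanged, only the interpretation differs.
ushort asUnsigned16(short value) { return cast(ushort) value; }

// The same reinterpretation at 64 bits.
long asSigned64(ulong address) { return cast(long) address; }

void main()
{
    assert(asSigned16(65385) == -151);
    assert(asUnsigned16(-151) == 65385);
    assert(asSigned64(ulong.max) == -1L);
    writeln("65385 as short: ", asSigned16(65385));
}
```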
Aug 13 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2020 3:10 AM, H. S. Teoh wrote:
 You're mixing signed and unsigned values. That's generally dangerous
 territory where integer promotion rules inherited from C/C++ take over
 and cause sometimes weird effects, like here. Changing integer promotion
 rules will probably never happen now, because it will cause massive
 *silent* breakage of existing code. So, in the spirit of defensive
 programming, I recommend avoiding mixing signed/unsigned values in this
 way.
There are some things everyone simply needs to know to use a systems programming language successfully:

1. How 2's complement arithmetic works, especially in regard to how negative numbers are handled.
2. The range of values (signed and unsigned) in 2's complement values.
3. Overflow is not detected.
4. 2's complement arithmetic wraps around.
5. The integral promotion rules (the issue in this thread).

I say "systems programming language" because although other languages (like Python) take care of these issues, that comes at a high cost in terms of performance.

Some languages (like Java) get rid of the signed/unsigned issue by getting rid of unsigned integer types. This choice makes it very hard to do some sorts of operations. The choice Java made to remove unsigned integers is an indication that "just add a warning" is not as workable as it sounds.

The integral promotions rule is in D because:

1. C/C++ programmers are very used to it. Subtly changing the rules will make transfer of code and skills C <=> D a much riskier proposition, especially if you're not the person who wrote that code.
2. Interoperability of C <=> D and even machine translation is far more pragmatic if these rules are followed.

Historical Note: Before 1990, half of the C compilers used "value preserving" integral promotions, half used "sign preserving". C was undergoing standardization, and a great debate raged about which one was better. Eventually, one was picked, and the other compiler vendors had to suck it up and change, and the newly broken C code had to be fixed. ("Value preserving" was picked, which is why ubyte promotes to int, not uint.)

These rules often do cause some difficulty with people new to C/C++/D. I know they seem insane to them. But they aren't hard to learn, and it's well worth the few minutes it takes to do it.

To check for overflows, etc., use core.checkedint:

https://dlang.org/phobos/core_checkedint.html

If you're willing to accept some performance reduction, std.experimental.checkedint provides integral types that protect against all kinds of integer arithmetic issues, including "unexpected change of sign":

https://dlang.org/phobos/std_experimental_checkedint.html
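As a rough illustration of the core.checkedint style, here is a small sketch using its `adds` function, which takes a `ref bool` overflow flag. The wrappers `checkedAdd` and `checkedSum` are hypothetical helpers written for this example, not part of druntime.

```d
import core.checkedint : adds;
import std.stdio;

// Adds two longs; `ok` is false when the true sum does not fit in a long.
long checkedAdd(long a, long b, out bool ok)
{
    bool overflow = false;
    immutable result = adds(a, b, overflow);
    ok = !overflow;
    return result;
}

// The flag is sticky: adds() only ever sets it, so a whole chain of
// operations can be checked once at the end.
long checkedSum(const long[] values, ref bool overflow)
{
    long total = 0;
    foreach (v; values)
        total = adds(total, v, overflow);
    return total;
}

void main()
{
    bool ok;
    assert(checkedAdd(1, 2, ok) == 3 && ok);

    checkedAdd(long.max, 1, ok);
    assert(!ok);                 // the result should not be trusted here

    bool overflow = false;
    assert(checkedSum([1L, 2L, 3L], overflow) == 6 && !overflow);
    writeln("overflow detected: ", !ok);
}
```

Note that this catches overflow only. It does not by itself flag the signed/unsigned mixing that started this thread, since that conversion happens before any checked function is called.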
Aug 13 2020
next sibling parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 21:33:11 UTC, Walter Bright wrote:

 To check for overflows, etc., use core.checkedint:

 https://dlang.org/phobos/core_checkedint.html

 If you're willing to accept some performance reduction, 
 std.experimental.checkedint provides integral types that 
 protect against all kinds of integer arithmetic issues, 
 including "unexpected change of sign":

 https://dlang.org/phobos/std_experimental_checkedint.html
So instead of making users change their existing code *manually* all over the place, e.g.

auto r = new int[a.length + b.length];
==>
auto r = new int[(checked(a.length) + b.length).get];

For users / applications that do value correctness more than performance, can we have a compiler switch which turns all the types & operations (e.g. in modules that users also specify on the command line) into core_checkedint or std_experimental_checkedint *automatically*?
Aug 13 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2020 3:27 PM, mw wrote:
 For users / applications that do value correctness more than performance, can
we 
 have a compiler switch which turn all the types & operations (e.g. in modules, 
 that users also specified on command-line) into core_checkedint or 
 std_experimental_checkedint *automatically*?
The thing about checkedint is there are several different behaviors one might choose as responses to overflow. There is no one-size-fits-all, if we did pick one we'll inevitably have complaints. checkedint is really a nice package. I encourage you to check it out.
Aug 13 2020
next sibling parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 23:47:30 UTC, Walter Bright wrote:
 On 8/13/2020 3:27 PM, mw wrote:
 For users / applications that do value correctness more than 
 performance, can we have a compiler switch which turn all the 
 types & operations (e.g. in modules, that users also specified 
 on command-line) into core_checkedint or 
 std_experimental_checkedint *automatically*?
The thing about checkedint is there are several different behaviors one might choose as responses to overflow. There is no one-size-fits-all, if we did pick one we'll inevitably have complaints.
I’ve seen the different hooks, and was just about to add: If we choose this approach, the hook can also have its own command line option to let the user in explicit full control of what the integer operation behavior s/he really wants for his/her application.
Aug 13 2020
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 11:54:57PM +0000, mw via Digitalmars-d wrote:
 On Thursday, 13 August 2020 at 23:47:30 UTC, Walter Bright wrote:
[...]
 The thing about checkedint is there are several different behaviors
 one might choose as responses to overflow. There is no
 one-size-fits-all, if we did pick one we'll inevitably have
 complaints.
[...]
 If we choose this approach, the hook can also have its own command
 line option to let the user in explicit full control of what the
 integer operation behavior s/he really wants for his/her application.
[...]
The application can just use Checked instantiated with whatever preferences it wants to. Define an alias to that in a common module and import that, and you're good to go:

	alias Int = Checked!(... /* whatever settings you want here */);
	...
	Int i, j;
	Int k = i + j;
	// etc.

What we do *not* want is a compiler switch that will silently change the meaning of existing code. Imagine the chaos if you compiled your application with -int-behaviour=blahblah, and suddenly all of your dub dependencies start misbehaving because their code was written with the standard rules in mind, and subtly changing that causes massive breakage of their internal logic. Or worse, silent, subtle breakage in rare corner cases, of the kind that introduces security holes without a single warning, the kind that cyber-criminals love to exploit.

T

--
Political correctness: socially-sanctioned hypocrisy.
Aug 13 2020
parent mw <mingwu gmail.com> writes:
On Friday, 14 August 2020 at 00:04:55 UTC, H. S. Teoh wrote:
 On Thu, Aug 13, 2020 at 11:54:57PM +0000, mw via Digitalmars-d 
 wrote:
 On Thursday, 13 August 2020 at 23:47:30 UTC, Walter Bright 
 wrote:
[...]
 The thing about checkedint is there are several different 
 behaviors one might choose as responses to overflow. There 
 is no one-size-fits-all, if we did pick one we'll inevitably 
 have complaints.
[...]
 If we choose this approach, the hook can also have its own 
 command line option to let the user in explicit full control 
 of what the integer operation behavior s/he really wants for 
 his/her application.
[...] The application can just use CheckedInt instantiated with whatever preferences it wants to. Define an alias to that in a common module and import that, and you're good to go: alias Int = CheckInt!(... /* whatever settings you want here */);
This doesn't work for language built-ins such as array.length.
 What we do *not* want is a compiler switch that will silently 
 change the meaning of existing code. Imagine the chaos if you 
 compiled your application with -int-behaviour=blahblah, and 
 suddenly all of your dub dependencies start misbehaving because 
 their code was written with the standard rules in mind, and
I saw that problem already; that's why I also said: "in modules, that users also specified on command-line". Now, on second thought, it could be done at the module file level: let the user specify it at the top.
Aug 13 2020
prev sibling parent reply mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 23:47:30 UTC, Walter Bright wrote:

 checkedint is really a nice package. I encourage you to check 
 it out.
Yes, it works:

https://github.com/mingwugmail/dlang_tour/blob/master/divbug.d

But it needs a more *automatic* way to use this package to be useful, i.e. minimal change to the existing code base (all over the place), from:

----------------------------------------------------------
void default_div() {
  long[] a = [-5000L, 0L];
  long c = sum(a) / a.length;
  writeln(c);  // 9223372036854773308
}
----------------------------------------------------------

to:

----------------------------------------------------------
alias Long = Checked!long;

void checked_div() {
  Long[] a = [checked(-5000L), checked(0L)];
  Long c = sum(a) / a.length;  // compile error
  writeln(c);  // Checked!(long, Abort)(-2500)
}
----------------------------------------------------------

currently, there is a compile error

Error: template std.experimental.checkedint.Checked!(long, Abort).Checked.__ctor cannot deduce function from argument types !()(Checked!(ulong, Abort))

have to re-write it as:

```
  Long c = sum(a);
  c /= a.length;
```
Aug 13 2020
next sibling parent mw <mingwu gmail.com> writes:
On Friday, 14 August 2020 at 02:10:20 UTC, mw wrote:
 currently, there is a compile error
 Error: template std.experimental.checkedint.Checked!(long, 
 Abort).Checked.__ctor cannot deduce function from argument 
 types !()(Checked!(ulong, Abort))
Actually this is the compile error I wanted! to force me to re-write.
Aug 13 2020
prev sibling next sibling parent mw <mingwu gmail.com> writes:
On Friday, 14 August 2020 at 02:10:20 UTC, mw wrote:

The only other tedious thing one needs to do is manually add checked() to the literals in the array:

```
   Long b = 1L;  // OK
// Long[] a = [-5000L, 0L];  // Error: cannot implicitly convert expression [-5000L, 0L] of type long[] to Checked!(long, Abort)[]
   Long[] a = [checked(-5000L), checked(0L)];  // here, for each literal
```
Aug 13 2020
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Aug 14, 2020 at 02:10:20AM +0000, mw via Digitalmars-d wrote:
[...]
 currently, there is a compile error
 Error: template std.experimental.checkedint.Checked!(long,
 Abort).Checked.__ctor cannot deduce function from argument types
 !()(Checked!(ulong, Abort))
 
 have to re-write it as:
 ```
   Long c = sum(a);
   c /= a.length;
 ```
Try writing it as:

	auto x = Long(sum(a) / a.length);

T

--
He who sacrifices functionality for ease of use, loses both and deserves neither. -- Slashdotter
Aug 13 2020
parent mw <mingwu gmail.com> writes:
On Friday, 14 August 2020 at 02:28:13 UTC, H. S. Teoh wrote:
 On Fri, Aug 14, 2020 at 02:10:20AM +0000, mw via Digitalmars-d 
 wrote: [...]
 currently, there is a compile error
 Error: template std.experimental.checkedint.Checked!(long,
 Abort).Checked.__ctor cannot deduce function from argument 
 types !()(Checked!(ulong, Abort))
^^^^^ ||||| We want that error message, maybe you missed that too :-)
 Try writing it as:
same error; but we want it!
Aug 13 2020
prev sibling parent reply James Blachly <james.blachly gmail.com> writes:
On 8/13/20 5:33 PM, Walter Bright wrote:
 There are some things everyone simply needs to know to use a systems 
 programming language successfully:
 ...
Walter, You make a very good case for the choice overall but not the lack of warnings. Part of the issue (IMO) is that D is /more/ than just a systems language (side note, it sometimes seems it tries to be everything to everyone), hence the original complaint of "silent" or surprising errors cropping up in the context of e.g. Array.length returning ulong. Especially in cases like this, compiler warnings would be helpful. James
Aug 13 2020
next sibling parent mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 23:33:27 UTC, James Blachly wrote:
 On 8/13/20 5:33 PM, Walter Bright wrote:
 You make a very good case for the choice overall but not the 
 lack of warnings. Part of the issue (IMO) is that D is /more/ 
 than just a systems language (side note, it sometimes seems it 
 tries to be everything to everyone), hence the original 
 complaint of "silent" or surprising errors cropping up in the 
 context of e.g.

 Array.length

 returning ulong. Especially in cases like this, compiler 
 warnings would be helpful.
Yes, I want the warning messages, so I know where the potential problems are located and can *explicitly* write an intentional cast to fix them, instead of spending nights debugging the application.
Aug 13 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2020 4:33 PM, James Blachly wrote:
 Especially in cases like this, compiler warnings would be helpful.
Warnings are inherently problematic, as the poor user is always faced with "is this a real problem or not?" I've been very resistant to adding warnings in the past because it balkanizes the language into dialects. The core language should decide what's an error and what isn't.
Aug 13 2020
next sibling parent mw <mingwu gmail.com> writes:
On Thursday, 13 August 2020 at 23:50:08 UTC, Walter Bright wrote:
 On 8/13/2020 4:33 PM, James Blachly wrote:
 Warnings are inherently problematic, as the poor user is always 
 faced with "is this a real problem or not?" I've been very 
 resistant to adding warnings in the past because it balkanizes 
 the language into dialects.
It can be a default-off warning. Users who really care, or whose program runs into trouble, can turn the compiler option on explicitly to find the potential breaks. Personally, I always turn compiler warnings up to the max level.
Aug 13 2020
prev sibling parent reply Simen =?UTF-8?B?S2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:
On Thursday, 13 August 2020 at 23:50:08 UTC, Walter Bright wrote:
 On 8/13/2020 4:33 PM, James Blachly wrote:
 Especially in cases like this, compiler warnings would be 
 helpful.
Warnings are inherently problematic, as the poor user is always faced with "is this a real problem or not?" I've been very resistant to adding warnings in the past because it balkanizes the language into dialects. The core language should decide what's an error and what isn't.
First, DMD has -w switch. I know you don't like it (I remember your disapproval from at least 15 years ago), but it's there. :p It's the logical place to put a warning like this. Having it be a warning means copy-pasted C code will do what it did in C without a hitch, while also providing an easy way to check for something that may not be an error, but can be very surprising.
 To check for overflows, etc., use core.checkedint:
This does not in any way address the problem here, namely that the intuitive way to do things causes issues in possibly-very-rare situations. Telling users to use checkedint for (arr.sum / arr.length) is equivalent to telling them to simply cast arr.length to signed - it's bug-prone in that it's easy to forget, and it's bug-prone in that a new user doesn't know that it's necessary. We want our users to fall into the pit of success, and that is not what's happening here. -- Simen
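For reference, the two forms being compared can be sketched as below. The names `meanMixed` and `meanSigned` are hypothetical. The first mirrors the expression from the original report; the second is the "cast the length to signed" workaround. Both assume a non-empty array.

```d
import std.algorithm.iteration : sum;
import std.stdio;

// Mirrors the bug report: on a 64-bit target values.length is a ulong, so
// the long sum is converted to ulong before the division takes place.
long meanMixed(long[] values)
{
    return sum(values) / values.length;
}

// Casting the length to a signed type keeps the whole division signed.
long meanSigned(long[] values)
{
    return sum(values) / cast(long) values.length;
}

void main()
{
    long[] data = [-5000L, 0L];
    writeln("mixed:  ", meanMixed(data));
    writeln("signed: ", meanSigned(data));
    assert(meanSigned(data) == -2500);
}
```

On x86_64 the mixed version is the one that produces the 9223372036854773308 seen at the start of the thread. Nothing in the source hints that the two functions behave differently, which is the point being made here.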
Aug 13 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2020 10:47 PM, Simen Kjærås wrote:
 This does not in any way address the problem here, namely that the intuitive
way 
 to do things causes issues in possibly-very-rare situations.
On a more personal note, I came to C from Pascal. Pascal required explicit casts everywhere one did mixed integer arithmetic. I grew to dislike it, it was ugly and just plain annoying. Then a friend loaned me K+R, and the way its expressions worked was simple and effective, and looked a lot nicer on the page. I never had any trouble with it. (To be fair, at the time I had done a lot of assembler programming and was very well aware of 2's complement arithmetic. I also had done a lot of PDP-11 assembler and recognized the integral promotion semantics from that.)

I never wrote another line of Pascal.

The only thing I did like about Pascal was its way of nested functions, which were marvelous and hence D's marvelous nested functions :-)

As for intuitive, what that really means is what you're used to. 2's complement arithmetic is intuitive for systems programmers and assembler programmers, people coming from school arithmetic are going to find it unintuitive. There isn't a way to please everyone, we have to make a choice.

P.S. I should add to my previous list that people doing systems programming need to come to terms with size_t, ptrdiff_t, and their model dependent behavior. At least D doesn't do random sizes for int, and the random signed-ness of char. C people get regularly punished for assuming char is signed or unsigned.

Maybe I should expand this into an article.
Aug 14 2020
next sibling parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Friday, 14 August 2020 at 09:12:50 UTC, Walter Bright wrote:
 On 8/13/2020 10:47 PM, Simen Kjærås wrote:
 This does not in any way address the problem here, namely that 
 the intuitive way to do things causes issues in 
 possibly-very-rare situations.
On a more personal note, I came to C from Pascal. Pascal required explicit casts everywhere one did mixed integer arithmetic. I grew to dislike it, it was ugly and just plain annoying.
To be fair, this issue specifically only happens with int/uint division. Addition, subtraction and even multiplication are all fine. (*Why* is multiplication fine? I have no idea... but it works in spot testing.)
Aug 14 2020
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 14 August 2020 at 09:32:58 UTC, FeepingCreature wrote:
 (*Why* is multiplication fine? I have no idea... but it works 
 in spot testing.)
mul 32 bit * 32 bit and imul 32*32 give the same result in the lower 32 bits of the result. Only the upper half shows the difference.

The algorithm* in binary looks kinda like a sign extension followed by a series of shifts and adds.

* https://en.wikipedia.org/wiki/Multiplication_algorithm#Binary_or_Peasant_multiplication

The signed product is the sum of left-shifted values after sign extension. We know the sum of two's complement is the same regardless of sign after extension, and that left shift is going to naturally only start to become an issue on the high word.

Thus same deal for a 32 bit result (int = int * int), but you can expect differences in a 64 bit result (long = int * int).

Let's first demo it with a 4 bit example. Say -2 * 3.

-2 = 1110 (2 = 0010, then flip the bits + 1: 1101 + 1 = 1110)
 3 = 0011

I'll do the long form in 8 bit, so first, we need to sign extend them (which is just duplicating the msb all the way left):

   shr    ---    shl
11111110 * 00000011
-------------------
01111111 * 00000110
00111111 * 00001100
00011111 * 00011000
00001111 * 00110000
00000111 * 01100000
00000011 * 11000000
00000001 * 10000000 (c)

Now we get the sum of all the rhs values there if the lhs small bit is set... which happens to be all of them here

11111010 (c)

That's obviously signed, flip the bits and add one to get our result back out, -6. Whether we chop off or keep those high bits, no difference.

That was an imul since I sign extended. Now, let's do mul, the unsigned one. Same deal except we just pad left with zero:

00001110 * 00000011 (unsigned would be 14 * 3)
-------------------
00000111 * 00000110
00000011 * 00001100
00000001 * 00011000

Sum: 00101010 (unsigned is 42)

Well, those lower bits look the same... 1010, though here we interpret that as decimal 10 instead of -6, but same bits so if the compiler casted back to signed we'd never know the difference.

But those upper bits... oh my, zeroes instead of ones, positive number.

With positive values, sign extension and zero extension are the same thing - all zeroes. And since 0 * x = 0 for all x, it is all discarded once it shifts into the target bits.

But with negative values, the sign extension gives us a bunch of ones on the lhs to shift in. The rhs doesn't really care - it gets a bunch of zeroes shifted in on the right so it ignores it. But those ones on the left change the high word, it now results in that rhs stuff getting added.

And if you wanna do a test program with 32 bit numbers of course you will see this same result. Same result as int, discarding those upper 32 bits, but different assigned long or ulong since now the initial sign extension led to different values up there. But since C and D will both happily discard those without an explicit cast you might never even know.

sorry if this was a bit wordy, if i had the time, i would edit it down more
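A small test program along the lines suggested above might look like this. It is a sketch with made-up helper names; the 64-bit expected value is simply 0xFFFFFFFE * 3 worked out by hand.

```d
import std.stdio;

// 32-bit result: the low 32 bits are identical whether the operands are
// treated as signed or unsigned.
int  lowSigned(int a, int b)     { return a * b; }
uint lowUnsigned(uint a, uint b) { return a * b; }

// 64-bit result: sign extension vs zero extension changes the upper half.
long  wideSigned(int a, int b)     { return cast(long) a * b; }
ulong wideUnsigned(uint a, uint b) { return cast(ulong) a * b; }

void main()
{
    int a = -2;
    int b = 3;
    uint ua = cast(uint) a;   // 0xFFFF_FFFE, same bits as -2
    uint ub = cast(uint) b;

    // Same low 32 bits either way.
    assert(lowSigned(a, b) == -6);
    assert(cast(int) lowUnsigned(ua, ub) == -6);

    // Different once the upper 32 bits are kept.
    assert(wideSigned(a, b) == -6);
    assert(wideUnsigned(ua, ub) == 12_884_901_882UL);

    writefln("signed 64-bit:   %016X", cast(ulong) wideSigned(a, b));
    writefln("unsigned 64-bit: %016X", wideUnsigned(ua, ub));
}
```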
Aug 14 2020
prev sibling next sibling parent claptrap <clap trap.com> writes:
On Friday, 14 August 2020 at 09:12:50 UTC, Walter Bright wrote:
 On 8/13/2020 10:47 PM, Simen Kjærås wrote:

 As for intuitive, what that really means is what you're used to.
Everybody is used to -50/2 = -25. I'd argue that intuition is far more fundamental than two's complement. I'd also argue a lot of experienced C/C++ programmers would probably get caught out by that bug because it's the kind of thing that's easy to forget or overlook. That's why it should be an error. I reckon most C/C++ people would think it's actually a good idea.
Aug 14 2020
prev sibling next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 14 August 2020 at 09:12:50 UTC, Walter Bright wrote:
 2's complement arithmetic is intuitive for systems programmers 
 and assembler programmers
A lot of us know perfectly well how it works (at least when we stop and think about it for a second), but it is still easy to just unintentionally mix things and screw it up without realizing it since the edge cases may not manifest quickly.

But I do find this a little ironic given how absolutely brutal D is when it comes to narrowing conversions after implicit promotion. You know:

ushort a, b;
ushort c = a + b; // requires explicit cast!

even

ushort c = ushort(a + b);

complains. Gotta explicitly cast the whole parenthetical, and it is pretty annoying. You and I both know the CPU can do this trivially, in x86 asm it is just `add ax, bx`. Of course we both also know C promotes them to ints (ok to uints here) before any arithmetic and the carry bit is discarded when truncated back to ushort and I understand the concern the compiler is trying to communicate... just I don't care, I want 16 bit operations here.

So anyway I bring this up in this thread because when it comes to signed and unsigned, you say we need to know how two's complement works, this is a systems programming language. But when it comes to discarding carry bits on 16 bit operations, now the compiler treats us like we're ignorant fools.

Like I said, I sometimes ignorantly and foolishly mix signed and unsigned when I should know better. I just forget. And maybe I sometimes would do that with int and ushort or whatever too if the compiler didn't say something. So I get why it does this.

Just it irks me and seeing the very different behavior for these two situations doesn't make much sense to me. Either say we need to know how it works and make the compiler accept it or say we are prone to mistakes and make the compiler complain.

I find it hard to believe explicit 16 bit arithmetic is more prone to real world bugs than unsigned issues.
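The narrowing rule described here can be pinned down with a couple of compile-time checks. This is a sketch; `add16` is a hypothetical helper that spells out the cast the compiler asks for.

```d
import std.stdio;

// ushort + ushort is computed as int after integral promotion.
static assert(is(typeof(ushort.init + ushort.init) == int));

// Without a cast, the narrowing assignment is rejected at compile time.
static assert(!__traits(compiles, { ushort a, b; ushort c = a + b; }));

// Wrapping 16-bit addition: the cast discards the carry out of bit 15.
ushort add16(ushort a, ushort b)
{
    return cast(ushort)(a + b);
}

void main()
{
    assert(add16(65535, 1) == 0);
    assert(add16(40000, 30000) == 4464);   // 70000 - 65536
    writeln(add16(65535, 1));
}
```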
Aug 14 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/14/2020 6:32 AM, Adam D. Ruppe wrote:
 just I don't care, I want 16 bit operations here.
The design considerations are mutually incompatible:

1. performance
2. aesthetics
3. requiring casts breaks generic code
4. should be able to index the entire address space
5. I need wraparound arithmetic
6. I need saturation arithmetic
7. must throw exception on overflow/underflow
8. must not throw exception on overflow/underflow
9. should work like 3rd grade arithmetic
10. type inference
11. should work like C
13. should work like Java
14. should work like Python

Language design is an art where we do the best we can given what the language use is targeted at.

D is meant for high performance, systems programming, memory safety, and C compatibility. The compromises go in that direction.
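To make the tension between wraparound arithmetic and overflow detection concrete, here is a small sketch, assuming druntime's `core.checkedint` module is available. The helper `addOverflows` is a hypothetical name introduced only for this example:

```d
import core.checkedint : adds;

// Hypothetical helper: reports whether x + y fits in an int.
bool addOverflows(int x, int y)
{
    bool overflow = false;
    const sum = adds(x, y, overflow); // sets the flag on signed overflow
    return overflow;
}

void main()
{
    int x = int.max;
    x += 1;                 // built-in arithmetic wraps silently
    assert(x == int.min);

    assert(addOverflows(int.max, 1)); // the same addition, but detected
    assert(!addOverflows(1, 2));
}
```

Built-in operators give the wraparound behaviour for free, while detection costs an explicit call and a flag check. That is one reason a single default cannot satisfy both requirements.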
Aug 14 2020
next sibling parent Avrina <avrina12309412342 gmail.com> writes:
On Friday, 14 August 2020 at 21:29:42 UTC, Walter Bright wrote:
 On 8/14/2020 6:32 AM, Adam D. Ruppe wrote:
 just I don't care, I want 16 bit operations here.
 The design considerations are mutually incompatible:

 1. performance
 2. aesthetics
 3. requiring casts breaks generic code
 4. should be able to index the entire address space
 5. I need wraparound arithmetic
 6. I need saturation arithmetic
 7. must throw exception on overflow/underflow
 8. must not throw exception on overflow/underflow
 9. should work like 3rd grade arithmetic
 10. type inference
 11. should work like C
 13. should work like Java
 14. should work like Python
I think most people would rather not have it work like C, or, as a compromise, have it work like C and also warn like C compilers do, since it is bug prone. Both of those are better than what D is doing now. The list reads like a strawman; it seems there's a disconnect between the designers and the users who write actual code.
 Language design is an art where we do the best we can given 
 what the language use is targeted at.

 D is meant for high performance, systems programming, memory 
 safety, and C compatibility. The compromises go in that 
 direction.
Most people looking for a language for high performance and systems programming would probably not want a GC. They can avoid the GC, but then a lot of features are lost because they specifically cater to the GC.
Aug 14 2020
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/14/20 5:29 PM, Walter Bright wrote:
 On 8/14/2020 6:32 AM, Adam D. Ruppe wrote:
 just I don't care, I want 16 bit operations here.
 The design considerations are mutually incompatible:

 1. performance
 2. aesthetics
 3. requiring casts breaks generic code
 4. should be able to index the entire address space
 5. I need wraparound arithmetic
 6. I need saturation arithmetic
 7. must throw exception on overflow/underflow
 8. must not throw exception on overflow/underflow
 9. should work like 3rd grade arithmetic
 10. type inference
 11. should work like C
 13. should work like Java
 14. should work like Python
Heh, this looks like a list of things CheckedInt is supposed to help with.
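As a rough illustration of the point, here is a sketch of the saturation policy using Phobos' checked integer module. It assumes the module is importable as `std.checkedint` (in older releases it lived under `std.experimental.checkedint`). The helper `saturatingAdd` is a made-up name, not part of the library:

```d
import std.checkedint : checked, Saturate;

// Hypothetical helper: add with saturation instead of wraparound.
int saturatingAdd(int a, int b)
{
    auto x = checked!Saturate(a); // Checked!(int, Saturate)
    x += b;                       // clamps to int.max / int.min on overflow
    return x.get;
}

void main()
{
    assert(saturatingAdd(int.max, 1) == int.max);
    assert(saturatingAdd(int.min, -1) == int.min);
    assert(saturatingAdd(2, 3) == 5);
}
```

Swapping the hook type is how the library selects a different policy, so the choice is made per variable rather than by the language.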
Aug 17 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/17/2020 4:24 AM, Andrei Alexandrescu wrote:
 Heh, this looks like a list of things CheckedInt is supposed to help with.
Exactly :-)
Aug 19 2020
parent mw <mingwu gmail.com> writes:
On Thursday, 20 August 2020 at 01:25:17 UTC, Walter Bright wrote:
 On 8/17/2020 4:24 AM, Andrei Alexandrescu wrote:
 Heh, this looks like a list of things CheckedInt is supposed 
 to help with.
 Exactly :-)
Walter, we also found issues that would be better fixed in the compiler: https://forum.dlang.org/thread/omskttpjhwmmhifymiha forum.dlang.org?page=2 (on page 2 of the thread). In particular: https://issues.dlang.org/show_bug.cgi?id=21175
Aug 19 2020
prev sibling parent Mathias LANG <geod24 gmail.com> writes:
On Friday, 14 August 2020 at 09:12:50 UTC, Walter Bright wrote:
 On 8/13/2020 10:47 PM, Simen Kjærås wrote:
 This does not in any way address the problem here, namely that 
 the intuitive way to do things causes issues in 
 possibly-very-rare situations.
 On a more personal note, I came to C from Pascal. Pascal required explicit casts everywhere one did mixed integer arithmetic. I grew to dislike it, it was ugly and just plain annoying.
IMO that's the most important point. If we were to make it a warning, it would trigger *everywhere*, and very often on valid code. The signal to noise ratio would be terrible, as it currently is for C++ code.

I really think there is room for improvement, but I would strongly oppose anything that triggers false positives. In D I never feel like I have to please the compiler, because errors are almost always sane. In C++, you always have to throw a few compiler switches (`-Wall -Wextra`) and then write your code so that the compiler is happy. No thanks.

If anyone wants an example of bad warnings in D: https://issues.dlang.org/show_bug.cgi?id=14835
Aug 19 2020
prev sibling next sibling parent Avrina <avrina12309412342 gmail.com> writes:
This has been ongoing since almost the beginning.

https://issues.dlang.org/show_bug.cgi?id=259

It might be the oldest filed D bug now (just checked, it is). Of 
course, because Walter detests warning messages no resolution 
will probably ever be made. The only thing Walter contributed to 
that issue is trying to close it without resolving it.

Here's to 14 years.
Aug 13 2020
prev sibling parent reply Guillaume Piolat <first.name gmail.com> writes:
On Thursday, 13 August 2020 at 07:22:18 UTC, mw wrote:
 void main() {
   long   a = -5000;
   size_t b = 2;
   long   c = a / b;
   writeln(c);
 }


 $ dmd divbug.d
 $ ./divbug
 9223372036854773308
Feels correct to me!

When you have an unsigned and signed integer mixed with a binary operator, the operands are converted to unsigned.

This is how it works in C and C++ and we wouldn't be able to port C code to D if this were to be changed.
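The conversion can be checked directly. The sketch below uses `ulong` instead of `size_t` so that it behaves the same on 32-bit and 64-bit targets, and the helper names `divAsWritten` and `divSigned` are hypothetical:

```d
// Hypothetical helpers contrasting the two readings of `a / b`.
ulong divAsWritten(long a, ulong b)
{
    return a / b;            // `a` is converted to ulong before dividing
}

long divSigned(long a, ulong b)
{
    return a / cast(long) b; // both operands signed; assumes b <= long.max
}

void main()
{
    // The mixed expression has an unsigned type.
    static assert(is(typeof(long.init / ulong.init) == ulong));

    // -5000 reinterpreted as ulong is 2^64 - 5000; half of that is the
    // value printed in the original report.
    assert(divAsWritten(-5000, 2) == 9223372036854773308UL);
    assert(divSigned(-5000, 2) == -2500);
}
```

Casting the unsigned operand is only safe when it is known to fit in a `long`, which is usually true for array lengths but is still an assumption the programmer has to make explicitly.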
Aug 13 2020
parent jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 13 August 2020 at 21:09:41 UTC, Guillaume Piolat 
wrote:
 [snip]

 Feels correct to me !

 When you have an unsigned and signed integer mixed with a 
 binary operator, the operands are converted to unsigned.

 This is how it works in C and C++ and we wouldn't be able to 
 port C code to D if this were to be changed.
One way to look at it is that a design goal of D is that a C user should be able to copy and paste code into D with minimal changes. From that perspective, the integer promotion rules make sense. However, if the design goal were instead based upon automatic conversion of C code to D, particularly given the (mostly) automatic conversion of dmd from C to D, then different integer promotion rules would not have been as significant a blocker for people coming to D from C. At this point, it would be a big breaking change.

Also, we have templates and operator overloading. Nothing stops people from making their own Int type that has different semantics for division.

import std.traits: isIntegral;

struct Integer(T)
    if (isIntegral!T)
{
    T x;
    alias x this;

    Integer!T opBinary(string op)(size_t rhs)
    {
        assert(rhs < int.max);
        static if (op == "/")
            return Integer!T(x / cast(T) rhs);
        else
            static assert(0, "Operator " ~ op ~ " not implemented");
    }
}

void main()
{
    auto x = Integer!long(-5000L);
    size_t y = 2;
    auto z = x / y;
    import std.stdio: writeln;
    assert(z == -2500);
}
Aug 13 2020