
digitalmars.D - Common type of ubyte and const ubyte is int

reply Steven Schveighoffer <schveiguy gmail.com> writes:
```d
ubyte a;
const(ubyte) b;

auto x = true ? a : b;

pragma(msg, typeof(x)); // int
```

Why? This isn't a VRP problem, as both are `ubyte`.

Related issue

https://issues.dlang.org/show_bug.cgi?id=19817

-Steve
Apr 30
next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Apr 30, 2024 at 06:19:28PM +0000, Steven Schveighoffer via
Digitalmars-d wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817
[...]

Betcha it's something to do with integer promotion rules.

T

-- 
Just because you survived after you did it, doesn't mean it wasn't stupid!
Apr 30
parent Daniel N <no public.email> writes:
On Tuesday, 30 April 2024 at 19:23:58 UTC, H. S. Teoh wrote:
 On Tue, Apr 30, 2024 at 06:19:28PM +0000, Steven Schveighoffer 
 via Digitalmars-d wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817
[...] Betcha it's something to do with integer promotion rules. T
WAT! Thanks for posting this before it bit me!
Apr 30
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 4/30/24 20:19, Steven Schveighoffer wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817
 
 -Steve
FWIW, my frontend says `ubyte` here and there is a comment somewhere that claims it implements some rules from TDPL page 60.
Apr 30
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Tuesday, 30 April 2024 at 21:18:36 UTC, Timon Gehr wrote:
 On 4/30/24 20:19, Steven Schveighoffer wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817
 
 -Steve
FWIW, my frontend says `ubyte` here and there is a comment somewhere that claims it implements some rules from TDPL page 60.
Thanks to phone OCR, this is what is on that page:

```
1. If a and b have the same type, T is that type;
2. else if a and b are integrals, first promote anything smaller than 32-bit to int, then choose T as the larger type, with a preference for unsigned type if tied in size;
3. else if one is an integral and the other is a floating-point type, T is the floating-point type;
4. else if both have floating-point types, T is the larger of the two;
5. else if the types have a common supertype (e.g., base class), T is that supertype (we will return to this topic in Chapter 6);
6. else try implicitly converting a to b's type and b to a's type; if exactly one of these succeeds, T is the type of the successful conversion target;
7. else the expression is in error.
```

It seems rule 2 would apply instead of rule 6? but I don't like it.

-Steve
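For reference, the rules quoted above can be probed with `typeof` on a conditional expression. What follows is a minimal sketch rather than code from the thread: the variable and class names are made up for illustration, and only uncontroversial cases are asserted. The `ubyte` / `const(ubyte)` case is merely printed at compile time, since its result is exactly what is being disputed here.

```d
// Hypothetical names throughout; only the language rules are being probed.
class Base {}
class Derived : Base {}

unittest
{
    ubyte ub;
    const(ubyte) cub;
    short s;
    int i;
    uint ui;
    long l;
    float f;
    double d;
    Base base;
    Derived derived;

    // Rule 1: identical types stay as they are.
    static assert(is(typeof(true ? ub : ub) == ubyte));

    // Rule 2: differing integrals are promoted to at least int,
    // then the larger type wins (unsigned wins a size tie).
    static assert(is(typeof(true ? ub : s) == int));
    static assert(is(typeof(true ? i : l) == long));
    static assert(is(typeof(true ? i : ui) == uint));

    // Rules 3 and 4: floating point wins, and the larger one wins.
    static assert(is(typeof(true ? i : d) == double));
    static assert(is(typeof(true ? f : d) == double));

    // Rule 5: common supertype.
    static assert(is(typeof(true ? derived : base) == Base));

    // The case from this thread: printed at compile time, not asserted.
    pragma(msg, typeof(true ? ub : cub));
}
```

Compiling with something like `dmd -unittest -main` makes the compiler analyse the `unittest` block, so the `static assert`s are checked and the `pragma(msg)` line prints the type the compiler picks for the disputed case.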
May 01
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 5/1/24 16:42, Steven Schveighoffer wrote:
 On Tuesday, 30 April 2024 at 21:18:36 UTC, Timon Gehr wrote:
 On 4/30/24 20:19, Steven Schveighoffer wrote:
 ```d
 ubyte a;
 const(ubyte) b;

 auto x = true ? a : b;

 pragma(msg, typeof(x)); // int
 ```

 Why? This isn't a VRP problem, as both are `ubyte`.

 Related issue

 https://issues.dlang.org/show_bug.cgi?id=19817

 -Steve
FWIW, my frontend says `ubyte` here and there is a comment somewhere that claims it implements some rules from TDPL page 60.
Thanks to phone OCR, this is what is on that page: ``` 1. If a and b have the same type, T is that type; 2. else if a and b are integrals, first promote anything smaller than 32-bit to int, then choose T as the larger type, with a preference for unsigned type if tied in size; 3. else if one is an integral and the other is a floating-point type, T is the floating-point type; 4. else if both have floating-point types, T is the larger of the two; 5. else if the types have a common supertype (e.g., base class), T is that supertype (we will return to this topic in Chapter 6); 6. else try implicitly converting a to b's type and b to a's type; if exactly one of these succeeds, T is the type of the successful conversion target; 7. else the expression is in error. ``` ...
Thanks! Did not have the book at hand myself.
 It seems rule 2 would apply instead of rule 6?
Indeed, I think technically DMD conforms to those rules, but my frontend does not, on this example.
 but I don't like it.
 
 -Steve
Same. (Not biased.)
May 01
prev sibling next sibling parent reply Dom DiSc <dominikus scherkl.de> writes:
On Wednesday, 1 May 2024 at 14:42:25 UTC, Steven Schveighoffer 
wrote:
 ```
 1. If a and b have the same type, T is that type;
 2. else if a and b are integrals, first promote anything 
 smaller than 32-bit to int, then choose T as the larger type, 
 with a preference for unsigned type if tied in size;
 3. else if one is an integral and the other is a floating-point 
 type, T is the floating-point type;
 4. else if both have floating-point types, T is the larger of 
 the two;
 5. else if the types have a common supertype (e.g., base 
 class), T is that supertype (we will return to this topic in 
 Chapter 6);
 6. else try implicitly converting a to b's type and b to a's 
 type; if exactly one of these succeeds, T is the type of the 
 successful conversion target;
 7. else the expression is in error.
 ```

 It seems rule 2 would apply instead of rule 6? but I don't like 
 it.

 -Steve
Yes, those rules are often cumbersome. The promotions should look more like:

```
    ubyte   >  ushort  >   uint   >   ulong
      v          v           v          v
byte > >  short > >   int   > >  long  >
                  v           v          v
                float   >  double  >   real
```

To find the common type, go to the crossing of the rightmost and the downmost of the types and if there is no type at this point, go one further right or if that is not possible go one further down. This works for all numeric basic types.

E.g. common(byte, ubyte) = short,
common(float, uint) = double,
common(long, ulong) = real

If real is 80bit, it can represent all values of long and ulong, because that has 64bit mantissa plus a sign bit (and an exponent, which is not necessary for integer values). If real is smaller, at least we tried our best but the low bits of the value may get lost.

The result of all operators should be the common type. Especially for unary operators the result should always be the same type as the operand. Calculation can be done in the processor word-size, but without explicit cast the result should be truncated to the common type.
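One way to read the diagram is "pick the smallest type that can represent every value of both operands". That reading is an assumption made for this sketch; it reproduces the three examples given in the post, but it is not code from the thread and not how any D compiler behaves. The names `valueBits`, `fitsIn`, `FirstFit` and `ProposedCommon` are hypothetical.

```d
import std.meta : AliasSeq;
import std.traits : isFloatingPoint, isIntegral, isUnsigned;

// Bits needed for the magnitude of an integral type (sign bit excluded).
enum int valueBits(T) = cast(int)(T.sizeof * 8) - (isUnsigned!T ? 0 : 1);

// True if every value of From is exactly representable in To.
template fitsIn(From, To)
{
    static if (isIntegral!From && isIntegral!To)
    {
        static if (isUnsigned!From == isUnsigned!To)
            enum fitsIn = From.sizeof <= To.sizeof;
        else
            // Unsigned fits only in a strictly wider signed type;
            // signed never fits in unsigned.
            enum fitsIn = isUnsigned!From && From.sizeof < To.sizeof;
    }
    else static if (isIntegral!From && isFloatingPoint!To)
        enum fitsIn = valueBits!From <= To.mant_dig;
    else static if (isFloatingPoint!From && isFloatingPoint!To)
        enum fitsIn = From.mant_dig <= To.mant_dig;
    else
        enum fitsIn = false; // floating point never fits in an integral
}

// Candidate result types, ordered from smallest to largest.
alias Candidates = AliasSeq!(byte, ubyte, short, ushort, int, uint,
                             long, ulong, float, double, real);

// Walk the candidates and stop at the first one that holds both operands.
template FirstFit(A, B, Ts...)
{
    static if (Ts.length == 0)
        alias FirstFit = real; // nothing fits: "we tried our best"
    else static if (fitsIn!(A, Ts[0]) && fitsIn!(B, Ts[0]))
        alias FirstFit = Ts[0];
    else
        alias FirstFit = FirstFit!(A, B, Ts[1 .. $]);
}

alias ProposedCommon(A, B) = FirstFit!(A, B, Candidates);

// The three examples from the post.
static assert(is(ProposedCommon!(byte, ubyte) == short));
static assert(is(ProposedCommon!(float, uint) == double));
static assert(is(ProposedCommon!(long, ulong) == real));
```

On targets where `real` has fewer than 64 mantissa bits, the last case reaches the fallback branch, which matches the "tried our best" caveat in the post.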
May 02
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Thursday, 2 May 2024 at 12:34:03 UTC, Dom DiSc wrote:
 Yes, those rules are often cumbersome. The promotions should 
 look more like:
 ```
     ubyte   >  ushort  >   uint   >   ulong
       v          v           v          v
 byte > >  short > >   int   > >  long  >
                  v           v          v
                float   >  double  >   real
 ```
 To find the common type, go to the crossing of the rightmost 
 and the downmost of the types
 and if there is no type at this point, go one further right or 
 if that is not possible go one further down. This works for all 
 numeric basic types.
 E.g. common(byte, ubyte) = short,
 common(float, uint) = double,
 common(long, ulong) = real
But let's take a step back. As far as integrals, they are *the same type*, it's just that one is const and one is not. There is no rule that says if the types are the same type except for type modifiers, what should happen. I think there should be. The result is very unexpected. -Steve
May 02
parent reply DrDread <DrDread cheese.com> writes:
On Thursday, 2 May 2024 at 16:16:23 UTC, Steven Schveighoffer 
wrote:
 On Thursday, 2 May 2024 at 12:34:03 UTC, Dom DiSc wrote:
 [...]
But let's take a step back. As far as integrals, they are *the same type*, it's just that one is const and one is not. There is no rule that says if the types are the same type except for type modifiers, what should happen. I think there should be. The result is very unexpected. -Steve
I'm for not having types promote at all. It has been an endless source of bugs for me, and it messes with metaprogramming.
May 02
parent reply user1234 <user1234 12.de> writes:
On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 I'm for not having types promote at all. it's been an endless 
 source of bugs for me. it messes with metaprogramming.
PLs that don't promote have their own problems too. The most obvious is that overflow happens more easily; promotion at least mitigates that. However, I'm quite sure that D's promotion rules were not designed with that in mind. It's more about C compatibility, walking on the C tracks, to speak metaphorically.
May 02
next sibling parent reply user1234 <user1234 12.de> writes:
On Thursday, 2 May 2024 at 17:25:55 UTC, user1234 wrote:
 On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 I'm for not having types promote at all. it's been an endless 
 source of bugs for me. it messes with metaprogramming.
PLs that dont promote have their own problems too. The most obvious is that overflowing is more easy. At least promotion mitigates that. However I'm quite sure that D promotions rules were not seen as such. It's more C compatibility, walking on the C tracks, to speak metaphorically.
Just remembered: one argument that Walter once put forward is that arithmetic instructions on 32-bit registers would be faster than those on, say, 16-bit or 8-bit registers.
May 02
parent reply Dom DiSc <dominikus scherkl.de> writes:
On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 Just remembered, one argument that was once exposed by Walter 
 is be that arithmetic instructions for 32 bits registers would 
 be faster than the ones let's say for 16 or 8.
Working on the processor word size is always the fastest. But that is totally independent of the result type. Calculation can always be done extended to word-size (or may need multiple words for large operands), but the result should have the common type. To say 32bit is always the best is only true for 32bit architectures. And having no promotion is not an option. If two operands are to be combined, we need some common type for the result, no matter how this result is produced. About type attributes - the operands may be mutable, const or immutable, but the result is a new value with attributes independent of the operand attributes. It should be assignable to mutable variables - to const or immutable objects only during initialization, to shared objects only if they are locked. At least that is what I expect.
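The distinction between the width used for the calculation and the type of the result can be seen in current D. This is a small sketch with made-up variable names: the addition is carried out as `int`, and getting back to the operand type takes an explicit, truncating cast.

```d
unittest
{
    ubyte a = 200;
    ubyte b = 100;

    // The addition itself is carried out after promotion to int.
    auto wide = a + b;
    static assert(is(typeof(wide) == int));
    assert(wide == 300);

    // Narrowing back to the operand type needs an explicit cast in D,
    // which keeps only the low 8 bits.
    ubyte narrow = cast(ubyte)(a + b);
    assert(narrow == 44); // 300 mod 256

    // Without the cast the assignment is rejected at compile time.
    static assert(!__traits(compiles, { ubyte x, y; ubyte z = x + y; }));
}
```

Under the proposal above, the cast would become implicit and `wide` would be a `ubyte`; that is the proposal's behaviour, not what the compiler does today.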
May 02
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 03/05/2024 6:52 PM, Dom DiSc wrote:
 On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 
     Just remembered, one argument that was once exposed by Walter is be
     that arithmetic instructions for 32 bits registers would be faster
     than the ones let's say for 16 or 8.
 
 Working on the processor word size is always the fastest. But that is 
 totally independent of the result type. Calculation can always be done 
 extended to word-size (or may need multiple words for large operands), 
 but the result should have the common type. To say 32bit is always the 
 best is only true for 32bit architectures. And having no promotion is 
 not an option. If two operands are to be combined, we need some common 
 type for the result, no matter how this result is produced.
Normally for x86, 32-bit and 64-bit are equal in speed, but 8-bit and 16-bit are not very far behind these days.

Due to C, it's pretty safe to assume that 32-bit registers will be heavily optimized on processors that are 32-bit or above.

Consider the Ice Lake series, which last had releases in 2020.

DIV IDIV  r8   4   4
DIV IDIV  r16  4   4
DIV IDIV  r32  4   4
DIV IDIV  r64  4   4

Now compare that to the Haswell architecture from 10 years ago:

DIV   r8   9   9
DIV   r16  11  11
DIV   r32  10  10
DIV   r64  36  36
IDIV  r8   9   9
IDIV  r16  10  10
IDIV  r32  9   9
IDIV  r64  59  59

If you don't know the exact target CPU, sticking with 32-bit is still a good recommendation for CPUs that are 32-bit or above.

On the other hand, if you know it's more recent (say, running Windows 11), just use whatever you need and don't worry about it.
May 03
parent Daniel N <no public.email> writes:
On Friday, 3 May 2024 at 07:05:21 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Due to C its pretty safe to assume that 32bit registers will be 
 pretty heavily optimized for 32bit or above processes.

 Consider the Ice Lake series, which last had releases in 2020.

 DIV IDIV	r8	4	4
 DIV IDIV	r16	4	4
 DIV IDIV	r32	4	4
 DIV IDIV	r64	4	4

 Now compare that to Haswell architecture from 10 years ago:

 DIV	r8	9	9
 DIV	r16	11	11
 DIV	r32	10	10
 DIV	r64	36	36
 IDIV	r8	9	9
 IDIV	r16	10	10
 IDIV	r32	9	9
 IDIV	r64	59	59

 If you don't know the exact target CPU sticking with 32bit is 
 still a good recommendation for CPU's that are 32bit or above.

 On the other hand if you know its more recent (say running 
 Windows 11), just use whatever you need and don't worry about 
 it.
That's an interesting data point; however, x86 is unusual in that it supports all register sizes. On other CPUs you have to manually mask the result.
May 03
prev sibling next sibling parent Zoadian <no no.no> writes:
On Friday, 3 May 2024 at 06:52:00 UTC, Dom DiSc wrote:
 On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 Just remembered, one argument that was once exposed by Walter 
 is be that arithmetic instructions for 32 bits registers would 
 be faster than the ones let's say for 16 or 8.
Working on the processor word size is always the fastest. But that is totally independent of the result type. Calculation can always be done extended to word-size (or may need multiple words for large operands), but the result should have the common type. To say 32bit is always the best is only true for 32bit architectures. And having no promotion is not an option. If two operands are to be combined, we need some common type for the result, no matter how this result is produced. About type attributes - the operands may be mutable, const or immutable, but the result is a new value with attributes independent of the operand attributes. It should be assignable to mutable variables - to const or immutable objects only during initialization, to shared objects only if they are locked. At least that is what I expect.
Just force the user to cast when the types are of incompatible sizes instead of promoting them to some arbitrary type. If the user uses smaller types, they generally do it for size reasons and don't want to end up with a bigger type. You normally shouldn't use types <= int32, so it should not be an issue in practice. What is an issue, however, is the compiler picking the wrong templates because of some promotion rules.
May 03
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On Friday, 3 May 2024 at 06:52:00 UTC, Dom DiSc wrote:
 On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 Just remembered, one argument that was once exposed by Walter 
 is be that arithmetic instructions for 32 bits registers would 
 be faster than the ones let's say for 16 or 8.
Working on the processor word size is always the fastest. But that is totally independent of the result type. Calculation can always be done extended to word-size (or may need multiple words for large operands), but the result should have the common type. To say 32bit is always the best is only true for 32bit architectures. And having no promotion is not an option. If two operands are to be combined, we need some common type for the result, no matter how this result is produced.
This is all very interesting, but remember, there is no operation or calculation happening here. We have a construct that is picking an 8 bit unsigned value, or an 8 bit unsigned value, and it is resulting in a 32-bit signed value.

If this kind of thing requires integer promotion then the following should be true:

```d
ubyte x = 5;
auto y = x;
static assert(is(typeof(y) == int));
```

But that is not the case. I don't mean to be pedantic here, but this is the inconsistency I see in this part of the compiler.
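To make the inconsistency concrete, here is a short sketch (the variable names are illustrative) that contrasts a plain copy, an arithmetic expression, and the two conditional cases. The first three are asserted; the mixed `const` case is only printed, because the `int` result is what this thread reports and not something assumed here.

```d
unittest
{
    ubyte x = 5;

    // Plain copy: no promotion, the type is preserved.
    auto y = x;
    static assert(is(typeof(y) == ubyte));

    // Arithmetic: both operands are promoted to int first.
    auto sum = x + x;
    static assert(is(typeof(sum) == int));

    // Conditional with identical operand types: still ubyte.
    auto same = true ? x : x;
    static assert(is(typeof(same) == ubyte));

    // Conditional mixing ubyte and const(ubyte): the thread reports int.
    const(ubyte) c = 7;
    pragma(msg, typeof(true ? x : c));
}
```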
 About type attributes - the operands may be mutable, const or 
 immutable, but the result is a new value with attributes 
 independent of the operand attributes. It should be assignable 
 to mutable variables - to const or immutable objects only 
 during initialization, to shared objects only if they are 
 locked.
Common type modifier differences can be tested with `int`, and I believe the rules are sound there. -Steve
May 03
prev sibling parent reply Daniel N <no public.email> writes:
On Thursday, 2 May 2024 at 17:25:55 UTC, user1234 wrote:
 On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 I'm for not having types promote at all. it's been an endless 
 source of bugs for me. it messes with metaprogramming.
PLs that dont promote have their own problems too. The most obvious is that overflowing is more easy. At least promotion mitigates that. However I'm quite sure that D promotions rules were not seen as such. It's more C compatibility, walking on the C tracks, to speak metaphorically.
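C++ actually gets it right(!)

```cpp
#include <cstdint>
#include <cstddef>
#include <cstdio>

int main(void)
{
    uint8_t a = {};
    const uint8_t b = {};
    const uint16_t c = {};

    auto x = 1 ? a : b; // deduced as uint8_t: same type apart from const
    auto y = 1 ? a : c; // deduced as int: the types differ, so the usual arithmetic conversions apply
}
```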
May 02
parent user1234 <user1234 12.de> writes:
On Thursday, 2 May 2024 at 18:09:56 UTC, Daniel N wrote:
 On Thursday, 2 May 2024 at 17:25:55 UTC, user1234 wrote:
 On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 [...]
PLs that dont promote have their own problems too. The most obvious is that overflowing is more easy. At least promotion mitigates that. However I'm quite sure that D promotions rules were not seen as such. It's more C compatibility, walking on the C tracks, to speak metaphorically.
C++ actually gets it right(!) ```d #include <cstdint> #include <cstddef> #include <cstdio> int main(void) { uint8_t a = {}; const uint8_t b = {}; const uint16_t c = {}; auto x = 1 ? a : b; auto y = 1 ? a : c; } ```
I see, C++ promotes, but differently. The D spec should be revised. Probably the rules mentioned earlier, those from Andrei, should be ordered differently.
May 02
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't like it.
```c
#include <stdio.h>

void main()
{
    char u;
    const char v;
    printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
}
```

This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.
May 03
next sibling parent reply matheus <matheus gmail.com> writes:
On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.
``` #include <stdio.h> void main() { char u; const char v; printf("%ld %ld\n", sizeof(u), sizeof(1?u:v)); } ``` This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.
Like this D code:

    import std.stdio;

    void main(){
         char u;
         const char v;
         writefln("%d %d", (u.sizeof), (1?u:v).sizeof);
    }

Prints: "1 1".

=]

Matheus.
May 03
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/3/2024 1:14 PM, matheus wrote:
 Like this D code:
 
 import std.stdio;
 
 void main(){
      char u;
      const char v;
      writefln("%d %d", (u.sizeof), (1?u:v).sizeof);
 }
 
 Prints: "1 1".
You're right. But change `char` to `ubyte` and you'll get "1 4". This is because the table in impcnvtab.d does not deal with char types: https://github.com/dlang/dmd/blob/master/compiler/src/dmd/impcnvtab.d#L165 So, char types do not undergo integral promotion when trying to bring two expressions to a common type. Byte, ubyte, short, and ushort do. This can be debated as to which it should be, but there's a lot of water under that bridge.
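The difference between the character types and the small integer types can be checked side by side. The sketch below relies on what was reported in this thread (size 1 for `char`, size 4 for `ubyte` with the DMD of the time); only the `char` result is asserted, and the `ubyte` result is printed so that it reflects whatever the compiler in use actually does.

```d
unittest
{
    char u;
    const(char) v;
    // Reported in the thread as 1: char operands are not promoted here.
    static assert(typeof(true ? u : v).sizeof == 1);

    ubyte a;
    const(ubyte) b;
    // Reported in the thread as 4; printed at compile time, not asserted.
    pragma(msg, typeof(true ? a : b).sizeof);
}
```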
May 04
prev sibling next sibling parent reply Daniel N <no public.email> writes:
On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.
``` #include <stdio.h> void main() { char u; const char v; printf("%ld %ld\n", sizeof(u), sizeof(1?u:v)); } ``` This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.
However, as you can see in my post https://forum.dlang.org/post/fhmzjloxfzzwgmohkxnc forum.dlang.org C++ actually went the other direction on this. In the rare cases where they differ, shouldn't D be free to choose the best option from C or C++? Considering that C++ has had auto ever since C++11 while it was only recently added to C in C23, there's a lot more C++ code that depends on the type than C code.
May 04
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/4/2024 8:34 AM, Daniel N wrote:
 However as you can see in my post
 https://forum.dlang.org/post/fhmzjloxfzzwgmohkxnc forum.dlang.org
 C++ actually went the other direction on this, in the rare cases that they 
 differ, shouldn't D be free to chose the best option from C or C++?
True, but D is more compatible with C than with C++.
May 04
prev sibling next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.
``` #include <stdio.h> void main() { char u; const char v; printf("%ld %ld\n", sizeof(u), sizeof(1?u:v)); } ``` This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.
Cool, now let's try `char` against `char`:

```c
printf("%ld\n", sizeof(1 ? u : u));
```

This prints 4 still. Wait, what does D do?

```d
ubyte u;
writeln((1 ? u : u).sizeof);
```

prints "1". This must be a mistake, right? How does anyone ever port C code with this glaring change in functionality?!

I'm being a little bit overdramatic here, but you get the drift. C does not have auto, or function overloading, so the inferred type of a ternary expression in terms of *integer promotion* is meaningless. There isn't even a *typeof* expression in C, so you have to resort to `sizeof` expressions. As far as I can tell, this is the only place where you can see the difference. Since C allows implicit truncation, nobody will ever notice that this type is `int`.

As pointed out by Daniel N, C++ gets this right. You know who would be surprised by compiling C code if it did something *so different* it affected outcomes? C++ developers. They use C libraries as-is, not even porting, whenever they want. If those things started misbehaving, they would notice. But they don't care. Why? Because it doesn't affect anything in C.

Please, just change this.

-Steve
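Function overloading is where the inferred type becomes observable in D. The following sketch uses a hypothetical pair of overloads named `which`; the first two calls are asserted, while the mixed `const` call is only printed, since its result depends on the behaviour under discussion.

```d
import std.stdio : writeln;

// Hypothetical overload set that reports which parameter type was matched.
string which(ubyte) { return "ubyte"; }
string which(int)   { return "int"; }

unittest
{
    ubyte a = 1;
    const(ubyte) b = 2;

    // Same operand types: the ubyte overload is an exact match.
    assert(which(true ? a : a) == "ubyte");

    // Arithmetic promotes to int, so the int overload is chosen.
    assert(which(a + a) == "int");

    // Mixed const/mutable operands: per the thread the expression is typed
    // int, which would select the int overload. Printed, not asserted.
    writeln(which(true ? a : b));
}
```

A C program has no equivalent way to observe the difference, which is the point being made above.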
May 04
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, May 3, 2024 1:40:36 PM MDT Walter Bright via Digitalmars-d wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't like it.
``` #include <stdio.h> void main() { char u; const char v; printf("%ld %ld\n", sizeof(u), sizeof(1?u:v)); } ``` This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.
Sure, but are there actually any unpleasant surprises if C code using the ternary operator is converted to D? If you have

    uint8_t left;
    const uint8_t right;
    uint8_t result = cond ? left : right;

whether integer promotion occurs or not is irrelevant, because no arithmetic is happening, and C will happily downcast an int to a uint8_t. So, whether the ternary operator results in uint8_t or int doesn't affect the result at all. On the flip side, if you assign the result to an int,

    uint8_t left;
    const uint8_t right;
    int result = cond ? left : right;

whether integer promotion occurred is again irrelevant, because no arithmetic is occurring, and both results will fit in an int whether the ternary operator converted the result to int or left it as uint8_t. And since char in C is the same as either uint8_t or int8_t depending on the platform, using char in those two examples would have the same result.

So, as far as I can tell, the C code does not care whether integer promotion occurs with the ternary operator or not. It would care if it had type inference via something like D's auto, but it doesn't. The value of the result is the same whether integer promotion occurs or not, and that's all that C actually cares about unless you're doing something like sizeof on the expression, which is not exactly a typical thing to be doing. And that's probably why C++ was perfectly fine with changing the behavior. It doesn't actually break code in practice.

This issue only comes up in D, because we have auto, and we make downcasting with integer types illegal. So, promoting byte to int when no arithmetic is actually occurring just because the other branch of the ternary used const is highly surprising and increases the chances of code breakage due to downcasting not being legal. It's also pretty terrible for generic code, because with almost all types if you have

    T left;
    const T right;
    auto result = cond ? left : right;

the result will be const T, whereas with smaller integer types, it would be int, which almost no one will expect. And unless VRP kicks in

    T left;
    const T right;
    const T result = cond ? left : right;

will fail to compile for small integer types while it compiles for every other type in the language.

So, from what I can tell, we're not going to break C code whether the ternary operator does integer promotion or not, but it _does_ cause problems for D code that integer promotion occurs even though there is no arithmetic expression involved. And on top of that, by changing D to not do integer promotion with the ternary operator, we would be more compatible with C++ where the difference _does_ matter, because they have their own auto, and they have function overloading.

So, I don't see how sticking to the C behavior in this case helps at all, and it clearly hurts us for both straight up D code and for porting C++ code to D.

- Jonathan M Davis
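The generic-code concern can be explored with a small helper. This is a sketch under stated assumptions: `TernaryType` and `Point` are hypothetical names, `lvalueOf` comes from `std.traits`, and the printed types are whatever the compiler in use infers. Only the qualifier-independent part is asserted, and only for types that this thread does not dispute.

```d
import std.traits : lvalueOf, Unqual;

// Hypothetical helper: the type of `cond ? left : right`
// given `T left; const T right;`.
alias TernaryType(T) = typeof(true ? lvalueOf!T : lvalueOf!(const(T)));

struct Point { int x, y; }

// Print what the compiler infers for a few instantiations.
pragma(msg, "int    -> " ~ TernaryType!int.stringof);
pragma(msg, "double -> " ~ TernaryType!double.stringof);
pragma(msg, "Point  -> " ~ TernaryType!Point.stringof);
pragma(msg, "ubyte  -> " ~ TernaryType!ubyte.stringof); // the thread reports int
pragma(msg, "short  -> " ~ TernaryType!short.stringof);

// Qualifiers aside, the underlying type survives for these types.
static assert(is(Unqual!(TernaryType!int) == int));
static assert(is(Unqual!(TernaryType!double) == double));
static assert(is(Unqual!(TernaryType!Point) == Point));
```

If the small integer instantiations print `int` while everything else prints a qualified or unqualified `T`, that is the special case generic code has to work around.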
May 04
next sibling parent Dom DiSc <dominikus scherkl.de> writes:
On Sunday, 5 May 2024 at 04:13:16 UTC, Jonathan M Davis wrote:
 And on top of that, by changing D to not do integer promotion 
 with the ternary operator, we would be more compatible with C++ 
 where the difference _does_ matter, because they have their own 
 auto, and they have function overloading.
Integer promotion wouldn't hurt us if it promoted to something sensible. Promoting (ubyte, uint) to int is not sensible: it introduces a sign where there was none before and thereby destroys large uint values. I see no benefit in staying compatible with such strange behaviour of C.

The common type of (T, const T) should be const T for any T and not something arbitrary. (And int is arbitrary, because why not short or long? And don't invoke C compatibility here, because C *does* use short on 16bit machines and long for some 64bit machines!)
May 05
prev sibling parent Don Allen <donaldcallen gmail.com> writes:
On Sunday, 5 May 2024 at 04:13:16 UTC, Jonathan M Davis wrote:
 On Friday, May 3, 2024 1:40:36 PM MDT Walter Bright via 
 Digitalmars-d wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.
``` #include <stdio.h> void main() { char u; const char v; printf("%ld %ld\n", sizeof(u), sizeof(1?u:v)); } ``` This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.
 [...]
I think Jonathan, Steve and others have made a strong case for fixing this language anomaly. The terms "unpleasant surprise" and "no one would expect" are not what you want to read in descriptions of your language. We all expect to have a mental model of what the code we are writing does. Discovering the hard way that, despite reasonable diligence, our model was wrong tends to drive people away. This is the Principle of Least Surprise.

Have any of you written any PL-1? No? Here's an example of why (this may not be syntactically correct; I haven't written any PL-1 in over 50 years):

````
declare foo bit(1)
foo = 1
````

What is the value of foo after this executes? The answer is 0. Why? 1 is a decimal constant with precision (p) of 1. To do the assignment, it is first converted to a binary constant of length p+3: 0001. This is converted to the bit string '0001' of length 4, which is then assigned to foo, a bit string of length 1. Assignment of longer to shorter bit strings is done high order bit first.

A lot of other issues contributed to PL-1's ultimate lack of success, but things like this certainly did not help.
May 05