digitalmars.D - Common type of ubyte and const ubyte is int


Steven Schveighoffer <schveiguy gmail.com> writes:

```d
ubyte a;
const(ubyte) b;

auto x = true ? a : b;

pragma(msg, typeof(x)); // int
```

Why? This isn't a VRP problem, as both are `ubyte`.
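For readers unfamiliar with the term, VRP (value range propagation) is the analysis that lets an `int` expression initialize a narrower type when its possible values provably fit. The following is only a sketch of what VRP accepts and rejects; the names `maskedFits` and `plainFits` are made up for illustration.

```d
import std.stdio : writeln;

// An int masked to its low 8 bits provably fits a ubyte,
// so the implicit narrowing is accepted.
enum maskedFits = __traits(compiles, (int i) { ubyte c = i & 0xFF; });

// An arbitrary int does not provably fit, so the same
// initialization is rejected without an explicit cast.
enum plainFits = __traits(compiles, (int i) { ubyte c = i; });

void main()
{
    writeln("masked int -> ubyte compiles: ", maskedFits);
    writeln("plain int  -> ubyte compiles: ", plainFits);
}
```

The ternary case above is different in kind: both operands already are `ubyte`, so no range reasoning should be needed to keep the result narrow.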

Related issue

https://issues.dlang.org/show_bug.cgi?id=19817

-Steve

Apr 30

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Tue, Apr 30, 2024 at 06:19:28PM +0000, Steven Schveighoffer via
Digitalmars-d wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817

[...]

Betcha it's something to do with integer promotion rules.


T

-- 
Just because you survived after you did it, doesn't mean it wasn't stupid!

Apr 30

Daniel N <no public.email> writes:

On Tuesday, 30 April 2024 at 19:23:58 UTC, H. S. Teoh wrote:
 On Tue, Apr 30, 2024 at 06:19:28PM +0000, Steven Schveighoffer 
 via Digitalmars-d wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817

 [...]

 Betcha it's something to do with integer promotion rules.


 T

WAT! Thanks for posting this before it bit me!

Apr 30

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/30/24 20:19, Steven Schveighoffer wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817
 
 -Steve

FWIW, my frontend says `ubyte` here and there is a comment somewhere 
that claims it implements some rules from TDPL page 60.

Apr 30

Steven Schveighoffer <schveiguy gmail.com> writes:

On Tuesday, 30 April 2024 at 21:18:36 UTC, Timon Gehr wrote:
 On 4/30/24 20:19, Steven Schveighoffer wrote:
 ```d
 ubyte a;
 const(ubyte) b;
 
 auto x = true ? a : b;
 
 pragma(msg, typeof(x)); // int
 ```
 
 Why? This isn't a VRP problem, as both are `ubyte`.
 
 Related issue
 
 https://issues.dlang.org/show_bug.cgi?id=19817
 
 -Steve

 FWIW, my frontend says `ubyte` here and there is a comment 
 somewhere that claims it implements some rules from TDPL page 
 60.

Thanks to phone OCR, this is what is on that page:

```
1. If a and b have the same type, T is that type;
2. else if a and b are integrals, first promote anything smaller 
than 32-bit to int, then choose T as the larger type, with a 
preference for unsigned type if tied in size;
3. else if one is an integral and the other is a floating-point 
type, T is the floating-point type;
4. else if both have floating-point types, T is the larger of the 
two;
5. else if the types have a common supertype (e.g., base class), 
T is that supertype (we will return to this topic in Chapter 6);
6. else try implicitly converting a to b's type and b to a's 
type; if exactly one of these succeeds, T is the type of the 
successful conversion target;
7. else the expression is in error.
```

It seems rule 2 would apply instead of rule 6? but I don't like 
it.
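The first four rules can be checked against a compiler with a small probe. This is a sketch, not part of TDPL: the `Ternary` alias is a hypothetical helper, and the `ubyte` versus `const(ubyte)` case is only printed rather than asserted, since its result is the point under discussion.

```d
import std.stdio : writeln;

// Type the compiler assigns to `cond ? a : b` for operands of types A and B.
alias Ternary(A, B) = typeof(true ? A.init : B.init);

// Rule 1: identical types are kept as they are.
static assert(is(Ternary!(ubyte, ubyte) == ubyte));
// Rule 2: differing integrals narrower than 32 bits are promoted to int first...
static assert(is(Ternary!(ubyte, ushort) == int));
// ...and a size tie between signed and unsigned goes to the unsigned type.
static assert(is(Ternary!(int, uint) == uint));
// Rule 3: an integral with a floating-point type yields the floating-point type.
static assert(is(Ternary!(int, float) == float));
// Rule 4: two floating-point types yield the larger one.
static assert(is(Ternary!(float, double) == double));

void main()
{
    // The case from this thread, reported above as int.
    writeln(Ternary!(ubyte, const(ubyte)).stringof);
}
```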

-Steve

May 01

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/1/24 16:42, Steven Schveighoffer wrote:
 On Tuesday, 30 April 2024 at 21:18:36 UTC, Timon Gehr wrote:
 On 4/30/24 20:19, Steven Schveighoffer wrote:
 ```d
 ubyte a;
 const(ubyte) b;

 auto x = true ? a : b;

 pragma(msg, typeof(x)); // int
 ```

 Why? This isn't a VRP problem, as both are `ubyte`.

 Related issue

 https://issues.dlang.org/show_bug.cgi?id=19817

 -Steve

 FWIW, my frontend says `ubyte` here and there is a comment somewhere 
 that claims it implements some rules from TDPL page 60.

 
 Thanks to phone OCR, this is what is on that page:
 
 ```
 1. If a and b have the same type, T is that type;
 2. else if a and b are integrals, first promote anything smaller than 
 32-bit to int, then choose T as the larger type, with a preference for 
 unsigned type if tied in size;
 3. else if one is an integral and the other is a floating-point type, T 
 is the floating-point type;
 4. else if both have floating-point types, T is the larger of the two;
 5. else if the types have a common supertype (e.g., base class), T is 
 that supertype (we will return to this topic in Chapter 6);
 6. else try implicitly converting a to b's type and b to a's type; if 
 exactly one of these succeeds, T is the type of the successful 
 conversion target;
 7. else the expression is in error.
 ```
 ...

Thanks! Did not have the book at hand myself.

 It seems rule 2 would apply instead of rule 6?

Indeed, I think technically DMD conforms to those rules, but my frontend 
does not, on this example.

 but I don't like it.
 
 -Steve

Same. (Not biased.)

May 01

Dom DiSc <dominikus scherkl.de> writes:

On Wednesday, 1 May 2024 at 14:42:25 UTC, Steven Schveighoffer 
wrote:
 ```
 1. If a and b have the same type, T is that type;
 2. else if a and b are integrals, first promote anything 
 smaller than 32-bit to int, then choose T as the larger type, 
 with a preference for unsigned type if tied in size;
 3. else if one is an integral and the other is a floating-point 
 type, T is the floating-point type;
 4. else if both have floating-point types, T is the larger of 
 the two;
 5. else if the types have a common supertype (e.g., base 
 class), T is that supertype (we will return to this topic in 
 Chapter 6);
 6. else try implicitly converting a to b's type and b to a's 
 type; if exactly one of these succeeds, T is the type of the 
 successful conversion target;
 7. else the expression is in error.
 ```

 It seems rule 2 would apply instead of rule 6? but I don't like 
 it.

 -Steve

Yes, those rules are often cumbersome. The promotions should look 
more like:
```
     ubyte   >  ushort  >   uint   >   ulong
       v          v           v          v
byte > >  short > >   int   > >  long  >
                  v           v          v
                float   >  double  >   real
```
To find the common type, go to the crossing of the rightmost and 
the downmost of the types
and if there is no type at this point, go one further right or if 
that is not possible go one further down. This works for all 
numeric basic types.
E.g. common(byte, ubyte) = short,
common(float, uint) = double,
common(long, ulong) = real

If real is 80bit, it can represent all values of long and ulong, 
because that has 64bit mantissa plus a sign bit (and an exponent, 
which is not necessary for integer values).
If real is smaller, at least we tried our best but the low bits 
of the value may get lost.

The result of all operators should be the common type. Especially 
for unary operators the result should always be the same type as 
the operand.
Calculation can be done in the processor word-size, but without 
explicit cast the result should be truncated to the common type.
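One way to read the diagram is "pick the narrowest type that can represent every value of both operands". The sketch below implements that reading, not a literal walk over the diagram; `valueBits`, `fits`, `FirstFit` and `Common` are hypothetical names, and qualifiers such as const are not handled.

```d
import std.meta : AliasSeq;
import std.stdio : writeln;
import std.traits : isFloatingPoint, isIntegral, isSigned;

// Value bits of an integral type (the sign bit is not counted).
enum int valueBits(T) = cast(int)(T.sizeof * 8) - (isSigned!T ? 1 : 0);

// True when every value of From is exactly representable in To.
template fits(From, To)
{
    static if (isIntegral!From && isIntegral!To)
        enum fits = (isSigned!To || !isSigned!From) && valueBits!From <= valueBits!To;
    else static if (isIntegral!From && isFloatingPoint!To)
        enum fits = valueBits!From <= To.mant_dig;
    else static if (isFloatingPoint!From && isFloatingPoint!To)
        enum fits = From.mant_dig <= To.mant_dig;
    else
        enum fits = false; // a floating-point type never fits an integral one
}

// Candidate result types, ordered from narrowest to widest.
alias Candidates = AliasSeq!(ubyte, byte, ushort, short, uint, int,
                             ulong, long, float, double, real);

// First candidate that holds both A and B; the last one (real) is the fallback.
template FirstFit(A, B, Cs...)
{
    static if (Cs.length == 1 || (fits!(A, Cs[0]) && fits!(B, Cs[0])))
        alias FirstFit = Cs[0];
    else
        alias FirstFit = FirstFit!(A, B, Cs[1 .. $]);
}

alias Common(A, B) = FirstFit!(A, B, Candidates);

void main()
{
    writeln(Common!(byte, ubyte).stringof);
    writeln(Common!(float, uint).stringof);
    writeln(Common!(long, ulong).stringof);
}
```

Falling back to `real` even when it cannot hold every value mirrors the "at least we tried our best" remark above.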

May 02

Steven Schveighoffer <schveiguy gmail.com> writes:

On Thursday, 2 May 2024 at 12:34:03 UTC, Dom DiSc wrote:
 Yes, those rules are often cumbersome. The promotions should 
 look more like:
 ```
     ubyte   >  ushort  >   uint   >   ulong
       v          v           v          v
 byte > >  short > >   int   > >  long  >
                  v           v          v
                float   >  double  >   real
 ```
 To find the common type, go to the crossing of the rightmost 
 and the downmost of the types
 and if there is no type at this point, go one further right or 
 if that is not possible go one further down. This works for all 
 numeric basic types.
 E.g. common(byte, ubyte) = short,
 common(float, uint) = double,
 common(long, ulong) = real

But let's take a step back. As far as integrals, they are *the 
same type*, it's just that one is const and one is not.

There is no rule that says if the types are the same type except 
for type modifiers, what should happen. I think there should be.

The result is very unexpected.
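Such a rule could look roughly like the following library-level sketch. `QualifiedCommon` is a hypothetical name, choosing `const` as the merged qualifier is an assumption, and `shared` is ignored for simplicity.

```d
import std.stdio : writeln;
import std.traits : Unqual;

template QualifiedCommon(A, B)
{
    static if (is(A == B))
        alias QualifiedCommon = A;                // identical types: nothing to merge
    else static if (is(Unqual!A == Unqual!B))
        alias QualifiedCommon = const(Unqual!A);  // same type, different qualifiers
    else
        alias QualifiedCommon = typeof(true ? A.init : B.init); // defer to the compiler
}

void main()
{
    writeln(QualifiedCommon!(ubyte, const(ubyte)).stringof);
    writeln(QualifiedCommon!(ubyte, ushort).stringof);
}
```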

-Steve

May 02

DrDread <DrDread cheese.com> writes:

On Thursday, 2 May 2024 at 16:16:23 UTC, Steven Schveighoffer 
wrote:
 On Thursday, 2 May 2024 at 12:34:03 UTC, Dom DiSc wrote:
 [...]

 But let's take a step back. As far as integrals, they are *the 
 same type*, it's just that one is const and one is not.

 There is no rule that says if the types are the same type 
 except for type modifiers, what should happen. I think there 
 should be.

 The result is very unexpected.

 -Steve

I'm for not having types promote at all. it's been an endless 
source of bugs for me. it messes with metaprogramming.

May 02

user1234 <user1234 12.de> writes:

On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 I'm for not having types promote at all. it's been an endless 
 source of bugs for me. it messes with metaprogramming.

PLs that don't promote have their own problems too. The most 
obvious is that overflow happens more easily. At least promotion 
mitigates that.

However, I'm quite sure that D's promotion rules were not designed 
with that in mind. It's more about C compatibility, walking on the 
C tracks, to speak metaphorically.

May 02

user1234 <user1234 12.de> writes:

On Thursday, 2 May 2024 at 17:25:55 UTC, user1234 wrote:
 On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 I'm for not having types promote at all. it's been an endless 
 source of bugs for me. it messes with metaprogramming.

 PLs that dont promote have their own problems too. The most 
 obvious is that overflowing is more easy. At least promotion 
 mitigates that.

 However I'm quite sure that D promotions rules were not seen as 
 such. It's more C compatibility, walking on the C tracks, to 
 speak metaphorically.

Just remembered: one argument that Walter once put forward is 
that arithmetic instructions for 32-bit registers are faster 
than those for, say, 16 or 8 bits.

May 02

Dom DiSc <dominikus scherkl.de> writes:

On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 Just remembered, one argument that was once exposed by Walter 
 is be that arithmetic instructions for 32 bits registers would 
 be faster than the ones let's say for 16 or 8.

Working on the processor word size is always the fastest. But 
that is totally independent of the result type. Calculation can 
always be done extended to word-size (or may need multiple words 
for large operands), but the result should have the common type.
To say 32bit is always the best is only true for 32bit 
architectures.
And having no promotion is not an option. If two operands are to 
be combined, we need some common type for the result, no matter 
how this result is produced.
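As a sketch of "calculate wide, return narrow": the hypothetical helper `narrowAdd` below lets the addition happen at the promoted width and then truncates back to the operand type with an explicit cast.

```d
import std.stdio : writeln;
import std.traits : isIntegral;

// Hypothetical helper: add at the promoted width, then truncate back to T.
T narrowAdd(T)(T a, T b) if (isIntegral!T)
{
    return cast(T)(a + b);
}

void main()
{
    ubyte a = 200, b = 100;
    auto wide = a + b;             // promoted to int
    auto narrow = narrowAdd(a, b); // truncated back to ubyte
    writeln(typeof(wide).stringof, " ", wide);
    writeln(typeof(narrow).stringof, " ", narrow);
}
```

This prints `int 300` and then `ubyte 44`, because 300 wraps modulo 256 when truncated.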

About type attributes - the operands may be mutable, const or 
immutable, but the result is a new value with attributes 
independent of the operand attributes. It should be assignable to 
mutable variables - to const or immutable objects only during 
initialization, to shared objects only if they are locked.

At least that is what I expect.

May 02

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 03/05/2024 6:52 PM, Dom DiSc wrote:
 On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 
     Just remembered, one argument that was once exposed by Walter is be
     that arithmetic instructions for 32 bits registers would be faster
     than the ones let's say for 16 or 8.
 
 Working on the processor word size is always the fastest. But that is 
 totally independent of the result type. Calculation can always be done 
 extended to word-size (or may need multiple words for large operands), 
 but the result should have the common type. To say 32bit is always the 
 best is only true for 32bit architectures. And having no promotion is 
 not an option. If two operands are to be combined, we need some common 
 type for the result, no matter how this result is produced.

Normally on x86, 32-bit and 64-bit operations are equal in speed, but 8-bit 
and 16-bit ones are not very far behind these days.

Because of C, it's pretty safe to assume that 32-bit registers will be 
heavily optimized on processors that are 32-bit or wider.

Consider the Ice Lake series, which last had releases in 2020.

DIV IDIV	r8	4	4
DIV IDIV	r16	4	4
DIV IDIV	r32	4	4
DIV IDIV	r64	4	4

Now compare that to Haswell architecture from 10 years ago:

DIV	r8	9	9
DIV	r16	11	11
DIV	r32	10	10
DIV	r64	36	36
IDIV	r8	9	9
IDIV	r16	10	10
IDIV	r32	9	9
IDIV	r64	59	59

If you don't know the exact target CPU, sticking with 32-bit is still a 
good recommendation for CPUs that are 32-bit or above.

On the other hand, if you know it's more recent (say, running Windows 11), 
just use whatever you need and don't worry about it.

May 03

Daniel N <no public.email> writes:

On Friday, 3 May 2024 at 07:05:21 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Due to C its pretty safe to assume that 32bit registers will be 
 pretty heavily optimized for 32bit or above processes.

 Consider the Ice Lake series, which last had releases in 2020.

 DIV IDIV	r8	4	4
 DIV IDIV	r16	4	4
 DIV IDIV	r32	4	4
 DIV IDIV	r64	4	4

 Now compare that to Haswell architecture from 10 years ago:

 DIV	r8	9	9
 DIV	r16	11	11
 DIV	r32	10	10
 DIV	r64	36	36
 IDIV	r8	9	9
 IDIV	r16	10	10
 IDIV	r32	9	9
 IDIV	r64	59	59

 If you don't know the exact target CPU sticking with 32bit is 
 still a good recommendation for CPU's that are 32bit or above.

 On the other hand if you know its more recent (say running 
 Windows 11), just use whatever you need and don't worry about 
 it.

That's an interesting data point; however, x86 is unusual in that 
it supports all register sizes. On other CPUs you have to 
manually mask the result.

May 03

Zoadian <no no.no> writes:

On Friday, 3 May 2024 at 06:52:00 UTC, Dom DiSc wrote:
 On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 Just remembered, one argument that was once exposed by Walter 
 is be that arithmetic instructions for 32 bits registers would 
 be faster than the ones let's say for 16 or 8.

 Working on the processor word size is always the fastest. But 
 that is totally independent of the result type. Calculation can 
 always be done extended to word-size (or may need multiple 
 words for large operands), but the result should have the 
 common type.
 To say 32bit is always the best is only true for 32bit 
 architectures.
 And having no promotion is not an option. If two operands are 
 to be combined, we need some common type for the result, no 
 matter how this result is produced.

 About type attributes - the operands may be mutable, const or 
 immutable, but the result is a new value with attributes 
 independent of the operand attributes. It should be assignable 
 to mutable variables - to const or immutable objects only 
 during initialization, to shared objects only if they are 
 locked.

 At least that is what I expect.

just force the user to cast when the types are of incompatible 
sizes instead of promoting them to some arbitrary type.
if the user uses smaller types, they generally do it for size 
reasons, and don't want to end up with a bigger type.
you normally shouldn't use types <= int32, so it should not be an 
issue in practice.
what is an issue however is the compiler picking the wrong 
templates because of some promotion rules.

May 03

Steven Schveighoffer <schveiguy gmail.com> writes:

On Friday, 3 May 2024 at 06:52:00 UTC, Dom DiSc wrote:
 On Thursday, 2 May 2024 at 17:30:12 UTC, user1234 wrote:
 Just remembered, one argument that was once exposed by Walter 
 is be that arithmetic instructions for 32 bits registers would 
 be faster than the ones let's say for 16 or 8.

 Working on the processor word size is always the fastest. But 
 that is totally independent of the result type. Calculation can 
 always be done extended to word-size (or may need multiple 
 words for large operands), but the result should have the 
 common type.
 To say 32bit is always the best is only true for 32bit 
 architectures.
 And having no promotion is not an option. If two operands are 
 to be combined, we need some common type for the result, no 
 matter how this result is produced.

This is all very interesting, but remember, there is no operation 
or calculation happening here. We have a construct that is 
picking an 8 bit unsigned value, or an 8 bit unsigned value, and 
it is resulting in a 32-bit signed value.

If this kind of thing requires integer promotion then the 
following should be true:

```d
ubyte x = 5;
auto y = x;
static assert(is(typeof(y) == int));
```

But that is not the case. I don't mean to be pedantic here, but 
this is the inconsistency I see in this part of the compiler.
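The contrast can be made explicit with a small probe. This is a sketch with hypothetical alias names `Summed` and `Picked`; it covers only operands of identical type, which is the uncontroversial part.

```d
import std.stdio : writeln;

alias Summed(T) = typeof(T.init + T.init);        // arithmetic on two T values
alias Picked(T) = typeof(true ? T.init : T.init); // ternary over two T values

static assert(is(Summed!ubyte == int));   // arithmetic promotes
static assert(is(Picked!ubyte == ubyte)); // selecting between equal types does not

void main()
{
    ubyte x = 5;
    auto y = x;
    static assert(is(typeof(y) == ubyte)); // a plain copy keeps the type too
    writeln(typeof(y).stringof, " ", Summed!ubyte.stringof, " ", Picked!ubyte.stringof);
}
```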

 About type attributes - the operands may be mutable, const or 
 immutable, but the result is a new value with attributes 
 independent of the operand attributes. It should be assignable 
 to mutable variables - to const or immutable objects only 
 during initialization, to shared objects only if they are 
 locked.

Common type modifier differences can be tested with `int`, and I 
believe the rules are sound there.

-Steve

May 03

Daniel N <no public.email> writes:

On Thursday, 2 May 2024 at 17:25:55 UTC, user1234 wrote:
 On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 I'm for not having types promote at all. it's been an endless 
 source of bugs for me. it messes with metaprogramming.

 PLs that dont promote have their own problems too. The most 
 obvious is that overflowing is more easy. At least promotion 
 mitigates that.

 However I'm quite sure that D promotions rules were not seen as 
 such. It's more C compatibility, walking on the C tracks, to 
 speak metaphorically.

C++ actually gets it right(!)

```cpp
#include <cstdint>
#include <cstddef>
#include <cstdio>

int main(void)
{
     uint8_t a = {};
     const uint8_t b = {};
     const uint16_t c = {};

     auto x = 1 ? a : b;
     auto y = 1 ? a : c;



}
```

May 02

user1234 <user1234 12.de> writes:

On Thursday, 2 May 2024 at 18:09:56 UTC, Daniel N wrote:
 On Thursday, 2 May 2024 at 17:25:55 UTC, user1234 wrote:
 On Thursday, 2 May 2024 at 16:25:15 UTC, DrDread wrote:
 [...]

 PLs that dont promote have their own problems too. The most 
 obvious is that overflowing is more easy. At least promotion 
 mitigates that.

 However I'm quite sure that D promotions rules were not seen 
 as such. It's more C compatibility, walking on the C tracks, 
 to speak metaphorically.

 C++ actually gets it right(!)

 ```cpp
 #include <cstdint>
 #include <cstddef>
 #include <cstdio>

 int main(void)
 {
     uint8_t a = {};
     const uint8_t b = {};
     const uint16_t c = {};

     auto x = 1 ? a : b;
     auto y = 1 ? a : c;



 }
 ```

I see, C++ promotes, but differently. The D spec should be revised. 
Probably the rules mentioned earlier, those from Andrei, should 
be ordered differently.

May 02

Walter Bright <newshound2 digitalmars.com> writes:

On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't like it.

```c
#include <stdio.h>

int main(void)
{
     char u = 0;
     const char v = 0;
     /* sizeof yields a size_t, which printf formats with %zu */
     printf("%zu %zu\n", sizeof(u), sizeof(1?u:v));
     return 0;
}
```

This prints "1 4". D follows the same integral promotion rules, and the reason 
is if one translates C code to D, one doesn't get an unpleasant hidden surprise.

May 03

matheus <matheus gmail.com> writes:

On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.

 ```
 #include <stdio.h>

 void main()
 {
     char u;
     const char v;
     printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
 }
 ```

 This prints "1 4". D follows the same integral promotion rules, 
 and the reason is if one translates C code to D, one doesn't 
 get an unpleasant hidden surprise.

Like this D code:

import std.stdio;

void main(){
     char u;
     const char v;
     writefln("%d %d", (u.sizeof), (1?u:v).sizeof);
}

Prints: "1 1".

=]

Matheus.

May 03

Walter Bright <newshound2 digitalmars.com> writes:

On 5/3/2024 1:14 PM, matheus wrote:
 Like this D code:
 
 import std.stdio;
 
 void main(){
      char u;
      const char v;
      writefln("%d %d", (u.sizeof), (1?u:v).sizeof);
 }
 
 Prints: "1 1".

You're right.

But change `char` to `ubyte` and you'll get "1 4".

This is because the table in impcnvtab.d does not deal with char types:

https://github.com/dlang/dmd/blob/master/compiler/src/dmd/impcnvtab.d#L165

So, char types do not undergo integral promotion when trying to bring two 
expressions to a common type. Byte, ubyte, short, and ushort do.

This can be debated as to which it should be, but there's a lot of water under 
that bridge.
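The split between char types and the small integer types can be observed with a probe like the one below. It is a sketch: `Ternary` is a hypothetical helper, and only the results stated in this thread or implied by the promotion rule are asserted, while the per-type sizes are merely printed.

```d
import std.meta : AliasSeq;
import std.stdio : writeln;

// Type the compiler assigns to `cond ? a : b` for operands of types A and B.
alias Ternary(A, B) = typeof(true ? A.init : B.init);

// char against const char stays one byte wide, as reported above.
static assert(Ternary!(char, const(char)).sizeof == 1);
// Differing small integer types are promoted to int.
static assert(is(Ternary!(byte, short) == int));
static assert(is(Ternary!(short, ushort) == int));

void main()
{
    // Size of the common type of T and const(T) for each small type.
    static foreach (T; AliasSeq!(char, byte, ubyte, short, ushort))
        writeln(T.stringof, " vs const: ", Ternary!(T, const(T)).sizeof, " byte(s)");
}
```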

May 04

Daniel N <no public.email> writes:

On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.

 ```
 #include <stdio.h>

 void main()
 {
     char u;
     const char v;
     printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
 }
 ```

 This prints "1 4". D follows the same integral promotion rules, 
 and the reason is if one translates C code to D, one doesn't 
 get an unpleasant hidden surprise.

However, as you can see in my post
https://forum.dlang.org/post/fhmzjloxfzzwgmohkxnc forum.dlang.org
C++ actually went the other direction on this. In the rare cases 
where they differ, shouldn't D be free to choose the best option 
from C or C++?

C++ has had auto ever since C++11, whereas C only gained it 
recently in C23, so there's a lot more C++ code that depends 
on the inferred type than C code.

May 04

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2024 8:34 AM, Daniel N wrote:
 However as you can see in my post
 https://forum.dlang.org/post/fhmzjloxfzzwgmohkxnc forum.dlang.org
 C++ actually went the other direction on this, in the rare cases that they 
 differ, shouldn't D be free to chose the best option from C or C++?

True, but D is more compatible with C than with C++.

May 04

Steven Schveighoffer <schveiguy gmail.com> writes:

On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.

 ```
 #include <stdio.h>

 void main()
 {
     char u;
     const char v;
     printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
 }
 ```

 This prints "1 4". D follows the same integral promotion rules, 
 and the reason is if one translates C code to D, one doesn't 
 get an unpleasant hidden surprise.

Cool, now let's try `char` against `char`:

```c
printf("%zu\n", sizeof(1 ? u : u));
```

This prints 4 still. Wait, what does D do?

```d
ubyte u;
writeln((1 ? u : u).sizeof);
```

prints "1". This must be a mistake, right? How does anyone ever 
port C code with this glaring change in functionality?!

I'm being a little bit overdramatic here, but you get the drift.

C does not have auto, or function overloading, so the inferred 
type of a ternary expression in terms of *integer promotion* is 
meaningless. There isn't even a *typeof* expression in C, so you 
have to resort to `sizeof` expressions. As far as I can tell, 
this is the only place where you can see the difference. Since C 
allows implicit truncation, nobody will ever notice that this 
type is `int`.
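In D the inferred type is observable, for instance through overload resolution. A minimal sketch follows; the function names are hypothetical, and the ternary case is printed rather than asserted because its result is what this thread is about.

```d
import std.stdio : writeln;

string which(ubyte) { return "ubyte"; }
string which(int)   { return "int"; }

// No arithmetic: the argument keeps its declared type.
string viaCopy(ubyte a) { return which(a); }

// Arithmetic: both operands are promoted, so the int overload is chosen.
string viaSum(ubyte a, ubyte b) { return which(a + b); }

// Ternary over operands that differ only in const; reported above as int.
string viaTernary(bool cond, ubyte a, const ubyte b) { return which(cond ? a : b); }

void main()
{
    writeln(viaCopy(1));
    writeln(viaSum(1, 2));
    writeln(viaTernary(true, 1, 2));
}
```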

As pointed out by Daniel N, C++ gets this right. You know who 
would be surprised by compiling C code if it did something *so 
different* it affected outcomes? C++ developers. They use C 
libraries as-is, not even porting, whenever they want. If those 
things started misbehaving, they would notice.

But they don't care. Why? Because it doesn't affect anything in 
C. Please, just change this.

-Steve

May 04

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Friday, May 3, 2024 1:40:36 PM MDT Walter Bright via Digitalmars-d wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't like it.

 ```
 #include <stdio.h>

 void main()
 {
      char u;
      const char v;
      printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
 }
 ```

 This prints "1 4". D follows the same integral promotion rules, and the
 reason is if one translates C code to D, one doesn't get an unpleasant
 hidden surprise.

Sure, but are there actually any unpleasant surprises if C code using the
ternary operator is converted to D? If you have

    uint8_t left;
    const uint8_t right;

    uint8_t result = cond ? left : right;

whether integer promotion occurs or not is irrelevant, because no arithmetic
is happening, and C will happily downcast an int to a uint8_t. So, whether
the ternary operator results in uint8_t or int doesn't affect the result at
all.

On the flip side, if you assign the result to an int,

    uint8_t left;
    const uint8_t right;

    int result = cond ? left : right;

whether integer promotion occurred is again irrelevant, because no
arithmetic is occurring, and both results will fit in an int whether the
ternary operator converted the result to int or left it as uint8_t.

And since char in C is the same as either uint8_t or int8_t depending on the
platform, using char in those two examples would have the same result.

So, as far as I can tell, the C code does not care whether integer promotion
occurs with the ternary operator or not. It would care if it had type
inference via something like D's auto, but it doesn't. The value of the
result is the same whether integer promotion occurs or not, and that's all
that C actually cares about unless you're doing something like sizeof on the
expression, which is not exactly a typical thing to be doing. And that's
probably why C++ was perfectly fine with changing the behavior. It doesn't
actually break code in practice.

This issue only comes up in D, because we have auto, and we make downcasting
with integer types illegal. So, promoting byte to int when no arithmetic is
actually occurring just because the other branch of the ternary used const
is highly surprising and increases the chances of code breakage due to
downcasting not being legal.

It's also pretty terrible for generic code, because with almost all types if
you have

    T left;
    const T right;

    auto result = cond ? left : right;

the result will be const T, whereas with smaller integer types, it would be
int, which almost no one will expect. And unless VRP kicks in

    T left;
    const T right;

    const T result = cond ? left : right;

will fail to compile for small integer types while it compiles for every
other type in the language.
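Until the language changes, generic code for integral types can sidestep the issue by making both operands the same type before the ternary sees them. The helper `select` below is a hypothetical sketch restricted to integral `T`, not an existing library function.

```d
import std.stdio : writeln;
import std.traits : isIntegral;

// Hypothetical helper: both operands arrive as `const T`,
// so the ternary only ever sees a single type.
T select(T)(bool cond, const T left, const T right) if (isIntegral!T)
{
    return cast(T)(cond ? left : right);
}

void main()
{
    ubyte a = 1;
    const ubyte b = 2;
    auto r = select!ubyte(false, a, b);
    static assert(is(typeof(r) == ubyte)); // stays narrow
    writeln(r);
}
```

The explicit instantiation `select!ubyte` keeps the example independent of how template argument deduction treats qualifiers.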

So, from what I can tell, we're not going to break C code whether the
ternary operator does integer promotion or not, but it _does_ cause problems
for D code that integer promotion occurs even though there is no arithmetic
expression involved.

And on top of that, by changing D to not do integer promotion with the
ternary operator, we would be more compatible with C++ where the difference
_does_ matter, because they have their own auto, and they have function
overloading.

So, I don't see how sticking to the C behavior in this case helps at all,
and it clearly hurts us for both straight up D code and for porting C++ code
to D.

- Jonathan M Davis

May 04

Dom DiSc <dominikus scherkl.de> writes:

On Sunday, 5 May 2024 at 04:13:16 UTC, Jonathan M Davis wrote:
 And on top of that, by changing D to not do integer promotion 
 with the ternary operator, we would be more compatible with C++ 
 where the difference _does_ matter, because they have their own 
 auto, and they have function overloading.

Integer promotion wouldn't hurt us if it promoted to 
something sensible. Promoting (ubyte, uint) to int is not 
sensible: it introduces a sign where there was none before and 
thereby destroys large uint values. I see no benefit in staying 
compatible with such strange behaviour of C.

The common type of (T, const T) should be const T for any T and 
not something arbitrary. (And int is arbitrary, because why not 
short or long? - and don't call C compatibility here, because C 
*does* use short on 16bit machines and long for some 64bit 
machines!)

May 05

Don Allen <donaldcallen gmail.com> writes:

On Sunday, 5 May 2024 at 04:13:16 UTC, Jonathan M Davis wrote:
 On Friday, May 3, 2024 1:40:36 PM MDT Walter Bright via 
 Digitalmars-d wrote:
 On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
 It seems rule 2 would apply instead of rule 6? but I don't 
 like it.

 ```
 #include <stdio.h>

 void main()
 {
      char u;
      const char v;
      printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
 }
 ```

 This prints "1 4". D follows the same integral promotion 
 rules, and the reason is if one translates C code to D, one 
 doesn't get an unpleasant hidden surprise.

 Sure, but are there actually any unpleasant surprises if C code 
 using the ternary operator is converted to D? If you have

     uint8_t left;
     const uint8_t right;

     uint8_t result = cond ? left : right;

 whether integer promotion occurs or not is irrelevant, because 
 no arithmetic is happening, and C will happily downcast an int 
 to a uint8_t. So, whether the ternary operator results in 
 uint8_t or int doesn't affect the result at all.

 On the flip side, if you assign the result to an int,

     uint8_t left;
     const uint8_t right;

     int result = cond ? left : right;

 whether integer promotion occurred is again irrelevant, because 
 no arithmetic is occurring, and both results will fit in an int 
 whether the ternary operator converted the result to int or 
 left it as uint8_t.

 And since char in C is the same as either uint8_t or int8_t 
 depending on the platform, using char in those two examples 
 would have the same result.

 So, as far as I can tell, the C code does not care whether 
 integer promotion occurs with the ternary operator or not. It 
 would care if it had type inference via something like D's 
 auto, but it doesn't. The value of the result is the same 
 whether integer promotion occurs or not, and that's all that C 
 actually cares about unless you're doing something like sizeof 
 on the expression, which is not exactly a typical thing to be 
 doing. And that's probably why C++ was perfectly fine with 
 changing the behavior. It doesn't actually break code in 
 practice.

 This issue only comes up in D, because we have auto, and we 
 make downcasting with integer types illegal. So, promoting byte 
 to int when no arithmetic is actually occurring just because 
 the other branch of the ternary used const is highly surprising 
 and increases the chances of code breakage due to downcasting 
 not being legal.

 It's also pretty terrible for generic code, because with almost 
 all types if you have

     T left;
     const T right;

     auto result = cond ? left : right;

 the result will be const T, whereas with smaller integer types, 
 it would be int, which almost no one will expect. And unless 
 VRP kicks in

     T left;
     const T right;

     const T result = cond ? left : right;

 will fail to compile for small integer types while it compiles 
 for every other type in the language.

 So, from what I can tell, we're not going to break C code 
 whether the ternary operator does integer promotion or not, but 
 it _does_ cause problems for D code that integer promotion 
 occurs even though there is no arithmetic expression involved.

 And on top of that, by changing D to not do integer promotion 
 with the ternary operator, we would be more compatible with C++ 
 where the difference _does_ matter, because they have their own 
 auto, and they have function overloading.

 So, I don't see how sticking to the C behavior in this case 
 helps at all, and it clearly hurts us for both straight up D 
 code and for porting C++ code to D.

 - Jonathan M Davis

I think Jonathan, Steve and others have made a strong case for 
fixing this language anomaly. The terms "unpleasant surprise" and 
"no one would expect" are not what you want to read in 
descriptions of your language. We all expect to have a mental 
model of what the code we are writing does. Discovering the hard 
way that, despite reasonable diligence, our model was wrong tends 
to drive people away. This is the Principle of Least Surprise.

Have any of you written any PL-1? No? Here's an example of why 
(this may not be syntactically correct; I haven't written any 
PL-1 in over 50 years):
````
declare foo bit(1)

foo = 1
````
What is the value of foo after this executes? The answer is 0.

Why? 1 is a decimal constant with precision (p) of 1. To do the 
assignment, it is first converted to a binary constant of length 
p+3: 0001. This is converted to the bit string '0001' of length 
4, which is then assigned to foo, a bit string of length 1. 
Assignment of longer to shorter bit strings is done high order 
bit first.

A lot of other issues contributed to PL-1's ultimate lack of 
success, but things like this certainly did not help.

May 05
