
digitalmars.dip.ideas - Deprecate implicit conversion between signed and unsigned integers

Paul Backus <snarwin gmail.com> writes:
D inherited these implicit conversions from C and C++, where they 
are widely regarded as a source of bugs.

* In his 2018 paper, ["Subscripts and sizes should be 
signed"][1], Bjarne Stroustrup gives several examples of bugs 
caused by the use of unsigned sizes and indices in C++. About 
half of them are caused by wrapping subtraction; the other half 
are caused by implicit conversion from signed to unsigned.

* The C++20 standard includes [safe integer comparison 
functions][2] specifically to avoid these implicit conversions 
when comparing integers of different signedness.

* Both [GCC][3] and [Clang][4] provide a `-Wsign-conversion` flag 
to warn about these implicit conversions.

In D, they cause the additional problem of breaking value-range 
propagation (VRP):

     enum byte a = -1;
     enum uint b = a; // Ok, but...
     enum byte c = b; // Error - original value range has been lost

D would be a simpler, easier-to-use language if these implicit 
conversions were removed. The first step to doing that is to 
deprecate them.

While this is a breaking change, migration of old code would be 
very simple: simply insert an explicit `cast` to silence the 
error and restore the original behavior. In many cases, migration 
could be performed automatically with a tool that uses the DMD 
frontend as a library.
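
As a sketch, the migration for a flagged conversion might look like this (hypothetical example, not from the proposal itself):

```d
uint u = uint.max;
int a = u;           // today: accepted; under this proposal: deprecated
int b = cast(int) u; // migrated: explicit cast, same behavior as before
```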

I believe this change would be received positively by existing D 
programmers, since D's willingness to discard C and C++'s 
mistakes is one of the things that draws programmers to D in the 
first place.

[1]: 
https://open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf
[2]: https://en.cppreference.com/w/cpp/utility/intcmp
[3]: 
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wsign-conversion
[4]: 
https://clang.llvm.org/docs/DiagnosticsReference.html#wsign-conversion
May 12
monkyyy <crazymonkyyy gmail.com> writes:
On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
 I believe this change would be received positively by existing 
 D programmers, since D's willingness to discard C and C++'s 
 mistakes is one of the things that draws programmers to D in 
 the first place.
D often takes the worst of both worlds: bad defaults and verbose handling.

`size_t(-1) > 0` being true is a problem caused by indexes being unsigned - optimizing for a theoretical computer with more than ~9223 petabytes of RAM (2^63 bytes) is the wrong tradeoff.

No; types would need better defaults before even considering adding verbosity. I'd rather see breaking changes.
May 12
Nick Treleaven <nick geany.org> writes:
On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
 In D, they cause the additional problem of breaking value-range 
 propagation (VRP):

     enum byte a = -1;
     enum uint b = a; // Ok, but...
     enum byte c = b; // Error - original value range has been lost

 D would be a simpler, easier-to-use language if these implicit 
 conversions were removed. The first step to doing that is to 
 deprecate them.
Signed to unsigned should be deprecated (except where VRP can tell the source was not negative).

Unsigned to signed can preserve the value range when the signed type is bigger than the unsigned type, e.g.:

     extern ubyte x;
     short y = x; // OK, short.max >= ubyte.max
     byte z = x;  // Deprecate, byte.max < ubyte.max

These deprecations should be for the next edition of D.
 While this is a breaking change, migration of old code would be 
 very simple: simply insert an explicit `cast` to silence the 
 error and restore the original behavior.
`cast` can be bug-prone if the original type gets changed. It would be better to have druntime template functions `signed` and `unsigned` to do the casts with IFTI to avoid changing the size of the type.
 In many cases, migration could be performed automatically with 
 a tool that uses the DMD frontend as a library.
Can you give some examples?
 I believe this change would be received positively by existing 
 D programmers, since D's willingness to discard C and C++'s 
 mistakes is one of the things that draws programmers to D in 
 the first place.
Yes, implicit conversion to a type with an incompatible value range is too bug-prone; D should prevent that. It is particularly galling that decent C compilers have had warnings for this for such a long time.

What about comparisons between incompatible signed and unsigned, deprecate too?
May 12
Nick Treleaven <nick geany.org> writes:
On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
 `cast` can be bug-prone if the original type gets changed. It 
 would be better to have druntime template functions `signed` 
 and `unsigned` to do the casts with IFTI to avoid changing the 
 size of the type.
I forgot, those already exist, at least in Phobos: https://dlang.org/phobos/std_conv.html#.unsigned
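A sketch of how they avoid the size-change hazard (`unsigned` yields the same-sized unsigned type, unlike a hand-written cast that might pick the wrong width):

```d
import std.conv : unsigned;

void main()
{
    int i = -1;
    auto u = unsigned(i); // uint, not ulong: the size is preserved
    static assert(is(typeof(u) == uint));
    assert(u == uint.max);
}
```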
May 12
Dom DiSc <dominikus scherkl.de> writes:
On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
 What about comparisons between incompatible signed and 
 unsigned, deprecate too?
We have a working solution that always returns the correct result (see https://issues.dlang.org/show_bug.cgi?id=259). I never understood why anyone would rely on a wrong comparison result, so this should not be considered a breaking change.
May 13
Dom DiSc <dominikus scherkl.de> writes:
On Tuesday, 14 May 2024 at 06:59:16 UTC, Dom DiSc wrote:
 We have a working solution that always returns the correct 
 result (see https://issues.dlang.org/show_bug.cgi?id=259).
And by the way: this solution doesn't involve integer promotion at all, and it also works for comparing long with ulong.

For beginners this is by far the worst bug in D, and it's been there for 18 (in words: eighteen) years - it feels like it's been lingering there longer than D itself. This is so distracting.

It's a handful of lines of code and costs nothing (except if you indeed compare differently signed types, and even then it's still very cheap):

```d
import std.traits : CommonType, Unqual, isIntegral, isSigned, isUnsigned;

int opCmp(T, U)(const(T) a, const(U) b) pure @safe @nogc nothrow
if (isIntegral!T && isIntegral!U && !is(Unqual!T == Unqual!U))
{
    static if (isSigned!T && isUnsigned!U && T.sizeof <= U.sizeof)
        return (a < 0) ? -1 : cmpSame(cast(U) a, b);
    else static if (isUnsigned!T && isSigned!U && T.sizeof >= U.sizeof)
        return (b < 0) ? 1 : cmpSame(a, cast(T) b);
    else // use the common type as ever; it can represent both values:
        return cmpSame(cast(CommonType!(T, U)) a, cast(CommonType!(T, U)) b);
}

// same-type helper (recursing into opCmp would fail its own
// constraint, which requires two distinct types)
private int cmpSame(V)(const(V) a, const(V) b) pure @safe @nogc nothrow
{
    return (a < b) ? -1 : (a > b);
}
```
May 14
Paul Backus <snarwin gmail.com> writes:
On Tuesday, 14 May 2024 at 06:59:16 UTC, Dom DiSc wrote:
 On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
 What about comparisons between incompatible signed and 
 unsigned, deprecate too?
 We have a working solution that always returns the correct result (see https://issues.dlang.org/show_bug.cgi?id=259). I never understood why anyone would rely on a wrong comparison result, so this should not be considered a breaking change.
As I said in my reply to Nick, this proposal makes no distinction between conversions done in the context of a comparison and conversions done in any other context. I would rather not introduce a special case for comparisons, since special cases generally make the language more complex and harder to use. However, if you think this is a good idea, I encourage you to submit it as a separate proposal.
May 14
Paul Backus <snarwin gmail.com> writes:
On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
 Signed to unsigned should be deprecated (except where VRP can 
 tell the source was not negative).

 Unsigned to signed can preserve the value range when the signed 
 type is bigger than the unsigned type, e.g.:

     extern ubyte x;
     short y = x; // OK, short.max >= ubyte.max
     byte z = x;  // Deprecate, byte.max < ubyte.max
Agreed.
 `cast` can be bug-prone if the original type gets changed. It 
 would be better to have druntime template functions `signed` 
 and `unsigned` to do the casts with IFTI to avoid changing the 
 size of the type.
I assume by "changing the size of the type" you are referring specifically to *narrowing* conversions, not widening ones. If so, then yes, it's probably a good idea to use a helper template to avoid that.
 In many cases, migration could be performed automatically with 
 a tool that uses the DMD frontend as a library.
Can you give some examples?
It's easier to give examples of the cases where it *won't* work: templates, because there's no reliable way to apply the migration only to specific instantiations; and string mixins, because there's no reliable way to find the source code corresponding to a mixed-in expression (if it even exists--it could be generated by CTFE).
 What about comparisons between incompatible signed and 
 unsigned, deprecate too?
All binary operators, including comparison operators, use the same implicit conversions, so yes, comparisons would be covered by this proposal.
May 14
Dukc <ajieskola gmail.com> writes:
Paul Backus wrote on 12.5.2024 at 16.32:
 D would be a simpler, easier-to-use language if these implicit 
 conversions were removed. The first step to doing that is to deprecate 
 them.
Ditching all backwards-compatibility issues, it would be a good idea. But, this would cause *tremendous* amounts of breakage.

Before, I would have said it simply isn't worth it. But since we're going to have editions, maybe. I'm still somewhat sceptical though. Nothing will break without a warning and people can stay at older editions if they want, but it's going to add a lot of work for someone migrating 100_000 lines to a new edition. That amount of code will likely have hundreds or even thousands of deprecations to fix.

I tend to think that if we will write an official automatic tool to add the needed casts, it's probably worth it. Otherwise not.
May 13
Nick Treleaven <nick geany.org> writes:
On Monday, 13 May 2024 at 12:48:04 UTC, Dukc wrote:
 Paul Backus wrote on 12.5.2024 at 16.32:
 Ditching all backwards-compatibility issues, it would be a good 
 idea. But, this would cause *tremendous* amounts of breakage.

 Before, I would have said it simply isn't worth it. But since 
 we're going to have editions, maybe. I'm still somewhat 
 sceptical though. Nothing will break without a warning and 
 people can stay at older editions if they want, but it's going 
 to add a lot of work for someone migrating 100_000 lines to a 
 new edition. That amount of code will likely have hundreds or 
 even thousands of deprecations to fix.
I think even with editions we need to avoid making it hard to port code to a newer edition. So instead of a deprecation, we could make it a `-w` warning instead.
 I tend to think that if we will write an official automatic 
 tool to add the needed casts, it's probably worth it. Otherwise 
 not.
May 13
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, May 13, 2024 8:04:34 AM MDT Nick Treleaven via dip.ideas wrote:
 On Monday, 13 May 2024 at 12:48:04 UTC, Dukc wrote:
 Paul Backus wrote on 12.5.2024 at 16.32:
 Ditching all backwards-compatibility issues, it would be a good
 idea. But, this would cause *tremendous* amounts of breakage.

 Before, I would have said it simply isn't worth it. But since
 we're going to have editions, maybe. I'm still somewhat
 sceptical though. Nothing will break without a warning and
 people can stay at older editions if they want, but it's going
 to add a lot of work for someone migrating 100_000 lines to a
 new edition. That amount of code will likely have hundreds or
 even thousands of deprecations to fix.
 I think even with editions we need to avoid making it hard to port code to a newer edition. So instead of a deprecation, we could make it a `-w` warning instead.
Deprecations are the language's tool for making changes where code will later become illegal, and normally, the only result is that a message is printed. No code is broken until the language is actually changed to remove the deprecated feature.

In contrast, with how warnings are typically used in D, adding a warning is as good as adding an error, since it's extremely common to compile with -w, which makes all warnings errors, whereas arguably, -wi would be the better choice (but -w has been around longer and is shorter).

Warnings are also an utterly terrible idea in general and really should never have been added to the compiler. Even if you treat them the way most compilers do and have them actually be warnings and not errors, you inevitably end up in one of two situations:

1. You ignore many of them, because many of them are actually fine (since they typically warn of something that's potentially a problem and not something that's definitively a problem), and the ultimate result is that you get a wall of warnings, burying any useful messages where they'll never be seen, meaning that even the ones that should be fixed don't get fixed.

2. In order to avoid having a bunch of messages being printed and to avoid burying warnings that really should be fixed, you "fix" all warnings. In many cases, this requires changing code that is actually perfectly fine, but whether the code was fine or not, the fact that you're always making sure to remove any warnings that pop up makes it so that they might as well have been errors instead of warnings.

The end result is that warnings are utterly useless. Either they should have been errors, or they're better left to a linting tool. So, we really should not be adding to that problem by introducing more warnings.
And the fact that D's type introspection often checks whether a particular piece of code compiles in order to construct the checks for template constraints and static ifs and the like means that having flags which change whether a particular piece of code compiles is particularly bad for D - and adding more warnings can actually change what code compiles (or can even change which overload of a template is used). So, we really shouldn't be adding more warnings.

Deprecations don't have any of those problems unless you choose to compile with -de, which turns them into errors and which arguably shouldn't be a thing for the same reasons that it's problematic that -w turns warnings into errors: it actually affects conditional compilation and can do so in ways that are not easy to detect.

- Jonathan M Davis
May 14
Steven Schveighoffer <schveiguy gmail.com> writes:
I think yes, we should ban signed/unsigned conversions, but I 
also think implicit conversions when VRP has validated all the 
values are representable is fine (e.g. `ubyte` should implicitly 
convert to `short` or `int`). This should cut down on the false 
positives.

-Steve
May 13
An Pham <home home.com> writes:
On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
 D inherited these implicit conversions from C and C++, where 
 they are widely regarded as a source of bugs.
Just focusing on signed vs. unsigned is not good enough. Sometimes you need to specify a range of values. There is the module std.checkedint, which:

1. Should be extended into a runtime/system module that does not need an 'import'
2. Should gain range template parameters - Checked!(int, X, Y, ...), e.g. Checked!(int, -5, 200, ...), which only holds values from -5 to 200 inclusive
3. The language should be extended to allow implicit conversion when passing parameters - for void foo(Checked!(int, X, Y, ...) z), the call foo(10) is ok but foo(1000) should fail
May 14
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Sunday, May 12, 2024 7:32:36 AM MDT Paul Backus via dip.ideas wrote:
 D would be a simpler, easier-to-use language if these implicit
 conversions were removed. The first step to doing that is to
 deprecate them.
In my experience, this hasn't been a big enough issue for me to care, and it's seemed like more of an academic concern than an actual problem, but I probably just don't typically write the kind of code that runs into problems because of it. So, I don't mind the status quo, but I'm also fine with getting rid of such implicit conversions. The main question IMHO is how annoying it'll be in practice.

The primary case I can think of where there would likely be problems would be code that returns -1 for an index with size_t (e.g. some of the Phobos functions do that when the item being searched for isn't found). It's something that works perfectly fine in general, but it means comparing a signed type and an unsigned type. It also sometimes means explicitly assigning -1 to an unsigned type. Those can be replaced with using the type's max instead, so it's not the end of the world by any means, but it will require code changes, and the result is arguably uglier. As Steven pointed out though, VRP should still allow the conversion where appropriate, which should reduce how much code would need to be changed.

A related problem is that the compiler allows implicit conversions between character types and integer types. Personally, I care about that one far more and would love to see it changed, but I'm not against the idea of getting rid of implicit conversions between signed and unsigned integer types.

- Jonathan M Davis
May 14