digitalmars.dip.ideas - Deprecate implicit conversion between signed and unsigned integers
- Paul Backus (36/36) May 12 2024 D inherited these implicit conversions from C and C++, where they
- monkyyy (8/13) May 12 2024 D often takes a worse of both worlds; bad defaults and verbose
- Nick Treleaven (20/38) May 12 2024 Signed to unsigned should be deprecated (except where VRP can
- Nick Treleaven (3/7) May 12 2024 I forgot, those already exist, at least in Phobos:
- Dom DiSc (5/7) May 13 2024 We have a working solution that always returns the correct result
- Dom DiSc (23/25) May 14 2024 And by the way: This solution doesn't involve integer propagation
- Paul Backus (8/15) May 14 2024 As I said in my reply to Nick, this proposal makes no distinction
- Paul Backus (15/31) May 14 2024 I assume by "changing the size of the type" you are referring
- Dukc (11/14) May 13 2024 Ditching all backwards-compatibility issues, it would be a good idea.
- Nick Treleaven (4/17) May 13 2024 I think even with editions we need to avoid making it hard to
- Jonathan M Davis (40/55) May 14 2024 Deprecations are the language's tool for making changes where code will
- Steven Schveighoffer (6/6) May 13 2024 I think yes, we should ban signed/unsigned conversions, but I
- An Pham (12/14) May 14 2024 Just focus on sign vs unsign is not good enough. Sometime you
- Jonathan M Davis (24/27) May 14 2024 In my experience, this hasn't been a big enough issue for me to care, an...
D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs.

* In his 2018 paper, ["Subscripts and sizes should be signed"][1], Bjarne Stroustrup gives several examples of bugs caused by the use of unsigned sizes and indices in C++. About half of them are caused by wrapping subtraction; the other half are caused by implicit conversion from signed to unsigned.
* The C++20 standard includes [safe integer comparison functions][2] specifically to avoid these implicit conversions when comparing integers of different signedness.
* Both [GCC][3] and [Clang][4] provide a `-Wsign-conversion` flag to warn about these implicit conversions.

In D, they cause the additional problem of breaking value-range propagation (VRP):

    enum byte a = -1;
    enum uint b = a; // Ok, but...
    enum byte c = b; // Error - original value range has been lost

D would be a simpler, easier-to-use language if these implicit conversions were removed. The first step to doing that is to deprecate them.

While this is a breaking change, migration of old code would be very simple: simply insert an explicit `cast` to silence the error and restore the original behavior. In many cases, migration could be performed automatically with a tool that uses the DMD frontend as a library.

I believe this change would be received positively by existing D programmers, since D's willingness to discard C and C++'s mistakes is one of the things that draws programmers to D in the first place.

[1]: https://open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf
[2]: https://en.cppreference.com/w/cpp/utility/intcmp
[3]: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wsign-conversion
[4]: https://clang.llvm.org/docs/DiagnosticsReference.html#wsign-conversion
May 12 2024
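As an illustration of the two complaints in the opening post, here is a minimal sketch of what current D accepts. The helper `isShort` is a hypothetical name chosen for this example; the three `enum` lines are the ones from the post, with the failing one wrapped in `__traits(compiles, ...)` so that the snippet still builds.

```d
// Hypothetical helper, for illustration only: is `len` below `limit`?
bool isShort(size_t len, int limit)
{
    // `limit` is implicitly converted to an unsigned type for the
    // comparison, so a negative limit becomes a huge positive number.
    return len < limit;
}

unittest
{
    int i = -1;
    uint u = i;               // accepted today: signed -> unsigned, value wraps
    assert(u == uint.max);

    assert(isShort(3, -1));   // "3 < -1" holds after the conversion

    // Migration under the proposal: spell the conversion out.
    uint v = cast(uint) i;
    assert(v == u);
}

// Value-range propagation, as in the original post:
enum byte a = -1;
enum uint b = a;                                          // OK
static assert(!__traits(compiles, { enum byte c = b; })); // range is lost
```

The explicit `cast(uint)` keeps the old behaviour, which is the migration path the post describes.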
On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
> I believe this change would be received positively by existing D programmers, since D's willingness to discard C and C++'s mistakes is one of the things that draws programmers to D in the first place.

D often takes the worst of both worlds: bad defaults and verbose handling.

`size_t(-1) > 0` is a problem with indexes being unsigned; catering to a theoretical computer with more than 2^63 bytes (roughly 9,223 petabytes) of RAM is the wrong tradeoff.

No; types would need better defaults before even considering adding verbosity. I'd rather see breaking changes.
May 12 2024
On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
> In D, they cause the additional problem of breaking value-range propagation (VRP):
>
>     enum byte a = -1;
>     enum uint b = a; // Ok, but...
>     enum byte c = b; // Error - original value range has been lost
>
> D would be a simpler, easier-to-use language if these implicit conversions were removed. The first step to doing that is to deprecate them.

Signed to unsigned should be deprecated (except where VRP can tell the source was not negative).

Unsigned to signed can preserve the value range when the signed type is bigger than the unsigned type, e.g.:

    extern ubyte x;
    short y = x; // OK, short.max >= ubyte.max
    byte z = x; // Deprecate, byte.max < ubyte.max

These deprecations should be for the next edition of D.

> While this is a breaking change, migration of old code would be very simple: simply insert an explicit `cast` to silence the error and restore the original behavior.

`cast` can be bug-prone if the original type gets changed. It would be better to have druntime template functions `signed` and `unsigned` to do the casts with IFTI to avoid changing the size of the type.

> In many cases, migration could be performed automatically with a tool that uses the DMD frontend as a library.

Can you give some examples?

> I believe this change would be received positively by existing D programmers, since D's willingness to discard C and C++'s mistakes is one of the things that draws programmers to D in the first place.

Yes, implicit conversion to a type with incompatible value range is too bug-prone, D should prevent that. It is particularly galling that decent C compilers have had warnings for this for such a long time.

What about comparisons between incompatible signed and unsigned, deprecate too?
May 12 2024
On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
> `cast` can be bug-prone if the original type gets changed. It would be better to have druntime template functions `signed` and `unsigned` to do the casts with IFTI to avoid changing the size of the type.

I forgot, those already exist, at least in Phobos:
https://dlang.org/phobos/std_conv.html#.unsigned
May 12 2024
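To show why the Phobos helpers are less fragile than a hard-coded `cast`, here is a small sketch using `std.conv.signed` and `std.conv.unsigned`. The variable names are made up for the example; the scenario (a variable that was once `int` and later widened to `long`) is an assumption used to make the point concrete.

```d
import std.conv : signed, unsigned;

unittest
{
    long big = -1;

    // A hard-coded cast silently narrows if `big` used to be an int
    // and was later widened to long.
    auto narrowed = cast(uint) big;
    static assert(is(typeof(narrowed) == uint));

    // `unsigned` infers the operand type, so the width is preserved.
    auto same = unsigned(big);
    static assert(is(typeof(same) == ulong));
    assert(same == ulong.max);

    // `signed` is the counterpart for the other direction.
    ubyte small = 200;
    auto s = signed(small);
    static assert(is(typeof(s) == byte));
    assert(s == -56);
}
```

Both helpers only flip the signedness; they never change the size of the type, which is the property asked for above.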
On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
> What about comparisons between incompatible signed and unsigned, deprecate too?

We have a working solution that always returns the correct result (see https://issues.dlang.org/show_bug.cgi?id=259).
I never understood why anyone would rely on a wrong comparison result, so this should not be considered a breaking change.
May 13 2024
On Tuesday, 14 May 2024 at 06:59:16 UTC, Dom DiSc wrote:
> We have a working solution that always returns the correct result (see https://issues.dlang.org/show_bug.cgi?id=259).

And by the way: this solution doesn't involve integer propagation at all and also works for comparison of long with ulong.

For beginners this is by far the worst bug in D, and it has been there for 18 (in words: eighteen) years; it feels like it has been lingering there longer than D itself. This is so distracting.

It's ten lines of code and costs nothing (except if you do compare differently signed types, and even then it's still very cheap):

```d
import std.traits : CommonType, isIntegral, isSigned, isUnsigned, Unqual;

// Helper standing in for the built-in comparison of two values of the same type.
int cmpSame(T)(const T x, const T y) pure @safe @nogc nothrow { return (x > y) - (x < y); }

int opCmp(T, U)(const(T) a, const(U) b) pure @safe @nogc nothrow
if (isIntegral!T && isIntegral!U && !is(Unqual!T == Unqual!U))
{
    static if (isSigned!T && isUnsigned!U && T.sizeof <= U.sizeof)
        return (a < 0) ? -1 : cmpSame(cast(U) a, b); // negative < any unsigned
    else static if (isUnsigned!T && isSigned!U && T.sizeof >= U.sizeof)
        return (b < 0) ? 1 : cmpSame(a, cast(T) b);  // any unsigned > negative
    else // use common type as ever:
        return cmpSame(cast(CommonType!(T, U)) a, cast(CommonType!(T, U)) b);
}
```
May 14 2024
On Tuesday, 14 May 2024 at 06:59:16 UTC, Dom DiSc wrote:
> On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
>> What about comparisons between incompatible signed and unsigned, deprecate too?
>
> We have a working solution that always returns the correct result (see https://issues.dlang.org/show_bug.cgi?id=259).
> I never understood why anyone would rely on a wrong comparison result, so this should not be considered a breaking change.

As I said in my reply to Nick, this proposal makes no distinction between conversions done in the context of a comparison and conversions done in any other context.

I would rather not introduce a special case for comparisons, since special cases generally make the language more complex and harder to use. However, if you think this is a good idea, I encourage you to submit it as a separate proposal.
May 14 2024
On Sunday, 12 May 2024 at 20:20:10 UTC, Nick Treleaven wrote:
> Signed to unsigned should be deprecated (except where VRP can tell the source was not negative).
>
> Unsigned to signed can preserve the value range when the signed type is bigger than the unsigned type, e.g.:
>
>     extern ubyte x;
>     short y = x; // OK, short.max >= ubyte.max
>     byte z = x; // Deprecate, byte.max < ubyte.max

Agreed.

> `cast` can be bug-prone if the original type gets changed. It would be better to have druntime template functions `signed` and `unsigned` to do the casts with IFTI to avoid changing the size of the type.

I assume by "changing the size of the type" you are referring specifically to *narrowing* conversions, not widening ones. If so, then yes, it's probably a good idea to use a helper template to avoid that.

>> In many cases, migration could be performed automatically with a tool that uses the DMD frontend as a library.
>
> Can you give some examples?

Easier to give examples of the cases where it won't work: templates, because there's no reliable way to only apply the migration to specific instantiations; and string mixins, because there's no reliable way to find the source code corresponding to a mixed-in expression (if it even exists--it could be generated by CTFE).

> What about comparisons between incompatible signed and unsigned, deprecate too?

All binary operators, including comparison operators, use the same implicit conversions, so yes, comparisons would be covered by this proposal.
May 14 2024
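The statement that all binary operators share the same implicit conversions can be checked with a short sketch. It assumes only current D behaviour for `int` and `uint` operands; nothing in it is specific to the proposal.

```d
unittest
{
    int  i = -1;
    uint u = 1;

    // Arithmetic: the int operand is converted, the result type is uint.
    static assert(is(typeof(i + u) == uint));
    assert(i + u == 0);            // uint.max + 1 wraps around to zero

    // Comparison: same conversion, so -1 is compared as uint.max.
    assert(!(i < u));
    assert(i > u);

    // Widening both sides to a signed type that covers both ranges
    // gives the mathematically expected answer.
    assert(cast(long) i < cast(long) u);
}
```

Under the proposal, the mixed `i + u`, `i < u` and `i > u` expressions would each need an explicit conversion on one side.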
Paul Backus wrote on 12.5.2024 at 16:32:
> D would be a simpler, easier-to-use language if these implicit conversions were removed. The first step to doing that is to deprecate them.

Ditching all backwards-compatibility issues, it would be a good idea.

But, this would cause *tremendous* amounts of breakage. Before, I would have said it simply isn't worth it. But since we're going to have editions, maybe.

I'm still somewhat sceptical though. Nothing will break without a warning and people can stay at older editions if they want, but it's going to add a lot of work for someone migrating 100_000 lines to a new edition. That amount of code will likely have hundreds or even thousands of deprecations to fix.

I tend to think that if we will write an official automatic tool to add the needed casts, it's probably worth it. Otherwise not.
May 13 2024
On Monday, 13 May 2024 at 12:48:04 UTC, Dukc wrote:
> Paul Backus wrote on 12.5.2024 at 16:32:
>
> Ditching all backwards-compatibility issues, it would be a good idea.
>
> But, this would cause *tremendous* amounts of breakage. Before, I would have said it simply isn't worth it. But since we're going to have editions, maybe.
>
> I'm still somewhat sceptical though. Nothing will break without a warning and people can stay at older editions if they want, but it's going to add a lot of work for someone migrating 100_000 lines to a new edition. That amount of code will likely have hundreds or even thousands of deprecations to fix.

I think even with editions we need to avoid making it hard to port code to a newer edition. So instead of a deprecation, we could make it a `-w` warning instead.

> I tend to think that if we will write an official automatic tool to add the needed casts, it's probably worth it. Otherwise not.
May 13 2024
On Monday, May 13, 2024 8:04:34 AM MDT Nick Treleaven via dip.ideas wrote:
> On Monday, 13 May 2024 at 12:48:04 UTC, Dukc wrote:
>> Paul Backus wrote on 12.5.2024 at 16:32:
>>
>> Ditching all backwards-compatibility issues, it would be a good idea.
>>
>> But, this would cause *tremendous* amounts of breakage. Before, I would have said it simply isn't worth it. But since we're going to have editions, maybe.
>>
>> I'm still somewhat sceptical though. Nothing will break without a warning and people can stay at older editions if they want, but it's going to add a lot of work for someone migrating 100_000 lines to a new edition. That amount of code will likely have hundreds or even thousands of deprecations to fix.
>
> I think even with editions we need to avoid making it hard to port code to a newer edition. So instead of a deprecation, we could make it a `-w` warning instead.

Deprecations are the language's tool for making changes where code will later become illegal, and normally, the only result is that a message is printed. No code is broken until the language is actually changed to remove the deprecated feature.

In contrast, with how warnings are typically used in D, adding a warning is as good as adding an error, since it's extremely common to compile with -w, which makes all warnings errors, whereas arguably, -wi would be the better choice (but -w has been around longer and is shorter).

Warnings are also an utterly terrible idea in general and really should never have been added to the compiler. Even if you compile them in the fashion that most compilers do and have them actually be warnings and not errors, you inevitably end up in one of two situations with warnings:

1. You ignore many of them, because many of them are actually fine (since they typically warn of something that's potentially a problem and not something that's definitively a problem), and the ultimate result is that you get a wall of warnings, burying any useful messages where they'll never be seen, meaning that even the ones that should be fixed don't get fixed.

2. In order to avoid having a bunch of messages being printed and to avoid burying warnings that really should be fixed, you "fix" all warnings. In many cases, this requires changing code that is actually perfectly fine, but whether the code was fine or not, the fact that you're always making sure to remove any warnings that pop up makes it so that they might as well have been errors instead of warnings.

The end result is that warnings are utterly useless. Either they should have been errors, or they're better left to a linting tool. So, we really should not be adding to that problem by introducing more warnings.

And the fact that D's type introspection often checks whether a particular piece of code compiles in order to construct the checks for template constraints and static ifs and the like means that having flags which change whether a particular piece of code compiles or not is particularly bad for D, and adding more warnings can actually change what code compiles or not (or can even change which overload of a template is used). So, we really shouldn't be adding more warnings.

Deprecations don't have any of those problems unless you choose to compile with -de, which makes them errors and which arguably shouldn't be a thing for the same reasons that it's problematic that -w turns warnings into errors. It actually affects conditional compilation and can do so in ways that are not easy to detect.

- Jonathan M Davis
May 14 2024
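The introspection concern can be made concrete with a small sketch of the kind of compile-time probe that template constraints rely on. The `describe` overloads are hypothetical and exist only for this example; the sketch shows what current D reports and makes no claim about how a particular compiler flag would change the result.

```d
// Compile-time probes of the kind used in template constraints.
enum bool intToUint   = is(int : uint);   // implicit conversion test
enum bool viaCompiles = __traits(compiles, { int i; uint u = i; });

// Both are true in current D.
static assert(intToUint && viaCompiles);

// Hypothetical overload set, for illustration: which overload is
// selected depends entirely on the result of the probe.
string describe(T)(T) if (is(T : uint))  { return "converts to uint"; }
string describe(T)(T) if (!is(T : uint)) { return "needs an explicit cast"; }

unittest
{
    assert(describe(-1) == "converts to uint");
    assert(describe("text") == "needs an explicit cast");
}
```

If anything turned the conversion into a hard error, the probes would flip and `describe(-1)` would quietly select the other overload instead of failing to build, which is the hard-to-detect effect described in the post.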
I think yes, we should ban signed/unsigned conversions, but I also think implicit conversions are fine when VRP has validated that all the values are representable (e.g. `ubyte` should implicitly convert to `short` or `int`). This should cut down on the false positives.

-Steve
May 13 2024
On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
> D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs.

Focusing only on signed vs. unsigned is not good enough. Sometimes you need to specify a range of values. There is the module `std.checkedint`, which could be built upon:

1. Extend it to be a runtime system module that does not need an `import`.
2. Add range template parameters, `Checked!(int, X, Y, ...)`, so that for example `Checked!(int, -5, 200, ...)` can only hold values from -5 to 200 inclusive.
3. Extend the language to allow implicit parameter passing: given `void foo(Checked!(int, X, Y, ...) z)`, the call `foo(10)` is OK but `foo(1000)` should fail.
May 14 2024
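A range-restricted integer of the kind suggested above can be approximated in library code today. The sketch below is an assumption-laden illustration: `Ranged` is a hypothetical type, not part of Phobos and not the `Checked` template from `std.checkedint`, and it checks at run time only. Current D also has no implicit constructor call for function arguments, so the call site has to name the type.

```d
// Hypothetical type, not part of Phobos: an int restricted to [lo, hi].
struct Ranged(int lo, int hi)
{
    static assert(lo <= hi, "empty range");
    private int value = lo;

    this(int v) pure @safe
    {
        // Run-time check; out-of-range values are rejected.
        if (v < lo || v > hi)
            throw new Exception("value out of range");
        value = v;
    }

    int get() const pure nothrow @safe @nogc { return value; }
}

// Hypothetical function taking the constrained type.
int foo(Ranged!(-5, 200) z) { return z.get; }

unittest
{
    import std.exception : assertThrown;

    alias R = Ranged!(-5, 200);
    assert(foo(R(10)) == 10);      // in range
    assertThrown(foo(R(1000)));    // out of range, constructor throws
}
```

The third point of the post, accepting a bare `foo(10)` and rejecting `foo(1000)`, is what would need a language change; the sketch does not attempt it.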
On Sunday, May 12, 2024 7:32:36 AM MDT Paul Backus via dip.ideas wrote:
> D would be a simpler, easier-to-use language if these implicit conversions were removed. The first step to doing that is to deprecate them.

In my experience, this hasn't been a big enough issue for me to care, and it's seemed like more of an academic concern than an actual problem, but I probably just don't typically write the kind of code that runs into problems because of it. So, I don't mind the status quo, but I'm also fine with getting rid of such implicit conversions. The main question IMHO is how annoying it'll be in practice.

The primary case I can think of where there would likely be problems would be code that returns -1 for an index with size_t (e.g. some of the Phobos functions do that when the item being searched for isn't found). It's something that works perfectly fine in general, but it means comparing a signed type and an unsigned type. It also sometimes means explicitly assigning -1 to an unsigned type. Those can be replaced with using the type's max instead, so it's not the end of the world by any means, but it will require code changes, and the result is arguably uglier. As Steven pointed out though, VRP should still allow the conversion where appropriate, which should reduce how much code would need to be changed.

A related problem is that the compiler allows implicit conversions between character types and integer types. And personally, I care about that one far more and would love to see that changed, but I'm not against the idea of getting rid of implicit conversions between signed and unsigned integer types.

- Jonathan M Davis
May 14 2024
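The "-1 as a not-found index" pattern and its `max`-based replacement can be sketched as follows. `findIndex` is a hypothetical helper written for this example, not a Phobos function.

```d
// Hypothetical search helper, for illustration: returns the index of
// `needle`, or size_t.max when it is absent.
size_t findIndex(const(int)[] haystack, int needle) pure nothrow @safe @nogc
{
    foreach (i, x; haystack)
        if (x == needle)
            return i;
    return size_t.max;   // spelled out instead of relying on -1 converting
}

unittest
{
    int[] data = [4, 8, 15];

    assert(findIndex(data, 8) == 1);

    // Today both spellings work, because -1 converts to size_t.max.
    assert(findIndex(data, 99) == -1);
    assert(findIndex(data, 99) == size_t.max);
}
```

If the implicit conversion were removed, the `== -1` comparison is the line that would need changing; the `size_t.max` form would be unaffected.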