digitalmars.dip.ideas - Deprecate implicit conversion between signed and unsigned integers
- Atila Neves (4/7) Feb 03 My bias is to not like any implicit conversions of any kind, but
- Paul Backus (6/14) Feb 03 That's why I focused my proposal on the specific conversions that
- Quirin Schroll (6/22) Feb 05 Those are annoying, yes. Especially unary operators. If you asked
- Quirin Schroll (7/15) Feb 04 Any implicit conversions? I’d boldly claim the following
- Paul Backus (8/14) Feb 05 In general, implicit conversions that preserve the value range of
- Quirin Schroll (36/53) Feb 06 I even think that implicit conversions from integral to
- Walter Bright (6/9) Feb 06 We already do VRP checks for cases:
- Quirin Schroll (10/19) Feb 13 I didn’t know that, but I hardly ever use floating-point types.
- Kagamin (10/15) Feb 06 I once ported a 32 bit C++ application to 64 bit. The code was
- Quirin Schroll (10/26) Feb 06 The fault 100% lies in converting `std::size_t` (which is
- Walter Bright (5/5) Feb 06 Having a function that searches an array for a value and returns the ind...
- monkyyy (4/7) Feb 06 All options suck, -1 should suck least; size_t is insane in the
- Walter Bright (3/5) Feb 06 As D is also a systems programming language, it provides access to the m...
- monkyyy (5/11) Feb 06 Do any of the embedded projects have working slices? Are there no
- Kagamin (13/13) Feb 07 FWIW, if you want C# array idiom
- Walter Bright (1/1) Feb 17 size_t is just an alias declaration. The compiler does not actually know...
- DLearner (7/12) Feb 07 [...]
- Walter Bright (2/7) Feb 17 That's FORTRAN style. It would break about every piece of D code.
- Guillaume Piolat (4/6) Feb 05 Sounds like churn.
- Walter Bright (31/31) Feb 06 [I'm not sure why a new thread was created?]
- Walter Bright (16/16) Feb 06 I forgot to mention:
- Richard (Rikki) Andrew Cattermole (12/31) Feb 06 Within the last couple of days on Twitter the C community has mentioned
- Walter Bright (3/4) Feb 17 For popcount, not for anything else. There are a lot of functions with `...
- Atila Neves (6/17) Feb 07 In Haskell, it could be either and the type would either be
- Walter Bright (4/5) Feb 17 Pascal required explicit casts. It sounded like a good idea. After a whi...
- Atila Neves (2/9) Feb 17 `cast(typeof(foo)) bar`?
- Walter Bright (4/7) Feb 17 That can work, but when best practices mean adding more code, the result...
- Atila Neves (2/11) Feb 17 Compilation or test failure, probably.
- Nick Treleaven (5/7) Feb 17 In this case, we can use these with IFTI instead of explicit
- Walter Bright (4/8) Feb 17 Yes (those were Andrei's initiative).
- Quirin Schroll (62/100) Feb 13 Java 23 does not have unsigned types, though. There are only
- Walter Bright (25/62) Feb 17 Signed and unsigned multiplication produce the exact same bit pattern re...
- Quirin Schroll (110/184) Feb 17 You’re right, I was mistaken. I thought multiplication by −1 had
- Paul Backus (4/10) Feb 17 Dividing an integer by zero is UB according to the D spec [1],
- Walter Bright (9/16) Feb 17 That's correct. But it's not memory corruption, and requiring casts does...
- Paul Backus (10/23) Feb 17 An optimizing compiler (like LDC or GDC) is allowed to generate
- Dom DiSc (12/20) Feb 06 I think most of the problems with these implicit conversions
- Richard (Rikki) Andrew Cattermole (4/29) Feb 06 We should revisit this once editions are accepted.
- Kagamin (8/9) Feb 06 I agree with Bjarne, the problem is entirely caused by abuse of
- Quirin Schroll (6/16) Feb 13 What would be a “proper number”? At best, signed and unsigned
- Kagamin (8/12) Feb 15 The problem is they are incompatible slices that you have to mix
https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh

On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
> D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs. [...]

My bias is to not like any implicit conversions of any kind, but I'm not sure I can convince Walter of that.
Feb 03
On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:
> https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh
>
> On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
>> D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs. [...]
>
> My bias is to not like any implicit conversions of any kind, but I'm not sure I can convince Walter of that.

That's why I focused my proposal on the specific conversions that are the most error-prone. I don't think we'll ever convince Walter to get rid of integer promotion in general, but there's a chance we can convince him to get rid of these specific conversions.
Feb 03
On Monday, 3 February 2025 at 19:30:14 UTC, Paul Backus wrote:
> On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:
>> https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh
>>
>> On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
>>> D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs. [...]
>>
>> My bias is to not like any implicit conversions of any kind, but I'm not sure I can convince Walter of that.
>
> That's why I focused my proposal on the specific conversions that are the most error-prone. I don't think we'll ever convince Walter to get rid of integer promotion in general, but there's a chance we can convince him to get rid of these specific conversions.

Those are annoying, yes. Especially unary operators. If you asked me right now what `~x` returns on a small integer type, I honestly don’t know. D has C’s rules because of one design decision early on: If it looks like C, it acts like C or it’s an error.
Feb 05
On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:
> https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh
>
> On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:
>> D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs. [...]
>
> My bias is to not like any implicit conversions of any kind, but I'm not sure I can convince Walter of that.

Any implicit conversions? I’d boldly claim the following conversions are unproblematic:
* `float` → `double` → `real`
* signed integer → bigger signed integer
* unsigned integer → bigger unsigned integer

And it would be really annoying to have to explicitly cast them.
Feb 04
On Tuesday, 4 February 2025 at 16:29:22 UTC, Quirin Schroll wrote:
> Any implicit conversions? I’d boldly claim the following conversions are unproblematic:
> * `float` → `double` → `real`
> * signed integer → bigger signed integer
> * unsigned integer → bigger unsigned integer
>
> And it would be really annoying to have to explicitly cast them.

In general, implicit conversions that preserve the value range of the original type are ok. So, for example:
* `ushort` → `int`
* `long` → `float`

The reason that conversions like `int` → `uint` and `uint` → `int` are problematic is that the value range of the original type does not fit into the value range of the target type.
Feb 05
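To make the value-range argument concrete, here is a minimal sketch, assuming current D semantics in which same-width signed/unsigned conversions are still implicit. The variable names are made up for illustration and do not come from the thread.

```d
void main()
{
    // ushort -> int: every value of the source type fits in the target.
    ushort small = ushort.max;
    int widened = small;
    assert(widened == 65_535);

    // int -> uint: the bit pattern is kept, the value is not.
    int negative = -1;
    uint reinterpreted = negative; // compiles today without a cast
    assert(reinterpreted == uint.max);

    // uint -> int: the same problem in the other direction.
    uint big = uint.max;
    int wrapped = big;
    assert(wrapped == -1);
}
```

The first conversion is the kind the post calls ok; the other two are the ones the proposal would deprecate.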
On Wednesday, 5 February 2025 at 16:29:25 UTC, Paul Backus wrote:
> On Tuesday, 4 February 2025 at 16:29:22 UTC, Quirin Schroll wrote:
>> Any implicit conversions? I’d boldly claim the following conversions are unproblematic:
>> * `float` → `double` → `real`
>> * signed integer → bigger signed integer
>> * unsigned integer → bigger unsigned integer
>>
>> And it would be really annoying to have to explicitly cast them.
>
> In general, implicit conversions that preserve the value range of the original type are ok. So, for example:
> * `ushort` → `int`
> * `long` → `float`
>
> The reason that conversions like `int` → `uint` and `uint` → `int` are problematic is that the value range of the original type does not fit into the value range of the target type.

I even think that implicit conversions from integral to floating-point type are bad, considering that `int` → `float` and `long` → `double` aren’t lossless in general.

Here’s an attempt to classify:

1. Definitely okay implicit conversions:
   * `float` → `double` → `real`
   * `byte` → `short` → `int` → `long`
   * `ubyte` → `ushort` → `uint` → `ulong`
2. Probably okay implicit conversions:
   * `ubyte` → `short` → `int` → `long`
   * `ushort` → `int` → `long`
   * `uint` → `long`
3. Somewhat contentious implicit conversions:
   * `byte`/`ubyte`/`short`/`ushort` → `float` → `double` → `real`
   * `int`/`uint` → `double` → `real`
   * `long`/`ulong` → `real`
4. Micro-lossy narrowing conversions:
   * `int`/`uint` → `float`
   * `long`/`ulong` → `float`/`double`
5. Bit-pattern-preserving value-altering conversions:
   * `byte` ↔ `ubyte`
   * `short` ↔ `ushort`
   * `int` ↔ `uint`
   * `long` ↔ `ulong`
6. Lossy narrowing conversions:
   * The reverse of any “→” above.

It appears to me that you can reasonably draw 7 lines (from before 1 to after 6). Examples for what existing languages do (to my knowledge):
* Haskell draws the line at 0/1. It has no implicit conversions whatsoever.
* D draws the line at 5/6.
* C/C++ draws the line at 6/7.
Feb 06
On 2/6/2025 7:18 AM, Quirin Schroll wrote:
> 4. Micro-lossy narrowing conversions:
>    * `int`/`uint` → `float`
>    * `long`/`ulong` → `float`/`double`

We already do VRP checks for cases:
```
float f = 1; // passes
float g = 0x1234_5678; // fails
```
Feb 06
On Thursday, 6 February 2025 at 20:52:53 UTC, Walter Bright wrote:
> On 2/6/2025 7:18 AM, Quirin Schroll wrote:
>> 4. Micro-lossy narrowing conversions:
>>    * `int`/`uint` → `float`
>>    * `long`/`ulong` → `float`/`double`
>
> We already do VRP checks for cases:
> ```
> float f = 1; // passes
> float g = 0x1234_5678; // fails
> ```

I didn’t know that, but I hardly ever use floating-point types. However, that’s not exactly VRP, but a useful check that compile-time-known values are representable in the target type. VRP means that while you normally need a cast to assign an integer to a `ubyte`, you can assign `myInt & 0xFF` to a `ubyte` without cast. You *can* assign any run-time `int` to a `float`. What you’re pointing out is that “micro-lossy narrowing conversions” are a compile-error if they’re *definitely* occurring.
Feb 13
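The distinction drawn in that reply can be sketched as follows. This is a minimal illustration assuming current D behavior; `myInt` is the name used in the post, the other names are made up.

```d
void main()
{
    int myInt = 0x1234;

    // VRP: the mask bounds the result to 0 .. 255, so no cast is needed.
    ubyte low = myInt & 0xFF;
    assert(low == 0x34);

    // Without the mask, the full int range does not fit into a ubyte,
    // so the plain assignment is rejected at compile time.
    static assert(!__traits(compiles, { int x; ubyte b = x; }));

    // A run-time int still converts to float implicitly.
    float f = myInt;
    assert(f == 4660.0f); // 0x1234 is exactly representable
}
```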
On Tuesday, 4 February 2025 at 16:29:22 UTC, Quirin Schroll wrote:
> Any implicit conversions? I’d boldly claim the following conversions are unproblematic:
> * `float` → `double` → `real`
> * signed integer → bigger signed integer
> * unsigned integer → bigger unsigned integer

I once ported a 32 bit C++ application to 64 bit. The code was
```
uint32_t found=str1.find(str2);
if(found==string::npos)return;
str3=str1.substr(0,found);
```
One can say the problem is in narrowing conversion, but there's still the fundamental problem that `npos` of different widths are incompatible.
Feb 06
On Thursday, 6 February 2025 at 16:48:27 UTC, Kagamin wrote:
> On Tuesday, 4 February 2025 at 16:29:22 UTC, Quirin Schroll wrote:
>> Any implicit conversions? I’d boldly claim the following conversions are unproblematic:
>> * `float` → `double` → `real`
>> * signed integer → bigger signed integer
>> * unsigned integer → bigger unsigned integer
>
> I once ported a 32 bit C++ application to 64 bit. The code was
> ```
> uint32_t found=str1.find(str2);
> if(found==string::npos)return;
> str3=str1.substr(0,found);
> ```
> One can say the problem is in narrowing conversion, but there's still the fundamental problem that `npos` of different widths are incompatible.

The fault 100% lies in converting `std::size_t` (which is `std::uint64_t` on all(?) 64-bit platforms) to `std::uint32_t`. You could also say it’s bad that the compiler didn’t warn you about a non-trivial expression that will always be `false` because a `std::uint32_t` simply can’t be `std::string::npos` (which is `~std::uint64_t{}` guaranteed by the C++ Standard). Clang warns on these, GCC doesn’t.

You really can’t blame `std::uint32_t` converting to `std::uint64_t`. That is completely reasonable.
Feb 06
Having a function that searches an array for a value and returns the index of the array if found, and -1 if not found, is not a good practice. An index being returned should be size_t, and the not-found value should be size_t.max. See my other post on recommendations for selecting integral types.
Feb 06
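A minimal sketch of the recommendation above, assuming a 64-bit target for the compile-time check. `indexOfValue` is a hypothetical helper written for this illustration; it is not a Phobos function.

```d
// Hypothetical search helper: returns the index, or size_t.max if not found.
size_t indexOfValue(const int[] haystack, int needle)
{
    foreach (i, value; haystack) // i is a size_t
        if (value == needle)
            return i;
    return size_t.max;
}

void main()
{
    const int[] data = [10, 20, 30];
    assert(indexOfValue(data, 20) == 1);
    assert(indexOfValue(data, 99) == size_t.max);

    // Storing the result in a narrower type is what broke the C++ port
    // described earlier; on a 64-bit target D rejects it without a cast.
    static if (size_t.sizeof == 8)
        static assert(!__traits(compiles, { uint found = indexOfValue(null, 0); }));
}
```

Because the narrowing is a compile error in D, the `npos`-style comparison cannot silently become always-false the way it did in the C++ code.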
On Thursday, 6 February 2025 at 20:44:46 UTC, Walter Bright wrote:
> Having a function that searches an array for a value and returns the index of the array if found, and -1 if not found, is not a good practice.

All options suck, -1 should suck least; size_t is insane in the 64bit era: no one has that much ram, no one has that much ram for compressed bools of 2^63 bits
Feb 06
On 2/6/2025 12:59 PM, monkyyy wrote:
> All options suck, -1 should suck least; size_t is insane in the 64bit era no one has that much ram, no one has that much ram for compressed bools of 2^63 bits

As D is also a systems programming language, it provides access to the model the hardware implements.
Feb 06
On Friday, 7 February 2025 at 01:30:31 UTC, Walter Bright wrote:
> On 2/6/2025 12:59 PM, monkyyy wrote:
>> All options suck, -1 should suck least; size_t is insane in the 64bit era no one has that much ram, no one has that much ram for compressed bools of 2^63 bits
>
> As D is also a systems programming language, it provides access to the model the hardware implements.

Do any of the embedded projects have working slices? Are there no ways to make size_t only signed on 64 bit machines, or as a flag? I don't even know what the argument is for when that 64th bit will be used.
Feb 06
FWIW, if you want C# array idiom
```
int count(T)(in T[] a)
{
    debug assert(a.length==cast(int)a.length);
    return cast(int)a.length;
}

long lcount(T)(in T[] a)
{
    debug assert(long(a.length)>=0);
    return long(a.length);
}
```
Feb 07
size_t is just an alias declaration. The compiler does not actually know it exists.
Feb 17
On Thursday, 6 February 2025 at 20:44:46 UTC, Walter Bright wrote:
> Having a function that searches an array for a value and returns the index of the array if found, and -1 if not found, is not a good practice. An index being returned should be size_t, and the not-found value should be size_t.max.
> [...]

Or, maintaining size_t, make first index of an array 1 not 0, and return 0 if not found. Like malloc.

First array index is 1 also eliminates a fruitful source of off-by-one errors.
Feb 07
On 2/7/2025 1:04 PM, DLearner wrote:
> Or, maintaining size_t, make first index of an array 1 not 0, and return 0 if not found. Like malloc. First array index is 1 also eliminates a fruitful source of off-by-one errors.

That's FORTRAN style. It would break about every piece of D code.
Feb 17
On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:
> My bias is to not like any implicit conversions of any kind, but I'm not sure I can convince Walter of that.

Sounds like churn. Even the `=>` syntax prevents using old compilers with new packages and causes churn in the DUB ecosystem.
Feb 05
[I'm not sure why a new thread was created?]

This comes up now and then. It's an attractive idea, and seems obvious. But I've always been against it for multiple reasons.

1. Pascal solved this issue by not allowing any implicit conversions. The result was casts everywhere, which made the code ugly. I hate ugly code.
2. Java solved this by not having an unsigned type. People went to great lengths to emulate unsigned behavior. Eventually, the Java people gave up and added it.
3. Is `1` a signed int or an unsigned int?
4. What happens with `p[i]`? If p is the beginning of a memory object, we want i to be unsigned. If p points to the middle, we want i to be signed. What should be the type of `p - q`? signed or unsigned?
5. We rely on 2's complement overflow semantics to get the same behavior if i is signed or unsigned, most of the time.
6. Casts are a blunt instrument that impair readability and can cause unexpected behavior when changing a type in a refactoring. High quality code avoids the use of explicit casts as much as possible.
7. C behavior on this is extremely well known.
8. The Value Range Propagation feature was a brilliant solution that resolved most issues with implicit signed and unsigned conversions, without causing any problems.
9. Array bounds checking tends to catch the usual bugs with conflating signed with unsigned. Array bounds checking is a total winner of a feature.

Andrei and I went around and around on this, pointing out the contradictions. There was no solution. There is no "correct" answer for integer 2's complement arithmetic.

Here's what I do:

1. use unsigned if the declaration should never be negative.
2. use size_t for all pointer offsets
3. use ptrdiff_t for deltas of size_t that could go negative
4. otherwise, use signed

Stick with those and most of the problems will be avoided.
Feb 06
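The `size_t` / `ptrdiff_t` guidelines above can be sketched like this. It is a minimal illustration that relies on the implicit unsigned-to-signed conversion D currently allows; the variable names are invented for the example.

```d
void main()
{
    size_t first = 2;  // offsets are size_t
    size_t second = 5;

    // A delta that can go negative belongs in ptrdiff_t.
    // The subtraction wraps, and the implicit conversion reinterprets the bits.
    ptrdiff_t delta = first - second;
    assert(delta == -3);

    // Kept as size_t, the same subtraction yields a huge positive value.
    size_t wrapped = first - second;
    assert(wrapped == size_t.max - 2);
}
```

Note that the `ptrdiff_t` line is exactly the kind of same-width unsigned-to-signed conversion the proposal would require to be spelled out explicitly.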
I forgot to mention:
```
int popcount(int x);
```
```
uint y = ...;
popcount(y); // do you really want that to fail?
```
Let's fix it:
```
int popcount(uint x);
```
```
int y = ...;
popcount(y); // now this fails
```
Feb 06
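A runnable version of the `popcount` scenario, as a sketch under current D rules. The function body is a hypothetical implementation added for illustration; the post only declares the signature.

```d
// Hypothetical implementation; only the signature appears in the thread.
int popcount(uint x)
{
    int bits = 0;
    while (x != 0)
    {
        x &= x - 1; // clear the lowest set bit
        ++bits;
    }
    return bits;
}

void main()
{
    uint u = 0xF0;
    int i = -1;
    assert(popcount(u) == 4);
    assert(popcount(i) == 32); // accepted today via implicit int -> uint
}
```

Under the proposed deprecation, the second call would need an explicit conversion even though the sign of the argument is irrelevant to the result.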
On 07/02/2025 9:55 AM, Walter Bright wrote:
> I forgot to mention:
> ```
> int popcount(int x);
> ```
> ```
> uint y = ...;
> popcount(y); // do you really want that to fail?
> ```
> Let's fix it:
> ```
> int popcount(uint x);
> ```
> ```
> int y = ...;
> popcount(y); // now this fails
> ```

Within the last couple of days on Twitter the C community has mentioned that they'd like implicit conversion for numeric types to static arrays.

That could resolve this quite nicely.
```d
int popcount(ubyte[4]);

int y = ...;
popcount(y);

uint z = ...;
popcount(z);
```
An explicit cast with ``ref`` could go the other way.
Feb 06
On 2/6/2025 8:26 PM, Richard (Rikki) Andrew Cattermole wrote:
> That could resolve this quite nicely.

For popcount, not for anything else. There are a lot of functions with `int` or `uint` parameters, but the sign is meaningless to its operation.
Feb 17
On Thursday, 6 February 2025 at 09:10:41 UTC, Walter Bright wrote:
> [I'm not sure why a new thread was created?]
>
> This comes up now and then. It's an attractive idea, and seems obvious. But I've always been against it for multiple reasons.
>
> 1. Pascal solved this issue by not allowing any implicit conversions. The result was casts everywhere, which made the code ugly. I hate ugly code.

I hate ugly code too, but I'd rather have explicit casts.

> 3. Is `1` a signed int or an unsigned int?

In Haskell, it could be either, and the type would be inferred. Or the programmer chooses: `1 :: Int`

> 4. What happens with `p[i]`? If p is the beginning of a memory object, we want i to be unsigned. If p points to the middle, we want i to be signed. What should be the type of `p - q`? signed or unsigned?

Good questions.
Feb 07
On 2/7/2025 4:50 AM, Atila Neves wrote:
> I hate ugly code too, but I'd rather have explicit casts.

Pascal required explicit casts. It sounded like a good idea. After a while, I hated it. It was so nice switching to C and leaving that behind.

(Did I mention that explicit casts also hide errors introduced by refactoring?)
Feb 17
On Monday, 17 February 2025 at 08:30:44 UTC, Walter Bright wrote:
> On 2/7/2025 4:50 AM, Atila Neves wrote:
>> I hate ugly code too, but I'd rather have explicit casts.
>
> Pascal required explicit casts. It sounded like a good idea. After a while, I hated it. It was so nice switching to C and leaving that behind.
>
> (Did I mention that explicit casts also hide errors introduced by refactoring?)

`cast(typeof(foo)) bar`?
Feb 17
On 2/17/2025 1:06 AM, Atila Neves wrote:
>> (Did I mention that explicit casts also hide errors introduced by refactoring?)
>
> `cast(typeof(foo)) bar`?

That can work, but when best practices mean adding more code, the result is usually failure. Also, what if `foo` changes to something not anticipated by that cast?
Feb 17
On Monday, 17 February 2025 at 22:24:37 UTC, Walter Bright wrote:
> On 2/17/2025 1:06 AM, Atila Neves wrote:
>>> (Did I mention that explicit casts also hide errors introduced by refactoring?)
>>
>> `cast(typeof(foo)) bar`?
>
> That can work, but when best practices mean adding more code, the result is usually failure. Also, what if `foo` changes to something not anticipated by that cast?

Compilation or test failure, probably.
Feb 17
On Monday, 17 February 2025 at 08:30:44 UTC, Walter Bright wrote:
> (Did I mention that explicit casts also hide errors introduced by refactoring?)

In this case, we can use these with IFTI instead of explicit casts:

https://dlang.org/phobos/std_conv.html#signed
https://dlang.org/phobos/std_conv.html#unsigned
Feb 17
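A short sketch of how the two linked Phobos helpers behave. Both infer the target type from the argument, so the width follows the source type when it changes in a refactoring; the variable names are illustrative only.

```d
import std.conv : signed, unsigned;

void main()
{
    // signed() picks the signed type of the same width as its argument.
    uint u = uint.max;
    auto s = signed(u);
    static assert(is(typeof(s) == int));
    assert(s == -1);

    // unsigned() does the reverse; here the width is inferred as 64 bits.
    long l = -1;
    auto ul = unsigned(l);
    static assert(is(typeof(ul) == ulong));
    assert(ul == ulong.max);
}
```

Unlike `cast(int) u`, the call `signed(u)` does not truncate if `u` is later changed to a `ulong`.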
On 2/17/2025 1:11 PM, Nick Treleaven wrote:
> In this case, we can use these with IFTI instead of explicit casts:
>
> https://dlang.org/phobos/std_conv.html#signed
> https://dlang.org/phobos/std_conv.html#unsigned

Yes (those were Andrei's initiative).

Up to a point. An explicit use of a signed template doesn't work if one is refactoring to an unsigned type.
Feb 17
On Thursday, 6 February 2025 at 09:10:41 UTC, Walter Bright wrote:
> [I'm not sure why a new thread was created?]
>
> This comes up now and then. It's an attractive idea, and seems obvious. But I've always been against it for multiple reasons.
>
> 1. Pascal solved this issue by not allowing any implicit conversions. The result was casts everywhere, which made the code ugly. I hate ugly code.

Let me guess: Pascal has no value-range propagation?

> 2. Java solve this by not having an unsigned type. People went to great lengths to emulate unsigned behavior. Eventually, the Java people gave up and added it.

Java 23 does not have unsigned types, though. There are only operations that essentially reinterpret the bits of signed integer types as unsigned integers and do operations on them. Signed and unsigned multiplication, division and modulo are completely different operations.

> 3. Is `1` a signed int or an unsigned int?

Ideally, it has its own type that implicitly converts to anything that can be initialized by the constant. Of course, `typeof()` must return something; there are three options:
- `typeof(1)` is `typeof(1)`, similar to `typeof(null)`
- `typeof(1)` is `__static_integer` (cf. Zig’s `comptime_int`)
- `typeof(1)` is `int`, which makes it indistinguishable from a runtime expression.

D chooses the last. None of those are a bad choice; tradeoffs everywhere.

> 4. What happens with `p[i]`? If `p` is the beginning of a memory object, we want `i` to be unsigned. If `p` points to the middle, we want `i` to be signed. What should be the type of `p - q`? signed or unsigned?

Two questions, two answers.

> What happens with `p[i]`?

That’s a vague question. If `p` is a slice, range error if `i` is signed and negative. If `p` is a pointer, it’s `*(p + i)` and if `i` is signed and negative, so be it. `typeof(p + i)` is `typeof(p)`, so there shouldn’t be a problem.

> What should be the type of `p - q`? signed or unsigned?

Signed. If `p` and `q` are compile-time constants, so is `p - q`, and if it’s nonnegative, converts to unsigned types.

While it would be annoying for sure, it does make sense to use a function for pointer subtraction when one assumes the difference to be positive: `unsignedDifference(p, q)`

It would assert that the result is in fact positive or zero and return a `size_t`. The cool thing about it is that if you expect an unsigned result and happen to be wrong, you’ll find out quicker than otherwise.

> 5. We rely on 2's complement overflow semantics to get the same behavior if `i` is signed or unsigned, most of the time.

As I see it, 2’s complement for both signed and unsigned arithmetic is a straightforward choice D made to keep `@safe` useful. If D made any of them UB, it would exclude part of basic arithmetic from `@safe` because `@safe` bans every operation that *can* introduce UB. It’s essentially why pointer arithmetic is banned in `@safe`, since `++p` might push `p` outside an array, which is UB. D offers slices as a safe (because checked) alternative to pointers.

> 6. Casts are a blunt instrument that impair readability and can cause unexpected behavior when changing a type in a refactoring. High quality code avoids the use of explicit casts as much as possible.

In my experience, when signed and unsigned are mixed, it points to a design issue. I had this experience a couple of times working on an older C++ codebase.

> 7. C behavior on this is extremely well known.

Making something valid in C do something it can’t do in C is a bad idea and invites bugs, that is true. Making questionable C things errors *prima facie* isn’t. AFAICT, D for the most part sticks to: If it looks like C, it behaves like C or doesn’t compile. Banning signed-to-unsigned conversions (unless VRP proves it’s okay) simply falls into the latter box.

> 8. The Value Range Propagation feature was a brilliant solution, that resolved most issues with implicit signed and unsigned conversions, without causing any problems.

Of course VRP is great. For the most part, it means if an implicit conversion compiles, it’s because nothing weird happens, no data can be lost, etc. Signed to unsigned conversion breaks this expectation that VRP in fact co-created.

> 9. Array bounds checking tends to catch the usual bugs with conflating signed with unsigned. Array bounds checking is a total winner of a feature.

It’s generally good. Almost no-one complains about it.

> Andrei and I went around and around on this, pointing out the contradictions. There was no solution. There is no "correct" answer for integer 2's complement arithmetic.

I don’t really know what that means. Integer types in C and most languages derived from it (D included) have this oddity that addition and subtraction are 2’s complement, but multiplication, division, and modulo are not (`cast(uint)(-10 / 3)` and `cast(uint)-10 / 3` are different). Mathematically speaking, integers in D are neither values modulo 2ⁿ nor a section of ℤ.

> Here's what I do:
>
> 1. use unsigned if the declaration should never be negative.
> 2. use size_t for all pointer offsets
> 3. use ptrdiff_t for deltas of size_t that could go negative
> 4. otherwise, use signed
>
> Stick with those and most of the problems will be avoided.

Sounds reasonable.
Feb 13
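The `unsignedDifference(p, q)` idea from the post above could look roughly like this. It is a hypothetical sketch: the name comes from the post, but the signature and body are assumptions, and no such function exists in druntime or Phobos as far as this sketch assumes.

```d
// Hypothetical helper sketched from the description above; not a library function.
size_t unsignedDifference(T)(const(T)* p, const(T)* q)
{
    // Fail early if the caller's expectation of a non-negative result is wrong.
    assert(p >= q, "expected a non-negative pointer difference");
    return cast(size_t)(p - q); // p - q is a ptrdiff_t counted in elements
}

void main()
{
    int[4] arr;
    assert(unsignedDifference(&arr[3], &arr[0]) == 3);
    assert(unsignedDifference(&arr[1], &arr[1]) == 0);
}
```

The single cast lives inside the helper, guarded by the assertion, so call sites stay cast-free.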
On 2/13/2025 4:00 PM, Quirin Schroll wrote:
> Signed and unsigned multiplication, division and modulo are completely different operations.

Signed and unsigned multiplication produce the exact same bit pattern result. Division and modulo are indeed different.

> None of those are a bad choice; tradeoffs everywhere.

It's always tradeoffs.

>> 4. What happens with `p[i]`? If `p` is the beginning of a memory object, we want `i` to be unsigned. If `p` points to the middle, we want `i` to be signed. What should be the type of `p - q`? signed or unsigned?
>
> Two questions, two answers.
>
>> What happens with `p[i]`?
>
> That’s a vague question. If `p` is a slice, range error if `i` is signed and negative. If `p` is a pointer, it’s `*(p + i)` and if `i` is signed and negative, so be it. `typeof(p + i)` is `typeof(p)`, so there shouldn’t be a problem.

Sorry, I meant `p` as a pointer. I use `a` as an array (or slice). A pointer can move forward or backwards, so the index is signed. A slice cannot back up, so the index is unsigned. A slice can be converted to a pointer. So then what, is the index signed or unsigned? There's no answer for that.

>> What should be the type of `p - q`? signed or unsigned?
>
> Signed.

That doesn't work if the array is bigger than the int range, or happens to straddle `int.max`. (The garbage collector can run into this.)

> While it would be annoying for sure, it does make sense to use a function for pointer subtraction when one assumes the difference to be positive: `unsignedDifference(p, q)` It would assert that the result is in fact positive or zero and return a `size_t`. The cool thing about it is that if you expect an unsigned result and happen to be wrong, you’ll find out quicker than otherwise.

I'm sorry, all this extra baggage and these rules about signed and unsigned make it harder to use, not easier.

> As I see it, 2’s complement for both signed and unsigned arithmetic is a straightforward choice D made to keep `@safe` useful.

D's type system preceded @safe by many years :-/

> If D made any of them UB, it would exclude part of basic arithmetic from `@safe` because `@safe` bans every operation that *can* introduce UB.

@safe only bans memory corruption. 2's complement arithmetic is not UB.

> It’s essentially why pointer arithmetic is banned in `@safe`, since `++p` might push `p` outside an array, which is UB. D offers slices as a safe (because checked) alternative to pointers.

`--p` and `++p` are always unsafe whether the implicit conversions are there or not.

>> 6. Casts are a blunt instrument that impair readability and can cause unexpected behavior when changing a type in a refactoring. High quality code avoids the use of explicit casts as much as possible.
>
> In my experience, when signed and unsigned are mixed, it points to a design issue. I had this experience a couple of times working on an older C++ codebase.

Hence my suggestions.

I look at it this way. D is a systems programming language. A requirement for being successful at it is understanding 2's complement arithmetic, including what wraparound is. It's not that dissimilar to the requirement of some understanding of how floating point code works and its limitations, otherwise grief will be your inevitable companion. Also that a bool is a one bit integer arithmetic type. I know there are languages that attempt to hide all this stuff, but D isn't one of them.
Feb 17
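The arithmetic claims in this exchange are easy to check. The following minimal sketch uses the `-10` and `3` from the earlier post, held in run-time variables with illustrative names.

```d
void main()
{
    int a = -10;
    int b = 3;

    // Multiplication: the low 32 bits are the same, signed or unsigned.
    assert(cast(uint)(a * b) == cast(uint) a * cast(uint) b);

    // Division: dividing as signed and then reinterpreting ...
    assert(cast(uint)(a / b) == 4_294_967_293); // -3 viewed as uint

    // ... differs from reinterpreting first and dividing as unsigned.
    assert(cast(uint) a / cast(uint) b == 1_431_655_762);
}
```

So a signed/unsigned mix-up is harmless for `+`, `-` and `*`, but changes the result of `/` and `%`.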
On Monday, 17 February 2025 at 09:01:45 UTC, Walter Bright wrote:On 2/13/2025 4:00 PM, Quirin Schroll wrote:You’re right, I was mistaken. I thought multiplication by −1 had to be different than multiplication my `T.max`, but it’s not.Signed and unsigned multiplication, division and modulo are completely different operations.Signed and unsigned multiplication produce the exact same bit pattern result. Division and modulo are indeed different.Sometimes, there are better things.None of those are a bad choice; tradeoffs everywhere.It's always tradeoffs.The index already has a type. The operation `p + i` can support signed and unsigned `i` via overloading. I really don’t see the problem. You’re not inferring the type of the index because of the operation.Sorry, I meant `p` as a pointer. I use `a` as an array (or slice). A pointer can move forward or backwards, so the index is signed. A slice cannot back up, so the index is unsigned. A slice can be converted to a pointer. So then what, is the index signed or unsigned? There's no answer for that.4. What happens with `p[i]`? If `p` is the beginning of a memory object, we want `i` to be unsigned. If `p` points to the middle, we want `i` to be signed. What should be the type of `p - q`? signed or unsigned?Two questions, two answers.What happens with `p[i]`?That’s a vague question. If `p` is a slice, range error if `i` is signed and negative. If `p` is a pointer, it’s `*(p + i)` and if `i` is signed and negative, so be it. `typeof(p + i)` is `typeof(p)`, so there shouldn’t be a problem.Why would the GC use `int`? Unless, of course, it happens to equal `ptrdiff_t`? Those are conceptually different. The general problem is, basically, that differences of n-bit integers require n+1 bits to represent. That problem is not inherent to unsigned values, it’s just more obvious because 2 − 1 can’t be represented. In signed world, `-2` − `int.max` doesn’t fit in an `int` either. 
Making them signed doesn’t fix differences of indices totally, only differences of non-negative values.That doesn't work if the array is bigger than the int range, or happens to straddle `int.max`. (The garbage collector can run into this.)What should be the type of `p - q`? signed or unsigned?Signed.It’s much harder to write bugs when signed and unsigned are separated.While it would be annoying for sure, it does make sense to use a function for pointer subtraction when one assumes the difference to be positive: `unsignedDifference(p, q)` It would assert that the result is in fact positive or zero and return a `size_t`. The cool thing about it is that if you expect an unsigned result and happen to be wrong, you’ll find out quicker than otherwise.I'm sorry, all these extra baggage and rules about signed and unsigned makes it harder to use, not easier.My argument isn’t so much about history, but UB. Java does the same.As I see it, 2’s complement for both signed and unsigned arithmetic is a straightforward choice D made to keep ` safe` useful.D's type system preceded safe by many years :-/In the language design space, there’s no difference between UB and memory corruption because memory corruption is a form of UB and any UB can lead to memory corruption (by definition really). Therefore, speaking about memory corruption is equivalent to speaking about UB generally. D’s ` safe` bans all UB (by intent at least). If it didn’t, it would allow for memory corruption; it doesn’t matter if it’s directly or indirectly.If D made any of them UB, it would exclude part of basic arithmetic from ` safe` because ` safe` bans every operation that *can* introduce UB.safe only bans memory corruption.2's complement arithmetic is not UB.Of course it’s not. The alternative to 2’s complement is UB (practically speaking). There are some odd platforms with a negative representation that’s not 2’s complement, but D supports none of them. 
What I’m saying is, when designing a programming language, your choices for integer overflow are: 2’s complement or UB. D chose 2’s complement overall (also Java), C/C++ chose 2’s complement for unsigned and UB for signed, Zig chose UB overall. Guaranteeing 2’s complement means the operation is well-defined for all inputs, but the optimizer can do less. Tradeoffs everywhere. Even before `@safe`, having all operations on integers well-defined (maybe ignore division by zero) has positives that I guess you saw. Historically speaking, had D taken the C/C++ or Zig route, there would be no `@safe` because if basic operations on integers can be UB, adding a feature like `@safe` makes no sense.

What I find interesting is that:
- For pointers, it’s obvious to almost anyone that slices are a win because of bounds checking, even though it comes with a dual cost: The length has to be stored and indexing operations have to be range-checked.
- For integer operations, people seem to be hesitant to range-check them, even though that comes only with the cost of doing the check; no bound has to be stored.

It’s not that 2’s complement doesn’t have its place; what I am saying is: The language constructs should be as close to the intuition of the programmer as possible. I for one know when I’m making deliberate use of the bit representation of integers, however, without checks, I’m making use of the bit representation of integers with every operation, most of the time when I don’t intend to. Most of the time, the fact that integers are binary is conceptually irrelevant.

It’s essentially why pointer arithmetic is banned in `@safe`, since `++p` might push `p` outside an array, which is UB. D offers slices as a safe (because checked) alternative to pointers.

`--p` and `++p` are always unsafe whether the implicit conversions are there or not.

One cannot apply suggestions retroactively to a huge codebase that’s >15 years old.
One can, however, ban narrowing conversions and discover the problematic spots in compile errors and address them properly.

Hence my suggestions.

6. Casts are a blunt instrument that impair readability and can cause unexpected behavior when changing a type in a refactoring. High quality code avoids the use of explicit casts as much as possible.

In my experience, when signed and unsigned are mixed, it points to a design issue. I had this experience a couple of times working on an older C++ codebase.

I look at it this way. D is a systems programming language. A requirement for being successful at it is understanding 2's complement arithmetic, including what wraparound is.

While I agree that it is true and that I would exclude anyone from being called a competent programmer who doesn’t understand 2’s complement, I find myself rarely thinking about indices and whatnot as something other than an integer with a limited range. For hashing and some other algorithms, you do think of those as elements of an ordered [unitary ring](https://en.wikipedia.org/wiki/Ring_(mathematics)) with an operation referred to as “division with remainder.” D inherited its types from C and C inherited them from the operations of machines. It wouldn’t have occurred to the creators of C to provide different types for doing boolean logic, integer arithmetic, indexing arithmetic a.k.a. addressing, and bit operations. All of these happen in the same kinds of registers; to most people, however, a boolean value isn’t an integer (even C added `_Bool` and then `bool`); a number isn’t an index, and an index isn’t a bit-vector. To most people, `size_t` means more than “alias to the bit-width unsigned integer type the same size as addresses,” but conceptualizes sizes of memory or indices into arrays (in memory). Nobody would use a `size_t` to model the age of something; age is a number (within some range) and not an index. What’s the difference between `i << 1` and `i * 2`?
From the low-level perspective, literally none after optimization. However, in code, those encode very different intents. D is a low-level _and_ a high-level language. From the higher levels, mixing bit-vectors and numbers is usually a mistake. The language requiring you to state that, yes, that’s indeed what you want isn’t exactly bad.

It's not that dissimilar to the requirement of some understanding of how floating point code works and its limitations, otherwise grief will be your inevitable companion. Also that a `bool` is a one bit integer arithmetic type.

I wonder why D has a 1-bit integer type which is conceptually a boolean value, but no general n-bit integer types? C23 added `_BitInt(n)` and `_BitInt(1)` is not `bool` (which C23 made a proper type).

I know there are languages that attempt to hide all this stuff, but D isn't one of them.

There’s a difference between hiding and not needlessly exposing. Making the implicit conversion of `int` to and from `uint` an error isn’t hiding things akin to Java hiding its pointers. Narrowing implicit conversions warrant a warning in C and C++ and rightly so – it is likely a mistake and a local fix is available (use an explicit cast); brace-initialization in C++ outright bans it. By the design of D, it should be an error. Alternatives are:
- Redesign so the error doesn’t even come up anymore.
- Assert, then cast. (If you’re “really sure” it can’t fail.)
- Use a throwing narrowing conversion function. (If you’re “mostly sure” it can’t fail.)
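The last two alternatives in the list above could look roughly like the sketch below. The function names `assertThenCast` and `checkedNarrow` are hypothetical; the throwing variant relies on Phobos' `std.conv.to`, which throws `ConvOverflowException` when the value does not fit the target type.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

/// "Assert, then cast": for when you are "really sure" it cannot fail.
/// Note that the assert is compiled out in release builds.
uint assertThenCast(int value)
{
    assert(value >= 0, "value must be non-negative");
    return cast(uint) value;
}

/// Throwing narrowing conversion: for when you are "mostly sure".
uint checkedNarrow(int value)
{
    return value.to!uint; // throws ConvOverflowException if value < 0
}

void main()
{
    assert(assertThenCast(42) == 42u);
    assert(checkedNarrow(42) == 42u);
    assertThrown!ConvOverflowException(checkedNarrow(-1));
}
```

The difference is in who pays for a wrong assumption: the assert documents the assumption and checks it in debug builds only, while the throwing conversion checks on every call.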
Feb 17
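The `unsignedDifference(p, q)` helper suggested in the post above is not an existing library function. A minimal sketch of what it might look like, assuming `int` elements for simplicity, follows; the plain pointer subtraction stays signed (`ptrdiff_t`), and the helper asserts the ordering before converting.

```d
/// Hypothetical helper from the discussion: pointer difference as size_t.
/// Asserts that p does not point before q, so a wrong assumption about the
/// order of the pointers is caught early (in non-release builds).
size_t unsignedDifference(const(int)* p, const(int)* q)
{
    assert(p >= q, "unsignedDifference: p must not point before q");
    return cast(size_t)(p - q);
}

void main()
{
    int[8] buffer;
    const(int)* first = &buffer[0];
    const(int)* sixth = &buffer[5];

    // Ordinary pointer subtraction yields a signed ptrdiff_t (in elements).
    ptrdiff_t d = first - sixth;
    assert(d == -5);

    // The helper is for the case where a non-negative result is expected.
    assert(unsignedDifference(sixth, first) == 5);
}
```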
On Monday, 17 February 2025 at 09:01:45 UTC, Walter Bright wrote:

On 2/13/2025 4:00 PM, Quirin Schroll wrote:

Dividing an integer by zero is UB according to the D spec [1], and it is allowed in @safe code.

[1] https://dlang.org/spec/expression.html#division

If D made any of them UB, it would exclude part of basic arithmetic from `@safe` because `@safe` bans every operation that *can* introduce UB.

@safe only bans memory corruption. 2's complement arithmetic is not UB.
Feb 17
On 2/17/2025 7:07 AM, Paul Backus wrote:

On Monday, 17 February 2025 at 09:01:45 UTC, Walter Bright wrote:

That's correct. But it's not memory corruption, and requiring casts doesn't address it. The usual result is a signal is generated. These can be intercepted at the user's discretion. The compiler will flag an error if it can statically determine that the divisor is zero. Runtime checks could be added, but since other languages don't do that, it would put D at a competitive disadvantage. As always, there are tradeoffs.

@safe only bans memory corruption. 2's complement arithmetic is not UB.

Dividing an integer by zero is UB according to the D spec [1], and it is allowed in @safe code.

[1] https://dlang.org/spec/expression.html#division
Feb 17
On Tuesday, 18 February 2025 at 00:33:27 UTC, Walter Bright wrote:

On 2/17/2025 7:07 AM, Paul Backus wrote:

An optimizing compiler (like LDC or GDC) is allowed to generate code that produces memory corruption if a division by zero would occur. So this is absolutely a hole in @safe. If the compiler could guarantee that a signal would be generated on division by zero, that would be sufficient to close the safety hole.

Dividing an integer by zero is UB according to the D spec [1], and it is allowed in @safe code.

[1] https://dlang.org/spec/expression.html#division

That's correct. But it's not memory corruption, and requiring casts doesn't address it. The usual result is a signal is generated. These can be intercepted at the user's discretion.

The compiler will flag an error if it can statically determine that the divisor is zero. Runtime checks could be added, but since other languages don't do that, it would put D at a competitive disadvantage.

An alternative solution that does not require giving up any runtime performance would be to require @safe code to use std.checkedint for dividing integers.
Feb 17
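To make the idea of a checked division concrete, here is a hand-written sketch. It deliberately does not use `std.checkedint`, so it makes no claim about that module's API; `checkedDiv` is a hypothetical name. It rejects the two inputs for which `int` division has no representable result before the hardware division is reached.

```d
import std.exception : assertThrown;

/// Hypothetical checked division: throws instead of dividing when the
/// result is undefined (divisor == 0) or not representable (int.min / -1).
int checkedDiv(int dividend, int divisor) @safe pure
{
    if (divisor == 0)
        throw new Exception("integer division by zero");
    if (dividend == int.min && divisor == -1)
        throw new Exception("integer division overflow");
    return dividend / divisor;
}

void main() @safe
{
    assert(checkedDiv(7, 2) == 3);
    assert(checkedDiv(-7, 2) == -3); // truncates toward zero
    assertThrown!Exception(checkedDiv(1, 0));
    assertThrown!Exception(checkedDiv(int.min, -1));
}
```

The cost is two comparisons per division, which is the runtime overhead discussed in the posts above.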
On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:

https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh@forum.dlang.org

On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:

I think most of the problems with these implicit conversions would be gone if we make this work:

```d
byte a = -5;
ulong b = 1_000_000_000_000;
assert(a < b); // fails
```

And we already do have a solution for this (see https://issues.dlang.org/show_bug.cgi?id=259), but Walter refuses it, because it will break code that relies on this bug. How much less likely is it to convince him of your proposal?

D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs. [...]

My bias is to not like any implicit conversions of any kind, but I'm not sure I can convince Walter of that.
Feb 06
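The assertion in the snippet above fails because `a` is converted to `ulong` before the comparison, so `-5` becomes a very large positive number. A value-correct comparison can be written by hand; the sketch below uses a hypothetical helper name, `lessThan`, and is not the fix proposed in the linked issue.

```d
/// Hypothetical helper: mathematically correct "signed < unsigned".
bool lessThan(long lhs, ulong rhs) @safe pure nothrow @nogc
{
    // A negative value is smaller than every unsigned value; otherwise
    // the cast to ulong preserves the value and the comparison is exact.
    return lhs < 0 || cast(ulong) lhs < rhs;
}

void main()
{
    byte a = -5;
    ulong b = 1_000_000_000_000;

    // Built-in comparison: `a` is converted to ulong first.
    assert(!(a < b));

    // Value-based comparison gives the intuitive answer.
    assert(lessThan(a, b));
    assert(!lessThan(5, 5));
}
```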
On 07/02/2025 12:07 AM, Dom DiSc wrote:

On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:

We should revisit this once editions are accepted. It sounds reasonable to disable comparisons as long as VRP is kicking in to allow it selectively.

https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh@forum.dlang.org

On Sunday, 12 May 2024 at 13:32:36 UTC, Paul Backus wrote:

I think most of the problems with these implicit conversions would be gone if we make this work:

```d
byte a = -5;
ulong b = 1_000_000_000_000;
assert(a < b); // fails
```

And we already do have a solution for this (see https://issues.dlang.org/show_bug.cgi?id=259), but Walter refuses it, because it will break code that relies on this bug. How much less likely is it to convince him of your proposal?

D inherited these implicit conversions from C and C++, where they are widely regarded as a source of bugs. [...]

My bias is to not like any implicit conversions of any kind, but I'm not sure I can convince Walter of that.
Feb 06
On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:

https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh@forum.dlang.org

I agree with Bjarne, the problem is entirely caused by abuse of unsigned integers as positive numbers. And deprecation of implicit conversion is impossible due to this abuse: signed and unsigned integers will be mixed everywhere because signed integers are proper numbers and unsigned integers are everywhere due to abuse. [...] almost all interfaces and it just works.
Feb 06
On Thursday, 6 February 2025 at 16:39:26 UTC, Kagamin wrote:

On Monday, 3 February 2025 at 18:40:20 UTC, Atila Neves wrote:

What would be a “proper number”? At best, signed and unsigned types represent various slices of the infinite integers.

https://forum.dlang.org/post/pbhjffbxdqpdwtmcbikh@forum.dlang.org

I agree with Bjarne, the problem is entirely caused by abuse of unsigned integers as positive numbers. And deprecation of implicit conversion is impossible due to this abuse: signed and unsigned integers will be mixed everywhere because signed integers are proper numbers and unsigned integers are everywhere due to abuse.

[...] interfaces and it just works.

[...] unsigned types. There’s a [`CLSCompliantAttribute`](https://learn.microsoft.com/de-de/dotnet/api/system.clscompliantattribute) that warns you if you expose unsigned integers to your [...] is unsigned and `sbyte` is the signed, non-CLS-compliant variant.
Feb 13
On Friday, 14 February 2025 at 00:09:14 UTC, Quirin Schroll wrote:

What would be a “proper number”? At best, signed and unsigned types represent various slices of the infinite integers.

The problem is they are incompatible slices that you have to mix due to abuse of unsigned integers everywhere. At best an unsigned integer gives you an extra bit, but in practice it doesn't cut it: when you want a bigger integer, you use a much wider integer, not one bit bigger integer.

[...] unsigned types.

It demonstrates that the problem is due to abuse of unsigned integers.
Feb 15