digitalmars.D - core.checkedint
- Andrei Alexandrescu (32/32) Jun 24 2016 There are a few simple enhancements to add to core.checkedint:
- Dominikus Dittes Scherkl (100/132) Jun 24 2016 I have a save exponentiation function (+ two helper functions
- Dominikus Dittes Scherkl (18/20) Jun 24 2016 Ah, and I use a better abs() function:
- Walter Bright (22/27) Jun 24 2016 The reason for the different names is:
- Andrei Alexandrescu (10/41) Jun 24 2016 These functions are not supposed to be used naïvely.
- Walter Bright (17/18) Jun 24 2016 They are also meant to be inspectable. I want to look at it and triviall...
- Andrei Alexandrescu (3/7) Jun 24 2016 Then the right API is add(x, y, overflow). I look at it and I trivially
- Andrei Alexandrescu (15/15) Jun 24 2016 To bring a different angle, the best front-end for arithmetic operations...
- Walter Bright (6/10) Jun 24 2016 I believe adding such behavior is beyond the charter of checkedint. I un...
- Andrei Alexandrescu (2/3) Jun 24 2016 Which checkedint? The one in core? -- Andrei
- Walter Bright (2/5) Jun 24 2016 Your Checked!int. Sorry for the confusion.
- Andrei Alexandrescu (19/25) Jun 24 2016 Though you should be able to implement that via a hook, I think at this
- Andrei Alexandrescu (3/8) Jun 24 2016 To clarify: if array.length is 0, then indeed that should be signaled as...
- Observer (59/71) Jun 25 2016 I'm curious how one attempts to handle overflow detection in a
- Walter Bright (3/9) Jun 24 2016 I'm curious if that produces an overflow error on the newer clang that
- Walter Bright (8/14) Jun 24 2016 That's a seductive test case. But I worry that mixed signed/unsigned ari...
- Andrei Alexandrescu (7/25) Jun 24 2016 Doesn't seem that way (with some simplifying rules, associativity is
There are a few simple enhancements to add to core.checkedint: 1. No need for different names (addu/adds) etc., overloading should take care of it (simplifies caller code). Right now the API is an odd mix of overloading and distinct names. 2. The overflow flag should be an integral, not a bool, so you get to count how many overflows there were. This is minor but has no extra cost on the happy case. 3. There should be an exponentiation function. So I'm thinking of the signatures: pure nothrow nogc safe { int add(int x, int y, ref uint overflows); uint add(uint x, uint y, ref uint overflows); long add(long x, int y, ref uint overflows); ulong add(ulong x, uint y, ref uint overflows); int mul(int x, int y, ref uint overflows); uint mul(uint x, uint y, ref uint overflows); long mul(long x, int y, ref uint overflows); ulong mul(ulong x, uint y, ref uint overflows); int sub(int x, int y, ref uint overflows); uint sub(uint x, uint y, ref uint overflows); long sub(long x, int y, ref uint overflows); ulong sub(ulong x, uint y, ref uint overflows); int neg(int x, ref uint overflows); long neg(long x, ref uint overflows); int pow(int x, int y, ref uint overflows); uint pow(uint x, uint y, ref uint overflows); long pow(long x, int y, ref uint overflows); ulong pow(ulong x, uint y, ref uint overflows); } Thoughts? Andrei
Jun 24 2016
On Friday, 24 June 2016 at 16:56:36 UTC, Andrei Alexandrescu wrote:There are a few simple enhancements to add to core.checkedint: 1. No need for different names (addu/adds) etc., overloading should take care of it (simplifies caller code). Right now the API is an odd mix of overloading and distinct names. 2. The overflow flag should be an integral, not a bool, so you get to count how many overflows there were. This is minor but has no extra cost on the happy case. 3. There should be an exponentiation function. So I'm thinking of the signatures: pure nothrow nogc safe { int add(int x, int y, ref uint overflows); uint add(uint x, uint y, ref uint overflows); long add(long x, int y, ref uint overflows); ulong add(ulong x, uint y, ref uint overflows); int mul(int x, int y, ref uint overflows); uint mul(uint x, uint y, ref uint overflows); long mul(long x, int y, ref uint overflows); ulong mul(ulong x, uint y, ref uint overflows); int sub(int x, int y, ref uint overflows); uint sub(uint x, uint y, ref uint overflows); long sub(long x, int y, ref uint overflows); ulong sub(ulong x, uint y, ref uint overflows); int neg(int x, ref uint overflows); long neg(long x, ref uint overflows); int pow(int x, int y, ref uint overflows); uint pow(uint x, uint y, ref uint overflows); long pow(long x, int y, ref uint overflows); ulong pow(ulong x, uint y, ref uint overflows); } Thoughts? AndreiI have a save exponentiation function (+ two helper functions that may also be useful on their own): /// add a property to numeric types that can be used as return value if a result is out of bounds template invalid(T) if(isNumeric!T) { static if(isFloatingPoint!T) property enum invalid = T.init; else static if(isSigned!T) property enum invalid = T.min; // 0x80..00 else // unsigned property enum invalid = T.max; // 0xFF..FF } /// calculate the maximal power of the value that would fit in an ucent /// for smaller types simply shift down the result /// (e.g. divide by 4 to calc the maxpower that would fit in an uint) ubyte maxpow(const ulong a) pure safe nogc nothrow { assert(a>1); // no useful maxpower exists for 0 and 1 static immutable ubyte[55] mp = [ 127, 80, 63, 55, 49, 45, 43, 40, 38, 37, 35, 34, 33, 32, 31, 31, 30, 30, 29, 29, 28, 28, 27, 27, 27, 26, 26, 26, 26, 25, 25, 25, 25, 24, 24, 24, 24, 24, 24, 23, 23, 23, 23, 23, 23, 23, 22, 22, 22, 22, 22, 22, 22, 22, 22 ]; return (a<139) ? (a<57) ? mp[cast(ubyte)a-2] : ((a<85) ? ((a<69) ? 21 : 20) : ((a<107) ? 19 : 18)) : ((a<7132) ? ((a<566) ? ((a<256) ? ((a<185) ? 17 : 16) : ((a<371) ? 15 : 14)) : ((a<1626) ? ((a<921) ? 13 : 12) : ((a<3184) ? 11 : 10))) : ((a<2642246) ? ((a<65536) ? ((a<19113) ? 9 : 8) : ((a<319558) ? 7 : 6)) : ((a<4294967296) ? ((a<50859009) ? 5 : 4) : ((a<6981463658332) ? 3 : 2)))); } /// exponentiation without overflow /// return invalid if overflow would occur. T safePow(T)(const(T) base, const ubyte exp) pure safe nogc nothrow if(isIntegral!T) { static if(isUnsigned!T) { if(!exp) return 1; // x^^0 is always 1 if(exp == 1 || base < 2) return base; // x^^1 = x, 1^^n = 1 and 0^^n = 0, so do nothing static if(T.sizeof > ulong.sizeof) { if(base > ulong.max) return invalid!T; } if(exp > (maxpow(base)>>(5-bitlen(T.sizeof)))) return invalid!T; return base ^^ exp; // calc only if no overflow for sure (very efficient) } else { auto r = safePow(abs(base), exp); // sometimes calc even if result won't fit if(r > T.max) return invalid!T; // but at least we can easily detect if it happened return (base < 0 && odd(exp)) ? -cast(T)r : cast(T)r; } } unittest { void test(T)() { T r; foreach(base; T.min..T.max+1) { foreach(exp; 0..256) { r = safePow(base, exp); if(r == invalid!T) { assert(base^^exp > T.max, "max wrong: "+T.stringOf()); break; } assert(r == base^^exp, "calc wrong"); } } } test!byte; test!ubyte; test!short; test!ushort; /+ test!int; test!uint; test!long; test!ulong; +/ }
Jun 24 2016
On Friday, 24 June 2016 at 17:25:32 UTC, Dominikus Dittes Scherkl wrote:I have a save exponentiation function (+ two helper functions that may also be useful on their own):Ah, and I use a better abs() function: /// get the absolute value of x as unsigned type. always succeeds, even for T.min Unsigned!T abs(T)(const(T) x) pure safe nogc nothrow if(isIntegral!T) { static if(isSigned!T) if(x < 0) return -x; return x; } unittest { byte a = -128; auto b = abs(a); assert(is(typeof(b) == ubyte)); assert(b == 128); }
Jun 24 2016
On 6/24/2016 9:56 AM, Andrei Alexandrescu wrote:1. No need for different names (addu/adds) etc., overloading should take care of it (simplifies caller code). Right now the API is an odd mix of overloading and distinct names.The reason for the different names is: 1. Integral promotion rules will affect which overload is selected, when the argument type is not an exact match. Few programmers have an exact understanding of integral promotion rules. 2. Things become further obfuscated when type aliases are used. 3. Mixing signed and unsigned integer types calls which overload? 4. People who use core.checkedint want pedantically correct code. Being absolutely sure what operation is being done is part of that. Using different names makes it trivially inspectable. 5. It is a design mistake to make overloads with the same name have different implementations. Arguably, in this context, a signed overflow is different from an unsigned overflow, and so should properly be a different name.2. The overflow flag should be an integral, not a bool, so you get to counthow many overflows there were. This is minor but has no extra cost on the happy case. I see no value in this beyond amusement value, and a potential bug: 1. One operation in a sequence that overflows will make subsequent operations meaningless, and will likely even cause more overflows. Sticky error flags are used in floating point, and I've never heard any desire for nor use case for counts. 2. The overflow count could conceivably itself overflow, thereby masking the failure in rare cases.3. There should be an exponentiation function.Agreed.
Jun 24 2016
On 06/24/2016 06:23 PM, Walter Bright wrote:On 6/24/2016 9:56 AM, Andrei Alexandrescu wrote:These functions are not supposed to be used naïvely.1. No need for different names (addu/adds) etc., overloading should take care of it (simplifies caller code). Right now the API is an odd mix of overloading and distinct names.The reason for the different names is: 1. Integral promotion rules will affect which overload is selected, when the argument type is not an exact match. Few programmers have an exact understanding of integral promotion rules.2. Things become further obfuscated when type aliases are used.These functions are not supposed to be used naïvely.3. Mixing signed and unsigned integer types calls which overload?These functions are not supposed to be used naïvely.4. People who use core.checkedint want pedantically correct code. Being absolutely sure what operation is being done is part of that. Using different names makes it trivially inspectable.These functions are not supposed to be used naïvely.5. It is a design mistake to make overloads with the same name have different implementations. Arguably, in this context, a signed overflow is different from an unsigned overflow, and so should properly be a different name.Wait, what? std.conv.to much?> 2. The overflow flag should be an integral, not a bool, so you get to count how many overflows there were. This is minor but has no extra cost on the happy case. I see no value in this beyond amusement value, and a potential bug: 1. One operation in a sequence that overflows will make subsequent operations meaningless, and will likely even cause more overflows. Sticky error flags are used in floating point, and I've never heard any desire for nor use case for counts.OK.2. The overflow count could conceivably itself overflow, thereby masking the failure in rare cases.Touché.> 3. There should be an exponentiation function. Agreed.Yay. Andrei
Jun 24 2016
On 6/24/2016 3:29 PM, Andrei Alexandrescu wrote:These functions are not supposed to be used naïvely.They are also meant to be inspectable. I want to look at it and trivially know what I am getting. Deep knowledge of subtle cases in the language (and integral promotions satisfy that) should not be necessary *especially* for this set of functions. People get into enough trouble over misunderstanding whether they are using signed or unsigned arithmetic as it is. I've used core.checkedint here and there, and have been well satisfied with the existing interface. int add(int x, int y, ref uint overflows); uint add(uint x, uint y, ref uint overflows); long add(long x, int y, ref uint overflows); ulong add(ulong x, uint y, ref uint overflows); ubyte i; uint u; uint overflow; add(i, u, overflow); Quickly, without trying it out, tell me which one it calls. No cheating!
Jun 24 2016
On 06/24/2016 06:56 PM, Walter Bright wrote:On 6/24/2016 3:29 PM, Andrei Alexandrescu wrote:Then the right API is add(x, y, overflow). I look at it and I trivially know what I am getting. -- AndreiThese functions are not supposed to be used naïvely.They are also meant to be inspectable. I want to look at it and trivially know what I am getting.
Jun 24 2016
To bring a different angle, the best front-end for arithmetic operations that may overflow is: typeof(lhs + rhs) add(L, R)(const L lhs, const R rhs, ref bool overflow) if (isIntegral!L && isIntegral!R); The overflow bit is set if and only if the mathematical result is not the same as the concrete result. Interestingly, things like add(5u, -1) should succeed without overflow (returning 4u) even though the negative value is conceptually converted to a large positive number and the operation overflows. (I've implemented this behavior in the DbI checkedint.) One point that escaped me initially is that sometimes there's no need for any checks, e.g. adding any two types smaller than int doesn't need to be checked. I'm not sure whether this function belongs in core.checkedint, but that's the right front end design for the primitives in there. Andrei
Jun 24 2016
On 6/24/2016 3:55 PM, Andrei Alexandrescu wrote:Interestingly, things like add(5u, -1) should succeed without overflow (returning 4u) even though the negative value is conceptually converted to a large positive number and the operation overflows. (I've implemented this behavior in the DbI checkedint.)I believe adding such behavior is beyond the charter of checkedint. I understand the charter is to check for all overflow, undefined and implementation defined behaviors, and not go beyond that. Pedantically, -1 is an int. But (unsigned op signed) converts the latter to unsigned, becoming (unsigned op unsigned), and so an overflow occurs.
Jun 24 2016
On 06/24/2016 07:56 PM, Walter Bright wrote:I believe adding such behavior is beyond the charter of checkedint.Which checkedint? The one in core? -- Andrei
Jun 24 2016
On 6/24/2016 4:59 PM, Andrei Alexandrescu wrote:On 06/24/2016 07:56 PM, Walter Bright wrote:Your Checked!int. Sorry for the confusion.I believe adding such behavior is beyond the charter of checkedint.Which checkedint? The one in core? -- Andrei
Jun 24 2016
On 06/24/2016 08:11 PM, Walter Bright wrote:On 6/24/2016 4:59 PM, Andrei Alexandrescu wrote:Though you should be able to implement that via a hook, I think at this point we need to agree to disagree. The relevant code is at https://gist.github.com/andralex/a0c0ad32704e6ba66e458ac48add4a99#file-checked-d-L88: static if (!isUnsigned!L) { if (lhs < 0) { return subu(Result(rhs), Result(negs(lhs, overflow)), overflow); } } With your suggestion, this would also be an overflow: long x = -1; auto y = array.length + x; I would be hard pressed to acknowledge that as an overflow that needs to be dynamically signaled. And the beauty of two's complement is that indeed it just works. AndreiOn 06/24/2016 07:56 PM, Walter Bright wrote:Your Checked!int. Sorry for the confusion.I believe adding such behavior is beyond the charter of checkedint.Which checkedint? The one in core? -- Andrei
Jun 24 2016
On 06/24/2016 09:42 PM, Andrei Alexandrescu wrote:long x = -1; auto y = array.length + x; I would be hard pressed to acknowledge that as an overflow that needs to be dynamically signaled. And the beauty of two's complement is that indeed it just works.To clarify: if array.length is 0, then indeed that should be signaled as an error. But otherwise it should just go through, no overflow. -- Andrei
Jun 24 2016
On Saturday, 25 June 2016 at 01:49:22 UTC, Andrei Alexandrescu wrote:On 06/24/2016 09:42 PM, Andrei Alexandrescu wrote:I'm curious how one attempts to handle overflow detection in a user-level implementation. It seems to me that such calculations are much better handled at the hardware level -- and that means, in the compiler. To wit, I'm looking at some old processor manuals I happen to have sitting around, since they tend to be quite descriptive about such details. My pdp-11 processor handbook says about the V (overflow) and C (carry) status bits in an ADD instruction: V: set if there was arithmetic overflow as a result of the operation; that is both operands were of the same sign and the result was of the opposite sign; cleared otherwise C: set if there was a carry from the most significant bit of the result; cleared otherwise I guess this is just assuming both operands are signed. It doesn't talk about a separate ADDU (add unsigned) instruction, so I guess you'd have to just use ADD and interpret the flags appropriately for that situation. The point is, there are two separate flags involved, and they need to be thought about separately and in detail in such contexts. In the MC68020 32-Bit Microprocessor User's Manual, Appendix A ("Condition Codes Computation") is really nice because it is precise and goes right down to the bottom-most detail, listing the logical equations used to set such flags. For instance, for the ADD instruction, the V and C flags are set according to these logical equations: V = (Sm and Dm and (not Rm)) or ((not Sm) and (not Dm) and Rm) C = (Sm and Dm) or ((not Rm) and Dm) or (Sm and (not Rm)) where: Sm = source (first operand) most significant bit Dm = destination (second operand) most significant bit Rm = result most significant bit (I've added parentheses to what is presented in the book, assuming that logical AND is of higher precedence than logical OR.) I'm assuming that once again, these flags are being set on the assumption that both operands are signed, as I see no separate ADDU instruction. My points are: * correct overflow detection involves an awful lot of bit-fiddling * the V (overflow) bit is *not* the same as the C (carry) bit that everyone first thinks about when considering whether or not a simple add calculation overflowed I had to wonder, how is this handled at the user level in a checkedint implementation? Do you perform a lot of data masking and testing in user-level code? Or do you depend on accessing the machine's hardware flags register and hoping it hasn't changed in between when you executed the compiled ADD instruction and when you attempt to access it from high-level code? I just checked the implementations of adds() and addu() here: https://github.com/dlang/druntime/blob/master/src/core/checkedint.d They don't seem to have quite the complexity of the logical equations I just showed. Not that they're necessarily incorrect as they stand; I'm just curious.long x = -1; auto y = array.length + x; I would be hard pressed to acknowledge that as an overflow that needs to be dynamically signaled. And the beauty of two's complement is that indeed it just works.To clarify: if array.length is 0, then indeed that should be signaled as an error. But otherwise it should just go through, no overflow. -- Andrei
Jun 25 2016
On 6/24/2016 6:42 PM, Andrei Alexandrescu wrote:With your suggestion, this would also be an overflow: long x = -1; auto y = array.length + x; I would be hard pressed to acknowledge that as an overflow that needs to be dynamically signaled. And the beauty of two's complement is that indeed it just works.I'm curious if that produces an overflow error on the newer clang that instruments such things.
Jun 24 2016
On 6/24/2016 6:42 PM, Andrei Alexandrescu wrote:With your suggestion, this would also be an overflow: long x = -1; auto y = array.length + x; I would be hard pressed to acknowledge that as an overflow that needs to be dynamically signaled. And the beauty of two's complement is that indeed it just works.That's a seductive test case. But I worry that mixed signed/unsigned arithmetic is not so simple. What about: x + array.length commutativity in general associativity Does this become a morass of special cases?
Jun 24 2016
On 06/24/2016 11:19 PM, Walter Bright wrote:On 6/24/2016 6:42 PM, Andrei Alexandrescu wrote:Doesn't seem that way (with some simplifying rules, associativity is left to right so not necessarily optimal), but commutativity works nicely, please take a close look at https://gist.github.com/andralex/a0c0ad32704e6ba66e458ac48add4a99 and destroy what you find unfit. And indeed UBSAN is a good baseline to keep an eye on. -- AndreiWith your suggestion, this would also be an overflow: long x = -1; auto y = array.length + x; I would be hard pressed to acknowledge that as an overflow that needs to be dynamically signaled. And the beauty of two's complement is that indeed it just works.That's a seductive test case. But I worry that mixed signed/unsigned arithmetic is not so simple. What about: x + array.length commutativity in general associativity Does this become a morass of special cases?
Jun 24 2016