digitalmars.D - Casts, overflows and demonstrations
- bearophile, Jun 05 2012
This is a reduced part of some D code:

import std.bigint, std.conv, std.algorithm, std.range;

void foo(BigInt number)
in {
    assert(number >= 0);
} body {
    ubyte[] digits = text(number + 1)
                     .retro()
                     .map!(c => cast(ubyte)(c - '0'))()
                     .array();
    // ...
}

void main() {}

The important line of code adds one to 'number', converts it to a string, scans it starting from its end, and for each char (digit) finds its value by subtracting the ASCII value of '0', then casts the result to ubyte. Finally it converts the lazy range to an array, an ubyte[].

The cast in the D code is needed because 'c' is a char. If you subtract '0' from a char, in D the result is an int, and D doesn't allow assigning that int to an ubyte, to avoid losing information (I guess the compiler performs range analysis on the expression, so it knows the result can also be negative). A minimal standalone example of this rule is shown near the end of this post.

Casts are dangerous, so it's better to avoid them where possible. A cast looks kind of safe because you usually know what you are doing while you program. But when you later change other parts of the code, the cast keeps being silent, and maybe it's no longer casting from the type you think it is. Maybe that kind of bug is avoided by a templated function like this, which makes both the from and to types explicit (it doesn't compile if the from type is wrong) (this code is not fully correct, the __traits test is not working well):

template Cast(From, To) if (__traits(compiles, cast(To)From.init)) {
    To Cast(T)(T x) if (is(T == From)) {
        return cast(To)x;
    }
}

void main() {
    int x = -100;
    ubyte y = Cast!(int, ubyte)(x);
    string s = "123";
    int y2 = Cast!(string, int)(s);
}

The following code is similar, but to!() performs a run-time test that makes sure the subtraction result is representable inside an ubyte, otherwise it throws an exception:

ubyte[] digits = text(number + 1)
                 .retro()
                 .map!(c => to!ubyte(c - '0'))()
                 .array();

That code is safer than the cast, but it performs a run-time test for each digit, and this is not good. In theory a smarter compiler (working on good enough code) is able to do better: text() calls a BigInt method that returns the textual representation of the value in base ten (today such a method is toString(), but maybe this situation will change and improve). BigInt.toString() could have a post-condition like this:

string toString()
out(result) {
    size_t start = 0;
    if (this < 0) {
        assert(result[0] == '-');
        start = 1;
    }
    foreach (digit; result[start .. $])
        assert(digit >= '0' && digit <= '9');
    // If you want you can also assert that the first
    // digit is zero only if the bigint value is zero.
} body {
    // ...
}

Given that information, plus the foo() pre-condition in { assert(number >= 0); }, a smart compiler is able to infer (or to ask the programmer to demonstrate) that text() returns an array of just ['0', ..., '9'] chars, and that retro() doesn't change the contents of the range, so if you subtract '0' from each of them you get a number in [0, ..., 9] that is always representable in an ubyte. So no cast is needed. A hand-written approximation of this idea is sketched at the end of this post.

Now and then I take a look at the ongoing development and refinement of the "Modern Eiffel" language (it's a kind of Eiffel2, see http://tecomp.sourceforge.net/index.php?file=doc/papers/lang/modern_eiffel.txt ), which is supposed to be able (or to become able) to perform those inferences (or to use them if the programmer has demonstrated them), so I think it will be able to spare both that cast and the run-time tests on each char, avoiding overflow bugs. According to Bertrand Meyer and others, in 20 years similar things are going to become a part of the normal programming experience.
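To make the char-to-int conversion rule above concrete, here is a minimal standalone example (mine, it's not part of the reduced code; the variable names are only for illustration):

void main() {
    char c = '7';
    int v = c - '0';                // fine: char - char is promoted to int
    //ubyte a = c - '0';            // error: the result can be negative (when c < '0')
    ubyte b = cast(ubyte)(c - '0'); // compiles, but silently gives a wrong value if c is not a digit
}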
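And here is a hand-written approximation of the inference I'd like the compiler to perform (just a sketch of the idea; digitsOf is a name I've made up, and the single assert takes the place of what the compiler should deduce from the contracts): check once that the whole string returned by text(number + 1) contains only decimal digits, and then the cast on each char needs no per-digit run-time test:

import std.algorithm, std.array, std.ascii, std.bigint, std.conv, std.range;

ubyte[] digitsOf(BigInt number)
in {
    assert(number >= 0);
} body {
    auto s = text(number + 1);
    assert(s.all!isDigit);  // one run-time test for the whole string
    return s.retro()
            .map!(c => cast(ubyte)(c - '0'))()
            .array();
}

void main() {
    assert(equal(digitsOf(BigInt(41)), [2, 4]));
}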
Bye, bearophile