digitalmars.D - Casts and some suggestions to avoid them
- bearophile (159/159) Apr 08 2014 In D (and other languages) casts are dangerous because often they
- bearophile (4/9) Apr 08 2014 https://d.puremagic.com/issues/show_bug.cgi?id=12548
- H. S. Teoh (56/121) Apr 08 2014 [...]
- bearophile (17/26) Apr 08 2014 Example: in a nothrow function. Unless you catch the exception
- bearophile (7/13) Apr 08 2014 Better to avoid magic constants, you can forget one F or
- Colden Cullen (5/5) Apr 08 2014 One issue I've had huge amounts of trouble with is casting to and
- Marco Leise (34/40) Apr 09 2014 Can you explain what level of atomicity you expect?
- Colden Cullen (10/48) Apr 15 2014 I was under the impression that casting away from shared was bad
- Rikki Cattermole (6/11) Apr 09 2014 ...
- bearophile (12/15) Apr 09 2014 In my post I have not shown examples of the casts for the this
- Meta (3/18) Apr 09 2014 I forgot that nested AAs were even possible. I was thinking about
In D (and other languages) casts are dangerous because they often punch holes in the type system, and they shut up the compiler, so nothing catches your mistakes. And even if you write correct code the first time, you can later change some types in your code and introduce an incongruity that the casts will not complain about.

Phobos and D help avoid casts in several ways: value range analysis, the new double(x) syntax, functions and templates like std.conv.signed and std.traits.Signed, the powerful converter to!, using "cast()" to convert to mutable without writing the type, using strongly pure functions to convert mutable results to immutable implicitly, using CTFE to initialize immutable data, using Unqual!, using std.string.representation, using std.exception.assumeUnique, etc.

D and Phobos are doing a lot to avoid the need to cast, but perhaps more can be done. I've gathered some statistics on about 208 casts in code I have written. The relative frequency of the various casts depends on the kind of D code you write: whether you do a lot of OOP with dynamic casts, or a lot of low-level programming (or a lot of interfacing with C code), which often requires some casts.

Besides the usage frequencies, I also show some examples of each kind, and some ideas to reduce the need to cast, usually with Phobos code.

- - - - - - - -

Of those casts, about 73 are conversions from a floating point value to an integral value, like:

    cast(uint)(x * 1.75)
    cast(int)sqrt(real(ns))

In some cases you can use the to! template instead of a cast.

- - - - - - - -

About 20 casts are conversions from a floating point value returned by floor/round/ceil to an integral value, like:

    cast(ubyte)round(x)
    cast(int)floor(y)

At first, looking at std.math, I was a bit puzzled by those functions returning a floating point value. 99% of the time I need to cast their result to an integral value. But what integral type?
So I think I'd like those functions (or similar functions) to accept a template type argument that specifies the result type I want:

    round!ubyte(x)
    floor!int(y)

- - - - - - - -

About 20 casts are for the return value of malloc/calloc/realloc/alloca, like:

    cast(ubyte*)alloca(ubyte.sizeof * x);
    cast(T*)malloc(T.sizeof * 10);

A set of three little wrappers around those functions in Phobos can remove those casts (this can't be done with alloca), and they are safer than the raw C functions:

    cMalloc!T(n)
    cCalloc!T(n)
    cRealloc(ptr, n)

- - - - - - - -

About 14 are reinterpret casts, sometimes to see a uint as a sequence of ubytes, array casts, etc:

    cast(ubyte*)&x;
    cast(ubyte[4]*)&data;
    cast(uint[])text.to!(dchar[])
    cast(ubyte[3])[x % 256, y % 256, x % 256]

- - - - - - - -

About 8 casts do the opposite of std.string.representation, so they stand in for a missing "unrepresentation" function. See:
https://d.puremagic.com/issues/show_bug.cgi?id=10162

With such a function in Phobos, all or most of these casts would not be needed.

- - - - - - - -

About 7 are caused by feqrel, which requires mutable arguments:

    const double x, y;
    feqrel(cast()x, cast()y)

I presume this is just a Phobos bug, so such casts can eventually be removed.
https://d.puremagic.com/issues/show_bug.cgi?id=6586

- - - - - - - -

About 6 casts are used to convert an array of enums to an array of the underlying type, like:

    enum C : char { A='a', B='b' }
    C[50] arr;
    cast(char[])arr

Keeping 'arr' as an array of C is handy for safety or for other reasons, but perhaps you need to print arr compactly or you need the char[] for other reasons. I think you can't use to! in this case.
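
As an illustration of the allocation wrappers suggested in this part of the post, here is a minimal sketch of what a cMalloc!T could look like. The name comes from the post; the signature, the slice return type, and the overflow check are assumptions made for this example, and nothing like it is claimed to exist in Phobos.

```d
import core.stdc.stdlib : free, malloc;

/// Hypothetical typed wrapper around C malloc (an assumption for this
/// sketch, not a Phobos function). Returns a slice of n uninitialized
/// T values, or null if n is 0, the size overflows, or malloc fails.
T[] cMalloc(T)(size_t n) nothrow
{
    // Refuse requests whose byte count would overflow size_t.
    if (n == 0 || n > size_t.max / T.sizeof)
        return null;
    auto p = cast(T*) malloc(T.sizeof * n); // the only cast lives here
    return p is null ? null : p[0 .. n];
}

void main()
{
    int[] buf = cMalloc!int(10);
    if (buf is null)
        return;                   // allocation failed
    scope (exit) free(buf.ptr);   // the memory belongs to the C heap
    buf[] = 0;                    // malloc does not initialize
    buf[3] = 42;
    assert(buf.length == 10 && buf[3] == 42);
}
```

Returning a slice rather than a raw pointer keeps the length next to the memory, so later indexing is bounds-checked; a pointer-returning variant would be closer to the C function but less safe.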

- - - - - - - -

About 5 casts are used to convert the result of std.file.read to a usable array type (because in some cases readText is not the right function to use), like:

    cast(char[])"data1.txt".read
    cast(ubyte[])"data2.txt".read

The cast can be avoided with a similar function that accepts a template type (there are perhaps ways to do this with functions already present in Phobos; suggestions are welcome):

    read!(char[])("data1.txt")

- - - - - - - -

About 4 casts are needed because the D compiler misses some "obvious" value range propagations, like:

    void foo(immutable ulong x) {
        if (x <= uint.max)
            uint y = x;
        char['z' - 'a' + 1] arr;
        foreach (immutable i, ref c; arr)
            c = 'a' + i;
    }

    struct Foo {
        immutable char c;
        this(in int c_)
        in {
            assert(c_ >= '0' && c_ <= '9');
        } body {
            this.c = c_;
        }
    }

See:
https://d.puremagic.com/issues/show_bug.cgi?id=9570
https://d.puremagic.com/issues/show_bug.cgi?id=10594
https://d.puremagic.com/issues/show_bug.cgi?id=10685
https://d.puremagic.com/issues/show_bug.cgi?id=12514

- - - - - - - -

About 4 casts are used by hex strings, like:

    ubyte[] data = cast(ubyte[])x"00 11 22 33 AB";

I think hex strings should be implicitly castable to ubyte[], avoiding the need for a cast, or if you don't like implicit casts then I think they should be of type ubyte[], because in about 100% of the cases I don't want a char[]. There are many cases of such useless casts in Phobos:
https://d.puremagic.com/issues/show_bug.cgi?id=10453

- - - - - - - -

In about 4 cases I have used a cast to take part of a number, like taking the lower 32 bits of a ulong, and so on. In some cases you can remove such casts using a union (like a union of one ulong and a uint[2]).
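
The typed read function asked for earlier in this part of the post can be approximated in user code. The sketch below is only an illustration: readAs is a hypothetical name, not a Phobos function, and it simply confines the cast to one place instead of eliminating it.

```d
import std.file : read, remove, tempDir, write;
import std.path : buildPath;
import std.traits : isDynamicArray;

/// Hypothetical helper (an assumption for this sketch): reads a whole
/// file and returns its bytes viewed as the array type T.
T readAs(T)(string path)
if (isDynamicArray!T)
{
    // std.file.read returns void[]. Casting to an array whose element
    // size does not divide the byte length fails at run time.
    return cast(T) read(path);
}

void main()
{
    string path = buildPath(tempDir(), "readAs_example.txt");
    write(path, "abc");
    scope (exit) remove(path);

    char[] text = readAs!(char[])(path);
    ubyte[] raw = readAs!(ubyte[])(path);
    ubyte[] expected = [0x61, 0x62, 0x63];
    assert(text == "abc");
    assert(raw == expected);
}
```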

- - - - - - - -

In 2 cases I have used a cast because, although array concatenation generates a new array, if you concatenate two const/immutable arrays the result can't be mutable (and I needed a mutable result):

    void main() {
        const char[] a, b;
        char[] c = a ~ b;
        char d;
        char[] e = a ~ d;
    }

This is an old issue:
https://d.puremagic.com/issues/show_bug.cgi?id=1654

- - - - - - - -

In 2 cases I have had to cast an array length to uint so that the code compiles on both 32 and 64 bit systems when assigning that length to some uint value.

- - - - - - - -

In 1 case I've had to use a dynamic cast on class instances. In theory you can add to Phobos specialized upcasts, downcasts, etc, that are more explicit and safer.

- - - - - - - -

I have also counted about 38 unsorted casts that don't easily fit in the preceding categories. They are so varied that it's not easy to find ways to avoid them.

Bye,
bearophile
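
For the concatenation case in this post, a cast-free workaround is possible at the cost of writing the copy by hand. This is a minimal sketch; concatMutable is a hypothetical helper name introduced for the example.

```d
/// Hypothetical helper (an assumption for this sketch): concatenates two
/// const char arrays into a fresh mutable array, without a cast and with
/// a single allocation.
char[] concatMutable(const char[] a, const char[] b)
{
    auto result = new char[a.length + b.length];
    result[0 .. a.length] = a[];   // element-wise copy
    result[a.length .. $] = b[];
    return result;
}

void main()
{
    const char[] a = "foo", b = "bar";

    char[] c = concatMutable(a, b);
    c[0] = 'F';                    // the result is mutable
    assert(c == "Foobar");

    char[] d = (a ~ b).dup;        // shorter, but allocates twice
    assert(d == "foobar");
}
```

The .dup form is the more compact of the two, but it builds the concatenated array and then copies it again.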
Apr 08 2014
> round!ubyte(x)
> floor!int(y)

https://d.puremagic.com/issues/show_bug.cgi?id=12547

> cMalloc!T(n)
> cCalloc!T(n)
> cRealloc(ptr, n)

https://d.puremagic.com/issues/show_bug.cgi?id=12548

Bye,
bearophile
Apr 08 2014
On Tue, Apr 08, 2014 at 06:38:46PM +0000, bearophile wrote:
[...]
> I've gathered some statistics on about 208 casts in code I have written.
[...]
> Of those casts, about 73 are conversions from a floating point value to an integral value, like:
>
>     cast(uint)(x * 1.75)
>     cast(int)sqrt(real(ns))
>
> In some cases you can use the to! template instead of a cast.

Which cases don't work? My impression is that to! should be preferred to casts in this case, because it will actually check runtime value ranges and throw an exception if, say, the float exceeds the range of int. Using a cast will silently ignore overflowed values, leading to hard-to-find bugs.

[...]
> About 20 casts are for the return value of malloc/calloc/realloc/alloca, like:
>
>     cast(ubyte*)alloca(ubyte.sizeof * x);
>     cast(T*)malloc(T.sizeof * 10);
>
> A set of three little wrappers around those functions in Phobos can remove those casts (this can't be done with alloca), and they are safer than the raw C functions:
>
>     cMalloc!T(n)
>     cCalloc!T(n)
>     cRealloc(ptr, n)

This issue will (hopefully?) be addressed when Andrei finalizes his allocators, perhaps?

[...]
> About 14 are reinterpret casts, sometimes to see a uint as a sequence of ubytes, array casts, etc:
>
>     cast(ubyte*)&x;
>     cast(ubyte[4]*)&data;
>     cast(uint[])text.to!(dchar[])
>     cast(ubyte[3])[x % 256, y % 256, x % 256]

Reinterpret casts are probably irreplaceable, because often they are used when you want to directly access the raw representation of some piece of data (e.g., to transmit a struct over the network, or serialize it to file, etc.). D does give some useful tools to do this with minimal risks (e.g., .sizeof), but still, this kind of cast is inherently dangerous and prone to breakage when you redefine your types.

[...]
> About 6 casts are used to convert an array of enums to an array of the underlying type, like:
>
>     enum C : char { A='a', B='b' }
>     C[50] arr;
>     cast(char[])arr
>
> Keeping 'arr' as an array of C is handy for safety or for other reasons, but perhaps you need to print arr compactly or you need the char[] for other reasons. I think you can't use to! in this case.

I think to! can probably be extended to perform this conversion.

> About 5 casts are used to convert the result of std.file.read to a usable array type (because in some cases readText is not the right function to use), like:
>
>     cast(char[])"data1.txt".read
>     cast(ubyte[])"data2.txt".read
>
> The cast can be avoided with a similar function that accepts a template type (there are perhaps ways to do this with functions already present in Phobos; suggestions are welcome):
>
>     read!(char[])("data1.txt")

Agreed.

[...]
> About 4 casts are used by hex strings, like:
>
>     ubyte[] data = cast(ubyte[])x"00 11 22 33 AB";
>
> I think hex strings should be implicitly castable to ubyte[], avoiding the need for a cast, or if you don't like implicit casts then I think they should be of type ubyte[], because in about 100% of the cases I don't want a char[].

Agreed, I can't think of any common use case where you'd want a hex string to be char[] instead of ubyte[]. The only case I can think of (which is not common at all) is when you want to explicitly construct test cases for UTF strings with specific code point sequences (e.g., invalid sequences to test UTF error-catching code).

[...]
> In about 4 cases I have used a cast to take part of a number, like taking the lower 32 bits of a ulong, and so on. In some cases you can remove such casts using a union (like a union of one ulong and a uint[2]).

Using a union here is not a good idea, because the results depend on the endianness of the machine! It's better to just use (a & 0xFFFF) or (a >> 16) instead.

[...]
> In 2 cases I have had to cast an array length to uint so that the code compiles on both 32 and 64 bit systems when assigning that length to some uint value.

This is inherently unsafe, since it risks silent truncation of very large arrays. Admittedly, that's unlikely on a 32-bit machine, but still... I think a cast is justified here (as a warning sign that the code may have fragile behaviour -- e.g., while running on a 64-bit machine).

[...]
> In 1 case I've had to use a dynamic cast on class instances. In theory you can add to Phobos specialized upcasts, downcasts, etc, that are more explicit and safer.

In OO, explicit downcasting is usually frowned upon as the sign of bad design (due to the Liskov Substitution Principle). Nevertheless, AFAIK, downcasting in D is actually safe:

    BaseClass b;
    auto d = cast(DerivedClass) b;
    if (d is null) {
        // b was not an instance of DerivedClass
    } else {
        // d is safe to use
    }

So I don't think this case counts. The cast operator was explicitly designed to handle this case (among other cases).

T

--
If creativity is stifled by rigid discipline, then it is not true creativity.
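
The downcast pattern described at the end of this reply can be made runnable as follows. This is a minimal sketch; the class names Shape, Circle and Square and the function radiusOf are hypothetical, chosen only for the example.

```d
// Hypothetical class hierarchy, for illustration only.
class Shape { }
class Circle : Shape { double radius = 1.0; }
class Square : Shape { double side = 2.0; }

/// Returns the radius if s is a Circle, or -1 otherwise.
double radiusOf(Shape s)
{
    // A class downcast is checked at run time: it yields null when s is
    // not a Circle (or when s itself is null), so c is only used when valid.
    if (auto c = cast(Circle) s)
        return c.radius;
    return -1;
}

void main()
{
    assert(radiusOf(new Circle) == 1.0);
    assert(radiusOf(new Square) == -1);
    assert(radiusOf(null) == -1);
}
```

Declaring the variable inside the if condition keeps the downcast result scoped to the branch where it is known to be non-null.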
Apr 08 2014
H. S. Teoh:

> Which cases don't work?

Example: in a nothrow function, unless you catch the exception locally. To solve this I have proposed in Bugzilla a nothrow function maybeTo that returns a Nullable!T:
https://d.puremagic.com/issues/show_bug.cgi?id=6840

Also, a cast is faster and lighter than to!, so in some cases it's needed.

> This issue will (hopefully?) be addressed when Andrei finalizes his allocators, perhaps?

Andrei's allocators are very nice, and they help, but I think they can't replace the C allocation functions in every case.

> I think to! can probably be extended to perform this conversion.

It's not so simple, there are some constraints.

> Using a union here is not a good idea, because the results depend on the endianness of the machine! It's better to just use (a & 0xFFFF) or (a >> 16) instead.

Right.

> This is inherently unsafe, since it risks silent truncation of very large arrays.

In some cases you can assume you don't have huge arrays. And you can even check that the length is not huge before the cast or inside the function precondition.

Bye,
bearophile
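
To make the nothrow point at the top of this reply concrete, here is a minimal sketch of a maybeTo-style helper. The name maybeTo and the Nullable!T return type come from the post; the exact signature and the catch-everything policy are assumptions of this example, not the proposal in the linked issue.

```d
import std.conv : to;
import std.typecons : Nullable;

/// Sketch of a nothrow conversion: returns the converted value, or an
/// empty Nullable when to! throws (overflow, unparsable text, ...).
Nullable!T maybeTo(T, S)(S value) nothrow
{
    try
    {
        return Nullable!T(to!T(value));
    }
    catch (Exception)
    {
        return Nullable!T.init;   // empty: isNull is true
    }
}

void main() nothrow
{
    // to! truncates toward zero, like a cast, when the value fits.
    auto ok = maybeTo!int(3.7);
    assert(!ok.isNull && ok.get == 3);

    // Out of range for int: to! would throw, the wrapper reports "no value".
    assert(maybeTo!int(1e10).isNull);
    assert(maybeTo!ubyte(300).isNull);

    // cast(int) 1e10 would compile too, but silently give a meaningless value.
}
```

The try/catch still has a run-time cost, so this does not answer the other objection in the reply, that a plain cast is faster and lighter than to!.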
Apr 08 2014
H. S. Teoh:

>> In some cases you can remove such casts using a union (like a union of one ulong and a uint[2]).
>
> Using a union here is not a good idea, because the results depend on the endianness of the machine! It's better to just use (a & 0xFFFF) or (a >> 16) instead.

Better to avoid magic constants, you can forget one F or something. In this case you have to use 0xFFFF_FFFFu. This is safer and more readable:

    a & uint.max

Bye,
bearophile
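
The mask-based extraction discussed in this exchange can be sketched as below, next to the union approach for comparison. The function names lowHalf and highHalf and the union Halves are hypothetical; the sketch assumes the compiler's value range propagation accepts the masked expressions as uint, which is what the reply above relies on.

```d
/// Lower 32 bits of a 64-bit value, with no cast and no hand-typed constant.
uint lowHalf(ulong a)
{
    // The mask bounds the result by uint.max, so the implicit
    // conversion from ulong to uint is accepted.
    return a & uint.max;
}

/// Upper 32 bits of a 64-bit value.
uint highHalf(ulong a)
{
    return (a >> 32) & uint.max;
}

/// The union alternative: which element holds the low half depends on
/// the byte order of the machine.
union Halves
{
    ulong whole;
    uint[2] parts;
}

void main()
{
    enum ulong value = 0x1122_3344_5566_7788;
    assert(lowHalf(value) == 0x5566_7788);
    assert(highHalf(value) == 0x1122_3344);

    Halves h;
    h.whole = value;
    version (LittleEndian)
        assert(h.parts[0] == 0x5566_7788);
    else
        assert(h.parts[0] == 0x1122_3344);
}
```

The shift-and-mask functions give the same answer on every platform, while the union needs the version check to be read correctly.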
Apr 08 2014
One issue I've had huge amounts of trouble with is casting to and from shared. The primary problem is that most of phobos doesn't handle shared values at all. If there was some inout style thing but for shared/unshared instead of mutable/immutable/const that would be super helpful.
Apr 08 2014
On Tue, 08 Apr 2014 21:30:08 +0000, "Colden Cullen" <ColdenCullen gmail.com> wrote:

> One issue I've had huge amounts of trouble with is casting to and from shared. The primary problem is that most of phobos doesn't handle shared values at all. If there was some inout style thing but for shared/unshared instead of mutable/immutable/const that would be super helpful.

Can you explain what level of atomicity you expect?

1) what atomicity?
2) atomic operations on single instructions
3) the whole Phobos function should be atomic with respect to the shared values passed to it
4) some mutex in your "business logic" will make sure there are no race conditions

Shared currently does two things I know of (besides circumventing TLS):
- simply tag a variable as "multi-threaded" so you don't forget that fact
- the compiler will not reorder or cache access to it

So what would it add to Phobos if everything accepted shared? In particular, how would that improve thread-safety, which is the aim of marking things shared? It doesn't, because only the functions in core.atomic make sense to accept shared. The reason is simply that they are running a single instruction on a single shared operand and not a complete algorithm. Anything longer needs to be implemented with thought put into race conditions.

Example:

    x = min(a, b);

Say a == 1 and b == 2. The function would load a from memory into a CPU register, then some other thread changes a to 3, then the function compares the register content with b and returns 1, which is no longer correct at this point in time. It is not that it can never be what you want, but that min() alone cannot decide what is right for YOUR code.

So instead of passing shared values to generic algorithms, we only really need UNSHARED!

--
Marco
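
One way to read the closing remark of this post is that the caller first takes unshared copies with core.atomic and only then calls the generic algorithm. The following is a minimal sketch of that reading; the variable names sharedA and sharedB and the function snapshotMin are hypothetical.

```d
import core.atomic : atomicLoad, atomicStore;
import std.algorithm : min;

// Hypothetical shared state, for illustration only.
shared int sharedA = 1;
shared int sharedB = 2;

/// Copies both shared values into thread-local variables, then runs the
/// ordinary (unshared) min on the copies.
int snapshotMin()
{
    // Each load is atomic on its own. The pair is NOT a consistent
    // snapshot: another thread may write between the two loads. If both
    // values must belong together, the caller needs a lock around them.
    int a = atomicLoad(sharedA);
    int b = atomicLoad(sharedB);
    return min(a, b);
}

void main()
{
    assert(snapshotMin() == 1);
    atomicStore(sharedA, 3);
    assert(snapshotMin() == 2);
}
```

Here the decision about which race conditions are acceptable sits in the caller, not inside min, which matches the argument made in the post.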
Apr 09 2014
On Wednesday, 9 April 2014 at 11:27:24 UTC, Marco Leise wrote:
> On Tue, 08 Apr 2014 21:30:08 +0000, "Colden Cullen" <ColdenCullen gmail.com> wrote:
>
>> One issue I've had huge amounts of trouble with is casting to and from shared. The primary problem is that most of phobos doesn't handle shared values at all. If there was some inout style thing but for shared/unshared instead of mutable/immutable/const that would be super helpful.
>
> Can you explain what level of atomicity you expect?
>
> 1) what atomicity?
> 2) atomic operations on single instructions
> 3) the whole Phobos function should be atomic with respect to the shared values passed to it
> 4) some mutex in your "business logic" will make sure there are no race conditions
>
> Shared currently does two things I know of (besides circumventing TLS):
> - simply tag a variable as "multi-threaded" so you don't forget that fact
> - the compiler will not reorder or cache access to it
>
> So what would it add to Phobos if everything accepted shared? In particular, how would that improve thread-safety, which is the aim of marking things shared? It doesn't, because only the functions in core.atomic make sense to accept shared. The reason is simply that they are running a single instruction on a single shared operand and not a complete algorithm. Anything longer needs to be implemented with thought put into race conditions.
>
> Example:
>
>     x = min(a, b);
>
> Say a == 1 and b == 2. The function would load a from memory into a CPU register, then some other thread changes a to 3, then the function compares the register content with b and returns 1, which is no longer correct at this point in time. It is not that it can never be what you want, but that min() alone cannot decide what is right for YOUR code.
>
> So instead of passing shared values to generic algorithms, we only really need UNSHARED!

I was under the impression that casting away from shared was bad form. Is this not true?

I don't expect any atomicity (at least from the standard library). All locking should be done by the user. I just want to not have to cast away from shared whenever using the standard library.

I'm not asking for guaranteed atomicity, just something that says that this function may take a shared value.

I would like to reiterate that I think that having to cast away from shared is a bad solution.
Apr 15 2014
On Tuesday, 8 April 2014 at 18:38:47 UTC, bearophile wrote:
> ...
> In 2 cases I have had to cast an array length to uint so that the code compiles on both 32 and 64 bit systems when assigning that length to some uint value.
> ...
> Bye,
> bearophile

Personally I design my code around size_t/ptrdiff_t to eliminate these issues as much as possible. Yeah, it's more memory, but it does mean fewer issues with 32/64 bit.
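
When a uint really is required (for example by an external API), the length conversion discussed in this subthread can be checked instead of cast blindly. This is a minimal sketch; the helper names length32 and length32Assumed are hypothetical.

```d
import std.conv : to;

/// Hypothetical helper: the array length as a uint. On a 64-bit target,
/// to! throws std.conv.ConvOverflowException if the length does not fit,
/// instead of truncating silently.
uint length32(T)(const T[] arr)
{
    return arr.length.to!uint;
}

/// Variant that documents the "no huge arrays" assumption as a
/// precondition and then casts. The check disappears in release builds.
uint length32Assumed(T)(const T[] arr)
in
{
    assert(arr.length <= uint.max, "array too long for a uint length");
}
do
{
    return cast(uint) arr.length;
}

void main()
{
    int[] data = [10, 20, 30];
    assert(length32(data) == 3);
    assert(length32Assumed(data) == 3);

    // Keeping lengths in size_t, as suggested above, needs neither helper.
    size_t n = data.length;
    assert(n == 3);
}
```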
Apr 09 2014
> I have also counted about 38 unsorted casts that don't easily fit in the preceding categories. They are so varied that it's not easy to find ways to avoid them.

In my post I have not shown examples of the casts for this "unsorted" category. They are sometimes needed to work around compiler bugs, like this one (the code doesn't compile if you remove the cast):

    void main() {
        enum E { a, b }
        int[E][E] foo = cast()[E.a: [E.a: 1, E.b: 2],
                               E.b: [E.a: 3, E.b: 4]];
    }

Bye,
bearophile
Apr 09 2014
On Wednesday, 9 April 2014 at 21:18:38 UTC, bearophile wrote:
>> I have also counted about 38 unsorted casts that don't easily fit in the preceding categories. They are so varied that it's not easy to find ways to avoid them.
>
> In my post I have not shown examples of the casts for this "unsorted" category. They are sometimes needed to work around compiler bugs, like this one (the code doesn't compile if you remove the cast):
>
>     void main() {
>         enum E { a, b }
>         int[E][E] foo = cast()[E.a: [E.a: 1, E.b: 2],
>                                E.b: [E.a: 3, E.b: 4]];
>     }
>
> Bye,
> bearophile

I forgot that nested AAs were even possible. I was thinking about this yesterday and was positive that they weren't.
Apr 09 2014