digitalmars.D.learn - How are switches optimized
- IntegratedDimensions (22/22) Jun 01 2018 What is the best optimizations that a compiler does to switches
- Johan Engelen (24/37) Jun 02 2018 The best possible depends a lot on the specific case at hand.
- Dennis (79/81) Jun 02 2018 It depends on the compiler and optimization level. In general, no
What is the best optimizations that a compiler does to switches in general and in the D compilers? A switch can be seen as if statements, or safer, nested if elses. but surely the cost per case does not grow with depth in the switch? If one has a switch of N case then the last cost surely does not cost N times the cost of the first, approximately? This is the cost when implementing a switch as nested ifs. Tables can be used to give O(1) cost, are these used in D's compilers? How are they generally implemented? Hash tables? If the switch is on an enum of small values is it optimized for a simple calculating offset table? switch(e) { case E.a: ... case E.z: ... } jmp offset + val(e) Or, in general a map table can be used jmp offset + map[e] Just curious...
Jun 01 2018
On Friday, 1 June 2018 at 21:18:25 UTC, IntegratedDimensions wrote:What is the best optimizations that a compiler does to switches in general and in the D compilers?The best possible depends a lot on the specific case at hand. Best possible is to fully elide the switch, which does happen. You can use d.godbolt.org to investigate what happens for different pieces of code. Sometimes a jumptable is used, sometimes an if-chain. LLVM's (LDC) and GCC's (GDC) optimizers are strong and the optimized code will often do extra calculations before indexing in the table or before doing the comparisons in the if-chain. Different compilers will make different optimizations. https://godbolt.org/g/pHptff https://godbolt.org/g/AwZ69oA switch can be seen as if statements, or safer, nested if elses. but surely the cost per case does not grow with depth in the switch? If one has a switch of N case then the last cost surely does not cost N times the cost of the first, approximately?Depends on the code, but it's not O(N).This is the cost when implementing a switch as nested ifs.Not true. Nested if's are optimized as well. Sometimes switch is faster, sometimes if-chain, sometimes it's the same.Tables can be used to give O(1) cost, are these used in D's compilers?Yes (LDC and GDC).How are they generally implemented? Hash tables? If the switch is on an enum of small values is it optimized for a simple calculating offset table?Table stored in the instruction stream. Simple offset table with calculation on the index value (I guess you could say it is a simplified hash table). Note that with performance, the rest of the program and the execution flow also matters... cheers, Johan
Jun 02 2018
On Friday, 1 June 2018 at 21:18:25 UTC, IntegratedDimensions wrote:If one has a switch of N case then the last cost surely does not cost N times the cost of the first, approximately?It depends on the compiler and optimization level. In general, no optimization or just a handful of cases means it's as if you wrote a bunch of if-then-else statements. When there are many cases, a jump table is the go-to way to optimize it. But the compiler may do more clever transformations: ``` void smallCase(); void largeCase(); void defaultCase(); int switchfunc(int i) { switch(i) { case 8: .. case 64: smallCase(); return 1; case 256: .. case 512: largeCase(); return 2; default: defaultCase(); return 3; } } ``` Even though the front end expands the ranged cases to individual case statements, LDC with -O1 or higher will actually output code akin to this: ``` int switchfunc(int i) { if (cast(uint) (i - 256) >= 257) { if (cast(uint) (i - 8) > 56) { smallCase(); return 1; } defaultCase(); return 3; } largeCase(); return 2; } ``` Notice how even though I put the 256-512 range second, LDC chose to check it first because it's a larger range than 8-64. Also notice the use of unsigned integer underflow to check whether an integer is in a certain range. Something I like to do for text reading functions is this: ``` bool isOneOf(string str)(char chr) { switch (chr) { static foreach(c; str) { case c: return true; } default: return false; } } alias isHexadecimalDigit = isOneOf!"abcdefABCDEF0123456789"; ``` This will efficiently check if a character is in a compile-time known character set using compile-time optimizations. While DMD makes a jumptable for this, LDC actually transforms it into this: ``` bool isHexadecimalDigit(char chr) { ubyte x = cast(ubyte) (chr - 48); if (x > 54) return false; return (0b1111110000000000000000000000000011111100000001111111111 >> x) & 1; } ``` The cases can be encoded in a binary number that is shifted by the value of the character, so that the correct boolean return value (1 or 0) ends up as the least significant bit. DMD uses the same trick if you write it in this way: ``` return (chr == 'a' || chr == 'b' || chr == 'c' || ...); ``` So the most common optimization is indexing in (jump) tables, but smart transformations are done too sometimes. By the way, a chain of if-then-else statements may also be optimized as if it was written in a switch statement. If you want to see how the compiler optimizes your switch statement, check out run.dlang.io (DMD and LDC, press ASM button) or d.godbolt.org (DMD, LDC and GDC).
Jun 02 2018