digitalmars.D - Checking if an Integer is an Exact Binary Power
- =?UTF-8?B?Tm9yZGzDtnc=?= (8/8) Apr 23 2016 Wanted: CT-trait and run-time predicate for checking whether its
- Vladimir Panteleev (2/4) Apr 23 2016 popcnt(x) <= 1 ?
- =?UTF-8?B?Tm9yZGzDtnc=?= (25/29) Apr 23 2016 Great.
- Andrei Alexandrescu (5/9) Apr 23 2016 Would be interesting to see how it compares to the handwritten version
- Andrei Alexandrescu (3/13) Apr 23 2016 Oh, and the bummer is that on gdc it's a function call:
- Iain Buclaw via Digitalmars-d (3/22) Apr 24 2016 That can easily be changed. ;-)
- Andrei Alexandrescu (5/12) Apr 23 2016 There's stuff for that in e.g.
- Dmitry Olshansky (4/11) Apr 23 2016 x & (x-1) == 0
- Andrei Alexandrescu (3/17) Apr 23 2016 Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over
- Andrei Alexandrescu (2/21) Apr 23 2016 Oh I remember. If x == 0, the result should also be 0. -- Andrei
- =?UTF-8?B?Tm9yZGzDtnc=?= (4/7) Apr 23 2016 So is there a way to check if popcnt builtin is available on
- Lass Safin (5/12) Apr 23 2016 CPUID: https://en.wikipedia.org/wiki/CPUID.
- =?UTF-8?B?Tm9yZGzDtnc=?= (3/7) Apr 23 2016 Code you give a complete code example in D, please or point out a
- safety0ff (6/13) Apr 23 2016 https://dlang.org/phobos/core_cpuid.html#.hasPopcnt
- tsbockman (9/16) Apr 23 2016 core.bitop.popcnt() uses core.cpuid.hasPopcnt() to choose between
- Lass Safin (15/28) Apr 25 2016 I just found this:
- Iain Buclaw via Digitalmars-d (11/42) Apr 25 2016 version(X86)
- tsbockman (5/14) Apr 25 2016 That is not right (since 2.069, I think?). Adding that check just
- Marco Leise (6/25) Apr 26 2016 Can we just use __traits(hasTargetFeature, "popcnt") already
- David Nadlinger (11/13) Apr 23 2016 As Andrei pointed out, it's not only popcnt being available, it's
- Marco Leise (9/17) Apr 23 2016 We argued in another, unrelated thread ("Any usable SIMD
- deadalnix (5/7) Apr 23 2016 I'm not sure why do you test against x - 1 when you could test
- Andrei Alexandrescu (4/10) Apr 24 2016 So if you do (x & -x) == x that returns 1 for x == 0. For many
- Walter Bright (2/13) Apr 24 2016 This sort of stuff should go on wiki.dlang.org page!
- Temtaime (4/24) Apr 24 2016 Please no cmp.
- David Nadlinger (3/6) Apr 24 2016 You do realise that this will (typically) emit a branch?
- Xinok (40/46) Apr 24 2016 I compiled using dmd -O and looked at the disassembly using
- Xinok (2/3) Apr 24 2016 Sorry, didn't mean to say David's solution. Too many edits. >_<
- Andrei Alexandrescu (3/20) Apr 24 2016 With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest
- Solomon E (13/19) Apr 25 2016 I generalized function isPow2B to work for negative numbers in
- Andrei Alexandrescu (3/21) Apr 25 2016 That's smaller and faster, thanks. But how do you show it is still
- Xinok (4/34) Apr 25 2016 Brute force.
- Dominikus Dittes Scherkl (7/42) Apr 25 2016 Yeah. And your test spares the failed case int.min (0x80000000),
- Xinok (6/16) Apr 26 2016 How is it wrong? Negative numbers are NOT powers of two (unless
- Timon Gehr (2/18) Apr 26 2016 static assert(2^^31==int.min); // :o)
- Solomon E (3/24) Apr 25 2016 Brute force? That doesn't prove the algorithm, it only proves the
- Solomon E (55/58) Apr 25 2016 First I wrote the following variants of isPow2A. It's more
- Solomon E (32/36) Apr 25 2016 Obviously wrong explanation, sorry. (x >>> 1) is equal to half of
- Andrei Alexandrescu (5/7) Apr 25 2016 I suggest you simply consider x unsigned. There is no value to the
- Andrei Alexandrescu (4/11) Apr 25 2016 This is worse than what we have now, which is easy to show is correct.
- deadalnix (11/41) Apr 25 2016 x & -x is the smallest power of 2 that divides x. Basically, if x
- Dominikus Dittes Scherkl (6/16) Apr 26 2016 Yes. Except for the case 0x80000000 (= int.min), because this is
- Matthias Bentrup (11/30) Apr 26 2016 Well, it depends a bit on how you define "is a power of two":
- deadalnix (4/23) Apr 26 2016 You got to explain me how you end up with a negative number by
- Era Scarecrow (4/6) Apr 26 2016 You gotta scrawl your numbers down badly so one of the 2's looks
- Andrei Alexandrescu (2/21) Apr 24 2016 Why? -- Andrei
- deadalnix (5/13) Apr 23 2016 Note that (A | B) & -(A | B) will give you the minimal power of 2
- Shachar Shemesh (7/14) Apr 25 2016 The standard way (assuming 2's complement machine) is:
Wanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two. I guess https://dlang.org/phobos/std_math.html#.truncPow2 or https://dlang.org/phobos/std_math.html#.nextPow2 could be reused, but I bet there's a more efficient way of checking this.
Apr 23 2016
On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:Wanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two.popcnt(x) <= 1 ?
Apr 23 2016
On Saturday, 23 April 2016 at 13:06:58 UTC, Vladimir Panteleev wrote:On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:Great. My solution: /** Check if `x` is an exact (binary) power of 2. TODO Move to std.math. */ bool isPow2(T)(T x) if (isIntegral!T) { import core.bitop : popcnt; return popcnt(x) == 1; } safe pure nothrow nogc unittest { // run-time assert(!7.isPow2); assert(8.isPow2); assert(!9.isPow2); // compile-time static assert(!7.isPow2); static assert(8.isPow2); static assert(!9.isPow2); } Destroy.Wanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two.popcnt(x) <= 1 ?
Apr 23 2016
On 4/23/16 9:06 AM, Vladimir Panteleev wrote:On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:Would be interesting to see how it compares to the handwritten version on different machines. popcnt has a reputation of being slow in older CPUs, in recent ones it has a latency of 3 cycles and a throughput of 1 cycle. -- AndreiWanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two.popcnt(x) <= 1 ?
Apr 23 2016
On 4/23/16 9:54 AM, Andrei Alexandrescu wrote:On 4/23/16 9:06 AM, Vladimir Panteleev wrote:Oh, and the bummer is that on gdc it's a function call: https://godbolt.org/g/dEYCcG -- AndreiOn Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:Would be interesting to see how it compares to the handwritten version on different machines. popcnt has a reputation of being slow in older CPUs, in recent ones it has a latency of 3 cycles and a throughput of 1 cycle. -- AndreiWanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two.popcnt(x) <= 1 ?
Apr 23 2016
On 23 April 2016 at 15:56, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 4/23/16 9:54 AM, Andrei Alexandrescu wrote:That can easily be changed. ;-)On 4/23/16 9:06 AM, Vladimir Panteleev wrote:Oh, and the bummer is that on gdc it's a function call: https://godbolt.org/g/dEYCcG -- AndreiOn Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:Would be interesting to see how it compares to the handwritten version on different machines. popcnt has a reputation of being slow in older CPUs, in recent ones it has a latency of 3 cycles and a throughput of 1 cycle. -- AndreiWanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two.popcnt(x) <= 1 ?
Apr 24 2016
On 4/23/16 9:04 AM, Nordlöw wrote:Wanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two. I guess https://dlang.org/phobos/std_math.html#.truncPow2 or https://dlang.org/phobos/std_math.html#.nextPow2 could be reused, but I bet there's a more efficient way of checking this.There's stuff for that in e.g. https://github.com/dlang/phobos/blob/master/std/experimental/allo ator/common.d#L462. They're package-private with the intent is to move those slowly into Phobos. -- Andrei
Apr 23 2016
On 23-Apr-2016 16:04, Nordlöw wrote:Wanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two. I guess https://dlang.org/phobos/std_math.html#.truncPow2 or https://dlang.org/phobos/std_math.html#.nextPow2 could be reused, but I bet there's a more efficient way of checking this.x & (x-1) == 0 -- Dmitry Olshansky
Apr 23 2016
On 4/23/16 10:41 AM, Dmitry Olshansky wrote:On 23-Apr-2016 16:04, Nordlöw wrote:Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it. -- AndreiWanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two. I guess https://dlang.org/phobos/std_math.html#.truncPow2 or https://dlang.org/phobos/std_math.html#.nextPow2 could be reused, but I bet there's a more efficient way of checking this.x & (x-1) == 0
Apr 23 2016
On 04/23/2016 11:29 AM, Andrei Alexandrescu wrote:On 4/23/16 10:41 AM, Dmitry Olshansky wrote:Oh I remember. If x == 0, the result should also be 0. -- AndreiOn 23-Apr-2016 16:04, Nordlöw wrote:Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it. -- AndreiWanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two. I guess https://dlang.org/phobos/std_math.html#.truncPow2 or https://dlang.org/phobos/std_math.html#.nextPow2 could be reused, but I bet there's a more efficient way of checking this.x & (x-1) == 0
Apr 23 2016
On Saturday, 23 April 2016 at 17:28:21 UTC, Andrei Alexandrescu wrote:So is there a way to check if popcnt builtin is available on current platform?Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it. -- Andrei
Apr 23 2016
On Saturday, 23 April 2016 at 20:34:52 UTC, Nordlöw wrote:On Saturday, 23 April 2016 at 17:28:21 UTC, Andrei Alexandrescu wrote:CPUID: https://en.wikipedia.org/wiki/CPUID. You can check for the presence of a lot of instructions with this instruction. However this will only work on x86 and only run-time.So is there a way to check if popcnt builtin is available on current platform?Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it. -- Andrei
Apr 23 2016
On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:CPUID: https://en.wikipedia.org/wiki/CPUID. You can check for the presence of a lot of instructions with this instruction. However this will only work on x86 and only run-time.Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
Apr 23 2016
On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:https://dlang.org/phobos/core_cpuid.html#.hasPopcnt However, it is usually better to use the methods stated by Andrei / Dmitry* rather than population count for checking powers of two. * With the code given by Dmitry you have to check for zero, otherwise it will return true for 0. e.g. x && !(x & (x - 1))CPUID: https://en.wikipedia.org/wiki/CPUID. You can check for the presence of a lot of instructions with this instruction. However this will only work on x86 and only run-time.Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
Apr 23 2016
On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:core.bitop.popcnt() uses core.cpuid.hasPopcnt() to choose between the intrinsic _popcnt() and a software fallback algorithm at runtime: https://github.com/dlang/druntime/blob/master/src/core/bitop.d#L343 The relative complexity of softPopcnt() also shows why you don't really want to use popcnt() for this task unless you know the intrinsic will be available.CPUID: https://en.wikipedia.org/wiki/CPUID. You can check for the presence of a lot of instructions with this instruction. However this will only work on x86 and only run-time.Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
Apr 23 2016
On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:I just found this: https://dlang.org/phobos/core_cpuid.html#.hasPopcnt! It does exactly as it says: checks if the system has popcnt. Though read the top of https://dlang.org/phobos/core_cpuid.html before you use it:CPUID: https://en.wikipedia.org/wiki/CPUID. You can check for the presence of a lot of instructions with this instruction. However this will only work on x86 and only run-time.Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?Bugs: Currently only works on x86 and Itanium CPUs. Many processors have bugs in their microcode for the CPUID instruction, so sometimes the cache information may be incorrect.Example; import core.bitop; import core.cpuid; int count; if(hasPopcnt) count = _popcnt; // Uses x86-instruction "popcnt". else count = popcnt; // Phobos's software implementation. // Do stuff with count
Apr 25 2016
On 25 April 2016 at 18:48, Lass Safin via Digitalmars-d <digitalmars-d puremagic.com> wrote:On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:version(X86) { if (hasPopcnt) ... } else { // API not guaranteed to be the same. }On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:I just found this: https://dlang.org/phobos/core_cpuid.html#.hasPopcnt! It does exactly as it says: checks if the system has popcnt. Though read the top of https://dlang.org/phobos/core_cpuid.html before you use it:CPUID: https://en.wikipedia.org/wiki/CPUID. You can check for the presence of a lot of instructions with this instruction. However this will only work on x86 and only run-time.Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?Bugs: Currently only works on x86 and Itanium CPUs. Many processors have bugs in their microcode for the CPUID instruction, so sometimes the cache information may be incorrect.Example; import core.bitop; import core.cpuid; int count; if(hasPopcnt) count = _popcnt; // Uses x86-instruction "popcnt". else count = popcnt; // Phobos's software implementation.
Apr 25 2016
On Monday, 25 April 2016 at 16:48:14 UTC, Lass Safin wrote:Example; import core.bitop; import core.cpuid; int count; if(hasPopcnt) count = _popcnt; // Uses x86-instruction "popcnt". else count = popcnt; // Phobos's software implementation. // Do stuff with countThat is not right (since 2.069, I think?). Adding that check just makes things slower and more complicated: core.bitop.popcnt() itself already checks hasPopcnt, and forwards to the intrinsic if it is available.
Apr 25 2016
Am Mon, 25 Apr 2016 17:04:43 +0000 schrieb tsbockman <thomas.bockman gmail.com>:On Monday, 25 April 2016 at 16:48:14 UTC, Lass Safin wrote:Can we just use __traits(hasTargetFeature, "popcnt") already and get rid of _popcnt? -- MarcoExample; import core.bitop; import core.cpuid; int count; if(hasPopcnt) count = _popcnt; // Uses x86-instruction "popcnt". else count = popcnt; // Phobos's software implementation. // Do stuff with countThat is not right (since 2.069, I think?). Adding that check just makes things slower and more complicated: core.bitop.popcnt() itself already checks hasPopcnt, and forwards to the intrinsic if it is available.
Apr 26 2016
On Tuesday, 26 April 2016 at 14:37:46 UTC, Marco Leise wrote:Can we just use __traits(hasTargetFeature, "popcnt") already and get rid of _popcnt?No, because DMD does not currently support setting SSE4 as the minimum target. Thus, `__traits(hasTargetFeature, "popcnt")` must always return `false` at compile time. As for LDC and GDC - both already treat `popcnt()` as an intrinsic, and don't need your proposed change.
Apr 26 2016
On Tuesday, 26 April 2016 at 16:53:00 UTC, tsbockman wrote:On Tuesday, 26 April 2016 at 14:37:46 UTC, Marco Leise wrote:It's strange why it emits an branch. http://goo.gl/jJbSYd Det version doesn't use cmp and should be fast.Can we just use __traits(hasTargetFeature, "popcnt") already and get rid of _popcnt?No, because DMD does not currently support setting SSE4 as the minimum target. Thus, `__traits(hasTargetFeature, "popcnt")` must always return `false` at compile time. As for LDC and GDC - both already treat `popcnt()` as an intrinsic, and don't need your proposed change.
Apr 26 2016
On Saturday, 23 April 2016 at 20:34:52 UTC, Nordlöw wrote:So is there a way to check if popcnt builtin is available on current platform?As Andrei pointed out, it's not only popcnt being available, it's also it being fast. A function implementing the classic hand-written version compiles down to something like lea eax, [rdi - 1] test eax, edi sete al ret which is already quite good. There is also blsi to consider on Haswell and above. — David
Apr 23 2016
Am Sat, 23 Apr 2016 20:34:52 +0000 schrieb Nordl=C3=B6w <per.nordlow gmail.com>:On Saturday, 23 April 2016 at 17:28:21 UTC, Andrei Alexandrescu=20 wrote:We argued in another, unrelated thread ("Any usable SIMD implementation?") for compile-time information on the target. That would include __traits("hasTargetFeature", "popcnt") A runtime check requires at least a read of a global variable and I doubt it is faster than `x & (x-1)`. --=20 Marco=20 So is there a way to check if popcnt builtin is available on=20 current platform?Yah, that's the canonical. I forgot why I chose (x & -x) > (x=20 - 1) over it. -- Andrei =20
Apr 23 2016
On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it.I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
Apr 23 2016
On 04/24/2016 02:57 AM, deadalnix wrote:On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- AndreiYah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it.I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
Apr 24 2016
On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:On 04/24/2016 02:57 AM, deadalnix wrote:This sort of stuff should go on wiki.dlang.org page!On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- AndreiYah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it.I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
Apr 24 2016
On Sunday, 24 April 2016 at 21:45:32 UTC, Walter Bright wrote:On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:Please no cmp. Just bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }On 04/24/2016 02:57 AM, deadalnix wrote:This sort of stuff should go on wiki.dlang.org page!On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- AndreiYah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it.I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
Apr 24 2016
On Sunday, 24 April 2016 at 23:00:56 UTC, Temtaime wrote:Please no cmp. Just bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }You do realise that this will (typically) emit a branch? — David
Apr 24 2016
On Sunday, 24 April 2016 at 23:17:53 UTC, David Nadlinger wrote:On Sunday, 24 April 2016 at 23:00:56 UTC, Temtaime wrote:I compiled using dmd -O and looked at the disassembly using objdump -D and there are indeed a couple branches in the disassembly: 0000000000000000 <_D4test8powerOf2FkZk>: 0: 50 push %rax 1: 48 89 f9 mov %rdi,%rcx 4: 85 c9 test %ecx,%ecx 6: 74 0a je 12 <_D4test8powerOf2FkZk+0x12> 8: 8d 81 ff ff ff ff lea -0x1(%rcx),%eax e: 85 c1 test %eax,%ecx 10: 74 04 je 16 <_D4test8powerOf2FkZk+0x16> 12: 31 c0 xor %eax,%eax 14: eb 05 jmp 1b <_D4test8powerOf2FkZk+0x1b> 16: b8 01 00 00 00 mov $0x1,%eax 1b: 59 pop %rcx 1c: c3 retq 1d: 0f 1f 00 nopl (%rax) I modified David's solution a bit to (hopefully) eliminate the branch: bool powerOf2(uint x){ return !x ^ !(x & (x - 1)); } 0000000000000000 <_D4test8powerOf2FkZb>: 0: 50 push %rax 1: 53 push %rbx 2: 48 89 fa mov %rdi,%rdx 5: 85 d2 test %edx,%edx 7: 0f 94 c0 sete %al a: 8d 8a ff ff ff ff lea -0x1(%rdx),%ecx 10: 85 ca test %ecx,%edx 12: 0f 94 c3 sete %bl 15: 32 c3 xor %bl,%al 17: 5b pop %rbx 18: 59 pop %rcx 19: c3 retq 1a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) The branches are gone but I'm really not sure if this is better or worse.Please no cmp. Just bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }You do realise that this will (typically) emit a branch? — David
Apr 24 2016
On Monday, 25 April 2016 at 01:17:48 UTC, Xinok wrote:...Sorry, didn't mean to say David's solution. Too many edits. >_<
Apr 24 2016
On 4/24/16 9:17 PM, Xinok wrote:I modified David's solution a bit to (hopefully) eliminate the branch: bool powerOf2(uint x){ return !x ^ !(x & (x - 1)); } 0000000000000000 <_D4test8powerOf2FkZb>: 0: 50 push %rax 1: 53 push %rbx 2: 48 89 fa mov %rdi,%rdx 5: 85 d2 test %edx,%edx 7: 0f 94 c0 sete %al a: 8d 8a ff ff ff ff lea -0x1(%rdx),%ecx 10: 85 ca test %ecx,%edx 12: 0f 94 c3 sete %bl 15: 32 c3 xor %bl,%al 17: 5b pop %rbx 18: 59 pop %rcx 19: c3 retq 1a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) The branches are gone but I'm really not sure if this is better or worse.With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- Andrei
Apr 24 2016
On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- AndreiI generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)bool isPow2B(T)(T x) { return (x & -x) > (x - 1); }bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
Apr 25 2016
On 4/25/16 6:42 AM, Solomon E wrote:On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:That's smaller and faster, thanks. But how do you show it is still correct? -- AndreiWith gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- AndreiI generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)bool isPow2B(T)(T x) { return (x & -x) > (x - 1); }bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
Apr 25 2016
On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu wrote:On 4/25/16 6:42 AM, Solomon E wrote:Brute force. http://dpaste.dzfl.pl/882d7cdc5f74On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:That's smaller and faster, thanks. But how do you show it is still correct? -- AndreiWith gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- AndreiI generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)bool isPow2B(T)(T x) { return (x & -x) > (x - 1); }bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
Apr 25 2016
On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu wrote:Yeah. And your test spares the failed case int.min (0x80000000), because in this case x & -x is negative, but of cause x>>>1 is positive, so it returns false despite -(2^^31) is in fact a power of two. So this requires an extra cast to work correct (in fact no big difference in the assembly): return (Unsigned!T)(x & -x) > (Unsigned!T)(x >>> 1);On 4/25/16 6:42 AM, Solomon E wrote:Brute force. http://dpaste.dzfl.pl/882d7cdc5f74On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:That's smaller and faster, thanks. But how do you show it is still correct? -- AndreiWith gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- AndreiI generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)bool isPow2B(T)(T x) { return (x & -x) > (x - 1); }bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
Apr 25 2016
On Monday, 25 April 2016 at 15:35:14 UTC, Dominikus Dittes Scherkl wrote:On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:How is it wrong? Negative numbers are NOT powers of two (unless you have an imaginary/complex exponent), so it's still correct to return false for -int.min. Should it not? http://www.wolframalpha.com/input/?i=solve+2^n+%3D+-%282^31%29+for+nBrute force. http://dpaste.dzfl.pl/882d7cdc5f74Yeah. And your test spares the failed case int.min (0x80000000), because in this case x & -x is negative, but of cause x>>>1 is positive, so it returns false despite -(2^^31) is in fact a power of two. So this requires an extra cast to work correct (in fact no big difference in the assembly): return (Unsigned!T)(x & -x) > (Unsigned!T)(x >>> 1);
Apr 26 2016
On 26.04.2016 17:27, Xinok wrote:On Monday, 25 April 2016 at 15:35:14 UTC, Dominikus Dittes Scherkl wrote:static assert(2^^31==int.min); // :o)On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:How is it wrong? Negative numbers are NOT powers of two (unless you have an imaginary/complex exponent), so it's still correct to return false for -int.min. Should it not? http://www.wolframalpha.com/input/?i=solve+2^n+%3D+-%282^31%29+for+nBrute force. http://dpaste.dzfl.pl/882d7cdc5f74Yeah. And your test spares the failed case int.min (0x80000000), because in this case x & -x is negative, but of cause x>>>1 is positive, so it returns false despite -(2^^31) is in fact a power of two. So this requires an extra cast to work correct (in fact no big difference in the assembly): return (Unsigned!T)(x & -x) > (Unsigned!T)(x >>> 1);
Apr 26 2016
On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu wrote:Brute force? That doesn't prove the algorithm, it only proves the implementation.On 4/25/16 6:42 AM, Solomon E wrote:Brute force. http://dpaste.dzfl.pl/882d7cdc5f74... I generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. ...That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei[...]bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } ...
Apr 25 2016
On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu wrote:First I wrote the following variants of isPow2A. It's more obvious that they're correct, but they didn't compile to fewer instructions. bool isPow2D(T)(T x) { return (x > 0) && !(x & (x - 1)); } (The variant isPow2E avoids a jump instruction:) bool isPow2E(T)(T x) { return (x > 0) & !(x & (x - 1)); } So at least I had some functions that I knew were doing what I wanted (handling negative values for sure) to test against. Then I tried some other things that didn't work, then the bit shift variant, isPow2F. It seemed to work, I looked at at long list of results in the terminal and saw what the pattern is, but right, I shouldn't trust it without a proof. (x & -x) evaluates to the value of the lowest bit set in x. Either (x & -x) == x, which is a power of 2, or (x & -x) == x minus the value of all the bits above the lowest bit set. (x >>> 1) is equal to half of x or one less than half of x, or a positive number if x is negative. For (x & -x) > (x >>> 1) to be true, that would either be because x is a power of 2, or else x - n > x/2 - 1 (or x - n > x/2, which is implied by that) where n is the value of the bits above the lowest bit set, which means n >= 2/3*x. So x - n > x/2 - 1 would be 1/3*x > x/2 - 1, which would be 0 > 1/6*x - 1. That's impossible for x > 6. For values of x under 6, the results check out case by case. When x is negative in a signed type, the unsigned shift treats x as if it were an unsigned number. The only particular bit arrangement that's treated differently when using a signed type is that -(2^^32) isn't a power of two because it's negative, so the expression (x & -x) should evaluate to a negative value before the comparison. That seems to work for a byte or a short, where (x & -x) comes out negative when at the min value. I've written this clumsily because I'm not sure what sort of symbols to use and don't know how to write mathematical proofs for algorithms in programming in general, but I thought it was interesting that writing it down and double checking required showing how close the algorithm cuts it. For example, x = 6144, (x & -x) == 2048, (x >>> 1) == 3072. The cases for 0 to 6, when using signed types for x: algo B algo F x 0 B true F false (x & -x) == 0, (x >>> 1) == 0 x 1 B true F true (x & -x) == 1, (x >>> 1) == 0 x 2 B true F true (x & -x) == 2, (x >>> 1) == 1 x 3 B false F false (x & -x) == 1, (x >>> 1) == 1 x 4 B true F true (x & -x) == 4, (x >>> 1) == 2 x 5 B false F false (x & -x) == 1, (x >>> 1) == 2 x 6 B false F false (x & -x) == 2, (x >>> 1) == 3...That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
Apr 25 2016
On Monday, 25 April 2016 at 16:53:20 UTC, Solomon E wrote:... (x >>> 1) is equal to half of x or one less than half of x, or a positive number if x is negative. ...Obviously wrong explanation, sorry. (x >>> 1) is equal to half of x or half of one less than x, or a positive number if x is negative. There are other parts in isPow2F where I'm not sure exactly what the bits are doing, such as how the compiler makes the result negative for (x & -x) at the bit and short min values, where in other expressions while learning D I've found an implicit cast to int that I worked around by casting back to the type I put in. (I guess that's the way it's supposed to be for compatibility with C/C++ expectations.) Also I skipped mentioning case x = 6 works too, not just cases over and under 6. It's just a sketch of a proof. I would use the other function I wrote first, isPow2D, to be more sure of the logic, although it's not the most optimized, or I'd use a library function that promises to do the task for all types if I found one first, rather than the isPow2 variants I wrote and others wrote, and rather than popcnt. I'm more concerned with getting the application logic right when I program than optimizing for speed or bytes, until something is noticeably pausing or taking extra megabytes. bool isPow2D(T)(T x) { return (x > 0) && !(x & (x - 1)); } Just now I tested popcnt against isPow2D and found that popcnt reports the number of bits set in a uint, regardless of popcnt's argument being a smaller size integer type or signed. That would make (popcnt( -(2^^31) ) == 1) yield true, which isn't what I want for an isPow2 function. popcnt is only documented to take uint or ulong, so that's not a bug in the library, only in my try at using it roughly.
Apr 25 2016
On 04/25/2016 03:21 PM, Solomon E wrote:There are other parts in isPow2F where I'm not sure exactly what the bits are doing, such as how the compiler makes the result negativeI suggest you simply consider x unsigned. There is no value to the discussion brought by negative numbers, and implementation-wise you can insert a cast or let the overload mechanism only use unsigned integers. -- Andrei
Apr 25 2016
On 04/25/2016 03:21 PM, Solomon E wrote:I'm more concerned with getting the application logic right when I program than optimizing for speed or bytes, until something is noticeably pausing or taking extra megabytes.This is a luxury not available to library writers.bool isPow2D(T)(T x) { return (x > 0) && !(x & (x - 1)); }This is worse than what we have now, which is easy to show is correct. Andrei
Apr 25 2016
On Monday, 25 April 2016 at 19:37:45 UTC, Andrei Alexandrescu wrote:On 04/25/2016 03:21 PM, Solomon E wrote:I agree.I'm more concerned with getting the application logic right when I program than optimizing for speed or bytes, until something is noticeably pausing or taking extra megabytes.This is a luxury not available to library writers.I wasn't recommending isPow2D for the library, just for the luxury use case of writing cautiously to show the logic is correct. It's easy to show that the current code for experimental.common.isPowerOf2 (the same logic as isPow2B) does wrong things when a signed type is forced into it. I was hoping to suggest that if isPow2F isn't wrong, it might be used, because it has been proved and checked to be correct in this thread. However, I don't know if it actually is faster, and it would be difficult to test the speed up, if any, since the function variants are all so small, and isPow2F compiles to different instructions when used with different types. So it might not be worthwhile, unless there's an advantage of correctness or type-safety, which I'm not experienced enough in D to judge. (The library's descriptions of std.math.nextPow2 are vastly oversimplified compared to what it does as shown in the examples, so attempting to use that for recognizing simple powers of 2 would be a complexity mismatch.)bool isPow2D(T)(T x) { return (x > 0) && !(x & (x - 1)); }This is worse than what we have now, which is easy to show is correct. Andrei
Apr 25 2016
On Monday, 25 April 2016 at 22:15:31 UTC, Solomon E wrote:... I was hoping to suggest that if isPow2F isn't wrong, it might be used, because it has been proved and checked to be correct in this thread. ...On second thought, no one's going to understand why the code (x & -x) > (x >>> 1) parameterized for all integer types is in the library, if it ever breaks because of some compiler backend problem with the different generated code for the different integer types. So I'll take back suggesting that for the library, (at least until I or someone else understands why it would never break like that.) The ways to improve D that occur to me based on this are: (1) It would be better to have a compiler warning on passing signed types to unsigned parameter functions. There wasn't with popcnt, on GDC 5.2.1. There should be more of other warnings like that too. I like to compile with practically all the warnings & errors on when I try writing a little C, such as a chess engine with no heap allocation that I wrote in January. -Wall -Wextra -fno-common -Wcast-align -Wredundant-decls -Wbad-function-cast -Wwrite-strings -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wformat=2 -Wdouble-promotion -Winit-self -Wmissing-include-dirs -Wswitch-default -Wswitch-enum -Wfloat-equal -Wconversion -Wlogical-op -Wredundant-decls -Wjump-misses-init -Wshadow -Wcast-qual -Wvla -Wsuggest-attribute=pure -fipa-pure-const -Wsuggest-attribute=const No segfaults from that program, which was not the case with learning to make a thread local class variable in D. (2) More thorough, precise descriptions in the online library documentation. Some things are told in very loose terms, with unexpected complexities shown in the examples. See std.math.nextPow2 for what I'm talking about. I'd like to rewrite the docs, if I could, or maybe an alternate expanded version, but it's a lot and it's a moving target.
Apr 25 2016
On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu wrote:On 4/25/16 6:42 AM, Solomon E wrote:x & -x is the smallest power of 2 that divides x. Basically, if x = xxxx1000 , x & -x = 00001000 . This is easy to proves considering -x = ~x + 1 . Now, x >> 1 will be of the form 0xxxx100 . If one of these digit is one, then (x >> 1) >= (x & -x) . If they are all zeros, (x >> 1) < (x & -x) which is the case where you have a power of 2. 0 is a special case, can it can be checked that the function return false for this specific input. This looks like it is correct.On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:That's smaller and faster, thanks. But how do you show it is still correct? -- AndreiWith gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- AndreiI generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)bool isPow2B(T)(T x) { return (x & -x) > (x - 1); }bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
Apr 25 2016
On Monday, 25 April 2016 at 22:42:38 UTC, deadalnix wrote:x & -x is the smallest power of 2 that divides x. Basically, if x = xxxx1000 , x & -x = 00001000 . This is easy to proves considering -x = ~x + 1 . Now, x >> 1 will be of the form 0xxxx100 . If one of these digit is one, then (x >> 1) >= (x & -x) . If they are all zeros, (x >> 1) < (x & -x) which is the case where you have a power of 2. 0 is a special case, can it can be checked that the function return false for this specific input. This looks like it is correct.Yes. Except for the case 0x80000000 (= int.min), because this is negative so NOT smaller than 0x40000000 (= int.min>>>1), which is considered positive. So the algorithm doesn't work for signed integers (without extra cast).
Apr 26 2016
On Tuesday, 26 April 2016 at 08:12:02 UTC, Dominikus Dittes Scherkl wrote:On Monday, 25 April 2016 at 22:42:38 UTC, deadalnix wrote:Well, it depends a bit on how you define "is a power of two": If you define "x is a power of 2" as "there is a natural number n, so that x == 2*2*...*2 (n-times)" with * defined as multiplication in the ring of integers modulo 2^32 (i.e. int or uint), then int.min and 0 are powers of two because the multiplication overflows. If you define "x is a power of 2" as above but based on the multiplication in the ring of integers, then int.min (== -2^31) and 0 are not powers of two.x & -x is the smallest power of 2 that divides x. Basically, if x = xxxx1000 , x & -x = 00001000 . This is easy to proves considering -x = ~x + 1 . Now, x >> 1 will be of the form 0xxxx100 . If one of these digit is one, then (x >> 1) >= (x & -x) . If they are all zeros, (x >> 1) < (x & -x) which is the case where you have a power of 2. 0 is a special case, can it can be checked that the function return false for this specific input. This looks like it is correct.Yes. Except for the case 0x80000000 (= int.min), because this is negative so NOT smaller than 0x40000000 (= int.min>>>1), which is considered positive. So the algorithm doesn't work for signed integers (without extra cast).
Apr 26 2016
On Tuesday, 26 April 2016 at 08:12:02 UTC, Dominikus Dittes Scherkl wrote:On Monday, 25 April 2016 at 22:42:38 UTC, deadalnix wrote:You got to explain me how you end up with a negative number by multiplying 2 by 2 a bunch of time.x & -x is the smallest power of 2 that divides x. Basically, if x = xxxx1000 , x & -x = 00001000 . This is easy to proves considering -x = ~x + 1 . Now, x >> 1 will be of the form 0xxxx100 . If one of these digit is one, then (x >> 1) >= (x & -x) . If they are all zeros, (x >> 1) < (x & -x) which is the case where you have a power of 2. 0 is a special case, can it can be checked that the function return false for this specific input. This looks like it is correct.Yes. Except for the case 0x80000000 (= int.min), because this is negative so NOT smaller than 0x40000000 (= int.min>>>1), which is considered positive. So the algorithm doesn't work for signed integers (without extra cast).
Apr 26 2016
On Tuesday, 26 April 2016 at 18:04:53 UTC, deadalnix wrote:You got to explain me how you end up with a negative number by multiplying 2 by 2 a bunch of time.You gotta scrawl your numbers down badly so one of the 2's looks like a minus sign. Or go so large that the universe caves in on itself.
Apr 26 2016
On 4/24/16 7:00 PM, Temtaime wrote:On Sunday, 24 April 2016 at 21:45:32 UTC, Walter Bright wrote:Why? -- AndreiOn 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:Please no cmp.On 04/24/2016 02:57 AM, deadalnix wrote:This sort of stuff should go on wiki.dlang.org page!On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- AndreiYah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it.I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
Apr 24 2016
On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:Wanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two. I guess https://dlang.org/phobos/std_math.html#.truncPow2 or https://dlang.org/phobos/std_math.html#.nextPow2 could be reused, but I bet there's a more efficient way of checking this.Note that (A | B) & -(A | B) will give you the minimal power of 2 that divide A and B. Now, if A == B, you get (A & -A) == A to test if A is a power of 2.
Apr 23 2016
On 23/04/16 16:04, Nordlöw wrote:Wanted: CT-trait and run-time predicate for checking whether its single integer parameter is an exact power of two. I guess https://dlang.org/phobos/std_math.html#.truncPow2 or https://dlang.org/phobos/std_math.html#.nextPow2 could be reused, but I bet there's a more efficient way of checking this.The standard way (assuming 2's complement machine) is: if( i == (-i & (~i)) ) writeln("Power of 2"); Works for all 2's complement integers except 0, where it gives a false positive. Shachar
Apr 25 2016