www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Checking if an Integer is an Exact Binary Power

reply =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
Wanted: CT-trait and run-time predicate for checking whether its 
single integer parameter is an exact power of two.

I guess

https://dlang.org/phobos/std_math.html#.truncPow2

or

https://dlang.org/phobos/std_math.html#.nextPow2

could be reused, but I bet there's a more efficient way of 
checking this.
Apr 23 2016
next sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether 
 its single integer parameter is an exact power of two.
popcnt(x) <= 1 ?
Apr 23 2016
next sibling parent =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Saturday, 23 April 2016 at 13:06:58 UTC, Vladimir Panteleev 
wrote:
 On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether 
 its single integer parameter is an exact power of two.
popcnt(x) <= 1 ?
Great. My solution: /** Check if `x` is an exact (binary) power of 2. TODO Move to std.math. */ bool isPow2(T)(T x) if (isIntegral!T) { import core.bitop : popcnt; return popcnt(x) == 1; } safe pure nothrow nogc unittest { // run-time assert(!7.isPow2); assert(8.isPow2); assert(!9.isPow2); // compile-time static assert(!7.isPow2); static assert(8.isPow2); static assert(!9.isPow2); } Destroy.
Apr 23 2016
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/23/16 9:06 AM, Vladimir Panteleev wrote:
 On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its
 single integer parameter is an exact power of two.
popcnt(x) <= 1 ?
Would be interesting to see how it compares to the handwritten version on different machines. popcnt has a reputation of being slow in older CPUs, in recent ones it has a latency of 3 cycles and a throughput of 1 cycle. -- Andrei
Apr 23 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/23/16 9:54 AM, Andrei Alexandrescu wrote:
 On 4/23/16 9:06 AM, Vladimir Panteleev wrote:
 On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its
 single integer parameter is an exact power of two.
popcnt(x) <= 1 ?
Would be interesting to see how it compares to the handwritten version on different machines. popcnt has a reputation of being slow in older CPUs, in recent ones it has a latency of 3 cycles and a throughput of 1 cycle. -- Andrei
Oh, and the bummer is that on gdc it's a function call: https://godbolt.org/g/dEYCcG -- Andrei
Apr 23 2016
parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 23 April 2016 at 15:56, Andrei Alexandrescu via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 4/23/16 9:54 AM, Andrei Alexandrescu wrote:
 On 4/23/16 9:06 AM, Vladimir Panteleev wrote:
 On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its
 single integer parameter is an exact power of two.
popcnt(x) <= 1 ?
Would be interesting to see how it compares to the handwritten version on different machines. popcnt has a reputation of being slow in older CPUs, in recent ones it has a latency of 3 cycles and a throughput of 1 cycle. -- Andrei
Oh, and the bummer is that on gdc it's a function call: https://godbolt.org/g/dEYCcG -- Andrei
That can easily be changed. ;-)
Apr 24 2016
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/23/16 9:04 AM, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its single
 integer parameter is an exact power of two.

 I guess

 https://dlang.org/phobos/std_math.html#.truncPow2

 or

 https://dlang.org/phobos/std_math.html#.nextPow2

 could be reused, but I bet there's a more efficient way of checking this.
There's stuff for that in e.g. https://github.com/dlang/phobos/blob/master/std/experimental/allo ator/common.d#L462. They're package-private with the intent is to move those slowly into Phobos. -- Andrei
Apr 23 2016
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 23-Apr-2016 16:04, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its single
 integer parameter is an exact power of two.

 I guess

 https://dlang.org/phobos/std_math.html#.truncPow2

 or

 https://dlang.org/phobos/std_math.html#.nextPow2

 could be reused, but I bet there's a more efficient way of checking this.
x & (x-1) == 0 -- Dmitry Olshansky
Apr 23 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/23/16 10:41 AM, Dmitry Olshansky wrote:
 On 23-Apr-2016 16:04, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its single
 integer parameter is an exact power of two.

 I guess

 https://dlang.org/phobos/std_math.html#.truncPow2

 or

 https://dlang.org/phobos/std_math.html#.nextPow2

 could be reused, but I bet there's a more efficient way of checking this.
x & (x-1) == 0
Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it. -- Andrei
Apr 23 2016
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/23/2016 11:29 AM, Andrei Alexandrescu wrote:
 On 4/23/16 10:41 AM, Dmitry Olshansky wrote:
 On 23-Apr-2016 16:04, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its single
 integer parameter is an exact power of two.

 I guess

 https://dlang.org/phobos/std_math.html#.truncPow2

 or

 https://dlang.org/phobos/std_math.html#.nextPow2

 could be reused, but I bet there's a more efficient way of checking
 this.
x & (x-1) == 0
Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1) over it. -- Andrei
Oh I remember. If x == 0, the result should also be 0. -- Andrei
Apr 23 2016
parent reply =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Saturday, 23 April 2016 at 17:28:21 UTC, Andrei Alexandrescu 
wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > (x 
 - 1) over
 it. -- Andrei
So is there a way to check if popcnt builtin is available on current platform?
Apr 23 2016
next sibling parent reply Lass Safin <lasssafin gmail.com> writes:
On Saturday, 23 April 2016 at 20:34:52 UTC, Nordlöw wrote:
 On Saturday, 23 April 2016 at 17:28:21 UTC, Andrei Alexandrescu 
 wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > (x 
 - 1) over
 it. -- Andrei
So is there a way to check if popcnt builtin is available on current platform?
CPUID: https://en.wikipedia.org/wiki/CPUID. You can check for the presence of a lot of instructions with this instruction. However this will only work on x86 and only run-time.
Apr 23 2016
parent reply =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:
 CPUID: https://en.wikipedia.org/wiki/CPUID.
 You can check for the presence of a lot of instructions with 
 this instruction.
 However this will only work on x86 and only run-time.
Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
Apr 23 2016
next sibling parent safety0ff <safety0ff.dev gmail.com> writes:
On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:
 On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:
 CPUID: https://en.wikipedia.org/wiki/CPUID.
 You can check for the presence of a lot of instructions with 
 this instruction.
 However this will only work on x86 and only run-time.
Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
https://dlang.org/phobos/core_cpuid.html#.hasPopcnt However, it is usually better to use the methods stated by Andrei / Dmitry* rather than population count for checking powers of two. * With the code given by Dmitry you have to check for zero, otherwise it will return true for 0. e.g. x && !(x & (x - 1))
Apr 23 2016
prev sibling next sibling parent tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:
 On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:
 CPUID: https://en.wikipedia.org/wiki/CPUID.
 You can check for the presence of a lot of instructions with 
 this instruction.
 However this will only work on x86 and only run-time.
Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
core.bitop.popcnt() uses core.cpuid.hasPopcnt() to choose between the intrinsic _popcnt() and a software fallback algorithm at runtime: https://github.com/dlang/druntime/blob/master/src/core/bitop.d#L343 The relative complexity of softPopcnt() also shows why you don't really want to use popcnt() for this task unless you know the intrinsic will be available.
Apr 23 2016
prev sibling parent reply Lass Safin <lasssafin gmail.com> writes:
On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:
 On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:
 CPUID: https://en.wikipedia.org/wiki/CPUID.
 You can check for the presence of a lot of instructions with 
 this instruction.
 However this will only work on x86 and only run-time.
Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
I just found this: https://dlang.org/phobos/core_cpuid.html#.hasPopcnt! It does exactly as it says: checks if the system has popcnt. Though read the top of https://dlang.org/phobos/core_cpuid.html before you use it:
 Bugs:
 Currently only works on x86 and Itanium CPUs. Many processors 
 have bugs in their
 microcode for the CPUID instruction, so sometimes the cache 
 information may be
 incorrect.
Example; import core.bitop; import core.cpuid; int count; if(hasPopcnt) count = _popcnt; // Uses x86-instruction "popcnt". else count = popcnt; // Phobos's software implementation. // Do stuff with count
Apr 25 2016
next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 25 April 2016 at 18:48, Lass Safin via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Saturday, 23 April 2016 at 21:04:52 UTC, Nordlöw wrote:
 On Saturday, 23 April 2016 at 20:42:25 UTC, Lass Safin wrote:
 CPUID: https://en.wikipedia.org/wiki/CPUID.
 You can check for the presence of a lot of instructions with this
 instruction.
 However this will only work on x86 and only run-time.
Code you give a complete code example in D, please or point out a suitable place in druntime/phobos?
I just found this: https://dlang.org/phobos/core_cpuid.html#.hasPopcnt! It does exactly as it says: checks if the system has popcnt. Though read the top of https://dlang.org/phobos/core_cpuid.html before you use it:
 Bugs:
 Currently only works on x86 and Itanium CPUs. Many processors have bugs in
 their
 microcode for the CPUID instruction, so sometimes the cache information
 may be
 incorrect.
Example; import core.bitop; import core.cpuid; int count; if(hasPopcnt) count = _popcnt; // Uses x86-instruction "popcnt". else count = popcnt; // Phobos's software implementation.
version(X86) { if (hasPopcnt) ... } else { // API not guaranteed to be the same. }
Apr 25 2016
prev sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Monday, 25 April 2016 at 16:48:14 UTC, Lass Safin wrote:
 Example;

 import core.bitop;
 import core.cpuid;

 int count;
 if(hasPopcnt)
     count = _popcnt; // Uses x86-instruction "popcnt".
 else
     count = popcnt; // Phobos's software implementation.

 // Do stuff with count
That is not right (since 2.069, I think?). Adding that check just makes things slower and more complicated: core.bitop.popcnt() itself already checks hasPopcnt, and forwards to the intrinsic if it is available.
Apr 25 2016
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Mon, 25 Apr 2016 17:04:43 +0000
schrieb tsbockman <thomas.bockman gmail.com>:

 On Monday, 25 April 2016 at 16:48:14 UTC, Lass Safin wrote:
 Example;

 import core.bitop;
 import core.cpuid;

 int count;
 if(hasPopcnt)
     count = _popcnt; // Uses x86-instruction "popcnt".
 else
     count = popcnt; // Phobos's software implementation.

 // Do stuff with count  
That is not right (since 2.069, I think?). Adding that check just makes things slower and more complicated: core.bitop.popcnt() itself already checks hasPopcnt, and forwards to the intrinsic if it is available.
Can we just use __traits(hasTargetFeature, "popcnt") already and get rid of _popcnt? -- Marco
Apr 26 2016
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Tuesday, 26 April 2016 at 14:37:46 UTC, Marco Leise wrote:
 Can we just use __traits(hasTargetFeature, "popcnt") already
 and get rid of _popcnt?
No, because DMD does not currently support setting SSE4 as the minimum target. Thus, `__traits(hasTargetFeature, "popcnt")` must always return `false` at compile time. As for LDC and GDC - both already treat `popcnt()` as an intrinsic, and don't need your proposed change.
Apr 26 2016
parent Temtaime <temtaime gmail.com> writes:
On Tuesday, 26 April 2016 at 16:53:00 UTC, tsbockman wrote:
 On Tuesday, 26 April 2016 at 14:37:46 UTC, Marco Leise wrote:
 Can we just use __traits(hasTargetFeature, "popcnt") already
 and get rid of _popcnt?
No, because DMD does not currently support setting SSE4 as the minimum target. Thus, `__traits(hasTargetFeature, "popcnt")` must always return `false` at compile time. As for LDC and GDC - both already treat `popcnt()` as an intrinsic, and don't need your proposed change.
It's strange why it emits an branch. http://goo.gl/jJbSYd Det version doesn't use cmp and should be fast.
Apr 26 2016
prev sibling next sibling parent David Nadlinger <code klickverbot.at> writes:
On Saturday, 23 April 2016 at 20:34:52 UTC, Nordlöw wrote:
 So is there a way to check if popcnt builtin is available on 
 current platform?
As Andrei pointed out, it's not only popcnt being available, it's also it being fast. A function implementing the classic hand-written version compiles down to something like lea eax, [rdi - 1] test eax, edi sete al ret which is already quite good. There is also blsi to consider on Haswell and above. — David
Apr 23 2016
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Sat, 23 Apr 2016 20:34:52 +0000
schrieb Nordl=C3=B6w <per.nordlow gmail.com>:

 On Saturday, 23 April 2016 at 17:28:21 UTC, Andrei Alexandrescu=20
 wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > (x=20
 - 1) over
 it. -- Andrei =20
=20 So is there a way to check if popcnt builtin is available on=20 current platform?
We argued in another, unrelated thread ("Any usable SIMD implementation?") for compile-time information on the target. That would include __traits("hasTargetFeature", "popcnt") A runtime check requires at least a read of a global variable and I doubt it is faster than `x & (x-1)`. --=20 Marco
Apr 23 2016
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu 
wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 
 1) over it.
I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
Apr 23 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/24/2016 02:57 AM, deadalnix wrote:
 On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1)
 over it.
I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- Andrei
Apr 24 2016
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:
 On 04/24/2016 02:57 AM, deadalnix wrote:
 On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1)
 over it.
I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- Andrei
This sort of stuff should go on wiki.dlang.org page!
Apr 24 2016
parent reply Temtaime <temtaime gmail.com> writes:
On Sunday, 24 April 2016 at 21:45:32 UTC, Walter Bright wrote:
 On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:
 On 04/24/2016 02:57 AM, deadalnix wrote:
 On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei 
 Alexandrescu wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > 
 (x - 1)
 over it.
I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- Andrei
This sort of stuff should go on wiki.dlang.org page!
Please no cmp. Just bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }
Apr 24 2016
next sibling parent reply David Nadlinger <code klickverbot.at> writes:
On Sunday, 24 April 2016 at 23:00:56 UTC, Temtaime wrote:
 Please no cmp.
 Just
 bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }
You do realise that this will (typically) emit a branch? — David
Apr 24 2016
parent reply Xinok <xinok live.com> writes:
On Sunday, 24 April 2016 at 23:17:53 UTC, David Nadlinger wrote:
 On Sunday, 24 April 2016 at 23:00:56 UTC, Temtaime wrote:
 Please no cmp.
 Just
 bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }
You do realise that this will (typically) emit a branch? — David
I compiled using dmd -O and looked at the disassembly using objdump -D and there are indeed a couple branches in the disassembly: 0000000000000000 <_D4test8powerOf2FkZk>: 0: 50 push %rax 1: 48 89 f9 mov %rdi,%rcx 4: 85 c9 test %ecx,%ecx 6: 74 0a je 12 <_D4test8powerOf2FkZk+0x12> 8: 8d 81 ff ff ff ff lea -0x1(%rcx),%eax e: 85 c1 test %eax,%ecx 10: 74 04 je 16 <_D4test8powerOf2FkZk+0x16> 12: 31 c0 xor %eax,%eax 14: eb 05 jmp 1b <_D4test8powerOf2FkZk+0x1b> 16: b8 01 00 00 00 mov $0x1,%eax 1b: 59 pop %rcx 1c: c3 retq 1d: 0f 1f 00 nopl (%rax) I modified David's solution a bit to (hopefully) eliminate the branch: bool powerOf2(uint x){ return !x ^ !(x & (x - 1)); } 0000000000000000 <_D4test8powerOf2FkZb>: 0: 50 push %rax 1: 53 push %rbx 2: 48 89 fa mov %rdi,%rdx 5: 85 d2 test %edx,%edx 7: 0f 94 c0 sete %al a: 8d 8a ff ff ff ff lea -0x1(%rdx),%ecx 10: 85 ca test %ecx,%edx 12: 0f 94 c3 sete %bl 15: 32 c3 xor %bl,%al 17: 5b pop %rbx 18: 59 pop %rcx 19: c3 retq 1a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) The branches are gone but I'm really not sure if this is better or worse.
Apr 24 2016
next sibling parent Xinok <xinok live.com> writes:
On Monday, 25 April 2016 at 01:17:48 UTC, Xinok wrote:
 ...
Sorry, didn't mean to say David's solution. Too many edits. >_<
Apr 24 2016
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/24/16 9:17 PM, Xinok wrote:
 I modified David's solution a bit to (hopefully) eliminate the branch:
 bool powerOf2(uint x){ return !x ^ !(x & (x - 1)); }

      0000000000000000 <_D4test8powerOf2FkZb>:
         0:   50                      push   %rax
         1:   53                      push   %rbx
         2:   48 89 fa                mov    %rdi,%rdx
         5:   85 d2                   test   %edx,%edx
         7:   0f 94 c0                sete   %al
         a:   8d 8a ff ff ff ff       lea    -0x1(%rdx),%ecx
        10:   85 ca                   test   %ecx,%edx
        12:   0f 94 c3                sete   %bl
        15:   32 c3                   xor    %bl,%al
        17:   5b                      pop    %rbx
        18:   59                      pop    %rcx
        19:   c3                      retq
        1a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

 The branches are gone but I'm really not sure if this is better or worse.
With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- Andrei
Apr 24 2016
parent reply Solomon E <default avatar.org> writes:
On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu 
wrote:
 With gdc https://godbolt.org/g/jcU4np isPow2B is the winner 
 (shortest code, simplest instructions). -- Andrei
I generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)
bool isPow2B(T)(T x)
{
    return (x & -x) > (x - 1);
}
bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
Apr 25 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/25/16 6:42 AM, Solomon E wrote:
 On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:
 With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest
 code, simplest instructions). -- Andrei
I generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)
 bool isPow2B(T)(T x)
 {
    return (x & -x) > (x - 1);
 }
bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
Apr 25 2016
next sibling parent reply Xinok <xinok live.com> writes:
On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu 
wrote:
 On 4/25/16 6:42 AM, Solomon E wrote:
 On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu 
 wrote:
 With gdc https://godbolt.org/g/jcU4np isPow2B is the winner 
 (shortest
 code, simplest instructions). -- Andrei
I generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)
 bool isPow2B(T)(T x)
 {
    return (x & -x) > (x - 1);
 }
bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
Brute force. http://dpaste.dzfl.pl/882d7cdc5f74
Apr 25 2016
next sibling parent reply Dominikus Dittes Scherkl <Dominikus.Scherkl continental-corporation.com> writes:
On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:
 On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu 
 wrote:
 On 4/25/16 6:42 AM, Solomon E wrote:
 On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu 
 wrote:
 With gdc https://godbolt.org/g/jcU4np isPow2B is the winner 
 (shortest
 code, simplest instructions). -- Andrei
I generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)
 bool isPow2B(T)(T x)
 {
    return (x & -x) > (x - 1);
 }
bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
Brute force. http://dpaste.dzfl.pl/882d7cdc5f74
Yeah. And your test spares the failed case int.min (0x80000000), because in this case x & -x is negative, but of cause x>>>1 is positive, so it returns false despite -(2^^31) is in fact a power of two. So this requires an extra cast to work correct (in fact no big difference in the assembly): return (Unsigned!T)(x & -x) > (Unsigned!T)(x >>> 1);
Apr 25 2016
parent reply Xinok <xinok live.com> writes:
On Monday, 25 April 2016 at 15:35:14 UTC, Dominikus Dittes 
Scherkl wrote:
 On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:
 Brute force.

 http://dpaste.dzfl.pl/882d7cdc5f74
Yeah. And your test spares the failed case int.min (0x80000000), because in this case x & -x is negative, but of cause x>>>1 is positive, so it returns false despite -(2^^31) is in fact a power of two. So this requires an extra cast to work correct (in fact no big difference in the assembly): return (Unsigned!T)(x & -x) > (Unsigned!T)(x >>> 1);
How is it wrong? Negative numbers are NOT powers of two (unless you have an imaginary/complex exponent), so it's still correct to return false for -int.min. Should it not? http://www.wolframalpha.com/input/?i=solve+2^n+%3D+-%282^31%29+for+n
Apr 26 2016
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 26.04.2016 17:27, Xinok wrote:
 On Monday, 25 April 2016 at 15:35:14 UTC, Dominikus Dittes Scherkl wrote:
 On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:
 Brute force.

 http://dpaste.dzfl.pl/882d7cdc5f74
Yeah. And your test spares the failed case int.min (0x80000000), because in this case x & -x is negative, but of cause x>>>1 is positive, so it returns false despite -(2^^31) is in fact a power of two. So this requires an extra cast to work correct (in fact no big difference in the assembly): return (Unsigned!T)(x & -x) > (Unsigned!T)(x >>> 1);
How is it wrong? Negative numbers are NOT powers of two (unless you have an imaginary/complex exponent), so it's still correct to return false for -int.min. Should it not? http://www.wolframalpha.com/input/?i=solve+2^n+%3D+-%282^31%29+for+n
static assert(2^^31==int.min); // :o)
Apr 26 2016
prev sibling parent Solomon E <default avatar.org> writes:
On Monday, 25 April 2016 at 15:27:02 UTC, Xinok wrote:
 On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu 
 wrote:
 On 4/25/16 6:42 AM, Solomon E wrote:
 ...
 I generalized function isPow2B to work for negative numbers 
 in case of a
 signed type, while keeping the assembly the same or better. 
 ...
 [...]
bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } ...
That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
Brute force. http://dpaste.dzfl.pl/882d7cdc5f74
Brute force? That doesn't prove the algorithm, it only proves the implementation.
Apr 25 2016
prev sibling next sibling parent reply Solomon E <default avatar.org> writes:
On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu 
wrote:
...
That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
First I wrote the following variants of isPow2A. It's more obvious that they're correct, but they didn't compile to fewer instructions. bool isPow2D(T)(T x) { return (x > 0) && !(x & (x - 1)); } (The variant isPow2E avoids a jump instruction:) bool isPow2E(T)(T x) { return (x > 0) & !(x & (x - 1)); } So at least I had some functions that I knew were doing what I wanted (handling negative values for sure) to test against. Then I tried some other things that didn't work, then the bit shift variant, isPow2F. It seemed to work, I looked at at long list of results in the terminal and saw what the pattern is, but right, I shouldn't trust it without a proof. (x & -x) evaluates to the value of the lowest bit set in x. Either (x & -x) == x, which is a power of 2, or (x & -x) == x minus the value of all the bits above the lowest bit set. (x >>> 1) is equal to half of x or one less than half of x, or a positive number if x is negative. For (x & -x) > (x >>> 1) to be true, that would either be because x is a power of 2, or else x - n > x/2 - 1 (or x - n > x/2, which is implied by that) where n is the value of the bits above the lowest bit set, which means n >= 2/3*x. So x - n > x/2 - 1 would be 1/3*x > x/2 - 1, which would be 0 > 1/6*x - 1. That's impossible for x > 6. For values of x under 6, the results check out case by case. When x is negative in a signed type, the unsigned shift treats x as if it were an unsigned number. The only particular bit arrangement that's treated differently when using a signed type is that -(2^^32) isn't a power of two because it's negative, so the expression (x & -x) should evaluate to a negative value before the comparison. That seems to work for a byte or a short, where (x & -x) comes out negative when at the min value. I've written this clumsily because I'm not sure what sort of symbols to use and don't know how to write mathematical proofs for algorithms in programming in general, but I thought it was interesting that writing it down and double checking required showing how close the algorithm cuts it. For example, x = 6144, (x & -x) == 2048, (x >>> 1) == 3072. The cases for 0 to 6, when using signed types for x: algo B algo F x 0 B true F false (x & -x) == 0, (x >>> 1) == 0 x 1 B true F true (x & -x) == 1, (x >>> 1) == 0 x 2 B true F true (x & -x) == 2, (x >>> 1) == 1 x 3 B false F false (x & -x) == 1, (x >>> 1) == 1 x 4 B true F true (x & -x) == 4, (x >>> 1) == 2 x 5 B false F false (x & -x) == 1, (x >>> 1) == 2 x 6 B false F false (x & -x) == 2, (x >>> 1) == 3
Apr 25 2016
parent reply Solomon E <default avatar.org> writes:
On Monday, 25 April 2016 at 16:53:20 UTC, Solomon E wrote:
 ...

 (x >>> 1) is equal to half of x or one less than half of x, or 
 a positive number if x is negative.

 ...
Obviously wrong explanation, sorry. (x >>> 1) is equal to half of x or half of one less than x, or a positive number if x is negative. There are other parts in isPow2F where I'm not sure exactly what the bits are doing, such as how the compiler makes the result negative for (x & -x) at the bit and short min values, where in other expressions while learning D I've found an implicit cast to int that I worked around by casting back to the type I put in. (I guess that's the way it's supposed to be for compatibility with C/C++ expectations.) Also I skipped mentioning case x = 6 works too, not just cases over and under 6. It's just a sketch of a proof. I would use the other function I wrote first, isPow2D, to be more sure of the logic, although it's not the most optimized, or I'd use a library function that promises to do the task for all types if I found one first, rather than the isPow2 variants I wrote and others wrote, and rather than popcnt. I'm more concerned with getting the application logic right when I program than optimizing for speed or bytes, until something is noticeably pausing or taking extra megabytes. bool isPow2D(T)(T x) { return (x > 0) && !(x & (x - 1)); } Just now I tested popcnt against isPow2D and found that popcnt reports the number of bits set in a uint, regardless of popcnt's argument being a smaller size integer type or signed. That would make (popcnt( -(2^^31) ) == 1) yield true, which isn't what I want for an isPow2 function. popcnt is only documented to take uint or ulong, so that's not a bug in the library, only in my try at using it roughly.
Apr 25 2016
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/25/2016 03:21 PM, Solomon E wrote:
 There are other parts in isPow2F where I'm not sure exactly what the
 bits are doing, such as how the compiler makes the result negative
I suggest you simply consider x unsigned. There is no value to the discussion brought by negative numbers, and implementation-wise you can insert a cast or let the overload mechanism only use unsigned integers. -- Andrei
Apr 25 2016
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/25/2016 03:21 PM, Solomon E wrote:
 I'm more concerned with getting the application
 logic right when I program than optimizing for speed or bytes, until
 something is noticeably pausing or taking extra megabytes.
This is a luxury not available to library writers.
 bool isPow2D(T)(T x)
 {
      return (x > 0) && !(x & (x - 1));
 }
This is worse than what we have now, which is easy to show is correct. Andrei
Apr 25 2016
parent reply Solomon E <default avatar.org> writes:
On Monday, 25 April 2016 at 19:37:45 UTC, Andrei Alexandrescu 
wrote:
 On 04/25/2016 03:21 PM, Solomon E wrote:
 I'm more concerned with getting the application
 logic right when I program than optimizing for speed or bytes, 
 until
 something is noticeably pausing or taking extra megabytes.
This is a luxury not available to library writers.
I agree.
 bool isPow2D(T)(T x)
 {
      return (x > 0) && !(x & (x - 1));
 }
This is worse than what we have now, which is easy to show is correct. Andrei
I wasn't recommending isPow2D for the library, just for the luxury use case of writing cautiously to show the logic is correct. It's easy to show that the current code for experimental.common.isPowerOf2 (the same logic as isPow2B) does wrong things when a signed type is forced into it. I was hoping to suggest that if isPow2F isn't wrong, it might be used, because it has been proved and checked to be correct in this thread. However, I don't know if it actually is faster, and it would be difficult to test the speed up, if any, since the function variants are all so small, and isPow2F compiles to different instructions when used with different types. So it might not be worthwhile, unless there's an advantage of correctness or type-safety, which I'm not experienced enough in D to judge. (The library's descriptions of std.math.nextPow2 are vastly oversimplified compared to what it does as shown in the examples, so attempting to use that for recognizing simple powers of 2 would be a complexity mismatch.)
Apr 25 2016
parent Solomon E <default avatar.org> writes:
On Monday, 25 April 2016 at 22:15:31 UTC, Solomon E wrote:
 ...

 I was hoping to suggest that if isPow2F isn't wrong, it might 
 be used, because it has been proved and checked to be correct 
 in this thread. ...
On second thought, no one's going to understand why the code (x & -x) > (x >>> 1) parameterized for all integer types is in the library, if it ever breaks because of some compiler backend problem with the different generated code for the different integer types. So I'll take back suggesting that for the library, (at least until I or someone else understands why it would never break like that.) The ways to improve D that occur to me based on this are: (1) It would be better to have a compiler warning on passing signed types to unsigned parameter functions. There wasn't with popcnt, on GDC 5.2.1. There should be more of other warnings like that too. I like to compile with practically all the warnings & errors on when I try writing a little C, such as a chess engine with no heap allocation that I wrote in January. -Wall -Wextra -fno-common -Wcast-align -Wredundant-decls -Wbad-function-cast -Wwrite-strings -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wformat=2 -Wdouble-promotion -Winit-self -Wmissing-include-dirs -Wswitch-default -Wswitch-enum -Wfloat-equal -Wconversion -Wlogical-op -Wredundant-decls -Wjump-misses-init -Wshadow -Wcast-qual -Wvla -Wsuggest-attribute=pure -fipa-pure-const -Wsuggest-attribute=const No segfaults from that program, which was not the case with learning to make a thread local class variable in D. (2) More thorough, precise descriptions in the online library documentation. Some things are told in very loose terms, with unexpected complexities shown in the examples. See std.math.nextPow2 for what I'm talking about. I'd like to rewrite the docs, if I could, or maybe an alternate expanded version, but it's a lot and it's a moving target.
Apr 25 2016
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu 
wrote:
 On 4/25/16 6:42 AM, Solomon E wrote:
 On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu 
 wrote:
 With gdc https://godbolt.org/g/jcU4np isPow2B is the winner 
 (shortest
 code, simplest instructions). -- Andrei
I generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)
 bool isPow2B(T)(T x)
 {
    return (x & -x) > (x - 1);
 }
bool isPow2F(T)(T x) { return (x & -x) > (x >>> 1); } assembly diff: -- subl $1, %edi ++ shrl %edi
That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
x & -x is the smallest power of 2 that divides x. Basically, if x = xxxx1000 , x & -x = 00001000 . This is easy to proves considering -x = ~x + 1 . Now, x >> 1 will be of the form 0xxxx100 . If one of these digit is one, then (x >> 1) >= (x & -x) . If they are all zeros, (x >> 1) < (x & -x) which is the case where you have a power of 2. 0 is a special case, can it can be checked that the function return false for this specific input. This looks like it is correct.
Apr 25 2016
parent reply Dominikus Dittes Scherkl <Dominikus.Scherkl continental-corporation.com> writes:
On Monday, 25 April 2016 at 22:42:38 UTC, deadalnix wrote:
 x & -x is the smallest power of 2 that divides x. Basically, if 
 x = xxxx1000 , x & -x = 00001000 . This is easy to proves 
 considering -x = ~x + 1 .

 Now, x >> 1 will be of the form 0xxxx100 . If one of these 
 digit is one, then (x >> 1) >= (x & -x) . If they are all 
 zeros, (x >> 1) < (x & -x) which is the case where you have a 
 power of 2.

 0 is a special case, can it can be checked that the function 
 return false for this specific input.

 This looks like it is correct.
Yes. Except for the case 0x80000000 (= int.min), because this is negative so NOT smaller than 0x40000000 (= int.min>>>1), which is considered positive. So the algorithm doesn't work for signed integers (without extra cast).
Apr 26 2016
next sibling parent Matthias Bentrup <matthias.bentrup googlemail.com> writes:
On Tuesday, 26 April 2016 at 08:12:02 UTC, Dominikus Dittes 
Scherkl wrote:
 On Monday, 25 April 2016 at 22:42:38 UTC, deadalnix wrote:
 x & -x is the smallest power of 2 that divides x. Basically, 
 if x = xxxx1000 , x & -x = 00001000 . This is easy to proves 
 considering -x = ~x + 1 .

 Now, x >> 1 will be of the form 0xxxx100 . If one of these 
 digit is one, then (x >> 1) >= (x & -x) . If they are all 
 zeros, (x >> 1) < (x & -x) which is the case where you have a 
 power of 2.

 0 is a special case, can it can be checked that the function 
 return false for this specific input.

 This looks like it is correct.
Yes. Except for the case 0x80000000 (= int.min), because this is negative so NOT smaller than 0x40000000 (= int.min>>>1), which is considered positive. So the algorithm doesn't work for signed integers (without extra cast).
Well, it depends a bit on how you define "is a power of two": If you define "x is a power of 2" as "there is a natural number n, so that x == 2*2*...*2 (n-times)" with * defined as multiplication in the ring of integers modulo 2^32 (i.e. int or uint), then int.min and 0 are powers of two because the multiplication overflows. If you define "x is a power of 2" as above but based on the multiplication in the ring of integers, then int.min (== -2^31) and 0 are not powers of two.
Apr 26 2016
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 26 April 2016 at 08:12:02 UTC, Dominikus Dittes 
Scherkl wrote:
 On Monday, 25 April 2016 at 22:42:38 UTC, deadalnix wrote:
 x & -x is the smallest power of 2 that divides x. Basically, 
 if x = xxxx1000 , x & -x = 00001000 . This is easy to proves 
 considering -x = ~x + 1 .

 Now, x >> 1 will be of the form 0xxxx100 . If one of these 
 digit is one, then (x >> 1) >= (x & -x) . If they are all 
 zeros, (x >> 1) < (x & -x) which is the case where you have a 
 power of 2.

 0 is a special case, can it can be checked that the function 
 return false for this specific input.

 This looks like it is correct.
Yes. Except for the case 0x80000000 (= int.min), because this is negative so NOT smaller than 0x40000000 (= int.min>>>1), which is considered positive. So the algorithm doesn't work for signed integers (without extra cast).
You got to explain me how you end up with a negative number by multiplying 2 by 2 a bunch of time.
Apr 26 2016
parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Tuesday, 26 April 2016 at 18:04:53 UTC, deadalnix wrote:
 You got to explain me how you end up with a negative number by 
 multiplying 2 by 2 a bunch of time.
You gotta scrawl your numbers down badly so one of the 2's looks like a minus sign. Or go so large that the universe caves in on itself.
Apr 26 2016
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/24/16 7:00 PM, Temtaime wrote:
 On Sunday, 24 April 2016 at 21:45:32 UTC, Walter Bright wrote:
 On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:
 On 04/24/2016 02:57 AM, deadalnix wrote:
 On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:
 Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1)
 over it.
I'm not sure why do you test against x - 1 when you could test for equality. Not only it looks like it is going to require an extra computation (x - 1) but also it doesn't work for 0.
So if you do (x & -x) == x that returns 1 for x == 0. For many applications you want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the right hand side is a large number. -- Andrei
This sort of stuff should go on wiki.dlang.org page!
Please no cmp.
Why? -- Andrei
Apr 24 2016
prev sibling next sibling parent deadalnix <deadalnix gmail.com> writes:
On Saturday, 23 April 2016 at 13:04:00 UTC, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether 
 its single integer parameter is an exact power of two.

 I guess

 https://dlang.org/phobos/std_math.html#.truncPow2

 or

 https://dlang.org/phobos/std_math.html#.nextPow2

 could be reused, but I bet there's a more efficient way of 
 checking this.
Note that (A | B) & -(A | B) will give you the minimal power of 2 that divide A and B. Now, if A == B, you get (A & -A) == A to test if A is a power of 2.
Apr 23 2016
prev sibling parent Shachar Shemesh <shachar weka.io> writes:
On 23/04/16 16:04, Nordlöw wrote:
 Wanted: CT-trait and run-time predicate for checking whether its single
 integer parameter is an exact power of two.

 I guess

 https://dlang.org/phobos/std_math.html#.truncPow2

 or

 https://dlang.org/phobos/std_math.html#.nextPow2

 could be reused, but I bet there's a more efficient way of checking this.
The standard way (assuming 2's complement machine) is: if( i == (-i & (~i)) ) writeln("Power of 2"); Works for all 2's complement integers except 0, where it gives a false positive. Shachar
Apr 25 2016