## digitalmars.D.learn - testing bits in sequence

• spir (46/46) Dec 26 2010 Hello,
• Simon (5/11) Dec 26 2010 http://digitalmars.com/d/2.0/phobos/std_intrinsic.html
• Simen kjaeraas (12/30) Dec 26 2010 std.intrinsic's[1] bsr returns the highest non-zero bit. I'd think the
spir <denis.spir gmail.com> writes:
```Hello,

I need to test in sequence the bits of an unsigned int (see below more prec=
ision), and move in a tree accordingly. Since there are 2 possible branches=
at every step, they are encoded in a [2] array, indexed by bit. I am looki=
ng for the fastest way to get that bit.

To run backwards (MSB <-- LSB), one can simply mask and shift right, like:
auto node =3D root;
auto code =3D something;
foreach (_ ; 0..BIT_SIZE) {
node =3D node.nodes[bit];
code >>=3D 1;
}
But this looks like slow (to my surprise). I don't know what the limiting i=
nstruction is.

Also, My actual need would rather be to move forward. The reason is this al=
lows important optimisations for common cases. In fact, the codes are unico=
de code points: if the 5 first bits are 0 (tested with a mask), I can jump =
forward to a sub-tree corresponding to a code < 0x10000 (BMP, 99.999% cases=
in practice). If the 14 first bits are 0 (ditto), I can jump to the ASCII =
case, very common as well.
But this seems more difficult; or rather I have not found a direct way to e=
xtract a 0/1 bit. I had first used an array of masks [2^n,2^n-1,...,4,2,1]:
foreach (i ; 0..BIT_SIZE) {
node =3D node.nodes[bit];
}
But as you see masking that way lets a value of 2^i, not 1, in the 'true' c=
ase, which needs to be converted, either with "cast(bool)bit" or "!!bit". A=
nd this seems _very_ slow: runs about 3 times slower than backward traversa=
l.

Note: I cannot use a plain array, because there is a small proportion of ac=
tual values in the whole range (sparse array of ~ 1 per-thousand 'busy' loc=
ations). I'm looking for a possible alternative to using AAs (costly becaus=
e of division). Backward traversal currently runs sensibly slower that usin=
g an AA, but with optimisation for common cases there could be a huge gain =
in practice.
The data structure I'm trying to implement is concretely a kind "bit-trie" =

Denis
-- -- -- -- -- -- --
vit esse estrany =E2=98=A3

spir.wikidot.com
```
Dec 26 2010
Simon <s.d.hammett gmail.com> writes:
```On 26/12/2010 12:18, spir wrote:
Hello,

I need to test in sequence

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

http://digitalmars.com/d/2.0/phobos/std_intrinsic.html

--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
```
Dec 26 2010
"Simen kjaeraas" <simen.kjaras gmail.com> writes:
```spir <denis.spir gmail.com> wrote:

Also, My actual need would rather be to move forward. The reason is this
allows important optimisations for common cases. In fact, the codes are
unicode code points: if the 5 first bits are 0 (tested with a mask), I
can jump forward to a sub-tree corresponding to a code < 0x10000 (BMP,
99.999% cases in practice). If the 14 first bits are 0 (ditto), I can

std.intrinsic's[1] bsr returns the highest non-zero bit. I'd think the
masking'd be as effective when you know which bits you care about.

But this seems more difficult; or rather I have not found a direct way
to extract a 0/1 bit.

[2^n,2^n-1,...,4,2,1]:
foreach (i ; 0..BIT_SIZE) {
node = node.nodes[bit];
}
But as you see masking that way lets a value of 2^i, not 1, in the
'true' case, which needs to be converted, either with "cast(bool)bit" or
"!!bit". And this seems _very_ slow: runs about 3 times slower than
backward traversal.

This seems better implemented like this:

foreach ( i; 0..BIT_SIZE ) {
bit = ( code >> i ) & 1;
node = node.nodes[bit];
}

[1]: http://www.digitalmars.com/d/2.0/phobos/std_intrinsic.html
--
Simen
```
Dec 26 2010
spir <denis.spir gmail.com> writes:
```On Sun, 26 Dec 2010 13:40:22 +0100
"Simen kjaeraas" <simen.kjaras gmail.com> wrote:

foreach (i ; 0..BIT_SIZE) {
node =3D node.nodes[bit];
}
But as you see masking that way lets a value of 2^i, not 1, in the =20
'true' case, which needs to be converted, either with "cast(bool)bit" o=

r =20
"!!bit". And this seems _very_ slow: runs about 3 times slower than =20
backward traversal. =20

=20
This seems better implemented like this:
=20
foreach ( i; 0..BIT_SIZE ) {
bit =3D ( code >> i ) & 1;
node =3D node.nodes[bit];
}

You have not read my post carefully enough ;-) That's ~ how I coded travers=
ing the bit seq backwards (right-to-left, LSB -> MSB); but I was trying to =
find a way to traverse it forwards.
Anyway, than you for the pointer to std.intrinsic, seems helpful.

Denis
-- -- -- -- -- -- --
vit esse estrany =E2=98=A3

spir.wikidot.com
```
Dec 26 2010
"Simen kjaeraas" <simen.kjaras gmail.com> writes:
```spir <denis.spir gmail.com> wrote:

foreach ( i; 0..BIT_SIZE ) {
bit = ( code >> i ) & 1;
node = node.nodes[bit];
}

You have not read my post carefully enough ;-) That's ~ how I coded
traversing the bit seq backwards (right-to-left, LSB -> MSB); but I was
trying to find a way to traverse it forwards.

Sorry. Indeed I messed up:

foreach_reverse ( i; 0..BIT_SIZE ) {
bit = ( code >> i ) & 1;
node = node.nodes[bit];
}

or

foreach ( i; 0..BIT_SIZE ) {
bit = ( code >> ( BIT_SIZE + 1 - i ) ) & 1;
node = node.nodes[bit];
}

--
Simen
```
Dec 26 2010