digitalmars.D - How about a Hash template?

Andrej Mitrovic (15/15) Apr 27 2011 I often see code written like this:

Alexander (7/9) Apr 28 2011 The problem is, "potentially" means "for [relatively] large sets", as ...

Andrej Mitrovic (18/18) Apr 28 2011 You seem to be right. It looks like DMD is pretty good at optimizing

Alexander (6/10) Apr 28 2011 Make it 16 values and run again :) Sure, the more elements we have, th...

Andrej Mitrovic (16/16) Apr 28 2011 "InSet!(var, values...)."

Peter Alexander (7/23) Apr 28 2011 What's wrong with

Andrej Mitrovic (1/1) Apr 28 2011 It works, but it makes the if statement ugly again. :)

Alexander (4/5) Apr 28 2011 Well, just a bit - still much better than (var == "..." || var = "..."...

Andrei Alexandrescu (5/9) Apr 28 2011 That was discussed, too. Walter and I support that although it's a

Timon Gehr (9/22) Apr 28 2011 would be really nice, of course :)

Alexander (4/6) Apr 28 2011 +1. It would be very useful and convenient this way, especially if sea...

Torarin (6/12) Apr 28 2011 rch will be optimized to series of comparisons for small item count.

Jacob Carlborg (6/30) Apr 29 2011 This has been discussed many times before and the general argument

Kai Meyer (21/57) May 02 2011 The only thing I can compare this with is python's 'in' operator. It's

Andrej Mitrovic (6/6) Apr 28 2011 Well, DMD accepts this kind of syntax already:

Jacob Carlborg (7/13) Apr 29 2011 Wouldn't it be nice if you could do something like this:

Robert Clipsham (9/23) Apr 29 2011 Using infix operators you can do:
Peter Alexander (4/18) Apr 29 2011 Sure, if you like ambiguous grammars that break a fundamental property

Jacob Carlborg (4/27) Apr 29 2011 Well, I didn't think THAT far ahead :) . It just looks like a nice synta...

Jonathan M Davis (7/16) Apr 28 2011 That should evaluate each of the expressions in order and take the value...
Andrej Mitrovic (2/2) Apr 28 2011 I know it's not doing that, it was a hypothetical. What I'm saying is

Jonathan M Davis (6/8) Apr 28 2011 Well, it would almost certainly be bad practice to do so. The comma oper...
Denis Koroskin (59/74) Apr 29 2011 Try this:

Andrei Alexandrescu (21/44) Apr 29 2011 [snip]

KennyTM~ (5/51) Apr 29 2011 This is generally 2~3 times slower than a compile-time solution if the

Andrei Alexandrescu (83/83) Apr 29 2011 On 4/29/11 2:44 PM, KennyTM~ wrote:

Alexander (4/5) Apr 29 2011 [snip]

Denis Koroskin (4/10) Apr 29 2011 That's just an optimizer issue. Even if you replace the code body with

Andrei Alexandrescu (4/18) Apr 29 2011 Yah. I think the difference is immaterial outside inner loops. If it

Alexander (3/4) Apr 29 2011 Outside of inner loops even 20x is not a real problem ;)

Alexander (3/4) Apr 29 2011 2x? Hardly, IMHO.

Sean Cavanaugh (14/19) Apr 30 2011 When understanding the CPU platform you are on, one of the benchmarks

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I often see code written like this:

if (value == somevalue ||
   value == someothervalue ||
   value == yetanothervalue);

You could use the switch statement. But that introduces indentation,
and is rarely used for a couple of values.

I really like the "in" keyword, and I also like hashes since they can
potentially speed up look-ups compared to conventional arrays. So I
thought it would be cool to have a Hash template that constructs an
associative array which you can use especially in if statements when
you just want to know if a runtime value matches some predetermined
value.

Here's my attempt at writing it:
http://codepad.org/c4sYDSyR

Whaddya think?

Apr 27 2011

Alexander <aldem+dmars nk7.net> writes:

On 28.04.2011 01:10, Andrej Mitrovic wrote:

 I really like the "in" keyword, and I also like hashes since they can
 potentially speed up look-ups compared to conventional arrays.

  The problem is, "potentially" means "for [relatively] large sets", as hashing
overhead will kill the benefits for smaller sets.

  For 3, 5, 10 values chained if-then-else (or switch) will be *much* faster,
especially for basic types (and strings), especially when values are compared
according to frequency of usage.

  Just run a benchmark - you will be surprised :)

  For more than 10 values - I doubt that inline Hash! will be really readable,
I would rather define an array somewhere else.

  OTOH, I like the idea of providing list of values "inline" for 'in' operator,
this, obviously, improves readability.

/Alexander

Apr 28 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

You seem to be right. It looks like DMD is pretty good at optimizing
if statements. The Hash version is quite a bit slower actually.

Well, that's what happens when I assume things. :)

Here's a little benchmark of searching an array and hash of randomly
generated strings:
http://pastebin.com/GddUn3e3

On my system (in milliseconds):
Length: 128

Hash  Foreach: 334
Array Foreach: 61

Hash  Lookup:  6
Array Lookup:  31

I guess what I was really looking for is syntax sugar that expands this:
if (variable in value1, value2, value3, value4)

into this:
if (variable == value1 || variable == value2...)

or something of that sort. Otherwise my Hash/hash functions are pretty
useless. Sorry for not researching before opening the topic.

Apr 28 2011

Alexander <aldem+dmars nk7.net> writes:

On 28.04.2011 16:25, Andrej Mitrovic wrote:

 Here's a little benchmark of searching an array and hash of randomly
 generated strings:

  Make it 16 values and run again :) Sure, the more elements we have, the
better hash performs.

  But in case of ints/floats, for instance, for small sets especially, the
difference will be even more significant.

  Chained ifs are better than array search (for fixed and known sets of values,
of course).

 I guess what I was really looking for is syntax sugar that expands this:
 if (variable in value1, value2, value3, value4)

  Exactly - *that* would be very nice :) I believe this is easily doable using
templates & mixins, though, not using 'in' operator, but something like
InSet!(var, values...).

/Alexander

Apr 28 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

"InSet!(var, values...)."

That wouldn't work because var is known only at runtime. Ideally I'd
like to write something like:

string needle = "foo";
if (needle.InSet!("bar", "barfoo", "doo", "foo")) { }

And InSet would mixin and call a function such as:
bool inSet(string needle)
{
    if (needle = "bar" || needle = "barfoo" || needle == "doo"
        || needle == "foo") return true;
    return false;
}

However I don't think this is possible. I want the string literals to
be binded at compile time, and needle to be binded at runtime with the
UFC syntax. However in this case the compiler tries to pass needle as
a compile-time argument, which fails.

Apr 28 2011

Peter Alexander <peter.alexander.au gmail.com> writes:

On 28/04/11 4:24 PM, Andrej Mitrovic wrote:
 "InSet!(var, values...)."

 That wouldn't work because var is known only at runtime. Ideally I'd
 like to write something like:

 string needle = "foo";
 if (needle.InSet!("bar", "barfoo", "doo", "foo")) { }

 And InSet would mixin and call a function such as:
 bool inSet(string needle)
 {
      if (needle = "bar" || needle = "barfoo" || needle == "doo"
          || needle == "foo") return true;
      return false;
 }

 However I don't think this is possible. I want the string literals to
 be binded at compile time, and needle to be binded at runtime with the
 UFC syntax. However in this case the compiler tries to pass needle as
 a compile-time argument, which fails.

What's wrong with

InSet!(T, T values...)(T needle)

?

string foo = "b";
if (InSet!("a", "b", "c")(foo))
   ...

Apr 28 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

It works, but it makes the if statement ugly again. :)

Apr 28 2011

Alexander <aldem+dmars nk7.net> writes:

On 28.04.2011 17:46, Andrej Mitrovic wrote:

 It works, but it makes the if statement ugly again. :)

  Well, just a bit - still much better than (var == "..." || var = "..." || var
== "..." ...), don't you think so? ;)

  If compiler would support special "in" usage, automatically expanding values
on the right to series of comparisons (like "var in [1,2,3,4,5]"), it would be
really nice, of course :)

/Alexander

Apr 28 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 4/28/11 11:00 AM, Alexander wrote:
 On 28.04.2011 17:46, Andrej Mitrovic wrote:

 It works, but it makes the if statement ugly again. :)

    Well, just a bit - still much better than (var == "..." || var = "..." ||
var == "..." ...), don't you think so? ;)

    If compiler would support special "in" usage, automatically expanding
values on the right to series of comparisons (like "var in [1,2,3,4,5]"), it
would be really nice, of course :)

That was discussed, too. Walter and I support that although it's a 
special case. We didn't get around to talk the purist police into 
accepting it.

Andrei

Apr 28 2011

Timon Gehr <timon.gehr gmx.ch> writes:

Andrei Alexandrescu wrote:
 On 4/28/11 11:00 AM, Alexander wrote:
 On 28.04.2011 17:46, Andrej Mitrovic wrote:

 It works, but it makes the if statement ugly again. :)

    Well, just a bit - still much better than (var == "..." || var = "..." >||


var == "..." ...), don't you think so? ;)
    If compiler would support special "in" usage, automatically expanding
values on the right to series of comparisons (like "var in [1,2,3,4,5]"), >>it


would be really nice, of course :)
 That was discussed, too. Walter and I support that although it's a
 special case. We didn't get around to talk the purist police into
 accepting it.

 Andrei

Why is this a special case? The 'in' could be extended operator to generally
work
on arrays (using a simple linear search). The compiler could then optimize the
given expression the way suggested (actually even to 1<=var&&var<=5).

Why is 'in' not currently defined on arrays? To me it seems like a left-out that
should be fixed, because it has quite obvious and useful semantics.

Timon

Apr 28 2011

Alexander <aldem+dmars nk7.net> writes:

On 28.04.2011 21:46, Timon Gehr wrote:

 Why is 'in' not currently defined on arrays? To me it seems like a left-out
that
 should be fixed, because it has quite obvious and useful semantics.

  +1. It would be very useful and convenient this way, especially if search
will be optimized to series of comparisons for small item count.

  Perhaps we should file a feature request? :)

/Alexander

Apr 28 2011

Torarin <torarind gmail.com> writes:

2011/4/29 Alexander <aldem+dmars nk7.net>:
 On 28.04.2011 21:46, Timon Gehr wrote:

 Why is 'in' not currently defined on arrays? To me it seems like a left-=


out that
 should be fixed, because it has quite obvious and useful semantics.

 =A0+1. It would be very useful and convenient this way, especially if sea=

rch will be optimized to series of comparisons for small item count.
 =A0Perhaps we should file a feature request? :)

 /Alexander

This has been discussed before, see:
http://www.mail-archive.com/digitalmars-d puremagic.com/msg38779.html

Torarin

Apr 28 2011

Jacob Carlborg <doob me.com> writes:

On 2011-04-28 21:46, Timon Gehr wrote:
 Andrei Alexandrescu wrote:
 On 4/28/11 11:00 AM, Alexander wrote:
 On 28.04.2011 17:46, Andrej Mitrovic wrote:

 It works, but it makes the if statement ugly again. :)

     Well, just a bit - still much better than (var == "..." || var = "...">||


 var == "..." ...), don't you think so? ;)
     If compiler would support special "in" usage, automatically expanding
 values on the right to series of comparisons (like "var in [1,2,3,4,5]"),>>it


 would be really nice, of course :)
 That was discussed, too. Walter and I support that although it's a
 special case. We didn't get around to talk the purist police into
 accepting it.

 Andrei

 Why is this a special case? The 'in' could be extended operator to generally
work
 on arrays (using a simple linear search). The compiler could then optimize the
 given expression the way suggested (actually even to 1<=var&&var<=5).

 Why is 'in' not currently defined on arrays? To me it seems like a left-out
that
 should be fixed, because it has quite obvious and useful semantics.

 Timon

This has been discussed many times before and the general argument 
against it is that "in" is available for associative arrays and it tests 
for keys. With arrays it would test for values.

-- 
/Jacob Carlborg

Apr 29 2011

Kai Meyer <kai unixlords.com> writes:

On 04/29/2011 07:13 AM, Jacob Carlborg wrote:
 On 2011-04-28 21:46, Timon Gehr wrote:
 Andrei Alexandrescu wrote:
 On 4/28/11 11:00 AM, Alexander wrote:
 On 28.04.2011 17:46, Andrej Mitrovic wrote:

 It works, but it makes the if statement ugly again. :)

 Well, just a bit - still much better than (var == "..." || var =
 "...">||


 var == "..." ...), don't you think so? ;)
 If compiler would support special "in" usage, automatically expanding
 values on the right to series of comparisons (like "var in
 [1,2,3,4,5]"),>>it


 would be really nice, of course :)
 That was discussed, too. Walter and I support that although it's a
 special case. We didn't get around to talk the purist police into
 accepting it.

 Andrei

 Why is this a special case? The 'in' could be extended operator to
 generally work
 on arrays (using a simple linear search). The compiler could then
 optimize the
 given expression the way suggested (actually even to 1<=var&&var<=5).

 Why is 'in' not currently defined on arrays? To me it seems like a
 left-out that
 should be fixed, because it has quite obvious and useful semantics.

 Timon

 This has been discussed many times before and the general argument
 against it is that "in" is available for associative arrays and it tests
 for keys. With arrays it would test for values.

The only thing I can compare this with is python's 'in' operator. It's 
purpose is different though. In python, 'in' returns a boolean, and the 
operator works for any sequence. Associative arrays can return a set of 
keys (hashed), or a list (array) of values, and the 'in' operator can be 
used on either, but defaults to keys.

Not only have I missed a 'set' data structure, but I have missed the 
'in' for arrays as well. It's value is suspect when we consider that 
currently the 'in' operator returns a pointer to the value in the 
associative array. What would we do for an array? You don't need the 
pointer to the value, because you have the value already that you're 
looking for. I see the value in returning the pointer in the associative 
array, so you don't have to do the look-up again.

My argument is that all sequences and containers should support some 
flavor of the "Do I contain this?" question. How many times are we asked 
to build a collection bucket, and then asked to retrieve something put 
in there? In the case of the OP, he has a static compile-time bucket 
that he wants to go poking around in.

If we had a built-in "set" data structure (you know, just the key 
portion of the Associative array), we would solve the OP's problem, and 
make some of my dreams come true.

May 02 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Well, DMD accepts this kind of syntax already:

if (value == val1, val2, val3) { }

I think this turns the expression value==val1 into a bool, and then
turns val2 and val3 into bools and compares against each of them. Or
something like that. It's odd and I've never seen it used anywhere.

Maybe we could put that syntax into good use.

Apr 28 2011

Jacob Carlborg <doob me.com> writes:

On 2011-04-28 18:20, Andrej Mitrovic wrote:
 Well, DMD accepts this kind of syntax already:

 if (value == val1, val2, val3) { }

 I think this turns the expression value==val1 into a bool, and then
 turns val2 and val3 into bools and compares against each of them. Or
 something like that. It's odd and I've never seen it used anywhere.

 Maybe we could put that syntax into good use.

Wouldn't it be nice if you could do something like this:

if (value == (3 || 4)) {}

And the compiler turns that into:

if (value == 3 || value == 4) {}

-- 
/Jacob Carlborg

Apr 29 2011

Robert Clipsham <robert octarineparrot.com> writes:

On 29/04/2011 10:31, Jacob Carlborg wrote:
 On 2011-04-28 18:20, Andrej Mitrovic wrote:
 Well, DMD accepts this kind of syntax already:

 if (value == val1, val2, val3) { }

 I think this turns the expression value==val1 into a bool, and then
 turns val2 and val3 into bools and compares against each of them. Or
 something like that. It's odd and I've never seen it used anywhere.

 Maybe we could put that syntax into good use.

 Wouldn't it be nice if you could do something like this:

 if (value == (3 || 4)) {}

 And the compiler turns that into:

 if (value == 3 || value == 4) {}

Using infix operators you can do:

if (value == 3 /or/ 4 /or/ 5)

See tools.base: 
http://dsource.org/projects/scrapple/browser/trunk/tools/tools/base.d#L1093

There are also variations in there for and, map, and others.

-- 
Robert
http://octarineparrot.com/

Apr 29 2011

Peter Alexander <peter.alexander.au gmail.com> writes:

On 29/04/11 10:31 AM, Jacob Carlborg wrote:
 On 2011-04-28 18:20, Andrej Mitrovic wrote:
 Well, DMD accepts this kind of syntax already:

 if (value == val1, val2, val3) { }

 I think this turns the expression value==val1 into a bool, and then
 turns val2 and val3 into bools and compares against each of them. Or
 something like that. It's odd and I've never seen it used anywhere.

 Maybe we could put that syntax into good use.

 Wouldn't it be nice if you could do something like this:

 if (value == (3 || 4)) {}

 And the compiler turns that into:

 if (value == 3 || value == 4) {}

Sure, if you like ambiguous grammars that break a fundamental property 
of expressions.

:-)

Apr 29 2011

Jacob Carlborg <doob me.com> writes:

On 2011-04-29 14:22, Peter Alexander wrote:
 On 29/04/11 10:31 AM, Jacob Carlborg wrote:
 On 2011-04-28 18:20, Andrej Mitrovic wrote:
 Well, DMD accepts this kind of syntax already:

 if (value == val1, val2, val3) { }

 I think this turns the expression value==val1 into a bool, and then
 turns val2 and val3 into bools and compares against each of them. Or
 something like that. It's odd and I've never seen it used anywhere.

 Maybe we could put that syntax into good use.

 Wouldn't it be nice if you could do something like this:

 if (value == (3 || 4)) {}

 And the compiler turns that into:

 if (value == 3 || value == 4) {}

 Sure, if you like ambiguous grammars that break a fundamental property
 of expressions.

 :-)

Well, I didn't think THAT far ahead :) . It just looks like a nice syntax.

-- 
/Jacob Carlborg

Apr 29 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

 Well, DMD accepts this kind of syntax already:
 
 if (value == val1, val2, val3) { }
 
 I think this turns the expression value==val1 into a bool, and then
 turns val2 and val3 into bools and compares against each of them. Or
 something like that. It's odd and I've never seen it used anywhere.
 
 Maybe we could put that syntax into good use.

That should evaluate each of the expressions in order and take the value of 
the last one. So, ultimately, unless there are side effects to value == val1 
or implicitly converting val2 to bool (which there shouldn't be), then the 
result of the if condition will always be boolean equivalent of val3. That's 
not doing what you're looking for at all. And making it do what you want would 
alter the behavior of the comma operator, which wouldn't be good.

- Jonathan M Davis

Apr 28 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I know it's not doing that, it was a hypothetical. What I'm saying is
I've never seen the comma used this way in code.

Apr 28 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

 I know it's not doing that, it was a hypothetical. What I'm saying is
 I've never seen the comma used this way in code.

Well, it would almost certainly be bad practice to do so. The comma operator 
can be very useful in some cases (such as for loops) but should generally be 
avoided. Here, it's utterly pointless. However, changing what it did in this 
particular case - as useful as that might be - would make the comma operator 
inconsistent, which could have serious consequences.

- Jonathan M Davis

Apr 28 2011

"Denis Koroskin" <2korden gmail.com> writes:

On Thu, 28 Apr 2011 03:10:15 +0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 I often see code written like this:

 if (value == somevalue ||
    value == someothervalue ||
    value == yetanothervalue);

 You could use the switch statement. But that introduces indentation,
 and is rarely used for a couple of values.

 I really like the "in" keyword, and I also like hashes since they can
 potentially speed up look-ups compared to conventional arrays. So I
 thought it would be cool to have a Hash template that constructs an
 associative array which you can use especially in if statements when
 you just want to know if a runtime value matches some predetermined
 value.

 Here's my attempt at writing it:
 http://codepad.org/c4sYDSyR

 Whaddya think?

Try this:


import std.stdio;

struct Set(Keys...)
{
	bool opIn_r(Key)(Key key)
	{
		// TODO: Sort!(Keys)?
		foreach (k; Keys) {
			if (k == key) {
				return true;
			}
		}
		
		return false;
	}
}

Set!(Keys) set(Keys...)() {
	return Set!(Keys)();
}

void main()
{
	int needle = 42;
	bool result = needle in set!(1, 2, 3, 4, 5); // as opposed to needle == 1  
|| needle == 2 || ...
	
	writeln(result); // prints false
	
However, this only works with compile-time constants. Here is an extension  
to that (albeit probably less efficient):

	int k = 42;
	
	result = 1 in dynset(1, k, 3, 4, 5);
	writeln(result); // prints true
}

struct DynSet(Keys...)
{
	this(Keys keys)
	{
		this.keys = keys;
	}
	
	bool opIn_r(Key)(Key key)
	{
		foreach (k; keys) {
			if (k == key) {
				return true;
			}
		}
		
		return false;
	}
	
	private Keys keys;
}

DynSet!(Keys) dynset(Keys...)(Keys keys) {
	return DynSet!(Keys)(keys);
}

Apr 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 4/29/11 8:50 AM, Denis Koroskin wrote:
 On Thu, 28 Apr 2011 03:10:15 +0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 I often see code written like this:

 if (value == somevalue ||
 value == someothervalue ||
 value == yetanothervalue);

 You could use the switch statement. But that introduces indentation,
 and is rarely used for a couple of values.

 I really like the "in" keyword, and I also like hashes since they can
 potentially speed up look-ups compared to conventional arrays. So I
 thought it would be cool to have a Hash template that constructs an
 associative array which you can use especially in if statements when
 you just want to know if a runtime value matches some predetermined
 value.

 Here's my attempt at writing it:
 http://codepad.org/c4sYDSyR

 Whaddya think?

 Try this:

[snip]

Nice work. I think this code could earn a place in Phobos:

auto either(Keys...)(Keys keys)
{
     static struct Result
     {
         bool opIn_r(Key)(Key key)
         {
             foreach (k; keys) {
                 if (k == key) {
                     return true;
                 }
             }
             return false;
         }
         private Keys keys;
     }
     return Result(keys);
}


Andrei

Apr 29 2011

KennyTM~ <kennytm gmail.com> writes:

On Apr 30, 11 03:12, Andrei Alexandrescu wrote:
 On 4/29/11 8:50 AM, Denis Koroskin wrote:
 On Thu, 28 Apr 2011 03:10:15 +0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 I often see code written like this:

 if (value == somevalue ||
 value == someothervalue ||
 value == yetanothervalue);

 You could use the switch statement. But that introduces indentation,
 and is rarely used for a couple of values.

 I really like the "in" keyword, and I also like hashes since they can
 potentially speed up look-ups compared to conventional arrays. So I
 thought it would be cool to have a Hash template that constructs an
 associative array which you can use especially in if statements when
 you just want to know if a runtime value matches some predetermined
 value.

 Here's my attempt at writing it:
 http://codepad.org/c4sYDSyR

 Whaddya think?

 Try this:

 [snip]

 Nice work. I think this code could earn a place in Phobos:

 auto either(Keys...)(Keys keys)
 {
 static struct Result
 {
 bool opIn_r(Key)(Key key)
 {
 foreach (k; keys) {
 if (k == key) {
 return true;
 }
 }
 return false;
 }
 private Keys keys;
 }
 return Result(keys);
 }


 Andrei

This is generally 2~3 times slower than a compile-time solution if the 
keys are known in advance (and the former is 1~2 times slower than using 
x == y || x == z || ... directly)

(Why do you still use opIn_r? ;) )

Apr 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 4/29/11 2:44 PM, KennyTM~ wrote:
[snip]

You need to replace the assert and compile with -O -release -inline. My 
results:

find in array: 163
compile-time 1: 10
either: 17
straight comparison: 8
compile-time 2: 11

Code:

import std.datetime, std.algorithm, std.typecons, std.stdio;

auto either(Keys...)(Keys keys)
{
     static struct Result
     {
         bool opIn_r(Key)(Key key)
         {
             foreach (k; keys) {
                 if (k == key) {
                     return true;
                 }
             }
             return false;
         }
         private Keys keys;
     }
     return Result(keys);
}

struct ctEither(keys...) {
      safe bool opBinaryRight(string op:"in", T)(in T x) const pure 
nothrow {
         foreach (k; keys)
             if (x == k)
                 return true;
         return false;
     }
}

struct ctEither2(keys...) {
      safe bool opBinaryRight(string op:"in", T)(in T x) const pure 
nothrow {
         switch (x) {
             default:
                 return false;
             foreach (k; keys) {
                 case k:
                     return true;
             }
         }
         return false;
     }
}

int p = 1234567890;

void a() {
     p !in ctEither!(1, 12, 123, 1234, 12345, 123456, 1234567, 12345678, 
123456789)() || assert(0);
}
void b() {
     p !in either(1, 12, 123, 1234, 12345, 123456, 1234567, 12345678, 
123456789) || assert(0);
}
void c() {
     ![1, 12, 123, 1234, 12345, 123456, 1234567, 12345678, 
123456789].canFind(p)  || assert(0);
}
void d() {
     !(p == 1 || p == 12 || p == 123 || p == 1234 || p == 12345 || p == 
123456 || p == 1234567 || p == 12345678 || p == 123456789)  || assert(0);
}
void e() {
     p !in ctEither2!(1, 12, 123, 1234, 12345, 123456, 1234567, 
12345678, 123456789)() || assert(0);
}

void main() {
     writeln("find in array: ", benchmark!c(1_000_000)[0].to!("msecs", 
int) );

     writeln("compile-time 1: ", benchmark!a(1_000_000)[0].to!("msecs", 
int) );
     writeln("either: ", benchmark!b(1_000_000)[0].to!("msecs", int) );
     writeln("straight comparison: ", 
benchmark!d(1_000_000)[0].to!("msecs", int) );
     writeln("compile-time 2: ", benchmark!e(1_000_000)[0].to!("msecs", 
int) );
}

Apr 29 2011

Alexander <aldem+dmars nk7.net> writes:

On 29.04.2011 21:58, Andrei Alexandrescu wrote:

 You need to replace the assert and compile with -O -release -inline. My
results:

[snip]

Still, straight comparison wins - 2x faster ;)

/Alexander

Apr 29 2011

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 30 Apr 2011 03:19:37 +0400, Alexander <aldem+dmars nk7.net> wrote:

 On 29.04.2011 21:58, Andrei Alexandrescu wrote:

 You need to replace the assert and compile with -O -release -inline. My  
 results:

 [snip]

 Still, straight comparison wins - 2x faster ;)

 /Alexander

That's just an optimizer issue. Even if you replace the code body with  
"return false;" (i.e. not found) it's STILL slower than straight  
comparison.

Apr 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 4/29/11 6:21 PM, Denis Koroskin wrote:
 On Sat, 30 Apr 2011 03:19:37 +0400, Alexander <aldem+dmars nk7.net> wrote:

 On 29.04.2011 21:58, Andrei Alexandrescu wrote:

 You need to replace the assert and compile with -O -release -inline.
 My results:

 [snip]

 Still, straight comparison wins - 2x faster ;)

 /Alexander

 That's just an optimizer issue. Even if you replace the code body with
 "return false;" (i.e. not found) it's STILL slower than straight
 comparison.

Yah. I think the difference is immaterial outside inner loops. If it 
were like 20x slower that would've been a problem.

Andrei

Apr 29 2011

Alexander <aldem+dmars nk7.net> writes:

On 30.04.2011 01:23, Andrei Alexandrescu wrote:

 Yah. I think the difference is immaterial outside inner loops. If it were like
20x slower that would've been a problem.

  Outside of inner loops even 20x is not a real problem ;)

/Alexander

Apr 29 2011

Alexander <aldem+dmars nk7.net> writes:

On 30.04.2011 01:21, Denis Koroskin wrote:

 That's just an optimizer issue. Even if you replace the code body with "return
false;" (i.e. not found) it's STILL slower than straight comparison.

2x? Hardly, IMHO.

/Alexander

Apr 29 2011

Sean Cavanaugh <WorksOnMyMachine gmail.com> writes:

On 4/29/2011 6:19 PM, Alexander wrote:
 On 29.04.2011 21:58, Andrei Alexandrescu wrote:

 You need to replace the assert and compile with -O -release -inline. My
results:

 [snip]

 Still, straight comparison wins - 2x faster ;)

 /Alexander


When understanding the CPU platform you are on, one of the benchmarks 
you can do is to measure how many linear integer comparions you can do 
vs a binary search of the same size, and graph it out.

There is a crossover point where the binary search will be faster, but 
with modern CPUs the number of linear items you can search increases 
every year.

The linear search also makes extremely good use of the cache and 
hardware prefetching, and the branches (as well as the loop itself) will 
be predicted correctly until the terminating condition is found, where 
the binary search is mispredicted 50% of the time.  The last time I 
measured the crossover point around 60 integer values, and it wouldn't 
surprise me at all that its over 100 on newer chipsets (Sandy Bridge, 
Bulldozer etc).

Apr 30 2011

D Programming

C/C++ Programming

Other

digitalmars.D - How about a Hash template?