## digitalmars.D.bugs - int opEquals(Object), and other legacy ints

• Stewart Gordon (23/23) Jul 20 2006 There seem to be a number of leftovers from before we had a bool type,
• Walter Bright (4/6) Jul 21 2006 They are typed as returning int for efficiency reasons. These functions
• Bruno Medeiros (5/12) Jul 21 2006 But isn't bool an int internally? Why is it less efficient to use a bool...
• Jarrett Billingsley (3/4) Jul 21 2006 It's a byte internally.
• Walter Bright (2/12) Jul 21 2006 It's a byte internally, and is constrained to be one of the values 0 or ...
• Bruno Medeiros (11/24) Jul 27 2006 Duh, it's a byte of course, I should have checked that.
• xs0 (6/23) Jul 28 2006 but if the return type is bool, it becomes
• Stewart Gordon (18/34) Jul 28 2006 If it does this, then there's a serious bug in the compiler.
• Walter Bright (8/35) Jul 28 2006 The only difference between a CMP and a SUB instruction is where the
• kris (5/54) Jul 28 2006 So, why not treat false as 0, and true as not 0? That way, it works
• Frits van Bommel (3/19) Jul 28 2006 Then what would happen if a and b differ by, say, 256? Remember, an int
• kris (14/40) Jul 28 2006 Sure, but it's generally more efficient to do all logical and arithmetic...
• Frits van Bommel (5/47) Jul 29 2006 Actually, I'm pretty sure testing for zero is already how it's done
• Stewart Gordon (16/48) Jul 29 2006 If anything resembling the above, then
• Walter Bright (31/61) Jul 29 2006 ? Let's look at an example:
• Deewiant (4/12) Jul 30 2006 (a - b), if a and b are equal ints, evaluates to 0, which is generally
• Walter Bright (4/15) Jul 30 2006 Oh, I see what you mean.
• Stewart Gordon (19/37) Jul 30 2006 Exactly. But because what we have is opEquals and not opNotEquals, the
• Bruno Medeiros (22/92) Jul 30 2006 As per the other posts, Eq2 actually takes 2 instructions:
• kris (7/119) Jul 30 2006 Yes indeed. Well spotted! On anything supporting the 386 instruction set...
• Frits van Bommel (14/30) Jul 30 2006 Interesting instruction. Seems to have the exact semantics needed for
• Lionello Lunesu (5/6) Aug 07 2006 But is it faster? I've noticed that many of the higher-level assembly
• Frits van Bommel (5/11) Aug 07 2006 Heh... You may have noticed I didn't use any word related to speed :).
• kris (18/29) Aug 07 2006 If you'd looked at the setne instruction linked previously, you'd have
• Dave (7/40) Aug 07 2006 Yea, AFAIK setne is supported by 386 onward, plus a quick check of the G...
• Bruno Medeiros (19/28) Jul 30 2006 [PS: I've read Frits answer after writing this: ]
• Walter Bright (12/20) Jul 28 2006 Consider:
• Bruno Medeiros (19/40) Jul 30 2006 Well, let's think about the other way around then. Why should bool be
• Walter Bright (3/18) Jul 30 2006 I think most programmers would find this to be very surprising behavior....
• Bruno Medeiros (9/28) Aug 01 2006 Surprising behavior? What surprising behavior, those are all
• Dave (9/12) Jul 31 2006 I consider this kind of stuff the compilers job -- so if I write or
Stewart Gordon <smjg_1998 yahoo.com> writes:
```There seem to be a number of leftovers from before we had a bool type,
and many people were using the int type to pass booleans around.

The most obvious is int opEquals(Object) defined in the Object class.
Changing this'll break a considerable amount of existing code - but then
again, the 0.163 change of making imports private by default has done so too.

But there are many functions in Phobos that can be cleaned up a bit
without doing much harm.  Just to name a few....

std.string.iswhite
std.string.inPattern
std.ctype.isalnum (indeed, most of the functions in std.ctype)
std.file.exists
std.file.isfile
std.file.isdir
std.intrinsic.bt
std.intrinsic.btc
std.intrinsic.btr
std.intrinsic.bts
std.math.isnan (and other is* functions)
std.math.signbit

Going through the other modules will probably reveal many more, but I
haven't checked.

Stewart.
```
Jul 20 2006
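As a rough illustration of the kind of clean-up being proposed, here is a minimal sketch in D. The names `legacyIsWhite` and `isWhite` are hypothetical stand-ins, not the actual Phobos functions, and the whitespace set is simplified for the example.

```d
// Hypothetical stand-in for a legacy predicate that returns int,
// where any nonzero value means "true".
int legacyIsWhite(dchar c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// The same predicate exposed with a bool result.
// Any nonzero int from the legacy function maps to true.
bool isWhite(dchar c)
{
    return legacyIsWhite(c) != 0;
}
```

Callers that only test the result in a condition behave the same with either signature; the difference shows up when the result is stored, compared, or overloaded on.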
Walter Bright <newshound digitalmars.com> writes:
```Stewart Gordon wrote:
There seem to be a number of leftovers from before we had a bool type,
and many people were using the int type to pass booleans around.

They are typed as returning int for efficiency reasons. These functions
often appear in performance critical loops, where an extra instruction
or two makes a difference.
```
Jul 21 2006
Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
```Walter Bright wrote:
Stewart Gordon wrote:
There seem to be a number of leftovers from before we had a bool type,
and many people were using the int type to pass booleans around.

They are typed as returning int for efficiency reasons. These functions
often appear in performance critical loops, where an extra instruction
or two makes a difference.

But isn't bool an int internally? Why is it less efficient to use a bool?

--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
```
Jul 21 2006
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
```"Bruno Medeiros" <brunodomedeirosATgmail SPAM.com> wrote in message
news:e9qd21$2ueu$2 digitaldaemon.com...

But isn't bool an int internally? Why is it less efficient to use a bool?

It's a byte internally.
```
Jul 21 2006
Walter Bright <newshound digitalmars.com> writes:
```Bruno Medeiros wrote:
Walter Bright wrote:
Stewart Gordon wrote:
There seem to be a number of leftovers from before we had a bool
type, and many people were using the int type to pass booleans around.

They are typed as returning int for efficiency reasons. These
functions often appear in performance critical loops, where an extra
instruction or two makes a difference.

But isn't bool an int internally? Why is it less efficient to use a bool?

It's a byte internally, and is constrained to be one of the values 0 or 1.
```
Jul 21 2006
Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
```Walter Bright wrote:
Bruno Medeiros wrote:
Walter Bright wrote:
Stewart Gordon wrote:
There seem to be a number of leftovers from before we had a bool
type, and many people were using the int type to pass booleans around.

They are typed as returning int for efficiency reasons. These
functions often appear in performance critical loops, where an extra
instruction or two makes a difference.

But isn't bool an int internally? Why is it less efficient to use a bool?

It's a byte internally, and is constrained to be one of the values 0 or 1.

Duh, it's a byte of course, I should have checked that.

But the question remains, is it then less efficient to return a byte
than a int? Why? And if so isn't there a way for the compiler to somehow
optimize it?
I find it a bit hard to believe that nowadays there isn't sufficient
compiler and/or CPU technology to somehow make a bool(byte) return value
as efficient as a int one. :/

```
Jul 27 2006
xs0 <xs0 xs0.com> writes:
``` It's a byte internally, and is constrained to be one of the values 0
or 1.

Duh, it's a byte of course, I should have checked that.

But the question remains, is it then less efficient to return a byte
than a int? Why? And if so isn't there a way for the compiler to somehow
optimize it?
I find it a bit hard to believe that nowadays there isn't sufficient
compiler and/or CPU technology to somehow make a bool(byte) return value
as efficient as a int one. :/

Well, I'm just guessing, but I think something like

int opEquals(Foo foo)
{
return this.bar == foo.bar;
}

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

It's the 1/0 constraint on bools that causes the slowness, not the size
(stack is usually size_t-aligned anyway)

xs0
```
Jul 28 2006
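The distinction xs0 is guessing at can be sketched in D as follows. This only models the two source-level forms, not what any particular compiler emits; the function names are hypothetical.

```d
// Raw difference: any int value may come back.
// Zero means the inputs were equal, nonzero means they differed.
int rawDifference(int a, int b)
{
    return a - b;
}

// The same difference squeezed into the 0/1 range.
// This extra normalisation step is what the post attributes the cost to.
int normalisedDifference(int a, int b)
{
    return (a - b) ? 1 : 0;
}
```

Note that both forms are nonzero when the inputs differ, so neither has the sense of `==` as written.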
Stewart Gordon <smjg_1998 yahoo.com> writes:
```xs0 wrote:
<snip>
Well, I'm just guessing, but I think something like

> int opEquals(Foo foo)
> {
>     return this.bar == foo.bar;
> }

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

If it does this, then there's a serious bug in the compiler.

Moreover, what's your evidence that subtracting one number from another
might be more efficient than comparing them for equality directly?

It's the 1/0 constraint on bools that causes the slowness, not the size
(stack is usually size_t-aligned anyway)

But if the function only tries to return 0 or 1 anyway, then what
difference does it make?  At the moment, I can't think of an example of
equality testing that can be made more efficient by being allowed to
return a value other than 0 or 1.

Stewart.

--
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:-  C++  a->--- UB  P+ L E  W++  N+++ o K-  w++  O? M V? PS-
PE- Y? PGP- t- 5? X? R b DI? D G e++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on
the 'group where everyone may benefit.
```
Jul 28 2006
Walter Bright <newshound digitalmars.com> writes:
```Stewart Gordon wrote:
xs0 wrote:
<snip>
Well, I'm just guessing, but I think something like

> int opEquals(Foo foo)
> {
>     return this.bar == foo.bar;
> }

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

If it does this, then there's a serious bug in the compiler.

What instruction sequence do you expect to be generated for it?

Moreover, what's your evidence that subtracting one number from another
might be more efficient than comparing them for equality directly?

The only difference between a CMP and a SUB instruction is where the
result ends up. But the CMP doesn't generate 0 or 1 as a result, it puts
the result in the FLAGS register. Converting the FLAGS to a 0 or 1 in a
register takes more instructions.

It's the 1/0 constraint on bools that causes the slowness, not the
size (stack is usually size_t-aligned anyway)

But if the function only tries to return 0 or 1 anyway, then what
difference does it make?  At the moment, I can't think of an example of
equality testing that can be made more efficient by being allowed to
return a value other than 0 or 1.

I can. (a == b), where a and b are ints, can be implemented as (a - b),
and the result is int 0 for equality, int !=0 for inequality.
```
Jul 28 2006
kris <foo bar.com> writes:
```Walter Bright wrote:
Stewart Gordon wrote:

xs0 wrote:
<snip>

Well, I'm just guessing, but I think something like

> int opEquals(Foo foo)
> {
>     return this.bar == foo.bar;
> }

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

If it does this, then there's a serious bug in the compiler.

What instruction sequence do you expect to be generated for it?

Moreover, what's your evidence that subtracting one number from
another might be more efficient than comparing them for equality
directly?

The only difference between a CMP and a SUB instruction is where the
result ends up. But the CMP doesn't generate 0 or 1 as a result, it puts
the result in the FLAGS register. Converting the FLAGS to a 0 or 1 in a
register takes more instructions.

It's the 1/0 constraint on bools that causes the slowness, not the
size (stack is usually size_t-aligned anyway)

But if the function only tries to return 0 or 1 anyway, then what
difference does it make?  At the moment, I can't think of an example
of equality testing that can be made more efficient by being allowed
to return a value other than 0 or 1.

I can. (a == b), where a and b are ints, can be implemented as (a - b),
and the result is int 0 for equality, int !=0 for inequality.

So, why not treat false as 0, and true as not 0?  That way, it works
just the same as the "int" version does (and comparing/testing against
zero doesn't hit the address-bus). Yes, I can see some potential for
concern there; but is there anything insurmountable?
```
Jul 28 2006
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
```kris wrote:
Walter Bright wrote:
Stewart Gordon wrote:
But if the function only tries to return 0 or 1 anyway, then what
difference does it make?  At the moment, I can't think of an example
of equality testing that can be made more efficient by being allowed
to return a value other than 0 or 1.

I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

So, why not treat false as 0, and true as not 0?  That way, it works
just the same as the "int" version does (and comparing/testing against
zero doesn't hit the address-bus). Yes, I can see some potential for
concern there; but is there anything insurmountable?

Then what would happen if a and b differ by, say, 256? Remember, an int
is 4 bytes, a bool is only 1.
```
Jul 28 2006
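Frits's objection can be checked with a small D sketch. The helper name is hypothetical; it models storing the raw difference in a one-byte result, which is the assumption behind the objection.

```d
// Keeps only the low 8 bits of the difference, as a one-byte result would.
ubyte lowByteOfDifference(int a, int b)
{
    return cast(ubyte)(a - b);
}
```

With `a = 0x1100` and `b = 0x1000` the difference is 256, whose low byte is 0, so the truncated value reads as "false" even though the two ints are different.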
kris <foo bar.com> writes:
```Frits van Bommel wrote:
kris wrote:

Walter Bright wrote:

Stewart Gordon wrote:

But if the function only tries to return 0 or 1 anyway, then what
difference does it make?  At the moment, I can't think of an example
of equality testing that can be made more efficient by being allowed
to return a value other than 0 or 1.

I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

So, why not treat false as 0, and true as not 0?  That way, it works
just the same as the "int" version does (and comparing/testing against
zero doesn't hit the address-bus). Yes, I can see some potential for
concern there; but is there anything insurmountable?

Then what would happen if a and b differ by, say, 256? Remember, an int
is 4 bytes, a bool is only 1.

Sure, but it's generally more efficient to do all logical and arithmetic
operations in the native width of the device anyway ~ generally 32bits
for current D compilers.

If you're talking about issues related to actually storing a bool
result, then that's part of the "concerns" noted above. Bool values
derived in certain ways may need to be folded for storage, but not for
testing. The subtraction case above may be included in that group, but
testing should still only require a compare against zero (for both true
and false). I'm suggesting only that zero values should *always* be used
to test for 'truth' ~ never 1, or 255, or any value other than zero.
Anywhere the keyword "true" is used (or implied) for comparative
purposes, test against zero and invert the jmp-condition instead. If
that's not done already, it would probably speed things up in many cases.
```
Jul 28 2006
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
```kris wrote:
Frits van Bommel wrote:
kris wrote:

Walter Bright wrote:

Stewart Gordon wrote:

But if the function only tries to return 0 or 1 anyway, then what
difference does it make?  At the moment, I can't think of an
example of equality testing that can be made more efficient by
being allowed to return a value other than 0 or 1.

I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

So, why not treat false as 0, and true as not 0?  That way, it works
just the same as the "int" version does (and comparing/testing
against zero doesn't hit the address-bus). Yes, I can see some
potential for concern there; but is there anything insurmountable?

Then what would happen if a and b differ by, say, 256? Remember, an
int is 4 bytes, a bool is only 1.

Sure, but it's generally more efficient to do all logical and arithmetic
operations in the native width of the device anyway ~ generally 32bits
for current D compilers.

If you're talking about issues related to actually storing a bool
result, then that's part of the "concerns" noted above. Bool values
derived in certain ways may need to be folded for storage, but not for
testing. The subtraction case above may be included in that group, but
testing should still only require a compare against zero (for both true
and false). I'm suggesting only that zero values should *always* be used
to test for 'truth' ~ never 1, or 255, or any value other than zero.
Anywhere the keyword "true" is used (or implied) for comparative
purposes, test against zero and invert the jmp-condition instead. If
that's not done already, it would probably speed things up in many cases.

Actually, I'm pretty sure testing for zero is already how it's done
(just with 1-byte operands instead of 4-byte ones).

Something else: if there are multiple ways to represent true then
equality testing just got a lot more complicated :).
```
Jul 29 2006
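Frits's last point, that several encodings of "true" complicate equality testing, can be sketched in D using plain ints as truth values. Both function names are hypothetical.

```d
// Compares two int "truth values" bit for bit.
// This gives the wrong answer once true has more than one encoding.
bool sameBits(int x, int y)
{
    return x == y;
}

// Normalises each side to a real bool before comparing.
bool sameTruth(int x, int y)
{
    return (x != 0) == (y != 0);
}
```

Under a "nonzero means true" convention, 1 and 256 are both true, yet a direct comparison says they differ; the normalising version pays for the fix with two extra tests.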
Stewart Gordon <smjg_1998 yahoo.com> writes:
```Walter Bright wrote:
Stewart Gordon wrote:
xs0 wrote:
<snip>
Well, I'm just guessing, but I think something like

> int opEquals(Foo foo)
> {
>     return this.bar == foo.bar;
> }

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

If it does this, then there's a serious bug in the compiler.

What instruction sequence do you expect to be generated for it?

If anything resembling the above, then

return this.bar-foo.bar?0:1;

which cancels out the advantage you mention next:

<snip>
The only difference between a CMP and a SUB instruction is where the
result ends up. But the CMP doesn't generate 0 or 1 as a result, it puts
the result in the FLAGS register. Converting the FLAGS to a 0 or 1 in a
register takes more instructions.

<snip>
But if the function only tries to return 0 or 1 anyway, then what
difference does it make?  At the moment, I can't think of an example
of equality testing that can be made more efficient by being allowed
to return a value other than 0 or 1.

I can. (a == b), where a and b are ints, can be implemented as (a - b),
and the result is int 0 for equality, int !=0 for inequality.

How is this (a == b) rather than (a != b)?

Stewart.

```
Jul 29 2006
Walter Bright <newshound digitalmars.com> writes:
```Stewart Gordon wrote:
Walter Bright wrote:
Stewart Gordon wrote:
xs0 wrote:
<snip>
Well, I'm just guessing, but I think something like

> int opEquals(Foo foo)
> {
>     return this.bar == foo.bar;
> }

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

If it does this, then there's a serious bug in the compiler.

What instruction sequence do you expect to be generated for it?

If anything resembling the above, then

return this.bar-foo.bar?0:1;

? Let's look at an example:

class Foo
{
int foo, bar;

int Eq1(Foo foo)
{
return this.bar-foo.bar?0:1;
}

int Eq2(Foo foo)
{
return this.bar-foo.bar;
}
}

which generates:

Eq1:
mov     EDX,4[ESP]
mov     ECX,0Ch[EAX]
sub     ECX,0Ch[EDX]
cmp     ECX,1
sbb     EAX,EAX
neg     EAX
ret     4
Eq2:
mov     ECX,4[ESP]
mov     EAX,0Ch[EAX]
sub     EAX,0Ch[ECX]
ret     4

So we have 4 instructions generated rather than 1. If there's a trick to
generate only one instruction for Eq1, I'd like to know about it.

I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

How is this (a == b) rather than (a != b)?

```
Jul 29 2006
Deewiant <deewiant.doesnotlike.spam gmail.com> writes:
```Walter Bright wrote:
Stewart Gordon wrote:
Walter Bright wrote:
I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

How is this (a == b) rather than (a != b)?

(a - b), if a and b are equal ints, evaluates to 0, which is generally
considered to mean false. So isn't (a - b) actually a way of finding (a != b),
not (a == b)?
```
Jul 30 2006
Walter Bright <newshound digitalmars.com> writes:
```Deewiant wrote:
Walter Bright wrote:
Stewart Gordon wrote:
Walter Bright wrote:
I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

How is this (a == b) rather than (a != b)?

(a - b), if a and b are equal ints, evaluates to 0, which is generally
considered to mean false. So isn't (a - b) actually a way of finding (a != b),

Oh, I see what you mean.

To invert the result would take another 2 instructions for a total of 3,
still less than 4.
```
Jul 30 2006
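The sense problem raised by Stewart and Deewiant is easy to see in a D sketch. The class mirrors the `Foo` from the thread, but the method names `differs` and `equals` and the constructor are assumptions made for the example.

```d
class Foo
{
    int bar;

    this(int bar)
    {
        this.bar = bar;
    }

    // Raw difference: zero when equal, so it really answers "not equal".
    int differs(Foo other)
    {
        return bar - other.bar;
    }

    // Inverting the difference restores the "equal" sense,
    // which is the extra work Walter's count of 3 instructions refers to.
    int equals(Foo other)
    {
        return !(bar - other.bar);
    }
}
```

So the one-instruction form is only free for a "not equal" operator; an equality operator still has to invert the result somewhere.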
Stewart Gordon <smjg_1998 yahoo.com> writes:
```Walter Bright wrote:
Deewiant wrote:
Walter Bright wrote:
Stewart Gordon wrote:
Walter Bright wrote:
I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

How is this (a == b) rather than (a != b)?

(a - b), if a and b are equal ints, evaluates to 0, which is generally
considered to mean false. So isn't (a - b) actually a way of finding
(a != b),

Oh, I see what you mean.

To invert the result would take another 2 instructions for a total of 3,
still less than 4.

Exactly.  But because what we have is opEquals and not opNotEquals, the
benefit of fewer instructions is lost (except when opEquals is simple
enough that the compiler can inline and optimise away the double negation).

Indeed, on this basis, if we had opNotEquals then it would just be
equivalent to opCmp for many types.  So I can see people thinking that
opNotEquals should just call opCmp by default.  However, there's a
problem with this idea - for classes that have no ordering, even the
current behaviour of comparing object references would have to be
explicitly programmed in.

Stewart.

```
Jul 30 2006
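The idea of a "not equals" that falls back on ordering can be sketched in D like this. There is no `opNotEquals` in the language; `notEquals` is a hypothetical free function and `Version` a made-up type for the example.

```d
struct Version
{
    int number;

    // Three-way comparison: negative, zero, or positive.
    // Written with comparisons rather than subtraction to avoid overflow.
    int opCmp(const Version rhs) const
    {
        return (number > rhs.number) - (number < rhs.number);
    }
}

// Hypothetical "opNotEquals" built on opCmp:
// a nonzero ordering result means the values differ.
bool notEquals(Version a, Version b)
{
    return a.opCmp(b) != 0;
}
```

This only works for types that have a meaningful ordering, which is exactly the limitation Stewart points out for classes without one.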
Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
```Walter Bright wrote:
Stewart Gordon wrote:
Walter Bright wrote:
Stewart Gordon wrote:
xs0 wrote:
<snip>
Well, I'm just guessing, but I think something like

> int opEquals(Foo foo)
> {
>     return this.bar == foo.bar;
> }

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

If it does this, then there's a serious bug in the compiler.

What instruction sequence do you expect to be generated for it?

If anything resembling the above, then

return this.bar-foo.bar?0:1;

? Let's look at an example:

class Foo
{
int foo, bar;

int Eq1(Foo foo)
{
return this.bar-foo.bar?0:1;
}

int Eq2(Foo foo)
{
return this.bar-foo.bar;
}
}

which generates:

Eq1:
mov     EDX,4[ESP]
mov     ECX,0Ch[EAX]
sub     ECX,0Ch[EDX]
cmp     ECX,1
sbb     EAX,EAX
neg     EAX
ret     4
Eq2:
mov     ECX,4[ESP]
mov     EAX,0Ch[EAX]
sub     EAX,0Ch[ECX]
ret     4

So we have 4 instructions generated rather than 1. If there's a trick to
generate only one instruction for Eq1, I'd like to know about it.

I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

How is this (a == b) rather than (a != b)?

As per the other posts, Eq2 actually takes 2 instructions:

Eq2:
...
sub     EAX,0Ch[ECX]
not     EAX;

And uuuh.., I've checked gcc's generated code for a C++'s Eq1, and it
was only 2 instructions too, CMP and SETE ! :

Eq1:
...
cmp     EAX,0Ch[ECX]
sete    EAX;

(http://www.cs.tut.fi/~siponen/upros/intel/instr/sete_setz.html)
It seems to me perfectly valid, is there any problem here?

What does the original Eq1 even do? :

sub     ECX,0Ch[EDX]
cmp     ECX,1       // Huh?
sbb     EAX,EAX
neg     EAX

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
```
Jul 30 2006
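One caveat on the `not EAX` suggestion in the post above: a bitwise NOT is not a logical inversion. The D sketch below uses `~` as the analogue of the x86 `not` instruction; the function names are hypothetical. (Separately, as far as I know the SETcc instructions take an 8-bit destination such as AL, so `sete EAX` should be read as shorthand.)

```d
// Bitwise complement of the difference, as "sub" followed by "not" would give.
int bitwiseInverted(int a, int b)
{
    return ~(a - b);
}

// Logical inversion: 1 when the inputs are equal, 0 otherwise.
int logicallyInverted(int a, int b)
{
    return (a - b) == 0;
}
```

`bitwiseInverted` is zero only when the difference is exactly -1, so it reports "true" for nearly every pair of inputs, equal or not. That is consistent with Walter's estimate that a real inversion costs more than one extra instruction.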
kris <foo bar.com> writes:
```Bruno Medeiros wrote:
Walter Bright wrote:

Stewart Gordon wrote:

Walter Bright wrote:

Stewart Gordon wrote:

xs0 wrote:
<snip>

Well, I'm just guessing, but I think something like

> int opEquals(Foo foo)
> {
>     return this.bar == foo.bar;
> }

is compiled to something like

return this.bar-foo.bar; // 1 instruction

but if the return type is bool, it becomes

return this.bar-foo.bar?1:0; // 3 instructions

If it does this, then there's a serious bug in the compiler.

What instruction sequence do you expect to be generated for it?

If anything resembling the above, then

return this.bar-foo.bar?0:1;

? Let's look at an example:

class Foo
{
int foo, bar;

int Eq1(Foo foo)
{
return this.bar-foo.bar?0:1;
}

int Eq2(Foo foo)
{
return this.bar-foo.bar;
}
}

which generates:

Eq1:
mov     EDX,4[ESP]
mov     ECX,0Ch[EAX]
sub     ECX,0Ch[EDX]
cmp     ECX,1
sbb     EAX,EAX
neg     EAX
ret     4
Eq2:
mov     ECX,4[ESP]
mov     EAX,0Ch[EAX]
sub     EAX,0Ch[ECX]
ret     4

So we have 4 instructions generated rather than 1. If there's a trick
to generate only one instruction for Eq1, I'd like to know about it.

I can. (a == b), where a and b are ints, can be implemented as (a -
b), and the result is int 0 for equality, int !=0 for inequality.

How is this (a == b) rather than (a != b)?

As per the other posts, Eq2 actually takes 2 instructions:

Eq2:
...
sub     EAX,0Ch[ECX]
not    EAX;

And uuuh.., I've checked gcc's generated code for a C++'s Eq1, and it
was only 2 instructions too, CMP and SETE ! :

Eq1:
...
cmp     EAX,0Ch[ECX]
sete    EAX;

(http://www.cs.tut.fi/~siponen/upros/intel/instr/sete_setz.html)
It seems to me perfectly valid, is there any problem here?

Yes indeed. Well spotted! On anything supporting the 386 instruction set
(and D is targeted for 32-bit devices only), there's really no
performance advantage in returning an int over returning a bool.

This should be addressed, so that some of the core APIs can be cleaned
up appropriately?

What does the original Eq1 even do? :

sub     ECX,0Ch[EDX]
cmp     ECX,1       // Huh?
sbb     EAX,EAX
neg     EAX

That's old-skool, pre-386 hacking :)
```
Jul 30 2006
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
```Bruno Medeiros wrote:
And uuuh.., I've checked gcc's generated code for a C++'s Eq1, and it
was only 2 instructions too, CMP and SETE ! :

Eq1:
...
cmp     EAX,0Ch[ECX]
sete    EAX;

(http://www.cs.tut.fi/~siponen/upros/intel/instr/sete_setz.html)
It seems to me perfectly valid, is there any problem here?

Interesting instruction. Seems to have the exact semantics needed for
these situations. You'd almost think CPU designers care about what
people want to do with their products :P.

What does the original Eq1 even do? :

Step by step:
mov     ECX,0Ch[EAX]

(You skipped this one) Loads this.bar into ECX.
sub     ECX,0Ch[EDX]

Subtracts foo.bar from ECX.
cmp     ECX,1       // Huh?

Among other things, sets borrow (aka carry) flag if ECX == 0 (i.e. if
foo.bar == this.bar), clears it otherwise.
sbb     EAX,EAX

Subtracts (EAX + borrow) from EAX, setting it to either -1 (if carry ==
1) or 0 (if carry == 0).
neg     EAX

Negates EAX.

A bit weird at first glance, but it works as advertised :).

But indeed, a cmp/sete combo seems to do the same in less instructions.
```
Jul 30 2006
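Frits's step-by-step reading of the `sub`/`cmp`/`sbb`/`neg` sequence can be modelled in D with unsigned arithmetic. This is a sketch of the arithmetic only, not compiler output; the function name and the local variable names (chosen to echo the registers) are assumptions.

```d
// Models: sub ECX,...; cmp ECX,1; sbb EAX,EAX; neg EAX
int eqViaSbb(int a, int b)
{
    // sub: ECX = this.bar - foo.bar (wraps like the hardware does)
    uint ecx = cast(uint) a - cast(uint) b;

    // cmp ECX,1: the carry (borrow) flag is set when ECX < 1 unsigned,
    // which can only happen when ECX == 0
    uint carry = (ecx < 1u) ? 1 : 0;

    // sbb EAX,EAX: EAX - EAX - carry, giving 0 or 0xFFFFFFFF (-1)
    uint eax = 0u - carry;

    // neg EAX: two's complement negation, so -1 becomes 1 and 0 stays 0
    eax = 0u - eax;

    return cast(int) eax;
}
```

The result is 1 exactly when the two inputs are equal, which matches what a `cmp` followed by `sete` would leave in the low byte.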
"Lionello Lunesu" <lio lunesu.remove.com> writes:
``` But indeed, a cmp/sete combo seems to do the same in less instructions.

But is it faster? I've noticed that many of the higher-level assembly
instructions are actually slower than multiple lower-level ones. "loop" is
the best example of this (dec ecx/jne is faster), or "rep" (again, dec/jne
is faster).

L.
```
Aug 07 2006
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
```Lionello Lunesu wrote:
But indeed, a cmp/sete combo seems to do the same in less instructions.

But is it faster? I've noticed that many of the higher-level assembly
instructions are actually slower than multiple lower-level ones. "loop" is
the best example of this (dec ecx/jne is faster), or "rep" (again, dec/jne
is faster).

Heh... You may have noticed I didn't use any word related to speed :).
The reason for that is that I don't know much about optimization for
speed, especially where pipelines etc. are involved...

Hardware is weird.
```
Aug 07 2006
kris <foo bar.com> writes:
```Lionello Lunesu wrote:
But indeed, a cmp/sete combo seems to do the same in less instructions.

But is it faster? I've noticed that many of the higher-level assembly
instructions are actually slower than multiple lower-level ones. "loop" is
the best example of this (dec ecx/jne is faster), or "rep" (again, dec/jne
is faster).

L.

If you'd looked at the setne instruction linked previously, you'd have
seen that it consumes 3 cycles. And no; there are no jump, loops, or any
other reason to cause pipeline bubbles. If you need a primer on what
causes modern CPUs to stall (the silly P4 in particular) then you could
do a lot worse than to read the articles by Jon Stokes at ArsTechnica.

Oh, and this is just daft. Why don't we all count the cycles for a
call/return instead? Or, perhaps just exactly what it costs to compare
the bytes of two strings until they start to look different? You'll find
the cost of setne (and probably even the prior "extra" three
instructions for boolean support) is relegated to background noise.

Let's face it: int is likely used instead of bool for historical
reasons; probably just an artifact left over from pre-80386 days. Would
be nice to get that codegen cleaned up ~ especially since it was W who
claimed the reasons were performance related. Hacking the high-level
code with int vs boolean, just to reflect some archaic machine
instruction, is one of those things that come under the umbrella of
"premature optimization".
```
Aug 07 2006
Dave writes:
```kris wrote:
Lionello Lunesu wrote:
But indeed, a cmp/sete combo seems to do the same in less instructions.

But is it faster? I've noticed that many of the higher-level assembly
instructions are actually slower than multiple lower-level ones. "loop" is
the best example of this (dec ecx/jne is faster), or "rep" (again, dec/jne
is faster).

L.

If you'd looked at the setne instruction linked previously, you'd have
seen that it consumes 3 cycles. And no; there are no jump, loops, or any
other reason to cause pipeline bubbles. If you need a primer on what
causes modern CPUs to stall (the silly P4 in particular) then you could
do a lot worse than to read the articles by Jon Stokes at ArsTechnica.

Oh, and this is just daft. Why don't we all count the cycles for a
call/return instead? Or, perhaps just exactly what it costs to compare
the bytes of two strings until they start to look different? You'll find
the cost of setne (and probably even the prior "extra" three
instructions for boolean support) is relegated to background noise.

Let's face it: int is likely used instead of bool for historical
reasons; probably just an artifact left over from pre-80386 days. Would
be nice to get that codegen cleaned up ~ especially since it was W who
claimed the reasons were performance related. Hacking the high-level
code with int vs boolean, just to reflect some archaic machine
instruction, is one of those things that come under the umbrella of
"premature optimization".

Yeah, AFAIK setne is supported from the 386 onward, plus a quick check of the
GDC code that uses it seems to indicate it is faster (from the Eq1 and Eq2
samples earlier in the thread).

But you're right - in many cases it will probably be background noise anyhow
'cause you only save a couple of cycles.

As an aside, I think the current DMD backend may be well suited to the new
Dual Core CPU because it hasn't been chasing after optimum performance on the
P4 with its 20-stage pipeline or whatever <g>
```
Aug 07 2006
Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
```Bruno Medeiros wrote:

What does the original Eq1 even do? :

sub     ECX,0Ch[EDX]
cmp     ECX,1       // Huh?
sbb     EAX,EAX
neg     EAX

Ah, I get it now... I wasn't understanding what borrow (the mathematical
notion) was, since I'm not a native English speaker. Nothing a Wikipedia
lookup didn't solve. So, correct me if I'm wrong:

(when I say EDX I mean 0Ch[EDX] or whatever)

// sets the carry flag only if ECX is zero after the SUB (0 - 1 borrows),
// that is, if ECX == EDX before the SUB
cmp     ECX,1

// computes EAX - EAX - carry: zero, minus one if the carry flag is set
// that is, EAX = -1 if ECX == EDX and EAX = 0 if ECX != EDX
sbb     EAX,EAX

// two's complement negation of EAX, 0 becomes 0, -1 becomes 1
neg     EAX
// end result: EAX = 1 if ECX == EDX and EAX = 0 if ECX != EDX

So yeah, it seems these 3 instructions do the same as SETE ... ?

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
```
Jul 30 2006
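The walkthrough of the Eq1 sequence above can be checked with a minimal C sketch. It models the carry flag by hand with unsigned arithmetic; the function names `eq_via_sbb` and `eq_via_sete` are hypothetical and chosen only for illustration. This is not compiler output, just a model of what the four instructions compute.

```c
#include <stdint.h>

/* Hypothetical helper: models sub / cmp ,1 / sbb / neg step by step. */
uint32_t eq_via_sbb(uint32_t a, uint32_t b)
{
    uint32_t diff  = a - b;        /* sub ECX,EDX : zero only when a == b          */
    uint32_t carry = (diff < 1u);  /* cmp ECX,1   : borrow occurs only if diff == 0 */
    uint32_t eax   = 0u - carry;   /* sbb EAX,EAX : EAX - EAX - CF, 0 or 0xFFFFFFFF */
    return 0u - eax;               /* neg EAX     : 0 stays 0, 0xFFFFFFFF becomes 1 */
}

/* Hypothetical helper: the value a cmp/sete pair would leave behind. */
uint32_t eq_via_sete(uint32_t a, uint32_t b)
{
    return (uint32_t)(a == b);
}
```

Both functions return 1 for equal inputs and 0 otherwise, which matches the conclusion that the three trailing instructions do the same job as SETE.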
Walter Bright <newshound digitalmars.com> writes:
```Bruno Medeiros wrote:
But the question remains, is it then less efficient to return a byte
than an int?

Yes. It's also less efficient to constrain the results to 0 or 1.

Why?

Consider:

a = 0x1000;
b = 0x2000;

Now convert (a == b) into a bool. If the result is an int, I can just do
(a - b), one instruction. Converting it to a byte, or to 1 or 0, takes more.

And if so isn't there a way for the compiler to somehow
optimize it?

The math is inevitable <g>.

I find it a bit hard to believe that nowadays there isn't sufficient
compiler and/or CPU technology to somehow make a bool (byte) return value
as efficient as an int one. :/

I work with what the CPU makes available.

efficiencies. The trouble is, these kinds of things often appear in
tight loops, where small inefficiencies get multiplied by millions.
```
Jul 28 2006
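The trade-off in the example with a = 0x1000 and b = 0x2000 can be sketched in C as follows. The function names are hypothetical. Note that a plain subtraction yields non-zero when the values differ, so it carries the opposite sense of `==`, and that simply truncating it to a byte is not safe, because the low byte of the difference can be zero even when the values differ.

```c
#include <stdint.h>

/* Full-width result: one subtraction, non-zero means "a and b differ". */
uint32_t diff_wide(uint32_t a, uint32_t b)
{
    return a - b;
}

/* Naive truncation to a byte: loses information when the low 8 bits
   of the difference happen to be zero. */
uint8_t diff_truncated(uint32_t a, uint32_t b)
{
    return (uint8_t)(a - b);
}

/* Constrained to 0 or 1: needs an extra normalization step. */
uint8_t diff_normalized(uint32_t a, uint32_t b)
{
    return (uint8_t)((a - b) != 0u);
}
```

With 0x1000 and 0x2000 the full-width difference is non-zero, the truncated byte is zero (a wrong answer), and only the normalized form is correct in a byte. That normalization is the extra work being argued about.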
Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
```Walter Bright wrote:
Bruno Medeiros wrote:
But the question remains, is it then less efficient to return a byte
than an int?

Yes. It's also less efficient to constrain the results to 0 or 1.

Why?

Consider:

a = 0x1000;
b = 0x2000;

Now convert (a == b) into a bool. If the result is an int, I can just do
(a - b), one instruction. Converting it to a byte, or to 1 or 0, takes
more.

And if so isn't there a way for the compiler to somehow optimize it?

The math is inevitable <g>.

Well, let's think about the other way around then. Why should bool be
constrained to 0 or 1? Why not, same as kris said, 0 would be false, and
non zero would be true. Then we could have an opEquals or any function
returning a bool instead of an int, without a performance penalty.

The only shortcoming I see is that it would be slower to compare two
bool /variables/:
(b1 == b2)
that expression is currently just 1 instruction, a CMP, but without the
0,1 restriction it would be more (3, I think, have to check that).
However, is that significantly worse? I think not. I think comparison
between two bool _variables_ is likely very rare, and when it happens it
is also probably not performance critical. (statistical references?)
Note: this would not affect at all comparisons between a bool variable
and a bool literal. Like (b == true) or (b == false).

Or is there another reason for the 0,1 restriction?

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
```
Jul 30 2006
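The cost of comparing two bool variables under a "zero is false, anything else is true" convention can be illustrated with a small C sketch. The names `raw_equal` and `truth_equal` are hypothetical, and `uint8_t` stands in for an unconstrained bool; this is an assumption made for illustration, not how D represents bool.

```c
#include <stdint.h>

/* Byte-wise comparison: only valid if both values are already 0 or 1. */
int raw_equal(uint8_t b1, uint8_t b2)
{
    return b1 == b2;
}

/* Comparison by truth value: each operand is normalized first,
   which is the extra work an unconstrained bool would require. */
int truth_equal(uint8_t b1, uint8_t b2)
{
    return (b1 != 0) == (b2 != 0);
}
```

With the values 1 and 2, both of which would count as true, `raw_equal` reports a mismatch while `truth_equal` reports a match, so (b1 == b2) could no longer be compiled to a single CMP.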
Walter Bright <newshound digitalmars.com> writes:
```Bruno Medeiros wrote:
Well, let's think about the other way around then. Why should bool be
constrained to 0 or 1? Why not, same as kris said, 0 would be false, and
non zero would be true. Then we could have an opEquals or any function
returning a bool instead of an int, without a performance penalty.

The only shortcoming I see is that it would be slower to compare two
bool /variables/:
(b1 == b2)
that expression is currently just 1 instruction, a CMP, but without the
0,1 restriction it would be more (3, I think, have to check that).
However, is that significantly worse? I think not. I think comparison
between two bool _variables_ is likely very rare, and when it happens it
is also probably not performance critical. (statistical references?)
Note: this would not affect at all comparisons between a bool variable
and a bool literal. Like (b == true) or (b == false).

I think most programmers would find this to be very surprising behavior.
I know I would.
```
Jul 30 2006
Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
```Walter Bright wrote:
Bruno Medeiros wrote:
Well, let's think about the other way around then. Why should bool be
constrained to 0 or 1? Why not, same as kris said, 0 would be false,
and non zero would be true. Then we could have an opEquals or any
function returning a bool instead of an int, without a performance penalty.

The only shortcoming I see is that it would be slower to compare two
bool /variables/:
(b1 == b2)
that expression is currently just 1 instruction, a CMP, but without
the 0,1 restriction it would be more (3, I think, have to check that).
However, is that significantly worse? I think not. I think comparison
between two bool _variables_ is likely very rare, and when it happens
it is also probably not performance critical. (statistical references?)
Note: this would not affect at all comparisons between a bool variable
and a bool literal. Like (b == true) or (b == false).

I think most programmers would find this to be very surprising behavior.
I know I would.

Surprising behavior? What surprising behavior? Those are all
implementation details; they have no bearing on language/program
behavior.

And how about the alternative of using the SETE instruction to enforce
the bool restriction? You haven't commented on that yet...

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
```
Aug 01 2006
```Walter Bright wrote:
efficiencies. The trouble is, these kinds of things often appear in
tight loops, where small inefficiencies get multiplied by millions.

I consider this kind of stuff the compiler's job -- so if I write or
maintain code that is slow, I know there is probably something I can do
about it w/o having to drop into assembly.

Personally I've spent a huge amount of time tuning code and I can't tell
you the positive effect that has on end-users. IMHO bad performance is
often the "forgotten bug" (that's not to say the budget should be busted
on that "last 20%" either though).

- Dave
```
Jul 31 2006