digitalmars.D.learn - naked popcnt function
- Ad (11/11) Nov 22 2014 Hello, I would like to write a "popcnt" function. This works fine
- safety0ff (5/17) Nov 22 2014 Last time I used naked asm simply used the calling convention to
- Marco Leise (11/25) Nov 22 2014 It is long ago that I tried "naked", but IIRC it strips all
Hello, I would like to write a "popcnt" function. This works fine:

ulong popcnt(ulong x)
{
    asm { mov RAX, x; popcnt RAX, RAX; }
}

However, if I add the "naked" keyword (which should improve
performance?) it doesn't work anymore, and I can't figure out
what change I am supposed to make (aside from x[RBP] instead of
x).
This function is going to be *heavily* used.
Thanks for any help.
Nov 22 2014
On Saturday, 22 November 2014 at 18:30:06 UTC, Ad wrote:
> Hello, I would like to write a "popcnt" function. This works
> fine:
>
> ulong popcnt(ulong x)
> {
>     asm { mov RAX, x; popcnt RAX, RAX; }
> }
>
> However, if I add the "naked" keyword (which should improve
> performance?) it doesn't work anymore, and I can't figure out
> what change I am supposed to make (aside from x[RBP] instead
> of x).
> This function is going to be *heavily* used.
> Thanks for any help.
Last time I used naked asm, I simply used the calling convention to
figure out the location of the parameter (e.g. RCX on Win64, RDI on
64-bit Linux, IIRC).
N.B. on LDC & GDC there is an intrinsic for popcnt.
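
For the intrinsic route, a minimal sketch that avoids inline asm entirely, assuming a druntime recent enough that core.bitop.popcnt has a ulong overload. The wrapper name countBits is hypothetical and only there for illustration. Whether the call becomes a single POPCNT instruction depends on the compiler and the target CPU flags, so the generated code is worth checking.

```d
import core.bitop : popcnt;

// Hypothetical wrapper: counts the set bits of a 64-bit value.
// Assumes core.bitop.popcnt provides an overload taking ulong.
int countBits(ulong x)
{
    return popcnt(x);
}
```

Usage would look like `auto n = countBits(0xFF);`. Unlike a naked asm function, a plain wrapper such as this one can be inlined by the optimizer.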
Nov 22 2014
On Sat, 22 Nov 2014 18:30:05 +0000,
"Ad" <ad fakmail.fg> wrote:
> Hello, I would like to write a "popcnt" function. This works fine:
>
> ulong popcnt(ulong x)
> {
>     asm { mov RAX, x; popcnt RAX, RAX; }
> }
>
> However, if I add the "naked" keyword (which should improve
> performance?) it doesn't work anymore, and I can't figure out
> what change I am supposed to make (aside from x[RBP] instead of
> x).
> This function is going to be *heavily* used.
> Thanks for any help.
It has been a long time since I tried "naked", but IIRC it strips all
compiler-generated code from the function, and I see no 'ret'
in your function. So execution probably runs on into whatever code
comes after that function in the executable.
I would use a tool like obj2asm or objdump to check what the
generated code looks like, or use a debugger that can
disassemble on the fly.
--
Marco
Nov 22 2014
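
Putting the two replies together, a naked version might look like the sketch below. It is only a sketch under stated assumptions: the function name popcntNaked is hypothetical, the compiler must support DMD-style inline asm on x86_64, the first integer argument is assumed to arrive in RCX on Win64 and in RDI elsewhere, and the CPU must implement the POPCNT instruction. Where inline asm is unavailable, it falls back to core.bitop.popcnt.

```d
import core.bitop : popcnt;

version (D_InlineAsm_X86_64)
{
    // With `naked` the compiler emits no prologue, epilogue, or return,
    // so the asm reads the argument straight from its register and must
    // execute `ret` itself. The result is left in RAX.
    version (Win64)
    {
        // Assumed Win64 convention: first integer argument in RCX.
        ulong popcntNaked(ulong x)
        {
            asm { naked; popcnt RAX, RCX; ret; }
        }
    }
    else
    {
        // Assumed System V AMD64 convention: first integer argument in RDI.
        ulong popcntNaked(ulong x)
        {
            asm { naked; popcnt RAX, RDI; ret; }
        }
    }
}
else
{
    // Fallback when DMD-style x86_64 inline asm is not available.
    ulong popcntNaked(ulong x)
    {
        return popcnt(x);
    }
}
```

A function like this still costs a call and a return each time it is used, and it probably will not be inlined, so for a heavily used routine it is worth comparing against the intrinsic in a disassembly before settling on it.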