digitalmars.D.learn - naked popcnt function

"Ad" <ad fakmail.fg> writes:
Hello, I would like to write a "popcnt" function. This works fine:

ulong popcnt(ulong x)
{
    asm { mov RAX, x; popcnt RAX, RAX; }
}

However, if I add the "naked" keyword (which should improve
performance?), it no longer works, and I can't figure out
what change I am supposed to make (aside from using x[RBP]
instead of x).
This function is going to be *heavily* used.

Thanks for any help.
Nov 22 2014
"safety0ff" <safety0ff.dev gmail.com> writes:
Last time I used naked asm, I simply used the calling convention to figure out the location of the parameter (e.g. RCX on Win64, RDI on 64-bit Linux, IIRC). N.B. LDC and GDC have an intrinsic for popcnt.
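
To make the calling-convention approach concrete, here is a minimal sketch, not a definitive implementation. The name popcntNaked is made up for this example, and marking it extern (C) is my own assumption, added so that the argument register is fixed by the platform C ABI (RCX on Win64, RDI under System V) rather than left to the D calling convention. A naked function gets no compiler-generated prologue or epilogue, so it has to end with its own ret. The library routine core.bitop.popcnt serves as a reference and as the fallback when DMD-style x86-64 inline asm is not available. The popcnt instruction itself requires a CPU that supports it.

```d
import core.bitop : popcnt; // library popcount, used as reference and fallback

version (D_InlineAsm_X86_64)
{
    // Hypothetical helper. extern (C) is an assumption made here to pin the
    // calling convention: the first integer argument arrives in RCX on Win64
    // and in RDI under the System V ABI (Linux, macOS, BSD).
    extern (C) ulong popcntNaked(ulong x)
    {
        // "naked" removes the prologue and epilogue, so the parameter is read
        // straight from its register and the explicit "ret" is mandatory.
        version (Win64)
            asm { naked; popcnt RAX, RCX; ret; }
        else
            asm { naked; popcnt RAX, RDI; ret; }
    }
}
else
{
    // No DMD-style x86-64 inline asm: defer to the library implementation.
    ulong popcntNaked(ulong x)
    {
        return popcnt(x);
    }
}

void main()
{
    // Compare the hand-written version against core.bitop.popcnt.
    foreach (ulong v; [0UL, 1UL, 0xFFUL, 0x8000_0000_0000_0001UL, ulong.max])
        assert(popcntNaked(v) == popcnt(v));
}
```

Whether this is actually faster is worth measuring: a function built around an inline asm block is generally not inlined, so every use still pays for a call, whereas core.bitop.popcnt or a compiler intrinsic can be inlined at the call site.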
Nov 22 2014
Marco Leise <Marco.Leise gmx.de> writes:
It has been a long time since I tried "naked", but IIRC it strips all compiler-generated code from the function, and I see no 'ret' in your function. So execution probably runs into whatever code lies after that function in the executable. I would use a tool like obj2asm or objdump to check what the generated code looks like, or use a debugger that can disassemble on the fly. -- Marco
Nov 22 2014