digitalmars.D.learn - naked popcnt function
- Ad, Nov 22 2014
- safety0ff, Nov 22 2014
- Marco Leise, Nov 22 2014
Hello, I would like to write a "popcnt" function. This works fine:

    ulong popcnt(ulong x)
    {
        asm { mov RAX, x ; popcnt RAX, RAX ; }
    }

However, if I add the "naked" keyword (which should improve performance?) it doesn't work anymore, and I can't figure out what change I am supposed to make (aside from x[RBP] instead of x). This function is going to be *heavily* used. Thanks for any help.
Nov 22 2014
On Saturday, 22 November 2014 at 18:30:06 UTC, Ad wrote:
> Hello, I would like to write a "popcnt" function. This works fine:
>
>     ulong popcnt(ulong x)
>     {
>         asm { mov RAX, x ; popcnt RAX, RAX ; }
>     }
>
> However, if I add the "naked" keyword (which should improve performance?) it doesn't work anymore, and I can't figure out what change I am supposed to make (aside from x[RBP] instead of x). This function is going to be *heavily* used. Thanks for any help.

Last time I used naked asm, I simply used the calling convention to figure out the location of the parameter (e.g. RCX on Win64, RDI on 64-bit Linux, IIRC).

N.B. on LDC & GDC there is an intrinsic for popcnt.
Nov 22 2014
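Following up on the note about intrinsics above, here is a minimal sketch of the non-asm route. It assumes a reasonably recent druntime in which core.bitop.popcnt has a ulong overload; countBits is a hypothetical wrapper name used only for illustration. Whether the call is lowered to the hardware POPCNT instruction depends on the compiler and the target CPU flags, so it is worth checking the disassembly before relying on it.

```d
import core.bitop : popcnt;

// Hypothetical wrapper: counts the set bits in x using druntime's popcnt.
// Assumes the ulong overload of core.bitop.popcnt is available.
int countBits(ulong x)
{
    return popcnt(x);
}
```

Unlike a function built around an asm block, a plain function like this leaves the compiler free to inline the call, which may matter more for a heavily used function than removing the prologue.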
On Sat, 22 Nov 2014 18:30:05 +0000, "Ad" <ad fakmail.fg> wrote:
> Hello, I would like to write a "popcnt" function. This works fine:
>
>     ulong popcnt(ulong x)
>     {
>         asm { mov RAX, x ; popcnt RAX, RAX ; }
>     }
>
> However, if I add the "naked" keyword (which should improve performance?) it doesn't work anymore, and I can't figure out what change I am supposed to make (aside from x[RBP] instead of x). This function is going to be *heavily* used. Thanks for any help.

It has been a long time since I tried "naked", but IIRC it strips all compiler-generated code from the function, and I see no 'ret' in your function. So execution probably runs on into whatever code lies behind that function in the executable. I would use a tool like obj2asm or objdump to check what the generated code looks like, or use a debugger that can disassemble on the fly.

-- Marco
Nov 22 2014
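Putting the two replies together (read the argument from the register the calling convention puts it in, and supply the 'ret' yourself), a naked version could be sketched roughly as follows. This is only a sketch under several assumptions: DMD-style inline asm on x86-64 (DMD or LDC), a CPU that supports POPCNT, and extern (C) linkage, chosen here so that the argument register is the one given by the platform's C ABI. popcntNaked is a hypothetical name.

```d
// Sketch: naked popcnt that takes its argument straight from the ABI register.
// Assumes x86-64, DMD-style inline asm, and hardware POPCNT support.
extern (C) ulong popcntNaked(ulong x)
{
    version (D_InlineAsm_X86_64)
    {
        version (Win64)
        {
            asm
            {
                naked;           // no prologue/epilogue is generated
                popcnt RAX, RCX; // Win64: first integer argument arrives in RCX
                ret;             // naked functions must return explicitly
            }
        }
        else
        {
            asm
            {
                naked;           // no prologue/epilogue is generated
                popcnt RAX, RDI; // System V AMD64: first integer argument arrives in RDI
                ret;             // naked functions must return explicitly
            }
        }
    }
    else
    {
        static assert(0, "popcntNaked needs x86-64 DMD-style inline asm");
    }
}
```

Because no stack frame is set up, x[RBP] is not meaningful here; the parameter name is never referenced and the result is returned in RAX.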