digitalmars.D.ldc - popcnt intrinsic unused

reply Stefan Koch <uplink.coder googlemail.com> writes:
I just compiled a simple program calling popcnt in a loop.
It does not generate the intrinsic even when compiled with
-O3 -c -mcpu=amdfam10
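
The program itself was not posted. A minimal reproduction might look like the sketch below; the file name `repro.d` and the function `countBits` are hypothetical, and only `core.bitop.popcnt` and the compiler flags come from the post.

```d
// repro.d: hypothetical reconstruction, not the original program.
// Build with the flags from the post: ldc2 -O3 -c -mcpu=amdfam10 repro.d
import core.bitop : popcnt;

// Sums the set bits of every element, calling druntime's popcnt
// once per loop iteration.
uint countBits(const(uint)[] data)
{
    uint total = 0;
    foreach (v; data)
        total += popcnt(v);
    return total;
}

void main()
{
    import std.stdio : writeln;

    uint[] data = [0x0, 0xFF, 0xFFFF_FFFF];
    writeln(countBits(data));
}
```

If the generated code contains a call into druntime rather than a `popcnt` instruction inside the loop, that matches the behaviour described here.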
Jan 19 2016
parent reply David Nadlinger via digitalmars-d-ldc <digitalmars-d-ldc puremagic.com> writes:
Hi Stefan,

On 19 Jan 2016, at 14:18, Stefan Koch via digitalmars-d-ldc wrote:
> I just compiled a simple program calling popcnt in a loop.
> It does not generate the intrinsic even when compiled with
> -O3 -c -mcpu=amdfam10
You mean it emits a function call to libdruntime-ldc instead of just the intrinsic? In that case, it's probably the inlining problem that has been haunting us for ages (can't use the LLVM inliner because core.bitop doesn't actually get compiled, and we are not using DMD's front-end inliner either).

If that's the case, a workaround would be to either copy/paste the function into your source code, or add the druntime module to the build (making sure to use -singleobj for the ldc2 driver).

Either way, one of the next important goals for LDC should be finally implementing proper force-inline support (that, unlike DMD's pragma, also works when the inliner is not otherwise active, and across all module boundaries).

David
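
The first workaround (keeping a copy of the function in your own source, so the optimizer can see its body) could be sketched as follows. The name `localPopcnt` is hypothetical, and the body is a generic parallel bit count, not necessarily the code in core.bitop. Whether the optimizer reduces it to a single `popcnt` instruction depends on the LLVM version and the target CPU, so treat this as an illustration only.

```d
// Hypothetical local bit-counting routine. Because the body lives in the
// same module as the caller, the optimizer is able to inline it.
uint localPopcnt(uint x) pure nothrow @nogc @safe
{
    // Count bits in pairs, then nibbles, then bytes.
    x = x - ((x >> 1) & 0x5555_5555);
    x = (x & 0x3333_3333) + ((x >> 2) & 0x3333_3333);
    x = (x + (x >> 4)) & 0x0F0F_0F0F;
    // The multiplication sums the four byte counts into the top byte.
    return (x * 0x0101_0101) >> 24;
}

unittest
{
    assert(localPopcnt(0) == 0);
    assert(localPopcnt(0xFF) == 8);
    assert(localPopcnt(uint.max) == 32);
}
```

The unittest block only runs when the module is compiled with -unittest.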
Jan 19 2016
next sibling parent reply Kagamin <spam here.lot> writes:
On Tuesday, 19 January 2016 at 20:01:31 UTC, David Nadlinger wrote:
> Either way, one of the next important goals for LDC should be
> finally implementing proper force-inline support (that, unlike
> DMD's pragma, also works when the inliner is not otherwise
> active, and across all module boundaries).
Why? Stefan is compiling with -O3.
Jan 20 2016
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 20 January 2016 at 09:03:16 UTC, Kagamin wrote:
> On Tuesday, 19 January 2016 at 20:01:31 UTC, David Nadlinger wrote:
>> Either way, one of the next important goals for LDC should be
>> finally implementing proper force-inline support (that, unlike
>> DMD's pragma, also works when the inliner is not otherwise
>> active, and across all module boundaries).
> Why? Stefan is compiling with -O3.
Because the runtime is not visible as source code. If it were, LLVM could turn this into the popcnt instruction. But the inliner is blind when the call goes into a precompiled library...
Jan 20 2016
parent reply Kagamin <spam here.lot> writes:
On Wednesday, 20 January 2016 at 17:35:02 UTC, Stefan Koch wrote:
> Because the runtime is not visible as source code.
> If it were, LLVM could turn this into the popcnt instruction.
> But the inliner is blind when the call goes into a precompiled library...
If a function has pragma(inline) and the inliner doesn't inline it, then the pragma is not implemented. And how does dmd do it if the function source is not visible?
Jan 21 2016
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 21 January 2016 at 09:03:54 UTC, Kagamin wrote:
> On Wednesday, 20 January 2016 at 17:35:02 UTC, Stefan Koch wrote:
>> Because the runtime is not visible as source code.
>> If it were, LLVM could turn this into the popcnt instruction.
>> But the inliner is blind when the call goes into a precompiled library...
> If a function has pragma(inline) and the inliner doesn't inline it, then the pragma is not implemented. And how does dmd do it if the function source is not visible?
dmd does not, at least not if you use the runtime function popcnt. It does use the intrinsic if you call the intrinsic _popcnt directly.
Jan 21 2016
prev sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 19 January 2016 at 20:01:31 UTC, David Nadlinger wrote:
> Hi Stefan,
>
> On 19 Jan 2016, at 14:18, Stefan Koch via digitalmars-d-ldc wrote:
>> [...]
> [...]
Hi David, thanks for the explanation.
Jan 20 2016