digitalmars.D.learn - Inline assembly and Profiling
- Matthew Dudley, Feb 29 2016
- Marco Leise, Mar 05 2016
I'm working on a chess engine side-project, and I'm starting to get into profiling and optimization. One of the optimizations I've made involves some inline assembly, and I ran across some apparently bizarre behavior today. I just wanted to double-check that I'm not doing something wrong. Here's the behavior boiled down:

    import std.stdio;

    ubyte LS1B(ulong board)
    {
        asm
        {
            bsf RAX, board;
        }
    }

    void main()
    {
        auto one = 0x939839FA;
        assert(one.LS1B == 1, "Wrong LS1B!");
    }

If I run this through DMD without profiling on, it runs successfully, but with profiling on, the assertion fails. And in the actual code, it returns seemingly random numbers.

Is the profiling code stomping on my toes here? Am I not allowed to just write a single instruction into RAX like this with profiling on? Or is this just a compiler bug?
Feb 29 2016
On Tue, 01 Mar 2016 02:30:04 +0000, Matthew Dudley <pontifechs gmail.com> wrote:

> Is the profiling code stomping on my toes here? Am I not allowed to just write a single instruction into RAX like this with profiling on? Or is this just a compiler bug?

I didn't check the documentation, but I believe you have to store RAX into some variable and return that when you use inline assembly. In any case you should report a bug about this. If this code is correct, then DMD assumes you implicitly set the return value inside the asm block, and profiling should save RAX. If this is not intended, then the function is missing a return statement.

Alternatively, you can turn this into a naked function by starting your asm block with "naked" and adding an explicit "ret" at the end. Naked asm means that the function contains only the instructions you have explicitly written down, circumventing the profiling instrumentation.

Either way, functions with DMD-style inline assembly cannot be inlined at all, which means you are better off looking into the core.bitops compiler intrinsics.

Also, code coverage or profiling (I forgot which one) used to not work in multi-threaded code! What I typically do is compile on Linux with GDC or LDC and use an external sampling profiler such as OProfile. You will need to change some compiler settings (no inlining, debug information, keep frame pointers) so that the function call stack can actually be reasoned about. After a profile run you can then display the result in various ways. At first these are confusing, but you'll get the hang of it after a while. For example, you could display sample counts per line of code, or display a call graph which tells you the time spent in a function, separated by call site.

OProfile, being a system profiler, is not limited to your program. It can include time spent in kernel functions or just profile the whole system at once.

-- Marco
Mar 05 2016
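
Two of the suggestions in the reply above, returning the value explicitly and switching to the core.bitops intrinsic, might look roughly like the sketch below. The names ls1bIntrinsic and ls1bAsm are made up for illustration and do not come from the thread. The asm variant assumes a compiler that supports DMD-style inline assembly on x86-64 (DMD or LDC), so it sits behind a version guard. Whether an explicit return is enough to keep the profiling instrumentation from clobbering the result was not confirmed in the thread, so treat this as something to try rather than a verified fix.

```d
import core.bitops : bsf;

/// Index of the least significant set bit, via the compiler intrinsic.
/// bsf's result is undefined for 0, so callers must rule that case out.
ubyte ls1bIntrinsic(ulong board)
{
    assert(board != 0, "bsf is undefined for 0");
    return cast(ubyte) bsf(board);
}

version (D_InlineAsm_X86_64)
{
    /// Inline-assembly variant: the result is copied into a local and
    /// returned explicitly instead of relying on what is left in RAX.
    ubyte ls1bAsm(ulong board)
    {
        ulong index;
        asm
        {
            bsf RAX, board;
            mov index, RAX;
        }
        return cast(ubyte) index;
    }
}

void main()
{
    enum ulong board = 0x939839FA; // low byte 0xFA = 0b1111_1010
    assert(ls1bIntrinsic(board) == 1, "Wrong LS1B!");
    version (D_InlineAsm_X86_64)
        assert(ls1bAsm(board) == 1, "Wrong LS1B!");
}
```

The intrinsic version has no asm block, so the compiler is free to inline it, which is the advantage the reply points out for core.bitops.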