digitalmars.D.learn - dmd asm output
- John Colvin (35/35) Mar 31 2013 I've been learning assembler a bit and I decided to have a look
- bearophile (7/8) Mar 31 2013 In the dmd sources there are the sources for those array
- bearophile (4/6) Mar 31 2013 Sorry, I was wrong. The SSE ops are done elsewhere. You see that
- John Colvin (4/10) Mar 31 2013 Woops, sorry the actual filename I used was sse.d
- nazriel (37/73) Apr 01 2013 It just looks like wrong snippet. Probably GDB isn't best
- js.mdnq (14/55) Apr 01 2013 What's after the code?
- Artur Skawina (10/21) Apr 01 2013 This is just how objdump/gdb shows the code - it does *not* display
- John Colvin (2/29) Apr 01 2013 thanks, that explains it.
I've been learning assembler a bit and I decided to have a look at what dmd spits out. I tried a simple function with arrays to see what vectorization gets done:

    void addto(int[] a, int[] b)
    {
        a[] += b[];
    }

    dmd -O -release -inline -noboundscheck -gc -c test.d

Disassembled with gdb:

    _D3sse5addtoFAiAiZv:
    0x0000000000000040 <+0>:  push rbp
    0x0000000000000041 <+1>:  mov rbp,rsp
    0x0000000000000044 <+4>:  sub rsp,0x30
    0x0000000000000048 <+8>:  mov QWORD PTR [rbp-0x20],rdi
    0x000000000000004c <+12>: mov QWORD PTR [rbp-0x18],rsi
    0x0000000000000050 <+16>: mov QWORD PTR [rbp-0x10],rdx
    0x0000000000000054 <+20>: mov QWORD PTR [rbp-0x8],rcx
    0x0000000000000058 <+24>: mov rcx,QWORD PTR [rbp-0x18]
    0x000000000000005c <+28>: mov rax,QWORD PTR [rbp-0x20]
    0x0000000000000060 <+32>: mov rdx,rax
    0x0000000000000063 <+35>: mov QWORD PTR [rbp-0x28],rdx
    0x0000000000000067 <+39>: mov rdx,QWORD PTR [rbp-0x8]
    0x000000000000006b <+43>: mov rdi,QWORD PTR [rbp-0x10]
    0x000000000000006f <+47>: mov rsi,rdx
    0x0000000000000072 <+50>: mov rdx,QWORD PTR [rbp-0x28]
    0x0000000000000076 <+54>: call 0x7b <_D3sse5addtoFAiAiZv+59>
    0x000000000000007b <+59>: mov rsp,rbp
    0x000000000000007e <+62>: pop rbp
    0x000000000000007f <+63>: ret

This looks nothing like what I expected. At first I thought maybe it was due to a crazy calling convention, but adding extern(C) changed nothing.

Can anyone explain what on earth is going on here? All that moving things on and off the stack, a call to the next line (strange) and then we're done bar the cleanup? I feel I must be missing something.
Mar 31 2013
John Colvin:
> Can anyone explain what on earth is going on here?

In the dmd sources there are the sources for those array operations too. In what you are seeing I think something is not recognizing the SSE+ instructions.

Bye,
bearophile
Mar 31 2013
> In what you are seeing I think something is not recognizing the SSE+ instructions.

Sorry, I was wrong. The SSE ops are done elsewhere. You see that "call 0x7b <_D3sse5addtoFAiAiZv+59>".

Bye,
bearophile
Mar 31 2013
On Monday, 1 April 2013 at 02:03:12 UTC, bearophile wrote:
>> In what you are seeing I think something is not recognizing the SSE+ instructions.
> Sorry, I was wrong. The SSE ops are done elsewhere. You see that "call 0x7b <_D3sse5addtoFAiAiZv+59>".
> Bye, bearophile

Whoops, sorry, the actual filename I used was sse.d. You can see that in the function name at the top, which is the same name as in the call.
Mar 31 2013
On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
> I've been learning assembler a bit and I decided to have a look at what dmd spits out.
> [...]
> Can anyone explain what on earth is going on here? All that moving things on and off the stack, a call to the next line (strange) and then we're done bar the cleanup? I feel i must be missing something.

It just looks like the wrong snippet. GDB probably isn't the best assembly-level debugger.

    .text._D4test5addtoFAiAiZAi:08000044 public _D4test5addtoFAiAiZAi
    .text._D4test5addtoFAiAiZAi:08000044 _D4test5addtoFAiAiZAi proc near
    .text._D4test5addtoFAiAiZAi:08000044
    .text._D4test5addtoFAiAiZAi:08000044 arg_0 = dword ptr 8
    .text._D4test5addtoFAiAiZAi:08000044 arg_8 = dword ptr 10h
    .text._D4test5addtoFAiAiZAi:08000044 arg_C = dword ptr 14h
    .text._D4test5addtoFAiAiZAi:08000044
    .text._D4test5addtoFAiAiZAi:08000044 push ebp
    .text._D4test5addtoFAiAiZAi:08000045 mov ebp, esp
    .text._D4test5addtoFAiAiZAi:08000047 push dword ptr [esp+0Ch]
    .text._D4test5addtoFAiAiZAi:0800004B push [ebp+arg_0]
    .text._D4test5addtoFAiAiZAi:0800004E push [ebp+arg_C]
    .text._D4test5addtoFAiAiZAi:08000051 push [ebp+arg_8]
    .text._D4test5addtoFAiAiZAi:08000054 call _arraySliceSliceAddass_i
    .text._D4test5addtoFAiAiZAi:08000059 add esp, 10h
    .text._D4test5addtoFAiAiZAi:0800005C pop ebp
    .text._D4test5addtoFAiAiZAi:0800005D retn 10h
    .text._D4test5addtoFAiAiZAi:0800005D _D4test5addtoFAiAiZAi endp

Pardon the 32 bits; my free copy of IDA doesn't handle 64-bit too well. The only difference is that the arguments here are passed on the stack instead of in rdi, rsi, etc., as happens under the System V AMD64 calling convention.
Apr 01 2013
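Editorial note: the IDA listing in the post above shows the array operation `a[] += b[]` compiled to a call into a runtime helper (`_arraySliceSliceAddass_i`) rather than to an inline loop. As a rough sketch of what such a helper computes, assuming plain element-wise semantics and ignoring any SSE fast paths, the hypothetical `addAssign` below stands in for it. It is an illustration only, not druntime's actual implementation.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the runtime helper (NOT druntime's real code):
// adds b to a element by element and returns the modified slice.
int[] addAssign(int[] a, const(int)[] b)
{
    assert(a.length == b.length, "array operation requires equal lengths");
    foreach (i, ref x; a)
        x += b[i];
    return a;
}

void main()
{
    int[] a = [1, 2, 3, 4];
    int[] b = [10, 20, 30, 40];

    int[] viaArrayOp = a.dup;
    viaArrayOp[] += b[];                 // the array operation from the original post

    int[] viaLoop = addAssign(a.dup, b); // the explicit loop

    assert(viaArrayOp == viaLoop);
    writeln(viaLoop);                    // [11, 22, 33, 44]
}
```

Seen this way, the register and stack shuffling in the disassembly is argument setup for that call: two slices, each a length and a pointer.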
On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
> I've been learning assembler a bit and I decided to have a look at what dmd spits out.
> [...]
> Can anyone explain what on earth is going on here? All that moving things on and off the stack, a call to the next line (strange) and then we're done bar the cleanup? I feel i must be missing something.

What's after the code? The 0x76 call is an inline call function, the ret returns it. The stuff before it is setting up the registers for the call and what comes after

> 0x0000000000000076 <+54>: call 0x7b <_D3sse5addtoFAiAiZv+59>
> 0x000000000000007b <+59>: mov rsp,rbp
> 0x000000000000007e <+62>: pop rbp
> 0x000000000000007f <+63>: ret

As you can see, the call is calling the function right below it, but where it returns to depends on what is on the stack (since ip is being popped into rbp). To me, and this is a guess, this looks like some type of table of functions being called (the ret is being redirected to somewhere other than the place it was called from). So there is much more going on than meets the eye. It would be easier to understand if you stepped through the code to see where the ret is headed.
Apr 01 2013
On 04/01/13 12:24, js.mdnq wrote:
> On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
>
> What's after the code? The 0x76 call is an inline call function, the ret returns it. The stuff before it is setting up the registers for the call and what comes after
>
>> 0x0000000000000076 <+54>: call 0x7b <_D3sse5addtoFAiAiZv+59>
>> 0x000000000000007b <+59>: mov rsp,rbp
>> 0x000000000000007e <+62>: pop rbp
>> 0x000000000000007f <+63>: ret
>
> As you can see, the call is calling the function right below it, [...]

This is just how objdump/gdb shows the code - it does *not* display relocations inline, so you get this misleading output. The call instruction will not end up having a zero offset (that is why it seems to point at the next op), but will be fixed up to call the right function.

Run

    objdump -dr your_obj_or_exe_file

and the real call target will be shown as a relocation entry after the call instruction.

artur
Apr 01 2013
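Editorial note: the arithmetic behind the reply above can be checked by hand. A near `call` with a 32-bit displacement is 5 bytes long, and its target is the address of the following instruction plus that displacement. The sketch below uses a hypothetical helper named `callTarget`. It assumes the displacement field in the unlinked object file is still zero, which is what the displayed target `0x7b` implies. The second displacement is made up purely for illustration.

```d
import std.stdio : writefln;

// Target of a near `call rel32`: address of the next instruction
// plus the signed 32-bit displacement stored in the instruction.
long callTarget(long callAddr, int rel32)
{
    enum long callLength = 5; // 1 opcode byte + 4 displacement bytes
    return callAddr + callLength + rel32;
}

void main()
{
    // Unrelocated object file: displacement is 0, so the disassembler
    // shows the call pointing at the instruction right after it.
    writefln("0x%x", callTarget(0x76, 0));     // 0x7b

    // Hypothetical displacement after the linker has applied the
    // relocation; the value 0x100 is invented for this example.
    writefln("0x%x", callTarget(0x76, 0x100)); // 0x17b
}
```

This is why the call at `<+54>` appears to target `<+59>`: the linker has not yet filled in the displacement, and the disassembler has no way to show the symbol unless it is asked to print relocations.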
On Monday, 1 April 2013 at 11:10:56 UTC, Artur Skawina wrote:
> This is just how objdump/gdb shows the code - it does *not* display relocations inline, so you get this misleading output. The call instruction will not end up having a zero offset (that is why it seems to point at the next op), but will be fixed up to call the right function.
>
> Run objdump -dr your_obj_or_exe_file and the real call target will be shown as a relocation entry after the call instruction.
>
> artur

Thanks, that explains it.
Apr 01 2013