
digitalmars.D.bugs - [Issue 14936] New: Dividing by a power of 2 slow on 32bit

https://issues.dlang.org/show_bug.cgi?id=14936

          Issue ID: 14936
           Summary: Dividing by a power of 2 slow on 32bit
           Product: D
           Version: D2
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: dmd
          Assignee: nobody puremagic.com
          Reporter: ibuclaw gdcproject.org

Noticed from results in: http://bugzilla.gdcproject.org/show_bug.cgi?id=180

"""
for: dmd -O -inline -release
  tByte: 0.004,872 secs
 tShort: 0.006,896 secs
   tInt: 0.008,672 secs
  tLong: 0.036,864 secs
"""

Reduced test:
long test(long l)
{
  return l / 2;
}


Compiles down to: http://goo.gl/10dWmf
long example.test(long):
    push   %ebp
    mov    %esp,%ebp
    push   %eax
    xor    %ecx,%ecx
    mov    0xc(%ebp),%edx
    push   %ebx
    mov    $0x2,%ebx
    mov    0x8(%ebp),%eax
    push   %ecx
    push   %ebx
    push   %edx
    push   %eax
    call   __divdi3        ; <--- !!!
    add    $0x10,%esp
    pop    %ebx
    mov    %ebp,%esp
    pop    %ebp
    ret    $0x8
    add    %al,(%eax)      ; padding bytes after the ret, not part of the function


In comparison to GDC: http://goo.gl/XaBqdA
    push   %ebx
    mov    12(%esp), %ebx
    xor    %edx, %edx
    mov    8(%esp), %ecx
    mov    %ebx, %eax
    shr    $31, %eax
    add    %ecx, %eax
    adc    %ebx, %edx
    shrd   $1, %edx, %eax
    pop    %ebx
    sar    %edx
    ret
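
For reference, the GDC sequence above adds the sign bit to the dividend and
then does an arithmetic right shift across the 64-bit register pair. A rough
D sketch of the same idea, generalised to any power of two, is below. The
name divPow2 and its parameter k are made up for illustration; this is not
code from dmd, GDC, or druntime, only an assumption about what an inlined
expansion could look like.

```d
// Hypothetical helper (illustration only): signed 64-bit division by
// 2^^k for 1 <= k <= 62, rounding toward zero like the `/` operator.
long divPow2(long l, uint k)
{
    // (l >> 63) is -1 for negative l and 0 otherwise (arithmetic shift).
    // The unsigned shift turns that into a bias of (2^^k - 1) or 0.
    long bias = (l >> 63) >>> (64 - k);

    // Adding the bias makes the arithmetic shift round toward zero
    // instead of toward negative infinity.
    return (l + bias) >> k;
}

unittest
{
    // Compare against the built-in division for divisors 2, 4, ..., 128.
    foreach (long l; [-7L, -1L, 0L, 1L, 7L, long.min, long.max])
        foreach (uint k; 1 .. 8)
            assert(divPow2(l, k) == l / (1L << k));
}
```

For k == 1 the bias is just the sign bit, which matches the shr $31 / add /
adc / shrd / sar pattern in the GDC output. The unittest block can be run
with: dmd -unittest -main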


You can replace '2' with '4', '8', '16', ... '128' and observe the same
difference (the optimization changes for 256 and higher).

--
Aug 19 2015