digitalmars.D.bugs - [Issue 14758] New: TypeInfo causes excessive binary bloat

https://issues.dlang.org/show_bug.cgi?id=14758

          Issue ID: 14758
           Summary: TypeInfo causes excessive binary bloat
           Product: D
           Version: D2
          Hardware: All
                OS: All
            Status: NEW
          Severity: major
          Priority: P1
         Component: dmd
          Assignee: nobody puremagic.com
          Reporter: slavo5150 yahoo.com

The way DMD generates TypeInfo bloats binaries so badly that D cannot be used
for resource-constrained systems until this problem is solved.

=Evidence=
Create two source files, object.d and test.d, in the same directory as shown
below.

// object.d
module object;

alias immutable(char)[] string;

class Object
{ }

class TypeInfo
{ }

class TypeInfo_Class : TypeInfo
{
    ubyte[136] ignore;
}

extern(C) void _d_dso_registry(void* data)
{ }

// test.d
module test;

long sys_write(long arg1, in void* arg2, long arg3)
{
    long result;

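    asm
    {
        mov RAX, 1;         // syscall number of write on x86-64 Linux
        mov RDI, arg1;      // file descriptor
        mov RSI, arg2;      // buffer pointer
        mov RDX, arg3;      // byte count
        syscall;
        mov result, RAX;    // store the syscall's return value
    }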

    return result;
}

void write(in string text)
{
    sys_write(2, text.ptr, text.length);
}

void write(A...)(in A a)
{
    foreach(t; a)
    {
        write(t);
    }
}

final abstract class TestClass1 { }
final abstract class TestClass2 { }
final abstract class TestClass3 { }
final abstract class TestClass4 { }
final abstract class TestClass5 { }
final abstract class TestClass6 { }
final abstract class TestClass7 { }
final abstract class TestClass8 { }
final abstract class TestClass9 { }

extern(C) void main()
{
    write("Hello\n");
}

Compile on 64-bit Linux with the following command. This method compiles
without phobos or druntime to obtain the smallest possible binary:
dmd -m64 -defaultlib= -debuglib= -conf= -betterC -release object.d test.d -oftest

Analyze with
objdump -s -j .rodata test

Contents of section .rodata:
 4006c0 01000200 00000000 00000000 00000000  ................
 4006d0 f0064000 00000000 00000000 00000000  .. .............
 4006e0 6f626a65 63742e4f 626a6563 74000000  object.Object...
 4006f0 68126000 00000000 00000000 00000000  h.`.............
 400700 20074000 00000000 00000000 00000000   . .............
 400710 6f626a65 63742e54 79706549 6e666f00  object.TypeInfo.
 400720 08136000 00000000 00000000 00000000  ..`.............
 400730 d8074000 00000000 00000000 00000000  .. .............
 400740 00000000 00000000 00000000 00000000  ................
 400750 00000000 00000000 00000000 00000000  ................
 400760 00000000 00000000 00000000 00000000  ................
 400770 00000000 00000000 00000000 00000000  ................
 400780 00000000 00000000 00000000 00000000  ................
 400790 00000000 00000000 00000000 00000000  ................
 4007a0 00000000 00000000 00000000 00000000  ................
 4007b0 00000000 00000000 00000000 00000000  ................
 4007c0 00000000 00000000 54797065 496e666f  ........TypeInfo
 4007d0 5f436c61 73730000 a8136000 00000000  _Class....`.....
 4007e0 00084000 00000000 00000000 00000000  .. .............
 4007f0 74657374 2e546573 74436c61 73733100  test.TestClass1.
 400800 48146000 00000000 00000000 00000000  H.`.............
 400810 30084000 00000000 00000000 00000000  0. .............
 400820 74657374 2e546573 74436c61 73733200  test.TestClass2.
 400830 e8146000 00000000 00000000 00000000  ..`.............
 400840 60084000 00000000 00000000 00000000  `. .............
 400850 74657374 2e546573 74436c61 73733300  test.TestClass3.
 400860 88156000 00000000 00000000 00000000  ..`.............
 400870 90084000 00000000 00000000 00000000  .. .............
 400880 74657374 2e546573 74436c61 73733400  test.TestClass4.
 400890 28166000 00000000 00000000 00000000  (.`.............
 4008a0 c0084000 00000000 00000000 00000000  .. .............
 4008b0 74657374 2e546573 74436c61 73733500  test.TestClass5.
 4008c0 c8166000 00000000 00000000 00000000  ..`.............
 4008d0 f0084000 00000000 00000000 00000000  .. .............
 4008e0 74657374 2e546573 74436c61 73733600  test.TestClass6.
 4008f0 68176000 00000000 00000000 00000000  h.`.............
 400900 20094000 00000000 00000000 00000000   . .............
 400910 74657374 2e546573 74436c61 73733700  test.TestClass7.
 400920 08186000 00000000 00000000 00000000  ..`.............
 400930 50094000 00000000 00000000 00000000  P. .............
 400940 74657374 2e546573 74436c61 73733800  test.TestClass8.
 400950 a8186000 00000000 00000000 00000000  ..`.............
 400960 80094000 00000000 00000000 00000000  .. .............
 400970 74657374 2e546573 74436c61 73733900  test.TestClass9.
 400980 48196000 00000000 48656c6c 6f0a0000  H.`.....Hello...
 400990 06000000 00000000 88094000 00000000  .......... .....

The dump shows that although TypeInfo is used nowhere in the code, implicitly
or explicitly, it is still emitted into the binary for every class, with the
TypeInfo.name field being the greatest contributor to the problem.
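
As a rough check of the per-class cost, the strides can be read straight off
the dump above. The following is only a sketch: it is a separate helper
program built with a normal druntime/Phobos toolchain, not part of the test
case. The addresses are copied from the dump; treating the 0x60xxxx values as
pointers to fixed-size records in a writable data section is an assumption,
since that section is not shown here.

```d
import std.stdio : writefln;

// Addresses of class-name strings in .rodata, copied from the dump.
enum ulong nameTestClass1 = 0x4007f0; // "test.TestClass1"
enum ulong nameTestClass2 = 0x400820; // "test.TestClass2"
enum ulong nameTestClass9 = 0x400970; // "test.TestClass9"

// Little-endian 8-byte values stored right after those two names.
// ASSUMPTION: these point at per-class records in a writable data section.
enum ulong refAfterClass1 = 0x601448; // bytes 48 14 60 00 at 0x400800
enum ulong refAfterClass2 = 0x6014e8; // bytes e8 14 60 00 at 0x400830

enum rodataPerClass = nameTestClass2 - nameTestClass1;
enum dataPerClass   = refAfterClass2 - refAfterClass1;

// The arithmetic is checked at compile time.
static assert(rodataPerClass == 48);
static assert(dataPerClass == 160);
// The same .rodata stride holds across all nine test classes.
static assert(nameTestClass1 + 8 * rodataPerClass == nameTestClass9);

void main()
{
    writefln("%s bytes of .rodata + %s bytes of data = %s bytes per empty class",
             rodataPerClass, dataPerClass, rodataPerClass + dataPerClass);
}
```

Under that reading, each empty class costs about 208 bytes. The names in this
test are only 16 bytes each; types nested deep in a module hierarchy carry
much longer fully qualified names.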

I have an embedded system with a large number of peripherals, all programmable
using memory-mapped I/O, resulting in hundreds of registers and more than 1000
bitfields
(http://www.st.com/web/en/resource/technical/document/reference_manual/DM00031020.pdf).
I have modeled these statically with D's fantastic modelling features to
produce an object-oriented, easily-navigable hierarchy of the hardware memory
map (example:
https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/gpio.d).
It is an excellent model that generates fast code and plays very well with
tooling.  The only problem is the TypeInfo bloat.
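
To illustrate the shape of such a model, here is a minimal sketch. It is not
the code from the linked repository: the names GPIOA, MODER and ODR, the base
address and the offsets are all chosen for illustration and should be treated
as assumptions. It is built with a normal druntime/Phobos toolchain so that it
can print a class name. Every register is a non-instantiable class used purely
as a namespace, yet each one still has a TypeInfo carrying its fully qualified
name.

```d
import std.stdio : writeln;

// Hypothetical peripheral model; all names and numbers are illustrative.
final abstract class GPIOA
{
    enum uint baseAddress = 0x4002_0000; // assumed peripheral base address

    // Each register is a static-only type; no object is ever created.
    static final abstract class MODER
    {
        enum uint address = baseAddress + 0x00; // assumed offset
    }

    static final abstract class ODR
    {
        enum uint address = baseAddress + 0x14; // assumed offset
    }
}

// Register addresses are resolved entirely at compile time.
static assert(GPIOA.MODER.address == 0x4002_0000);
static assert(GPIOA.ODR.address == 0x4002_0014);

void main()
{
    // The fully qualified name comes from the class's TypeInfo.
    writeln(typeid(GPIOA.ODR).name);
}
```

With hundreds of registers and more than 1000 bitfields modeled this way, the
number of such types, and of name strings, grows accordingly.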

For even a minuscule proof of concept as linked above, the resulting binary is
400KB, 80 times the size it should be: if the executable is hacked to remove
the TypeInfo data (sometimes works, sometimes doesn't), the resulting binary is
only about 5KB.  And it gets worse as more peripherals are added.

Even when compiling with GDC's -fdata-sections and -ffunction-sections and
linking with --gc-sections, the data cannot be removed, due to the way the
compiler emits TypeInfo to the binary.

It is intolerable because some binaries are so bloated they cannot be uploaded
to the hardware's flash memory, and I therefore cannot continue work on this
platform in D.

This issue was also filed under GDC's bug tracker
(http://bugzilla.gdcproject.org/show_bug.cgi?id=184), but my investigation has
led me to believe it to be intrinsically a DMD issue.

The TypeInfo implementation in DMD is a code smell, as evidenced by the fact
that the compiler checks the size of the TypeInfo declaration in the runtime
against the hard-coded TypeInfo generation in DMD, resulting in hacks as
demonstrated above in object.d (e.g. ubyte[size] ignore;).  DMD is a compiler
capable of parsing D code, so there is no reason for the compiler to hard-code
the TypeInfo generation when it can read the definition from druntime and use
that.
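
To make the size coupling concrete, here is a minimal sketch, assuming a
normal druntime build (not the stub object.d above) and a hypothetical class
name StubClassInfo. It shows that the 136 bytes of padding, plus the two
hidden pointer-sized fields every D class instance carries (vtable pointer and
monitor), come to 152 bytes on a 64-bit target. It also prints the instance
size of the runtime's real TypeInfo_Class for comparison; that value depends
on the druntime version, so nothing is asserted about it.

```d
import std.stdio : writeln;

// Hypothetical stand-in mirroring the padded TypeInfo_Class from object.d.
class StubClassInfo
{
    ubyte[136] ignore; // opaque padding in place of the real fields
}

// Hidden vtable pointer + monitor, followed by the padding.
enum expectedSize = 2 * size_t.sizeof + 136;
static assert(__traits(classInstanceSize, StubClassInfo) == expectedSize);

void main()
{
    writeln("stub instance size:           ",
            __traits(classInstanceSize, StubClassInfo));
    // Size as declared by the installed druntime; varies between releases.
    writeln("TypeInfo_Class instance size: ",
            __traits(classInstanceSize, TypeInfo_Class));
}
```

If the two numbers differ on a given toolchain, the stub's padding would have
to be adjusted by hand, which is the kind of coupling described above.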

There have been proposals to add a -nortti flag to the compiler to remove
TypeInfo completely, but that would force a compromise on slicing, postblit and
other features.  Such compromises are most undesirable.

A potential solution is Issue 12270 - Move TypeInfo to the D Runtime.  But the
emitted object code must be generated in a way that allows unused data and
functions to be removed by either --gc-sections or link-time optimization, or
better yet, just not generated at all.

--
Jul 01 2015