D.gnu - Removing RTTI from binaries

reply "Mike" <none none.com> writes:
I'm building some code that is heavily templated.  Therefore, I 
have many very small classes.  I was surprised to see my binaries 
growing very large, disproportionately to the amount of code I 
was adding.  I inspected the binaries with objdump and found 
contents of the .rodata section like the following:

  801fa00 6572616c 2e526567 69737465 72212830  eral.Register!(0
  801fa10 2c206361 73742841 63636573 73293729  , cast(Access)7)
  801fa20 2e526567 69737465 722e4269 74212831  .Register.Bit!(1
  801fa30 312c2063 61737428 4d757461 62696c69  1, cast(Mutabili
  801fa40 74792932 292e4269 74000000 6d6d696f  ty)2).Bit...mmio
  801fa50 2e506572 69706865 72616c21 28414842  .Peripheral!(AHB
  801fa60 312c2031 35333630 292e5065 72697068  1, 15360).Periph
  801fa70 6572616c 2e526567 69737465 72212830  eral.Register!(0
  801fa80 2c206361 73742841 63636573 73293729  , cast(Access)7)
  801fa90 2e526567 69737465 722e4269 74212831  .Register.Bit!(1
  801faa0 322c2063 61737428 4d757461 62696c69  2, cast(Mutabili
  801fab0 74792930 292e4269 74000000 6275732e  ty)0).Bit...bus.
  801fac0 41504232 00000000 6275732e 41504231  APB2....bus.APB1
  801fad0 00000000 6275732e 41484233 00000000  ....bus.AHB3....
  801fae0 6275732e 41484232 00000000 6275732e  bus.AHB2....bus.
  801faf0 41484231 00000000 6275732e 436f7265  AHB1....bus.Core
  801fb00 50657269 70686572 616c7300 54797065  Peripherals.Type
  801fb10 496e666f 5f690000 54797065 496e666f  Info_i..TypeInfo
  801fb20 5f456e75 6d000000 54797065 496e666f  _Enum...TypeInfo
  801fb30 5f417272 61790000 54797065 496e666f  _Array..TypeInfo

Most of my code just uses classes as namespaces calling static 
methods and properties.  The amount of code in my .text segment 
is only a few hundred bytes, but the .rodata section is several 
thousand bytes.

I'm guessing this is RTTI.  Is there any way, either through 
linker scripting, or the compiler to keep this stuff out of my 
binary?

Thanks,
Mike

using GDC 4.9 arm-none-eabi cross-compiler.

compiler flags:
arm-none-eabi-gdc -O3 -nophoboslib -nostdinc -nodefaultlibs 
-nostdlib -fno-emit-moduleinfo -ffunction-sections 
-fdata-sections -Wl,-Tsource/linker/linker.ld -Wl,--gc-sections
Jan 11 2015
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Mike:

 I'm building some code that is heavily templated.  Therefore, I 
 have many very small classes.
This is a non sequitur.
 Most of my code just uses classes as namespaces calling static 
 methods and properties.
Aren't structs better for that? Bye, bearophile
Jan 11 2015
parent reply "Mike" <none none.com> writes:
On Sunday, 11 January 2015 at 15:02:07 UTC, bearophile wrote:
 Mike:

 I'm building some code that is heavily templated.  Therefore, 
 I have many very small classes.
This is a non sequitur.
I believe it is because nearly every one of the instantiated template names appears in the .rodata section, thus causing the binary's size to inflate.
 Most of my code just uses classes as namespaces calling static 
 methods and properties.
Aren't structs better for that?
Not in my current design, as I also make use of inheritance. If you're curious, you can see the code here: https://github.com/JinShil/stm32f42_discovery_demo, specifically the stm32f42 folder. Mike
Jan 11 2015
parent reply Johannes Pfau <nospam example.com> writes:
Am Sun, 11 Jan 2015 15:15:38 +0000
schrieb "Mike" <none none.com>:

 On Sunday, 11 January 2015 at 15:02:07 UTC, bearophile wrote:
 Mike:

 I'm building some code that is heavily templated.  Therefore, 
 I have many very small classes.
This is a non sequitur.
I believe it is because nearly every one of the instantiated template names appears in the .rodata section, thus causing the binary's size to inflate.
That's likely used/caused by the TypeInfo.name property.
 Most of my code just uses classes as namespaces calling static 
 methods and properties.
Aren't structs better for that?
Not in my current design, as I also make use of inheritance. If you're curious, you can see the code here: https://github.com/JinShil/stm32f42_discovery_demo, specifically the stm32f42 folder. Mike
I guess you'd see the same problem with structs. There's no standard way to avoid TypeInfo right now. -fno-rtti would disable TypeInfo completely, but it's not implemented in upstream GDC.

If you can disable TypeInfo for all classes, open gcc/d/d-objfile.cc, search for "// Put out the TypeInfo" in ClassDeclaration::toObjFile, and comment out this line:

  type->getTypeInfo (NULL);

If you only want to disable TypeInfo for some classes, that's more difficult:
https://github.com/D-Programming-microD/GDC/commit/f0614bc9480dacd1ec6bb75277d280afa96e08bb
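To illustrate why every instantiated template leaves a name string behind, here is a minimal sketch. It assumes a hosted build with druntime available (unlike Mike's bare-metal setup), and `Register` and `nameOf` are hypothetical names made up for the example, not taken from Mike's repository. Each distinct instantiation of a class template is its own class with its own TypeInfo, and each TypeInfo carries its own fully qualified name.

```d
// Hypothetical stand-in for the small templated classes discussed above.
class Register(uint offset) {}

// Returns the fully qualified name stored in the class's TypeInfo.
// This string is emitted for each class whether or not it is ever read.
string nameOf(T)()
{
    return typeid(T).name;
}
```

Calling `nameOf!(Register!0)()` and `nameOf!(Register!4)()` yields two different strings, one per instantiation, which is presumably the pattern visible in the objdump output earlier in the thread.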
Jan 11 2015
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Johannes Pfau:

 If you only want to disable TypeInfo for some classes that's 
 more difficult:
This seems a feature that can be useful in standard D (all compilers), with an annotation of some kind like nortti. Bye, bearophile
Jan 11 2015
prev sibling next sibling parent reply "Mike" <none none.com> writes:
On Sunday, 11 January 2015 at 16:57:41 UTC, Johannes Pfau wrote:
 That's likely used/caused by the TypeInfo.name property.
Judging by what I'm seeing, I think you're right. But I'm compiling with -fdata-sections and -Wl,--gc-sections, so shouldn't that put each TypeInfo.name in its own section and strip it out?

Here's what I'm seeing:

--------------------
arm-none-eabi-objdump -t binary/firmware

binary/firmware:     file format elf32-littlearm

SYMBOL TABLE:
08000000 l    d  .text   00000000 .text
08000a44 l    d  .rodata 00000000 .rodata
00000000 l    df *ABS*   00000000 start.d
0800001c l       .text   00000000 handler_address
00000000 l       *UND*   00000000 __aeabi_unwind_cpp_pr0
00000000 l       *UND*   00000000 __aeabi_unwind_cpp_pr1
00000000 l    df *ABS*   00000000
10010000 l       *ABS*   00000000 _stackStart
08000034 g     F .text   0000007e memcpy
08000010 g     F .text   00000014 _D5start7OnResetFZv
080202d4 g       .rodata 00000000 __text_end__
08000004 g     O .text   00000004 ResetHandler
20000000 g       .rodata 00000000 __data_end__
20000000 g       .rodata 00000000 __bss_start__
20000000 g       .rodata 00000000 __bss_end__
08000024 g     F .text   00000010 memset
20000000 g       .rodata 00000000 __data_start__
0800000c g     O .text   00000004 HardFaultHandler
080000b4 g     F .text   0000093c main
08000a28 g     F .text   0000001c _D5start11OnHardFaultFZv

I don't see anything in the symbol table, but...

---------------------
arm-none-eabi-readelf -S binary/firmware
There are 6 section headers, starting at offset 0x28300:

Section Headers:
  [Nr] Name       Type      Addr     Off    Size   ES Flg
  [0]             NULL      00000000 000000 000000 00
  [1]  .text      PROGBITS  08000000 008000 000a44 00 AX
  [2]  .rodata    PROGBITS  08000a44 008a44 01f890 00 A
  [3]  .shstrtab  STRTAB    00000000 0282d4 000029 00
  [4]  .symtab    SYMTAB    00000000 0283f0 000270 10
  [5]  .strtab    STRTAB    00000000 028660 000128 00
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)

You can see the .rodata section is orders of magnitude larger than any other section.

Mike
Jan 13 2015
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 13 January 2015 at 14:20:43 UTC, Mike wrote:
 On Sunday, 11 January 2015 at 16:57:41 UTC, Johannes Pfau wrote:
 That's likely used/caused by the TypeInfo.name property.
Judging by what I'm seeing, I think you're right. But I'm compiling with -fdata-sections and -Wl,--gc-sections, so shouldn't that put each TypeInfo.name in its own section and strip it out?
I remember speaking about it with Martin and Daniel during DConf 2014 and I think it was Daniel who mentioned that by default TypeInfo/ModuleInfo is emitted in some weird packed way. When LDC announced using --gc-sections by default it was mentioned they had to change ModuleInfo emitting to make it actually work. Can it be the same issue?
Jan 13 2015
parent reply "Mike" <none none.com> writes:
On Tuesday, 13 January 2015 at 14:36:15 UTC, Dicebot wrote:
 I remember speaking about it with Martin and Daniel during 
 DConf 2014 and I think it was Daniel who mentioned that by 
 default TypeInfo/ModuleInfo is emitted in some weird packed 
 way. When LDC announced using --gc-sections by default it was 
 mentioned they had to change ModuleInfo emitting to make it 
 actually work.

 Can it be the same issue?
Thanks, Dicebot, for bringing this to my attention. That would explain what I'm seeing. Is this something unique to GDC, or is it an artifact inherited from DMD? Mike
Jan 13 2015
parent reply "Iain Buclaw via D.gnu" <d.gnu puremagic.com> writes:
On 14 January 2015 at 04:00, Mike via D.gnu <d.gnu puremagic.com> wrote:
 On Tuesday, 13 January 2015 at 14:36:15 UTC, Dicebot wrote:
 I remember speaking about it with Martin and Daniel during DConf 2014 and
 I think it was Daniel who mentioned that by default TypeInfo/ModuleInfo is
 emitted in some weird packed way. When LDC announced using --gc-sections by
 default it was mentioned they had to change ModuleInfo emitting to make it
 actually work.

 Can it be the same issue?
Thanks, Dicebot, for bringing this to my attention. That would explain what I'm seeing. Is this something unique to GDC, or is it an artifact inherited from DMD? Mike
It's an artifact inherited from DMD. ModuleInfo is of a dynamic size, depending on what is implemented in the module. See:
https://github.com/D-Programming-Language/druntime/blob/081591237ee7d666ffd81463dac1b7f38e7d9798/src/object_.d#L1589

However, its size is correctly recorded before being sent to be written. The ModuleInfo symbols themselves aren't put into any particular section; they also can't go in rodata because of how the D runtime start-up works, so they end up in the same section as __gshared data.

The same is also true of TypeInfo_Class (alias ClassInfo), where interface vtables are written packed immediately after the data structure ends. Again, its size is treated as dynamic and is correctly recorded before being written, and again it cannot be in rodata because the __monitor field is directly written to.

Iain.
Jan 14 2015
parent reply "Mike" <none none.com> writes:
On Wednesday, 14 January 2015 at 08:42:55 UTC, Iain Buclaw via 
D.gnu wrote:
 On 14 January 2015 at 04:00, Mike via D.gnu 
 <d.gnu puremagic.com> wrote:
 On Tuesday, 13 January 2015 at 14:36:15 UTC, Dicebot wrote:
 I remember speaking about it with Martin and Daniel during 
 DConf 2014 and
 I think it was Daniel who mentioned that by default 
 TypeInfo/ModuleInfo is
 emitted in some weird packed way. When LDC announced using 
 --gc-sections by
 default it was mentioned they had to change ModuleInfo 
 emitting to make it
 actually work.

 Can it be the same issue?
Thanks, Dicebot, for bringing this to my attention. That would explain what I'm seeing. Is this something unique to GDC, or is it an artifact inherited from DMD? Mike
It's an artifact inherited from DMD. ModuleInfo is of a dynamic size, depending on what is implemented in the module. See: https://github.com/D-Programming-Language/druntime/blob/081591237ee7d666ffd81463dac1b7f38e7d9798/src/object_.d#L1589 However, its size is correctly recorded before being sent to be written. The ModuleInfo symbols themselves aren't put into any particular section; they also can't go in rodata because of how the D runtime start-up works, so they end up in the same section as __gshared data. The same is also true of TypeInfo_Class (alias ClassInfo), where interface vtables are written packed immediately after the data structure ends. Again, its size is treated as dynamic and is correctly recorded before being written, and again it cannot be in rodata because the __monitor field is directly written to.
Ok, but I have a mess of classes generated by templates (and I love it). Their `name` properties [1] should go in .rodata, right? But why aren't they being put in their own sections when compiling with -fdata-sections? [1] - https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/libdruntime/object_.d#L81 Mike
Jan 14 2015
parent reply "Mike" <none none.com> writes:
On Wednesday, 14 January 2015 at 09:04:50 UTC, Mike wrote:
 Ok, but I have a mess of classes generated by templates (and I 
 love it).  Their `name` properties [1] should go in .rodata, 
 right?  But why aren't they being put in their own sections 
 when compiling with -fdata-sections?

 [1] - 
 https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/libdruntime/object_.d#L81
Well, I was working on a reduced test case and found that it has something to do with my trace.d file here: https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/trace.d

If I add a trace.writeLine("x") in my program, then the binary goes from 2K to 130K. Anyway, it appears it has nothing to do with TypeInfo. I'll continue to try to reduce.

Thanks for the help and useful information.

Mike
Jan 14 2015
next sibling parent reply "Iain Buclaw via D.gnu" <d.gnu puremagic.com> writes:
On 14 January 2015 at 13:32, Mike via D.gnu <d.gnu puremagic.com> wrote:
 On Wednesday, 14 January 2015 at 09:04:50 UTC, Mike wrote:
 Ok, but I have a mess of classes generated by templates (and I love it).
 Their `name` properties [1] should go in .rodata, right?  But why aren't
 they being put in their own sections when compiling with -fdata-sections?

 [1] -
 https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/libdruntime/object_.d#L81
Well, I was working on a reduced test case and found that it has something to do with my trace.d file here: https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/trace.d If I add a trace.writeLine("x") in my program, then the binary goes from 2K to 130K. Anyway, it appears it has nothing to do with TypeInfo. I'll continue to try to reduce.
Could it be that the entire module trace takes up 128K? Probably not very likely. And I doubt the writeLine/write templates contribute much.
Jan 14 2015
parent reply "Mike" <none none.com> writes:
On Wednesday, 14 January 2015 at 14:20:50 UTC, Iain Buclaw via
D.gnu wrote:

 Could it be that the entire module trace takes up 128K?  
 Probably not
 very likely.  And I doubt the writeLine/write templates 
 contribute to
 much.
Sorry, I didn't see this reply before I submitted my last post. The problem was my .rodata was filling up with the names of every single type in my program. The .text section was quite small (2K), so I don't think it was the executable code generated from the trace module. I'll try to make a reduced test case on Linux that others can run and see. Mike
Jan 14 2015
parent reply "Mike" <none none.com> writes:
On Wednesday, 14 January 2015 at 14:34:48 UTC, Mike wrote:

 I'll try to make a reduced test case on Linux that others can 
 run
 and see.
Ok, here's my reduced test case that runs on Linux 64-bit. I don't know if I really needed to implement the syscalls, but I just wanted to get rid of everything I could so the important stuff would stand out.

test.d
***************************************
void sys_exit(long arg1) nothrow
{
    asm
    {
        "syscall"
        :
        : "a" 60,
          "D" arg1,
        : "memory", "cc", "rcx", "r11";
    }
}

long sys_write(long arg1, in void* arg2, long arg3) nothrow
{
    long result;

    asm
    {
        "syscall"
        : "=a" result
        : "a" 1,
          "D" arg1,
          "S" arg2, "m" arg2,
          "d" arg3
        : "memory", "cc", "rcx", "r11";
    }

    return result;
}

void write(in string text) nothrow
{
    sys_write(2, text.ptr, text.length);
}

void write(A...)(in A a) nothrow
{
    foreach(t; a)
    {
        write(t);
    }
}

final abstract class TestClass1 { }
// final abstract class TestClass2 { }
// final abstract class TestClass3 { }
// final abstract class TestClass4 { }
// final abstract class TestClass5 { }
// final abstract class TestClass6 { }
// final abstract class TestClass7 { }
// final abstract class TestClass8 { }
// final abstract class TestClass9 { }

extern(C) void main()
{
    write("x");
    sys_exit(0);
}
***************************************

compile with:
gdc -static -frelease -fno-emit-moduleinfo -nophoboslib -nostdlib test.d --entry=main -ffunction-sections -fdata-sections -Wl,--gc-sections -o test

show .rodata:
***************************************
objdump -s -j .rodata test

Contents of section .rodata:
 4001c4 74657374 2e546573 74436c61 73733100  test.TestClass1.
 4001d4 78                                   x

See the "test.TestClass1" string there? That's the problem. Now uncomment the other `TestClass`s.

show .rodata with more types:
***************************************
objdump -s -j .rodata test

Contents of section .rodata:
 4001c4 74657374 2e546573 74436c61 73733100  test.TestClass1.
 4001d4 74657374 2e546573 74436c61 73733200  test.TestClass2.
 4001e4 74657374 2e546573 74436c61 73733300  test.TestClass3.
 4001f4 74657374 2e546573 74436c61 73733400  test.TestClass4.
 400204 74657374 2e546573 74436c61 73733500  test.TestClass5.
 400214 74657374 2e546573 74436c61 73733600  test.TestClass6.
 400224 74657374 2e546573 74436c61 73733700  test.TestClass7.
 400234 74657374 2e546573 74436c61 73733800  test.TestClass8.
 400244 74657374 2e546573 74436c61 73733900  test.TestClass9.
 400254 78                                   x

Now imagine I have a highly templated library generating 100s of these little classes.

Now, comment out the `write("x")`.

show .rodata `write("x")` commented out:
***************************************
objdump -s -j .rodata test
objdump: section '.rodata' mentioned in a -j option, but not found in any input file

No .rodata at all!!

IMO the toolchain should be able to recognize that the TypeInfo.name property is not being used and strip it out.

If you can explain the mechanics causing this, please enlighten me. Bug? Enhancement? By design?

Thanks for the help,
Mike
Jan 15 2015
parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 15 January 2015 at 11:04:37 UTC, Mike wrote:
 If you can explain the mechanics causing this, please enlighten 
 me.  Bug? Enhancement? By design?
Random guess: can it possibly confuse template-based variadics with runtime variadics? Latter require RTTI to work. If something like that happens it is surely a bug. I don't see any obvious legitimate reason for this behavior.
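For readers unfamiliar with the distinction, the two kinds of variadics can be sketched as follows. This is only an illustration under the assumption of a hosted build with druntime; the function names are hypothetical. A template variadic resolves its argument types at compile time, while a D-style run-time variadic receives a hidden `_arguments` array of TypeInfo, which is why the latter depends on RTTI.

```d
// Compile-time (template) variadic: the argument types are known statically,
// so no TypeInfo is consulted at run time.
size_t countTemplate(A...)(A args)
{
    return args.length;
}

// Run-time (D-style) variadic: the callee inspects the hidden
// `_arguments` array, whose elements are TypeInfo objects.
size_t countIntsRuntime(...)
{
    size_t n;
    foreach (ti; _arguments)
    {
        if (ti == typeid(int))
            ++n;
    }
    return n;
}
```

For example, `countTemplate(1, "a", 2.0)` counts all three arguments, whereas `countIntsRuntime(1, "a", 2)` counts only the two whose run-time TypeInfo matches `int`.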
Jan 15 2015
next sibling parent reply "Mike" <none none.com> writes:
On Thursday, 15 January 2015 at 11:42:31 UTC, Dicebot wrote:
 Random guess: can it possibly confuse template-based variadics 
 with runtime variadics? Latter require RTTI to work. If 
 something like that happens it is surely a bug.
Thanks for the reply. I have to apologize for the last post, as I didn't fully reduce the code, and posted prematurely.

Here's a further reduction without any templates or variadics, so I'm under the impression that neither templates nor variadics are contributing to the problem.

The following code won't execute, but it will reproduce the problem at hand: Inflating .rodata with TypeInfo.name.

test.d
**************************************
void write(in string text) nothrow {}

final abstract class TestClass1 { }
final abstract class TestClass2 { }
final abstract class TestClass3 { }
final abstract class TestClass4 { }
final abstract class TestClass5 { }
final abstract class TestClass6 { }
final abstract class TestClass7 { }
final abstract class TestClass8 { }
final abstract class TestClass9 { }

extern(C) void main()
{
    write("");
}

compile with:
gdc -static -frelease -fno-emit-moduleinfo -nophoboslib -nostdlib test.d --entry=main -ffunction-sections -fdata-sections -Wl,--gc-sections -o test

objdump -s -j .rodata test
*************************************
Contents of section .rodata:
 400152 74657374 2e546573 74436c61 73733100  test.TestClass1.
 400162 74657374 2e546573 74436c61 73733200  test.TestClass2.
 400172 74657374 2e546573 74436c61 73733300  test.TestClass3.
 400182 74657374 2e546573 74436c61 73733400  test.TestClass4.
 400192 74657374 2e546573 74436c61 73733500  test.TestClass5.
 4001a2 74657374 2e546573 74436c61 73733600  test.TestClass6.
 4001b2 74657374 2e546573 74436c61 73733700  test.TestClass7.
 4001c2 74657374 2e546573 74436c61 73733800  test.TestClass8.
 4001d2 74657374 2e546573 74436c61 73733900  test.TestClass9.

Interestingly, if I change the argument to `write` from a string to a char, all is good.

test.d
**************************************
void write(in char text) nothrow {}

final abstract class TestClass1 { }
final abstract class TestClass2 { }
final abstract class TestClass3 { }
final abstract class TestClass4 { }
final abstract class TestClass5 { }
final abstract class TestClass6 { }
final abstract class TestClass7 { }
final abstract class TestClass8 { }
final abstract class TestClass9 { }

extern(C) void main()
{
    write(' ');
}

objdump -s -j .rodata test
**************************************
objdump: section '.rodata' mentioned in a -j option, but not found in any input file

I guess all I'm really showing is how little I understand about this problem. Again, I ask for help.

Mike
Jan 15 2015
parent Johannes Pfau <nospam example.com> writes:
Am Thu, 15 Jan 2015 11:51:41 +0000
schrieb "Mike" <none none.com>:

 On Thursday, 15 January 2015 at 11:42:31 UTC, Dicebot wrote:
 Random guess: can it possibly confuse template-based variadics 
 with runtime variadics? Latter require RTTI to work. If 
 something like that happens it is surely a bug.
Thanks for the reply. I have to apologize for the last post, as I didn't fully reduce the code, and posted prematurely.

Here's a further reduction without any templates or variadics, so I'm under the impression that neither templates nor variadics are contributing to the problem.

The following code won't execute, but it will reproduce the problem at hand: Inflating .rodata with TypeInfo.name.

test.d
**************************************
void write(in string text) nothrow {}

final abstract class TestClass1 { }
final abstract class TestClass2 { }
final abstract class TestClass3 { }
final abstract class TestClass4 { }
final abstract class TestClass5 { }
final abstract class TestClass6 { }
final abstract class TestClass7 { }
final abstract class TestClass8 { }
final abstract class TestClass9 { }

extern(C) void main()
{
    write("");
}

compile with:
gdc -static -frelease -fno-emit-moduleinfo -nophoboslib -nostdlib test.d --entry=main -ffunction-sections -fdata-sections -Wl,--gc-sections -o test

objdump -s -j .rodata test
*************************************
Contents of section .rodata:
 400152 74657374 2e546573 74436c61 73733100  test.TestClass1.
 400162 74657374 2e546573 74436c61 73733200  test.TestClass2.
 400172 74657374 2e546573 74436c61 73733300  test.TestClass3.
 400182 74657374 2e546573 74436c61 73733400  test.TestClass4.
 400192 74657374 2e546573 74436c61 73733500  test.TestClass5.
 4001a2 74657374 2e546573 74436c61 73733600  test.TestClass6.
 4001b2 74657374 2e546573 74436c61 73733700  test.TestClass7.
 4001c2 74657374 2e546573 74436c61 73733800  test.TestClass8.
 4001d2 74657374 2e546573 74436c61 73733900  test.TestClass9.

Interestingly, if I change the argument to `write` from a string to a char, all is good.

test.d
**************************************
void write(in char text) nothrow {}

final abstract class TestClass1 { }
final abstract class TestClass2 { }
final abstract class TestClass3 { }
final abstract class TestClass4 { }
final abstract class TestClass5 { }
final abstract class TestClass6 { }
final abstract class TestClass7 { }
final abstract class TestClass8 { }
final abstract class TestClass9 { }

extern(C) void main()
{
    write(' ');
}

objdump -s -j .rodata test
**************************************
objdump: section '.rodata' mentioned in a -j option, but not found in any input file

I guess all I'm really showing is how little I understand about this problem. Again, I ask for help.

Mike
The char is probably not placed in rodata but reproduced in some other way (hardcoded instruction with literal or whatever). This matches the theory in my other reply.
Jan 15 2015
prev sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Thu, 15 Jan 2015 11:42:30 +0000
schrieb "Dicebot" <public dicebot.lv>:

 On Thursday, 15 January 2015 at 11:04:37 UTC, Mike wrote:
 If you can explain the mechanics causing this, please enlighten 
 me.  Bug? Enhancement? By design?
Random guess: can it possibly confuse template-based variadics with runtime variadics? Latter require RTTI to work. If something like that happens it is surely a bug. I don't see any obvious legitimate reason for this behavior.
That'd be quite weird.

My best guess is that the strings are always placed in rodata, never in separate sections. If you do write("x"), "x" is also in rodata, the rodata section can't be removed. If you delete the write call there's no reference to rodata and it's possible to remove the complete section.

After some google-fu:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
Considering this was filed in 2000 I'd say it's not very likely to get fixed soon :-(

So the best option is probably to get rid of this problem by patching the compiler ( notypeinfo or -fnortti).
Jan 15 2015
next sibling parent "Dicebot" <public dicebot.lv> writes:
On Thursday, 15 January 2015 at 12:01:05 UTC, Johannes Pfau wrote:
 So the best option is probably to get rid of this problem by 
 patching
 the compiler ( notypeinfo or -fnortti).
Oh, sorry, I thought GDC already does that when TypeInfo is never actually used. Never mind then, your version seems to be a solid match.
Jan 15 2015
prev sibling next sibling parent "Mike" <none none.com> writes:
On Thursday, 15 January 2015 at 12:01:05 UTC, Johannes Pfau wrote:

 After some google-fu:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
 Considering this was filed in 2000 I'd say it's not very likely 
 to get
 fixed soon :-(
That looks right-on, and knowing the likely cause, I should be able to engineer a workaround for now. Nice find, and thank you! Mike
Jan 15 2015
prev sibling next sibling parent "ketmar via D.gnu" <d.gnu puremagic.com> writes:
On Thu, 15 Jan 2015 13:01:04 +0100
"Johannes Pfau via D.gnu" <d.gnu puremagic.com> wrote:

 After some google-fu:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
 Considering this was filed in 2000 I'd say it's not very likely to get
 fixed soon :-(
and it's so forgotten that they have stupid portostory attached circa 2007. heh.
Jan 15 2015
prev sibling next sibling parent reply "Mike" <none none.com> writes:
On Thursday, 15 January 2015 at 12:01:05 UTC, Johannes Pfau wrote:

 My best guess is that the strings are always placed in rodata, 
 never in
 separate sections. If you do write("x"), "x" is also in rodata, 
 the
 rodata section can't be removed. If you delete the write call 
 there's
 no reference to rodata and it's possible to remove the complete
 section.

 After some google-fu:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
 Considering this was filed in 2000 I'd say it's not very likely 
 to get
 fixed soon :-(

 So the best option is probably to get rid of this problem by 
 patching
 the compiler ( notypeinfo or -fnortti).
Here's a filthy sed hack to work around this bug:

1) compile to assembly:
-----------------------------------------------
gdc -S -static -frelease -fno-emit-moduleinfo -nophoboslib -nostdlib test.d --entry=main -ffunction-sections -fdata-sections -Wl,--gc-sections -o test_temp.s

2) use sed to modify the assembly, putting each string into its own section:
-----------------------------------------------
sed -e 's/^\(\.LC[0-9]*\)\(\:\)/\.section .rodata\1\n\1\2/g' test_temp.s >test.s

3) compile the new assembly:
-----------------------------------------------
as test.s -o test.o

4) link:
-----------------------------------------------
ld test.o --entry=main --gc-sections -o test

5) verify:
-----------------------------------------------
objdump -s -j .rodata test

Contents of section .rodata:
 400168 780a                                 x.

size test
   text    data     bss     dec     hex filename
    338       0       0     338     152 test

6) execute:
------------------------------------------------
./test
x

Filthy, but cheap and effective. Fortunately it's all automated with rdmd.

Mike
Jan 15 2015
next sibling parent reply "Orvid King via D.gnu" <d.gnu puremagic.com> writes:
On 1/15/2015 10:31 PM, Mike via D.gnu wrote:
 On Thursday, 15 January 2015 at 12:01:05 UTC, Johannes Pfau wrote:

 My best guess is that the strings are always placed in rodata, never in
 separate sections. If you do write("x"), "x" is also in rodata, the
 rodata section can't be removed. If you delete the write call there's
 no reference to rodata and it's possible to remove the complete
 section.

 After some google-fu:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
 Considering this was filed in 2000 I'd say it's not very likely to get
 fixed soon :-(

 So the best option is probably to get rid of this problem by patching
 the compiler ( notypeinfo or -fnortti).
The problem with this is that the TypeInfo is used for a significant number of things in the runtime, casting, allocation, and initialization just to name a few. The type info for structures is already only generated if it's allocated on the heap. I suspect though that I'm probably just too tired right now to understand the intricacies of the topic.
Jan 15 2015
parent reply Johannes Pfau <nospam example.com> writes:
Am Thu, 15 Jan 2015 23:08:57 -0600
schrieb "Orvid King via D.gnu" <d.gnu puremagic.com>:

 On 1/15/2015 10:31 PM, Mike via D.gnu wrote:
 On Thursday, 15 January 2015 at 12:01:05 UTC, Johannes Pfau wrote:

 My best guess is that the strings are always placed in rodata,
 never in separate sections. If you do write("x"), "x" is also in
 rodata, the rodata section can't be removed. If you delete the
 write call there's no reference to rodata and it's possible to
 remove the complete section.

 After some google-fu:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
 Considering this was filed in 2000 I'd say it's not very likely to
 get fixed soon :-(

 So the best option is probably to get rid of this problem by
 patching the compiler ( notypeinfo or -fnortti).
The problem with this is that the TypeInfo is used for a significant number of things in the runtime, casting, allocation, and initialization just to name a few.
Mike doesn't use the runtime so this is less of a problem. And in many cases the compiler already avoids TypeInfo:

Default initializers are separate symbols and don't require TypeInfo.
Allocation: Only GC allocation is affected, custom allocators are templates.
Casting: AFAIK only for class downcasts. This is probably the most important feature requiring TypeInfo. Here we might prefer a minimal classinfo instead of completely removing it.
 The type info for structures is
 already only generated if it's allocated on the heap.
Either I don't understand what you mean or you're wrong ;-) TypeInfo is generated at declaration time. At that point you can't even know if a struct will be allocated on the heap at some point. Maybe you're talking about closures? It is true that _GC_ heap allocation requires TypeInfo but that's usually considered a bug.
 I suspect
 though that I'm probably just too tired right now to understand the
 intricacies of the topic.
Jan 16 2015
parent reply "Jens Bauer" <doctor who.no> writes:
On Friday, 16 January 2015 at 18:37:28 UTC, Johannes Pfau wrote:
 Am Thu, 15 Jan 2015 23:08:57 -0600
 schrieb "Orvid King via D.gnu" <d.gnu puremagic.com>:

 TypeInfo is generated at declaration time. At that point you
 can't even know if a struct will be allocated on the heap at
 some point.
I was wondering... Would it be possible to make selective TypeInfo ? -Eg. Only add TypeInfo for those things that really need it. If that's possible, then I think the overhead could be decreased dramatically.

Another way it could perhaps be reduced would be to change strings into pointers or identifiers. This popped into my mind because it appears that the TypeInfo names are zero-terminated C strings anyway. If those exist in a read-only memory space, then comparing them using strcmp seems unnecessary; just compare the pointers and put some other data there instead, or make a 32-bit uniqueID (perhaps even a 16-bit UID).
Apr 30 2015
next sibling parent "Jens Bauer" <doctor who.no> writes:
On Thursday, 30 April 2015 at 14:27:17 UTC, Jens Bauer wrote:
 {snip} or make a 32-bit uniqueID (perhaps even a 16-bit UID).
Note: The most frequently used TypeInfo should have the lowest ID numbers, because on small devices, loading a small value into a register uses very little space. Of course, an 8-bit value can be supported on most systems.

On ARM Cortex-M3 and later, we have several ways of loading small values. One is the 'modified immediate', which is one of the following:

(8-bit value) << (0 .. 24)
(8-bit value) * 0x00010001
(8-bit value) * 0x01000100
(8-bit value) * 0x01010101

Then there's a 16-bit load using movw. Both modified immediate and movw use 32-bit instructions, but a direct 8-bit value can be loaded using the 16-bit mov.n instruction (also on Cortex-M0).

... My point is that sorting things like UniqueIDs would result in smaller binaries.
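As a rough illustration of the note above, the following sketch checks whether a 32-bit constant matches one of the four 'modified immediate' forms exactly as they are listed in the post. It is not a statement of the full architectural encoding rules; the function name is hypothetical, and the shell is assumed to do arithmetic in at least 64 bits.

```shell
#!/bin/sh
# Sketch only: tests a constant against the four forms listed in the post,
# not against the complete Thumb-2 encoding rules.
# is_listed_immediate is a hypothetical helper name.
is_listed_immediate() {
  v=$(( $1 & 0xFFFFFFFF ))
  lo=$(( v & 0xFF ))          # byte 0 of the constant
  hi=$(( (v >> 8) & 0xFF ))   # byte 1 of the constant

  # Replicated-byte forms: 0x00XY00XY, 0xXY00XY00, 0xXYXYXYXY
  [ "$v" -eq $(( lo * 0x00010001 )) ] && return 0
  [ "$v" -eq $(( hi * 0x01000100 )) ] && return 0
  [ "$v" -eq $(( lo * 0x01010101 )) ] && return 0

  # (8-bit value) << s, for s in 0..24
  s=0
  while [ "$s" -le 24 ]; do
    top=$(( v >> s ))
    if [ "$top" -le 255 ] && [ $(( top << s )) -eq "$v" ]; then
      return 0
    fi
    s=$(( s + 1 ))
  done
  return 1
}

for c in 0x000000FF 0x00FF0000 0x00AB00AB 0x12345678; do
  if is_listed_immediate "$c"; then
    echo "$c: matches a listed form"
  else
    echo "$c: no match"
  fi
done
```

Under this reading, small or regularly patterned IDs fall into the cheap cases, while an arbitrary value such as 0x12345678 does not, which is the motivation for handing the lowest IDs to the most frequently used types.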
Apr 30 2015
prev sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Thu, 30 Apr 2015 14:27:15 +0000
schrieb "Jens Bauer" <doctor who.no>:

 I was wondering... Would it be possible to make selective 
 TypeInfo ?
 -Eg. Only add TypeInfo for those things that really need it.
'really need it' is quite subjective. IIRC TypeInfo is mostly used for GC, AA and sometimes arrays. You won't use most of this anyway and you can still use a library template-based Array/AA, so this is easy to avoid. TypeInfo is also required for type-safe runtime variadics. Nobody uses these.

I think one place where you really need it is downcasting class objects. But you only need a small subset of class TypeInfo for that.
Apr 30 2015
parent "Jens Bauer" <doctor who.no> writes:
 Am Thu, 30 Apr 2015 14:27:15 +0000
 schrieb "Jens Bauer" <doctor who.no>:
 I was wondering... Would it be possible to make selective 
 TypeInfo ?
 -Eg. Only add TypeInfo for those things that really need it.
{snip} On Thursday, 30 April 2015 at 19:24:14 UTC, Johannes Pfau wrote:
 I think one place where you really need it is downcasting class 
 objects.
Casting was what I was thinking of, because when I needed to do a dynamic_cast in C++, I only needed it for one, maybe two classes. Thus, being able to enable it selectively for one out of 10 classes might help with the memory footprint.
May 01 2015
prev sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 16 January 2015 at 04:31:17 UTC, Mike wrote:
 On Thursday, 15 January 2015 at 12:01:05 UTC, Johannes Pfau 
 wrote:

 My best guess is that the strings are always placed in rodata, 
 never in
 separate sections. If you do write("x"), "x" is also in 
 rodata, the
 rodata section can't be removed. If you delete the write call 
 there's
 no reference to rodata and it's possible to remove the complete
 section.

 After some google-fu:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
 Considering this was filed in 2000 I'd say it's not very 
 likely to get
 fixed soon :-(

 So the best option is probably to get rid of this problem by 
 patching
 the compiler ( notypeinfo or -fnortti).
Here's a filthy sed hack to workaround this bug:

1) compile to assembly:
-----------------------------------------------
gdc -S -static -frelease -fno-emit-moduleinfo -nophoboslib -nostdlib test.d --entry=main -ffunction-sections -fdata-sections -Wl,--gc-sections -o test_temp.s

2) use sed to modify the assembly, putting each string into its own section:
-----------------------------------------------
sed -e 's/^\(\.LC[0-9]*\)\(\:\)/\.section .rodata\1\n\1\2/g' test_temp.s >test.s

3) compile the new assembly:
-----------------------------------------------
as test.s -o test.o

4) link:
-----------------------------------------------
ld test.o --entry=main --gc-sections -o test

5) verify:
-----------------------------------------------
objdump -s -j .rodata test

Contents of section .rodata:
 400168 780a                                 x.

size test
   text    data     bss     dec     hex filename
    338       0       0     338     152 test

6) execute:
------------------------------------------------
./test
x

Filthy, but cheap and effective. Fortunately it's all automated with rdmd.

Mike
Looks like awesome wiki material :)
Jan 16 2015
prev sibling parent reply "Mike" <none none.com> writes:
On Thursday, 15 January 2015 at 12:01:05 UTC, Johannes Pfau wrote:
 After some google-fu:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192
 Considering this was filed in 2000 I'd say it's not very likely 
 to get
 fixed soon :-(

 So the best option is probably to get rid of this problem by 
 patching
 the compiler ( notypeinfo or -fnortti).
Looks like someone picked up on this and submitted a patch:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c9

Cool!! But I still have yet to test it. -fnortti would still be most welcome.

Mike
May 10 2015
parent reply "Mike" <none none.com> writes:
On Sunday, 10 May 2015 at 09:54:51 UTC, Mike wrote:

 Looks like someone picked up on this and submitted a patch:  
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c9   Cool!!  
 But I still have yet to test it.
Damn! Didn't work.
May 10 2015
next sibling parent reply "Iain Buclaw via D.gnu" <d.gnu puremagic.com> writes:
On 10 May 2015 at 14:48, Mike via D.gnu <d.gnu puremagic.com> wrote:
 On Sunday, 10 May 2015 at 09:54:51 UTC, Mike wrote:

 Looks like someone picked up on this and submitted a patch:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c9   Cool!!  But I still
 have yet to test it.
Damn! Didn't work.
Not sure what you're using to build, but it seems reliant on -ffunction-sections -fdata-sections -fmerge-constants - or at least the latter two of those options. Did you try the minimum test in the PR?
May 10 2015
parent "Mike" <none none.com> writes:
On Sunday, 10 May 2015 at 13:20:42 UTC, Iain Buclaw wrote:
 Not sure what you're using to build, but it seems reliant on
 -ffunction-sections -fdata-sections -fmerge-constants - or at 
 least
 the latter two of those options.

 Did you try the minimum test in the PR?
I'm using the cross-compiler built with this script:
https://github.com/JinShil/arm-none-eabi-gdc/blob/master/arm-none-eabi-gdc.sh

I compiled with -ffunction-sections -fdata-sections -fmerge-constants, and a few other variants. I link with --gc-sections.

For a test program, I'm using my LCD demo here:
https://github.com/JinShil/stm32f42_discovery_demo

The only way I can get a minimal binary is to compile to assembly, use this sed hack (https://github.com/JinShil/stm32f42_discovery_demo/blob/master/build.d#L69) to put strings into their own section, and then compile the modified assembly. Without the sed hack, my binary is 450k. With the sed hack it's 6k.

The binary seems to fill .rodata with the TypeInfo.name field. Here's a small sample of what that looks like in the binary:

  801fa00 6572616c 2e526567 69737465 72212830  eral.Register!(0
  801fa10 2c206361 73742841 63636573 73293729  , cast(Access)7)
  801fa20 2e526567 69737465 722e4269 74212831  .Register.Bit!(1
  801fa30 312c2063 61737428 4d757461 62696c69  1, cast(Mutabili
  801fa40 74792932 292e4269 74000000 6d6d696f  ty)2).Bit...mmio
  801fa50 2e506572 69706865 72616c21 28414842  .Peripheral!(AHB
  801fa60 312c2031 35333630 292e5065 72697068  1, 15360).Periph

Those are the types instantiated with my mmio template library. Here's a sample:
https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/flash.d

It appears the TypeInfo.name field is not put into its own section even though I compile with -fdata-sections. I guess I'll have to inspect the binary compiled with the patch to see what's happening.

Mike
May 10 2015
prev sibling parent "Mike" <none none.com> writes:
On Sunday, 10 May 2015 at 12:48:55 UTC, Mike wrote:
 On Sunday, 10 May 2015 at 09:54:51 UTC, Mike wrote:

 Looks like someone picked up on this and submitted a patch:  
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c9   Cool!!  
 But I still have yet to test it.
Damn! Didn't work.
Bug report submitted here: http://bugzilla.gdcproject.org/show_bug.cgi?id=184
May 10 2015
prev sibling parent "Mike" <none none.com> writes:
On Wednesday, 14 January 2015 at 13:32:53 UTC, Mike wrote:

 Well, I was working a reduced test case and found that it has
 something to do with my trace.d file here:
 https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/trace.d

 If I add a trace.writeLine("x") in my program, then the binary
 goes from 2K to 130K.  Anyway, it appears it has nothing to do
 with TypeInfo.  I'll continue to try to reduce.

 Thanks for the help and useful information.

 Mike
Final update: The problem was with this function here:

void write(A...)(in A a)
{
    foreach(t; a)
    {
        write(t);
    }
}

I think that since this is an open-ended template and I could potentially pass any type to it, the compiler thinks it should remember the TypeInfo.name values for every type in my program.

However, since I only ever used write("x") in my program, I expect the linker to be able to see that those TypeInfo.name values are never used, and strip them out when compiled with -fdata-sections and -Wl,--gc-sections. Perhaps it couldn't because of the way the data is packed.

Anyway, I guess I'll see about modifying my code to be less flexible in order to rein this data in. Suggestions are welcome.

Mike
Jan 14 2015
prev sibling parent "Mike" <none none.com> writes:
On Tuesday, 13 January 2015 at 14:20:43 UTC, Mike wrote:
 Here's what I'm seeing:

 --------------------
 arm-none-eabi-objdump -t binary/firmware

 binary/firmware:     file format elf32-littlearm

 SYMBOL TABLE:
 08000000 l    d  .text  00000000 .text
 08000a44 l    d  .rodata        00000000 .rodata
 00000000 l    df *ABS*  00000000 start.d
 0800001c l       .text  00000000 handler_address
 00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr0
 00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr1
 00000000 l    df *ABS*  00000000
 10010000 l       *ABS*  00000000 _stackStart
 08000034 g     F .text  0000007e memcpy
 08000010 g     F .text  00000014 _D5start7OnResetFZv
 080202d4 g       .rodata        00000000 __text_end__
 08000004 g     O .text  00000004 ResetHandler
 20000000 g       .rodata        00000000 __data_end__
 20000000 g       .rodata        00000000 __bss_start__
 20000000 g       .rodata        00000000 __bss_end__
 08000024 g     F .text  00000010 memset
 20000000 g       .rodata        00000000 __data_start__
 0800000c g     O .text  00000004 HardFaultHandler
 080000b4 g     F .text  0000093c main
 08000a28 g     F .text  0000001c _D5start11OnHardFaultFZv
I just wanted to show off our shiny new demangle support in binutils for comparison with my previous post.

-----------------------------
arm-none-eabi-objdump --demangle=dlang -t binary/firmware

binary/firmware:     file format elf32-littlearm

SYMBOL TABLE:
08000000 l    d  .text  00000000 .text
08000380 l    d  .rodata        00000000 .rodata
00000000 l    df *ABS*  00000000 start.d
0800001c l       .text  00000000 handler_address
00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr0
00000000 l    df *ABS*  00000000
10010000 l       *ABS*  00000000 _stackStart
08000034 g     F .text  0000007e memcpy
08000010 g     F .text  00000014 start.OnReset()
0801fb78 g       .rodata        00000000 __text_end__
08000004 g     O .text  00000004 ResetHandler
20000000 g       .rodata        00000000 __data_end__
20000000 g       .rodata        00000000 __bss_start__
20000000 g       .rodata        00000000 __bss_end__
08000024 g     F .text  00000010 memset
0800032c  w    F .text  00000038 trace.writeLine!(immutable(char)[]).writeLine(const(immutable(char)[]))
20000000 g       .rodata        00000000 __data_start__
0800000c g     O .text  00000004 HardFaultHandler
080000b4 g     F .text  00000278 main
08000364 g     F .text  0000001c start.OnHardFault()

Nice! Thanks Iain.
Jan 13 2015
prev sibling parent "Mike" <none none.com> writes:
On Sunday, 11 January 2015 at 16:57:41 UTC, Johannes Pfau wrote:
 If you only want to disable TypeInfo for some classes that's 
 more
 difficult:
 https://github.com/D-Programming-microD/GDC/commit/f0614bc9480dacd1ec6bb75277d280afa96e08bb
Is this good enough for a pull request upstream, or just an experiment?
Jan 15 2015