digitalmars.D - Executable size affected by module count?
- kris (18/18) Jan 24 2007 Given a (fixed) body of code, it appears that retaining it all within
- Walter Bright (3/7) Jan 24 2007 The way to see what's in an object file is to run obj2asm on it. It'll
- kris (6/17) Jan 24 2007 I'm aware of that, thanks.
- Frits van Bommel (11/28) Jan 24 2007 Suggesting you look at what's submitted to the linker doesn't
- kris (3/29) Jan 24 2007 That's right. But I'd already looked at the obj file content, and
- Walter Bright (14/32) Jan 24 2007 The words used suggested an unfamiliarity with the tools. I believe it's...
- kris (13/58) Jan 24 2007 Forgive me. They were chosen to cause the least amount of conflict? But,...
- jcc7 (11/20) Jan 25 2007 It's probably a longshot that wouldn't work, but it's possible that the ...
- kris (2/28) Jan 25 2007 That's a good pointer -- thanks, jcc7 :)
- John Reimer (14/37) Jan 25 2007 That statement has been there for years. I'm not sure if anyone has
- Pragma (12/15) Jan 25 2007 As someone who has hacked away on OMF handling for a while now, I must a...
- Thomas Kuehne (20/28) Jan 24 2007 -----BEGIN PGP SIGNED MESSAGE-----
- Sean Kelly (8/34) Jan 24 2007 These are the outstanding problem for exposing templates from library
- Thomas Kuehne (10/15) Jan 24 2007 -----BEGIN PGP SIGNED MESSAGE-----
- Frits van Bommel (7/19) Jan 24 2007 In their current form they're not identical for each module, for the
- Sean Kelly (5/25) Jan 24 2007 Exactly my point. Why not define _d_assert and _d_arraybounds somewhere...
- Frits van Bommel (15/40) Jan 25 2007 Please look at phobos/std/asserterror.d and phobos/std/array.d[1]. I
- Sean Kelly (6/45) Jan 25 2007 It's been too long since I've messed with this portion of Phobos. I
- kris (2/64) Jan 25 2007 Mine too
- Frits van Bommel (8/10) Jan 25 2007 Well, there's still another bug that seems to crop up in source files
- Sean Kelly (4/16) Jan 25 2007 Good to know it's still there I suppose :-/ I'd gotten rid of all
Given a (fixed) body of code, it appears that retaining it all within one module, and splitting it into multiple modules, results in different executable sizes? There's no real surprise that this would happen, but it's the actual difference that is cause for a little concern -- it appears that each module consumes 512 bytes minimum. This may actually be a linker thing, but perhaps not?

Why would this matter? Well, if you wind up using the 158 modules in the Win32 project, that adds nearly 80KB to an application. Purely in module overhead. And, those headers are almost all enum, const, and struct.

This is in addition to the ~70KB of unused initializer from the Win32 headers, discussed in the other topic. That's a whole lot of overhead for Win32 programs to carry -- especially if the target is mobile devices.

Obviously, you'd be doing something truly serious if you were actually using all those header modules! However, Win32 headers are not exactly a model in decoupled design (in C also), so you wind up with large numbers of them unintentionally.

Any ideas, Walter?
Jan 24 2007
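The "nearly 80KB" figure in the post above follows from simple multiplication. A minimal sketch in D, assuming a flat 512 bytes per module (the figure kris reports observing, not a documented constant); the function name `moduleOverhead` is made up for illustration:

```d
import std.stdio;

/// Estimated overhead in bytes, assuming every module costs a fixed minimum.
/// The 512-byte default is the observed figure from the post, not a documented constant.
ulong moduleOverhead(ulong moduleCount, ulong bytesPerModule = 512)
{
    return moduleCount * bytesPerModule;
}

void main()
{
    immutable ulong modules = 158; // module count quoted for the Win32 project
    immutable total = moduleOverhead(modules);
    writefln("%s modules x 512 bytes = %s bytes (%.1f KB)",
             modules, total, total / 1024.0);
}
```

That comes to 80,896 bytes, or 79 KB, which is consistent with the estimate in the post.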
kris wrote:
> There's no real surprise that this would happen, but it's the actual difference that is cause for a little concern -- it appears that each module consumes 512 bytes minimum. This may actually be a linker thing, but perhaps not?

The way to see what's in an object file is to run obj2asm on it. It'll show exactly what's submitted to the linker.
Jan 24 2007
Walter Bright wrote:
> kris wrote:
>> There's no real surprise that this would happen, but it's the actual difference that is cause for a little concern -- it appears that each module consumes 512 bytes minimum. This may actually be a linker thing, but perhaps not?
> The way to see what's in an object file is to run obj2asm on it. It'll show exactly what's submitted to the linker.

I'm aware of that, thanks. Do you think you could comment on why it might be the linker? And how to compensate? You're probably one of the two ppl in the world who know OptLink ...

Also: are dmd obj files compatible with any other linker?
Jan 24 2007
kris wrote:
> Walter Bright wrote:
>> The way to see what's in an object file is to run obj2asm on it. It'll show exactly what's submitted to the linker.
> I'm aware of that, thanks. Do you think you could comment on why it might be the linker?

Suggesting you look at what's submitted to the linker doesn't necessarily imply the linker is the cause. If it receives object files with more data in them (generated by the compiler) and dutifully links them, the output is still bigger ;).

> Also: are dmd obj files compatible with any other linker?

IIRC OMF (the format of dmd obj files) used to be the "standard" object format on Windows[1], and perhaps other OSs as well. That was a while back though. It wouldn't surprise me if optlink is the only recent linker to support it, but even if so you may be able to find some old versions of other linkers that support it.

[1]: Or was it still DOS back then? The MS OS at the time, anyway.
Jan 24 2007
Frits van Bommel wrote:
> Suggesting you look at what's submitted to the linker doesn't necessarily imply the linker is the cause. If it receives object files with more data in it (generated by the compiler) and dutifully links them, the output is still bigger ;).

That's right. But I'd already looked at the obj file content, and subsequently discounted it :p
Jan 24 2007
kris wrote:
> Walter Bright wrote:
>> The way to see what's in an object file is to run obj2asm on it. It'll show exactly what's submitted to the linker.
> I'm aware of that, thanks.

The words used suggested an unfamiliarity with the tools. I believe it's well worth the effort to master what's going on with object files and linking, especially for professional developers, and the tools obj2asm, /MAP, and dumpexe are marvelous aids.

> Do you think you could comment on why it might be the linker? And how to compensate? You're probably one of the two ppl in the world who know OptLink ...

I'd first look at the contents of the .obj file, and see if that is what is expected. I'd also check the optlink instructions http://www.digitalmars.com/ctg/ctgLinkSwitches.html#alignment as there are quite a lot of switches that offer a great deal of control over the linking process.

> Also: are dmd obj files compatible with any other linker?

Any linker that supports the Microsoft OMF format. I know Microsoft linkers dropped support for it when they went to 32 bits, but I am not very familiar with other linkers. Pharlap did, but I think they went out of business.
Jan 24 2007
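Walter's pointer to the alignment switches hints at one way a fixed per-module minimum could arise: a contribution being padded up to an alignment boundary. Whether OPTLINK actually pads per module like this is not established anywhere in this thread, so the following is only a sketch of the rounding arithmetic. The helper `alignUp` and the sample sizes are made up for illustration.

```d
import std.stdio;

/// Round `size` up to the next multiple of `alignment` (which must be non-zero).
ulong alignUp(ulong size, ulong alignment)
{
    assert(alignment != 0, "alignment must be non-zero");
    return (size + alignment - 1) / alignment * alignment;
}

void main()
{
    // Hypothetical contribution sizes in bytes, chosen only to show the rounding.
    foreach (size; [58UL, 177UL, 512UL, 513UL])
        writefln("%s bytes -> %s bytes at 512-byte alignment", size, alignUp(size, 512));
}
```

Under that assumption, even a very small contribution would occupy a full 512 bytes, and one byte over the boundary would occupy 1024.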
Walter Bright wrote:
> The words used suggested an unfamiliarity with the tools.

Forgive me. They were chosen to cause the least amount of conflict? But, as far as OptLink goes -- yes, I have no clue about it, and there's more switches than a power-station. Best to ask the expert, it would seem?

> I believe it's well worth the effort to master what's going on with object files and linking, especially for professional developers, and the tools obj2asm, /MAP, and dumpexe are marvelous aids.

Thank you. Indeed they are. We've both been using these kinds of tools for approximately the same period of time.

> I'd first look at the contents of the .obj file, and see if that is what is expected.

Did that first. There's nothing that jumps out. I should note that this was first noticed perhaps two years ago ... it's not something that suddenly changed. But, it has become more important recently; vis-a-vis win32 headers

> I'd also check the optlink instructions http://www.digitalmars.com/ctg/ctgLinkSwitches.html#alignment as there are quite a lot of switches that offer a great deal of control over the linking process.

Yes, classic stuff! That's why I'm asking the expert.

> Any linker that supports the Microsoft OMF format. I know Microsoft linkers dropped support for it when they went to 32 bits, but I am not very familiar with other linkers. Pharlap did, but I think they went out of business.

Thanks. This means it's not exactly feasible to check the concern via another linker.
Jan 24 2007
== Quote from kris (foo bar.com)'s article
> Thanks. This means it's not exactly feasible to check the concern via another linker.

It's probably a longshot that wouldn't work, but it's possible that the Standalone OpenWatcom Tools would be of some use: http://cmeerw.org/prog/owtools/

I haven't tried using any of the tools myself (and the webpage implies that development stopped in November 2003), but I thought I'd point this page out in case it's of use to you.

It mentions: "I am currently working on getting OpenWatcom's tools working with Digital Mars C++ to provide an alternative to the already dated Digital Mars tools."

And one of the tools is called "Open Watcom OMF Dump Utility".

jcc7
Jan 25 2007
jcc7 wrote:
> It's probably a longshot that wouldn't work, but it's possible that the Standalone OpenWatcom Tools would be of some use: http://cmeerw.org/prog/owtools/
> I haven't tried using any of the tools myself (and the webpage implies that development stopped in November 2003), but I thought I'd point this page out in case it's of use to you.
> It mentions: "I am currently working on getting OpenWatcom's tools working with Digital Mars C++ to provide an alternative to the already dated Digital Mars tools."
> And one of the tools is called "Open Watcom OMF Dump Utility".

That's a good pointer -- thanks, jcc7 :)
Jan 25 2007
On Thu, 25 Jan 2007 15:49:28 +0000, jcc7 wrote:
> It's probably a longshot that wouldn't work, but it's possible that the Standalone OpenWatcom Tools would be of some use: http://cmeerw.org/prog/owtools/
> I haven't tried using any of the tools myself (and the webpage implies that development stopped in November 2003), but I thought I'd point this page out in case it's of use to you.
> It mentions: "I am currently working on getting OpenWatcom's tools working with Digital Mars C++ to provide an alternative to the already dated Digital Mars tools."
> And one of the tools is called "Open Watcom OMF Dump Utility".

That statement has been there for years. I'm not sure if anyone has found any benefit to using the modified openwatcom linker. There doesn't seem to be any ongoing development for dmc compatibility. The one benefit (and primary goal, I think) was the ability to link with coff files, but since dmc/dmd spits out omf only, there's not much advantage beyond that... and then even coff files from different compiler vendors are incompatible. Object formats have been a mess on Windows for well over a decade... dmd's largest problem here continues to be the omf format.

On the other hand, dmd would do well to have a new linker and object system to help it support new language features. Depending on old C technology is restrictive to D's progress. Alas, easier said than done.

-JJR
Jan 25 2007
John Reimer wrote:
> dmd's largest problem here continues to be the omf format. On the other hand, dmd would do well to have a new linker and object system to help it support new language features.

As someone who has hacked away on OMF handling for a while now, I must agree. It has quite a few strikes against it for prospective D toolchain developers:

* Hard to follow documentation that contains errors and makes *lots* of assumptions
* Is unsupported by just about everyone else except DigitalMars (no offense intended)
* Is *not* 64-bit ready
* Contains gobs of legacy cruft (specification more so than .obj files)
* Is limited in how it can represent module inter-dependencies

Don't get me wrong. I like DMD and OPTLINK in particular - I would love to see OPTLINK modified to work with ELF files or some new D object file format.

--
- EricAnderton at yahoo
Jan 25 2007
kris wrote on 2007-01-24:
> Given a (fixed) body of code, it appears that retaining it all within one module, and splitting it into multiple modules, results in different executable sizes? There's no real surprise that this would happen, but it's the actual difference that is cause for a little concern -- it appears that each module consumes 512 bytes minimum. This may actually be a linker thing, but perhaps not?
> [...]
> Any ideas, Walter?

Every non-trivial module contains (numbers are for Linux):

_D5module7__arrayZ
    23 bytes code, 19 bytes stringtab, 18 bytes symtab
_D5module8__assertFiZv
    24 bytes code, 23 bytes stringtab, 18 bytes symtab
_D5module9__modctorFZv
    11+ bytes code, 23 bytes stringtab, 18 bytes symtab

In total 177 bytes, after stripping (strip --strip-all) 58 bytes. The minimum overhead of an object file is about 800 bytes, most of which is discarded at link time.

Thomas
Jan 24 2007
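The 177-byte total in the post above can be checked by summing the three columns. A small sketch in D; the struct and function names are invented for illustration, and the "11+" code size is entered as its minimum of 11:

```d
import std.stdio;

/// One per-module symbol with the byte counts reported in the post (Linux figures).
struct SymbolCost
{
    string name;
    uint code;      // bytes of machine code ("11+" is entered as its minimum, 11)
    uint stringtab; // bytes in the string table
    uint symtab;    // bytes in the symbol table

    uint total() const { return code + stringtab + symtab; }
}

immutable SymbolCost[] perModule = [
    SymbolCost("_D5module7__arrayZ",     23, 19, 18),
    SymbolCost("_D5module8__assertFiZv", 24, 23, 18),
    SymbolCost("_D5module9__modctorFZv", 11, 23, 18),
];

/// Sum of all three columns over every symbol.
uint totalBytes(const SymbolCost[] costs)
{
    uint sum = 0;
    foreach (c; costs)
        sum += c.total;
    return sum;
}

void main()
{
    foreach (c; perModule)
        writefln("%s: %s bytes", c.name, c.total);
    writefln("total: %s bytes", totalBytes(perModule));
}
```

The rows come to 60, 65 and 52 bytes, which adds up to the 177 bytes quoted. Each stringtab figure also equals the length of the mangled name plus one byte, consistent with a terminating NUL.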
Thomas Kuehne wrote:
> Every non-trivial module contains (numbers are for Linux)
> _D5module7__arrayZ 23 bytes code, 19 bytes stringtab, 18 bytes symtab
> _D5module8__assertFiZv 24 bytes code, 23 bytes stringtab, 18 bytes symtab

These are the outstanding problems for exposing templates from library code. And I don't understand why they are generated, since it seems like the code will be identical for each instance generated. Couldn't they just have a static definition in the runtime?

> _D5module9__modctorFZv 11+ bytes code, 23 bytes stringtab, 18 bytes symtab

Only if the module has a static ctor though, right?

> In total 177 bytes, after stripping (strip --strip-all) 58 bytes. The minimum overhead of an object file is about 800 bytes, most of which is discarded at link time.

Thanks for the info.

Sean
Jan 24 2007
Sean Kelly wrote on 2007-01-24:
> Thomas Kuehne wrote:
>> _D5module9__modctorFZv 11+ bytes code, 23 bytes stringtab, 18 bytes symtab
> Only if the module has a static ctor though, right?

Also if a class/struct has a static ctor.

Thomas
Jan 24 2007
Sean Kelly wrote:
> Thomas Kuehne wrote:
>> Every non-trivial module contains (numbers are for Linux)
>> _D5module7__arrayZ 23 bytes code, 19 bytes stringtab, 18 bytes symtab
>> _D5module8__assertFiZv 24 bytes code, 23 bytes stringtab, 18 bytes symtab
> These are the outstanding problems for exposing templates from library code. And I don't understand why they are generated, since it seems like the code will be identical for each instance generated. Couldn't they just have a static definition in the runtime?

In their current form they're not identical for each module, for the simple reason that the code (after linking) has the reference to the module name string hardcoded. For two extra instructions per call that could be avoided, though at that point you might as well just call _d_assert/_d_arraybounds directly instead of using an intermediary function...
Jan 24 2007
Frits van Bommel wrote:
> In their current form they're not identical for each module, for the simple reason that the code (after linking) has the reference to the module name string hardcoded. For two extra instructions per call that could be avoided, though at that point you might as well just call _d_assert/_d_arraybounds directly instead of using an intermediary function...

Exactly my point. Why not define _d_assert and _d_arraybounds somewhere and simply call those? Since, as far as I can tell, the function bodies never vary.

Sean
Jan 24 2007
Sean Kelly wrote:
> Exactly my point. Why not define _d_assert and _d_arraybounds somewhere and simply call those? Since, as far as I can tell, the function bodies never vary.

Please look at phobos/std/asserterror.d and phobos/std/array.d[1]. I didn't just pick those names out of a hat ;).

What the generated functions do is basically:

    asm {
        push EAX;          // caller puts line number there
        push name_ptr;
        push name_length;
        call _d_assert;    // or _d_array_bounds
    }

Like I said, this can easily be inlined. Replacing "mov EAX, linenr" with three pushes and using a different address is all it takes... In fact, it would seem calls to _d_assert_msg are already done like this (for asserts with the optional char[] second argument).

[1]: It would seem I made a typo, it's _d_array_bounds.
Jan 25 2007
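The difference between the per-module helper and a direct call, as described in the post above, can be modelled in plain D. This is only an illustration of the call shapes, not the compiler's actual code generation: `failureMessage` is a hypothetical stand-in for a shared runtime handler such as _d_assert (it formats a string rather than throwing), and `moduleAssertThunk` and "mymodule.d" are made-up names.

```d
import std.conv : to;
import std.stdio;

/// Hypothetical stand-in for a shared runtime handler such as _d_assert;
/// it only formats a message instead of throwing.
string failureMessage(string moduleName, uint line)
{
    return "Assertion failure: " ~ moduleName ~ "(" ~ line.to!string ~ ")";
}

/// Shape of the per-module helper: the module name is baked in,
/// so the caller supplies only the line number.
string moduleAssertThunk(uint line)
{
    return failureMessage("mymodule.d", line);
}

void main()
{
    // Via the per-module thunk (one extra function emitted per module).
    writeln(moduleAssertThunk(42));
    // Direct call: the call site passes the module name itself, no thunk needed.
    writeln(failureMessage("mymodule.d", 42));
}
```

Both calls produce the same message. The trade-off discussed in the thread is a slightly larger call site against one fewer function per module.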
Frits van Bommel wrote:
> Please look at phobos/std/asserterror.d and phobos/std/array.d[1]. I didn't just pick those names out of a hat ;).

It's been too long since I've messed with this portion of Phobos. I knew they rang a bell! :-p

> What the generated functions do is basically:
>
>     asm {
>         push EAX;          // caller puts line number there
>         push name_ptr;
>         push name_length;
>         call _d_assert;    // or _d_array_bounds
>     }
>
> Like I said, this can easily be inlined. Replacing "mov EAX, linenr" with three pushes and using a different address is all it takes... In fact, it would seem calls to _d_assert_msg are already done like this (for asserts with the optional char[] second argument).

Makes perfect sense. Well... doing this would eliminate one of the last quantifiable issues with placing templates in a library, so it has my vote.

Sean
Jan 25 2007
Sean Kelly wrote:
> Makes perfect sense. Well... doing this would eliminate one of the last quantifiable issues with placing templates in a library, so it has my vote.

Mine too
Jan 25 2007
Sean Kelly wrote:
> Makes perfect sense. Well... doing this would eliminate one of the last quantifiable issues with placing templates in a library, so it has my vote.

Well, there's still another bug that seems to crop up in source files (http://d.puremagic.com/issues/show_bug.cgi?id=22). Ran into that one again today. But you should know about it, since you were the one to report it. Though perhaps that one is more of a general templates problem, unrelated to libraries. Still annoying though.
Jan 25 2007
Frits van Bommel wrote:
> Well, there's still another bug that seems to crop up in source files (http://d.puremagic.com/issues/show_bug.cgi?id=22) Ran into that one again today. But you should know about it, since you were the one to report it. Though perhaps that one is more of a general templates problem, unrelated to libraries. Still annoying though.

Good to know it's still there I suppose :-/ I'd gotten rid of all occurrences of it in Tango and wasn't sure if it was still an issue.

Sean
Jan 25 2007