digitalmars.D - Program size, linking matter, and static this()
- Andrei Alexandrescu (95/95) Dec 16 2011 Hello,
- Nick Sabalausky (5/9) Dec 16 2011 Interesting stuff.
- Steven Schveighoffer (20/81) Dec 16 2011 I disagree with this assessment. It's good to know the cause of the
- Andrei Alexandrescu (6/23) Dec 16 2011 I'd almost agree, but the code showed doesn't use Object.factory(). So
- Jacob Carlborg (6/32) Dec 16 2011 I don't think it's completely separate. Can the compiler know if runtime...
- Andrei Alexandrescu (3/5) Dec 16 2011 Yes. Reflection is used if reflection primitive functions are called.
- Jacob Carlborg (6/11) Dec 16 2011 Yeah, but how does the compiler know which are primitive functions, hard...
- Steven Schveighoffer (18/44) Dec 16 2011 You cannot know until link time whether factory is used when compiling
- Steven Schveighoffer (6/14) Dec 16 2011 The other valid option I see is removing the link to the virtual tables,...
- Adam D. Ruppe (6/8) Dec 16 2011 I wouldn't want to cripple either - put all the reflection
- Steven Schveighoffer (8/15) Dec 16 2011 The only way I can think of to decouple it is to disable it with a
- Andrei Alexandrescu (12/43) Dec 16 2011 I'm not an expert in linkers, but my understanding is that linkers
- Steven Schveighoffer (16/63) Dec 16 2011 Factory doesn't directly reference classes, it does so through the
- Marco Leise (6/8) Dec 18 2011 That should hold true for any OS. Otherwise, how would the program
- Steven Schveighoffer (11/19) Dec 19 2011 Not necessarily. On Linux, system calls provide the "interface" between...
- Sean Kelly (7/9) Dec 16 2011 naturally remove unused object files. That, coupled with dmd's ability =
- Martin Nowak (5/10) Jan 18 2012 That's strange, because Object.factory should only require TypeInfo_Clas...
- torhu (3/5) Dec 16 2011 How exactly do they solve the problem? An exe plus a DLL version of the...
- Jonathan M Davis (7/13) Dec 16 2011 You have to stick it all in the DLL anyway (since you can't know which p...
- Steven Schveighoffer (4/9) Dec 19 2011 The DLL is loaded into memory once. With static linking, it's loaded
- torhu (4/13) Dec 19 2011 I thought we were talking about distribution sizes, not memory use. But...
- Steven Schveighoffer (8/24) Dec 19 2011 Right, in order for dlls to make a difference, you need to separate the ...
- Jacob Carlborg (8/23) Dec 19 2011 It could be useful for a package manager. Theoretically all installed
- Marco Leise (31/36) Dec 20 2011 No! Let's please try to get closer to something that works with package ...
- dsimcha (2/13) Dec 20 2011 Minor nitpick: winsxs has been around since XP.
- Jacob Carlborg (7/13) Dec 16 2011 It's not very useful as is, but you can create your own version that
- Jonathan M Davis (46/55) Dec 16 2011 Hmm. I had reply for this already, but it seems to have disappeared, so ...
- Timon Gehr (10/68) Dec 16 2011 no.
- Jonathan M Davis (22/67) Dec 16 2011 Yes they are. static constructors completely chicken out on them. Not on...
- Timon Gehr (16/86) Dec 16 2011 I don't think that is an option.
- Jonathan M Davis (26/40) Dec 16 2011 That only works if the variable being initialized is in the new module i...
- Timon Gehr (3/6) Dec 16 2011 In what way would encapsulation be broken by just moving the class to a
- Andrei Alexandrescu (26/53) Dec 16 2011 I don't see progress here over arranging packages and modules to reflect...
- maarten van damme (2/2) Dec 16 2011 how did other languages solve this issue? I can't imagine D beeing the o...
- Walter Bright (6/8) Dec 16 2011 In C++, the order that static constructors run is implementation defined...
- Sean Kelly (8/15) Dec 16 2011 defined. No guarantees at all. The programmer has no reasonable way to =
- Timon Gehr (6/8) Dec 16 2011 Nobody has solved the issue. The approach in Java and C#, for instance,
- Somedude (5/7) Dec 17 2011 AFAIK, I believe like in D, it's best practice to avoid static
- Jonathan M Davis (35/45) Dec 16 2011 I don't know what's wrong with singletons. It's a great pattern in certa...
- Andrei Alexandrescu (22/67) Dec 16 2011 http://en.wikipedia.org/wiki/Singleton_pattern
- Jonathan M Davis (86/122) Dec 16 2011 Valid points, but it's still useful under some circumstances. I don't ac...
- Andrei Alexandrescu (13/32) Dec 16 2011 That is hardly a good argument in favor of the feature :o).
- deadalnix (4/46) Dec 17 2011 Very good point. CTFE is improving with each version of dmd, and is a
- Jonathan M Davis (13/16) Dec 17 2011 I think that in general, the uses for static this fall into one of two
- Somedude (4/25) Dec 18 2011 In the Java/C# world, they use dependency injection frameworks like
- so (20/33) Dec 17 2011 I don't like patterns much but when it comes to singleton i absolutely
- Andrei Alexandrescu (5/16) Dec 17 2011 Singleton has two benefits. One, you can't accidentally create more than...
- so (5/24) Dec 17 2011 Now i am puzzled,
- Jakob Ovrum (12/42) Dec 17 2011 Both of your examples are the singleton pattern if `make` returns
- so (4/12) Dec 18 2011 Exactly. there is no difference between "static A.make" and "makeA" in D...
- Jakob Ovrum (8/9) Dec 18 2011 Yeah, in most sane code, I would imagine so. But still, the
- Jakob Ovrum (3/13) Dec 18 2011 Sorry, I'm wrong, that wasn't the case at all. The original
- Jonathan M Davis (17/36) Dec 17 2011 Yes. There are occasions when singleton is very useful and makes perfect...
- Walter Bright (7/9) Dec 16 2011 I also don't really see how turning off checking is even slightly more e...
- Martin Nowak (27/43) Jan 18 2012 Which is a hack because that C function is a compiler wall while the
- Martin Nowak (5/48) Jan 18 2012 Forget about it. Immutable initialization shouldn't work from thread loc...
- Andrei Alexandrescu (33/81) Dec 16 2011 I understand and empathize with the sentiment, and I agree with most of
- Jonathan M Davis (26/69) Dec 16 2011 I'm not completely against this precisely because of this, but at the sa...
- Steven Schveighoffer (4/31) Dec 16 2011 This can be solved with malloc and emplace
- Andrei Alexandrescu (4/35) Dec 16 2011 Sure you meant static ubyte[__traits(classInstanceSize, T)]
- Steven Schveighoffer (4/46) Dec 16 2011 That works too!
- Sean Kelly (2/9) Dec 16 2011 Don't forget the 16 byte alignment :-)
- Timon Gehr (3/12) Dec 16 2011 Which is currently relatively easy:
- bearophile (6/11) Dec 16 2011 Is it possible to support this in D2/D3?
- Andrei Alexandrescu (14/15) Dec 16 2011 Why? From
- Jonathan M Davis (14/32) Dec 16 2011 I mean that if CTFE was advanced enough that I could do
- Walter Bright (3/6) Dec 16 2011 Sure, but having a way to tell the compiler "assume this constructor doe...
- Andrei Alexandrescu (10/46) Dec 16 2011 I think it's all a matter of terminology. Calling tzset during module
- Jonathan M Davis (5/7) Dec 16 2011 Some of the C stuff that LocalTime uses requires it. If LocalTime is laz...
- Andrei Alexandrescu (3/10) Dec 16 2011 Thanks. Sounds like we have a plan!
- Sean Kelly (14/16) Dec 16 2011 quite fit for the standard library. The stdlib is the connection between...
- Andrei Alexandrescu (4/22) Dec 16 2011 Absolutely.
- Trass3r (3/3) Dec 16 2011 A related issue is phobos being an intermodule dependency monster.
- Andrei Alexandrescu (10/13) Dec 16 2011 In fact it doesn't (after yesterday's commit). The std code in hello,
- Trass3r (5/13) Dec 16 2011 Yep, the 30 modules is a measure I took before that commit.
- Timon Gehr (16/30) Dec 16 2011 I think it is already lazy?
- Bane (3/21) Dec 16 2011 http://wiki.freepascal.org/Size_Matters
- Adam D. Ruppe (14/21) Dec 16 2011 This sounds fantastic.
- Walter Bright (8/14) Dec 16 2011 Another thing is to avoid using classes for things where one does not ex...
- Steven Schveighoffer (7/23) Dec 19 2011 Although I don't disagree with you that it should be a struct and not a ...
- Walter Bright (5/14) Dec 19 2011 Yes. The pointers to Object's functions, and a pointer to the TypeInfo f...
- Steven Schveighoffer (8/28) Dec 19 2011 Well pointers to Object's functions shouldn't add any bloat. The TypeIn...
- Walter Bright (2/5) Dec 19 2011 Or perhaps it should be in its own module.
- Marco Leise (4/11) Dec 20 2011 When I first saw it I thought "That's how _Java_ goes about free
- Andrei Alexandrescu (4/16) Dec 20 2011 Same here. If I had my way I'd rethink the name of those functions.
- Jonathan M Davis (7/26) Dec 20 2011 It's not the only place in Phobos which uses a class as a namespace. I b...
- Jakob Ovrum (3/36) Dec 20 2011 Sounds like the perfect candidate for its own module.
- Jonathan M Davis (5/18) Dec 20 2011 Not out of the question, I suppose, but it would make an awfully small m...
- so (7/27) Dec 21 2011 Supporting module nesting in single file wouldn't hurt, would it?
- Michal Minich (17/22) Dec 21 2011 Kind of...
- Jonathan M Davis (8/27) Dec 20 2011 Not to mention, I quite like the effect that you get with it as a class,...
- Sean Kelly (7/13) Dec 16 2011 world is a minuscule 3KB. The rest of 218KB is runtime.
- Somedude (2/20) Dec 17 2011 Fantastic ! :)
- Sean Kelly (5/7) Dec 16 2011 This was one of the major motivations for separating druntime from =
- Andrei Alexandrescu (8/15) Dec 16 2011 Well, right now druntime itself may have become the interdependency knot...
- Sean Kelly (4/16) Dec 16 2011 knot it once wanted to shun :o).
- torhu (3/17) Dec 16 2011 Maybe this is the tool you're thinking of:
- Richard Webb (5/23) Dec 16 2011 On a slightly related note:
- Sean Kelly (11/19) Dec 16 2011 sudden amounts when certain modules were included. After much =
- Martin Nowak (11/108) Dec 16 2011 We'd need the linker to do anything of this. Unreferenced symbols should...
- Martin Nowak (8/143) Dec 16 2011 More concrete if we'd output weak defined symbols (null) for what is
- Andrei Alexandrescu (3/15) Dec 16 2011 I think it would be awesome to exploit weak symbols.
- Denis Shelomovskij (54/149) Dec 20 2011 Really sorry, but it sounds silly for me. It's a minor problem. Does
- Andrei Alexandrescu (18/32) Dec 20 2011 In my experience, in a system programming language people do care about
- Walter Bright (12/30) Dec 20 2011 First off, dmd most definitely puts 0 initialized static data into the B...
- Denis Shelomovskij (10/22) Dec 21 2011 Sorry, it was because of copying C code in my post. ubyte array was
- Marco Leise (6/30) Dec 20 2011 +1. I didn't know about .bss, but static arrays of zeroes (global, struc...
- Walter Bright (2/6) Dec 20 2011 I added a faq entry for this.
- Marco Leise (14/23) Dec 20 2011 Ok, I jumped on the band wagon to early. Personally I only had this
- Walter Bright (2/14) Dec 20 2011 The struct one already does. Compile it, obj2asm it, and you'll see it t...
- Marco Leise (3/25) Dec 26 2011 Ah, I see it now. Sorry for the noise!
- Marco Leise (10/36) Jan 18 2012 It is back again! The following struct in my main module increases the
- Walter Bright (58/63) Jan 18 2012 Compiling it and obj2asm'ing the result, and you'll see it goes into the...
- Marco Leise (5/16) Jan 18 2012 Thanks for checking back. I'll have to experiment a bit to narrow this o...
- Marco Leise (34/100) Jan 19 2012 I tried different versions of DMD 2.057:
- Marco Leise (4/4) Jan 19 2012 P.S.: I could have realized it earlier: DMD uses the Windows PE BSS
- Vladimir Panteleev (7/26) Dec 20 2011 I believe this is bug 2254:
Hello,

Late last night Walter and I figured a few interesting tidbits of information. Allow me to give some context, discuss them, and sketch a few approaches for improving things.

A while ago Walter wanted to enable function-level linking, i.e. only get the needed functions from a given (and presumably large) module. So he arranged things so that a library contains many small object "files" (that actually are generated from a single .d file and never exist on disk, only inside the library file, which can be considered an archive like tar). Then the linker would only pick the used object "files" from the library and link those in.

Unfortunately that didn't have nearly the expected impact - essentially the size of most binaries stayed the same. The mystery was unsolved, and Walter needed to move on to other things.

One particularly annoying issue is that even programs that don't ostensibly use anything from an imported module may balloon inexplicably in size. Consider:

    import std.path;
    void main(){}

This program, after stripping and all, is some 750KB in size. Removing the import line reduces the size to 218KB. That includes the runtime support, garbage collector, and such, and I'll consider it a baseline. (A similar but separate discussion could be focused on reducing the baseline size, but herein I'll consider it constant.)

What we'd simply want is to be able to import stuff without blatantly paying for what we don't use. If a program imports std.path and uses no function from it, it should be as large as a program without the import. Furthermore, the increase should be incremental - using 2-3 functions from std.path should only increase the executable size by a little, not suddenly link in all code in that module.

But in experiments it seemed like program size would increase in sudden amounts when certain modules were included. After much investigation we figured that the following fateful causal sequence happened:

1. Some modules define static constructors with "static this()" or "static shared this()", and/or static destructors.

2. These constructors/destructors are linked in automatically whenever a module is included.

3. Importing a module with a static constructor (or destructor) will generate its ModuleInfo structure, which contains static information about all module members. In particular, it keeps virtual table pointers for all classes defined inside the module.

4. That means generating ModuleInfo refers to all virtual functions defined in that module, whether they're used or not.

5. The phenomenon is transitive, e.g. even if std.path has no static constructors but imports std.datetime which does, a ModuleInfo is generated for std.path too, in addition to the one for std.datetime. So now classes inside std.path (if any) will all be linked in.

6. It follows that a module that defines classes which in turn use other functions in other modules, and has static constructors (or includes other modules that do) will balloon the size of the executable suddenly.

There are a few approaches that we can use to improve the state of affairs.

A. On the library side, use static constructors and destructors sparingly inside druntime and std. We can use lazy initialization instead of compulsively initializing library internals. I think this is often a worthy thing to do in any case (dynamic libraries etc) because it only does work if and when work needs to be done, at the small cost of a check upon each use.

B. On the compiler side, we could use a similar lazy initialization trick to only refer to class methods in the module if they're actually needed. I'm being vague here because I'm not sure what can be done and how.

Here's a list of all files in std using static cdtors:

    std/__fileinit.d
    std/concurrency.d
    std/cpuid.d
    std/cstream.d
    std/datebase.d
    std/datetime.d
    std/encoding.d
    std/internal/math/biguintcore.d
    std/internal/math/biguintx86.d
    std/internal/processinit.d
    std/internal/windows/advapi32.d
    std/mmfile.d
    std/parallelism.d
    std/perf.d
    std/socket.d
    std/stdiobase.d
    std/uri.d

The majority of them don't do a lot of work and are not much used inside phobos, so they don't blow up the executable. The main one that could receive some attention is std.datetime. It has a few static ctors and a lot of classes. Essentially just importing std.datetime or any std module that transitively imports std.datetime (and there are many of them) ends up linking in most of Phobos and blows the size up from the 218KB baseline to 700KB.

Jonathan, could I impose on you to replace all static cdtors in std.datetime with lazy initialization? I looked through it and it strikes me as a reasonably simple job, but I think you'd know better what to do than me.

A similar effort could be conducted to reduce or eliminate static cdtors from druntime. I made the experiment of commenting them all out, and that reduced the size of the baseline from 218KB to 200KB. This is a good amount, but not as dramatic as what we can get by working on std.datetime.

Thanks,

Andrei
Dec 16 2011
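For readers following approach A in the post above, here is a minimal sketch of what replacing a static constructor with lazy initialization could look like. The names `_squares` and `squares` are hypothetical and not taken from Phobos; the table contents are only a stand-in for whatever a module would otherwise build in `static this()`.

```d
// Eager version, shown only as a comment for comparison:
//
//     int[] table;
//     static this() { table = buildTable(); }   // runs at startup, used or not
//
// Lazy version: no static constructor in this module at all.

private int[] _squares;   // hypothetical module-level cache (thread-local in D2)

// Hypothetical accessor: builds the table on first use and returns the
// cached array afterwards.
int[] squares()
{
    if (_squares is null)               // the small check paid on each use
    {
        _squares = new int[](10);
        foreach (i, ref s; _squares)
            s = cast(int)(i * i);
    }
    return _squares;
}

void main()
{
    assert(squares()[3] == 9);
    assert(squares() is squares());     // same array, initialized only once
}
```

Because module-level variables are thread-local by default in D2, this sketch initializes the table once per thread that calls `squares()`. State that must be shared across threads would need `shared` or `__gshared` plus synchronization, which is left out here.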
Interesting stuff.

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message news:jcg2lu$17p2$1 digitalmars.com...
> We can use lazy initialization instead of compulsively initializing library internals. I think this is often a worthy thing to do in any case (dynamic libraries etc) because it only does work if and when work needs to be done at the small cost of a check upon each use.

That also has the benefit of reducing the risk of dreaded circular ctor dependency problems.
Dec 16 2011
On Fri, 16 Dec 2011 13:29:18 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> [...]
> There are a few approaches that we can use to improve the state of affairs.
>
> A. On the library side, use static constructors and destructors sparingly inside druntime and std. We can use lazy initialization instead of compulsively initializing library internals. I think this is often a worthy thing to do in any case (dynamic libraries etc) because it only does work if and when work needs to be done at the small cost of a check upon each use.
>
> B. On the compiler side, we could use a similar lazy initialization trick to only refer to class methods in the module if they're actually needed. I'm being vague here because I'm not sure what and how that can be done.

I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection.
The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor.

So essentially, we are paying the penalty of having runtime reflection in terms of bloat, but get very very little benefit.

I think there are two things that need to be considered:

1. We eventually should have some reasonably complete runtime reflection capability.
2. Runtime reflection and shared libraries go hand-in-hand. With shared library support, the bloat penalty isn't nearly as significant.

I don't think the right answer is to avoid using features of the language because the compiler/runtime has some design deficiencies. At some point these deficiencies will be fixed, and then we are left with a library that has seemingly odd design choices that we can't change.

-Steve
Dec 16 2011
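To make the limitation mentioned above concrete, here is a small sketch of `Object.factory` as it existed in druntime at the time of this thread. The module name `app` and the classes `Greeter` and `NeedsArg` are hypothetical, chosen only for illustration. The sketch assumes the file is compiled as module `app`, since the factory looks classes up by fully qualified name.

```d
module app;   // assumed module name; the lookup string below depends on it

// Hypothetical class with an implicit zero-argument constructor.
class Greeter
{
    string greet() { return "hello"; }
}

// Hypothetical class whose only constructor takes an argument.
class NeedsArg
{
    this(int x) {}
}

void main()
{
    // Construct a class from its name alone, via runtime class info.
    Object o = Object.factory("app.Greeter");
    auto g = cast(Greeter) o;
    assert(g !is null);
    assert(g.greet() == "hello");

    // No zero-argument constructor: the factory cannot build it.
    assert(Object.factory("app.NeedsArg") is null);

    // Unknown names also yield null.
    assert(Object.factory("app.NoSuchClass") is null);
}
```

The name is an ordinary runtime string, so nothing at compile time tells the toolchain which classes might be requested. That is the tension between reflection and dead-code stripping that the following replies argue about.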
On 12/16/11 1:23 PM, Steven Schveighoffer wrote:
> I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection. The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor. So essentially, we are paying the penalty of having runtime reflection in terms of bloat, but get very very little benefit.

I'd almost agree, but the code shown doesn't use Object.factory(). So that shouldn't be linked in, and shouldn't pull vtables.

> I think there are two things that need to be considered: 1. We eventually should have some reasonably complete runtime reflection capability 2. Runtime reflection and shared libraries go hand-in-hand. With shared library support, the bloat penalty isn't nearly as significant. I don't think the right answer is to avoid using features of the language because the compiler/runtime has some design deficiencies. At some point these deficiencies will be fixed, and then we are left with a library that has seemingly odd design choices that we can't change.

Runtime reflection is great, but I think it's a separate issue from what's discussed here.

Andrei
Dec 16 2011
On 2011-12-16 20:48, Andrei Alexandrescu wrote:
> On 12/16/11 1:23 PM, Steven Schveighoffer wrote:
>> I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection. The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor. So essentially, we are paying the penalty of having runtime reflection in terms of bloat, but get very very little benefit.
>
> I'd almost agree, but the code shown doesn't use Object.factory(). So that shouldn't be linked in, and shouldn't pull vtables.

There is other runtime reflection functionality that can be used.

>> I think there are two things that need to be considered: 1. We eventually should have some reasonably complete runtime reflection capability 2. Runtime reflection and shared libraries go hand-in-hand. With shared library support, the bloat penalty isn't nearly as significant. I don't think the right answer is to avoid using features of the language because the compiler/runtime has some design deficiencies. At some point these deficiencies will be fixed, and then we are left with a library that has seemingly odd design choices that we can't change.
>
> Runtime reflection is great, but I think it's a separate issue from what's discussed here.

I don't think it's completely separate. Can the compiler know if runtime reflection is used or not?

-- 
/Jacob Carlborg
Dec 16 2011
On 12/16/11 2:47 PM, Jacob Carlborg wrote:
> I don't think it's completely separate. Can the compiler know if runtime reflection is used or not?

Yes. Reflection is used if reflection primitive functions are called.

Andrei
Dec 16 2011
On 2011-12-16 21:49, Andrei Alexandrescu wrote:
> On 12/16/11 2:47 PM, Jacob Carlborg wrote:
>> I don't think it's completely separate. Can the compiler know if runtime reflection is used or not?
>
> Yes. Reflection is used if reflection primitive functions are called.
>
> Andrei

Yeah, but how does the compiler know which are primitive functions? Hard code them in the compiler? Or perhaps the compiler already needs to know this.

-- 
/Jacob Carlborg
Dec 16 2011
On Fri, 16 Dec 2011 14:48:33 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> On 12/16/11 1:23 PM, Steven Schveighoffer wrote:
>> I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection. The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor. So essentially, we are paying the penalty of having runtime reflection in terms of bloat, but get very very little benefit.
>
> I'd almost agree, but the code shown doesn't use Object.factory(). So that shouldn't be linked in, and shouldn't pull vtables.

You cannot know until link time whether factory is used when compiling individual files. By then it's probably too late to exclude them.

The point is that you can instantiate unreferenced classes simply by calling them out by name.

>> I think there are two things that need to be considered: 1. We eventually should have some reasonably complete runtime reflection capability 2. Runtime reflection and shared libraries go hand-in-hand. With shared library support, the bloat penalty isn't nearly as significant. I don't think the right answer is to avoid using features of the language because the compiler/runtime has some design deficiencies. At some point these deficiencies will be fixed, and then we are left with a library that has seemingly odd design choices that we can't change.
>
> Runtime reflection is great, but I think it's a separate issue from what's discussed here.

I'm not pushing for runtime reflection, all I'm saying is, I don't think it's worth changing how the library is written to work around something because the *compiler* is incorrectly implemented/designed.

So why don't we just leave the code size situation as-is? 500kb is not a terribly significant amount, but dlls are on the horizon (Walter has publicly said so). Then size becomes a moot point.

If we get reflection, then you will find that having excluded all the runtime information when not used is going to hamper D's reflection capability, and we'll probably have to start putting it back in anyway.

In short, dlls will solve the problem, let's work on that instead of shuffling around code.

-Steve
Dec 16 2011
On Fri, 16 Dec 2011 16:28:03 -0500, Steven Schveighoffer <schveiguy yahoo.com> wrote:
> So why don't we just leave the code size situation as-is? 500kb is not a terribly significant amount, but dlls are on the horizon (Walter has publicly said so). Then size becomes a moot point. If we get reflection, then you will find that having excluded all the runtime information when not used is going to hamper D's reflection capability, and we'll probably have to start putting it back in anyway. In short, dlls will solve the problem, let's work on that instead of shuffling around code.

The other valid option I see is removing the link to the virtual tables, thereby disabling reflection via factory until we can implement full reflection.

-Steve
Dec 16 2011
On Friday, 16 December 2011 at 21:28:03 UTC, Steven Schveighoffer wrote:
> In short, dlls will solve the problem, let's work on that instead of shuffling around code.

I wouldn't want to cripple either - put all the reflection info in the dll, but keep it sufficiently decoupled so the linker can strip it out when statically linking.

The effort in decoupling most of the code isn't great.
Dec 16 2011
On Fri, 16 Dec 2011 16:48:47 -0500, Adam D. Ruppe <destructionator gmail.com> wrote:
> On Friday, 16 December 2011 at 21:28:03 UTC, Steven Schveighoffer wrote:
>> In short, dlls will solve the problem, let's work on that instead of shuffling around code.
>
> I wouldn't want to cripple either - put all the reflection info in the dll, but keep it sufficiently decoupled so the linker can strip it out when statically linking. The effort in decoupling most the code isn't great.

The only way I can think of to decouple it is to disable it with a compiler switch, since the compiler is the one including the info.

I envision a nasty world where libraries are built 4 ways, with two orthogonal factors -- dynamic vs. static, and reflection vs. no reflection. Oh, hello visual C++, what are you doing here?

-Steve
Dec 16 2011
On 12/16/11 3:28 PM, Steven Schveighoffer wrote:
> On Fri, 16 Dec 2011 14:48:33 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> On 12/16/11 1:23 PM, Steven Schveighoffer wrote:
>>> I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection. The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor. So essentially, we are paying the penalty of having runtime reflection in terms of bloat, but get very very little benefit.
>>
>> I'd almost agree, but the code shown doesn't use Object.factory(). So that shouldn't be linked in, and shouldn't pull vtables.
>
> You cannot know until link time whether factory is used when compiling individual files. By then it's probably too late to exclude them.

I'm not an expert in linkers, but my understanding is that linkers naturally remove unused object files. That, coupled with dmd's ability to break compilation output into many pseudo-object files, would take care of the matter. Truth be told, once you link in Object.factory(), bam - all classes are linked.

> The point is that you can instantiate unreferenced classes simply by calling them out by name.

Yah, but you must call a function to do that.

> I'm not pushing for runtime reflection, all I'm saying is, I don't think it's worth changing how the library is written to work around something because the *compiler* is incorrectly implemented/designed. So why don't we just leave the code size situation as-is? 500kb is not a terribly significant amount, but dlls are on the horizon (Walter has publicly said so). Then size becomes a moot point. If we get reflection, then you will find that having excluded all the runtime information when not used is going to hamper D's reflection capability, and we'll probably have to start putting it back in anyway. In short, dlls will solve the problem, let's work on that instead of shuffling around code.

I think there are more issues with static this() than simply executable size, as discussed. Also, adding dynamic linking capability does not mean we give up on static linking. A lot of programs use static linking by choice, and for good reasons.

Andrei
Dec 16 2011
On Fri, 16 Dec 2011 17:00:45 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 12/16/11 3:28 PM, Steven Schveighoffer wrote:Factory doesn't directly reference classes, it does so through the moduleinfo tree/array (not sure what it is). So the way it works is, the linker includes the module info because it's defined as static data, which includes the vtable functions, and factory can instantiate non-referenced classes because of this fact, not the other way around.On Fri, 16 Dec 2011 14:48:33 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:I'm not an expert in linkers, but my understanding is that linkers naturally remove unused object files. That, coupled with dmd's ability to break compilation output in many pseudo-object files, would take care of the matter. Truth be told, once you link in Object.factory(), bam - all classes are linked.On 12/16/11 1:23 PM, Steven Schveighoffer wrote:You cannot know until link time whether factory is used when compiling individual files. By then it's probably too late to exclude them.I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection. The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor. So essentially, we are paying the penalty of having runtime reflection in terms of bloat, but get very very little benefit.I'd almost agree, but the code showed doesn't use Object.factory(). So that shouldn't be linked in, and shouldn't pull vtables.Even statically linked programs might use runtime reflection. I agree the issue is not static linking vs. dynamic linking, but dynamic linking would hide the problem quite well. 
Note that on Linux today, the executable is not truly static -- OS libs are dynamically linked. Another option is to disable runtime reflection via a compiler switch (which would sever the ties between moduleinfo and classinfo). Then we simply must make sure we don't use factory in the library anywhere. -SteveI'm not pushing for runtime reflection, all I'm saying is, I don't think it's worth changing how the library is written to work around something because the *compiler* is incorrectly implemented/designed. So why don't we just leave the code size situation as-is? 500kb is not a terribly significant amount, but dlls are on the horizon (Walter has publicly said so). Then size becomes a moot point. If we get reflection, then you will find that having excluded all the runtime information when not used is going to hamper D's reflection capability, and we'll probably have to start putting it back in anyway. In short, dlls will solve the problem, let's work on that instead of shuffling around code.I think there are more issues with static this() than simply executable size, as discussed. Also, adding dynamic linking capability does not mean we give up on static linking. A lot of programs use static linking by choice, and for good reasons.
Dec 16 2011
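The mechanism described above (Object.factory reaching classes through the module info rather than through direct references) can be sketched as follows. This is an illustrative sketch, not code from the thread: the module name `factory_sketch`, the class `Greeter`, and the helper `isRegistered` are hypothetical, and the helper only approximates the lookup druntime performs internally.

```d
module factory_sketch;

import std.stdio : writeln;

// Hypothetical class; it is never constructed with `new` in this program.
class Greeter
{
    string greet() { return "hello"; }
}

// Approximation of the lookup behind Object.factory: walk every ModuleInfo
// linked into the program and scan the classes each module defines.
bool isRegistered(string fullName)
{
    foreach (m; ModuleInfo)
    {
        if (m is null) continue;
        foreach (c; m.localClasses)
        {
            if (c !is null && c.name == fullName)
                return true;
        }
    }
    return false;
}

void main()
{
    // The class is found by name because its ClassInfo hangs off the ModuleInfo.
    assert(isRegistered("factory_sketch.Greeter"));

    // Object.factory takes the fully qualified name and needs a zero-arg constructor.
    Object o = Object.factory("factory_sketch.Greeter");
    auto g = cast(Greeter) o;
    assert(g !is null);
    writeln(g.greet());

    // Unknown names yield null rather than throwing.
    assert(Object.factory("factory_sketch.Missing") is null);
}
```

Because the class is reachable from static data, a linker cannot discard it even when nothing calls Object.factory, which is the size cost being debated.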
On 16.12.2011, 23:08, Steven Schveighoffer <schveiguy yahoo.com> wrote: Note that on Linux today, the executable is not truly static -- OS libs are dynamically linked. That should hold true for any OS. Otherwise, how would the program communicate with the kernel and drivers, e.g. render a button on the screen? Some dynamically linked in functions must provide the interface to that "administrative singleton" that manages system resources.
Dec 18 2011
On Sun, 18 Dec 2011 18:02:10 -0500, Marco Leise <Marco.Leise gmx.de> wrote: On 16.12.2011, 23:08, Steven Schveighoffer <schveiguy yahoo.com> wrote: Not necessarily. On Linux, system calls provide the "interface" between the code and the OS. A system call is essentially an OS interrupt, similar to a network protocol. You don't need dynamic linking to implement it. Remember, Linux didn't even support dynamic libraries before kernel 1.2 maybe? Hm... must check wikipedia... But my point is, if the intention is that you have a myriad of D based libraries or executables on your system, then druntime and phobos enter the same realm as glibc. -Steve Note that on Linux today, the executable is not truly static -- OS libs are dynamically linked. That should hold true for any OS. Otherwise, how would the program communicate with the kernel and drivers, e.g. render a button on the screen? Some dynamically linked in functions must provide the interface to that "administrative singleton" that manages system resources.
Dec 19 2011
On Dec 16, 2011, at 2:00 PM, Andrei Alexandrescu wrote: I'm not an expert in linkers, but my understanding is that linkers naturally remove unused object files. That, coupled with dmd's ability to break compilation output in many pseudo-object files, would take care of the matter. Truth be told, once you link in Object.factory(), bam - all classes are linked. There's an old bugzilla entry that may apply: http://d.puremagic.com/issues/show_bug.cgi?id=879
Dec 16 2011
I'm not an expert in linkers, but my understanding is that linkers naturally remove unused object files. That, coupled with dmd's ability to break compilation output in many pseudo-object files, would take care of the matter. Truth be told, once you link in Object.factory(), bam - all classes are linked. That's strange, because Object.factory should only require TypeInfo_Class which only indirectly iterates through all modules. The ModuleInfos do drag in all their classes so what we currently don't get is a module with only some of its classes. What OS are you using? Can you bundle up some files that reproduce this?
Jan 18 2012
On 16.12.2011 22:28, Steven Schveighoffer wrote:In short, dlls will solve the problem, let's work on that instead of shuffling around code.How exactly do they solve the problem? An exe plus a DLL version of the library will usually be larger than just a statically linked exe.
Dec 16 2011
On Friday, December 16, 2011 23:30:44 torhu wrote:On 16.12.2011 22:28, Steven Schveighoffer wrote:You have to stick it all in the DLL anyway (since you can't know which parts will and won't be used), so the whole issue of not including used functionality goes away completely. There's no point in worrying about how much unused functionality gets included when you have no choice but to include everything regardless of whether it's actually used. - Jonathan M DavisIn short, dlls will solve the problem, let's work on that instead of shuffling around code.How exactly do they solve the problem? An exe plus a DLL version of the library will usually be larger than just a statically linked exe.
Dec 16 2011
On Fri, 16 Dec 2011 17:30:44 -0500, torhu <no spam.invalid> wrote:On 16.12.2011 22:28, Steven Schveighoffer wrote:The DLL is loaded into memory once. With static linking, it's loaded every time you run an exe. -SteveIn short, dlls will solve the problem, let's work on that instead of shuffling around code.How exactly do they solve the problem? An exe plus a DLL version of the library will usually be larger than just a statically linked exe.
Dec 19 2011
On 19.12.2011 16:08, Steven Schveighoffer wrote:On Fri, 16 Dec 2011 17:30:44 -0500, torhu<no spam.invalid> wrote:I thought we were talking about distribution sizes, not memory use. But anyway, DLL's won't do a lot as long as people don't have a whole bunch of D programs installed.On 16.12.2011 22:28, Steven Schveighoffer wrote:The DLL is loaded into memory once. With static linking, it's loaded every time you run an exe.In short, dlls will solve the problem, let's work on that instead of shuffling around code.How exactly do they solve the problem? An exe plus a DLL version of the library will usually be larger than just a statically linked exe.
Dec 19 2011
On Mon, 19 Dec 2011 13:09:18 -0500, torhu <no spam.invalid> wrote:On 19.12.2011 16:08, Steven Schveighoffer wrote:Right, in order for dlls to make a difference, you need to separate the library install from the exe install, as is done most of the time. If you are installing one D application on your box, what would be the issue with the size anyway? The complaint is generally that the size is much bigger than a hello world compiled for C/C++, which obviously doesn't take into account that the C/C++ standard libraries are DLLs. -SteveOn Fri, 16 Dec 2011 17:30:44 -0500, torhu<no spam.invalid> wrote:I thought we were talking about distribution sizes, not memory use. But anyway, DLL's won't do a lot as long as people don't have a whole bunch of D programs installed.On 16.12.2011 22:28, Steven Schveighoffer wrote:The DLL is loaded into memory once. With static linking, it's loaded every time you run an exe.In short, dlls will solve the problem, let's work on that instead of shuffling around code.How exactly do they solve the problem? An exe plus a DLL version of the library will usually be larger than just a statically linked exe.
Dec 19 2011
On 2011-12-19 19:09, torhu wrote:On 19.12.2011 16:08, Steven Schveighoffer wrote:It could be useful for a package manager. Theoretically all installed packages could share the same dynamic library. But I would guess that the packages would depend on different versions of the library and the package manager would end up installing a whole bunch of different versions of Phobos and druntime. -- /Jacob CarlborgOn Fri, 16 Dec 2011 17:30:44 -0500, torhu<no spam.invalid> wrote:I thought we were talking about distribution sizes, not memory use. But anyway, DLL's won't do a lot as long as people don't have a whole bunch of D programs installed.On 16.12.2011 22:28, Steven Schveighoffer wrote:The DLL is loaded into memory once. With static linking, it's loaded every time you run an exe.In short, dlls will solve the problem, let's work on that instead of shuffling around code.How exactly do they solve the problem? An exe plus a DLL version of the library will usually be larger than just a statically linked exe.
Dec 19 2011
On 19.12.2011, 20:43, Jacob Carlborg <doob me.com> wrote: It could be useful for a package manager. Theoretically all installed packages could share the same dynamic library. But I would guess that the packages would depend on different versions of the library and the package manager would end up installing a whole bunch of different versions of Phobos and druntime.

No! Let's please try to get closer to something that works with package managers than the situation on Windows. On Windows I see few applications that install libraries separately, unless they started on Linux or the libraries are established like DirectX. In the past DLLs from newly installed programs used to overwrite existing DLLs. IIRC the DLLs were then checked for their versions by installers, so they are never downgraded, but that still broke some applications with library updates that changed the API. Starting with Vista, there is the winsxs directory that - as I understand it - keeps a copy of every version of every DLL associated with the programs that installed/use them.

Package managers are close to my ideal world:
- different API versions (major revisions) can be installed in parallel
- applications link to the API version they were designed for
- bug fixes replace the old DLL for the whole system, all applications benefit
- RAM is shared between applications that use the same DLL

I'd think it would be bad to make cuts here. If you cannot even imagine an operating system with 1000 little apps like type/cat, cp/copy, sed etc... written in D, because they would all link statically against the runtime and cause major bloat, then that is turning off another few % of C users and purists. You don't drive an off-road car because you go off-road so often, but because you could imagine it. (Please buy small cars for city use.)

Linking against different library versions goes in practice like this: There is at least one version installed, maybe libphobos2.so.1.057. The 1 would be a major revision (one where hard deprecations occur), then there is a link named libphobos2.so.1 to that file, that all applications using API version 1 link against. So the actual file can be updated to libphobos2.so.1.058 without recompiles or breakage.
Dec 20 2011
On Tuesday, 20 December 2011 at 20:51:38 UTC, Marco Leise wrote: On 19.12.2011, 20:43, Jacob Carlborg <doob me.com> wrote: On Windows I see few applications that install libraries separately, unless they started on Linux or the libraries are established like DirectX. In the past DLLs from newly installed programs used to overwrite existing DLLs. IIRC the DLLs were then checked for their versions by installers, so they are never downgraded, but that still broke some applications with library updates that changed the API. Starting with Vista, there is the winsxs directory that - as I understand it - keeps a copy of every version of every DLL associated with the programs that installed/use them. Minor nitpick: winsxs has been around since XP.
Dec 20 2011
On 2011-12-16 20:23, Steven Schveighoffer wrote:I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection. The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor.It's not very useful as is, but you can create your own version that doesn't call the constructor and that can be more useful sometimes. I'm using that technique in my serialization library and providing a special method that can act as a constructor. -- /Jacob Carlborg
Dec 16 2011
On Friday, December 16, 2011 12:29:18 Andrei Alexandrescu wrote: Jonathan, could I impose on you to replace all static cdtors in std.datetime with lazy initialization? I looked through it and it strikes me as a reasonably simple job, but I think you'd know better what to do than me. A similar effort could be conducted to reduce or eliminate static cdtors from druntime. I made the experiment of commenting them all out, and that reduced the size of the baseline from 218KB to 200KB. This is a good amount, but not as dramatic as what we can get by working on std.datetime.

Hmm. I had a reply for this already, but it seems to have disappeared, so I'll try again.

You could make core.time use property functions instead of the static immutable variables that it's using now for ticksPerSec and appOrigin, but doing that right would require introducing a mutex or synchronized block (which is really just a mutex under the hood anyway), and I'm loath to do that in time-related code. ticksPerSec gets used all over the place in TickDuration, and that could have a negative impact on performance for something that needs to be really fast (since it's used in stuff like StopWatch and benchmarking). On top of that, in order to maintain the current semantics, the property functions would have to be pure, which they can't be without doing some nasty casting to convince the compiler that stuff which isn't pure is actually pure.

For std.datetime, the problem would be reduced if a class could be created in CTFE and still be around at runtime, but we can't do that yet, and it wouldn't completely solve the problem, since the shared static constructor related to LocalTime has to call tzset. So, some sort of runtime initialization must be done. And the instances for the singleton are not only immutable, but the functions for getting them are pure. So, once again, some nasty casting would be required to get it to work without breaking purity. And once again, we'd have to introduce a mutex.

And for both core.time and std.datetime we're talking about a mutex that would be needed only briefly to ensure that we don't end up with two threads trying to initialize the variable at the same time. After that, it would just be impeding performance for no value. They're classic situations for static constructors - initializing static immutable variables - and really, they _should_ be using static constructors. If we have to get rid of them, it's to get around other problems in the language or compiler instead of fixing those problems. So, on some level, that seems like a failure on the part of the language and the compiler. If we _have_ to find a workaround, then we have to find a workaround, but I find the need to be distasteful to say the least.

I previously tried to get rid of the static constructors in std.datetime and couldn't precisely because they're needed unless you play major casting games to get around immutable and pure. If we play nice, it's impossible to get rid of the static constructors in std.datetime. It probably is possible if we do nasty casting, but (much as I hate to use the word) it seems like this is a hack to get around the fact that the compiler isn't dealing with static constructors as well as we'd like. I'd _really_ like to see this fixed at the compiler level.

And honestly, I think that a far worse problem with static constructors is circular dependencies. _That_ is something that needs to be addressed with regards to static constructors. In general at this point, it's looking like static constructors are turning out to be a bit of a failure on some level, given the issues that we're having because of them, and I think that we should fix the language and/or compiler so that they _aren't_ a failure.

- Jonathan M Davis
Dec 16 2011
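The trade-off in the post above can be sketched by putting the two styles side by side. This is a minimal sketch under stated assumptions, not the core.time implementation: `queryTicksPerSec` is a hypothetical stand-in for the real OS query, and the names `eagerTicks` and `lazyTicks` are made up for illustration.

```d
module lazy_init_sketch;

import core.atomic : atomicLoad, atomicStore;

// Hypothetical stand-in for an OS query such as a timer-frequency lookup.
long queryTicksPerSec()
{
    return 1_000_000;
}

// Variant 1: eager initialization in a static constructor.
// The variable can be immutable, so the accessor can be pure.
immutable long ticksPerSecEager;

shared static this()
{
    ticksPerSecEager = queryTicksPerSec();
}

long eagerTicks() pure nothrow @safe
{
    return ticksPerSecEager; // reading immutable module state is allowed in pure code
}

// Variant 2: lazy initialization on first use.
// The cache must be mutable, so the accessor cannot be pure without casts,
// and a lock is needed to keep two threads from initializing it at once.
private shared bool ready;
private __gshared long cachedTicks;

long lazyTicks()
{
    if (!atomicLoad(ready))
    {
        synchronized // the mutex mentioned in the post
        {
            if (!atomicLoad(ready))
            {
                cachedTicks = queryTicksPerSec();
                atomicStore(ready, true);
            }
        }
    }
    return cachedTicks;
}

void main()
{
    assert(eagerTicks() == lazyTicks());
}
```

After the first call the lazy variant only pays for an atomic load, but it still gives up `pure` and `immutable`, which is the objection raised above.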
On 12/16/2011 08:41 PM, Jonathan M Davis wrote:On Friday, December 16, 2011 12:29:18 Andrei Alexandrescu wrote:lazy variables would resolve this.Jonathan, could I impose on you to replace all static cdtors in std.datetime with lazy initialization? I looked through it and it strikes me as a reasonably simple job, but I think you'd know better what to do than me. A similar effort could be conducted to reduce or eliminate static cdtors from druntime. I made the experiment of commenting them all, and that reduced the size of the baseline from 218KB to 200KB. This is a good amount, but not as dramatic as what we can get by working on std.datetime.Hmm. I had reply for this already, but it seems to have disappeared, so I'll try again. You could make core.time use property functions instead of the static immutable variables that it's using now for ticksPerSec and appOrigin, but in order to do that right would require introducing a mutex or synchronized block (which is really just a mutex under the hood anyway), and I'm loathe to do that in time-related code. ticksPerSec gets used all over the place in TickDuration, and that could have a negative impact on performance for something that needs to be really fast (since it's used in stuff like StopWatch and benchmarking). On top of that, in order to maintain the current semantics, the property functions would have to be pure, which they can't be without doing some nasty casting to convince the compiler that stuff which isn't pure is actually pure.For std.datetime, the problem would be reduced if a class could be created in CTFE and still be around at runtime, but we can't do that yet, and it wouldn't completely solve the problem, since the shared static constructor related to LocalTime has to call tzset. So, some sort of runtime initialization must be done. And the instances for the singleton are not only immutable, but the functions for getting them are pure. 
So, once again, some nasty casting would be required to get it to work without breaking purity. And once again, we'd have introduce a mutex. And for both core.time and std.datetime we're talking about a mutex would be needed only briefly to ensure that we don't end up with two threads trying to initialize the variable at the same time. After that, it would just be impeding performance for no value. They're classic situations for static constructors - initializing static immutable variables - and really, they _should_ be using static constructors. If we have to get rid of them, it's to get around other problems in the language or compiler instead of fixing those problems. So, on some level, that seems like a failure on the part of the languageno.and the compiler.yes. Although I am not severely affected by 500kb of bloat.If we _have_ to find a workaround, then we have to find a workaround, but I find the need to be distasteful to say the least. I previously tried to get rid of the static constructors in std.datetime and couldn't precisely because they're needed unless you play major casting games to get around immutable and pure. If we play nice, it's impossible to get rid of the static constructors in std.datetime. It probably is possible if we do nasty casting, but (much as I hate to use the word) it seems like this is a hack to get around the fact that the compiler isn't dealing with static constructors as well as we'd like. I'd _really_ like to see this fixed at the compiler level. And honestly, I think that a far worse problem with static constructors is circular dependencies. 
_That_ is something that needs to be addressed with regards to static constructors.Circular dependencies are not to be blamed on the design of static constructors.In general at this point, it's looking like static constructors are turning out to be a bit of a failure on some level, given the issues that we're having because of them, and I think that we should fix the language and/or compiler so that they _aren't_ a failure. - Jonathan M DavisWe are having (minor!!) problems because the task of initializing global data in a modular way is inherently hard. Just have a look how other languages handle initialization of global data and you'll notice that the D solution is actually very sensible.
Dec 16 2011
On Friday, December 16, 2011 21:06:49 Timon Gehr wrote:On 12/16/2011 08:41 PM, Jonathan M Davis wrote:True, but we don't have them.On Friday, December 16, 2011 12:29:18 Andrei Alexandrescu wrote:lazy variables would resolve this.Jonathan, could I impose on you to replace all static cdtors in std.datetime with lazy initialization? I looked through it and it strikes me as a reasonably simple job, but I think you'd know better what to do than me. A similar effort could be conducted to reduce or eliminate static cdtors from druntime. I made the experiment of commenting them all, and that reduced the size of the baseline from 218KB to 200KB. This is a good amount, but not as dramatic as what we can get by working on std.datetime.>Hmm. I had reply for this already, but it seems to have disappeared, so I'll try again. You could make core.time use property functions instead of the static immutable variables that it's using now for ticksPerSec and appOrigin, but in order to do that right would require introducing a mutex or synchronized block (which is really just a mutex under the hood anyway), and I'm loathe to do that in time-related code. ticksPerSec gets used all over the place in TickDuration, and that could have a negative impact on performance for something that needs to be really fast (since it's used in stuff like StopWatch and benchmarking). On top of that, in order to maintain the current semantics, the property functions would have to be pure, which they can't be without doing some nasty casting to convince the compiler that stuff which isn't pure is actually pure.Circular dependencies are not to be blamed on the design of static constructors.Yes they are. static constructors completely chicken out on them. Not only is there no real attempt to determine whether the static constructors are actually dependent (which granted, isn't an easy problem), but there is _zero_ support in the language for resolving such circular dependencies. 
There's no way to say that they _aren't_ dependent even if you can clearly see that they aren't. The solution used in Phobos (which won't work in std.datetime due to the use of immutable and pure) is to create a C module which has the code from the static constructor and then have a separate module which calls it in its static constructor. It works, but it's not pretty (and it doesn't always work - e.g. std.datetime), and it would be _far_ better if you could just mark a static constructor as not depending on anything or mark it as not depending on a specific module or something similar. And given how disgusting it generally is to even figure out what's causing a circular dependency when the runtime won't start your program because of it, I really think that this is a problem which should be resolved. static constructors need to be improved. Yes. The situation with D is better than that of many other languages, but what problems we do have can be _really_ annoying to deal with. Having to deal with circular dependencies due to static module constructors which aren't actually interdependent is one of the most annoying issues in D IMHO. - Jonathan M Davis In general at this point, it's looking like static constructors are turning out to be a bit of a failure on some level, given the issues that we're having because of them, and I think that we should fix the language and/or compiler so that they _aren't_ a failure. - Jonathan M Davis We are having (minor!!) problems because the task of initializing global data in a modular way is inherently hard. Just have a look how other languages handle initialization of global data and you'll notice that the D solution is actually very sensible.
Dec 16 2011
On 12/16/2011 09:31 PM, Jonathan M Davis wrote:On Friday, December 16, 2011 21:06:49 Timon Gehr wrote:No. They arise from the design of the module hierarchy.On 12/16/2011 08:41 PM, Jonathan M Davis wrote:True, but we don't have them.On Friday, December 16, 2011 12:29:18 Andrei Alexandrescu wrote:lazy variables would resolve this.Jonathan, could I impose on you to replace all static cdtors in std.datetime with lazy initialization? I looked through it and it strikes me as a reasonably simple job, but I think you'd know better what to do than me. A similar effort could be conducted to reduce or eliminate static cdtors from druntime. I made the experiment of commenting them all, and that reduced the size of the baseline from 218KB to 200KB. This is a good amount, but not as dramatic as what we can get by working on std.datetime.>Hmm. I had reply for this already, but it seems to have disappeared, so I'll try again. You could make core.time use property functions instead of the static immutable variables that it's using now for ticksPerSec and appOrigin, but in order to do that right would require introducing a mutex or synchronized block (which is really just a mutex under the hood anyway), and I'm loathe to do that in time-related code. ticksPerSec gets used all over the place in TickDuration, and that could have a negative impact on performance for something that needs to be really fast (since it's used in stuff like StopWatch and benchmarking). On top of that, in order to maintain the current semantics, the property functions would have to be pure, which they can't be without doing some nasty casting to convince the compiler that stuff which isn't pure is actually pure.Circular dependencies are not to be blamed on the design of static constructors.Yes they are.static constructors completely chicken out on them. 
Not only is there no real attempt to determine whether the static constructors are actually dependent (which granted, isn't an easy problem),I don't think that is an option.but there is _zero_ support in the language for resolving such circular dependencies. There's no way to say that they _aren't_ dependent even if you can clearly see that they aren't.Yes there is. The compiler and runtime understand that they are not mutually dependent if their modules are not mutually dependent. Package level is the right level for dealing with such issues because the circular dependencies are a modularity problem.The solution used in Phobos (which won't work in std.datetime due to the use of immutable and pure) is to create a C module which has the code from the static constructor and then have a separate module which calls it in its static constructor.You don't need a C function if you just factor out every variable it initializes to the separate D module. __fileinit.d works that way. I don't see why stdiobase.d could not do the same.It works, but it's not pretty (and it doesn't always work - e.g. std.datetime), and it would be _far_ better if you could just mark a static constructor as not depending on anything or mark it as not depending on a specific module or something similar.How would that be checked?And given how disgusting it generally is to even figure out what's causing a circular dependency when the runtime won't start your program because of it, I really think that this is a problem which should resolved. static constructors need to be improved.Nobody has figured out how to solve the problem of modular global data initialization. That is because there probably is no solution.Adding a language construct that turns off the checking entirely (as you seem to suggest) is not at all better than having to create a few additional source files.Yes. 
The situation with D is better than that of many other languages, but what problems we do have can be _really_ annoying to deal with. Having to deal with circular dependencies due to static module constructors which aren't actually interdependent is one of the most annoying issues in D IMHO. In general at this point, it's looking like static constructors are turning out to be a bit of a failure on some level, given the issues that we're having because of them, and I think that we should fix the language and/or compiler so that they _aren't_ a failure. - Jonathan M Davis We are having (minor!!) problems because the task of initializing global data in a modular way is inherently hard. Just have a look how other languages handle initialization of global data and you'll notice that the D solution is actually very sensible.
Dec 16 2011
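The factoring-out approach argued for above can be sketched with three small files. All module and variable names here (`a`, `b`, `a_init`, `aValue`, `bValue`) are hypothetical, and the block is a multi-file sketch rather than a single compilable unit.

```d
// ---- Problem: two modules import each other and both have static constructors ----

// file: a.d
module a;
import b;                         // a imports b ...
int aValue;
static this() { aValue = 1; }

// file: b.d
module b;
import a;                         // ... and b imports a
int bValue;
static this() { bValue = 2; }

// The runtime cannot order the two constructors and refuses to start the
// program with a cyclic-dependency error, even though neither constructor
// actually uses the other module.

// ---- Workaround: move the variable and its constructor into a helper module ----

// file: a_init.d
module a_init;                    // imports nothing, so it cannot be part of a cycle
int aValue;
static this() { aValue = 1; }

// file: a.d (revised)
module a;
import b;
public import a_init;             // aValue stays visible to users of module a
// no static constructor here any more, so the a <-> b import cycle is harmless
```

The limitation raised in the thread still applies: this only works when the initialized variable can live in the helper module, which is not the case for static members that must stay inside their class.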
On Friday, December 16, 2011 22:41:14 Timon Gehr wrote: On 12/16/2011 09:31 PM, Jonathan M Davis wrote: You don't need a C function if you just factor out every variable it initializes to the separate D module. __fileinit.d works that way. I don't see why stdiobase.d could not do the same. That only works if the variable being initialized is in the new module instead of the original module, which you can't always do. It wouldn't be. It wouldn't need to be. The programmer is telling the compiler that there isn't a dependency. It's up to the programmer to make sure that it's right, and if it's wrong, it's their fault. There are plenty of other features like that in D - just not SafeD. It works, but it's not pretty (and it doesn't always work - e.g. std.datetime), and it would be _far_ better if you could just mark a static constructor as not depending on anything or mark it as not depending on a specific module or something similar. How would that be checked? I completely disagree. For instance, it's impossible to move the singleton instances of UTC and LocalTime from std.datetime into another module without breaking encapsulation, and it's definitely impossible to do it and leave them as members of their respective classes. Those static constructors clearly don't rely on any other modules except for the one which gives the declaration for tzset (and has no static constructors). But if std.file needed a module constructor, we'd end up with a circular dependency between std.datetime and std.file when clearly nothing in std.datetime's static constructor relies on std.file in any way, shape, or form. It would be a huge improvement to be able to just mark those static constructors as not relying on any other modules having their static constructors run first. As it stands, it's a royal pain to deal with any circular dependencies which pop up, and because of that, it quickly becomes best practice to avoid static constructors as much as possible, which is a big problem IMHO.
Factoring out the static constructor's contents into a separate module is not always possible, and it's an ugly solution IMHO. I'd _much_ rather have a feature where I can tell the compiler that there is no circular dependency so that it can appropriately order the loading of the modules. - Jonathan M Davisannoying issues in D IMHO.Adding a language construct that turns off the checking entirely (as you seem to suggest) is not at all better than having to create a few additional source files.
Dec 16 2011
On 12/16/2011 11:39 PM, Jonathan M Davis wrote:
> [...] For instance, it's impossible to move the singleton instances of UTC and LocalTime from std.datetime into another module without breaking encapsulation.

In what way would encapsulation be broken by just moving the class to a helper module?
Dec 16 2011
On 12/16/11 4:39 PM, Jonathan M Davis wrote:It wouldn't be. It wouldn't need to be. The programmer is telling the compiler that there isn't a dependency. It's up to the programmer to make sure that it's right, and it's wrong, it's their fault. There are plenty of other features like that in D - just not SafeD.I don't see progress here over arranging packages and modules to reflect program structure in a way that clarifies it to the human /and/ the compiler.Maybe there's an issue with the design. Maybe Singleton (the most damned of all patterns) is not the best choice here. Or maybe the use of an inheritance hierarchy with a grand total of 4 classes. Or maybe the encapsulation could be rethought. The general point is, a design lives within a language. Any language is going to disallow a few designs or make them unsuitable for particular situation. This is, again, multiplied by the context: it's the standard library.I completely disagree. For instance, it's impossible to move the singleton instances of UTC and LocalTime from std.datetime into another module without breaking encapsulation, and it's definitely impossible to do it and leave them as members of their respective classes.annoying issues in D IMHO.Adding a language construct that turns off the checking entirely (as you seem to suggest) is not at all better than having to create a few additional source files.Those static constructors clearly don't rely on any other modules except for the one which gives the declaration for tzset (and has no static constructors). But if std.file needed a module constructor, we'd end up with a circular dependency between std.datetime and std.file when clearly nothing in std.datetime's static constructor relies on std.file in any way shape or form. It would be a huge improvement to be able to just mark those static constructors as not relying on any other modules having their static constructors run first. 
As it stands, it's a royal pain to deal with any circular dependencies which pop up and because of that, it quickly becomes best practice to avoid static constructors as much as possible, which is a big problem IMHO.I think this point has gotten into an extreme, a corner of the design space. Yeah, sky's blue, apple pie is good (and too much of it gives diabetes), and module dependencies can be messy. But it strikes me as a bit backwards to add instructions in the core language to lessen guarantees and make things even messier, when alternatives exist that foster better dependency control for the very rare situations that need intervention. It's just not proportional response. The persona using such a feature would be quite an odd combination - a developer with sophisticated enough needs to want unchecked dependencies as a feature, yet naive enough to be unable to solve the problem without the feature, and yet again sophisticated enough to not make mistakes in using said feature.Factoring out the static constructor's contents into a separate module is not always possible, and it's an ugly solution IMHO. I'd _much_ rather have a feature where I can tell the compiler that there is no circular dependency so that it can appropriately order the loading of the modules.But what's the appropriate order then? :o) Andrei
Dec 16 2011
How did other languages solve this issue? I can't imagine D being the only language with static constructors. Do they have that problem too?
Dec 16 2011
On 12/16/2011 3:18 PM, maarten van damme wrote:
> how did other languages solve this issue? I can't imagine D being the only language with static constructors, do they have that problem too?

In C++, the order in which static constructors run is implementation defined. No guarantees at all. The programmer has no reasonable way to control the order in which they are done.

(Of course, C++ doesn't even have modules, so the notion of a module constructor is tenuous at best.)
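For contrast with the C++ situation described above, here is a minimal single-module sketch of the ordering D does define: shared static constructors run before thread-local ones, and constructors within one module run in lexical order. The variable name `order` is made up for illustration, and the sketch says nothing about ordering across modules, which is what the rest of the thread is about.

```d
import std.stdio;

// Hypothetical log of the order in which the constructors below ran.
__gshared string[] order;

// Runs once per process, before any thread-local static constructor.
shared static this() { order ~= "shared"; }

// Thread-local static constructors; within a module they run in lexical order.
static this() { order ~= "first"; }
static this() { order ~= "second"; }

void main()
{
    // Only the main thread exists here, so each constructor ran exactly once.
    writeln(order);
}
```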
Dec 16 2011
On Dec 16, 2011, at 3:24 PM, Walter Bright wrote:
> On 12/16/2011 3:18 PM, maarten van damme wrote:
>> how did other languages solve this issue? I can't imagine D beeing the only language with static constructors, do they have that problem too?
>
> In C++, the order that static constructors run is implementation defined. No guarantees at all. The programmer has no reasonable way to control the order in which they are done.
>
> (Of course, C++ doesn't even have modules, so the notion of a module constructor is tenuous at best.)

This aspect of C++ drives me absolutely crazy. Though I imagine it bothers a lot of people given all the coverage static initialization has gotten in C++ literature.
Dec 16 2011
On 12/17/2011 12:18 AM, maarten van damme wrote:how did other languages solve this issue? I can't imagine D beeing the only language with static constructors, do they have that problem too?is to call the static constructor lazily upon class load time. That means it can be called at an arbitrary point during your program execution. And if you accidentally have circular dependencies between static constructors, your program may or may not blow up or behave badly.
Dec 16 2011
Le 17/12/2011 00:18, maarten van damme a écrit :how did other languages solve this issue? I can't imagine D beeing the only language with static constructors, do they have that problem too?AFAIK, I believe like in D, it's best practice to avoid static well, even though the running order is well-defined. The dependency injection design pattern seems to help here.
Dec 17 2011
On Friday, December 16, 2011 17:13:49 Andrei Alexandrescu wrote:Maybe there's an issue with the design. Maybe Singleton (the most damned of all patterns) is not the best choice here. Or maybe the use of an inheritance hierarchy with a grand total of 4 classes. Or maybe the encapsulation could be rethought. The general point is, a design lives within a language. Any language is going to disallow a few designs or make them unsuitable for particular situation. This is, again, multiplied by the context: it's the standard library.I don't know what's wrong with singletons. It's a great pattern in certain circumstances. In this case, it avoids unnecessary allocations every time that you do something like Clock.currTime(). There's no reason to keep allocating new instances of LocalTime and wasting memory. The data in all of them would be identical. And since the time zone has to be dynamic, it requires either a class or function pointers (or delegates). And since multiple functions are involved per time zone, it's far cleaner to use class. It has the added benefit of giving you a nice place to do stuff like ask the time zone its name. So, I don't see what could be better than using classes for the time zones like it does now. And given the fact that it's completely unnecessary and wasteful to allocate multiple instances of UTC and LocalTime, it seems to me that the singleton pattern is exactly the correct solution for this problem. There would be fewer potential issues with circular dependencies if std.datetime were broken up, but the consensus seems to be that we don't want to do that. Regardless, if I find a way to lazily load the singletons in spite of immutable and pure, then there won't be any more need for the static constructors for them. There's still one for the unit tests, but worse comes to worst, that functionality could be moved to a function which is called by the first unittest block.But what's the appropriate order then? :o)It doesn't matter. 
The static constructors in std.datetime have no dependencies on other modules at all aside from object and the core module which holds the declaration for tzset. In neither case do they depend on any other static constructors. In my experience, that's almost always the case. But because of how circular dependencies are treated, the compiler/runtime considers it a circular dependency as soon as two modules which import each other directly - or worse, indirectly - both have module constructors, regardless of whether there is anything even vaguely interdependent about those static constructors and what they initialize. So, you're forced to move stuff into other modules, and in some cases (such as when pure or immutable is being used), that may not work.

Clearly, I'm not going to win any arguments on this, given that both you and Walter are definitely opposed, but I definitely think that the current situation with circular dependencies is one of D's major warts.

- Jonathan M Davis
Dec 16 2011
On 12/16/11 5:50 PM, Jonathan M Davis wrote:On Friday, December 16, 2011 17:13:49 Andrei Alexandrescu wrote:http://en.wikipedia.org/wiki/Singleton_pattern Second paragraph.Maybe there's an issue with the design. Maybe Singleton (the most damned of all patterns) is not the best choice here. Or maybe the use of an inheritance hierarchy with a grand total of 4 classes. Or maybe the encapsulation could be rethought. The general point is, a design lives within a language. Any language is going to disallow a few designs or make them unsuitable for particular situation. This is, again, multiplied by the context: it's the standard library.I don't know what's wrong with singletons.It's a great pattern in certain circumstances. In this case, it avoids unnecessary allocations every time that you do something like Clock.currTime(). There's no reason to keep allocating new instances of LocalTime and wasting memory. The data in all of them would be identical. And since the time zone has to be dynamic, it requires either a class or function pointers (or delegates). And since multiple functions are involved per time zone, it's far cleaner to use class. It has the added benefit of giving you a nice place to do stuff like ask the time zone its name. So, I don't see what could be better than using classes for the time zones like it does now. And given the fact that it's completely unnecessary and wasteful to allocate multiple instances of UTC and LocalTime, it seems to me that the singleton pattern is exactly the correct solution for this problem.You're using a stilted version of it. Most often the singleton object is created lazily upon the first access, whereas std.datetime creates the object (and therefore shotguns linkage with the garbage collector) even if never needed. But what I'm trying here is to lift the level of discourse. The Singleton sounds like the solution of choice already presupposing that inheritance and polymorphism are good decisions. 
What I'm trying to say is that D should be rich enough to allow you considerable freedom in the design space, so we should have enough means to navigate around this one particular issue. I don't think we can say with a straight face we can't avoid use of static this inside std.datetime.There would be fewer potential issues with circular dependencies if std.datetime were broken up, but the consensus seems to be that we don't want to do that. Regardless, if I find a way to lazily load the singletons in spite of immutable and pure, then there won't be any more need for the static constructors for them. There's still one for the unit tests, but worse comes to worst, that functionality could be moved to a function which is called by the first unittest block.Maybe the choice of immutable and pure is too restrictive. How about making the object returned const?Under what circumstances it doesn't work, and how would adding _more_ support for _less_ safety would be better than a glorified cast that you can use _today_?But what's the appropriate order then? :o)It doesn't matter. The static constructors in std.datetime has no dependencies on other modules at all aside from object and the core module which holds the declaration for tzset. In neither case does it depend on any other static constructors. In my experience, that's almost always the case. But because of how circular dependencies are treated, the compiler/runtime considers it a circular dependency as soon as two modules which import each other directly - or worse, indirectly - both have module constructors, regardless of whether there is anything even vaguely interdependent about those static constructors and what they initialize. 
So, you're forced to move stuff into other modules, and in some cases (such as when pure or immutable is being used), that may not work.Clearly, I'm not going to win any arguments on this, given that both you and Walter are definitely opposed, but I definitely think that the current situation with circular dependencies is one of D's major warts.I'm not nailed to the floor. Any good arguments would definitely change my opinion. Andrei
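A minimal sketch of the lazily created singleton mentioned above, with no `static this()` involved. `Zone` is a hypothetical stand-in class, not std.datetime's actual LocalTime or UTC, and the locking strategy is an assumption chosen for simplicity.

```d
import std.stdio;

// Hypothetical stand-in for a time zone class; not std.datetime's actual API.
final class Zone
{
    private string _name;
    private __gshared Zone _instance; // one instance shared by all threads

    private this(string name) { _name = name; }

    string name() const { return _name; }

    // Builds the object on first access instead of in a static constructor.
    static Zone instance()
    {
        // An unnamed synchronized block takes a global lock for this statement,
        // so two threads cannot both see null and allocate twice.
        synchronized
        {
            if (_instance is null)
                _instance = new Zone("UTC");
        }
        return _instance;
    }
}

void main()
{
    auto a = Zone.instance();
    auto b = Zone.instance();
    writeln(a is b, " ", a.name); // both calls yield the same object
}
```

Note that `instance()` as written is neither `pure` nor does it return an immutable object, which is exactly the obstacle raised in this thread for std.datetime.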
Dec 16 2011
On Friday, December 16, 2011 18:05:56 Andrei Alexandrescu wrote:
> On 12/16/11 5:50 PM, Jonathan M Davis wrote:
> http://en.wikipedia.org/wiki/Singleton_pattern Second paragraph.

Valid points, but it's still useful under some circumstances. I don't actually use it very often personally. It just made sense here. Thanks for the link.

> You're using a stilted version of it. Most often the singleton object is created lazily upon the first access, whereas std.datetime creates the object (and therefore shotguns linkage with the garbage collector) even if never needed. But what I'm trying here is to lift the level of discourse. The Singleton sounds like the solution of choice already presupposing that inheritance and polymorphism are good decisions. What I'm trying to say is that D should be rich enough to allow you considerable freedom in the design space, so we should have enough means to navigate around this one particular issue. I don't think we can say with a straight face we can't avoid use of static this inside std.datetime.

The only reason that it's not lazily loading is because of the purity issue and the fact that it would require a mutex. The mutex we can live with. pure can't be gotten around easily, but I'll figure it out.

As for the general design, SysTime needs to be able to dynamically adjust its value based on the time zone upon request (e.g. asking for the SysTime as a string or asking for that SysTime's year). That essentially requires that the set of functions required for the calculations be swappable (preferably as a group, since that's far cleaner). Encapsulating it in a class gives you that polymorphic behavior quite nicely and also groups the various functions quite nicely. It also gives you a nice place to put some stuff like the time zone's name. Sure, we could theoretically change it to be a struct which holds function pointers, but that seems to me like you're pretty much just trying to redesign classes that way.
I think that the basic design is solid.

>> There would be fewer potential issues with circular dependencies if std.datetime were broken up, but the consensus seems to be that we don't want to do that. Regardless, if I find a way to lazily load the singletons in spite of immutable and pure, then there won't be any more need for the static constructors for them. There's still one for the unit tests, but worse comes to worst, that functionality could be moved to a function which is called by the first unittest block.
> Maybe the choice of immutable and pure is too restrictive. How about making the object returned const?

SysTime holds an immutable TimeZone (currently with Rebindable). In theory, this should have the advantage of making it possible to pass a SysTime across with send and receive, but bugs in the compiler currently make it impossible to construct an immutable SysTime. So, all TimeZone objects are const, or they won't work with SysTime. And since there's not normally a reason to change any of the values in a TimeZone (they don't hold much data in the first place), that's really not a problem. The only problem with making it immutable has to do with the singleton. I suppose that it could be changed to Rebindable!(immutable TimeZone) like in SysTime, but when I designed it, there didn't seem much point to that, since it had to be constructed at runtime and required a static constructor regardless.

And I was trying to make absolutely as much in std.datetime pure as possible, which inevitably led to the singletons being pure. Making them impure makes it so that a variety of other functions can't be pure and would break code. I don't remember how much, however. Regardless, to avoid breaking code, it has to be pure. It's possible that the code breakage would be worth it, but I'd have to mess around with it to see. With appropriate casts, pure can be subverted, but that's obviously ugly.

> Under what circumstances it doesn't work,

I couldn't move the singletons out of std.datetime in that way. pure disallows it.

> and how would adding _more_ support for _less_ safety would be better than a glorified cast that you can use _today_?

I don't think that I have ever seen an _actual_ circular dependency when a program blows up because of it. It's always a case of the two modules doing completely unrelated stuff with their static constructors. It's generally incredibly obvious that there's no interdependency, but the compiler/runtime isn't smart enough to see that. And if you use static constructors much (which invariably happens if you have much in the way of immutable variables which are commonly used enough to put at module or class scope), you run into this problem fairly easily. And given the large amount of inter-module importing in Phobos, it's _very_ easy to run into the problem there if we use static constructors.

When such circular dependencies happen, it's a royal pain to sort out what's going on - especially if the modules don't import each other directly. The error messages have improved, but it's still nasty to sort out exactly what's happening. And then fixing it? Assuming that you can use the solution that some of Phobos' modules use by having a secondary module for the initialization, then there's a way to do it, but that solution is quite ugly IMHO, and regardless of that, it's _not_ in the least bit obvious. I don't know that I ever would have thought of it myself (maybe, maybe not).

So, the programmer is essentially faced with a situation where they have two modules with static constructors that they can clearly see are completely unrelated, but they're going to have to do some major refactoring to get around the issue that the compiler and runtime _aren't_ smart enough to see that the order in which the modules are initialized doesn't matter at all. _If_ they think of the solution that Phobos uses or are lucky enough to have someone else point it out to them _and_ it's actually possible to refactor the static constructor out like that, then the solution is doable, albeit arguably on the ugly side. But that's assuming a lot IMHO.

By contrast, we could have a simple feature that was explained in the documentation along with static constructors which made it easy to tell the compiler that the order doesn't matter - either by saying that it doesn't matter at all or that it doesn't matter in regards to a specific module. e.g.

nodepends(std.file) static this()
{
}

Now the code doesn't have to be redesigned to get around the fact that the compiler just isn't smart enough to figure it out on its own. Sure, the feature is potentially unsafe, but so are plenty of other features in D. The best situation would be if the compiler was smart enough to figure it out for itself, but barring that this definitely seems like a far cleaner solution than having to try and figure out how to break up some of the initialization code for a module into a separate module, especially when features such as immutable and pure tend to make such separation impossible without some nasty casts. It would just be way simpler to have a feature which allowed you to tell the compiler that there was no dependency.

I'd probably feel differently about this if static constructors tended to have actual interdependencies, but they are almost invariably used for initializing immutable variables and the like and have no dependencies on other modules at all. It's other stuff in the modules which have those interdependencies.

- Jonathan M Davis

>> Clearly, I'm not going to win any arguments on this, given that both you and Walter are definitely opposed, but I definitely think that the current situation with circular dependencies is one of D's major warts.
> I'm not nailed to the floor. Any good arguments would definitely change my opinion.
Dec 16 2011
On 12/16/11 6:54 PM, Jonathan M Davis wrote:
> By contrast, we could have a simple feature that was explained in the documentation along with static constructors which made it easy to tell the compiler that the order doesn't matter - either by saying that it doesn't matter at all or that it doesn't matter in regards to a specific module. e.g. nodepends(std.file) static this() { } Now the code doesn't have to be redesigned to get around the fact that the compiler just isn't smart enough to figure it out on its own. Sure, the feature is potentially unsafe, but so are plenty of other features in D.

That is hardly a good argument in favor of the feature :o).

One issue that you might not have considered is that this is more brittle than it might seem. Even though the dependency pattern is "painfully obvious" to the human at a point in time, maintenance work can easily change that, and in very non-obvious ways (e.g. dependency cycles spanning multiple modules). I've seen it happening in C++, and when you realize it, it's quite mind-boggling.

> The best situation would be if the compiler was smart enough to figure it out for itself, but barring that this definitely seems like a far cleaner solution than having to try and figure out how to break up some of the initialization code for a module into a separate module, especially when features such as immutable and pure tend to make such separation impossible without some nasty casts. It would just be way simpler to have a feature which allowed you to tell the compiler that there was no dependency.

I think the only right approach to this must be principled - either by CTFEing the constructor or by guaranteeing it calls no functions that may close a dependency cycle. Even without that, I'd say we're in very good shape.

Andrei
Dec 16 2011
Le 17/12/2011 02:39, Andrei Alexandrescu a écrit :On 12/16/11 6:54 PM, Jonathan M Davis wrote:Very good point. CTFE is improving with each version of dmd, and is a real alternative to static this(); It should be considered when apropriate, it has many benefices.By contrast, we could have a simple feature that was explained in the documenation along with static constructors which made it easy to tell the compiler that the order doesn't matter - either by saying that it doesn't matter at all or that it doesn't matter in regards to a specific module. e.g. nodepends(std.file) static this() { } Now the code doesn't have to be redesigned to get around the fact that the compiler just isn't smart enough to figure it out on its own. Sure, the feature is potentially unsafe, but so are plenty of other features in D.That is hardly a good argument in favor of the feature :o). One issue that you might have not considered is that this is more brittle than it might seem. Even though the dependency pattern is "painfully obvious" to the human at a point in time, maintenance work can easily change that, and in very non-obvious ways (e.g. dependency cycles spanning multiple modules). I've seen it happening in C++, and when you realize it it's quite mind-boggling.The best situation would be if the compiler was smart enough to figure it out for itself, but barring that this definitely seems like a far cleaner solution than having to try and figure out how to break up some of the initialization code for a module into a separate module, especially when features such as immutable and pure tend to make such separation impossible without some nasty casts. It would just be way simpler to have a feature which allowed you to tell the compiler that there was no dependency.I think the only right approach to this must be principled - either by CTFEing the constructor or by guaranteeing it calls no functions that may close a dependency cycle. Even without that, I'd say we're in very good shape. 
Andrei
Dec 17 2011
On Saturday, December 17, 2011 19:44:28 deadalnix wrote:Very good point. CTFE is improving with each version of dmd, and is a real alternative to static this(); It should be considered when apropriate, it has many benefices.I think that in general, the uses for static this fall into one of two categories: 1. Initializing stuff that can't be initialized at compile time. This includes stuff like classes or AAs as well as stuff which needs to be initialized with a value which isn't known until runtime (e.g. when the program started running). 2. Calling functions which need to be called at the beginning of the program (e.g. a function which does something to the environment that the program is running in). rarer of the two use cases. So, ultimately static this may become very rare. - Jonathan M Davis
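As a small illustration of the CTFE alternative discussed above, the sketch below initializes a module-level immutable array at compile time, so no `static this()` is needed for it. The function `squares` and the variable `table` are hypothetical names, and the example assumes a compiler recent enough to convert the result of a `pure` function to immutable implicitly.

```d
import std.stdio;

// Hypothetical helper: an ordinary pure function that CTFE can also evaluate.
int[] squares(int n) pure
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= i * i;
    return result;
}

// A module-level initializer must be a compile-time value, so the compiler
// runs squares(5) through CTFE; no static constructor is involved.
immutable int[] table = squares(5);

// The same function can be checked at compile time as well.
static assert(squares(3) == [0, 1, 4]);

void main()
{
    writeln(table);
}
```

Values that genuinely depend on the runtime environment (the first category in the post above) cannot be handled this way and still need some form of runtime initialization.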
Dec 17 2011
Le 18/12/2011 03:01, Jonathan M Davis a écrit :On Saturday, December 17, 2011 19:44:28 deadalnix wrote:Google Guice or picocontainer to deal with this issue. In the case of datetime, though, I suspect it would be a using a hammer to crush a fly.Very good point. CTFE is improving with each version of dmd, and is a real alternative to static this(); It should be considered when apropriate, it has many benefices.I think that in general, the uses for static this fall into one of two categories: 1. Initializing stuff that can't be initialized at compile time. This includes stuff like classes or AAs as well as stuff which needs to be initialized with a value which isn't known until runtime (e.g. when the program started running). 2. Calling functions which need to be called at the beginning of the program (e.g. a function which does something to the environment that the program is running in). rarer of the two use cases. So, ultimately static this may become very rare. - Jonathan M Davis
Dec 18 2011
On Sat, 17 Dec 2011 01:50:51 +0200, Jonathan M Davis <jmdavisProg gmx.com> wrote:
> On Friday, December 16, 2011 17:13:49 Andrei Alexandrescu wrote:
>> Maybe there's an issue with the design. Maybe Singleton (the most damned of all patterns) is not the best choice here. Or maybe the use of an inheritance hierarchy with a grand total of 4 classes. Or maybe the encapsulation could be rethought. The general point is, a design lives within a language. Any language is going to disallow a few designs or make them unsuitable for particular situation. This is, again, multiplied by the context: it's the standard library.
> I don't know what's wrong with singletons. It's a great pattern in certain circumstances.

I don't like patterns much, but when it comes to singleton I absolutely hate it. Just ask yourself what it does to earn that fancy name. NOTHING. It is nothing but hype from those who want to rule everything with one paradigm. Generic solutions/rules/paradigms are our final target WHEN they are elegant.

If you are using singleton in your C++/D (or any other M-P language) code, do yourself a favor and trash that book you learned it from.

---
class A {
    static A make();
}

class B;
B makeB();
---

What can A.make do that makeB cannot? (Other than creating objects of two different types :P )
Dec 17 2011
On 12/17/11 6:34 AM, so wrote:
> If you are using singleton in your C++/D (or any other M-P language) code, do yourself a favor and trash that book you learned it from.
> ---
> class A { static A make(); } class B; B makeB();
> ---
> What A.make can do makeB can not? (Other than creating objects of two different types :P )

Singleton has two benefits. One, you can't accidentally create more than one instance. The second, which is often overlooked, is that you still benefit from polymorphism (as opposed to making its state global).

Andrei
Dec 17 2011
On Sat, 17 Dec 2011 21:20:33 +0200, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> On 12/17/11 6:34 AM, so wrote:
>> If you are using singleton in your C++/D (or any other M-P language) code, do yourself a favor and trash that book you learned it from.
>> ---
>> class A { static A make(); } class B; B makeB();
>> ---
>> What A.make can do makeB can not? (Other than creating objects of two different types :P )
> Singleton has two benefits. One, you can't accidentally create more than one instance. The second, which is often overlooked, is that you still benefit of polymorphism (as opposed to making its state global).
> Andrei

Now I am puzzled: "makeB" does both, and does them better (better in that it doesn't expose any detail to the user).
Dec 17 2011
On Saturday, 17 December 2011 at 21:02:58 UTC, so wrote:
> On Sat, 17 Dec 2011 21:20:33 +0200, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> On 12/17/11 6:34 AM, so wrote:
>>> If you are using singleton in your C++/D (or any other M-P language) code, do yourself a favor and trash that book you learned it from.
>>> ---
>>> class A { static A make(); } class B; B makeB();
>>> ---
>>> What A.make can do makeB can not? (Other than creating objects of two different types :P )
>> Singleton has two benefits. One, you can't accidentally create more than one instance. The second, which is often overlooked, is that you still benefit of polymorphism (as opposed to making its state global).
>> Andrei
> Now i am puzzled, "makeB" does both and does better. (better as it doesn't expose any detail to user)

Both of your examples are the singleton pattern if `make` returns the same instance every time, and arguably (optionally?) A or B shouldn't be instantiable in any other way.

I suspect that the reason a static member function is prevalent is because it's easy to just make the constructor private (and not have to mess with things like C++'s `friend`). In D, there's no real difference because you can still use private members as long as you're in the same module.

The only difference between them I can see is that the module-level function doesn't expose the class name directly when using the function, which is but a minor improvement.
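To make the comparison concrete, here is a minimal sketch of both access styles in one D module. The bodies of `make` and `makeB` are assumptions (the original example only declared them), and locking is left out for brevity, so the sketch assumes single-threaded use.

```d
import std.stdio;

class A
{
    private this() {}
    private __gshared A _a;

    // Static member function as the access point.
    static A make()
    {
        if (_a is null)
            _a = new A;
        return _a;
    }
}

class B
{
    // In D, private means private to the module, not to the class.
    private this() {}
}

private __gshared B _b;

// Module-level function as the access point. It lives in the same module,
// so it can still call B's private constructor.
B makeB()
{
    if (_b is null)
        _b = new B;
    return _b;
}

void main()
{
    writeln(A.make() is A.make()); // same instance each time
    writeln(makeB() is makeB());   // same instance each time
}
```

Code in other modules cannot write `new A` or `new B`, so both variants restrict instantiation in the same way; the choice between them is mostly a matter of naming.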
Dec 17 2011
On Sat, 17 Dec 2011 23:12:16 +0200, Jakob Ovrum <jakobovrum gmail.com> wrote:
> I suspect that the reason a static member function is prevalent is because it's easy to just make the constructor private (and not have to mess with things like C++'s `friend`). In D, there's no real difference because you can still use private members as long as you're in the same module.

Exactly. There is no difference between "static A.make" and "makeA" in D.

> The only difference between them I can see is that the module-level function doesn't expose the class name directly when using the function, which is but a minor improvement.

You have to expose it either way, no? "A.make" instead of "makeA".
Dec 18 2011
On Sunday, 18 December 2011 at 08:56:56 UTC, so wrote:
> You have to expose either way no? "A.make" instead of "makeA"

Yeah, in most sane code, I would imagine so. But still, the original example was just `make` versus `A.make`. They could both obscure their return type through various means (like auto), but imo it makes less sense to do so for the static member function - I would be surprised to call `A.make` and not get a value of type `A`.

But it would only be a tiny improvement and I don't think it's really relevant to the singleton pattern.
Dec 18 2011
On Sunday, 18 December 2011 at 09:26:58 UTC, Jakob Ovrum wrote:On Sunday, 18 December 2011 at 08:56:56 UTC, so wrote:Sorry, I'm wrong, that wasn't the case at all. The original example was indeed `A.make` versus `makeB`.You have to expose either way no? "A.make" instead of "makeA"Yeah, in most sane code, I would imagine so. But still, the original example was just `make` version `A.make`. They could both obscure their return type through various means (like auto), but imo it makes less sense to do so for the static member function - I would be surprised to call `A.make` and not get a value of type `A`. But it would only be a tiny improvement and I don't think it's really relevant to the singleton pattern.
Dec 18 2011
On Saturday, December 17, 2011 13:20:33 Andrei Alexandrescu wrote:
> On 12/17/11 6:34 AM, so wrote:
>> If you are using singleton in your C++/D (or any other M-P language) code, do yourself a favor and trash that book you learned it from.
>> ---
>> class A { static A make(); }
>> class B;
>> B makeB();
>> ---
>> What A.make can do makeB can not? (Other than creating objects of two different types :P )
> Singleton has two benefits. One, you can't accidentally create more than one instance. The second, which is often overlooked, is that you still benefit from polymorphism (as opposed to making its state global).

Yes. There are occasions when singleton is very useful and makes perfect sense. There's every possibility that it's a design pattern which is overused, and if you don't need it, you probably shouldn't use it, but there _are_ cases where it's useful. In the case of std.datetime, the UTC and LocalTime classes are singletons because there's absolutely no point in ever allocating more than one of each. It would be a waste of memory. Imagine if

auto time = Clock.currTime();

had to allocate a LocalTime object every time. That's a lot of useless heap allocation. By making it a singleton, it's far more efficient. Currently, it does _no_ heap allocation, and once the singleton becomes lazy, it'll only allocate on the first call. I don't see a valid reason _not_ to use a singleton in this case - certainly not as long as time zones are classes, and I think that they make the most sense as classes considering what they have to do and how they have to behave.

- Jonathan M Davis
Dec 17 2011
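The `A.make` versus `makeB` exchange can be sketched in D as follows. The names `A`, `B`, `make` and `makeB` come from so's example. The private constructors, the `__gshared` instance variables and the eager initialization in `shared static this` are assumptions added for illustration, not code from Phobos.

```d
import std.stdio : writeln;

// Variant 1: static member function (the A.make shape).
class A
{
    private __gshared A _instance;   // one instance for the whole process
    private this() {}                // private: only this module can construct A

    static A make() { return _instance; }
}

// Variant 2: module-level function (the makeB shape).
class B
{
    private this() {}                // `private` is module-scoped in D
}

private __gshared B _b;

B makeB() { return _b; }

// Eager process-wide initialization (an assumption of this sketch).
shared static this()
{
    A._instance = new A;
    _b = new B;
}

void main()
{
    assert(A.make() is A.make());    // same object every time
    assert(makeB() is makeB());
    writeln("both factories hand out a single shared instance");
}
```

Because `private` applies per module in D, both variants can keep the constructor hidden without anything like C++'s `friend`, which matches Jakob's observation that the two shapes are nearly equivalent here.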
On 12/16/2011 1:41 PM, Timon Gehr wrote:Adding a language construct that turns off the checking entirely (as you seem to suggest) is not at all better than having to create a few additional source files.I also don't really see how turning off checking is even slightly more elegant than using a dirty cast. The additional source file thing is best because it fits in with the guarantees of the language - it is not a hack nor does it require trust in the programmer to get it right. It's not going to have heisenbugs where it working or not depends on arbitrary link order.
Dec 16 2011
> Yes they are. static constructors completely chicken out on them. Not only is there no real attempt to determine whether the static constructors are actually dependent (which granted, isn't an easy problem), but there is _zero_ support in the language for resolving such circular dependencies. There's no way to say that they _aren't_ dependent even if you can clearly see that they aren't. The solution used in Phobos (which won't work in std.datetime due to the use of immutable and pure) is to create a C module which has the code from the static constructor and then have a separate module which calls it in its static constructor.

Which is a hack, because that C function is a compiler wall while the dependency persists.

Btw., stdiobase and datebase are obsolete; the cycles have vanished. You would get this only if std.dateparse had a shared static ctor too:

Cycle detected between modules with ctors/dtors:
std.date -> std.dateparse -> std.date
object.Exception src/rt/minfo.d(309): Aborting!

There is a cleaner hack to solve the issue, but I really don't like it. It relies on there being two DAGs that are iterated, one for "shared static this" and one for "static this".

----
module a;
import b;
shared static this() { }
----
module b;
import a, core.atomic : cas;
shared bool initialized;
static this()
{
    if (!cas(&initialized, false, true)) return;
    ...
}
----
Jan 18 2012
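The Phobos workaround described in the quoted text (put the constructor body behind an `extern(C)` function and call it from a static constructor in a separate module) can be sketched roughly like this. The names `table` and `demo_impl_static_this` are hypothetical, and the two modules are collapsed into one file so the example runs on its own.

```d
import std.stdio : writeln;

// --- would live in a module such as `demo.impl`, which has NO static constructor ---
__gshared int table;                 // state that needs runtime initialization

extern (C) void demo_impl_static_this()
{
    table = 42;                      // the former static constructor body
}

// --- would live in a tiny module such as `demo.base`, which does not import demo.impl ---
// In the two-module layout it would only declare the C-mangled symbol:
//     extern (C) void demo_impl_static_this();
// so there is no D import edge from demo.base back to demo.impl.
shared static this()
{
    demo_impl_static_this();
}

void main()
{
    assert(table == 42);
    writeln("initialized through the extern(C) trampoline");
}
```

As Martin notes, the real dependency is still there. The `extern(C)` declaration only hides it from the import-based cycle check, so correct ordering becomes the programmer's responsibility.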
On Wed, 18 Jan 2012 12:14:07 +0100, Martin Nowak <dawg dawgfoto.de> wrote:Forget about it. Immutable initialization shouldn't work from thread local ctors. But hey I found a bug and it already had a number http://d.puremagic.com/issues/show_bug.cgi?id=4923.Yes they are. static constructors completely chicken out on them. Not only is there no real attempt to determine whether the static constructors are actually dependent (which granted, isn't an easy problem), but there is _zero_ support in the language for resolving such circular dependencies. There's no way to say that they _aren't_ dependent even if you can clearly see that they aren't. The solution used in Phobos (which won't work in std.datetime due to the use of immutable and pure) is to create a C module which has the code from the static constructor and then have a separate module which calls it in its static constructor.Which is a hack because that C function is a compiler wall while the dependency persists. Btw. that stdiobase and datebase are obsolete the cycles have vanished. You will get this only if std.dateparse had a shared static ctor too. Cycle detected between modules with ctors/dtors: std.date -> std.dateparse -> std.date object.Exception src/rt/minfo.d(309): Aborting! There is a cleaner hack to solve the issue but I really don't like it. It's two DAGs that are iterated one for "shared static this" and one for "static this". ---- module a; import b; shared static this() { } ---- module b; import a, core.atomic : cas; shared bool initialized; static this() { if (!cas(&initialized, false, true)) return; ... } ----
Jan 18 2012
On 12/16/11 1:41 PM, Jonathan M Davis wrote:
You could make core.time use property functions instead of the static immutable variables that it's using now for ticksPerSec and appOrigin, but doing that right would require introducing a mutex or synchronized block (which is really just a mutex under the hood anyway), and I'm loath to do that in time-related code. ticksPerSec gets used all over the place in TickDuration, and that could have a negative impact on performance for something that needs to be really fast (since it's used in stuff like StopWatch and benchmarking). On top of that, in order to maintain the current semantics, the property functions would have to be pure, which they can't be without doing some nasty casting to convince the compiler that stuff which isn't pure is actually pure. For std.datetime, the problem would be reduced if a class could be created in CTFE and still be around at runtime, but we can't do that yet, and it wouldn't completely solve the problem, since the shared static constructor related to LocalTime has to call tzset. So, some sort of runtime initialization must be done. And the instances for the singleton are not only immutable, but the functions for getting them are pure. So, once again, some nasty casting would be required to get it to work without breaking purity. And once again, we'd have to introduce a mutex. And for both core.time and std.datetime we're talking about a mutex that would be needed only briefly to ensure that we don't end up with two threads trying to initialize the variable at the same time. After that, it would just be impeding performance for no value. They're classic situations for static constructors - initializing static immutable variables - and really, they _should_ be using static constructors. If we have to get rid of them, it's to get around other problems in the language or compiler instead of fixing those problems.
So, on some level, that seems like a failure on the part of the language and the compiler. If we _have_ to find a workaround, then we have to find a workaround, but I find the need to be distasteful to say the least. I previously tried to get rid of the static constructors in std.datetime and couldn't precisely because they're needed unless you play major casting games to get around immutable and pure. If we play nice, it's impossible to get rid of the static constructors in std.datetime. It probably is possible if we do nasty casting, but (much as I hate to use the word) it seems like this is a hack to get around the fact that the compiler isn't dealing with static constructors as well as we'd like. I'd _really_ like to see this fixed at the compiler level.

I understand and empathize with the sentiment, and I agree with most of the technical points at face value, save for a few details. But there are other things at stake. Consider scope. Many arguments applicable to application code are not quite fit for the standard library. The stdlib is where the compiler innards, the runtime innards, and the OS innards all meet, and the role of the stdlib is to provide nice abstractions to client code. Inside the stdlib it's entirely expected to find things like __traits most nobody heard of, casts, and other things that would be normally shunned in application code. I'd be more worried if there was no possibility to do what we need to do. The standard library is not a place to play it nice. We can't afford to say "well yeah everyone's binary is bloated and slower to start but we didn't like the cast that would have taken care of that". As another matter, there is value in minimizing compulsive work during library startup.
Consider for example this code in std.datetime:

shared static this() { tzset(); _localTime = new immutable(LocalTime)(); }

This summons the garbage collector right off the bat, thus wiping off anyone's chance of compiling and linking without a GC - as many people seem to want to do. And that happens not only to programs that import and use std.datetime, but to programs using any part of the standard library that transitively imports std.datetime, even for the most innocuous uses, and even if they never, ever use _localTime! That one line essentially locks out 75% of the standard library to anyone wishing to ever avoid using the GC.

> And honestly, I think that a far worse problem with static constructors is circular dependencies. _That_ is something that needs to be addressed with regards to static constructors. In general at this point, it's looking like static constructors are turning out to be a bit of a failure on some level, given the issues that we're having because of them, and I think that we should fix the language and/or compiler so that they _aren't_ a failure.

Here I totally disagree. The design is sound. The issues discussed here are entirely implementation-detail artifacts.

Andrei
Dec 16 2011
On Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:On 12/16/11 1:41 PM, Jonathan M Davis wrote: I understand and empathize with the sentiment, and I agree with most of the technical points at face value, save for a few details. But there are other things at stake. Consider scope. Many arguments applicable to application code are not quite fit for the standard library. The stdlib is the connection between the compiler innards, the runtime innards, and the OS innards all meet, and the role of the stdlib is to provide nice abstractions to client code. Inside the stdlib it's entirely expected to find things like __traits most nobody heard of, casts, and other things that would be normally shunned in application code. I'd be more worried if there was no possibility to do what we need to do. The standard library is not a place to play it nice. We can't afford to say "well yeah everyone's binary is bloated and slower to start but we didn't like the cast that would have taken care of that".I'm not completely against this precisely because of this, but at the same time, it strikes me as completely ridiculous to have to resort to some nasty casting simply to reduce the binary size of the base executable. I'd much rather see the compiler improved such that this isn't necessary.As another matter, there is value in minimizing compulsive work during library startup. Consider for example this code in std.datetime: shared static this() { tzset(); _localTime = new immutable(LocalTime)(); } This summons the garbage collector right off the bat, thus wiping off anyone's chance of compiling and linking without a GC - as many people seem to want to do. And that happens not to programs that import and use std.datetime, but to program using any part of the standard library that transitively imports std.datetime, even for the most innocuous uses, and even if they never, ever use _localtime! 
That one line essentially locks out 75% of the standard library to anyone wishing to ever avoid using the GC.

This, on the other hand, is of much greater concern, and is a much better argument for using the ugly casting necessary to get rid of the static constructors, even if the compiler did a fantastic job at cutting out the extra cruft in the binary - though as far as the GC goes, it might not be an issue once CTFE is good enough to create classes at compile time that still exist at runtime. Unfortunately, the necessity of tzset would remain however.

As far as the binary size goes, I completely agree that it's an implementation issue, but I definitely think that the issue with circular dependencies is a design issue which needs to be addressed. The basics of static constructors wouldn't have to change drastically, but there should at least be a way to indicate to the compiler that there is not actually a circular dependency. I don't think that I have ever seen druntime blow up on a circular dependency where there was actually a circular dependency. It's just that the compiler (or druntime or both) isn't smart enough to determine whether the static constructors _actually_ create a circular dependency. It has no way of determining which module's static constructors should be called first and gives up. We need a way to give it that information so that it can order them when they aren't actually interdependent. _That_ is the design flaw that I see in static constructors, and it's one of the most annoying issues in the language IMHO (which arguably just goes to show how good D is in general, I suppose).

- Jonathan M Davis

And honestly, I think that a far worse problem with static constructors is circular dependencies. _That_ is something that needs to be addressed with regards to static constructors.
In general at this point, it's looking like static constructors are turning out to be a bit of a failure on some level, given the issues that we're having because of them, and I think that we should fix the language and/or compiler so that they _aren't_ a failure.Here I totally disagree. The design is sound. The issues discussed here are entirely detail implementation artifacts.
Dec 16 2011
On Fri, 16 Dec 2011 15:58:28 -0500, Jonathan M Davis <jmdavisProg gmx.com> wrote:On Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:This can be solved with malloc and emplace -SteveAs another matter, there is value in minimizing compulsive work during library startup. Consider for example this code in std.datetime: shared static this() { tzset(); _localTime = new immutable(LocalTime)(); } This summons the garbage collector right off the bat, thus wiping off anyone's chance of compiling and linking without a GC - as many people seem to want to do. And that happens not to programs that import and use std.datetime, but to program using any part of the standard library that transitively imports std.datetime, even for the most innocuous uses, and even if they never, ever use _localtime! That one line essentially locks out 75% of the standard library to anyone wishing to ever avoid using the GC.This, on the other hand, is of much greater concern, and is a much better argument for using the ugly casting necessary to get rid of the static constructors, even if the compiler did a fanastic job at cutting out the extra cruft in the binary - though as far as the GC goes, it might not be an issue once CTFE is good enough to create classes at compile time that still exist at runtime. Unfortunately, the necessity of tzset would remain however.
Dec 16 2011
On 12/16/11 3:43 PM, Steven Schveighoffer wrote:On Fri, 16 Dec 2011 15:58:28 -0500, Jonathan M Davis <jmdavisProg gmx.com> wrote:Sure you meant static ubyte[__traits(classInstanceSize, T)] and emplace :o). AndreiOn Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:This can be solved with malloc and emplaceAs another matter, there is value in minimizing compulsive work during library startup. Consider for example this code in std.datetime: shared static this() { tzset(); _localTime = new immutable(LocalTime)(); } This summons the garbage collector right off the bat, thus wiping off anyone's chance of compiling and linking without a GC - as many people seem to want to do. And that happens not to programs that import and use std.datetime, but to program using any part of the standard library that transitively imports std.datetime, even for the most innocuous uses, and even if they never, ever use _localtime! That one line essentially locks out 75% of the standard library to anyone wishing to ever avoid using the GC.This, on the other hand, is of much greater concern, and is a much better argument for using the ugly casting necessary to get rid of the static constructors, even if the compiler did a fanastic job at cutting out the extra cruft in the binary - though as far as the GC goes, it might not be an issue once CTFE is good enough to create classes at compile time that still exist at runtime. Unfortunately, the necessity of tzset would remain however.
Dec 16 2011
On Fri, 16 Dec 2011 16:48:18 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 12/16/11 3:43 PM, Steven Schveighoffer wrote:That works too! -SteveOn Fri, 16 Dec 2011 15:58:28 -0500, Jonathan M Davis <jmdavisProg gmx.com> wrote:Sure you meant static ubyte[__traits(classInstanceSize, T)] and emplace :o).On Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:This can be solved with malloc and emplaceAs another matter, there is value in minimizing compulsive work during library startup. Consider for example this code in std.datetime: shared static this() { tzset(); _localTime = new immutable(LocalTime)(); } This summons the garbage collector right off the bat, thus wiping off anyone's chance of compiling and linking without a GC - as many people seem to want to do. And that happens not to programs that import and use std.datetime, but to program using any part of the standard library that transitively imports std.datetime, even for the most innocuous uses, and even if they never, ever use _localtime! That one line essentially locks out 75% of the standard library to anyone wishing to ever avoid using the GC.This, on the other hand, is of much greater concern, and is a much better argument for using the ugly casting necessary to get rid of the static constructors, even if the compiler did a fanastic job at cutting out the extra cruft in the binary - though as far as the GC goes, it might not be an issue once CTFE is good enough to create classes at compile time that still exist at runtime. Unfortunately, the necessity of tzset would remain however.
Dec 16 2011
On Dec 16, 2011, at 1:48 PM, Andrei Alexandrescu wrote:On 12/16/11 3:43 PM, Steven Schveighoffer wrote:Don't forget the 16 byte alignment :-)This can be solved with malloc and emplaceSure you meant static ubyte[__traits(classInstanceSize, T)] and emplace :o).
Dec 16 2011
On 12/17/2011 12:11 AM, Sean Kelly wrote:On Dec 16, 2011, at 1:48 PM, Andrei Alexandrescu wrote:Which is currently relatively easy: http://d.puremagic.com/issues/show_bug.cgi?id=6635On 12/16/11 3:43 PM, Steven Schveighoffer wrote:Don't forget the 16 byte alignment :-)This can be solved with malloc and emplaceSure you meant static ubyte[__traits(classInstanceSize, T)] and emplace :o).
Dec 16 2011
Sean Kelly:On Dec 16, 2011, at 1:48 PM, Andrei Alexandrescu wrote:Is it possible to support this in D2/D3? align(16) static ubyte[__traits(classInstanceSize, T)] _localTime; There are some situations I'd like a static array to be aligned to 16 bytes. Bye, bearophileSure you meant static ubyte[__traits(classInstanceSize, T)] and emplace :o).Don't forget the 16 byte alignment :-)
Dec 16 2011
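The `malloc`/static-buffer plus `emplace` suggestion from Steve, Andrei and Sean can be sketched as below. `Zone` is a hypothetical stand-in for a class such as `LocalTime`. The sketch assumes a current compiler that honors `align(16)` on static data, which is exactly what bearophile is asking about, so treat that part as an assumption rather than a guarantee for every toolchain.

```d
import std.conv : emplace;
import std.stdio : writeln;

// Hypothetical stand-in for a time zone class.
final class Zone
{
    int offsetMinutes;
    this(int offsetMinutes) { this.offsetMinutes = offsetMinutes; }
}

// Static storage sized for exactly one instance; 16-byte alignment per Sean's remark.
align(16) __gshared ubyte[__traits(classInstanceSize, Zone)] zoneStorage;
__gshared Zone zone;

shared static this()
{
    // Construct the object in place: no GC allocation for the instance itself.
    zone = emplace!Zone(zoneStorage[], 60);
}

void main()
{
    assert(zone.offsetMinutes == 60);
    // The object lives inside the static buffer, not on the GC heap.
    assert(cast(void*) zone is cast(void*) zoneStorage.ptr);
    writeln("instance constructed in static storage");
}
```

The object is never finalized or freed here, which is usually acceptable for a process-lifetime singleton but would need care for anything that owns resources.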
On 12/16/11 2:58 PM, Jonathan M Davis wrote:
> Unfortunately, the necessity of tzset would remain however.

Why? From http://pubs.opengroup.org/onlinepubs/007904875/functions/tzset.html:

"The tzset() function shall use the value of the environment variable TZ to set time conversion information used by ctime(), localtime(), mktime(), and strftime(). If TZ is absent from the environment, implementation-defined default timezone information shall be used."

I'd expect a good standard library implementation for D would call tzset() once per process instance, lazily, inside the wrapper functions for the four functions above. Alternatively, people could call the stdc.* versions and expect tzset() to _not_ have been called. That strikes the right balance between convenience, flexibility, and efficiency.

Andrei
Dec 16 2011
On Friday, December 16, 2011 16:58:51 Andrei Alexandrescu wrote:On 12/16/11 2:58 PM, Jonathan M Davis wrote:I mean that if CTFE was advanced enough that I could do immutable _localTime = new LocalTime(); then I could eliminate the shared static constructor for UTC completely, but the tzset for LocalTime would still be required. It _should_ be run once per process, and it's currently in a shared static constructor, so that's what it does. It's just not currently lazy. Regardless, my point was that even if CTFE were that advanced, the static constructor would still be required. If it's changed so that it's lazily loaded, then it can be moved out of the static constructor, but the CTFE solution wouldn't be enough. I'll look at what it would take to get rid of the static constructors and make the singletons load lazily, but it will require subverting the type system, since it's going to have to break both immutable and pure to be loaded lazily. - Jonathan M DavisUnfortunately, the necessity of tzset would remain however.Why? From http://pubs.opengroup.org/onlinepubs/007904875/functions/tzset.html: "The tzset() function shall use the value of the environment variable TZ to set time conversion information used by ctime(), localtime(), mktime(), and strftime(). If TZ is absent from the environment, implementation-defined default timezone information shall be used." I'd expect a good standard library implementation for D would call tzset() once per process instance, lazily, inside the wrapper functions for the four functions above. Alternatively, people could call the stdc.* versions and expect tzet() to _not_ having been called. That strikes the right balance between convenience, flexibility, and efficiency.
Dec 16 2011
On 12/16/2011 3:54 PM, Jonathan M Davis wrote:I'll look at what it would take to get rid of the static constructors and make the singletons load lazily, but it will require subverting the type system, since it's going to have to break both immutable and pure to be loaded lazily.Sure, but having a way to tell the compiler "assume this constructor does not have any dependencies" also subverts the type system.
Dec 16 2011
On 12/16/11 5:54 PM, Jonathan M Davis wrote:On Friday, December 16, 2011 16:58:51 Andrei Alexandrescu wrote:"The tzset() function shall use the value of the environment variable TZOn 12/16/11 2:58 PM, Jonathan M Davis wrote:Unfortunately, the necessity of tzset would remain however.Why? From http://pubs.opengroup.org/onlinepubs/007904875/functions/tzset.html:I think it's all a matter of terminology. Calling tzset during module initialization is not "required", doing it otherwise is not "impossible", and the standard library does not have to always "play it nice". :o) One more thing - could you take the time to explain why you believe calling tzset() compulsively is needed? Thanks, Andreito set time conversion information used by ctime(), localtime(), mktime(), and strftime(). If TZ is absent from the environment, implementation-defined default timezone information shall be used." I'd expect a good standard library implementation for D would call tzset() once per process instance, lazily, inside the wrapper functions for the four functions above. Alternatively, people could call the stdc.* versions and expect tzet() to _not_ having been called. That strikes the right balance between convenience, flexibility, and efficiency.I mean that if CTFE was advanced enough that I could do immutable _localTime = new LocalTime(); then I could eliminate the shared static constructor for UTC completely, but the tzset for LocalTime would still be required. It _should_ be run once per process, and it's currently in a shared static constructor, so that's what it does. It's just not currently lazy. Regardless, my point was that even if CTFE were that advanced, the static constructor would still be required. If it's changed so that it's lazily loaded, then it can be moved out of the static constructor, but the CTFE solution wouldn't be enough. 
I'll look at what it would take to get rid of the static constructors and make the singletons load lazily, but it will require subverting the type system, since it's going to have to break both immutable and pure to be loaded lazily.
Dec 16 2011
On Friday, December 16, 2011 18:47:02 Andrei Alexandrescu wrote:One more thing - could you take the time to explain why you believe calling tzset() compulsively is needed?Some of the C stuff that LocalTime uses requires it. If LocalTime is lazily initialized, then it can be called then though rather than in the shared static constructor. - Jonathan M Davis
Dec 16 2011
On 12/16/11 6:53 PM, Jonathan M Davis wrote:On Friday, December 16, 2011 18:47:02 Andrei Alexandrescu wrote:Thanks. Sounds like we have a plan! AndreiOne more thing - could you take the time to explain why you believe calling tzset() compulsively is needed?Some of the C stuff that LocalTime uses requires it. If LocalTime is lazily initialized, then it can be called then though rather than in the shared static constructor. - Jonathan M Davis
Dec 16 2011
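The plan agreed on above (call `tzset` and create the singleton on first use instead of in a shared static constructor) might look roughly like the following. `LocalZone`, `createZone`, `localZone` and `initCount` are hypothetical names. The sketch uses `std.concurrency.initOnce` from current Phobos, which is an assumption about the toolchain and not something that was available to the thread's participants. It also sidesteps the `immutable` and `pure` requirements Jonathan mentions, so it illustrates only the laziness, not the casting problem.

```d
import std.concurrency : initOnce;
import std.stdio : writeln;

// Hypothetical stand-in for std.datetime's LocalTime.
final class LocalZone
{
    private this() {}
}

__gshared int initCount;             // demo only: how often the initializer ran

private LocalZone createZone()
{
    version (Posix)
    {
        import core.sys.posix.time : tzset;
        tzset();                     // runs on first use, not at program start
    }
    ++initCount;
    return new LocalZone;
}

LocalZone localZone()
{
    __gshared LocalZone instance;
    // initOnce evaluates its lazy argument at most once and locks internally.
    return initOnce!instance(createZone());
}

void main()
{
    auto a = localZone();
    auto b = localZone();
    assert(a is b);                  // same object on every call
    assert(initCount == 1);          // initializer ran exactly once
    writeln("lazy singleton initialized once");
}
```

A program that never calls `localZone()` pays for neither the `tzset` call nor the allocation, which is the property Andrei is asking for.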
On Dec 16, 2011, at 12:44 PM, Andrei Alexandrescu wrote:
> Consider scope. Many arguments applicable to application code are not quite fit for the standard library. The stdlib is the connection between the compiler innards, the runtime innards, and the OS innards all meet, and the role of the stdlib is to provide nice abstractions to client code. Inside the stdlib it's entirely expected to find things like __traits most nobody heard of, casts, and other things that would be normally shunned in application code. I'd be more worried if there was no possibility to do what we need to do. The standard library is not a place to play it nice. We can't afford to say "well yeah everyone's binary is bloated and slower to start but we didn't like the cast that would have taken care of that".

I think this is a reasonable assertion about druntime, but the standard library itself should require very little black magic, though the use of obscure features (like __traits) could be commonplace.
Dec 16 2011
On 12/16/11 4:55 PM, Sean Kelly wrote:On Dec 16, 2011, at 12:44 PM, Andrei Alexandrescu wrote:"Very little" sounds almost enough :o).Consider scope. Many arguments applicable to application code are not quite fit for the standard library. The stdlib is the connection between the compiler innards, the runtime innards, and the OS innards all meet, and the role of the stdlib is to provide nice abstractions to client code. Inside the stdlib it's entirely expected to find things like __traits most nobody heard of, casts, and other things that would be normally shunned in application code. I'd be more worried if there was no possibility to do what we need to do. The standard library is not a place to play it nice. We can't afford to say "well yeah everyone's binary is bloated and slower to start but we didn't like the cast that would have taken care of that".I think this is a reasonable assertion about druntime, but the standard library itself should require very little black magic,though the use of obscure features (like __traits) could be commonplace.Absolutely. Andrei
Dec 16 2011
A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.
Dec 16 2011
On 12/16/11 3:38 PM, Trass3r wrote:A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime. Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed. One interesting fact is that a lot of issues that I tended to take non-critically ("templates cause bloat", "intermodule dependencies cause bloat", "static linking creates large programs") looked a whole lot differently when I looked closer at causes and effects. Andrei
Dec 16 2011
Am 16.12.2011, 22:45 Uhr, schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:On 12/16/11 3:38 PM, Trass3r wrote:Yep, the 30 modules is a measure I took before that commit.A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.Also by pulling in I just meant the imports. But the planned lazy semantic analysis should improve the situation.
Dec 16 2011
On 12/16/2011 10:53 PM, Trass3r wrote:Am 16.12.2011, 22:45 Uhr, schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:I think it is already lazy? --- module a; void foo(){ imanundefinedsymbolandcauseacompileerror(); } --- --- module b; import a; void main(){ foo(); } ---On 12/16/11 3:38 PM, Trass3r wrote:Yep, the 30 modules is a measure I took before that commit.A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.Also by pulling in I just meant the imports. But the planned lazy semantic analysis should improve the situation.
Dec 16 2011
Andrei Alexandrescu Wrote:On 12/16/11 3:38 PM, Trass3r wrote:http://wiki.freepascal.org/Size_Matters Otherwise a great language that never did manage to remove "bloated" factor from its name. Many people stopped using it because of that, including me. I guess people do not like bloat when programming systems stuff.A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime. Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed. One interesting fact is that a lot of issues that I tended to take non-critically ("templates cause bloat", "intermodule dependencies cause bloat", "static linking creates large programs") looked a whole lot differently when I looked closer at causes and effects. Andrei
Dec 16 2011
On Friday, 16 December 2011 at 21:45:43 UTC, Andrei Alexandrescu wrote:
> Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.

This sounds fantastic.

> One interesting fact is that a lot of issues that I tended to take non-critically ("templates cause bloat", "intermodule dependencies cause bloat", "static linking creates large programs") looked a whole lot differently when I looked closer at causes and effects.

I'd be careful not to overgeneralize from this though; templates do have the potential to bloat things up, etc. Though static linking has and always shall rock.

(For bloated templates, I had a monster of one in web.d: refactoring some of it into regular functions shrank the binary by about three megabytes. Shaved two seconds off the compile time too! Note this binary is my work project, so your results may vary with my library. It was basically inlining several kilobytes of the same stuff into hundreds of different functions... 10 kb * 300 functions = lots of code.)
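The kind of refactoring described in the parenthetical above can be sketched as follows. This is only an illustration, not the actual web.d code: `renderField` and `render` are hypothetical names, and the point is simply that the type-independent work lives in one ordinary function while the template stays a thin wrapper.

```d
import std.conv : to;

// Non-template core (hypothetical): compiled once and shared by
// every instantiation of the wrapper below.
private string renderField(string name, string value)
{
    return name ~ "=" ~ value ~ ";";
}

// Thin template wrapper (hypothetical): only the type-dependent
// conversion to string is instantiated per T.
string render(T)(string name, T value)
{
    return renderField(name, to!string(value));
}

void main()
{
    assert(render("count", 3) == "count=3;");
    assert(render("on", true) == "on=true;");
}
```

How much this saves depends on how large the shared part is and how many distinct instantiations exist, so the numbers in the post should not be expected to carry over to other code.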
Dec 16 2011
On 12/16/2011 1:45 PM, Andrei Alexandrescu wrote:
> On 12/16/11 3:38 PM, Trass3r wrote:
>> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.
> In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.

Another thing is to avoid using classes for things where one does not expect it to ever be derived from. Use a struct instead, as referencing parts of the struct implementation will not pull in the whole of it, nor is there a vtbl[] to pull it all in.

For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.
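A minimal sketch of the struct-as-namespace suggestion, assuming nothing about the real std.datetime code: `Units` and its functions are made-up names used only to show the shape.

```d
// Hypothetical namespace-like type; not the actual std.datetime.Clock.
// A struct has no vtbl[] and no class TypeInfo tying its members together.
struct Units
{
    @disable this(); // the type is only a namespace, so forbid default construction

    static int secondsPerMinute() { return 60; }

    static int toSeconds(int minutes)
    {
        return minutes * secondsPerMinute();
    }
}

void main()
{
    // Call sites look the same as with a "final class" namespace.
    assert(Units.toSeconds(2) == 120);
}
```

Call sites such as `Units.toSeconds(2)` are unchanged compared with a class-based namespace, which is why the switch is mostly a library-internal matter.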
Dec 16 2011
On Fri, 16 Dec 2011 17:55:47 -0500, Walter Bright <newshound2 digitalmars.com> wrote:
> On 12/16/2011 1:45 PM, Andrei Alexandrescu wrote:
>> On 12/16/11 3:38 PM, Trass3r wrote:
>>> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.
>> In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.
> Another thing is to avoid using classes for things where one does not expect it to ever be derived from. Use a struct instead, as referencing parts of the struct implementation will not pull in the whole of it, nor is there a vtbl[] to pull it all in.
> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.

Although I don't disagree with you that it should be a struct and not a class, does it have anything in its vtbl anyways if it's final?

I'm just trying to understand what gets pulled in when you import a module with static ctors...

-Steve
Dec 19 2011
On 12/19/2011 7:17 AM, Steven Schveighoffer wrote:
> On Fri, 16 Dec 2011 17:55:47 -0500, Walter Bright <newshound2 digitalmars.com> wrote:
>> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.
> Although I don't disagree with you that it should be a struct and not a class, does it have anything in its vtbl anyways if it's final?

Yes. The pointers to Object's functions, and a pointer to the TypeInfo for that class.

> I'm just trying to understand what gets pulled in when you import a module with static ctors...

Write some trivial code snippets, compile them, and take a look at the object file with obj2asm.
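Besides inspecting object files, the answer above can also be checked from inside a program. The sketch below assumes the `vtbl` field that druntime's `TypeInfo_Class` exposes; `ClassNamespace` and `StructNamespace` are hypothetical types, and the exact number of vtbl slots is deliberately not asserted because it depends on the runtime version.

```d
// Hypothetical namespace-style types used only for comparison.
final class ClassNamespace
{
    static int answer() { return 42; }
}

struct StructNamespace
{
    static int answer() { return 42; }
}

void main()
{
    // Even a final class with no virtual functions of its own carries
    // the slots it inherits from Object.
    assert(typeid(Object).vtbl.length > 0);
    assert(typeid(ClassNamespace).vtbl.length >= typeid(Object).vtbl.length);

    // The struct version offers the same call syntax without any vtbl.
    assert(ClassNamespace.answer() == StructNamespace.answer());
}
```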
Dec 19 2011
On Mon, 19 Dec 2011 13:09:42 -0500, Walter Bright <newshound2 digitalmars.com> wrote:
> On 12/19/2011 7:17 AM, Steven Schveighoffer wrote:
>> On Fri, 16 Dec 2011 17:55:47 -0500, Walter Bright <newshound2 digitalmars.com> wrote:
>>> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.
>> Although I don't disagree with you that it should be a struct and not a class, does it have anything in its vtbl anyways if it's final?
> Yes. The pointers to Object's functions, and a pointer to the TypeInfo for that class.

Well pointers to Object's functions shouldn't add any bloat. The TypeInfo may, but that shouldn't pull in any real code from the module, right?

>> I'm just trying to understand what gets pulled in when you import a module with static ctors...
> Write some trivial code snippets, compile them, and take a look at the object file with obj2asm.

I'll rephrase -- I'm trying to understand what's *supposed* to happen :) Trusting that the compiler is doing it right isn't always correct. Though it probably is in this case.

-Steve
Dec 19 2011
On 12/16/2011 2:55 PM, Walter Bright wrote:
> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.

Or perhaps it should be in its own module.
Dec 19 2011
On 19.12.2011 at 19:08, Walter Bright <newshound2 digitalmars.com> wrote:
> On 12/16/2011 2:55 PM, Walter Bright wrote:
>> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.
> Or perhaps it should be in its own module.

When I first saw it I thought "That's how _Java_ goes about free functions: Make it a class." :)
Dec 20 2011
On 12/20/11 2:58 PM, Marco Leise wrote:
> On 19.12.2011 at 19:08, Walter Bright <newshound2 digitalmars.com> wrote:
>> On 12/16/2011 2:55 PM, Walter Bright wrote:
>>> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.
>> Or perhaps it should be in its own module.
> When I first saw it I thought "That's how _Java_ goes about free functions: Make it a class." :)

Same here. If I had my way I'd rethink the name of those functions. Having a cutesy prefix "Clock." is hardly justifiable.

Andrei
Dec 20 2011
On Tuesday, December 20, 2011 17:32:53 Andrei Alexandrescu wrote:
> On 12/20/11 2:58 PM, Marco Leise wrote:
>> On 19.12.2011 at 19:08, Walter Bright <newshound2 digitalmars.com> wrote:
>>> On 12/16/2011 2:55 PM, Walter Bright wrote:
>>>> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.
>>> Or perhaps it should be in its own module.
>> When I first saw it I thought "That's how _Java_ goes about free functions: Make it a class." :)
> Same here. If I had my way I'd rethink the name of those functions. Having a cutesy prefix "Clock." is hardly justifiable.

It's not the only place in Phobos which uses a class as a namespace. I believe that both std.process and std.windows.registry are doing the same thing. In this case, it nicely groups all of the functions that are grabbing the time in one form or another. They're all effectively grabbing the time from the system clock, so they're grouped on Clock.

- Jonathan M Davis
Dec 20 2011
On Wednesday, 21 December 2011 at 02:10:30 UTC, Jonathan M Davis wrote:
> On Tuesday, December 20, 2011 17:32:53 Andrei Alexandrescu wrote:
>> On 12/20/11 2:58 PM, Marco Leise wrote:
>>> On 19.12.2011 at 19:08, Walter Bright <newshound2 digitalmars.com> wrote:
>>>> On 12/16/2011 2:55 PM, Walter Bright wrote:
>>>>> For example, in std.datetime there's "final class Clock". It inherits nothing, and nothing can be derived from it. The comments for it say it is merely a namespace. It should be a struct.
>>>> Or perhaps it should be in its own module.
>>> When I first saw it I thought "That's how _Java_ goes about free functions: Make it a class." :)
>> Same here. If I had my way I'd rethink the name of those functions. Having a cutesy prefix "Clock." is hardly justifiable.
> It's not the only place in Phobos which uses a class as a namespace. I believe that both std.process and std.windows.registry are doing the same thing. In this case, it nicely group all of the functions that are grabbing the time in one form or another. They're all effectively grabbing the time from the system clock, so they're grouped on Clock.
> - Jonathan M Davis

Sounds like the perfect candidate for its own module.
Dec 20 2011
On Wednesday, December 21, 2011 06:18:59 Jakob Ovrum wrote:
> On Wednesday, 21 December 2011 at 02:10:30 UTC, Jonathan M Davis wrote:
>> It's not the only place in Phobos which uses a class as a namespace. I believe that both std.process and std.windows.registry are doing the same thing. In this case, it nicely group all of the functions that are grabbing the time in one form or another. They're all effectively grabbing the time from the system clock, so they're grouped on Clock.
>> - Jonathan M Davis
> Sounds like the perfect candidate for its own module.

Not out of the question, I suppose, but it would make an awfully small module and would inevitably make it that much harder for people to figure out how to get the current time.

- Jonathan M Davis
Dec 20 2011
On Wed, 21 Dec 2011 07:34:30 +0200, Jonathan M Davis <jmdavisProg gmx.com> wrote:
> On Wednesday, December 21, 2011 06:18:59 Jakob Ovrum wrote:
>> On Wednesday, 21 December 2011 at 02:10:30 UTC, Jonathan M Davis wrote:
>>> It's not the only place in Phobos which uses a class as a namespace. I believe that both std.process and std.windows.registry are doing the same thing. In this case, it nicely group all of the functions that are grabbing the time in one form or another. They're all effectively grabbing the time from the system clock, so they're grouped on Clock.
>>> - Jonathan M Davis
>> Sounds like the perfect candidate for its own module.
> Not out of the question, I suppose, but it would make an awfully small module and would inevitably make it that much harder for people to figure out how to get the current time.
> - Jonathan M Davis

Supporting module nesting in single file wouldn't hurt, would it?

    module main;
    module nested { }
Dec 21 2011
On 21. 12. 2011 14:22, so wrote:
> Supporting module nesting in single file wouldn't hurt, would it?
>
>     module main;
>     module nested { }

Kind of...

    import std.stdio; // needed for writeln and readln

    // A template with no parameters acts as a named scope.
    template MyNamespaceImpl ()
    {
        int i;
    }

    alias MyNamespaceImpl!() MyNamespace;

    void main ()
    {
        MyNamespace.i = 1;
        with (MyNamespace)
        {
            i = 2;
        }
        writeln(MyNamespace.i);
        readln();
    }
Dec 21 2011
On Tuesday, December 20, 2011 21:34:30 Jonathan M Davis wrote:
> On Wednesday, December 21, 2011 06:18:59 Jakob Ovrum wrote:
>> On Wednesday, 21 December 2011 at 02:10:30 UTC, Jonathan M Davis wrote:
>>> It's not the only place in Phobos which uses a class as a namespace. I believe that both std.process and std.windows.registry are doing the same thing. In this case, it nicely group all of the functions that are grabbing the time in one form or another. They're all effectively grabbing the time from the system clock, so they're grouped on Clock.
>>> - Jonathan M Davis
>> Sounds like the perfect candidate for its own module.
> Not out of the question, I suppose, but it would make an awfully small module and would inevitably make it that much harder for people to figure out how to get the current time.

Not to mention, I quite like the effect that you get with it as a class, since it's explicit that it's coming from the clock, whereas if it were a module, that wouldn't be the case. You get the same effect with std.process' Environment. When you're calling functions on it, it's explicit that you're getting information from and affecting the environment. In a way, it's like a singleton, but there's nothing to instantiate.

- Jonathan M Davis
Dec 20 2011
On Dec 16, 2011, at 1:45 PM, Andrei Alexandrescu wrote:
> On 12/16/11 3:38 PM, Trass3r wrote:
>> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.
> In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.

Once upon a time, a minimal D app was roughly 65K. TypeInfo has ballooned a lot since then however. It's worth considering whether you're writing a Windows or Posix app as well, since the Posix headers are far more extensive (and thus may result in far more ModuleInfo instances).
Dec 16 2011
On 16/12/2011 22:45, Andrei Alexandrescu wrote:
> On 12/16/11 3:38 PM, Trass3r wrote:
>> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules! And std.stdio is supposed to be just a simple wrapper around C FILE.
> In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.
> Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.
> One interesting fact is that a lot of issues that I tended to take non-critically ("templates cause bloat", "intermodule dependencies cause bloat", "static linking creates large programs") looked a whole lot differently when I looked closer at causes and effects.
> Andrei

Fantastic! :)
Dec 17 2011
On Dec 16, 2011, at 1:38 PM, Trass3r wrote:
> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules!

This was one of the major motivations for separating druntime from phobos. The last thing anyone wants is for something in runtime to print to the console and end up pulling in 80% of the standard library as a result.
Dec 16 2011
On 12/16/11 5:08 PM, Sean Kelly wrote:
> On Dec 16, 2011, at 1:38 PM, Trass3r wrote:
>> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules!
> This was one of the major motivations for separating druntime from phobos. The last thing anyone wants is for something in runtime to print to the console and end up pulling in 80% of the standard library as a result.

Well, right now druntime itself may have become the interdependency knot it once wanted to shun :o). Commenting out all static cdtors from druntime only reduced the code size from 218KB to 200KB for a do-nothing program, so most of druntime is compulsively linked and loaded. I think we can improve things a bit there.

Andrei
Dec 16 2011
On Dec 16, 2011, at 3:16 PM, Andrei Alexandrescu wrote:
> On 12/16/11 5:08 PM, Sean Kelly wrote:
>> On Dec 16, 2011, at 1:38 PM, Trass3r wrote:
>>> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules!
>> This was one of the major motivations for separating druntime from phobos. The last thing anyone wants is for something in runtime to print to the console and end up pulling in 80% of the standard library as a result.
> Well, right now druntime itself may have become the interdependency knot it once wanted to shun :o).

The first place to look would be rt/. I know there's some tool that generates dependency graphs for D. Does Descent do that?
Dec 16 2011
On 17.12.2011 00:34, Sean Kelly wrote:
> On Dec 16, 2011, at 3:16 PM, Andrei Alexandrescu wrote:
>> On 12/16/11 5:08 PM, Sean Kelly wrote:
>>> On Dec 16, 2011, at 1:38 PM, Trass3r wrote:
>>>> A related issue is phobos being an intermodule dependency monster. A simple hello world pulls in almost 30 modules!
>>> This was one of the major motivations for separating druntime from phobos. The last thing anyone wants is for something in runtime to print to the console and end up pulling in 80% of the standard library as a result.
>> Well, right now druntime itself may have become the interdependency knot it once wanted to shun :o).
> The first place to look would be rt/. I know there's some tool that generates dependency graphs for D. Does Descent do that?

Maybe this is the tool you're thinking of: http://www.shfls.org/w/d/dimple/
Dec 16 2011
On 16/12/2011 18:29, Andrei Alexandrescu wrote:
> Here's a list of all files in std using static cdtors:
>
> std/__fileinit.d
> std/concurrency.d
> std/cpuid.d
> std/cstream.d
> std/datebase.d
> std/datetime.d
> std/encoding.d
> std/internal/math/biguintcore.d
> std/internal/math/biguintx86.d
> std/internal/processinit.d
> std/internal/windows/advapi32.d
> std/mmfile.d
> std/parallelism.d
> std/perf.d
> std/socket.d
> std/stdiobase.d
> std/uri.d

On a slightly related note: http://d.puremagic.com/issues/show_bug.cgi?id=5614

Basically, do the static constructors in __fileinit and mmfile need to exist on a (hypothetical) 64bit Windows build?
Dec 16 2011
On Dec 16, 2011, at 10:29 AM, Andrei Alexandrescu wrote:
> But in experiments it seemed like program size would increase in sudden amounts when certain modules were included. After much investigation we figured that the following fateful causal sequence happened:
>
> 1. Some modules define static constructors with "static this()" or "static shared this()", and/or static destructors.
>
> 2. These constructors/destructors are linked in automatically whenever a module is included.
>
> 3. Importing a module with a static constructor (or destructor) will generate its ModuleInfo structure, which contains static information about all module members. In particular, it keeps virtual table pointers for all classes defined inside the module.

What is gained from having class vtbls referenced by ModuleInfo? Could we put them elsewhere?
Dec 16 2011
On Fri, 16 Dec 2011 19:29:18 +0100, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Hello,
>
> Late last night Walter and I figured a few interesting tidbits of information. Allow me to give some context, discuss them, and sketch a few approaches for improving things.
>
> A while ago Walter wanted to enable function-level linking, i.e. only get the needed functions from a given (and presumably large) module. So he arranged things that a library contains many small object "files" (that actually are generated from a single .d file and never exist on disk, only inside the library file, which can be considered an archive like tar). Then the linker would only pick the used object "files" from the library and link those in. Unfortunately that didn't have nearly the expected impact - essentially the size of most binaries stayed the same. The mystery was unsolved, and Walter needed to move on to other things.
>
> One particularly annoying issue is that even programs that don't ostensibly use anything from an imported module may balloon inexplicably in size. Consider:
>
>     import std.path;
>     void main(){}
>
> This program, after stripping and all, has some 750KB in size. Removing the import line reduces the size to 218KB. That includes the runtime support, garbage collector, and such, and I'll consider it a baseline. (A similar but separate discussion could be focused on reducing the baseline size, but herein I'll consider it constant.)
>
> What we'd simply want is to be able to import stuff without blatantly paying for what we don't use. If a program imports std.path and uses no function from it, it should be as large as a program without the import. Furthermore, the increase should be incremental - using 2-3 functions from std.path should only increase the executable size by a little, not suddenly link in all code in that module.
>
> But in experiments it seemed like program size would increase in sudden amounts when certain modules were included. After much investigation we figured that the following fateful causal sequence happened:
>
> 1. Some modules define static constructors with "static this()" or "static shared this()", and/or static destructors.
>
> 2. These constructors/destructors are linked in automatically whenever a module is included.
>
> 3. Importing a module with a static constructor (or destructor) will generate its ModuleInfo structure, which contains static information about all module members. In particular, it keeps virtual table pointers for all classes defined inside the module.
>
> 4. That means generating ModuleInfo refers all virtual functions defined in that module, whether they're used or not.
>
> 5. The phenomenon is transitive, e.g. even if std.path has no static constructors but imports std.datetime which does, a ModuleInfo is generated for std.path too, in addition to the one for std.datetime. So now classes inside std.path (if any) will be all linked in.
>
> 6. It follows that a module that defines classes which in turn use other functions in other modules, and has static constructors (or includes other modules that do) will balloon the size of the executable suddenly.
>
> There are a few approaches that we can use to improve the state of affairs.
>
> A. On the library side, use static constructors and destructors sparingly inside druntime and std. We can use lazy initialization instead of compulsively initializing library internals. I think this is often a worthy thing to do in any case (dynamic libraries etc) because it only does work if and when work needs to be done at the small cost of a check upon each use.
>
> B. On the compiler side, we could use a similar lazy initialization trick to only refer class methods in the module if they're actually needed. I'm being vague here because I'm not sure what and how that can be done.
>
> Here's a list of all files in std using static cdtors:
>
> std/__fileinit.d
> std/concurrency.d
> std/cpuid.d
> std/cstream.d
> std/datebase.d
> std/datetime.d
> std/encoding.d
> std/internal/math/biguintcore.d
> std/internal/math/biguintx86.d
> std/internal/processinit.d
> std/internal/windows/advapi32.d
> std/mmfile.d
> std/parallelism.d
> std/perf.d
> std/socket.d
> std/stdiobase.d
> std/uri.d
>
> The majority of them don't do a lot of work and are not much used inside phobos, so they don't blow up the executable. The main one that could receive some attention is std.datetime. It has a few static ctors and a lot of classes. Essentially just importing std.datetime or any std module that transitively imports std.datetime (and there are many of them) ends up linking in most of Phobos and blows the size up from the 218KB baseline to 700KB.
>
> Jonathan, could I impose on you to replace all static cdtors in std.datetime with lazy initialization? I looked through it and it strikes me as a reasonably simple job, but I think you'd know better what to do than me.
>
> A similar effort could be conducted to reduce or eliminate static cdtors from druntime. I made the experiment of commenting them all, and that reduced the size of the baseline from 218KB to 200KB. This is a good amount, but not as dramatic as what we can get by working on std.datetime.
>
> Thanks,
>
> Andrei

We'd need the linker for any of this. Unreferenced symbols should be output using a kind of vague linkage (multiobj partly does this). I-reference-everything stuff like ModuleInfos should only create weak references. This implies that localClasses might contain only part of the actual module. People can use the designated export attribute to forcefully output unused symbols.
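Approach A from the quoted post (lazy initialization instead of a static constructor) can be sketched roughly as follows. This is an assumption-laden illustration, not code from std.datetime: `table`, `buildTable`, and the private variables are hypothetical names. The state is module-level and therefore thread-local by default in D, so each thread initializes its own copy on first use.

```d
// The eager version this replaces would look like:
//     int[] table_;
//     static this() { table_ = buildTable(); }
// which forces a ModuleInfo with a constructor into every importer.

private int[] table_;     // hypothetical cached data, thread-local by default
private bool tableReady_; // has this thread built the table yet?

// Hypothetical expensive setup that used to run in "static this()".
private int[] buildTable()
{
    auto t = new int[](16);
    foreach (i, ref e; t)
        e = cast(int)(i * i);
    return t;
}

// Accessor: does the work on first use, then pays only for one check.
int[] table()
{
    if (!tableReady_)
    {
        table_ = buildTable();
        tableReady_ = true;
    }
    return table_;
}

void main()
{
    assert(table()[3] == 9);
    assert(table() is table()); // later calls reuse the cached array
}
```

The trade-off is the one named in the post: a small check on every access in exchange for doing no work, and pulling in no module constructor, when the data is never used.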
Dec 16 2011
On Sat, 17 Dec 2011 07:09:50 +0100, Martin Nowak <dawg dawgfoto.de> wrote:
> On Fri, 16 Dec 2011 19:29:18 +0100, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> [...]
> We'd need the linker to do anything of this. Unreferenced symbols should be outputted using kind of vague linkage (multiobj partly does this). I-reference-everything stuff link ModuleInfos should only create weak references. This includes that localClasses might contain only part of the actual module. People can use the designated export attribute to forcefully output unused symbols.

More concretely, if we output weakly defined symbols (null) for whatever a ModuleInfo references, the linker should not open further object files to find a definition. But if another definition is linked in, it will replace the weak definition. The program would then need to skip the dummy symbols (null) at runtime.
Dec 16 2011
On 12/17/11 12:27 AM, Martin Nowak wrote:
>> We'd need the linker to do anything of this. Unreferenced symbols should be outputted using kind of vague linkage (multiobj partly does this). I-reference-everything stuff link ModuleInfos should only create weak references. This includes that localClasses
> More concrete if we'd output weak defined symbols (null) for what is referenced by a ModuleInfo then the linker should not open further object files to find a definition. But if another definition is linked in it will replace the weak definition. The program would then need to skip the dummy symbols (null) at runtime.

I think it would be awesome to exploit weak symbols.

Andrei
Dec 16 2011
16.12.2011 21:29, Andrei Alexandrescu пишет:Hello, Late last night Walter and I figured a few interesting tidbits of information. Allow me to give some context, discuss them, and sketch a few approaches for improving things. A while ago Walter wanted to enable function-level linking, i.e. only get the needed functions from a given (and presumably large) module. So he arranged things that a library contains many small object "files" (that actually are generated from a single .d file and never exist on disk, only inside the library file, which can be considered an archive like tar). Then the linker would only pick the used object "files" from the library and link those in. Unfortunately that didn't have nearly the expected impact - essentially the size of most binaries stayed the same. The mystery was unsolved, and Walter needed to move on to other things. One particularly annoying issue is that even programs that don't ostensibly use anything from an imported module may balloon inexplicably in size. Consider: import std.path; void main(){} This program, after stripping and all, has some 750KB in size. Removing the import line reduces the size to 218KB. That includes the runtime support, garbage collector, and such, and I'll consider it a baseline. (A similar but separate discussion could be focused on reducing the baseline size, but herein I'll consider it constant.) What we'd simply want is to be able to import stuff without blatantly paying for what we don't use. If a program imports std.path and uses no function from it, it should be as large as a program without the import. Furthermore, the increase should be incremental - using 2-3 functions from std.path should only increase the executable size by a little, not suddenly link in all code in that module. But in experiments it seemed like program size would increase in sudden amounts when certain modules were included. After much investigation we figured that the following fateful causal sequence happened: 1. 
Some modules define static constructors with "static this()" or "static shared this()", and/or static destructors. 2. These constructors/destructors are linked in automatically whenever a module is included. 3. Importing a module with a static constructor (or destructor) will generate its ModuleInfo structure, which contains static information about all module members. In particular, it keeps virtual table pointers for all classes defined inside the module. 4. That means generating ModuleInfo refers all virtual functions defined in that module, whether they're used or not. 5. The phenomenon is transitive, e.g. even if std.path has no static constructors but imports std.datetime which does, a ModuleInfo is generated for std.path too, in addition to the one for std.datetime. So now classes inside std.path (if any) will be all linked in. 6. It follows that a module that defines classes which in turn use other functions in other modules, and has static constructors (or includes other modules that do) will baloon the size of the executable suddenly. There are a few approaches that we can use to improve the state of affairs. A. On the library side, use static constructors and destructors sparingly inside druntime and std. We can use lazy initialization instead of compulsively initializing library internals. I think this is often a worthy thing to do in any case (dynamic libraries etc) because it only does work if and when work needs to be done at the small cost of a check upon each use. B. On the compiler side, we could use a similar lazy initialization trick to only refer class methods in the module if they're actually needed. I'm being vague here because I'm not sure what and how that can be done. 
> Here's a list of all files in std using static cdtors:
>
> std/__fileinit.d
> std/concurrency.d
> std/cpuid.d
> std/cstream.d
> std/datebase.d
> std/datetime.d
> std/encoding.d
> std/internal/math/biguintcore.d
> std/internal/math/biguintx86.d
> std/internal/processinit.d
> std/internal/windows/advapi32.d
> std/mmfile.d
> std/parallelism.d
> std/perf.d
> std/socket.d
> std/stdiobase.d
> std/uri.d
>
> The majority of them don't do a lot of work and are not much used inside phobos, so they don't blow up the executable. The main one that could receive some attention is std.datetime. It has a few static ctors and a lot of classes. Essentially just importing std.datetime or any std module that transitively imports std.datetime (and there are many of them) ends up linking in most of Phobos and blows the size up from the 218KB baseline to 700KB.
>
> Jonathan, could I impose on you to replace all static cdtors in std.datetime with lazy initialization? I looked through it and it strikes me as a reasonably simple job, but I think you'd know better what to do than me.
>
> A similar effort could be conducted to reduce or eliminate static cdtors from druntime. I made the experiment of commenting them all, and that reduced the size of the baseline from 218KB to 200KB. This is a good amount, but not as dramatic as what we can get by working on std.datetime.
>
> Thanks,
>
> Andrei

Really sorry, but it sounds silly to me. It's a minor problem. Does anyone really care about a 600 KiB (3.5x) size change in an empty program? Yes, someone does, but only if there are no other size increases in real programs.

Now dmd has at least a _two orders of magnitude_ file size increase. I posted that problem four months ago at "Building GtkD app on Win32 results in 111 MiB file mostly from zeroes".
An example of this bug is in this archive: http://deoma-cmd.ru/files/other/gtkD-1.5.1-size.7z
Built version (with *.exe and *.lib files): http://deoma-cmd.ru/files/other/gtkD-1.5.1-size-built.7z

Detailed description:
GtkD is built using a single object file (gtk-one-obj.lib) or separate object files, one per source file (gtk-sep-obj.lib). Then main.d, which imports gtk.Main, is built using those libraries. Then the zeroCount utility is built and run over the resulting files:

--------------------------------------------------
Now let's calculate zero bytes counts:
--------------------------------------------------
Zero bytes|     %|  Non-zero| Total bytes| File
   3628311| 21.56|  13202153|    16830464|gtk-one-obj.lib
   1953124| 15.98|  10272924|    12226048|gtk-sep-obj.lib
 127968798| 99.00|   1298430|   129267228|main-one-obj.exe
    743821| 37.51|   1239183|     1983004|main-sep-obj.exe
Done.

So we have to use a very slow per-file build to produce a good (not 100 MiB) executable. No matter which *.exe is launched, its process allocates ~20MiB of RAM (loaded Gtk dll-s).

The second dmd issue (discovered because of the 99.00% of zeros) is that _it doesn't use the bss section_. Let's look at a C++ program built using Microsoft's cl:
---
char arr[1024 * 1024 * 10];
void main() { }
---
It results in a ~10KiB executable, because `arr` is initialized with zero bytes and put in the bss section. If one of its elements is set to non-zero:
---
char arr[1024 * 1024 * 10] = { 1 };
void main() { }
---
the array can't be in .bss any more and the resulting executable grows by ~10MiB.

The following D program results in a ~10MiB executable:
---
ubyte[1024 * 1024 * 10] arr;
void main() { }
---
So, if there really is a reason not to use .bss, it should be clearly explained.

If the described issues aren't much more significant than "static this()", show me where I am wrong, please.
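The zeroCount utility itself is not shown in the post. A minimal sketch of what such a tool might look like follows; the names `ZeroStats` and `countZeros` and the output format are assumptions made for illustration, modeled on the table above, and are not taken from the actual utility.

```d
import std.file : read;
import std.stdio : writefln;

// Hypothetical result type: number of zero bytes and total bytes.
struct ZeroStats
{
    size_t zeros;
    size_t total;

    // Percentage of zero bytes; an empty input counts as 0%.
    double percent() const
    {
        return total == 0 ? 0.0 : 100.0 * zeros / total;
    }
}

// Count how many bytes of `data` are zero.
ZeroStats countZeros(const(ubyte)[] data)
{
    size_t zeros = 0;
    foreach (b; data)
        if (b == 0)
            ++zeros;
    return ZeroStats(zeros, data.length);
}

void main(string[] args)
{
    // Small self-check on an in-memory sample.
    ubyte[] sample = [0, 1, 0, 2];
    assert(countZeros(sample).zeros == 2);

    // Print one row per file named on the command line.
    foreach (name; args[1 .. $])
    {
        auto s = countZeros(cast(const(ubyte)[]) read(name));
        writefln("%10d|%6.2f|%10d|%12d|%s",
                 s.zeros, s.percent(), s.total - s.zeros, s.total, name);
    }
}
```

Run with the file names as arguments, for example the two libraries and two executables listed in the table.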
Dec 20 2011
On 12/20/11 9:00 AM, Denis Shelomovskij wrote:
> 16.12.2011 21:29, Andrei Alexandrescu wrote:
>> [snip]
>
> Really sorry, but it sounds silly to me. It's a minor problem. Does anyone really care about a 600 KiB (3.5x) size change in an empty program? Yes, someone does, but only if there are no other size increases in real programs.

In my experience, in a system programming language people do care about baseline size for one reason or another. I'd agree the reason is often overstated. But I did notice that people take a look at D and use "hello, world" size as a proxy for the language's overall overhead - runtime, handling of linking etc. You may or may not care about the conclusions of our investigation, but we and a category of people do care for a variety of project sizes and approaches to building them.

> Now dmd has at least a _two orders of magnitude_ file size increase. I posted that problem four months ago at "Building GtkD app on Win32 results in 111 MiB file mostly from zeroes".
> [snip]
> ---
> char arr[1024 * 1024 * 10];
> void main() { }
> ---
> [snip]
> If the described issues aren't much more significant than "static this()", show me where I am wrong, please.

Using BSS is a nice optimization, but not all compilers do it and I know for a fact MSVC didn't have it for a long time. That's probably why I got used to thinking "poor style" when seeing a large statically-sized buffer with static duration.

I'd say both issues deserve to be looked at, and saying one is more significant than the other would be difficult.

Andrei
Dec 20 2011
On 12/20/2011 6:23 AM, Andrei Alexandrescu wrote:
> On 12/20/11 9:00 AM, Denis Shelomovskij wrote:
>> Now dmd has at least a _two orders of magnitude_ file size increase. I posted that problem four months ago at "Building GtkD app on Win32 results in 111 MiB file mostly from zeroes".
>> [snip]
>> ---
>> char arr[1024 * 1024 * 10];
>> void main() { }
>> ---
>> [snip]
>> If the described issues aren't much more significant than "static this()", show me where I am wrong, please.
>
> Using BSS is a nice optimization, but not all compilers do it and I know for a fact MSVC didn't have it for a long time. That's probably why I got used to thinking "poor style" when seeing a large statically-sized buffer with static duration.
>
> I'd say both issues deserve to be looked at, and saying one is more significant than the other would be difficult.

First off, dmd most definitely puts 0 initialized static data into the BSS segment. So what's going on here?

1. char data is not initialized to 0, it is initialized to 0xFF. Non-zero data cannot be put in BSS.

2. Static data goes, by default, into thread local storage. BSS data is not thread local. To put it in global data, it has to be declared with __gshared.

So,

    __gshared byte arr[1024 * 1024 *10];

will go into BSS.

There is pretty much no reason to have such huge arrays in static data. Instead, dynamically allocate them.
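Walter's two points can be checked with a small D sketch. The default initializers are language facts verified by the `static assert` lines; the comments about which section each array ends up in only restate the explanation in the post and are not something the program itself can verify. The variable names are made up for illustration.

```d
// Point 1: char defaults to 0xFF, so a char array is not all-zero data.
static assert(char.init == 0xFF);
static assert(ubyte.init == 0);
static assert(byte.init == 0);

// Point 2: module-level variables are thread-local by default.
// Per the post, this zero-filled array is therefore not placed in BSS.
ubyte[1024 * 1024 * 10] tlsArr;

// Explicitly global: per the post, this one is eligible for BSS.
__gshared ubyte[1024 * 1024 * 10] globalArr;

void main()
{
    // Both arrays read as zero at runtime; they differ only in storage.
    assert(tlsArr[0] == 0);
    assert(globalArr[$ - 1] == 0);
}
```

Comparing the executable size with and without the `tlsArr` declaration is one way to observe the difference on a given platform.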
Dec 20 2011
21.12.2011 0:22, Walter Bright wrote:
> First off, dmd most definitely puts 0 initialized static data into the BSS segment. So what's going on here?
>
> 1. char data is not initialized to 0, it is initialized to 0xFF. Non-zero data cannot be put in BSS.

Sorry, it was because of copying C code in my post. A ubyte array was tested in D.

> 2. Static data goes, by default, into thread local storage. BSS data is not thread local. To put it in global data, it has to be declared with __gshared.

I completely forgot about TLS.

> So, __gshared byte arr[1024 * 1024 *10]; will go into BSS.
>
> There is pretty much no reason to have such huge arrays in static data. Instead, dynamically allocate them.

Of course, it was just an example of a huge executable. Now I see that dmd uses BSS, thank you for the explanation!

I still think that zero-filled TLS arrays could occupy no space in the executable, but that would need support from both the compiler and the D runtime, and surely it is not worth the time it would take to implement.

I apologize for the unfair accusation.
Dec 21 2011
On 20.12.2011 at 16:00, Denis Shelomovskij <verylonglogin.reg gmail.com> wrote:
> The second dmd issue (discovered because of the 99.00% of zeros) is that _it doesn't use the bss section_. Let's look at a C++ program built using Microsoft's cl:
> ---
> char arr[1024 * 1024 * 10];
> void main() { }
> ---
> It results in a ~10KiB executable, because `arr` is initialized with zero bytes and put in the bss section. If one of its elements is set to non-zero:
> ---
> char arr[1024 * 1024 * 10] = { 1 };
> void main() { }
> ---
> the array can't be in .bss any more and the resulting executable grows by ~10MiB.
>
> The following D program results in a ~10MiB executable:
> ---
> ubyte[1024 * 1024 * 10] arr;
> void main() { }
> ---
> So, if there really is a reason not to use .bss, it should be clearly explained.
>
> If the described issues aren't much more significant than "static this()", show me where I am wrong, please.

+1. I didn't know about .bss, but static arrays of zeroes (global, struct, class) increasing the executable size looked like a problem wanting a solution. I hope it is easy to solve for dmd and was never implemented only because it is an unimportant issue.
Dec 20 2011
On 12/20/2011 1:07 PM, Marco Leise wrote:
> +1. I didn't know about .bss, but static arrays of zeroes (global, struct, class) increasing the executable size looked like a problem wanting a solution. I hope it is easy to solve for dmd and is just an unimportant issue, so was never implemented.

I added a faq entry for this.
Dec 20 2011
On 20.12.2011 at 22:39, Walter Bright <newshound2 digitalmars.com> wrote:
> On 12/20/2011 1:07 PM, Marco Leise wrote:
>> +1. I didn't know about .bss, but static arrays of zeroes (global, struct, class) increasing the executable size looked like a problem wanting a solution. I hope it is easy to solve for dmd and is just an unimportant issue, so was never implemented.
>
> I added a faq entry for this.

Ok, I jumped on the bandwagon too early. Personally I only had this problem with classes and structs.

    struct Test {
        byte arr[1024 * 1024 *10];
    }

and

    class Test {
        byte arr[1024 * 1024 *10];
    }

both create a 10MB executable. While for the class, init may contain more data than just that one field, I don't see the struct adding anything or going into TLS. Can these initializers also go into .bss?
Dec 20 2011
On 12/20/2011 5:52 PM, Marco Leise wrote:
> Ok, I jumped on the bandwagon too early. Personally I only had this problem with classes and structs.
>
>     struct Test { byte arr[1024 * 1024 *10]; }
>
> and
>
>     class Test { byte arr[1024 * 1024 *10]; }
>
> both create a 10MB executable. While for the class, init may contain more data than just that one field, I don't see the struct adding anything or going into TLS. Can these initializers also go into .bss?

The struct one already does. Compile it, obj2asm it, and you'll see it there.
Dec 20 2011
On 21.12.2011 at 07:11, Walter Bright <newshound2 digitalmars.com> wrote:
> On 12/20/2011 5:52 PM, Marco Leise wrote:
>> Ok, I jumped on the bandwagon too early. Personally I only had this problem with classes and structs.
>>
>>     struct Test { byte arr[1024 * 1024 *10]; }
>>
>> and
>>
>>     class Test { byte arr[1024 * 1024 *10]; }
>>
>> both create a 10MB executable. While for the class, init may contain more data than just that one field, I don't see the struct adding anything or going into TLS. Can these initializers also go into .bss?
>
> The struct one already does. Compile it, obj2asm it, and you'll see it there.

Ah, I see it now. Sorry for the noise!
Dec 26 2011
On 27.12.2011 at 03:42, Marco Leise <Marco.Leise gmx.de> wrote:
> On 21.12.2011 at 07:11, Walter Bright <newshound2 digitalmars.com> wrote:
>> On 12/20/2011 5:52 PM, Marco Leise wrote:
>>> Ok, I jumped on the bandwagon too early. Personally I only had this problem with classes and structs.
>>>
>>>     struct Test { byte arr[1024 * 1024 *10]; }
>>>
>>> and
>>>
>>>     class Test { byte arr[1024 * 1024 *10]; }
>>>
>>> both create a 10MB executable. While for the class, init may contain more data than just that one field, I don't see the struct adding anything or going into TLS. Can these initializers also go into .bss?
>>
>> The struct one already does. Compile it, obj2asm it, and you'll see it there.
>
> Ah, I see it now. Sorry for the noise!

It is back again! The following struct in my main module increases the executable size by 10MB with DMD 2.075:

    struct Test {
        byte abcd[10 * 1024 * 1024];
    }

It does not seem to do so with *either* of these declarations, which create static arrays in the module:

    byte abcd[10 * 1024 * 1024];

    __gshared byte abcd[10 * 1024 * 1024];
Jan 18 2012
On 1/18/2012 1:43 AM, Marco Leise wrote:
> It is back again! The following struct in my main module increases the executable size by 10MB with DMD 2.075:
>
>     struct Test {
>         byte abcd[10 * 1024 * 1024];
>     }

Compile it and obj2asm the result, and you'll see it goes into the BSS segment:

_TEXT   segment dword use32 public 'CODE'       ;size is 0
_TEXT   ends
_DATA   segment para use32 public 'DATA'        ;size is 12
_DATA   ends
CONST   segment para use32 public 'CONST'       ;size is 0
CONST   ends
_BSS    segment para use32 public 'BSS'         ;size is 10485760
_BSS    ends
FLAT    group
        extrn   _D19TypeInfo_S3foo4Test6__initZ
        public  _D3foo4Test6__initZ
FMB     segment dword use32 public 'DATA'       ;size is 0
FMB     ends
FM      segment dword use32 public 'DATA'       ;size is 4
FM      ends
FME     segment dword use32 public 'DATA'       ;size is 0
FME     ends
        extrn   _D15TypeInfo_Struct6__vtblZ
        public  _D3foo12__ModuleInfoZ
_D19TypeInfo_S3foo4Test6__initZ COMDAT flags=x0 attr=x10 align=x0

_TEXT   segment
        assume  CS:_TEXT
_TEXT   ends
_DATA   segment
_D3foo12__ModuleInfoZ:
        db      004h,000h,000h,0ffffff80h,000h,000h,000h,000h   ;........
        db      066h,06fh,06fh,000h     ;foo.
_DATA   ends
CONST   segment
CONST   ends
_BSS    segment
_BSS    ends
FMB     segment
FMB     ends
FM      segment
        dd      offset FLAT:_D3foo12__ModuleInfoZ
FM      ends
FME     segment
FME     ends
_D19TypeInfo_S3foo4Test6__initZ comdat
        dd      offset FLAT:_D15TypeInfo_Struct6__vtblZ
        db      000h,000h,000h,000h     ;....
        db      008h,000h,000h,000h     ;....
        dd      offset FLAT:_D19TypeInfo_S3foo4Test6__initZ[03Ch]
        db      000h,000h,0ffffffa0h,000h,000h,000h,000h,000h   ;........
        db      000h,000h,000h,000h,000h,000h,000h,000h ;........
        db      000h,000h,000h,000h,000h,000h,000h,000h ;........
        db      000h,000h,000h,000h,000h,000h,000h,000h ;........
        db      000h,000h,000h,000h,000h,000h,000h,000h ;........
        db      001h,000h,000h,000h,066h,06fh,06fh,02eh ;....foo.
        db      054h,065h,073h,074h,000h        ;Test.
_D19TypeInfo_S3foo4Test6__initZ ends
        end
-------------------------------------------------

Adding a void main(){} yields an executable of 145,948 bytes.
Jan 18 2012
On 18.01.2012 at 11:18, Walter Bright <newshound2 digitalmars.com> wrote:
> On 1/18/2012 1:43 AM, Marco Leise wrote:
>> It is back again! The following struct in my main module increases the executable size by 10MB with DMD 2.075:
>>
>>     struct Test { byte abcd[10 * 1024 * 1024]; }
>
> Compiling it and obj2asm'ing the result, and you'll see it goes into the BSS segment:
> [...]
> Adding a void main(){} yields an executable of 145,948 bytes.

Thanks for checking back. I'll have to experiment a bit to narrow this one down. It comes and goes like a ghost. I was using Linux 64-bit and the switches -O -release on a medium size code base.
Jan 18 2012
I tried different versions of DMD 2.057:
- compiled from sources in the release zip (Gentoo ebuild)
- using the 32-bit binaries in the release zip
- compiling the latest 32-bit version of DMD from the repository

I tried different compiler flags or no flags at all, compiled similar code in C++ to see if the linker is ok, and tried -m32 and -m64, all to no avail. Then I found a solution that I can hardly imagine happening only on my unique snowflake of a system ;) :

    struct Test {
        __gshared byte abcd[10 * 1024 * 1024];
    }

If it weren't for your own test results, I'd assume there is a small compiler bug in the code that decides what can go into .bss, that makes it look only for data explicitly flagged as __gshared, but not other immutable data. (Something like that anyway.) I back-tracked the compiler code to where it either calls obj_bytes (good case, goes into .bss) or obj_lidata (bad case) to write the 10 MB of zeros. But there were so many call sites that I figured someone with inside knowledge would figure it out faster.

As a side-effect of this experiment I found this combination to do funny things at runtime:

--------------------------------------------------
struct Test {
    byte arr1[1024 * 1024 * 10];
    __gshared byte arr2[1024 * 1024 * 10];
}

int main() {
    Test test;
    return 0;
}
--------------------------------------------------

-- Marco

On 18.01.2012 at 11:18, Walter Bright <newshound2 digitalmars.com> wrote:
> On 1/18/2012 1:43 AM, Marco Leise wrote:
>> It is back again! The following struct in my main module increases the executable size by 10MB with DMD 2.075:
>>
>>     struct Test { byte abcd[10 * 1024 * 1024]; }
>
> Compiling it and obj2asm'ing the result, and you'll see it goes into the BSS segment:
> [...]
> Adding a void main(){} yields an executable of 145,948 bytes.
Jan 19 2012
P.S.: I could have realized it earlier: DMD uses the Windows PE BSS section quite well! It is Linux where the .bss section is not used! I'll file a bug report about this after lunch and look forward to smaller executables under Linux any time soon :D
Jan 19 2012
On Tuesday, 20 December 2011 at 14:01:04 UTC, Denis Shelomovskij wrote:
> Detailed description:
> GtkD is built using a single object file (gtk-one-obj.lib) or separate object files, one per source file (gtk-sep-obj.lib). Then main.d, which imports gtk.Main, is built using those libraries. Then the zeroCount utility is built and run over the resulting files:
>
> --------------------------------------------------
> Now let's calculate zero bytes counts:
> --------------------------------------------------
> Zero bytes|     %|  Non-zero| Total bytes| File
>    3628311| 21.56|  13202153|    16830464|gtk-one-obj.lib
>    1953124| 15.98|  10272924|    12226048|gtk-sep-obj.lib
>  127968798| 99.00|   1298430|   129267228|main-one-obj.exe
>     743821| 37.51|   1239183|     1983004|main-sep-obj.exe
> Done.
>
> So we have to use a very slow per-file build to produce a good (not 100 MiB) executable. No matter which *.exe is launched, its process allocates ~20MiB of RAM (loaded Gtk dll-s).

I believe this is bug 2254: http://d.puremagic.com/issues/show_bug.cgi?id=2254

The cause is the way DMD builds libraries. The old way of building libraries (using a librarian) does not create libraries that exhibit this problem when linked with an executable.
Dec 20 2011