digitalmars.D - Please integrate build framework into the compiler
- davidl (13/13) Mar 21 2009 1. compiler know in what situation a file need to be recompiled
- grauzone (32/32) Mar 21 2009 I don't really understand what you mean. But if you want the compiler to...
- Andrei Alexandrescu (3/10) Mar 21 2009 That's precisely what rdmd does.
- grauzone (26/37) Mar 21 2009 This looks really good, but I couldn't get it to work. Am I doing
- Andrei Alexandrescu (4/51) Mar 21 2009 Should work, but I tested only with D2. You may want to pass --chatty to...
- grauzone (5/5) Mar 21 2009 My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041
- Andrei Alexandrescu (7/12) Mar 21 2009 rdmd invokes dmd -v to get deps. It's a interesting idea to add a
- davidl (19/31) Mar 21 2009 The bad news is that public imports ruin the simplicity of dependencies....
- Ary Borenszweig (2/26) Mar 22 2009 Yes. They could give a compile-time error... always. ;-)
- grauzone (5/20) Mar 23 2009 Is this just an "interesting idea", or are you actually considering
- Andrei Alexandrescu (3/21) Mar 23 2009 I would if there was a compelling case made in favor of it.
- dsimcha (19/23) Mar 21 2009 I'm surprised that this could possibly be more efficient than incrementa...
- grauzone (9/34) Mar 21 2009 Maybe incremental compilation could be faster, but dmd has a bug that
- Christopher Wright (2/10) Mar 21 2009 This is only if there is no dynamic linking.
- BCS (7/13) Mar 21 2009 Adding that without a way to turn it off would kill D in some cases. I h...
- Christopher Wright (2/7) Mar 21 2009 You can use interfaces for this, though that is not always possible.
- davidl (28/34) Mar 21 2009 This may not be true. Consider the dwt lib case, once you tweaked a modu...
- grauzone (18/28) Mar 21 2009 If it's about bugs, it would (probably) be easier for Walter to fix that...
- Kristian Kilpi (16/22) Mar 22 2009 Well, why not get rid of the imports altogether... Ok, that would not be...
- Christopher Wright (6/9) Mar 22 2009 That's not sufficient. I'm using SDL right now; if I type 'Surface s;',
- Kristian Kilpi (28/37) Mar 22 2009 Such things should of course be told to the compiler somehow. By using t...
- dennis luehring (36/39) Mar 22 2009 maybe like delphi did it
- Christopher Wright (17/61) Mar 22 2009 Then I want to deal with a library type with the same name as my builtin...
- Nick Sabalausky (25/35) Mar 24 2009 "If your program can operate efficiently with a textual representation.....
- bearophile (4/8) Mar 24 2009 Maybe not much, because today textual files can be compressed and decomp...
- Nick Sabalausky (12/20) Mar 24 2009 I've become more and more wary of this "CPUs are now fast enough..." phr...
- bearophile (8/9) Mar 24 2009 See here too :-)
- Nick Sabalausky (13/21) Mar 24 2009 Excellent article :)
- bearophile (6/13) Mar 24 2009 Because experiments have shown it solves or reduces a lot the problem yo...
- Christopher Wright (7/11) Mar 24 2009 Most programs only need to load up text on startup. So the cost of
- Unknown W. Brackets (12/33) Mar 22 2009 Actually, dmd is so fast I never bother with these "build" utilities. I...
1. The compiler knows in which situations a file needs to be recompiled.

If a file's header file is unchanged, only its object file is needed for linking; none of the files that import it should require recompilation. If a file's header file changes, and thus its interface changes, then every file that imports it should be recompiled. The compiler could emit build commands the way rebuild does. I would enjoy:

dmd -buildingcommand abc.d > responsefile
dmd responsefile

I think we need to eliminate useless recompilation as much as we can, considering how D project sizes keep growing.

2. Maintaining the build without compiler support costs
Mar 21 2009
I don't really understand what you mean. But if you want the compiler to scan for dependencies, I fully agree.

I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient. Here are some arguments that speak for this approach:

- A full compiler is the only piece of software that can build a correct/complete module dependency graph. This is because you need full semantic analysis to catch all import statements. For example, you can use a string mixin to generate import statements: mixin("import bla;"). No naive dependency scanner would be able to detect this import. You need CTFE capabilities, which require almost a full compiler. (Actually, dsss uses the dmd frontend for dependency scanning.)

- Speed. Incremental compilation is godawfully slow (10 times slower than compiling all files in one dmd invocation). You could pass all changed files to dmd at once, but this is broken and often causes linker errors (ask the dsss author for details lol). Recompiling the whole thing every time is faster.

- Long dependency chains. Unlike in C/C++, you can't separate a module into interface and implementation. Compared to C++, it's as if a change to one .c file triggers recompilation of a _lot_ of other .c files. This makes incremental compilation look really useless. Unless you move modules into libraries and use them through .di files.

I would even go so far as to say that dmd should automatically follow all imports and compile them in one go. This would be faster than having a separate responsefile step, because the source code needs to be analyzed only once. To prevent compilation of imported library headers, the compiler could provide a new include switch for library code. Modules inside "library" include paths wouldn't be compiled.

Hell, maybe I'll even manage to come up with a compiler patch to turn this into reality.
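To make the string-mixin point concrete, here is a minimal D sketch (the module name bla is just a placeholder): the import only comes into existence after mixin expansion, so a textual scanner that greps for import statements cannot see it.

--- scanme.d:
module scanme;

// built at compile time; a grep for "import bla" finds nothing in this file
enum moduleName = "bla";
mixin("import " ~ moduleName ~ ";");

void f()
{
    // would use symbols from bla here, so bla really is a dependency
}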
Mar 21 2009
grauzone wrote:
> I don't really understand what you mean. But if you want the compiler to scan for dependencies, I fully agree. I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient.

That's precisely what rdmd does.

Andrei
Mar 21 2009
Andrei Alexandrescu wrote:
> grauzone wrote:
>> I don't really understand what you mean. But if you want the compiler to scan for dependencies, I fully agree. I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient.
>
> That's precisely what rdmd does.
>
> Andrei

This looks really good, but I couldn't get it to work. Am I doing something wrong?

--- o.d:
module o;
import tango.io.Stdout;
void k() { Stdout("foo").newline; }

--- u.d:
module u;
import o;
void main() { k(); }

$ rdmd u.d
/tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o:(.data+0x38): undefined reference to `_D1o12__ModuleInfoZ'
/tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o: In function `_Dmain':
u.d:(.text._Dmain+0x4): undefined reference to `_D1o1kFZv'
collect2: ld returned 1 exit status
--- errorlevel 1
rdmd: Couldn't compile or execute u.d.

$ dmd|grep Compiler
Digital Mars D Compiler v1.041
Mar 21 2009
grauzone wrote:
> Andrei Alexandrescu wrote:
>> grauzone wrote:
>>> I don't really understand what you mean. But if you want the compiler to scan for dependencies, I fully agree. I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient.
>>
>> That's precisely what rdmd does.
>
> This looks really good, but I couldn't get it to work. Am I doing something wrong?
>
> --- o.d:
> module o;
> import tango.io.Stdout;
> void k() { Stdout("foo").newline; }
>
> --- u.d:
> module u;
> import o;
> void main() { k(); }
>
> $ rdmd u.d
> /tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o:(.data+0x38): undefined reference to `_D1o12__ModuleInfoZ'
> /tmp/u-1000-20-49158160-A46C236CDE107E3B9F053881E4257C2D.o: In function `_Dmain':
> u.d:(.text._Dmain+0x4): undefined reference to `_D1o1kFZv'
> collect2: ld returned 1 exit status
> --- errorlevel 1
> rdmd: Couldn't compile or execute u.d.
>
> $ dmd|grep Compiler
> Digital Mars D Compiler v1.041

Should work, but I tested only with D2. You may want to pass --chatty to rdmd and see what commands it invokes.

Andrei
Mar 21 2009
My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?

Another question: rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?
Mar 21 2009
grauzone wrote:
> My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?

Hold off on that for now.

> Another question, rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?

rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.

Andrei
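As a rough illustration of the step Andrei describes (this is not rdmd's actual source), here is a small D2 sketch that pulls module names out of captured dmd -v output, assuming the verbose log contains lines of the form "import    foo.bar    (/path/to/foo/bar.d)":

--- deps.d:
module deps;
import std.stdio, std.string, std.array;

void main()
{
    // deps.txt is assumed to hold the output of something like: dmd -v -o- yourfile.d
    foreach (line; File("deps.txt").byLine())
    {
        auto s = strip(line).idup;
        if (s.length > 7 && s[0 .. 7] == "import ")
        {
            auto fields = split(s);      // whitespace-separated fields
            if (fields.length >= 2)
                writeln(fields[1]);      // the module name, e.g. tango.io.Stdout
        }
    }
}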
Mar 21 2009
On Sun, 22 Mar 2009 12:18:03 +0800, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> grauzone wrote:
>> My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?
>
> Hold off on that for now.
>
>> Another question, rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?
>
> rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.
>
> Andrei

The bad news is that public imports ruin the simplicity of dependencies, though in most cases D projects use private imports. Maybe we can further restrict public imports.

I suggest we add a new module style of interfacing. Public imports are only allowed in those modules, and an interface module can only have public imports. Example:

--- all.d:
module(interface) all;
public import blah;
public import blah.foo;

An interface module cannot import another interface module, so no public import chain can be created. The shortcoming is the duplication:

module(interface) subpack.all;
public import subpack.mod;

module(interface) all;
public import subpack.mod;  // duplication here
public import subpack1.mod1;
Mar 21 2009
davidl wrote:
> On Sun, 22 Mar 2009 12:18:03 +0800, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> grauzone wrote:
>>> My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?
>>
>> Hold off on that for now.
>>
>>> Another question, rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?
>>
>> rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.
>>
>> Andrei
>
> The bad news is that public imports ruin the simplicity of dependencies. Though most cases d projs uses private imports. Maybe we can further restrict the public imports.

Yes. They could give a compile-time error... always. ;-)
Mar 22 2009
Andrei Alexandrescu wrote:
> grauzone wrote:
>> My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?
>
> Hold off on that for now.
>
>> Another question, rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?
>
> rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.
>
> Andrei

Is this just an "interesting idea", or are you actually considering implementing it?

Anyway, maybe you could pressure Walter to fix that dmd bug that stops dsss from being efficient. I can't advertise this enough.
Mar 23 2009
grauzone wrote:
> Andrei Alexandrescu wrote:
>> grauzone wrote:
>>> My rdmd doesn't know --chatty. Probably the zip file for dmd 1.041 contains an outdated, buggy version. Where can I find the up-to-date source code?
>>
>> Hold off on that for now.
>>
>>> Another question, rdmd just calls dmd, right? How does it scan for dependencies, or is this step actually done by dmd itself?
>>
>> rdmd invokes dmd -v to get deps. It's an interesting idea to add a compilation mode to rdmd that asks dmd to generate headers and diff them against the old headers. That way we can implement incremental rebuilds without changing the compiler.
>
> Is this just an "interesting idea", or are you actually considering implementing it?

I would if there was a compelling case made in favor of it.

Andrei
Mar 23 2009
== Quote from grauzone (none example.net)'s article
> I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient.

I'm surprised that this could possibly be more efficient than incremental compilation, but I've never worked on a project large enough for compile times to be a major issue, so I've never really looked into this.

If incremental compilation were removed from the spec, meaning the compiler would always know about the whole program when compiling, I assume (correct me if I'm wrong) that would mean the following restrictions could be removed:

1. std.traits could offer a way to get a tuple of all derived classes, essentially the opposite of BaseTypeTuple.

2. Since DMD would know about all derived classes when compiling the base class, it would be feasible to allow templates to add virtual functions to classes. IMHO, this would be an absolute godsend, as it is currently a _huge_ limitation of templates.

3. For the same reason, method calls to classes with no derived classes could be made directly instead of through the vtable.

Of course, these restrictions would still apply to libraries that use .di files. If incremental compilation is actually causing more problems than it solves anyhow, it would be great to get rid of it along with the annoying restrictions it creates.
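As a small illustration of why point 2 hurts today (class names invented): member function templates compile fine, but they are never virtual, so a derived class cannot override them.

--- virt.d:
module virt;

class Base
{
    // an ordinary method: virtual, can be overridden
    void plain() { }

    // a templated method: compiles, but is never virtual
    void method(T)(T x) { }
}

class Derived : Base
{
    override void plain() { }
    // override void method(T)(T x) { }  // would not compile: templates cannot override
}

void main()
{
    Base b = new Derived;
    b.plain();     // dispatches to Derived.plain
    b.method(1);   // always Base.method!int: no dynamic dispatch possible
}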
Mar 21 2009
dsimcha wrote:
> == Quote from grauzone (none example.net)'s article
>> I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient.
>
> I'm surprised that this could possibly be more efficient than incremental compilation, but I've never worked on a project large enough for compile times to be a major issue, so I've never really looked into this.

Maybe incremental compilation could be faster, but dmd has a bug that forces tools like dsss/rebuild to use a slower method. Instead of invoking the compiler once to recompile all modules that depend on changed files, it has to start a new compiler process for each file.

> If incremental compilation were removed from the spec, meaning the compiler would always know about the whole program when compiling, I assume (correct me if I'm wrong) that would mean the following restrictions could be removed:
> 1. std.traits could offer a way to get a tuple of all derived classes, essentially the opposite of BaseTypeTuple.
> 2. Since DMD would know about all derived classes when compiling the base class, it would be feasible to allow templates to add virtual functions to classes. IMHO, this would be an absolute godsend, as it is currently a _huge_ limitation of templates.
> 3. For the same reason, method calls to classes with no derived classes could be made directly instead of through the vtable.

And you could do all kinds of interprocedural optimizations.

> Of course, these restrictions would still apply to libraries that use .di files. If incremental compilation is actually causing more problems than it solves anyhow, it would be great to get rid of it along with the annoying restrictions it creates.

compilation. But for now, D's build model is too similar to C/C++ for you to completely remove that ability.
Mar 21 2009
dsimcha wrote:
> 1. std.traits could offer a way to get a tuple of all derived classes, essentially the opposite of BaseTypeTuple.
> 2. Since DMD would know about all derived classes when compiling the base class, it would be feasible to allow templates to add virtual functions to classes. IMHO, this would be an absolute godsend, as it is currently a _huge_ limitation of templates.
> 3. For the same reason, method calls to classes with no derived classes could be made directly instead of through the vtable.

This is only if there is no dynamic linking.
Mar 21 2009
Hello grauzone,

> I would even go so far to say, that dmd should automatically follow all imports and compile them in one go. This would be faster than having a separate responsefile step, because the source code needs to be analyzed only once. To prevent compilation of imported library headers, the compiler could provide a new include switch for library code. Modules inside "library" include paths wouldn't be compiled.

Adding that without a way to turn it off would kill D in some cases. I have a project where DMD uses up >30% of the available address space compiling one module. If I was forced to compile all modules at once, it might not work, end of story.

That said, for many cases, I don't see a problem with having that feature available.
Mar 21 2009
grauzone wrote:
> - Long dependency chains. Unlike in C/C++, you can't separate a module into interface and implementation. Compared to C++, it's as if a change to one .c file triggers recompilation of a _lot_ of other .c files. This makes incremental compilation really look useless. Unless you move modules into libraries and use them through .di files.

You can use interfaces for this, though that is not always possible.
Mar 21 2009
On Sun, 22 Mar 2009 04:19:31 +0800, grauzone <none example.net> wrote:
> I don't really understand what you mean. But if you want the compiler to scan for dependencies, I fully agree. I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient.

This may not be true. Consider the dwt lib case: once you tweak a module only a little (that is, you don't modify any interface that connects with outside modules, or code that could possibly affect modules in the same package), the optimal way is

dmd -c your_tweaked_module
link all_obj

That's much faster than regenerating all the other object files. Yes, feeding them all to DMD compiles really fast, but writing all the object files to disk costs a lot of time.

And your impression of incremental compilation seems to be misguided by the rebuild and dsss system. Rebuild takes no advantage of .di files, so it has to recompile every time, even when the module's dependencies on all other .di files are unchanged.

I posted several blocking header-generation bugs in DMD, with fixes. With just those small changes dmd can generate almost all header files correctly. I tested tango, dwt and dwt-addons; those projects are very big and some make advanced use of templates. So the header-generation building strategy is really not far away. Little self-promotion here, and in case Walter misses some of them:

http://d.puremagic.com/issues/show_bug.cgi?id=2744
http://d.puremagic.com/issues/show_bug.cgi?id=2745
http://d.puremagic.com/issues/show_bug.cgi?id=2747
http://d.puremagic.com/issues/show_bug.cgi?id=2748
http://d.puremagic.com/issues/show_bug.cgi?id=2751

In C++, a sophisticated makefile carefully tracks the .h dependencies of the .c files, so that once a .h file is updated, the .c files that depend on it are recompiled. In D this detection can be made by comparing the old .di files with the newly generated ones and testing their equality.
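A minimal D sketch of the .di-comparison idea (the paths, file names and surrounding build steps are invented, not part of any existing tool): regenerate the header, compare it byte-for-byte with the cached one, and only mark importers for recompilation when the interface actually changed.

--- dicheck.d:
module dicheck;
import std.file, std.stdio;

// true if the freshly generated header differs from the cached one,
// i.e. the module's interface changed and its importers need recompiling
bool interfaceChanged(string cachedDi, string freshDi)
{
    if (!exists(cachedDi))
        return true;        // no previous header: treat as changed
    return cast(const(ubyte)[]) read(cachedDi) != cast(const(ubyte)[]) read(freshDi);
}

void main()
{
    // assumes something like "dmd -c -H -Hdgen foo.d" already wrote gen/foo.di
    if (interfaceChanged("cache/foo.di", "gen/foo.di"))
        writeln("interface changed: recompile all importers of foo");
    else
        writeln("interface unchanged: only relink foo.o");
}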
Mar 21 2009
> Little self-promotion here, and in case Walter misses some of them:
> http://d.puremagic.com/issues/show_bug.cgi?id=2744
> http://d.puremagic.com/issues/show_bug.cgi?id=2745
> http://d.puremagic.com/issues/show_bug.cgi?id=2747
> http://d.puremagic.com/issues/show_bug.cgi?id=2748
> http://d.puremagic.com/issues/show_bug.cgi?id=2751

If it's about bugs, it would (probably) be easier for Walter to fix that code generation bug that forces dsss/rebuild to invoke a new dmd process to recompile each outdated file separately. This would bring a critical speedup for incremental compilation (from absolutely useless to relatively useful), and all impatient D users with middle-sized source bases could be happy.

> In c++, a sophisticated makefile carefully build .h dependencies of .c files. Thus, once .h files are updated, then .c files which are based on them need to be recompile. This detection can be made by comparison of old .di files and new .di files by testing their equality.

This sounds like a really nice idea, but it's also quite complex. For example, to guarantee correctness, the D compiler would _always_ have to read the .di file when importing a module (and not the .d file directly). If it doesn't do that, it could "accidentally" use information that isn't included in the .di file (like function bodies used for inlining). This means you would have to generate the .di files first, and when doing this, you also have to deal with circular dependencies, which will bring extra headaches. And of course, you need to fix all those .di generation bugs.

It's actually a bit scary that the compiler not only has to be able to parse D code, but also to output D source code again. And .di files are not even standardized. It's perhaps messy enough to deem it unrealistic. Still, nice idea.
Mar 21 2009
On Sat, 21 Mar 2009 22:19:31 +0200, grauzone <none example.net> wrote:
> I don't really understand what you mean. But if you want the compiler to scan for dependencies, I fully agree. I claim that we don't even need incremental compilation. It would be better if the compiler would scan for dependencies, and if a source file has changed, recompile the whole project in one go. This would be simple and efficient.

Well, why not get rid of the imports altogether... Ok, that would not be feasible because of the way compilers (D, C++, etc) are built nowadays.

I find adding #includes/imports laborious. (Is this component already #included/imported? Where's that class defined? Did I forget something?) And when you modify or refactor the file, you have to update the #includes/imports accordingly... (In case of modification/refactoring) the easiest way is just to compile the file and see if there are errors... Of course, that approach will not help to remove the unnecessary #includes/imports.

So, sometimes (usually?) I give up, create one huge #include/import file that #includes/imports all the stuff, and use that instead. Efficient? Pretty? No. Easy? Simple? Yes.

#includes/imports are redundant information: the source code of course describes what's used in it. So, the compiler could be aware of the whole project (and the libraries used) instead of one file at a time.
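For illustration, the "one huge import module" workaround described above usually looks something like this in D (module names invented):

--- myproject/all.d:
module myproject.all;

// one module that publicly imports everything, so user code
// only ever needs "import myproject.all;"
public import myproject.gfx;
public import myproject.net;
public import myproject.util;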
Mar 22 2009
Kristian Kilpi wrote:
> #includes/imports are redundant information: the source code of course describes what's used in it. So, the compiler could be aware of the whole project (and the libraries used) instead of one file at the time.

That's not sufficient. I'm using SDL right now; if I type 'Surface s;', should that import sdl.surface or cairo.Surface? How is the compiler to tell? How should the compiler find out where to look for classes named Surface? Should it scan everything under /usr/local/include/d/? That's going to be pointlessly expensive.
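For what it's worth, D already forces this kind of disambiguation when two explicit imports export the same name; a minimal sketch, with the sdl/cairo module layout invented to mirror the example:

--- app.d:
module app;
import sdl.surface;    // defines a class Surface
import cairo.surface;  // also defines a class Surface

void main()
{
    // Surface s;                        // error: ambiguous between the two modules
    auto s = new sdl.surface.Surface;    // fine: fully qualified
}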
Mar 22 2009
On Sun, 22 Mar 2009 14:14:39 +0200, Christopher Wright <dhasenan gmail.com> wrote:
> Kristian Kilpi wrote:
>> #includes/imports are redundant information: the source code of course describes what's used in it. So, the compiler could be aware of the whole project (and the libraries used) instead of one file at the time.
>
> That's not sufficient. I'm using SDL right now; if I type 'Surface s;', should that import sdl.surface or cairo.Surface? How is the compiler to tell? How should the compiler find out where to look for classes named Surface? Should it scan everything under /usr/local/include/d/? That's going to be pointlessly expensive.

Such things should of course be told to the compiler somehow, by using the project configuration or by other means. (It's only a matter of definition.) For example, if my project contains the Surface class, then 'Surface s;' should of course refer to it. If some library (used by the project) also has a Surface class, then one should use some other way to refer to it (e.g. sdl.Surface).

But my point was that the compilers today do not have knowledge about the project as a whole. That makes this kind of 'scanning' too expensive (in the current compiler implementations). But if the compilers were built differently, that wouldn't have to be true.

If I were to create/design a compiler (which I am not ;) ), it would be something like this:

Every file is cached (why read and parse files over and over again if not necessary?). These cache files would contain all the information (parse trees, interfaces, etc) needed during the compilation (of the whole project). They would also contain the compilation results (i.e. assembly). So, these cache/database files would logically replace the old object files. That is, there would be a database for the whole project. When something gets changed, the compiler knows what effect it has and what's required to do.

And finally, I would also change the format of libraries. A library would be one file only. No more header/.di files; one compact file containing all the needed information (in a binary formatted database that can be read very quickly).
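Purely as an illustration of the kind of per-module record such a cache/database might hold (every field here is invented; this is a sketch of the idea, not a proposal for a concrete format):

--- cachedb.d:
module cachedb;

// one entry per module in the project-wide build database
struct ModuleCacheEntry
{
    string   moduleName;       // e.g. "myproject.gfx"
    ulong    sourceHash;       // detects source changes
    string[] imports;          // resolved dependency list
    ubyte[]  publicInterface;  // serialized ".di-like" interface, for change detection
    ubyte[]  objectCode;       // compilation result (logically replaces the .o file)
}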
Mar 22 2009
> Such things should of course be told to the compiler somehow. By using the project configuration, or by other means. (It's only a matter of definition.)

Maybe like Delphi did it. There is a file called .dpr (Delphi project) which holds the absolute/relative paths of the imports used in the project; it could be seen as a Delphi source-based makefile.

test.dpr:
---
program test;

uses // like D's import
  unit1 in '\temp\unit1.pas',
  unit2 in '\bla\unit2.pas',
  unit3 in '\blub\unit3.pas',
  ...
---

unit1.pas:
---
uses unit2, unit3;
interface
...
implementation
...
---

The .pas source files are compiled into a Delphi-compiler-specific "object file format" called .dcu (Delphi compiled unit), which holds all the intelligent data for the compiler when used several times (if the compiler finds a .dcu it will use it, or compile the .pas to a .dcu if needed).

I think the blazing fast parser (and the absence of generic programming features) makes Delphi the fastest compiler out there; the compiling speed is comparable to sending a message through ICQ or saving a small file.

Does the dmd compiler have rich compile/link-time intermediate files?

And btw: if we do compile-time benchmarks, Delphi is the only hard-to-beat reference. But I still don't like Delphi :-)
Mar 22 2009
Kristian Kilpi wrote:
> On Sun, 22 Mar 2009 14:14:39 +0200, Christopher Wright <dhasenan gmail.com> wrote:
>> Kristian Kilpi wrote:
>>> #includes/imports are redundant information: the source code of course describes what's used in it. So, the compiler could be aware of the whole project (and the libraries used) instead of one file at the time.
>>
>> That's not sufficient. I'm using SDL right now; if I type 'Surface s;', should that import sdl.surface or cairo.Surface? How is the compiler to tell? How should the compiler find out where to look for classes named Surface? Should it scan everything under /usr/local/include/d/? That's going to be pointlessly expensive.
>
> Such things should of course be told to the compiler somehow. By using the project configuration, or by other means. (It's only a matter of definition.) For example, if my project contains the Surface class, then 'Surface s;' should of course refer to it. If some library (used by the project) also has the Surface class, then one should use some other way to refer to it (e.g. sdl.Surface).

Then I want to deal with a library type with the same name as my builtin type. You can come up with a convention that does the right thing 90% of the time, but produces strange errors on occasion.

> But my point was that the compilers today do not have knowledge about the projects as a whole. That makes this kind of 'scanning' too expensive (in the current compiler implementations). But if the compilers were build differently that wouldn't have to be true.

If you want a system that accepts plugins, you will never have access to the entire project. If you are writing a library, you will never have access to the entire project. So a compiler has to address those needs, too.

> If I were to create/design a compiler (which I am not ;) ), it would be something like this: Every file is cached (why to read and parse files over and over again, if not necessary). These cache files would contain all the information (parse trees, interfaces, etc) needed during the compilation (of the whole project). Also, they would contain the compilation results too (i.e. assembly). So, these cache/database files would logically replace the old object files. That is, there would be database for the whole project. When something gets changed, the compiler knows what effect it has and what's required to do.

All this is helpful for developers. It's not helpful if you are merely compiling everything once, but then, the overhead would only be experienced on occasion.

> And finally, I would also change the format of libraries. A library would be one file only. No more header/.di -files; one compact file containing all the needed information (in a binary formated database that can be read very quickly).

Why binary? If your program can operate efficiently with a textual representation, it's easier to test, easier to debug, and less susceptible to changes in internal structures. Additionally, a database in a binary format will require special tools to examine. You can't just pop it open in a text editor to see what functions are defined.
Mar 22 2009
"Christopher Wright" <dhasenan gmail.com> wrote in message news:gq6lms$1815$1 digitalmars.com..."If your program can operate efficiently with a textual representation..." I think that's the key right there. Most of the time, parsing a sensibly-designed text format is going to be a bit slower than reading in an equivalent sensibly-designed (as opposed to over-engineered [pet-peeve]ex: GOLD Parser Builder's .cgt format[/pet-peeve]) binary format. First off, there's just simply more raw data to be read off the disk and processed, then you've got the actual tokenizing/syntax-parsing itself, and then anything that isn't supposed to be interpreted as a string (like ints and bools) need to get converted to their proper internal representations. And then for saving, you go through all the same, but in reverse. (Also, mixed human/computer editing of a text file can sometimes be problematic.) With a sensibly-designed binary format (and a sensible systems language like into memory and apply some structs over top of them. Toss in some trivial version checks and maybe some endian fixups and you're done. Very little processing and memory is needed. I can certainly appreciate the other benefits of text formats, though, and certainly agree that there are cases where the performance of using a text format would be perfectly acceptable. But it can add up. And I often wonder how much faster and more memory-efficient things like linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.And finally, I would also change the format of libraries. A library would be one file only. No more header/.di -files; one compact file containing all the needed information (in a binary formated database that can be read very quickly).Why binary? If your program can operate efficiently with a textual representation, it's easier to test, easier to debug, and less susceptible to changes in internal structures. Additionally, a database in a binary format will require special tools to examine. You can't just pop it open in a text editor to see what functions are defined.
Mar 24 2009
Nick Sabalausky:
> I often wonder how much faster and more memory-efficient things like linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.

Maybe not much, because today textual files can be compressed and decompressed on the fly. CPUs are now fast enough that even with compression the I/O is usually the bottleneck anyway.

Bye,
bearophile
Mar 24 2009
"bearophile" <bearophileHUGS lycos.com> wrote in message news:gqbe2k$13al$1 digitalmars.com...Nick Sabalausky:I've become more and more wary of this "CPUs are now fast enough..." phrase that keeps getting tossed around these days. The problem is, that argument gets used SO much, that on this fastest computer I've ever owned, I've actually experienced *basic text-entry boxes* (with no real bells or whistles or anything) that had *seconds* of delay. That never once happened to me on my "slow" Apple 2. The unfortunate truth is that the speed and memory of modern systems are constantly getting used to rationalize shoddy bloatware practices and we wind up with systems that are even *slower* than they were back on less-powerful hardware. It's pathetic, and drives me absolutely nuts.I often wonder how much faster and more memory-efficient things like linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.Maybe not much, because today textual files can be compressed and decomperssed on the fly. CPUs are now fast enough that even with compression the I/O is usually the bottleneck anyway.
Mar 24 2009
Nick Sabalausky:
> That never once happened to me on my "slow" Apple 2.

See here too :-)
http://hubpages.com/hub/_86_Mac_Plus_Vs_07_AMD_DualCore_You_Wont_Believe_Who_Wins

Yet, what I have written is often true :-) Binary data can't be compressed as well as textual data, and lzop is I/O bound in most situations:
http://www.lzop.org/

Bye,
bearophile
Mar 24 2009
"bearophile" <bearophileHUGS lycos.com> wrote in message news:gqbgma$189l$1 digitalmars.com...Nick Sabalausky:Excellent article :)That never once happened to me on my "slow" Apple 2.<See here too :-) http://hubpages.com/hub/_86_Mac_Plus_Vs_07_AMD_DualCore_You_Wont_Believe_Who_WinsYet, what I have written is often true :-) Binary data can't be compressed as well as textual data,Doesn't really matter, since binary data (assuming a format that isn't over-engineered) is already smaller than the same data in text form. Text compresses well *because* it contains so much more excess redundant data than binary data does. I could stick 10GB of zeros to the end of a 1MB binary file and suddenly it would compress far better than any typical text file.and lzop is I/O bound in most situations: http://www.lzop.org/I'm not really sure why you're bringing up compression...? Do you mean that the actual disk access time of a text format can be brought down to the time of an equivalent binary format by storing the text file in a compressed form?
Mar 24 2009
Nick Sabalausky:
> Doesn't really matter, since binary data (assuming a format that isn't over-engineered) is already smaller than the same data in text form.

If you take compression into account too, the compressed text is sometimes smaller than the same data as a binary file, and even than that binary file compressed (because good compressors are often able to spot redundancy better in text files than in arbitrarily structured binary files).

> I'm not really sure why you're bringing up compression...?

Because experiments have shown it solves, or greatly reduces, the problem you were talking about.

> Do you mean that the actual disk access time of a text format can be brought down to the time of an equivalent binary format by storing the text file in a compressed form?

It's not always true, but it happens often enough, or the difference becomes tolerable and balances the clarity advantages of the textual format (and sometimes the actual time becomes less, but this is less common).

Bye,
bearophile
Mar 24 2009
Nick Sabalausky wrote:
> But it can add up. And I often wonder how much faster and more memory-efficient things like linux and the web could have been if they weren't so big on sticking damn near everything into "convenient" text formats.

Most programs only need to load up text on startup. So the cost of parsing the config file is linear in the number of times you start the application, and linear in the size of the config file.

If there were a binary database format in place of libraries, I would be fine with it, as long as there were a convenient way to get the textual version.
Mar 24 2009
Actually, dmd is so fast I never bother with these "build" utilities. I just send it all the files and have it rebuild every time, deleting all the .o files afterward. This is very fast, even for larger projects.

It appears (to me) that the static cost of calling dmd is much greater than the dynamic cost of compiling a file. These toolkits always compile a, then b, then c, which takes like 2.5 times as long as compiling a, b, and c at once.

That said, if dmd were made to link into other programs, these toolkits could hook into it and have the fixed cost only once (theoretically), but still dynamically decide which files to compile. This seems ideal.

-[Unknown]

davidl wrote:
> 1. compiler know in what situation a file need to be recompiled
> Consider the file given the same header file, then the obj file of this will be required for linking, all other files import this file shouldn't require any recompilation in this case. If a file's header file changes, thus the interface changes, all files import this file should be recompiled. Compiler can emit building command like rebuild does. I would enjoy:
> dmd -buildingcommand abc.d > responsefile
> dmd responsefile
> I think we need to eliminate useless recompilation as much as we should with consideration of the growing d project size.
> 2. maintaining the build without compiler support costs
Mar 22 2009