
digitalmars.D - Compiler: Size of generated executable file

reply Ph <romanua gmail.com> writes:
Why is the generated file so huge?
An "empty" program such as:

int main(char[][] args)
{

	return 0;
}

compiled with dmd2 produces a file with a size of 266268 bytes.
Even after UPX, its size is 87552 bytes.
The size of this code, compiled with VS (yes, yes, C++), is 6656 bytes.
The compiler adds the standard library to the file, am I right?
Is there some optimization which deletes unused code from the file?
Jan 09 2010
next sibling parent The Anh Tran <trtheanh gmail.com> writes:
D has a large file size because the linker still leaves lots of typeinfo there,
even if you don't use any.
Optimization seems to have the lowest priority. There are way more important
things that need manpower: the concurrency paradigm, the functional paradigm,
fixing bugs, ...
Jan 09 2010
prev sibling next sibling parent Lutger <lutger.blijdestijn gmail.com> writes:
On 01/09/2010 04:36 PM, Ph wrote:
 Why a generated file is so huge?
Binaries are big because of typeinfo, the standard library, and bloat from templates. C++ binaries are probably also much bigger when the standard library is compiled in statically, and they also bloat up pretty fast when you use templated code (especially iostream). The 6kb you mention excludes the MS runtime dll. It is not a priority for dmd2. With ldc there are, I believe, some switches to selectively turn off generating TypeInfo. Is it a problem for you? Eventually, in a bigger program, the bloat is probably on par with C++.
Jan 09 2010
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Ph" <romanua gmail.com> wrote in message 
news:hia7qc$b5k$1 digitalmars.com...
 Why a generated file is so huge?
 "Empty" program such as:

 int main(char[][] args)
 {

 return 0;
 }

 compiled with dmd2 into file with size of  266268 bytes.
 Even after UPX, it's size is 87552 bytes.
 Size of this code,compiled with VS(yes,yes, C++), is 6 656 bytes.
 Compiler add's standard library  to file, am i right?
 Is there some optimization which delete unused code from file?
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Jan 09 2010
next sibling parent reply grauzone <none example.net> writes:
Nick Sabalausky wrote:
 "Ph" <romanua gmail.com> wrote in message 
 news:hia7qc$b5k$1 digitalmars.com...
 Why a generated file is so huge?
 "Empty" program such as:

 int main(char[][] args)
 {

 return 0;
 }

 compiled with dmd2 into file with size of  266268 bytes.
 Even after UPX, it's size is 87552 bytes.
 Size of this code,compiled with VS(yes,yes, C++), is 6 656 bytes.
 Compiler add's standard library  to file, am i right?
 Is there some optimization which delete unused code from file?
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Maybe most of them don't know that it's only a constant overhead. On the other hand, template bloat can inflate binaries surprisingly much. For example, the unlinked object file of "hello world" in D1 is 2.3 KB, while in D2, it's 36 KB. That's because writefln() is a template in D2. (The final D2 executable is also almost twice the size of the D1 one, although it's questionable how much of the additional size is due to templates.)
Jan 09 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
 (The final executable is almost twice the size as the D2 one too, 
 although it's questionable how much of the additional size is due to 
 templates.)
Finding out why an executable is large is as easy as compiling with -L/map on Windows and -L-map -Lfoo.map on Linux/OSX, and then examining the resulting map file which will tell you each and every symbol in the executable and how large it is.
Jan 09 2010
parent reply grauzone <none example.net> writes:
Walter Bright wrote:
 grauzone wrote:
 (The final executable is almost twice the size as the D2 one too, 
 although it's questionable how much of the additional size is due to 
 templates.)
Finding out why an executable is large is as easy as compiling with -L/map on Windows and -L-map -Lfoo.map on Linux/OSX, and then examining the resulting map file which will tell you each and every symbol in the executable and how large it is.
Yes, but making sense out of the raw data is another thing.
Jan 09 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
 Walter Bright wrote:
 grauzone wrote:
 (The final executable is almost twice the size as the D2 one too, 
 although it's questionable how much of the additional size is due to 
 templates.)
Finding out why an executable is large is as easy as compiling with -L/map on Windows and -L-map -Lfoo.map on Linux/OSX, and then examining the resulting map file which will tell you each and every symbol in the executable and how large it is.
Yes, but making sense out of the raw data is another thing.
What's hard about "function foo is in the executable, and consumes 421 bytes"? All a linker does is concatenate the bytes of your generated code together from the various obj files and write it out. There's no magic going on.
Jan 10 2010
parent reply grauzone <none example.net> writes:
Walter Bright wrote:
 grauzone wrote:
 Walter Bright wrote:
 grauzone wrote:
 (The final executable is almost twice the size as the D2 one too, 
 although it's questionable how much of the additional size is due to 
 templates.)
Finding out why an executable is large is as easy as compiling with -L/map on Windows and -L-map -Lfoo.map on Linux/OSX, and then examining the resulting map file which will tell you each and every symbol in the executable and how large it is.
Yes, but making sense out of the raw data is another thing.
What's hard about "function foo is in the executable, and consumes 421 bytes"?
If an executable has > 10000 symbols, it's hard to find out what's actually causing the overhead. If you have a script that categorizes symbol types by demangling the symbol names and creates statistics from that, please post it.
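A minimal sketch of such a script in D follows; it assumes the map file has already been reduced to lines of the form "mangled_name size_in_decimal" (real OPTLINK and ld map formats differ, so the parsing and the category names below are only placeholders to adjust):

import std.array, std.conv, std.demangle, std.stdio, std.string;

void main()
{
    size_t[string] totals;    // bytes per rough category

    foreach (line; stdin.byLine())
    {
        // Assumed input: "mangled_name size" per line; adapt to your map format.
        auto parts = line.idup.strip().split();
        if (parts.length < 2)
            continue;
        auto name = demangle(parts[0]);   // readable D name, unchanged if not a D symbol
        auto size = to!size_t(parts[1]);

        // Very rough categorization by substring of the demangled name.
        string category = "other";
        if (indexOf(name, "TypeInfo") >= 0)
            category = "TypeInfo";
        else if (indexOf(name, "std.") == 0)
            category = "Phobos";
        else if (indexOf(name, "core.") == 0 || indexOf(name, "object.") == 0)
            category = "druntime";

        if (auto p = category in totals)
            *p += size;
        else
            totals[category] = size;
    }

    foreach (category, bytes; totals)
        writefln("%-10s %10s bytes", category, bytes);
}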
 All a linker does is concatenate the bytes of your generated code 
 together from the various obj files and write it out. There's no magic 
 going on.
If it's so simple, then why does OPTLINK fail so hard? I can only guess how many people are turning away from D just because they have to deal with OPTLINK's inability to deal with COFF. They end up trying to compile libraries with dmc etc., just to get it linked.
Jan 10 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
 Walter Bright wrote:
 What's hard about "function foo is in the executable, and consumes 421 
 bytes"?
If an executable has > 10000 symbols, it's hard to find out what's actually causing overhead. If you have a script, that categorizes symbol types by demangling the symbol names and creates statistics based on this or so, please post.
It's really not that hard to just look at. The map file even sorts it for you, by name and by location.
 All a linker does is concatenate the bytes of your generated code 
 together from the various obj files and write it out. There's no magic 
 going on.
If it's so simple, then why does OPTLINK fail so hard? I can only guess how many people are turning away from D just because they have to deal with OPTLINK's inability to deal with COFF. They end up trying to compile libraries with dmc etc., just to get it linked.
The file formats are complicated. The concept of what the linker does is trivial, especially when we're talking about "what consumes space in the exe file".
Jan 10 2010
parent reply retard <re tard.com.invalid> writes:
Sun, 10 Jan 2010 13:05:12 -0800, Walter Bright wrote:

 grauzone wrote:
 Walter Bright wrote:
 What's hard about "function foo is in the executable, and consumes 421
 bytes"?
If an executable has > 10000 symbols, it's hard to find out what's actually causing overhead. If you have a script, that categorizes symbol types by demangling the symbol names and creates statistics based on this or so, please post.
It's really not that hard to just look at. The map file even sorts it for you, by name and by location.
 All a linker does is concatenate the bytes of your generated code
 together from the various obj files and write it out. There's no magic
 going on.
If it's so simple, then why does OPTLINK fail so hard? I can only guess how many people are turning away from D just because they have to deal with OPTLINK's inability to deal with COFF. They end up trying to compile libraries with dmc etc., just to get it linked.
The file formats are complicated. The concept of what the linker does is trivial, especially when we're talking about "what consumes space in the exe file".
If you take for example GNU ld from binutils, it's not that trivial. It even has its own scripting language. If the object file format is complicated, the linker has to support at least the most important parts of the spec. ld happens to support several architectures, several operating systems, several language mangling conventions, and several object file formats. The architecture has pluggable components and it comes with tons of switches for various kinds of tuning options. As a whole, it takes a couple of weeks of intensive learning to really master the tool. Of course some guidance helps, but the documentation isn't very good. For instance, setting the entry point on architectures with no operating system may bloat the executable if you don't know what you're doing.
Jan 10 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
retard wrote:
 The file formats are complicated. The concept of what the linker does is
 trivial, especially when we're talking about "what consumes space in the
 exe file".
If you take for example GNU ld from binutils, it's not that trivial. It even has its own scripting language. If the object file format is complicated, the linker has to support most important parts of the spec, at least. ld happens to support several architectures, several operating systems, several language mangling conventions, and several object file formats. The architecture has pluggable components and it comes with tons of switches for various kinds of tuning options. As a whole, it takes couple of weeks of intensive learning to really master the tool. Of course some guidance helps, but the documentation isn't very good. For instance setting the entry point on architectures with no operating system may bloat the executable, if you don't know what you're doing.
Name mangling conventions have nothing to do with bloat, neither do object file formats, byte order, etc. What a linker does *is* conceptually trivial - it merely concatenates the binary data in the object files together and writes it out. At its core, you could conceivably design an object format and have the linker *actually* just concatenate those files to form an executable. The original MS-DOS executable file format wasn't even a file format, it was nothing more than binary data that was copied into memory and blindly jumped to. If you want to know where the size in your exe file is coming from, the map file will tell you - broken down by each name and its associated size. Maybe the problem is the man page for ld, which has a truly bewildering quantity of obtuse options, confusing people about what a linker actually does.
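As a toy illustration of that concatenation model (only the concept, not a real object file format), here is a D sketch that appends its input files into one image and records each blob's offset and size - essentially the table a map file gives you:

import std.file, std.stdio;

void main(string[] args)
{
    ubyte[] image;
    foreach (name; args[1 .. $])
    {
        auto blob = cast(ubyte[]) std.file.read(name);
        // A map file is essentially this table: what ended up where, and how big it is.
        writefln("%-24s offset %8s  size %8s", name, image.length, blob.length);
        image ~= blob;              // "linking" = appending the raw bytes
    }
    std.file.write("image.bin", image);
}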
Jan 11 2010
next sibling parent reply grauzone <none example.net> writes:
Walter Bright wrote:
 If you want to know where the size in your exe file is coming from, the 
 map file will tell you - broken down by each name and its associated size.
Can you tell me how many bytes all TypeInfos use up in libphobos? (Without disabling codegen for TypeInfos and comparing the final file sizes *g*.)
 Maybe the problem is the man page for ld, which has a truly bewildering 
 quantity of obtuse options, confusing people about what a linker 
 actually does.
Jan 11 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
 Walter Bright wrote:
 If you want to know where the size in your exe file is coming from, 
 the map file will tell you - broken down by each name and its 
 associated size.
Can you tell me how many bytes all TypeInfos use up in libphobos? (Without disabling codegen for TypeInfos and comparing the final file sizes *g*.)
I thought we were talking about exe files, not library files. I don't think anyone cares how much space a library consumes. The TypeInfos each use nearly the same space; you could estimate the total by grepping for TypeInfo to get a count and multiplying by the size.
Jan 11 2010
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Walter Bright wrote:
 Maybe the problem is the man page for ld, which has a truly bewildering 
 quantity of obtuse options, confusing people about what a linker 
 actually does.
Amusingly, it also contains the following phrase: "The linker supports a plethora of command-line options, but in actual practice few of them are used in any particular context." :) -Lars
Jan 11 2010
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:hiesf9$1f0v$1 digitalmars.com...
 Walter Bright wrote:
 Maybe the problem is the man page for ld, which has a truly bewildering 
 quantity of obtuse options, confusing people about what a linker actually 
 does.
Amusingly, it also contains the following phrase: "The linker supports a plethora of command-line options, but in actual practice few of them are used in any particular context." :)
The GNU compilation tools have a lot of WTFs. My favorite: "We think that they're only done this way for historical reasons, but we aren't sure." - From the GCC Documentation (http://gcc.gnu.org/projects/beginner.html)
Jan 11 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Nick Sabalausky wrote:
 The GNU compilation tools have a lot of WTFs. My favorite:
 
 "We think that they're only done this way for historical reasons, but we 
 aren't sure."
   - From the GCC Documentation (http://gcc.gnu.org/projects/beginner.html)
I prefer the honesty of such answers rather than some made-up bs.
Jan 11 2010
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Lars T. Kyllingstad wrote:
 Walter Bright wrote:
 Maybe the problem is the man page for ld, which has a truly 
 bewildering quantity of obtuse options, confusing people about what a 
 linker actually does.
Amusingly, it also contains the following phrase: "The linker supports a plethora of command-line options, but in actual practice few of them are used in any particular context." :)
Yeah, the problem is it takes several minutes to find the one that is actually useful, like -map.
Jan 11 2010
prev sibling parent reply retard <re tard.com.invalid> writes:
Mon, 11 Jan 2010 00:54:24 -0800, Walter Bright wrote:

 retard wrote:
 The file formats are complicated. The concept of what the linker does
 is trivial, especially when we're talking about "what consumes space
 in the exe file".
If you take for example GNU ld from binutils, it's not that trivial. It even has its own scripting language. If the object file format is complicated, the linker has to support most important parts of the spec, at least. ld happens to support several architectures, several operating systems, several language mangling conventions, and several object file formats. The architecture has pluggable components and it comes with tons of switches for various kinds of tuning options. As a whole, it takes couple of weeks of intensive learning to really master the tool. Of course some guidance helps, but the documentation isn't very good. For instance setting the entry point on architectures with no operating system may bloat the executable, if you don't know what you're doing.
Name mangling conventions have nothing to do with bloat, neither do object file formats, byte order, etc. What a linker does *is* conceptually trivial - it merely concatentates the binary data in the object files together and writes it out.
To me it feels like a modern linker is more like a simple compiler than a 'cat' utility. One could argue that the compiler is also a huge concatenation system which concatenates "sections" (statements and expressions) and produces executable code for the linker. My linker of choice has a frontend for the scripting language, several translation engines to convert between object file formats, byte orders etc. It also does optimization and modifies executable code & symbol names when needed/asked. It does a kind of conditional compilation, since I can pack many versions of the same code into the executable. In addition the linker can do stuff like code injection and link time evaluation (even ld can do that).
 
 At its core, you could conceivably design an object format and have the
 linker *actually* just concatenate those files to form an executable.
 The original MS-DOS executable file format wasn't even a file format, it
 was nothing more than binary data that was copied into memory and
 blindly jumped to.
I've been using *nix since I learned to read. I couldn't be less interested in legacy cp/m or m$ crap.
 
 If you want to know where the size in your exe file is coming from, the
 map file will tell you - broken down by each name and its associated
 size.
Right.
 
 Maybe the problem is the man page for ld, which has a truly bewildering
 quantity of obtuse options, confusing people about what a linker
 actually does.
The man page only shows a basic set of switches to use the linker. In reality you need lots of other documentation just to write linker scripts or when writing operating systems or programs for embedded platforms.
Jan 11 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
retard wrote:
 To me it feels like a modern linker is more a simple compiler than a 
 'cat' utility.
That may be true if the linker is doing JITting or some such, but Optlink and ld do nothing like that and do not do anything resembling what a compiler does.
 One could argue that also the compiler is a huge 
 concatenation system which concatenates "sections" (statements and 
 expressions) and produces executable code for the linker.
That is not a useful mental model of what a compiler does. I'm trying to impart a mental model of what the linker does that is useful in that it makes sense of what goes in the linker and what comes out of it.
 My linker of 
 choice has a frontend for the scripting language, several translation 
 engines to convert between object file formats, byte orders etc. It also 
 does optimization and modifies executable code & symbol names when needed/
 asked. It does kind of conditional compilation since I can pack many 
 versions of the same code in to the executable. In addition linker can do 
 stuff like code injection and link time evaluation (even ld can do that).
No wonder you're confused about what a linker does! Yes, some linkers do those things. No, they are NOT the prime thing that it does. The prime thing a linker does is concatenate blocks of data together and write them out. Controlling the order of those sections, dealing with byte orders, file formats, etc., is all detail.
 At its core, you could conceivably design an object format and have the
 linker *actually* just concatenate those files to form an executable.
 The original MS-DOS executable file format wasn't even a file format, it
 was nothing more than binary data that was copied into memory and
 blindly jumped to.
I've been using *nix since I learned to read. I couldn't be more interested in legacy cp/m or m$ crap.
I think early executable formats for unix (and other machines of that day) were pretty much the same. Over time, complexity got layered on, but the fundamentals never changed.
 The man page only shows a basic set of switches to use the linker. In 
 reality you need lots of other documentation just to write linker scripts 
 or when writing operating systems or programs for embedded platforms.
You don't need to know any of that to figure out what is consuming space in your exe file. (BTW, exe files for embedded systems tend to be nothing more than binary data to be blown into EPROMs - blindly jumped to by the microprocessor.)
Jan 11 2010
prev sibling next sibling parent reply retard <re tard.com.invalid> writes:
Sat, 09 Jan 2010 19:44:07 +0100, grauzone wrote:

 Nick Sabalausky wrote:
 "Ph" <romanua gmail.com> wrote in message
 news:hia7qc$b5k$1 digitalmars.com...
 Why a generated file is so huge?
 "Empty" program such as:

 int main(char[][] args)
 {

 return 0;
 }

 compiled with dmd2 into file with size of  266268 bytes. Even after
 UPX, it's size is 87552 bytes. Size of this code,compiled with
 VS(yes,yes, C++), is 6 656 bytes. Compiler add's standard library  to
 file, am i right? Is there some optimization which delete unused code
 from file?
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Maybe most of them don't know that it's only constant overheads.
Are you sure it's a constant overhead? I've written a few thousand lines of D, and the binaries always seem to grow quite fast compared to the same code ported to C++. E.g. if I link against some GUI lib, the hello world window+label grows to 2..5 MB. In Java the same app using Swing is still only a few kilobytes (label + window + procedure to close the app is 1.2 kB, to be precise). Note that the Java app provides even better runtime reflection capabilities than D can. I would imagine a larger program that uses network, sound, graphics, and some other domain-specific libraries would need a 50..100 MB binary .exe file when done in D.
Jan 09 2010
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"retard" <re tard.com.invalid> wrote in message 
news:hiavkv$1meu$1 digitalmars.com...
 Sat, 09 Jan 2010 19:44:07 +0100, grauzone wrote:

 Nick Sabalausky wrote:
 "Ph" <romanua gmail.com> wrote in message
 news:hia7qc$b5k$1 digitalmars.com...
 Why a generated file is so huge?
 "Empty" program such as:

 int main(char[][] args)
 {

 return 0;
 }

 compiled with dmd2 into file with size of  266268 bytes. Even after
 UPX, it's size is 87552 bytes. Size of this code,compiled with
 VS(yes,yes, C++), is 6 656 bytes. Compiler add's standard library  to
 file, am i right? Is there some optimization which delete unused code
 from file?
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Maybe most of them don't know that it's only constant overheads.
Are you sure it's a constant overhead? I've written few thousands of lines in D and it always seems that if you port the same code to C++, to grow quite fast. E.g. if I link against some GUI lib, the hello world window+label grows to 2..5 MB. In Java the same app using Swing is still only a few kilobytes (label + window + procedure to close the app is 1.2 kB to be precise). Note that the Java app provides even better runtime reflection capabilities that D can. I would imagine a larger program that uses network, sound, graphics, and some other domain specific libraries would need a 50..100 MB binary .exe file when done in D.
I'd rather use an app that did a bunch of compile-time reflection than one that did a bunch of run-time reflection. And I think that 50..100 MB figure seems quite exaggerated unless you're packing all those art+sound assets into the exe itself (or if you're using that one GUI lib that's been known to result in really inflated exe's, forget which one that was...).
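For a concrete picture of the difference, a minimal D sketch of compile-time reflection (the struct here is made up for illustration); the member list is walked by the compiler, not at run time:

import std.stdio;

struct Point { int x, y; }

void main()
{
    // __traits(allMembers, ...) is evaluated during compilation; this
    // foreach is unrolled over the member names "x" and "y", so no
    // reflection tables need to be carried around in the executable.
    foreach (member; __traits(allMembers, Point))
        writeln(member);
}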
Jan 09 2010
next sibling parent reply retard <re tard.com.invalid> writes:
Sat, 09 Jan 2010 18:15:44 -0500, Nick Sabalausky wrote:

 "retard" <re tard.com.invalid> wrote in message
 news:hiavkv$1meu$1 digitalmars.com...
 Sat, 09 Jan 2010 19:44:07 +0100, grauzone wrote:

 Nick Sabalausky wrote:
 "Ph" <romanua gmail.com> wrote in message
 news:hia7qc$b5k$1 digitalmars.com...
 Why a generated file is so huge?
 "Empty" program such as:

 int main(char[][] args)
 {

 return 0;
 }

 compiled with dmd2 into file with size of  266268 bytes. Even after
 UPX, it's size is 87552 bytes. Size of this code,compiled with
 VS(yes,yes, C++), is 6 656 bytes. Compiler add's standard library 
 to file, am i right? Is there some optimization which delete unused
 code from file?
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Maybe most of them don't know that it's only constant overheads.
Are you sure it's a constant overhead? I've written few thousands of lines in D and it always seems that if you port the same code to C++, to grow quite fast. E.g. if I link against some GUI lib, the hello world window+label grows to 2..5 MB. In Java the same app using Swing is still only a few kilobytes (label + window + procedure to close the app is 1.2 kB to be precise). Note that the Java app provides even better runtime reflection capabilities that D can. I would imagine a larger program that uses network, sound, graphics, and some other domain specific libraries would need a 50..100 MB binary .exe file when done in D.
I'd rather use an app that did a bunch of compile-time reflection than one that did a bunch of run-time reflection. And I think that 50..100 MB figure seems quite exaggerated unless you're packing all those art+sound assets into the exe itself (or if you're using that one GUI lib that's been known to result in really inflated exe's, forget which one that was...).
I've tried both - GTK+ bindings and the SWT port by frank benoit. Both are HUEG
Jan 09 2010
parent Bane <branimir.milosavljevic gmail.com> writes:
You can't beat DFL (D Forms Library) in that regard. No overhead. And it looks
great.

retard Wrote:

 Sat, 09 Jan 2010 18:15:44 -0500, Nick Sabalausky wrote:
 
 "retard" <re tard.com.invalid> wrote in message
 news:hiavkv$1meu$1 digitalmars.com...
 Sat, 09 Jan 2010 19:44:07 +0100, grauzone wrote:

 Nick Sabalausky wrote:
 "Ph" <romanua gmail.com> wrote in message
 news:hia7qc$b5k$1 digitalmars.com...
 Why a generated file is so huge?
 "Empty" program such as:

 int main(char[][] args)
 {

 return 0;
 }

 compiled with dmd2 into file with size of  266268 bytes. Even after
 UPX, it's size is 87552 bytes. Size of this code,compiled with
 VS(yes,yes, C++), is 6 656 bytes. Compiler add's standard library 
 to file, am i right? Is there some optimization which delete unused
 code from file?
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Maybe most of them don't know that it's only constant overheads.
Are you sure it's a constant overhead? I've written few thousands of lines in D and it always seems that if you port the same code to C++, to grow quite fast. E.g. if I link against some GUI lib, the hello world window+label grows to 2..5 MB. In Java the same app using Swing is still only a few kilobytes (label + window + procedure to close the app is 1.2 kB to be precise). Note that the Java app provides even better runtime reflection capabilities that D can. I would imagine a larger program that uses network, sound, graphics, and some other domain specific libraries would need a 50..100 MB binary .exe file when done in D.
I'd rather use an app that did a bunch of compile-time reflection than one that did a bunch of run-time reflection. And I think that 50..100 MB figure seems quite exaggerated unless you're packing all those art+sound assets into the exe itself (or if you're using that one GUI lib that's been known to result in really inflated exe's, forget which one that was...).
I've tried both - GTK+ bindings and the SWT port by frank benoit. Both are HUEG
Jan 10 2010
prev sibling parent grauzone <none example.net> writes:
Nick Sabalausky wrote:
 I'd rather use an app that did a bunch of compile-time reflection than one 
 that did a bunch of run-time reflection. And I think that 50..100 MB figure 
The question is: what will cause more overhead? Compile time or runtime reflection? For some use cases, you'll have compile time reflection to generate the runtime type information, which can only cause overhead. The situation would be fine if the code used for generating the runtime info were run only at compile time, but in reality the code ends up in the final executable, causing severe overhead. Plus it will increase compile times anyway.
 seems quite exaggerated unless you're packing all those art+sound assets 
 into the exe itself (or if you're using that one GUI lib that's been known 
 to result in really inflated exe's, forget which one that was...).
 
 
Jan 10 2010
prev sibling parent reply div0 <div0 users.sourceforge.net> writes:
retard wrote:
<snip>
 E.g. if I link against some GUI lib, the hello world window+label grows 
 to 2..5 MB. In Java the same app using Swing is still only a few 
 kilobytes (label + window + procedure to close the app is 1.2 kB to be 
 precise). 
Yeah but that's because Swing (and everything else as well, come to think of it) is in the class library, which is stored in the java runtime directory rather than being linked into the app. The java runtime is 70MB btw. For some bizarre reason, I've got 600 MB of different Java runtimes on my machine, even though I've only got one Java application installed that I use. God knows where the rest came from. -- My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk
Jan 09 2010
parent retard <re tard.com.invalid> writes:
Sat, 09 Jan 2010 23:51:51 +0000, div0 wrote:

 retard wrote:
 <snip>
 E.g. if I link against some GUI lib, the hello world window+label grows
 to 2..5 MB. In Java the same app using Swing is still only a few
 kilobytes (label + window + procedure to close the app is 1.2 kB to be
 precise).
Yeah but that's because Swing (and everything else as well come to think of it) is in the class library which is stored in the java run time directory rather than it being linked into the app.
GTK+ is also a third party library. I can easily use it in any language and the resulting binary will be really small. In D a small hello world GTK+ GUI application was 2 MB, IIRC, when I used the bindings available somewhere.
 The java runtime is 70MB btw.
And D depends on some basic C libraries, I might guess. Agreed, not everyone has Java installed, but when the end users have it, it's a large advantage to ship small binaries. If you host the binaries on the web, you can cut off 95% of the traffic expenses pretty easily by shrinking the distributables. It's by all means ok to have a few bloaty applications on a system, but imagine if all programs started to consume 1000x as much space as now; software like Windows 7 would require 10 TB of disk space.
 
 For some bizarre reason, I've got 600 Mb of different Java runtimes on
 my machine, even though I've only got one Java application installed
 that I use. God knows where the rest came from.
That's because you use Windows... on *nixen only one instance is stored - thanks to sane package managers.
Jan 09 2010
prev sibling parent BLS <windevguy hotmail.de> writes:
On 09/01/2010 19:44, grauzone wrote:
 On the other hand, template bloat can inflate binaries surprisingly
 much. For example, the unlinked object file of "hello world" in D1 is
 2.3 KB, while in D2, it's 36 KB. That's because writefln() is a template
 in D2. (The final executable is almost twice the size as the D2 one too,
 although it's questionable how much of the additional size is due to
 templates.)
simple generics are smart.
Jan 09 2010
prev sibling parent reply "Chris" <invalid invalid.invalid> writes:
"Nick Sabalausky":
 "Ph":
 Why a generated file is so huge? [...]
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Execution speed perhaps, since the time elapsed is proportional to the number of processor instructions executed. This explains why some people (for certain time critical apps) do not even take the step from C to C++, and choose to stay 20 years behind "modern" languages. D presented itself as a high level language suitable for system programming, so executable sizes must be taken into consideration, imho. Year after year I see the sizes of overbloated executables grow with no proportional added substance. I am totally shocked when, merely by adding a reference to an external library, my program burns the space of an entire computer of the good old days. I simply can't get used to it, and probably never will; nor will anyone who used to code in low-level languages, since they know how small a program can really be.
Jan 10 2010
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Chris (invalid invalid.invalid)'s article
 "Nick Sabalausky":
 "Ph":
 Why a generated file is so huge? [...]
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Execution speed perhaps, since the time elapsed is proportional to the number of processor instruction executed.
Right, but not all of the code is instructions that are executed. Some of it's static data that's never used. Some is instructions that are never executed and should be thrown out by the linker.
Jan 10 2010
prev sibling next sibling parent reply retard <re tard.com.invalid> writes:
Sun, 10 Jan 2010 12:25:16 +0100, Chris wrote:

 "Nick Sabalausky":
 "Ph":
 Why a generated file is so huge? [...]
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Execution speed perhaps, since the time elapsed is proportional to the number of processor instruction executed. This explains why some people (for certain time critical apps) do not even take the step from C to C++, and chose to stay 20 year behind "modern" languages. D presented itself being a high level language suitable for system programming, so executable sizes must be taken into consideration, imho. Year after year I see the sizes of overbloated executables to grow with non-proportional added substance. I am totally shocked when for only to add a reference to an external library, my program burn the space of an entire computer of old good days. I simply can't get used to it, and probably never will for anyone who used to code in low-level languages, since they know how much a program size can really be.
What's funny is that more and more computation can be done with a single instruction because of SSE1-4.2/MMX. Also register sizes grow so computation does not need to be split into many registers because of overflow issues. Also CPUs get faster so a tighter algorithm with a bit slower performance could be used instead. Unfortunately computer programs seem to inflate over time. A typical program doubles its size in 2-3 years. I would understand this if a tradeoff was made between size and performance but unfortunately many programs also perform worse than before. There are exceptions such as the linux kernel - IIRC it fit in a 1.4MB floppy along with a basic set of userspace programs. Nowadays, 15 years later, my hand-built kernel is about 2.5 .. 3x larger. On the other hand it supports more hardware now. I used to have drivers for 4x read only cd, vesa video, sound blaster 16, iomega zip, floppy, and parallel printer. Nowadays I have 2-3 times as many devices connected to the PC and most of them are much more advanced - bi-directional printer link, dvd-rw etc.
Jan 10 2010
parent reply Justin Johansson <no spam.com> writes:
Generally speaking on the substance of the remarks on this thread (as 
below; retard et al.) ...

especially ...
 Unfortunately computer programs seem to inflate over time. A typical
 program doubles its size in 2-3 years. I would understand this if a
 tradeoff was made between size and performance but unfortunately many
 programs also perform worse than before.
The bloat is called marketing and is the hallmark of a capitalistic, consumerist, non-green and resource-unsustainable society. Happy New Year, Justin Johansson retard wrote:
 Sun, 10 Jan 2010 12:25:16 +0100, Chris wrote:
 
 "Nick Sabalausky":
 "Ph":
 Why a generated file is so huge? [...]
That's not even a third of a megabyte, why does this keep getting brought up as an issue by so many people?
Execution speed perhaps, since the time elapsed is proportional to the number of processor instruction executed. This explains why some people (for certain time critical apps) do not even take the step from C to C++, and chose to stay 20 year behind "modern" languages. D presented itself being a high level language suitable for system programming, so executable sizes must be taken into consideration, imho. Year after year I see the sizes of overbloated executables to grow with non-proportional added substance. I am totally shocked when for only to add a reference to an external library, my program burn the space of an entire computer of old good days. I simply can't get used to it, and probably never will for anyone who used to code in low-level languages, since they know how much a program size can really be.
What's funny is that more and more computation can be done with a single instruction because of SSE1-4.2/MMX. Also register sizes grow so computation does not need to be split into many registers because of overflow issues. Also CPUs get faster so a tighter algorithm with a bit slower performance could be used instead. Unfortunately computer programs seem to inflate over time. A typical program doubles its size in 2-3 years. I would understand this if a tradeoff was made between size and performance but unfortunately many programs also perform worse than before. There are exceptions such as the linux kernel - IIRC it fit in a 1.4MB floppy along with a basic set of userspace programs. Nowadays, 15 years later, my hand-built kernel is about 2.5 .. 3x larger. On the other hand it supports more hardware now. I used to have drivers for 4x read only cd, vesa video, sound blaster 16, iomega zip, floppy, and parallel printer. Nowadays I have 2-3 times as many devices connected to the PC and most of them are much more advanced - bi-directional printer link, dvd-rw etc.
Jan 13 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Justin Johansson wrote:
 Generally speaking on the substance of the remarks on this thread (as 
 below; retard et. al) ...
 
 especially ...
  > Unfortunately computer programs seem to inflate over time. A typical
  > program doubles its size in 2-3 years. I would understand this if a
  > tradeoff was made between size and performance but unfortunately many
  > programs also perform worse than before.
 
 
 The blot is called marketing and is the hallmark of a capitalistic, 
 consumerist, non-green and resource-unsustainable society.
It's generally a problem with the difference between what people say they want and what they'll spend money on. They say they want a stripper but over and over they buy the fully optioned version. A few years ago, I was looking to buy a pickup truck but instead got a used commercial van. It's very interesting how different it is from a consumer van. The commercial one is a "stripper" - nothing but what it needs to get the job done. No radio, no stereo, no cupholder, no electric windows, no A/C, no heated seats, no glove box, no courtesy lights, no cruise control, no chrome, no badges, no trim, no nothing but what is needed to do its job. It's actually kind of neat-o. You can't buy anything like that in the consumer catalog. (Back in the 80's, the Japanese car companies discovered that sales increased if all the "options" were rolled into the base configuration.) The same goes for most consumer items. When was the last time you didn't prefer buying a phone with the longest feature list?
Jan 13 2010
next sibling parent retard <re tard.com.invalid> writes:
Wed, 13 Jan 2010 13:11:55 -0800, Walter Bright wrote:

 Justin Johansson wrote:
 Generally speaking on the substance of the remarks on this thread (as
 below; retard et. al) ...
 
 especially ...
  > Unfortunately computer programs seem to inflate over time. A typical
  > program doubles its size in 2-3 years. I would understand this if a
  > tradeoff was made between size and performance but unfortunately
  > many programs also perform worse than before.
 
 
 The blot is called marketing and is the hallmark of a capitalistic,
 consumerist, non-green and resource-unsustainable society.
It's generally a problem with the difference between what people say they want and what they'll spend money on. They say they want a stripper but over and over they buy the fully optioned version. I few years ago, I was looking to buy a pickup truck but instead got a used commercial van. It's very interesting how different it is from a consumer van. The commercial one is a "stripper" - nothing but what it needs to get the job done. No radio, no stereo, no cupholder, no electric windows, no A/C, no heated seats, no glove box, no courtesy lights, no cruise control, no chrome, no badges, no trim, no nothing but what is needed to do its job. It's actually kind of neat-o. You can't buy anything like that in the consumer catalog. (Back in the 80's, the Japanese car companies discovered that sales increased if all the "options" were rolled into the base configuration.) The same goes for most consumer items. When was the last time you didn't prefer buying a phone with the longest feature list?
I actually prefer smartphones with smaller power consumption. A slower CPU and fewer features are better if you can increase the active uptime from 3 hours to one week. The worst smartphone I've had needed to be recharged 3 times a day because the buggy applications drained the battery almost immediately.
Jan 13 2010
prev sibling next sibling parent retard <re tard.com.invalid> writes:
Wed, 13 Jan 2010 13:11:55 -0800, Walter Bright wrote:

 Justin Johansson wrote:
 Generally speaking on the substance of the remarks on this thread (as
 below; retard et. al) ...
 
 especially ...
  > Unfortunately computer programs seem to inflate over time. A typical
  > program doubles its size in 2-3 years. I would understand this if a
  > tradeoff was made between size and performance but unfortunately
  > many programs also perform worse than before.
 
 
 The blot is called marketing and is the hallmark of a capitalistic,
 consumerist, non-green and resource-unsustainable society.
It's generally a problem with the difference between what people say they want and what they'll spend money on. They say they want a stripper but over and over they buy the fully optioned version. I few years ago, I was looking to buy a pickup truck but instead got a used commercial van. It's very interesting how different it is from a consumer van. The commercial one is a "stripper" - nothing but what it needs to get the job done. No radio, no stereo, no cupholder, no electric windows, no A/C, no heated seats, no glove box, no courtesy lights, no cruise control, no chrome, no badges, no trim, no nothing but what is needed to do its job. It's actually kind of neat-o. You can't buy anything like that in the consumer catalog. (Back in the 80's, the Japanese car companies discovered that sales increased if all the "options" were rolled into the base configuration.) The same goes for most consumer items. When was the last time you didn't prefer buying a phone with the longest feature list?
But in any case the car analogy fails here. There are no open source cars. You have lots of choice when choosing applications. I prefer lightweight applications even on this >3.5 GHz Core i7: rxvt or xterm over gnome-terminal, pan over thunderbird, awesome over metacity, etc. The system feels lightning fast. I have no DRM to worry about. I can also easily get rid of all the user friendly crapware that Windows users have to endure. Unfortunately it seems I have a hard time evading Wirth's law, since most programs get larger and larger. If this trend continues, there is a physical limit on hardware capabilities, but applications will still continue on their road to doom.
Jan 13 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 Justin Johansson wrote:
 Generally speaking on the substance of the remarks on this thread (as 
 below; retard et. al) ...

 especially ...
  > Unfortunately computer programs seem to inflate over time. A typical
  > program doubles its size in 2-3 years. I would understand this if a
  > tradeoff was made between size and performance but unfortunately many
  > programs also perform worse than before.


 The blot is called marketing and is the hallmark of a capitalistic, 
 consumerist, non-green and resource-unsustainable society.
It's generally a problem with the difference between what people say they want and what they'll spend money on. They say they want a stripper
At this point in the sentence I got really interested...
 but over and over they buy the fully optioned version.
...to only get disappointed. Andrei
Jan 13 2010
parent reply "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:hilm2u$dlq$1 digitalmars.com...
 Walter Bright wrote:
 It's generally a problem with the difference between what people say they 
 want and what they'll spend money on. They say they want a stripper
At this point in the sentence I got really interested...
 but over and over they buy the fully optioned version.
...to only get disappointed.
What, you don't like fully-optioned strippers?
Jan 13 2010
parent "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:him8pd$1fsq$1 digitalmars.com...
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
 news:hilm2u$dlq$1 digitalmars.com...
 Walter Bright wrote:
 It's generally a problem with the difference between what people say 
 they want and what they'll spend money on. They say they want a stripper
At this point in the sentence I got really interested...
 but over and over they buy the fully optioned version.
...to only get disappointed.
What, you don't like fully-optioned strippers?
Heck, those are the best kind, or at least when all those options don't cause a lot of bloat.
Jan 13 2010
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Chris wrote:
 I simply can't get used to it, and probably never will for anyone who
 used to code in low-level languages, since they know how much
 a program size can really be.
I downloaded a program from cpuid.com to tell me what processor I'm running. The executable file size is 1.8 MB.
Jan 10 2010
parent reply Leandro Lucarella <llucax gmail.com> writes:
Walter Bright, el 10 de enero a las 13:06 me escribiste:
 Chris wrote:
I simply can't get used to it, and probably never will for anyone who
used to code in low-level languages, since they know how much
a program size can really be.
I downloaded a program from cpuid.com to tell me what processor I'm running. The executable file size is 1.8 Mb.
Well, if it's on the internet I'm sure it's good! Come on, other people making crap doesn't mean making more crap is justified :) -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Give your hand to the monkey, but not your elbow, since a monkey that takes liberties is irreversible. -- Ricardo Vaporeso. La Reja, August 1912.
Jan 11 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Leandro Lucarella wrote:
 Walter Bright, el 10 de enero a las 13:06 me escribiste:
 Chris wrote:
 I simply can't get used to it, and probably never will for anyone who
 used to code in low-level languages, since they know how much
 a program size can really be.
I downloaded a program from cpuid.com to tell me what processor I'm running. The executable file size is 1.8 Mb.
Well, if it's in the internet I'm sure is good! Come on, other people making crap doesn't mean making more crap is justified :)
It's actually a nice program. My point was that the era of tiny executables has long since passed.
Jan 11 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Walter Bright (newshound1 digitalmars.com)'s article
 It's actually a nice program. My point was that the era of tiny
 executables has long since passed.
<rant> Vote++. I'm convinced that there's just a subset of programmers out there who will not use any high-level programming model, no matter how much easier it makes life, unless they're convinced it has **zero** overhead compared to the crufty old C way. Not negligible overhead, not practically insignificant overhead for their use case, not zero overhead in terms of whatever their most constrained resource is but nonzero overhead in terms of other resources, but zero overhead, period. Then there are those who won't make any tradeoff in terms of safety, encapsulation, readability, modularity, maintainability, etc., even if it means their program runs 15x slower. Why can't more programmers take a more pragmatic attitude towards efficiency (among other things)? Yes, no one wants to just gratuitously squander massive resources, but is a binary that is a few hundred kilobytes (fine, even a few megabytes, given how cheap bandwidth and storage are nowadays) larger really going to make or break your app, especially if you get it working faster and/or with fewer bugs than you would have using some cruftier, older, lower level language that produces smaller binaries? </rant>
Jan 11 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
dsimcha wrote:
 Vote++.  I'm convinced that there's just a subset of programmers out there that
 will not use any high-level programming model, no matter how much easier it makes
 life, unless they're convinced it has **zero** overhead compared to the crufty old
 C way.  Not negligible overhead, not practically insignificant overhead for their
 use case, not zero overhead in terms of whatever their most constrained resource
 is but nonzero overhead in terms of other resources, but zero overhead, period.
 
 Then there are those who won't make any tradeoff in terms of safety,
 encapsulation, readability, modularity, maintainability, etc., even if it means
 their program runs 15x slower.  Why can't more programmers take a more pragmatic
 attitude towards efficiency (among other things)?  Yes, noone wants to just
 gratuitously squander massive resources, but is a few hundred kilobytes (fine,
 even a few megabytes, given how cheap bandwidth and storage are nowadays) larger
 binary really going to make or break your app, especially if you get it working
 faster and/or with less bugs than you would have using some cruftier, older, lower
 level language that produces smaller binaries?
I agree that a lot of the concerns are based on obsolete notions. First off, I just bought another terabyte drive for $90. The first hard drive I bought was $600 for 10Mb. A couple years earlier I used a 10Mb drive that cost $5000. If I look at what eats space on my lovely terabyte drive, it ain't executables. It's music and pictures. I'd be very surprised if I had a whole CD's worth of exe files. Next, even a very large executable doesn't necessarily run any slower than a small one. The reason is the magic of demand paged virtual memory. Executables are NOT loaded into memory before running. They are memory-mapped in. Only code that is actually executed is EVER loaded into memory. You can actually organize the layout of code in the exe file so that it loads very fast, by putting functions that call each other next to each other, and grouping rarely executed code elsewhere. Optlink has the features necessary to do this, and the -profile switch can output a file to drive Optlink to do the necessary layouts. Other languages do appear to have smaller executables, but that's often because the runtime library is dynamically linked, not statically linked, and is not counted as part of the executable size even though it is loaded into memory to be run. D's runtime library is still statically linked in for the pragmatic reason that static linking avoids the "dll hell" versioning problem for your customers. And lastly, it *is* possible to use D as a "C compiler" with the same overhead that C has. All you need to do is make your main a "C" main, and not link in Phobos. In fact, this is how I port dmd to new platforms before Phobos is built. Stick with "C" constructs (i.e. no dynamic arrays!) and it will work just like C does.
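A minimal sketch of such a "C" main in D (the exact compile and link flags needed to leave Phobos out vary by platform and compiler version, so none are shown here):

// No imports: nothing from Phobos is referenced, and because a C-style
// main is supplied, the D runtime's own entry point is never pulled in.
extern (C) int main()
{
    return 0;
}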
Jan 11 2010
next sibling parent reply retard <re tard.com.invalid> writes:
Mon, 11 Jan 2010 19:24:06 -0800, Walter Bright wrote:

 dsimcha wrote:
 Vote++.  I'm convinced that there's just a subset of programmers out
 there that will not use any high-level programming model, no matter how
 much easier it makes life, unless they're convinced it has **zero**
 overhead compared to the crufty old C way.  Not negligible overhead,
 not practically insignificant overhead for their use case, not zero
 overhead in terms of whatever their most constrained resource is but
 nonzero overhead in terms of other resources, but zero overhead,
 period.
 
 Then there are those who won't make any tradeoff in terms of safety,
 encapsulation, readability, modularity, maintainability, etc., even if
 it means their program runs 15x slower.  Why can't more programmers
 take a more pragmatic attitude towards efficiency (among other things)?
  Yes, noone wants to just gratuitously squander massive resources, but
 is a few hundred kilobytes (fine, even a few megabytes, given how cheap
 bandwidth and storage are nowadays) larger binary really going to make
 or break your app, especially if you get it working faster and/or with
 less bugs than you would have using some cruftier, older, lower level
 language that produces smaller binaries?
I agree that a lot of the concerns are based on obsolete notions. First off, I just bought another terabyte drive for $90. The first hard drive I bought was $600 for 10Mb. A couple years earlier I used a 10Mb drive that cost $5000. If I look at what eats space on my lovely terabyte drive, it ain't executables. It's music and pictures. I'd be very surprised if I had a whole CD's worth of exe files.
A 1 TB spinning hard disk doesn't represent the current state of the art. I have Intel SSD disks, and those are damn expensive if you e.g. start to build a safe RAID 1+0 setup. Instead of 1000 GB, an SSD at the same price comes with 8..16 GB. Suddenly application size starts to matter. For instance, my root partition seems to contain 9 GB worth of files and I've only installed a quite minimal graphical Linux environment to write some modern end-user applications.
Jan 12 2010
parent reply "Nick Sabalausky" <a a.a> writes:
"retard" <re tard.com.invalid> wrote in message 
news:hihgbe$qtl$2 digitalmars.com...
 Mon, 11 Jan 2010 19:24:06 -0800, Walter Bright wrote:

 dsimcha wrote:
 Vote++.  I'm convinced that there's just a subset of programmers out
 there that will not use any high-level programming model, no matter how
 much easier it makes life, unless they're convinced it has **zero**
 overhead compared to the crufty old C way.  Not negligible overhead,
 not practically insignificant overhead for their use case, not zero
 overhead in terms of whatever their most constrained resource is but
 nonzero overhead in terms of other resources, but zero overhead,
 period.

 Then there are those who won't make any tradeoff in terms of safety,
 encapsulation, readability, modularity, maintainability, etc., even if
 it means their program runs 15x slower.  Why can't more programmers
 take a more pragmatic attitude towards efficiency (among other things)?
  Yes, noone wants to just gratuitously squander massive resources, but
 is a few hundred kilobytes (fine, even a few megabytes, given how cheap
 bandwidth and storage are nowadays) larger binary really going to make
 or break your app, especially if you get it working faster and/or with
 less bugs than you would have using some cruftier, older, lower level
 language that produces smaller binaries?
I agree that a lot of the concerns are based on obsolete notions. First off, I just bought another terabyte drive for $90. The first hard drive I bought was $600 for 10Mb. A couple years earlier I used a 10Mb drive that cost $5000. If I look at what eats space on my lovely terabyte drive, it ain't executables. It's music and pictures. I'd be very surprised if I had a whole CD's worth of exe files.
A 1 Tb spinning hard disk doesn't represent the current state-of-the-art. I have Intel SSD disks are those are damn expensive if you e.g. start to build a safe RAID 1+0 setup. Instead of 1000 GB the same price SSD comes with 8..16 GB. Suddenly application size starts to matter. For instance, my root partition seems to contain 9 GB worth of files and I've only installed a quite minimal graphical Linux environment to write some modern end-user applications.
Not that other OSes don't have their own forms of bloat, but from what I've seen of Linux, an enormous amount of the system is stored as raw text files. I wouldn't be surprised if converting those to sensible (i.e. non-over-engineered) binary formats, or even just storing them all in a run-of-the-mill zip format, would noticeably cut down on that footprint.
Jan 12 2010
parent reply retard <re tard.com.invalid> writes:
Tue, 12 Jan 2010 05:34:49 -0500, Nick Sabalausky wrote:

 "retard" <re tard.com.invalid> wrote in message
 news:hihgbe$qtl$2 digitalmars.com...
 Mon, 11 Jan 2010 19:24:06 -0800, Walter Bright wrote:

 dsimcha wrote:
 Vote++.  I'm convinced that there's just a subset of programmers out
 there that will not use any high-level programming model, no matter
 how much easier it makes life, unless they're convinced it has
 **zero** overhead compared to the crufty old C way.  Not negligible
 overhead, not practically insignificant overhead for their use case,
 not zero overhead in terms of whatever their most constrained
 resource is but nonzero overhead in terms of other resources, but
 zero overhead, period.

 Then there are those who won't make any tradeoff in terms of safety,
 encapsulation, readability, modularity, maintainability, etc., even
 if it means their program runs 15x slower.  Why can't more
 programmers take a more pragmatic attitude towards efficiency (among
 other things)?
  Yes, noone wants to just gratuitously squander massive resources,
  but
 is a few hundred kilobytes (fine, even a few megabytes, given how
 cheap bandwidth and storage are nowadays) larger binary really going
 to make or break your app, especially if you get it working faster
 and/or with less bugs than you would have using some cruftier, older,
 lower level language that produces smaller binaries?
I agree that a lot of the concerns are based on obsolete notions. First off, I just bought another terabyte drive for $90. The first hard drive I bought was $600 for 10Mb. A couple years earlier I used a 10Mb drive that cost $5000. If I look at what eats space on my lovely terabyte drive, it ain't executables. It's music and pictures. I'd be very surprised if I had a whole CD's worth of exe files.
A 1 Tb spinning hard disk doesn't represent the current state-of-the-art. I have Intel SSD disks are those are damn expensive if you e.g. start to build a safe RAID 1+0 setup. Instead of 1000 GB the same price SSD comes with 8..16 GB. Suddenly application size starts to matter. For instance, my root partition seems to contain 9 GB worth of files and I've only installed a quite minimal graphical Linux environment to write some modern end-user applications.
Not that other OSes don't have their own forms for bloat, but from what I've seen of linux, an enormus amout of the system is stored as raw text files. I wouldn't be surprised if converting those to sensible (ie non-over-engineered) binary formats, or even just storing them all in a run-of-the-mill zip format would noticably cut down on that footprint.
At least on Linux this is solved at the filesystem level. There are e.g. read-only file systems with lzma/xz support. Unfortunately stable rw filesystems don't utilize compression.

What's actually happening regarding configuration files is that parts of Linux are moving to an XML-based configuration system. Seen stuff like hal or policykit? Not only does XML consume more space, the century-old, rock-solid and stable Unix configuration-reading libraries aren't used anymore, since we have these over-hyped, slow, and buggy XML parsers written in slow dynamic languages.

OTOH the configuration file ecosystem isn't that big. On two of my systems 'du -sh /etc/' gives 9.8M and 27M. I doubt the files in the hidden folders and .*rc files in my home directory are much larger. My Windows 7 profile (the OS came preinstalled on my laptop) is already 1.5 GB big - I have absolutely no idea what's inside that binary blob - and I have almost nothing installed.
Jan 12 2010
next sibling parent Lutger <lutger.blijdestijn gmail.com> writes:
On 01/12/2010 12:14 PM, retard wrote:
 Tue, 12 Jan 2010 05:34:49 -0500, Nick Sabalausky wrote:

 "retard"<re tard.com.invalid>  wrote in message
 news:hihgbe$qtl$2 digitalmars.com...
 Mon, 11 Jan 2010 19:24:06 -0800, Walter Bright wrote:

 dsimcha wrote:
 Vote++.  I'm convinced that there's just a subset of programmers out
 there that will not use any high-level programming model, no matter
 how much easier it makes life, unless they're convinced it has
 **zero** overhead compared to the crufty old C way.  Not negligible
 overhead, not practically insignificant overhead for their use case,
 not zero overhead in terms of whatever their most constrained
 resource is but nonzero overhead in terms of other resources, but
 zero overhead, period.

 Then there are those who won't make any tradeoff in terms of safety,
 encapsulation, readability, modularity, maintainability, etc., even
 if it means their program runs 15x slower.  Why can't more
 programmers take a more pragmatic attitude towards efficiency (among
 other things)?
   Yes, noone wants to just gratuitously squander massive resources,
   but
 is a few hundred kilobytes (fine, even a few megabytes, given how
 cheap bandwidth and storage are nowadays) larger binary really going
 to make or break your app, especially if you get it working faster
 and/or with less bugs than you would have using some cruftier, older,
 lower level language that produces smaller binaries?
I agree that a lot of the concerns are based on obsolete notions. First off, I just bought another terabyte drive for $90. The first hard drive I bought was $600 for 10Mb. A couple years earlier I used a 10Mb drive that cost $5000. If I look at what eats space on my lovely terabyte drive, it ain't executables. It's music and pictures. I'd be very surprised if I had a whole CD's worth of exe files.
A 1 Tb spinning hard disk doesn't represent the current state-of-the-art. I have Intel SSD disks are those are damn expensive if you e.g. start to build a safe RAID 1+0 setup. Instead of 1000 GB the same price SSD comes with 8..16 GB. Suddenly application size starts to matter. For instance, my root partition seems to contain 9 GB worth of files and I've only installed a quite minimal graphical Linux environment to write some modern end-user applications.
Not that other OSes don't have their own forms for bloat, but from what I've seen of linux, an enormus amout of the system is stored as raw text files. I wouldn't be surprised if converting those to sensible (ie non-over-engineered) binary formats, or even just storing them all in a run-of-the-mill zip format would noticably cut down on that footprint.
At least on Linux this is solved on filesystem level. There are e.g. read- only file systems with lzma/xz support. Unfortunately stable rw- filesystems don't utilize compression. What's actually happening regarding configuration files - parts of Linux are moving to xml based configuration system. Seen stuff like hal or policykit? Not only does xml consume more space, the century old rock solid and stable unix configuration reader libraries aren't used anymore, since we have these over-hyped, slow, and buggy xml parsers written in slow dynamic languages.
That sucks. I sometimes find editing .conf files arcane, but at least they are very readable and easy once you know (or look up) what is what. Compare that to the Windows registry... The great thing is, when all else fails all you need is pico/nano/vi to change the settings. I would hate to have to do that with XML. I myself am also a buggy XML parser ;)
 OTOH the configuration file ecosystem isn't that big. On two of my
 systems 'du -sh /etc/' gives 9.8M and 27M. I doubt the files on the
 hidden folders and .*rc files on my home directory are much larger. My
 Windows 7 (came preinstalled on my laptop) profile is already 1.5 GB big
 - I have absolutely no idea what's inside that binary blob - I don't even
 have almost anything installed.
I seem to have an 11 GB setup (excluding home of course). Here are some stats:

$ du /usr/sbin /usr/bin /usr/lib -s
30M     /usr/sbin
284M    /usr/bin   (almost 3000 files)
2,4G    /usr/lib

I installed an enormous amount of apps, but it seems the executables don't consume that much at all. It is all in the libraries. That suggests to me that it would matter more if shared libraries for D were better supported (that also means distributed by distros!).
Jan 12 2010
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"retard" <re tard.com.invalid> wrote in message 
news:hihlj8$16rb$1 digitalmars.com...
 What's actually happening regarding configuration files - parts of Linux
 are moving to xml based configuration system.
Eeewww!
 OTOH the configuration file ecosystem isn't that big. On two of my
 systems 'du -sh /etc/' gives 9.8M and 27M. I doubt the files on the
 hidden folders and .*rc files on my home directory are much larger. My
 Windows 7 (came preinstalled on my laptop) profile is already 1.5 GB big
 - I have absolutely no idea what's inside that binary blob - I don't even
 have almost anything installed.
Interesting.
Jan 13 2010
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 11 Jan 2010 22:24:06 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Other languages do appear to have smaller executables, but that's often  
 because the runtime library is dynamically linked, not statically  
 linked, and is not counted as part of the executable size even though it  
 is loaded into memory to be run. D's runtime library is still statically  
 linked in for the pragmatic reason that static linking avoids the "dll  
 hell" versioning problem for your customers.
I hope this is not a permanent situation.  Shared libraries (not necessarily DLLs) help reduce the memory usage of all the programs on the system that use the same libraries, and the footprint of the binaries.

The "DLL hell" versioning problem AFAIK is only on Windows, and has nothing to do with using dlls, it has everything to do with retarded installers that think they are in charge of putting shared DLLs into your system directory (fostered by the very loose directory security model of Windows to begin with).

My understanding of the "no shared library" problem of D was not that it was an anti-dll-hell feature but actually an unsolved problem with sharing the GC.  If you are saying that even if someone solves the GC sharing problem that D still will produce mostly static exes, you might as well take D out back and shoot it now.  I don't see any businesses using a language that is 15 years behind the curve in library production.  If it were a problem that couldn't be solved, then there would be lots of languages that have that problem.

On all your other points, I agree that installs these days are bigger more because of media than binaries.  I do have a problem with that in some cases.  I don't want my scanner driver software to install 500MB of crap that I will never use when all I want to do is scan documents with a simple interface.  I don't need "skins" for my word processor or help videos on how to use my mouse.  Just install the shit that does the work, and leave the rest of my hard drive alone :)

-Steve
Jan 12 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 I hope this is not a permanent situation.  Shared libraries (not 
 necessarily DLLs) help reduce the memory usage of all the programs on 
 the system that use the same libraries, and the footprint of the 
 binaries.
That would be reasonable once we can get libphobos installed on linux distributions. Right now, it's easier for users to not have to deal with version hell for shared libraries. Static linking of phobos does not impair using D, so it is a lower priority.
 The "DLL hell" versioning problem AFAIK is only on Windows, 
My experience is different. There are two C shared libraries in common use on Linux. If I link dmd to one, one group of dmd users gets annoyed, if I link with the other, the other half gets annoyed. There's no decent solution.
 My understanding of the "no shared library" problem of D was not that it 
 was an anti-dll-hell feature but actually an unsolved problem with 
 sharing the GC.  If you are saying that even if someone solves the GC 
 sharing problem that D still will produce mostly static exes, you might 
 as well take D out back and shoot it now.  I don't see any businesses 
 using a language that is 15 years behind the curve in library 
 production.  If it were a problem that couldn't be solved, then there 
 would be lots of languages that have that problem.
It's true that nobody has spent the effort to do this. Anyone is welcome to step up and work on it.
 On all your other points, I agree that installs these days are bigger 
 more because of media than binaries.  I do have a problem with that in 
 some cases.  I don't want my scanner driver software to install 500MB of 
 crap that I will never use when all I want to do is scan documents with 
 a simple interface.  I don't need "skins" for my word processor or help 
 videos on how to use my mouse.  Just install the shit that does the 
 work, and leave the rest of my hard drive alone :)
Heh, dlls won't solve that problem. I upgraded Nero to support a new dvd drive, and what the heck, a 20Mb install turned into - 360Mb !!
Jan 12 2010
next sibling parent Lutger <lutger.blijdestijn gmail.com> writes:
On 01/12/2010 11:40 PM, Walter Bright wrote:
 Steven Schveighoffer wrote:
 I hope this is not a permanent situation. Shared libraries (not
 necessarily DLLs) help reduce the memory usage of all the programs on
 the system that use the same libraries, and the footprint of the
 binaries.
That would be reasonable once we can get libphobos installed on linux distributions. Right now, it's easier for users to not have to deal with version hell for shared libraries. Static linking of phobos does not impair using D, so it is a lower priority.
That could be a nice goal after D2 is released, but how do we handle the fact that dmd is not open source? Some distros will probably not accept it in their standard repositories. I think this is unfortunate, since a wider Linux distribution could attract more developers.
 The "DLL hell" versioning problem AFAIK is only on Windows,
My experience is different. There are two C shared libraries in common use on Linux. If I link dmd to one, one group of dmd users gets annoyed, if I link with the other, the other half gets annoyed. There's no decent solution.
The preferred solution on a Linux system generally is to go all the way: let the distro packagers handle the dependencies and have them build dmd. Or have some community build distro-specific packages for you. Both require a redistribution license however, and usually an open source one.
Jan 12 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Jan 2010 17:40:58 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 I hope this is not a permanent situation.  Shared libraries (not  
 necessarily DLLs) help reduce the memory usage of all the programs on  
 the system that use the same libraries, and the footprint of the  
 binaries.
That would be reasonable once we can get libphobos installed on linux distributions. Right now, it's easier for users to not have to deal with version hell for shared libraries. Static linking of phobos does not impair using D, so it is a lower priority.
As long as it's not viewed as a detriment for D to be shared-library based (or a "benefit" to be statically linked), I'm OK with it.  I agree that when shared libraries are available, the easiest distribution method would be an installer which puts all the files in the right places.  But those are pretty easy to come by.
 The "DLL hell" versioning problem AFAIK is only on Windows,
My experience is different. There are two C shared libraries in common use on Linux. If I link dmd to one, one group of dmd users gets annoyed, if I link with the other, the other half gets annoyed. There's no decent solution.
I call that distribution hell :)

You have to remember that releasing a binary on a Linux OS does not make that binary compatible with all other types of Linux OSes.  However, compiling a binary on a specific Linux OS should make that binary runnable on all later versions of that OS (it sometimes requires installing "legacy" libs).

The solution is pretty simple however, and most companies just live with it -- support 3 or 4 different flavors of Linux by building a package for each one.  Usually it's RedHat, SuSE, Ubuntu, and Debian.  The effort it takes to build under 2 or 3 different environments is not that much.  The advent of the full dmd source being available should relieve that problem anyways...
 On all your other points, I agree that installs these days are bigger  
 more because of media than binaries.  I do have a problem with that in  
 some cases.  I don't want my scanner driver software to install 500MB  
 of crap that I will never use when all I want to do is scan documents  
 with a simple interface.  I don't need "skins" for my word processor or  
 help videos on how to use my mouse.  Just install the shit that does  
 the work, and leave the rest of my hard drive alone :)
Heh, dlls won't solve that problem. I upgraded Nero to support a new dvd drive, and what the heck, a 20Mb install turned into - 360Mb !!
The only thing I've found that solves this problem is -- not using Windows. For some reason Linux developers still put value on small exe size and actual features instead of needlessly snazzy UI and exploding media bloat. -Steve
Jan 12 2010
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 1/12/10 23:40, Walter Bright wrote:
 Steven Schveighoffer wrote:
 I hope this is not a permanent situation. Shared libraries (not
 necessarily DLLs) help reduce the memory usage of all the programs on
 the system that use the same libraries, and the footprint of the
 binaries.
That would be reasonable once we can get libphobos installed on linux distributions. Right now, it's easier for users to not have to deal with version hell for shared libraries. Static linking of phobos does not impair using D, so it is a lower priority.
 The "DLL hell" versioning problem AFAIK is only on Windows,
My experience is different. There are two C shared libraries in common use on Linux. If I link dmd to one, one group of dmd users gets annoyed, if I link with the other, the other half gets annoyed. There's no decent solution.
 My understanding of the "no shared library" problem of D was not that
 it was an anti-dll-hell feature but actually an unsolved problem with
 sharing the GC. If you are saying that even if someone solves the GC
 sharing problem that D still will produce mostly static exes, you
 might as well take D out back and shoot it now. I don't see any
 businesses using a language that is 15 years behind the curve in
 library production. If it were a problem that couldn't be solved, then
 there would be lots of languages that have that problem.
It's true that nobody has spent the effort to do this. Anyone is welcome to step up and work on it.
Hasn't that already been solved with ddl: http://www.dsource.org/projects/ddl/
 On all your other points, I agree that installs these days are bigger
 more because of media than binaries. I do have a problem with that in
 some cases. I don't want my scanner driver software to install 500MB
 of crap that I will never use when all I want to do is scan documents
 with a simple interface. I don't need "skins" for my word processor or
 help videos on how to use my mouse. Just install the shit that does
 the work, and leave the rest of my hard drive alone :)
Heh, dlls won't solve that problem. I upgraded Nero to support a new dvd drive, and what the heck, a 20Mb install turned into - 360Mb !!
Jan 13 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 13 Jan 2010 06:28:33 -0500, Jacob Carlborg <doob me.com> wrote:

 On 1/12/10 23:40, Walter Bright wrote:
 It's true that nobody has spent the effort to do this. Anyone is welcome
 to step up and work on it.
Hasn't that already been solved with ddl: http://www.dsource.org/projects/ddl/
It's not solved until I can do something like:

   dmd -sharedlib mylibfile.d

and have it output a shared object that I can link against.  Having to do all the shared library stuff manually is not even close to a solution.

On top of that, phobos and druntime must be shared libs.

If the end result is that the D compiler emits the correct code to interface with DDL, that is fine.  But it has to be built in.

-Steve
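To picture what is being asked for, here is a hypothetical sketch - the -sharedlib switch and the file names are invented for the example, they are not existing dmd options:

// mylibfile.d -- an ordinary D module we would like turned into a
// shared object with a single compiler switch
module mylibfile;

int addOne(int x) { return x + 1; }

// Desired workflow (hypothetical, no such switch exists today):
//   dmd -sharedlib mylibfile.d     -> emits libmylibfile.so
//   dmd app.d -L-lmylibfile        -> the app links against the .so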
Jan 14 2010
next sibling parent reply retard <re tard.com.invalid> writes:
To conclude this discussion, it seems that executable size could be 
reduced dramatically. Unfortunately the compiler fails to use the 
opportunity in almost all cases.

A typical program would be 50% to 99.999% smaller if the libraries 
(including the stdlib) were dynamically linked in, the compiler omitted the GC when 
it's not needed, and the extra type info were eliminated - completely 
when no runtime reflection is used at all. Walter doesn't see this as a 
large problem, at least not one worth fixing now, so we've got to get used to 
executables 100x larger than they need to be. Kthxbai
Jan 14 2010
parent Lutger <lutger.blijdestijn gmail.com> writes:
On 01/14/2010 03:19 PM, retard wrote:
 To conclude this discussion, it seems that executable size could be
 reduced dramatically. Unfortunately the compiler fails to use the
 opportunity in almost all cases.

 A typical program would be 50% to 99.999% smaller if the libraries
 (including stdlib) were dynamically linked in, compiler omitted gc when
 it's not needed, and the extra type info would be eliminated - completely
 when no runtime reflection is used at all. Walter doesn't see this as a
 large problem, at least not worth fixing now so we got to get used to
 executables 100x larger than they need to be. Kthxbai
A minor correction: Walter does not see this as a priority for *him*, worth postponing other tasks for at *this* moment. That's not entirely unreasonable considering the current goals, no? If anybody were to step in and solve the problem I'm sure he would integrate it into dmd.
Jan 14 2010
prev sibling parent grauzone <none example.net> writes:
Steven Schveighoffer wrote:
 On Wed, 13 Jan 2010 06:28:33 -0500, Jacob Carlborg <doob me.com> wrote:
 
 On 1/12/10 23:40, Walter Bright wrote:
 It's true that nobody has spent the effort to do this. Anyone is welcome
 to step up and work on it.
Hasn't that already been solved with ddl: http://www.dsource.org/projects/ddl/
It's not solved until I can do something like: dmd -sharedlib mylibfile.d
Then just make dmd support DDL directly.
 and have it output a shared object that I can link against.  Having to 
 do all the shared library stuff manually is not even close to a solution.
 
 On top of that, phobos and druntime must be shared libs.
 
 If the end result is that the D compiler emits the correct code to 
 interface with DDL, that is fine.  But it has to be built in.
 
 -Steve
Jan 14 2010
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound1 digitalmars.com> wrote in message 
news:hiitps$12fn$1 digitalmars.com...
 Heh, dlls won't solve that problem. I upgraded Nero to support a new dvd 
 drive, and what the heck, a 20Mb install turned into - 360Mb !!
Nero turned to shit ten years ago, I never touch the damn thing. Just use InfraRecorder instead (And ImgBurn for dealing with any disc image formats that InfraRecorder might not support).
Jan 13 2010
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 1/12/10 22:31, Steven Schveighoffer wrote:
 On Mon, 11 Jan 2010 22:24:06 -0500, Walter Bright
 <newshound1 digitalmars.com> wrote:

 Other languages do appear to have smaller executables, but that's
 often because the runtime library is dynamically linked, not
 statically linked, and is not counted as part of the executable size
 even though it is loaded into memory to be run. D's runtime library is
 still statically linked in for the pragmatic reason that static
 linking avoids the "dll hell" versioning problem for your customers.
I hope this is not a permanent situation. Shared libraries (not necessarily DLLs) help reduce the memory usage of all the programs on the system that use the same libraries, and the footprint of the binaries. The "DLL hell" versioning problem AFAIK is only on Windows, and has nothing to do with using dlls, it has everything to do with retarded installers that think they are in charge of putting shared DLLs into your system directory (fostered by the very loose directory security model of Windows to begin with). My understanding of the "no shared library" problem of D was not that it was an anti-dll-hell feature but actually an unsolved problem with sharing the GC. If you are saying that even if someone solves the GC sharing problem that D still will produce mostly static exes, you might as well take D out back and shoot it now. I don't see any businesses using a language that is 15 years behind the curve in library production. If it were a problem that couldn't be solved, then there would be lots of languages that have that problem.
Hasn't that already been solved with ddl: http://www.dsource.org/projects/ddl/
 On all your other points, I agree that installs these days are bigger
 more because of media than binaries. I do have a problem with that in
 some cases. I don't want my scanner driver software to install 500MB of
 crap that I will never use when all I want to do is scan documents with
 a simple interface. I don't need "skins" for my word processor or help
 videos on how to use my mouse. Just install the shit that does the work,
 and leave the rest of my hard drive alone :)

 -Steve
Jan 13 2010
prev sibling next sibling parent retard <re tard.com.invalid> writes:
Tue, 12 Jan 2010 02:45:08 +0000, dsimcha wrote:

 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 It's actually a nice program. My point was that the era of tiny
 executables has long since passed.
<rant> Vote++. I'm convinced that there's just a subset of programmers out there that will not use any high-level programming model, no matter how much easier it makes life, unless they're convinced it has **zero** overhead compared to the crufty old C way. Not negligible overhead, not practically insignificant overhead for their use case, not zero overhead in terms of whatever their most constrained resource is but nonzero overhead in terms of other resources, but zero overhead, period. Then there are those who won't make any tradeoff in terms of safety, encapsulation, readability, modularity, maintainability, etc., even if it means their program runs 15x slower. Why can't more programmers take a more pragmatic attitude towards efficiency (among other things)? Yes, noone wants to just gratuitously squander massive resources, but is a few hundred kilobytes (fine, even a few megabytes, given how cheap bandwidth and storage are nowadays) larger binary really going to make or break your app, especially if you get it working faster and/or with less bugs than you would have using some cruftier, older, lower level language that produces smaller binaries? </rant>
You could fit e.g. the whole Linux userspace application suite into the disk cache or the L1/2/3 CPU cache if only the programs were small enough. There is always a faster class of memory which would make the system faster, but with very strict space constraints.

In fact the good old command line programs work rather efficiently even today, as they are usually 5 .. 150 kB large. They work quickly on an i586 with 8 MB of RAM and they work lightning fast on my overclocked Core i7 960. Unfortunately the same cannot be said about GUI applications. Even on the Core i7 machine with triple channel memory and two super fast SSD disks in RAID-0, the startup and I/O response times are really shitty, even on the most user-friendly GUI apps.

If you have a rather mature program such as /bin/echo, you don't need to rewrite it each year. So why not write it in assembly once and just use it.
Jan 12 2010
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
dsimcha, el 12 de enero a las 02:45 me escribiste:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 It's actually a nice program. My point was that the era of tiny
 executables has long since passed.
<rant> Vote++. I'm convinced that there's just a subset of programmers out there that will not use any high-level programming model, no matter how much easier it makes life, unless they're convinced it has **zero** overhead compared to the crufty old C way.
Just to clarify, I'm not talking about this. I prefer to use D even with its overhead (when I can afford it), but that doesn't mean D shouldn't take this seriously - it can't just say "bah, everybody is doing big binaries, why should I care?". One thing is "we can't focus on that because we have other priorities, but we are concerned about the issue" and another *very different* thing is "we don't care, even if binary sizes keep growing". It's a very nice rant, and I agree, but you missed the point. I'm not talking about not using D, I'm talking about recognizing this as an issue (even when it might not be a huge one). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Did you know that originally a Danish guy invented the burglar alarm? Unfortunately, it got stolen
Jan 12 2010
prev sibling next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Walter Bright, el 11 de enero a las 18:02 me escribiste:
 Leandro Lucarella wrote:
Walter Bright, el 10 de enero a las 13:06 me escribiste:
Chris wrote:
I simply can't get used to it, and probably never will for anyone who
used to code in low-level languages, since they know how much
a program size can really be.
I downloaded a program from cpuid.com to tell me what processor I'm running. The executable file size is 1.8 Mb.
Well, if it's in the internet I'm sure is good! Come on, other people making crap doesn't mean making more crap is justified :)
It's actually a nice program. My point was that the era of tiny executables has long since passed.
NOT for everybody. I refuse to download huge programs (unless their size is justified). And when it comes to embedded systems, size *does* matter. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- I would drape myself in velvet if it were socially acceptable. -- George Constanza
Jan 12 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Leandro Lucarella wrote:
 NOT for everybody. I refuse to download huge programs (unless their size
 is justified).
Unfortunately, sizeof(exe + dll) == sizeof(exe) + sizeof(dll) You'd only see savings if the user already has the dll installed, or if you are shipping multiple exe's.
 And when it comes to embedded systems, size *do* matter.
I know.
Jan 12 2010
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Jan 12, 2010 at 02:48:31PM -0800, Walter Bright wrote:
 Unfortunately, sizeof(exe + dll) == sizeof(exe) + sizeof(dll)
Actually, sizeof(exe + dll) < sizeof(exe) + sizeof(dll) in most cases. In a single exe, the unneeded things from the library are stripped out by the linker. As to the main subject, the size of D programs is something I would like to see brought down - 300kb is a lot when transferring on dialup, but I consider it the lowest of all the priorities I can think of. It isn't /obscenely/ large. -- Adam D. Ruppe http://arsdnet.net
Jan 12 2010
parent reply retard <re tard.com.invalid> writes:
Tue, 12 Jan 2010 17:59:38 -0500, Adam D. Ruppe wrote:

 On Tue, Jan 12, 2010 at 02:48:31PM -0800, Walter Bright wrote:
 Unfortunately, sizeof(exe + dll) == sizeof(exe) + sizeof(dll)
Actually, sizeof(exe + dll) < sizeof(exe) + sizeof(dll) in most cases.
That's for sure. If I look at my current desktop setup, for example, all Xfce and KDE packages depend on the same dedicated libraries. This way the applications stay really small and as much code as possible can be shared. Many applications are only 3 kB large, many others are between 10 and 100 kB. If they all used static linking, I'm afraid I would have to buy another 32 GB SSD disk only for the binaries.
Jan 12 2010
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Jan 12, 2010 at 11:12:33PM +0000, retard wrote:
 If they all used static linking, I'm afraid I would have to 
 buy another 32 GB ssd disk only for the binaries.
Yes, for large numbers of applications, shared libraries save a lot of space. Though, what I was saying is that if you have only one application using the library, static linking actually saves space. A shared lib needs to keep functions around just in case - a static link doesn't. I'm generally for statically linking anything that isn't part of the base OS install, just to ease the process for end users. I'd ideally like to see a statically linked application that is also small - improvements to the compiler and linker still have some potential here. -- Adam D. Ruppe http://arsdnet.net
Jan 12 2010
next sibling parent reply Lutger <lutger.blijdestijn gmail.com> writes:
On 01/13/2010 12:29 AM, Adam D. Ruppe wrote:
 On Tue, Jan 12, 2010 at 11:12:33PM +0000, retard wrote:
 If they all used static linking, I'm afraid I would have to
 buy another 32 GB ssd disk only for the binaries.
Yes, for large numbers of application, shared libraries save a lot of space. Though, what I was saying is if you have only one application using the library, static linking actually saves space. A shared lib needs to keep functions around just in case - a static link doesn't.
Funnily enough, distributing DLLs alongside the app is exactly what a lot of Windows apps do, to prevent DLL hell or satisfy some license.
Jan 12 2010
parent Jussi Jumppanen <jussij zeusedit.com> writes:
Lutger Wrote:

 Funny enough distributing dll's alongside the app is exactly what a lot 
 of windows apps do, to prevent dll hell or satisy some license.
If the Windows application is developed correctly, in that it is designed to use the Windows side-by-side assembly feature: http://en.wikipedia.org/wiki/Side-by-side_assembly then, provided the developer does not stuff up the manifest, DLL hell is pretty much a non-event.
Jan 12 2010
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Adam D. Ruppe Wrote:
 
 I'm generally for static linking anything that isn't part of the base OS
 install, just to ease the process for end users.
Redistributing apps built on dynamic libraries can be an utter pain, particularly if the libraries are "standard" as opposed to application-specific.
Jan 12 2010
prev sibling next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Adam D. Ruppe, el 12 de enero a las 18:29 me escribiste:
 On Tue, Jan 12, 2010 at 11:12:33PM +0000, retard wrote:
 If they all used static linking, I'm afraid I would have to 
 buy another 32 GB ssd disk only for the binaries.
Yes, for large numbers of application, shared libraries save a lot of space. Though, what I was saying is if you have only one application using the library, static linking actually saves space. A shared lib needs to keep functions around just in case - a static link doesn't. I'm generally for static linking anything that isn't part of the base OS install, just to ease the process for end users. I'd ideally like to see a statically linked application that is also small - improvements to the compiler and linker still have some potential here.
What about security updates? If you, for example, use OpenSSL in a program, and you link it statically, then the user is condemned to have a vulnerable program when an OpenSSL security bug is found, until you give them a new binary. That is a lot of trouble for you and the user. If you use dynamic linking, the user just needs to keep their system updated to avoid this kind of issue, and you only need to care about new releases when the bugs are really from your program, not third-party libraries. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Sometimes you got to suffer a little in your youth to motivate you to succeed later in life. Do you think if Bill Gates got laid in high school, do you think there'd be a Microsoft? Of course not. You gotta spend a lot of time stuffin your own locker with your underwear wedged up your arse before you think "I'm gona take over the world with computers! You'll see I'll show them."
Jan 12 2010
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Jan 12, 2010 at 10:13:36PM -0300, Leandro Lucarella wrote:
 If you use dynamic linking, the user just need to keep its system updated
 to avoid this kind of issues, and you only need to care about new release
 when the bugs are really from your program, not third-party libraries.
Yes, that is a benefit of dynamic linking. But, the other side of this is if the third-party library's new version breaks your app, your poor user is in trouble. The choice to go dynamic is a trade off - sometimes worth it, but I tend to assume not until the specific case shows otherwise. -- Adam D. Ruppe http://arsdnet.net
Jan 12 2010
parent retard <re tard.com.invalid> writes:
Tue, 12 Jan 2010 22:00:25 -0500, Adam D. Ruppe wrote:

 On Tue, Jan 12, 2010 at 10:13:36PM -0300, Leandro Lucarella wrote:
 If you use dynamic linking, the user just need to keep its system
 updated to avoid this kind of issues, and you only need to care about
 new release when the bugs are really from your program, not third-party
 libraries.
Yes, that is a benefit of dynamic linking. But, the other side of this is if the third-party library's new version breaks your app, your poor user is in trouble. The choice to go dynamic is a trade off - sometimes worth it, but I tend to assume not until the specific case shows otherwise.
It's not only useful when security issues arise. The 3rd party libraries can be bug-fixed as much as their authors want without touching end-user applications. Imagine something universal like Gtk+ or Qt. Unfortunately Linux is full of badly maintained, API-breaking libraries. Just a while ago I couldn't print at all because cups or some other library depended on an internal symbol of another lib.
Jan 12 2010
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Leandro Lucarella wrote:
 If you use dynamic linking, the user just need to keep its system updated
 to avoid this kind of issues, and you only need to care about new release
 when the bugs are really from your program, not third-party libraries.
No, that's backwards.

If the user gets the application and library from a central repository (e.g. apt-get), then it is the responsibility of the repository maintainer(s) to keep everything up to date.  Getting a patched executable from the repository is no more or less effort for the user than getting a patched library from the repository.  Putting a new executable up is no more or less effort for the repository maintainer(s) than putting a new library up.

If the user gets the application and library from the application developer, then it's the responsibility of the application developer to keep everything patched.  Getting a patched executable is still no more or less effort for the user than getting a patched library.  Putting a new executable up is no more or less effort for the application developer than putting a new library up.

If the user gets the application and library from separate developers, then keeping the library up to date is the responsibility of the library developer.  Getting software from multiple sources is /more/ effort for the user.  Furthermore, library developers are rarely set up to distribute software to the end user.  Often the library developers don't even distribute binaries.

In summary, there are no cases where dynamic linking makes security updates easier for the end user.  There are cases where this separation makes security updates a lot harder for the end user.

-- 
Rainer Deyke - rainerd eldwood.com
Jan 12 2010
parent reply KennyTM~ <kennytm gmail.com> writes:
On Jan 13, 10 11:57, Rainer Deyke wrote:
 Leandro Lucarella wrote:
 If you use dynamic linking, the user just need to keep its system updated
 to avoid this kind of issues, and you only need to care about new release
 when the bugs are really from your program, not third-party libraries.
No, that's backwards. If the user gets the application and library from a central repository (e.g. apt-get), then it is the responsibility of the repository maintainer(s) to keep everything up to date. Getting a patched executable from the repository is no more or less effort for the user than getting a patched library from the repository. Putting a new executable up is no more or less effort for the repository maintainer(s) than putting a new library up.
Suppose libc got a security flaw. Instead of downloading and updating 1 library you have to download and update 1,000 executables. So instead of distributing (say) 100 KB of binaries the repositories need to send 100 MB to their users. A huge and unnecessary bandwidth waste for both sides, I would say.
 If the user gets the application and library from the application
 developer, then it's the responsibility of the application developer to
 keep everything patched.  Getting a patched executable is still no more
 or less effort for the user than getting a patched library.  Putting a
 new executable up is no more or less effort for the application
 developer than putting a new library up.
What if the application developer is irresponsible?
 If the user gets the application and library from separate developers,
 then keeping the library up to date is the responsibility of the library
 developer.  Getting software from multiple sources is /more/ effort for
 the user.  Furthermore, library developers are rarely set up to
 distribute software to the end user.  Often the library developers don't
 even distribute binaries.

 In summary, there are no cases where dynamic linking makes security
 updates easier for the end user.  There are cases where this separation
 makes security updates a lot harder for the end user.
Jan 12 2010
parent Rainer Deyke <rainerd eldwood.com> writes:
KennyTM~ wrote:
 Suppose libc got a security flaw. Instead of downloading and updating 1
 library you got to download and update 1,000 executables. So instead of
 distributing (say) 100 KB of binaries the repositories need to send 100
 MB to its users. A huge and unnecessary bandwidth waste for both sides I
 would say.
That's a worst case scenario - a dll that's effectively a core component of the operating system. The vast majority of dlls are used much less frequently. (It's also questionable if the security flaw would actually affect all 1000 executables.) Still, bandwidth is cheap. Windows service packs are a lot bigger than 100MB.
 What if the application developer is irresponsible?
What if the security flaw is in the application and not in any library? In a lot of cases, it doesn't matter because the application doesn't connect to the outside world and is therefore secure by default. When it does matter, you have three options: accept the risk, run on a quarantined system, or don't use the application. -- Rainer Deyke - rainerd eldwood.com
Jan 12 2010
prev sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Adam D. Ruppe (destructionator gmail.com)'s article
 I'm generally for static linking anything that isn't part of the base OS
 install, just to ease the process for end users.
One thing that has escaped discussion in the static vs. dynamic linking debate so far is **templates**. If you use template-heavy code all over your library, that pretty much rules out dynamic linking. If you avoid templates so you can dynamically link, you're avoiding IMHO the single most important feature that distinguishes D from other languages and are writing non-idiomatic D. You may as well use some other language that's better suited to doing things without templates. Therefore, I suspect D culture will be very biased toward static linking for that reason.
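To make the template point concrete, here is a minimal sketch (the module and function names are invented for illustration): a library template is only a blueprint until client code instantiates it, so the generated object code lands in the client's binary rather than in any shared object the library might ship as.

// --- lib.d --- hypothetical library module; twice() is a template, so
// no machine code for it exists in a compiled lib.so by itself.
module lib;

T twice(T)(T x) { return x + x; }

// --- app.d --- client code; twice!int and twice!double are
// instantiated *here*, so their object code ends up in the client's
// executable, not in the library.
import lib;

void main()
{
    auto a = twice(21);     // instantiates twice!int
    auto b = twice(1.5);    // instantiates twice!double
}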
Jan 12 2010
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Jan 2010 22:16:53 -0500, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Adam D. Ruppe (destructionator gmail.com)'s article
 I'm generally for static linking anything that isn't part of the base OS
 install, just to ease the process for end users.
One thing that has escaped discussion in the static vs. dynamic linking debate so far is **templates**. If you use template-heavy code all over your library, that pretty much rules out dynamic linking. If you avoid templates so you can dynamically link, you're avoiding IMHO the single most important feature that distinguishes D from other languages and are writing non-idiomatic D. You may as well use some other language that's better suited to doing things without templates. Therefore, I suspect D culture will be very biased toward static linking for that reason.
Dynamic linking does not prevent template use.  The C++ standard library, which arguably consists mostly of templates, still has a .so size of 900k on my Linux box.  You would most likely be using many templates that are already instantiated by Phobos in the dynamic library.  Anything specialized will be compiled into your app.  You can have it both ways!

-Steve
Jan 12 2010
prev sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Walter Bright, el 12 de enero a las 14:48 me escribiste:
 Leandro Lucarella wrote:
NOT for everybody. I refuse to download huge programs (unless their size
is justified).
Unfortunately, sizeof(exe + dll) == sizeof(exe) + sizeof(dll) You'd only see savings if the user already has the dll installed, or if you are shipping multiple exe's.
In Linux, using a distribution, you have a pretty good chance that dynamic libraries are used by more than one program. And there is typeinfo/template bloat to analyze in the executable bloat too; it's not just static vs dynamic linking (even though that would be a big step toward smaller binaries).
And when it comes to embedded systems, size *do* matter.
I know.
Well, I hope that D sees big binaries as a (low-priority) bug then :) -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- From the generations to come I expect nothing more than that they come. -- Ricardo Vaporeso
Jan 12 2010
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Jan 12, 2010 at 10:09:01PM -0300, Leandro Lucarella wrote:
 In Linux, using a distribution, you have a pretty good change that dynamic
 libraries are used by more than one program.
That's actually not true. I ran a program on my system a long time ago that ran ldd against everything in /bin, /lib, and all the various variants:

http://arsdnet.net/lib-data-sorted.txt

Notice that the *vast* majority of those libraries are used by only a handful of binaries on the system. Half have five or fewer users. This is slightly biased by not counting dynamically loaded things; ldd only lists the shared libs requested at link time. But I doubt that changes things significantly.

Also note how at the bottom of the list, users go up quickly - the basic system libraries are used by a great many apps. Most everything else is pretty specialized though.

I don't remember where I put the source to the program that generated that list, but it isn't too hard to recreate; it just counts the references that ldd outputs.
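For what it's worth, a rough reconstruction of such a counter might look like this - this is a sketch written for illustration, not the original script, and the directory list is an assumption:

import std.algorithm, std.array, std.file, std.process, std.stdio, std.string;

void main()
{
    size_t[string] users;   // shared library path -> number of binaries referencing it

    foreach (dir; ["/bin", "/sbin", "/usr/bin", "/usr/sbin"])
    {
        if (!exists(dir))
            continue;
        foreach (entry; dirEntries(dir, SpanMode.shallow))
        {
            auto res = execute(["ldd", entry.name]);
            if (res.status != 0)
                continue;   // not a dynamically linked executable
            foreach (line; res.output.lineSplitter)
            {
                // lines look like "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"
                auto parts = line.strip.split(" => ");
                if (parts.length == 2 && parts[1].startsWith("/"))
                {
                    auto path = parts[1].split(" ")[0];
                    users[path] = users.get(path, 0) + 1;
                }
            }
        }
    }

    // print each library with its user count, least used first
    foreach (lib; users.keys.sort!((a, b) => users[a] < users[b]))
        writefln("%5d  %s", users[lib], lib);
}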
 typeinfo/template bloat too to analyze in the executables bloat
I'm sure the linker will eventually take care of templates. As to typeinfo, does D need it anymore? It seems like templates obsolete most of the typeinfo now. -- Adam D. Ruppe http://arsdnet.net
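As a tiny illustration of that substitution (a sketch only; it assumes nothing about druntime internals): the runtime version below needs TypeInfo in the binary, while the template version resolves the type at compile time.

import std.stdio;

class Widget {}

// Runtime reflection: typeid() drags TypeInfo for the class into the binary.
void describeRuntime(Object o)
{
    writeln(typeid(o));      // prints the fully qualified class name via TypeInfo
}

// Template alternative: the type is known at compile time, so no
// TypeInfo lookup is needed for this particular query.
void describeStatic(T)(T value)
{
    writeln(T.stringof);     // resolved entirely at compile time
}

void main()
{
    auto w = new Widget;
    describeRuntime(w);
    describeStatic(w);       // prints "Widget", no TypeInfo involved
}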
Jan 12 2010
next sibling parent retard <re tard.com.invalid> writes:
Tue, 12 Jan 2010 21:57:06 -0500, Adam D. Ruppe wrote:

 On Tue, Jan 12, 2010 at 10:09:01PM -0300, Leandro Lucarella wrote:
 In Linux, using a distribution, you have a pretty good change that
 dynamic libraries are used by more than one program.
That's actually not true. I ran a program on my system a long time ago that ran ldd against everything in /bin, /lib, and all the various variants. http://arsdnet.net/lib-data-sorted.txt Notice that the *vast* majority of those libraries are used by only a handful of binaries on the system. Half has five or fewer users. This is slightly biased by not counting dynamically loaded things; ldd only does statically requested shared libs. But I doubt that changes things significantly. Also note how at the bottom of the list, users go up quickly - the basic system libraries are used by a great many apps. Most everything else is pretty specialized though.
Yes, that's a bit unfortunate. The freedom of choice in Linux often leads to too many libraries. I have several libraries on my system that are basically rival technologies. A monoculture sometimes feels more natural, but nobody likes it unless it suits their needs. A more flexible option for distributing executables would be to allow both static and dynamic linking: use dynamic libraries unless loading them fails, in which case fall back to the statically provided library code. I don't know if this can be done with currently available tools.
Jan 12 2010
prev sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Jan 13, 10 10:57, Adam D. Ruppe wrote:
 On Tue, Jan 12, 2010 at 10:09:01PM -0300, Leandro Lucarella wrote:
 In Linux, using a distribution, you have a pretty good change that dynamic
 libraries are used by more than one program.
That's actually not true. I ran a program on my system a long time ago that ran ldd against everything in /bin, /lib, and all the various variants. http://arsdnet.net/lib-data-sorted.txt Notice that the *vast* majority of those libraries are used by only a handful of binaries on the system. Half has five or fewer users.
The data is flawed. For example, libtiff appears as both /usr/lib/./libtiff.so.3 (2 uses) and /usr/lib/libtiff.so.3 (70 uses). Moreover, as long as there are ≥2 uses, its disk usage is already lower than with static linking.
 This is slightly biased by not counting dynamically loaded things; ldd
 only does statically requested shared libs. But I doubt that changes things
 significantly.


 Also note how at the bottom of the list, users go up quickly - the basic
 system libraries are used by a great many apps. Most everything else is
 pretty specialized though.

 I don't remember where I put the source to the program that generated that
 list, but it isn't too hard to recreate; it just counts references that
 ldd outputs.

 typeinfo/template bloat too to analyze in the executables bloat
I'm sure the linker will eventually take care of templates. As to typeinfo, does D need it anymore? It seems like templates obsolete most the typeinfo now.
Jan 12 2010
parent reply Rainer Deyke <rainerd eldwood.com> writes:
KennyTM~ wrote:
 Moreover, as long as there are ≥2 uses it's disk usage is already lower
 than static linking.
Only so long as the average program that uses the library uses more than 50% of the library. -- Rainer Deyke - rainerd eldwood.com
Jan 12 2010
parent retard <re tard.com.invalid> writes:
Wed, 13 Jan 2010 00:58:09 -0700, Rainer Deyke wrote:

 KennyTM~ wrote:
 Moreover, as long as there are ≥2 uses it's disk usage is already lower
 than static linking.
Only so long as the average program that uses the library uses more than 50% of the library.
At least on Linux the size of the simplest hello world bumps from 2.5 kB to 500 kB if all libraries are linked in statically. You have to be careful when choosing the set of static libraries. That's a 200x size increase. I rarely have problems choosing which libraries should be shared. The typical 3rd party libraries I use e.g. in game development (SDL, OpenGL bindings and wrappers, media libraries etc.) are quite often used in various projects - at least on Linux.
Jan 13 2010
prev sibling parent reply Don <nospam nospam.com> writes:
Walter Bright wrote:
 Leandro Lucarella wrote:
 Walter Bright, el 10 de enero a las 13:06 me escribiste:
 Chris wrote:
 I simply can't get used to it, and probably never will for anyone who
 used to code in low-level languages, since they know how much
 a program size can really be.
I downloaded a program from cpuid.com to tell me what processor I'm running. The executable file size is 1.8 Mb.
Well, if it's in the internet I'm sure is good! Come on, other people making crap doesn't mean making more crap is justified :)
It's actually a nice program. My point was that the era of tiny executables has long since passed.
I once had a customer make a request for a very small DOS utility program; they specifically said that they didn't have much disk space left on their computer - could I please make sure the executable wasn't too big? I wrote it in asm. It was 15 bytes. <g>
Jan 12 2010
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Don Wrote:
 
 I once had a customer make a request for a very small DOS utility 
 program, and specifically said that they didn't have much disk space 
 left on their computer, could I please make sure the executable wasn't 
 too big?
 
 I wrote it in asm. It was 15 bytes. <g>
Hah. You just reminded me of an old Usenet story: http://www.pbm.com/~lindahl/mel.html
Jan 12 2010
parent grauzone <none example.net> writes:
Sean Kelly wrote:
 Don Wrote:
 I once had a customer make a request for a very small DOS utility 
 program, and specifically said that they didn't have much disk space 
 left on their computer, could I please make sure the executable wasn't 
 too big?

 I wrote it in asm. It was 15 bytes. <g>
Hah. You just reminded me of an old Usenet story: http://www.pbm.com/~lindahl/mel.html
Did this Mel write OPTLINK?
Jan 12 2010
prev sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Don (nospam nospam.com)'s article
 I once had a customer make a request for a very small DOS utility
 program, and specifically said that they didn't have much disk space
 left on their computer, could I please make sure the executable wasn't
 too big?
 I wrote it in asm. It was 15 bytes. <g>
What did it do?
Jan 12 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Chris:
 Execution speed perhaps, since the time elapsed is proportional to the
 number of processor instruction executed. This explains why some people
 (for certain time critical apps) do not even take the step from C to C++,
 and chose to stay 20 year behind "modern" languages.
In real programs, what takes time (besides I/O and cache issues, which are another form of I/O) is often a small amount of code, usually loops inside loops. Removing a single instruction inside them can reduce the running time by K%, while removing a megabyte of code elsewhere may just reduce the loading time a little, etc.
 D presented itself being a high level language suitable for system
 programming, so executable sizes must be taken into consideration, imho.
I don't think we'll see miracles soon, but D2 is still in an alpha state. Once it's in beta, some care will probably be given to optimizations too, and among them executable size. Even if most people don't need such optimization, it's clearly psychologically required by C/C++ programmers. Eventually it's even possible to add a compilation flag to D compilers to not use the GC, avoiding that overhead in C-like programs (such a flag must also turn some operations into compilation errors, like array join, etc., to avoid leaks). Currently Link Time Optimization in LLVM (which can be used by LDC) removes some unused code from D1 programs. Bye, bearophile
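To illustrate the no-GC flag idea above, a small sketch of the kind of operations such a flag would have to reject (the flag itself is hypothetical in this thread, and the function names are invented):

// Fine even without a GC: writes happen in place, nothing is allocated.
void noAllocation(int[] a)
{
    a[] = 0;
}

// These would have to become compile errors under a hypothetical
// "no GC" switch, because they allocate through the GC behind the scenes.
void needsGC(int[] a, int[] b)
{
    auto c = a ~ b;        // array concatenation allocates a new array
    auto d = [1, 2, 3];    // array literal allocates
    a ~= 4;                // appending may allocate or reallocate
}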
Jan 10 2010
next sibling parent retard <re tard.com.invalid> writes:
Sun, 10 Jan 2010 16:45:56 -0500, bearophile wrote:

 Chris:
 Execution speed perhaps, since the time elapsed is proportional to the
 number of processor instruction executed. This explains why some people
 (for certain time critical apps) do not even take the step from C to
 C++, and chose to stay 20 year behind "modern" languages.
In real programs what takes time are (beside I/O and cache issues that are another form of I/O) often small amounts of code, usually loops inside loops. Removing a single instruction inside them can reduce the running time by K%, while removing a megabyte of code elsewhere may just reduce a little the loading time, etc.
 D presented itself being a high level language suitable for system
 programming, so executable sizes must be taken into consideration,
 imho.
I don't think we'll see miracles soon, but D2 is currently in alpha state still. Once it's in beta some care will be probably given to optimizations too, and among them there is the executable size too. Even if most people don't need such optimization, it's clearly psychologically required by C/C++ programmers. Eventually it's even possible to add a compilation flag to D compilers to not use the GC, avoiding that overhead on C-like programs (such flag must also turn some operations into compilation errors, like array join, etc, to avoid leaks). Currently Link Time Optimization of LLVM (that can be used by LDC) removes some unused code from D1 programs.
At least you're admitting that a problem really exists. Some D users even see the problems as important and desired features - "Your exe consumes 500 MB of space? But that's a good thing. Now you can be certain that you have a valid reason for finally purchasing that 10 TB raid-5 array."
Jan 10 2010
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
bearophile Wrote:

 Currently Link Time Optimization of LLVM (that can be used by LDC) removes
some unused code from D1 programs.
I guess this is different than --gc-sections or whatever the ld flag is? I recall that one breaking exception handling, though I also recall there being a trivial change during code generation that could fix this.
Jan 10 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Sean Kelly:
 I guess this is different than --gc-sections or whatever the ld flag is?<
I don't remember what --gc-sections is, but I guess it's something different. The code removed during the LTO includes, for example, unreachable functions, functions/methods that, once inlined, are called from nowhere else, unused constants, etc. Here you can see an example on C code (in D1 it's the same): http://llvm.org/docs/LinkTimeOptimization.html Anyway, currently the LDC project is mostly sleeping. Bye, bearophile
Jan 10 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 Sean Kelly:
 I guess this is different than --gc-sections or whatever the ld
 flag is?<
I don't remember what --gc-sections is, but I guess it's something different. The code removed during the LTO is for example unreachable functions, or functions/methods, that once inlined are called from nowhere else, unused constants, etc. Here you can see an example on C code (in D1 it's the same): http://llvm.org/docs/LinkTimeOptimization.html Anyway, currently the LDC project is mostly sleeping.
Optlink does this too. It's oooollldd technology, been around since the 80's. Consider the following program:

========================
int x;

void foo() { x++; }

int main()
{
    return 0;
}
========================

Compile it,

    dmd foo -L/map

and let's have a look at the object file (cut down for brevity):

========================
_D3foo3fooFZv   comdat
        assume  CS:_D3foo3fooFZv
        mov     EAX,FS:__tls_array
        mov     ECX,[EAX]
        inc     dword ptr _D3foo1xi[ECX]
        ret
_D3foo3fooFZv   ends

__Dmain comdat
        assume  CS:__Dmain
        xor     EAX,EAX
        ret
__Dmain ends
=========================

Now look at the map file with:

    grep foo foo.map

=========================
0004:00000090  _D3foo12__ModuleInfoZ  00434090
0003:00000004  _D3foo1xi              00433004
0003:00000004  _D3foo1xi              00433004
0004:00000090  _D3foo12__ModuleInfoZ  00434090
=========================

and we see that _D3foo3fooFZv does not appear in it. Optlink does this by default, you don't even have to throw a switch.
Jan 11 2010
next sibling parent reply Matti Niemenmaa <see_signature for.real.address> writes:
On 2010-01-11 11:04, Walter Bright wrote:
 bearophile wrote:
 I don't remember what --gc-sections is, but I guess it's something
 different. The code removed during the LTO is for example unreachable
 functions, or functions/methods, that once inlined are called from
 nowhere else, unused constants, etc. Here you can see an example on C
 code (in D1 it's the same):
 http://llvm.org/docs/LinkTimeOptimization.html Anyway, currently the
 LDC project is mostly sleeping.
 Optlink does this too. It's oooollldd technology, been around since the
 80's. Consider the following program:

 ========================
 int x;

 void foo() { x++; }

 int main()
 {
     return 0;
 }
 ========================

 Compile it,

     dmd foo -L/map
<snip>
 Now look at the map file with:

 grep foo foo.map

 =========================
 0004:00000090 _D3foo12__ModuleInfoZ 00434090
 0003:00000004 _D3foo1xi 00433004
 0003:00000004 _D3foo1xi 00433004
 0004:00000090 _D3foo12__ModuleInfoZ 00434090
 =========================

 and we see that _D3foo3fooFZv does not appear in it. Optlink does this
 by default, you don't even have to throw a switch.
_D3foo1xi, however, does appear in it, even though it's just as unused as _D3foo3fooFZv. Why doesn't Optlink remove that? LLVM's LTO does. -- E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi
Jan 11 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Matti Niemenmaa wrote:
 _D3foo1xi, however, does appear in it, even though it's just as unused 
 as _D3foo3fooFZv. Why doesn't Optlink remove that? LLVM's LTO does.
It would if x were a COMDAT. The problem is that x is allocated as thread-local, which has some kludgy issues about it, since TLS is an afterthought in the OMF. The real issue, though, is that saving 4 bytes in your exe file won't make any difference. If you've got thousands of global variables (used or not), you've got other problems with your program. In other words, which optimizations do you spend time implementing? It should be the ones that matter.
Jan 11 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 Optlink does this too. It's oooollldd technology, been around since the 
 80's. Consider the following program:
Only now GCC is starting to do it, and only partially. If you compile that C example (3 files):

--- a.h ---
extern int foo1(void);
extern void foo2(void);
extern void foo4(void);

--- a.c ---
#include "a.h"

static signed int i = 0;

void foo2(void) {
    i = -1;
}

static int foo3() {
    foo4();
    return 10;
}

int foo1(void) {
    int data = 0;

    if (i < 0) {
        data = foo3();
    }

    data = data + 42;
    return data;
}

--- main.c ---
#include <stdio.h>
#include "a.h"

void foo4(void) {
    printf("Hi\n");
}

int main() {
    return foo1();
}

All you find at the end is a binary that contains the equivalent of just:

int main() {
    return 42;
}

All other constants, variables and functions are absent. I'm sure it's not rocket science, but such things do actually improve performance of programs, sometimes in my tests up to about 15-25% (with the help of inlining too).

Bye,
bearophile
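P.S. For completeness, the LLVM side of that example is driven roughly like this (a sketch; it assumes a clang toolchain with an LTO-capable linker, and how aggressively things get dead-stripped depends on the linker and version):

    clang -flto -c a.c -o a.o
    clang -flto -c main.c -o main.o
    clang -flto a.o main.o -o main

Because the bitcode for both files is visible at link time, the linker can tell the optimizer that only main needs to stay exported, so foo2, foo3 and i get internalized and then removed.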
Jan 11 2010
parent reply retard <re tard.com.invalid> writes:
Mon, 11 Jan 2010 07:12:14 -0500, bearophile wrote:

 Walter Bright:
 Optlink does this too. It's oooollldd technology, been around since the
 80's. Consider the following program:
 Only now GCC is starting to do it, and only partially. If you compile
 that C example (3 files):

 --- a.h ---
 extern int foo1(void);
 extern void foo2(void);
 extern void foo4(void);

 --- a.c ---
 #include "a.h"

 static signed int i = 0;

 void foo2(void) {
     i = -1;
 }

 static int foo3() {
     foo4();
     return 10;
 }

 int foo1(void) {
     int data = 0;

     if (i < 0) {
         data = foo3();
     }

     data = data + 42;
     return data;
 }

 --- main.c ---
 #include <stdio.h>
 #include "a.h"

 void foo4(void) {
     printf("Hi\n");
 }

 int main() {
     return foo1();
 }

 All you find at the end is a binary that contains the equivalent of just:

 int main() {
     return 42;
 }

 All other constants, variables and functions are absent. I'm sure it's
 not rocket science, but such things do actually improve performance of
 programs, sometimes in my tests up to about 15-25% (with the help of
 inlining too).
Just looking at the executable sizes (when compiled with dmc, the latter program is 30 kB and the first one is ~40 kB) makes it pretty clear that dmc/optlink does not optimize this. The same bloat issue happens here - 10 kB extra just to potentially print "Hi!" - OMG. Unfortunately I'm not buying a Windows / full dmc license to be able to objdump the executable, so it's a bit hard to see what's inside it.
Jan 11 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
retard wrote:
 Just looking at the executable size (when compiled with dmc, the latter 
 program is 30 kB and the first one is ~40 kB) makes it pretty clear that 
 dmc/optlink does not optimize this.
Yes, it does, if you use the -Nc (function level linking) switch when compiling. dmd puts functions in COMDATs by default. Some older C programs require that the layout of functions in memory match their appearance in the source file, hence dmc doesn't do that by default. http://www.digitalmars.com/ctg/sc.html#dashCapNc
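For the example above, the invocation would be along these lines (a sketch; file names as in bearophile's post):

    dmc -Nc main.c a.c

With -Nc every function goes into its own COMDAT, so Optlink can drop the ones that nothing references, just like with the D code earlier in the thread.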
 Unfortunately I'm 
 not buying a windows / full dmc license to be able to objdump the 
 executable so it's a bit hard to see what's inside it.
You can get OBJ2ASM for $15 as part of the Extended Utility Package. I think it's worth every penny! http://www.digitalmars.com/shop.html
Jan 11 2010
parent reply retard <re tard.com.invalid> writes:
Mon, 11 Jan 2010 17:59:46 -0800, Walter Bright wrote:

 retard wrote:
 Just looking at the executable size (when compiled with dmc, the latter
 program is 30 kB and the first one is ~40 kB) makes it pretty clear
 that dmc/optlink does not optimize this.
Yes, it does, if you use the -Nc (function level linking) switch when compiling. dmd puts functions in COMDATs by default. Some older C programs require that the layout of functions in memory match their appearance in the source file, hence dmc doesn't do that by default. http://www.digitalmars.com/ctg/sc.html#dashCapNc
That still doesn't explain why the resulting binaries from the two examples differ so much in size. When I compile them with gcc 4.4.3, the non-stripped non-optimized binaries are 6994 and 6427 bytes. Stripped and -Os optimized ones are 4504 and 4264 bytes. So the extra functions cost 240 bytes, while your compiler generates 10 kB of additional data. Is the printf linked in statically? Is this some Windows-related limitation?
Jan 12 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
retard wrote:
 That still doesn't explain why the resulting binaries from the two 
 examples have so big a difference in size. When I compile them with gcc 
 4.4.3, the non-stripped non-optimized binaries are 6994 and 6427 bytes. 
 Stripped and -Os optimized ones are 4504 and 4264 bytes. So the extra 
 functions cost 240 bytes while your compiler generates 10 kB of 
 additional data. The printf is linked in statically? Is this some windows 
 related limitation?
As I mentioned in another thread, gcc does not link in the C runtime library. The library is a shared library (or dll), which is not part of the exe file, though it is part of your running program. dmc links in the C runtime library statically, thus avoiding dll hell. I'll reiterate that this is easily discoverable by looking at the map file. Just looking at the exe file size and blindly guessing what is consuming space in it without looking at the map file is a waste of time.
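A quick way to see that on the gcc side (assuming Linux and a binary named a.out; the command is standard, though the library names can differ per distro):

    ldd ./a.out

will list libc.so.6 among the shared libraries the program loads at run time, so that code never shows up in the exe size. Building the gcc version with -static makes the comparison with dmc more apples-to-apples.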
Jan 12 2010
prev sibling parent "Chris" <invalid invalid.invalid> writes:
"bearophile"
 Chris:
 Execution speed perhaps, since the time elapsed is proportional to the
 number of processor instruction executed. This explains why some people
 (for certain time critical apps) do not even take the step from C to C++,
 and chose to stay 20 year behind "modern" languages.
In real programs what takes time are (beside I/O and cache issues that are another form of I/O) often small amounts of code, usually loops inside loops. Removing a single instruction inside them can reduce the running time by K%, while removing a megabyte of code elsewhere may just reduce a little the loading time, etc.
I am aware of it; this is why I specified instructions _executed_. A loop that executes 1000 times takes roughly the same running time as the equivalent loop body written out 1000 times (but the looping version has a better chance of fitting in the CPU cache...). The unrolled version does inflate the .exe but will not change the total execution time, and is not the point I was trying to make. I was not talking about reducing disk usage or .exe size per se, but about speed improvement (the real goal). If Mr. Bright can confirm that the important sections of code are well optimized, and the size problem is only due to some peripheral, seldom-executed branches of code, my concern will be mostly resolved. Thanks.
Jan 12 2010
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Ph Wrote:

 Why a generated file is so huge?
 "Empty" program such as:
 
 int main(char[][] args)
 {
 
 	return 0;
 }
 
 compiled with dmd2 into file with size of  266268 bytes.
 Even after UPX, it's size is 87552 bytes.
 Size of this code,compiled with VS(yes,yes, C++), is 6 656 bytes.
 Compiler add's standard library  to file, am i right?
 Is there some optimization which delete unused code from file?
The minimum size of a program used to be around 70k, but TypeInfo size has ballooned since then, and I believe the weak links to ModuleInfo were eliminated because of a problem with optlink (if I remember correctly). At the very least, every app will link in the GC because it's initialized during program startup, even if it's never called in user code. The GC calls core.thread in druntime, so that will be pulled in as well, along with core.exception.
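You can verify that with the same map-file trick used elsewhere in this thread (a sketch; it assumes the empty program is in empty.d, and the exact symbol names vary between druntime versions):

    dmd empty.d -L/map
    grep -i gc empty.map

Even with an empty main, the map lists the GC and core.thread symbols, because the runtime startup code initializes them before main ever runs.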
Jan 10 2010
prev sibling next sibling parent reply Roman Ivanov <isroman-rem move-km.ru> writes:
Walter Bright Wrote:
 I agree that a lot of the concerns are based on obsolete notions. First 
 off, I just bought another terabyte drive for $90. The first hard drive 
 I bought was $600 for 10Mb. A couple years earlier I used a 10Mb drive 
 that cost $5000. If I look at what eats space on my lovely terabyte 
 drive, it ain't executables. It's music and pictures. I'd be very 
 surprised if I had a whole CD's worth of exe files.
 
 Next, even a very large executable doesn't necessarily run any slower 
 than a small one. The reason is the magic of demand paged virtual 
 memory. Executables are NOT loaded into memory before running. They are 
 memory-mapped in. Only code that is actually executed is EVER loaded 
 into memory.
I would say you're severely understating the issue. There are plenty of valid reasons why small executables are desirable. Imagine every Unix utility being several megabytes large. There are hundreds of them just in the /usr/bin directory. The concerns about shipping security updates and about bandwidth usage are pretty rational as well. Besides, there is also an issue of marketing. People have a perception that larger programs run slower. Usually this perception is right. I would have a hard time "selling" D programs as efficient to my boss if the executables are much larger than their counterparts in other popular languages. Perhaps I'm missing something, but what would be involved in separating the stuff that bloats executables into a shared library?
Jan 13 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Roman Ivanov wrote:
 Perhaps I'm missing something, but what would be involved in
 separating the stuff that bloats executables into a shared library?
Creating a shared library version of Phobos requires someone to sit down and take the time to do it.
Jan 13 2010
prev sibling parent reply Roman Ivanov <isroman-rem move.km.ru> writes:
Nick Sabalausky Wrote:

 "retard" <re tard.com.invalid> wrote in message 
 news:hihgbe$qtl$2 digitalmars.com...
 Mon, 11 Jan 2010 19:24:06 -0800, Walter Bright wrote:

 dsimcha wrote:
 Vote++.  I'm convinced that there's just a subset of programmers out
 there that will not use any high-level programming model, no matter how
 much easier it makes life, unless they're convinced it has **zero**
 overhead compared to the crufty old C way.  Not negligible overhead,
 not practically insignificant overhead for their use case, not zero
 overhead in terms of whatever their most constrained resource is but
 nonzero overhead in terms of other resources, but zero overhead,
 period.

 Then there are those who won't make any tradeoff in terms of safety,
 encapsulation, readability, modularity, maintainability, etc., even if
 it means their program runs 15x slower.  Why can't more programmers
 take a more pragmatic attitude towards efficiency (among other things)?
  Yes, noone wants to just gratuitously squander massive resources, but
 is a few hundred kilobytes (fine, even a few megabytes, given how cheap
 bandwidth and storage are nowadays) larger binary really going to make
 or break your app, especially if you get it working faster and/or with
 less bugs than you would have using some cruftier, older, lower level
 language that produces smaller binaries?
I agree that a lot of the concerns are based on obsolete notions. First off, I just bought another terabyte drive for $90. The first hard drive I bought was $600 for 10Mb. A couple years earlier I used a 10Mb drive that cost $5000. If I look at what eats space on my lovely terabyte drive, it ain't executables. It's music and pictures. I'd be very surprised if I had a whole CD's worth of exe files.
A 1 TB spinning hard disk doesn't represent the current state of the art. I have Intel SSD disks, and those are damn expensive if you e.g. start to build a safe RAID 1+0 setup. Instead of 1000 GB, the same-price SSD comes with 8..16 GB. Suddenly application size starts to matter. For instance, my root partition seems to contain 9 GB worth of files, and I've only installed a quite minimal graphical Linux environment to write some modern end-user applications.
Not that other OSes don't have their own forms of bloat, but from what I've seen of Linux, an enormous amount of the system is stored as raw text files. I wouldn't be surprised if converting those to sensible (i.e. non-over-engineered) binary formats, or even just storing them all in a run-of-the-mill zip format, would noticeably cut down on that footprint.
Text files are easy to modify, extensible and self-describing. And you don't need custom tools to work with them.

In most cases, if you made extensible binary configs, the space savings would be negligible. Most lines in a typical config are of the form

    some-key=some-value

If the value is ASCII text, like a file path, there simply isn't anything you could save by converting it to binary. If the value is a number, you would get some savings, but they would be negligible for anything but very large numbers (a decimal text value takes about ceil(log_base_10(N)) bytes versus ceil(log_base_2(N) / 8) bytes in binary; something of that sort). You could get large savings on booleans, but you would have to be tricky, because memory still works in bytes. You would get roughly the same savings by stripping text configs of comments and redundant whitespace.

Storing configs in zip files would increase boot and load times, because applications would waste cycles on unzipping them. That is bloat of a worse kind.
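To illustrate the "no custom tools" point: a throwaway reader for that kind of file is only a few lines of D. This is just a sketch; the file name app.conf, the '#' comment syntax and the use of byLineCopy (which needs a reasonably recent Phobos) are assumptions for the example, not anyone's real config format.

import std.stdio;
import std.string;

void main()
{
    string[string] config;
    foreach (line; File("app.conf").byLineCopy())
    {
        auto s = strip(line);
        // skip blank lines and '#' comments
        if (s.length == 0 || s[0] == '#')
            continue;
        // split on the first '=' into key and value
        auto eq = indexOf(s, '=');
        if (eq < 0)
            continue;
        config[strip(s[0 .. eq])] = strip(s[eq + 1 .. $]);
    }
    foreach (key, value; config)
        writeln(key, " = ", value);
}

A binary format would need a dedicated reader, writer and dump tool before you could do even this much with it.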
Jan 13 2010
parent reply "Nick Sabalausky" <a a.a> writes:
"Roman Ivanov" <isroman-rem move.km.ru> wrote in message 
news:hilovb$ioc$1 digitalmars.com...
 Most lines in a typical config are of the form

 some-key=some-vaue

 If the value is ASCII text, like a file path, there simply isn't anything 
 you could save by converting it to binary. If the value is a number, you 
 would get some savings, but they would be negligible for anything but very 
 large numbers. ( ceil(log_base_10(N)) - ceil(log_base_2(N))? Something of 
 that sort.) You could get large savings on booleans, but you would have to 
 be tricky, because memory still works in bytes.
Good points. Although with binary, the key wouldn't necessarily need to be text. But yea, I'm probably just splitting hairs now.
 Storing configs in zip files would increase boot and load times, because 
 applications would waste cycles on unzipping them. That is bloat of a 
 worse kind.
I'm not sure that would be the case if the data was stored on a hard drive (as opposed to one of those flash drives), because wouldn't the cost of extra disk access typically be greater than decoding a non-top-of-the-line zip encoding? (Not that I really care all that much either way.)
Jan 13 2010
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Jan 2010 00:19:56 -0500, Nick Sabalausky <a a.a> wrote:

 "Roman Ivanov" <isroman-rem move.km.ru> wrote in message
 Storing configs in zip files would increase boot and load times, because
 applications would waste cycles on unzipping them. That is bloat of a
 worse kind.
I'm not sure that would be the case if the data was stored on a hard drive (as opposed to one of those flash drives), because wouldn't the cost of extra disk access typically be greater than decoding a non-top-of-the-line zip encoding? (Not that I really care all that much either way.)
Unless your text file is on the order of 200k bytes, you don't get much savings :) hard drives read blocks, not bytes, so the relationship of load time to bytes is not linear. -Steve
Jan 14 2010