digitalmars.D - DMD producing huge binaries
- Georgi D (110/110) May 17 2016 Hi,
- Andrei Alexandrescu (3/7) May 17 2016 [snip]
- ZombineDev (4/14) May 17 2016 I'm guessing that is highly related to this one:
- Steven Schveighoffer (7/19) May 19 2016 Yep. chain uses voldemort type, map does not.
- Andrei Alexandrescu (16/17) May 19 2016 We definitely need to fix Voldemort types. Walter and I bounced a few
- Steven Schveighoffer (40/56) May 19 2016 This may be crazy, but I solved this problem by simply moving the struct...
- Andrei Alexandrescu (2/4) May 19 2016 That's also a possibility, but likely a more complicated one. -- Andrei
- Daniel Murphy (2/6) May 19 2016 We don't actually need to lower it, just mangle it as if it was lowered.
- Andrei Alexandrescu (2/9) May 19 2016 Noice! -- Andrei
- Steven Schveighoffer (5/17) May 19 2016 This solution works awesomely, actually. It even produces smaller code
- Walter Bright (3/18) May 19 2016 Consider using a 'static struct'. A non-static struct will include a hid...
- jmh530 (7/10) May 19 2016 Queue me going to the language reference to look up static
- Walter Bright (2/6) May 19 2016 You can always submit a PR for a better one.
- Steven Schveighoffer (7/29) May 23 2016 Yes, I did use static in my actual code. In fact, I found a bug because
- Walter Bright (4/7) May 19 2016 Let's see how far we get with compression first.
- Adam D. Ruppe (9/11) May 19 2016 Using 6.4 megabyte strings already makes symbolic debugging
- poliklosio (5/16) May 19 2016 Clearly the reason for building such a gigantic string was some
- poliklosio (2/7) May 19 2016 It seems like dynamic programming can be useful.
- Patrick Schluter (4/15) May 19 2016 Good point. Using a SHA1 derived from the string instead of a
- default0 (7/17) May 19 2016 You could simply add a "trivial" version of the struct +
- poliklosio (2/6) May 20 2016 This is why compression is not good enough.
- ZombineDev (11/21) May 20 2016 Unfortunately, the PR doesn't cure the root cause, but only
- ZombineDev (11/35) May 20 2016 IMO, the best way forward is:
- Rene Zwanenburg (3/13) May 20 2016 Location info shouldn't be used. This will break things like
- ZombineDev (17/31) May 20 2016 Well... template-heavy code doesn't play well header-files and
- ZombineDev (11/43) May 20 2016 @Rene
- Rene Zwanenburg (48/58) May 20 2016 I was thinking of something along the lines of this:
- ZombineDev (8/68) May 20 2016 The only issue is code bloat. It also happens with separate
- Andrei Alexandrescu (6/15) May 20 2016 This is a fallacy. I don't think so, at all, when the baseline is an
- Johan Engelen (11/28) May 20 2016 I agree with Andrei.
- Andrei Alexandrescu (3/29) May 20 2016 This is nice. How difficult would it be to rework it into a PR for dmd?
- Johan Engelen (13/26) May 20 2016 I can work on it, but only if it will not result in a long debate
- Andrei Alexandrescu (5/30) May 20 2016 I don't see a need for hashing something. Would a randomly-generated
- H. S. Teoh via Digitalmars-d (11/26) May 20 2016 [...]
- Marc Schütz (2/4) May 20 2016 That would break separate compilation, wouldn't it?
- Wyatt (8/10) May 20 2016 Naïve question: is a (randomly-)?generated _anything_ required?
- Adam D. Ruppe (7/9) May 20 2016 Line and file is not unique because the same template would
- Walter Bright (3/5) May 20 2016 Hashing produces reproducible unique values. Random will not be reproduc...
- Dicebot (6/12) May 20 2016 The question is: is it actually good for them to be reproducible?
- Walter Bright (7/8) May 20 2016 1. yes, because the same type can be generated from multiple compilation...
- H. S. Teoh via Digitalmars-d (7/10) May 20 2016 Yeah, I think hashing is the way to go. Random strings just raise more
- cym13 (7/21) May 20 2016 It would make binary comparison of libraries and executables
- Bill Hicks (4/10) May 20 2016 Just a troll here, but Quasirandom numbers are reproducible and
- Era Scarecrow (4/10) May 20 2016 Unless there's a way to seed it consistently so order doesn't
- Bill Hicks (3/8) May 20 2016 There is no seed; it's a low-discrepancy sequence, and each
- ZombineDev (7/37) May 20 2016 I like your approach. As I said earlier, it would be best if can
- Johan Engelen (8/14) May 20 2016 From what I've observed, generating the long symbol name itself
- ZombineDev (18/33) May 20 2016 IIUC, your scheme works like this:
- Georgi D (13/49) May 20 2016 I see two separate issues that I think should be handled
- Andrei Alexandrescu (4/7) May 20 2016 Was talking to Walter on the phone and he just had one of those great
- Georgi D (11/19) May 20 2016 This is a very interesting idea. I see one problem though: The
- Era Scarecrow (6/16) May 20 2016 I keep having it go through my head, if we had to hash the
- Andrei Alexandrescu (3/4) May 20 2016 Voldemort returns are by far the worst. Compression and hashing should
- Kagamin (4/7) May 21 2016 Equivalent to not mangling return type at all. But this leaves
- krzaq (8/17) May 21 2016 Would it be practical to use type-erasure in the middle of a call
- Andrei Alexandrescu (3/4) May 21 2016 He said that that won't happen any longer, the growth was because of the...
- Kagamin (8/10) May 21 2016 https://issues.dlang.org/show_bug.cgi?id=15831#c4
- Andrei Alexandrescu (4/13) May 22 2016 Just to make sure I understand: do you agree or disagree that there's no...
- Steven Schveighoffer (15/29) May 23 2016 Not sure if someone has definitively answered before, but no, this does
- Steven Schveighoffer (4/8) May 23 2016 Actually, this is not true. It will only fail to link if the name of the...
- Steven Schveighoffer (6/24) May 23 2016 To clarify, I think this is still a good idea, but only if the type is
- Walter Bright (2/3) May 23 2016 It does factor into type matching when a function pointer is used, for e...
- Steven Schveighoffer (6/10) May 23 2016 Yes, that may be the only time return type is factored in, but
- Era Scarecrow (14/17) May 21 2016 Depends on implementation and algorithm. However even the
- Walter Bright (4/15) May 21 2016 We already have a compressor in the compiler source for compressing name...
- Walter Bright (3/6) May 21 2016 Note how well it does:
- Anon (6/16) May 21 2016 The underlying symbols are still growing exponentially. Nobody
- Andrei Alexandrescu (3/18) May 22 2016 My understanding is that the encoding "auto" return in the mangling
- Anon (27/51) May 22 2016 No:
- Walter Bright (9/12) May 20 2016 The compressor only kicks in if the string is longer than 64 bytes. I se...
- Andrei Alexandrescu (5/18) May 20 2016 From the measurements shown the work seems Pareto distributed - the
- Johan Engelen (12/14) May 20 2016 I also don't think compression should be a performance issue. I
- poliklosio (22/40) May 20 2016 I have an Idea of reproducible, less-than-exponential time
- Johan Engelen (3/14) May 21 2016 See
- poliklosio (7/23) May 21 2016 So if I understand correctly, you tried implementing something
- Walter Bright (3/14) May 21 2016 This is what LZW compression does, except that LZW does it in general ra...
- poliklosio (12/30) May 21 2016 That's true, but a general compression algorithm requires a
- Guillaume Boucher (2/11) May 21 2016 It's more like LZ77 than LZW.
- Georgi D (22/46) May 19 2016 Yes,
- Steven Schveighoffer (14/42) May 19 2016 I'd much prefer we fix the compiler to handle voldemort types in a more
- Andrei Alexandrescu (2/6) May 19 2016 Thanks very much for helping with these measurements! -- Andrei
- Georgi D (16/25) May 20 2016 Just some more numbers:
- Georgi D (3/13) May 17 2016 I have file bug: https://issues.dlang.org/show_bug.cgi?id=16039
Hi,

While working on a D project which heavily uses the lazy algorithms for ranges, I noticed a sudden huge increase in the compilation time and the produced binary size. I chased the offending change down to a small change in one line of the code, which made the resulting binary jump from 18MB to over 400MB. The code that produces the large binary actually fails to link with DMD on OS X (it works on Linux): the link command just consumes all the CPU and never completes. LDC on OS X fails to compile it in the same way, using all the CPU and never completing. There is also a very big difference between passing -unittest and not passing it.

I was able to shrink the code down to the following:

import std.range.primitives;
import std.range : chain, choose, repeat, chooseAmong, takeOne;
import std.algorithm.iteration : joiner, map;
import std.range.interfaces : InputRange, inputRangeObject;
import std.conv : text;

struct AType
{
    string id;
    Annotation[] annotations;
    TypeArgument[] typeArguments;
}

struct Annotation { string type; }

struct TypeArgument { AType refType; }

struct PrimaryValue { int type; }

struct FieldDecl
{
    AType type;
    string id;
    PrimaryValue[] values;
}

struct MethodDecl
{
    string name;
    AType returnType;
    FieldDecl[] params;
}

auto toCharsArray(R)(R r, string sep = "\n", string tail = "\n")
    if (isInputRange!R)
{
    return choose(r.empty, "", chain(r.joiner(sep), tail));
}

auto toChars(Annotation uda) { return chain(" ", uda.type); }

InputRange!(ElementType!string) toChars(TypeArgument ta)
{
    auto r = chain("", ta.refType.id);
    with (ta.refType)
    {
        auto tArgs = choose(typeArguments.empty, "",
            chain("!(", typeArguments.map!(toChars).joiner(", "), ")"));
        return chain(id, tArgs).inputRangeObject();
    }
}

auto toChars(AType t)
{
    auto udas = t.annotations.map!(toChars).toCharsArray();
    auto tArgs = choose(t.typeArguments.empty, "",
        chain("!(", t.typeArguments.map!(toChars).joiner(", "), ")"));
    return chain(udas, t.id, tArgs);
}

/// PrimaryValue
auto toChars(PrimaryValue pv) { return chooseAmong(pv.type, "val", "val"); }

auto toCharsSingleOrArray(R)(R r)
    if (isInputRange!R)
{
    return choose(r.length == 1,
        r.takeOne.map!(toChars).joiner(),
        chain("[", r.map!(toChars).joiner(", "), "]"));
}

auto toCharsBare(FieldDecl f)
{
    auto def = chain(f.type.toChars, " ", f.id);
    return choose(f.values.empty, def,
        chain(def, " = ", toCharsSingleOrArray(f.values)));
}

auto toChars(FieldDecl f) { return chain("", f.toCharsBare()); }

auto toChars(MethodDecl m)
{
    // Just changing the line below to have map!(toChars) reduces the binary
    // size to 18MB from 403MB
    auto prms = chain("(", m.params.map!(toCharsBare).toCharsArray(", ", ""), ")");
    return chain(" ", m.name, prms, ";");
}

unittest
{
    AType t1 = {id : "Foo"};
    FieldDecl f1 = {type : t1, id : "f1"};
    MethodDecl m1 = {name : "m", returnType : t1};
    assert(!m1.toChars().text().empty); // removing this line drops the size by half
}

void main() { }

I am compiling this with:

I have marked the line that needs to be changed for the code size to drop significantly. What is interesting is that using toChars should be more complex than using toCharsBare, since the former calls the latter, yet the binary size just explodes when using toCharsBare. I apologize for the long code segment.
May 17 2016
On 05/17/2016 05:44 PM, Georgi D wrote:Hi, While working on a D project which heavily uses the lazy algorithms for ranges I noticed a sudden huge increase in the compilation time and the produced binary size.[snip] Thanks. That's definitely deserving of a bug report. -- Andrei
May 17 2016
On Tuesday, 17 May 2016 at 21:58:06 UTC, Andrei Alexandrescu wrote:On 05/17/2016 05:44 PM, Georgi D wrote:I'm guessing that is highly related to this one: https://issues.dlang.org/show_bug.cgi?id=15831Hi, While working on a D project which heavily uses the lazy algorithms for ranges I noticed a sudden huge increase in the compilation time and the produced binary size.[snip] Thanks. That's definitely deserving of a bug report. -- Andrei
May 17 2016
On 5/17/16 6:04 PM, ZombineDev wrote:On Tuesday, 17 May 2016 at 21:58:06 UTC, Andrei Alexandrescu wrote:Yep. chain uses voldemort type, map does not. Georgi: try copying chain implementation into your local file, but instead of having a static internal struct, move it to a module-level struct (you probably have to repeat the template parameters, but not the constraints). See if it produces a similar slimming effect. -SteveOn 05/17/2016 05:44 PM, Georgi D wrote:I'm guessing that is highly related to this one: https://issues.dlang.org/show_bug.cgi?id=15831Hi, While working on a D project which heavily uses the lazy algorithms for ranges I noticed a sudden huge increase in the compilation time and the produced binary size.[snip] Thanks. That's definitely deserving of a bug report. -- Andrei
May 19 2016
On 05/19/2016 08:38 AM, Steven Schveighoffer wrote:Yep. chain uses voldemort type, map does not.We definitely need to fix Voldemort types. Walter and I bounced a few ideas during DConf.

1. Do the cleartext/compress/hash troika everywhere (currently we do it on Windows). That should help in several places.

2. For Voldemort types, simply generate a GUID-style random string of e.g. 64 bytes. Currently the name of the type is composed out of its full scope, with some exponentially-increasing repetition of substrings. That would compress well, but it's just a lot of work to produce and then compress the name. A random string is cheap to produce and adequate for Voldemort types (you don't care for their name anyway... Voldemort... get it?).

I very much advocate slapping a 64-character random string on all Voldemort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.

Andrei
May 19 2016
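The exponentially-increasing repetition described above is easy to observe directly: every range adaptor that wraps a Voldemort type embeds the full mangled name of the wrapped type in its own, so each level of nesting roughly doubles the symbol length. A minimal sketch (exact lengths vary by compiler version, so none are claimed here):

```d
import std.range : chain;
import std.stdio : writeln;

void main()
{
    auto a = chain("a", "b");
    auto b = chain(a, a); // typeof(b)'s mangling embeds typeof(a)'s twice
    auto c = chain(b, b); // ...and typeof(c)'s embeds typeof(b)'s twice

    // Print how fast the mangled type names grow with nesting depth.
    writeln(typeof(a).mangleof.length);
    writeln(typeof(b).mangleof.length);
    writeln(typeof(c).mangleof.length);
}
```

A few more levels of nesting like this is enough to reach the multi-megabyte symbols discussed in this thread.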
On 5/19/16 9:45 AM, Andrei Alexandrescu wrote:On 05/19/2016 08:38 AM, Steven Schveighoffer wrote:This may be crazy, but I solved this problem by simply moving the struct outside the function. What about a lowering that does this for you? Example:

auto foo(T)(T t) if (someConstraints)
{
    static struct Result { T t; }
    return Result(t);
}

Changes to:

private struct foo_Result(T) if (someConstraints)
{
    T t;
}

auto foo(T)(T t) if (someConstraints)
{
    alias Result = foo_Result!T; // inserted by compiler
    return Result(t);
}

Or even better:

template foo(T) if (someConstraints)
{
    struct Result { T t; }
    auto foo(T t) { return Result(t); }
}

What this does is lower the number of times the template parameters (and function arguments) appear in the type name to 1. This gets rid of the exponential nature of the symbol name. This doesn't work for voldemorts with closures, but maybe it can somehow be reconstructed so the context is included in the generated struct. -SteveYep. chain uses voldemort type, map does not.We definitely need to fix Voldemort types. Walter and I bounced a few ideas during DConf. 1. Do the cleartext/compress/hash troika everywhere (currently we do it on Windows). That should help in several places. 2. For Voldemort types, simply generate a GUID-style random string of e.g. 64 bytes. Currently the name of the type is composed out of its full scope, with some exponentially-increasing repetition of substrings. That would compress well, but it's just a lot of work to produce and then compress the name. A random string is cheap to produce and adequate for Voldemort types (you don't care for their name anyway... Voldemort... get it?). I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.
May 19 2016
On 05/19/2016 10:43 AM, Steven Schveighoffer wrote:This may be crazy, but I solved this problem by simply moving the struct outside the function. What about a lowering that does this for you?That's also a possibility, but likely a more complicated one. -- Andrei
May 19 2016
On 20/05/2016 12:44 AM, Andrei Alexandrescu wrote:On 05/19/2016 10:43 AM, Steven Schveighoffer wrote:We don't actually need to lower it, just mangle it as if it was lowered.This may be crazy, but I solved this problem by simply moving the struct outside the function. What about a lowering that does this for you?That's also a possibility, but likely a more complicated one. -- Andrei
May 19 2016
On 5/19/16 10:56 AM, Daniel Murphy wrote:On 20/05/2016 12:44 AM, Andrei Alexandrescu wrote:Noice! -- AndreiOn 05/19/2016 10:43 AM, Steven Schveighoffer wrote:We don't actually need to lower it, just mangle it as if it was lowered.This may be crazy, but I solved this problem by simply moving the struct outside the function. What about a lowering that does this for you?That's also a possibility, but likely a more complicated one. -- Andrei
May 19 2016
On 5/19/16 10:43 AM, Steven Schveighoffer wrote:Or even better: template foo(T) if (someConstraints) { struct Result { T t; } auto foo(T t) { return Result(t); } }This solution works awesomely, actually. It even produces smaller code than moving the struct completely outside. I'm going to use this mechanism in iopipe to get my voldemorts back :) -Steve
May 19 2016
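The effect of the template-level struct can be checked with .mangleof. This is only a sketch of the pattern under discussion (identifiers are invented): in the Voldemort version the nested struct's mangled name goes through the function, whose mangling includes parameter and return types, so every level of nesting multiplies the name length; in the eponymous-template version the struct hangs directly off the template instance, so the template argument appears only once per level.

```d
// Voldemort shape: Result is nested inside the function itself.
auto wrap(R)(R r)
{
    static struct Result { R r; }
    return Result(r);
}

// Workaround shape: Result is a member of the enclosing template.
template wrap2(R)
{
    struct Result { R r; }
    auto wrap2(R r) { return Result(r); }
}

void main()
{
    import std.stdio : writeln;

    auto a = wrap(wrap(wrap(1)));
    auto b = wrap2(wrap2(wrap2(1)));

    // The first length grows much faster as the nesting gets deeper.
    writeln(typeof(a).mangleof.length, " vs ", typeof(b).mangleof.length);
}
```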
On 5/19/2016 8:39 AM, Steven Schveighoffer wrote:Consider using a 'static struct'. A non-static struct will include a hidden member that points to the stack frame of foo(), i.e. "produces smaller code".template(T) foo if (someConstraints) { struct Result { T t; } auto foo(T t) { return Result(t); } }This solution works awesomely, actually. It even produces smaller code than moving the struct completely outside. I'm going to use this mechanism in iopipe to get my voldemorts back :)
May 19 2016
On Thursday, 19 May 2016 at 22:17:37 UTC, Walter Bright wrote:Consider using a 'static struct'. A non-static struct will include a hidden member that points to the stack frame of foo(), i.e. "produces smaller code".Cue me going to the language reference to look up static structs and reading "A struct can be prevented from being nested by using the static attribute, but then of course it will not be able to access variables from its enclosing scope." Not a fan of that description...
May 19 2016
On 5/19/2016 4:36 PM, jmh530 wrote:Queue me going to the language reference to look up static structs and reading "A struct can be prevented from being nested by using the static attribute, but then of course it will not be able to access variables from its enclosing scope." Not a fan of that description...You can always submit a PR for a better one.
May 19 2016
On 5/19/16 6:17 PM, Walter Bright wrote:On 5/19/2016 8:39 AM, Steven Schveighoffer wrote:Yes, I did use static in my actual code. In fact, I found a bug because of this (one of my voldemort functions was allocating a closure because I had thought I was using a member but was actually using a function parameter). But this doesn't fix the symbol size problem. The other mechanism does. -SteveConsider using a 'static struct'. A non-static struct will include a hidden member that points to the stack frame of foo(), i.e. "produces smaller code".template(T) foo if (someConstraints) { struct Result { T t; } auto foo(T t) { return Result(t); } }This solution works awesomely, actually. It even produces smaller code than moving the struct completely outside. I'm going to use this mechanism in iopipe to get my voldemorts back :)
May 23 2016
On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 19 2016
On Thursday, 19 May 2016 at 22:16:03 UTC, Walter Bright wrote:Using 64 character random strings will make symbolic debugging unpleasant.Using 6.4 megabyte strings already makes symbolic debugging unpleasant. The one thing that worries me about random strings is that it needs to be the same across all builds, or you'll get random linking errors when doing package-at-a-time or whatever (dmd already has some problems like this!). But building a gigantic string then compressing or hashing it still sucks... what we need is a O(1) solution that is still unique and repeatable.
May 19 2016
On Thursday, 19 May 2016 at 22:46:02 UTC, Adam D. Ruppe wrote:On Thursday, 19 May 2016 at 22:16:03 UTC, Walter Bright wrote:Clearly the reason for building such a gigantic string was some sort of repetition. Detect the repetition on-the-fly to avoid processing all of it. This way the generated name is already compressed.Using 64 character random strings will make symbolic debugging unpleasant.Using 6.4 megabyte strings already makes symbolic debugging unpleasant. The one thing that worries me about random strings is that it needs to be the same across all builds, or you'll get random linking errors when doing package-at-a-time or whatever (dmd already has some problems like this!). But building a gigantic string then compressing or hashing it still sucks... what we need is a O(1) solution that is still unique and repeatable.
May 19 2016
On Thursday, 19 May 2016 at 23:56:46 UTC, poliklosio wrote:(...) Clearly the reason for building such a gigantic string was some sort of repetition. Detect the repetition on-the-fly to avoid processing all of it. This way the generated name is already compressed.It seems like dynamic programming can be useful.
May 19 2016
On Thursday, 19 May 2016 at 22:46:02 UTC, Adam D. Ruppe wrote:On Thursday, 19 May 2016 at 22:16:03 UTC, Walter Bright wrote:Good point. Using a SHA1 derived from the string instead of a GUID is imho better. It has the advantage of repeatability, is even shorter and not very expensive to generate.Using 64 character random strings will make symbolic debugging unpleasant.Using 6.4 megabyte strings already makes symbolic debugging unpleasant. The one thing that worries me about random strings is that it needs to be the same across all builds, or you'll get random linking errors when doing package-at-a-time or whatever (dmd already has some problems like this!). But building a gigantic string then compressing or hashing it still sucks... what we need is a O(1) solution that is still unique and repeatable.
May 19 2016
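A sketch of the hash-the-name idea (the threshold and layout here are invented for illustration; this is not what any compiler currently emits): short manglings are left alone, while overlong ones keep a readable prefix and have the remainder replaced with a digest, so the result is short, unique in practice, and reproducible across builds.

```d
import std.digest.sha : sha1Of;
import std.digest.digest : toHexString; // moved to std.digest in later Phobos

// Hypothetical post-processing step for overlong mangled names.
string shortenSymbol(string mangled, size_t threshold = 64)
{
    if (mangled.length <= threshold)
        return mangled;                    // short names stay untouched

    // SHA1 is deterministic: the same long name always shortens the
    // same way, unlike a randomly generated GUID.
    char[40] hex = sha1Of(mangled).toHexString();
    return mangled[0 .. threshold] ~ "_" ~ hex[].idup;
}
```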
On Thursday, 19 May 2016 at 22:16:03 UTC, Walter Bright wrote:On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:You could simply add a "trivial" version of the struct + enclosing function name (i.e. once, without repetitions) before or after the random character string. This way you would know which struct it's referring to, it's unique, and you still avoid generating a 5-exabyte symbol name just to compress/hash/whatever it.I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 19 2016
On Friday, 20 May 2016 at 05:34:15 UTC, default0 wrote:On Thursday, 19 May 2016 at 22:16:03 UTC, Walter Bright wrote:This is why compression is not good enough.(...)(...) you still avoid generating a 5 Exabyte large symbol name just to compress/hash/whatever it.
May 20 2016
On Thursday, 19 May 2016 at 22:16:03 UTC, Walter Bright wrote:On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:Unfortunately, the PR doesn't cure the root cause, but only provides linear 45-55% improvement, which doesn't scale with the exponential growth of the symbol size:

case             time to compile   du -h   strings file | wc -c
NG-case before   0m19.708s         339M    338.23M
NG-case after    0m27.006s         218M    209.35M
OK-case before   0m1.466s          16M     15.33M
OK-case after    0m1.856s          11M     9.28M

For more info on the measurements: https://github.com/dlang/dmd/pull/5793#issuecomment-220550682

I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 20 2016
On Friday, 20 May 2016 at 09:48:15 UTC, ZombineDev wrote:On Thursday, 19 May 2016 at 22:16:03 UTC, Walter Bright wrote:IMO, the best way forward is:

+ The compiler should lower voldemort types, according to the scheme that Steve suggested (http://forum.dlang.org/post/nhkmo7$ob5$1@digitalmars.com)

+ After that, during symbol generation (mangling), if a symbol starts getting larger than some threshold (e.g. 800 characters), the mangling algorithm should detect that and bail out by generating some unique id instead. The only valuable information that the symbol must include is the module name and location (line and column number) of the template instantiation.

On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:Unfortunately, the PR doesn't cure the root cause, but only provides linear 45-55% improvement, which doesn't scale with the exponential growth of the symbol size: case time to compile du -h strings file | wc -c NG-case before 0m19.708s 339M 338.23M NG-case after 0m27.006s 218M 209.35M OK-case before 0m1.466s 16M 15.33M OK-case after 0m1.856s 11M 9.28M For more info on the measurements: https://github.com/dlang/dmd/pull/5793#issuecomment-220550682I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 20 2016
On Friday, 20 May 2016 at 11:32:16 UTC, ZombineDev wrote:IMO, the best way forward is: + The compiler should lower voldemort types, according to the scheme that Steve suggested (http://forum.dlang.org/post/nhkmo7$ob5$1 digitalmars.com) + After that, during symbol generation (mangling) if a symbol starts getting larger than some threshold (e.g. 800 characters), the mangling algorithm should detect that and bail out by generating some unique id instead. The only valuable information that the symbol must include is the module name and location (line and column number) of the template instantiation.Location info shouldn't be used. This will break things like interface files and dynamic libraries.
May 20 2016
On Friday, 20 May 2016 at 11:40:12 UTC, Rene Zwanenburg wrote:On Friday, 20 May 2016 at 11:32:16 UTC, ZombineDev wrote:Well... template-heavy code doesn't play well with header files and dynamic libraries. Most of the time templates are used for the implementation of an interface, but template types such as ranges are unusable in function signatures. That's why they're called voldemort types - because they're the ones that can not/must not be named. Instead dynamic libraries should use stable types such as interfaces, arrays and function pointers, which don't have the aforementioned symbol size problems. Since we're using random numbers for symbols (instead of the actual names) it would not be possible for such symbols to be part of an interface, because a different invocation of the compiler would produce different symbol names. Such symbols should always be an implementation detail, and not part of an interface. That's why location info would play no role, except for debugging purposes.IMO, the best way forward is: + The compiler should lower voldemort types, according to the scheme that Steve suggested (http://forum.dlang.org/post/nhkmo7$ob5$1 digitalmars.com) + After that, during symbol generation (mangling) if a symbol starts getting larger than some threshold (e.g. 800 characters), the mangling algorithm should detect that and bail out by generating some unique id instead. The only valuable information that the symbol must include is the module name and location (line and column number) of the template instantiation.Location info shouldn't be used. This will break things like interface files and dynamic libraries.
May 20 2016
On Friday, 20 May 2016 at 12:01:14 UTC, ZombineDev wrote:On Friday, 20 May 2016 at 11:40:12 UTC, Rene Zwanenburg wrote:@Rene How do you expect the compiler to know the exact return type, only by looking at this signature: auto transmogrify(string str); A possible implementation might be this: auto transmogrify(string str) { return str.map!someFunc.filter!otherFunc.joiner(); } or something completely different.Well... template-heavy code doesn't play well header-files and dynamic libraries. Most of the time templates are used for the implementation of an interface, but template types such as ranges are unsuable in function signatures. That's why they're called voldemort types - because they're the ones that can not/must not be named. Instead dynamic libraries should use stable types such as interfaces, arrays and function pointers, which don't have the aforementioned symbol size problems. Since we're using random numbers for symbols (instead of the actual names) it would not be possible for such symbols to be part of an interface, because a different invocation of the compiler would produce different symbol names. Such symbols should always an implementation detail, and not part of an interface. That's why location info would play no role, except for debugging purposes.IMO, the best way forward is: + The compiler should lower voldemort types, according to the scheme that Steve suggested (http://forum.dlang.org/post/nhkmo7$ob5$1 digitalmars.com) + After that, during symbol generation (mangling) if a symbol starts getting larger than some threshold (e.g. 800 characters), the mangling algorithm should detect that and bail out by generating some unique id instead. The only valuable information that the symbol must include is the module name and location (line and column number) of the template instantiation.Location info shouldn't be used. This will break things like interface files and dynamic libraries.
May 20 2016
On Friday, 20 May 2016 at 12:08:37 UTC, ZombineDev wrote:Rene How do you expect the compiler to know the exact return type, only by looking at this signature: auto transmogrify(string str); A possible implementation might be this: auto transmogrify(string str) { return str.map!someFunc.filter!otherFunc.joiner(); } or something completly different.I was thinking of something along the lines of this:

=======
size_t frobnicate(int i) { return 0; }

auto frobnicator(T)(T t)
{
    static struct Result
    {
        int index;
        size_t front() { return frobnicate(index); }
        enum empty = false;
        void popFront() { ++index; }
    }
    return Result(t.index);
}
=======

Automatically generating a header with DMD gives me:

=======
size_t frobnicate(int i);

auto frobnicator(T)(T t)
{
    static struct Result
    {
        int index;
        size_t front();
        enum empty = false;
        void popFront();
    }
    return Result(t.index);
}
=======

Now frobnicator returns a different type for the same T depending on whether you're using the .d or the .di file. I'm not sure if this is a problem, but it sounds like something that can come back to bite you in edge cases.
May 20 2016
On Friday, 20 May 2016 at 15:39:18 UTC, Rene Zwanenburg wrote:On Friday, 20 May 2016 at 12:08:37 UTC, ZombineDev wrote:The only issue is code bloat. It also happens with separate compilation, because the compiler can't know if the template has already been instantiated in a different compilation unit. Only in a single compilation unit/invocation can the compiler reliably see all the places where the same template instance is used. In that case, the actual mangled name shouldn't matter, AFAIU.Rene How do you expect the compiler to know the exact return type, only by looking at this signature: auto transmogrify(string str); A possible implementation might be this: auto transmogrify(string str) { return str.map!someFunc.filter!otherFunc.joiner(); } or something completly different.I was thinking of something along the lines of this: ======= size_t frobnicate(int i) { return 0; } auto frobnicator(T)(T t) { static struct Result { int index; size_t front() { return frobnicate(index); } enum empty = false; void popFront() { ++index; } } return Result(t.index); } ======= Automatically generating a header with DMD gives me: ======= size_t frobnicate(int i); auto frobnicator(T)(T t) { static struct Result { int index; size_t front(); enum empty = false; void popFront(); } return Result(t.index); } ======= Now frobnicator returns a different type for the same T depending on whether you're using the .d or the .di file. I'm not sure if this is a problem, but it sounds like something that can come back to bite you in edge cases.
May 20 2016
On 5/19/16 6:16 PM, Walter Bright wrote:On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:This is a fallacy. I don't think so, at all, when the baseline is an extremely long string. In any given scope there are at most a handful of Voldemort types at play, and it's easy to identify which is which. We've all done so many times with IPs, GUIDs of objects, directories, etc. etc. -- AndreiI very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 20 2016
On Friday, 20 May 2016 at 10:54:18 UTC, Andrei Alexandrescu wrote:On 5/19/16 6:16 PM, Walter Bright wrote:I agree with Andrei. I solved it this way https://github.com/ldc-developers/ldc/pull/1445: "Hashed symbols look like this: _D3one3two5three3L3433_46a82aac733d8a4b3588d7fa8937aad66Result3fooZ ddemangle gives: one.two.three.L34._46a82aac733d8a4b3588d7fa8937aad6.Result.foo Meaning: this symbol is defined in module one.two.three on line 34. The identifier is foo and is contained in the struct or class Result."On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:This is a fallacy. I don't think so, at all, when the baseline is an extremely long string.I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 20 2016
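[Editorial note: the shape of Johan's hashing scheme can be sketched in a few lines. This is a purely illustrative Python model, not the code from the LDC pull request; the threshold value, helper name, and exact output layout are assumptions.]

```python
import hashlib

def shorten_symbol(mangled, module_prefix, ident, threshold=1000):
    """Replace an overlong mangled name with a fixed-size hash while
    keeping the module prefix and innermost identifier readable.
    Hypothetical sketch only, not the LDC implementation."""
    if len(mangled) <= threshold:
        return mangled  # short enough: keep the full type information
    digest = hashlib.md5(mangled.encode()).hexdigest()
    return "{}_{}_{}".format(module_prefix, digest, ident)

# a synthetic, highly nested mangle standing in for a voldemort chain
long_name = "_D3one3two5three" + "9SomeChain" * 200 + "6Result3fooZ"
short = shorten_symbol(long_name, "_D3one3two5three", "6Result3fooZ")
assert len(long_name) > 2000 and len(short) < 64
```

The key property is that the hash is computed from the full mangle, so the same instantiation always shortens to the same name.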
On 05/20/2016 08:21 AM, Johan Engelen wrote:On Friday, 20 May 2016 at 10:54:18 UTC, Andrei Alexandrescu wrote:This is nice. How difficult would it be to rework it into a PR for dmd? -- AndreiOn 5/19/16 6:16 PM, Walter Bright wrote:I agree with Andrei. I solved it this way https://github.com/ldc-developers/ldc/pull/1445: "Hashed symbols look like this: _D3one3two5three3L3433_46a82aac733d8a4b3588d7fa8937aad66Result3fooZ ddemangle gives: one.two.three.L34._46a82aac733d8a4b3588d7fa8937aad6.Result.foo Meaning: this symbol is defined in module one.two.three on line 34. The identifier is foo and is contained in the struct or class Result."On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:This is a fallacy. I don't think so, at all, when the baseline is an extremely long string.I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 20 2016
On Friday, 20 May 2016 at 12:30:10 UTC, Andrei Alexandrescu wrote:On 05/20/2016 08:21 AM, Johan Engelen wrote:I can work on it, but only if it will not result in a long debate afterwards (!!!). One obstacle is the hasher itself: I am not going to implement one myself! In the LDC PR, I used LLVM's MD5 hasher and Phobos's MD5 hasher. Perhaps it is better to use a faster hasher (I have no expertise on that; Murmur?), so we will have to carry our own copy of a good hasher implementation. Or perhaps the speedlimit is memory access and hash algorithm speed doesn't matter. I made the hashing optional, with a symbol length threshold. Getting rid of the variable threshold would be good, such that the (few) large symbols in Phobos are hashed too and all will work fine. Perhaps 1k is a good threshold.https://github.com/ldc-developers/ldc/pull/1445: "Hashed symbols look like this: _D3one3two5three3L3433_46a82aac733d8a4b3588d7fa8937aad66Result3fooZ ddemangle gives: one.two.three.L34._46a82aac733d8a4b3588d7fa8937aad6.Result.foo Meaning: this symbol is defined in module one.two.three on line 34. The identifier is foo and is contained in the struct or class Result."This is nice. How difficult would it be to rework it into a PR for dmd? -- Andrei
May 20 2016
On 05/20/2016 09:07 AM, Johan Engelen wrote:On Friday, 20 May 2016 at 12:30:10 UTC, Andrei Alexandrescu wrote:Thanks. I'll get back to you on that.On 05/20/2016 08:21 AM, Johan Engelen wrote:I can work on it, but only if it will not result in a long debate afterwards (!!!).https://github.com/ldc-developers/ldc/pull/1445: "Hashed symbols look like this: _D3one3two5three3L3433_46a82aac733d8a4b3588d7fa8937aad66Result3fooZ ddemangle gives: one.two.three.L34._46a82aac733d8a4b3588d7fa8937aad6.Result.foo Meaning: this symbol is defined in module one.two.three on line 34. The identifier is foo and is contained in the struct or class Result."This is nice. How difficult would it be to rework it into a PR for dmd? -- AndreiOne obstacle is the hasher itself: I am not going to implement one myself! In the LDC PR, I used LLVM's MD5 hasher and Phobos's MD5 hasher. Perhaps it is better to use a faster hasher (I have no expertise on that; Murmur?), so we will have to carry our own copy of a good hasher implementation. Or perhaps the speedlimit is memory access and hash algorithm speed doesn't matter. I made the hashing optional, with a symbol length threshold. Getting rid of the variable threshold would be good, such that the (few) large symbols in Phobos are hashed too and all will work fine. Perhaps 1k is a good threshold.I don't see a need for hashing something. Would a randomly-generated string suffice? Andrei
May 20 2016
On Fri, May 20, 2016 at 09:24:42AM -0400, Andrei Alexandrescu via Digitalmars-d wrote:On 05/20/2016 09:07 AM, Johan Engelen wrote:[...][...] Wouldn't we want the same symbol to be generated if we call a Voldemort-returning function with the same compile-time arguments (but not necessarily runtime arguments) multiple times from different places? This is likely not an issue when compiling all sources at once, but may be a problem with incremental compilation. T -- "Hi." "'Lo."One obstacle is the hasher itself: I am not going to implement one myself! In the LDC PR, I used LLVM's MD5 hasher and Phobos's MD5 hasher. Perhaps it is better to use a faster hasher (I have no expertise on that; Murmur?), so we will have to carry our own copy of a good hasher implementation. Or perhaps the speedlimit is memory access and hash algorithm speed doesn't matter. I made the hashing optional, with a symbol length threshold. Getting rid of the variable threshold would be good, such that the (few) large symbols in Phobos are hashed too and all will work fine. Perhaps 1k is a good threshold.I don't see a need for hashing something. Would a randomly-generated string suffice?
May 20 2016
On Friday, 20 May 2016 at 13:24:42 UTC, Andrei Alexandrescu wrote:I don't see a need for hashing something. Would a randomly-generated string suffice?That would break separate compilation, wouldn't it?
May 20 2016
On Friday, 20 May 2016 at 13:24:42 UTC, Andrei Alexandrescu wrote:I don't see a need for hashing something. Would a randomly-generated string suffice?Naïve question: is a (randomly-)?generated _anything_ required? I've probably missed something, but it seems like line noise just because we feel we must. I guess there's a possibility that there would be multiple matches on the same line with the same object and identifier... Stick the column in there too and call it a day? -Wyatt
May 20 2016
On Friday, 20 May 2016 at 17:11:52 UTC, Wyatt wrote:I've probably missed something, but it seems like line noise just because we feel we must.Line and file is not unique because the same template would generate different functions for different arguments. Template arguments are a part of the identifier... and that can really blow up the size because they might be template arguments which might be template arguments, etc., but each unique argument could be creating an entirely different function.
May 20 2016
On 5/20/2016 6:24 AM, Andrei Alexandrescu wrote:I don't see a need for hashing something. Would a randomly-generated string suffice?Hashing produces reproducible unique values. Random will not be reproducible and may not even be unique.
May 20 2016
On Friday, 20 May 2016 at 19:41:16 UTC, Walter Bright wrote:On 5/20/2016 6:24 AM, Andrei Alexandrescu wrote:The question is: is it actually good for them to be reproducible? The very idea behind voldemort types is that you don't reference them directly in any way, it is just implementation detail. To me it does make sense to apply it to debugging too (debugging of deeply chained template types isn't really very usable anyway).I don't see a need for hashing something. Would a randomly-generated string suffice?Hashing produces reproducible unique values. Random will not be reproducible and may not even be unique.
May 20 2016
On 5/20/2016 1:49 PM, Dicebot wrote:The question is: is it actually good for them to be reproducible?1. yes, because the same type can be generated from multiple compilation units (the linker removes the duplicates - if they're not duplicates, the linker won't remove them). 2. having the compiler produce different .o files on different runs with the same inputs is pretty eyebrow raising, and makes it harder to debug the compiler, and harder to have a test suite for the compiler.
May 20 2016
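[Editorial note: Walter's first point is easy to see concretely. A content hash is a pure function of the mangle, so independently compiled units agree on the short name, while random names never coincide. A hypothetical Python sketch; the name prefix is made up.]

```python
import hashlib
import random

def hashed_name(mangled):
    # deterministic: every compilation unit derives the same short name,
    # so the linker can merge the duplicate instantiations
    return "_D_" + hashlib.md5(mangled.encode()).hexdigest()

def random_name(mangled):
    # non-reproducible: a second run (or another unit) gets a different
    # name, so identical instantiations are no longer deduplicated
    return "_D_" + "%032x" % random.getrandbits(128)

sym = "synthetic-stand-in-for-a-long-voldemort-mangle"
assert hashed_name(sym) == hashed_name(sym)  # same on every run
assert random_name(sym) != random_name(sym)  # (with overwhelming probability)
```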
On Fri, May 20, 2016 at 02:08:02PM -0700, Walter Bright via Digitalmars-d wrote: [...]2. having the compiler produce different .o files on different runs with the same inputs is pretty eyebrow raising, and makes it harder to debug the compiler, and harder to have a test suite for the compiler.Yeah, I think hashing is the way to go. Random strings just raise more issues than they solve. T -- In theory, software is implemented according to the design that has been carefully worked out beforehand. In practice, design documents are written after the fact to describe the sorry mess that has gone on before.
May 20 2016
On Friday, 20 May 2016 at 20:49:20 UTC, Dicebot wrote:On Friday, 20 May 2016 at 19:41:16 UTC, Walter Bright wrote:It would make binary comparison of libraries and executables difficult which troubles me as comparing hashes is a basics of binary distribution security : you can check that a precompiled binary is legit by recompiling it in the same conditions and comparing the two. It would be way harder if random components were added.On 5/20/2016 6:24 AM, Andrei Alexandrescu wrote:The question is: is it actually good for them to be reproducible? The very idea behind voldemort types is that you don't reference them directly in any way, it is just implementation detail. To me it does make sense to apply it to debugging too (debugging of deeply chained template types isn't really very usable anyway).I don't see a need for hashing something. Would a randomly-generated string suffice?Hashing produces reproducible unique values. Random will not be reproducible and may not even be unique.
May 20 2016
On Friday, 20 May 2016 at 21:09:23 UTC, cym13 wrote:It would make binary comparison of libraries and executables difficult which troubles me as comparing hashes is a basics of binary distribution security : you can check that a precompiled binary is legit by recompiling it in the same conditions and comparing the two. It would be way harder if random components were added.Yeah, that is good reason.
May 20 2016
On Friday, 20 May 2016 at 21:09:23 UTC, cym13 wrote:It would make binary comparison of libraries and executables difficult which troubles me as comparing hashes is a basics of binary distribution security : you can check that a precompiled binary is legit by recompiling it in the same conditions and comparing the two. It would be way harder if random components were added.My recollection is that successively compiled binaries are rarely directly comparable, because of timestamps embedded by compilers. So I have to ask: are there standard tools to understand enough of the ELF binary format (or whatever, for the given platform) to compare everything except the timestamps?
May 20 2016
On Saturday, 21 May 2016 at 01:29:18 UTC, Observer wrote:My recollection is that successively compiled binaries are rarely directly comparable, because of timestamps embedded by compilers. So I have to ask: are there standard tools to understand enough of the ELF binary format (or whatever, for the given platform) to compare everything except the timestamps?I just looked at an ELF format document, and didn't see any reference to timestamps. Maybe I'm confusing that with compression formats like gzip produces.
May 20 2016
On Friday, 20 May 2016 at 19:41:16 UTC, Walter Bright wrote:On 5/20/2016 6:24 AM, Andrei Alexandrescu wrote:Just a troll here, but Quasirandom numbers are reproducible and unique. e.g., Sobol in base 2 can be very efficient and fast, for any output length.I don't see a need for hashing something. Would a randomly-generated string suffice?Hashing produces reproducible unique values. Random will not be reproducible and may not even be unique.
May 20 2016
On Saturday, 21 May 2016 at 01:35:05 UTC, Bill Hicks wrote:On Friday, 20 May 2016 at 19:41:16 UTC, Walter Bright wrote:Unless there's a way to seed it consistently so order doesn't matter, it won't work. Although a fast CRC32 on the source could then give you a seed to work with, I'd prefer a hash over PRNG.Hashing produces reproducible unique values. Random will not be reproducible and may not even be unique.Just a troll here, but Quasirandom numbers are reproducible and unique. e.g., Sobol in base 2 can be very efficient and fast, for any output length.
May 20 2016
On Saturday, 21 May 2016 at 02:16:30 UTC, Era Scarecrow wrote:Unless there's a way to seed it consistently so order doesn't matter, it won't work. Although a fast CRC32 on the source could then give you a seed to work with, I'd prefer a hash over PRNG.There is no seed; it's a low-discrepancy sequence, and each element of the sequence can be computed independently.
May 20 2016
On Friday, 20 May 2016 at 12:21:58 UTC, Johan Engelen wrote:On Friday, 20 May 2016 at 10:54:18 UTC, Andrei Alexandrescu wrote:I like your approach. As I said earlier, it would be best if we can prevent the generation of long symbols in the first place, because that would improve the compilation times significantly. Walter's PR slows down the compilation by 25-40% according to my tests. I expect that compilation would be faster if the whole process is skipped altogether.On 5/19/16 6:16 PM, Walter Bright wrote:I agree with Andrei. I solved it this way https://github.com/ldc-developers/ldc/pull/1445: "Hashed symbols look like this: _D3one3two5three3L3433_46a82aac733d8a4b3588d7fa8937aad66Result3fooZ ddemangle gives: one.two.three.L34._46a82aac733d8a4b3588d7fa8937aad6.Result.foo Meaning: this symbol is defined in module one.two.three on line 34. The identifier is foo and is contained in the struct or class Result."On 5/19/2016 6:45 AM, Andrei Alexandrescu wrote:This is a fallacy. I don't think so, at all, when the baseline is an extremely long string.I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.Let's see how far we get with compression first. https://github.com/dlang/dmd/pull/5793 Using 64 character random strings will make symbolic debugging unpleasant.
May 20 2016
On Friday, 20 May 2016 at 12:57:40 UTC, ZombineDev wrote:As I said earlier, it would be best if can prevent the generation of long symbols in the first place, because that would improve the compilation times significantly.From what I've observed, generating the long symbol name itself is fast. If we avoid the deep type hierarchy, then I think indeed you can expect compile time improvement.Walter's PR slows down the compilation with 25-40% according to my tests. I expect that compilation would be faster if the whole process is skipped altogether.MD5 hashing slowed down builds by a few percent for Weka (note: LDC machinecodegen is slower than DMD's, so percentage-wise...), which can then be compensated for using PGO ;-) /+ <-- shameless PGO plug +/
May 20 2016
On Friday, 20 May 2016 at 13:16:32 UTC, Johan Engelen wrote:On Friday, 20 May 2016 at 12:57:40 UTC, ZombineDev wrote:IIUC, your scheme works like this: 1. DMDFE creates a mangled symbol name 2. Create an MD5 hash of the symbol and use the hash instead of the full name. If a minimal change in Georgi's almost trivial program w.r.t. LoC (compared to e.g. Weka's huge codebase) increases compilation time from 1.5sec to 27sec, I can't imagine how much slower it would be for a larger project. We should cure the root cause. Generating uselessly large symbols (even if they are hashed afterwards) is part of that problem, so I think it should never be done once they grow beyond e.g. 500 bytes. The solution that Steven showed is exactly what the compiler should do. Another reason why the compiler should do it is that voldemort types often capture outer context (local variables, alias parameters, delegates, etc.), which makes it very hard for the user to manually extract the voldemort type out of the function.As I said earlier, it would be best if we can prevent the generation of long symbols in the first place, because that would improve the compilation times significantly.From what I've observed, generating the long symbol name itself is fast. If we avoid the deep type hierarchy, then I think indeed you can expect compile time improvement.Walter's PR slows down the compilation by 25-40% according to my tests. I expect that compilation would be faster if the whole process is skipped altogether.MD5 hashing slowed down builds by a few percent for Weka (note: LDC machinecodegen is slower than DMD's, so percentage-wise...), which can then be compensated for using PGO ;-) /+ <-- shameless PGO plug +/
May 20 2016
On Friday, 20 May 2016 at 16:21:55 UTC, ZombineDev wrote:On Friday, 20 May 2016 at 13:16:32 UTC, Johan Engelen wrote:I see two separate issues that I think should be handled independently: 1) Exponential growth of symbol name with voldemort types. I like Steven's solution where the compiler lowers the struct outside of the method. 2) Long symbol names in general which could arise even without voldemort types involved especially with chaining multiple algorithms. I like Johan Engelen solution in LDC for symbols longer than a threshold. For symbols shorter than the threshold I think Walter's compression algorithm could be used to gain 40-60% reduction and still retain the full type information.On Friday, 20 May 2016 at 12:57:40 UTC, ZombineDev wrote:IIUC, your scheme works like this: 1. DMDFE creates a mangled symbol name 2. Create a MD-5 hash of thr symbol use the hash instead of the full name. If minimal change in Georgi's almost trivial program w.r.t LoC (compared to e.g. Weka's huge codebase) increases compilation time from 1.5sec to 27sec, I can't imagine how slower it would take for a larger project. We should cure root cause. Genetating uselessly large symbols (even if hashed after that) is part of that problem, so I think it should never done if they start growing than e.g. 500 bytes. The solution that Steven showed is exactly what the compiler should do. Another reason why the compiler should do it is because often voldemort types capture outer context (local variables, alias paramters, delegates, etc.), which makes it very hard for the user to manually extract the voldemort type out of the function.As I said earlier, it would be best if can prevent the generation of long symbols in the first place, because that would improve the compilation times significantly.From what I've observed, generating the long symbol name itself is fast. 
If we avoid the deep type hierarchy, then I think indeed you can expect compile time improvement.Walter's PR slows down the compilation with 25-40% according to my tests. I expect that compilation would be faster if the whole process is skipped altogether.MD5 hashing slowed down builds by a few percent for Weka (note: LDC machinecodegen is slower than DMD's, so percentage-wise...), which can then be compensated for using PGO ;-) /+ <-- shameless PGO plug +/
May 20 2016
On 5/20/16 2:34 PM, Georgi D wrote:1) Exponential growth of symbol name with voldemort types. I like Steven's solution where the compiler lowers the struct outside of the method.Was talking to Walter on the phone and he just had one of those great ideas: encode in the function mangle that it returns "auto". I thought that was fantastic. Would that work? -- Andrei
May 20 2016
On Friday, 20 May 2016 at 19:37:23 UTC, Andrei Alexandrescu wrote:On 5/20/16 2:34 PM, Georgi D wrote:This is a very interesting idea. I see one problem though: The real issue is not just the return type but also the types of the input parameters to a function. So even if the return type of a method is encoded as "auto" the moment the result of that method is passed as parameter to another template method the long typename will resurface. I do not think the type of the input parameters could be encoded as "auto" since the different instances of a template method will clash in names. In essence the problem that should be solved is the long names of types.1) Exponential growth of symbol name with voldemort types. I like Steven's solution where the compiler lowers the struct outside of the method.Was talking to Walter on the phone and he just had one of those great ideas: encode in the function mangle that it returns "auto". I thought that was fantastic. Would that work? -- Andrei
May 20 2016
On Friday, 20 May 2016 at 22:01:21 UTC, Georgi D wrote:This is a very interesting idea. I see one problem though: The real issue is not just the return type but also the types of the input parameters to a function. So even if the return type of a method is encoded as "auto" the moment the result of that method is passed as parameter to another template method the long typename will resurface. I do not think the type of the input parameters could be encoded as "auto" since the different instances of a template method will clash in names. In essence the problem that should be solved is the long names of types.I keep having it go through my head: if we had to hash the result, I'd prefer to hash the source (minus comments & whitespace, and after replacements have been done) to come up with a reproducible value. This of course is if long names or compression won't do the job.
May 20 2016
On 05/20/2016 06:01 PM, Georgi D wrote:In essence the problem that should be solved is the long names of types.Voldemort returns are by far the worst. Compression and hashing should handle the rest. -- Andrei
May 20 2016
On Friday, 20 May 2016 at 19:37:23 UTC, Andrei Alexandrescu wrote:Was talking to Walter on the phone and he just had one of those great ideas: encode in the function mangle that it returns "auto". I thought that was fantastic. Would that work? -- AndreiEquivalent to not mangling return type at all. But this leaves you with 2^^n growth, still exponential. Also is there any evidence that compression is faster than hashing?
May 21 2016
On Saturday, 21 May 2016 at 17:34:19 UTC, Kagamin wrote:On Friday, 20 May 2016 at 19:37:23 UTC, Andrei Alexandrescu wrote:Would it be practical to use type-erasure in the middle of a call chain? It would cut 2^^n right at the knees at the possibly negligible runtime cost. It would impose a burden on the client (as I don't think a compiler should do it on its own - or even if it could), but if it's really Pareto distributed then it shouldn't be that difficult to find the hot spots.Was talking to Walter on the phone and he just had one of those great ideas: encode in the function mangle that it returns "auto". I thought that was fantastic. Would that work? -- AndreiEquivalent to not mangling return type at all. But this leaves you with 2^^n growth, still exponential. Also is there any evidence that compression is faster than hashing?
May 21 2016
On 05/21/2016 01:34 PM, Kagamin wrote:But this leaves you with 2^^n growth, still exponentialHe said that that won't happen any longer, the growth was because of the return type. Is that correct? -- Andrei
May 21 2016
On Saturday, 21 May 2016 at 18:18:21 UTC, Andrei Alexandrescu wrote:He said that that won't happen any longer, the growth was because of the return type. Is that correct?https://issues.dlang.org/show_bug.cgi?id=15831#c4 Looks like growth is due to the fact that the voldemort type is in the scope of a function and function is fun!(T).fun(T) - hence each level multiplies by 2. And return type is not part of the mangled name already - that would cause circular dependency, you would need the voldemort type mangling to generate it.
May 21 2016
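[Editorial note: the doubling Kagamin describes is easy to reproduce with a toy mangler. Hypothetical Python model; the name pieces mimic the bug report but are not real DMD output.]

```python
def mangle_vold(level):
    """Toy model of a voldemort type's qualified name: the struct lives
    inside fun!(T).fun(T), so T (the previous level's result type)
    appears twice -- once as template argument, once as parameter."""
    if level == 0:
        return "i"  # int, the initial argument type
    inner = mangle_vold(level - 1)
    return "__T3fooT{0}Z3fooF{0}Z4Vold".format(inner)

lengths = [len(mangle_vold(n)) for n in range(1, 6)]
# each level more than doubles the previous length: 2^n growth
assert all(b > 2 * a for a, b in zip(lengths, lengths[1:]))
```

`lengths` comes out as [22, 64, 148, 316, 652]: embedding the previous result type twice per level is exactly what makes the growth exponential.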
On 05/21/2016 03:13 PM, Kagamin wrote:On Saturday, 21 May 2016 at 18:18:21 UTC, Andrei Alexandrescu wrote:Just to make sure I understand: do you agree or disagree that there's no more exponential growth if we encode "auto return" in the mangling? Thx! -- AndreiHe said that that won't happen any longer, the growth was because of the return type. Is that correct?https://issues.dlang.org/show_bug.cgi?id=15831#c4 Looks like growth is due to the fact that the voldemort type is in the scope of a function and function is fun!(T).fun(T) - hence each level multiplies by 2. And return type is not part of the mangled name already - that would cause circular dependency, you would need the voldemort type mangling to generate it.
May 22 2016
On 5/22/16 5:42 PM, Andrei Alexandrescu wrote:On 05/21/2016 03:13 PM, Kagamin wrote:Not sure if someone has definitively answered before, but no, this does not. I think this would shrink the growth from 3^n to 2^n. The exponential growth happens because of the repeating of the template types. If you look at the example in the bug report, there is no return types anywhere. They exist in the mangled names, but are not relevant to the FQN or symbol resolution. The return type only plays a factor for preventing linking against other files that expect a certain return type. In fact, this may be a reason to NOT shortcut the return mangle (if you had compiled against foo one day, and foo's internal voldemort type changes the next, it would still link, even though the type may have changed). Indeed, D does not overload based on return type ever. -SteveOn Saturday, 21 May 2016 at 18:18:21 UTC, Andrei Alexandrescu wrote:Just to make sure I understand: do you agree or disagree that there's no more exponential growth if we encode "auto return" in the mangling? Thx! -- AndreiHe said that that won't happen any longer, the growth was because of the return type. Is that correct?https://issues.dlang.org/show_bug.cgi?id=15831#c4 Looks like growth is due to the fact that the voldemort type is in the scope of a function and function is fun!(T).fun(T) - hence each level multiplies by 2. And return type is not part of the mangled name already - that would cause circular dependency, you would need the voldemort type mangling to generate it.
May 23 2016
On 5/23/16 3:03 PM, Steven Schveighoffer wrote:In fact, this may be a reason to NOT shortcut the return mangle (if you had compiled against foo one day, and foo's internal voldemort type changes the next, it would still link, even though the type may have changed).Actually, this is not true. It will only fail to link if the name of the internal type changes, or you return some other type :( -Steve
May 23 2016
On 5/23/16 3:03 PM, Steven Schveighoffer wrote:On 5/22/16 5:42 PM, Andrei Alexandrescu wrote:To clarify, I think this is still a good idea, but only if the type is defined internally (no reason to repeat the entire function signature of the function you are defining). But if you specify auto return and return int, it should still be mangled as returning int. -SteveOn 05/21/2016 03:13 PM, Kagamin wrote:Not sure if someone has definitively answered before, but no, this does not. I think this would shrink the growth from 3^n to 2^n.On Saturday, 21 May 2016 at 18:18:21 UTC, Andrei Alexandrescu wrote:Just to make sure I understand: do you agree or disagree that there's no more exponential growth if we encode "auto return" in the mangling? Thx! -- AndreiHe said that that won't happen any longer, the growth was because of the return type. Is that correct?https://issues.dlang.org/show_bug.cgi?id=15831#c4 Looks like growth is due to the fact that the voldemort type is in the scope of a function and function is fun!(T).fun(T) - hence each level multiplies by 2. And return type is not part of the mangled name already - that would cause circular dependency, you would need the voldemort type mangling to generate it.
May 23 2016
On 5/23/2016 12:03 PM, Steven Schveighoffer wrote:Indeed, D does not overload based on return type ever.It does factor into type matching when a function pointer is used, for example.
May 23 2016
On 5/23/16 4:25 PM, Walter Bright wrote:On 5/23/2016 12:03 PM, Steven Schveighoffer wrote:Yes, that may be the only time return type is factored in, but practically speaking, a function overload set that overloads solely on return type is not directly callable (you will get ambiguity error), so there is little point to create such a situation. -SteveIndeed, D does not overload based on return type ever.It does factor into type matching when a function pointer is used, for example.
May 23 2016
On Saturday, 21 May 2016 at 17:34:19 UTC, Kagamin wrote:Equivalent to not mangling return type at all. But this leaves you with 2^^n growth, still exponential. Also is there any evidence that compression is faster than hashing?Depends on implementation and algorithm. However, even the weakest compression settings can yield huge initial compression benefits; in normal text a reduction of 2:1 is expected. The benefit is that compressed output still contains all its information, while hashing reduces the size to a fixed length but throws away all that information in exchange for a unique identity. LZO is a compressor/decompressor that prioritizes speed and can be used in real-time applications, adding compression to any application for very little cost. I tinkered with some of its options a while back and it did okay, not as good as zlib but that's to be expected. Although using zlib on its fastest setting will likely yield good results too. http://www.oberhumer.com/opensource/lzo/
May 21 2016
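[Editorial note: the claim about zlib's fastest setting is easy to check on a synthetic, highly repetitive mangle. Python sketch; the symbol below is made up.]

```python
import zlib

# synthetic voldemort-style mangle: long and highly repetitive,
# like real chained-template symbols
symbol = ("_D3std5range__T5chainTS3std9algorithm"
          "__T9MapResultTS3mod3fooFiZ4VoldZ" * 50).encode()

fast = zlib.compress(symbol, 1)          # fastest compression level
assert len(fast) * 2 < len(symbol)       # comfortably beats 2:1 here
assert zlib.decompress(fast) == symbol   # lossless: full info retained
```

On real mangles the ratio is lower than on this artificial repeat, but the lossless property, which is the distinction Kagamin's respondent draws against hashing, holds regardless.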
On 5/21/2016 1:09 PM, Era Scarecrow wrote:Depends on implementation and algorithm. However even the weakest compression settings can yield huge initial compression benefits. In normal text a reduction of 2:1 is expected, and will . The benefit being compression still contains all it's information, and hashing will reduce the size to a fixed length but throw away all that information for unique identity. LZO is a compressor/decompressor that prioritizes speed and can be used in real-time applications for very little cost to add compression to any application. I tinkered with some of it's options a while back and it did okay, not as good as zlib but that's to be expected. Although using zlib on it's fastest setting will likely yield good results too. http://www.oberhumer.com/opensource/lzo/We already have a compressor in the compiler source for compressing names: https://github.com/dlang/dmd/blob/master/src/backend/compress.c A faster one would certainly be nice. Anyone game?
May 21 2016
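[Editorial note: for anyone curious what such a name compressor does, here is a simplified greedy back-reference scheme in Python. This is a model of the general idea only, not the algorithm in dmd's compress.c.]

```python
def backref_compress(s, min_match=4):
    """Greedy LZ77-style compression: replace a repeat of earlier text
    with an (offset, length) token. Simplified illustration only."""
    out, i = [], 0
    while i < len(s):
        best_len, best_off = 0, 0
        # find the longest earlier match for the text starting at i
        for j in range(max(0, i - 255), i):
            k = 0
            while i + k < len(s) and s[j + k] == s[i + k] and k < 255:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_match:
            out.append((best_off, best_len))
            i += best_len
        else:
            out.append(s[i])  # literal character
            i += 1
    return out

def backref_decompress(tokens):
    s = ""
    for t in tokens:
        if isinstance(t, tuple):
            off, n = t
            for _ in range(n):       # char-by-char copy handles
                s += s[-off]         # overlapping matches correctly
        else:
            s += t
    return s
```

Mangled names repeat whole identifiers many times over, which is exactly what back-references exploit, so even a scheme this naive shortens them substantially.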
On 5/21/2016 1:49 PM, Walter Bright wrote:We already have a compressor in the compiler source for compressing names: https://github.com/dlang/dmd/blob/master/src/backend/compress.c A faster one would certainly be nice. Anyone game?Note how well it does: https://issues.dlang.org/show_bug.cgi?id=15831#c5
May 21 2016
On Saturday, 21 May 2016 at 20:50:56 UTC, Walter Bright wrote:On 5/21/2016 1:49 PM, Walter Bright wrote:The underlying symbols are still growing exponentially. Nobody has the RAM for that, regardless of what size the resultant binary is. Compressing the symbol names is a bandage. The compiler needs a new kidney.We already have a compressor in the compiler source for compressing names: https://github.com/dlang/dmd/blob/master/src/backend/compress.c A faster one would certainly be nice. Anyone game?Note how well it does: https://issues.dlang.org/show_bug.cgi?id=15831#c5
May 21 2016
On 05/21/2016 06:27 PM, Anon wrote:On Saturday, 21 May 2016 at 20:50:56 UTC, Walter Bright wrote:My understanding is that the encoding "auto" return in the mangling makes any exponential growth disappear. Is that the case? -- AndreiOn 5/21/2016 1:49 PM, Walter Bright wrote:The underlying symbols are still growing exponentially. Nobody has the RAM for that, regardless of what size the resultant binary is. Compressing the symbol names is a bandage. The compiler needs a new kidney.We already have a compressor in the compiler source for compressing names: https://github.com/dlang/dmd/blob/master/src/backend/compress.c A faster one would certainly be nice. Anyone game?Note how well it does: https://issues.dlang.org/show_bug.cgi?id=15831#c5
May 22 2016
On Sunday, 22 May 2016 at 21:40:07 UTC, Andrei Alexandrescu wrote:On 05/21/2016 06:27 PM, Anon wrote:No:

auto foo(T)(T x)
{
    struct Vold { T payload; }
    return Vold(x);
}

foo(5)
_D3mod10__T3fooTiZ3fooFNaNbNiNfiZS3mod10__T3fooTiZ3fooFiZ4Vold

foo(5).foo
_D3mod38__T3fooTS3mod10__T3fooTiZ3fooFiZ4VoldZ3fooFNaNbNiNfS3mod10__T3fooTiZ3fooFiZ4VoldZS3mod38__T3fooTS3mod10__T3fooTiZ3fooFiZ4VoldZ3fooFS3mod10__T3fooTiZ3fooFiZ4VoldZ4Vold

foo(5).foo.foo
_D3mod94__T3fooTS3mod38__T3fooTS3mod10__T3fooTiZ3fooFiZ4VoldZ3fooFS3mod10__T3fooTiZ3fooFiZ4VoldZ4VoldZ3fooFNaNbNiNfS3mod38__T3fooTS3mod10__T3fooTiZ3fooFiZ4VoldZ3fooFS3mod10__T3fooTiZ3fooFiZ4VoldZ4VoldZS3mod94__T3fooTS3mod38__T3fooTS3mod10__T3fooTiZ3fooFiZ4VoldZ3fooFS3mod10__T3fooTiZ3fooFiZ4VoldZ4VoldZ3fooFS3mod38__T3fooTS3mod10__T3fooTiZ3fooFiZ4VoldZ3fooFS3mod10__T3fooTiZ3fooFiZ4VoldZ4VoldZ4Vold

Lengths: 62 | 174 | 398

Just dropping the return types to a single character ($) shrinks the names, but it doesn't solve the core of the problem. Still exponential:

foo(5)
_D3mod1010__T3fooTiZ3fooFNaNbNiNfiZ($)

foo(5).foo
_D3mod38__T3fooT(S3mod10__T3fooTiZ3fooFiZ4Vold)Z3fooFNaNbNiNf(S3mod10__T3fooTiZ3fooFiZ4Vold)Z{$}

foo(5).foo.foo
_D3mod94__T3fooT{S3mod38__T3fooT(S3mod10__T3fooTiZ3fooFiZ4Vold)Z3fooF(S3mod10__T3fooTiZ3fooFiZ4Vold)Z4Vold}Z3fooFNaNbNiNf{S3mod38__T3fooT(S3mod10__T3fooTiZ3fooFiZ4Vold)Z3fooF(S3mod10__T3fooTiZ3fooFiZ4Vold)Z4Vold}Z$

Lengths: 36 | 90 | 202

Note: the part inside () is the return type of the first. The part in {} is the return type of the second. I left those in for illustrative purposes.On Saturday, 21 May 2016 at 20:50:56 UTC, Walter Bright wrote:My understanding is that the encoding "auto" return in the mangling makes any exponential growth disappear. Is that the case? -- AndreiOn 5/21/2016 1:49 PM, Walter Bright wrote:The underlying symbols are still growing exponentially. Nobody has the RAM for that, regardless of what size the resultant binary is. Compressing the symbol names is a bandage. The compiler needs a new kidney.We already have a compressor in the compiler source for compressing names: https://github.com/dlang/dmd/blob/master/src/backend/compress.c A faster one would certainly be nice. Anyone game?Note how well it does: https://issues.dlang.org/show_bug.cgi?id=15831#c5
May 22 2016
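Anon's lengths (62 | 174 | 398 with return types, 36 | 90 | 202 without) fit a simple linear recurrence: each wrapping level embeds the previous symbol a fixed number of times plus constant overhead, and any embed count above one is exponential. A toy model (Python for illustration; the constants are invented, not DMD's actual mangling arithmetic):

```python
def symbol_len(levels, copies, base=36, overhead=26):
    """Toy model of mangled-name growth: each wrapping level embeds
    the previous symbol `copies` times plus a fixed overhead."""
    length = base
    for _ in range(levels):
        length = copies * length + overhead
    return length

# Voldemort return type: the previous symbol shows up in the template
# argument, the parameter list and the return type (three copies).
voldemort = [symbol_len(n, copies=3) for n in range(5)]

# Module-level struct: the previous symbol shows up once (linear growth).
flat = [symbol_len(n, copies=1) for n in range(5)]

print(voldemort)
print(flat)
```

With copies=3 the lengths roughly triple per level, matching the jumps quoted above; with copies=1 they grow by a constant per level, which is why moving the struct out of the function tames the binary size later in the thread.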
On 5/20/2016 5:57 AM, ZombineDev wrote:Walter's PR slows down the compilation by 25-40% according to my tests. I expect that compilation would be faster if the whole process is skipped altogether.The compressor only kicks in if the string is longer than 64 bytes. I set it pretty low in order to have it kick in often, and hopefully flush out any bugs in it. For a production version, the minimum size should be set substantially larger, based on testing to provide a reasonable balance. That should speed it up a lot. Also, the algorithm is a bit naively implemented; I bet it could be sped up substantially. Hashing isn't algorithmically cheap, either.
May 20 2016
On 05/20/2016 03:45 PM, Walter Bright wrote:On 5/20/2016 5:57 AM, ZombineDev wrote:From the measurements shown the work seems Pareto distributed - the major time spender is the few long symbols, not the many short symbols. There are a few long symbols that dominate everything else by an order of magnitude. The one percenters! :o) -- AndreiWalter's PR slows down the compilation with 25-40% according to my tests. I expect that compilation would be faster if the whole process is skipped altogether.The compressor only kicks in if the string is longer than 64 bytes. I set it pretty low in order to have it kick in often, and hopefully flush out any bugs in it. For a production version, the minimum size should be set substantially larger, based on testing to provide a reasonable balance. That should speed it up a lot. Also, the algorithm is a bit naively implemented, I bet it could be speeded up substantially. Hashing isn't algorithmically cheap, either.
May 20 2016
On Friday, 20 May 2016 at 19:45:36 UTC, Walter Bright wrote:Hashing isn't algorithmically cheap, either.I also don't think compression should be a performance issue. I heard that some compression algorithms are as fast as the data comes in, so fast enough for our purpose. The reason I chose hashing is simplicity and a guaranteed small symbol size. I wasn't sure whether a 5MB symbol would compress to a reasonable size. I wanted symbols to be readable; so less than, say, 80 chars. For people interested in finding out whether mangling changes will improve compile speed performance: forget about linking and mangling, and just assign symbol names like "a", "b", "c", etc., and see what happens.
May 20 2016
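The "guaranteed small symbol" rationale above is easy to check with a synthetic symbol (Python for illustration; the fragment and the roughly 5.8 MB size are made up): hashing always lands under the 80-char readability budget, while compression alone, however well it does on repetitive input, does not.

```python
import hashlib
import zlib

# Synthetic multi-megabyte symbol built from a repeated made-up fragment.
huge = "S3mod10__T3fooTiZ3fooFiZ4Vold" * 200_000   # ~5.8 MB

packed = zlib.compress(huge.encode(), 1)            # fastest setting
digest = hashlib.sha1(huge.encode()).hexdigest()

print(len(huge), len(packed), len(digest))
```

The compressed form is a tiny fraction of the input but still far longer than 80 characters; the SHA-1 digest is a fixed 40 hex characters no matter how large the symbol gets. Real symbols repeat with variation rather than exactly, so actual ratios would be less extreme.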
On Thursday, 19 May 2016 at 13:45:18 UTC, Andrei Alexandrescu wrote:On 05/19/2016 08:38 AM, Steven Schveighoffer wrote:I have an idea for reproducible, less-than-exponential-time mangling, although I don't know how actionable it is. Leave mangling as is, but pretend that you are mangling something different: for example, when the input is foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int)), emit a short symbol for each sub-expression that was **already in the name** instead of spelling it out again. Now, this is a reduction of the exponential computation time only if you can detect repetitions before you go inside them, e.g. recognize that bar!(baz!(int)) repeats without looking inside it and seeing baz!(int). Do you think it could be done? You would also need a deterministic way to assign those symbols. Compression can also be done at the end if necessary to reduce the polynomial growth.Yep. chain uses voldemort type, map does not.We definitely need to fix Voldemort types. Walter and I bounced a few ideas during DConf. 1. Do the cleartext/compress/hash troika everywhere (currently we do it on Windows). That should help in several places. 2. For Voldemort types, simply generate a GUID-style random string of e.g. 64 bytes. Currently the name of the type is composed out of its full scope, with some exponentially-increasing repetition of substrings. That would compress well, but it's just a lot of work to produce and then compress the name. A random string is cheap to produce and adequate for Voldemort types (you don't care for their name anyway... Voldemort... get it?). I very much advocate slapping a 64-long random string for all Voldemort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot. Andrei
May 20 2016
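The scheme sketched above resembles the substitution mechanism in the Itanium C++ ABI name mangling: remember every component already emitted and replace a repeat with a short back-reference. A toy sketch of that idea (Python for illustration; the node encoding and the S<n>_ reference syntax are invented here, not D's actual mangling):

```python
def mangle(node, subs=None):
    """Toy substitution mangler. A node is (name, children); a subtree
    seen before is emitted as a short back-reference S<n>_ instead of
    being spelled out again."""
    if subs is None:
        subs = {}
    if node in subs:
        return "S%d_" % subs[node]
    name, children = node
    body = "%d%s" % (len(name), name)
    if children:
        body += "I" + "".join(mangle(c, subs) for c in children) + "E"
    subs[node] = len(subs)
    return body

# foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int))
i   = ("int", ())
baz = ("baz", (i,))
bar = ("bar", (baz,))
boo = ("boo", (bar,))
foo = ("foo", (boo, bar, baz))

print(mangle(foo))
```

Each repeated subtree, e.g. bar!(baz!(int)), is spelled out once and afterwards costs only a constant-size reference, which is what keeps both the output and the work from exploding.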
On Saturday, 21 May 2016 at 06:18:05 UTC, poliklosio wrote:I have an idea for reproducible, less-than-exponential time mangling, although I don't know how actionable it is. Leave mangling as is, but pretend that you are mangling something different, for example when the input is foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int)) pretend that you are mangling something that was **already in the name**:See http://forum.dlang.org/post/szodxrizfmufqdkpdryc forum.dlang.org
May 21 2016
On Saturday, 21 May 2016 at 08:57:57 UTC, Johan Engelen wrote:On Saturday, 21 May 2016 at 06:18:05 UTC, poliklosio wrote:So if I understand correctly, you tried implementing something like this, but it didn't help much even with size of mangled names. Are you sure you were testing on the pathological case (exponential stuff), rather than a bad one? Assuming your experiment is correct, something interesting is happening, and I will be observing you guys finding out what it is. :)I have an Idea of reproducible, less-than-exponential time mangling, although I don't know how actionable. Leave mangling as is, but pretend that you are mangling something different, for example when the input is foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int)) pretend that you are mangling was **already in the name**, particularly:See http://forum.dlang.org/post/szodxrizfmufqdkpdryc forum.dlang.org
May 21 2016
On 5/20/2016 11:18 PM, poliklosio wrote:I have an idea for reproducible, less-than-exponential time mangling, although I don't know how actionable it is. Leave mangling as is, but pretend that you are mangling something different, for example when the input is foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int)) pretend that you are mangling something that was **already in the name**:This is what LZW compression does, except that LZW does it in general rather than just for identifiers.
May 21 2016
On Saturday, 21 May 2016 at 18:25:46 UTC, Walter Bright wrote:On 5/20/2016 11:18 PM, poliklosio wrote:I have an idea for reproducible, less-than-exponential time mangling, although I don't know how actionable it is. Leave mangling as is, but pretend that you are mangling something different, for example when the input is foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int)) pretend that you are mangling something that was **already in the name**:This is what LZW compression does, except that LZW does it in general rather than just for identifiers.That's true, but a general compression algorithm requires a stream of chars as input, so to compress something of exponential size you still need exponential runtime just to iterate (while compressing) over the exponential input. The idea was to avoid such exponential iteration in the first place by doing some sort of caching. My thinking is that if you only reduce size but keep exponential runtime, you are going to have to revisit the issue in a few months anyway when people start using the things you enabled more heavily. Anyway, I wish you good luck with all this!
May 21 2016
On Saturday, 21 May 2016 at 18:25:46 UTC, Walter Bright wrote:On 5/20/2016 11:18 PM, poliklosio wrote:It's more like LZ77 than LZW.something that was **already in the name**:This is what LZW compression does, except that LZW does it in general rather than just for identifiers.
May 21 2016
On Thursday, 19 May 2016 at 12:38:09 UTC, Steven Schveighoffer wrote:On 5/17/16 6:04 PM, ZombineDev wrote:Yes, making a local copy of chain and moving the structure outside of the method solved the problem and reduced the code size. The stripped size even shrank compared to the version that did not experience the huge increase. Stripped: 7.4MB -> 5.5MB Should we actually change the implementation in Phobos to be like this? In the company I work in we are very sensitive to code size, so anything that can be done about this in Phobos will help with adoption. I am still curious though why the code size increased so dramatically with such a small change. Both versions return a result from chain, and actually the one that does not experience the increase is more complex (it includes the one that does). In my actual code I have multiple calls to chain on top of the result of this and the code size does not increase dramatically. I am also concerned by the fact that LDC on Mac just could not compile the source which produced the big binary (entering an infinite loop). ThanksOn Tuesday, 17 May 2016 at 21:58:06 UTC, Andrei Alexandrescu wrote:On 05/17/2016 05:44 PM, Georgi D wrote:I'm guessing that is highly related to this one: https://issues.dlang.org/show_bug.cgi?id=15831Hi, While working on a D project which heavily uses the lazy algorithms for ranges I noticed a sudden huge increase in the compilation time and the produced binary size.[snip] Thanks. That's definitely deserving of a bug report. -- AndreiYep. chain uses voldemort type, map does not. Georgi: try copying chain implementation into your local file, but instead of having a static internal struct, move it to a module-level struct (you probably have to repeat the template parameters, but not the constraints). See if it produces a similar slimming effect. -Steve
May 19 2016
On 5/19/16 11:56 AM, Georgi D wrote:On Thursday, 19 May 2016 at 12:38:09 UTC, Steven Schveighoffer wrote:Thanks, that confirms the bug reports are for the same issue.On 5/17/16 6:04 PM, ZombineDev wrote:Yes, Making a local copy of chain and moving the structure outside of the method solved the problem and reduced the code size.I'm guessing that is highly related to this one: https://issues.dlang.org/show_bug.cgi?id=15831Yep. chain uses voldemort type, map does not. Georgi: try copying chain implementation into your local file, but instead of having a static internal struct, move it to a module-level struct (you probably have to repeat the template parameters, but not the constraints). See if it produces a similar slimming effect.The stripped size even reduced from the version that did not experience the huge increase. Stripped: 7.4Mb -> 5.5MB Should we actually change the implementation in phobos to be like this?I'd much prefer we fix the compiler to handle voldemort types in a more sane way rather than try and fix the code to deal with compiler deficiencies. However, if this isn't something that's easily done (or done within a decent time frame), we may look at that solution.In the company I work in we are very sensitive to the code size so anything that can be made for this in phobos will help with adoption. I am still curious though why the code size increased so dramatically with such a small change. Both versions return a result from chain and actually the one that does not experience the increase is more complex (includes the one that does).Each call to a voldemort wrapping function (one that wraps one type into another "voldemort" type, or internal struct) increases the symbol size at a rate of 3^n. Non-voldemort types do not suffer from this, because the symbol size increases at just n.In my actual code I have multiple calls to chain on top of the result of this and the code size does not increase dramatically. 
I am also concerned by the fact that ldc on mac just could not compile the source which produced the big binary (entering an infinite loop).I would expect the possibility that the issue is with the linker. In my toy experiments, the compiler works just fine and produces an object file. It's the linker having issues with parsing those humungous symbols. -Steve
May 19 2016
On 05/19/2016 11:56 AM, Georgi D wrote:Making a local copy of chain and moving the structure outside of the method solved the problem and reduced the code size. The stripped size even reduced from the version that did not experience the huge increase. Stripped: 7.4Mb -> 5.5MBThanks very much for helping with these measurements! -- Andrei
May 19 2016
On Thursday, 19 May 2016 at 16:41:59 UTC, Andrei Alexandrescu wrote:On 05/19/2016 11:56 AM, Georgi D wrote:Just some more numbers on the version with the local chain, which is 5.5MB: 4170 5194012. Which means that of the 5.5MB total binary size, 4.95MB are strings. Inspecting the content of the strings, over 98% are symbol names. Is there a way to not have all the symbol names as strings even after the binary was stripped? I think that some compression for the symbol names is a good idea. I agree though that O(n) growth or better for the voldemort types is also needed.Making a local copy of chain and moving the structure outside of the method solved the problem and reduced the code size. The stripped size even reduced from the version that did not experience the huge increase. Stripped: 7.4Mb -> 5.5MBThanks very much for helping with these measurements! -- Andrei
May 20 2016
On Tuesday, 17 May 2016 at 21:58:06 UTC, Andrei Alexandrescu wrote:On 05/17/2016 05:44 PM, Georgi D wrote:I have filed a bug: https://issues.dlang.org/show_bug.cgi?id=16039Hi, While working on a D project which heavily uses the lazy algorithms for ranges I noticed a sudden huge increase in the compilation time and the produced binary size.[snip] Thanks. That's definitely deserving of a bug report. -- Andrei
May 17 2016