digitalmars.D - Folding similar templates into one
- Meta (67/67) Jun 26 2013 I apologize for the title, as I couldn't come up with something
- Jonathan M Davis (27/32) Jun 26 2013 That's _very_ dependent on what the function does. A prime counter-examp...
- Meta (26/66) Jun 27 2013 This may be true. Any kind of optimization would probably have to
- Marco Leise (22/25) Jun 28 2013 I've thought of that as well, but you have to merge templates
- Meta (11/34) Jun 28 2013 That's good to know. I wasn't sure if there were some cases when
- monarch_dodra (47/58) Jun 28 2013 What is a "string lambda" ? Do you mean "a string alias latter
- Meta (8/33) Jun 28 2013 The former. I didn't think it was *that* obscure of a term
- Daniel Murphy (6/9) Jun 28 2013 The obvious place to do this is at link time - then you get to fold all
- monarch_dodra (11/17) Jun 28 2013 Just because the arguments are known 99% of the time at compile
I apologize for the title, as I couldn't come up with something succinct and descriptive enough while still being reasonably short. We all know how powerful templates are in C++/D, but they have that damning cost of needing to generate new code for each new template argument. For example: string format(string fmtStr)(T data) { ... } As fmtStr is a compile-time template argument, the compiler will have to generate N nearly-identical format functions for N strings that format is instantiated with. This "template bloat" increases exponentially as more template arguments are added. Take std.exception.assertNotThrown: void assertNotThrown(T : Throwable = Exception, E) (lazy E expression, string msg = null, //Could be compile-time, but aren't, //because of template bloat worries string file = __FILE__, size_t line = __LINE__) { try expression(); catch(T t) { immutable message = msg.empty ? t.msg : msg; immutable tail = message.empty ? "." : ": " ~ message; throw new AssertError(format("assertNotThrown failed: %s was thrown%s", T.stringof, tail), file, line, t); } } The arguments file and line are known at compile time, but should not be made template arguments because there is a potential to create N*M copies of assertNotThrown, given N values of __FILE__ and M values of __LINE__. Adding large numbers of template arguments spuriously can cause an explosion in binary size, especially when templates are very heavily used, such as in most D code. Furthermore, programmers worried about such bloat will undoubtedly take the safe route, and try to move as many arguments as possible from the compile-time parameter list to run-time. This means that generated code will not necessarily be efficient as it should be. I'd like to ask, however, why should this be? The arguments line and file will only ever be strings and ints. They don't vary in type, only value, and changing those values will not affect how the function works. You could easily substitute out "exception.d" with "stdio.d", and the functionality will be unchanged. This remains true for values of the same type. It doesn't matter whether you pass 1000, 10_000, or 100_000 as a template argument. The way the function operates is not observably different (note that for this to be true, we have to assume that static if does not exist). This is in contrast to passing a completely different type for T (say, RangeError), which may change how the function works. I'd like to know, are there any template "optimizations" that DMD currently implements which mitigates such a situation such as with assertNotThrown, where a compile-time argument may vary only in value, and not type? Judging from the fact that this function was changed so that that __FILE__ and __LINE__ are taking as run-time parameters, I guess the answer is no. Is there anything we can do about this?
Jun 26 2013
On Thursday, June 27, 2013 04:03:34 Meta wrote:I'd like to ask, however, why should this be? The arguments line and file will only ever be strings and ints. They don't vary in type, only value, and changing those values will not affect how the function works. You could easily substitute out "exception.d" with "stdio.d", and the functionality will be unchanged.That's _very_ dependent on what the function does. A prime counter-example would be overloaded operators like opBinary which uses their string template arguments in mixins within the function, thereby generating completely different functions. And I believe that that is by far the more common use of strings as template parameters. Also, when using strings and integral values such as __FILE__ and __LINE__, the functions _do_ have different code. auto foo(string file = __FILE__, size_t line = __LINE__)(Bar b) { auto f = file; auto l = line; .... } ends up generating different code depending on the values of file or line. The code is _not_ identical. The only way for the code not to vary is for those arguments to be runtime arguments. If they're compile-time arguments, then by definition, they affect the generated code if they're ever used. Yes, it is theoretically possible for the compiler to use the same template definition under certain circumstances (especially if you're dealing with something like a templated struct that ultimately ends up with exactly the same layout across multiple instantiations), but the only way that the compiler could avoid duplicating the code with something like __FILE__ and __LINE__ being used as template arguments is to translate them into function arguments, because if the function has only one definition, then it can't have different values for the file or line number. - Jonathan M Davis
Jun 26 2013
On Thursday, 27 June 2013 at 03:47:39 UTC, Jonathan M Davis wrote:That's _very_ dependent on what the function does. A prime counter-example would be overloaded operators like opBinary which uses their string template arguments in mixins within the function, thereby generating completely different functions. And I believe that that is by far the more common use of strings as template parameters.This may be true. Any kind of optimization would probably have to ignore most operator functions. Another heavy use of strings, however, is string lambdas. Checking if two string lambdas are the same is a simple string comparison, and then that template can be compiled once and all copies can be ellided. Then it's just a call to that template wherever it is used with that particular lambda in the program. I think this could be done with any template that accepts some kind of lambda function, but it may be somewhat tricky currently to figure out if two non-string lambda are equal, and I believe that DMD currently generates a unique lambda wherever one is used.Also, when using strings and integral values such as __FILE__ and __LINE__, the functions _do_ have different code. auto foo(string file = __FILE__, size_t line = __LINE__)(Bar b) { auto f = file; auto l = line; .... } ends up generating different code depending on the values of file or line. The code is _not_ identical. The only way for the code not to vary is for those arguments to be runtime arguments. If they're compile-time arguments, then by definition, they affect the generated code if they're ever used.I do know that the code generated is different. Turning compile-time arguments into run-time arguments will, unfortunately, completely defeat the point, as you know. It's a tricky problem, because you can't just partially compile a function and then change the compile-time arguments as you encounter each new instantiation. If we could do that, we wouldn't have this problem, and we would probably be using Common Lisp.Yes, it is theoretically possible for the compiler to use the same template definition under certain circumstances (especially if you're dealing with something like a templated struct that ultimately ends up with exactly the same layout across multiple instantiations), but the only way that the compiler could avoid duplicating the code with something like __FILE__ and __LINE__ being used as template arguments is to translate them into function arguments, because if the function has only one definition, then it can't have different values for the file or line number.This is something that could be potentially solved on the binary level, if you're sure that varying the value of a template argument will not affect how it functions. I'm not an expert on how DMD works, but could this possibly be done after the native code is generated, but before optimizations are performed? Just throwing out ideas.
Jun 27 2013
Am Thu, 27 Jun 2013 23:24:48 +0200 schrieb "Meta" <jared771 gmail.com>:I'm not an expert on how DMD works, but could this possibly be done after the native code is generatedI've thought of that as well, but you have to merge templates at an earlier stage. In case of a compiler such as GCC, the native code is not even generated in the compiler, but an external assembler. That said, DMD does not create duplicate template instances for the same parameters. I'd assume this includes "string lambdas". And I'm not sure what you really want to improve for assertNotThrown. It is a unit testing function and the most important thing at the moment seems to keep DMD's memory consumption low, which more compile-time parameters can only increase. Another point is that today's best compilers are good at detecting compile-time constants already and could inline a function with runtime parameters that are statically known. I've once let DMD create a switch-case with 81 cases for me and used a 100% templated function specialized for all these cases. The result was a major slow-down from all the compile-time arguments. Run-time arguments are just better! -- Marco
Jun 28 2013
On Friday, 28 June 2013 at 12:45:22 UTC, Marco Leise wrote:Am Thu, 27 Jun 2013 23:24:48 +0200 schrieb "Meta" <jared771 gmail.com>:That's good to know. I wasn't sure if there were some cases when template were ellided or not, even with the same arguments. By string lambdas, I mean lambdas of the form sort!"a <= b" that are heavily used in std.algorithm.I'm not an expert on how DMD works, but could this possibly be done after the native code is generatedI've thought of that as well, but you have to merge templates at an earlier stage. In case of a compiler such as GCC, the native code is not even generated in the compiler, but an external assembler. That said, DMD does not create duplicate template instances for the same parameters. I'd assume this includes "string lambdas".And I'm not sure what you really want to improve for assertNotThrown. It is a unit testing function and the most important thing at the moment seems to keep DMD's memory consumption low, which more compile-time parameters can only increase.I was only using assertNotThrown as an example of a template function where parameters that could be compile-time arguments become run-time arguments to minimize template bloat.Another point is that today's best compilers are good at detecting compile-time constants already and could inline a function with runtime parameters that are statically known. I've once let DMD create a switch-case with 81 cases for me and used a 100% templated function specialized for all these cases. The result was a major slow-down from all the compile-time arguments. Run-time arguments are just better!That is certainly a corner case. Perhaps finding a way to fold templates in some cases would still provide benefits in more normal cases.
Jun 28 2013
On Friday, 28 June 2013 at 12:45:22 UTC, Marco Leise wrote:Am Thu, 27 Jun 2013 23:24:48 +0200 schrieb "Meta" <jared771 gmail.com>:What is a "string lambda" ? Do you mean "a string alias latter parsed as an expression", or "an actual lambda function". There were talks about this recently in learn: If you *type* the same lambda function twice, it *will* generate two different templates. http://forum.dlang.org/thread/rufkccowcdogctvckzar forum.dlang.org For example: -------- struct S(alias pred){} void main() { auto a = S!(() => 1)(); auto b = S!(() => 1)(); a = b; } -------- DMD v2.064 DEBUG main.d(7): Error: cannot implicitly convert expression (b) of type S!(function int() { return 1; } ) to S!(function int() { return 1; } ) -------- However, -------- enum PRED = () => 1; struct S(alias pred){} void main() { auto a = S!PRED(); auto b = S!PRED(); a = b; } -------- This is fine. From testing, *string* preds don't have this "feature"/"bug" (?), even if you compile time build them, the compiler will "see" that it is the same string (probably true for integrals and whatnot). But for lambdas, only their *name* counts (eg __lambda1__). So if you plan to re-use a lambda, it *must* be stored in a way the compiler can "see" its unicity, and not typed more than once. Furthermore, I'm think this is the correct behavior.I'm not an expert on how DMD works, but could this possibly be done after the native code is generatedI've thought of that as well, but you have to merge templates at an earlier stage. In case of a compiler such as GCC, the native code is not even generated in the compiler, but an external assembler. That said, DMD does not create duplicate template instances for the same parameters. I'd assume this includes "string lambdas".
Jun 28 2013
On Friday, 28 June 2013 at 13:20:47 UTC, monarch_dodra wrote:What is a "string lambda" ? Do you mean "a string alias latter parsed as an expression", or "an actual lambda function".The former. I didn't think it was *that* obscure of a term (https://www.google.com/search?q=%22string+lambda%22+site:forum.dlang.org).There were talks about this recently in learn: If you *type* the same lambda function twice, it *will* generate two different templates. http://forum.dlang.org/thread/rufkccowcdogctvckzar forum.dlang.orgI'm not sure if that's due to this bug or not: http://forum.dlang.org/thread/jdbhas$2ftb$1 digitalmars.com?page=2#post-jdcn98:241mg9:244:40digitalmars.comHowever, -------- enum PRED = () => 1; struct S(alias pred){} void main() { auto a = S!PRED(); auto b = S!PRED(); a = b; } -------- This is fine.If that's all it takes to avoid copies, then maybe it's not so big of a problem for lambdas.From testing, *string* preds don't have this "feature"/"bug" (?), even if you compile time build them, the compiler will "see" that it is the same string (probably true for integrals and whatnot). But for lambdas, only their *name* counts (eg __lambda1__). So if you plan to re-use a lambda, it *must* be stored in a way the compiler can "see" its unicity, and not typed more than once.Ditto.
Jun 28 2013
"Meta" <jared771 gmail.com> wrote in message news:ovlbbsdtyejzpsllfrcb forum.dlang.org...I'm not an expert on how DMD works, but could this possibly be done after the native code is generated, but before optimizations are performed? Just throwing out ideas.The obvious place to do this is at link time - then you get to fold all functions from all compilation units. It becomes something similar to string constant pooling - you fold sections/data based on content, not labels.
Jun 28 2013
On Thursday, 27 June 2013 at 02:03:52 UTC, Meta wrote:I'd like to ask, however, why should this be? The arguments line and file will only ever be strings and ints. They don't vary in type, only value, and changing those values will not affect how the function works. You could easily substitute out "exception.d" with "stdio.d", and the functionality will be unchanged.Just because the arguments are known 99% of the time at compile time, doesn't mean they should be parameters. Parameters shouldn't be used to just mean "compile time" anyways, but *specifically* to create different versions of a function or code, which in this case, is not the case at all. Besides, there are scenarios where you could pass these as runtime parameters from higher functions, if you want the exception to "look" like it came from a higher level call. Making these parameters would mean that *any* function that handles lines or filenames would have to be parameterized on that.
Jun 28 2013
On Friday, 28 June 2013 at 13:25:54 UTC, monarch_dodra wrote:Just because the arguments are known 99% of the time at compile time, doesn't mean they should be parameters. Parameters shouldn't be used to just mean "compile time" anyways, but *specifically* to create different versions of a function or code, which in this case, is not the case at all.That was probably the idea when templates were conceived of for C++, but I think their usage has evolved far beyond that now, especially in D. With the advent of CTFE, there's an enormous amount that we can do at compile-time, a lot of which isn't strictly related to characterizing a function/struct/class/etc. on a type. Though I suppose it could be argued that with CTFE, doing compile-time template stuff isn't as important.Besides, there are scenarios where you could pass these as runtime parameters from higher functions, if you want the exception to "look" like it came from a higher level call.I was just using assertNotThrown as an example to illustrate my point.Making these parameters would mean that *any* function that handles lines or filenames would have to be parameterized on that.I don't think there's anything stopping you from passing __FILE__ and __LINE__ as template parameters to some functions, and as function parameters to other functions.
Jun 28 2013
On Friday, 28 June 2013 at 22:40:14 UTC, Meta wrote:With the advent of CTFE, there's an enormous amount that we can do at compile-time, a lot of which isn't strictly related to characterizing a function/struct/class/etc. on a type.s/characterizing/parameterizing
Jun 28 2013