www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Re: Compile-time optimization

reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jul 24, 2013 at 08:12:36PM +0200, JS wrote:
 T,
 
 I've looked over your code and understand what you are doing but I
 don't understand the expression
 
      is(typeof(func(args[0], args[1]))))
 
 you are checking if func(a,b)'s return type is a type?

That's the usual idiom for checking if it's legal to pass args[0] and args[1] to func(). If it's legal, then typeof(func(args[0],args[1])) will return some type, which then passes the check. If it's illegal, then the call to func has no type because the compiler rejects the call; say you have func(int,int) and you attempted to call func("a","a"). So then typeof(func(...)) will not be a valid type, and the check fails. On Wed, Jul 24, 2013 at 08:42:48PM +0200, JS wrote:
 On Wednesday, 24 July 2013 at 18:12:37 UTC, JS wrote:
 T,
 
 Also, I've tried to wrap the template in a function and I can't get
 it to work:
 
 	
 	string join(T...)(string delim, T args)
 	{
 		return tupleStringReduce!(args);
 	}
 
 it seems tupleStringReduce just returns the tuple of of the
 optimized strings. I'll still need to write code that will deal
 mixing in the actual compile time and runtime code.

You can't return a tuple as a string. :) Mainly because the joining has only happened to the string literals, but you haven't told the compiler how the tuple itself is to be joined into a string. I thought I already gave example code to do this: string join(string[] args...) { string result; string delim; // You need this in order for the compiler to know what // you wanna do with the optimized tuple. foreach (arg; args) { result ~= delim ~ arg; delim = ","; // just for example } return result; } template optimizedJoin(args...) { // This wrapper optimizes the tuple and passes the // result to join(). string optimizedJoin() { return join(tupleReduce!((string x, string y) => x~y, args)); } } Note that you have to use template calling syntax for optimizedJoin, like this: string result = optimizedJoin!("a", "b", x, "c", "d"); I did try wrapping it so that you can use normal runtime call syntax, but it doesn't work because the tuple reduction must happen at template expansion stage. It's too late to do it in CTFE proper.
 This doesn't seem that hard as tupleReduce essentially outlines the
 approach. The difference is instead of calling func directly I sort
 of need to mix it in as a code string.
 
 e.g., for tupleStringReduce, func is x~y. I need to build a string
 using that "((arg[0]~arg[1])~arg[2])..." which is then mixed in at
 compile time but at runtime evaluates to a string(not an array).
 
 Since there is no .codeof, AFAIK, I guess the only way to do this is
 pass func, and the func code string("x~y") in this case.
 
 The idea though is to use the lamba as a ctfe to reduce the tuple
 when possible but *also* use it at runtime to evaluate the tuple.

Alright, here's an improved version of the code: import std.stdio; template tuple(args...) { alias tuple = args; } /** * Given a binary function and a tuple, returns a tuple in which all adjacent * compile-time readable items are recursively substituted with the return * value of the binary function applied to them. */ template tupleReduce(alias func, args...) { static if (args.length > 1) { static if (__traits(compiles, { enum x = args[0]; })) { static if (__traits(compiles, { enum x = args[1]; }) && is(typeof(func(args[0], args[1])))) { enum result = func(args[0], args[1]); alias tupleReduce = tupleReduce!(func, result, args[2..$]); } else { alias tupleReduce = tuple!(args[0], args[1], tupleReduce!(func, args[2..$])); } } else { alias tupleReduce = tuple!(args[0], tupleReduce!(func, args[1..$])); } } else { alias tupleReduce = args; } } // I decided to use a real variadic instead of an array, to // force loop-unrolling. // Strictly speaking, we should add a signature constraint to // enforce uniform types in T..., but I was too lazy to figure // out how to phrase allSatisfy to check for that. For // production code, though you'll definitely want to do this. auto reduceImpl(alias fun, T...)(T args) { import std.functional; typeof(unaryFun!fun(args[0],args[0])) result; foreach (arg; args) { result = unaryFun!fun(result, arg); } return result; } // Nice wrapper for reduceImpl, so that we don't have to type // the lambda twice. template reduce(alias fun, args...) { string reduce() { return reduceImpl!fun(tupleReduce!(fun, args)); } } /** * Joins a list of strings. Adjacent CTFEable strings are joined * at compile-time into single strings, while runtime arguments * are joined at runtime. */ template join(args...) { string join() { // Behold, only one instance of the lambda! It // gets used both at compile-time and runtime. return reduce!((string x, string y) => x~y, args); } } void main() { // Example of how to use tupleReduce for compile-time // optimization of join: string x = "(runtime1)"; string y = "(runtime2)"; // Note that you still have to use compile-time argument // syntax for join. writeln(join!(x, "a", "bc", y, "de", "f", "g")); } Compiling with dmd -O -inline produced two nested function calls for the join!(...), though inside, the string literals have already been concatenated. Compiling with gdc -O3, OTOH, produced an unrolled loop in Dmain which basically sequentially appends to the result of join!(), i.e., it's as though I wrote: void main() { string x = "(runtime1)"; string y = "(runtime2)"; writeln(x ~ "abc" ~ y ~ "defg"); } You can't get much better than that. :) I think trying to repackage join!(...) as join(...) is a fool's errand, though. You'll likely only end up with pain. :) IMO writing join!(...) is a Good Enough(tm) compromise given that we have already achieved maximal compile-time optimization. T -- ASCII stupid question, getty stupid ANSI.
Jul 24 2013
parent "JS" <js.mdnq gmail.com> writes:
On Wednesday, 24 July 2013 at 19:52:57 UTC, H. S. Teoh wrote:
 On Wed, Jul 24, 2013 at 08:12:36PM +0200, JS wrote:
 T,
 
 I've looked over your code and understand what you are doing 
 but I
 don't understand the expression
 
      is(typeof(func(args[0], args[1]))))
 
 you are checking if func(a,b)'s return type is a type?

That's the usual idiom for checking if it's legal to pass args[0] and args[1] to func(). If it's legal, then typeof(func(args[0],args[1])) will return some type, which then passes the check. If it's illegal, then the call to func has no type because the compiler rejects the call; say you have func(int,int) and you attempted to call func("a","a"). So then typeof(func(...)) will not be a valid type, and the check fails. On Wed, Jul 24, 2013 at 08:42:48PM +0200, JS wrote:
 On Wednesday, 24 July 2013 at 18:12:37 UTC, JS wrote:
 T,
 
 Also, I've tried to wrap the template in a function and I 
 can't get
 it to work:
 
 	
 	string join(T...)(string delim, T args)
 	{
 		return tupleStringReduce!(args);
 	}
 
 it seems tupleStringReduce just returns the tuple of of the
 optimized strings. I'll still need to write code that will deal
 mixing in the actual compile time and runtime code.

You can't return a tuple as a string. :) Mainly because the joining has only happened to the string literals, but you haven't told the compiler how the tuple itself is to be joined into a string. I thought I already gave example code to do this: string join(string[] args...) { string result; string delim; // You need this in order for the compiler to know what // you wanna do with the optimized tuple. foreach (arg; args) { result ~= delim ~ arg; delim = ","; // just for example } return result; } template optimizedJoin(args...) { // This wrapper optimizes the tuple and passes the // result to join(). string optimizedJoin() { return join(tupleReduce!((string x, string y) => x~y, args)); } } Note that you have to use template calling syntax for optimizedJoin, like this: string result = optimizedJoin!("a", "b", x, "c", "d"); I did try wrapping it so that you can use normal runtime call syntax, but it doesn't work because the tuple reduction must happen at template expansion stage. It's too late to do it in CTFE proper.
 This doesn't seem that hard as tupleReduce essentially 
 outlines the
 approach. The difference is instead of calling func directly I 
 sort
 of need to mix it in as a code string.
 
 e.g., for tupleStringReduce, func is x~y. I need to build a 
 string
 using that "((arg[0]~arg[1])~arg[2])..." which is then mixed 
 in at
 compile time but at runtime evaluates to a string(not an 
 array).
 
 Since there is no .codeof, AFAIK, I guess the only way to do 
 this is
 pass func, and the func code string("x~y") in this case.
 
 The idea though is to use the lamba as a ctfe to reduce the 
 tuple
 when possible but *also* use it at runtime to evaluate the 
 tuple.

Alright, here's an improved version of the code: import std.stdio; template tuple(args...) { alias tuple = args; } /** * Given a binary function and a tuple, returns a tuple in which all adjacent * compile-time readable items are recursively substituted with the return * value of the binary function applied to them. */ template tupleReduce(alias func, args...) { static if (args.length > 1) { static if (__traits(compiles, { enum x = args[0]; })) { static if (__traits(compiles, { enum x = args[1]; }) && is(typeof(func(args[0], args[1])))) { enum result = func(args[0], args[1]); alias tupleReduce = tupleReduce!(func, result, args[2..$]); } else { alias tupleReduce = tuple!(args[0], args[1], tupleReduce!(func, args[2..$])); } } else { alias tupleReduce = tuple!(args[0], tupleReduce!(func, args[1..$])); } } else { alias tupleReduce = args; } } // I decided to use a real variadic instead of an array, to // force loop-unrolling. // Strictly speaking, we should add a signature constraint to // enforce uniform types in T..., but I was too lazy to figure // out how to phrase allSatisfy to check for that. For // production code, though you'll definitely want to do this. auto reduceImpl(alias fun, T...)(T args) { import std.functional; typeof(unaryFun!fun(args[0],args[0])) result; foreach (arg; args) { result = unaryFun!fun(result, arg); } return result; } // Nice wrapper for reduceImpl, so that we don't have to type // the lambda twice. template reduce(alias fun, args...) { string reduce() { return reduceImpl!fun(tupleReduce!(fun, args)); } } /** * Joins a list of strings. Adjacent CTFEable strings are joined * at compile-time into single strings, while runtime arguments * are joined at runtime. */ template join(args...) { string join() { // Behold, only one instance of the lambda! It // gets used both at compile-time and runtime. return reduce!((string x, string y) => x~y, args); } } void main() { // Example of how to use tupleReduce for compile-time // optimization of join: string x = "(runtime1)"; string y = "(runtime2)"; // Note that you still have to use compile-time argument // syntax for join. writeln(join!(x, "a", "bc", y, "de", "f", "g")); } Compiling with dmd -O -inline produced two nested function calls for the join!(...), though inside, the string literals have already been concatenated. Compiling with gdc -O3, OTOH, produced an unrolled loop in Dmain which basically sequentially appends to the result of join!(), i.e., it's as though I wrote: void main() { string x = "(runtime1)"; string y = "(runtime2)"; writeln(x ~ "abc" ~ y ~ "defg"); } You can't get much better than that. :) I think trying to repackage join!(...) as join(...) is a fool's errand, though. You'll likely only end up with pain. :) IMO writing join!(...) is a Good Enough(tm) compromise given that we have already achieved maximal compile-time optimization.

Cool, this is definitely on the right track! I like how you are able to generate the expansion reduction without string mixins(the naive way I would try and do it). The main things I was after are achieved: Loop unrolling, mixed-time constant folding, and variable arguments. While I think I understand the overall idea you are using it will take me a bit to grasp exactly how you were able to get it all done correctly. I have two requests if you don't mind and have the time... Allow for array reduction and delim/indexing. When using join we typically want to join an array of strings while also inserting a delimiter between them... usually with the last one removed. Sometimes we want to know the index and maximum array size to do specialized joins. e.g., join("a", "b", ["c", "d"]) we might want to print out "4: 1a, 2b, 3c, 4d". Using your lambda's this is somewhat easy to achieve, (string a, int i, int m) => { return ((i == 0) ? to!string(m)~": " : "")~to!string(i)~a~((i < m) ? ", " : ""); } this would be a unary applied to each string... either at compile time, if possible, or runtime. I'm not sure how to achieve this because of the runtime/compile time issue... but I think it's similar to how to mention the use of reduce being used at compile time and runtime. As far as the array issue goes, it seems that a check for an array is needed then func needs to be checked if it can work on the array element and applied in a similar fashion... but just too much complexity for my little brain to grasp ;/ Having such features should 1. Provide a join function that is optimal in all conditions but also mimic the traditional join function. 2. Demonstrate techniques that produce such optimal performing code and make normal functions more performant and expressive(e.g., works well with variable args). I can imagine an analogous Boost like library for D that provides the optimal performance possible by getting the compiler to do as much work as possible behind the scenes. Once I get these programming techniques down I'll hopefully be able to start implementing some of routines(string stuff, math, etc...). At that point, if no one beats me to the punch I'll try to see if it was all worth it in the first place ;)
Jul 25 2013