digitalmars.D - Re: Compile-time optimization

"H. S. Teoh" <hsteoh quickfur.ath.cx> Jul 24 2013

"JS" <js.mdnq gmail.com> Jul 25 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jul 24, 2013 at 08:12:36PM +0200, JS wrote:
 T,
 
 I've looked over your code and understand what you are doing but I
 don't understand the expression
 
      is(typeof(func(args[0], args[1]))))
 
 you are checking if func(a,b)'s return type is a type?


That's the usual idiom for checking if it's legal to pass args[0] and
args[1] to func().

If it's legal, then typeof(func(args[0],args[1])) will return some type,
which then passes the check. If it's illegal, then the call to func has
no type because the compiler rejects the call; say you have
func(int,int) and you attempted to call func("a","a"). So then
typeof(func(...)) will not be a valid type, and the check fails.


On Wed, Jul 24, 2013 at 08:42:48PM +0200, JS wrote:
 On Wednesday, 24 July 2013 at 18:12:37 UTC, JS wrote:
 T,
 
 Also, I've tried to wrap the template in a function and I can't get
 it to work:
 
 	
 	string join(T...)(string delim, T args)
 	{
 		return tupleStringReduce!(args);
 	}
 
 it seems tupleStringReduce just returns the tuple of of the
 optimized strings. I'll still need to write code that will deal
 mixing in the actual compile time and runtime code.


You can't return a tuple as a string. :)  Mainly because the joining has
only happened to the string literals, but you haven't told the compiler
how the tuple itself is to be joined into a string.

I thought I already gave example code to do this:

	string join(string[] args...) {
		string result;
		string delim;

		// You need this in order for the compiler to know what
		// you wanna do with the optimized tuple.
		foreach (arg; args) {
			result ~= delim ~ arg;
			delim = ",";	// just for example
		}
		return result;
	}

	template optimizedJoin(args...)
	{
		// This wrapper optimizes the tuple and passes the
		// result to join().
		string optimizedJoin() {
			return join(tupleReduce!((string x, string y) => x~y, args));
		}
	}

Note that you have to use template calling syntax for optimizedJoin,
like this:

	string result = optimizedJoin!("a", "b", x, "c", "d");

I did try wrapping it so that you can use normal runtime call syntax,
but it doesn't work because the tuple reduction must happen at template
expansion stage. It's too late to do it in CTFE proper.


 This doesn't seem that hard as tupleReduce essentially outlines the
 approach. The difference is instead of calling func directly I sort
 of need to mix it in as a code string.
 
 e.g., for tupleStringReduce, func is x~y. I need to build a string
 using that "((arg[0]~arg[1])~arg[2])..." which is then mixed in at
 compile time but at runtime evaluates to a string(not an array).
 
 Since there is no .codeof, AFAIK, I guess the only way to do this is
 pass func, and the func code string("x~y") in this case.
 
 The idea though is to use the lamba as a ctfe to reduce the tuple
 when possible but *also* use it at runtime to evaluate the tuple.



Alright, here's an improved version of the code:

	import std.stdio;
	
	template tuple(args...) {
		alias tuple = args;
	}
	
	/**
	 * Given a binary function and a tuple, returns a tuple in which all adjacent
	 * compile-time readable items are recursively substituted with the return
	 * value of the binary function applied to them.
	 */
	template tupleReduce(alias func, args...)
	{
		static if (args.length > 1)
		{
			static if (__traits(compiles, { enum x = args[0]; }))
			{
				static if (__traits(compiles, { enum x = args[1]; }) &&
					is(typeof(func(args[0], args[1]))))
				{
					enum result = func(args[0], args[1]);
					alias tupleReduce = tupleReduce!(func, result,
						args[2..$]);
				}
				else
				{
					alias tupleReduce = tuple!(args[0], args[1],
						tupleReduce!(func, args[2..$]));
				}
			}
			else
			{
				alias tupleReduce = tuple!(args[0],
					tupleReduce!(func, args[1..$]));
			}
		}
		else
		{
			alias tupleReduce = args;
		}
	}
	
	// I decided to use a real variadic instead of an array, to
	// force loop-unrolling.
	// Strictly speaking, we should add a signature constraint to
	// enforce uniform types in T..., but I was too lazy to figure
	// out how to phrase allSatisfy to check for that. For
	// production code, though you'll definitely want to do this.
	auto reduceImpl(alias fun, T...)(T args)
	{
		import std.functional;
		typeof(unaryFun!fun(args[0],args[0])) result;
	
		foreach (arg; args) {
			result = unaryFun!fun(result, arg);
		}
		return result;
	}
	
	// Nice wrapper for reduceImpl, so that we don't have to type
	// the lambda twice.
	template reduce(alias fun, args...)
	{
		string reduce() {
			return reduceImpl!fun(tupleReduce!(fun, args));
		}
	}
	
	/**
	 * Joins a list of strings. Adjacent CTFEable strings are joined
	 * at compile-time into single strings, while runtime arguments
	 * are joined at runtime.
	 */
	template join(args...)
	{
		string join() {
			// Behold, only one instance of the lambda! It
			// gets used both at compile-time and runtime.
			return reduce!((string x, string y) => x~y, args);
		}
	}
	
	void main() {
		// Example of how to use tupleReduce for compile-time
		// optimization of join:
		string x = "(runtime1)";
		string y = "(runtime2)";

		// Note that you still have to use compile-time argument
		// syntax for join.
		writeln(join!(x, "a", "bc", y, "de", "f", "g"));
	}

Compiling with dmd -O -inline produced two nested function calls for the
join!(...), though inside, the string literals have already been
concatenated.

Compiling with gdc -O3, OTOH, produced an unrolled loop in Dmain which
basically sequentially appends to the result of join!(), i.e., it's as
though I wrote:

	void main() {
		string x = "(runtime1)";
		string y = "(runtime2)";
		writeln(x ~ "abc" ~ y ~ "defg");
	}

You can't get much better than that. :)

I think trying to repackage join!(...) as join(...) is a fool's errand,
though. You'll likely only end up with pain. :) IMO writing join!(...)
is a Good Enough(tm) compromise given that we have already achieved
maximal compile-time optimization.


T

-- 
ASCII stupid question, getty stupid ANSI.

Jul 24 2013

"JS" <js.mdnq gmail.com> writes:

On Wednesday, 24 July 2013 at 19:52:57 UTC, H. S. Teoh wrote:
 On Wed, Jul 24, 2013 at 08:12:36PM +0200, JS wrote:
 T,
 
 I've looked over your code and understand what you are doing 
 but I
 don't understand the expression
 
      is(typeof(func(args[0], args[1]))))
 
 you are checking if func(a,b)'s return type is a type?


 That's the usual idiom for checking if it's legal to pass 
 args[0] and
 args[1] to func().

 If it's legal, then typeof(func(args[0],args[1])) will return 
 some type,
 which then passes the check. If it's illegal, then the call to 
 func has
 no type because the compiler rejects the call; say you have
 func(int,int) and you attempted to call func("a","a"). So then
 typeof(func(...)) will not be a valid type, and the check fails.


 On Wed, Jul 24, 2013 at 08:42:48PM +0200, JS wrote:
 On Wednesday, 24 July 2013 at 18:12:37 UTC, JS wrote:
 T,
 
 Also, I've tried to wrap the template in a function and I 
 can't get
 it to work:
 
 	
 	string join(T...)(string delim, T args)
 	{
 		return tupleStringReduce!(args);
 	}
 
 it seems tupleStringReduce just returns the tuple of of the
 optimized strings. I'll still need to write code that will deal
 mixing in the actual compile time and runtime code.


 You can't return a tuple as a string. :)  Mainly because the 
 joining has
 only happened to the string literals, but you haven't told the 
 compiler
 how the tuple itself is to be joined into a string.

 I thought I already gave example code to do this:

 	string join(string[] args...) {
 		string result;
 		string delim;

 		// You need this in order for the compiler to know what
 		// you wanna do with the optimized tuple.
 		foreach (arg; args) {
 			result ~= delim ~ arg;
 			delim = ",";	// just for example
 		}
 		return result;
 	}

 	template optimizedJoin(args...)
 	{
 		// This wrapper optimizes the tuple and passes the
 		// result to join().
 		string optimizedJoin() {
 			return join(tupleReduce!((string x, string y) => x~y, args));
 		}
 	}

 Note that you have to use template calling syntax for 
 optimizedJoin,
 like this:

 	string result = optimizedJoin!("a", "b", x, "c", "d");

 I did try wrapping it so that you can use normal runtime call 
 syntax,
 but it doesn't work because the tuple reduction must happen at 
 template
 expansion stage. It's too late to do it in CTFE proper.


 This doesn't seem that hard as tupleReduce essentially 
 outlines the
 approach. The difference is instead of calling func directly I 
 sort
 of need to mix it in as a code string.
 
 e.g., for tupleStringReduce, func is x~y. I need to build a 
 string
 using that "((arg[0]~arg[1])~arg[2])..." which is then mixed 
 in at
 compile time but at runtime evaluates to a string(not an 
 array).
 
 Since there is no .codeof, AFAIK, I guess the only way to do 
 this is
 pass func, and the func code string("x~y") in this case.
 
 The idea though is to use the lamba as a ctfe to reduce the 
 tuple
 when possible but *also* use it at runtime to evaluate the 
 tuple.



 Alright, here's an improved version of the code:

 	import std.stdio;
 	
 	template tuple(args...) {
 		alias tuple = args;
 	}
 	
 	/**
 	 * Given a binary function and a tuple, returns a tuple in 
 which all adjacent
 	 * compile-time readable items are recursively substituted 
 with the return
 	 * value of the binary function applied to them.
 	 */
 	template tupleReduce(alias func, args...)
 	{
 		static if (args.length > 1)
 		{
 			static if (__traits(compiles, { enum x = args[0]; }))
 			{
 				static if (__traits(compiles, { enum x = args[1]; }) &&
 					is(typeof(func(args[0], args[1]))))
 				{
 					enum result = func(args[0], args[1]);
 					alias tupleReduce = tupleReduce!(func, result,
 						args[2..$]);
 				}
 				else
 				{
 					alias tupleReduce = tuple!(args[0], args[1],
 						tupleReduce!(func, args[2..$]));
 				}
 			}
 			else
 			{
 				alias tupleReduce = tuple!(args[0],
 					tupleReduce!(func, args[1..$]));
 			}
 		}
 		else
 		{
 			alias tupleReduce = args;
 		}
 	}
 	
 	// I decided to use a real variadic instead of an array, to
 	// force loop-unrolling.
 	// Strictly speaking, we should add a signature constraint to
 	// enforce uniform types in T..., but I was too lazy to figure
 	// out how to phrase allSatisfy to check for that. For
 	// production code, though you'll definitely want to do this.
 	auto reduceImpl(alias fun, T...)(T args)
 	{
 		import std.functional;
 		typeof(unaryFun!fun(args[0],args[0])) result;
 	
 		foreach (arg; args) {
 			result = unaryFun!fun(result, arg);
 		}
 		return result;
 	}
 	
 	// Nice wrapper for reduceImpl, so that we don't have to type
 	// the lambda twice.
 	template reduce(alias fun, args...)
 	{
 		string reduce() {
 			return reduceImpl!fun(tupleReduce!(fun, args));
 		}
 	}
 	
 	/**
 	 * Joins a list of strings. Adjacent CTFEable strings are 
 joined
 	 * at compile-time into single strings, while runtime arguments
 	 * are joined at runtime.
 	 */
 	template join(args...)
 	{
 		string join() {
 			// Behold, only one instance of the lambda! It
 			// gets used both at compile-time and runtime.
 			return reduce!((string x, string y) => x~y, args);
 		}
 	}
 	
 	void main() {
 		// Example of how to use tupleReduce for compile-time
 		// optimization of join:
 		string x = "(runtime1)";
 		string y = "(runtime2)";

 		// Note that you still have to use compile-time argument
 		// syntax for join.
 		writeln(join!(x, "a", "bc", y, "de", "f", "g"));
 	}

 Compiling with dmd -O -inline produced two nested function 
 calls for the
 join!(...), though inside, the string literals have already been
 concatenated.

 Compiling with gdc -O3, OTOH, produced an unrolled loop in 
 Dmain which
 basically sequentially appends to the result of join!(), i.e., 
 it's as
 though I wrote:

 	void main() {
 		string x = "(runtime1)";
 		string y = "(runtime2)";
 		writeln(x ~ "abc" ~ y ~ "defg");
 	}

 You can't get much better than that. :)

 I think trying to repackage join!(...) as join(...) is a fool's 
 errand,
 though. You'll likely only end up with pain. :) IMO writing 
 join!(...)
 is a Good Enough(tm) compromise given that we have already 
 achieved
 maximal compile-time optimization.



Cool, this is definitely on the right track! I like how you are 
able to generate the expansion reduction without string 
mixins(the naive way I would try and do it).

The main things I was after are achieved: Loop unrolling, 
mixed-time constant folding, and variable arguments.

While I think I understand the overall idea you are using it will 
take me a bit to grasp exactly how you were able to get it all 
done correctly.

I have two requests if you don't mind and have the time...

Allow for array reduction and delim/indexing. When using join we 
typically want to join an array of strings while also inserting a 
delimiter between them... usually with the last one removed.

Sometimes we want to know the index and maximum array size to do 
specialized joins.

e.g.,

join("a", "b", ["c", "d"])

we might want to print out "4: 1a, 2b, 3c, 4d".

Using your lambda's this is somewhat easy to achieve,

(string a, int i, int m) => { return ((i == 0) ? to!string(m)~": 
" : "")~to!string(i)~a~((i < m) ? ", " : ""); }

this would be a unary applied to each string... either at compile 
time, if possible, or runtime.

I'm not sure how to achieve this because of the runtime/compile 
time issue... but I think it's similar to how to mention the use 
of reduce being used at compile time and runtime.

As far as the array issue goes, it seems that a check for an 
array is needed then func needs to be checked if it can work on 
the array element and applied in a similar fashion... but just 
too much complexity for my little brain to grasp ;/

Having such features should 1. Provide a join function that is 
optimal in all conditions but also mimic the traditional join 
function. 2. Demonstrate techniques that produce such optimal 
performing code and make normal functions more performant and 
expressive(e.g., works well with variable args).

I can imagine an analogous Boost like library for D that provides 
the optimal performance possible by getting the compiler to do as 
much work as possible behind the scenes.

Once I get these programming techniques down I'll hopefully be 
able to start implementing some of routines(string stuff, math, 
etc...). At that point, if no one beats me to the punch I'll try 
to see if it was all worth it in the first place ;)

Jul 25 2013

D Programming

C/C++ Programming

Other

digitalmars.D - Re: Compile-time optimization