digitalmars.D.learn - Code generation tricks
- JS (36/36) Jul 21 2013 This seems to be a somewhat efficient string splitter
- John Colvin (2/39) Jul 22 2013 How does this perform compared to naive/phobos splitting?
- JS (10/59) Jul 22 2013 I don't know... probably not a huge difference unless Phobos's is
- JS (16/65) Jul 22 2013 I don't know... probably not a huge difference unless Phobos's is
- anonymous (55/57) Jul 23 2013 I probably shouldn't have done this, but I wanted to know what
This seems to be a somewhat efficient string splitter:

http://dpaste.dzfl.pl/4307aa5f

The basic idea is

    for(int j = 0; j < s.length; j++)
    {
        mixin(ExpandVariadicIf!("??Cs[j]??s[j..min(s.length-1, j + %%L)]::", "d",
            "
            if (r.length <= i) r.length += 5;
            if (j != 0) { r[i++] = s[oldj..j]; oldj = j + %%L; } else oldj = %%L;
            j += %%L;
            continue;", T));
    }

ExpandVariadicIf creates a series of ifs, one for each variadic argument. There is some strange formatting (just some crap I threw together to get something working), but it boils down to generating code at compile time that minimizes computations and lookups by directly using the compile-time literals that were passed.

IMO these kinds of functions are useful, but at the moment they are just hacks. Hopefully there is a better way to do this sort of thing.

One of the big issues is not being able to pass a variadic variable to a template directly, which is why the formatting string is necessary. (You can pass the type tuple to get the types and the count, but not the compile-time values, if they exist.) I think in this case a variadic alias would be very useful:

    alias T... => alias T0, alias T1, etc.

(e.g. T[0] is an alias, T.length is the number of aliases, etc.)

In any case, maybe someone has a good way to make these things easier and more useful. Being able to handle variadic types and values in a consistent and simple way would make them more useful.
Jul 21 2013
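To make the idea concrete, here is a hand-written sketch of the kind of if-chain the mixin is described as generating, for two hypothetical separators, ',' and "ab". It is not the output of ExpandVariadicIf (that template lives in the linked paste); the function name is made up, the slice bound uses min(s.length, ...) rather than the s.length-1 shown in the post, and empty fields are kept for simplicity.

```d
import std.algorithm : min;

// Hypothetical hand expansion for the separators ',' and "ab".
// Each separator is a literal, so its length is a constant in the code.
string[] splitCommaOrAb(string s)
{
    string[] r;
    size_t oldj = 0;
    for (size_t j = 0; j < s.length; j++)
    {
        // char separator: one element comparison
        if (s[j] == ',')
        {
            r ~= s[oldj .. j];
            oldj = j + 1;
            continue;
        }
        // string separator of length 2: compare a slice of that length
        if (s[j .. min(s.length, j + 2)] == "ab")
        {
            r ~= s[oldj .. j];
            oldj = j + 2;
            j += 1; // the loop's j++ supplies the second step
            continue;
        }
    }
    r ~= s[oldj .. $]; // whatever follows the last separator
    return r;
}
```

The point of generating this at compile time is that the comparisons and the step sizes are fixed per separator, so nothing about the separator list has to be looked up while scanning.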
On Sunday, 21 July 2013 at 17:24:11 UTC, JS wrote:
> This seems to be a somewhat efficient string splitter
> http://dpaste.dzfl.pl/4307aa5f
> [...]

How does this perform compared to naive/Phobos splitting?
Jul 22 2013
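One way to answer that question is to time both versions on the same input. The following is only a minimal harness sketch, assuming a recent Phobos that provides std.datetime.stopwatch; timeIt, makeInput and phobosCount are hypothetical names, and no measurements are implied.

```d
import core.time : Duration;
import std.array : split;
import std.datetime.stopwatch : AutoStart, StopWatch;

// Hypothetical helper: wall-clock time of a single call to `work`.
Duration timeIt(scope void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes);
    work();
    sw.stop();
    return sw.peek;
}

// Hypothetical input generator: `n` fields joined by commas.
string makeInput(size_t n)
{
    string s;
    foreach (k; 0 .. n)
    {
        if (k != 0) s ~= ',';
        s ~= "field";
    }
    return s;
}

// Baseline: Phobos std.array.split with a single string separator.
size_t phobosCount(string s)
{
    return s.split(",").length;
}
```

To compare, wrap the generated splitter in a second delegate, pass both to timeIt with the same makeInput result, and repeat each run several times before drawing conclusions.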
On Monday, 22 July 2013 at 21:04:42 UTC, John Colvin wrote:
> How does this perform compared to naive/Phobos splitting?

I don't know... probably not a huge difference unless Phobos's is heavily optimized. With just one delimiter, there should be no difference. With 100 delimiter literals, the difference should probably be significant, more so when chars are used. If the compiler is able to optimize slices of literal strings then it should be even better.

http://dpaste.dzfl.pl/2f10d24a

dpaste reports a bunch of errors for the code, but it compiles fine on my machine. Must be some command line switch or something.
Jul 22 2013
On Monday, 22 July 2013 at 21:04:42 UTC, John Colvin wrote:
> How does this perform compared to naive/Phobos splitting?

I don't know... probably not a huge difference unless Phobos's is heavily optimized. With just one delimiter, there should be no difference. With 100 delimiter literals, the difference should probably be significant, more so when chars are used. If the compiler is able to optimize slices of literal strings then it should be even better.

Here's my test code, which you might be able to profile if you want:

http://dpaste.dzfl.pl/2f10d24a

dpaste reports a bunch of errors for the code, but it compiles fine on my machine. Must be some command line switch or something.

The Expand templates simply allow one to expand the variadic args into compile-time expressions. E.g., we can write if (a == b) with normal args but not with varargs; the templates help accomplish that. (I'm sure there are better ways... think of the code as a proof of concept.)
Jul 22 2013
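For comparison, D's template tuple parameters can hold compile-time values as well as types, and a foreach over such a tuple is unrolled at compile time. The sketch below uses that to build the per-separator comparisons without a format string. It is an illustration under those assumptions, not a drop-in replacement for the Expand templates; matchLength and seps are hypothetical names.

```d
import std.algorithm : min;

// Hypothetical helper: length of the separator found at s[j], or 0 if none.
// `seps` is a template tuple parameter, so every separator is a
// compile-time value and the foreach below is unrolled per separator.
size_t matchLength(seps...)(const(char)[] s, size_t j)
{
    foreach (sep; seps)
    {
        static if (is(typeof(sep) : char))
        {
            // char separator: compare a single element
            if (s[j] == sep) return 1;
        }
        else static if (is(typeof(sep) : const(char)[]))
        {
            // string separator: compare a slice of the same length
            if (s[j .. min(s.length, j + sep.length)] == sep) return sep.length;
        }
        else static assert(false, "separator must be a char or a string");
    }
    return 0;
}
```

A call looks like matchLength!(',', "ab")(text, j); the separators are passed as template arguments, so the values themselves are available at compile time.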
On Sunday, 21 July 2013 at 17:24:11 UTC, JS wrote:
> This seems to be a somewhat efficient string splitter
> http://dpaste.dzfl.pl/4307aa5f

I probably shouldn't have done this, but I wanted to know what that abomination actually does, so I reduced it (code below). In the end, all it does is accept both char and string separators, something rather simple when you have static if.

Some comments on the result:
* I think the fiddling with i and r.length is silly, but it had some impact on performance, so I left it in.
* Likewise, I'd rather just use std.algorithm.startsWith and not distinguish between char and string separators in split. Again, performance was slightly worse.

And here it is:

    import std.algorithm : min;

    inout(char)[][] split(Separators ...)(inout(char)[] s, Separators separators)
    {
        size_t i = 0, oldj = 0;
        inout(char)[][] r;
        for(size_t j = 0; j < s.length; j++)
        {
            // unrolled at compile time, once per separator
            foreach(si, S; Separators)
            {
                immutable sep = separators[si];
                static if(is(S : char))
                {
                    // char separator: compare a single element
                    auto slice = s[j];
                    enum seplen = 1;
                }
                else static if(is(S : const(char)[]))
                {
                    // string separator: compare a slice of the same length
                    auto slice = s[j .. min(s.length, j + sep.length)];
                    immutable seplen = sep.length;
                }
                else static assert(false);
                if(slice == sep)
                {
                    // grow the result array in steps of 5
                    if(r.length <= i) r.length += 5;
                    if(j != 0) r[i++] = s[oldj .. j];
                    j += seplen;
                    oldj = j;
                }
            }
        }
        // whatever follows the last separator
        if(oldj < s.length)
        {
            auto tail = s[oldj .. $];
            if(tail.length > 0)
            {
                if(r.length <= i) r.length++;
                r[i++] = tail;
            }
        }
        r.length = i; // trim the unused slots
        return r;
    }
Jul 23 2013
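The startsWith variant mentioned in the comments above could look roughly like the following. This is a sketch, not the poster's code: splitAny is a hypothetical name, it only handles string input with char or string separators, it skips empty fields, and it advances exactly past each separator, so its results can differ from the reduced version above.

```d
import std.algorithm.searching : startsWith;

// Sketch: split on any of the given separators, letting startsWith
// handle both char and string needles.
string[] splitAny(Separators...)(string s, Separators separators)
{
    string[] r;
    size_t start = 0, j = 0;
    while (j < s.length)
    {
        size_t seplen = 0;
        foreach (sep; separators) // unrolled, once per separator
        {
            if (seplen == 0 && s[j .. $].startsWith(sep))
            {
                static if (is(typeof(sep) : char)) seplen = 1;
                else seplen = sep.length;
            }
        }
        if (seplen != 0)
        {
            if (j != start) r ~= s[start .. j]; // skip empty fields
            j += seplen;
            start = j;
        }
        else j++;
    }
    if (start < s.length) r ~= s[start .. $]; // tail after the last separator
    return r;
}
```

The only type-dependent part left is the separator length; the comparison itself is the same call for both kinds of separator.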