digitalmars.D - Template performance
- bearophile (185/185) Nov 22 2010 Time ago I have found a little C++ program that computes the number of s...
- Steven Schveighoffer (5/18) Nov 23 2010 This might be due to a bug I reported how DMD is very slow when dealing ...
- bearophile (4/8) Nov 23 2010 Right, it may be the same cause. Are you able & willing to perform a sim...
- Steven Schveighoffer (10/18) Nov 23 2010 basically, I downloaded the svn source for dmd, then edited the makefile...
- Jonathan M Davis (7/31) Nov 23 2010 There are also definitely memory issues with compiling templates:
- Stanislav Blinov (10/16) Nov 23 2010 It'd be good if you also posted the machine specs. I ran the benchmark o...
- Emil Madsen (9/41) Nov 23 2010 I'm getting:
Time ago I have found a little C++ program that computes the number of solutions to the N Queens problem at compile-time using just templates. I have translated it into CTFE code and I have shown in this newsgroup that if your purpose is to compute a result at compile-time, then D CTFE allows you to write much simpler (and faster) code compared to the C++ template mataprogramming. This time I have translated that C++ code to D code that uses just templates. This is not idiomatic D code, because for this purpose CTFE is better, but templates are used in D too, so it may be a performance benchmark for templates in general. I am not very good with C++ templates yet, so if you spot an error in my D translation please tell me that I will redo the timings. Compilation time: G++ 0.96 seconds, dmd 12.4 seconds. With N=7 G++ uses about 34 MB RAM, DMD about 130+ MB RAM. I have used MinGW 4.5.1 and DMD 2.050. Compilation: g++ nqueens_cpp.cpp -o nqueens_cpp dmd nqueens_d.d The D code is nicer and shorter. GCC 4.5 has optimized the template compilation, and from the benchmark it seems faster: http://cpptruths.blogspot.com/2010/03/faster-meta-programs-using-gcc-45-and.html Is a similar optimization (using a hash) already present in DMD and is it useful? ----------------------------- // G++ 4.5.1 code #include "stdio.h" typedef unsigned long long uint64_t; template <uint64_t Cols = 0, uint64_t Diag135 = 0, uint64_t Diag45 = 0, uint64_t Solution = 0> struct state { static uint64_t const cols = Cols; static uint64_t const diag135 = Diag135; static uint64_t const diag45 = Diag45; static uint64_t const solution = Solution; }; template <int K, int J, typename State> struct test { static bool const result = ((State::cols & (1ull << J)) + (State::diag135 & (1ull << (J + K))) + (State::diag45 & (1ull << (32 + J - K)))) == 0; }; template <int K, int J, typename State> struct mark { typedef state < State::cols ^ (1ull << J), State::diag135 ^ (1ull << (J + K)), State::diag45 ^ (1ull << (32 + J - K)), State::solution > state_type; }; template <int StartValue, int Times, typename State> struct accumulate_result { typedef typename accumulate_result < StartValue + 1, Times - 1, state < State::cols, State::diag135, State::diag45, State::solution + test<0, StartValue, State>::result > >::state_type state_type; }; template <int StartValue, typename State> struct accumulate_result<StartValue, 0, State> { typedef State state_type; }; template <int Niv, int Dx, typename State> struct solve; template <bool Condition, typename State, int Current, int Niv, int Dx> struct result_from_test { typedef typename mark < Niv, Current, typename solve < Niv - 1, Dx, typename mark<Niv, Current, State>::state_type >::state_type >::state_type state_type; }; template <typename State, int Current, int Niv, int Dx> struct result_from_test<false, State, Current, Niv, Dx> { typedef State state_type; }; template <int Begin, int Times, typename State, int Niv, int Dx> struct process_queens { typedef typename process_queens < Begin + 1, Times - 1, typename result_from_test < test<Niv, Begin, State>::result, State, Begin, Niv, Dx >::state_type, Niv, Dx >::state_type state_type; }; template <int Begin, typename State, int Niv, int Dx> struct process_queens<Begin, 0, State, Niv, Dx> { typedef State state_type; }; template <int Niv, int Dx, typename State = state<> > struct solve { typedef typename process_queens<0, Dx, State, Niv, Dx>::state_type state_type; static uint64_t const result = state_type::solution; }; template <int Dx, typename State> struct solve<0, Dx, State> { typedef typename accumulate_result<0, Dx, State>::state_type state_type; static uint64_t const result = state_type::solution; }; template <int Dx> struct meta_queens: solve<Dx - 1, Dx> {}; int main() { printf("%llu", meta_queens<7>::result); } -------------------------------- // D2 code import core.stdc.stdio: printf; template State(ulong Cols=0, ulong Diag135=0, ulong Diag45=0, ulong Solution=0) { enum ulong cols = Cols; enum ulong diag135 = Diag135; enum ulong diag45 = Diag45; enum ulong solution = Solution; } template Test(int k, int j, alias state) { enum bool Test = ((state.cols & (1UL << j)) + (state.diag135 & (1UL << (j + k))) + (state.diag45 & (1UL << (32 + j - k)))) == 0; } template Mark(int k, int j, alias state) { alias State!(state.cols ^ (1UL << j), state.diag135 ^ (1UL << (j + k)), state.diag45 ^ (1UL << (32 + j - k)), state.solution ) Mark; } template AccumulateResult(int startValue, int times, alias state) { static if (times == 0) alias state AccumulateResult; else alias AccumulateResult!(startValue + 1, times - 1, State!(state.cols, state.diag135, state.diag45, state.solution + Test!(0, startValue, state)) ) AccumulateResult; } template ResultFromTest(bool condition, alias state, int current, int niv, int dx) { static if (condition == 0) alias state ResultFromTest; else alias Mark!(niv, current, Solve!(niv - 1, dx, Mark!(niv, current, state)) ) ResultFromTest; } template ProcessQueens(int begin, int times, alias state, int niv, int dx) { static if (times == 0) alias state ProcessQueens; else alias ProcessQueens!(begin + 1, times - 1, ResultFromTest!(Test!(niv, begin, state), state, begin, niv, dx), niv, dx ) ProcessQueens; } template Solve(int niv, int dx, alias state=State!()) { static if (niv == 0) alias AccumulateResult!(0, dx, state) Solve; else alias ProcessQueens!(0, dx, state, niv, dx) Solve; } template MetaNQueens(int dx) { enum ulong MetaNQueens = Solve!(dx - 1, dx).solution; } void main() { printf("%llu", MetaNQueens!(7)); } ----------------------------- Bye, bearophile
Nov 22 2010
On Mon, 22 Nov 2010 18:23:37 -0500, bearophile <bearophileHUGS lycos.com> wrote:Time ago I have found a little C++ program that computes the number of solutions to the N Queens problem at compile-time using just templates. I have translated it into CTFE code and I have shown in this newsgroup that if your purpose is to compute a result at compile-time, then D CTFE allows you to write much simpler (and faster) code compared to the C++ template mataprogramming. This time I have translated that C++ code to D code that uses just templates. This is not idiomatic D code, because for this purpose CTFE is better, but templates are used in D too, so it may be a performance benchmark for templates in general. I am not very good with C++ templates yet, so if you spot an error in my D translation please tell me that I will redo the timings. Compilation time: G++ 0.96 seconds, dmd 12.4 seconds.This might be due to a bug I reported how DMD is very slow when dealing with lots of templates. See http://d.puremagic.com/issues/show_bug.cgi?id=4900
Nov 23 2010
Steven Schveighoffer:This might be due to a bug I reported how DMD is very slow when dealing with lots of templates. See http://d.puremagic.com/issues/show_bug.cgi?id=4900Right, it may be the same cause. Are you able & willing to perform a similar profiling of DMD while it compiles that little template-nqueen D bench? (How do you profile the DMD compiler?) Bye, bearophile
Nov 23 2010
On Tue, 23 Nov 2010 13:27:22 -0500, bearophile <bearophileHUGS lycos.com> wrote:Steven Schveighoffer:basically, I downloaded the svn source for dmd, then edited the makefile to add the profile flags (I think they are commented out), and finally run the profiler. I've run into cases where I thought I did it right and I really didn't, so it's not an easy thing. I can't remember off the top of my head what they are, but something like -gp? Remember however that the default build of dmd from svn is debug mode, so if you profile that, you will likely get the wrong results. -SteveThis might be due to a bug I reported how DMD is very slow when dealing with lots of templates. See http://d.puremagic.com/issues/show_bug.cgi?id=4900Right, it may be the same cause. Are you able & willing to perform a similar profiling of DMD while it compiles that little template-nqueen D bench? (How do you profile the DMD compiler?)
Nov 23 2010
On Tuesday, November 23, 2010 08:43:20 Steven Schveighoffer wrote:On Mon, 22 Nov 2010 18:23:37 -0500, bearophile <bearophileHUGS lycos.com> wrote:There are also definitely memory issues with compiling templates: http://d.puremagic.com/issues/show_bug.cgi?id=4984 It's enough, for instance, that you can't use very many arguments to a single call to find(), or the compiler will run out of memory. I expect that it plays a definite part in the slow template compilation. - Jonathan M DavisTime ago I have found a little C++ program that computes the number of solutions to the N Queens problem at compile-time using just templates. I have translated it into CTFE code and I have shown in this newsgroup that if your purpose is to compute a result at compile-time, then D CTFE allows you to write much simpler (and faster) code compared to the C++ template mataprogramming. This time I have translated that C++ code to D code that uses just templates. This is not idiomatic D code, because for this purpose CTFE is better, but templates are used in D too, so it may be a performance benchmark for templates in general. I am not very good with C++ templates yet, so if you spot an error in my D translation please tell me that I will redo the timings. Compilation time: G++ 0.96 seconds, dmd 12.4 seconds.This might be due to a bug I reported how DMD is very slow when dealing with lots of templates. See http://d.puremagic.com/issues/show_bug.cgi?id=4900
Nov 23 2010
23.11.2010 2:23, bearophile wrote:Time ago I have found a little C++ program that computes the number of solutions to the N Queens problem at compile-time using just templates. I have translated it into CTFE code and I have shown in this newsgroup that if your purpose is to compute a result at compile-time, then D CTFE allows you to write much simpler (and faster) code compared to the C++ template mataprogramming. This time I have translated that C++ code to D code that uses just templates. This is not idiomatic D code, because for this purpose CTFE is better, but templates are used in D too, so it may be a performance benchmark for templates in general. I am not very good with C++ templates yet, so if you spot an error in my D translation please tell me that I will redo the timings. Compilation time: G++ 0.96 seconds, dmd 12.4 seconds. With N=7 G++ uses about 34 MB RAM, DMD about 130+ MB RAM. I have used MinGW 4.5.1 and DMD 2.050.It'd be good if you also posted the machine specs. I ran the benchmark on Intel i5 750 with 8Gb RAM under Windows7 Pro x64. I don't have G++ right now, but I tried cl from Visual Studio 2008 and dmc 8.42n instead Here are the timings (compiler-version-seconds): dmd 2.050 2.97958 cl 15.00.30729.01 3.65038 dmc 8.42n 11.6375 So at least on this system dmd doesn't look that bad at all.
Nov 23 2010
2010/11/23 Stanislav Blinov <blinov loniir.ru>23.11.2010 2:23, bearophile wrote:I'm getting: MinGW G++ (source code modified): 744 miliseconds MinGW G++ (usual version): 924 miliseconds Dmd: 4275 miliseconds (i7 720QM, 4GB ram) -- // Yours sincerely // Emil 'Skeen' MadsenTime ago I have found a little C++ program that computes the number of solutions to the N Queens problem at compile-time using just templates. I have translated it into CTFE code and I have shown in this newsgroup that if your purpose is to compute a result at compile-time, then D CTFE allows you to write much simpler (and faster) code compared to the C++ template mataprogramming. This time I have translated that C++ code to D code that uses just templates. This is not idiomatic D code, because for this purpose CTFE is better, but templates are used in D too, so it may be a performance benchmark for templates in general. I am not very good with C++ templates yet, so if you spot an error in my D translation please tell me that I will redo the timings. Compilation time: G++ 0.96 seconds, dmd 12.4 seconds. With N=7 G++ uses about 34 MB RAM, DMD about 130+ MB RAM. I have used MinGW 4.5.1 and DMD 2.050.It'd be good if you also posted the machine specs. I ran the benchmark on Intel i5 750 with 8Gb RAM under Windows7 Pro x64. I don't have G++ right now, but I tried cl from Visual Studio 2008 and dmc 8.42n instead Here are the timings (compiler-version-seconds): dmd 2.050 2.97958 cl 15.00.30729.01 3.65038 dmc 8.42n 11.6375 So at least on this system dmd doesn't look that bad at all.
Nov 23 2010