digitalmars.D.learn - Why is this code slow?
- Csaba (24/24) Mar 24 I know that benchmarks are always controversial and depend on a
- matheus (11/18) Mar 24 I think a few things can be going on, but one way to go is trying
- Sergey (20/22) Mar 24 Not really..
- kdevel (13/30) Mar 24 Usually you do not translate mathematical expressions directly
- rkompass (16/24) Mar 24 I used the loop:
- Sergey (8/10) Mar 24 1) If possible you can use "betterC" - to disable runtime
- rkompass (32/42) Mar 25 Thank you. I succeeded with `gdc -Wall -O2 -frelease
- Salih Dincer (42/62) Mar 26 It's obvious that you are a good mathematician. You used sequence
- Salih Dincer (74/90) Mar 24 I also used this code:
- Csaba (3/16) Mar 26 I know that the code can be simplified/optimized, I just wanted
- Lance Bachmeier (19/44) Mar 26 As others suggested, pow is the problem. I noticed that the C
- Lance Bachmeier (16/70) Mar 26 And then the other thing is changing
- rkompass (45/45) Mar 27 I apologize for digressing a little bit further - just to share
- Salih Dincer (19/21) Mar 27 Good thing you're digressing; I am 45 years old and I still
- rkompass (41/45) Mar 28 So we go with another digression. I discovered parallel, also
- Salih Dincer (40/44) Mar 28 You can achieve parallelism in C using libraries such as OpenMP,
- rkompass (14/38) Mar 28 Nice, thank you.
- Sergey (15/16) Mar 28 It's hard to compare actually.
- Salih Dincer (5/9) Mar 28 There is no such thing as parallel programming in D anyway. At
- Serg Gini (3/6) Mar 28 I think it just works :)
- Salih Dincer (54/61) Mar 28 A year has passed and I have tried almost everything! Either it
I know that benchmarks are always controversial and depend on a lot of factors. So far, I have read that D performs very well in benchmarks, as well as, if not better than, C.

I wrote a little program that approximates PI using the Leibniz formula. I implemented the same thing in C, D and Python; all of them execute 1,000,000 iterations 20 times and display the average time elapsed. Here are the results:

C: 0.04s
Python: 0.33s
D: 0.73s

What the hell? D slower than Python? This cannot be real. I am sure I am making a mistake here. I'm sharing all 3 programs here:

C: https://pastebin.com/s7e2HFyL
D: https://pastebin.com/fuURdupc
Python: https://pastebin.com/zcXAkSEf

As you can see, the function that does the job is exactly the same in C and D. Here are the compile/run commands used:

C: `gcc leibniz.c -lm -oleibc`
D: `gdc leibniz.d -frelease -oleibd`
Python: `python3 leibniz.py`

PS. My CPU is an AMD A8-5500B and my OS is Ubuntu Linux, if that matters.
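Since pastebin links can rot: the D version presumably boils down to something like the following minimal sketch, reconstructed from the snippets quoted later in the thread (the constants, names and exact timing loop are assumptions, not the actual pastebin):

```d
// Reconstructed sketch of the benchmarked D program. Details such as the
// timing loop and constant names are assumptions based on later quotes.
import std.stdio : writefln;
import std.math : pow;
import std.datetime.stopwatch : AutoStart, StopWatch;

const int ITERATIONS = 1_000_000;
const int BENCHMARKS = 20;

double leibniz(int iter) {
    double n = 1.0;
    for (int i = 2; i <= iter; i++)
        n += pow(-1.0, i - 1.0) / (i * 2.0 - 1.0); // the line discussed below
    return n * 4.0;
}

void main() {
    double result;
    long total_time = 0;
    for (int i = 0; i < BENCHMARKS; i++) {
        auto sw = StopWatch(AutoStart.yes);
        result = leibniz(ITERATIONS);
        sw.stop();
        total_time += sw.peek.total!"nsecs";
    }
    writefln("%.16f", result);
    writefln("Avg execution time: %f", total_time / BENCHMARKS / 1e9);
}
```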
Mar 24
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
> ... Here are the results: C: 0.04s Python: 0.33s D: 0.73s ...

I think a few things can be going on, but one way to go is to try optimization flags like "-O2" and run it again.

Anyway, looking through the generated assembly:

C: https://godbolt.org/z/45Kn1W93b
D: https://godbolt.org/z/Ghr3fqaTW

The Leibniz functions are very close to each other, except for one thing: the "pow" function on the D side. It's a template, so maybe you should start from there; in fact, I'd try the pow from C to see what happens.

Matheus.
Mar 24
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
> As you can see the function that does the job is exactly the same in C and D.

Not really..

The speed of the Leibniz algo is mostly the same. You can check the code in this benchmark for example: https://github.com/niklas-heer/speed-comparison

What you could fix in your code:

* you can use enum for BENCHMARKS and ITERATIONS
* use pow from core.stdc.math
* use sw.reset() in the loop

So the main part could look like this:

```d
auto sw = StopWatch(AutoStart.no);
sw.start();
foreach (i; 0..BENCHMARKS) {
    result += leibniz(ITERATIONS);
    total_time += sw.peek.total!"nsecs";
    sw.reset();
}
sw.stop();
```
Mar 24
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
> [...] What the hell? D slower than Python? This cannot be real. I am sure I am making a mistake here. [...]

Usually you do not translate mathematical expressions directly into code:

```d
n += pow(-1.0, i - 1.0) / (i * 2.0 - 1.0);
```

The term containing the `pow` invocation computes the alternating sequence -1, 1, -1, ..., which can be replaced by e.g.

```d
immutable int [2] sign = [-1, 1];
n += sign [i & 1] / (i * 2.0 - 1.0);
```

This saves the expensive call to the pow function.
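Dropped into a complete program, the sign-table variant looks like this (a sketch for readers following along; the loop bounds and names are assumptions carried over from the reconstruction above):

```d
// Self-contained sketch of the sign-table variant described above.
import std.stdio : writefln;

double leibniz(int iter) {
    immutable int[2] sign = [-1, 1];
    double n = 1.0;
    for (int i = 2; i <= iter; i++)
        n += sign[i & 1] / (i * 2.0 - 1.0); // table lookup instead of pow
    return n * 4.0;
}

void main() {
    writefln("%.16f", leibniz(1_000_000));
}
```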
Mar 24
On Sunday, 24 March 2024 at 21:21:13 UTC, kdevel wrote:
> The term containing the `pow` invocation computes the alternating sequence -1, 1, -1, ..., which can be replaced [...] This saves the expensive call to the pow function.

I used the loop:

```d
for (int i = 1; i < iter; i++)
    n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
```

in both C and D, with gcc and gdc, and got average execution times:

|   | original | loop replacement | -O2      |
|---|----------|------------------|----------|
| C | 0.009989 | 0.003198         | 0.001335 |
| D | 0.230346 | 0.003083         | 0.001309 |

Almost no difference. But the D binary is much larger on my Linux: 4600920 bytes instead of 15504 bytes for the C version. Are there some simple switches / settings to get a smaller binary?
Mar 24
On Sunday, 24 March 2024 at 22:16:06 UTC, rkompass wrote:
> Are there some simple switches / settings to get a smaller binary?

1) If possible you can use "betterC" - to disable the runtime
2) otherwise

```bash
--release --O3 --flto=full -fvisibility=hidden -defaultlib=phobos2-ldc-lto,druntime-ldc-lto -L=-dead_strip -L=-x -L=-S -L=-lz
```
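For illustration, a minimal sketch of option 1 (the compile command is an assumption; with betterC there is no druntime, so the entry point is extern(C) and I/O goes through C's stdio):

```d
// Sketch of a betterC-compatible variant. Build with e.g.:
//   ldc2 -betterC -O3 leibniz_bc.d   (command is an assumption)
import core.stdc.stdio : printf;

extern (C) int main() {
    double n = 1.0;
    for (int i = 2; i <= 1_000_000; i++)
        n += ((i % 2) ? 1.0 : -1.0) / (i * 2.0 - 1.0); // no pow, no GC
    printf("%.16f\n", n * 4.0);
    return 0;
}
```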
Mar 24
On Sunday, 24 March 2024 at 23:02:19 UTC, Sergey wrote:
> On Sunday, 24 March 2024 at 22:16:06 UTC, rkompass wrote:
>> Are there some simple switches / settings to get a smaller binary?
> 1) If possible you can use "betterC" - to disable the runtime
> 2) otherwise [...]

Thank you. I succeeded with `gdc -Wall -O2 -frelease -shared-libphobos`.

A little remark: the approximation to pi converges slowly, but it oscillates up and down much more than its average. So taking the average of 2 successive steps gives many more precise digits. We can simulate this by doing a last step with half the size:

```d
double leibniz(int it) {
    double n = 1.0;
    for (int i = 1; i < it; i++)
        n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
    n += 0.5*((it%2) ? -1.0 : 1.0) / (it * 2.0 + 1.0);
    return n * 4.0;
}
```

Of course you may also combine the up(+) and down(-) steps into one: 1/i - 1/(i+2) = 2/(i*(i+2))

```d
double leibniz(int iter) {
    double n = 0.0;
    for (int i = 1; i < iter; i+=4)
        n += 2.0 / (i * (i+2.0));
    return n * 4.0;
}
```

or even combine both approaches. But of course, mathematically much more is possible. This was not about approximating pi as fast as possible... The above first approach still works at the original speed, it only makes the result a little bit nicer.
Mar 25
On Monday, 25 March 2024 at 14:02:08 UTC, rkompass wrote:
> Of course you may also combine the up(+) and down(-) steps into one: 1/i - 1/(i+2) = 2/(i*(i+2)) [...] The above first approach still works at the original speed, it only makes the result a little bit nicer.

It's obvious that you are a good mathematician. You used sequence A005563. First of all, I must apologize to the questioner for digressing from the topic. But I saw that there is a calculation difference between real and double. My goal was to see if there would be a change in speed. For example, with 250 million cycles (iter/4) I got the following result:

> 3.14159265158976691 (250 million with real)
> 3.14159264457621568 (250 million with double)
> 3.14159265358979324 (std.math.constants.PI)

First of all, my question is: why do we see this calculation error with double? Could the changes I made to the algorithm have caused this? Here's an executable code snippet:

```d
enum step = 4;
enum loop = 250_000_000;

auto leibniz(T)(int iter) {
    T n = 2/3.0;
    for(int i = 5; i < iter; i += step) {
        T a = (2.0 + i) * i; // https://oeis.org/A005563
        n += 2/a;
    }
    return n * step;
}

import std.stdio : writefln;

void main() {
    enum iter = loop * step - 10;

    65358979323.writefln!"Compare.%s";
    iter.leibniz!double.writefln!"%.17f (double)";
    iter.leibniz!real.writefln!"%.17f (real)";
    imported!"std.math".PI.writefln!"%.17f (enum)";
}
/* Prints:
Compare.65358979323
3.14159264457621568 (double)
3.14159265158976689 (real)
3.14159265358979324 (enum)
*/
```

In fact, there are algorithms that calculate accurately up to 12 decimal places with far fewer cycles (e.g. 9999).

SDB@79
Mar 26
On Sunday, 24 March 2024 at 22:16:06 UTC, kdevel wrote:
> The term containing the `pow` invocation computes the alternating sequence -1, 1, -1, ..., which can be replaced [...] This saves the expensive call to the pow function.

I also used this code:

```d
import std.stdio : writefln;
import std.datetime.stopwatch;

enum ITERATIONS = 1_000_000;
enum BENCHMARKS = 20;

auto leibniz(bool speed = true)(int iter) {
    double n = 1.0;

    static if(speed) const sign = [-1, 1];
    for(int i = 2; i < iter; i++) {
        static if(speed) {
            const m = i << 1;
            n += sign[i & 1] / (m - 1.0);
        } else {
            n += pow(-1, i - 1) / (i * 2.0 - 1.0);
        }
    }
    return n * 4.0;
}

auto pow(F, G)(F x, G n) @nogc @trusted pure nothrow {
    import std.traits : Unsigned, Unqual;
    real p = 1.0, v = void;
    Unsigned!(Unqual!G) m = n;

    if(n < 0) {
        if(n == -1) return 1 / x;
        m = cast(typeof(m))(0 - n);
        v = p / x;
    } else {
        switch(n) {
            case 0: return 1.0;
            case 1: return x;
            case 2: return x * x;
            default:
        }
        v = x;
    }

    while(true) {
        if(m & 1) p *= v;
        m >>= 1;
        if(!m) break;
        v *= v;
    }
    return p;
}

void main() {
    double result;
    long total_time = 0;

    for(int i = 0; i < BENCHMARKS; i++) {
        auto sw = StopWatch(AutoStart.no);
        sw.start();
        result = ITERATIONS.leibniz; //!false;
        sw.stop();
        total_time += sw.peek.total!"nsecs";
    }
    result.writefln!"%0.21f";
    writefln("Avg execution time: %f\n", total_time / BENCHMARKS / 1e9);
}
```

and results:

> dmd -run "leibnizTest.d"
> 3.141594653593692054727
> Avg execution time: 0.002005

If I compile with leibniz!false(ITERATIONS), the average execution time increases considerably:

> Avg execution time: 0.044435

Note, however, that no external library is involved here: a power function that works with integers is used instead. Normally the following library function would be called:

> Unqual!(Largest!(F, G)) pow(F, G)(F x, G y) @nogc @trusted pure nothrow if (isFloatingPoint!(F) && isFloatingPoint!(G)) ...

Now, the person asking the question will ask why it is slow even though we use exactly the same code in C; rightly so. You may think that the more watermelons you carry in your arms, the slower you naturally become. I think the important thing is not to drop the watermelons :)

SDB@79
Mar 24
On Sunday, 24 March 2024 at 21:21:13 UTC, kdevel wrote:
> Usually you do not translate mathematical expressions directly into code: [...] This saves the expensive call to the pow function.

I know that the code can be simplified/optimized, I just wanted to compare the same expression in C and D.
Mar 26
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
> [...] What the hell? D slower than Python? This cannot be real. I am sure I am making a mistake here. [...]

As others suggested, pow is the problem. I noticed that the C versions are often much faster than their D counterparts. (And I don't view that as a problem, since both are built into the language - my only thought is that the D version should call the C version.) Changing

```d
import std.math : pow;
```

to

```d
import core.stdc.math : pow;
```

and leaving everything else unchanged, I get:

C: Avg execution time: 0.007918
D (original): Avg execution time: 0.102612
D (using core.stdc.math): Avg execution time: 0.008134

So more or less the exact same numbers if you use core.stdc.math.
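In context, the only change is which pow gets imported; a sketch (the loop mirrors the reconstruction near the top of the thread, the surrounding benchmark code stays unchanged):

```d
// Same function as before, but calling C's pow from the C runtime.
import core.stdc.math : pow; // double pow(double, double)

double leibniz(int iter) {
    double n = 1.0;
    for (int i = 2; i <= iter; i++)
        n += pow(-1.0, i - 1.0) / (i * 2.0 - 1.0);
    return n * 4.0;
}
```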
Mar 26
On Tuesday, 26 March 2024 at 14:25:53 UTC, Lance Bachmeier wrote:
> As others suggested, pow is the problem. [...] So more or less the exact same numbers if you use core.stdc.math.

And then the other thing is changing

```d
const int BENCHMARKS = 20;
```

to

```d
enum BENCHMARKS = 20;
```

which should allow substitution of the constant directly into the rest of the program. That gives

> Avg execution time: 0.007564

On my Ubuntu 22.04 machine, therefore, the LDC binary with no flags is slightly faster than the C code compiled with your flags.
Mar 26
I apologize for digressing a little bit further - just to share insights with other learners.

I had the question why my binary was so big (> 4M), and discovered the `gdc -Wall -O2 -frelease -shared-libphobos` options (now > 200K).

Then I tried to avoid the GC, having just learnt about this: the GC in the Leibniz code is there only for the writeln. With a change to (again standard C) printf, the `@nogc` modifier can be applied, and the binary then gets down to ~17K, a size comparable to the C counterpart.

Another observation regarding precision: the iteration proceeds in the wrong order. Adding the small contributions first and the bigger ones last leads to less loss when summing up the small parts below the final real/double LSB limit.

So I'm now at this code (abolishing the average of 20 iterations as unnecessary):

```d
// import std.stdio;  // writeln would pull in the garbage collector
import core.stdc.stdio : printf;
import std.datetime.stopwatch;

const int ITERATIONS = 1_000_000_000;

@nogc pure double leibniz(int it) { // sum up the small values first
    double n = 0.5*((it%2) ? -1.0 : 1.0) / (it * 2.0 + 1.0);
    for (int i = it-1; i >= 0; i--)
        n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
    return n * 4.0;
}

@nogc void main() {
    double result;
    double total_time = 0;

    auto sw = StopWatch(AutoStart.yes);
    result = leibniz(ITERATIONS);
    sw.stop();
    total_time = sw.peek.total!"nsecs";

    printf("%.16f\n", result);
    printf("Execution time: %f\n", total_time / 1e9);
}
```

result:

```
3.1415926535897931
Execution time: 1.068111
```
Mar 27
On Wednesday, 27 March 2024 at 08:22:42 UTC, rkompass wrote:
> I apologize for digressing a little bit further - just to share insights with other learners.

Good thing you're digressing; I am 45 years old and I still cannot say that I am finished as a student! For me this is version 4, and it looks like we don't need a 3rd variable other than the function parameter and return value:

```d
auto leibniz_v4(int i) @nogc pure {
    double n = 0.5*((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
    while(--i >= 0)
        n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
    return n * 4.0;
}
/*
3.1415926535892931
3.141592653589 793238462643383279502884197169399375105
3.141593653590774200000 (v1)

Avg execution time: 0.000033
*/
```

SDB@79
Mar 27
On Thursday, 28 March 2024 at 01:09:34 UTC, Salih Dincer wrote:
> Good thing you're digressing; I am 45 years old and I still cannot say that I am finished as a student! For me this is version 4, and it looks like we don't need a 3rd variable other than the function parameter and return value: [...]

So we go with another digression. I discovered parallel, and also avoided the extra variable, as suggested by Salih:

```d
import std.range;
import std.parallelism;
import core.stdc.stdio : printf;
import std.datetime.stopwatch;

enum ITERS = 1_000_000_000;
enum STEPS = 31; // 5 is fine, even numbers (e.g. 10) may give bad precision (for math reason ???)

pure double leibniz(int i) { // sum up the small values first
    double r = (i == ITERS) ? 0.5 * ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0) : 0.0;
    for (--i; i >= 0; i -= STEPS)
        r += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
    return r * 4.0;
}

void main() {
    auto start = iota(ITERS, ITERS-STEPS, -1).array;
    auto sw = StopWatch(AutoStart.yes);

    double result = 0.0;
    foreach(s; start.parallel)
        result += leibniz(s);

    double total_time = sw.peek.total!"nsecs";
    printf("%.16f\n", result);
    printf("Execution time: %f\n", total_time / 1e9);
}
```

gives:

```
3.1415926535897931
Execution time: 0.211667
```

My laptop has 6 cores and obviously 5 are used in parallel by this. The original question related to a comparison between C, D and Python. Turning back to this: are there similarly simple libraries for C that allow for parallel computation?
Mar 28
On Thursday, 28 March 2024 at 11:50:38 UTC, rkompass wrote:
> Turning back to this: are there similarly simple libraries for C that allow for parallel computation?

You can achieve parallelism in C using libraries such as OpenMP, which provides a set of compiler directives and runtime library routines for parallel programming. Here's an example of how you might modify the code to use OpenMP for parallel processing:

```c
#include <stdio.h>
#include <time.h>
#include <omp.h>

#define ITERS 1000000000
#define STEPS 31

double leibniz(int i) {
    double r = (i == ITERS) ? 0.5 * ((i % 2) ? -1.0 : 1.0) / (i * 2.0 + 1.0) : 0.0;
    for (--i; i >= 0; i -= STEPS)
        r += ((i % 2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
    return r * 4.0;
}

int main() {
    double start_time = omp_get_wtime();
    double result = 0.0;

    #pragma omp parallel for reduction(+:result)
    for (int s = ITERS; s >= 0; s -= STEPS) {
        result += leibniz(s);
    }

    // Calculate the time taken
    double time_taken = omp_get_wtime() - start_time;

    printf("%.16f\n", result);
    printf("%f (seconds)\n", time_taken);
    return 0;
}
```

To compile this code with OpenMP support, you would use a command like `gcc -fopenmp your_program.c`. This tells the GCC compiler to enable OpenMP directives. The `#pragma omp parallel for` directive tells the compiler to parallelize the loop, and the `reduction` clause is used to safely accumulate the result variable across multiple threads.

SDB@79
Mar 28
On Thursday, 28 March 2024 at 14:07:43 UTC, Salih Dincer wrote:
> You can achieve parallelism in C using libraries such as OpenMP, which provides a set of compiler directives and runtime library routines for parallel programming. [...]

```c
. . .
#pragma omp parallel for reduction(+:result)
for (int s = ITERS; s >= 0; s -= STEPS) {
    result += leibniz(s);
}
. . .
```

> To compile this code with OpenMP support, you would use a command like gcc -fopenmp your_program.c. [...]

Nice, thank you.

It ran endlessly until I saw that I had to correct the `for` to `for (int s = ITERS; s > ITERS-STEPS; s--)`.

Now the result is:

```
3.1415926535897936
Execution time: 0.212483 (seconds).
```

This result is sooo similar! I didn't know that OpenMP programming could be that easy. The binary size is 16K, the same order of magnitude, although somewhat less. The D advantage is gone here, I would say.
Mar 28
On Thursday, 28 March 2024 at 20:18:10 UTC, rkompass wrote:
> The D advantage is gone here, I would say.

It's hard to compare, actually. std.parallelism has somewhat different mechanics, and I think it is easier to use. The syntax is nicer.

OpenMP is a well-known and highly adopted tool, which is also quite flexible, but it is usually applied to initially sequential code. And the syntax is not very intuitive.

Interesting point from Dr Russel here: https://forum.dlang.org/thread/qvksmhwkaxbrnggsvtxe@forum.dlang.org

However, since 2012 OpenMP has also seen development and improvement, and the HPC world is pretty conservative, so it is still one of the most popular tools in the area: https://www.openmp.org/wp-content/uploads/sc23-openmp-popularity-mattson.pdf

Together with MPI.. But probably with the AI and GPU revolution the balance will shift a bit towards CUDA-like technologies.
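For a taste of that syntax, here is a sketch of the same series summed with taskPool.reduce, modeled on the pi example in the std.parallelism documentation (untimed, just to contrast with the OpenMP pragma; the term function is ad hoc):

```d
// Parallel sum of the Leibniz series via taskPool.reduce.
import std.algorithm.iteration : map;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writefln;

void main() {
    enum n = 1_000_000_000;
    // Term i of the series; the pool splits the index range across cores.
    static double term(int i) { return ((i % 2) ? -1.0 : 1.0) / (i * 2.0 + 1.0); }
    immutable pi = 4.0 * taskPool.reduce!"a + b"(map!term(iota(n)));
    writefln("%.16f", pi);
}
```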
Mar 28
On Thursday, 28 March 2024 at 20:18:10 UTC, rkompass wrote:
> I didn't know that OpenMP programming could be that easy. The binary size is 16K, the same order of magnitude, although somewhat less. The D advantage is gone here, I would say.

There is no such thing as parallel programming in D anyway. At least it has the modules, but I haven't seen them work. Whenever I use the toys built into foreach() it always ends in disappointment :)

SDB@79
Mar 28
On Thursday, 28 March 2024 at 23:15:26 UTC, Salih Dincer wrote:
> There is no such thing as parallel programming in D anyway. At least it has the modules, but I haven't seen them work. Whenever I use the toys built into foreach() it always ends in disappointment

I think it just works :)
Which issues did you have with it?
Mar 28
On Friday, 29 March 2024 at 00:04:14 UTC, Serg Gini wrote:
> I think it just works :)
> Which issues did you have with it?

A year has passed and I have tried almost everything! Either it went into an infinite loop or nothing changed in the speed. At least things are not as simple as OpenMP on the D side! First I tried this code snippet: a futile attempt!

```d
struct RowlandSequence {
    import std.numeric : gcd;
    import std.format : format;
    import std.conv : text;

    long b, r, a = 3;
    enum empty = false;

    string[] front() {
        string result = format("%s, %s", b, r);
        return [text(a), result];
    }

    void popFront() {
        long result = 1;
        while(result == 1) {
            result = gcd(r++, b);
            b += result;
        }
        a = result;
    }
}

enum BP {
    f = 1, b = 7, r = 2, a = 1,
    /* f = 109, b = 186837516, r = 62279173, //*/
    s = 5
}

void main() {
    RowlandSequence rs;
    long start, skip;

    with(BP) {
        rs = RowlandSequence(b, r);
        start = f;
        skip = s;
    }
    rs.popFront();

    import std.stdio, std.parallelism;
    import std.range : take;

    auto rsFirst128 = rs.take(128);
    foreach(r; rsFirst128.parallel) {
        if(r[0].length > skip) {
            start.writeln(": ", r);
        }
        start++;
    }
}
```

SDB@79
Mar 28