digitalmars.D - Fastest Prime Sieve, in D
- Jabari Zakiya (51/51) Jun 28 2019 This is a continuation of a previous thread on the Segmented
- Jabari Zakiya (15/78) Jun 29 2019 Timing comparisons on my system. Nim code compiled with GCC 8.3.1
- Exil (6/30) Jun 29 2019 Could be the difference between NIM's GC and D's, seeing as you
This is a continuation of a previous thread on the Segmented Sieve of Zakiya (SSoZ): https://forum.dlang.org/post/ozwapetxgcxkbzjjslqe forum.dlang.org to present the D version of the current state-of-art of the work. The D version source code also performs the Fastest Primes Sieve for Twin Primes. https://gist.github.com/jzakiya/ae93bfa03dbc8b25ccc7f97ff8ad0f61 It's a direct translation of the (original) Nim (www.nim-lang.org) version. https://gist.github.com/jzakiya/6c7e1868bd749a6b1add62e3e3b2341e For those interested, here are two places to get my current paper on it. I need to update it (again) to incorporate this new algorithm, and results. https://www.academia.edu/37952623/The_Use_of_Prime_Generators_to_Implement_Fast_Twin_Primes_Sieve_of_Zakiya_SoZ_Applications_to_Number_Theory_and_Implications_for_the_Riemann_Hypotheses https://www.scribd.com/document/395415391/The-Use-of-Prime-Generators-to-Implement-Fast-Twin-Primes-Sieve-Of-Zakiya-SoZ-Applications-to-Number-Theory-and-Implications-for-the-Riemann-Hypoth I compiled the D source using the latest LDC 1.16.0. When comparing times to Nim it's comparable but becomes slower as inputs sizes increase. Here are some times for the Nim version compared to `primesieve` -- https://github.com/kimwalisch/primesieve ``` ➜ ~ inxi -SCM System: Host: localhost.localdomain Kernel: 5.1.11-pclos1 x86_64 bits: 64 Desktop: KDE Plasma 5.16.1 Distro: PCLinuxOS 2019 Machine: Type: Laptop System: System76 product: Gazelle v: gaze10 serial: <root required> Mobo: System76 model: Gazelle v: gaze10 serial: <root required> UEFI [Legacy]: American Megatrends v: 1.05.08 date: 03/31/2016 CPU: Topology: Quad Core model: Intel Core i7-6700HQ bits: 64 type: MT MCP L2 cache: 6144 KiB Speed: 1185 MHz min/max: 800/3500 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800 ``` ``` N | twinprimes_ssoz | primesieve | Ratio | -------------------|-----------------|---------------|-------| 100_000_000_000 | 4.628 | 5.837 | 1.26 | 500_000_000_000 | 23.308 | 32.765 | 1.41 | 1_000_000_000_000 | 47.966 | 69.705 | 1.45 | 5_000_000_000_000 | 265.746 | 388.233 | 1.46 | 10_000_000_000_000 | 572.803 | 821.755 | 1.43 | 50_000_000_000_000 | 3004.487 | 4620.525 | 1.54 | 100_000_000_000_000 | 6307.521 | 9724.568 | 1.54 | 200_000_000_000_000 | 13603.531 | 20279.547 | 1.49 | ``` I'm hoping people can help improve it and make it faster than it currently is. Thanks. Jabari
Jun 28 2019
On Friday, 28 June 2019 at 19:32:41 UTC, Jabari Zakiya wrote:This is a continuation of a previous thread on the Segmented Sieve of Zakiya (SSoZ): https://forum.dlang.org/post/ozwapetxgcxkbzjjslqe forum.dlang.org to present the D version of the current state-of-art of the work. The D version source code also performs the Fastest Primes Sieve for Twin Primes. https://gist.github.com/jzakiya/ae93bfa03dbc8b25ccc7f97ff8ad0f61 It's a direct translation of the (original) Nim (www.nim-lang.org) version. https://gist.github.com/jzakiya/6c7e1868bd749a6b1add62e3e3b2341e For those interested, here are two places to get my current paper on it. I need to update it (again) to incorporate this new algorithm, and results. https://www.academia.edu/37952623/The_Use_of_Prime_Generators_to_Implement_Fast_Twin_Primes_Sieve_of_Zakiya_SoZ_Applications_to_Number_Theory_and_Implications_for_the_Riemann_Hypotheses https://www.scribd.com/document/395415391/The-Use-of-Prime-Generators-to-Implement-Fast-Twin-Primes-Sieve-Of-Zakiya-SoZ-Applications-to-Number-Theory-and-Implications-for-the-Riemann-Hypoth I compiled the D source using the latest LDC 1.16.0. When comparing times to Nim it's comparable but becomes slower as inputs sizes increase. Here are some times for the Nim version compared to `primesieve` -- https://github.com/kimwalisch/primesieve ``` ➜ ~ inxi -SCM System: Host: localhost.localdomain Kernel: 5.1.11-pclos1 x86_64 bits: 64 Desktop: KDE Plasma 5.16.1 Distro: PCLinuxOS 2019 Machine: Type: Laptop System: System76 product: Gazelle v: gaze10 serial: <root required> Mobo: System76 model: Gazelle v: gaze10 serial: <root required> UEFI [Legacy]: American Megatrends v: 1.05.08 date: 03/31/2016 CPU: Topology: Quad Core model: Intel Core i7-6700HQ bits: 64 type: MT MCP L2 cache: 6144 KiB Speed: 1185 MHz min/max: 800/3500 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800 ``` ``` N | twinprimes_ssoz | primesieve | Ratio | -------------------|-----------------|---------------|-------| 100_000_000_000 | 4.628 | 5.837 | 1.26 | 500_000_000_000 | 23.308 | 32.765 | 1.41 | 1_000_000_000_000 | 47.966 | 69.705 | 1.45 | 5_000_000_000_000 | 265.746 | 388.233 | 1.46 | 10_000_000_000_000 | 572.803 | 821.755 | 1.43 | 50_000_000_000_000 | 3004.487 | 4620.525 | 1.54 | 100_000_000_000_000 | 6307.521 | 9724.568 | 1.54 | 200_000_000_000_000 | 13603.531 | 20279.547 | 1.49 | ``` I'm hoping people can help improve it and make it faster than it currently is. Thanks. JabariTiming comparisons on my system. Nim code compiled with GCC 8.3.1 (system has LLVM 8.0.0). ``` N | Nim | D | Ratio | -------------------|-----------------|---------------|-------| 100_000_000_000 | 4.628 | 4.619 | 1.00 | 500_000_000_000 | 23.308 | 24.071 | 1.03 | 1_000_000_000_000 | 47.966 | 49.703 | 1.04 | 5_000_000_000_000 | 265.746 | 276.847 | 1.04 | 10_000_000_000_000 | 572.803 | 590.881 | 1.03 | 50_000_000_000_000 | 3009.485 | 3046.913 | 1.01 | 100_000_000_000_000 | 6307.521 | 6594.283 | 1.04 | 200_000_000_000_000 | 13603.531 | 14706.743 | 1.08 | ```
Jun 29 2019
On Saturday, 29 June 2019 at 12:28:35 UTC, Jabari Zakiya wrote:Timing comparisons on my system. Nim code compiled with GCC 8.3.1 (system has LLVM 8.0.0). ``` N | Nim | D | Ratio | -------------------|-----------------|---------------|-------| 100_000_000_000 | 4.628 | 4.619 | 1.00 | 500_000_000_000 | 23.308 | 24.071 | 1.03 | 1_000_000_000_000 | 47.966 | 49.703 | 1.04 | 5_000_000_000_000 | 265.746 | 276.847 | 1.04 | 10_000_000_000_000 | 572.803 | 590.881 | 1.03 | 50_000_000_000_000 | 3009.485 | 3046.913 | 1.01 | 100_000_000_000_000 | 6307.521 | 6594.283 | 1.04 | 200_000_000_000_000 | 13603.531 | 14706.743 | 1.08 | ```Could be the difference between NIM's GC and D's, seeing as you are continually allocating new arrays. Could just be that the GCC backend for NIM generates better assembly than LLVM. Could profile the two and look at the assembly to see where the additional time is being spent.
Jun 29 2019