digitalmars.D - Fastest Prime Sieve, in D

Jabari Zakiya (51/51) Jun 28 2019 This is a continuation of a previous thread on the Segmented

Jabari Zakiya (15/78) Jun 29 2019 Timing comparisons on my system. Nim code compiled with GCC 8.3.1

Exil (6/30) Jun 29 2019 Could be the difference between NIM's GC and D's, seeing as you

Jabari Zakiya <jzakiya gmail.com> writes:

This is a continuation of a previous thread on the Segmented 
Sieve of Zakiya (SSoZ):

https://forum.dlang.org/post/ozwapetxgcxkbzjjslqe forum.dlang.org

to present the D version of the current state-of-art of the work.

The D version source code also performs the Fastest Primes Sieve 
for Twin Primes.

https://gist.github.com/jzakiya/ae93bfa03dbc8b25ccc7f97ff8ad0f61

It's a direct translation of the (original) Nim 
(www.nim-lang.org) version.

https://gist.github.com/jzakiya/6c7e1868bd749a6b1add62e3e3b2341e

For those interested, here are two places to get my current paper 
on it.
I need to update it (again) to incorporate this new algorithm, 
and results.

https://www.academia.edu/37952623/The_Use_of_Prime_Generators_to_Implement_Fast_Twin_Primes_Sieve_of_Zakiya_SoZ_Applications_to_Number_Theory_and_Implications_for_the_Riemann_Hypotheses

https://www.scribd.com/document/395415391/The-Use-of-Prime-Generators-to-Implement-Fast-Twin-Primes-Sieve-Of-Zakiya-SoZ-Applications-to-Number-Theory-and-Implications-for-the-Riemann-Hypoth

I compiled the D source using the latest LDC 1.16.0. When 
comparing times to Nim it's comparable but becomes slower as 
inputs sizes increase. Here are some times for the Nim version 
compared to `primesieve` -- 
https://github.com/kimwalisch/primesieve

```
➜  ~ inxi -SCM
System:    Host: localhost.localdomain Kernel: 5.1.11-pclos1 
x86_64 bits: 64 Desktop: KDE Plasma 5.16.1 Distro: PCLinuxOS 2019
Machine:   Type: Laptop System: System76 product: Gazelle v: 
gaze10 serial: <root required>
            Mobo: System76 model: Gazelle v: gaze10 serial: <root 
required> UEFI [Legacy]: American Megatrends v: 1.05.08
            date: 03/31/2016
CPU:       Topology: Quad Core model: Intel Core i7-6700HQ bits: 
64 type: MT MCP L2 cache: 6144 KiB
            Speed: 1185 MHz min/max: 800/3500 MHz Core speeds 
(MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800
```

```
             N          | twinprimes_ssoz |   primesieve  | Ratio |
     -------------------|-----------------|---------------|-------|
        100_000_000_000 |       4.628     |      5.837    |  1.26 |
        500_000_000_000 |      23.308     |     32.765    |  1.41 |
      1_000_000_000_000 |      47.966     |     69.705    |  1.45 |
      5_000_000_000_000 |     265.746     |    388.233    |  1.46 |
     10_000_000_000_000 |     572.803     |    821.755    |  1.43 |
     50_000_000_000_000 |    3004.487     |   4620.525    |  1.54 |
    100_000_000_000_000 |    6307.521     |   9724.568    |  1.54 |
    200_000_000_000_000 |   13603.531     |  20279.547    |  1.49 |
```
I'm hoping people can help improve it and make it faster than it 
currently is.

Thanks.

Jabari

Jun 28 2019

Jabari Zakiya <jzakiya gmail.com> writes:

On Friday, 28 June 2019 at 19:32:41 UTC, Jabari Zakiya wrote:
 This is a continuation of a previous thread on the Segmented 
 Sieve of Zakiya (SSoZ):

 https://forum.dlang.org/post/ozwapetxgcxkbzjjslqe forum.dlang.org

 to present the D version of the current state-of-art of the 
 work.

 The D version source code also performs the Fastest Primes 
 Sieve for Twin Primes.

 https://gist.github.com/jzakiya/ae93bfa03dbc8b25ccc7f97ff8ad0f61

 It's a direct translation of the (original) Nim 
 (www.nim-lang.org) version.

 https://gist.github.com/jzakiya/6c7e1868bd749a6b1add62e3e3b2341e

 For those interested, here are two places to get my current 
 paper on it.
 I need to update it (again) to incorporate this new algorithm, 
 and results.

 https://www.academia.edu/37952623/The_Use_of_Prime_Generators_to_Implement_Fast_Twin_Primes_Sieve_of_Zakiya_SoZ_Applications_to_Number_Theory_and_Implications_for_the_Riemann_Hypotheses

 https://www.scribd.com/document/395415391/The-Use-of-Prime-Generators-to-Implement-Fast-Twin-Primes-Sieve-Of-Zakiya-SoZ-Applications-to-Number-Theory-and-Implications-for-the-Riemann-Hypoth

 I compiled the D source using the latest LDC 1.16.0. When 
 comparing times to Nim it's comparable but becomes slower as 
 inputs sizes increase. Here are some times for the Nim version 
 compared to `primesieve` -- 
 https://github.com/kimwalisch/primesieve

 ```
 ➜  ~ inxi -SCM
 System:    Host: localhost.localdomain Kernel: 5.1.11-pclos1 
 x86_64 bits: 64 Desktop: KDE Plasma 5.16.1 Distro: PCLinuxOS 
 2019
 Machine:   Type: Laptop System: System76 product: Gazelle v: 
 gaze10 serial: <root required>
            Mobo: System76 model: Gazelle v: gaze10 serial: 
 <root required> UEFI [Legacy]: American Megatrends v: 1.05.08
            date: 03/31/2016
 CPU:       Topology: Quad Core model: Intel Core i7-6700HQ 
 bits: 64 type: MT MCP L2 cache: 6144 KiB
            Speed: 1185 MHz min/max: 800/3500 MHz Core speeds 
 (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800
 ```

 ```
             N          | twinprimes_ssoz |   primesieve  | 
 Ratio |
     
 -------------------|-----------------|---------------|-------|
        100_000_000_000 |       4.628     |      5.837    |  
 1.26 |
        500_000_000_000 |      23.308     |     32.765    |  
 1.41 |
      1_000_000_000_000 |      47.966     |     69.705    |  
 1.45 |
      5_000_000_000_000 |     265.746     |    388.233    |  
 1.46 |
     10_000_000_000_000 |     572.803     |    821.755    |  
 1.43 |
     50_000_000_000_000 |    3004.487     |   4620.525    |  
 1.54 |
    100_000_000_000_000 |    6307.521     |   9724.568    |  
 1.54 |
    200_000_000_000_000 |   13603.531     |  20279.547    |  
 1.49 |
 ```
 I'm hoping people can help improve it and make it faster than 
 it currently is.

 Thanks.

 Jabari

Timing comparisons on my system. Nim code compiled with GCC 8.3.1 
(system has LLVM 8.0.0).

```
             N          |       Nim       |       D       | Ratio |
     -------------------|-----------------|---------------|-------|
        100_000_000_000 |       4.628     |      4.619    |  1.00 |
        500_000_000_000 |      23.308     |     24.071    |  1.03 |
      1_000_000_000_000 |      47.966     |     49.703    |  1.04 |
      5_000_000_000_000 |     265.746     |    276.847    |  1.04 |
     10_000_000_000_000 |     572.803     |    590.881    |  1.03 |
     50_000_000_000_000 |    3009.485     |   3046.913    |  1.01 |
    100_000_000_000_000 |    6307.521     |   6594.283    |  1.04 |
    200_000_000_000_000 |   13603.531     |  14706.743    |  1.08 |
```

Jun 29 2019

Exil <Exil gmall.com> writes:

On Saturday, 29 June 2019 at 12:28:35 UTC, Jabari Zakiya wrote:
 Timing comparisons on my system. Nim code compiled with GCC 
 8.3.1 (system has LLVM 8.0.0).

 ```
             N          |       Nim       |       D       | 
 Ratio |
     
 -------------------|-----------------|---------------|-------|
        100_000_000_000 |       4.628     |      4.619    |  
 1.00 |
        500_000_000_000 |      23.308     |     24.071    |  
 1.03 |
      1_000_000_000_000 |      47.966     |     49.703    |  
 1.04 |
      5_000_000_000_000 |     265.746     |    276.847    |  
 1.04 |
     10_000_000_000_000 |     572.803     |    590.881    |  
 1.03 |
     50_000_000_000_000 |    3009.485     |   3046.913    |  
 1.01 |
    100_000_000_000_000 |    6307.521     |   6594.283    |  
 1.04 |
    200_000_000_000_000 |   13603.531     |  14706.743    |  
 1.08 |
 ```

Could be the difference between NIM's GC and D's, seeing as you 
are continually allocating new arrays. Could just be that the GCC 
backend for NIM generates better assembly than LLVM. Could 
profile the two and look at the assembly to see where the 
additional time is being spent.

Jun 29 2019

D Programming

C/C++ Programming

Other

digitalmars.D - Fastest Prime Sieve, in D