## digitalmars.D.learn - speeding up + ctfe question

• maarten van damme (15/15) May 26 2012 I finally made the primerange I wanted to make. I think I'm using a
• sclytrack (72/87) May 26 2012 See below for a rather dumb algorithm. The idea is to list
• jerro (15/16) May 26 2012 I got a 30 percent speedup by replacing this line:
• maarten van damme (6/6) May 26 2012 well, I was thinking about using a sieve but when you were to request
• jerro (10/21) May 26 2012 Because you only need primes up to sqrt(n) to check if n is prime,
• maarten van damme (7/7) May 27 2012 well, maybe, looking back at it, a range isn't the ideal way to go
• sclytrack (2/9) May 27 2012 https://github.com/PhilippeSigaud/D-templates-tutorial
• maarten van damme (13/13) May 27 2012 thank you :)
• sclytrack (13/26) May 27 2012 Yeah I get the same errors. And I'm not acquainted with the benchmark co...
• jerro (9/22) May 27 2012 You say it does compile, but then crashes immediatly? I can't
• maarten van damme (6/6) May 27 2012 ok, can't seem to reproduce the crashing. now on to optimizing my
• jerro (7/15) May 27 2012 Have you tried turning optimization on? I get about 8ms with
• maarten van damme (8/8) May 28 2012 I tried using 6k+-1 for all primes and for some reason it performed
• jerro (7/17) May 28 2012 You aren't, by any chance, using divisions or remainders? those
• maarten van damme (5/5) May 28 2012 I didn't use divisions or remainders, only multiplications. I've
maarten van damme <maartenvd1994 gmail.com> writes:
```I finally made the primerange I wanted to make. I think I'm using a
rather dumb algorithm. The idea is to store the prime we're at in
curPrime and the amount of preceding primes in countPrime. Then
opindex is able to calculate how many primes following or preceding
curPrime we have to find to get the asked value.
I'm not english so some names can be poorly chosen. Code is at
http://dl.dropbox.com/u/15024434/d/primeRange.d

The problem is that it's pretty slow. I wanted to find the first prime
number with 10 digits and it literally takes forever (but manages it
though). Is there an easy way to speed up or is this the ceiling?
(better algorithms, stupidity's I've left,...)

I also have an array of primes I use to "quicktest" to see if
something is a prime. You can change it's size using setQuicktestSize
at runtime. I want to make it large at compile time though. as I'm new
to ctfe I have no idea as to how one can do this.
```
May 26 2012
sclytrack <sclytrack iq87.fr> writes:
```On 05/26/2012 03:49 PM, maarten van damme wrote:
I finally made the primerange I wanted to make. I think I'm using a
rather dumb algorithm. The idea is to store the prime we're at in
curPrime and the amount of preceding primes in countPrime. Then
opindex is able to calculate how many primes following or preceding
curPrime we have to find to get the asked value.
I'm not english so some names can be poorly chosen. Code is at
http://dl.dropbox.com/u/15024434/d/primeRange.d

The problem is that it's pretty slow. I wanted to find the first prime
number with 10 digits and it literally takes forever (but manages it
though). Is there an easy way to speed up or is this the ceiling?
(better algorithms, stupidity's I've left,...)

I also have an array of primes I use to "quicktest" to see if
something is a prime. You can change it's size using setQuicktestSize
at runtime. I want to make it large at compile time though. as I'm new
to ctfe I have no idea as to how one can do this.

See below for a rather dumb algorithm. The idea is to list
a list of prime numbers. The problem is that it is pretty
slow and you get a ...

//src/main.d(57): Error: template instance main.isDivisable!(499,498)
recursive expansion

error when the number is too large. I'm not English so
the names ARE poorly chosen.

//writeln(primeList!(200)); //<---don't pick over 200 or it goes nuts
//writeln(decompose!(49,2));

template primeList(int index)
{
static if (index > 1)
{
static if (isPrime!(index))
{
enum int [] primeList =  primeList!(index-1) ~ [index];
} else
{
enum int [] primeList = primeList!(index-1);
}
}
else
{
enum int [] primeList = [];
}
}

template isDivisable( int number, int divisor)
{
enum bool isDivisable = (number % divisor) == 0;
}
//	10 ---> [2,5]

template decompose(int number, int d = 2)
{
static if (d < number)
{
static if (isDivisable!(number, d))
{
enum int [] decompose = [d] ~ decompose!(number/d, d);
}
else
{
enum int [] decompose = decompose!(number, d+1);
}
} else
{
enum int [] decompose = [number];
}
}

template isPrimeDecompose(int number, int d = 2)
{
static if (d < number)
{
static if (isDivisable!(number, d))
{
enum bool isPrimeDecompose = false;
} else
{
enum bool isPrimeDecompose = isPrimeDecompose!(number, d+1);
}
} else
{
enum bool isPrimeDecompose = true;
}
}

template isPrime(int number)
{
static if (number > 1)
enum bool isPrime = isPrimeDecompose!(number, 2);
else
enum bool isPrime = false;
}
```
May 26 2012
"jerro" <a a.com> writes:
```On Saturday, 26 May 2012 at 13:49:53 UTC, maarten van damme wrote:

Is there an easy way to speed up or is this the
ceiling?

I got a 30 percent speedup by replacing this line:

if(&& canFind(quicktestPrimes, possiblePrime))

with this:

if(quicktestPrimes.back >= possiblePrime &&
canFind(quicktestPrimes, possiblePrime))

You could probably get a much better speedup by using some
kind of a prime sieve.

BTW you shouldn't be using opIndex like that. Calling opIndex
shouldn't change observable state of the object and it should
run in O(log(n)) or better. People expect an index operator to
behave similarly to the index operator on arrays. If the
function does something completely different, it is very
confusing to make it an opIndex.
```
May 26 2012
maarten van damme <maartenvd1994 gmail.com> writes:
```well, I was thinking about using a sieve but when you were to request
prime 400_000 you're sieve would blow up in size. That's why I opted
for something else (and I don't know if it was the right thing to do
though). (Ab)using opIndex like that is indeed a bit wrong but what
would be the alternative? remove slicing and introduce getPrime?
Wouldn't that be the same but without the great syntactic sugar?
```
May 26 2012
"jerro" <a a.com> writes:
```On Saturday, 26 May 2012 at 18:40:53 UTC, maarten van damme wrote:
well, I was thinking about using a sieve but when you were to
request
prime 400_000 you're sieve would blow up in size.

Because you only need primes up to sqrt(n) to check if n is prime,
you can make a sieve based range that only uses O(sqrt(n)) memory.
I've put an example at https://gist.github.com/2795954 .It could
be optimized further (skipping even numbers, keeping track of the
multiples of small primes...), but it is already much faster
than a range that does not use a sieve.

That's why I
opted
for something else (and I don't know if it was the right thing
to do
though). (Ab)using opIndex like that is indeed a bit wrong but
what
would be the alternative? remove slicing and introduce getPrime?
Wouldn't that be the same but without the great syntactic sugar?

Syntax sugar isn't great if it confuses people and causes bugs.
You can use r.drop(n).front to get nth element of a range.
If you need to do it often, you could just write a nth() function.
```
May 26 2012
maarten van damme <maartenvd1994 gmail.com> writes:
```well, maybe, looking back at it, a range isn't the ideal way to go
about generating primes easily. I'm going to implement the sieve of
Atkin and make it return an array of primes up to a given number. This
might be a bit more useful and fast.

Is there somewhere I can catch up on ctfe? after reading andrei's book
and the documentation it's still not clear to me how I could make my
setquicktestsize function run at compile time.
```
May 27 2012
sclytrack <sclytrack iq87.fr> writes:
```On 05/27/2012 01:18 PM, maarten van damme wrote:
well, maybe, looking back at it, a range isn't the ideal way to go
about generating primes easily. I'm going to implement the sieve of
Atkin and make it return an array of primes up to a given number. This
might be a bit more useful and fast.

Is there somewhere I can catch up on ctfe? after reading andrei's book
and the documentation it's still not clear to me how I could make my
setquicktestsize function run at compile time.

https://github.com/PhilippeSigaud/D-templates-tutorial
```
May 27 2012
maarten van damme <maartenvd1994 gmail.com> writes:
```thank you :)

I wrote a sieve of aristotle because atkin's sieve needed waay to many
optimizations to beat aristotle's sieve by even a little bit so I
wanted to see if aristotle's was good enough.

I ran in two problems. It was extremely fast when sieving a byte array
of 1 million entries (without optimizations at all) but when trying
with 10_000_000 entries the program crashes before it even starts to
execute (main isn't reached).

Then I decided to benchmark the code and dmd fails to compile with
"Internal error: ..\ztc\cgcs.s 354".

I assume the first problem has to do with memory limits and that the
second is a little dmd bug. code is at
http://dl.dropbox.com/u/15024434/d/aristotleSieve.d
```
May 27 2012
sclytrack <sclytrack iq87.fr> writes:
```On 05/27/2012 02:10 PM, maarten van damme wrote:
thank you :)

I wrote a sieve of aristotle because atkin's sieve needed waay to many
optimizations to beat aristotle's sieve by even a little bit so I
wanted to see if aristotle's was good enough.

I ran in two problems. It was extremely fast when sieving a byte array
of 1 million entries (without optimizations at all) but when trying
with 10_000_000 entries the program crashes before it even starts to
execute (main isn't reached).

Then I decided to benchmark the code and dmd fails to compile with
"Internal error: ..\ztc\cgcs.s 354".

I assume the first problem has to do with memory limits and that the
second is a little dmd bug. code is at
http://dl.dropbox.com/u/15024434/d/aristotleSieve.d

Yeah I get the same errors. And I'm not acquainted with the benchmark code.
--
http://dlang.org/concepts.html
isprime there says that 2 is not a prime.
enum result = isprime(2);
compile time enum "result" says false.
--
http://erdani.com/d/web/phobos-prerelease/std_benchmark.html
(the future)
--
Er is zon buiten. Zonnige zondag namiddag met priemgetallen in de plaats
van buiten zitten. Tss tss. :-)
```
May 27 2012
"jerro" <a a.com> writes:
``` I ran in two problems. It was extremely fast when sieving a
byte array
of 1 million entries (without optimizations at all) but when
trying
with 10_000_000 entries the program crashes before it even
starts to
execute (main isn't reached).

You say it does compile, but then crashes immediatly? I can't
reproduce that. For me it is a compile time error if toSieve
is 16MB or more, but otherwise it runs fine. Anyway you should
use a dynamic array instead of a static one if you want really
large arrays.

Then I decided to benchmark the code and dmd fails to compile
with
"Internal error: ..\ztc\cgcs.s 354".

I assume the first problem has to do with memory limits and
that the
second is a little dmd bug.

It is a bug, all internal errors are. It looks like you can
avoid like this:

auto tmp=(benchmark!(f0)(1000));
int time=tmp[0].to!("msecs", int);
```
May 27 2012
maarten van damme <maartenvd1994 gmail.com> writes:
```ok, can't seem to reproduce the crashing. now on to optimizing my
sieve a bit more ,9 miliseconds for 1_000_000 is still to slow.
--
Er is zon buiten. Zonnige zondag namiddag met priemgetallen in de
plaats van buiten zitten. Tss tss. :-)
hoe? wie? x)
```
May 27 2012
"jerro" <a a.com> writes:
```On Sunday, 27 May 2012 at 17:00:01 UTC, maarten van damme wrote:
ok, can't seem to reproduce the crashing. now on to optimizing
my
sieve a bit more ,9 miliseconds for 1_000_000 is still to slow.
--
Er is zon buiten. Zonnige zondag namiddag met priemgetallen in
de
plaats van buiten zitten. Tss tss. :-)
hoe? wie? x)

Have you tried turning optimization on? I get about 8ms with
your code with no flags and about 2.9 with -O -inline -release.

To optimize it further you could make a sieve that only works
with odd numbers. I get (https://gist.github.com/2817289) about
1ms that way. I'm sure it is possible to do much better, but
that would probably be much more work too.
```
May 27 2012
maarten van damme <maartenvd1994 gmail.com> writes:
```I tried using 6k+-1 for all primes and for some reason it performed
slower. I think I have something completely inefficient somewhere,
can't figure out where though.
I think it has something to do with me increasing k and then
multiplying with k while I could have simply added 6 to K...

and I haven't tried with optimization on, first I want to get the
algorithm as good as possible before trying to squeeze out those extra
milliseconds :)
```
May 28 2012
"jerro" <a a.com> writes:
``` I tried using 6k+-1 for all primes and for some reason it
performed
slower. I think I have something completely inefficient
somewhere,
can't figure out where though.

You aren't, by any chance, using divisions or remainders? those
are much slower than, say, multiplications (unless the divisor
is a power of two and known at compile time, in which case
the compiler will optimize them to bitwise operations).

and I haven't tried with optimization on, first I want to get
the
algorithm as good as possible before trying to squeeze out
those extra
milliseconds :)

I prefer to compile with optimizations on when trying to
optimize something. That way I don't waste my time on
micro-optimizations that the compiler could do for me.
```
May 28 2012
maarten van damme <maartenvd1994 gmail.com> writes:
```I didn't use divisions or remainders, only multiplications. I've
changed it so it only uses addition and now it still doesn't
outperform a version that only checks odd's. it's not as fast as your
version where every index corresponds to i*2+1 because I fill every
even number with false...
```
May 28 2012