digitalmars.D - Speed of hash tables compared to lua

Domingo (65/65) Sep 13 2020 Hello !

Domingo (34/45) Sep 13 2020 After writing the previous message I realized the test was not

Domingo (32/81) Sep 13 2020 And here is the C++ using unordered map, c++ speed is equal to

ikod (6/37) Sep 13 2020 Default runtime map provide some properties which may be

Daniel Kozak (2/19) Sep 13 2020 lua takes 489ms
Daniel Kozak (8/13) Sep 13 2020 I have tried emsi_containers, memutils and ikod-containers.

Steven Schveighoffer (11/32) Sep 13 2020 The allocation patterns definitely affect the builtin containers. A long...

FeepingCreature (6/11) Sep 15 2020 This reinforces my increasing conviction that

James Blachly (3/9) Sep 13 2020 ikod-containers is nice. Don't forget also dklib's khash =)

Daniel Kozak (11/20) Sep 13 2020 Does not work, first I have to change

mw (3/6) Sep 13 2020 Create a github repo, so everyone can run the benchmark, ... and

ikod (6/13) Sep 13 2020 Nice idea! It also should contain memory measurements and feature

ikod (5/19) Sep 24 2020 I added benchmark folder to my project

Daniel Kozak (5/14) Sep 13 2020 But this line will fix that for me:

James Blachly (4/30) Sep 14 2020 Hmm, thank you for that. We are using extensively internally and have

Domingo <mingodad gmail.com> writes:

Hello !
Doing some tests I found that D hash tables are a lot slower than 
Lua tables see bellow, D is using 2 times more memory and 5 times 
more time compared to Lua, when comparing with Luajit it's 
unbelievable worse.

I would expect a compiled language like D to use less resources 
than a scripting language, anyone have experienced something 
similar ?

Does the community know this ?

I hope this simple test would help to test and improve the 
situation !

=====
local  dsize = 6000008;
local d = {};

for i=1, dsize
do
	if ((i % 2) == 0) then  d[i] = 0.0000000123;
	else d[i] = 12345.0000000001; end
end

local sum = 0.0;
for i=1, dsize
do
       sum = sum + d[i];
end

print(string.format("count: %d, sum: %35.15f\n", #d, sum));
=====
/usr/bin/time lua sum-test.lua
count: 6000008, sum:         37035049380.000160217285156

0.39user 0.03system 0:00.43elapsed 99%CPU (0avgtext+0avgdata 
133528maxresident)k
520inputs+0outputs (1major+32908minor)pagefaults 0swaps

=====
/usr/bin/time luajit sum-test.lua
count: 6000008, sum:         37035049380.000160217285156

0.07user 0.02system 0:00.10elapsed 97%CPU (0avgtext+0avgdata 
67824maxresident)k
984inputs+0outputs (3major+18560minor)pagefaults 0swaps
=====

=====
import std.stdio;

int main(char[][] argv)
{
	immutable auto DSIZE = 6000008;
	double[int] d;

	for (int i = 0; i < DSIZE; ++i)
	{
	      d[i] = (i % 2) == 0 ? 0.0000000123 : 12345.0000000001;
	}
	//d.rehash;

	double sum = 0.0;
	for (int i = 0; i < DSIZE; ++i)
	{
	      sum += d[i];
	}

	writefln("count: %d, sum: %35.15f\n", DSIZE, sum);
	return 0;
}
=====
/usr/bin/time ./sum_test_aa
count: 6000008, sum:         37035049380.000160217285156

1.96user 0.10system 0:02.04elapsed 101%CPU (0avgtext+0avgdata 
272888maxresident)k
0inputs+0outputs (0major+68226minor)pagefaults 0swaps
=====
Cheers !

Sep 13 2020

Domingo <mingodad gmail.com> writes:

On Sunday, 13 September 2020 at 09:50:27 UTC, Domingo wrote:
 Hello !
 Doing some tests I found that D hash tables are a lot slower 
 than Lua tables see bellow, D is using 2 times more memory and 
 5 times more time compared to Lua, when comparing with Luajit 
 it's unbelievable worse.

 I would expect a compiled language like D to use less resources 
 than a scripting language, anyone have experienced something 
 similar ?

 Does the community know this ?

 I hope this simple test would help to test and improve the 
 situation !

After writing the previous message I realized the test was not 
exactly fair, Lua was probably using the array functionality of 
it's table so I changed it a bit and now Lua uses as much memory 
as D but still is faster by a good margin (~4).

=====
local  dsize = 6000008;
local d = {};

for i=dsize, 1, -1
do
	if ((i % 2) == 0) then  d[i] = 0.0000000123;
	else d[i] = 12345.0000000001; end
end

local sum = 0.0;
for i=dsize, 1, -1
do
       sum = sum + d[i];
end

print(string.format("count: %d, sum: %35.15f\n", #d, sum));
=====
=====
/usr/bin/time lua sum-test-rev.lua
count: 6000008, sum:         37035049380.000160217285156

0.63user 0.11system 0:00.74elapsed 100%CPU (0avgtext+0avgdata 
264500maxresident)k
0inputs+0outputs (0major+98435minor)pagefaults 0swaps
=====
=====
/usr/bin/time luajit sum-test-rev.lua
count: 6000008, sum:         37035049380.000160217285156

0.48user 0.04system 0:00.52elapsed 100%CPU (0avgtext+0avgdata 
117240maxresident)k
0inputs+0outputs (0major+41086minor)pagefaults 0swaps

=====

Sep 13 2020

Domingo <mingodad gmail.com> writes:

On Sunday, 13 September 2020 at 10:06:56 UTC, Domingo wrote:
 On Sunday, 13 September 2020 at 09:50:27 UTC, Domingo wrote:
 Hello !
 Doing some tests I found that D hash tables are a lot slower 
 than Lua tables see bellow, D is using 2 times more memory and 
 5 times more time compared to Lua, when comparing with Luajit 
 it's unbelievable worse.

 I would expect a compiled language like D to use less 
 resources than a scripting language, anyone have experienced 
 something similar ?

 Does the community know this ?

 I hope this simple test would help to test and improve the 
 situation !

 After writing the previous message I realized the test was not 
 exactly fair, Lua was probably using the array functionality of 
 it's table so I changed it a bit and now Lua uses as much 
 memory as D but still is faster by a good margin (~4).

 =====
 local  dsize = 6000008;
 local d = {};

 for i=dsize, 1, -1
 do
 	if ((i % 2) == 0) then  d[i] = 0.0000000123;
 	else d[i] = 12345.0000000001; end
 end

 local sum = 0.0;
 for i=dsize, 1, -1
 do
       sum = sum + d[i];
 end

 print(string.format("count: %d, sum: %35.15f\n", #d, sum));
 =====
 =====
 /usr/bin/time lua sum-test-rev.lua
 count: 6000008, sum:         37035049380.000160217285156

 0.63user 0.11system 0:00.74elapsed 100%CPU (0avgtext+0avgdata 
 264500maxresident)k
 0inputs+0outputs (0major+98435minor)pagefaults 0swaps
 =====
 =====
 /usr/bin/time luajit sum-test-rev.lua
 count: 6000008, sum:         37035049380.000160217285156

 0.48user 0.04system 0:00.52elapsed 100%CPU (0avgtext+0avgdata 
 117240maxresident)k
 0inputs+0outputs (0major+41086minor)pagefaults 0swaps

 =====

And here is the C++ using unordered map, c++ speed is equal to 
Lua but uses more memory:

=====
#include <unordered_map>
#include <string>
#include <cstdio>
int main()
{
	const int DSIZE = 6000008;
	// Create an empty unordered_map
	std::unordered_map<int, double> intMap;

	for (int i = 0; i < DSIZE; ++i)
	{
		intMap.insert( { i, (i % 2) == 0 ? 0.0000000123 : 
12345.0000000001 });
	}

	double sum = 0.0;
	for (std::pair<int, double> element : intMap)
		sum += element.second;

	printf("count: %d, sum: %35.15f\n", DSIZE, sum);
	return 0;
}
=====
=====
g++ -O2 -o sum_test_aa-cpp sum_test_aa.cpp

/usr/bin/time ./sum_test_aa-cpp
count: 6000008, sum:         37035049380.000160217285156
0.49user 0.09system 0:00.59elapsed 99%CPU (0avgtext+0avgdata 
330508maxresident)k
0inputs+0outputs (0major+93595minor)pagefaults 0swaps

=====

Sep 13 2020

ikod <igor.khasilev gmail.com> writes:

On Sunday, 13 September 2020 at 10:20:06 UTC, Domingo wrote:
 And here is the C++ using unordered map, c++ speed is equal to 
 Lua but uses more memory:

 =====
 #include <unordered_map>
 #include <string>
 #include <cstdio>
 int main()
 {
 	const int DSIZE = 6000008;
 	// Create an empty unordered_map
 	std::unordered_map<int, double> intMap;

 	for (int i = 0; i < DSIZE; ++i)
 	{
 		intMap.insert( { i, (i % 2) == 0 ? 0.0000000123 : 
 12345.0000000001 });
 	}

 	double sum = 0.0;
 	for (std::pair<int, double> element : intMap)
 		sum += element.second;

 	printf("count: %d, sum: %35.15f\n", DSIZE, sum);
 	return 0;
 }
 =====
 =====
 g++ -O2 -o sum_test_aa-cpp sum_test_aa.cpp

 /usr/bin/time ./sum_test_aa-cpp
 count: 6000008, sum:         37035049380.000160217285156
 0.49user 0.09system 0:00.59elapsed 99%CPU (0avgtext+0avgdata 
 330508maxresident)k
 0inputs+0outputs (0major+93595minor)pagefaults 0swaps

 =====

Default runtime map provide some properties which may be 
unavailable for other implementations. Its speed may be affected 
by GC, it uses to keep table items.

You may include some hash map implementations from the dub repo 
in your benchmark.

Sep 13 2020

Daniel Kozak <kozzi11 gmail.com> writes:

On Sun, Sep 13, 2020 at 10:50 PM Daniel Kozak <kozzi11 gmail.com> wrote:

 On Sun, Sep 13, 2020 at 1:25 PM ikod via Digitalmars-d <
 digitalmars-d puremagic.com> wrote:

 Default runtime map provide some properties which may be
 unavailable for other implementations. Its speed may be affected
 by GC, it uses to keep table items.

 You may include some hash map implementations from the dub repo
 in your benchmark.

 I have tried emsi_containers, memutils and  ikod-containers.
 default D's AA take 1s on my pc
 emsi_containers take 3s realy slow
 memutils was fast but does not realy work (seems it is broken because
 there has been nan at d[0] and many  other indexes)
 ikod-containers take 242ms

lua takes 489ms

Sep 13 2020

Daniel Kozak <kozzi11 gmail.com> writes:

On Sun, Sep 13, 2020 at 1:25 PM ikod via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 Default runtime map provide some properties which may be
 unavailable for other implementations. Its speed may be affected
 by GC, it uses to keep table items.

 You may include some hash map implementations from the dub repo
 in your benchmark.

I have tried emsi_containers, memutils and  ikod-containers.
default D's AA take 1s on my pc
emsi_containers take 3s realy slow
memutils was fast but does not realy work (seems it is broken because there
has been nan at d[0] and many  other indexes)
ikod-containers take 242ms

Sep 13 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 9/13/20 4:50 PM, Daniel Kozak wrote:
 
 
 On Sun, Sep 13, 2020 at 1:25 PM ikod via Digitalmars-d 
 <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote:
 
 
     Default runtime map provide some properties which may be
     unavailable for other implementations. Its speed may be affected
     by GC, it uses to keep table items.
 
     You may include some hash map implementations from the dub repo
     in your benchmark.
 
 
 I have tried emsi_containers, memutils and  ikod-containers.
 default D's AA take 1s on my pc
 emsi_containers take 3s realy slow
 memutils was fast but does not realy work (seems it is broken because 
 there has been nan at d[0] and many  other indexes)
 ikod-containers take 242ms
 

The allocation patterns definitely affect the builtin containers. A long 
time ago, dcollections kicked the pants off of builtin AAs for most 
patterns, basically by using different memory allocation strategies (for 
example, allocating blocks of elements at a time, and using C malloc 
when no pointers were detected in the type).

Builtin AA's will always have to be slower, because they must allocate a 
separate piece of memory for each bucket to retain the property that you 
can get a pointer to any element, and that memory will survive anything 
the AA can do.

-Steve

Sep 13 2020

FeepingCreature <feepingcreature gmail.com> writes:

On Sunday, 13 September 2020 at 21:29:53 UTC, Steven 
Schveighoffer wrote:
 Builtin AA's will always have to be slower, because they must 
 allocate a separate piece of memory for each bucket to retain 
 the property that you can get a pointer to any element, and 
 that memory will survive anything the AA can do.

 -Steve

This reinforces my increasing conviction that 
referentiability-by-default is on a level with 
nullability-by-default and mutability-by-default in the "Great 
Collection of C-Derived Language Design Errors".

Sep 15 2020

James Blachly <james.blachly gmail.com> writes:

On 9/13/20 4:50 PM, Daniel Kozak wrote:
 I have tried emsi_containers, memutils and  ikod-containers.
 default D's AA take 1s on my pc
 emsi_containers take 3s realy slow
 memutils was fast but does not realy work (seems it is broken because 
 there has been nan at d[0] and many  other indexes)
 ikod-containers take 242ms

ikod-containers is nice. Don't forget also dklib's khash =)

https://github.com/blachlylab/dklib/

Sep 13 2020

Daniel Kozak <kozzi11 gmail.com> writes:

On Mon, Sep 14, 2020 at 3:55 AM James Blachly via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 9/13/20 4:50 PM, Daniel Kozak wrote:
 I have tried emsi_containers, memutils and  ikod-containers.
 default D's AA take 1s on my pc
 emsi_containers take 3s realy slow
 memutils was fast but does not realy work (seems it is broken because
 there has been nan at d[0] and many  other indexes)
 ikod-containers take 242ms

 ikod-containers is nice. Don't forget also dklib's khash =)

 https://github.com/blachlylab/dklib/

Does not work, first I have to change

auto map = khash!(keytype, valuetype);

to

auto map = khash!(keytype, valuetype)();

Because I get (source/app.d(20,11): Error: type khash!(uint, double, true,
true) has no value)

And after that it takes 262ms but end up with SIGSEGV.
But it print right sum, so I guess it is something with freeing when try to
dealocate hashmap

Sep 13 2020

mw <mingwu gmail.com> writes:

On Monday, 14 September 2020 at 05:25:03 UTC, Daniel Kozak wrote:
 Does not work, first I have to change
 ...
 And after that it takes 262ms but end up with SIGSEGV.

Create a github repo, so everyone can run the benchmark, ... and 
fix it?

Sep 13 2020

ikod <igor.khasilev gmail.com> writes:

On Monday, 14 September 2020 at 05:31:57 UTC, mw wrote:
 On Monday, 14 September 2020 at 05:25:03 UTC, Daniel Kozak 
 wrote:
 Does not work, first I have to change
 ...
 And after that it takes 262ms but end up with SIGSEGV.

 Create a github repo, so everyone can run the benchmark, ... 
 and fix it?

Nice idea! It also should contain memory measurements and feature 
matrix (as feature set can affect speed).

It also would be nice to have benchmarks on some different 
key/value types or sizes, like class as key or value, struct(fat 
struct) as key or value, etc..

Sep 13 2020

ikod <igor.khasilev gmail.com> writes:

On Monday, 14 September 2020 at 06:02:49 UTC, ikod wrote:
 On Monday, 14 September 2020 at 05:31:57 UTC, mw wrote:
 On Monday, 14 September 2020 at 05:25:03 UTC, Daniel Kozak 
 wrote:
 Does not work, first I have to change
 ...
 And after that it takes 262ms but end up with SIGSEGV.

 Create a github repo, so everyone can run the benchmark, ... 
 and fix it?

 Nice idea! It also should contain memory measurements and 
 feature matrix (as feature set can affect speed).

 It also would be nice to have benchmarks on some different 
 key/value types or sizes, like class as key or value, 
 struct(fat struct) as key or value, etc..

I added benchmark folder to my project

https://github.com/ikod/ikod-containers/tree/master/bench

Please add your implementations

Thanks!

Sep 24 2020

Daniel Kozak <kozzi11 gmail.com> writes:

On Mon, Sep 14, 2020 at 7:25 AM Daniel Kozak <kozzi11 gmail.com> wrote:

 Does not work, first I have to change

 auto map = khash!(keytype, valuetype);

 to

 auto map = khash!(keytype, valuetype)();

 Because I get (source/app.d(20,11): Error: type khash!(uint, double, true,
 true) has no value)

 And after that it takes 262ms but end up with SIGSEGV.
 But it print right sum, so I guess it is something with freeing when try
 to dealocate hashmap

But this line will fix that for me:

extern(C) __gshared string[] rt_options = [ "gcopt=parallel:0 cleanup:none"
];

So it is something with GC trying to free already freed memory I geuss

Sep 13 2020

James Blachly <james.blachly gmail.com> writes:

On 9/14/20 1:28 AM, Daniel Kozak wrote:
 On Mon, Sep 14, 2020 at 7:25 AM Daniel Kozak <kozzi11 gmail.com 
 <mailto:kozzi11 gmail.com>> wrote:
 
 
     Does not work, first I have to change
 
     auto map = khash!(keytype, valuetype);
 
     to
 
     auto map = khash!(keytype, valuetype)();
 
     Because I get (source/app.d(20,11): Error: type khash!(uint, double,
     true, true) has no value)
 
     And after that it takes 262ms but end up with SIGSEGV.
     But it print right sum, so I guess it is something with freeing when
     try to dealocate hashmap
 
 
 But this line will fix that for me:
 
 extern(C) __gshared string[] rt_options = [ "gcopt=parallel:0 
 cleanup:none" ];
 
 So it is something with GC trying to free already freed memory I geuss

Hmm, thank you for that. We are using extensively internally and have 
not had segfault. Can you share the benchmarking code?

I will of course update the readme to include the omitted function call ();

Sep 14 2020

D Programming

C/C++ Programming

Other

digitalmars.D - Speed of hash tables compared to lua