digitalmars.D.learn - What is the closest to ConcurrentHashMap and NavigableMap in Java?

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

In our Java code, we make heavy use of ConcurrentHashMap for 
in-memory caches:

http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html

A background thread wakes up every X seconds and resyncs the 
cache with whatever changes occurred in the DB. In the meantime, 
all incoming requests can read from the same cache without any 
concurrency issues and with excellent performance.

Java's sorted Map implementations (TreeMap, 
ConcurrentSkipListMap) also support the NavigableMap interface:

http://docs.oracle.com/javase/7/docs/api/java/util/NavigableMap.html

which allows you to get the key closest to the one you are 
looking for (either up or down). This is very useful when you 
have hierarchies of business rules that have effective date 
ranges and you are trying to find which rule is active on a 
particular day.

I read the chapter on associative arrays in D, but I do not 
see anything similar to this functionality in there.

Could anyone point me to what would be the closest D equivalents 
(maybe in an external library if not part of Phobos) so we can 
start playing around with them?

Much appreciated
Jacek

Nov 14 2013

"TheFlyingFiddle" <theflyingfiddle gmail.com> writes:

On Thursday, 14 November 2013 at 17:36:09 UTC, Jacek 
Furmankiewicz wrote:
 In our Java code, we make heavy use of ConcurrentHashMap for 
 in-memory caches:

 http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html

 A background thread wakes up every X seconds and resyncs the 
 cache with whatever changes occurred in the DB. In the 
 meantime, all incoming requests can read from the same cache 
 without any concurrency issues and with excellent performance.

 Java's sorted Map implementations (TreeMap, 
 ConcurrentSkipListMap) also support the NavigableMap interface:

 http://docs.oracle.com/javase/7/docs/api/java/util/NavigableMap.html

 which allows you to get the key closest to the one you are 
 looking for (either up or down). This is very useful when you 
 have hierarchies of business rules that have effective date 
 ranges and you are trying to find which rule is active on a 
 particular day.

 I read the chapter on associative arrays in D, but I do not 
 see anything similar to this functionality in there.

 Could anyone point me to what would be the closest D 
 equivalents (maybe in an external library if not part of 
 Phobos) so we can start playing around with them?

 Much appreciated
 Jacek

D does not have a lot of different containers at the moment. This 
work has been postponed until a working version of std.allocators 
is implemented. So, at least in the Phobos library right now, you 
will not find what you are looking for.
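
For the NavigableMap half of the question, a sorted array searched through Phobos' `SortedRange` may get reasonably close to `floorKey`/`ceilingKey`. The following is only a minimal sketch under assumptions: the helper names `floorIndex` and `ceilingIndex`, the rule table, and the yyyymmdd date encoding are all hypothetical, and the array is assumed to be kept sorted by whoever updates it.

```d
import std.range : assumeSorted;

/// Index of the greatest key <= target, or -1 if there is none
/// (roughly what Java's floorKey gives you).
ptrdiff_t floorIndex(const(int)[] sortedKeys, int target)
{
    // upperBound(target) is the sub-range of keys strictly greater than target.
    immutable greater = assumeSorted(sortedKeys).upperBound(target).length;
    return cast(ptrdiff_t)(sortedKeys.length - greater) - 1;
}

/// Index of the smallest key >= target, or -1 if there is none
/// (roughly what Java's ceilingKey gives you).
ptrdiff_t ceilingIndex(const(int)[] sortedKeys, int target)
{
    // lowerBound(target) is the sub-range of keys strictly less than target.
    immutable less = assumeSorted(sortedKeys).lowerBound(target).length;
    return less < sortedKeys.length ? cast(ptrdiff_t) less : -1;
}

void main()
{
    // Hypothetical rule table: effective dates encoded as yyyymmdd, kept sorted.
    int[] effectiveFrom = [20130101, 20130601, 20140101];
    string[] rules = ["rule A", "rule B", "rule C"];

    // The rule active on 2013-08-15 is the one with the latest start date <= that day.
    auto i = floorIndex(effectiveFrom, 20130815);
    assert(i == 1 && rules[i] == "rule B");

    assert(ceilingIndex(effectiveFrom, 20130815) == 2); // next rule to take effect
    assert(floorIndex(effectiveFrom, 20121231) == -1);  // nothing active yet
}
```

Both lookups are binary searches, so they cost O(log n); inserting into the sorted array is O(n), which suits data that is read far more often than it is written.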

Nov 14 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

So how do existing D applications (especially the high perf ones 
in let's say the financial sector) deal with having some part of 
the data in memory and keeping it in sync with the DB source?

This must be a very common requirement for any app with realtime 
or close-to-realtime SLAs.

Is there a different idiom or approach in D?

Nov 14 2013

"TheFlyingFiddle" <theflyingfiddle gmail.com> writes:

On Thursday, 14 November 2013 at 18:08:22 UTC, Jacek 
Furmankiewicz wrote:
 So how do existing D applications (especially the high perf 
 ones in let's say the financial sector) deal with having some 
 part of the data in memory and keeping it in sync with the DB 
 source?

Good question. I have no idea.

 Is there a different idiom or approach in D?

Well, in D the preferred way (or the one recommended by TDPL) to 
share resources between threads is message passing.

So in this case a single thread would be responsible for the 
caching of data in memory, and other threads would ask this thread 
for data through message passing.

Whether this is faster or better than Java's ConcurrentHashMap, I 
am not sure.
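
The owner-thread idea can be sketched with `std.concurrency`. This is an illustration under assumptions, not a tuned design: `cacheOwner`, `lookup`, the hard-coded cache contents, and the `-1` "not found" sentinel are all hypothetical.

```d
import std.concurrency;

/// Owner thread: the only thread that ever touches the map.
void cacheOwner()
{
    // Hypothetical cache contents; a real one would be loaded from the DB.
    int[int] cache = [1: 100, 2: 200];

    bool running = true;
    while (running)
    {
        receive(
            // Lookup request: reply to the sender with (key, value), -1 if absent.
            (Tid requester, int key) {
                auto p = key in cache;
                requester.send(key, p ? *p : -1);
            },
            // The spawning thread has exited: stop serving.
            (OwnerTerminated e) { running = false; }
        );
    }
}

/// Blocking lookup helper for worker threads.
int lookup(Tid owner, int key)
{
    owner.send(thisTid, key);
    return receiveOnly!(int, int)()[1];
}

void main()
{
    auto owner = spawn(&cacheOwner);
    assert(lookup(owner, 2) == 200);
    assert(lookup(owner, 7) == -1);
}
```

Every read is a round trip through two message queues, so this trades raw read throughput for not having to reason about shared state.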

Nov 14 2013

"JN" <666total wp.pl> writes:

On Thursday, 14 November 2013 at 18:08:22 UTC, Jacek 
Furmankiewicz wrote:
 So how do existing D applications (especially the high perf 
 ones in let's say the financial sector) deal with having some 
 part of the data in memory and keeping it in sync with the DB 
 source?

They don't, because there aren't really such apps in the wild yet.

Nov 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 14 November 2013 at 17:36:09 UTC, Jacek 
Furmankiewicz wrote:
 In our Java code, we make heavy use of ConcurrentHashMap for 
 in-memory caches:

Take a look at dcollections:
http://www.dsource.org/projects/dcollections

Also, vibe.d has its own hashmap:
https://github.com/rejectedsoftware/vibe.d/blob/master/source/vibe/utils/hashmap.d

Nov 14 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

Thanks for the links.

I looked at the dcollections docs, but none of their collections 
seem thread safe. The vibe.d one, I guess, is not thread safe 
because it is meant to be used from async I/O in a single 
thread...but once you add multi-threading to an app I am guessing 
it would not be usable.

Nov 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 14 November 2013 at 20:00:10 UTC, Jacek 
Furmankiewicz wrote:
 I looked at the dcollections docs, but none of their 
 collections seem thread safe. The vibe.d one, I guess, is not 
 thread safe because it is meant to be used from async I/O in a 
 single thread...but once you add multi-threading to an app I am 
 guessing it would not be usable.

No, you can:
1) Use a separate hashmap per thread. I don't know your situation, 
but it may be possible for a read-only cache, like this:

import vibe.utils.hashmap;

//module-level variables are thread-local by default,
//so each thread gets its own `map`
HashMap!(int, int) map;

void foo()
{
    //use map
    map[1] = 1;
}

2) Use `shared` storage class and mutex like this:

import vibe.utils.hashmap;

shared HashMap!(int, int) map;

void foo()
{
    synchronized
    {
       //use map
       map[1] = 1;
    }
}
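
Option 1 works because D module-level variables are thread-local unless marked `shared` or `__gshared`. A small sketch of that behaviour, using a built-in associative array instead of the vibe.d `HashMap` so that it has no dependencies (the names `perThreadMap` and `visibleInNewThread` are hypothetical):

```d
import core.thread : Thread;

// Thread-local by default: every thread gets its own, initially empty, instance.
int[int] perThreadMap;

/// Starts a new thread and reports whether that thread can see
/// key 1 in *its* copy of perThreadMap.
bool visibleInNewThread()
{
    bool seen;
    auto t = new Thread({ seen = (1 in perThreadMap) !is null; });
    t.start();
    t.join();
    return seen;
}

void main()
{
    perThreadMap[1] = 1;            // written in the main thread's copy only
    assert(1 in perThreadMap);
    assert(!visibleInNewThread());  // the new thread started with an empty map
}
```

This is also why a per-thread cache has to be filled once per thread, which is the duplication cost raised later in the thread.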

Nov 14 2013

"TheFlyingFiddle" <theflyingfiddle gmail.com> writes:

 2) Use `shared` storage class and mutex like this:

 import vibe.utils.hashmap;

 shared HashMap!(int, int) map;

 void foo()
 {
    synchronized
    {
       //use map
       map[1] = 1;
    }
 }

Locking every time you use the map doesn't really seem reasonable. 
It's not particularly fast and you might forget to lock the map at 
some point (or does the shared modifier not allow you to do this 
in D?)

I'm not that familiar with the synchronized statement, but 
shouldn't it be locked on some object?

void bar()
{
    //Can one thread be in this block...
    synchronized
    {
      map[1] = 1;
    }

    //... while another thread is in this block?
    synchronized
    {
      map[2] = 2;
    }
}

If that is the case, are you not limited in the way you can update 
the map, e.g. only in a single block?

Nov 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 14 November 2013 at 21:16:15 UTC, TheFlyingFiddle 
wrote:

 If that is the case, are you not limited in the way you can 
 update the map, e.g. only in a single block?

Yes, it's probably not the best example. It's valid if you have 
only 1 synchronized block for map. But you can use something like 
this:

void bar()
{
    synchronized(map)
    {
      map[1] = 1;
    }

    synchronized(map)
    {
      map[2] = 2;
    }
}

Or this:

//Note: valid only if you have 1 function that uses map
synchronized void bar()
{
      map[1] = 1;

      map[2] = 2;
}

Or this:

shared MyMap myMap;

//every method of a synchronized class locks the object's monitor
synchronized class MyMap
{
    HashMap!(int, int) map;

    void foo()
    {
       map[1] = 1;
    }

    void bar()
    {
       map[2] = 2;
    }
}

//init map
shared static this()
{
    myMap = new shared(MyMap)();
}

This is probably the best example.

Nov 14 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

hashmap per thread is not an option. The cache may be a few GBs 
of data, there is no way we can duplicate that data per thread.

Not to mention the start up time when we have to warm up the 
cache.

Nov 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek 
Furmankiewicz wrote:
 hashmap per thread is not an option. The cache may be a few GBs 
 of data, there is no way we can duplicate that data per thread.

 Not to mention the start up time when we have to warm up the 
 cache.

How often do you change the data? Probably, you should use 
`immutable` variables.

Nov 14 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

On Thursday, 14 November 2013 at 21:36:46 UTC, ilya-stromberg 
wrote:
 On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek 
 Furmankiewicz wrote:
 How often do you change the data? Probably, you should use 
 `immutable` variables.

Customer specific. It may change once a year. It may change 
multiple times per second for a while, then nothing again for 
weeks.

Others may do mass loads of business rules, hence do mass changes 
every few hours.

Next to impossible to predict.

Nov 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 14 November 2013 at 22:12:10 UTC, Jacek 
Furmankiewicz wrote:
 On Thursday, 14 November 2013 at 21:36:46 UTC, ilya-stromberg 
 wrote:
 On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek 
 Furmankiewicz wrote:
 How often do you change the data? Probably, you should use 
 `immutable` variables.

 Customer specific. It may change once a year. It may change 
 multiple times per second for a while, then nothing again for 
 weeks.

 Others may do mass loads of business rules, hence do mass 
 changes every few hours.

 Next to impossible to predict.

You can use `immutable` variables. It allows you to share the 
data without any synchronization. Like this:

class MyData
{
    int data1;
    string data2;

    //creates new object; `pure` lets the same constructor
    //build both mutable and immutable instances
    this(int data1, string data2) pure
    {
       this.data1 = data1;
       this.data2 = data2;
    }

    //modify the data
    immutable(MyData) editData(int i) const
    {
       //copy this object - we can't change immutable variables
       MyData dataCopy = new MyData(this.data1, this.data2);

       //modify the data copy
       dataCopy.data1 += i;

       //assume that `dataCopy` is immutable
       return cast(immutable(MyData)) dataCopy;
    }
}

shared MyMap myMap;

//map implementation
synchronized class MyMap
{
    HashMap!(int, immutable(MyData)) map;

    void foo()
    {
       map[1] = new immutable MyData(1, "data");
    }

    void bar()
    {
       map[1] = map[1].editData(5);
    }
}

//init map
shared static this()
{
    myMap = new shared(MyMap)();
}

void main()
{
    myMap.foo();
    myMap.bar();
}

Nov 14 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

So what happens when the "write" operation is doing

  map[1] = map[1].editData(5);

and at the same time 50 threads are simultaneously reading the 
value in map[1]?

Is that reassignment operation thread safe?
Or would I get corrupted reads with potentially a partially 
overwritten value?

Jacek

Nov 15 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Friday, 15 November 2013 at 15:21:59 UTC, Jacek Furmankiewicz 
wrote:
 So what happens when the "write" operation is doing

  map[1] = map[1].editData(5);

 and at the same time 50 threads are simultaneously reading the 
 value in map[1]?

 Is that reassignment operation thread safe?
 Or would I get corrupted reads with potentially a partially 
 overwritten value?

 Jacek

Yes, this is thread safe.
Pay attention to the `MyMap` class definition: it's marked as 
`synchronized`. That means all class functions use the same 
mutex. So the "write" operation will lock the map and all 50 
threads will wait.

Nov 15 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

So, if you add a read() method to MyMap for those threads, would 
that be synchronized as well?

That is what we would not want, due to the performance impact.

How can you achieve lock-free reads with the synchronized MyMap 
approach?

Nov 15 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Friday, 15 November 2013 at 16:36:56 UTC, Jacek Furmankiewicz 
wrote:
 How can you achieve lock-free reads with the synchronized MyMap 
 approach?

In this case you can use a readers-writer lock:
http://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock

It allows multiple concurrent reads and a single write. I think 
the easiest way is to use an OS-specific function, for example 
`pthread_rwlock_t` for POSIX. Note that D supports the C ABI, so 
you can call any C function from D.

I don't know of any D implementation of a readers-writer lock, but 
you can ask this question - maybe one already exists.

Nov 15 2013

"Dicebot" <public dicebot.lv> writes:

On Friday, 15 November 2013 at 17:03:15 UTC, ilya-stromberg wrote:
 I don't know any D implementation of Readers-writer lock, but 
 you can ask this question - maybe it already exist.

http://dlang.org/phobos/core_sync_rwmutex.html
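
A minimal sketch of how `core.sync.rwmutex.ReadWriteMutex` might be wrapped around a map so that many readers can proceed together while a writer gets exclusive access. The `RuleCache` class, its method names, and the fallback-value convention are hypothetical; a built-in associative array stands in for whatever container is actually used.

```d
import core.sync.rwmutex : ReadWriteMutex;
import core.thread : Thread;

/// Hypothetical cache guarded by a readers-writer lock.
class RuleCache
{
    private int[int] rules;
    private ReadWriteMutex lock;

    this() { lock = new ReadWriteMutex(); }

    /// Many threads may hold the reader side at the same time.
    int get(int key, int fallback)
    {
        synchronized (lock.reader)
        {
            auto p = key in rules;
            return p ? *p : fallback;
        }
    }

    /// The writer side excludes readers and other writers.
    void put(int key, int value)
    {
        synchronized (lock.writer)
        {
            rules[key] = value;
        }
    }
}

void main()
{
    auto cache = new RuleCache();
    cache.put(1, 10);

    // Several reader threads hitting the same cache instance.
    auto readers = new Thread[4];
    foreach (ref t; readers)
    {
        t = new Thread({ assert(cache.get(1, -1) == 10); });
        t.start();
    }
    foreach (t; readers)
        t.join();
}
```

Reads are still not lock-free: each one acquires the reader side of the mutex, so this reduces contention between readers rather than removing synchronization.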

Nov 15 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Friday, 15 November 2013 at 17:09:54 UTC, Dicebot wrote:
 On Friday, 15 November 2013 at 17:03:15 UTC, ilya-stromberg 
 wrote:
 I don't know any D implementation of Readers-writer lock, but 
 you can ask this question - maybe it already exist.

 http://dlang.org/phobos/core_sync_rwmutex.html

Thank you. I have just never used it.

Nov 15 2013

Russel Winder <russel winder.org.uk> writes:

On Fri, 2013-11-15 at 18:03 +0100, ilya-stromberg wrote:
 On Friday, 15 November 2013 at 16:36:56 UTC, Jacek Furmankiewicz 
 wrote:
 How can you achieve lock-free reads with the synchronized MyMap 
 approach?

 
 In this case you can use Readers-writer lock
 http://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock
 
 It allows multiple concurrent reads and a single write. I think 
 the easiest way is to use an OS-specific function, for example 
 `pthread_rwlock_t` for POSIX. Note that D supports the C ABI, so 
 you can call any C function from D.
 
 I don't know of any D implementation of a readers-writer lock, 
 but you can ask this question - maybe one already exists.

Sorry to come in late on this one (and miss everything that comes
before).

The trend in the JVM-verse is very much "if you use synchronized or an
explicit lock, and you are not creating a core library data structure,
you are doing it wrong". The background is that the whole purpose of a
lock is to control concurrency and thus stop parallelism. Application
programmers should never have to use a lock. ConcurrentHashMap and
thread-safe queues are two consequences of all this.

In the Go-verse the attitude is basically the same, you should use
channels and communications – the synchronization is managed by the data
structure.

If D programmers are being told to use locks in applications code, then
the D programming model and library are failing. Or the advice is
wrong ;-)

-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

Nov 15 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

Yes, that is what they say in Go...but it doesn't scale either. 
:-)

I had the exact same discussion on the Go forums a while back and 
the conclusion was basically the same...roll your own maps with 
RW locks:

https://groups.google.com/forum/?fromgroups#!searchin/golang-nuts/furmankiewicz/golang-nuts/jjjvXG4HdUw/ffWytKQ7X9YJ

But...at the end someone actually built lock-free data structures 
in Go out of this:

https://github.com/zond/gotomic

Nov 15 2013

Russel Winder <russel winder.org.uk> writes:

On Fri, 2013-11-15 at 18:55 +0100, Jacek Furmankiewicz wrote:
 Yes, that is what they say in Go...but it doesn't scale either. 
 :-)

I don't follow. CSP scales very well and Go implements CSP. (Well an
updated version from Hoare's 1978 CSP.) 

 I had the exact same discussion on the Go forums a while back and 
 the conclusion was basically the same...roll your own maps with 
 RW locks:
 
 https://groups.google.com/forum/?fromgroups#!searchin/golang-nuts/furmankiewicz/golang-nuts/jjjvXG4HdUw/ffWytKQ7X9YJ
 
 But...at the end someone actually built lock-free data structures 
 in Go out of this:
 
 https://github.com/zond/gotomic

This is, of course, how ConcurrentHashMap arrived in Java: Java didn't
have a shared-access, thread-safe map, so someone created it. Go has no
shared-access, thread-safe map and no one has created one that is in the
standard library.

Of course Java is a shared-memory multithreading language whereas Go is
a CSP one, so the idea of a shared access memory safe data structure is
actually anathema.

-- 
Russel.

Nov 15 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

If I recall from the initial Go discussion, the Go folks were 
saying that for close to realtime SLAs the goroutine/channel 
approach may have some scalability limits...which is why they 
started recommending the RW mutex approach in the end.

Now, that was a few months ago, since then Go 1.1 (and soon 1.2) 
came out, so that may be a false statement at this time.

Nov 15 2013

Russel Winder <russel winder.org.uk> writes:

On Fri, 2013-11-15 at 20:10 +0100, Jacek Furmankiewicz wrote:
 if I recall from the initial Go discussion, the Go folks were 
 saying that for close to realtime SLAs the goroutine/channel 
 approach may have some scalability limits...which is why they 
 started recommending the RW mutex approach in the end.
 
 Now, that was a few months ago, since then Go 1.1 (and soon 1.2) 
 came out, so that may be a false statement at this time.

I guess they were hinting at scheduling issues where there are many more
goroutines than kernel threads available. Not a problem for Web services
and applications but a potential problem for hard real-time. I don't
think 1.0, 1.1, 1.2,… will change the core issue, though they will
change the code generation, which is getting better. That said, gccgo
already produces very efficient code that executes much faster than the
output of the main Go system.

-- 
Russel.

Nov 15 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

Thank you, Russel, for the explanation.

Always a chance to learn something new.

Nov 15 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Friday, 15 November 2013 at 17:46:41 UTC, Russel Winder wrote:
 If D programmers are being told to use locks in applications 
 code, then
 the D programming model and library are failing. Or the advice 
 is
 wrong ;-)

It's possible to implement lock-free data structures in D; you 
can use core.atomic:
http://dlang.org/phobos/core_atomic.html

But it's REALLY difficult to implement, and it can be SLOWER than 
the mutex version (not only in D; it depends on the usage pattern).
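
As a small illustration of the building blocks `core.atomic` offers, here is a sketch of a lock-free counter, incremented once with `atomicOp` and once with an explicit compare-and-swap loop. It is deliberately not a lock-free map, which is far harder; the names `viaAtomicOp`, `viaCas`, and `incrementWithCas` are hypothetical.

```d
import core.atomic : atomicLoad, atomicOp, cas;
import core.thread : Thread;

shared int viaAtomicOp; // incremented with atomicOp
shared int viaCas;      // incremented with a compare-and-swap retry loop

/// Retry until no other thread changed the counter between our load and our swap.
void incrementWithCas(ref shared int counter)
{
    int seen;
    do
    {
        seen = atomicLoad(counter);
    } while (!cas(&counter, seen, seen + 1));
}

void main()
{
    auto workers = new Thread[4];
    foreach (ref t; workers)
    {
        t = new Thread({
            foreach (i; 0 .. 1000)
            {
                atomicOp!"+="(viaAtomicOp, 1);
                incrementWithCas(viaCas);
            }
        });
        t.start();
    }
    foreach (t; workers)
        t.join();

    // No increments were lost even though no mutex was taken.
    assert(atomicLoad(viaAtomicOp) == 4000);
    assert(atomicLoad(viaCas) == 4000);
}
```

The retry loop is where the "can be slower than a mutex" caveat comes from: under heavy contention, threads may spin through many failed swaps.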

Nov 15 2013

Russel Winder <russel winder.org.uk> writes:

On Fri, 2013-11-15 at 19:05 +0100, ilya-stromberg wrote:
 On Friday, 15 November 2013 at 17:46:41 UTC, Russel Winder wrote:
 If D programmers are being told to use locks in applications 
 code, then
 the D programming model and library are failing. Or the advice 
 is
 wrong ;-)

 
 It's possible to implement lock-free data structures in D, you 
 can use core.atomic
 http://dlang.org/phobos/core_atomic.html
 
 But it's REALLY difficult to implement, and it can be SLOWER than 
 the mutex version (not only in D; it depends on the usage pattern).

I didn't intend to imply that core data structures had to be lock free;
it is clear that creators of thread- and process-safe data structures
should be free to use locks if it makes things faster and more
efficient. My point was about applications built on the language
platform: the platform should provide all the things needed so that
application code never mentions locks.
 
-- 
Russel.

Nov 15 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

On Friday, 15 November 2013 at 17:46:41 UTC, Russel Winder wrote:
 The trend in the JVM-verse is very much "if you use 
 synchronized or an
 explicit lock, and you are not creating a core library data 
 structure,
 you are doing it wrong". The background is that the whole 
 purpose of a
 lock is to control concurrency and thus stop parallelism. 
 Applications
 programmers should never have to use a lock. ConcurrentHashMap, 
 and
 thread safe queues are two consequences of all this.

True, concurrency in Java is really simple these days (especially 
with the Executors framework that Python 3 pretty much copies 
verbatim).

taskPool looks like the closest equivalent in D that I could find.
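
A brief sketch of how `std.parallelism` can play the role of an executor: submit a unit of work and wait for its result, or map a function over a range in parallel. The `square` function is a hypothetical stand-in for real work.

```d
import std.parallelism : task, taskPool;

/// Hypothetical unit of work.
int square(int x)
{
    return x * x;
}

void main()
{
    // Submit one task to the default pool and wait for its result,
    // similar in spirit to ExecutorService.submit(...).get().
    auto t = task!square(7);
    taskPool.put(t);
    assert(t.yieldForce == 49);

    // Apply a function to every element in parallel; results keep their order.
    auto squares = taskPool.amap!square([1, 2, 3, 4]);
    assert(squares == [1, 4, 9, 16]);
}
```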

Nov 15 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Friday, 15 November 2013 at 18:16:17 UTC, Jacek Furmankiewicz 
wrote:
 taskPool looks like the closest equivalent in D that I could 
 find.

Yes, that's the sad truth: if you want to use D, be ready to make 
something yourself.

BTW, why did you decide to migrate to D? Any problems with Java?

Nov 15 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

No, we didn't decide to migrate to D. Java is working out fine 
for us.

I am however always interested in what is out there, 'cause you 
never know if there may not be a better solution.

And from what I've seen so far I really like D in terms of pure 
language features.

Go is cool too, but it has made some choices which to me are 
questionable (error codes instead of exceptions, lack of 
templates/generics).

Coupled with vibe.d, dub, etc. I see some really interesting 
stuff going on in the D community that seems to have been greatly 
under the radar.

Definitely plan to spend more time with D on my own, even if I 
cannot use it at work.

Nov 15 2013

"Dicebot" <public dicebot.lv> writes:

On Friday, 15 November 2013 at 17:46:41 UTC, Russel Winder wrote:
 If D programmers are being told to use locks in applications 
 code, then
 the D programming model and library are failing. Or the advice 
 is
 wrong ;-)

I don't really buy it. It is good from a simplicity/safety point of 
view (just use library stuff and your code is thread-safe) but 
not for performance. Back in my C++ days we almost always resorted 
to writing our own concurrent data structures, exploiting domain 
specifics and application architecture as much as possible to 
minimize how often locks were actually taken. And most of 
those solutions were completely unsuitable as generic ones.

Nov 15 2013

Charles Hixson <charleshixsn earthlink.net> writes:

On 11/14/2013 01:36 PM, ilya-stromberg wrote:
 On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek Furmankiewicz wrote:
 hashmap per thread is not an option. The cache may be a few GBs of 
 data, there is no way we can duplicate that data per thread.

 Not to mention the start up time when we have to warm up the cache.

 How often do you change the data? Probably, you should use `immutable` 
 variables.

Immutable variables are nice when they can be used.  Often, however, 
they can't.

I think that for the "concurrent hashmap" the best answer is probably to 
run the map in a thread, with message passing access whether for read or 
write.  And I wouldn't be surprised if that's how Java's concurrent 
hashmap is implemented under the covers. (OTOH, I haven't ever debugged 
such a setup.  Someone who has may have a better answer.)

-- 
Charles Hixson

Nov 14 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

15-Nov-2013 03:35, Charles Hixson wrote:
 On 11/14/2013 01:36 PM, ilya-stromberg wrote:
 On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek Furmankiewicz wrote:
 hashmap per thread is not an option. The cache may be a few GBs of
 data, there is no way we can duplicate that data per thread.

 Not to mention the start up time when we have to warm up the cache.

 How often do you change the data? Probably, you should use `immutable`
 variables.

 Immutable variables are nice when they can be used.  Often, however,
 they can't.

 I think that for the "concurrent hashmap" the best answer is probably to
 run the map in a thread, with message passing access whether for read or
 write.

Would be slow unless batched. At least in D, message passing involves 
locking/unlocking a queue of messages. Sending a request and getting a 
reply costs 2 lock-wait-unlock cycles and the corresponding context 
switches.

 And I wouldn't be surprised if that's how Java's concurrent
 hashmap is implemented under the covers. (OTOH, I haven't ever debugged
 such a setup.  Someone who has may have a better answer.)

As stated in Oracle's documentation somewhere it's implemented with fine 
grained locking (a lock per bucket of a hash map), some operations still 
lock the whole map. Rehashing still locks the whole thing I bet.
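
The fine-grained locking idea can be sketched in D as a fixed number of independently locked shards. This is an illustration of the general technique only, not a description of how Java's ConcurrentHashMap is actually built; `StripedMap`, the stripe count of 16, and the fallback-value convention are all assumptions.

```d
import core.sync.mutex : Mutex;

/// Hypothetical map split into independently locked shards ("lock striping").
final class StripedMap
{
    private enum stripes = 16;
    private int[int][stripes] shards; // 16 separate associative arrays
    private Mutex[stripes] locks;     // one mutex per shard

    this()
    {
        foreach (ref m; locks)
            m = new Mutex;
    }

    void put(int key, int value)
    {
        immutable i = cast(uint) key % stripes; // pick the shard for this key
        synchronized (locks[i])
            shards[i][key] = value;
    }

    int get(int key, int fallback)
    {
        immutable i = cast(uint) key % stripes;
        synchronized (locks[i])
        {
            auto p = key in shards[i];
            return p ? *p : fallback;
        }
    }
}

void main()
{
    auto map = new StripedMap();
    map.put(1, 10);
    map.put(17, 170); // same shard as key 1 (17 % 16 == 1), different entry
    assert(map.get(1, -1) == 10);
    assert(map.get(17, -1) == 170);
    assert(map.get(2, -1) == -1);
}
```

Threads working on keys in different shards never contend with each other; threads on the same shard still serialize, including readers.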

-- 
Dmitry Olshansky

Nov 15 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Jacek Furmankiewicz:

 hashmap per thread is not an option. The cache may be a few GBs 
 of data, there is no way we can duplicate that data per thread.

But is the D garbage collector able to manage associative arrays 
of a few gigabytes efficiently enough? You are not dealing with a 
GC nearly as efficient as the Java VM one.

Bye,
bearophile

Nov 14 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

On Thursday, 14 November 2013 at 21:39:53 UTC, bearophile wrote:
 Jacek Furmankiewicz:

 hashmap per thread is not an option. The cache may be a few 
 GBs of data, there is no way we can duplicate that data per 
 thread.

 But is the D garbage collector able to manage associative arrays 
 of a few gigabytes efficiently enough? You are not dealing with 
 a GC nearly as efficient as the Java VM one.

Well, these are the types of questions I have as a Java veteran
who is having a first look at D after the recent Facebook 
announcement.

By now I have a decent idea of where most of the new languages 
(Go has the same issues, for the most part) come up short when 
compared to Java's very mature SDK, so that is usually where I 
start probing first.

Sorry :-(

Nov 14 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Jacek Furmankiewicz:

 Well, these are the types of questions I have as a Java veteran
 who is having a first look at D after the recent Facebook 
 announcement.

 By now I have a decent idea of where most of the new languages 
 (Go has the same issues, for the most part) come up short when 
 compared to Java's very mature SDK, so that is usually where I 
 start probing first.

 Sorry :-(

The development of the Java language, its GC, the Oracle JVM, and the 
standard library (and its IDEs, etc.) has received tons of money, 
time, and hours of work, so it's not strange that Java is "better" 
than D.

On the other hand different languages are fitter for different 
purposes. I like D a lot, but programmers should choose languages 
wisely, and Java is a wiser choice for several commercial 
purposes. If you rewrite Minecraft from Java to D I suspect you 
would produce a game that's faster and has shorter source code, 
while keeping most of its programmer-friendly nature and its 
coding safety, despite the current limits of the D GC.

If you want to use D try to find niches where it could be useful 
and fit. I am using D where it's better than equivalent Java 
code. Today Python is used a lot, but in many cases it's not 
replacing equivalent Java code.

Probably you can replace some Java code with Scala code.

Bye,
bearophile

Nov 14 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

True.

While looking at D, I am just trying to focus on the parts which I 
know would be a showstopper for us on day one...and this 
particular issue is it.

I do like D a lot as well from what I've seen so far.

Regarding the GC, I've seen some slides from DConf about other 
garbage collectors available. Is there any resource/tutorial that 
shows how you can swap out the default GC for those alternative 
implementations?

Nov 14 2013

"qznc" <qznc web.de> writes:

On Thursday, 14 November 2013 at 23:10:58 UTC, Jacek 
Furmankiewicz wrote:
 While looking at D, I am just trying to focus on the parts which 
 I know would be a showstopper for us on day one...and this 
 particular issue is it.

Yes, I also think for long-running memory-hungry server-stuff the 
current conservative GC is a show stopper. Some people are 
working on a concurrent and a precise GC. Then parallel and 
concurrent and incremental and generational and whatnot GCs will 
be explored. The point where it gets interesting for these 
server-apps is not clear though.

 Regarding the GC, I've seen some slides from DConf about other 
 garbage collectors available. Is there any resource/tutorial 
 that shows how you can swap out the default GC for those 
 alternative implementations?

As far as I know those other GCs are not ready for prime time 
yet. For sure, there is no other GC shipped with the current D 
2.064.2.

Oh, and keep nagging! We need to hear about showstoppers, so we 
can fix them! ;)

Nov 15 2013

"Jacek Furmankiewicz" <jacek99 gmail.com> writes:

So....how does Facebook handle it with their new D code?

No GC at all, explicit memory management?

Nov 15 2013

"SomeDude" <lovelydear mailmetrash.com> writes:

On Friday, 15 November 2013 at 22:22:32 UTC, Jacek Furmankiewicz 
wrote:
 So....how does Facebook handle it with their new D code?

 No GC at all, explicit memory management?

AFAWK, Facebook doesn't use D for its core business yet, only for 
building tools. OTOH, Andrei has been working hard on memory 
allocators, so maybe that's one idea that they are digging.

Nov 16 2013

"lomereiter" <lomereiter gmail.com> writes:

On Thursday, 14 November 2013 at 17:36:09 UTC, Jacek
Furmankiewicz wrote:
 Could anyone point me to what would be the closest D 
 equivalents (maybe in an external library if not part of 
 Phobos) so we can play around with them?

 Much appreciated
 Jacek

In such cases the easiest route is to find some C/C++ library for
such tasks, make a C interface in the latter case, and link with
it. That would require a bit of extra work but much less than
writing your own performant implementation from scratch.
E.g. I once wrote a simple wrapper for the Kyoto Cabinet
key-value store:
https://github.com/lomereiter/kyoto-d/blob/master/kyotocabinet.d
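
The general shape of such a wrapper, as a minimal sketch: declare the C
function with extern (C), then put a small D-friendly function on top of it.
To keep the example runnable without installing anything, it binds strlen
from the C runtime as a stand-in; it does not show the Kyoto Cabinet API, and
cLength is a hypothetical helper name.

```d
import std.stdio : writeln;
import std.string : toStringz;

// Hand-written binding: the function itself lives in the C runtime.
extern (C) size_t strlen(const(char)* s);

// D-friendly wrapper: takes a D string and handles the conversion to a
// zero-terminated C string before calling into C.
size_t cLength(string s)
{
    return strlen(s.toStringz);
}

void main()
{
    writeln(cLength("hello")); // prints 5
}
```

For a C++ library the extra step is the one mentioned above: expose a flat C
interface first, then bind to that interface in the same way.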

Nov 15 2013
