digitalmars.D.learn - What is the closest to ConcurrentHashMap and NavigableMap in Java?

reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
In our Java code, we make heavy use of ConcurrentHashMap for 
in-memory caches:

http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html

A background thread wakes up every X seconds and resyncs the cache 
with whatever changes occurred in the DB. In the meantime, all the 
incoming requests can read from the same cache without any 
concurrency issues and with excellent performance.

All the major Java Map implementations also support the 
NavigableMap interface:

http://docs.oracle.com/javase/7/docs/api/java/util/NavigableMap.html

which allows you to get the key closest to the one you are 
looking for (either up or down). This is very useful when you 
have hierarchies of business rules that have effective date 
ranges and you are trying to find which rule is active on a 
particular day.
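For comparison, here is a minimal sketch of that "closest key" lookup in D, assuming the keys are kept in a sorted array and searched with `std.range.assumeSorted`. The helper names `floorKey` and `ceilingKey` and the yyyymmdd date encoding are made up for the illustration; they only mirror the Java method names and are not a Phobos API.

```d
import std.range : assumeSorted;

/// Greatest key <= wanted, or fallback if there is none.
int floorKey(const(int)[] sortedKeys, int wanted, int fallback)
{
    // upperBound yields the elements strictly greater than `wanted`.
    auto after = assumeSorted(sortedKeys).upperBound(wanted);
    auto count = sortedKeys.length - after.length;
    return count == 0 ? fallback : sortedKeys[count - 1];
}

/// Smallest key >= wanted, or fallback if there is none.
int ceilingKey(const(int)[] sortedKeys, int wanted, int fallback)
{
    // lowerBound yields the elements strictly less than `wanted`.
    auto before = assumeSorted(sortedKeys).lowerBound(wanted);
    return before.length == sortedKeys.length
        ? fallback
        : sortedKeys[before.length];
}

void main()
{
    // Hypothetical effective dates, encoded as yyyymmdd integers.
    int[] effectiveFrom = [20130101, 20130601, 20131101];

    assert(floorKey(effectiveFrom, 20130715, -1) == 20130601);
    assert(ceilingKey(effectiveFrom, 20130715, -1) == 20131101);
    assert(floorKey(effectiveFrom, 20121231, -1) == -1);
}
```

A sorted array suits data that is rebuilt in bulk. For frequent single inserts, `std.container.RedBlackTree` offers `lowerBound` and `upperBound` as well. Neither is thread-safe on its own.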

I read up the chapter on associative arrays in D, but I do not 
see anything similar to this functionality in there.

Could anyone point me to what would be the closest D equivalents 
(maybe in an external library if not part of Phobos) so we can 
play around with them?

Much appreciated
Jacek
Nov 14 2013
next sibling parent reply "TheFlyingFiddle" <theflyingfiddle gmail.com> writes:
D does not have a lot of different containers at the moment. This work has been postponed until a working version of std.allocators is implemented. So, at least in the Phobos library, you will not find what you are looking for right now.
Nov 14 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
So how do existing D applications (especially the high perf ones 
in let's say the financial sector) deal with having some part of 
the data in memory and keeping it in sync with the DB source?

This must be a very common requirement for any app with realtime 
or close-to-realtime SLAs.

Is there a different idiom or approach in D?
Nov 14 2013
next sibling parent "TheFlyingFiddle" <theflyingfiddle gmail.com> writes:
On Thursday, 14 November 2013 at 18:08:22 UTC, Jacek 
Furmankiewicz wrote:
 So how do existing D applications (especially the high perf 
 ones in let's say the financial sector) deal with having some 
 part of the data in memory and keeping it in sync with the DB 
 source?
Good question. I have no idea.
 is there a different idiom or approach in D?
Well, in D the preferred way (or the one recommended by TDPL) to share resources between threads is message passing. So in this case a single thread would be responsible for caching the data in memory, and other threads would ask this thread for data by sending it messages. Whether this is faster or better than Java's ConcurrentHashMap, I am not sure.
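A minimal sketch of that idea with `std.concurrency`, assuming an `int[string]` cache and a made-up message protocol: a `(string, int)` message stores a value, a `(Tid, string)` message asks for one, and the `Stop` struct and the -1 "miss" reply are conventions invented for this example.

```d
import std.concurrency : receive, receiveOnly, send, spawn, thisTid, Tid;

// Hypothetical shutdown message for this sketch.
struct Stop { }

// The only thread that ever touches the associative array.
void cacheOwner()
{
    int[string] cache;
    bool running = true;

    while (running)
    {
        receive(
            // "put": store a value
            (string key, int value) { cache[key] = value; },
            // "get": reply to the asking thread; -1 means "not found"
            (Tid replyTo, string key) {
                auto found = key in cache;
                send(replyTo, found ? *found : -1);
            },
            (Stop s) { running = false; }
        );
    }
}

void main()
{
    auto owner = spawn(&cacheOwner);

    send(owner, "answer", 42);          // write
    send(owner, thisTid, "answer");     // read request
    assert(receiveOnly!int() == 42);    // read reply

    send(owner, thisTid, "missing");
    assert(receiveOnly!int() == -1);

    send(owner, Stop());
}
```

Every read costs a round trip through two mailboxes, so this trades raw lookup speed for not having to reason about shared state.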
Nov 14 2013
prev sibling parent "JN" <666total wp.pl> writes:
On Thursday, 14 November 2013 at 18:08:22 UTC, Jacek 
Furmankiewicz wrote:
 So how do existing D applications (especially the high perf 
 ones in let's say the financial sector) deal with having some 
 part of the data in memory and keeping it in sync with the DB 
 source?
They don't, because there aren't really such apps in the wild yet.
Nov 14 2013
prev sibling next sibling parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Thursday, 14 November 2013 at 17:36:09 UTC, Jacek 
Furmankiewicz wrote:
 In our Java code, we make heavy use of ConcurrentHashMap for 
 in-memory caches:
Try looking at dcollections: http://www.dsource.org/projects/dcollections
Also, vibe.d has its own hashmap: https://github.com/rejectedsoftware/vibe.d/blob/master/source/vibe/utils/hashmap.d
Nov 14 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
Thanks for the links.

I looked at the dcollections docs, but none of their collections 
seem to be thread-safe. The vibe.d one, I guess, is not either, 
because it is meant to be used from async I/O in a single 
thread...but once you add multi-threading to an app I am guessing 
it would not be usable.
Nov 14 2013
parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Thursday, 14 November 2013 at 20:00:10 UTC, Jacek 
Furmankiewicz wrote:
 I looked at the dcollections docs, but none of their 
 collections seem thread safe. The vibe.d I guess is because it 
 is meant to be used from async I/O in a single thread...but 
 once you add multi-threading to an app I am guessing it would 
 not be usable.
No, you can:

1) Use a different hashmap per thread. I don't know your situation, but this may be possible for a read-only cache, like this:

    import vibe.utils.hashmap;

    // module-level variables are thread-local by default in D
    HashMap!(int, int) map;

    void foo()
    {
        //use map
        map[1] = 1;
    }

2) Use the `shared` storage class and a mutex, like this:

    import vibe.utils.hashmap;

    shared HashMap!(int, int) map;

    void foo()
    {
        synchronized
        {
            //use map
            map[1] = 1;
        }
    }
Nov 14 2013
next sibling parent reply "TheFlyingFiddle" <theflyingfiddle gmail.com> writes:
 2) Use `shared` storage class and mutex like this:

 import vibe.utils.hashmap;

 shared HashMap!(int, int) map;

 void foo()
 {
    synchronized
    {
       //use map
       map[1] = 1;
    }
 }
Locking every time you use the map doesn't really seem reasonable. It's not particularly fast, and you might forget to lock the map at some point (or does the shared modifier not allow you to do this in D?).

I'm not that familiar with the synchronized statement, but shouldn't it be locked on some object?

    void bar()
    {
        //Can one thread be in this block...
        synchronized
        {
            map[1] = 1;
        }

        //... while another thread is in this block?
        synchronized
        {
            map[2] = 2;
        }
    }

If that is the case, are you not limited in the way you can update the map, e.g. only in a single block?
Nov 14 2013
parent "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Thursday, 14 November 2013 at 21:16:15 UTC, TheFlyingFiddle 
wrote:

 If that is the case are you not limited in the way you can 
 update the map eg only in a single block?
Yes, it's probably not the best example. It's valid only if you have a single synchronized block for the map. But you can use something like this:

    void bar()
    {
        // note: synchronized(expr) locks an object's monitor,
        // so `map` has to be a class instance here
        synchronized(map)
        {
            map[1] = 1;
        }

        synchronized(map)
        {
            map[2] = 2;
        }
    }

Or this:

    //Note: valid only if you have 1 function that uses map
    synchronized void bar()
    {
        map[1] = 1;
        map[2] = 2;
    }

Or this:

    shared MyMap myMap;

    synchronized class MyMap
    {
        HashMap!(int, int) map;

        void foo()
        {
            map[1] = 1;
        }

        void bar()
        {
            map[2] = 2;
        }
    }

    //init map
    shared static this()
    {
        myMap = new shared MyMap();
    }

This last one is probably the best example.
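The point about every block needing the same lock can be sketched with an explicit `core.sync.mutex.Mutex`. This is only an illustration under simple assumptions: a built-in associative array instead of vibe.d's HashMap, `__gshared` globals, and made-up function names `writer` and `total`.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

__gshared int[int] map;    // visible to all threads; guarded only by mapLock
__gshared Mutex mapLock;   // the one lock every access must go through

shared static this()
{
    mapLock = new Mutex();
}

void writer()
{
    foreach (i; 0 .. 1000)
    {
        // Every block that touches `map` names the same lock object,
        // so no two of them can run at the same time.
        synchronized (mapLock)
        {
            if (auto count = (i % 10) in map)
                ++*count;
            else
                map[i % 10] = 1;
        }
    }
}

int total()
{
    synchronized (mapLock)
    {
        int sum = 0;
        foreach (count; map)
            sum += count;
        return sum;
    }
}

void main()
{
    Thread[] threads;
    foreach (t; 0 .. 4)
    {
        threads ~= new Thread(&writer);
        threads[$ - 1].start();
    }
    foreach (thread; threads)
        thread.join();

    assert(total() == 4000);
}
```

Nothing in the type system stops code from touching `map` without taking `mapLock`. That is the "you might forget to lock" risk raised above, and the reason the `synchronized class` variant is attractive.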
Nov 14 2013
prev sibling parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
hashmap per thread is not an option. The cache may be a few GBs 
of data, there is no way we can duplicate that data per thread.

Not to mention the start up time when we have to warm up the 
cache.
Nov 14 2013
next sibling parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek 
Furmankiewicz wrote:
 hashmap per thread is not an option. The cache may be a few GBs 
 of data, there is no way we can duplicate that data per thread.

 Not to mention the start up time when we have to warm up the 
 cache.
How often do you change the data? Probably, you should use `immutable` variables.
Nov 14 2013
next sibling parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
On Thursday, 14 November 2013 at 21:36:46 UTC, ilya-stromberg 
wrote:
 On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek 
 Furmankiewicz wrote:
 How often do you change the data? Probably, you should use 
 `immutable` variables.
Customer specific. It may change once a year. It may change multiple times per second for a while, then nothing again for weeks. Others may do mass loads of business rules, hence do mass changes every few hours. Next to impossible to predict.
Nov 14 2013
parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
You can use `immutable` variables. It allows you to share the data without any synchronization. Like this:

    class MyData
    {
        int data1;
        string data2;

        //creates new object
        //(pure, so that `new immutable MyData(...)` below is allowed)
        this(int data1, string data2) pure
        {
            this.data1 = data1;
            this.data2 = data2;
        }

        //modify the data
        immutable(MyData) editData(int i) const
        {
            //copy this object - we can't change immutable variables
            MyData dataCopy = new MyData(this.data1, this.data2);

            //modify the data copy
            dataCopy.data1 += i;

            //assume that `dataCopy` is immutable
            return cast(immutable(MyData)) dataCopy;
        }
    }

    shared MyMap myMap;

    //map implementation
    synchronized class MyMap
    {
        HashMap!(int, immutable(MyData)) map;

        void foo()
        {
            map[1] = new immutable MyData(1, "data");
        }

        void bar()
        {
            map[1] = map[1].editData(5);
        }
    }

    //init map
    shared static this()
    {
        myMap = new shared MyMap();
    }

    void main()
    {
        myMap.foo();
        myMap.bar();
    }
Nov 14 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
So what happens when the "write" operation is doing

    map[1] = map[1].editData(5);

and at the same time 50 threads are simultaneously reading the value in map[1]?

Is that reassignment operation thread safe? Or would I get corrupted reads with a potentially partially overwritten value?

Jacek
Nov 15 2013
parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Friday, 15 November 2013 at 15:21:59 UTC, Jacek Furmankiewicz 
wrote:
 So what happens when the "write" operation is doing

  map[1] = map[1].editData(5);

 and at the same time 50 threads are simultaneously reading the 
 value in map[1]?

 Is that reassignment operation thread safe?
 Or would I get corrupted reads with a potentially partially 
 overwritten value?

 Jacek
Yes, this is thread safe. Note the `MyMap` class definition: it is marked `synchronized`, which means that all of the class's methods lock the same mutex. So the "write" operation will lock the map, and all 50 reader threads will wait until it finishes.
Nov 15 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
So, if you add a read() method to MyMap for those threads, would 
that be synchronized as well?

That is what we would not want, due to the performance impact.

How can you achieve lock-free reads with the synchronized MyMap 
approach?
Nov 15 2013
parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Friday, 15 November 2013 at 16:36:56 UTC, Jacek Furmankiewicz 
wrote:
 How can you achieve lock-free reads with the synchronized MyMap 
 approach?
In this case you can use a readers-writer lock: http://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock
It allows multiple concurrent readers and a single writer. I think the easiest way is to use an OS-specific facility, for example `pthread_rwlock_t` on POSIX. Note that D supports the C ABI, so you can call any C function from D. I don't know of any D implementation of a readers-writer lock, but you can ask; maybe one already exists.
Nov 15 2013
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 15 November 2013 at 17:03:15 UTC, ilya-stromberg wrote:
 I don't know any D implementation of Readers-writer lock, but 
 you can ask this question - maybe it already exist.
http://dlang.org/phobos/core_sync_rwmutex.html
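A minimal sketch of how that module might be used for the cache discussed here. The `RuleCache` class and its method names are hypothetical; only `ReadWriteMutex` and its `reader` and `writer` properties come from druntime.

```d
import core.sync.rwmutex : ReadWriteMutex;

// Hypothetical cache wrapper; the names are illustrative only.
final class RuleCache
{
    private int[string] rules;
    private ReadWriteMutex lock;

    this()
    {
        lock = new ReadWriteMutex();
    }

    // Any number of threads may hold the reader lock at once.
    int get(string key, int fallback)
    {
        synchronized (lock.reader)
        {
            auto found = key in rules;
            return found ? *found : fallback;
        }
    }

    // The writer lock is exclusive: it waits for readers to leave
    // and keeps new ones out while the update runs.
    void put(string key, int value)
    {
        synchronized (lock.writer)
        {
            rules[key] = value;
        }
    }
}

void main()
{
    // Single-threaded demo. To use one instance from several threads it
    // would have to be stored in a __gshared or shared variable.
    auto cache = new RuleCache();
    cache.put("discount", 10);

    assert(cache.get("discount", -1) == 10);
    assert(cache.get("missing", -1) == -1);
}
```

Readers still acquire a (shared) lock here, so this reduces contention between readers but is not a lock-free read path.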
Nov 15 2013
parent "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Friday, 15 November 2013 at 17:09:54 UTC, Dicebot wrote:
 On Friday, 15 November 2013 at 17:03:15 UTC, ilya-stromberg 
 wrote:
 I don't know any D implementation of Readers-writer lock, but 
 you can ask this question - maybe it already exist.
 http://dlang.org/phobos/core_sync_rwmutex.html
Thank you. I have just never used it.
Nov 15 2013
prev sibling parent reply Russel Winder <russel winder.org.uk> writes:
Sorry to come in late on this one (and miss everything that comes before).

The trend in the JVM-verse is very much "if you use synchronized or an explicit lock, and you are not creating a core library data structure, you are doing it wrong". The background is that the whole purpose of a lock is to control concurrency and thus stop parallelism. Applications programmers should never have to use a lock. ConcurrentHashMap and thread-safe queues are two consequences of all this.

In the Go-verse the attitude is basically the same: you should use channels and communications, and the synchronization is managed by the data structure.

If D programmers are being told to use locks in applications code, then the D programming model and library are failing. Or the advice is wrong ;-)

--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Nov 15 2013
next sibling parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
Yes, that is what they say in Go...but it doesn't scale either. 
:-)

I had the exact same discussion on the Go forums a while back and 
the conclusion was basically the same...roll your own maps with 
RW locks:

https://groups.google.com/forum/?fromgroups#!searchin/golang-nuts/furmankiewicz/golang-nuts/jjjvXG4HdUw/ffWytKQ7X9YJ

But...at the end someone actually built lock-free data structures 
in Go out of this:

https://github.com/zond/gotomic
Nov 15 2013
parent reply Russel Winder <russel winder.org.uk> writes:
On Fri, 2013-11-15 at 18:55 +0100, Jacek Furmankiewicz wrote:
 Yes, that is what they say in Go...but it doesn't scale either. 
 :-)
I don't follow. CSP scales very well and Go implements CSP. (Well an updated version from Hoare's 1978 CSP.)
 I had the exact same discussion on the Go forums a while back and 
 the conclusion was basically the same...roll your own maps with 
 RW locks:
 
 https://groups.google.com/forum/?fromgroups#!searchin/golang-nuts/furmankiewicz/golang-nuts/jjjvXG4HdUw/ffWytKQ7X9YJ
 
 But...at the end someone actually built lock-free data structures 
 in Go out of this:
 
 https://github.com/zond/gotomic
This is, of course, how ConcurrentHashMap arrived in Java: Java didn't have a shared access, thread safe map, so someone created it. Go has no shared access, thread safe map, and no-one has created one that is in the standard library. Of course Java is a shared-memory multithreading language whereas Go is a CSP one, so the idea of a shared access memory safe data structure is actually anathema.

--
Russel.
Nov 15 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
If I recall from the initial Go discussion, the Go folks were 
saying that for close to realtime SLAs the goroutine/channel 
approach may have some scalability limits...which is why they 
started recommending the RW mutex approach in the end.

Now, that was a few months ago, since then Go 1.1 (and soon 1.2) 
came out, so that may be a false statement at this time.
Nov 15 2013
parent reply Russel Winder <russel winder.org.uk> writes:
On Fri, 2013-11-15 at 20:10 +0100, Jacek Furmankiewicz wrote:
 if I recall from the initial Go discussion, the Go folks were 
 saying that for close to realtime SLAs the goroutine/channel 
 approach may have some scalability limits...which is why they 
 started recommending the RW mutex approach in the end.
 
 Now, that was a few months ago, since then Go 1.1 (and soon 1.2) 
 came out, so that may be a false statement at this time.
I guess they were hinting at scheduling issues where there are many more goroutines than kernel threads available. Not a problem for Web services and applications, but a potential problem for hard real-time. I don't think 1.0, 1.1, 1.2,... will change the core issue, though it will change the code generation, which is getting better. Though gccgo already produces very efficient code, with much faster execution than the main Go system.

--
Russel.
Nov 15 2013
parent "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
Thank you Russel for the explanation.

Always a chance to learn something new.
Nov 15 2013
prev sibling next sibling parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Friday, 15 November 2013 at 17:46:41 UTC, Russel Winder wrote:
 If D programmers are being told to use locks in applications 
 code, then
 the D programming model and library are failing. Or the advice 
 is
 wrong ;-)
It's possible to implement lock-free data structures in D; you can use core.atomic: http://dlang.org/phobos/core_atomic.html
But it's REALLY difficult to implement, and it can be SLOWER than the Mutex version (not only in D; it depends on the usage situation).
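To give a flavour of the building blocks, here is a small sketch using `atomicOp`, `atomicLoad` and `cas` from core.atomic. It is only a pair of lock-free counters, far from a lock-free hash map, and the names `hits`, `maxSeen` and `recordHit` are made up for the example.

```d
import core.atomic : atomicLoad, atomicOp, cas;
import core.thread : Thread;

shared int hits;      // incremented atomically, no mutex involved
shared int maxSeen;   // updated with a compare-and-swap retry loop

void recordHit(int value)
{
    atomicOp!"+="(hits, 1);

    int current;
    do
    {
        current = atomicLoad(maxSeen);
        if (value <= current)
            return;                      // nothing to update
        // cas fails if another thread changed maxSeen in between;
        // in that case re-read and try again.
    } while (!cas(&maxSeen, current, value));
}

void worker()
{
    foreach (i; 0 .. 1000)
        recordHit(i);
}

void main()
{
    Thread[] threads;
    foreach (t; 0 .. 4)
    {
        threads ~= new Thread(&worker);
        threads[$ - 1].start();
    }
    foreach (thread; threads)
        thread.join();

    assert(atomicLoad(hits) == 4000);
    assert(atomicLoad(maxSeen) == 999);
}
```

The retry loop is the typical shape of lock-free code, and it hints at why a full map with resizing is so much harder to get right than a mutex-based one.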
Nov 15 2013
parent Russel Winder <russel winder.org.uk> writes:
On Fri, 2013-11-15 at 19:05 +0100, ilya-stromberg wrote:
 It's possible to implement lock-free data structures in D, you can
 use core.atomic http://dlang.org/phobos/core_atomic.html
 But it's REALLY difficult to implement and it can be SLOWER than
 the Mutex version (not only in D, it depends on the usage situation).
I didn't intend to imply that core data structures had to be lock free; it is clear that creators of thread and process safe data structures should be free to use locks if it makes things faster and more efficient. My point was about applications built on the language platform: the platform should provide all the things needed so that applications code never mentions locks.

--
Russel.
Nov 15 2013
prev sibling next sibling parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
On Friday, 15 November 2013 at 17:46:41 UTC, Russel Winder wrote:
 The trend in the JVM-verse is very much "if you use 
 synchronized or an
 explicit lock, and you are not creating a core library data 
 structure,
 you are doing it wrong". The background is that the whole 
 purpose of a
 lock it to control concurrency and thus stop parallelism. 
 Applications
 programmers should never have to use a lock. ConcurrentHashMap, 
 and
 thread safe queues are two consequences of all this.
True, concurrency in Java is really simple these days (especially with the Executors framework that Python 3 pretty much copies verbatim). taskPool looks like the closest equivalent in D that I could find.
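For reference, a minimal sketch of what `taskPool` from std.parallelism offers, loosely comparable to submitting work to an executor and collecting the results. The function `slowSquare` is a stand-in invented for the example.

```d
import std.parallelism : task, taskPool;
import std.range : iota;

// Stand-in for some expensive computation.
int slowSquare(int x)
{
    return x * x;
}

void main()
{
    // Submit one unit of work and collect the result later,
    // roughly like a Future in Java.
    auto job = task!slowSquare(7);
    taskPool.put(job);
    assert(job.yieldForce == 49);

    // Eager parallel map over a range; the result is an array.
    auto squares = taskPool.amap!slowSquare(iota(1, 6));
    assert(squares == [1, 4, 9, 16, 25]);

    // Parallel reduction.
    assert(taskPool.reduce!"a + b"(iota(1, 101)) == 5050);
}
```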
Nov 15 2013
parent reply "ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:
On Friday, 15 November 2013 at 18:16:17 UTC, Jacek Furmankiewicz 
wrote:
 taskPool looks like the closest equivalent in D that I could 
 find.
Yes, that's the sad truth: if you want to use D, be ready to make some things yourself. BTW, why did you decide to migrate to D? Any problems with Java?
Nov 15 2013
parent "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
No, we didn't decide to migrate to D. Java is working out fine 
for us.

I am however always interested in what is out there, 'cause you 
never know if there may not be a better solution.

And from what I've seen so far I really like D in terms of pure 
language features.

Go is cool too, but it has made some choices which to me are 
questionable (error codes instead of exceptions, lack of 
templates/generics).

Coupled with vibe.d, dub, etc. I see some really interesting 
stuff going on in the D community that seems to have been greatly 
under the radar.

Definitely plan to spend more time with D on my own, even if I 
cannot use it at work.
Nov 15 2013
prev sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 15 November 2013 at 17:46:41 UTC, Russel Winder wrote:
 If D programmers are being told to use locks in applications 
 code, then
 the D programming model and library are failing. Or the advice 
 is
 wrong ;-)
I don't really buy it. It is good from a simplicity/safety point of view (just use library stuff and your code is thread-safe) but not for performance. Back in my C++ days we almost always resorted to writing our own concurrent data structures, exploiting domain specifics and the application architecture as much as possible to minimize how often locking actually happens. And most of those solutions were completely unsuitable as generic ones.
Nov 15 2013
prev sibling parent reply Charles Hixson <charleshixsn earthlink.net> writes:
On 11/14/2013 01:36 PM, ilya-stromberg wrote:
 On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek Furmankiewicz wrote:
 hashmap per thread is not an option. The cache may be a few GBs of 
 data, there is no way we can duplicate that data per thread.

 Not to mention the start up time when we have to warm up the cache.
 How often do you change the data? Probably, you should use
 `immutable` variables.
Immutable variables are nice when they can be used. Often, however, they can't. I think that for the "concurrent hashmap" the best answer is probably to run the map in a thread, with message passing access whether for read or write. And I wouldn't be surprised if that's how Java's concurrent hashmap is implemented under the covers. (OTOH, I haven't ever debugged such a setup. Someone who has may have a better answer.)

--
Charles Hixson
Nov 14 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
15-Nov-2013 03:35, Charles Hixson wrote:
 On 11/14/2013 01:36 PM, ilya-stromberg wrote:
 On Thursday, 14 November 2013 at 21:31:52 UTC, Jacek Furmankiewicz wrote:
 hashmap per thread is not an option. The cache may be a few GBs of
 data, there is no way we can duplicate that data per thread.

 Not to mention the start up time when we have to warm up the cache.
 How often do you change the data? Probably, you should use
 `immutable` variables.
 Immutable variables are nice when they can be used. Often, however,
 they can't. I think that for the "concurrent hashmap" the best answer
 is probably to run the map in a thread, with message passing access
 whether for read or write.
Would be slow unless batched. At least in D message passing involves locking/unlocking a queue of messages. Sending back and forth you get 2 lock-wait-unlock and correspondingly context switches.
 And I wouldn't be surprised if that's how Java's concurrent
 hashmap is implemented under the covers. (OTOH, I haven't ever debugged
 such a setup.  Someone who has may have a better answer.)
As stated in Oracle's documentation somewhere it's implemented with fine grained locking (a lock per bucket of a hash map), some operations still lock the whole map. Rehashing still locks the whole thing I bet. -- Dmitry Olshansky
Nov 15 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Jacek Furmankiewicz:

 hashmap per thread is not an option. The cache may be a few GBs 
 of data, there is no way we can duplicate that data per thread.
But is the D garbage collector able to manage efficiently enough associative arrays of few gigabytes? You are not dealing with a GC nearly as efficient as the JavaVM one. Bye, bearophile
Nov 14 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
On Thursday, 14 November 2013 at 21:39:53 UTC, bearophile wrote:
 Jacek Furmankiewicz:

 hashmap per thread is not an option. The cache may be a few 
 GBs of data, there is no way we can duplicate that data per 
 thread.
 But is the D garbage collector able to manage efficiently enough
 associative arrays of few gigabytes? You are not dealing with a GC
 nearly as efficient as the JavaVM one.
Well, these are the types of questions I have as a Java veteran who is having a first look at D after the recent Facebook announcement. By now I have a decent idea of where most of the new languages (Go has the same issues, for the most part) come up short when compared to Java's very mature SDK, so that is usually where I start probing first. Sorry :-(
Nov 14 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Jacek Furmankiewicz:

 Well, these are the types of questions I have as a Java veteran
 who is having a first look at D after the recent Facebook 
 announcement.

 By now I have a decent idea of where most of the new languages 
 (Go has same issues, for the most part) come up short when 
 compared to Java's very mature SDK, so that is usually where I 
 start probing first.

 Sorry :-(
The development of the Java language, its GC, Oracle JVM, standard library (and its IDEs, etc) have received tons of money, time, and hours of work, so it's not strange Java is "better" than D. On the other hand different languages are fitter for different purposes. I like D a lot, but programmers should choose languages wisely, and Java is a wiser choice for several commercial purposes. If you rewrite Minecraft from Java to D I suspect you produce a game that's faster and with a shorter source code, while keeping most of its programmer-friendly nature and its coding safety, despite the current limits of the D GC. If you want to use D try to find niches where it could be useful and fit. I am using D where it's better than equivalent Java code. Today Python is used a lot, but in many cases it's not replacing equivalent Java code. Probably you can replace some Java code with Scala code. Bye, bearophile
Nov 14 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
True.

While looking at D, I am just trying to focus on the parts which I 
know would be a showstopper for us on day one...and this 
particular issue is it.

I do like D a lot as well from what I've seen so far.

Regarding the GC, I've seen some slides on DConf about other 
garbage collectors available. Is there any resource/tutorial that 
shows how you can swap out the default GC for those alternative 
implementations?
Nov 14 2013
parent reply "qznc" <qznc web.de> writes:
On Thursday, 14 November 2013 at 23:10:58 UTC, Jacek 
Furmankiewicz wrote:
 While looking a D, I am just trying to focus on the parts which 
 I know would be a showstopper for us on day one...and this 
 particular issue is it.
Yes, I also think for long-running memory-hungry server-stuff the current conservative GC is a show stopper. Some people are working on a concurrent and a precise GC. Then parallel and concurrent and incremental and generational and whatnot GCs will be explored. The point where it gets interesting for these server-apps is not clear though.
 Regarding the GC, I've seen some slides on DConf about other 
 garbage collectors available. Is there any resource/tutorial 
 that shows how you can swap out the default GC for those 
 alternative implementations?
As far as I know those other GCs are not ready for prime time yet. For sure, there is no other GC shipped with the current D 2.064.2. Oh, and keep nagging! We need to hear about showstoppers, so we can fix them! ;)
Nov 15 2013
parent reply "Jacek Furmankiewicz" <jacek99 gmail.com> writes:
So....how does Facebook handle it with their new D code?

No GC at all, explicit memory management?
Nov 15 2013
parent "SomeDude" <lovelydear mailmetrash.com> writes:
On Friday, 15 November 2013 at 22:22:32 UTC, Jacek Furmankiewicz 
wrote:
 So....how does Facebook handle it with their new D code?

 No GC at all, explicit memory management?
AFAWK, Facebook doesn't use D for its core business yet, only for building tools. OTOH, Andrei has been working hard on memory allocators, so maybe that's one idea they are exploring.
Nov 16 2013
prev sibling parent "lomereiter" <lomereiter gmail.com> writes:
On Thursday, 14 November 2013 at 17:36:09 UTC, Jacek
Furmankiewicz wrote:
 Could anyone point me to what would be the closest D 
 equivalents (maybe in an external library if not part of 
 Phobos) so we can playing around with them?

 Much appreciated
 Jacek
In such cases the easiest route is to find some C/C++ library for such tasks, make a C interface in the latter case, and link with it. That would require a bit of extra work but much less than writing your own performant implementation from scratch. E.g. I once wrote a simple wrapper for the Kyoto Cabinet key-value store: https://github.com/lomereiter/kyoto-d/blob/master/kyotocabinet.d
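The mechanics of calling C from D are small. The sketch below declares `strlen` from the C standard library by hand, purely as a stand-in for a function from a real third-party library; the wrapper name `cLength` is made up.

```d
import std.string : toStringz;

// Hand-written declaration of a C function. For a third-party C library
// you would declare its functions the same way and link against it.
extern (C) size_t strlen(const(char)* s) nothrow @nogc;

size_t cLength(string text)
{
    // D strings are not NUL-terminated in general; toStringz returns a
    // terminated version that is safe to hand to C.
    return strlen(toStringz(text));
}

void main()
{
    assert(cLength("Kyoto") == 5);
    assert(cLength("") == 0);
}
```

A C++ library needs a thin `extern "C"` layer on the C++ side first, as described above, because D cannot call arbitrary C++ member functions directly.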
Nov 15 2013