digitalmars.D - Interesting synchronization paradigm: flat combining
- Piotr Szturmaj (14/14) May 21 2016 https://www.cs.bgu.ac.il/~hendlerd/papers/flat-combining.pdf [2010]
- Piotr Szturmaj (10/24) May 24 2016 Another relevant article:
https://www.cs.bgu.ac.il/~hendlerd/papers/flat-combining.pdf [2010] "Traditional data structure designs, whether lock-based or lock-free, provide parallelism via fine grained synchronization among threads. We introduce a new synchronization paradigm based on coarse locking, which we call flat combining. The cost of synchronization in flat combining is so low, that having a single thread holding a lock perform the combined access requests of all others, delivers, up to a certain non-negligible concurrency level, better performance than the most effective parallel finely synchronized implementations. We use flat-combining to devise, among other structures, new linearizable stack, queue, and priority queue algorithms that greatly outperform all prior algorithms."
May 21 2016
On 2016-05-21 16:06, Piotr Szturmaj wrote:https://www.cs.bgu.ac.il/~hendlerd/papers/flat-combining.pdf [2010] "Traditional data structure designs, whether lock-based or lock-free, provide parallelism via fine grained synchronization among threads. We introduce a new synchronization paradigm based on coarse locking, which we call flat combining. The cost of synchronization in flat combining is so low, that having a single thread holding a lock perform the combined access requests of all others, delivers, up to a certain non-negligible concurrency level, better performance than the most effective parallel finely synchronized implementations. We use flat-combining to devise, among other structures, new linearizable stack, queue, and priority queue algorithms that greatly outperform all prior algorithms."Another relevant article: https://software.intel.com/en-us/blogs/2013/02/22/combineraggregator-synchronization-primitive Combiner is an alternative to mutex, and basically takes critical sections from many threads, combines and executes them on one thread (the one that currently has lock), leading to non-negligible boost in performance due to cache locality. If there are no combinining opportunities then it behaves exactly like a mutex. I'm leaving it here in case someone would like to implement this pattern in D language :)
May 24 2016