digitalmars.D - From C++14 and Java 1.8
- bearophile (52/53) Apr 21 2013 Through Reddit I've seen this interesting report about C++14:
- dsimcha (8/39) Apr 21 2013 FWIW, I created a parallel sort in D a while back using
- bearophile (5/7) Apr 21 2013 Then do you know why the Java version seems to be advantageous
- dsimcha (13/20) Apr 21 2013 I don't know Java very well, but possiblities include:
- Martin Nowak (4/9) Apr 22 2013 Your implementation performs a lot of copying. Maybe an in-place
- bearophile (4/7) Apr 21 2013 With some more changes:
Through Reddit I've seen this interesting report about C++14: http://isocpp.org/blog/2013/04/trip-report-iso-c-spring-2013-meeting Variable templates: sometimes in D I've felt a little desire for a shorter syntax, a templated alias: alias Foo(T) = Bar!(T, "red"); But in the end I think this that syntax sugar isn't so necessary. - - - - - - - - - Dynamic Arrays: they replace the variable-sized stack-allocated arrays of C99. I'd like something like this in D2 (Issue 9832) with a better integration with the type system. - - - - - - - - - optional<T>: see the little maybeTo helper function I have suggested in http://d.puremagic.com/issues/show_bug.cgi?id=6840 ----------------------------------- From another thread about the future Java 1.8 I've seen Java parallelSort: http://blog.sanaulla.info/2013/04/08/arrays-sort-versus-arrays-parallelsort/ That blog post explains quickly how it works: << Arrays#parallelSort uses Fork/Join framework introduced in Java 7 to assign the sorting tasks to multiple threads available in the thread pool. This is called eating your own dog food. Fork/Join implements a work stealing algorithm where in a idle thread can steal tasks queued up in another thread. An overview of Arrays#parallelSort: The method uses a threshold value and any array of size lesser than the threshold value is sorted using the Arrays#sort() API (i.e sequential sorting). And the threshold is calculated considering the parallelism of the machine, size of the array and is calculated as: private static final int getSplitThreshold(int n) { int p = ForkJoinPool.getCommonPoolParallelism(); int t = (p > 1) ? (1 + n / (p << 3)) : n; return t < MIN_ARRAY_SORT_GRAN ? MIN_ARRAY_SORT_GRAN : t; } Once its decided whether to sort the array in parallel or in serial, its now to decide how to divide the array in to multiple parts and then assign each part to a Fork/Join task which will take care of sorting it and then another Fork/Join task which will take care of merging the sorted arrays. The implementation in JDK 8 uses this approach: - Divide the array into 4 parts. - Sort the first two parts and then merge them. - Sort the next two parts and then merge them. And the above steps are repeated recursively with each part until the size of the part to sort is not lesser than the threshold value calculated above.I think it's worth adding something similar as strategy of std.algorithm.sort. Bye, bearophile
Apr 21 2013
On Sunday, 21 April 2013 at 12:08:54 UTC, bearophile wrote:Arrays#parallelSort uses Fork/Join framework introduced in Java 7 to assign the sorting tasks to multiple threads available in the thread pool. This is called eating your own dog food. Fork/Join implements a work stealing algorithm where in a idle thread can steal tasks queued up in another thread. An overview of Arrays#parallelSort: The method uses a threshold value and any array of size lesser than the threshold value is sorted using the Arrays#sort() API (i.e sequential sorting). And the threshold is calculated considering the parallelism of the machine, size of the array and is calculated as: private static final int getSplitThreshold(int n) { int p = ForkJoinPool.getCommonPoolParallelism(); int t = (p > 1) ? (1 + n / (p << 3)) : n; return t < MIN_ARRAY_SORT_GRAN ? MIN_ARRAY_SORT_GRAN : t; } Once its decided whether to sort the array in parallel or in serial, its now to decide how to divide the array in to multiple parts and then assign each part to a Fork/Join task which will take care of sorting it and then another Fork/Join task which will take care of merging the sorted arrays. The implementation in JDK 8 uses this approach: - Divide the array into 4 parts. - Sort the first two parts and then merge them. - Sort the next two parts and then merge them. And the above steps are repeated recursively with each part until the size of the part to sort is not lesser than the threshold value calculated above.FWIW, I created a parallel sort in D a while back using std.parallelism. It was part of std.parallel_algorithm, a half-finished project that I abandoned because I was disappointed at how poorly most of it was scaling in practice, probably due to memory bandwidth. If you have some expensive-to-compare types, though, it may be worthwhile. https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.dI think it's worth adding something similar as strategy of std.algorithm.sort.
Apr 21 2013
dsimcha:I abandoned because I was disappointed at how poorly most of it was scaling in practice, probably due to memory bandwidth.Then do you know why the Java version seems to be advantageous (with four cores)? Bye, bearophile
Apr 21 2013
On Sunday, 21 April 2013 at 13:30:32 UTC, bearophile wrote:dsimcha:I don't know Java very well, but possiblities include: 1. Sorting using a virtual or otherwise non-inlined comparison function. This makes the sorting require much more CPU time but not a lot more memory bandwidth. It does beg the question, though, of why the comparison function isn't inlined, especially since modern JITs can sometimes inline virtual functions. 2. Different hardware than I tested on, maybe with better memory bandwidth. 3. Expensive comparison functions. I didn't test this in D either because I couldn't think of a good use case. I tested the D parallel sort using small primitive types (ints and floats and stuff).I abandoned because I was disappointed at how poorly most of it was scaling in practice, probably due to memory bandwidth.Then do you know why the Java version seems to be advantageous (with four cores)? Bye, bearophile
Apr 21 2013
On 04/21/2013 04:46 PM, dsimcha wrote:2. Different hardware than I tested on, maybe with better memory bandwidth.Your implementation performs a lot of copying. Maybe an in-place parallel sort algorithm would perform better, e.g. parallel quicksort.3. Expensive comparison functions. I didn't test this in D either because I couldn't think of a good use case. I tested the D parallel sort using small primitive types (ints and floats and stuff).String sorting is a good use case with slightly higher comparison cost.
Apr 22 2013
Dynamic Arrays: they replace the variable-sized stack-allocated arrays of C99. I'd like something like this in D2 (Issue 9832) with a better integration with the type system.With some more changes: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130415/078426.html Bye, bearophile
Apr 21 2013