digitalmars.D - std.algorithm.copy could be better optimized
- TommiT (15/15) Jun 17 2013 std.algorithm.copy adheres to the specification of the C++
- Andrei Alexandrescu (4/18) Jun 17 2013 Bugzilla. FWIW I did notice Duff can sometimes beat memmove, but that
- TommiT (15/45) Jun 17 2013 Another thing that comes to mind is that...
- Brad Roberts (10/41) Jun 17 2013 In my experience, this is an area that ends up being deferred over to th...
- TommiT (6/6) Jun 18 2013 I added these two enhancement requests:
- Andrei Alexandrescu (4/10) Jun 18 2013 You're on the right track. Specializing for types with certain traits is...
std.algorithm.copy adheres to the specification of the C++ standard library function std::copy. That specification states that the source and target ranges may not overlap. Yet, all the major C++ standard libraries (libc++, libstdc++, Dinkum C++ Library) implement std::copy so that it becomes a memmove if the source and target element types are the same and the element type is std::is_trivially_copy_assignable. Thus, the current C++ specification seems to be too strict. The current std.algorithm.copy implementation doesn't get optimized into a memmove when it would be safe to do so. It really should, because it's the fastest way to get the job done, and it allows the ranges to overlap. Also, the documentation should be changed to notify of this relaxation. There's some benchmarks of memmove vs other methods: http://nadeausoftware.com/articles/2012/05/c_c_tip_how_copy_memory_quickly
Jun 17 2013
On 6/17/13 1:36 PM, TommiT wrote:std.algorithm.copy adheres to the specification of the C++ standard library function std::copy. That specification states that the source and target ranges may not overlap. Yet, all the major C++ standard libraries (libc++, libstdc++, Dinkum C++ Library) implement std::copy so that it becomes a memmove if the source and target element types are the same and the element type is std::is_trivially_copy_assignable. Thus, the current C++ specification seems to be too strict. The current std.algorithm.copy implementation doesn't get optimized into a memmove when it would be safe to do so. It really should, because it's the fastest way to get the job done, and it allows the ranges to overlap. Also, the documentation should be changed to notify of this relaxation. There's some benchmarks of memmove vs other methods: http://nadeausoftware.com/articles/2012/05/c_c_tip_how_copy_memory_quicklyBugzilla. FWIW I did notice Duff can sometimes beat memmove, but that was some 10 years ago. Andrei
Jun 17 2013
On Monday, 17 June 2013 at 17:43:25 UTC, Andrei Alexandrescu wrote:On 6/17/13 1:36 PM, TommiT wrote:Another thing that comes to mind is that... R find(alias pred = "a == b", R, E)(R haystack, E needle) ...could be optimized, when pred is "a == b", R is an array of E's, and E is either char, byte or ubyte, to call the c function memchr instead. It's a bit tricky though: quite recently a bug was found in this particular optimization in the standard library that ships with Visual Studio, as described in this interesting tutorial: http://channel9.msdn.com/Series/C9-Lectures-Stephan-T-Lavavej-Core-C-/Stephan-T-Lavavej-Core-C-7-of-n More generally, what I'm saying is that we should probably just go through some good C++ standard library implementation and copy all the optimizations that they have done. I'm sure there's much more.std.algorithm.copy adheres to the specification of the C++ standard library function std::copy. That specification states that the source and target ranges may not overlap. Yet, all the major C++ standard libraries (libc++, libstdc++, Dinkum C++ Library) implement std::copy so that it becomes a memmove if the source and target element types are the same and the element type is std::is_trivially_copy_assignable. Thus, the current C++ specification seems to be too strict. The current std.algorithm.copy implementation doesn't get optimized into a memmove when it would be safe to do so. It really should, because it's the fastest way to get the job done, and it allows the ranges to overlap. Also, the documentation should be changed to notify of this relaxation. There's some benchmarks of memmove vs other methods: http://nadeausoftware.com/articles/2012/05/c_c_tip_how_copy_memory_quicklyBugzilla. FWIW I did notice Duff can sometimes beat memmove, but that was some 10 years ago. Andrei
Jun 17 2013
On 6/17/13 6:14 PM, TommiT wrote:On Monday, 17 June 2013 at 17:43:25 UTC, Andrei Alexandrescu wrote:In my experience, this is an area that ends up being deferred over to the compiler (via a builtin or direct use of mem* that the compiler is just smart enough to recognize and implement itself) to let it generate the optimal code. The thresholds for inline vs function call and myriad of optimization techniques end up being super platform specific and duplicating all that complication essentially guarantees that you're not going to do as good a job. As an example, there's people at intel that are paid to do little more than improve the code generation in gcc and often submit new versions of the mem* routines both to gcc and to glibc. It's actually amazing how often that code is tweaked for both older generations of cpus and to add new ones. - BradOn 6/17/13 1:36 PM, TommiT wrote:Another thing that comes to mind is that... R find(alias pred = "a == b", R, E)(R haystack, E needle) ...could be optimized, when pred is "a == b", R is an array of E's, and E is either char, byte or ubyte, to call the c function memchr instead. It's a bit tricky though: quite recently a bug was found in this particular optimization in the standard library that ships with Visual Studio, as described in this interesting tutorial: http://channel9.msdn.com/Series/C9-Lectures-Stephan-T-Lavavej-Core-C-/Stephan-T-Lavavej-Core-C-7-of-n More generally, what I'm saying is that we should probably just go through some good C++ standard library implementation and copy all the optimizations that they have done. I'm sure there's much more.std.algorithm.copy adheres to the specification of the C++ standard library function std::copy. That specification states that the source and target ranges may not overlap. Yet, all the major C++ standard libraries (libc++, libstdc++, Dinkum C++ Library) implement std::copy so that it becomes a memmove if the source and target element types are the same and the element type is std::is_trivially_copy_assignable. Thus, the current C++ specification seems to be too strict. The current std.algorithm.copy implementation doesn't get optimized into a memmove when it would be safe to do so. It really should, because it's the fastest way to get the job done, and it allows the ranges to overlap. Also, the documentation should be changed to notify of this relaxation. There's some benchmarks of memmove vs other methods: http://nadeausoftware.com/articles/2012/05/c_c_tip_how_copy_memory_quicklyBugzilla. FWIW I did notice Duff can sometimes beat memmove, but that was some 10 years ago. Andrei
Jun 17 2013
I added these two enhancement requests: 1) memmove optimization for std.algorithm.copy: http://d.puremagic.com/issues/show_bug.cgi?id=10402 2) memchr optimization for std.algorithm.find: http://d.puremagic.com/issues/show_bug.cgi?id=10403 ...but I suspect they're just a tip of the ice berg.
Jun 18 2013
On 6/18/13 8:56 AM, TommiT wrote:I added these two enhancement requests: 1) memmove optimization for std.algorithm.copy: http://d.puremagic.com/issues/show_bug.cgi?id=10402 2) memchr optimization for std.algorithm.find: http://d.puremagic.com/issues/show_bug.cgi?id=10403 ...but I suspect they're just a tip of the ice berg.You're on the right track. Specializing for types with certain traits is par for the course in generic programming. Andrei
Jun 18 2013