digitalmars.D - Interesting performance data-point
- Don Allen (69/69) Dec 31 2024 As I've mentioned in previous messages, I've ported my personal
- Steven Schveighoffer (4/22) Dec 31 2024 Great read! Yeah, I think the interoperability with C is very
- Chris Piker (10/14) Jan 04 Exactly this (plus a good standard library).
- Mike Shah (29/47) Jan 04 Agreed, the effectively 100% interop with C is one of DLang's
- ryuukk_ (6/13) Jan 05 There is no "anti-gc" crowd
- monkyyy (8/11) Jan 05 Who? I feel your conflating a few people, the allocator debate,
- Serg Gini (7/8) Jan 05 Did you try to ask for help in chatGPT or Rust forums?
As I've mentioned in previous messages, I've ported my personal finance package from C to D, having first ported some of it to Rust until I just couldn't stand it anymore.

One of the utilities that exists in both the D and Rust versions reads .csv files downloaded from American Express and loads the transactions into the Sqlite database that contains my financial data, trying to assign an expense account to each incoming transaction by fuzzy-comparing the transaction's description to existing transactions, using an algorithm based on Levenshtein distance. The Levenshtein calculation is done using a user-defined Sqlite function that is loaded as an extension.

What I've found is that the D version of this utility is about twice as fast (compiled with DMD) as the Rust version to get identical results. While I haven't done detailed enough measurements to explain the performance disparity with certainty, I've done enough to know that both versions spend most of their time in the Levenshtein distance function.

But I have a theory that I think is the likely explanation. And if I'm correct, it highlights one of D's strongest points -- the ability to call C libraries directly, without the need for an elaborate interface layer.

What I think is going on is that rusqlite, the crate that is Rust's primary Sqlite interface package, does not provide a way to step through the results of a select query, as the Sqlite library itself does, stopping when you are happy. Instead, you run the 'query' method (or one of its variants) on a prepared statement, which either returns an iterator for you to access all the returned rows or calls a closure to process each row. This difference matters when each row involves an expensive calculation. In my case, I want the most recent transaction that meets the Levenshtein distance criterion, which will be the first row in the result set, since I order them by post-date descending.

In D, I am able to step the match query and either I get a row or I don't. If I do, I stop, use that transaction's expense account and I'm done. The entire result set is not computed. In Rust, rusqlite computes the entire result set, which is expensive due to the Levenshtein calculation, and then hands it to me row by row. It is not a simple matter to convince Sqlite to restrict the result set to the most recent row. 'limit 1' makes no difference in the Rust application's performance (I tried it). Apparently Sqlite applies 'limit' after computing the result set. There *may* be a way to do this using Sqlite's windowing capability, but that's a bit of a research project that I have no inclination to take on. I have also not found a Rust crate that provides step-level control over Sqlite *and* lets you load extensions.

I think this illustrates a strength of D that I don't think enough people understand -- the ability to talk directly and easily to the C world. People complain that D doesn't have a rich set of libraries. It doesn't need one; all the C libraries are almost as easily accessible from D as they are from C or C++. And this has gotten even easier with the advent of ImportC, which I think is a very important addition to D and worth continued development to hide the craziness in C header files.

In my case, in D, I can use a straight-forward query and have the same simple interaction with Sqlite that I would have in C. There may be a way to match D's performance in this case with Rust, but it would require effort, perhaps a lot. This is typical of the Rust experience compared to D. Things are just more difficult, mainly because the user plays a bigger role in memory management in Rust than in languages, like D, that provide a GC (I simply do not understand the anti-GC religious fanatics, especially when we are talking about ordinary applications on today's multi-ghz hardware with huge amounts of memory). D's performance is comparable (except in the case of the AMEX utility, where it is a lot better) and the code is more readable. Unfortunately, people jump on band-wagons mindlessly.
Dec 31 2024
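An illustrative aside on the "step until you are happy" pattern described in the post above. This is only a minimal sketch in Python's standard `sqlite3` module, not the poster's D or Rust code. The table `txn`, its columns, the sample data, and the distance threshold are all hypothetical. The Levenshtein function is registered from Python rather than loaded as an extension, and the query orders by an integer primary key (assumed to grow with recency) instead of a post-date column, so that SQLite can walk the table backwards without sorting first.

```python
import sqlite3


def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming, keeping two rows of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


MATCH_SQL = (
    "SELECT account FROM txn "
    "WHERE levenshtein(description, ?) <= ? "
    "ORDER BY id DESC"  # rowid order: no sort step is needed before the first row
)


def build_db(n_rows: int = 1000):
    """Create a hypothetical in-memory transaction table.

    Returns the connection and a dict counting calls to the SQL function.
    """
    counter = {"calls": 0}

    def counted(a, b):
        counter["calls"] += 1
        return levenshtein(a, b)

    conn = sqlite3.connect(":memory:")
    conn.create_function("levenshtein", 2, counted)
    conn.execute(
        "CREATE TABLE txn (id INTEGER PRIMARY KEY, description TEXT, account TEXT)"
    )
    rows = []
    for i in range(1, n_rows + 1):
        if i % 10 == 0:
            rows.append((i, "COFFEE SHOP 12", "Expenses:Dining"))
        else:
            rows.append((i, "HARDWARE STORE %d" % i, "Expenses:Hardware"))
    conn.executemany("INSERT INTO txn VALUES (?, ?, ?)", rows)
    return conn, counter


def first_match(conn, description: str, max_distance: int):
    """Take only the first qualifying row, then abandon the query."""
    cur = conn.execute(MATCH_SQL, (description, max_distance))
    row = cur.fetchone()
    cur.close()  # resets the statement; the remaining rows are never computed
    return row[0] if row else None


conn, counter = build_db()

account = first_match(conn, "COFFEE SHOP 21", 2)
calls_first = counter["calls"]

counter["calls"] = 0
all_matches = conn.execute(MATCH_SQL, ("COFFEE SHOP 21", 2)).fetchall()
calls_all = counter["calls"]

print(account, calls_first, calls_all, len(all_matches))
```

Stopping after the first row invokes the distance function only for the rows scanned up to that point (the Python module may step one match ahead, depending on its version), whereas draining the whole result set invokes it once per row in the table. If the query had to sort by an unindexed column, every row would need to be evaluated before the first one could be returned, so the early exit would save much less.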
On Tuesday, 31 December 2024 at 15:36:31 UTC, Don Allen wrote:
> As I've mentioned in previous messages, I've ported my personal finance package from C to D, having first ported some of it to Rust until I just couldn't stand it anymore.
>
> One of the utilities that exists in both the D and Rust versions reads .csv files downloaded from American Express and loads the transactions into the Sqlite database that contains my financial data, trying to assign an expense account to each incoming transaction by fuzzy-comparing the transaction's description to existing transactions, using an algorithm based on Levenshtein distance. The Levenshtein calculation is done using a user-defined Sqlite function that is loaded as an extension.
>
> What I've found is that the D version of this utility is about twice as fast (compiled with DMD) as the Rust version to get identical results. While I haven't done detailed enough measurements to explain the performance disparity with certainty, I've done enough to know that both versions spend most of their time in the Levenshtein distance function.

Great read! Yeah, I think the interoperability with C is very much a super-power of D.

-Steve
Dec 31 2024
On Tuesday, 31 December 2024 at 15:36:31 UTC, Don Allen wrote:
> But I have a theory that I think is the likely explanation. And if I'm correct, it highlights one of D's strongest points -- the ability to call C libraries directly, without the need for an elaborate interface layer.

Exactly this (plus a good standard library).

In space physics we have decades of C (and Fortran) libraries whose output has been thoroughly vetted. When a measurement platform is traveling over 7.5 km/second it's wise to use codes with well-characterized round-off, and that usually means using old software. Since Python is the de facto language of science, there are students whose entire master's thesis is making useful interfaces to these old libraries.

From D, I can just call them. It's great!
Jan 04
Nice post -- thanks for sharing this case study!

On Tuesday, 31 December 2024 at 15:36:31 UTC, Don Allen wrote:
> I think this illustrates a strength of D that I don't think enough people understand -- the ability to talk directly and easily to the C world. People complain that D doesn't have a rich set of libraries. It doesn't need one; all the C libraries are almost as easily accessible from D as they are from C or C++. And this has gotten even easier with the advent of ImportC, which I think is a very important addition to D and worth continued development to hide the craziness in C header files.

Agreed, the effectively 100% interop with C is one of DLang's awesome superpowers :) The interop with C++ is also quite good -- I think D and Swift are the only languages I've seen that have nearly perfect or good interop with all three of C, C++, and Objective-C.

Some of my favorite other superpowers (that you probably already know, but in case anyone new is lurking here):

1. Metaprogramming and templates
2. CTFE
3. Slicing
4. Most of the defaults match my preference (default-on bounds checking, initialized variables)
5. Module system (i.e. no messing around with distinguishing source and header files)
6. Ranges
7. Tooling: Dub, profiler, gc-profiler, cov (might not be perfect tools, but having a package manager and default build system is really nice for those just getting started)
8. ...my list could go on -- but I'll just say I have fun doing real software engineering in D :)

> [...] Things are just more difficult, mainly because the user plays a bigger role in memory management in Rust than in languages, like D, that provide a GC (I simply do not understand the anti-GC religious fanatics, especially when we are talking about ordinary applications on today's multi-ghz hardware with huge amounts of memory). D's performance is comparable (except in the case of the AMEX utility, where it is a lot better) and the code is more readable. Unfortunately, people jump on band-wagons mindlessly.

I think the funny thing is that D provides several memory management strategies, and somehow the anti-GC crowd got stuck on the default. In applications like games where control over memory is needed (whether that means simply preallocating, explicitly choosing when to run the GC, not using the GC at all and using malloc, frame allocation, using double-stack buffers, etc.), it is all possible :)
Jan 04
On Sunday, 5 January 2025 at 02:00:17 UTC, Mike Shah wrote:
> I think the funny thing is that D provides several memory management strategies, and somehow the anti-GC crowd got stuck on the default. Applications like games where control over memory is needed (whether that means simply preallocating, explicitly calling when to GC, not using GC at all and using malloc, frame allocation, using double-stack buffers, etc.) is all possible :)

There is no "anti-gc" crowd

There is a "anti tell me to use the gc crowd"

When we ask for an Allocator api in the runtime, we get told to "just use the GC bro"

It's really not hard to understand
Jan 05
On Sunday, 5 January 2025 at 19:11:50 UTC, ryuukk_ wrote:
> There is a "anti tell me to use the gc crowd"
>
> When we ask for an Allocator api in the runtime, we get told to "just use the GC bro"

Who? I feel you're conflating a few people. The allocator debate, the betterc-go-away debate, and the merge-datastructures debate are, I feel, different people; if you're talking about adr-ish people, I feel they couldn't give a rat's ass if someone adopts an allocator api; they want to delete the betterc flag in general.

Whereas I have grown to hate the allocator debate; I tried to do wasm, which is a betterc hell.
Jan 05
On Tuesday, 31 December 2024 at 15:36:31 UTC, Don Allen wrote:
> mindlessly.

Did you try asking for help from ChatGPT or on the Rust forums?

Rust has several libraries for SQLite, including async ones and full Rust rewrites (Limbo/Turso). So I would be very surprised if they don’t have such a simple query implementation detail.

And afaik Rust has zero issues with creating C bindings.
Jan 05