digitalmars.D - Why is Json parsing slower in multiple threads?
- Alexandre Bourquelot (112/112) Jun 20 2023 Hello everyone. We have some D code running in production that
- Kagamin (2/2) Jun 20 2023 The program does 3 times more work and gets it done in 3 times
- Alexandre Bourquelot (7/9) Jun 20 2023 Thanks for your reply.
- Stefan Koch (4/13) Jun 20 2023 try preallocating the memory you need.
- FeepingCreature (11/24) Jun 20 2023 Yeah if you look with `perf record`, you will see that the
- Sergey (8/15) Jun 20 2023 Btw if you want really fast solution, I can recommend to try ASDF
- Steven Schveighoffer (15/18) Jun 20 2023 The issue, undoubtedly, is memory allocation. Your JSON parsers (both
- Andrej Mitrovic (10/13) Jun 21 2023 This would be something that's important enough to list on the
- Alexandre Bourquelot (5/11) Jun 21 2023 This makes a lot of sense. I ended up using asdf and it works
Hello everyone. We have some D code running in production that reads files containing lines of JSON data, that we would like to parse and process. These files can be processed in parallel, so we create one thread for processing each file. However I noticed significant slowdowns when processing multiple files in parallel, as opposed to processing only one file.

Here is a simple code snippet reproducing the issue. It reads from a file containing the same json copy pasted 100k times, like so:

```json
{ "s" : "string", "i" : 42}
{ "s" : "string", "i" : 42}
{ "s" : "string", "i" : 42}
...
```

It gives the following output:

```
➜ ./test 1
(file ) (thread id 140310703728384) starting processing file
(file )Done in 1 sec, 549 ms, 257 μs, and 6 hnsecs
➜ ./test 3
(file ) (thread id 140071550318336) starting processing file
(file ) (thread id 140078235236096) starting processing file
(file ) (thread id 140078221063936) starting processing file
(file )Done in 4 secs, 296 ms, 780 μs, and 9 hnsecs
(file )Done in 4 secs, 360 ms, 498 μs, and 3 hnsecs
(file )Done in 4 secs, 393 ms, 342 μs, and 6 hnsecs
```

Another curious thing is that this behaviour is not present when compiling the code with the `--build=profile` option.

For reference:

```bash
➜ ldc2 --version
LDC - the LLVM D compiler (1.24.0):
  based on DMD v2.094.1 and LLVM 11.0.1
```

```d
import std.file;
import core.thread.osthread;
import std.conv;
import std.concurrency;
import std.json;
import std.stdio;
import std.encoding;
import std.datetime.systime : Clock;
import std.process;
import std.functional;
import std.algorithm;
import std.bitmanip;

void parseInThread(string[] lines)
{
    writefln("(file %s) (thread id %s) starting processing file", "", thisThreadID);
    auto startTime = Clock.currTime;
    foreach (line; lines)
    {
        line.parseJSON;
    }
    writefln("(file %s )Done in %s", "", Clock.currTime - startTime);
}

class T
{
    Thread t_;
    string _filename;
    string[] _lines;

    this(string[] lines)
    {
        _lines = lines.dup;
        t_ = new Thread(() { parseInThread(_lines); });
    }

    void opCall()
    {
        t_.start;
    }

    void join()
    {
        t_.join;
    }
}

int main(string[] args)
{
    T[] threads;
    string filenameBase = "./file";
    foreach (k; 1 .. args[1].to!int + 1)
    {
        auto v = filenameBase ~ k.to!string;
        auto newFile = File(v ~ "", "r");
        string[] lines;
        foreach (line; newFile.byLine)
        {
            lines ~= (line.to!string);
        }
        newFile.close;
        threads ~= new T(lines);
    }
    foreach (thread; threads)
    {
        thread();
    }
    foreach (thread; threads)
    {
        thread.join;
    }
    return 0;
}
```

Thanks in advance, this has been annoying me for a couple of days and I have no idea what might be the problem. Strangely enough I also have the same problem when using `vibe-d` json library for parsing.
Jun 20 2023
The program does 3 times more work and gets it done in 3 times more time: 1.5*3=4.5
Jun 20 2023
On Tuesday, 20 June 2023 at 10:29:04 UTC, Kagamin wrote:
> The program does 3 times more work and gets it done in 3 times more time: 1.5*3=4.5

Thanks for your reply. I am using threads, so the work should get done in the same time that it takes for one file to get processed, since it's distributed across cores. I even wrote a similar C++ program just to be sure, and that one performs as expected.
Jun 20 2023
On Tuesday, 20 June 2023 at 10:39:42 UTC, Alexandre Bourquelot wrote:
> On Tuesday, 20 June 2023 at 10:29:04 UTC, Kagamin wrote:
>> The program does 3 times more work and gets it done in 3 times more time: 1.5*3=4.5
>
> Thanks for your reply. I am using threads so the work should get done in the same time that it takes for one file to get processed, since it's distributed across cores. I even wrote a similar C++ program just to be sure, that performs as expected.

Try preallocating the memory you need. It might very well be that the GC allocation lock slows you down.
Jun 20 2023
On Tuesday, 20 June 2023 at 09:31:57 UTC, Alexandre Bourquelot wrote:
> Hello everyone. We have some D code running in production that reads files containing lines of JSON data, that we would like to parse and process. These files can be processed in parallel, so we create one thread for processing each file. However I noticed significant slowdowns when processing multiple files in parallel, as opposed to processing only one file.
> ...
> Thanks in advance, this has been annoying me for a couple of days and I have no idea what might be the problem. Strangely enough I also have the same problem when using `vibe-d` json library for parsing.

Yeah, if you look with `perf record`, you will see that the program spends approximately all its runtime in the garbage collector. JSON parsing is very memory hungry. So you get no parallelization because the allocator takes a lock, and on top of that you pay the overhead of lots and lots of lock waits.

I recommend using a streaming JSON parser like std_data_json (https://github.com/dlang-community/std_data_json) and loading into a well-typed data structure directly, to keep the number of unnecessary allocations to a minimum.
Jun 20 2023
On Tuesday, 20 June 2023 at 09:31:57 UTC, Alexandre Bourquelot wrote:
> Hello everyone. We have some D code running in production that reads files containing lines of JSON data, that we would like to parse and process.
>
> Thanks in advance, this has been annoying me for a couple of days and I have no idea what might be the problem. Strangely enough I also have the same problem when using `vibe-d` json library for parsing.

By the way, if you want a really fast solution, I can recommend trying the ASDF library. There is also its successor, Mir-ION, but I haven't tried it.

With the help of ASDF I was able to prepare almost the best solution for the JSON serde problem, and with low memory consumption too!
https://programming-language-benchmarks.vercel.app/problem/json-serde
Jun 20 2023
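To illustrate the "well-typed data structure" advice from the replies above, here is a minimal sketch of deserializing the sample lines with asdf. It assumes the `asdf` dub package has been added as a dependency; the `Record` struct and the `parseLines` helper are hypothetical names introduced for this example and are not part of the original program. Each line is deserialized straight into a struct instead of a `JSONValue` tree, which should reduce the number of GC allocations, although it does not remove the GC lock itself.

```d
import asdf : deserialize;

// Hypothetical struct matching the sample lines: { "s" : "string", "i" : 42}
struct Record
{
    string s;
    int i;
}

// Hypothetical helper: deserializes each line directly into a Record
// rather than building a DOM-style value per line.
Record[] parseLines(string[] lines)
{
    Record[] records;
    records.reserve(lines.length); // one up-front allocation for the result array
    foreach (line; lines)
        records ~= line.deserialize!Record;
    return records;
}

void main()
{
    auto records = parseLines([`{ "s" : "string", "i" : 42}`]);
    assert(records.length == 1);
    assert(records[0].s == "string" && records[0].i == 42);
}
```

In the original program, `parseLines` would take the place of the `foreach` loop that calls `parseJSON` inside `parseInThread`.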
On 6/20/23 5:31 AM, Alexandre Bourquelot wrote:
> Thanks in advance, this has been annoying me for a couple of days and I have no idea what might be the problem. Strangely enough I also have the same problem when using `vibe-d` json library for parsing.

The issue, undoubtedly, is memory allocation. Your JSON parsers (both std.json and vibe-d) allocate an AA for each object, and parse the entire string into a DOM structure.

The D GC has a single global lock to allocate memory -- even memory that might be on a free list. So the threads are all bottlenecked on waiting their turn for the lock.

Depending on what you want to do, like others here, I'd recommend a stream-based json parser. And then you also don't have to split it into lines.

If the goal is to build a huge representation of all the data, then there's not much else to be done, unless you want to pre-allocate. But you may end up having to drive that yourself.

I can possibly recommend, in addition to what others have mentioned, my jsoniopipe library.

-Steve
Jun 20 2023
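The allocation bottleneck described in the reply above can be checked without any JSON at all. The following is a rough sketch using only druntime and Phobos; the names `sink`, `allocateALot`, and `timeThreads` are made up for this example, and the loop count is arbitrary. Each thread does nothing but small GC allocations, loosely imitating a DOM parser that allocates for every object.

```d
import core.thread : Thread;
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Thread-local sink so the optimizer cannot drop the allocations.
int[] sink;

// Performs many small GC allocations and nothing else.
void allocateALot()
{
    foreach (i; 0 .. 1_000_000)
    {
        sink = new int[](4);
        sink[0] = i;
    }
}

// Runs allocateALot on nThreads threads and returns the wall-clock time.
Duration timeThreads(size_t nThreads)
{
    Thread[] threads;
    foreach (n; 0 .. nThreads)
        threads ~= new Thread(&allocateALot);

    auto sw = StopWatch(AutoStart.yes);
    foreach (t; threads)
        t.start();
    foreach (t; threads)
        t.join();
    return sw.peek;
}

void main()
{
    writefln("1 thread:  %s", timeThreads(1));
    writefln("3 threads: %s", timeThreads(3));
}
```

If the explanation in this thread holds, the three-thread run would be expected to take noticeably longer than the single-thread run even on a multi-core machine, mirroring the timings in the original post. The exact numbers will depend on the compiler, the runtime version, and the hardware.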
On Wednesday, 21 June 2023 at 00:35:42 UTC, Steven Schveighoffer wrote:
> The D GC has a single global lock to allocate memory -- even memory that might be on a free list. So the threads are all bottlenecked on waiting their turn for the lock.

This would be something that's important enough to list on the spec page for the GC: https://dlang.org/spec/garbage.html

It's only mentioned in passing here: https://dlang.org/articles/d-array-article.html#caching in the sentence "not to mention acquiring the global GC lock".

In theory the GC is replaceable but I think we should document the behavior of the default one. I'll submit an issue for it.
Jun 21 2023
On Wednesday, 21 June 2023 at 00:35:42 UTC, Steven Schveighoffer wrote:
> The issue, undoubtedly, is memory allocation. Your JSON parsers (both std.json and vibe-d) allocate an AA for each object, and parse the entire string into a DOM structure.
>
> The D GC has a single global lock to allocate memory -- even memory that might be on a free list. So the threads are all bottlenecked on waiting their turn for the lock.

This makes a lot of sense. I ended up using asdf and it works great. Thank you everyone for your insight.
Jun 21 2023