digitalmars.D.learn - Aliases to mutable thread-local data not allowed
- mark (77/77) Mar 10 2020 I have this struct:
- Simen =?UTF-8?B?S2rDpnLDpXM=?= (16/46) Mar 10 2020 As the error message hints at, the problem is Deb may hold
- mark (32/45) Mar 10 2020 Thanks, that's led to some progress.
- mark (34/34) Mar 10 2020 I just tried:
- Steven Schveighoffer (18/59) Mar 10 2020 You aren't accepting an immutable(Deb), but a Deb. So you aren't
- mark (16/31) Mar 10 2020 [snip]
- mark (185/185) Mar 10 2020 I've managed to make a cut-down version that's < 170 LOC.
- Simen =?UTF-8?B?S2rDpnLDpXM=?= (34/38) Mar 11 2020 This took some time figuring out. Turns out,
- mark (174/174) Mar 11 2020 Hi Simen,
- mark (72/72) Mar 11 2020 I finally got a threaded version that works, and a lot more
- Simen =?UTF-8?B?S2rDpnLDpXM=?= (18/23) Mar 11 2020 Fascinating. It works just fine when compiling for 32-bit targets
- mark (6/22) Mar 11 2020 I did that and it compiles & runs, but no Debs get added to the
- Simen =?UTF-8?B?S2rDpnLDpXM=?= (10/33) Mar 11 2020 Yeah, I forgot we cast to immutable to be able to send, so
- mark (6/14) Mar 11 2020 Thanks, that fixed it.
I have this struct: struct Deb { string name; ... Unit[string] tags; // set of tags Deb dup() const { Deb deb; deb.name = name; ... foreach (key; tags.byKey) deb.tags[key] = unit; return deb; } } And I want to populate an AA of them: Deb[string] debForName; This takes 1.5 sec which is too slow. So now I'm trying to read the files concurrently: private struct DoneMessage {} void initialize() { // main thread (some code elided) Tid[] tids; try { foreach (string filename; dirEntries(PATH_PATTERN, SpanMode.shallow)) tids ~= spawn(&readPackageFile, thisTid, filename); auto jobs = tids.length; while (jobs) { receive( (Deb deb) { debForName[deb.name] = deb; }, (DoneMessage m) { jobs--; } ); } } catch (FileException err) { stderr.writeln("failed to read packages: ", err); } } // This is called once per child thread and calls send() 1000s of times to // return Deb structs, finally sending DoneMessage. void readPackageFile(Tid parentTid, string filename) { // (some code elided) try { Deb deb; auto file = File(filename); foreach(lino, line; file.byLine.enumerate(1)) readPackageLine(parentTid, filename, lino, line, deb); // readPackageLine also calls send with same code as AAA below if (deb.valid) send(parentTid, deb.dup); // AAA } catch (FileException err) { stderr.writeln(err); } send(parentTid, DoneMessage()); } Unfortunately, I can't send Debs: src/model.d(71,30): Error: template std.concurrency.spawn cannot deduce function from argument types !()(void delegate(Tid parentTid, string filename), Tid, string), candidates are: /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/concurrency.d(460,5): spawn(F, T...)(F fn, T args) with F = void delegate(Tid, string), T = (Tid, string) must satisfy the following constraint: isSpawnable!(F, T) src/model.d(72,13): Warning: statement is not reachable src/model.d(73,13): Warning: statement is not reachable src/model.d(79,13): Warning: statement is not reachable src/model.d(72,13): Warning: statement is not reachable src/model.d(73,13): Warning: statement is not reachable src/model.d(79,13): Warning: statement is not reachable /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/concurrency.d(625,5): Error: static assert: "Aliases to mutable thread-local data not allowed." src/model.d(102,21): instantiated from here: send!(Deb) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/ldc2 failed with exit code 1. Is there any nice solution to this? Surely it is a common pattern to read multiple files and create lots of data items to be merged into a collection?
Mar 10 2020
On Tuesday, 10 March 2020 at 08:13:19 UTC, mark wrote:I have this struct: struct Deb { string name; ... Unit[string] tags; // set of tags Deb dup() const { Deb deb; deb.name = name; ... foreach (key; tags.byKey) deb.tags[key] = unit; return deb; } }void readPackageFile(Tid parentTid, string filename) { // (some code elided) try { Deb deb; auto file = File(filename); foreach(lino, line; file.byLine.enumerate(1)) readPackageLine(parentTid, filename, lino, line, deb); // readPackageLine also calls send with same code as AAA below if (deb.valid) send(parentTid, deb.dup); // AAA/home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/c ncurrency.d(625,5): Error: static assert: "Aliases to mutable thread-local data not allowed." Is there any nice solution to this? Surely it is a common pattern to read multiple files and create lots of data items to be merged into a collection?As the error message hints at, the problem is Deb may hold references to data that is shared with other objects on the thread from which it originates. Since you know this is not the case, even if the compiler can't prove it, you can safely cast your Deb to immutable: if (deb.valid) send(parentTid, cast(immutable)deb.dup); In fact, even the .dup is unnecessary here, since no data is shared with other objects, so you can simply write send(parentTid, cast(immutable)deb);. (Note on this point: since you have not included all your code, it could be other parts create shared mutable state, in which case .dup is necessary, and if badly written may not be sufficient. -- Simen
Mar 10 2020
On Tuesday, 10 March 2020 at 10:02:16 UTC, Simen Kjærås wrote: [snip]As the error message hints at, the problem is Deb may hold references to data that is shared with other objects on the thread from which it originates. Since you know this is not the case, even if the compiler can't prove it, you can safely cast your Deb to immutable: if (deb.valid) send(parentTid, cast(immutable)deb.dup); In fact, even the .dup is unnecessary here, since no data is shared with other objects, so you can simply write send(parentTid, cast(immutable)deb);. (Note on this point: since you have not included all your code, it could be other parts create shared mutable state, in which case .dup is necessary, and if badly written may not be sufficient.Thanks, that's led to some progress. Once each Deb is populated its contents are never modified. Nonetheless I've kept with .dup since I reuse the same Deb each time to populate, then send a copy, then clear and reuse the original. Unfortunately, I'm now getting different errors: src/model.d(85,30): Error: template std.concurrency.spawn cannot deduce function from argument types !()(void delegate(Tid parentTid, string filename), Tid, string), candidates are: /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/concurrency.d(460,5): spawn(F, T...)(F fn, T args) with F = void delegate(Tid, string), T = (Tid, string) must satisfy the following constraint: isSpawnable!(F, T) src/model.d(86,13): Warning: statement is not reachable src/model.d(87,13): Warning: statement is not reachable src/model.d(86,13): Warning: statement is not reachable src/model.d(87,13): Warning: statement is not reachable /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(701,26): Error: cannot implicitly convert expression rhs of type immutable(Deb) to Deb /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(603,17): Error: template instance std.variant.VariantN!32LU.VariantN.opAssign!(immutable(Deb)) error instantiating /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(126,22): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(656,23): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(647,10): instantiated from here: _send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(626,10): instantiated from here: _send!(immutable(Deb)) src/model.d(115,21): instantiated from here: send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/ldc2 failed with exit code 1. Should I be somehow casting away the immutable at the receive end? (Even though received Debs are never modified?)
Mar 10 2020
I just tried: auto jobs = tids.length; while (jobs) { receive( (Deb deb) { debForName[deb.name] = cast(Deb)deb; }, (DoneMessage m) { jobs--; } ); } to cast away the immutable from the sender, but that doesn't work either: src/model.d(85,30): Error: template std.concurrency.spawn cannot deduce function from argument types !()(void delegate(Tid parentTid, string filename), Tid, string), candidates are: /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/concurrency.d(460,5): spawn(F, T...)(F fn, T args) with F = void delegate(Tid, string), T = (Tid, string) must satisfy the following constraint: isSpawnable!(F, T) src/model.d(86,13): Warning: statement is not reachable src/model.d(87,13): Warning: statement is not reachable src/model.d(86,13): Warning: statement is not reachable src/model.d(87,13): Warning: statement is not reachable /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(701,26): Error: cannot implicitly convert expression rhs of type immutable(Deb) to Deb /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(603,17): Error: template instance std.variant.VariantN!32LU.VariantN.opAssign!(immutable(Deb)) error instantiating /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(126,22): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(656,23): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(647,10): instantiated from here: _send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(626,10): instantiated from here: _send!(immutable(Deb)) src/model.d(115,21): instantiated from here: send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/ldc2 failed with exit code 1. :
Mar 10 2020
On 3/10/20 7:09 AM, mark wrote:I just tried: auto jobs = tids.length; while (jobs) { receive( (Deb deb) { debForName[deb.name] = cast(Deb)deb; }, (DoneMessage m) { jobs--; } ); } to cast away the immutable from the sender, but that doesn't work either:You aren't accepting an immutable(Deb), but a Deb. So you aren't actually casting away anything. But this seems more like it would be a runtime error (std.concurrency can't find any function that handles the immutable(Deb) e.g.). Still, the correct thing here is to handle immutable(Deb). However, I'd strongly caution you against casting away immutable, that can lead to undefined behavior in D. Better to just store it as immutable to begin with.src/model.d(85,30): Error: template std.concurrency.spawn cannot deduce function from argument types !()(void delegate(Tid parentTid, string filename), Tid, string), candidates are: /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/c ncurrency.d(460,5): spawn(F, T...)(F fn, T args) with F = void delegate(Tid, string), T = (Tid, string) must satisfy the following constraint: isSpawnable!(F, T) src/model.d(86,13): Warning: statement is not reachable src/model.d(87,13): Warning: statement is not reachable src/model.d(86,13): Warning: statement is not reachable src/model.d(87,13): Warning: statement is not reachableHard to see what the problem is here, it looks like you have a function that has unreachable statements, which is why the spawn isn't compiling (the function you are passing it can't compile)./home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(701,26): Error: cannot implicitly convert expression rhs of type immutable(Deb) to Deb /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(603,17): Error: template instance std.variant.VariantN!32LU.VariantN.opAssign!(immutable(Deb)) error instantiating /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(126,22): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(656,23): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(647,10): instantiated from here: _send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(626,10): instantiated from here: _send!(immutable(Deb)) src/model.d(115,21): instantiated from here: send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/ldc2 failed with exit code 1.This looks like an error in std.concurrency (or std.variant), and I'm not sure what is the problem here. It seems std.variant cannot accept an immutable, which makes no sense as immutable data is a huge usecase of std.concurrency. Can you narrow this down to a small example that can be submitted as a bug report? -Steve
Mar 10 2020
On Tuesday, 10 March 2020 at 15:27:04 UTC, Steven Schveighoffer wrote:On 3/10/20 7:09 AM, mark wrote:[snip]Still, the correct thing here is to handle immutable(Deb). However, I'd strongly caution you against casting away immutable, that can lead to undefined behavior in D. Better to just store it as immutable to begin with.I'm happy to store immutable Debs in Deb[string] debForName; but the syntax immutable(Deb)[string] debForName; just produced errors. [snip]Hard to see what the problem is here, it looks like you have a function that has unreachable statements, which is why the spawn isn't compiling (the function you are passing it can't compile).They are only unreachable (I think) because of the stuff further down the backtrace. [snip]This looks like an error in std.concurrency (or std.variant), and I'm not sure what is the problem here. It seems std.variant cannot accept an immutable, which makes no sense as immutable data is a huge usecase of std.concurrency. Can you narrow this down to a small example that can be submitted as a bug report?I'll try to put the project on github -- if I ever get it working it'll be GPL, just that I don't like putting up pre-alpha stuff. So then I'll add the link. Thanks.
Mar 10 2020
I've managed to make a cut-down version that's < 170 LOC. It needs to be run on Debian or a Debian-based Linux (e.g., Ubuntu). Below is the full source followed by the error output. // app.d import std.typecons: Tuple; void main() { import std.stdio: writeln, writefln; auto model = Model(); model.readPackages(); writefln("read %,d packages", model.length); } enum PACKAGE_DIR = "/var/lib/apt/lists"; enum PACKAGE_PATTERN = "*Packages"; alias Unit = void[0]; enum unit = Unit.init; alias MaybeKeyValue = Tuple!(string, "key", string, "value", bool, "ok"); struct DoneMessage {} struct Deb { string name; string description; Unit[string] tags; // set of tags Deb dup() const { Deb deb; deb.name = name; deb.description = description; foreach (key; tags.byKey) deb.tags[key] = unit; return deb; } bool valid() { import std.string: empty; return !name.empty && !description.empty; } void clear() { name = ""; description = ""; tags.clear; } } struct Model { import std.concurrency: Tid; private Deb[string] debForName; // Only read once populated size_t length() const { return debForName.length; } void readPackages() { import std.concurrency: receive, thisTid, spawn; import std.file: dirEntries, FileException, SpanMode; Tid[] tids; try { foreach (string filename; dirEntries(PACKAGE_DIR, PACKAGE_PATTERN, SpanMode.shallow)) tids ~= spawn(&readPackageFile, thisTid, filename); auto jobs = tids.length; while (jobs) { receive( (Deb deb) { debForName[deb.name] = deb; }, (DoneMessage) { jobs--; } ); } } catch (FileException err) { import std.stdio: stderr; stderr.writeln("failed to read packages: ", err); } } private void readPackageFile(Tid parentTid, string filename) { import std.concurrency: send; import std.file: FileException; import std.range: enumerate; import std.stdio: File, stderr; try { bool inDescription = false; // Descriptions can by multi-line bool inContinuation = false; // Other things can be multi-line Deb deb; auto file = File(filename); foreach(lino, line; file.byLine.enumerate(1)) readPackageLine(parentTid, filename, lino, line, deb, inDescription, inContinuation); if (deb.valid) send(parentTid, cast(immutable)deb.dup); } catch (FileException err) { stderr.writefln("error: %s: failed to read packages: %s", filename, err); } send(parentTid, DoneMessage()); } private void readPackageLine( Tid parentTid, const string filename, const int lino, const(char[]) line, ref Deb deb, ref bool inDescription, ref bool inContinuation) { import std.concurrency: send; import std.path: baseName; import std.stdio: stderr; import std.string: empty, startsWith, strip; if (strip(line).empty) { if (deb.valid) send(parentTid, cast(immutable)deb.dup); else if (!deb.name.empty || !deb.description.empty || !deb.tags.empty) stderr.writefln("error: %s:%,d: incomplete package: %s", baseName(filename), lino, deb); deb.clear; return; } if (inDescription || inContinuation) { if (line.startsWith(' ') || line.startsWith('\t')) { if (inDescription) deb.description ~= line; return; } inDescription = inContinuation = false; } immutable keyValue = maybeKeyValue(line); if (!keyValue.ok) inContinuation = true; else inDescription = populateDeb(deb, keyValue.key, keyValue.value); } } MaybeKeyValue maybeKeyValue(const(char[]) line) { import std.string: indexOf, strip; immutable i = line.indexOf(':'); if (i == -1) return MaybeKeyValue("", "", false); immutable key = strip(line[0..i]).idup; immutable value = strip(line[i + 1..$]).idup; return MaybeKeyValue(key, value, true); } bool populateDeb(ref Deb deb, const string key, const string value) { import std.conv: to; switch (key) { case "Package": deb.name = value; return false; case "Description", "Npp-Description": // XXX ignore Npp-? deb.description ~= value; return true; // We are now in a description case "Tag": maybePopulateTags(deb, value); return false; default: return false; // Ignore "uninteresting" fields } } void maybePopulateTags(ref Deb deb, const string tags) { import std.regex: ctRegex, split; auto rx = ctRegex!(`\s*,\s*`); foreach (tag; tags.split(rx)) deb.tags[tag] = unit; } // end of app.d Error output: src/app.d(59,30): Error: template std.concurrency.spawn cannot deduce function from argument types !()(void delegate(Tid parentTid, string filename), Tid, string), candidates are: /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/concurrency.d(460,5): spawn(F, T...)(F fn, T args) with F = void delegate(Tid, string), T = (Tid, string) must satisfy the following constraint: isSpawnable!(F, T) src/app.d(62,24): Error: template std.concurrency.receive cannot deduce function from argument types !()(void delegate(Deb deb) pure nothrow safe, void), candidates are: /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/concurrency.d(673,6): receive(T...)(T ops) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(701,26): Error: cannot implicitly convert expression rhs of type immutable(Deb) to Deb /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(603,17): Error: template instance std.variant.VariantN!32LU.VariantN.opAssign!(immutable(Deb)) error instantiating /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(126,22): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(656,23): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(647,10): instantiated from here: _send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(626,10): instantiated from here: _send!(immutable(Deb)) src/app.d(88,21): instantiated from here: send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/ldc2 failed with exit code 1. Hopefully this will help someone understand and be able to help!
Mar 10 2020
On Tuesday, 10 March 2020 at 20:03:21 UTC, mark wrote:I've managed to make a cut-down version that's < 170 LOC. It needs to be run on Debian or a Debian-based Linux (e.g., Ubuntu). Hopefully this will help someone understand and be able to help!This took some time figuring out. Turns out, std.concurrency.spawn won't take a delegate as its callable argument. There are sensible reasons for this - delegates have a context that is not guaranteed to be immutable, so allowing delegate callables could lead to mutable aliasing. I've filed an issue to improve documentation and error messages: https://issues.dlang.org/show_bug.cgi?id=20665 However, knowing that some things are impossible do not necessarily help us figure out what we can do to fix the problem, and the good news is, the problems can be fixed. Since the problem is we're giving a delegate where the API expects a function, we can simply turn it into a function. In the code you've given, that means making readPackageFile and readPackageLine static. This make spawn() run as it should. In addition, there's a problem with this line in receive(): (DoneMessage) { jobs--; } That looks sensible, but since DoneMessage doesn't have a name, it is parsed as a templated function taking one argument of unspecified type and called DoneMessage. For some reason, templates passed as function arguments show up in compiler output as 'void', giving this weird error message: template std.concurrency.receive cannot deduce function from argument types !()(void delegate(Deb deb) pure nothrow safe, void) The solution here is to simply give DoneMessage a name: (DoneMessage d) { jobs--; } With those changes, things at least compile. Now it's up to you to ensure the semantics are correct. :) One last thing: you're passing parentTid around, and that's not actually necessary - std.concurrency.ownerTid does the exact same thing, and is available even if not explicitly passed anywhere. -- Simen
Mar 11 2020
Hi Simen, I think you must have done something else but didn't mention to get it to compile. I did the exact changes you said and it wouldn't compile. Here's what I get with changes mentioned below (with new full source): /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(701,26): Error: cannot implicitly convert expression rhs of type immutable(Deb) to Deb /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/variant.d(603,17): Error: template instance std.variant.VariantN!32LU.VariantN.opAssign!(immutable(Deb)) error instantiating /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(126,22): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(656,23): instantiated from here: __ctor!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(647,10): instantiated from here: _send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/../import/std/co currency.d(626,10): instantiated from here: _send!(immutable(Deb)) src/app.d(89,17): instantiated from here: send!(immutable(Deb)) /home/mark/opt/ldc2-1.20.0-linux-x86_64/bin/ldc2 failed with exit code 1. Here is a modified version that names the DoneMessage variable and makes readPackageFile and readPackageLine into functions. Instead of making them static, I just took them outside the struct: after all, they never access struct member data. I also now use ownerTid. // app.d import std.typecons: Tuple; void main() { import std.stdio: writeln, writefln; auto model = Model(); model.readPackages(); writefln("read %,d packages", model.length); } enum PACKAGE_DIR = "/var/lib/apt/lists"; enum PACKAGE_PATTERN = "*Packages"; alias Unit = void[0]; // These two lines allow me to use AAs as sets enum unit = Unit.init; alias MaybeKeyValue = Tuple!(string, "key", string, "value", bool, "ok"); struct DoneMessage {} struct Deb { string name; string description; Unit[string] tags; // set of tags Deb dup() const { Deb deb; deb.name = name; deb.description = description; foreach (key; tags.byKey) deb.tags[key] = unit; return deb; } bool valid() { import std.string: empty; return !name.empty && !description.empty; } void clear() { name = ""; description = ""; tags.clear; } } struct Model { import std.concurrency: Tid; private Deb[string] debForName; // Only read once populated size_t length() const { return debForName.length; } void readPackages() { import std.concurrency: receive, spawn; import std.file: dirEntries, FileException, SpanMode; Tid[] tids; try { foreach (string filename; dirEntries(PACKAGE_DIR, PACKAGE_PATTERN, SpanMode.shallow)) tids ~= spawn(&readPackageFile, filename); auto jobs = tids.length; while (jobs) { receive( (Deb deb) { debForName[deb.name] = deb; }, (DoneMessage m) { jobs--; } ); } } catch (FileException err) { import std.stdio: stderr; stderr.writeln("failed to read packages: ", err); } } } void readPackageFile(string filename) { import std.concurrency: ownerTid, send; import std.file: FileException; import std.range: enumerate; import std.stdio: File, stderr; try { bool inDescription = false; // Descriptions can by multi-line bool inContinuation = false; // Other things can be multi-line Deb deb; auto file = File(filename); foreach(lino, line; file.byLine.enumerate(1)) readPackageLine(filename, lino, line, deb, inDescription, inContinuation); if (deb.valid) send(ownerTid, cast(immutable)deb.dup); } catch (FileException err) { stderr.writefln("error: %s: failed to read packages: %s", filename, err); } send(ownerTid, DoneMessage()); } void readPackageLine(const string filename, const int lino, const(char[]) line, ref Deb deb, ref bool inDescription, ref bool inContinuation) { import std.concurrency: ownerTid, send; import std.path: baseName; import std.stdio: stderr; import std.string: empty, startsWith, strip; if (strip(line).empty) { if (deb.valid) send(ownerTid, cast(immutable)deb.dup); else if (!deb.name.empty || !deb.description.empty || !deb.tags.empty) stderr.writefln("error: %s:%,d: incomplete package: %s", baseName(filename), lino, deb); deb.clear; return; } if (inDescription || inContinuation) { if (line.startsWith(' ') || line.startsWith('\t')) { if (inDescription) deb.description ~= line; return; } inDescription = inContinuation = false; } immutable keyValue = maybeKeyValue(line); if (!keyValue.ok) inContinuation = true; else inDescription = populateDeb(deb, keyValue.key, keyValue.value); } MaybeKeyValue maybeKeyValue(const(char[]) line) { import std.string: indexOf, strip; immutable i = line.indexOf(':'); if (i == -1) return MaybeKeyValue("", "", false); immutable key = strip(line[0..i]).idup; immutable value = strip(line[i + 1..$]).idup; return MaybeKeyValue(key, value, true); } bool populateDeb(ref Deb deb, const string key, const string value) { import std.conv: to; switch (key) { case "Package": deb.name = value; return false; case "Description", "Npp-Description": // XXX ignore Npp-? deb.description ~= value; return true; // We are now in a description case "Tag": maybePopulateTags(deb, value); return false; default: return false; // Ignore "uninteresting" fields } } void maybePopulateTags(ref Deb deb, const string tags) { import std.regex: ctRegex, split; auto rx = ctRegex!(`\s*,\s*`); foreach (tag; tags.split(rx)) deb.tags[tag] = unit; }
Mar 11 2020
I finally got a threaded version that works, and a lot more cleanly than using send/receive. (But performance is dismal, see the end.) Here's the heart of the solution: void readPackages() { import std.algorithm: max; import std.array: array; import std.parallelism: taskPool, totalCPUs; import std.file: dirEntries, FileException, SpanMode; try { auto filenames = dirEntries(PACKAGE_DIR, PACKAGE_PATTERN, SpanMode.shallow).array; foreach (debs; taskPool.map!readPackageFile(filenames)) foreach (deb; debs) debForName[deb.name] = deb.dup; } catch (FileException err) { import std.stdio: stderr; stderr.writeln("failed to read packages: ", err); } } I had to change readPackageFile (and the functions it calls), e.g.,: Deb[] readPackageFile(string filename) { import std.file: FileException; import std.range: enumerate; import std.stdio: File, stderr; Deb[] debs; Deb deb; try { bool inDescription = false; // Descriptions can by multi-line bool inContinuation = false; // Other things can be multi-line auto file = File(filename); foreach(lino, line; file.byLine.enumerate(1)) readPackageLine(debs, deb, filename, lino, line, inDescription, inContinuation); if (deb.valid) debs ~= deb.dup; } catch (FileException err) { stderr.writefln("error: %s: failed to read packages: %s", filename, err); } return debs; } I also changed main() to do some timings & to allow me to compare outputs: void main(const string[] args) { import std.datetime.stopwatch: AutoStart, StopWatch; import std.stdio: stderr, writeln; auto model = Model(); auto timer = StopWatch(AutoStart.yes); model.readPackages(); stderr.writefln("read %,d packages in %s", model.length, timer.peek); if (args.length > 1) foreach (deb; model.debForName) writeln(deb); } This produces the same output as the single-threaded version. Here's the output of a typical single-threaded version's run: read 65,480 packages in 1 sec, 314 ms, 798 μs, and 7 hnsecs And here's the output of a typical task-based multi-threaded version's run: read 65,480 packages in 1 sec, 377 ms, 605 μs, and 3 hnsecs In fact, the multi-threaded has never yet been as fast as the single-threaded version! I've put both versions on github in case anyone's interested: https://github.com/mark-summerfield/d-debtest-experiment
Mar 11 2020
On Wednesday, 11 March 2020 at 09:29:54 UTC, mark wrote:Hi Simen, I think you must have done something else but didn't mention to get it to compile. I did the exact changes you said and it wouldn't compile. Here's what I get with changes mentioned below (with new full source):Fascinating. It works just fine when compiling for 32-bit targets with DMD on Windows, but not for 64-bit targets, nor when compiling with LDC. Apparently, this difference is due to DMD supporting 80-bit reals, and thus giving a different size to Variant (VariantN!20 on DMD on Windows, VariantN!16 or VariantN!32 elsewhere). There's a bug in VariantN that then causes the compilation to fail (https://issues.dlang.org/show_bug.cgi?id=20666). The issue at hand then, is that Deb is too big until that issue if fixed. The simple solution to this is to allocate Deb on the heap with new and pass pointers instead of instances directly. Since you are already calling .dup whenever you pass a Deb somewhere, you can simply modify .dup to return a Deb* and the receive function to receive a Deb*, and I think you should be good to go. -- Simen
Mar 11 2020
On Wednesday, 11 March 2020 at 12:22:21 UTC, Simen Kjærås wrote:On Wednesday, 11 March 2020 at 09:29:54 UTC, mark wrote:[snip]Fascinating. It works just fine when compiling for 32-bit targets with DMD on Windows, but not for 64-bit targets, nor when compiling with LDC. Apparently, this difference is due to DMD supporting 80-bit reals, and thus giving a different size to Variant (VariantN!20 on DMD on Windows, VariantN!16 or VariantN!32 elsewhere). There's a bug in VariantN that then causes the compilation to fail (https://issues.dlang.org/show_bug.cgi?id=20666). The issue at hand then, is that Deb is too big until that issue if fixed. The simple solution to this is to allocate Deb on the heap with new and pass pointers instead of instances directly. Since you are already calling .dup whenever you pass a Deb somewhere, you can simply modify .dup to return a Deb* and the receive function to receive a Deb*, and I think you should be good to go.I did that and it compiles & runs, but no Debs get added to the collection. See https://github.com/mark-summerfield/d-debtest-experiment -- the 'mto' version is the one with your fixes.
Mar 11 2020
On Wednesday, 11 March 2020 at 12:43:28 UTC, mark wrote:On Wednesday, 11 March 2020 at 12:22:21 UTC, Simen Kjærås wrote:Yeah, I forgot we cast to immutable to be able to send, so receive has to receive immutable(Deb)*, after which you can call deb.dup to get a mutable copy: receive( (immutable(Deb)* deb) { debForName[deb.name] = deb.dup; }, (DoneMessage m) { jobs--; } ); -- SimenOn Wednesday, 11 March 2020 at 09:29:54 UTC, mark wrote:[snip]Fascinating. It works just fine when compiling for 32-bit targets with DMD on Windows, but not for 64-bit targets, nor when compiling with LDC. Apparently, this difference is due to DMD supporting 80-bit reals, and thus giving a different size to Variant (VariantN!20 on DMD on Windows, VariantN!16 or VariantN!32 elsewhere). There's a bug in VariantN that then causes the compilation to fail (https://issues.dlang.org/show_bug.cgi?id=20666). The issue at hand then, is that Deb is too big until that issue if fixed. The simple solution to this is to allocate Deb on the heap with new and pass pointers instead of instances directly. Since you are already calling .dup whenever you pass a Deb somewhere, you can simply modify .dup to return a Deb* and the receive function to receive a Deb*, and I think you should be good to go.I did that and it compiles & runs, but no Debs get added to the collection. See https://github.com/mark-summerfield/d-debtest-experiment -- the 'mto' version is the one with your fixes.
Mar 11 2020
On Wednesday, 11 March 2020 at 14:01:13 UTC, Simen Kjærås wrote: [snip]Yeah, I forgot we cast to immutable to be able to send, so receive has to receive immutable(Deb)*, after which you can call deb.dup to get a mutable copy: receive( (immutable(Deb)* deb) { debForName[deb.name] = deb.dup; }, (DoneMessage m) { jobs--; } );Thanks, that fixed it. However, timing-wise the single threaded version (st) is fastest, then the task multi-threaded version (mt), and finally this version (mto).
Mar 11 2020