digitalmars.D.learn - AA vs __gshared
- IchorDev (14/14) Jul 27 2023 I've been getting a lot of segfaults from using associative
- Dennis (2/4) Jul 27 2023 Are you accessing the AA from multiple threads?
- Jonathan M Davis (34/48) Jul 27 2023 Strictly speaking, __gshared is really only intended for stuff like C
- IchorDev (15/26) Jul 27 2023 I use a shared Mutex from `core.sync.mutex`.
- Kagamin (6/10) Jul 27 2023 The difference between them is purely formal if you're not on an
- IchorDev (6/9) Jul 27 2023 I'm not sure that's correct. Also I always use the latest
- IchorDev (163/175) Jul 28 2023 Doing this doesn't help, unfortunately.
- Steven Schveighoffer (10/44) Jul 28 2023 My suspicion would be a race condition in your real code, and no race
- IchorDev (21/25) Jul 28 2023 "All __gshared does is give you [a live bomb, ready to go off at
- Steven Schveighoffer (25/52) Jul 28 2023 It seems like it's not __gshared at all, but misusing malloc with GC
- IchorDev (6/24) Aug 09 2023 This also works *unless* I do a `foreach` over the AA's keys &
- Steven Schveighoffer (10/14) Aug 10 2023 That shouldn't matter.
- IchorDev (12/15) Aug 12 2023 Well, it does here. The AA is mutated during the loop, so perhaps
- Steven Schveighoffer (15/31) Aug 15 2023 oh yeah. That is not allowed. Any mutation of the AA during iteration
- IchorDev (8/27) Aug 15 2023 And yet this apparently doesn’t apply to an equivalent `for`
- Kagamin (2/2) Jul 28 2023 Your error is using allocating the object with malloc. Since gc
I've been getting a lot of segfaults from using associative arrays recently. The faults happen seemingly at random, and from pretty mundane stuff like `if(auto x = y in z)` that run very often: ``` Segmentation fault. const(void*), scope const(TypeInfo)) inout () ``` I suspect that this is because they've all been placed inside `__ghared` structs. Are DRuntime's AAs simply incompatible with `__gshared`? Do they need to be marked as `shared` to prevent these shenanigans?
Jul 27 2023
On Thursday, 27 July 2023 at 15:57:51 UTC, IchorDev wrote:The faults happen seemingly at random, and from pretty mundane stuff like `if(auto x = y in z)` that run very often:Are you accessing the AA from multiple threads?
Jul 27 2023
On Thursday, July 27, 2023 9:57:51 AM MDT IchorDev via Digitalmars-d-learn wrote:I've been getting a lot of segfaults from using associative arrays recently. The faults happen seemingly at random, and from pretty mundane stuff like `if(auto x = y in z)` that run very often: ``` Segmentation fault. const(void*), scope const(TypeInfo)) inout () ``` I suspect that this is because they've all been placed inside `__ghared` structs. Are DRuntime's AAs simply incompatible with `__gshared`? Do they need to be marked as `shared` to prevent these shenanigans?Strictly speaking, __gshared is really only intended for stuff like C globals (which can't be shared due to name-mangling issues). Using it on anything else can at least potentially cause problems due to the fact that the compiler will assume that the variable is thread-local. So, I would strongly advise against using __gshared in a case like this. In practice, you can often get away with it, because the compiler doesn't do much in the way of optimizing stuff based on objects being thread-local right now, but it's definitely risking problems with the type system if you used __gshared when you're not trying to do something like bind to a C global. What should normally be happening is that you use shared, and then when you've protected the object so that you know that it can only be accessed on the current thread by the section of code that you're in (e.g. by locking a mutex), you temporarily cast away shared to operate on the object via a thread-local reference. Then, before exiting that section of code and removing the protections that are preventing other threads from accessing the object (e.g. by unlocking the mutex), you make sure that you've gotten rid of all of the thread-local references to the object so that only the shared reference exists. That way, you don't accidentally mutate the object while it's not protected from access by other threads. Now, as to what's happening in your code that's causing segfaults, the most likely culprit would be that you're accessing the AA without actually having done anything to prevent other threads from accessing it at the same time (or your protections were inadequate). And because the object is being treated as thread-local by the compiler, it would be easy to have accidentally let a reference to it leak somewhere that wasn't being protected by whatever mutex you're using, whereas if the AA were shared, the only sections of code where you would have to worry about thread-local references escaping would be in the sections of code where you've cast away shared after locking the relevant mutex. So, similar to what happens with safe and trusted, using shared allows you to limit the code that you have to examine to find the problem. - Jonathan M Davis
Jul 27 2023
On Thursday, 27 July 2023 at 21:31:02 UTC, Jonathan M Davis wrote:Now, as to what's happening in your code that's causing segfaults, the most likely culprit would be that you're accessing the AA without actually having done anything to prevent other threads from accessing it at the same time (or your protections were inadequate).I use a shared Mutex from `core.sync.mutex`. The AA itself and the `final class` that it's a member of is only ever accessed by one thread for now. The code inside that `final class` that accesses the AA is called by a function from another `final class`, and that other `final class` is used by multiple threads, but it's fully guarded by Mutex locks—I haven't had any issues with the millions of non-atomic reads and writes on its data I've performed from the two threads. Both objects are "static" (declared at module scope) if that matters at all.if the AA were shared, the only sections of code where you would have to worry about thread-local references escaping would be in the sections of code where you've cast away shared after locking the relevant mutex. So, similar to what happens with safe and trusted, using shared allows you to limit the code that you have to examine to find the problem.I was told that using `__gshared` is quite a bit faster at runtime than using `shared`, but I also don't really know anything concrete about `shared` because the spec is so incredibly vague about it.
Jul 27 2023
On Friday, 28 July 2023 at 03:54:53 UTC, IchorDev wrote:I was told that using `__gshared` is quite a bit faster at runtime than using `shared`, but I also don't really know anything concrete about `shared` because the spec is so incredibly vague about it.The difference between them is purely formal if you're not on an crashes are frequent, can you reproduce a crash with a minimal amount of code, start many threads and access the locked AA concurrently.
Jul 27 2023
On Friday, 28 July 2023 at 04:13:13 UTC, Kagamin wrote:The difference between them is purely formal if you're not onI'm not sure that's correct. Also I always use the latest compiler versions where possible.start many threads and access the locked AA concurrently.It's only being accessed from one thread, so such a test wouldn't be an accurate reproduction of the issue I'm having. Also I'm only using 2 threads total.
Jul 27 2023
On Thursday, 27 July 2023 at 21:31:02 UTC, Jonathan M Davis wrote:What should normally be happening is that you use shared, and then when you've protected the object so that you know that it can only be accessed on the current thread by the section of code that you're in (e.g. by locking a mutex), you temporarily cast away shared to operate on the object via a thread-local reference. Then, before exiting that section of code and removing the protections that are preventing other threads from accessing the object (e.g. by unlocking the mutex), you make sure that you've gotten rid of all of the thread-local references to the object so that only the shared reference exists. That way, you don't accidentally mutate the object while it's not protected from access by other threads.Doing this doesn't help, unfortunately. This is an abstract version what the problematic code looks like now: ```d import std.stdio: writeln; import core.sync.mutex; import core.thread; import core.time; static import core.stdc.stdlib; struct Pos{ long x,y; } shared Obj obj; final class Obj{ CacheData[Pos] cache; CacheData* getCache(Pos key){ if(auto item = key in cache){ if(++item.useCount >= 8){ cache.remove(key); //I thought this ^ might cause a use-after-free issue, but apparently not. } return item; } return null; } struct CacheData{ double[1000] data; uint useCount = 1; } double[1000] doStuff(Pos pos){ immutable data = (){ if(auto data = getCache(pos)){ return *data; }else{ double[1000] data; //initialisses it with something, this is arbirray though: foreach(ref item; data){ import std.random; item = uniform01(); } cache[pos] = CacheData(data); return CacheData(data); } }(); //do stuff with data return data.data; } } __gshared OtherObj otherObj; final class OtherObj{ shared Mutex mutex; __gshared ubyte[2^^18] data; this(long n){ obj = cast(shared Obj)alloc!Obj(n); mutex = new shared Mutex; } void callObj(Pos pos){ double[1000] data; { auto objRef = cast(Obj)obj; data = objRef.doStuff(pos); } //do things with returned value... } } void thread(){ bool run = true; Pos pos; while(run){ otherObj.mutex.lock(); foreach(i; 0..100){ otherObj.callObj(pos); //this is pretty arbirary: import std.random; if(uniform01() > 0.9){ auto v = uniform01(); if(v < 0.25) pos.x--; else if(v < 0.5) pos.y++; else if(v < 0.75) pos.y--; else pos.x++; } } otherObj.mutex.unlock(); Thread.sleep(1.seconds / 20); } } void main(){ otherObj = alloc!OtherObj(-7); assert(otherObj !is null); auto otherThread = new Thread(&thread); otherThread.start(); bool run = true; while(run){ if(!otherThread.isRunning()) otherThread.join(); otherObj.mutex.lock(); foreach(val; otherObj.data){ //do stuff } otherObj.mutex.unlock(); Thread.sleep(1.seconds / 80); } } T alloc(T, A...)(auto ref A args){ enum classSize = __traits(classInstanceSize, T); void* mem = core.stdc.stdlib.malloc(classSize); scope(failure) core.stdc.stdlib.free(mem); assert(mem !is null, "Out of memory"); mem[0..classSize] = __traits(initSymbol, T); T inst = cast(T)mem; static if(__traits(hasMember, T, "__ctor")){ inst.__ctor(__traits(parameters)); } return inst; } ``` Issue is, this code I posted actually runs just fine, unlike the real code. My actual code does this HIGHLY SUSPICIOUS thing when printing their length each time before using them: ``` 766 766 765 766 767 768 768 768 768 768 768 768 768 768 768 768 768 768 768 128000 Error Program exited with code -11 (backtrace:) const(void*), scope const(TypeInfo)) inout () ``` Therefore I must logically conclude that DRuntime's AAs are cursed! Unless this is a well-known issue or something... Thinking back, I've actually had them cause segfaults in non-threaded code, maybe `__gshared` was never the cause at all.
Jul 28 2023
On 7/28/23 4:39 AM, IchorDev wrote:Issue is, this code I posted actually runs just fine, unlike the real code. My actual code does this HIGHLY SUSPICIOUS thing when printing their length each time before using them: ``` 766 766 765 766 767 768 768 768 768 768 768 768 768 768 768 768 768 768 768 128000 Error Program exited with code -11 (backtrace:) const(void*), scope const(TypeInfo)) inout () ```My suspicion would be a race condition in your real code, and no race condition in this toy code. Second suspicion would be memory corruption (possibly caused by a race condition).Therefore I must logically conclude that DRuntime's AAs are cursed! Unless this is a well-known issue or something...AAs have worked forever. I've never had problems with them. Not saying there's not a bug here, maybe there is. But I would be surprised.Thinking back, I've actually had them cause segfaults in non-threaded code, maybe `__gshared` was never the cause at all.All `__gshared` does is give you storage that is accessible from all threads, but is not typed as `shared`. It doesn't change how the data is used. -Steve
Jul 28 2023
On Friday, 28 July 2023 at 11:15:31 UTC, Steven Schveighoffer wrote:All `__gshared` does is give you storage that is accessible from all threads,"All __gshared does is give you [a live bomb, ready to go off at any moment]" !! On Friday, 28 July 2023 at 14:10:16 UTC, Kagamin wrote:Your error is using allocating the object with malloc. Since gc doesn't see your AA, the AA is freed and you get UAF.Friend, I think you nailed it. After adding this I haven't been able to reproduce the segfault again: ```d this(long n){ (){ cache = new typeof(cache); assert(cache); import core.memory; core.memory.GC.addRoot(cast(void*)cache); }(); // ... ``` It did always happen at random, so perhaps I haven't spent enough time testing it yet, but I've gone far longer without it segfaulting than ever before.
Jul 28 2023
On 7/28/23 11:15 AM, IchorDev wrote:On Friday, 28 July 2023 at 11:15:31 UTC, Steven Schveighoffer wrote:It seems like it's not __gshared at all, but misusing malloc with GC pointers.All `__gshared` does is give you storage that is accessible from all threads,"All __gshared does is give you [a live bomb, ready to go off at any moment]" !!On Friday, 28 July 2023 at 14:10:16 UTC, Kagamin wrote:This is the wrong approach, it's the allocating call that should add the root (actually a range). For instance, the mutex is not added, that might be collected. Or if you add more GC-pointing things into the class, that could be a problem. What I'd do is: ```d T alloc(T, A...)(auto ref A args){ enum classSize = __traits(classInstanceSize, T); void* mem = core.stdc.stdlib.malloc(classSize); assert(mem !is null, "Out of memory"); core.memory.GC.addRange(mem[0 .. classSize]); scope(failure) { core.memory.GC.removeRange(mem[0 .. classSize]); core.stdc.stdlib.free(mem); } T inst = cast(T)mem; inst.emplace(__traits(parameters)); return inst; } ``` And of course, a `dealloc` that removes the range should also be added. -SteveYour error is using allocating the object with malloc. Since gc doesn't see your AA, the AA is freed and you get UAF.Friend, I think you nailed it. After adding this I haven't been able to reproduce the segfault again: ```d this(long n){ (){ cache = new typeof(cache); assert(cache); import core.memory; core.memory.GC.addRoot(cast(void*)cache); }(); // ... ``` It did always happen at random, so perhaps I haven't spent enough time testing it yet, but I've gone far longer without it segfaulting than ever before.
Jul 28 2023
On Friday, 28 July 2023 at 21:22:01 UTC, Steven Schveighoffer wrote:```d T alloc(T, A...)(auto ref A args){ enum classSize = __traits(classInstanceSize, T); void* mem = core.stdc.stdlib.malloc(classSize); assert(mem !is null, "Out of memory"); core.memory.GC.addRange(mem[0 .. classSize]); scope(failure) { core.memory.GC.removeRange(mem[0 .. classSize]); core.stdc.stdlib.free(mem); } T inst = cast(T)mem; inst.emplace(__traits(parameters)); return inst; } ``` And of course, a `dealloc` that removes the range should also be added. -SteveThis also works *unless* I do a `foreach` over the AA's keys & values (`foreach(k,v; aa)`), in which case it still segfaults at random. Deconstructing the foreach into a for works though, for whatever reason.
Aug 09 2023
On 8/10/23 1:59 AM, IchorDev wrote:This also works *unless* I do a `foreach` over the AA's keys & values (`foreach(k,v; aa)`), in which case it still segfaults at random. Deconstructing the foreach into a for works though, for whatever reason.That shouldn't matter. The key thing is, if you are going to use malloc, you need to register the memory with the GC. The GC doesn't scan memory it doesn't know about. But my main point was the class itself shouldn't be registering itself -- it did not make the decision to malloc. You should register where you allocate. I also highly recommend using `emplace` to handle all the sticky issues with lifetime/construction. -Steve
Aug 10 2023
On Thursday, 10 August 2023 at 15:20:28 UTC, Steven Schveighoffer wrote:That shouldn't matter.Well, it does here. The AA is mutated during the loop, so perhaps this is an optimisation quirk where it works with `for` but segfaults in `foreach`? I've pretty thoroughly abused the `for` version and I haven't gotten it to segfault yet.I also highly recommend using `emplace` to handle all the sticky issues with lifetime/construction.Have not run into the aforementioned sticky issues yet, but I can't even find `emplace`'s docs anywhere now. I recall it being incompatible with classes that have nogc/nothrow constructors though, which made it pretty useless to me, and it wouldn't work with BetterC, which was a requirement for the allocation wrapper I was writing at the time.
Aug 12 2023
On 8/12/23 5:55 AM, IchorDev wrote:On Thursday, 10 August 2023 at 15:20:28 UTC, Steven Schveighoffer wrote:oh yeah. That is not allowed. Any mutation of the AA during iteration can invalidate existing foreach or ranges over the AA. Basically, a rehash can cause the buckets to jumble up, and in that case, the "current index" can be changed to point at a null bucket. More here: https://dlang.org/spec/statement.html#foreach_restrictions In fact, that statement is way too broad. Invalidation of iteration should be based on the type's requirements. We really should put a note in the AA spec page.That shouldn't matter.Well, it does here. The AA is mutated during the loop, so perhaps this is an optimisation quirk where it works with `for` but segfaults in `foreach`? I've pretty thoroughly abused the `for` version and I haven't gotten it to segfault yet.https://dlang.org/phobos/core_lifetime.html#.emplaceI also highly recommend using `emplace` to handle all the sticky issues with lifetime/construction.Have not run into the aforementioned sticky issues yet, but I can't even find `emplace`'s docs anywhere now.I recall it being incompatible with classes that have nogc/nothrow constructors though, which made it pretty useless to me, and it wouldn't work with BetterC, which was a requirement for the allocation wrapper I was writing at the time.It probably won't work with betterC, but that's probably just because of linker errors. Any attribute requirements would be inferred based on the attributes of your constructor, because emplace is a template. -Steve
Aug 15 2023
On Tuesday, 15 August 2023 at 17:36:13 UTC, Steven Schveighoffer wrote:On 8/12/23 5:55 AM, IchorDev wrote:And yet this apparently doesn’t apply to an equivalent `for` somehow. Perhaps `foreach` shouldn’t have this arbitrary problem in the first place…On Thursday, 10 August 2023 at 15:20:28 UTC, Steven Schveighoffer wrote:oh yeah. That is not allowed. Any mutation of the AA during iteration can invalidate existing foreach or ranges over the AA.That shouldn't matter.Well, it does here. The AA is mutated during the loop, so perhaps this is an optimisation quirk where it works with `for` but segfaults in `foreach`? I've pretty thoroughly abused the `for` version and I haven't gotten it to segfault yet.In fact, that statement is way too broad. Invalidation of iteration should be based on the type's requirements. We really should put a note in the AA spec page.Probably yeah.https://dlang.org/phobos/core_lifetime.html#.emplace Any attribute requirements would be inferred based on the attributes of your constructor, because emplace is a template. -SteveWell the docs don’t say that at all, and the code is an unreadable mess. I dunno how anymore is meant to figure that out?
Aug 15 2023
Your error is using allocating the object with malloc. Since gc doesn't see your AA, the AA is freed and you get UAF.
Jul 28 2023