digitalmars.D.learn - AA vs _

digitalmars.D.learn - AA vs __gshared

IchorDev (14/14) Jul 27 2023 I've been getting a lot of segfaults from using associative

Dennis (2/4) Jul 27 2023 Are you accessing the AA from multiple threads?
Jonathan M Davis (34/48) Jul 27 2023 Strictly speaking, __gshared is really only intended for stuff like C

IchorDev (15/26) Jul 27 2023 I use a shared Mutex from `core.sync.mutex`.

Kagamin (6/10) Jul 27 2023 The difference between them is purely formal if you're not on an

IchorDev (6/9) Jul 27 2023 I'm not sure that's correct. Also I always use the latest

IchorDev (163/175) Jul 28 2023 Doing this doesn't help, unfortunately.

Steven Schveighoffer (10/44) Jul 28 2023 My suspicion would be a race condition in your real code, and no race

IchorDev (21/25) Jul 28 2023 "All __gshared does is give you [a live bomb, ready to go off at

Steven Schveighoffer (25/52) Jul 28 2023 It seems like it's not __gshared at all, but misusing malloc with GC

IchorDev (6/24) Aug 09 2023 This also works *unless* I do a `foreach` over the AA's keys &

Steven Schveighoffer (10/14) Aug 10 2023 That shouldn't matter.

IchorDev (12/15) Aug 12 2023 Well, it does here. The AA is mutated during the loop, so perhaps

Steven Schveighoffer (15/31) Aug 15 2023 oh yeah. That is not allowed. Any mutation of the AA during iteration

IchorDev (8/27) Aug 15 2023 And yet this apparently doesn’t apply to an equivalent `for`

Kagamin (2/2) Jul 28 2023 Your error is using allocating the object with malloc. Since gc

IchorDev <zxinsworld gmail.com> writes:

I've been getting a lot of segfaults from using associative 
arrays recently. The faults happen seemingly at random, and from 
pretty mundane stuff like `if(auto x = y in z)` that run very 
often:
```
Segmentation fault.

const(void*), scope const(TypeInfo)) inout ()

```

I suspect that this is because they've all been placed inside 
`__ghared` structs. Are DRuntime's AAs simply incompatible with 
`__gshared`? Do they need to be marked as `shared` to prevent 
these shenanigans?

Jul 27 2023

Dennis <dkorpel gmail.com> writes:

On Thursday, 27 July 2023 at 15:57:51 UTC, IchorDev wrote:
 The faults happen seemingly at random, and from pretty mundane 
 stuff like `if(auto x = y in z)` that run very often:

Are you accessing the AA from multiple threads?

Jul 27 2023

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, July 27, 2023 9:57:51 AM MDT IchorDev via Digitalmars-d-learn 
wrote:
 I've been getting a lot of segfaults from using associative
 arrays recently. The faults happen seemingly at random, and from
 pretty mundane stuff like `if(auto x = y in z)` that run very
 often:
 ```
 Segmentation fault.

 const(void*), scope const(TypeInfo)) inout ()

 ```

 I suspect that this is because they've all been placed inside
 `__ghared` structs. Are DRuntime's AAs simply incompatible with
 `__gshared`? Do they need to be marked as `shared` to prevent
 these shenanigans?

Strictly speaking, __gshared is really only intended for stuff like C
globals (which can't be shared due to name-mangling issues). Using it on
anything else can at least potentially cause problems due to the fact that
the compiler will assume that the variable is thread-local. So, I would
strongly advise against using __gshared in a case like this. In practice,
you can often get away with it, because the compiler doesn't do much in the
way of optimizing stuff based on objects being thread-local right now, but
it's definitely risking problems with the type system if you used __gshared
when you're not trying to do something like bind to a C global.

What should normally be happening is that you use shared, and then when
you've protected the object so that you know that it can only be accessed on
the current thread by the section of code that you're in (e.g. by locking a
mutex), you temporarily cast away shared to operate on the object via a
thread-local reference. Then, before exiting that section of code and
removing the protections that are preventing other threads from accessing
the object (e.g. by unlocking the mutex), you make sure that you've gotten
rid of all of the thread-local references to the object so that only the
shared reference exists. That way, you don't accidentally mutate the object
while it's not protected from access by other threads.

Now, as to what's happening in your code that's causing segfaults, the most
likely culprit would be that you're accessing the AA without actually having
done anything to prevent other threads from accessing it at the same time
(or your protections were inadequate). And because the object is being
treated as thread-local by the compiler, it would be easy to have
accidentally let a reference to it leak somewhere that wasn't being
protected by whatever mutex you're using, whereas if the AA were shared, the
only sections of code where you would have to worry about thread-local
references escaping would be in the sections of code where you've cast away
shared after locking the relevant mutex. So, similar to what happens with
 safe and  trusted, using shared allows you to limit the code that you have
to examine to find the problem.

- Jonathan M Davis

Jul 27 2023

IchorDev <zxinsworld gmail.com> writes:

On Thursday, 27 July 2023 at 21:31:02 UTC, Jonathan M Davis wrote:
 Now, as to what's happening in your code that's causing 
 segfaults, the most likely culprit would be that you're 
 accessing the AA without actually having done anything to 
 prevent other threads from accessing it at the same time (or 
 your protections were inadequate).

I use a shared Mutex from `core.sync.mutex`.
The AA itself and the `final class` that it's a member of is only 
ever accessed by one thread for now.
The code inside that `final class` that accesses the AA is called 
by a function from another `final class`, and that other `final 
class` is used by multiple threads, but it's fully guarded by 
Mutex locks—I haven't had any issues with the millions of 
non-atomic reads and writes on its data I've performed from the 
two threads. Both objects are "static" (declared at module scope) 
if that matters at all.

 if the AA were shared, the only sections of code where you 
 would have to worry about thread-local references escaping 
 would be in the sections of code where you've cast away shared 
 after locking the relevant mutex. So, similar to what happens 
 with  safe and  trusted, using shared allows you to limit the 
 code that you have to examine to find the problem.

I was told that using `__gshared` is quite a bit faster at 
runtime than using `shared`, but I also don't really know 
anything concrete about `shared` because the spec is so 
incredibly vague about it.

Jul 27 2023

Kagamin <spam here.lot> writes:

On Friday, 28 July 2023 at 03:54:53 UTC, IchorDev wrote:
 I was told that using `__gshared` is quite a bit faster at 
 runtime than using `shared`, but I also don't really know 
 anything concrete about `shared` because the spec is so 
 incredibly vague about it.

The difference between them is purely formal if you're not on an 

crashes are frequent, can you reproduce a crash with a minimal 
amount of code, start many threads and access the locked AA 
concurrently.

Jul 27 2023

IchorDev <zxinsworld gmail.com> writes:

On Friday, 28 July 2023 at 04:13:13 UTC, Kagamin wrote:
 The difference between them is purely formal if you're not on 


I'm not sure that's correct. Also I always use the latest 
compiler versions where possible.

 start many threads and access the locked AA concurrently.

It's only being accessed from one thread, so such a test wouldn't 
be an accurate reproduction of the issue I'm having. Also I'm 
only using 2 threads total.

Jul 27 2023

IchorDev <zxinsworld gmail.com> writes:

On Thursday, 27 July 2023 at 21:31:02 UTC, Jonathan M Davis wrote:
 What should normally be happening is that you use shared, and 
 then when you've protected the object so that you know that it 
 can only be accessed on the current thread by the section of 
 code that you're in (e.g. by locking a mutex), you temporarily 
 cast away shared to operate on the object via a thread-local 
 reference. Then, before exiting that section of code and 
 removing the protections that are preventing other threads from 
 accessing the object (e.g. by unlocking the mutex), you make 
 sure that you've gotten rid of all of the thread-local 
 references to the object so that only the shared reference 
 exists. That way, you don't accidentally mutate the object 
 while it's not protected from access by other threads.

Doing this doesn't help, unfortunately.
This is an abstract version what the problematic code looks like 
now:
```d
import std.stdio: writeln;
import core.sync.mutex;
import core.thread;
import core.time;
static import core.stdc.stdlib;

struct Pos{
	long x,y;
}

shared Obj obj;

final class Obj{
	CacheData[Pos] cache;
	CacheData* getCache(Pos key){
		if(auto item = key in cache){
			if(++item.useCount >= 8){
				cache.remove(key);
				//I thought this ^ might cause a use-after-free issue, but 
apparently not.
			}
			return item;
		}
		return null;
	}
	struct CacheData{
		double[1000] data;
		uint useCount = 1;
	}
	
	double[1000] doStuff(Pos pos){
		immutable data = (){
			if(auto data = getCache(pos)){
				return *data;
			}else{
				double[1000] data;
				//initialisses it with something, this is arbirray though:
				foreach(ref item; data){
					import std.random;
					item = uniform01();
				}
				cache[pos] = CacheData(data);
				return CacheData(data);
			}
		}();
		
		//do stuff with data
		
		return data.data;
	}	
}

__gshared OtherObj otherObj;

final class OtherObj{
	shared Mutex mutex;
	__gshared ubyte[2^^18] data;
	
	this(long n){
		obj = cast(shared Obj)alloc!Obj(n);
		mutex = new shared Mutex;
	}
	
	void callObj(Pos pos){
		double[1000] data;
		
		{
			auto objRef = cast(Obj)obj;
			data = objRef.doStuff(pos);
		}
		
		//do things with returned value...
	}
}

void thread(){
	bool run = true;
	Pos pos;
	while(run){
		otherObj.mutex.lock();
		foreach(i; 0..100){
			otherObj.callObj(pos);
			//this is pretty arbirary:
			import std.random;
			if(uniform01() > 0.9){
				auto v = uniform01();
				if(v < 0.25) pos.x--;
				else if(v < 0.5) pos.y++;
				else if(v < 0.75) pos.y--;
				else pos.x++;
			}
		}
		otherObj.mutex.unlock();
		
		Thread.sleep(1.seconds / 20);
	}
}

void main(){
	otherObj = alloc!OtherObj(-7);
	assert(otherObj !is null);
	
	auto otherThread = new Thread(&thread);
	otherThread.start();
	
	bool run = true;
	while(run){
		if(!otherThread.isRunning()) otherThread.join();
		otherObj.mutex.lock();
		foreach(val; otherObj.data){
			//do stuff
		}
		otherObj.mutex.unlock();
		Thread.sleep(1.seconds / 80);
	}
}

T alloc(T, A...)(auto ref A args){
	enum classSize = __traits(classInstanceSize, T);
	void* mem = core.stdc.stdlib.malloc(classSize);
	scope(failure) core.stdc.stdlib.free(mem);
	assert(mem !is null, "Out of memory");
	mem[0..classSize] = __traits(initSymbol, T);
	T inst = cast(T)mem;
	static if(__traits(hasMember, T, "__ctor")){
		
		inst.__ctor(__traits(parameters));
	}
	return inst;
}
```
Issue is, this code I posted actually runs just fine, unlike the 
real code.
My actual code does this HIGHLY SUSPICIOUS thing when printing 
their length each time before using them:
```
766
766
765
766
767
768
768
768
768
768
768
768
768
768
768
768
768
768
768
128000
Error Program exited with code -11
(backtrace:)

const(void*), scope const(TypeInfo)) inout ()

```
Therefore I must logically conclude that DRuntime's AAs are 
cursed! Unless this is a well-known issue or something...
Thinking back, I've actually had them cause segfaults in 
non-threaded code, maybe `__gshared` was never the cause at all.

Jul 28 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/28/23 4:39 AM, IchorDev wrote:

 Issue is, this code I posted actually runs just fine, unlike the real code.
 My actual code does this HIGHLY SUSPICIOUS thing when printing their 
 length each time before using them:
 ```
 766
 766
 765
 766
 767
 768
 768
 768
 768
 768
 768
 768
 768
 768
 768
 768
 768
 768
 768
 128000
 Error Program exited with code -11
 (backtrace:)

 const(void*), scope const(TypeInfo)) inout ()

 ```

My suspicion would be a race condition in your real code, and no race 
condition in this toy code. Second suspicion would be memory corruption 
(possibly caused by a race condition).

 Therefore I must logically conclude that DRuntime's AAs are cursed! 
 Unless this is a well-known issue or something...

AAs have worked forever. I've never had problems with them. Not saying 
there's not a bug here, maybe there is. But I would be surprised.

 Thinking back, I've actually had them cause segfaults in non-threaded 
 code, maybe `__gshared` was never the cause at all.

All `__gshared` does is give you storage that is accessible from all 
threads, but is not typed as `shared`. It doesn't change how the data is 
used.

-Steve

Jul 28 2023

IchorDev <zxinsworld gmail.com> writes:

On Friday, 28 July 2023 at 11:15:31 UTC, Steven Schveighoffer 
wrote:
 All `__gshared` does is give you storage that is accessible 
 from all threads,

"All __gshared does is give you [a live bomb, ready to go off at 
any moment]"
!!

On Friday, 28 July 2023 at 14:10:16 UTC, Kagamin wrote:
 Your error is using allocating the object with malloc. Since gc 
 doesn't see your AA, the AA is freed and you get UAF.

Friend, I think you nailed it. After adding this I haven't been 
able to reproduce the segfault again:
```d
this(long n){
	(){
		cache = new typeof(cache);
		assert(cache);
		import core.memory;
		core.memory.GC.addRoot(cast(void*)cache);
	}();
	// ...
```
It did always happen at random, so perhaps I haven't spent enough 
time testing it yet, but I've gone far longer without it 
segfaulting than ever before.

Jul 28 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/28/23 11:15 AM, IchorDev wrote:
 On Friday, 28 July 2023 at 11:15:31 UTC, Steven Schveighoffer wrote:
 All `__gshared` does is give you storage that is accessible from all 
 threads,

 
 "All __gshared does is give you [a live bomb, ready to go off at any 
 moment]"
 !!

It seems like it's not __gshared at all, but misusing malloc with GC 
pointers.

 
 On Friday, 28 July 2023 at 14:10:16 UTC, Kagamin wrote:
 Your error is using allocating the object with malloc. Since gc 
 doesn't see your AA, the AA is freed and you get UAF.

 
 Friend, I think you nailed it. After adding this I haven't been able to 
 reproduce the segfault again:
 ```d
 this(long n){
      (){
          cache = new typeof(cache);
          assert(cache);
          import core.memory;
          core.memory.GC.addRoot(cast(void*)cache);
      }();
      // ...
 ```
 It did always happen at random, so perhaps I haven't spent enough time 
 testing it yet, but I've gone far longer without it segfaulting than 
 ever before.

This is the wrong approach, it's the allocating call that should add the 
root (actually a range).

For instance, the mutex is not added, that might be collected. Or if you 
add more GC-pointing things into the class, that could be a problem.

What I'd do is:

```d
T alloc(T, A...)(auto ref A args){
     enum classSize = __traits(classInstanceSize, T);
     void* mem = core.stdc.stdlib.malloc(classSize);
     assert(mem !is null, "Out of memory");
     core.memory.GC.addRange(mem[0 .. classSize]);
     scope(failure) {
        core.memory.GC.removeRange(mem[0 .. classSize]);
        core.stdc.stdlib.free(mem);
     }
     T inst = cast(T)mem;
     inst.emplace(__traits(parameters));
     return inst;
}
```

And of course, a `dealloc` that removes the range should also be added.

-Steve

Jul 28 2023

IchorDev <zxinsworld gmail.com> writes:

On Friday, 28 July 2023 at 21:22:01 UTC, Steven Schveighoffer 
wrote:
 ```d
 T alloc(T, A...)(auto ref A args){
     enum classSize = __traits(classInstanceSize, T);
     void* mem = core.stdc.stdlib.malloc(classSize);
     assert(mem !is null, "Out of memory");
     core.memory.GC.addRange(mem[0 .. classSize]);
     scope(failure) {
        core.memory.GC.removeRange(mem[0 .. classSize]);
        core.stdc.stdlib.free(mem);
     }
     T inst = cast(T)mem;
     inst.emplace(__traits(parameters));
     return inst;
 }
 ```

 And of course, a `dealloc` that removes the range should also 
 be added.

 -Steve

This also works *unless* I do a `foreach` over the AA's keys & 
values (`foreach(k,v; aa)`), in which case it still segfaults at 
random. Deconstructing the foreach into a for works though, for 
whatever reason.

Aug 09 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 8/10/23 1:59 AM, IchorDev wrote:

 
 This also works *unless* I do a `foreach` over the AA's keys & values 
 (`foreach(k,v; aa)`), in which case it still segfaults at random. 
 Deconstructing the foreach into a for works though, for whatever reason.

That shouldn't matter.

The key thing is, if you are going to use malloc, you need to register 
the memory with the GC. The GC doesn't scan memory it doesn't know about.

But my main point was the class itself shouldn't be registering itself 
-- it did not make the decision to malloc. You should register where you 
allocate.

I also highly recommend using `emplace` to handle all the sticky issues 
with lifetime/construction.

-Steve

Aug 10 2023

IchorDev <zxinsworld gmail.com> writes:

On Thursday, 10 August 2023 at 15:20:28 UTC, Steven Schveighoffer 
wrote:
 That shouldn't matter.

Well, it does here. The AA is mutated during the loop, so perhaps 
this is an optimisation quirk where it works with `for` but 
segfaults in `foreach`? I've pretty thoroughly abused the `for` 
version and I haven't gotten it to segfault yet.

 I also highly recommend using `emplace` to handle all the 
 sticky issues with lifetime/construction.

Have not run into the aforementioned sticky issues yet, but I 
can't even find `emplace`'s docs anywhere now. I recall it being 
incompatible with classes that have  nogc/nothrow constructors 
though, which made it pretty useless to me, and it wouldn't work 
with BetterC, which was a requirement for the allocation wrapper 
I was writing at the time.

Aug 12 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 8/12/23 5:55 AM, IchorDev wrote:
 On Thursday, 10 August 2023 at 15:20:28 UTC, Steven Schveighoffer wrote:
 That shouldn't matter.

 
 Well, it does here. The AA is mutated during the loop, so perhaps this 
 is an optimisation quirk where it works with `for` but segfaults in 
 `foreach`? I've pretty thoroughly abused the `for` version and I haven't 
 gotten it to segfault yet.

oh yeah. That is not allowed. Any mutation of the AA during iteration 
can invalidate existing foreach or ranges over the AA.

Basically, a rehash can cause the buckets to jumble up, and in that 
case, the "current index" can be changed to point at a null bucket.

More here: https://dlang.org/spec/statement.html#foreach_restrictions

In fact, that statement is way too broad. Invalidation of iteration 
should be based on the type's requirements.

We really should put a note in the AA spec page.

 I also highly recommend using `emplace` to handle all the sticky 
 issues with lifetime/construction.

 
 Have not run into the aforementioned sticky issues yet, but I can't even 
 find `emplace`'s docs anywhere now.

https://dlang.org/phobos/core_lifetime.html#.emplace

 I recall it being incompatible with 
 classes that have  nogc/nothrow constructors though, which made it 
 pretty useless to me, and it wouldn't work with BetterC, which was a 
 requirement for the allocation wrapper I was writing at the time.

It probably won't work with betterC, but that's probably just because of 
linker errors.

Any attribute requirements would be inferred based on the attributes of 
your constructor, because emplace is a template.

-Steve

Aug 15 2023

IchorDev <zxinsworld gmail.com> writes:

On Tuesday, 15 August 2023 at 17:36:13 UTC, Steven Schveighoffer 
wrote:
 On 8/12/23 5:55 AM, IchorDev wrote:
 On Thursday, 10 August 2023 at 15:20:28 UTC, Steven 
 Schveighoffer wrote:
 That shouldn't matter.

 
 Well, it does here. The AA is mutated during the loop, so 
 perhaps this is an optimisation quirk where it works with 
 `for` but segfaults in `foreach`? I've pretty thoroughly 
 abused the `for` version and I haven't gotten it to segfault 
 yet.

 oh yeah. That is not allowed. Any mutation of the AA during 
 iteration can invalidate existing foreach or ranges over the AA.

And yet this apparently doesn’t apply to an equivalent `for` 
somehow. Perhaps `foreach` shouldn’t have this arbitrary problem 
in the first place…

 In fact, that statement is way too broad. Invalidation of 
 iteration should be based on the type's requirements.

 We really should put a note in the AA spec page.

Probably yeah.

 https://dlang.org/phobos/core_lifetime.html#.emplace

 Any attribute requirements would be inferred based on the 
 attributes of your constructor, because emplace is a template.

 -Steve

Well the docs don’t say that at all, and the code is an 
unreadable mess. I dunno how anymore is meant to figure that out?

Aug 15 2023

Kagamin <spam here.lot> writes:

Your error is using allocating the object with malloc. Since gc 
doesn't see your AA, the AA is freed and you get UAF.

Jul 28 2023

D Programming

C/C++ Programming

Other

digitalmars.D.learn - AA vs __gshared