digitalmars.D - More on Multithreading Performance
- dsimcha (47/47) Dec 16 2009
Our multithreading performance problems can probably be mitigated, at least on Windows, by using InitializeCriticalSectionAndSpinCount instead of InitializeCriticalSection to implement synchronized blocks. According to http://msdn.microsoft.com/en-us/library/ms683476%28VS.85%29.aspx this causes the waiting thread to spin a specified number of times before being context switched, but only on multiprocessor computers.

This seems like a no-brainer for the GC lock. Having a small amount of spinning before the context switch also seems like a pretty good default for synchronized blocks in general. People who really want to customize things like this will use something more customizable than a plain old synchronized block.

Here's a test program that measures the speed-up.

import core.thread, std.stdio, std.perf, core.sys.windows.windows,
    std.conv, std.string;

// Declared by hand for this test.
extern(Windows) BOOL InitializeCriticalSectionAndSpinCount(CRITICAL_SECTION*, DWORD);

enum nThreads = 2;
__gshared int num = 0;
__gshared CRITICAL_SECTION lock;

void main(string[] args) {
    stderr.writeln("Give me a spin count.");
    int spinCount = to!int( readln().strip() );
    InitializeCriticalSectionAndSpinCount(&lock, spinCount);

    // Time how long all threads take to finish.
    auto pc = new PerformanceCounter;
    pc.start;

    auto threads = new Thread[nThreads];
    for(int i = 0; i < nThreads; i++) {
        threads[i] = new Thread(&doStuff);
        threads[i].start();
    }

    foreach(thread; threads) {
        thread.join();
    }

    pc.stop;
    writeln(pc.milliseconds);
}

// Each thread does nothing but take and release the contended lock.
void doStuff() {
    for(int i = 0; i < 10_000_000; i++) {
        EnterCriticalSection(&lock);
        LeaveCriticalSection(&lock);
    }
}

Results:

spin count = 0:     3843 ms
spin count = 4000:  2095 ms

core.sync.Mutex doesn't use this feature. Neither do synchronized blocks. Based on looking at the source files for these, it seems trivial to start using it. Anyone see a good reason not to, or should I Bugzilla/patch this one?
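
For anyone who wants to see the spin-then-block idea without the Windows API, here is a rough portable sketch. It is only an approximation: lockWithSpin and countWithSpin are hypothetical helpers, not part of druntime, and the spinning is done in user code by polling Mutex.tryLock, whereas InitializeCriticalSectionAndSpinCount has the critical section itself do the spinning. A real spin loop would probably also want a CPU pause hint between attempts, which is left out here.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical helper, not part of druntime: poll the mutex up to
// spinCount times, then fall back to an ordinary blocking lock().
void lockWithSpin(Mutex m, uint spinCount)
{
    foreach (i; 0 .. spinCount)
    {
        if (m.tryLock())
            return;     // acquired while spinning, no blocking wait
    }
    m.lock();           // still contended: block as usual
}

// Hypothetical driver: nThreads threads each increment a shared
// counter iters times under the lock; the final count is returned.
int countWithSpin(uint spinCount, int nThreads, int iters)
{
    auto m = new Mutex;
    int counter = 0;

    auto threads = new Thread[nThreads];
    foreach (ref t; threads)
    {
        t = new Thread({
            foreach (i; 0 .. iters)
            {
                lockWithSpin(m, spinCount);
                counter++;
                m.unlock();
            }
        });
        t.start();
    }
    foreach (t; threads)
        t.join();

    return counter;
}
```

A spin count of 0 degenerates to a plain lock(), so calling countWithSpin with 0 and with 4000 and timing both would be one way to repeat the comparison above on a non-Windows box. No timings are claimed for this version.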
Dec 16 2009