digitalmars.D - Race conditions in critical.c?
- Mike Hearn (21/21) Sep 20 2008 I was reading the code to the phobos runtime library and saw the followi...
- Janderson (3/4) Sep 21 2008 You can cancel/remove your own messages in Thunderbird if you wish :)
- Mike Hearn (1/27) Sep 25 2008
- Mike Hearn (2/31) Sep 25 2008
- Fawzi Mohamed (7/52) Sep 25 2008 You are right dcs->next should be set only after the initialization is
- Bartosz Milewski (29/57) Sep 30 2008 When I was rewriting monitor.c in D, I noticed the race in critical.c
- Sean Kelly (6/8) Sep 30 2008 Yup, it's completely useless given that POSIX only provides guarantees
- Sean Kelly (4/12) Sep 30 2008 Oh, and I haven't fixed critical.c for the same reason. Given how they
- Fawzi Mohamed (8/23) Oct 01 2008 I think that it should either throw an exception "unsupported" or be cor...
I was reading the code to the phobos runtime library and saw the following code in critical.c: void _d_criticalenter(D_CRITICAL_SECTION *dcs) { if (!dcs->next) { EnterCriticalSection(&critical_section.cs); if (!dcs->next) // if, in the meantime, another thread didn't set it { dcs->next = dcs_list; <---- dcs_list = dcs; InitializeCriticalSection(&dcs->cs); } LeaveCriticalSection(&critical_section.cs); } EnterCriticalSection(&dcs->cs); } There's a race condition here - consider a thread that is suspended at the marked point. Another thread then attempts to lock the critsec. It sees that dcs->next is set and calls EnterCriticalSection on an uninitialized CRITICAL_SECTION. The Linux code has the same problem, and then another one as well - the initialization of the "critsec lock" is itself lazy and is not protected by anything at all. The result is potentially multiple initializations of that lock, and because atexit does not coalesce multiple registrations, a potentially catastrophic multiple invocation of _STD_critical_term. It appears this code (for both platforms) needs to be scrapped and rewritten. I'd start by checking that lazy initialization of critsecs is actually needed - do D compilers really generate tons of potentially unnecessary D_CRITICAL_SECTIONs? thanks -mike
Sep 20 2008
Digited wrote:thunderbird "rules"... every post option does the same thing... sorry. :(You can cancel/remove your own messages in Thunderbird if you wish :) -Joel
Sep 21 2008
Mike Hearn Wrote:I was reading the code to the phobos runtime library and saw the following code in critical.c: void _d_criticalenter(D_CRITICAL_SECTION *dcs) { if (!dcs->next) { EnterCriticalSection(&critical_section.cs); if (!dcs->next) // if, in the meantime, another thread didn't set it { dcs->next = dcs_list; <---- dcs_list = dcs; InitializeCriticalSection(&dcs->cs); } LeaveCriticalSection(&critical_section.cs); } EnterCriticalSection(&dcs->cs); } There's a race condition here - consider a thread that is suspended at the marked point. Another thread then attempts to lock the critsec. It sees that dcs->next is set and calls EnterCriticalSection on an uninitialized CRITICAL_SECTION. The Linux code has the same problem, and then another one as well - the initialization of the "critsec lock" is itself lazy and is not protected by anything at all. The result is potentially multiple initializations of that lock, and because atexit does not coalesce multiple registrations, a potentially catastrophic multiple invocation of _STD_critical_term. It appears this code (for both platforms) needs to be scrapped and rewritten. I'd start by checking that lazy initialization of critsecs is actually needed - do D compilers really generate tons of potentially unnecessary D_CRITICAL_SECTIONs? thanks -mike
Sep 25 2008
Sorry, hit return too early. So is anybody going to take a look at this? I just noticed there is a Bugzilla, should I file a bug? Mike Hearn Wrote:Mike Hearn Wrote:I was reading the code to the phobos runtime library and saw the following code in critical.c: void _d_criticalenter(D_CRITICAL_SECTION *dcs) { if (!dcs->next) { EnterCriticalSection(&critical_section.cs); if (!dcs->next) // if, in the meantime, another thread didn't set it { dcs->next = dcs_list; <---- dcs_list = dcs; InitializeCriticalSection(&dcs->cs); } LeaveCriticalSection(&critical_section.cs); } EnterCriticalSection(&dcs->cs); } There's a race condition here - consider a thread that is suspended at the marked point. Another thread then attempts to lock the critsec. It sees that dcs->next is set and calls EnterCriticalSection on an uninitialized CRITICAL_SECTION. The Linux code has the same problem, and then another one as well - the initialization of the "critsec lock" is itself lazy and is not protected by anything at all. The result is potentially multiple initializations of that lock, and because atexit does not coalesce multiple registrations, a potentially catastrophic multiple invocation of _STD_critical_term. It appears this code (for both platforms) needs to be scrapped and rewritten. I'd start by checking that lazy initialization of critsecs is actually needed - do D compilers really generate tons of potentially unnecessary D_CRITICAL_SECTIONs? thanks -mike
Sep 25 2008
On 2008-09-25 23:23:10 +0200, Mike Hearn <mike plan99.net> said:Sorry, hit return too early. So is anybody going to take a look at this? I just noticed there is a Bugzilla, should I file a bug?You are right dcs->next should be set only after the initialization is performed. do submit a bug on bugzilla http://d.puremagic.com/issues/ Well spotted! FawziMike Hearn Wrote:Mike Hearn Wrote:I was reading the code to the phobos runtime library and saw the following code in critical.c: void _d_criticalenter(D_CRITICAL_SECTION *dcs) { if (!dcs->next) { EnterCriticalSection(&critical_section.cs); if (!dcs->next) // if, in the meantime, another thread didn't set it { dcs->next = dcs_list; <---- dcs_list = dcs; InitializeCriticalSection(&dcs->cs); } LeaveCriticalSection(&critical_section.cs); } EnterCriticalSection(&dcs->cs); } There's a race condition here - consider a thread that is suspended at the marked point. Another thread then attempts to lock the critsec. It sees that dcs->next is set and calls EnterCriticalSection on an uninitialized CRITICAL_SECTION. The Linux code has the same problem, and then another one as well - the initialization of the "critsec lock" is itself lazy and is not protected by anything at all. The result is potentially multiple initializations of that lock, and because atexit does not coalesce multiple registrations, a potentially catastrophic multiple invocation of _STD_critical_term. It appears this code (for both platforms) needs to be scrapped and rewritten. I'd start by checking that lazy initialization of critsecs is actually needed - do D compilers really generate tons of potentially unnecessary D_CRITICAL_SECTIONs? thanks -mike
Sep 25 2008
When I was rewriting monitor.c in D, I noticed the race in critical.c but decided not to work on it. Instead I've been trying to convince Walter to scrap critical.c altogether. Here's why: How do you interpret the following fragment of code? (It's a simplified version of actual code written by a D programmer.) class Foo { int i, j; void a() { synchronized { int x = ++i; j = x; } } void b() { synchronized { int x = --i; j = x; } } } Does it guarantee that i==j? If you say yes, you subscribe to one of the schools of thought that argue that "synchronized" in this context either means synchronized(this) or synchronized(this.classinfo), or synchronized(SomeHiddenGlobalSingleton). Wrong! The code in critical.c creates a separate new critical section for every synchronized block of code. So a() and b() use _different_ locks and their execution may interleave. If you think this is bizarre bordering on reckless I'm with you. Mike Hearn wrote:I was reading the code to the phobos runtime library and saw the following code in critical.c: void _d_criticalenter(D_CRITICAL_SECTION *dcs) { if (!dcs->next) { EnterCriticalSection(&critical_section.cs); if (!dcs->next) // if, in the meantime, another thread didn't set it { dcs->next = dcs_list; <---- dcs_list = dcs; InitializeCriticalSection(&dcs->cs); } LeaveCriticalSection(&critical_section.cs); } EnterCriticalSection(&dcs->cs); } There's a race condition here - consider a thread that is suspended at the marked point. Another thread then attempts to lock the critsec. It sees that dcs->next is set and calls EnterCriticalSection on an uninitialized CRITICAL_SECTION. The Linux code has the same problem, and then another one as well - the initialization of the "critsec lock" is itself lazy and is not protected by anything at all. The result is potentially multiple initializations of that lock, and because atexit does not coalesce multiple registrations, a potentially catastrophic multiple invocation of _STD_critical_term. It appears this code (for both platforms) needs to be scrapped and rewritten. I'd start by checking that lazy initialization of critsecs is actually needed - do D compilers really generate tons of potentially unnecessary D_CRITICAL_SECTIONs? thanks -mike
Sep 30 2008
Bartosz Milewski wrote:If you think this is bizarre bordering on reckless I'm with you.Yup, it's completely useless given that POSIX only provides guarantees for visibility, etc, for threads entering the *same* mutex. So one couldn't even argue that this approach would work as some strange sort of memory barrier. Sean
Sep 30 2008
Sean Kelly wrote:Bartosz Milewski wrote:Oh, and I haven't fixed critical.c for the same reason. Given how they work, does *anyone* use this feature? SeanIf you think this is bizarre bordering on reckless I'm with you.Yup, it's completely useless given that POSIX only provides guarantees for visibility, etc, for threads entering the *same* mutex. So one couldn't even argue that this approach would work as some strange sort of memory barrier.
Sep 30 2008
On 2008-10-01 06:49:11 +0200, Sean Kelly <sean invisibleduck.org> said:Sean Kelly wrote:I think that it should either throw an exception "unsupported" or be corrected. it is possible to use it correctly if you have a function that is called in parallel with only one critical section, or when the whole function is synchronized. I checked my code and I prefer explicit locks, so indeed I haven't used, but both tango and phobos have some synchronized statements. FawziBartosz Milewski wrote:Oh, and I haven't fixed critical.c for the same reason. Given how they work, does *anyone* use this feature? SeanIf you think this is bizarre bordering on reckless I'm with you.Yup, it's completely useless given that POSIX only provides guarantees for visibility, etc, for threads entering the *same* mutex. So one couldn't even argue that this approach would work as some strange sort of memory barrier.
Oct 01 2008