www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Race conditions in critical.c?

reply Mike Hearn <mike plan99.net> writes:
I was reading the code to the phobos runtime library and saw the following code
in critical.c:

void _d_criticalenter(D_CRITICAL_SECTION *dcs)
{
    if (!dcs->next)
    {
        EnterCriticalSection(&critical_section.cs);
        if (!dcs->next) // if, in the meantime, another thread didn't set it
        {
            dcs->next = dcs_list;
                 <----   
            dcs_list = dcs;      
            InitializeCriticalSection(&dcs->cs);
        }
        LeaveCriticalSection(&critical_section.cs);
    }
    EnterCriticalSection(&dcs->cs);
}

There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.

The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.

It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?

thanks -mike
Sep 20 2008
next sibling parent Janderson <ask me.com> writes:
Digited wrote:
 thunderbird "rules"... every post option does the same thing... sorry. :(
You can cancel/remove your own messages in Thunderbird if you wish :) -Joel
Sep 21 2008
prev sibling parent reply Mike Hearn <mike plan99.net> writes:
Mike Hearn Wrote:

 I was reading the code to the phobos runtime library and saw the following
code in critical.c:
 
 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
     if (!dcs->next)
     {
         EnterCriticalSection(&critical_section.cs);
         if (!dcs->next) // if, in the meantime, another thread didn't set it
         {
             dcs->next = dcs_list;
                  <----   
             dcs_list = dcs;      
             InitializeCriticalSection(&dcs->cs);
         }
         LeaveCriticalSection(&critical_section.cs);
     }
     EnterCriticalSection(&dcs->cs);
 }
 
 There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.
 
 The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.
 
 It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?
 
 thanks -mike
Sep 25 2008
parent reply Mike Hearn <mike plan99.net> writes:
Sorry, hit return too early. So is anybody going to take a look at this? I just
noticed there is a Bugzilla, should I file a bug? 

Mike Hearn  Wrote:

 Mike Hearn Wrote:
 
 I was reading the code to the phobos runtime library and saw the following
code in critical.c:
 
 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
     if (!dcs->next)
     {
         EnterCriticalSection(&critical_section.cs);
         if (!dcs->next) // if, in the meantime, another thread didn't set it
         {
             dcs->next = dcs_list;
                  <----   
             dcs_list = dcs;      
             InitializeCriticalSection(&dcs->cs);
         }
         LeaveCriticalSection(&critical_section.cs);
     }
     EnterCriticalSection(&dcs->cs);
 }
 
 There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.
 
 The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.
 
 It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?
 
 thanks -mike
Sep 25 2008
next sibling parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-09-25 23:23:10 +0200, Mike Hearn  <mike plan99.net> said:

 Sorry, hit return too early. So is anybody going to take a look at 
 this? I just noticed there is a Bugzilla, should I file a bug?
You are right dcs->next should be set only after the initialization is performed. do submit a bug on bugzilla http://d.puremagic.com/issues/ Well spotted! Fawzi
 
 Mike Hearn  Wrote:
 
 Mike Hearn Wrote:
 
 I was reading the code to the phobos runtime library and saw the 
 following code in critical.c:
 
 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
 if (!dcs->next)
 {
 EnterCriticalSection(&critical_section.cs);
 if (!dcs->next) // if, in the meantime, another thread didn't set it
 {
 dcs->next = dcs_list;
 <----
 dcs_list = dcs;
 InitializeCriticalSection(&dcs->cs);
 }
 LeaveCriticalSection(&critical_section.cs);
 }
 EnterCriticalSection(&dcs->cs);
 }
 
 There's a race condition here - consider a thread that is suspended at 
 the marked point. Another thread then attempts to lock the critsec. It 
 sees that dcs->next is set and calls EnterCriticalSection on an 
 uninitialized CRITICAL_SECTION.
 
 The Linux code has the same problem, and then another one as well - the 
 initialization of the "critsec lock" is itself lazy and is not 
 protected by anything at all. The result is potentially multiple 
 initializations of that lock, and because atexit does not coalesce 
 multiple registrations, a potentially catastrophic multiple invocation 
 of _STD_critical_term.
 
 It appears this code (for both platforms) needs to be scrapped and 
 rewritten. I'd start by checking that lazy initialization of critsecs 
 is actually needed - do D compilers really generate tons of potentially 
 unnecessary D_CRITICAL_SECTIONs?
 
 thanks -mike
Sep 25 2008
prev sibling parent reply Bartosz Milewski <bartosz relisoft.com> writes:
When I was rewriting monitor.c in D, I noticed the race in critical.c 
but decided not to work on it. Instead I've been trying to convince 
Walter to scrap critical.c altogether. Here's why:

How do you interpret the following fragment of code? (It's a simplified 
version of actual code written by a D programmer.)

class Foo {
    int i, j;
    void a() {
       synchronized {
          int x = ++i;
          j = x;
       }
    }
    void b() {
       synchronized {
          int x = --i;
          j = x;
       }
    }
}

Does it guarantee that i==j? If you say yes, you subscribe to one of the 
schools of thought that argue that "synchronized" in this context either 
means synchronized(this) or synchronized(this.classinfo), or 
synchronized(SomeHiddenGlobalSingleton).

Wrong! The code in critical.c creates a separate new critical section 
for every synchronized block of code. So a() and b() use _different_ 
locks and their execution may interleave.

If you think this is bizarre bordering on reckless I'm with you.

Mike Hearn wrote:
 I was reading the code to the phobos runtime library and saw the following
code in critical.c:

 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
     if (!dcs->next)
     {
         EnterCriticalSection(&critical_section.cs);
         if (!dcs->next) // if, in the meantime, another thread didn't set it
         {
             dcs->next = dcs_list;
                  <----   
             dcs_list = dcs;      
             InitializeCriticalSection(&dcs->cs);
         }
         LeaveCriticalSection(&critical_section.cs);
     }
     EnterCriticalSection(&dcs->cs);
 }

 There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.

 The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.

 It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?

 thanks -mike
Sep 30 2008
parent reply Sean Kelly <sean invisibleduck.org> writes:
Bartosz Milewski wrote:
 
 If you think this is bizarre bordering on reckless I'm with you.
Yup, it's completely useless given that POSIX only provides guarantees for visibility, etc, for threads entering the *same* mutex. So one couldn't even argue that this approach would work as some strange sort of memory barrier. Sean
Sep 30 2008
parent reply Sean Kelly <sean invisibleduck.org> writes:
Sean Kelly wrote:
 Bartosz Milewski wrote:
 If you think this is bizarre bordering on reckless I'm with you.
Yup, it's completely useless given that POSIX only provides guarantees for visibility, etc, for threads entering the *same* mutex. So one couldn't even argue that this approach would work as some strange sort of memory barrier.
Oh, and I haven't fixed critical.c for the same reason. Given how they work, does *anyone* use this feature? Sean
Sep 30 2008
parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-10-01 06:49:11 +0200, Sean Kelly <sean invisibleduck.org> said:

 Sean Kelly wrote:
 Bartosz Milewski wrote:
 
 If you think this is bizarre bordering on reckless I'm with you.
Yup, it's completely useless given that POSIX only provides guarantees for visibility, etc, for threads entering the *same* mutex. So one couldn't even argue that this approach would work as some strange sort of memory barrier.
Oh, and I haven't fixed critical.c for the same reason. Given how they work, does *anyone* use this feature? Sean
I think that it should either throw an exception "unsupported" or be corrected. it is possible to use it correctly if you have a function that is called in parallel with only one critical section, or when the whole function is synchronized. I checked my code and I prefer explicit locks, so indeed I haven't used, but both tango and phobos have some synchronized statements. Fawzi
Oct 01 2008