digitalmars.D - Race conditions in critical.c?

Mike Hearn (21/21) Sep 20 2008 I was reading the code to the phobos runtime library and saw the followi...

Janderson (3/4) Sep 21 2008 You can cancel/remove your own messages in Thunderbird if you wish :)
Mike Hearn (1/27) Sep 25 2008

Mike Hearn (2/31) Sep 25 2008

Fawzi Mohamed (7/52) Sep 25 2008 You are right dcs->next should be set only after the initialization is
Bartosz Milewski (29/57) Sep 30 2008 When I was rewriting monitor.c in D, I noticed the race in critical.c

Sean Kelly (6/8) Sep 30 2008 Yup, it's completely useless given that POSIX only provides guarantees

Sean Kelly (4/12) Sep 30 2008 Oh, and I haven't fixed critical.c for the same reason. Given how they

Fawzi Mohamed (8/23) Oct 01 2008 I think that it should either throw an exception "unsupported" or be cor...

Mike Hearn <mike plan99.net> writes:

I was reading the code to the phobos runtime library and saw the following code
in critical.c:

void _d_criticalenter(D_CRITICAL_SECTION *dcs)
{
    if (!dcs->next)
    {
        EnterCriticalSection(&critical_section.cs);
        if (!dcs->next) // if, in the meantime, another thread didn't set it
        {
            dcs->next = dcs_list;
                 <----   
            dcs_list = dcs;      
            InitializeCriticalSection(&dcs->cs);
        }
        LeaveCriticalSection(&critical_section.cs);
    }
    EnterCriticalSection(&dcs->cs);
}

There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.

The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.

It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?

thanks -mike

Sep 20 2008

Janderson <ask me.com> writes:

Digited wrote:
 thunderbird "rules"... every post option does the same thing... sorry. :(

You can cancel/remove your own messages in Thunderbird if you wish :)

-Joel

Sep 21 2008

Mike Hearn <mike plan99.net> writes:

Mike Hearn Wrote:

 I was reading the code to the phobos runtime library and saw the following
code in critical.c:
 
 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
     if (!dcs->next)
     {
         EnterCriticalSection(&critical_section.cs);
         if (!dcs->next) // if, in the meantime, another thread didn't set it
         {
             dcs->next = dcs_list;
                  <----   
             dcs_list = dcs;      
             InitializeCriticalSection(&dcs->cs);
         }
         LeaveCriticalSection(&critical_section.cs);
     }
     EnterCriticalSection(&dcs->cs);
 }
 
 There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.
 
 The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.
 
 It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?
 
 thanks -mike

Sep 25 2008

Mike Hearn <mike plan99.net> writes:

Sorry, hit return too early. So is anybody going to take a look at this? I just
noticed there is a Bugzilla, should I file a bug? 

Mike Hearn  Wrote:

 Mike Hearn Wrote:
 
 I was reading the code to the phobos runtime library and saw the following
code in critical.c:
 
 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
     if (!dcs->next)
     {
         EnterCriticalSection(&critical_section.cs);
         if (!dcs->next) // if, in the meantime, another thread didn't set it
         {
             dcs->next = dcs_list;
                  <----   
             dcs_list = dcs;      
             InitializeCriticalSection(&dcs->cs);
         }
         LeaveCriticalSection(&critical_section.cs);
     }
     EnterCriticalSection(&dcs->cs);
 }
 
 There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.
 
 The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.
 
 It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?
 
 thanks -mike

Sep 25 2008

Fawzi Mohamed <fmohamed mac.com> writes:

On 2008-09-25 23:23:10 +0200, Mike Hearn  <mike plan99.net> said:

 Sorry, hit return too early. So is anybody going to take a look at 
 this? I just noticed there is a Bugzilla, should I file a bug?

You are right dcs->next should be set only after the initialization is 
performed.
do submit a bug on bugzilla
	http://d.puremagic.com/issues/

Well spotted!

Fawzi
 
 Mike Hearn  Wrote:
 
 Mike Hearn Wrote:
 
 I was reading the code to the phobos runtime library and saw the 
 following code in critical.c:
 
 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
 if (!dcs->next)
 {
 EnterCriticalSection(&critical_section.cs);
 if (!dcs->next) // if, in the meantime, another thread didn't set it
 {
 dcs->next = dcs_list;
 <----
 dcs_list = dcs;
 InitializeCriticalSection(&dcs->cs);
 }
 LeaveCriticalSection(&critical_section.cs);
 }
 EnterCriticalSection(&dcs->cs);
 }
 
 There's a race condition here - consider a thread that is suspended at 
 the marked point. Another thread then attempts to lock the critsec. It 
 sees that dcs->next is set and calls EnterCriticalSection on an 
 uninitialized CRITICAL_SECTION.
 
 The Linux code has the same problem, and then another one as well - the 
 initialization of the "critsec lock" is itself lazy and is not 
 protected by anything at all. The result is potentially multiple 
 initializations of that lock, and because atexit does not coalesce 
 multiple registrations, a potentially catastrophic multiple invocation 
 of _STD_critical_term.
 
 It appears this code (for both platforms) needs to be scrapped and 
 rewritten. I'd start by checking that lazy initialization of critsecs 
 is actually needed - do D compilers really generate tons of potentially 
 unnecessary D_CRITICAL_SECTIONs?
 
 thanks -mike

Sep 25 2008

Bartosz Milewski <bartosz relisoft.com> writes:

When I was rewriting monitor.c in D, I noticed the race in critical.c 
but decided not to work on it. Instead I've been trying to convince 
Walter to scrap critical.c altogether. Here's why:

How do you interpret the following fragment of code? (It's a simplified 
version of actual code written by a D programmer.)

class Foo {
    int i, j;
    void a() {
       synchronized {
          int x = ++i;
          j = x;
       }
    }
    void b() {
       synchronized {
          int x = --i;
          j = x;
       }
    }
}

Does it guarantee that i==j? If you say yes, you subscribe to one of the 
schools of thought that argue that "synchronized" in this context either 
means synchronized(this) or synchronized(this.classinfo), or 
synchronized(SomeHiddenGlobalSingleton).

Wrong! The code in critical.c creates a separate new critical section 
for every synchronized block of code. So a() and b() use _different_ 
locks and their execution may interleave.

If you think this is bizarre bordering on reckless I'm with you.

Mike Hearn wrote:
 I was reading the code to the phobos runtime library and saw the following
code in critical.c:

 void _d_criticalenter(D_CRITICAL_SECTION *dcs)
 {
     if (!dcs->next)
     {
         EnterCriticalSection(&critical_section.cs);
         if (!dcs->next) // if, in the meantime, another thread didn't set it
         {
             dcs->next = dcs_list;
                  <----   
             dcs_list = dcs;      
             InitializeCriticalSection(&dcs->cs);
         }
         LeaveCriticalSection(&critical_section.cs);
     }
     EnterCriticalSection(&dcs->cs);
 }

 There's a race condition here - consider a thread that is suspended at the
marked point. Another thread then attempts to lock the critsec. It sees that
dcs->next is set and calls EnterCriticalSection on an uninitialized
CRITICAL_SECTION.

 The Linux code has the same problem, and then another one as well - the
initialization of the "critsec lock" is itself lazy and is not protected by
anything at all. The result is potentially multiple initializations of that
lock, and because atexit does not coalesce multiple registrations, a
potentially catastrophic multiple invocation of _STD_critical_term.

 It appears this code (for both platforms) needs to be scrapped and rewritten.
I'd start by checking that lazy initialization of critsecs is actually needed -
do D compilers really generate tons of potentially unnecessary
D_CRITICAL_SECTIONs?

 thanks -mike

Sep 30 2008

Sean Kelly <sean invisibleduck.org> writes:

Bartosz Milewski wrote:
 
 If you think this is bizarre bordering on reckless I'm with you.

Yup, it's completely useless given that POSIX only provides guarantees 
for visibility, etc, for threads entering the *same* mutex.  So one 
couldn't even argue that this approach would work as some strange sort 
of memory barrier.


Sean

Sep 30 2008

Sean Kelly <sean invisibleduck.org> writes:

Sean Kelly wrote:
 Bartosz Milewski wrote:
 If you think this is bizarre bordering on reckless I'm with you.

 
 Yup, it's completely useless given that POSIX only provides guarantees 
 for visibility, etc, for threads entering the *same* mutex.  So one 
 couldn't even argue that this approach would work as some strange sort 
 of memory barrier.

Oh, and I haven't fixed critical.c for the same reason.  Given how they 
work, does *anyone* use this feature?


Sean

Sep 30 2008

Fawzi Mohamed <fmohamed mac.com> writes:

On 2008-10-01 06:49:11 +0200, Sean Kelly <sean invisibleduck.org> said:

 Sean Kelly wrote:
 Bartosz Milewski wrote:
 
 If you think this is bizarre bordering on reckless I'm with you.

 
 Yup, it's completely useless given that POSIX only provides guarantees 
 for visibility, etc, for threads entering the *same* mutex.  So one 
 couldn't even argue that this approach would work as some strange sort 
 of memory barrier.

 
 Oh, and I haven't fixed critical.c for the same reason.  Given how they 
 work, does *anyone* use this feature?
 
 
 Sean

I think that it should either throw an exception "unsupported" or be corrected.

it is possible to use it correctly if you have a function that is 
called in parallel with only one critical section, or when the whole 
function is synchronized.

I checked my code and I prefer explicit locks, so indeed I haven't 
used, but both tango and phobos have some synchronized statements.

Fawzi

Oct 01 2008

D Programming

C/C++ Programming

Other

digitalmars.D - Race conditions in critical.c?