digitalmars.D - Bogus thread termination under linux when blocked on a socket receive
- Kris (11/11) Jul 17 2004 I have a SocketListener class that accepts a socket during construction,...
- Regan Heath (8/21) Jul 17 2004 I believe I/we ran into this where I work once. I have sent a message to...
- Kris (6/30) Jul 17 2004 Thanks Regan ...
- Regan Heath (9/40) Jul 18 2004 Unfortunately.. either we've forgotten we had this problem, or we didn't...
- Kris (6/47) Jul 18 2004 Bummer; it's a SEGV. Thanks for trying mate ...
- Regan Heath (13/68) Jul 18 2004 The simplest answer would be that the main thread is closing/releasing t...
- John Reimer (7/20) Jul 18 2004 Yes, please have a look at it! I'm kind of stumped myself about this
- Kris (2/3) Jul 18 2004 You and I have both expended a lot of effort on this :-)
- Ben Hinkle (10/11) Jul 18 2004 no - Mike Swieton and I have been battling this problem with the
- Kris (4/15) Jul 18 2004 Good idea: might at least show some alternate behaviour. Thanks Ben.
- Kris (32/103) Jul 18 2004 Tried various combinations of that Regan; the thread watches for termina...
- teqDruid (3/5) Jul 19 2004 I'm sure you can find someone with a linux server that's willing to give
- Kris (4/9) Jul 21 2004 I somehow missed your post Druid. Sorry, and thank you for the suggestio...
- Regan Heath (6/17) Jul 22 2004 I got me a linux box too.. kinda.. VMware is really cool!
- Regan Heath (27/47) Jul 22 2004 Ok.. trying to replicate the working case on my VMware linux machine I g...
I have a SocketListener class that accepts a socket during construction, and fires up a thread to read said socket and dispatch to a user-supplied notification routine. The socket is of the datagram variety. Everything works fine until program termination, where the blocked thread (waiting in a socket-receive function) segfaults instead of returning.

Here's the kicker: if, just after starting the thread, I pause() and then resume() it, everything works perfectly during termination also. That is, the socket-receive unblocks and returns as expected, causing the thread to shut down gracefully.

There are no such woes under Win32 ... Help! Any ideas?
Jul 17 2004
On Sat, 17 Jul 2004 22:27:39 -0700, Kris <someidiot earthlink.dot.dot.dot.net> wrote:
> I have a SocketListener class that accepts a socket during construction, and fires up a thread to read said socket and dispatch to a user-supplied notification routine. The socket is of the datagram variety. Everything works fine until program termination, where the blocked thread (waiting in a socket-receive function) segfaults instead of returning.
>
> Here's the kicker: if, just after starting the thread, I pause() and then resume() it, everything works perfectly during termination also. That is, the socket-receive unblocks and returns as expected, causing the thread to shut down gracefully.
>
> There are no such woes under Win32 ... Help! Any ideas?

I believe I/we ran into this where I work once. I have sent a message to the guys involved; hopefully someone remembers why it occurs and how we solved it.

Regan

-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 17 2004
Thanks Regan ...

"Regan Heath" <regan netwin.co.nz> wrote in message news:opsbbj3vx55a2sq9 digitalmars.com...
> I believe I/we ran into this where I work once. I have sent a message to the guys involved, hopefully someone remembers why it occurs and how we solved it.
Jul 17 2004
Unfortunately.. either we've forgotten we had this problem, or we didn't have this exact problem.

Is it definitely a SIGSEGV, or could it be a SIGHUP you're getting during termination?

Regan

On Sat, 17 Jul 2004 23:22:25 -0700, Kris <someidiot earthlink.dot.dot.dot.net> wrote:
> Thanks Regan ...
Jul 18 2004
Bummer; it's a SEGV. Thanks for trying mate ...

- Kris

"Regan Heath" <regan netwin.co.nz> wrote in message news:opsbcuyfpv5a2sq9 digitalmars.com...
> Unfortunately.. either we've forgotten we had this problem, or we didn't have this exact problem.
>
> Is it definitely a SIGSEGV, or could it be a SIGHUP you're getting during termination?
Jul 18 2004
The simplest answer would be that the main thread is closing/releasing the socket before the thread was done with it, Occam's Razor and all that.

A more complex answer.. might it have something to do with who owns the socket handle? It's created in the main thread, and can either be inherited by the child thread, or duplicated, or.. depending on how the thread is created.

Does the main thread wait for its child threads to finish before exiting?

Is the code somewhere I can look at it?

Regan

On Sun, 18 Jul 2004 17:34:45 -0700, Kris <someidiot earthlink.dot.dot.dot.net> wrote:
> Bummer; it's a SEGV. Thanks for trying mate ...
>
> - Kris
Jul 18 2004
On Mon, 19 Jul 2004 13:21:13 +1200, Regan Heath wrote:
> The simplest answer would be that the main thread is closing/releasing the socket before the thread was done with it, Occam's Razor and all that.
>
> A more complex answer.. might it have something to do with who owns the socket handle? It's created in the main thread, and can either be inherited by the child thread, or duplicated, or.. depending on how the thread is created.
>
> Does the main thread wait for its child threads to finish before exiting?
>
> Is the code somewhere I can look at it?
>
> Regan

Yes, please have a look at it! I'm kind of stumped myself about this problem. Go to http://www.dsource.org/projects/mango/ for the zip downloads. I'm running Gentoo Linux, and my limited knowledge of threads on linux hasn't helped ferret out the problem.

(Sorry, Kris, couldn't resist jumping in on this one).
Jul 18 2004
"John Reimer" wrote in message
> (Sorry, Kris, couldn't resist jumping in on this one).

You and I have both expended a lot of effort on this :-)
Jul 18 2004
> Does the main thread wait for its child threads to finish before exiting?

no - Mike Swieton and I have been battling this problem with the concurrent library. I think Mike suggested one make sure the last part of your main gets the list of threads and waits for them to finish explicitly. Something like

    import std.thread;

    int main(char[][] args)
    {
        ...
        // spin until main is the only thread left
        while (Thread.nthreads > 1)
            Thread.yield();
        return 0;
    }
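The explicit wait Ben describes can also be written by joining each worker instead of polling a thread count. The following is only a sketch against current druntime (`core.thread`), not the 2004 `std.thread` API used above; `runAndWait` and the `finished` counter are hypothetical names invented for the illustration.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

/// Starts `count` workers and blocks until every one of them has returned.
/// Returns how many workers actually ran (hypothetical helper, for illustration).
int runAndWait(int count)
{
    shared int finished; // only here to show that each worker ran

    Thread[] workers;
    foreach (i; 0 .. count)
    {
        auto t = new Thread({ atomicOp!"+="(finished, 1); });
        t.start();
        workers ~= t;
    }

    // Explicit wait: main does not continue until each worker is done.
    foreach (t; workers)
        t.join();

    return atomicLoad(finished);
}

int main()
{
    runAndWait(3);
    return 0;
}
```

Joining only helps if each worker can actually return; a worker blocked forever in a receive call would make `join()` block forever too.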
Jul 18 2004
Good idea: might at least show some alternate behaviour. Thanks Ben.

"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:189mf0dfl4rqvobet6gqdvfolk8kc0r7ti 4ax.com...
> no - Mike Swieton and I have been battling this problem with the concurrent library. I think Mike suggested one make sure the last part of your main gets the list of threads and waits for them to finish explicitly. Something like
>
>     import std.thread;
>
>     int main(char[][] args)
>     {
>         ...
>         while (Thread.nthreads > 1)
>             Thread.yield();
>         return 0;
>     }
Jul 18 2004
> The simplest answer would be that the main thread is closing/releasing the socket before the thread was done with it, Occam's Razor and all that.

Tried various combinations of that Regan; the thread watches for termination conditions and exceptions, but the segfault (apparently) happens within the lower level (sockets) library. As noted earlier; if the thread is paused() and then resumed() everything terminates perfectly. It's almost as though there's an early race condition or something and some socket state gets clobbered (I really need to get a linux box for Mango)

> A more complex answer.. might it have something to do with who owns the socket handle? It's created in the main thread, and can either be inherited by the child thread, or duplicated, or.. depending on how the thread is created.

The socket is created by the main thread, and handed to the child.

> Does the main thread wait for its child threads to finish before exiting?

No. The order of termination is determined by the D runtime. Experience indicates that the live D threads are paused during shutdown, and any live sockets are already interrupted by then. The reason main() does not wait is that the child threads are typically blocked on a read() by the OS. It is possible to keep track of all sockets used, but that can lead to all kinds of other problems (this is a library, not a managed application).

> Is the code somewhere I can look at it?

Absolutely. Pick up the v0.91 zipfiles from here: http://dsource.org/projects/mango/?sec=downloads

Even better, grab the very latest from SVN here: http://svn.dsource.org/svn/projects/mango/trunk/

The latest stuff in SVN has some version (LinuxTrace){} stuff in it to help follow the execution path. Take a look near bottom of unittest.d where there's a testMulticast() function ... Though the quantity of source might look intimidating, there's actually only a few classes involved with this issue (SocketListener, Thread, Socket).

If you wish, let's move onto email. I can walk you through a variety of "uncomment this, then this ..." scenarios to try out, plus some version() stuff to try.

For those just checking in: this is a linux only issue ~ works as designed running under Win32, which might be the problem :-)

- Kris
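One way around "main() cannot wait because the child is blocked in a read" is to give the receive call a timeout, so the thread wakes regularly and can notice a stop request. The sketch below is not Mango's SocketListener and is not offered as the fix for the segfault discussed here. `DatagramListener`, `stop` and `received` are hypothetical names, and the code assumes current Phobos (`std.socket`) and druntime.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;
import core.time : msecs;
import std.socket : InternetAddress, SocketOption, SocketOptionLevel, UdpSocket;
import std.stdio : writeln;

/// Hypothetical stand-in for a datagram listener (not Mango's class).
final class DatagramListener
{
    private UdpSocket socket;
    private Thread worker;
    private shared bool stopRequested;
    private shared int datagrams;

    this(UdpSocket socket)
    {
        this.socket = socket;
        // Assumption: a 100 ms receive timeout is acceptable, so receive()
        // returns periodically and the stop flag gets re-checked.
        socket.setOption(SocketOptionLevel.SOCKET, SocketOption.RCVTIMEO, 100.msecs);
        worker = new Thread(&run);
        worker.start();
    }

    private void run()
    {
        ubyte[1500] buffer;
        while (!atomicLoad(stopRequested))
        {
            auto n = socket.receive(buffer[]);
            if (n > 0)
                atomicOp!"+="(datagrams, 1); // a real listener would dispatch here
            // On timeout receive() reports an error; just loop and re-check the flag.
        }
    }

    int received() { return atomicLoad(datagrams); }

    /// Ask the worker to stop, wait for it, and only then close the socket.
    void stop()
    {
        atomicStore(stopRequested, true);
        worker.join();
        socket.close();
    }
}

void main()
{
    auto sock = new UdpSocket();
    sock.bind(new InternetAddress("127.0.0.1", InternetAddress.PORT_ANY));
    auto listener = new DatagramListener(sock);

    auto sender = new UdpSocket();
    sender.sendTo("ping", sock.localAddress);
    Thread.sleep(300.msecs);

    listener.stop();
    sender.close();
    writeln("datagrams received: ", listener.received);
}
```

Closing the socket only after `join()` returns also rules out the "simplest answer" quoted above, where the descriptor is released while the reader thread is still using it.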
Jul 18 2004
On Sun, 18 Jul 2004 19:06:28 -0700, Kris wrote:
> some socket state gets clobbered (I really need to get a linux box for Mango)

I'm sure you can find someone with a linux server that's willing to give you a shell account. I know a coupla guys you might try to contact.
Jul 19 2004
I somehow missed your post Druid. Sorry, and thank you for the suggestion. I was given one in the meantime (for doing remote sanity checks).

"teqDruid" <me teqdruid.com> wrote in message news:pan.2004.07.19.09.09.26.816995 teqdruid.com...
> I'm sure you can find someone with a linux server that's willing to give you a shell account. I know a coupla guys you might try to contact.
Jul 21 2004
I got me a linux box too.. kinda.. VMware is really cool!

I'll have a play with this tomorrow :)

On Wed, 21 Jul 2004 22:43:59 -0700, Kris <someidiot earthlink.dot.dot.dot.net> wrote:
> I somehow missed your post Druid. Sorry, and thank you for the suggestion. I was given one in the meantime (for doing remote sanity checks).
Jul 22 2004
Ok.. trying to replicate the working case on my VMware linux machine I get:

    looking up exception error code ... socket cancel status now set closing resource via destructor closing socket handle ... socket handle closed

From a brief look at the code the exception is thrown here:

    mango\io\MulticastSocket.d(121): exception ("Unable to join multicast

due to this call:

    if (! setGroup (groupAddress, Option.IP_ADD_MEMBERSHIP))

which returns whether setsockopt worked or not.

According to the man pages on setsockopt it does not set that error code.. so something weird appears to be going on, perhaps due to the way VMware handles the networking.

I found getError() is used to get the errno value; what about setting it? It would be good to set it to 0 before the setsockopt call to ensure setsockopt is really setting errno to this value.

As a side note, how does getError work? It's not a C fn, so someone must have stubbed access to errno, right? But where/how :)

Regan.

On Thu, 22 Jul 2004 23:35:14 +1200, Regan Heath <regan netwin.co.nz> wrote:
> I got me a linux box too.. kinda.. VMware is really cool!
>
> I'll have a play with this tomorrow :)
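The "clear errno first, then look at it only if the call failed" check suggested above might look roughly like this. It is a sketch for a POSIX system with current druntime, where `core.stdc.errno` exposes `errno` for both reading and writing; it does not use Mango's getError() or setGroup(). `setOptionChecked` is a hypothetical helper, and the invalid descriptor -1 is used only to force a predictable failure.

```d
import core.stdc.errno : EBADF, errno;
import core.sys.posix.sys.socket : SO_REUSEADDR, SOL_SOCKET, setsockopt, socklen_t;
import std.stdio : writeln;

/// Calls setsockopt() with errno cleared beforehand.
/// Returns 0 on success, otherwise the errno value left by this very call.
int setOptionChecked(int fd, int level, int option, int value)
{
    errno = 0; // discard any stale value from an earlier, unrelated call
    immutable rc = setsockopt(fd, level, option, &value, cast(socklen_t) value.sizeof);
    // errno is only meaningful when the call itself reports failure.
    return rc == 0 ? 0 : errno;
}

void main()
{
    // -1 is never a valid descriptor, so this call is expected to fail.
    immutable err = setOptionChecked(-1, SOL_SOCKET, SO_REUSEADDR, 1);
    writeln(err == EBADF ? "failed with EBADF" : "unexpected result");
}
```

If the reported code still looks wrong with errno cleared beforehand, the stale-value explanation can be ruled out and the setsockopt call itself becomes the suspect.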
Jul 22 2004