www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 268] New: Bug with SocketSet and classes

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268

           Summary: Bug with SocketSet and classes
           Product: D
           Version: 0.162
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Keywords: patch
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: felipe.contreras gmail.com


If you do the following:

 new SocketSet (1);
 rset.reset ();
 rset.add (sock);

Inside a class, SocketSet:Add segfaults.

It only happens inside the class, and with values < than 256.

I added the following in socket.d and it seemed to work:

 if(nbytes < NFDBITS)
  nbytes = NFDBITS;


-- 
Jul 29 2006
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268


felipe.contreras gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |critical




-- 
Feb 12 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268


thomas-dloop kuehne.cn changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         OS/Version|Linux                       |All





The real problem is that SocketSet.add uses the "in"
contract to enforce that not too many sockets are added
but phobos is compiled with the "-release" switch and thus
the tests skipped.

Socket.select has the same problem


-- 
Apr 29 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






I don't know what this code does, so if someone who understands sockets can
post a tested/correct patch, I'll fold it in.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






Created an attachment (id=150)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=150&action=view)
Proposed patch.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






Created an attachment (id=151)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=151&action=view)
Test.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268


felipe.contreras gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |felipe.contreras gmail.com





Only one year to comment, and I already provided the fix. I don't what to know
what happens with problematic bugs.

Also the format of all the source files seems to be wrong.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268







 Also the format of all the source files seems to be wrong.
 
As I recall, Walter uses the strange imho convention of 4 spaces to an indent, but 8 spaces to a tab. Since many editors for any graphical interface use 4 spaces to a tab now, this can cause some confusion. But don't worry, they are formatted properly when viewed through the right glasses. -[Unknown] --
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






The whole SocketSet class looks rather odd to me (given only a few minutes
looking at it).  It's confusing 'max' as a number of sockets that can be added
and the fact that the socket fd value won't necessarily have any relation to
that count.  Other oddities are functions like this:

socket_t* first() { return cast(socket_t*)buf; }
fd_set* _fd_set() { return cast(fd_set*)buf; }   

Treating the same block of memory as two different and incompatible things. 
Does this class even work as it's name suggests?

The proposed change to the constructor doesn't feel right given the arbitrary
value that a socket fd might have.  Using anything other than FD_SETSIZE for
'max' is likely to result in problems.  NFDBITS is simply the number of bits
that fit in a word which doesn't make much sense as a minimum for nbytes.

Really, I'd just ditch the entire concept of a variable size set.  It doesn't
make any sense for unix style select() api's.  For other type of selection
interfaces it might make more sense such as epoll on linux, but there the
actual set is managed inside the kernel, not user space.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






Hence my request for help. The class seems to be a mess, and I don't know how
to fix it because I don't know what it's supposed to be doing.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268








 Also the format of all the source files seems to be wrong.
 
As I recall, Walter uses the strange imho convention of 4 spaces to an indent, but 8 spaces to a tab. Since many editors for any graphical interface use 4 spaces to a tab now, this can cause some confusion. But don't worry, they are formatted properly when viewed through the right glasses. -[Unknown]
I mean, I get a ^M at the end of each line when I try to view the files in vim, Linux. I did try to change the file format to dos, but it still looks the same. --
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268







 The whole SocketSet class looks rather odd to me (given only a few minutes
 looking at it).  It's confusing 'max' as a number of sockets that can be added
 and the fact that the socket fd value won't necessarily have any relation to
 that count.  Other oddities are functions like this:
 
 socket_t* first() { return cast(socket_t*)buf; }
 fd_set* _fd_set() { return cast(fd_set*)buf; }   
 
 Treating the same block of memory as two different and incompatible things. 
 Does this class even work as it's name suggests?
 
 The proposed change to the constructor doesn't feel right given the arbitrary
 value that a socket fd might have.  Using anything other than FD_SETSIZE for
 'max' is likely to result in problems.  NFDBITS is simply the number of bits
 that fit in a word which doesn't make much sense as a minimum for nbytes.
 
 Really, I'd just ditch the entire concept of a variable size set.  It doesn't
 make any sense for unix style select() api's.  For other type of selection
 interfaces it might make more sense such as epoll on linux, but there the
 actual set is managed inside the kernel, not user space.
 
I agree, max should always be FD_SETSIZE. /* Maximum number of file descriptors in `fd_set'. */ #define FD_SETSIZE __FD_SETSIZE --
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






Created an attachment (id=152)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=152&action=view)
Major overhaul to SocketSet -- untested

Walter, here's a potential rewrite of SocketSet but it's untested in it's
entirety.  The only thing I confirmed is that it compiles with 1.018 on linux. 
I haven't test compiled win32.  I haven't tried it with any test apps. 
Unfortunately there's no unittests to make life easy.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268


braddr puremagic.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------

              patch|                            |




-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268








 The whole SocketSet class looks rather odd to me (given only a few minutes
 looking at it).  It's confusing 'max' as a number of sockets that can be added
 and the fact that the socket fd value won't necessarily have any relation to
 that count.  Other oddities are functions like this:
 
 socket_t* first() { return cast(socket_t*)buf; }
 fd_set* _fd_set() { return cast(fd_set*)buf; }   
 
 Treating the same block of memory as two different and incompatible things. 
 Does this class even work as it's name suggests?
 
 The proposed change to the constructor doesn't feel right given the arbitrary
 value that a socket fd might have.  Using anything other than FD_SETSIZE for
 'max' is likely to result in problems.  NFDBITS is simply the number of bits
 that fit in a word which doesn't make much sense as a minimum for nbytes.
 
 Really, I'd just ditch the entire concept of a variable size set.  It doesn't
 make any sense for unix style select() api's.  For other type of selection
 interfaces it might make more sense such as epoll on linux, but there the
 actual set is managed inside the kernel, not user space.
 
I agree, max should always be FD_SETSIZE. /* Maximum number of file descriptors in `fd_set'. */ #define FD_SETSIZE __FD_SETSIZE
 The whole SocketSet class looks rather odd to me (given only a few minutes
 looking at it).  It's confusing 'max' as a number of sockets that can be added
 and the fact that the socket fd value won't necessarily have any relation to
 that count.  Other oddities are functions like this:
 
 socket_t* first() { return cast(socket_t*)buf; }
 fd_set* _fd_set() { return cast(fd_set*)buf; }   
 
 Treating the same block of memory as two different and incompatible things. 
 Does this class even work as it's name suggests?
 
 The proposed change to the constructor doesn't feel right given the arbitrary
 value that a socket fd might have.  Using anything other than FD_SETSIZE for
 'max' is likely to result in problems.  NFDBITS is simply the number of bits
 that fit in a word which doesn't make much sense as a minimum for nbytes.
 
 Really, I'd just ditch the entire concept of a variable size set.  It doesn't
 make any sense for unix style select() api's.  For other type of selection
 interfaces it might make more sense such as epoll on linux, but there the
 actual set is managed inside the kernel, not user space.
 
I did some research. The real problem is that FD_ZERO always erases the whole fd_set. Since the fd_set didn't have the right size then more memory was erased and the corruption generated the crash. So we _need_ to have fd_set width the right size which turns out to be: FD_SETSIZE / NFDBITS * int.size That's because inside fd_set there isn't a list of sockets, but a binary map. So if bit 0 is on, that means the socket with fd 0 should be checked and so on. So the first() method was wrong, there's no way from the fd_set to find the socket_t structure. I'm not sure why a size of NFDBITS makes it not crash, maybe the zeroed structures are not used in this case. Anyway, I have tested the attached code with up to 1024 sockets. The assert doesn't get activated because the socket creation fails. --
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






Created an attachment (id=153)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=153&action=view)
Proposed patch v2.0


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






For what it's worth, I just tested my rewrite with the Test file that Felipe
provided.

Felipe, would you please apply my changes, a more thorough fix / rewrite than
your changes, and see how they do for you?


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268


braddr puremagic.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------

           obsolete|                            |





Created an attachment (id=154)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=154&action=view)
Major overhaul to SocketSet -- lightly tested -- v2

Missed decrementing SocketSet.count during SocketSet.remove


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






Can you please expand the test case so it at least gives good coverage via the
-cov switch? This will save us a lot of trouble.


-- 
Jul 15 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268






As an additional data point, I would like to confirm that "Proposed patch v2.0"
fixes some memory corruption I was experiencing when using SocketSets with a
small (4) max number of sockets.

The ^M pollution is due to inconsistent line termination standards which should
have been resolved 15 years ago. You can strip them pretty easily with:
tr -d "\015" < infile.d 

Sadly that's not going to get you very far as it'll then be quite inconsistent
with the rest of the source. Not that I would mind seeing them vanish from the
entirety of the source, but that's a different issue ...


-- 
Jul 19 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268








I have not had time to try the major overhaul patch. Perhaps it should be
tested under Windows too.

I really don't care, as I'm not using D anymore. This issue and the lack of
proper debugging tools (Linux) to be able to fix it made me switch to Ruby.

I just wanted to make my contribution.


-- 
Jul 20 2007
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=268


bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED





Fixed DMD 1.019 and 2.003


-- 
Jul 30 2007