digitalmars.D - Clean-up of std.socket

Vladimir Panteleev (67/67) Sep 12 2011 Hi,

David Nadlinger (11/15) Sep 12 2011 As discussed on IRC, throwing on reverse lookup failure seems very wrong...

Vladimir Panteleev (10/19) Sep 12 2011 I'm thinking of making all of Address.to(Addr|HostName|Port|Service)Stri...

Regan Heath (14/24) Sep 13 2011 This is one of those things I haven't managed to come to a definite

Vladimir Panteleev (21/26) Sep 13 2011 Why do you say that? Let's look at each of those functions.

Regan Heath (4/13) Sep 13 2011 My bad, I didn't take a good look at the source and assumed these were

Regan Heath (10/20) Sep 13 2011 I agree. To me, throwing on lookup failure will end up being "using
Vladimir Panteleev (6/20) Sep 14 2011 https://github.com/CyberShadow/phobos/commit/5fac9e2b5d39583235185f36b9e...

David Nadlinger (4/23) Sep 14 2011 What, my unittests for this weren't already in std.socket?! My Git-fu

David Nadlinger (2/4) Sep 14 2011 :/, even.

Sean Kelly (36/36) Sep 12 2011 Looks much nicer than the current std.socket. A few random comments

Adam Burton (10/53) Sep 12 2011 Regardless if it is correct or wrong I think there is a reason it is voi...

Sean Kelly (11/26) Sep 12 2011 void[]

Adam Burton (17/38) Sep 12 2011 Like I said, regardless if it is correct or wrong. I'm not arguing for i...

Simen Kjaeraas (6/17) Sep 12 2011 I believe the reasons for not using void[] is exactly that it could cont...

Adam Burton (7/24) Sep 12 2011 How does a ubyte[] prevent that? If you've serialized an int (or even a

Jonathan M Davis (8/34) Sep 12 2011 With void[], you can pass something like int*[] to it without having to ...

Adam Burton (6/43) Sep 12 2011 Fair enough that's more clear, I hadn't actually thought of an array of

Adam Burton (7/51) Sep 12 2011 Just a thought then, rather than using ubyte[] and casting to force some...

Dmitry Olshansky (13/64) Sep 13 2011 Don't forget that there is also network byte order vs host machine byte

Vladimir Panteleev (32/68) Sep 12 2011 I'd say this is debatable (e.g. File.rawWrite is templated to the same
Masahiro Nakagawa (7/13) Sep 12 2011 [snip]

David Nadlinger (15/22) Sep 13 2011 Which kind of »provided details« would be interesting for you? The

Vladimir Panteleev (16/22) Sep 13 2011 Something like this post, thanks.

Marco Leise (20/39) Sep 13 2011

David Nadlinger (7/12) Sep 14 2011 Currently, it is covered in a »Note« section

Marco Leise (8/20) Sep 14 2011

Vladimir Panteleev (6/25) Sep 14 2011 https://github.com/CyberShadow/phobos/commit/89feff70e2c8ae68d7efd8a2fb7...

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

Hi,

I've spent some time polishing up std.socket a bit. I've tried to preserve  
compatibility as much as possible.

The branch is here:  
https://github.com/CyberShadow/phobos/tree/new-std-socket

A list of commits is here:  
https://github.com/CyberShadow/phobos/compare/master...new-std-socket

Docs are here: http://thecybershadow.net/d/new-std-socket/std_socket.html

The most important changes are:

* Incorporate Chris Miller's std.socket updates and license change, which  
were posted on Bugzilla as issue 5401 in January.

* Add bounds checking to SocketSet. Previously, adding sockets outside the  
SocketSet's capacity was an unsafe operation which could corrupt memory.

* SocketSet now supports variable fd_set sizes on Windows.

* Re-entrant IPv4 name resolution for supported POSIX platforms. This will  
potentially speed up existing multi-threaded network code.

* IPv6 address support, with wrapper functions which use IPv4-only  
functions when the IPv6 functions are unavailable (Windows versions before  
XP).

* Fixes for issues 5177 and 3484.

* Improved documentation, added examples.

* Some minor added functionality, such as retrieval of more detailed error  
information, Unix Domain sockets, setting TCP keep-alive options.

I'd appreciate it if someone with an existing body of D2 code using
std.socket could try my version and let me know of any code breakage.

I've heard a lot of criticism about std.socket before. If I haven't fixed  
your gripe, feel free to let me know.

Some concerns:

* std.socket enumerations do not conform to D's naming conventions. Fixing
this is complicated, due to (IIUC) enum aliases breaking code which
enumerates enum members, and the inability to deprecate individual enum
members.

* Exceptions retrieve a text description of numerical error codes when  
they are created. If it's possible, it would be best to make that happen  
when a text description is requested (msg field or .toString), though I  
don't think msg being a field allows this.

* InternetAddress (and by convention, Internet6Address) has a constructor  
which accepts a hostname. The constructor resolves the hostname and picks  
the first address entry. I understand that conflating DNS resolution with  
other functionality may be undesirable, so perhaps such functionality  
should be deprecated.

* Currently, reverse hostname lookup functions throw on failure. Such  
lookups are not reliable and are expected to sometimes fail, so perhaps a  
more appropriate behavior would be to return the requested IP address  
unchanged, or a value indicating failure (null or false).

* As far as I can tell, the UnknownAddress class is useless. The generic  
sockaddr structure it encapsulates is not large enough to abstract and  
hold newer socket address structures.

* David Nadlinger added functionality to work around an apparent oddity of
the Windows socket implementation (see WINSOCK_TIMEOUT_SKEW). Although the
hack is documented, I'm a bit uncomfortable that there are no provided
details or instructions on how to reproduce the experiments and
measurements which led to the inclusion of this hack. (There's also the
question of whether a language library's purpose includes working around
apparent bugs in platforms' implementations.)

* Some new functions (notably getAddress) could have probably been named  
better. "getAddress", which returns an array of Address class instances,  
is the logical extension of getAddressInfo (which returns the addresses  
with accompanying information), which in turn is named after the POSIX  
getaddrinfo function.

* InternetAddress has constructors and getters which use uint32_t as the  
native type for an IPv4 address. Should Internet6Address use ubyte[16]?  
Currently it uses the in6_addr structure, which is also used in POSIX  
network structures.
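
The concern above about exceptions fetching their text description eagerly can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: `describeErrorCode` is a hypothetical stand-in for the real OS lookup, `LazySocketException` is not a std.socket class, and the numeric codes are arbitrary example values.

```d
import std.conv : to;

/// Hypothetical stand-in for an OS lookup; the codes are example values only.
string describeErrorCode(int code)
{
    switch (code)
    {
        case 110: return "Connection timed out";
        case 111: return "Connection refused";
        default:  return "Unknown error " ~ code.to!string;
    }
}

/// Sketch only: keeps the numeric code and defers the text lookup.
class LazySocketException : Exception
{
    int errorCode;

    this(int errorCode, string file = __FILE__, size_t line = __LINE__)
    {
        super(null, file, line); // msg stays empty; no lookup happens here
        this.errorCode = errorCode;
    }

    /// The description is only produced when somebody asks for it.
    string description()
    {
        return describeErrorCode(errorCode);
    }
}
```

The trade-off is that code reading `msg` directly sees nothing, which is the compatibility problem the bullet above alludes to.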

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Sep 12 2011

David Nadlinger <see klickverbot.at> writes:

On 9/12/11 4:11 PM, Vladimir Panteleev wrote:
 * Currently, reverse hostname lookup functions throw on failure. Such
 lookups are not reliable and are expected to sometimes fail, so perhaps
 a more appropriate behavior would be to return the requested IP address
 unchanged, or a value indicating failure (null or false).

As discussed on IRC, throwing on reverse lookup failure seems very wrong 
to me, as it is certainly expected. In my opinion, the best solution 
would be to return null (empty string), but I am not certain if it 
should still throw if something went wrong during lookup (besides the IP 
address not being found).

I'll probably change the current std.socket.toHostAddrString() to behave 
like this, as the current behavior is inconsistent (when the 
getHostByAddr fallback is used), and I accidentally left it undocumented 
anyway.

David

Sep 12 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 12 Sep 2011 20:55:42 +0300, David Nadlinger <see klickverbot.at>  
wrote:

 As discussed on IRC, throwing on reverse lookup failure seems very wrong  
 to me, as it is certainly expected. In my opinion, the best solution  
 would be to return null (empty string), but I am not certain if it  
 should still throw if something went wrong during lookup (besides the IP  
 address not being found).

I'm thinking of making all of Address.to(Addr|HostName|Port|Service)String  
return null on failure for consistency. Sounds good?
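
What "return null on failure" could look like from the caller's side, as a minimal sketch: `reverseLookup` here is a hypothetical table-based stand-in for a real reverse DNS query, and `displayName` is an invented helper; neither is part of std.socket.

```d
/// Hypothetical stand-in for a reverse DNS query: returns null when
/// no host name is known for the given numeric address.
string reverseLookup(string ip)
{
    auto known = ["127.0.0.1": "localhost"];
    if (auto name = ip in known)
        return *name;
    return null;
}

/// Falls back to the numeric address when the lookup yields nothing.
string displayName(string ip)
{
    auto host = reverseLookup(ip);
    // Checking .length treats null and "" the same way.
    return host.length ? host : ip;
}
```

With this shape a failed lookup needs no try/catch at the call site: `displayName("192.0.2.1")` simply yields the numeric string back.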

 I'll probably change the current std.socket.toHostAddrString() to behave  
 like this, as the current behavior is inconsistent (when the  
 getHostByAddr fallback is used), and I accidentally left it undocumented  
 anyway.

I'd prefer if we minimized changes on the master branch. I hope we can  
finalize and merge in the cleaned-up version before the next release  
anyway.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Sep 12 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 12 Sep 2011 23:10:29 +0100, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Mon, 12 Sep 2011 20:55:42 +0300, David Nadlinger <see klickverbot.at>  
 wrote:

 As discussed on IRC, throwing on reverse lookup failure seems very  
 wrong to me, as it is certainly expected. In my opinion, the best  
 solution would be to return null (empty string), but I am not certain  
 if it should still throw if something went wrong during lookup (besides  
 the IP address not being found).

 I'm thinking of making all of  
 Address.to(Addr|HostName|Port|Service)String return null on failure for  
 consistency. Sounds good?

This is one of those things I haven't managed to come to a definite  
opinion on myself.  In some of these cases you'll be returning null for  
incorrect input (essentially) which is something you could argue warrants  
an exception, or does it warrant an assertion?  The line, to me, between  
where to use assert and when to throw often blurs.  I guess at the end of  
the day you should throw in cases where the arguments may have been 'user'  
input.. but that seems to me, to be all the time, because you cannot be  
certain.  So, that leaves us using assert only for 'internal' functions,  
where we know the arguments are not user input, or should have been  
sanitized already by our own code.

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Sep 13 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 13 Sep 2011 12:59:35 +0300, Regan Heath <regan netmail.co.nz>  
wrote:

 I'm thinking of making all of  
 Address.to(Addr|HostName|Port|Service)String return null on failure for  
 consistency. Sounds good?

 In some of these cases you'll be returning null for incorrect input  
 (essentially)

Why do you say that? Let's look at each of those functions.

An Address class encapsulates a socket address that has already been  
parsed/resolved/retrieved to a binary numeric format.

Address.toAddrString returns a numeric string representation of the host  
address. For IPv4, it means taking the 32-bit value and formatting it to  
the common %d.%d.%d.%d format. I don't see how that could fail, except for  
catastrophic conditions (out-of-memory etc). Same with IPv6 - AFAIK any  
16-byte sequence can be represented as an IPv6 string (%02x:%02x:%02x...).  
Same with Address.toPortString.
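
For illustration, the IPv4 formatting step described above can be sketched as follows. This assumes the 32-bit value is held in host byte order with the first octet in the most significant byte; `ipv4ToString` is a hypothetical helper name, not the std.socket implementation.

```d
import std.format : format;

/// Formats a 32-bit IPv4 address (host byte order, first octet in the
/// most significant byte) as a dotted quad. Every input value is valid.
string ipv4ToString(uint addr)
{
    return format("%d.%d.%d.%d",
        (addr >> 24) & 0xFF,
        (addr >> 16) & 0xFF,
        (addr >> 8) & 0xFF,
        addr & 0xFF);
}
```

Since every `uint` maps to some dotted quad, the only failure mode left is the allocation inside `format`.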

The only question regarding the above is with address families which do  
not have a meaningful host address/port, for example Unix domain sockets.

Address.toHostNameString was the point of our discussion. The method  
attempts a reverse lookup, which can be expected to fail.  
Address.toServiceString is similar, however it doesn't need to perform a  
network lookup - it only needs to check the host's database of service  
names.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Sep 13 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Tue, 13 Sep 2011 13:12:47 +0100, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tue, 13 Sep 2011 12:59:35 +0300, Regan Heath <regan netmail.co.nz>  
 wrote:

 I'm thinking of making all of  
 Address.to(Addr|HostName|Port|Service)String return null on failure  
 for consistency. Sounds good?

 In some of these cases you'll be returning null for incorrect input  
 (essentially)

 Why do you say that? Let's look at each of those functions.

My bad, I didn't take a good look at the source and assumed these were  
static methods converting to/from string representation or similar.

Sep 13 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 12 Sep 2011 18:55:42 +0100, David Nadlinger <see klickverbot.at>  
wrote:

 On 9/12/11 4:11 PM, Vladimir Panteleev wrote:
 * Currently, reverse hostname lookup functions throw on failure. Such
 lookups are not reliable and are expected to sometimes fail, so perhaps
 a more appropriate behavior would be to return the requested IP address
 unchanged, or a value indicating failure (null or false).

 As discussed on IRC, throwing on reverse lookup failure seems very wrong  
 to me, as it is certainly expected. In my opinion, the best solution  
 would be to return null (empty string), but I am not certain if it  
 should still throw if something went wrong during lookup (besides the IP  
 address not being found).

I agree.  To me, throwing on lookup failure will end up being "using  
exceptions for flow control" (which is a well known 'bad'(TM) thing,  
right?) for callers specifically who will almost always want to/have to  
catch the (hopefully specific) exception and handle it.  Or, to look at it  
another way it is using an exception for something which is not actually  
exceptional, which just seems wrong.

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Sep 13 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 12 Sep 2011 20:55:42 +0300, David Nadlinger <see klickverbot.at>  
wrote:

 On 9/12/11 4:11 PM, Vladimir Panteleev wrote:
 * Currently, reverse hostname lookup functions throw on failure. Such
 lookups are not reliable and are expected to sometimes fail, so perhaps
 a more appropriate behavior would be to return the requested IP address
 unchanged, or a value indicating failure (null or false).

 As discussed on IRC, throwing on reverse lookup failure seems very wrong  
 to me, as it is certainly expected. In my opinion, the best solution  
 would be to return null (empty string), but I am not certain if it  
 should still throw if something went wrong during lookup (besides the IP  
 address not being found).

 I'll probably change the current std.socket.toHostAddrString() to behave  
 like this, as the current behavior is inconsistent (when the  
 getHostByAddr fallback is used), and I accidentally left it undocumented  
 anyway.

https://github.com/CyberShadow/phobos/commit/5fac9e2b5d39583235185f36b9e5bd8346be5cf3

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Sep 14 2011

David Nadlinger <see klickverbot.at> writes:

On 9/14/11 4:27 PM, Vladimir Panteleev wrote:
 On Mon, 12 Sep 2011 20:55:42 +0300, David Nadlinger <see klickverbot.at>
 wrote:

 On 9/12/11 4:11 PM, Vladimir Panteleev wrote:
 * Currently, reverse hostname lookup functions throw on failure. Such
 lookups are not reliable and are expected to sometimes fail, so perhaps
 a more appropriate behavior would be to return the requested IP address
 unchanged, or a value indicating failure (null or false).

 As discussed on IRC, throwing on reverse lookup failure seems very
 wrong to me, as it is certainly expected. In my opinion, the best
 solution would be to return null (empty string), but I am not certain
 if it should still throw if something went wrong during lookup
 (besides the IP address not being found).

 I'll probably change the current std.socket.toHostAddrString() to
 behave like this, as the current behavior is inconsistent (when the
 getHostByAddr fallback is used), and I accidentally left it
 undocumented anyway.

 https://github.com/CyberShadow/phobos/commit/5fac9e2b5d39583235185f36b9e5bd8346be5cf3

What, my unittests for this weren't already in std.socket?! My Git-fu 
must have been not strong enough back then… ;)

David

Sep 14 2011

David Nadlinger <see klickverbot.at> writes:

On 9/14/11 10:36 PM, David Nadlinger wrote:
 What, my unittests for this weren't already in std.socket?! My Git-fu
 must have been not strong enough back then… ;)

:/, even.

Sep 14 2011

Sean Kelly <sean invisibleduck.org> writes:

Looks much nicer than the current std.socket.  A few random comments
from a quick scan of the code:

Socket.send/receive should use ubyte[], not void[] for the data.

I'd like some way to avoid new objects being created during any
low-level socket operation I expect to do regularly.  For example,
socket.receiveFrom creates a new Address instance every time it's
called.  Perhaps I could have the option to supply an Address object to
be overwritten instead?

That Address.name() returns a sockaddr* is kind of weird.  I'd expect it
to return a string?  I know that the sockaddr is generally called a
"name" in API parlance, but it seems a bit weird in this context.

Why is InternetHost an instantiable object?  It has data fields that
aren't initialized by any ctor, but only by calls where a hostent* is
passed?  And all for access to API calls which no one is supposed to use
anyway?  Please just make this go away :-)

There are a number of bool parameters that should really be
EnumName.yes/no.
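
A minimal sketch of the EnumName.yes/no idea, using `Flag` from std.typecons as it exists in current Phobos. The function `describeSocket` and both flag names are hypothetical and only serve to show the call site.

```d
import std.typecons : Flag, Yes, No;

/// Hypothetical function; the flag names are made up for this example.
string describeSocket(Flag!"blocking" blocking, Flag!"keepAlive" keepAlive)
{
    return (blocking ? "blocking" : "non-blocking")
         ~ (keepAlive ? ", keep-alive on" : ", keep-alive off");
}
```

A call then reads `describeSocket(Yes.blocking, No.keepAlive)`, and passing bare `true, false` or swapping the two flags is rejected at compile time.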

The current approach that appears to be required for connecting to a
remote host kind of stinks:

    Socket sock = null;
    foreach (info; getAddressInfo("www.digitalmars.com")) {
        try {
            // will throw if can't create a socket based on info
            sock = new Socket(info);
            sock.connect(info.address);
            break;
        } catch (Exception e) {
            sock = null;
        }
    }
    if (sock is null) {
        // unable to connect via any available method!
    }

As an aside… From your comments, I gather that you're not terribly
happy with certain design requirements imposed by the existing
std.socket.  Why not create an entirely new API in std.socket2 and not
worry about it?  Would your design change enough to warrant doing this?

Sep 12 2011

Adam Burton <adz21c gmail.com> writes:

Sean Kelly wrote:

 Looks much nicer than the current std.socket.  A few random comments from
 a quick scan of the code:
 
 Socket.send/receive should use ubyte[], not void[] for the data.

Regardless if it is correct or wrong I think there is a reason it is void[] 
(I am sure you are aware of this but just in case you are not ;)). All 
arrays implicitly convert to void[] 
(http://www.digitalmars.com/d/2.0/arrays.html - Implicit conversions) and 
the array length is automatically modified such that it is a byte count (for 
example assigning a dstring "hello" to void[] sets void[]'s length to 20 
while dstring is 5), this lets you send data to send/receive without having 
to cast it. I've inferred that to mean void[] is expected for buffers of 
bytes and ubyte[]/byte[] as arrays of bytes.
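
The length adjustment described above can be checked with a minimal sketch; `byteCount` is a hypothetical helper, not a std.socket function.

```d
/// Any array converts implicitly to (const) void[]; the resulting
/// length is measured in bytes rather than in elements.
size_t byteCount(const(void)[] data)
{
    return data.length;
}
```

For `dstring d = "hello"d;`, `d.length` is 5 while `byteCount(d)` is 20, because each `dchar` occupies four bytes.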
 
 I'd like some way to avoid new objects being created during any low-level
 socket operation I expect to do regularly.  For example,
 socket.receiveFrom creates a new Address instance every time it's called. 
 Perhaps I could have the option to supply an Address object to be
 overwritten instead?
 
 That Address.name() returns a sockaddr* is kind of weird.  I'd expect it
 to return a string?  I know that the sockaddr is generally called a "name"
 in API parlance, but it seems a bit weird in this context.
 
 Why is InternetHost an instantiable object?  It has data fields that
 aren't initialized by any ctor, but only by calls where a hostent* is
 passed?  And all for access to API calls which no one is supposed to use
 anyway?  Please just make this go away :-)
 
 There are a number of bool parameters that should really be
 EnumName.yes/no.
 
 The current approach that appears to be required for connecting to a
 remote host kind of stinks:
 
     Socket sock = null;
     foreach (info; getAddressInfo("www.digitalmars.com")) {
         try {
             // will throw if can't create a socket based on info
             sock = new Socket(info);
             sock.connect(info.address);
             break;
         } catch (Exception e) {
             sock = null;
         }
     }
     if (sock is null) {
         // unable to connect via any available method!
     }
 
 As an aside… From your comments, I gather that you're not terribly happy
 with certain design requirements imposed by the existing std.socket.  Why
 not create an entirely new API in std.socket2 and not worry about it? 
 Would your design change enough to warrant doing this?

Sep 12 2011

Sean Kelly <sean invisibleduck.org> writes:

On Sep 12, 2011, at 1:12 PM, Adam Burton wrote:

 Sean Kelly wrote:

 Looks much nicer than the current std.socket.  A few random comments from
 a quick scan of the code:

 Socket.send/receive should use ubyte[], not void[] for the data.

 Regardless if it is correct or wrong I think there is a reason it is void[]
 (I am sure you are aware of this but just in case you are not ;)). All
 arrays implicitly convert to void[]
 (http://www.digitalmars.com/d/2.0/arrays.html - Implicit conversions) and
 the array length is automatically modified such that it is a byte count (for
 example assigning a dstring "hello" to void[] sets void[]'s length to 20
 while dstring is 5), this lets you send data to send/receive without having
 to cast it. I've inferred that to mean void[] is expected for buffers of
 bytes and ubyte[]/byte[] as arrays of bytes.

Sure… but is this a feature that's actually desirable here?  I suppose
it would be good for sending char strings, but other than that I'd
probably want to serialize the data somehow before sending it.

Sep 12 2011

Adam Burton <adz21c gmail.com> writes:

Sean Kelly wrote:

 On Sep 12, 2011, at 1:12 PM, Adam Burton wrote:
 
 Sean Kelly wrote:
 
 Looks much nicer than the current std.socket.  A few random comments
 from a quick scan of the code:
 
 Socket.send/receive should use ubyte[], not void[] for the data.

 Regardless if it is correct or wrong I think there is a reason it is
 void[] (I am sure you are aware of this but just in case you are not ;)).
 All arrays implicitly convert to void[]
 (http://www.digitalmars.com/d/2.0/arrays.html - Implicit conversions) and
 the array length is automatically modified such that it is a byte count
 (for example assigning a dstring "hello" to void[] sets void[]'s length
 to 20 while dstring is 5), this lets you send data to send/receive
 without having to cast it. I've inferred that to mean void[] is expected
 for buffers of bytes and ubyte[]/byte[] as arrays of bytes.

 
 Sure… but is this a feature that's actually desirable here?  I suppose it
 would be good for sending char strings, but other than that I'd probably
 want to serialize the data somehow before sending it.

Like I said, regardless if it is correct or wrong. I'm not arguing for it 
either way I was just making sure it was known why it would use void[].

If the data is serialized or not it makes no difference if send/receive uses 
ubyte[] or void[] since void[] can handle both. I quite like the idea of 
void[] representing a chunk of memory that could contain anything, 
serialized data; an array of ubytes or strings, and allow ubyte[] to just 
represent an array of ubytes (after all is serialized data an array of bytes 
or a block of data containing various data types crammed into it in some 
organised manner?). In the end it is just a convention I like, not attached 
to it or anything, and D tends to discourage working based on conventions 
anyway, I guess I am somewhat playing devil's advocate in this paragraph 
:-).

The only reason I can see not to change it to ubyte[] is it seems to me a 
change that would be breaking, due to some code maybe needing casts, (or 
at least require a fairly simple deprecation) with no real benefit (as far as 
I can see). That's assuming it is not turned into std.socket2 :-).

Sep 12 2011

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Mon, 12 Sep 2011 23:13:29 +0200, Adam Burton <adz21c gmail.com> wrote:
 I quite like the idea of
 void[] representing a chunk of memory that could contain anything,
 serialized data; an array of ubytes or strings, and allow ubyte[] to just
 represent an array of ubytes (after all is serialized data an array of  
 bytes
 or a block of data containing various data types crammed into it in some
 organised manner?). In the end it is just a convention I like, not  
 attached
 to it or anything, and D tends to discourage working based on conventions
 anyway, I guess I am somewhat playing devil's advocate in this paragraph
 :-).

I believe the reasons for not using void[] is exactly that it could contain
anything, including pointers, which likely would not be valid in the other
end.

-- 
   Simen

Sep 12 2011

Adam Burton <adz21c gmail.com> writes:

Simen Kjaeraas wrote:

 On Mon, 12 Sep 2011 23:13:29 +0200, Adam Burton <adz21c gmail.com> wrote:
 I quite like the idea of
 void[] representing a chunk of memory that could contain anything,
 serialized data; an array of ubytes or strings, and allow ubyte[] to just
 represent an array of ubytes (after all is serialized data an array of
 bytes
 or a block of data containing various data types crammed into it in some
 organised manner?). In the end it is just a convention I like, not
 attached
 to it or anything, and D tends to discourage working based on conventions
 anyway, I guess I am somewhat playing devil's advocate in this paragraph
 :-).

 
 I believe the reasons for not using void[] is exactly that it could
 contain anything, including pointers, which likely would not be valid in
 the other end.
 

How does a ubyte[] prevent that? If you've serialized an int (or even a 
pointer) then ubyte[] is just as bad, ubyte[0] would seem to indicate a 
meaningful unit of data itself when it's actually just the first byte of an 
int (or pointer). void[] at least says "I don't know, I just know the start 
and how long, you figure it out, I presume I have somewhere to go to be 
given context".

Sep 12 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Monday, September 12, 2011 14:53 Adam Burton wrote:
 Simen Kjaeraas wrote:
 On Mon, 12 Sep 2011 23:13:29 +0200, Adam Burton <adz21c gmail.com> wrote:
 I quite like the idea of
 void[] representing a chunk of memory that could contain anything,
 serialized data; an array of ubytes or strings, and allow ubyte[] to
 just represent an array of ubytes (after all is serialized data an
 array of bytes
 or a block of data containing various data types crammed into it in some
 organised manner?). In the end it is just a convention I like, not
 attached
 to it or anything, and D tends to discourage working based on
 conventions anyway, I guess I am somewhat playing devil's advocate in
 this paragraph
 
 :-).

 
 I believe the reasons for not using void[] is exactly that it could
 contain anything, including pointers, which likely would not be valid in
 the other end.

 
 How does a ubyte[] prevent that? If you've serialized an int (or even a
 pointer) then ubyte[] is just as bad, ubyte[0] would seem to indicate a
 meaningful unit of data itself when it's actually just the first byte of an
 int (or pointer). void[] at least says "I don't know, I just know the start
 and how long, you figure it out, I presume I have somewhere to go to be
 given context".

With void[], you can pass something like int*[] to it without having to worry 
about converting it, because the conversion is implicit. ubyte[], on the 
other hand, forces you to do the conversion explicitly. So yes, you could 
still make it so that the ubyte[] passed in contains pointers, but you have to 
do it explicitly, whereas with void[], it'll take any array without 
complaining.
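
The difference can be shown with a minimal sketch. `sendVoid` and `sendBytes` are hypothetical stand-ins for the two possible signatures of Socket.send; they only report how many bytes they were given.

```d
/// Hypothetical stand-ins for the two candidate send signatures.
size_t sendVoid(const(void)[] data)   { return data.length; }
size_t sendBytes(const(ubyte)[] data) { return data.length; }

void demo()
{
    int value = 42;
    int*[] pointers = [&value];

    // void[] accepts the array of pointers silently.
    static assert(__traits(compiles, sendVoid(pointers)));

    // ubyte[] rejects it unless the caller spells out the cast.
    static assert(!__traits(compiles, sendBytes(pointers)));
    assert(sendBytes(cast(const(ubyte)[]) pointers) == (int*).sizeof);
}
```

The explicit cast does not make sending pointers any safer; it only makes the decision visible at the call site.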

- Jonathan M Davis

Sep 12 2011

Adam Burton <adz21c gmail.com> writes:

Jonathan M Davis wrote:

 On Monday, September 12, 2011 14:53 Adam Burton wrote:
 Simen Kjaeraas wrote:
 On Mon, 12 Sep 2011 23:13:29 +0200, Adam Burton <adz21c gmail.com>
 wrote:
 I quite like the idea of
 void[] representing a chunk of memory that could contain anything,
 serialized data; an array of ubytes or strings, and allow ubyte[] to
 just represent an array of ubytes (after all is serialized data an
 array of bytes
 or a block of data containing various data types crammed into it in
 some organised manner?). In the end it is just a convention I like,
 not attached
 to it or anything, and D tends to discourage working based on
 conventions anyway, I guess I am somewhat playing devil's advocate in
 this paragraph
 
 :-).

 
 I believe the reasons for not using void[] is exactly that it could
 contain anything, including pointers, which likely would not be valid
 in the other end.

 
 How does a ubyte[] prevent that? If you've serialized an int (or even a
 pointer) then ubyte[] is just as bad, ubyte[0] would seem to indicate a
 meaningful unit of data itself when it's actually just the first byte of
 an int (or pointer). void[] at least says "I don't know, I just know the
 start and how long, you figure it out, I presume I have somewhere to go
 to be given context".

 
 With void[], you can pass something like int*[] to it without having to
 worry about converting it, because the conversion is implicit. ubyte[],
 on the other hand, forces you to do the conversion explicitly. So yes, you
 could still make it so that the ubyte[] passed in contains pointers, but
 you have to do it explicitly, whereas with void[], it'll take any array
 without complaining.
 
 - Jonathan M Davis

Fair enough that's more clear, I hadn't actually thought of an array of 
pointers as I was thinking of a pointer forced into ubyte[] with other data 
types. I suppose that'll help remind people to double check what they are 
sending but if you are going to send int*[] down a socket then you're 
probably gonna put cast(ubyte[]) without looking anyway.

Sep 12 2011

Adam Burton <adz21c gmail.com> writes:

Adam Burton wrote:

 Jonathan M Davis wrote:
 
 On Monday, September 12, 2011 14:53 Adam Burton wrote:
 Simen Kjaeraas wrote:
 On Mon, 12 Sep 2011 23:13:29 +0200, Adam Burton <adz21c gmail.com>
 wrote:
 I quite like the idea of
 void[] representing a chunk of memory that could contain anything,
 serialized data; an array of ubytes or strings, and allow ubyte[] to
 just represent an array of ubytes (after all is serialized data an
 array of bytes
 or a block of data containing various data types crammed into it in
 some organised manner?). In the end it is just a convention I like,
 not attached
 to it or anything, and D tends to discourage working based on
 conventions anyway, I guess I am somewhat playing devil's advocate in
 this paragraph
 
 :-).

 
 I believe the reasons for not using void[] is exactly that it could
 contain anything, including pointers, which likely would not be valid
 in the other end.

 
 How does a ubyte[] prevent that? If you've serialized an int (or even a
 pointer) then ubyte[] is just as bad, ubyte[0] would seem to indicate a
 meaningful unit of data itself when it's actually just the first byte of
 an int (or pointer). void[] at least says "I don't know, I just know the
 start and how long, you figure it out, I presume I have somewhere to go
 to be given context".

 
 With void[], you can pass something like int*[] to it without having to
 worry about converting it, because the conversion is implicit. ubyte[],
 on the other hand, forces you to do the conversion explicitly. So yes,
 you could still make it so that the ubyte[] passed in contains pointers,
 but you have to do it explicitly, whereas with void[], it'll take any
 array without complaining.
 
 - Jonathan M Davis

 Fair enough that's more clear, I hadn't actually thought of an array of
 pointers as I was thinking of a pointer forced into ubyte[] with other
 data types. I suppose that'll help remind people to double check what they
 are sending but if you are going to send int*[] down a socket then you're
 probably gonna put cast(ubyte[]) without looking anyway.

Just a thought then: rather than using ubyte[] and casting to force someone
to check (and possibly encouraging a bad habit of automatically putting in a
cast without checking, through frustration or overconfidence), why not make
send and receive template methods that don't accept types we are unable to
determine how to handle (like pointers and classes)? Maybe even give a static
assert with an error message explaining that pointers etc. are not allowed?
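
A minimal sketch of what such a templated send could look like. The name
sendValues is hypothetical and not part of std.socket; it assumes
std.traits.hasIndirections is an acceptable test for "types we are unable to
handle" and otherwise just forwards to the existing Socket.send:

```d
import std.socket;
import std.traits : hasIndirections;

/// Hypothetical helper, not part of std.socket: sends an array of values
/// as raw bytes, refusing element types that contain pointers, class
/// references, slices, or delegates.
ptrdiff_t sendValues(T)(Socket sock, const(T)[] data)
{
    static assert(!hasIndirections!T,
        "Cannot send " ~ T.stringof ~ ": it contains pointers or " ~
        "references, which are meaningless at the other end of a socket.");

    // Reinterpreting is acceptable here because T holds no indirections.
    return sock.send(cast(const(ubyte)[]) data);
}
```

With this, passing an int[] compiles and sends the raw bytes, while passing
an int*[] stops compilation with the message above. Byte order and struct
padding remain the caller's problem.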

Sep 12 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 13.09.2011 2:44, Adam Burton wrote:
 Just a thought then: rather than using ubyte[] and casting to force someone
 to check (and possibly encouraging a bad habit of automatically putting in a
 cast without checking, through frustration or overconfidence), why not make
 send and receive template methods that don't accept types we are unable to
 determine how to handle (like pointers and classes)? Maybe even give a static
 assert with an error message explaining that pointers etc. are not allowed?

Don't forget that there is also network byte order vs host machine byte 
order. In other words everything should be (de)serialized, except plain 
bytes/chars.

There was talk of making the result of e.g. htonl a special type so that
it can be sent directly w/o a cast; dunno if it's that useful. As a safety
net, until the complementary call to ntohl, this special type is unusable
for anything except storage/copy.

Actually, now that I've recalled it, it seems like a good thing to me. Being
able to catch wrong byte order statically is nice, since it's a hard-to-track
bug (e.g. missing both of hton*/ntoh*).
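
A rough sketch of how such a type might look. NetworkOrder, toNetwork and
toHost are made-up names for illustration; the conversions use the existing
std.bitmanip functions nativeToBigEndian and bigEndianToNative rather than
htonl/ntohl themselves:

```d
import std.bitmanip : bigEndianToNative, nativeToBigEndian;

/// Hypothetical wrapper: holds a value in network (big-endian) byte order.
/// It can be copied, stored, and exposed as raw bytes for sending, but it
/// supports no arithmetic or comparison with plain integers.
struct NetworkOrder(T)
{
    private ubyte[T.sizeof] raw;

    /// Bytes ready to hand to a send call, no cast needed.
    const(ubyte)[] bytes() const { return raw[]; }
}

/// Counterpart of htonl/htons: host order in, wrapped network order out.
NetworkOrder!T toNetwork(T)(T value)
{
    NetworkOrder!T result;
    result.raw = nativeToBigEndian(value);
    return result;
}

/// Counterpart of ntohl/ntohs: the only way back to a usable value.
T toHost(T)(NetworkOrder!T value)
{
    return bigEndianToNative!T(value.raw);
}
```

Forgetting the conversion then shows up at compile time: an expression such
as toNetwork(x) + 1 does not compile, and the value has to go through toHost
before it can be used as a number again.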


-- 
Dmitry Olshansky

Sep 13 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 12 Sep 2011 22:13:47 +0300, Sean Kelly <sean invisibleduck.org>  
wrote:

 Looks much nicer than the current std.socket.  A few random comments  
 from a quick scan of the code:

 Socket.send/receive should use ubyte[], not void[] for the data.

I'd say this is debatable (e.g. File.rawWrite is templated to the same  
effect).
It can't be changed without breaking compatibility, but it could be  
possible to add overloads and deprecate the void[] versions.

 I'd like some way to avoid new objects being created during any  
 low-level socket operation I expect to do regularly.  For example,  
 socket.receiveFrom creates a new Address instance every time it's  
 called.  Perhaps I could have the option to supply an Address object to  
 be overwritten instead?

Good point. Luckily, this particular case has a simple and  
backwards-compatible fix:

https://github.com/CyberShadow/phobos/commit/2fbb7d6287ccd760f4e1a6c91acb60f05bf52ed8
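
A small sketch of how the caller side might look, assuming the
receiveFrom(buf, ref Address) overload overwrites a caller-supplied Address
of the matching family instead of allocating a new one. The function name
receiveReusesAddress is made up for this example, which uses two UDP sockets
on the loopback interface:

```d
import std.socket;

/// Hypothetical demo: receives several datagrams through one preallocated
/// Address and reports whether the same object was kept throughout.
bool receiveReusesAddress()
{
    auto receiver = new UdpSocket();
    receiver.bind(new InternetAddress("127.0.0.1", InternetAddress.PORT_ANY));
    auto sender = new UdpSocket();
    scope (exit) { sender.close(); receiver.close(); }

    Address target = receiver.localAddress;

    // Allocated once, outside the receive loop.
    Address from = new InternetAddress(InternetAddress.PORT_ANY);
    Address original = from;

    ubyte[16] buf;
    foreach (i; 0 .. 3)
    {
        sender.sendTo("ping", target);
        if (receiver.receiveFrom(buf[], from) != 4)
            return false;
    }
    // True if no replacement object was created along the way.
    return from is original;
}
```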

 That Address.name() returns a sockaddr* is kind of weird.  I'd expect it  
 to return a string?  I know that the sockaddr is generally called a  
 "name" in API parlance, but it seems a bit weird in this context.

Another oddity of the original design. Generally, we're free to rename  
methods and schedule aliases for old names for deprecation - and this  
method shouldn't have much use outside std.socket anyway. What would be a  
better name?

 Why is InternetHost an instantiable object?  It has data fields that  
 aren't initialized by any ctor, but only by calls where a hostent* is  
 passed?  And all for access to API calls which no one is supposed to use  
 anyway?  Please just make this go away :-)

I'm not sure what to do about it. It's in use by current code.

The Service and Protocol classes work in a similar manner (fields  
initialized by various methods).

 There are a number of bool parameters that should really be  
 EnumName.yes/no.

The only candidate I can spot is the Socket.blocking property. What did I  
miss?
(Address.toHostString and toAddressString are private)

 The current approach that appears to be required for connecting to a  
 remote host kind of stinks:

     Socket sock = null;
     foreach (info; getAddressInfo("www.digitalmars.com")) {
         try {
             // will throw if a socket can't be created based on info
             sock = new Socket(info);
             sock.connect(info.address);
             break;
         } catch (Exception e) {
             sock = null;
         }
     }
     if (sock is null)
         throw new Exception("unable to connect via any available method!");

It's a question of how much gruntwork the network module should abstract
away. FWIW, the situation is similar with Python:
http://docs.python.org/library/socket.html (scroll down to second "Echo  
client program" example)

I've heard opinions on IRC that std.socket should definitely not conflate  
connections with DNS lookups, thus a Socket.connect(string hostname)  
method wouldn't belong.
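
For illustration, the loop could live in a free function outside Socket,
which keeps lookup and connection separate in the class itself. This is only
a sketch: connectToHost is a hypothetical name, and it assumes
getAddressInfo and the Socket(AddressInfo) constructor behave as in the
example quoted above:

```d
import std.socket;

/// Hypothetical convenience function (not in std.socket): tries every
/// address returned by getAddressInfo and returns the first socket that
/// connects, or null if none does.
Socket connectToHost(string host, string service)
{
    AddressInfo[] candidates;
    try
    {
        candidates = getAddressInfo(host, service, SocketType.STREAM);
    }
    catch (SocketException)
    {
        return null; // name resolution failed
    }

    foreach (info; candidates)
    {
        Socket sock;
        try
        {
            sock = new Socket(info);
            sock.connect(info.address);
            return sock;
        }
        catch (SocketException)
        {
            // Do not leak the handle of a failed attempt.
            if (sock !is null)
                sock.close();
        }
    }
    return null;
}
```

A caller would then write something like
connectToHost("www.digitalmars.com", "80") and check the result for null.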

 As an aside… From your comments, I gather that you're not terribly happy  
 with certain design requirements imposed by the existing std.socket.   
 Why not create an entirely new API in std.socket2 and not worry about  
 it?  Would your design change enough to warrant doing this?

I'm not sure if I can find the time and commitment to design an entirely  
new socket API at the moment. Simply put, I tried to improve the existing  
module without breaking too much.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Sep 12 2011

"Masahiro Nakagawa" <repeatedly gmail.com> writes:

On Tue, 13 Sep 2011 04:13:47 +0900, Sean Kelly <sean invisibleduck.org>  
wrote:

 Looks much nicer than the current std.socket.  A few random comments  
 from a quick scan of the code:

[snip]
 As an aside… From your comments, I gather that you're not terribly happy
 with certain design requirements imposed by the existing std.socket.
 Why not create an entirely new API in std.socket2 and not worry about
 it?  Would your design change enough to warrant doing this?

I think we should create a new socket API (my old post on the Phobos ML was
the first step).
I will restart the rewrite as a new project.

Of course, this patch is useful for improving the current std.socket.

Sep 12 2011

David Nadlinger <see klickverbot.at> writes:

 * David Nadlinger added functionality to work around an apparent oddity
 of the Windows socket implementation (see WINSOCK_TIMEOUT_SKEW).
 Although the hack is documented, I'm a bit uncomfortable that no details
 or instructions are provided on how to reproduce the
 experiments and measurements which led to the inclusion of this hack.

Which kind of »provided details« would be interesting for you? The 
WinSock receive timeout duration seems to be off by half a second on 
all Windows boxes I and other helpful people on IRC tested (no personal 
firewall/antivirus software/… involved), and that's about it.

A test case is trivial to write, e.g. https://gist.github.com/1211819.

I tried hard to find any official information about the issue, but 
except for a few other people having stumbled across the issue, I 
couldn't really turn up anything (see e.g. 
http://us.generation-nt.com/answer/recv-timeout-so-rcvtimeo-plus-half-second-help-26653302.html).


 (There's also the question whether a language library's purpose includes
 working around apparent bugs in platforms' implementations.)

If not in the standard library, where else? Granted, the difference is 
probably only going to cause problems in unit tests (since actual 
programs shouldn't rely on the exact socket timings anyway), but pushing 
the burden of writing platform-specific workaround code to the std.socket 
users doesn't seem like a good solution to me either.

David

Sep 13 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 13 Sep 2011 18:52:02 +0300, David Nadlinger <see klickverbot.at>  
wrote:

 Which kind of »provided details« would be interesting for you?

Something like this post, thanks.

 If not in the standard library, where else? Granted, the difference is  
 probably only going to cause problems in unit tests (since actual  
 programs shouldn't rely on the exact socket timings anyway), but pushing  
 the burden of writing platform-specific workaround code to the std.socket  
 users doesn't seem like a good solution to me either.

The obvious problem with such hacks is forward-compatibility - the problem  
might be fixed in Windows 8/9/etc. and no one might notice. I guess it  
wouldn't be hard to add a unit test for this.

Then, there's the question of expectations. For example, someone porting  
their code from another language might already account for this oddity,  
which would cause timeouts to be off 500ms in the other direction. Does  
any other language's standard library do something like this?

Personally, I don't have a strong opinion one way or another, but I do  
think that if the hack is left in, it should be well-documented and its  
necessity be easily verifiable.
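
One way such a check might be sketched: time a receive that is guaranteed to
time out and compare the elapsed time with the requested one. The helper
name measureReceiveTimeout is hypothetical. Note the assumption that
setOption with a Duration goes through std.socket's own handling (including
any skew compensation), so this measures the library's behaviour, not the
raw OS call:

```d
import core.time : Duration, msecs;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.socket;

/// Hypothetical measurement helper: returns how long a blocking receive
/// actually waits when RCVTIMEO is set to `timeout` and no data arrives.
Duration measureReceiveTimeout(Duration timeout)
{
    auto pair = socketPair();
    scope (exit) foreach (s; pair) s.close();

    pair[0].setOption(SocketOptionLevel.SOCKET, SocketOption.RCVTIMEO,
                      timeout);

    ubyte[1] buf;
    auto sw = StopWatch(AutoStart.yes);
    // Nothing is ever written to pair[1], so this returns only on timeout.
    auto received = pair[0].receive(buf[]);
    sw.stop();

    assert(received == Socket.ERROR, "expected the receive to time out");
    return sw.peek();
}
```

A unit test could call this with, say, 200.msecs and assert that the result
lies within some tolerance of the requested value; a deviation of roughly
half a second on a future Windows version would then show up as a failure.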

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Sep 13 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 13.09.2011 at 18:40, Vladimir Panteleev
<vladimir thecybershadow.net> wrote:

 On Tue, 13 Sep 2011 18:52:02 +0300, David Nadlinger <see klickverbot.at>
 wrote:

 Which kind of »provided details« would be interesting for you?

 Something like this post, thanks.

 If not in the standard library, where else? Granted, the difference is
 probably only going to cause problems in unit tests (since actual
 programs shouldn't rely on the exact socket timings anyway), but
 pushing the burden of writing platform-specific workaround code to the
 std.socket users doesn't seem like a good solution to me either.

 The obvious problem with such hacks is forward-compatibility - the
 problem might be fixed in Windows 8/9/etc. and no one might notice. I
 guess it wouldn't be hard to add a unit test for this.

 Then, there's the question of expectations. For example, someone porting
 their code from another language might already account for this oddity,
 which would cause timeouts to be off 500ms in the other direction. Does
 any other language's standard library do something like this?

 Personally, I don't have a strong opinion one way or another, but I do
 think that if the hack is left in, it should be well-documented and its
 necessity be easily verifiable.

Especially if the involved call looks suspiciously low-level, a user will
often assume that it is a direct wrapper of the native API. So +1 in such
cases on good documentation. Inspired by other language documents, a
'caveats' section or other highlighting will do, because socket experts
will skip the text they think they know already.

Sep 13 2011

David Nadlinger <see klickverbot.at> writes:

On 9/14/11 5:16 AM, Marco Leise wrote:
 Especially if the involved call looks suspiciously low-level, a user
 will often assume that it is a direct wrapper of the native API. So +1
 in such cases on good documentation. Inspired by other language
 documents, a 'caveats' section or other highlighting will do, because
 socket experts will skip the text they think they know already.

Currently, it is covered in a »Note« section 
(http://d-programming-language.org/phobos/std_socket.html#setOption), 
but feel free to convert it into a big red warning or whatever; I am not 
emotionally attached to either my workaround/kludge/hack or the docs 
for it.

David

Sep 14 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 14.09.2011 at 13:01, David Nadlinger <see klickverbot.at> wrote:

 On 9/14/11 5:16 AM, Marco Leise wrote:
 Especially if the involved call looks suspiciously low-level, a user
 will often assume that it is a direct wrapper of the native API. So +1
 in such cases on good documentation. Inspired by other language
 documents, a 'caveats' section or other highlighting will do, because
 socket experts will skip the text they think they know already.

 Currently, it is covered in a »Note« section
 (http://d-programming-language.org/phobos/std_socket.html#setOption),
 but feel free to convert it into a big red warning or whatever; I am not
 emotionally attached to either my workaround/kludge/hack or the docs
 for it.

 David

Nah, that's fine. I just didn't track back the link to your documentation
to check how it looks right now, and it looks highlighted enough to me.

Sep 14 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 13 Sep 2011 19:40:54 +0300, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tue, 13 Sep 2011 18:52:02 +0300, David Nadlinger <see klickverbot.at>  
 wrote:

 Which kind of »provided details« would be interesting for you?

 Something like this post, thanks.

 If not in the standard library, where else? Granted, the difference is  
 probably only going to cause problems in unit tests (since actual  
 programs shouldn't rely on the exact socket timings anyway), but  
 pushing the burden of writing platform-specific workaround code to the  
 std.socket users doesn't seem like a good solution to me either.

 The obvious problem with such hacks is forward-compatibility - the  
 problem might be fixed in Windows 8/9/etc. and no one might notice. I  
 guess it wouldn't be hard to add a unit test for this.

 Then, there's the question of expectations. For example, someone porting  
 their code from another language might already account for this oddity,  
 which would cause timeouts to be off 500ms in the other direction. Does  
 any other language's standard library do something like this?

 Personally, I don't have a strong opinion one way or another, but I do  
 think that if the hack is left in, it should be well-documented and its  
 necessity be easily verifiable.

https://github.com/CyberShadow/phobos/commit/89feff70e2c8ae68d7efd8a2fb7edd2acb9ea765

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Sep 14 2011
