digitalmars.D - Curl wrapper
- jdrewsen (7/7) May 16 2011 Hi,
- dsimcha (14/21) May 16 2011 This looks very good and I'd definitely be in favor of including it in t...
- Jimmy Cao (2/2) May 16 2011 It looks nice. I plan to use this soon.
- Daniel Gibson (3/24) May 16 2011 Maybe it's common sense, but it doesn't hurt to have that information.
- Jonas Drewsen (9/30) May 17 2011 I included it because the libcurl handle used in a HTTP instance is not
- Andrei Alexandrescu (54/61) May 16 2011 Now that's one sight for sore eyes. Nice work. A few comments and nits:
- Vladimir Panteleev (7/8) May 16 2011 I was the one who suggested using void[]. If data on wire is ubyte[], wh...
- Jonas Drewsen (3/8) May 17 2011 I believe we decided on ubyte[] for reading and void[] for write in an
- Andrei Alexandrescu (3/8) May 17 2011 Bug in std.file.
- Vladimir Panteleev (10/19) May 17 2011 I remember suggesting a long time ago that all types that do not have
- Andrei Alexandrescu (6/23) May 17 2011 Yes. The problem is that converting a type implicitly to ubyte[] allows
- Vladimir Panteleev (7/31) May 17 2011 Ah, of course.
- Andrei Alexandrescu (4/34) May 17 2011 Yah, const(ubyte)[] is much more reasonable. We may as well add a
- Andrej Mitrovic (3/5) May 17 2011 Would this mean that e.g. you could save the state of an object to
- Andrei Alexandrescu (4/9) May 17 2011 If it embeds no pointers, sure. Otherwise, a more involved serialization...
- KennyTM~ (3/13) May 17 2011 I suppose that applies to std.mmfile too? MmFile.opSlice() currently
- Jonas Drewsen (46/110) May 17 2011 The code actually defined the Http/Ftp classes as final. But somehow it
- Andrei Alexandrescu (31/83) May 17 2011 Thanks for the response! A few more answers and comments within
- jdrewsen (19/66) May 17 2011 I guess I should just remove that comment because other defaults has
- Johannes Pfau (16/73) May 18 2011 I'm not sure if it'd be that easy. Using select to implement a blocking
- Jonas Drewsen (25/105) May 18 2011 _Doable_ - not easy :)
- Johannes Pfau (17/127) May 18 2011 I think it's not necessary to do multiple calls if the first didn't
- jdrewsen (2/139) May 18 2011
- Andrei Alexandrescu (22/27) May 18 2011 Perhaps this would be too complicated. In any case the core
- jdrewsen (34/59) May 18 2011 If byChunk is using a hidden thread to download into buffers, then how
- Andrei Alexandrescu (16/68) May 18 2011 Sorry, byChunkAsync and byLineAsync (which I wrongly denoted as
- jdrewsen (5/77) May 19 2011 It see your point. By buffering data asynchronously the reads and writes...
- Masahiro Nakagawa (12/34) May 23 2011 Good!
- jdrewsen (8/48) May 23 2011 If you have any code for inspiration or that could be used as a base I
- Masahiro Nakagawa (6/55) May 29 2011 My old design is based on Boost/Asio. After I rewrote std.socket(includi...
- Robert Jacques (93/100) May 16 2011 I think the API looks good with regard to the core functionality it
- Jonas Drewsen (56/160) May 17 2011 Please see my reply to Andrei
- Robert Jacques (28/83) May 17 2011 [snip]
- jdrewsen (3/94) May 17 2011 I think I'll go for the toString() version.
- Jacob Carlborg (4/11) May 17 2011 I like the API, seems simple.
- bls (16/23) May 23 2011 You really think this is good for your mental health ?
Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /Jonas
May 16 2011
== Quote from jdrewsen (jdrewsen nospam.com)'s articleHi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasThis looks very good and I'd definitely be in favor of including it in the next release. However, there are a few small issues: 1. In the docs for Http, I don't think you need to say "Do not use the same instance in two threads simultaneously." I think this is pretty much common sense. Same with the Ftp class. 2. A short overview of the asynch stuff at the top of the file would be nice. I'm a little confused about what it's for and when to use it instead of the regular synchronous functions. Does this allow you to process the beginning of a request while the end is still being received? 3. I don't think you need to say how all the convenience functions are implemented, i.e. "Internally this is implemented using an instance of the Http class." This is both likely to be assumed and an irrelevant implementation detail. Overall, nice work.
May 16 2011
It looks nice. I plan to use this soon. I hope it can be in phobos soon.
May 16 2011
Am 17.05.2011 02:43, schrieb dsimcha:== Quote from jdrewsen (jdrewsen nospam.com)'s articleMaybe it's common sense, but it doesn't hurt to have that information. Someone will certainly try that and ask why it gives strange results ;)Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasThis looks very good and I'd definitely be in favor of including it in the next release. However, there are a few small issues: 1. In the docs for Http, I don't think you need to say "Do not use the same instance in two threads simultaneously." I think this is pretty much common sense. Same with the Ftp class.2. A short overview of the asynch stuff at the top of the file would be nice. I'm a little confused about what it's for and when to use it instead of the regular synchronous functions. Does this allow you to process the beginning of a request while the end is still being received? 3. I don't think you need to say how all the convenience functions are implemented, i.e. "Internally this is implemented using an instance of the Http class." This is both likely to be assumed and an irrelevant implementation detail. Overall, nice work.
May 16 2011
On 17/05/11 02.43, dsimcha wrote:== Quote from jdrewsen (jdrewsen nospam.com)'s articleI included it because the libcurl handle used in a HTTP instance is not thread safe itself. What if I instantiated a shared instance of this class. Wouldn't that make the comment relevant in that access to the internal libcurl handle is not protected by mutexes?Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasThis looks very good and I'd definitely be in favor of including it in the next release. However, there are a few small issues: 1. In the docs for Http, I don't think you need to say "Do not use the same instance in two threads simultaneously." I think this is pretty much common sense. Same with the Ftp class.2. A short overview of the asynch stuff at the top of the file would be nice. I'm a little confused about what it's for and when to use it instead of the regular synchronous functions. Does this allow you to process the beginning of a request while the end is still being received?Exactly. I'll add the missing overview.3. I don't think you need to say how all the convenience functions are implemented, i.e. "Internally this is implemented using an instance of the Http class." This is both likely to be assumed and an irrelevant implementation detail.You're right.Overall, nice work.Thank you for the feedback!
May 17 2011
On 05/16/2011 04:07 PM, jdrewsen wrote:Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasNow that's one sight for sore eyes. Nice work. A few comments and nits: 1. Http can't be a class. Network resources are the poster child of expensive resources that are NOT memory and don't work well with garbage collection. RAII is the poster child of deterministic resource management. So... 2. Furthermore I see customization via delegates (onReceive etc). That should be a good proxy for customization instead of using inheritance. 3. Data on wire is ubyte[], not void[]. 4. Shouldn't onSend take a ref to its argument in case it wants to expand it? 5. Why HttpMethod and not Http.Method? 6. "Do not use the same instance of this class in two threads simultanously." -> remove. A D program isn't supposed to do that unless you offer shared methods. 7. HttpStatusLine -> Http.StatusLine. One up for encapsulation. 8. HttpMethod -> Http.Method. Ditto. 9. "this(in const(char)[] url);" -> "this(string url);" I assume you store a copy of the URL inside by calling e.g. to!string. If the client has a string, you refuse to tap into that nice information and share the string safely and efficiently, and make a copy of it for no good reason. If the client doesn't have a string, no worry, the compiler will tell her she needs to do a copy. 10. addHeader etc. Same comment. Use string whenever you want to keep a copy anyway. 11. setTimeCondition -> use core.Duration for time representation throughout. 12. AsyncHttpResult -> Http.AsyncResult :o) 13. HttpResult -> Http.Result 14. Isn't the max redirect configurable via a parameter? 15. Typo: "Callbacks is not supported..." 16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc). 17. There should be an example with login/password. 18. See onReceiveHeader">std.curl.Curl.onReceiveHeader and others look like doc macros gone wrong. 19. "chuncked" -> "chunked" 20. "Max allowed redirs. -1 for infinite." -> use uint and uint.max for infinite. 21. "short isRunning()" -> what's the semantics of the short? 22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams. 23. Ftp should have get in-memory and streaming byLine/byChunk, not only get to file. 24. The shutdown mechanism should be handled properly. Shutting down libcurl would have all pending transfers instantly throw a special exception. Without a shutdown API, applications won't be able to implement e.g. a Quit button. Again, great work! Andrei
May 16 2011
On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)? -- Best regards, Vladimir mailto:vladimir thecybershadow.net
May 16 2011
On 17/05/11 07.23, Vladimir Panteleev wrote:On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:I believe we decided on ubyte[] for reading and void[] for write in an earlier thread. And std.file should be fixed.3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)?
May 17 2011
On 5/17/11 12:23 AM, Vladimir Panteleev wrote:On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bug in std.file. Andrei3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)?
May 17 2011
On Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 5/17/11 12:23 AM, Vladimir Panteleev wrote:I remember suggesting a long time ago that all types that do not have pointers be implicitly convertible to ubyte[]. That way, write functions can safely use ubyte[] without casts (and even provide a bit of safety against accidentally writing pointers to disk or network). Are you still against the idea? -- Best regards, Vladimir mailto:vladimir thecybershadow.netOn Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bug in std.file.3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)?
May 17 2011
On 5/17/11 12:39 PM, Vladimir Panteleev wrote:On Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Yes. The problem is that converting a type implicitly to ubyte[] allows anyone to write raw bytes into an object, making a hash of any type system guarantees without a cast in sight. That is, granted, memory-safe (because there are no indirections), but is not recommendable. AndreiOn 5/17/11 12:23 AM, Vladimir Panteleev wrote:I remember suggesting a long time ago that all types that do not have pointers be implicitly convertible to ubyte[]. That way, write functions can safely use ubyte[] without casts (and even provide a bit of safety against accidentally writing pointers to disk or network). Are you still against the idea?On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bug in std.file.3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)?
May 17 2011
On Tue, 17 May 2011 20:43:11 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 5/17/11 12:39 PM, Vladimir Panteleev wrote:Ah, of course. const(ubyte)[] ? :) -- Best regards, Vladimir mailto:vladimir thecybershadow.netOn Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Yes. The problem is that converting a type implicitly to ubyte[] allows anyone to write raw bytes into an object, making a hash of any type system guarantees without a cast in sight. That is, granted, memory-safe (because there are no indirections), but is not recommendable.On 5/17/11 12:23 AM, Vladimir Panteleev wrote:I remember suggesting a long time ago that all types that do not have pointers be implicitly convertible to ubyte[]. That way, write functions can safely use ubyte[] without casts (and even provide a bit of safety against accidentally writing pointers to disk or network). Are you still against the idea?On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bug in std.file.3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)?
May 17 2011
On 5/17/11 1:00 PM, Vladimir Panteleev wrote:On Tue, 17 May 2011 20:43:11 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Yah, const(ubyte)[] is much more reasonable. We may as well add a function representation() that returns it for any object. AndreiOn 5/17/11 12:39 PM, Vladimir Panteleev wrote:Ah, of course. const(ubyte)[] ? :)On Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Yes. The problem is that converting a type implicitly to ubyte[] allows anyone to write raw bytes into an object, making a hash of any type system guarantees without a cast in sight. That is, granted, memory-safe (because there are no indirections), but is not recommendable.On 5/17/11 12:23 AM, Vladimir Panteleev wrote:I remember suggesting a long time ago that all types that do not have pointers be implicitly convertible to ubyte[]. That way, write functions can safely use ubyte[] without casts (and even provide a bit of safety against accidentally writing pointers to disk or network). Are you still against the idea?On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bug in std.file.3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)?
May 17 2011
On 5/17/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Yah, const(ubyte)[] is much more reasonable. We may as well add a function representation() that returns it for any object.Would this mean that e.g. you could save the state of an object to disk via repr(), and then retrieve the state at a later point?
May 17 2011
On 5/17/11 1:35 PM, Andrej Mitrovic wrote:On 5/17/11, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:If it embeds no pointers, sure. Otherwise, a more involved serialization mechanism must be used. AndreiYah, const(ubyte)[] is much more reasonable. We may as well add a function representation() that returns it for any object.Would this mean that e.g. you could save the state of an object to disk via repr(), and then retrieve the state at a later point?
May 17 2011
On May 17, 11 22:45, Andrei Alexandrescu wrote:On 5/17/11 12:23 AM, Vladimir Panteleev wrote:I suppose that applies to std.mmfile too? MmFile.opSlice() currently returns a void[].On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bug in std.file. Andrei3. Data on wire is ubyte[], not void[].I was the one who suggested using void[]. If data on wire is ubyte[], why is data on disk void[] (according to std.file.read/write)?
May 17 2011
On 17/05/11 06.43, Andrei Alexandrescu wrote:On 05/16/2011 04:07 PM, jdrewsen wrote:I see what you mean. Will fix.Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasNow that's one sight for sore eyes. Nice work. A few comments and nits: 1. Http can't be a class. Network resources are the poster child of expensive resources that are NOT memory and don't work well with garbage collection. RAII is the poster child of deterministic resource management. So...2. Furthermore I see customization via delegates (onReceive etc). That should be a good proxy for customization instead of using inheritance.The code actually defined the Http/Ftp classes as final. But somehow it is not reflected in the generated docs. Anyway, I will change it to struct and inheritance is not an issue anyway.3. Data on wire is ubyte[], not void[].Data received is ubyte[] and a convenience method for conversion to string is provided. Data send is void[] since all arrays implicitely converts to void[]. This makes e.g [1,2,3,4,5] works as input. Convenience methods are provided for chars in order to set HTTP headers correctly. For onSend callback it is simply void[] in order to use it as an out parameter. For Http.post/put it is const(void)[] For Http.postAsync/putAsync it is immutable(void)[] because internally I need to send the data to another thread using message passing. Hopes this makes sense?4. Shouldn't onSend take a ref to its argument in case it wants to expand it?It is not allowed to expand it. The callback originates from libcurl which determines the size of the array.5. Why HttpMethod and not Http.Method?I had the thought myself but decided on the former. By looking at the replies to my post it seems that it should be changed. I'll do that.6. "Do not use the same instance of this class in two threads simultanously." -> remove. A D program isn't supposed to do that unless you offer shared methods.Ok. So even though I declare: shared Http http = new Http(); I cannot have multiple threads call http.perform without that method being marked as shared as well? If thats the case I'll remove the comment.7. HttpStatusLine -> Http.StatusLine. One up for encapsulation. 8. HttpMethod -> Http.Method. Ditto. 9. "this(in const(char)[] url);" -> "this(string url);" I assume you store a copy of the URL inside by calling e.g. to!string. If the client has a string, you refuse to tap into that nice information and share the string safely and efficiently, and make a copy of it for no good reason. If the client doesn't have a string, no worry, the compiler will tell her she needs to do a copy.I do not store a copy but calls libcurl which in turn does a copy by itself. So I believe the current signature is appropriate?10. addHeader etc. Same comment. Use string whenever you want to keep a copy anyway.At 9. the string is copied by libcurl itself.11. setTimeCondition -> use core.Duration for time representation throughout.Will fix.12. AsyncHttpResult -> Http.AsyncResult :o) 13. HttpResult -> Http.Result 14. Isn't the max redirect configurable via a parameter?Yes. It is called Http.followLocation from libcurls followLocation option. I will rename it to maxRedirections for clearity.15. Typo: "Callbacks is not supported..."ok.16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc).Do you mean point to the relevant RFC?17. There should be an example with login/password.ok.18. See onReceiveHeader">std.curl.Curl.onReceiveHeader and others look like doc macros gone wrong. 19. "chuncked" -> "chunked" 20. "Max allowed redirs. -1 for infinite." -> use uint and uint.max for infinite.Will fix.21. "short isRunning()" -> what's the semantics of the short?This is a bug. It returns an internal bool value.22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams.byLine/byChunk is currenly only available for the async methods which is implemented using message passing. This means that for passing the message a copy has to be made because a message is by-value or immutable data. Is there another way for me to pass the message without doing a copy... some kind of move semantic?23. Ftp should have get in-memory and streaming byLine/byChunk, not only get to file.Definitely. I wanted to get feedback on HTTP API before doing the same to FTP.24. The shutdown mechanism should be handled properly. Shutting down libcurl would have all pending transfers instantly throw a special exception. Without a shutdown API, applications won't be able to implement e.g. a Quit button.If you're only using this API this should be handled. But I see that it is not documented.Again, great work!Thank you for the feedback.
May 17 2011
Thanks for the response! A few more answers and comments within (everything deleted counts as "sounds great"). On 5/17/11 3:50 AM, Jonas Drewsen wrote:I think it's best to systematically use void[] when sending data out (i.e. you start with structured data in memory and you let it implicitly "decay" as you pass it on the wire), and ubyte[] when receiving data (i.e. you start with unstructured data and force the user to explicitly build structure on it). As noted the current File API is not to be considered an example to follow.3. Data on wire is ubyte[], not void[].Data received is ubyte[] and a convenience method for conversion to string is provided. Data send is void[] since all arrays implicitely converts to void[]. This makes e.g [1,2,3,4,5] works as input. Convenience methods are provided for chars in order to set HTTP headers correctly. For onSend callback it is simply void[] in order to use it as an out parameter. For Http.post/put it is const(void)[] For Http.postAsync/putAsync it is immutable(void)[] because internally I need to send the data to another thread using message passing. Hopes this makes sense?Yes, shared limits access to only shared fields and methods.6. "Do not use the same instance of this class in two threads simultanously." -> remove. A D program isn't supposed to do that unless you offer shared methods.Ok. So even though I declare: shared Http http = new Http(); I cannot have multiple threads call http.perform without that method being marked as shared as well? If thats the case I'll remove the comment.Got it, thanks. I sense a little inefficiency here that we can't quite get rid of - the user passes a string, we must convert it to a zero-terminated char* (one allocation) and then libcurl copies it again (two allocations). Oh well.9. "this(in const(char)[] url);" -> "this(string url);" I assume you store a copy of the URL inside by calling e.g. to!string. If the client has a string, you refuse to tap into that nice information and share the string safely and efficiently, and make a copy of it for no good reason. If the client doesn't have a string, no worry, the compiler will tell her she needs to do a copy.I do not store a copy but calls libcurl which in turn does a copy by itself. So I believe the current signature is appropriate?You mention this about many higher-level functions: "Maximum redirections are set to 10." Then I'm thinking, if it's an important parameter, make it a defaulted parameter for the function.14. Isn't the max redirect configurable via a parameter?Yes. It is called Http.followLocation from libcurls followLocation option. I will rename it to maxRedirections for clearity.Yah, or a good tutorial. (I didn't know DEL existed!)16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc).Do you mean point to the relevant RFC?A library element is planned but not on the short list yet. You can work around that by internally doing a cast. As long as you observe the commitment that buffers are not accessed simultaneously by more than one thread you're well within defined behavior. One more suggestion - you may want to also provide classic boring synchronous read/write methods that take a user-provided buffer. I'm sure people could use such e.g. when they want to stream data inside their own threads, or even in the main thread if they don't mind blocking.22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams.byLine/byChunk is currenly only available for the async methods which is implemented using message passing. This means that for passing the message a copy has to be made because a message is by-value or immutable data. Is there another way for me to pass the message without doing a copy... some kind of move semantic?Yah, in the future we need to put an onShutdown protocol inside core, similar to atexit() in C. Each library that may block would plant a hook. That way an application will be able to exit quickly and gracefully even if it has blocked threads. Andrei24. The shutdown mechanism should be handled properly. Shutting down libcurl would have all pending transfers instantly throw a special exception. Without a shutdown API, applications won't be able to implement e.g. a Quit button.If you're only using this API this should be handled. But I see that it is not documented.
May 17 2011
Please see comments below. Den 17-05-2011 16:42, Andrei Alexandrescu skrev:Thanks for the response! A few more answers and comments within (everything deleted counts as "sounds great"). On 5/17/11 3:50 AM, Jonas Drewsen wrote:I guess I should just remove that comment because other defaults has been selected as well e.g. timeouts.You mention this about many higher-level functions: "Maximum redirections are set to 10." Then I'm thinking, if it's an important parameter, make it a defaulted parameter for the function.14. Isn't the max redirect configurable via a parameter?Yes. It is called Http.followLocation from libcurls followLocation option. I will rename it to maxRedirections for clearity.Well DEL is actually called DELETE in the HTTP RFC. But delete is a reserved word i D so I used del() instead. delete() wouldn't work in following code in think: with(auto http = ...) { delete(...); }Yah, or a good tutorial. (I didn't know DEL existed!)16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc).Do you mean point to the relevant RFC?ok nice to know.A library element is planned but not on the short list yet. You can work around that by internally doing a cast. As long as you observe the commitment that buffers are not accessed simultaneously by more than one thread you're well within defined behavior.22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams.byLine/byChunk is currenly only available for the async methods which is implemented using message passing. This means that for passing the message a copy has to be made because a message is by-value or immutable data. Is there another way for me to pass the message without doing a copy... some kind of move semantic?One more suggestion - you may want to also provide classic boring synchronous read/write methods that take a user-provided buffer. I'm sure people could use such e.g. when they want to stream data inside their own threads, or even in the main thread if they don't mind blocking.This would mean hooking into libcurl and selecting on its sockets. Totally doable. But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for std.net.event/reactor/proactor module. Anyone who wants to implement this read/write in the curl wrapper are ofcourse very welcome.That would be neat. /JonasYah, in the future we need to put an onShutdown protocol inside core, similar to atexit() in C. Each library that may block would plant a hook. That way an application will be able to exit quickly and gracefully even if it has blocked threads.24. The shutdown mechanism should be handled properly. Shutting down libcurl would have all pending transfers instantly throw a special exception. Without a shutdown API, applications won't be able to implement e.g. a Quit button.If you're only using this API this should be handled. But I see that it is not documented.
May 17 2011
jdrewsen wrote:Please see comments below. Den 17-05-2011 16:42, Andrei Alexandrescu skrev:I'm not sure if it'd be that easy. Using select to implement a blocking api is possible, but select is only used to wait until data is ready, reading and writing the data is still done by curl. And as curl does not allow you to supply the data buffer, having user-provided buffers is afaik impossible.Thanks for the response! A few more answers and comments within (everything deleted counts as "sounds great"). On 5/17/11 3:50 AM, Jonas Drewsen wrote:I guess I should just remove that comment because other defaults has=20 been selected as well e.g. timeouts.You mention this about many higher-level functions: "Maximum redirections are set to 10." Then I'm thinking, if it's an important parameter, make it a defaulted parameter for the function.14. Isn't the max redirect configurable via a parameter?Yes. It is called Http.followLocation from libcurls followLocation option. I will rename it to maxRedirections for clearity.Well DEL is actually called DELETE in the HTTP RFC. But delete is a=20 reserved word i D so I used del() instead. delete() wouldn't work in=20 following code in think: with(auto http =3D ...) { delete(...); }Yah, or a good tutorial. (I didn't know DEL existed!)16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc).Do you mean point to the relevant RFC?ok nice to know.A library element is planned but not on the short list yet. You can work around that by internally doing a cast. As long as you observe the commitment that buffers are not accessed simultaneously by more than one thread you're well within defined behavior.22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams.byLine/byChunk is currenly only available for the async methods which is implemented using message passing. This means that for passing the message a copy has to be made because a message is by-value or immutable data. Is there another way for me to pass the message without doing a copy... some kind of move semantic?One more suggestion - you may want to also provide classic boring synchronous read/write methods that take a user-provided buffer. I'm sure people could use such e.g. when they want to stream data inside their own threads, or even in the main thread if they don't mind blocking.This would mean hooking into libcurl and selecting on its sockets.=20 Totally doable.But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for=20 std.net.event/reactor/proactor module.Have you already started working on std.net.event? Do you plan to base that on libev / libevent? I've been trying to get familiar with libev lately, so I have bindings for livev 3.9 / 4.04 (combined, version can be selected with version statements): http://dl.dropbox.com/u/24218791/d/src/libev.html (comments in the c header have been turned into doc comments. The binding is partially based on Leandro Lucarella bindings: http://git.llucax.com.ar/?p=3Dsoftware/ev.d.git) --=20 Johannes Pfau
May 18 2011
On 18/05/11 10.09, Johannes Pfau wrote:jdrewsen wrote:_Doable_ - not easy :) Select will wait for data to be ready and ask curl to handle the data chunk. Curl in turn calls back to a registered callback handler with the data read. That handler fills the buffer provided by the user. If not enough data has been receive an new select is performed until the requested amount of data is read. Then the blocking method can return.Please see comments below. Den 17-05-2011 16:42, Andrei Alexandrescu skrev:I'm not sure if it'd be that easy. Using select to implement a blocking api is possible, but select is only used to wait until data is ready, reading and writing the data is still done by curl. And as curl does not allow you to supply the data buffer, having user-provided buffers is afaik impossible.Thanks for the response! A few more answers and comments within (everything deleted counts as "sounds great"). On 5/17/11 3:50 AM, Jonas Drewsen wrote:I guess I should just remove that comment because other defaults has been selected as well e.g. timeouts.You mention this about many higher-level functions: "Maximum redirections are set to 10." Then I'm thinking, if it's an important parameter, make it a defaulted parameter for the function.14. Isn't the max redirect configurable via a parameter?Yes. It is called Http.followLocation from libcurls followLocation option. I will rename it to maxRedirections for clearity.Well DEL is actually called DELETE in the HTTP RFC. But delete is a reserved word i D so I used del() instead. delete() wouldn't work in following code in think: with(auto http = ...) { delete(...); }Yah, or a good tutorial. (I didn't know DEL existed!)16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc).Do you mean point to the relevant RFC?ok nice to know.A library element is planned but not on the short list yet. You can work around that by internally doing a cast. As long as you observe the commitment that buffers are not accessed simultaneously by more than one thread you're well within defined behavior.22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams.byLine/byChunk is currenly only available for the async methods which is implemented using message passing. This means that for passing the message a copy has to be made because a message is by-value or immutable data. Is there another way for me to pass the message without doing a copy... some kind of move semantic?One more suggestion - you may want to also provide classic boring synchronous read/write methods that take a user-provided buffer. I'm sure people could use such e.g. when they want to stream data inside their own threads, or even in the main thread if they don't mind blocking.This would mean hooking into libcurl and selecting on its sockets. Totally doable.I'll finish the curl wrapper before starting on it. I have had a look at libev/event before and they are very nice libs. But I think I'll have to implement it from scratch for two reasons: 1, AFAIK we cannot include code in std phobos that is not boost licensed or a license as liberal. libev for example requires you to distribute the license with your software. 2, It introduces a dependency to an external project in phobos. Currently no dependencies are present. This makes phobos very easy to install and use out of the box. The optimal solution to these problems from my point of view would be to get that "much discussed" package tool in place soon (CPAN,apt alike). Heck, now I think about it maybe I should do that before any std.net stuff. Let's see how the wind blows - many interesting projects :)But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for std.net.event/reactor/proactor module.Have you already started working on std.net.event? Do you plan to base that on libev / libevent?I've been trying to get familiar with libev lately, so I have bindings for livev 3.9 / 4.04 (combined, version can be selected with version statements): http://dl.dropbox.com/u/24218791/d/src/libev.html (comments in the c header have been turned into doc comments. The binding is partially based on Leandro Lucarella bindings: http://git.llucax.com.ar/?p=software/ev.d.git)Very nice. I've also stumbled upon some other async D libs out there that could be a good starting point. It seems that Leandro has both bindings and a wrapper in place. And a wrapper is really nice to have to make it more D'ish. /Jonas
May 18 2011
Jonas Drewsen wrote:On 18/05/11 10.09, Johannes Pfau wrote:I think it's not necessary to do multiple calls if the first didn't return enough data. Most read calls are only guaranteed to return a maximum of data, less is always possible. It's a bigger problem if the user supplied buffer is smaller than the curl buffer. And you always end up copying data. So it might be better to just return a reference to the buffer allocated by curl, or just forget about this api, curl just doesn't work well this way.jdrewsen wrote:_Doable_ - not easy :) Select will wait for data to be ready and ask curl to handle the data=20 chunk. Curl in turn calls back to a registered callback handler with the data read. That handler fills the buffer provided by the user. If not enough data has been receive an new select is performed until the=20 requested amount of data is read. Then the blocking method can return.Please see comments below. Den 17-05-2011 16:42, Andrei Alexandrescu skrev:I'm not sure if it'd be that easy. Using select to implement a blocking api is possible, but select is only used to wait until data is ready, reading and writing the data is still done by curl. And as curl does not allow you to supply the data buffer, having user-provided buffers is afaik impossible.Thanks for the response! A few more answers and comments within (everything deleted counts as "sounds great"). On 5/17/11 3:50 AM, Jonas Drewsen wrote:I guess I should just remove that comment because other defaults has been selected as well e.g. timeouts.You mention this about many higher-level functions: "Maximum redirections are set to 10." Then I'm thinking, if it's an important parameter, make it a defaulted parameter for the function.14. Isn't the max redirect configurable via a parameter?Yes. It is called Http.followLocation from libcurls followLocation option. I will rename it to maxRedirections for clearity.Well DEL is actually called DELETE in the HTTP RFC. But delete is a reserved word i D so I used del() instead. delete() wouldn't work in following code in think: with(auto http =3D ...) { delete(...); }Yah, or a good tutorial. (I didn't know DEL existed!)16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc).Do you mean point to the relevant RFC?ok nice to know.A library element is planned but not on the short list yet. You can work around that by internally doing a cast. As long as you observe the commitment that buffers are not accessed simultaneously by more than one thread you're well within defined behavior.22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams.byLine/byChunk is currenly only available for the async methods which is implemented using message passing. This means that for passing the message a copy has to be made because a message is by-value or immutable data. Is there another way for me to pass the message without doing a copy... some kind of move semantic?One more suggestion - you may want to also provide classic boring synchronous read/write methods that take a user-provided buffer. I'm sure people could use such e.g. when they want to stream data inside their own threads, or even in the main thread if they don't mind blocking.This would mean hooking into libcurl and selecting on its sockets. Totally doable.That could indeed be a problem. I think libev/libevent even have the same license. Probably it's really best to implement it from scratch, but it will be quite some work. I'd still have a look at the libev documentation though, it contains some details about OS quirks. And try to avoid Linux AIO ;-)I'll finish the curl wrapper before starting on it. I have had a look at libev/event before and they are very nice libs. But I think I'll have to implement it from scratch for two reasons: 1, AFAIK we cannot include code in std phobos that is not boost licensed or a license as liberal. libev for example requires you to distribute the license with your software.But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for std.net.event/reactor/proactor module.Have you already started working on std.net.event? Do you plan to base that on libev / libevent?2, It introduces a dependency to an external project in phobos.=20 Currently no dependencies are present. This makes phobos very easy to=20 install and use out of the box. The optimal solution to these problems from my point of view would be to get that "much discussed" package tool in place soon (CPAN,apt alike). Heck, now I think about it maybe I should do that before any std.net=20 stuff. Let's see how the wind blows - many interesting projects :)Yes, I have a higher level wrapper for libev as well, but the new 'shared' stuff made it a lot more complicated to get that right.I've been trying to get familiar with libev lately, so I have bindings for livev 3.9 / 4.04 (combined, version can be selected with version statements): http://dl.dropbox.com/u/24218791/d/src/libev.html (comments in the c header have been turned into doc comments. The binding is partially based on Leandro Lucarella bindings: http://git.llucax.com.ar/?p=3Dsoftware/ev.d.git)Very nice. I've also stumbled upon some other async D libs out there=20 that could be a good starting point. It seems that Leandro has both bindings and a wrapper in place. And a=20 wrapper is really nice to have to make it more D'ish./Jonas--=20 Johannes Pfau
May 18 2011
Den 18-05-2011 14:52, Johannes Pfau skrev:Jonas Drewsen wrote:Yeah. That's part of the reason why I'm not interested in implementing it.On 18/05/11 10.09, Johannes Pfau wrote:I think it's not necessary to do multiple calls if the first didn't return enough data. Most read calls are only guaranteed to return a maximum of data, less is always possible. It's a bigger problem if the user supplied buffer is smaller than the curl buffer. And you always end up copying data. So it might be better to just return a reference to the buffer allocated by curl, or just forget about this api, curl just doesn't work well this way.jdrewsen wrote:_Doable_ - not easy :) Select will wait for data to be ready and ask curl to handle the data chunk. Curl in turn calls back to a registered callback handler with the data read. That handler fills the buffer provided by the user. If not enough data has been receive an new select is performed until the requested amount of data is read. Then the blocking method can return.Please see comments below. Den 17-05-2011 16:42, Andrei Alexandrescu skrev:I'm not sure if it'd be that easy. Using select to implement a blocking api is possible, but select is only used to wait until data is ready, reading and writing the data is still done by curl. And as curl does not allow you to supply the data buffer, having user-provided buffers is afaik impossible.Thanks for the response! A few more answers and comments within (everything deleted counts as "sounds great"). On 5/17/11 3:50 AM, Jonas Drewsen wrote:I guess I should just remove that comment because other defaults has been selected as well e.g. timeouts.You mention this about many higher-level functions: "Maximum redirections are set to 10." Then I'm thinking, if it's an important parameter, make it a defaulted parameter for the function.14. Isn't the max redirect configurable via a parameter?Yes. It is called Http.followLocation from libcurls followLocation option. I will rename it to maxRedirections for clearity.Well DEL is actually called DELETE in the HTTP RFC. But delete is a reserved word i D so I used del() instead. delete() wouldn't work in following code in think: with(auto http = ...) { delete(...); }Yah, or a good tutorial. (I didn't know DEL existed!)16. Documentation should point to descriptions of the HTTP methods wrapped (e.g. "post", "del" etc).Do you mean point to the relevant RFC?ok nice to know.A library element is planned but not on the short list yet. You can work around that by internally doing a cast. As long as you observe the commitment that buffers are not accessed simultaneously by more than one thread you're well within defined behavior.22. byLine/byChunk should not expose a string or generally data that can be shared safely by the client. That's because it would need to create a new string with each iteration, which is very inefficient. Instead, they should expose char[]/ubyte[] just like File.byLine does, and reuse the buffer with each call. Essentially byLine/byChunk should be near-zero-overhead means to transfer and inspect arbitrarily large streams.byLine/byChunk is currenly only available for the async methods which is implemented using message passing. This means that for passing the message a copy has to be made because a message is by-value or immutable data. Is there another way for me to pass the message without doing a copy... some kind of move semantic?One more suggestion - you may want to also provide classic boring synchronous read/write methods that take a user-provided buffer. I'm sure people could use such e.g. when they want to stream data inside their own threads, or even in the main thread if they don't mind blocking.This would mean hooking into libcurl and selecting on its sockets. Totally doable.That could indeed be a problem. I think libev/libevent even have the same license. Probably it's really best to implement it from scratch, but it will be quite some work. I'd still have a look at the libev documentation though, it contains some details about OS quirks. And try to avoid Linux AIO ;-)I'll finish the curl wrapper before starting on it. I have had a look at libev/event before and they are very nice libs. But I think I'll have to implement it from scratch for two reasons: 1, AFAIK we cannot include code in std phobos that is not boost licensed or a license as liberal. libev for example requires you to distribute the license with your software.But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for std.net.event/reactor/proactor module.Have you already started working on std.net.event? Do you plan to base that on libev / libevent?2, It introduces a dependency to an external project in phobos. Currently no dependencies are present. This makes phobos very easy to install and use out of the box. The optimal solution to these problems from my point of view would be to get that "much discussed" package tool in place soon (CPAN,apt alike). Heck, now I think about it maybe I should do that before any std.net stuff. Let's see how the wind blows - many interesting projects :)Yes, I have a higher level wrapper for libev as well, but the new 'shared' stuff made it a lot more complicated to get that right.I've been trying to get familiar with libev lately, so I have bindings for livev 3.9 / 4.04 (combined, version can be selected with version statements): http://dl.dropbox.com/u/24218791/d/src/libev.html (comments in the c header have been turned into doc comments. The binding is partially based on Leandro Lucarella bindings: http://git.llucax.com.ar/?p=software/ev.d.git)Very nice. I've also stumbled upon some other async D libs out there that could be a good starting point. It seems that Leandro has both bindings and a wrapper in place. And a wrapper is really nice to have to make it more D'ish./Jonas
May 18 2011
On 5/18/11 6:07 AM, Jonas Drewsen wrote:Select will wait for data to be ready and ask curl to handle the data chunk. Curl in turn calls back to a registered callback handler with the data read. That handler fills the buffer provided by the user. If not enough data has been receive an new select is performed until the requested amount of data is read. Then the blocking method can return.Perhaps this would be too complicated. In any case the core functionality must be paid top attention. And the core functionality is streaming. Currently there are two proposed ways to stream data from an HTTP address: (a) by using the onReceive callback, and (b) by using byLine/byChunk. If either of these perform slower than the best-of-the-breed streaming using libcurl, we have failed. The onReceive method is not particularly appealing because the client and libcurl block each other: the client is blocked while libcurl is waiting for data, and the client blocks libcurl while inside the callback. (Please correct me if I'm wrong.) To make byLine/byChunk fast, the basic setup should include a hidden thread that does the download in separation from the client's thread. There should be K buffers allocated (K = 2 to e.g. 10), and a simple protocol for passing the buffers back and forth between the client thread and the hidden thread. That way, in the quiescent state, there is no memory allocation and either both client and libcurl are busy doing work, or one is much slower than the other, which waits. The same mechanism should be used in byChunkAsync or byFileAsync. Thanks, Andrei
May 18 2011
Den 18-05-2011 16:53, Andrei Alexandrescu skrev:On 5/18/11 6:07 AM, Jonas Drewsen wrote:If byChunk is using a hidden thread to download into buffers, then how does it differ from the byChunkAsync that you mention? The current curl wrapper actually does the hidden thread trick (based on a hint you gave me a while ago). It does not reuse buffers because I thought that all data had to be immutable or by value to go through the message passing system. I'll fix this since it is a good place to do some type casting to allow passing the buffers for reuse. I think that we have to consider the context of the streaming before we can tell the best solution. I do not have any number to back the following up, but this is how I see it: If data that is read is going to be processed (e.g. compressed) in some way it is most likely a benefit to spawn a thread to handle the data buffering. If no processing is done (e.g. a simple copy from net to disk) I believe keeping things in the same thread and simply select on sockets (disk or net) is fastest. This way no message passing and context switching is taking place and does cause any overhead. libcurl can give you access to the file descriptors for this exact purpose but it does have some drawbacks: you are not in control of the buffers used by libcurl. This means that reading from one curl connection and sending on another you would have to copy the data. libcurl does in fact provide even simpler methods where you can provide your own buffers for read/writes. Unfortunately this is only supported for HTTP and a lot of the convenience features such as redirections are lost. The more you want to control to get the last drop of performance, the more you have to manually handle yourself. In my opinion I think that providing the performance of the standard libcurl API in the D wrapper is the way to go (as done in the current curl wrapper). Generic and efficient streaming across protocols is best done in std.net where buffers can be handled entirely in D. I know this is not a small task which is why I started out with wrapping libcurl. Thanks JonasSelect will wait for data to be ready and ask curl to handle the data chunk. Curl in turn calls back to a registered callback handler with the data read. That handler fills the buffer provided by the user. If not enough data has been receive an new select is performed until the requested amount of data is read. Then the blocking method can return.Perhaps this would be too complicated. In any case the core functionality must be paid top attention. And the core functionality is streaming. Currently there are two proposed ways to stream data from an HTTP address: (a) by using the onReceive callback, and (b) by using byLine/byChunk. If either of these perform slower than the best-of-the-breed streaming using libcurl, we have failed. The onReceive method is not particularly appealing because the client and libcurl block each other: the client is blocked while libcurl is waiting for data, and the client blocks libcurl while inside the callback. (Please correct me if I'm wrong.) To make byLine/byChunk fast, the basic setup should include a hidden thread that does the download in separation from the client's thread. There should be K buffers allocated (K = 2 to e.g. 10), and a simple protocol for passing the buffers back and forth between the client thread and the hidden thread. That way, in the quiescent state, there is no memory allocation and either both client and libcurl are busy doing work, or one is much slower than the other, which waits. The same mechanism should be used in byChunkAsync or byFileAsync.
May 18 2011
On 5/18/11 5:29 PM, jdrewsen wrote:Den 18-05-2011 16:53, Andrei Alexandrescu skrev:Sorry, byChunkAsync and byLineAsync (which I wrongly denoted as byFileAsync) would be methods File.On 5/18/11 6:07 AM, Jonas Drewsen wrote:If byChunk is using a hidden thread to download into buffers, then how does it differ from the byChunkAsync that you mention?Select will wait for data to be ready and ask curl to handle the data chunk. Curl in turn calls back to a registered callback handler with the data read. That handler fills the buffer provided by the user. If not enough data has been receive an new select is performed until the requested amount of data is read. Then the blocking method can return.Perhaps this would be too complicated. In any case the core functionality must be paid top attention. And the core functionality is streaming. Currently there are two proposed ways to stream data from an HTTP address: (a) by using the onReceive callback, and (b) by using byLine/byChunk. If either of these perform slower than the best-of-the-breed streaming using libcurl, we have failed. The onReceive method is not particularly appealing because the client and libcurl block each other: the client is blocked while libcurl is waiting for data, and the client blocks libcurl while inside the callback. (Please correct me if I'm wrong.) To make byLine/byChunk fast, the basic setup should include a hidden thread that does the download in separation from the client's thread. There should be K buffers allocated (K = 2 to e.g. 10), and a simple protocol for passing the buffers back and forth between the client thread and the hidden thread. That way, in the quiescent state, there is no memory allocation and either both client and libcurl are busy doing work, or one is much slower than the other, which waits. The same mechanism should be used in byChunkAsync or byFileAsync.The current curl wrapper actually does the hidden thread trick (based on a hint you gave me a while ago). It does not reuse buffers because I thought that all data had to be immutable or by value to go through the message passing system. I'll fix this since it is a good place to do some type casting to allow passing the buffers for reuse.Great, thanks. Don't forget there's great motivation to do so.I think that we have to consider the context of the streaming before we can tell the best solution. I do not have any number to back the following up, but this is how I see it: If data that is read is going to be processed (e.g. compressed) in some way it is most likely a benefit to spawn a thread to handle the data buffering. If no processing is done (e.g. a simple copy from net to disk) I believe keeping things in the same thread and simply select on sockets (disk or net) is fastest.Not at all. If operating with the network and operating with the disk block each other, you're guaranteed to be slower than the slowest of them. Consider that disk speed is V1 MB/s and network speed is V2 MB/s, and that they're independent of each other. If you do one thing at a time, you need to take 1/V1 + 1/V2 seconds to transfer one MB. The speed of the process is therefore 1/(1/V1 + 1/V2) = V1 * V2 / (V1 + V2). If the two devices have comparable speeds, you're halving the speed. As soon as you do buffering with two threads you can easily reach close to the minimum of the two speeds, which is the theoretical best.In my opinion I think that providing the performance of the standard libcurl API in the D wrapper is the way to go (as done in the current curl wrapper). Generic and efficient streaming across protocols is best done in std.net where buffers can be handled entirely in D. I know this is not a small task which is why I started out with wrapping libcurl.Sounds reasonable, although if you take care of recycling the buffers in the implementation you have, your wrapper may as well be the best of breed. Andrei
May 18 2011
Den 19-05-2011 00:54, Andrei Alexandrescu skrev:On 5/18/11 5:29 PM, jdrewsen wrote:It see your point. By buffering data asynchronously the reads and writes don't block each other and this increases performance. The thing is that the OS already does buffering for us. So while we're writing data to disk the OS is buffering incoming data from the network asynchronously.Den 18-05-2011 16:53, Andrei Alexandrescu skrev:Sorry, byChunkAsync and byLineAsync (which I wrongly denoted as byFileAsync) would be methods File.On 5/18/11 6:07 AM, Jonas Drewsen wrote:If byChunk is using a hidden thread to download into buffers, then how does it differ from the byChunkAsync that you mention?Select will wait for data to be ready and ask curl to handle the data chunk. Curl in turn calls back to a registered callback handler with the data read. That handler fills the buffer provided by the user. If not enough data has been receive an new select is performed until the requested amount of data is read. Then the blocking method can return.Perhaps this would be too complicated. In any case the core functionality must be paid top attention. And the core functionality is streaming. Currently there are two proposed ways to stream data from an HTTP address: (a) by using the onReceive callback, and (b) by using byLine/byChunk. If either of these perform slower than the best-of-the-breed streaming using libcurl, we have failed. The onReceive method is not particularly appealing because the client and libcurl block each other: the client is blocked while libcurl is waiting for data, and the client blocks libcurl while inside the callback. (Please correct me if I'm wrong.) To make byLine/byChunk fast, the basic setup should include a hidden thread that does the download in separation from the client's thread. There should be K buffers allocated (K = 2 to e.g. 10), and a simple protocol for passing the buffers back and forth between the client thread and the hidden thread. That way, in the quiescent state, there is no memory allocation and either both client and libcurl are busy doing work, or one is much slower than the other, which waits. The same mechanism should be used in byChunkAsync or byFileAsync.The current curl wrapper actually does the hidden thread trick (based on a hint you gave me a while ago). It does not reuse buffers because I thought that all data had to be immutable or by value to go through the message passing system. I'll fix this since it is a good place to do some type casting to allow passing the buffers for reuse.Great, thanks. Don't forget there's great motivation to do so.I think that we have to consider the context of the streaming before we can tell the best solution. I do not have any number to back the following up, but this is how I see it: If data that is read is going to be processed (e.g. compressed) in some way it is most likely a benefit to spawn a thread to handle the data buffering. If no processing is done (e.g. a simple copy from net to disk) I believe keeping things in the same thread and simply select on sockets (disk or net) is fastest.Not at all. If operating with the network and operating with the disk block each other, you're guaranteed to be slower than the slowest of them. Consider that disk speed is V1 MB/s and network speed is V2 MB/s, and that they're independent of each other. If you do one thing at a time, you need to take 1/V1 + 1/V2 seconds to transfer one MB. The speed of the process is therefore 1/(1/V1 + 1/V2) = V1 * V2 / (V1 + V2). If the two devices have comparable speeds, you're halving the speed. As soon as you do buffering with two threads you can easily reach close to the minimum of the two speeds, which is the theoretical best.In my opinion I think that providing the performance of the standard libcurl API in the D wrapper is the way to go (as done in the current curl wrapper). Generic and efficient streaming across protocols is best done in std.net where buffers can be handled entirely in D. I know this is not a small task which is why I started out with wrapping libcurl.Sounds reasonable, although if you take care of recycling the buffers in the implementation you have, your wrapper may as well be the best of breed. Andrei
May 19 2011
On Wed, 18 May 2011 20:07:01 +0900, Jonas Drewsen <jdrewsen nospam.com> wrote:On 18/05/11 10.09, Johannes Pfau wrote:Good! Last year, I tried to implement std.event. In Windows, using IOCP with Overlapped-IO for proactor. Others, IO multiplexing with worker-thread for proactor emulation. But Johannes already started implementing event module using libev, so I deleted this task from my tasks. My pending projects need event module. I expect your efforts! Masahiro P.S. cURL wrapper is nice. I wait this :)jdrewsen wrote:I'll finish the curl wrapper before starting on it. I have had a look at libev/event before and they are very nice libs. But I think I'll have to implement it from scratch for two reasons: 1, AFAIK we cannot include code in std phobos that is not boost licensed or a license as liberal. libev for example requires you to distribute the license with your software. 2, It introduces a dependency to an external project in phobos. Currently no dependencies are present. This makes phobos very easy to install and use out of the box. The optimal solution to these problems from my point of view would be to get that "much discussed" package tool in place soon (CPAN,apt alike). Heck, now I think about it maybe I should do that before any std.net stuff. Let's see how the wind blows - many interesting projects :)But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for std.net.event/reactor/proactor module.Have you already started working on std.net.event? Do you plan to base that on libev / libevent?
May 23 2011
Den 23-05-2011 21:02, Masahiro Nakagawa skrev:On Wed, 18 May 2011 20:07:01 +0900, Jonas Drewsen <jdrewsen nospam.com> wrote:If you have any code for inspiration or that could be used as a base I would very much like to see/use it. Especially if it is under Boost license. I would like to create reactor initially and proceed to build the proactor on top of that (not on top for windows and IOCP of course). Reactor and proactor designs each have their forces and therefore I do not think targeting a proactor only design is the way to go. /JonasOn 18/05/11 10.09, Johannes Pfau wrote:Good! Last year, I tried to implement std.event. In Windows, using IOCP with Overlapped-IO for proactor. Others, IO multiplexing with worker-thread for proactor emulation. But Johannes already started implementing event module using libev, so I deleted this task from my tasks.jdrewsen wrote:I'll finish the curl wrapper before starting on it. I have had a look at libev/event before and they are very nice libs. But I think I'll have to implement it from scratch for two reasons: 1, AFAIK we cannot include code in std phobos that is not boost licensed or a license as liberal. libev for example requires you to distribute the license with your software. 2, It introduces a dependency to an external project in phobos. Currently no dependencies are present. This makes phobos very easy to install and use out of the box. The optimal solution to these problems from my point of view would be to get that "much discussed" package tool in place soon (CPAN,apt alike). Heck, now I think about it maybe I should do that before any std.net stuff. Let's see how the wind blows - many interesting projects :)But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for std.net.event/reactor/proactor module.Have you already started working on std.net.event? Do you plan to base that on libev / libevent?
May 23 2011
On Tue, 24 May 2011 04:50:10 +0900, jdrewsen <jdrewsen nospam.com> wrote:Den 23-05-2011 21:02, Masahiro Nakagawa skrev:My old design is based on Boost/Asio. After I rewrote std.socket(including other utilities such as DNS, IPAddresses), I stopped implementing task.On Wed, 18 May 2011 20:07:01 +0900, Jonas Drewsen <jdrewsen nospam.com> wrote:If you have any code for inspiration or that could be used as a base I would very much like to see/use it. Especially if it is under Boost license.On 18/05/11 10.09, Johannes Pfau wrote:Good! Last year, I tried to implement std.event. In Windows, using IOCP with Overlapped-IO for proactor. Others, IO multiplexing with worker-thread for proactor emulation. But Johannes already started implementing event module using libev, so I deleted this task from my tasks.jdrewsen wrote:I'll finish the curl wrapper before starting on it. I have had a look at libev/event before and they are very nice libs. But I think I'll have to implement it from scratch for two reasons: 1, AFAIK we cannot include code in std phobos that is not boost licensed or a license as liberal. libev for example requires you to distribute the license with your software. 2, It introduces a dependency to an external project in phobos. Currently no dependencies are present. This makes phobos very easy to install and use out of the box. The optimal solution to these problems from my point of view would be to get that "much discussed" package tool in place soon (CPAN,apt alike). Heck, now I think about it maybe I should do that before any std.net stuff. Let's see how the wind blows - many interesting projects :)But I would rather stop here feature wise and wrap this thing up. I would like to focus my efforts on a candidate for std.net.event/reactor/proactor module.Have you already started working on std.net.event? Do you plan to base that on libev / libevent?I would like to create reactor initially and proceed to build the proactor on top of that (not on top for windows and IOCP of course). Reactor and proactor designs each have their forces and therefore I do not think targeting a proactor only design is the way to go.I agree. General library should provides two approaches. My main purpose is high-performance RPC. In such cases, Proactor is better(Java's Netty provides good abstraction APIs).
May 29 2011
Is there any reason why this code (using SMTP + Curl): https://gist.github.com/998214 segfaults on Windows but not on Fedora 15? I found an easy fix, but it looks like a bug somewhere that I can't figure out.
May 29 2011
Den 30-05-2011 01:16, Jimmy Cao skrev:Is there any reason why this code (using SMTP + Curl): https://gist.github.com/998214 segfaults on Windows but not on Fedora 15? I found an easy fix, but it looks like a bug somewhere that I can't figure out.etc.curl is currently a moving target. The last version I pushed to github is outdated. Currently the focus is on getting http and ftp working and the other protocol hasn't been tested. Out of curiosity I would very much like to known what your easy fix was? /Jonas
May 30 2011
On Mon, May 30, 2011 at 12:38 PM, jdrewsen <jdrewsen nospam.com> wrote:Den 30-05-2011 01:16, Jimmy Cao skrev: Is there any reason why this code (using SMTP + Curl):Uncomment line 19 and it works.https://gist.github.com/998214 segfaults on Windows but not on Fedora 15? I found an easy fix, but it looks like a bug somewhere that I can't figure out.etc.curl is currently a moving target. The last version I pushed to github is outdated. Currently the focus is on getting http and ftp working and the other protocol hasn't been tested. Out of curiosity I would very much like to known what your easy fix was? /Jonas
May 30 2011
On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com> wrote:Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasI think the API looks good with regard to the core functionality it provides, but the documentation is sub-par, an certain structs should be defined inside the Http class. I've listed a bunch of comments I had while reading the documentation below. One minor question regarding the actual implementation: do you define a pragma(lib, ...) to ease setup/tool chain pains, or do you intend something different? Regarding the First example: * Use string instead of const(char)[] * Don't waste vertical space with empty comment lines like this // // This comment style // * Comment lines should be grammatically correct, i.e. '// GET using async range' should be '// GET using an asynchronous range' or '// Perform a GET using an asynchronous range', etc * Don't use 'l' as a variable. It looks like a 1. Also, using 'line' would make the meaning clearer * Example code should not exceed 80 chars in width. People's browser setups vary, and wrapped lines are bad. * You can drop the 'delegate size_t' from the PUT example by making the return 0u, 0U or size_t.init. * The last example is missing a description. (Also, it goes over 80 chars) The license section should include a mention of libcurl's license, etc in addition to the wrapper's license. In general, for non-Boost libraries like this, should we include the license/copyright as a string/tuple enum? (i.e. for use in about boxes / command line options) Should said license/copyright be register to a global repository upon module import? (i.e. for auto-magical about boxes, etc) 'Do not use the same instance of this class in two threads simultanously.' First, you need to spell check all your documentation (i.e. simultaneously vs simultanously). Second, this should be assumed by most programmers to be true of all non-shared classes. Third, if you do want to leave this line in, use something short and sweet, like 'Http is not thread safe', instead. You need to look at/think about the order in which things are declared. Having things like HttpMethod declared long after Http doesn't make any sense. Also, it looks like you're not using '///ditto' for HttpMethod. On that note, why is HttpMethod a free-standing enum and not declared inside http? Http.Method, Http.Result, etc. make a lot more sense. The setTimeCondition documentation seems a bit confusing. I'd recommend stream-lining the convenience function documentation. For example, http.head only requires a /// Returns: An HttpResult containing the url's headers. Alternatively, you could use ditto and merge the documentation and examples for head/get/etc into a single block. Also, headAsync seems to document a data argument, but not take one, while post has an undocumented parameter, etc. In general, you might want to consider removing some of the redundant/self-obvious parameter documentation blocks. On second thought, use ditto, slim down the docs and move this set of functions either to the beginning or end of Http. It would make sense to put these right after the definition of Http.Method/method. You also might want to consider grouping all the xxxAsync functions together, instead of interleaving them. Check the grammar in http.postData http.onReceiveHeader has issues with it's 'See' link. It's example is too wide. You don't need really need the parameter documentation, particularly if you improve the example, i.e.: string[string] headerInformation; with(auto http = new Http("http://www.google.com")) { onReceiveHeader = (string key, string value) { if ( value.length <= 10 ) headerInformation[key.idup] = value.idup; }; onReceive = (ubyte[] data) { }; perform; } Also, an tab/indent should be 4 spaces not 5 and no example should start indented. Remember to comment like this: /** My regular comment * Example: --- // Example code --- * Other Stuff: */ You need to fix/complete onSend's documentation. contentLength(size_t len); => contentLength(size_t length); 'Perform http request' => 'Perform an http request' AuthMethod/CurlAuth need to be part of docs. Why is setAuthenticationMethod not authenticationMethod or authentication? (And why is AuthMethod not AuthenticationMethod?) Similarly, why is setTimeCondition not timeCondition and CurlTimeCond not Http.TimeCondition? Why is followLocation(int maxRedirs) not named maxRedirections? HttpResult.text should have huge warning flags with regard to use and/or have its design re-thought. Is there some reason you're not at least caching the result and throwing errors on invalid uses of content, etc. Calling text multiple times should be okay. And calling content after text should be either logically valid, or throw. AsyncHttpResult.byLine/byChunk examples is improperly indented and too wide. Why is HttpStatusLine not Http.Status?
May 16 2011
On 17/05/11 08.03, Robert Jacques wrote:On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com> wrote:I define a pragma yes.Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasI think the API looks good with regard to the core functionality it provides, but the documentation is sub-par, an certain structs should be defined inside the Http class. I've listed a bunch of comments I had while reading the documentation below. One minor question regarding the actual implementation: do you define a pragma(lib, ...) to ease setup/tool chain pains, or do you intend something different?Regarding the First example: * Use string instead of const(char)[]Please see my reply to Andrei* Don't waste vertical space with empty comment lines like thisok// // This comment style // * Comment lines should be grammatically correct, i.e. '// GET using async range' should be '// GET using an asynchronous range' or '// Perform a GET using an asynchronous range', etcok* Don't use 'l' as a variable. It looks like a 1. Also, using 'line' would make the meaning clearerok* Example code should not exceed 80 chars in width. People's browser setups vary, and wrapped lines are bad.ok* You can drop the 'delegate size_t' from the PUT example by making the return 0u, 0U or size_t.init.nice. will do.* The last example is missing a description. (Also, it goes over 80 chars)okThe license section should include a mention of libcurl's license, etc in addition to the wrapper's license. In general, for non-Boost libraries like this, should we include the license/copyright as a string/tuple enum? (i.e. for use in about boxes / command line options) Should said license/copyright be register to a global repository upon module import? (i.e. for auto-magical about boxes, etc)I'll mention what license libcurl is under. Until a standard for a special string/tuple has been decided upon I'll leave this out.'Do not use the same instance of this class in two threads simultanously.' First, you need to spell check all your documentation (i.e. simultaneously vs simultanously). Second, this should be assumed by most programmers to be true of all non-shared classes. Third, if you do want to leave this line in, use something short and sweet, like 'Http is not thread safe', instead.I'll do a spell check. See reply to Andrei on the other issue.You need to look at/think about the order in which things are declared. Having things like HttpMethod declared long after Http doesn't make any sense. Also, it looks like you're not using '///ditto' for HttpMethod.I don't know what ditto does. I'll try to look at the ddoc documentation.On that note, why is HttpMethod a free-standing enum and not declared inside http? Http.Method, Http.Result, etc. make a lot more sense.See reply to Andrei.The setTimeCondition documentation seems a bit confusing.I'll try to improve.I'd recommend stream-lining the convenience function documentation. For example, http.head only requires a /// Returns: An HttpResult containing the url's headers. Alternatively, you could use ditto and merge the documentation and examples for head/get/etc into a single block. Also, headAsync seems to document a data argument, but not take one, while post has an undocumented parameter, etc. In general, you might want to consider removing some of the redundant/self-obvious parameter documentation blocks. On second thought, use ditto, slim down the docs and move this set of functions either to the beginning or end of Http. It would make sense to put these right after the definition of Http.Method/method. You also might want to consider grouping all the xxxAsync functions together, instead of interleaving them.As mentioned I need to figure out what ditto does. headAsync comment does not match its signature. I'll fix. I've noticed that trying out examples live in the docs may be activated at some point. If that is the case I think it is nice to have running examples of each method.Check the grammar in http.postDataokhttp.onReceiveHeader has issues with it's 'See' link. It's example is too wide. You don't need really need the parameter documentation, particularly if you improve the example, i.e.: string[string] headerInformation; with(auto http = new Http("http://www.google.com")) { onReceiveHeader = (string key, string value) { if ( value.length <= 10 ) headerInformation[key.idup] = value.idup; }; onReceive = (ubyte[] data) { }; perform; }I see your point. The question is how we would like phobos docs to be: 1, consistent with all parameters documented for each method and each method explained and exemplified 2, minimal with examples and explanations where absolutely needed. or somewhere in between. The current style if the curl wrapper is closer to 1 than 2. Anything official about this?Also, an tab/indent should be 4 spaces not 5 and no example should start indented. Remember to comment like this: /** My regular comment * Example: --- // Example code --- * Other Stuff: */okYou need to fix/complete onSend's documentation.Will give it a lookcontentLength(size_t len); => contentLength(size_t length); 'Perform http request' => 'Perform an http request'okAuthMethod/CurlAuth need to be part of docs.I guess it I should alias the enum in etc.curl directlyWhy is setAuthenticationMethod not authenticationMethod or authentication? (And why is AuthMethod not AuthenticationMethod?) Similarly, why is setTimeCondition not timeCondition and CurlTimeCond not Http.TimeCondition?authenticationMethod will be made into a property. AuthMethod because that's what it is called in libcurl. I like the short form better and have done the same for AsyncHttpResult (not AsynchronousHttpResult). setTimeCondition cannot be made into a property since it takes two arguments. CurlTimeCond is not calld Http.TimeCondition because it comes from a etc.c.curl. I could alias it in Http ofcourse.Why is followLocation(int maxRedirs) not named maxRedirections?This is what it is called in libcurl. I think I'll rename it to maxRedirections anyway since it seems to confuse.HttpResult.text should have huge warning flags with regard to use and/or have its design re-thought. Is there some reason you're not at least caching the result and throwing errors on invalid uses of content, etc. Calling text multiple times should be okay. And calling content after text should be either logically valid, or throw.It is mostly based on storage concerns. Initially the data is stored as ubyte[] and can be returned by content(). When calling text() to get a string it will be decoded and returned. I now have a coule of options: 1, to store the string in the result for future calls to text(). I can do that and keep the original data for future calls to content(). This will double the amount of memory needed. 2, generated the string on each call to text(). This way no copy of the data is kept in HttpResult. 3, generated and store the string and then throw away the original data. Again no copy is kept in HttpResult. I've chosen the 2. version since I guessed that once you have called text() it is more likely that you'll call text again and not content().AsyncHttpResult.byLine/byChunk examples is improperly indented and too wide.okWhy is HttpStatusLine not Http.Status?Thank you for your feedback!
May 17 2011
On Tue, 17 May 2011 07:24:16 -0400, Jonas Drewsen <jdrewsen nospam.com> wrote:On 17/05/11 08.03, Robert Jacques wrote:[snip]On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com> wrote:Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasI don't know what ditto does. I'll try to look at the ddoc documentation."If a documentation comment for a declaration consists only of the identifier ditto then the documentation comment for the previous declaration at the same declaration scope is applied to this declaration as well." [snip]Official? I don't know. But looking over std.algorithm, Param: is only used 3 times, for each of the find variants, in order to clarify what haystack and needle are and in one case make the limitations/requirements on haystack/needle explicit. Looking over the rest of phobos, I see a mix of styles, depending on the author, and some things (like datetime) seem to do both. [snip]http.onReceiveHeader has issues with it's 'See' link. It's example is too wide. You don't need really need the parameter documentation, particularly if you improve the example, i.e.: string[string] headerInformation; with(auto http = new Http("http://www.google.com")) { onReceiveHeader = (string key, string value) { if ( value.length <= 10 ) headerInformation[key.idup] = value.idup; }; onReceive = (ubyte[] data) { }; perform; }I see your point. The question is how we would like phobos docs to be: 1, consistent with all parameters documented for each method and each method explained and exemplified 2, minimal with examples and explanations where absolutely needed. or somewhere in between. The current style if the curl wrapper is closer to 1 than 2. Anything official about this?CurlTimeCond is not calld Http.TimeCondition because it comes from a etc.c.curl. I could alias it in Http ofcourse.I think where it's defined is less of an issue than where the documentation is. [snip]Hmm... I take it that content and text are going to be large enough, and result long lived enough to warrant not having the array cached inside result. In that case you might want to rename text to toString or decode. 'text' feels like it should be cheaply reusable, while 'toString'/'decode' doesn't. (this is a verb/noun thing) Also, 'toString' tends to compose better with the rest of D while 'decode' is similar to the std.utf functions. Also, if memory re-use is a concern, a 'toStringInPlace' or 'decodeInPlace' with the old behavior might be appropriate, although I might recommend setting content to the utf-8 string and updating the encodingSchemeName as appropriate.HttpResult.text should have huge warning flags with regard to use and/or have its design re-thought. Is there some reason you're not at least caching the result and throwing errors on invalid uses of content, etc. Calling text multiple times should be okay. And calling content after text should be either logically valid, or throw.It is mostly based on storage concerns. Initially the data is stored as ubyte[] and can be returned by content(). When calling text() to get a string it will be decoded and returned. I now have a coule of options: 1, to store the string in the result for future calls to text(). I can do that and keep the original data for future calls to content(). This will double the amount of memory needed. 2, generated the string on each call to text(). This way no copy of the data is kept in HttpResult. 3, generated and store the string and then throw away the original data. Again no copy is kept in HttpResult. I've chosen the 2. version since I guessed that once you have called text() it is more likely that you'll call text again and not content().
May 17 2011
Den 17-05-2011 16:57, Robert Jacques skrev:On Tue, 17 May 2011 07:24:16 -0400, Jonas Drewsen <jdrewsen nospam.com> wrote:I think I'll go for the toString() version. /JonasOn 17/05/11 08.03, Robert Jacques wrote:[snip]On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com> wrote:Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasI don't know what ditto does. I'll try to look at the ddoc documentation."If a documentation comment for a declaration consists only of the identifier ditto then the documentation comment for the previous declaration at the same declaration scope is applied to this declaration as well." [snip]Official? I don't know. But looking over std.algorithm, Param: is only used 3 times, for each of the find variants, in order to clarify what haystack and needle are and in one case make the limitations/requirements on haystack/needle explicit. Looking over the rest of phobos, I see a mix of styles, depending on the author, and some things (like datetime) seem to do both. [snip]http.onReceiveHeader has issues with it's 'See' link. It's example is too wide. You don't need really need the parameter documentation, particularly if you improve the example, i.e.: string[string] headerInformation; with(auto http = new Http("http://www.google.com")) { onReceiveHeader = (string key, string value) { if ( value.length <= 10 ) headerInformation[key.idup] = value.idup; }; onReceive = (ubyte[] data) { }; perform; }I see your point. The question is how we would like phobos docs to be: 1, consistent with all parameters documented for each method and each method explained and exemplified 2, minimal with examples and explanations where absolutely needed. or somewhere in between. The current style if the curl wrapper is closer to 1 than 2. Anything official about this?CurlTimeCond is not calld Http.TimeCondition because it comes from a etc.c.curl. I could alias it in Http ofcourse.I think where it's defined is less of an issue than where the documentation is. [snip]Hmm... I take it that content and text are going to be large enough, and result long lived enough to warrant not having the array cached inside result. In that case you might want to rename text to toString or decode. 'text' feels like it should be cheaply reusable, while 'toString'/'decode' doesn't. (this is a verb/noun thing) Also, 'toString' tends to compose better with the rest of D while 'decode' is similar to the std.utf functions. Also, if memory re-use is a concern, a 'toStringInPlace' or 'decodeInPlace' with the old behavior might be appropriate, although I might recommend setting content to the utf-8 string and updating the encodingSchemeName as appropriate.HttpResult.text should have huge warning flags with regard to use and/or have its design re-thought. Is there some reason you're not at least caching the result and throwing errors on invalid uses of content, etc. Calling text multiple times should be okay. And calling content after text should be either logically valid, or throw.It is mostly based on storage concerns. Initially the data is stored as ubyte[] and can be returned by content(). When calling text() to get a string it will be decoded and returned. I now have a coule of options: 1, to store the string in the result for future calls to text(). I can do that and keep the original data for future calls to content(). This will double the amount of memory needed. 2, generated the string on each call to text(). This way no copy of the data is kept in HttpResult. 3, generated and store the string and then throw away the original data. Again no copy is kept in HttpResult. I've chosen the 2. version since I guessed that once you have called text() it is more likely that you'll call text again and not content().
May 17 2011
On 2011-05-16 23:07, jdrewsen wrote:Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasI like the API, seems simple. -- /Jacob Carlborg
May 17 2011
Am 16.05.2011 23:07, schrieb jdrewsen:Hi, I've been working on a wrapper for the etc.c.curl module. It is now pretty stable and I would very much like some feedback on the API. http://freeze.steamwinter.com/D/web/phobos/etc_curl.html BTW I use template mixins which doesn't seem to get included in the generated docs. Is there any way I can make this work? /JonasYou really think this is good for your mental health ? // PUT with data senders string msg = "Hello world"; -------------------------------- http.onSend = (void[] data) { if (msg.empty) return 0; auto m = cast(void[])msg; typeof(size_t) len = m.length; data[0..len] = m[0..$]; msg.length = 0; return len; }; Remarkable stuff, indeed ! I am not even sure that this thingy compiles , but fuc da duc, at least it shows what you CAN do in D. :)
May 23 2011