digitalmars.D.learn - Download file via http
- Kai Meyer (90/90) Dec 13 2011 I've been trying to modify the htmlget.d example for std.socketstream
- Vladimir Panteleev (7/14) Dec 13 2011 In an HTTP request, the headers are separated from the body by an
- Kai Meyer (12/25) Dec 13 2011 http://www.d-programming-language.org/phobos/std_stream.html
- Regan Heath (14/40) Dec 13 2011 I would have expected what you're doing to work.
- Kai Meyer (9/51) Dec 13 2011 Doing a read(out ubyte) read a single byte from the stream, and allowed
- Bystroushaak (4/98) Dec 18 2011 I've created HTTP client module. It's just http module, no cookies, no
I've been trying to modify the htmlget.d example for std.socketstream (http://www.d-programming-language.org/phobos/std_socketstream.html) to be able to download a file. My code ends up looking like this at the end:

auto outfile = new std.stream.File(destination, FileMode.Out);
outfile.copyFrom(ss, bytes_needed);

I get bytes_needed from the Content-Length header. I get the correct number of bytes from the Content-Length, so bytes_needed gets the right value, but the resulting file isn't right. The file has the right number of bytes, but I appear to have an extra '0a' at the very beginning of the file. If I do 'ss.getchar()' to get rid of it, I get an exception that there's not enough data in the stream.

Here's the output from hexdump that I'm basing my analysis on. Sorry if it doesn't come through 100% formatted correctly.

[kai server _source]$ hexdump -C correct_file.exe | head
00000000 4d 5a 60 00 01 00 00 00 04 00 10 00 ff ff 00 00 |MZ`.............|
00000010 fe 00 00 00 12 00 00 00 40 00 00 00 00 00 00 00 |........@.......|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 60 00 00 00 |............`...|
00000040 52 65 71 75 69 72 65 73 20 57 69 6e 33 32 20 20 |Requires Win32  |
00000050 20 24 16 1f 33 d2 b4 09 cd 21 b8 01 4c cd 21 00 | $..3....!..L.!.|
00000060 50 45 00 00 4c 01 06 00 00 00 00 00 00 00 00 00 |PE..L...........|
00000070 00 00 00 00 e0 00 8e 81 0b 01 08 00 00 7e 28 00 |.............~(.|
00000080 00 02 00 00 00 00 00 00 8c d7 27 00 00 20 00 00 |..........'.. ..|
00000090 00 a0 28 00 00 00 40 00 00 10 00 00 00 02 00 00 |..(...@.........|

[kai server _source]$ hexdump -C downloaded_file.exe | head
00000000 0a 4d 5a 60 00 01 00 00 00 04 00 10 00 ff ff 00 |.MZ`............|
00000010 00 fe 00 00 00 12 00 00 00 40 00 00 00 00 00 00 |.........@......|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 60 00 00 |.............`..|
00000040 00 52 65 71 75 69 72 65 73 20 57 69 6e 33 32 20 |.Requires Win32 |
00000050 20 20 24 16 1f 33 d2 b4 09 cd 21 b8 01 4c cd 21 |  $..3....!..L.!|
00000060 00 50 45 00 00 4c 01 06 00 00 00 00 00 00 00 00 |.PE..L..........|
00000070 00 00 00 00 00 e0 00 8e 81 0b 01 08 00 00 7e 28 |..............~(|
00000080 00 00 02 00 00 00 00 00 00 8c d7 27 00 00 20 00 |...........'.. .|
00000090 00 00 a0 28 00 00 00 40 00 00 10 00 00 00 02 00 |...(...@........|

[kai server _source]$ hexdump -C correct_file.exe | tail
002b5c10 80 30 84 30 88 30 8c 30 90 30 94 30 98 30 9c 30 |.0.0.0.0.0.0.0.0|
002b5c20 a0 30 a4 30 a8 30 ac 30 b0 30 b4 30 b8 30 bc 30 |.0.0.0.0.0.0.0.0|
002b5c30 c0 30 c4 30 c8 30 cc 30 d0 30 d4 30 d8 30 dc 30 |.0.0.0.0.0.0.0.0|
002b5c40 f4 30 f8 30 fc 30 00 31 64 31 68 31 6c 31 70 31 |.0.0.0.1d1h1l1p1|
002b5c50 74 31 38 37 00 00 00 00 00 00 00 00 00 00 00 00 |t187............|
002b5c60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
002b5e00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 |................|
002b5e10 00 00 00 00 00 00 00 00 00 00 00 00 |............|
002b5e1c

[kai server _source]$ hexdump -C downloaded_file.exe | tail
002b5c10 30 80 30 84 30 88 30 8c 30 90 30 94 30 98 30 9c |0.0.0.0.0.0.0.0.|
002b5c20 30 a0 30 a4 30 a8 30 ac 30 b0 30 b4 30 b8 30 bc |0.0.0.0.0.0.0.0.|
002b5c30 30 c0 30 c4 30 c8 30 cc 30 d0 30 d4 30 d8 30 dc |0.0.0.0.0.0.0.0.|
002b5c40 30 f4 30 f8 30 fc 30 00 31 64 31 68 31 6c 31 70 |0.0.0.0.1d1h1l1p|
002b5c50 31 74 31 38 37 00 00 00 00 00 00 00 00 00 00 00 |1t187...........|
002b5c60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
002b5e00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 |................|
002b5e10 00 00 00 00 00 00 00 00 00 00 00 00 |............|
Dec 13 2011
On Tuesday, 13 December 2011 at 17:29:20 UTC, Kai Meyer wrote:
> I get bytes_needed from the Content-Length header. I get the correct number of bytes from the Content-Length, so bytes_needed gets the right value, but the resulting file isn't right. The file has the right number of bytes, but I appear to have an extra '0a' at the very beginning of the file. If I do 'ss.getchar()' to get rid of it, I get an exception that there's not enough data in the stream.

In an HTTP request, the headers are separated from the body by an empty line. Headers use CR/LF line endings, so the body is always preceded by a 0D 0A 0D 0A sequence. It looks like your code is not snipping the last 0A.

Where did the getchar method come from? There is no mention of it in Phobos. Perhaps you could try the read(out ubyte) method?
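
The 0D 0A 0D 0A boundary described in this reply can be located directly in the raw bytes. What follows is only a minimal sketch, assuming the response (or at least its header block) has already been read into memory. `bodyOffset` is a hypothetical helper name, not a Phobos function, and the sample response is made up for illustration.

```d
import std.algorithm.searching : countUntil;
import std.string : representation;

/// Hypothetical helper (not part of Phobos): returns the index of the first
/// body byte, i.e. the position right after the first 0D 0A 0D 0A sequence,
/// or -1 if that sequence is not present in `response`.
ptrdiff_t bodyOffset(const(ubyte)[] response)
{
    static immutable ubyte[4] separator = [0x0D, 0x0A, 0x0D, 0x0A];
    immutable pos = response.countUntil(separator[]);
    return pos < 0 ? -1 : pos + cast(ptrdiff_t) separator.length;
}

void main()
{
    // Made-up response: a status line, one header, a blank line, then "MZ".
    auto raw = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nMZ".representation;

    immutable off = bodyOffset(raw);
    assert(off >= 0);
    // The body starts at 'M' (0x4D); no stray 0x0A is left in front of it.
    assert(raw[off] == 0x4D);
    assert(raw[off .. $] == "MZ".representation);
}
```

Searching for the full four-byte sequence, rather than for an empty line, means the trailing 0A is skipped as part of the separator and cannot leak into the saved file.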
Dec 13 2011
On 12/13/2011 10:39 AM, Vladimir Panteleev wrote:
> In an HTTP request, the headers are separated from the body by an empty line. Headers use CR/LF line endings, so the body is always preceded by a 0D 0A 0D 0A sequence. It looks like your code is not snipping the last 0A.
>
> Where did the getchar method come from? There is no mention of it in Phobos. Perhaps you could try the read(out ubyte) method?

http://www.d-programming-language.org/phobos/std_stream.html

Oh, I meant getc(), not getchar(), sorry. It looks like read(out ubyte) worked on Windows.

I'm using ss.readLine() to pull headers from the stream. When the string returned from ss.readLine() is empty, I move on to the body. I'm going to be using this application on Windows, Linux, and Mac, which is why I chose D. This feels like I've just entered the newline/carriage return nightmare. Should I not be using readLine()? Or is there some generic code that will always work and stick me at the beginning of the file?

-Kai Meyer
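
One way to avoid depending on how a generic readLine() treats CR and LF is to read header lines byte by byte and always consume the LF together with the line. The sketch below is an illustration under assumptions, not the std.stream implementation: it works on an in-memory byte slice instead of a socket, and `readHeaderLine` is a hypothetical helper name.

```d
import std.string : representation;

/// Hypothetical helper (not part of Phobos): reads one header line from
/// `input`, consuming bytes up to and including the terminating LF, and
/// returns the line without its trailing CR LF. `input` is advanced past it.
const(char)[] readHeaderLine(ref const(ubyte)[] input)
{
    size_t end = 0;
    while (end < input.length && input[end] != 0x0A)
        ++end;

    auto line = input[0 .. end];
    // Skip the LF itself (if present) so nothing is left pending.
    input = input[(end < input.length ? end + 1 : end) .. $];

    // Drop the CR that precedes the LF in HTTP headers.
    if (line.length && line[$ - 1] == 0x0D)
        line = line[0 .. $ - 1];
    return cast(const(char)[]) line;
}

void main()
{
    // Made-up response used only for illustration.
    const(ubyte)[] stream =
        "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nMZ".representation;

    // Read header lines until the empty line that ends the header block.
    while (readHeaderLine(stream).length != 0) {}

    // The cursor now sits on the first body byte.
    assert(stream == "MZ".representation);
}
```

Because the line terminator is fixed by HTTP rather than by the operating system, the same loop should behave identically on Windows, Linux, and Mac.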
Dec 13 2011
On Tue, 13 Dec 2011 17:58:57 -0000, Kai Meyer <kai unixlords.com> wrote:
> http://www.d-programming-language.org/phobos/std_stream.html
>
> Oh, I meant getc(), not getchar(), sorry. It looks like read(out ubyte) worked on windows.
>
> I'm using ss.readLine() to pull headers from the stream. When the string returned from ss.readLine() is empty, then I move on to the stream. I'm going to be using this application on Windows, Linux, and Mac, which is why I chose D. This feels like I've just entered the newline/carriage return nightmare. Should I not be using readLine()? Or is there some generic code that will always work and stick me at the beginning of the file?

I would have expected what you're doing to work.

IIRC when you make a GET request you send HTTP/1.0 or HTTP/1.1 or similar in the GET request line, right? (My memory of the syntax is a bit fuzzy.)

Warning: wacky idea with little/no backing knowledge. IIRC using HTTP/1.1 introduced additional data into the response, lengths or checksums or something; I never did get to the bottom of it. But if you change to using HTTP/1.0 they go away. I wonder if the 0A is related to that.

As a simple test you could try HTTP/1.0 in your request and look at the response Content-Length; it might just be 1 byte shorter as a result.

Regan
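
For the HTTP/1.0 experiment suggested here, the protocol version is simply part of the request line. The "additional data" recalled above is probably chunked transfer encoding, which HTTP/1.1 permits and which interleaves chunk-size lines with the body; that is a guess about what was meant, not something confirmed in this thread. The sketch below only builds the request text; `buildGetRequest`, the host, and the path are hypothetical.

```d
import std.format : format;

/// Hypothetical helper: builds a minimal GET request.
/// `httpVersion` is expected to be "1.0" or "1.1".
string buildGetRequest(string host, string path, string httpVersion = "1.0")
{
    // Every line ends in CR LF; the final blank line ends the header block.
    return format("GET %s HTTP/%s\r\nHost: %s\r\nConnection: close\r\n\r\n",
                  path, httpVersion, host);
}

void main()
{
    // Placeholder host and path, used only for illustration.
    immutable req = buildGetRequest("example.com", "/file.exe");
    assert(req ==
        "GET /file.exe HTTP/1.0\r\nHost: example.com\r\nConnection: close\r\n\r\n");
}
```

Sending the same request with "1.0" and then "1.1" and comparing the response headers would show whether the server changes its framing between the two versions.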
Dec 13 2011
On 12/13/2011 11:10 AM, Regan Heath wrote:
> I would have expected what you're doing to work.
>
> IIRC when you make a GET request you send HTTP/1.0 or HTTP/1.1 or similar in the GET request line, right? (my memory of the syntax is a bit fuzzy).
>
> Warning, wacky idea with little/no backing knowledge.. IIRC using HTTP/1.1 introduced additional data into the response, lengths or checksums, or something - I never did get to the bottom of it. But, if you change to using HTTP/1.0 they go away. I wonder if the 0A is related to that.
>
> As a simple test you could try HTTP/1.0 in your request and look at the response content-length, it might just be 1 byte shorter as a result.

Doing a read(out ubyte) read a single byte from the stream and allowed me to continue to read the full Content-Length number of bytes. Switching read(out ubyte) for a simple getc() caused the not-enough-bytes-in-stream exception. I'm now downloading a file of the correct size, with the bytes in the correct places, after calling read().

I may or may not play with HTTP/1.0. I need to turn my attention to other matters at the moment, though, since it currently "works".

-Kai Meyer
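
The fix reported here comes down to two steps: take the byte count from Content-Length, then copy exactly that many bytes starting at the first body byte. The following is a rough sketch of those two steps on in-memory data, not the code actually used in this thread; `contentLength` and `takeBody` are hypothetical helpers, and the header text is made up.

```d
import std.algorithm.iteration : splitter;
import std.algorithm.searching : findSplit;
import std.conv : to;
import std.exception : enforce;
import std.string : representation, strip;
import std.uni : icmp;

/// Hypothetical helper: returns the Content-Length value found in a
/// CR LF separated header block, or -1 if the header is absent.
long contentLength(const(char)[] headerBlock)
{
    foreach (line; headerBlock.splitter("\r\n"))
    {
        auto parts = line.findSplit(":");
        // parts[1] is empty when the line contains no colon.
        if (parts[1].length && icmp(parts[0].strip, "content-length") == 0)
            return parts[2].strip.to!long;
    }
    return -1;
}

/// Hypothetical helper: returns exactly `length` bytes from the start of
/// `afterHeaders`, throwing if fewer bytes are available.
const(ubyte)[] takeBody(const(ubyte)[] afterHeaders, long length)
{
    enforce(length >= 0, "no Content-Length header");
    immutable n = cast(size_t) length;
    enforce(afterHeaders.length >= n, "not enough data in the stream");
    return afterHeaders[0 .. n];
}

void main()
{
    // Header names are case-insensitive, hence the lowercase spelling here.
    immutable headers = "HTTP/1.1 200 OK\r\ncontent-length: 2\r\nServer: demo";
    immutable n = contentLength(headers);
    assert(n == 2);
    assert(takeBody("MZ".representation, n) == "MZ".representation);
}
```

If the body slice starts one byte too late, as happened with getc(), `takeBody` fails in the same way as the exception described above, because one of the Content-Length bytes has already been consumed.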
Dec 13 2011
I've created an HTTP client module. It's just an HTTP module, no cookies, no HTTPS, so if you need something small, try it.

https://github.com/Bystroushaak/DHTTPClient

On 13.12.2011 18:29, Kai Meyer wrote:
> I've been trying to modify the htmlget.d example for std.socketstream (http://www.d-programming-language.org/phobos/std_socketstream.html) to be able to download a file.
> [...]
Dec 18 2011