digitalmars.D.learn - std.net.curl get webpage asia font issue

"Sam Hu" <samhudotsamhu gmail.com> writes:

Greetings!

The documentation on this website provides an example of how to get 
webpage content with std.net.curl. It is quite straightforward:

[code]
import std.net.curl, std.stdio;

void main(){
    // Return a string containing the content specified by a URL
    string content = get("dlang.org");

    writefln("%s\n",content);

    readln;
}
[/code]

When I change get("dlang.org") to get("yahoo.com"), everything 
works fine; but when I change it to get("yahoo.com.cn"), I get a 
runtime error saying bad GBK encoding, blah...

So my very simple question is: how do I retrieve content from a 
webpage that could possibly contain Asian fonts (like Chinese 
fonts)?

Thanks for your help in advance.

Regards,
Sam

Jun 06 2012

Kevin <kevincox.ca gmail.com> writes:

On 07/06/12 02:57, Sam Hu wrote:
 string content = get("dlang.org");
 writefln("%s\n",content);

 So my very simple question is: how do I retrieve content from a
 webpage that could possibly contain Asian fonts (like Chinese fonts)?

I'm not really sure, but try:
wstring content = get("dlang.org");

Also make sure your terminal is set up for Unicode.

Jun 07 2012

"Sam Hu" <samhudotsamhu gmail.com> writes:

On Thursday, 7 June 2012 at 10:38:53 UTC, Kevin wrote:
 On 07/06/12 02:57, Sam Hu wrote:
 string content = get("dlang.org");
 writefln("%s\n",content);

 So my very simple question is: how do I retrieve content from a
 webpage that could possibly contain Asian fonts (like Chinese 
 fonts)?

 I'm not really sure, but try:
 wstring content = get("dlang.org");

 Also make sure your terminal is set up for Unicode.

Sorry, no, it does not work. I also tried printing the content to a 
DFL TextBox control, but I still get the same issue.

Jun 07 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 07.06.2012 10:57, Sam Hu wrote:
 Greetings!

 The documentation on this website provides an example of how to get
 webpage content with std.net.curl. It is quite straightforward:

 [code]
 import std.net.curl, std.stdio;

 void main(){

 // Return a string containing the content specified by an URL
 string content = get("dlang.org");

It's simple: in this line you "convert" whatever the site content was to 
Unicode. The problem is that the "convert" is either broken or simply a 
cast, whereas it should re-encode the source as Unicode. So the workaround 
is to fetch the content as an array of bytes and decode it yourself.

 writefln("%s\n",content);

 readln;
 }
 [/code]

 When I change get("dlang.org") to get("yahoo.com"), everything works
 fine; but when I change it to get("yahoo.com.cn"), I get a runtime error
 saying bad GBK encoding, blah...

 So my very simple question is: how do I retrieve content from a webpage
 that could possibly contain Asian fonts (like Chinese fonts)?

I think it's not a "font" problem but an encoding problem.

 Thanks for your help in advance.

 Regards,
 Sam


-- 
Dmitry Olshansky

Jun 07 2012

"Sam Hu" <samhudotsamhu gmail.com> writes:

On Thursday, 7 June 2012 at 10:43:32 UTC, Dmitry Olshansky wrote:
 string content = get("dlang.org");

 It's simple: in this line you "convert" whatever the site content 
 was to Unicode. The problem is that the "convert" is either broken 
 or simply a cast, whereas it should re-encode the source as 
 Unicode. So the workaround is to fetch the content as an array of 
 bytes and decode it yourself.

Thanks. May I know how? A code snippet would be appreciated.

Jun 07 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 08.06.2012 5:03, Sam Hu wrote:
 On Thursday, 7 June 2012 at 10:43:32 UTC, Dmitry Olshansky wrote:
 string content = get("dlang.org");

 It's simple: in this line you "convert" whatever the site content was to
 Unicode. The problem is that the "convert" is either broken or simply a
 cast, whereas it should re-encode the source as Unicode. So the workaround
 is to fetch the content as an array of bytes and decode it yourself.

 Thanks. May I know how? A code snippet would be appreciated.

Something like this seems like it should work:
ubyte[] data = get!(AutoProtocol, ubyte)("your-site.cn");
// Untested, sorry: I'm on Windows and curl doesn't work here for me.
Then you work with your data, decode it and so on. At the very least, this:
writeln(data); // will not throw, but will print the raw bytes
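To make the "get bytes, then decode yourself" idea a bit more concrete, here is a minimal sketch. The helper names tryUtf8 and fetchUtf8 are hypothetical (they are not part of Phobos), and the sketch only covers the first half of the job: it fetches the page as raw bytes and checks whether they are already well-formed UTF-8. As far as I know std.encoding has no GBK support, so actually transcoding a GBK page would need an external library such as iconv, which is not shown here.

```d
import std.net.curl : get, AutoProtocol;
import std.utf : validate, UTFException;

/// Hypothetical helper: if `data` is well-formed UTF-8, copies it into
/// `text` and returns true. Otherwise returns false and leaves `text` empty.
bool tryUtf8(const(ubyte)[] data, out string text)
{
    // Reinterpret the bytes as chars; this is only a cast, no re-encoding.
    auto chars = cast(const(char)[]) data;
    try
    {
        validate(chars); // throws UTFException on malformed UTF-8
    }
    catch (UTFException)
    {
        return false;
    }
    text = chars.idup;
    return true;
}

/// Hypothetical helper: downloads `url` as raw bytes (so nothing throws on
/// a non-UTF-8 page) and reports whether the body is usable as a D string.
bool fetchUtf8(string url, out string text)
{
    ubyte[] data = get!(AutoProtocol, ubyte)(url);
    return tryUtf8(data, text);
}
```

Calling fetchUtf8("your-site.cn", text) from main and getting false back would mean the page is in some other encoding (GBK for many Chinese sites) and has to be transcoded before it can be printed as text. For example, the character "中" is the bytes E4 B8 AD in UTF-8 but D6 D0 in GBK, and the latter is not a valid UTF-8 sequence.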

-- 
Dmitry Olshansky

Jun 08 2012
