digitalmars.D - Re: String compare performance

reply Kagamin <spam here.lot> writes:
bearophile Wrote:

 Also, is there a way to bit-compare given memory areas at much higher speed
than element per element (I mean for arrays in general)?

I don't know. I think you can't.

You can use memcmp, though only for utf-8 strings.
Nov 27 2010
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 27/11/2010 23:04, Kagamin wrote:
 bearophile Wrote:

 Also, is there a way to bit-compare given memory areas at much
 higher speed than element per element (I mean for arrays in
 general)?

I don't know. I think you can't.

You can use memcmp, though only for utf-8 strings.

Only for utf-8 strings? Why's that? I would've thought memcmp to be type agnostic. Stewart.
Nov 28 2010
next sibling parent reply bearophobic <notbear cave.net> writes:
Stewart Gordon Wrote:

 On 27/11/2010 23:04, Kagamin wrote:
 bearophile Wrote:

 Also, is there a way to bit-compare given memory areas at much
 higher speed than element per element (I mean for arrays in
 general)?

I don't know. I think you can't.

You can use memcmp, though only for utf-8 strings.

Only for utf-8 strings? Why's that? I would've thought memcmp to be type agnostic. Stewart.

The D community is an amazing cult of premature optimization fans. Has any one of you heard of canonically equivalent sequences? The integrated Unicode support is a clusterfuck. Please do compare ASCII strings with memcmp, but not Unicode. Where did the original poster pull this problem from, his ass? "My system runs 100,000,000,000 instructions per second, but this comparison of 4-letter strings uses 5 cycles too many! This is the only problem on the way to world domination with my $500 Microsoft Word clone!". No wait, the problems are completely imaginary. Bye, bearophobic
Nov 28 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-11-28 20:57:38 -0500, bearophobic <notbear cave.net> said:

 Stewart Gordon Wrote:
 
 On 27/11/2010 23:04, Kagamin wrote:
 bearophile Wrote:
 
 Also, is there a way to bit-compare given memory areas at much
 higher speed than element per element (I mean for arrays in
 general)?

I don't know. I think you can't.

You can use memcmp, though only for utf-8 strings.

Only for utf-8 strings? Why's that? I would've thought memcmp to be type agnostic. Stewart.

The D community is an amazing cult of premature optimization fans. Has any one of you heard of canonically equivalent sequences? The integrated Unicode support is a clusterfuck. Please do compare ASCII strings with memcmp, but not Unicode. Where did the original poster pull this problem from, his ass? "My system runs 100,000,000,000 instructions per second, but this comparison of 4-letter strings uses 5 cycles too many! This is the only problem on the way to world domination with my $500 Microsoft Word clone!". No wait, the problems are completely imaginary.

Comparing Unicode UTF-* strings using memcmp is fine as long as what you want to know is whether the code points are the same. If your point was that per-code-point comparisons aren't the right way to compare Unicode strings (in most situations), then I support this view too. Though if that's what you wanted to say, you could have made your point clearer. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 28 2010
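
To make the distinction in the post above concrete, here is a minimal C sketch, offered as an illustration only and not taken from any poster. It compares two UTF-8 encodings of the same visible character "é": the precomposed code point U+00E9 and the canonically equivalent sequence U+0065 followed by U+0301. The helper names (same_bytes, check_canonical_example) are hypothetical. memcmp reports the two as different because the code points differ, which is exactly the case a normalization-aware comparison would treat as equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when two byte buffers hold the identical byte sequence.
   For valid UTF-8 that is the same as "identical sequence of code points". */
static int same_bytes(const char *a, size_t a_len,
                      const char *b, size_t b_len)
{
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}

/* U+00E9 LATIN SMALL LETTER E WITH ACUTE, UTF-8 bytes C3 A9. */
static const char precomposed[] = "\xC3\xA9";

/* U+0065 'e' followed by U+0301 COMBINING ACUTE ACCENT, UTF-8 bytes 65 CC 81. */
static const char decomposed[] = "e\xCC\x81";

/* Hypothetical self-check: the expectations follow from the bytes above. */
static void check_canonical_example(void)
{
    /* Identical code points compare equal. */
    assert(same_bytes(precomposed, strlen(precomposed),
                      precomposed, strlen(precomposed)));

    /* Canonically equivalent text, different code points: not equal. */
    assert(!same_bytes(precomposed, strlen(precomposed),
                       decomposed, strlen(decomposed)));
}
```

Treating the two spellings as equal would require normalizing both strings (for example to NFC) before the byte comparison; that step is outside what memcmp does and is not shown here.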
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 29/11/2010 02:11, Michel Fortin wrote:
 On 2010-11-28 20:57:38 -0500, bearophobic <notbear cave.net> said:

 Stewart Gordon Wrote:

 On 27/11/2010 23:04, Kagamin wrote:
 bearophile Wrote:

 Also, is there a way to bit-compare given memory areas at much
 higher speed than element per element (I mean for arrays in
 general)?

I don't know. I think you can't.

You can use memcmp, though only for utf-8 strings.

Only for utf-8 strings? Why's that? I would've thought memcmp to be type agnostic. Stewart.

The D community is an amazing cult of premature optimization fans. Has any one of you heard of canonically equivalent sequences? The integrated Unicode support is a clusterfuck. Please do compare ASCII strings with memcmp, but not Unicode. Where did the original poster pull this problem from, his ass? "My system runs 100,000,000,000 instructions per second, but this comparison of 4-letter strings uses 5 cycles too many! This is the only problem on the way to world domination with my $500 Microsoft Word clone!". No wait, the problems are completely imaginary.

Comparing Unicode UTF-* strings using memcmp is fine as long as what you want to know is whether the code points are the same. If your point was that per-code-point comparisons aren't the right way to compare Unicode strings (in most situations), then I support this view too. Though if that's what you wanted to say, you could have made your point clearer.

Why are people still replying to nameless trolls? There have been several cases of that in recent threads. :/ -- Bruno Medeiros - Software Engineer
Dec 07 2010
parent reply Brüno Mediocre <gay mail.com> writes:
Bruno Medeiros Wrote:

 On 29/11/2010 02:11, Michel Fortin wrote:
 On 2010-11-28 20:57:38 -0500, bearophobic <notbear cave.net> said:

 Stewart Gordon Wrote:

 On 27/11/2010 23:04, Kagamin wrote:
 bearophile Wrote:

 Also, is there a way to bit-compare given memory areas at much
 higher speed than element per element (I mean for arrays in
 general)?

I don't know. I think you can't.

You can use memcmp, though only for utf-8 strings.

Only for utf-8 strings? Why's that? I would've thought memcmp to be type agnostic. Stewart.

The D community is an amazing cult of premature optimization fans. Has any one of you heard of canonically equivalent sequences? The integrated Unicode support is a clusterfuck. Please do compare ASCII strings with memcmp, but not Unicode. Where did the original poster pull this problem from, his ass? "My system runs 100,000,000,000 instructions per second, but this comparison of 4-letter strings uses 5 cycles too many! This is the only problem on the way to world domination with my $500 Microsoft Word clone!". No wait, the problems are completely imaginary.

Comparing Unicode UTF-* strings using memcmp is fine as long as what you want to know is whether the code points are the same. If your point was that per-code-point comparisons aren't the right way to compare Unicode strings (in most situations), then I support this view too. Though if that's what you wanted to say, you could have made your point clearer.

Why are people still replying to nameless trolls? There have been several cases of that in recent threads. :/

Trololol. Maybe they're a bit dumb, my brother. If they some day become smarter, they'll stop using D. They see how much shit it is. I miss my wife. Oh god.... bring back my life! Bring me my.. sandwich!
Dec 08 2010
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 08/12/2010 15:02, Brüno Mediocre wrote:
 Bruno Medeiros Wrote:
 Why are people still replying to nameless trolls? There have been several
 cases of that in recent threads. :/

Trololol. Maybe they're a bit dumb, my brother. If they some day become smarter, they'll stop using D. They see how much shit it is. I miss my wife. Oh god.... bring back my life! Bring me my.. sandwich!

Lol, "Brüno Mediocre", well thought, that's actually funny. -- Brüno Mediocre - Software Engineer
Dec 09 2010
parent Daniel Gibson <metalcaedes gmail.com> writes:
Bruno Medeiros schrieb:
 On 08/12/2010 15:02, Brüno Mediocre wrote:
 Bruno Medeiros Wrote:
 Why are people still replying to nameless trolls? There have been several
 cases of that in recent threads. :/

Trololol. Maybe they're a bit dumb, my brother. If they some day become smarter, they'll stop using D. They see how much shit it is. I miss my wife. Oh god.... bring back my life! Bring me my.. sandwich!

Lol, "Brüno Mediocre", well thought, that's actually funny.

Nah, I think it's a rather mediocre joke.
Dec 09 2010
prev sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Sun, 28 Nov 2010 20:32:24 -0500, Stewart Gordon <smjg_1998 yahoo.com>  
wrote:

 On 27/11/2010 23:04, Kagamin wrote:
 bearophile Wrote:

 Also, is there a way to bit-compare given memory areas at much
 higher speed than element per element (I mean for arrays in
 general)?

I don't know. I think you can't.

You can use memcmp, though only for utf-8 strings.

Only for utf-8 strings? Why's that? I would've thought memcmp to be type agnostic. Stewart.

memcmp is type agnostic if all you want to test is equality. The other use of memcmp is essentially as an opCmp, that is, to order two values, in which case it would be type sensitive.
Nov 28 2010
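
The equality-versus-ordering point in the last post can be sketched in C as follows. This is an illustration under stated assumptions, not code from the thread: the helper names are hypothetical, int is assumed to be two's complement, and the equality shortcut only holds for element types where equal values always have equal bit patterns (it would not hold for floating point, where +0.0 and -0.0 compare equal but differ bitwise, or for structs with padding). memcmp compares bytes as unsigned char, so its sign can disagree with the element-wise ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Equality: true when both int arrays hold the same bit patterns. */
static int ints_equal(const int *a, const int *b, size_t count)
{
    return memcmp(a, b, count * sizeof *a) == 0;
}

/* Ordering done by element value: negative, zero, or positive result. */
static int ints_compare(const int *a, const int *b, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* Ordering done by raw bytes: ignores signedness and byte order. */
static int ints_compare_bytes(const int *a, const int *b, size_t count)
{
    return memcmp(a, b, count * sizeof *a);
}

/* Hypothetical self-check, assuming two's complement int. */
static void check_ordering_example(void)
{
    const int neg[] = { -1 };
    const int pos[] = { 1 };

    assert(ints_equal(neg, neg, 1));
    assert(!ints_equal(neg, pos, 1));

    /* By value, -1 sorts before 1. */
    assert(ints_compare(neg, pos, 1) < 0);

    /* By bytes, -1 is all 0xFF and so sorts after 1 on either byte order. */
    assert(ints_compare_bytes(neg, pos, 1) > 0);
}
```

UTF-8 is a case where the byte-wise order does agree with code point order, which may be what the "only for utf-8 strings" remark earlier in the thread had in mind; for UTF-16 or UTF-32 code units stored little-endian, the byte-wise order generally does not match.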