digitalmars.D.learn - cannot sort an array of char

reply "Ivan Kazmenko" <gassa mail.ru> writes:
Hi!

This gives an error (cannot deduce template function from 
argument types):

-----
import std.algorithm;
void main () {
	char [] c;
	sort (c);
}
-----

Why is "char []" so special that it can't be sorted?

For example, if I know the array contains only ASCII characters, 
sorting it sounds no different to sorting an "int []".

Ivan Kazmenko.
Nov 05 2014
next sibling parent reply "Marc Schütz" <schuetzm gmx.net> writes:
Hmm... this doesn't work either:

-----
import std.algorithm;
import std.utf;
void main () {
	char [] c;
	sort (c.byCodeUnit);
}
-----

But IMO it should.
Nov 05 2014
next sibling parent reply "Marc Schütz" <schuetzm gmx.net> writes:
https://issues.dlang.org/show_bug.cgi?id=13689
Nov 05 2014
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 11/05/2014 05:44 AM, "Marc Schütz" <schuetzm gmx.net> wrote:
https://issues.dlang.org/show_bug.cgi?id=13689
It can't be a RandomAccessRange because it cannot satisfy random
access at O(1) time.

Ali
Nov 05 2014
parent Ali Çehreli <acehreli yahoo.com> writes:
On 11/05/2014 10:01 AM, Ali Çehreli wrote:

         sort (c.byCodeUnit);
     }

 But IMO it should.
https://issues.dlang.org/show_bug.cgi?id=13689
It can't be a RandomAccessRange because it cannot satisfy random access at O(1) time.
Sorry, I misunderstood (again): code unit is random-access, code
point is not.

Ali

P.S. I would like to have a word with the Unicode people who
settled on the terms "code unit" and "code point". Every time I
come across one of those, I have to think at least 5 seconds to
fool myself to think that I understood correctly which one was
meant. :p
Nov 05 2014
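
The code unit / code point distinction can be checked directly. What follows is a minimal sketch, assuming a Phobos build with the default auto-decoding behaviour; the helper names `codeUnits` and `codePoints` are made up for illustration and are not part of any library.

```d
import std.range : walkLength;
import std.utf : byCodeUnit;

// Hypothetical helper: number of UTF-8 code units (bytes) in s.
size_t codeUnits(const(char)[] s)
{
    return s.length;
}

// Hypothetical helper: number of code points in s.
// walkLength iterates the string as a range of dchar, decoding as it goes.
size_t codePoints(const(char)[] s)
{
    return s.walkLength;
}

void main()
{
    const(char)[] s = "h\u00e9llo"; // U+00E9 occupies two code units in UTF-8
    assert(codeUnits(s) == 6);
    assert(codePoints(s) == 5);

    // Indexing by code unit needs no decoding, so it is O(1).
    assert(s.byCodeUnit[0] == 'h');
}
```

Finding the n-th code point requires decoding everything before it, which is why a char[] viewed as a range of dchar cannot offer random access.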
prev sibling parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
So, you imply that to use a char array as a RandomAccessRange, I 
have to use byCodeUnit?  (and it should work, but doesn't?)

Fine, but how does one learn that except by asking here?  
Googling did not produce meaningful results for me.

For example, isRandomAccessRange[0] states the problem:
-----
Although char[] and wchar[] (as well as their qualified 
versions including string and wstring) are arrays, 
isRandomAccessRange yields false for them because they use 
variable-length encodings (UTF-8 and UTF-16 respectively). 
These types are bidirectional ranges only.
-----
but does not offer a solution.  If (when) byCodeUnit does 
really provide a random-access range, it would be desirable to 
have it linked where the problem is stated.
Nov 06 2014
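
The documentation excerpt quoted in this post can be verified at compile time. This is only a sketch, assuming a recent Phobos in which byCodeUnit already provides random access (it did not at the time of this thread):

```d
import std.range.primitives : isBidirectionalRange, isRandomAccessRange;
import std.string : representation;
import std.utf : byCodeUnit;

// Narrow strings are bidirectional ranges only, as the documentation says.
static assert(!isRandomAccessRange!(char[]));
static assert(!isRandomAccessRange!(wchar[]));
static assert(!isRandomAccessRange!string);
static assert( isBidirectionalRange!(char[]));

// UTF-32 is a fixed-width encoding, so dchar[] is random-access.
static assert( isRandomAccessRange!(dchar[]));

// Both workarounds mentioned in the thread yield random-access ranges.
static assert( isRandomAccessRange!(typeof((char[]).init.byCodeUnit)));
static assert( isRandomAccessRange!(typeof((char[]).init.representation)));

void main() {}
```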
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Thursday, 6 November 2014 at 10:52:32 UTC, Ivan Kazmenko wrote:
So, you imply that to use a char array as a RandomAccessRange, I have to use byCodeUnit? (and it should work, but doesn't?)
Yes. H.S. Teoh has already submitted a PR to fix it.
 Fine, but how does one learn that except by asking here?  
 Googling did not produce meaningful results for me.

 For example, isRandomAccessRange[0] states the problem:
 -----
 Although char[] and wchar[] (as well as their qualified 
 versions including string and wstring) are arrays, 
 isRandomAccessRange yields false for them because they use 
 variable-length encodings (UTF-8 and UTF-16 respectively). 
 These types are bidirectional ranges only.
 -----
 but does not offer a solution.  If (when) byCodeUnit does 
 really provide a random-access range, it would be desirable to 
 have it linked where the problem is stated.


I agree. But how should it be implemented? We would have to modify algorithms that require an RA range to also accept char[], but then print an error message with the suggestion to use byCodeUnit. I think that's not practicable. Any better ideas?
Nov 06 2014
parent "Ivan Kazmenko" <gassa mail.ru> writes:
IK>> For example, isRandomAccessRange[0] states the problem:
IK>> -----
IK>> Although char[] and wchar[] (as well as their qualified
IK>> versions including string and wstring) are arrays,
IK>> isRandomAccessRange yields false for them because they use
IK>> variable-length encodings (UTF-8 and UTF-16 respectively).
IK>> These types are bidirectional ranges only.
IK>> -----
IK>> but does not offer a solution.  If (when) byCodeUnit does
IK>> really provide a random-access range, it would be desirable to
IK>> have it linked where the problem is stated.
IK>>
IK>> [0] 


MS> I agree. But how should it be implemented? We would have to
MS> modify algorithms that require an RA range to also accept
MS> char[], but then print an error message with the suggestion to
MS> use byCodeUnit. I think that's not practicable. Any better
MS> ideas?

I meant just mentioning a workaround (byCodeUnit or 
representation) in the documentation, not in a compiler error.  
But the latter option does make some sense, too.
Nov 11 2014
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/5/14 7:54 AM, Ivan Kazmenko wrote:
 Why is "char []" so special that it can't be sorted?
Because sort works on ranges, and std.range has the view that
char[] is a range of dchar without random access. Nevermind
what the compiler thinks :)

I believe you can get what you want with
std.string.representation:

import std.string;

sort(c.representation);

-Steve
Nov 06 2014
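
For reference, the representation suggestion can be expanded into a complete program. This is a minimal sketch; the helper name `sortCodeUnits` is hypothetical, and the result is only meaningful text when the array holds pure ASCII, as assumed in the original question.

```d
import std.algorithm : sort;
import std.string : representation;

// Hypothetical helper: sorts the code units of c in place and returns c.
char[] sortCodeUnits(char[] c)
{
    // representation reinterprets the char[] as a ubyte[] over the same
    // memory, so sorting the bytes reorders the characters of c itself.
    sort(c.representation);
    return c;
}

void main()
{
    char[] c = "hello".dup;
    assert(sortCodeUnits(c) == "ehllo");
}
```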
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
IK>> Why is "char []" so special that it can't be sorted?

SS> Because sort works on ranges, and std.range has the view that
SS> char[] is a range of dchar without random access. Nevermind
SS> what the compiler thinks :)
SS>
SS> I believe you can get what you want with
SS> std.string.representation:
SS>
SS> import std.string;
SS>
SS> sort(c.representation);

Thank you for showing a library way to do that.
I ended up with using a cast, like "sort (cast (ubyte []) c)".
And this looks like a safe way to do the same.

Now, std.utf's byCodeUnit and std.string's representation seem 
like duplicate functionality, albeit with different input and 
output types (and bugs :) ).
Nov 11 2014
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/11/14 6:07 AM, Ivan Kazmenko wrote:
 Thank you for showing a library way to do that.
 I ended up with using a cast, like "sort (cast (ubyte []) c)".
 And this looks like a safe way to do the same.
It's safe but be careful. For instance, if c becomes an immutable(char)[] or const(char)[], then you will have undefined behavior. If you use the representation, it will properly reject this behavior.
 Now, std.utf's byCodeUnit and std.string's representation seem like
 duplicate functionality, albeit with different input and output types
 (and bugs :) ).
No, byCodeUnit is not an array, it's a range of char. They solve
different problems, and mean different things.

Note, byCodeUnit should work for sort, I'm surprised it doesn't.

-Steve
Nov 11 2014
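
Both points in this reply (the cast versus representation, and byCodeUnit versus representation) can be illustrated with compile-time checks. This is a sketch under the assumption of a default Phobos build; `sortedCopy` is a hypothetical helper introduced only for the example.

```d
import std.algorithm : sort;
import std.range.primitives : ElementType;
import std.string : representation;
import std.utf : byCodeUnit;

// Hypothetical helper: returns a sorted mutable copy, leaving s untouched.
char[] sortedCopy(string s)
{
    char[] c = s.dup;
    sort(c.representation);
    return c;
}

void main()
{
    string s = "hello";
    char[] c = s.dup;

    // representation yields an array of integers and keeps the qualifier.
    static assert(is(typeof(c.representation) == ubyte[]));
    static assert(is(typeof(s.representation) == immutable(ubyte)[]));

    // byCodeUnit yields a non-array range whose elements are still char.
    static assert(!is(typeof(c.byCodeUnit) == char[]));
    static assert(is(ElementType!(typeof(c.byCodeUnit)) == char));

    // Sorting immutable data through representation is rejected...
    static assert(!__traits(compiles, sort(s.representation)));
    // ...whereas the cast compiles and silently discards immutable.
    static assert(__traits(compiles, cast(ubyte[]) s));

    assert(sortedCopy(s) == "ehllo");
    assert(s == "hello");
}
```

The last static assert only shows that the compiler accepts the cast; actually sorting through it would modify immutable data, which is the undefined behavior warned about above.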
parent "Marc Schütz" <schuetzm gmx.net> writes:
On Tuesday, 11 November 2014 at 13:20:53 UTC, Steven 
Schveighoffer wrote:
 Now, std.utf's byCodeUnit and std.string's representation seem like
 duplicate functionality, albeit with different input and output types
 (and bugs :) ).
No, byCodeUnit is not an array, it's a range of char. They solve different problems, and mean different things. Note, byCodeUnit should work for sort, I'm surprised it doesn't.
That's what he meant by "bugs" :-P

But it's been fixed already, thanks to H.S. Teoh.
Nov 11 2014
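
With the fix in place, the byCodeUnit variant from earlier in the thread should compile. Here is a minimal sketch, assuming a Phobos release that includes the fix for issue 13689; the helper name `sortViaCodeUnits` is hypothetical.

```d
import std.algorithm : sort;
import std.utf : byCodeUnit;

// Hypothetical helper: sorts c in place through its code-unit view.
char[] sortViaCodeUnits(char[] c)
{
    // byCodeUnit wraps the same memory, so the sort reorders c itself.
    sort(c.byCodeUnit);
    return c;
}

void main()
{
    char[] c = "sorted".dup;
    assert(sortViaCodeUnits(c) == "deorst");
}
```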