
digitalmars.D.learn - about std.string.representation

reply "bioinfornatics" <bioinfornatics fedoraproject.org> writes:
Hi,
I am trying to understand which integer type (ubyte, ushort, uint)
each of char, wchar, dchar maps to.

For this I tried:

------------ CODE ---------------------
import std.stdio;   // needed for writeln
import std.string;

void main(){
	string  t  = "test";
	char[]  c  = "test".dup;
	dchar[] dc = "test"d.dup;
	wchar[] wc = "test"w.dup;
	writeln( typeid( t ),  ' ',  t.sizeof,  ' ', t.representation()  );
	writeln( typeid( c ),  ' ',  c.sizeof,  ' ', c.representation()  );
	writeln( typeid( dc ), ' ', dc.sizeof, ' ', dc.representation() );
	writeln( typeid( wc ), ' ', wc.sizeof, ' ', wc.representation() );
}
------------ RESULT ---------------------
immutable(char)[] 16 [116, 101, 115, 116]
char[] 16 [116, 101, 115, 116]
dchar[] 16 [116, 101, 115, 116]
wchar[] 16 [116, 101, 115, 116]
------------------------------------------

Each time it seems that ushort is used. Could someone explain, please?
Nov 13 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
bioinfornatics:

 Each time it seems that ushort is used. Could someone explain, please?

The .sizeof of a dynamic array is always 2 * size_t.sizeof: 8 bytes on a 32-bit target, 16 bytes on a 64-bit one.

Bye,
bearophile
Nov 13 2013
prev sibling next sibling parent "Jared Miller" <none example.com> writes:
To expand on bearophile's answer, a dynamic array is basically a 
package of two things: a length and a pointer to the contents. 
Each of these is 8 bytes on a 64-bit system, thus the sizeof a 
dynamic array is 16 bytes.
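
A minimal sketch of that layout, using only the language itself: the size of the slice is the same for every element type, while the size of the data it refers to is the length times the element size.

```d
import std.stdio;

// A slice is a (length, pointer) pair, so its own size never depends
// on the element type: 8 bytes on 32-bit targets, 16 on 64-bit ones.
static assert((char[]).sizeof  == size_t.sizeof + (void*).sizeof);
static assert((wchar[]).sizeof == (char[]).sizeof);
static assert((dchar[]).sizeof == (char[]).sizeof);

void main()
{
    dchar[] dc = "test"d.dup;

    // Size of the slice itself versus size of the data it refers to.
    writeln(dc.sizeof);                 // 2 * size_t.sizeof
    writeln(dc.length * dc[0].sizeof);  // 4 elements * 4 bytes each
}
```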

You are asking how char, wchar, and dchar correspond to integer 
types. That is defined here: http://dlang.org/type.html.

char  = 1 byte  (ubyte)
wchar = 2 bytes (ushort)
dchar = 4 bytes (uint)

To show this, print the sizeof an element in each of those arrays:

   writeln( typeid( t ),  ' ',  t[0].sizeof,  ' ', t.representation() );  // 1
   writeln( typeid( c ),  ' ',  c[0].sizeof,  ' ', c.representation() );  // 1
   writeln( typeid( dc ), ' ', dc[0].sizeof, ' ', dc.representation() ); // 4
   writeln( typeid( wc ), ' ', wc[0].sizeof, ' ', wc.representation() ); // 2
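
The same correspondence can also be checked at compile time. The sketch below assumes only std.string.representation from Phobos; it asserts the element sizes from the table and the array types that representation() returns, including the fact that qualifiers such as immutable are carried over.

```d
import std.string : representation;

// Element sizes from the table above.
static assert(char.sizeof  == ubyte.sizeof);   // 1 byte
static assert(wchar.sizeof == ushort.sizeof);  // 2 bytes
static assert(dchar.sizeof == uint.sizeof);    // 4 bytes

void main()
{
    char[]  c  = "test".dup;
    wchar[] wc = "test"w.dup;
    dchar[] dc = "test"d.dup;

    // representation() reinterprets the array with the matching integer type.
    static assert(is(typeof(c.representation)  == ubyte[]));
    static assert(is(typeof(wc.representation) == ushort[]));
    static assert(is(typeof(dc.representation) == uint[]));

    // Qualifiers are preserved: string is immutable(char)[].
    static assert(is(typeof("test".representation) == immutable(ubyte)[]));

    // The values are the code units themselves.
    assert(c.representation == [116, 101, 115, 116]);
}
```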
Nov 13 2013
prev sibling parent reply "Ali Çehreli" <acehreli yahoo.com> writes:
On 11/13/2013 04:32 PM, bioinfornatics wrote:
 Hi,
 I am trying to understand which integer type (ubyte, ushort, uint)
 each of char, wchar, dchar maps to.

And for templates, there is std.range.ElementEncodingType:

import std.stdio;
import std.range;

void foo(R)(R range)
{
    // In contrast, ElementType!R for strings is always dchar
    writeln(typeid(ElementEncodingType!R));
}

void main()
{
    string  t  = "test";
    char[]  c  = "test".dup;
    dchar[] dc = "test"d.dup;
    wchar[] wc = "test"w.dup;

    foo(t);
    foo(c);
    foo(dc);
    foo(wc);
}

Prints:

immutable(char)
char
dchar
wchar

Ali
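
To make the contrast mentioned in the code comment concrete, here is a small sketch that compares ElementType and ElementEncodingType with static asserts. It relies only on the two Phobos templates named above; the int[] case is added purely for illustration.

```d
import std.range : ElementEncodingType, ElementType;

// Iterating a narrow string as a range decodes it, so the range element is dchar...
static assert(is(ElementType!string    == dchar));
static assert(is(ElementType!(wchar[]) == dchar));

// ...while ElementEncodingType reports the stored code unit, qualifiers included.
static assert(is(ElementEncodingType!string    == immutable(char)));
static assert(is(ElementEncodingType!(wchar[]) == wchar));

// For non-string ranges the two agree.
static assert(is(ElementType!(int[])         == int));
static assert(is(ElementEncodingType!(int[]) == int));

void main() {}
```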
Nov 14 2013
parent reply "bioinfornatics" <bioinfornatics fedoraproject.org> writes:
Thanks Ali, that is interesting too…

In the same way, I would like to know if there is a function which returns ubyte, ushort, uint instead of:
- char, dchar, wchar from std.range.ElementEncodingType
- ubyte[], ushort[], uint[] from std.string.representation

Maybe:

foo(T)( string s ){
    alias T typeof(s.representation[0]);
    …
    …
    …
}
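
One way the typeof(s.representation[0]) idea could be made to compile is sketched below. RepresentationElement is a hypothetical helper name, not something Phobos provides, and the sketch assumes the unqualified integer type is what is wanted, so it strips immutable/const with std.traits.Unqual.

```d
import std.string : representation;
import std.traits : Unqual;

// Hypothetical helper (not part of Phobos): the unqualified integer type
// that representation() uses for the elements of the string type S.
// typeof() does not evaluate its argument, so S.init is never indexed.
alias RepresentationElement(S) = Unqual!(typeof(S.init.representation[0]));

static assert(is(RepresentationElement!string    == ubyte));
static assert(is(RepresentationElement!(char[])  == ubyte));
static assert(is(RepresentationElement!(wchar[]) == ushort));
static assert(is(RepresentationElement!(dchar[]) == uint));

void main()
{
    // Usage: declare a buffer of the matching integer type.
    RepresentationElement!wstring[] units = [0x74, 0x65];
    assert(units[0].sizeof == 2);
}
```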
Nov 18 2013
parent reply "Ali Çehreli" <acehreli yahoo.com> writes:
I don't know an existing function but I think the following is what you are looking for:

import std.range;

template NonUtfElementEncodingType(S)
{
    alias ET = ElementEncodingType!S;

    static if (is (ET == char)) {
        alias NonUtfElementEncodingType = ubyte;

    } else static if (is (ET == wchar)) {
        alias NonUtfElementEncodingType = ushort;

    } else static if (is (ET == dchar)) {
        alias NonUtfElementEncodingType = uint;

    } else {
        alias NonUtfElementEncodingType = ET;
    }
}

void main()
{
    alias Foo = NonUtfElementEncodingType!string;
    Foo[] myByteArray;
}

Ali
Nov 18 2013
parent reply "bioinfornatics" <bioinfornatics fedoraproject.org> writes:
Yes, that is what I want. Why not use a piece of code from std.string.representation?

auto CharEncodingType(Char)(Char[] s) pure nothrow
     if(isSomeChar!Char)
{
     // Get representation type
     alias TypeTuple!(ubyte, ushort, uint)U;

     // const and immutable storage classes
     static if (is(Char == immutable)) alias immutable(U) T;
     else static if (is(Char == const)) alias const(U) T;
     else alias U T;

     // shared storage class (because shared(const(T)) is possible)
     static if (is(Char == shared)) alias shared(T) ST;
     else alias T ST;

     return cast(ST) s;
}
Nov 19 2013
parent reply "bioinfornatics" <bioinfornatics fedoraproject.org> writes:
FIX

template CharEncodingType(Char[] s) pure nothrow
     if(isSomeChar!Char)
{
     // Get representation type
     alias TypeTuple!(ubyte, ushort, uint)[Char.sizeof / 2] U;

     // const and immutable storage classes
     static if (is(Char == immutable)) alias immutable(U) T;
     else static if (is(Char == const)) alias const(U) T;
     else alias U T;

     // shared storage class (because shared(const(T)) is possible)
     static if (is(Char == shared)) alias shared(T) CharEncodingType;
     else alias T CharEncodingType;
}
Nov 19 2013
parent reply "bioinfornatics" <bioinfornatics fedoraproject.org> writes:
I tried this:

import std.traits       : isSomeChar;
import std.typetuple    : TypeTuple;
import std.stdio;

template CharEncodingType(Char)
{
     // Get representation type
     alias TypeTuple!(ubyte, ushort, uint) U;

     // const and immutable storage classes
     static if (is(Char == immutable)) alias immutable(U) T;
     else static if (is(Char == const)) alias const(U) T;
     else alias U T;

     // shared storage class (because shared(const(T)) is possible)
     static if (is(Char == shared)) alias shared(T) CharEncodingType;
     else alias T CharEncodingType;
}

void main()
{
     string  t  = "test";
     char[]  c  = "test".dup;
     dchar[] dc = "test"d.dup;
     wchar[] wc = "test"w.dup;

     alias T  = CharEncodingType!(typeof(t));
     alias C  = CharEncodingType!(typeof(c));
     alias DC = CharEncodingType!(typeof(dc));
     alias WC = CharEncodingType!(typeof(wc));

     writeln( typeid(T), typeid(C), typeid(DC), typeid(WC) );

}





RESULT

$ ~/test_type
(ubyte,ushort,uint)(ubyte,ushort,uint)(ubyte,ushort,uint)(ubyte,ushort,uint)


instead of ubyte, ushort, uint.

I will use your way, Ali.

Thanks for your useful code.
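
As far as I can tell, the attempt above prints the whole tuple because U aliases the entire TypeTuple: the [Char.sizeof / 2] index from the FIX post was dropped, and the template is instantiated with array types such as typeof(t) rather than with character types. A minimal sketch of the indexing trick follows. CodeUnitInteger is a hypothetical name, std.meta.AliasSeq is used as the current name for TypeTuple, and qualifier handling is left out on purpose, so the result is always the plain integer type.

```d
import std.meta   : AliasSeq;
import std.range  : ElementEncodingType;
import std.traits : isSomeChar;

// Hypothetical name; selects ONE type from the list instead of aliasing the list.
template CodeUnitInteger(S)
if (isSomeChar!(ElementEncodingType!S))
{
    // Code unit type of the string, e.g. immutable(char) for string.
    alias Char = ElementEncodingType!S;

    // Char.sizeof is 1, 2 or 4, so Char.sizeof / 2 is the index 0, 1 or 2.
    alias CodeUnitInteger = AliasSeq!(ubyte, ushort, uint)[Char.sizeof / 2];
}

static assert(is(CodeUnitInteger!string    == ubyte));
static assert(is(CodeUnitInteger!(char[])  == ubyte));
static assert(is(CodeUnitInteger!(wchar[]) == ushort));
static assert(is(CodeUnitInteger!(dchar[]) == uint));

void main() {}
```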
Nov 19 2013
parent "bioinfornatics" <bioinfornatics fedoraproject.org> writes:
This works as expected:

import std.traits   : isSomeChar;
import std.range    : ElementEncodingType;
import std.stdio;

@safe pure nothrow
template CharEncodingType(Char)
{
     alias ET = ElementEncodingType!Char;

     static if (is (ET == char)) {
         alias CharEncodingType = ubyte;

     } else static if (is (ET == wchar)) {
         alias CharEncodingType = ushort;

     } else static if (is (ET == dchar)) {
         alias CharEncodingType = uint;

     } else {
         alias CharEncodingType = ET;
     }
}

void main()
{
     string  t  = "test";
     char[]  c  = "test".dup;
     dchar[] dc = "test"d.dup;
     wchar[] wc = "test"w.dup;

     alias T  = CharEncodingType!(typeof(t));
     alias C  = CharEncodingType!(typeof(c));
     alias DC = CharEncodingType!(typeof(dc));
     alias WC = CharEncodingType!(typeof(wc));

     writeln( typeid(T), ' ', typeid(C), ' ', typeid(DC), ' ', typeid(WC) );

}


Result
immutable(char) ubyte uint ushort
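
Note that the first entry in this result is still immutable(char): ElementEncodingType!string is immutable(char), which is not the same type as char, so none of the static if branches match and the final else returns the type unchanged. If the plain integer type is wanted for string as well, one option is to strip qualifiers first with std.traits.Unqual. The sketch below assumes that is the desired behaviour; UnqualCharEncodingType is a hypothetical name.

```d
import std.range  : ElementEncodingType;
import std.traits : Unqual;

// Hypothetical variant of the template above that ignores const/immutable/shared.
template UnqualCharEncodingType(S)
{
    // Drop qualifiers before comparing, so immutable(char) matches char.
    alias ET = Unqual!(ElementEncodingType!S);

    static if (is(ET == char))       alias UnqualCharEncodingType = ubyte;
    else static if (is(ET == wchar)) alias UnqualCharEncodingType = ushort;
    else static if (is(ET == dchar)) alias UnqualCharEncodingType = uint;
    else                             alias UnqualCharEncodingType = ET;
}

static assert(is(UnqualCharEncodingType!string    == ubyte));
static assert(is(UnqualCharEncodingType!(char[])  == ubyte));
static assert(is(UnqualCharEncodingType!(wchar[]) == ushort));
static assert(is(UnqualCharEncodingType!(dchar[]) == uint));

// Non-character element types pass through unchanged.
static assert(is(UnqualCharEncodingType!(int[]) == int));

void main() {}
```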



Thanks, all.
Nov 19 2013