digitalmars.D - Re: Higher level built-in strings

reply Jesse Phillips <jessekphillips+d gmail.com> writes:
What about:

struct String {
	string items;
	alias items this;
}

And add the needed functions you wish to have in string and it will still work
in existing functions that operate on immutable(char)[]
Jul 19 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/19/2010 06:51 PM, Jesse Phillips wrote:
 What about:

 struct String {
 	string items;
 	alias items this;
 }

 And add the needed functions you wish to have in string and it will still work
in existing functions that operate on immutable(char)[]

Fortunately you can essentially achieve the above by simply writing free functions that take a string or a ref string as their first argument. Then you can use str.foo(args) as an alternative for foo(str, args).

Andrei
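A minimal sketch of the free-function approach, assuming a D compiler in which the str.foo(args) call syntax applies to strings (it did for arrays when this was written, and applies to all types in current compilers). The names countOf and appendBang are made up for illustration and are not Phobos functions.

```d
// Hypothetical free function taking a string as its first argument.
size_t countOf(string s, char needle)
{
    size_t n;
    foreach (char c; s)
        if (c == needle)
            ++n;
    return n;
}

// Hypothetical free function taking a ref string, so it can modify the caller's variable.
void appendBang(ref string s)
{
    s ~= "!";
}

unittest
{
    string str = "hello world";
    assert(str.countOf('o') == 2);   // same call as countOf(str, 'o')
    str.appendBang();                // same call as appendBang(str)
    assert(str == "hello world!");
}
```

Neither function needs a wrapper struct, and both keep working with any code that already accepts immutable(char)[].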
Jul 19 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 19 Jul 2010 20:26:47 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 06:51 PM, Jesse Phillips wrote:
 What about:

 struct String {
 	string items;
 	alias items this;
 }

 And add the needed functions you wish to have in string and it will  
 still work in existing functions that operate on immutable(char)[]

Fortunately you can essentially achieve the above by simply writing free functions that take a string or a ref string as their first argument. Then you can use str.foo(args) as an alternative for foo(str, args).

How do we make this work?

auto str = "hello world";
foreach(c; str)
    assert(is(typeof(c) == dchar));

-Steve
Jul 20 2010
parent reply Sean Kelly <sean invisibleduck.org> writes:
Steven Schveighoffer Wrote:
 
 How do we make this work?
 
 auto str = "hello world";
 foreach(c; str)
     assert(is(typeof(c) == dchar));

foreach (dchar c; str) assert(...);

This feature has been in D for years.
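A small sketch of the difference between the two loop forms, assuming UTF-8 string data. The helper names countInferred and countDecoded are invented for this example.

```d
import std.traits : Unqual;

// Hypothetical helper: lets foreach infer the element type.
size_t countInferred(string s)
{
    size_t n;
    foreach (c; s)
    {
        // The inferred type is a (qualified) char, so each step is one UTF-8 code unit.
        static assert(is(Unqual!(typeof(c)) == char));
        ++n;
    }
    return n;
}

// Hypothetical helper: asks foreach for dchar, so the loop decodes UTF-8 as it goes.
size_t countDecoded(string s)
{
    size_t n;
    foreach (dchar c; s)
    {
        static assert(is(typeof(c) == dchar));
        ++n;
    }
    return n;
}

unittest
{
    string str = "h\u00E9llo"; // U+00E9 occupies two UTF-8 code units
    assert(countInferred(str) == 6);
    assert(countDecoded(str) == 5);
}
```

The explicit dchar form gives code points, but only when the loop variable's type is written out; the inferred form stays at code units.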
Jul 20 2010
next sibling parent reply "Jérôme M. Berger" <jeberger free.fr> writes:

Sean Kelly wrote:
 Steven Schveighoffer Wrote:
 How do we make this work?

 auto str = "hello world";
 foreach(c; str)
     assert(is(typeof(c) == dchar));

foreach (dchar c; str) assert(...);

This feature has been in D for years.

And what about this one:

void func(T) (T range) {
    foreach (elem; range)
        assert (is (typeof (elem) == ElementType!(T)));
}

func ("azerty");
auto a = [ 1, 2, 3, 4, 5];
func (a);

Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Jul 20 2010
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Jérôme M. Berger wrote:
 	And what about this one:
 
 void func(T) (T range) {
     foreach (elem; range)
         assert (is (typeof (elem) == ElementType!(T)));
 }
 
 func ("azerty");
 auto a = [ 1, 2, 3, 4, 5];
 func (a);

You can specialize the template for strings:

void func(T:string)(T range) { ... }
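A minimal sketch of such a specialization next to the generic template. The function name describe and its return values are made up here so that the chosen overload is observable.

```d
// Generic version: used for any type without a better match.
string describe(T)(T range)
{
    return "generic";
}

// Specialization: preferred when T is (or implicitly converts to) string.
string describe(T : string)(T range)
{
    return "string";
}

unittest
{
    assert(describe("azerty") == "string");
    auto a = [1, 2, 3, 4, 5];
    assert(describe(a) == "generic");
}
```

The call sites look identical; the compiler picks the more specialized template when the argument is a string.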
Jul 20 2010
next sibling parent reply "Jérôme M. Berger" <jeberger free.fr> writes:

Walter Bright wrote:
 Jérôme M. Berger wrote:
     And what about this one:

 void func(T) (T range) {
     foreach (elem; range)
         assert (is (typeof (elem) == ElementType!(T)));
 }

 func ("azerty");
 auto a = [ 1, 2, 3, 4, 5];
 func (a);

You can specialize the template for strings:

void func(T:string)(T range) { ... }

Sure, I can also not use a template and write however many overloaded functions I need. So what are templates for?

Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Jul 20 2010
parent Walter Bright <newshound2 digitalmars.com> writes:
Jérôme M. Berger wrote:
 Walter Bright wrote:
 You can specialize the template for strings:

 void func(T:string)(T range) { ... }

Sure, I can also not use a template and write however many overloaded functions I need. So what are templates for?

The overloaded template specialization capability exists precisely because it is often advantageous to write custom versions for certain types. The user of the template doesn't see that; it looks generic to him.
Jul 20 2010
prev sibling parent "Aelxx" <aelxx yandex.ru> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message
news:i24st1$12uh$1 digitalmars.com...
 Jerome M. Berger wrote:
 And what about this one:

 void func(T) (T range) {
     foreach (elem; range)
         assert (is (typeof (elem) == ElementType!(T)));
 }

 func ("azerty");
 auto a = [ 1, 2, 3, 4, 5];
 func (a);

You can specialize the template for strings:

void func(T:string)(T range) { ... }

Hmm. Theoretically, a bit more general:

void func(T, U, V )(T rangeT, U rangeU, V rangeV) { ... }
void func(T:string, U, V )(T rangeT, U rangeU, V rangeV) { ... }
void func(T, U:string, V )(T rangeT, U rangeU, V rangeV) { ... }
void func(T, U, V:string )(T rangeT, U rangeU, V rangeV) { ... }
void func(T:string, U:string, V )(T rangeT, U rangeU, V rangeV) { ... }
void func(T:string, U, V:string )(T rangeT, U rangeU, V rangeV) { ... }
void func(T, U:string, V:string )(T rangeT, U rangeU, V rangeV) { ... }
void func(T:string, U:string, V:string )(T rangeT, U rangeU, V rangeV) { ... }
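One possible way around the eight overloads, sketched under the assumption that the string-specific handling can be isolated per argument: a single helper branches with static if, and func stays one template. The helper countElems is hypothetical, and func returning a count is only there to make the behaviour checkable.

```d
import std.traits : isSomeString;

// Hypothetical per-argument helper: the branch is chosen at compile time.
size_t countElems(R)(R range)
{
    size_t n;
    static if (isSomeString!R)
    {
        foreach (dchar c; range)   // strings: iterate by decoded code point
            ++n;
    }
    else
    {
        foreach (e; range)         // everything else: iterate by element
            ++n;
    }
    return n;
}

// One template covers every combination of string and non-string arguments.
size_t func(T, U, V)(T rangeT, U rangeU, V rangeV)
{
    return countElems(rangeT) + countElems(rangeU) + countElems(rangeV);
}

unittest
{
    assert(func("h\u00E9", [1, 2, 3], "abc") == 8); // 2 + 3 + 3
}
```

This does not help when the three arguments must be specialized jointly, which is the case the list above is pointing at.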
Jul 21 2010
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:
 The omission of dchar is on purpose.  Phobos has characterized string as 
 a bidirectional range of dchars.  For every range where I do:
 
 foreach(e; range)
 
 e is of the type of the range.  Except for char and wchar.  This 
 schizophrenia of type induction is very bad for D, and it's a good 
 argument of why strings should not simply be arrays.

For many algorithms on strings, iterating by char is preferred over dchar, even for multibyte strings.
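A sketch of why substring search can stay at the code-unit level, assuming both inputs are valid UTF-8: a valid needle can only match a valid haystack on a character boundary, so no decoding is needed. The function findUnits is a hypothetical naive search, not a Phobos routine.

```d
import std.string : representation;

// Hypothetical helper: naive substring search over raw UTF-8 code units.
// Returns the code-unit index of the first match, or -1 if there is none.
ptrdiff_t findUnits(string haystack, string needle)
{
    auto h = haystack.representation; // immutable(ubyte)[], no decoding
    auto n = needle.representation;
    if (n.length > h.length)
        return -1;
    foreach (i; 0 .. h.length - n.length + 1)
        if (h[i .. i + n.length] == n)
            return cast(ptrdiff_t) i;
    return -1;
}

unittest
{
    // "na\u00EFve " is 7 code units long, so the match starts at index 7.
    assert(findUnits("na\u00EFve caf\u00E9", "caf\u00E9") == 7);
    assert(findUnits("abc", "x") == -1);
}
```

The returned index is directly usable for slicing the original string, which a code-point index would not be.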
Jul 20 2010
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:
 On Tue, 20 Jul 2010 15:21:34 -0400, Walter Bright 
 <newshound2 digitalmars.com> wrote:
 
 Steven Schveighoffer wrote:
 The omission of dchar is on purpose.  Phobos has characterized string 
 as a bidirectional range of dchars.  For every range where I do:
  foreach(e; range)
  e is of the type of the range.  Except for char and wchar.  This 
 schizophrenia of type induction is very bad for D, and it's a good 
 argument of why strings should not simply be arrays.

For many algorithms on strings, iterating by char is preferred over dchar, even for multibyte strings.

Huh? Which ones?

Searching, for one.
 AFAIK, all of std.algorithm treats strings as ranges 
 of dchar.

Andrei posted elsewhere that there were specializations for strings to do it one way or the other based on which was more efficient.
Jul 20 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 Steven Schveighoffer wrote:
 On Tue, 20 Jul 2010 15:21:34 -0400, Walter Bright 
 <newshound2 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 The omission of dchar is on purpose.  Phobos has characterized 
 string as a bidirectional range of dchars.  For every range where I do:
  foreach(e; range)
  e is of the type of the range.  Except for char and wchar.  This 
 schizophrenia of type induction is very bad for D, and it's a good 
 argument of why strings should not simply be arrays.

For many algorithms on strings, iterating by char is preferred over dchar, even for multibyte strings.

Huh? Which ones?

Searching, for one.
 AFAIK, all of std.algorithm treats strings as ranges of dchar.

Andrei posted elsewhere that there were specializations for strings to do it one way or the other based on which was more efficient.

Boyer-Moore comes to mind.

Andrei
Jul 20 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 20 Jul 2010 11:02:57 -0400, Sean Kelly <sean invisibleduck.org>  
wrote:

 Steven Schveighoffer Wrote:
 How do we make this work?

 auto str = "hello world";
 foreach(c; str)
     assert(is(typeof(c) == dchar));

foreach (dchar c; str) assert(...);

This feature has been in D for years.

The omission of dchar is on purpose. Phobos has characterized string as a bidirectional range of dchars. For every range where I do:

foreach(e; range)

e is of the type of the range. Except for char and wchar. This schizophrenia of type induction is very bad for D, and it's a good argument of why strings should not simply be arrays.

-Steve
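The mismatch can be stated as compile-time checks. This is a sketch assuming a Phobos in which narrow strings are presented to range code as ranges of dchar, as described above.

```d
import std.range : ElementType, walkLength;

// The range view of a string yields dchar...
static assert(is(ElementType!string == dchar));
// ...while the array view of the same string yields char.
static assert(is(typeof("a"[0]) == immutable(char)));
// For other arrays the two views agree.
static assert(is(ElementType!(int[]) == int));

unittest
{
    string s = "h\u00E9llo";
    assert(s.length == 6);      // array view: code units
    assert(s.walkLength == 5);  // range view: decoded code points
}
```

A foreach with an inferred element type follows the array view, which is why it disagrees with ElementType for strings only.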
Jul 20 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 20 Jul 2010 15:21:34 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 The omission of dchar is on purpose.  Phobos has characterized string  
 as a bidirectional range of dchars.  For every range where I do:
  foreach(e; range)
  e is of the type of the range.  Except for char and wchar.  This  
 schizophrenia of type induction is very bad for D, and it's a good  
 argument of why strings should not simply be arrays.

For many algorithms on strings, iterating by char is preferred over dchar, even for multibyte strings.

Huh? Which ones? AFAIK, all of std.algorithm treats strings as ranges of dchar.

I am 100% in agreement with you that indexing and length should be done by char. All I'm talking about is foreach.

-Steve
Jul 20 2010
prev sibling next sibling parent "Rory McGuire" <rmcguire neonova.co.za> writes:
On Tue, 20 Jul 2010 01:51:51 +0200, Jesse Phillips  
<jessekphillips+d gmail.com> wrote:

 What about:

 struct String {
 	string items;
 	alias items this;
 }

 And add the needed functions you wish to have in string and it will  
 still work in existing functions that operate on immutable(char)[]

You shouldn't need to do that:

string strstr(string haystack, string needle);

can be used as:

string s;
s.strstr("needle");

so you can add "methods" to a string or whatever just by defining functions.

-Rory
Jul 20 2010
prev sibling next sibling parent "Rory McGuire" <rmcguire neonova.co.za> writes:
On Tue, 20 Jul 2010 16:08:06 +0200, Jesse Phillips  
<jesse.k.phillips gmail.com> wrote:

 But then you can't overload operators.

 On Tue, Jul 20, 2010 at 12:54 AM, Rory McGuire <rmcguire neonova.co.za>  
 wrote:
 On Tue, 20 Jul 2010 01:51:51 +0200, Jesse Phillips
 <jessekphillips+d gmail.com> wrote:

 What about:

 struct String {
        string items;
        alias items this;
 }

 And add the needed functions you wish to have in string and it will  
 still
 work in existing functions that operate on immutable(char)[]

You shouldn't need to do that:

string strstr(string haystack, string needle);

can be used as:

string s;
s.strstr("needle");

so you can add "methods" to a string or whatever just by defining functions.

-Rory


such as?
Jul 20 2010
prev sibling next sibling parent reply "Rory McGuire" <rmcguire neonova.co.za> writes:
On Tue, 20 Jul 2010 16:51:57 +0200, Rory McGuire <rmcguire neonova.co.za>  
wrote:

 On Tue, 20 Jul 2010 16:08:06 +0200, Jesse Phillips  
 <jesse.k.phillips gmail.com> wrote:

 But then you can't overload operators.

 On Tue, Jul 20, 2010 at 12:54 AM, Rory McGuire <rmcguire neonova.co.za>  
 wrote:
 On Tue, 20 Jul 2010 01:51:51 +0200, Jesse Phillips
 <jessekphillips+d gmail.com> wrote:

 What about:

 struct String {
        string items;
        alias items this;
 }

 And add the needed functions you wish to have in string and it will  
 still
 work in existing functions that operate on immutable(char)[]

You shouldn't need to do that:

string strstr(string haystack, string needle);

can be used as:

string s;
s.strstr("needle");

so you can add "methods" to a string or whatever just by defining functions.

-Rory


such as?

I mean is there not another way to do the same thing?
Jul 20 2010
parent Walter Bright <newshound2 digitalmars.com> writes:
Rory McGuire wrote:
 I'm using opera mail. Any suggestions for Linux+Windows, excluding 
 thunderbird(slow)?

I use thunderbird on both windows & linux, haven't noticed speed problems other than my slow internet connection. I have noticed that thunderbird uses multithreading fairly effectively to speed itself up.
Jul 20 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Rory McGuire <rmcguire neonova.co.za> wrote:

[snip]

Rory, is there something wrong with your newsreader? I keep seeing your
posts as replies only to the top post.

-- 
Simen
Jul 20 2010
prev sibling next sibling parent "Rory McGuire" <rmcguire neonova.co.za> writes:
On Tue, 20 Jul 2010 18:35:12 +0200, Simen kjaeraas  
<simen.kjaras gmail.com> wrote:

 Rory McGuire <rmcguire neonova.co.za> wrote:

 [snip]

 Rory, is there something wrong with your newsreader? I keep seeing your
 posts as replies only to the top post.

I'm using opera mail. Any suggestions for Linux+Windows, excluding thunderbird(slow)?
Jul 20 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisprog gmail.com> writes:
On Tuesday, July 20, 2010 11:45:41 Rory McGuire wrote:
 On Tue, 20 Jul 2010 18:35:12 +0200, Simen kjaeraas
 
 <simen.kjaras gmail.com> wrote:
 Rory McGuire <rmcguire neonova.co.za> wrote:
 
 [snip]
 
 Rory, is there something wrong with your newsreader? I keep seeing your
 posts as replies only to the top post.

I'm using opera mail. Any suggestions for Linux+Windows, excluding thunderbird(slow)?

Well, since I'm a KDE user, I use KNode if I want a newsreader and KMail if I want a mail client. I prefer KNode for dealing with newsgroups rather than using a mail list with KMail, but I do sometimes end up using KMail with mail lists rather than KNode because I can take advantage of IMAP and have stuff properly synced between my machines.

- Jonathan M Davis
Jul 20 2010
prev sibling parent Fawzi Mohamed <fawzi gmx.ch> writes:
I did not read all the discussion in detail, but in my opinion  
something that would be very useful in a library is

struct String{
	void *ptr;
	size_t _l;
	enum :size_t {
		MaskLen=((~cast(size_t)0)>>2)
	}
	enum :int {
		BitsLen=8*size_t.sizeof-2
	}
	size_t len(){
		return (_l & MaskLen);
	}
	int encodingId(){
		return cast(int)(_l>>BitsLen);
	}
}

plus stuff to simplify its creation from T[] arrays and getting T[]  
arrays from it.

This type would then be used where one wants a string without caring
about its encoding, and without having to make all string-accepting
functions templates.
As others have explained, many string operations are rather generic.
*This* is what I would have expected from string, not an alias to
char[].
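A rough sketch of what the creation and extraction helpers could look like. Everything beyond the fields and the two accessors above is an assumption: the names from and as, the encoding ids 0, 1 and 2 for char, wchar and dchar, and the switch to const(void)* so that immutable string data can be stored without a cast.

```d
struct String
{
    const(void)* ptr;   // assumption: const pointer, so immutable data fits
    size_t _l;          // low bits: length; top two bits: encoding id

    enum size_t MaskLen = (~cast(size_t) 0) >> 2;
    enum int BitsLen = 8 * size_t.sizeof - 2;

    size_t len() const { return _l & MaskLen; }
    int encodingId() const { return cast(int)(_l >> BitsLen); }

    // Hypothetical factory: packs the array length together with an id for T.
    static String from(T)(const(T)[] arr)
        if (is(T == char) || is(T == wchar) || is(T == dchar))
    {
        static if (is(T == char))
            enum size_t id = 0;
        else static if (is(T == wchar))
            enum size_t id = 1;
        else
            enum size_t id = 2;
        assert(arr.length <= MaskLen);
        return String(arr.ptr, arr.length | (id << BitsLen));
    }

    // Hypothetical accessor: the caller must pick the T matching encodingId;
    // this sketch performs no check.
    const(T)[] as(T)() const
    {
        return (cast(const(T)*) ptr)[0 .. len()];
    }
}

unittest
{
    auto s = String.from("abc"w);
    assert(s.len() == 3);
    assert(s.encodingId() == 1);
    assert(s.as!wchar() == "abc"w);
}
```

A function taking String can then dispatch on encodingId at run time instead of being a template.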

Fawzi
	
Jul 20 2010