
digitalmars.D - Class-based string library

Okay, I got sick of dealing with UTF-8 :-)
I've got a proof-of-concept of a class-based String with a bunch of 
operations. All data manipulation is character-based: you manipulate 
Unicode code points (characters) and don't worry about encodings.
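
To make the code unit versus code point distinction concrete, here is a minimal sketch. It uses only std.utf from Phobos and assumes a current D compiler; it is not the String class itself, and the helper names codeUnits and codePoints are made up for illustration.

```d
import std.utf : toUTF32;

// Hypothetical helper: how many UTF-8 code units (bytes) the text occupies.
size_t codeUnits(string s)
{
    return s.length;
}

// Hypothetical helper: how many Unicode code points the text contains.
// Decoding to UTF-32 gives exactly one dchar per code point.
size_t codePoints(string s)
{
    return toUTF32(s).length;
}

void main()
{
    string word = "na\u00EFve"; // U+00EF takes two bytes in UTF-8

    assert(codeUnits(word) == 6);  // what indexing a char[] sees
    assert(codePoints(word) == 5); // what a character-based API sees

    // Index 2 of the decoded form is the whole character, not half of it.
    assert(toUTF32(word)[2] == '\u00EF');
}
```

Indexing the raw char[] at position 2 would return only the first byte of the two-byte sequence, which is the kind of bookkeeping a character-based String is meant to hide.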
It's fairly slow at the moment: the proof-of-concept version stores 
strings internally as dchar arrays, and it hasn't been optimised. 
Barring any killer bugs, it should be usable anywhere that string 
performance isn't a bottleneck.
It's probably only useful to you if you're not entirely happy with D's 
strings and/or arrays.
I plan to do some optimisations and write a backend that uses UTF-8 
(internally only), which should be faster (I hope).
Interaction with libraries should be easy: going from char[]/wchar[]/dchar[] 
to String is just String(data) (or String.valueOf(data)), and going from 
String back to array form is s.toUTF8(), s.toUTF16() or s.toUTF32().
I'll write up a pretty-looking example sometime that isn't 3am ;-)
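
In the meantime, the round trip described above might look roughly like the following. This is a stand-in written for illustration, not the code in string.d: only the names String, valueOf and toUTF8/16/32 come from the post, while the constructors, length and opIndex are assumptions, and the sketch targets a current D compiler.

```d
static import std.utf; // fully qualified calls avoid clashing with the method names

// Hypothetical stand-in for the String class described in the post.
class String
{
    private dstring data; // one dchar per code point, like the proof-of-concept backend

    this(const(char)[] s)  { data = std.utf.toUTF32(s); }
    this(const(wchar)[] s) { data = std.utf.toUTF32(s); }
    this(const(dchar)[] s) { data = s.idup; }

    static String valueOf(const(char)[] s) { return new String(s); }

    // Length and indexing count code points, not code units (assumed API).
    size_t length() { return data.length; }
    dchar opIndex(size_t i) { return data[i]; }

    // Conversions back to plain arrays for library calls.
    string  toUTF8()  { return std.utf.toUTF8(data); }
    wstring toUTF16() { return std.utf.toUTF16(data); }
    dstring toUTF32() { return data; }
}

void main()
{
    string raw = "na\u00EFve";       // 6 UTF-8 code units, 5 code points
    auto s = String.valueOf(raw);

    assert(s.length == 5);
    assert(s[2] == '\u00EF');
    assert(s.toUTF8() == raw);       // round trip back to UTF-8
    assert(s.toUTF16().length == 5); // no surrogate pairs needed here
    assert(s.toUTF32().length == 5);
}
```

Storing dchar makes indexing trivial at the cost of four bytes per character, which fits the note above that this backend is simple but slow.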

A simple reference is here:
http://tunah.net/~tunah/d-string/doc.txt
And the code is here:
http://tunah.net/~tunah/d-string/string.d

If you try it, let me know what you think or any suggestions you have.
Sam
Jun 30 2004