digitalmars.D - "bstring"
- Michel Fortin (10/10) Apr 05 2010 Lately I've been using the type "immutable(ubyte)[]" a lot to pass
- Justin Spahr-Summers (11/20) Apr 05 2010 I use it quite a lot too, but I'm not sure if making it (effectively) a
- Ali Çehreli (5/9) Apr 06 2010 If by "character" you mean "code unit", yes.
- BCS (5/15) Apr 06 2010 I think that's what the "not strictly enforced by D" part was about. Tru...
- Michel Fortin (10/26) Apr 06 2010 It may not be strictly enforced, but std.range now iterates on code
- Justin Spahr-Summers (5/19) Apr 06 2010 Sorry, yes. I'm not very familiar with Unicode terminology, but I do
Lately I've been using the type "immutable(ubyte)[]" a lot to pass around binary data of various kinds. In a couple of places now, to save some typing, I'm using this alias:

    alias immutable(ubyte)[] bstring;

Would that make a worthy addition to the other standard string formats defined in object.o? Or am I the only one who is using this type a lot?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 05 2010
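A minimal sketch of how the proposed alias might be used alongside mutable byte buffers. Only the alias declaration comes from the post above; the helper names `freeze` and `byteSum` are hypothetical and exist purely for illustration.

```d
// The alias exactly as proposed in the post.
alias immutable(ubyte)[] bstring;

// Hypothetical helper: take an immutable snapshot of a (possibly mutable) buffer.
bstring freeze(const(ubyte)[] data)
{
    return data.idup;
}

// Hypothetical helper: a bstring is consumed like any other immutable slice.
uint byteSum(bstring data)
{
    uint total = 0;
    foreach (b; data)
        total += b;
    return total;
}

unittest
{
    // The alias introduces no new type, only a shorter name.
    static assert(is(bstring == immutable(ubyte)[]));

    ubyte[] buf = [0xDE, 0xAD, 0xBE, 0xEF];
    bstring frozen = freeze(buf);

    buf[0] = 0;                 // mutating the source leaves the snapshot intact
    assert(frozen[0] == 0xDE);
    assert(byteSum(frozen) == 0xDE + 0xAD + 0xBE + 0xEF);
}
```

The block has no `main`; something like `dmd -main -unittest -run file.d` should exercise it.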
On Mon, 5 Apr 2010 18:48:58 -0400, Michel Fortin <michel.fortin michelf.com> wrote:
> Lately I've been using the type "immutable(ubyte)[]" a lot to pass around binary data of various kinds. In a couple of places now, to save some typing, I'm using this alias:
>
>     alias immutable(ubyte)[] bstring;
>
> Would that make a worthy addition to the other standard string formats defined in object.o? Or am I the only one who is using this type a lot?

I use it quite a lot too, but I'm not sure if making it (effectively) a language keyword is the right approach. I mean, I use mutable byte strings probably just as often.

The fact that ubytes are really just arbitrary data I think somewhat diminishes the usefulness of a keyword; to compare, 'string' to me represents a contiguous run of valid *characters* (i.e., the data has meaning and representation in and of itself)... not strictly enforced by D, of course, but that's how the type is used.

Apologies if this came out rather disjointed.
Apr 05 2010
Justin Spahr-Summers wrote:
> 'string' to me represents a contiguous run of valid *characters* (i.e., the data has meaning and representation in and of itself)... not strictly enforced by D, of course, but that's how the type is used.

If by "character" you mean "code unit", yes. string characters are UTF-8 code units in D and have meanings by themselves only if they are one-byte UTF-8 sequences.

Ali
Apr 06 2010
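To make the code-unit point concrete, here is a small sketch. The function name `countSingleByteUnits` is hypothetical; the test string uses an escape for U+00E9 so the result does not depend on the source file's encoding.

```d
// Hypothetical helper: count the code units that are complete one-byte
// UTF-8 sequences (values below 0x80), i.e. the only units that mean
// something on their own.
size_t countSingleByteUnits(string s)
{
    size_t n = 0;
    foreach (char c; s)     // a `char` loop variable visits UTF-8 code units
    {
        if (c < 0x80)
            ++n;
    }
    return n;
}

unittest
{
    string s = "caf\u00E9";             // U+00E9 is encoded as 0xC3 0xA9

    assert(s.length == 5);              // .length counts code units, not characters
    assert(countSingleByteUnits(s) == 3);

    // Indexing yields individual code units; neither of these is a
    // character by itself.
    assert(s[3] == 0xC3);
    assert(s[4] == 0xA9);
}
```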
Hello Ali,

> Justin Spahr-Summers wrote:
>
>> 'string' to me represents a contiguous run of valid *characters* (i.e., the data has meaning and representation in and of itself)... not strictly enforced by D, of course, but that's how the type is used.
>
> If by "character" you mean "code unit", yes. string characters are UTF-8 code units in D and have meanings by themselves only if they are one-byte UTF-8 sequences.

I think that's what the "not strictly enforced by D" part was about. True or not, people often assume that a string is valid UTF-8 of some kind.

--
... <IXOYE><
Apr 06 2010
On 2010-04-06 17:10:25 -0400, BCS <none anon.com> said:
> Hello Ali,
>
>> Justin Spahr-Summers wrote:
>>
>>> 'string' to me represents a contiguous run of valid *characters* (i.e., the data has meaning and representation in and of itself)... not strictly enforced by D, of course, but that's how the type is used.
>>
>> If by "character" you mean "code unit", yes. string characters are UTF-8 code units in D and have meanings by themselves only if they are one-byte UTF-8 sequences.
>
> I think that's what the "not strictly enforced by D" part was about. True or not, people often assume that a string is valid UTF-8 of some kind.

It may not be strictly enforced, but std.range now iterates on code points instead of code units, making 'string' not very practical to use as a range when you need to iterate over UTF-8 code units (bytes), or with other text encodings. "bstring" is more appropriate for those cases.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 06 2010
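The difference between the two views can be sketched as follows, assuming a Phobos version in which the range primitives decode narrow strings into code points. The function names are hypothetical, and `std.string.representation` is a Phobos function that postdates this thread; it is used here only as a convenient way to obtain the `immutable(ubyte)[]` view.

```d
import std.range : front, walkLength;
import std.string : representation;

// Hypothetical helper: length as the range primitives see a string,
// i.e. the number of decoded code points.
size_t codePoints(string s)
{
    return s.walkLength;
}

// Hypothetical helper: length of the raw byte view of the same data.
size_t codeUnits(string s)
{
    immutable(ubyte)[] bytes = s.representation;
    return bytes.walkLength;
}

unittest
{
    string s = "caf\u00E9";             // U+00E9 takes two UTF-8 code units

    // As a range, a string yields decoded dchar elements...
    static assert(is(typeof(s.front) == dchar));
    assert(codePoints(s) == 4);

    // ...while the byte view yields one element per code unit.
    static assert(is(typeof(s.representation.front) : ubyte));
    assert(codeUnits(s) == 5);
    assert(s.representation[$ - 1] == 0xA9);
}
```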
On Tue, 06 Apr 2010 11:50:36 -0700, Ali Çehreli <acehreli yahoo.com> wrote:
> Justin Spahr-Summers wrote:
> > 'string' to me represents a contiguous run of valid
> > *characters* (i.e., the data has meaning and representation in and of
> > itself)... not strictly enforced by D, of course, but that's how the
> > type is used.
>
> If by "character" you mean "code unit", yes. string characters are UTF-8 code units in D and have meanings by themselves only if they are one-byte UTF-8 sequences.
>
> Ali

Sorry, yes. I'm not very familiar with Unicode terminology, but I do know that strings don't always contain valid Unicode sequences, and that's what I meant.
Apr 06 2010