digitalmars.D.learn - Trying to alias this a grapheme range + making it a forward range

reply aliak <something something.com> writes:
Problem 1:

I'm trying to get a string to behave as a .byGrapheme range by 
default, but I can't figure out Grapheme. I'm trying to replicate 
this behavior:

foreach (g; "hello".byGrapheme) {
     write(g[]);
}

In a custom type:

struct ustring {
     string data;
     this(string data) {
     	this.data = data;
     }
     auto get() {
         static struct Range {
             typeof(string.init.byGrapheme) source;
             bool empty() { return source.empty; }
             void popFront() { source.popFront; }
             auto front() { return source.front[]; }
             auto save() { return this; };
         }
         return Range(this.data.byGrapheme);
     }
     alias get this;
}

But I keep on ending up with a UTFException: "Encoding an invalid 
code point in UTF-8" with code like:

writeln("hello".ustring);

Problem 2:

How can I get the aliased ustring type to behave as a 
ForwardRange? If I add the save method to the Voldemort range 
type, isForwardRange!ustring fails, because isForwardRange 
requires save to return the same type it's called on. That is 
not the case here, since typeof(ustring.save) is 
ustring.get.Range, even though ustring nonetheless does have a 
save method.

Cheers,
- Ali
Jul 08 2019
parent reply ag0aep6g <anonymous example.com> writes:
On 08.07.19 23:55, aliak wrote:
 struct ustring {
      string data;
      this(string data) {
          this.data = data;
      }
      auto get() {
          static struct Range {
              typeof(string.init.byGrapheme) source;
              bool empty() { return source.empty; }
              void popFront() { source.popFront; }
              auto front() { return source.front[]; }
              auto save() { return this; };
          }
          return Range(this.data.byGrapheme);
      }
      alias get this;
 }
 
 But I keep on ending up with a UTFException: "Encoding an invalid code 
 point in UTF-8" with code like:
 
 writeln("hello".ustring);
`source.front` is a temporary `Grapheme` and you're calling `opSlice` on it. The documentation for `Grapheme.opSlice` warns: "Invalidates when this Grapheme leaves the scope, attempts to use it then would lead to memory corruption." [1]

So you can't return `source.front[]` from your `front`. You'll have to store the current `front` in your struct, I guess.

Also, returning a fresh range on every `alias this` call is asking for trouble. This is an infinite loop:

    auto u = "hello".ustring;
    while (!u.empty) u.popFront();

because `u.empty` and `u.popFront` are called on fresh, non-empty, independent ranges.
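A minimal sketch of one way around the lifetime problem, assuming it is acceptable for `front` to allocate: instead of returning a slice that points into a `Grapheme`, copy the code points into an owned `dchar[]` while the `Grapheme` is still in scope. This is an alternative to storing the `Grapheme` in the struct, not necessarily what the original poster ended up with. `GraphemeStrings` is a hypothetical name used only for illustration.

```d
import std.array : array;
import std.stdio : writeln;
import std.uni : byGrapheme;

// Hypothetical wrapper type, named for illustration only.
struct GraphemeStrings {
    typeof(string.init.byGrapheme) source;

    @property bool empty() { return source.empty; }
    void popFront() { source.popFront(); }

    @property dchar[] front() {
        // Bind the Grapheme to a named local so it outlives the slice below.
        auto g = source.front;
        // Copy the code points into a GC-owned array before g leaves scope.
        return g[].array;
    }
}

void main() {
    // Each element is an independent dchar[] holding one grapheme.
    foreach (s; GraphemeStrings("hello".byGrapheme))
        writeln(s);
}
```

The copy costs one allocation per grapheme, but the returned array no longer depends on the lifetime of any `Grapheme`.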
 Problem 2:
 
 How can I get the aliased ustring type to behave as a ForwardRange? If I 
 add the save method to the Voldemort range type, 
 isForwardRange!ustring fails, because isForwardRange requires save to 
 return the same type it's called on. That is not the case here, since 
 typeof(ustring.save) is ustring.get.Range, even though ustring 
 nonetheless does have a save method.
You must provide a `save` that returns a `ustring`. There's no way around it. Maybe make `ustring` itself the range. In the code you've shown, the `alias this` only seems to make everything more complicated. But you might have good reasons for it, of course.

By the way, you're not calling `source.save` in `Range.save`. You're just copying `source`. I don't know if that's effectively the same, and even if it is, I'd advise calling `.save` explicitly. Better safe than sorry.

[1] https://dlang.org/phobos/std_uni.html#.Grapheme.opSlice
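As a rough sketch of the "make `ustring` itself the range" suggestion: the struct below drops `alias this`, keeps the grapheme range as a member, and gives `save` a return type of `ustring`, calling `source.save` explicitly. The choice to have `front` return an owned `dchar[]` copy is an assumption made here to avoid returning a slice of a temporary `Grapheme`; it is not something the thread prescribes.

```d
import std.array : array;
import std.range.primitives : isForwardRange;
import std.uni : byGrapheme;

struct ustring {
    private typeof(string.init.byGrapheme) source;

    this(string data) {
        source = data.byGrapheme;
    }

    @property bool empty() { return source.empty; }
    void popFront() { source.popFront(); }

    @property dchar[] front() {
        auto g = source.front; // named local, so the slice below stays valid
        return g[].array;      // owned copy of the grapheme's code points
    }

    // Returns ustring, which is what isForwardRange checks for.
    @property ustring save() {
        ustring copy;
        copy.source = source.save; // explicit save of the wrapped range
        return copy;
    }
}

static assert(isForwardRange!ustring);

void main() {
    auto u = ustring("hello");
    auto checkpoint = u.save;
    u.popFront();
    // Advancing u leaves the saved copy where it was.
    assert(u.front == "e"d);
    assert(checkpoint.front == "h"d);
}
```

Because the range state lives in the `ustring` value itself, `empty` and `popFront` operate on the same range, so a `while (!u.empty) u.popFront();` loop terminates.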
Jul 08 2019
parent aliak <something something.com> writes:
On Monday, 8 July 2019 at 23:01:49 UTC, ag0aep6g wrote:
 On 08.07.19 23:55, aliak wrote:
 [...]
`source.front` is a temporary `Grapheme` and you're calling `opSlice` on it. The documentation for `Grapheme.opSlice` warns: "Invalidates when this Grapheme leaves the scope, attempts to use it then would lead to memory corruption." [1]
Ah. Right. Thanks!
 [...]
hah yes, I realized this as well.
 [...]
No you're right. It was indeed just making things more complicated and was just a bad idea.
 [...]
Cheers, - Ali
Jul 09 2019