digitalmars.D - rvalue types

Simen Kjærås <simen.kjaras gmail.com> writes:
There's been a discussion[0] over in D.learn about a library-only 
implementation of properties. One of the ideas mentioned there is 
rvalue types - types that automatically decay into a different 
type when passed to a function or assigned to a variable. They 
are useful for property wrappers like this, and when you want to 
perform a series of operations on something before performing 
some final step.

An example of the latter is `a ^^ b % c` for BigInts, where the 
naïve way would be horribly inefficient, and there's a much 
better way of doing it, which requires more knowledge of the 
operations involved. If `a ^^ b` returned an rvalue type that 
either decays to a regular BigInt or acts as the LHS in `tmp % 
c`, it would have the necessary information and be able to do the 
right thing.

Another use case is a chain of operations, e.g. fluent 
initialization:

Widget.create()
     .width(35)
     .height(960)
     .data(readData())
     .Done();

Where in current D the Done() step needs to be explicit, an 
rvalue type would automatically call Done when the result is 
assigned to a variable or passed to a function.

The problem with such a set of types, of course, is that 
`typeof(functionThatReturnsRvalueType())` will be different from 
`typeof((){ auto t = functionThatReturnsRvalueType(); return 
t;}())`, and that bleeds into documentation. It may also be 
confusing that `return a ^^ b % c;` is much faster than `auto tmp 
= a ^^ b; return tmp % c;`.

An example of how they would work:

struct ModularExponentiationTemporary {
     BigInt lhs, rhs;

     @rvalue // Or however one would mark it as such.
     alias get this;

     BigInt get() {
         return pow(lhs, rhs);
     }

     // The temporary is the left operand of `tmp % c`, so opBinary applies.
     BigInt opBinary(string op : "%")(BigInt mod) {
         return modularPow(lhs, rhs, mod);
     }
}

unittest {
     BigInt b = 4;
     BigInt e = 13;
     BigInt m = 497;

     // b ^^ e returns a ModularExponentiationTemporary,
     // and its opBinary() is immediately invoked.
     auto fast = b ^^ e % m;

     assert(is(typeof(fast) == BigInt));
     assert(fast == 445);

     // b ^^ e returns a ModularExponentiationTemporary,
     // and its get() method is immediately invoked.
     auto slowTmp = b ^^ e;
     auto slow = slowTmp % m;

     assert(is(typeof(slowTmp) == BigInt));
     assert(is(typeof(slow) == BigInt));
     assert(slow == 445);
}

Is this an interesting concept? Are there other use cases I 
haven't covered? Can this be done with existing language 
features? Are there problems I haven't foreseen?

--
   Simen

[0]: https://forum.dlang.org/post/mqveusvzkmkshrzwsgjy@forum.dlang.org
Mar 12 2018
Shachar Shemesh <shachar weka.io> writes:
I'll just point out that the C++ name for this is "Proxy classes". 
Maybe, for the sake of reducing confusion, it might be a good idea to 
adopt that.

Shachar

Mar 12 2018
Simen Kjærås <simen.kjaras gmail.com> writes:
On Monday, 12 March 2018 at 14:19:17 UTC, Shachar Shemesh wrote:
 I'll just point out that the C++ name for this is "Proxy 
 classes". Maybe, for the sake of reducing confusion, it might 
 be a good idea to adopt that.

The main idea behind rvalue types, and their name, is that they are types that can only ever be rvalues, not by convention but through the type system. I agree that most rvalue types will be proxy types, but that is not necessarily the case, nor is a proxy type necessarily an rvalue type. I don't want to conflate the two when discussing the 'rvalue-only' aspect.

--
   Simen
Mar 12 2018
Shachar Shemesh <shachar weka.io> writes:
On 12/03/18 16:31, Simen Kjærås wrote:
 The main idea behind rvalue types and their name, is that they are types 
 that can only ever be rvalues, not by convention, but through the type 
 system.

How do you prevent creating a named instance of them using either auto or ReturnType!func?

Shachar
Mar 12 2018
Simen Kjærås <simen.kjaras gmail.com> writes:
On Monday, 12 March 2018 at 18:04:07 UTC, Shachar Shemesh wrote:
 On 12/03/18 16:31, Simen Kjærås wrote:
 The main idea behind rvalue types and their name, is that they 
 are types that can only ever be rvalues, not by convention, 
 but through the type system.
 How do you prevent creating a named instance of them using either auto or ReturnType!func ?

You create a new language feature that explicitly forbids this. I'm not saying this is possible in the language now; I'm saying it's a thought I've entertained for a while, and I wonder whether it would be a reasonable addition to the language.

--
   Simen
Mar 12 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 12, 2018 at 01:59:59PM +0000, Simen Kjærås via Digitalmars-d wrote:
 There's been a discussion[0] over in D.learn about a library-only
 implementation of properties. One of the ideas mentioned there is
 rvalue types - types that automatically decay into a different type
 when passed to a function or assigned to a variable. They are useful
 for property wrappers like this, and when you want to perform a series
 of operations on something before performing some final step.
 
 An example of the latter is `a ^^ b % c` for BigInts, where the naïve
 way would be horribly inefficient, and there's a much better way of
 doing it, which requires more knowledge of the operations involved. If
 `a ^^ b` returned an rvalue type that either decays to a regular
 BigInt or acts as the LHS in `tmp % c`, it would have the necessary
 information and be able to do the right thing.
I suspect the current language already supports this, or is 90% of the way there and just needs some small concessions on syntax. For example, today you can already make opBinary() return something other than the parent type, and use alias this to make it decay to the parent type. Of course, this requires the caller to write `BigInt x = a^^b % c` rather than `auto x = a^^b % c`, but I think that's a minor inconvenience.
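
A minimal sketch of that approach in current D might look like the following. The names `PowTemporary`, `lazyPow` and `modPow` are hypothetical and not part of std.bigint, and a free function stands in for `^^` because BigInt's own operators cannot be changed from outside the type.

```d
import std.bigint : BigInt;

// Hypothetical helper: square-and-multiply modular exponentiation.
BigInt modPow(BigInt base, BigInt exponent, BigInt modulus)
{
    BigInt result = 1;
    base %= modulus;
    while (exponent > 0)
    {
        if (exponent % 2 == 1)
            result = (result * base) % modulus;
        base = (base * base) % modulus;
        exponent /= 2;
    }
    return result;
}

// Hypothetical intermediate type standing in for the result of `a ^^ b`.
struct PowTemporary
{
    BigInt base, exponent;

    // Fallback: compute the full power when a plain BigInt is required.
    BigInt get()
    {
        BigInt result = 1;
        for (BigInt i = 0; i < exponent; ++i)
            result *= base;
        return result;
    }
    alias get this;

    // `tmp % m` sees base, exponent and modulus together.
    BigInt opBinary(string op : "%")(BigInt modulus)
    {
        return modPow(base, exponent, modulus);
    }
}

// Free function used in place of an overloaded `^^`.
PowTemporary lazyPow(BigInt base, BigInt exponent)
{
    return PowTemporary(base, exponent);
}

unittest
{
    BigInt b = 4;
    BigInt e = 13;
    BigInt m = 497;

    // Fast path: opBinary!"%" on the temporary calls modPow directly.
    BigInt fast = lazyPow(b, e) % m;
    assert(fast == 445);

    // Decay: the explicit BigInt type triggers alias this, i.e. get().
    BigInt full = lazyPow(b, e);
    assert(full == 67_108_864);
}
```

With `auto` instead of `BigInt` in the last declaration, the variable would keep the temporary's type, which is the concession mentioned above.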
 Another use case is a chain of operations, e.g. fluent initialization:
 
 Widget.create()
     .width(35)
     .height(960)
     .data(readData())
     .Done();
 
 Where in current D the Done() step needs to be explicit, an rvalue
 type would automatically call Done when the result is assigned to a
 variable or passed to a function.
[...]

Not necessarily, if you make the concession that the caller has to explicitly assign the result to a Widget, say:

    Widget w = Widget.create()  // returns WidgetBuilder, say
        .width(35)              // modifies WidgetBuilder
        .height(960)            // ditto
        .data(readData());      // implicitly converts to Widget

You could just have WidgetBuilder alias this itself to a widget:

    struct WidgetBuilder {
        ... // stuff
        @property Widget done() { return new Widget(...); }
        alias done this;
    }

Then it will work the way you want.

T

--
You have to expect the unexpected. -- RL
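
For reference, a compilable version of this builder could look roughly like the sketch below. The `Widget` class, its two fields and the builder's private members are assumptions made for illustration; the thread never shows the real `Widget`, and the `data(readData())` step is left out.

```d
// Hypothetical widget type with just two properties.
class Widget
{
    int width, height;

    this(int width, int height)
    {
        this.width = width;
        this.height = height;
    }

    // Entry point of the fluent chain.
    static WidgetBuilder create()
    {
        return WidgetBuilder();
    }
}

struct WidgetBuilder
{
    private int w, h;

    // Each setter returns the builder by value so calls can be chained.
    WidgetBuilder width(int value)  { w = value; return this; }
    WidgetBuilder height(int value) { h = value; return this; }

    // The final step, run implicitly through alias this.
    Widget done()
    {
        return new Widget(w, h);
    }
    alias done this;
}

unittest
{
    // The explicit Widget type triggers the conversion; no .done() call.
    Widget widget = Widget.create().width(35).height(960);
    assert(widget.width == 35 && widget.height == 960);
}
```

Declaring the variable with `auto` would instead store the `WidgetBuilder` itself, so the final step would only run later, whenever a `Widget` is actually required.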
Mar 12 2018
Nick Treleaven <nick geany.org> writes:
On Monday, 12 March 2018 at 16:51:06 UTC, H. S. Teoh wrote:
 For example, today you can already make opBinary() return 
 something other than the parent type, and use alias this to 
 make it decay to the parent type.
Sounds like this: https://en.wikipedia.org/wiki/Expression_templates
Mar 12 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 12, 2018 at 05:46:25PM +0000, Nick Treleaven via Digitalmars-d wrote:
 On Monday, 12 March 2018 at 16:51:06 UTC, H. S. Teoh wrote:
 For example, today you can already make opBinary() return something
 other than the parent type, and use alias this to make it decay to
 the parent type.
 Sounds like this: https://en.wikipedia.org/wiki/Expression_templates

Yep, pretty much.

T

--
I'm still trying to find a pun for "punishment"...
Mar 12 2018
Simen Kjærås <simen.kjaras gmail.com> writes:
On Monday, 12 March 2018 at 16:51:06 UTC, H. S. Teoh wrote:
 I suspect the current language already supports this, or is 90% 
 of the way there and just needs some small concessions on 
 syntax.  For example, today you can already make opBinary() 
 return something other than the parent type, and use alias this 
 to make it decay to the parent type. Of course, this requires 
 the caller to write `BigInt x = a^^b % c` rather than `auto x = 
 a^^b % c`, but I think that's a minor inconvenience.

I mostly agree, but it can play havoc with generic code. On the other hand, if we don't think carefully through how they should work, so would rvalue types.

--
   Simen
Mar 12 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 12, 2018 at 06:46:56PM +0000, Simen Kjærås via Digitalmars-d wrote:
 On Monday, 12 March 2018 at 16:51:06 UTC, H. S. Teoh wrote:
 I suspect the current language already supports this, or is 90% of
 the way there and just needs some small concessions on syntax.  For
 example, today you can already make opBinary() return something
 other than the parent type, and use alias this to make it decay to
 the parent type. Of course, this requires the caller to write
 `BigInt x = a^^b % c` rather than `auto x = a^^b % c`, but I think
 that's a minor inconvenience.
 I mostly agree, but it can play havoc on generic code. On the other hand, if we don't think carefully through how they should work, so would rvalue types.

[...]

Actually, I'm even wondering if allowing the assignment of an intermediate type might not necessarily be a bad thing. Suppose BigInt.opBinary returns some intermediate type, like BigIntIntermediate, that implicitly converts to BigInt via alias this. If you write:

    BigInt a, b, c;
    auto x = a*b + c;

then x will be a BigIntIntermediate instead of a BigInt. But is that really so bad? It can participate in further BigInt operations, returning more instances of BigIntIntermediate, and only when you actually try to do something to it, like assign it to a BigInt variable or return it from a function with BigInt return type, will the implicit conversion (and presumably the actual computation) happen.

I'd say this is even a *good* thing, because then:

    BigInt x = a*b + c;

will actually be equivalent to:

    auto tmp = a*b;
    BigInt x = tmp + c;

and the latter will actually have the same optimizations as the single-expression case!

So essentially, BigIntIntermediate becomes a lazy type that only performs the computation when it's actually necessary, and in the meantime it can accumulate knowledge of what's being computed so that it can optimize expensive operations.

T

--
The richest man is not he who has the most, but he who needs the least.
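
The lazy-intermediate idea can be sketched with a small stand-in type. Everything here is hypothetical: `Num` replaces BigInt to keep the example short, `Product` plays the role of BigIntIntermediate, and the `usedFusedPath` flag exists only to show which code path ran.

```d
// Hypothetical stand-in for BigInt so the sketch stays short.
struct Num
{
    long value;

    // Multiplication is deferred: it returns a description of the product.
    Product opBinary(string op : "*")(Num rhs)
    {
        return Product(this, rhs);
    }

    Num opBinary(string op : "+")(Num rhs)
    {
        return Num(value + rhs.value);
    }
}

// Plays the role of BigIntIntermediate.
struct Product
{
    Num lhs, rhs;

    // Records whether the fused path ran; for illustration only.
    static bool usedFusedPath;

    // Decay to Num on demand, performing the multiplication only then.
    Num get()
    {
        return Num(lhs.value * rhs.value);
    }
    alias get this;

    // `product + c` sees all three operands and could pick a fused algorithm.
    Num opBinary(string op : "+")(Num addend)
    {
        usedFusedPath = true;
        return Num(lhs.value * rhs.value + addend.value);
    }
}

unittest
{
    auto a = Num(3);
    auto b = Num(4);
    auto c = Num(5);

    auto tmp = a * b;   // a Product; nothing has been multiplied yet
    static assert(is(typeof(tmp) == Product));

    Num x = tmp + c;    // the stored intermediate still reaches the fused path
    assert(x.value == 17);
    assert(Product.usedFusedPath);

    Num y = a * b;      // decays through alias this
    assert(y.value == 12);
}
```

Splitting the expression across two statements does not lose the optimization here, because the stored intermediate keeps both operands until a `Num` is actually needed.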
Mar 12 2018
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Monday, 12 March 2018 at 19:03:10 UTC, H. S. Teoh wrote:
 On Mon, Mar 12, 2018 at 06:46:56PM +0000, Simen Kjærås via 
 Digitalmars-d wrote:
 On Monday, 12 March 2018 at 16:51:06 UTC, H. S. Teoh wrote:
 I suspect the current language already supports this, or is 
 90% of the way there and just needs some small concessions 
 on syntax.  For example, today you can already make 
 opBinary() return something other than the parent type, and 
 use alias this to make it decay to the parent type. Of 
 course, this requires the caller to write `BigInt x = a^^b % 
 c` rather than `auto x = a^^b % c`, but I think that's a 
 minor inconvenience.
 I mostly agree, but it can play havoc on generic code. On the other hand, if we don't think carefully through how they should work, so would rvalue types.
 [...] Actually, I'm even wondering if allowing the assignment of an intermediate type might not necessarily be a bad thing. Suppose BigInt.opBinary returns some intermediate type, like BigIntIntermediate, that implicitly converts to BigInt via alias this. If you write:

     BigInt a, b, c;
     auto x = a*b + c;

 then x will be a BigIntIntermediate instead of a BigInt. But is that really so bad?

When combined with a few range primitives, exponential template bloat is inevitable.

While writing Pry I soon came to realize that using types to store information is a dead end. They are incredibly brittle, esp. once you start optimizing on them using operations such as T1 equivalent to T2, where T1 != T2. And slow. Did I mention they are slow? And ofc 64kbyte symbols that make no sense anyway.

A better approach for cases beyond a handful of operators is so-called “staging” - 2-stage computation, where you first build a blueprint of the operation using CTFE. Secondly you “instantiate it” and apply it an arbitrary number of times. Optimization opportunities in between those stages are remarkable and much easier to grasp compared to the “type dance”.

For instance:

    enum Expr!double bluePrint = factor!"a" ^^ factor!"b" % factor!"c";

Where that Expr is e.g. a class instance that holds the AST of the operation as plain values.

Now usage:

    // here we do optimizations and CTFE-based codegen
    alias powMod = bluePrint.instantiate;

    powMod(a,b,c); // use as many times as needed

Notation could be improved by using the same expression template idea but with polymorphic types at CTFE.

Another thing is partial specialization:

    alias squareMod = bluePrint.assume(["b": 2]).instantiate;

Now:

    squareMod(a,c); // should be faster than the elaborate algorithm
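
A heavily simplified sketch of the two stages, under several assumptions: the blueprint is a plain struct holding the expression as text rather than a class-based AST, it is applied to `long` values with `*` and `+` instead of BigInt with `^^` and `%`, and the optimization pass between the stages is omitted. `Expr`, `factor`, `instantiate` and `mulAdd` are illustrative names only, not Pry's API.

```d
// Hypothetical blueprint node: the expression is held as plain data.
struct Expr
{
    string text; // source form of the expression built so far

    // Combining blueprints only records the operation; nothing is computed.
    Expr opBinary(string op)(Expr rhs) const
    {
        return Expr("(" ~ text ~ " " ~ op ~ " " ~ rhs.text ~ ")");
    }
}

// A named placeholder, bound to an argument when the blueprint is applied.
Expr factor(string name)()
{
    return Expr(name);
}

// Stage 1: the blueprint is built once, at compile time (CTFE).
enum Expr bluePrint = factor!"a"() * factor!"b"() + factor!"c"();

// Stage 2: generate an ordinary function from the blueprint via a mixin.
// An optimizer pass over the blueprint would slot in before this step.
long instantiate(Expr e)(long a, long b, long c)
{
    return mixin(e.text);
}

alias mulAdd = instantiate!bluePrint;

unittest
{
    static assert(bluePrint.text == "((a * b) + c)");
    assert(mulAdd(3, 4, 5) == 17); // use as many times as needed
    assert(mulAdd(2, 2, 2) == 6);
}
```

Because the blueprint is ordinary data, a rewrite step could inspect and transform it with normal CTFE code before `instantiate` runs, instead of pattern-matching on nested template types.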
Mar 12 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Mar 13, 2018 at 06:51:01AM +0000, Dmitry Olshansky via Digitalmars-d wrote:
[...]
 While writing Pry I soon came to realize that using types to store
 information is a dead end. They are incredibly brittle, esp. once you
 start optimizing on them using operations such as T1 equivalent to T2,
 where T1 != T2.
 And slow. Did I mention they are slow? And ofc 64kbyte symbols that
 make no sense anyway.
 
 A better approach for cases beyond a handful of operators is so-called
 “staging” - 2-stage computation, where you first build a blueprint of
 the operation using CTFE. Secondly you “instantiate it” and apply it an
 arbitrary number of times. Optimization opportunities in between those
 stages are remarkable and much easier to grasp compared to the “type
 dance”.
 
 For instance:
 
 enum Expr!double bluePrint = factor!"a" ^^ factor!"b" % factor!"c";
 
 Where that Expr is e.g. a class instance that holds the AST of the
 operation as plain values.
 
 Now usage:
 
 // here we do optimizations and CTFE-based codegen
 alias powMod = bluePrint.instantiate;
 
 powMod(a,b,c); // use as many times as needed
Wouldn't CTFE-based codegen be pretty slow too? Until newCTFE is merged, it would seem to be about as slow as using templates (if not slower).
 Notation could be improved by using the same expression template idea
 but with polymorphic types at CTFE.
 
 Another thing is partial specialization:
 
 alias squareMod = bluePrint.assume(["b": 2]).instantiate;
 
 Now:
 
 squareMod(a,c); // should be faster than the elaborate algorithm

I think the general idea is a good approach, and it seems that ultimately we're just reinventing expression DSLs. Overloading built-in operators works up to a point, and then you really want to just use a string DSL, parse that in CTFE and use mixin to codegen. That frees you from the spaghetti template expansions in expression templates, and also frees you from being limited by built-in operators, precedence, and syntax.

T

--
Public parking: euphemism for paid parking. -- Flora
Mar 13 2018
Meta <jared771 gmail.com> writes:
On Tuesday, 13 March 2018 at 17:33:14 UTC, H. S. Teoh wrote:
 I think the general idea is a good approach, and it seems that 
 ultimately we're just reinventing expression DSLs.  Overloading 
 built-in operators works up to a point, and then you really 
 want to just use a string DSL, parse that in CTFE and use mixin 
 to codegen.  That frees you from the spaghetti template 
 expansions in expression templates, and also frees you from 
 being limited by built-in operators, precedence, and syntax.
IMO one of the advantages that Dmitry's approach has is that you don't have to do the lexing during CTFE, which may slow things down even more. It's already done for you by the user.
Mar 13 2018
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Tuesday, 13 March 2018 at 17:33:14 UTC, H. S. Teoh wrote:
 Now usage:
 
 // here we do optimizations and CTFE-based codegen
 alias powMod = bluePrint.instantiate;
 
 powMod(a,b,c); // use as many times as needed
 Wouldn't CTFE-based codegen be pretty slow too? Until newCTFE is merged, it would seem to be about as slow as using templates (if not slower).

Trying to do even the most basic optimizations on a bunch of nested templated types is the worst of both worlds: it's amazingly awkward _and_ slow. CTFE is almost fine for straightforward manipulations.

 Notation could be improved by using the same expression 
 template idea but with polymorphic types at CTFE.
 
 Another thing is partial specialization:
 
 alias squareMod = bluePrint.assume(["b": 2]).instantiate;
 
 Now:
 
 squareMod(a,c); // should be faster than the elaborate algorithm
 I think the general idea is a good approach, and it seems that ultimately we're just reinventing expression DSLs. Overloading built-in operators works up to a point, and then you really want to just use a string DSL, parse that in CTFE and use mixin to codegen.
 That frees you from the spaghetti template expansions in 
 expression templates, and also frees you from being limited by 
 built-in operators, precedence, and syntax.

The staged approach has the benefit of composing pieces together, plus partial specialization. In a sense, “parse the DSL” could do the same if it splits the AST and codegen phases.
Mar 13 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 3/12/18 9:59 AM, Simen Kjærås wrote:

 Is this an interesting concept? Are there other use cases I haven't 
 covered? Can this be done with existing language features? Are there 
 problems I haven't foreseen?

Very interesting idea.

So if I could rephrase to make sure I understand: An rvalue type is one that you can never assign to a variable. As soon as you try to "store" it somewhere, it becomes a new type that is returned by its "get" function.

In this case, once it gets passed into any function that is *not* a member function, it decays to the designated type?

I think this would solve the issue, and help with the property debate, but it would have to trim out all the intermediary stuff. I wonder if, instead of types which you don't want to exist anyway, you could find a better way to formulate this.

-Steve
Mar 12 2018
Simen Kjærås <simen.kjaras gmail.com> writes:
On Monday, 12 March 2018 at 18:00:02 UTC, Steven Schveighoffer 
wrote:
 So if I could rephrase to make sure I understand: An rvalue 
 type is one that you can never assign to a variable. As soon as 
 you try to "store" it somewhere, it becomes a new type that is 
 returned by its "get" function.

 In this case, once it gets passed into any function that is 
 *not* a member function, it decays to the designated type?
Exactly.
 I think this would solve the issue, and help with the property 
 debate, but it would have to trim out all the intermediary 
 stuff. I wonder instead of types which you don't want to exist 
 anyway, you could find a better way to formulate this.

It's easy to imagine some patchwork solutions - going back to the property discussion, I mentioned that allowing alias this and operator overloads on named mixins would enable properties to work as if they were implemented with rvalue types. In the same vein, we could imagine a solution for chaining operators:

struct BigInt {
     BigInt opChain(string[] ops, T...)(T args)
     if (allSatisfy!(isBigInt, T) && ops == ["^^", "%"])
     {
         return powMod(this, args);
     }
}

This would take care of those two cases, but doesn't offer a solution to the WidgetBuilder example. It might be possible to find a similar one-off solution there, but I'd rather have a more general solution that might be useful in cases I haven't yet thought of.

--
   Simen
Mar 12 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 3/12/18 2:56 PM, Simen Kjærås wrote:
 I think this would solve the issue, and help with the property debate, 
 but it would have to trim out all the intermediary stuff. I wonder 
 instead of types which you don't want to exist anyway, you could find 
 a better way to formulate this.
 It's easy to imagine some patchwork solutions - going back to the property discussion I mentioned that allowing alias this and operator overloads on named mixins would enable properties to work as if they were implemented with rvalue types. In the same vein, we could imagine a solution for chaining operators:
 
 struct BigInt {
      BigInt opChain(string[] ops, T...)(T args)
      if (allSatisfy!(isBigInt, T) && ops == ["^^", "%"])
      {
          return powMod(this, args);
      }
 }

Yeah, what you need is some sort of access to the entire expression. I don't think it needs to be limited to operators. What I don't like about your proposal is that you have to maintain (essentially) the AST yourself, in a type that never really should exist. Too bad the compiler can't just give you the AST and let you figure it out, but I think that's probably not ever going to happen.

Ideally, your call will lower to a single call to powMod, and all the cruft around it disappears. But it seems like every time stuff like this is tried, it leaves behind artifacts that suck.

 This would take care of those two cases, but don't offer a solution to 
 the WidgetBuilder example. It might be possible to find a similar 
 one-off solution there, but I'd rather have a more general solution that 
 might be useful in cases I haven't yet thought of.

A similar thing I did once in C++ for a logger is to use the destructor of an essentially rvalue type to actually log the information. It looked something like this:

    log << value1 << " " << value2;

What log did is, based on the logging level, it would return either something that did nothing, or something that built up a single std::string, and then in the destructor, actually logged the data to something. This prevented partial messages from being logged (the destructor would lock a global mutex for this).

It's a similar thing to your widget builder in that the end of the statement is what triggers all the stuff to happen. Unfortunately, we can't hook the destructor, but that would be the best place to do this.

Hm... I wonder if we could make a return type for the destructor, and that would be the result of the expression?

-Steve
Mar 12 2018