www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Lazy evaluation

reply Manu <turkeyman gmail.com> writes:
So, I've been thinking about this const/lazy evaluation problem again.

I think most of the complaints about D's const is due to the use of lazy
evaluation for optimisation. I started wondering if there was room for a
'lazy' keyword.
Perhaps if you declared some function as lazy, then the compiler would be
expected to cache the return value, and also generate its own dirty flag,
which it may manipulate in its own way. The compiler should have all the
information it needs within the lazy function to know what external
influences would cause a dirtying of the state. If they are sibling members
as is often the case, it should be easy for them to be tagged to mark the
dirt bit dirty whenever they are modified... if they are external non-const
factors, then I would expect a warning (or error?).

Just a thought anyway. Curious to know if anything like this has ever been
considered... It might address the majority of the problems with D's const.
Ie, I imagine a lazy function could also be marked const, and either, the
compiler would have to just ignore the lazy cache and perform a full
evaluation when called within a const context, OR it would be the
responsibility of the compiler to implement the dirty logic in a 'safe' way
(with respect to thread safety... lock free primitive usage? etc)

Am I insane?

- Manu
Nov 22 2011
next sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 22/11/11 10:00 PM, Manu wrote:
 So, I've been thinking about this const/lazy evaluation problem again.

 I think most of the complaints about D's const is due to the use of lazy
 evaluation for optimisation. I started wondering if there was room for a
 'lazy' keyword.
There already is one :-) http://d-programming-language.org/lazy-evaluation.html
 Perhaps if you declared some function as lazy, then the compiler would
 be expected to cache the return value
How does it cache it? i.e. - What data structure does it use? - Is there a size limit, or does it just keep adding into the cache until you run out of memory? - What allocator does it use? - If there is a size limit, what is the eviction policy? - and what is the size limit? - How do I manually flag the cache as dirty if I want to clear it? There is no automatic way to do caching. Every single application of caching has its own needs and you need control over that. You can't just stick it all in a hash table and be done with it.
Nov 22 2011
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/23/2011 12:06 AM, Peter Alexander wrote:
 On 22/11/11 10:00 PM, Manu wrote:
 So, I've been thinking about this const/lazy evaluation problem again.

 I think most of the complaints about D's const is due to the use of lazy
 evaluation for optimisation. I started wondering if there was room for a
 'lazy' keyword.
There already is one :-) http://d-programming-language.org/lazy-evaluation.html
That is not lazy evaluation. It is call by name.
 Perhaps if you declared some function as lazy, then the compiler would
 be expected to cache the return value
How does it cache it? i.e. - What data structure does it use? - Is there a size limit, or does it just keep adding into the cache until you run out of memory? - What allocator does it use? - If there is a size limit, what is the eviction policy? - and what is the size limit? - How do I manually flag the cache as dirty if I want to clear it? There is no automatic way to do caching. Every single application of caching has its own needs and you need control over that. You can't just stick it all in a hash table and be done with it.
Nov 22 2011
parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 22/11/11 11:19 PM, Timon Gehr wrote:
 On 11/23/2011 12:06 AM, Peter Alexander wrote:
 On 22/11/11 10:00 PM, Manu wrote:
 So, I've been thinking about this const/lazy evaluation problem again.

 I think most of the complaints about D's const is due to the use of lazy
 evaluation for optimisation. I started wondering if there was room for a
 'lazy' keyword.
There already is one :-) http://d-programming-language.org/lazy-evaluation.html
That is not lazy evaluation. It is call by name.
True.
Nov 22 2011
prev sibling parent reply Manu <turkeyman gmail.com> writes:
 Perhaps if you declared some function as lazy, then the compiler would
 be expected to cache the return value
How does it cache it? i.e. - What data structure does it use? - Is there a size limit, or does it just keep adding into the cache until you run out of memory? - What allocator does it use? - If there is a size limit, what is the eviction policy? - and what is the size limit? - How do I manually flag the cache as dirty if I want to clear it? There is no automatic way to do caching. Every single application of caching has its own needs and you need control over that. You can't just stick it all in a hash table and be done with it.
I can imagine many possibilities. Here's one straight off the top. - The data structure is the return value of the function contained within its parent class/struct + any necessary dirty bits. Where you place those it in the containing class/struct is debatable, but sensible options exist. - Size limit? What is the size limit of the object returned from the function in the first place? - Not really relevant, if it's a primitive type or struct, it is contained by the functions containing object, if it is a class, then it is a reference to, and it is allocated however the lazy function does. - I don't think this is relevant. - Again... - Fair question. Perhaps the function could have a property, or the containing class could have some property to access the cached object... but I would imagine the compiler would explicitly manage dirtying of the state, so that it can enforce that it is done in a ' safe' way... Am I wrong in suggesting that this is the most frequent cause of people raising the "D const issues' topic?
Nov 23 2011
parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 23/11/11 6:10 PM, Manu wrote:
         Perhaps if you declared some function as lazy, then the compiler
         would
         be expected to cache the return value


     How does it cache it?

     i.e.

     - What data structure does it use?
     - Is there a size limit, or does it just keep adding into the cache
     until you run out of memory?
     - What allocator does it use?
     - If there is a size limit, what is the eviction policy?
     - and what is the size limit?
     - How do I manually flag the cache as dirty if I want to clear it?

     There is no automatic way to do caching. Every single application of
     caching has its own needs and you need control over that. You can't
     just stick it all in a hash table and be done with it.


 I can imagine many possibilities. Here's one straight off the top.

 - The data structure is the return value of the function contained
 within its parent class/struct + any necessary dirty bits. Where you
 place those it in the containing class/struct is debatable, but sensible
 options exist.
This simply doesn't work for functions of arguments that require caching. class FooManager { class Foo { ... } const(Foo) getFoo(int i) const { return new Foo(i); } } Assume that constructing a Foo is expensive. How can I cache the values of Foo? One approach would be to use an array: class FooManager { class Foo { ... } const(Foo) getFoo(int i) const { if (m_foos[i] is null) m_foos[i] = new Foo(i); // illegal in D return m_foos[i]; } const(Foo)[kCacheSize] m_foos; } Of course, this requires knowledge of the range of 'i' that can be received, so there is no way to automate this type of caching. You could automatically use a hash map, but that is needlessly inefficient. Also, what if getFoo could be called with any value of i? You wouldn't want to cache all values, you'd just want to cache a subset using some eviction policy, perhaps the 10 most recently used values of 'i'? There's no way to automate caching. One size does not fit all.
 - Size limit? What is the size limit of the object returned from the
 function in the first place?
 - Not really relevant, if it's a primitive type or struct, it is
 contained by the functions containing object, if it is a class, then it
 is a reference to, and it is allocated however the lazy function does.
 - I don't think this is relevant.
 - Again...
 - Fair question. Perhaps the function could have a property, or the
 containing class could have some property to access the cached object...
 but I would imagine the compiler would explicitly manage dirtying of the
 state, so that it can enforce that it is done in a ' safe' way...
All these responses assume that the function is not a function of its arguments (or it has no arguments).
 Am I wrong in suggesting that this is the most frequent cause of people
 raising the "D const issues' topic?
Yeah, it usually comes down to caching (or lazy computation, which is similar) but it's not the only thing. Essentially, any object that contains state that is not related to its definition of equality is difficult to model using D's const qualifiers.
Nov 23 2011
parent Manu <turkeyman gmail.com> writes:
On 23 November 2011 21:21, Peter Alexander <peter.alexander.au gmail.com>wrote:

 On 23/11/11 6:10 PM, Manu wrote:

        Perhaps if you declared some function as lazy, then the compiler
        would
        be expected to cache the return value


    How does it cache it?

    i.e.

    - What data structure does it use?
    - Is there a size limit, or does it just keep adding into the cache
    until you run out of memory?
    - What allocator does it use?
    - If there is a size limit, what is the eviction policy?
    - and what is the size limit?
    - How do I manually flag the cache as dirty if I want to clear it?

    There is no automatic way to do caching. Every single application of
    caching has its own needs and you need control over that. You can't
    just stick it all in a hash table and be done with it.


 I can imagine many possibilities. Here's one straight off the top.

 - The data structure is the return value of the function contained
 within its parent class/struct + any necessary dirty bits. Where you
 place those it in the containing class/struct is debatable, but sensible
 options exist.
This simply doesn't work for functions of arguments that require caching. class FooManager { class Foo { ... } const(Foo) getFoo(int i) const { return new Foo(i); } } Assume that constructing a Foo is expensive. How can I cache the values of Foo? One approach would be to use an array: class FooManager { class Foo { ... } const(Foo) getFoo(int i) const { if (m_foos[i] is null) m_foos[i] = new Foo(i); // illegal in D return m_foos[i]; } const(Foo)[kCacheSize] m_foos; } Of course, this requires knowledge of the range of 'i' that can be received, so there is no way to automate this type of caching. You could automatically use a hash map, but that is needlessly inefficient. Also, what if getFoo could be called with any value of i? You wouldn't want to cache all values, you'd just want to cache a subset using some eviction policy, perhaps the 10 most recently used values of 'i'? There's no way to automate caching. One size does not fit all.
This isn't a typical lazy evaluation. In your example, the problem exists irrespective of the const implementation. I don't think it addresses the topic. I'm talking about calls to the same function returning same result every time. So yes, functions with no arguments. I can sort of imagine it might be possible to implement my proposal though even in your case where it receives an index and internally chooses from a set (each having their own associated dirty flag), but where it uses an argument to produce the output that may change every time... this is not lazy evaluation, it's functional evaluation.
  - Size limit? What is the size limit of the object returned from the
 function in the first place?
 - Not really relevant, if it's a primitive type or struct, it is
 contained by the functions containing object, if it is a class, then it
 is a reference to, and it is allocated however the lazy function does.
 - I don't think this is relevant.
 - Again...
 - Fair question. Perhaps the function could have a property, or the
 containing class could have some property to access the cached object...
 but I would imagine the compiler would explicitly manage dirtying of the
 state, so that it can enforce that it is done in a ' safe' way...
All these responses assume that the function is not a function of its arguments (or it has no arguments).
Correct. Lazy/cached evaluation of some slow method. Am I wrong in suggesting that this is the most frequent cause of people
 raising the "D const issues' topic?
Yeah, it usually comes down to caching (or lazy computation, which is similar) but it's not the only thing. Essentially, any object that contains state that is not related to its definition of equality is difficult to model using D's const qualifiers.
Nov 23 2011
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/22/2011 11:00 PM, Manu wrote:
 So, I've been thinking about this const/lazy evaluation problem again.

 I think most of the complaints about D's const is due to the use of lazy
 evaluation for optimisation. I started wondering if there was room for a
 'lazy' keyword.
 Perhaps if you declared some function as lazy, then the compiler would
 be expected to cache the return value, and also generate its own dirty
 flag, which it may manipulate in its own way. The compiler should have
 all the information it needs within the lazy function to know what
 external influences would cause a dirtying of the state. If they are
 sibling members as is often the case, it should be easy for them to be
 tagged to mark the dirt bit dirty whenever they are modified... if they
 are external non-const factors, then I would expect a warning (or error?).

 Just a thought anyway. Curious to know if anything like this has ever
 been considered... It might address the majority of the problems with
 D's const.
 Ie, I imagine a lazy function could also be marked const, and either,
 the compiler would have to just ignore the lazy cache and perform a full
 evaluation when called within a const context, OR it would be the
 responsibility of the compiler to implement the dirty logic in a 'safe'
 way (with respect to thread safety... lock free primitive usage? etc)

 Am I insane?

 - Manu
I'd say variables would have to be declared as lazy, not functions. Then you could set the dirty flag manually on mutable instances. Const instances that refer to mutable data would benefit from lazy evaluation while immutable instances would have to compute the lazy members during their construction. This usage clashes with the current call by name semantics of the lazy parameter storage class though.
Nov 22 2011
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, November 23, 2011 00:23:16 Timon Gehr wrote:
 On 11/22/2011 11:00 PM, Manu wrote:
 So, I've been thinking about this const/lazy evaluation problem again.
 
 I think most of the complaints about D's const is due to the use of lazy
 evaluation for optimisation. I started wondering if there was room for a
 'lazy' keyword.
 Perhaps if you declared some function as lazy, then the compiler would
 be expected to cache the return value, and also generate its own dirty
 flag, which it may manipulate in its own way. The compiler should have
 all the information it needs within the lazy function to know what
 external influences would cause a dirtying of the state. If they are
 sibling members as is often the case, it should be easy for them to be
 tagged to mark the dirt bit dirty whenever they are modified... if they
 are external non-const factors, then I would expect a warning (or
 error?).
 
 Just a thought anyway. Curious to know if anything like this has ever
 been considered... It might address the majority of the problems with
 D's const.
 Ie, I imagine a lazy function could also be marked const, and either,
 the compiler would have to just ignore the lazy cache and perform a full
 evaluation when called within a const context, OR it would be the
 responsibility of the compiler to implement the dirty logic in a 'safe'
 way (with respect to thread safety... lock free primitive usage? etc)
 
 Am I insane?
 
 - Manu
I'd say variables would have to be declared as lazy, not functions. Then you could set the dirty flag manually on mutable instances. Const instances that refer to mutable data would benefit from lazy evaluation while immutable instances would have to compute the lazy members during their construction. This usage clashes with the current call by name semantics of the lazy parameter storage class though.
This basic idea was discussed a couple of months back, and it was determined that it was overly complicated for very little gain. - Jonathan M Davis
Nov 22 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/23/2011 12:53 AM, Jonathan M Davis wrote:
 On Wednesday, November 23, 2011 00:23:16 Timon Gehr wrote:
 On 11/22/2011 11:00 PM, Manu wrote:
 So, I've been thinking about this const/lazy evaluation problem again.

 I think most of the complaints about D's const is due to the use of lazy
 evaluation for optimisation. I started wondering if there was room for a
 'lazy' keyword.
 Perhaps if you declared some function as lazy, then the compiler would
 be expected to cache the return value, and also generate its own dirty
 flag, which it may manipulate in its own way. The compiler should have
 all the information it needs within the lazy function to know what
 external influences would cause a dirtying of the state. If they are
 sibling members as is often the case, it should be easy for them to be
 tagged to mark the dirt bit dirty whenever they are modified... if they
 are external non-const factors, then I would expect a warning (or
 error?).

 Just a thought anyway. Curious to know if anything like this has ever
 been considered... It might address the majority of the problems with
 D's const.
 Ie, I imagine a lazy function could also be marked const, and either,
 the compiler would have to just ignore the lazy cache and perform a full
 evaluation when called within a const context, OR it would be the
 responsibility of the compiler to implement the dirty logic in a 'safe'
 way (with respect to thread safety... lock free primitive usage? etc)

 Am I insane?

 - Manu
I'd say variables would have to be declared as lazy, not functions. Then you could set the dirty flag manually on mutable instances. Const instances that refer to mutable data would benefit from lazy evaluation while immutable instances would have to compute the lazy members during their construction. This usage clashes with the current call by name semantics of the lazy parameter storage class though.
This basic idea was discussed a couple of months back, and it was determined that it was overly complicated for very little gain. - Jonathan M Davis
I strongly disagree.
Nov 23 2011
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, November 23, 2011 18:41:46 Timon Gehr wrote:
 On 11/23/2011 12:53 AM, Jonathan M Davis wrote:
 This basic idea was discussed a couple of months back, and it was
 determined that it was overly complicated for very little gain.
I strongly disagree.
Well, you're free to disagree, but that's essentially what the discussion resulted. And glancing at the thread ( http://www.mail- archive.com/digitalmars-d puremagic.com/msg65324.html ), it looks like you posted a couple of times in it, so I'd expect you to be at least somewhat aware of what was discussed in it. It's nice in theory, but there are a number of complications for what most people consider to be a minor gain. And for the folks who really don't like transitive const, it doesn't go anywhere near far enough, since they want full-on caching with const, not just lazy evaluation, which just isn't going to happen. - Jonathan M Dav
Nov 23 2011
parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 23/11/11 6:05 PM, Jonathan M Davis wrote:
 On Wednesday, November 23, 2011 18:41:46 Timon Gehr wrote:
 On 11/23/2011 12:53 AM, Jonathan M Davis wrote:
 This basic idea was discussed a couple of months back, and it was
 determined that it was overly complicated for very little gain.
I strongly disagree.
Well, you're free to disagree, but that's essentially what the discussion resulted. And glancing at the thread ( http://www.mail- archive.com/digitalmars-d puremagic.com/msg65324.html ), it looks like you posted a couple of times in it, so I'd expect you to be at least somewhat aware of what was discussed in it. It's nice in theory, but there are a number of complications for what most people consider to be a minor gain. And for the folks who really don't like transitive const, it doesn't go anywhere near far enough, since they want full-on caching with const, not just lazy evaluation, which just isn't going to happen. - Jonathan M Dav
Just to be clear, I don't think anyone has any problem with the transitivity of const, but rather the fact that D only offers bitwise const and not logical const, and also that D protocols enforce bitwise const when only logical const is needed. Bitwise const is needed for safe concurrency, but only logical const is needed for pure functional programming.
Nov 23 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 11/23/2011 07:59 PM, Peter Alexander wrote:
 On 23/11/11 6:05 PM, Jonathan M Davis wrote:
 On Wednesday, November 23, 2011 18:41:46 Timon Gehr wrote:
 On 11/23/2011 12:53 AM, Jonathan M Davis wrote:
 This basic idea was discussed a couple of months back, and it was
 determined that it was overly complicated for very little gain.
I strongly disagree.
Well, you're free to disagree, but that's essentially what the discussion resulted. And glancing at the thread ( http://www.mail- archive.com/digitalmars-d puremagic.com/msg65324.html ), it looks like you posted a couple of times in it, so I'd expect you to be at least somewhat aware of what was discussed in it. It's nice in theory, but there are a number of complications for what most people consider to be a minor gain. And for the folks who really don't like transitive const, it doesn't go anywhere near far enough, since they want full-on caching with const, not just lazy evaluation, which just isn't going to happen. - Jonathan M Dav
Just to be clear, I don't think anyone has any problem with the transitivity of const, but rather the fact that D only offers bitwise const and not logical const, and also that D protocols enforce bitwise const when only logical const is needed.
Exactly. This is how pure functional programming looks like in D at the moment: http://pastebin.com/C6vf9DQQ I didn't spot any 'immutable' or 'const'.
 Bitwise const is needed for safe concurrency, but only logical const is
 needed for pure functional programming.
And lazy evaluation already suffices.
Nov 23 2011
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/22/2011 2:00 PM, Manu wrote:
 Just a thought anyway. Curious to know if anything like this has ever been
 considered...
Yes, but it quickly got too complicated and arbitrary. We thought things were better left as having the user do it explicitly.
Nov 24 2011