digitalmars.D - Purity (D2 standard libraries / object.d)
- Jason House (3/3) Jan 09 2009 In the latest D2, both the frontend and backend support use and optimiza...
- Walter Bright (3/7) Jan 09 2009 As Andrei pointed out, the trouble with making the Object functions pure...
- Robert Fraser (11/19) Jan 09 2009 If you have:
- Walter Bright (2/5) Jan 09 2009 It should optimize it.
- Jason House (2/10) Jan 09 2009 One could easily generalize that to mean non-final virtual functions sho...
- Walter Bright (4/18) Jan 09 2009 Memoization and purity don't mix. For each function, you'll have to pick...
- Jason House (8/12) Jan 09 2009 True, but...
- Walter Bright (5/8) Jan 09 2009 Another word for that is "common subexpressions". And yes, the compiler
- Michel Fortin (38/45) Jan 09 2009 Hum, could the compiler be trusted to add the memoization code to pure
- Walter Bright (6/8) Jan 09 2009 If the compiler does general memoization on pure functions, all it has
- Jason House (2/12) Jan 09 2009 What about a compiler hint similar to inline or register?
- Andrei Alexandrescu (5/18) Jan 09 2009 Maybe making a conscious effort to leave the language feature as a last
- Jason House (2/22) Jan 10 2009 I chose nearly dead keywords for a reason ;) If I got Walter to think ab...
- Walter Bright (2/8) Jan 10 2009 Coffee shop discussions are certainly the best.
- dsimcha (5/13) Jan 09 2009 Wouldn't this also cause threading issues? The obvious solution would b...
- Denis Koroskin (2/18) Jan 09 2009 Thread-safe (lock-free?) container would do the trick just fine.
- Michel Fortin (18/24) Jan 10 2009 Well, I'd trust the compiler-generated code would not cause threading
- Walter Bright (2/13) Jan 10 2009 Immutable data doesn't have threading issues, which is the beauty of the...
- dsimcha (7/20) Jan 10 2009 You misunderstood the question. I mean, at the implementation level, ev...
- Walter Bright (3/9) Jan 10 2009 You're right, I hadn't thought of that. But you could do a per-thread
- Andrei Alexandrescu (9/18) Jan 09 2009 Not the bits, the full objects if they have indirection. (Pure functions...
- Michel Fortin (16/22) Jan 10 2009 That's only true because, in your specific context, those functions
- bearophile (7/18) Jan 10 2009 Right. There are time-limited memoization strategies, and other ones as ...
- Daniel Keep (15/38) Jan 10 2009 ... so let's tell it what the usage patterns are!
- Andrei Alexandrescu (3/16) Jan 10 2009 No, because I use interpolation.
- Walter Bright (4/21) Jan 10 2009 That's way beyond the ability of a compiler to do automatically. The
- Andrei Alexandrescu (9/31) Jan 10 2009 You're replying to the wrong guy. I'm saying: the compiler shouldn't
- Bill Baxter (10/42) Jan 10 2009 Just get Walter to implement AST macros and it should stop. :-)
- Christopher Wright (5/11) Jan 11 2009 The ability for you to memoize a pure function and get a pure function
- Bill Baxter (4/16) Jan 11 2009 Isn't this the same old "logically const" vs "const" argument we went
- Jason House (20/31) Jan 11 2009 Here's the simplest memoization example I could come up with. Maybe we ...
- Stewart Gordon (9/25) Jan 11 2009 So effectively, the compiler could just trust pure functions to be
- Walter Bright (5/16) Jan 10 2009 You're right, the "just the bits on the stack" is if the arguments are
- dsimcha (26/42) Jan 10 2009 Even if the bits are immutable, I can't see how this could work. What i...
- Walter Bright (4/8) Jan 10 2009 That's the beauty of a garbage collector. The bits will include a...
- dsimcha (5/13) Jan 10 2009 Good point, the bits in the internal memoization AA will contain a refer...
- Walter Bright (3/17) Jan 10 2009 The job of managing those kind of design tradeoffs is something that the...
- Frits van Bommel (3/13) Jan 12 2009 *cough* And with weak pointer/reference types we could handle it quite
- Steven Schveighoffer (10/26) Jan 12 2009 Bits on the stack also do not work for immutable reference arguments.
- Michel Fortin (15/20) Jan 10 2009 True, but only in the absence of pointers to mutable data (pointers to
- Stewart Gordon (6/13) Jan 10 2009
- Stewart Gordon (6/9) Jan 10 2009 Or maybe not...
- Christopher Wright (2/14) Jan 10 2009 I thought you were joking. What's confusing you?
- Walter Bright (2/17) Jan 10 2009 I forgot.
- Christopher Wright (2/20) Jan 11 2009 Walter Bright is the new Stewart Gordon!
- Bill Baxter (6/16) Jan 10 2009 Heh heh. Yeh, I remember I thought I'd found a major typo in the CLR
- Andrei Alexandrescu (8/60) Jan 09 2009 Yes!
- Andrei Alexandrescu (4/23) Jan 09 2009 Memoization and purity can and should go together. We need to sit down
- Stewart Gordon (22/25) Jan 10 2009 Actually, what they don't mix with is the requirement that it still be
In the latest D2, both the frontend and backend support use and optimization of pure functions declared in user code. This isn't particularly useful when pure functions can't call standard library code. When considering the standard library changes, I realized that object.d could change. I believe a pure opCmp or toString could break user code with impure versions of those functions. Would that kind of a change to object.d cause any real problems for D2 users? Are there other ambiguous purity options when converting the standard libraries? Which parts of Phobos/druntime would you expect to be pure/impure?
Jan 09 2009
Jason House wrote:When considering the standard library changes, I realized that object.d could change. I believe a pure opCmp or toString could break user code with impure versions of those functions. Would that kind of a change to object.d cause any real problems for D2 users?As Andrei pointed out, the trouble with making the Object functions pure is if you want to do an override that caches its value.
Jan 09 2009
Walter Bright Wrote:Jason House wrote:If you have: class A { invariant final pure bool opEquals(invariant(Object) other) { ... } } int main() { auto a = cast(invariant) new A(); auto b = cast(invariant) new A(); if(a == b) { return 0; } else { return 1; } } ... will the compiler optimize it? Since A.opEquals() is final, there's no polymorphism, so it should be able to, but I don't know about the implementation.When considering the standard library changes, I realized that object.d could change. I believe a pure opCmp or toString could break user code with impure versions of those functions. Would that kind of a change to object.d cause any real problems for D2 users?As Andrei pointed out, the trouble with making the Object functions pure is if you want to do an override that caches its value.
Jan 09 2009
Robert Fraser wrote:... will the compiler optimize it? Since A.opEquals() is final, there's no polymorphism, so it should be able to, but I don't know about the implementation.It should optimize it.
Jan 09 2009
Walter Bright Wrote:Jason House wrote:One could easily generalize that to mean non-final virtual functions should never be pure. Memoization is a pretty basic need. I thought that was in D's future. Is that not true?When considering the standard library changes, I realized that object.d could change. I believe a pure opCmp or toString could break user code with impure versions of those functions. Would that kind of a change to object.d cause any real problems for D2 users?As Andrei pointed out, the trouble with making the Object functions pure is if you want to do an override that caches its value.
Jan 09 2009
Jason House wrote:Walter Bright Wrote:Memoization and purity don't mix. For each function, you'll have to pick which way you want to go. I don't see what that has to do with D's future, though, the language allows either.Jason House wrote:One could easily generalize that to mean non-final virtual functions should never be pure. Memoizaton is a pretty basic need. I thought that was in D's future. Is that not true?When considering the standard library changes, I realized that object.d could change. I believe a pure opCmp or toString could break user code with impure versions of those functions. Would that kind of a change to object.d cause any real problems for D2 users?As Andrei pointed out, the trouble with making the Object functions pure is if you want to do an override that caches its value.
Jan 09 2009
Walter Bright wrote:Memoization and purity don't mix.True, but... All pure functions can be memoized. They can be memoized by the caller of the pure function, or the pure function could be rewritten to automatically memoize. When writing a pure but potentially costly function, the coder may consider memoization. They must predict how frequently the function would get called with the same input. They must also decide if the time/space trade-off is worth it for the program as a whole and how inconvenient it would be to call the function. If it's called from another pure function, well... they don't mix. All-in-all, memoization is a tough call to make. For library writers, it's even tougher. They may not know how their code would get used and may be unable to make the call about memoization. It's possible to provide a pure function and a memoized alternative. Of course, there are many memoization options available, especially when memory space is restricted or various thread safety/locking methods may be desired. A library writer may just declare that if the user wants memoization, they should use a memoization library to do it. Anyone who writes an inheritable class also has to predict if the writers of classes that inherit from their class will want to memoize or not. Some may forbid it because then they can declare the function pure. Others will feel like they can't declare any virtual functions as pure (and therefore allow memoization).For each function, you'll have to pick which way you want to go. I don't see what that has to do with D's future, though, the language allows either.Somehow, I had the impression that D might use memoization as some kind of optimization once pure functions are embraced. That was probably a misunderstanding on my part. I do think that writers of generic/library/inheritable code will continuously revisit the question of memoization. That may affect the future of D usage, but not the underlying language specification.
Jan 09 2009
Jason House wrote:Somehow, I had the impression that D might use memoization as some kind of optimization once pure functions are embraced. That was probably a misunderstanding on my part.Another word for that is "common subexpressions". And yes, the compiler has the option to do that, and in fact does do that (in a small way) in the latest release. But what I was talking about is the user himself doing memoization.
Jan 09 2009
On 2009-01-09 20:45:57 -0500, Jason House <jason.james.house gmail.com> said:Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure? Something like this: pure memoize int costlyCalculus(int i, int j) { return i + j; } being transformed into this by the compiler? int[Tuple!(int, int)] _costlyCalculus_cache; pure int costlyCalculus(int i, int j) { { int* _found_result = tuple(i, j) in _costlyCalculus_cache; if (_found_result) return *_found_result; } { int _calculated_result = i + j; _costlyCalculus_cache[tuple(i, j)] = _calculated_result; return _calculated_result; } } Surely this way the compiler would be able to bypass purity checks for accessing the cache while still ensuring that the function has no side effect and always returns the same result for the same arguments. The downside being that a hash table may not always be the best memoization technique. Perhaps then it belongs more in the standard library. pure int costlyCalculus(int i, int j); Memoizer!(costlyCalculus) memoizedCostlyCalculus; where Memoizer is a struct template defining a pure opCall and containing the cache it can access by casting away purity in a few places. But it makes it more difficult to add memoization while overriding a function. -- Michel Fortin michel.fortin michelf.com http://michelf.com/For each function, you'll have to pick which way you want to go. I don't see what that has to do with D's future, though, the language allows either.Somehow, I had the impression that D might use memoization as some kind of optimization once pure functions are embraced. That was probably a misunderstanding on my part.
Jan 09 2009
Michel Fortin wrote:Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.
Jan 09 2009
Walter Bright Wrote:Michel Fortin wrote:What about a compiler hint similar to inline or register?Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.
Jan 09 2009
Jason House wrote:Walter Bright Wrote:Maybe making a conscious effort to leave the language feature as a last resort action would be better. Creating a memoization feature is a fish, creating a language that allows implementing memoization is fishing. :o/ AndreiMichel Fortin wrote:What about a compiler hint similar to inline or register?Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.
Jan 09 2009
Andrei Alexandrescu Wrote:Jason House wrote:I chose nearly dead keywords for a reason ;) If I got Walter to think about how to implement "pure" memoizers for an extra half hour, I will consider myself highly successful. I must leave the fishing discussion up to the two of you in a coffee shop! (I'd love to sit in and participate with that discussion, but online forums make the conversation far too difficult)Walter Bright Wrote:Maybe making a conscious effort to leave the language feature as a last resort action would be better. Creating a memoization feature is a fish, creating a language that allows implementing memoization is fishing. :o/ AndreiMichel Fortin wrote:What about a compiler hint similar to inline or register?Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.
Jan 10 2009
Jason House wrote:I chose nearly dead keywords for a reason ;) If I got Walter to think about how to implement "pure" memoizers for an extra half hour, I will consider myself highly successful. I must leave the fishing discussion up to the two of you in a coffee shop! (I'd love to sit in and participate with that discussion, but online forums make the conversation far too difficult)Coffee shop discussions are certainly the best.
Jan 10 2009
== Quote from Walter Bright (newshound1 digitalmars.com)'s articleMichel Fortin wrote:Wouldn't this also cause threading issues? The obvious solution would be to use TLS, but this requires duplicating the cache across threads. Also, using AAs internally like this would lead to very deeply hidden memory allocations, and therefore possibly more frequent GC, etc.Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.
Jan 09 2009
On Sat, 10 Jan 2009 07:14:07 +0300, dsimcha <dsimcha yahoo.com> wrote:== Quote from Walter Bright (newshound1 digitalmars.com)'s articleThread-safe (lock-free?) container would do the trick just fine.Michel Fortin wrote:Wouldn't this also cause threading issues? The obvious solution would be to use TLS, but his requires duplicating the cache across threads. Also, using AAs internally like this would lead to very deeply hidden memory allocations, and therefore possibly more frequent GC, etc.Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.
Jan 09 2009
On 2009-01-09 23:14:07 -0500, dsimcha <dsimcha yahoo.com> said:Wouldn't this also cause threading issues?Well, I'd trust the compiler-generated code would not cause threading issues. :-)The obvious solution would be to use TLS, but his requires duplicating the cache across threads.Well, there aren't many choices. Either it's TLS, or the cache needs to be synchronized (using either locks or lock-free algorithms; both will add some overhead anyway). That said, if you memoize a function member of a class, the memoization data could be added to the class and not be global. If the object is not shared across threads, then the memoization data isn't either. If the object is shared, then you'd need either TLS or synchronization.Also, using AAs internally like this would lead to very deeply hidden memory allocations, and therefore possibly more frequent GC, etc.Perhaps the cache needs to be a little smarter than a regular AA. You may not want to keep each and every value that was computed. Depending on the situation, keeping only the 100 last results may be enough, in which case you can dedicate a fixed amount of memory for caching. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
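[Editor's note: the bounded, fixed-memory cache described above is easy to sketch; Python's standard library even ships one. maxsize=100 mirrors the "100 last results" figure; least-recently-used eviction is one reasonable policy among several.]

```python
from functools import lru_cache

# A bounded memoization cache: only the most recently used results are
# kept, so the memory dedicated to memoization stays fixed regardless of
# how many distinct inputs the function sees over its lifetime.

@lru_cache(maxsize=100)
def costly(n):
    return n * n  # stands in for an expensive pure computation
```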
Jan 10 2009
dsimcha wrote:== Quote from Walter Bright (newshound1 digitalmars.com)'s articleImmutable data doesn't have threading issues, which is the beauty of them.If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.Wouldn't this also cause threading issues? The obvious solution would be to use TLS, but his requires duplicating the cache across threads. Also, using AAs internally like this would lead to very deeply hidden memory allocations, and therefore possibly more frequent GC, etc.
Jan 10 2009
== Quote from Walter Bright (newshound1 digitalmars.com)'s articledsimcha wrote:You misunderstood the question. I mean, at the implementation level, even though it wouldn't be visible to the programmer, the AA that did the memoization would have to be mutable so it could be modified when a new value was to be memoized. This is where the threading issues would come in. What I was saying is that synchronizing on this would be bad for obvious reasons, and TLS would still not be great because you wouldn't be able to share memoized values across threads.== Quote from Walter Bright (newshound1 digitalmars.com)'s articleImmutable data doesn't have threading issues, which is the beauty of them.If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.Wouldn't this also cause threading issues? The obvious solution would be to use TLS, but his requires duplicating the cache across threads. Also, using AAs internally like this would lead to very deeply hidden memory allocations, and therefore possibly more frequent GC, etc.
Jan 10 2009
dsimcha wrote:You misunderstood the question. I mean, at the implementation level, even though it wouldn't be visible to the programmer, the AA that did the memoization would have to be mutable so it could be modified when a new value was to be memoized. This is where the threading issues would come in. What I was saying is that synchronizing on this would be bad for obvious reasons, and TLS would still not be great because you wouldn't be able to share memoized values across threads.You're right, I hadn't thought of that. But you could do a per-thread memoization using thread local storage. No sync problems.
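[Editor's note: the per-thread scheme Walter describes can be sketched with thread-local storage. Python for illustration; names are illustrative.]

```python
import threading

# Per-thread memoization: each thread keeps its own cache in thread-local
# storage, so no synchronization is needed. The cost is the one raised
# above: entries are duplicated (recomputed once) in every thread.

_tls = threading.local()

def memoize_per_thread(fn):
    def wrapper(*args):
        cache = getattr(_tls, "cache", None)
        if cache is None:
            cache = _tls.cache = {}   # first use in this thread
        key = (fn, args)
        if key not in cache:
            cache[key] = fn(*args)
        return cache[key]
    return wrapper
```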
Jan 10 2009
Walter Bright wrote:Michel Fortin wrote:Not the bits, the full objects if they have indirection. (Pure functions should accept const parameters!) And it's not only simple associative array. It could also be a dense array, a little interpolation engine etc.Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.The problem is identifying if this would be faster than recomputing the return value.I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse. Andrei
Jan 09 2009
On 2009-01-10 00:10:11 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:That's only true because, in your specific context, those functions were called often with the same input. In a program that rarely uses the same inputs, your memoizing functions will just waste time and memory. So to determine if a function is worth memoizing, you have to say yes to those two things: 1. is it faster than recomputing the return value? 2. is this function called often with the same inputs? While the compiler could take an educated guess at answering 1 given the code of a function, question 2 can only be answered by knowing the usage pattern of various functions. -- Michel Fortin michel.fortin michelf.com http://michelf.com/The problem is identifying if this would be faster than recomputing the return value.I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.
Jan 10 2009
Michel Fortin:So to determine if a function is worth memoizing, you have to say yes to those two things: 1. is it faster than recomputing the return value? 2. is this function called often with the same inputs? While the compiler could take an educated guess at answering 1 given the code of a function, question 2 can only be answered by knowing the usage pattern of various functions.GCC has profile-driven optimization. I presume it may help in such problems too.Perhaps the cache needs to be a little smarter than a regular AA. You may not want to keep each and every value that was computed. Depending on the situation, keeping only the 100 last results may be enough, in which case you can dedicate a fixed amount of memory for caching.Right. There are time-limited memoization strategies, and other ones as well. In Python I have seen plenty of them: http://code.activestate.com/recipes/325905/ http://code.activestate.com/recipes/498110/ Bye, bearophile
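[Editor's note: a time-limited memoizer of the kind those recipes implement might look like the following sketch; the decorator name and eviction policy are illustrative.]

```python
import time

# Time-limited memoization: cached entries expire ttl_seconds after
# being computed, bounding staleness rather than cache size.

def memoize_with_ttl(ttl_seconds):
    def decorator(fn):
        cache = {}  # args -> (timestamp, value)
        def wrapper(*args):
            now = time.monotonic()
            hit = cache.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]          # fresh enough: serve cached value
            value = fn(*args)
            cache[args] = (now, value)
            return value
        return wrapper
    return decorator
```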
Jan 10 2009
Michel Fortin wrote:On 2009-01-10 00:10:11 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:... so let's tell it what the usage patterns are! 1. Compile with profiling. $ dmd -profile pure_program.d 2. Feed the profiled executable a big gob of test data. Om nom nom. $ for I in {0..99}; do pure_program < test_data_$I.dat; done 3. Recompile, giving the compiler the trace log, telling it which functions got called frequently, and where time was spent. $ dmd -trace=trace.log pure_program.d Voila; now the compiler can make a well-informed guess at compile-time. All it really lacks is information on how many distinct argument sets any given pure function gets, but I imagine that could be added. I think feeding the program test data with profiling enabled and then re-compiling would be a very acceptable trade off. -- DanielThat's only true because, in your specific context, those functions were called often with the same input. In a program that rarely use the same inputs, your memoizing functions will just waste time and memory. So to determine if a function is worth memoizing, you have to say yes to those two things: 1. is it faster than recomputing the return value? 2. is this function called often with the same inputs? While the compiler could take an educated guess at answering 1 given the code of a function, question 2 can only be answered by knowing the usage pattern of various functions.The problem is identifying if this would be faster than recomputing the return value.I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.
Jan 10 2009
Michel Fortin wrote:On 2009-01-10 00:10:11 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:No, because I use interpolation. AndreiThat's only true because, in your specific context, those functions were called often with the same input. In a program that rarely use the same inputs, your memoizing functions will just waste time and memory.The problem is identifying if this would be faster than recomputing the return value.I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.
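[Editor's note: interpolation-based "memoization" for a smooth function like exp can be sketched as follows (Python for illustration; grid bounds and size are illustrative). Unlike a hash table keyed on exact arguments, a precomputed grid also "hits" on inputs never seen before, at the price of a small bounded error.]

```python
import math

# Precompute exp on a fixed grid once, then answer queries by linear
# interpolation between the two nearest grid points.

LO, HI, STEPS = 0.0, 1.0, 1000
STEP = (HI - LO) / STEPS
TABLE = [math.exp(LO + k * STEP) for k in range(STEPS + 1)]

def exp_interp(x):
    if not (LO <= x <= HI):        # outside the table: fall back
        return math.exp(x)
    pos = (x - LO) / STEP
    idx = int(pos)
    if idx >= STEPS:
        return TABLE[STEPS]
    frac = pos - idx
    return TABLE[idx] * (1.0 - frac) + TABLE[idx + 1] * frac
```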
Jan 10 2009
Andrei Alexandrescu wrote:Michel Fortin wrote:That's way beyond the ability of a compiler to do automatically. The compiler would have to understand that the pure function produces continuous results.On 2009-01-10 00:10:11 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:No, because I use interpolation.That's only true because, in your specific context, those functions were called often with the same input. In a program that rarely use the same inputs, your memoizing functions will just waste time and memory.The problem is identifying if this would be faster than recomputing the return value.I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.
Jan 10 2009
Walter Bright wrote:Andrei Alexandrescu wrote:You're replying to the wrong guy. I'm saying: the compiler shouldn't have to do so, but it should allow functions to do it. Lately it looks like a lot of the problems discussed here inevitably lead to a built-in feature. We want properties, let's have a property thingy. We want memoization, let's have a memoize thingy. We want optimally aligned structures, let's have an optimal align thingy. I'm holding my breath for a request for the kitchen sink thingy. AndreiMichel Fortin wrote:That's way beyond the ability of a compiler to do automatically. The compiler would have to understand that the pure function produces continuous results.On 2009-01-10 00:10:11 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:No, because I use interpolation.That's only true because, in your specific context, those functions were called often with the same input. In a program that rarely use the same inputs, your memoizing functions will just waste time and memory.The problem is identifying if this would be faster than recomputing the return value.I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.
Jan 10 2009
On Sun, Jan 11, 2009 at 8:51 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Walter Bright wrote:Just get Walter to implement AST macros and it should stop. :-) Until then the library solutions are often unpalatable. Like using string mixins to implement properties. Ick. But the call for a memoization thingy I just don't get. No way that should be a compiler feature, just far too many ways to memoize with too many different tradeoffs. And a memoizing wrapper function basically loses nothing in terms of expressiveness. --bbAndrei Alexandrescu wrote:You're replying to the wrong guy. I'm saying: the compiler shouldn't have to do so, but it should allow functions to do it. Lately it looks like a lot of the problems discussed here inevitably lead to a built-in feature. We want properties, let's have a property thingy. We want memoization, let's have a memoize thingy. We want optimally aligned structures, let's have an optimal align thingy. I'm holding my breath for a request for the kitchen sink thingy.Michel Fortin wrote:That's way beyond the ability of a compiler to do automatically. The compiler would have to understand that the pure function produces continuous results.On 2009-01-10 00:10:11 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:No, because I use interpolation.That's only true because, in your specific context, those functions were called often with the same input. In a program that rarely use the same inputs, your memoizing functions will just waste time and memory.The problem is identifying if this would be faster than recomputing the return value.I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.
Jan 10 2009
Bill Baxter wrote:But the call for a memoization thingy I just don't get. No way that should be a compiler feature, just far too many ways to memoize with too many different tradeoffs. And a memoizing wrapper function basically loses nothing in terms of expressiveness. --bbThe ability for you to memoize a pure function and get a pure function as a result. Your pure functions can't use the memoized version of the pure function. This means you might have to use a lot more memoization wrappers.
Jan 11 2009
On Sun, Jan 11, 2009 at 11:43 PM, Christopher Wright <dhasenan gmail.com> wrote:Bill Baxter wrote:Isn't this the same old "logically const" vs "const" argument we went through ages ago? --bbBut the call for a memoization thingy I just don't get. No way that should be a compiler feature, just far too many ways to memoize with too many different tradeoffs. And a memoizing wrapper function basically loses nothing in terms of expressiveness. --bbThe ability for you to memoize a pure function and get a pure function as a result. Your pure functions can't use the memoized version of the pure function. This means you might have to use a lot more memoization wrappers.
Jan 11 2009
Andrei Alexandrescu wrote:Here's the simplest memoization example I could come up with. Maybe we should try to figure out how to get this to work and then expand to other cases? /** Fibonocci number generator based on * http://en.wikipedia.org/wiki/Fibonacci_number **/ pure int fibonocci(int n) in { assert( i>=0 && i<10 ); } out(result) { assert( result>0 ); } body{ static __thread int[10] cache = [0,1,-1,-1,-1,-1,-1,-1,-1,-1]; if (cache[n] >= 0) return i; else return fibonacci(n-1)+fibonacci(n-2); } unittest{ assert(fibonocci(4) == 3); assert(fibonocci(9) == 34); } Note the thread-local static variable. That's guaranteed to be thread safe and is easy to prove that it has no side effects outside of the fibonocci function. Inevitably, I've embarrassed myself by not actually passing what I thought was simple code through a compiler...That's way beyond the ability of a compiler to do automatically. The compiler would have to understand that the pure function produces continuous results.You're replying to the wrong guy. I'm saying: the compiler shouldn't have to do so, but it should allow functions to do it.Lately it looks like a lot of the problems discussed here inevitably lead to a built-in feature. We want properties, let's have a property thingy. We want memoization, let's have a memoize thingy. We want optimally aligned structures, let's have an optimal align thingy. I'm holding my breath for a request for the kitchen sink thingy.That kitchen sink thingy sounds useful. What does it do? ;)
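[Editor's note: a corrected sketch of the memoized Fibonacci above, in Python for illustration. The cache is consulted AND newly computed values are stored back into it (the step the original left out), and the parameter is used consistently.]

```python
# Memoized Fibonacci: results are cached as they are computed, so each
# value is calculated at most once.

_fib_cache = {0: 0, 1: 1}

def fib(n):
    assert n >= 0
    if n not in _fib_cache:
        _fib_cache[n] = fib(n - 1) + fib(n - 2)
    return _fib_cache[n]
```

These match the unittest in the post: fib(4) == 3 and fib(9) == 34.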
Jan 11 2009
Jason House wrote:Andrei Alexandrescu wrote:So effectively, the compiler could just trust pure functions to be actually pure, with UB for any that aren't?That's way beyond the ability of a compiler to do automatically. The compiler would have to understand that the pure function produces continuous results.You're replying to the wrong guy. I'm saying: the compiler shouldn't have to do so, but it should allow functions to do it.Here's the simplest memoization example I could come up with.You mean the simplest example beyond just remembering a single property value?Maybe we should try to figure out how to get this to work and then expand to other cases?<snip>Note the thread-local static variable. That's guaranteed to be thread safe, and it is easy to prove that it has no side effects outside of the fibonacci function. Inevitably, I've embarrassed myself by not actually passing what I thought was simple code through a compiler...<snip> Indeed. Stewart.
Jan 11 2009
Andrei Alexandrescu wrote:Walter Bright wrote:You're right, the "just the bits on the stack" is if the arguments are immutable. Const arguments would have to include whatever was indirectly referenced, which I argue is impractical for the runtime to handle automatically.Michel Fortin wrote:Not the bits, the full objects if they have indirection. (Pure functions should accept const parameters!)Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.
Jan 10 2009
== Quote from Walter Bright (newshound1 digitalmars.com)'s articleAndrei Alexandrescu wrote:Even if the bits are immutable, I can't see how this could work. What if something is GC'd and then that memory location gets overwritten? For example: import std.contracts, std.stdio; enum N = 100; void main() { foo(); bar(); } uint pureFun(invariant uint[] data) pure { // Stand-in for some real computation over data. return cast(uint) data.length; } void foo() { uint[] myArray = new uint[N]; // Fill in myArray. invariant uint[] myInvariantArray = assumeUnique(myArray); writeln(pureFun(myInvariantArray)); } void bar() { // Same as foo(). } In this case, foo.myInvariantArray goes out of scope when foo() returns. When bar() allocates an array, it's entirely possible that the GC will give bar.myArray the same memory address as foo.myArray had. In this case, if you're only using stack bits to handle memoization, you're screwed.Walter Bright wrote:You're right, the "just the bits on the stack" is if the arguments are immutable. Const arguments would have to include whatever was indirectly referenced, which I argue is impractical for the runtime to handle automatically.Michel Fortin wrote:Not the bits, the full objects if they have indirection. (Pure functions should accept const parameters!)Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.
Jan 10 2009
dsimcha wrote:In this case, foo.myInvariantArray goes out of scope when foo() returns. When bar() allocates an array, it's entirely possible that the GC will give bar.myArray the same memory address as foo.myArray had. In this case, if you're only using stack bits to handle memoization, you're screwed.That's the beauty of a garbage collector <g>. The bits will include a reference to the immutable data, which the gc will see, and so the gc will not reclaim that data.
Jan 10 2009
== Quote from Walter Bright (newshound1 digitalmars.com)'s articledsimcha wrote:Good point, the bits in the internal memoization AA will contain a reference. I forgot about that. This leads to another problem, though: Wouldn't this cause a lot of soft memory leaks, since you're holding references to stuff that's otherwise no longer needed?In this case, foo.myInvariantArray goes out of scope when foo() returns. When bar() allocates an array, it's entirely possible that the GC will give bar.myArray the same memory address as foo.myArray had. In this case, if you're only using stack bits to handle memoization, you're screwed.That's the beauty of a garbage collector <g>. The bits will include a reference to the immutable data, which the gc will see, and so the gc will not reclaim that data.
Jan 10 2009
dsimcha wrote:== Quote from Walter Bright (newshound1 digitalmars.com)'s articleThe job of managing those kind of design tradeoffs is something that the programmer should handle.dsimcha wrote:Good point, the bits in the internal memoization AA will contain a reference. I forgot about that. This leads to another problem, though: Wouldn't this cause a lot of soft memory leaks, since you're holding references to stuff that's otherwise no longer needed?In this case, foo.myInvariantArray goes out of scope when foo() returns. When bar() allocates an array, it's entirely possible that the GC will give bar.myArray the same memory address as foo.myArray had. In this case, if you're only using stack bits to handle memoization, you're screwed.That's the beauty of a garbage collector <g>. The bits will include a reference to the immutable data, which the gc will see, and so the gc will not reclaim that data.
Jan 10 2009
Walter Bright wrote:dsimcha wrote:*cough* And with weak pointer/reference types we could handle it quite nicely. *cough* ;)Good point, the bits in the internal memoization AA will contain a reference. I forgot about that. This leads to another problem, though: Wouldn't this cause a lot of soft memory leaks, since you're holding references to stuff that's otherwise no longer needed?The job of managing those kind of design tradeoffs is something that the programmer should handle.
Jan 12 2009
"Walter Bright" wroteAndrei Alexandrescu wrote:Bits on the stack also do not work for immutable reference arguments. Imagine two immutable strings that are exactly the same data, but have different pointers. You wouldn't want to memoize only on the pointer values (or you'd lose I think bits on the stack only work for non-reference arguments. It sounds like memoization is a non-automatic feature in any case. Which is a shame because I thought it was a common pure-function optimization by the way it was discussed in the past. -SteveWalter Bright wrote:You're right, the "just the bits on the stack" is if the arguments are immutable. Const arguments would have to include whatever was indirectly referenced, which I argue is impractical for the runtime to handle automatically.Michel Fortin wrote:Not the bits, the full objects if they have indirection. (Pure functions should accept const parameters!)Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.
Jan 12 2009
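For what it's worth, the content-vs-pointer distinction Steve raises can be seen with D's built-in associative arrays, which hash and compare array elements rather than the pointer/length pair. A minimal editorial sketch (the cache and function names are invented for illustration; `s.length` stands in for an expensive pure computation):

```d
import std.stdio;

// Cache keyed by string *contents*: two equal strings stored at
// different addresses hit the same entry, because AA lookup hashes
// the characters, not the pointer bits.
size_t[string] lenCache;

size_t memoizedLength(string s)
{
    if (auto hit = s in lenCache)
        return *hit;                  // answered from the cache
    return lenCache[s] = s.length;    // stand-in for a costly pure computation
}

void main()
{
    string a = "immutable data";
    string b = a.idup;               // equal contents, distinct storage
    writeln(memoizedLength(a));      // miss: computes and stores
    writeln(memoizedLength(b));      // hit: found by content, despite a different pointer
}
```

The trade-off is the one discussed above: content keying must hash the referenced data on every lookup, which is exactly what makes automatic runtime memoization of reference arguments expensive.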
On 2009-01-09 22:41:01 -0500, Walter Bright <newshound1 digitalmars.com> said:If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.True, but only in the absence of pointers to mutable data (pointers to immutable data should be fine). (Hum, and beware of moving GCs, and ignore any memoization data which may be present inside the arguments... isn't determining equality a job for opEquals?)The problem is identifying if this would be faster than recomputing the return value.That's why I was using a "memoized" attribute in the function declaration, so the programmer decides if this function needs memoization or not. I know it's often a tough decision even for the designer of a function, but if adding memoization becomes easy, as a library designer you could just not worry about it and let users create a memoization wrapper when necessary. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jan 10 2009
Walter Bright wrote:Michel Fortin wrote:<snip> Walter, before you go and implement any of this, I must point out that it's spelt "memorization" and "memorize". (Or "memorisation" and "memorise" if you're British, but that's an aside.) Stewart.Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.
Jan 10 2009
Stewart Gordon wrote: <snip>Walter, before you go and implement any of this, I must point out that it's spelt "memorization" and "memorize". (Or "memorisation" and "memorise" if you're British, but that's an aside.)Or maybe not... http://en.wikipedia.org/wiki/Memoization Still, I'm confused.... Stewart.
Jan 10 2009
Stewart Gordon wrote:Stewart Gordon wrote: <snip>I thought you were joking. What's confusing you?Walter, before you go and implement any of this, I must point out that it's spelt "memorization" and "memorize". (Or "memorisation" and "memorise" if you're British, but that's an aside.)Or maybe not... http://en.wikipedia.org/wiki/Memoization Still, I'm confused.... Stewart.
Jan 10 2009
Christopher Wright wrote:Stewart Gordon wrote:I forgot.Stewart Gordon wrote: <snip>I thought you were joking. What's confusing you?Walter, before you go and implement any of this, I must point out that it's spelt "memorization" and "memorize". (Or "memorisation" and "memorise" if you're British, but that's an aside.)Or maybe not... http://en.wikipedia.org/wiki/Memoization Still, I'm confused.... Stewart.
Jan 10 2009
Walter Bright wrote:Christopher Wright wrote:Walter Bright is the new Stewart Gordon!Stewart Gordon wrote:I forgot.Stewart Gordon wrote: <snip>I thought you were joking. What's confusing you?Walter, before you go and implement any of this, I must point out that it's spelt "memorization" and "memorize". (Or "memorisation" and "memorise" if you're British, but that's an aside.)Or maybe not... http://en.wikipedia.org/wiki/Memoization Still, I'm confused.... Stewart.
Jan 11 2009
On Sun, Jan 11, 2009 at 4:31 AM, Stewart Gordon <smjg_1998 yahoo.com> wrote:Stewart Gordon wrote: <snip>Heh heh. Yeh, I remember I thought I'd found a major typo in the CLR algorithms tome when I first ran across that. But it's not a typo. It's "memo-ization" as in writing down a memo for yourself to remind you of the answer later. --bbWalter, before you go and implement any of this, I must point out that it's spelt "memorization" and "memorize". (Or "memorisation" and "memorise" if you're British, but that's an aside.)Or maybe not... http://en.wikipedia.org/wiki/Memoization Still, I'm confused.... Stewart.
Jan 10 2009
Michel Fortin wrote:On 2009-01-09 20:45:57 -0500, Jason House <jason.james.house gmail.com> said:Yes! Memoization is One Great Litmus Test of compile-time reflection. I implemented a couple of simple memoizers along the lines you mention, and I plan to make memoization examples part of TDPL. The Even Greater Litmus Test would be to expose the memoizer itself as a pure function (which it essentially is). AndreiHum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure? Something like this: pure memoize int costlyCalculus(int i, int j) { return i + j; } being transformed into this by the compiler? int[Tuple!(int, int)] _costlyCalculus_cache; pure int costlyCalculus(int i, int j) { { int* _found_result = tuple(i, j) in _costlyCalculus_cache; if (_found_result) return *_found_result; } { int _calculated_result = i + j; _costlyCalculus_cache[tuple(i, j)] = _calculated_result; return _calculated_result; } } Surely this way the compiler would be able to bypass purity checks for accessing the cache while still ensuring that the function has no side effect and always returns the same result for the same arguments. The downside being that a hash table may not always be the best memoization technique. Perhaps then it belongs more in the standard library. pure int costlyCalculus(int i, int j); Memoizer!(costlyCalculus) memoizedCostlyCalculus; where Memoizer is a struct template defining a pure opCall and containing the cache it can access by casting away purity in a few places. But it makes it more difficult to add memoization while overriding a function.For each function, you'll have to pick which way you want to go. I don't see what that has to do with D's future, though, the language allows either.Somehow, I had the impression that D might use memoization as some kind of optimization once pure functions are embraced. That was probably a misunderstanding on my part.
Jan 09 2009
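A Memoizer along the lines Michel sketches can be written with D templates today; here is a hedged editorial sketch (the member names are invented, and the *pure* opCall he wants would still need a compiler-sanctioned way to cast away purity, so this version is deliberately impure):

```d
import std.stdio;
import std.traits : ReturnType, ParameterTypeTuple;
import std.typecons : Tuple, tuple;

// Wraps any function in a cache keyed by the argument tuple. The cache is
// mutable state, which is exactly the tension discussed in this thread.
struct Memoizer(alias fun)
{
    private ReturnType!fun[Tuple!(ParameterTypeTuple!fun)] cache;

    ReturnType!fun opCall(ParameterTypeTuple!fun args)
    {
        auto key = tuple(args);
        if (auto hit = key in cache)
            return *hit;                  // answered from the cache
        return cache[key] = fun(args);    // compute once, remember
    }
}

int costlyCalculus(int i, int j) { return i + j; }

void main()
{
    Memoizer!costlyCalculus memoized;
    writeln(memoized(2, 3));  // first call computes
    writeln(memoized(2, 3));  // second call is a cache hit
}
```

Tuple keys hash and compare by value, so this sidesteps the pointer-bits problem for the argument list, though reference arguments still pin their data in the cache as noted above.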
Walter Bright wrote:Jason House wrote:Memoization and purity can and should go together. We need to sit down and discuss that someday. AndreiWalter Bright Wrote:Jason House wrote:One could easily generalize that to mean non-final virtual functions should never be pure. Memoization is a pretty basic need. I thought that was in D's future. Is that not true?When considering the standard library changes, I realized that object.d could change. I believe a pure opCmp or toString could break user code with impure versions of those functions. Would that kind of a change to object.d cause any real problems for D2 users?As Andrei pointed out, the trouble with making the Object functions pure is if you want to do an override that caches its value.Memoization and purity don't mix. For each function, you'll have to pick which way you want to go. I don't see what that has to do with D's future, though, the language allows either.
Jan 09 2009
Walter Bright wrote: <snip>Memoization and purity don't mix. For each function, you'll have to pick which way you want to go. I don't see what that has to do with D's future, though, the language allows either.Actually, what they don't mix with is the requirement that it still be callable on a mutable object. There's http://d.puremagic.com/issues/show_bug.cgi?id=1824 and, while it's a nice idea, it precludes caching of the result. Unless you want to keep the cache separate from the object. A pure toString may be a nice idea; however, this precludes using any mutable members of the class to generate it. But it would also be nice to be able to call such methods on all objects of the class, mutable or not. One possibility is to have separate functions, string toString() and string toString() invariant, and let the programmer implement memo(r)ization on the mutable version if desired, whereas the invariant version would have it generated by the compiler. But there are still problems: nobody would know which to call if the reference is const rather than mutable or invariant, and it would be necessary to maintain two versions of toString in parallel. Maybe there's a solution somewhere out there.... Stewart.
Jan 10 2009
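Stewart's two-overload idea can at least be typed out. An editorial sketch with invented names (whether overload resolution does something sensible for const references is exactly the open problem he notes, so treat this as illustration, not a settled design):

```d
import std.string : format;

class Point
{
    int x, y;
    private string cached;  // the mutable overload's memo slot

    // Mutable receiver: allowed to memoize into the object.
    override string toString()
    {
        if (cached is null)
            cached = render();
        return cached;
    }

    // Invariant receiver: may not mutate, so it recomputes every call
    // (or, per Stewart, the compiler could generate caching for it).
    string toString() invariant
    {
        return render();
    }

    // const, so it is callable from both overloads.
    private string render() const
    {
        return format("(%s, %s)", x, y);
    }
}
```

The maintenance cost Stewart mentions is visible even here: both overloads must agree on render(), and any caching policy has to be duplicated or abandoned on the invariant side.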