digitalmars.D - Is this function pure?
- Janice Caron (15/15) Sep 18 2007 Is this function pure?
- Vladimir Panteleev (13/28) Sep 18 2007 Forbidding operations that may cause an exception would imply forbidding...
- Lutger (4/21) Sep 18 2007 Putting what the compiler can check aside, isn't it the return statement...
- Janice Caron (16/21) Sep 18 2007 I don't think so, since I could easily rewrite it as
- Lutger (33/56) Sep 18 2007 What I meant is that the return value is depending on whether new throws...
- Steven Schveighoffer (7/22) Sep 18 2007 I would guess that it is not pure, as it is having side effects. The ne...
- Lutger (5/34) Sep 18 2007 That's true, using 'new' does make a function impure. That doesn't mean
- Don Clugston (16/33) Sep 18 2007 Interesting question.
- OF (5/22) Sep 18 2007 As long as new char[x] can fail without consistency this function is not...
- Janice Caron (13/15) Sep 18 2007 I think the "address of" operator would have to banned completely from
- Bruno Medeiros (6/19) Sep 18 2007 is invalid even for non-pure functions.
- Janice Caron (7/16) Sep 18 2007 If you insist...
- Brad Roberts (20/42) Sep 18 2007 Purity typically means that the result of the function is exclusively
- Steven Schveighoffer (3/6) Sep 18 2007 You mean 'logically' pure? (cackles manically)
- Nathan Reed (19/36) Sep 18 2007 Pure is intended to mimic some of the capabilities of functional
- Nathan Reed (9/13) Sep 18 2007 I just realized that (as someone else pointed out) taking the address of...
- BCS (8/23) Sep 18 2007 It would be a better approach (IMHO) for pure function to just be preven...
- BCS (2/27) Sep 18 2007 (BTW: all that is with regards to pure running at compile time)
- Steven Schveighoffer (26/42) Sep 18 2007 I tend to agree with you. But Walter is touting pure functions as being...
- Steven Schveighoffer (6/16) Sep 18 2007 Totally misread your challenge. I thought you said impossible to write ...
- Nathan Reed (7/12) Sep 18 2007 The allocator has to deal with allocation requests coming from multiple
- 0ffh (5/9) Sep 18 2007 We might even want go all the way, and *require* a pure function to thro...
- Bruce Adams (3/14) Sep 18 2007 My understanding is that pure is a compile time concept. If your functio...
- Nathan Reed (8/9) Sep 18 2007 Pure functions aren't necessarily /evaluated/ at compile time. They can...
- Janice Caron (22/24) Sep 19 2007 I presume that stands for Compile Time Function Execution and not
- Robert Fraser (3/33) Sep 19 2007 Someone probably already mentioned this, but you can't (easily, i.e. may...
- Brad Roberts (3/38) Sep 19 2007 Yes you can. The malloc subsystem out of practical necessity is
- Robert Fraser (2/42) Sep 19 2007
- Bill Baxter (7/24) Sep 19 2007 On the other hand the support for concurrency may consist of little more...
- Nathan Reed (12/21) Sep 19 2007 On a multicore system the allocator might conceivably have a heap for
- Bill Baxter (22/39) Sep 19 2007 I've heard this claim a couple of times recently (allocation is fast in
- Frits van Bommel (15/17) Sep 19 2007 Yes, AFAIK the 'fast allocation' is mostly true of garbage collectors
- Nathan Reed (9/11) Sep 19 2007 I didn't know that; I assumed D was using at least a copying collector.
- Frits van Bommel (43/53) Sep 19 2007 The big blocker to any sort of copying/moving collector is that it needs...
- 0ffh (11/25) Sep 18 2007 Actually, in spite of all the "Not quite sure, probably not." replies
- Janice Caron (4/23) Sep 18 2007 You don't? The way I see it, if the input is 5, then output might
- 0ffh (11/14) Sep 18 2007 That's surely a point which can be discussed:
- Janice Caron (12/14) Sep 18 2007 I don't think it's that simple. Exceptions should be allowed as flow
- Ingo Oeser (12/27) Sep 18 2007 A valid optimisation is to assume that x -> a.length, since a.length
- Janice Caron (16/20) Sep 18 2007 OK, so change it to
- Ingo Oeser (7/25) Sep 18 2007 Ok, now that's impure, since now you are using the memory by returning i...
- 0ffh (6/18) Sep 18 2007 I am afraid I must disagree.
- Ingo Oeser (8/12) Sep 18 2007 Sorry, mixed that up with const functions.
- Nathan Reed (16/20) Sep 18 2007 I think if you take this point of view, you have to conclude that even a...
Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?
Sep 18 2007
On Tue, 18 Sep 2007 15:36:50 +0300, Janice Caron <caron800 googlemail.co= m> wrote:Is this function pure? int probablyNotPure(int x) { try { char[] a =3D new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?Forbidding operations that may cause an exception would imply forbidding= division, since that may raise a division by zero statement. I don't se= e what's wrong with handling your own exceptions. And, "determinism" is = a stretchable notion - if you can determine the number of concurrent thr= eads which will call this function, the points where switches occur and = the amount of available memory... in D's context, though, I don't think = determinism should be a necessity for pure functions - just the limitati= on of not accessing external data or "non-pure" code. -- = Best regards, Vladimir mailto:thecybershadow gmail.com
Sep 18 2007
Janice Caron wrote:Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?Putting what the compiler can check aside, isn't it the return statement in the catch block what prevents this function from being pure, or referentially transparent?
Sep 18 2007
On 9/18/07, Lutger <lutger.blijdestijn gmail.com> wrote:Janice Caron wrote:I don't think so, since I could easily rewrite it as int probablyNotPure(int x) { int n; try { char[] a = new char[x]; n = a.length; } catch { n = 0; } return n; }Is this function pure?Putting what the compiler can check aside, isn't it the return statement in the catch block what prevents this function from being pure, or referentially transparent?
Sep 18 2007
Janice Caron wrote:On 9/18/07, Lutger <lutger.blijdestijn gmail.com> wrote:What I meant is that the return value is depending on whether new throws or not. This is the case in the above function as well as the one in the original post, they amount to the same thing. From the standpoint of referential transparency I don't think this function is really different from: int surelyNotPure(int x) { int n = rand(); if (n > 100) { return x; } else { return 0; } } But if you write it as this I don't see why it would not be pure: int probablyPure(int x) { int n; try { char[] a = new char[x]; n = a.length; } catch { throw new FooException; } return n; }Janice Caron wrote:I don't think so, since I could easily rewrite it as int probablyNotPure(int x) { int n; try { char[] a = new char[x]; n = a.length; } catch { n = 0; } return n; }Is this function pure?Putting what the compiler can check aside, isn't it the return statement in the catch block what prevents this function from being pure, or referentially transparent?
Sep 18 2007
"Janice Caron" wroteIs this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?I would guess that it is not pure, as it is having side effects. The new call is modifying global state (that of the garbage collector), so I don't see how that 'new' call can be pure. From what I understand, a pure function can only call other pure functions. So I think to answer your question, new should be forbidden. -Steve
Sep 18 2007
Steven Schveighoffer wrote:"Janice Caron" wroteThat's true, using 'new' does make a function impure. That doesn't mean it cannot be referentially transparent though. Even if some 'impure' operations were to be allowed, in the case of new you would also have to take custom allocators into account.Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?I would guess that it is not pure, as it is having side effects. The new call is modifying global state (that of the garbage collector), so I don't see how that 'new' call can be pure. From what I understand, a pure function can only call other pure functions. So I think to answer your question, new should be forbidden. -Steve
Sep 18 2007
Janice Caron wrote:Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?Interesting question. This is exactly the same: int probablyToo(char [] x) { try { return (x ~ "a").length; } catch { return 0; } } If 'new' was forbidden for reasons of non-determinism, then string concatenation would have to be, as well. Which I think would make pure functions pretty limited. But, if you assume that anything with is CTFE-able is also pure, then string concatenation is OK.
Sep 18 2007
Janice Caron Wrote:Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?As long as new char[x] can fail without consistency this function is not pure. That is, the return value depends on something else than x. However, allocating memory is in itself not necessarily 'unpure', as long as it's predictible (say: I always get memory, else make this computation fail, which could be handled by a caller on a lower and not pure level). Problems arise as long as you allow new to fail like this or if you use the address you get (which of course is not deterministic to the degree we want). Probably it's best to leave memory allocations to implicit allocations in pure functions, but this is tricky if you want to return a new object as a modified version of an input object. I don't know how it's intended for D to solve this nicely, but I'm very interested in hearing it.
Sep 18 2007
On 9/18/07, OF <nospam nospammington.com> wrote:Problems arise as long as you allow new to fail like this or if you use the address you get (which of course is not deterministic to the degree we want).I think the "address of" operator would have to banned completely from pure functions, otherwise you could do int definitelyNotPure() { int x; return cast(int)&x; }Probably it's best to leave memory allocations to implicit allocations in pure functions, but this is tricky if you want to return a new object as a modified version of an input object. I don't know how it's intended for D to solve this nicely, but I'm very interested in hearing it.The only way I can think of is the old-fashioned C way, of passing a buffer and a length (or in D, an array) into the function, and using that and that alone as your memory. ...Ooooh, but wait - don't all inputs have to be (transitively) const!? Yikes.
Sep 18 2007
Janice Caron wrote:On 9/18/07, OF <nospam nospammington.com> wrote:Can you give another example? The one you gave:Problems arise as long as you allow new to fail like this or if you use the address you get (which of course is not deterministic to the degree we want).I think the "address of" operator would have to banned completely from pure functions, otherwise you could doint definitelyNotPure() { int x; return cast(int)&x; }is invalid even for non-pure functions. -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Sep 18 2007
On 9/19/07, Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:Can you give another example? The one you gave:If you insist... int definitelyNotPure() { auto x = new int[1]; return cast(int)&x; }int definitelyNotPure() { int x; return cast(int)&x; }is invalid even for non-pure functions.
Sep 18 2007
Janice Caron wrote:On 9/18/07, OF <nospam nospammington.com> wrote:Purity typically means that the result of the function is exclusively determined by it's inputs. IE, that the results are entirely repeatable. So, this would be pure: void foo(in int x, out int y) { y = x+1; } No matter how often you call it, for any given value of x, what it sets y to is exactly the same. Obviously, this is a simple example and it's essentially isomorphic with the simpler: int foo(in int x) { return x+1; } Whether use of the memory subsystem or other subtle side effects affects purity likely becomes a matter of definition. I personally hope that it's included such that your original example is pure. Later, BradProblems arise as long as you allow new to fail like this or if you use the address you get (which of course is not deterministic to the degree we want).I think the "address of" operator would have to banned completely from pure functions, otherwise you could do int definitelyNotPure() { int x; return cast(int)&x; }Probably it's best to leave memory allocations to implicit allocations in pure functions, but this is tricky if you want to return a new object as a modified version of an input object. I don't know how it's intended for D to solve this nicely, but I'm very interested in hearing it.The only way I can think of is the old-fashioned C way, of passing a buffer and a length (or in D, an array) into the function, and using that and that alone as your memory. ...Ooooh, but wait - don't all inputs have to be (transitively) const!? Yikes.
Sep 18 2007
"Brad Roberts" wroteWhether use of the memory subsystem or other subtle side effects affects purity likely becomes a matter of definition. I personally hope that it's included such that your original example is pure.You mean 'logically' pure? (cackles manically) -Steve
Sep 18 2007
Janice Caron wrote:Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } } I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?Pure is intended to mimic some of the capabilities of functional languages, yes? Well, at the implementation level functional languages have to allocate memory just like anyone else, and those allocations can potentially fail just like always. That doesn't keep functional language functions from being pure, so it shouldn't keep D functions from being pure either. The function you've written is nondeterministic, but only due to the nondeterminism inherent in memory allocation. It would be interesting to see if it's possible to write a function that acheives nondeterminism *without* either memory allocation or reading global state (e.g. rand()). I believe it's not. If that is the case, then the only potential source of nondeterminism in pure functions is memory allocation, and I don't think we need to really worry about that too much. After all, if we've run out of memory, we've likely got bigger problems than our pure functions becoming nondeterministic. :) Thanks, Nathan Reed
Sep 18 2007
Nathan Reed wrote:It would be interesting to see if it's possible to write a function that acheives nondeterminism *without* either memory allocation or reading global state (e.g. rand()). I believe it's not.I just realized that (as someone else pointed out) taking the address of a local variable would also be nondeterministic. Pointer shenanigans in general can lead to nondeterminism, so maybe pointers should be forbidden entirely in pure functions. Given that pointers are rarely used in D except for interfacing with C functions and suchlike, this seems like it would be not entirely unreasonable. Thanks, Nathan Reed
Sep 18 2007
Reply to Nathan,Nathan Reed wrote:It would be a better approach (IMHO) for pure function to just be prevented from accessing "other things". This could be done with some sort of VM. It is interesting to note that given such a VM, pure function can call some non pure function. The requirement would be that a pure function must not call anything that accesses something that the pure function can't. In a VM this is easy to enforce, just don't put anything into it that the function shouldn't access.It would be interesting to see if it's possible to write a function that acheives nondeterminism *without* either memory allocation or reading global state (e.g. rand()). I believe it's not.I just realized that (as someone else pointed out) taking the address of a local variable would also be nondeterministic. Pointer shenanigans in general can lead to nondeterminism, so maybe pointers should be forbidden entirely in pure functions. Given that pointers are rarely used in D except for interfacing with C functions and suchlike, this seems like it would be not entirely unreasonable. Thanks, Nathan Reed
Sep 18 2007
Reply to Benjamin,Reply to Nathan,(BTW: all that is with regards to pure running at compile time)Nathan Reed wrote:It would be a better approach (IMHO) for pure function to just be prevented from accessing "other things". This could be done with some sort of VM. It is interesting to note that given such a VM, pure function can call some non pure function. The requirement would be that a pure function must not call anything that accesses something that the pure function can't. In a VM this is easy to enforce, just don't put anything into it that the function shouldn't access.It would be interesting to see if it's possible to write a function that acheives nondeterminism *without* either memory allocation or reading global state (e.g. rand()). I believe it's not.I just realized that (as someone else pointed out) taking the address of a local variable would also be nondeterministic. Pointer shenanigans in general can lead to nondeterminism, so maybe pointers should be forbidden entirely in pure functions. Given that pointers are rarely used in D except for interfacing with C functions and suchlike, this seems like it would be not entirely unreasonable. Thanks, Nathan Reed
Sep 18 2007
"Nathan Reed" wrotePure is intended to mimic some of the capabilities of functional languages, yes? Well, at the implementation level functional languages have to allocate memory just like anyone else, and those allocations can potentially fail just like always. That doesn't keep functional language functions from being pure, so it shouldn't keep D functions from being pure either.I tend to agree with you. But Walter is touting pure functions as being able to run simultaneously on two cores. What happens when both functions allocate memory without normal thread locking? i.e.: char(invariant)[] x = y() ~ z(); // both y and z are pure Since y and z are pure, they should be able to run simultaneously on two cores. What happens when they both try to allocate memory?The function you've written is nondeterministic, but only due to the nondeterminism inherent in memory allocation. It would be interesting to see if it's possible to write a function that acheives nondeterminism *without* either memory allocation or reading global state (e.g. rand()). I believe it's not.I don't follow... Isn't this a pure function? no global state read/memory allocation: int add(int x, int y) { return x + y; } Basically, you can do pure functions only on the stack, but the issue becomes returning a variable length data structure, such as a string. I think you can't do it unless memory allocation is given a pass. However, then all new operator overrides must be allowed to be pure. Perhaps you give a single memory allocation routine a free pass, and code that deep in the bowels of D, making sure it is thread-safe. Then all other memory allocation routines can declare themselves pure if they use that memory allocation routine as a base. I'm sure Walter has a solution in mind here...If that is the case, then the only potential source of nondeterminism in pure functions is memory allocation, and I don't think we need to really worry about that too much. After all, if we've run out of memory, we've likely got bigger problems than our pure functions becoming nondeterministic. :)I think exception catching should be outlawed in pure functions. This makes the pure function deterministic iff it does not throw an exception. You can handle cases like divide by zero just by checking if the divisor is 0. -Steve
Sep 18 2007
"Steven Schveighoffer" wrote"Nathan Reed" wroteTotally misread your challenge. I thought you said impossible to write a function that achieves *determinism*. Sorry :) I think you are right. I think the challenge can be reduced to just reading global state, as allocating memory is a subset of that. -SteveIt would be interesting to see if it's possible to write a function that acheives nondeterminism *without* either memory allocation or reading global state (e.g. rand()). I believe it's not.I don't follow... Isn't this a pure function? no global state read/memory allocation: int add(int x, int y) { return x + y; }
Sep 18 2007
Steven Schveighoffer wrote:Since y and z are pure, they should be able to run simultaneously on two cores. What happens when they both try to allocate memory?The allocator has to deal with allocation requests coming from multiple threads/cores anyway. This case presents no additional problem.I think exception catching should be outlawed in pure functions. This makes the pure function deterministic iff it does not throw an exception. You can handle cases like divide by zero just by checking if the divisor is 0.Exception throwing and catching is deterministic too as long as the condition that generates the exception is (which dividing by zero would be). I think it is perfectly reasonable to allow pure functions to throw and catch exceptions.
Sep 18 2007
Nathan Reed wrote:Exception throwing and catching is deterministic too as long as the condition that generates the exception is (which dividing by zero would be). I think it is perfectly reasonable to allow pure functions to throw and catch exceptions.We might even want go all the way, and *require* a pure function to throw an exception if it detects that functional purity has been compromised like in Janice's out-of-memory example. Regards, frank
Sep 18 2007
0ffh Wrote:Nathan Reed wrote:My understanding is that pure is a compile time concept. If your function can violate it its not pure, unless you have a scheme for compile time exception handling - yeauch. Bruce.Exception throwing and catching is deterministic too as long as the condition that generates the exception is (which dividing by zero would be). I think it is perfectly reasonable to allow pure functions to throw and catch exceptions.We might even want go all the way, and *require* a pure function to throw an exception if it detects that functional purity has been compromised like in Janice's out-of-memory example. Regards, frank
Sep 18 2007
Bruce Adams wrote:My understanding is that pure is a compile time concept. If your function can violate it its not pure, unless you have a scheme for compile time exception handling - yeauch.Pure functions aren't necessarily /evaluated/ at compile time. They can throw exceptions when evaluated at run-time, just like any function. However, even for functions evaluated at compile time I can't see that throwing and catching exceptions would be that big a deal. CTFE is basically like running an interpreter, right? Thanks, Nathan Reed
Sep 18 2007
On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:CTFE is basically like running an interpreter, right?I presume that stands for Compile Time Function Execution and not chlorotrifluoroethylene, right? :-) My feeling is that Ingo is basically right, in that Walter needs to assume that new can never fail - that it can successfully allocate infinite amounts of memory without causing trouble. If that assumption holds (...obviously it doesn't, but bear with me...) then new is allowed. Otherwise the simplest things would be banned: a function which does toString (e.g. that turns 42 into "42") would be banned; a function which adds a passed-in element to a passed-in associative array would be banned, and so on. I guess it just has to be allowed. But I still have a problem with it, because such a program is only "logically pure", not "physically pure". And I have to wonder, will be be allowed to write our own "logically pure" functions and declare them pure? How will custom allocators for "new" fit into the picture? A custom allocator must surely access global state? A custom allocator could be written to be (logically) pure, or it could be decidedly impure (for example by initialising the created variables with rand()). Will purity be transitive and as strictly enforced as const? Will we be able to cast away purity when we need to? I guess we'll wait and see, but I would surely like to know.
Sep 19 2007
Someone probably already mentioned this, but you can't (easily, i.e. maybe some sort of heap partitioning could do it but you'd need OS support for that) automatically parallelize functions that allocate memory (since the heap is a shared resource), which was one of the big selling points of "pure". So I'm assuming that allocations are out of the equation. And since exceptions are most often paired up with an allocation (although not necessarily), those might be out, too. Not totally sure about that one. There's no reason stack-allocated/scope classes couldn't be used, though. Janice Caron Wrote:On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:CTFE is basically like running an interpreter, right?I presume that stands for Compile Time Function Execution and not chlorotrifluoroethylene, right? :-) My feeling is that Ingo is basically right, in that Walter needs to assume that new can never fail - that it can successfully allocate infinite amounts of memory without causing trouble. If that assumption holds (...obviously it doesn't, but bear with me...) then new is allowed. Otherwise the simplest things would be banned: a function which does toString (e.g. that turns 42 into "42") would be banned; a function which adds a passed-in element to a passed-in associative array would be banned, and so on. I guess it just has to be allowed. But I still have a problem with it, because such a program is only "logically pure", not "physically pure". And I have to wonder, will be be allowed to write our own "logically pure" functions and declare them pure? How will custom allocators for "new" fit into the picture? A custom allocator must surely access global state? A custom allocator could be written to be (logically) pure, or it could be decidedly impure (for example by initialising the created variables with rand()). Will purity be transitive and as strictly enforced as const? Will we be able to cast away purity when we need to? I guess we'll wait and see, but I would surely like to know.
Sep 19 2007
Yes you can. The malloc subsystem out of practical necessity is internally thread safe and can be used concurrently without concern. Robert Fraser wrote:Someone probably already mentioned this, but you can't (easily, i.e. maybe some sort of heap partitioning could do it but you'd need OS support for that) automatically parallelize functions that allocate memory (since the heap is a shared resource), which was one of the big selling points of "pure". So I'm assuming that allocations are out of the equation. And since exceptions are most often paired up with an allocation (although not necessarily), those might be out, too. Not totally sure about that one. There's no reason stack-allocated/scope classes couldn't be used, though. Janice Caron Wrote:On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:CTFE is basically like running an interpreter, right?I presume that stands for Compile Time Function Execution and not chlorotrifluoroethylene, right? :-) My feeling is that Ingo is basically right, in that Walter needs to assume that new can never fail - that it can successfully allocate infinite amounts of memory without causing trouble. If that assumption holds (...obviously it doesn't, but bear with me...) then new is allowed. Otherwise the simplest things would be banned: a function which does toString (e.g. that turns 42 into "42") would be banned; a function which adds a passed-in element to a passed-in associative array would be banned, and so on. I guess it just has to be allowed. But I still have a problem with it, because such a program is only "logically pure", not "physically pure". And I have to wonder, will be be allowed to write our own "logically pure" functions and declare them pure? How will custom allocators for "new" fit into the picture? A custom allocator must surely access global state? A custom allocator could be written to be (logically) pure, or it could be decidedly impure (for example by initialising the created variables with rand()). Will purity be transitive and as strictly enforced as const? Will we be able to cast away purity when we need to? I guess we'll wait and see, but I would surely like to know.
Sep 19 2007
Heh, never knew that; that's good news. Thanks for brightening up my day! Brad Roberts Wrote:Yes you can. The malloc subsystem out of practical necessity is internally thread safe and can be used concurrently without concern. Robert Fraser wrote:Someone probably already mentioned this, but you can't (easily, i.e. maybe some sort of heap partitioning could do it but you'd need OS support for that) automatically parallelize functions that allocate memory (since the heap is a shared resource), which was one of the big selling points of "pure". So I'm assuming that allocations are out of the equation. And since exceptions are most often paired up with an allocation (although not necessarily), those might be out, too. Not totally sure about that one. There's no reason stack-allocated/scope classes couldn't be used, though. Janice Caron Wrote:On 9/19/07, Nathan Reed <nathaniel.reed gmail.com> wrote:CTFE is basically like running an interpreter, right?I presume that stands for Compile Time Function Execution and not chlorotrifluoroethylene, right? :-) My feeling is that Ingo is basically right, in that Walter needs to assume that new can never fail - that it can successfully allocate infinite amounts of memory without causing trouble. If that assumption holds (...obviously it doesn't, but bear with me...) then new is allowed. Otherwise the simplest things would be banned: a function which does toString (e.g. that turns 42 into "42") would be banned; a function which adds a passed-in element to a passed-in associative array would be banned, and so on. I guess it just has to be allowed. But I still have a problem with it, because such a program is only "logically pure", not "physically pure". And I have to wonder, will be be allowed to write our own "logically pure" functions and declare them pure? How will custom allocators for "new" fit into the picture? A custom allocator must surely access global state? A custom allocator could be written to be (logically) pure, or it could be decidedly impure (for example by initialising the created variables with rand()). Will purity be transitive and as strictly enforced as const? Will we be able to cast away purity when we need to? I guess we'll wait and see, but I would surely like to know.
Sep 19 2007
Robert Fraser wrote:Someone probably already mentioned this, but you can't (easily, i.e. maybe some sort of heap partitioning could do it but you'd need OS support for that) automatically parallelize functions that allocate memory (since the heap is a shared resource), which was one of the big selling points of "pure". So I'm assuming that allocations are out of the equation. And since exceptions are most often paired up with an allocation (although not necessarily), those might be out, too. Not totally sure about that one. There's no reason stack-allocated/scope classes couldn't be used, though.Brad Roberts Wrote:Yes you can. The malloc subsystem out of practical necessity is internally thread safe and can be used concurrently without concern.Robert Fraser wrote: Heh, never knew that; that's good news. Thanks for brightening up my day!On the other hand the support for concurrency may consist of little more than a global mutex. That means that the performance benefits of launching multiple threads might not be so great since they'll all just end up in a queued waiting for their turn to malloc. I have no idea what they do in practice though. Do typical malloc systems actually allow simultaneous allocations in multiple threads? --bb
Sep 19 2007
Bill Baxter wrote:On the other hand the support for concurrency may consist of little more than a global mutex. That means that the performance benefits of launching multiple threads might not be so great since they'll all just end up in a queued waiting for their turn to malloc. I have no idea what they do in practice though. Do typical malloc systems actually allow simultaneous allocations in multiple threads? --bbOn a multicore system the allocator might conceivably have a heap for each core or something like that. However, even if this is not the case, allocation in a garbage-collected language is very quick and the mutex would only need to be held for the duration of the allocation. With a pure function that does some allocation and then a bunch of computation, all the computations could proceed in parallel once each thread's memory had been allocated. Synchronized allocations certainly need not negate the benefits of parallelism. Thanks, Nathan Reed
Sep 19 2007
Nathan Reed wrote:Bill Baxter wrote:I've heard this claim a couple of times recently (allocation is fast in GC languages). I don't think it's true, though, at least not of D. Maybe it's a distortion of the the claim that *overall*, memory management in a GC language can be faster than explicit management. But that's not because allocations are cheaper. It's because you don't have to keep deleting things one by one as they go out of scope. Instead, a bunch of out-of-scope objects all get reclaimed together during a GC sweep. But the GC sweep takes place when you do an allocation, so in addition to the normal work necessary to do an allocation (finding a contiguous block of free memory that's big enough) you also have to do a full mark-sweep of all reachable memory. I don't see how mark+sweep+allocate can be faster than just allocate. [Rereading that now, though, it doesn't make much sense to me -- if there's a big gain to be had just by not explicitly freeing memory until the next allocation attempt, then regular malloc implementations could do that too, queuing up the free() requests until the next malloc call.] The 'fast allocation' claim may be true of generational GCs or other fancy GCs, but D currently has a mark&sweep type. Sean and other folks more in the know about GC -- please feel free to correct me on any of this. --bbOn the other hand the support for concurrency may consist of little more than a global mutex. That means that the performance benefits of launching multiple threads might not be so great since they'll all just end up in a queued waiting for their turn to malloc. I have no idea what they do in practice though. Do typical malloc systems actually allow simultaneous allocations in multiple threads? --bbOn a multicore system the allocator might conceivably have a heap for each core or something like that. However, even if this is not the case, allocation in a garbage-collected language is very quick and the mutex would only need to be held for the duration of the allocation.
Sep 19 2007
Bill Baxter wrote:The 'fast allocation' claim may be true of generational GCs or other fancy GCs, but D currently has a mark&sweep type.Yes, AFAIK the 'fast allocation' is mostly true of garbage collectors that move objects that aren't collected, since they have all free memory in a contiguous block and can just increment a pointer. However, I can see /deallocation/ being faster with a GC since the cost of explicit deallocation isn't just limited to the free/delete call; it also often incurs the cost of tracking *when* to call them (which often means using things like reference counts, which in a multithreaded app need a mutex...). Also, a GC gets to postpone memory deallocation until it's actually needed, which can sometimes mean they don't have to do it at all (if your program doesn't allocate all that much). Note: I'm not saying using a GC is necessarily faster (since I don't have the numbers), just that I can see how it *could* be. Of course, there's also the programmer convenience factor :).
Sep 19 2007
Bill Baxter wrote:The 'fast allocation' claim may be true of generational GCs or other fancy GCs, but D currently has a mark&sweep type.I didn't know that; I assumed D was using at least a copying collector. You're right that allocation wouldn't be any faster with a mark-and-sweep collector than in a language with explicit memory management. D /could/ theoretically use a copying collector or generational collector though; as far as I know there's no barrier except writing the code for it. Thanks, Nathan Reed
Sep 19 2007
Nathan Reed wrote:Bill Baxter wrote:The big blocker to any sort of copying/moving collector is that it needs to know which bytes represent pointers and which don't. This would require the compiler to emit extra type information it currently doesn't. In particular: * ClassInfo.offTi[] would need to actually be filled (IIRC it's always an empty array currently) * Similar data is needed for stack frames. AFAIK there's currently no way to communicate this. * Pointer data for structs, delegates, array types, AAs etc. would also need to be provided (but could be "inlined" into the containing class/stackframe information) * Heap objects would need some kind of tag to indicate how to find pointers in them. For classes it would be easy, just get .classinfo.offTi through the vtable ptr, but for example structs and array data can also be heap-allocated and don't have a vtable ptr. In fact, for arrays it isn't even constant (though a 'repeat x times to visit all allocated data' marker could be used). The problem that requires all that is the fact that when you move a heap cell, you need to update all (reachable) pointers/references to it (but not non-pointer data that happens to *look* like a pointer). So the GC can only do this when it is sure it can update all pointers to it, and *only* pointers to it. A GC could of course refrain from moving objects if it's not sure everything that looks like a pointer to it actually is one, but that would negate at least some of the benefit of a moving collector. (An interesting idea might be to try and use debugging information as the source of type information. However, even though that might work for stack data it could be harder for heap data since all you've got to go on is the declared types of the pointers to it; and void*s could point to anything. Besides, requiring debug info for the GC to work well is a bit of a hack) Another issue is that sometimes the compiler simply doesn't *know* whether something is a pointer or not. Consider: * You can allocate void[]s, which explicitly *may* contain pointers but don't necessarily consist of only pointers. * Unions of pointer and non-pointer types * Code that's linked in could be compiled by another compiler entirely (for instance, a library written in C). Currently using these just requires that you keep a pointer where the GC can find it, but if you need to find all pointers to heap objects you'd need non-D code to declare where exactly they keep their pointers... (Assuming it can't be inferred from sources such as debug info)The 'fast allocation' claim may be true of generational GCs or other fancy GCs, but D currently has a mark&sweep type.I didn't know that; I assumed D was using at least a copying collector. You're right that allocation wouldn't be any faster with a mark-and-sweep collector than in a language with explicit memory management. D /could/ theoretically use a copying collector or generational collector though; as far as I know there's no barrier except writing the code for it.
Sep 19 2007
Janice Caron wrote:Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } }Actually, in spite of all the "Not quite sure, probably not." replies you had so far, I want to state boldly: Yes, sure, why not? :-) Scared by memory allocation changing the state of the GC? You may give a damn: When the function leaves, the dynamic array is dead, and thus not part of the state any more. Functional purity has only these conditions: * The result is a function of the parameters, and only of the parameters. * Variables can be created and duplicated, but not changed. I don't see your example violating these conditions. Regards, frank
Sep 18 2007
On 9/18/07, 0ffh <spam frankhirsch.net> wrote:Janice Caron wrote:You don't? The way I see it, if the input is 5, then output might sometimes be 5 and sometimes be 0. That would seem to violate your first bullet-point, would it not?Is this function pure? int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } }Functional purity has only these conditions: * The result is a function of the parameters, and only of the parameters. * Variables can be created and duplicated, but not changed. I don't see your example violating these conditions.
Sep 18 2007
Janice Caron wrote:You don't? The way I see it, if the input is 5, then output might sometimes be 5 and sometimes be 0. That would seem to violate your first bullet-point, would it not?That's surely a point which can be discussed: I could, for example, say: A result of 0 happens only in case of an error condition. Sure, you loose functional purity then, but there's just nothing you can do about that. (Actually, in *this* special case you could do something about it, but not in the general case.) Your function could return 23, because a lump of radiation hit the stack memory and flipped two bits. That's just to bad, but that's it. Therefore: In case of errors, there's an exception! (And, in this case, a double entendre. :-) Regards, frank
Sep 18 2007
On 9/18/07, 0ffh <spam frankhirsch.net> wrote:Therefore: In case of errors, there's an exception! (And, in this case, a double entendre. :-)I don't think it's that simple. Exceptions should be allowed as flow control structures even in pure code, and don't necessarily represent an error. For example, a (pure) divide function might legitimately throw a divide-by-zero exception, and then a (also pure) fixed-point saturating divide function might legitimately call the first divide function, catch the exception, and return the saturation value. None of that violates purity. I don't think you can ban try/catch in pure code, nor assume that exception==error. Of course, only Walter can tell us the /real/ (as in, will-be-implemented) answer!
Sep 18 2007
Janice Caron wrote:Is this function pure?Yes.int probablyNotPure(int x) { try { char[] a = new char[x]; return a.length; } catch { return 0; } }A valid optimisation is to assume that x -> a.length, since a.length is directly derived from x. If you don't use (e.g. reference) the memory you are allocating, the compiler should be free to optimise the allocation away. The catch will be dead code, but the function is also valid, with the catch being dead code, since that would happen with infinite storage or a GC deleting unreachable reference (like D has :-) ).I must assume not, since it's not deterministic. But which part of it wrong? Should "new" be forbidden? Or is "try/catch"? Or both?It can be optimised into a deterministic function without any algorithmic side effect. So that function actually IS deterministic and thus pure :-) Best Regards Ingo Oeser
Sep 18 2007
On 9/18/07, Ingo Oeser <ioe-news rameria.de> wrote:A valid optimisation is to assume that x -> a.length, since a.length is directly derived from x. If you don't use (e.g. reference) the memory you are allocating, the compiler should be free to optimise the allocation away.OK, so change it to char[] probablyNotPure(int x) { try { char[] a = new char[x]; return a; } catch { return null; } } and consider that the input might be 0x7FFFFFFF. I think it looks less easy to optimise now.
Sep 18 2007
Janice Caron wrote:OK, so change it to char[] probablyNotPure(int x) { try { char[] a = new char[x]; return a; } catch { return null; } } and consider that the input might be 0x7FFFFFFF. I think it looks less easy to optimise now.Ok, now that's impure, since now you are using the memory by returning it. This is really not side effect free. You are draining a global resource, which is a no-no for pure functions. All the discussions and points raised by the other people now apply. Best Regards Ingo Oeser
Sep 18 2007
Ingo Oeser wrote:Janice Caron wrote:I am afraid I must disagree. We're talking functional purity in a programming context, not maths. It is perfectly allowable in a purely functional language to return a pointer to a heap allocated data structure. Regards, frankOK, so change it to char[] probablyNotPure(int x) [...] I think it looks less easy to optimise now.Ok, now that's impure, since now you are using the memory by returning it. This is really not side effect free. You are draining a global resource, which is a no-no for pure functions. All the discussions and points raised by the other people now apply.
Sep 18 2007
0ffh wrote:I am afraid I must disagree. We're talking functional purity in a programming context, not maths. It is perfectly allowable in a purely functional language to return a pointer to a heap allocated data structure.Sorry, mixed that up with const functions. Those (const functions) only depend on their parameters and have no effect besides their return value. At least GCC defines them this way. Sorry again. Best Regards Ingo Oeser
Sep 18 2007
Ingo Oeser wrote:Ok, now that's impure, since now you are using the memory by returning it. This is really not side effect free. You are draining a global resource, which is a no-no for pure functions. All the discussions and points raised by the other people now apply.I think if you take this point of view, you have to conclude that even a function that allocates memory that becomes garbage by the end of the function is strictly not side-effect-free, since now some other code that is executing simultaneously with our "pure" function could run out of memory, where it wouldn't have otherwise. In other words, memory allocation in a pure function would have to be totally disallowed, drastically reducing the usefulness of pure functions. In fact, following the same line of reasoning, the space consumed by the *stack frame* of a pure function would make it strictly not side-effect-free. Memory allocation shouldn't be an impediment to making a function pure. If we actually run out of memory, we've got bigger problems than breaking the purity of the functions. Thanks, Nathan Reed
Sep 18 2007