digitalmars.D - Idea: partially pure functions
- Bruno Medeiros (57/57) Apr 29 2008 I want to conceptualize an idea that has been briefly mentioned in the
- Janice Caron (6/8) Apr 29 2008 Not that this in any way changes your argument, but I take it you do
- Robert Fraser (4/14) Apr 30 2008 I don't think that was the point of the post... but I never knew that.
- Janice Caron (7/10) Apr 30 2008 Your wish is my command:
- sambeau (3/3) Apr 30 2008 Phil Wadler's ideas of "Blame" might be of interest to you
- Don (14/28) Apr 30 2008 The first case to consider is nested functions.
- Bruno Medeiros (18/40) Apr 30 2008 Nested functions are a more complicated case than regular functions, as
- Don (12/53) Apr 30 2008 I don't think so, in practice. Nested functions are a lot easier to
- Bruno Medeiros (19/57) May 01 2008 I state that if you have 'pure' implemented (in a sensible manner), it
- Steven Schveighoffer (9/24) May 01 2008 What if localVar is a pointer to global data, or transitively contains a...
- Bruno Medeiros (10/42) May 01 2008 Whatever checks you have to make for localVar, you still have to do them...
- Steven Schveighoffer (29/68) May 01 2008 From the digitalmars web site:
- Bruno Medeiros (12/33) May 01 2008 Huh? You're mixing things up.
- Steven Schveighoffer (15/45) May 01 2008 OK, I think I see what the barrier here is. Walter and Andrei's version...
- Steven Schveighoffer (11/18) May 01 2008 I re-read this and I see what you are proposing.
- Steven Schveighoffer (24/38) Apr 30 2008 I really like the idea of 'partially pure'. It basically allows you to
- Bruno Medeiros (11/23) May 01 2008 The compiler can only infer partial purity if it has access to the
- Steven Schveighoffer (18/38) May 01 2008 I meant an *extra* keyword. If you mark a function as pure, it can be
- Bruno Medeiros (21/65) May 01 2008 That decision of "whether he is expecting a function to be partially
- Steven Schveighoffer (23/81) May 01 2008 It's not possible to call a partially pure function with mutable argumen...
- Bruno Medeiros (9/28) May 02 2008 It's not stated in the pdf "that pure functions can call new
- mort (6/6) Jul 31 2008 I'm not an expert on D, but it looks like a pretty nice language. While...
- Simen Kjaeraas (7/9) Aug 01 2008 I would believe the following to work:
- Jason House (19/93) May 01 2008 Rather than thinking of functions than can mix with pure functions as pa...
- Bruno Medeiros (8/105) May 02 2008 Why is this last case illegal? It's fine for a pure function to return a...
- Jason House (3/29) May 02 2008 You're right. True purity requires the class member functions to be mar...
- Bruno Medeiros (7/10) May 03 2008 Says who? If the constructor of C is pure/partial-pure, the calling of
- Jason House (4/13) May 03 2008 Use of "new" causes side effects. Maybe to a part of code that most
- Sean Reque (8/21) Aug 01 2008 I agree with you completely, and was planning on making an similar post ...
I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example: pure char[] mutable_tolower(char[] str) { // in-place tolower here as str is mutable... } But what's the use of partially pure functions then? Well, what happens is that, since the set of possible mutable data of a partial function is "finite", and restricted to only the function's parameters, one can safely allow the calling of partially pure functions inside fully pure functions (and also other partially pure functions). How come? Well, a fully pure function is allowed to mutate local state. Thus, it can use that mutable local sate as arguments to a partially pure function, since that partially pure function will only mutable those arguments, and nothing else. The contract for a full pure function is maintained. Example: consider a pure function that takes two strings as arguments, concatenates them, tolower's the first half, and toupper's the second half. How does one write that function with the usual rules for pure functions? Something like: alias char[] mstring; char[] xpto(string a, string b) { mstring result = a ~ b; auto halflen = result.length/2; return (tolower(result[0..halflen]) ~ toupper(result[halflen..$])); } But notice that this version is inefficient. About 3 unnecessary temporary allocations are performed. How could this be prevented? One solution would be to call mutable versions of tolower and toupper... but since they are not pure functions, they cannot be called under the normal rules. But if one allows the partially pure function rules, it becomes possible. One then could write the previous function as this: char[] xpto2(string a, string b) { mstring result = a ~ b; auto halflen = result.length/2; mutable_tolower(result[0..halflen]); mutable_toupper(result[halflen..$]); return result; } Now, I know that xpto could be rewritten so as to be as efficient as xpto2 with the current rules, but the code wouldn't look nearly as nice. And that is precisely the point: You would have to code tolower inside of xpto, and this is just a trivial example, imagine with more complicated functions... The partially pure functions allow more expressiveness and efficiency without any kind of compromise. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 29 2008
2008/4/30 Bruno Medeiros <brunodomedeiros+spam com.gmail>:Example: consider a pure function that takes two strings as arguments, concatenates them, tolower's the first half, and toupper's the second half.Not that this in any way changes your argument, but I take it you do realise that toupper and tolower cannot be done in place, because the UTF-8 sequences representing dchar c might be a different length from the UTF-8 sequences representing toupper(c) or tolower(c). This is a common misconception popular among people who only use ASCII. :-)
Apr 29 2008
Janice Caron wrote:2008/4/30 Bruno Medeiros <brunodomedeiros+spam com.gmail>:I don't think that was the point of the post... but I never knew that. Can you give me an example of a character where the number of bytes in that character's representation changes when its case is changed?Example: consider a pure function that takes two strings as arguments, concatenates them, tolower's the first half, and toupper's the second half.Not that this in any way changes your argument, but I take it you do realise that toupper and tolower cannot be done in place, because the UTF-8 sequences representing dchar c might be a different length from the UTF-8 sequences representing toupper(c) or tolower(c). This is a common misconception popular among people who only use ASCII. :-)
Apr 30 2008
2008/4/30 Robert Fraser <fraserofthenight gmail.com>:Can you give me an example of a character where the number of bytes in that character's representation changes when its case is changed?Your wish is my command: Character '\u023A' is LATIN CAPITAL LETTER A WITH STROKE In UTF-8: [ 0xC8, 0xBA ] Character '\u2C65' is LATIN SMALL LETTER A WITH STROKE In UTF-8: [ 0xE2, 0xB1, 0xA5 ] So it gets longer when you lowercase it, shorter when you uppercase it.
Apr 30 2008
Phil Wadler's ideas of "Blame" might be of interest to you (essentially it's about mixing dynamic typing with static typing, but I think it is relevent to pure/impure thoughts, too) http://video.google.com/videoplay?docid=-4167170843018186532
Apr 30 2008
Bruno Medeiros wrote:I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example:The first case to consider is nested functions. pure int func(int x) { int var=0; int localfunc(int y) { ++var; return x*var; } int localfunc2(int y) { return y; } return localfunc(x)*localfunc(x+1)+localfunc2(x); } localfunc() is definitely not pure. localfunc2() is not marked as pure. But I think you can make a pretty decent argument that func() is pure. But Walter's already said that pure functions will start out very restricted, and the rules will be relaxed over time. So it's not really worth worrying about the rules right now.
Apr 30 2008
Don wrote:Bruno Medeiros wrote:Nested functions are a more complicated case than regular functions, as they have other "inputs" other than the parameters: the outer variables. So I think that if you implement partial pure semantics for nested functions, then you have it working for normal functions as well.I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example:The first case to consider is nested functions.But Walter's already said that pure functions will start out very restricted, and the rules will be relaxed over time. So it's not really worth worrying about the rules right now.True, if we're thinking about compiler implementation only. But in terms of design, this might not be just an improvement, it could a crucial functionality. For example, consider this other example: Suppose you have a pure function with a local object and you want to mutate that object using a mutable method. If you are not able to call the mutable method in the pure function, you might not even be able to describe the method changes inside the pure function, because of Object encapsulation (example: changing a private member). So partial pure semantics might be necessary if one wants pure functions to be able to work with objects in the general sense. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 30 2008
Bruno Medeiros wrote:Don wrote:I don't think so, in practice. Nested functions are a lot easier to implement, since while compiling the outer function, the compiler always has access to the complete code of the inner function and can confirm that the outer function remains pure. It's impossible to change the nested function in a way that accidentally violates purity of the outer function -- it just won't compile. But a regular function could appear in a completely different library, so the contextually purity needs to be enforced, and this would need a language change.Bruno Medeiros wrote:Nested functions are a more complicated case than regular functions, as they have other "inputs" other than the parameters: the outer variables. So I think that if you implement partial pure semantics for nested functions, then you have it working for normal functions as well.I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example:The first case to consider is nested functions.Yes, I don't see how objects can be used in pure functions without some form of 'contextually pure' functions.But Walter's already said that pure functions will start out very restricted, and the rules will be relaxed over time. So it's not really worth worrying about the rules right now.True, if we're thinking about compiler implementation only. But in terms of design, this might not be just an improvement, it could a crucial functionality. For example, consider this other example: Suppose you have a pure function with a local object and you want to mutate that object using a mutable method. If you are not able to call the mutable method in the pure function, you might not even be able to describe the method changes inside the pure function, because of Object encapsulation (example: changing a private member). So partial pure semantics might be necessary if one wants pure functions to be able to work with objects in the general sense.
Apr 30 2008
Don wrote:Bruno Medeiros wrote:I state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure. The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant) The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this: Foo globalVar; pure void func(...) { localVar = globalVar + 1; // not allowed then just apply this same check on *any* expression, including function call expressions: partiallyPureFunction(localVar, globalVar); // not allowed partiallyPureFunction(localVar, localVar2); // but this is allowed -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#DDon wrote:I don't think so, in practice. Nested functions are a lot easier to implement, since while compiling the outer function, the compiler always has access to the complete code of the inner function and can confirm that the outer function remains pure. It's impossible to change the nested function in a way that accidentally violates purity of the outer function -- it just won't compile. But a regular function could appear in a completely different library, so the contextually purity needs to be enforced, and this would need a language change.Bruno Medeiros wrote:Nested functions are a more complicated case than regular functions, as they have other "inputs" other than the parameters: the outer variables. So I think that if you implement partial pure semantics for nested functions, then you have it working for normal functions as well.I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example:The first case to consider is nested functions.
May 01 2008
"Bruno Medeiros" wroteI state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure. The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant) The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this: Foo globalVar; pure void func(...) { localVar = globalVar + 1; // not allowed then just apply this same check on *any* expression, including function call expressions: partiallyPureFunction(localVar, globalVar); // not allowed partiallyPureFunction(localVar, localVar2); // but this is allowedWhat if localVar is a pointer to global data, or transitively contains a pointer to global data? There is no way to know whether localVar is a pointer to global data or 'contained' heap data without lots of context checking. If partially pure functions with mutable parameters were allowed, I'd say it's a requirement that you can only call them from partially or fully pure functions, as local pointers from those functions are guaranteed to be contained within the function. -Steve
May 01 2008
Steven Schveighoffer wrote:"Bruno Medeiros" wroteWhatever checks you have to make for localVar, you still have to do them with normal pure. (And not that it matters to the point, how do you initialize a local var with global data?)I state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure. The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant) The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this: Foo globalVar; pure void func(...) { localVar = globalVar + 1; // not allowed then just apply this same check on *any* expression, including function call expressions: partiallyPureFunction(localVar, globalVar); // not allowed partiallyPureFunction(localVar, localVar2); // but this is allowedWhat if localVar is a pointer to global data, or transitively contains a pointer to global data? There is no way to know whether localVar is a pointer to global data or 'contained' heap data without lots of context checking.If partially pure functions with mutable parameters were allowed, I'd say it's a requirement that you can only call them from partially or fully pure functions, as local pointers from those functions are guaranteed to be contained within the function.I'm not sure I understand what you're saying. Are you saying that partially pure functions should not be called from *non-pure* functions? What's the sense in that? -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
May 01 2008
"Bruno Medeiros" wroteSteven Schveighoffer wrote:From the digitalmars web site: "A pure function can throw exceptions and allocate memory via a NewExpression." So in the case of a partially pure function that takes a char array: pure void f(char[] x) {...} Let's say I have a non-pure function g: void g() { char[] v; ... f(v); } In the ..., v can be set to local mutable heap data (via new char[x] or whatever), or it could be set to point to a global mutable string. How can the compiler be sure without severe context analysis? The point is, in order to ensure context-free lexical analysis by the compiler, it must disallow calling f from a non-pure function. Contrast that from a pure version of g: pure void g() { char[] v; ... f(v); } Because g is now pure, it is a requirement that v cannot be assigned to some global variable. The fact that g is pure ensures that f can safely be called. v can only be set by calling a new expression. -Steve"Bruno Medeiros" wroteWhatever checks you have to make for localVar, you still have to do them with normal pure. (And not that it matters to the point, how do you initialize a local var with global data?)I state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure. The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant) The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this: Foo globalVar; pure void func(...) { localVar = globalVar + 1; // not allowed then just apply this same check on *any* expression, including function call expressions: partiallyPureFunction(localVar, globalVar); // not allowed partiallyPureFunction(localVar, localVar2); // but this is allowedWhat if localVar is a pointer to global data, or transitively contains a pointer to global data? There is no way to know whether localVar is a pointer to global data or 'contained' heap data without lots of context checking.If partially pure functions with mutable parameters were allowed, I'd say it's a requirement that you can only call them from partially or fully pure functions, as local pointers from those functions are guaranteed to be contained within the function.I'm not sure I understand what you're saying. Are you saying that partially pure functions should not be called from *non-pure* functions? What's the sense in that?
May 01 2008
Steven Schveighoffer wrote:So in the case of a partially pure function that takes a char array: pure void f(char[] x) {...} Let's say I have a non-pure function g: void g() { char[] v; ... f(v); } In the ..., v can be set to local mutable heap data (via new char[x] or whatever), or it could be set to point to a global mutable string. How can the compiler be sure without severe context analysis? The point is, in order to ensure context-free lexical analysis by the compiler, it must disallow calling f from a non-pure function.Huh? You're mixing things up. Yes, if g() is not pure, then the compiler will not check if v points to global data or not. Thus we cannot guarantee that calling "f(v)" will not change global state. But so what? That doesn't mean we shouldn't allow it. It just means such call may have side-effects, and the compiler should threat that as a normal function call (make no particular pure optimizations). Thus "partially" pure: pure in some contexts, impure in other contexts. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
May 01 2008
"Bruno Medeiros" wroteSteven Schveighoffer wrote:OK, I think I see what the barrier here is. Walter and Andrei's version of pure is supposed to be for solving multithreading issues. There is an attribute of D pure functions that is inherently true in functional programming languages. Not only does a pure function have no side effects, but nothing outside the pure function can affect its execution. If v points to global data, and some other thread changes that data mid-way through the function execution, the function might crash or throw an exception because it is expecting to be the only one that can change it's own data. With that in mind, partially pure functions that have mutable parameters cannot have that attribute unless it is guaranteed that those mutable parameters won't be changed by anything outside the partially pure function. The only way to guarantee that is to only allow calling that function from another pure or partially pure function. -SteveSo in the case of a partially pure function that takes a char array: pure void f(char[] x) {...} Let's say I have a non-pure function g: void g() { char[] v; ... f(v); } In the ..., v can be set to local mutable heap data (via new char[x] or whatever), or it could be set to point to a global mutable string. How can the compiler be sure without severe context analysis? The point is, in order to ensure context-free lexical analysis by the compiler, it must disallow calling f from a non-pure function.Huh? You're mixing things up. Yes, if g() is not pure, then the compiler will not check if v points to global data or not. Thus we cannot guarantee that calling "f(v)" will not change global state. But so what? That doesn't mean we shouldn't allow it. It just means such call may have side-effects, and the compiler should threat that as a normal function call (make no particular pure optimizations). Thus "partially" pure: pure in some contexts, impure in other contexts.
May 01 2008
"Bruno Medeiros" wroteYes, if g() is not pure, then the compiler will not check if v points to global data or not. Thus we cannot guarantee that calling "f(v)" will not change global state. But so what? That doesn't mean we shouldn't allow it. It just means such call may have side-effects, and the compiler should threat that as a normal function call (make no particular pure optimizations). Thus "partially" pure: pure in some contexts, impure in other contexts.I re-read this and I see what you are proposing. I can see how that would work, but you would definitely need another keyword. Any time a developer sees the pure keyword, he will assume he doesn't need to worry about thread issues. However, a partially pure function would be subject to thread issues when not called from a pure function. I still think it would be a worthwhile feature to have, as it would save on implementing lots of functionality more than once. BTW, forget my other post, we were arguing about totally different things. -Steve
May 01 2008
"Bruno Medeiros" wroteDon wrote:I really like the idea of 'partially pure'. It basically allows you to build functions from modular pieces, which is key to any successful project. To have to implement every piece of common functions every time you write one seems very error-prone to me. However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types. As for your idea, to be complete, I'd say there are different levels of partially pure functions. level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized. level 2: mutable parameters. These functions cannot be re-ordered, but can be used inside pure functions. They cannot be called from non-pure functions. These also cannot be memoized. There may be more levels, but I think that's at least a start. An example of a level 1 partially pure function is new. A level 2 function may need a new keyword, as the contract is different than a level 1 or pure function in that it cannot be called from a non-pure function. Level 1 functions are no different contract-wise than fully pure functions, just the optimization is different (cannot memoize). -SteveBut Walter's already said that pure functions will start out very restricted, and the rules will be relaxed over time. So it's not really worth worrying about the rules right now.True, if we're thinking about compiler implementation only. But in terms of design, this might not be just an improvement, it could a crucial functionality. For example, consider this other example: Suppose you have a pure function with a local object and you want to mutate that object using a mutable method. If you are not able to call the mutable method in the pure function, you might not even be able to describe the method changes inside the pure function, because of Object encapsulation (example: changing a private member). So partial pure semantics might be necessary if one wants pure functions to be able to work with objects in the general sense.
Apr 30 2008
Steven Schveighoffer wrote:However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types.The compiler can only infer partial purity if it has access to the function body (just the same as it can infer normal purity). But if you only have a function signature, you need a keyword.As for your idea, to be complete, I'd say there are different levels of partially pure functions. level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized.This is already a regular pure function. The idea that a pure function has to return an invariant is a misconception, it can safely return mutable data. It can me memoized just the same (although a copy would have to be made if the result is not invariant) -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
May 01 2008
"Bruno Medeiros" wroteSteven Schveighoffer wrote:I meant an *extra* keyword. If you mark a function as pure, it can be partially pure or fully pure depending on the arguments. However, having an extra keyword lets the developer decide whether he is expecting a function to be partially pure or not, so there are no surprises :) I'd vote for a new keyword, but this does not need to happen right away.However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types.The compiler can only infer partial purity if it has access to the function body (just the same as it can infer normal purity). But if you only have a function signature, you need a keyword.I would hope that the compiler would not memoize if I returned a mutable piece of data, as I would generally expect to use this function as a way to initialize some data that I want to build with (in a pure function or otherwise). If I wanted it to memoize, I can return invariant. For example, to memoize on 'new' would be a waste, but 'new' can be considered a pure (arguably partially pure) function. Sometimes memoization is not desirable, especially if you are rarely calling the pure function with the same arguments. I guess it all depends on what you consider partially pure :) I believe the current rules as stated on D's website require invariant return data, or data that can be implicitly cast to invariant. -SteveAs for your idea, to be complete, I'd say there are different levels of partially pure functions. level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized.This is already a regular pure function. The idea that a pure function has to return an invariant is a misconception, it can safely return mutable data. It can me memoized just the same (although a copy would have to be made if the result is not invariant)
May 01 2008
Steven Schveighoffer wrote:"Bruno Medeiros" wroteThat decision of "whether he is expecting a function to be partially pure or " pure is not a useful decision to have to make, it's redundant (and thus cumbersome) to ask the programmer to specify that. Also, having two different keywords would impossibilitate or difficult templating of constness (like with the "scoped const" proposal). Just think of 'pure' as meaning: this function does not depend on, or modify, external state. (external state being any data other than the parameters)Steven Schveighoffer wrote:I meant an *extra* keyword. If you mark a function as pure, it can be partially pure or fully pure depending on the arguments. However, having an extra keyword lets the developer decide whether he is expecting a function to be partially pure or not, so there are no surprises :) I'd vote for a new keyword, but this does not need to happen right away.However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types.The compiler can only infer partial purity if it has access to the function body (just the same as it can infer normal purity). But if you only have a function signature, you need a keyword.I said the compiler *could* memoize the result just the same. Doesn't mean it *should* memoize if the result is mutable. As a matter of fact, even if the result is invariant, doesn't mean the compiler should memoize it. Memoization is a complicated optimization that likely only the user knows when it should be applied, not the compiler.I would hope that the compiler would not memoize if I returned a mutable piece of data, as I would generally expect to use this function as a way to initialize some data that I want to build with (in a pure function or otherwise). If I wanted it to memoize, I can return invariant. For example, to memoize on 'new' would be a waste, but 'new' can be considered a pure (arguably partially pure) function. Sometimes memoization is not desirable, especially if you are rarely calling the pure function with the same arguments.As for your idea, to be complete, I'd say there are different levels of partially pure functions. level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized.This is already a regular pure function. The idea that a pure function has to return an invariant is a misconception, it can safely return mutable data. It can me memoized just the same (although a copy would have to be made if the result is not invariant)I guess it all depends on what you consider partially pure :) I believe the current rules as stated on D's website require invariant return data, or data that can be implicitly cast to invariant. -SteveIn Andrei's latest presentation (http://www.digitalmars.com/d/2.0/accu-functional.pdf) when he lists the requirements for pure, it does not say that is should return invariant data. (and I'm assuming those requirements are complete) -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
May 01 2008
"Bruno Medeiros" wroteSteven Schveighoffer wrote:It's not possible to call a partially pure function with mutable arguments from a non-pure function. Doing so would be a nightmare for the compiler. So from a maintainability perspective, changing a function from fully pure to partially pure changes the contract that the function provides. However, changing from fully pure to partially pure involves changing an argument type, so in essence, code that used the fully pure function would break anyways... Having an extra keyword for 'partially pure' is probably not necessary, so I agree with you now."Bruno Medeiros" wroteThat decision of "whether he is expecting a function to be partially pure or " pure is not a useful decision to have to make, it's redundant (and thus cumbersome) to ask the programmer to specify that.Steven Schveighoffer wrote:I meant an *extra* keyword. If you mark a function as pure, it can be partially pure or fully pure depending on the arguments. However, having an extra keyword lets the developer decide whether he is expecting a function to be partially pure or not, so there are no surprises :) I'd vote for a new keyword, but this does not need to happen right away.However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types.The compiler can only infer partial purity if it has access to the function body (just the same as it can infer normal purity). But if you only have a function signature, you need a keyword.Also, having two different keywords would impossibilitate or difficult templating of constness (like with the "scoped const" proposal). Just think of 'pure' as meaning: this function does not depend on, or modify, external state. (external state being any data other than the parameters)For D, there is also the notion that a pure function will not be affected by it's parameters changing mid-way through the function.I suspect the goal of Walter is to not have the developer make the decision to memoize... Similar to how we do not have the ability to force inlining, but we have the ability to force NOT inlining (by making a function opaque). I believe making a pure function return mutable heap data should be a signal that the compiler does not memoize the result.I said the compiler *could* memoize the result just the same. Doesn't mean it *should* memoize if the result is mutable. As a matter of fact, even if the result is invariant, doesn't mean the compiler should memoize it. Memoization is a complicated optimization that likely only the user knows when it should be applied, not the compiler.I would hope that the compiler would not memoize if I returned a mutable piece of data, as I would generally expect to use this function as a way to initialize some data that I want to build with (in a pure function or otherwise). If I wanted it to memoize, I can return invariant. For example, to memoize on 'new' would be a waste, but 'new' can be considered a pure (arguably partially pure) function. Sometimes memoization is not desirable, especially if you are rarely calling the pure function with the same arguments.As for your idea, to be complete, I'd say there are different levels of partially pure functions. level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized.This is already a regular pure function. The idea that a pure function has to return an invariant is a misconception, it can safely return mutable data. It can me memoized just the same (although a copy would have to be made if the result is not invariant)Those requirements are not complete. For instance, the web site says that pure functions can call new expressions, but that is not stated in the pdf (or at least, I don't remember seeing it). I think it will eventually be stated that pure functions initially will have to return invariant data, or data that can be implicitly cast to invariant. Unless I can convince Andrei otherwise :) -SteveI guess it all depends on what you consider partially pure :) I believe the current rules as stated on D's website require invariant return data, or data that can be implicitly cast to invariant.In Andrei's latest presentation (http://www.digitalmars.com/d/2.0/accu-functional.pdf) when he lists the requirements for pure, it does not say that is should return invariant data. (and I'm assuming those requirements are complete)
May 01 2008
Steven Schveighoffer wrote:"Bruno Medeiros" wroteIt's not stated in the pdf "that pure functions can call new expressions" because it only states what pure functions *cannot* do. What they can do is everything else. So new expressions are allowed (likely following the same rules as function calling, ie, only pure constructors are allowed). -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#DThose requirements are not complete. For instance, the web site says that pure functions can call new expressions, but that is not stated in the pdf (or at least, I don't remember seeing it). I think it will eventually be stated that pure functions initially will have to return invariant data, or data that can be implicitly cast to invariant. Unless I can convince Andrei otherwise :) -SteveI guess it all depends on what you consider partially pure :) I believe the current rules as stated on D's website require invariant return data, or data that can be implicitly cast to invariant.In Andrei's latest presentation (http://www.digitalmars.com/d/2.0/accu-functional.pdf) when he lists the requirements for pure, it does not say that is should return invariant data. (and I'm assuming those requirements are complete)
May 02 2008
I'm not an expert on D, but it looks like a pretty nice language. While reading about 'pure' recently, however, I found what appeared to be pretty major usability problems with the idea as it was presented: - Pure functions can have local mutable state, but aren't allowed to factor the logic for manipulating that state out into functions (since such a function would be non-pure). - Pure functions can't be applied to mutable data. So you potentially need a mutable non-pure version of each pure function. An elegant solution to both of these problems would seem to be the one proposed in this thread - get rid of the requirement that parameters be invariant. The first problem above is solved by pure functions with mutable parameters, and the second by pure functions with const parameters. Special optimizations that require purity and parameter invariance can always be done by creating a specialized version of each pure function that's invoked when the arguments are known to be invariant. Let the compiler handle the duplication - don't make the programmer copy/paste. Thanks
Jul 31 2008
On Fri, 01 Aug 2008 03:46:30 +0200, mort <mortm gmail.com> wrote:- Pure functions can't be applied to mutable data. So you potentially need a mutable non-pure version of each pure function.I would believe the following to work: pure SomeStruct foo(invariant SomeStruct f); foreach (ref s; SomeStructArray) s = foo(cast(invariant)s); -- Simen
Aug 01 2008
Bruno Medeiros Wrote:I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example: pure char[] mutable_tolower(char[] str) { // in-place tolower here as str is mutable... } But what's the use of partially pure functions then? Well, what happens is that, since the set of possible mutable data of a partial function is "finite", and restricted to only the function's parameters, one can safely allow the calling of partially pure functions inside fully pure functions (and also other partially pure functions). How come? Well, a fully pure function is allowed to mutate local state. Thus, it can use that mutable local sate as arguments to a partially pure function, since that partially pure function will only mutable those arguments, and nothing else. The contract for a full pure function is maintained. Example: consider a pure function that takes two strings as arguments, concatenates them, tolower's the first half, and toupper's the second half. How does one write that function with the usual rules for pure functions? Something like: alias char[] mstring; char[] xpto(string a, string b) { mstring result = a ~ b; auto halflen = result.length/2; return (tolower(result[0..halflen]) ~ toupper(result[halflen..$])); } But notice that this version is inefficient. About 3 unnecessary temporary allocations are performed. How could this be prevented? One solution would be to call mutable versions of tolower and toupper... but since they are not pure functions, they cannot be called under the normal rules. But if one allows the partially pure function rules, it becomes possible. One then could write the previous function as this: char[] xpto2(string a, string b) { mstring result = a ~ b; auto halflen = result.length/2; mutable_tolower(result[0..halflen]); mutable_toupper(result[halflen..$]); return result; } Now, I know that xpto could be rewritten so as to be as efficient as xpto2 with the current rules, but the code wouldn't look nearly as nice. And that is precisely the point: You would have to code tolower inside of xpto, and this is just a trivial example, imagine with more complicated functions... The partially pure functions allow more expressiveness and efficiency without any kind of compromise. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#DRather than thinking of functions than can mix with pure functions as partially pure, I'd prefer to think of them as having no side effect. For arguments sake, I'll just use the keyword noside (analogous to nothrow). class A{} class C{} void f(); // impure void f() noside; // pure void f(A c) noside; // partially pure void f(invariant(A) c) noside; // pure C f() noside; // illegal invariant(C) f() noside; // pure (using global invariant data) class B{ C cc; void f(); // impure void f() noside; // pure void f(A c) noside; // partially pure void f(invariant(A) a) noside; // pure C f() noside; // partially pure invariant(C) f() noside; // pure }
May 01 2008
Jason House wrote:Bruno Medeiros Wrote:Why is this last case illegal? It's fine for a pure function to return a mutable value. (see my other discussion with Steven)I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example: pure char[] mutable_tolower(char[] str) { // in-place tolower here as str is mutable... } But what's the use of partially pure functions then? Well, what happens is that, since the set of possible mutable data of a partial function is "finite", and restricted to only the function's parameters, one can safely allow the calling of partially pure functions inside fully pure functions (and also other partially pure functions). How come? Well, a fully pure function is allowed to mutate local state. Thus, it can use that mutable local sate as arguments to a partially pure function, since that partially pure function will only mutable those arguments, and nothing else. The contract for a full pure function is maintained. Example: consider a pure function that takes two strings as arguments, concatenates them, tolower's the first half, and toupper's the second half. How does one write that function with the usual rules for pure functions? Something like: alias char[] mstring; char[] xpto(string a, string b) { mstring result = a ~ b; auto halflen = result.length/2; return (tolower(result[0..halflen]) ~ toupper(result[halflen..$])); } But notice that this version is inefficient. About 3 unnecessary temporary allocations are performed. How could this be prevented? One solution would be to call mutable versions of tolower and toupper... but since they are not pure functions, they cannot be called under the normal rules. But if one allows the partially pure function rules, it becomes possible. One then could write the previous function as this: char[] xpto2(string a, string b) { mstring result = a ~ b; auto halflen = result.length/2; mutable_tolower(result[0..halflen]); mutable_toupper(result[halflen..$]); return result; } Now, I know that xpto could be rewritten so as to be as efficient as xpto2 with the current rules, but the code wouldn't look nearly as nice. And that is precisely the point: You would have to code tolower inside of xpto, and this is just a trivial example, imagine with more complicated functions... The partially pure functions allow more expressiveness and efficiency without any kind of compromise. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#DRather than thinking of functions than can mix with pure functions as partially pure, I'd prefer to think of them as having no side effect. For arguments sake, I'll just use the keyword noside (analogous to nothrow). class A{} class C{} void f(); // impure void f() noside; // pure void f(A c) noside; // partially pure void f(invariant(A) c) noside; // pure C f() noside; // illegalclass B{ C cc; void f(); // impure void f() noside; // pure void f(A c) noside; // partially pure void f(invariant(A) a) noside; // pure C f() noside; // partially pure invariant(C) f() noside; // pure }All the examples here that you mention as pure, are not pure but in fact partial pure, because the methods have a mutable this parameter. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
May 02 2008
Bruno Medeiros Wrote:Jason House wrote:That's a trickier case. I should have explained it. How does the function get a mutable C to return? It can't use mutable global data and it can't allocate a new C. Because of that, it becomes impossible to give a body to f that satisfies the contract.Rather than thinking of functions than can mix with pure functions as partially pure, I'd prefer to think of them as having no side effect. For arguments sake, I'll just use the keyword noside (analogous to nothrow). class A{} class C{} void f(); // impure void f() noside; // pure void f(A c) noside; // partially pure void f(invariant(A) c) noside; // pure C f() noside; // illegalWhy is this last case illegal? It's fine for a pure function to return a mutable value. (see my other discussion with Steven)You're right. True purity requires the class member functions to be marked as invariant.class B{ C cc; void f(); // impure void f() noside; // pure void f(A c) noside; // partially pure void f(invariant(A) a) noside; // pure C f() noside; // partially pure invariant(C) f() noside; // pure }All the examples here that you mention as pure, are not pure but in fact partial pure, because the methods have a mutable this parameter.
May 02 2008
Jason House wrote:and it can't allocate a new C.Says who? If the constructor of C is pure/partial-pure, the calling of it should be allowed as well. I don't know however if this will work in the planned initial version of the pure system. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
May 03 2008
Bruno Medeiros wrote:Jason House wrote:Use of "new" causes side effects. Maybe to a part of code that most programmers don't care about, but it's a side effect. This can also be an important side effect when trying to do multithreading.and it can't allocate a new C.Says who? If the constructor of C is pure/partial-pure, the calling of it should be allowed as well. I don't know however if this will work in the planned initial version of the pure system.
May 03 2008
mort Wrote:I'm not an expert on D, but it looks like a pretty nice language. While reading about 'pure' recently, however, I found what appeared to be pretty major usability problems with the idea as it was presented: - Pure functions can have local mutable state, but aren't allowed to factor the logic for manipulating that state out into functions (since such a function would be non-pure). - Pure functions can't be applied to mutable data. So you potentially need a mutable non-pure version of each pure function. An elegant solution to both of these problems would seem to be the one proposed in this thread - get rid of the requirement that parameters be invariant. The first problem above is solved by pure functions with mutable parameters, and the second by pure functions with const parameters. Special optimizations that require purity and parameter invariance can always be done by creating a specialized version of each pure function that's invoked when the arguments are known to be invariant. Let the compiler handle the duplication - don't make the programmer copy/paste. ThanksI agree with you completely, and was planning on making an similar post about the subject. Take one of the string functions, for instance, countchars. It has the following signature: size_t countchars(string s, string pattern); This function etiher is or could be pure. However, the function would work just as well when operating on const (char)[] data, which is potentially mutable, or char[] data, which is explicitly mutable. Being able to have a conditionally pure function, one that was pure if its arguments were invariant and not pure if they were not, would allow this function to work for both mutable and immutable data, and gain all the optimization benefits of a pure function if its parameters were indeed invariant. There are of course occasions where functions might need to act differently based on the invariance of its parameters. string.split, for instance, would probably want to copy the data it returns if its parameters were not invariant. However, I think differences like these could easily be handled with D's compile-time reflection abilities. Lastly, I never thought about the case where a pure function would need to call another function to operate on explicitly mutable but temporary local data. It would be nice to have that handled as well! It seems like the more you dig in to pure functions, the more you realize that the compiler needs to be able to be very smart to make the concept usable for programmers. As it stands now, it seems difficult for pure functions to interact with the rest of D.
Aug 01 2008