D - Closures in D
- la7y6nvo shamko.com (196/196) Nov 07 2001 I'd like to talk about the idea of having closures in D. For those
- Axel Kittenberger (10/10) Nov 07 2001 Quite a lot of text, and honestly it only confused me.
- la7y6nvo shamko.com (47/56) Nov 07 2001 Here's an example of where a closure might be useful:
- Roberto Mariottini (7/43) Nov 09 2001 cksort(
- la7y6nvo shamko.com (11/15) Nov 09 2001 Nested classes (assuming that's what Java inner classes are) seem
- Roberto Mariottini (8/20) Nov 12 2001 Agreed.
- la7y6nvo shamko.com (27/51) Nov 12 2001 The message doesn't say what either "static inner classes" or
- Axel Kittenberger (8/12) Nov 09 2001 What an insight! Yes thinking about it inner classes are the same thing ...
- Pavel \"EvilOne\" Minayev (6/9) Nov 07 2001 First, I don't understand the need of closures for Collections
- Sean L. Palmer (15/24) Nov 08 2001 Dynamic and associative arrays barely scratch the surface of container
- Pavel \"EvilOne\" Minayev (11/20) Nov 08 2001 for-each loops is something that I was going to offer and that, IMHO,
- Pavel \"EvilOne\" Minayev (8/9) Nov 07 2001 In fact, after reading this one more time, I really can't
- Russell Borogove (26/35) Nov 07 2001 Closures can be used as a form of generic programming, but in
- Pavel \"EvilOne\" Minayev (7/13) Nov 07 2001 Thanks for explaining, now everything seems much clearer.
- la7y6nvo shamko.com (28/72) Nov 07 2001 The Perl example is generating functions dynamically at runtime.
- Patrick Down (6/202) Nov 09 2001 The following is a paper about adding closures to C++. Unfortunately
I'd like to talk about the idea of having closures in D. For those who aren't used to the term closure, a closure is something like a pointer to function, and usually is created by a syntactic construct that appears in source code in the form of an anonymous "code literal". Closures in Smalltalk ===================== I'll start by talking about closures in Smalltalk, which calls them "Blocks". (I'm assuming people can read Smalltalk syntax, which may be a bad assumption, but that's what I'm assuming.) A block in Smalltalk is code that appears between square brackets. An example would be symbolTable at: identifier ifAbsent: [ defaultSymbol ]. The '[ defaultSymbol ]' is a block that is executed only if the 'identifier' key is not present in 'symbolTable'. This example isn't very motivating, because defaultSymbol is just a value. Consider this however symbolTable at: identifier ifAbsent: [ UndeclaredSymbol new ]. The '[ UndeclaredSymbol new ]' block only gets executed if the identifier is not present. This could be useful because making a new UndeclaredSymbol might involve lots of computation. The examples above return values, but blocks can also have side effects. For example, symbolTable at: identifier ifAbsent: [ undeclaredIdentifiers add: identifier. UndeclaredSymbol new ]. Notice that the last expression evaluated in a block is what is returned for the value of the block. (Please, no flames about bracket placement style - I know some people prefer the open bracket to go on a line by itself.) Blocks can also take arguments, for example - symbolTable keysAndValuesDo: [ :identifier :entry | output nextPut: '(', identifier printString, '->', entry printString, ')'. ]. which causes a printable form of a symbol table to be added to the 'output' object. Blocks get much of their power from being able to access local context. In an example above, the block contains references to 'identifier' and 'undeclaredIdentifiers', which might be instance variables or perhaps local variables. This ability puts blocks on an equal footing with the more usual if/then/else and looping control structures. (In fact these "built-in" control structures are themselves implemented using blocks in Smalltalk, but that point is incidental - the main thing is that having access to local context gives an important kind of expressive power.) Incidentally, blocks can declare variables local to the block. (I'm not going to show an example for this.) Blocks also have a capability that might be surprising to people used to C, namely the ability to return (a value) from the method that contains the block. So for example referencesTo: identifier "Answer a collection of all references to identifier" | entry | "this is a local variable" entry := self at: identifier ifAbsent: [ ^ Set new ]. ^ programSource referencesToEntry: entry The ^ in Smalltalk is like 'return' in C. The block '[ ^ Set new ]' will, if executed, immediately return a value from the 'referencesTo:' method, and the following statement will not be executed at all. Here again, the ability to return a value from the enclosing function is one that adds expressive power, and puts blocks on an equal footing with built-in control structures. Blocks in Smalltalk are fairly representative of closures that are available in other programming languages, taking into account that Smalltalk is more of an imperative language than a functional language (in the sense of "functional programming"). Closures in D ============= For the sake of discussion let's assume that closures will be included in D. What are the implications of this? Clearly there are syntax questions (about which more later), and there are questions about what capabilities should be supported. For the moment let's use [ ParameterList ]{ FunctionBody } as the syntax for a closure in D, without being too particular about types or return values, and look at semantic issues. First, the body of a closure is the same as that for a function body, which means closures can declare variables local to the closure activation. Second, the code in the body of the closure should have access to the context of the surrounding code, notably instance variables and local variables. Instance variables aren't a problem (just have the closure keep a pointer to the object in question), but local variables are a little trickier, since a closure can outlive the stack frame of the method that created it. This could be handled by taking any variables used in closure bodies and allocating them in a heap object and having the closure hang on to a pointer to that. Yes, there is a cost associated with doing that allocation, but only for the cases where there are closures that reference local variables. [Note added during editing - People who are used to stack-based local variable frames may be surprised by the idea that local variables might outlive the function activation. Rather than get into that in too much detail here I think a follow-up post is a better idea. Expect that shortly.] Third, recursion - as with ordinary functions, closures need a stack frame for their local variables so recursion can occur. A closure activation is not the same as a closure. Fourth, nested closures. Closure bodies can contain other closure bodies. This brings with it all the usual machinery needed for nested functions, in particular accessing local variables of containing closures (when that happens). As with functions, closures that have local variables accessed by contained closures would allocate those local variables in an object in the heap. Fifth, interchangeability with function pointers. Getting a closure to masquerade as a function pointer is a little tricky (because a function pointer is only one pointer, whereas a closure is two pointers - a pointer to code and a pointer to context that the code runs in). Maybe it's useful to do that, I don't know. But going the other way is easy - turning a function pointer into a closure just means adding a null pointer for execution context, and perhaps a dummy closure body that simply invokes through the function pointer. The more closures and function pointers are interchangeable (as far as the function getting them as arguments is concerned), the more the language follows the "Law of Least Astonishment". Incidentally, closures being interchangeable with function pointers means closures should have all the same parameter passing modes. In particular, if functions have OUT and INOUT parameters, then closures should have them. Sixth, invoking a closure. How about the same syntax as that for invoking through a function pointer? (*closure_parameter)( argument, argument, argument, ... ) Seventh, closure returns. It's important to distinguish between returning from a closure and returning from a function body containing a closure. Blocks in Smalltalk use the usual return syntax (^) to indicate return from the containing method, and use the last expression evaluated as the return value for the block. Should there be a way to return a value from a closure out of the middle of the closure body? Should 'return' mean return from the closure or return from the function? I can see arguments on both sides. For nested closures, you can imagine an "intermediate" return that returns from a containing closure but not the outermost function body. Blocks in Smalltalk don't include this capability (innermost return or outermost return are the only choices); I have no experience with it, so I don't know how useful it is. But if D is to include multi-level breaks, it probably should also include multi-level returns. Eighth, other control structures. Consider the following code fragment - while( p < p_limit ){ candidate = *p; candidate.enumerate_parts( [ part ]{ if( part.passesTest() ) break; } ); p ++; } Notice what's happening. The closure is testing a value, and in the event that the test succeeds, the 'break' breaks out of the while in the function body outside the closure - a non-local transfer of control. There are similar examples for 'continue' and (dare I say it) 'goto'. These control structures aren't that different from uplevel returns in terms of what mechanisms are needed. Notice that any non-local control transfer - either a return out of or a break/continue/goto into - a function body containing a closure can fail at runtime if that function activation has already returned. This is true for intermediate (closure) function bodies as well as outermost function bodies. Presumably such a case would generate some kind of exception. Despite the difficulties of these exceptional conditions (and the reactions I'm sure some people will have at the idea of using 'goto' out of a closure), I would argue that if the language is going to be orthogonal then as much as possible closures should look like the regular built-in control structures and non-local control transfers should be supported. Syntax for Closures =================== For invoking a closure - how about using the pointer to function notation (*closure)( argument, argument, argument, ... ) Note: personally I see nothing wrong with using the unadorned function call notation for pointers-to-function function_pointer( argument, argument, argument, ... ) and would use that for closures if it were allowed for function pointers, but I don't want to start an argument on that topic. For declaring a closure variable - this should be similar (not identical) to declaring a function pointer. How will function pointers be declared in D? For writing a closure body - well I like the syntax I've been using, assuming it doesn't cause too many problems: [ ParameterList ] { FunctionBody } The types of the parameters and the return value need to go somewhere, which is sort of a pain. The parameter types might go in the parameter list, much like function definitions, e.g., [ int a, int b ] { a < b ? a : b } but for the return type, I'm at more of a loss. Perhaps other people will have some suggestions. Final Remarks ============= I think closures should be considered for D. I've tried to lay out the basic issues and implications of D having closures. I can say based on experience that closures are extremely useful - given a choice between closures and exception handling, I think closures are more generally useful. Also, I think having closures in some form is a prerequisite to writing a good set of Collection classes, which is a large part of the motivation for having some form of generics (or templates).
Nov 07 2001
Quite a lot of text, and honestly it only confused me. What's now the advantage of a closure? How is it implemented? Especially in featureless systems where C originally was developed for. In UserSpace resources are easy to get/implement, but what in kernel space? In embedded OS-less applications? This places are the real motherhood of C. Maybe more for a presentational hint, try to get faster to the point, and explain it good and in a clean easy to understand way. This text sounds just so universitary from professors who's thing they most love in life is to hear themself's talking. - Axel
Nov 07 2001
Here's an example of where a closure might be useful: quicksort( string_pointer_array, number_of_elements, [ char * a, char * b ]{ strcmp( a, b ) } ); So we get to write the compare function right here rather than having to define a separate function. Closures are also useful for dealing with collections: some_collection.do( [ ElementType element ]{ if( element.satisfies_some_condition() ) return element; } ); return NULL; This search pattern occurs frequently, so we might very well have a function that does it already, in which case we could write ElementType e = some_collection.first_element_that( [ ElementType e ]{ e.satisfies_some_condition() } ); Closures can also be used to create control structures that haven't (yet) gotten into the language: foreach_element_pair( array_a, array_b, [ ElementType a_element, ElementType b_element ]{ /* do something with a_element and b_element */ } ); Do these examples help? Implementation - closures don't need interpretation or any sort of runtime compilation; they are compiled in the same way that functions are. The non-local control transfers need some sort of mechanism similar to an exception handling mechanism - I expect Walter can provide some details here. To respond to the individual points -Quite a lot of text, and honestly it only confused me.Sorry for the confusion. There was a lot of ground to cover, and I thought breaking it into small pieces would help the presentation.What's now the advantage of a closure? How is it implemented? Especially in featureless systems where C originally was developed for. In UserSpace resources are easy to get/implement, but what in kernel space? In embedded OS-less applications? This places are the real motherhood of C.Compiling a closure is much like compiling a function. The cost of calling a closure is about the same as calling through a pointer to function; the major difference is that a closure needs an extra pointer to establish its execution context. Functional closures like the quicksort example above have essentially no more overhead than calling through pointer to function. Closures that need access to local context require one heap allocation on function entry (regardless of how many closures in the function there are). I tried to talk about some advantages up above.Maybe more for a presentational hint, try to get faster to the point, and explain it good and in a clean easy to understand way. This text sounds just so universitary from professors who's thing they most love in life is to hear themself's talking.I appreciate the advice.
Nov 07 2001
<la7y6nvo shamko.com> wrote in message news:s7c4ro6ta1a.fsf michael.shamko.com...Here's an example of where a closure might be useful:cksort(string_pointer_array, number_of_elements, [ char * a, char * b ]{ strcmp( a, b ) } ); So we get to write the compare function right here rather than having to define a separate function. Closures are also useful for dealing with collections: some_collection.do( [ ElementType element ]{ if( element.satisfies_some_condition() ) return element; } ); return NULL; This search pattern occurs frequently, so we might very well have a function that does it already, in which case we could write ElementType e = some_collection.first_element_that( [ ElementType e ]{ e.satisfies_some_condition() } ); Closures can also be used to create control structures that haven't (yet) gotten into the language: foreach_element_pair( array_a, array_b, [ ElementType a_element, ElementType b_element ]{ /* do something with a_element and b_element */ } ); Do these examples help?If I understand correctly, the only thing similar to Closures that I've seen are Java inner (and anonymous) classes. Does inner classes support resolve the problem? Ciao
Nov 09 2001
"Roberto Mariottini" <rmariottini lycosmail.com> writes:If I understand correctly, the only thing similar to Closures that I've seen are Java inner (and anonymous) classes. Does inner classes support resolve the problem?Nested classes (assuming that's what Java inner classes are) seem like a related but distinct mechanism. An important aspect of closures is being able to write them right where you need them rather than defining them elsewhere. The experience I have with nested classes (which experience is in C++) is that nested classes don't provide this capability, which is a significant lack. Beyond that, it's possible that some sort of nested class facility could provide a semantically equivalent mechanism. I'm not aware of any however.
Nov 09 2001
<la7y6nvo shamko.com> wrote innews:s7cadxvl82v.fsf michael.shamko.com..."Roberto Mariottini" <rmariottini lycosmail.com> writes:seenIf I understand correctly, the only thing similar to Closures that I'veAgreed. But inner classes are not simply nested classes. A nested class is what in Java is called "static inner class", while non-static inner classes behave much like closures. Ciaoare Java inner (and anonymous) classes. Does inner classes support resolve the problem?Nested classes (assuming that's what Java inner classes are) seem like a related but distinct mechanism. An important aspect of closures is being able to write them right where you need them rather than defining them elsewhere. The experience I have with nested classes (which experience is in C++) is that nested classes don't provide this capability, which is a significant lack.
Nov 12 2001
The message doesn't say what either "static inner classes" or "non-static inner classes" are. After doing a bit of research on the web, I found these pages: http://java.sun.com/products/jdk/1.1/docs/guide/innerclasses/spec/innerclasses.doc1.html http://java.sun.com/products/jdk/1.1/docs/guide/innerclasses/spec/innerclasses.doc2.html The phrase "nested classes" as I used the term was intended to include both what Java calls static inner classes and (non-static) inner classes. Java allows anonymous inner classes which can be written in expressions. Java inner classes provide some capabilities that closures do not: 1) Multiple instances can be made; 2) They have encapsulated state (the inner class's instance variables); 3) They can have more than one "entry point" (method). Closures are less general in some ways but also offer some advantages: 1) Closures offer a more compact notation for what they do; 2) Closures allow non-local control transfers (which as near as I can tell inner classes in Java do not). So I don't see either closures or inner classes as providing all the capabilities of the other. Like the earlier message said, they are related but distinct mechanisms. Perhaps inner classes are useful enough to merit inclusion. That's a big topic for discussion. The advantages listed for closures argue for closures being considered for inclusion in D whether inner classes are included or not. That all make sense? "Roberto Mariottini" <rmariottini lycosmail.com> writes:<la7y6nvo shamko.com> wrote innews:s7cadxvl82v.fsf michael.shamko.com..."Roberto Mariottini" <rmariottini lycosmail.com> writes:seenIf I understand correctly, the only thing similar to Closures that I'veAgreed. But inner classes are not simply nested classes. A nested class is what in Java is called "static inner class", while non-static inner classes behave much like closures. Ciaoare Java inner (and anonymous) classes. Does inner classes support resolve the problem?Nested classes (assuming that's what Java inner classes are) seem like a related but distinct mechanism. An important aspect of closures is being able to write them right where you need them rather than defining them elsewhere. The experience I have with nested classes (which experience is in C++) is that nested classes don't provide this capability, which is a significant lack.
Nov 12 2001
If I understand correctly, the only thing similar to Closures that I've seen are Java inner (and anonymous) classes. Does inner classes support resolve the problem?What an insight! Yes thinking about it inner classes are the same thing in java, only with another skin. (They running frame is created upon execution, it can outlive it's father routine, you can have several of them at the same time etc... ) So I used already something like closures without knowing it :o) - Axel -- |D) http://www.dtone.org
Nov 09 2001
<la7y6nvo shamko.com> wrote in message news:s7cbsieubtt.fsf michael.shamko.com...form is a prerequisite to writing a good set of Collection classes, which is a large part of the motivation for having some form of generics (or templates).First, I don't understand the need of closures for Collections (not that I understand them - closures - very well at all =)). But since D has dynamic and associative arrays, I think most requirements for containers are covered.
Nov 07 2001
Dynamic and associative arrays barely scratch the surface of container types. Consider a video game, which maintains its world database as an octree of objects of various types. Would it not be nice to be able to write code something like this:? octreeScene.ForEach( [ octreeNode node ] { if ( node.IsAlive() && camera.CanSee(node.boundSphere) ) node.Render(); } ); This is alot of expressive power... something that may help D out alot. I really needs something to make up for lack of generics. The alternative is a fairly clunky functor object. This is almost a way to get the compiler to write functor objects for you... I have a feeling they'd be pretty useful for callbacks as well. Sean "Pavel "EvilOne" Minayev" <evilone omen.ru> wrote in message news:9sb9jv$ik3$1 digitaldaemon.com...<la7y6nvo shamko.com> wrote in message news:s7cbsieubtt.fsf michael.shamko.com...form is a prerequisite to writing a good set of Collection classes, which is a large part of the motivation for having some form of generics (or templates).First, I don't understand the need of closures for Collections (not that I understand them - closures - very well at all =)). But since D has dynamic and associative arrays, I think most requirements for containers are covered.
Nov 08 2001
"Sean L. Palmer" <spalmer iname.com> wrote in message news:9sdl9d$28id$1 digitaldaemon.com...Dynamic and associative arrays barely scratch the surface of container types. Consider a video game, which maintains its world database as an octree of objects of various types. Would it not be nice to be able to write code something like this:? octreeScene.ForEach( [ octreeNode node ] { if ( node.IsAlive() && camera.CanSee(node.boundSphere) ) node.Render(); } ); This is alot of expressive power... something that may help D out alot. I really needs something to make up for lack of generics.for-each loops is something that I was going to offer and that, IMHO, must be in the language that supports dynamic and associative arrays. So, this example could be rewritten as (note, this is also a syntax proposal): for(node: octreeNode) { if (node.IsAlive() && camera.CanSee(node.boundSphere)) node.Render(); }
Nov 08 2001
<la7y6nvo shamko.com> wrote in message news:s7cbsieubtt.fsf michael.shamko.com...I'd like to talk about the idea of having closures in D.In fact, after reading this one more time, I really can't understand HOW can this be implemented in a non-interpreted language like D. Nor do I see any advantages of it... Maybe I'm wrong. I'm not sure that I understood anything right, besides, I'm not familiar with Smalltalk, so could you please give a shorter and more self-explanatory example? =)
Nov 07 2001
Pavel \"EvilOne\" Minayev wrote:<la7y6nvo shamko.com> wrote in message news:s7cbsieubtt.fsf michael.shamko.com...Closures can be used as a form of generic programming, but in a slightly different way than C++ templates. Perl has them and makes them very convenient to use; Perl has the advantage that the run-time environment is guaranteed to include a Perl compiler[1]. Here's a trivial example of a closure in Perl: for my $color (qw[red yellow orange green blue purple violet]) { *$color = sub { qq<<FONT COLOR="\U$color\E"> _</FONT>> }; } Don't worry about the hairballs, that's just Perl. :) This automatically generates functions/subroutines red(), yellow(), orange()... etc., which generate some HTML to render strings in the specified color. In C/C++, you could approach this with #defined macros (except for the trick where function red() generates the color tag "RED"), but closures offer quite a bit more power than this. The above example in and of itself could be evaluated completely at compile time, but you could hypothetically replace the list of color names above with a list of arbitrary dynamic text strings. At this point, you _have_ to have dynamic binding and a compiler in the runtime. My guess is that this need is incompatible with the desire to have D be easy to implement. -RB [1] Inasmuch as perl is a compiled language.I'd like to talk about the idea of having closures in D.In fact, after reading this one more time, I really can't understand HOW can this be implemented in a non-interpreted language like D. Nor do I see any advantages of it...
Nov 07 2001
"Russell Borogove" <kaleja estarcion.com> wrote in message news:3BE98692.7994A03C estarcion.com...The above example in and of itself could be evaluated completely at compile time, but you could hypothetically replace the list of color names above with a list of arbitrary dynamic text strings. At this point, you _have_ to have dynamic binding and a compiler in the runtime. My guess is that this need is incompatible with the desire to have D be easy to implement.Thanks for explaining, now everything seems much clearer. And I don't think that it's worth implementing. As you've said it requires dynamic binding, which would probably bloat the code... no thanks. Just not the "D way", I suppose - or at least it feels like that.
Nov 07 2001
The Perl example is generating functions dynamically at runtime. Closures are more static, similar to nested functions. In fact the main difference between a nested function and a closure is that closures are written in line at the point of use. Imagine a C-like language with nested functions, eg int outer_function(){ int inner_function(){ /* inner function body ... */ } needs_pointer_to_function( inner_function ); } A closure is like a function except the code is written directly where it's needed: int outer_function(){ needs_pointer_to_function( []{ /* inner function body ... */ } ); } The Perl example (as I understand it - I'm not a Perl expert) is generating function names (and I guess function bodies) dynamically and hooking those functions into the environment. Closures don't do any dynamic generation. The code is compiled just like a function body, and the value of the closure is stored in an ordinary variable, like pointer-to-function, except closures have two pointers - one for the code body, one for the execution environment - rather than just one. Russell Borogove <kaleja estarcion.com> writes:Pavel \"EvilOne\" Minayev wrote:<la7y6nvo shamko.com> wrote in message news:s7cbsieubtt.fsf michael.shamko.com...Closures can be used as a form of generic programming, but in a slightly different way than C++ templates. Perl has them and makes them very convenient to use; Perl has the advantage that the run-time environment is guaranteed to include a Perl compiler[1]. Here's a trivial example of a closure in Perl: for my $color (qw[red yellow orange green blue purple violet]) { *$color = sub { qq<<FONT COLOR="\U$color\E"> _</FONT>> }; } Don't worry about the hairballs, that's just Perl. :) This automatically generates functions/subroutines red(), yellow(), orange()... etc., which generate some HTML to render strings in the specified color. In C/C++, you could approach this with #defined macros (except for the trick where function red() generates the color tag "RED"), but closures offer quite a bit more power than this. The above example in and of itself could be evaluated completely at compile time, but you could hypothetically replace the list of color names above with a list of arbitrary dynamic text strings. At this point, you _have_ to have dynamic binding and a compiler in the runtime. My guess is that this need is incompatible with the desire to have D be easy to implement. -RB [1] Inasmuch as perl is a compiled language.I'd like to talk about the idea of having closures in D.In fact, after reading this one more time, I really can't understand HOW can this be implemented in a non-interpreted language like D. Nor do I see any advantages of it...
Nov 07 2001
The following is a paper about adding closures to C++. Unfortunately there method does not seem to support closures that persist beyond the lifetime of the activation record of the enclosing function. http://people.debian.org/~karlheg/Usenix88-lexic.pdf <la7y6nvo shamko.com> wrote in message news:s7cbsieubtt.fsf michael.shamko.com...I'd like to talk about the idea of having closures in D. For those who aren't used to the term closure, a closure is something like a pointer to function, and usually is created by a syntactic construct that appears in source code in the form of an anonymous "code literal". Closures in Smalltalk ===================== I'll start by talking about closures in Smalltalk, which calls them "Blocks". (I'm assuming people can read Smalltalk syntax, which may be a bad assumption, but that's what I'm assuming.) A block in Smalltalk is code that appears between square brackets. An example would be symbolTable at: identifier ifAbsent: [ defaultSymbol ]. The '[ defaultSymbol ]' is a block that is executed only if the 'identifier' key is not present in 'symbolTable'. This example isn't very motivating, because defaultSymbol is just a value. Consider this however symbolTable at: identifier ifAbsent: [ UndeclaredSymbol new ]. The '[ UndeclaredSymbol new ]' block only gets executed if the identifier is not present. This could be useful because making a new UndeclaredSymbol might involve lots of computation. The examples above return values, but blocks can also have side effects. For example, symbolTable at: identifier ifAbsent: [ undeclaredIdentifiers add: identifier. UndeclaredSymbol new ]. Notice that the last expression evaluated in a block is what is returned for the value of the block. (Please, no flames about bracket placement style - I know some people prefer the open bracket to go on a line by itself.) Blocks can also take arguments, for example - symbolTable keysAndValuesDo: [ :identifier :entry | output nextPut: '(', identifier printString, '->', entry printString, ')'. ]. which causes a printable form of a symbol table to be added to the 'output' object. Blocks get much of their power from being able to access local context. In an example above, the block contains references to 'identifier' and 'undeclaredIdentifiers', which might be instance variables or perhaps local variables. This ability puts blocks on an equal footing with the more usual if/then/else and looping control structures. (In fact these "built-in" control structures are themselves implemented using blocks in Smalltalk, but that point is incidental - the main thing is that having access to local context gives an important kind of expressive power.) Incidentally, blocks can declare variables local to the block. (I'm not going to show an example for this.) Blocks also have a capability that might be surprising to people used to C, namely the ability to return (a value) from the method that contains the block. So for example referencesTo: identifier "Answer a collection of all references to identifier" | entry | "this is a local variable" entry := self at: identifier ifAbsent: [ ^ Set new ]. ^ programSource referencesToEntry: entry The ^ in Smalltalk is like 'return' in C. The block '[ ^ Set new ]' will, if executed, immediately return a value from the 'referencesTo:' method, and the following statement will not be executed at all. Here again, the ability to return a value from the enclosing function is one that adds expressive power, and puts blocks on an equal footing with built-in control structures. Blocks in Smalltalk are fairly representative of closures that are available in other programming languages, taking into account that Smalltalk is more of an imperative language than a functional language (in the sense of "functional programming"). Closures in D ============= For the sake of discussion let's assume that closures will be included in D. What are the implications of this? Clearly there are syntax questions (about which more later), and there are questions about what capabilities should be supported. For the moment let's use [ ParameterList ]{ FunctionBody } as the syntax for a closure in D, without being too particular about types or return values, and look at semantic issues. First, the body of a closure is the same as that for a function body, which means closures can declare variables local to the closure activation. Second, the code in the body of the closure should have access to the context of the surrounding code, notably instance variables and local variables. Instance variables aren't a problem (just have the closure keep a pointer to the object in question), but local variables are a little trickier, since a closure can outlive the stack frame of the method that created it. This could be handled by taking any variables used in closure bodies and allocating them in a heap object and having the closure hang on to a pointer to that. Yes, there is a cost associated with doing that allocation, but only for the cases where there are closures that reference local variables. [Note added during editing - People who are used to stack-based local variable frames may be surprised by the idea that local variables might outlive the function activation. Rather than get into that in too much detail here I think a follow-up post is a better idea. Expect that shortly.] Third, recursion - as with ordinary functions, closures need a stack frame for their local variables so recursion can occur. A closure activation is not the same as a closure. Fourth, nested closures. Closure bodies can contain other closure bodies. This brings with it all the usual machinery needed for nested functions, in particular accessing local variables of containing closures (when that happens). As with functions, closures that have local variables accessed by contained closures would allocate those local variables in an object in the heap. Fifth, interchangeability with function pointers. Getting a closure to masquerade as a function pointer is a little tricky (because a function pointer is only one pointer, whereas a closure is two pointers - a pointer to code and a pointer to context that the code runs in). Maybe it's useful to do that, I don't know. But going the other way is easy - turning a function pointer into a closure just means adding a null pointer for execution context, and perhaps a dummy closure body that simply invokes through the function pointer. The more closures and function pointers are interchangeable (as far as the function getting them as arguments is concerned), the more the language follows the "Law of Least Astonishment". Incidentally, closures being interchangeable with function pointers means closures should have all the same parameter passing modes. In particular, if functions have OUT and INOUT parameters, then closures should have them. Sixth, invoking a closure. How about the same syntax as that for invoking through a function pointer? (*closure_parameter)( argument, argument, argument, ... ) Seventh, closure returns. It's important to distinguish between returning from a closure and returning from a function body containing a closure. Blocks in Smalltalk use the usual return syntax (^) to indicate return from the containing method, and use the last expression evaluated as the return value for the block. Should there be a way to return a value from a closure out of the middle of the closure body? Should 'return' mean return from the closure or return from the function? I can see arguments on both sides. For nested closures, you can imagine an "intermediate" return that returns from a containing closure but not the outermost function body. Blocks in Smalltalk don't include this capability (innermost return or outermost return are the only choices); I have no experience with it, so I don't know how useful it is. But if D is to include multi-level breaks, it probably should also include multi-level returns. Eighth, other control structures. Consider the following code fragment - while( p < p_limit ){ candidate = *p; candidate.enumerate_parts( [ part ]{ if( part.passesTest() ) break; } ); p ++; } Notice what's happening. The closure is testing a value, and in the event that the test succeeds, the 'break' breaks out of the while in the function body outside the closure - a non-local transfer of control. There are similar examples for 'continue' and (dare I say it) 'goto'. These control structures aren't that different from uplevel returns in terms of what mechanisms are needed. Notice that any non-local control transfer - either a return out of or a break/continue/goto into - a function body containing a closure can fail at runtime if that function activation has already returned. This is true for intermediate (closure) function bodies as well as outermost function bodies. Presumably such a case would generate some kind of exception. Despite the difficulties of these exceptional conditions (and the reactions I'm sure some people will have at the idea of using 'goto' out of a closure), I would argue that if the language is going to be orthogonal then as much as possible closures should look like the regular built-in control structures and non-local control transfers should be supported. Syntax for Closures =================== For invoking a closure - how about using the pointer to function notation (*closure)( argument, argument, argument, ... ) Note: personally I see nothing wrong with using the unadorned function call notation for pointers-to-function function_pointer( argument, argument, argument, ... ) and would use that for closures if it were allowed for function pointers, but I don't want to start an argument on that topic. For declaring a closure variable - this should be similar (not identical) to declaring a function pointer. How will function pointers be declared in D? For writing a closure body - well I like the syntax I've been using, assuming it doesn't cause too many problems: [ ParameterList ] { FunctionBody } The types of the parameters and the return value need to go somewhere, which is sort of a pain. The parameter types might go in the parameter list, much like function definitions, e.g., [ int a, int b ] { a < b ? a : b } but for the return type, I'm at more of a loss. Perhaps other people will have some suggestions. Final Remarks ============= I think closures should be considered for D. I've tried to lay out the basic issues and implications of D having closures. I can say based on experience that closures are extremely useful - given a choice between closures and exception handling, I think closures are more generally useful. Also, I think having closures in some form is a prerequisite to writing a good set of Collection classes, which is a large part of the motivation for having some form of generics (or templates).
Nov 09 2001