digitalmars.D - Escape analysis
- Walter Bright (23/23) Oct 27 2008 The delegate closure issue is part of a wider issue - escape analysis. A...
- Steven Schveighoffer (33/56) Oct 27 2008 I think the default should be no escape. This should cover 90% of cases...
- Hxal (15/24) Oct 27 2008 While requiring parameters by default to not escape the function
- Walter Bright (7/7) Oct 27 2008 The reason the scope/noscope needs to be part of the function signature
- Steven Schveighoffer (37/44) Oct 28 2008 But the documentation is not enough. You cannot express the intricacies...
- Robert Jacques (6/43) Oct 28 2008 Escape analysis also applies to shared/local/scope storage types and not...
- Steven Schveighoffer (19/76) Oct 28 2008 shared/unshared is not a storage class, it is a type modifier (like cons...
- Robert Jacques (8/16) Oct 28 2008 No, because shared and local objects get created and garbage collected o...
- Steven Schveighoffer (24/39) Oct 28 2008 That's an interesting point. Shared definitely has to be a type modifie...
- Robert Jacques (15/63) Oct 28 2008 This is a desirable error that was discussed back when shared was
- Walter Bright (15/54) Oct 28 2008 I think it is conceptually straightforward whether a reference escapes
- Sean Kelly (22/45) Oct 28 2008 There's another weird issue that I'm not sure if anyone has touched on:
- Walter Bright (5/9) Oct 28 2008 What you're talking about is the escaping of pointers to local
- Don (3/14) Oct 29 2008 You could allow it in inside a pure function, whenever the return type
- Sergey Gromov (3/52) Oct 28 2008 Part of s escapes, so the compiler should assume that the whole s
- Sean Kelly (9/52) Oct 28 2008 So are you saying that I'd have to rewrite fnA as:
- Steven Schveighoffer (21/76) Oct 28 2008 This 'feature' is basically useless ;) D has no shared libraries, so I
- Walter Bright (12/57) Oct 28 2008 I disagree. The whole idea behind separate compilation and using
- Steven Schveighoffer (47/104) Oct 28 2008 Any decent build tool (including make, assuming dependencies are created...
- Walter Bright (4/4) Oct 27 2008 Pure functions almost implicitly imply that its parameters are all
- Michel Fortin (9/13) Oct 27 2008 Not if you define "scope" in the function prototype as not escaping the
- Andrei Alexandrescu (4/8) Oct 27 2008 I think even the return value can be considered scoped. Essentially it
- Jason House (2/12) Oct 28 2008 As far as I know, there's no way for functions to specially prepare obje...
- Jason House (3/36) Oct 27 2008 I like the original definition of in as "const scope". I would also lik...
- Robert Fraser (2/35) Oct 27 2008 I get the feeling that D's type system is going to become the joke of th...
- Walter Bright (4/8) Oct 27 2008 That argues that "noscope" should be the default. Using "scope" would be...
- Denis Koroskin (9/17) Oct 27 2008 I hope that 'noscope' is considered to be default *not* because is would...
- Robert Fraser (6/16) Oct 27 2008 My point wasn't the number of keywords... ("shared" is actually the
- Walter Bright (6/11) Oct 27 2008 The complexity is an issue that concerns me. That's why I suspect that
- Andrei Alexandrescu (3/20) Oct 27 2008 I don't think you have a case.
- Michel Fortin (16/18) Oct 27 2008 I don't think you have much choice. Take these examples:
- Walter Bright (2/15) Oct 27 2008 scope is a storage class, not a type constructor.
- Jason House (2/18) Oct 27 2008 How do you treat members of objects passed in? If I pass in a struct wit...
- Walter Bright (2/7) Oct 27 2008 The scope applies to the bits of the object, not what they may refer to.
- Michel Fortin (18/27) Oct 28 2008 So basically, we always have head-scope. Here's my question:
- Jason House (2/36) Oct 28 2008
- Jason House (2/10) Oct 28 2008 This seems rather limiting. I know this is aimed at addressing the dynam...
- Andrei Alexandrescu (10/25) Oct 28 2008 I think it's clear that scope is transitive as much as const or
- Jason House (2/31) Oct 28 2008 Transitive scope means that scope can't be a storage class. It's a trick...
- Steven Schveighoffer (14/38) Oct 28 2008 A quick patch is not possible IMO.
- Denis Koroskin (21/25) Oct 28 2008 On Tue, 28 Oct 2008 16:54:15 +0300, Steven Schveighoffer
- Bill Baxter (10/16) Oct 28 2008 So basically programmers have to memorize all the rules the compiler
- Steven Schveighoffer (26/46) Oct 28 2008 First, the compiler does not have any sound rules for this. It currentl...
- Bill Baxter (21/60) Oct 28 2008 I don't see why not. Because the compiler might be allocating a
- Steven Schveighoffer (27/98) Oct 28 2008 No, I'm proposing the compiler SHOULDN'T allocate closures unless it can
- Sean Kelly (4/11) Oct 28 2008 Like const, I'd rather have no solution than a bad solution insofar as
- Bill Baxter (10/20) Oct 28 2008 The only serious problem people have right now is that closures are
- Sean Kelly (32/52) Oct 28 2008 This would be the most backwards-compatible way also. The only real
- Walter Bright (13/20) Oct 28 2008 The counter to that is that when there is an inadvertent escape of a
- Sean Kelly (13/32) Oct 28 2008 I think the cost/benefit of this could probably be argued either way.
- Jason House (2/11) Oct 28 2008 As the author of an open source multithreaded application in D1, I've ha...
- Jarrett Billingsley (5/16) Oct 28 2008 For what it's worth, std.bind I think depends on one Phobos-specific
- Jason House (2/22) Oct 28 2008 I ported a bind implementation and maintain it in my code base. I didn't...
- Andrei Alexandrescu (6/22) Oct 28 2008 I agree. Particularly in higher-order code this kind of problem is bound...
- Bill Baxter (10/32) Oct 28 2008 I've had bugs caused by this but they were pretty easy to find.
- Andrei Alexandrescu (4/34) Oct 28 2008 I don't think we can afford program correctness to rest on anecdote and
- Walter Bright (8/10) Oct 28 2008 I agree. When you're managing a program with a million lines of code in
- Bill Baxter (6/56) Oct 28 2008 I haven't seen any real data about how serious a problem this is from
- Andrei Alexandrescu (8/57) Oct 28 2008 Well to provide real data I'd have to spend time on user studies, which
- Andrei Alexandrescu (9/74) Oct 29 2008 I just wanted to issue an apology to Bill for the above, which is
- Bill Baxter (11/95) Oct 29 2008 No problem. My comment leading to that response was a bit snarky too.
- Steven Schveighoffer (18/23) Oct 30 2008 I doubt anyone wants that. But here is my main concern (my defense for
- ore-sama (2/4) Oct 30 2008 Moreover that sematics in some cases will force allocation when it's not...
- Walter Bright (9/12) Oct 28 2008 I have. Not often in my own code because I am very careful to avoid it,
- Sean Kelly (9/22) Oct 28 2008 I tend to ask a question along these lines to entry-level interviewees
- Walter Bright (11/16) Oct 28 2008 To me that is akin to building a car with no brakes and justifying it by...
- Steven Schveighoffer (6/18) Oct 28 2008 I agree with this. It would be nice to be able to flag these kinds of
- Bill Baxter (7/25) Oct 28 2008 Ok, I think we're completely on the same page here. I'm for the
- Robert Fraser (5/32) Oct 29 2008 How about adding a warning switch (I know Walter you're against them but...
- Robert Jacques (15/16) Oct 27 2008 Okay, I'm confused. I had assumed that the escape scope was different fr...
- Andrei Alexandrescu (4/41) Oct 27 2008 This is a misunderstanding. Scope is a storage class, not a type
- Mosfet (4/41) Oct 28 2008 I agree I think that D will be used only by people like you that
- Andrei Alexandrescu (5/50) Oct 28 2008 Well I think you were right. The question is how much you spend learning...
- dsimcha (18/22) Oct 28 2008 Seconded. Both C++ and D are very complex languages, but I don't see th...
- Jason House (7/40) Oct 27 2008 In D1, local variables implicitly follow a mixed rule:
- ore-sama (1/1) Oct 28 2008 Allocation is determined on delegate creation, not on passing it somewhe...
- Sergey Gromov (29/38) Oct 28 2008 I'm for safe defaults. Programs shouldn't crash for no reason.
- Steven Schveighoffer (31/69) Oct 28 2008 If safe defaults means 75% performance decrease, I'm for using unsafe
- Sergey Gromov (28/113) Oct 28 2008 Please note the "in the absence of function calls" part. I'm talking
- Steven Schveighoffer (27/145) Oct 28 2008 Ah, sorry. I read 'absence of function source'. My bad, in that case w...
- Sergey Gromov (13/76) Oct 29 2008 Allocation only happens when a stack variable reference escapes via a
- Steven Schveighoffer (38/120) Oct 29 2008 A static array declared on the stack absolutely is a stack variable.
- Sergey Gromov (14/124) Oct 29 2008 There is no delegate, therefore nothing to allocate a closure for. If
- Steven Schveighoffer (17/89) Oct 29 2008 I was under the impression that closures are currently allocated if you
- Sergey Gromov (6/24) Oct 29 2008 I do understand that. I just wanted to discuss whether it is possible
- Chad J (15/32) Oct 28 2008 If safe defaults means 2% performance decrease, I'm for using unsafe
- Robert Jacques (24/24) Oct 28 2008 I've run across some academic work on ownership types which seems releva...
- Michel Fortin (110/112) Oct 29 2008 I haven't read the paper yet, but the overview seems to go in the same
- Steven Schveighoffer (12/17) Oct 29 2008 [snip]
- Robert Jacques (4/27) Oct 29 2008 Note that one of a major points in the Pedigree paper is the static type...
- Michel Fortin (65/74) Oct 30 2008 I agree that this is becomming a problem, even without scope. What we
- Steven Schveighoffer (40/52) Oct 31 2008 But the burden you have left for the developer is a tough one. You have...
- Robert Jacques (19/87) Oct 31 2008 Tools can't handle function pointers, which is why escape analysis has
- Michel Fortin (62/122) Oct 31 2008 If you can't determine yourself that a function can work with scoped
- Steven Schveighoffer (27/80) Nov 01 2008 But often times, the safety of the call depends on how it is being calle...
- Andrei Alexandrescu (7/9) Nov 01 2008 I think that's a fair assessment. One suggestion I made Walter is to
- Steven Schveighoffer (4/12) Nov 02 2008 If scope delegates means trust the coder knows what he is doing (in the
- Andrei Alexandrescu (31/43) Nov 02 2008 It looks like things will move that way. Bartosz, Walter and I talked a
- bearophile (4/6) Nov 02 2008 UHm... I see. But I am not sure I like that. Isn't that a waste of memor...
- Andrei Alexandrescu (7/13) Nov 02 2008 Yah, we can't get rid of that. Possibilities discussed were (a) make
- dsimcha (7/13) Nov 02 2008 And a monitor. And RTTI. Then again, for code that absolutely must be a...
- Jarrett Billingsley (4/8) Nov 02 2008 No, they have a *pointer* to a vtable. There is only one vtable per
- Michel Fortin (56/89) Nov 02 2008 That's a little disapointing. I was hoping for something to fix all
- Andrei Alexandrescu (33/134) Nov 02 2008 That's only the half of it. If you want to take a look at a C-like
- Michel Fortin (25/40) Nov 02 2008 First, I think it's a pretty good idea to have this. Second, I think
- Andrei Alexandrescu (15/39) Nov 02 2008 [snip]
- Michel Fortin (62/75) Nov 03 2008 Studying things more in depth often at first leave you with the
- Andrei Alexandrescu (13/18) Nov 03 2008 It may be wise to read some more before writing some more. As far as I
- Michel Fortin (35/57) Nov 04 2008 Pretty interesting slides.
- Andrei Alexandrescu (10/74) Nov 04 2008 Cyclone has region subtyping which takes care of that.
- Michel Fortin (25/47) Nov 04 2008 Indeed, I was somewhat mistaken that the <> notation was templates
- Michel Fortin (47/69) Nov 05 2008 Not the same way as I'm proposing. What cyclone does is make p
- Andrei Alexandrescu (25/98) Nov 06 2008 Well how about this:
- Steven Schveighoffer (8/14) Nov 07 2008 FWIW, I still think the proposal you have put forth about references bei...
- Michel Fortin (70/126) Nov 09 2008 I don't see a problem at all. The compiler would expand the lifetime of
- Christopher Wright (5/33) Nov 09 2008 In point of fact, it's expensive to extend the stack, so any compiler
- Michel Fortin (60/79) Nov 14 2008 If you mean there could be a problem with functions referring to the
- Andrei Alexandrescu (33/116) Nov 09 2008 I agree that an escape analyzer would improve things. I am not sure that...
- Michel Fortin (87/176) Nov 12 2008 If you think I proposed a region-oblivious scheme, then you've got me
- Andrei Alexandrescu (55/187) Nov 12 2008 But how do you type then the assignment example?
- Hxal (42/80) Nov 12 2008 Examples such as this one are rare enough to afford the need for
- Michel Fortin (106/242) Nov 12 2008 Everywhere I said there was no need for named regions, I also said
- Andrei Alexandrescu (46/180) Nov 12 2008 No, the code is correct as written (without the if). You may want to
- Michel Fortin (162/300) Nov 14 2008 Ok, I've reread that part and it's true that using Cyclone's subtyping
- Andrei Alexandrescu (5/8) Nov 14 2008 By this I meant I don't have time (t < 0), not that I was writing while
- Robert Jacques (4/10) Nov 02 2008 Does this mean the whole shared/local/scope issue for classes is being
- Andrei Alexandrescu (3/15) Nov 02 2008 What issue do you have in mind?
- Robert Jacques (5/18) Nov 03 2008 Right now, it's trivial for scope classes to escape due to automatic
- Steven Schveighoffer (64/105) Nov 03 2008 Isn't this already the case?
- Andrei Alexandrescu (20/121) Nov 03 2008 It's planned as a compiler switch and module option. Essentially SafeD
- Steven Schveighoffer (38/69) Nov 04 2008 I personally probably won't use it, as I feel I have enough experience t...
- Robert Jacques (5/39) Nov 02 2008 Various research languages have shown both 1 and 2 are possible.
- Steven Schveighoffer (28/70) Nov 03 2008 I think 1 can be possibly done. 2 is a matter of subjectivity, and so f...
- Michel Fortin (28/55) Nov 04 2008 I won't dispute this. I'll note that the upcomming "shared" keyword may
- Robert Jacques (11/69) Oct 29 2008 What does the scope part of 'scope MyObject o' mean? (i.e. is this D's
- Michel Fortin (50/130) Oct 30 2008 Ok, I should have defined that better. It means that o is bound the
- Robert Jacques (12/47) Oct 30 2008 Just to clarify:
- Michel Fortin (10/21) Oct 30 2008 Well, it all depends if foo wants the second argument of i must be
- Robert Jacques (5/21) Oct 31 2008 Actually, what I meant was that o may be local or shared. However,
- Robert Jacques (8/10) Oct 30 2008 How about o.scope instead of scope(o)? Also, this would allow
- Michel Fortin (9/20) Oct 30 2008 Hum, but can that syntax guarenty a reference to o or i won't escape
- Robert Jacques (12/28) Oct 31 2008 No, the syntax was meant to address the more complex problem of specifyi...
- Robert Jacques (10/26) Oct 31 2008 Another option is for the default to be escape. i.e. a contract is
- Robert Jacques (3/4) Oct 31 2008 Correction: default to be _no_ escape.
- bearophile (5/5) Oct 29 2008 I think C++ designers are fully mad, this shows how to use C++ lambdas:
- Sergey Gromov (4/11) Oct 29 2008 Well, they're somewhat limited, and a bit manual, and actually just a
- Bill Baxter (8/19) Oct 29 2008 I think it's mostly the capture mode [] stuff that's a bit ugly.
- Sergey Gromov (5/17) Oct 29 2008 The discussed features are really a significant improvement for C++
- Jarrett Billingsley (2/3) Oct 29 2008 It's called decltype().
- Robert Fraser (2/6) Oct 29 2008 C++ is a .NET language now ;-P
- Chad J (23/23) Oct 30 2008 I wonder if it would be easy enough to allocate closures lazily at runti...
- Christopher Wright (16/25) Nov 01 2008 I appreciate OOP. I also appreciate it when it takes no significant
The delegate closure issue is part of a wider issue - escape analysis. A
reference is said to 'escape' a scope if it, well, leaves that scope.
Here's a trivial example:

    int* foo() { int i; return &i; }

The reference to i escapes the scope of i, thus courting disaster. Another
form of escaping:

    int* p;
    void bar(int* x) { p = x; }

which is, on the surface, legitimate, but fails for:

    void abc(int j) { bar(&j); }

This kind of problem is currently undetectable by the compiler.

The first step is, are function parameters considered to be escaping by
default or not by default? I.e.:

    void bar(noscope int* p); // p escapes
    void bar(scope int* p);   // p does not escape
    void bar(int* p);         // what should be the default?

What should be the default? The functional programmer would probably
choose scope as the default, and the OOP programmer noscope.

(The issue with delegates is we need the dynamic closure only if the
delegate 'escapes'.)
Oct 27 2008
"Walter Bright" wroteThe delegate closure issue is part of a wider issue - escape analysis. A reference is said to 'escape' a scope if it, well, leaves that scope. Here's a trivial example: int* foo() { int i; return &i; } The reference to i escapes the scope of i, thus courting disaster. Another form of escaping: int* p; void bar(int* x) { p = x; } which is, on the surface, legitimate, but fails for: void abc(int j) { bar(&j); } This kind of problem is currently undetectable by the compiler. The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)I think the default should be no escape. This should cover 90% of cases, and does not have an 'allocate by default' policy. But I think whether a variable escapes or not cannot really be determined by the function accepting the variable, since the function doesn't know where the variable comes from. An example: void bar(int *x, ref int *y) { y = x;} How do you know that y is not defined in the same scope or a sub-scope of the scope of x? If the compiler sees: void bar(noscope int *x, scope ref int *y) It's going to assume that x will always escape, and probably allocate a closure so it can call bar. Which might not be the right decision. I think that without a full graph analysis of what escapes to where, it is going to be impossible to make this correct for the compiler to use, and that might be too much for the compiler to deal with. I'd rather just have the compiler assume scope unless told otherwise (at the point of use, not in the function signature). 
For instance: void bar(int *x, ref int *y) { y = x;} void abc(int x, ref int *y) { bar(noscope &x, y); } void abc2() { int x; int *y; bar(&x, y); } tells the compiler to allocate a closure for abc, because x might escape. But does not allocate a closure for abc2, because there are no escapes indicated by the developer. -Steve
Oct 27 2008
Walter Bright Wrote:
> The first step is, are function parameters considered to be escaping by
> default or not by default? I.e.:
>
>     void bar(noscope int* p); // p escapes
>     void bar(scope int* p);   // p does not escape
>     void bar(int* p);         // what should be the default?
>
> What should be the default? The functional programmer would probably
> choose scope as the default, and the OOP programmer noscope.

While requiring parameters by default not to escape the function would be
great - it'd cause less spam (I think they don't escape in most cases) and
potentially make programmers think around and refactor their code - it'd
also be quite a breaking change. Defaulting to no escape checking being
done and providing a scope parameter class therefore seems the more
obvious choice. It keeps existing code intact and allows correctness
checking and optimization on demand.

My only fear is that the feature will cause much frustration when we can
reason that a reference doesn't escape, but the compiler can't know that.
For example, putting one scope parameter into another's field, or
referencing a scope parameter from a complex return value.

Anyway, if escape analysis is implemented, I'd suggest using more
high-level terminology like temporary and permanent objects. That might
make more sense to beginners.
Oct 27 2008
The reason the scope/noscope needs to be part of the function signature is
because:

1. that makes it self-documenting
2. function bodies may be external, i.e. not present
3. virtual functions
4. it notifies the user if a library function parameter's scope-ness
   changes (you'll get a compile time error)
Oct 27 2008
"Walter Bright" wroteThe reason the scope/noscope needs to be part of the function signature is because: 1. that makes it self-documentingBut the documentation is not enough. You cannot express the intricacies of what variables are scope escapes so that the compiler can make intelligent enough decisions. What this will result in is slightly less unnecessary closures, but not enough to make a difference. Or else you won't be able to declare things the way you want, so you will be forced to declare something that *could* result in an escape, but usually doesn't.2. function bodies may be external, i.e. not present 3. virtual functionsYes, so you are now implying a scope escape contract on all derived classes. But not a very expressive one.4. notifies the user if a library function parameter scope-ness changes (you'll get a compile time error)Oh really? I imagined that if the scope-ness changed it just results in a new heap allocation when I call the function. i.e. Joe library developer has this function foo: int foo(scope int *x) {return *x;} And he now decides he wants to change it somehow: int *lastFooCalledWith; int foo(int *x) {lastFooCalledWith = x; return *x;} I used foo like this: int i; auto j = foo(&i); So does this now fail to compile? Or does it silently kill the performance of my code? If the latter, we are left with the same problem we have now. If the former, how does one call a function with a noscope parameter? The more I think about this, the more I'd rather have D1 behavior and some sort of way to indicate my function should allocate a heap frame (except on easily provable scope escapes). The most common case I think which will cause unnecessary allocations, is a very common case. A class setter: class X { private int *v_; int *v(int *newV) {return v_ = newV;} int *v() { return v_;} } Clearly, newV escapes into the class instance, but how do we know what the scope of the class instance is to know if newV truly escapes its own scope? -Steve
Oct 28 2008
On Tue, 28 Oct 2008 08:58:18 -0400, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
> [...] Clearly, newV escapes into the class instance, but how do we know
> what the scope of the class instance is, to know if newV truly escapes
> its own scope?

Escape analysis also applies to shared/local/scope storage types and not
just delegates. Consider having to write a function for every combination
of shared/local/scope for every object or pointer in the function
signature.
Oct 28 2008
"Robert Jacques" wroteOn Tue, 28 Oct 2008 08:58:18 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:shared/unshared is not a storage class, it is a type modifier (like const). But in any case, shared is much easier to define. Only one line needs to be checked -- is this accessible by another thread or not. Since it is a type modifier, it's carried around for every reference to shared data, and you can easily do escape analysis there. Scope is much more difficult because there are many scopes to consider. It's not just global or not global, you have a scope for each function, a scope for each set of braces within a function, and there is no easy way to say which scope you are referring to when you say a variable is scope. If you can only refer to the current scope, then you have not solved the closure problem, and useful escape analysis is impossible beyond simply 'a pointer to a variable I declared in this scope is being returned.' In order for escape analysis to be useful, I need to be able to specify in a function such as: void foo(int *x, int **y, int **z) That x might escape to y's or z's scope. How do you do allow that specification without making function signatures dreadfully complicated? -Steve"Walter Bright" wroteEscape analysis also applies to shared/local/scope storage types and not just delegates. Consider having to write a function for every combination of shared/local/scope for every object or pointer in the function signature.4. notifies the user if a library function parameter scope-ness changes (you'll get a compile time error)Oh really? I imagined that if the scope-ness changed it just results in a new heap allocation when I call the function. i.e. Joe library developer has this function foo: int foo(scope int *x) {return *x;} And he now decides he wants to change it somehow: int *lastFooCalledWith; int foo(int *x) {lastFooCalledWith = x; return *x;} I used foo like this: int i; auto j = foo(&i); So does this now fail to compile? 
Or does it silently kill the performance of my code? If the latter, we are left with the same problem we have now. If the former, how does one call a function with a noscope parameter? The more I think about this, the more I'd rather have D1 behavior and some sort of way to indicate my function should allocate a heap frame (except on easily provable scope escapes). The most common case I think which will cause unnecessary allocations, is a very common case. A class setter: class X { private int *v_; int *v(int *newV) {return v_ = newV;} int *v() { return v_;} } Clearly, newV escapes into the class instance, but how do we know what the scope of the class instance is to know if newV truly escapes its own scope?
Oct 28 2008
On Tue, 28 Oct 2008 09:44:28 -0400, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
> shared/unshared is not a storage class, it is a type modifier (like
> const).

No, because shared and local objects get created and garbage collected on
different heaps.

> In order for escape analysis to be useful, I need to be able to specify
> in a function such as:
>
>     void foo(int *x, int **y, int **z)
>
> that x might escape to y's or z's scope. How do you allow that
> specification without making function signatures dreadfully
> complicated?

Well, "x escapes to y or z" is easy, since it's how D works today. And if
you have a no_assignment type, then "x won't escape to y or z" is easy
too. It's the mixed cases where things get complicated. I'd recommend
looking up pedigree types as one possible solution.
Oct 28 2008
"Robert Jacques" wroteOn Tue, 28 Oct 2008 09:44:28 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:That's an interesting point. Shared definitely has to be a type modifier, otherwise, it cannot do this: shared int x = 0; int *xp = &x; // error, xp now is unshared, and points to shared data. But it probably also has to be a storage class also. Not sure about that.shared/unshared is not a storage class, it is a type modifier (like const).No, because shared and local objects get created and garbage collected on different heaps.But what if y or z is not in x's scope? For instance: void bar(ref int *y, ref int *z) { int x = 5; foo(&x, &y, &z); } If y or z gets set to &x, then you have to allocate a closure for bar. The opposite example: void bar(int *y, int *z) { int x = 5; foo(&x, &y, &z); } No closure necessary. So you need something to say that y or z can get set to x, so the compiler would be smart enough to only allocate a closure if y or z exists outside x's scope. Otherwise, you have unnecessary closures, and we are in the same boat as today. -SteveIn order for escape analysis to be useful, I need to be able to specify in a function such as: void foo(int *x, int **y, int **z) That x might escape to y's or z's scope. How do you do allow that specification without making function signatures dreadfully complicated?Well, x escapes to y or z is easy since it's how D works today.
Oct 28 2008
On Tue, 28 Oct 2008 14:46:34 -0400, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
> That's an interesting point. Shared definitely has to be a type
> modifier, otherwise it cannot do this:
>
>     shared int x = 0;
>     int *xp = &x; // error, xp is now unshared, and points to shared
>     data.
>
> But it probably also has to be a storage class. Not sure about that.

This is a desirable error that was discussed back when shared was
introduced. You can think of shared/local like immutable and mutable. The
real problem is that a 'const' for shared/local/scope isn't clear yet.

> But what if y or z is not in x's scope?

Which is an issue with the user of foo, but not foo's signature.

> If y or z gets set to &x, then you have to allocate a closure for bar.
> [...] Otherwise, you have unnecessary closures, and we are in the same
> boat as today.

This example, although important, is essentially about whether optimizing
the closure away is valid or not, and has nothing to do with the behaviour
of foo.

However, this does seem to illustrate a need for three types: global
escape (variable may escape to anywhere), pure escape (variable may escape
to other inputs), and no escape (variable is guaranteed not to escape).
For example, if foo saved &x to a static variable (global escape), then in
all cases it needs to be heap allocated. But if (as in your example) &x is
saved to one of the function inputs (pure escape), then the caller can
detect whether it can ensure no escape and therefore use the stack.
Oct 28 2008
Steven Schveighoffer wrote:"Walter Bright" wroteI think it is conceptually straightforward whether a reference escapes or not, though it is difficult for the compiler to detect it reliably.The reason the scope/noscope needs to be part of the function signature is because: 1. that makes it self-documentingBut the documentation is not enough. You cannot express the intricacies of what variables are scope escapes so that the compiler can make intelligent enough decisions. What this will result in is slightly less unnecessary closures, but not enough to make a difference. Or else you won't be able to declare things the way you want, so you will be forced to declare something that *could* result in an escape, but usually doesn't.First off, the mangled names will be different, so it won't link until you recompile. This is critical because the caller's code depends on the scope/noscope characteristic. Secondly, passing a scoped reference to a noscope parameter should be a compile time error.2. function bodies may be external, i.e. not present 3. virtual functionsYes, so you are now implying a scope escape contract on all derived classes. But not a very expressive one.4. notifies the user if a library function parameter scope-ness changes (you'll get a compile time error)Oh really? I imagined that if the scope-ness changed it just results in a new heap allocation when I call the function.The more I think about this, the more I'd rather have D1 behavior and some sort of way to indicate my function should allocate a heap frame (except on easily provable scope escapes).Having the caller specify it is not tenable, because the caller has no control over (and likely no knowledge of) what the callee does. Functions should be regarded as black boxes, where all you can know about them is in the function signature.The most common case I think which will cause unnecessary allocations, is a very common case. 
A class setter:

   class X
   {
       private int *v_;
       int *v(int *newV) { return v_ = newV; }
       int *v()          { return v_; }
   }

Clearly, newV escapes into the class instance,

Then it's noscope.

but how do we know what the scope of the class instance is to know if newV truly escapes its own scope?

We take the conservative approach, and regard "might escape" and "don't know if it escapes" as "treat as if it does escape".
Oct 28 2008
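Steven's setter scenario above can be sketched with the proposed annotation ('noscope' here is hypothetical syntax from this thread, not valid D) to show where a closure would be forced:

```d
// Sketch only: 'noscope' is the proposed annotation, not real D syntax.
class X
{
    private int *v_;

    // newV is stored in the object, so it escapes the call:
    // under the proposal the parameter must be declared noscope.
    int *v(noscope int *newV) { return v_ = newV; }
    int *v() { return v_; }
}

void caller()
{
    int local = 5;
    auto x = new X;
    x.v(&local); // &local refers to the stack; passing it to a noscope
                 // parameter is either a compile error or forces 'local'
                 // into a heap-allocated frame -- exactly the cost
                 // Steven is worried about.
}
```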
Walter Bright wrote:Steven Schveighoffer wrote:There's another weird issue that I'm not sure if anyone has touched on:

   struct S { int x; int getX() { return x; } }

   void main()
   {
       auto s = new S;
       fnA( s );
   }

   void fnA( S* s ) { fnB( &s.getX ); }
   void fnB( noscope int delegate() dg ) {}

How does the compiler handle this? It can't tell by inspecting the type whether the data for S is dynamic... in fact, the same could be said of a "scope" instance of a class. I guess it would have to assume that object variables without a "noscope" label must be scoped? Sean"Walter Bright" wroteI think it is conceptually straightforward whether a reference escapes or not, though it is difficult for the compiler to detect it reliably.The reason the scope/noscope needs to be part of the function signature is because: 1. that makes it self-documentingBut the documentation is not enough. You cannot express the intricacies of what variables are scope escapes so that the compiler can make intelligent enough decisions. What this will result in is slightly less unnecessary closures, but not enough to make a difference. Or else you won't be able to declare things the way you want, so you will be forced to declare something that *could* result in an escape, but usually doesn't.2. function bodies may be external, i.e. not present 3. virtual functionsYes, so you are now implying a scope escape contract on all derived classes. But not a very expressive one.
Oct 28 2008
Sean Kelly wrote:How does the compiler handle this? It can't tell by inspecting the type whether the data for S is dynamic... in fact, the same could be said of a "scope" instance of a class. I guess it would have to assume that object variables without a "noscope" label must be scoped?What you're talking about is the escaping of pointers to local variables. The compiler does not detect it, except in trivial cases. This is why, in safe mode, taking the address of a local variable will not be allowed.
Oct 28 2008
Walter Bright wrote:Sean Kelly wrote:You could allow it in inside a pure function, whenever the return type does not contain pointers.How does the compiler handle this? It can't tell by inspecting the type whether the data for S is dynamic... in fact, the same could be said of a "scope" instance of a class. I guess it would have to assume that object variables without a "noscope" label must be scoped?What you're talking about is the escaping of pointers to local variables. The compiler does not detect it, except in trivial cases. This is why, in safe mode, taking the address of a local variable will not be allowed.
Oct 29 2008
Sean Kelly wrote:Walter Bright wrote:Part of s escapes, so the compiler should assume that the whole s escapes. If s is scope by default, it should be a compile-time error here.Steven Schveighoffer wrote:There's another weird issue that I'm not sure if anyone has touched on: struct S { int x; int getX() { return x; } } void main() { auto s = new S; fn( s ); } void fnA( S* s ) { fnB( &s.getX ); }"Walter Bright" wroteI think it is conceptually straightforward whether a reference escapes or not, though it is difficult for the compiler to detect it reliably.The reason the scope/noscope needs to be part of the function signature is because: 1. that makes it self-documentingBut the documentation is not enough. You cannot express the intricacies of what variables are scope escapes so that the compiler can make intelligent enough decisions. What this will result in is slightly less unnecessary closures, but not enough to make a difference. Or else you won't be able to declare things the way you want, so you will be forced to declare something that *could* result in an escape, but usually doesn't.2. function bodies may be external, i.e. not present 3. virtual functionsYes, so you are now implying a scope escape contract on all derived classes. But not a very expressive one.void fnB( noscope int delegate() dg ) {} How does the compiler handle this? It can't tell by inspecting the type whether the data for S is dynamic... in fact, the same could be said of a "scope" instance of a class. I guess it would have to assume that object variables without a "noscope" label must be scoped? Sean
Oct 28 2008
Sergey Gromov wrote:Sean Kelly wrote:So are you saying that I'd have to rewrite fnA as: void fnA( noscope S* s ) {...} I guess I can see the point, but that's horribly viral. Particularly when classes come into the picture. With this in mind, from a syntax standpoint I'd be leaning towards what D does right now (ie having noscope as the default), but from a performance standpoint this is absolutely not an option--I may as well just switch to something like Ruby. SeanWalter Bright wrote:Part of s escapes, so the compiler should assume that the whole s escapes. If s is scope by default, it should be a compile-time error here.Steven Schveighoffer wrote:There's another weird issue that I'm not sure if anyone has touched on: struct S { int x; int getX() { return x; } } void main() { auto s = new S; fn( s ); } void fnA( S* s ) { fnB( &s.getX ); }"Walter Bright" wroteI think it is conceptually straightforward whether a reference escapes or not, though it is difficult for the compiler to detect it reliably.The reason the scope/noscope needs to be part of the function signature is because: 1. that makes it self-documentingBut the documentation is not enough. You cannot express the intricacies of what variables are scope escapes so that the compiler can make intelligent enough decisions. What this will result in is slightly less unnecessary closures, but not enough to make a difference. Or else you won't be able to declare things the way you want, so you will be forced to declare something that *could* result in an escape, but usually doesn't.2. function bodies may be external, i.e. not present 3. virtual functionsYes, so you are now implying a scope escape contract on all derived classes. But not a very expressive one.
Oct 28 2008
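Sean's "viral" worry can be made concrete with a sketch (again using the thread's hypothetical noscope annotation): once one callee stores a reference, every function along the call path has to repeat the annotation.

```d
// Sketch only: 'noscope' is proposed syntax, not valid D.
struct S { int x; int getX() { return x; } }

// fnB stores the delegate, so its parameter is noscope...
void fnB( noscope int delegate() dg ) { /* saves dg somewhere */ }

// ...which forces fnA to mark s noscope, because &s.getX is a
// delegate into *s...
void fnA( noscope S* s ) { fnB( &s.getX ); }

// ...which in turn forces every caller that merely forwards s:
void fnZ( noscope S* s ) { fnA( s ); }
```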
"Walter Bright" wroteSteven Schveighoffer wrote:This 'feature' is basically useless ;) D has no shared libraries, so I don't think anyone generally keeps their stale object files around and tries to link with them instead of trying to recompile the sources. You are asking for trouble otherwise."Walter Bright" wroteI think it is conceptually straightforward whether a reference escapes or not, though it is difficult for the compiler to detect it reliably.The reason the scope/noscope needs to be part of the function signature is because: 1. that makes it self-documentingBut the documentation is not enough. You cannot express the intricacies of what variables are scope escapes so that the compiler can make intelligent enough decisions. What this will result in is slightly less unnecessary closures, but not enough to make a difference. Or else you won't be able to declare things the way you want, so you will be forced to declare something that *could* result in an escape, but usually doesn't.First off, the mangled names will be different, so it won't link until you recompile. This is critical because the caller's code depends on the scope/noscope characteristic.2. function bodies may be external, i.e. not present 3. virtual functionsYes, so you are now implying a scope escape contract on all derived classes. But not a very expressive one.4. notifies the user if a library function parameter scope-ness changes (you'll get a compile time error)Oh really? I imagined that if the scope-ness changed it just results in a new heap allocation when I call the function.Secondly, passing a scoped reference to a noscope parameter should be a compile time error.OK, so when does a closure happen? I thought the point of this was to specify when a closure was necessary... compiler sees foo(noscope int *x) I try to pass in an address to a local variable. Compiler says, hm... 
I need a closure to convert my scope variable into a noscope.But the compiler's lack of knowledge/proof about the escape intricacies of a function will cause either a) unnecessary closure allocation, or b) impossible specifications. i.e. I want to specify that either a scope or noscope variable can be passed in, and the variable might escape depending on what you pass in for other arguments, how do I do that?The more I think about this, the more I'd rather have D1 behavior and some sort of way to indicate my function should allocate a heap frame (except on easily provable scope escapes).Having the caller specify it is not tenable, because the caller has no control over (and likely no knowledge of) what the callee does. Functions should be regarded as black boxes, where all you can know about them is in the function signature.So then to call X.v, the function must allocate a closure? How does this improve the current situation where closures are allocated by default?The most common case I think which will cause unnecessary allocations, is a very common case. A class setter: class X { private int *v_; int *v(int *newV) {return v_ = newV;} int *v() { return v_;} } Clearly, newV escapes into the class instance,Then it's noscope.Also untenable. We have the same situation today. You will have achieved nothing with this syntax except making people write scope or noscope everywhere to satisfy incomplete compiler rules. -Stevebut how do we know what the scope of the class instance is to know if newV truly escapes its own scope?We take the conservative approach, and regard "might escape" and "don't know if it escapes" as "treat as if it does escape".
Oct 28 2008
Steven Schveighoffer wrote:"Walter Bright" wroteI disagree. The whole idea behind separate compilation and using makefiles is to recompile only what is necessary. Encoding the function specification into its identifier is a tried and true way of detecting mistakes in that.First off, the mangled names will be different, so it won't link until you recompile. This is critical because the caller's code depends on the scope/noscope characteristic.This 'feature' is basically useless ;) D has no shared libraries, so I don't think anyone generally keeps their stale object files around and tries to link with them instead of trying to recompile the sources. You are asking for trouble otherwise.Either the compiler issues an error, or it allocates the scoped variable on the heap. I prefer the former behavior.Secondly, passing a scoped reference to a noscope parameter should be a compile time error.OK, so when does a closure happen? I thought the point of this was to specify when a closure was necessary... compiler sees foo(noscope int *x) I try to pass in an address to a local variable. Compiler says, hm... I need a closure to convert my scope variable into a noscope.But the compiler's lack of knowledge/proof about the escape intricacies of a function will cause either a) unnecessary closure allocation, or b) impossible specifications. i.e. I want to specify that either a scope or noscope variable can be passed in, and the variable might escape depending on what you pass in for other arguments, how do I do that?You make it noscope. Remember that scope is an optimization.If it's escaping, you MUST allocate it in a way that doesn't disappear when the escape happens.So then to call X.v, the function must allocate a closure? How does this improve the current situation where closures are allocated by default?The most common case I think which will cause unnecessary allocations, is a very common case. 
A class setter: class X { private int *v_; int *v(int *newV) {return v_ = newV;} int *v() { return v_;} } Clearly, newV escapes into the class instance,Then it's noscope.The improvement with the 'scope' keyword is it allows the compiler to assume that the reference does not escape.We take the conservative approach, and regard "might escape" and "don't know if it escapes" as "treat as if it does escape".Also untenable. We have the same situation today. You will have achieved nothing with this syntax except making people write scope or noscope everywhere to satisfy incomplete compiler rules.
Oct 28 2008
"Walter Bright" wroteSteven Schveighoffer wrote:Any decent build tool (including make, assuming dependencies are created) will rebuild the source when it sees the dependency changed. In this case, if the new signature can be used, it will recompile silently. That was my point. However, I'm no longer sure what you are planning, because you have sufficiently confused me ;) So if the recompile causes a compile failure, then it would fail. But that is unrelated to the requirement that you have to recompile to get it to link. Even if the function signatures are the same, the build tool is going to recompile the file instead of linking the stale object."Walter Bright" wroteI disagree. The whole idea behind separate compilation and using makefiles is to recompile only what is necessary. Encoding the function specification into its identifier is a tried and true way of detecting mistakes in that.First off, the mangled names will be different, so it won't link until you recompile. This is critical because the caller's code depends on the scope/noscope characteristic.This 'feature' is basically useless ;) D has no shared libraries, so I don't think anyone generally keeps their stale object files around and tries to link with them instead of trying to recompile the sources. You are asking for trouble otherwise.Huh? So no automatic closures? If the compiler can't prove that a closure is or is not necessary, does code now just fail to compile?Either the compiler issues an error, or it allocates the scoped variable on the heap. I prefer the former behavior.Secondly, passing a scoped reference to a noscope parameter should be a compile time error.OK, so when does a closure happen? I thought the point of this was to specify when a closure was necessary... compiler sees foo(noscope int *x) I try to pass in an address to a local variable. Compiler says, hm... 
I need a closure to convert my scope variable into a noscope.The problem is, what if I know it's escaping in some cases, but not in others, but the compiler can't tell either way? (see example below)But the compiler's lack of knowledge/proof about the escape intricacies of a function will cause either a) unnecessary closure allocation, or b) impossible specifications. i.e. I want to specify that either a scope or noscope variable can be passed in, and the variable might escape depending on what you pass in for other arguments, how do I do that?You make it noscope. Remember that scope is an optimization.If it's escaping, you MUST allocate it in a way that doesn't disappear when the escape happens.So then to call X.v, the function must allocate a closure? How does this improve the current situation where closures are allocated by default?The most common case I think which will cause unnecessary allocations, is a very common case. A class setter: class X { private int *v_; int *v(int *newV) {return v_ = newV;} int *v() { return v_;} } Clearly, newV escapes into the class instance,Then it's noscope.And is that property enforced while compiling the function, or does the compiler assume the author knows best? Like I said, I'm sufficiently confused... How do I markup class X so that at least foo and foo2 compile without issues?

   class X
   {
       int *p;
       this(int *p_) { p = p_; }
   }

   // I expect this to compile and work.
   void foo()
   {
       int i;
       auto x = new X(&i);
   }

   // I expect this to compile and work.
   X foo2()
   {
       int[] arr = new int[1];
       return new X(&arr[0]);
   }

   // What happens here, a closure or a failure?
   X foo3()
   {
       int i;
       auto x = new X(&i);
       return x;
   }

If you have some syntax such that all 3 compile (i.e. foo3 creates a closure), then how does the compiler know foo3 is ok? 
-SteveThe improvement with the 'scope' keyword is it allows the compiler to assume that the reference does not escape.We take the conservative approach, and regard "might escape" and "don't know if it escapes" as "treat as if it does escape".Also untenable. We have the same situation today. You will have achieved nothing with this syntax except making people write scope or noscope everywhere to satisfy incomplete compiler rules.
Oct 28 2008
Pure functions almost implicitly imply that their parameters are all scoped. The exception is the return value of the pure function. If the return value can contain any references that came from the parameters, then those parameters are not scoped.
Oct 27 2008
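Walter's observation can be illustrated with a sketch (it glosses over D2's exact purity rules for parameter types):

```d
// A pure function cannot write to global state, so a parameter can
// only escape through the return value (or through another parameter).
pure int deref(int* p) { return *p; }  // p cannot escape: effectively scope

pure int* pick(int* a, int* b, bool f) // a and b can escape via the
{                                      // return value, so they are
    return f ? a : b;                  // not scope
}
```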
On 2008-10-27 17:33:36 -0400, Walter Bright <newshound1 digitalmars.com> said:Pure functions almost implicitly imply that its parameters are all scoped. The exception is the return value of the pure function. If the return value can contain any references that came from the parameters, then those parameters are not scoped.Not if you define "scope" in the function prototype as not escaping the caller's scope. That would mean that you can recieve a "caller scope" pointer on input and return it back to the caller when the function ends. It never escapes the caller's scope, so all is fine. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 27 2008
Walter Bright wrote:Pure functions almost implicitly imply that its parameters are all scoped. The exception is the return value of the pure function. If the return value can contain any references that came from the parameters, then those parameters are not scoped.I think even the return value can be considered scoped. Essentially it does not leave the scope of the caller. Andrei
Oct 27 2008
Andrei Alexandrescu Wrote:Walter Bright wrote:As far as I know, there's no way for functions to specially prepare objects to be called scope. Isn't that the called's choice?Pure functions almost implicitly imply that its parameters are all scoped. The exception is the return value of the pure function. If the return value can contain any references that came from the parameters, then those parameters are not scoped.I think even the return value can be considered scoped. Essentially it does not leave the scope of the caller. Andrei
Oct 28 2008
Walter Bright Wrote:

The delegate closure issue is part of a wider issue - escape analysis. A reference is said to 'escape' a scope if it, well, leaves that scope. Here's a trivial example:

   int* foo()
   {
       int i;
       return &i;
   }

The reference to i escapes the scope of i, thus courting disaster. Another form of escaping:

   int* p;
   void bar(int* x) { p = x; }

which is, on the surface, legitimate, but fails for:

   void abc(int j)
   {
       bar(&j);
   }

This kind of problem is currently undetectable by the compiler. The first step is, are function parameters considered to be escaping by default or not by default? I.e.:

   void bar(noscope int* p); // p escapes
   void bar(scope int* p);   // p does not escape
   void bar(int* p);         // what should be the default?

What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)

I like the original definition of in as "const scope". I would also like in to be the default for function parameters. Does that make me a heretic OOP programmer? :)
Oct 27 2008
Walter Bright Wrote:

The delegate closure issue is part of a wider issue - escape analysis. A reference is said to 'escape' a scope if it, well, leaves that scope. Here's a trivial example:

   int* foo()
   {
       int i;
       return &i;
   }

The reference to i escapes the scope of i, thus courting disaster. Another form of escaping:

   int* p;
   void bar(int* x) { p = x; }

which is, on the surface, legitimate, but fails for:

   void abc(int j)
   {
       bar(&j);
   }

This kind of problem is currently undetectable by the compiler. The first step is, are function parameters considered to be escaping by default or not by default? I.e.:

   void bar(noscope int* p); // p escapes
   void bar(scope int* p);   // p does not escape
   void bar(int* p);         // what should be the default?

What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)

I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?
Oct 27 2008
Robert Fraser wrote:I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?That argues that "noscope" should be the default. Using "scope" would be an optional optimization. BTW, "unshared" is the default. "shared" would be the keyword.
Oct 27 2008
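Walter's "noscope by default, scope as an optimization" scheme might look like this in practice (a sketch; the parameter annotations are the proposal under discussion, not guaranteed current D behavior):

```d
// Default (noscope): the delegate may be stored somewhere, so the
// compiler must conservatively heap-allocate the enclosing frame.
void save(int delegate() dg);

// Opt-in 'scope': a promise that dg is only used during the call,
// so the caller's frame can stay on the stack.
void each(scope int delegate() dg) { dg(); }

void caller()
{
    int i = 1;
    each(delegate int() { return i; }); // no heap closure needed
    save(delegate int() { return i; }); // i's frame must go on the heap
}
```

Existing code keeps compiling under this default; adding scope only removes allocations, it never changes meaning.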
On Tue, 28 Oct 2008 01:15:24 +0300, Walter Bright <newshound1 digitalmars.com> wrote:Robert Fraser wrote:I hope that 'noscope' is considered to be default *not* because it would introduce one more keyword otherwise... OTOH, noscope *should* be a keyword in either case, due to some casts:

   scope int* sp;
   noscope int* nsp;

   nsp = cast(noscope int*)sp;
   sp = cast(scope int*)nsp;

I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?That argues that "noscope" should be the default. Using "scope" would be an optional optimization. BTW, "unshared" is the default. "shared" would be the keyword.
Oct 27 2008
Walter Bright wrote:Robert Fraser wrote:My point wasn't the number of keywords... ("shared" is actually the first keyword introduced that's conflicted with an identifier I've used). My point was the type system is getting incredibly complex. The theory that static typing is the solution to everything is what lead to the beast known as checked exceptions.I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?That argues that "noscope" should be the default. Using "scope" would be an optional optimization. BTW, "unshared" is the default. "shared" would be the keyword.
Oct 27 2008
Robert Fraser wrote:My point wasn't the number of keywords... ("shared" is actually the first keyword introduced that's conflicted with an identifier I've used). My point was the type system is getting incredibly complex. The theory that static typing is the solution to everything is what lead to the beast known as checked exceptions.The complexity is an issue that concerns me. That's why I suspect that if one doesn't use them, the defaults should work. I wouldn't worry about checked exceptions. *Why* it's a disaster is well understood, and the reason isn't because it is complicated or does static checking.
Oct 27 2008
Robert Fraser wrote:Walter Bright wrote:I don't think you have a case. AndreiRobert Fraser wrote:My point wasn't the number of keywords... ("shared" is actually the first keyword introduced that's conflicted with an identifier I've used). My point was the type system is getting incredibly complex. The theory that static typing is the solution to everything is what lead to the beast known as checked exceptions.I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?That argues that "noscope" should be the default. Using "scope" would be an optional optimization. BTW, "unshared" is the default. "shared" would be the keyword.
Oct 27 2008
On 2008-10-27 18:15:24 -0400, Walter Bright <newshound1 digitalmars.com> said:That argues that "noscope" should be the default. Using "scope" would be an optional optimization.I don't think you have much choice. Take these examples: scope(int*)* a; // noscope pointer to a scope pointer. noscope(int*)* b; // scope pointer to a noscope pointer. Only one of these two makes sense. - - - On the other side, you could make a different syntax for scope than for const and shared, and then the noscope could be the default: int*scope* b; // scope pointer to a noscope pointer. But that looks as attractive as const in C++. - - - Hum, and please find a better name than "noscope". -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 27 2008
Michel Fortin wrote:On 2008-10-27 18:15:24 -0400, Walter Bright <newshound1 digitalmars.com> said:scope is a storage class, not a type constructor.That argues that "noscope" should be the default. Using "scope" would be an optional optimization.I don't think you have much choice. Take these examples: scope(int*)* a; // noscope pointer to a scope pointer. noscope(int*)* b; // scope pointer to a noscope pointer. Only one of these two makes sense.
Oct 27 2008
Walter Bright Wrote:Michel Fortin wrote:How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?On 2008-10-27 18:15:24 -0400, Walter Bright <newshound1 digitalmars.com> said:scope is a storage class, not a type constructor.That argues that "noscope" should be the default. Using "scope" would be an optional optimization.I don't think you have much choice. Take these examples: scope(int*)* a; // noscope pointer to a scope pointer. noscope(int*)* b; // scope pointer to a noscope pointer. Only one of these two makes sense.
Oct 27 2008
Jason House wrote:The scope applies to the bits of the object, not what they may refer to.scope is a storage class, not a type constructor.How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?
Oct 27 2008
On 2008-10-28 00:28:27 -0400, Walter Bright <newshound1 digitalmars.com> said:Jason House wrote:So basically, we always have head-scope. Here's my question:

   int** a;

   void foo()
   {
       scope int b;
       scope int* c = &b;
       scope int** d = &c;
       a = &c; // error, c is scope, can't copy address of scope to non-scope.
       a = d;  // error? d is scope, but we're only making a copy of its bits.
               // It's what d points to that is scope, but do we know about that?
   }

In this case, it's obvious that the last assignment (a = d) is bogus. Is there any plan in having this fail to compile? If so, where does it fail? -- Michel Fortin michel.fortin michelf.com http://michelf.com/Walter Bright wrote:The scope applies to the bits of the object, not what they may refer to.scope is a storage class, not a type constructor.How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?
Oct 28 2008
Michel Fortin Wrote:On 2008-10-28 00:28:27 -0400, Walter Bright <newshound1 digitalmars.com> said:Your assignment to c discards the scope protection. Taking the address of scope variables should be an error.Jason House wrote:So basically, we always have head-scope. Here's my question: int** a; void foo() { scope int b; scope int* c = &b; scope int** d = &c; a = &c; // error, c is scope, can't copy address of scope to non-scope. a = d; // error? d is scope, but we're only making a copy of its bits. // It's what d points to that is scope, but do we know about that? }Walter Bright wrote:The scope applies to the bits of the object, not what they may refer to.scope is a storage class, not a type constructor.How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?In this case, it's obvious that the last assignment (a = d) is bogus. Is there any plan in having this fail to compile? If so, where does it fail? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 28 2008
Walter Bright Wrote:Jason House wrote:This seems rather limiting. I know this is aimed at addressing the dynamic closure problem. This solution would mean that I can't encapsulate delegates. Ideally, I should be able to declare my encapsulating struct as scope or noscope and manage the member delegate accordingly.The scope applies to the bits of the object, not what they may refer to.scope is a storage class, not a type constructor.How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?
Oct 28 2008
Jason House wrote:Walter Bright Wrote:I think it's clear that scope is transitive as much as const or immutable are. Noscope is also transitive. Escape analysis is a tricky business. My opinion is that we either take care of it properly or blissfully ignore the entire issue. That opinion may disagree a bit with Walter's, who'd prefer a quick patch for delegates so he returns to threading. I think if we opt for a quick patch now, it'll turn to gangrene later. Among other things, it will hurt the threading infrastructure it was supposed to give precedence to. AndreiJason House wrote:This seems rather limiting. I know this is aimed at addressing the dynamic closure problem. This solution would mean that I can't encapsulate delegates. Ideally, I should be able to declare my encapsulating struct as scope or noscope and manage the member delegate accordingly.The scope applies to the bits of the object, not what they may refer to.scope is a storage class, not a type constructor.How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?
Oct 28 2008
Andrei Alexandrescu Wrote:Jason House wrote:Transitive scope means that scope can't be a storage class. It's a tricky subject and threading is way more important to me. I'm fine with a quick fix, I just don't want to pretend it's more than that.Walter Bright Wrote:I think it's clear that scope is transitive as much as const or immutable are. Noscope is also transitive. Escape analysis is a tricky business. My opinion is that we either take care of it properly or blissfully ignore the entire issue. That opinion may disagree a bit with Walter's, who'd prefer a quick patch for delegates so he returns to threading. I think if we opt for a quick patch now, it'll turn to gangrene later. Among other things, it will hurt the threading infrastructure it was supposed to give precedence to. AndreiJason House wrote:This seems rather limiting. I know this is aimed at addressing the dynamic closure problem. This solution would mean that I can't encapsulate delegates. Ideally, I should be able to declare my encapsulating struct as scope or noscope and manage the member delegate accordingly.The scope applies to the bits of the object, not what they may refer to.scope is a storage class, not a type constructor.How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?
Oct 28 2008
"Andrei Alexandrescu" wroteJason House wrote:A quick patch is not possible IMO. What I'd prefer is allocate closure when you can prove it, allow specification when you can't. That is, allocate a closure automatically in simple cases like this: int *f() {int x = 5; return &x;} And in cases where you can't prove it, default to not allocating a closure, and allow the developer to specify that a closure is necessary: int *f2(int *y){...} int *f() <insert closure keyword here> {int x = 5; return f2(&x);} Syntax to be debated ;) I do *not* think the problem should be ignored (i.e. continue with the current D2 implementation). -SteveWalter Bright Wrote:I think it's clear that scope is transitive as much as const or immutable are. Noscope is also transitive. Escape analysis is a tricky business. My opinion is that we either take care of it properly or blissfully ignore the entire issue. That opinion may disagree a bit with Walter's, who'd prefer a quick patch for delegates so he returns to threading. I think if we opt for a quick patch now, it'll turn to gangrene later. Among other things, it will hurt the threading infrastructure it was supposed to give precedence to.Jason House wrote:This seems rather limiting. I know this is aimed at addressing the dynamic closure problem. This solution would mean that I can't encapsulate delegates. Ideally, I should be able to declare my encapsulating struct as scope or noscope and manage the member delegate accordingly.The scope applies to the bits of the object, not what they may refer to.scope is a storage class, not a type constructor.How do you treat members of objects passed in? If I pass in a struct with a delegate in it, is it treated as scope too? What if it's an array? A class?
Oct 28 2008
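Steven's two cases can be written out as a sketch (the closure-forcing annotation is hypothetical — its syntax is explicitly left open above, and `f2`'s body is deliberately unseen):

```d
// Case 1: the escape is provable from f alone, so the compiler
// could allocate a closure (heap-allocate x) automatically.
int* f()
{
    int x = 5;
    return &x; // x provably outlives f's stack frame
}

// Case 2: whether &x escapes depends on what f2 does with the
// pointer, which is invisible at the call site under separate
// compilation.
int* f2(int* y);

int* g() /* hypothetical closure annotation would go here */
{
    int x = 5;
    return f2(&x); // escapes only if f2 stores or returns y
}
```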
On Tue, 28 Oct 2008 16:54:15 +0300, Steven Schveighoffer <schveiguy yahoo.com> wrote: [snip] What I'd prefer is allocate closure when you can prove it, allow specification when you can't. That is, allocate a closure automatically in simple cases like this: int *f() {int x = 5; return &x;}Hmm.. This is nice! You can implement 'new' in pure D in just a few lines:

template new(T)
{
    T* new(Args...)(Args args)
    {
        T t = T(args);
        return &t;
    }
}

Example:

struct Foo
{
    public this(int value) { this.value = value; }
    private int value;
}

Foo* foo = new!(Foo)(42);
Oct 28 2008
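As posted, the template can't compile: `new` is a reserved word in D, and `return &t;` hands back the address of a stack temporary — exactly the kind of escape this thread is about, and it only "works" if the compiler adopts Steven's proposed automatic promotion. Under current D, a compilable variant (the name `make` is invented here) has to heap-allocate explicitly:

```d
// Hypothetical replacement for the 'new' template above: 'make' is
// an invented name, and the body uses the built-in new so the object
// survives the call -- returning &t would dangle.
T* make(T, Args...)(Args args)
{
    return new T(args); // for a struct type, new T(...) yields a T*
}

struct Foo
{
    this(int value) { this.value = value; }
    private int value;
}

void main()
{
    Foo* foo = make!(Foo)(42);
    assert(foo.value == 42); // same module, so private is accessible
}
```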
On Tue, Oct 28, 2008 at 10:54 PM, Steven Schveighoffer <schveiguy yahoo.com> wrote:What I'd prefer is allocate closure when you can prove it, allow specification when you can't. That is, allocate a closure automatically in simple cases like this: int *f() {int x = 5; return &x;} And in cases where you can't prove it, default to not allocating a closure, and allow the developer to specify that a closure is necessary:So basically programmers have to memorize all the rules the compiler uses to prove when it's necessary to allocate a closure, and then run those rules in their heads to determine if the current line of code will trigger allocation or not? And when the compiler gets a little smarter, the programmers need to get smarter, too. In lock step. That doesn't sound like a good solution to me. --bb
Oct 28 2008
"Bill Baxter" wrote: On Tue, Oct 28, 2008 at 10:54 PM, Steven Schveighoffer <schveiguy yahoo.com> wrote:First, the compiler does not have any sound rules for this. It currently allocates a closure on a knee-jerk reaction from taking the address of a stack variable. And it's either this or substitute in your statement "prove when it's *not* necessary to allocate a closure", which is about as hard and probably 10x more common. Second, for 90% of functions that don't require you to allocate closures, you don't have to think about any rules. For the 9% of functions which return a pointer to local data, proven by the compiler, you don't have to think about rules. For the last 1% of functions, the documentation should clarify how your data can escape, and then you have to think about how that affects your usage of it. The docs could say 'best to allocate a closure unless you know what you are doing'.What I'd prefer is allocate closure when you can prove it, allow specification when you can't. That is, allocate a closure automatically in simple cases like this: int *f() {int x = 5; return &x;} And in cases where you can't prove it, default to not allocating a closure, and allow the developer to specify that a closure is necessary:So basically programmers have to memorize all the rules the compiler uses to prove when it's necessary to allocate a closure, and then run those rules in their heads to determine if the current line of code will trigger allocation or not?And when the compiler gets a little smarter, the programmers need to get smarter, too. In lock step.Not really.
If the compiler can some day store the scope dependency information in the object file (and get rid of reading source to determine function signature), then this whole manual requirement goes away.That doesn't sound like a good solution to me.Then let's go back to D1's solution -- no closures ;) For example, NONE of tango uses closures (as evidenced by the fact that it's D1), and it uses pointers to stack data very often (to improve performance). So if closure-by-default is the choice, then I'll have to mark all these usages as non-closure, which is going to make the whole code base look awful. With the way Walter is thinking of implementing, it might be impossible to specify correctly. -Steve
Oct 28 2008
On Wed, Oct 29, 2008 at 4:56 AM, Steven Schveighoffer <schveiguy yahoo.com> wrote:"Bill Baxter" wroteI don't see why not. Because the compiler might be allocating a closure when I don't want it to, killing performance. So I'll either be surprised later, or I need to think about it when I'm writing that line of code.On Tue, Oct 28, 2008 at 10:54 PM, Steven Schveighoffer <schveiguy yahoo.com> wrote:First, the compiler does not have any sound rules for this. It currently allocates a closure on a knee jerk reaction from taking the address of a stack variable. And its either this or substitute in your statement "prove when it's *not* necessary to allocate a closure", which is about as hard and probably 10x more common. Second, for 90% of functions that don't require you to allocate closures, you don't have to think about any rules.What I'd prefer is allocate closure when you can prove it, allow specification when you can't. That is, allocate a closure automatically in simple cases like this: int *f() {int x = 5; return &x;} And in cases where you can't prove it, default to not allocating a closure, and allow the developer to specify that a closure is necessary:So basically programmers have to memorize all the rules the compiler uses to prove when it's necessary to allocate a closure, and then run those rules in their heads to determine if the current line of code will trigger allocation or not?For the 9% of functions which return a pointer to local data, proven by the compiler, you don't have to think about rules.Except didn't you just give us some examples where the function does things that escape in the local sense, but can be seen not to escape when examining the full context? So in those 9% of the cases I may also want to think about what the compiler will do to avoid unnecessary hidden allocations in my code. And if I am getting one of these unnecessary allocations, then I will have to think about how to rearrange my code so that the compiler doesn't get tricked. 
But it could be a library function that's causing it.For the last 1% of functions, the documentation should clarify how your data can escape, and then you have to think about how that affects your usage of it. The docs could say 'best to allocate a closure unless you know what you are doing'.Until the compiler can do the right thing 100% of the time, I have to be on the lookout for spurious allocations.And when the compiler gets a little smarter, the programmers need to get smarter, too. In lock step.Not really. If the compiler can some day store the scope dependency information in the object file (and get rid of reading source to determine function signature), then this whole manual requirement goes away.Sure. But if you're going to do that, then at least give us an easy way to explicitly request a closure for those of us who know we need one and when we don't. :-) --bbThat doesn't sound like a good solution to me.Then let's go back to D1's solution -- no closures ;)
Oct 28 2008
"Bill Baxter" wroteOn Wed, Oct 29, 2008 at 4:56 AM, Steven Schveighoffer <schveiguy yahoo.com> wrote:No, I'm proposing the compiler SHOULDN'T allocate closures unless it can prove without a shadow of a doubt that a closure is required. i.e. it defaults to D1 behavior, which should cover 90% of functions today."Bill Baxter" wroteI don't see why not. Because the compiler might be allocating a closure when I don't want it to, killing performance. So I'll either be surprised later, or I need to think about it when I'm writing that line of code.On Tue, Oct 28, 2008 at 10:54 PM, Steven Schveighoffer <schveiguy yahoo.com> wrote:First, the compiler does not have any sound rules for this. It currently allocates a closure on a knee jerk reaction from taking the address of a stack variable. And its either this or substitute in your statement "prove when it's *not* necessary to allocate a closure", which is about as hard and probably 10x more common. Second, for 90% of functions that don't require you to allocate closures, you don't have to think about any rules.What I'd prefer is allocate closure when you can prove it, allow specification when you can't. That is, allocate a closure automatically in simple cases like this: int *f() {int x = 5; return &x;} And in cases where you can't prove it, default to not allocating a closure, and allow the developer to specify that a closure is necessary:So basically programmers have to memorize all the rules the compiler uses to prove when it's necessary to allocate a closure, and then run those rules in their heads to determine if the current line of code will trigger allocation or not?This is what I'm thinking as proven by the compiler: int *f() { int x = 5; return &x; } There is no doubt that this will cause an escape. 
A more common scenario (I just ran into this with a newb on irc):

char[] readData(InputStream s)
{
    char[64] buf;
    auto len = s.read(buf);
    return buf[0..len];
}

For the 9% of functions which return a pointer to local data, proven by the compiler, you don't have to think about rules.Except didn't you just give us some examples where the function does things that escape in the local sense, but can be seen not to escape when examining the full context?So in those 9% of the cases I may also want to think about what the compiler will do to avoid unnecessary hidden allocations in my code. And if I am getting one of these unnecessary allocations, then I will have to think about how to rearrange my code so that the compiler doesn't get tricked. But it could be a library function that's causing it.I'm starting to think that if you compile with warnings on, these 9% of functions shouldn't compile. Perhaps they shouldn't compile by default since it's very easy to do this kind of stuff explicitly without closures.I'm saying no automatic closures unless it's absolutely provable that an escape occurs.For the last 1% of functions, the documentation should clarify how your data can escape, and then you have to think about how that affects your usage of it. The docs could say 'best to allocate a closure unless you know what you are doing'.Until the compiler can do the right thing 100% of the time, I have to be on the lookout for spurious allocations.And when the compiler gets a little smarter, the programmers need to get smarter, too. In lock step.Not really. If the compiler can some day store the scope dependency information in the object file (and get rid of reading source to determine function signature), then this whole manual requirement goes away.Assuming the compiler does not ever allocate closures needlessly, I agree with a way to specify when closures should occur, but not when they should not (since there's no need). -SteveSure.
But if you're going to do that, then at least give us an easy way to explicitly request a closure for those of us who know we need one and when we don't. :-)That doesn't sound like a good solution to me.Then let's go back to D1's solution -- no closures ;)
Oct 28 2008
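The conventional fix for the `readData` example above is to copy the stack buffer to the GC heap before returning, e.g. with `.dup` (the `InputStream` interface here is a minimal stand-in, not the real Tango one):

```d
// Minimal stand-in stream interface, for illustration only.
interface InputStream
{
    size_t read(char[] dst);
}

char[] readData(InputStream s)
{
    char[64] buf;              // stack storage, dies with the call
    auto len = s.read(buf);
    return buf[0 .. len].dup;  // copy to the heap so the returned
                               // slice does not dangle
}
```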
Andrei Alexandrescu wrote:Escape analysis is a tricky business. My opinion is that we either take care of it properly or blissfully ignore the entire issue. That opinion may disagree a bit with Walter's, who'd prefer a quick patch for delegates so he returns to threading. I think if we opt for a quick patch now, it'll turn to gangrene later. Among other things, it will hurt the threading infrastructure it was supposed to give precedence to.Like const, I'd rather have no solution than a bad solution insofar as escape analysis is concerned. Sean
Oct 28 2008
On Wed, Oct 29, 2008 at 1:04 AM, Sean Kelly <sean invisibleduck.org> wrote:Andrei Alexandrescu wrote:The only serious problem people have right now is that closures are allocated automatically when they may not need to be. Making closure allocation manual for now seems like the most future-compatible way to fix things. In some nebulous future, the manual allocation could become unnecessary, or it could become compiler-checked, but it seems to me that for now just making it manual does the least harm and lets Walter get back to work on other things. --bbEscape analysis is a tricky business. My opinion is that we either take care of it properly or blissfully ignore the entire issue. That opinion may disagree a bit with Walter's, who'd prefer a quick patch for delegates so he returns to threading. I think if we opt for a quick patch now, it'll turn to gangrene later. Among other things, it will hurt the threading infrastructure it was supposed to give precedence to.Like const, I'd rather have no solution than a bad solution insofar as escape analysis is concerned.
Oct 28 2008
Bill Baxter wrote:On Wed, Oct 29, 2008 at 1:04 AM, Sean Kelly <sean invisibleduck.org> wrote:This would be the most backwards-compatible way also. The only real argument against it in my mind is that it makes the default behavior the unsafe behavior. I don't think this is a big deal given what I see as the target market for D, but then I don't see a point in SafeD either, for the same reason. The syntax seems like it should be pretty straightforward: use 'new' (Andrei will love that ;-)):

void fn( int delegate() dg );

void main()
{
    int x;
    int getX() { return x; }

    // static closure
    fn( &getX );

    // dynamic closure
    fn( new &getX );
}

That said, the fact that some function calls will always be opaque suggests to me that automatic escape analysis will never be possible in all situations. Therefore, we'll likely need something roughly similar to the proposed keyword eventually. So perhaps it really is worth considering adding some sort of 'noscope' storage class now:

// generates a dynamic closure
noscope int delegate() dg = &getX;

I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use. SeanAndrei Alexandrescu wrote:The only serious problem people have right now is that closures are allocated automatically when they may not need to be. Making closure allocation manual for now seems like the most future-compatible way to fix things.
In some nebulous future, the manual allocation could become unnecessary, or it could become compiler-checked, but it seems to me that for now just making it manual does the least harm and lets Walter get back to work on other things.Escape analysis is a tricky business. My opinion is that we either take care of it properly or blissfully ignore the entire issue. That opinion may disagree a bit with Walter's, who'd prefer a quick patch for delegates so he returns to threading. I think if we opt for a quick patch now, it'll turn to gangrene later. Among other things, it will hurt the threading infrastructure it was supposed to give precedence to.Like const, I'd rather have no solution than a bad solution insofar as escape analysis is concerned.
Oct 28 2008
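Sean's "scoped delegate" case can be sketched concretely: the delegate that foreach builds for opApply never outlives the call, so no heap closure is needed even though it reads locals up the stack (Range is an invented example type):

```d
struct Range
{
    int lo, hi;

    // 'scope' documents that dg is not retained past this call
    int opApply(scope int delegate(ref int) dg)
    {
        for (int i = lo; i < hi; ++i)
            if (auto r = dg(i)) return r;
        return 0;
    }
}

void main()
{
    int sum;
    foreach (i; Range(0, 5))
        sum += i;   // the loop body becomes a delegate reading
                    // 'sum' from main's frame -- safe, because it
                    // cannot escape opApply
    assert(sum == 10);
}
```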
Sean Kelly wrote:I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer. Contrast that with, say, a null pointer bug which results in an unambiguous sudden halt to the program with a clear indication of what happened. The 'scope' storage class also has a future in that it is possible using data flow analysis to statically verify it.
Oct 28 2008
Walter Bright wrote:Sean Kelly wrote:I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.The 'scope' storage class also has a future in that it is possible using data flow analysis to statically verify it.This is the real benefit in my mind. From a "features I want in a systems programming language" standpoint I absolutely do not want default dynamic closures (today at any rate). However, just like 'const' I very much appreciate that this approach allows for static verification. So as much as I hate to say so I think that default dynamic closures would be the best long-term option for D. The cost of DMA will continue to come down anyway, and once a codebase is converted it probably won't be too difficult to maintain going forward. Sean
Oct 28 2008
Sean Kelly Wrote:Walter Bright wrote:As the author of an open source multithreaded application in D1, I've had these errors pop up. It's easy to overlook this stuff and pass things the wrong way (it's easier to code). Tango doesn't even have a bind library to make it easier!In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.
Oct 28 2008
On Tue, Oct 28, 2008 at 6:29 PM, Jason House <jason.james.house gmail.com> wrote:Sean Kelly Wrote:For what it's worth, std.bind I think depends on one Phobos-specific function. It would probably take a matter of a minute or two to port it to work with Tango.Walter Bright wrote:As the author of an open source multithreaded application in D1, I've had these errors pop up. It's easy to overlook this stuff and pass things the wrong way (it's easier to code). Tango doesn't even have a bind library to make it easier!In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.
Oct 28 2008
Jarrett Billingsley Wrote:On Tue, Oct 28, 2008 at 6:29 PM, Jason House <jason.james.house gmail.com> wrote:I ported a bind implementation and maintain it in my code base. I didn't mention that because I still maintain hope that Tango will add it. The last time I asked the Tango folks why they didn't have it, the answer was something to the effect of "we don't recognize a need for it"Sean Kelly Wrote:For what it's worth, std.bind I think depends on one Phobos-specific function. It would probably take a matter of a minute or two to port it to work with Tango.Walter Bright wrote:As the author of an open source multithreaded application in D1, I've had these errors pop up. It's easy to overlook this stuff and pass things the wrong way (it's easier to code). Tango doesn't even have a bind library to make it easier!In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.
Oct 28 2008
Jason House wrote:Sean Kelly Wrote:I agree. Particularly in higher-order code this kind of problem is bound to show itself. And it's a really really nasty case of reality ripping straight through a carefully-conceived abstraction - something like a bullet carving through a precision microprocessor. AndreiWalter Bright wrote:As the author of an open source multithreaded application in D1, I've had these errors pop up. It's easy to overlook this stuff and pass things the wrong way (it's easier to code). Tango doesn't even have a bind library to make it easier!In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.
Oct 28 2008
On Wed, Oct 29, 2008 at 7:23 AM, Sean Kelly <sean invisibleduck.org> wrote:Walter Bright wrote:I've had bugs caused by this but they were pretty easy to find. Some delegate I'm calling crashes and all the variables are nonsensical garbage... Hmm maybe I was using out-of-scope variables in that delegate that I wasn't supposed to? Maybe there are real cases where the bugs caused are harder to find. But I'll just add my 2c to Sean's. I haven't had many such bugs, and when I've had them they've been pretty easy to find. --bbSean Kelly wrote:I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.
Oct 28 2008
Bill Baxter wrote:On Wed, Oct 29, 2008 at 7:23 AM, Sean Kelly <sean invisibleduck.org> wrote:I don't think we can afford program correctness to rest on anecdote and "it works for me". That age is long gone. AndreiWalter Bright wrote:I've had bugs caused by this but they were pretty easy to find. Some delegate I'm calling crashes and all the variables are nonsensical garbage... Hmm maybe I was using out-of-scope variables in that delegate that I wasn't supposed to? Maybe there are real cases where the bugs caused are harder to find. But I'll just add my 2c to Sean's. I haven't had many such bugs, and when I've had them they've been pretty easy to find.Sean Kelly wrote:I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.
Oct 28 2008
Andrei Alexandrescu wrote:I don't think we can afford program correctness to rest on anecdote and "it works for me". That age is long gone.I agree. When you're managing a program with a million lines of code in it, there is great value in being able to *prove* that it does not suffer from as many kinds of bugs as practical, especially memory corruption bugs. Think of buffer overflow bugs, for example. Think of all the grief that would have been saved if the C/C++ compiler could prove that buffer overflows could not happen.
Oct 28 2008
On Wed, Oct 29, 2008 at 11:40 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bill Baxter wrote:I haven't seen any real data about how serious a problem this is from you either. Chasing bogeymen is at least as bad as ignoring real problems. --bbOn Wed, Oct 29, 2008 at 7:23 AM, Sean Kelly <sean invisibleduck.org> wrote:I don't think we can afford program correctness to rest on anecdote and "it works for me". That age is long gone.Walter Bright wrote:I've had bugs caused by this but they were pretty easy to find. Some delegate I'm calling crashes and all the variables are nonsensical garbage... Hmm maybe I was using out-of-scope variables in that delegate that I wasn't supposed to? Maybe there are real cases where the bugs caused are harder to find. But I'll just add my 2c to Sean's. I haven't had many such bugs, and when I've had them they've been pretty easy to find.Sean Kelly wrote:I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.
Oct 28 2008
Bill Baxter wrote:On Wed, Oct 29, 2008 at 11:40 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Well to provide real data I'd have to spend time on user studies, which would be time-intensive. I also think it's not an interesting research problem because it is generally accepted in the community that memory un-safety is a source of problems. So I don't quite feel burdened with the need to provide a proof. Reframing the problem as chasing a bogeyman won't help with addressing it. AndreiBill Baxter wrote:I haven't seen any real data about how serious a problem this is from you either. Chasing bogeymen is at least as bad as ignoring real problems.On Wed, Oct 29, 2008 at 7:23 AM, Sean Kelly <sean invisibleduck.org> wrote:I don't think we can afford program correctness to rest on anecdote and "it works for me". That age is long gone.Walter Bright wrote:I've had bugs caused by this but they were pretty easy to find. Some delegate I'm calling crashes and all the variables are nonsensical garbage... Hmm maybe I was using out-of-scope variables in that delegate that I wasn't supposed to? Maybe there are real cases where the bugs caused are harder to find. But I'll just add my 2c to Sean's. I haven't had many such bugs, and when I've had them they've been pretty easy to find.Sean Kelly wrote:I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. 
Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.
Oct 28 2008
Andrei Alexandrescu wrote:Bill Baxter wrote:I just wanted to issue an apology to Bill for the above, which is brusque and demeaning. He was delicate enough to email me privately what he thought about my response, and in very levelheaded terms. After having answered privately as well, I thought I'd post a public apology; it would be quite unethical to apologize in private for a public remark! Hopefully this helps with undoing the damage and with keeping the recent streak of good discussions going. AndreiOn Wed, Oct 29, 2008 at 11:40 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Well to provide real data I'd have to spend time on user studies, which would be time-intensive. I also think it's not an interesting research problem because it is generally accepted in the community that memory un-safety is a source of problems. So I don't quite feel burdened with the need to provide a proof. Reframing the problem as chasing a bogeyman won't help with addressing it. AndreiBill Baxter wrote:I haven't seen any real data about how serious a problem this is from you either. Chasing bogeymen is at least as bad as ignoring real problems.On Wed, Oct 29, 2008 at 7:23 AM, Sean Kelly <sean invisibleduck.org> wrote:I don't think we can afford program correctness to rest on anecdote and "it works for me". That age is long gone.Walter Bright wrote:I've had bugs caused by this but they were pretty easy to find. Some delegate I'm calling crashes and all the variables are nonsensical garbage... Hmm maybe I was using out-of-scope variables in that delegate that I wasn't supposed to? Maybe there are real cases where the bugs caused are harder to find. But I'll just add my 2c to Sean's. I haven't had many such bugs, and when I've had them they've been pretty easy to find.Sean Kelly wrote:I think the cost/benefit of this could probably be argued either way. 
I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.
Oct 29 2008
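The silent corruption Walter and Sean are debating above can be sketched in a few lines of D (hypothetical example and names; the failure described is D1's behavior, where no heap closure is allocated):

```d
import std.stdio;

// Returns a delegate that refers to counter's stack frame.
int delegate() counter()
{
    int n = 0;
    return { return ++n; }; // the delegate escapes counter's scope
}

void main()
{
    auto dg = counter();
    // In D1 the frame holding n is gone by now, so dg() reads and
    // writes whatever happens to occupy that stack memory -- the
    // silent, erratic corruption described above. D2 instead heap-
    // allocates the frame, which is safe but costs an allocation:
    // exactly the trade-off this thread is arguing about.
    writeln(dg());
}
```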
On Thu, Oct 30, 2008 at 12:21 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Andrei Alexandrescu wrote:No problem. My comment leading to that response was a bit snarky too. Though I tried really hard not to make it snarky. It still is basically saying "I know you are but what am I?" Back to the technical topic, as I told Andrei, all I want is some solution that doesn't kill performance with lots of hidden memory allocations. I doubt that's something anyone really wants, so all this huffing and puffing about it probably isn't necessary. --bbBill Baxter wrote:I just wanted to issue an apology to Bill for the above, which is brusque and demeaning. He was delicate enough to email me privately what he thought about my response, and in very levelheaded terms. After having answered privately as well, I thought I'd post a public apology; it would be quite unethical to apologize in private for a public remark! Hopefully this helps with undoing the damage and with keeping the recent streak of good discussions going.On Wed, Oct 29, 2008 at 11:40 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Well to provide real data I'd have to spend time on user studies, which would be time-intensive. I also think it's not an interesting research problem because it is generally accepted in the community that memory un-safety is a source of problems. So I don't quite feel burdened with the need to provide a proof. Reframing the problem as chasing a bogeyman won't help with addressing it. AndreiBill Baxter wrote:I haven't seen any real data about how serious a problem this is from you either. Chasing bogeymen is at least as bad as ignoring real problems.On Wed, Oct 29, 2008 at 7:23 AM, Sean Kelly <sean invisibleduck.org> wrote:I don't think we can afford program correctness to rest on anecdote and "it works for me". That age is long gone.Walter Bright wrote:I've had bugs caused by this but they were pretty easy to find. 
Some delegate I'm calling crashes and all the variables are nonsensical garbage... Hmm maybe I was using out-of-scope variables in that delegate that I wasn't supposed to? Maybe there are real cases where the bugs caused are harder to find. But I'll just add my 2c to Sean's. I haven't had many such bugs, and when I've had them they've been pretty easy to find.Sean Kelly wrote:I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I do think, however, that 'scope' should be the default behavior, for two reasons. It's backwards-compatible, which is handy. But more importantly, I'd say that probably 95% of the current uses of delegates are scoped, and that isn't likely to shift all the way to 50% even if D moved to a much more functional style of programming. Algorithms, for example, all use scoped delegates, which I'd say is far and away their most common current use.The counter to that is that when there is an inadvertent escape of a reference, the error is often undetectable even while it silently corrupts data and behaves erratically. In other words (as Andrei pointed out to me) the cost of those errors, even though rare, is very high. This makes it highly desirable to prevent them automatically, rather than relying on the skill and attention to detail of the programmer.
Oct 29 2008
"Bill Baxter" wroteBack to the technical topic, as I told Andrei, all I want is some solution that doesn't kill performance with lots of hidden memory allocations. I doubt that's something anyone really wants, so all this huffing and puffing about it probably isn't necessary.I doubt anyone wants that. But here is my main concern (my defense for huffing): One of my main goals for D at the moment is to have Tango compile on D2. Right now, I'm slowly getting everything constified, and dealing with small design changes to make that happen (and filing bugs that I find). However, when dissecting solutions to unnecessary dynamic closures, I want to make sure that the solution does not force Tango to change its overall design. Right now, with Walter's proposal, I fear a large amount of scope decorations would be necessary (making the api very unattractive), and possibly some of the ways Tango uses stack variables might be made uncompilable. I would like to avoid that. It has happened in the past that things considered closed on D2 did not work with Tango because the main code used to test D2 (Phobos) does not have a similar design, and does not use the same features as Tango does. When I think a solution solves the problem, and will allow Tango to compile, I'll stop my whining ;) -Steve
Oct 30 2008
Steven Schveighoffer Wrote:Right now, with Walter's proposal, I fear a large amount of scope decorations would be necessary (making the api very unattractive)Moreover, those semantics will in some cases force allocation when it's not needed.
Oct 30 2008
Sean Kelly wrote:I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I have. Not often in my own code because I am very careful to avoid it, but it frequently happens in 'bug' reports I get sent. This trap does happen to programmers who are less familiar with how the underlying stack machine actually works. The real problem is there is no way to verify that this isn't happening in some arbitrarily large code base. I strongly believe that it is good for D and for programming languages in general to work towards a design that can provably eliminate certain types of bugs.
Oct 28 2008
Walter Bright wrote:Sean Kelly wrote:I tend to ask a question along these lines to entry-level interviewees and it's surprising how often they get it wrong. So I agree that this is a fair point. I mostly brought up this argument because C++ is unapologetically designed for experts and I'm occasionally inclined to view D the same way... even though its goal is really somewhat different.I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I have. Not often in my own code because I am very careful to avoid it, but it frequently happens in 'bug' reports I get sent. This trap does happen to programmers who are less familiar with how the underlying stack machine actually works.The real problem is there is no way to verify that this isn't happening in some arbitrarily large code base. I strongly believe that it is good for D and for programming languages in general to work towards a design that can provably eliminate certain types of bugs.I agree, which is why I'm actually in favor of this despite what I said above. Sean
Oct 28 2008
Sean Kelly wrote:I tend to ask a question along these lines to entry-level interviewees and it's surprising how often they get it wrong. So I agree that this is a fair point. I mostly brought up this argument because C++ is unapologetically designed for experts and I'm occasionally inclined to view D the same way... even though its goal is really somewhat different.To me that is akin to building a car with no brakes and justifying it by saying it is "designed for experts." Sure, an expert who never makes any mistakes could effectively drive such a car. The trouble is, the road is full of non-expert drivers the expert ones are forced to interact with, and even the experts still make mistakes now and then. I don't believe that having brakes impairs the performance of my car one bit. I also would not say that C++ was deliberately designed without brakes, it just kinda worked out that way. We have the benefit of hindsight in designing D.
Oct 28 2008
"Walter Bright" wroteSean Kelly wrote:I agree with this. It would be nice to be able to flag these kinds of things. Even if it was a warning and not a true error. Just not a solution which silently allocates data that shouldn't be allocated. This would be a great candidate for a lint tool. -SteveI think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I have. Not often in my own code because I am very careful to avoid it, but it frequently happens in 'bug' reports I get sent. This trap does happen to programmers who are less familiar with how the underlying stack machine actually works. The real problem is there is no way to verify that this isn't happening in some arbitrarily large code base. I strongly believe that it is good for D and for programming languages in general to work towards a design that can provably eliminate certain types of bugs.
Oct 28 2008
On Wed, Oct 29, 2008 at 1:04 PM, Steven Schveighoffer <schveiguy yahoo.com> wrote:"Walter Bright" wroteOk, I think we're completely on the same page here. I'm for the compiler finding bugs. But I'm not for the compiler being conservative and allocating memory when it doesn't have to, as it does currently. --bbSean Kelly wrote:I agree with this. It would be nice to be able to flag these kinds of things. Even if it was a warning and not a true error. Just not a solution which silently allocates data that shouldn't be allocated.I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I have. Not often in my own code because I am very careful to avoid it, but it frequently happens in 'bug' reports I get sent. This trap does happen to programmers who are less familiar with how the underlying stack machine actually works. The real problem is there is no way to verify that this isn't happening in some arbitrarily large code base. I strongly believe that it is good for D and for programming languages in general to work towards a design that can provably eliminate certain types of bugs.
Oct 28 2008
Bill Baxter wrote:On Wed, Oct 29, 2008 at 1:04 PM, Steven Schveighoffer <schveiguy yahoo.com> wrote:How about adding a warning switch (I know Walter you're against them but it might be justified here) that would flag all the closure allocations. I know that should be the job of a "lint" tool, but the compiler already has the context here."Walter Bright" wroteOk, I think we're completely on the same page here. I'm for the compiler finding bugs. But I'm not for the compiler being conservative and allocating memory when it doesn't have to, as it does currently. --bbSean Kelly wrote:I agree with this. It would be nice to be able to flag these kinds of things. Even if it was a warning and not a true error. Just not a solution which silently allocates data that shouldn't be allocated.I think the cost/benefit of this could probably be argued either way. I've never encountered a bug related to this, for example, so to me the benefit is entirely theoretical while the cost is immediate.I have. Not often in my own code because I am very careful to avoid it, but it frequently happens in 'bug' reports I get sent. This trap does happen to programmers who are less familiar with how the underlying stack machine actually works. The real problem is there is no way to verify that this isn't happening in some arbitrarily large code base. I strongly believe that it is good for D and for programming languages in general to work towards a design that can provably eliminate certain types of bugs.
Oct 29 2008
On Mon, 27 Oct 2008 23:05:48 -0400, Walter Bright <newshound1 digitalmars.com> wrote:scope is a storage class, not a type constructor.Okay, I'm confused. I had assumed that the escape scope was different from the storage scope, as the storage scope has a few known problems with regard to escape analysis as currently defined, e.g.

class Node { Node next; }

void append(scope Node a)
{
    scope b = new Node();
    a.next = b; // b just escaped
}

scope const also has similar issues. So is the plan for the compiler to do a static escape analysis based on the function signature? Alternatively, a deep type which prevents assignment except at declaration guarantees (I think) no escape.
Oct 27 2008
Robert Fraser wrote:Walter Bright Wrote:This is a misunderstanding. Scope is a storage class, not a type modifier, so it's not as pervasive as you may think. AndreiThe delegate closure issue is part of a wider issue - escape analysis. A reference is said to 'escape' a scope if it, well, leaves that scope. Here's a trivial example: int* foo() { int i; return &i; } The reference to i escapes the scope of i, thus courting disaster. Another form of escaping: int* p; void bar(int* x) { p = x; } which is, on the surface, legitimate, but fails for: void abc(int j) { bar(&j); } This kind of problem is currently undetectable by the compiler. The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?
Oct 27 2008
Robert Fraser wrote:Walter Bright Wrote:I agree I think that D will be used only by people like you that understand all this shared/scope/mutable/lazy things. I thought C++ was complex and difficult to learn but I think I was wrong.The delegate closure issue is part of a wider issue - escape analysis. A reference is said to 'escape' a scope if it, well, leaves that scope. Here's a trivial example: int* foo() { int i; return &i; } The reference to i escapes the scope of i, thus courting disaster. Another form of escaping: int* p; void bar(int* x) { p = x; } which is, on the surface, legitimate, but fails for: void abc(int j) { bar(&j); } This kind of problem is currently undetectable by the compiler. The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?
Oct 28 2008
Mosfet wrote:Robert Fraser wrote:Well I think you were right. The question is how much you spend learning things that are actually useful, versus learning gratuitous complexity. I think D is much more rewarding per unit of effort invested than C++. AndreiWalter Bright Wrote:I agree I think that D will be used only by people like you that understand all this shared/scope/mutable/lazy things. I thought C++ was complex and difficult to learn but I think I was wrong.The delegate closure issue is part of a wider issue - escape analysis. A reference is said to 'escape' a scope if it, well, leaves that scope. Here's a trivial example: int* foo() { int i; return &i; } The reference to i escapes the scope of i, thus courting disaster. Another form of escaping: int* p; void bar(int* x) { p = x; } which is, on the surface, legitimate, but fails for: void abc(int j) { bar(&j); } This kind of problem is currently undetectable by the compiler. The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)I get the feeling that D's type system is going to become the joke of the programming world. Are we really going to have to worry about a scope unshared(invariant(int)*) ...? What other type modifiers can we put on that?
Oct 28 2008
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'sWell I think you were right. The question is how much you spend learning things that are actually useful, versus learning gratuitous complexity. I think D is much more rewarding per unit of effort invested than C++. AndreiSeconded. Both C++ and D are very complex languages, but I don't see that as a problem. As Bjarne <insert correct spelling of his last name here> would say, "Complexity has to go somewhere." If you oversimplify the core language, you end up acting as a human compiler to make your code fit within the confines of the simple language. See Java and C. The real problem with C++ is not complexity per se, but cruft, the fact that it's a low-level language masquerading as a high-level language, and the complete ignorance of convenience as a design goal. This can be exemplified just by examining how arrays "work" in C++. First, you have the cruft of C arrays that are very low-level and really aren't good for much, except making things more confusing. To get around this without doing anything to the core language, C++ adds vector to the STL. This is fine, except that you have no vector literals, no slice syntax, horrible error messages, inefficient copying semantics by default, vectors can't be used in metaprogramming, etc. It works, but it's not very convenient. Furthermore, the reason you have no nice slice syntax or default reference semantics is because you have no garbage collection.
Oct 28 2008
Walter Bright Wrote:The delegate closure issue is part of a wider issue - escape analysis. A reference is said to 'escape' a scope if it, well, leaves that scope. Here's a trivial example: int* foo() { int i; return &i; } The reference to i escapes the scope of i, thus courting disaster. Another form of escaping: int* p; void bar(int* x) { p = x; } which is, on the surface, legitimate, but fails for: void abc(int j) { bar(&j); } This kind of problem is currently undetectable by the compiler. The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)In D1, local variables implicitly follow a mixed rule:

Objects are noscope
Primitive types are scope
Structs are headscope

Headscope may be a bit of a misnomer because member scope is type dependent. When it comes to escaping, do we need transitive scope? I currently can't imagine that without allowing some exceptions. That insane path seems to lead to 3 scopes for member variables...
Oct 27 2008
Closure allocation happens at delegate creation, not when the delegate is passed somewhere else, doesn't it? Allocation is the caller's task, so the caller is responsible and should be able to control it. Documentation is generally not needed; it's usually quite obvious what's going on. The default should be to allocate, and the programmer should be able to track allocations with the compiler's help if he wants.
Oct 28 2008
Walter Bright wrote:The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope.I'm for safe defaults. Programs shouldn't crash for no reason. Here are my thoughts on escape analysis. Sorry if they're obvious. I think it is possible to detect whether a reference escapes or not in the absence of function calls by analyzing an expression graph. Assigning to a global state variable is an ultimate escape. In the worst case, when only the current function can be analyzed and no meta-info is available about other functions, the compiler must assume a reference escapes if it is passed as an argument to another function. This is the current D2 behavior. Pure functions provide some meta-info because any reference passed as an argument can only escape via a reference return value or other mutable reference arguments. This makes escape analysis possible even after an unknown pure function is called. For any function in a tree of imported modules the compiler could keep some meta-data about which argument escapes where, if at all. This way even regular functions can participate in escape analysis without blowing it up. An argument to a virtual function call always escapes by default. It may be possible to declare an argument as non-escaping (scope?) and compiler should then enforce non-escaping contract upon any overriding functions. An argument to a function declared as a prototype always escapes by default. It may be possible for the compiler to export some meta-info along with the prototype when a .di file is generated, whether an argument is guaranteed to not escape, or maybe even detailed info about which argument escapes where, to mimic the compile-time meta-info. 
The expression graph analysis should be the first step towards safe stack closures.
Oct 28 2008
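Sergey's point about pure functions can be illustrated with a hypothetical sketch (it assumes, as his post does, that a pure function may take mutable references; the names are made up):

```d
// Pure: cannot read or write global state, so the only ways p or q
// can escape are through the return value or another mutable
// reference parameter. That makes the caller's escape analysis
// tractable even without seeing the function body.
int* pick(int* p, int* q) pure
{
    return (*p > *q) ? p : q;
}

void caller()
{
    int a = 1, b = 2;
    int* r = pick(&a, &b);
    // The compiler only needs to note that r may alias &a or &b,
    // so r must not outlive caller's stack frame.
}
```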
"Sergey Gromov" wroteWalter Bright wrote:If safe defaults means 75% performance decrease, I'm for using unsafe defaults that are safe 99% of the time, with the ability to make them 100% safe if needed.The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope.I'm for safe defaults. Programs shouldn't crash for no reason.Here are my thoughts on escape analysis. Sorry if they're obvious. I think it is possible to detect whether a reference escapes or not in the absence of function calls by analyzing an expression graph.Yes, but not in D, since import uses uncompiled files as input.Assigning to a global state variable is an ultimate escape.Agree there.In the worst case, when only the current function can be analyzed and no meta-info is available about other functions, the compiler must assume a reference escapes if it is passed as an argument to another function. This is the current D2 behavior.This leads to the current situation, where you have a huge performance decrease for little or no gain in reliability.Pure functions provide some meta-info because any reference passed as an argument can only escape via a reference return value or other mutable reference arguments. This makes escape analysis possible even after an unknown pure function is called.Good point. Easy analysis on pure functions.For any function in a tree of imported modules the compiler could keep some meta-data about which argument escapes where, if at all. This way even regular functions can participate in escape analysis without blowing it up.Where is the data kept? It must be in the object file, and d imports must then read the object file for api instead of the source file. 
I don't think it's worth anything to break the single file for imports/code model. Requiring a .di file is a little iffy as it is today.An argument to a virtual function call always escapes by default. It may be possible to declare an argument as non-escaping (scope?) and compiler should then enforce non-escaping contract upon any overriding functions.This is tricky, because most class member functions are virtual, so you are forced to litter all your functions with escaping/non-escaping syntax. To be accurate you need to define the escape graph in the signature, which will be a PITA. What would be worse is to not have a way to express the complete graph. Another solution is that a derived function must have the same expression graph or a tighter one than the base class'. But without being able to store the graph with the compiled code (and having the compiler import the metadata instead of the source file), this is a moot point.An argument to a function declared as a prototype always escapes by default. It may be possible for the compiler to export some meta-info along with the prototype when a .di file is generated, whether an argument is guaranteed to not escape, or maybe even detailed info about which argument escapes where, to mimic the compile-time meta-info.No, the di file might not be auto-generated. You also now back to a separate import and source file, like C has. I think in order for this to work, the graph and object code must be stored in the same file that is imported.The expression graph analysis should be the first step towards safe stack closures.I would agree with this. But I don't think it's happening in the near future. And I hope it's not done through .di files. In the meantime, to make D2 a systems language again, it should drop conservative closures. -Steve
Oct 28 2008
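The override rule being discussed -- a derived method may keep the base method's escape contract or tighten it, never loosen it -- could look like this (hypothetical sketch; it assumes a compiler that actually enforces scope on parameters, which D2 did not at the time):

```d
class Writer
{
    // Contract: buf must not escape the call.
    void write(scope ubyte[] buf) { /* consumes buf, never stores it */ }
}

class Keeper : Writer
{
    ubyte[] saved;

    // Would have to be rejected: storing buf loosens the base
    // class's contract, so any caller passing a stack buffer
    // through a Writer reference would be silently broken.
    override void write(scope ubyte[] buf) { saved = buf; }
}
```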
Steven Schveighoffer wrote:"Sergey Gromov" wrotePlease note the "in the absence of function calls" part. I'm talking about code which is doing pure calculus, without calling anything external. It's pretty useless by itself, but it's the basics. Unfortunately I don't know how import is implemented. It should do some parsing though, to be able to inline functions from other modules, and to expand templates.Walter Bright wrote:If safe defaults means 75% performance decrease, I'm for using unsafe defaults that are safe 99% of the time, with the ability to make them 100% safe if needed.The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope.I'm for safe defaults. Programs shouldn't crash for no reason.Here are my thoughts on escape analysis. Sorry if they're obvious. I think it is possible to detect whether a reference escapes or not in the absence of function calls by analyzing an expression graph.Yes, but not in D, since import uses uncompiled files as input.Here I'm talking about disposable compile-time data, module-local if you wish. This means that local optimization is better than inter-module optimization. Nothing new here I suppose. Of course it would be nice if this data is exported somehow and used when compiling other modules. But it'd make the compilation process asymmetric, when meta-data is available for already compiled modules and not available for others.Assigning to a global state variable is an ultimate escape.Agree there.In the worst case, when only the current function can be analyzed and no meta-info is available about other functions, the compiler must assume a reference escapes if it is passed as an argument to another function. 
This is the current D2 behavior.This leads to the current situation, where you have a huge performance decrease for little or no gain in reliability.Pure functions provide some meta-info because any reference passed as an argument can only escape via a reference return value or other mutable reference arguments. This makes escape analysis possible even after an unknown pure function is called.Good point. Easy analysis on pure functions.For any function in a tree of imported modules the compiler could keep some meta-data about which argument escapes where, if at all. This way even regular functions can participate in escape analysis without blowing it up.Where is the data kept? It must be in the object file, and d imports must then read the object file for api instead of the source file. I don't think it's worth anything to break the single file for imports/code model. Requiring a .di file is a little iffy as it is today.Not every call to a virtual function is itself virtual, and not every virtual function cares whether its argument escapes. I'd say more: the noscope should be default for all reference types except delegates because you usually don't care. I agree that having scope delegates the default is probably the right thing to do, but only if a compiler can detect violations of this contract.An argument to a virtual function call always escapes by default. It may be possible to declare an argument as non-escaping (scope?) and compiler should then enforce non-escaping contract upon any overriding functions.This is tricky, because most class member functions are virtual, so you are forced to litter all your functions with escaping/non-escaping syntax. To be accurate you need to define the escape graph in the signature, which will be a PITA. What would be worse is to not have a way to express the complete graph.Another solution is that a derived function must have the same expression graph or a tighter one than the base class'. 
But without being able to store the graph with the compiled code (and having the compiler import the metadata instead of the source file), this is a moot point.There are separate import files. Actually compiler can simply put scope/noscope for the arguments based upon the meta-data collected during compilation. If your .di is manually created, you either put them manually as well, or you don't care.An argument to a function declared as a prototype always escapes by default. It may be possible for the compiler to export some meta-info along with the prototype when a .di file is generated, whether an argument is guaranteed to not escape, or maybe even detailed info about which argument escapes where, to mimic the compile-time meta-info.No, the di file might not be auto-generated. You also now back to a separate import and source file, like C has. I think in order for this to work, the graph and object code must be stored in the same file that is imported.You can limit analysis to a single module for now. This will cover local function calls, including some local method calls, and I hope it'll also cover template function calls which means std.algorithm will work without memory allocation again.The expression graph analysis should be the first step towards safe stack closures.I would agree with this. But I don't think it's happening in the near future. And I hope it's not done through .di files.In the meantime, to make D2 a systems language again, it should drop conservative closures. -Steve
Oct 28 2008
"Sergey Gromov" wroteSteven Schveighoffer wrote:Ah, sorry. I read 'absence of function source'. My bad, in that case we agree on this one."Sergey Gromov" wrotePlease note the "in the absence of function calls" part. I'm talking about code which is doing pure calculus, without calling anything external. It's pretty useless by itself, but it's the basics.Walter Bright wrote:If safe defaults means 75% performance decrease, I'm for using unsafe defaults that are safe 99% of the time, with the ability to make them 100% safe if needed.The first step is, are function parameters considered to be escaping by default or not by default? I.e.: void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope.I'm for safe defaults. Programs shouldn't crash for no reason.Here are my thoughts on escape analysis. Sorry if they're obvious. I think it is possible to detect whether a reference escapes or not in the absence of function calls by analyzing an expression graph.Yes, but not in D, since import uses uncompiled files as input.Unfortunately I don't know how import is implemented. It should do some parsing though, to be able to inline functions from other modules, and to expand templates.Those are all problems to be solved. But if the file used by the linker and the file that contains the expression graphs aren't the same, or at least forced to be related, then you end up with very weird issues.Except the linker has to enforce it. Which means it needs to somehow be munged into the signature. If the signature is defined only in a .di file then it might not match. I just think the object file and .di file are too unrelated to force continuity. Weird issues can happen when these things are edited separately. 
If .di files were not editable and always generated with object files, I'd say they were a good place to put this info. But they aren't.

Here I'm talking about disposable compile-time data, module-local if you wish. This means that local optimization is better than inter-module optimization. Nothing new here I suppose.

Assigning to a global state variable is an ultimate escape.

Agree there.

In the worst case, when only the current function can be analyzed and no meta-info is available about other functions, the compiler must assume a reference escapes if it is passed as an argument to another function. This is the current D2 behavior.

This leads to the current situation, where you have a huge performance decrease for little or no gain in reliability.

Pure functions provide some meta-info because any reference passed as an argument can only escape via a reference return value or other mutable reference arguments. This makes escape analysis possible even after an unknown pure function is called.

Good point. Easy analysis on pure functions.

For any function in a tree of imported modules the compiler could keep some meta-data about which argument escapes where, if at all. This way even regular functions can participate in escape analysis without blowing it up.

Where is the data kept? It must be in the object file, and D imports must then read the object file for the API instead of the source file. I don't think it's worth anything to break the single-file model for imports/code. Requiring a .di file is a little iffy as it is today.

Of course it would be nice if this data is exported somehow and used when compiling other modules. But it'd make the compilation process asymmetric, when meta-data is available for already compiled modules and not available for others.

It would have to be available for all of them.
That would be the point of including it in the object file.A very very common technique in Tango to save using heap allocation is to declare a static array as a buffer, and then pass that buffer to be used as scratch space in a function (which is possibly virtual). This would be my golden use case that has to not allocate anything and has to work in order for any solution to be viable. Saying all reference types are noscope would prevent this, no?Not every call to a virtual function is itself virtual, and not every virtual function cares whether its argument escapes. I'd say more: the noscope should be default for all reference types except delegates because you usually don't care. I agree that having scope delegates the default is probably the right thing to do, but only if a compiler can detect violations of this contract.An argument to a virtual function call always escapes by default. It may be possible to declare an argument as non-escaping (scope?) and compiler should then enforce non-escaping contract upon any overriding functions.This is tricky, because most class member functions are virtual, so you are forced to litter all your functions with escaping/non-escaping syntax. To be accurate you need to define the escape graph in the signature, which will be a PITA. What would be worse is to not have a way to express the complete graph.I think the graph has to be complete for this to be usable. Otherwise, it becomes an unused feature. Using .di files is optional. I generally don't use them.Another solution is that a derived function must have the same expression graph or a tighter one than the base class'. But without being able to store the graph with the compiled code (and having the compiler import the metadata instead of the source file), this is a moot point.There are separate import files. Actually compiler can simply put scope/noscope for the arguments based upon the meta-data collected during compilation. 
If your .di is manually created, you either put them in manually as well, or you don't care.

An argument to a function declared as a prototype always escapes by default. It may be possible for the compiler to export some meta-info along with the prototype when a .di file is generated: whether an argument is guaranteed to not escape, or maybe even detailed info about which argument escapes where, to mimic the compile-time meta-info.

No, the .di file might not be auto-generated. You're also now back to a separate import and source file, like C has. I think in order for this to work, the graph and object code must be stored in the same file that is imported.

Yes, but not class virtual methods or interface methods. These are used quite a bit in Tango. End result, not a lot of benefit.

-Steve

You can limit analysis to a single module for now. This will cover local function calls, including some local method calls, and I hope it'll also cover template function calls, which means std.algorithm will work without memory allocation again. The expression graph analysis should be the first step towards safe stack closures.

I would agree with this. But I don't think it's happening in the near future. And I hope it's not done through .di files.
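To make the scope/noscope distinction being debated concrete, here is a minimal D sketch; the `scope` storage class is used here as the thread proposes it (compiler-enforced), not as any released compiler actually checked it at the time:

```d
int* global;  // module-level state

void noEscape(scope int* p) { *p += 1; }    // p must not outlive the call
void escapes(int* p)        { global = p; } // p escapes into global state

void caller()
{
    int x;
    noEscape(&x);  // safe: &x cannot outlive caller's stack frame
    escapes(&x);   // unsafe: global dangles once caller returns
}
```

With noscope as the default, the second call compiles silently; with scope as the default, `escapes` would need an explicit annotation before it could legally store `p`.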
Oct 28 2008
Tue, 28 Oct 2008 23:33:53 -0400, Steven Schveighoffer wrote:"Sergey Gromov" wroteAllocation only happens when a stack variable reference escapes via a delegate. A static array is not a stack variable, therefore the compiler doesn't care if it escapes.Steven Schveighoffer wrote:A very very common technique in Tango to save using heap allocation is to declare a static array as a buffer, and then pass that buffer to be used as scratch space in a function (which is possibly virtual). This would be my golden use case that has to not allocate anything and has to work in order for any solution to be viable. Saying all reference types are noscope would prevent this, no?"Sergey Gromov" wroteNot every call to a virtual function is itself virtual, and not every virtual function cares whether its argument escapes. I'd say more: the noscope should be default for all reference types except delegates because you usually don't care. I agree that having scope delegates the default is probably the right thing to do, but only if a compiler can detect violations of this contract.An argument to a virtual function call always escapes by default. It may be possible to declare an argument as non-escaping (scope?) and compiler should then enforce non-escaping contract upon any overriding functions.This is tricky, because most class member functions are virtual, so you are forced to litter all your functions with escaping/non-escaping syntax. To be accurate you need to define the escape graph in the signature, which will be a PITA. What would be worse is to not have a way to express the complete graph.For the incomplete graph to be usable, the compiler must assume the worst for nodes with absent meta-info. Therefore if you don't care to provide meta-info for your modules, it'll still work, though not as efficiently. 
On the other hand, if you supply .di files with your library and you do care enough, or you generate your .di files automatically, the meta-info will be present there saving some allocations for the user.I think the graph has to be complete for this to be usable. Otherwise, it becomes an unused feature. Using .di files is optional. I generally don't use them.There are separate import files. Actually compiler can simply put scope/noscope for the arguments based upon the meta-data collected during compilation. If your .di is manually created, you either put them manually as well, or you don't care.An argument to a function declared as a prototype always escapes by default. It may be possible for the compiler to export some meta-info along with the prototype when a .di file is generated, whether an argument is guaranteed to not escape, or maybe even detailed info about which argument escapes where, to mimic the compile-time meta-info.No, the di file might not be auto-generated. You also now back to a separate import and source file, like C has. I think in order for this to work, the graph and object code must be stored in the same file that is imported.If those virtual and interface methods are often used with function-local delegates as parameters then yes, the benefit wouldn't be that significant. Are you sure this is the case with Tango?Yes, but not class virtual methods or interface methods. These are used quite a bit in Tango. End result, not a lot of benefit.You can limit analysis to a single module for now. This will cover local function calls, including some local method calls, and I hope it'll also cover template function calls which means std.algorithm will work without memory allocation again.The expression graph analysis should be the first step towards safe stack closures.I would agree with this. But I don't think it's happening in the near future. And I hope it's not done through .di files.
Oct 29 2008
"Sergey Gromov" wrote

Tue, 28 Oct 2008 23:33:53 -0400, Steven Schveighoffer wrote:

A static array declared on the stack absolutely is a stack variable. An example (from Tango's integer to text converter):

char[] toString (long i, char[] fmt = null)
{
    char[66] tmp = void;
    return format (tmp, i, fmt).dup;
}

Without the dup, toString returns a pointer to its own stack. With a full graph analysis, it can be proven that tmp doesn't escape, but without either that or some crazy scope scheme, it would either allocate a closure, or fail to compile. Neither of those options is acceptable.

"Sergey Gromov" wrote

Allocation only happens when a stack variable reference escapes via a delegate. A static array is not a stack variable, therefore the compiler doesn't care if it escapes.

Steven Schveighoffer wrote:

A very very common technique in Tango to save using heap allocation is to declare a static array as a buffer, and then pass that buffer to be used as scratch space in a function (which is possibly virtual). This would be my golden use case that has to not allocate anything and has to work in order for any solution to be viable. Saying all reference types are noscope would prevent this, no?

"Sergey Gromov" wrote

Not every call to a virtual function is itself virtual, and not every virtual function cares whether its argument escapes. I'd say more: the noscope should be default for all reference types except delegates because you usually don't care. I agree that having scope delegates the default is probably the right thing to do, but only if a compiler can detect violations of this contract.

An argument to a virtual function call always escapes by default. It may be possible to declare an argument as non-escaping (scope?) and the compiler should then enforce the non-escaping contract upon any overriding functions.

This is tricky, because most class member functions are virtual, so you are forced to litter all your functions with escaping/non-escaping syntax. 
To be accurate you need to define the escape graph in the signature, which will be a PITA. What would be worse is to not have a way to express the complete graph.This doesn't cover virtual functions or runtime-determined delegates. I'd rather just have a separate meta file or have the meta data included in the object file. What is wrong with that? Why must it be in the .di file? If the compiler always generates these meta files, then the graph is always complete.For the incomplete graph to be usable, the compiler must assume the worst for nodes with absent meta-info. Therefore if you don't care to provide meta-info for your modules, it'll still work, though not as efficiently. On the other hand, if you supply .di files with your library and you do care enough, or you generate your .di files automatically, the meta-info will be present there saving some allocations for the user.I think the graph has to be complete for this to be usable. Otherwise, it becomes an unused feature. Using .di files is optional. I generally don't use them.There are separate import files. Actually compiler can simply put scope/noscope for the arguments based upon the meta-data collected during compilation. If your .di is manually created, you either put them manually as well, or you don't care.An argument to a function declared as a prototype always escapes by default. It may be possible for the compiler to export some meta-info along with the prototype when a .di file is generated, whether an argument is guaranteed to not escape, or maybe even detailed info about which argument escapes where, to mimic the compile-time meta-info.No, the di file might not be auto-generated. You also now back to a separate import and source file, like C has. I think in order for this to work, the graph and object code must be stored in the same file that is imported.Any time you use opApply (and opApply is virtual), you are doing this. 
I suppose opApply is a special case, and can fail to compile if you save the delegate somewhere. But what about being able to pass the delegate to another virtual function while inside your opApply? Here is another example from Tango that isn't used via foreach:

final bool putCache (char[] key, IMessage message)
{
    void send (IConduit conduit)
    {
        buffer.setConduit (conduit);
        writer.put (ProtocolWriter.Command.Add, name_, key, message).flush;
    }

    // return false if the cache server said there's
    // already something newer
    if (cluster_.cache.request (&send, reader, key))
        return false;
    return true;
}

cluster_.cache is a class.

-Steve

If those virtual and interface methods are often used with function-local delegates as parameters then yes, the benefit wouldn't be that significant. Are you sure this is the case with Tango?

Yes, but not class virtual methods or interface methods. These are used quite a bit in Tango. End result, not a lot of benefit.

You can limit analysis to a single module for now. This will cover local function calls, including some local method calls, and I hope it'll also cover template function calls, which means std.algorithm will work without memory allocation again. The expression graph analysis should be the first step towards safe stack closures.

I would agree with this. But I don't think it's happening in the near future. And I hope it's not done through .di files.
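For reference, the opApply pattern under discussion looks roughly like this (a generic sketch, not code taken from Tango):

```d
class Container
{
    int[] items;

    // The caller's foreach body arrives as a delegate. Because opApply
    // is virtual, an override could stash dg somewhere, so without a
    // checked 'scope' annotation on dg the compiler must assume it
    // escapes and heap-allocate the caller's frame.
    int opApply(int delegate(ref int) dg)
    {
        foreach (ref item; items)
            if (auto r = dg(item))
                return r;
        return 0;
    }
}
```

This is why every foreach over a class, under conservative closures, can cost the caller an allocation even when the delegate never outlives the loop.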
Oct 29 2008
Wed, 29 Oct 2008 11:52:14 -0400, Steven Schveighoffer wrote:"Sergey Gromov" wroteThere is no delegate, therefore nothing to allocate a closure for. If tmp escapes, it is a compile-time error. If format() were pure it would be trivial to prove that tmp didn't escape. If format() is not pure, and escape graph for it is not known, then issuing an error here would be too much of a breaking change, I agree.Tue, 28 Oct 2008 23:33:53 -0400, Steven Schveighoffer wrote:A static array declared on the stack absolutely is a stack variable. An example (from Tango's integer to text converter): char[] toString (long i, char[] fmt = null) { char[66] tmp = void; return format (tmp, i, fmt).dup; } Without the dup, toString returns a pointer to it's own stack. With a full graph analysis, it can be proven that tmp doesn't escape, but without either that or some crazy scope scheme, it would either allocate a closure, or fail to compile. Neither of those options are acceptable."Sergey Gromov" wroteAllocation only happens when a stack variable reference escapes via a delegate. A static array is not a stack variable, therefore the compiler doesn't care if it escapes.Steven Schveighoffer wrote:A very very common technique in Tango to save using heap allocation is to declare a static array as a buffer, and then pass that buffer to be used as scratch space in a function (which is possibly virtual). This would be my golden use case that has to not allocate anything and has to work in order for any solution to be viable. Saying all reference types are noscope would prevent this, no?"Sergey Gromov" wroteNot every call to a virtual function is itself virtual, and not every virtual function cares whether its argument escapes. I'd say more: the noscope should be default for all reference types except delegates because you usually don't care. 
I agree that having scope delegates the default is probably the right thing to do, but only if a compiler can detect violations of this contract.An argument to a virtual function call always escapes by default. It may be possible to declare an argument as non-escaping (scope?) and compiler should then enforce non-escaping contract upon any overriding functions.This is tricky, because most class member functions are virtual, so you are forced to litter all your functions with escaping/non-escaping syntax. To be accurate you need to define the escape graph in the signature, which will be a PITA. What would be worse is to not have a way to express the complete graph.If you compile two files for the first time, and the first file imports the second one, where do you get that meta-data for the second file? What if you compile only one file, and that file imports another which wasn't compiled yet? Either you construct meta-data on the fly, or require it included in the source, or assume it's not present (worst case).This doesn't cover virtual functions or runtime-determined delegates. I'd rather just have a separate meta file or have the meta data included in the object file. What is wrong with that? Why must it be in the .di file? If the compiler always generates these meta files, then the graph is always complete.For the incomplete graph to be usable, the compiler must assume the worst for nodes with absent meta-info. Therefore if you don't care to provide meta-info for your modules, it'll still work, though not as efficiently. On the other hand, if you supply .di files with your library and you do care enough, or you generate your .di files automatically, the meta-info will be present there saving some allocations for the user.I think the graph has to be complete for this to be usable. Otherwise, it becomes an unused feature. Using .di files is optional. I generally don't use them.There are separate import files. 
Actually compiler can simply put scope/noscope for the arguments based upon the meta-data collected during compilation. If your .di is manually created, you either put them manually as well, or you don't care.An argument to a function declared as a prototype always escapes by default. It may be possible for the compiler to export some meta-info along with the prototype when a .di file is generated, whether an argument is guaranteed to not escape, or maybe even detailed info about which argument escapes where, to mimic the compile-time meta-info.No, the di file might not be auto-generated. You also now back to a separate import and source file, like C has. I think in order for this to work, the graph and object code must be stored in the same file that is imported.Fair enough. opApply() is an important technique.Any time you use opApply (and opApply is virtual), you are doing this.If those virtual and interface methods are often used with function-local delegates as parameters then yes, the benefit wouldn't be that significant. Are you sure this is the case with Tango?Yes, but not class virtual methods or interface methods. These are used quite a bit in Tango. End result, not a lot of benefit.You can limit analysis to a single module for now. This will cover local function calls, including some local method calls, and I hope it'll also cover template function calls which means std.algorithm will work without memory allocation again.The expression graph analysis should be the first step towards safe stack closures.I would agree with this. But I don't think it's happening in the near future. And I hope it's not done through .di files.
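The point about purity can be made concrete: if format's prototype were pure, the caller could reason about tmp from the signature alone. This is a hypothetical signature for illustration, not Tango's actual one:

```d
// A pure function may only leak a reference argument through its
// return value or through its other mutable arguments.
pure char[] format(char[] dst, long i, char[] fmt);

char[] toString(long i, char[] fmt = null)
{
    char[66] tmp = void;
    // tmp can only come back via the return value; since the result
    // is dup'd, no reference to the local buffer survives the call.
    return format(tmp, i, fmt).dup;
}
```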
Oct 29 2008
"Sergey Gromov" wroteWed, 29 Oct 2008 11:52:14 -0400, Steven Schveighoffer wrote:I was under the impression that closures are currently allocated if you return a reference to a stack variable, not just for delegates. Maybe I'm wrong..."Sergey Gromov" wroteThere is no delegate, therefore nothing to allocate a closure for. If tmp escapes, it is a compile-time error.Tue, 28 Oct 2008 23:33:53 -0400, Steven Schveighoffer wrote:A static array declared on the stack absolutely is a stack variable. An example (from Tango's integer to text converter): char[] toString (long i, char[] fmt = null) { char[66] tmp = void; return format (tmp, i, fmt).dup; } Without the dup, toString returns a pointer to it's own stack. With a full graph analysis, it can be proven that tmp doesn't escape, but without either that or some crazy scope scheme, it would either allocate a closure, or fail to compile. Neither of those options are acceptable.A very very common technique in Tango to save using heap allocation is to declare a static array as a buffer, and then pass that buffer to be used as scratch space in a function (which is possibly virtual). This would be my golden use case that has to not allocate anything and has to work in order for any solution to be viable. Saying all reference types are noscope would prevent this, no?Allocation only happens when a stack variable reference escapes via a delegate. A static array is not a stack variable, therefore the compiler doesn't care if it escapes.If format() were pure it would be trivial to prove that tmp didn't escape. If format() is not pure, and escape graph for it is not known, then issuing an error here would be too much of a breaking change, I agree.format cannot be pure because it accepts mutable reference data. It happens to be in the same file, so it probably would not be an issue because a graph is generated for the current file, but these are not the only cases that Tango has.My vote would be for compiling it on the fly. 
The compiler already does parsing of the source file, so it can also generate this graph data. It shouldn't be too hard a task.

Look, I agree that a graph analysis is the best possible solution. It requires no work from the user, no extra specification, and it will solve the problem accurately. But the current mode of compilation doesn't allow for that easily. That's all I was saying.

-Steve

If you compile two files for the first time, and the first file imports the second one, where do you get that meta-data for the second file? What if you compile only one file, and that file imports another which wasn't compiled yet? Either you construct meta-data on the fly, or require it included in the source, or assume it's not present (worst case).

This doesn't cover virtual functions or runtime-determined delegates. I'd rather just have a separate meta file or have the meta data included in the object file. What is wrong with that? Why must it be in the .di file? If the compiler always generates these meta files, then the graph is always complete.

I think the graph has to be complete for this to be usable. Otherwise, it becomes an unused feature. Using .di files is optional. I generally don't use them.

For the incomplete graph to be usable, the compiler must assume the worst for nodes with absent meta-info. Therefore if you don't care to provide meta-info for your modules, it'll still work, though not as efficiently. On the other hand, if you supply .di files with your library and you do care enough, or you generate your .di files automatically, the meta-info will be present there, saving some allocations for the user.
Oct 29 2008
Wed, 29 Oct 2008 15:23:12 -0400, Steven Schveighoffer wrote:

"Sergey Gromov" wrote

I do understand that. I just wanted to discuss whether it is possible to approach this problem incrementally, so that relatively simple changes significantly improve the situation without breaking safety. And I thought that a dispute was a nice way of probing an idea for hidden flaws.

If you compile two files for the first time, and the first file imports the second one, where do you get that meta-data for the second file? What if you compile only one file, and that file imports another which wasn't compiled yet? Either you construct meta-data on the fly, or require it included in the source, or assume it's not present (worst case).

My vote would be for compiling it on the fly. The compiler already does parsing of the source file, so it can also generate this graph data. It shouldn't be too hard a task. Look, I agree that a graph analysis is the best possible solution. It requires no work from the user, no extra specification, and it will solve the problem accurately. But the current mode of compilation doesn't allow for that easily. That's all I was saying.
Oct 29 2008
Steven Schveighoffer wrote:

"Sergey Gromov" wrote

If safe defaults means 2% performance decrease, I'm for using unsafe defaults that are safe 10% of the time, with the inability to make them 100% safe if needed. I might also be insane.

...

I'm initially biased towards the safe default. I remember reading that part of D's design philosophy is to be safe by default, and I like that A LOT because it saves me from wasting many many hours of my life on stupid bugs. I'm also not convinced that full closures really run that much slower.

That said, I'd be happy to ignore escape analysis for a while longer and just have D1 closures with the option to manually heap allocate them. I say that under the assumption that it's really easy to implement, mostly sort of solves the problem, and allows better (more general, safer) solutions to be put in place later.

Walter Bright wrote:

If safe defaults means 75% performance decrease, I'm for using unsafe defaults that are safe 99% of the time, with the ability to make them 100% safe if needed.

The first step is, are function parameters considered to be escaping by default or not by default? I.e.:

void bar(noscope int* p); // p escapes
void bar(scope int* p);   // p does not escape
void bar(int* p);         // what should be the default?

What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope.

I'm for safe defaults. Programs shouldn't crash for no reason.
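The closure trade-off being debated fits in a few lines of D; this is a sketch, and the "manual heap allocation" option mentioned above had no agreed syntax at the time:

```d
int delegate() counter()
{
    int n;                    // lives in counter's stack frame
    int bump() { return ++n; }

    // D1: &bump silently refers to a dead frame after counter returns.
    // D2's conservative closures: the frame is heap-allocated instead,
    // costing an allocation even for callers that never let it escape.
    return &bump;
}
```

The proposal is to keep the cheap D1 behavior by default and let the programmer explicitly request a heap-allocated frame only when the delegate must outlive the function.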
Oct 28 2008
I've run across some academic work on ownership types which seems relevant to this discussion on share/local/scope/noscope.

Paper: http://www.cs.jhu.edu/~scott/pll/papers/pedigree-types.pdf
Slides: http://www.cs.jhu.edu/~scott/pll/papers/iwaco.ppt
Site: http://www.cs.jhu.edu/~scott/pll/abinitio.html

Overview: Pedigree Types are an intuitive ownership type system requiring minimal programmer annotations. Reusing the vocabulary of human genealogy, Pedigree Types programmers can qualify any object reference with a pedigree -- a child, sibling, parent, grandparent, etc -- to indicate what relationship the object being referred to has with the referent on the standard ownership tree, following the owners-as-dominators convention. Such a qualifier serves as a heap shape constraint that must hold at run time and is enforced statically. Pedigree child captures the intention of encapsulation, i.e. ownership: the modified object reference is ensured not to escape the boundary of its parent. Among existing ownership type systems, Pedigree Types are closest to Universe Types. The former can be viewed as extending the latter with a more general form of pedigree modifiers, so that the relationship between any pair of objects on the aforementioned ownership tree can be named and -- more importantly -- inferred. We use a constraint-based type system which is proved sound via subject reduction. Other technical originalities include a polymorphic treatment of pedigrees not explicitly specified by programmers, and the use of linear diophantine equations in type constraints to enforce the hierarchy.
Oct 28 2008
On 2008-10-28 23:52:04 -0400, "Robert Jacques" <sandford jhu.edu> said:

I've run across some academic work on ownership types which seems relevant to this discussion on share/local/scope/noscope.

I haven't read the paper yet, but the overview seems to go in the same direction as I was thinking. Basically, all the scope variables you can get are guaranteed to be in the current or in some ancestor scope. To allow a reference to a scope variable, or a scope function, to be put inside a member of a struct or class, you only need to prove that the struct or class lifetime is smaller or equal to the one of the reference to your scope variable.

If you could tell the compiler the scope relationship of the various arguments, then you'd have pretty good scope analysis. For instance, with this syntax, we could define i to be available during the whole lifetime of o:

void foo(scope MyObject o, scope(o) int* i)
{
    o.i = i;
}

So you could do:

void bar()
{
    scope int i;
    scope MyObject o = new MyObject;
    foo(o, &i);
}

And the compiler would let it pass because foo guarantees not to keep references to i outside of o's scope, and o's scope is the same as i's. Or you could do:

void test1()
{
    int i;
    test2(&i);
}

void test2(scope int* i)
{
    scope o = new MyObject;
    foo(o, i);
}

Again, the compiler can statically check that test2 won't keep a reference to i outside of the caller's scope (test1) because o's scope is limited to test2. And if you try the reverse:

void test1()
{
    scope o = new MyObject;
    test2(o);
}

void test2(scope MyObject o)
{
    int i;
    foo(o, &i);
}

Then the compiler could determine automatically that i needs to escape test2's scope and allocate the variable on the heap to make its lifetime as long as the object's scope (as it does currently with nested functions) [see my reservations to this in the post scriptum]. 
This could be avoided by explicitly binding i to the current scope, in which case the compiler could issue a scope error:

void test2(scope MyObject o)
{
    scope int i;
    foo(o, &i); // error, i's scope needs to match o's, but i is bound to the current scope.
}

Interestingly, with this scheme, assuming your function arguments are properly scope-labeled, you never need to allocate variables on the heap explicitly anymore; the compiler can take care of it for you when the use of the variable inside the function body requires it.

void test3(int* i); // unscoped parameter

void test4()
{
    int i; // allocated on the heap because calling test3 requires an unscoped variable.
    test3(&i);
}

The reverse is also true: objects declared as allocated on the heap could be automatically rescoped as local stack variables if their use inside the function is limited in scope:

void test5()
{
    auto o = new MyObject;
    test2(o);
}

For instance, in test5 above where o isn't declared as scope, the compiler could still allocate o on the stack (as long as it knows the constructor doesn't leave unwanted references to the object in the global state), because it knows from the argument declaration of test2 that no references to o will leave the current scope. So basically, what to heap-allocate and what to stack-allocate could be left entirely to the compiler's discretion.

Note that for all this to work, the pointer "i" in MyObject must be defined as not escaping the scope of the class:

class MyObject
{
    scope int* i;
}

or else someone could take the reference and put it into a global variable, or a variable of a greater scope than the object.

P.S.: I'm still somewhat skeptical about this automatic allocation thing because it would mean a lot of extra heap allocation (and thus loss of performance) for any function where the parameters are not properly scoped. 
Perhaps the default should be local scope, and you explicitly make it greater by declaring variables as noscope, which would allow the compiler to allocate if needed. But it doesn't solve the issue of needing to allocate on the heap for safely calling functions that don't use scope-labeled arguments.

P.P.S.: This syntax doesn't fit very well with the current scope(success/failure/exit) feature.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Oct 29 2008
"Michel Fortin" wrote

On 2008-10-28 23:52:04 -0400, "Robert Jacques" <sandford jhu.edu> said:

[snip]

This is exactly the kind of thing I DON'T want to have. Here, you have to specify everything, even though the compiler is also doing the work, and making sure it matches. Tack on const modifiers, shared modifiers, and pure functions and there's going to be more decorations on function signatures than there are parameters. Note that especially this scope stuff will be required more often than the others.

I'd much rather have either no checks, or have the compiler (or a lint tool) do all the work to tell me if anything escapes.

-Steve

I've run across some academic work on ownership types which seems relevant to this discussion on share/local/scope/noscope.

I haven't read the paper yet, but the overview seems to go in the same direction as I was thinking.
Oct 29 2008
On Wed, 29 Oct 2008 11:01:35 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:"Michel Fortin" wroteNote that one of the major points in the Pedigree paper is the static type inference, so you don't have to specify everything.On 2008-10-28 23:52:04 -0400, "Robert Jacques" <sandford jhu.edu> said:[snip] This is exactly the kind of thing I DON'T want to have. Here, you have to specify everything, even though the compiler is also doing the work, and making sure it matches. Tack on const modifiers, shared modifiers, and pure functions and there's going to be more decorations on function signatures than there are parameters. Note that especially this scope stuff will be required more often than the others. I'd much rather have either no checks, or have the compiler (or a lint tool) do all the work to tell me if anything escapes. -SteveI've run across some academic work on ownership types which seems relevant to this discussion on share/local/scope/noscope.I haven't read the paper yet, but the overview seems to go in the same direction as I was thinking.
Oct 29 2008
On 2008-10-29 11:01:35 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:This is exactly the kind of thing I DON'T want to have. Here, you have to specify everything, even though the compiler is also doing the work, and making sure it matches. Tack on const modifiers, shared modifiers, and pure functions and there's going to be more decorations on function signatures than there are parameters.I agree that this is becoming a problem, even without scope. What we need is good defaults so that you don't have to decorate most of the time, and especially when you want to bypass it. I'd also like to point out that besides the possibility of better optimization and error catching by the compiler, specifying more properties on function interfaces can free us from handling other related things. With "immutable" values you don't need to worry about duplicating them everywhere to keep other references from changing them; with "shared", you'll have less to worry about thread synchronization; and with "scope" as I proposed, you no longer have to worry about providing variables with the correct scope as the compiler can dynamically allocate when it sees the variable is needed outside of the current scope. Basically, by documenting the interfaces better in a machine-readable way, we are freed of other burdens the compiler can now take care of. In addition, we have better defined interfaces and the compiler has a lot more room to optimize things.Note that especially this scope stuff will be required more often than the others.Indeed.I'd much rather have either no checks, or have the compiler (or a lint tool) do all the work to tell me if anything escapes.The problem is that as soon as you have a function declaration without the body, the lint tool won't be able to tell you if it escapes or not. 
So, without a way to specify the requested scope of the parameters, you'll very often have holes in your escape analysis that will propagate down the caller chain, preventing any useful conclusion. For instance: void foo() { char[5] x = ['1', '2', '3', '4', '\0']; bar(x); } void bar(char* x) { printf(x); } void printf(char* x); Here you have no specification telling you that printf won't keep a reference to x beyond its scope, so we have to expect that it may do so. Turns out that because of that, a compiler or lint tool can't deduce whether bar may or may not leak the reference beyond its scope, which basically means that calling bar(x) in foo may or may not be safe. With my proposal, it'd become this: void foo() { char[5] x = ['1', '2', '3', '4', '\0']; bar(x.ptr); } void bar(scope char* x) { printf(x); } void printf(scope char* x); And here the compiler, or the lint tool, can see that x doesn't need to live outside of foo's scope and that all is fine. If bar decided to keep the pointer in a global variable for further use, then the function signature would have a noscope x or the assignment to a global wouldn't work, and once bar has a noscope argument then foo won't compile unless x is allocated on the heap. I don't think it's bad to force interfaces to be well documented, and documented in a format that the compiler can understand to find errors like this. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
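The failing case described above (bar keeping the pointer) can be sketched in the same hypothetical proposal syntax; `noscope` and the propagation rule are part of the proposal, not actual D:

```d
// Hypothetical syntax: if the callee stores the pointer, its parameter
// must be declared noscope, and that requirement propagates to callers.
char* lastMessage;                  // global storage

void log(noscope char* x) { lastMessage = x; }  // ok: x is noscope

void foo()
{
    char[5] x = ['1', '2', '3', '4', '\0'];
    log(x.ptr);   // error under the proposal: a pointer to a stack
                  // array cannot be passed as a noscope argument;
                  // x would have to be heap-allocated instead
}
```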
Oct 30 2008
"Michel Fortin" wroteBasically, by documenting better the interfaces in a machine-readable way, we are freed of other burdens the compiler can now take care of. In addition, we have better defined interfaces and the compiler has a lot more room to optimize things.But the burden you have left for the developer is a tough one. You have to analyze the inputs and function calls from a function and determine which variable depends on what. This is a perfect problem for a tool to solve.The problem is that as soon as you have a function declaration without the body, the lint tool won't be able to tell you if it escapes or not.This I agree is a problem. In fact, without specifications in the function things like interfaces would be very difficult to determine scope-ness at compile time. The only way I can see to solve this is to do it at link time. When you link, piece together the parts of the graph that were incomplete, and see if they all work. It would be a very radical change, and might not even work with the current linkers. Especially if you want to do shared libraries, where the linker is builtin to the OS. A related question: how do you handle C functions?So, without a way to specify the requested scope of the parameters, you'll very often have holes in your escape analysis that will propagate down the caller chain, preventing any useful conclusion.Yes, and if a function has mis-specified some of its parameters, then you have code that doesn't compile. Or the function itself won't compile, and you need to do some more manual analysis. Imagine a function that calls 5 or 6 other functions with its parameters. And there are multiple different dependencies you have to resolve. 
That's a lot of analysis you have to do manually.I don't think it's bad to force interfaces to be well documented, and documented in a format that the compiler can understand to find errors like this.I think this concept is going to be really hard for a person to decipher, and really hard to get right. We are talking about a graph dependency analysis, in which many edges can exist, and the vertices do not necessarily have to be parameters. This is not stuff for the meager developer looking to get work done to have to think about. I'd much rather have a tool that does it, if not the compiler, then something else. Or partial analysis. Or no analysis. I agree it's good to have bugs caught by the compiler, but this solution requires too much work from the developer to be used. Some fun puzzles for you to come up with a proper scope syntax to use: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;} struct S { int *v; } int *f2(S* s) { return s.v;} void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; } -Steve
Oct 31 2008
On Fri, 31 Oct 2008 11:11:26 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:"Michel Fortin" wroteTools can't handle function pointers, which is why escape analysis has been limited to dynamic languages like Java so far.Basically, by documenting the interfaces better in a machine-readable way, we are freed of other burdens the compiler can now take care of. In addition, we have better defined interfaces and the compiler has a lot more room to optimize things.But the burden you have left for the developer is a tough one. You have to analyze the inputs and function calls from a function and determine which variable depends on what. This is a perfect problem for a tool to solve.One option is link-time compilation, although that doesn't apply to shared libs.The problem is that as soon as you have a function declaration without the body, the lint tool won't be able to tell you if it escapes or not.This I agree is a problem. In fact, without specifications in the function signature, it would be very difficult to determine scope-ness at compile time for things like interfaces. The only way I can see to solve this is to do it at link time. When you link, piece together the parts of the graph that were incomplete, and see if they all work. It would be a very radical change, and might not even work with the current linkers. Especially if you want to do shared libraries, where the linker is built into the OS.A related question: how do you handle C functions?Hope and pray? (i.e. The same way C functions and immutable types are handled now.)Well, the same problem occurs with const today and just like const you'd have specific compiler errors to guide you.So, without a way to specify the requested scope of the parameters, you'll very often have holes in your escape analysis that will propagate down the caller chain, preventing any useful conclusion.Yes, and if a function has mis-specified some of its parameters, then you have code that doesn't compile. 
Or the function itself won't compile, and you need to do some more manual analysis. Imagine a function that calls 5 or 6 other functions with its parameters. And there are multiple different dependencies you have to resolve. That's a lot of analysis you have to do manually.Well, I'd guess most functions are either no escape or heap escape. Only functions that permit escape and want to play nice with stack variables need to do actual graph analysis. You'll note Walter's blog ignores this usage.I don't think it's bad to force interfaces to be well documented, and documented in a format that the compiler can understand to find errors like this.I think this concept is going to be really hard for a person to decipher, and really hard to get right. We are talking about a graph dependency analysis, in which many edges can exist, and the vertices do not necessarily have to be parameters. This is not stuff for the meager developer looking to get work done to have to think about. I'd much rather have a tool that does it, if not the compiler, then something else. Or partial analysis. Or no analysis. I agree it's good to have bugs caught by the compiler, but this solution requires too much work from the developer to be used.Some fun puzzles for you to come up with a proper scope syntax to use: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;}if( a.scope <= b.scope && a.scope <= c.scope )struct S { int *v; } int *f2(S* s) { return s.v;}int* f2(S* s) if( return.scope >= s.scope )void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; }if ( a.scope == b.scope && a.scope == c.scope )-SteveThis is actually pretty straight forward as a = b implies a.scope <= b.scope.
Oct 31 2008
On 2008-10-31 11:11:26 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:"Michel Fortin" wroteIf you can't determine yourself that a function can work with scoped parameters, you'd better never call that function with references to local variables and leave its prototype with noscope parameters, making the compiler aware of the situation. In any case, the one who designs the function is the one who is most likely able to tell you whether or not it accepts scoped arguments. The current situation makes the caller of that function responsible for calling it correctly. I think that's backward.Basically, by documenting the interfaces better in a machine-readable way, we are freed of other burdens the compiler can now take care of. In addition, we have better defined interfaces and the compiler has a lot more room to optimize things.But the burden you have left for the developer is a tough one. You have to analyze the inputs and function calls from a function and determine which variable depends on what. This is a perfect problem for a tool to solve.The problem is that as soon as you have a function declaration without the body, the lint tool won't be able to tell you if it escapes or not.This I agree is a problem. In fact, without specifications in the function signature, it would be very difficult to determine scope-ness at compile time.The only way I can see to solve this is to do it at link time. When you link, piece together the parts of the graph that were incomplete, and see if they all work. It would be a very radical change, and might not even work with the current linkers. Especially if you want to do shared libraries, where the linker is built into the OS.I think you're dreaming... 
not that it's a bad thing to have ambition, but that's probably not even possible.A related question: how do you handle C functions?You read the documentation of the function to determine if the function will let the pointer escape somewhere, and if not declare the parameter scope. For instance: extern (C) void printf(scope char* format, scope...); By the way, extern (C) functions with noscope parameters need careful consideration since their pointers aren't tracked by the garbage collector.You'll get an error at some call site, which can mean only two things: either your local variable shouldn't be bound to the local scope (because the function expects a reference it can keep beyond its scope) so you should allocate it on the heap, or the function you're calling has its prototype wrong. There's a chance that fixing the function prototype will create problems upward if it tries to put a reference to a scope variable in a global, or pass it to a function as a noscope argument.So, without a way to specify the requested scope of the parameters, you'll very often have holes in your escape analysis that will propagate down the caller chain, preventing any useful conclusion.Yes, and if a function has mis-specified some of its parameters, then you have code that doesn't compile. Or the function itself won't compile, and you need to do some more manual analysis. Imagine a function that calls 5 or 6 other functions with its parameters. And there are multiple different dependencies you have to resolve. That's a lot of analysis you have to do manually.It takes some thinking to get the prototype right at first. 
But it takes less caution calling the function later with local variables since the compiler will either issue an error or automatically fix the issue by allocating on the heap when an argument requires a greater scope.I don't think it's bad to force interfaces to be well documented, and documented in a format that the compiler can understand to find errors like this.I think this concept is going to be really hard for a person to decipher, and really hard to get right.We are talking about a graph dependency analysis, in which many edges can exist, and the vertices do not necessarily have to be parameters. This is not stuff for the meager developer looking to get work done to have to think about. I'd much rather have a tool that does it, if not the compiler, then something else. Or partial analysis. Or no analysis. I agree it's good to have bugs caught by the compiler, but this solution requires too much work from the developer to be used. Some fun puzzles for you to come up with a proper scope syntax to use: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;}void f(scope ref int *a, scopeof(a) int *b, scopeof(o) int *c) { if (*b < *c) a = b; else a = c; }struct S { int *v; } int *f2(S* s) { return s.v;}Here you have two options depending on what you mean. Your example above is valid, but would allow v to point only to heap variables. If your intension is that S.v should be able to refer to scope variables too, then you'd need to write S as: struct S { scope int *v; } Then, no function can copy this pointer and keep it beyond of the scope of S. Therfore, the function needs to be updated to propagate this property: scopeof(s) int *f2(scope S* s) { return s.v; }void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; }This one is special, because you have a circular reference between the parameters. Note that a simpler example of this would be swapping two values. 
I had to invent something here saying that all these variables share the same scope... but I'd agree the syntax isn't so good. void f3(ref scope(1) int *a, ref scope(1) int *b, ref scope(1) int *c) { scope int *tmp = a; a = b; b = c; c = tmp; } -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 31 2008
"Michel Fortin" wroteIf you can't determine yourself that a function can work with scoped parameters, you'd better never call that function with reference to local variables and leave its prototype with noscope parameters, making the compiler aware of the situation. In any case, the one who design the function is the one who is most likely able to tell you whether or not it accepts scoped arguments. The current situation makes the caller of that function responsible of calling it correctly. I think that's backward.But often times, the safety of the call depends on how it is being called. Unless the function has fully documented the scope escapes of its parameters, which as I have been saying, is going to be difficult, or impossible, for a person to figure out.Sure it is ;) You have to write a special linker. I think everyone who thinks a scope decoration proposal is going to 1) solve all scope escape issues and 2) be easy to use is dreaming :PThe only way I can see to solve this is to do it at link time. When you link, piece together the parts of the graph that were incomplete, and see if they all work. It would be a very radical change, and might not even work with the current linkers. Especially if you want to do shared libraries, where the linker is builtin to the OS.I think you're dreaming... 
not that it's a bad thing to have ambition, but that's probably not even possible.Or, the prototype can't be written correctly, even though it is provable that no escapes occur.You'll get an error at some call site, which can mean only two things: either your local variable shouldn't be bound to the local scope (because the function expects a reference it can keep beyond its scope) so you should allocate it on the heap, or the function you're calling has its prototype wrong.So, without a way to specify the requested scope of the parameters, you'll very often have holes in your escape analysis that will propagate down the caller chain, preventing any useful conclusion.Yes, and if a function has mis-specified some of its parameters, then you have code that doesn't compile. Or the function itself won't compile, and you need to do some more manual analysis. Imagine a function that calls 5 or 6 other functions with its parameters. And there are multiple different dependencies you have to resolve. That's a lot of analysis you have to do manually.I hope to avoid this last situation. Having the compiler make decisions for me, especially when heap allocation occurs, is bad.It takes some thinking to get the prototype right at first. But it takes less caution calling the function later with local variables since the compiler will either issue an error or automatically fix the issue by allocating on the heap when an argument requires a greater scope.I don't think it's bad to force interfaces to be well documented, and documented in a format that the compiler can understand to find errors like this.I think this concept is going to be really hard for a person to decipher, and really hard to get right.I assume you meant scopeof(a) instead of scopeof(o), but in any case, your design is incorrect. a depends on b and c's scope, not the other way around. 
Consider this valid usage: void foo() { int b = 1, c = 2; bar(&b, &c); } void bar(scope int *b, scope int *c) { int *a; f(a, b, c);// should not fail, but would with your decorations } -Stevevoid f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;}void f(scope ref int *a, scopeof(a) int *b, scopeof(o) int *c) { if (*b < *c) a = b; else a = c; }
Nov 01 2008
Steven Schveighoffer wrote:I think everyone who thinks a scope decoration proposal is going to 1) solve all scope escape issues and 2) be easy to use is dreaming :PI think that's a fair assessment. One suggestion I made to Walter is to only allow and implement the scope storage class for delegates, which simply means the callee will not squirrel away a pointer to the delegate. That would allow us to solve the closure issue and for now sleep some more on the other issues. Andrei
Nov 01 2008
"Andrei Alexandrescu" wroteSteven Schveighoffer wrote:If scope delegates means trust the coder knows what he is doing (in the beginning), I agree with that plan of attack. -SteveI think everyone who thinks a scope decoration proposal is going to 1) solve all scope escape issues and 2) be easy to use is dreaming :PI think that's a fair assessment. One suggestion I made Walter is to only allow and implement the scope storage class for delegates, which simply means the callee will not squirrel away a pointer to delegate. That would allow us to solve the closure issue and for now sleep some more on the other issues.
Nov 02 2008
Steven Schveighoffer wrote:"Andrei Alexandrescu" wroteIt looks like things will move that way. Bartosz, Walter and I talked a lot yesterday about it - a lot of crazy things were on the table! The next step is to make this a reference, which is highly related to escape analysis. At the risk of anticipating a bit an unfinalized design, here's what's on the table: * Continue an "anything goes" policy for *explicit* pointers, i.e. those written explicitly by user code with stars and stuff. * Disallow pointers in SafeD. * Make all ref parameters scoped by default. It will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight". * Make this a reference so that it obeys what references obey. * If people want to implement e.g. linked lists, they should do it with classes. Implementing them with structs will require casts to obtain and escape &this. That also means they'd be using pointers, so anything goes - pointers are not restricted from escaping. * There are two cases in which things escape without the user explicitly using pointers: delegates and dynamic arrays initialized from stack-allocated arrays. * For delegates require the scope keyword in the signature of the callee. A scoped delegate cannot be stored, only called or passed down to another function that in turn takes a scoped delegate. This makes scope delegates entirely safe. Non-scoped delegates use dynamic allocation. * We don't have an idea for dynamic arrays initialized from stack-allocated arrays. Thoughts? Ideas? 
Andrei
Steven Schveighoffer wrote:If scope delegates means trust the coder knows what he is doing (in the beginning), I agree with that plan of attack.I think everyone who thinks a scope decoration proposal is going to 1) solve all scope escape issues and 2) be easy to use is dreaming :PI think that's a fair assessment. One suggestion I made to Walter is to only allow and implement the scope storage class for delegates, which simply means the callee will not squirrel away a pointer to the delegate. That would allow us to solve the closure issue and for now sleep some more on the other issues.
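The scope-delegate rule on the table can be sketched as follows (a sketch of the proposed semantics, not a finalized design; the names are illustrative):

```d
// Proposed rule: a scope delegate may be called or passed down to
// another scope-delegate parameter, but never stored.
void delegate(int) saved;                 // module-level storage

void each(int[] xs, scope void delegate(int) dg)
{
    foreach (x; xs)
        dg(x);         // ok: calling is allowed
    // saved = dg;     // rejected under the proposal: the delegate
                       //  (and its stack frame) would escape
}

void forward(scope void delegate(int) dg)
{
    each([1, 2, 3], dg);  // ok: passed down to another scope parameter
}
```

Because the callee provably never retains the delegate, the caller's closure can live on the stack, which is what makes this solve the closure-allocation issue.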
Nov 02 2008
Andrei Alexandrescu Wrote:* If people want to implement e.g. linked lists, they should do it with classes.UHm... I see. But I am not sure I like that. Isn't that a waste of memory? All objects have a vtable. Bye, bearophile
Nov 02 2008
bearophile wrote:Andrei Alexandrescu Wrote:Yah, we can't get rid of that. Possibilities discussed were (a) make final classes not have a vtable, and (b) define a new kind of struct that's only heap allocated. Walter thinks both add quite some complication for little benefit. Let's not forget that a cast will allow the trick for those interested in saving the extra word. Andrei* If people want to implement e.g. linked lists, they should do it with classes.UHm... I see. But I am not sure I like that. Isn't that a waste of memory? All objects have a vtable.
Nov 02 2008
== Quote from bearophile (bearophileHUGS lycos.com)'s article
Andrei Alexandrescu Wrote:
* If people want to implement e.g. linked lists, they should do it with classes.
UHm... I see. But I am not sure I like that. Isn't that a waste of memory? All objects have a vtable.
Bye, bearophile
And a monitor. And RTTI. Then again, for code that absolutely must be as efficient as possible, doing some fairly hackish/unsafe things is generally considered more acceptable than in run-of-the-mill programming. In these cases, you could always do it with structs and just use the casts. In the other 97% of cases, when we should forget about small efficiencies, a class works fine.
Nov 02 2008
On Sun, Nov 2, 2008 at 10:39 AM, bearophile <bearophileHUGS lycos.com> wrote:Andrei Alexandrescu Wrote:No, they have a *pointer* to a vtable. There is only one vtable per class, allocated in static memory. You only pay the cost of one pointer.* If people want to implement e.g. linked lists, they should do it with classes.UHm... I see. But I am not sure I like that. Isn't that a waste of memory? All objects have a vtable.
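The per-instance cost being discussed can be sketched in D (illustrative only; exact sizes depend on the compiler and platform):

```d
// A class node carries a hidden vtable pointer (and monitor slot) per
// instance; a struct node is just its declared fields.
class CNode { int value; CNode next; }
struct SNode { int value; SNode* next; }

// __traits(classInstanceSize, CNode) includes the hidden vptr and
// monitor fields; SNode.sizeof covers only value and next. The vtable
// itself exists once per class, in static memory, as noted above.
```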
Nov 02 2008
On 2008-11-02 10:12:46 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:It looks like things will move that way. Bartosz, Walter and I talked a lot yesterday about it - a lot of crazy things were on the table! The next step is to make this a reference, which is highly related to escape analysis. At the risk of anticipating a bit an unfinalized design, here's what's on the table: * Continue an "anything goes" policy for *explicit* pointers, i.e. those written explicitly by user code with stars and stuff.That's a little disappointing. I was hoping for something to fix all holes. I know it isn't easy to design and implement, but once done I firmly believe it would have the potential to completely eliminate the need for explicit memory allocation. For the programmer, it's a good trade: less worrying about what needs to be dynamically allocated and better documented function signatures. Perhaps that would be too much of a departure from C and C++ though.* Disallow pointers in SafeD.Again a consequence of not having a full scoping solution. Couldn't you allow pointers in SafeD, while disallowing taking the address of local variables? This would limit pointers to heap-allocated variables. And disallow pointer arithmetic too.* Make all ref parameters scoped by default. It will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight".If this means you can't implement a swap function for this struct, then I think you're right that it's safe: struct A { ref A a; } void swap(ref A a0, ref A a1); On the other side, if you can implement the swap function, then calling it is unsafe since you can rebind a reference to another without being able to check that their scopes are compatible. 
So basically, references must always be initialized at construction and should be non-rebindable, just like in C++. (Hum, and I should mention I don't much like references in C++.)* Make this a reference so that it obeys what references obey.Ah, so that's why Walter wanted to change that suddenly. This is a good thing by itself, even without correct scoping.* If people want to implement e.g. linked lists, they should do it with classes. Implementing them with structs will require casts to obtain and escape &this. That also means they'd be using pointers, so anything goes - pointers are not restricted from escaping. * There are two cases in which things escape without the user explicitly using pointers: delegates and dynamic arrays initialized from stack-allocated arrays. * For delegates require the scope keyword in the signature of the callee. A scoped delegate cannot be stored, only called or passed down to another function that in turn takes a scoped delegate. This makes scope delegates entirely safe. Non-scoped delegates use dynamic allocation.Again, I'd say that if you can implement a swap function with those scope delegates, it's unsafe. Case in point: void f1(ref scope void delegate() arg) { int i; scope void f2() { ++i; } scope void delegate() inner = &f2; swap(arg, inner); // this should be an error. arg = inner; // this too should be an error. } If you can't rebind the value of a scope delegate pointer, then all is fine.* We don't have an idea for dynamic arrays initialized from stack-allocated arrays.Either disallow it, or keep it as unsafe as pointers (bad for SafeD I expect), or implement a complete scope-checking system (if you do it for arrays, you'll have done it for pointers too). You don't have much choice there, as arrays are pretty much the same thing as pointers.Thoughts? Ideas?I'm under the impression that scope classes could be dangerous in this system: an object reference is not necessarily on the heap. 
Personally, I'd have liked to have a language where you can be completely scope safe, where you could document interfaces so they know the scope they're evolving in. This concept of something in between is a nice attempt at a compromise, but I find it somewhat limiting. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 02 2008
Michel Fortin wrote:On 2008-11-02 10:12:46 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:That's only the half of it. If you want to take a look at a C-like language that is safe, you may want to look at Cyclone. The reality is that making things 100% safe is going to require more or less the moral equivalent of Cyclone's limitations and demands from its user. I think Dan Grossman has done an excellent job making things "as tight as possible but not tighter", so Cyclone is a great yardstick to measure D's tradeoffs against.It looks like things will move that way. Bartosz, Walter and I talked a lot yesterday about it - a lot of crazy things were on the table! The next step is to make this a reference, which is highly related to escape analysis. At the risk of anticipating a bit an unfinalized design, here's what's on the table: * Continue an "anything goes" policy for *explicit* pointers, i.e. those written explicitly by user code with stars and stuff.That's a little disappointing. I was hoping for something to fix all holes. I know it isn't easy to design and implement, but once done I firmly believe it would have the potential to completely eliminate the need for explicit memory allocation. For the programmer, it's a good trade: less worrying about what needs to be dynamically allocated and better documented function signatures. Perhaps that would be too much of a departure from C and C++ though.A "full scoping solution" would impose demands on you that you'd be the first to dislike.* Disallow pointers in SafeD.Again a consequence of not having a full scoping solution.Couldn't you allow pointers in SafeD, while disallowing taking the address of local variables? This would limit pointers to heap-allocated variables. And disallow pointer arithmetic too.I think pointers can be allowed in SafeD under certain restrictions starting with the ones you mention. We'd best start from the safe end.Swap will work fine because ref is not a type constructor. 
Struct A is in error. In fact ref not being a type constructor is much of the beauty of it all.* Make all ref parameters scoped by default. There will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight".If this means you can't implement a swap function for this struct, then I think you're right that it's safe: struct A { ref A a; } void swap(ref A a0, ref A a1); On the other side, if you can implement the swap function, then calling it is unsafe since you can rebind a reference to another without being able to check that their scopes are compatible.So basically, references must always be initialized at construction and should be non-rebindable, just like in C++. (Hum, and I should mention I don't like too much references in C++.)No, C++ references are "almost" type constructors. Also note that rvalues won't bind to any kind of references in D. (More on that later.)Yah, in fact it's pretty amazing it seems to work out so well. We gain a huge guarantee without changing much in the language.* Make this a reference so that it obeys what references obey.Ah, so that's why Walter wanted to change that suddenly. This is a good thing by itself, even without correct scoping.Indeed, rebinding would be disallowed.* If people want to implement e.g. linked lists, they should do it with classes. Implementing them with structs will require casts to obtain and escape &this. That also means they'd be using pointers, so anything goes - pointers are not restricted from escaping. * There are two cases in which things escape without the user explicitly using pointers: delegates and dynamic arrays initialized from stack-allocated arrays. * For delegates require the scope keyword in the signature of the callee. 
A scoped delegate cannot be stored, only called or passed down to another function that in turn takes a scoped delegate. This makes scope delegates entirely safe. Non-scoped delegates use dynamic allocation.Again, I'd say that if you can implement a swap function with those scope delegates, it's unsafe. Case in point:

void f1(ref scope void delegate() arg)
{
    int i;
    scope void f2() { ++i; }
    scope void delegate() inner = &f2;
    swap(arg, inner); // this should be an error.
    arg = inner;      // this too should be an error.
}

If you can't rebind the value of a scope delegate pointer, then all is fine.Exactly. Essentially arrays are as "bad" as structs containing pointers.* We don't have an idea for dynamic arrays initialized from stack-allocated arrays.Either disallow it, keep it as unsafe as pointers (bad for SafeD I expect), or implement a complete scope-checking system (if you do it for arrays, you'll have done it for pointers too). You don't have much choice there, as arrays are pretty much the same thing as pointers.I think a fair move is to do away with scope classes. We can still allow them via systems-level tricks, but not with an innocuous construct that's in fact a weapon of mass destruction.Thoughts? Ideas?I'm under the impression that scope classes could be dangerous in this system: an object reference is not necessarily on the heap.Personally, I'd have liked to have a language where you can be completely scope safe, where you could document interfaces so they know the scope they're evolving in. This concept of something in between is a nice attempt at a compromise, but I find it somewhat limiting.I agree. Again, something like this was on the table:

void wyda(scope T* a, scope U* b) if (scope(a) <= scope(b))
{
    a.field = b;
}

I think it's not hard to appreciate the toll this kind of user-written function summary exacts on the user of the language. Andrei
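As a rough D sketch of the scope-delegate rule described above (the function names here are made up for illustration, and the commented-out line shows what the proposal would reject):

```d
void delegate() saved; // module-level storage

// 'scope' parameter: the callee promises dg will not outlive the call,
// so the caller need not heap-allocate a closure for it.
void each(scope void delegate() dg)
{
    dg();          // calling is fine
    // saved = dg; // proposed: compile error, a scope delegate cannot be stored
}

// No 'scope': dg may escape through 'saved', so any closure
// passed here must be allocated dynamically.
void register(void delegate() dg)
{
    saved = dg;
}
```

Under the proposal, passing a stack-frame closure to each() would cost nothing, while passing one to register() would force the enclosing frame onto the heap.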
Nov 02 2008
On 2008-11-02 19:04:37 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:First, I think it's a pretty good idea to have this. Second, I think it's possible to improve the syntax; there should be a way to not have to worry about the scope rules when you don't want them to bother you. Here's something we could do about it... Add a special keyword (let's call it "autoscope" for now) that you can put at the start of the function, making the compiler automatically derive the least restrictive scope constraints from the function body and apply them to the signature. The restriction is that the source must be available for the compiler to see and there must not be any override based solely on scope constraints. So basically, you could write:

autoscope void wyda(T* a, U* b)
{
    a.field = b;
}

and the compiler would make the signature like your example above. And it'd be a good idea if the compiler could generate correct scoping constraints (without using "autoscope") in an eventual generated .di file to make things faster and not reliant on the code itself. -- Michel Fortin michel.fortin michelf.com http://michelf.com/Personally, I'd have liked to have a language where you can be completely scope safe, where you could document interfaces so they know the scope they're evolving in. This concept of something in between is a nice attempt at a compromise, but I find it somewhat limiting.I agree. Again, something like this was on the table:

void wyda(scope T* a, scope U* b) if (scope(a) <= scope(b))
{
    a.field = b;
}

I think it's not hard to appreciate the toll this kind of user-written function summary exacts on the user of the language.
Nov 02 2008
Michel Fortin wrote:On 2008-11-02 19:04:37 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:[snip] But syntax is so little a part of it. I knew since age immemorial that escape analysis is a bitch. I mean, everybody knows. Every once in a while, I'd get lulled into the belief that things can get "a little pregnant" in a sweet spot where the implementation isn't too hard, limitations aren't too severe, and the language doesn't get too complex. A couple of weeks ago was the (n + 1)th time that that happened; I got encouraged that Walter was willing to tackle the task of writing even a context/flow insensitive escape analyzer, and I also got hope from "scope" being an easy way to express something about a function. Ironically, it was your example that disabused me of my mistaken belief. That leaves me in the position that if someone wants to show me there *is* such a sweet spot, they better come with a very airtight argument. AndreiFirst, I think it's a pretty good idea to have this. Second, I think it's possible to improve the syntax; there should be a way to not have to worry about the scope rules when you don't want them to bother you. Here's something we could do about it...Personally, I'd have liked to have a language where you can be completely scope safe, where you could document interfaces so they know the scope they're evolving in. This concept of something in between is a nice attempt at a compromize, but I find it somewhat limitting.I agree. Again, something like this was on the table: void wyda(scope T* a, scope U* b) if (scope(a) <= scope(b) { a.field = b; } I think it's not hard to appreciate the toll this kind of user-written function summary exacts on the user of the language.
Nov 02 2008
On 2008-11-03 00:39:34 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:But syntax is so little a part of it. I knew since age immemorial that escape analysis is a bitch. I mean, everybody knows. Every once in a while, I'd get lulled into the belief that things can get "a little pregnant" in a sweet spot where the implementation isn't too hard, limitations aren't too severe, and the language doesn't get too complex. A couple of weeks ago was the (n + 1)th time that that happened; I got encouraged that Walter was willing to tackle the task of writing even a context/flow insensitive escape analyzer, and I also got hope from "scope" being an easy way to express something about a function. Ironically, it was your example that disabused me of my mistaken belief.Studying things more in depth often at first leaves you with the impression that things are more complicated than they are. But after some time, you start to see a few common patterns and you can start to simplify and unify the concepts. Who would have thought some centuries ago that you could use the same math formulas to understand how an apple falls from a tree and how the Moon orbits the Earth? Perhaps it's a wise choice to forget about the idea and avoid wasting time on making things more complicated *if* they'll indeed make things more complicated. But right now I have the feeling that you're bailing out after a first try, seeing that things are more complicated than they first looked, without digging further to see if there are common patterns that would allow simplification and unification with other concepts further down the line.That leaves me in the position that if someone wants to show me there *is* such a sweet spot, they better come with a very airtight argument.I believe I have a complete solution by placing the scope annotations on the type, as I will explain below, although I don't have a good syntax for it. 
My solution doesn't revolve around escape analysis but rather around explicit scoping constraints (which could and should be made implicit through escape analysis, but that isn't strictly needed for the scoping system to work). And, as a bonus, it can provide a way for the compiler to completely free the programmer from having to explicitly dynamically allocate things in his program (because all scopes are known at compile time, the compiler can tell what needs to be dynamically allocated and what doesn't need to be). So, are you interested? - - - Personally, I'd implement scoping rules by reusing the framework that was built for const. I'd make scope like const (a type modifier, is it called like that?), but with the additional variation that each scope qualifier could be bound to another variable's scope that would become a child scope (needed for a swap function, for instance). Basically, each pointer or reference in a type can get its own scope qualifier. Scope restrictions work in the reverse direction however: the data you point to imposes scoping restrictions on pointers leading to it, not the other way around like with transitive const. You can have a scope pointer to no-scope data. You can't have a no-scope pointer pointing to scope data. So "scope(char)*" makes little sense, since "char" being scope, the pointer needs to be scope too. This makes more sense: "char scope(*)", a scope pointer to non-scope data. Basically, scope should be more and more restricted while reading a type from left to right, so you could have something like "char scope(* scopeof(x)(*))". There's of course a need for a better syntax than the above. But I think the ugly syntax above conceptualizes pretty well a good solution to the scoping problem that could extend to arrays, structs and classes. We sure should make it prettier, perhaps by imposing restrictions like forcing all pointers in the type to be of the most restrictive scope, which would avoid placing scope annotations everywhere in it. 
But in essence, I think this solution is workable. From there, we can define scope comparisons, and scope restriction checks to apply when assigning variables to one another. Scope restriction checks could allow restriction propagation when doing escape analysis. If you want more details, I can provide them as I've thought about the matter a lot in the last few days. I just don't have the time to write everything about it right now. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
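To make the qualifiers above a bit more concrete, here is how they might read side by side (this is the post's own strawman notation, not valid D syntax):

```d
char c;
int x;

char scope(*) p = &c;       // a scope pointer to non-scope data: allowed
// scope(char)* q = ...;    // a non-scope pointer to scope data: rejected,
                            // since the pointee's scope restricts the pointer
char scope(* scopeof(x)(*)) r; // a pointer whose validity is bound to x's
                               // scope, as a child scope
```

The reading rule being proposed is that the qualifiers must get more restrictive from left to right through the type.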
Nov 03 2008
Michel Fortin wrote:On 2008-11-03 00:39:34 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:[snip]If you want more details, I can provide them as I've thought of the matter a lot in the last few days. I just don't have the time to write about everything about it right now.It may be wise to read some more before writing some more. As far as I understand it, your idea, if taken to completion, is very much like region analysis as defined in Cyclone. http://www.research.att.com/~trevor/papers/pldi2002.pdf Here are some slides: http://www.cs.washington.edu/homes/djg/slides/cyclone_pldi02.ppt My hope was that we can obtain an approximation of that idea by defining only two regions - "inside this function" and "outside this function". It looks like that's not much gain for a lot of pain. So the question is - should we introduce region analysis to D, or not? Andrei
Nov 03 2008
On 2008-11-03 11:21:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Michel Fortin wrote:Pretty interesting slides. Yeah, that looks pretty much like my idea, in concept, where I call regions scopes. But I'd have made things simpler by having only local function regions (on the stack) and the global region (dynamically-allocated garbage-collected heap), which mean you don't need templates at all for dealing with them. I also belive we can completly avoid the use of named regions, such as: { int*`L p; L: { int x; p = x; } } The problem illustrated above, of having a pointer outside the inner braces take the address of a variable inside it, solves itself if you allow a variable's region to be "promoted" automatically to a broader one. For instance, you could write: { int* p; { int x; p = x; } } and p = x would make the compiler automatically extend the life of x up to p's region (local scope), although x wouldn't be accessible outside of the the inner braces other than by dereferencing p. If the pointer was copied outside of the function, then the only available broader region to promote x to would be the heap. I think this should be done automatically, although it could be decided to require dynamic allocation to be explicit too; this is of little importance to the escape analysis and scopre restriction problem.On 2008-11-03 00:39:34 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:[snip]If you want more details, I can provide them as I've thought of the matter a lot in the last few days. I just don't have the time to write about everything about it right now.It may be wise to read some more before writing some more. As far as I understand it, your idea, if taken to completion, is very much like region analysis as defined in Cyclone. 
http://www.research.att.com/~trevor/papers/pldi2002.pdf Here are some slides: http://www.cs.washington.edu/homes/djg/slides/cyclone_pldi02.pptMy hope was that we can obtain an approximation of that idea by defining only two regions - "inside this function" and "outside this function". It looks like that's not much gain for a lot of pain. So the question is - should we introduce region analysis to D, or not?I think we should at least try. I don't think we need everything Cyclone does however; we can and should keep things simpler. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 04 2008
Michel Fortin wrote:On 2008-11-03 11:21:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:I don't understand that part.Michel Fortin wrote:Pretty interesting slides. Yeah, that looks pretty much like my idea, in concept, where I call regions scopes. But I'd have made things simpler by having only local function regions (on the stack) and the global region (dynamically-allocated garbage-collected heap), which mean you don't need templates at all for dealing with them.On 2008-11-03 00:39:34 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:[snip]If you want more details, I can provide them as I've thought of the matter a lot in the last few days. I just don't have the time to write about everything about it right now.It may be wise to read some more before writing some more. As far as I understand it, your idea, if taken to completion, is very much like region analysis as defined in Cyclone. http://www.research.att.com/~trevor/papers/pldi2002.pdf Here are some slides: http://www.cs.washington.edu/homes/djg/slides/cyclone_pldi02.pptI also belive we can completly avoid the use of named regions, such as: { int*`L p; L: { int x; p = x; } } The problem illustrated above, of having a pointer outside the inner braces take the address of a variable inside it, solves itself if you allow a variable's region to be "promoted" automatically to a broader one. For instance, you could write: { int* p; { int x; p = x; } } and p = x would make the compiler automatically extend the life of x up to p's region (local scope), although x wouldn't be accessible outside of the the inner braces other than by dereferencing p.Cyclone has region subtyping which takes care of that.If the pointer was copied outside of the function, then the only available broader region to promote x to would be the heap. 
I think this should be done automatically, although it could be decided to require dynamic allocation to be explicit too; this is of little importance to the escape analysis and scopre restriction problem.I'm not sure how to read this. For what I can tell, Cyclone's region analysis does not introduce undue complexity. It does the minimum necessary to prove that function manipulating pointers are safe. So if you suggest a simpler scheme, then either it is more limiting, less safe, or both. What are the tradeoffs you are thinking about, and how do they compare to Cyclone? AndreiMy hope was that we can obtain an approximation of that idea by defining only two regions - "inside this function" and "outside this function". It looks like that's not much gain for a lot of pain. So the question is - should we introduce region analysis to D, or not?I think we should at least try. I don't think we need everything Cyclone does however; we can and should keep things simpler.
Nov 04 2008
On 2008-11-04 12:36:15 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Indeed, I was somewhat mistaken that the <> notation was templates (seen to much C++ lately), which somewhat confused my analysis for a few things later. And perhaps I should have read a little more about Cyclone before attempting a comparison as it seems I got a few things wrong from the slides.Yeah, that looks pretty much like my idea, in concept, where I call regions scopes. But I'd have made things simpler by having only local function regions (on the stack) and the global region (dynamically-allocated garbage-collected heap), which mean you don't need templates at all for dealing with them.I don't understand that part.I guess I'd have to familiarize myself with Cyclone a little more to be able to do a good comparison. Right now I've just been scratching the surface, but it looks more complicated than what I had in mind for D. I'd tend to believe Cyclone may cover some cases that wouldn't be by mine, but I'm not sure which one and I am currently under the impression that they are not that important (could be handled in other manners). Don't forget that Cyclone is targeted at the C language, which doesn't has templates nor garbage collection (although Cyclone supports an optional garbage collector). Since D has both, it can leverage some of this to simplify things. For instance, because of the garbage collector I don't think we need what Cyclone calls dynamic regions: I'd simply put everything escaping a function on the heap. It then follows that we don't need to propagate region handles. -- Michel Fortin michel.fortin michelf.com http://michelf.com/I'm not sure how to read this. For what I can tell, Cyclone's region analysis does not introduce undue complexity. It does the minimum necessary to prove that function manipulating pointers are safe. So if you suggest a simpler scheme, then either it is more limiting, less safe, or both. 
What are the tradeoffs you are thinking about, and how do they compare to Cyclone?My hope was that we can obtain an approximation of that idea by defining only two regions - "inside this function" and "outside this function". It looks like that's not much gain for a lot of pain. So the question is - should we introduce region analysis to D, or not?I think we should at least try. I don't think we need everything Cyclone does however; we can and should keep things simpler.
Nov 04 2008
On 2008-11-04 12:36:15 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Not the same way as I'm proposing. What Cyclone does is make p undereferenceable outside the scope of L. So if I add an assignment to p outside of L, it won't compile:

{
    int*`L p;
    L: {
        int x;
        p = &x;
    }
    *p = 42; // error, dereferencing p outside of L.
}

What I'm proposing is that such code extends the life of the storage of the local variable x to p's region:

{
    int* p;
    {
        int x;
        p = &x;
    }
    *p = 42; // okay; per assignment to p, x lives up to p's scope.
    x;       // error, x is not accessible in this scope, except through p.
}

Follows that if p is outside of the local function, x needs to be allocated dynamically (just as closures currently do for each variable they use):

void f(ref int* p)
{
    int x;
    p = &x;
}

If you want to make sure x never escapes the memory region associated with its scope, then you can declare x as scope and get a compile-time error when assigning it to p. So, in essence, the system I propose is a little simpler because pointer variables just cannot point to values coming from a region that doesn't exist in the scope where the pointer is declared. The guarantee I propose is that during the whole lifetime of a pointer, it points to either a valid memory region, or null. Cyclone's approach is to forbid you from dereferencing the pointer. Combine this with my proposal to not have dynamic regions and we don't need named regions anymore. Perhaps the syntax could be made simpler with region names, but technically, we don't need them as we can always go the route of saying that a pointer value is "valid within the scope of variable_x". This is what I'm expressing with "scopeof(variable_x)" in my other examples, and I believe it is analogous to the "regions_of(variable_x)" in Cyclone, although Cyclone doesn't use it pervasively. 
-- Michel Fortin michel.fortin michelf.com http://michelf.com/I also belive we can completly avoid the use of named regions, such as: { int*`L p; L: { int x; p = x; } } The problem illustrated above, of having a pointer outside the inner braces take the address of a variable inside it, solves itself if you allow a variable's region to be "promoted" automatically to a broader one. For instance, you could write: { int* p; { int x; p = x; } } and p = x would make the compiler automatically extend the life of x up to p's region (local scope), although x wouldn't be accessible outside of the the inner braces other than by dereferencing p.Cyclone has region subtyping which takes care of that.
Nov 05 2008
Michel Fortin wrote:On 2008-11-04 12:36:15 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Well how about this: int * p; float * q; if (condition) { int x; p = &x; } else { float y; q = &y; } Houston, we have a problem. You can of course patch that little rule in a number of ways, but really at the end of the day what happens only inside a function is uninteresting. The main challenge is making the analysis scalable to multiple functions.Not the same way as I'm proposing. What cyclone does is make p undereferencable outside the scope of L. So if I add an assignment to p outside of L, it won't compile: { int*`L p; L: { int x; p = &x; } *p = 42; // error, dereferencing p outside of L. } What I'm proposing is that such code extends the life of the storage of the local variable x to p's region: { int* p; { int x; p = &x; } *p = 42; // okay; per assignment to p, x lives up to p's scope. x; // error, x is not accessible in this scope, except through p. }I also belive we can completly avoid the use of named regions, such as: { int*`L p; L: { int x; p = x; } } The problem illustrated above, of having a pointer outside the inner braces take the address of a variable inside it, solves itself if you allow a variable's region to be "promoted" automatically to a broader one. For instance, you could write: { int* p; { int x; p = x; } } and p = x would make the compiler automatically extend the life of x up to p's region (local scope), although x wouldn't be accessible outside of the the inner braces other than by dereferencing p.Cyclone has region subtyping which takes care of that.Follows that if p is outside of the local function, x needs to be allocated dynamically (just as closures currently do for each variable they use): void f(ref int* p) { int x; p = &x; }Well this pretty much hamstrings pointers. You can take addresses of things inside a function but you can't pass them around. 
Moreover, people disliked the stealth dynamic allocation when delegates are being used; you are adding more of those.If you want to make sure x never escapes the memory region associated to its scope, then you can declare x as scope and get a compile-time error when assigning it to p. So, in essence, the system I propose is a little simpler because pointer variables just cannot point to values coming from a region that doesn't exist in the scope the pointer is declared. The guaranty I propose is that during the whole lifetime of a pointer, it points to either a valid memory region, or null. Cyclone's approach is to forbid you from dereferencing the pointer. Combine this with my proposal to not have dynamic regions and we don't need named regions anymore. Perhaps the syntax could be made simpler with region names, but technically, we don't need them as we can always go the route of saying that a pointer value is "valid within the scope of variable_x". This is what I'm expressing with "scopeof(variable_x)" in my other examples, and I believe it is analogous to the "regions_of(variable_x)" in Cyclone, although Cyclone doesn't use it pervasively.IMHO this may be made to work. I personally prefer the system in which ref is safe and pointers are permissive. The system you are referring to makes ref and pointer of the same power, so we could as well dispense with either. But I'd be curious what others think of it. Notice how the discussion participants got reduced to you and me, and from what I saw that's not a good sign. Andrei
Nov 06 2008
"Andrei Alexandrescu" wroteIMHO this may be made to work. I personally prefer the system in which ref is safe and pointers are permissive. The system you are referring to makes ref and pointer of the same power, so we could as well dispense with either. But I'd be curious what others think of it. Notice how the discussion participants got reduced to you and me, and from what I saw that's not a good sign.FWIW, I still think the proposal you have put forth about references being the safe type and pointers being permissive is the best one so far. It's clean, doesn't add excessive syntax, and makes good practical sense. I think full scope analysis is an interesting problem to solve, but it may just be an academic exercise, as it would be impractical to develop with. Just MHO. -Steve
Nov 07 2008
On 2008-11-06 23:36:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Well how about this: int * p; float * q; if (condition) { int x; p = &x; } else { float y; q = &y; } Houston, we have a problem.I don't see a problem at all. The compiler would expand the lifetime of x to the outer scope, and do the same for y. Basically, the compiler would make it this way in the compiled code: int * p; float * q; int x; float y; if (condition) { p = &x; } else { q = &y; } A good optimising compiler could also place x and y in a union to save some space.You can of course patch that little rule in a number of ways, but really at the end of the day what happens only inside a function is uninteresting. The main challenge is making the analysis scalable to multiple functions.Indeed. Personally, I take the case above as a simple optimisation to avoid unnecessary dynamic allocation of x and y when you need to extend variable lifetime to a broader scope part of the same function.I'd like to point out that the two things people complained the most about regarding the automatic dynamic allocation for dynamic closures: 1. There is no way to prevent it, to make sure there is no allocation. 2. The compiler does allocate a lot more than necessary. In my proposal, these two points are addressed: 1. You can declare any variable as "scope", preventing it from being placed in a broader scope, preventing at the same time dynamic allocation. 2. The compiler being aware of what arguments do and do not escape the scope of the called functions, it won't allocate unnecessarily. So I think the situation would be much better. 
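Points 1 and 2 above can be sketched against D2's existing closure behavior; the `scope` annotation on a local is the proposed part, not current D (sketch only):

```d
int delegate() make()
{
    int x = 1;              // captured by the returned closure, so the
    return () { return x; }; // enclosing frame is heap-allocated today
}

void noAlloc()
{
    scope int y = 2;        // proposed: y may never leave this scope...
    // auto dg = () { return y; };
    // return dg;           // ...so letting a closure over y escape would be
                            // an error, and no hidden allocation could occur
}
```

The point of the sketch: with the annotation, "did this line allocate?" becomes answerable from the signature and declarations alone, addressing both complaints about stealth closure allocation.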
But all this is orthogonal to whether we have an escape analysis system, as we could choose the reverse convention: no variable can escape its scope unless explicitly authorized by some new syntactic construct.Follows that if p is outside of the local function, x needs to be allocated dynamically (just as closures currently do for each variable they use):

void f(ref int* p)
{
    int x;
    p = &x;
}

Well this pretty much hamstrings pointers. You can take addresses of things inside a function but you can't pass them around. 
Moreover, people disliked the stealth dynamic allocation when delegates are being used; you are adding more of those.I'm not too thrilled by references. I once got a question from someone coming from C: what is the difference between a pointer and a reference in C++? I had to answer: references are pointers with a different syntax, no rebindability, and no possibility of being null. It seems he and I both agree that references are mostly a cosmetic patch to solve a syntactic problem. References in D aren't much different. If we could have a unified syntax for pointers of all kinds, I think it'd be more convenient than having two kinds of pointers. A null-forbidding but rebindable pointer would be more useful in my opinion than the current reference concept.
Perhaps the syntax could be made simpler with region names, but technically, we don't need them as we can always go the route of saying that a pointer value is "valid within the scope of variable_x". This is what I'm expressing with "scopeof(variable_x)" in my other examples, and I believe it is analogous to the "regions_of(variable_x)" in Cyclone, although Cyclone doesn't use it pervasively.IMHO this may be made to work. I personally prefer the system in which ref is safe and pointers are permissive. The system you are referring to makes ref and pointer of the same power, so we could as well dispense with either.But I'd be curious what others think of it. Notice how the discussion participants got reduced to you and me, and from what I saw that's not a good sign.Indeed. I'm interested in other opinions too. But I'm under the impression that many lost track of what was being discussed, especially since we started referring to Cyclone which few are familiar with and probably few have read the paper. One of the fears expressed at the start of the thread was about excessive need for annotation, but as the Cyclone paper say, with good defaults, you need to add scoping annotation only to a few specific places. (It took me some time to read the paper and start discussing things sanely after that, remember?) So perhaps we could get more people involved if we could propose a tangible syntax for it. Or perhaps not; for advanced programmers who already understand well what can and cannot be done by passing pointers around, full escape analysis may not seem to be a so interesting gain since they've already adopted the right conventions to avoid most bugs it would prevent. And most people here who can discuss this topic with some confidence are not newbies to programming and don't make too much mistakes of the sort anymore. Which makes me think of beginners saying pointers are hard. 
You've certainly seen beginners struggle as they learn how to correctly use pointers in C or C++. Making sure their program fails at compile time, with an explanatory error message as to why they mustn't do this or that, is certainly going to help their experience learning the language more than cryptic and frustrating segfaults and access violations at runtime, sometimes far from the source of the problem. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 09 2008
Michel Fortin wrote:On 2008-11-06 23:36:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:In point of fact, it's expensive to extend the stack, so any compiler would do that, even without escape analysis. On the other hand, what about nested functions? I don't think they'd cause any trouble, but I'm not certain.Well how about this: int * p; float * q; if (condition) { int x; p = &x; } else { float y; q = &y; } Houston, we have a problem.I don't see a problem at all. The compiler would expand the lifetime of x to the outer scope, and do the same for y. Basically, the compiler would make it this way in the compiled code: int * p; float * q; int x; float y; if (condition) { p = &x; } else { q = &y; }
Nov 09 2008
On 2008-11-09 08:59:18 -0500, Christopher Wright <dhasenan gmail.com> said:Michel Fortin wrote:Indeed.I don't see a problem at all. The compiler would expand the lifetime of x to the outer scope, and do the same for y. Basically, the compiler would make it this way in the compiled code: int * p; float * q; int x; float y; if (condition) { p = &x; } else { q = &y; }In point of fact, it's expensive to extend the stack, so any compiler would do that, even without escape analysis.On the other hand, what about nested functions? I don't think they'd cause any trouble, but I'm not certain.If you mean there could be a problem with functions referring to the pointer, I'd say that with properly propagated escape constrains, it's safe. But it's an interesting case nonetheless. Consider this: int * p; if (condition) { int x; p = &x; } else { int y; p = &y; } int f() { return *p; } return &f; Now returning &f forces p to dynamically allocate on the heap, which puts a constrain on p forcing it to point only to variables on the heap, which in turn forces x and y to be allocated on the heap. I haven't verified, but I'm pretty certain this doesn't work correctly with the current dynamic closures in D2 however (because escape analysis doesn't see through pointers). Also, if you made p point to a value it received in argument, and the scope of that argument isn't the global scope, it'd be an error. For instance, this wouldn't work: int delegate() foo1(int* arg) { int f() { return *arg; } return &f; // error, returned closure may live longer than *arg; need constraint } Constraining the lifetime of the returned value to be no longer than the one of the argument would allow it to work safely (disregard the bizarre syntax for expressing the constrain on the delegate): int delegate(arg)() foo2(int* arg) { int f() { return *arg; } return &f; // ok, returned closure lifetime guarantied to be // at most as long as the lifetime of *arg. 
} int globalInt; int delegate() globalDelegate; void bar() { int localInt; int delegate() localDelegate; globalDelegate = foo2(&globalInt); // ok, same lifetime localDelegate = foo2(&globalInt); // ok, delegate lifetime shorter localDelegate = foo2(&localInt); // ok, same lifetime globalDelegate = foo2(&localInt); // ok, but forces bar to allocate localInt on the heap since otherwise // localInt lifetime would be shorter than lifetime of the delegate } Note that what I want to demonstrate is that the compiler can see pretty clearly what needs and what doesn't need to be allocated on the heap to guarantee safety. Whether we decide it allocates automatically or generates an error is of lesser concern to me. (And I'll add that some other issues with templates may make this automatic allocation scheme unworkable.) -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 14 2008
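For readers following along, the behavior Michel describes — a captured local escaping through a delegate, forcing the frame onto the heap — is exactly what D2's dynamic closures do today. A minimal runnable sketch (the function name `make` is invented for illustration):

```d
import std.stdio;

// When a delegate that captures a local escapes the function, the D2
// compiler heap-allocates the enclosing stack frame so the capture
// stays valid after the function returns.
int delegate() make()
{
    int x = 41;                          // captured by the literal below
    return delegate() { return x + 1; }; // escapes: frame moves to the heap
}

void main()
{
    auto dg = make();  // make()'s activation is gone, but x survives
    writeln(dg());     // prints 42
}
```

This is the "automatic allocation" side of the trade-off: safe, but with a hidden allocation that the rest of the thread debates how to control.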
Michel Fortin wrote:I'd like to point out that the two things people complained the most about regarding the automatic dynamic allocation for dynamic closures: 1. There is no way to prevent it, to make sure there is no allocation. 2. The compiler does allocate a lot more than necessary. In my proposal, these two points are addressed: 1. You can declare any variable as "scope", preventing it from being placed in a broader scope, preventing at the same time dynamic allocation. 2. The compiler being aware of what arguments do and do not escape the scope of the called functions, it won't allocate unnecessarily. So I think the situation would be much better.I agree that an escape analyzer would improve things. I am not sure that one oblivious to regions is expressive enough.But all this is orthogonal to having or not an escape analysis system, as we could choose the reverse conventions: no variable can escape its scope unless explicitly authorized by some new syntactic construct.It's not orthogonal. Whatever the default is, you must be able to enforce escaping rules, otherwise the system would be as good as a convention.I disagree. References in D are very different. They are not type constructors. They are storage classes that can only be used in function signatures, which makes them impossible to dangle. I think C++ references would also have been much better off as storage classes instead of half-life types.I'm not too thrilled by references. I once got a question from someone coming from C: what is the difference between a pointer and a reference in C++? I had to answer: references are pointers with a different syntax, no rebindability, and no possibility of being null. It seems he and I both agree that references are mostly a cosmetic patch to solve a syntactic problem. 
References in D aren't much different.If you want to make sure x never escapes the memory region associated with its scope, then you can declare x as scope and get a compile-time error when assigning it to p. So, in essence, the system I propose is a little simpler because pointer variables just cannot point to values coming from a region that doesn't exist in the scope where the pointer is declared. The guarantee I propose is that during the whole lifetime of a pointer, it points to either a valid memory region, or null. Cyclone's approach is to forbid you from dereferencing the pointer. Combine this with my proposal to not have dynamic regions and we don't need named regions anymore. Perhaps the syntax could be made simpler with region names, but technically, we don't need them as we can always go the route of saying that a pointer value is "valid within the scope of variable_x". This is what I'm expressing with "scopeof(variable_x)" in my other examples, and I believe it is analogous to the "regions_of(variable_x)" in Cyclone, although Cyclone doesn't use it pervasively.IMHO this may be made to work. I personally prefer the system in which ref is safe and pointers are permissive. The system you are referring to makes ref and pointers of the same power, so we could as well dispense with either.If we could have a unified syntax for pointers of all kinds, I think it'd be more convenient than having two kinds of pointers. A null-forbidding but rebindable pointer would be more useful in my opinion than the current reference concept.Well ref means "This function wants to modify its argument". That is a very different charter from what pointers mean. So I'm not sure how you say you'd much prefer this to that. They are not comparable.In my experience, when someone is interested in something, she'd make time for it. So I take that as lack of interest. And hey, since when was lack of expertise a real deterrent? :o)But I'd be curious what others think of it. 
Notice how the discussion participants got reduced to you and me, and from what I saw that's not a good sign.Indeed. I'm interested in other opinions too. But I'm under the impression that many lost track of what was being discussed, especially since we started referring to Cyclone which few are familiar with and probably few have read the paper.One of the fears expressed at the start of the thread was about excessive need for annotation, but as the Cyclone paper says, with good defaults, you need to add scoping annotation only to a few specific places. (It took me some time to read the paper and start discussing things sanely after that, remember?) So perhaps we could get more people involved if we could propose a tangible syntax for it.To be very frank, I think we are very far from having an actual proposal, and syntax is of very low priority now if you want to put one together. Right now what we have is a few vague ideas and conjectures (e.g., there's no need for named regions because the need would be rare enough to require dynamic allocation for those cases). I'm not saying that to criticize, but merely to underline the difficulties.Or perhaps not; for advanced programmers who already understand well what can and cannot be done by passing pointers around, full escape analysis may not seem to be such an interesting gain since they've already adopted the right conventions to avoid most bugs it would prevent. And most people here who can discuss this topic with some confidence are not newbies to programming and don't make too many mistakes of the sort anymore. Which makes me think of beginners saying pointers are hard. You've certainly seen beginners struggle as they learn how to correctly use pointers in C or C++. 
Making sure their programs fail at compile-time, with an explanatory error message as to why they mustn't do this or that, is certainly going to help their experience learning the language more than cryptic and frustrating segfaults and access violations at runtime, sometimes far from the source of the problem.I totally agree that pointers are hard and good static checking for them would help. Currently, what we try to do is obviate the need for pointers in most cases, and to actually forbid them in safe modules. The question that remains is, how many unsafe modules are necessary, and what liability do they entail? If there are few and not too unwieldy, maybe we can declare victory without constructing an escape analyzer. I agree if you or anyone says they don't think so. At this point, I am not sure, but what I can say is that it's good to reduce the need for pointers regardless. Andrei
Nov 09 2008
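As a concrete counterpart to point 1 in Michel's list, D2 already lets a callee promise that a delegate parameter won't escape, via the `scope` storage class; with that promise in the signature, the caller need not heap-allocate the closure's frame. A hedged sketch (the function name and numbers are invented):

```d
// 'scope' on the delegate parameter asserts that sumOver will not store
// dg anywhere that outlives the call, so no closure allocation is needed
// for the caller's captured variables.
int sumOver(scope int delegate(int) dg)
{
    int s = 0;
    foreach (i; 0 .. 4)
        s += dg(i);
    return s;
}

void main()
{
    int bias = 10;
    // The literal captures 'bias', but because the parameter is scope,
    // the frame can stay on the stack.
    assert(sumOver(i => i + bias) == 46); // (0+1+2+3) + 4*10
}
```

This is the narrow, already-shipping version of "no escape by declaration"; the thread is about generalizing that guarantee to pointers and arrays.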
On 2008-11-09 10:10:03 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Michel Fortin wrote:If you think I proposed a region-oblivious scheme, then you've got me wrong (and perhaps it's my fault for not explaining well enough). Let me explain again, and I'll try to not skip anything this time. Cyclone has dynamic regions, regions which are allocated on the heap but that are deleted at the end of the scope that created them. Basically, those are scoped heaps offering a very useful system to automatically free memory. (It's somewhat similar in concept to Cocoa's NSAutoReleasePool for instance.) The downside of them is that you need to pass region handles around (so that called functions can allocate objects within them). So my first point is that since we have a garbage collector in D, and moreover since we're likely to get one heap per thread in D2, we don't need dynamic regions. The remaining regions are: 1) the shared heap, 2) the thread-local heap, 3) All the stack frames; and you can't allocate other stack frames than the current one. Because none of these regions require a handle to allocate into, we (A) don't need region handles. We still have many regions. Besides the two heaps (shared, thread-local), each function's stack frame, and each block within them, creates a distinct memory region. But nowhere do we need to know exactly which region a function parameter comes from; what we need to know is which address outlives which pointer, and then we can forbid assigning addresses to pointers that outlive them. All we need is a relative ordering of the various regions, and for that we don't need to attach *names* to the regions so that you can refer explicitly to them in the syntax. Instead, you could say something like "region of (x)", or "region of (*y)" and that would be enough. So there is still a region for every pointer, only regions don't need to be *named* because you can always refer to them by referring to the variables. 
(And perhaps the syntax would be clearer with region names than without, in which case I don't mind if we use them. But they're not required for the concept to work.)I'd like to point out that the two things people complained the most about regarding the automatic dynamic allocation for dynamic closures: 1. There is no way to prevent it, to make sure there is no allocation. 2. The compiler does allocate a lot more than necessary. In my proposal, these two points are addressed: 1. You can declare any variable as "scope", preventing it from being placed in a broader scope, preventing at the same time dynamic allocation. 2. The compiler being aware of what arguments do and do not escape the scope of the called functions, it won't allocate unnecessarily. So I think the situation would be much better.I agree that an escape analyzer would improve things. I am not sure that one oblivious to regions is expressive enough.Which makes me think of this: struct A { int i; this(); } ref A foo(ref A a) { return a; } ref A bar() { foo(A()).i = 1; ref A a = foo(A()); // illegal, ref cannot be used outside function signature a.i = 1; return foo(A()); // illegal ? } Also, I'd like to point out that ref (and out) being storage classes somewhat hinders me from using them where it makes sense in the D/Objective-C bridge, since there, most functions are instantiated by templates where template arguments give the type of each function argument. Perhaps there should be a way to specify "ref" and "out" in template arguments...I'm not too thrilled by references. I once got a question from someone coming from C: what is the difference between a pointer and a reference in C++? I had to answer: references are pointers with a different syntax, no rebindability, and no possibility of being null. It seems he and I both agree that references are mostly a cosmetic patch to solve a syntactic problem. References in D aren't much different.I disagree. References in D are very different. 
They are not type constructors. They are storage classes that can only be used in function signatures, which makes them impossible to dangle. I think C++ references would also have been much better off as storage classes instead of half-life types.I was under the impression that ref would be allowed as a storage class for local variables. I'll say it's perfectly acceptable for function arguments, but I'm less sure about function return types. Also, I'd still like to have a non-null pointer type, especially for clarifying function signatures. A template can do it. If it were in the language, however, it would be used by more people, which would be better.If we could have a unified syntax for pointers of all kinds, I think it'd be more convenient than having two kinds of pointers. A null-forbidding but rebindable pointer would be more useful in my opinion than the current reference concept.Well ref means "This function wants to modify its argument". That is a very different charter from what pointers mean. So I'm not sure how you say you'd much prefer this to that. They are not comparable.As I said below, I think many people in this group are already comfortable with using pointers, which may explain why they're not so interested. Having no one interested in something doesn't necessarily mean they won't appreciate it when it comes. It does, however, reduce the incentive to continue forward. So I understand why you're backing off, even if it displeases me somewhat.In my experience, when someone is interested in something, she'd make time for it. So I take that as lack of interest. And hey, since when was lack of expertise a real deterrent? :o)But I'd be curious what others think of it. Notice how the discussion participants got reduced to you and me, and from what I saw that's not a good sign.Indeed. I'm interested in other opinions too. 
But I'm under the impression that many lost track of what was being discussed, especially since we started referring to Cyclone which few are familiar with and probably few have read the paper.I never said the need for dynamic regions would be rare: I said the garbage collector obsoletes them. If we can justify the need for dynamic regions later, we can add them back (with all the added complexity it requires) but I'd try without them first.One of the fears expressed at the start of the thread was about excessive need for annotation, but as the Cyclone paper says, with good defaults, you need to add scoping annotation only to a few specific places. (It took me some time to read the paper and start discussing things sanely after that, remember?) So perhaps we could get more people involved if we could propose a tangible syntax for it.To be very frank, I think we are very far from having an actual proposal, and syntax is of very low priority now if you want to put one together. Right now what we have is a few vague ideas and conjectures (e.g., there's no need for named regions because the need would be rare enough to require dynamic allocation for those cases). I'm not saying that to criticize, but merely to underline the difficulties.But dynamic arrays *are* pointers, how are you obviating the need for them? If you find a solution for dynamic arrays, you'll have a solution for pointers too. You could forbid dynamic arrays from referring to stack-allocated static ones, or automatically dynamically allocate those when they escape in a dynamic array. And if I were you, whatever you choose for arrays I'd allow it for pointers too, to keep things consistent. 
Pointers to heap objects should be retained in my opinion.Or perhaps not; for advanced programmers who already understand well what can and cannot be done by passing pointers around, full escape analysis may not seem to be such an interesting gain since they've already adopted the right conventions to avoid most bugs it would prevent. And most people here who can discuss this topic with some confidence are not newbies to programming and don't make too many mistakes of the sort anymore. Which makes me think of beginners saying pointers are hard. You've certainly seen beginners struggle as they learn how to correctly use pointers in C or C++. Making sure their programs fail at compile-time, with an explanatory error message as to why they mustn't do this or that, is certainly going to help their experience learning the language more than cryptic and frustrating segfaults and access violations at runtime, sometimes far from the source of the problem.I totally agree that pointers are hard and good static checking for them would help. Currently, what we try to do is obviate the need for pointers in most cases, and to actually forbid them in safe modules.The question that remains is, how many unsafe modules are necessary, and what liability do they entail? If there are few and not too unwieldy, maybe we can declare victory without constructing an escape analyzer. I agree if you or anyone says they don't think so. At this point, I am not sure, but what I can say is that it's good to reduce the need for pointers regardless.But are you reducing the need for pointers or hiding and restricting them? I'd say the latter. References are pointers with restrictions. Object references are no different from pointers except in syntax (they can even point to stack allocated objects with scope classes). Dynamic arrays are pointers with a certain range. Closures have a pointer to a stack frame, which can be heap-allocated or not. 
The only way to have a safe system without escape analysis is to force everything they can point to to be on the heap, or prevent them from escaping at all (as with ref). I wish there could be some consistency here. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 12 2008
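Michel's observation that dynamic arrays are just bounded pointers is easy to demonstrate: a slice of a stack-allocated static array carries the frame's address, so letting it escape is the same hazard as returning &local. A small sketch of the safe alternative (function name and data are invented):

```d
// A D slice is a (pointer, length) pair. Slicing the static array 'buf'
// yields a view into this function's stack frame; returning that view
// directly would dangle, so we copy the data to the GC heap with .dup.
int[] safeEscape()
{
    int[3] buf = [1, 2, 3];  // stack-allocated static array
    // return buf[];         // would escape a reference to the dead frame
    return buf[].dup;        // heap copy outlives the frame
}

void main()
{
    assert(safeEscape() == [1, 2, 3]);
}
```

Whether the language should reject the commented-out line, or silently insert the `.dup` itself, is precisely the automatic-allocation-versus-error question raised earlier in the thread.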
Michel Fortin wrote:On 2008-11-09 10:10:03 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said: So my first point is that since we have a garbage collector in D, and moreover since we're likely to get one heap per thread in D2, we don't need dynamic regions. The remaining regions are: 1) the shared heap, 2) the thread-local heap, 3) All the stack frames; and you can't allocate other stack frames than the current one. Because none of these regions require a handle to allocate into, we (A) don't need region handles. We still have many regions. Besides the two heaps (shared, thread-local), each function's stack frame, and each block within them, creates a distinct memory region. But nowhere do we need to know exactly which region a function parameter comes from; what we need to know is which address outlives which pointer, and then we can forbid assigning addresses to pointers that outlive them. All we need is a relative ordering of the various regions, and for that we don't need to attach *names* to the regions so that you can refer explicitly to them in the syntax. Instead, you could say something like "region of (x)", or "region of (*y)" and that would be enough.But how do you then type the assignment example? void assign(int** p, int * r) { *p = r; } How do you reflect the requirement that r's region outlives *p's region? But that's not even the point. Say you define some notation, such as: void assign(int** p, int * r) if (region(r) <= region(p)); But the whole point of regions was to _simplify_ notations like the above into: void assign(region R)(int*R* p, int *R r); So although you think you simplified things by using region(symbol) instead of symbolic names, you complicated things. The compiler still needs to infer regions for each value, so it is as complicated as a named-regions compiler, and in addition you require the user to write bulkier expressions because you disallow use of symbols. So everybody is worse off. 
Note how in the example using a symbolic region the outlives relationship is enforced implicitly by using the same symbol name in two places. I suspect there are things you can't even express without symbolic regions. Consider this example from Dan's slides: struct ILst(region R1, region R2) { int *R1 hd; ILst!(R1, R2) *R2 tl; } This code reflects the fact that the list holds pointers to integers in one region, whereas the nodes themselves are in a different region. It would be a serious challenge to tackle that without symbolic regions, and simpler that won't be for anybody. I'll insert a few more points below in this sprawling discussion.Which makes me think of this: struct A { int i; this(); } ref A foo(ref A a) { return a; } ref A bar() { foo(A()).i = 1; ref A a = foo(A()); // illegal, ref cannot be used outside function signature a.i = 1; return foo(A()); // illegal ? }foo(A()) is illegal because ref does not bind to an rvalue.Also, I'd like to point out that ref (and out) being storage classes somewhat hinders me from using them where it makes sense in the D/Objective-C bridge, since there, most functions are instantiated by templates where template arguments give the type of each function argument. Perhaps there should be a way to specify "ref" and "out" in template arguments...I agree. Something like that is on the list.As of now, ref is not planned for local variables.I was under the impression that ref would be allowed as a storage class for local variables. I'll say it's perfectly acceptable for function arguments, but I'm less sure about function return types.If we could have a unified syntax for pointers of all kinds, I think it'd be more convenient than having two kinds of pointers. A null-forbidding but rebindable pointer would be more useful in my opinion than the current reference concept.Well ref means "This function wants to modify its argument". That is a very different charter from what pointers mean. 
So I'm not sure how you say you'd much prefer this to that. They are not comparable.Also, I'd still like to have a non-null pointer type, especially for clarifying function signatures. A template can do it. If it were in the language, however, it would be used by more people, which would be better.I don't grok this notion "if it's in the language it would be used by more people". How does that come about? Does it mean templates are at such a high syntactic disadvantage? Maybe we should do something about that then, such as replacing !() with something else :o). If we put it in phobos (which after integration will be usable alongside tango) could it count as being in the language?That I totally agree with. It's happened a couple of times with D features.As I said below, I think many people in this group are already comfortable with using pointers, which may explain why they're not so interested. Having no one interested in something doesn't necessarily mean they won't appreciate it when it comes.In my experience, when someone is interested in something, she'd make time for it. So I take that as lack of interest. And hey, since when was lack of expertise a real deterrent? :o)But I'd be curious what others think of it. Notice how the discussion participants got reduced to you and me, and from what I saw that's not a good sign.Indeed. I'm interested in other opinions too. But I'm under the impression that many lost track of what was being discussed, especially since we started referring to Cyclone which few are familiar with and probably few have read the paper.It does, however, reduce the incentive to continue forward. So I understand why you're backing off, even if it displeases me somewhat.I'm sorry about how you feel. Now we're in a conundrum of sorts. You seem to strongly believe you can make some nice simplified regions work, and make people like them. Taking that to a proof is hard. 
The conundrum is, you are facing the prospect of putting work into it and creating a system that, albeit correct, is not enticing.Let's not forget that symbolic regions (for typing purposes) should not be confused with dynamic regions (for efficiency purposes). I agree we can do away with the latter and put them in later if we care. I disagree that dropping symbolic regions simplifies things.I never said the need for dynamic regions would be rare: I said the garbage collector obsoletes them. If we can justify the need for dynamic regions later, we can add them back (with all the added complexity it requires) but I'd try without them first.One of the fears expressed at the start of the thread was about excessive need for annotation, but as the Cyclone paper says, with good defaults, you need to add scoping annotation only to a few specific places. (It took me some time to read the paper and start discussing things sanely after that, remember?) So perhaps we could get more people involved if we could propose a tangible syntax for it.To be very frank, I think we are very far from having an actual proposal, and syntax is of very low priority now if you want to put one together. Right now what we have is a few vague ideas and conjectures (e.g., there's no need for named regions because the need would be rare enough to require dynamic allocation for those cases). I'm not saying that to criticize, but merely to underline the difficulties.But a possible path is to make arrays safe and leave pointers for those cases in which efficiency is of utmost importance. With luck, those cases are rare.But dynamic arrays *are* pointers, how are you obviating the need for them? If you find a solution for dynamic arrays, you'll have a solution for pointers too. You could forbid dynamic arrays from referring to stack-allocated static ones, or automatically dynamically allocate those when they escape in a dynamic array. 
And if I were you, whatever you choose for arrays I'd allow it for pointers too, to keep things consistent. Pointers to heap objects should be retained in my opinion.Or perhaps not; for advanced programmers who already understand well what can and cannot be done by passing pointers around, full escape analysis may not seem to be such an interesting gain since they've already adopted the right conventions to avoid most bugs it would prevent. And most people here who can discuss this topic with some confidence are not newbies to programming and don't make too many mistakes of the sort anymore. Which makes me think of beginners saying pointers are hard. You've certainly seen beginners struggle as they learn how to correctly use pointers in C or C++. Making sure their programs fail at compile-time, with an explanatory error message as to why they mustn't do this or that, is certainly going to help their experience learning the language more than cryptic and frustrating segfaults and access violations at runtime, sometimes far from the source of the problem.I totally agree that pointers are hard and good static checking for them would help. Currently, what we try to do is obviate the need for pointers in most cases, and to actually forbid them in safe modules.The question that remains is, how many unsafe modules are necessary, and what liability do they entail? If there are few and not too unwieldy, maybe we can declare victory without constructing an escape analyzer. I agree if you or anyone says they don't think so. At this point, I am not sure, but what I can say is that it's good to reduce the need for pointers regardless.But are you reducing the need for pointers or hiding and restricting them?Of course - that's the whole point. In fact, I'll insert a small correction: we are reducing the need for pointers BY hiding and restricting them. And that's a good thing. If you can do most of your work with restricted pointers (e.g. ref), then that's a net win. Andrei
Nov 12 2008
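Andrei's "restricted pointers (e.g. ref)" point can be made concrete: ref grants mutation of the caller's variable while never exposing an address value that user code could store past the call. A minimal sketch:

```d
// 'ref' lets swap modify the caller's variables directly, but since no
// pointer value ever appears in user code, nothing can be stashed away
// to outlive the call -- escape is ruled out by construction.
void swap(ref int a, ref int b)
{
    immutable t = a;
    a = b;
    b = t;
}

void main()
{
    int x = 1, y = 2;
    swap(x, y);
    assert(x == 2 && y == 1);
}
```

The same operation written with raw pointers would reintroduce exactly the escape question this thread is about, which is why ref counts as the "safe" end of the spectrum.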
Andrei Alexandrescu Wrote:But how do you then type the assignment example? void assign(int** p, int * r) { *p = r; } How do you reflect the requirement that r's region outlives *p's region? But that's not even the point. Say you define some notation, such as: void assign(int** p, int * r) if (region(r) <= region(p)); But the whole point of regions was to _simplify_ notations like the above into: void assign(region R)(int*R* p, int *R r); So although you think you simplified things by using region(symbol) instead of symbolic names, you complicated things. The compiler still needs to infer regions for each value, so it is as complicated as a named-regions compiler, and in addition you require the user to write bulkier expressions because you disallow use of symbols. So everybody is worse off. Note how in the example using a symbolic region the outlives relationship is enforced implicitly by using the same symbol name in two places.Examples such as this one are rare enough that requiring explicit annotations for them is acceptable. I was under the impression that D was supposed to promote the use of references over pointers. People working with low-level code will probably either appreciate the optimization and correctness checking, or request a way to turn off compiler enforcement of scoping in low-level code fragments.I suspect there are things you can't even express without symbolic regions. Consider this example from Dan's slides: struct ILst(region R1, region R2) { int *R1 hd; ILst!(R1, R2) *R2 tl; } This code reflects the fact that the list holds pointers to integers in one region, whereas the nodes themselves are in a different region. It would be a serious challenge to tackle that without symbolic regions, and simpler that won't be for anybody.Transitive scope ownership ensures that a member of a structure outlives the structure itself. In which case we can create a list in a local scope, and either add objects allocated in that scope or any parent scope or the heap. 
Referencing objects from child scopes would be incorrect and I don't think it's unreasonable to expect the programmer to code around such a desire. foo*R*Q x, if (R in Q) is illegal, because it could produce a dangling reference. foo*R*Q x, if (Q in R) is equivalent to foo*Q*Q, for the purpose of: *x = y; where y is one of foo*R, foo*Q or foo*global. A problem arises for other operations though: foo*R*Q might have different semantics than foo*Q*Q when being on the right-hand side of the assignment. y = *x; is legal for foo*R y, but not for foo*Q y. Therefore, while the lifetime must always stay constant or be reduced towards the right side of the type declaration, it's necessary to be able to explicitly relax restrictions towards the left. The problem is that the type syntax is suited for scope relaxation rules to be transitive, not scope restriction. I.e. global(foo*)* makes sense, when * is scoped by default, but scope(foo*)* doesn't make sense, when * is global by default. So we could either implement it with regions, which I'm not a big fan of (better than nothing though!); or ditch "scope" (as a restriction) in favor of "global" and maybe "scopeof()" (as a relaxation). Hopefully soon D2 and the book will be done and the development of D3 can start, and such a breaking change can be introduced.But a possible path is to make arrays safe and leave pointers for those cases in which efficiency is of utmost importance. With luck, those cases are rare.Safe, sure, but not by forbidding the use of stack arrays.
Nov 12 2008
On 2008-11-12 10:02:02 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Michel Fortin wrote:Everywhere I said there was no need for named regions, I also said named regions could be kept to ease the syntax. That said, I'm not so sure named regions are that good at simplifying the syntax. In your assign example above, the named-region version has an error: it forces the two pointers to be of the same region. That could be fine, but, assuming you're assigning to *p, it'd be more precise to write it like that: void assign(region R1, region R2)(int*R1* p, int*R2 r) if (R1 <= R2); Once we get there, I think the no-named-region syntax is better. That said, for the swap example, where both values need to share the same region, the named region notation is simpler: void swap(region R)(int*R a, int*R b); void swap(int* a, int* b) if (region(a) == region(b)); But I'd argue that most of the time regions do not need to be equal, but are subsets or supersets of each other, so reusing variable names makes more sense in my opinion. In any case, I prefer a notation where region constraints are attached directly to the type instead of being expressed somewhere else. Something like this (explained below): void assign(int*(r)* p, int* r) { *p = r; } void swap(ref int*(b) a, ref int*(a) b); Here, a parenthesis suffix after a pointer indicates the region constraint of the pointer, based on the region of another pointer. In the first example, int*(r)* means that the integer pointer "*p" must not live beyond the value pointed to by "r" (because we're going to assign "r" to "*p"). In the second example, the value pointed to by "a" must not live longer than the one pointed to by "b" and the value pointed to by "b" must not live longer than the one pointed to by "a"; the net result is that they must have the same lifetime and need to be in the same region. 
For something more complicated, you could give multiple comma-separated constraints: void choose(ref int*(a,b) result, int* a, int* b) { result = rand() > 0.5 ? a : b; }On 2008-11-09 10:10:03 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said: So my first point is that since we have a garbage collector in D, and moreover since we're likely to get one heap per thread in D2, we don't need dynamic regions. The remaining regions are: 1) the shared heap, 2) the thread-local heap, 3) All the stack frames; and you can't allocate other stack frames than the current one. Because none of these regions require a handle to allocate into, we (A) don't need region handles. We still have many regions. Beside the two heaps (shared, thread-local), each function's stack frame, and each block within them, creates a distinct memory region. But nowhere we need to know exactly which region a function parameter comes from; what we need to know is which address outlives which pointer, and then we can forbid assigning addresses to pointers that outlive them. All we need is a relative ordering of the various regions, and for that we don't need to attach *names* to the regions so that you can refer explicitly to them in the syntax. Instead, you could say something like "region of (x)", or "region of (*y)" and that would be enough.But how do you type then the assignment example? void assign(int** p, int * r) { *p = r; } How do you reflect the requirement that r's region outlives *p's region? But that's not even the point. Say you define some notation, such as: void assign(int** p, int * r) if (region(r) <= region(p)); But the whole point of regions was to _simplify_ notations like the above into: void assign(region R)(int*R* p, int *R r); So although you think you simplified things by using region(symbol) instead of symbolic names, you complicated things. 
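The requirement that r's region outlive *p's region is precisely what rules out the classic dangling-pointer bug. A minimal sketch (plain pointer code; the comment marks where either notation would reject the call):

```d
void assign(int** p, int* r) { *p = r; }

void caller()
{
    int* q;
    {
        int x = 42;
        assign(&q, &x); // region of x (inner block) does not outlive
                        // region of *q (caller's frame): must be rejected
    }
    // q now points into a dead part of the stack
}
```

Either notation makes this call ill-typed, which is the whole point of annotating assign at all.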
The compiler still needs to infer regions for each value, so it is as complicated as a named-regions compiler, and in addition you require the user to write bulkier expressions because you disallow use of symbols. So everybody is worse off. Note how in the example using a symbolic region the outlives relationship is enforced implicitly by using the same symbol name in two places.I suspect there are things you can't even express without symbolic regions. Consider this example from Dan's slides: struct ILst(region R1, region R2) { int *R1 hd; ILst!(R1, R2) *R2 tl; } This code reflects the fact that the list holds pointers to integers in one region, whereas the nodes themselves are in a different region. It would be a serious challenge to tackle that without symbolic regions, and simpler that won't be for anybody.Today's templates are just fine for that. Just propagate variables through template arguments and apply region constraints to the members: struct ILst(alias var1, alias var2) { int*(var1) hd; ILst!(var1, var2)*(var2) tl; } int z; int*(z) a, b; ILst!(a, b) lst1; ILst!(&z, &z) lst2; We could even allow regions to propagate through type arguments too: struct ILst2(T1, T2) { int*(T1) hd; ILst2!(T1, T2)*(T2) tl; } ILst2!(typeof(&z), typeof(b)) lst3; I think this example is a good case for attaching region constraints directly to types instead of expressing them as conditional expressions elsewhere, as in "if (region a <= region b)".I'll insert a few more points below in this sprawling discussion.Ah, you're right.Which makes me think of this: struct A { int i; this(); } ref A foo(ref A a) { return a; } ref A bar() { foo(A()).i = 1; ref A a = foo(A()); // illegal, ref cannot be used outside function signature a.i = 1; return foo(A()); // illegal ? 
}foo(A()) is illegal because ref does not bind to an rvalue.Great!Also, I'd like to point out that ref (and out) being storage classes somewhat hinders me from using them where it makes sense in the D/Objective-C bridge, since there most functions are instantiated by templates where template arguments give the type of each function argument. Perhaps there should be a way to specify "ref" and "out" in template arguments...I agree. Something like that is on the list.No, I really think it's true that if it is in the language, explained right alongside nullable pointers, more people would learn them more and use them more. Isn't it this exact notion that made Walter add Ddoc and unit tests directly into the language?Also, I'd still like to have a non-null pointer type, especially for clarifying function signatures. A template can do. If it was in the language however it would be used by more people, which would be better.I don't grok this notion "if it's in the language it would be used by more people". How does that come about?Does it mean templates are at such a high syntactic disadvantage? Maybe we should do something about that then, such as replacing !() with something else :o). If we put it in phobos (which after integration will be usable alongside tango) could it count as being in the language?Pointers that shouldn't be null are pretty common, possibly even more common than can-be-null pointers, which is why I think it deserves a good, short, easy to read and remember syntax. I'd even suggest changing the standard syntax for pointer "*" so it only allows non-null pointers, and having something else "*?" for nullable ones. This would force people into giving more consideration before allowing nullable pointers, and the same syntax could apply to objects too. That said, having a non-nullable pointer in the standard library would certainly be better than nothing. And the standard library should make use of it everywhere it makes sense. 
But is a standard-library solution going to work with "extern (C)" functions? I think it'd be sad if it didn't, and it would look strange if it did (C functions with template arguments!).Currently, I'm just trying to convince you (and any other potential silent listeners) that it can work. I haven't given much thought about the syntax before today as I wanted to clear up the concepts first. But now, in part because of your syntactic arguments above, I'm wondering if this was the right path to take. I don't mind much if it never gets into the language, although I'd like it very much. I'm doing it for myself too, to better understand how you can document and analyse the region/scope relationship of various variables in a program piece by piece.As I said below, I think many people in this group are already comfortable with using pointers, which may explain why they're not so interested. Having no one interested in something doesn't necessarily mean they won't appreciate it when it comes.That I totally agree with. It's happened a couple of times with D features.It does, however, reduce the incentive for continuing forward. So I understand why you're backing off, even if it displeases me somewhat.I'm sorry about how you feel. Now we're in a conundrum of sorts. You seem to strongly believe you can make some nice simplified regions work, and make people like them. Taking that to a proof is hard. The conundrum is, you are facing the prospect of putting work into it and creating a system that, albeit correct, is not enticing.I was under the impression that Cyclone's requirement for named regions came with its use of dynamic regions, which I now believe was incorrect. If I take this example from the paper: char?p rstrdup(region_t<p>, const char? s); you *need* a name for the region handle. Since region handles are there for supporting dynamic regions, it therefore follows that you need named regions to make things work at all... 
well here's the catch: you need named *region handles* as variables, not necessarily named regions, as you could always arrange the syntax so that the returned pointer is of the region of the region handle... or something like that.I never said the need for dynamic regions would be rare: I said the garbage collector obsoletes it. If we can justify the need for dynamic regions later, we can add them back (with all the added complexity it requires) but I'd try without them first.Let's not forget that symbolic regions (for typing purposes) should not be confused with dynamic regions (for efficiency purposes). I agree we can do away with the latter and put them in later if we care. I disagree that dropping symbolic regions simplifies things."make arrays safe"... by forcing dynamic ones to always be on the heap? Or by implementing a full region system that applies only to arrays? Obviously it's not the latter; the former is the only choice I can see. And I think you should at least allow pointers to work with heap variables in SafeD... otherwise people will work around that by creating one-item arrays. :-)But dynamic arrays *are* pointers, how are you obviating the need for them? If you find a solution for dynamic arrays, you'll have a solution for pointers too. You could forbid dynamic arrays from referring to stack-allocated static ones, or automatically dynamically allocate those when they escape in a dynamic array. And if I were you, whatever you choose for arrays I'd allow it for pointers too, to keep things consistent. Pointers to heap objects should be retained in my opinion.But a possible path is to make arrays safe and leave pointers for those cases in which efficiency is of utmost importance. With luck, those cases are rare.Whether you can work effectively only with ref or not remains to be seen. -- Michel Fortin michel.fortin michelf.com http://michelf.com/But are you reducing the need for pointers or hiding and restricting them?Of course - that's the whole point. 
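For readers without the Cyclone paper at hand, a dynamic region with its named handle looks roughly like this (reconstructed from the paper's examples; treat the syntactic details as approximate):

```
void g(const char ?s) {
  region h {                        // h : region_t<`h>, a run-time handle
    char ?`h copy = rstrdup(h, s);  // allocate the copy in region `h via h
    /* use copy; its type ?`h pins it to region `h */
  }                                 // region `h is deallocated here, all at once
}
```

The handle h is a run-time value, which is why it must be named; the region `h it denotes is the compile-time entity the types refer to, which is exactly the handle/region distinction made above.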
In fact, I'll insert a small correction: we are reducing the need for pointers BY hiding and restricting them. And that's a good thing. If you can do most of your work with restricted pointers (e.g. ref), then that's a net win.
Nov 12 2008
Michel Fortin wrote:On 2008-11-12 10:02:02 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:No, the code is correct as written (without the if). You may want to reread the paper with an eye for region subtyping rules. This partly backs up my point: understanding region analysis may be quite a burden for the average programmer. Even you, who took pains to think through everything and absorb the paper, are having trouble. And me too to be honest :o).Michel Fortin wrote:Everywhere I said there was no need for named regions, I also said named regions could be kept to ease the syntax. That said, I'm not so sure named regions are that good at simplifying the syntax. In your assign example above, the named-region version has an error: it forces the two pointers to be of the same region. That could be fine, but, assuming you're assigning to *p, it'd be more precise to write it like that: void assign(region R1, region R2)(int*R1* p, int*R2 r) if (R1 <= R2);On 2008-11-09 10:10:03 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said: So my first point is that since we have a garbage collector in D, and moreover since we're likely to get one heap per thread in D2, we don't need dynamic regions. The remaining regions are: 1) the shared heap, 2) the thread-local heap, 3) All the stack frames; and you can't allocate other stack frames than the current one. Because none of these regions require a handle to allocate into, we (A) don't need region handles. We still have many regions. Beside the two heaps (shared, thread-local), each function's stack frame, and each block within them, creates a distinct memory region. But nowhere we need to know exactly which region a function parameter comes from; what we need to know is which address outlives which pointer, and then we can forbid assigning addresses to pointers that outlive them. 
All we need is a relative ordering of the various regions, and for that we don't need to attach *names* to the regions so that you can refer explicitly to them in the syntax. Instead, you could say something like "region of (x)", or "region of (*y)" and that would be enough.But how do you type then the assignment example? void assign(int** p, int * r) { *p = r; } How do you reflect the requirement that r's region outlives *p's region? But that's not even the point. Say you define some notation, such as: void assign(int** p, int * r) if (region(r) <= region(p)); But the whole point of regions was to _simplify_ notations like the above into: void assign(region R)(int*R* p, int *R r); So although you think you simplified things by using region(symbol) instead of symbolic names, you complicated things. The compiler still needs to infer regions for each value, so it is as complicated as a named-regions compiler, and in addition you require the user to write bulkier expressions because you disallow use of symbols. So everybody is worse off. Note how in the example using a symbolic region the outlives relationship is enforced implicitly by using the same symbol name in two places.Once we get there, I think the no-named region syntax is better.This is invalidated by the wrong assertion above.That said, for the swap example, where both values need to share the same region, the named region notation is simpler: void swap(region R)(int*R a, int*R b); void swap(int* a, int* b) if (region(a) == region(b));No, for that swap there is no need to specify any region. You can swap ints in any two regions. Probably you meant to use int** throughout.But I'd argue that most of the time regions do not need to be equal, but are subset or superset of each other, so reusing variable names makes more sense in my opinion.Don't forget that using a region name twice may actually work with two different regions, so far as they are in a subtyping relationship. 
Region subtyping is key to both simplifying code and to understanding code after simplification.In any case, I prefer a notation where regions constrains are attached directly to the type instead of being expressed somewhere else. Something like this (explained below): void assign(int*(r)* p, int* r) { *p = r; } void swap(ref int*(b) a, ref int*(a) b);Sure. I'm sure there's understanding that that doesn't make anything any simpler or any easier to implement or understand. It's just a minor change in notation, and IMHO not to the better.Here, a parenthesis suffix after a pointer indicates the region constrain of the pointer, based on the region of another pointer.I thought it means pointer to function. Oops.In the first example, int*(r)* means that the integer pointer "*p" must not live beyond the value pointed by "r" (because we're going to assign "r" to "*p"). In the second example, the value pointed by "a" must not live longer than the one pointed by "b" and the value pointed by "b" must not live longer than the one pointed "a"; the net result is that they must have the same lifetime and need to be in the same region. For something more complicated, you could give multiple commas-separated constrains: void choose(ref int*(a,b) result, int* a, int* b) { result = rand() > 0.5 ? a : b; }This all is irrelevant. You essentially change the syntax. Syntax is, again, the least of the problems to be solved.I hope you agree that this is just written symbols without much meaning. This is not half-baked. It's not even rare. The cow is still moving. I can't eat that! :o) I can't even start replying to it because there are so many actual and potential issues, I'd need to get to work on them first.I suspect there are things you can't even express without symbolic regions. 
Consider this example from Dan's slides: struct ILst(region R1, region R2) { int *R1 hd; ILst!(R1, R2) *R2 tl; } This code reflects the fact that the list holds pointer to integers in one region, whereas the nodes themselves are in a different region. It would be a serious challenge to tackle that without symbolic regions, and simpler that won't be for anybody.Today's templates are just fine for that. Just propagate variables through template arguments and apply region constrains to the members: struct ILst(alias var1, alias var2) { int*(var1) hd; ILst!(var1, var2)*(var2) tl; } int z; int*(z) a, b; ILst!(a, b) lst1; ILst!(&z, &z) lst2;We could even allow regions to propagate through type arguments too: struct ILst2(T1, T2) { int*(T1) hd; ILst2!(T1, T2)*(T2) tl; } ILst2!(typeof(&z), typeof(b)) lst3; I think this example is a good case for attaching region constrains directly to types instead of expressing them as conditional expressions elsewhere, as in "if (region a <= region b)".I am thoroughly lost here, sorry. I can't even answer "this is so wrong" or "this is pure genius". Probably it's somewhere in between :o). At any rate, I suggest you develop a solid understanding of Cyclone if you want to build something related to it. [In the interest of coherence I snipped away unrelated parts of the discussion.]I understand I've been blunt throughout this post, but please side with me for a minute. I'm doing so for the following reasons: (a) I'm essentially writing this post in negative time; (b) I believe you currently don't have an attack on the problem you're trying to solve; (c) I believe it's worthwhile for you to develop an attack on the problem, (d) I think "we" = "the D community" should seriously consider safety and consequently things like region analysis. You can now stop siding with me and side again with yourself. At this point you can easily guess that all of the above was to prepare you for an even blunter comment. Here goes. 
You say you want to convince people "it can work". But right now there is no "it". You have no "it". Much less an "it" that can work. But there is of course good hope that an "it" could emerge, and I encourage you to continue working towards that goal. It's just a lot more work than it might appear. AndreiI'm sorry about how you feel. Now we're in a conundrum of sorts. You seem to strongly believe you can make some nice simplified regions work, and make people like them. Taking that to a proof is hard. The conundrum is, you are facing the prospect of putting work into it and creating a system that, albeit correct, is not enticing.Currently, I'm just trying to convince you (and any other potential silent listeners) that it can work.
Nov 12 2008
On 2008-11-13 00:53:50 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Michel Fortin wrote:Ok, I've reread that part and it's true that using Cyclone's subtyping rules it'd work fine with only one region name because Cyclone implicitly creates two regions from that, the first being a subset of the other, just as I wrote explicitly here. But what I missed out was one of Cyclone's syntactic constructs, not a concept of regions. ... or perhaps we have a different notion of what is a syntax and what is a concept?Everywhere I said there was no need for named regions, I also said named regions could be kept to ease the syntax. That said, I'm not so sure named regions are that good at simplifying the syntax. In your assign example above, the named-region version has an error: it forces the two pointers to be of the same region. That could be fine, but, assuming you're assigning to *p, it'd be more precise to write it like that: void assign(region R1, region R2)(int*R1* p, int*R2 r) if (R1 <= R2);No, the code is correct as written (without the if). You may want to reread the paper with an eye for region subtyping rules. This partly backs up my point: understanding region analysis may be quite a burden for the average programmer. Even you, who took pains to think through everything and absorb the paper, are having trouble. And me too to be honest :o).Yes and no. It's true that Cyclone's region subtyping makes the syntax prettier. 
On the other hand, the programmer has to be aware of how it works, and especially aware that changing the order of his arguments will implicitly change the region relationship between them.Once we get there, I think the no-named region syntax is better.This is invalidated by the wrong assertion above.Hum, you're right, I meant to make these "ref int*".That said, for the swap example, where both values need to share the same region, the named region notation is simpler: void swap(region R)(int*R a, int*R b); void swap(int* a, int* b) if (region(a) == region(b));No, for that swap there is no need to specify any region. You can swap ints in any two regions. Probably you meant to use int** throughout.I'm not convinced that region subtyping is so simple to understand for neophytes, especially because you may assume the same region at first glance. Cyclone isn't C++, but this region subtyping rule makes me think of one of those many little-known corners in C++ such as Koenig name lookup. But I consider this just a syntactic issue about how to express regions though. And I may be completely wrong about its unintuitiveness.But I'd argue that most of the time regions do not need to be equal, but are subset or superset of each other, so reusing variable names makes more sense in my opinion.Don't forget that using a region name twice may actually work with two different regions, so far as they are in a subtyping relationship. Region subtyping is key to both simplifying code and to understanding code after simplification.Ok, then we disagree here. I think this notation is better because it makes you think about things in terms of pointer lifetime vs. the pointed data lifetime, which I think is much less abstract than variables being part of different regions where some regions encompass other regions. 
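With that "ref int*" correction applied, the two competing swap signatures from earlier in the thread would presumably read (hypothetical region syntax, not valid D):

```
void swap(region R)(ref int*R a, ref int*R b);                  // named-region form
void swap(ref int* a, ref int* b) if (region(a) == region(b));  // region-of form
```

Both now exchange the pointers themselves rather than the pointed-to ints, so the constraint that a and b refer into the same region actually does some work.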
It's a shift in perspective from the syntactic approach of Cyclone, although under the hood the compiler would do mostly the same work.In any case, I prefer a notation where region constraints are attached directly to the type instead of being expressed somewhere else. Something like this (explained below): void assign(int*(r)* p, int* r) { *p = r; } void swap(ref int*(b) a, ref int*(a) b);Sure. I'm sure there's understanding that that doesn't make anything any simpler or any easier to implement or understand. It's just a minor change in notation, and IMHO not to the better.And I thought the syntax was the least of your concern right now? :-) This probably can't be the final syntax, but I think it makes things clear enough to talk about the concepts... for now.Here, a parenthesis suffix after a pointer indicates the region constraint of the pointer, based on the region of another pointer.I thought it meant pointer to function. Oops.Ok then. Let's go to the real problems.In the first example, int*(r)* means that the integer pointer "*p" must not live beyond the value pointed by "r" (because we're going to assign "r" to "*p"). In the second example, the value pointed by "a" must not live longer than the one pointed by "b" and the value pointed by "b" must not live longer than the one pointed by "a"; the net result is that they must have the same lifetime and need to be in the same region. For something more complicated, you could give multiple comma-separated constraints: void choose(ref int*(a,b) result, int* a, int* b) { result = rand() > 0.5 ? a : b; }This all is irrelevant. You essentially change the syntax. Syntax is, again, the least of the problems to be solved.If you mean there aren't any explanations, then you're right that explanations were somewhat missing from my last post. Sorry. I guess I was too tired to notice the lack of instructions. Basically you apply the same rules as for the function signatures in the preceding function examples. 
For instance, "int*(var1)" means the hd pointer points to an int that lives at least as long as the one pointed by var1 (var1 must be an "int*" pointer). This means that you can assign the content of var1 to it, or anything else that will live at least as long as var1. It also means you can take its value and place it in var1, or any pointer with a shorter life. Then, we have "ILst!(var1, var2)*(var2)". It's the same rules as the first, except that we have a different type beyond the pointer which must be valid through var2's lifetime. The last code snippet shows how to use that template. int z; int*(z) a, b; ILst!(a, b) lst1; ILst!(&z, &z) lst2; Here, we're declaring "int*(z)", which is a pointer to an int whose lifetime is equal to or longer than that of the address of z. (ok, there's an error here, it should have been "int*(&z)"). And normally, you wouldn't explicitly write that, "int*" would be enough: the compiler should determine the default constraints automatically. Then when you instantiate ILst!(a, b), the template will take the lifetime of a and b (which is the lifetime of the address of z) and apply it to pointers inside the struct.I hope you agree that this is just written symbols without much meaning. This is not half-baked. It's not even rare. The cow is still moving. I can't eat that! :o) I can't even start replying to it because there are so many actual and potential issues, I'd need to get to work on them first.I suspect there are things you can't even express without symbolic regions. Consider this example from Dan's slides: struct ILst(region R1, region R2) { int *R1 hd; ILst!(R1, R2) *R2 tl; } This code reflects the fact that the list holds pointers to integers in one region, whereas the nodes themselves are in a different region. It would be a serious challenge to tackle that without symbolic regions, and simpler that won't be for anybody.Today's templates are just fine for that. 
Just propagate variables through template arguments and apply region constraints to the members: struct ILst(alias var1, alias var2) { int*(var1) hd; ILst!(var1, var2)*(var2) tl; } int z; int*(z) a, b; ILst!(a, b) lst1; ILst!(&z, &z) lst2;Again, some explanations were missing... Basically, region/scoping/lifetime constraints are attached to pointers. Which means that propagating a type ought to be enough to propagate the lifetime constraints too. "ILst2!(typeof(&z), typeof(b))" is exactly the same as "ILst!(&z, b)". ILst takes its constraints from variables while ILst2 takes its constraints from types. But the two previous examples are a little stretched to make the concept more similar to Cyclone. With my proposal, you can do much better than this. I think in most cases where you want to propagate constraints, you'll want to propagate a type too. If what you want is a linked list, it'd be better expressed generically like this: struct ListRoot(T) { ListNode!(T)* first; } struct ListNode(T) { T value; ListNode!(T)* next; } int global; void foo() { int a; ListRoot!(int*) listRoot; ListNode!(int*) listNode; listRoot.first = &listNode; listNode.value = &a; listNode.value = &global; } Notice how there is absolutely no special annotation here; it's already valid template code. Now, let the compiler apply some defaults according to these rules: types declared in local variables will be allowed to point to values of their own region, and struct members will be allowed to point to values of the same region the struct comes from. 
Annotated explicitly, the default annotations would look like this: struct ListRoot(T) { ListNode!(T)*(this) first; // pointer to something in the same region as this } struct ListNode(T) { T value; // if T is a pointer, it holds its own region annotations ListNode!(T)*(this) next; // pointer to something in the same region as this } int global; void foo() { int a; ListRoot!(int*(&listRoot)) listRoot; ListNode!(int*(&listNode)) listNode; listRoot.first = &listNode; listNode.value = &a; listNode.value = &global; } With this scheme, the lifetime of all nodes in the linked list needs to be equal to or longer than that of the preceding node (normally, they will all be equal), and the lifetime of the value pointer is determined by the type you give as a template argument to ListRoot and ListNode. Therefore, it becomes possible to construct the linked list on the stack when the root is on the stack, with no need for explicit annotations. There is still one problem though. If you want to swap two nodes, you can't, because there is no guarantee that the lifetime of the "this" pointer of a ListNode is equal to the lifetime of the "next" pointer. (In fact, the next pointer lifetime is longer than or equal to the struct lifetime). So if we're going to swap or reorder nodes, we'll need a way to constrain the "this" pointer against the "next" pointer to create a circular reference and thus force the two pointers to point to the same region... perhaps something like this: struct ListNode(T) { ListNode*(next) this; T value; ListNode!(T)*(this) next; } Not a very good syntax though.We could even allow regions to propagate through type arguments too: struct ILst2(T1, T2) { int*(T1) hd; ILst2!(T1, T2)*(T2) tl; } ILst2!(typeof(&z), typeof(b)) lst3;I'll side with "pure genius", but I also consider myself biased. 
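A concrete instance of the swap problem described above, in the proposal's hypothetical notation: reordering two adjacent nodes stores one node's address into another node's next field, which the default this-scoped annotation cannot justify:

```
void reorder(T)(ListNode!(T)* a)
{
    auto t = a.next.next;
    a.next.next = a.next; // rejected under the defaults: nothing proves that
                          // a.next lives as long as the region a.next.next
                          // is allowed to point into
    a.next = t;
}
```

Hence the need for the circular this/next constraint sketched at the end of the post.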
:-)I think this example is a good case for attaching region constraints directly to types instead of expressing them as conditional expressions elsewhere, as in "if (region a <= region b)".I am thoroughly lost here, sorry. I can't even answer "this is so wrong" or "this is pure genius". Probably it's somewhere in between :o). At any rate, I suggest you develop a solid understanding of Cyclone if you want to build something related to it.I don't mind about (a) and I agree about (d). I'll say that because of my lack of expertise with Cyclone I have some difficulty expressing my proposal as a comparison of what is different from Cyclone (it's difficult enough without it). You're the one asking for such a comparison and increasing the difficulty. I do not dislike the challenge, but I don't think you can take this as a proof that I don't understand the problem I'm trying to solve well when I may just be mixing some things about the approach taken by Cyclone. Another thing not helping is that my original proposal has evolved a little since the first time I started the "full scope analysis proposal" thread. I also revamped the syntax I use to talk about the problem (and apparently I should do it again to avoid conflicts with function names). Hunting in previous posts for the details I left out of the more recent ones doesn't help anyone understand what I'm talking about. I'm thinking that maybe I should put everything in one document to have a coherent proposal that could evolve as a whole instead of one scattered across various posts between which the syntax I use and some concepts have evolved.I understand I've been blunt throughout this post, but please side with me for a minute. 
I'm doing so for the following reasons: (a) I'm essentially writing this post in negative time; (b) I believe you currently don't have an attack on the problem you're trying to solve; (c) I believe it's worthwhile for you to develop an attack on the problem, (d) I think "we" = "the D community" should seriously consider safety and consequently things like region analysis.I'm sorry about how you feel. Now we're in a conundrum of sorts. You seem to strongly believe you can make some nice simplified regions work, and make people like them. Taking that to a proof is hard. The conundrum is, you are facing the prospect of putting work into it and creating a system that, albeit correct, is not enticing.Currently, I'm just trying to convince you (and any other potential silent listeners) that it can work.You can now stop siding with me and side again with yourself. At this point you can easily guess that all of the above was to prepare you for an even blunter comment. Here goes. You say you want to convince people "it can work". But right now there is no "it". You have no "it". Much less an "it" that can work. But there is of course good hope that an "it" could emerge, and I encourage you to continue working towards that goal. It's just a lot more work than it might appear.I'm pretty sure I hold that "it" just now, or something very near it. It's just that it seems I haven't explained it well enough for you (and probably anyone) to understand correctly. I should probably write it all down in one coherent and more formal document rather than scattering all the details over many different posts as half-documented concept-name-changing written-too-fast examples. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 14 2008
Just to fix a little misunderstanding: Michel Fortin wrote:On 2008-11-13 00:53:50 -0500, Andrei AlexandrescuBy this I meant I don't have time (t < 0), not that I was writing while being at a time when I had a negative outlook. Andreiwith me for a minute. I'm doing so for the following reasons: (a) I'm essentially writing this post in negative time;
Nov 14 2008
On Sun, 02 Nov 2008 10:12:46 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:* Make all ref parameters scoped by default. It will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight".Does this mean the whole shared/local/scope issue for classes is being sidestepped for now?
Nov 02 2008
Robert Jacques wrote:On Sun, 02 Nov 2008 10:12:46 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:What issue do you have in mind? Andrei* Make all ref parameters scoped by default. It will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight".Does this mean the whole shared/local/scope issue for classes is being sidestepped for now?
Nov 02 2008
On Mon, 03 Nov 2008 00:29:29 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Robert Jacques wrote:Right now, it's trivial for scope classes to escape due to automatic conversion to 'local'. And under the current shared/local scheme, one has to write multiple functions (one for each type combination).On Sun, 02 Nov 2008 10:12:46 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:What issue do you have in mind?* Make all ref parameters scoped by default. It will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight".Does this mean the whole shared/local/scope issue for classes is being sidestepped for now?
Nov 03 2008
"Andrei Alexandrescu" wroteSteven Schveighoffer wrote:Isn't this already the case? BTW, slightly OT, I read Bartosz' article on digitalmars about SafeD. This isn't an implemented language right? Is the plan for D to become SafeD? Or is there going to be a compiler switch? Or something else maybe? I've heard SafeD mentioned a lot on this NG, without ever really knowing how it exists (concrete or theory)."Andrei Alexandrescu" wroteIt looks like things will move that way. Bartosz, Walter and I talked a lot yesterday about it - a lot of crazy things were on the table! The next step is to make this a reference, which is highly related to escape analysis. At the risk of anticipating a bit an unfinalized design, here's what's on the table: * Continue an "anything goes" policy for *explicit* pointers, i.e. those written explicitly by user code with stars and stuff. * Disallow pointers in SafeD.Steven Schveighoffer wrote:If scope delegates means trust the coder knows what he is doing (in the beginning), I agree with that plan of attack.I think everyone who thinks a scope decoration proposal is going to 1) solve all scope escape issues and 2) be easy to use is dreaming :PI think that's a fair assessment. One suggestion I made Walter is to only allow and implement the scope storage class for delegates, which simply means the callee will not squirrel away a pointer to delegate. That would allow us to solve the closure issue and for now sleep some more on the other issues.* Make all ref parameters scoped by default. There will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight".I think this sounds reasonable. However, will there be a way to override this behavior? 
For example, some modifier to signify that a reference is not scope? The advantage to having the other be the default is that the scope keyword already exists. Having to cast every time I convert to a pointer will be unpleasant, but not horrific. I'd prefer to state one time 'this is an unsafe reference', preferably in the signature, and be able to use it like before. The same semantics still apply as far as calling the function, it just says "the author of this function knows what he is doing" to the compiler. You would also disallow this keyword usage in SafeD, which would be easy to filter. noscope would be a good keyword...* Make this a reference so that it obeys what references obey.This is one place where I think whole-heartedly it should be done. One rarely needs the address of this; in fact, I generally end up returning *this quite a bit in struct operators, so this change will be most welcome.* If people want to implement e.g. linked lists, they should do it with classes. Implementing them with structs will require casts to obtain and escape &this. That also means they'd be using pointers, so anything goes - pointers are not restricted from escaping.I implemented dcollections' node-based containers (tree, hash, linked list) as structs, because I wanted to control the allocation of them. I agree with others that the de facto standard is going to be structs, since performance is paramount, and you have little need for OOP in the internal node structures. Also, if the noscope (or equivalent keyword) is implemented as above, you can easily decorate your pointer-using functions: struct LinkNode(T) { noscope { LinkNode *find(T value); LinkNode *findReverse(T value); ... } }* There are two cases in which things escape without the user explicitly using pointers: delegates and dynamic arrays initialized from stack-allocated arrays. * For delegates require the scope keyword in the signature of the callee. 
A scoped delegate cannot be stored, only called or passed down to another function that in turn takes a scoped delegate. This makes scope delegates entirely safe. Non-scoped delegates use dynamic allocation.If noscope (or equivalent keyword) is used, can we make scope the default? I'd much rather have the default be the higher-performance, more commonly used option. Also, when you say stored, do you mean stored anywhere, or stored anywhere but the stack? Because there is no harm in storing a scope delegate in a local variable (as long as it is also scope).* We don't have an idea for dynamic arrays initialized from stack-allocated arrays.Hm... this is a tough one. At the very least, you can disallow returning such arrays, as long as the compiler can prove the array's origins. That should cover 90% of the issues. The other 10% are ones that are passed into functions. You might employ the same techniques as for delegates, but then we are stuck with the same problems as full escape analysis. Plus the need to return a slice of an array is much greater than the need to return a delegate. You could also argue that an array contains a pointer, and morphing a static array into a dynamic array is the same as taking the address of a stack local variable (which would require a cast). But that means SafeD cannot use dynamic arrays to reference static arrays. However, you can then argue that dynamic arrays allocated using new are OK for SafeD because you didn't take the address of a local stack variable. My understanding is that in SafeD, safety trumps performance. Note that a dynamic array could be used as a rebindable reference to a static array, since it has a rebindable pointer in it, so it is really an unsafe operation: int[2] a; int[] aref = a[0..1]; // reference to a[0] aref = a[1..2]; // rebind to a[1] -Steve
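The stack-slice hazard discussed above can be shown in plain D (an illustrative sketch; it compiles with today's compilers precisely because no such check exists yet):

```d
// Sketch of the escape the proposal wants to catch: a dynamic array
// (slice) initialized from a stack-allocated static array outlives
// the frame it points into.
int[] dangle()
{
    int[10] buf;         // stack-allocated static array
    return buf[0 .. 5];  // slice of the dying stack frame escapes
}

void use()
{
    int[] s = dangle();  // s now refers to dead stack memory;
    // any read or write through s is undefined behavior
}
```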
Nov 03 2008
Steven Schveighoffer wrote:"Andrei Alexandrescu" wroteAt one point we wanted to allow pointers in restricted ways.* Disallow pointers in SafeD.Isn't this already the case?BTW, slightly OT, I read Bartosz' article on digitalmars about SafeD. This isn't an implemented language, right? Is the plan for D to become SafeD? Or is there going to be a compiler switch? Or something else maybe? I've heard SafeD mentioned a lot on this NG, without ever really knowing whether it exists (concretely or only in theory).It's planned as a compiler switch and module option. Essentially SafeD is slated to be a safe, proper, well-defined subset of D. It was Bartosz's idea, and IMHO an important dimension of D's development. Walter is implementing module safety options like this: module(safe) mymodule; which means the module must always be compiled with safety on. On the contrary, module(system) mymodule; means the module is getting its hands greasy.Good point. I think escaping the address of a ref should be allowed via a cast.* Make all ref parameters scoped by default. It will be impossible for a function to escape the address of a ref parameter without a cast. I haven't proved it to myself yet, but I believe that if pointers are not used and with the amendments below regarding arrays and delegates, this makes things entirely safe. In Walter's words, "it buttons things pretty tight".I think this sounds reasonable. However, will there be a way to override this behavior? For example, some modifier to signify that a reference is not scope? The advantage to having the other be the default is that the scope keyword already exists.Having to cast every time I convert to a pointer will be unpleasant, but not horrific. I'd prefer to state one time 'this is an unsafe reference', preferably in the signature, and be able to use it like before. 
The same semantics still apply as far as calling the function, it just says "the author of this function knows what he is doing" to the compiler.Currently Walter plans to do that at module granularity.You would also disallow this keyword usage in SafeD which would be easy to filter. noscope would be a good keyword...I think safety should be the default. People who care about efficiency will be willing to write a little bit more. I agree that this is annoying if that's the more frequent situation.* Make this a reference so that it obeys what references obey.This is one place where I think whole-heartedly it should be done. One rarely needs the address this, in fact, I generally end up returning *this quite a bit in struct operators, so this change will be most welcome.* If people want to implement e.g. linked lists, they should do it with classes. Implementing them with structs will require casts to obtain and escape &this. That also means they'd be using pointers, so anything goes - pointers are not restricted from escaping.I implemented dcollections' node-based containers (tree, hash, linked list) as structs, because I wanted to control the allocation of them. I agree with others that the defacto standard is going to be structs, since performance is paramount, and you have little need for OOP in the internal node structures. Also, if the noscope (or equivalent keyword) is implemented as above, you can easily decorate your pointer-using functions: struct LinkNode(T) { noscope { LinkNode *find(T value); LinkNode *findReverse(T value); ... } }* There are two cases in which things escape without the user explicitly using pointers: delegates and dynamic arrays initialized from stack-allocated arrays. * For delegates require the scope keyword in the signature of the callee. A scoped delegate cannot be stored, only called or passed down to another function that in turn takes a scoped delegate. This makes scope delegates entirely safe. 
Non-scoped delegates use dynamic allocation.If noscope (or equivalent keyword) is used, can we make scope the default? I'd much rather have the default be the higher-performance, more commonly used option.Also, when you say stored, do you mean stored anywhere, or stored anywhere but the stack? Because there is no harm in storing a scope delegate in a local variable (as long as it is also scope).That could be allowed, but probably it's not really needed.I agree with the above. The floor is open for more ideas. Andrei* We don't have an idea for dynamic arrays initialized from stack-allocated arrays.Hm... this is a tough one. At the very least, you can disallow returning such arrays, as long as the compiler can prove the arrays origins. That should cover 90% of the issues. The other 10% are ones that are passed into functions. You might employ the same techniques as for delegates, but then we are stuck with the same problems as needed for full escape analysis. Plus the need to return a slice of an array is much greater than the need to return a delegate. You could also argue that an array contains a pointer, and morphing into a dynamic array is the same as taking the address of a stack local variable (which would require a cast). But that means SafeD cannot use dynamic arrays to reference static arrays. However, you can then argue that dynamic arrays allocated using new are OK for SafeD because you didn't take the address of a local stack variable. My understanding is that in SafeD, safety trumps performance. Note that a static array could be used for a rebindable reference, since it has a rebindable pointer in it, so it is really an unsafe operation: int[2] a; int[] aref = a[0..1]; // reference to a[0] aref = a[1..2]; // rebind to a[1]
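Under the design being discussed above, escaping the address of a ref parameter would take an explicit cast, roughly like this (a sketch of the *proposed* semantics; today's compilers do not enforce it):

```d
// Sketch of "ref parameters are scoped by default": storing &x
// without a cast would become a compile error; the cast is the
// programmer's "I know what I'm doing" opt-out, presumably only
// acceptable in module(system) code.
int* saved;  // module-level pointer: outlives any single call

void keep(ref int x)
{
    // saved = &x;          // error under the proposal: x is scoped
    saved = cast(int*) &x;  // explicit, greppable escape hatch
}
```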
Nov 03 2008
"Andrei Alexandrescu" wroteSteven Schveighoffer wrote:I personally probably won't use it, as I feel I have enough experience to avoid the problems that SafeD will prevent. But it does sound like a very important version of the language.BTW, slightly OT, I read Bartosz' article on digitalmars about SafeD. This isn't an implemented language right? Is the plan for D to become SafeD? Or is there going to be a compiler switch? Or something else maybe? I've heard SafeD mentioned a lot on this NG, without ever really knowing how it exists (concrete or theory).It's planned as a compiler switch and module option. Essentially SafeD is slated to be a safe, proper, well-defined subset of D. It was Bartosz's idea, and IMHO an important dimension of D's development.Walter is implementing module safety options like this: module(safe) mymodule; which means the module must always be compiled with safety on. On the contrary, module(system) mymodule; means the module is getting its hands greasy.Hm... that's kinda too high level. I might have one function in a class that does things that are 'unsafe', but I don't want to have to mark my whole class as unsafe.What I meant was, make the default behavior as if scope was marked on the delegate. This doesn't make it unsafe (you said so yourself). But it does line up with most code today, which doesn't do anything with a delegate but call it. i.e. less decorations on current code that is already considered safe. The most obvious usage is opApply. Every opApply will have to have its delegate marked scope unless it's the default. The only downside is that you then have to come up with a way to mark a delegate as noscope.I think safety should be the default. People who care about efficiency will be willing to write a little bit more. I agree that this is annoying if that's the more frequent situation.* For delegates require the scope keyword in the signature of the callee. 
A scoped delegate cannot be stored, only called or passed down to another function that in turn takes a scoped delegate. This makes scope delegates entirely safe. Non-scoped delegates use dynamic allocation.If noscope (or equivalent keyword) is used, can we make scope the default? I'd much rather have the default be the higher-performance, more commonly used option.I can think of certain cases to need it, for example if you have two inner functions that have the same signature, and you want to decide which one to use at runtime, you might store the one to use in a local variable. --------------------------- It seems to me like the way you are saying things will work is that you will have either safety checks or no safety checks at a module level. I think that is a mistake. Most of my code should be safe, and I'd prefer it to be safety checked. The ideas that all of you have come up with in this post are very good, and should be easy to use for most code. I especially like the requirement to cast in order to take the address of a reference. But if all those checks go away when you mark your module as system, then this seems like it will either require me to split up my modules into safe and unsafe parts, or just not use safety checks where they could be used. I'd prefer to be able to mark specific functions/parameters as unsafe or safe so I know exactly where I have disabled the safety checks. And I'd prefer safety by default, not have to mark for safety. As long as the safety can be easily verified and allows most usages. I really like how pointers are simply considered unsafe, so all safety checks are off. That draws a clear line of where it's difficult to verify safety without hindering ability. The further check of compliance to SafeD can eliminate possible pointer usages that you miss. -SteveAlso, when you say stored, do you mean stored anywhere, or stored anywhere but the stack? 
Because there is no harm in storing a scope delegate in a local variable (as long as it is also scope).That could be allowed, but probably it's not really needed.
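If scope became the default for delegate parameters as Steven suggests, opApply would need no annotation at all, and only the storing case would be marked (`noscope` is the hypothetical keyword from this thread, not an actual D keyword):

```d
struct IntList
{
    int[] items;

    // dg is only called or passed down, never stored, so under the
    // proposed default it would be implicitly scope: no heap closure
    // allocated at the call site.
    int opApply(int delegate(ref int) dg)
    {
        foreach (ref i; items)
            if (auto r = dg(i))
                return r;
        return 0;
    }
}

struct Button
{
    void delegate() onClick;  // stored past the call...

    // ...so this parameter must opt out of the scope default:
    void register(noscope void delegate() dg) { onClick = dg; }
}
```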
Nov 04 2008
On Sat, 01 Nov 2008 12:00:10 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:"Michel Fortin" wroteVarious research languages have shown both 1 and 2 are possible.Sure it is ;) You have to write a special linker. I think everyone who thinks a scope decoration proposal is going to 1) solve all scope escape issues and 2) be easy to use is dreaming :PThe only way I can see to solve this is to do it at link time. When you link, piece together the parts of the graph that were incomplete, and see if they all work. It would be a very radical change, and might not even work with the current linkers. Especially if you want to do shared libraries, where the linker is built into the OS.I think you're dreaming... not that it's a bad thing to have ambition, but that's probably not even possible.How so? Please explain why it's bad (an opinion by itself isn't an argument).I hope to avoid this last situation. Having the compiler make decisions for me, especially when heap allocation occurs, is bad.It takes some thinking to get the prototype right at first. But it takes less caution calling the function later with local variables since the compiler will either issue an error or automatically fix the issue by allocating on the heap when an argument requires a greater scope.I don't think it's bad to force interfaces to be well documented, and documented in a format that the compiler can understand to find errors like this.I think this concept is going to be really hard for a person to decipher, and really hard to get right.
Nov 02 2008
"Robert Jacques" wroteOn Sat, 01 Nov 2008 12:00:10 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:I think 1 can be possibly done. 2 is a matter of subjectivity, and so far, I haven't seen an example of it. But I also don't want D to become a purely academic language. I want it to keep the system-level performance and usability that drew me to it in the first place."Michel Fortin" wroteVarious research languages have shown both 1 and 2 are possible.Sure it is ;) You have to write a special linker. I think everyone who thinks a scope decoration proposal is going to 1) solve all scope escape issues and 2) be easy to use is dreaming :PThe only way I can see to solve this is to do it at link time. When you link, piece together the parts of the graph that were incomplete, and see if they all work. It would be a very radical change, and might not even work with the current linkers. Especially if you want to do shared libraries, where the linker is builtin to the OS.I think you're dreaming... not that it's a bad thing to have ambition, but that's probably not even possible.Allocating on the heap involves locking a global mutex (as long as the heap is global), searching for a free memory space, possibly running a garbage collection cycle, and finally possibly allocating more memory from the OS. All of these are very expensive compared to adjusting the stack pointer. For instance, I wrote a 'chunk allocator' which uses D's allocator to allocate memory in chunks instead of going to the GC for each piece in dcollections' implementation. Doing this achieved at least a 2x speedup because I was calling on the GC less often. The author of Tango's new container implementation wrote a similar allocator that's even faster than that because it doesn't use the GC for any allocation (of course, you cannot use it to allocate items which have references, because the GC doesn't look at that memory). 
In Tango, many operations rely on using stack allocation for buffers and temporary classes. If the compiler decides I don't know what I'm doing and helpfully allocates those on the heap for my protection, I just lost all the performance that I purposely built the library to have. This is one of the main arguments I hear from the other Tango devs against moving to D2: the automatic dynamic closure. I think many people are not aware of how important it is to avoid heap allocation when possible. It is one of the central goals that makes Tango so much faster than other libraries. -SteveHow so? Please explain why it's bad (an opinion by itself isn't an argument).I hope to avoid this last situation. Having the compiler make decisions for me, especially when heap allocation occurs, is bad.It takes some thinking to get the prototype right at first. But it takes less caution calling the function later with local variables since the compiler will either issue an error or automatically fix the issue by allocating on the heap when an argument requires a greater scope.I don't think it's bad to force interfaces to be well documented, and documented in a format that the compiler can understand to find errors like this.I think this concept is going to be really hard for a person to decipher, and really hard to get right.
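A minimal sketch of the kind of chunk allocator Steven describes (invented names, not dcollections' actual code): one GC allocation serves many nodes, so the global GC lock and collection machinery are hit far less often.

```d
// Illustrative only: bump-pointer allocation out of GC-allocated
// chunks, amortizing the cost of each GC call over chunkSize nodes.
// Freeing individual nodes is deliberately omitted from this sketch.
struct ChunkAllocator(T, size_t chunkSize = 64)
{
    T[] chunk;   // current chunk, obtained from the GC
    size_t used; // slots already handed out from the current chunk

    T* allocate()
    {
        if (used == chunk.length)
        {
            chunk = new T[chunkSize]; // one GC call per chunkSize nodes
            used = 0;
        }
        return &chunk[used++];
    }
}
```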
Nov 03 2008
On 2008-11-03 14:47:25 -0500, "Steven Schveighoffer" <schveiguy yahoo.com> said:I won't dispute this. I'll note that the upcoming "shared" keyword may help regarding not locking a global mutex for unshared variables, but even without the mutex the operation is still expensive.Allocating on the heap involves locking a global mutex (as long as the heap is global), searching for free memory space, possibly running a garbage collection cycle, and finally possibly allocating more memory from the OS. All of these are very expensive compared to adjusting the stack pointer.I hope to avoid this last situation. Having the compiler make decisions for me, especially when heap allocation occurs, is bad.How so? Please explain why it's bad (an opinion by itself isn't an argument).For instance, I wrote a 'chunk allocator' for dcollections' implementation which uses D's allocator to allocate memory in chunks instead of going to the GC for each piece. Doing this achieved at least a 2x speedup because I was calling on the GC less often. The author of Tango's new container implementation wrote a similar allocator that's even faster than that because it doesn't use the GC for any allocation (of course, you cannot use it to allocate items which have references, because the GC doesn't look at that memory).Nothing of the sort should be prevented by a scoping system. If it is, then I'd consider the system a failure.In Tango, many operations rely on using stack allocation for buffers and temporary classes. If the compiler decides I don't know what I'm doing and helpfully allocates those on the heap for my protection, I just lost all the performance that I purposely built the library to have. This is one of the main arguments I hear from the other Tango devs against moving to D2: the automatic dynamic closure.Then we must make sure the compiler doesn't heap allocate when it doesn't absolutely need to. 
And, *in addition*, when the programmer really needs to be sure that a variable is not heap-allocated, marking the variable "scope" would do the trick.I think many people are not aware of how important it is to avoid heap allocation when possible. It is one of the central goals that makes Tango so much faster than other libraries.I agree with your first assertion (and am not familiar enough with Tango to say anything about the second) and this is exactly why I'm in favor of the compiler deciding what to heap-allocate. People are not aware enough of how important it is to avoid heap allocation, so I expect that if the compiler can be made to know about scopes, it can avoid heap allocation where many users wouldn't bother (especially in a garbage-collected language where you can heap-allocate without thinking), which would result in faster programs with fewer bugs, all without having to think about the technical details. Note that I may be wrong about this, but there's no way to be sure without trying. Anyway, once we have a proper scoping system, it'll be easy to try and decide between auto-allocation and simply enforcing constraints by emitting errors. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
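Michel's two allocation policies, side by side, might read like this (hypothetical syntax: the bare declaration lets the compiler choose, while scope pins the variable to the stack):

```d
void caller()
{
    int i;       // compiler's choice: stack normally, heap only if
                 // some callee's prototype demands a longer lifetime
                 // (much as closures are heap-allocated today)

    scope int j; // pinned to the stack by the programmer; passing &j
                 // where a longer lifetime is required becomes a
                 // compile-time error instead of a silent allocation
}
```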
Nov 04 2008
On Wed, 29 Oct 2008 07:28:55 -0400, Michel Fortin <michel.fortin michelf.com> wrote:On 2008-10-28 23:52:04 -0400, "Robert Jacques" <sandford jhu.edu> said:What does the scope part of 'scope MyObject o' mean? (i.e. is this D's current scope or something else?) What does 'scope(o)' explicitly mean? I'm going to assume scope(o) means the scope of o.I've run across some academic work on ownership types which seems relevant to this discussion on share/local/scope/noscope.I haven't read the paper yet, but the overview seems to go in the same direction as I was thinking. Basically, all the scope variables you can get are guaranteed to be in the current or in some ancestor scope. To allow a reference to a scope variable, or a scope function, to be put inside a member of a struct or class, you only need to prove that the struct or class lifetime is less than or equal to that of the reference to your scope variable. If you could tell the compiler the scope relationship of the various arguments, then you'd have pretty good scope analysis. For instance, with this syntax, we could define i to be available during the whole lifetime of o: void foo(scope MyObject o, scope(o) int* i) { o.i = i; }So you could do: void bar() { scope int i; scope MyObject o = new MyObject; foo(o, &i); } And the compiler would let it pass because foo guarantees not to keep references to i outside of o's scope, and o's scope is the same as i's. Or you could do: void test1() { int i; test2(&i); } void test2(scope int* i) { scope o = new MyObject; foo(o, &i);Error: &i is of type int** while foo takes an int*. 
Did you mean foo(o, i)?} Again, the compiler can statically check that test2 won't keep a reference to i outside of the caller's scope (test1) because o's scope is limited to test2.The way I read your example, no useful escape analysis can be done by the compiler, and it works mainly because i is a pointer to a value type.And if you try the reverse: void test1() { scope o = new MyObject; test2(o); } void test2(scope MyObject o) { int i; foo(o, &i); } Then the compiler could determine automatically that i needs to escape test2's scope and allocate the variable on the heap to make its lifetime as long as the object's scope (as it does currently with nested functions) [see my reservations about this in the postscript]. This could be avoided by explicitly binding i to the current scope, in which case the compiler could issue a scope error:The way I read this is o is of type scope MyObject, i is of type scope int and therefore foo(o,&i) is valid and an escape happens.
Oct 29 2008
On 2008-10-29 15:10:00 -0400, "Robert Jacques" <sandford jhu.edu> said:On Wed, 29 Oct 2008 07:28:55 -0400, Michel Fortin <michel.fortin michelf.com> wrote:Ok, I should have defined that better. It means that o is bound to the caller's scope (possibly on the stack). Scopes are created for each function and each {}-delimited block in them; basically it's the stack of the current thread. Once you exit a scope, its variables cease to exist and we must ensure there are no remaining references to them. In this case, "scope MyObject o" means that we're receiving a MyObject reference which could be pointing to somewhere down in the stack *or* the heap. We have to consider the most restrictive constraint however, so let's say it's in the stack. The rule is that you can't place a reference to a scoped variable anywhere below its scope in the stack, making sure that you can't keep a reference to a variable which no longer exists once the top scope has disappeared. Scope stack (call stack with the global scope at the bottom): 1. foo ( scope MyObject o = function1.o ) { } 2. function1 () { scope MyObject o, int i } 3. main () { } ... n. global scope In practical terms, "scope MyObject o" means that we can't put a reference to the object anywhere that lives beyond the current function call... except in a scope return value, but I haven't gotten into that yet.Basically, all the scope variables you can get are guaranteed to be in the current or in some ancestor scope. To allow a reference to a scope variable, or a scope function, to be put inside a member of a struct or class, you only need to prove that the struct or class lifetime is less than or equal to that of the reference to your scope variable. If you could tell the compiler the scope relationship of the various arguments, then you'd have pretty good scope analysis. For instance, with this syntax, we could define i to be available during the whole lifetime of o: void foo(scope MyObject o, scope(o) int* i) { o.i = i; }What does the scope part of 'scope MyObject o' mean? (i.e. is this D's current scope or something else?)What does 'scope(o)' explicitly mean? I'm going to assume scope(o) means the scope of o.That's it... mostly. scope(o) is the scope of o, or any scope below o. Take it as any scope valid as long as o exists. If o were not scope, scope(o) would be noscope.Oops. Indeed, I meant foo(o, i).
For instance, with this syntax, we could define i to be available during the whole lifetime of o: void foo(scope MyObject o, scope(o) int* i) { o.i = i; }What does the scope part of 'scope MyObject o' mean? (i.e. is this D's current scope or something else?)What does 'scope(o)' explicitly mean? I'm going to assume scope(o) means the scope of o.That's it... mostly. scope(o) is the scope of o, or any scope below o. Take it as any scope valid as long as o exists. If o was not scope, scope(o) would be noscope.Oops. Indeed, I meant foo(o, i).So you could do: void bar() { scope int i; scope MyObject o = new MyObject; foo(o, &i); } And the compiler would let it pass because foo guarenties not to keep references to i outside of o's scope, and o's scope is the same as i. Or you could do: void test1() { int i; test2(&i); } void test2(scope int* i) { scope o = new MyObject; foo(o, &i);Error: &i is of type int** while foo takes a int*. Did you mean foo(o, i)?It's not escape analysis. It scoping constrains enforced by making sure that every function declares what may escape and what may not. If this was a pure value type passed by copy, scope would be meaningless indeed as there would be no reference that could escape.} Again, the compiler can statically check that test2 won't keep a reference to i outside of the caller's scope (test1) because o scope is limited to test2.The way I read your example, no useful escape analysis can be done by the complier, and it works mainly because i is a pointer to a value type.That's my point. The compiler can detect an escape may happen just by looking at the funciton prototype for foo. The prototype tells us that foo needs i to be at the same or a lower scope than o, something we don't have here. The compiler can then decide to allocate i dynamically on the heap to make sure it exists for at least the scope of o; or it could be decided to just make that illegal. 
I prefer automatic heap allocation, as it means we can get rid of the decision to statically or dynamically allocate variables: the compiler can decide based on the function prototypes whichever is best. For cases where you really mean a variable to be on the stack, you can use scope, as in: scope int i; and the compiler would just issue an error if you attempt to give a reference to i to a function that wants to use it in a lower scope. Otherwise, the compiler would be free to decide whichever scope to use between local or heap-allocated. -- Michel Fortin michel.fortin michelf.com http://michelf.com/And if you try the reverse: void test1() { scope o = new MyObject; test2(o); } void test2(scope MyObject o) { int i; foo(o, &i); } Then the compiler could determine automatically that i needs to escape test2's scope and allocate the variable on the heap to make its lifetime as long as the object's scope (as it does currently with nested functions) [see my reservations about this in post scriptum]. This could be avoided by explicitly binding i to the current scope, in which case the compiler could issue a scope error:The way I read this is o is of type scope MyObject, i is of type scope int and therefore foo(o,&i) is valid and an escape happens.
Oct 30 2008
On Thu, 30 Oct 2008 08:14:31 -0400, Michel Fortin <michel.fortin michelf.com> wrote:Just to clarify: void test2(scope MyObject o) // the scope of o is a parent of test2 { int i; // the scope of i is test2 foo(o, &i); // foo(o,&i) requires &i to have o's scope or a parent of o's scope, so i must be heap (the root parent) allocated. } A problem I see is that once shared/local are introduced, you have multiple heaps where i should be allocated, depending on the runtime type of o. How would this be handled in this scheme?That's my point. The compiler can detect an escape may happen just by looking at the function prototype for foo. The prototype tells us that foo needs i to be at the same or a lower scope than o, something we don't have here. The compiler can then decide to allocate i dynamically on the heap to make sure it exists for at least the scope of o; or it could be decided to just make that illegal. I prefer automatic heap allocation, as it means we can get rid of the decision to statically or dynamically allocate variables: the compiler can decide based on the function prototypes whichever is best. For cases where you really mean a variable to be on the stack, you can use scope, as in: scope int i; and the compiler would just issue an error if you attempt to give a reference to i to a function that wants to use it in a lower scope. Otherwise, the compiler would be free to decide whichever scope to use between local or heap-allocated.And if you try the reverse: void test1() { scope o = new MyObject; test2(o); } void test2(scope MyObject o) { int i; foo(o, &i); } Then the compiler could determine automatically that i needs to escape test2's scope and allocate the variable on the heap to make its lifetime as long as the object's scope (as it does currently with nested functions) [see my reservations about this in post scriptum]. 
This could be avoided by explicitly binding i to the current scope, in which case the compiler could issue a scope error:The way I read this is o is of type scope MyObject, i is of type scope int and therefore foo(o,&i) is valid and an escape happens.
Oct 30 2008
On 2008-10-30 09:04:10 -0400, "Robert Jacques" <sandford jhu.edu> said:Just to clarify: void test2(scope MyObject o) // the scope of o is a parent of test2 { int i; // the scope of i is test2 foo(o, &i); // foo(o,&i) requires &i to have o's scope or a parent of o's scope, so i must be heap (the root parent) allocated. } A problem I see is that once shared/local are introduced, you have multiple heaps where i should be allocated, depending on the runtime type of o. How would this be handled in this scheme?Well, it all depends on whether foo wants its second argument i to be shared or not. If foo's declaration was like this: void foo(scope MyObject o, scope(o) shared int* i); then you'd need to use "shared int i" in test2 to avoid an error at the call site. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 30 2008
On Thu, 30 Oct 2008 21:01:27 -0400, Michel Fortin <michel.fortin michelf.com> wrote:On 2008-10-30 09:04:10 -0400, "Robert Jacques" <sandford jhu.edu> said:Actually, what I meant was that o may be local or shared. However, assuming thin-locks, o may be tested at runtime for shared/local cheaply and the right allocation done.Just to clarify: void test2(scope MyObject o) // the scope of o is a parent of test2 { int i; // the scope of i is test2 foo(o, &i); // foo(o,&i) requires &i to have o's scope or a parent of o's scope, so i must be heap (the root parent) allocated. } A problem I see is that once shared/local are introduced, you have multiple heaps where i should be allocated, depending on the runtime type of o. How would this be handled in this scheme?Well, it all depends on whether foo wants its second argument i to be shared or not. If foo's declaration was like this: void foo(scope MyObject o, scope(o) shared int* i); then you'd need to use "shared int i" in test2 to avoid an error at the call site.
Oct 31 2008
On Wed, 29 Oct 2008 07:28:55 -0400, Michel Fortin <michel.fortin michelf.com> wrote:P.P.S.: This syntax doesn't fit very well with the current scope(success/failure/exit) feature.How about o.scope instead of scope(o)? Also, this would allow contract-like syntax: void foo (myObject o, int* i) if (o.scope <= i.scope) { ... }
Oct 30 2008
On 2008-10-30 14:07:42 -0400, "Robert Jacques" <sandford jhu.edu> said:On Wed, 29 Oct 2008 07:28:55 -0400, Michel Fortin <michel.fortin michelf.com> wrote:Hum, but can that syntax guarantee a reference to o or i won't escape the current function's scope, like void foo(scope Object o); ? -- Michel Fortin michel.fortin michelf.com http://michelf.com/P.P.S.: This syntax doesn't fit very well with the current scope(success/failure/exit) feature.How about o.scope instead of scope(o)? Also, this would allow contract-like syntax: void foo (myObject o, int* i) if (o.scope <= i.scope) { ... }
Oct 30 2008
On Thu, 30 Oct 2008 21:01:28 -0400, Michel Fortin <michel.fortin michelf.com> wrote:On 2008-10-30 14:07:42 -0400, "Robert Jacques" <sandford jhu.edu> said:No, the syntax was meant to address the more complex problem of specifying the concept of scope(o). It also adds some flexibility for other relationships. As for do-not-escape, I'm assuming a no_escape type (it would behave as a transitive version of final). I dislike reusing the scope keyword for this as void foo(scope Object a) { scope Object b = new Object(); scope Object c = b; // Okay scope Object d = a; // Error }On Wed, 29 Oct 2008 07:28:55 -0400, Michel Fortin <michel.fortin michelf.com> wrote:Hum, but can that syntax guarantee a reference to o or i won't escape the current function's scope, like void foo(scope Object o); ?P.P.S.: This syntax doesn't fit very well with the current scope(success/failure/exit) feature.How about o.scope instead of scope(o)? Also, this would allow contract-like syntax: void foo (myObject o, int* i) if (o.scope <= i.scope) { ... }
Oct 31 2008
On Thu, 30 Oct 2008 21:01:28 -0400, Michel Fortin <michel.fortin michelf.com> wrote:On 2008-10-30 14:07:42 -0400, "Robert Jacques" <sandford jhu.edu> said:Another option is for the default to be escape. I.e., a contract is required for an escape to happen Object o; void foo(Object a, Object b) if(b.scope <= o.scope) { o = b; // Okay o = a; // Error }On Wed, 29 Oct 2008 07:28:55 -0400, Michel Fortin <michel.fortin michelf.com> wrote:Hum, but can that syntax guarantee a reference to o or i won't escape the current function's scope, like void foo(scope Object o); ?P.P.S.: This syntax doesn't fit very well with the current scope(success/failure/exit) feature.How about o.scope instead of scope(o)? Also, this would allow contract-like syntax: void foo (myObject o, int* i) if (o.scope <= i.scope) { ... }
Oct 31 2008
On Fri, 31 Oct 2008 11:02:31 -0400, Robert Jacques <sandford jhu.edu> wrote:Another option is for the default to be escape.Correction: default to be _no_ escape.
Oct 31 2008
I think C++ designers are fully mad, this shows how to use C++ lambdas: http://blogs.msdn.com/vcblog/archive/2008/10/28/lambdas-auto-and-static-assert-c-0x-features-in-vc10-part-1.aspx If D2 lambdas (and closures) become even *half* as complex as that I'm going to stop using D on the spot :-) Bye, bearophile
Oct 29 2008
Wed, 29 Oct 2008 09:46:22 -0400, bearophile wrote:I think C++ designers are fully mad, this shows how to use C++ lambdas: http://blogs.msdn.com/vcblog/archive/2008/10/28/lambdas-auto-and-static-assert-c-0x-features-in-vc10-part-1.aspx If D2 lambdas (and closures) become *half* complex as that I'm going to stop using D on the spot :-)Well, they're somewhat limited, and a bit manual, and actually just a syntactic sugar, but otherwise they're quite close to D's stack delegates, even in syntax. I couldn't see what scared you that much.
Oct 29 2008
On Thu, Oct 30, 2008 at 12:49 AM, Sergey Gromov <snake.scaly gmail.com> wrote:Wed, 29 Oct 2008 09:46:22 -0400, bearophile wrote:I think it's mostly the capture mode [] stuff that's a bit ugly. I think this is a legal lambda: [=,this,&x,&y](int& r) mutable { ... } Anyway, I'm impressed that MS is getting these things into the compiler so quickly. I had expected to see another C99 foot-dragging extravaganza. Guess it just goes to show how little they care about C. --bbI think C++ designers are fully mad, this shows how to use C++ lambdas: http://blogs.msdn.com/vcblog/archive/2008/10/28/lambdas-auto-and-static-assert-c-0x-features-in-vc10-part-1.aspx If D2 lambdas (and closures) become *half* complex as that I'm going to stop using D on the spot :-)Well, they're somewhat limited, and a bit manual, and actually just a syntactic sugar, but otherwise they're quite close to D's stack delegates, even in syntax. I couldn't see what scared you that much.
Oct 29 2008
Thu, 30 Oct 2008 04:06:52 +0900, Bill Baxter wrote:On Thu, Oct 30, 2008 at 12:49 AM, Sergey Gromov <snake.scaly gmail.com> wrote:The discussed features are really a significant improvement for C++ productivity. I think a lot of C++ code is still being written by MS so improving productivity here should be a priority for them. I wonder if there are any chances for typeof() in C++.Wed, 29 Oct 2008 09:46:22 -0400, bearophile wrote:Anyway, I'm impressed that MS is getting these things into the compiler so quickly. I had expected to see another C99 foot-dragging extravaganza. Guess it just goes to show how little they care about C.I think C++ designers are fully mad, this shows how to use C++ lambdas: http://blogs.msdn.com/vcblog/archive/2008/10/28/lambdas-auto-and-static-assert-c-0x-features-in-vc10-part-1.aspx
Oct 29 2008
On Wed, Oct 29, 2008 at 5:30 PM, Sergey Gromov <snake.scaly gmail.com> wrote:I wonder if there are any chances for typeof() in C++.It's called decltype().
Oct 29 2008
Bill Baxter wrote:Anyway, I'm impressed that MS is getting these things into the compiler so quickly. I had expected to see another C99 foot-dragging extravaganza. Guess it just goes to show how little they care about C.C++ is a .NET language now ;-P
Oct 29 2008
I wonder if it would be easy enough to allocate closures lazily at runtime. So the compiler scans executable code, and any time there is an assignment (passing as function args doesn't count, returning does) involving delegates, it inserts code that will do the following: - Check whether the delegate being assigned from is on the stack or the heap. - If it's on the stack, make a copy on the heap, and use that. Scope (partial) closures never get assigned to other things, so no extra code will ever be generated or executed for them. I worry that this might be more complicated with multithreading though. Also, I'm not sure how to make sure all calls to the closure access the same context, and that the function that contains the context also knows when its context has moved off of the stack and into the heap. I'm not sure of this because I'm also not sure how that's handled anyways. Also notable is that the heuristic I suggest is just that; it is not necessarily optimal or even strictly lazy. There are cases where delegates could be passed around by assignment yet never escape their scope. Maybe it is easy enough to add that as another condition for the runtime check: is this delegate being assigned to some place in the heap or too far up (down?) in the stack? Just an optimization though, and probably one not nearly as important. OK so all of this doesn't help much with the more general problem of /static/ escape analysis. Oh well.
Oct 30 2008
Walter Bright wrote:void bar(noscope int* p); // p escapes void bar(scope int* p); // p does not escape void bar(int* p); // what should be the default? What should be the default? The functional programmer would probably choose scope as the default, and the OOP programmer noscope. (The issue with delegates is we need the dynamic closure only if the delegate 'escapes'.)I appreciate OOP. I also appreciate it when it takes no significant effort to write safe code. I also appreciate it when I don't have to convince the compiler that what I'm doing is safe when I know it's safe. In the case of pointers, I don't use them, most of the time. (I'm working on a variable-key-length cache-oblivious lookahead array right now, and that requires pointers for efficiency, but this is probably the first time I've used pointers in D.) In the case of delegates, I use them. I've been confused and upset by the lack of closures in D1. I think a lot of new programmers will expect closures and get confused by having two different ways of declaring them. For my code, I won't mind using whatever new syntax for closures, even if it's slightly verbose. For new programmers, I'd recommend using closures by default, since they're safer. Once they're more comfortable with the language, you can introduce the idea of allocating delegate context on the stack as an occasionally unsafe optimization.
Nov 01 2008