digitalmars.D - Escape Analysis on reddit
- Walter Bright (1/1) Oct 31 2008 http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/
- Steven Schveighoffer (52/53) Oct 31 2008 The question is, in the example:
- Denis Koroskin (15/72) Oct 31 2008 I will state the question slightly different. f4 is now pure. What chang...
- Andrei Alexandrescu (27/92) Oct 31 2008 The problem is that foo does not know enough about bar to deem the call
- Robert Fraser (3/10) Oct 31 2008 I think Steven's point was that these programs are not the few; they're
- Walter Bright (7/9) Oct 31 2008 I'm not sure that's the case, or if it is, those programs will suffer
- Steven Schveighoffer (15/24) Oct 31 2008 Yes, my point in all this is, if we are going to 'solve' the scope escap...
- Michel Fortin (19/32) Oct 31 2008 But declaring arguments as scope solve the heap allocation problem,
- Steven Schveighoffer (25/54) Nov 01 2008 What I was talking about is possible cases where it can be proven (with ...
- Walter Bright (3/8) Oct 31 2008 Pushing things off to the linker is not practical because then you
- Robert Fraser (2/12) Nov 01 2008 Or dynamic ones.
- Steven Schveighoffer (10/22) Nov 01 2008 Both good points. Such is the requirement for a full analysis though, a...
- Sergey Gromov (8/32) Nov 07 2008 Or there could be a separate tool, like verify in Java. DLLs can be
- Steven Schveighoffer (47/132) Oct 31 2008 No, foo doesn't know anything. The compiler is the one who does not kno...
- Walter Bright (19/33) Oct 31 2008 Assume the compiler keeps track of, for every expression, a list of
- Steven Schveighoffer (11/44) Nov 01 2008 OK, I assumed that was an easy one, but I was establishing how it's hard...
- Walter Bright (2/5) Nov 01 2008 I hadn't thought about a swap function. Hrmmm...
- Andrei Alexandrescu (58/201) Oct 31 2008 Well obviously. When I said "foo doesn't know" I really meant "the
- Andrei Alexandrescu (8/13) Oct 31 2008 Lo and behold, I found the paper. I'm amazed because it's been many
- Dave (8/21) Nov 01 2008 Another interesting blog on the subject of pointer analysis. This one ta...
- ore-sama (2/17) Nov 01 2008 If compiler is able to mark data as escaping and compare escape marks of...
- Andrei Alexandrescu (8/29) Nov 01 2008 Excellent point. The compiler can do that, but it would take a long time...
- bearophile (4/11) Nov 01 2008 I can see a better intermediate solution: let the compiler do interproce...
- Michel Fortin (110/118) Nov 01 2008 Tell me how often you encounter a function that swap two variables then.
- Andrei Alexandrescu (17/123) Nov 01 2008 "scope" is not needed inside the function (in both cases). The compiler
- Michel Fortin (19/122) Nov 01 2008 Indeed.
- Andrei Alexandrescu (16/149) Nov 01 2008 I think you're putting forth a compelling argument. Thank you! I will
- Steven Schveighoffer (33/160) Nov 01 2008 I agree that implementing it on credit alone is going to open up problem...
- Andrei Alexandrescu (6/19) Nov 01 2008 That's an excellent point. Casts won't work because scope is not really
- Walter Bright (2/5) Oct 31 2008 That would certainly address that problem.
- Andrei Alexandrescu (6/7) Oct 31 2008 Oh, I encourage people who find the discussion even marginally
- Jason House (3/12) Oct 31 2008 Not to be rude, but the articles usually aren't very good. This article ...
- Andrei Alexandrescu (3/13) Oct 31 2008 I agree.
- Walter Bright (3/13) Oct 31 2008 Point taken, but on the other hand, I've found if the articles are long
- Jason House (2/16) Oct 31 2008 I was responding to the frequent requests to vote for D articles. I defi...
- Walter Bright (6/10) Oct 31 2008 That's fair, and I appreciate the constructive criticism. Would you mind...
- Jason House (4/15) Nov 01 2008 I don't mind. It could even be fun :)
- bearophile (4/5) Oct 31 2008 I think for this complex stuff Lambda the Ultimate is much fitter than R...
- Sean Kelly (6/10) Oct 31 2008 It's a bit off-topic, but Jaron Lanier wrote an interesting diatribe
- ore-sama (12/13) Oct 31 2008 how to mark and verify this?
- Steven Schveighoffer (8/21) Nov 01 2008 I think your example would be impossible for the compiler to prove safe,...
- Michel Fortin (13/31) Nov 01 2008 Just create a scope ThreadPool object, accepting a scopeof(this)
- Walter Bright (2/5) Nov 01 2008 Yes.
http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/
Oct 31 2008
"Walter Bright" wrotehttp://www.reddit.com/r/programming/comments/7akzd/escape_analysis/The question is, in the example: int* bar(int* p) { return p; } int* foo() { int x = 3; return bar(&x); } where is the problem? Is the problem in bar, or is the problem in foo? The real problem is in foo, because there's technically nothing wrong with bar. So what ends up happening in your scheme is this: int *bar(int *p) { return p;} void foo(scope int *p) { int *x = bar(p); // no escape, but compile error } So yes, you are protected, but at the cost of not being able to write code that is provably safe. The end result is you end up removing scope where it really should be, and then virally, it makes you remove scope decorations everywhere else. I predict that if this is implemented, nobody will be able to use it. In order for this to be useful, we need to have a way to express total relationships between parameters and the return value. And that would probably be so convoluted that it's not worth the syntax. I'll reiterate my challenge from another thread: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;} struct S { int *v; } int *f2(S* s) { return s.v;} void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; } void f4() { int a=1, b=2, c=3; auto ap = &a; auto bp = &b; auto cp = &c; S s(cp); f(ap, bp, cp); b = *f2(&s); f3(ap, bp, cp); } Please tell me how I should mark f, S, f2, and f3 such that f4 compiles. Note that this code all works in D1 and provably has no problems with escaping values, since f4 returns void, and no globals are accessed. -Steve
Oct 31 2008
On Fri, 31 Oct 2008 23:17:41 +0300, Steven Schveighoffer <schveiguy yahoo.com> wrote:"Walter Bright" wroteI will state the question slightly different. f4 is now pure. What changes should be done to source code (if any) so that it compiles? pure void f4() { int a=1, b=2, c=3; auto ap = &a; auto bp = &b; auto cp = &c; S s(cp); f(ap, bp, cp); b = *f2(&s); f3(ap, bp, cp); }http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/The question is, in the example: int* bar(int* p) { return p; } int* foo() { int x = 3; return bar(&x); } where is the problem? Is the problem in bar, or is the problem in foo? The real problem is in foo, because there's technically nothing wrong with bar. So what ends up happening in your scheme is this: int *bar(int *p) { return p;} void foo(scope int *p) { int *x = bar(p); // no escape, but compile error } So yes, you are protected, but at the cost of not being able to write code that is provably safe. The end result is you end up removing scope where it really should be, and then virally, it makes you remove scope decorations everywhere else. I predict that if this is implemented, nobody will be able to use it. In order for this to be useful, we need to have a way to express total relationships between parameters and the return value. And that would probably be so convoluted that it's not worth the syntax. I'll reiterate my challenge from another thread: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;} struct S { int *v; } int *f2(S* s) { return s.v;} void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; } void f4() { int a=1, b=2, c=3; auto ap = &a; auto bp = &b; auto cp = &c; S s(cp); f(ap, bp, cp); b = *f2(&s); f3(ap, bp, cp); } Please tell me how I should mark f, S, f2, and f3 such that f4 compiles. Note that this code all works in D1 and provably has no problems with escaping values, since f4 returns void, and no globals are accessed. -Steve
Oct 31 2008
Steven Schveighoffer wrote:"Walter Bright" wroteThe problem is that foo does not know enough about bar to deem the call correct. Notice that if you consider local functions instead of ints, that gets you straight to the delegates case that is the crux of the matter.http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/The question is, in the example: int* bar(int* p) { return p; } int* foo() { int x = 3; return bar(&x); } where is the problem? Is the problem in bar, or is the problem in foo? The real problem is in foo, because there's technically nothing wrong with bar.So what ends up happening in your scheme is this: int *bar(int *p) { return p;} void foo(scope int *p) { int *x = bar(p); // no escape, but compile error }Well I was talking to Walter how in compiling a function there are two scopes to track: that of parameters, and that of locals inside functions. The former includes the latter. Bar should be defined as: scope int* bar(scope int* p) { return p; } which is easy to check as long as some conservativeness is acceptable.So yes, you are protected, but at the cost of not being able to write code that is provably safe. The end result is you end up removing scope where it really should be, and then virally, it makes you remove scope decorations everywhere else. I predict that if this is implemented, nobody will be able to use it.There are many cases in which statically typed programs disallow valid constructs. Each case is a matter of nuance, but by and large types have turned out to be successful.In order for this to be useful, we need to have a way to express total relationships between parameters and the return value. And that would probably be so convoluted that it's not worth the syntax.I don't think there is a need to express the exact relationship.I'll reiterate my challenge from another thread: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;} struct S { int *v; } int *f2(S* s) { return s.v;} void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; } void f4() { int a=1, b=2, c=3; auto ap = &a; auto bp = &b; auto cp = &c; S s(cp); f(ap, bp, cp); b = *f2(&s); f3(ap, bp, cp); } Please tell me how I should mark f, S, f2, and f3 such that f4 compiles. Note that this code all works in D1 and provably has no problems with escaping values, since f4 returns void, and no globals are accessed.First off, we need to clarify that "provably" is a bit strong, and that the hypothesis "f4 returns void, and no globals are accessed" is not all that's needed for the proof. Full access to function bodies is necessary. Correctness may be provable to a human or machine having access to the entire body of code, but not if e.g. f, f2, and f3 are invisible to f4. Second, I contest the entire example. Static typing is supposed to prove interesting properties about *most* correct programs, at the cost of disallowing a *few* correct programs that are not very frequent, are not very interesting, or can be easily replaced with typesafe alternatives. One can write correct code based on void* that is virtually impossible to typecheck statically, but that doesn't show that static types are useless. Andrei
Oct 31 2008
Andrei Alexandrescu wrote:Second, I contest the entire example. Static typing is supposed to prove interesting properties about *most* correct programs, at the cost of disallowing a *few* correct programs that are not very frequent, are not very interesting, or can be easily replaced with typesafe alternatives. One can write correct code based on void* that is virtually impossible to typecheck statically, but that doesn't show that static types are useless.I think Steven's point was that these programs are not the few; they're the many. Most fairly large programs will have issues like this one.
Oct 31 2008
Robert Fraser wrote:I think Steven's point was that these programs are not the few; they're the many. Most fairly large programs will have issues like this one.I'm not sure that's the case, or if it is, those programs will suffer significantly from using an alternative (the most obvious alternative is to allocate those particular cases on the heap). Consider the other problem. It's not easy for a human to look at Steven's case and verify that it is correct. A small mistake in it, and it isn't correct anymore. What's it worth to be rid of such problems?
Oct 31 2008
"Walter Bright" wroteRobert Fraser wrote:Yes, my point in all this is, if we are going to 'solve' the scope escape issue which today causes unnecessary heap allocations, let's not make new syntax that makes things harder or impossible to do and *still* have the heap allocation problem.I think Steven's point was that these programs are not the few; they're the many. Most fairly large programs will have issues like this one.I'm not sure that's the case, or if it is, those programs will suffer significantly from using an alternative (the most obvious alternative is to allocate those particular cases on the heap).Consider the other problem. It's not easy for a human to look at Steven's case and verify that it is correct. A small mistake in it, and it isn't correct anymore. What's it worth to be rid of such problems?It's worth a lot, I agree. But not at the cost of invalidating tested efficient designs such as Tango. FWIW, I think with changes to the compiler/linker, this can be all detected automatically without any extra syntax. Consider that the linker currently resolves dependencies of symbols. It could also resolve incomplete escape analysis graphs. If the graphs don't match up, it's a linker error. This would be needed in the case of interfaces and virtual functions. That would be the ultimate solution IMO, but definitely not in time for D2 ;) -Steve
Oct 31 2008
On 2008-10-31 22:47:59 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:Yes, my point in all this is, if we are going to 'solve' the scope escape issue which today causes unnecessary heap allocations, let's not make new syntax that makes things harder or impossible to do and *still* have the heap allocation problem.But declaring arguments as scope solve the heap allocation problem, except where not heap-allocating would create a bug because of someone keeping the reference for too long.It's worth a lot, I agree. But not at the cost of invalidating tested efficient designs such as Tango.If you can't annotate Tango's code to be scope-correct, then there are probably hidden bugs in there. If D scope syntax isn't expressive enough to annotate valid and efficient designs, then the syntax is probably not well enough.FWIW, I think with changes to the compiler/linker, this can be all detected automatically without any extra syntax. Consider that the linker currently resolves dependencies of symbols. It could also resolve incomplete escape analysis graphs. If the graphs don't match up, it's a linker error. This would be needed in the case of interfaces and virtual functions.Here you're expecting the linker to perform escape analysis on comiled code, a world where pointers, numbers and raw data are the same thing: bytes and values in registers. That's probably a million times harder to do than when working on the semantic tree of a D program, if that's even possible.That would be the ultimate solution IMO, but definitely not in time for D2 ;)Indeed. :-) -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 31 2008
"Michel Fortin" wroteOn 2008-10-31 22:47:59 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:What I was talking about is possible cases where it can be proven (with all source code available) that a heap allocation is not required, but the compiler will not let it compile unless it is heap-allocated. Therefore, you have to do the heap allocation to get it to work. So we would still have the heap allocation problem, but now you have to resolve it manually.Yes, my point in all this is, if we are going to 'solve' the scope escape issue which today causes unnecessary heap allocations, let's not make new syntax that makes things harder or impossible to do and *still* have the heap allocation problem.But declaring arguments as scope solve the heap allocation problem, except where not heap-allocating would create a bug because of someone keeping the reference for too long.It depends on what 'scope correct' means. If the syntax to annotate is expressive enough, then it should be possible in most cases. Tango is used quite a bit, and most code has bugs, but generally, scope escape bugs are rare, and special care is taken by the devs to make sure it doesn't happen.It's worth a lot, I agree. But not at the cost of invalidating tested efficient designs such as Tango.If you can't annotate Tango's code to be scope-correct, then there are probably hidden bugs in there.If D scope syntax isn't expressive enough to annotate valid and efficient designs, then the syntax is probably not well enough.Yes, this is what I'm concerned about.I mean that the linker would be given a scope-graph metadata (in the object file), which contains simple dependency edges between parameters of functions and their return values. The graph is generated by the compiler like you said, when working on the semantic tree. The linker just resolves undefined vertices, connecting the incomplete graphs to see if the entire graph is valid. It only makes sense for values that contain references, so code like this: int f(scope int x) should mean nothing to the compiler. So the linker only has to worry about references, and doesn't really need to understand the types of the parameters, just where they can escape to. -SteveFWIW, I think with changes to the compiler/linker, this can be all detected automatically without any extra syntax. Consider that the linker currently resolves dependencies of symbols. It could also resolve incomplete escape analysis graphs. If the graphs don't match up, it's a linker error. This would be needed in the case of interfaces and virtual functions.Here you're expecting the linker to perform escape analysis on comiled code, a world where pointers, numbers and raw data are the same thing: bytes and values in registers. That's probably a million times harder to do than when working on the semantic tree of a D program, if that's even possible.
Nov 01 2008
Steven Schveighoffer wrote:FWIW, I think with changes to the compiler/linker, this can be all detected automatically without any extra syntax. Consider that the linker currently resolves dependencies of symbols. It could also resolve incomplete escape analysis graphs. If the graphs don't match up, it's a linker error. This would be needed in the case of interfaces and virtual functions.Pushing things off to the linker is not practical because then you cannot rely on standard, off-the-shelf linkers.
Oct 31 2008
Walter Bright wrote:Steven Schveighoffer wrote:Or dynamic ones.FWIW, I think with changes to the compiler/linker, this can be all detected automatically without any extra syntax. Consider that the linker currently resolves dependencies of symbols. It could also resolve incomplete escape analysis graphs. If the graphs don't match up, it's a linker error. This would be needed in the case of interfaces and virtual functions.Pushing things off to the linker is not practical because then you cannot rely on standard, off-the-shelf linkers.
Nov 01 2008
"Robert Fraser" wroteWalter Bright wrote:Both good points. Such is the requirement for a full analysis though, as you don't always have all the source available, and if you have cyclic dependencies, you have a chicken-and-egg problem. But if this scope proposal just makes D nightmarish, either with convoluted syntax or correct code not compiling, it's still an option to think about. As far as dynamic linkers, at least on windows, the DLL can have a routine that resolves the graphs when loaded. I'm not sure about shared objects though... -SteveSteven Schveighoffer wrote:Or dynamic ones.FWIW, I think with changes to the compiler/linker, this can be all detected automatically without any extra syntax. Consider that the linker currently resolves dependencies of symbols. It could also resolve incomplete escape analysis graphs. If the graphs don't match up, it's a linker error. This would be needed in the case of interfaces and virtual functions.Pushing things off to the linker is not practical because then you cannot rely on standard, off-the-shelf linkers.
Nov 01 2008
Sat, 1 Nov 2008 11:28:15 -0400, Steven Schveighoffer wrote:"Robert Fraser" wroteOr there could be a separate tool, like verify in Java. DLLs can be handled, too, because every DLL must have an import library. So it can include an import escape graph as well. Though there'll have to be an escape metal-language anyway to be able to describe external APIs and dynamically loaded DLLs. An image of core.sys.windows.windows annotated to support escape analysis boggles my mind.Walter Bright wrote:Both good points. Such is the requirement for a full analysis though, as you don't always have all the source available, and if you have cyclic dependencies, you have a chicken-and-egg problem. But if this scope proposal just makes D nightmarish, either with convoluted syntax or correct code not compiling, it's still an option to think about. As far as dynamic linkers, at least on windows, the DLL can have a routine that resolves the graphs when loaded. I'm not sure about shared objects though...Steven Schveighoffer wrote:Or dynamic ones.FWIW, I think with changes to the compiler/linker, this can be all detected automatically without any extra syntax. Consider that the linker currently resolves dependencies of symbols. It could also resolve incomplete escape analysis graphs. If the graphs don't match up, it's a linker error. This would be needed in the case of interfaces and virtual functions.Pushing things off to the linker is not practical because then you cannot rely on standard, off-the-shelf linkers.
Nov 07 2008
"Andrei Alexandrescu" wroteSteven Schveighoffer wrote:No, foo doesn't know anything. The compiler is the one who does not know. The user does, so it looks to the user like the compiler has a bug (which I would argue is correct)."Walter Bright" wroteThe problem is that foo does not know enough about bar to deem the call correct. Notice that if you consider local functions instead of ints, that gets you straight to the delegates case that is the crux of the matter.http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/The question is, in the example: int* bar(int* p) { return p; } int* foo() { int x = 3; return bar(&x); } where is the problem? Is the problem in bar, or is the problem in foo? The real problem is in foo, because there's technically nothing wrong with bar.So how does foo know what scope return means? Consider a min function: scope T min(T)(scope T v1, scope T v2); Now, let's try it out: scope int *foo(scope int *p) { int x; return min(&x, p); } Shouldn't this not compile? If you say 'well, the compiler can take the most conservative approach, and assume scope means the most inner scope.' OK scope T[] findFirst(scope T[] buf, scope T target); scope T[] findFirst5(scope T[] buf) { scope t5 = new T(5); return findFirst(buf, t5); } The problem is, unless you allow fully specifying dependencies, there is going to be ways people call functions that you didn't think of, which are legitimate designs, and just can't be done with your new regime. The ultimate result is either people cast around the system, don't use the system, or end up allocating unecessarily on the heap to work around it.So what ends up happening in your scheme is this: int *bar(int *p) { return p;} void foo(scope int *p) { int *x = bar(p); // no escape, but compile error }Well I was talking to Walter how in compiling a function there are two scopes to track: that of parameters, and that of locals inside functions. The former includes the latter. Bar should be defined as: scope int* bar(scope int* p) { return p; } which is easy to check as long as some conservativeness is acceptable.I would bet that either A) the proposed system would not provide correct protection, or B) the system would require many design changes in a library like Tango, which uses lots of stack variables and inner functions to achive higher performance. It just doesn't work with actual already successful software programs. It's too limited.So yes, you are protected, but at the cost of not being able to write code that is provably safe. The end result is you end up removing scope where it really should be, and then virally, it makes you remove scope decorations everywhere else. I predict that if this is implemented, nobody will be able to use it.There are many cases in which statically typed programs disallow valid constructs. Each case is a matter of nuance, but by and large types have turned out to be successful.OK, if f4 returns void, and takes no arguments, and none of the code used by any of the functions accesses globals (I admit, I may have said this ambiguously last time), there's not even any heap activity, you tell me how it can escape. Prove me wrong. You have full access to the source as you can see.I'll reiterate my challenge from another thread: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;} struct S { int *v; } int *f2(S* s) { return s.v;} void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; } void f4() { int a=1, b=2, c=3; auto ap = &a; auto bp = &b; auto cp = &c; S s(cp); f(ap, bp, cp); b = *f2(&s); f3(ap, bp, cp); } Please tell me how I should mark f, S, f2, and f3 such that f4 compiles. Note that this code all works in D1 and provably has no problems with escaping values, since f4 returns void, and no globals are accessed.First off, we need to clarify that "provably" is a bit strong, and that the hypothesis "f4 returns void, and no globals are accessed" is not all that's needed for the proof. Full access to function bodies is necessary.Correctness may be provable to a human or machine having access to the entire body of code, but not if e.g. f, f2, and f3 are invisible to f4.That is the challenge. If I remove the function bodies, how do you mark the signatures so you can still prove that it's correct?Second, I contest the entire example. Static typing is supposed to prove interesting properties about *most* correct programs, at the cost of disallowing a *few* correct programs that are not very frequent, are not very interesting, or can be easily replaced with typesafe alternatives. One can write correct code based on void* that is virtually impossible to typecheck statically, but that doesn't show that static types are useless.My example is at least as interesting as Walter's in the article to show the problem. If you want to complain about simple examples not being useful, then stop using simple examples to show your point. We are not talking about mucking with void *'s or drastic violations of types here, we are talking about reference parameters for returns, or an accessor of a member, or a 3way swap function. These are all common designs. The fact that you cannot find a solution so you resort to just bluntly calling my example 'uninteresting' speaks for itself. -Steve
Oct 31 2008
Steven Schveighoffer wrote:So how does foo know what scope return means? Consider a min function: scope T min(T)(scope T v1, scope T v2); Now, let's try it out: scope int *foo(scope int *p) { int x; return min(&x, p); } Shouldn't this not compile? If you say 'well, the compiler can take the most conservative approach, and assume scope means the most inner scope.'Assume the compiler keeps track of, for every expression, a list of references that it is composed of. For: min(&x, p) the compiler knows that min() returns some combination of v1 and v2, so the expression min(&x, p) is tagged with the list [&x,p]. Now it examines the return, and sees that [&x,p] is escaping the scope of the function. Although p is a scope parameter, the scope on the foo() return value says that's ok, so that passes. The &x, however, fails because it's a reference to a local going out of scope. So, I think we've got this case covered properly. But I don't like the syntax. For one thing, if the return value might refer to parameter p1 but not p2, we need to express that. So, something like: int* foo(return scope int* p1, scope int* p2, return scope int* p3); Now, the caller tags a call to foo() with [arg1,arg3] for its escape analysis, omitting arg2. The fly in this is I'd like to combine it somehow with the const propagation issue of argument type to return type.
Oct 31 2008
"Walter Bright" wroteSteven Schveighoffer wrote:OK, I assumed that was an easy one, but I was establishing how it's hard to solve this and the other example.So how does foo know what scope return means? Consider a min function: scope T min(T)(scope T v1, scope T v2); Now, let's try it out: scope int *foo(scope int *p) { int x; return min(&x, p); } Shouldn't this not compile? If you say 'well, the compiler can take the most conservative approach, and assume scope means the most inner scope.'Assume the compiler keeps track of, for every expression, a list of references that it is composed of. For: min(&x, p) the compiler knows that min() returns some combination of v1 and v2, so the expression min(&x, p) is tagged with the list [&x,p]. Now it examines the return, and sees that [&x,p] is escaping the scope of the function. Although p is a scope parameter, the scope on the foo() return value says that's ok, so that passes. The &x, however, fails because it's a reference to a local going out of scope. So, I think we've got this case covered properly.But I don't like the syntax. For one thing, if the return value might refer to parameter p1 but not p2, we need to express that.This is part of it, you also can have return values through parameters which need to be tracked.So, something like: int* foo(return scope int* p1, scope int* p2, return scope int* p3); Now, the caller tags a call to foo() with [arg1,arg3] for its escape analysis, omitting arg2.This is better functionally, but notice how verbose we are getting. Now add in tagging parameters to depend on other parameters (like for instance in a swap function).The fly in this is I'd like to combine it somehow with the const propagation issue of argument type to return type.Note that the const propogation issue isn't as complex, because you can't change const parameters, so you only have to do return value analysis. -Steve
Nov 01 2008
Steven Schveighoffer wrote:This is better functionally, but notice how verbose we are getting. Now add in tagging parameters to depend on other parameters (like for instance in a swap function).I hadn't thought about a swap function. Hrmmm...
Nov 01 2008
Steven Schveighoffer wrote:"Andrei Alexandrescu" wroteWell obviously. When I said "foo doesn't know" I really meant "the compiler while compiling foo and enjoying its horizon". I'm at a continuous disadvantage because I don't have the time to give full detail in my posts so they come off as unclear.Steven Schveighoffer wrote:No, foo doesn't know anything. The compiler is the one who does not know. The user does, so it looks to the user like the compiler has a bug (which I would argue is correct)."Walter Bright" wroteThe problem is that foo does not know enough about bar to deem the call correct. Notice that if you consider local functions instead of ints, that gets you straight to the delegates case that is the crux of the matter.http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/The question is, in the example: int* bar(int* p) { return p; } int* foo() { int x = 3; return bar(&x); } where is the problem? Is the problem in bar, or is the problem in foo? The real problem is in foo, because there's technically nothing wrong with bar.That won't compile per my explanation: scope in a function parameter is different from scope in a local variable. You can't return a scoped local. So the code won't compile.So how does foo know what scope return means? Consider a min function: scope T min(T)(scope T v1, scope T v2); Now, let's try it out: scope int *foo(scope int *p) { int x; return min(&x, p); }So what ends up happening in your scheme is this: int *bar(int *p) { return p;} void foo(scope int *p) { int *x = bar(p); // no escape, but compile error }Well I was talking to Walter how in compiling a function there are two scopes to track: that of parameters, and that of locals inside functions. The former includes the latter. Bar should be defined as: scope int* bar(scope int* p) { return p; } which is easy to check as long as some conservativeness is acceptable.Shouldn't this not compile? If you say 'well, the compiler can take the most conservative approach, and assume scope means the most inner scope.' OKIn this case it's not even conservative, it's doing the correct thing. To illustrate the effects of a conservative approach, consider: scope int *foo(scope int *p) { int x; auto r = &x; r = p; return r; } This function won't compile if the compiler employs flow-insensitive escape analysis (see e.g. http://en.wikipedia.org/wiki/Data_flow_analysis) because r will be marked as escaping regardless of order of statements. Walter plans to implement path- and flow-insensitive escape analysis. In fact he first wants to define scope to work on credit alone (trust the programmer) and later on implement the analysis. I hope I'll be able to talk him out of that, because it's a precisely backwards approach: he'll first allow vale tudo, and then break code when he implements a limited analysis. I think the right thing is to move from conservative to permissive.scope T[] findFirst(scope T[] buf, scope T target); scope T[] findFirst5(scope T[] buf) { scope t5 = new T(5); return findFirst(buf, t5); }Same thing, won't compile. This is a clear example where conservative doesn't cut it. There is a paper that I don't have the time to look for, that does a pointer analysis and associate with each function a little graph showing how the return value depends on the inputs. (Does anyone know of it?) I don't think we can afford to implement anything near that level of sophistication.The problem is, unless you allow fully specifying dependencies, there is going to be ways people call functions that you didn't think of, which are legitimate designs, and just can't be done with your new regime. The ultimate result is either people cast around the system, don't use the system, or end up allocating unecessarily on the heap to work around it.It all depends on how often those cases are encountered.It would be helpful to provide some examples from Tango or other libraries to substantiate this.I would bet that either A) the proposed system would not provide correct protection, or B) the system would require many design changes in a library like Tango, which uses lots of stack variables and inner functions to achive higher performance. It just doesn't work with actual already successful software programs. It's too limited.So yes, you are protected, but at the cost of not being able to write code that is provably safe. The end result is you end up removing scope where it really should be, and then virally, it makes you remove scope decorations everywhere else. I predict that if this is implemented, nobody will be able to use it.There are many cases in which statically typed programs disallow valid constructs. Each case is a matter of nuance, but by and large types have turned out to be successful.That's exactly what I said. Without access to the bodies you can't tell anything. And I automatically discount access to bodies. Information should be inferred from function signatures.OK, if f4 returns void, and takes no arguments, and none of the code used by any of the functions accesses globals (I admit, I may have said this ambiguously last time), there's not even any heap activity, you tell me how it can escape. Prove me wrong. You have full access to the source as you can see.I'll reiterate my challenge from another thread: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;} struct S { int *v; } int *f2(S* s) { return s.v;} void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; } void f4() { int a=1, b=2, c=3; auto ap = &a; auto bp = &b; auto cp = &c; S s(cp); f(ap, bp, cp); b = *f2(&s); f3(ap, bp, cp); } Please tell me how I should mark f, S, f2, and f3 such that f4 compiles. Note that this code all works in D1 and provably has no problems with escaping values, since f4 returns void, and no globals are accessed.First off, we need to clarify that "provably" is a bit strong, and that the hypothesis "f4 returns void, and no globals are accessed" is not all that's needed for the proof. Full access to function bodies is necessary.Instead of looking on an artificial example, it would be better to work on a corpus of examples extracted from actual code.Correctness may be provable to a human or machine having access to the entire body of code, but not if e.g. f, f2, and f3 are invisible to f4.That is the challenge. If I remove the function bodies, how do you mark the signatures so you can still prove that it's correct?This is the second time my identity is conflated with that of Walter. I do not write his blog entries! Moreover, I also think they are due for an increase in quality of treatment. About two months ago I told Walter that his writing is overly imprecise and glossing over a lot of detail. To me it conveys the feeling that he considers the reader not too smart and not too critical and just willing to accept a lot of statements at face value. If I were tasked with writing on the same subjects, I'd make damn sure I could stand to lose a toe for every incorrect word in there. (Newsgroup standards aren't that high, hallelujah.) He rightly said in his defense that a blog entry can't be an in-depth treatment for the time he can afford to invest in it. My opinion of the whole banana is that if you want to make an authoritative blog that's also short and sweet you must end up pondering each word carefully and provide ample references for the interested. AndreiSecond, I contest the entire example. Static typing is supposed to prove interesting properties about *most* correct programs, at the cost of disallowing a *few* correct programs that are not very frequent, are not very interesting, or can be easily replaced with typesafe alternatives. One can write correct code based on void* that is virtually impossible to typecheck statically, but that doesn't show that static types are useless.My example is at least as interesting as Walter's in the article to show the problem. If you want to complain about simple examples not being useful, then stop using simple examples to show your point. We are not talking about mucking with void *'s or drastic violations of types here, we are talking about reference parameters for returns, or an accessor of a member, or a 3way swap function. These are all common designs. The fact that you cannot find a solution so you resort to just bluntly calling my example 'uninteresting' speaks for itself.
Oct 31 2008
Andrei Alexandrescu wrote: [snip]There is a paper that I don't have the time to look for, that does a pointer analysis and associate with each function a little graph showing how the return value depends on the inputs. (Does anyone know of it?) I don't think we can afford to implement anything near that level of sophistication.Lo and behold, I found the paper. I'm amazed because it's been many years since I read it and I never fully understood it :o). http://suif.stanford.edu/papers/wilson95/paper.html The pointer representation they use form little graphs that represent their aliasing relationship. Andrei
Oct 31 2008
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message news:geglbg$2f8s$1 digitalmars.com...Andrei Alexandrescu wrote: [snip]Another interesting blog on the subject of pointer analysis. This one talks about using a relational query language over Binary Decison Diagrams to make PA scalable and relatively easy to build into static analysis tools: http://blog.robjsoftware.org/2007/10/bddbddb-pointer-alias-analysis-comes-of.html which refers to: http://suif.stanford.edu/papers/pldi04.pdfThere is a paper that I don't have the time to look for, that does a pointer analysis and associate with each function a little graph showing how the return value depends on the inputs. (Does anyone know of it?) I don't think we can afford to implement anything near that level of sophistication.Lo and behold, I found the paper. I'm amazed because it's been many years since I read it and I never fully understood it :o). http://suif.stanford.edu/papers/wilson95/paper.html The pointer representation they use form little graphs that represent their aliasing relationship. Andrei
Nov 01 2008
Andrei Alexandrescu Wrote:In this case it's not even conservative, it's doing the correct thing. To illustrate the effects of a conservative approach, consider: scope int *foo(scope int *p) { int x; auto r = &x; r = p; return r; } This function won't compile if the compiler employs flow-insensitive escape analysis (see e.g. http://en.wikipedia.org/wiki/Data_flow_analysis ) because r will be marked as escaping regardless of order of statements.If compiler is able to mark data as escaping and compare escape marks of variables, isn't it able to derive parameters' marks rather than requesting them from programmers? Thus it can build function's precise signature and reading it from obj file (another long-announced feature) it can decide whether that function is callable from current context, but compiler should also check objs before linking to make sure that their metadata don't conflict.
Nov 01 2008
ore-sama wrote:Andrei Alexandrescu Wrote:Excellent point. The compiler can do that, but it would take a long time and would need access to the entire source code (the dreaded "interprocedural analysis"). This makes modular development difficult. A solution is to have the programmer put information about the function in the function signature. There's a tradeoff to be made so as not to burden the programmer with writing too long signatures. AndreiIn this case it's not even conservative, it's doing the correct thing. To illustrate the effects of a conservative approach, consider: scope int *foo(scope int *p) { int x; auto r = &x; r = p; return r; } This function won't compile if the compiler employs flow-insensitive escape analysis (see e.g. http://en.wikipedia.org/wiki/Data_flow_analysis ) because r will be marked as escaping regardless of order of statements.If compiler is able to mark data as escaping and compare escape marks of variables, isn't it able to derive parameters' marks rather than requesting them from programmers? Thus it can build function's precise signature and reading it from obj file (another long-announced feature) it can decide whether that function is callable from current context, but compiler should also check objs before linking to make sure that their metadata don't conflict.
Nov 01 2008
Andrei Alexandrescu:Excellent point. The compiler can do that, but it would take a long time and would need access to the entire source code (the dreaded "interprocedural analysis"). This makes modular development difficult. A solution is to have the programmer put information about the function in the function signature. There's a tradeoff to be made so as not to burden the programmer with writing too long signatures.I can see a better intermediate solution: let the compiler do interprocedural analysis on the visible/available modules, and let the programmer add manually the required information at the points where the known and unknown modules interface to each other. This minimizes the work of the programmer while giving the compiler all the information it needs. Bye, bearophile
Nov 01 2008
On 2008-10-31 23:51:45 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Tell me how often you encounter a function that swap two variables then. Basically, it's implemented like this: void swap(scope ref int a, scope ref int b) { int tmp = a; a = b; b = tmp; } All fine, no reference is escaping. Now change the parameters for pointers or object references. You could naively implement it like this for pointers: void swap(scope ref int* a, scope ref int* b) { scope int* tmp = a; a = b; b = tmp; } But if swap compiles like this, it's dangerous because the scope for *a may be different from the scope for *b and you may leak a pointer to a narrower scope in a broader one. For instance: // broader scope (global) int a = 1; int* pa = &a; void test() { // narrower scope (local) int b = 2; int* pb = &b; swap(pa, pb); // should be an error } Here, swap(pa, pb), if allowed, would attempt to put a pointer to the local variable b into the global variable pa. After the function returns, pointer pa becomes invalid. To avoid this, you need the signature for swap to tell you a more complext scope constrain. Using Robert Jacques's suggested notation, I think it should be like: void swap(scope ref int* a, scope ref int* b) if ((*a).scope <= b.scope && (*b).scope <= a.scope) { scope int* tmp = a; a = b; b = tmp; } Basically, if you're giving b a pointer to the value pointed by a, the value pointed by a needs to be alive as long or longer than b if you want to avoid having b point to invalid memory. Same for a and the value pointed by b. Robert's notation is good for expressing that in the function signature, but I don't like it because it places constrains on the signature, not on the type, making constrains propagation a lot more complicated. So I've been thinking of this notation: void swap(scopeof(b*) ref int* a, scopeof(*a) ref int* b) { scopeof(b) int* tmp = a; a = b; b = tmp; } Note that tmp here needs to be scopeof(a): this guarenties that whatever you put in tmp, you can copy in b later (since scopeof(a) == scopeof(*b)). Oh, and if you don't about the type of tmp, including the scoping constrains, you should be allowed to substitute it all with "auto": void swap(scopeof(b*) ref int* a, scopeof(*a) ref int* b) { auto tmp = a; a = b; b = tmp; } - - - That's enough for the swap function. Now lets take a look at a plain setter function, which, I'm pretty sure, is a common pattern. Let's try this first: struct S { int *pi; } void setPI(scope S s, scope int* pi) { s.pi = pi; } Simple enough you think? But we have the same problem: the scope of the value pointed by pi may be narrower than the one s is in: S s1; // global scope int i1; void test() { int i2; S s2; setPI(s1, &i1); // ok, i1 same scope as s1.pi. setPI(s2, &i1); // ok, i1 is of a broader scope than s2.pi. setPI(s1, &i2); // should be an error, i2 scope narrower than s1.pi. setPI(s2, &i2); // ok, i2 is same scope as s2.pi. } To enforce correctly the constrains while still allowing the usage of s2 above, where the second argument given to the setter may be a local variable, here's what we need: struct S { scope int *pi; } void setPI(scope S s, scopeof(s.pi) int* pi) { s.pi = pi; } "scope int *pi" in the struct means you can't take the value pointed to by pi and keep it beyond S's scope. "scopeof(s.pi) int* pi" in the function signature means that the scope of the value pointed to by the second argument is at least as broad as the given scope for s.pi. -- Michel Fortin michel.fortin michelf.com http://michelf.com/The problem is, unless you allow fully specifying dependencies, there is going to be ways people call functions that you didn't think of, which are legitimate designs, and just can't be done with your new regime. The ultimate result is either people cast around the system, don't use the system, or end up allocating unecessarily on the heap to work around it.It all depends on how often those cases are encountered.
Nov 01 2008
Michel Fortin wrote:On 2008-10-31 23:51:45 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:"scope" is not needed inside the function (in both cases). The compiler will infer it.Tell me how often you encounter a function that swap two variables then. Basically, it's implemented like this: void swap(scope ref int a, scope ref int b) { int tmp = a; a = b; b = tmp; } All fine, no reference is escaping. Now change the parameters for pointers or object references. You could naively implement it like this for pointers: void swap(scope ref int* a, scope ref int* b) { scope int* tmp = a; a = b; b = tmp; }The problem is, unless you allow fully specifying dependencies, there is going to be ways people call functions that you didn't think of, which are legitimate designs, and just can't be done with your new regime. The ultimate result is either people cast around the system, don't use the system, or end up allocating unecessarily on the heap to work around it.It all depends on how often those cases are encountered.But if swap compiles like this, it's dangerous because the scope for *a may be different from the scope for *b and you may leak a pointer to a narrower scope in a broader one.Aha! Good point. But notice, however, that you are trying to extract more juice from "scope" than there is in it. What scope says in swap is only that swap itself does not escape any of its parameters. It doesn't tell that the parameters have actually the same scope.For instance: // broader scope (global) int a = 1; int* pa = &a; void test() { // narrower scope (local) int b = 2; int* pb = &b; swap(pa, pb); // should be an error } Here, swap(pa, pb), if allowed, would attempt to put a pointer to the local variable b into the global variable pa. After the function returns, pointer pa becomes invalid. To avoid this, you need the signature for swap to tell you a more complext scope constrain. Using Robert Jacques's suggested notation, I think it should be like: void swap(scope ref int* a, scope ref int* b) if ((*a).scope <= b.scope && (*b).scope <= a.scope) { scope int* tmp = a; a = b; b = tmp; }This would require each variable to carry its actual scope either at compilation or at runtime. Either would be very hard to implement if at all. But I'm glad there is a bit of exaggeration on the side of safety :o).Basically, if you're giving b a pointer to the value pointed by a, the value pointed by a needs to be alive as long or longer than b if you want to avoid having b point to invalid memory. Same for a and the value pointed by b. Robert's notation is good for expressing that in the function signature, but I don't like it because it places constrains on the signature, not on the type, making constrains propagation a lot more complicated. So I've been thinking of this notation: void swap(scopeof(b*) ref int* a, scopeof(*a) ref int* b) { scopeof(b) int* tmp = a; a = b; b = tmp; }That's nice, but it would reduce what swap can do. Oftentimes it is not known whether values manipulated by a function are in the same scope or not.Note that tmp here needs to be scopeof(a): this guarenties that whatever you put in tmp, you can copy in b later (since scopeof(a) == scopeof(*b)). Oh, and if you don't about the type of tmp, including the scoping constrains, you should be allowed to substitute it all with "auto": void swap(scopeof(b*) ref int* a, scopeof(*a) ref int* b) { auto tmp = a; a = b; b = tmp; } - - - That's enough for the swap function. Now lets take a look at a plain setter function, which, I'm pretty sure, is a common pattern. Let's try this first: struct S { int *pi; } void setPI(scope S s, scope int* pi) { s.pi = pi; } Simple enough you think? But we have the same problem: the scope of the value pointed by pi may be narrower than the one s is in:I am glad there is awareness of the limitations of scope. There are still things that are outside scope's charter, and the summary is - scope tells something about the function declaring it, not about a relationship between its parameters. Andrei
Nov 01 2008
On 2008-11-01 12:02:47 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Michel Fortin wrote:Indeed.Basically, it's implemented like this: void swap(scope ref int a, scope ref int b) { int tmp = a; a = b; b = tmp; } All fine, no reference is escaping. Now change the parameters for pointers or object references. You could naively implement it like this for pointers: void swap(scope ref int* a, scope ref int* b) { scope int* tmp = a; a = b; b = tmp; }"scope" is not needed inside the function (in both cases). The compiler will infer it.But my point was that the "no escape" guarenty doesn't hold if the two arguments' scope doesn't match. The parameter may escape via the other parameter.But if swap compiles like this, it's dangerous because the scope for *a may be different from the scope for *b and you may leak a pointer to a narrower scope in a broader one.Aha! Good point. But notice, however, that you are trying to extract more juice from "scope" than there is in it. What scope says in swap is only that swap itself does not escape any of its parameters. It doesn't tell that the parameters have actually the same scope.I wouldn't call it exageration. Adding complexity to function signatures only to be half-safe isn't very appealing to me.For instance: // broader scope (global) int a = 1; int* pa = &a; void test() { // narrower scope (local) int b = 2; int* pb = &b; swap(pa, pb); // should be an error } Here, swap(pa, pb), if allowed, would attempt to put a pointer to the local variable b into the global variable pa. After the function returns, pointer pa becomes invalid. To avoid this, you need the signature for swap to tell you a more complext scope constrain. Using Robert Jacques's suggested notation, I think it should be like: void swap(scope ref int* a, scope ref int* b) if ((*a).scope <= b.scope && (*b).scope <= a.scope) { scope int* tmp = a; a = b; b = tmp; }This would require each variable to carry its actual scope either at compilation or at runtime. Either would be very hard to implement if at all. But I'm glad there is a bit of exaggeration on the side of safety :o).The caller function should know, and from there it could be possible to check against the constrain. About reducing what swap can do, I'd like to seeBasically, if you're giving b a pointer to the value pointed by a, the value pointed by a needs to be alive as long or longer than b if you want to avoid having b point to invalid memory. Same for a and the value pointed by b. Robert's notation is good for expressing that in the function signature, but I don't like it because it places constrains on the signature, not on the type, making constrains propagation a lot more complicated. So I've been thinking of this notation: void swap(scopeof(b*) ref int* a, scopeof(*a) ref int* b) { scopeof(b) int* tmp = a; a = b; b = tmp; }That's nice, but it would reduce what swap can do. Oftentimes it is not known whether values manipulated by a function are in the same scope or not.So basically, if you have two scope arguments and you assign some part of the first to the other, you risk bypassing the scope-checking system despite the appearances to the contrary. I'm not sure it's a good idea to have a scope-checking system if you can only trust in half. -- Michel Fortin michel.fortin michelf.com http://michelf.com/That's enough for the swap function. Now lets take a look at a plain setter function, which, I'm pretty sure, is a common pattern. Let's try this first: struct S { int *pi; } void setPI(scope S s, scope int* pi) { s.pi = pi; } Simple enough you think? But we have the same problem: the scope of the value pointed by pi may be narrower than the one s is in:I am glad there is awareness of the limitations of scope. There are still things that are outside scope's charter, and the summary is - scope tells something about the function declaring it, not about a relationship between its parameters.
Nov 01 2008
Michel Fortin wrote:On 2008-11-01 12:02:47 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:I think you're putting forth a compelling argument. Thank you! I will try to get a hold of Walter today and discuss the matter. Off the top of my head, there are the following ways to go: 1. Cop-out: do whatever is the syntactic minimum for allowing closure allocation when requested. Don't check anything, trust the coder. 2. Accept: implement scope with an understanding of its issues and limitations. Your argument weakens the motivation to pursue this approach. 3. Dig further: look into how scope can be made indeed good. My current understanding is that a partial ordering of scopes would be interesting to think about. On a side note, I'm very glad to see Walter having taken this bull by the horns. This is entirely his initiative - and if it turns sour, it's all his fault! :o) - initiative tackling a difficult problem that revealed an embarrassing corner of the language. AndreiMichel Fortin wrote:Indeed.Basically, it's implemented like this: void swap(scope ref int a, scope ref int b) { int tmp = a; a = b; b = tmp; } All fine, no reference is escaping. Now change the parameters for pointers or object references. You could naively implement it like this for pointers: void swap(scope ref int* a, scope ref int* b) { scope int* tmp = a; a = b; b = tmp; }"scope" is not needed inside the function (in both cases). The compiler will infer it.But my point was that the "no escape" guarenty doesn't hold if the two arguments' scope doesn't match. The parameter may escape via the other parameter.But if swap compiles like this, it's dangerous because the scope for *a may be different from the scope for *b and you may leak a pointer to a narrower scope in a broader one.Aha! Good point. But notice, however, that you are trying to extract more juice from "scope" than there is in it. What scope says in swap is only that swap itself does not escape any of its parameters. It doesn't tell that the parameters have actually the same scope.I wouldn't call it exageration. Adding complexity to function signatures only to be half-safe isn't very appealing to me.For instance: // broader scope (global) int a = 1; int* pa = &a; void test() { // narrower scope (local) int b = 2; int* pb = &b; swap(pa, pb); // should be an error } Here, swap(pa, pb), if allowed, would attempt to put a pointer to the local variable b into the global variable pa. After the function returns, pointer pa becomes invalid. To avoid this, you need the signature for swap to tell you a more complext scope constrain. Using Robert Jacques's suggested notation, I think it should be like: void swap(scope ref int* a, scope ref int* b) if ((*a).scope <= b.scope && (*b).scope <= a.scope) { scope int* tmp = a; a = b; b = tmp; }This would require each variable to carry its actual scope either at compilation or at runtime. Either would be very hard to implement if at all. But I'm glad there is a bit of exaggeration on the side of safety :o).The caller function should know, and from there it could be possible to check against the constrain. About reducing what swap can do, I'd like to seeBasically, if you're giving b a pointer to the value pointed by a, the value pointed by a needs to be alive as long or longer than b if you want to avoid having b point to invalid memory. Same for a and the value pointed by b. Robert's notation is good for expressing that in the function signature, but I don't like it because it places constrains on the signature, not on the type, making constrains propagation a lot more complicated. So I've been thinking of this notation: void swap(scopeof(b*) ref int* a, scopeof(*a) ref int* b) { scopeof(b) int* tmp = a; a = b; b = tmp; }That's nice, but it would reduce what swap can do. Oftentimes it is not known whether values manipulated by a function are in the same scope or not.So basically, if you have two scope arguments and you assign some part of the first to the other, you risk bypassing the scope-checking system despite the appearances to the contrary. I'm not sure it's a good idea to have a scope-checking system if you can only trust in half.That's enough for the swap function. Now lets take a look at a plain setter function, which, I'm pretty sure, is a common pattern. Let's try this first: struct S { int *pi; } void setPI(scope S s, scope int* pi) { s.pi = pi; } Simple enough you think? But we have the same problem: the scope of the value pointed by pi may be narrower than the one s is in:I am glad there is awareness of the limitations of scope. There are still things that are outside scope's charter, and the summary is - scope tells something about the function declaring it, not about a relationship between its parameters.
Nov 01 2008
"Andrei Alexandrescu" wroteWalter plans to implement path- and flow-insensitive escape analysis.This is good! It is what I was hoping for as a fix for trivial examples.In fact he first wants to define scope to work on credit alone (trust the programmer) and later on implement the analysis. I hope I'll be able to talk him out of that, because it's a precisely backwards approach: he'll first allow vale tudo, and then break code when he implements a limited analysis. I think the right thing is to move from conservative to permissive.I agree that implementing it on credit alone is going to open up problems later on.OK, so then your proposed solution won't work for valid code like this. Will there be a way to force this to compile?scope T[] findFirst(scope T[] buf, scope T target); scope T[] findFirst5(scope T[] buf) { scope t5 = new T(5); return findFirst(buf, t5); }Same thing, won't compile. This is a clear example where conservative doesn't cut it.There is a paper that I don't have the time to look for, that does a pointer analysis and associate with each function a little graph showing how the return value depends on the inputs. (Does anyone know of it?) I don't think we can afford to implement anything near that level of sophistication.I think something like this isn't possible, especially using interfaces or function pointers, but might be possible if graph dependencies are resolved at link-time. And even then, it might reject some valid code.It only takes one common function to ruin the whole thing. Like I said before, it is viral. If one lower level function can't be scope, then nothing above it can be scope. Unless there is a way to cast away the scope property, in which case you have compromised the system.The problem is, unless you allow fully specifying dependencies, there is going to be ways people call functions that you didn't think of, which are legitimate designs, and just can't be done with your new regime. The ultimate result is either people cast around the system, don't use the system, or end up allocating unecessarily on the heap to work around it.It all depends on how often those cases are encountered.I will find some examples. Just not today, I've got some fall cleanup to do ;)It would be helpful to provide some examples from Tango or other libraries to substantiate this.I would bet that either A) the proposed system would not provide correct protection, or B) the system would require many design changes in a library like Tango, which uses lots of stack variables and inner functions to achive higher performance. It just doesn't work with actual already successful software programs. It's too limited.So yes, you are protected, but at the cost of not being able to write code that is provably safe. The end result is you end up removing scope where it really should be, and then virally, it makes you remove scope decorations everywhere else. I predict that if this is implemented, nobody will be able to use it.There are many cases in which statically typed programs disallow valid constructs. Each case is a matter of nuance, but by and large types have turned out to be successful.Sure, which is why I posted the challenge. You may have misunderstood what I meant. I meant, look at this group of functions. Having access to the source code, you can prove that it is valid, and no escapes occur. Now remove the function bodies for everything but f4, and show me how your solution marks up the signatures such that f4 compiles. I think with being able to specify the scope return value, it's probably a lot easier than without that.That's exactly what I said. Without access to the bodies you can't tell anything. And I automatically discount access to bodies. Information should be inferred from function signatures.OK, if f4 returns void, and takes no arguments, and none of the code used by any of the functions accesses globals (I admit, I may have said this ambiguously last time), there's not even any heap activity, you tell me how it can escape. Prove me wrong. You have full access to the source as you can see.I'll reiterate my challenge from another thread: void f(ref int *a, int *b, int *c) { if(*b < *c) a = b; else a = c;} struct S { int *v; } int *f2(S* s) { return s.v;} void f3(ref int *a, ref int *b, ref int *c) { int *tmp = a; a = b; b = c; c = tmp; } void f4() { int a=1, b=2, c=3; auto ap = &a; auto bp = &b; auto cp = &c; S s(cp); f(ap, bp, cp); b = *f2(&s); f3(ap, bp, cp); } Please tell me how I should mark f, S, f2, and f3 such that f4 compiles. Note that this code all works in D1 and provably has no problems with escaping values, since f4 returns void, and no globals are accessed.First off, we need to clarify that "provably" is a bit strong, and that the hypothesis "f4 returns void, and no globals are accessed" is not all that's needed for the proof. Full access to function bodies is necessary.OK, I'll find some (per your request above, from Tango).Instead of looking on an artificial example, it would be better to work on a corpus of examples extracted from actual code.Correctness may be provable to a human or machine having access to the entire body of code, but not if e.g. f, f2, and f3 are invisible to f4.That is the challenge. If I remove the function bodies, how do you mark the signatures so you can still prove that it's correct?Very true, but contrived simple examples are used almost exclusively to illustrate a point when discussing merits and drawbacks of theoretical new features of the compiler. You can't just discount it because I made it up; it is a valid piece of code, provably not dangerous, and has elements that are common in programming design. Your rebuttal rings hollow because you have sidesteped the main argument in the interest of essentially name calling on my code. I'll find some real examples that are like this in Tango. If nothing else, to remove this argument about unimportance. -SteveThis is the second time my identity is conflated with that of Walter. I do not write his blog entries! Moreover, I also think they are due for an increase in quality of treatment. About two months ago I told Walter that his writing is overly imprecise and glossing over a lot of detail. To me it conveys the feeling that he considers the reader not too smart and not too critical and just willing to accept a lot of statements at face value. If I were tasked with writing on the same subjects, I'd make damn sure I could stand to lose a toe for every incorrect word in there. (Newsgroup standards aren't that high, hallelujah.) He rightly said in his defense that a blog entry can't be an in-depth treatment for the time he can afford to invest in it. My opinion of the whole banana is that if you want to make an authoritative blog that's also short and sweet you must end up pondering each word carefully and provide ample references for the interested.Second, I contest the entire example. Static typing is supposed to prove interesting properties about *most* correct programs, at the cost of disallowing a *few* correct programs that are not very frequent, are not very interesting, or can be easily replaced with typesafe alternatives. One can write correct code based on void* that is virtually impossible to typecheck statically, but that doesn't show that static types are useless.My example is at least as interesting as Walter's in the article to show the problem. If you want to complain about simple examples not being useful, then stop using simple examples to show your point. We are not talking about mucking with void *'s or drastic violations of types here, we are talking about reference parameters for returns, or an accessor of a member, or a 3way swap function. These are all common designs. The fact that you cannot find a solution so you resort to just bluntly calling my example 'uninteresting' speaks for itself.
Nov 01 2008
Steven Schveighoffer wrote:"Andrei Alexandrescu" wroteThat's an excellent point. Casts won't work because scope is not really a type constructor. I think we need to explore more Walter's idea in a post of his: to be able to attach the scope of the returned value to a specific parameter. AndreiOK, so then your proposed solution won't work for valid code like this. Will there be a way to force this to compile?scope T[] findFirst(scope T[] buf, scope T target); scope T[] findFirst5(scope T[] buf) { scope t5 = new T(5); return findFirst(buf, t5); }Same thing, won't compile. This is a clear example where conservative doesn't cut it.
Nov 01 2008
Steven Schveighoffer wrote:In order for this to be useful, we need to have a way to express total relationships between parameters and the return value. And that would probably be so convoluted that it's not worth the syntax.That would certainly address that problem.
Oct 31 2008
Walter Bright wrote:http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/Oh, I encourage people who find the discussion even marginally interesting to upvote the article. The reddit militia is at it again - Walter's articles get regularly down voted by a few people, even within so short time it would be impossible for the voters to have read them. Andrei
Oct 31 2008
Not to be rude, but the articles usually aren't very good. This article gives a basic example of the problem space, and mentions the scope keyword to fix it, but that's about it. It doesn't show the code with the fix. It doesn't discuss transitivity, multiple types of scope, or any other concerns discussed on this list. If you double the length of the article by discussing these deeper issues, the article would be way better. Since D articles already have a reputation to be like this one, I'd avoid useless mentions of D, at least at the beginning of the article. Andrei Alexandrescu Wrote:Walter Bright wrote:http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/Oh, I encourage people who find the discussion even marginally interesting to upvote the article. The reddit militia is at it again - Walter's articles get regularly down voted by a few people, even within so short time it would be impossible for the voters to have read them. Andrei
Oct 31 2008
Jason House wrote:Not to be rude, but the articles usually aren't very good. This article gives a basic example of the problem space, and mentions the scope keyword to fix it, but that's about it. It doesn't show the code with the fix. It doesn't discuss transitivity, multiple types of scope, or any other concerns discussed on this list. If you double the length of the article by discussing these deeper issues, the article would be way better. Since D articles already have a reputation to be like this one, I'd avoid useless mentions of D, at least at the beginning of the article.I agree. Andrei
Oct 31 2008
Jason House wrote:Not to be rude, but the articles usually aren't very good. This article gives a basic example of the problem space, and mentions the scope keyword to fix it, but that's about it. It doesn't show the code with the fix. It doesn't discuss transitivity, multiple types of scope, or any other concerns discussed on this list. If you double the length of the article by discussing these deeper issues, the article would be way better. Since D articles already have a reputation to be like this one, I'd avoid useless mentions of D, at least at the beginning of the article.Point taken, but on the other hand, I've found if the articles are long ones they don't get read. Blogs aren't intended to be an in-depth treatment.
Oct 31 2008
Walter Bright Wrote:Jason House wrote:I was responding to the frequent requests to vote for D articles. I definitely agree there's a sweet spot for article length. I wasn't trying to insult or change what is written. I only wanted to point out how to get my vote, and possibly the vote of others.Not to be rude, but the articles usually aren't very good. This article gives a basic example of the problem space, and mentions the scope keyword to fix it, but that's about it. It doesn't show the code with the fix. It doesn't discuss transitivity, multiple types of scope, or any other concerns discussed on this list. If you double the length of the article by discussing these deeper issues, the article would be way better. Since D articles already have a reputation to be like this one, I'd avoid useless mentions of D, at least at the beginning of the article.Point taken, but on the other hand, I've found if the articles are long ones they don't get read. Blogs aren't intended to be an in-depth treatment.
Oct 31 2008
Jason House wrote:I was responding to the frequent requests to vote for D articles. I definitely agree there's a sweet spot for article length. I wasn't trying to insult or change what is written. I only wanted to point out how to get my vote, and possibly the vote of others.That's fair, and I appreciate the constructive criticism. Would you mind if I passed the next one by you for comments before I post it? There is another hamster lurking in the closet, however. I try to do two of them a month, and doing two in-depth articles a month is quite a bit of work. I guess I don't know just where that sweet spot is.
Oct 31 2008
Walter Bright Wrote:Jason House wrote:I don't mind. It could even be fun :)I was responding to the frequent requests to vote for D articles. I definitely agree there's a sweet spot for article length. I wasn't trying to insult or change what is written. I only wanted to point out how to get my vote, and possibly the vote of others.That's fair, and I appreciate the constructive criticism. Would you mind if I passed the next one by you for comments before I post it?There is another hamster lurking in the closet, however. I try to do two of them a month, and doing two in-depth articles a month is quite a bit of work. I guess I don't know just where that sweet spot is.I don't know where the sweet spot is either :) I won't be offended if you reject my tips about article additions. It's much easier to be a critic than an author.
Nov 01 2008
Walter Bright:http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/I think for this complex stuff Lambda the Ultimate is much fitter than Reddit :-) Bye, bearophile
Oct 31 2008
bearophile wrote:Walter Bright:It's a bit off-topic, but Jaron Lanier wrote an interesting diatribe recently on metafilters such as reddit. Definitely worth a read if you like that sort of thing: http://edge.org/3rd_culture/lanier06/lanier06_index.html Seanhttp://www.reddit.com/r/programming/comments/7akzd/escape_analysis/I think for this complex stuff Lambda the Ultimate is much fitter than Reddit :-)
Oct 31 2008
Walter Bright Wrote:http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/how to mark and verify this? void foo() { void goo() { DoSomeJob(); } JobHandle job=ThreadPool.QueueJob(&goo); //goo goes to the global state ... job.WaitForCompletion(); //but it's safe }
Oct 31 2008
"ore-sama" wroteWalter Bright Wrote:I think your example would be impossible for the compiler to prove safe, even with all the code available, and all dependency graphs available. This might be a case where you have to force the compiler to accept it. Walter, if you do implement this feature, will there at least be a way to force perceived incorrect code to compile? Similar to casting away const when you know what you are doing. -Stevehttp://www.reddit.com/r/programming/comments/7akzd/escape_analysis/how to mark and verify this? void foo() { void goo() { DoSomeJob(); } JobHandle job=ThreadPool.QueueJob(&goo); //goo goes to the global state ... job.WaitForCompletion(); //but it's safe }
Nov 01 2008
On 2008-11-01 11:23:17 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:"ore-sama" wroteJust create a scope ThreadPool object, accepting a scopeof(this) function pointer and make sure the destructor waits for completion of the threads. Hum, and "goo" in this example does not use any local variable of foo, so there wouldn't be a need to allocate anything anyway. ;-) That said, I think casting scope away should be allowed, but undefined behaviour (just as casting away immutable). -- Michel Fortin michel.fortin michelf.com http://michelf.com/how to mark and verify this? void foo() { void goo() { DoSomeJob(); } JobHandle job=ThreadPool.QueueJob(&goo); //goo goes to the global state ... job.WaitForCompletion(); //but it's safe }I think your example would be impossible for the compiler to prove safe, even with all the code available, and all dependency graphs available. This might be a case where you have to force the compiler to accept it.
Nov 01 2008
Steven Schveighoffer wrote:Walter, if you do implement this feature, will there at least be a way to force perceived incorrect code to compile? Similar to casting away const when you know what you are doing.Yes.
Nov 01 2008