digitalmars.D - Scope of variables

reply bearophile <bearophileHUGS lycos.com> writes:
This post is a follow-up to something I wrote almost a year ago, which I
have summarized here:
http://d.puremagic.com/issues/show_bug.cgi?id=5007

Feel free to ignore this post if you are busy, or if you didn't appreciate the
previous post.


In my C/D programs some common troubles (or even bugs) are caused by the
scoping of variable/constant names: (1) wrongly thinking I am using a local
name while I am using a name from an outer scope, or (2) using an outer scope
name while I wrongly think I'm using a local name.

Outer scopes are struct/class attributes, global variables, or
variables/constants defined in the outer function when you are working in an
inner function.

Given the frequency of troubles such mistakes cause me, I don't understand why
newer languages don't try to improve this situation with stricter visibility
rules. (In D the visibility situation is more complex than in C because there
are structs/classes and there are inner functions too, so I think just copying
the C rules isn't enough. There are even inner classes, although I have used
them only once in D, to port some Java code. I think they are useful mostly for
porting Java code to D).

In Python3 the default is the opposite one, and it seems safer (this
explanation about Python is not the whole story, but it's a good enough
approximation for this discussion): in a function, if you want to use a variable
name that's not defined locally you have to declare it with "nonlocal" (since
Python2 there is also a "global" keyword for similar usage, but it's not good
enough if you want to use inner functions):


def foo():
    x = 10
    def bar():
        nonlocal x
        x += 1
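
For comparison, a minimal sketch of the same shape in D (the names foo, bar and x simply mirror the Python snippet). Here the inner function reaches the outer x with no annotation at all, which is the behaviour the rest of this post worries about:

```d
int foo()
{
    int x = 10;

    void bar()
    {
        x += 1; // refers to foo's x; D asks for no "nonlocal"-style marker
    }

    bar();
    return x;
}

void main()
{
    assert(foo() == 11); // the outer x was modified by bar
}
```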


Some examples of hiding outer names in D (this shows just the first half of
the problem):


int i;
void foo(int i) {} // hiding global variable
struct Bar {
    int j, w;
    static int k;
    void spam(int j) { // hiding instance variable
        int w; // hiding instance variable
    }
    void baz(int k) { // hiding static variable
        void inner(int k) {} // hiding outer variable
    }
    static void red() {
        int k; // hiding static variable
    }
}
void main() {}
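
To make the consequence concrete, here is a small runnable sketch (the Counter struct and its method names are hypothetical, chosen only for illustration). The hiding local is accepted silently, so the update never reaches the instance variable:

```d
struct Counter
{
    int count;

    void add(int n)
    {
        int count;   // hides the instance variable; accepted without complaint
        count += n;  // updates only the local, which is then discarded
    }

    void addQualified(int n)
    {
        this.count += n; // explicit qualification reaches the member
    }
}

void main()
{
    Counter c;
    c.add(5);
    assert(c.count == 0); // the member was never touched
    c.addQualified(5);
    assert(c.count == 5);
}
```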


You can't adopt the Python solution in D. But doing the opposite is possible:
the idea is to disallow hiding names that are present in outer scopes (so that
hiding generates an error) unless you mark the variable with something, like
"local":



int i;
void foo(local int i) {}
struct Bar {
    int j, w;
    static int k;
    void spam(local int j) {
        local int w;
    }
    void baz(local int k) {
        void inner(local int k) {}
    }
    static void red() {
        local int k;
    }
}
void main() {}


Some time ago I even thought about an @outer attribute, which is optional and
is more similar to the Python3 "nonlocal". If you add @outer to a function
signature, you have to list all the names from outer scopes that you will use
inside the current function (and if you don't use them all it's an error too).
In SPARK there is something similar, but more verbose, and it's not optional (I
don't remember if the Splint lint has something similar):



int i = 1;
int j = 2;
@outer(out i, in j) void foo(int x) {
  i = x + j;
}
struct Foo {
  int x;
  @outer(in x, inout j) void bar() {
    j += x;
  }
}
void main() {}
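
As a rough approximation that works in current D (an assumption of mine about what comes closest, not the proposed outer(...) annotation itself), marking a function pure already makes the compiler reject any access to mutable module-level variables, so every outer value has to arrive through the signature. The parameter name jValue is hypothetical:

```d
int i = 1;
int j = 2;

// A pure function may not read or write mutable module-level variables,
// so everything it depends on must be passed in explicitly.
int foo(int x, int jValue) pure
{
    // return x + j; // rejected by the compiler: j is mutable global state
    return x + jValue;
}

void main()
{
    i = foo(3, j); // the caller decides which outer names are involved
    assert(i == 5);
}
```

Unlike the proposal, this forbids outer mutable names entirely instead of listing the permitted ones, so it only covers part of the idea.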


"local" is lighter to use, in syntax too, while @outer() is heavier but more
precise (but to make it work I think "local" can't be optional). Of the two
ideas I like @outer() better. @outer is meant to avoid bugs, but I think it's
usable in debugging too: when you find a bug that you suspect is scope-related,
you add some @outer() to make the code stricter and make the bug stand out.

In theory, with some more static introspection (like a __traits that, given a
function/method name, returns an array of the variables used inside it (or used
by functions called by it), and how they are used: read, written or both) it's
possible to add a user-defined attribute like @outer with user code.


In bug 5007 Nick Sabalausky has shown a simpler idea:


int globalVar;
class Foo {
    int instanceVar;
    static int classVar;

    @explicitLookup // Name subject to change
    void bar() {

        int globalVar;   // Error
        int instanceVar; // Error
        int classVar;    // Error

        globalVar   = 1; // Error
        instanceVar = 1; // Error
        classVar    = 1; // Error

        .globalVar       = 1; // Ok
        this.instanceVar = 1; // Ok
        Foo.classVar     = 1; // Ok
    }
}
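
The three "Ok" forms in that example are already valid D today; only the enforcement is new. A minimal runnable sketch of the explicit lookups, reusing the names from the example above and leaving out the proposed annotation:

```d
int globalVar;

class Foo
{
    int instanceVar;
    static int classVar;

    void bar()
    {
        int globalVar = 10;   // a local that hides the module-level variable

        .globalVar = 1;       // leading dot: look up at module scope
        this.instanceVar = 2; // explicit instance member
        Foo.classVar = 3;     // explicit static member

        assert(globalVar == 10); // the local is untouched
    }
}

void main()
{
    auto f = new Foo;
    f.bar();
    assert(.globalVar == 1);
    assert(f.instanceVar == 2);
    assert(Foo.classVar == 3);
}
```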

Bye,
bearophile
Jun 24 2011
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
    void baz(local int k) {
        void inner(local int k) {}
    }
This is unneeded. By declaring a local variable, it's *obviously* local - there's no point in saying local again! Now, I've forgotten that I had a variable declared before, but adding more stuff to the argument list wouldn't change anything, because if I was looking at the argument list, I would have realized the local variable was there anyway! If D was a bad language that allowed implicit variable definitions, it might make sense to do this, but we already declare all vars so it adds nothing.
 void foo(int i) {} // hiding global variable
This is a good thing because you can reason about it locally. If you declare a variable locally, you know it is going to work. What I've taken to doing is if I definitely want to access a class var, I'll just write this and use the dot to get to a global. That way, it's always clear what's going on without looking back at the function definition. I don't want that to be required though.
Jun 24 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Adam D. Ruppe:

    void baz(local int k) {
        void inner(local int k) {}
    }
This is unneeded. By declaring a local variable, it's *obviously* local - there's no point in saying local again!
 If D was a bad language that allowed implicit variable definitions,
 it might make sense to do this, but we already declare all vars so
 it adds nothing.
You are right, but the purpose of that "local" annotation is a bit different: if you don't use the annotation, you define a local k, and you already have a variable named k in an outer namespace, the compiler is supposed to generate an error. So "local" is a way to tell the compiler that you know there is an outer name "k" and you want to hide it. Maybe "local" is not the best possible name for this annotation; "hider" seems clearer :-)
 What I've taken to doing is if I definitely want to access a
 class var, I'll just write this and use the dot to get to a global.
 
 That way, it's always clear what's going on without looking
 back at the function definition.
 
 I don't want that to be required though.
The purpose of the _optional_ @explicitLookup annotation by Nick Sabalausky is exactly that: to ask the compiler to enforce what you already do.

Bye,
bearophile
Jun 24 2011
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-24 17:39, Adam D. Ruppe wrote:
    void baz(local int k) {
        void inner(local int k) {}
    }
This is unneeded. By declaring a local variable, it's *obviously* local - there's no point in saying local again! Now, I've forgotten that I had a variable declared before, but adding more stuff to the argument list wouldn't change anything, because if I was looking at the argument list, I would have realized the local variable was there anyway! If D was a bad language that allowed implicit variable definitions, it might make sense to do this, but we already declare all vars so it adds nothing.
 void foo(int i) {} // hiding global variable
This is a good thing because you can reason about it locally. If you declare a variable locally, you know it is going to work. What I've taken to doing is if I definitely want to access a class var, I'll just write this and use the dot to get to a global. That way, it's always clear what's going on without looking back at the function definition.
That's why a lot of people typically use a different naming scheme for member variables (e.g. prepending them with _ or m_). As long as you're smart about naming, it's really not a problem.
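
For illustration, a minimal sketch of that convention in D (the Temperature struct and its names are made up for this example, not taken from any library):

```d
struct Temperature
{
    private double _celsius; // leading underscore marks the member

    this(double celsius)
    {
        _celsius = celsius; // parameter and member cannot be confused
    }

    double celsius() const
    {
        return _celsius;
    }
}

void main()
{
    auto t = Temperature(21.5);
    assert(t.celsius == 21.5);
}
```

Nothing here is checked by the compiler; the prefix is purely a naming habit.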
 I don't want that to be required though.
Yeah. That would not be fun. As it is, there are a variety of ways that you can go about differentiating between globals, locals, and member variables. There's no need for the compiler or language to get into it any further than they already do. Being too strict about stuff like that would get very annoying.

- Jonathan M Davis
Jun 24 2011
next sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 25/06/11 2:42 AM, Jonathan M Davis wrote:
 That's why a lot of people typically use a different naming scheme for member
 variables (e.g. prepending them with _ or m_). As long as you're smart about
 naming, it's really not a problem.

 I don't want that to be required though.
Yeah. That would not be fun. As it is, there are a variety of ways that you can go about differentiating between globals, locals, and member variables. There's no need for the compiler or language to get into it any further than they already do. Being too strict about stuff like that would get very annoying. - Jonathan M Davis
Agree with this 100%. This is a solved problem. No need for throwing more and more language features at it.
Jun 25 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Peter Alexander:

 This is a solved problem. No need for throwing more and more language 
 features at it.
I agree, from what I am seeing in Haskell it's a mostly solved problem.

Bye,
bearophile
Jun 25 2011
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

That's why a lot of people typically use a different naming scheme for member
variables (e.g. prepending them with _ or m_). As long as you're smart about
naming, it's really not a problem.<
I too use the leading underscore. The problem with name tags like the leading underscore is the lack of enforcement from the compiler (there is only partial enforcement: you have to use the same name, and new instance attributes don't pop into existence if you assign to a wrong attribute name by mistake, as they do in Python), and the lack of standards. In other code you often find other ways to denote member variables. In Java you mostly have classes and nested classes, in C you have global variables and local variables (and function arguments). But in D the situation is more complex, you have non-lambda inner functions too, so the normal safeties used in C are maybe not enough. Do you use non-lambda inner functions often in your code? I do. D is more complex and more powerful than both C and Java, so maybe it also needs more safeguards that are not needed in C and Java. You have to keep in mind that D code has new kinds of bugs (or a higher frequency of some kinds of old bugs) because D has more features than C/Java. Just trying to avoid bugs typical of C code is not enough in D! You also have to try to avoid D-specific bugs, coming from its differences and new features.
 I don't want that to be required though.
Yeah. That would not be fun.
Both @outer() and the idea by Sabalausky are fully optional. You are allowed to not use them, and removing them from code doesn't change what the code does. If library code that you are using uses them, you are not forced to use them in your code too.
There's no need for the compiler or language to get into it any further than
they already do.<
I have been running into problems with the hiding of variable names for years (I have some bug reports in bugzilla about sub-problems of this), so I presume that, for me, there is a vague need for something better to avoid these troubles. Such problems waste some of my time.
Being too strict about stuff like that would get very annoying.<
I am not sure that's true. @outer() is optional, which means you probably don't want to use it if you are writing 50-line programs, or if you are not using global variables and non-lambda inner functions a lot, or if most of your functions are pure anyway. @outer() is meant to help debugging too: when you find a bug, you add those annotations, and I think this helps uncover some problems. There are different kinds of programs, and different kinds of programmers; in some cases more strictness is desired. My experience tells me that sometimes more strictness avoids bugs that would otherwise make me waste far more time later than being strict in the first place. But of course, this requires a bit of self-discipline and experience. I program in Python too, but it has nonlocal/global and its situation is very different, so you can't compare it well with coding in D.

Bye,
bearophile
Jun 25 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-25 05:52, bearophile wrote:
 Jonathan M Davis:
That's why a lot of people typically use a different naming scheme for
member variables (e.g. prepending them with _ or m_). As long as you're
smart about naming, it's really not a problem.<
I too use the leading underscore. The problem with name tags like the leading underscore is the lack of enforcement from the compiler (there is only partial enforcement: you have to use the same name, and new instance attributes don't pop into existence if you assign to a wrong attribute name by mistake, as they do in Python), and the lack of standards. In other code you often find other ways to denote member variables. In Java you mostly have classes and nested classes, in C you have global variables and local variables (and function arguments). But in D the situation is more complex, you have non-lambda inner functions too, so the normal safeties used in C are maybe not enough. Do you use non-lambda inner functions often in your code? I do. D is more complex and more powerful than both C and Java, so maybe it also needs more safeguards that are not needed in C and Java. You have to keep in mind that D code has new kinds of bugs (or a higher frequency of some kinds of old bugs) because D has more features than C/Java. Just trying to avoid bugs typical of C code is not enough in D! You also have to try to avoid D-specific bugs, coming from its differences and new features.
The situation with regard to variable names isn't really any more complex in D than it is in C++, and in my experience it's virtually never a problem in C++. Smart variable naming makes things clear. This is a complete non-issue IMHO. It rarely creates bugs. No offense, but honestly, if you're seeing much in the way of bugs from using the wrong variable, I have to wonder if you're doing something wrong.

- Jonathan M Davis
Jun 25 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 The situation with regard to variable names isn't really any more
 complex in D than it is in C++,
I don't agree. In D I use many inner functions, so I have to be careful about where a variable is defined: whether it is global, local, or local to the outer function. This burns some of my time.
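
A small sketch of the kind of slip I mean (the function sumOfSquares and its contents are invented for this example). The inner function was meant to use a fresh local, but a forgotten declaration makes it overwrite the outer accumulator, and the compiler accepts it:

```d
int sumOfSquares(const int[] values)
{
    int total = 0;

    int square(int v)
    {
        total = v * v; // forgot "int": this assigns to the outer total
        return total;
    }

    foreach (v; values)
    {
        immutable sq = square(v); // clobbers total before the addition below
        total += sq;
    }
    return total;
}

void main()
{
    // The intended result for [1, 2, 3] is 14; the clobbering yields 18.
    assert(sumOfSquares([1, 2, 3]) == 18);
}
```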
 and in my experience it's virtually never a 
 problem in C++. Smart variable naming makes things clear. This is a complete 
 non-issue IMHO. It rarely creates bugs.
Given my recurring mild troubles, and the fact that SPARK has chosen a stricter approach, I think you are not fully right.
 No offense, but honestly, if you're
 seeing much in the way of bugs from using the wrong variable, I have to wonder
 if you're doing something wrong.
None taken, you're one of the most polite and gentle people around here, and I appreciate this. Surely different programmers have different cognitive capabilities. Sometimes in CLisp I get lost in parentheses, while I usually know where North is when I program in Haskell. There's a lot of free space to explore when you design a programming language, and I think current languages usually have parts that can be improved. Designing good computer languages is one of the most difficult activities I see around: they are interfaces between quirky and very complex mammals and refined, increasingly complex tech agglomerates able to approximate universal computation machines.

Bye,
bearophile
Jun 25 2011