digitalmars.D - Scope of variables

bearophile <bearophileHUGS lycos.com> writes:

This post is a follow-up of something I've written almost one year ago, that I
have summarized here:
http://d.puremagic.com/issues/show_bug.cgi?id=5007

Feel free to ignore this post if you are busy, or if you didn't appreciate the
previous post.


In my C/D programs some common troubles (or even bugs) are caused by the
scoping of variable/constant names: (1) wrongly thinking I am using a local
name while I am using a name from an outer scope, or (2) using an outer scope
name while I wrongly think I'm using a local name.

Outer scopes are struct/class attributes, global variables, or
variables/constants defined in the outer function when you are working in an
inner function.

Given how often such mistakes cause me trouble, I don't understand why
newer languages don't try to improve this situation with stricter visibility
rules. (In D the visibility situation is more complex than in C because there
are structs/classes and inner functions too, so I think just copying the C
rules isn't enough. There are even inner classes, although I have used them
only once in D, to port some Java code. I think they are useful mostly for
porting Java code to D).

In Python3 the default is the opposite one, and it seems safer (this
explanation about Python is not the whole story, but it's a good enough
approximation for this discussion): in a function, if you want to assign to a
variable name that's not defined locally you have to declare it with
"nonlocal" (since Python2 there is also the "global" keyword for a similar
usage, but it's not good enough if you want to use inner functions):


def foo():
    x = 10
    def bar():
        nonlocal x
        x += 1
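
A minimal sketch of what "nonlocal" changes in practice (the function names
below are invented for this illustration). Reading an outer name works without
any declaration; only rebinding it needs "nonlocal":

```python
def counter_without_nonlocal():
    x = 10
    def bar():
        # The augmented assignment makes x a local of bar(),
        # so reading it here fails with UnboundLocalError.
        x += 1
    bar()
    return x

def counter_with_nonlocal():
    x = 10
    def bar():
        nonlocal x   # explicitly refer to the x of the enclosing function
        x += 1
    bar()
    return x

def reader():
    x = 10
    def bar():
        return x + 1  # only reads the outer x: no declaration needed
    return bar()
```

Without the declaration the mistake shows up as an exception at run time,
instead of silently updating (or silently hiding) the outer x.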


Some examples of hiding outer names in D (this shows just the first half of
the problem):


int i;
void foo(int i) {} // hiding global variable
struct Bar {
    int j, w;
    static int k;
    void spam(int j) { // hiding instance variable
        int w; // hiding instance variable
    }
    void baz(int k) { // hiding static variable
        void inner(int k) {} // hiding outer variable
    }
    static void red() {
        int k; // hiding static variable
    }
}
void main() {}


You can't adopt the Python solution in D. But doing the opposite is possible:
the idea is to disallow hiding names present in outer scopes (so hiding them
generates an error) unless you mark the variable with something, like "local":



int i;
void foo(local int i) {}
struct Bar {
    int j, w;
    static int k;
    void spam(local int j) {
        local int w;
    }
    void baz(local int k) {
        void inner(local int k) {}
    }
    static void red() {
        local int k;
    }
}
void main() {}


Some time ago I even thought about an @outer attribute, which is optional and
more similar to the Python3 "nonlocal". If you add @outer to a function
signature, you have to list all the names from outer scopes that you will use
inside the current function (and if you don't use them all it's an error
again). In SPARK there is something similar, but it's more verbose and not
optional (I don't remember if the Splint lint has something similar):



int i = 1;
int j = 2;
@outer(out i, in j) void foo(int x) {
  i = x + j;
}
struct Foo {
  int x;
  @outer(in x, inout j) void bar() {
    j += x;
  }
}
void main() {}


"local" is lighter to use, in syntax too, while @outer() is heavier but more
precise (but to make it work I think "local" can't be optional). Of the two
ideas I like @outer() better. @outer is meant to avoid bugs but I think it's
usable in debugging too: when you find a bug that you think is scope-related,
you add some @outer() to make the code stricter and make the bug stand out.

In theory, with some more static introspection (like a __traits that, given a
function/method name, returns an array of the variables used inside it (or
used by the functions it calls), and how they are used: read, written or
both) it's possible to add a user-defined attribute like @outer with user code.


In bug 5007 Nick Sabalausky has shown a simpler idea:


int globalVar;
class Foo {
    int instanceVar;
    static int classVar;

    @explicitLookup // Name subject to change
    void bar() {

        int globalVar;   // Error
        int instanceVar; // Error
        int classVar;    // Error

        globalVar   = 1; // Error
        instanceVar = 1; // Error
        classVar    = 1; // Error

        .globalVar       = 1; // Ok
        this.instanceVar = 1; // Ok
        Foo.classVar     = 1; // Ok
    }
}

Bye,
bearophile

Jun 24 2011

Adam D. Ruppe <destructionator gmail.com> writes:

> void baz(local int k) {
>     void inner(local int k) {}
> }

This is unneeded. By declaring a local variable, it's
*obviously* local - there's no point in saying local again!

Now, I've forgotten that I had a variable declared before, but
adding more stuff to the argument list wouldn't change anything,
because if I was looking at the argument list, I would have realized
the local variable was there anyway!

If D was a bad language that allowed implicit variable definitions,
it might make sense to do this, but we already declare all vars so
it adds nothing.

> void foo(int i) {} // hiding global variable

This is a good thing because you can reason about it locally. If
you declare a variable locally, you know it is going to work.

What I've taken to doing is if I definitely want to access a
class var, I'll just write this and use the dot to get to a global.

That way, it's always clear what's going on without looking
back at the function definition.


I don't want that to be required though.

Jun 24 2011

bearophile <bearophileHUGS lycos.com> writes:

Adam D. Ruppe:

>> void baz(local int k) {
>>     void inner(local int k) {}
>> }
>
> This is unneeded. By declaring a local variable, it's
> *obviously* local - there's no point in saying local again!
>
> If D was a bad language that allowed implicit variable definitions,
> it might make sense to do this, but we already declare all vars so
> it adds nothing.

You are right, but the purpose of that "local" annotation is a bit different:
if you define a local k without that annotation and a variable named k already
exists in an outer scope, the compiler is supposed to generate an error. So
"local" is a way to tell the compiler that you know there is an outer name "k"
and you want to hide it.

Maybe "local" is not the best possible name for this annotation, "hider" seems
clearer :-)


> What I've taken to doing is if I definitely want to access a
> class var, I'll just write this and use the dot to get to a global.
>
> That way, it's always clear what's going on without looking
> back at the function definition.
>
> I don't want that to be required though.

The purpose of the _optional_ @explicitLookup annotation by Nick Sabalausky is
exactly that: to ask the compiler to enforce what you already do.

Bye,
bearophile

Jun 24 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On 2011-06-24 17:39, Adam D. Ruppe wrote:
>> void baz(local int k) {
>>     void inner(local int k) {}
>> }
>
> This is unneeded. By declaring a local variable, it's
> *obviously* local - there's no point in saying local again!
>
> Now, I've forgotten that I had a variable declared before, but
> adding more stuff to the argument list wouldn't change anything,
> because if I was looking at the argument list, I would have realized
> the local variable was there anyway!
>
> If D was a bad language that allowed implicit variable definitions,
> it might make sense to do this, but we already declare all vars so
> it adds nothing.
>
>> void foo(int i) {} // hiding global variable
>
> This is a good thing because you can reason about it locally. If
> you declare a variable locally, you know it is going to work.
>
> What I've taken to doing is if I definitely want to access a
> class var, I'll just write this and use the dot to get to a global.
>
> That way, it's always clear what's going on without looking
> back at the function definition.

That's why a lot of people typically use a different naming scheme for member 
variables (e.g. prepending them with _ or m_). As long as you're smart about 
naming, it's really not a problem.

> I don't want that to be required though.

Yeah. That would not be fun. As it is, there are a variety of ways that you 
can go about differentiating between globals, locals, and member variables. 
There's no need for the compiler or language to get into it any further than 
they already do. Being too strict about stuff like that would get very 
annoying.

- Jonathan M Davis

Jun 24 2011

Peter Alexander <peter.alexander.au gmail.com> writes:

On 25/06/11 2:42 AM, Jonathan M Davis wrote:
> That's why a lot of people typically use a different naming scheme for member
> variables (e.g. prepending them with _ or m_). As long as you're smart about
> naming, it's really not a problem.
>
>> I don't want that to be required though.
>
> Yeah. That would not be fun. As it is, there are a variety of ways that you
> can go about differentiating between globals, locals, and member variables.
> There's no need for the compiler or language to get into it any further than
> they already do. Being too strict about stuff like that would get very
> annoying.
>
> - Jonathan M Davis

Agree with this 100%.

This is a solved problem. No need for throwing more and more language 
features at it.

Jun 25 2011

bearophile <bearophileHUGS lycos.com> writes:

Peter Alexander:

> This is a solved problem. No need for throwing more and more language
> features at it.

I agree, from what I am seeing in Haskell it's a mostly solved problem.

Bye,
bearophile

Jun 25 2011

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

>That's why a lot of people typically use a different naming scheme for member
variables (e.g. prepending them with _ or m_). As long as you're smart about
naming, it's really not a problem.<

I too use the leading underscore. The problem with using name tags like the
leading underscore is the lack of enforcement from the compiler (there is only
a partial enforcement: you have to use the same name consistently, and new
instance attributes don't pop into existence if you assign to a wrong
attribute name by mistake, as they do in Python), and the lack of standards.
In other code you often find other ways to denote member variables.
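
A small sketch of the Python behaviour mentioned above, where assigning to a
misspelled attribute creates a new one. The class names and the misspelled
"cuont" attribute are invented for the example; __slots__ is one standard way
to make Python reject such assignments:

```python
class Counter:
    def __init__(self):
        self.count = 0

c = Counter()
c.cuont = 5   # typo: a new attribute silently pops into existence
# c.count is still 0 here, the intended attribute was never touched

class StrictCounter:
    __slots__ = ("count",)   # only these attribute names are allowed

    def __init__(self):
        self.count = 0

s = StrictCounter()
try:
    s.cuont = 5   # same typo, now rejected with AttributeError
    typo_rejected = False
except AttributeError:
    typo_rejected = True
```

In D the equivalent of the first assignment is a compile-time error, which is
the partial enforcement I am talking about.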

In Java you mostly have classes and nested classes, in C you have global
variables and local variables (and function arguments). But in D the situation
is more complex, you have non-lambda inner functions too, so the normal
safeties used in C may not be enough. Do you use non-lambda inner functions
often in your code? I do.

D is more complex and more powerful than both C and Java, so maybe it also
needs more safeguards that are not needed in C and Java. You have to keep in
mind that D code has new kinds of bugs (or higher frequency of some kinds of
old bugs) because D has more features than C/Java.

Just trying to avoid bugs typical of C code is not enough in D! You also have
to try to avoid D-specific bugs, coming from its differences and new features.


>> I don't want that to be required though.
>
> Yeah. That would not be fun.

Both @outer() and the idea by Sabalausky are fully optional. You are
allowed to not use them, and removing them from code doesn't change what the
code does. If library code that you are using uses them, you are not forced to
use them in your code too.


>There's no need for the compiler or language to get into it any further than
they already do.<

I have been running into problems with the hiding of variable names for years
(I have some bug reports in Bugzilla about sub-problems of this), so I presume
that, for me at least, there is a need for something better, to avoid the
troubles I keep finding. Such problems waste some of my time.


>Being too strict about stuff like that would get very annoying.<

I am not sure that's true. @outer() is optional; this means you probably
don't want to use it if you are writing 50-line programs, or if you are not
using global variables and non-lambda inner functions a lot, or if most of
your functions are pure anyway. @outer() is meant to help debugging too: when
you find a bug, you add those annotations, and I think this helps uncover
some problems.

There are different kinds of programs, and different kinds of programmers, in
some cases more strictness is desired. My experience tells me that sometimes
more strictness avoids bugs that otherwise later make me waste far more time
than being strict in the first place.

But of course, this requires a bit of self-discipline and experience. I program
in Python too, but it has nonlocal/global and its situation is very different,
so you can't compare it well with coding in D.

Bye,
bearophile

Jun 25 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On 2011-06-25 05:52, bearophile wrote:
> Jonathan M Davis:
>> That's why a lot of people typically use a different naming scheme for
>> member variables (e.g. prepending them with _ or m_). As long as you're
>> smart about naming, it's really not a problem.
>
> I too use the leading underscore. The problem with using name tags like the
> leading underscore is lack of enforcement from the compiler (there is only
> a partial enforcement: you have to use the same name and new instance
> attributes don't pop into existence if you assign to a wrong attribute
> name by mistake, as in Python), and lack of standards. In other code you
> often find other ways to denote member variables.
>
> In Java you mostly have classes and nested classes, in C you have global
> variables and local variables (and function arguments). But in D the
> situation is more complex, you have non-lambda inner functions too, so the
> normal safeties used in C maybe are not enough. Do you use non-lambda
> inner functions often in your code? I do.
>
> D is more complex and more powerful than both C and Java, so maybe it also
> needs more safeguards that are not needed in C and Java. You have to keep
> in mind that D code has new kinds of bugs (or higher frequency of some
> kinds of old bugs) because D has more features than C/Java.
>
> Just trying to avoid bugs typical of C code is not enough in D! You have
> also to try to avoid D-specific bugs, coming from its differences and new
> features.

The situation with regard to variable names isn't really any more complex in
D than it is in C++, and in my experience it's virtually never a problem in
C++. Smart variable naming makes things clear. This is a complete non-issue
IMHO. It rarely creates bugs. No offense, but honestly, if you're seeing much
in the way of bugs from using the wrong variable, I have to wonder if you're
doing something wrong.

- Jonathan M Davis

Jun 25 2011

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

> The situation with regard to variable names isn't really any more
> complex in D than it is in C++,

I don't agree. In D I use many inner functions, so I have to be careful about
where a variable is defined, if global, if local, or if local in the outer
function. This burns some of my time.


> and in my experience it's virtually never a
> problem in C++. Smart variable naming makes things clear. This is a complete
> non-issue IMHO. It rarely creates bugs.

Seeing my recurring mild troubles, and how SPARK has chosen a stricter
attitude, I think you are not fully right.


> No offense, but honestly, if you're
> seeing much in the way of bugs from using the wrong variable, I have to wonder
> if you're doing something wrong.

None taken, you're one of the most polite and gentle persons around here, I
appreciate this.

Surely different programmers have different cognitive capabilities. Sometimes
in CLisp I get lost in parentheses, while I usually know where the North is
while I program in Haskell.

There's a lot of free space to explore while you design a programming language,
and I think current languages usually have parts that can be improved.

Designing good computer languages is one of the most difficult activities I see
around, they are interfaces between quirky and very complex mammals and refined
and increasingly complex tech agglomerates able to approximate universal
computation machines.

Bye,
bearophile

Jun 25 2011
