www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - How compiler detects forward reference errors

reply Igor <stojkovic.igor gmail.com> writes:
Can anyone explain in plain English how does compiler process and 
detect a "test.d(6) Error: forward reference of variable a" in 
following code:

import std.stdio;

enum a = 1 + b;
enum d = 5 + a; // No error here
enum b = 12 + c;
enum c = 10 + a; // error here

void main()
{
     writeln("Hello World!", b);
}
Sep 03 2016
next sibling parent reply Lodovico Giaretta <lodovico giaretart.net> writes:
On Saturday, 3 September 2016 at 14:06:06 UTC, Igor wrote:
 Can anyone explain in plain English how does compiler process 
 and detect a "test.d(6) Error: forward reference of variable a" 
 in following code:

 import std.stdio;

 enum a = 1 + b;
 enum d = 5 + a; // No error here
 enum b = 12 + c;
 enum c = 10 + a; // error here

 void main()
 {
     writeln("Hello World!", b);
 }
a needs b to be initialized. So b must be initialized before a. Let's write this b->a. Now b needs c. So c->b. c needs a, so a->c. If we sum everything, we have that a->c->b->a. This mean that to initialize a we need b, to initialize b we need c, but to initialize c we need a. So to initialize a we need a, which is not possible. We need a before having initialized it. On the other hand, a->d is not a problem, as d can be initialized after a.
Sep 03 2016
parent reply Igor <stojkovic.igor gmail.com> writes:
On Saturday, 3 September 2016 at 14:13:27 UTC, Lodovico Giaretta 
wrote:
 On Saturday, 3 September 2016 at 14:06:06 UTC, Igor wrote:
 Can anyone explain in plain English how does compiler process 
 and detect a "test.d(6) Error: forward reference of variable 
 a" in following code:

 import std.stdio;

 enum a = 1 + b;
 enum d = 5 + a; // No error here
 enum b = 12 + c;
 enum c = 10 + a; // error here

 void main()
 {
     writeln("Hello World!", b);
 }
a needs b to be initialized. So b must be initialized before a. Let's write this b->a. Now b needs c. So c->b. c needs a, so a->c. If we sum everything, we have that a->c->b->a. This mean that to initialize a we need b, to initialize b we need c, but to initialize c we need a. So to initialize a we need a, which is not possible. We need a before having initialized it. On the other hand, a->d is not a problem, as d can be initialized after a.
So, you are saying compiler is keeping a kind of linked list of dependencies and then checks if any of those lists are circular? But how exactly would that list be structured since one expression can have multiple dependencies, like: enum a = b + c + d + e; enum b = 10 + c; enum c = d + e + a; ...
Sep 04 2016
next sibling parent Lodovico Giaretta <lodovico giaretart.net> writes:
On Sunday, 4 September 2016 at 19:15:15 UTC, Igor wrote:
 So, you are saying compiler is keeping a kind of linked list of 
 dependencies and then checks if any of those lists are 
 circular? But how exactly would that list be structured since 
 one expression can have multiple dependencies, like:

 enum a = b + c + d + e;
 enum b = 10 + c;
 enum c = d + e + a;
 ...
Disclaimer: I don't know how the D compiler works internally. I just happen to know how generic compilers usually work. I think the easiest solution is to keep a stack of what you are doing. So in your last example, we start with evaluating a. We first see that a depends on b. So we switch evaluating b. As we do so, our stack contains a and b. We see that b depends on c. So we switch to evaluate c. Our stack becomes a, b, c. We see that c depends on d. So we evaluate d. Our stack is a, b, c, d. You didn't give a definition for d. Let's assume it does not depend on anything. So we successfully evaluated d. So we get back to c. Our stack is a, b, c. We see that c also depends on e. So we evaluate e. Our stack is a, b, c, e. You didn't give a definition for e. Let's assume it does not depend on anything. So we successfully evaluated e. So we get back to c. Our stack is a, b, c. We see that c also depends on a. So we evaluate a. Our stack becomes a, b, c, a. Ops... Our stack contains a two times. This means that to evaluate a we need the value of a. So we tell the user that we cannot proceed further. By the way, the stack also tells us how a depends on itself. In fact, the stack a, b, c, a tells us that a depends on b which depends on c which depends on a. I hope this is a clear enough explanation. If not, feel free to ask further.
Sep 04 2016
prev sibling parent reply Illuminati <Joe Masons.com> writes:
On Sunday, 4 September 2016 at 19:15:15 UTC, Igor wrote:
 On Saturday, 3 September 2016 at 14:13:27 UTC, Lodovico 
 Giaretta wrote:
 On Saturday, 3 September 2016 at 14:06:06 UTC, Igor wrote:
 Can anyone explain in plain English how does compiler process 
 and detect a "test.d(6) Error: forward reference of variable 
 a" in following code:

 import std.stdio;

 enum a = 1 + b;
 enum d = 5 + a; // No error here
 enum b = 12 + c;
 enum c = 10 + a; // error here

 void main()
 {
     writeln("Hello World!", b);
 }
a needs b to be initialized. So b must be initialized before a. Let's write this b->a. Now b needs c. So c->b. c needs a, so a->c. If we sum everything, we have that a->c->b->a. This mean that to initialize a we need b, to initialize b we need c, but to initialize c we need a. So to initialize a we need a, which is not possible. We need a before having initialized it. On the other hand, a->d is not a problem, as d can be initialized after a.
So, you are saying compiler is keeping a kind of linked list of dependencies and then checks if any of those lists are circular? But how exactly would that list be structured since one expression can have multiple dependencies, like: enum a = b + c + d + e; enum b = 10 + c; enum c = d + e + a; ...
It is not complicated. Each expression is incomplete. The compiler can easily keep track if a variable is completely defined or incompletely defined. If it is completely defined it must be able to be evaluated at CT. It can be evaluated because it only depends on things that are evaluatable(such as numbers and mathematical operations, etc). So, any expression that depends on something else, that something else must be completely evaluatable for the dependent expression to be completely evaluatable. That is "completely evaluatable" is transitive. As the compiler processes expressions, it can check just have a flag to mark variable that "hold" those expressions(or were assigned) as either being evaluated or not. If it is, it, say, is marked. Then other other expressions that depend on the variable(s) require all it's dependencies to have been evaluated. For forward references, it is not much difference between that and an undefined variable.... except the compiler "looks ahead" and knows that the variables were declared after they were used. Not a lot of difference between the two except the compiler tries to be a little smarter in one case. For the compiler to actually finish it's job, every single expression, statement, etc must be interpreted properly. If it is not, it can't compute the result. The "program" is invalid. There is effectively a one to one correspondence between the languages grammar and the "machine code" that the compiler spits out. If there is not, then the compiler simply cannot produce a result(it does not know how or understand what the programmer wants). It is very complex, in many ways, but the rules are simple and well defined(they must be to be able to produce consistent and valid results). It is like physics or any scientific thing. The are "laws" of behavior and no matter how complex something behaves in the physical world, it must behave according to those laws. There are laws in a programming language(the grammar) and every program must abide by them or the program is invalid. Using a variable before it is defined does not make sense. It is illogical, unmathematical, unscientific, and meaningless. One could argue that something like enum a = b + c; enum b = 10 + c; enum c = 4; could be valid, and this is true, if we work from the bottom up. But that is just grammatical transformations. We generally work top down and all grammars I know of use top down. One can't use both because it would produce ambiguous results and is rather meaningless. But enum a = b; enum b = a; is meaningless in any interpretation and doesn't work no matter how one defines things, in general. Of course, one can build in any type of craziness in to a grammar and make such things interpretable but generally, while simple cases work, more complex cases become ambiguous and meaningless, hard to reason about, etc. So, a compiler creates a set of rules and a programmer(should be programmar, but...) learns those rules and tries to express what he wants done by the program in terms of those rules. The compiler then takes the "code" from the programmer and checks to see if it fits the rules of the grammar and if it does, it converts the code in to a program to execute the "specific" rules the programmer coded. Now, the compiler can decide how it wants to process the code. This is where the "forward reference" comes in for D. Not all compilers do this. Some will just process the first line and say "Hey, there is no b in my symbol table. FAIL!". D goes a step further and doesn't stop immediately, it processes the next line.. etc. But all that is implementation details that depends on the specific compilers.
Sep 04 2016
parent Igor <stojkovic.igor gmail.com> writes:
Thank you all for your replies. I am trying to learn a bit about 
compiler and language design and I really like D among many other 
languages I read about so I am trying to learn from it as well.
Sep 08 2016
prev sibling next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 3 September 2016 at 14:06:06 UTC, Igor wrote:
 Can anyone explain in plain English how does compiler process 
 and detect a "test.d(6) Error: forward reference of variable a" 
 in following code:
Via bugs, it does if(would_be_annoying_to_user) trigger_error(); else let_it_work(); If a user complains they say "report to bugzilla" but bugzilla is conveniently redirected to "wait forever; fix not coming". It is all random bugs
Sep 03 2016
prev sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 3 September 2016 at 14:06:06 UTC, Igor wrote:
 Can anyone explain in plain English how does compiler process 
 and detect a "test.d(6) Error: forward reference of variable a" 
 in following code:

 import std.stdio;

 enum a = 1 + b;
 enum d = 5 + a; // No error here
 enum b = 12 + c;
 enum c = 10 + a; // error here

 void main()
 {
     writeln("Hello World!", b);
 }
This is an recursive expression. c = 10 + 1 + 12 + c ... and therefore cannot be compiled.
Sep 03 2016