digitalmars.D - Clang error recovery

bearophile <bearophileHUGS lycos.com> writes:

This is the latest post on the LLVM blog, "Amazing feats of Clang Error
Recovery", by Chris Lattner:
http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html

I've compared dmd against a few of the Clang examples from that post. I don't
comment on every one of them, because some are specific to C++ and some I
don't understand. If you can write better translations to D, or if you
understand more of them, you can add and show me more comparisons :-)

=============================

Clang:


int foo(int x, pid_t y) {
  return x+y;
}


t.c:1:16: error: unknown type name 'pid_t'
int foo(int x, pid_t y) {
               ^

-----------------

dmd 2.042:

int foo(int x, pid_t y) {
  return x+y;
}
void main() {}


temp.d(1): Error: identifier 'pid_t' is not defined
temp.d(1): Error: pid_t is used as a type
temp.d(1): Error: cannot have parameter of type void

-----------------

Here Clang gives a single error message (and it gives the error column).
The three errors given by dmd carry the same information.
The dmd error messages are better than the GCC 4.2 ones, but I think a
single good error message is better than three.

=============================

Clang:


#include <inttypes.h>
int64 x;


t.c:2:1: error: unknown type name 'int64'; did you mean 'int64_t'?
int64 x;
^~~~~
int64_t

-----------------

dmd:

I am not sure if this is the same situation:

alias uint uint64_t;
int foo(uint64 x) {
  return x * 2;
}
void main() {}


dmd prints:

temp.d(2): Error: identifier 'uint64' is not defined
temp.d(2): Error: uint64 is used as a type
temp.d(2): Error: cannot have parameter of type void


That page says:

>Code that later used 'x', for example, knows that it is declared as an int64_t,
so it doesn't lead to other weird follow on errors that don't make any sense.<

But I don't understand what it means. Maybe this is why this blog post is
titled "Clang Error Recovery" instead of "Clang Error Messages".
Later in the same post it says something similar:

>In addition to getting the error message right (and suggesting a fixit
replacement to "::"), Clang "knows what you mean" so it handles the subsequent
uses of a2 correctly.<

=============================

Clang:

namespace foo {
  struct x { int y; };
}
namespace bar {
  typedef int y;
}
void test() {
  foo::x a;
  bar::y b;
  a + b;
}



t.cc:10:5: error: invalid operands to binary expression ('foo::x' and 'bar::y'
      (aka 'int'))
  a + b;
  ~ ^ ~

-----------------

dmd:

D doesn't have namespaces, so I have adapted that code like this:


struct foo {
  static struct x { int y; };
}
struct bar {
  typedef int y;
}
void test() {
  foo.x a;
  bar.y b;
  a + b;
}
void main() {}


dmd prints:

temp.d(10): Error: incompatible types for ((a) + (b)): 'x' and 'y'
temp.d(10): Error: + has no effect in expression ((__error) + (__error))

Bye,
bearophile

Apr 06 2010

Robert Clipsham <robert octarineparrot.com> writes:

On 06/04/10 13:05, bearophile wrote:
 Here Clang gives a single error message (and it gives the error column).

The error column won't happen in DMD. Walter has mentioned many times
before that no one ever commented on the feature, and no one seemed to
care when it disappeared... Maybe this will change as clang increases in
popularity? That said, with properly formatted code it's generally not
hard to find where your error is, as there isn't much on a line.

I also like the idea of giving a single error message. Most of the time
when I have a nice list of errors I skim through and look for identifier
names where I forgot to import a module, or pick out some line numbers
that seem to have the first error on them and jump to them in my editor...
Both of these would be easier to do with only one error message.

 Here the errors given by dmd give the same information.
 Here dmd gives error messages that are better than GCC 4.2 ones, but I think a
single good error message is better than three.

I agree here, those extra errors can be a pain sometimes when you're 
trying to find the actual cause.
 temp.d(2): Error: identifier 'uint64' is not defined
 temp.d(2): Error: uint64 is used as a type
 temp.d(2): Error: cannot have parameter of type void


 That page says:

 Code that later used 'x', for example, knows that it is declared as an
int64_t, so it doesn't lead to other weird follow on errors that don't make any
sense.<


I don't think the spell checker has been implemented for types, just
identifiers... If this is the case then it shouldn't take much to add in
the spell checking. As for later error messages, that will take more
effort, although I would imagine it would just require changing the type
to the spell checked type internally to avoid the later errors.

 But I don't understand what it means. Maybe this is why this blog post is
titled "Clang Error Recovery" instead of "Clang Error Messages".

It means that clang knows the type given is incorrect, so when it
continues analyzing the code it will pretend you gave (what it thinks
is) the right type, rather than giving more errors because you gave the
incorrect type.
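
The recovery described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not Clang's or dmd's actual code: the `Frontend` struct, its `corrections` table and the `declare`/`typeOf` helpers are all hypothetical names invented for the example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical miniature front end; none of these names come from Clang or dmd.
struct Frontend {
    std::vector<std::string> knownTypes;              // declared type names
    std::map<std::string, std::string> corrections;  // misspelling -> suggested type
    std::map<std::string, std::string> variables;    // variable -> resolved type
    std::vector<std::string> errors;                  // diagnostics emitted so far

    bool isKnown(const std::string& type) const {
        for (const std::string& t : knownTypes)
            if (t == type) return true;
        return false;
    }

    // Handle a declaration of the form `typeName varName;`.
    void declare(const std::string& typeName, const std::string& varName) {
        if (isKnown(typeName)) {
            variables[varName] = typeName;
            return;
        }
        auto fix = corrections.find(typeName);
        if (fix != corrections.end()) {
            // Report once, then continue as if the suggested type had been written.
            errors.push_back("unknown type name '" + typeName +
                             "'; did you mean '" + fix->second + "'?");
            variables[varName] = fix->second;
            return;
        }
        // No plausible correction: report and leave the variable undeclared.
        errors.push_back("unknown type name '" + typeName + "'");
    }

    // Look up a variable at a later use; "" means it was never declared.
    std::string typeOf(const std::string& varName) {
        auto it = variables.find(varName);
        if (it == variables.end()) {
            errors.push_back("undefined identifier '" + varName + "'");
            return "";
        }
        return it->second;
    }
};
```

With `int64 x;` and a correction from `int64` to `int64_t`, the declaration produces one diagnostic and later uses of `x` see `int64_t`, so they add no follow-on errors. A declaration with no usable correction still leaves the variable undeclared, and each later use reports again.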

 Later in the same post it says something similar:

 In addition to getting the error message right (and suggesting a fixit
replacement to "::"), Clang "knows what you mean" so it handles the subsequent
uses of a2 correctly.<


This is the same thing as above.

 temp.d(10): Error: incompatible types for ((a) + (b)): 'x' and 'y'
 temp.d(10): Error: + has no effect in expression ((__error) + (__error))

Other than the column, this gives roughly the same information. I guess 
the only way to improve here would be to remove the second error, but 
it's not really much of an issue.

 Bye,
 bearophile

Apr 06 2010

bearophile <bearophileHUGS lycos.com> writes:

Thank you for your comments, Robert Clipsham.

The error column won't happen in DMD,<

The C# dotnet compiler too shows the error column. But I agree with you ...
it's not so important.


A few possible improvements to dmd's error messages, from that article, from
your answers and from my experience:
- A compiler switch to stop the compilation after the first error message, or
after the first few;
- Use the true Levenshtein distance to find the typing errors;
- Spell checker for types too;
- Maybe, as you suggest, changing the type to the spell checked type internally
to avoid some of the later errors;
- Printing fewer error messages, increasing their semantic density. This is not
easy to do;
- In Bugzilla I have added some bug reports that list specific situations where
the error message can be improved (in theory even I can fix some of them. In
the meantime one of them has been fixed by Don and Walter).

Bye,
bearophile

Apr 06 2010

Robert Clipsham <robert octarineparrot.com> writes:

On 06/04/10 19:19, bearophile wrote:
 Few possible improvements of Dmd error messages, from that article, from your
answers and from my experience:
 - A compiler switch to stop the compilation after the first or few first error
messages;

I really don't like this option; if there needs to be an option for it,
the compiler is doing something wrong. On POSIX-based systems you can use:

dmd myFileWithErrors.d |& head

to replicate this if you want it. I don't know about Windows.

 - Use the true Levenshtein distance to find the typing errors;

The suggestions I've received for misspelled types have been pretty good,
so I'm not sure what advantage this would give. I'd agree that the proper
way to do it should be used though, particularly if it gives better
suggestions.

 - Spell checker for types too;
 - Maybe, as you suggest, changing the type to the spell checked type
internally to avoid some of the later errors;

Agreed, neither of these should be too hard to add should someone feel 
inclined to do so.

 - Printing less error messages, increasing their semantic density. This is not
easy to do;

This would also be nice, but I think the effort required to add it is 
too much for us to worry about at this stage, there are far more 
important things to work on.

Apr 06 2010

bearophile <bearophileHUGS lycos.com> writes:

Robert Clipsham:

 I really don't like this option, if there needs to be an option for it 
 the compiler is doing something wrong. On posix based systems you can use:
 dmd myFileWithErrors.d |& head

GCC has that option, it's named -Wfatal-errors


 To replicate this if you want it, I don't know about windows.

I use GNU head on Windows too:
http://sourceforge.net/projects/unxutils/
A small problem with head is that the compilation goes on, and it can take some
time to stop, while -Wfatal-errors stops the compiler quickly.

Bye,
bearophile

Apr 06 2010

Ary Borenszweig <ary esperanto.org.ar> writes:

bearophile wrote:
 Here Clang gives a single error message (and it gives the error column).
 Here the errors given by dmd give the same information.
 Here dmd gives error messages that are better than GCC 4.2 ones, but I think a
single good error message is better than three.

dmd has an ErrorType that is used when an expression or something else gives
an error. The problem is that it is a kind of alias of the int type, so it
continues to give errors. If ErrorType no longer triggered further errors,
that would solve the problem. (In some cases I think a void type is returned
instead of an error type.)

Apr 06 2010

Don <nospam nospam.com> writes:

Ary Borenszweig wrote:
 dmd has an ErrorType when an expression or something gives an error. The 
 problem is it is a kind of an alias of an int type, so it continues to 
 give errors. If such ErrorType would not trigger errors anymore, that 
 would solve the problem. (in some cases I think a void type is returned 
 instead of an error type)

There's also an ErrorExpression which is used in many places, and it
generally works properly in suppressing errors. (__error shows up in
error messages when it hasn't been treated properly.)

Apr 06 2010

Walter Bright <newshound1 digitalmars.com> writes:

Don wrote:
 There's also an ErrorExpression which is used in many places, and it 
 generally works properly in supressing errors. (__error shows up in 
 error messages when it hasn't been treated properly).

Attempting to correct the error and move forward with the compilation sounds 
good, but generally is a hopeless failure. A far better approach, one that is 
half-implemented in dmd, is to replace failed types, expressions, etc., with 
special error productions, and then suppress further messages that have as 
operands one of those error productions.

It's analogous to using NaNs in floating point.
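
A minimal C++ sketch of the error-production idea, under stated assumptions: the `Type` enum, the `Diagnostics` struct and the `resolveType`/`checkAdd` helpers are hypothetical and are not taken from the dmd source. A failed lookup reports once and yields a poisoned value, and any later check that sees a poisoned operand stays silent and passes the poison along, much as NaN propagates through arithmetic.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature type system; Error plays the role of NaN.
enum class Type { Int, Struct, Error };

struct Diagnostics {
    std::vector<std::string> messages;
    void error(const std::string& msg) { messages.push_back(msg); }
};

// Resolve a type name. An unknown name is reported once and becomes Type::Error.
Type resolveType(const std::string& name, Diagnostics& diag) {
    if (name == "int") return Type::Int;
    diag.error("unknown type name '" + name + "'");
    return Type::Error;
}

// Type-check `lhs + rhs`.
Type checkAdd(Type lhs, Type rhs, Diagnostics& diag) {
    // An operand that is already an error production was reported earlier:
    // say nothing more and propagate the error.
    if (lhs == Type::Error || rhs == Type::Error) return Type::Error;
    if (lhs == Type::Int && rhs == Type::Int) return Type::Int;
    // A genuinely new problem: report it and poison the result.
    diag.error("incompatible types for +");
    return Type::Error;
}
```

In this sketch, `x + y` with `y` declared as the unknown `pid_t` yields exactly one message, the one from `resolveType`. The design choice is that only the place where an error first arises may report it; everything downstream just checks for the sentinel.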

Apr 06 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 04/06/2010 04:52 PM, Walter Bright wrote:
 Don wrote:
 There's also an ErrorExpression which is used in many places, and it
 generally works properly in supressing errors. (__error shows up in
 error messages when it hasn't been treated properly).

 Attempting to correct the error and move forward with the compilation
 sounds good, but generally is a hopeless failure. A far better approach,
 one that is half-implemented in dmd, is to replace failed types,
 expressions, etc., with special error productions, and then suppress
 further messages that have as operands one of those error productions.

 It's analogous to using NaNs in floating point.

NaP should be its name I guess :o).

Andrei

Apr 06 2010

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 Clang:
 
 
 #include <inttypes.h>
 int64 x;
 
 
 t.c:2:1: error: unknown type name 'int64'; did you mean 'int64_t'?
 int64 x;
 ^~~~~
 int64_t
 
 -----------------
 
 dmd:
 
 I am not sure if this is the same situation:
 
 alias uint uint64_t;
 int foo(uint64 x) {
   return x * 2;
 }
 void main() {}
 
 
 dmd prints:
 
 temp.d(2): Error: identifier 'uint64' is not defined

dmd's spell checker only looks at a distance of one, and uint64 is a distance
of two from uint64_t. This is trivially changed, but I didn't do the longer
distances because of the annoyances of false positives - variable name spelling
doesn't work like English language spelling.

The issue is not, as has been suggested, that dmd doesn't do spelling checks on 
types.

There's really nothing "amazing" about a spell checker, it's just a better idea 
than not doing it.
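
To make the distance figures concrete, here is a hedged C++ sketch of an edit distance in which insertions, deletions, substitutions and adjacent transpositions each cost one (the "optimal string alignment" variant). It is a textbook dynamic-programming formulation chosen for illustration; it is an assumption that it matches the counting used here, and it is not the code in dmd's speller.c. The function name `editDistance` is made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Optimal string alignment distance: insertion, deletion, substitution and
// adjacent transposition each cost 1.
std::size_t editDistance(const std::string& a, const std::string& b) {
    const std::size_t n = a.size(), m = b.size();
    // d[i][j] = distance between the first i chars of a and the first j chars of b
    std::vector<std::vector<std::size_t>> d(n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = 0; i <= n; ++i) d[i][0] = i;  // delete all i chars
    for (std::size_t j = 0; j <= m; ++j) d[0][j] = j;  // insert all j chars
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            d[i][j] = std::min({d[i - 1][j] + 1,           // deletion
                                d[i][j - 1] + 1,           // insertion
                                d[i - 1][j - 1] + cost});  // substitution or match
            if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                d[i][j] = std::min(d[i][j], d[i - 2][j - 2] + 1);  // transposition
            }
        }
    }
    return d[n][m];
}
```

Under this definition `editDistance("uint64", "uint64_t")` is 2, because two characters must be inserted, while a swapped pair such as "lenght" versus "length" is 1.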

Apr 06 2010

Brad Roberts <braddr slice-2.puremagic.com> writes:

On Tue, 6 Apr 2010, Walter Bright wrote:

 dmd's spell checker only looks a distance of one, and uint64 is a distance of
 two from uint64_t. This is trivially changed, but I didn't do the longer
 distances because of the annoyances of false positives - variable name
 spelling doesn't work like english language spelling.
 
 The issue is not, as has been suggested, that dmd doesn't do spelling checks
 on types.
 
 There's really nothing "amazing" about a spell checker, it's just a better
 idea than not doing it.

Consider trying increasing distances (with some relatively low max). If
you hit a single suggestable correction, substitute it. I.e., for uint64,
nothing at 0 or 1, one at 2 (uint64_t), so use it (but still error,
obviously).

This could be particularly useful for simple 2 letter transpositions, if
those are found by the checker. That's a common mistake for a lot of people
with length, for instance.

Later,
Brad

Apr 06 2010

Walter Bright <newshound1 digitalmars.com> writes:

Brad Roberts wrote:
 Consider trying increasing distances (with some relatively low max).  If 
 you hit a single suggestable correction, substitute it.  ie, for uint64, 
 nothing at 0 or 1, one at 2 (uint64_t) so use it (but still error, 
 obviously).

Of course if you do more distances, pick the shortest match!

 
 This could be particularly useful for simple 2 letter transpositions, if 
 those are found by the checker.. a common thing for a lot of people for 
 length, for instance.

Transpositions count as 1. See the dmd source file speller.c.
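
The increasing-distance search discussed in this exchange could look roughly like the C++ sketch below. It is an assumption-laden illustration, not dmd's speller: the `suggest` function and the `Distance` alias are hypothetical, the distance metric is supplied by the caller, and giving up when several candidates tie at the nearest distance is one possible reading of "a single suggestable correction".

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Caller-supplied metric, e.g. an edit distance where a transposition costs 1.
using Distance = std::function<std::size_t(const std::string&, const std::string&)>;

// Try distance 1, then 2, ... up to maxDistance. Return the candidate only if
// it is the single match at the nearest distance that has any match at all;
// return "" when nothing is close enough or the nearest matches are ambiguous.
std::string suggest(const std::string& name,
                    const std::vector<std::string>& known,
                    std::size_t maxDistance,
                    const Distance& distance) {
    for (std::size_t radius = 1; radius <= maxDistance; ++radius) {
        std::string match;
        std::size_t hits = 0;
        for (const std::string& candidate : known) {
            if (distance(name, candidate) == radius) {
                match = candidate;
                ++hits;
            }
        }
        if (hits == 1) return match;  // unique nearest match: suggest it
        if (hits > 1) return "";      // tie at the nearest distance: no suggestion
    }
    return "";
}
```

Because the loop stops at the first radius that has any hit, the shortest match always wins, and a low `maxDistance` keeps false positives in check. The compiler would still report the error; the suggestion only decides what to print and, optionally, which symbol to continue with.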

Apr 06 2010
