digitalmars.D - List of issues PVS-Studio statically analyzes for

Walter Bright <newshound2 digitalmars.com> writes:
I always have a side interest in program static analysis. This one is for C, is 
expensive (!), and kindly provided a list of what it looks for:

http://www.viva64.com/en/d/

Some of these might be useful to add into D.
Jul 22 2011
bearophile <bearophileHUGS lycos.com> writes:
Walter:

> I always have a side interest in program static analysis. This one is for C,
> is expensive (!), and kindly provided a list of what it looks for:
> http://www.viva64.com/en/d/
> Some of these might be useful to add into D.

I have already scanned them all. D is able to avoid or catch several of those already, and I have put some of them in Bugzilla.

D doesn't currently catch errors coming from bad usage of parallelism, or from bad usage of core.stdc functions (like strlen, printf, etc). Converting 32-bit D code to 64-bit has shown some bug patterns; some of them may be worth having the compiler look for.

I have another similar (bigger!) list for another good C lint. If you want, I will scan it too, looking for things D compilers could catch. But you have to take into account that it will later require some work to actually implement some of those tests in the compiler :-)

Bye,
bearophile
Jul 23 2011
Walter Bright <newshound2 digitalmars.com> writes:
On 7/23/2011 4:10 AM, bearophile wrote:
> D doesn't currently catch errors coming from [...]
> bad usage of core.stdc functions (like strlen, printf, etc).

This isn't likely to happen. D's mission isn't to try and fix usage of C functions.
Jul 23 2011
Timon Gehr <timon.gehr gmx.ch> writes:
Walter Bright wrote:
> On 7/23/2011 4:10 AM, bearophile wrote:
>> D doesn't currently catch errors coming from [...]
>> bad usage of core.stdc functions (like strlen, printf, etc).
> This isn't likely to happen. D's mission isn't to try and fix usage of C functions.

Note that currently, unsafe cstdio functions are often faster than Phobos stdio functions by a factor large enough to force people to use the C functions for IO-bound tasks. I don't know how many people on this newsgroup are affected by this fact. What is important to note is: this issue is a blocker for using D as a teaching language at universities.

std.stdio is completely compatible with cstdio functions, but that is both a benefit and a drawback:

- cstdio can be used at no extra cost, very nice if you need it.

- Phobos input functionality (mainly readf) is slowed down, since it cannot use an internal buffer.
-- This even applies when cstdio is not used at all!
-- A large part of the inefficiency of readf may be caused by the range abstraction for files: std.stdio.LockingTextReader.empty looks like a bottleneck.
-- C++ iostreams 'solves' it with ios::sync_with_stdio(false);

- Another, more fundamental issue is that D IO cannot be atomic. There is no way to implement a function that leaves the input untouched if it is invalid and is still compatible with cstdio.

Eg:
try a = read!int(); // oops, input is actually "abc"
catch(...){s = read!string();} // get malformed input

Currently, in case of ill-formed input, formattedRead leaves the InputRange (which is real file input in the case of readf) at whatever position the error occurred, and I'm not sure if this is even specified anywhere. It is almost useless for error handling.

So, if D/Phobos basically forces usage of C functions, then its job would actually be to fix their usage. Otherwise, this is an open design issue.

Any thoughts on how to improve the current situation? I think Phobos should get _input_ right eventually. (And output too; what is the state of the toString issue?)

- Timon
Jul 24 2011
bearophile <bearophileHUGS lycos.com> writes:
Timon Gehr:

> Note that currently, unsafe cstdio functions are often faster than Phobos stdio
> functions by a factor large enough to force people to use the C functions for IO
> bound tasks.

I once had to use printf instead of writeln for speed. More often I use printf instead of writeln when I want to read the asm, because writeln produces much longer and messier asm output than printf.

Bye,
bearophile
Jul 24 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/24/11 11:17 AM, Timon Gehr wrote:
> Walter Bright wrote:
>> On 7/23/2011 4:10 AM, bearophile wrote:
>>> D doesn't currently catch errors coming from [...]
>>> bad usage of core.stdc functions (like strlen, printf, etc).
>> This isn't likely to happen. D's mission isn't to try and fix usage of C functions.
> Note that currently, unsafe cstdio functions are often faster than Phobos stdio
> functions by a factor large enough to force people to use the C functions for IO
> bound tasks.

Only a fraction of them.
> I don't know how many people on this newsgroup are affected by this fact. What is
> important to note is: This issue is a blocker for using D as a teaching language
> at universities.

Unless you're teaching heavy-duty data I/O, even considerably slower I/O speed should not affect using a language for teaching. I'd be curious to hear more detail (and which university are you (considering) teaching D at?), thanks.
> std.stdio is completely compatible with cstdio functions, but that is both a
> benefit and a drawback:
> - cstdio can be used at no extra cost, very nice if you need it.
>
> - Phobos input functionality (mainly readf) is slowed down, since it cannot use an
> internal buffer.

That is correct. I know how to fix that on all supported OSs, but never got around to it. So much to do, sigh.
> -- This even applies when cstdio is not used at all!

Yah, I think that's not surprising.
> -- a large part of the inefficiency of readf may be caused by the range
> abstraction for files: std.stdio.LockingTextReader.empty looks like a bottleneck.

This is because a range essentially exposes a one-element buffer explicitly (via front()). Ironically, C's stdlib _also_ has a range of one element, but that's exposed in a very inefficient and indirect way: you can call getc() to destructively fetch one character, but you can then put it back with ungetc() AND ungetc() is GUARANTEED to succeed at least once. This means FILE* does have a one-character buffer even for unbuffered streams, which can be easily seen by analyzing the implementation of various stdlibs. Some actually offer a private function peek() that lets code "see" the next character in the stream. That would help the range interface. Currently std.stdio.LockingTextReader.empty calls (a variant of) getc, stores the character, and then calls (a variant of) ungetc(). The "variant of" is the unlocked version, so the code is already unportable. It's also slow, and we can fix it to be faster in unportable ways.
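
The getc()/ungetc() trick described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the actual Phobos code: the names peek_char and stream_empty are hypothetical, and the sketch uses the ordinary locking getc() rather than the unlocked variants mentioned in the post.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: look at the next character without consuming it.
   Returns EOF when no more input is available. */
int peek_char(FILE *f)
{
    assert(f != NULL);
    int c = getc(f);        /* destructively fetch one character */
    if (c != EOF)
        ungetc(c, f);       /* push it back; one pushback is guaranteed */
    return c;
}

/* Range-style emptiness test built on the peek. Every call costs a
   getc/ungetc pair, which is why this pattern tends to be slow. */
int stream_empty(FILE *f)
{
    return peek_char(f) == EOF;
}
```

Calling stream_empty(f) before every read doubles the number of stdio calls per character, which is consistent with the bottleneck described for LockingTextReader.empty.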
> -- C++ iostreams 'solves' it with ios::sync_with_stdio(false);
>
> - Another more fundamental issue is that D IO cannot be atomic. There is no way to
> implement a function that leaves the input untouched if it is invalid, and still
> is compatible with cstdio.
>
> Eg:
> try a = read!int(); // oops, input is actually "abc"
> catch(...){s = read!string();} // get malformed input

That's a bug, read should leave unmatched characters in place.
> Currently in case of ill-formed input, formattedRead leaves the InputRange (which
> is real file input in the case of readf) in whatever position the error occurred,
> and I'm not sure if this is even specified anywhere. It is almost useless for
> error handling.

Agreed.
> So, if D/Phobos basically forces usage of C functions then its job would actually
> be to fix their usage. Otherwise, this is an open design issue.
>
> Any thoughts on how to improve the current situation? I think Phobos should get
> _input_ right eventually. (and output too, what is the state of the toString issue?)

I have all the knowledge (much of it shared above) but no time. If anyone would want to get on this, I'd be glad to answer detailed questions. Andrei
Jul 24 2011
Walter Bright <newshound2 digitalmars.com> writes:
On 7/24/2011 9:17 AM, Timon Gehr wrote:
> Walter Bright wrote:
>> On 7/23/2011 4:10 AM, bearophile wrote:
>>> D doesn't currently catch errors coming from [...]
>>> bad usage of core.stdc functions (like strlen, printf, etc).
>> This isn't likely to happen. D's mission isn't to try and fix usage of C functions.
> Note that currently, unsafe cstdio functions are often faster than Phobos stdio
> functions by a factor large enough to force people to use the C functions for IO
> bound tasks.

Which ones?
> I don't know how many people on this newsgroup are affected by this fact. What is
> important to note is: This issue is a blocker for using D as a teaching language
> at universities.

Why would using C functions be a blocker?
> std.stdio is completely compatible with cstdio functions, but that is both a
> benefit and a drawback:
> - cstdio can be used at no extra cost, very nice if you need it.
>
> - Phobos input functionality (mainly readf) is slowed down, since it cannot use an
> internal buffer.
> -- This even applies when cstdio is not used at all!
> -- a large part of the inefficiency of readf may be caused by the range
> abstraction for files: std.stdio.LockingTextReader.empty looks like a bottleneck.

To get high efficiency with I/O, you need to lock the stream at the highest level possible. If you are doing lock/readchar/unlock in a loop, you will get incredibly bad performance.
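
The difference between locking per character and locking once around the loop can be sketched in C like this. It is a minimal sketch assuming a POSIX system: flockfile, funlockfile and getc_unlocked are POSIX functions, not ISO C, and the two function names are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* flockfile/getc_unlocked are POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>

/* Slow pattern: each getc() call typically acquires and releases
   the stream lock on its own. */
long count_lines_locked_per_char(FILE *f)
{
    assert(f != NULL);
    long lines = 0;
    int c;
    while ((c = getc(f)) != EOF)
        if (c == '\n')
            ++lines;
    return lines;
}

/* Faster pattern: take the lock once at the highest level,
   then read with the unlocked variant inside the loop. */
long count_lines_locked_once(FILE *f)
{
    assert(f != NULL);
    long lines = 0;
    int c;
    flockfile(f);
    while ((c = getc_unlocked(f)) != EOF)
        if (c == '\n')
            ++lines;
    funlockfile(f);
    return lines;
}
```

Both functions return the same count; only the number of lock operations differs. How much that matters depends on the C library and on whether the program is multithreaded.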
> -- C++ iostreams 'solves' it with ios::sync_with_stdio(false);

C++ iostreams is very slow.
> - Another more fundamental issue is that D IO cannot be atomic. There is no way to
> implement a function that leaves the input untouched if it is invalid, and still
> is compatible with cstdio.
>
> Eg:
> try a = read!int(); // oops, input is actually "abc"
> catch(...){s = read!string();} // get malformed input

There's no way to back up cstdio either (beyond a single character).
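
One common workaround, sketched here in C under stated assumptions, is to read the input into a buffer first (for example a line obtained with fgets) and parse from the buffer, committing the read position only when the parse succeeds. The helper name try_read_int is hypothetical, and nothing here is taken from Phobos or from any proposal in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: try to parse a decimal int at *cursor.
   On success: store the value in *out, advance *cursor past it, return 1.
   On failure: leave *cursor and *out untouched and return 0. */
int try_read_int(const char **cursor, int *out)
{
    assert(cursor != NULL && *cursor != NULL && out != NULL);
    char *end;
    errno = 0;
    long v = strtol(*cursor, &end, 10);
    if (end == *cursor)                 /* no digits were consumed */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                       /* value does not fit in an int */
    *out = (int)v;
    *cursor = end;                      /* commit only on success */
    return 1;
}
```

With the buffer "abc 42", a failed try_read_int leaves the cursor at "abc", so the same bytes can then be read as a string. The cost is that the stream itself has already been consumed into the buffer, so other cstdio callers no longer see that data.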
> Currently in case of ill-formed input, formattedRead leaves the InputRange (which
> is real file input in the case of readf) in whatever position the error occurred,
> and I'm not sure if this is even specified anywhere. It is almost useless for
> error handling.
>
> So, if D/Phobos basically forces usage of C functions then its job would actually
> be to fix their usage. Otherwise, this is an open design issue.
>
> Any thoughts on how to improve the current situation? I think Phobos should get
> _input_ right eventually. (and output too, what is the state of the toString issue?)

We welcome any proposals for improvements.
Jul 24 2011
"Mike James" <foo bar.com> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:j0da2r$1cd6$1 digitalmars.com...
> I always have a side interest in program static analysis. This one is for
> C, is expensive (!), and kindly provided a list of what it looks for:
>
> http://www.viva64.com/en/d/
>
> Some of these might be useful to add into D.

With the work I do, I also have to use such tools. I use the ones from http://www.ldra.com/. They can be a pain sometimes - like programming with a critic on your shoulder :-) I know what you mean about the expense - luckily the company pays for them :-)
Jul 23 2011