digitalmars.D - [OT] Walter about compilers
- eles (11/11) Jan 22 2013 Hi everybody,
- Era Scarecrow (39/46) Jan 22 2013 It's been quoted that for every 10 lines of code there's a bug.
- Peter Alexander (4/8) Jan 22 2013 I love how >1kloc is "large" :D
- deadalnix (3/12) Jan 22 2013 It really depends if we are talking about java or not.
- Peter Alexander (5/19) Jan 22 2013 Not just Java. According to Wikipedia Debian 5 has over 300
- jerro (1/3) Jan 22 2013 It also consists of over 20000 packages. It is not one program.
- eles (3/7) Jan 22 2013 That means (at least) 100k bugs. Happy fixing!
- Simen Kjaeraas (6/13) Jan 22 2013 It's context dependent, of course. Finding all the bugs in 1kloc is doab...
- bearophile (6/8) Jan 22 2013 In debug mode that's the job of a modern well designed language,
- Era Scarecrow (15/21) Jan 22 2013 Agreed. However D (compilers) doesn't have an option to check
- Thiez (6/11) Jan 22 2013 Since D aims to emulate C in this aspect, overflow with uints is
- Era Scarecrow (38/50) Jan 22 2013 That merely shortens the size of the check, not where you need
- Walter Bright (4/5) Jan 22 2013 I've been doing some refactoring in dmd now and then. Every time I do, t...
- deadalnix (4/5) Jan 22 2013 It is said a lot. I'd like to see hard data on that one. I'd bet
- Philippe Sigaud (2/8) Jan 22 2013 With D, we aim for one bug every 14 lines of code :)
- eles (4/8) Jan 22 2013 Add to that the fact that programs in D tend to be shorter than
- Era Scarecrow (11/16) Jan 22 2013 Less boiler plate code, fewer direct pointers, no preprocessor
- eles (3/8) Jan 23 2013 Sigh... Only if it would go into that gcc suite... faster.
- Simen Kjaeraas (5/16) Jan 22 2013 Can do. Who wants to patch the compiler to automatically insert those bug...
- Don (6/12) Jan 23 2013 It definitely does.
- Era Scarecrow (21/25) Jan 23 2013 Bugs in code don't always live on one line per bug; They can
Hi everybody,

I was just reading this: http://www.laputan.org/metamorphosis/metamorphosis.html#SoftwareTectonics (a thing about software architectures).

The text opens with...: "We like it when people always want more! Otherwise, we'd be out of the upgrade business. Sometimes, people ask me what I will do when the compiler is done. Done? No software program that is selling is ever done! -- Walter Bright, C++ compiler architect"

So... the question is: does that quote also apply to dmd? :)
Jan 22 2013
On Tuesday, 22 January 2013 at 13:54:08 UTC, eles wrote:
> The text opens with...: "We like it when people always want more! Otherwise, we'd be out of the upgrade business. Sometimes, people ask me what I will do when the compiler is done. Done? No software program that is selling is ever done! -- Walter Bright, C++ compiler architect"
>
> So... the question is: does that quote also apply to dmd? :)

It's been quoted that for every 10 lines of code there's a bug. There are programs with tens of thousands of lines of code, so finding every bug is probably impossible for large programs (above 1000 lines). But that doesn't mean they can't run very, very well.

Many bugs come from unchecked work. Take addition, perhaps the simplest of operations: are you going to check after every little + that you didn't have an overflow? Without a lot of extra work, are you going to include checks that ensure they can't break?

C example:

    //code looks okay
    void* getMemory(int a, int b) {
        return malloc(a + b);
    }

    //becomes negative due to overflow. it can happen
    //probably returns NULL. I don't know..
    void* ptr = getMemory(0x7fffffff, 0x7fffffff);

    //overflow free version?
    void* getMemory(unsigned int a, unsigned int b) {
        //UINT_MAX comes from <limits.h>; widening both operands to
        //long long keeps the sum from wrapping before the comparison
        assert(((long long) a) + ((long long) b) <= UINT_MAX);
        return malloc(a + b);
    }

    //should assert now
    void* ptr = getMemory(UINT_MAX, UINT_MAX);

Part of the process is not only fixing bugs and improving the compiler; there are also new features that may be requested, which you find necessary even though you never needed them before you thought about them.

Consider: a recent project of mine, which I hadn't updated in over a year and a half, turned out to have a bug in how it handled a certain feature. When it was brought up, I needed to add about 10 lines of code to handle it; then I found a bug within those 10 lines (after it was working).

With that in mind, it's likely no program will ever be 'done', but if it does the job well enough then it's probably good enough. So to answer the question: yes, it probably applies to dmd.
Jan 22 2013
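As a minimal sketch of the overflow-free allocation idea in the post above, the check can also be written without widening to a larger type, by testing the operands before they are added. This assumes the sizes are passed as size_t; the name get_memory_checked is hypothetical and does not come from the thread.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: refuses to allocate when a + b would wrap around
   instead of silently requesting a much smaller block. */
void *get_memory_checked(size_t a, size_t b)
{
    /* a + b > SIZE_MAX  is equivalent to  a > SIZE_MAX - b,
       and the right-hand side cannot itself overflow. */
    if (a > SIZE_MAX - b) {
        return NULL;
    }
    return malloc(a + b);
}
```

A caller treats the NULL from an overflowing request the same way as an ordinary allocation failure, so no extra error path is needed.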
On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> It's been quoted that for every 10 lines of code there's a bug. There are programs with tens of thousands of lines of code, so finding every bug is probably impossible for large programs (above 1000 lines).

I love how >1kloc is "large" :D

I'd say anything under 100kloc is a small program. 100kloc-1mloc medium, and >1mloc large.
Jan 22 2013
On Tuesday, 22 January 2013 at 14:59:48 UTC, Peter Alexander wrote:
> On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> > It's been quoted that for every 10 lines of code there's a bug. There are programs with tens of thousands of lines of code, so finding every bug is probably impossible for large programs (above 1000 lines).
> I love how >1kloc is "large" :D I'd say anything under 100kloc is a small program. 100kloc-1mloc medium, and >1mloc large.

It really depends on whether we are talking about Java or not.
Jan 22 2013
On Tuesday, 22 January 2013 at 15:26:28 UTC, deadalnix wrote:
> On Tuesday, 22 January 2013 at 14:59:48 UTC, Peter Alexander wrote:
> > On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> > > It's been quoted that for every 10 lines of code there's a bug. There are programs with tens of thousands of lines of code, so finding every bug is probably impossible for large programs (above 1000 lines).
> > I love how >1kloc is "large" :D I'd say anything under 100kloc is a small program. 100kloc-1mloc medium, and >1mloc large.
> It really depends on whether we are talking about Java or not.

Not just Java. According to Wikipedia, Debian 5 has over 300 million lines of code.

http://en.wikipedia.org/wiki/Source_lines_of_code

Last time I counted, Phobos has ~200kloc.
Jan 22 2013
> Not just Java. According to Wikipedia, Debian 5 has over 300 million lines of code.

It also consists of over 20000 packages. It is not one program.
Jan 22 2013
On Tuesday, 22 January 2013 at 14:59:48 UTC, Peter Alexander wrote:
> On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
>
> I'd say anything under 100kloc is a small program. 100kloc-1mloc medium, and >1mloc large.

That means (at least) 100k bugs. Happy fixing!
Jan 22 2013
On 2013-01-22, 15:59, Peter Alexander wrote:
> On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> > It's been quoted that for every 10 lines of code there's a bug. There are programs with tens of thousands of lines of code, so finding every bug is probably impossible for large programs (above 1000 lines).
> I love how >1kloc is "large" :D I'd say anything under 100kloc is a small program. 100kloc-1mloc medium, and >1mloc large.

It's context dependent, of course. Finding all the bugs in 1kloc is doable, but lots of work. Finding all the bugs in 10kloc, conceivably doable, but unlikely to be worth it. For >= 100kloc? Ouch.

--
Simen
Jan 22 2013
Era Scarecrow:
> Are you going to check after every little + that you didn't have an overflow?

In debug mode that's the job of a modern well designed language, just like checking an index is inside the bounds of an array every time you perform an array access.

Bye,
bearophile
Jan 22 2013
On Tuesday, 22 January 2013 at 15:11:41 UTC, bearophile wrote:
> Era Scarecrow:
> > Are you going to check after every little + that you didn't have an overflow?
> In debug mode that's the job of a modern well designed language, just like checking an index is inside the bounds of an array every time you perform an array access.

Agreed. However, D (compilers) doesn't have an option to check those. I think it was requested, but Walter said no (due to slower speed, I think). Therefore, if the compiler won't do it for you, you have to do it yourself.

I really wouldn't want to have to use BigInt for everything that can't overflow and then check to make sure I can fit it in my smaller variables afterwards along with the extra move. I wouldn't want to use BigInts everywhere, and longs aren't needed everywhere either.

Of course, if an attribute were added that checked just those functions for important overflows, then it could help, but in truth it kinda clutters the signatures with something that isn't an important attribute. Guess 'CheckedInt' could work in those cases, but that's more during runtime and release rather than debugging.
Jan 22 2013
On Tuesday, 22 January 2013 at 16:31:20 UTC, Era Scarecrow wrote:
> I really wouldn't want to have to use BigInt for everything that can't overflow and then check to make sure I can fit it in my smaller variables afterwards along with the extra move. I wouldn't want to use BigInts everywhere, and longs aren't needed everywhere either.

Since D aims to emulate C in this aspect, overflow with uints is probably defined as a wrap-around (like C). In this case it seems to me the check for overflow would simply be '(a+b)<a', no need to cast to longs and BigInts and all that. Of course this may not apply to signed ints...
Jan 22 2013
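The '(a+b)<a' test from the post above can be sketched in C as follows, together with a signed variant. This is only an illustration: the function names are hypothetical, and the signed version checks before adding because signed overflow is undefined behavior in C, so an after-the-fact comparison is not reliable there.

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned: wrap-around is well defined, so the result can be tested
   after the addition. Returns false if the sum wrapped. */
bool add_uint_checked(unsigned int a, unsigned int b, unsigned int *sum)
{
    *sum = a + b;
    return *sum >= a;   /* equivalent to !((a + b) < a) */
}

/* Signed: the operands are tested before adding, so the overflowing
   addition is never executed. Returns false if a + b would not fit. */
bool add_int_checked(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *sum = a + b;
    return true;
}
```

Both helpers still have to be called at every addition that matters, which is the placement problem raised elsewhere in the thread; they only make each individual check short.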
On Tuesday, 22 January 2013 at 17:10:35 UTC, Thiez wrote:
> On Tuesday, 22 January 2013 at 16:31:20 UTC, Era Scarecrow wrote:
> > I really wouldn't want to have to use BigInt for everything that can't overflow and then check to make sure I can fit it in my smaller variables afterwards along with the extra move. I wouldn't want to use BigInts everywhere, and longs aren't needed everywhere either.
> Since D aims to emulate C in this aspect, overflow with uints is probably defined as a wrap-around (like C). In this case it seems to me the check for overflow would simply be '(a+b)<a', no need to cast to longs and BigInts and all that. Of course this may not apply to signed ints...

That merely shortens the size of the check, not where you need to place the checks or how often. Truthfully, in almost all cases the wrap-around or overflow/underflow is an error, usually unchecked. If 1 million were the max, then 1,000,000 + 1 should equal 1,000,001 and not <=0, and if 0 is the minimum, 0 - 1 should not equal >=0.

The only real time I can find overflow wanted is while making something that watches for it explicitly to make use of it. Say we emulate or write the 'ucent' types. That could be done as:

    //addition example obviously
    uint[4] add(const uint[4] lhs, const uint[4] rhs) {
        uint[4] val;
        bool carry = false;
        foreach(i, ref v; val) {
            uint tmp = lhs[i];
            v = lhs[i] + rhs[i] + (carry ? 1 : 0);
            //with a carry coming in, ending up equal to lhs[i] also means a wrap
            carry = carry ? v <= tmp : v < tmp;
        }
        assert(!carry); //could fail. How to handle this? Ignore?
        return val;
    }

Now let's say there's a for loop where someone decides to be clever and use a ubyte (unsigned char) as an index or counter.

    for(ubyte i = 0; i < 1000; i++) {
        writeln(i);
    }

The overflow is an error because the wrong type was selected, but that doesn't change the obvious logic behind it. You can hide the type behind an alias or similar, but that doesn't change the fact it's a bug, and it would be easier to detect if we were made aware the overflow is happening at all, rather than having the program get stuck and having to manually kill the process or step through it in a debugger. If the loop weren't producing output you could recognize, it would be much harder to find.

Encryption may make use of the overflow/wrap around, but far more likely they use xor or binary operations which don't have those problems.
Jan 22 2013
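For comparison, here is a sketch in C of the same multi-word addition, where wrap-around is used on purpose. It assumes the value is stored as four 32-bit words, least significant word first, and it does each word's addition in a 64-bit intermediate so the carry out can be read directly from bit 32. The name add128 and the choice to return the final carry to the caller are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds two 128-bit values stored as four 32-bit words (least significant
   word first) into out. Returns true if the sum did not fit in 128 bits. */
bool add128(const uint32_t lhs[4], const uint32_t rhs[4], uint32_t out[4])
{
    bool carry = false;
    for (int i = 0; i < 4; i++) {
        /* The 64-bit sum of two 32-bit words plus a carry cannot overflow. */
        uint64_t wide = (uint64_t)lhs[i] + rhs[i] + (carry ? 1u : 0u);
        out[i] = (uint32_t)wide;      /* low 32 bits are the result word */
        carry = (wide >> 32) != 0;    /* bit 32 is the carry into the next word */
    }
    return carry;
}
```

Returning the carry instead of asserting leaves the decision (ignore, saturate, report an error) to the caller, which is one possible answer to the "how to handle this?" question in the post.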
On 1/22/2013 6:44 AM, Era Scarecrow wrote:
> It's been quoted that for every 10 lines of code there's a bug.

I've been doing some refactoring in dmd now and then. Every time I do, the process exposes latent bugs. On the one hand, that's discouraging; on the other hand, I think it shows the value in refactoring into a better design.
Jan 22 2013
On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> It's been quoted that for every 10 lines of code there's a bug.

It is said a lot. I'd like to see hard data on that one. I'd bet that it varies greatly from one programmer to another, and probably from one language to another.
Jan 22 2013
On Wed, Jan 23, 2013 at 5:56 AM, deadalnix <deadalnix gmail.com> wrote:
> On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> > It's been quoted that for every 10 lines of code there's a bug.
> It is said a lot. I'd like to see hard data on that one. I'd bet that it varies greatly from one programmer to another, and probably from one language to another.

With D, we aim for one bug every 14 lines of code :)
Jan 22 2013
On Wednesday, 23 January 2013 at 06:22:55 UTC, Philippe Sigaud wrote:
> On Wed, Jan 23, 2013 at 5:56 AM, deadalnix <deadalnix gmail.com> wrote:
> > On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow
> With D, we aim for one bug every 14 lines of code :)

Add to that the fact that programs in D tend to be shorter than their C or even C++ equivalents!
Jan 22 2013
On Wednesday, 23 January 2013 at 07:33:22 UTC, eles wrote:
> On Wednesday, 23 January 2013 at 06:22:55 UTC, Philippe Sigaud wrote:
> > With D, we aim for one bug every 14 lines of code :)
> Add to that the fact that programs in D tend to be shorter than their C or even C++ equivalents!

Less boilerplate code, fewer direct pointers, no preprocessor macros. Code that might be ambiguous because of operator precedence forces (or sternly warns) you to use parentheses to state what you intend, rather than relying on a set of long complex rules. Templates are easier to make and use (so you need fewer of them). No header file(s) (and all the duplication or annoying separation that comes with them). Assignment in certain locations is illegal. Oh yes, no ugly STL, and a lot more.

Plenty of stuff that simplifies a whole lot of stuff. D is indeed the language I always wanted :)
Jan 22 2013
On Wednesday, 23 January 2013 at 07:57:38 UTC, Era Scarecrow wrote:
> On Wednesday, 23 January 2013 at 07:33:22 UTC, eles wrote:
> > On Wednesday, 23 January 2013 at 06:22:55 UTC, Philippe Sigaud wrote:
> Plenty of stuff that simplifies a whole lot of stuff. D is indeed the language I always wanted :)

Sigh... If only it would go into that gcc suite... faster.
Jan 23 2013
On 2013-16-23 07:01, Philippe Sigaud <philippe.sigaud gmail.com> wrote:
> On Wed, Jan 23, 2013 at 5:56 AM, deadalnix <deadalnix gmail.com> wrote:
> > On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> > > It's been quoted that for every 10 lines of code there's a bug.
> > It is said a lot. I'd like to see hard data on that one. I'd bet that it varies greatly from one programmer to another, and probably from one language to another.
> With D, we aim for one bug every 14 lines of code :)

Can do. Who wants to patch the compiler to automatically insert those bugs? :p

--
Simen
Jan 22 2013
On Wednesday, 23 January 2013 at 04:56:11 UTC, deadalnix wrote:
> On Tuesday, 22 January 2013 at 14:44:26 UTC, Era Scarecrow wrote:
> > It's been quoted that for every 10 lines of code there's a bug.
> It is said a lot. I'd like to see hard data on that one. I'd bet that it varies greatly from one programmer to another, and probably from one language to another.

It definitely does.

"There has been no error reported in TeX since 1994 or 1995" -- Knuth, 2002.

There were 7 bugs in TeX reported between 1982 and 1995. TeX has a lot more than 70 lines of code :-)
Jan 23 2013
On Wednesday, 23 January 2013 at 09:46:47 UTC, Don wrote:
> "There has been no error reported in TeX since 1994 or 1995" -- Knuth, 2002. There were 7 bugs in TeX reported between 1982 and 1995. TeX has a lot more than 70 lines of code :-)

Bugs in code don't always live on one line per bug; they can span multiple lines very easily. Some bugs are simply missing logic, untested cases, or variables with no default values. If you modify the index of a while loop at the wrong spot, you need to move that modification, so the bug spans at least two lines.

Some bugs are known but for the most part ignored, like memory management for very tiny programs. Many error values returned by the OS & errno are ignored, but don't usually have any catastrophic effects.

Some bugs are the effect of using a macro which expands. Logically it makes sense, but the macro makes it unstable at best, while an actual function wouldn't have a bug.

    #define min(a,b) ((a)>(b) ? (b) : (a))

    int a=1,b=2,c;
    c = min(a++, b++); //minimum of both a or b, and increase each once

    //will any of these pass?
    assert(c == 1);
    assert(a == 2);
    assert(b == 3);
Jan 23 2013
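To make the macro question above concrete, here is a small C sketch that runs the same expression through the macro and through an ordinary function. The names MIN_MACRO, min_int, macro_case and function_case are hypothetical and exist only for this illustration. The macro pastes each argument expression into the expansion twice, so the increment attached to the selected argument runs twice; the function evaluates each argument exactly once before the comparison.

```c
/* Same shape as the macro in the post: the selected argument expression
   appears twice in the expansion, so its side effects happen twice. */
#define MIN_MACRO(a, b) ((a) > (b) ? (b) : (a))

/* A real function receives already-evaluated values. */
static inline int min_int(int a, int b)
{
    return a > b ? b : a;
}

/* Applies the macro to post-incremented operands. */
void macro_case(int *a, int *b, int *c)
{
    *c = MIN_MACRO((*a)++, (*b)++);
}

/* Applies the function to post-incremented operands. */
void function_case(int *a, int *b, int *c)
{
    *c = min_int((*a)++, (*b)++);
}
```

Starting from a = 1 and b = 2, macro_case leaves c == 2, a == 3 and b == 3, so of the three asserts in the post only the one on b holds. function_case leaves c == 1, a == 2 and b == 3, and all three hold.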