digitalmars.D - int nan
- bearophile (6/7) Jun 26 2009 The following comes partially from a friend of mine. If you are busy you...
- dsimcha (12/19) Jun 26 2009 uses 0 instead. It doesn't have the advantages of error detection that N...
- Nick Sabalausky (13/27) Jun 26 2009 Interesting idea, but IMO using NaN as a default initializer is just a
- bearophile (5/11) Jun 27 2009 The good thing of using int.min (and short.min, etc) is that then the nu...
- Nick Sabalausky (6/20) Jun 27 2009 Yes, I know. I only said that "default initing to nan" was a sub-optimal...
- grauzone (7/8) Jun 27 2009 You're saying C# nullable ints require more memory than native ints, but...
- bearophile (5/7) Jun 28 2009 Have you read my posts? I have said to use the value that currently is i...
- grauzone (17/18) Jun 28 2009 That wasn't very explicit. Anyway, we need int.min for, you know, doing
- bearophile (8/17) Jun 28 2009 It's not a random value, is a specific one, and it's an asymmetric extre...
- ponce (5/5) Jun 28 2009 I'm sorry but I think it would be an ugly feature.
- bearophile (7/10) Jun 28 2009 I agree that there are many situations where you want 2^32 different val...
- Jarrett Billingsley (2/4) Jun 28 2009 Let me know when x86 gets that.
- Nick Sabalausky (4/11) Jun 28 2009 Geez, it's a hypothetical discussion, for cryin out loud. Not everything...
- Frits van Bommel (10/19) Jun 29 2009 It's fine for Lisp because any Lisp I've ever seen auto-upgrades out-of-...
- bearophile (6/16) Jun 29 2009 Probably I have not expressed myself well in this part of my post, becau...
- BCS (4/8) Jun 27 2009 I think you can prove that it is impossible to do this totally correctly...
- Michiel Helvensteijn (23/33) Jun 27 2009 Complete static analysis of the flow of program control is the holy grai...
- superdan (2/19) Jun 27 2009 extremely complicated? it's machine haltin' dood.
- Michiel Helvensteijn (8/14) Jun 27 2009 Ok, since 'complete static analysis' may include undecidable problems su...
- Walter Bright (3/7) Jun 27 2009 I believe this is what valgrind does by instrumenting each variable at
- Denis Koroskin (4/13) Jun 27 2009 This code doesn't compile in C# and fails with the following error at
- Michiel Helvensteijn (8/17) Jun 27 2009 Ah, so C# is overly conservative. That's another option, of course.
- Nick Sabalausky (39/54) Jun 27 2009 Yes, this approach is what I was getting at. In fact, I would (and alrea...
- Nick Sabalausky (7/65) Jun 27 2009 Additionally, in the C# approach (and this is speaking from personal
- Michiel Helvensteijn (28/71) Jun 28 2009 Better than a flipflop between "runs correctly" and "runs incorrectly",
- Simen Kjaeraas (4/5) Jun 28 2009 While the ugliness of it is that it's both.
- Michiel Helvensteijn (4/7) Jun 28 2009 Care to elaborate?
- Simen Kjaeraas (14/19) Jun 28 2009 As has already been mentioned, one of the biggest problems with the holy
- Michiel Helvensteijn (25/43) Jun 28 2009 The modularity thing is a good point. I assume you're talking about
- Ary Borenszweig (5/86) Jun 28 2009 Of course not:
- Michiel Helvensteijn (10/22) Jun 28 2009 My mistake. For some reason I was assuming 'foo' was pure.
- Nick Sabalausky (29/92) Jun 28 2009 The fix would be one of the following, depending on what the code is
- Nick Sabalausky (30/71) Jun 28 2009 I would also be perfectly ok with this compiling:
- Michiel Helvensteijn (12/16) Jun 28 2009 Ah, we are starting to agree. :-)
- Michiel Helvensteijn (29/59) Jun 28 2009 Keep in mind that we're talking about a situation in which we're sure 'i...
- Nick Sabalausky (31/88) Jun 28 2009 It's a situation where we're *initially* sure 'i' will always be set. Bu...
- Michiel Helvensteijn (20/34) Jun 28 2009 I mean that if the programmer has provided a postcondition of the outer
- BCS (3/9) Jun 28 2009 How about letting the user signal that they know what they are doing by ...
- BCS (5/22) Jun 27 2009 Yes, trying to solve the problem for all cases won't work, but I think t...
- BCS (4/26) Jun 27 2009 And if foo() is never <=0 then the error is valid, but incorrect. I like...
The following comes partially from a friend of mine. If you are busy you can skip this post of musings.

From the docs: http://www.digitalmars.com/d/1.0/faq.html#nan

>Because of the way CPUs are designed, there is no NaN value for integers, so D uses 0 instead. It doesn't have the advantages of error detection that NaN has, but at least errors resulting from unintended default initializations will be consistent and therefore more debuggable.<

Seeing how abs(int.min) gives problems, and seeing how CPUs manage nans of FPs efficiently enough, it can be nice for int.min to become the nan of integers (and similar for short, long, and maybe tiny too). Such nan may also be useful [...]

Bye,
bearophile
Jun 26 2009
== Quote from bearophile (bearophileHUGS lycos.com)'s article

>From the docs: http://www.digitalmars.com/d/1.0/faq.html#nan
>
>>Because of the way CPUs are designed, there is no NaN value for integers, so D uses 0 instead. It doesn't have the advantages of error detection that NaN has, but at least errors resulting from unintended default initializations will be consistent and therefore more debuggable.<
>
>Seeing how abs(int.min) gives problems, and seeing how CPUs manage nans of FPs efficiently enough, it can be nice for int.min to become the nan of integers (and similar for short, long, and maybe tiny too).

This is IMHO (at least at first glance) a reasonable idea in the very long run. However, it isn't practical here and now for D2, because NaN behavior is implemented partly in hardware, and mathematically undefined integer operations throw hardware exceptions instead of returning int.nan on current hardware.
Jun 26 2009
"bearophile" <bearophileHUGS lycos.com> wrote in message news:h237c9$orl$1 digitalmars.com...

>Seeing how abs(int.min) gives problems, and seeing how CPUs manage nans of FPs efficiently enough, it can be nice for int.min to become the nan of integers (and similar for short, long, and maybe tiny too).

Interesting idea, but IMO using NaN as a default initializer is just a crutch for not having a real system of compile-time detection of uninitialized variables (C#'s system for this works very well in my experience). Ie, default-initing to NaN is certainly better than default-initing to a commonly-used value, but it still isn't the right long-term solution.

Barring that "correct" solution though, I do think it would make far more sense for the default-initializer to be something that isn't so commonly used as 0. So yea, either int.min, or 0x69696969 or 0xB00BB00B, etc, ie something that will actually stand out and scream "Hey! Double-check this! It might not be right!"
Jun 26 2009
Nick Sabalausky:

>Ie, Default initing to NaN is certainly better than default-initing to a commonly-used value, but it still isn't the right long-term solution.

Having a nan has other purposes beside initialization values. You can represent nullable values in 4 bytes instead of 8, I think.

>So yea, either int.min, or 0x69696969 or 0xB00BB00B, etc, ie something that will actually stand out and scream "Hey! Double-check this! It might not be right!".

The good thing of using int.min (and short.min, etc) is that then the numbers become symmetric, you have a positive number for each negative one, and abs() works in all cases.

Bye,
bearophile
Jun 27 2009
"bearophile" <bearophileHUGS lycos.com> wrote in message news:h250ve$1dvr$1 digitalmars.com...

>Having a nan has other purposes beside initialization values. You can represent nullable values in 4 bytes instead of 8, I think.

Yes, I know. I only said that "default initing to nan" was a sub-optimal approach, not having nans. But I may have misunderstood you, I thought default init values was what you were talking about? Good point.

>The good thing of using int.min (and short.min, etc) is that then the numbers become symmetric, you have a positive number for each negative one, and abs() works in all cases.
Jun 27 2009
>Having a nan has other purposes beside initialization values. You can represent nullable values in 4 bytes instead of 8, I think.

But just how would you represent int.nan with 32 bits?

The correct solution would be to add nullable value types as additional types. It'd be nice if we could have non-nullable object references at the same time. But figuring out and agreeing on a concrete design seems to be too complicated, and D will never have it. "Stop dreaming."
Jun 27 2009
grauzone:

>just how would you represent int.nan with 32 bits?

Have you read my posts? I have said to use the value that currently is int.min as null, and I've explained why.

I'll keep dreaming some more years,
bye,
bearophile
Jun 28 2009
>Have you read my posts? I have said to use the value that currently is int.min as null, and I've explained why.

That wasn't very explicit. Anyway, we need int.min for, you know, doing useful stuff. We can't just define a quite random number to be a special value. Checking math operations for nullable integers would also be quite expensive (you had to check both operands before the operation).

If you realize nullable ints by making them a tuple of a native int and a bool signaling nan, for most operations you only need to OR the nan-bools of both operands, and store it in the result. At least I imagine that to be better, because you don't need additional jumps in the generated asm code. And this implementation choice clearly is superior, because it doesn't restrict the value range of the original type. There's no int value that the nullable int type can't represent. Now there's the space overhead, but if you need performance, you'd restrict yourself to hardware-supported operations anyway.

Although it's pointless to discuss implementation details of a feature that will never be implemented, what do you think? [...] int.nan. Overflows or illegal operations would just trigger exceptions.
Jun 28 2009
grauzone:

>That wasn't very explicit. Anyway, we need int.min for, you know, doing useful stuff.

Like for what? Have you used a Lisp? Their tagged integers show that a smaller range is fine. And I'm just talking about 1 value in 4 billions, I don't think you will miss it much. And it's a value that has no symmetric positive.

>We can't just define a quite random number to be a special value.<

It's not a random value, it's a specific one, and it's an asymmetric extremum too.

>Checking math operations for nullable integers would also be quite expensive (you had to check both operands before the operation).

I was talking about a hardware-managed nan of ints, shorts, longs, tinys. That's why I called the original post a post of musings.

>Although it's pointless to discuss implementation details of a feature that will never be implemented, what do you think?

Inventions sometimes come from dreams too :-)

>int.nan. Overflows or illegal operations would just trigger exceptions.

I'll do my best to have them in LDC (LLVM supports them already!), it's probably the only new feature I'll ask of the LDC developers. If necessary I may even create a personal version of LDC that has this single extra feature.

Bye,
bearophile
Jun 28 2009
I'm sorry but I think it would be an ugly feature.

What would be the NaN of uint? What if you actually need 2^32 different values (such as in a linear congruential random number generator)?

Besides, there would be no cheap way to ensure NaN propagation (no hardware support).

Cheers.
Jun 28 2009
ponce:

>What would be the NaN of uint ?

Having a NaN in just signed integral values (of 1, 2, 4, 8, 16 bytes) looks enough to me, see below.

>What if you actually need 2^32 different values (such as in a linear congruential random number generator) ?<

I agree that there are many situations where you want 2^32 different values, or 2^16, etc; in such situations you can use a utiny/ushort/uint/ulong/ucent that [...] But I think it's much less common to need 2^32 or 2^64 different signed integers.

>Besides, there would be no cheap way to ensure NaN propagation (no hardware support).<

I was talking about having hardware support, of course.

Bye,
bearophile
Jun 28 2009
On Sun, Jun 28, 2009 at 6:02 PM, bearophile <bearophileHUGS lycos.com> wrote:

>>Besides, there would be no cheap way to ensure NaN propagation (no hardware support).<
>
>I was talking about having hardware support, of course.

Let me know when x86 gets that.
Jun 28 2009
"Jarrett Billingsley" <jarrett.billingsley gmail.com> wrote in message news:mailman.315.1246226874.13405.digitalmars-d puremagic.com...

>>I was talking about having hardware support, of course.
>
>Let me know when x86 gets that.

Geez, it's a hypothetical discussion, for cryin out loud. Not everything has to be immediately feasible to be worthy of debate.
Jun 28 2009
bearophile wrote:

>Like for what? Have you used a Lisp? Their tagged integers show that a smaller range is fine. And I'm just talking about 1 value in 4 billions, I don't think you will miss it much.

It's fine for Lisp because any Lisp I've ever seen auto-upgrades out-of-range integers to (heap-allocated) bigints.

>I'll do my best to have them in LDC (LLVM supports them already!), it's probably the only new feature I'll ask of the LDC developers.

I'd like to point out you don't need a new built-in type (or changes to an existing one) to use those LLVM intrinsics with LDC. Just import ldc.intrinsics, define a struct MyInt and overload operators on it using llvm_sadd_with_overflow and friends. That doesn't work for external libraries of course, but those should be free to handle overflow situations and undefined operations however they want without having to worry about int.nan...
Jun 29 2009
Frits van Bommel:

>It's fine for Lisp because any Lisp I've ever seen auto-upgrades out-of-range integers to (heap-allocated) bigints.

I think it can be fine even if you have just fixnums with that single value missing from signed integrals.

>I'd like to point out you don't need a new built-in type (or changes to an existing one) to use those LLVM intrinsics with LDC.

Probably I have not expressed myself well in this part of my post, because here I was not talking about a new int type or about int nans. I was talking about int overflows. I'll explain better in #ldc.

Bye,
bearophile
Jun 29 2009
Hello Nick,

>Interesting idea, but IMO using NaN as a default initializer is just a crutch for not having a real system of compile-time detection of uninitialized variables (C#'s system for this works very well in my experience).

I think you can prove that it is impossible to do this totally correctly:

int i;
for(int j = foo(); j > 0; j--)
    i = bar(j);
// what if foo() returns -5?
Jun 27 2009
BCS wrote:

>I think you can prove that it is impossible to do this totally correctly:
>
>int i;
>for(int j = foo(); j > 0; j--)
>    i = bar(j);
>// what if foo() returns -5?

Complete static analysis of the flow of program control is the holy grail of compiler construction. It would allow automatic proof of many program properties (such as initialization). It may not be impossible, but it is extremely complicated.

If nothing is known about the post-condition of 'foo', the sensible conclusion would be that 'i' may not be initialized after the loop. If you know that the return value of 'foo' is always positive under the given conditions, then you know otherwise. In the general case, however, you can't guarantee correct static analysis. This leaves a language/compiler with two options, I believe:

* Do nothing about it. Let the programmer use int.min or set a bool to test initialization at runtime.
* Add 'uninitialized' to the set of possible states of each type. Every time a variable is read, assert that it is initialized first. Use the static analysis techniques that *are* available (a set that will continue to grow) to eliminate these tests (and the extended state) where possible.

The first method has the advantage of simplicity for the compiler and better runtime performance in most cases. The second method has the advantage of automatic detection of subtle bugs and more simplicity for the programmer.

--
Michiel Helvensteijn
Jun 27 2009
Michiel Helvensteijn Wrote:

>Complete static analysis of the flow of program control is the holy grail of compiler construction. It would allow automatic proof of many program properties (such as initialization). It may not be impossible, but it is extremely complicated.

extremely complicated? it's machine haltin' dood.
Jun 27 2009
superdan wrote:

>extremely complicated? it's machine haltin' dood.

Ok, since 'complete static analysis' may include undecidable problems such as halting, I agree that in the general case, it's impossible. However, in many practical cases, it may not be. Additionally, the burden of providing loop invariants and ranking functions (to prove termination) could be given to the programmer instead of the compiler.

--
Michiel Helvensteijn
Jun 27 2009
Michiel Helvensteijn wrote:

>* Add 'uninitialized' to the set of possible states of each type. Every time a variable is read, assert that it is initialized first. Use the static analysis techniques that *are* available (a set that will continue to grow) to eliminate these tests (and the extended state) where possible.

I believe this is what valgrind does by instrumenting each variable at runtime.
Jun 27 2009
On Sat, 27 Jun 2009 17:50:11 +0400, BCS <none anon.com> wrote:

>I think you can prove that it is impossible to do this totally correctly:
>
>int i;
>for(int j = foo(); j > 0; j--)
>    i = bar(j);
>// what if foo() returns -5?

This code doesn't compile in C# and fails with the following error at the first attempt to use 'i':

error CS0165: Use of unassigned local variable 'i'
Jun 27 2009
Denis Koroskin wrote:

>>int i;
>>for(int j = foo(); j > 0; j--)
>>    i = bar(j);
>>// what if foo() returns -5?
>
>This code doesn't compile in C# and fails with the following error at the first attempt to use 'i': error CS0165: Use of unassigned local variable 'i'

Ah, so C# is overly conservative. That's another option, of course. It has the advantage of always knowing at compile time that you're not reading an uninitialized variable. But it may throw out the baby with the bath water. The example program may be perfectly valid if 'foo' always returns positive.

--
Michiel Helvensteijn
Jun 27 2009
"Michiel Helvensteijn" <m.helvensteijn.remove gmail.com> wrote in message news:h25fbk$28mg$1 digitalmars.com...

>Ah, so C# is overly conservative. That's another option, of course. It has the advantage of always knowing at compile time that you're not reading an uninitialized variable. But it may throw out the baby with the bath water. The example program may be perfectly valid if 'foo' always returns positive.

Yes, this approach is what I was getting at. In fact, I would (and already have in the past) argue that this is *better* than the "holy grail" approach, because it's based on very simple and easy to remember rules. Conversely, the "holy grail" approach leads to difficult-to-predict cases of small, seemingly-innocent changes in one place causing some other code to suddenly switch back and forth between "compiles" and "doesn't compile". Take this modified version of your example:

------------
// Imagine foo resides in a completely different package
int foo() { return 5; }

int i;
for(int j = foo(); j > 3; j--)
    i = j;
auto k = i; // Compiles at the moment...
------------

Now make a perfectly acceptable-looking change to foo:

------------
int foo() { return 2; }
------------

And all of a sudden non-local code starts flip-flopping between "compiles" and "doesn't compile".

Additionally, even the "holy grail" approach still has to reduce itself to being overly conservative in certain cases anyway:

------------
int foo()
{
    auto rnd = new RandomGenerator();
    rnd.seed(systemClock);
    return rnd.fromRange(1,10);
}
------------

So, we only have two initial choices:
- Overly permissive (current D approach)
- Overly conservative

And if we choose "overly conservative", then our next choice is:
- Overly conservative with simple, predictable rules (the C# approach)
- Overly conservative with complex rules that have seemingly-random non-localized effects ("holy grail")
Jun 27 2009
"Nick Sabalausky" <a a.a> wrote in message news:h2623m$73u$1 digitalmars.com...

><snip: full text of my previous post>

Additionally, in the C# approach (and this is speaking from personal experience), anytime you do come across a provably-correct case that the compiler rejects, not only is it always obvious to see why the compiler rejected it, but it's also trivially easy to fix. So in practice, it's really not much of a "baby with the bathwater" situation at all.
Jun 27 2009
Nick Sabalausky wrote:

>Conversely, the "holy grail" approach leads to difficult-to-predict cases of small, seemingly-innocent changes in one place causing some other code to suddenly switch back and forth between "compiles" and "doesn't compile".

Better than a flipflop between "runs correctly" and "runs incorrectly", wouldn't you agree? But of course, you're arguing on the other end of the spectrum. Read on.

>Additionally, even the "holy grail" approach still has to reduce itself to being overly conservative in certain cases anyway:
>
>------------
>int foo()
>{
>    auto rnd = new RandomGenerator();
>    rnd.seed(systemClock);
>    return rnd.fromRange(1,10);
>}
>------------

I wouldn't call the "holy grail" overly conservative in this instance. The post-condition of 'foo' would simply be (1 <= returnValue <= 10). With no more information than that, the compiler would have to give an error, since 'foo' *may return a value* that results in an uninitialized read of 'i'. That's how it should work. No errors if and only if there is no possible execution path that results in failure, be it uninitialized-read failure, null-dereference failure or divide-by-zero failure.

I tend to agree with BCS that the programmer should have the last say, unless the compiler can absolutely prove that (s)he is wrong.

>So, we only have two initial choices:
>- Overly permissive (current D approach)

Given the choice between overly conservative and overly permissive, I would pick overly permissive. But the beauty of the holy grail is that it's neither.

>anytime you do come across a provably-correct case that the compiler rejects, not only is it always obvious to see why the compiler rejected it, but it's also trivially easy to fix. So in practice, it's really not much of a "baby with the bathwater" situation at all.

But what would the fix be in the case of our example? Surely you're not suggesting initializing 'i' to 0? Then we'd be back in the old situation where we might get unexpected runtime behavior if we were wrong about 'foo'. An acceptable solution would be:

int i;
assert(foo() > 3);
for(int j = foo(); j > 3; j--)
    i = j;
auto k = i; // Compiles at the moment...

--
Michiel Helvensteijn
Jun 28 2009
Michiel Helvensteijn wrote:

>But the beauty of the holy grail is that it's neither.

While the ugliness of it is that it's both.

--
Simen
Jun 28 2009
Simen Kjaeraas wrote:

>>But the beauty of the holy grail is that it's neither.
>
>While the ugliness of it is that it's both.

Care to elaborate?

--
Michiel Helvensteijn
Jun 28 2009
Michiel Helvensteijn wrote:

>>While the ugliness of it is that it's both.
>
>Care to elaborate?

As has already been mentioned, one of the biggest problems with the holy grail is that it leads to capricious states of "possibly compilable". There are also bunches of examples in which it will not be able to deduce whether it should compile or not, at least not without breaking modularity, and even then, functions called from outside sources (dlls, SOs, OS functions, compiled libraries, etc) will break the system. This means the system has to be either permissive or conservative when encountering a problem insoluble to its logic, and this fall-back mechanism will then work counter-intuitively to its normal working order, thus giving birth to the system's dualism of both conservativeness and permissiveness.

--
Simen
Jun 28 2009
Simen Kjaeraas wrote:

>As has already been mentioned, one of the biggest problems with the holy grail is that it leads to capricious states of "possibly compilable". There are also bunches of examples in which it will not be able to deduce whether it should compile or not, at least not without breaking modularity,

The modularity thing is a good point. I assume you're talking about encapsulation. The designer of a function should make its definition public. The stuff it requires and the stuff it guarantees. The stuff it requires can be of the form of a logical precondition. The stuff it guarantees could be, at the choice of the designer, the function body itself or a logical postcondition (with access to the initial state of the function). The postcondition is used if you want to encapsulate the function implementation.

Remember that the definition should be known to the caller of the function anyway, or why would he/she call it? Often this is in the form of documentation, but ideally it would be in an assertion language the compiler can understand.

>and even then, functions called from outside sources (dlls, SOs, OS functions, compiled libraries, etc) will break the system.

You're right. If nothing is known about them, they must automatically receive the weakest possible postcondition: true. Pretty much anything can happen if you call them. However, it's acceptable for either the designers of those outside functions or other programmers to supply public contracts for them. The correctness of the code on the calling side would then be contingent upon the correctness of those contracts. An acceptable compromise.

>This means the system has to be either permissive or conservative when encountering a problem insoluble to its logic, and this fall-back mechanism will then work counter-intuitively to its normal working order, thus giving birth to the system's dualism of both conservativeness and permissiveness.

Well, it would still be either one or the other. Not both. Or perhaps I still don't understand your point. I do find this topic fascinating.

--
Michiel Helvensteijn
Jun 28 2009
Michiel Helvensteijn wrote:

><snip>
>An acceptable solution would be:
>
>int i;
>assert(foo() > 3);
>for(int j = foo(); j > 3; j--)
>    i = j;
>auto k = i; // Compiles at the moment...

Of course not:

int foo() { return rand() % 10; }
Jun 28 2009
Ary Borenszweig wrote:My mistake. For some reason I was assuming 'foo' was pure. int i; int j = foo(); assert(j > 3); for(; j > 3; j--) i = j; auto k = i; -- Michiel Helvensteijnint i; assert(foo() > 3); for(int j = foo(); j > 3; j--) i = j; auto k = i; // Compiles at the moment...Of course not: int foo() { return rand() % 10; }
Jun 28 2009
"Michiel Helvensteijn" <m.helvensteijn.remove gmail.com> wrote in message news:h2810s$hl1$1 digitalmars.com...Nick Sabalausky wrote:The fix would be one of the following, depending on what the code is actually doing: --------------- // Instead of knee-jerking i to 0, we default init it to // whatever safe value we want it to be if the loop // doesn't set it. This, of course, may or may not // be zero, depending on the code, but regardless, // there are times when this IS perfectly safe. int i = contextDependentInitVal; for(int j = foo(); j > 3; j--) i = j; auto k = i; --------------- --------------- int i; bool isSet = false; // making i nullable would be better for(int j = foo(); j > 3; j--) { i = j; isSet = true; } if(isSet) { auto k = i; } else { /* handle the problem */ } --------------- Also, keep in mind that while, under this mechanism, it is certainly possible for a coder to cause bugs by always knee-jerking the value to zero whenever the compiler complains, that's also a possibility under the "holy grail" approach.Better than a flip-flop between "runs correctly" and "runs incorrectly", wouldn't you agree? But of course, you're arguing on the other end of the spectrum. Read on.Yes, this approach is what I was getting at. In fact, I would (and already have in the past) argue that this is *better* than the "holy grail" approach, because it's based on very simple and easy-to-remember rules. Conversely, the "holy grail" approach leads to difficult-to-predict cases of small, seemingly-innocent changes in one place causing some other code to suddenly switch back and forth between "compiles" and "doesn't compile". Take this modified version of your example: ------------ // Imagine foo resides in a completely different package int foo() { return 5; } int i; for(int j = foo(); j > 3; j--) i = j; auto k = i; // Compiles at the moment... 
------------ Now make a perfectly acceptable-looking change to foo: ------------ int foo() { return 2; } ------------ And all of a sudden non-local code starts flip-flopping between "compiles" and "doesn't compile".I wouldn't call the "holy grail" overly conservative in this instance. The post-condition of 'foo' would simply be (1 <= returnValue <= 10). With no more information than that, the compiler would have to give an error, since 'foo' *may return a value* that results in an uninitialized read of 'i'. That's how it should work. No errors if and only if there is no possible execution path that results in failure, be it uninitialized-read failure, null-dereference failure or divide-by-zero failure.Additionally, even the "holy grail" approach still has to reduce itself to being overly conservative in certain cases anyway: ------------ int foo() { auto rnd = new RandomGenerator(); rnd.seed(systemClock); return rnd.fromRange(1,10); } ------------I tend to agree with BCS that the programmer should have the last say, unless the compiler can absolutely prove that (s)he is wrong. Given the choice between overly conservative and overly permissive, I would pick overly permissive. But the beauty of the holy grail is that it's neither.So, we only have two initial choices: - Overly permissive (current D approach)experience), anytime you do come across a provably-correct case that the compiler rejects, not only is it always obvious to see why the compiler rejected it, but it's also trivially easy to fix. So in practice, it's really not much of a "baby with the bathwater" situation at all.But what would the fix be in the case of our example? Surely you're not suggesting initializing 'i' to 0? Then we'd be back in the old situation where we might get unexpected runtime behavior if we were wrong about 'foo'.
Jun 28 2009
"Nick Sabalausky" <a a.a> wrote in message news:h28gqc$1duk$1 digitalmars.com..."Michiel Helvensteijn" <m.helvensteijn.remove gmail.com> wrote in message news:h2810s$hl1$1 digitalmars.com...I would also be perfectly ok with this compiling: --------------- int foo() out (ret) { assert(ret >= 5 && ret <= 10); } body { auto rnd = new RandomGenerator(); rnd.seed(systemClock); int ret = rnd.fromRange(5,10); return ret; } int i; for(int j = foo(); j > 3; j--) i = j; auto k = i; --------------- Ie, I can agree that the compiler should be able to take advantage of a function's contract when determining whether or not to throw a "may not get inited" error, but I strongly disagree that the contract used should be implicitly defined by the actual behavior of the function. IMO, in the sans-"out" versions of foo, the *only* post-condition contract is that it returns an int. If foo's creator really does intend for foo's result to always be within a certain subset of that, no matter what revisions are eventually made to foo (without actually changing the whole purpose of foo), then that should be put in a formal post-condition contract anyway, such as above.The fix would be one of the following, depending on what the code is actually doing: --------------- // Instead of knee-jerking i to 0, we default init it to // whatever safe value we want it to be if the loop // doesn't set it. This, of course, may or may not // be zero, depending on the code, but regardless, // there are times when this IS perfectly safe. 
int i = contextDependentInitVal; for(int j = foo(); j > 3; j--) i = j; auto k = i; --------------- --------------- int i; bool isSet = false; // making i nullable would be better for(int j = foo(); j > 3; j--) { i = j; isSet = true; } if(isSet) { auto k = i; } else { /* handle the problem */ } --------------- Also, keep in mind that while, under this mechanism, it is certainly possible for a coder to cause bugs by always knee-jerking the value to zero whenever the compiler complains, that's also a possibility under the "holy grail" approach.experience), anytime you do come across a provably-correct case that the compiler rejects, not only is it always obvious to see why the compiler rejected it, but it's also trivially easy to fix. So in practice, it's really not much of a "baby with the bathwater" situation at all.But what would the fix be in the case of our example? Surely you're not suggesting initializing 'i' to 0? Then we'd be back in the old situation where we might get unexpected runtime behavior if we were wrong about 'foo'.
Jun 28 2009
Nick Sabalausky wrote:Ie, I can agree that the compiler should be able to take advantage of a function's contract when determining whether or not to throw a "may not get inited" error, but I strongly disagree that the contract used should be implicitly defined by the actual behavior of the function.Ah, we are starting to agree. :-) However, in some cases, a function is so short and/or so simple that it would be extremely redundant to provide a formal contract. Think about setters, getters and the like. Functions whose implementations are extremely unlikely to change. So while I agree in general that the definition of a function should be its contract - not its implementation - in simple cases, I would find it acceptable for the creator of a function to explicitly indicate that it is defined by its implementation. -- Michiel Helvensteijn
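The distinction being discussed can be sketched in D. The names below are illustrative, not from the thread: a getter so trivial that its body can reasonably serve as its own contract, next to a function where the explicit out-contract, rather than the body, is the promise callers may rely on.

```d
// Hypothetical sketch: trivial accessor vs. explicitly contracted function.
struct Point
{
    private int x_;

    // So short that restating it as a postcondition would be redundant;
    // letting the implementation serve as the contract seems acceptable.
    int x() { return x_; }
}

// Here the out-contract, not the body, is what callers may depend on;
// the body can change freely as long as the contract still holds.
int clampPositive(int v)
out (r) { assert(r >= 0); }
body
{
    return v < 0 ? 0 : v;
}
```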
Jun 28 2009
Nick Sabalausky wrote:The fix would be one of the following, depending on what the code is actually doing: --------------- // Instead of knee-jerking i to 0, we default init it to // whatever safe value we want it to be if the loop // doesn't set it. This, of course, may or may not // be zero, depending on the code, but regardless, // there are times when this IS perfectly safe. int i = contextDependentInitVal; for(int j = foo(); j > 3; j--) i = j; auto k = i; --------------- --------------- int i; bool isSet = false; // making i nullable would be better for(int j = foo(); j > 3; j--) { i = j; isSet = true; } if(isSet) { auto k = i; } else { /* handle the problem */ } ---------------Keep in mind that we're talking about a situation in which we're sure 'i' will always be set. If this is not so, the program is incorrect, and we would want to see one error or another. Your first solution would be misleading in that case. Any initial value you choose would be a hack to silence the compiler. A variation on your second solution then: int i; bool isSet = false; // making i nullable would be better for(int j = foo(); j > 3; j--) { i = j; isSet = true; } assert(isSet); auto k = i; This is the basic solution I would always choose in the absence of the grail. As you say, ideally, the 'uninitialized' state should be part of 'i', not a separate variable. Reading 'i' would then automatically assert its initialization at runtime. I guess that brings us back to one of those scenarios I mentioned in another subthread. As compilers become more sophisticated, they will be able to remove the explicit initialization, the test and the extended state in more complex situations.Also, keep in mind that while, under this mechanism, it is certainly possible for a coder to cause bugs by always knee-jerking the value to zero whenever the compiler complains, that's also a possibility under the "holy grail" approach.That's true. 
But if we did have the grail, the compiler would also be able to see that knee-jerking 'i' would not satisfy the contract of the outer function. Programmers would learn to say what they mean, not what the compiler wants to hear. -- Michiel Helvensteijn
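The idea of making the 'uninitialized' state part of 'i' itself can be sketched as a wrapper type in D. This is a hedged illustration; 'Tracked' is a made-up name, not a library type:

```d
// A value that knows whether it has been assigned; reading an
// unassigned value fails an assertion at runtime instead of
// silently yielding a default.
struct Tracked(T)
{
    private T value;
    private bool isSet = false;

    void opAssign(T v) { value = v; isSet = true; }

    T get()
    {
        assert(isSet, "read of uninitialized variable");
        return value;
    }
}

void example(int function() foo)
{
    Tracked!int i;
    for (int j = foo(); j > 3; j--)
        i = j;
    auto k = i.get();   // asserts if the loop body never executed
}
```

This fuses the `bool isSet` and the `assert(isSet)` from the hand-written version into one place, which is what a verifying compiler could later elide when it can prove the loop runs.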
Jun 28 2009
"Michiel Helvensteijn" <m.helvensteijn.remove gmail.com> wrote in message news:h28i61$1hl3$1 digitalmars.com...Nick Sabalausky wrote:It's a situation where we're *initially* sure 'i' will always be set. But once we see that error from the compiler, we have to reassess that belief. There are three possibilities when that happens: 1. It will always be set because of the function's contract. In this case, we do the formal contract stuff I advocated earlier. And we can certainly come up with ways to be minimally-verbose with this for trivial cases. So this case gets eliminated. 2. It will always be set, but *only* because of the function's implementation. This *should* cause a compiler error, because if it's allowed by the function's formal contract, then that very fact means that we *should* assume that this may very well flip-flop anytime that either foo or anything foo may rely upon is changed. 3. We were, in fact, *mistaken* in thinking that what we were doing would always leave 'i' inited (this does happen). This causes the programmer to reassess their approach. Depending on what they're trying to do, the solution might involve rewriting a loop that's basically fubared already, or in some cases it may very well be as simple as adding a default init value (this does happen).The fix would be one of the following, depending on what the code is actually doing: --------------- // Instead of knee-jerking i to 0, we default init it to // whatever safe value we want it to be if the loop // doesn't set it. This, of course, may or may not // be zero, depending on the code, but regardless, // there are times when this IS perfectly safe. 
int i = contextDependentInitVal; for(int j = foo(); j > 3; j--) i = j; auto k = i; --------------- --------------- int i; bool isSet = false; // making i nullable would be better for(int j = foo(); j > 3; j--) { i = j; isSet = true; } if(isSet) { auto k = i; } else { /* handle the problem */ } ---------------Keep in mind that we're talking about a situation in which we're sure 'i' will always be set. If this is not so, the program is incorrect, and we would want to see one error or another. Your first solution would be misleading in that case. Any initial value you choose would be a hack to silence the compiler.A variation on your second solution then: int i; bool isSet = false; // making i nullable would be better for(int j = foo(); j > 3; j--) { i = j; isSet = true; } assert(isSet); auto k = i; This is the basic solution I would always choose in the absence of the grail. As you say, ideally, the 'uninitialized' state should be part of 'i', not a separate variable. Reading 'i' would then automatically assert its initialization at runtime.Yea, that works too. It's effectively a sub-case of my "if(isSet) else {/* handle this somehow*/ }", and more-or-less what I had in mind.I guess that brings us back to one of those scenarios I mentioned in another subthread. As compilers become more sophisticated, they will be able to remove the explicit initialization, the test and the extended state in more complex situations.Agreed, but with the caveat that care should be taken to ensure these new rules don't allow non-localized flip-flopping when something's[1] implementation is changed, because then the programmer has to start remembering and analyzing an increasingly complex set of rules. [1] Side-trip to spell-check land again: Apparently OpenOffice doesn't think "something" can be made possessive. 
(But then again, maybe it technically can't in super-anal-grammar-police land, not like I would know ;) )No, it wouldn't, because it would have no way of knowing that's a knee-jerk fix for a "using uninited var" error. But maybe I misunderstand you?Also, keep in mind that while, under this mechanism, it is certainly possible for a coder to cause bugs by always knee-jerking the value to zero whenever the compiler complains, that's also a possibility under the "holy grail" approach.That's true. But if we did have the grail, the compiler would also be able to see that knee-jerking 'i' would not satisfy the contract of the outer function.
Jun 28 2009
Nick Sabalausky wrote:Init, assert, spell-check land, etc.We're agreed.I mean that if the programmer has provided a postcondition of the outer function (the function that contains the variable 'i'), a verifying compiler will be able to give an error if knee-jerking 'i' results in a subtle bug in the function; one that would invalidate the postcondition. Of course, if no postcondition is supplied, the compiler can only assume you meant for exactly that thing to happen. The bug becomes a feature. :-) Anyway, the verifying compiler is a project I'm working on. I'm designing a language based on the assumption that compilers will become more and more sophisticated in the area of static analysis. Contracts are the most important feature of this language and assertions even have their own syntax (because they'll be used so much). Where the correctness of a piece of code cannot be proved at compile-time, a managed runtime environment is used. This offers the guarantee that the current state will always satisfy the contract. Assertions cannot be 'caught' and discarded. Many optimizations may also be based on contracts. It's a really fun project. -- Michiel HelvensteijnNo, it wouldn't, because it would have no way of knowing that's a knee-jerk fix for a "using uninited var" error. But maybe I misunderstand you?Also, keep in mind that while, under this mechanism, it is certainly possible for a coder to cause bugs by always knee-jerking the value to zero whenever the compiler complains, that's also a possibility under the "holy grail" approach.That's true. But if we did have the grail, the compiler would also be able to see that knee-jerking 'i' would not satisfy the contract of the outer function.
Jun 28 2009
Hello Nick,Also, keep in mind that while, under this mechanism, it is certainly possible for a coder to cause bugs by always knee-jerking the value to zero whenever the compiler complains, that's also a possibility under the "holy grail" approach.How about letting the user signal that they know what they are doing by using: int i = void;
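For reference, D does have exactly this escape hatch: `= void` suppresses default initialization and signals deliberate intent. A minimal sketch, with `foo` standing in for the function from the earlier examples:

```d
void example(int function() foo)
{
    int i = void;   // "I know what I'm doing": no default initialization
    for (int j = foo(); j > 3; j--)
        i = j;
    auto k = i;     // contains garbage if the loop body never ran
}
```

The burden of proof shifts to the programmer: the compiler stays out of the way, but a wrong assumption about `foo` now reads uninitialized memory instead of a predictable default.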
Jun 28 2009
Hello Nick,"Michiel Helvensteijn" <m.helvensteijn.remove gmail.com> wrote in message news:h25fbk$28mg$1 digitalmars.com...Yes, trying to solve the problem for all cases won't work, but I think the default should be to trust the programmer. If you can show for sure with a trivial set of rules that I use a variable before setting it, give me an error. If not, get the heck out of my way!It has the advantage of always knowing at compile time that you're safe, but you often throw out the baby with the bath water. The example program may be perfectly valid if 'foo' always returns positive.Yes, this approach is what I was getting at. In fact, I would (and already have in the past) argue that this is *better* than the "holy grail" approach, because it's based on very simple and easy-to-remember rules. Conversely, the "holy grail" approach leads to difficult-to-predict cases of small, seemingly-innocent changes in one place causing some other code to suddenly switch back and forth between "compiles" and "doesn't compile".
Jun 27 2009
Hello Denis,On Sat, 27 Jun 2009 17:50:11 +0400, BCS <none anon.com> wrote:And if foo() is never <= 0, then the error is valid by the compiler's rules, but wrong in practice. I like the int.nan idea better. Not one unassigned local variable error I have ever seen has pointed me at a bug.Hello Nick,first attempt to use 'i': error CS0165: Use of unassigned local variable 'i'Interesting idea, but IMO using NaN as a default initializer is just a crutch for not having a real system of compile-time detecting/preventing of uninitialized variables from being readI think you can prove that it is impossible to do this totally correctly: int i; for(int j = foo(); j > 0; j--) i = bar(j); // what if foo() returns -5?
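The "int.nan" idea can only be approximated in a library, since integers have no hardware NaN. A hedged sketch, reserving int.min as a poison value as proposed earlier in the thread (the type and its names are illustrative, not an actual proposal):

```d
// Sentinel-based "NaN" for ints: gives up one representable value
// (int.min) in exchange for use-before-set detection at runtime.
struct NanInt
{
    enum int nan = int.min;    // reserved sentinel, by convention
    private int value = nan;   // default-initialized to "not a number"

    void opAssign(int v)
    {
        assert(v != nan, "the sentinel value is reserved");
        value = v;
    }

    int get()
    {
        assert(value != nan, "use of int.nan");
        return value;
    }
}
```

Unlike floating-point NaN, the poison value does not propagate through arithmetic for free; every read has to be checked, which is the runtime cost this thread is weighing against compile-time flow analysis.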
Jun 27 2009