digitalmars.D - On exceptions, errors, and contract violations
- Sean Kelly (70/70) Oct 03 2014 I finally realized what's been bugging me about the program
- bearophile (10/22) Oct 03 2014 In presence of high-order functions the assignment of contract
- monarch_dodra (12/16) Oct 03 2014 Technically, a precondition validates correct argument passing
- Sean Kelly (4/11) Oct 03 2014 It sounds to me like we're saying the same thing. Postconditions
- Ali Çehreli (70/75) Oct 03 2014 Agreed.
- "Ola Fosheim Grøstad" (49/78) Oct 03 2014 Yes, I agree. The exceptions are serious hardware instability
- Marco Leise (14/17) Oct 05 2014 Am Fri, 03 Oct 2014 19:46:15 +0000
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (39/49) Oct 05 2014 By what definition?
- Marco Leise (6/6) Oct 06 2014 Ok, I get it. You are asking for a change in paradigms. But it
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (32/36) Oct 06 2014 I think D, Go, Rust and C++ struggle with high-level vs low-level
- Dicebot (10/10) Oct 05 2014 I agree with most parts and this pretty much fits my unhappy
- Steven Schveighoffer (25/28) Oct 06 2014 This is the thing I have been arguing. Inside a library, the idea of
- Jacob Carlborg (7/15) Oct 06 2014 I kind of agree with Walter. In an ideal world all environmental errors
- Bruno Medeiros (34/87) Oct 29 2014 Spot on, I pretty much agree with everything above, but up to this point...
- Sean Kelly (19/39) Oct 29 2014 It would give us an idea of where to look for the bug. Also, it
I finally realized what's been bugging me about the program logic error, airplane vs. server discussion, and rather than have it lost in the other thread I thought I'd start a new one. The actual problem driving these discussions is that the vocabulary we're using to describe error conditions is too limited. We currently have a binary condition: either something is a temporary environmental condition, detected at run-time, which may disappear simply by retrying the operation, or it is a programming logic error which always indicates utter, irrecoverable failure.

Setting aside exceptions for the moment, one thing I've realized about errors is that in most cases, an API has no idea how important its proper function is to the application writer. If a programmer passes out-of-range arguments to a mathematical function, his logic may be faulty, but *we have no idea what this means to him*. The confusion about whether the parameters to a library constitute user input is ultimately the result of this exact problem--since we have no idea of the importance of our library in each application, we cannot dictate how the application should react to failures within the library.

A contract has preconditions and postconditions to validate different types of errors. Preconditions validate user input (caller error), and postconditions validate resulting state (callee error). If nothing else, proper communication regarding which type of error occurred is crucial. A precondition error suggests a logic error in the application, and a postcondition error suggests a logic error in the function. The function writer is in the best position to know the implications of a postcondition failure, but has no idea what the implications of a precondition failure might be. So it's reasonable to assert that the type of contract error is important not only for knowing what to fix, but also for knowing how to react to the problem.

Another issue is what the error tells us about the locality of the failure. A precondition failure indicates that the failure simply occurred sometime before the precondition was evaluated, while a postcondition failure indicates that the failure occurred within the processing of the function. Invariant failures might indicate either, which leads me to think that they are too coarse-grained to be of much use. In general, I would rather know whether the data integrity problem was preexisting or whether it occurred as the result of my function. Simply knowing that it exists at all is better than nothing, but since we already have the facility for a more detailed diagnosis, why use invariants?

Without running on too long, I think the proper response to this issue is to create a third subtype of Throwable to indicate contract violations, further differentiating between pre and postcondition failures. So we use Exception to represent (environmental) errors which may disappear simply from retrying the operation (and weirdly, out of memory falls into this category, though nothrow precludes recategorization), Error to represent, basically, the things we want to be allowable in nothrow code, and ContractFailure (with children PreconditionFailure, PostconditionFailure, and InvariantFailure) to indicate contract violations. This gets them out from under the Exception umbrella in terms of having them accidentally discarded, and gives the programmer the facility to handle them explicitly. I'm not entirely sure how they should operate with respect to nothrow, but am leaning towards saying that they should be allowable just like Error.

The question then arises: if contract failures are valid under nothrow, should they derive from Error? I'm inclined to say no, because Error indicates something different. An Error results as a natural consequence of code that should be legal everywhere. Divide by zero, for instance. It isn't practical to outlaw division in nothrow code, so a divide by zero error is an Error. But contract violations are something different and I believe they deserve their own category.

How does this sound? It's only the beginnings of an idea so far, but I think it's on the right track towards drawing meaningful distinctions between error conditions in D.
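For concreteness, the proposed hierarchy might be sketched roughly as follows (hypothetical type names from the proposal above; none of these exist in druntime):

```d
// Hypothetical sketch -- ContractFailure sits directly under
// Throwable, beside Exception and Error, so catch-all handlers for
// either of those do not silently swallow it.
class ContractFailure : Throwable
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

class PreconditionFailure : ContractFailure
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

class PostconditionFailure : ContractFailure
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

class InvariantFailure : ContractFailure
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}
```

An application that considers some library non-critical could then catch ContractFailure at that component's boundary while still letting Error take the process down.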
Oct 03 2014
Sean Kelly:

> Another issue is what the error tells us about the locality of the failure. A precondition indicates that the failure simply occurred sometime before the precondition was called, while a postcondition indicates that the failure occurred within the processing of the function.

In the presence of higher-order functions, the assignment of contract failure blame is less simple.

> Invariant failures might indicate either, which leads me to think that they are too coarse-grained to be of much use. In general, I would rather know whether the data integrity problem was preexisting or whether it occurred as the result of my function. Simply knowing that it exists at all is better than nothing, but since we already have the facility for a more detailed diagnosis, why use invariants?

Invariants are more DRY. There are plenty of state parts that can be changed by a function; if you want to test the invariant each time, you risk duplicating that testing code in every post-condition. The invariant helps you avoid that. It's a safety net that helps against problems that you can forget.

Bye,
bearophile
Oct 03 2014
On Friday, 3 October 2014 at 17:40:43 UTC, Sean Kelly wrote:

> A contract has preconditions and postconditions to validate different types of errors. Preconditions validate user input (caller error), and postconditions validate resulting state (callee error).

Technically, a precondition validates correct argument passing from the "caller", which is not quite the same as "user input". "User" in this context is really the "end user", and is *not* what contracts are made for. Also, I don't think "postconditions" are meant to check "callee" errors. That's what asserts do. Rather, postconditions are verifications that can only occur *after* the call. For example, a function that takes an input range (no length), but says "the input range shall have exactly this amount of items..." or "the input shall be no bigger than some unknown value, which would otherwise cause a result overflow".
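As an illustration of a verification that can only occur after the call, an out contract might look like this (hypothetical function, not from Phobos):

```d
// Sketch: the out contract checks a property of the result that
// only exists once the call has completed.
size_t countNonZero(const(int)[] data)
out (result)
{
    // The count can never exceed the number of elements inspected.
    assert(result <= data.length);
}
body
{
    size_t n = 0;
    foreach (x; data)
        if (x != 0)
            ++n;
    return n;
}
```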
Oct 03 2014
On Friday, 3 October 2014 at 17:51:16 UTC, monarch_dodra wrote:

> Also, I don't think "postconditions" are meant to check "callee" errors. That's what asserts do. Rather, postconditions are verifications that can only occur *after* the call. For example, a function that takes an input range (no length), but says "input range shall have exactly this amount of items..." or "input shall be no bigger than some unknown value, which will cause a result overflow".

It sounds to me like we're saying the same thing. Postconditions test whether the function did the right thing. This might be validating the result or validating some internal state.
Oct 03 2014
On 10/03/2014 10:40 AM, Sean Kelly wrote:

> an API has no idea how important its proper function is to the application writer.

Agreed. Further, an API has no idea whether its caller is a "user" or a "library". (I will expand below.)

> If a programmer passes out of range arguments to a mathematical function, his logic may be faulty, but *we have no idea what this means to him*.

This reminds me of a discussion we had on this forum many months ago. If I write an API function, I have no option but to call enforce() inside the body of the function because I must be ready for a "user". I don't want my checks to be removed by the -release compiler switch. However, if my caller happens to be a library or lower-level code of a program, then my enforce() calls should be turned into asserts and be moved to the in block. We would like to write the checks only once but tell the compiler to use them either as in contracts or as enforce checks.

Here is an implementation of the idea if we wanted to do it ourselves:

1) Repeat the checks in both places, only one of which will be enabled depending on version identifiers.

double squareRoot(double value)
in
{
    mixin preConditionCheck_squareRoot!();
    check(value);
}
body
{
    mixin preConditionCheck_squareRoot!();
    check(value);

    return 42.42;
}

2) To test the idea, provide version identifiers during compilation:

/* Pick a combination of the two versions 'userCalled' and 'release'
 * below. What happens is specified after the arrow:
 *
 * a) userCalled (no release)     -> throws Exception
 * b) userCalled + release        -> throws Exception
 * c) (no userCalled, no release) -> throws AssertError
 * d) (no userCalled) release     -> does not check
 */
// version = userCalled;
// version = release;

void main()
{
    squareRoot(-1);
}

3) Here is the horrible boiler plate that I came up with without putting much effort into it:

mixin template preConditionCheck_squareRoot()
{
    import std.stdio;
    import std.exception;

    void noop(T...)(T) {}

    void assertCaller(T...)(T args)
    {
        assert(args[0], args[1]);
    }

    version (userCalled)
    {
        alias checker = enforce;
    }
    else
    {
        version (release)
        {
            alias checker = noop;
        }
        else
        {
            alias checker = assertCaller;
        }
    }

    void check(T...)(T args)
    {
        checker(value > 0, "Value must be greater than zero.");
    }
}

Ali
Oct 03 2014
On Friday, 3 October 2014 at 17:40:43 UTC, Sean Kelly wrote:

> Setting aside exceptions for the moment, one thing I've realized about errors is that in most cases, an API has no idea how important its proper function is to the application writer. If a programmer passes out of range arguments to a mathematical function, his logic may be faulty, but *we have no idea what this means to him*. The confusion about whether the parameters to a library constitute user input is ultimately the result of this exact problem--since we have no idea of the importance of our library in each application, we cannot dictate how the application should react to failures within the library.

Yes, I agree. The exceptions are serious hardware instability issues, or serious errors such as a garbage collector going bananas with no possibility of ever reclaiming any lost memory. Those are not application-level concerns; those are kernel or foundational runtime concerns.

I don't want my website to tank just because some library author wrote code that is incapable of dealing with the 29th of February in a date conversion. I don't want the website to tank because a validator throws a type-related error. I want the validation to fail, but not the program. The failure should be local to what you are calling into. The conversion from a local failure to a fatal error can, in most cases, only be done by the application.

> A contract has preconditions and postconditions to validate different types of errors. Preconditions validate user input (caller error), and postconditions validate resulting state (callee error).

Preconditions establish what input the postconditions are guaranteed to hold for. They don't really validate. Preconditions say this: if you call me with other values than the ones in the precondition, you may or may not get what you want; it is not my responsibility.

> A precondition error suggests a logic error in the application, and a postcondition error suggests a logic error in the function.

I'd rather say:

1. If you call into a function breaking the precondition, then the calling module is responsible for any errors that may occur later on (or rather the contractor that wrote it).

2. If the postcondition fails for input that satisfies the precondition, then the called module is responsible (or rather the contractor that wrote it).

> The function writer is in the best position to know the implications of a postcondition failure, but has no idea what the implications of a precondition failure might be.

If the modules are isolated, then the system architect can define which modules are critical and which are not. So he or she can decide whether the module should continue, be turned off, logged for manual correction at a later stage, reset, or whether the whole system should enter some kind of emergency mode.

> it's reasonable to assert that not only the type of contract error is important to know what to fix, but also to know how to react to the problem.

Yes, and maybe even know which contractor should be called to fix the problem.

> But contract violations are something different and I believe they deserve their own category.

The support for contracts in D is not really meaningful IMO. It is basically just regular asserts with syntax dressing.

> How does this sound? It's only the beginnings of an idea so far, but I think it's on the right track towards drawing meaningful distinctions between error conditions in D.

I like where you are going, but think about this: is infinite recursion a logic error? It should not be considered one, because you can solve a problem using N strategies concurrently and choose the one that returns a valid result first. Hence logic errors are not fatal errors until the application code says so.

It makes a lot of sense to cut down on the amount of code you have to write and let runtime errors happen, then catch them and swallow them. It makes sense to just swallow errors like typing issues if they should not occur for a result you want to use. You simply turn the "logic error" into a "cannot compute this" result if that is suitable for the application. And the programming language should not make this hard.
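As a sketch of what that paradigm would look like, using D's bounds checking as the implicit validator (hypothetical record format; note that catching Error subclasses like RangeError is not sanctioned by the spec, which is precisely the paradigm change being argued about):

```d
import core.exception : RangeError;

// Sketch of the idea above: let the runtime's range check do the
// validation work, and translate any failure into a plain
// "cannot compute this" result instead of a fatal error.
bool looksLikeValidRecord(const(ubyte)[] data)
{
    try
    {
        // Hypothetical format: byte 0 holds an offset into the data,
        // and the byte at that offset must be the 0xFF terminator.
        return data[data[0]] == 0xFF;
    }
    catch (RangeError)
    {
        // Corrupt input indexed out of bounds: a validation failure
        // for this application, not a bug worth killing the process.
        return false;
    }
}
```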
Oct 03 2014
On Fri, 03 Oct 2014 19:46:15 +0000, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com> wrote:

> You simply turn the "logic error" into a "cannot compute this" result if that is suitable for the application. And the programming language should not make this hard.

I don't get this. When we say logic error we are talking about bugs in the program. Why would anyone turn an outright bug into "cannot compute this"? When a function cannot handle division by zero, it should not be fed a zero in the first place. That's part of input validation before getting to that point. Or do you vote for removing these validations and waiting for the divide by zero to happen inside the callee, in order to catch it in the caller and say in hindsight: "It seems like in one way or another this input was not computable"?

-- 
Marco
Oct 05 2014
On Sunday, 5 October 2014 at 21:16:17 UTC, Marco Leise wrote:

> I don't get this. When we say logic error we are talking bugs in the program.

By what definition? And what if I decide that I want my programs to recover from bugs in insignificant code sections and keep going? Is a type error in a validator a bug? It makes perfect sense to let the runtime throw implicitly on things you cannot be bothered to check explicitly because they should not happen for valid input. If that is a bug, then it is a good bug that makes it easier to write code that responds properly. The less verbose a validator is, the easier it is to ensure that it responds in a desirable fashion. Why force the programmer to replicate work that the compiler/runtime already does anyway?

Is an out-of-range error when processing a corrupt file a bug, or is it a deliberate reliance on D's range-check feature? Isn't the range check more useful if you don't have to do explicit checks for valid input? Useful as in: saves time and money with the same level of correctness, as long as you know what you are doing?

Is deep recursion a bug? Not really. Is running out of memory a bug? Not really. Is division by a very small number that is coerced to zero a bug? Not really. Is hitting the worst-case running time, which causes timeouts, a bug? Not really; it is bad luck. Can the compiler/library/runtime reliably determine what is a bug and what is not? Not in a consistent fashion.

> Why would anyone turn an outright bug into "cannot compute this". When a function cannot handle division by zero it should not be fed a zero in the first place. That's part of input validation before getting to that point.

I disagree. When you want computations to be performant, it makes a lot of sense to do speculative computation in a SIMD-like manner using the less robust method, then recompute the computations that failed using a slower and more robust method. Or simply ignore the results that were hard to compute: think of a ray tracer that solves very complex equations using a numerical solver that will not always produce a meaningful result. You are then better off using the faster solver and simply ignoring the rays that produce unreasonable results according to some heuristics. You can compensate by firing more rays per pixel with slightly different x/y coordinates. The alternative is to produce images with "pixel noise" or use a much slower solver.

> Or do you vote for removing these validations and wait for the divide by zero to happen inside the callee in order to catch it in the caller and say in hindsight: "It seems like in one way or another this input was not computable"?

There is a reason why the FP handling in ALUs lets this be configurable. It is up to the application to decide.
Oct 05 2014
Ok, I get it. You are asking for a change in paradigms. But it is way outside my comfort zone to say yes or no to it. I will just go on duplicating the error checking through input validation. -- Marco
Oct 06 2014
On Monday, 6 October 2014 at 13:11:25 UTC, Marco Leise wrote:

> Ok, I get it. You are asking for a change in paradigms. But it is way outside my comfort zone to say yes or no to it. I will just go on duplicating the error checking through input validation.

I think D, Go, Rust and C++ struggle with high-level vs low-level programming. It is a difficult balancing act if you want to keep the language/libraries reasonably simple and uniform.

Go has more or less kicked out the low-level aspect and is now spending effort on getting acceptable response times in a high-concurrency, memory-safe niche. Useful, but limited. Rust is too young to be evaluated, but it attempts to get safety by wrapping up unsafe stuff in unique_ptr-style boxes. Immature and at the moment inconvenient, but they are at least clear on keeping a razor-sharp fence between unsafe and safe code. C++ struggles with its high-level effort and will do so forever. It is primarily useful if your needs for high-level support are limited and you use C++ as a "better C".

Cost-efficient web programming requires expressive high-level programming frameworks where things are caught at runtime by a rich semantic system and turned into appropriate HTTP responses. There is usually little incentive to turn off safety features. Engine/kernel/DSP/AI programming requires low-level programming frameworks where you get to treat all data in their binary representation, or close to it. Safety features are too expensive, but useful during debugging.

D tries to cover both, but the priorities are unclear and it isn't suitable for either out of the box. Which is natural when you don't want to give up performance and still want to have convenience. I guess the current focus is to take the performance-limiting aspects of convenience out of the language/runtime and put them into libraries, so that you can more easily configure according to the application domain. The question is whether you can do that without sacrificing ease of use, interoperability, performance and convenience. It will be interesting to see how D without GC plays out, though.
Oct 06 2014
I agree with most parts, and this pretty much fits my unhappy experience of trying to use D's assert/contract system. However, I don't feel like contracts and plain assertions should throw different kinds of exceptions - it allows distinguishing some cases but does not solve the problem in general. And those are essentially the same tools, so having the same exception types makes sense. Different compilation versions sound more suitable, but that creates the usual distribution problems with exponential version explosion.
Oct 05 2014
On 10/3/14 1:40 PM, Sean Kelly wrote:

> Setting aside exceptions for the moment, one thing I've realized about errors is that in most cases, an API has no idea how important its proper function is to the application writer.

This is the thing I have been arguing. Inside a library, the idea of input to the function being user-defined or program-defined is not clear. It means that any user-defined input has to be double-checked in the same exact way, to avoid having an error thrown in the case that the library function throws an error on such input. On the other side, any program-caused errors that end up triggering exceptions (a misnamed filename for opening a config file, for instance) need to be treated as errors, halting the program with an appropriate stack trace.

What makes sense to me is:

1. If you send in an obvious programming error (i.e. null instead of an expected target address), throw an error. This can be done in contract precondition scopes. Such things could NOT be generated by the user.

2. If you send in something that doesn't make sense, but could be user-generated, throw an exception. User-generated is a very fuzzy thing, but we should conservatively assume the input is user-generated.

3. If anything happens internally to the function, treat it as an error. If an input causes an exception, but the input was program-generated, don't catch it in the calling scope! Just let the runtime handle it as an error, and exit the program. Any uncaught exceptions should be treated as errors anyway.

It might be a good idea to have an umbrella exception type that means "input invalid." This gives an easy way to catch all such exceptions and print them properly in the case where it's user-defined input.

-Steve
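A rough sketch of that split between points 1 and 2 (hypothetical function and types; `enforce` is from std.exception):

```d
import std.exception : enforce;

// Hypothetical type used only for this illustration.
struct User
{
    string name;
}

void renameUser(User* target, string newName)
in
{
    // 1. A null target cannot come from user input -- it is a
    //    programming error, so assert it (fails with AssertError).
    assert(target !is null, "null target is a caller bug");
}
body
{
    // 2. The new name may well be user-generated, so validate it
    //    with a catchable Exception instead.
    enforce(newName.length > 0, "user name must not be empty");
    target.name = newName;
}
```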
Oct 06 2014
On 2014-10-06 16:36, Steven Schveighoffer wrote:

> This is the thing I have been arguing. Inside a library, the idea of input to the function being user defined or program-defined is not clear. It means that any user-defined input has to be double checked in the same exact way, to avoid having an error thrown in the case that the library function throws an error on such input. The other side, any program-caused errors that end up triggering exceptions (a misnamed filename for opening a config file, for instance), needs to treat this exception as an error and halt the program with an appropriate stack trace.

I kind of agree with Walter. In an ideal world all environmental errors would throw an exception, e.g. a file that cannot be found. Any other errors would be asserts, e.g. passing null to a function not expecting it. But I can understand that that behavior would most likely cause problems.

-- 
/Jacob Carlborg
Oct 06 2014
On 03/10/2014 18:40, Sean Kelly wrote:

> I finally realized what's been bugging me about the program logic error, airplane vs. server discussion, and rather than have it lost in the other thread I thought I'd start a new one. The actual problem driving these discussions is that the vocabulary we're using to describe error conditions is too limited. We currently have a binary condition. Either something is a temporary environmental condition, detected at run-time, which may disappear simply by retrying the operation, or it is a programming logic error which always indicates utter, irrecoverable failure. Setting aside exceptions for the moment, one thing I've realized about errors is that in most cases, an API has no idea how important its proper function is to the application writer. If a programmer passes out of range arguments to a mathematical function, his logic may be faulty, but *we have no idea what this means to him*. The confusion about whether the parameters to a library constitute user input is ultimately the result of this exact problem--since we have no idea of the importance of our library in each application, we cannot dictate how the application should react to failures within the library.

Spot on, I pretty much agree with everything above, but up to this point only. After that, not so much.

> A contract has preconditions and postconditions to validate different types of errors. Preconditions validate user input (caller error), and postconditions validate resulting state (callee error). If nothing else, proper communication regarding which type of error occurred is crucial. A precondition error suggests a logic error in the application, and a postcondition error suggests a logic error in the function. The function writer is in the best position to know the implications of a postcondition failure, but has no idea what the implications of a precondition failure might be. So it's reasonable to assert that not only the type of contract error is important to know what to fix, but also to know how to react to the problem. Another issue is what the error tells us about the locality of the failure. A precondition indicates that the failure simply occurred sometime before the precondition was called, while a postcondition indicates that the failure occurred within the processing of the function. Invariant failures might indicate either, which leads me to think that they are too coarse-grained to be of much use. In general, I would rather know whether the data integrity problem was preexisting or whether it occurred as the result of my function. Simply knowing that it exists at all is better than nothing, but since we already have the facility for a more detailed diagnosis, why use invariants?

"A precondition error suggests a logic error in the application, and a postcondition error suggests a logic error in the function."

Suggests, yes, but it doesn't guarantee. A postcondition error (or invariant failure) in a component (imagine a library) could well be triggered not by a bug in that component, but by a bug in the use of that component. Only if the component properly defines all the preconditions for all its functions can that case be ruled out. But in practice we know that software is not perfect, and a lot of preconditions may not be explicitly defined.

> Without running on too long, I think the proper response to this issue is to create a third subtype of Throwable to indicate contract violations, further differentiating between pre and postcondition failures. So we use Exception to represent (environmental) errors which may disappear simply from retrying the operation (and weirdly, out of memory falls into this category, though nothrow precludes recategorization), Error to represent, basically, the things we want to be allowable in nothrow code, and ContractFailure (with children: PreconditionFailure, PostconditionFailure, and InvariantFailure) to indicate contract violations. This gets them out from under the Exception umbrella in terms of having them accidentally discarded, and gives the programmer the facility to handle them explicitly. I'm not entirely sure how they should operate with respect to nothrow, but am leaning towards saying that they should be allowable just like Error.

Even if we could correctly differentiate between precondition failures and postcondition ones, what would that give us of use? I think in practice, when some code hits an error in some component that it uses, knowing whether it is a precondition failure (a bug in the code using the component) or a postcondition failure (a bug in the used component itself) may actually not tell us much about how much of the program has been affected, ie, which fault domain is broken.

I think the default behavior should simply be the clean throw of an Exception when an assertion fails. If there is a performance issue with this, and we want to crash the program immediately when an assertion fails, then that should be an option too. However, this behavior should be configurable per library/component, not globally for the whole program; that is too coarse. Also, it should be configurable *in the code* itself, not at compile time. For some components it should even be possible to use the component with hard-stop assertion-failure behavior in some places in the program, and with clean exceptions in other places of the *same program*.

-- 
Bruno Medeiros
https://twitter.com/brunodomedeiros
Oct 29 2014
On Wednesday, 29 October 2014 at 13:28:28 UTC, Bruno Medeiros wrote:

> Even if we could correctly differentiate between precondition failures and postcondition ones, what would that give us of use?

It would give us an idea of where to look for the bug. Also, it provides the option of discretionally reacting differently to each.

> I think in practice when some code hits an error in some component that it uses, knowing whether it is a precondition failure (bug in the code using the component), or a postcondition (bug in the used component itself), it may actually not tell us much about how much of the program has been affected, ie, which fault domain is broken.

Yes, it may not. This is really an artifact of the function being called. I don't think any universal rules can be applied to pre vs. postcondition failures, but this at least gives the programmer the option of distinguishing between them for the cases where something can be done. Though if we have something like this:

void f1()
in { /* passes */ }
body
{
    f2();
}

void f2()
in { throw new PreconditionFailure; }
body { ... }

Then things get a bit muddied. The bug here is really in the body of f1(), but the caller has no way of knowing that. We could probably do something to indicate that this had occurred, but it's probably overcomplicating things.

> I think the default behavior should be simply the clean throw of an Exception when an assertion fails. If there is a performance issue with this, and we want to crash the program immediately when an assertion fails, then that should be an option too. However this behavior should be configurable per library/component, not globally for the whole program, that is too coarse. Also, it should be configurable *in the code* itself, not at compile time. For some components it should even be possible to use the component with hard-stop assertion failure behavior in some places in the program, and with clean exceptions in other places of the *same program*.

This is definitely possible, since the assert handler can be overridden, though it will not interact well with nothrow.
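A sketch of that override, assuming druntime's assert-handler hook in core.exception (the handler must be nothrow, which is why the thrown type below derives from Error rather than Exception -- and also why it still clashes with nothrow callers, as noted above):

```d
import core.exception : assertHandler;

// Hypothetical catchable failure type. Deriving from Error lets the
// nothrow handler throw it at all, but nothrow frames it unwinds
// through will still skip their cleanup code.
class ContractFailure : Error
{
    this(string msg, string file, size_t line)
    {
        super(msg, file, line);
    }
}

shared static this()
{
    // Route all assertion failures into ContractFailure so a
    // component boundary can choose to catch them.
    assertHandler = (string file, size_t line, string msg)
    {
        throw new ContractFailure(msg, file, line);
    };
}
```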
Oct 29 2014