digitalmars.D - Concept proposal: Safely catching error
I recently skimmed the "Bad array indexing is considered deadly" thread, which discusses the "array OOB throws Error, which throws the whole program away" problem. The gist of the debate is:

- Array OOB is a programming problem; it means an invariant is broken, which means the code surrounding it probably makes invalid assumptions and shouldn't be trusted.
- Also, it can be caused by memory corruption.
- But then again, anything can be caused by memory corruption, so it's kind of an odd thing to worry about. We should worry about not causing it, not about making memory-corrupted programs safe, since it's extremely rare and there's not much we can do about it anyway.
- But memory corruption is super bad; if a proved error *might* be caused by memory corruption, then we must absolutely throw the potentially corrupted data away without using it.
- Besides, even without memory corruption, the same argument applies to broken invariants; if we have data that breaks invariants, we need to throw it away and use it as little as possible.
- But sometimes we have very big applications with lots of data and lots of code. If my server deals with dozens of clients or more, I don't want to brutally disconnect them all because I need to throw away one user's data.
- This could be achieved with processes. Then again, using processes often isn't practical for performance or architecture reasons.

My proposal for solving these problems would be to explicitly allow catching Errors in @safe code IF the try block from which the Error is caught is perfectly pure. In other words, @safe functions would be allowed to catch Error after try blocks if the block only mutates data declared inside of it; the code would look like:

----
import vibe.d;
// ...

string handleRequestOrError(in HTTPServerRequest req) @safe
{
    ServerData myData = createData();

    try
    {
        // both doSomethingWithData and mutateMyData are pure
        doSomethingWithData(req, myData);
        mutateMyData(myData);
        return myData.toString;
    }
    catch (Error)
    {
        throw new SomeException("Oh no, a system error occurred");
    }
}

void handleRequest(HTTPServerRequest req, HTTPServerResponse res) @safe
{
    try
    {
        res.writeBody(handleRequestOrError(req), "text/plain");
    }
    catch (SomeException)
    {
        // Handle exception
    }
}
----

The point is, this is safe even when doSomethingWithData breaks an invariant or mutateMyData corrupts myData, because the compiler guarantees that the only data affected WILL be thrown away or otherwise inaccessible by the time catch (Error) is reached.

This would allow designing applications that can fail gracefully when dealing with multiple independent clients or tasks, even when one of the tasks has to be thrown away because of a programmer error.

What do you think? Does the idea have merit? Should I make it into a DIP?
Jun 05 2017
Olivier FAURE wrote:
> What do you think? Does the idea have merit? Should I make it into a DIP?

tbh, i think that it adds Yet Another Exception Rule to the language, and this does no good in the long run. "oh, you generally cannot do that, except if today is Friday, it is rainy, and you've seen a pink unicorn in the morning." the more exceptions to general rules a language has, the more it reminds me of the Dragon Poker game from Robert Asprin's books. any exception will usually have a strong rationale behind it, of course, so there will be little reason to not accept it, especially if we had accepted some exceptions before. i think it is better to not follow that path, even if this one idea looks nice.
Jun 05 2017
On Monday, 5 June 2017 at 10:09:30 UTC, ketmar wrote:
> tbh, i think that it adds Yet Another Exception Rule to the language, and this does no good in the long run. "oh, you generally cannot do that, except if today is Friday, it is rainy, and you've seen pink unicorn at the morning." the more exceptions to general rules language has, the more it reminds Dragon Poker game from Robert Asprin books.

Fair enough. A few counterpoints:

- This one special case is pretty self-contained. It doesn't impact code that doesn't use it, and the users most likely to hear about it are the ones who need to recover from Errors in their code.
- It doesn't introduce elaborate under-the-hood tricks (unlike DIP 1008*). It uses already-existing concepts (@safe and pure), and is in fact closer to the intuitive logic behind Error recovery than the current model; instead of "You can't recover from Errors" you have "You can't recover from Errors unless you flush all data that might have been affected by it".

*Note that I am not making a statement for or against those DIPs. I'm only using them as examples to compare my proposal against.

So while this would add feature creep to the language, I'd argue that the feature creep would be pretty minor and well-contained, and would probably be worth the problem it would solve.
Jun 05 2017
Olivier FAURE wrote:
> On Monday, 5 June 2017 at 10:09:30 UTC, ketmar wrote:
>> tbh, i think that it adds Yet Another Exception Rule to the language, and this does no good in the long run. [...] the more exceptions to general rules language has, the more it reminds Dragon Poker game from Robert Asprin books.
>
> Fair enough. A few counterpoints: This one special case is pretty self-contained. It doesn't impact code that doesn't use it, and the users most likely to hear about it are the ones who need to recover from Errors in their code. It doesn't introduce elaborate under-the-hood tricks (unlike DIP 1008*). It uses already-existing concepts (@safe and pure), and is in fact closer to the intuitive logic behind Error recovery than the current model; instead of "You can't recover from Errors" you have "You can't recover from Errors unless you flush all data that might have been affected by it".
>
> *Note that I am not making a statement for or against those DIPs. I'm only using them as examples to compare my proposal against. So while this would add feature creep to the language, I'd argue that feature creep would be pretty minor and well-contained, and would probably be worth the problem it would solve.

this still nullifies the sense of Error/Exception differences. not all errors are recoverable, even in @safe code. assuming that it is safe to catch any Error in @safe immediately turns it into unsafe. so... we will need to introduce a RecoverableInSafeCodeError class, and change the runtime to throw it instead of Error (sometimes). and even more issues follow (it's an avalanche of changes, and possible code breakage too). so, in the original form your idea turns @safe code into unsafe, and with more changes it becomes a real pain to implement, and adds more complexity to the language (another Dragon Poker modifier). using wrappers and carefully checking preconditions looks better to me. after all, if the programmer failed to check some preconditions, the worst thing to do is trying to hide that by masking errors. bombing out is *way* better, i believe, 'cause it forces the programmer to really fix the bugs instead of creating hackish workarounds.
Jun 05 2017
On Monday, 5 June 2017 at 13:13:01 UTC, ketmar wrote:
> this still nullifies the sense of Error/Exception differences. not all errors are recoverable, even in safe code. ... using wrappers and carefully checking preconditions looks better to me. after all, if programmer failed to check some preconditions, the worst thing to do is trying to hide that by masking errors. bombing out is *way* better, i believe, 'cause it forcing programmer to really fix the bugs instead of creating hackish workarounds.

I don't think this is a workaround, or that it goes against the purpose of Errors. The goal would still be to bomb out, cancel whatever you were doing, print a big red error message to the coder / user, and exit.

A program that catches an Error would not try to use the data that broke a contract; in fact, the program would not have access to the invalid data, since it would be thrown away. Its natural progression would be to log the error and quit whatever it was doing.

The point is, if the program needs to free system resources before shutting down, it could do so; or if the program is a server or a multi-threaded app dealing with multiple clients at the same time, those clients would not be affected by a crash unrelated to their data.
Jun 07 2017
On Monday, 5 June 2017 at 09:50:15 UTC, Olivier FAURE wrote:
> My proposal for solving these problems would be to explicitly allow catching Errors in @safe code IF the try block from which the Error is caught is perfectly pure.
>
> This would allow designing applications that can fail gracefully when dealing with multiple independent clients or tasks, even when one of the tasks has to be thrown away because of a programmer error.
>
> What do you think? Does the idea have merit? Should I make it into a DIP?

Pragmatic question: How much work do you think this will require? Because writing a generic wrapper that you can customize the fault behaviour for using DbI requires very little[1].

[1] https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/linear/array/dynamic.d
Jun 05 2017
On Monday, 5 June 2017 at 10:59:28 UTC, Moritz Maxeiner wrote:
> Pragmatic question: How much work do you think this will require?

Good question. I'm no compiler programmer, so I'm not sure what the answer is. I would say "probably a few days at most". The change is fairly self-contained, and built around existing concepts (mutability and safety); I think it would mostly be a matter of adding a function to the safety checks that tests whether a mutable reference to non-local data is used in any try block with catch (Error).

Another problem is that non-gc memory allocated in the try block would be irreversibly leaked when an Error is thrown (though now that I think about it, that would probably count as impure and be impossible anyway). Either way, it's not a safety risk, and the programmer can decide whether leaking memory is worse than brutally shutting down for their purpose.

> Because writing a generic wrapper that you can customize the fault behaviour for using DbI requires very little.

Using an array wrapper only covers part of the problem. Users may want their server to keep going even if they fail an assertion, or want the performance of nothrow code, or use a library that throws RangeError in very rare and hard-to-pinpoint cases. Arrays aside, I think there's some use in being able to safely recover from (or safely shut down after) the kind of broken contracts that throw Errors.
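[Editor's sketch of the rule described above. This is purely illustrative: catch (Error) in @safe code is rejected by current compilers, and the "only mutate data declared inside the try block" check does not exist anywhere; the snippet only shows what the hypothetical check would accept and reject.]

```d
string process(in string input) @safe
{
    try
    {
        // OK under the proposal: `local` is declared inside the try
        // block, so any corruption is confined to data we throw away.
        char[] local = input.dup;
        local[0] = 'X';      // bounds check may throw RangeError
        return local.idup;
    }
    catch (Error)            // rejected by today's @safe rules
    {
        return "fallback";
    }
}

void bad() @safe
{
    int outer;
    try
    {
        outer = 1;           // would be rejected by the proposed check:
                             // mutates data that outlives the try block
    }
    catch (Error) {}
}
```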
Jun 05 2017
On Monday, 5 June 2017 at 12:01:35 UTC, Olivier FAURE wrote:
> Another problem is that non-gc memory allocated in the try block would be irreversibly leaked when an Error is thrown (though now that I think about it, that would probably count as impure and be impossible anyway).

D considers allocating memory as pure[1].

> Either way, it's not a safety risk and the programmer can decide whether leaking memory is worse than brutally shutting down for their purpose.

Sure, but with regards to long running processes that are supposed to handle tens of thousands of requests, leaking memory (and continuing to run) will likely eventually end up brutally shutting down the process on out of memory errors. But yes, that is something that would have to be evaluated on a case by case basis.

> Using an array wrapper only covers part of the problem.

It *replaces* the hard coded assert Errors with flexible attests that can throw whatever you want (or even kill the process immediately); you just have to disable the runtime's internal bounds checks via `-boundscheck=off`.

> Users may want their server to keep going even if they fail an assertion

Normal assertions (other than assert(false)) are not present in -release mode; they are purely for debug mode.

> or want the performance of nothrow code

That's easily doable with the attest approach.

> or use a library that throws RangeError in very rare and hard to pinpoint cases.

Fix the library (or get it fixed if you don't have the code).

> Arrays aside, I think there's some use in being able to safely recover from (or safely shut down after) the kind of broken contracts that throw Errors.

I consider there to be value in allowing users to say "this is not a contract, it is a valid use case" (-> wrapper), but a broken contract being recoverable violates the entire concept of DbC.

[1] https://dlang.org/spec/function.html#pure-functions
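[Editor's note: a much-simplified sketch of the wrapper idea discussed above. The real libds code linked earlier uses a different, DbI-based interface; `CheckedArray` and `throwPolicy` below are hypothetical names invented for illustration. The point is that the out-of-bounds policy becomes a user-supplied parameter instead of a hard-coded RangeError.]

```d
// The out-of-bounds policy is a template parameter, so the user decides
// whether a bad index is a recoverable Exception, an immediate abort,
// or anything else.
struct CheckedArray(T, alias onOutOfBounds)
{
    private T[] data;

    this(T[] data) { this.data = data; }

    ref T opIndex(size_t i)
    {
        if (i >= data.length)
            onOutOfBounds(i, data.length); // user-chosen policy; must not return
        return data[i];
    }
}

// One possible policy: throw a plain Exception, which @safe code may catch.
void throwPolicy(size_t i, size_t len) @safe
{
    throw new Exception("index out of range");
}

unittest
{
    auto a = CheckedArray!(int, throwPolicy)([1, 2, 3]);
    assert(a[1] == 2);
    bool caught;
    try { a[5] = 0; } catch (Exception) { caught = true; }
    assert(caught); // recovered without tearing down the program
}
```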
Jun 05 2017
On Monday, 5 June 2017 at 12:59:11 UTC, Moritz Maxeiner wrote:
> D considers allocating memory as pure[1]. ... Sure, but with regards to long running processes that are supposed to handle tens of thousands of requests, leaking memory (and continuing to run) will likely eventually end up brutally shutting down the process on out of memory errors. But yes, that is something that would have to be evaluated on a case by case basis.

Note that in the case you describe, the alternative is either "Brutally shut down right now", or "Throw away some data, potentially some memory as well, and maybe brutally shut down later if that happens too often". (Although in the second case, there is also the trade-off that the leaking program "steals" memory from the other routines running on the same computer.)

Anyway, I don't think this would happen. Most forms of memory allocation are impure, and wouldn't be allowed in a try {} catch (Error) block; C's malloc() is pure, but C's free() isn't, so the thrown Error wouldn't be skipping over any calls to free(). Memory allocated by the GC would be reclaimed once the Error is caught and the data thrown away.

> I consider there to be value in allowing users to say "this is not a contract, it is a valid use case" (-> wrapper), but a broken contract being recoverable violates the entire concept of DbC.

I half-agree. There *should not* be a way to say "Okay, the contract is broken, but let's keep going anyway". There *should* be a way to say "Okay, the contract is broken, let's get rid of all data associated with it, log an error message to explain what went wrong, then kill *the specific thread/process/task* and let the others keep going".

The goal isn't to ignore or bypass Errors, it's to compartmentalize the damage.
Jun 07 2017
On Wednesday, 7 June 2017 at 15:35:56 UTC, Olivier FAURE wrote:
> Anyway, I don't think this would happen. Most forms of memory allocation are impure,

Not how pure is currently defined in D, see the referred spec; allocating memory is considered pure (even if it is impure with the theoretical pure definition). This is something that would need to be changed in the spec.

> There *should* be a way to say "okay, the contract is broken, let's get rid of all data associated with it, log an error message to explain what went wrong, then kill *the specific thread/process/task* and let the others keep going". The goal isn't to ignore or bypass Errors, it's to compartmentalize the damage.

The problem is that in current operating systems the finest scope/context of computation you can (safely) kill / compartmentalize the damage in, in order to allow the rest of the system to proceed, is a process (-> process isolation). Anything finer than that (threads, fibers, etc.) may or may not work in a particular use case, but you can't guarantee/prove that it works in the majority of use cases (which is what the runtime would have to be able to do if we were to allow that behaviour as the default). Compartmentalizing like this is your job as the programmer imho, not the job of the runtime.
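[Editor's sketch of the process-isolation approach described above, using std.process. The `./worker` program name is hypothetical; the idea is only that per-client work runs in its own process, so an Error that kills it cannot corrupt the other clients' state.]

```d
import std.process : spawnProcess, wait;
import std.stdio : writeln;

void handleClientIsolated(string clientId)
{
    // Run the per-client work in a separate OS process. If the worker
    // dies on an Error (non-zero exit), only this client's state is lost;
    // the parent and all other clients keep running.
    auto pid = spawnProcess(["./worker", clientId]);
    const status = wait(pid);
    if (status != 0)
        writeln("client ", clientId, " failed with status ", status);
}
```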
Jun 07 2017
On 06/05/2017 11:50 AM, Olivier FAURE wrote:
> - But memory corruption is super bad; if a proved error *might* be caused by memory corruption, then we must absolutely throw the potentially corrupted data away without using it.
> - Besides, even without memory corruption, the same argument applies to broken invariants; if we have data that breaks invariants, we need to throw it away and use it as little as possible.

[...]

> My proposal for solving these problems would be to explicitly allow catching Errors in @safe code IF the try block from which the Error is caught is perfectly pure. In other words, @safe functions would be allowed to catch Error after try blocks if the block only mutates data declared inside of it; the code would look like: [...]
>
> The point is, this is safe even when doSomethingWithData breaks an invariant or mutateMyData corrupts myData, because the compiler guarantees that the only data affected WILL be thrown away or otherwise inaccessible by the time catch (Error) is reached.

But `myData` is still alive when `catch (Error)` is reached, isn't it?

[...]

> What do you think? Does the idea have merit? Should I make it into a DIP?

How does `@trusted` fit into this? The premise is that there's a bug somewhere. You can't assume that the bug is in a `@system` function. It can just as well be in a `@trusted` one. And then `@safe` and `pure` mean nothing.
Jun 05 2017
On Monday, 5 June 2017 at 12:51:16 UTC, ag0aep6g wrote:
> On 06/05/2017 11:50 AM, Olivier FAURE wrote:
>> In other words, @safe functions would be allowed to catch Error after try blocks if the block only mutates data declared inside of it; [...]
>
> But `myData` is still alive when `catch (Error)` is reached, isn't it?

Good catch; yes, this example would refuse to compile; myData needs to be declared in the try block.

> How does `@trusted` fit into this? The premise is that there's a bug somewhere. You can't assume that the bug is in a `@system` function. It can just as well be in a `@trusted` one. And then `@safe` and `pure` mean nothing.

The point of this proposal is that catching Errors should be considered safe under certain conditions; code that catches Errors properly would be considered as safe as any other code, which is, "as safe as the @trusted code it calls".

I think the issue of @trusted is tangential to this. If you (or the writer of a library you use) are using @trusted to cast away pureness and then have side effects, you're already risking data corruption and undefined behavior, catching Errors or no catching Errors.
Jun 07 2017
On 06/07/2017 05:19 PM, Olivier FAURE wrote:
>> How does `@trusted` fit into this? The premise is that there's a bug somewhere. You can't assume that the bug is in a `@system` function. It can just as well be in a `@trusted` one. And then `@safe` and `pure` mean nothing.

I think I mistyped there. Makes more sense this way: "You can't assume that the bug is in a **`@safe`** function. It can just as well be in a `@trusted` one."

> The point of this proposal is that catching Errors should be considered safe under certain conditions; code that catches Errors properly would be considered as safe as any other code, which is, "as safe as the @trusted code it calls".

When no @trusted code is involved, then catching an out-of-bounds error from a @safe function is safe. No additional rules are needed. Assuming no compiler bugs, a @safe function simply cannot corrupt memory without calling @trusted code.

You gave the argument against catching out-of-bounds errors as: "it means an invariant is broken, which means the code surrounding it probably makes invalid assumptions and shouldn't be trusted." That line of reasoning applies to @trusted code. Only @trusted code can lose its trustworthiness. @safe code is guaranteed trustworthy (except for calls to @trusted code).

So the argument against catching out-of-bounds errors is that there might be misbehaving @trusted code. And for misbehaving @trusted code you can't tell the reach of the potential corruption by looking at the function signature.

> I think the issue of @trusted is tangential to this. If you (or the writer of a library you use) are using @trusted to cast away pureness and then have side effects, you're already risking data corruption and undefined behavior, catching Errors or no catching Errors.

It's not about intentional misuse of the @trusted attribute. @trusted functions must be safe. The point is that an out-of-bounds error implies a bug somewhere. If the bug is in @safe code, it doesn't affect safety at all. There is no explosion. But if the bug is in @trusted code, you can't determine how large the explosion is by looking at the function signature.
Jun 07 2017
On 06/07/2017 09:45 PM, ag0aep6g wrote:
> When no @trusted code is involved, then catching an out-of-bounds error from a @safe function is safe. No additional rules are needed. Assuming no compiler bugs, a @safe function simply cannot corrupt memory without calling @trusted code.

Thinking a bit more about this, I'm not sure if it's entirely correct. Can a @safe language feature throw an Error *after* corrupting memory? For example, could `a[i] = n;` write the value first and do the bounds check afterwards? There's probably a better example, if this kind of "shoot first, ask questions later" style ever makes sense.

If bounds checking could be implemented like that, you wouldn't be able to ever catch the resulting error safely. Wouldn't matter if it comes from @safe or @trusted code. Purity wouldn't matter either, because an arbitrary write like that doesn't care about purity.
Jun 07 2017
On Wednesday, 7 June 2017 at 19:45:05 UTC, ag0aep6g wrote:
> You gave the argument against catching out-of-bounds errors as: "it means an invariant is broken, which means the code surrounding it probably makes invalid assumptions and shouldn't be trusted." That line of reasoning applies to @trusted code. Only @trusted code can lose its trustworthiness. @safe code is guaranteed trustworthy (except for calls to @trusted code).

To clarify, when I said "shouldn't be trusted", I meant in the general sense, not in the memory safety sense. I think Jonathan M Davis put it nicely:

On Wednesday, 31 May 2017 at 23:51:30 UTC, Jonathan M Davis wrote:
> Honestly, once a memory corruption has occurred, all bets are off anyway. The core thing here is that the contract of indexing arrays was violated, which is a bug. If we're going to argue about whether it makes sense to change that contract, then we have to discuss the consequences of doing so, and I really don't see why whether a memory corruption has occurred previously is relevant. [...] In either case, the runtime has no way of determining the reason for the failure, and I don't see why passing a bad value to index an array is any more indicative of a memory corruption than passing an invalid day of the month to std.datetime's Date when constructing it is indicative of a memory corruption.

The sane way to protect against memory corruption is to write safe code, not code that *might* shut down brutally once memory corruption has already occurred. This is done by using @safe and proofreading all @trusted functions in your libs.

Contracts are made to preempt memory corruption, and to protect against *programming* errors; they're not recoverable because breaking a contract means that from now on the program is in a state that wasn't anticipated by the programmer. Which means the only way to handle them gracefully is to cancel what you were doing and go back to the pre-contract-breaking state, then produce a big, detailed error message and then exit / remove the thread / etc.

> The point is that an out-of-bounds error implies a bug somewhere. If the bug is in @safe code, it doesn't affect safety at all. There is no explosion. But if the bug is in @trusted code, you can't determine how large the explosion is by looking at the function signature.

I don't think there is much overlap between the problems that can be caused by faulty @trusted code and the problems that can be caught by Errors. Note that this is not a philosophical problem; I'm making an empirical claim: "Catching Errors would not open programs to memory safety attacks or accidental memory safety blunders that would not otherwise happen".

For instance, if some poorly-written @trusted function causes the size of an int[10] slice to be registered as 20, then your program becomes vulnerable to buffer overflows when you iterate over it; the buffer overflow will not throw any Error.

I'm not sure what the official stance is on this. As far as I'm aware, contracts and OOB checks are supposed to prevent memory corruption, not detect it. Any security based on detecting potential memory corruption can ultimately be bypassed by a hacker.
Jun 08 2017
On 06/08/2017 11:27 AM, Olivier FAURE wrote:
> Contracts are made to preempt memory corruption, and to protect against *programming* errors; they're not recoverable because breaking a contract means that from now on the program is in a state that wasn't anticipated by the programmer. Which means the only way to handle them gracefully is to cancel what you were doing and go back to the pre-contract-breaking state, then produce a big, detailed error message and then exit / remove the thread / etc.

I might get the idea now. The throwing code could be in the middle of some unsafe operation when it throws the out-of-bounds error. It would have cleaned up after itself, but it can't because of the (unexpected) error. Silly example:

----
void f(ref int* p) @trusted
{
    p = cast(int*) 13; /* corrupt stuff or partially initialize or whatever */
    int[] a;
    auto x = a[0]; /* trigger an out-of-bounds error */
    p = new int; /* would have cleaned up */
}
----

Catching the resulting error is safe when you throw the int* away. So if f is `pure` and you make sure that the arguments don't survive the `try` block, you're good, because f supposedly cannot have reached anything else. This is your proposal, right?

I don't think that's sound. At least, it clashes with another relatively recent development: https://dlang.org/phobos/core_memory.html#.pureMalloc

That's a wrapper around C's malloc. C's malloc might set the global errno, so it's impure. pureMalloc achieves purity by resetting errno to the value it had before the call. So a `pure` function may mess with global state, as long as it cleans it up. But when it's interrupted (e.g. by an out-of-bounds error), it may leave globals in an invalid state. So you can't assume that a `pure` function upholds its purity when it throws an error.

In the end, an error indicates that something is wrong, and probably all guarantees may be compromised.
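[Editor's sketch of the save/restore pattern described above. This is simplified, hypothetical code in the spirit of core.memory.pureMalloc, not its actual implementation; in particular the real function carries `pure` via compiler-blessed wrappers, which is omitted here.]

```d
import core.stdc.errno : errno;
import core.stdc.stdlib : malloc;

void* fakePureMalloc(size_t size) @system nothrow
{
    const saved = errno;   // remember the global state
    auto p = malloc(size); // may set errno (e.g. on allocation failure)
    errno = saved;         // restore it, so a caller on the normal return
                           // path cannot observe the side effect
    return p;
}
// If an Error were thrown between the malloc and the restore, errno would
// be left modified: the observable "purity" only holds on the normal
// return path, which is exactly the clash described above.
```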
Jun 08 2017
On Thursday, 8 June 2017 at 13:02:38 UTC, ag0aep6g wrote:
> Catching the resulting error is safe when you throw the int* away. So if f is `pure` and you make sure that the arguments don't survive the `try` block, you're good, because f supposedly cannot have reached anything else. This is your proposal, right?

Right.

> I don't think that's sound. At least, it clashes with another relatively recent development: https://dlang.org/phobos/core_memory.html#.pureMalloc
>
> That's a wrapper around C's malloc. C's malloc might set the global errno, so it's impure. pureMalloc achieves purity by resetting errno to the value it had before the call. So a `pure` function may mess with global state, as long as it cleans it up. But when it's interrupted (e.g. by an out-of-bounds error), it may leave globals in an invalid state. So you can't assume that a `pure` function upholds its purity when it throws an error.

That's true. A "pure after cleanup" function is incompatible with catching Errors (unless we introduce a "scope(error)" keyword that also runs on Errors, but that comes with other problems).

Is pureMalloc supposed to be representative of pure functions, or more of a special case? That's not a rhetorical question, I genuinely don't know. The spec says a pure function "does not read or write any global or static mutable state", which seems incompatible with "save a global, then write it back like it was". In fact, doing so seems contrary to the assumption that you can run any two pure functions on immutable / independent data at the same time and not have race conditions.

Actually, now I'm wondering whether pureMalloc & co handle potential race conditions at all, or just hope they don't happen.
Jun 08 2017
On 06/08/2017 04:02 PM, Olivier FAURE wrote:
> That's true. A "pure after cleanup" function is incompatible with catching Errors (unless we introduce a "scope(error)" keyword that also runs on errors, but that comes with other problems).
>
> Is pureMalloc supposed to be representative of pure functions, or more of a special case? That's not a rhetorical question, I genuinely don't know.

I think it's supposed to be just as pure as any other pure function. Here's the pull request that added it:

https://github.com/dlang/druntime/pull/1746

I don't see anything about it being special-cased in the compiler or such.

> The spec says a pure function "does not read or write any global or static mutable state", which seems incompatible with "save a global, then write it back like it was".

True. Something similar is going on with @safe. There's a list of things that are "not allowed in @safe functions" [1], but you can do all those things in @trusted code, of course. The list is about what the compiler rejects, not about what a @safe function can actually do. It might be the same with the things that pure functions can/cannot do.

I suppose the idea is that it cannot be observed that pureMalloc messes with global state, so it's ok. The assumption being that you don't catch errors.

By the way, with regards to purity and errors, `new` is the same as pureMalloc. When `new` throws an OutOfMemoryError and you catch it, you can see that errno has been set. Yet `new` is considered `pure`.

> In fact, doing so seems contrary to the assumption that you can run any two pure functions on immutable / independent data at the same time and you won't have race conditions.
>
> Actually, now I'm wondering whether pureMalloc & co handle potential race conditions at all, or just hope they don't happen.

Apparently errno is thread-local.

[1] https://dlang.org/spec/function.html#safe-functions
Jun 08 2017
On 6/5/17 5:50 AM, Olivier FAURE wrote:
> I recently skimmed the "Bad array indexing is considered deadly" thread, which discusses the "array OOB throws Error, which throws the whole program away" problem.

[snip]

> My proposal for solving these problems would be to explicitly allow to catch Errors in @safe code IF the try block from which the Error is caught is perfectly pure.

I don't think this will work. Only throwing Error makes a function nothrow. A nothrow function may not properly clean up the stack while unwinding. Not because the stack unwinding code skips over it, but because the compiler knows nothing can throw, and so doesn't include the cleanup code.

So this means, regardless of whether you catch an Error or not, the program may be in a state that is not recoverable.

Not to mention that only doing this for pure code eliminates usages that sparked the original discussion, as my code communicates with a database, and that wouldn't be allowed in pure code.

The only possible language change I can think of here is to have a third kind of Throwable type. Call it SafeError. A SafeError would be catchable only in @system or @trusted code. This means that @safe code would have to terminate, but any wrapping code that is calling the @safe code (such as the vibe.d framework) could catch it and properly handle the error, knowing that everything was properly cleaned up, and knowing that because we are in @safe code, there hasn't been a memory corruption (right?). Throwing a SafeError prevents a function from being marked nothrow. I can't see a way around this, unless we came up with another attribute (shudder).

Then we can change the compiler (runtime?) to throw SafeRangeError instead of RangeError inside @safe code.

All of this, I'm not proposing to do, because I don't see it being accepted. Creating a new array type which is used in my code will work, and avoids all the hassle of navigating the DIP system.

-Steve
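[A minimal sketch of what this proposed hierarchy might look like. Every name here is hypothetical -- nothing like SafeError exists in druntime; this only illustrates the shape of the idea.]

----
// Hypothetical: a third Throwable kind, catchable only in @system/@trusted.
class SafeError : Throwable
{
    this(string msg, string file = __FILE__, size_t line = __LINE__) @safe pure nothrow
    {
        super(msg, file, line);
    }
}

class SafeRangeError : SafeError
{
    this(string msg, string file = __FILE__, size_t line = __LINE__) @safe pure nothrow
    {
        super(msg, file, line);
    }
}

// Framework-level code (something like vibe.d's request loop) could then
// recover, while @safe user code would not be allowed to catch it:
void runHandler(scope void delegate() @safe handler) @system
{
    try
        handler();
    catch (SafeError e)
    {
        // log and tear down this request; the process lives on, trusting
        // that @safe code couldn't have corrupted memory
    }
}
----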
Jun 05 2017
On Monday, 5 June 2017 at 14:05:27 UTC, Steven Schveighoffer wrote:
> I don't think this will work. Only throwing Error makes a function nothrow. A nothrow function may not properly clean up the stack while unwinding. Not because the stack unwinding code skips over it, but because the compiler knows nothing can throw, and so doesn't include the cleanup code.

If the function is pure, then the only things it can set up will be stored on local or GC data, and it won't matter if they're not properly cleaned up, since they won't be accessible anymore.

I'm not 100% sure about that, though. Can a pure function do impure things in its scope(exit) / destructor code?

> Not to mention that only doing this for pure code eliminates usages that sparked the original discussion, as my code communicates with a database, and that wouldn't be allowed in pure code.

It would work for sending to a database; but you would need to use the functional programming idiom of "do 99% of the work in pure functions, then send the data to the remaining 1% for impure tasks". A process's structure would be:

- Read the inputs from the socket (impure, no catching errors)
- Parse them and transform them into database requests (pure)
- Send the requests to the database (impure)
- Parse / analyse / whatever the results (pure)
- Send the results to the socket (impure)

And okay, yeah, that list isn't realistic. Using functional programming idioms in real life programs can be a pain in the ass, and lead to convoluted callback-based scaffolding and weird data structures that you need to pass around a bunch of functions that don't really need them.

The point is, you could isolate the pure data-manipulating parts of the program from the impure IO parts, and encapsulate the former in Error-catching blocks (which is convenient, since those parts are likely to be more convoluted and harder to foolproof than the IO parts, and therefore likely to throw more Errors).
Then if an Error occurs, you can close the connection to the client (maybe send them an error packet beforehand), close the database file descriptor, log an error message, etc.
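[Under the proposal, that split might look something like the sketch below. `Request`, `parseRequest`, and `handleInput` are made-up names, and catching an Error this way relies on the *proposed* semantics, not on what D guarantees today.]

----
struct Request { string table; string key; }

// Pure data manipulation: touches only its argument and GC-allocated data.
Request parseRequest(const(char)[] input) @safe pure
{
    // slicing may throw a RangeError on malformed input
    return Request(input[0 .. 5].idup, input[6 .. $].idup);
}

void handleInput(const(char)[] input) @safe
{
    Request req;
    try
    {
        req = parseRequest(input); // the try block is perfectly pure
    }
    catch (Error e) // proposal: legal because nothing outside was touched
    {
        // reject this input, log an error message, keep the process alive
        return;
    }
    // impure part, outside the guarded block:
    // send `req` to the database, write the reply to the socket, ...
}
----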
Jun 07 2017
On 6/7/17 12:20 PM, Olivier FAURE wrote:
> On Monday, 5 June 2017 at 14:05:27 UTC, Steven Schveighoffer wrote:
>> I don't think this will work. Only throwing Error makes a function nothrow. A nothrow function may not properly clean up the stack while unwinding. Not because the stack unwinding code skips over it, but because the compiler knows nothing can throw, and so doesn't include the cleanup code.
>
> If the function is pure, then the only things it can set up will be stored on local or GC data, and it won't matter if they're not properly cleaned up, since they won't be accessible anymore.

Hm... if you locked an object that was passed in on the stack, for instance, there is no guarantee the object gets unlocked.

> I'm not 100% sure about that, though. Can a pure function do impure things in its scope(exit) / destructor code?

Even if it does pure things, that can cause problems.

>> Not to mention that only doing this for pure code eliminates usages that sparked the original discussion, as my code communicates with a database, and that wouldn't be allowed in pure code.
>
> It would work for sending to a database; but you would need to use the functional programming idiom of "do 99% of the work in pure functions, then send the data to the remaining 1% for impure tasks".

Even this still pushes the handling of the error onto the user. I want vibe.d to handle the error, in case I create a bug. But vibe.d can't possibly know what database things I'm going to do. And really this isn't possible. 99% of the work is using the database.

> A process's structure would be:
> - Read the inputs from the socket (impure, no catching errors)
> - Parse them and transform them into database requests (pure)
> - Send the requests to the database (impure)
> - Parse / analyse / whatever the results (pure)
> - Send the results to the socket (impure)
>
> And okay, yeah, that list isn't realistic. Using functional programming idioms in real life programs can be a pain in the ass, and lead to convoluted callback-based scaffolding and weird data structures that you need to pass around a bunch of functions that don't really need them. The point is, you could isolate the pure data-manipulating parts of the program from the impure IO parts; and encapsulate the former in Error-catching blocks (which is convenient, since those parts are likely to be more convoluted and harder to foolproof than the IO parts, therefore likely to throw more Errors).

Aside from the point that this still doesn't solve the problem (pure functions do cleanup too), this means a lot of headache for people who just want to write code. I'd much rather just write an array type and be done.

-Steve
Jun 08 2017
On Thursday, 8 June 2017 at 12:20:19 UTC, Steven Schveighoffer wrote:
> Hm... if you locked an object that was passed in on the stack, for instance, there is no guarantee the object gets unlocked.

This wouldn't be allowed unless the object was duplicated / created inside the try block.

> Aside from the point that this still doesn't solve the problem (pure functions do cleanup too), this means a lot of headache for people who just want to write code. I'd much rather just write an array type and be done.
>
> -Steve

Fair enough. There are other advantages to writing with "create data with pure functions then process it" idioms (easier to do unit tests, better for parallelism, etc), though.
Jun 08 2017
On 6/8/17 9:42 AM, Olivier FAURE wrote:
> On Thursday, 8 June 2017 at 12:20:19 UTC, Steven Schveighoffer wrote:
>> Hm... if you locked an object that was passed in on the stack, for instance, there is no guarantee the object gets unlocked.
>
> This wouldn't be allowed unless the object was duplicated / created inside the try block.

void foo(Mutex m, Data d) pure
{
    synchronized(m)
    {
        // ... manipulate d
    }
    // no guarantee m gets unlocked
}

-Steve
Jun 08 2017
On Thursday, 8 June 2017 at 14:13:53 UTC, Steven Schveighoffer wrote:
> void foo(Mutex m, Data d) pure
> {
>     synchronized(m)
>     {
>         // ... manipulate d
>     }
>     // no guarantee m gets unlocked
> }
>
> -Steve

Isn't synchronized(m) not nothrow?
Jun 08 2017
On 6/8/17 11:19 AM, Stanislav Blinov wrote:
> On Thursday, 8 June 2017 at 14:13:53 UTC, Steven Schveighoffer wrote:
>> void foo(Mutex m, Data d) pure
>> {
>>     synchronized(m)
>>     {
>>         // ... manipulate d
>>     }
>>     // no guarantee m gets unlocked
>> }
>
> Isn't synchronized(m) not nothrow?

You're right, it isn't. I actually didn't know that. Also forgot to make my function nothrow. Fixed:

void foo(Mutex m, Data d) pure nothrow
{
    try
    {
        synchronized(m)
        {
            // .. manipulate d
        }
    }
    catch(Exception) {}
}

-Steve
Jun 08 2017
I want to start by stating that the discussion around being able to throw Error from nothrow functions, and the compiler optimizations that follow, is important to the thoughts below.

The other aspect of array bounds checking is that those particular checks will not be added in -release. There has been much discussion around this already, and I do recall that the conclusion was that @safe code will retain the array bounds checks (I'm not sure if contracts were included in this). Thus if using -release and @safe, you'd be able to rely on having an Error to catch.

Now it might make sense for @safe code to throw an ArrayOutOfBounds Exception, but that would mean the function couldn't be marked as nothrow if array indexing is used. This is probably a terrible idea, but @safe nothrow functions could throw ArrayIndexError while @safe (without nothrow) could throw ArrayIndexException. It would really suck that adding nothrow would change the semantics silently.
Jun 08 2017