digitalmars.D - Bad array indexing is considered deadly
- Steven Schveighoffer (25/25) May 31 2017 I have discovered an annoyance in using vibe.d instead of another web
- H. S. Teoh via Digitalmars-d (12/31) May 31 2017 [...]
- Steven Schveighoffer (10/38) May 31 2017 Yes, I can likely do this. This kills any existing connections being
- Nick Sabalausky (Abscissa) (2/12) May 31 2017 Plus, relying on that strikes me as a DoS attack vector.
- Laeeth Isharc (13/60) Jun 01 2017 Hi Steve.
- aberba (3/18) Jun 02 2017 How does that setup affect response time? Do you cache large
- Laeeth Isharc (6/29) Jun 02 2017 Our world is very different from web world. Very few users but
- Steven Schveighoffer (6/15) Jun 02 2017 I think at some point, if vibe.d doesn't move in this direction, you
- Adam D. Ruppe (5/7) May 31 2017 I don't use vibe, but my cgi.d just catches RangeError, kills the
- Steven Schveighoffer (10/16) May 31 2017 There are a couple issues with this. At least from the perspective of
- ketmar (6/23) May 31 2017 that is, the question reduces to "should out-of-bounds be Error or Excep...
- Steven Schveighoffer (28/55) May 31 2017 That ship, unfortunately, has sailed. There is no reasonable migration
- Steven Schveighoffer (15/30) May 31 2017 Just realized, that @trusted escape is just so unnecessarily verbose.
- ketmar (3/34) May 31 2017 bonus point: you can include index and length in error message! (somethi...
- Nick Sabalausky (Abscissa) (14/16) May 31 2017 +1 million. I *hate* D's notion of Error. Well, no...more correctly, I
- Moritz Maxeiner (6/15) May 31 2017 To be fair, anything that can be handled in a sane&safe way
- Nick Sabalausky (Abscissa) (4/21) May 31 2017 Then out-of-bounds and assert failures should be Exception not Error.
- Moritz Maxeiner (5/10) May 31 2017 No, because as I stated in my other post, the runtime *cannot*
- Timon Gehr (2/14) May 31 2017 Hence all programs must abort on startup.
- H. S. Teoh via Digitalmars-d (7/13) May 31 2017 If D had *true* garbage collection, it would have done this upon
- Moritz Maxeiner (3/14) May 31 2017 I think vigil will be a perfect fit for you[1] ;p
- Moritz Maxeiner (4/19) May 31 2017 In the context of the conversation, and error has already
- Timon Gehr (3/22) May 31 2017 Bounds checks have /no business at all/ trying to handle preexisting
- Moritz Maxeiner (4/11) May 31 2017 Sure, because the program is in an undefined state by that point.
- Timon Gehr (13/23) May 31 2017 What does that even mean? Everything is perfectly well-defined here:
- Moritz Maxeiner (14/42) May 31 2017 That once memory corruption has occurred the state of the program
- Timon Gehr (5/13) Jun 01 2017 Yes, they would stop me from using a smaller scope. 'nothrow' functions
- Steven Schveighoffer (4/17) Jun 02 2017 By default yes, but...
- Moritz Maxeiner (19/22) May 31 2017 It is not that accessing the array out of bounds *leading* to
- Nick Sabalausky (Abscissa) (9/12) May 31 2017 Of course not, that's absurd. Where do people get the idea that
- Moritz Maxeiner (16/28) May 31 2017 You assume something I did not write. What I wrote is that the
- Nick Sabalausky (Abscissa) (33/62) May 31 2017 Like I said, *anything* could be the result of data corruption. (And
- Steven Schveighoffer (15/32) May 31 2017 To be blunt, no this is completely wrong. Memory corruption *already
- Ali Çehreli (12/33) May 31 2017 True.
- Ola Fosheim Grøstad (3/7) May 31 2017 How is this different from a file system exception?
- Ali Çehreli (15/21) May 31 2017 When you say "memory" I think you refer to the thought of bounds
- Ola Fosheim Grøstad (21/26) Jun 01 2017 That's true, but that could be the case with file system
- H. S. Teoh via Digitalmars-d (32/42) May 31 2017 [...]
- Moritz Maxeiner (15/20) May 31 2017 While I agree on a theoretical level about the fact that in
- Steven Schveighoffer (5/20) May 31 2017 Again, there has not been memory corruption. There is a confusion
- Moritz Maxeiner (18/23) May 31 2017 Again, the runtime *cannot* know that and hence you *cannot*
- Timon Gehr (10/35) May 31 2017 No, it is perfectly safe, because the language does not guarantee any
- Moritz Maxeiner (11/26) May 31 2017 The language not guaranteeing a specific behaviour on memory
- Jonathan M Davis via Digitalmars-d (24/32) May 31 2017 Honestly, once a memory corruption has occurred, all bets are off anyway...
- Moritz Maxeiner (17/49) May 31 2017 Right, and that is why termination when in doubt (and the
- Steven Schveighoffer (31/51) May 31 2017 Yes, it cannot know at any point whether or not a memory corruption has
- Moritz Maxeiner (28/70) May 31 2017 Because assuming the worst is a sane default.
- Steven Schveighoffer (24/58) May 31 2017 But the program cannot possibly know which variable is an index. So it
- Kagamin (4/8) Jun 01 2017 Other systems work like this: an internal server error is
- Moritz Maxeiner (22/61) May 31 2017 I disagree.
- Nick Sabalausky (Abscissa) (4/9) May 31 2017 This is why the runtime needs to guarantee that normal unwinding/cleanup...
- Jonathan M Davis via Digitalmars-d (23/30) May 31 2017 It is my understanding that with how nothrow is implemented, that's not
- Walter Bright (5/10) May 31 2017 Everything about a network is unreliable, so any reliable system must ha...
- Nick Sabalausky (Abscissa) (1/38) May 31 2017
- Nick Sabalausky (Abscissa) (6/12) May 31 2017 Honestly, I really think that if there is need to wrap something as
- Jonathan M Davis via Digitalmars-d (26/38) May 31 2017 Using an Exception to signal a programming bug and then potentially tryi...
- Nick Sabalausky (Abscissa) (14/47) May 31 2017 Exeption thrown != "OMG NOTHING ABOUT ANY BRANCH OF THE PROGRAM CAN BE
- Jonathan M Davis via Digitalmars-d (36/45) May 31 2017 Indexing an array with an invalid index is the same as violating any
- Ola Fosheim Grøstad (19/31) Jun 01 2017 Well, if you take this position then you should not only crash
- Kagamin (6/10) Jun 01 2017 Sad reality is that d programmers are still comfortable writing
- H. S. Teoh via Digitalmars-d (7/10) Jun 01 2017 Huh? There is no void* in that bug report, and it was closed 3 years
- Moritz Maxeiner (18/21) Jun 03 2017 After some consideration you can now find the (dynamic) array
- Jonathan M Davis via Digitalmars-d (41/61) May 31 2017 I don't think that you even need to worry about whether memory corruptio...
- Moritz Maxeiner (11/35) May 31 2017 That is correct (and that was even mentioned in the OP), but from
- Steven Schveighoffer (25/88) Jun 01 2017 Yes, it's definitely a bug, and that is not something I'm arguing
- rikki cattermole (5/5) Jun 01 2017 I'm just sitting here waiting for shared libraries to be properly
- Jonathan M Davis via Digitalmars-d (17/24) Jun 01 2017 Honestly, unless something about vibe.d prevents fixing bugs like bad ar...
- Adam D. Ruppe (9/13) Jun 01 2017 If you control the deployment, it works perfectly well. You
- Jacob Carlborg (9/13) Jun 01 2017 You can do a combination of both. One request per fiber and as many
- aberba (5/17) Jun 01 2017 I'm glad I know enough to know this is an opinion...
- aberba (3/23) Jun 01 2017 Here is Daemonise
- Steven Schveighoffer (10/27) Jun 02 2017 Don't get me wrong, I think D will be better than other frameworks for
- Adam D. Ruppe (8/11) Jun 02 2017 Correction: "vibe.d frameworks" are fragile. This isn't D
- Timon Gehr (2/12) Jun 02 2017 I'm not convinced that public perception is sensitive to such details. ;...
- Moritz Maxeiner (20/23) May 31 2017 Sorry for double post, but - after thinking more about this - I
- Steven Schveighoffer (4/14) May 31 2017 Nope, an autonomous system did not type out my code that caused the out
- Moritz Maxeiner (3/5) May 31 2017 Same as the human who typed out the code of the autonomous system.
- Kagamin (4/7) May 31 2017 On windows you can set up service restart settings in case it
- Steven Schveighoffer (5/11) May 31 2017 That *would* be a feature on Windows ;)
- Moritz Maxeiner (3/5) May 31 2017 OT: *with whatever process supervisor floats your boat.
- Daniel Kozak via Digitalmars-d (5/20) Jun 01 2017 [Service]
- Steven Schveighoffer (3/6) Jun 01 2017 Thanks!
- John Colvin (32/63) May 31 2017 What things are considered unrecoverable errors or not is
- Brad Roberts via Digitalmars-d (9/13) May 31 2017 This.. exactly this. I've worked on software from the tiny device level...
- Walter Bright (25/32) May 31 2017 Since you don't know where the bad index came from, such a conclusion ca...
- Guillaume Piolat (9/15) Jun 01 2017 +1
- Ola Fosheim Grøstad (4/7) Jun 01 2017 No. You don't want to crash immediately. In fact, you want to
- Guillaume Piolat (3/12) Jun 01 2017 Solved by auto-saving, _before_ the crash
- H. S. Teoh via Digitalmars-d (10/21) Jun 01 2017 Yes. Saving *after* a crash was detected is stupid, because you no
- Walter Bright (3/9) Jun 01 2017 An even better idea is to use rolling backups, with the crash recovery b...
- Ola Fosheim Grøstad (2/3) Jun 02 2017 That only works for simple applications.
- Steven Schveighoffer (30/62) Jun 01 2017 You could say that about any error. You could say that about malformed
- Jonathan M Davis via Digitalmars-d (50/68) Jun 01 2017 I think that it really comes down to what the contract is and how it mak...
- Walter Bright (4/11) Jun 01 2017 It is a programming bug to not validate the input. It's not that bad to ...
- Timon Gehr (5/21) Jun 01 2017 They should be treated as bugs, but isn't it plausible that there are
- Walter Bright (10/20) Jun 01 2017 The stages of programming expertise:
- Timon Gehr (26/55) Jun 01 2017 This does not really say anything about programming expertise, it says
- Walter Bright (2/4) Jun 01 2017 C quality code is straightforward in D. Just mark it @system.
- Timon Gehr (2/7) Jun 01 2017 I don't know what this is, but it is not an answer to my post.
- Paolo Invernizzi (19/61) Jun 01 2017 Everything coming as an input of the _process_ should be
- Timon Gehr (10/24) Jun 01 2017 You seem to not understand what happened. There was a single server
- Paolo Invernizzi (15/39) Jun 01 2017 I really understand what is happening: I've a vibe.d server
- aberba (3/9) Jun 01 2017 Pretty much it. Containerisation of several stateless instances
- Steven Schveighoffer (14/24) Jun 02 2017 If only that is what happened, I would not have started this thread!
- Arafel (32/50) Jun 02 2017 Hi,
- Steven Schveighoffer (12/22) Jun 02 2017 I don't think this is workable, simply because of nothrow. An Error is
- Arafel (10/25) Jun 02 2017 Well, as I understood from this thread this is already possible in debug...
- Steven Schveighoffer (5/31) Jun 02 2017 Yes, of course. This is a non-starter if you need to compile release
- John Colvin (39/89) Jun 01 2017 I think the idea is that no, array overflows can never be caused
- Stanislav Blinov (3/7) Jun 01 2017 Oh yes, there is a way:
- John Colvin (2/9) Jun 01 2017 Sure, @safe has some holes as it currently stands.
- Jonathan M Davis via Digitalmars-d (9/19) Jun 01 2017 It's far better than nothing, but it definitely has holes. DIP 1000 is
- Walter Bright (3/5) Jun 01 2017 Please post bug reports to bugzilla. Posting them only on the n.g. prett...
- Stanislav Blinov (2/7) Jun 01 2017 Please look at the very first post of that thread :\
- H. S. Teoh via Digitalmars-d (13/21) Jun 01 2017 [...]
- Walter Bright (34/45) Jun 01 2017 What's missing here is looking carefully at a program and deciding what ...
- H. S. Teoh via Digitalmars-d (32/44) Jun 01 2017 +1. I think this is the root of the problem. Data that comes from
- cym13 (15/25) Jun 01 2017 I'm not familiar with the idea, do we need more than the
- H. S. Teoh via Digitalmars-d (123/150) Jun 01 2017 [...]
- cym13 (3/4) Jun 01 2017 Now that I think about it, what we really want going that way is
- Walter Bright (4/7) Jun 01 2017 Found it:
- Dukc (11/16) Jun 01 2017 I think he understood all that already. Array overflow is a sign
- Steven Schveighoffer (33/39) Jun 02 2017 I think it's important to state that no, I wasn't relying on array
- John Carter (32/41) May 31 2017 In this case it is fairly obvious where the bad index is coming
- H. S. Teoh via Digitalmars-d (31/40) May 31 2017 [...]
- Paolo Invernizzi (4/10) Jun 01 2017 That's exactly the point: to use the right tool for the
- Timon Gehr (2/16) Jun 01 2017 There is no such tool.
- Jacob Carlborg (9/10) Jun 01 2017 In this case, Erlang is a pretty good candidate. It's using green
- Paolo Invernizzi (3/19) Jun 01 2017 Process isolation was exactly crafted for that.
- Vladimir Panteleev (28/31) Jun 01 2017 Since I wrote/run a bunch of websites/network services written in
- Walter Bright (4/10) Jun 01 2017 This is the best advice.
- Steven Schveighoffer (12/23) Jun 01 2017 Indeed it is good advice. I'm thinking actually a good setup is to have
- Martin Tschierschke (9/12) Jun 01 2017 Is this option useful for you?
- Nick Sabalausky (Abscissa) (3/17) Jun 01 2017 All that would do is *cause* corruption due to the way the runtime
- Andrei Alexandrescu (21/53) Jun 02 2017 This is a meaningful concern. People use threads instead of processes
- Joseph Rushton Wakeling (11/13) Jun 04 2017 Ideally, fiber, as well. Probably the real ideal for this sort
- Jacob Carlborg (7/14) Jun 04 2017 Erlang has the philosophy of share nothing between processes (green
- Paolo Invernizzi (5/23) Jun 04 2017 If I'm not wrong, it also uses a VM, also if there's the
- Jacob Carlborg (4/7) Jun 04 2017 Yes, it's running on a VM, the Beam.
- Ola Fosheim Grøstad (12/15) Jun 04 2017 Not sure if I follow that. If you only use safe code then there
- Joseph Rushton Wakeling (6/10) Jun 04 2017 Indeed. (I used 'task' here in a deliberately vague sense, in
- nohbdy (36/36) Jun 02 2017 I'm using D to write an RSS reader.
- Paolo Invernizzi (5/12) Jun 02 2017 The worst thing happened in programming in the last 30 years is
- Ola Fosheim Grøstad (17/20) Jun 03 2017 Really?
- Paolo Invernizzi (19/40) Jun 03 2017 It doesn't seems to me that the trends to try to handle somehow,
- Ola Fosheim Grøstad (35/43) Jun 03 2017 That all depends. It makes perfect sense in a "strongly pure"
- Ola Fosheim Grøstad (10/10) Jun 03 2017 Anyway, all of this boils down to the question of whether D
- Paolo Invernizzi (20/49) Jun 03 2017 Sorry Ola, I can't support that way of working.
- Ola Fosheim Grøstad (34/48) Jun 03 2017 If the compiler is broken then anything could happen, at any
- Timon Gehr (10/26) Jun 03 2017 I don't get why you would /restart/ mission-critical software that has
- Ola Fosheim Grøstad (10/20) Jun 03 2017 Yes, mission critical software such as flight control are (and
- Paolo Invernizzi (16/45) Jun 03 2017 That's what should be done in mission-critical software, and we
- Timon Gehr (13/65) Jun 03 2017 That document says that the crash was caused by a component going down
I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.

For example:

    int[3] arr;
    arr[3] = 5;

Compare this to, let's say, a malformed unicode string (exception), malformed JSON data (exception), file not found (exception), etc.

Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served. This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm.

This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error. And vibe.d has no choice. There is no guarantee the stack is properly unwound, so it has to accept the D runtime's characterization of this as a program-ending error.

I am considering writing a set of array wrappers that throw exceptions when trying to access out of bounds elements. This comes with its own set of problems, but at least the web server should continue to run.

What are your thoughts? Have you run into this? If so, how did you solve it?

-Steve
May 31 2017
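The distinction the post relies on can be shown in a few lines. The following is only a minimal sketch: the helper name `classify` is made up for illustration, and it assumes a build with bounds checks enabled (the default for non-release builds). The second `catch` is there purely to make the class hierarchy visible; catching an Error is not something the language promises is safe.

```d
import core.exception : RangeError;

// RangeError sits under Error, not under Exception.
static assert(is(RangeError : Error));
static assert(!is(RangeError : Exception));

/// Hypothetical helper: reports which kind of Throwable an index triggers.
string classify(size_t idx)
{
    int[3] arr;
    try
    {
        arr[idx] = 5;       // bounds-checked at run time
        return "ok";
    }
    catch (Exception e)     // what a typical request handler would catch
    {
        return "Exception";
    }
    catch (RangeError e)    // demonstration only: Errors are not meant to be recovered from
    {
        return "RangeError";
    }
}
```

A handler that only catches Exception therefore never sees the failed bounds check; the Error propagates past it.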
On Wed, May 31, 2017 at 09:04:52AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.
>
> For example:
>
>     int[3] arr;
>     arr[3] = 5;
>
> Compare this to, let's say, a malformed unicode string (exception), malformed JSON data (exception), file not found (exception), etc.
>
> Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served. This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm.
[...]

Isn't it customary to have the webserver launched by a script that restarts it whenever it crashes (after logging a message in an emergency logfile)? Not an ideal solution, I know, but at least it minimizes downtime.

On another note, why didn't the compiler reject the above code? I thought it checks static arrays bounds at compile time whenever possible. Did I remember wrong?

T

-- 
Change is inevitable, except from a vending machine.
May 31 2017
On 5/31/17 9:21 AM, H. S. Teoh via Digitalmars-d wrote:
> On Wed, May 31, 2017 at 09:04:52AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>> I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.
>>
>> For example:
>>
>>     int[3] arr;
>>     arr[3] = 5;
>>
>> Compare this to, let's say, a malformed unicode string (exception), malformed JSON data (exception), file not found (exception), etc.
>>
>> Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served. This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm.
> [...]
>
> Isn't it customary to have the webserver launched by a script that restarts it whenever it crashes (after logging a message in an emergency logfile)? Not an ideal solution, I know, but at least it minimizes downtime.

Yes, I can likely do this. This kills any existing connections being handled though, and is far far from ideal. It's also a hard crash, any operations such as writing DB data are killed mid-stream.

But you won't win over any minds that are used to php or python with this workaround.

> On another note, why didn't the compiler reject the above code? I thought it checks static arrays bounds at compile time whenever possible. Did I remember wrong?

I'm not sure, it's a toy example. In the real bug, the index was a variable.

The annoying thing about this is that there is no actual memory corruption. It was properly stopped.

-Steve
May 31 2017
On 05/31/2017 09:34 AM, Steven Schveighoffer wrote:
> On 5/31/17 9:21 AM, H. S. Teoh via Digitalmars-d wrote:
>> Isn't it customary to have the webserver launched by a script that restarts it whenever it crashes (after logging a message in an emergency logfile)? Not an ideal solution, I know, but at least it minimizes downtime.
>
> Yes, I can likely do this. This kills any existing connections being handled though, and is far far from ideal. It's also a hard crash, any operations such as writing DB data are killed mid-stream.

Plus, relying on that strikes me as a DoS attack vector.
May 31 2017
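The restart-on-crash approach discussed above could be sketched in D itself, roughly as follows. This is an illustration under stated assumptions rather than anything proposed in the thread: the function name `supervise`, the restart cap, and the fixed pause are all hypothetical, and a real deployment would more likely rely on an existing process supervisor. The cap is there so that a request which reliably triggers the bug cannot keep the server in an endless restart loop; it does nothing for the connections that are dropped when the process dies.

```d
import core.thread : Thread;
import core.time : Duration, msecs;
import std.process : spawnProcess, wait;
import std.stdio : stderr;

/// Hypothetical supervisor: reruns `cmd` until it exits cleanly or the
/// restart budget is used up. Returns the number of restarts performed.
int supervise(string[] cmd, int maxRestarts, Duration pause = 100.msecs)
{
    int restarts = 0;
    for (;;)
    {
        auto pid = spawnProcess(cmd);
        const status = wait(pid);       // blocks until the child terminates
        if (status == 0)
            return restarts;            // clean shutdown, nothing to restart
        stderr.writefln("server exited with status %s", status);
        if (restarts == maxRestarts)
            return restarts;            // give up instead of looping forever
        ++restarts;
        Thread.sleep(pause);            // crude fixed delay between attempts
    }
}
```

Calling `supervise(["./server"], 5)` would, under these assumptions, start a (hypothetical) `./server` binary and restart it at most five times after abnormal exits.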
On Wednesday, 31 May 2017 at 13:34:25 UTC, Steven Schveighoffer wrote:
> On 5/31/17 9:21 AM, H. S. Teoh via Digitalmars-d wrote:
>> On Wed, May 31, 2017 at 09:04:52AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>>> I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.
>>>
>>> For example:
>>>
>>>     int[3] arr;
>>>     arr[3] = 5;
>>>
>>> Compare this to, let's say, a malformed unicode string (exception), malformed JSON data (exception), file not found (exception), etc.
>>>
>>> Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served. This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm.
>> [...]
>>
>> Isn't it customary to have the webserver launched by a script that restarts it whenever it crashes (after logging a message in an emergency logfile)? Not an ideal solution, I know, but at least it minimizes downtime.
>
> Yes, I can likely do this. This kills any existing connections being handled though, and is far far from ideal. It's also a hard crash, any operations such as writing DB data are killed mid-stream.
> ..
> -Steve

Hi Steve.

Had similar problems early on. We used supervisord to automatically keep a pool of vibed applications running and put nginx in front as a load balancer. User session info stored in redis. And a separate process for data communicating with web server over nanomsg. Zeromq is more mature but I found sometimes socket could get into an inconsistent state if servers crashed midway, and nanomsg doesn't have this problem. So data update either succeeds or fails but no corruption if Web server crashes.

Maybe better ways but it seems to be okay for us.

Laeeth
Jun 01 2017
On Friday, 2 June 2017 at 02:11:34 UTC, Laeeth Isharc wrote:
> On Wednesday, 31 May 2017 at 13:34:25 UTC, Steven Schveighoffer wrote:
>> [...]
>
> Hi Steve.
>
> Had similar problems early on. We used supervisord to automatically keep a pool of vibed applications running and put nginx in front as a load balancer. User session info stored in redis. And a separate process for data communicating with web server over nanomsg. Zeromq is more mature but I found sometimes socket could get into an inconsistent state if servers crashed midway, and nanomsg doesn't have this problem. So data update either succeeds or fails but no corruption if Web server crashes.
>
> Maybe better ways but it seems to be okay for us.
>
> Laeeth

How does that setup affect response time? Do you cache large query results in redis?
Jun 02 2017
On Friday, 2 June 2017 at 10:37:09 UTC, aberba wrote:
> On Friday, 2 June 2017 at 02:11:34 UTC, Laeeth Isharc wrote:
>> On Wednesday, 31 May 2017 at 13:34:25 UTC, Steven Schveighoffer wrote:
>>> [...]
>>
>> Hi Steve.
>>
>> Had similar problems early on. We used supervisord to automatically keep a pool of vibed applications running and put nginx in front as a load balancer. User session info stored in redis. And a separate process for data communicating with web server over nanomsg. Zeromq is more mature but I found sometimes socket could get into an inconsistent state if servers crashed midway, and nanomsg doesn't have this problem. So data update either succeeds or fails but no corruption if Web server crashes.
>>
>> Maybe better ways but it seems to be okay for us.
>>
>> Laeeth
>
> How does that setup affect response time? Do you cache large query results in redis?

Our world is very different from web world. Very few users but incredibly high value. If we have twenty users then for most things that's a lot. We don't cache query results as it's fast enough and the data retrieval bit is not where the bottleneck is.

Laeeth
Jun 02 2017
On 6/1/17 10:11 PM, Laeeth Isharc wrote:
> Had similar problems early on. We used supervisord to automatically keep a pool of vibed applications running and put nginx in front as a load balancer. User session info stored in redis. And a separate process for data communicating with web server over nanomsg. Zeromq is more mature but I found sometimes socket could get into an inconsistent state if servers crashed midway, and nanomsg doesn't have this problem. So data update either succeeds or fails but no corruption if Web server crashes.
>
> Maybe better ways but it seems to be okay for us.

I think at some point, if vibe.d doesn't move in this direction, you will see a popular setup that wraps vibe.d along these lines. I imagined a similar solution earlier: https://forum.dlang.org/post/ogq7nd$ccj$1@digitalmars.com

-Steve
Jun 02 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
> What are your thoughts? Have you run into this? If so, how did you solve it?

I don't use vibe, but my cgi.d just catches RangeError, kills the individual connection, and lets the others carry on. Can you do the same thing?
May 31 2017
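The per-connection approach described above might look roughly like the sketch below. It is not code from cgi.d: the `Handler` alias and the `serveAll` function are hypothetical stand-ins for a real dispatch loop, and it assumes bounds checks are enabled. Note that the language does not promise that destructors or `scope(exit)` blocks run while an Error unwinds, so this is a pragmatic workaround rather than a guaranteed-safe pattern.

```d
import core.exception : RangeError;

/// Hypothetical per-connection handler type.
alias Handler = void delegate();

/// Hypothetical dispatch loop: runs every handler and returns how many
/// completed without tripping a bounds check.
size_t serveAll(Handler[] connections)
{
    size_t completed;
    foreach (handle; connections)
    {
        try
        {
            handle();
            ++completed;
        }
        catch (RangeError e)
        {
            // Drop only this connection and carry on with the rest.
            // Cleanup inside the failed handler is not guaranteed to have run.
        }
    }
    return completed;
}
```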
On 5/31/17 9:37 AM, Adam D. Ruppe wrote:
> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>> What are your thoughts? Have you run into this? If so, how did you solve it?
>
> I don't use vibe, but my cgi.d just catches RangeError, kills the individual connection, and lets the others carry on. Can you do the same thing?

There are a couple issues with this. At least from the perspective of vibe.d attempting to be a mainstream base library.

1. You can mark a function nothrow that throws a RangeError. So the compiler is free to assume the function won't throw and build faster code that won't properly clean up if an Error is thrown.

2. Technically, there is no guarantee by the runtime to unwind the stack. So at some point, your workaround may not even work. And even if it does, things like RAII may not work.

-Steve
May 31 2017
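The first point can be checked at compile time. The sketch below uses two made-up functions, `pick` and `pickChecked`, to show that built-in indexing is accepted inside `nothrow` code (a failed bounds check raises an Error, which `nothrow` does not cover), whereas a check that throws an Exception is rejected there.

```d
/// Built-in indexing: a failed bounds check raises RangeError, which is an
/// Error, so the function may still be declared nothrow.
int pick(const int[] a, size_t i) nothrow @safe
{
    return a[i];
}

/// Hypothetical Exception-throwing variant: this one cannot be nothrow.
int pickChecked(const int[] a, size_t i) @safe
{
    if (i >= a.length)
        throw new Exception("Index out of bounds");
    return a[i];
}

// A nothrow caller compiles with built-in indexing...
static assert( __traits(compiles, () nothrow { int[] a = [1]; return pick(a, 0); }));
// ...but is rejected once the index check throws an Exception.
static assert(!__traits(compiles, () nothrow { int[] a = [1]; return pickChecked(a, 0); }));
```

This is also why turning out-of-bounds access into an Exception across the language would break existing `nothrow` code that indexes arrays.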
Steven Schveighoffer wrote:
> On 5/31/17 9:37 AM, Adam D. Ruppe wrote:
>> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>>> What are your thoughts? Have you run into this? If so, how did you solve it?
>>
>> I don't use vibe, but my cgi.d just catches RangeError, kills the individual connection, and lets the others carry on. Can you do the same thing?
>
> There are a couple issues with this. At least from the perspective of vibe.d attempting to be a mainstream base library.
>
> 1. You can mark a function nothrow that throws a RangeError. So the compiler is free to assume the function won't throw and build faster code that won't properly clean up if an Error is thrown.
>
> 2. Technically, there is no guarantee by the runtime to unwind the stack. So at some point, your workaround may not even work. And even if it does, things like RAII may not work.
>
> -Steve

that is, the question reduces to "should out-of-bounds be Error or Exception"?

i myself see no easy way to customize this with language attribute (new/delete disaster immediately comes to mind). so i'd say: "create your own array wrapper/implementation, and hope that all the functions you need are rangified, so they'll be able to work with YourArray".
May 31 2017
On 5/31/17 9:54 AM, ketmar wrote:
> Steven Schveighoffer wrote:
>> On 5/31/17 9:37 AM, Adam D. Ruppe wrote:
>>> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>>>> What are your thoughts? Have you run into this? If so, how did you solve it?
>>>
>>> I don't use vibe, but my cgi.d just catches RangeError, kills the individual connection, and lets the others carry on. Can you do the same thing?
>>
>> There are a couple issues with this. At least from the perspective of vibe.d attempting to be a mainstream base library.
>>
>> 1. You can mark a function nothrow that throws a RangeError. So the compiler is free to assume the function won't throw and build faster code that won't properly clean up if an Error is thrown.
>>
>> 2. Technically, there is no guarantee by the runtime to unwind the stack. So at some point, your workaround may not even work. And even if it does, things like RAII may not work.
>
> that is, the question reduces to "should out-of-bounds be Error or Exception"?

That ship, unfortunately, has sailed. There is no reasonable migration path, as every function that uses indexing can currently be marked nothrow, and would stop compiling in one way or another. In other words mass breakage of every project would likely happen.

> i myself see no easy way to customize this with language attribute (new/delete disaster immediately comes to mind). so i'd say: "create your own array wrapper/implementation, and hope that all the functions you need are rangified, so they'll be able to work with YourArray".

I have, and it seems to work OK for my purposes (and wasn't really that bad actually).
Here is complete implementation (should be safe too):

    struct ExArr(T, size_t dim)
    {
        T[dim] _value;
        alias _value this;
        ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t linenum = __LINE__) inout
        {
            if(idx >= dim)
                throw new Exception("Index out of bounds", fname, linenum);
            static ref x(ref inout(T[dim]) val, size_t i) @trusted { return val.ptr[i]; }
            return x(_value, idx);
        }
    }

Now, I just need to search and replace for all the cases where I have a static array...

A dynamic array replacement shouldn't be too difficult either. Just need to override opIndex and opSlice. Then I can override those in my static array implementation as well.

-Steve
May 31 2017
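The dynamic array replacement mentioned at the end of the post is not shown in the thread. A minimal sketch of what it could look like, assuming the same approach as `ExArr`, follows; the names `ExSlice` and `indexThrows` are hypothetical, and `$` inside the brackets is not supported by this version.

```d
/// Hypothetical dynamic-array counterpart to ExArr; not from the thread.
struct ExSlice(T)
{
    T[] _value;
    alias _value this;   // length, foreach, etc. still come from the wrapped slice

    ref inout(T) opIndex(size_t idx, string fname = __FILE__,
                         size_t linenum = __LINE__) inout @trusted
    {
        if (idx >= _value.length)
            throw new Exception("Index out of bounds", fname, linenum);
        return _value.ptr[idx];          // bounds already checked above
    }

    inout(T)[] opSlice(size_t lo, size_t hi) inout @trusted
    {
        if (lo > hi || hi > _value.length)
            throw new Exception("Slice out of bounds");
        return _value.ptr[lo .. hi];     // bounds already checked above
    }
}

/// Returns true if indexing `s` at `idx` throws a recoverable Exception.
bool indexThrows(T)(ExSlice!T s, size_t idx)
{
    try
    {
        cast(void) s[idx];
        return false;
    }
    catch (Exception)
    {
        return true;
    }
}
```

With this in place, `ExSlice!int([10, 20, 30])[3]` raises an ordinary Exception that a request handler can catch, instead of a RangeError.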
On 5/31/17 10:07 AM, Steven Schveighoffer wrote:
> Here is complete implementation (should be safe too):
>
>     struct ExArr(T, size_t dim)
>     {
>         T[dim] _value;
>         alias _value this;
>         ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t linenum = __LINE__) inout
>         {
>             if(idx >= dim)
>                 throw new Exception("Index out of bounds", fname, linenum);
>             static ref x(ref inout(T[dim]) val, size_t i) @trusted { return val.ptr[i]; }
>             return x(_value, idx);
>         }
>     }

Just realized, that @trusted escape is just so unnecessarily verbose.

    struct ExArr(T, size_t dim)
    {
        T[dim] _value;
        alias _value this;
        ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t linenum = __LINE__) inout @trusted
        {
            if(idx >= dim)
                throw new Exception("Index out of bounds", fname, linenum);
            return _value.ptr[idx];
        }
    }

-Steve
May 31 2017
Steven Schveighoffer wrote:
> On 5/31/17 10:07 AM, Steven Schveighoffer wrote:
>> Here is complete implementation (should be safe too):
>>
>>     struct ExArr(T, size_t dim)
>>     {
>>         T[dim] _value;
>>         alias _value this;
>>         ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t linenum = __LINE__) inout
>>         {
>>             if(idx >= dim)
>>                 throw new Exception("Index out of bounds", fname, linenum);
>>             static ref x(ref inout(T[dim]) val, size_t i) @trusted { return val.ptr[i]; }
>>             return x(_value, idx);
>>         }
>>     }
>
> Just realized, that @trusted escape is just so unnecessarily verbose.
>
>     struct ExArr(T, size_t dim)
>     {
>         T[dim] _value;
>         alias _value this;
>         ref inout(T) opIndex(size_t idx, string fname = __FILE__, size_t linenum = __LINE__) inout @trusted
>         {
>             if(idx >= dim)
>                 throw new Exception("Index out of bounds", fname, linenum);
>             return _value.ptr[idx];
>         }
>     }
>
> -Steve

bonus point: you can include index and length in error message! (something i really miss in dmd range error)
May 31 2017
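The suggestion to put the index and length into the message could be realized as in the sketch below. It reuses the shape of the `ExArr` wrapper from the thread and changes only the message; the exact wording and the `messageFor` helper are assumptions made for illustration.

```d
import std.conv : text;

/// Variant of the ExArr wrapper from the thread; only the message differs.
struct ExArr(T, size_t dim)
{
    T[dim] _value;
    alias _value this;

    ref inout(T) opIndex(size_t idx, string fname = __FILE__,
                         size_t linenum = __LINE__) inout @trusted
    {
        if (idx >= dim)
            throw new Exception(
                text("Index ", idx, " out of bounds for array of length ", dim),
                fname, linenum);
        return _value.ptr[idx];
    }
}

/// Hypothetical helper: returns the exception message, or null if none was thrown.
string messageFor(size_t idx)
{
    ExArr!(int, 3) arr;
    try
    {
        auto v = arr[idx];   // goes through the checked opIndex
        return null;
    }
    catch (Exception e)
    {
        return e.msg;
    }
}
```

Building the message allocates, which is acceptable on the failure path but worth keeping out of the success path, as done here.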
On 05/31/2017 09:04 AM, Steven Schveighoffer wrote:
> What are your thoughts?

+1 million. I *hate* D's notion of Error. Well, no...more correctly, I absolutely hate that it throws cleanup/unwinding straight out the window for many situations that can obviously be handled safely without the paranoid "ZOMG Sky Is Falling!!!!" overreaction that is baked into the design of Error. And that causes problems like the one you describe.

Kill it with fire!!!

A wrapper type seems like a plausible workaround, but I really, really dislike that it would ever be necessary to bother wrapping such a basic prevalent feature as...arrays, especially just to work around such a colossal misfeature.

And, as you describe in your reply to H.S. Teoh, the current behavior of Error can actually cause MORE damage than just recovering from an obviously recoverable situation.
May 31 2017
On Wednesday, 31 May 2017 at 17:13:08 UTC, Nick Sabalausky (Abscissa) wrote:On 05/31/2017 09:04 AM, Steven Schveighoffer wrote:To be fair, anything that can be handled in a sane&safe way should inherit from Exception, not from Error, so throwing away cleanup for Error makes sense, since an Error means the program is in an undefined state and should terminate asap.What are your thoughts?+1 million. I *hate* D's notion of Error. Well, no...more correctly, I absolutely hate that it throws cleanup/unwinding straight out the window for many situations that can obviously be handled safely without the paranoid "ZOMG Sky Is Falling!!!!" overreaction that is baked into the design of Error. And that causes problems like the one you describe.
May 31 2017
On 05/31/2017 02:55 PM, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 17:13:08 UTC, Nick Sabalausky (Abscissa) wrote:Then out-of-bounds and assert failures should be Exception not Error. Frankly, even out-of-memory, arguably. And then there's null dereference... In other words, basically everything.On 05/31/2017 09:04 AM, Steven Schveighoffer wrote:To be fair, anything that can be handled in a sane&safe way should inherit from Exception, not from Error, so throwing away cleanup for Error makes sense, since an Error means the program is in an undefined state and should terminate asap.What are your thoughts?+1 million. I *hate* D's notion of Error. Well, no...more correctly, I absolutely hate that it throws cleanup/unwinding straight out the window for many situations that can obviously be handled safely without the paranoid "ZOMG Sky Is Falling!!!!" overreaction that is baked into the design of Error. And that causes problems like the one you describe.
May 31 2017
On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky (Abscissa) wrote:[...]No, because as I stated in my other post, the runtime *cannot* assume that it is safe *in all cases*. If there is even one single case in which it is unsafe, it must abort.program is in an undefined state and should terminate asap.Then out-of-bounds and assert failures should be Exception not Error. Frankly, even out-of-memory, arguably. And then there's null dereference... In other words, basically everything.
May 31 2017
On 31.05.2017 22:45, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky (Abscissa) wrote:Hence all programs must abort on startup.[...]No, because as I stated in my other post, the runtime *cannot* assume that it is safe *in all cases*. If there is even one single case in which it is unsafe, it must abort.program is in an undefined state and should terminate asap.Then out-of-bounds and assert failures should be Exception not Error. Frankly, even out-of-memory, arguably. And then there's null dereference... In other words, basically everything.
May 31 2017
On Wed, May 31, 2017 at 11:29:53PM +0200, Timon Gehr via Digitalmars-d wrote:On 31.05.2017 22:45, Moritz Maxeiner wrote:[...]If D had *true* garbage collection, it would have done this upon starting up any buggy program. :-D T -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? -- Michael BeiblNo, because as I stated in my other post, the runtime *cannot* assume that it is safe *in all cases*. If there is even one single case in which it is unsafe, it must abort.Hence all programs must abort on startup.
May 31 2017
On Wednesday, 31 May 2017 at 21:30:47 UTC, H. S. Teoh wrote:On Wed, May 31, 2017 at 11:29:53PM +0200, Timon Gehr via Digitalmars-d wrote:I think vigil will be a perfect fit for you[1] ;p [1] https://github.com/munificent/vigilOn 31.05.2017 22:45, Moritz Maxeiner wrote:[...]If D had *true* garbage collection, it would have done this upon starting up any buggy program. :-DNo, because as I stated in my other post, the runtime *cannot* assume that it is safe *in all cases*. If there is even one single case in which it is unsafe, it must abort.Hence all programs must abort on startup.
May 31 2017
On Wednesday, 31 May 2017 at 21:29:53 UTC, Timon Gehr wrote:
> On 31.05.2017 22:45, Moritz Maxeiner wrote:
>> On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky (Abscissa) wrote:
>>> [...]
>>>> program is in an undefined state and should terminate asap.
>>> Then out-of-bounds and assert failures should be Exception not Error. Frankly, even out-of-memory, arguably. And then there's null dereference... In other words, basically everything.
>> No, because as I stated in my other post, the runtime *cannot* assume that it is safe *in all cases*. If there is even one single case in which it is unsafe, it must abort.
> Hence all programs must abort on startup.

In the context of the conversation, an error has already occurred, and "all cases" was referring to all the cases that lead to the error.
May 31 2017
On 01.06.2017 00:22, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 21:29:53 UTC, Timon Gehr wrote:Bounds checks have /no business at all/ trying to handle preexisting memory corruption, and in that sense they are comparable to program startup.On 31.05.2017 22:45, Moritz Maxeiner wrote:In the context of the conversation, and error has already occurred and the all cases was referring to all the cases that lead to the error.On Wednesday, 31 May 2017 at 20:09:16 UTC, Nick Sabalausky (Abscissa) wrote:Hence all programs must abort on startup.[...]No, because as I stated in my other post, the runtime *cannot* assume that it is safe *in all cases*. If there is even one single case in which it is unsafe, it must abort.program is in an undefined state and should terminate asap.Then out-of-bounds and assert failures should be Exception not Error. Frankly, even out-of-memory, arguably. And then there's null dereference... In other words, basically everything.
May 31 2017
On Wednesday, 31 May 2017 at 23:40:00 UTC, Timon Gehr wrote:Sure, because the program is in an undefined state by that point. There is only termination.In the context of the conversation, and error has already occurred and the all cases was referring to all the cases that lead to the error.Bounds checks have /no business at all/ trying to handle preexisting memory corruption,and in that sense they are comparable to program startup.I disagree.
May 31 2017
On 01.06.2017 01:55, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 23:40:00 UTC, Timon Gehr wrote:What does that even mean? Everything is perfectly well-defined here: void main(){ auto a = new int[](2); a[2] = 3; }Sure, because the program is in an undefined state by that point.In the context of the conversation, and error has already occurred and the all cases was referring to all the cases that lead to the error.Bounds checks have /no business at all/ trying to handle preexisting memory corruption,There is only termination. ...Termination of what? How on earth do you determine that the scope of this "undefined state" is the program, not the machine, or the world? I.e., why terminate the program, but not shut down the machine or nuke the planet? Scoping really ought to be up to the programmer as it greatly depends on the actual circumstances. Program termination is the only reasonable default behaviour, but it is not the only reasonable behaviour.
May 31 2017
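For readers following along, the two-line program in the post above can be extended to show what a bounds-checked build actually does with it. This is a sketch, not a recommendation: it assumes bounds checking is enabled (the default without `-release` or `-boundscheck=off`), and catching an `Error` is done here only to make the behaviour observable.

```d
import core.exception : RangeError;

// RangeError sits under Error, so `catch (Exception)` never sees it.
static assert(is(RangeError : Error));
static assert(!is(RangeError : Exception));

/// Returns true if the out-of-bounds write raised a RangeError.
bool writeOutOfBounds()
{
    auto a = new int[](2);
    size_t i = 2; // one past the last valid index
    try
    {
        a[i] = 3; // the bounds check fires before any memory is written
    }
    catch (RangeError e)
    {
        return true;
    }
    return false;
}

void main()
{
    assert(writeOutOfBounds());
}
```

The check stops the write before it happens, which is the sense in which the post calls the program's state "perfectly well-defined" at that point.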
On Thursday, 1 June 2017 at 00:11:10 UTC, Timon Gehr wrote:On 01.06.2017 01:55, Moritz Maxeiner wrote:That once memory corruption has occurred the state of the program is not well defined anymore.On Wednesday, 31 May 2017 at 23:40:00 UTC, Timon Gehr wrote:What does that even mean?Sure, because the program is in an undefined state by that point.In the context of the conversation, and error has already occurred and the all cases was referring to all the cases that lead to the error.Bounds checks have /no business at all/ trying to handle preexisting memory corruption,Everything is perfectly well-defined here: void main(){ auto a = new int[](2); a[2] = 3; }Sure, because there has been no memory corruption prior to the index out of bounds. That is not something the runtime should just assume for every out of index error.As that is the closest scope current operating systems give us to work with, this is a sane default for the runtime. Nobody stops you from using a different scope if you need it.There is only termination. ...Termination of what? How on earth do you determine that the scope of this "undefined state" is the program, not the machine, or the world?I.e., why terminate the program, but not shut down the machine or nuke the planet? Scoping really ought to be up to the programmer as it greatly depends on the actual circumstances.Of course, and if you need something else you can do so.Program termination is the only reasonable default behaviour, but it is not the only reasonable behaviour.Absolutely; rereading through our subthread I realized that I had not made that explicit here (only in other subthreads). I apologize for being imprecise.
May 31 2017
On 01.06.2017 02:57, Moritz Maxeiner wrote:Yes, they would stop me from using a smaller scope. 'nothrow' functions are not guaranteed to be unwindable and the compiler infers 'nothrow' automatically. Also, null pointer dereferences do not even throw. (On Linux.)Termination of what? How on earth do you determine that the scope of this "undefined state" is the program, not the machine, or the world?As that is the closest scope current operating systems give us to work with, this is a sane default for the runtime. Nobody stops you from using a different scope if you need it.
Jun 01 2017
On 6/1/17 3:49 PM, Timon Gehr wrote:On 01.06.2017 02:57, Moritz Maxeiner wrote:By default yes, but... https://github.com/dlang/druntime/blob/master/src/etc/linux/memoryerror.d -SteveYes, they would stop me from using a smaller scope. 'nothrow' functions are not guaranteed to be unwindable and the compiler infers 'nothrow' automatically. Also, null pointer dereferences do not even throw. (On Linux.)Termination of what? How on earth do you determine that the scope of this "undefined state" is the program, not the machine, or the world?As that is the closest scope current operating systems give us to work with, this is a sane default for the runtime. Nobody stops you from using a different scope if you need it.
Jun 02 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:[...] What are your thoughts? Have you run into this? If so, how did you solve it?It is not that accessing the array out of bounds *leading* to data corruption that is the issue here, but that in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption; and if data corruption occurred for the index, you *cannot* assume that *only* the index has been affected. The runtime cannot simply assume the index being out of bounds is not the result of already occurred data corruption, because that is inherently unsafe, so it *must* terminate asap as the default. If you get the index as the input to your process - and thus *know* that it being out of bounds is not the result of previous data corruption - then you should check this yourself before accessing the array and handle it appropriately (e.g. via Exception). So in your specific use case I would say use a wrapper. This is one of the reasons why I am working on my own library for data structures (libds).
May 31 2017
On 05/31/2017 03:17 PM, Moritz Maxeiner wrote:in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;Of course not, that's absurd. Where do people get the idea that out-of-bounds *implies* pre-existing data corruption? Most of the time, out-of-bounds comes from a bug (especially in D, what with all of its safeguards). Sure, data corruption is one possible cause of out-of-bounds, but data corruption is one possible cause of *ANYTHING*. So just to be safe, let's just abort on all exceptions, and upon everything else for that matter.
May 31 2017
On Wednesday, 31 May 2017 at 20:23:21 UTC, Nick Sabalausky (Abscissa) wrote:
> On 05/31/2017 03:17 PM, Moritz Maxeiner wrote:
>> in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;
> Of course not, that's absurd. Where do people get the idea that out-of-bounds *implies* pre-existing data corruption?

You assume something I did not write. What I wrote is that the runtime cannot *in general* (i.e. without further information about the semantics of your specific program) assume that it was *not* preexisting data corruption.

> Most of the time, out-of-bounds comes from a bug (especially in D, what with all of its safeguards).

Unfortunately the runtime has no way to know *if* the out of bounds comes from a bug or a data corruption, which was my point; only a human can know that. What is the most likely culprit is irrelevant for the default behaviour, because as long as it *could* be data corruption, the runtime cannot by default assume that it is not; that would be unsafe.

> Sure, data corruption is one possible cause of out-of-bounds, but data corruption is one possible cause of *ANYTHING*. So just to be safe, let's just abort on all exceptions, and upon everything else for that matter.

No, abort on Errors where the runtime cannot know if data corruption has already occurred, i.e. the program is in an undefined state. If you, as the programmer, know that it is safe, you have to code that in.
May 31 2017
On 05/31/2017 05:03 PM, Moritz Maxeiner wrote:
> On Wednesday, 31 May 2017 at 20:23:21 UTC, Nick Sabalausky (Abscissa) wrote:
>> On 05/31/2017 03:17 PM, Moritz Maxeiner wrote:
>>> in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;
>> Of course not, that's absurd. Where do people get the idea that out-of-bounds *implies* pre-existing data corruption?
> You assume something I did not write. What I wrote is that the runtime cannot *in general* (i.e. without further information about the semantics of your specific program) assume that it was *not* preexisting data corruption.

Ok, fine. However...

>> Most of the time, out-of-bounds comes from a bug (especially in D, what with all of its safeguards).
> Unfortunately the runtime has no way to know *if* the out of bounds comes from a bug or a data corruption, which was my point; only a human can know that. What is the most likely culprit is irrelevant for the default behaviour, because as long as it *could* be data corruption, the runtime cannot by default assume that it is not; that would be unsafe.

Like I said, *anything* could be the result of data corruption. (And with out-of-bounds in particular, it's very rare for the cause to be data corruption, especially in D).

If the determining factor for whether or not condition XYZ should abort is "*could* it be data corruption?", then ALL conditions must abort, because data corruption and undefined state can, by their very nature, cause *any* state - heck, even ones that "look" perfectly valid.

So, since that approach is a complete non-starter even in theory, the closest thing we *can* reasonably do is instead use the criterion "is this *likely enough* to be data corruption?" (for however we choose to define "likely enough").

BUT, in that case, out-of-bounds *still* fails to meet the criterion by a longshot. When an out-of-bounds does occur, it's vastly most likely to be a bug, not data corruption. Fuck, in all my decades of programming, including using D since pre-v1.0, NOT ONCE have ANY of the hundreds, maybe thousands, of out-of-bounds I've encountered ever been the result of data corruption. NOT ONCE. Not exaggerating. Even as an anecdote, that's a FAR cry from being able to reasonably suspect data corruption as a likely cause, regardless of where we set the bar for "likely".

>> Sure, data corruption is one possible cause of out-of-bounds, but data corruption is one possible cause of *ANYTHING*. So just to be safe, let's just abort on all exceptions, and upon everything else for that matter.
> No, abort on Errors where the runtime cannot know if data corruption has already occurred, i.e. the program is in an undefined state.

The runtime can NEVER know that no data corruption has occurred. Let me emphasise that: *NEVER*.

By the very nature of data corruption and undefined states, it is NOT even theoretically plausible for a runtime to EVER be able to rule out data corruption, *not even when things look A-OK*, and hell, not even when the algorithm is mathematically proven correct, because, shoot, let's just pretend we live in a fantasy world where hardware failures are impossible why don't we?

Therefore, if we follow your reasoning (that we must abort whenever data corruption is possible), then we must therefore abort all processes unconditionally upon creation.

Your approach sounds nice, but it's completely unrealistic.
May 31 2017
On 5/31/17 3:17 PM, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:To be blunt, no this is completely wrong. Memory corruption *already having happened* can cause any number of errors. The point of bounds checking is to prevent memory corruption in the first place. I could memory corrupt the length of the array also (assuming a dynamic array), and bounds checking merrily does nothing to stop further memory corruption.[...] What are your thoughts? Have you run into this? If so, how did you solve it?It is not that accessing the array out of bounds *leading* to data corruption that is the issue here, but that in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;and if data corruption occurred for the index, you *cannot* assume that *only* the index has been affected. The runtime cannot simply assume the index being out of bounds is not the result of already occurred data corruption, because that is inherently unsafe, so it *must* terminate asap as the default.The runtime should not assume that crashing the whole program is necessary when an integer is out of range. Preventing actual corruption, yes that is good. But an Exception would have done the job just fine. But that ship, as I said elsewhere, has sailed. We can't change it to Exception now, as that would break just about all nothrow code in existence.So in your specific use case I would say use a wrapper. This is one of the reasons why I am working on my own library for data structures (libds).That is my conclusion too. Is your library in a usable state? Perhaps we should not repeat efforts, though I wasn't planning on making a robust public library for it :) -Steve
May 31 2017
On 05/31/2017 02:00 PM, Steven Schveighoffer wrote:
> On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
>> It is not that accessing the array out of bounds *leading* to data corruption that is the issue here, but that in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;
> To be blunt, no this is completely wrong.

Blunter: Moritz is right. :)

> Memory corruption *already having happened* can cause any number of errors.

True.

> The point of bounds checking is to prevent memory corruption in the first place.

That's just one goal. It also maintains an invariant of arrays: The index value must be within bounds.

> I could memory corrupt the length of the array also (assuming a dynamic array), and bounds checking merrily does nothing to stop further memory corruption.

That's true but the language provides no tool to check for that. The fact that program correctness is not achievable in general should not have any bearing on bounds checking.

>> and if data corruption occurred for the index, you *cannot* assume that *only* the index has been affected. The runtime cannot simply assume the index being out of bounds is not the result of already occurred data corruption, because that is inherently unsafe, so it *must* terminate asap as the default.
> The runtime should not assume that crashing the whole program is necessary when an integer is out of range. Preventing actual corruption, yes that is good. But an Exception would have done the job just fine.

How could an Exception work in this case? Catch it and repeat the same bug over and over again? What would the program be achieving? (I assume the exception handler will not arbitrarily decrease index values.)

Ali
May 31 2017
On Wednesday, 31 May 2017 at 21:30:05 UTC, Ali Çehreli wrote:How could an Exception work in this case? Catch it and repeat the same bug over and over again? What would the program be achieving? (I assume the exception handler will not arbitrarily decrease index values.)How is this different from a file system exception? The file system is memory too...
May 31 2017
On 05/31/2017 02:41 PM, Ola Fosheim Grøstad wrote:On Wednesday, 31 May 2017 at 21:30:05 UTC, Ali Çehreli wrote:values.)How could an Exception work in this case? Catch it and repeat the same bug over and over again? What would the program be achieving? (I assume the exception handler will not arbitrarily decrease indexHow is this different from a file system exception? The file system is memory too...When you say "memory" I think you refer to the thought of bounds checking being for prevention of memory corruption. True, memory corruption can happen when the program writes out of bounds but it's one special case. The actual reason for bounds checking is maintaining an invariant. Regarding the file system, because it's part of the environment of the program, hence the program cannot control, it's correct to throw an Exception, in which case the response can be "Cannot open that file; how about another one?". In the case of array indexes, they are in complete control of the program, hence a bug when out of bounds. It's not possible to say "Bad index; let me try 42 less." Ali
May 31 2017
On Wednesday, 31 May 2017 at 21:57:04 UTC, Ali Çehreli wrote:of bounds but it's one special case. The actual reason for bounds checking is maintaining an invariant.That's true, but that could be the case with file system exception too. Say, a file is supposed to be of length N, but you get an exception because you are reading past the file end. Same issue. Should you then wipe the entire file system, because there appears to be a problem with a single file?In the case of array indexes, they are in complete control of the program, hence a bug when out of bounds. It's not possible to say "Bad index; let me try 42 less."Well, it is possible that the bad indexing was because the input was empty and there was a mistake in the program. One reasonable thing to do is to rollback for that particular input, log it as a problem, then continue processing other input. Which is often better than shutting down the service, but it really is contextual. The real question is, what is the probability of a mismatched index for your application being just an indexing problem. I think it is very high for most "safe" code. So if D supports "safe" code well, then indexing issues will most likely almost never be due to corruption. If you only write "unsafe" code, then indexing issues are still most likely to not be because of corruption, but the probability is much higher.
Jun 01 2017
On Wed, May 31, 2017 at 02:30:05PM -0700, Ali Çehreli via Digitalmars-d wrote:
> On 05/31/2017 02:00 PM, Steven Schveighoffer wrote:
[...]
>> The runtime should not assume that crashing the whole program is necessary when an integer is out of range. Preventing actual corruption, yes that is good. But an Exception would have done the job just fine.
> How could an Exception work in this case? Catch it and repeat the same bug over and over again? What would the program be achieving? (I assume the exception handler will not arbitrarily decrease index values.)
[...]

In this particular case, the idea is that the fibre that ran into the bug would throw an Exception to the main loop, which catches it and terminates the fibre (presumably also sending an error response to the client browser), while continuing to process other, possibly-ongoing requests normally. Rather than having the one bad request triggering the buggy code and causing *all* currently in-progress requests to terminate because the entire program has aborted.

An extreme example of this is if you had a vibe.d server hosting multiple virtual domains belonging to different customers. It's bad enough that one customer's service would crash when it encounters a bug, but it's far worse to have *all* customers' services crash just because *one* of them encountered a bug.

This is an interesting use case, because conceptually speaking, each vibe.d fibre actually represents an independent computation, so any fatal errors like out-of-bounds bugs should cause the termination of the *fibre*, rather than *everything* that just happens to be running in the same process. If vibe.d had been implemented with, say, forked processes instead, this wouldn't have been an issue. But of course, the fibre implementation was chosen for performance (and possibly other) reasons. Forking would give you the per-request isolation needed to handle this kind of problem cleanly, but it also comes with a hefty performance price tag. Like all things in practical engineering, it's a tradeoff.

I'd say creating a custom type that throws Exception instead of Error is probably the best solution here, given what we have.

T

-- 
The fact that anyone still uses AOL shows that even the presence of options doesn't stop some people from picking the pessimal one. - Mike Ellis
May 31 2017
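The per-request isolation described above can be illustrated without vibe.d at all. The following is a rough sketch under stated assumptions: `handle` and `serveAll` are hypothetical names, a plain loop stands in for the fibre scheduler, and the "500" reply text is invented for the example.

```d
import std.conv : text;

/// Hypothetical request handler: looks up a record by a client-supplied
/// index and throws an Exception (not an Error) when the index is bad.
string handle(string[] records, size_t idx)
{
    if (idx >= records.length)
        throw new Exception(text("no record at index ", idx));
    return records[idx];
}

/// Stand-in for the server's main loop: each request is served on its
/// own, and a failing request produces an error reply instead of
/// taking the other requests down with it.
string[] serveAll(string[] records, size_t[] requests)
{
    string[] replies;
    foreach (idx; requests)
    {
        try
        {
            replies ~= handle(records, idx);
        }
        catch (Exception e)
        {
            replies ~= "500: " ~ e.msg;
        }
    }
    return replies;
}

void main()
{
    auto replies = serveAll(["a", "b"], [0, 7, 1]);
    // The bad request in the middle does not affect its neighbours.
    assert(replies == ["a", "500: no record at index 7", "b"]);
}
```

Had `handle` indexed the array directly, the bad request would have raised a `RangeError`, which the `catch (Exception)` clause does not intercept, and the whole process would have gone down.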
On Wednesday, 31 May 2017 at 21:45:51 UTC, H. S. Teoh wrote:
> This is an interesting use case, because conceptually speaking, each vibe.d fibre actually represents an independent computation, so any fatal errors like out-of-bounds bugs should cause the termination of the *fibre*, rather than *everything* that just happens to be running in the same process.

While I agree on a theoretical level that in principle only the fibre (and the same argument goes for threads) should terminate, the problem is that fibres, as well as threads, share the same virtual memory of a process, i.e. memory corruption in one fibre (or thread) cannot in general be safely contained and kept from spreading to the other fibres (or threads; except in the thread case one might argue if you know the memory corruption to have happened only in TLS then you can kill the thread, but I don't know how you would prove that).

If you cannot be sure that the memory corruption is contained in a scope (i.e. a fibre or thread), you must terminate at the closest enclosing scope that you know will keep the error from escaping further outward to the rest of your system; AFAIK in modern operating systems the closest such scope is a process.
May 31 2017
On 5/31/17 6:36 PM, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 21:45:51 UTC, H. S. Teoh wrote:Again, there has not been memory corruption. There is a confusion rampant in this thread that preventing *attempted* memory corruption must mean there *is* memory corruption. One does not require the other. -SteveThis is an interesting use case, because conceptually speaking, each vibe.d fibre actually represents an independent computation, so any fatal errors like out-of-bounds bugs should cause the termination of the *fibre*, rather than *everything* that just happens to be running in the same process.While I agree on a theoretical level about the fact that in principal only the fibre (and the same argument goes for threads) should terminate, the problem is that fibres, as well as threads, share the same virtual memory of a process, i.e. memory corruption in one fibre (or thread) cannot in general be safely contained and kept from spreading to the other fibres (or threads; except in the thread case one might argue if you know the memory corruption to have happened only in TLS then you can kill the thread, but I don't know how you would prove that).
May 31 2017
On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:
> Again, there has not been memory corruption.

Again, the runtime *cannot* know that and hence you *cannot* claim that. It sees an index out of bounds and it *cannot* reason about whether a memory corruption has already occurred or not, which means it *must assume* the worst case (it must *assume* there was).

> There is a confusion rampant in this thread that preventing *attempted* memory corruption must mean there *is* memory corruption.

No, please no. Nobody has written that in the entire thread even once!

- An index being out of bounds is an error (lowercase!).
- The runtime sees that error when the array is accessed (what you describe as *attempted* memory corruption).
- The runtime does not know *why* the index is out of bounds.

It does *not* mean that there *was* memory corruption (and again, nobody claimed that), but the runtime cannot assume that there was not, because that is *unsafe*.

> One does not require the other.

Correct, but the runtime has to be safe in the *general* case, so it *must* assume the worst in case of a bug.
May 31 2017
On 01.06.2017 01:13, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:No, it is perfectly safe, because the language does not guarantee any specific behavior in case memory is corrupted. Therefore the language can /always/ assume that there is no memory corruption.Again, there has not been memory corruption.Again, the runtime *cannot* know that and hence you *cannot* claim that. It sees an index out of bounds and it *cannot* reason about whether a memory corruption has already occurred or not, which means it *must assume* the worst case (it must *assume* there was).There is a confusion rampant in this thread that preventing *attempted* memory corruption must mean there *is* memory corruption.No, please no. Nobody has written that in the entire thread even once! - An index being out of bounds is an error (lowercase!). - The runtime sees that error when the array is accessed (what you describe as *attemped* memory corruption. - The runtime does not know *why* the index is out of bounds It does *not* mean that there *was* memory corruption (and again, nobody claimed that), but the runtime cannot assume that there was not, because that is *unsafe*. ...Software has bugs. The runtime has no business claiming that the scope of any particular bug is the entire service. The practical outcomes of this design are just silly. Data is lost, services go down, etc. When in doubt, the software should just do what the programmer has written. It is not always correct, but it is the best available proxy of the desirable behavior.One does not require the other.Correct, but the runtime has to be safe in the *general* case, so it *must* assume the worst in case of a bug.
May 31 2017
On Wednesday, 31 May 2017 at 23:50:07 UTC, Timon Gehr wrote:No, it is perfectly safe, because the language does not guarantee any specific behavior in case memory is corrupted.The language not guaranteeing a specific behaviour on memory corruption does not imply that assuming a bug was not caused by memory corruption is safe.Therefore the language can /always/ assume that there is no memory corruption.That is also not implied.It absolutely has the business of doing exactly that as long as you, the programmer, do not tell it otherwise; which you can do and is your job.Software has bugs. The runtime has no business claiming that the scope of any particular bug is the entire service.One does not require the other.Correct, but the runtime has to be safe in the *general* case, so it *must* assume the worst in case of a bug.The practical outcomes of this design are just silly. Data is lost, services go down, etc. When in doubt, the software should just do what the programmer has written. It is not always correct, but it is the best available proxy of the desirable behavior.When in doubt about memory corruption, the closest enclosing scope that will get rid of the memory corruption must die. The current behaviour achieves that in many cases.
May 31 2017
On Wednesday, May 31, 2017 23:13:35 Moritz Maxeiner via Digitalmars-d wrote:On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:Honestly, once a memory corruption has occurred, all bets are off anyway. The core thing here is that the contract of indexing arrays was violated, which is a bug. If we're going to argue about whether it makes sense to change that contract, then we have to discuss the consequences of doing so, and I really don't see why whether a memory corruption has occurred previously is relevant. We could easily treat indexing arrays the same as as any other function which chooses to throw an Exception when it's given bad input. The core difference is whether it's considered okay to give bad values or whether it's considered a programming bug to pass bad values. In either case, the runtime has no way of determining the reason for the failure, and I don't see why passing a bad value to index an array is any more indicative of a memory corruption than passing an invalid day of the month to std.datetime's Date when constructing it is indicative of a memory corruption. In both cases, the input is bad, and the runtime doesn't know why. It's just that in the array case, the API of arrays requires that the input be valid, whereas for Date, it's acceptable for bad input to be passed. So, while I can appreciate that you're trying to argue for us keeping RangeError (which I agree with), I think that this whole argument about possible, previous memory corruptions prior to the invalid index being passed is derailing things. The issue ultimately is what the consequences are of using an Error vs an Exception, and _that_ is what we need to discuss. - Jonathan M DavisAgain, there has not been memory corruption.Again, the runtime *cannot* know that and hence you *cannot* claim that. 
It sees an index out of bounds and it *cannot* reason about whether a memory corruption has already occurred or not, which means it *must assume* the worst case (it must *assume* there was).
May 31 2017
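The comparison above between `std.datetime`'s `Date` and array indexing can be sketched as follows. This is only an illustration, assuming a build with bounds checking enabled (the default without `-release` or `-boundscheck=off`); the helper names `badDateThrowsException` and `badIndexThrowsError` are made up for this example, and catching `RangeError` is done purely to show which kind of Throwable is raised, not as a recommendation.

```d
import core.exception : RangeError;
import std.datetime : Date, DateTimeException;

/// Returns true if constructing an invalid date raised a recoverable Exception.
bool badDateThrowsException()
{
    try
    {
        auto d = Date(2017, 2, 30); // February 30th does not exist
        return false;
    }
    catch (DateTimeException e)
    {
        return true;
    }
}

/// Returns true if an out-of-bounds index raised a RangeError (an Error).
bool badIndexThrowsError(int[] arr, size_t idx)
{
    try
    {
        auto x = arr[idx];
        return false;
    }
    catch (RangeError e) // demonstration only: Errors are not meant to be caught
    {
        return true;
    }
}

void main()
{
    assert(badDateThrowsException());
    assert(badIndexThrowsError([1, 2, 3], 5));
}
```

In both cases the input is bad and the runtime does not know why; the difference lies only in which contract the API chose.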
On Wednesday, 31 May 2017 at 23:51:30 UTC, Jonathan M Davis wrote:On Wednesday, May 31, 2017 23:13:35 Moritz Maxeiner via Digitalmars-d wrote:Right, and that is why termination when in doubt (and the programmer has not done anything to clear that doubt up) is the sane choice.On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:Honestly, once a memory corruption has occurred, all bets are off anyway.Again, there has not been memory corruption.Again, the runtime *cannot* know that and hence you *cannot* claim that. It sees an index out of bounds and it *cannot* reason about whether a memory corruption has already occurred or not, which means it *must assume* the worst case (it must *assume* there was).The core thing here is that the contract of indexing arrays was violated, which is a bug.I disagree about it being the core issue, because that was already established in the OP.If we're going to argue about whether it makes sense to change that contract, then we have to discuss the consequences of doing so, and I really don't see why whether a memory corruption has occurred previously is relevant.Because if such a memory corruption occurred, termination of the closest enclosing scope to get rid of it must follow (or your entire system can end up corrupted).We could easily treat indexing arrays the same as as any other function which chooses to throw an Exception when it's given bad input. The core difference is whether it's considered okay to give bad values or whether it's considered a programming bug to pass bad values. In either case, the runtime has no way of determining the reason for the failure, and I don't see why passing a bad value to index an array is any more indicative of a memory corruption than passing an invalid day of the month to std.datetime's Date when constructing it is indicative of a memory corruption. 
In both cases, the input is bad, and the runtime doesn't know why.One of those is a library construct, the other is baked into the language; it is perfectly fine for the former to use exceptions, because it can be easily avoided by anyone; the latter is a required component of pretty much everything you can build with D and must thus use the stricter contract.The issue ultimately is what the consequences are of using an Error vs an Exception, and _that_ is what we need to discuss.An Exception leads to unwinding&cleanup, an Error to termination (with unwinding&cleanup in debug mode for debugging purposes). What would you like to discuss here?
May 31 2017
On 5/31/17 7:13 PM, Moritz Maxeiner wrote:On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:Yes, it cannot know at any point whether or not a memory corruption has occurred. However, it has a lever to pull to say "your program cannot continue, and you have no choice." It chooses to pull this lever on any attempt of out of bounds access of an array, regardless of the reason why that is happening. The chances that a memory corruption is the cause is so low, and it doesn't matter even if it is. The program may already have messed up everything by that point. In fact, the current behavior of printing the Error message and doing an orderly shutdown is pretty risky anyway if we think this is a memory corruption. There are almost no other environmentally caused errors that cause this lever to be pulled. It doesn't make a whole lot of sense that it is.Again, there has not been memory corruption.Again, the runtime *cannot* know that and hence you *cannot* claim that. It sees an index out of bounds and it *cannot* reason about whether a memory corruption has already occurred or not, which means it *must assume* the worst case (it must *assume* there was)."you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;"There is a confusion rampant in this thread that preventing *attempted* memory corruption must mean there *is* memory corruption.No, please no. Nobody has written that in the entire thread even once!- An index being out of bounds is an error (lowercase!). - The runtime sees that error when the array is accessed (what you describe as *attemped* memory corruption. 
- The runtime does not know *why* the index is out of bounds It does *not* mean that there *was* memory corruption (and again, nobody claimed that), but the runtime cannot assume that there was not, because that is *unsafe*.It's not the runtime's job to determine that the cause of an out-of-bounds access could be memory corruption. It's job is to prevent the current attempt. Throwing an Error accomplishes this, yes, but it also means you must shut down the program. I have no problem at all with it preventing the corruption, nor do I have a problem with it throwing an Error, per se. The problem I have is that throwing an Error itself corrupts the program, and makes it unusable. Therefore, it's the wrong tool for that job. And I absolutely do not think that throwing an Error in this case was the result of a careful choice deciding that memory corruption must be or even might be the cause. I think it's this way because of the desire to write nothrow code without having to pepper your code with try/catch blocks.It's easy to prove as well that throwing an Exception instead of an Error is perfectly safe. My array wrapper is perfectly safe and does not throw an Error on bad indexing. -SteveOne does not require the other.Correct, but the runtime has to be safe in the *general* case, so it *must* assume the worst in case of a bug.
May 31 2017
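An array wrapper of the kind mentioned above might look roughly like this. It is a minimal sketch, not the wrapper from the post: the name `CheckedArray` and its layout are assumptions, and the inner `data[idx]` still carries the compiler's own (now redundant) bounds check rather than a `@trusted` escape.

```d
import std.exception : assertThrown;

/// Hypothetical wrapper around T[]: a bad index becomes a recoverable
/// Exception instead of a RangeError.
struct CheckedArray(T)
{
    private T[] data;

    this(T[] data) @safe pure nothrow @nogc
    {
        this.data = data;
    }

    size_t length() const @safe pure nothrow @nogc
    {
        return data.length;
    }

    /// file and line default to the call site, so the Exception points
    /// at the offending index expression.
    ref inout(T) opIndex(size_t idx, string file = __FILE__,
                         size_t line = __LINE__) inout @safe pure
    {
        if (idx >= data.length)
            throw new Exception("index out of bounds", file, line);
        return data[idx];
    }
}

void main()
{
    auto arr = CheckedArray!int([10, 20, 30]);
    assert(arr[1] == 20);
    arr[1] = 21; // opIndex returns by ref, so assignment works
    assert(arr[1] == 21);
    assertThrown!Exception(arr[3]); // bad index: Exception, not Error
}
```

Because `opIndex` can throw an Exception, it cannot be `nothrow`, which is exactly the trade-off discussed in this thread.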
On Wednesday, 31 May 2017 at 23:53:11 UTC, Steven Schveighoffer wrote:On 5/31/17 7:13 PM, Moritz Maxeiner wrote:Because assuming the worst is a sane default.On Wednesday, 31 May 2017 at 22:47:38 UTC, Steven Schveighoffer wrote:Yes, it cannot know at any point whether or not a memory corruption has occurred. However, it has a lever to pull to say "your program cannot continue, and you have no choice." It chooses to pull this lever on any attempt of out of bounds access of an array, regardless of the reason why that is happening.Again, there has not been memory corruption.Again, the runtime *cannot* know that and hence you *cannot* claim that. It sees an index out of bounds and it *cannot* reason about whether a memory corruption has already occurred or not, which means it *must assume* the worst case (it must *assume* there was).The chances that a memory corruption is the cause is so low, and it doesn't matter even if it is. The program may already have messed up everything by that point.True, it might have already corrupted other things; but that is no argument for allowing it to continue to potentially corrupt even more.In fact, the current behavior of printing the Error message and doing an orderly shutdown is pretty risky anyway if we think this is a memory corruption.AFAIK the orderly shutdown is not guaranteed to be done in release mode and I would welcome for thrown errors in release mode to simply kill the process immediately.Yes, precisely. I state: "you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;" You state: "that preventing *attempted* memory corruption must mean there *is* memory corruption" You state that I claim the memory corruption must definitely have occurred, while in contrast I state that one has to *assume* that is has occurred. 
*Not* the same."you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;"There is a confusion rampant in this thread that preventing *attempted* memory corruption must mean there *is* memory corruption.No, please no. Nobody has written that in the entire thread even once!It's not the runtime's job to determine that the cause of an out-of-bounds access could be memory corruption.That was the job of whoever wrote the runtime, yes.It's job is to prevent the current attempt.That is one of its jobs. The other is to terminate when it detects potential memory corruptions the programmer has not ensured are not.The problem I have is that throwing an Error itself corrupts the program, and makes it unusable.Because the programmer has not done the steps to ensure the runtime that memory has not been corrupted, that is the only sane choice I see.It's easy to prove as well that throwing an Exception instead of an Error is perfectly safe. My array wrapper is perfectly safe and does not throw an Error on bad indexing.And anyone using wrapper implicitly promises that a wrong index cannot be the result of memory corruption, which can definitely be a sane choice for a lot of use cases, but not as the default for the basic building block in the language.
May 31 2017
On 5/31/17 5:30 PM, Ali Çehreli wrote:On 05/31/2017 02:00 PM, Steven Schveighoffer wrote:I'll ignore this section of the debate :)On 5/31/17 3:17 PM, Moritz Maxeiner wrote:Blunter: Moritz is right. :)It is not that accessing the array out of bounds *leading* to data corruption that is the issue here, but that in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;To be blunt, no this is completely wrong.But the program cannot possibly know which variable is an index. So it cannot maintain the invariant until it's actually used. At that point, it can use throwing an Error to say that something isn't right, or it can use throwing an Exception. D chose Error, and the consequences of that choice are that you have to check before D checks or else your entire program is killed.Memory corruption *already having happened* can cause any number of errors.True.The point of bounds checking is to prevent memory corruption in the first place.That's just one goal. It also maintains an invariant of arrays: The index value must be within bounds.My point simply is that assuming corruption is not a good answer. It's a good *excuse* for the current behavior, but doesn't really satisfy any meaningful requirement. To borrow from another subthread here, imagine if when you attempted to open a non-existent file, the OS assumed that your program must have been memory corrupted and killed it instead of returning ENOENT? It could be a "reasonable" assumption -- memory corruption could have caused that filename to be corrupt, hence you have sniffed out a memory corruption and stopped it in its tracks! Well, actually not really, but you saw the tracks. Or else, maybe someone made a typo?I could memory corrupt the length of the array also (assuming a dynamic array), and bounds checking merrily does nothing to stop further memory corruption.That's true but the language provides no tool to check for that. 
The fact that program correctness is not achievable in general should not have any bearing on bounds checking.Just like it works for all other exceptions -- you print a reasonable message to the offending party (in this case, it would be a 500 error I think), and continue executing other things. No memory corruption has occurred because bounds checking stopped it, therefore the program is still sane. -SteveHow could an Exception work in this case? Catch it and repeat the same bug over and over again? What would the program be achieving? (I assume the exception handler will not arbitrarily decrease index values.)and if data corruption occurred for the index, you *cannot* assume that *only* the index has been affected. The runtime cannot simply assume the index being out of bounds is not the result of already occurred data corruption, because that is inherently unsafe, so it *must* terminate asap as the default.The runtime should not assume that crashing the whole program is necessary when an integer is out of range. Preventing actual corruption, yes that is good. But an Exception would have done the job just fine.
May 31 2017
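The ENOENT analogy can be made concrete with Phobos: opening a missing file through `std.stdio.File` reports the failure as an `ErrnoException`, which the caller may handle. A small sketch, where the helper `tryOpen` and the path are made up for illustration and the path is assumed not to exist:

```d
import std.exception : ErrnoException;
import std.stdio : File;

/// Tries to open a file for reading and reports the outcome as a string
/// instead of letting the failure take down the program.
string tryOpen(string path)
{
    try
    {
        auto f = File(path, "r");
        return "opened";
    }
    catch (ErrnoException e)
    {
        // For a missing file on POSIX systems, e.errno would be ENOENT.
        return "failed";
    }
}

void main()
{
    // Assumption: this path does not exist on the machine running the sketch.
    assert(tryOpen("/no/such/dir/no-such-file.txt") == "failed");
}
```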
On Wednesday, 31 May 2017 at 21:30:05 UTC, Ali Çehreli wrote:
> How could an Exception work in this case? Catch it and repeat the same bug over and over again? What would the program be achieving? (I assume the exception handler will not arbitrarily decrease index values.)

Other systems work like this: an internal server error is reported to the client, the client reports an unexpected error to the user, and the action is repeated at the user's discretion.
Jun 01 2017
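The "report an internal server error and carry on" model described above can be sketched like this. It is not vibe.d code: `Response`, `firstChar` and `serve` are hypothetical names, and the handler uses `enforce` so that the bad input surfaces as an Exception rather than as a RangeError from `field[0 .. 1]`.

```d
import std.exception : enforce;

/// Hypothetical response type for this sketch.
struct Response
{
    int status;
    string text;
}

/// Hypothetical handler: returns the first character of the field.
Response firstChar(string field)
{
    enforce(field.length > 0, "field must not be empty");
    return Response(200, field[0 .. 1]);
}

/// Per-request boundary: one failing request yields a 500, the rest
/// of the service keeps running.
Response serve(string field) nothrow
{
    try
    {
        return firstChar(field);
    }
    catch (Exception e)
    {
        return Response(500, "Internal Server Error");
    }
}

void main()
{
    auto results = [serve("abc"), serve(""), serve("xyz")];
    assert(results[0] == Response(200, "a"));
    assert(results[1].status == 500);
    assert(results[2] == Response(200, "x"));
}
```

Whether continuing after such a failure is wise is the point under debate; the sketch only shows the mechanics.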
On Wednesday, 31 May 2017 at 21:00:43 UTC, Steven Schveighoffer wrote:On 5/31/17 3:17 PM, Moritz Maxeiner wrote:I disagree.On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:To be blunt, no this is completely wrong.[...] What are your thoughts? Have you run into this? If so, how did you solve it?It is not that accessing the array out of bounds *leading* to data corruption that is the issue here, but that in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption;Memory corruption *already having happened* can cause any number of errors.Correct, of which out of bounds array is *one*.The point of bounds checking is to prevent memory corruption in the first place.That is *one* of the purposes. The other is to abort in case of already occurred memory corruption.I could memory corrupt the length of the array also (assuming a dynamic array), and bounds checking merrily does nothing to stop further memory corruption.Yes, that is one case against out of bounds checks do not help; but that changes nothing for the case we were talking about.The runtime should not assume that crashing the whole program is necessary when an integer is out of range.Without *any* other information, I think it should.Preventing actual corruption, yes that is good. But an Exception would have done the job just fine.If it were only about further memory corruption, yes, but as I said, my argument about preexisting corruption remains.But that ship, as I said elsewhere, has sailed. We can't change it to Exception now, as that would break just about all nothrow code in existence.Sure.Well, since I really needed only a single data structure at the time, it only contains a binary heap so far, but I believe it to be usable. I intend to add a dynamic array implementation next.So in your specific use case I would say use a wrapper. 
This is one of the reasons why I am working on my own library for data structures (libds).That is my conclusion too. Is your library in a usable state?Perhaps we should not repeat efforts, though I wasn't planning on making a robust public library for it :)Well, you can take a look at the binary heap implementation[1] and decide if that a style you are interested in, but it does currently use errors for things such as removing an element when the heap is empty; I am not sure there, what I intend to do here, but I might make it configurable via the Conf template parameter in a design-by-introspection style. [1] https://github.com/Calrama/libds
May 31 2017
On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
> But that ship, as I said elsewhere, has sailed. We can't change it to Exception now, as that would break just about all nothrow code in existence.

This is why the runtime needs to guarantee that normal unwinding/cleanup *does* occur on Error (barring actual corruption or physical impossibilities, obviously).
May 31 2017
On Wednesday, May 31, 2017 22:24:16 Nick Sabalausky via Digitalmars-d wrote:On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:It is my understanding that with how nothrow is implemented, that's not actually possible. The compiler takes advantage of nothrow to optimize out the exception handling code where possible. To force it to stay just to try and clean up when an Error is thrown would defeat the performance gains that we get with nothrow. Besides, it's highly debatable that you're actually better off cleaning up when an Error is thrown, because it largely depends on what has gone wrong. In some cases, it _would_ be better if clean-up occurred, whereas in others, it's just making matters worse. What we currently have is a weird hybrid. When an Error is thrown, _some_ of the clean-up is done, but not all. Whether that's worse than doing no clean-up is debatable, but regardless, due to nothrow, we can't do all of the clean-up, so relying on all of the clean-up occurring is error-prone. And pretty much the only reason that _any_ clean-up is done when an Error is thrown is because someone implemented it when Walter wasn't looking. The reality of the matter though is that no matter what we do, a completely robust program must be able to deal with the fact that it could be killed at any time (e.g. due to a power outage) - not that it needs to function perfectly when it gets killed, but for stuff like database consistency, you can't rely on the program dying gracefully to avoid data corruption. - Jonathan M DavisBut that ship, as I said elsewhere, has sailed. We can't change it to Exception now, as that would break just about all nothrow code in existence.This is why the runtime needs to guarantee that normal unwinding/cleanup *does* occur on Error (barring actual corruption or physical impossibilities, obviously).
May 31 2017
On 5/31/2017 7:39 PM, Jonathan M Davis via Digitalmars-d wrote:
> The reality of the matter though is that no matter what we do, a completely robust program must be able to deal with the fact that it could be killed at any time (e.g. due to a power outage) - not that it needs to function perfectly when it gets killed, but for stuff like database consistency, you can't rely on the program dying gracefully to avoid data corruption.

Everything about a network is unreliable, so any reliable system must have baked into it the ability to cleanly redo any transaction that failed partway through it. Trying to have the software ignore serious bugs in order to complete a transaction is a doomed approach.
May 31 2017
On 05/31/2017 10:39 PM, Jonathan M Davis via Digitalmars-d wrote:On Wednesday, May 31, 2017 22:24:16 Nick Sabalausky via Digitalmars-d wrote:On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:It is my understanding that with how nothrow is implemented, that's not actually possible. The compiler takes advantage of nothrow to optimize out the exception handling code where possible. To force it to stay just to try and clean up when an Error is thrown would defeat the performance gains that we get with nothrow. Besides, it's highly debatable that you're actually better off cleaning up when an Error is thrown, because it largely depends on what has gone wrong. In some cases, it _would_ be better if clean-up occurred, whereas in others, it's just making matters worse. What we currently have is a weird hybrid. When an Error is thrown, _some_ of the clean-up is done, but not all. Whether that's worse than doing no clean-up is debatable, but regardless, due to nothrow, we can't do all of the clean-up, so relying on all of the clean-up occurring is error-prone. And pretty much the only reason that _any_ clean-up is done when an Error is thrown is because someone implemented it when Walter wasn't looking. The reality of the matter though is that no matter what we do, a completely robust program must be able to deal with the fact that it could be killed at any time (e.g. due to a power outage) - not that it needs to function perfectly when it gets killed, but for stuff like database consistency, you can't rely on the program dying gracefully to avoid data corruption. - Jonathan M DavisBut that ship, as I said elsewhere, has sailed. We can't change it to Exception now, as that would break just about all nothrow code in existence.This is why the runtime needs to guarantee that normal unwinding/cleanup *does* occur on Error (barring actual corruption or physical impossibilities, obviously).
May 31 2017
On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
> On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
>> So in your specific use case I would say use a wrapper. This is one of the reasons why I am working on my own library for data structures (libds).
>
> That is my conclusion too.

Honestly, I really think that if there is a need to wrap something as basic as "all arrays in a codebase" then it's clear something in the language has gone horribly wrong. But short of actually *fixing* D's broken concept of Error, I don't see a better solution either.
May 31 2017
On Wednesday, May 31, 2017 22:33:43 Nick Sabalausky via Digitalmars-d wrote:On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:Using an Exception to signal a programming bug and then potentially trying to recover from it is like trying to recover from a segfault. It really doesn't make sense. Yes, it's annoying when you have a bug that kills your program, and even when you do solid testing, you're unlikely to have found everything, but the solution to a bug is to fix the bug, not try and have your program limp along in an unknown state. Yes, there may be cases where array indices are effectively coming from user input, and you're going to have to check them all rather than the code having been written in a way that guarantees that the indices are valid, and in those cases, wrapping an array to do the checks may make sense, but in the vast majority of programs, invalid indices should simply never happen - just like dereferencing a null pointer should simply never happen - and if it does happen, it's a bug. So, treating it like bad user input as the default really doesn't make sense. Just fix the bug and move on, and over time, such problems will go away, because you'll have found the bugs and fixed them. And if you're consistently not finding them while testing, then maybe you need to do more and/or better testing. I can totally understand how it can be frustrating when a bug results in your program being killed, but it's far better for it to be in your face so that you find it and fix it rather than letting your program limp along and potentially have problems later down the line that are disconnected from the original bug and thus far harder to track down. - Jonathan M DavisOn 5/31/17 3:17 PM, Moritz Maxeiner wrote:Honestly, I really think that if there is need to wrap something as basic as "all arrays in a codebase" then it's clear something in the langauge had gone horribly wrong. 
But short of actually *fixing* D's broken concept of Error, I don't see a better solution either.So in your specific use case I would say use a wrapper. This is one of the reasons why I am working on my own library for data structures (libds).That is my conclusion too.
May 31 2017
On 05/31/2017 10:50 PM, Jonathan M Davis via Digitalmars-d wrote:
> On Wednesday, May 31, 2017 22:33:43 Nick Sabalausky via Digitalmars-d wrote:
>> On 05/31/2017 05:00 PM, Steven Schveighoffer wrote:
>>> On 5/31/17 3:17 PM, Moritz Maxeiner wrote:
>>>> So in your specific use case I would say use a wrapper. This is one of the reasons why I am working on my own library for data structures (libds).
>>>
>>> That is my conclusion too.
>>
>> Honestly, I really think that if there is need to wrap something as basic as "all arrays in a codebase" then it's clear something in the language had gone horribly wrong. But short of actually *fixing* D's broken concept of Error, I don't see a better solution either.
>
> Using an Exception to signal a programming bug and then potentially trying to recover from it is like trying to recover from a segfault. It really doesn't make sense. Yes, it's annoying when you have a bug that kills your program, and even when you do solid testing, you're unlikely to have found everything, but the

Exception thrown != "OMG NOTHING ABOUT ANY BRANCH OF THE PROGRAM CAN BE REASONED ABOUT OR RELIED UPON ANYMORE!!!!"

Your argument only applies to spaghetti code. Normal code is compartmentalized. Different subsystems and all that jazz. Just because one thing fails in one box doesn't mean we gotta nuke the whole friggin industrial park and rebuild.

> solution to a bug is to fix the bug,

Obviously. But that's not the question. The question is: What do you do in the meantime? Do you quarantine 12 states and a neighboring country because somebody coughed until the threat is neutralized, or should the response actually match the threat?

> not try and have your program limp along in an unknown state.

False dichotomy. Exception causes are usually very localized. There is no "unknown state" outside of that tiny little already-quarantined box.

> Yes, there may be cases where array indices are effectively coming from user input, and you're going to have to check them all rather than the code having been written in a way that guarantees that the indices are valid, and in those cases, wrapping an array to do the checks may make sense, but in the vast majority of programs, invalid indices should simply never happen - just like dereferencing a null pointer should simply never happen - and if it does happen, it's a bug.

Yes, it's a bug. A *localized* bug. NOT RAMPANT MEMORY CORRUPTION.
May 31 2017
On Wednesday, May 31, 2017 23:20:54 Nick Sabalausky via Digitalmars-d wrote:On 05/31/2017 10:50 PM, Jonathan M Davis via Digitalmars-d wrote:Indexing an array with an invalid index is the same as violating any contract in D except that you get a RangeError instead of an AssertError, and the check is always in place in safe code (even with -release) in order to avoid memory corruption. As soon as the contract is violated, the program is in an unknown state. It's logic is clearly wrong, and the assumptions that it's making may or may not be valid. So, continuing may or may not be safe. Whether memory corruption is involved is irrelevant. The program violated the contract, so the runtime knows that the program is in an invalid state. The cause of that bug may or may not be localized, but it's a guarantee at that point that the program is wrong, so you can't rely on it doing the right thing. Yes, we _could_ have made it so that the contract of indexing arrays in D was such that passing an invalid index was considered normal and then have it throw an Exception to indicate that bad input had been given. But that means that that code can no longer be nothrow (which does mean that it can't be optimized as well), and programs would then need to deal with the fact that indexing an array could throw and handle that case appropriately. For the vast majority of programs, most array indices do not come from user input, and thus it usually really doesn't make sense to treat passing an invalid index to an array as anything other than a bug. It's reasonable to expect the programmer to get it right and that if they don't, they'll find it during testing. If you want to wrap indexing arrays so that you get an Exception, then fine. At that point, you're saying that it's not a program bug to be passed an invalid index, and you're writing your programs with the idea that they need to be able to handle and recover from such bad input. 
But that is not the contract that the language itself uses precisely because indexing an array with an invalid index is usually a bug and not bad program input, and in the case where the array index _does_ somehow come from user input, then the programmer can test it. But having the runtime throw an Exception for what is normally a program bug would harm the programs that actually got their indices right. - Jonathan M DavisYes, there may be cases where array indices are effectively coming from user input, and you're going to have to check them all rather than the code having been written in a way that guarantees that the indices are valid, and in those cases, wrapping an array to do the checks may make sense, but in the vast majority of programs, invalid indices should simply never happen - just like dereferencing a null pointer should simply never happen - and if it does happen, it's a bug.Yes, it's a bug. A *localized* bug. NOT RAMPANT MEMORY CORRUPTION.
May 31 2017
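The `nothrow` consequence mentioned above can be illustrated with a short sketch. The function names are invented for this example; the point is only that built-in indexing fits in `nothrow` code because it can fail only with an Error, while an Exception-based lookup forces `nothrow` callers to add a try/catch.

```d
/// Built-in indexing can only fail with RangeError (an Error), so this
/// function may be marked nothrow.
int builtin(const int[] arr, size_t i) nothrow @safe
{
    return arr[i];
}

/// Hypothetical Exception-based lookup: it cannot be nothrow.
int checked(const int[] arr, size_t i) @safe
{
    if (i >= arr.length)
        throw new Exception("index out of bounds");
    return arr[i];
}

/// A nothrow caller of checked() has to handle the Exception itself.
int checkedOr(const int[] arr, size_t i, int fallback) nothrow @safe
{
    try
    {
        return checked(arr, i);
    }
    catch (Exception e)
    {
        return fallback;
    }
}

void main()
{
    assert(builtin([1, 2, 3], 1) == 2);
    assert(checkedOr([1, 2, 3], 1, -1) == 2);
    assert(checkedOr([1, 2, 3], 7, -1) == -1);
}
```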
On Thursday, 1 June 2017 at 05:03:17 UTC, Jonathan M Davis wrote:
> Whether memory corruption is involved is irrelevant. The program violated the contract, so the runtime knows that the program is in an invalid state. The cause of that bug may or may not be localized, but it's a guarantee at that point that the program is wrong, so you can't rely on it doing the right thing.

Well, if you take this position then you should not only crash the program, but also delete the executable to prevent it from being run again. Allowing the process to be restarted when you know that it contains logic errors breaks with the principles you are outlining.

> handle that case appropriately. For the vast majority of programs, most array indices do not come from user input, and thus it usually really doesn't make sense to treat passing an invalid index to an array as anything other than a bug. It's reasonable to expect the programmer to get it right and that if they don't, they'll find it during testing.

It is surprisingly common to forget to check for a field/file being empty in a service. So it makes a lot of sense to roll back for such errors and keep the service alive. In my experience this is the common scenario. And indexing an array is no different than asking for a key that doesn't exist in any other data-structure, array shouldn't be a special case. Does that mean that other ADTs also should throw Error and not Exception?

For instance, assume you have a chat-server and the supplied clients work fine. Then some guy decides to reverse engineer it and build his own client. You don't want that service to go down all the time. You want to shut out that specific client. You want to identify the client and block it.
Jun 01 2017
On Wednesday, 31 May 2017 at 21:00:43 UTC, Steven Schveighoffer wrote:
> To be blunt, no this is completely wrong. Memory corruption *already having happened* can cause any number of errors. The point of bounds checking is to prevent memory corruption in the first place.

The sad reality is that D programmers are still comfortable writing code in 70s style, playing with void* pointers, and don't enable bounds checks early enough, see https://issues.dlang.org/show_bug.cgi?id=13367
Jun 01 2017
On Thu, Jun 01, 2017 at 10:11:19AM +0000, Kagamin via Digitalmars-d wrote:
[...]
> The sad reality is that D programmers are still comfortable writing code in 70s style, playing with void* pointers, and don't enable bounds checks early enough, see https://issues.dlang.org/show_bug.cgi?id=13367

Huh? There is no void* in that bug report, and it was closed 3 years ago. What's your point?

T

--
Ph.D. = Permanent head Damage
Jun 01 2017
On Wednesday, 31 May 2017 at 21:00:43 UTC, Steven Schveighoffer wrote:
> That is my conclusion too. Is your library in a usable state? Perhaps we should not repeat efforts, though I wasn't planning on making a robust public library for it :)

After some consideration you can now find the (dynamic) array implementation here[1].

With regards to (usage) errors: The data structures in libds allow passing an optional function `attest` via the template parameter `Hook` (DbI). `attest` is passed the data structure (by ref, for logging purposes) and a boolean value and must only return successfully if the value is true; if it is false, `attest` must throw something (e.g. an Exception), or terminate the process. An example of how to use it is here[2]. If no `attest` is passed, the data structures default to throwing an AssertError.

[1] https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/linear/array/dynamic.d
[2] https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/tree/heap/binary.d#L381
Jun 03 2017
On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via Digitalmars-d wrote:
> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>> [...] What are your thoughts? Have you run into this? If so, how did you solve it?
>
> It is not that accessing the array out of bounds *leading* to data corruption that is the issue here, but that in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption; and if data corruption occurred for the index, you *cannot* assume that *only* the index has been affected. The runtime cannot simply assume the index being out of bounds is not the result of already occurred data corruption, because that is inherently unsafe, so it *must* terminate asap as the default. If you get the index as the input to your process - and thus *know* that it being out of bounds is not the result of previous data corruption - then you should check this yourself before accessing the array and handle it appropriately (e.g. via Exception).

I don't think that you even need to worry about whether memory corruption occurred prior to indexing the array with an invalid index. The fact that the array was indexed with an invalid index is a bug. What caused the bug depends entirely on the code. Whether it's a memory corruption or something else is irrelevant. The contract of indexing arrays is that only valid indices be passed. If an invalid index has been passed, then the contract has been violated, and by definition, there's a bug in the program, so the runtime has no choice but to throw an Error or otherwise kill the program. Given the contract, the only alternative would be to use assertions and only check when not compiling with -release, but that would be a serious problem for @safe code, and it really wouldn't help Steven's situation.

Either way, the contract of indexing arrays is such that passing an invalid index is a bug, and no program should be doing it. The reason that the index is invalid is pretty much irrelevant to the discussion. It's a bug regardless.

We _could_ make it so that the contract of indexing arrays is such that you're allowed to pass invalid values, but then the runtime would _always_ have to check the indices (even in @system code), and arrays in general could never be used in code that was nothrow without a bunch of extra try-catch blocks. It would be like how auto-decoding and UTFException screws over our ability to have nothrow code with strings, only it would be for _all_ arrays. So, the result would be annoying for a lot of code as well as less efficient.

The vast majority of array code is written in a way that invalid indices are simply never used, and having it so that indexing an array could throw an Exception would cause serious problems for a lot of code - especially when the code is already written in a way that such an exception will never be thrown (similar to how format can't be nothrow even when you know you've passed the correct arguments, and it will never throw). As such, it really doesn't make sense to force all programs to deal with arrays throwing Exceptions due to bad indices.

If a program can't guarantee that it's going to be passing a valid index to an array, then it needs to validate the index first. And if that needs to be done frequently, it makes a lot of sense to either create a wrapper function for indexing arrays which does the check or to outright wrap arrays such that opIndex on that type does the check and throws an Exception before the invalid index is passed to the array. And if the wrapper function is @trusted, it _should_ make it so that druntime doesn't check the index, avoiding having redundant checks.

I can understand Steven's frustration, but I really think that we're better off the way it is now, even if it's not ideal for his current use case.

- Jonathan M Davis
May 31 2017
On Wednesday, 31 May 2017 at 22:42:30 UTC, Jonathan M Davis wrote:
> I don't think that you even need to worry about whether memory corruption occurred prior to indexing the array with an invalid index. The fact that the array was indexed with an invalid index is a bug. What caused the bug depends entirely on the code. Whether it's a memory corruption or something else is irrelevant. The contract of indexing arrays is that only valid indices be passed. [...]

That is correct (and that was even mentioned in the OP), but from my PoV the argument was about whether that contract is sensible the way it is, so I was arguing for why I think the contract is good as it is. *The contract says so* is not an argument supporting the case of *why* the contract is the way it is.

> We _could_ make it so that the contract of indexing arrays is such that you're allowed to pass invalid values, but then [...]

Another reason as to why I support the current contract.

> As such, it really doesn't make sense to force all programs to deal with arrays throwing Exceptions due to bad indices. If a program can't guarantee that it's going to be passing a valid index to an array, then it needs to validate the index first. And if that needs to be done frequently, it makes a lot of sense to either create a wrapper function for indexing arrays which does the check or to outright wrap arrays such that opIndex on that type does the check and throws an Exception before the invalid index is passed to the array. And if the wrapper function is @trusted, it _should_ make it so that druntime doesn't check the index, avoiding having redundant checks.

Precisely, and that is why I stated that I think he should use a wrapper.

> I can understand Steven's frustration, but I really think that we're better off the way it is now, even if it's not ideal for his current use case.

I agree.
May 31 2017
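The wrapper approach discussed in the posts above could look roughly like the following. This is only a minimal sketch: the names `IndexException` and `CheckedArray` are made up for illustration and do not come from Phobos, vibe.d, or libds. The idea is that the wrapper validates the index itself and throws an ordinary Exception, so a bad index never reaches the built-in array and no RangeError is raised.

```d
import std.conv : text;

/// Hypothetical exception type for a recoverable bad index.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__) pure nothrow @safe
    {
        super(msg, file, line);
    }
}

/// Hypothetical array wrapper: opIndex checks the index first and throws
/// an Exception (not an Error) when it is out of bounds.
struct CheckedArray(T)
{
    T[] data;

    ref inout(T) opIndex(size_t i) inout
    {
        if (i >= data.length)
            throw new IndexException(
                text("index ", i, " is out of bounds for length ", data.length));
        // The built-in bounds check on this access is now redundant,
        // but it is left in place to keep the sketch simple.
        return data[i];
    }

    size_t length() const
    {
        return data.length;
    }
}
```

With this, something like `auto week = CheckedArray!int(new int[7]); week[9] = 1;` throws an `IndexException` that a request handler can catch and turn into an error response, while plain arrays elsewhere in the program keep the existing Error behavior.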
On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
> On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via Digitalmars-d wrote:
>> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>>> [...] What are your thoughts? Have you run into this? If so, how did you solve it?
>>
>> It is not that accessing the array out of bounds *leading* to data corruption that is the issue here, but that in general you have to assume that the index *being* out of bounds is itself the *result* of *already occurred* data corruption; and if data corruption occurred for the index, you *cannot* assume that *only* the index has been affected. The runtime cannot simply assume the index being out of bounds is not the result of already occurred data corruption, because that is inherently unsafe, so it *must* terminate asap as the default. If you get the index as the input to your process - and thus *know* that it being out of bounds is not the result of previous data corruption - then you should check this yourself before accessing the array and handle it appropriately (e.g. via Exception).
>
> I don't think that you even need to worry about whether memory corruption occurred prior to indexing the array with an invalid index. The fact that the array was indexed with an invalid index is a bug. What caused the bug depends entirely on the code. Whether it's a memory corruption or something else is irrelevant. The contract of indexing arrays is that only valid indices be passed. If an invalid index has been passed, then the contract has been violated, and by definition, there's a bug in the program, so the runtime has no choice but to throw an Error or otherwise kill the program. Given the contract, the only alternative would be to use assertions and only check when not compiling with -release, but that would be a serious problem for @safe code, and it really wouldn't help Steven's situation.
>
> Either way, the contract of indexing arrays is such that passing an invalid index is a bug, and no program should be doing it. The reason that the index is invalid is pretty much irrelevant to the discussion. It's a bug regardless.

Yes, it's definitely a bug, and that is not something I'm arguing against. The correct handling is to throw something, and prevent the corruption in doing so. The problem is that the act of throwing itself makes the program unusable after that. I'm not on Nick's side saying that everything should be Exception, especially not out of memory. But the result of throwing an Error means your entire program has now been *made* invalid, even if it wasn't before. Therefore you must close it. I feel this is a mistake. A bad index can come from anywhere, and to assume it's from memory corruption is a huge leap.

What would have been nice is to have a level between Error and Exception, and to throw that when a bug such as this occurs. Something that a framework can catch, but @safe code couldn't. I feel that when these decisions were made, the concept of a single-process fiber-based server wasn't planned for.

> We _could_ make it so that the contract of indexing arrays is such that you're allowed to pass invalid values, but then the runtime would _always_ have to check the indices (even in @system code), and arrays in general could never be used in code that was nothrow without a bunch of extra try-catch blocks. It would be like how auto-decoding and UTFException screws over our ability to have nothrow code with strings, only it would be for _all_ arrays. So, the result would be annoying for a lot of code as well as less efficient.

Right, we can't pick that path now anyway. Too much code would break.

> The vast majority of array code is written in a way that invalid indices are simply never used, and having it so that indexing an array could throw an Exception would cause serious problems for a lot of code - especially when the code is already written in a way that such an exception will never be thrown (similar to how format can't be nothrow even when you know you've passed the correct arguments, and it will never throw). As such, it really doesn't make sense to force all programs to deal with arrays throwing Exceptions due to bad indices. If a program can't guarantee that it's going to be passing a valid index to an array, then it needs to validate the index first. And if that needs to be done frequently, it makes a lot of sense to either create a wrapper function for indexing arrays which does the check or to outright wrap arrays such that opIndex on that type does the check and throws an Exception before the invalid index is passed to the array. And if the wrapper function is @trusted, it _should_ make it so that druntime doesn't check the index, avoiding having redundant checks.
>
> I can understand Steven's frustration, but I really think that we're better off the way it is now, even if it's not ideal for his current use case.

It just means that D is an inferior platform for a web framework, unless you use the process-per-request model so the entire thing doesn't go down for one page request. But that obviously is going to cause performance problems.

Which is unfortunate, because vibe.d is a great platform for web development, other than this. You could go Adam's route and just put the blinders on, but I think that's not a sustainable practice.

-Steve
Jun 01 2017
I'm just sitting here waiting for shared libraries to be properly implemented cross platform. Then I can start thinking about a proper web server written in D. Until then, we are not really suited to become a generic web server and should only exist in the context of multiple instances (and restart-able).
Jun 01 2017
On Thursday, June 01, 2017 06:13:25 Steven Schveighoffer via Digitalmars-d wrote:
> It just means that D is an inferior platform for a web framework, unless you use the process-per-request model so the entire thing doesn't go down for one page request. But that obviously is going to cause performance problems. Which is unfortunate, because vibe.d is a great platform for web development, other than this. You could go Adam's route and just put the blinders on, but I think that's not a sustainable practice.

Honestly, unless something about vibe.d prevents fixing bugs like bad array indices, I'd just use vibe.d and program normally, and if a problem like the one you hit occurs, I'd fix it, and then that wouldn't crash the program anymore. Depending on how many such logic errors got past the testing process, it could take a while before the server was stable enough, or it could be very little time at all, but in the long run, there wouldn't be any invalid array indices, because those bugs would have been fixed, and there wouldn't be a problem anymore.

Now, if there's something about vibe.d that outright prevents fixing these bugs or makes them impossible to avoid, then that calls for a different approach, but from what I understand of the situation, I don't see anything here preventing using vibe.d's approach with fibers. It's just that missed bugs will be very annoying until they're fixed, but that's true of most programs.

- Jonathan M Davis
Jun 01 2017
On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer wrote:
> Which is unfortunate, because vibe.d is a great platform for web development, other than this. You could go Adam's route and just put the blinders on, but I think that's not a sustainable practice.

If you control the deployment, it works perfectly well. You aren't being blind to it, you are just taking control.

I prefer to use processes anyway (they are easier to use, compatible with more libraries, considerably more reliable, and perform quite well - we don't have to spin up a new perl interpreter, 1999 was a long time ago), but fibers can handle RangeError too as long as you never use -release and such.
Jun 01 2017
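The "catch RangeError per request" approach mentioned in the post above might be sketched as follows. This is an illustration only, not code from cgi.d or vibe.d: `buggyHandler` and `serveRequest` are hypothetical names, and the sketch assumes a build with array bounds checks enabled (the default when not compiling with -release). Whether continuing after an Error is acceptable is exactly what this thread is debating.

```d
import core.exception : RangeError;

/// Hypothetical request handler containing an indexing bug.
int buggyHandler(const int[] week, size_t day)
{
    return week[day]; // throws RangeError when day >= week.length
}

/// Hypothetical per-request guard returning an HTTP-like status code.
int serveRequest(const int[] week, size_t day)
{
    try
    {
        buggyHandler(week, day);
        return 200;
    }
    catch (RangeError e)
    {
        // Drop only this request. Note that cleanup code (destructors,
        // scope guards) is not guaranteed to run while an Error propagates.
        return 500;
    }
}
```

If bounds checks are compiled out, no RangeError is thrown and the bad access becomes undefined behavior, which is why this only makes sense when the build flags are under your control.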
On 2017-06-01 12:13, Steven Schveighoffer wrote:
> It just means that D is an inferior platform for a web framework, unless you use the process-per-request model so the entire thing doesn't go down for one page request. But that obviously is going to cause performance problems.

You can do a combination of both. One request per fiber and as many instances of your program as cores. That will utilize the hardware better. I've noticed that the multi-threading in vibe.d doesn't seem to work. If one process goes down, all the requests it was handling are lost, but you can still handle new requests. Combine that with automatically restarting the processes if they crash.

--
/Jacob Carlborg
Jun 01 2017
On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer wrote:
> On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
>> On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via Digitalmars-d wrote:
>
> It just means that D is an inferior platform for a web framework, unless you use the process-per-request model so the entire thing doesn't go down for one page request. But that obviously is going to cause performance problems. Which is unfortunate, because vibe.d is a great platform for web development, other than this. You could go Adam's route and just put the blinders on, but I think that's not a sustainable practice. -Steve

I'm glad I know enough to know this is an opinion... anyway, it's better to run a vibe.d instance in something like a daemonized package. You should also use the vibe.d error handlers.
Jun 01 2017
On Friday, 2 June 2017 at 00:15:39 UTC, aberba wrote:
> On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer wrote:
>> On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
>>> [...]
>>
>> It just means that D is an inferior platform for a web framework, unless you use the process-per-request model so the entire thing doesn't go down for one page request. But that obviously is going to cause performance problems. Which is unfortunate, because vibe.d is a great platform for web development, other than this. You could go Adam's route and just put the blinders on, but I think that's not a sustainable practice. -Steve
>
> I'm glad I know enough to know this is an opinion... anyway, it's better to run a vibe.d instance in something like a daemonized package. You should also use the vibe.d error handlers.

Here is daemonize, for running it as a daemon: https://github.com/NCrashed/daemonize/blob/master/examples/03.Vibed/README.md

It offers some control.
Jun 01 2017
On 6/1/17 8:15 PM, aberba wrote:
> On Thursday, 1 June 2017 at 10:13:25 UTC, Steven Schveighoffer wrote:
>> On 5/31/17 6:42 PM, Jonathan M Davis via Digitalmars-d wrote:
>>> On Wednesday, May 31, 2017 19:17:16 Moritz Maxeiner via Digitalmars-d wrote:
>>
>> It just means that D is an inferior platform for a web framework, unless you use the process-per-request model so the entire thing doesn't go down for one page request. But that obviously is going to cause performance problems. Which is unfortunate, because vibe.d is a great platform for web development, other than this. You could go Adam's route and just put the blinders on, but I think that's not a sustainable practice.
>
> I'm glad I know enough to know this is an opinion...

Don't get me wrong, I think D will be better than other frameworks for those who are willing to work with the warts. But the perception is going to be that D web frameworks are too fragile -- one miswritten handler, and your whole webserver dies. DoS attacks will be easy with D web frameworks, even if you have distributed handling.

> anyway, it's better to run a vibe.d instance in something like a daemonized package. You should also use the vibe.d error handlers.

I found the way to restart it using systemd, so that part should be handled. Now, I need to push up moving my session handling into persistent storage (just using the memory storage for now).

-Steve
Jun 02 2017
On Friday, 2 June 2017 at 12:33:17 UTC, Steven Schveighoffer wrote:
> But the perception is going to be that D web frameworks are too fragile -- one miswritten handler, and your whole webserver dies.

Correction: "vibe.d frameworks" are fragile. This isn't D specific - my cgi.d is resilient to this (and more) and has been since 2008 since it uses a process pool. Simple solution that works very well. Might not handle 10,000 concurrent connections... but you very rarely actually have to.
Jun 02 2017
On 02.06.2017 15:24, Adam D. Ruppe wrote:
> On Friday, 2 June 2017 at 12:33:17 UTC, Steven Schveighoffer wrote:
>> But the perception is going to be that D web frameworks are too fragile -- one miswritten handler, and your whole webserver dies.
>
> Correction: "vibe.d frameworks" are fragile. This isn't D specific - my cgi.d is resilient to this (and more) and has been since 2008 since it uses a process pool. Simple solution that works very well. Might not handle 10,000 concurrent connections... but you very rarely actually have to.

I'm not convinced that public perception is sensitive to such details. ;)
Jun 02 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
> This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm.

Sorry for double post, but - after thinking more about this - I do not agree that this fits. I think a better analogy would be this:

Your car has an autonomous driving system and an anti-collision system, and the anti-collision system detects that you are about to hit an obstacle (let us say another car); as a result it engages the brakes and shuts off the autonomous driving system. It might be that the autonomous driving system was in the right and the reason for the near collision was another human driver driving illegally, but it might also be that there is a bug in the autonomous driving system. If the latter is the case, in this one instance the anti-collision system detected the result of the bug, but the next time it might be that the autonomous driving system drives you off a cliff, which the anti-collision system would not help against. So the only sane thing to do is shut the autonomous driving system off, requiring human intervention to decide which of the two was the case (and if it was the former, turn it on again).
May 31 2017
On 5/31/17 4:06 PM, Moritz Maxeiner wrote:
> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>> This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm.
>
> Sorry for double post, but - after thinking more about this - I do not agree that this fits. I think a better analogy would be this: Your car has an autonomous driving system and an anti-collision system and the anti-collision system detects that you are about to hit an obstacle (let us say another car); as a result it engages the brakes and shuts off the autonomous driving system.

Nope, an autonomous system did not type out my code that caused the out of bounds error, I did :)

-Steve
May 31 2017
On Wednesday, 31 May 2017 at 21:02:06 UTC, Steven Schveighoffer wrote:
> Nope, an autonomous system did not type out my code that caused the out of bounds error, I did :)

Same as the human who typed out the code of the autonomous system.
May 31 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
> This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.

On Windows you can set up service restart settings in case it crashes. Useful for services that crash regularly.
May 31 2017
On 5/31/17 4:53 PM, Kagamin wrote:
> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>> This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.
>
> On Windows you can set up service restart settings in case it crashes. Useful for services that crash regularly.

That *would* be a feature on Windows ;)

No, this is Linux, so I'll have to research how to properly do it with systemd.

-Steve
May 31 2017
On Wednesday, 31 May 2017 at 21:03:02 UTC, Steven Schveighoffer wrote:
> No, this is Linux, so I'll have to research how to properly do it with systemd.

OT: *with whatever process supervisor floats your boat.
May 31 2017
[Service]
...
Restart=on-failure

On Wed, May 31, 2017 at 11:03 PM, Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 5/31/17 4:53 PM, Kagamin wrote:
>> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>>> This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.
>>
>> On Windows you can set up service restart settings in case it crashes. Useful for services that crash regularly.
>
> That *would* be a feature on Windows ;) No, this is Linux, so I'll have to research how to properly do it with systemd. -Steve
Jun 01 2017
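For context, the `Restart=on-failure` line above belongs in a systemd service unit. A complete unit file might look roughly like this; the description, binary path, and restart delay are placeholders chosen for illustration, not values from the thread:

```ini
[Unit]
Description=Example vibe.d web service (placeholder name)
After=network.target

[Service]
# Placeholder path to the compiled vibe.d application.
ExecStart=/usr/local/bin/my-vibed-app
# Restart when the process exits with a non-zero status or is killed by a
# signal, which covers termination caused by an uncaught Error.
Restart=on-failure
# Optional short delay before restarting.
RestartSec=1

[Install]
WantedBy=multi-user.target
```

This only restarts the process after it has died; connections and in-memory state (such as in-memory sessions) that existed at the time of the crash are still lost.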
On 6/1/17 6:05 AM, Daniel Kozak via Digitalmars-d wrote:
> [Service]
> ....
> Restart=on-failure

Thanks!

-Steve
Jun 01 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
> I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.
>
> For example:
>
> int[3] arr;
> arr[3] = 5;
>
> Compare this to, let's say, a malformed unicode string (exception), malformed JSON data (exception), file not found (exception), etc.
>
> Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served. This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm.
>
> This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error. And vibe.d has no choice. There is no guarantee the stack is properly unwound, so it has to accept the characterization of this as a program-ending error by the D runtime.
>
> I am considering writing a set of array wrappers that throw exceptions when trying to access out of bounds elements. This comes with its own set of problems, but at least the web server should continue to run.
>
> What are your thoughts? Have you run into this? If so, how did you solve it?
>
> -Steve

What things are considered unrecoverable errors or not is debatable, but in the end I think the whole thing can be seen from the perspective of a fundamental problem of systems where multiple operations must be able to progress successfully* independently of each other.

All operations (a.k.a. processes, fibers, or function calls within fibers, or whatever granularity you choose) that modify shared state (could be external to the fiber, the thread, the process, the machine, could be "real-world") must somehow maintain some consistency with other operations that come before, are interleaved, simultaneous or after. The way I see it is that you have two choices: reason more explicitly about the relationship between different operations and carefully catch only the mishaps that you know (or are prepared to risk) don't ruin the consistent picture between operations OR remove the need for consistency. A lot of the latter makes the former easier.

IIRC this is what deadalnix has talked about as one of the big wins of PHP in practice: the separation of state between requests means that things can mess up locally without having to worry about wider consequences except in the specific cases where things are shared; i.e. the set of things that must be maintained consistent is opt-in, as opposed to opt-out in care-free use of the vibe.d model.

* "progress successfully" is itself a tricky idea.

P.S. Sometimes I do feel D is a bit eager on the self-destruct switch, but I think the solution is to rise to the challenge of making better software, not to be more blasé about pretending to know how to recover from unknown logic errors (exposed by unexpected input).
May 31 2017
On 5/31/2017 5:37 PM, John Colvin via Digitalmars-d wrote:
> P.S. Sometimes I do feel D is a bit eager on the self-destruct switch, but I think the solution is to rise to the challenge of making better software, not to be more blasé about pretending to know how to recover from unknown logic errors (exposed by unexpected input).

This... exactly this. I've worked on software from the tiny device level to the largest distributed systems in the world and many in between. The ones that are aggressive about defining application correctness through asserts and similar mechanisms and use the basic precepts of failing fast are the most stable. Problems are caught early, they're loud, obnoxious, and obvious. And they get fixed, fast.

I'm happy that D takes a similar stance. It makes my job easier.

- Brad
May 31 2017
On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
> Technically this is a programming error, and a bug. But memory hasn't actually been corrupted.

Since you don't know where the bad index came from, such a conclusion cannot be drawn.

> This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.

Hence the endless vectors for malware insertion in those other frameworks.

> What are your thoughts?

Track down where the bad index is coming from and fix it.

-----------

> Compare this to, let's say, a malformed unicode string (exception), malformed JSON data (exception), file not found (exception), etc.

That's because those are input and environmental errors, not programming bugs.

There can be grey areas in classifying problems as input errors or programming bugs, and those will need some careful thought by the programmer as to which bin they fall into, and then code accordingly.

Array overflows are not a grey area, however. They are always programming bugs.

-----------

This topic comes up regularly in this forum - the idea that a program that entered an unknown, undefined state is actually ok and can continue executing. Maybe that's fine on a system (such as a gaming console) where nobody cares if it goes off the deep end and it is not connected to the internet so it cannot propagate malware infections.

Otherwise, while it's hard to write invulnerable programs, it is another thing entirely to endorse vulnerabilities. I cannot endorse such practices, nor can I endorse vibe.d if it is coded to continue running after entering an undefined state.

A corollary is the idea that one creates reliable systems by writing programs that can continue executing after corruption. This is another fallacious concept. Reliable systems are ones that have independent components that can take over if some part of them fails. Shared memory is not independence.
May 31 2017
On Thursday, 1 June 2017 at 01:05:42 UTC, Walter Bright wrote:
> This topic comes up regularly in this forum - the idea that a program that entered an unknown, undefined state is actually ok and can continue executing. Maybe that's fine on a system (such as a gaming console) where nobody cares if it goes off the deep end and it is not connected to the internet so it cannot propagate malware infections.

+1

Why are we discussing this topic again at all? Again?

Even with consumer software, you may want to crash immediately so that you actually get complaints from testers/buyers instead of having a silent, invisible bug that no one will report ever. Actually leaving checks in is imho perfectly valid for consumer software; if you don't do that, the next consumers will have the issues that didn't get reported.
Jun 01 2017
On Thursday, 1 June 2017 at 09:18:24 UTC, Guillaume Piolat wrote:
> Even with consumer software, you may want to crash immediately so that you actually get complaints from testers/buyers instead of having a silent, invisible bug that no one will report ever.

No. You don't want to crash immediately. In fact, you want to save and recover. Preferably without much work lost and without the user being bothered by it.
Jun 01 2017
On Thursday, 1 June 2017 at 09:46:09 UTC, Ola Fosheim Grøstad wrote:
> On Thursday, 1 June 2017 at 09:18:24 UTC, Guillaume Piolat wrote:
>> Even with consumer software, you may want to crash immediately so that you actually get complaints from testers/buyers instead of having a silent, invisible bug that no one will report ever.
>
> No. You don't want to crash immediately. In fact, you want to save and recover. Preferably without much work lost and without the user being bothered by it.

Solved by auto-saving, _before_ the crash
Jun 01 2017
On Thu, Jun 01, 2017 at 02:04:40PM +0000, Guillaume Piolat via Digitalmars-d wrote:On Thursday, 1 June 2017 at 09:46:09 UTC, Ola Fosheim Grøstad wrote:Yes. Saving *after* a crash was detected is stupid, because you no longer can guarantee the user data you're saving hasn't already been corrupted. I've experienced over-zealous "crash recovery" code in applications overwrite the last known good copy of my data with the latest, most up-to-date, and also most-corrupted data after it detected a problem. Not nice at all. T -- Question authority. Don't ask why, just do it.On Thursday, 1 June 2017 at 09:18:24 UTC, Guillaume Piolat wrote:Solved by auto-saving, _before_ the crashEven with consumer software, you may want to crash immediately so that you actually get complaints from testers/buyers instead of having a silent, invisible bug that no one will report ever.No. You don't want to crash immediately. In fact, you want to save and recover. Preferably without much work lost and without the user being bothered by it.
Jun 01 2017
On 6/1/2017 7:48 AM, H. S. Teoh via Digitalmars-d wrote:Yes. Saving *after* a crash was detected is stupid, because you no longer can guarantee the user data you're saving hasn't already been corrupted. I've experienced over-zealous "crash recovery" code in applications overwrite the last known good copy of my data with the latest, most up-to-date, and also most-corrupted data after it detected a problem. Not nice at all.An even better idea is to use rolling backups, with the crash recovery backup only being the most recent, not the only, version.
Jun 01 2017
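The rolling-backup idea above can be sketched roughly as follows. This is a minimal illustration, not code from any poster: `rollingSave`, the `.1`/`.2`/`.3` suffix scheme and the default of three generations are all assumptions made for the example, and the final write is not atomic, so a real implementation would probably write to a temporary file and rename it into place.

```d
import std.conv : text;
import std.file : exists, readText, remove, rename, tempDir, write;
import std.path : buildPath;

/// Hypothetical helper: saves `data` to `path`, first shifting the previous
/// contents to `path.1`, the old `path.1` to `path.2`, and so on, so that up
/// to `keep` older generations survive next to the most recent save.
void rollingSave(string path, string data, size_t keep = 3)
{
    // Shift existing backups up by one, starting with the oldest,
    // so nothing is overwritten before it has been moved.
    foreach_reverse (n; 1 .. keep)
    {
        immutable from = text(path, ".", n);
        if (from.exists)
            rename(from, text(path, ".", n + 1));
    }
    // The previous "current" file becomes the newest backup.
    if (keep > 0 && path.exists)
        rename(path, text(path, ".1"));
    write(path, data);
}

void main()
{
    string path = buildPath(tempDir, "rolling_save_demo.txt");
    // Start from a clean slate so the demo is repeatable.
    foreach (p; [path, path ~ ".1", path ~ ".2", path ~ ".3"])
        if (p.exists)
            remove(p);

    rollingSave(path, "v1");
    rollingSave(path, "v2");
    rollingSave(path, "v3");

    // The latest save is current; older versions are still recoverable.
    assert(readText(path) == "v3");
    assert(readText(path ~ ".1") == "v2");
    assert(readText(path ~ ".2") == "v1");
}
```

With this scheme a crash-recovery save that turns out to be corrupted only replaces the newest generation; the earlier ones remain on disk.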
On Thursday, 1 June 2017 at 14:04:40 UTC, Guillaume Piolat wrote: Solved by auto-saving, _before_ the crash. That only works for simple applications.
Jun 02 2017
On 5/31/17 9:05 PM, Walter Bright wrote:On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:You could say that about any error. You could say that about malformed unicode strings, malformed JSON data, file not found. In this mindset, everything should be an Error, and nothing should be recoverable.Technically this is a programming error, and a bug. But memory hasn't actually been corrupted.Since you don't know where the bad index came from, such a conclusion cannot be drawn.No, those are due to the implementation of the interpreter. If the interpreter is implemented in safe D, then you don't have those problems.This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.Hence the endless vectors for malware insertion in those other frameworks.Not necessarily. A file name could be sourced from the program, but have a typo. An index could come from the environment. The library can't know, but makes assumptions one way or the other. Just like we assume you want to use the GC, these assumptions are harmful for those who need it to be the other way.Compare this to, let's say, a malformed unicode string (exception),malformed JSON data (exception), file not found (exception), etc. That's because those are input and environmental errors, not programming bugs.There can be grey areas in classifying problems as input errors or programming bugs, and those will need some careful thought by the programmer as to which bin they fall into, and then code accordingly. Array overflows are not a grey area, however. They are always programming bugs.Of course, programming bugs cause all kinds of Errors and Exceptions alike. Environmental bugs can cause Array overflows. I can detail exactly what happened in my code -- I am accepting dates from a given week from a web request. One of the dates fell outside the week, and so tried to access a 7 element array with index 9. 
Nothing corrupted memory, but the runtime corrupted my entire process, forcing a shutdown. With an exception thrown, I still see the programming error, I still can fix it, and other web pages can still continue to be served.This topic comes up regularly in this forum - the idea that a program that entered an unknown, undefined state is actually ok and can continue executing. Maybe that's fine on a system (such as a gaming console) where nobody cares if it goes off the deep end and it is not connected to the internet so it cannot propagate malware infections.In fact, it did not enter such a state. The runtime successfully *prevented* such a state. And then instantaneously ruined the state by unwinding the stack withoutOtherwise, while it's hard to write invulnerable programs, it is another thing entirely to endorse vulnerabilities. I cannot endorse such practices, nor can I endorse vibe.d if it is coded to continue running after entering an undefined state.It's not. And it can't be. What I have to do is re-engineer the contract between myself and arrays. The only way to do that is to not use builtin arrays. That's the part that sucks. My code will be perfectly safe, and not ever experience corruption. It's just a bit ugly.A corollary is the idea that one creates reliable systems by writing programs that can continue executing after corruption. This is another fallacious concept. Reliable systems are ones that have independent components that can take over if some part of them fails. Shared memory is not independence.That is not what is happening here. I'm avoiding corruption so I don't have to crash. -Steve
Jun 01 2017
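For the scenario described in the post above (dates from a web request indexing a 7-element week array), one way to keep the failure in Exception territory is to validate the date before it ever becomes an index. The following is only a sketch under assumptions: `dayIndex`, the `hours` array and the concrete dates are invented for illustration and are not taken from the poster's application.

```d
import std.datetime : Date;
import std.exception : assertThrown, enforce;

/// Hypothetical helper: maps a date onto an index 0 .. 6 within the week
/// starting at `weekStart`, throwing a recoverable Exception (via enforce)
/// when the date lies outside that week.
size_t dayIndex(Date weekStart, Date day)
{
    immutable offset = (day - weekStart).total!"days";
    enforce(offset >= 0 && offset < 7,
        "date " ~ day.toISOExtString() ~ " is outside the week starting "
        ~ weekStart.toISOExtString());
    return cast(size_t) offset;
}

void main()
{
    int[7] hours;
    immutable monday = Date(2017, 5, 29);

    // A date inside the week maps to a valid index.
    hours[dayIndex(monday, Date(2017, 5, 31))] = 8;
    assert(hours[2] == 8);

    // Nine days after the start of the week: rejected as bad input with an
    // Exception instead of reaching the array and raising a RangeError.
    assertThrown!Exception(dayIndex(monday, Date(2017, 6, 7)));
}
```

A request handler can catch the Exception and report the bad date to the client, while any remaining out-of-bounds access would still surface as a RangeError.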
On Thursday, June 01, 2017 06:26:24 Steven Schveighoffer via Digitalmars-d wrote:On 5/31/17 9:05 PM, Walter Bright wrote:I think that it really comes down to what the contract is and how it makes sense to treat bad values. At the one extreme, you can treat all bad input as programmer error, requiring that callers validate all arguments to all functions (in which case, assertions or some other type of Error would be used on failure), and at the other extreme, you can be completely defensive about it and can have every function validate its input and throw an Exception on failure so that the checks never get compiled out, and the caller can choose whether they want to recover or not. Both approaches are of course rather extreme, and what we should do is somewhere in the middle. So, for any given function, we need to decide whether we want to take the DbC approach and require that the caller validates the input or take the defensive programming approach and have the function itself validate the input. Which makes more sense depends on what the function does and how it's used and is a bit of an art. But ultimately, whether something is a programming error depends on what the API and its contracts are, and that definitely does not mean that one-size-fits-all. As a default, I think that treating invalid indices as an Error makes a lot of sense, but it is true that if the index comes from user input or is otherwise inferred from user input, having the checks result in Errors is annoying. But you can certainly do additional checks yourself, and if you wrap the actual call to index the array in an trusted function, it should be possible to avoid there being two checks in the case that the index is valid. I get the impression that Walter tends to prefer treating stuff as programmatic error due to the types of programs that he usually writes. 
You get a lot fewer things that come from user input when you're simply processing a file (like you do with a compiler) than you get with stuff like a server application or a GUI. So, I think that he's more inclined to come to the conclusion that something should be treated as programmatic error than some other folks are. That being said, I also think that many folks are too willing to try and make their program continue like nothing was wrong after something fairly catastrophic happened.On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:You could say that about any error. You could say that about malformed unicode strings, malformed JSON data, file not found. In this mindset, everything should be an Error, and nothing should be recoverable.Technically this is a programming error, and a bug. But memory hasn't actually been corrupted.Since you don't know where the bad index came from, such a conclusion cannot be drawn.Well, you _can_ use the built-in arrays and just use a helper function for indexing arrays so that the arrays are passed around normally, but you get an Exception for an invalid index rather than an Error. You would have to be careful to remember to index the array through the helper function, but it wouldn't force you to use different data structures. e.g. auto result = arr[i]; becomes something like auto result = arr.at(i); As an aside, I think that there has been way too much talk of memory corruption in this thread, and much of it has derailed the discussion from the actual issue. The array bounds checking prevented the memory corruption problem. The question here is how to deal with invalid indices and whether it should be treated as programmer error or bad input, and that's really a question of whether arrays should use DbC or defensive programming and whether there should be a way to choose based on your application's needs. 
- Jonathan M DavisOtherwise, while it's hard to write invulnerable programs, it is another thing entirely to endorse vulnerabilities. I cannot endorse such practices, nor can I endorse vibe.d if it is coded to continue running after entering an undefined state.It's not. And it can't be. What I have to do is re-engineer the contract between myself and arrays. The only way to do that is to not use builtin arrays. That's the part that sucks. My code will be perfectly safe, and not ever experience corruption. It's just a bit ugly.
Jun 01 2017
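The `arr.at(i)` helper suggested in the post above might look something like this. It is a sketch only: the name `at` comes from the post, but `IndexException` and the message format are assumptions made for the example. For simplicity it indexes with `arr[i]` after its own check, so a valid index is checked twice; the post's suggestion of a @trusted wrapper around the raw access would avoid that.

```d
import std.conv : text;
import std.exception : assertThrown;

/// Hypothetical exception type for recoverable indexing failures.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
        @safe pure nothrow
    {
        super(msg, file, line);
    }
}

/// Indexes a built-in slice, but reports a bad index with a catchable
/// Exception (including index and length) instead of a RangeError.
ref T at(T)(T[] arr, size_t i, string file = __FILE__, size_t line = __LINE__)
{
    if (i >= arr.length)
        throw new IndexException(
            text("index ", i, " is out of bounds for array of length ", arr.length),
            file, line);
    return arr[i];
}

void main()
{
    int[] hoursPerDay = new int[7];

    // Reads and writes work like ordinary indexing.
    hoursPerDay.at(2) = 8;
    assert(hoursPerDay.at(2) == 8);

    // A bad index is now an ordinary, recoverable Exception.
    assertThrown!IndexException(hoursPerDay.at(9));
}
```

The arrays themselves stay ordinary built-in slices; only the call sites that want Exception semantics need to change from `arr[i]` to `arr.at(i)`.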
On 6/1/2017 3:56 AM, Jonathan M Davis via Digitalmars-d wrote: I get the impression that Walter tends to prefer treating stuff as programmatic error due to the types of programs that he usually writes. You get a lot fewer things that come from user input when you're simply processing a file (like you do with a compiler) than you get with stuff like a server application or a GUI. So, I think that he's more inclined to come to the conclusion that something should be treated as programmatic error than some other folks are. It is a programming bug to not validate the input. It's not that bad to abort programs if you neglected to validate the input. It is always bad to treat programming bugs as input errors.
Jun 01 2017
On 01.06.2017 20:37, Walter Bright wrote: On 6/1/2017 3:56 AM, Jonathan M Davis via Digitalmars-d wrote: I get the impression that Walter tends to prefer treating stuff as programmatic error due to the types of programs that he usually writes. You get a lot fewer things that come from user input when you're simply processing a file (like you do with a compiler) than you get with stuff like a server application or a GUI. So, I think that he's more inclined to come to the conclusion that something should be treated as programmatic error than some other folks are. It is a programming bug to not validate the input. It's not that bad to abort programs if you neglected to validate the input. ... It really depends on the specific circumstances. It is always bad to treat programming bugs as input errors. They should be treated as bugs, but isn't it plausible that there are circumstances where one does not want to authorize every @safe library function one calls to bring down the entire process?
Jun 01 2017
On 6/1/2017 12:16 PM, Timon Gehr wrote:On 01.06.2017 20:37, Walter Bright wrote:The stages of programming expertise: 1. newbie - follows the rules because he is told to 2. master - follows the rules because he understands them 3. guru - breaks the rules because he understands the rules don't apply Let's not skip stages :-)It is a programming bug> to not validate the input. It's not that bad to abort programs if you neglected to validate the input. ...It really depends on the specific circumstances.You, as the programmer, need to decide what is validated data and what is not. Being unclear about this is technical debt that is going to cause problems. Validated data that is not valid is a programming bug and the program should be aborted.It is always bad to treat programming bugs as input errors.They should be treated as bugs, but isn't it plausible that there are circumstances where one does not want to authorize every safe library function one calls to bring down the entire process?
Jun 01 2017
On 01.06.2017 21:48, Walter Bright wrote:On 6/1/2017 12:16 PM, Timon Gehr wrote:This does not really say anything about programming expertise, it says that "the rules" (whatever those are) are incomplete (unless there are no gurus, but then the list is nothing but silly). I guess "terminate the program upon detection of a bug" is one of your rules. It's incomplete, but the language specification enforces it (for a subset of bugs).On 01.06.2017 20:37, Walter Bright wrote:The stages of programming expertise: 1. newbie - follows the rules because he is told to 2. master - follows the rules because he understands them 3. guru - breaks the rules because he understands the rules don't apply Let's not skip stages :-) ...It is a programming bug> to not validate the input. It's not that bad to abort programs if you neglected to validate the input. ...It really depends on the specific circumstances.There is not only one programmer and not all programmers are me.You, as the programmer, need to decide what is validated data and what is not.It is always bad to treat programming bugs as input errors.They should be treated as bugs, but isn't it plausible that there are circumstances where one does not want to authorize every safe library function one calls to bring down the entire process?Being unclear about this is technical debt that is going to cause problems. ...This is both obvious and not answering my question.Validated data that is not valid is a programming bugAgain, obvious.and the program should be aborted.The buggy subprogram should be. Let's say I want to use library functionality written over the course of years by non-computer scientist domain expert Random C. Monkey. The library is an ugly jungle of special cases but it is mostly correct and makes it trivial to add feature X to my product. It's also pure and safe without any trusted functions. I can still serve customers if this library occasionally misbehaves, at a lower quality. 
(Let's say it is trivial to check whether the code returned a correct result, even though building the result in the first place was hard.) I cannot trust Mr. Monkey to have written only correct code respecting array bounds and null pointers, but if my product does not (seem to) have feature X by tomorrow, I'm most likely going out of business. Now, why exactly should any of Mr. Monkey's bugs terminate my entire service, necessitating a costly restart and causing unnecessary frustration to my customers? I'm pretty sure D should not outright prevent this use case, even though in an ideal world this situation would never arise.
Jun 01 2017
On 6/1/2017 1:47 PM, Timon Gehr wrote: I'm pretty sure D should not outright prevent this use case, even though in an ideal world this situation would never arise. C quality code is straightforward in D. Just mark it @system.
Jun 01 2017
On 01.06.2017 23:12, Walter Bright wrote: On 6/1/2017 1:47 PM, Timon Gehr wrote: I'm pretty sure D should not outright prevent this use case, even though in an ideal world this situation would never arise. C quality code is straightforward in D. Just mark it @system. I don't know what this is, but it is not an answer to my post.
Jun 01 2017
On Thursday, 1 June 2017 at 10:26:24 UTC, Steven Schveighoffer wrote:On 5/31/17 9:05 PM, Walter Bright wrote:Everything coming as an input of the _process_ should be validated... once validated, if still find during the execution malformed JSON data, malformed unicode strings, etc, there's a bug, and the process should terminate.On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:You could say that about any error. You could say that about malformed unicode strings, malformed JSON data, file not found. In this mindset, everything should be an Error, and nothing should be recoverable.Technically this is a programming error, and a bug. But memory hasn't actually been corrupted.Since you don't know where the bad index came from, such a conclusion cannot be drawn.It seems to me that reducing the danger only to corrupted memory is underestimating the damage that can be done, for example by a simple SQL injection, that can be done without corrupting memory at all.No, those are due to the implementation of the interpreter. If the interpreter is implemented in safe D, then you don't have those problems.This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.Hence the endless vectors for malware insertion in those other frameworks.The library should not assume nothing about anything coming from the environment, the filesystem, etc: there's must be a validation at the boundaries.Not necessarily. A file name could be sourced from the program, but have a typo. An index could come from the environment. The library can't know, but makes assumptions one way or the other. Just like we assume you want to use the GC, these assumptions are harmful for those who need it to be the other way.Compare this to, let's say, a malformed unicode string (exception),malformed JSON data (exception), file not found (exception), etc. 
That's because those are input and environmental errors, not programming bugs.I can detail exactly what happened in my code -- I am accepting dates from a given week from a web request. One of the dates fell outside the week, and so tried to access a 7 element array with index 9. Nothing corrupted memory, but the runtime corrupted my entire process, forcing a shutdown.And that's a good thing! The input should be validated, especially because we are talking about a web request. See it like being kind with the other side of the connection, informing it with a clear "rejected as the date is invalid". :-) /Paolo
Jun 01 2017
On 01.06.2017 14:25, Paolo Invernizzi wrote:You seem to not understand what happened. There was a single server serving multiple different web pages. There was an out-of-bounds error due to a single user inserting invalid data into a single form with missing data validation. The web server went down, killing all pages for all users. There is no question that input data should be validated, but if it isn't, the response should be proportional. It's enough to kill the request, log the exception , notify the developer, and maybe even disable the specific web page.I can detail exactly what happened in my code -- I am accepting dates from a given week from a web request. One of the dates fell outside the week, and so tried to access a 7 element array with index 9. Nothing corrupted memory, but the runtime corrupted my entire process, forcing a shutdown.And that's a good thing! The input should be validated, especially because we are talking about a web request. See it like being kind with the other side of the connection, informing it with a clear "rejected as the date is invalid". :-)
Jun 01 2017
On Thursday, 1 June 2017 at 18:54:51 UTC, Timon Gehr wrote: On 01.06.2017 14:25, Paolo Invernizzi wrote: I really understand what is happening: I have a vibe.d server that's serving a US top 5 FMCG world company, and sometimes it goes down with a crash. It's dockerized, in a docker swarm, and every time it crashes (or is "unhealthy") it's restarted, and we have a log that is helping us squash bugs. Guess what, it's not a problem for the customer (at least right now!) as long as we have taken a clear approach: we are squashing bugs, and if the process state is signalling to us that a bug has occurred, we simply pull the plug. A proportional response can be achieved by having multiple processes handling the requests. It's the only sane way I can think of to not kill "all" the sessions, but only a portion. /Paolo You seem to not understand what happened. There was a single server serving multiple different web pages. There was an out-of-bounds error due to a single user inserting invalid data into a single form with missing data validation. The web server went down, killing all pages for all users. There is no question that input data should be validated, but if it isn't, the response should be proportional. It's enough to kill the request, log the exception, notify the developer, and maybe even disable the specific web page. I can detail exactly what happened in my code -- I am accepting dates from a given week from a web request. One of the dates fell outside the week, and so tried to access a 7 element array with index 9. Nothing corrupted memory, but the runtime corrupted my entire process, forcing a shutdown. And that's a good thing! The input should be validated, especially because we are talking about a web request. See it like being kind with the other side of the connection, informing it with a clear "rejected as the date is invalid". :-)
Jun 01 2017
On Thursday, 1 June 2017 at 21:55:55 UTC, Paolo Invernizzi wrote:On Thursday, 1 June 2017 at 18:54:51 UTC, Timon Gehr wrote:Pretty much it. Containerisation of several stateless instances is pretty much the scalable approach going forward.[...]I really understand what is happening: I've a vibe.d server that's serving a US top 5 FMCG world company, and sometime it goes down for a crash. [...]
Jun 01 2017
On 6/1/17 8:25 AM, Paolo Invernizzi wrote:On Thursday, 1 June 2017 at 10:26:24 UTC, Steven Schveighoffer wrote:If only that is what happened, I would not have started this thread! In any case, the way forward is clear -- create containers that don't throw Error, and make them easy to use. I think I will actually publish them, because it's a very useful thing to have. You can validate your input all you want, but if you have a program bug, or there is something you didn't consider, then the entire server isn't crashed because of it. I *like* the bounds checking, I don't have to translate back to the input what it will mean for every array access in the function -- the simple check is enough. Still good to have it auto-restart, which I will also do. But having some sort of feedback to the client, and an attempt to continue on with other unrelated requests is preferable. -SteveI can detail exactly what happened in my code -- I am accepting dates from a given week from a web request. One of the dates fell outside the week, and so tried to access a 7 element array with index 9. Nothing corrupted memory, but the runtime corrupted my entire process, forcing a shutdown.And that's a good thing! The input should be validated, especially because we are talking about a web request. See it like being kind with the other side of the connection, informing it with a clear "rejected as the date is invalid".
Jun 02 2017
On 06/02/2017 01:26 PM, Steven Schveighoffer wrote:If only that is what happened, I would not have started this thread! In any case, the way forward is clear -- create containers that don't throw Error, and make them easy to use. I think I will actually publish them, because it's a very useful thing to have. You can validate your input all you want, but if you have a program bug, or there is something you didn't consider, then the entire server isn't crashed because of it. I *like* the bounds checking, I don't have to translate back to the input what it will mean for every array access in the function -- the simple check is enough. Still good to have it auto-restart, which I will also do. But having some sort of feedback to the client, and an attempt to continue on with other unrelated requests is preferable. -SteveHi, I think that most people agree that an out-of-bounds access is a bug that needs to be fixed, this shouldn't be an acceptable way of running the program. The question here seems to be what to do *in the meanwhile*, and here is the problem. I can understand the position that from a theoretical point of view the process is already unsafe at this point, and that the best option is to stop (and restart if needed). But, in the real world if I've got a (web)server that has proper isolation, I'd much rather have a server that sends back a 500 [error message] for the buggy page and keeps working otherwise, than one that is killed and has to be restarted every time a buggy page is asked. Think that it can be a multithreaded server, and that other ongoing (and safe!) tasks might be affected, and that safe restart, even when available, often has a performance hit. I agree that one (perhaps even the proper) way to get this is through process isolation, but this doesn't mean that the language shouldn't allow it if needed and explicitly required. 
There are ways for the programmer to explicitly disable most other security features (__gshared, casting away shared and immutable, trusted code, etc.) so why not this one? Perhaps an intermediate solution would be to offer a compiler switch that allows Errors to be safely caught (that is, they behave as exceptions). As far as I understand from reading this thread, that's already the case in debug builds, so it cannot be that bad practice, but it would be nice to have a mode that it's otherwise "release", only with this feature turned on. Even better would be to turn on this behaviour on a per-function basis (say throwErrors). Although perhaps that'd be promoting this behaviour a bit too much... Anyway, just 2¢ from a half-newbie (okay, still full-newbie :) )
Jun 02 2017
On 6/2/17 7:55 AM, Arafel wrote:But, in the real world if I've got a (web)server that has proper isolation, I'd much rather have a server that sends back a 500 [error message] for the buggy page and keeps working otherwise, than one that is killed and has to be restarted every time a buggy page is asked.Yes, exactly what I want.Perhaps an intermediate solution would be to offer a compiler switch that allows Errors to be safely caught (that is, they behave as exceptions). As far as I understand from reading this thread, that's already the case in debug builds, so it cannot be that bad practice, but it would be nice to have a mode that it's otherwise "release", only with this feature turned on.I don't think this is workable, simply because of nothrow. An Error is allowed to be thrown in nothrow code, and the compiler can simultaneously assume that nothrow functions won't throw. Therefore it can legally omit the scaffolding for deallocating scope variables when an Exception is thrown (for performance reasons), and leave your program in an invalid state. The only conclusion I can come to is that I need to write my own array types. This isn't going to be so bad as I thought, and likely will just become second nature to use them. -Steve
Jun 02 2017
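The point about nothrow in the post above can be checked at compile time. The snippet below is a small sketch (the function name `pick` is made up for the example): it shows that a nothrow function may still perform a bounds-checked access and may even throw an Error explicitly, while throwing an Exception from nothrow code is rejected. It does not demonstrate the skipped cleanup itself, which depends on the compiler and build flags.

```d
/// Hypothetical example function: compiles as nothrow even though the
/// bounds check on `arr[i]` can raise a RangeError, because Errors are
/// not part of what nothrow rules out.
int pick(const int[] arr, size_t i) nothrow @safe
{
    return arr[i];
}

// A nothrow function literal is allowed to throw an Error explicitly ...
static assert( __traits(compiles, () nothrow { throw new Error("bug"); }));
// ... but the compiler rejects throwing an Exception from it.
static assert(!__traits(compiles, () nothrow { throw new Exception("input"); }));

void main()
{
    assert(pick([10, 20, 30], 1) == 20);
}
```

Because callers of a nothrow function are compiled on the assumption that nothing recoverable propagates out of it, code that catches an Error thrown through such a function cannot rely on destructors or scope guards along the way having run.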
On 06/02/2017 02:12 PM, Steven Schveighoffer wrote:Well, as I understood from this thread this is already possible in debug mode:Perhaps an intermediate solution would be to offer a compiler switch that allows Errors to be safely caught (that is, they behave as exceptions). As far as I understand from reading this thread, that's already the case in debug builds, so it cannot be that bad practice, but it would be nice to have a mode that it's otherwise "release", only with this feature turned on.I don't think this is workable, simply because of nothrow. An Error is allowed to be thrown in nothrow code, and the compiler can simultaneously assume that nothrow functions won't throw. Therefore it can legally omit the scaffolding for deallocating scope variables when an Exception is thrown (for performance reasons), and leave your program in an invalid state.An Exception leads to unwinding&cleanup, an Error to termination (with unwinding&cleanup in debug mode for debugging purposes).If it is indeed so, then adding a switch that only removes this optimization (from nothrow code) but is otherwise a release version shouldn't be too hard to implement? Even if not, making nothrow a no-op w.r.t. unwinding should still be possible and not too hard (sorry if I'm being naïve here, I don't know how hard it would be to implement, but conceptually it seems straightforward). Of course, one must be willing to take the performance hit.
Jun 02 2017
On 6/2/17 9:00 AM, Arafel wrote:On 06/02/2017 02:12 PM, Steven Schveighoffer wrote:Yes, of course. This is a non-starter if you need to compile release mode (and you do, my relatively small app is 47MB in debug mode, 20MB in release mode, and I can't imagine performance doing very well). -SteveWell, as I understood from this thread this is already possible in debug mode:Perhaps an intermediate solution would be to offer a compiler switch that allows Errors to be safely caught (that is, they behave as exceptions). As far as I understand from reading this thread, that's already the case in debug builds, so it cannot be that bad practice, but it would be nice to have a mode that it's otherwise "release", only with this feature turned on.I don't think this is workable, simply because of nothrow. An Error is allowed to be thrown in nothrow code, and the compiler can simultaneously assume that nothrow functions won't throw. Therefore it can legally omit the scaffolding for deallocating scope variables when an Exception is thrown (for performance reasons), and leave your program in an invalid state.An Exception leads to unwinding&cleanup, an Error to termination (with unwinding&cleanup in debug mode for debugging purposes).If it is indeed so, then adding a switch that only removes this optimization (from nothrow code) but is otherwise a release version shouldn't be too hard to implement? Even if not, making nothrow a no-op w.r.t. unwinding should still be possible and not too hard (sorry if I'm being naïve here, I don't know how hard it would be to implement, but conceptually it seems straightforward). Of course, one must be willing to take the performance hit.
Jun 02 2017
On Thursday, 1 June 2017 at 10:26:24 UTC, Steven Schveighoffer wrote:On 5/31/17 9:05 PM, Walter Bright wrote:I think the idea is that no, array overflows can never be caused by the environment in a correct program. If you don't adequately screen the environmental input, your program is incorrect. This is how I think about it: There are 3 categories of bugs: known safe to survive, known unsafe to survive, unknown safety. Range Errors are an example of errors that can be considered "unknown safety", so by default we assume it is unsafe to continue. If you - as the human developer - decide that the specific RangeError bug from this place in the code is actually known safe to survive, you should add screening for the bad value and throw an Exception instead, or if that's difficult to do then catch the Error and then throw an Exception*. Note that these aren't fixes for the bug, these are explicit recognition of the continued existence of the bug while promising ( trusted style) that everything will still be OK. If you decide it is truly an "unsafe to continue" bug, then let it carry on crashing there. Ultimately of course you screen the environment at the appropriate level or fix the bug, do the "right thing" whatever that may be. *note that you could abstract this away into an array type that throws Exceptions, but where would you know it was safe to use? Perhaps not so many places. Tldr; if you know that a bug is safe to continue/recover from, put in the necessary code to do so. I would be interested to see ideas of how to implement some sort of logical sandboxing in D. Perhaps if one calls a strongly pure safe function, there is no way it can mess up shared state, so you know that as long as you disregard the result it will always be safe to continue... Effectively it's a "process within a process" or something like that. 
Of course you'd need to be able to guarantee you can catch Errors, plus even though the function you've called can't have *caused* the problem, it might be the only place where you *find* the problem and that might be bad enough to not want to continue from...On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:You could say that about any error. You could say that about malformed unicode strings, malformed JSON data, file not found. In this mindset, everything should be an Error, and nothing should be recoverable.Technically this is a programming error, and a bug. But memory hasn't actually been corrupted.Since you don't know where the bad index came from, such a conclusion cannot be drawn.No, those are due to the implementation of the interpreter. If the interpreter is implemented in safe D, then you don't have those problems.This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.Hence the endless vectors for malware insertion in those other frameworks.Not necessarily. A file name could be sourced from the program, but have a typo. An index could come from the environment. The library can't know, but makes assumptions one way or the other. Just like we assume you want to use the GC, these assumptions are harmful for those who need it to be the other way.Compare this to, let's say, a malformed unicode string (exception),malformed JSON data (exception), file not found (exception), etc. That's because those are input and environmental errors, not programming bugs.There can be grey areas in classifying problems as input errors or programming bugs, and those will need some careful thought by the programmer as to which bin they fall into, and then code accordingly. Array overflows are not a grey area, however. They are always programming bugs.Of course, programming bugs cause all kinds of Errors and Exceptions alike. Environmental bugs can cause Array overflows.
Jun 01 2017
On Thursday, 1 June 2017 at 14:10:21 UTC, John Colvin wrote:
I would be interested to see ideas of how to implement some sort of logical sandboxing in D. Perhaps if one calls a strongly pure @safe function, there is no way it can mess up shared state,

Oh yes, there is a way: http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org
Jun 01 2017
On Thursday, 1 June 2017 at 14:10:21 UTC, John Colvin wrote:
I would be interested to see ideas of how to implement some sort of logical sandboxing in D. Perhaps if one calls a strongly pure @safe function, there is no way it can mess up shared state,

On Thursday, 1 June 2017 at 14:21:35 UTC, Stanislav Blinov wrote:
Oh yes, there is a way: http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org

Sure, @safe has some holes as it currently stands.
Jun 01 2017
On Thursday, 1 June 2017 at 14:10:21 UTC, John Colvin wrote:
I would be interested to see ideas of how to implement some sort of logical sandboxing in D. Perhaps if one calls a strongly pure @safe function, there is no way it can mess up shared state,

On Thursday, 1 June 2017 at 14:21:35 UTC, Stanislav Blinov wrote:
Oh yes, there is a way: http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org

On Thursday, June 01, 2017 14:40:59 John Colvin via Digitalmars-d wrote:
Sure, @safe has some holes as it currently stands.

It's far better than nothing, but it definitely has holes. DIP 1000 is fixing a lot of those holes. Unfortunately, the only way to absolutely guarantee that it doesn't have any holes is to do it via whitelisting operations and then vetting every operation to make sure that it's safe for the compiler to say that it's @safe, whereas it's implemented by blacklisting operations that are determined to be unsafe. So, we'll probably always be at risk of having holes in @safe, but the situation is improving.

- Jonathan M Davis
Jun 01 2017
On 6/1/2017 7:21 AM, Stanislav Blinov wrote:
Oh yes, there is a way: http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org

Please post bug reports to bugzilla. Posting them only on the n.g. pretty much ensures they will never get addressed.
Jun 01 2017
On 6/1/2017 7:21 AM, Stanislav Blinov wrote:
Oh yes, there is a way: http://forum.dlang.org/post/psdamamjecdwfeiuvqsz@forum.dlang.org

On Thursday, 1 June 2017 at 18:40:28 UTC, Walter Bright wrote:
Please post bug reports to bugzilla. Posting them only on the n.g. pretty much ensures they will never get addressed.

Please look at the very first post of that thread :\
Jun 01 2017
On Thu, Jun 01, 2017 at 06:26:24AM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]Of course, programming bugs cause all kinds of Errors and Exceptions alike. Environmental bugs can cause Array overflows. I can detail exactly what happened in my code -- I am accepting dates from a given week from a web request. One of the dates fell outside the week, and so tried to access a 7 element array with index 9. Nothing corrupted memory, but the runtime corrupted my entire process, forcing a shutdown.[...] Isn't this a case of failing to sanitize user input adequately before using it for internal processing? And failing to test the code with pathological data to ensure resilience before deploying to a live server? In this case, nothing worse happened than an out-of-bounds array index. But we all know what *could* happen with unsanitized user input in other cases... T -- Stop staring at me like that! It's offens... no, you'll hurt your eyes!
Jun 01 2017
On 6/1/2017 3:26 AM, Steven Schveighoffer wrote:
On 5/31/17 9:05 PM, Walter Bright wrote:

What's missing here is looking carefully at a program and deciding what are input (and environmental) errors and what are program bugs. The former are recoverable, the latter are not.

For example, malformed unicode strings. Joel Spolsky wrote about this issue long ago: data in a program should be compartmentalized into untrusted and trusted data. Untrusted data comes from the input, and stays untrusted until it is validated. Malformed untrusted data are recoverable. Once it is validated, it becomes trusted data. Any malformations in trusted data are programming bugs. It should be clear in a well designed program what data is trusted and what data is untrusted. Spolsky suggests using different types for them so they are distinct.

For your date case, the date was not validated, and was fed into an array, where the invalid date overflowed the array bounds. The program was relying on the array bounds checking to validate the data. I'd argue this is a problematic program design because:

1. It's inefficient. Data should be validated once in a clear location in the program. Arrays appear all over the place, and tend to be in hot locations. Validating the same data over and over is highly inefficient.

2. Array bounds checking can be turned off by a compiler switch. Program data validation should not be silently disabled in such an unexpected manner.

3. Arrays are a ubiquitous data structure. They are used all over the place. There is no way to distinguish "this is a data validation use" and "this must be valid data".

4. It would be surprising to anyone familiar with D looking at your code to realize that an array access is data validation rather than bug checking.

5. Arrays are sometimes optimized by removing the bounds checking. This should not turn off data validation.

6. @safe code is intended to find programming bugs, not validate input data.

7. Just because code is marked @safe doesn't mean memory corruption is impossible. Even if @safe is perfect, programs have @trusted and @system code too, and those may have memory corrupting bugs.

8. It does not distinguish array overflow from programming bugs / corruption from invalid program input.

On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:

You could say that about any error. You could say that about malformed unicode strings, malformed JSON data, file not found. In this mindset, everything should be an Error, and nothing should be recoverable.

Technically this is a programming error, and a bug. But memory hasn't actually been corrupted.

Since you don't know where the bad index came from, such a conclusion cannot be drawn.
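The "validate once, in a clear location, and give the result a distinct type" idea could look something like the following for the day-of-week case. This is only a sketch under assumptions: `WeekdayIndex`, `fromUntrusted` and `dayName` are hypothetical names, and the input is assumed to arrive as a string.

```d
import std.conv : ConvException, to;

/// Hypothetical type: holding a WeekdayIndex means the value was validated
/// (or default-initialized to 0, which is also in range).
struct WeekdayIndex
{
    private size_t value;

    /// The single validation point. Bad input is an input error, so it is
    /// reported with a recoverable Exception.
    static WeekdayIndex fromUntrusted(string raw)
    {
        int n;
        try
        {
            n = raw.to!int;
        }
        catch (ConvException e)
        {
            throw new Exception("day is not a number: " ~ raw);
        }
        if (n < 0 || n >= 7)
            throw new Exception("day out of range: " ~ raw);
        return WeekdayIndex(n);
    }

    size_t get() const { return value; }
}

/// Inner logic accepts only validated data; a range violation in here
/// would be a genuine program bug.
string dayName(WeekdayIndex day)
{
    static immutable string[7] names =
        ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"];
    return names[day.get];
}

unittest
{
    import std.exception : assertThrown;

    assert(dayName(WeekdayIndex.fromUntrusted("2")) == "Wed");
    assertThrown!Exception(WeekdayIndex.fromUntrusted("9"));
    assertThrown!Exception(WeekdayIndex.fromUntrusted("abc"));
}
```

Because `value` is private, code outside the module can obtain a non-default `WeekdayIndex` only through the validating function.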
Jun 01 2017
On Thu, Jun 01, 2017 at 11:29:53AM -0700, Walter Bright via Digitalmars-d wrote: [...]Untrusted data comes from the input, and stays untrusted until it is validated. Malformed untrusted data are recoverable. Once it is validated, it becomes trusted data. Any malformations in trusted data are programming bugs. It should be clear in a well designed program what data is trusted and what data is untrusted. Spolsky suggests using different types for them so they are distinct. For your date case, the date was not validated, and was fed into an array, where the invalid date overflowed the array bounds. The program was relying on the array bounds checking to validate the data.+1. I think this is the root of the problem. Data that comes from outside sources must never, ever be trusted, until they are validated. Any errors that occur during validation are recoverable, because you *know* they are caused by wrong data from outside. Once the data is validated, any further errors involving that data are program bugs: either your validation code was incorrect / incomplete, or there is a program logic error that led to an inconsistent state. In this case, aborting the program is the only sane response, especially in an online services setting, because your broken validation code may have let through maliciously-crafted data that can lead to an exploit (better nip it in the bud before the exploit proceeds any further), or the internal program logic is inconsistent, so proceeding further is UB. Feeding unvalidated, tainted data directly into inner program logic like indexing an array is a bad idea. The data ought to be validated first. I like Spolsky's idea of using separate types for tainted / verified input. Let the compiler statically verify that you at least made an attempt at validating your program's inputs (though obviously it can only go so far -- the compiler can't guarantee that your validation code is actually correct). 
The problem, though, is that D currently doesn't have tainted types, so for example you can't tell at a glance whether a given string is untrusted user input or validated data, it's all just `string`. I wonder if tainted types could be something worth adding either to the language or to Phobos. [...]8. It does not distinguish array overflow from programming bugs / corruption from invalid program input.Yes, I think this conflation is the root cause of this problem. Validation should be explicit, and separate from inner program logic. Mixing the two together only serves to confuse the issue. T -- If you think you are too small to make a difference, try sleeping in a closed room with a mosquito. -- Jan van Steenbergen
Jun 01 2017
On Thursday, 1 June 2017 at 19:04:19 UTC, H. S. Teoh wrote:
I like Spolsky's idea of using separate types for tainted / verified input. Let the compiler statically verify that you at least made an attempt at validating your program's inputs (though obviously it can only go so far -- the compiler can't guarantee that your validation code is actually correct). The problem, though, is that D currently doesn't have tainted types, so for example you can't tell at a glance whether a given string is untrusted user input or validated data, it's all just `string`. I wonder if tainted types could be something worth adding either to the language or to Phobos.

I'm not familiar with the idea, do we need more than the following?

    import std.stdio : writeln;

    struct Tainted(T)
    {
        T _basetype;
        alias _basetype this;
    }

    void main(string[] args)
    {
        auto ts = Tainted!string("Hello");
        writeln(ts);
    }

It's a PoC, ok, but it lets you use ts like any variable of the base type, it lets you convert one easily to the other, but this conversion has to be explicit. So, real question, what more do we need?
Jun 01 2017
On Thu, Jun 01, 2017 at 10:09:36PM +0000, cym13 via Digitalmars-d wrote:
On Thursday, 1 June 2017 at 19:04:19 UTC, H. S. Teoh wrote:
[...]

Actually, I re-read Spolsky's blog post[1], and apparently he didn't actually recommend using the type system for enforcing this, but a naming convention that would make code stick out when it's doing something funny.

[1] https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/

So, for example, you'd name all tainted strings with the prefix `us`, and all functions that return tainted strings are prefixed with `us`, including any string identifiers you might use to refer to the tainted data. E.g.:

    string usName = usGetParam(httpRequest, "name");
    ...
    database.cache("usName", usName);
    ...
    string usData = database.read("usName");
    ...
    // sEscapeHtmlUs means it converts unsafe data (...Us) to safe
    // data (s...) by escaping dangerous characters.
    string sData = sEscapeHtmlUs(usData);
    ...
    // sWrite means it requires safe data
    sWrite(html, "<p>Your name is %s</p>", sData);

The idea is that if you see a line of code where the prefixes don't match, then you immediately know there's a problem. For example:

    // Uh-oh, we assigned unsafe data to a variable that should only
    // hold safe data.
    string sName = usGetParam(httpRequest, "name");

    // Uh-oh, we wrote unsafe data into a database field that should
    // only contain safe data.
    database.cache("sName", usName);

    // Uh-oh, we're printing unsafe data via a function that assumes
    // its input is safe.
    sWrite(html, "<p>Your name is %s</p>", usData);

This is not bad, since with some practice you could immediately identify code that's probably wrong (mixing s- and us- prefixes wrongly, or an identifier with no prefix, meaning the code needs to be reviewed and the identifier renamed accordingly). The problem is that this is still in the realm of coding by convention.

What I had in mind was more along the lines of what you proposed, that you'd actually use the type system to enforce a distinction between safe and unsafe data, so that the compiler will reject code that tries to mix the two without an explicit conversion.

I haven't thought too deeply about how to actually implement this, but here's my initial idea: any function that reads data from external sources (network, filesystem, environment) will return Tainted!string or Tainted!(T[]) rather than string or T[]. Unlike what you proposed above, the Tainted wrapper will *not* allow implicit conversion to the underlying type, because otherwise it defeats the purpose (pass Tainted!T to a function that expects T, and the compiler will automatically cast it to T for you: no good). So you cannot pass this data directly to a function that expects string or T[]. However, they will allow some way of accessing the wrapped data, so that the validation function can inspect the data to ensure that it's OK, then explicitly cast it to the underlying type.

Sketch of code:

    struct Tainted(T)
    {
        // Note: outside code cannot directly access payload.
        private T payload;

        T validate(alias isClean)()
            if (is(typeof(isClean(T.init)) == bool))
        {
            // Do not allow isClean to escape references to
            // payload (?is this correct usage?). Requires
            // -dip1000.
            scope _p = payload;
            if (isClean(_p))
                return payload;
            throw new Exception("Bad data");
        }

        T cleanse(alias cleaner)()
            if (is(typeof(cleaner(T.init)) == T))
        {
            // Prevent cleaner() from cheating and simply
            // returning the payload (?necessary?). Requires
            // -dip1000. The idea being to force the
            // creation of safe data from the payload, e.g.,
            // a HTML-escaped string from a raw string.
            scope _p = payload;
            return cleaner(_p);
        }
    }

    // Note: returns Tainted!T instead of T.
    Tainted!T readParam(T)(HttpRequest req, string paramName);

    // Note: requires string, not Tainted!string
    void writeToOutput(string s);

    void handleRequest(HttpRequest req)
    {
        string[7] daysOfWeek = [
            "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
        ];

        // Returns Tainted!int
        auto day = req.readParam!int("dayOfWeek");

        // Compile error: cannot index array with Tainted!int
        //writeToOutput(daysOfWeek[day]);

        // Check range and return unwrapped int if OK, throw
        // Exception otherwise.
        auto checkedDay = day.validate!(d => d >= 0 && d < daysOfWeek.length);
        writeToOutput(daysOfWeek[checkedDay]); // OK

        // Returns Tainted!string
        auto name = req.readParam!string("name");

        // Compile error: cannot pass Tainted!string to writeToOutput.
        //writeToOutput(name);

        // Unwrap to string if does not contain meta-characters,
        // throw Exception otherwise.
        auto safeName = name.validate!hasNoMetaCharacters;
        writeToOutput(safeName); // OK

        // Cleanse the string by escaping metacharacters.
        auto escapedName = name.cleanse!escapeHtmlMetaChars;
        writeToOutput(escapedName); // OK
    }

This is just a rough sketch, of course. A more complete implementation would have to consider what to do about code that obtains unsafe data directly from OS interfaces like core.stdc.stdio.fread that isn't wrapped by Tainted. Also, it would have to address what to do about functions like File.rawRead(), which writes to a user-provided buffer, since the caller could just read the tainted data directly from the buffer, bypassing any Tainted protections.

T

-- 
I'm still trying to find a pun for "punishment"...
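For anyone who wants to experiment, a cut-down variant of the Tainted sketch that should compile on its own without -dip1000 might look like this. It is an assumption-laden reduction, not the design proposed above: `validate` is written as a free function called via UFCS (so that local lambdas can be passed as alias arguments), the `scope` protection is dropped, and `readDayParam` is a made-up stand-in for a request-parameter reader.

```d
/// Wrapper with no `alias this`, so there is no implicit conversion to T.
struct Tainted(T)
{
    private T payload; // only reachable from within this module
}

/// Returns the payload if `isClean` accepts it; throws otherwise.
T validate(alias isClean, T)(Tainted!T input)
{
    if (isClean(input.payload))
        return input.payload;
    throw new Exception("Bad data");
}

/// Hypothetical stand-in for reading an int parameter from a request.
Tainted!int readDayParam(int rawFromClient)
{
    return Tainted!int(rawFromClient);
}

unittest
{
    import std.exception : assertThrown;

    static immutable string[7] daysOfWeek =
        ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"];

    auto day = readDayParam(2);

    // Indexing with the wrapper itself is rejected at compile time.
    static assert(!__traits(compiles, daysOfWeek[day]));

    auto checkedDay = day.validate!(d => d >= 0 && d < daysOfWeek.length);
    assert(daysOfWeek[checkedDay] == "Wed");

    // Out-of-range input becomes a recoverable Exception, not a RangeError.
    assertThrown!Exception(readDayParam(9).validate!(d => d >= 0 && d < 7));
}
```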
Jun 01 2017
On Friday, 2 June 2017 at 00:30:39 UTC, H. S. Teoh wrote:[...]Now that I think about it, what we really want going that way is an IO monad.
Jun 01 2017
On 6/1/2017 11:29 AM, Walter Bright wrote:Joel Spolsky wrote about this issue long ago, in that data in a program should be compartmentalized into untrusted and trusted data.Found it: https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/ It's one of those programming essays that everyone should read.
Jun 01 2017
On Thursday, 1 June 2017 at 18:29:53 UTC, Walter Bright wrote:What's missing here is looking carefully at a program and deciding what are input (and environmental) errors and what are program bugs. The former are recoverable, the latter are not. [...]I think he understood all that already. Array overflow is a sign of a bug, which must not be left to slip past. But I think the point was that it causes so big amount of work -the whole program- to abort. Potentially thousands of customers could lose connection to server because of that. He wishes that just the connection in question crashed, so other users using other, likely bugless, parts of the program would not be disturbed. Personally I have no opinion of this right now, save that it's definitely a tough sounding question.
Jun 01 2017
On 6/1/17 2:29 PM, Walter Bright wrote:For your date case, the date was not validated, and was fed into an array, where the invalid date overflowed the array bounds. The program was relying on the array bounds checking to validate the data.I think it's important to state that no, I wasn't relying on array bounds checks to validate the data. I should be validating the data (and am now). What I had was a bug in my program. What I have been saying is that in this framework, designed the way it is, there is no good reason to crash the entire process for such a bug. There are clear delineations of when the bug is in the "user code" section of vibe.d (i.e. the code in your project) and when it is in "system code" (i.e. the vibe.d framework). I want the "system code" section to continue to function when an out of bounds error happens in "user code", to give feedback to the user that no, this didn't work, there was an internal error. Other frameworks and languages don't have this issue. An out of range error in PHP doesn't crash apache. Similarly, a segfault in a program doesn't crash the OS. This is the way I view the layer between vibe.d framework and the code that handles requests. I get that the memory is shared, and there's a greater risk of corruption affecting the framework. The right answer is to physically separate the processes, and at some point, maybe vibe can move in that direction (I outlined what I considered a good setup in another post). But a more logical separation is still possible by requiring for instance that all request handlers are safe. Even in that case, crashing the *fiber* and not the entire process is still preferable in cases where the input isn't properly validated. Specifically, I'm talking about out-of-bounds failures, and not general asserts. This is why I'm still moving forward with making my arrays throw Exceptions for out-of-bounds issues (and will publish the library to do so in case anyone else feels the same way).2. 
Array bounds checking can be turned off by a compiler switch. Program data validation should not be silently disabled in such an unexpected manner.Most of your points are based on the assumption that this was a design decision, so they aren't applicable, but on this point I wanted to say: IMO, anything that's on the Internet should never have array bounds checking turned off. The risk is too great. -Steve
Jun 02 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:For example: int[3] arr; arr[3] = 5; Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served.In this case it is fairly obvious where the bad index is coming from... but in general it is impossible to say. So how much of your program is mad? You need to reset to some safe / correct point to continue. Which point? It is impossible for the compiler to determine that. Personally I would say the design fault is trying to build _everything_ into a single OS process. The mechanism that is guaranteed, enforced by the hardware, to recover all resources and reset to a sane point is OS process exit. ie. If you need "bug" tolerance, decompose your system into multiple processes. This actually has a large number of other benefits. (eg. Automagically concurrent) Of course, you then need to encode some common sense in the harness... if something keeps on starting up and dying within a very short period of time.... stop restarting it. Of course, this is just one (of many) ways that a program bug can screw up a system. For example it can start chewing way too many resources. So your harness needs to be able to limit that. And of course if you are going to decompose in processes, a process may spawn many more, so you need to shepherd all the subprocesses sanely..... ...and start the herd of processes in appropriate order, and shut them down appropriately.... Sounds like quite an intelligent harness... Fortunately one exists and has really carefully thought through all these issues. It's called systemd and works very well.
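The "common sense in the harness" part (stop restarting something that keeps dying right after startup) can be illustrated with a small sketch. It is not a substitute for systemd; `RestartPolicy`, `supervise`, and the thresholds are invented for the example, and the worker command is whatever the caller passes in.

```d
import core.time : Duration, seconds;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.process : spawnProcess, wait;

/// Hypothetical policy: give up after too many consecutive quick deaths.
struct RestartPolicy
{
    Duration minUptime = 5.seconds; // shorter runs count as "quick" failures
    int maxQuickFailures = 3;
    private int quickFailures;

    /// Call after each worker exit; returns true if it should be restarted.
    bool shouldRestart(Duration uptime)
    {
        quickFailures = uptime < minUptime ? quickFailures + 1 : 0;
        return quickFailures < maxQuickFailures;
    }
}

/// Runs `command` in a child process, restarting it until the policy
/// says to stop. Every exit is treated as a failure in this sketch.
void supervise(string[] command, RestartPolicy policy = RestartPolicy.init)
{
    for (;;)
    {
        auto sw = StopWatch(AutoStart.yes);
        wait(spawnProcess(command)); // blocks until the worker exits
        if (!policy.shouldRestart(sw.peek))
            break;
    }
}

unittest
{
    RestartPolicy p;
    assert(p.shouldRestart(1.seconds));  // 1st quick death: restart
    assert(p.shouldRestart(60.seconds)); // healthy run resets the counter
    assert(p.shouldRestart(1.seconds));
    assert(p.shouldRestart(1.seconds));
    assert(!p.shouldRestart(1.seconds)); // 3rd in a row: give up
}
```

Resource limits, ordering and shutdown of subprocesses, as listed above, are outside what this sketch covers.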
May 31 2017
On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via Digitalmars-d wrote: [...]Personally I would say the design fault is trying to build _everything_ into a single OS process. The mechanism that is guaranteed, enforced by the hardware, to recover all resources and reset to a sane point is OS process exit. ie. If you need "bug" tolerance, decompose your system into multiple processes. This actually has a large number of other benefits. (eg. Automagically concurrent)[...] Again, from an engineering standpoint, this is a tradeoff. The self-containment of an OS-level process is good for isolating it from affecting other processes, but they come with a cost. In the case of vibe.d, while I can't speak for the design rationales because I'm not involved in its development, it does appear to me that fibres were chosen because of their very low context-switch cost and memory requirements. If you were to turn the fibres into full-blown processes, that means incurring the cost of saving/restoring the full process context, because that's what it takes to achieve independence between processes. You need a bigger memory footprint because each process needs to have its own copy of data in order to ensure independence. It may very well be that for your particular design, process independence is important, so this price may be well worth paying. The fibre route chosen by vibe.d comes with the advantage of faster context switches and smaller memory footprint (and probably other perks as well), but the price you pay for that performance boost is that the fibres are not self-contained and isolated from each other. So if one fibre goes awry, you can no longer guarantee that the other fibres aren't also compromised. Hence if you wish to guarantee safety in case of logic errors like out-of-bounds array accesses, you're forced to have to reset the entire process before you can be absolutely sure you're back in a sane state. 
Which route to choose depends on the particulars of what you're trying to achieve, and how much / whether you're willing to pay the price to achieve what you want. T -- Today's society is one of specialization: as you grow, you learn more and more about less and less. Eventually, you know everything about nothing.
May 31 2017
On Thursday, 1 June 2017 at 06:11:43 UTC, H. S. Teoh wrote:On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via Digitalmars-d wrote: [...]That's exactly the point: to use the right tool for the requirement of the job to be done. /P[...][...] Again, from an engineering standpoint, this is a tradeoff. [...]
Jun 01 2017
On 01.06.2017 10:47, Paolo Invernizzi wrote:On Thursday, 1 June 2017 at 06:11:43 UTC, H. S. Teoh wrote:There is no such tool.On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via Digitalmars-d wrote: [...]That's exactly the point: to use the right tool for the requirement of the job to be done. /P[...][...] Again, from an engineering standpoint, this is a tradeoff. [...]
Jun 01 2017
On 2017-06-01 21:20, Timon Gehr wrote:There is no such tool.In this case, Erlang is a pretty good candidate. It's using green processes that are even more lightweight than fibers. You can have millions of these processes. All data is process local. If there's a corruption in one of the processes it cannot affect the other ones (unless there's a bug in the virtual machine). The major downside is that it's not D and it's a pretty crappy programming language. -- /Jacob Carlborg
Jun 01 2017
On Thursday, 1 June 2017 at 19:20:01 UTC, Timon Gehr wrote:On 01.06.2017 10:47, Paolo Invernizzi wrote:Process isolation was exactly crafted for that. /PaoloOn Thursday, 1 June 2017 at 06:11:43 UTC, H. S. Teoh wrote:There is no such tool.On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via Digitalmars-d wrote: [...]That's exactly the point: to use the right tool for the requirement of the job to be done. /P[...][...] Again, from an engineering standpoint, this is a tradeoff. [...]
Jun 01 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.Since I wrote/run a bunch of websites/network services written in D, here's my experience/advice: First, this is not something specific to array indexing, but an entire class of logic errors which are sometimes recoverable. Other examples are associative array indexing, division by zero, and out-of-memory errors resulting from underflows. All of these are due to bugs in the program, but could hypothetically be handled without compromising the integrity of the process. My advice: 1. Let the program crash. Make sure it's restarted afterwards, either via a looping script, or a watchdog. 2. Make sure you are notified of the error. I don't mean just recorded in a log file somewhere, but set it up so you receive an email any time it happens, with the stack trace. I run all my D network services from a cronjob, which automatically sends output by email. If you have the stack trace, most of these bugs take only a few minutes to fix - at the very least, turning the error into an exception is a trivial modification if you don't have time for a full root cause analysis at that moment. 3. Design your program so that it can be terminated at any point without resulting in data corruption. I don't know if Vibe.d can satisfy this constraint, but e.g. the ae.net.http.server workflow is to build/send the entire response atomically, meaning that the Content-Length will always be populated. Wrap your database updates in transactions. Use the "write to temporary file then rename over the original file" pattern when updating files. Etc.
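The "write to temporary file then rename over the original file" pattern from point 3 might be sketched like this with std.file. The helper name and the `.tmp` suffix are assumptions for illustration. The sketch does not flush to disk before renaming, so it protects readers from seeing a half-written file after a process crash but makes no durability promise across power loss.

```d
import std.file : rename, write;

/// Hypothetical helper: replaces the contents of `path` with `data` so
/// that other readers see either the old file or the new one.
void writeFileAtomically(string path, const(void)[] data)
{
    // Assumed naming scheme; the temp file must live on the same
    // filesystem as the target for the rename to replace it in one step.
    immutable tmp = path ~ ".tmp";
    write(tmp, data);  // a crash here leaves the original untouched
    rename(tmp, path); // swaps the new file into place
}

unittest
{
    import std.file : readText, remove, tempDir;
    import std.path : buildPath;

    immutable path = buildPath(tempDir, "atomic_write_demo.txt");
    scope (exit) remove(path);

    writeFileAtomically(path, "v1");
    writeFileAtomically(path, "v2");
    assert(readText(path) == "v2");
}
```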
Jun 01 2017
On 6/1/2017 2:53 AM, Vladimir Panteleev wrote:3. Design your program so that it can be terminated at any point without resulting in data corruption. I don't know if Vibe.d can satisfy this constraint, but e.g. the ae.net.http.server workflow is to build/send the entire response atomically, meaning that the Content-Length will always be populated. Wrap your database updates in transactions. Use the "write to temporary file then rename over the original file" pattern when updating files. Etc.This is the best advice. I.e. design with the assumption that failure will occur, rather than fruitlessly trying to prevent all failure.
Jun 01 2017
On 6/1/17 2:00 PM, Walter Bright wrote:On 6/1/2017 2:53 AM, Vladimir Panteleev wrote:Indeed it is good advice. I'm thinking actually a good setup is to have 2 levels of processes: one which delivers requests to some set of child processes that handle the requests with fibers, and one which handles the i/o to the client. Then if the subprocess dies, the master process can both inform the client of the failure, and retry other fibers that were in process but never had a chance to finish. Not sure if I'll get to that point. At this time, I'm writing an array wrapping struct that will turn all range errors into range exceptions. Then at least I can inform the client of the error and continue to handle requests. -Steve3. Design your program so that it can be terminated at any point without resulting in data corruption. I don't know if Vibe.d can satisfy this constraint, but e.g. the ae.net.http.server workflow is to build/send the entire response atomically, meaning that the Content-Length will always be populated. Wrap your database updates in transactions. Use the "write to temporary file then rename over the original file" pattern when updating files. Etc.This is the best advice. I.e. design with the assumption that failure will occur, rather than fruitlessly trying to prevent all failure.
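An array-wrapping struct of the kind described could start out roughly like this. It is only a guess at the shape of such a wrapper, not the library mentioned above; `CheckedArray` is a hypothetical name, and only indexing is covered (slicing and the like would need the same treatment).

```d
import std.conv : text;

/// Hypothetical wrapper: out-of-bounds indexing throws an Exception
/// instead of letting the runtime raise a RangeError.
struct CheckedArray(T)
{
    T[] data;

    ref T opIndex(size_t i)
    {
        if (i >= data.length)
            throw new Exception(
                text("index ", i, " out of bounds for length ", data.length));
        return data[i];
    }

    @property size_t length() const { return data.length; }
}

unittest
{
    import std.exception : assertThrown;

    auto week = CheckedArray!int(new int[7]);
    week[3] = 5; // assignment goes through the ref returned by opIndex
    assert(week[3] == 5);
    assert(week.length == 7);

    // Index 9 on a 7-element array: recoverable, the process keeps running.
    assertThrown!Exception(week[9]);
}
```

Since the check is written by hand, it stays in place even when the compiler's own bounds checks are switched off.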
Jun 01 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.Is this option useful for you? VibeDebugCatchAll Enables catching of exceptions that derive from Error. This can be useful during application development to get useful error information while keeping the application running, but can generally be dangerous, because the application may be left in a bad state after an Error has been thrown. From: http://vibed.org/docs#compile-time-configuration
Jun 01 2017
On 06/01/2017 09:54 AM, Martin Tschierschke wrote:On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:All that would do is *cause* corruption due to the way the runtime handles (or more precisely, doesn't handle) a thrown Error.I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.Is this option useful for you? VibeDebugCatchAll Enables catching of exceptions that derive from Error. This can be useful during application development to get useful error information while keeping the application running, but can generally be dangerous, because the application may be left in a bad state after an Error has been thrown. From: http://vibed.org/docs#compile-time-configuration
Jun 01 2017
On 05/31/2017 09:04 AM, Steven Schveighoffer wrote: I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application. For example: int[3] arr; arr[3] = 5; Compare this to, let's say, a malformed unicode string (exception), malformed JSON data (exception), file not found (exception), etc. Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served. This is like the equivalent of having a guard rail on a road not only stop you from going off the cliff but proactively disable your car afterwards to prevent you from more harm. This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error. And vibe.d has no choice. There is no guarantee the stack is properly unwound, so it has to accept the characterization of this as a program-ending error by the D runtime. I am considering writing a set of array wrappers that throw exceptions when trying to access out of bounds elements. This comes with its own set of problems, but at least the web server should continue to run. What are your thoughts? Have you run into this? If so, how did you solve it?
This is a meaningful concern. People use threads instead of processes for serving requests to improve speed and footprint. Threads hardly communicate with one another, so they are virtually independent. D already has good mechanisms for isolating threads effectively (the shared qualifier, @safe), so an argument could be made that bringing down the entire process because a thread has had a problem is a disproportionate response. This was a concern about using D on the server for Facebook as well. Of course, it may be the case that that thread's failure reflects a memory corruption that affects all others, so one could reduce the matter to this and argue the entire process should be brought down. But of course things are never as simple as we'd like them to be. Array bound accesses should be easy to intercept and have them just kill the current thread. Vibe may want to do that, or allow their users to. The more difficult matter is null pointer dereferences. I recall there has been work in druntime to convert memory violations into thrown Errors, at least on Linux. You may want to look into that. It seems to me we'd do well to improve matters on this front. Thanks, Andrei
Jun 02 2017
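The array-wrapper idea raised in the original post can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not an API of vibe.d or druntime: `Checked` and `IndexException` are hypothetical names, and the wrapper covers indexing only (no slicing or iteration). It throws an ordinary Exception, which a request handler can catch, instead of the RangeError raised by built-in bounds checking.

```d
import std.conv : text;

/// Hypothetical exception type for recoverable out-of-bounds accesses.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

/// Hypothetical thin wrapper over a slice. Indexing out of bounds throws
/// IndexException (an Exception) rather than RangeError (an Error).
struct Checked(T)
{
    T[] data;

    ref T opIndex(size_t i)
    {
        if (i >= data.length)
            throw new IndexException(
                text("index ", i, " out of bounds for length ", data.length));
        return data[i];
    }

    size_t length() const { return data.length; }
}

void main()
{
    auto arr = Checked!int(new int[3]);
    arr[2] = 7;               // in bounds: assigns through the returned ref
    assert(arr[2] == 7);

    bool caught = false;
    try
    {
        arr[3] = 5;           // out of bounds: throws IndexException
    }
    catch (IndexException e)
    {
        caught = true;        // a web handler could log and answer with an error page here
    }
    assert(caught);
}
```

Because the failure is an Exception, normal stack unwinding applies and the rest of the server can keep running. The cost is the one mentioned in the post: every array that should behave this way has to be wrapped explicitly.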
On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote: Array bound accesses should be easy to intercept and have them just kill the current thread.
Ideally, fiber, as well. Probably the real ideal for this sort of problem is to be able to be as close as possible to Erlang, where errors bring down the particular task in progress, but not the application that spawned the task. Incidentally, I wouldn't limit the area of concern here to array bound access issues. This is more about the ability of _any_ error to propagate in applications of this nature, where you have many independent tasks being spawned in separate threads or (more often) fibers, and where you absolutely do not want an error in one task preventing you from being able to continue with others.
Jun 04 2017
On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote: Array bound accesses should be easy to intercept and have them just kill the current thread.
On 2017-06-04 20:15, Joseph Rushton Wakeling wrote: Ideally, fiber, as well. Probably the real ideal for this sort of problem is to be able to be as close as possible to Erlang, where errors bring down the particular task in progress, but not the application that spawned the task.
Erlang has the philosophy of sharing nothing between processes (green processes), or tasks as you call them here. All allocations are process-local, which makes it easier to know that a failing process doesn't affect any other process. -- /Jacob Carlborg
Jun 04 2017
On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote: Array bound accesses should be easy to intercept and have them just kill the current thread.
On 2017-06-04 20:15, Joseph Rushton Wakeling wrote: Ideally, fiber, as well. Probably the real ideal for this sort of problem is to be able to be as close as possible to Erlang, where errors bring down the particular task in progress, but not the application that spawned the task.
On Sunday, 4 June 2017 at 19:12:42 UTC, Jacob Carlborg wrote: Erlang has the philosophy of share nothing between processes (green processes), or task as you call it here. All allocations are process local, that makes it easier to know that a failing process doesn't affect any other process.
If I'm not wrong, it also uses a VM, even though a native code compiler is also available... If a VM is involved, it's another game... /Paolo
Jun 04 2017
On 2017-06-04 21:24, Paolo Invernizzi wrote: If I'm not wrong, it also uses a VM, even though a native code compiler is also available... If a VM is involved, it's another game...
Yes, it's running on a VM, the BEAM. -- /Jacob Carlborg
Jun 04 2017
On Sunday, 4 June 2017 at 19:24:27 UTC, Paolo Invernizzi wrote: If I'm not wrong, it also uses a VM, even though a native code compiler is also available... If a VM is involved, it's another game...
Not sure if I follow that. If you only use safe code then there should be no difference between using a VM or not. And what is a VM these days anyway? (e.g. hypervisors and microcode caches in CPUs etc.) Now, you might argue that some IRs are too complicated, and that a simple IR is easier to get right. Or that some concurrency models are more volatile than others. That is true, but it doesn't have much to do with using a VM. So the only special thing about using a VM in this case is that it could allow an actor to migrate to another server while running. Which is another game...
Jun 04 2017
On Sunday, 4 June 2017 at 19:12:42 UTC, Jacob Carlborg wrote: Erlang has the philosophy of share nothing between processes (green processes), or task as you call it here. All allocations are process local, that makes it easier to know that a failing process doesn't affect any other process.
Indeed. (I used 'task' here in a deliberately vague sense, in order to not be too Erlang- or D-specific.) The obvious differences in how D handles things seem to make it rather hard to get the same ease of error handling, but it would be interesting to consider what might get us closer.
Jun 04 2017
I'm using D to write an RSS reader. As I understand it, the compiler does not guarantee correct cleanup when an Error is thrown through a nothrow function. Furthermore, it doesn't guarantee that an Error can be caught (though it happens to allow it today). Do I need to modify the compiler to ignore nothrow and treat all throwables the same so it doesn't corrupt application state when I recover from an Error? Fork vibe.d and every other library I use to remove nothrow? I can't really justify that. My RSS reader is a side project. Do I accept that writing my code in D will result in a program show a 503 and log an error to disk? That's a disservice to my users. Do I increase development time to make up for D's problems in this area, pipe requests through a proxy that will convert crashes to 503 errors, split things out into as many processes as a wide variety of ways, but I'd save a lot of work and complexity. And this practice is to make code marginally more efficient in uncommon cases, because people are conflating "this is a problem that a competent programmer should have been able to avoid" (yeah, okay, I was incautious, we can move on) with "this dependency of yours, probably the runtime, is in an invalid state", and nothrow optimizations assume the latter only. And it's exacerbated because bounds checking is seen as an option to help with debugging instead of a safety feature to be used in production. Because removing bounds checking is seen as a sensible thing to do instead of a highly unsafe optimization. It's exacerbated because Walter is in a mindset of writing mission-critical applications where any detectable bug means you need to restart the program. Honestly, if I were writing flight control systems for Airbus, I could modify druntime to raise SIGABRT or call exit(3) when you try to throw an Error. It would be easy, and it would be worthwhile. If you really need cleanup, atexit(3) is available.
Jun 02 2017
On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote: It's exacerbated because Walter is in a mindset of writing mission-critical applications where any detectable bug means you need to restart the program. Honestly, if I were writing flight control systems for Airbus, I could modify druntime to raise SIGABRT or call exit(3) when you try to throw an Error. It would be easy, and it would be worthwhile. If you really need cleanup, atexit(3) is available.
The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset... I'm really, really puzzled by why this topic pops up so often... /Paolo
Jun 02 2017
On Saturday, 3 June 2017 at 06:55:35 UTC, Paolo Invernizzi wrote: The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset...
Really? On the contrary. What is being adopted is robustness and program verification. More and more. Assuming that a program shouldn't be able to flush its buffers out of some flawed reasoning about program correctness does not support your argument at all. Even if your program is fully based on event-sourcing and can deal with an immediate shutdown YOU STILL WANT TO FLUSH YOUR EVENT-BUFFERS TO DISK! The argument Walter is following is flawed. If a failed assert means you should not be able to flush to disk, then it also means that you should undo everything the program has ever written to disk. The incorrect program state could have occurred at install. You have to reason about these things in probabilistic terms and not in absolutes.
Jun 03 2017
On Saturday, 3 June 2017 at 06:55:35 UTC, Paolo Invernizzi wrote: The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset...
On Saturday, 3 June 2017 at 07:51:55 UTC, Ola Fosheim Grøstad wrote: Really? On the contrary. What is being adopted is robustness and program verification. More and more.
It doesn't seem to me that the trend of trying to somehow handle the fact that something, somewhere, who knows when, has gone wild is coherent with the term "robustness". And the fact that the "nice tries" are done at runtime, in production, is the opposite of what I think of as program verification.
Ola Fosheim Grøstad wrote: Assuming that a program shouldn't be able to flush its buffers out of some flawed reasoning about program correctness does not support your argument at all. Even if your program is fully based on event-sourcing and can deal with an immediate shutdown YOU STILL WANT TO FLUSH YOUR EVENT-BUFFERS TO DISK!
There's a fundamental difference between trying to flush logs and report what has happened, with the aim of gaining more information about it, and trying to "automagically" handle the fact that there's an error in the implementation, or in the logic, or in the HW.
Ola Fosheim Grøstad wrote: The argument Walter is following is flawed. If a failed assert means you should not be able to flush to disk, then it also means that you should undo everything the program has ever written to disk. The incorrect program state could have occurred at install.
The argument Walter is following is not flawed: it's a really beautiful, pragmatic balance of risks and an engineering way of developing software, IMHO.
Ola Fosheim Grøstad wrote: You have to reason about these things in probabilistic terms and not in absolutes.
I'm trying to do exactly that; I like to think of myself as a very pragmatic person... /Paolo
Jun 03 2017
On Saturday, 3 June 2017 at 10:21:03 UTC, Paolo Invernizzi wrote: It doesn't seem to me that the trend of trying to somehow handle the fact that something, somewhere, who knows when, has gone wild is coherent with the term "robustness".
That all depends. It makes perfect sense in a "strongly pure" function to just return an exception for basically anything that went wrong in that function. I use this strategy in other languages for writing validator functions; it is a very useful and time-saving way of writing validators. E.g.: try { … validated_field = validate_input(unvalidated_input); } I don't really care why validate_input failed. Even if it was a logic flaw in the "validate_input" code itself, it is perfectly fine to just respond to the exception, log the failure, return a failure status code and continue with the next request. The idea that programs can do provably full veracity checking of input isn't realistic in evolving code bases that need constant updates. My "validate_input" only has to be correct for correct input. Whether it fails because the input is wrong or because the validation spec is wrong does not matter, as long as it fails.
Paolo Invernizzi wrote: And the fact that the "nice tries" are done at runtime, in production, is the opposite of what I think of as program verification.
Program verification requires a spec. In the above example the spec could be that it should never allow illegal input to pass, but it could also make room for failing on some legal input. "False alarm" is a concept that is allowed for in many real-world applications. In this context it means that you throw too many exceptions, but that does not mean that you don't throw an exception when you should have.
Paolo Invernizzi wrote: I'm trying to do exactly that; I like to think of myself as a very pragmatic person...
What do you mean by "pragmatic"? Shutting down a B2B website because one insignificant request-handler fails on some requests (e.g. requesting a help page) is not very pragmatic. Pragmatic in this context would be to specify which handlers are critical and which ones are not.
Jun 03 2017
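The validator pattern described in the post above might look like this in D. This is a hedged sketch, not code from the thread: `validateAge` and `handleRequest` are hypothetical names, and the 200/400 status codes are only illustrative. Note that it catches Exception only. A RangeError or a failed assert inside the validator is an Error and would still propagate past the handler, which is precisely the behavior being debated here.

```d
import std.conv : to;
import std.exception : enforce;

/// Hypothetical validator: accepts only integer ages in the range 0..150.
/// Any failure, whatever its cause, surfaces as an Exception.
int validateAge(string raw)
{
    immutable age = raw.to!int;   // throws ConvException on malformed input
    enforce(age >= 0 && age <= 150, "age out of range");
    return age;
}

/// Hypothetical request handler: the caller does not care why validation
/// failed, only that it did. Returns an illustrative HTTP-like status code.
int handleRequest(string raw)
{
    try
    {
        validateAge(raw);
        return 200;
    }
    catch (Exception e)
    {
        // A real handler would log e.msg here, then move on to the next request.
        return 400;
    }
}

void main()
{
    assert(handleRequest("42") == 200);
    assert(handleRequest("abc") == 400);   // malformed input
    assert(handleRequest("999") == 400);   // well-formed but rejected by the spec
}
```

The validator stays short because it makes no attempt to distinguish kinds of failure; that decision is left to the caller, which treats every Exception as a rejected request.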
Anyway, all of this boils down to the question of whether D really provides a safe programming environment. If you only write safe code in a safe language, then it should be perfectly ok to trap and deal with a failed lookup, irrespective of what kind of data-structure it is. So, if this isn't possible in D, then D isn't able to compete with other safe programming languages... But then maybe one shouldn't try to sell it as a safe programming language either. You can't really have it both ways.
Jun 03 2017
On Saturday, 3 June 2017 at 10:21:03 UTC, Paolo Invernizzi wrote: It doesn't seem to me that the trend of trying to somehow handle the fact that something, somewhere, who knows when, has gone wild is coherent with the term "robustness".
On Saturday, 3 June 2017 at 10:47:36 UTC, Ola Fosheim Grøstad wrote: That all depends. It makes perfect sense in a "strongly pure" function to just return an exception for basically anything that went wrong in that function. I use this strategy in other languages for writing validator functions; it is a very useful and time-saving way of writing validators. E.g.: try { … validated_field = validate_input(unvalidated_input); } I don't really care why validate_input failed. Even if it was a logic flaw in the "validate_input" code itself, it is perfectly fine to just respond to the exception, log the failure, return a failure status code and continue with the next request. The idea that programs can do provably full veracity checking of input isn't realistic in evolving code bases that need constant updates.
Sorry Ola, I can't support that way of working. Don't take it wrong, Walter is doing a lot on safe, but compilers are built from a codebase, and the codebase has, anyway, bugs. I can't approve an "ok, do whatever you want in the validate_input and I try to *safely* throw". IMHO you can only do that if the validator is totally segregated, and to me that means in a separate process, not even in another thread.
Paolo Invernizzi wrote: I'm trying to do exactly that; I like to think of myself as a very pragmatic person...
Ola Fosheim Grøstad wrote: What do you mean by "pragmatic"? Shutting down a B2B website because one insignificant request-handler fails on some requests (e.g. requesting a help page) is not very pragmatic. Pragmatic in this context would be to specify which handlers are critical and which ones are not.
To me, pragmatic means that the B2B website has to be organised in a way that the impact is minimal if one of the processes handling the requests is restarted, for a bug or not. See Laeeth [1]. Just hand "insignificant requests" off to a cheaper, less robust, less costly web stack. But we were talking about another argument... /Paolo [1] http://forum.dlang.org/post/uvhlxtolghfydydoxwfg forum.dlang.org
Jun 03 2017
On Saturday, 3 June 2017 at 11:18:16 UTC, Paolo Invernizzi wrote: Sorry Ola, I can't support that way of working. Don't take it wrong, Walter is doing a lot on safe, but compilers are built from a codebase, and the codebase has, anyway, bugs. I can't approve an "ok, do whatever you want in the validate_input and I try to *safely* throw".
If the compiler is broken then anything could happen, at any time. So that merely suggests that you consider the current version to be of beta quality. Would you make the same argument for Python?
Paolo Invernizzi wrote: IMHO you can only do that if the validator is totally segregated, and to me that means in a separate process, not even in another thread.
Well, that would be very tedious. The crux is: the best way to write a good validator is to make the code in the validator as simple as possible, so that you reduce the probability of making mistakes in the implementation of the spec. If you have to add code for things like division-by-zero logic flaws or out-of-bounds checks, then you make it harder to catch mistakes in the validator and increase the probability of a much worse situation: letting illegal input pass. So for a validator I want to focus my energy on writing simple, crystal-clear code that only allows legal input to pass. That makes the overall system robust, as long as the language is capable of trapping all the logic flaws and classifying them as validation errors. So there is a trade-off here. What is more important?
1. Increasing the probability of correctly implementing the validation spec to keep the database consistent.
2. Reducing the chance of the unlikely event that the compiler/unsafe code could cause the validator to pass when it shouldn't.
If the programmer knows that the validator was written in this way, it also isn't a big deal to catch all Errors from it. Probabilistically speaking, the compiler being the cause here would be a highly unlikely event (power failure would be much more likely).
Paolo Invernizzi wrote: To me, pragmatic means that the B2B website has to be organised in a way that the impact is minimal if one of the processes handling the requests is restarted, for a bug or not. See Laeeth [1]. Just hand "insignificant requests" off to a cheaper, less robust, less costly web stack.
Then we land on the conclusion that development and running costs would increase by choosing D over some of the competing alternatives for this particular use case. That's ok.
Jun 03 2017
On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote: It's exacerbated because Walter is in a mindset of writing mission-critical applications where any detectable bug means you need to restart the program. Honestly, if I were writing flight control systems for Airbus, I could modify druntime to raise SIGABRT or call exit(3) when you try to throw an Error. It would be easy, and it would be worthwhile. If you really need cleanup, atexit(3) is available.
On 03.06.2017 08:55, Paolo Invernizzi wrote: The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset... I'm really, really puzzled by why this topic pops up so often... /Paolo
I don't get why you would /restart/ mission-critical software that has been shown to be buggy. What you need to do instead: Have a few more development teams that create independent implementations of your service. (Completely from scratch, as the available libraries were not developed to the necessary standard.) All of them should run on different hardware produced in different factories by different companies. Furthermore, you need to hire a team of testers and software verification experts vastly exceeding the team of developers in magnitude, etc.
Jun 03 2017
On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote: I don't get why you would /restart/ mission-critical software that has been shown to be buggy. What you need to do instead: Have a few more development teams that create independent implementations of your service. (Completely from scratch, as the available libraries were not developed to the necessary standard.) All of them should run on different hardware produced in different factories by different companies. Furthermore, you need to hire a team of testers and software verification experts vastly exceeding the team of developers in magnitude, etc.
Yes, mission-critical software such as flight control is (and should be) proven correct. There is modelling software for this very narrow field that will generate correct code. Or, as you say, you can implement 3 different versions, running on 3 different hardware platforms, and shut down the 1 that disagrees with the others. But you still have to think in probabilistic terms, because there could be problems with sensors, actuators, human errors etc.
Jun 03 2017
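The "3 versions, shut down the 1 that disagrees" scheme mentioned above is essentially 2-out-of-3 majority voting. A toy sketch follows, purely for illustration: `Vote` and `vote` are hypothetical names, and real systems compare far richer outputs than a single int and must also handle timing and sensor noise.

```d
/// Hypothetical result of a 2-out-of-3 vote.
struct Vote
{
    bool agreed;   // true if at least two replicas produced the same value
    int value;     // the majority value (meaningful only when agreed is true)
    int outlier;   // index (0..2) of the disagreeing replica, or -1 if none
}

/// Compares the outputs of three independently implemented replicas.
Vote vote(int a, int b, int c)
{
    if (a == b && b == c) return Vote(true, a, -1);
    if (a == b) return Vote(true, a, 2);   // replica 2 disagrees
    if (a == c) return Vote(true, a, 1);   // replica 1 disagrees
    if (b == c) return Vote(true, b, 0);   // replica 0 disagrees
    return Vote(false, 0, -1);             // no majority at all
}

void main()
{
    assert(vote(5, 5, 5) == Vote(true, 5, -1));
    assert(vote(5, 5, 9) == Vote(true, 5, 2));
    assert(vote(9, 5, 5) == Vote(true, 5, 0));
    assert(!vote(1, 2, 3).agreed);
}
```

The last case shows why probabilistic reasoning still matters: when all three replicas disagree, the voter can only report that there is no majority, not which replica is right.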
On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote: It's exacerbated because Walter is in a mindset of writing mission-critical applications where any detectable bug means you need to restart the program. Honestly, if I were writing flight control systems for Airbus, I could modify druntime to raise SIGABRT or call exit(3) when you try to throw an Error. It would be easy, and it would be worthwhile. If you really need cleanup, atexit(3) is available.
On 03.06.2017 08:55, Paolo Invernizzi wrote: The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset... I'm really, really puzzled by why this topic pops up so often... /Paolo
On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote: I don't get why you would /restart/ mission-critical software that has been shown to be buggy. What you need to do instead: Have a few more development teams that create independent implementations of your service. (Completely from scratch, as the available libraries were not developed to the necessary standard.) All of them should run on different hardware produced in different factories by different companies. Furthermore, you need to hire a team of testers and software verification experts vastly exceeding the team of developers in magnitude, etc.
That's what should be done in mission-critical software, and we are relaxing the constraint of mission critical, it seems [1]. The point is that software, somehow, has to be run, with bugs, or sometimes logic flaws: alas, bugged software is running here [2]... So, if you have to, you should restart 'not-so-critical-software', and you should code it so that it can be restarted from time to time. When the best moment to just restart it is remains a matter of opinion, and a judgement between risks and opportunities. My personal opinion: it should be stopped as soon as a bug is detected. /Paolo [1] http://exploration.esa.int/mars/59176-exomars-2016-schiaparelli-anomaly-inquiry [2] https://motherboard.vice.com/en_us/article/the-f-35s-software-is-so-buggy-it-might-ground-the-whole-fleet
Jun 03 2017
On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote: It's exacerbated because Walter is in a mindset of writing mission-critical applications where any detectable bug means you need to restart the program. Honestly, if I were writing flight control systems for Airbus, I could modify druntime to raise SIGABRT or call exit(3) when you try to throw an Error. It would be easy, and it would be worthwhile. If you really need cleanup, atexit(3) is available.
On 03.06.2017 08:55, Paolo Invernizzi wrote: The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset... I'm really, really puzzled by why this topic pops up so often... /Paolo
On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote: I don't get why you would /restart/ mission-critical software that has been shown to be buggy. What you need to do instead: Have a few more development teams that create independent implementations of your service. (Completely from scratch, as the available libraries were not developed to the necessary standard.) All of them should run on different hardware produced in different factories by different companies. Furthermore, you need to hire a team of testers and software verification experts vastly exceeding the team of developers in magnitude, etc.
On 03.06.2017 12:44, Paolo Invernizzi wrote: That's what should be done in mission-critical software, and we are relaxing the constraint of mission critical, it seems [1] ...
That document says that the crash was caused by a component going down after an unexpected condition instead of just continuing to operate normally. (Admittedly this is biased reporting, but it is true.)
Paolo Invernizzi wrote: The point is that software, somehow, has to be run, with bugs, or sometimes logic flaws: alas, bugged software is running here [2]... ...
I.e., a detected bug is not always a sufficient reason to bring down the entire system.
Paolo Invernizzi wrote: So, if you have to, you should restart 'not-so-critical-software', and you should code it so that it can be restarted from time to time. ...
I agree. What I don't agree with is the idea that the programmer should have no way to figure out which component failed and only stop or restart that component if that is the most sensible thing to do under the given circumstances. Ideally, the Mars mission shouldn't need to be restarted just because there is a bug in one component of the probe.
Paolo Invernizzi wrote: When the best moment to just restart it is remains a matter of opinion, and a judgement between risks and opportunities. ...
I.e., the language shouldn't mandate it to be one way or the other.
Paolo Invernizzi wrote: My personal opinion: it should be stopped as soon as a bug is detected. ...
Which is the right thing to do often enough.
Paolo Invernizzi wrote: /Paolo [1] http://exploration.esa.int/mars/59176-exomars-2016-schiaparelli-anomaly-inquiry [2] https://motherboard.vice.com/en_us/article/the-f-35s-software-is-so-buggy-it-might-ground-the-whole-fleet
Jun 03 2017