
digitalmars.D.learn - I think Associative Array should throw Exception

reply Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
This is going to be a hard one for me to argue but I'm going to 
give it a try.

Today if you attempt to access a key from an associative array 
(AA) that does not exist inside the array, a RangeError is 
thrown. This is similar to when an array is accessed outside the 
bounds.

```
string[string] dict;
auto s = dict["Hello"]; // RangeError: key is not present

string[] arr;
auto t = arr[6]; // RangeError: index is out of bounds
```

There are many things written[1][2] on the difference between 
Exception and Error.

* Errors are for programming bugs and Exceptions are environmental
* Exceptions can be caught, Errors shouldn't be (stack may not 
unwind)
* Exceptions are recoverable, Errors aren't recoverable
* Errors can live in nothrow, Exceptions can't
* Always verify input
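
To make the `nothrow` point concrete, here is a minimal sketch. The function name `lookup` is made up for illustration; the static asserts only restate the class hierarchy in druntime and Phobos as I understand it.

```d
import core.exception : RangeError;
import std.conv : ConvException;

// Compiles as nothrow: a missing key raises RangeError, which is an
// Error, and nothrow only forbids Exceptions from escaping.
string lookup(string[string] dict, string key) nothrow
{
    return dict[key];
}

// RangeError sits on the Error side of the hierarchy,
// while a conversion failure sits on the Exception side.
static assert(is(RangeError : Error));
static assert(!is(RangeError : Exception));
static assert(is(ConvException : Exception));
```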

I don't have an issue with the normal array RangeError; there is 
a clear basis for claiming an out-of-bounds access is a programming 
bug. However, associative arrays tend to have both the key and the 
value as "input."

Trying to create a contract for any given method to validate the 
AA as input becomes cumbersome. I would say it is analogous to 
`find` throwing an error if it didn't find the value requested.

Using RangeError is nice as it allows code to use array indexing 
inside `nothrow` functions. But again, can we really review code and 
say a missing key will always be a bug? We'd have to enforce that all 
access to associative arrays is done using `in` and checked for null. 
Then what use is the [] syntax?
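
For reference, the `in`-and-null-check pattern looks roughly like this. It is a sketch only: `lookupOr` and its `fallback` parameter are names invented for this example.

```d
// Returns the value stored under `key`, or `fallback` if the key is absent.
string lookupOr(string[string] dict, string key, string fallback) nothrow
{
    // `key in dict` yields a pointer to the value, or null when missing.
    if (auto p = key in dict)
        return *p;
    return fallback;
}
```

Nothing here can raise RangeError, so the missing-key case is handled as ordinary control flow.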

Is it recoverable? I would say yes. We aren't actually trying to 
access memory outside the application's ownership, and we haven't put 
the system state into a critical situation (out of memory). And 
code higher up the call stack could easily decide to take a 
different path when its call fails.

"if exceptions are thrown for errors instead, the programmer has 
to deliberately add code if he wishes to ignore the error."

1. https://stackoverflow.com/questions/5813614/what-is-difference-between-errors-and-exceptions
2. https://forum.dlang.org/thread/m8tkfm$ret$1@digitalmars.com
Sep 01 2020
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/1/20 2:20 PM, Jesse Phillips wrote:

 Using RangeError is nice as it allows code to use array index inside 
 `nothrow.`
This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.

What is wrong with using `in`? I use this mostly:

`if(auto v = key in aa) { /* use v */ }`

Note that in certain cases, I want to turn *normal* array access errors into exceptions, because I always want bounds checking on, but I don't want a (caught) programming error to bring down my whole vibe.d server. So I created a simple wrapper around arrays which throws exceptions on out-of-bounds access. You could do a similar thing with AAs. It's just that the declaration syntax isn't as nice.

-Steve
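
A wrapper of that kind for AAs might look roughly like the sketch below. This is not the actual wrapper from the post; `CheckedAA` and its member `data` are hypothetical names, and only indexing and assignment are covered.

```d
import std.conv : text;

// Hypothetical wrapper: a missing key raises a catchable Exception
// instead of RangeError.
struct CheckedAA(K, V)
{
    V[K] data; // the wrapped built-in associative array

    // Read access: check with `in`, throw an Exception on a miss.
    ref V opIndex(K key)
    {
        if (auto p = key in data)
            return *p;
        throw new Exception(text("key not found: ", key));
    }

    // Write access forwards to the built-in AA.
    void opIndexAssign(V value, K key)
    {
        data[key] = value;
    }
}
```

The declaration becomes `CheckedAA!(string, int) aa;` rather than `int[string] aa;`, which is the less pleasant syntax mentioned above. Code using it can no longer be `nothrow` without a try/catch.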
Sep 01 2020
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 1 September 2020 at 18:55:20 UTC, Steven 
Schveighoffer wrote:
 This is the big sticking point -- code that is nothrow would no 
 longer be able to use AAs. It makes the idea, unfortunately, a 
 non-starter.
You could always catch it though. But I kinda like things the way they are exactly because you can check with the in operator ahead of time. Or the .get helper method is pretty convenient too.
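
A small sketch of the `.get` approach, assuming a word-count table. The function name `countOf` is invented for the example.

```d
// Returns the stored count for `word`, or 0 if the word was never seen.
int countOf(int[string] counts, string word)
{
    // get(key, default) returns the default instead of raising RangeError.
    return counts.get(word, 0);
}
```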
Sep 01 2020
prev sibling next sibling parent reply James Blachly <james.blachly gmail.com> writes:
On 9/1/20 2:55 PM, Steven Schveighoffer wrote:
 On 9/1/20 2:20 PM, Jesse Phillips wrote:
 
 Using RangeError is nice as it allows code to use array index inside 
 `nothrow.`
This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.
...
 -Steve
Steve, are there not several (probably better, faster) alternatives to the built-in AA that are nothrow? I think a nice way to look at the built-in AA is an easy default for quick scripts, new users, etc., much like the default of `throw` status of a function or code block. Advanced users, (i.e. those using nothrow annotation) could select a more efficient AA implementation anyway.
Sep 01 2020
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/1/20 10:46 PM, James Blachly wrote:
 On 9/1/20 2:55 PM, Steven Schveighoffer wrote:
 On 9/1/20 2:20 PM, Jesse Phillips wrote:

 Using RangeError is nice as it allows code to use array index inside 
 `nothrow.`
This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.
.... Steve, are there not several (probably better, faster) alternatives to the built-in AA that are nothrow? I think a nice way to look at the built-in AA is an easy default for quick scripts, new users, etc., much like the default of `throw` status of a function or code block. Advanced users, (i.e. those using nothrow annotation) could select a more efficient AA implementation anyway.
The problem is not the requirement but the resulting code breakage if you change it now. -Steve
Sep 02 2020
prev sibling parent reply Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
On Tuesday, 1 September 2020 at 18:55:20 UTC, Steven 
Schveighoffer wrote:
 On 9/1/20 2:20 PM, Jesse Phillips wrote:

 Using RangeError is nice as it allows code to use array index 
 inside `nothrow.`
This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter. What is wrong with using `in`? I use this mostly: if(auto v = key in aa) { /* use v */ }
I think that actually might be my point. If you need nothrow then this is what you need to do. As for breaking nothrow code that uses the [] syntax, I'd say it is already broken, because the behavior is to throw and the above is how you would check that it won't.

The issue is that associative arrays throw an "uncatchable" error, meaning code gets written to catch the error (because it works), while correctly written `nothrow` code needs to use `in` to be properly nothrow.
Sep 03 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/3/20 10:43 AM, Jesse Phillips wrote:
 On Tuesday, 1 September 2020 at 18:55:20 UTC, Steven Schveighoffer wrote:
 On 9/1/20 2:20 PM, Jesse Phillips wrote:

 Using RangeError is nice as it allows code to use array index inside 
 `nothrow.`
This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter. What is wrong with using `in`? I use this mostly: if(auto v = key in aa) { /* use v */ }
I think that actually might be my point. If you need nothrow then this is what you need to do. For breaking nothrow code using the [] syntax, I'd say it is already broken because the behavior is to throw and the above is how you would check that it won't.
int[int] aa;
aa[4] = 5;
auto b = aa[4];

How is this code broken? It's valid, will never throw, and there's no reason that we should break it by adding an exception into the mix.
 The issue is, associative arrays throw an "uncatchable" error. Meaning 
 code is written to catch the error (because it works). And correctly 
 written `nothrow` code needs to use `in` to be properly nothrow.
The big issue is -- is accessing an invalid index a programming error or an environmental error? The answer is -- it depends.

D has declared, if you use the indexing syntax, then it's a programming error. If you want it not to be a programming error, you use the `key in aa` syntax, and handle it.

The other thing you can do is use a different type, if you don't want to deal with the verbose syntax, but still want to catch environmental errors. A wrapper type is possible.

-Steve
Sep 03 2020
parent reply Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
On Thursday, 3 September 2020 at 15:12:14 UTC, Steven 
Schveighoffer wrote:
 int[int] aa;
 aa[4] = 5;
 auto b = aa[4];

 How is this code broken? It's valid, will never throw, and 
 there's no reason that we should break it by adding an 
 exception into the mix.
int foo() nothrow {
    return "1".to!int;
}

This code is valid and will never throw, so why does the compiler prevent it?
Sep 03 2020
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/4/20 1:48 AM, Jesse Phillips wrote:
 On Thursday, 3 September 2020 at 15:12:14 UTC, Steven Schveighoffer wrote:
 int[int] aa;
 aa[4] = 5;
 auto b = aa[4];

 How is this code broken? It's valid, will never throw, and there's no 
 reason that we should break it by adding an exception into the mix.
int foo() nothrow {     return "1".to!int; } The following code is valid, will never throw, why does the compiler prevent it?
You are still missing the point ;) Your example doesn't compile today. Mine does. It's not a question of which way is better, but that we already have code that depends on the chosen solution, and changing now means breaking all such existing code.

My point of bringing up the example is that your assertion that "it is already broken" isn't true.

To put it another way, if the above to!int call compiled, and we switched to exceptions, it would be the same problem, even if the right choice is to use Exceptions.

-Steve
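
For comparison, this is roughly what the `to!int` example needs today in order to compile as `nothrow`. It is a sketch, not something posted in the thread: the compiler only sees that `to!int` may throw a conversion exception, so the function has to catch it even though this particular input always parses.

```d
import std.conv : to;

int foo() nothrow
{
    try
    {
        return "1".to!int;
    }
    catch (Exception)
    {
        // Cannot happen for the literal "1". If it ever did, that would
        // be a programming bug, so fail with an Error (allowed in nothrow).
        assert(false, "\"1\" always converts to int");
    }
}
```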
Sep 04 2020
prev sibling parent Paul Backus <snarwin gmail.com> writes:
On Tuesday, 1 September 2020 at 18:20:17 UTC, Jesse Phillips 
wrote:
 This is going to be a hard one for me to argue but I'm going to 
 give it a try.

 Today if you attempt to access a key from an associative array 
 (AA) that does not exist inside the array, a RangeError is 
 thrown. This is similar to when an array is accessed outside 
 the bounds.

 [...]

 I don't have an issue with the normal array RangeError, there 
 is a clear means for claiming your access is a programming bug. 
 However associative arrays tend to have both the key and value 
 as "input."

 [...]

 Is it recoverable? I would say yes. We aren't actually trying 
 to access memory outside the application ownership, we haven't 
 put the system state into a critical situation (out of memory). 
 And a higher portion of the code could easily decide to take a 
 different path due to the failure of its call.
Any time you have an operation that can only succeed if some precondition is met, there are two possible ways you can implement it:

1. Make it the caller's responsibility to check the precondition.
2. Make it the function's responsibility to check the precondition.

[...] ideally provide both versions and let the user choose the one [...]

In this case, for D's associative arrays, the [] operator is [...] have been named something else, but at this point, it's not worth breaking backwards compatibility to change it.
Sep 04 2020