digitalmars.D - I want this so badly, please implement

reply Adam D. Ruppe <destructionator gmail.com> writes:
Just make RangeError report the index and length, where 
easy. For ordinary arrays, both are size_t so it can be trivially 
passed by value without thinking about ownership or anything.

And it would save me hours.

https://issues.dlang.org/show_bug.cgi?id=15889


I would like to know the index on associative arrays too, but I 
understand that can be more difficult.

But if you guys implement it for regular arrays for the next 
release, I would be exceedingly joyous.
Apr 06 2016
next sibling parent reply Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 7 April 2016 at 02:00:39 UTC, Adam D. Ruppe wrote:
 https://issues.dlang.org/show_bug.cgi?id=15889
Could you please add a test program with current and wanted behaviour? Do you mean that the index and length are lost in the compiler and should instead be passed to the RangeError object? Could you highlight the details? If both dmd and druntime/phobos need to be updated, could you point out some source locations of interest? I might give this a try.
Apr 07 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 7 April 2016 at 07:07:32 UTC, Nordlöw wrote:
 Could you please add a test-program with current and wanted 
 behaviour?
    int[] a = new int[](1);
    a[2]; // currently throws RangeError, telling file and line of code

I'd like it to throw a RangeError telling file and line of code, and idx == 2, length == 1.

The way it works is the compiler, in e2ir.c (unless I'm mistaken), turns the index and slice expressions into code. It basically generates this code:

    ((idx < length) || _d_array_bounds(module*, cast(uint) line_number)), array.ptr[idx]

What I want it to do is generate this code:

    ((idx < length) || _d_array_bounds(module*, cast(uint) line_number, idx, length)), array.ptr[idx]

Just so idx and length are passed to the druntime function.

That's dmd's change. Make sure it covers all the places the array bounds functions are called. (I thought this would be easy but I keep hitting dmd assert failures. However, that's probably because I'm a n00b, it probably is easy if you know the e2ir style.)

Then, on the druntime side, change the function to accept the size_t idx, size_t length in addition to its current args. Forward them through to the RangeError object (or a subclass), which then prints them too.
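
To make the druntime half concrete, here is a rough user-level sketch of the same idea. It is not the actual patch: `IndexedRangeError` and `checkedIndex` are made-up names for illustration, and the real change would live in the compiler-generated check and in `_d_array_bounds` rather than in a helper function.

```d
import std.conv : to;

// Hypothetical error type (not a druntime class): carries the failing
// index and the array length by value, so there are no ownership concerns.
class IndexedRangeError : Error
{
    size_t index;
    size_t length;

    this(size_t index, size_t length,
         string file = __FILE__, size_t line = __LINE__)
    {
        super("Range violation: index " ~ index.to!string
              ~ " is out of bounds for length " ~ length.to!string,
              file, line);
        this.index = index;
        this.length = length;
    }
}

// Hand-written stand-in for the generated check
//   ((idx < length) || fail(..., idx, length)), array.ptr[idx]
ref T checkedIndex(T)(T[] array, size_t idx,
                      string file = __FILE__, size_t line = __LINE__)
{
    if (idx >= array.length)
        throw new IndexedRangeError(idx, array.length, file, line);
    return array.ptr[idx]; // bounds already verified above
}

void main()
{
    int[] a = new int[](1);
    try
    {
        auto value = a.checkedIndex(2);
    }
    catch (IndexedRangeError e)
    {
        // The error now says which index failed and how long the array was.
        assert(e.index == 2 && e.length == 1);
    }
}
```

Both values are plain `size_t`, which is why passing them along costs so little compared to attaching the array itself.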
Apr 07 2016
parent reply Daniel Murphy <yebbliesnospam gmail.com> writes:
On 8/04/2016 1:59 AM, Adam D. Ruppe wrote:

 (I thought this would be easy but I keep hitting dmd assert failures.
 However, that's probably because I'm a n00b, it probably is easy if you
 know the e2ir style.)
If you have a patch I can probably point out the error. You're right that it should be fairly straightforward, although you'll need to copy elems to temps when they're used multiple times to avoid the multiple-references asserts/ICEs.
Apr 07 2016
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 7 April 2016 at 16:22:17 UTC, Daniel Murphy wrote:
 If you have a patch I can probably point out the error.  You're 
 right that it should be fairly straightforward, although you'll 
 need to copy elems to temps when they're used multiple times to 
 avoid the multiple-references asserts/ICEs.
That's probably it, I'll try later today and let you know...
Apr 07 2016
prev sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 7 April 2016 at 16:22:17 UTC, Daniel Murphy wrote:
 you'll need to copy elems to temps when they're
 used multiple times to avoid the multiple-references 
 asserts/ICEs.
Well, I copied the elems using el_same and it all compiles, but I realize this calls module.__array, which is a generated function in glue.c, and doesn't forward the arguments :( Blah, so gotta change that too. Check genhelpers() in glue.c, it has "// Get sole parameter, linnum" but it isn't the sole param anymore :( I kinda wish it just generated a code string and compiled that, it'd certainly be easier for us n00bs. But, could you take a look at this and tell me how it looks? https://github.com/D-Programming-Language/dmd/pull/5637
Apr 07 2016
prev sibling parent reply Kagamin <spam here.lot> writes:
On Thursday, 7 April 2016 at 02:00:39 UTC, Adam D. Ruppe wrote:
 Just make RangeError report the index and length, where 
 easy. For ordinary arrays, both are size_t so it can be 
 trivially passed by value without thinking about ownership or 
 anything.

 And it would save me hours.

 https://issues.dlang.org/show_bug.cgi?id=15889
How do you spend time on it?
Apr 07 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 7 April 2016 at 09:31:15 UTC, Kagamin wrote:
 How do you spend time on it?
A program dies with a RangeError. I don't know why because it doesn't tell me any useful information about the failing data. This is quite frustrating because that information is available and trivial to attach to the object, but it is just dropped.

What I'd actually do in a perfect world is attach the whole array to the object:

    class RangeError : Error {}

    class TypedRangeError(T) : RangeError {
        T[] slice;
        size_t attemptedIndex;
        override string toString() { /* print the above in some human-readable way */ }
    }

Because then I can see what runtime data specifically caused the problem. But like I said about AAs, adding an array is harder, so I'm happy with just the numbers as a compromise - at least then I can see what part of the loop is the problem instead of just the line of code.

The time sink comes when processing a million items and seeing a random RangeError in the middle of processing. file.d:6 is great, but which one of the million lines of DATA was the problem? At least giving me the index and length narrows that down. I typically go back and edit the source to print out some kind of log info, then wait for it to all run again.

Just a frustration - one of the few I face regularly in D.
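
For anyone who wants to try the "perfect world" version, a compilable take on that sketch might look like the following. The local `RangeError` here is only a stand-in so the example builds on its own, the message format is an arbitrary choice, and the message is built in the constructor rather than in a `toString` override to keep it short.

```d
import std.conv : to;

// Local stand-in for druntime's RangeError so the sketch is self-contained.
class RangeError : Error
{
    this(string msg, string file, size_t line)
    {
        super(msg, file, line);
    }
}

// Carries the slice itself plus the index that was attempted.
// Caveat (assumption worth checking in real code): the slice may refer to
// memory that does not outlive the throw, e.g. a static array on the stack.
class TypedRangeError(T) : RangeError
{
    T[] slice;
    size_t attemptedIndex;

    this(T[] slice, size_t attemptedIndex,
         string file = __FILE__, size_t line = __LINE__)
    {
        super("Range violation: index " ~ attemptedIndex.to!string
              ~ " on slice of length " ~ slice.length.to!string
              ~ " with contents " ~ slice.to!string, file, line);
        this.slice = slice;
        this.attemptedIndex = attemptedIndex;
    }
}

void main()
{
    auto data = [10, 20, 30];
    auto e = new TypedRangeError!int(data, 5);
    assert(e.attemptedIndex == 5);
    assert(e.slice is data); // same memory, not a copy
    assert(e.msg == "Range violation: index 5 on slice of length 3 with contents [10, 20, 30]");
}
```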
Apr 07 2016
next sibling parent reply Kagamin <spam here.lot> writes:
On Thursday, 7 April 2016 at 15:35:31 UTC, Adam D. Ruppe wrote:
 A program dies with a RangeError. I don't know why because it 
 doesn't tell me any useful information about the failing data.
The length of data will tell you which data it is?
 This is quite frustrating because that information is available 
 and trivial to attach to the object, but it is just dropped.
Sounds like code bloat for little gain. I'm afraid data length and index value are not generally useful to diagnose this problem. Throwing the slice can be helpful, but can't be done because it can be a static array on stack.
Apr 07 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 7 April 2016 at 17:03:31 UTC, Kagamin wrote:
 The length of data will tell you which data it is?
It tells me more than ABSOLUTELY NOTHING. Between the length and the index, I can at least get an idea of if there's a broken record or broken code.
 Sounds like code bloat for little gain.
It is already there, just discarded! D's error handling does this a lot, and there's no good reason for it. Useless RangeErrors (a segmentation fault would be more helpful!), generic Exceptions with random strings instead of useful info, dumb crap like enforce discarding easy info. Ugh.
Apr 07 2016
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/7/16 1:16 PM, Adam D. Ruppe wrote:
 On Thursday, 7 April 2016 at 17:03:31 UTC, Kagamin wrote:
 The length of data will tell you which data it is?
It tells me more than ABSOLUTELY NOTHING. Between the length and the index, I can at least get an idea of if there's a broken record or broken code.
 Sounds like code bloat for little gain.
It is already there, just discarded! D's error handling does this a lot, and there's no good reason for it. Useless RangeErrors (a segmentation fault would be more helpful!),
Can't you throw a segfault in the handler? The information is still there on the stack, no? -Steve
Apr 07 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 7 April 2016 at 17:55:20 UTC, Steven Schveighoffer 
wrote:
 Can't you throw a segfault in the handler? The information is 
 still there on the stack, no?
Yes, of course, I can also printf the indexes myself. But the error could also carry a bit more useful information by default - and practically for free! I'm even actually almost sold on the null pointer exception people ask for. It has more cost than a RangeError with index (seriously, the index and length are *already loaded* in registers for the comparison, it really is as close to free as you can get), but the convenience of the message by default is nice - and you can still get the nice segfault when actually running in the debugger by breaking on the handler or whatever.
Apr 07 2016
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/7/16 2:14 PM, Adam D. Ruppe wrote:
 On Thursday, 7 April 2016 at 17:55:20 UTC, Steven Schveighoffer wrote:
 Can't you throw a segfault in the handler? The information is still
 there on the stack, no?
Yes, of course, I can also printf the indexes myself. But the error could also carry a bit more useful information by default - and practically for free! I'm even actually almost sold on the null pointer exception people ask for. It has more cost than a RangeError with index (seriously, the index and length are *already loaded* in registers for the comparison, it really is as close to free as you can get), but the convenience of the message by default is nice - and you can still get the nice segfault when actually running in the debugger by breaking on the handler or whatever.
I hear you. I'm actually fervently on the side of having a seg fault printout of some kind (an exception may not be the best choice because with a seg fault, you can't trust the stack). Having to deal with real world situations where a program runs for weeks and fails on a customer's system, just having "Segmentation Fault" staring at you is no help at all. I'm sure you are in a similar situation with this. Instrumenting and running again is not only costly and time consuming, but may even subvert the actual problem (especially if it's timing sensitive). -Steve
Apr 07 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 7 April 2016 at 19:25:57 UTC, Steven Schveighoffer 
wrote:
 I hear you. I'm actually fervently on the side of having a seg 
 fault printout of some kind (an exception may not be the best 
 choice because with a seg fault, you can't trust the stack).
Aye. I'm personally a bit lukewarm on it now because it is possible to enable core dumps... but I often don't think to do it until it is too late. So it would be really nice if it just worked.

But I'd make the dereferences generated in the compiler do the check:

    byte* a;
    *a = 0;

Could be magically compiled into `a || makeNullPointerException(line), *a`, just like the bounds check it does for arrays. In that case, throwing the Error should actually be OK because it is caught before attempting to use it.

I guess the difficulty though is:

    struct A { int a; int b; }
    A* ptr;
    ptr.b = 0;

because then, the pointer is null, but the pointer+offset isn't, so maybe the codegen needs to be a bit different. But the same principle should apply - if the ptr is checked before attempting the change, it is exactly the same as a RangeError - "safe" to throw because the write hasn't actually occurred yet. (I say "safe" with scare quotes because it is still a bug and still not supposed to actually be recovered, you are supposed to let the program die, but the error throw mechanism should be fairly ok.)
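
As a hand-written approximation of that lowering, something like the following shows the principle. `NullPointerError` and `checkedDeref` are invented names for this sketch only; nothing like them is claimed to exist in druntime, and a real implementation would be emitted by the compiler at each dereference.

```d
// Hypothetical error type, named here only for illustration.
class NullPointerError : Error
{
    this(string file = __FILE__, size_t line = __LINE__)
    {
        super("Null pointer dereference", file, line);
    }
}

// Stand-in for the proposed check `ptr || fail(line), *ptr`.
// The base pointer is tested before any field offset is applied, so the
// pointer+offset case from the struct example is covered too.
ref T checkedDeref(T)(T* ptr, string file = __FILE__, size_t line = __LINE__)
{
    if (ptr is null)
        throw new NullPointerError(file, line);
    return *ptr;
}

struct A { int a; int b; }

void main()
{
    A* ptr; // pointers default to null in D
    bool caught = false;
    try
    {
        checkedDeref(ptr).b = 0; // throws before the write is attempted
    }
    catch (NullPointerError e)
    {
        caught = true;
    }
    assert(caught);

    A value;
    checkedDeref(&value).b = 7; // non-null pointers behave as usual
    assert(value.b == 7);
}
```

Because the throw happens before the store, no memory has been touched by the time the Error is raised, which is the same situation as a failed bounds check.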
 I'm sure you are in a similar situation with this. 
 Instrumenting and running again is not only costly and time 
 consuming, but may even subvert the actual problem (especially 
 if it's timing sensitive).
Absolutely, and it is also really annoying when it happens because of a user request. So a web server dies with a RangeError because of some POST data. What was the user submission that killed it? POST data isn't ordinarily logged, so I have no idea and cannot reproduce. (I typically just add assertions around it and try to debug from the code without knowing the data.) But, there's another thing that bothers me, and that is in the development process: I like to quickly guess and check programs over realistic data. Sometimes it can take a minute for the program to run (like I said, there might be millions of lines) and I'm trying to just quickly hammer this "script" out. Having to stop, add log spamming, and run again kills the good attitude. If I wanted to fight my programming language, I wouldn't be using D!
Apr 07 2016
parent reply Kagamin <spam here.lot> writes:
On Thursday, 7 April 2016 at 20:36:28 UTC, Adam D. Ruppe wrote:
 So a web server dies with a RangeError because of some POST 
 data. What was the user submission that killed it? POST data 
 isn't ordinarily logged, so I have no idea and cannot reproduce.
What's the reason to not log POST request on error? It's a plain string, should be trivial to log.
Apr 12 2016
next sibling parent reply Kapps <opantm2+spam gmail.com> writes:
On Tuesday, 12 April 2016 at 08:37:40 UTC, Kagamin wrote:
 On Thursday, 7 April 2016 at 20:36:28 UTC, Adam D. Ruppe wrote:
 So a web server dies with a RangeError because of some POST 
 data. What was the user submission that killed it? POST data 
 isn't ordinarily logged, so I have no idea and cannot 
 reproduce.
What's the reason to not log POST request on error? It's a plain string, should be trivial to log.
Amongst other things, you'd log sensitive data like passwords, which should never be stored anywhere in plain text, including log files. This is one of the reasons to not use GET for anything sensitive.
Apr 12 2016
parent reply Kagamin <spam here.lot> writes:
On Tuesday, 12 April 2016 at 08:51:23 UTC, Kapps wrote:
 Amongst other things, you'd log sensitive data like passwords, 
 which should never be stored anywhere in plain text, including 
 log files. This is one of the reasons to not use GET for 
 anything sensitive.
With Adam's idea sensitive data can still accidentally leak into this extended diagnostic mechanism.
Apr 13 2016
parent reply cym13 <cpicard openmailbox.org> writes:
On Wednesday, 13 April 2016 at 08:48:56 UTC, Kagamin wrote:
 On Tuesday, 12 April 2016 at 08:51:23 UTC, Kapps wrote:
 Amongst other things, you'd log sensitive data like passwords, 
 which should never be stored anywhere in plain text, including 
 log files. This is one of the reasons to not use GET for 
 anything sensitive.
With Adam's idea sensitive data can still accidentally leak into this extended diagnostic mechanism.
There's a world between exceptionally getting a user password in order to detect and solve a bug through an error message and knowingly logging every single user password, be it only on the legal side. In France for example you don't have the right to log most sensitive things. On the security side it's the same thing: the chances for an attacker to retrieve a password by server crashing are quite small, while getting his hands on the log file would be a goldmine.
Apr 13 2016
next sibling parent reply Kagamin <spam here.lot> writes:
On Wednesday, 13 April 2016 at 13:17:57 UTC, cym13 wrote:
 There's a world between exceptionally getting a user password 
 in order to detect and solve a bug through an error message and 
 knowingly logging every single user password, be it only on the 
 legal side.
The latter shouldn't be needed. When you log everything, you just write garbage logs, that are impractical to process.
Apr 13 2016
parent cym13 <cpicard openmailbox.org> writes:
On Wednesday, 13 April 2016 at 14:47:42 UTC, Kagamin wrote:
 On Wednesday, 13 April 2016 at 13:17:57 UTC, cym13 wrote:
 There's a world between exceptionally getting a user password 
 in order to detect and solve a bug through an error message 
 and knowingly logging every single user password, be it only 
 on the legal side.
The latter shouldn't be needed. When you log everything, you just write garbage logs, that are impractical to process.
Logging everything is illegal in many countries (for good reasons) and when I see how easy it is to make sense of a huge bunch of TCP packets, reverse a binary or even a file format I sincerely think you are either trollish, naive or misguided when it comes to logs. Then again it has little to do with the problem in the first place: getting better error messages is easy to achieve and would be a clear gain.
Apr 13 2016
prev sibling parent w0rp <devw0rp gmail.com> writes:
On Wednesday, 13 April 2016 at 13:17:57 UTC, cym13 wrote:
 There's a world between exceptionally getting a user password 
 in order to detect and solve a bug through an error message and 
 knowingly logging every single user password, be it only on the 
 legal side. In France for example you don't have the right to 
 log most sensitive things. On the security side it's the same 
 thing: the chances for an attacker to retrieve a password by 
 server crashing are quite small, while getting his hands on the 
 log file would be a goldmine.
This problem is typically solved by providing a list of keys to either whitelist or blacklist for logging from POST requests, so sensitive data is excluded from logs, but other data is available so you can find out what went wrong. I don't think Adam's request to log the index of an array will be a security risk worth worrying about, however, not that you were indicating that. I think his request is quite reasonable, any odd implementation details permitting.
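
A minimal sketch of the whitelist variant, assuming the POST fields have already been parsed into an associative array. `loggableFields` and the field names are made up for the example and do not come from any particular web framework.

```d
import std.algorithm.searching : canFind;

// Hypothetical helper: keep only the fields whose keys are on the allow-list,
// so passwords and similar values never reach the log.
string[string] loggableFields(string[string] post, string[] allowedKeys)
{
    string[string] result;
    foreach (key, value; post)
    {
        if (allowedKeys.canFind(key))
            result[key] = value;
    }
    return result;
}

void main()
{
    auto post = ["username": "alice", "password": "hunter2", "page": "3"];
    auto safe = loggableFields(post, ["username", "page"]);

    assert(safe.length == 2);
    assert("password" !in safe); // sensitive field was dropped
    assert(safe["page"] == "3");
}
```

A blacklist works the same way with the condition inverted, though a whitelist fails safer when new fields are added later.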
Apr 14 2016
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 12 April 2016 at 08:37:40 UTC, Kagamin wrote:
 What's the reason to not log POST request on error? It's a 
 plain string, should be trivial to log.
It is possible (and sometimes I do it, especially on an exception catcher) but it isn't customary because of sensitive things like passwords and credit card numbers, and because of large pieces of data like file uploads. Of course, these can be filtered out of an error log... but with the file upload, wouldn't it be nice to know where the error occurred so you can filter around it? BTW so I opened a PR with an implementation. At least there was a comment on it, though not one I'm satisfied with. https://github.com/D-Programming-Language/dmd/pull/5637 I'm just gonna fork the language eventually.
Apr 12 2016
parent Kagamin <spam here.lot> writes:
On Tuesday, 12 April 2016 at 14:35:25 UTC, Adam D. Ruppe wrote:
 Of course, these can be filtered out of an error log... but 
 with the file upload, wouldn't it be nice to know where the 
 error occurred so you can filter around it?
If you have the file, there's no need for overly smart logging and filtering. You can just reproduce the whole scenario.
 I'm just gonna fork the language eventually.
A compiler option would probably be less intrusive :)
Apr 13 2016
prev sibling parent Kagamin <spam here.lot> writes:
Maybe set up onRangeError handler to create the process dump? 
Then you will get everything.
Apr 07 2016