digitalmars.D - I want this so badly, please implement

Adam D. Ruppe <destructionator gmail.com> writes:

Just make RangeError report the index and length, where 
easy. For ordinary arrays, both are size_t so it can be trivially 
passed by value without thinking about ownership or anything.

And it would save me hours.

https://issues.dlang.org/show_bug.cgi?id=15889


I would like to know the index on associative arrays too, but I 
understand that can be more difficult.

But if you guys implement it for regular array for the next 
release, I would be exceedingly joyous.

Apr 06 2016

Nordlöw <per.nordlow gmail.com> writes:

On Thursday, 7 April 2016 at 02:00:39 UTC, Adam D. Ruppe wrote:
 https://issues.dlang.org/show_bug.cgi?id=15889

Could you please add a test-program with current and wanted 
behaviour?

Do you mean that the index and length are lost in the compiler 
which should then be passed to the RangeError object? Could you 
highlight the details?

If both dmd and druntime/phobos need to be updated could you 
point out some source locations of interest?

I might give this a try.

Apr 07 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 7 April 2016 at 07:07:32 UTC, Nordlöw wrote:
 Could you please add a test-program with current and wanted 
 behaviour?

int[] a = new int[](1);
auto x = a[2]; // currently throws RangeError, telling file and line of code


I'd like it to throw a RangeError telling file and line of code, 
and idx == 2, length == 1.


The way it works is the compiler, in e2ir.c (unless I'm mistaken) 
turns the index and slice expressions into code. It basically 
generates this code:

((idx < length) || _d_array_bounds(module*, cast(uint) 
line_number)), array.ptr[idx]


What I want it to do is generate this code:


((idx < length) || _d_array_bounds(module*, cast(uint) 
line_number, idx, length)), array.ptr[idx]


Just so idx and length are passed to the druntime function.

That's dmd's change. Make sure it covers all the places the array 
bounds functions are called.

(I thought this would be easy but I keep hitting dmd assert 
failures. However, that's probably because I'm a n00b, it 
probably is easy if you know the e2ir style.)

Then, on the druntime side, change the function to accept the 
size_t idx, size_t length in addition to its current args.

Forward them through to the RangeError object (or a subclass), 
which then prints them too.
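
The effect of that lowering can be approximated in user code today. The following is only a sketch: `IndexedRangeError` and `checkedIndex` are hypothetical names made up for illustration, not part of dmd or druntime, and the real change would live in the compiler-generated bounds check rather than in a helper function.

```d
import std.conv : text;

/// Hypothetical Error subclass that carries the failing index and the length.
class IndexedRangeError : Error
{
    size_t index;
    size_t length;

    this(size_t index, size_t length,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(text("Range violation: index ", index,
                   " is out of bounds for length ", length), file, line);
        this.index = index;
        this.length = length;
    }
}

/// Hypothetical helper mirroring the lowering: compare first, and only
/// touch the memory once the index is known to be in bounds.
ref T checkedIndex(T)(T[] array, size_t idx,
                      string file = __FILE__, size_t line = __LINE__)
{
    if (idx >= array.length)
        throw new IndexedRangeError(idx, array.length, file, line);
    return array.ptr[idx];
}

void main()
{
    import std.stdio : writeln;

    int[] a = new int[](1);
    try
    {
        int x = checkedIndex(a, 2);
        writeln(x);
    }
    catch (IndexedRangeError e)
    {
        // Catching an Error is only done here to show the message.
        writeln(e.msg, " (", e.file, ":", e.line, ")");
    }
}
```

Both values are plain size_t, so they are copied into the error object without any ownership questions, which is the point made at the top of the thread.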

Apr 07 2016

Daniel Murphy <yebbliesnospam gmail.com> writes:

On 8/04/2016 1:59 AM, Adam D. Ruppe wrote:

 (I thought this would be easy but I keep hitting dmd assert failures.
 However, that's probably because I'm a n00b, it probably is easy if you
 know the e2ir style.)

If you have a patch I can probably point out the error.  You're right 
that it should be fairly straightforward, although you'll need to copy 
elems to temps when they're used multiple times to avoid the 
multiple-references asserts/ICEs.

Apr 07 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 7 April 2016 at 16:22:17 UTC, Daniel Murphy wrote:
 If you have a patch I can probably point out the error.  You're 
 right that it should be fairly straightforward, although you'll 
 need to copy elems to temps when they're used multiple times to 
 avoid the multiple-references asserts/ICEs.

That's probably it, I'll try later today and let you know...

Apr 07 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 7 April 2016 at 16:22:17 UTC, Daniel Murphy wrote:
 you'll need to copy elems to temps when they're
 used multiple times to avoid the multiple-references 
 asserts/ICEs.


Well, I copied the elems using el_same and it all compiles, but 
I realize this calls module.__array, which is a generated 
function in glue.c, and doesn't forward the arguments :(

Blah, so gotta change that too.

Check genhelpers() in glue.c, it has "// Get sole parameter, 
linnum" but it isn't the sole param anymore :(


I kinda wish it just generated a code string and compiled that, 
it'd certainly be easier for us n00bs.



But, could you take a look at this and tell me how it looks?

https://github.com/D-Programming-Language/dmd/pull/5637

Apr 07 2016

Kagamin <spam here.lot> writes:

On Thursday, 7 April 2016 at 02:00:39 UTC, Adam D. Ruppe wrote:
 Just make RangeError report the index and length, where 
 easy. For ordinary arrays, both are size_t so it can be 
 trivially passed by value without thinking about ownership or 
 anything.

 And it would save me hours.

 https://issues.dlang.org/show_bug.cgi?id=15889

How do you spend time on it?

Apr 07 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 7 April 2016 at 09:31:15 UTC, Kagamin wrote:
 How do you spend time on it?

A program dies with a RangeError. I don't know why because it 
doesn't tell me any useful information about the failing data. 
This is quite frustrating because that information is available 
and trivial to attach to the object, but it is just dropped.

What I'd actually do in a perfect world is attach the whole array 
to the object:

class RangeError : Error {}
class TypedRangeError(T) : RangeError {
    T[] slice;
    size_t attemptedIndex;
    override string toString() { /* print the above in some human-readable way */ }
}


Because then I can see what runtime data specifically caused the 
problem.


But like I said about AAs, adding an array is harder, so I'm 
happy with just the numbers as a compromise - at least then I can 
see what part of the loop is the problem instead of just the line 
of code.
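
A fuller version of that idea might look like the sketch below. It is an illustration under assumptions, not what druntime does: `at` is a hypothetical helper standing in for the compiler-generated check, and the details are put into the inherited `msg` field instead of an overridden toString. The slice is stored as-is, so a slice of a static array on the stack would have to be copied first.

```d
import core.exception : RangeError;
import std.conv : text;

/// Sketch of the class above: a RangeError that remembers the data.
class TypedRangeError(T) : RangeError
{
    T[] slice;
    size_t attemptedIndex;

    this(T[] slice, size_t attemptedIndex,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(file, line);
        this.slice = slice;
        this.attemptedIndex = attemptedIndex;
        // Replace the generic message with one that names the index,
        // the length, and the contents of the slice.
        msg = text("Range violation: index ", attemptedIndex,
                   " on slice of length ", slice.length, ": ", slice);
    }
}

/// Hypothetical checked accessor that throws the typed error.
ref T at(T)(T[] array, size_t idx,
            string file = __FILE__, size_t line = __LINE__)
{
    if (idx >= array.length)
        throw new TypedRangeError!T(array, idx, file, line);
    return array[idx];
}

void main()
{
    import std.stdio : writeln;

    int[] record = [4, 8, 15];
    try
    {
        int value = at(record, 5);
        writeln(value);
    }
    catch (TypedRangeError!int e)
    {
        writeln(e.msg);
    }
}
```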



The time sink comes when processing a million items and seeing a 
random RangeError in the middle of processing. file.d:6 is great, 
but which one of the million lines of DATA was the problem? At 
least giving me the index and length narrows that down.

I typically go back and edit the source to print out some kind of 
log info, then wait for it to all run again. Just a frustration - 
one of the few I face regularly in D.

Apr 07 2016

Kagamin <spam here.lot> writes:

On Thursday, 7 April 2016 at 15:35:31 UTC, Adam D. Ruppe wrote:
 A program dies with a RangeError. I don't know why because it 
 doesn't tell me any useful information about the failing data.

The length of data will tell you which data it is?

 This is quite frustrating because that information is available 
 and trivial to attach to the object, but it is just dropped.

Sounds like code bloat for little gain. I'm afraid data length 
and index value are not generally useful to diagnose this 
problem. Throwing the slice can be helpful, but can't be done 
because it can be a static array on stack.

Apr 07 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 7 April 2016 at 17:03:31 UTC, Kagamin wrote:
 The length of data will tell you which data it is?

It tells me more than ABSOLUTELY NOTHING. Between the length and 
the index, I can at least get an idea of if there's a broken 
record or broken code.

 Sounds like code bloat for little gain.

It is already there, just discarded!

D's error handling does this a lot, and there's no good reason 
for it. Useless RangeErrors (a segmentation fault would be more 
helpful!), generic Exceptions with random strings instead of 
useful info, dumb crap like enforce discarding easy info. Ugh.

Apr 07 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 4/7/16 1:16 PM, Adam D. Ruppe wrote:
 On Thursday, 7 April 2016 at 17:03:31 UTC, Kagamin wrote:
 The length of data will tell you which data it is?

 It tells me more than ABSOLUTELY NOTHING. Between the length and the
 index, I can at least get an idea of if there's a broken record or
 broken code.

 Sounds like code bloat for little gain.

 It is already there, just discarded!

 D's error handling does this a lot, and there's no good reason for it.
 Useless RangeErrors (a segmentation fault would be more helpful!),

Can't you throw a segfault in the handler? The information is still 
there on the stack, no?

-Steve

Apr 07 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 7 April 2016 at 17:55:20 UTC, Steven Schveighoffer 
wrote:
 Can't you throw a segfault in the handler? The information is 
 still there on the stack, no?

Yes, of course, I can also printf the indexes myself. But the 
error could also carry a bit more useful information by default - 
and practically for free!


I'm even actually almost sold on the null pointer exception 
people ask for. It has more cost than a RangeError with index 
(seriously, the index and length are *already loaded* in 
registers for the comparison, it really is as close to free as 
you can get), but the convenience of the message by default is 
nice - and you can still get the nice segfault when actually 
running in the debugger by breaking on the handler or whatever.

Apr 07 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 4/7/16 2:14 PM, Adam D. Ruppe wrote:
 On Thursday, 7 April 2016 at 17:55:20 UTC, Steven Schveighoffer wrote:
 Can't you throw a segfault in the handler? The information is still
 there on the stack, no?

 Yes, of course, I can also printf the indexes myself. But the error
 could also carry a bit more useful information by default - and
 practically for free!


 I'm even actually almost sold on the null pointer exception people ask
 for. It has more cost than a RangeError with index (seriously, the index
 and length are *already loaded* in registers for the comparison, it
 really is as close to free as you can get), but the convenience of the
 message by default is nice - and you can still get the nice segfault
 when actually running in the debugger by breaking on the handler or
 whatever.

I hear you. I'm actually fervently on the side of having a seg fault 
printout of some kind (an exception may not be the best choice because 
with a seg fault, you can't trust the stack).

Having to deal with real world situations where a program runs for weeks 
and fails on a customer's system, just having "Segmentation Fault" 
staring at you is no help at all.

I'm sure you are in a similar situation with this. Instrumenting and 
running again is not only costly and time consuming, but may even 
subvert the actual problem (especially if it's timing sensitive).

-Steve

Apr 07 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 7 April 2016 at 19:25:57 UTC, Steven Schveighoffer 
wrote:
 I hear you. I'm actually fervently on the side of having a seg 
 fault printout of some kind (an exception may not be the best 
 choice because with a seg fault, you can't trust the stack).

Aye. I'm personally a bit lukewarm on it now because it is 
possible to enable core dumps... but I often don't think to do it 
until it is too late. So it would be really nice if it just 
worked.

But I'd make the compiler check the dereferences it generates:

byte* a;
*a = 0;

Could be magically compiled into `a || 
makeNullPointerException(line), *a`, just like it does for 
arrays.

In that case, throwing the Error should actually be OK because it 
is caught before attempting to use it.



I guess the difficulty though is:

struct A { int a; int b; }
A* ptr;
ptr.b = 0;


because then, the pointer is null, but the pointer+offset isn't, 
so maybe the codegen needs to be a bit different. But the same 
principle should apply - if the ptr is checked before attempting 
the change, it is exactly the same as a RangeError - "safe" to 
throw because the write hasn't actually occurred yet.

(I say "safe" with scare quotes because it is still a bug and 
still not supposed to actually be recovered, you are supposed to 
let the program die, but the error throw mechanism should be 
fairly ok.)
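
The check-before-use idea can be sketched in library code. `NullPointerError` and `checkedDeref` are hypothetical names invented for this example; nothing like them is generated by the compiler. The helper tests the base pointer before any member offset is applied, which sidesteps the pointer+offset problem, and it throws before the write happens.

```d
/// Hypothetical error type for a failed null check.
class NullPointerError : Error
{
    this(string file = __FILE__, size_t line = __LINE__)
    {
        super("Null pointer dereference", file, line);
    }
}

/// Hypothetical helper: check the base pointer, then hand back a reference.
ref T checkedDeref(T)(T* ptr,
                      string file = __FILE__, size_t line = __LINE__)
{
    if (ptr is null)
        throw new NullPointerError(file, line);
    return *ptr;
}

struct A { int a; int b; }

void main()
{
    import std.stdio : writeln;

    A* ptr;
    try
    {
        // ptr is null, but ptr + offsetof(b) would not be; the check
        // runs on ptr itself, before the member is touched.
        checkedDeref(ptr).b = 0;
    }
    catch (NullPointerError e)
    {
        writeln(e.msg, " at ", e.file, ":", e.line);
    }
}
```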

 I'm sure you are in a similar situation with this. 
 Instrumenting and running again is not only costly and time 
 consuming, but may even subvert the actual problem (especially 
 if it's timing sensitive).

Absolutely, and it is also really annoying when it happens 
because of a user request.

So a web server dies with a RangeError because of some POST data. 
What was the user submission that killed it? POST data isn't 
ordinarily logged, so I have no idea and cannot reproduce. (I 
typically just add assertions around it and try to debug from the 
code without knowing the data.)


But, there's another thing that bothers me, and that is in the 
development process: I like to quickly guess and check programs 
over realistic data. Sometimes it can take a minute for the 
program to run (like I said, there might be millions of lines) 
and I'm trying to just quickly hammer this "script" out. Having 
to stop, add log spamming, and run again kills the good attitude.

If I wanted to fight my programming language, I wouldn't be using 
D!

Apr 07 2016

Kagamin <spam here.lot> writes:

On Thursday, 7 April 2016 at 20:36:28 UTC, Adam D. Ruppe wrote:
 So a web server dies with a RangeError because of some POST 
 data. What was the user submission that killed it? POST data 
 isn't ordinarily logged, so I have no idea and cannot reproduce.

What's the reason to not log POST request on error? It's a plain 
string, should be trivial to log.

Apr 12 2016

Kapps <opantm2+spam gmail.com> writes:

On Tuesday, 12 April 2016 at 08:37:40 UTC, Kagamin wrote:
 On Thursday, 7 April 2016 at 20:36:28 UTC, Adam D. Ruppe wrote:
 So a web server dies with a RangeError because of some POST 
 data. What was the user submission that killed it? POST data 
 isn't ordinarily logged, so I have no idea and cannot 
 reproduce.

 What's the reason to not log POST request on error? It's a 
 plain string, should be trivial to log.

Amongst other things, you'd log sensitive data like passwords, 
which should never be stored anywhere in plain text, including 
log files. This is one of the reasons to not use GET for anything 
sensitive.

Apr 12 2016

Kagamin <spam here.lot> writes:

On Tuesday, 12 April 2016 at 08:51:23 UTC, Kapps wrote:
 Amongst other things, you'd log sensitive data like passwords, 
 which should never be stored anywhere in plain text, including 
 log files. This is one of the reasons to not use GET for 
 anything sensitive.

With Adam's idea sensitive data can still accidentally leak into 
this extended diagnostic mechanism.

Apr 13 2016

cym13 <cpicard openmailbox.org> writes:

On Wednesday, 13 April 2016 at 08:48:56 UTC, Kagamin wrote:
 On Tuesday, 12 April 2016 at 08:51:23 UTC, Kapps wrote:
 Amongst other things, you'd log sensitive data like passwords, 
 which should never be stored anywhere in plain text, including 
 log files. This is one of the reasons to not use GET for 
 anything sensitive.

 With Adam's idea sensitive data can still accidentally leak 
 into this extended diagnostic mechanism.

There's a world between exceptionally getting a user password in 
order to detect and solve a bug through an error message and 
knowingly logging every single user password, be it only on the 
legal side. In France for example you don't have the right to log 
most sensitive things. On the security side it's the same thing: 
the chances for an attacker to retrieve a password by server 
crashing are quite small, while getting his hands on the log file 
would be a goldmine.

Apr 13 2016

Kagamin <spam here.lot> writes:

On Wednesday, 13 April 2016 at 13:17:57 UTC, cym13 wrote:
 There's a world between exceptionally getting a user password 
 in order to detect and solve a bug through an error message and 
 knowingly logging every single user password, be it only on the 
 legal side.

The latter shouldn't be needed. When you log everything, you just 
write garbage logs, that are impractical to process.

Apr 13 2016

cym13 <cpicard openmailbox.org> writes:

On Wednesday, 13 April 2016 at 14:47:42 UTC, Kagamin wrote:
 On Wednesday, 13 April 2016 at 13:17:57 UTC, cym13 wrote:
 There's a world between exceptionally getting a user password 
 in order to detect and solve a bug through an error message 
 and knowingly logging every single user password, be it only 
 on the legal side.

 The latter shouldn't be needed. When you log everything, you 
 just write garbage logs, that are impractical to process.

Logging everything is illegal in many countries (for good 
reasons) and when I see how easy it is to make sense of a huge 
bunch of TCP packets, reverse a binary or even a file format I 
sincerely think you are either trollish, naive or misguided when 
it comes to logs.

Then again it has little to do with the problem in the first 
place: getting better error messages is easy to achieve and would 
be a clear gain.

Apr 13 2016

w0rp <devw0rp gmail.com> writes:

On Wednesday, 13 April 2016 at 13:17:57 UTC, cym13 wrote:
 There's a world between exceptionally getting a user password 
 in order to detect and solve a bug through an error message and 
 knowingly logging every single user password, be it only on the 
 legal side. In France for example you don't have the right to 
 log most sensitive things. On the security side it's the same 
 thing: the chances for an attacker to retrieve a password by 
 server crashing are quite small, while getting his hands on the 
 log file would be a goldmine.

This problem is typically solved by providing a list of keys to 
either whitelist or blacklist for logging from POST requests, so 
sensitive data is excluded from logs, but other data is available 
so you can find out what went wrong.
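
As a rough sketch of the blacklist variant, assuming the POST fields have already been parsed into an associative array: `redactForLog` and the key names are made up for this example and do not come from any particular web framework.

```d
import std.algorithm.searching : canFind;

/// Hypothetical helper: copy the fields, masking every blacklisted key.
string[string] redactForLog(string[string] fields, string[] sensitiveKeys)
{
    string[string] safe;
    foreach (key, value; fields)
        safe[key] = sensitiveKeys.canFind(key) ? "[redacted]" : value;
    return safe;
}

void main()
{
    import std.stdio : writeln;

    auto post = ["user": "alice", "password": "hunter2", "comment": "hello"];
    // Only the redacted copy would be written to the error log.
    writeln(redactForLog(post, ["password", "card_number"]));
}
```

A whitelist works the same way with the condition inverted: only keys on the list keep their value.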

I don't think Adam's request to log the index of an array will be 
a security risk worth worrying about, however, not that you were 
indicating that. I think his request is quite reasonable, any odd 
implementation details permitting.

Apr 14 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Tuesday, 12 April 2016 at 08:37:40 UTC, Kagamin wrote:
 What's the reason to not log POST request on error? It's a 
 plain string, should be trivial to log.

It is possible (and sometimes I do it, especially on an exception 
catcher) but it isn't customary because of sensitive things like 
passwords and credit card numbers, and because of large pieces 
of data like file uploads.

Of course, these can be filtered out of an error log... but with 
the file upload, wouldn't it be nice to know where the error 
occurred so you can filter around it?



BTW so I opened a PR with an implementation. At least there was a 
comment on it, though not one I'm satisfied with.

https://github.com/D-Programming-Language/dmd/pull/5637

I'm just gonna fork the language eventually.

Apr 12 2016

Kagamin <spam here.lot> writes:

On Tuesday, 12 April 2016 at 14:35:25 UTC, Adam D. Ruppe wrote:
 Of course, these can be filtered out of an error log... but 
 with the file upload, wouldn't it be nice to know where the 
 error occurred so you can filter around it?

If you have the file, there's no need for overly smart logging 
and filtering. You can just reproduce the whole scenario.

 I'm just gonna fork the language eventually.

A compiler option would be probably less intrusive :)

Apr 13 2016

Kagamin <spam here.lot> writes:

Maybe set up onRangeError handler to create the process dump? 
Then you will get everything.

Apr 07 2016
