digitalmars.D.debugger - Can't set BP in middle of line

Michelle Long <HappyDance321 gmail.com> writes:

statement1; statement2;

Would be nice to easily be able to set a BP just on statement2 
without having to add a new line.

Oct 30 2018

Rainer Schuetze <r.sagitario gmx.de> writes:

On 30/10/2018 16:21, Michelle Long wrote:
 statement1; statement2;
 
 Would be nice to easily be able to set a BP just on statement2 without
 having to add a new line.
 

I haven't seen this with the VS native debugger and C++, so I doubt it
is feasible.

Nov 03 2018

Michelle Long <HappyDance321 gmail.com> writes:

On Saturday, 3 November 2018 at 08:10:37 UTC, Rainer Schuetze 
wrote:
 On 30/10/2018 16:21, Michelle Long wrote:
 statement1; statement2;
 
 Would be nice to easily be able to set a BP just on statement2 
 without having to add a new line.
 

 I haven't seen this with the VS native debugger and C++, so I 
 doubt it is feasible.

Supposedly it is possible. When you put a BP it has a "char" 
value which is usually set to 1 (you can see it when you 
hover over the BP).

Supposedly setting the value to something else allows one to do 
BP's in the middle of lines. It may be only valid for certain 
languages though but it is doable in some cases as you can find 
information online about people doing it.

https://stackoverflow.com/questions/36166205/setting-a-breakpoint-in-the-middle-of-a-line-with-multiple-statements

https://stackoverflow.com/questions/3952782/setting-break-point-using-source-code-line-number-in-windbg

So, this should be possible. After all, it's just a "virtual" 
problem having to do with whitespace.

Nov 03 2018

Rainer Schuetze <r.sagitario gmx.de> writes:

On 03/11/2018 17:31, Michelle Long wrote:
 On Saturday, 3 November 2018 at 08:10:37 UTC, Rainer Schuetze wrote:
 On 30/10/2018 16:21, Michelle Long wrote:
 statement1; statement2;

 Would be nice to easily be able to set a BP just on statement2
 without having to add a new line.

 I haven't seen this with the VS native debugger and C++, so I doubt it
 is feasible.

 
 Supposedly it is possible. When you put a BP it has a "char" value which
 is usually set to 1. (you can see it in when you hover over the BP).
 
 Supposedly setting the value to something else allows one to do BP's in
 the middle of lines. It may be only valid for certain languages though
 but it is doable in some cases as you can find information online about
 people doing it.
 
 https://stackoverflow.com/questions/36166205/setting-a-breakpoint-in-the-middle-of-a-line-with-multiple-statements


These examples are with the debugger for managed code (e.g. C#), not the
debug engine for native applications.

I haven't yet seen any compiler (C++ or D) emit CodeView debug
information that contains more than the line number.

Nov 04 2018

Michelle Long <HappyDance321 gmail.com> writes:

On Sunday, 4 November 2018 at 08:27:34 UTC, Rainer Schuetze wrote:
 On 03/11/2018 17:31, Michelle Long wrote:
 On Saturday, 3 November 2018 at 08:10:37 UTC, Rainer Schuetze
 wrote:
 On 30/10/2018 16:21, Michelle Long wrote:
 statement1; statement2;

 Would be nice to easily be able to set a BP just on
 statement2 without having to add a new line.

 I haven't seen this with the VS native debugger and C++, so I
 doubt it is feasible.

 Supposedly it is possible. When you put a BP it has a "char"
 value which is usually set to 1. (you can see it in when you
 hover over the BP).

 Supposedly setting the value to something else allows one to
 do BP's in the middle of lines. It may be only valid for
 certain languages though but it is doable in some cases as you
 can find information online about people doing it.

 https://stackoverflow.com/questions/36166205/setting-a-breakpoint-in-the-middle-of-a-line-with-multiple-statements

 These examples are with the debugger for managed code (e.g.
 C#), not the debug engine for native applications.

 I haven't yet seen any compiler (C++ or D) emit CodeView debug
 information that contains more than the line number.

Is it then possible to simply split a line internally to handle
it?

1. Surely the char/statement for the BP on a multi-statement line
can be determined?

2. If such a BP exists, then simply insert a new line in the text
before compiling (I know it modifies the code, but it's just
whitespace and it could be an optional feature).

3. Then remove the inserted new line character after compiling
(might need to suppress file modifications temporarily).

4. When a BP is hit, just re-calculate the line positions (they
will be offset by one and such).

Should be pretty simple to do and allows one to not have to worry
about reformatting code just to add BP's.

I routinely do stuff like

if (x) return;

and I want a BP on the return, not the if. I have to insert the
new line by hand just so I can add a BP!

Since this is a whitespace problem it really should be trivial
and technically should be supported by the compiler and
debugger/IDE. It's a pretty basic problem and also helpful to
have the ability to put BP's in the middle of lines.

It may actually be easier to implement though since the IDE
already does have some of the capabilities.

Nov 04 2018

Stefan Koch <uplink.coder googlemail.com> writes:

On Sunday, 4 November 2018 at 21:17:32 UTC, Michelle Long wrote:
 On Sunday, 4 November 2018 at 08:27:34 UTC, Rainer Schuetze 
 wrote:
 [...]


 Is it then possible to simply split a line internally to handle 
 it?

 [...]

Debug information is mapped to machine-code on a source line by 
source line basis.

Therefore it's indeed nontrivial to do this.

Nov 20 2018

Michelle Long <HappyDance321 gmail.com> writes:

On Tuesday, 20 November 2018 at 15:11:56 UTC, Stefan Koch wrote:
 On Sunday, 4 November 2018 at 21:17:32 UTC, Michelle Long wrote:
 On Sunday, 4 November 2018 at 08:27:34 UTC, Rainer Schuetze 
 wrote:
 [...]


 Is it then possible to simply split a line internally to 
 handle it?

 [...]

 Debug information is mapped to machine-code on a source line by 
 source line basis.

 Therefore it's indeed nontrivial to do this.


Why would that be non-trivial?

Semantically, there is no difference between

statement; statement;

and

statement;
statement;

To say that there is a difference to the compiler is not true.

To say it is non-trivial does not mean it is. You have to at 
least prove it with some reasonable logic because the facts do 
not support your claim.


Whatever mapping takes place, one just needs to propagate the 
column info along with the line. It may be non-trivial to 
implement in the compiler because of the design, but it is not a 
hard problem, in fact, it should be quite easy.


In fact, if the compiler simply broke every statement into its 
own line, by adding a new line after each statement (after every `;`), 
and memorized that mapping, then it could be inverted to get the 
desired behavior... that is almost a trivial solution, so to say 
it is non-trivial simply does not jibe and you will have to be 
more specific why.

For example, one could write a pre-parser that records the line 
and column mapping of statements and converts all multi-statement 
lines into one statement per line. Compile, then map whatever 
the compiler reports back to the original positions:

statement1; statement2; statement3;

Record line/col

0/0 0/12 0/24

convert

statement1;
statement2;
statement3;

0/0
1/0
2/0

map

0/0 -> 0/0
0/12 -> 1/0
0/24 -> 2/0

Then the mapping can be inverted after compilation.

E.g., if there is an error on line 2, it is looked up in the 
mapping

2/0 -> 0/24.

And so the error is actually at 0/24 in the original 
source (before the newlines were added).

It's quite simple. So you are simply wrong. One could write such 
a preprocessor and get the mapping and such and it would work 
fine. All it would require is to properly parse D code to 
get the end of statements ';', insert a new line, and determine 
the mapping and for the debugger then to invert the map when line 
info is presented.
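
As a rough illustration of the record/convert/map steps above, here is a minimal sketch in D. It is not how dmd or any existing tool works: `Pos`, `SplitResult` and `splitStatements` are hypothetical names, and the splitter naively assumes that every `;` ends a statement (so no `;` inside strings, comments or for(...) headers).

```d
import std.ascii : isWhite;
import std.stdio : write;

/// Zero-based line/column, written "line/col" in the post above.
struct Pos { size_t line, col; }

/// Hypothetical result type: the rewritten source plus the inverse map.
struct SplitResult
{
    string text;     // one statement per line
    Pos[] original;  // original[k] = where the statement on new line k began
}

/// Naive splitter. Assumption: every ';' terminates a statement.
SplitResult splitStatements(string src)
{
    SplitResult r;
    size_t line = 0, col = 0;
    bool atStart = true;                 // waiting for the next statement
    foreach (char c; src)
    {
        if (atStart && !isWhite(c))
        {
            r.original ~= Pos(line, col); // record where this statement began
            atStart = false;
        }
        if (!atStart)
            r.text ~= (c == '\n') ? ' ' : c; // keep each statement on one line
        if (c == ';')
        {
            r.text ~= '\n';              // the inserted newline
            atStart = true;
        }
        if (c == '\n') { line++; col = 0; }
        else col++;
    }
    return r;
}

void main()
{
    auto r = splitStatements("statement1; statement2; statement3;");
    write(r.text);
    // Something reported on new line 2 maps back to 0/24 in the original.
    assert(r.original[2] == Pos(0, 24));
}
```

A tool built this way would look up `original[k]` whenever the compiler or debugger reports line k of the rewritten text. Handling real D syntax would need the actual parser rather than a scan for `;`.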

Because it is trivial to do, except for the D language 
processing(which the dmd parser should handle without problem and 
why it could be done in dmd transparently pre and post), it 
probably should be looked into.

It's easy to say something is true or false... but it doesn't 
make it so. I've demonstrated a feasible solution.

Many people thought it was impossible to do X and they were 
proven wrong. Have you learned your lesson? We can at least then 
spend the time in finding an optimal way to do this so we all 
benefit from it.

Nov 26 2018

wjoe <fake example.com> writes:

On Monday, 26 November 2018 at 10:42:02 UTC, Michelle Long wrote:
 On Tuesday, 20 November 2018 at 15:11:56 UTC, Stefan Koch wrote:
 On Sunday, 4 November 2018 at 21:17:32 UTC, Michelle Long 
 wrote:
 On Sunday, 4 November 2018 at 08:27:34 UTC, Rainer Schuetze 
 wrote:
 [...]


 Is it then possible to simply split a line internally to 
 handle it?

 [...]

 Debug information is mapped to machine-code on a source line 
 by source line basis.

 Therefore it's indeed nontrivial to do this.


 Why would that be non-trivial?

 semantically there is nothing different between

 statement; statement;

Theoretically nothing's stopping a debugger from setting a 
breakpoint in between those 2 statements, because when those 
statements are compiled to machine code that structure doesn't 
exist anymore. There's no whitespace in machine code, nor are 
there statements separated by a delimiter.

Consider this:

A)
10| int x = 0; int y = 0; // this is line number 10 in the source file
11| writeln(x, y);

B)
10| int x = 0, y = 0; // this is line number 10 in the source file
11| writeln(x, y);

C)
10| int x = 0; // this is line number 10 in the source file
11| int y = 0;
12| writeln(x, y);

this would be translated to something like this (and is highly 
dependent on the architecture):

0x00230| xor accumulator, accumulator
0x00231| move to-address-of-var-x, value-of-accumulator
0x00233| move to-address-of-var-y, value-of-accumulator
0x00235| ...
^
| this is the address of the machine code in memory

all of these assignments would compile to the same 3 machine 
instructions.
(This is not quite how it actually works but just to show the 
concept. Depending on the architecture it might not be possible 
to write to an address directly, and it would have to be loaded 
into an address register to which the value would then be moved. 
Also consider stack, variable alignment, machine instructions 
might need to be aligned and padded with NOPs, etc., etc.)

So what happens when you set a breakpoint on line 10 in your 
debugger ?

The debugger will rewrite your code in RAM and overwrite the 
instruction at address 0x00230 with an INT3, like so:

0x00230| INT3
0x00231| move to-address-of-var-x, value-of-accumulator
0x00233| move to-address-of-var-y, value-of-accumulator
0x00235| ...

The INT3 is a one-byte instruction (and applies to the x86 
architecture) intended to be used by debuggers in order to 
interrupt the flow of a running program.

The important part here is: one-byte instruction. Which means it 
can be used to overwrite any machine instruction.

Once the CPU executes the INT3 instruction, the execution stops 
and you regain control of your program in the debugger.
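
To make the patching step concrete, here is a toy sketch in D in which the debuggee's code is just a byte array. `Breakpoints`, `set` and `unset` are hypothetical names; a real debugger writes into another process through OS facilities and also has to deal with the instruction pointer, none of which is modelled here. The only x86 facts used are that 0xCC is the INT3 opcode, that 0x31 0xC0 encodes `xor eax, eax`, and that 0x90 is a NOP.

```d
/// Toy model of software breakpoints: "RAM" is a plain ubyte array.
struct Breakpoints
{
    enum ubyte INT3 = 0xCC;   // one-byte x86 breakpoint opcode

    ubyte[] code;             // stand-in for the debuggee's code in memory
    ubyte[size_t] saved;      // address -> original byte replaced by INT3

    /// Overwrite the first byte at addr with INT3, remembering the original.
    void set(size_t addr)
    {
        if (addr in saved) return;   // already set, keep the saved byte
        saved[addr] = code[addr];
        code[addr] = INT3;
    }

    /// Put the original byte back, as done before resuming execution.
    void unset(size_t addr)
    {
        if (auto p = addr in saved)
        {
            code[addr] = *p;
            saved.remove(addr);
        }
    }
}

void main()
{
    // xor eax, eax (two bytes) followed by two NOPs as filler.
    ubyte[] mem = [0x31, 0xC0, 0x90, 0x90];
    auto bp = Breakpoints(mem);

    bp.set(0);
    assert(bp.code[0] == 0xCC);   // only the first byte was replaced
    assert(bp.code[1] == 0xC0);   // the rest of the instruction is untouched

    bp.unset(0);
    assert(bp.code[0] == 0x31);   // original opcode restored
}
```

Because only one byte is replaced, the same trick works at the start of any instruction, however long it is.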

Now it depends on how you proceed.

I) You continue execution, e.g. by pressing F5 -> The debugger 
restores the original opcode (xor accumulator, accumulator) at 
address 0x00230, moves the instruction pointer back to that 
address, and transfers control back to the CPU, which continues 
with the instructions at 0x00230, 0x00231, 0x00233, etc.


II) You single step, e.g. by pressing F8.

Case A) and B) behave in the same way: the debugger would write 
the INT3 instruction to 0x00235, restore 0x00230 and resume. The 
program then halts again just before executing the code at 0x00235.

Case C) behaves similarly but would write INT3 to 0x00233, then 
next step to 0x00235, etc.

So, because INT3 is a one-byte instruction, theoretically, 
debuggers could set a breakpoint in the middle of a single 
statement, like after pushing the function parameters on the 
stack, but before making the actual call.

0x00235| push x
0x00237| push y
0x00239| call writeln | <- by overwriting the call with INT3

But if all this is possible in theory, why can't you break on your 
'statement2;' ?

The answer is granularity. Line number granularity, to be more 
specific.

In order for the debugger to know where statement1; is, the 
compiler must generate debug information. Which is a map that 
maps the address of a machine instruction to a line in - oh wait 
- what actually is a line ?

Technically there are no lines in your source code either. It's 
simply a stream of characters and lines are a convention, a 
formatting hint, for your editor.
This convention differs even between different OSs.
On Posix the convention is a line feed (ASCII code 10).
On Windows it is a carriage return (ASCII code 13) followed 
immediately by a line feed.

You can see this concept when you open a text file produced in a 
Posix environment in Windows Notepad. (although, I believe, a 
recent Windows 10 Notepad understands both conventions.)

So, back to debug info.
The compiler maps the address of the instruction to a line number, 
i.e. to the text between two of those line-ending markers in your 
source code file. (It might be a little bit more complicated than 
that, with indirections and stuff, so someone more knowledgeable in 
that domain might want to weigh in.)

And since the CR/LF is the granularity of the debug info you 
can't get better resolution for your breakpoints in your debugger.

Now if you wanted to set a breakpoint at any statement in a line, 
or inside for(...;...;...) loops, etc. you would need a) a 
compiler that can produce this kind of debug info, and b) a 
debugger which can take advantage of it.
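
A hypothetical sketch in D of the difference such debug info would make. `LineEntry` and `findAddress` are made-up names for illustration, not CodeView or any real debug format; the addresses and line numbers are taken from example A) above, and column 11 is where `int y` starts in that line (counting from 0).

```d
/// One row of an imaginary debug table: machine address -> source position.
struct LineEntry { size_t address; size_t line; size_t col; }

/// Finds the entry on `line` whose column is the closest one at or before
/// `col`, i.e. the statement the cursor is in. Returns false if the line
/// has no entry at all.
bool findAddress(const(LineEntry)[] table, size_t line, size_t col,
                 out size_t address)
{
    bool found = false;
    size_t bestCol = 0;
    foreach (e; table)
    {
        if (e.line == line && e.col <= col && (!found || e.col >= bestCol))
        {
            found = true;
            bestCol = e.col;
            address = e.address;
        }
    }
    return found;
}

void main()
{
    // Line granularity only: one entry per source line.
    const lineOnly = [LineEntry(0x230, 10, 0), LineEntry(0x235, 11, 0)];

    // Hypothetical table that also records where "int y = 0;" starts.
    const withCols = [LineEntry(0x230, 10, 0), LineEntry(0x233, 10, 11),
                      LineEntry(0x235, 11, 0)];

    size_t addr;
    // Asking for line 10, column 11 (the second statement):
    assert(findAddress(lineOnly, 10, 11, addr) && addr == 0x230);
    assert(findAddress(withCols, 10, 11, addr) && addr == 0x233);
}
```

With the line-only table every column on line 10 resolves to the start of the line, which is the behaviour described in this thread. The second table only helps if both the compiler emits it and the debugger looks at the column.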

So technically you would need 2 different pieces of software to 
work together, which are probably made by completely different 
groups of people who would have to reach consensus - and that's why 
it's non-trivial.

Apr 12 2019
