digitalmars.D - Array indexing/offset inline asm ambiguity

reply "David Nadlinger" <see klickverbot.at> writes:
Hi all,

Another installment in the »What does the following function 
return« series:

---
int[2] foo() {
   int[2] regs;
   asm {
     mov regs[1], 0xdeadbeef;
   }
   return regs;
}
---

If you answered [0, 0xdeadbeef], then congratulations: You fell 
into exactly the same trap as Martin Nowak and I did in 
https://github.com/D-Programming-Language/druntime/pull/426, 
without any of the other reviewers noticing either.

The issue is that in an inline asm block, regs[1] is actually 
equivalent to 1[regs] or [regs + 1] and evaluates to an address 
one *byte* after the start of regs, not one element as it does 
everywhere else.
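
To make the difference concrete, here is a minimal sketch in plain D (no inline asm) that models a store by byte offset. The helper name `storeDword` is made up for illustration, and the sketch assumes that the asm operand `regs[n]` means "address of regs plus n bytes", as described above.

```d
// Models the addressing only; no inline asm is executed here.
// storeDword is a hypothetical helper, not part of druntime.
int[2] storeDword(size_t byteOffset, uint value)
{
    int[2] regs;                        // zero-initialized, as in foo()
    assert(byteOffset + value.sizeof <= regs.sizeof);

    auto base = cast(ubyte*) regs.ptr;  // start of regs as a byte address
    auto src  = cast(ubyte*) &value;    // the four bytes of value

    // Copy the bytes of value to base + byteOffset, which is the
    // address the asm operand regs[byteOffset] is assumed to denote.
    base[byteOffset .. byteOffset + value.sizeof] = src[0 .. value.sizeof];
    return regs;
}
```

Under those assumptions, `storeDword(1, 0xdeadbeef)` models the asm in `foo()` and, on a little-endian target such as x86, yields `[0xadbeef00, 0xde]` rather than `[0, 0xdeadbeef]`. Passing `1 * int.sizeof` (that is, 4) as the offset gives the result most readers expected, which suggests that the asm operand has to spell out the byte offset as well.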

In my opinion, this is completely counterintuitive and a source 
of bugs that could easily be prevented by the language. But 
Walter seems to think this issue not worth addressing: 
http://d.puremagic.com/issues/show_bug.cgi?id=9738

What do you think?

Thanks,
David
Mar 16 2013
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
D syntax should have D semantics, even in inline asm (where that syntax does not conflict with the asm syntax, of course).
Mar 16 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/16/2013 5:24 PM, David Nadlinger wrote:
> But Walter seems to think this issue
> not worth addressing: http://d.puremagic.com/issues/show_bug.cgi?id=9738

Not exactly. I felt: The inline assembler uses Intel syntax, and for better or worse, that's what it is. We need to either stick with it, as it is fairly well understood by asm programmers, or use D syntax. Some hybrid in between will be liked by nobody.
Mar 16 2013
next sibling parent "Chris Nicholson-Sauls" <ibisbasenji gmail.com> writes:
---Warning: hijack ahead---

How reasonable would it be to consider an HLA-like syntax, at least as a basis? ( 01283_pgfId-1001263 ) Call me crazy (everyone else does), but I prefer it over any other I've used.
Mar 16 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
It is already mixed, since regs comes from D. D symbols should behave as they do in D; asm symbols should behave as they do in asm.
Mar 16 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Symbols and syntax are very different things.
Mar 16 2013
prev sibling parent Paulo Pinto <pjmlp progtools.org> writes:
I agree with Walter here.

This might be a surprise for UNIX guys, but it is the way inline assembler works in Turbo Pascal/Delphi/C/C++ compilers on Windows, at least in the compilers from Borland and Microsoft.

--
Paulo
Mar 17 2013
prev sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
I am not using assembler, but I agree with Walter that consistency matters. However, if neither of you nor any reviewer caught that bug, then maybe a warning should be issued if your index in assembler does not correspond to the first byte of an element.
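
As a rough sketch of what such a check could look like (purely hypothetical, this is not an existing compiler diagnostic, and `startsAtElement` is a made-up name): the byte offset used in the asm operand would have to be a multiple of the element size.

```d
// Hypothetical check; not an existing DMD diagnostic.
// Returns true if byteOffset lands on the first byte of an element of type T.
bool startsAtElement(T)(size_t byteOffset)
{
    return byteOffset % T.sizeof == 0;
}
```

With this, `startsAtElement!int(1)` is false, so `regs[1]` on an `int[2]` would be flagged, while `startsAtElement!int(4)` is true. Offsets into byte arrays would never be flagged, since every byte offset starts an element there.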
Mar 17 2013