digitalmars.D - ARM bare-metal programming in D (cont) - volatile
- Mike (15/15) Oct 23 2013 Hello again,
- "Øivind" (2/17) Oct 23 2013 +1
- Walter Bright (8/19) Oct 23 2013 volatile was never a reliable method for dealing with memory mapped I/O....
- Mike (10/41) Oct 23 2013 Thanks for the answer, Walter. I think this would be acceptable
- Iain Buclaw (8/49) Oct 23 2013 Operations on volatile are *not* atomic. Nor do they establish a
- Mike (9/38) Oct 23 2013 I probably shouldn't have used the word "operations". What I
- Walter Bright (9/16) Oct 23 2013 You have to give up on volatile. Nobody agrees on what it means. What do...
- Mike (6/31) Oct 23 2013 Well, I wasn't rooting for volatile, I just wanted a way to
- eles (4/21) Oct 24 2013 Is not about "atomize me", it is about "really *read* me" or
- Walter Bright (3/19) Oct 24 2013 Like I said, nobody (on the standards committees) could agree on exactly...
- eles (20/26) Oct 24 2013 The standard committees might not agree, but there is somebody
- Walter Bright (4/26) Oct 24 2013 The trouble with that is since the standards people cannot agree on what...
- Mike (20/67) Oct 24 2013 There should be some way, in the D language, to tell the compiler
- eles (5/18) Oct 25 2013 The problem with shared alone variable is that it can be simply
- Walter Bright (13/17) Oct 25 2013 I've written device drivers and embedded systems. The quantity of code t...
- Russel Winder (42/57) Oct 28 2013 My experience, admittedly late 1970s, early 1980s then early 2000s
- Walter Bright (17/46) Oct 28 2013 I've not only built my own single board computers with video buffers, bu...
- eles (17/28) Oct 28 2013 read [address] into a register (mov)
- Walter Bright (4/13) Oct 28 2013 That overlooks what happens if another thread changes the memory in betw...
- eles (7/12) Oct 28 2013 Synchronizing the access to the resource is the job of the
- Walter Bright (2/4) Oct 28 2013 It'll be subject to review by the community.
- Russel Winder (12/29) Oct 24 2013 Also this (peek and poke) is not a viable approach if you wanted to
- eles (2/7) Oct 24 2013 I pray strongly that W&A believe the same.
- Iain Buclaw (15/45) Oct 23 2013 Are you talking dmd or in general (it's hard to tell). In gdc,
- Mike (6/69) Oct 24 2013 Well, I've done some reading about "shared" but I don't quite
- Iain Buclaw (8/76) Oct 24 2013 'shared' guarantees that all reads and writes specified in source code
- John Colvin (5/105) Oct 24 2013 Is it actually implemented as such in any D compiler? That's a
- Iain Buclaw (7/104) Oct 24 2013 If you require memory barriers to access share data, that is what
- John Colvin (6/14) Oct 24 2013 If there are no memory barriers, then there is no guarantee* of
- Iain Buclaw (5/17) Oct 24 2013 I was talking about the compiler, not CPU.
- Johannes Pfau (27/33) Oct 24 2013 Does this include writes to non-shared data? For example:
- Iain Buclaw (14/47) Oct 24 2013 Yes, reordering may occur so long as the compiler does not change
- Johannes Pfau (5/64) Oct 25 2013 Sounds good. Now this should be the standard defined behaviour for
- eles (4/18) Oct 24 2013 All that's missing is a guarantee that the reading/writing
- Iain Buclaw (5/23) Oct 24 2013 The compiler does not cache shared data (at least in GDC).
- eles (13/30) Oct 24 2013 Well, that should not be a matter of implementation, but of
- Timo Sintonen (12/43) Oct 23 2013 Yes, this is a simplest way to do it and works with gdc when
- Mike (6/53) Oct 24 2013 +1, This is what I feared. I don't think D needs a volatile
- eles (9/18) Oct 24 2013 I raised the problem here:
- Walter Bright (6/10) Oct 25 2013 Why wouldn't they work with arithmetic expressions?
- eles (13/23) Oct 26 2013 Frankly, if it is not a big deal, why don't you put those in a
- Russel Winder (28/33) Oct 27 2013 I am assuming that the C++ memory model and its definition of volatile
- Russel Winder (14/14) Oct 27 2013 The sub-text here is that, D should be one of the main languages on A
- Dmitry Olshansky (6/14) Oct 27 2013 s/Raspberry Pi/ARM boards/
- Russel Winder (34/39) Oct 27 2013 I am assuming that the C++ memory model and its definition of volatile
- Iain Buclaw (15/50) Oct 24 2013 To elaborate (now I am a little more awake :) - 'volatile' on the type
- Arjan (4/19) Oct 23 2013 This article might also give some insight in the problems with
- Mike (12/35) Oct 23 2013 Arjan, Thanks for the information. I basically agree with
- Daniel Murphy (21/27) Oct 24 2013 There are a few options:
- Iain Buclaw (17/41) Oct 24 2013 In gdc:
- Timo Sintonen (16/35) Oct 24 2013 I have not (yet) had any problems when writing io registers but
- Daniel Murphy (9/40) Oct 25 2013 Volatile blocks are already in the language, but they suck. You don't w...
- Timo Sintonen (19/52) Oct 25 2013 It seems that it is two different things here. As far as I
- Johannes Pfau (6/65) Oct 25 2013 What's wrong with the solution Iain mentioned, i.e the way shared
- Timo Sintonen (13/42) Oct 25 2013 There is nothing wrong if it works.
- Johannes Pfau (6/28) Oct 26 2013 Yes, this was news to me as well.
- Iain Buclaw (9/24) Oct 26 2013 Was added about 3 years ago...
- Timo Sintonen (8/39) Oct 26 2013 Seems to work. I can make every member as shared or the whole
- Russel Winder (25/28) Oct 27 2013 Not a good style of argument, since the way of the Commodore 64 might be
- Walter Bright (8/23) Oct 27 2013 Bitfield code generation for C compilers has generally been rather crapp...
- Russel Winder (43/50) Oct 28 2013 Endianism and packing have always been the bête noire of bitfields due t...
- Walter Bright (16/50) Oct 28 2013 Generally the shifting is unnecessary, but the compiler doesn't know tha...
- Russel Winder (31/34) Oct 27 2013 Not a good style of argument, since the way of the Commodore 64 might be
- David Nadlinger (8/13) Oct 27 2013 I agree, and thus I think it's dangerous at best and harmful at
Hello again, I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but no one seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?
Oct 23 2013
On Thursday, 24 October 2013 at 00:43:11 UTC, Mike wrote:Hello again, I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?
+1
Oct 23 2013
On 10/23/2013 5:43 PM, Mike wrote:I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.
Oct 23 2013
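For concreteness, here is a minimal sketch of the peek/poke idea above, written in C (the thread's usual point of comparison). The `noinline` attribute (GCC/Clang-specific) stands in for "compile them separately"; the `volatile` qualifier on the pointer parameter is an added belt-and-braces measure, not part of Walter's original signatures.

```c
#include <stdint.h>

/* Sketch of the proposed peek/poke helpers. noinline (GCC/Clang)
   stands in for "compile them separately so the optimizer will not
   try to inline/optimize them". */
__attribute__((noinline))
int peek(volatile int *p)
{
    return *p;          /* a real load happens here */
}

__attribute__((noinline))
void poke(volatile int *p, int value)
{
    *p = value;         /* a real store happens here */
}
```

In a driver, the pointer argument would be a cast of the register address, e.g. `peek((volatile int *)0x40021000)` -- that address is invented here for illustration.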
On Thursday, 24 October 2013 at 05:37:49 UTC, Walter Bright wrote:On 10/23/2013 5:43 PM, Mike wrote:Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed. I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). After all, a read/write to a volatile address is a single atomic instruction, if done properly. Is there a way to tell D to remove the function overhead, for example, like a "naked" attribute, yet still retain the "volatile" behavior?I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.
Oct 23 2013
On 24 October 2013 07:19, Mike <none none.com> wrote:On Thursday, 24 October 2013 at 05:37:49 UTC, Walter Bright wrote:Operations on volatile are *not* atomic. Nor do they establish a proper happens-before relationship for threading. This is why we have core.atomic as a portable synchronisation mechanism in D. Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';On 10/23/2013 5:43 PM, Mike wrote:Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). Afterall, a read/write to a volatile address is a single atomic instruction, if done properly.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.
Oct 23 2013
On Thursday, 24 October 2013 at 06:40:14 UTC, Iain Buclaw wrote:I probably shouldn't have used the word "operations". What I meant is reading/writing to a volatile, aligned word in memory is an atomic operation. At least on my target platform it is. That may not be a correct generalization, however. The point I'm trying to make is the Peek/Poke function proposal adds function overhead compared to the "volatile" method in C, and I just want to know if there's a way to eliminate/reduce it.Operations on volatile are *not* atomic. Nor do they establish a proper happens-before relationship for threading. This is why we have core.atomic as a portable synchronisation mechanism in D. Regardsvolatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). Afterall, a read/write to a volatile address is a single atomic instruction, if done properly.
Oct 23 2013
On 10/23/2013 11:19 PM, Mike wrote:Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). Afterall, a read/write to a volatile address is a single atomic instruction, if done properly. Is there a way to tell D to remove the function overhead, for example, like a "naked" attribute, yet still retain the "volatile" behavior?You have to give up on volatile. Nobody agrees on what it means. What does "don't optimize" mean? And that's not at all the same thing as "atomic". I wouldn't worry about peek/poke being too slow unless you actually benchmark it and prove it is. Then, your alternatives are: 1. Write it in ordinary D, compile it, check the code generated, and if it is what you want, you're golden (at least for that compiler & switches). 2. Write it in inline asm. That's what it's for. 3. Write it in an external C function and link it in.
Oct 23 2013
On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:On 10/23/2013 11:19 PM, Mike wrote:Well, I wasn't rooting for volatile, I just wanted a way to read/write my IO registers as fast as possible with D. I think the last two methods you've given confirm my suspicions and will work. But... I had my heart set on doing it all in D :-( Thanks for the answers.Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). Afterall, a read/write to a volatile address is a single atomic instruction, if done properly. Is there a way to tell D to remove the function overhead, for example, like a "naked" attribute, yet still retain the "volatile" behavior?You have to give up on volatile. Nobody agrees on what it means. What does "don't optimize" mean? And that's not at all the same thing as "atomic". I wouldn't worry about peek/poke being too slow unless you actually benchmark it and prove it is. Then, your alternatives are: 1. Write it in ordinary D, compile it, check the code generated, and if it is what you want, you're golden (at least for that compiler & switches). 2. Write it in inline asm. That's what it's for. 3. Write it in an external C function and link it in.
Oct 23 2013
On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:On 10/23/2013 11:19 PM, Mike wrote:It's not about "atomize me", it is about "really *read* me" or "really *write* me" at that memory location, don't fake it, don't cache me. And do it now, not 10 seconds later.Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). Afterall, a read/write to a volatile address is a single atomic instruction, if done properly. Is there a way to tell D to remove the function overhead, for example, like a "naked" attribute, yet still retain the "volatile" behavior?You have to give up on volatile. Nobody agrees on what it means. What does "don't optimize" mean? And that's not at all the same thing as "atomic".
Oct 24 2013
On 10/24/2013 4:18 AM, eles wrote:On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:Like I said, nobody (on the standards committees) could agree on exactly what that meant.On 10/23/2013 11:19 PM, Mike wrote:Is not about "atomize me", it is about "really *read* me" or "really *write* me" at that memory location, don't fake it, don't cache me. And do it now, not 10 seconds later.Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). Afterall, a read/write to a volatile address is a single atomic instruction, if done properly. Is there a way to tell D to remove the function overhead, for example, like a "naked" attribute, yet still retain the "volatile" behavior?You have to give up on volatile. Nobody agrees on what it means. What does "don't optimize" mean? And that's not at all the same thing as "atomic".
Oct 24 2013
On Thursday, 24 October 2013 at 17:02:51 UTC, Walter Bright wrote:On 10/24/2013 4:18 AM, eles wrote:The standard committees might not agree, but there is somebody out there that really knows very accurately what that should mean: that somebody is the hardware itself. Just imagine the best hardware example that you have at hand: the microprocessor that you are programming for. It writes on the bus; there is a short delay before the signals are guaranteed to reach the correct levels; then it reads the memory data, and so on. You cannot read the data before the delay passes. You cannot say "well, I could postpone the writing on the address on the bus, let's read the memory location first" -- or you would read garbage. Nor can you say: well, first I will execute the program without a processor, then, when the user is already pissed off, I will finally execute all those instructions at once. Too bad that the computer is already flying through the window at that time. You command that processor from the compiler. Now, what's needed is a way to do the same (i.e. command hardware) from the program compiled by the compiler.On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:Like I said, nobody (on the standards committees) could agree on exactly what that meant.On 10/23/2013 11:19 PM, Mike wrote:
Oct 24 2013
On 10/24/2013 11:33 AM, eles wrote:On Thursday, 24 October 2013 at 17:02:51 UTC, Walter Bright wrote:The trouble with that is since the standards people cannot agree on what volatile means, you're working with a compiler that has non-standard behavior. This is not portable and not reliable.On 10/24/2013 4:18 AM, eles wrote:The standard committees might not agree, but there is somebody out there that really knows very accurately what that should mean: that somebody is the hardware itself. Just imagine the best hardware example that you have at hand: the microprocessor that you are programming for. It writes on the bus, there is a short delay before the signals are guaranteed to reach the correct levels, then reads the memory data and so on. You cannot read the data before the delay passes. You cannot say "well, I could postpone the writing on the address on the bus, let's read the memory location first" -- or you would read garbage. Or you cannot say: well, first I will execute the program without a processor then, when the user is already pissed off, I would finally execute all those instructions at once. Too bad that the computer is already flying through the window at that time. You command that processor from the compiler. Now, the thing that's needed is to give a way to do the same (ie commanding a hardware) from the program compiled by the compiler.On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:Like I said, nobody (on the standards committees) could agree on exactly what that meant.On 10/23/2013 11:19 PM, Mike wrote:
Oct 24 2013
On Thursday, 24 October 2013 at 19:11:03 UTC, Walter Bright wrote:On 10/24/2013 11:33 AM, eles wrote:There should be some way, in the D language, to tell the compiler "Do exactly what I say here, and don't try to be clever about it" without introducing unnecessary (and unfortunate) overhead. It doesn't have to be /volatile/. /shared/ may be the solution here, but based on a comment by Iain Buclaw (http://forum.dlang.org/post/mailman.2454.1382619958.1719.digitalmars-d@puremagic.com) it seems there could be some disagreement on what this means to compiler implementers. I don't see why "shared" could not only mean "shared by more than one thread/cpu", but also "shared by external hardware peripherals". Maybe /shared/'s definition needs to be tightened to ensure all compilers implement it the same way, and be unambiguous enough to provide a solution to this /volatile/ debate. Using peek and poke functions is, well, nah... Better methods exist. Using inline assembly is a reasonable alternative, as is linking to an external C library, but why use D then? Is low-level/embedded software development a design goal of the D language?On Thursday, 24 October 2013 at 17:02:51 UTC, Walter Bright wrote:The trouble with that is since the standards people cannot agree on what volatile means, you're working with a compiler that has non-standard behavior. This is not portable and not reliable.On 10/24/2013 11:33 AM, eles wrote:The standard committees might not agree, but there is somebody out there that really knows very accurately what that should mean: that somebody is the hardware itself. Just imagine the best hardware example that you have at hand: the microprocessor that you are programming for. It writes on the bus, there is a short delay before the signals are guaranteed to reach the correct levels, then reads the memory data and so on. You cannot read the data before the delay passes. 
You cannot say "well, I could postpone the writing on the address on the bus, let's read the memory location first" -- or you would read garbage. Or you cannot say: well, first I will execute the program without a processor then, when the user is already pissed off, I would finally execute all those instructions at once. Too bad that the computer is already flying through the window at that time. You command that processor from the compiler. Now, the thing that's needed is to give a way to do the same (ie commanding a hardware) from the program compiled by the compiler.On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:Like I said, nobody (on the standards committees) could agree on exactly what that meant.On 10/23/2013 11:19 PM, Mike wrote:
Oct 24 2013
On Friday, 25 October 2013 at 04:30:37 UTC, Mike wrote:On Thursday, 24 October 2013 at 19:11:03 UTC, Walter Bright wrote:The problem with a shared variable alone is that it can simply be placed by the optimizer at a memory location other than the intended one, even if all threads see it "as if" at the intended location.On 10/24/2013 11:33 AM, eles wrote:Maybe /shared/'s definition needs to be further defined to ensure all compilers implement it the same way, and be unambiguous enough to provide a solution to this /volatile/ debate.On Thursday, 24 October 2013 at 17:02:51 UTC, Walter Bright wrote:On 10/24/2013 4:18 AM, eles wrote:On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:On 10/23/2013 11:19 PM, Mike wrote:
Oct 25 2013
On 10/24/2013 9:30 PM, Mike wrote:Using peek and poke functions is, well, nah... Better methods exist. Using inline assembly is a reasonable alternative, as is linking to an external C library, but why use D then? Is low-level/embedded software development a design goal of the D language?I've written device drivers and embedded systems. The quantity of code that deals with memory-mapped I/O is a very, very small part of those programs. The subset of that code that needs to exactly control the read and write cycles is tinier still. (For example, when writing to a memory-mapped video buffer, such control is quite unnecessary.) Any of the methods I presented are not a significant burden. Adding two lines of inline assembler to get exactly what you want isn't hard, and you can hide it behind a mixin if you like. And, of course, you'll still need inline assembler to deal with the other system-type operations needed for embedded systems work. For example, setting up the program stack, setting the segment registers, etc. No language provides support for them outside of inline assembler or assembler intrinsics.
Oct 25 2013
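A hedged sketch of what "two lines of inline assembler behind an abstraction" can look like, using GCC extended asm in C. On a real ARM target you might put the actual ldr/str instruction in the asm body; here an empty asm with a "memory" clobber merely acts as a compiler barrier, so the access between the barriers cannot be cached, merged, or reordered away. The helper names are invented.

```c
#include <stdint.h>

/* Force a single, ordered device-register load/store. The empty asm
   statements with a "memory" clobber are compiler barriers: the
   compiler may not cache or move memory accesses across them. */
static inline uint32_t reg_read(volatile uint32_t *addr)
{
    uint32_t v;
    __asm__ __volatile__ ("" ::: "memory");
    v = *addr;                              /* the one real load */
    __asm__ __volatile__ ("" ::: "memory");
    return v;
}

static inline void reg_write(volatile uint32_t *addr, uint32_t v)
{
    __asm__ __volatile__ ("" ::: "memory");
    *addr = v;                              /* the one real store */
    __asm__ __volatile__ ("" ::: "memory");
}
```

Note these are compiler barriers only; on architectures with weakly ordered buses a hardware barrier instruction (e.g. ARM's dmb) may still be needed, which is exactly the per-target variation Walter points at below.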
On Fri, 2013-10-25 at 13:04 -0700, Walter Bright wrote: […]I've written device drivers and embedded systems. The quantity of code that deals with memory-mapped I/O is a very, very small part of those programs. The subset of that code that needs to exactly control the read and write cycles is tinier still. (For example, when writing to a memory-mapped video buffer, such control is quite unnecessary.) Any of the methods I presented are not a significant burden. Adding two lines of inline assembler to get exactly what you want isn't hard, and you can hide it behind a mixin if you like. And, of course, you'll still need inline assembler to deal with the other system-type operations needed for embedded systems work. For example, setting up the program stack, setting the segment registers, etc. No language provides support for them outside of inline assembler or assembler intrinsics.My experience, admittedly late 1970s, early 1980s then early 2000s, concurs with yours that only a small amount of code requires this read and write behaviour, but where it is needed it is crucial and in areas where every picosecond matters (*). I disagree with your point about memory video buffers as a general statement; it depends on the buffering and refresh strategy of the buffer. Some frame buffers are very picky and so exact read and write behaviour of the code is needed. Less so now though, fortunately. Using functions is a burden here if it involves a function call; only macros are feasible as units of abstraction. Moreover, this is the classic approach to inline assembler: some form of macro so as to create a comprehensible abstraction. The problem with inline assembler is that you need versions for every target architecture, making it a source code and build nightmare. 
OK, there are directory hierarchy idioms and build idioms that make it easier (**), but inline assembler should only really be an answer in cases where there are hardware instructions on a given target that the compiler cannot reasonably be expected to generate from the source code. Classics here are the elliptic function libraries, and the context switch operations. So the issue is not the approach per se but how that is encoded in the source code to make it readable and comprehensible AND performant. Volatile as a variable modifier always worked for me in the past, but it got bad press and all compiler writers ignored it as a feature till it became useless. Perhaps it is time to reclaim volatile for D: give it a memory barrier semantic so that there can be no instruction reordering around the read and write operations, and make it a tool for those who need it. After all, no-one is actually using it for anything just now, are they? (*) OK, a small exaggeration in the late 1970s where the time scale was 18ms, but you get my point. (**) Actually it is much easier to do with build tools such as SCons and Waf than it ever was with Make, and the GNU "Auto" tools (especially on Windows), and even CMake. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Oct 28 2013
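The macro-as-unit-of-abstraction approach Russel describes is a classic C idiom for register access. A sketch -- the peripheral names and addresses below are invented for illustration:

```c
#include <stdint.h>

/* Zero-overhead unit of abstraction: a macro, not a function call.
   The volatile-qualified lvalue makes every use a real bus access. */
#define MMIO32(addr)  (*(volatile uint32_t *)(uintptr_t)(addr))

/* Hypothetical register map -- addresses made up for illustration. */
#define GPIO_BASE  0x40020000u
#define GPIO_ODR   MMIO32(GPIO_BASE + 0x14u)   /* output data register */

/* usage on real hardware: GPIO_ODR |= (1u << 5);   set pin 5 */
```

Because `MMIO32(...)` expands in place, there is no call overhead at all, which is the property Russel is after; the cost is that the macro must be re-targeted per board, which is his portability complaint.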
On 10/28/2013 1:13 AM, Russel Winder wrote:My experience, admittedly late 1970s, early 1980s then early 2000s concurs with yours that only a small amount of code requires this read and write behaviour, but where it is needed it is crucial and in areas where every picosecond matters (*). I disagree with your point about memory video buffers as a general statement, it depends on the buffering and refresh strategy of the buffer. Some frame buffers are very picky and so exact read and write behaviour of the code is needed. Less so now though fortunately.I've not only built my own single board computers with video buffers, but I've written code for several graphics boards back in the 80's. None needed exact read/write behavior.Using functions is a burden here if it involves a function call, only macros are feasible as units of abstraction. Moreover this is the classic approach to inline assembler some form of macro so as to create a comprehensible abstraction.If you want every picosecond, you're really best off writing a few lines of inline asm. Then you can craft exactly what you need.The problem with inline assembler is that you need versions for every target architecture making it a source code and build nightmare.When you're writing code for memory-mapped I/O, it is NOT going to be portable, pretty much by definition! (Are there any two different target architectures with exactly the same memory-mapped I/O stuff?)OK there are directory hierarchy idioms and build idioms that make it easier (**), but inline assembler should only really be an answer in cases where there are hardware instructions on a given target that it cannot reasonable be expected that the compiler can generate from the source code. Classics here are the elliptic function libraries, and the context switch operations. So the issue is not the approach per se but how that is encoded in the source code to make it readable and comprehensible AND performant. 
Volatile as a variable modifier always worked for me in the past but it got bad press and all compiler writers ignored it as a feature till it became useless. Perhaps it is time to reclaim volatile for D give it a memory barrier semantic so that there can be no instruction reordering around the read and write operations, and make it a tool for those who need it. After all no-one is actually using for anything just now are they?Ask any two people, even ones in this thread, what "volatile" means, and you'll get two different answers. Note that the issues of reordering, caching, cycles, and memory barriers are separate and distinct issues. Those issues also vary dramatically from one architecture to the next. (For example, what really happens with a+=1 ? Should it generate an INC, or an ADD, or a MOV/ADD/MOV triple for MMIO? Where do the barriers go? Do you even need barriers? Should a LOCK prefix be emitted? How is the compiler supposed to know just how the MMIO works on some particular computer board?)
Oct 28 2013
On Monday, 28 October 2013 at 08:42:12 UTC, Walter Bright wrote:On 10/28/2013 1:13 AM, Russel Winder wrote: Ask any two people, even ones in this thread, what "volatile" means, and you'll get two different answers. Note that the issues of reordering, caching, cycles, and memory barriers are separate and distinct issues. Those issues also vary dramatically from one architecture to the next."volatile" => "fickle"(For example, what really happens with a+=1 ? Should it generate an INC, or an ADD, or a MOV/ADD/MOV triple for MMIO? Where do the barriers go? Do you even need barriers? Should a LOCK prefix be emitted? How is the compiler supposed to know just how the MMIO works on some particular computer board?)
read [address] into register (mov)
register++ (add)
write register to [address] (mov)
You cannot do it otherwise (that is, with a shortcut operator). "Shortcut" operators on fickle memory locations should simply be forbidden; the compiler is able to complain about that. Only explicit reads and writes should be possible. OK, go with peek() and poke() if you feel it's better and easier (this avoids the a+=1 problem). At least as a first step. But put those into the compiler/Phobos; don't force somebody to write ASM or C for that. If D sends people back to a C compiler, it will never displace C. Templated peek() and poke() are 5 LOCs. Put those in a std.hardware module and, if you prefer, leave it undocumented. In the time we have spent discussing this matter, it could have been solved 10 times.
Oct 28 2013
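For reference, the templated peek()/poke() being proposed really are only a few lines of D. A sketch of such a hypothetical std.hardware module (the module name is the proposal's, not an existing one; note that, as Walter says elsewhere in the thread, these must be compiled separately without inlining or the optimizer may still cache the access — D gives no language-level guarantee here):

```d
module std.hardware; // hypothetical name, as proposed above

// Read a memory-mapped register exactly once. The access survives
// only if this function is compiled separately, without inlining
// or optimization.
T peek(T)(T* addr)
{
    return *addr;
}

// Write a memory-mapped register exactly once; same caveat as peek().
void poke(T)(T* addr, T value)
{
    *addr = value;
}
```

Usage still reads naturally in arithmetic expressions, e.g. `poke(reg, peek(reg) | 0x0001)` for some `uint* reg` pointing at a device register.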
On 10/28/2013 2:33 AM, eles wrote:That overlooks what happens if another thread changes the memory in between the read and the write. Hence the issues of memory barriers, lock prefixes, etc.(For example, what really happens with a+=1 ? Should it generate an INC, or an ADD, or a MOV/ADD/MOV triple for MMIO? Where do the barriers go? Do you even need barriers? Should a LOCK prefix be emitted? How is the compiler supposed to know just how the MMIO works on some particular computer board?)read [address] into registry (mov) registry++ (add) write registry to [address] (mov) You cannot do it otherwise (that is, a shortcut operator).Since we discuss this matter, it could have been solved 10 times.Pull requests are welcome!
Oct 28 2013
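For ordinary shared RAM (as opposed to device registers, where eles' point about a kernel-side mutex applies), D can already express the read-modify-write as one indivisible operation through core.atomic, which closes exactly the window Walter describes. A minimal sketch:

```d
import core.atomic;

shared int counter;

void bump()
{
    // One atomic read-modify-write: no other thread's store can land
    // between the load and the store (a LOCK-prefixed ADD on x86).
    atomicOp!"+="(counter, 1);
}
```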
On Monday, 28 October 2013 at 16:06:48 UTC, Walter Bright wrote:On 10/28/2013 2:33 AM, eles wrote: That overlooks what happens if another thread changes the memory in between the read and the write. Hence the issues of memory barriers, lock prefixes, etc.Synchronizing the access to the resource is the job of the programmer. He will take a mutex for it. You do that inside the kernel space, not in the user space. There is just one kernel, and it is able to synchronize with itself. Put this into perspective.Pull requests are welcome!You pre-approve?
Oct 28 2013
On 10/28/2013 11:50 AM, eles wrote:It'll be subject to review by the community.Pull requests are welcome!You pre-approve?
Oct 28 2013
On Thu, 2013-10-24 at 08:19 +0200, Mike wrote: […]Also this (peek and poke) is not a viable approach if you wanted to write an operating system in D. I think it should be an aim to have the replacement for Windows, OS X, Linux, etc. written in D instead of C/C++. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.Thanks for the answer, Walter. I think this would be acceptable in many (most?) cases, but not where high performance is needed. I think these functions add too much overhead if they are not inlined and in a critical path (bit-banging IO, for example). After all, a read/write to a volatile address is a single atomic instruction, if done properly. Is there a way to tell D to remove the function overhead, for example, like a "naked" attribute, yet still retain the "volatile" behavior?
Oct 24 2013
On Thursday, 24 October 2013 at 14:53:18 UTC, Russel Winder wrote:On Thu, 2013-10-24 at 08:19 +0200, Mike wrote: […] I think it should be an aim to have the replacement for Windows, OS X, Linux, etc. written in D instead of C/C++.I pray strongly that W&A believe the same.
Oct 24 2013
On 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:On 10/23/2013 5:43 PM, Mike wrote:Are you talking dmd or in general (it's hard to tell). In gdc, volatile is the same as in gcc/g++ in behaviour. Although in one aspect, when the default storage model was switched to thread-local, that made volatile on it's own pointless. As a side note, 'shared' is considered a volatile type in gdc, which differs from the deprecated keyword which set volatile at a decl/expression level. There is a difference in semantics, but it escapes this author at 6.30am in the morning. :o) In any case, using shared would be my recommended route for you to go down.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O.The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.+1. Using an optimiser along with code that talks to hardware can result in bizarre behaviour. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Oct 23 2013
On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:On 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:Well, I've done some reading about "shared" but I don't quite grasp it yet. I still have some learning to do. That's my problem, but if you feel like explaining how it can be used in place of volatile for hardware register access, that would be awfully nice.On 10/23/2013 5:43 PM, Mike wrote:Are you talking dmd or in general (it's hard to tell). In gdc, volatile is the same as in gcc/g++ in behaviour. Although in one aspect, when the default storage model was switched to thread-local, that made volatile on it's own pointless. As a side note, 'shared' is considered a volatile type in gdc, which differs from the deprecated keyword which set volatile at a decl/expression level. There is a difference in semantics, but it escapes this author at 6.30am in the morning. :o) In any case, using shared would be my recommended route for you to go down.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? 
If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O.The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.+1. Using an optimiser along with code that talks to hardware can result in bizarre behaviour.
Oct 24 2013
On 24 October 2013 08:18, Mike <none none.com> wrote:On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions, as there may be other threads reading/writing to the variable at the same time. Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';On 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:Well, I've done some reading about "shared" but I don't quite grasp it yet. I still have some learning to do. That's my problem, but if you feel like explaining how it can be used in place of volatile for hardware register access, that would be awfully nice.On 10/23/2013 5:43 PM, Mike wrote:Are you talking dmd or in general (it's hard to tell). In gdc, volatile is the same as in gcc/g++ in behaviour. Although in one aspect, when the default storage model was switched to thread-local, that made volatile on it's own pointless. As a side note, 'shared' is considered a volatile type in gdc, which differs from the deprecated keyword which set volatile at a decl/expression level. There is a difference in semantics, but it escapes this author at 6.30am in the morning. :o) In any case, using shared would be my recommended route for you to go down.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? 
If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O.The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.+1. Using an optimiser along with code that talks to hardware can result in bizarre behaviour.
Oct 24 2013
On Thursday, 24 October 2013 at 08:20:43 UTC, Iain Buclaw wrote:On 24 October 2013 08:18, Mike <none none.com> wrote:Is it actually implemented as such in any D compiler? That's a lot of memory barriers, shared would have to come with a massive SLOW! notice on it. Not saying that's a bad choice necessarily, but I was pretty sure this had never been implemented.On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions, as there may be other threads reading/writing to the variable at the same time. RegardsOn 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:Well, I've done some reading about "shared" but I don't quite grasp it yet. I still have some learning to do. That's my problem, but if you feel like explaining how it can be used in place of volatile for hardware register access, that would be awfully nice.On 10/23/2013 5:43 PM, Mike wrote:Are you talking dmd or in general (it's hard to tell). In gdc, volatile is the same as in gcc/g++ in behaviour. Although in one aspect, when the default storage model was switched to thread-local, that made volatile on it's own pointless. As a side note, 'shared' is considered a volatile type in gdc, which differs from the deprecated keyword which set volatile at a decl/expression level. There is a difference in semantics, but it escapes this author at 6.30am in the morning. :o) In any case, using shared would be my recommended route for you to go down.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. 
In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O.The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.+1. Using an optimiser along with code that talks to hardware can result in bizarre behaviour.
Oct 24 2013
On 24 October 2013 10:27, John Colvin <john.loughran.colvin gmail.com> wrote:On Thursday, 24 October 2013 at 08:20:43 UTC, Iain Buclaw wrote:If you require memory barriers to access share data, that is what 'synchronized' and core.atomic is for. There is *no* implicit locks occurring when accessing the data. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';On 24 October 2013 08:18, Mike <none none.com> wrote:Is it actually implemented as such in any D compiler? That's a lot of memory barriers, shared would have to come with a massive SLOW! notice on it. Not saying that's a bad choice necessarily, but I was pretty sure this had never been implemented.On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions, as there may be other threads reading/writing to the variable at the same time. RegardsOn 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:Well, I've done some reading about "shared" but I don't quite grasp it yet. I still have some learning to do. That's my problem, but if you feel like explaining how it can be used in place of volatile for hardware register access, that would be awfully nice.On 10/23/2013 5:43 PM, Mike wrote:Are you talking dmd or in general (it's hard to tell). In gdc, volatile is the same as in gcc/g++ in behaviour. Although in one aspect, when the default storage model was switched to thread-local, that made volatile on it's own pointless. As a side note, 'shared' is considered a volatile type in gdc, which differs from the deprecated keyword which set volatile at a decl/expression level. There is a difference in semantics, but it escapes this author at 6.30am in the morning. :o) In any case, using shared would be my recommended route for you to go down.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. 
I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O.The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.+1. Using an optimiser along with code that talks to hardware can result in bizarre behaviour.
Oct 24 2013
On Thursday, 24 October 2013 at 09:43:51 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissionsIf you require memory barriers to access shared data, that is what 'synchronized' and core.atomic are for. There are *no* implicit locks occurring when accessing the data.If there are no memory barriers, then there is no guarantee* of ordering of reads or writes. Sure, the compiler can promise not to rearrange them, but the CPU is a different matter. *dependent on CPU architecture of course. e.g. IIRC the Intel Atom never reorders anything.
Oct 24 2013
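On weakly ordered CPUs such as ARM, the hardware ordering John mentions has to be requested explicitly; in D that is core.atomic's job rather than shared's. A sketch of the usual acquire/release pairing (function and enum names as in druntime's core.atomic):

```d
import core.atomic;

shared bool ready;

void publish()
{
    // Release store: earlier writes may not be reordered below it,
    // by either the compiler or the CPU.
    atomicStore!(MemoryOrder.rel)(ready, true);
}

void consume()
{
    // Acquire load: later reads may not be reordered above it.
    while (!atomicLoad!(MemoryOrder.acq)(ready)) {}
}
```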
On 24 October 2013 12:10, John Colvin <john.loughran.colvin gmail.com> wrote:On Thursday, 24 October 2013 at 09:43:51 UTC, Iain Buclaw wrote:I was talking about the compiler, not CPU. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissionsIf you require memory barriers to access share data, that is what 'synchronized' and core.atomic is for. There is *no* implicit locks occurring when accessing the data.If there are no memory barriers, then there is no guarantee* of ordering of reads or writes. Sure, the compiler can promise not to rearrange them, but the CPU is a different matter. *dependant on CPU architecture of course. e.g. IIRC the intel atom never reorders anything.
Oct 24 2013
On Thu, 24 Oct 2013 14:04:44 +0100, Iain Buclaw <ibuclaw ubuntu.com> wrote:On 24 October 2013 12:10, John Colvin <john.loughran.colvin gmail.com> wrote:Does this include writes to non-shared data? For example:
------------------------------------
shared int x;
int y;

void func()
{
    x = 0;
    y = 3; // Can the compiler move this assignment?
    x = 1;
}
------------------------------------
So there's no additional overhead (in code / instructions emitted) when using shared instead of volatile in code like this? And this is valid code with shared (assuming reading/assigning to x is atomic)?
------------------------------------
volatile bool x = false;

void waitForX()
{
    while (!x) {}
}

__interrupt(X) void x_interrupt()
{
    x = true;
}
------------------------------------
On Thursday, 24 October 2013 at 09:43:51 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions
Oct 24 2013
On 24 October 2013 18:49, Johannes Pfau <nospam example.com> wrote:On Thu, 24 Oct 2013 14:04:44 +0100, Iain Buclaw <ibuclaw ubuntu.com> wrote:Yes, reordering may occur so long as the compiler does not change behaviour with respect to the program's sequence points in the application. (Your example, for instance, cannot possibly be re-ordered). It is also worth noting that while you may have a guarantee of this, it does not mean that you can go using __thread data without memory barriers. (For instance, if you have an asynchronous signal handler, it may alter the __thread'ed data at any point in the sequential program).On 24 October 2013 12:10, John Colvin <john.loughran.colvin gmail.com> wrote:Does this include writes to non-shared data? For example: ------------------------------------ shared int x; int y; void func() { x = 0; y = 3; //Can the compiler move this assignment? x = 1; } ------------------------------------On Thursday, 24 October 2013 at 09:43:51 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissionsSo there's no additional overhead (in code / instructions emitted) when using shared instead of volatile in code like this? And this is valid code with shared (assuming reading/assigning to x is atomic)? ------------------------------------ volatile bool x = false; void waitForX() { while(!x){} } __interrupt(X) void x_interrupt() { x = true; } ------------------------------------That is correct. :o) Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Oct 24 2013
On Thu, 24 Oct 2013 21:28:45 +0100, Iain Buclaw <ibuclaw ubuntu.com> wrote:On 24 October 2013 18:49, Johannes Pfau <nospam example.com> wrote:Sounds good. Now this should be the standard-defined behaviour for all compilers. But I guess it'll take some more time until the shared design is really finalized.On Thu, 24 Oct 2013 14:04:44 +0100, Iain Buclaw <ibuclaw ubuntu.com> wrote:Yes, reordering may occur so long as the compiler does not change behaviour in respect to the programs sequential points in the application. (Your example, for instance, can not possibly be re-ordered). It is also worth noting while you may have guarantee of this, it does not mean that you can go using __thread data without memory barriers. (For instance, if you have an asynchronous signal handler, it may alter the __thread'ed data at any point in the sequential program).On 24 October 2013 12:10, John Colvin <john.loughran.colvin gmail.com> wrote:Does this include writes to non-shared data? For example: ------------------------------------ shared int x; int y; void func() { x = 0; y = 3; //Can the compiler move this assignment? x = 1; } ------------------------------------On Thursday, 24 October 2013 at 09:43:51 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissionsSo there's no additional overhead (in code / instructions emitted) when using shared instead of volatile in code like this? And this is valid code with shared (assuming reading/assigning to x is atomic)? ------------------------------------ volatile bool x = false; void waitForX() { while(!x){} } __interrupt(X) void x_interrupt() { x = true; } ------------------------------------That is correct. :o) Regards
Oct 25 2013
On Thursday, 24 October 2013 at 08:20:43 UTC, Iain Buclaw wrote:On 24 October 2013 08:18, Mike <none none.com> wrote:All that's missing is a guarantee that the reading/writing actually occur at the intended address and not in some compiler cache.On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions, as there may be other threads reading/writing to the variable at the same time.On 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:On 10/23/2013 5:43 PM, Mike wrote:
Oct 24 2013
On 24 October 2013 12:22, eles <eles eles.com> wrote:On Thursday, 24 October 2013 at 08:20:43 UTC, Iain Buclaw wrote:The compiler does not cache shared data (at least in GDC). -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';On 24 October 2013 08:18, Mike <none none.com> wrote:All that's missing is a guarantee that the reading/writing actually occur at the intended address and not in some compiler cache.On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions, as there may be other threads reading/writing to the variable at the same time.On 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:On 10/23/2013 5:43 PM, Mike wrote:
Oct 24 2013
On Thursday, 24 October 2013 at 13:05:58 UTC, Iain Buclaw wrote:On 24 October 2013 12:22, eles <eles eles.com> wrote:On Thursday, 24 October 2013 at 08:20:43 UTC, Iain Buclaw wrote:On 24 October 2013 08:18, Mike <none none.com> wrote:On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:On 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:On 10/23/2013 5:43 PM, Mike wrote:The compiler does not cache shared data (at least in GDC).Well, that should not be a matter of implementation, but of language standard. Besides not caching, still MIA is the guarantee that these read/write operations occur when asked, not later (orderly execution means almost nothing if all those operations are executed by the compiler at some later time, possibly without taking into account sleep()s between operations - sometimes the hardware needs, let's say, 500 ms to guarantee a register is filled with a meaningful value - and so on). So it is about the correct memory location, about the immediacy of those operations (which will also ensure orderly execution), and about the non-caching.
Oct 24 2013
On Thursday, 24 October 2013 at 05:37:49 UTC, Walter Bright wrote:On 10/23/2013 5:43 PM, Mike wrote:Yes, this is the simplest way to do it, and it works with gdc when compiled in a separate file with no optimizations and inlining. But today's peripherals may have tens of registers, and they are usually represented as a struct. Using a peripheral often requires several register accesses. Doing it this way will not make the code very readable. As a workaround I have all register access functions in a separate file and compile those files in a separate directory with no optimizations. The amount of code generated is 3-4 times more, and this is a problem because in controllers memory and speed are always in short supply.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but no one seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.
Oct 23 2013
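Timo's struct-per-peripheral layout can still be routed through the separately compiled peek/poke discussed above. In this sketch everything — the UART register layout, base address, and status bit — is a hypothetical stand-in for a real data sheet:

```d
// Field order and widths must match the data sheet exactly;
// consecutive uints are laid out contiguously in a D struct.
struct UartRegs
{
    uint DR; // data register
    uint SR; // status register
    uint CR; // control register
}

// hypothetical base address of the peripheral block
@property UartRegs* uart() { return cast(UartRegs*) 0x4000_0000; }

// Every access goes through peek/poke so reads/writes are not elided:
//   while ((peek(&uart.SR) & 0x80) == 0) {}  // poll an assumed TX-empty bit
//   poke(&uart.DR, cast(uint) 'A');
```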
On Thursday, 24 October 2013 at 06:41:54 UTC, Timo Sintonen wrote:On Thursday, 24 October 2013 at 05:37:49 UTC, Walter Bright wrote:+1, This is what I feared. I don't think D needs a volatile keyword, but it would be nice to have *some* way to avoid this overhead using language features. I'm beginning to think inline ASM is the only way to avoid this. That's not a deal breaker for me, but it makes me sad.On 10/23/2013 5:43 PM, Mike wrote:Yes, this is a simplest way to do it and works with gdc when compiled in separate file with no optimizations and inlining. But todays peripherals may have tens of registers and they are usually represented as a struct. Using the peripheral often require several register access. Doing it this way will not make code very readable. As a workaround I have all register access functions in a separate file and compile those files in a separate directory with no optimizations. The amount of code generated is 3-4 times more and this is a problem because in controllers memory and speed are always too small.I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O. 
The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.
Oct 24 2013
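If the call overhead of out-of-line peek/poke is unacceptable, inline assembler is indeed the remaining escape hatch, since the optimizer leaves asm blocks alone. An x86 sketch using DMD's built-in inline assembler (on ARM with GDC one would use GCC extended asm instead):

```d
// Read a word from a memory-mapped address with one explicit load.
uint peekAsm(uint* addr)
{
    asm
    {
        mov EAX, addr;   // fetch the pointer parameter
        mov EAX, [EAX];  // the single read from the target address
    }
    // DMD convention: a function ending in an asm block returns
    // its uint result in EAX, set above.
}
```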
On Thursday, 24 October 2013 at 05:37:49 UTC, Walter Bright wrote:On 10/23/2013 5:43 PM, Mike wrote: volatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address: int peek(int* p); void poke(int* p, int value); Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.I raised the problem here: http://forum.dlang.org/thread/selnpobzzvrsuyihnstl forum.dlang.org Anyway, pokes and peeks are a bit more cumbersome than volatile variables, since they do not cope so well, for example, with arithmetic expressions. Still, better than nothing. *If* they existed. IMHO, embedded and hardware interfacing should get more attention.
Oct 24 2013
On 10/24/2013 4:13 AM, eles wrote:Anyway, poke's and peek's are a bit more cumbersome than volatile variables, since they do not cope so well, for example, with arithmetic expressions.Why wouldn't they work with arithmetic expressions? poke(cast(int*) 0x888777, peek(cast(int*) 0x12345678) + 1);Anyway, still better than nothing. *If* they would exist.T peek(T)(T* addr) { return *addr; } void poke(T)(T* addr, T value) { *addr = value; }IMHO, the embedded and hardware interfacing should get more attention.D's most excellent support for inline assembler should do nicely.
Oct 25 2013
On Friday, 25 October 2013 at 20:10:13 UTC, Walter Bright wrote:On 10/24/2013 4:13 AM, eles wrote:Frankly, if it is not a big deal, why don't you put those in a std.hardware or std.directaccess module? If I have to compile those in D outside of the main program, why not allow me to disable the optimizer *for a part* of my D program? Or for a variable? And if I have to compile those in C, should I go all the way with it and leave the D program as nothing but an "extern int main()"? On one hand you show me it is not a big deal; on the other hand you make a big deal of it, refusing any support inside the compiler or the standard library. Should I one day define my own "int plus(int a, int b) { return a+b; }"?Anyway, poke's and peek's are a bit more cumbersome than volatile variables, since they do not cope so well, for example, with arithmetic expressions.Why wouldn't they work with arithmetic expressions? poke(0x888777, peek(0x12345678) + 1);Anyway, still better than nothing. *If* they would exist.T peek(T)(T* addr) { return *addr; } void poke(T)(T* addr, T value) { *addr = value; }
Oct 26 2013
On Sat, 2013-10-26 at 16:48 +0200, eles wrote: […]OOH you show me it is not a big deal, OTOH you make a big deal from it refusing every support inside the compiler or the standard library.I am assuming that the C++ memory model and its definition of volatile has in some way made the problem go away and expressions such as: device->csw.ready can be constructed such that there is no caching of values and the entity is always read. Given the issues of out-of-order execution, compiler optimization and multicore, what is their solution? (The above is a genuine question rather than a troll. Last time I was writing UNIX device drivers seriously was 30+ years ago, in C, and the last embedded systems work was 10 years ago using C with specialist compilers – 8051, AVR chips and the like. I would love to be able to work with the GPIO on a Raspberry Pi with D, it would get me back into all that fun stuff. I am staying away as it looks like a return to C is the only viable option just now, unless I learn C++ again.)Should I one day define my own "int plus(int a, int b) { return a+b; }"?Surely, a + b always transforms to a.__add__(b) in all quality languages (*) so that you can redefine the meaning from the default. (*) which rules out Java. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Oct 27 2013
The sub-text here is that D should be one of the main languages on a Raspberry Pi. Currently children start with Scratch, move to Python and then (most likely) to C. Oracle have made a huge push to ensure Java is mainstream on the Raspberry Pi. I would much prefer to have D or Go as the "Don't go to C or Java" option. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Oct 27 2013
27-Oct-2013 13:09, Russel Winder wrote:The sub-text here is that, D should be one of the main languages on A Raspberry Pi.s/Raspberry Pi/ARM boards/ After all Rasp Pi is only one of many - a tiny piece of outdated ARM.Currently children start with Scratch, move to Python and then (most likely) to C. Oracle have made a huge push to ensure Java is mainstream on the Raspberry Pi. I would much prefer to have D or Go as the "Don't go to C or Java" option.+1 -- Dmitry Olshansky
Oct 27 2013
On 24 October 2013 07:36, Iain Buclaw <ibuclaw ubuntu.com> wrote:On 24 October 2013 06:37, Walter Bright <newshound2 digitalmars.com> wrote:To elaborate (now I am a little more awake :) - 'volatile' on the type means that it's volatile-qualified. volatile on the decl means it's treated as volatile in the 'C' sense. What's the difference? Well for the backend a volatile type only really has an effect on function returns (eg: a function that returns a shared int may not be subject for use in, say, tail-call optimisations). GDC propagates the volatile flag of the type to the decl, so that there is effectively no difference between shared and volatile, except in a semantic sense in the D frontend language implementation. Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';On 10/23/2013 5:43 PM, Mike wrote:Are you talking dmd or in general (it's hard to tell). In gdc, volatile is the same as in gcc/g++ in behaviour. Although in one aspect, when the default storage model was switched to thread-local, that made volatile on it's own pointless. As a side note, 'shared' is considered a volatile type in gdc, which differs from the deprecated keyword which set volatile at a decl/expression level. There is a difference in semantics, but it escapes this author at 6.30am in the morning. :o)I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? 
If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?volatile was never a reliable method for dealing with memory mapped I/O.
Oct 24 2013
On Thursday, 24 October 2013 at 00:43:11 UTC, Mike wrote:Hello again, I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?This article might also give some insight in the problems with volatile: http://blog.regehr.org/archives/28
Oct 23 2013
On Thursday, 24 October 2013 at 06:25:20 UTC, Arjan wrote:On Thursday, 24 October 2013 at 00:43:11 UTC, Mike wrote:Arjan, Thanks for the information. I basically agree with the premise of that blog, but the author also said "If you are writing code for an in-order embedded processor and have little or no infrastructure besides the C compiler, you may need to lean more heavily on volatile". Well, that's me. My goal is to target bare-metal embedded systems with D, so I'm looking for a volatile-like solution in D. I don't care if the language has the "volatile" keyword or not, I just want to be able to read and write to my IO as fast as possible, and I'm wondering if D has a way to do this in a way that is comparable to what can be achieved in C.Hello again, I'm interested in ARM bare-metal programming with D, and I'm trying to get my head wrapped around how to approach this. I'm making progress, but I found something that was surprising to me: deprecation of the volatile keyword. In the bare-metal/hardware/driver world, this keyword is important to ensure the optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral may modify the value without involving the processor. I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?This article might also give some insight in the problems with volatile: http://blog.regehr.org/archives/28
Oct 23 2013
"Mike" <none none.com> wrote in message news:bifrvifzrhgocrejepvc forum.dlang.org...I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?There are a few options: 1. Use shared in place of volatile. I'm not sure this actually works, but otherwise this is pretty good. 2. Use the deprecated volatile statement. D got it right that volatile access is a property of the load/store and not the variable, but missed the point that it's a huge pain to have to remember volatile at use. Could be made better with a wrapper. I think this still works. 3. Use inline assembly. This sucks. 4. Defeat the optimizer with inline assembly. asm { nop; } // Haha, gotcha *my_hardware_register = 999; asm { nop; } This might be harder with gdc/ldc than it is with dmd, but I'm pretty sure there's a way to trick it into thinking an asm block could clobber/read arbitrary memory. 5. Lobby for/implement some nice new volatile_read and volatile_write intrinsics. Old discussion: http://www.digitalmars.com/d/archives/digitalmars/D/volatile_variables in_D...._51984.html
Oct 24 2013
On 24 October 2013 12:50, Daniel Murphy <yebblies nospamgmail.com> wrote:"Mike" <none none.com> wrote in message news:bifrvifzrhgocrejepvc forum.dlang.org...In gdc: --- asm {"" ::: "memory";} An asm instruction without any output operands will be treated identically to a volatile asm instruction in gcc, which indicates that the instruction has important side effects. So it creates a point in the code which may not be deleted (unless it is proved to be unreachable). The "memory" clobber will tell the backend to not keep memory values cached in registers across the assembler instruction and not optimize stores or loads to that memory. (That does not prevent a CPU from reordering loads and stores with respect to another CPU, though; you need real memory barrier instructions for that.) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';I've read a few discussions on the D forums about the volatile keyword debate, but noone seemed to reconcile the need for volatile in memory-mapped IO. Was this an oversight? What's D's answer to this? If one were to use D to read from memory-mapped IO, how would one ensure the compiler doesn't cache the value?There are a few options: 1. Use shared in place of volatile. I'm not sure this actually works, but otherwise this is pretty good. 2. Use the deprecated volatile statement. D got it right that volatile access is a property of the load/store and not the variable, but missed the point that it's a huge pain to have to remember volatile at use. Could be made better with a wrapper. I think this still works. 3. Use inline assembly. This sucks. 4. Defeat the optimizer with inline assembly. asm { nop; } // Haha, gotcha *my_hardware_register = 999; asm { nop; } This might be harder with gdc/ldc than it is with dmd, but I'm pretty sure there's a way to trick it into thinking an asm block could clobber/read arbitrary memory.
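[Editor's note: in plain GNU C the barrier Iain describes can be sketched as below. The empty asm with a "memory" clobber constrains only the compiler, forcing it to discard register-cached copies of memory at that point; it emits no instruction and is not a CPU memory barrier. This uses a gcc/clang extension and the macro name is invented:]

```c
#include <stdint.h>

/* Compiler-level barrier: an asm with no outputs and a "memory"
 * clobber. The compiler may not cache memory values in registers
 * across it, nor reorder loads/stores past it. It does NOT order
 * accesses as seen by other CPUs or devices. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static uint32_t fake_status;   /* stand-in for a device register */

uint32_t read_after_write(uint32_t v)
{
    fake_status = v;
    compiler_barrier();        /* store must be emitted before this point */
    return fake_status;        /* value re-loaded from memory afterwards */
}
```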
Oct 24 2013
On Thursday, 24 October 2013 at 13:22:50 UTC, Iain Buclaw wrote:In gdc: --- asm {"" ::: "memory";} An asm instruction without any output operands will be treated identically to a volatile asm instruction in gcc, which indicates that the instruction has important side effects. So it creates a point in the code which may not be deleted (unless it is proved to be unreachable). The "memory" clobber will tell the backend to not keep memory values cached in registers across the assembler instruction and not optimize stores or loads to that memory. (That does not prevent a CPU from reordering loads and stores with respect to another CPU, though; you need real memory barrier instructions for that.)I have not (yet) had any problems when writing io registers but more with read access. Any operation after write should read the register back from real memory and not in processor registers. Any repetitive read should always read the real io register in memory. The hardware may change the register value at any time. Now a very common task like while (regs.status==0) ... may be optimized to an endless loop because the memory is read only once before the loop starts. I understood from earlier posts that variables should not be volatile but the operation should. It seems it is possible to guide the compiler like above. So would the right solution be to have a volatile block, similar to synchronized? Inside that block no memory access is optimized. This way no information of volatility is needed outside the block or in variables used there.
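[Editor's note: the status-poll idiom Timo describes can be sketched in C. With the register pointer volatile-qualified, every iteration re-reads status from memory, so the compiler cannot fold the loop into a single read followed by an endless loop. The regs_t layout is invented for illustration:]

```c
#include <stdint.h>

/* Illustrative device register block; a real layout comes from
 * the peripheral's datasheet. */
typedef struct {
    uint32_t status;
    uint32_t data;
} regs_t;

uint32_t wait_and_read(volatile regs_t *regs)
{
    while (regs->status == 0)
        ;                      /* each test is a fresh load from memory */
    return regs->data;
}
```

Without the volatile qualifier, an optimizer is entitled to hoist the single load of `status` out of the loop, producing exactly the endless loop described above.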
Oct 24 2013
"Timo Sintonen" <t.sintonen luukku.com> wrote in message news:qdvhyrzshckafkiekvnw forum.dlang.org...On Thursday, 24 October 2013 at 13:22:50 UTC, Iain Buclaw wrote:Volatile blocks are already in the language, but they suck. You don't want to have to mark every access as volatile, because all accesses to that hardware register are going to be volatile. You want it to be automatic. I'm really starting to think intrinsics are the way to go. They are safe, clear, and can be inlined. The semantics I imagine would be along the lines of llvm's volatile memory accesses (http://llvm.org/docs/LangRef.html#volatile-memory-accesses)In gdc: --- asm {"" ::: "memory";} An asm instruction without any output operands will be treated identically to a volatile asm instruction in gcc, which indicates that the instruction has important side effects. So it creates a point in the code which may not be deleted (unless it is proved to be unreachable). The "memory" clobber will tell the backend to not keep memory values cached in registers across the assembler instruction and not optimize stores or loads to that memory. (That does not prevent a CPU from reordering loads and stores with respect to another CPU, though; you need real memory barrier instructions for that.)I have not (yet) had any problems when writing io registers but more with read access. Any operation after write should read the register back from real memory and not in processor registers. Any repetitive read should always read the real io register in memory. The hardware may change the register value at any time. Now a very common task like while (regs.status==0) ... may be optimized to an endless loop because the memory is read only once before the loop starts. I understood from earlier posts that variables should not be volatile but the operation should. It seems it is possible to guide the compiler like above. So would the right solution be to have a volatile block, similar to synchronized? 
Inside that block no memory access is optimized. This way no information of volatility is needed outside the block or in variables used there.
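[Editor's note: the volatile_read/volatile_write intrinsics Daniel suggests can be approximated today as GNU C macros that qualify the access rather than the variable, much like LLVM's volatile loads and stores. The macro names are hypothetical and `__typeof__` is a gcc/clang extension:]

```c
#include <stdint.h>

/* Access-site "intrinsics": declarations stay ordinary, but every
 * access made through these macros is volatile-qualified, so each
 * one compiles to a real load or store. */
#define volatile_load(p)      (*(volatile __typeof__(*(p)) *)(p))
#define volatile_store(p, v)  (*(volatile __typeof__(*(p)) *)(p) = (v))

uint16_t toggle(uint16_t *reg)
{
    /* read-modify-write where every access really hits memory */
    uint16_t v = volatile_load(reg);
    volatile_store(reg, (uint16_t)(v ^ 0xFFFFu));
    return volatile_load(reg);
}
```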
Oct 25 2013
On Friday, 25 October 2013 at 13:07:56 UTC, Daniel Murphy wrote:"Timo Sintonen" <t.sintonen luukku.com> wrote:It seems that it is two different things here. As far as I understand, sharing means something like 'somebody may change my data' and volatility is something like 'I have to know immediately if the data is changed'. It has become obvious that these two are not easy to fit together and make a working model. The original question in this thread was to have a proper way to access hardware registers. So far, even the top people have offered only workarounds. I wonder how long D can be marketed as system language if it does not have a defined and reliable way to access system hardware. Register access occurs often in time critical places like interrupt routines. A library routine or external function is not a choice. Whatever the feature is, it has to be built in the language. I don't care if it is related to variables, blocks or files as long as I do not have to put these files in a separate directory like I do now. I would like to hear more what would be the options. Then we could make a decision what is the right way to go.I have not (yet) had any problems when writing io registers but more with read access. Any operation after write should read the register back from real memory and not in processor registers. Any repetitive read should always read the real io register in memory. The hardware may change the register value at any time. Now a very common task like while (regs.status==0) ... may be optimized to an endless loop because the memory is read only once before the loop starts. I understood from earlier posts that variables should not be volatile but the operation should. It seems it is possible to guide the compiler like above. So would the right solution be to have a volatile block, similar to synchronized? Inside that block no memory access is optimized. 
This way no information of volatility is needed outside the block or in variables used there.Volatile blocks are already in the language, but they suck. You don't want to have to mark every access as volatile, because all accesses to that hardware register are going to be volatile. You want it to be automatic. I'm really starting to think intrinsics are the way to go. They are safe, clear, and can be inlined. The semantics I imagine would be along the lines of llvm's volatile memory accesses (http://llvm.org/docs/LangRef.html#volatile-memory-accesses)
Oct 25 2013
Am Fri, 25 Oct 2013 17:20:23 +0200 schrieb "Timo Sintonen" <t.sintonen luukku.com>:On Friday, 25 October 2013 at 13:07:56 UTC, Daniel Murphy wrote:What's wrong with the solution Iain mentioned, i.e the way shared is implemented in GDC? http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2475.1382646532.1719.digitalmars-d:40puremagic.com http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2480.1382655175.1719.digitalmars-d:40puremagic.com"Timo Sintonen" <t.sintonen luukku.com> wrote:It seems that it is two different things here. As far as I understand, sharing means something like 'somebody may change my data' and volatility is something like 'I have to know immediately if the data is changed'. It has become obvious that these two are not easy to fit together and make a working model. The original question in this thread was to have a proper way to access hardware registers. So far, even the top people have offered only workarounds. I wonder how long D can be marketed as system language if it does not have a defined and reliable way to access system hardware. Register access occurs often in time critical places like interrupt routines. A library routine or external function is not a choice. Whatever the feature is, it has to be built in the language. I don't care if it is related to variables, blocks or files as long as I do not have to put these files in a separate directory like I do now. I would like to hear more what would be the options. Then we could make a decision what is the right way to go.I have not (yet) had any problems when writing io registers but more with read access. Any operation after write should read the register back from real memory and not in processor registers. Any repetitive read should always read the real io register in memory. The hardware may change the register value at any time. Now a very common task like while (regs.status==0) ... 
may be optimized to an endless loop because the memory is read only once before the loop starts. I understood from earlier posts that variables should not be volatile but the operation should. It seems it is possible to guide the compiler like above. So would the right solution be to have a volatile block, similar to synchronized? Inside that block no memory access is optimized. This way no information of volatility is needed outside the block or in variables used there.Volatile blocks are already in the language, but they suck. You don't want to have to mark every access as volatile, because all accesses to that hardware register are going to be volatile. You want it to be automatic. I'm really starting to think intrinsics are the way to go. They are safe, clear, and can be inlined. The semantics I imagine would be along the lines of llvm's volatile memory accesses (http://llvm.org/docs/LangRef.html#volatile-memory-accesses)
Oct 25 2013
On Friday, 25 October 2013 at 18:12:40 UTC, Johannes Pfau wrote:Am Fri, 25 Oct 2013 17:20:23 +0200 schrieb "Timo Sintonen" <t.sintonen luukku.com>:There is nothing wrong if it works. When I last discussed this with you and Iain, I do not remember if this was mentioned. I had been under the impression that gdc had no solution. The second thing is, as I mentioned, that register access is such an important feature in a systems language that it should be in the language specs. A quick search did not bring up any documentation about shared in general or how the gdc version is different. TDPL mentions only that shared guarantees the order of operations but does not mention anything about volatility. Can anybody point to any documentation?It seems that it is two different things here. As far as I understand, sharing means something like 'somebody may change my data' and volatility is something like 'I have to know immediately if the data is changed'. It has become obvious that these two are not easy to fit together and make a working model. The original question in this thread was to have a proper way to access hardware registers. So far, even the top people have offered only workarounds. I wonder how long D can be marketed as system language if it does not have a defined and reliable way to access system hardware. Register access occurs often in time critical places like interrupt routines. A library routine or external function is not a choice. Whatever the feature is, it has to be built in the language. I don't care if it is related to variables, blocks or files as long as I do not have to put these files in a separate directory like I do now. I would like to hear more what would be the options. Then we could make a decision what is the right way to go.What's wrong with the solution Iain mentioned, i.e the way shared is implemented in GDC?
http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2475.1382646532.1719.digitalmars-d:40puremagic.com http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2480.1382655175.1719.digitalmars-d:40puremagic.com
Oct 25 2013
Am Fri, 25 Oct 2013 21:16:29 +0200 schrieb "Timo Sintonen" <t.sintonen luukku.com>:On Friday, 25 October 2013 at 18:12:40 UTC, Johannes Pfau wrote:Yes, this was news to me as well.What's wrong with the solution Iain mentioned, i.e the way shared is implemented in GDC? http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2475.1382646532.1719.digitalmars-d:40puremagic.com http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2480.1382655175.1719.digitalmars-d:40puremagic.comThere is nothing wrong if it works. When I last time discussed about this with you and Iain, I do not remember if this was mentioned. I have been on belief that gdc has no solution.The second thing is, as I mentioned, that register access is such an important feature in system language that it should be in language specs. A quick search did not bring any documentation about shared in general and how gdc version is different. TDPL mentions only that shared guarantees the order of operations but does not mention anything about volatility. Can anybody point to any documentation?Well to be honest I don't think there's any kind of spec related to shared. This is still a very unspecified / fragile part of the language. (I totally agree though that it should be specified)
Oct 26 2013
On 26 October 2013 12:41, Johannes Pfau <nospam example.com> wrote:Am Fri, 25 Oct 2013 21:16:29 +0200 schrieb "Timo Sintonen" <t.sintonen luukku.com>:Was added about 3 years ago... https://github.com/D-Programming-GDC/GDC/commit/f87a03aa2dc619caf076174f857d4e299ce2bd8d And the type qualifier only got propagated to the declaration just over a year ago. https://github.com/D-Programming-GDC/GDC/commit/ce3e42c7283616e49728dac050f9fb090c94bfd0 -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';On Friday, 25 October 2013 at 18:12:40 UTC, Johannes Pfau wrote:Yes, this was news to me as well.What's wrong with the solution Iain mentioned, i.e the way shared is implemented in GDC? http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2475.1382646532.1719.digitalmars-d:40puremagic.com http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2480.1382655175.1719.digitalmars-d:40puremagic.comThere is nothing wrong if it works. When I last time discussed about this with you and Iain, I do not remember if this was mentioned. I have been on belief that gdc has no solution.
Oct 26 2013
On Saturday, 26 October 2013 at 11:43:02 UTC, Johannes Pfau wrote:Am Fri, 25 Oct 2013 21:16:29 +0200 schrieb "Timo Sintonen" <t.sintonen luukku.com>:On Friday, 25 October 2013 at 18:12:40 UTC, Johannes Pfau wrote:Yes, this was news to me as well.What's wrong with the solution Iain mentioned, i.e the way shared is implemented in GDC? http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2475.1382646532.1719.digitalmars-d:40puremagic.com http://forum.dlang.org/thread/bifrvifzrhgocrejepvc forum.dlang.org?page=4#post-mailman.2480.1382655175.1719.digitalmars-d:40puremagic.comThere is nothing wrong if it works. When I last time discussed about this with you and Iain, I do not remember if this was mentioned. I have been on belief that gdc has no solution.The second thing is, as I mentioned, that register access is such an important feature in system language that it should be in language specs. A quick search did not bring any documentation about shared in general and how gdc version is different. TDPL mentions only that shared guarantees the order of operations but does not mention anything about volatility. Can anybody point to any documentation?Well to be honest I don't think there's any kind of spec related to shared. This is still a very unspecified / fragile part of the language. (I totally agree though that it should be specified)Seems to work. I can mark every member as shared, or the whole struct. Not yet tested how it works with property functions or when there are tables or structs as members, but now I can move forward with my work. A little bit sad that the honored leader of the language still thinks that the right way to go is what we did with the Commodore 64...
Oct 26 2013
On Sat, 2013-10-26 at 14:49 +0200, Timo Sintonen wrote: […]A little bit sad that the honored leader of the language still thinks that the right way to go is what we did with Commodore 64...Not a good style of argument, since the way of the Commodore 64 might be a good one. It isn't, but it might have been. The core problem with peek and poke for writing device drivers is that hardware controllers do not just use byte structured memory for things, they use bit structures. So for data I/O, device->buffer = value value = device->buffer can be replaced easily with: poke(device->buffer, value) value = peek(device->buffer) but this doesn't work when you are using bitfields, you end up having to do all the ugly bit mask manipulation explicitly. Thus, what the equivalent of: device->csw.enable = 1 status = device->csw.ready is, is left to the imagination. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
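[Editor's note: for comparison, here is roughly what `device->csw.enable = 1` and `status = device->csw.ready` become once the bitfields are replaced by explicit masks, as Russel's post leaves to the imagination. The bit positions are invented; a real device's datasheet fixes them:]

```c
#include <stdint.h>

/* Hypothetical control/status word layout: ENABLE in bit 0,
 * READY in bit 7. */
#define CSW_ENABLE  (1u << 0)
#define CSW_READY   (1u << 7)

typedef struct {
    uint32_t csw;              /* control/status word */
} device_t;

void enable(volatile device_t *device)
{
    device->csw |= CSW_ENABLE;             /* device->csw.enable = 1 */
}

int is_ready(const volatile device_t *device)
{
    return (device->csw & CSW_READY) != 0; /* status = device->csw.ready */
}
```

Note the read-modify-write in `enable` touches the whole register, which is exactly the behavior a compiler-generated bitfield store would also have.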
Oct 27 2013
On 10/27/2013 1:31 AM, Russel Winder wrote:The core problem with peek and poke for writing device drivers is that hardware controllers do not just use byte structured memory for things, they use bit structures. So for data I/O, device->buffer = value value = device->buffer can be replaced easily with: poke(device->buffer, value) value = peek(device->buffer) but this doesn't work when you are using bitfields, you end up having to do all the ugly bit mask manipulation explicitly. Thus, what the equivalent of: device->csw.enable = 1 status = device->csw.ready is, is left to the imagination.Bitfield code generation for C compilers has generally been rather crappy. If you wanted performant code, you always had to do the masking yourself. I've written device drivers, and have designed, built, and programmed single board computers. I've never found dealing with the oddities of memory mapped I/O and bit flags to be of any difficulty. Do you really find & and | operations to be ugly? I don't find them any uglier than + and *. Maybe that's because of my hardware background.
Oct 27 2013
On Sun, 2013-10-27 at 02:12 -0700, Walter Bright wrote: […]Bitfield code generation for C compilers has generally been rather crappy. If you wanted performant code, you always had to do the masking yourself.Endianism and packing have always been the bête noire of bitfields due to it not being part of the standard but left as compiler specific – sort of essentially in a way due to the vast difference in targets. Given a single compiler for a given target I never found the generated code poor. Using the UNIX compiler in early 1980s and the AVR compiler suites we used in the 2000s generated code always seemed fine. What's your evidence for hand crafted code being better than compiler generated code?I've written device drivers, and have designed, built, and programmed single board computers. I've never found dealing with the oddities of memory mapped I/O and bit flags to be of any difficulty.But don't you find: *x = (1 << 7) | (1 << 9) to lead directly to the use of macros: SET_SOMETHING_READY(x) to hide the lack of immediacy of comprehension of the purpose of the expression?Do you really find & and | operations to be ugly? I don't find them any uglier than + and *. Maybe that's because of my hardware background.It's not the operations that are the problem, it is the expressions using them that lead to code that is the antithesis of self-documenting. Almost all code using <<, >>, & and | invariably ends up being replaced with macros in C and C++ so as to avoid using functions. The core point here is that this sort of code fails as soon as a function call is involved, functions cannot be used as a tool of abstraction. At least with C and C++. Clearly D has a USP over C and C++ here in that macros can be replaced by CTFE. But how do you guarantee that a function is fully evaluated at compile time and not allowed to generate a function call? Only then can functions be used instead of macros to make such code self documenting. Much better to have a bitfield system that works.
Especially on architectures such as AVR where there are areas of bit-addressable memory. Although Intel only has word-accessible memory, not all architectures do. C (and thus C++) hacked a solution that worked fine for the one compiler with the PDP and VAX targets. It was only when there were multiple compilers and multiple targets that the problem arose. There is nothing really wrong with the C bitfield syntax; it was just that different compilers did different things for the same target. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Oct 28 2013
On 10/28/2013 12:49 AM, Russel Winder wrote:On Sun, 2013-10-27 at 02:12 -0700, Walter Bright wrote: […]Generally the shifting is unnecessary, but the compiler doesn't know that as the spec says the values need to be right-justified. Also, I often set/reset/test many fields at once - doesn't work too well with bitfields. Endianism should not be an issue if you're dealing with MMIO, since MMIO is going to be extremely target-specific and hence so is your code to deal with it.Bitfield code generation for C compilers has generally been rather crappy. If you wanted performant code, you always had to do the masking yourself.Endianism and packing have always been the bête noire of bitfields due to it not being part of the standard but left as compiler specific – sort of essentially in a way due to the vast difference in targets. Given a single compiler for a given target I never found the generated code poor. Using the UNIX compiler in early 1980s and the AVR compiler suites we used in the 2000s generated code always seemed fine. What's your evidence for hand crafted code being better than compiler generated code?My bit code usually looks like: x |= FLAG_X | FLAG_Y; x &= ~(FLAG_Y | FLAG_Z); if (x & (FLAG_A | FLAG_B)) ... You'll find stuff like that all through the dmd source code :-)I've written device drivers, and have designed, built, and programmed single board computers. I've never found dealing with the oddities of memory mapped I/O and bit flags to be of any difficulty.But don't you find: *x = (1 << 7) | (1 << 9) to lead directly to the use of macros: SET_SOMETHING_READY(x) to hide the lack of immediacy of comprehension of the purpose of the expression?I thought that with modern inlining, this was no longer an issue.Do you really find & and | operations to be ugly? I don't find them any uglier than + and *.
Maybe that's because of my hardware background.It's not the operations that are the problem, it is the expressions using them that lead to code that is the antithesis of self-documenting. Almost all code using <<, >>, & and | invariable ends up being replaced with macros in C and C++ so as to avoid using functions. The core point here is that this sort of code fails as soon as a function call is involved, functions cannot be used as a tool of abstraction. At least with C and C++.Clearly D has a USP over C and C++ here in that macros can be replaced by CTFE. But how to guarantee that a function is fully evaluated at compile time and not allowed to generate a function call. Only then can functions be used instead of macros to make such code self documenting.enum X = foo(args); guarantees that foo(args) is evaluated at compile time. I.e. in any context that requires a value at compile time guarantees that it will get evaluated at compile time. If it is not required at compile time, it will not attempt CTFE on it.
Oct 28 2013
On Sat, 2013-10-26 at 14:49 +0200, Timo Sintonen wrote:
[…]
> A little bit sad that the honored leader of the language still thinks that the right way to go is what we did with the Commodore 64...

Not a good style of argument, since the way of the Commodore 64 might be a good one. It isn't, but it might have been. The core problem with peek and poke for writing device drivers is that hardware controllers do not just use byte-structured memory, they use bit structures. So for data I/O,

    device->buffer = value
    value = device->buffer

can be replaced easily with:

    poke(device->buffer, value)
    value = peek(device->buffer)

but this doesn't work when you are using bitfields; you end up having to do all the ugly bit-mask manipulation explicitly. Thus, what the equivalent of:

    device->csw.enable = 1
    status = device->csw.ready

is, is left to the imagination. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Oct 27 2013
On Saturday, 26 October 2013 at 11:43:02 UTC, Johannes Pfau wrote:
> Well, to be honest I don't think there's any kind of spec related to shared. This is still a very unspecified / fragile part of the language. (I totally agree though that it should be specified.)

I agree, and thus I think it's dangerous at best and harmful at worst to make any recommendations to use shared for anything but a mere type tag (with no intrinsic meaning) at the moment. LDC certainly does not ascribe any special meaning to shared variables, and last time I checked, DMD didn't make any of the guarantees discussed here either.

David
Oct 27 2013