D - Volatile
- Jim Starkey (56/56) Mar 21 2002 Please pardon my ignorance if this has been hashed and re-hashed. I
- Walter (8/47) Mar 21 2002 They'd have to be implemented with mutexes anyway, so might as well just
- Serge K (11/18) Mar 21 2002 "volatile" does not mean "atomic" or even "synchronized".
- Walter (11/29) Mar 21 2002 Volatile
- Stephen Fuld (24/36) Mar 22 2002 just
- Walter (27/46) Mar 22 2002 register
- Stephen Fuld (20/66) Mar 22 2002 is
- Walter (24/34) Mar 22 2002 the
- Stephen Fuld (31/67) Mar 22 2002 that
- Walter (19/55) Mar 26 2002 can
- Stephen Fuld (36/95) Mar 27 2002 Sure.
- Pavel Minayev (5/7) Mar 27 2002 know
- OddesE (11/18) Mar 27 2002 Yeah I loved it!
- Walter (10/20) Mar 26 2002 not
- Russ Lewis (18/24) Mar 26 2002 Not a bad idea, although I don't like the idea that it removes ALL cachi...
- Walter (15/36) Mar 26 2002 caching.
- Richard Krehbiel (29/55) Mar 27 2002 charset="Windows-1252"
- Walter (25/25) Mar 31 2002 charset="Windows-1252"
- Stephen Fuld (27/47) Mar 27 2002 to
- OddesE (13/19) Mar 27 2002 "Stephen Fuld" wrote in message
- Stephen Fuld (10/31) Mar 27 2002 a
- OddesE (15/42) Mar 28 2002 be
- Richard Krehbiel (12/17) Mar 26 2002 going
- Serge K (4/7) Mar 22 2002 You should try Visual C++ for Alpha.
- Walter (8/15) Mar 22 2002 optimize
- Karl Bochert (22/26) Mar 23 2002 Watcom has a form of asm that allows optimization.
- Pavel Minayev (5/10) Mar 23 2002 AFAIK, D chooses calling convention on its own, and might use
- Karl Bochert (21/37) Mar 23 2002 To quote from a message on the Euphoria newsgroup
- Sean L. Palmer (13/19) Mar 25 2002 Watcom did run circles around the competition back in the day. GCC's in...
- Walter (12/31) Mar 26 2002 wrote:
- Jim Starkey (36/45) Mar 22 2002 No, it neither necessary nor desirable to use mutexes. Yes, there are
- Walter (12/52) Mar 22 2002 Writes to bytes and aligned words/dwords are done atomically by the CPU,
Please pardon my ignorance if this has been hashed and re-hashed. I just got a pointer to D from another list, came over for a quick look-see, and liked what I saw. So I thought I'd toss in a few thoughts.

I notice there is no support for volatile, which perplexes me. Volatile is necessary to warn an optimizer that another thread may change a data item without warning. It isn't necessary in a JVM because those types of optimization can be expressed in byte codes, although it does limit what a JIT compiler can do. D is intended for real compilation, however, and when the instruction set guys give us enough registers, the compiler is going to want to stick intermediates in them. Without volatile, this ain't gonna work.

That said, the C concept of a volatile declaration doesn't go far enough. While it does warn the compiler that an unexpected change in value is fair game, it doesn't tell the compiler when or if to generate multi-process safe instruction sequences.

The obvious response is that data structures should be protected by a mutex or synchronize. The problem is that these are vastly too expensive to use in a tight, fine-grained multi-thread application. Modern multi-processors do a wonderful job of implementing processor interlocked atomic instructions. Modern OSes do a reasonable job of scheduling threads on multi-processors. Modern languages, however, do a rotten job of giving us the primitives to exploit these environments. Yeah, I know I can write an inline "lock xsub decl" yada yada yada. But it's painful and non-portable. And we all know that writing assembler rots the soul.

So, guys, I would like the following:

1. A volatile declaration so the compiler can do smart things while I do fast things.

2. A "volatile volatile" declaration or distinct operator or operator modifier to tell the compiler to use a processor interlocked instruction sequence OR give me a compile time error explaining why it can't.

There are probably smarter ways to do this than a volatile declaration. But something is needed in that niche. Or, alternatively, I could have my head thoroughly wedged. But I'll take on all comers until that is so obvious that I can see it myself.
Mar 21 2002
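An illustrative sketch (not from the thread) of the interlocked use-count decrement Jim describes, written with D's inline assembler. The function, variable name, and 32-bit x86 target are assumptions made for the example, and the exact inline-asm spelling of the lock prefix may differ between compilers:

    // Sketch: atomically decrement a shared use count without taking a mutex.
    // Assumes 32-bit x86 and DMD-style inline asm; details are illustrative only.
    void release(int* count)
    {
        asm
        {
            mov   EAX, count;        // EAX = address of the shared counter
            lock;                    // bus-lock the following instruction
            dec   dword ptr [EAX];   // atomic decrement, no OS call involved
        }
    }

The point of the sketch is only the cost model: the lock prefix adds roughly one extra bus cycle, whereas a mutex acquire/release is far more expensive.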
"Jim Starkey" <jas netfrastructure.com> wrote in message news:3C9A43BC.AFBA03BA netfrastructure.com...I notice there is no support for volatile, which perplexes me. Volatile is necessary to warn an optimizer that another thread may change a data item without warning.They'd have to be implemented with mutexes anyway, so might as well just wrap them in "synchronized". Note: the X86 CPU doesn't guarantee that writing to memory will be atomic if the item crosses a 32 bit word boundary, which can happen writing doubles, longs, or even misaligned ints.That said, the C concept of volatile declaration doesn't go far enough. While it does warn the compiler that an unexpected change is value is fair game, it doesn't tell the compiler when or if to generate multi-process safe instruction sequences.I agree that the C definition of volatile is next to useless.The obvious response is that data structures should be protected by a mutex or synchronize. The problem is that these are vastly too expensive to use in a tight, fine-grained multi-thread application. Modern multi-processors do a wonderful job of implementing processor interlocked atomic instructions. Modern OSes do a reasonable job of scheduling threads on multi-processors. Modern language, however, do a rotten job of giving the primitives to exploit these environments. Yeah, I know I can write an inline "lock xsub decl" yada yada yada. But it's painful and non-portable. And we all know that writing assembler rots the soul. So, guys, I would like the following: 1. A volatile declaration so the compiler can do smart things while I do fast things. 2. A "volatile volatile" declaration or distinct operator or operator modified to tell the compiler to use an processor interlock instruction sequence OR give me a compile time error why it can't. There are probably smarter ways to do this than a volatile declaration. But something is needed in that niche.You're wrong, writing assembler puts one into a State of Grace <g>.
Mar 21 2002
"volatile" does not mean "atomic" or even "synchronized". It's just an indication that some variable in the memory can be changed from "outside". And nobody cares when *exactly* it happens, as long as it happens. For example: by another thread on the same processor. => everything is in the same cache - no problem here. by another processor, or any other hardware (DMA, ...) => any modern processor has support for cache coherency (MESI or better), in fact - it's a "must" thing for any processor with the cache. - no problem there. (..even i486 had it..)I notice there is no support for volatile, which perplexes me. Volatile is necessary to warn an optimizer that another thread may change a data item without warning.They'd have to be implemented with mutexes anyway, so might as well just wrap them in "synchronized".I agree that the C definition of volatile is next to useless.Is it?
Mar 21 2002
"Serge K" <skarebo programmer.net> wrote in message news:a7e2kc$17qp$1 digitaldaemon.com...VolatileI notice there is no support for volatile, which perplexes me.It does in Java, which to me makes it more useful than C's notion of "don't put it in a register"."volatile" does not mean "atomic" or even "synchronized".is necessary to warn an optimizer that another thread may change a data item without warning.They'd have to be implemented with mutexes anyway, so might as well just wrap them in "synchronized".It's just an indication that some variable in the memory can be changedfrom "outside".And nobody cares when *exactly* it happens, as long as it happens. For example: by another thread on the same processor. => everything is in the same cache - no problem here. by another processor, or any other hardware (DMA, ...) => any modern processor has support for cache coherency (MESI or better), in fact - it's a "must" thing for any processorwith the cache.- no problem there. (..even i486 had it..)If you are writing to, say, a long, the long will be two write cycles. In between those two, another thread could change part of it, resulting in a scrambled write.Since it does not guarantee atomic writes, yes, I believe it is useless.I agree that the C definition of volatile is next to useless.Is it?
Mar 21 2002
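A small sketch of the torn-write hazard Walter describes; the names and values are invented, and present-day D syntax (__gshared) is used only so the variable is genuinely shared between threads:

    // On 32-bit x86, a 64-bit store compiles to two separate 32-bit writes.
    __gshared ulong sharedValue = 0;

    void writerThread()
    {
        sharedValue = 0x11111111_22222222;   // two bus cycles, not atomic
    }

    void readerThread()
    {
        // A concurrent read may observe 0x11111111_00000000 or
        // 0x00000000_22222222 -- a value neither thread ever wrote.
        // "volatile" alone cannot prevent this; only an atomic 64-bit
        // store or a lock around the access can.
        ulong seen = sharedValue;
    }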
"Walter" <walter digitalmars.com> wrote in message news:a7entf$1hik$2 digitaldaemon.com..."Serge K" <skarebo programmer.net> wrote in message news:a7e2kc$17qp$1 digitaldaemon.com...justVolatileI notice there is no support for volatile, which perplexes me.is necessary to warn an optimizer that another thread may change a data item without warning.They'd have to be implemented with mutexes anyway, so might as well"don'tIt does in Java, which to me makes it more useful than C's notion ofwrap them in "synchronized"."volatile" does not mean "atomic" or even "synchronized".put it in a register".This is necessary in many embedded systems, even when they are single threaded and even some operating system applications. For example, it is common in embedded systems to have external hardware be made visible by memory mapping the external hardware registers into the process memory space. This makes it easy to use standard syntax to manipulate the register and is the only way to implement I/O on some processors. However, you can't let the CPU keep the "data" in a CPU register or it won't work. For example, an update to the register has to actually go to the external register to be effective. It doesn't accomplish anything to update the copy in a CPU register without doing the store as the external hardware might not see it for a long time. Similarly, of course, these external registers can change their contents as the state of the external hardware changes (For example, a status register showing the completion of some external operation.) You can't let the data be stay in a register as subsequent reads, in say a polling loop, wouldn't go to the actual hardware, or worse yet, even be "optimized" away altogether. Note that this is a different issue than cache coherence. -- - Stephen Fuld e-mail address disguised to prevent spam
Mar 22 2002
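A rough sketch of the memory-mapped polling loop Stephen has in mind, in D; the register address and bit mask are hypothetical, and -- this being exactly the point under discussion -- nothing in the language stops an optimizer from hoisting the load out of the loop:

    enum size_t STATUS_REG_ADDR = 0xFFFF_F000;  // hypothetical device register address
    enum uint   DONE_BIT        = 0x0000_0001;  // hypothetical "operation complete" bit

    void waitForCompletion()
    {
        uint* status = cast(uint*) STATUS_REG_ADDR;

        // Every iteration must re-read the hardware register; if the compiler
        // keeps *status in a CPU register, the loop spins forever on a stale value.
        while ((*status & DONE_BIT) == 0)
        {
            // spin until the device reports completion
        }
    }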
"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7fq36$2uha$1 digitaldaemon.com...registerIt does in Java, which to me makes it more useful than C's notion of"don't put it in a register". This is necessary in many embedded systems, even when they are single threaded and even some operating system applications. For example, it is common in embedded systems to have external hardware be made visible by memory mapping the external hardware registers into the process memory space. This makes it easy to use standard syntax to manipulate theand is the only way to implement I/O on some processors. However, youcan'tlet the CPU keep the "data" in a CPU register or it won't work. For example, an update to the register has to actually go to the external register to be effective. It doesn't accomplish anything to update thecopyin a CPU register without doing the store as the external hardware mightnotsee it for a long time. Similarly, of course, these external registerscanchange their contents as the state of the external hardware changes (For example, a status register showing the completion of some external operation.) You can't let the data be stay in a register as subsequent reads, in say a polling loop, wouldn't go to the actual hardware, or worse yet, even be "optimized" away altogether. Note that this is a different issue than cache coherence.I understand what you mean. It's still problematic how that actually winds up being implemented in the compiler. C doesn't really define how many reads are done to an arbitrary expression in order to implement it, for example: j = i++; How many times is i read? Once or twice? mov eax, i inc i mov j, eax or: mov eax, i mov j, eax inc eax mov i, eax These ambiguities to me mean that if you need precise control over memory read and write cycles, the appropriate thing to use is the inline assembler. Volatile may happen to work, but to my mind is unreliable and may change behavior from compiler to compiler. BTW, D's inline assembler is well integrated in with the compiler. The compiler can track register usage even in asm blocks, and can still optimize the surrounding code, unlike any other inline implementation I'm aware of.
Mar 22 2002
"Walter" <walter digitalmars.com> wrote in message news:a7ft6h$1ccq$1 digitaldaemon.com..."Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7fq36$2uha$1 digitaldaemon.com...isIt does in Java, which to me makes it more useful than C's notion of"don't put it in a register". This is necessary in many embedded systems, even when they are single threaded and even some operating system applications. For example, itworsecommon in embedded systems to have external hardware be made visible by memory mapping the external hardware registers into the process memory space. This makes it easy to use standard syntax to manipulate theregisterand is the only way to implement I/O on some processors. However, youcan'tlet the CPU keep the "data" in a CPU register or it won't work. For example, an update to the register has to actually go to the external register to be effective. It doesn't accomplish anything to update thecopyin a CPU register without doing the store as the external hardware mightnotsee it for a long time. Similarly, of course, these external registerscanchange their contents as the state of the external hardware changes (For example, a status register showing the completion of some external operation.) You can't let the data be stay in a register as subsequent reads, in say a polling loop, wouldn't go to the actual hardware, orreadsyet, even be "optimized" away altogether. Note that this is a different issue than cache coherence.I understand what you mean. It's still problematic how that actually winds up being implemented in the compiler. C doesn't really define how manyare done to an arbitrary expression in order to implement it, for example: j = i++; How many times is i read? Once or twice? mov eax, i inc i mov j, eax or: mov eax, i mov j, eax inc eax mov i, eax These ambiguities to me mean that if you need precise control over memory read and write cycles, the appropriate thing to use is the inlineassembler.Volatile may happen to work, but to my mind is unreliable and may change behavior from compiler to compiler. BTW, D's inline assembler is well integrated in with the compiler. The compiler can track register usage even in asm blocks, and can stilloptimizethe surrounding code, unlike any other inline implementation I'm aware of.While I agree that you can use inline asm, and there are ways to code that could cause trouble, in practice, it works pretty well. People don't do things like post increment external registers when reading them. I know the syntax allows it, but programmers, especially embedded programmers learn pretty quickly what things to do and what not to do with the hardware they have. In practice, most uses of stuff like this is to read the whole register and test some bits or extract a field, or to create a word with the desired contents and write it in one piece to the external register. So, while volatile isn't a complete solution, it avoids having to delve into asm for the vast majority of such uses. -- - Stephen Fuld e-mail address disguised to prevent spam
Mar 22 2002
"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7g4uv$2q8r$1 digitaldaemon.com...While I agree that you can use inline asm, and there are ways to code that could cause trouble, in practice, it works pretty well. People don't do things like post increment external registers when reading them. I knowthesyntax allows it, but programmers, especially embedded programmers learn pretty quickly what things to do and what not to do with the hardware they have. In practice, most uses of stuff like this is to read the whole register and test some bits or extract a field, or to create a word withthedesired contents and write it in one piece to the external register. So, while volatile isn't a complete solution, it avoids having to delve intoasmfor the vast majority of such uses.Wouldn't it be better to have a more reliable method than trial and error? Trial and error is subject to subtle changes if a new compiler is used. I also wish to point out that volatile permeates the typing system in a C/C++ compiler. There is a great deal of code to keep everything straight in the contexts of overloading, casting, type copying, etc. I don't see why volatile is that necessary for hardware registers. You can still easilly read a hardware register by setting a pointer to it and going *p. The compiler isn't going to skip the write to it through *p (it's very, very hard for a C optimizer to remove dead stores through pointers, due to the aliasing problem). Any reads through a pointer are not cached across any assignments through a pointer, including any function calls (again, due to the aliasing problem). For example, the second read of *p will not get cached away: x = *p; // first read func(); // call function to prevent caching of pointer results y = *p; // second read func() can simply consist of RET. To do, say, a spin lock on *p: while (*p != value) func();
Mar 22 2002
"Walter" <walter digitalmars.com> wrote in message news:a7gfrs$35e$1 digitaldaemon.com..."Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7g4uv$2q8r$1 digitaldaemon.com...thatWhile I agree that you can use inline asm, and there are ways to codetheycould cause trouble, in practice, it works pretty well. People don't do things like post increment external registers when reading them. I knowthesyntax allows it, but programmers, especially embedded programmers learn pretty quickly what things to do and what not to do with the hardwareSo,have. In practice, most uses of stuff like this is to read the whole register and test some bits or extract a field, or to create a word withthedesired contents and write it in one piece to the external register.Of course! :-)while volatile isn't a complete solution, it avoids having to delve intoasmfor the vast majority of such uses.Wouldn't it be better to have a more reliable method than trial and error?Trial and error is subject to subtle changes if a new compiler is used.Yes.I also wish to point out that volatile permeates the typing system in a C/C++ compiler. There is a great deal of code to keep everything straightinthe contexts of overloading, casting, type copying, etc.I'll take your word for what is required within the compiler. I'm a compiler user, not a designer.I don't see why volatile is that necessary for hardware registers. You can still easilly read a hardware register by setting a pointer to it andgoing*p.Sure. But I am trying, as I thought you were with D, trying to minimize/eliminate the use of pointers in the source code as a major source of error.The compiler isn't going to skip the write to it through *p (it's very, very hard for a C optimizer to remove dead stores through pointers, due to the aliasing problem).Again, I am not a compiler designer, but "very very hard" implies that it isn't impossible and therefore, some future compiler *could* do it and thus breaking code as you described the problem above. :-(Any reads through a pointer are not cached across any assignments through a pointer, including any function calls (again, due to the aliasing problem). For example, the second read of *p will not get cached away: x = *p; // first read func(); // call function to prevent caching of pointer results y = *p; // second read func() can simply consist of RET. To do, say, a spin lock on *p: while (*p != value) func();Oh, that's intuitive! :-( Add an extra empty function call in order to prevent the compiler from doing some undesirable optimization. Uccccch! There has got to be a better way to address the problem than this. I'm not wedded to the "volatile" syntax and certainly not wedded to how C does things. I was just pointing out, for those who have never done embedded programming, a major reason for that syntax. If you can come up with a better solution (I guess I don't count the ones you have proposed so far to be better.) than I am all for it. You have showed such immagination in solving other C/C++ deficiencies that I have reason to hope you can solve this one elegantly. - Sorry to put you on the spot. :-) -- - Stephen Fuld e-mail address disguised to prevent spam
Mar 22 2002
"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7gom0$96n$1 digitaldaemon.com..."Walter" <walter digitalmars.com> wrote in message news:a7gfrs$35e$1 digitaldaemon.com...canI don't see why volatile is that necessary for hardware registers. Yousourcestill easilly read a hardware register by setting a pointer to it andgoing*p.Sure. But I am trying, as I thought you were with D, trying to minimize/eliminate the use of pointers in the source code as a majorof error.Pointers are still in D, for the reason that sometimes you just gotta have them. Minimizing them is a design goal, though. Also, to access hardware registers, you're going to need pointers because there is no way to specify absolute addresses for variables.toThe compiler isn't going to skip the write to it through *p (it's very, very hard for a C optimizer to remove dead stores through pointers, duethusthe aliasing problem).Again, I am not a compiler designer, but "very very hard" implies that it isn't impossible and therefore, some future compiler *could* do it andbreaking code as you described the problem above. :-(To make it impossible just have the pointer set in a function that the compiler doesn't know about.toAny reads through a pointer are not cached across any assignments through a pointer, including any function calls (again, dueresultsthe aliasing problem). For example, the second read of *p will not get cached away: x = *p; // first read func(); // call function to prevent caching of pointernoty = *p; // second read func() can simply consist of RET. To do, say, a spin lock on *p: while (*p != value) func();Oh, that's intuitive! :-( Add an extra empty function call in order to prevent the compiler from doing some undesirable optimization. Uccccch! There has got to be a better way to address the problem than this. I'mwedded to the "volatile" syntax and certainly not wedded to how C does things. I was just pointing out, for those who have never done embedded programming, a major reason for that syntax. If you can come up with a better solution (I guess I don't count the ones you have proposed so fartobe better.) than I am all for it.Yeah, I understand it isn't the greatest, but it'll work reliably. I also happen to be fond of inline assembler when dealing with hardware <g>.You have showed such immagination in solving other C/C++ deficiencies that I have reason to hope you can solve this one elegantly.Ahem. I'm on to that tactic!
Mar 26 2002
"Walter" <walter digitalmars.com> wrote in message news:a7qh13$15fb$1 digitaldaemon.com..."Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7gom0$96n$1 digitaldaemon.com...Sure."Walter" <walter digitalmars.com> wrote in message news:a7gfrs$35e$1 digitaldaemon.com...canI don't see why volatile is that necessary for hardware registers. Yousourcestill easilly read a hardware register by setting a pointer to it andgoing*p.Sure. But I am trying, as I thought you were with D, trying to minimize/eliminate the use of pointers in the source code as a majorof error.Pointers are still in D, for the reason that sometimes you just gotta have them.Minimizing them is a design goal, though.And a worthy one.Also, to access hardware registers, you're going to need pointers because there is no way tospecifyabsolute addresses for variables.Well, you could change that and eliminate one more use of pointers. I know of at least one language that allows the specification of absolute addresses for variables. You have to be careful when to allow/implement it, but it seems to work well. Some versions of the compiler (like the one given to students) just ignore the extra specification, but there are versions (you could use options) to support this. Another way to do it is to honor the requests but make the addresses program absolute and rely on the linker and other external things like the loader (or Prom/flash) burne to make them truely absolute. BTW, their syntax is varname type addressvery,The compiler isn't going to skip the write to it through *p (it'sduevery hard for a C optimizer to remove dead stores through pointers,toitthe aliasing problem).Again, I am not a compiler designer, but "very very hard" implies thatYes, but that is another "work around" that just doesn't seem "natural" Adding extra requirements that the programmer needs to know about in order to "trick" the compiler into doing the right thing are IMNSHO, not the right way to go.isn't impossible and therefore, some future compiler *could* do it andthusbreaking code as you described the problem above. :-(To make it impossible just have the pointer set in a function that the compiler doesn't know about.dueAny reads through a pointer are not cached across any assignments through a pointer, including any function calls (again,toAgreed. I am working toward comming up with "the greatest" solution. :-)resultsthe aliasing problem). For example, the second read of *p will not get cached away: x = *p; // first read func(); // call function to prevent caching of pointernoty = *p; // second read func() can simply consist of RET. To do, say, a spin lock on *p: while (*p != value) func();Oh, that's intuitive! :-( Add an extra empty function call in order to prevent the compiler from doing some undesirable optimization. Uccccch! There has got to be a better way to address the problem than this. I'mwedded to the "volatile" syntax and certainly not wedded to how C does things. I was just pointing out, for those who have never done embedded programming, a major reason for that syntax. If you can come up with a better solution (I guess I don't count the ones you have proposed so fartobe better.) than I am all for it.Yeah, I understand it isn't the greatest, but it'll work reliably.I also happen to be fond of inline assembler when dealing with hardware <g>.An affliction that I am afraid is chronic, and probably not curable. :-) As I believe that the purpose of a high level language is to minimize the use of assembler, I am not so afflicted. 
You can always drop to assembler, but that is precisely what we are trying to avoid as much as possible.solveYou have showed such immagination in solving other C/C++ deficiencies that I have reason to hope you canBut, based on your next post about "sequential", it seems to have worked :-) (I'll respond to that post there.) That is my goal here. To promote discussion on variious ways of solving the problems in order for the best one to come out. -- - Stephen Fuld e-mail address disguised to prevent spamthis one elegantly.Ahem. I'm on to that tactic!
Mar 27 2002
"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7t4c6$2jfd$1 digitaldaemon.com...Well, you could change that and eliminate one more use of pointers. Iknowof at least one language that allows the specification of absoluteaddresses Borland Pascal had it. It was great for low-level programming, indeed.
Mar 27 2002
"Pavel Minayev" <evilone omen.ru> wrote in message news:a7tdf1$2o5m$1 digitaldaemon.com..."Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7t4c6$2jfd$1 digitaldaemon.com...Yeah I loved it! Also great for addressing BIOS vars and VGA memory (in the old DOS days)... :) -- Stijn OddesE_XYZ hotmail.com http://OddesE.cjb.net _________________________________________________ Remove _XYZ from my address when replying by mailWell, you could change that and eliminate one more use of pointers. Iknowof at least one language that allows the specification of absoluteaddresses Borland Pascal had it. It was great for low-level programming, indeed.
Mar 27 2002
"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7gom0$96n$1 digitaldaemon.com...notOh, that's intuitive! :-( Add an extra empty function call in order toprevent the compiler from doing some undesirable optimization. Uccccch! There has got to be a better way to address the problem than this. I'mwedded to the "volatile" syntax and certainly not wedded to how C does things. I was just pointing out, for those who have never done embedded programming, a major reason for that syntax. If you can come up with a better solution (I guess I don't count the ones you have proposed so fartobe better.) than I am all for it. You have showed such immagination in solving other C/C++ deficiencies that I have reason to hope you can solve this one elegantly.I did have a thought. How about a keyword "sequence", as in: sequence; // no caching across this keyword x = *p; // *p is always reloaded and: x = *p; sequence; // *p is not cached
Mar 26 2002
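For comparison only: present-day D (which postdates this thread) has no "sequence" keyword, but core.atomic.atomicFence plays a similar role as a point no memory access may be cached or reordered across. A hedged sketch, with invented names:

    import core.atomic : atomicFence;

    void poll(int* p)
    {
        int x1 = *p;      // first read
        atomicFence();    // roughly the proposed "sequence": a full barrier
        int x2 = *p;      // *p is re-read rather than reused from a register
    }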
Walter wrote:
> I did have a thought. How about a keyword "sequence", as in:
>
>     sequence;  // no caching across this keyword
>     x = *p;    // *p is always reloaded
>
> and:
>     x = *p;
>     sequence;  // *p is not cached

Not a bad idea, although I don't like the idea that it removes ALL caching. How about also adding a block syntax, where caching is only disabled on the statements in the block:

    y = *q;
    sequence { x = *p; }  // *p is NOT cached
    func(*q);             // *q is still cached

Pardon me if I'm being anal, but it seems like we should make 'sequence' impact as few lines of code as possible, so you can still mix good optimization into the same code block.

Of course, somebody's going to say (for their hardware registers) that they will have to add 'sequence' to every line that uses the register, and they're going to ask for a 'sequence' type modifier...and we're back to volatile. :(

--
The Villagers are Online! villagersonline.com
.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]
Mar 26 2002
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message news:3CA0F98A.265E2A99 deming-os.org...Walter wrote:caching.I did have a thought. How about a keyword "sequence", as in: sequence; // no caching across this keyword x = *p; // *p is always reloaded and: x = *p; sequence; // *p is not cachedNot a bad idea, although I don't like the idea that it removes ALLHow about also adding a block syntax, where caching is only disabled onthestatements in the block: y = *q; sequence { x = *p; }// *p is NOT cached func(*q); // *q is still cached Pardon me if I'm being anal, but it seems like we should make 'sequence'impactas few lines of code as possible, so you can still mix good optimizationintothe same code block.Sequence won't affect enregistering variables, which is the big speed win, not caching. I think it will have a negligible affect on performance. Sequence fits nicely into the optimizer, because a special op is just inserted into the instruction stream that causes a 'kill' in the data flow analysis.Of course, somebody's going to say (for their hardware registers) thattheywill have to add 'sequence' to every line that uses the register, andthey'regoing to ask for a 'sequence' type modifier...and we're back to volatile.:( Nobody's ever happy <g>.
Mar 26 2002
charset="Windows-1252" Content-Transfer-Encoding: quoted-printable (Apology: This message is HTML so a massive link might still be = clickable.) "Walter" <walter digitalmars.com> wrote in message = news:a7qrji$1bnv$1 digitaldaemon.com...=20 "Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7gom0$96n$1 digitaldaemon.com...order toOh, that's intuitive! :-( Add an extra empty function call in =Uccccch!prevent the compiler from doing some undesirable optimization. =I'mThere has got to be a better way to address the problem than this. =notdoeswedded to the "volatile" syntax and certainly not wedded to how C =embeddedthings. I was just pointing out, for those who have never done =with aprogramming, a major reason for that syntax. If you can come up =farbetter solution (I guess I don't count the ones you have proposed so =toinbe better.) than I am all for it. You have showed such immagination =solvesolving other C/C++ deficiencies that I have reason to hope you can =This reminded me of something, so I did a quick Google search. Go read a Linux Torvalds rant about SMP-safety, volatile, and = "barrier()" (which is the Linux kernel's equivalent of "sequence"). And = much of the thread is interesting, so I'm linking the whole thing (with = this massive link - sorry). http://groups.google.com/groups?hl=3Den&threadm=3Dlinux.kernel.Pine.LNX.4= .33.0107231546430.7916-100000%40penguin.transmeta.com&rnum=3D5&prev=3D/gr= oups%3Fq%3Dtorvalds%2Btransmeta%2Bbarrier%26hl%3Den Boiled down, Torvalds believes that "volatile" as a storage class = modifier is always wrong; if "volatile" semantics (whatever they are) = are needed, then apply them at the moment of access (as with a cast). --=20 Richard Krehbiel, Arlington, VA, USA rich kastle.com (work) or krehbiel3 comcast.net (personal)this one elegantly.=20 I did have a thought. How about a keyword "sequence", as in: =20 sequence; // no caching across this keyword x =3D *p; // *p is always reloaded =20 and: x =3D *p; sequence; // *p is not cached =20 =20
Mar 27 2002
charset="Windows-1252" Content-Transfer-Encoding: quoted-printable That's a great link! Thanks. Interestingly, Linus appears to have come = to the same conclusion about volatile I did: "But the fact is, that when you add "volatile" to the register, it really tells gcc "Be afraid. Be very afraid. This user expects some random behaviour that is not actually covered by any standard, so just don't ever use this variable for any optimizations, = even if they are obviously correct. That way he can't complain". -Linus "Richard Krehbiel" <rich kastle.com> wrote in message = news:a7secs$27fl$1 digitaldaemon.com... (Apology: This message is HTML so a massive link might still be = clickable.) Go read a Linux Torvalds rant about SMP-safety, volatile, and = "barrier()" (which is the Linux kernel's equivalent of "sequence"). And = much of the thread is interesting, so I'm linking the whole thing (with = this massive link - sorry). = http://groups.google.com/groups?hl=3Den&threadm=3Dlinux.kernel.Pine.LNX.4= .33.0107231546430.7916-100000%40penguin.transmeta.com&rnum=3D5&prev=3D/gr= oups%3Fq%3Dtorvalds%2Btransmeta%2Bbarrier%26hl%3Den Boiled down, Torvalds believes that "volatile" as a storage class = modifier is always wrong; if "volatile" semantics (whatever they are) = are needed, then apply them at the moment of access (as with a cast).
Mar 31 2002
"Walter" <walter digitalmars.com> wrote in message news:a7qrji$1bnv$1 digitaldaemon.com..."Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7gom0$96n$1 digitaldaemon.com...toOh, that's intuitive! :-( Add an extra empty function call in ordersolveprevent the compiler from doing some undesirable optimization. Uccccch! There has got to be a better way to address the problem than this. I'mnotwedded to the "volatile" syntax and certainly not wedded to how C does things. I was just pointing out, for those who have never done embedded programming, a major reason for that syntax. If you can come up with a better solution (I guess I don't count the ones you have proposed so fartobe better.) than I am all for it. You have showed such immagination in solving other C/C++ deficiencies that I have reason to hope you canI think the fundamental question is whether the "non registerability" should be a property of the variable (that is, "volatile") or of the particular access to the variable (that is, "sequence"). I guess there are two types of situations where this functionality is required, variables shared among multiple threads and physical hardware registers. For the latter, since we are talking about a direct, one to one relationship between a variable and a particular piece of physical hardware, I think it is clearly a property of the variable itself. For the former, I guess it it could be considered either. But in practical terms, since one thread can't know when another thread is going to access the variable, you probably don't want the variable living in a register for any significant length of time, and probably want a simple locking mechanism as well. So I guess I come down on the side of making it a property of the variable, not the particular access. I think that will reduce source program size, eliminate the class of bugs that might occur for someone "forgetting" to put in the sequence keyword, etc. The lock mechanism is a separate issue, but I do believe there should be a defined access to the low cost locks offerred by atomic instructions in most architectures. -- - Stephen Fuld e-mail address disguised to prevent spamthis one elegantly.I did have a thought. How about a keyword "sequence", as in: sequence; // no caching across this keyword x = *p; // *p is always reloaded and: x = *p; sequence; // *p is not cached
Mar 27 2002
"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7t4ca$2jfd$2 digitaldaemon.com... <SNIP>The lock mechanism is a separate issue, but I do believe there should be a defined access to the low cost locks offerred by atomic instructions inmostarchitectures. -- - Stephen Fuld e-mail address disguised to prevent spamIsn't depending on atomic instructions dangerous? What about multi-processor systems, where two atomic instructions might execute simultaneously? -- Stijn OddesE_XYZ hotmail.com http://OddesE.cjb.net __________________________________________ Remove _XYZ from my address when replying by mail
Mar 27 2002
"OddesE" <OddesE_XYZ hotmail.com> wrote in message news:a7tf5d$2p0k$1 digitaldaemon.com..."Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7t4ca$2jfd$2 digitaldaemon.com... <SNIP>aThe lock mechanism is a separate issue, but I do believe there should beThe atomic instructions I was talking about are things like test and set, compare and swap, or atomic fetch-op-store, where the memory is locked for the duration of the instruction. These are safe in multi-processor systems. Sorry if I confused you. -- - Stephen Fuld e-mail address disguised to prevent spamdefined access to the low cost locks offerred by atomic instructions inmostarchitectures. -- - Stephen Fuld e-mail address disguised to prevent spamIsn't depending on atomic instructions dangerous? What about multi-processor systems, where two atomic instructions might execute simultaneously?-- Stijn OddesE_XYZ hotmail.com http://OddesE.cjb.net __________________________________________ Remove _XYZ from my address when replying by mail
Mar 27 2002
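A sketch of the kind of low-cost lock Stephen means, built on compare-and-swap; it uses present-day D's core.atomic (which did not exist when this was written), and the 0/1 lock-word convention is purely illustrative:

    import core.atomic : cas, atomicStore;

    shared int lockWord = 0;   // 0 = free, 1 = held

    void acquire()
    {
        // Atomically set lockWord to 1 only if it is still 0; safe on a
        // multiprocessor because the compare-and-swap is a single interlocked
        // operation, with no OS call on the uncontended path.
        while (!cas(&lockWord, 0, 1))
        {
            // spin: another processor currently holds the lock
        }
    }

    void release()
    {
        atomicStore(lockWord, 0);
    }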
"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7tm71$2sjd$2 digitaldaemon.com..."OddesE" <OddesE_XYZ hotmail.com> wrote in message news:a7tf5d$2p0k$1 digitaldaemon.com...be"Stephen Fuld" <s.fuld.pleaseremove att.net> wrote in message news:a7t4ca$2jfd$2 digitaldaemon.com... <SNIP>The lock mechanism is a separate issue, but I do believe there shouldaindefined access to the low cost locks offerred by atomic instructionssystems.mostThe atomic instructions I was talking about are things like test and set, compare and swap, or atomic fetch-op-store, where the memory is locked for the duration of the instruction. These are safe in multi-processorarchitectures. -- - Stephen Fuld e-mail address disguised to prevent spamIsn't depending on atomic instructions dangerous? What about multi-processor systems, where two atomic instructions might execute simultaneously?Sorry if I confused you. -- - Stephen Fuld e-mail address disguised to prevent spamYou didn't confuse me, the topic just does. Multi threading issues are one of my weaker points when it comes to programming... :( Thanks for clearing it up. -- Stijn OddesE_XYZ hotmail.com http://OddesE.cjb.net _________________________________________________ Remove _XYZ from my address when replying by mail
Mar 28 2002
"Walter" <walter digitalmars.com> wrote in message news:a7gfrs$35e$1 digitaldaemon.com...I don't see why volatile is that necessary for hardware registers. You can still easilly read a hardware register by setting a pointer to it andgoing*p. The compiler isn't going to skip the write to it through *p (it'svery,very hard for a C optimizer to remove dead stores through pointers, due to the aliasing problem).The linux crowd had the devil of a time with a new release of GCC. It seems that the standard for C states that acessing the bytes of one object does not necessarily alias the bytes of any other object if their accesses are by different types, unless one is char. This means that in: auto float f; *(volatile long *)&f = 0; ...this need not visibly affect the object f. Yep.
Mar 26 2002
> BTW, D's inline assembler is well integrated in with the compiler. The compiler can track register usage even in asm blocks, and can still optimize the surrounding code, unlike any other inline implementation I'm aware of.

You should try Visual C++ for Alpha. It can optimize not only the surrounding code, but inline assembly code as well. I was truly amazed when I've noticed that.
Mar 22 2002
"Serge K" <skarebo programmer.net> wrote in message news:a7gclf$ej$1 digitaldaemon.com...optimizeBTW, D's inline assembler is well integrated in with the compiler. The compiler can track register usage even in asm blocks, and can stillof.the surrounding code, unlike any other inline implementation I'm awareYou should try Visual C++ for Alpha. It can optimize not only the surrounding code, but inline assembly code as well. I was truly amazed when I've noticed that.D's instruction scheduler (and peephole optimizer) is specifically prevented from operating on the inline assembler blocks. I'm a little surprised that a compiler wouldn't do that. The whole point of inline asm is to wrest control away from the compiler and precisely lay out the instructions.
Mar 22 2002
On Fri, 22 Mar 2002 10:20:54 -0800, "Walter" <walter digitalmars.com> wrote:
> BTW, D's inline assembler is well integrated in with the compiler. The compiler can track register usage even in asm blocks, and can still optimize the surrounding code, unlike any other inline implementation I'm aware of.

Watcom has a form of asm that allows optimization.

    #pragma aux setSP =  \
        "mov ESP, eax"   \
        parm [eax]       \
        modify [EAX];

    #pragma aux getSP =  \
        "mov edx, esp"   \
        value [edx]      \
        modify [eax];

Then:

    current_sp = getSP()

is fully optimized. It also has the asm("mov eax, esp") form, which I believe is opaque to the compiler.

Watcom also allows register passing convention in addition to the standard _stdcall and _stddecl. This, and extensive optimization, enables it to produce the fastest C code of any compiler that I am aware of. An excellent back-end for D, someday. free too ;-)

Karl Bochert
Mar 23 2002
"Karl Bochert" <kbochert ix.netcom.com> wrote in message news:1103_1016902791 bose...Watcom also allows register passing convention in addition to the standard _stdcall and _stddecl. This, and extensive optimization, enables it to produce the fastest C code of any compilerAFAIK, D chooses calling convention on its own, and might use fastcall where it seems better.that I am aware of. An excellent back-end for D, someday. free too ;-)Hm? Where can I get it, then?
Mar 23 2002
On Sat, 23 Mar 2002 22:49:09 +0300, "Pavel Minayev" <evilone omen.ru> wrote:"Karl Bochert" <kbochert ix.netcom.com> wrote in message news:1103_1016902791 bose...To quote from a message on the Euphoria newsgroup " OpenWatcom is available as most of you know. The Beta to 11c does compile Euphoria Translated Code and runs much faster than LCC or Borland but you have to know a few tricks to get Watcom to work at all because the libraries and header files arent included in the beta release. I have the solution to this problem! Download Watcom 11c beta Download Masm32 by Hutch " I did this and the only problem I had was that I downloaded the file groups individually and missed one. Also the Watcom resource compiler is missing. The URL's are: http://www.openwatcom.org/ http://www.movsd.com/masm.htm A couple of benchmarks: http://www.byte.com/art/9801/sec12/art7.htm. http://www.geocities.com/SiliconValley/Vista/6552/compila.html. Karl BochertWatcom also allows register passing convention in addition to the standard _stdcall and _stddecl. This, and extensive optimization, enables it to produce the fastest C code of any compilerAFAIK, D chooses calling convention on its own, and might use fastcall where it seems better.that I am aware of. An excellent back-end for D, someday. free too ;-)Hm? Where can I get it, then?
Mar 23 2002
Watcom did run circles around the competition back in the day. GCC's inline asm provides a similar amount of information to the optimizer so in theory it should be able to perform as well as Watcom (but in practice it doesn't, from what I can tell so far) Watcom's inline asm had one main problem, which GCC doesn't: Watcom didn't let your inline asm request an empty register from the compiler... you just used a given register and the asm around the call would be rearranged to make room for the register your inline asm used. For recursive functions that doesn't work so well. For instance if you made a vector add routine where the vectors are pointed to by edx and eax, then edx and eax would become bottleneck registers whilst doing lots of vector adds and would end up getting pushed and popped alot. SeanWatcom also allows register passing convention in addition to the standard _stdcall and _stddecl. This, and extensive optimization, enables it to produce the fastest C code of any compiler that I am aware of. An excellent back-end for D, someday. free too ;-) Karl Bochert
Mar 25 2002
"Karl Bochert" <kbochert ix.netcom.com> wrote in message news:1103_1016902791 bose...On Fri, 22 Mar 2002 10:20:54 -0800, "Walter" <walter digitalmars.com>wrote:optimizeBTW, D's inline assembler is well integrated in with the compiler. The compiler can track register usage even in asm blocks, and can stillof.the surrounding code, unlike any other inline implementation I'm awareWatcom has a form of asm that allows optimization. #pragma aux setSP = \ "mov ESP, eax" \ parm [eax] \ modify [EAX] ; #pragma aux getSP = \ "mov edx, esp" \ value [edx] modify [eax]; Then: current_sp = getSP() is fully optimized.The Digital Mars optimizer doesn't need those hints to be specified by the user, it just analyzes the instructions.Watcom also allows register passing convention in addition to the standard _stdcall and _stddecl. This, and extensive optimization, enables it to produce the fastest C code of any compiler that I am aware of.My marketing has always been bad. I remember magazine compiler reviews where the reviewer's own numbers showed us to be the fastest compiler, but borland got the writeup as fastest. Where we produced the fastest benchmarks according to the reviewer's own numbers, but watcom got the writeup as fastest. It's all a bit maddening <g>.
Mar 26 2002
Walter wrote in message ...
> > I notice there is no support for volatile, which perplexes me. Volatile is necessary to warn an optimizer that another thread may change a data item without warning.
>
> They'd have to be implemented with mutexes anyway, so might as well just wrap them in "synchronized". Note: the X86 CPU doesn't guarantee that writing to memory will be atomic if the item crosses a 32 bit word boundary, which can happen writing doubles, longs, or even misaligned ints.

No, it is neither necessary nor desirable to use mutexes. Yes, there are restrictions on the interlocked instructions, but since volatile is implemented/enforced by the compiler, this should be acceptable. The compiler's responsibility should be to either implement an operation atomically or generate a diagnostic explaining why it can't.

An example of something that can be cheaply handled by enhanced volatile is use counts by objects shared across threads. An atomic interlocked decrement implemented with "lock xsub decl" does the trick correctly with no more cost than an extra bus cycle, where a mutex requires an OS call. The ratio of costs is probably three orders of magnitude or more.

> I agree that the C definition of volatile is next to useless.

I didn't mean to imply that the C definition of volatile is next to useless -- it is, in fact, absolutely critical for all but the most primitive multi-threaded code. Even when used with mutexes, volatile is necessary to warn the optimizer off unwarranted assumptions of invariance.

If D is going to succeed, it is necessary to anticipate where computer architectures are going. Everyone, I hope, understands that memory is cheap and plentiful, larger virtual address spaces are in easy sight, and dirt cheap multi-processors are here. Although we're in a period of rapidly increasing clock rates, we're also approaching physical limits on feature size. In the not distant future it will be cheaper to add more processors than buy/build faster ones. At that point performance will be gated by the degree to which doubling the number of processors doubles the speed of the system.

There is a hierarchy of synchronization primitives -- interlocked instructions, shared/exclusive locks, and mutexes -- with a large variation in cost. Interlocked instructions are almost free, mutexes cost an arm and a leg. Forcing all synchronization to use mutexes is an unnecessary waste of resources. In the absence of volatile, however, it is impossible to implement finer grained synchronization primitives. This doesn't strike me as wise....
Mar 22 2002
"Jim Starkey" <jas netfrastructure.com> wrote in message news:a7fk2p$20rj$1 digitaldaemon.com...Writes to bytes and aligned words/dwords are done atomically by the CPU, misaligned data and multiword data is not.They'd have to be implemented with mutexes anyway, so might as well just wrap them in "synchronized". Note: the X86 CPU doesn't guarantee that writing to memory will be atomic if the item crosses a 32 bit wordboundary,which can happen writing doubles, longs, or even misaligned ints.No, it neither necessary nor desirable to use mutexes. Yes, there are restrictions on the interlocked instructions, but since volatile is implemented/ enforced by the compiler, this should be acceptable. The compiler's responsibility should be to either implement an operation atomically or generate a diagnostic explaining why it can't.An example of something that can be cheaply handled by enhanced volatile is use counts by objects shared across threads. An atomic interlocked decrement implemented with "lock xsub decl" does the trick correctly with no more cost than an extra bus cycle, where a mutex requires an OS call. The ratio of costs are probably three orders of magnitude or more.Synchronizing mutexes do not require an os call most of the time, although they still are slower than a simple lock. None of the modern java vm's do an os call for each synchronize.I'm sorry, I just don't see how. See my other post here about j=i++; and how volatile doesn't help.I agree that the C definition of volatile is next to useless.I didn't mean to imply that the C definition of volatle is next to useless -- it is, in fact, absolutely critical for all but the most primitive multi-threaded code. Even when used with mutexes volatile is necessary to warn the optimizer off unwarranted assumptions of invariance.If D is going to succeed, it is necessary to anticipate where computer architures are going. Everyone, I hope, understands that memory is cheap and plentiful, larger virtual address spaces are in easy sight, and dirt cheap multi-processors are here. Although we're in a period of rapidly increasing clock rates, we're also approaching physical limits on feature size. In the not distant future it will be cheaper to add more processors than buy/build faster ones. At that point performance will be gated by the degree to which doubling the number of processors doubles the speed of the system.I think you're right.There are a hierarchy of synchronization primitives -- interlocked instructions, shared/exclusive locks, and mutexes -- with a large variation in cost. Interlocked instructions are almost free, mutexes cost an arm and a leg. Forcing all synchronization to use mutexes is an unnecessary waste of resources. In the absence of volatile, however, it is impossible to implement finer grained sychronization primitives. This doesn't strike me as wise....I think your points merit further investigation, though I don't see how volatile is the answer.
Mar 22 2002