digitalmars.D - Dual Core Support
- Manfred Nowak (9/9) Jun 16 2005 The shipping of the "AMD Athlon 64 X2" is announced to start at the
- Brad Beveridge (4/17) Jun 16 2005 How do you mean? You can program in a multithreaded manner in D, which
- Lionello Lunesu (10/10) Jun 16 2005 | Will D be outdated before the release of 1.0 because D has no support
- Manfred Nowak (56/58) Jun 17 2005 Thank you both for your responses, Brad and Lionellu.
- xs0 (41/92) Jun 17 2005 AFAIK, multi-core processors are almost exactly the same as having
- Manfred Nowak (27/34) Jun 18 2005 Thanks for your opinions, I have read them carefully several times.
- xs0 (34/67) Jun 18 2005 I don't know what can or can't be done over the internal bus, but as far...
- Manfred Nowak (6/8) Jun 18 2005 Please have a look at
- Sean Kelly (15/40) Jun 18 2005 Yikes. So you're saying you'd have lockless sharing of data between the...
- Manfred Nowak (24/48) Jun 19 2005 Why lockless?
- Sean Kelly (18/42) Jun 19 2005 If multiple cores share a single cache, then there's no need to force ca...
- James Dunne (10/57) Jun 19 2005 It's been said in this thread before, but multi-threading control is a f...
- Manfred Nowak (19/32) Jun 19 2005 True. But have you read why Buhr abandoned his concurrency project
- Sean Kelly (10/25) Jun 19 2005 They should because the way errors are handled depends on system state. ...
- Matthias Becker (6/32) Jun 20 2005 Anyway, this isn't a new problem as real concurrency isn't an invention ...
- Brad Beveridge (21/30) Jun 20 2005 I haven't had time to read the references that you posted, but the above...
- Sean Kelly (19/49) Jun 20 2005 AFAIK, dual core machines are indistuingishable from 'true' SMP machines...
- Sean Kelly (9/9) Jun 17 2005 I need to read up a bit on multi-core systems, but they act the same as ...
- Brad Beveridge (17/30) Jun 17 2005 This I agree with, library support for multi-processor systems is a good...
- Sean Kelly (7/14) Jun 17 2005 Exactly. And that leaves us with cache coherency problems. I think we'...
- Brad Beveridge (12/25) Jun 17 2005 Thinking along these lines, performance programming in D would possibly
- Sean Kelly (16/25) Jun 17 2005 True enough :) And things are changing for x86 architectures in this re...
- Manfred Nowak (7/12) Jun 18 2005 No. I have somewhere seen an argument, that if concurrency is not
- Matthias Becker (3/13) Jun 18 2005 There are some problems with optimizers that can move code around so thi...
- Sean Kelly (10/21) Jun 18 2005 This is an issue with C/C++. Specifically, it relates to the "as if" ru...
- Manfred Nowak (7/15) Jun 19 2005 Are you able to prove, that the argument holds for C++ only, which
- Sean Kelly (4/18) Jun 19 2005 Not at all. I imagine many languages target a single-threaded virtual m...
- Sean Kelly (19/33) Jun 20 2005 Okay, I dug up a copy of Ghostscript for the PC and read the first few p...
- Brad Beveridge (22/32) Jun 20 2005 Does volatile prevent code movement within the block? For example
- Sean Kelly (18/49) Jun 21 2005 The spec just says that "Memory writes occurring before the Statement ar...
- Derek Parnell (11/22) Jun 19 2005 Yes. In the exact same manner that all existing 3+GL languages are.
- Manfred Nowak (24/35) Jun 20 2005 I disagree. All this languages are way beyond version 1.0 whereas D
- Brad Beveridge (29/40) Jun 20 2005 If I have contributed to your discomfort, I am sorry - that was
- Matthias Becker (1/8) Jun 21 2005 You can build mutexes and monitors with synchronized without problems.
- Manfred Nowak (3/5) Jun 21 2005 So why did Buhr implement them?
- Brad Beveridge (17/27) Jun 22 2005 I read the library approaches paper from Buhr that you reference, I
The shipping of the "AMD Athlon 64 X2" is announced to start at the end of this month. A review is available: http://www.amdreview.com/reviews.php?rev=athlonx24200 As the review suggests WinXP and Sandra are prepared to use more than one CPU. Will D be outdated before the release of 1.0 because D has no support for multi core units? -manfred
Jun 16 2005
Manfred Nowak wrote:
> The shipping of the "AMD Athlon 64 X2" is announced to start at the end of this month. A review is available: http://www.amdreview.com/reviews.php?rev=athlonx24200 As the review suggests WinXP and Sandra are prepared to use more than one CPU. Will D be outdated before the release of 1.0 because D has no support for multi core units? -manfred

How do you mean? You can program in a multithreaded manner in D, which should take advantage of multiple cpus/cores. Or am I missing something?

Brad
Jun 16 2005
| Will D be outdated before the release of 1.0 because D has no support
| for multi core units?

There's nothing special about multi-core processors, at least when it comes to the compiler; it's all the same. A PC with a dual-core CPU (or two single-core CPUs, for that matter) can simply run two programs at full speed, at the same time. On a single-core CPU, the operating system lets each running program use the CPU for a fraction of a second, so it seems they are running at the same time, but they never really are.

L.
Jun 16 2005
"Lionello Lunesu" <lio lunesu.removethis.com> wrote:
> There's nothing special about multi-core processors, at least when it
> comes to the compiler, it's all the same.

Thank you both for your responses, Brad and Lionello. In essence both of you seem to want the OS to represent a multicore system as a virtual single-core system to you. In this case you are right: neglecting the fact that you have a multicore system does not raise any need to use its capabilities. On the other hand the OS has to do the work to make the multicore system appear as a virtual single-core system to you.

| If control of Northbridge functions is shared between software
| on both cores, software must ensure that only one core at a time
| is allowed to access the shared MSR.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF (p. 324)

So there is a need to address the specialities of dual-core machines. Please recall that an AMD Athlon64 system can contain up to 8 dual-core units and that one of D's major goals is to

| Provide low level bare metal access as required

http://www.digitalmars.com/d/overview.html

Is this really true when all bare metal access has to use the asm statement? Please look deeper into the D specs: http://www.digitalmars.com/d/statement.html

The throw-statement:

| The Object reference is thrown as an exception.

What will happen if both cores throw an exception at the same clock impulse?

The volatile statement:

| Memory writes occurring before the Statement are performed
| before any reads within or after the Statement. Memory reads
| occurring after the Statement occur after any writes before or
| within Statement are completed.

What does this mean for a multi-core system, which shares the main memory between all activated cores? Algorithmically it is simply not true that a dual-core system is equivalent to a higher-clocked single-core system!
Please recall the simple task of deciding whether a given, fixed value is present in a sufficiently large array. Using a virtual single-core machine you would simply loop through all indices until you find the given value or end up not finding it, then issue the appropriate result. Given a natural number n (n >= 2 && n <= 16) and a machine with n cores, you would divide the array into n equal-sized pieces and assign a core to each piece of the array. In case of not finding the searched value you would in essence end up having cut down the number of clock cycles needed to an n-th of the time of a virtual single-core system. But if you cannot assign a core to a task because the language used does not allow this assignment, you can do nothing more than assign the n parts of the array to n threads and then _hope_ that the OS will execute them in parallel. Would you trust your life to a system that is usually fast but cannot be guaranteed to avoid reaction-time prolongations of more than a factor of ten? You may want to answer with "no", and in this case my initial question on the outdatedness of D is assigned a positive value. -manfred
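[The partitioned search described above can be sketched as ordinary threaded code. This is a hedged illustration in Python rather than D; the function and parameter names are mine, and CPython threads will not actually deliver an n-fold speedup, so the sketch shows the structure of the n-way split, not its performance.]

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_contains(data, value, n=4):
    """Split `data` into up to n slices and search them concurrently.

    Mirrors the scheme in the post: each worker scans one slice, and
    the overall answer is "found" if any slice reports a hit.
    """
    if not data:
        return False
    chunk = (len(data) + n - 1) // n  # ceiling division so slices cover everything
    slices = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Each worker performs a plain linear scan of its slice.
        results = pool.map(lambda s: value in s, slices)
    return any(results)
```

[Note that nothing here pins a worker to a particular core; the OS scheduler decides, which is exactly the complaint raised in the post above.]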
Jun 17 2005
Manfred Nowak wrote:
> "Lionello Lunesu" <lio lunesu.removethis.com> wrote:
> > There's nothing special about multi-core processors, at least when
> > it comes to the compiler, it's all the same.
>
> Thank you both for your responses, Brad and Lionello. In essence both
> of you seem to want the OS to represent a multicore system as a
> virtual single core system to you. In this case you are right:
> neglecting the fact that you have a multicore system does not raise
> any need to use its capabilities.

AFAIK, multi-core processors are almost exactly the same as having multiple cpus, except they're in a single box and share a single bus to the outside world. So, I'd say that there's nothing much that can be done beyond what is already done (which is basically multi-threading support and synchronization objects). I don't think starting a thread is light-weight enough that the compiler should try to multi-thread code automatically, because in 99.9% of cases there'd be no benefit.

> On the other hand the OS has to do the work to make the multicore
> system appear as a virtual single core system to you.

I think the OS does just the opposite - by scheduling and task-switching, it hides the actual CPUs/cores, and makes the system appear as having any number of them (where the number is the number of threads that are running).

> | If control of Northbridge functions is shared between software
> | on both cores, software must ensure that only one core at a time
> | is allowed to access the shared MSR.
>
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF (p. 324)
>
> So there is a need to address the specialities of dual-core machines.

You should've also mentioned the title of the white paper, which is "BIOS and Kernel Developer's Guide" for [AMD processors].
I disagree that D should be specialized for those types of software, and I think you'd still need assembler anyway; much important kernel code is both speed-critical and extremely specific, so coding it in a high-level language is just not an option realistically.

> Please look deeper into the D specs:
> http://www.digitalmars.com/d/statement.html
>
> The throw-statement:
>
> | The Object reference is thrown as an exception.
>
> What will happen if both cores throw an exception at the same clock
> impulse?

Each thread will unwind its stack, like it does now, until it gets to an exception handler. I don't see the difference when there is more than one core.

> The volatile statement:
>
> | Memory writes occurring before the Statement are performed
> | before any reads within or after the Statement. Memory reads
> | occurring after the Statement occur after any writes before or
> | within Statement are completed.
>
> What does this mean for a multi-core system, which shares the main
> memory between all activated cores?

Again, you skipped an important part: "A volatile statement does not guarantee atomicity." Whenever more than one thread can access the same memory (where at least one is writing to it), the accesses should be synchronized, multi-core or not. Providing synchronization methods is the job of the OS and/or hardware, and using them is already simple in D.

> Algorithmically it is simply not true that a dual-core system is
> equivalent to a higher-clocked single-core system!

Unfortunately, no, it isn't.

> [snip] But if you cannot assign a core to a task because the language
> used does not allow this assignment, you can do nothing more than
> assign the n parts of the array to n threads and then _hope_ that the
> OS will execute them in parallel.

The OS is in charge of both cores anyway; you can't bypass it and somehow take control of the cores, so you hope for the best in any case.
That's another reason why automatically multi-threading doesn't make much sense.

> Would you trust your life to a system that is usually fast but cannot
> be guaranteed to avoid reaction-time prolongations of more than a
> factor of ten?

No, but luckily both the software and the OSs in such systems are usually written with hard guarantees about how much time anything takes.

> You may want to answer with "no", and in this case my initial question
> on the outdatedness of D is assigned a positive value.

Well, I certainly wouldn't like D to be outdated so soon, but I think that as far as performance is concerned, there are several better things that could be done first (any-order loops, array ops, easier MMX/SSE utilization, etc.). I think that only after single-thread optimizations are exhausted should we (or D, or Walter) be moving towards multi-cpu/core stuff.

xs0
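[The point above, that shared-memory accesses must be synchronized regardless of core count, can be illustrated with a rough Python analogue of D's `synchronized` statement. A hedged sketch: the class and names are mine, and Python's `threading.Lock` merely plays the role of the mutex that `synchronized` would supply in D.]

```python
import threading

class Counter:
    """Shared state guarded by a lock -- the rough Python analogue of
    wrapping the increment in a D `synchronized` statement."""
    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:  # only one thread may execute this block at a time
            self.value += 1

counter = Counter()
threads = [threading.Thread(target=lambda: [counter.increment() for _ in range(10_000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock held around each increment, the final count is exactly
# 4 * 10_000 whether the threads ran on one core or several.
```

[The same code without the lock would be a data race; the lock is what makes the outcome deterministic on any number of cores.]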
Jun 17 2005
xs0 <xs0 xs0.com> wrote:
> [...]
> AFAIK, multi-core processors are almost exactly the same as having
> multiple cpus, except they're in a single box and share a single bus
> to the outside world.

Thanks for your opinions, I have read them carefully several times. There is one fundamental difference between dual-cores and dual-cpus: dual-cores can exchange data over the internal bus and do not need any bandwidth on the bus to the "outside world". I.e. if you have a multi dual-core machine and know that two threads have to communicate intensively, you lose performance if you cannot arrange to have both threads running on a single dual-core die.

> So, I'd say that there's nothing much that can be done beyond what is
> already done (which is basically multi-threading support and
> synchronization objects).
> [...]

I do not find the hook in your arguments to the explanation why control of the two points of execution (which are implied by a dual-core machine) is not necessary. In the example of the throw statement you even explicitly say that you are not interested in guiding the machine; instead the machine is allowed to do whatever _randomly_ occurs first.

To explain why this might be wrong, imagine security rules for a train:

- if the pressing of the alive-knob for the driver times out, then stop the train as if you were pulling into a station
- if a fire alarm is issued, then bring the train to a stop as fast as possible, except when you are in a tunnel; then delay the stopping of the train until you have left the tunnel

Now what will your machine do if a fire alarm is issued in a tunnel and the pressing of the alive-knob is timing out as well?

-manfred
Jun 18 2005
Manfred Nowak wrote:
> Thanks for your opinions, I have read them carefully several times.
> There is one fundamental difference between dual-cores and dual-cpus:
> dual-cores can exchange data over the internal bus and do not need any
> bandwidth on the bus to the "outside world". I.e. if you have a multi
> dual-core machine and know that two threads have to communicate
> intensively, you lose performance if you cannot arrange to have both
> threads running on a single dual-core die.

I don't know what can or can't be done over the internal bus, but as far as thread control is concerned, it's not something that can be done by user apps, no matter what you do to the language they were coded in, because it's in the OS domain. If/when the OS supports it, the functionality is available through an OS library, so everything that D needs for multi-core CPU support is already there (access to the OS :)

Again, I think it'd be better to focus on providing constructs that allow optimization in general. When/if it is feasible to optimize them by utilizing multi-core cpus in the way you'd want, the only thing that needs to be done is improve the compiler. In the meantime, they can be optimized for other cases, like by making use of MMX/SSE instructions, which I think are totally underutilized generally, and which could easily provide comparable gains in speed.

Well, writing all this, I realize I'm not sure what you are actually proposing to be done. You seem to want some sort of multi-core support, but what would that be? Can you give an example or two?

> > So, I'd say that there's nothing much that can be done beyond what
> > is already done (which is basically multi-threading support and
> > synchronization objects).
> [...]
>
> I do not find the hook in your arguments to the explanation why
> control of the two points of execution (which are implied by a dual
> core machine) is not necessary.

I'm not saying it's not necessary, I'm just saying it's not something that can be done in the language itself.

> In the example of the throw statement you even explicitly say that you
> are not interested in guiding the machine; instead the machine is
> allowed to do whatever _randomly_ occurs first.

In a general-purpose OS, everything is basically random - at any time, the OS can switch to another task. In a real-time OS, things are different (although, admittedly, I don't know how much), but I guess most software we're writing won't be running on such an OS.

Even regardless of all this - considering the two simultaneous exceptions case: if they can occur simultaneously, it's almost certain that they can also occur within, say, 1 microsecond. If that is so, you must handle both cases of which occurs first anyway; when that is done, it doesn't matter anymore which comes first.

> To explain why this might be wrong, imagine security rules for a
> train:
>
> - if the pressing of the alive-knob for the driver times out, then
>   stop the train as if you were pulling into a station
> - if a fire alarm is issued, then bring the train to a stop as fast as
>   possible, except when you are in a tunnel; then delay the stopping
>   of the train until you have left the tunnel
>
> Now what will your machine do if a fire alarm is issued in a tunnel
> and the pressing of the alive-knob is timing out as well?

Hmm, I'm not sure where you see randomness in all this (hopefully, the software would be coded to handle the case where both things occur), but as for "my machine" - for something this simple (stop if (at_station && !alive) || (on_fire && !inside_tunnel)), I wouldn't use a CPU at all; this can be done far more reliably with a few really big logic gates :)

xs0
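[The closing formula in the post above can be written out as a pure function and, in the spirit of the "really big logic gates" remark, checked over every input combination. A sketch only: the predicate names are assumptions of mine, not from the posts.]

```python
from itertools import product

def must_stop(at_station, alive, on_fire, inside_tunnel):
    """The combined rule from the post: stop if
    (at_station && !alive) || (on_fire && !inside_tunnel)."""
    return (at_station and not alive) or (on_fire and not inside_tunnel)

# Exhaustively enumerate all 16 input combinations -- effectively the
# truth table that the "logic gates" implementation would hard-wire.
table = {inputs: must_stop(*inputs) for inputs in product([False, True], repeat=4)}
```

[Because the rule is a pure boolean function of four inputs, exhaustive checking is trivial; the hard part the thread is debating, which event arrives first and on which core, disappears once the decision is stated this way.]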
Jun 18 2005
xs0 <xs0 xs0.com> wrote:
> You seem to want some sort of multi-core support, but what would that
> be? Can you give an example or two?

Please have a look at http://plg.uwaterloo.ca/~usystem/pub/uSystem/uC++book.pdf

Thanks to "Marco A"'s post 29355 in the old D group for directing me to this reference.

-manfred
Jun 18 2005
In article <d90h6h$134$1 digitaldaemon.com>, Manfred Nowak says...
> xs0 <xs0 xs0.com> wrote:
> > [...]
> > AFAIK, multi-core processors are almost exactly the same as having
> > multiple cpus, except they're in a single box and share a single bus
> > to the outside world.
>
> Thanks for your opinions, I have read them carefully several times.
> There is one fundamental difference between dual-cores and dual-cpus:
> dual-cores can exchange data over the internal bus and do not need any
> bandwidth on the bus to the "outside world". I.e. if you have a multi
> dual-core machine and know that two threads have to communicate
> intensively, you lose performance if you cannot arrange to have both
> threads running on a single dual-core die.

Yikes. So you're saying you'd have lockless sharing of data between the cores and only force a cache sync when communicating between processors? Makes sense, I suppose, but it sounds risky.

> In the example of the throw statement you even explicitly say that you
> are not interested in guiding the machine; instead the machine is
> allowed to do whatever _randomly_ occurs first. To explain why this
> might be wrong, imagine security rules for a train:
>
> - if the pressing of the alive-knob for the driver times out, then
>   stop the train as if you were pulling into a station
> - if a fire alarm is issued, then bring the train to a stop as fast as
>   possible, except when you are in a tunnel; then delay the stopping
>   of the train until you have left the tunnel
>
> Now what will your machine do if a fire alarm is issued in a tunnel
> and the pressing of the alive-knob is timing out as well?

Perhaps I'm missing something, but I don't see why this example requires special assembly-level handling of exceptions. If the button failure exception is thrown before the fire warning is signalled, then the train will begin to slow down. Then when the fire warning is signalled, I assume the train will continue on at its existing speed until it exits the tunnel, then stop?
And if the reverse happens, the train will ignore the stop-button time-out because it's handling a more important directive. Is the issue that you don't want to use traditional synchronization in the error handling mechanism and would rather prioritize at the signalling level? I'll admit I haven't done this sort of programming before.

Sean
Jun 18 2005
Sean Kelly <sean f4.ca> wrote:
> [...]
> Yikes. So you're saying you'd have lockless sharing of data between
> the cores and only force a cache sync when communicating between
> processors? Makes sense, I suppose, but it sounds risky.

Why lockless?

> [...]
> If the button failure exception is thrown before the fire warning [...]
> And if the reverse happens [...]
> Is the issue that you don't want to use traditional synchronization in
> the error handling mechanism and would rather prioritize at the
> signalling level?

I see that you caught the basic principle behind my example. And as you may see above, it is difficult for the human brain to think in concurrency: you serialized the events but do not handle the case when, depending on an unlucky implementation, both cores might independently raise both exceptions, one core the fire exception and the other the alive-knob exception. In this case you have a control leak.

There is one more thing to mention: it is not seldom that specifications are incomplete or even contradictory, and that detection of these specification faults occurs late in the software production process. Depending on the awareness of the implementers, such a fault might carry through into the final product.

Have a look at your two cases: you are handling the case that the alive-knob exception comes first, but you missed that the fire-alarm exception might be thrown when the train has already stopped, but in a tunnel.

-manfred
Jun 19 2005
In article <d94alb$2gld$1 digitaldaemon.com>, Manfred Nowak says...
> Sean Kelly <sean f4.ca> wrote:
> > [...]
> > Yikes. So you're saying you'd have lockless sharing of data between
> > the cores and only force a cache sync when communicating between
> > processors? Makes sense, I suppose, but it sounds risky.
>
> Why lockless?

If multiple cores share a single cache, then there's no need to force cache coherency when sharing data between them. Of course, that assumes there's some way to tell you're running on two cores sharing a cache, which may not be possible. As for why: cache synchs take time. Less time than full locking, but time nevertheless. I don't know how useful this would be for PCs, but for NUMA machines that have clustered cores where inter-cluster ops involve message-passing, this may be a reasonable strategy. Though I'm speculating here, as I've never actually coded for such a machine.

> I see that you caught the basic principle behind my example. And as
> you may see above, it is difficult for the human brain to think in
> concurrency: you serialized the events but do not handle the case
> when, depending on an unlucky implementation, both cores might
> independently raise both exceptions, one core the fire exception and
> the other the alive-knob exception. In this case you have a control
> leak.

Why can't the exception handlers serialize error-handling, though? There ultimately has to be some coordination to resolve potentially conflicting directives. Why should this happen when the exception is thrown as opposed to when it's caught?

> There is one more thing to mention: it is not seldom that
> specifications are incomplete or even contradictory, and that
> detection of these specification faults occurs late in the software
> production process. Depending on the awareness of the implementers,
> such a fault might carry through into the final product.
>
> Have a look at your two cases: you are handling the case that the
> alive-knob exception comes first, but you missed that the fire-alarm
> exception might be thrown when the train has already stopped, but in a
> tunnel.

And what if the train had already stopped because of an engine failure, or because someone pulled the emergency brake? The 'fire' routine would need to know whether it should try to move a stopped train out of a tunnel, etc. How can this be solved by prioritizing exceptions? Or am I missing something?

Sean
Jun 19 2005
It's been said in this thread before, but multi-threading control is a function of the OS and not the language. Is C a dead language because it doesn't have dual-core functionality? Of course not. Although, we're still not clear on what dual-core functionality is being proposed to be added to the language. Regardless, it shouldn't be a concern. Simple multi-threading constructs and locking mechanisms should be enough to guarantee that D will work in dual-core systems.

Regards,
James Dunne

In article <d94hu8$2l7i$1 digitaldaemon.com>, Sean Kelly says...
> In article <d94alb$2gld$1 digitaldaemon.com>, Manfred Nowak says...
> > [...]
>
> If multiple cores share a single cache, then there's no need to force
> cache coherency when sharing data between them. Of course, that
> assumes there's some way to tell you're running on two cores sharing a
> cache, which may not be possible. As for why: cache synchs take time.
> Less time than full locking, but time nevertheless. I don't know how
> useful this would be for PCs, but for NUMA machines that have
> clustered cores where inter-cluster ops involve message-passing, this
> may be a reasonable strategy. Though I'm speculating here, as I've
> never actually coded for such a machine.
>
> Why can't the exception handlers serialize error-handling, though?
> There ultimately has to be some coordination to resolve potentially
> conflicting directives. Why should this happen when the exception is
> thrown as opposed to when it's caught?
>
> And what if the train had already stopped because of an engine
> failure, or because someone pulled the emergency brake? The 'fire'
> routine would need to know whether it should try to move a stopped
> train out of a tunnel, etc. How can this be solved by prioritizing
> exceptions? Or am I missing something?
>
> Sean
Jun 19 2005
James Dunne <james.jdunne gmail.com> wrote:
> Is C a dead language because it doesn't have dual-core functionality?
> Of course not.

True. But have you read why Buhr abandoned his concurrency project in C?

> Simple multi-threading constructs and locking mechanisms should be
> enough to guarantee that D will work in dual-core systems.

Can you prove that?

> [...]
> Why can't the exception handlers serialize error-handling, though?

Why should they? This kind of argument has shown up repeatedly: why should a concurrently working machine be viewed as a serially working machine? In fact the AMD cores are designed to have a programmable lower bound on the priority of the interrupts they will handle: so they will handle interrupts concurrently.

> [...]
> And what if the train had already stopped because of an engine
> failure, or because someone pulled the emergency brake?

You are right that you can extend the security rules and will have more complex scenes to solve. Therefore I limited the example to only three variables.

> The 'fire' routine would need to know whether it should try to move a
> stopped train out of a tunnel, etc. How can this be solved by
> prioritizing exceptions? Or am I missing something?

This truly cannot be done by prioritizing, and therefore I said that you have a control leak: depending on the implementation, it might be necessary to preempt both tasks assigned to the cores and start one adapted to the more complex scene.

-manfred
Jun 19 2005
In article <d94ls5$2o57$1 digitaldaemon.com>, Manfred Nowak says...
> Why should they? This kind of argument has shown up repeatedly: why
> should a concurrently working machine be viewed as a serially working
> machine? In fact the AMD cores are designed to have a programmable
> lower bound on the priority of the interrupts they will handle: so
> they will handle interrupts concurrently.

They should because the way errors are handled depends on system state, and the resources for handling these errors are shared. If two errors are thrown concurrently that both want to do something with the speed of the train, for example, something will need to prioritize those operations. What would the speed control do if it simultaneously received errors to stop and to accelerate?

> This truly cannot be done by prioritizing, and therefore I said that
> you have a control leak: depending on the implementation, it might be
> necessary to preempt both tasks assigned to the cores and start one
> adapted to the more complex scene.

This can all be done in code, though. Do multi-core CPUs actually offer instructions to do this in a way that requires language support beyond what D already has? (I suppose I should go read the references you've been posting.)

Sean
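[The question above about simultaneous "stop" and "accelerate" errors is commonly answered exactly as suggested: funnel all commands through one serialized arbiter and resolve conflicts by a fixed severity ranking. A minimal sketch, with an invented priority table; none of these names come from the thread.]

```python
import queue
import threading

# Fixed severity ranking (an assumption for illustration): lower number wins.
PRIORITY = {"emergency_stop": 0, "stop": 1, "hold_speed": 2, "accelerate": 3}

class SpeedArbiter:
    """Collects commands raised concurrently by several handlers and
    applies only the most severe one -- serialization at the catch
    site, as suggested in the thread."""
    def __init__(self):
        self._commands = queue.Queue()  # thread-safe; handlers never touch state directly

    def raise_command(self, cmd):
        self._commands.put(cmd)

    def resolve(self):
        """Drain all pending commands and return the most severe one."""
        pending = []
        while not self._commands.empty():
            pending.append(self._commands.get())
        return min(pending, key=PRIORITY.__getitem__) if pending else "hold_speed"

arb = SpeedArbiter()
# Two cores "simultaneously" demand conflicting actions:
t1 = threading.Thread(target=arb.raise_command, args=("accelerate",))
t2 = threading.Thread(target=arb.raise_command, args=("stop",))
t1.start(); t2.start(); t1.join(); t2.join()
```

[Whichever thread wins the race into the queue, the arbiter's answer is the same, which is the whole point of serializing at the catch site rather than at the throw site.]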
Jun 19 2005
> > Simple multi-threading constructs and locking mechanisms should be
> > enough to guarantee that D will work in dual-core systems.
>
> Can you prove that?

A dual core isn't that much different from dual CPUs. Give an example of a problem that could arise on a dual core but can't on dual CPUs.

> Why should they? This kind of argument has shown up repeatedly: why
> should a concurrently working machine be viewed as a serially working
> machine? In fact the AMD cores are designed to have a programmable
> lower bound on the priority of the interrupts they will handle: so
> they will handle interrupts concurrently.
> [...]

Anyway, this isn't a new problem, as real concurrency isn't an invention of this year. We have had it for a long time. There are a lot of dual-CPU machines with real concurrency. You haven't described any problem that wouldn't arise on such a machine.
Jun 20 2005
Manfred Nowak wrote:James Dunne <james.jdunne gmail.com> wrote:<SNIP>I haven't had time to read the references that you posted, but the above begs the question - can you prove that existing multi-threaded controls will not work correctly on SMP machines? I've read this thread, and I am sorry to say that I am too thick to see why dual core CPUs are any different to programming multiple CPU machines - or for that matter any different to programming a multi-threaded application. Manfred, you look to be most concerned with concurrency issues - but from a programmer's point of view I cannot see the difference between programming with multiple threads and programming with multiple CPUs/cores. Assuming a general purpose OS (and I think we have to), then your train example has (to my mind) exactly the same problems regardless of what kind of machine it is run on. The only true difference is that on a multiple core machine the instructions can actually run at the same physical time; on a single core machine the threads need to share the CPU, but that means nothing because the CPU could change threads every few operations - i.e. you need to provide the same locks and measures anyhow. BradSimple multi-threading constructs and locking mechanisms should be enough to guarantee that D will work in dual-core systems.Can you prove that?
Jun 20 2005
In article <d96n2l$11lq$1 digitaldaemon.com>, Brad Beveridge says...Manfred Nowak wrote:They will.James Dunne <james.jdunne gmail.com> wrote:<SNIP>I haven't had time to read the references that you posted, but the above begs the question - can you prove that existing multi-threaded controls will not work correctly on SMP machines?Simple multi-threading constructs and locking mechanisms should be enough to guarantee that D will work in dual-core systems.Can you prove that?I've read this thread, and I am sorry to say that I am too thick to see why dual core CPUs are any different to programming multiple CPU machines - or for that matter any different to programming a multi-threaded application.AFAIK, dual core machines are indistinguishable from 'true' SMP machines to all but perhaps an OS programmer. The most obvious example of this is that Windows reports each core of a multi-core machine as a separate CPU.Manfred, you look to be most concerned with concurrency issues - but from a programmer's point of view I cannot see the difference between programming with multiple threads and programming with multiple CPUs/cores.The only difference I can think of is that cache coherency is not an issue with single CPU machines, though you typically have to pretend that it is anyway (since not many applications are written to target a specific hardware configuration). Theoretically, I could see some of what Manfred mentioned being a potential point of optimization for realtime systems, but those would probably be built with a custom compiler and target a specific run environment anyway.Assuming a general purpose OS (and I think we have to), then your train example has (to my mind) exactly the same problems regardless of what kind of machine it is run on. 
The only true difference is that on a multiple core machine the instructions can actually run at the same physical time, on a single core machine the threads need to share the CPU, but that means nothing because the CPU could change threads every few operations - ie you need to provide the same locks and measures anyhow.Exactly. D is no different than any other procedural language in how it deals with concurrency. Though as a point of geek interest I suppose it's worth mentioning that BS' original purpose for C++ was as a concurrent language--it just didn't really stay that way once he'd finished his research. In any case, if there's anything that D lacks, I'd love to hear some concrete examples. It's much easier to address issues when you know specifically what they are, and the discussion has remained pretty abstract up to this point. Sean
Jun 20 2005
I need to read up a bit on multi-core systems, but they act the same as SMP systems, correct? So your concern is having library facilities which allow you to assign tasks to different processors and so on? If so, I think at least some basic functionality is a candidate for 1.0, especially if some motivated person is willing to write it :) I'm currently experimenting with some lockless synch. functionality in Ares, and would be happy to build processor affinity support and such into the Thread class if someone is willing to supply the assembly for it... and I believe Walter would do the same for Phobos. Sean
Jun 17 2005
Sean Kelly wrote:I need to read up a bit on multi-core systems, but they act the same as SMP systems, correct? So your concern is having library facilities which allow you to assign tasks to different processors and so on? If so, I think at least some basic functionality is a candidate for 1.0, especially if some motivated person is willing to write it :) I'm currently experimenting with some lockless synch. functionality in Ares, and would be happy to build processor affinity support and such into the Thread class if someone is willing to supply the assembly for it... and I believe Walter would do the same for Phobos. SeanThis I agree with, library support for multi-processor systems is a good idea. Of course, as far as I am aware, at the application level you don't really get to choose anyhow - you can provide hints to the OS about processor affinity, but that is about it. Writing software for multicore systems is almost the same as writing multithreaded programs - the main difference being that even more subtle bugs can show up due to the fact that threads actually are executing at the same physical time rather than merely being interleaved. As an aside, I don't particularly see the true use for multicore systems in real life applications at the moment. Right now most CPUs, unless you program very carefully, are memory bound - they spend a lot of their time waiting for memory accesses. Having multiple cores just increases the demand on the main memory bus, so the CPUs (unless executing completely out of cache) will still be waiting a lot. But I guess that is why we are seeing larger and larger L1 caches. Brad
Jun 17 2005
In article <d8uq8m$1heq$1 digitaldaemon.com>, Brad Beveridge says...As an aside, I don't particularly see the true use for multicore systems in real life applications at the moment. Right now most CPUs, unless you program very carefully, are memory bound - they spend a lot of their time waiting for memory accesses. Having multiple cores just increases the demand on the main memory bus, so the CPUs (unless executing completely out of cache) will still be waiting a lot. But I guess that is why we are seeing larger and larger L1 caches.Exactly. And that leaves us with cache coherency problems. I think we're getting close to a fundamental change in how applications are designed, but I haven't seen any suggestion for how to handle SMP efficiently and easily as locks and such just don't cut it. It's an interesting time for software design :) Sean
Jun 17 2005
Sean Kelly wrote:In article <d8uq8m$1heq$1 digitaldaemon.com>, Brad Beveridge says...<snip>Exactly. And that leaves us with cache coherency problems. I think we're getting close to a fundamental change in how applications are designed, but I haven't seen any suggestion for how to handle SMP efficiently and easily as locks and such just don't cut it. It's an interesting time for software design :) SeanThinking along these lines, performance programming in D would possibly benefit more from a library that lets you manipulate the cache. Such a library could possibly provide functions to prefill the cache, lock portions of it, etc. Of course, messing with caches is not the kind of thing that you want to do even 1% of the time - there is just too much chance that locking the cache down will negatively impact performance. Especially if the OS wants to do a context switch. Sigh, programming just ain't what it used to be when you could cycle count your assembler instructions & figure out how fast your loop would be :) Brad
Jun 17 2005
In article <d8utov$1khp$1 digitaldaemon.com>, Brad Beveridge says...Thinking along these lines, performance programming in D would possibly benefit more from a library that lets you manipulate the cache. Such a library could possibly provide functions to prefill the cache, lock portions of it, etc. Of course, messing with caches is not the kind of thing that you want to do even 1% of the time - there is just too much chance that locking the cache down will negatively impact performance. Especially if the OS wants to do a context switch. Sigh, programming just ain't what it used to be when you could cycle count your assembler instructions & figure out how fast your loop would be :)True enough :) And things are changing for x86 architectures in this regard. Until recently, x86 machines only had full mfence facilities (with the LOCK instruction), but IIRC acquire/release instructions were added to the Itanium, and I think things are moving towards more fine-grained cache control. But this is something that is sufficiently complex (even for experts) that it really needs to be done right in a library so that the average joe doesn't have to worry about it. Lockless containers are one such feature, and perhaps some other design patterns would be appropriate to support as well. Ben's work is a definite step in the right direction, and it may well be a basis for some of the stuff that ends up in Ares. As for the rest... it's worth keeping an eye on the C++ standardization process as they're facing similar issues for the next release. But D has a lead on C++ at the moment because of the way Walter implemented 'volatile'. It's my hope that D will be well suited for concurrent programming years before the next iteration of the C++ standard is finalized. Sean
Jun 17 2005
Sean Kelly <sean f4.ca> wrote:I need to read up a bit on multi-core systems, but they act the same as SMP systems, correct?Dual-cores _are_ an implementation of SMP.So your concern is having library facilities which allow you to assign tasks to different processors and so on?No. I have seen an argument somewhere that if concurrency is not implemented into the language then no compiler can be guaranteed to deliver correct code under all circumstances---therefore concurrency must be implemented into the language. -manfred
Jun 18 2005
There are some problems with optimizers that can move code around, so things might get called before a library lock directive if the compiler isn't aware that it mustn't move code in front of or behind this function call.I need to read up a bit on multi-core systems, but they act the same as SMP systems, correct?Dual-cores _are_ an implementation of SMP.So your concern is having library facilities which allow you to assign tasks to different processors and so on?No. I have seen an argument somewhere that if concurrency is not implemented into the language then no compiler can be guaranteed to deliver correct code under all circumstances---therefore concurrency must be implemented into the language.
Jun 18 2005
In article <d90hkm$134$2 digitaldaemon.com>, Manfred Nowak says...Sean Kelly <sean f4.ca> wrote:Just making sure I wasn't missing something.I need to read up a bit on multi-core systems, but they act the same as SMP systems, correct?Dual-cores _are_ an implementation of SMP.This is an issue with C/C++. Specifically, it relates to the "as if" rule and the fact that the theoretical virtual machine that optimizers target has no concept of concurrency. So there's no real way to ensure volatile instructions aren't being reordered unless you use a synchronization library. D addresses this particular issue somewhat in its reinterpretation of "volatile," and I'm sure Walter is keeping an eye on the C++ standardization talks about this issue as well. SeanSo your concern is having library facilities which allow you to assign tasks to different processors and so on?No. I have seen an argument somewhere that if concurrency is not implemented into the language then no compiler can be guaranteed to deliver correct code under all circumstances---therefore concurrency must be implemented into the language.
Jun 18 2005
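[The reordering hazard described above can be sketched in 2005-era D. This is an illustrative sketch, not code from the thread: the names `data`, `ready`, and the two producer functions are invented for the example.]

    // Sketch of the "as if" reordering problem, in 2005-era D syntax.
    // Without 'volatile', an optimizer honoring the single-threaded
    // "as if" rule may hoist the store to 'ready' above the store to
    // 'data', so a thread on a second core could observe ready == true
    // while data is still stale.
    int data;
    bool ready;

    void producerUnsafe()
    {
        data = 42;
        ready = true;   // may be reordered before the write to data
    }

    void producerSafe()
    {
        volatile        // D's volatile statement: reads and writes are
        {               // not moved across the statement's boundaries
            data = 42;
            ready = true;
        }
    }

[A consumer thread would read `ready` and then `data` inside its own volatile statement for the same reason.]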
Sean Kelly <sean f4.ca> wrote: [...]This is an issue with C/C++. Specifically, it relates to the "as if" rule and the fact that the theoretical virtual machine that optimizers target has no concept of concurrency. So there's no real way to ensure volatile instructions aren't being reordered unless you use a synchronization library. D addresses this particular issue somewhat in its reinterpretation of "volatile," and I'm sure Walter is keeping an eye on the C++ standardization talks about this issue as well.Are you able to prove that the argument holds for C++ only, which would be a contradiction to a paper accepted by ACM and available here: http://plg.uwaterloo.ca/~usystem/pub/uSystem/LibraryApproach.ps.gz -manfred
Jun 19 2005
In article <d948ms$2feb$1 digitaldaemon.com>, Manfred Nowak says...Sean Kelly <sean f4.ca> wrote: [...]Not at all. I imagine many languages target a single-threaded virtual machine. Java is probably one of the few exceptions. SeanThis is an issue with C/C++. Specifically, it relates to the "as if" rule and the fact that the theoretical virtual machine optimizers target has no concept of concurrency. So there's no real way to ensure volatile instructions aren't being reordered unless you use a synchronization library. D addresses this particular issue somewhat in its reinterpretation of "volatile," and I'm sure Walter is keeping an eye on the C++ standardization talks about this issue as well.Are you able to prove, that the argument holds for C++ only, which would be a contradiction to a paper accepted by ACM and available here: http://plg.uwaterloo.ca/~usystem/pub/uSystem/LibraryApproach.ps.gz
Jun 19 2005
In article <d948ms$2feb$1 digitaldaemon.com>, Manfred Nowak says...Sean Kelly <sean f4.ca> wrote: [...]Okay, I dug up a copy of Ghostscript for the PC and read the first few pages of this paper. I definitely agree with it, but I don't know that it applies to D. For reference, here are the suggested solutions:

1. provide some explicit language facilities to control optimization (eg. pragma, volatile, etc.)
2. provide some concurrency constructs that allow the translator to determine when to disable certain optimizations
3. a combination of approaches one and two

It's worth noting that D already provides both of these proposed solutions in the language. The 'synchronized' keyword could be used to prevent the compiler from optimizing code around these areas (if it isn't already). And 'volatile' provides programmers who need to implement concurrent code outside of synchronization blocks a means of preventing compiler optimization of critical code blocks. More work may still be useful in this area. For example, 'volatile' in D just prevents optimization across a code block, but it might be worthwhile to provide a means for something akin to acquire and release semantics to allow *some* optimization to occur. SeanThis is an issue with C/C++. Specifically, it relates to the "as if" rule and the fact that the theoretical virtual machine that optimizers target has no concept of concurrency. So there's no real way to ensure volatile instructions aren't being reordered unless you use a synchronization library. D addresses this particular issue somewhat in its reinterpretation of "volatile," and I'm sure Walter is keeping an eye on the C++ standardization talks about this issue as well.Are you able to prove that the argument holds for C++ only, which would be a contradiction to a paper accepted by ACM and available here: http://plg.uwaterloo.ca/~usystem/pub/uSystem/LibraryApproach.ps.gz
Jun 20 2005
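[The two in-language facilities discussed above can be illustrated side by side. A sketch in 2005-era D; the `Counter` class and its method names are invented for the example.]

    // Sketch of D's two in-language concurrency facilities.
    class Counter
    {
        private int count;

        // 'synchronized' takes the object's monitor: the block is
        // atomic with respect to other threads synchronizing on
        // 'this', and also a boundary the optimizer must respect.
        void increment()
        {
            synchronized (this)
            {
                count++;
            }
        }

        // 'volatile' is only an optimization barrier: the read is not
        // cached in a register or moved across the statement, but no
        // lock is taken and no hardware fence is implied.
        int read()
        {
            int c;
            volatile c = count;
            return c;
        }
    }

[Note the asymmetry this sketch assumes: synchronized gives atomicity plus ordering, while volatile gives ordering only, which is exactly why the two are discussed together in this thread.]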
Sean Kelly wrote: <Snip>It's worth noting that D already provides both of these proposed solutions in language. The 'synchronized' keyword could be used to prevent the compiler from optimizing code around these areas (if it isn't already). And 'volatile' provides programmers who need to implement concurrent code outside of synchronization blocks a means of preventing compiler optimization of critical code blocks. More work may still be useful in this area. For example, 'volatile' in D just prevents optimization across a code block, but it might be worthwhile to provide a means for something akin to acquire and release semantics to allow *some* optimization to occur.Does volatile prevent code movement within the block? For example:

   ... some optimised code (A) ...
   volatile
   {
      ... some order critical code ...
   }
   ... some optimised code (B) ...

It is obvious from the description of volatile that the 3 sections of code above will have memory barriers, i.e. when the volatile section begins all memory writes from A will have occurred, and when B begins executing all memory writes from the volatile block will have finished. But does code within the volatile block get optimised? It would be nice if code within a volatile statement were strictly ordered, with no opportunity for the compiler to move memory read/write operations. Does anybody know if this is true in practice? Brad
Jun 20 2005
In article <d97jeu$1mcv$1 digitaldaemon.com>, Brad Beveridge says...Sean Kelly wrote: <Snip>The spec just says that "Memory writes occurring before the Statement are performed before any reads within or after the Statement. Memory reads occurring after the Statement occur after any writes before or within Statement are completed." So the compiler is currently free to optimize within the code block, just not across the boundaries. And now that I look at it, it sounds like volatile statements already implement acquire/release semantics. I think the current behavior is actually okay though, as the code within the volatile block could theoretically be thousands of lines long, and I wouldn't want the optimizer to ignore that code completely, just not optimize it beyond the boundaries I've established. Also, the requirements for 'synchronized' say nothing about optimizer behavior, and I think they should--'synchronized' should probably be identical to 'volatile' except that the block is also atomic. I grant that it would be easy enough for a Mutex writer to add volatile blocks to his code, but as a synchronized block is implicitly volatile, it's worth changing simply to improve clarity if nothing else. SeanDoes volatile prevent code movement within the block? For example:

   ... some optimised code (A) ...
   volatile
   {
      ... some order critical code ...
   }
   ... some optimised code (B) ...

It is obvious from the description of volatile that the 3 sections of code above will have memory barriers, i.e. when the volatile section begins all memory writes from A will have occurred, and when B begins executing all memory writes from the volatile block will have finished. But does code within the volatile block get optimised? It would be nice if code within a volatile statement were strictly ordered, with no opportunity for the compiler to move memory read/write operations. Does anybody know if this is true in practice?
Jun 21 2005
On Thu, 16 Jun 2005 16:09:44 +0000 (UTC), Manfred Nowak wrote:The shipping of the "AMD Athlon 64 X2" is announced to start at the end of this month. A review is available: http://www.amdreview.com/reviews.php?rev=athlonx24200 As the review suggests WinXP and Sandra are prepared to use more than one CPU. Will D be outdated before the release of 1.0 because D has no support for multi core units?Yes. In the exact same manner that all existing 3+GL languages are. But maybe you are talking about library support rather than language support? Are you talking about the need for D to have new keywords or new object code generation when the target is a dual/triple/quadruple/quintuple/... core machine? Maybe this thread can be renamed "Duel Core Support" ;-) -- Derek Parnell Melbourne, Australia 20/06/2005 7:35:55 AM
Jun 19 2005
Derek Parnell <derek psych.ward> wrote: [...]I disagree. All these languages are way beyond version 1.0, whereas D isn't.Will D be outdated before the release of 1.0 because D has no support for multi core units?Yes. In the exact same manner that all existing 3+GL languagesBut maybe you are talking about library support rather than language support?If the paper of Buhr, which I have mentioned somewhere above, is right, then it is possible to include all concurrency support into a library, but only if the language follows the rules dictated by the library. And I agree with Buhr that such dictation is the same as having changed the language.Are you talking about the need for D to have new keywords or new object code generation when the target is a dual/triple/quadruple/quintuple/... core machine?According to my statement above, a clear: maybe. And the reason for this is that I do not believe that the only two keywords in D that have something to do with concurrency can be shown to be equivalents of Buhr's "mutex" and "monitor". But I may be wrong.Maybe this thread can be renamed "Duel Core Support" ;-)Thx for this broad hint. In fact I feel thrown into a position which I did not want to be engaged in. All I wanted to know is whether there is a proof that D can handle concurrency in general and, as the title shows, dual cores as a special case. Maybe I should have posted this into the "learn" group. However, I posted here and found myself confronted with opinions that dual cores are not different from single cores, or unfounded claims that D can handle any kind of concurrency. Somehow I feel very uncomfortable. -manfred
Jun 20 2005
Manfred Nowak wrote:Thx for this broad hint. In fact I feel thrown into a position which I did not want to be engaged in. All I wanted to know is whether there is a proof that D can handle concurrency in general and, as the title shows, dual cores as a special case. Maybe I should have posted this into the "learn" group. However, I posted here and found myself confronted with opinions that dual cores are not different from single cores, or unfounded claims that D can handle any kind of concurrency. Somehow I feel very uncomfortable.If I have contributed to your discomfort, I am sorry - that was certainly not my intention. I truly am interested in this topic, but as I've said before I just don't understand the problem. I also have not read the references previously posted as they are not in a format I can easily open (need to get a ps viewer, etc). I think the primary things I don't understand are (all are from a logical/programmer's point of view):

1) Is there any difference between multiple core CPUs and machines with multiple CPUs?
 * I don't believe that there is any significant difference, in which case we perhaps should agree that we are talking about SMP in general.
2) From a programmer's point of view, what _is_ the difference between a program that runs in multiple threads and a program that runs in multiple threads on multiple cores?
 * I understand that physically there are different things happening, but I currently believe that logically there is no difference.
3) Can you please summarise the primitives that are required to program properly on SMP machines?
 * Although I do little multi-threaded programming, I understand that threads need to have atomic operations as a basic synchronizing mechanism; other than that I am not familiar enough to comment.
4) Could you please show a specific case where D is not able to handle an SMP situation, and how it could/should be fixed with additions to the language?
 * I liked the train example, could you perhaps make it pseudo-code & point out the weaknesses?

Thanks Brad
Jun 20 2005
You can build mutexes and monitors with synchronized without problems.Are you talking about the need for D to have new keywords or new object code generation when the target is a dual/triple/quadruple/quintuple/... core machine?According to my statement above a clear: maybe. And the reason for this is that I do not believe that the only two keyowrds in D that something have to do with concurrency can be show as aequivalents to Buhrs "mutex" and "monitor". But I may be wrong.
Jun 21 2005
Matthias Becker <Matthias_member pathlink.com> wrote:You can build mutexes and monitors with synchronized without problems.So why did Buhr implement them? -manfred
Jun 21 2005
Manfred Nowak wrote:Matthias Becker <Matthias_member pathlink.com> wrote:I read the library approaches paper from Buhr that you reference; I don't see that he implemented anything. He made two basic points:

1) Variables cached in registers will not be visible between tasks.
2) Code optimisation can reorder instructions aggressively, which can lead to code that should be inside critical sections being moved outside critical sections.

C addresses point 1 with the volatile keyword: any variable that is "volatile" will be written to memory rather than kept solely in registers. D's meaning of volatile addresses both concerns - code cannot move around a volatile statement, and reads and writes are performed to memory. D also adds "synchronized", but in reality you could build your own locks on top of volatile without the language feature "synchronized". So D as a language meets the criteria for concurrent programming that Buhr laid out. BradYou can build mutexes and monitors with synchronized without problems.So why did Buhr implement them? -manfred
Jun 22 2005
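[Brad's point that locks can be built on top of volatile, without 'synchronized', can be sketched as a minimal spin lock. This sketch assumes x86 (where XCHG with a memory operand is implicitly atomic, and an aligned store is atomic) and DMD-style inline asm; all names are invented for the example, and it is not production code.]

    // Minimal spin lock built from 'volatile' plus one atomic x86
    // instruction, with no use of 'synchronized'. Sketch only.
    int lockWord = 0;               // 0 = free, 1 = held

    bool tryAcquire(int* l)
    {
        int old;
        asm
        {
            mov EAX, l;             // address of the lock word
            mov EDX, 1;
            xchg [EAX], EDX;        // atomically swap 1 in, old value out
            mov old, EDX;
        }
        return old == 0;            // acquired iff the word was free
    }

    void acquire()
    {
        while (!tryAcquire(&lockWord))
        {
            // busy-wait; each retry goes through the atomic XCHG above
        }
    }

    void release()
    {
        volatile lockWord = 0;      // volatile keeps the store from being
    }                               // moved above the critical section

[Whether this really satisfies Buhr's criteria is exactly the open question in the thread: the atomicity here comes from the hardware instruction, while volatile only restrains the optimizer.]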