
digitalmars.D.learn - How can I make a program which uses all cores and 100% of cpu power?

reply Murilo <murilomiranda92 hotmail.com> writes:
I have started working with neural networks and for that I need a 
lot of computing power but the programs I make only use around 
30% of the cpu, or at least that is what Task Manager tells me. 
How can I make it use all 4 cores of my AMD FX-4300 and how can I 
make it use 100% of it?
Oct 10 2019
next sibling parent Daniel Kozak <kozzi11 gmail.com> writes:
On Fri, Oct 11, 2019 at 2:45 AM Murilo via Digitalmars-d-learn
<digitalmars-d-learn puremagic.com> wrote:
 I have started working with neural networks and for that I need a
 lot of computing power but the programs I make only use around
 30% of the cpu, or at least that is what Task Manager tells me.
 How can I make it use all 4 cores of my AMD FX-4300 and how can I
 make it use 100% of it?
You should use at least as many threads as you have cores, so in your case 4 or even more. Then you should buy a new CPU if you really need a lot of computing power :). Another issue can be blocking IO, which leaves your threads idle, so they can stress your CPU.
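[Editor's note: a minimal sketch of this advice, not part of the original post. std.parallelism's parallel() runs loop iterations on a pool sized to totalCPUs by default, so a CPU-bound loop keeps every core busy; the loop bound here is arbitrary.]

```d
import std.parallelism : parallel, totalCPUs;
import std.range : iota;
import std.stdio : writeln;

void main()
{
    auto sums = new ulong[](totalCPUs);

    // parallel() distributes the iterations over a pool of totalCPUs
    // worker threads, so each core gets a busy loop of its own.
    foreach (i; parallel(totalCPUs.iota))
    {
        ulong s = 0;
        foreach (j; 0 .. 10_000_000)
            s += j;
        sums[i] = s;   // each thread writes a distinct index
    }

    writeln("workers finished: ", sums.length);
}
```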
Oct 10 2019
prev sibling next sibling parent Daniel Kozak <kozzi11 gmail.com> writes:
On Fri, Oct 11, 2019 at 6:58 AM Daniel Kozak <kozzi11 gmail.com> wrote:
  so can stress your CPU.
can't
Oct 10 2019
prev sibling next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 10/10/2019 05:41 PM, Murilo wrote:
 I have started working with neural networks and for that I need a lot of
 computing power but the programs I make only use around 30% of the cpu,
 or at least that is what Task Manager tells me. How can I make it use
 all 4 cores of my AMD FX-4300 and how can I make it use 100% of it?
Your threads must allocate as little memory as possible, because memory allocation can trigger garbage collection, and garbage collection stops all threads (except the one that's performing the collection).

We studied the effects of different allocation schemes during our last local D meetup[1]. The following program has two similar worker threads. One allocates in an inner scope; the other uses a static Appender and clears its state as needed. The program sets 'w' to 'worker' inside main(). Change it to 'worker2' to see a huge difference: on my 4-core laptop it's 100% versus 400% CPU usage.

import std.random;
import std.range;
import std.algorithm;
import std.array;        // for Appender
import std.concurrency;
import std.parallelism;

enum inner_N = 100;

void worker() {
    ulong result;
    while (true) {
        int[] arr;
        foreach (j; 0 .. inner_N) {
            arr ~= uniform(0, 2);
        }
        result += arr.sum;
    }
}

void worker2() {
    ulong result;
    static Appender!(int[]) arr;
    while (true) {
        arr.clear();
        foreach (j; 0 .. inner_N) {
            arr ~= uniform(0, 2);
        }
        result += arr.data.sum;
    }
}

void main() {
    // Replace with 'worker2' to see the speedup
    alias w = worker;
    auto workers = totalCPUs.iota.map!(_ => spawn(&w)).array;
    w();
}

The static Appender is thread-safe because each thread gets its own copy, data being thread-local by default in D. However, that doesn't mean the functions are reentrant: if they were called recursively, perhaps indirectly, the subsequent executions would corrupt the previous executions' Appender states.

Ali

[1] https://www.meetup.com/D-Lang-Silicon-Valley/events/kmqcvqyzmbzb/

Are you someone in the Bay Area but do not come to our meetups? We've been eating your falafel wraps! ;)
Oct 10 2019
parent Murilo <murilomiranda92 hotmail.com> writes:
On Friday, 11 October 2019 at 06:18:03 UTC, Ali Çehreli wrote:
 Your threads must allocate as little memory as possible because 
 memory allocation can trigger garbage collection and garbage 
 collection stops all threads (except the one that's performing 
 collection).
 We studied the effects of different allocation schemes during 
 our last local D meetup[1]. The following program has two 
 similar worker threads. One allocates in an inner scope, the 
 other one uses a static Appender and clears its state as needed.
 The static Appender is thread-safe because each thread gets 
 their own copy due to data being thread-local by default in D. 
 However, it doesn't mean that the functions are reentrant: If 
 they get called recursively perhaps indirectly, then the 
 subsequent executions would corrupt previous executions' 
 Appender states.
Thanks for the information, it was very helpful.
Dec 05 2019
prev sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Fri, 2019-10-11 at 00:41 +0000, Murilo via Digitalmars-d-learn wrote:
 I have started working with neural networks and for that I need a 
 lot of computing power but the programs I make only use around 
 30% of the cpu, or at least that is what Task Manager tells me. 
 How can I make it use all 4 cores of my AMD FX-4300 and how can I 
 make it use 100% of it?
Why do you want to get CPU utilisation to 100%? I would have thought you'd want to get the neural net to be as fast as possible; this does not necessarily imply that all CPU cycles must be used.

A neural net is, at its heart, a set of communicating nodes. This is as much an I/O-bound model as it is a compute-bound one – nodes are generally waiting for input as much as they are computing a value. The obvious solution architecture for a small computer is to create a task per node on a thread pool, with a few more threads in the pool than you have processors, and hope that you can organise the communication between tasks so as to avoid cache misses. This can be tricky when using multi-core processors. It gets even worse when you have hyperthreads – many organisations doing CPU-bound computations switch off hyperthreads as they cause more problems than they solve.

-- 
Russel.
===========================================
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk
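[Editor's note: a minimal sketch of the task-per-node idea using std.parallelism's task/taskPool, not part of the original post. The "node" function here is illustrative; a real network would have many nodes and real weights.]

```d
import std.parallelism : task, taskPool;
import std.stdio : writeln;

// A hypothetical "node": receives its inputs, then computes a value.
double nodeCompute(double a, double b)
{
    return (a + b) / 2;   // stand-in for a neuron's real work
}

void main()
{
    // One task per node, scheduled onto the shared thread pool
    // (which holds totalCPUs - 1 worker threads by default).
    auto t1 = task!nodeCompute(1.0, 3.0);
    auto t2 = task!nodeCompute(5.0, 7.0);
    taskPool.put(t1);
    taskPool.put(t2);

    // yieldForce blocks until the task's result is available,
    // mirroring a downstream node waiting on its inputs.
    auto result = nodeCompute(t1.yieldForce, t2.yieldForce);
    writeln("output node: ", result);
}
```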
Oct 10 2019
parent Murilo <murilomiranda92 hotmail.com> writes:
On Friday, 11 October 2019 at 06:57:46 UTC, Russel Winder wrote:
 A neural net is, at its heart, a set of communicating nodes. 
 This is as much an I/O-bound model as it is a compute-bound one 
 – nodes are generally waiting for input as much as they are 
 computing a value. The obvious solution architecture for a 
 small computer is to create a task per node on a thread pool, 
 with a few more threads in the pool than you have processors, 
 and hope that you can organise the communication between tasks 
 so as to avoid cache misses. This can be tricky when using 
 multi-core processors. It gets even worse when you have 
 hyperthreads – many organisations doing CPU-bound computations 
 switch off hyperthreads as they cause more problems than they 
 solve.
Thanks, that helped a lot. But I already figured out a new training algorithm that is a lot faster, so there's no need to use parallelism anymore.
Dec 05 2019