digitalmars.D.internals - multi threading in dmd

Robert Schadek <rschadek symmetryinvestments.com> writes:

Compiling is, IMHO, getting painfully slow with growing projects.
One thing I'm working on is up to 30+ seconds for 20k lines of
somewhat heavy code.
But let's not argue about whether or not I'm doing it wrong; for
the sake of argument, let's assume compiling is slow.

One thing I see is that dub passes many files at once to dmd.
And dmd runs one thread on that input.

I think there is some opportunity to start multiple threads to do 
at least some of
the work in parallel.

1. Has anybody done any work on using threads in dmd?

2. Am I correct that, in theory, dmd should be able to lex all
passed files in parallel (given enough CPU cores)?

3. Is it correct that currently one token is created at a time, on
request by the parser?

4. This would currently require the classes Identifier and
StringTable to be made thread safe.

5. Is AsyncRead in mars.d dead code?

6. Is there any way to test all the different version statements
and static ifs used for the same purpose?

7. Is there a chance to parse all the initially given files in
parallel?

8. Any other ideas on how to do threading in dmd?

Oct 11 2019

Jacob Carlborg <doob me.com> writes:

On 2019-10-11 14:49, Robert Schadek wrote:
> Compiling is, IMHO, getting painfully slow with growing projects.
> One thing I'm working on is up to 30+ seconds for 20k lines of somewhat
> heavy code.
> But let's not argue about whether or not I'm doing it wrong; for the
> sake of argument, let's assume compiling is slow.
>
> One thing I see is that dub passes many files at once to dmd.
> And dmd runs one thread on that input.
>
> I think there is some opportunity to start multiple threads to do at
> least some of the work in parallel.
>
> 1. Has anybody done any work on using threads in dmd?
>
> 2. Am I correct that, in theory, dmd should be able to lex all passed
> files in parallel (given enough CPU cores)?
>
> 3. Is it correct that currently one token is created at a time, on
> request by the parser?
>
> 4. This would currently require the classes Identifier and StringTable
> to be made thread safe.
>
> 5. Is AsyncRead in mars.d dead code?
>
> 6. Is there any way to test all the different version statements and
> static ifs used for the same purpose?
>
> 7. Is there a chance to parse all the initially given files in parallel?
>
> 8. Any other ideas on how to do threading in dmd?

In general, there are quite a lot of globals in DMD. There are five 
`__gshared` variables in the lexer alone. Then more will be referenced 
from the lexer and parser. There's also the issue of reporting
diagnostics. That might need to be synchronized; otherwise parts of
an error message from one file might be printed interleaved with
parts from another file.
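
To make the diagnostics point concrete, here is a minimal sketch in Python (not D, and not dmd code) of serializing error output behind one lock so that each message is written as a unit. All names here (`report`, `check_file`, `run`) are hypothetical; only the `file(line): Error: message` shape follows what dmd prints.

```python
import io
import threading

# Hypothetical sketch: none of these names exist in dmd.
_diag_lock = threading.Lock()

def report(out, filename, line, message):
    """Write one complete diagnostic while holding the lock."""
    with _diag_lock:
        # Three separate writes, as a compiler might emit a message in
        # pieces; the lock keeps pieces from different threads apart.
        out.write(f"{filename}({line}): Error: ")
        out.write(message)
        out.write("\n")

def check_file(out, filename):
    # Stand-in for per-file work that emits several diagnostics.
    for line in range(1, 51):
        report(out, filename, line, "undefined identifier `x`")

def run(filenames):
    out = io.StringIO()
    threads = [threading.Thread(target=check_file, args=(out, f))
               for f in filenames]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.getvalue().splitlines()

lines = run(["a.d", "b.d", "c.d"])
```

The order of messages across files is still nondeterministic here; a real compiler would probably also want to buffer per file and flush in a stable order.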

Most of the time is spent doing semantic analysis. That will be even 
harder to do with multiple threads.

-- 
/Jacob Carlborg

Oct 11 2019

Sebastian Wilzbach <seb wilzba.ch> writes:

On 11/10/2019 14.49, Robert Schadek via Dlang-internal wrote:
> Compiling is, IMHO, getting painfully slow with growing projects.
> One thing I'm working on is up to 30+ seconds for 20k lines of somewhat
> heavy code.
> But let's not argue about whether or not I'm doing it wrong; for the
> sake of argument, let's assume compiling is slow.

Are you aware that the official release of dmd is built with dmd?
A dmd compiled with LDC was about 2x as fast in my last tests (this
was without LTO or PGO, and with an older LLVM backend).

> One thing I see is that dub passes many files at once to dmd.

Dub has the same problem (built with dmd), but the semi-official
binaries that you can grab here are built with LDC:
https://github.com/dlang/dub/releases (and of course the ones shipped
with LDC).

> And dmd runs one thread on that input.
>
> I think there is some opportunity to start multiple threads to do at
> least some of the work in parallel.

Yes, but I don't think lexing is an important part here. It's too cheap.

> 1. Has anybody done any work on using threads in dmd?

https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/

> 2. Am I correct that, in theory, dmd should be able to lex all passed
> files in parallel (given enough CPU cores)?

Yes, but lexing is __very__ cheap. Your performance problems come from
code with heavy template + CTFE usage and other expensive semantic
checks. Benchmark before you optimize!

> 3. Is it correct that currently one token is created at a time, on
> request by the parser?

The parser generally calls nextToken(), but it can also ask for more
e.g. with peekNext2() or peekPastParen(tk). Though note that the entire
file is loaded into one buffer
(https://github.com/dlang/dmd/blob/7c90cf18cf2ff8bea7eb9aa372b09fc4870efe9e/src/dmd/dmodule.d#L560).

> 4. This would currently require the classes Identifier and StringTable
> to be made thread safe.

Lexing doesn't touch Identifier or StringTable. It simply slices the
string from the fully allocated blob (see e.g.
https://github.com/dlang/dmd/blob/7c90cf18cf2ff8bea7eb9aa372b09fc4870efe9e/src/dmd/lexer.d#L1657)
and new allocations are malloc'ed and copied (see e.g.
https://github.com/dlang/dmd/blob/7c90cf18cf2ff8bea7eb9aa372b09fc4870efe9e/src/dmd/tokens.d#L736).
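
As a rough illustration of the "one buffer, tokens are slices, parser pulls tokens and may peek ahead" shape described above, here is a toy sketch in Python. It is not dmd's lexer: `ToyLexer`, `next_token` and `peek` are hypothetical names that only loosely mirror nextToken() and the peek functions, and the token rules are deliberately trivial.

```python
# Hypothetical toy lexer; not dmd's implementation.
class ToyLexer:
    def __init__(self, source: bytes):
        self.buf = memoryview(source)  # the whole file in one buffer
        self.pos = 0
        self.lookahead = []            # tokens scanned but not yet consumed

    @staticmethod
    def _is_ident_char(c: int) -> bool:
        return chr(c).isalnum() or c == ord("_")

    def _scan(self):
        buf, n = self.buf, len(self.buf)
        # Skip whitespace between tokens.
        while self.pos < n and chr(buf[self.pos]).isspace():
            self.pos += 1
        if self.pos >= n:
            return None  # end of file
        start = self.pos
        if self._is_ident_char(buf[start]):
            while self.pos < n and self._is_ident_char(buf[self.pos]):
                self.pos += 1
        else:
            self.pos += 1  # any other character is a one-character token
        # Slicing a memoryview yields a view into the buffer, not a copy.
        return buf[start:self.pos]

    def next_token(self):
        """Return the next token on request, or None at end of file."""
        if self.lookahead:
            return self.lookahead.pop(0)
        return self._scan()

    def peek(self, k=1):
        """Look k tokens ahead without consuming anything."""
        while len(self.lookahead) < k:
            self.lookahead.append(self._scan())
        return self.lookahead[k - 1]
```

Since each lexer instance only reads its own buffer and keeps its own position, running one per file in parallel is unproblematic in this toy; the trouble in a real compiler comes from whatever shared state sits behind it.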

> 5. Is AsyncRead in mars.d dead code?

Yes.

> 6. Is there any way to test all the different version statements and
> static ifs used for the same purpose?

No.

> 7. Is there a chance to parse all the initially given files in parallel?

No. I think the Async changes were abandoned when it became apparent
that the benefit was low relative to the work involved.

> 8. Any other ideas on how to do threading in dmd?

Do not focus on lexing. Focus on CTFE + templates.
You want to do the following:
- cache (e.g. https://github.com/dlang/dmd/pull/7843)
  - even entire modules could be cached and loaded for subsequent runs
- be more lazy (i.e. DMD could be a lot more conservative)
- reduce DMD's memory consumption (there is still a lot of low-hanging
  fruit)
  - example: https://github.com/dlang/dmd/pull/10396#issuecomment-531454363 or
    https://github.com/dlang/dmd/pull/10427
- optimize DMD's CTFE + template code (there is still a lot of
  low-hanging fruit)
  - example: https://github.com/dlang/dmd/pull/10395,
    https://github.com/dlang/dmd/pull/10394 or even things like
    https://github.com/dlang/dmd/pull/10391
- focus on running semantics in parallel (hard, but should be easier
  when working on independent modules)
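
For the caching idea, a minimal sketch of what "cache and reload on subsequent runs" could look like in principle, written in Python. This is an assumption-laden illustration, not what the linked PR does: `ResultCache` and `expensive_analysis` are hypothetical, the cache lives in memory only, and a real module cache would also have to key on imported modules and the compiler version.

```python
import hashlib

class ResultCache:
    """Hypothetical content-addressed cache; not part of dmd."""

    def __init__(self):
        self._store = {}
        self.misses = 0

    def get(self, source: bytes, flags: tuple, compute):
        # Key on everything that can change the result: here, the
        # compiler flags and the exact source text.
        h = hashlib.sha256()
        h.update(repr(flags).encode())
        h.update(b"\0")
        h.update(source)
        key = h.hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = compute(source)
        return self._store[key]

def expensive_analysis(source: bytes) -> int:
    # Stand-in for costly semantic work: just count whitespace-separated
    # chunks of the source.
    return len(source.split())
```

Usage would be along the lines of `cache.get(src, ("-O",), expensive_analysis)`: the second call with the same source and flags returns the stored result without recomputing, while a changed flag or a changed byte in the source is a miss.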

Also, I recommend looking for the real culprits (dub comes with real
overhead too) or for easy low-hanging fruit. For example, on Linux DMD
could use mmapped files to speed up file reading.
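
To show what mmapped reading means in practice, here is a small Python sketch using the standard `mmap` module. The helper name `read_source_mmap` is hypothetical, and this says nothing about how DMD would wire it in; whether it is actually faster than a plain read would need measuring.

```python
import mmap
import os
import tempfile

def read_source_mmap(path):
    """Return a file's contents via a read-only memory map."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return b""  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Copied out here so the result outlives the mapping; a
            # compiler would lex directly from the mapped pages instead.
            return m[:]

def demo():
    # Write a tiny source file, then read it back through the mapping.
    with tempfile.NamedTemporaryFile("wb", suffix=".d", delete=False) as t:
        t.write(b"module a;\n")
        path = t.name
    try:
        return read_source_mmap(path)
    finally:
        os.remove(path)
```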

Oct 11 2019
