
digitalmars.D - Speeding up importing Phobos files

Walter Bright <newshound2 digitalmars.com> writes:
Andrei and I were talking on the phone today, trading ideas about speeding up 
importation of Phobos files. Any particular D file tends to import much of 
Phobos, and much of Phobos imports the rest of it. We've both noticed that file 
size doesn't seem to matter much for importation speed, but file lookups remain 
slow.

So looking up fewer files would make it faster.

Here's the idea: Place all Phobos source files into a single zip file, call it 
phobos.zip (duh). Then,

     dmd myfile.d phobos.zip

and the compiler will look in phobos.zip to resolve, say, std/stdio.d. If phobos.zip is opened as a memory mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster, besides being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)

It doesn't have to be just phobos, this can be a general facility. People can 
distribute their D libraries as a zip file that never needs unzipping.

We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We 
can experiment to see if compressed zips are faster than uncompressed ones.
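For instance, here is a minimal sketch of the archive-side lookup built on std.zip. A memory-mapped variant could swap std.mmfile in for the read; the assumption that phobos.zip's internal layout mirrors the import paths (std/stdio.d) is mine, for illustration:

    // Minimal sketch: resolve an import like "std.stdio" against phobos.zip.
    // Assumes the archive layout mirrors the import path (std/stdio.d).
    import std.array : replace;
    import std.file : read;
    import std.zip : ZipArchive;

    ubyte[] lookupInArchive(ZipArchive archive, string moduleName)
    {
        auto path = moduleName.replace(".", "/") ~ ".d";
        if (auto member = path in archive.directory)
            return archive.expand(*member); // decompresses if deflated, copies if stored
        return null; // not in the archive: fall back to the normal path search
    }

    void main()
    {
        auto archive = new ZipArchive(read("phobos.zip"));
        assert(lookupInArchive(archive, "std.stdio") !is null);
    }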

This can be a fun challenge! Anyone up for it?

P.S. dmd's ability to directly manipulate object library files, rather than 
going through lib or ar, has been a nice success.
Jan 19
Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 Andrei and I were talking on the phone today, trading ideas 
 about speeding up importation of Phobos files. Any particular D 
 file tends to import much of Phobos, and much of Phobos imports 
 the rest of it. We've both noticed that file size doesn't seem 
 to matter much for importation speed, but file lookups remain 
 slow.

 [...]
If we are going there we might as well use a proper database as the compiler cache format, similar to pre-compiled headers. I'd be interested to see how this bears out. I am going to devote some time this weekend to it, but using sqlite rather than zip.
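A rough sketch of what the sqlite-backed lookup could look like, using the etc.c.sqlite3 bindings that ship with the D distribution (link with -L-lsqlite3). The modules(name, source) table schema is invented here for illustration:

    // Hypothetical cache schema: modules(name TEXT PRIMARY KEY, source TEXT).
    // The caller opens the database once with sqlite3_open.
    import etc.c.sqlite3;
    import std.exception : enforce;

    string fetchModule(sqlite3* db, string name)
    {
        sqlite3_stmt* stmt;
        enforce(sqlite3_prepare_v2(db,
            "SELECT source FROM modules WHERE name = ?", -1, &stmt, null)
            == SQLITE_OK);
        scope (exit) sqlite3_finalize(stmt);
        sqlite3_bind_text(stmt, 1, name.ptr, cast(int) name.length, null);
        if (sqlite3_step(stmt) == SQLITE_ROW)
        {
            auto len = sqlite3_column_bytes(stmt, 0);
            auto text = cast(const(char)*) sqlite3_column_text(stmt, 0);
            return text[0 .. len].idup; // one indexed lookup, no path search
        }
        return null;
    }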
Jan 19
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Jan 19, 2019 at 08:59:37AM +0000, Stefan Koch via Digitalmars-d wrote:
 On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 Andrei and I were talking on the phone today, trading ideas about
 speeding up importation of Phobos files. Any particular D file tends
 to import much of Phobos, and much of Phobos imports the rest of it.
 We've both noticed that file size doesn't seem to matter much for
 importation speed, but file lookups remain slow.
 
 [...]
 If we are going there we might as well use a proper database as the compiler cache format, similar to pre-compiled headers.

[...]

I'd like to see us go in this direction. It could lead to other new things, like the compiler inferring attributes for all functions (not just template / auto functions) and storing the inferred attributes in the precompiled cache. It could even store additional derived information not representable in the source that could be used for program-wide optimization, etc.

T

--
Let's call it an accidental feature. -- Larry Wall
Jan 19
Temtaime <temtaime gmail.com> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 Andrei and I were talking on the phone today, trading ideas 
 about speeding up importation of Phobos files. Any particular D 
 file tends to import much of Phobos, and much of Phobos imports 
 the rest of it. We've both noticed that file size doesn't seem 
 to matter much for importation speed, but file lookups remain 
 slow.

 So looking up fewer files would make it faster.

 Here's the idea: Place all Phobos source files into a single 
 zip file, call it phobos.zip (duh). Then,

     dmd myfile.d phobos.zip

 and the compiler will look in phobos.zip to resolve, say, 
 std/stdio.d. If phobos.zip is opened as a memory mapped file, 
 whenever std/stdio.d is read, the file will be "faulted" into 
 memory rather than doing a file lookup / read. We're 
 speculating that this should be significantly faster, besides 
 being very convenient for the user to treat Phobos as a single 
 file rather than a blizzard. (phobos.lib could also be in the 
 same file!)

 It doesn't have to be just phobos, this can be a general 
 facility. People can distribute their D libraries as a zip file 
 that never needs unzipping.

 We already have https://dlang.org/phobos/std_zip.html to do the 
 dirty work. We can experiment to see if compressed zips are 
 faster than uncompressed ones.

 This can be a fun challenge! Anyone up for it?

 P.S. dmd's ability to directly manipulate object library files, 
 rather than going through lib or ar, has been a nice success.
C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem? Better to speed up compilation instead.
Jan 19
Walter Bright <newshound2 digitalmars.com> writes:
On 1/19/2019 1:00 AM, Temtaime wrote:
 C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem? Better to speed up compilation instead.
You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.
Jan 19
FeepingCreature <feepingcreature gmail.com> writes:
On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright wrote:
 On 1/19/2019 1:00 AM, Temtaime wrote:
 C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem? Better to speed up compilation instead.

 You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.

If you've benchmarked this, could you please post your benchmark source so people can reproduce it? It would probably be good to gather data from more than one PC. Maybe make a mini-survey for the results.
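Something like this minimal sketch would do as a starting point (not Walter's original benchmark; the paths are placeholders for a real dmd install):

    // Time N failed exists() probes along an import path against N reads
    // of a single known file, to compare lookup cost with read cost.
    import std.datetime.stopwatch : AutoStart, StopWatch;
    import std.file : exists, read;
    import std.stdio : writefln;

    void main()
    {
        enum N = 10_000;
        // Placeholder import roots; substitute your local dmd install paths.
        string[] searchPath = ["/usr/include/dmd/druntime/import",
                               "/usr/include/dmd/phobos"];

        auto sw = StopWatch(AutoStart.yes);
        foreach (i; 0 .. N)
            foreach (dir; searchPath)
                cast(void) exists(dir ~ "/std/no_such_module.d");
        writefln("%s lookups: %s", N * searchPath.length, sw.peek);

        sw.reset();
        foreach (i; 0 .. N)
            cast(void) read("/usr/include/dmd/phobos/std/stdio.d");
        writefln("%s reads:   %s", N, sw.peek);
    }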
Jan 19
Walter Bright <newshound2 digitalmars.com> writes:
On 1/19/2019 1:12 AM, FeepingCreature wrote:
 If you've benchmarked this, could you please post your benchmark source so 
 people can reproduce it?
I benchmarked it while developing Warp (the C preprocessor replacement I did for Facebook). I was able to speed up searches for .h files substantially by remembering previous lookups in a hash table. The speedup persisted across Windows and Linux.

https://github.com/facebookarchive/warp
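The Warp source has the details; the gist of the hash-table idea is something like this sketch (not Warp's actual code):

    // Memoize path searches: each import name hits the filesystem at most once.
    import std.file : exists;

    string[string] lookupCache; // import name -> resolved path ("" = not found)

    string findFile(string name, string[] searchPath)
    {
        if (auto cached = name in lookupCache)
            return *cached;
        foreach (dir; searchPath)
        {
            auto candidate = dir ~ "/" ~ name;
            if (exists(candidate))
                return lookupCache[name] = candidate;
        }
        return lookupCache[name] = ""; // cache negative results too
    }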
 Probably be good to gather data from more than one PC. 
 Maybe make a minisurvey for the results.
Sounds like a good idea. Please take charge of this!
Jan 19
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/19/19 4:12 AM, FeepingCreature wrote:
 On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright wrote:
 On 1/19/2019 1:00 AM, Temtaime wrote:
 C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem? Better to speed up compilation instead.

 You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.

 If you've benchmarked this, could you please post your benchmark source so people can reproduce it? It would probably be good to gather data from more than one PC. Maybe make a mini-survey for the results.
I've done a bunch of measurements while I was working on https://github.com/dlang/DIPs/blob/master/DIPs/DIP1005.md, on a modern machine with an SSD and Linux (which aggressively caches file contents). I don't think I still have the code, but it shouldn't be difficult to sit down and produce some.

The overall conclusion of those experiments was that if you want to improve compilation speed, you need to minimize the number of files opened; once a file was opened, whether it was 1 KB or 100 KB made virtually no difference. One thing I didn't measure was whether opening the file accounted for most of the overhead, or whether closing also had a large share.
Jan 19
Pjotr Prins <pjotr.public12 thebird.nl> writes:
On Saturday, 19 January 2019 at 16:30:39 UTC, Andrei Alexandrescu 
wrote:
 On 1/19/19 4:12 AM, FeepingCreature wrote:
 On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright 
 wrote:
 On 1/19/2019 1:00 AM, Temtaime wrote:
 C'mon, everyone has an SSD, and the OS tends to cache previously opened files. What's the problem? Better to speed up compilation instead.

 You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.

 If you've benchmarked this, could you please post your benchmark source so people can reproduce it? It would probably be good to gather data from more than one PC. Maybe make a mini-survey for the results.

 I've done a bunch of measurements while I was working on https://github.com/dlang/DIPs/blob/master/DIPs/DIP1005.md, on a modern machine with an SSD and Linux (which aggressively caches file contents). I don't think I still have the code, but it shouldn't be difficult to sit down and produce some. The overall conclusion of those experiments was that if you want to improve compilation speed, you need to minimize the number of files opened; once a file was opened, whether it was 1 KB or 100 KB made virtually no difference. One thing I didn't measure was whether opening the file accounted for most of the overhead, or whether closing also had a large share.
I deal with large compressed files. For large data, lz4 would probably be a better choice than zip these days. And, even with cached lookups of dir entries, I think one file that is read sequentially will always be an improvement. Note also that compressed files may even be faster than uncompressed ones with some system configurations (relatively slow disk IO, many processors).

Another thing to look at is indexed compressed files, for example http://www.htslib.org/doc/tabix.html. Using those we may partition Phobos into sensible sub-sections, particularly sectioning out those submodules people hardly ever use.
Jan 20
Kagamin <spam here.lot> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 We already have https://dlang.org/phobos/std_zip.html to do the 
 dirty work. We can experiment to see if compressed zips are 
 faster than uncompressed ones.
BTW, Firefox uses the fast compression option (indicated by general purpose flag 2); std.zip uses the default compression option.
Jan 19
Boris-Barboris <ismailsiege gmail.com> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 Andrei and I were talking on the phone today, trading ideas 
 about speeding up importation of Phobos files. Any particular D 
 file tends to import much of Phobos, and much of Phobos imports 
 the rest of it. We've both noticed that file size doesn't seem 
 to matter much for importation speed, but file lookups remain 
 slow.

 So looking up fewer files would make it faster.
It sounds rather strange that on modern operating systems, which cache the files themselves and the metadata even more aggressively (the VFS directory cache, for example), file lookup is a problem. Calls to something like glob(3) could bring the whole Phobos directory tree into memory in milliseconds.
Jan 19
sarn <sarn theartofmachinery.com> writes:
On Saturday, 19 January 2019 at 15:25:37 UTC, Boris-Barboris 
wrote:
 On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright 
 wrote:
 Andrei and I were talking on the phone today, trading ideas 
 about speeding up importation of Phobos files. Any particular 
 D file tends to import much of Phobos, and much of Phobos 
 imports the rest of it. We've both noticed that file size 
 doesn't seem to matter much for importation speed, but file 
 lookups remain slow.

 So looking up fewer files would make it faster.
 It sounds rather strange that on modern operating systems, which cache the files themselves and the metadata even more aggressively (the VFS directory cache, for example), file lookup is a problem. Calls to something like glob(3) could bring the whole Phobos directory tree into memory in milliseconds.
It's a known problem that Windows can't cache file metadata as aggressively as Posix systems because of differences in filesystem semantics. (See https://github.com/Microsoft/WSL/issues/873#issuecomment-425272829) Walter did say he saw benchmarked improvements on Linux for Warp, though.
Jan 19
Neia Neutuladh <neia ikeran.org> writes:
On Sat, 19 Jan 2019 00:45:27 -0800, Walter Bright wrote:
 Andrei and I were talking on the phone today, trading ideas about
 speeding up importation of Phobos files. Any particular D file tends to
 import much of Phobos, and much of Phobos imports the rest of it. We've
 both noticed that file size doesn't seem to matter much for importation
 speed, but file lookups remain slow.
 
 So looking up fewer files would make it faster.
I compiled one file with one extra -I directive as a test. The file had two imports, one for url.d (which depends on std.conv and std.string) and one for std.stdio.

For each transitive import encountered, the compiler:

* looked at each import path (total of four: ., druntime, phobos, and urld's source dir)
* looked at each possible extension (.d, .di, and none)
* built a matching filename
* checked if it existed

Maybe it should do a shallow directory listing for the current directory and every -I path at startup, using that information to prune the list of checks it needs to do (a sketch of the idea follows below). It could also do that recursively when importing from a subpackage. Then it would have called 'opendir' about four times and readdir() about 70 times; it wouldn't ever have had to call exists(). It would allocate a lot less memory. And that's a change that will help people who don't zip up their source code every time they want to compile it.

It could even do this in another thread while parsing the files passed on the command line. That would require a bit of caution, of course.
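A minimal sketch of that listing-based lookup (hypothetical helper names, not the actual patch):

    // List each directory once and remember its contents, so per-candidate
    // exists() probes become in-memory hash lookups.
    import std.file : dirEntries, exists, isDir, SpanMode;
    import std.path : baseName, dirName;

    bool[string][string] dirCache; // directory -> set of entry names

    bool cachedExists(string path)
    {
        auto dir = path.dirName;
        if (dir !in dirCache)
        {
            dirCache[dir] = null; // mark as listed, even if missing or empty
            if (dir.exists && dir.isDir)
                foreach (entry; dirEntries(dir, SpanMode.shallow))
                    dirCache[dir][entry.name.baseName] = true;
        }
        return (path.baseName in dirCache[dir]) !is null;
    }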
Jan 19
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Jan 19, 2019 at 06:05:13PM +0000, Neia Neutuladh via Digitalmars-d
wrote:
[...]
 I compiled one file with one extra -I directive as a test. The file
 had two imports, one for url.d (which depends on std.conv and
 std.string) and one for std.stdio.
 
 For each transitive import encountered, the compiler:
 * looked at each import path (total of four: ., druntime, phobos, and
 urld's source dir)
 * looked at each possible extension (.d, .di, and none)
 * built a matching filename
 * checked if it existed
 
 Maybe it should do a shallow directory listing for the current
 directory and every -I path at startup, using that information to
 prune the list of checks it needs to do. It could also do that
 recursively when importing from a subpackage. Then it would have
 called 'opendir' about four times and readdir() about 70 times; it
 wouldn't ever have had to call exists().  It would allocate a lot less
 memory. And that's a change that will help people who don't zip up
 their source code every time they want to compile it.
[...]

Excellent finding! I *knew* something was off when looking up a file is more expensive than reading it. I'm thinking a quick fix could be to just cache the intermediate results of each lookup, and reuse those instead of issuing another call to opendir() each time. I surmise that after this change, this issue may no longer even be a problem anymore.

T

--
It is of the new things that men tire --- of fashions and proposals and improvements and change. It is the old things that startle and intoxicate. It is the old things that are young. -- G.K. Chesterton
Jan 19
Neia Neutuladh <neia ikeran.org> writes:
On Sat, 19 Jan 2019 11:56:29 -0800, H. S. Teoh wrote:
 Excellent finding! I *knew* something was off when looking up a file is
 more expensive than reading it.  I'm thinking a quick fix could be to
 just cache the intermediate results of each lookup, and reuse those
 instead of issuing another call to opendir() each time.  I surmise that
 after this change, this issue may no longer even be a problem anymore.
It doesn't even call opendir(). It assembles each potential path and calls exists(). Which might be better for only having a small number of imports, but that's not the common case.

I've a partial fix for Posix, and I'll see about getting dev tools running in WINE to get a Windows version. (Which isn't exactly the same, but if I find a difference in FindFirstFile / FindNextFile between Windows and WINE, I'll be surprised.) I'm not sure what it should do when the same module is found in multiple locations, though -- the current code seems to take the first match. I'm also not sure whether it should be lazy or not.

Also symlinks and case-insensitive filesystems are annoying.

https://github.com/dhasenan/dmd/tree/fasterimport
Jan 19
Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 19 January 2019 at 20:32:07 UTC, Neia Neutuladh 
wrote:
 Also symlinks and case-insensitive filesystems are annoying.

 https://github.com/dhasenan/dmd/tree/fasterimport
Considering it's only ever used in one place, it's not a candidate for root. Maybe you could just put it in src.
Jan 19
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/19/19 3:32 PM, Neia Neutuladh wrote:
 On Sat, 19 Jan 2019 11:56:29 -0800, H. S. Teoh wrote:
 Excellent finding! I *knew* something was off when looking up a file is
 more expensive than reading it.  I'm thinking a quick fix could be to
 just cache the intermediate results of each lookup, and reuse those
 instead of issuing another call to opendir() each time.  I surmise that
 after this change, this issue may no longer even be a problem anymore.
 It doesn't even call opendir(). It assembles each potential path and calls exists(). Which might be better for only having a small number of imports, but that's not the common case.

 I've a partial fix for Posix, and I'll see about getting dev tools running in WINE to get a Windows version. (Which isn't exactly the same, but if I find a difference in FindFirstFile / FindNextFile between Windows and WINE, I'll be surprised.) I'm not sure what it should do when the same module is found in multiple locations, though -- the current code seems to take the first match. I'm also not sure whether it should be lazy or not.

 Also symlinks and case-insensitive filesystems are annoying.

 https://github.com/dhasenan/dmd/tree/fasterimport
This is great, looking forward to seeing this improvement merged. (There are packaging- and distribution-related advantages to archives independent of this.)
Jan 20
Doc Andrew <x x.com> writes:
On Sunday, 20 January 2019 at 15:10:58 UTC, Andrei Alexandrescu 
wrote:
 On 1/19/19 3:32 PM, Neia Neutuladh wrote:
 [...]
 This is great, looking forward to seeing this improvement merged. (There are packaging- and distribution-related advantages to archives independent of this.)

Andrei,

Are you envisioning something like the JAR format Java uses? That would be pretty convenient for D library installation and imports...

-Doc
Jan 20
Walter Bright <newshound2 digitalmars.com> writes:
On 1/20/2019 9:53 AM, Doc Andrew wrote:
     Are you envisioning something like the JAR format Java uses?
jar/zip/arc/tar/lib/ar/cab/lzh/whatever

They're all the same under the hood - a bunch of files concatenated together with a table of contents.
Jan 20
Adam Wilson <flyboynw gmail.com> writes:
On 1/20/19 9:31 PM, Walter Bright wrote:
 On 1/20/2019 9:53 AM, Doc Andrew wrote:
     Are you envisioning something like the JAR format Java uses?
 jar/zip/arc/tar/lib/ar/cab/lzh/whatever

 They're all the same under the hood - a bunch of files concatenated together with a table of contents.

I notice a trend here. You eventually end up at the Java/JAR or .NET/Assembly model where everything needed to compile against a library is included in a single file. I have often wondered how hard it would be to teach D to automatically include the contents of a DI file in a library file and read the contents of the library at compile time.

Having studied dependencies/packaging in D quite a bit, a readily apparent observation is that D is much more difficult to work with than Java/.NET/JavaScript in regards to packaging. A unified interface/implementation format would go a LONG way to simplifying the problem.

I even built a ZIP-based packaging model for D.

--
Adam Wilson
IRC: EllipticBit
import quiet.dlang.dev;
Jan 21
Jacob Carlborg <doob me.com> writes:
On 2019-01-21 11:21, Adam Wilson wrote:

 I notice a trend here. You eventually end up at the Java/JAR or 
 .NET/Assembly model where everything needed to compile against a library 
 is included in a single file. I have often wondered how hard it would be 
 to teach D how to automatically include the contents of a DI file into a 
 library file and read the contents of the library at compile time.
 
 Having studied dependencies/packaging in D quite a bit; a readily 
 apparent observation is that D is much more difficult to work with than 
 Java/.NET/JavaScript in regards to packaging. Unified 
 interface/implementation would go a LONG way to simplifying the problem.
 
 I even built a ZIP based packing model for D.
For distributing libraries you use Dub. For distributing applications to end users you distribute the executable.

--
/Jacob Carlborg
Jan 21
Adam Wilson <flyboynw gmail.com> writes:
On 1/21/19 3:50 AM, Jacob Carlborg wrote:
 On 2019-01-21 11:21, Adam Wilson wrote:
 
 I notice a trend here. You eventually end up at the Java/JAR or 
 .NET/Assembly model where everything need to compile against a library 
 is included in a single file. I have often wondered how hard it would 
 be to teach D how to automatically include the contents of a DI file 
 into a library file and read the contents of the library at compile time.

 Having studied dependencies/packaging in D quite a bit; a readily 
 apparent observation is that D is much more difficult to work with 
 than Java/.NET/JavaScript in regards to packaging. Unified 
 interface/implementation would go a LONG way to simplifying the problem.

 I even built a ZIP based packing model for D.
 For distributing libraries you use Dub. For distributing applications to end users you distribute the executable.

DUB does nothing to solve the file lookup problem, so I am curious: how does it apply to this conversation? I was talking about how the files are packaged for distribution, not the actual distribution of the package itself.

And don't even get me started on DUB...

--
Adam Wilson
IRC: EllipticBit
import quiet.dlang.dev;
Jan 21
Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Saturday, 19 January 2019 at 20:32:07 UTC, Neia Neutuladh 
wrote:
 https://github.com/dhasenan/dmd/tree/fasterimport
Any benchmarks?
Jan 20
Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/19/19 3:32 PM, Neia Neutuladh wrote:
 On Sat, 19 Jan 2019 11:56:29 -0800, H. S. Teoh wrote:
 Excellent finding! I *knew* something was off when looking up a file is
 more expensive than reading it.  I'm thinking a quick fix could be to
 just cache the intermediate results of each lookup, and reuse those
 instead of issuing another call to opendir() each time.  I surmise that
 after this change, this issue may no longer even be a problem anymore.
 It doesn't even call opendir(). It assembles each potential path and calls exists(). Which might be better for only having a small number of imports, but that's not the common case.

 I've a partial fix for Posix, and I'll see about getting dev tools running in WINE to get a Windows version. (Which isn't exactly the same, but if I find a difference in FindFirstFile / FindNextFile between Windows and WINE, I'll be surprised.) I'm not sure what it should do when the same module is found in multiple locations, though -- the current code seems to take the first match. I'm also not sure whether it should be lazy or not.

 Also symlinks and case-insensitive filesystems are annoying.

 https://github.com/dhasenan/dmd/tree/fasterimport

I wonder if packages could be used to eliminate possibilities. For example, std is generally ONLY going to be under a phobos import path. That can eliminate any other import directives from even being tried (if they don't have a std directory, there's no point in looking for an std.algorithm directory in there). Maybe you already implemented this, I'm not sure; a rough sketch of what I mean is below.

I still find it difficult to believe that calling exists() x4 is a huge culprit. But certainly, caching a directory structure is going to be more efficient than reading it every time.

-Steve
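Something along these lines (hypothetical names; assumes each -I root is shallow-listed once at startup):

    // Map each top-level package directory to the roots that contain it,
    // so std.* is only ever searched under roots that have a std directory.
    import std.algorithm.searching : findSplit;
    import std.file : dirEntries, SpanMode;
    import std.path : baseName;

    string[][string] rootsByPackage; // "std" -> ["/path/to/phobos", ...]

    void indexRoots(string[] importPaths)
    {
        foreach (root; importPaths)
            foreach (entry; dirEntries(root, SpanMode.shallow))
                if (entry.isDir)
                    rootsByPackage[entry.name.baseName] ~= root;
    }

    string[] candidateRoots(string moduleName) // e.g. "std.algorithm"
    {
        auto parts = moduleName.findSplit(".");
        if (auto roots = parts[0] in rootsByPackage)
            return *roots;
        return null; // top-level module: search every root as before
    }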
Jan 21
Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Monday, 21 January 2019 at 19:01:57 UTC, Steven Schveighoffer 
wrote:
 I still find it difficult to believe that calling exists x4 is 
 a huge culprit. But certainly, caching a directory structure is 
 going to be more efficient than reading it every time.
For large directories, opendir+readdir, especially with stat, is much slower than open/access. Most filesystems already use a hash table or equivalent, so looking up a known file name is faster because it's a hash table lookup.

This whole endeavor generally seems like poorly reimplementing what the OS should already be doing.
Jan 21
Neia Neutuladh <neia ikeran.org> writes:
On Mon, 21 Jan 2019 19:10:11 +0000, Vladimir Panteleev wrote:
 On Monday, 21 January 2019 at 19:01:57 UTC, Steven Schveighoffer wrote:
 I still find it difficult to believe that calling exists x4 is a huge
 culprit. But certainly, caching a directory structure is going to be
 more efficient than reading it every time.
 For large directories, opendir+readdir, especially with stat, is much slower than open/access.

We can avoid stat() except with symbolic links. Opendir + readdir for my example would be about 500 system calls, so it breaks even with `import std.stdio;`, assuming the cost per call is identical and we're reading eagerly. Testing shows that this is the case.

With a C preprocessor, though, you're dealing with /usr/share with thousands of header files.
 This whole endeavor generally seems like poorly reimplementing what the
 OS should already be doing.
The OS doesn't have a "find a file with one of this handful of names among these directories" call.
Jan 21
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 21, 2019 at 07:10:11PM +0000, Vladimir Panteleev via Digitalmars-d
wrote:
 On Monday, 21 January 2019 at 19:01:57 UTC, Steven Schveighoffer wrote:
 I still find it difficult to believe that calling exists x4 is a
 huge culprit. But certainly, caching a directory structure is going
 to be more efficient than reading it every time.
 For large directories, opendir+readdir, especially with stat, is much slower than open/access. Most filesystems already use a hash table or equivalent, so looking up a known file name is faster because it's a hash table lookup.

 This whole endeavor generally seems like poorly reimplementing what the OS should already be doing.

I can't help wondering why we're making so much noise about a few milliseconds on opening/reading import files, when there's the elephant in the room of the 3-5 *seconds* of compile time added by the mere act of using a single instance of std.regex.Regex.

Shouldn't we be doing something about that first??

T

--
Verbing weirds language. -- Calvin (& Hobbes)
Jan 21
Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 21 January 2019 at 19:42:57 UTC, H. S. Teoh wrote:
 I can't help wondering why we're making so much noise about a 
 few milliseconds on opening/reading import files, when there's 
 the elephant in the room of the 3-5 *seconds* of compile-time 
 added by the mere act of using a single instance of 
 std.regex.Regex.

 Shouldn't we be doing something about that first??


 T
I am on it :P I cannot do it any faster than I am currently doing it, though. I am also still pursuing first-class functions as a replacement for recursive templates.
Jan 21
Thomas Mader <thomas.mader gmail.com> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 We've both noticed that file size doesn't seem to matter much 
 for importation speed, but file lookups remain slow.
Maybe there are still some tricks to apply to make the lookup faster? Ripgrep [1] is, to my knowledge, the fastest grepping tool out there currently. Its speed is probably mostly about the grepping itself, but to be fast it needs to get to the files fast too, so it might be helpful to look at its code.

[1] https://github.com/BurntSushi/ripgrep
Jan 20
Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 This can be a fun challenge! Anyone up for it?
This might get you some speed for the first compilation (fewer directory entry lookups), but I would expect follow-up compilations to have negligible overhead, compared to the CPU time needed to actually process the source code.
Jan 20
Walter Bright <newshound2 digitalmars.com> writes:
On 1/20/2019 10:21 PM, Vladimir Panteleev wrote:
 This might get you some speed for the first compilation (fewer directory entry lookups), but I would expect follow-up compilations to have negligible overhead, compared to the CPU time needed to actually process the source code.
In my benchmarks with Warp, the slowdown persisted with multiple sequential runs.
Jan 21
Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Monday, 21 January 2019 at 08:11:22 UTC, Walter Bright wrote:
 On 1/20/2019 10:21 PM, Vladimir Panteleev wrote:
 This might get you some speed for the first compilation (fewer directory entry lookups), but I would expect follow-up compilations to have negligible overhead, compared to the CPU time needed to actually process the source code.

 In my benchmarks with Warp, the slowdown persisted with multiple sequential runs.
Would be nice if the benchmarks were reproducible. Some things to consider:

- Would the same result apply to Phobos files? Some things that could differ between the two and affect the viability of this approach:
  - the total number of files
  - the size of the files
  - the relative processing time needed to actually parse the files (presumably, a C preprocessor would be faster than a full D compiler)
- Could the same result be achieved through other means, especially ones that don't introduce disruptive changes to tooling? For example, re-enabling the parallel / preemptive loading of source files in DMD.

Using JAR-like files would be disruptive to many tools. Consider:

- What should the file/path part of an error message look like when the location is inside Phobos?
- How should tools such as DCD, which need to scan source code, adapt to these changes?
- How should tools and editors with a "go to definition" function work when the definition is inside an archive, especially in editors which do not support visiting files inside archives?

Even if you have obvious answers to these questions, they still need to be implemented, so the speed gain from such a change would need to be significant in order to justify the disruption.
Jan 21
Walter Bright <newshound2 digitalmars.com> writes:
On 1/21/2019 1:02 AM, Vladimir Panteleev wrote:
 Even if you have obvious answers to these questions, they still need to be 
 implemented, so the speed gain from such a change would need to be significant 
 in order to justify the disruption.
The only way to get definitive answers is to try it. Fortunately, it isn't that difficult.
Jan 21
Neia Neutuladh <neia ikeran.org> writes:
On Sat, 19 Jan 2019 00:45:27 -0800, Walter Bright wrote:
 Andrei and I were talking on the phone today, trading ideas about
 speeding up importation of Phobos files. Any particular D file tends to
 import much of Phobos, and much of Phobos imports the rest of it. We've
 both noticed that file size doesn't seem to matter much for importation
 speed, but file lookups remain slow.
I should have started out by testing this. I replaced the file lookup with, essentially:

    if (name.startsWith("std"))
        filename = "/phobos/" ~ name.replace(".", "/") ~ ".d";
    else
        filename = "/druntime/" ~ name.replace(".", "/") ~ ".d";

plus a hard-coded set of package.d references.

Before, compiling my test file took about 0.67 to 0.70 seconds. After, it took about 0.67 to 0.70 seconds.

There is no point in optimizing filesystem access for importing phobos at this time.
Jan 21
Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 This can be a fun challenge! Anyone up for it?
I wrote about this idea in my blog today:

http://dpldocs.info/this-week-in-d/Blog.Posted_2019_01_21.html#my-thoughts-on-forum-discussions

In short, it may be a fun challenge, and may be useful to some library distributors, but I don't think it is actually worth it.
Jan 21
Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/21/19 4:13 PM, Adam D. Ruppe wrote:
 On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 This can be a fun challenge! Anyone up for it?
 I wrote about this idea in my blog today:
 http://dpldocs.info/this-week-in-d/Blog.Posted_2019_01_21.html#my-thoughts-on-forum-discussions

 In short, it may be a fun challenge, and may be useful to some library distributors, but I don't think it is actually worth it.
Lots of good thoughts there, most of which I agree with. Thanks for sharing.

One note -- I don't think modules like std.datetime were split up for the sake of compiler parsing speed. I thought they were split up to a) avoid the insane ddoc generation that came from it, and b) reduce dependencies on symbols that you didn't care about. Not to mention that github would refuse to load std.datetime for any PRs :)

But it does help to consider the cost of finding the file and the cost of using the file separately, and see how they compare.

-Steve
Jan 21
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 21, 2019 at 04:38:21PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 One note -- I don't think modules like std.datetime were split up for
 the sake of the compiler parsing speed, I thought they were split up
 to a) avoid the insane ddoc generation that came from it, and b)
 reduce dependencies on symbols that you didn't care about. Not to
 mention that github would refuse to load std.datetime for any PRs :)
And also, I originally split up std.algorithm (at Andrei's protest) because it was so ridiculously huge that I couldn't get unittests to run on my PC without dying with out-of-memory errors.
 But it does help to consider the cost of finding the file and the cost
 of using the file separately, and see how they compare.
[...]

I still think a lot of this effort is misdirected -- we're trying to hunt small fish while there's a shark in the pond. Instead of trying to optimize file open / read times, what we *should* be doing is reducing the number of recursive templates that heavy-weight Phobos modules like std.regex use, or improving the template expansion strategies (e.g., the various PRs that have been checked in to replace O(n) recursive template expansions with O(log n), or O(n^2) with O(n), etc.), or, for that matter, optimizing how the compiler processes templates so that it performs better.

Optimizing file open / file read in the face of these much heavier components in the compiler sounds to me like straining out the gnat while swallowing the camel.

T

--
Once upon a time there lived a king, and with him lived a flea.
Jan 21
12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 21 January 2019 at 21:52:01 UTC, H. S. Teoh wrote:
 On Mon, Jan 21, 2019 at 04:38:21PM -0500, Steven Schveighoffer 
 via Digitalmars-d wrote: [...]
 [...]
 And also, I originally split up std.algorithm (at Andrei's protest) because it was so ridiculously huge that I couldn't get unittests to run on my PC without dying with out-of-memory errors.

 [...]

Does dmd ever do dynamic programming when it does recursive templates?

-Alex
Jan 21
Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 21 January 2019 at 21:38:21 UTC, Steven Schveighoffer 
wrote:
 One note -- I don't think modules like std.datetime were split 
 up for the sake of the compiler parsing speed
Yeah, I think std.datetime was about the size of unittest runs, but std.range, as I recall at least, was specifically to separate stuff to get quicker builds by avoiding the majority of the import for common cases via std.range.primitives, which doesn't need to bring in as much code. The local import pattern also helps with this goal - lazy imports to only get what you need when you need it, so the compiler doesn't have to do as much work.

Maybe I am wrong about that, but still, two files that import each other aren't actually two modules.

Phobos HAS been making a LOT of progress toward untangling that import mess and addressing specific compile time problems. A few years ago, any phobos import would cost you like a half second. It is down to a quarter second for the hello world example. Which is IMO still quite poor, but a lot better than it was.

But is this caused by finding the files?

$ cd dmd2/src/phobos
$ time find .

real    0m0.003s

I find that very hard to believe. And let us remember, old D1 phobos wasn't this slow:

$ cat hi.d
import std.stdio;
void main() { writefln("Hello!"); }

$ time dmd-1.0 hi.d

real    0m0.042s
user    0m0.032s
sys     0m0.005s

$ time dmd hi.d

real    0m0.434s
user    0m0.383s
sys     0m0.044s

Using the old D compilers reminds me what quick compiles REALLY are. Sigh.
Jan 21
Adam D. Ruppe <destructionator gmail.com> writes:
BTW

dmd2/src/phobos$ time cat `find . | grep -E '\.d$'` > catted.d

real    0m0.015s
user    0m0.006s
sys     0m0.009s

$ wc catted.d
   319707  1173911 10889167 catted.d


If it were the filesystem at fault, shouldn't that I/O heavy 
operation take a significant portion of the dmd runtime?

Yes, I know the kernel is caching these things and deferring 
writes and so on. But it does that for dmd too! Blaming the 
filesystem doesn't pass the prima facie test, at least on Linux. 
Maybe Windows is different, I will try that tomorrow, but I 
remain exceedingly skeptical.
Jan 21
Arun Chandrasekaran <aruncxy gmail.com> writes:
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
 Andrei and I were talking on the phone today, trading ideas 
 about speeding up importation of Phobos files. Any particular D 
 file tends to import much of Phobos, and much of Phobos imports 
 the rest of it. We've both noticed that file size doesn't seem 
 to matter much for importation speed, but file lookups remain 
 slow.

 So looking up fewer files would make it faster.

 If phobos.zip is opened as a memory mapped file, whenever 
 std/stdio.d is read, the file will be "faulted" into memory 
 rather than doing a file lookup / read. We're speculating that 
 this should be significantly faster,
Speaking for Linux, the kernel already caches the file (after the first read) unless `echo 3 > /proc/sys/vm/drop_caches` is triggered. I've tested with the entire Phobos cached, and the compilation is still slow. IO is not the bottleneck here; the compilation itself needs to be sped up.

If you still think the file read is the culprit, why does recompilation take the same amount of time as the first compilation (despite the kernel file cache)?
 being very convenient for the user to treat Phobos as a single 
 file rather than a blizzard. (phobos.lib could also be in the 
 same file!)
We already have std.experimental.all for convenience.
Jan 21
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, January 21, 2019 5:46:32 PM MST Arun Chandrasekaran via 
Digitalmars-d wrote:
 being very convenient for the user to treat Phobos as a single
 file rather than a blizzard. (phobos.lib could also be in the
 same file!)
We already have std.experimental.all for convenience.
If I understand correctly, that's an orthogonal issue. What Walter is proposing wouldn't change how any code imported anything. Rather, it would just change how the compiler reads the files. So, anyone wanting to import all of Phobos at once would still need something like std.experimental.all.

But regardless of how much you were importing from Phobos, dmd would read in all of Phobos at once, because it would be a single zip file. It would then only actually compile what it needed to for the imports in your program, but it would have read the entire zip file into memory so that it would only have to open one file instead of searching for and opening each file individually.

- Jonathan M Davis
Jan 21
Neia Neutuladh <neia ikeran.org> writes:
On Tue, 22 Jan 2019 00:46:32 +0000, Arun Chandrasekaran wrote:
 If you still think the file read is the culprit, why does recompilation
 take the same amount of time as the first compilation (despite the kernel
 file cache)?
And another quick way to test this is to use import() and a hard-coded switch statement instead of IO. Just get rid of all disk access and see how fast you can compile code. I'm betting you'll save 10% at most.
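For reference, the mechanism would look something like this inside the compiler's file loader (the sources get baked in at dmd's own build time via -J pointing at the source roots; the module list here is illustrative):

    // String imports: the file contents become compile-time constants,
    // so looking up these modules costs no disk access when dmd runs.
    string loadSource(string moduleName)
    {
        switch (moduleName)
        {
            case "std.stdio": return import("std/stdio.d");
            case "std.conv":  return import("std/conv.d");
            // ... one case per module the test program pulls in
            default: assert(0, "module not baked in: " ~ moduleName);
        }
    }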
Jan 21