digitalmars.D - Speeding up importing Phobos files
- Walter Bright (22/22) Jan 19 2019 Andrei and I were talking on the phone today, trading ideas about speedi...
- Stefan Koch (7/14) Jan 19 2019 If we are going there we might as well use a proper database as
- H. S. Teoh (11/23) Jan 19 2019 [...]
- Temtaime (4/31) Jan 19 2019 C'mon, everyone has a SSD, OS tends to cache previously opened
- Walter Bright (3/6) Jan 19 2019 You'd think that'd be true, but it isn't. File reads are fast, but file ...
- FeepingCreature (5/12) Jan 19 2019 If you've benchmarked this, could you please post your benchmark
- Walter Bright (7/11) Jan 19 2019 I benchmarked it while developing Warp (the C preprocessor replacement I...
- Andrei Alexandrescu (11/24) Jan 19 2019 I've done a bunch of measurements while I was working on
- Pjotr Prins (13/40) Jan 20 2019 I deal with large compressed files. For large data lz4 would
- Kagamin (3/6) Jan 19 2019 BTW firefox uses fast compression option indicated by general
- Boris-Barboris (6/13) Jan 19 2019 Sounds rather strange that on modern operating systems, that
- sarn (8/23) Jan 19 2019 It's a known problem that Windows can't cache file metadata as
- Neia Neutuladh (20/27) Jan 19 2019 I compiled one file with one extra -I directive as a test. The file had
- H. S. Teoh (11/30) Jan 19 2019 [...]
- Neia Neutuladh (13/18) Jan 19 2019 It doesn't even call opendir(). It assembles each potential path and cal...
- Stefan Koch (5/7) Jan 19 2019 Considering it's only ever used in one place it's not a canidate
- Andrei Alexandrescu (4/27) Jan 20 2019 This is great, looking forward to seeing this improvement merged. (There...
- Doc Andrew (7/12) Jan 20 2019 Andrei,
- Walter Bright (4/5) Jan 20 2019 jar/zip/arc/tar/lib/ar/cab/lzh/whatever
- Adam Wilson (15/22) Jan 21 2019 I notice a trend here. You eventually end up at the Java/JAR or
- Jacob Carlborg (5/17) Jan 21 2019 For distributing libraries you use Dub. For distributing applications to...
- Adam Wilson (10/28) Jan 21 2019 DUB does nothing to solve the file lookup problem so I am curious, how
- Vladimir Panteleev (3/4) Jan 20 2019 Any benchmarks?
- Steven Schveighoffer (11/35) Jan 21 2019 I wonder if packages could be used to eliminate possibilities.
- Vladimir Panteleev (8/11) Jan 21 2019 For large directories, opendir+readdir, especially with stat, is
- Neia Neutuladh (9/18) Jan 21 2019 We can avoid stat() except with symbolic links.
- Andrei Alexandrescu (9/23) Jun 08 2019 Another simple test:
- Steven Schveighoffer (4/30) Jun 10 2019 Might it be due to something like this?
- H. S. Teoh (9/21) Jan 21 2019 I can't help wondering why we're making so much noise about a few
- Stefan Koch (5/12) Jan 21 2019 I am on it :P
- Thomas Mader (8/10) Jan 20 2019 Maybe there are still some tricks to apply to make the lookup
- Vladimir Panteleev (5/6) Jan 20 2019 This might get you some speed for the first compilation (fewer
- Walter Bright (2/5) Jan 21 2019 In my benchmarks with Warp, the slowdown persisted with multiple sequent...
- Vladimir Panteleev (27/34) Jan 21 2019 Would be nice if the benchmarks were reproducible.
- Walter Bright (3/6) Jan 21 2019 The only way to get definitive answers is to try it. Fortunately, it isn...
- Neia Neutuladh (12/17) Jan 21 2019 I should have started out by testing this.
- Andrei Alexandrescu (3/26) Jun 08 2019 Word. (Unless the libs are installed over a networked mount. Not sure
- Adam D. Ruppe (5/6) Jan 21 2019 I wrote about this idea in my blog today:
- Steven Schveighoffer (10/21) Jan 21 2019 Lot of good thoughts there, most of which I agree with. Thanks for shari...
- H. S. Teoh (21/28) Jan 21 2019 And also, I originally split up std.algorithm (at Andrei's protest)
- 12345swordy (4/12) Jan 21 2019 Does dmd ever do dynamic programming when it does recursive
- Andrei Alexandrescu (2/3) Jun 08 2019 Shouldn't have been splitted.
- H. S. Teoh (9/13) Jun 08 2019 It should have been. The old std.algorithm was a monster of 10,000 LOC
- Guillaume Piolat (3/16) Jun 08 2019 +1
- Andrei Alexandrescu (2/3) Jun 08 2019 That should indeed have been broken as it was.
- Andrei Alexandrescu (4/7) Jun 08 2019 The appropriate response would have been (and still is) to fix the
- Nicholas Wilson (4/6) Jun 08 2019 OTOH, a more sparse working set will accelerate development since
- Adam D. Ruppe (6/8) Jun 08 2019 I personally find it is a LOT easier to work with one big file
- matheus (7/12) Jun 08 2019 Well a friend of mine would discord on this over his hog electron
- Atila Neves (10/18) Jun 08 2019 I never understand complaints about where files are located and
- Nick Sabalausky (Abscissa) (9/17) Jun 08 2019 Sheesh, this is *exactly* the sort of "Perfection is the enemy of the
- Andrei Alexandrescu (2/14) Jun 09 2019 I did allow the breaking up of std.algorithm.
- Nick Sabalausky (Abscissa) (19/34) Jun 09 2019 Yes, and that's definitely good - after all, it gave us a stopgap fix
- Andrei Alexandrescu (6/9) Jun 09 2019 I can tell at least what I tried to do - use good judgment for each such...
- Amex (18/29) Jun 10 2019 Has anyone merged all of phobos in to one large file, removed all
- Andrei Alexandrescu (2/34) Jun 10 2019 Not if that 1% costs 99% of your budget.
- Adam D. Ruppe (37/39) Jan 21 2019 Yeah, I think std.datetime was about the size of unittest runs,
- Adam D. Ruppe (14/14) Jan 21 2019 BTW
- Arun Chandrasekaran (10/24) Jan 21 2019 Speaking from Linux, the kernel already caches the file (after
- Jonathan M Davis (13/17) Jan 21 2019 If I understand correctly, that's an orthogonal issue. What Walter is
- Neia Neutuladh (4/7) Jan 21 2019 And another quick way to test this is to use import() and a hard-coded
- Mike Franklin (56/62) Jun 07 2019 The topic of import speed when an implementation is spread over
- KnightMare (22/22) Jun 07 2019 zip-archive allows you to unpack the file in its original form.
- KnightMare (11/13) Jun 07 2019 u can unzip w/o any special tools - OSes can work with zip usually
- Nick Sabalausky (Abscissa) (6/9) Jun 07 2019 That's an interesting approach to the issue: Just allow a package to be
- Patrick Schluter (4/14) Jun 07 2019 Isn't it what Java does? A jar file is nothing more than a zip
- H. S. Teoh (21/40) Jun 07 2019 Why final abstract class? If all you have are static properties, you
- Mike Franklin (16/27) Jun 07 2019 What I'm proposing is that a library's organization can be one
- Gregor Mückl (5/21) Jun 07 2019 How would compilation even work with multiple modules per file?
- Adam D. Ruppe (8/12) Jun 07 2019 It doesn't have to search because you pass the modules to the
- H. S. Teoh (9/15) Jun 07 2019 [...]
- KnightMare (5/8) Jun 07 2019 or to use LZ4 for no-dependencies from zlib and smaller code:
- Seb (13/41) Jun 07 2019 Reading files is really cheap, evaluating templates and running
- Mike Franklin (8/12) Jun 07 2019 Yes that make much more sense to me. But, if that's the case,
- Mike Franklin (5/16) Jun 08 2019 Another recent comment from Walter:
- Seb (8/20) Jun 13 2019 If they wanted to make DMD faster, compiling with LDC would make
- Nick Sabalausky (Abscissa) (20/23) Jun 07 2019 We don't really have that limitation. The compiler gets the
- H. S. Teoh (18/28) Jun 07 2019 I honestly doubt it would improve compilation time that much. Reading
- KnightMare (17/20) Jun 07 2019 need to search for existing files in dir-tree with some templates
- H. S. Teoh (12/19) Jun 07 2019 [...]
- Amex (15/42) Jun 07 2019 Why not compile phobos to an object file? Basically store the AST
- KnightMare (29/32) Jun 08 2019 +1 for AST.
- KnightMare (4/6) Jun 08 2019 no need store any unittests and doc/comments in AST-packages.
- Adam D. Ruppe (4/5) Jun 08 2019 This doesn't make a significant difference. Parsing D source into
- KnightMare (5/9) Jun 08 2019 one more idea. allow to store in AST-packages user/any metadata.
- KnightMare (4/7) Jun 08 2019 template code will be taken from AST still.
- KnightMare (5/6) Jun 08 2019 I watched a video now from conference
Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow. So looking up fewer files would make it faster.

Here's the idea: place all Phobos source files into a single zip file, call it phobos.zip (duh). Then:

  dmd myfile.d phobos.zip

and the compiler will look in phobos.zip to resolve, say, std/stdio.d. If phobos.zip is opened as a memory-mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster, besides being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)

It doesn't have to be just Phobos; this can be a general facility. People can distribute their D libraries as a zip file that never needs unzipping. We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.

This can be a fun challenge! Anyone up for it?

P.S. dmd's ability to directly manipulate object library files, rather than going through lib or ar, has been a nice success.
Jan 19 2019
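A minimal sketch of the lookup model Walter describes, in Python as a language-neutral stand-in for the compiler's internals (module contents and names here are hypothetical): once the archive's central directory has been read, resolving std/stdio.d is a single in-memory table lookup rather than a filesystem probe per candidate path.

```python
import io
import zipfile

# Build an in-memory stand-in for phobos.zip with two hypothetical modules.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as z:
    z.writestr("std/stdio.d", "module std.stdio;\n")
    z.writestr("std/algorithm/package.d", "module std.algorithm;\n")

# The compiler would map the archive once; afterwards every import resolves
# against the central directory (an in-memory table), not the filesystem.
archive = zipfile.ZipFile(buf)
index = {info.filename for info in archive.infolist()}

def resolve(path):
    """Resolve an import like 'std/stdio.d' without any path searching."""
    return archive.read(path) if path in index else None

source = resolve("std/stdio.d")
```

A real implementation would memory-map the archive so member reads become page faults, but the lookup structure is the same.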
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow. [...]

If we are going there, we might as well use a proper database as the compiler cache format, similar to pre-compiled headers. I'd be interested to see how this bears out. I am going to devote some time this weekend to it, but using sqlite rather than zip.
Jan 19 2019
On Sat, Jan 19, 2019 at 08:59:37AM +0000, Stefan Koch via Digitalmars-d wrote:
> If we are going there we might as well use a proper database as the compiler cache format, similar to pre-compiled headers.
[...]

I'd like to see us go in this direction. It could lead to other new things, like the compiler inferring attributes for all functions (not just template / auto functions) and storing the inferred attributes in the precompiled cache. It could even store additional derived information not representable in the source that could be used for program-wide optimization, etc.


T

--
Let's call it an accidental feature. -- Larry Wall
Jan 19 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. [...] We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow. So looking up fewer files would make it faster. [...]

C'mon, everyone has a SSD, and the OS tends to cache previously opened files. What's the problem? Better to speed up compilation itself.
Jan 19 2019
On 1/19/2019 1:00 AM, Temtaime wrote:
> C'mon, everyone has a SSD, and the OS tends to cache previously opened files. What's the problem?

You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.
Jan 19 2019
On Saturday, 19 January 2019 at 09:08:00 UTC, Walter Bright wrote:
> You'd think that'd be true, but it isn't. File reads are fast, but file lookups are slow. Searching for a file along a path is particularly slow.

If you've benchmarked this, could you please post your benchmark source so people can reproduce it? Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.
Jan 19 2019
On 1/19/2019 1:12 AM, FeepingCreature wrote:
> If you've benchmarked this, could you please post your benchmark source so people can reproduce it?

I benchmarked it while developing Warp (the C preprocessor replacement I did for Facebook). I was able to speed up searches for .h files substantially by remembering previous lookups in a hash table. The speedup persisted across Windows and Linux.

https://github.com/facebookarchive/warp

> Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.

Sounds like a good idea. Please take charge of this!
Jan 19 2019
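The hash-table memoization Walter describes for Warp can be sketched like this (Python as a neutral stand-in; the directory layout and names are hypothetical): the search path is walked only on a cache miss, so repeated requests for the same header cost one dictionary hit and zero syscalls.

```python
import os
import tempfile

# Hypothetical search path, like the include path Warp walks for .h files.
base = tempfile.mkdtemp()
inc = os.path.join(base, "inc")
os.makedirs(inc)
open(os.path.join(inc, "stdio.h"), "w").close()
search_path = [os.path.join(base, "other"), inc]

cache = {}  # filename -> resolved path or None, filled on first lookup

def lookup(name):
    """Walk the search path on a cache miss; afterwards it's one dict hit."""
    if name in cache:
        return cache[name]
    result = None
    for d in search_path:
        candidate = os.path.join(d, name)
        if os.path.exists(candidate):  # syscalls happen only on a miss
            result = candidate
            break
    cache[name] = result
    return result

first = lookup("stdio.h")   # probes the filesystem
second = lookup("stdio.h")  # served from the hash table
```

Caching negative results (None) matters as much as positive ones, since a preprocessor probes for many headers that do not exist on most of the path.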
On 1/19/19 4:12 AM, FeepingCreature wrote:
> If you've benchmarked this, could you please post your benchmark source so people can reproduce it? Probably be good to gather data from more than one PC. Maybe make a minisurvey for the results.

I've done a bunch of measurements while I was working on https://github.com/dlang/DIPs/blob/master/DIPs/DIP1005.md, on a modern machine with an SSD and Linux (which aggressively caches file contents). I don't think I still have the code, but it shouldn't be difficult to sit down and produce some.

The overall conclusion of those experiments was that if you want to improve compilation speed, you need to minimize the number of files opened; once opened, whether a file was 1 KB or 100 KB made virtually no difference. One thing I didn't measure was whether opening the file was the main overhead, or whether closing also had a large share.
Jan 19 2019
On Saturday, 19 January 2019 at 16:30:39 UTC, Andrei Alexandrescu wrote:
> The overall conclusion of those experiments was that if you want to improve compilation speed, you need to minimize the number of files opened; once opened, whether it was 1 KB or 100 KB made virtually no difference.

I deal with large compressed files. For large data, lz4 would probably be a better choice over zip these days. And, even with cached lookups of dir entries, I think one file that is sequentially read will always be an improvement. Note also that compressed files may even be faster than uncompressed ones with some system configurations (relatively slow disk IO, many processors).

Another thing to look at is indexed compressed files. For example http://www.htslib.org/doc/tabix.html. Using those we may partition Phobos into sensible sub-sections. Particularly section out those submodules people hardly ever use.
Jan 20 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.

BTW, Firefox uses the fast compression option (indicated by general purpose flags = 2); std.zip uses the default compression option.
Jan 19 2019
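The stored-vs-fast-vs-default tradeoff can be demonstrated in a few lines (Python's zipfile as a stand-in for std.zip; the payload and member name are hypothetical): all variants round-trip to identical source text, so the only question is how size and decompression CPU trade against read volume.

```python
import io
import zipfile

payload = b"module std.stdio;\n" * 200  # repetitive source, compresses well

def make(compression, **kwargs):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression, **kwargs) as z:
        z.writestr("std/stdio.d", payload)
    return buf

stored = make(zipfile.ZIP_STORED)                   # no compression
fast = make(zipfile.ZIP_DEFLATED, compresslevel=1)  # "fast" deflate
best = make(zipfile.ZIP_DEFLATED, compresslevel=9)  # maximum deflate

# All three round-trip to identical source; only size and CPU cost differ.
for buf in (stored, fast, best):
    assert zipfile.ZipFile(buf).read("std/stdio.d") == payload

sizes = [len(b.getvalue()) for b in (stored, fast, best)]
```

For a memory-mapped archive, ZIP_STORED has the extra property that a member can be faulted in and used directly with no decompression pass at all.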
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:
> Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. [...] We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow. So looking up fewer files would make it faster.

Sounds rather strange that on modern operating systems, which cache the files themselves and the metadata even more surely (the VFS directory cache, for example), file lookup is a problem. Calls to something like glob(3) could render the whole Phobos directory tree to your memory in milliseconds.
Jan 19 2019
On Saturday, 19 January 2019 at 15:25:37 UTC, Boris-Barboris wrote:
> Sounds rather strange that on modern operating systems, which cache the files themselves and the metadata even more surely, file lookup is a problem.

It's a known problem that Windows can't cache file metadata as aggressively as Posix systems because of differences in filesystem semantics. (See https://github.com/Microsoft/WSL/issues/873#issuecomment-425272829)

Walter did say he saw benchmarked improvements on Linux for Warp, though.
Jan 19 2019
On Sat, 19 Jan 2019 00:45:27 -0800, Walter Bright wrote:
> We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow. So looking up fewer files would make it faster.

I compiled one file with one extra -I directive as a test. The file had two imports, one for url.d (which depends on std.conv and std.string) and one for std.stdio. For each transitive import encountered, the compiler:

* looked at each import path (total of four: ., druntime, phobos, and urld's source dir)
* looked at each possible extension (.d, .di, and none)
* built a matching filename
* checked if it existed

Maybe it should do a shallow directory listing for the current directory and every -I path at startup, using that information to prune the list of checks it needs to do. It could also do that recursively when importing from a subpackage. Then it would have called opendir() about four times and readdir() about 70 times; it wouldn't ever have had to call exists(). It would allocate a lot less memory. And that's a change that will help people who don't zip up their source code every time they want to compile it.

It could even do this in another thread while parsing the files passed on the command line. That would require a bit of caution, of course.
Jan 19 2019
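The pruning idea above can be sketched as follows (Python as a neutral stand-in for the compiler; the roots, package names, and helper are hypothetical): list each import root once at startup, then skip any root whose top-level listing doesn't even contain the package directory, so exists() is only probed where a hit is possible.

```python
import os
import tempfile

# Hypothetical import roots, standing in for '.', druntime, phobos, and
# urld's source dir from the experiment above.
base = tempfile.mkdtemp()
app = os.path.join(base, "app")
phobos = os.path.join(base, "phobos")
os.makedirs(app)
os.makedirs(os.path.join(phobos, "std"))
with open(os.path.join(phobos, "std", "stdio.d"), "w") as f:
    f.write("module std.stdio;\n")
roots = [app, phobos]

# One shallow listing per root at startup (one opendir/readdir each).
listing = {r: set(os.listdir(r)) for r in roots}

def resolve(package, module):
    """Resolve package/module, probing only roots whose top-level listing
    contains the package directory."""
    for r in roots:
        if package not in listing[r]:
            continue  # pruned: no exists() probes against this root
        for ext in (".d", ".di"):
            candidate = os.path.join(r, package, module + ext)
            if os.path.exists(candidate):
                return candidate
    return None

hit = resolve("std", "stdio")
```

Extending this recursively per subpackage gets to the "no exists() calls at all" variant described above.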
On Sat, Jan 19, 2019 at 06:05:13PM +0000, Neia Neutuladh via Digitalmars-d wrote:
[...]
> Maybe it should do a shallow directory listing for the current directory and every -I path at startup, using that information to prune the list of checks it needs to do. [...] And that's a change that will help people who don't zip up their source code every time they want to compile it.

Excellent finding! I *knew* something was off when looking up a file is more expensive than reading it.

I'm thinking a quick fix could be to just cache the intermediate results of each lookup, and reuse those instead of issuing another call to opendir() each time. I surmise that after this change, this issue may no longer even be a problem anymore.


T

--
It is of the new things that men tire --- of fashions and proposals and improvements and change. It is the old things that startle and intoxicate. It is the old things that are young. -- G.K. Chesterton
Jan 19 2019
On Sat, 19 Jan 2019 11:56:29 -0800, H. S. Teoh wrote:
> I'm thinking a quick fix could be to just cache the intermediate results of each lookup, and reuse those instead of issuing another call to opendir() each time. I surmise that after this change, this issue may no longer even be a problem anymore.

It doesn't even call opendir(). It assembles each potential path and calls exists(). Which might be better for only having a small number of imports, but that's not the common case.

I've a partial fix for Posix, and I'll see about getting dev tools running in WINE to get a Windows version. (Which isn't exactly the same, but if I find a difference in FindFirstFile / FindNextFile between Windows and WINE, I'll be surprised.) I'm not sure what it should do when the same module is found in multiple locations, though -- the current code seems to take the first match. I'm also not sure whether it should be lazy or not. Also symlinks and case-insensitive filesystems are annoying.

https://github.com/dhasenan/dmd/tree/fasterimport
Jan 19 2019
On Saturday, 19 January 2019 at 20:32:07 UTC, Neia Neutuladh wrote:
> Also symlinks and case-insensitive filesystems are annoying.
> https://github.com/dhasenan/dmd/tree/fasterimport

Considering it's only ever used in one place, it's not a candidate for root. Maybe you could just put it in src.
Jan 19 2019
On 1/19/19 3:32 PM, Neia Neutuladh wrote:
> I've a partial fix for Posix, and I'll see about getting dev tools running in WINE to get a Windows version. [...]
> https://github.com/dhasenan/dmd/tree/fasterimport

This is great, looking forward to seeing this improvement merged. (There are packaging- and distribution-related advantages to archives independent of this.)
Jan 20 2019
On Sunday, 20 January 2019 at 15:10:58 UTC, Andrei Alexandrescu wrote:
> This is great, looking forward to seeing this improvement merged. (There are packaging- and distribution-related advantages to archives independent of this.)

Andrei,

Are you envisioning something like the JAR format Java uses? That would be pretty convenient for D library installation and imports...

-Doc
Jan 20 2019
On 1/20/2019 9:53 AM, Doc Andrew wrote:
> Are you envisioning something like the JAR format Java uses?

jar/zip/arc/tar/lib/ar/cab/lzh/whatever

They're all the same under the hood: a bunch of files concatenated together with a table of contents.
Jan 20 2019
On 1/20/19 9:31 PM, Walter Bright wrote:
> jar/zip/arc/tar/lib/ar/cab/lzh/whatever
> They're all the same under the hood: a bunch of files concatenated together with a table of contents.

I notice a trend here. You eventually end up at the Java/JAR or .NET/Assembly model, where everything needed to compile against a library is included in a single file. I have often wondered how hard it would be to teach D to automatically include the contents of a DI file into a library file and read the contents of the library at compile time.

Having studied dependencies/packaging in D quite a bit, a readily apparent observation is that D is much more difficult to work with than Java/.NET/JavaScript in regards to packaging. A unified interface/implementation file would go a LONG way toward simplifying the problem. I even built a ZIP-based packaging model for D.

--
Adam Wilson
IRC: EllipticBit
import quiet.dlang.dev;
Jan 21 2019
On 2019-01-21 11:21, Adam Wilson wrote:
> I notice a trend here. You eventually end up at the Java/JAR or .NET/Assembly model, where everything needed to compile against a library is included in a single file. [...]

For distributing libraries you use Dub. For distributing applications to end users you distribute the executable.

--
/Jacob Carlborg
Jan 21 2019
On 1/21/19 3:50 AM, Jacob Carlborg wrote:
> For distributing libraries you use Dub. For distributing applications to end users you distribute the executable.

DUB does nothing to solve the file lookup problem, so I am curious: how does it apply to this conversation? I was talking about how the files are packaged for distribution, not the actual distribution of the package itself. And don't even get me started on DUB...

--
Adam Wilson
IRC: EllipticBit
import quiet.dlang.dev;
Jan 21 2019
On Saturday, 19 January 2019 at 20:32:07 UTC, Neia Neutuladh wrote:
> https://github.com/dhasenan/dmd/tree/fasterimport

Any benchmarks?
Jan 20 2019
On 1/19/19 3:32 PM, Neia Neutuladh wrote:
> It doesn't even call opendir(). It assembles each potential path and calls exists(). Which might be better for only having a small number of imports, but that's not the common case. [...]

I wonder if packages could be used to eliminate possibilities. For example, std is generally ONLY going to be under a phobos import path. That can eliminate any other import directives from even being tried (if they don't have a std directory, there's no point in looking for an std.algorithm directory in there). Maybe you already implemented this, I'm not sure.

I still find it difficult to believe that calling exists x4 is a huge culprit. But certainly, caching a directory structure is going to be more efficient than reading it every time.

-Steve
Jan 21 2019
On Monday, 21 January 2019 at 19:01:57 UTC, Steven Schveighoffer wrote:
> I still find it difficult to believe that calling exists x4 is a huge culprit. But certainly, caching a directory structure is going to be more efficient than reading it every time.

For large directories, opendir+readdir, especially with stat, is much slower than open/access. Most filesystems already use a hash table or equivalent, so looking up a known file name is faster because it's a hash table lookup.

This whole endeavor generally seems like poorly reimplementing what the OS should already be doing.
Jan 21 2019
On Mon, 21 Jan 2019 19:10:11 +0000, Vladimir Panteleev wrote:
> For large directories, opendir+readdir, especially with stat, is much slower than open/access.

We can avoid stat() except with symbolic links. Opendir + readdir for my example would be about 500 system calls, so it breaks even with `import std.stdio;` assuming the cost per call is identical and we're reading eagerly. Testing shows that this is the case. With a C preprocessor, though, you're dealing with /usr/share with thousands of header files.

> This whole endeavor generally seems like poorly reimplementing what the OS should already be doing.

The OS doesn't have a "find a file with one of this handful of names among these directories" call.
Jan 21 2019
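One possible shape of the caching fix discussed here, again as an illustrative Python sketch rather than the actual patch in the linked branch: read each directory once, then answer every later lookup from the cached listing.

```python
import os
from functools import lru_cache

@lru_cache(maxsize=None)
def listing(directory):
    """One opendir/readdir pass per directory, reused for every
    subsequent import lookup in the same compilation."""
    try:
        return frozenset(os.listdir(directory))
    except OSError:
        return frozenset()

def find_module_cached(module_name, import_paths):
    """Resolve 'a.b.c' to a/b/c.d or a/b/c/package.d using only the
    cached listings -- no per-candidate exists() probes."""
    parts = module_name.split(".")
    for root in import_paths:
        parent = os.path.join(root, *parts[:-1])
        names = listing(parent)
        if parts[-1] + ".d" in names:
            return os.path.join(parent, parts[-1] + ".d")
        if parts[-1] in names and "package.d" in listing(os.path.join(parent, parts[-1])):
            return os.path.join(parent, parts[-1], "package.d")
    return None
```

Whether this wins depends on directory sizes: the cache pays one readdir pass per directory up front, against one exists() probe per failed candidate in the uncached version, which is the break-even tradeoff estimated above.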
On 1/21/19 2:35 PM, Neia Neutuladh wrote:On Mon, 21 Jan 2019 19:10:11 +0000, Vladimir Panteleev wrote:Another simple test: import std.experimental.all; void main(){} Use "time dmd -c test.d". On my SSD laptop that takes 0.55 seconds. Without the import, it takes 0.02 seconds. In an ideal world there should be no difference. Those 0.53 seconds are the upper bound of the gains to be made by first-order improvements to import mechanics. (IMHO: low impact yet not negligible.)On Monday, 21 January 2019 at 19:01:57 UTC, Steven Schveighoffer wrote:We can avoid stat() except with symbolic links. Opendir + readdir for my example would be about 500 system calls, so it breaks even with `import std.stdio;` assuming the cost per call is identical and we're reading eagerly. Testing shows that this is the case.I still find it difficult to believe that calling exists x4 is a huge culprit. But certainly, caching a directory structure is going to be more efficient than reading it every time.For large directories, opendir+readdir, especially with stat, is much slower than open/access.
Jun 08 2019
On 6/8/19 3:12 AM, Andrei Alexandrescu wrote:On 1/21/19 2:35 PM, Neia Neutuladh wrote:Might it be due to something like this? https://issues.dlang.org/show_bug.cgi?id=19874 -SteveOn Mon, 21 Jan 2019 19:10:11 +0000, Vladimir Panteleev wrote:Another simple test: import std.experimental.all; void main(){} Use "time dmd -c test.d". On my SSD laptop that takes 0.55 seconds. Without the import, it takes 0.02 seconds. In an ideal world there should be no difference. Those 0.53 seconds are the upper bound of the gains to be made by first-order improvements to import mechanics. (IMHO: low impact yet not negligible.)On Monday, 21 January 2019 at 19:01:57 UTC, Steven Schveighoffer wrote:We can avoid stat() except with symbolic links. Opendir + readdir for my example would be about 500 system calls, so it breaks even with `import std.stdio;` assuming the cost per call is identical and we're reading eagerly. Testing shows that this is the case.I still find it difficult to believe that calling exists x4 is a huge culprit. But certainly, caching a directory structure is going to be more efficient than reading it every time.For large directories, opendir+readdir, especially with stat, is much slower than open/access.
Jun 10 2019
On Mon, Jan 21, 2019 at 07:10:11PM +0000, Vladimir Panteleev via Digitalmars-d wrote:On Monday, 21 January 2019 at 19:01:57 UTC, Steven Schveighoffer wrote:I can't help wondering why we're making so much noise about a few milliseconds on opening/reading import files, when there's the elephant in the room of the 3-5 *seconds* of compile-time added by the mere act of using a single instance of std.regex.Regex. Shouldn't we be doing something about that first?? T -- Verbing weirds language. -- Calvin (& Hobbes)I still find it difficult to believe that calling exists x4 is a huge culprit. But certainly, caching a directory structure is going to be more efficient than reading it every time.For large directories, opendir+readdir, especially with stat, is much slower than open/access. Most filesystems already use a hash table or equivalent, so looking up a known file name is faster because it's a hash table lookup. This whole endeavor generally seems like poorly reimplementing what the OS should already be doing.
Jan 21 2019
On Monday, 21 January 2019 at 19:42:57 UTC, H. S. Teoh wrote:I can't help wondering why we're making so much noise about a few milliseconds on opening/reading import files, when there's the elephant in the room of the 3-5 *seconds* of compile-time added by the mere act of using a single instance of std.regex.Regex. Shouldn't we be doing something about that first?? TI am on it :P I cannot do it any faster than I am currently doing it, though. I am also still pursuing 1st Class Functions to replace recursive templates.
Jan 21 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.Maybe there are still some tricks to apply to make the lookup faster? Ripgrep [1] is, to my knowledge, currently the fastest grepping tool out there. Its speed is mostly about the grepping itself, but to be fast it also needs to reach the files fast, so it might be helpful to look at its code. [1] https://github.com/BurntSushi/ripgrep
Jan 20 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:This can be a fun challenge! Anyone up for it?This might get you some speed for the first compilation (fewer directory entry lookups), but I would expect follow-up compilations to have negligible overhead, compared to the CPU time needed to actually process the source code.
Jan 20 2019
On 1/20/2019 10:21 PM, Vladimir Panteleev wrote:This might get you some speed for the first compilation (fewer directory entry lookups), but I would expect follow-up compilations to have negligible overhead, compared to the CPU time needed to actually process the source code.In my benchmarks with Warp, the slowdown persisted with multiple sequential runs.
Jan 21 2019
On Monday, 21 January 2019 at 08:11:22 UTC, Walter Bright wrote:On 1/20/2019 10:21 PM, Vladimir Panteleev wrote:Would be nice if the benchmarks were reproducible. Some things to consider:
- Would the same result apply to Phobos files? Some things that could be different between the two that would affect the viability of this approach:
  - Total number of files
  - The size of the files
  - The relative processing time needed to actually parse the files (Presumably, a C preprocessor would be faster than a full D compiler.)
- Could the same result be achieved through other means, especially those without introducing disruptive changes to tooling? For example, re-enabling the parallel / preemptive loading of source files in DMD.

Using JAR-like files would be disruptive to many tools. Consider:
- What should the file/path error message location look like when printing error messages with locations in Phobos?
- How should tools such as DCD, which need to scan source code, adapt to these changes?
- How should tools and editors with a "go to definition" function work when the definition is inside an archive, especially in editors which do not support visiting files inside archives?

Even if you have obvious answers to these questions, they still need to be implemented, so the speed gain from such a change would need to be significant in order to justify the disruption.This might get you some speed for the first compilation (fewer directory entry lookups), but I would expect follow-up compilations to have negligible overhead, compared to the CPU time needed to actually process the source code.In my benchmarks with Warp, the slowdown persisted with multiple sequential runs.
Jan 21 2019
On 1/21/2019 1:02 AM, Vladimir Panteleev wrote:Even if you have obvious answers to these questions, they still need to be implemented, so the speed gain from such a change would need to be significant in order to justify the disruption.The only way to get definitive answers is to try it. Fortunately, it isn't that difficult.
Jan 21 2019
On Sat, 19 Jan 2019 00:45:27 -0800, Walter Bright wrote:Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.I should have started out by testing this. I replaced the file lookup with, essentially:

if (name.startsWith("std"))
    filename = "/phobos/" ~ name.replace(".", "/") ~ ".d";
else
    filename = "/druntime/" ~ name.replace(".", "/") ~ ".d";

Plus a hard-coded set of package.d references. Before, compiling my test file took about 0.67 to 0.70 seconds. After, it took about 0.67 to 0.70 seconds. There is no point in optimizing filesystem access for importing phobos at this time.
Jan 21 2019
On 1/21/19 2:46 PM, Neia Neutuladh wrote:On Sat, 19 Jan 2019 00:45:27 -0800, Walter Bright wrote:Word. (Unless the libs are installed over a networked mount. Not sure how much we need to worry about that.)Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.I should have started out by testing this. I replaced the file lookup with, essentially: if (name.startsWith("std")) filename = "/phobos/" ~ name.replace(".", "/") ~ ".d"; else filename = "/druntime/" ~ name.replace(".", "/") ~ ".d"; Plus a hard-coded set of package.d references. Before, compiling my test file took about 0.67 to 0.70 seconds. After, it took about 0.67 to 0.70 seconds. There is no point in optimizing filesystem access for importing phobos at this time.
Jun 08 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:This can be a fun challenge! Anyone up for it?I wrote about this idea in my blog today: http://dpldocs.info/this-week-in-d/Blog.Posted_2019_01_21.html#my-thoughts-on-forum-discussions In short, it may be a fun challenge, and may be useful to some library distributors, but I don't think it is actually worth it.
Jan 21 2019
On 1/21/19 4:13 PM, Adam D. Ruppe wrote:On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:Lot of good thoughts there, most of which I agree with. Thanks for sharing. One note -- I don't think modules like std.datetime were split up for the sake of the compiler parsing speed, I thought they were split up to a) avoid the insane ddoc generation that came from it, and b) reduce dependencies on symbols that you didn't care about. Not to mention that github would refuse to load std.datetime for any PRs :) But it does help to consider the cost of finding the file and the cost of using the file separately, and see how they compare. -SteveThis can be a fun challenge! Anyone up for it?I wrote about this idea in my blog today: http://dpldocs.info/this-week-in-d/Blog.Posted_2019_01_21.html#my-thoughts-on-forum-discussions In short, it may be a fun challenge, and may be useful to some library distributors, but I don't think it is actually worth it.
Jan 21 2019
On Mon, Jan 21, 2019 at 04:38:21PM -0500, Steven Schveighoffer via Digitalmars-d wrote: [...]One note -- I don't think modules like std.datetime were split up for the sake of the compiler parsing speed, I thought they were split up to a) avoid the insane ddoc generation that came from it, and b) reduce dependencies on symbols that you didn't care about. Not to mention that github would refuse to load std.datetime for any PRs :)And also, I originally split up std.algorithm (at Andrei's protest) because it was so ridiculously huge that I couldn't get unittests to run on my PC without dying with out-of-memory errors.But it does help to consider the cost of finding the file and the cost of using the file separately, and see how they compare.[...] I still think a lot of this effort is misdirected -- we're trying to hunt small fish while there's a shark in the pond. Instead of trying to optimize file open / read times, what we *should* be doing is to reduce the number of recursive templates heavy-weight Phobos modules like std.regex are using, or improving the template expansion strategies (e.g., the various PRs that have been checked in to replace O(n) recursive template expansions with O(log n), or replace O(n^2) with O(n), etc.). Or, for that matter, optimizing how the compiler processes templates so that it performs better. Optimizing file open / file read in the face of these much heavier components in the compiler sounds to me like straining out the gnat while swallowing the camel. T -- Once upon a time there lived a king, and a flea lived with him.
Jan 21 2019
On Monday, 21 January 2019 at 21:52:01 UTC, H. S. Teoh wrote:On Mon, Jan 21, 2019 at 04:38:21PM -0500, Steven Schveighoffer via Digitalmars-d wrote: [...]Does dmd ever do dynamic programming when it does recursive templates? -Alex[...]And also, I originally split up std.algorithm (at Andrei's protest) because it was so ridiculously huge that I couldn't get unittests to run on my PC without dying with out-of-memory errors. [...]
Jan 21 2019
On 1/21/19 4:52 PM, H. S. Teoh wrote:split up std.algorithm (at Andrei's protest)Shouldn't have been splitted.
Jun 08 2019
On Sat, Jun 08, 2019 at 09:02:46AM +0200, Andrei Alexandrescu via Digitalmars-d wrote:On 1/21/19 4:52 PM, H. S. Teoh wrote:It should have been. The old std.algorithm was a monster of 10,000 LOC that caused the compiler to exhaust my RAM and thrash on swap before dying horribly, when building unittests. It was an embarrassment. The old std.datetime had the same problem and I'm very glad Jonathan eventually also split it up into more sensible chunks. T -- Take care of your clothes while they're new, and your health while you're young.split up std.algorithm (at Andrei's protest)Shouldn't have been splitted.
Jun 08 2019
On Saturday, 8 June 2019 at 09:23:54 UTC, H. S. Teoh wrote:On Sat, Jun 08, 2019 at 09:02:46AM +0200, Andrei Alexandrescu via Digitalmars-d wrote:+1 It's just about not having too much code in the first place.On 1/21/19 4:52 PM, H. S. Teoh wrote:It should have been. The old std.algorithm was a monster of 10,000 LOC that caused the compiler to exhaust my RAM and thrash on swap before dying horribly, when building unittests. It was an embarrassment. The old std.datetime had the same problem and I'm very glad Jonathan eventually also split it up into more sensible chunks. Tsplit up std.algorithm (at Andrei's protest)Shouldn't have been splitted.
Jun 08 2019
On 6/8/19 5:23 AM, H. S. Teoh wrote:The old std.datetime had the same problemThat should indeed have been broken as it was.
Jun 08 2019
On 6/8/19 5:23 AM, H. S. Teoh wrote:It should have been. The old std.algorithm was a monster of 10,000 LOC that caused the compiler to exhaust my RAM and thrash on swap before dying horribly, when building unittests. It was an embarrassment.The appropriate response would have been (and still is) to fix the compiler. A more compact working set will also accelerate execution due to better locality.
Jun 08 2019
On Saturday, 8 June 2019 at 14:48:24 UTC, Andrei Alexandrescu wrote:A more compact working set will also accelerate execution due to better locality.OTOH, a more sparse working set will accelerate development since it lessens cognitive load.
Jun 08 2019
On Saturday, 8 June 2019 at 14:53:20 UTC, Nicholas Wilson wrote:OTOH, a more sparse working set will accelerate development since it lessens cognitive load.I personally find it is a LOT easier to work with one big file than multiple small files. But that said, with something like an IDE, it shouldn't matter either way since the computer should be able to work with both equally well.
Jun 08 2019
On Saturday, 8 June 2019 at 14:58:17 UTC, Adam D. Ruppe wrote:I personally find it is a LOT easier to work with one big file than multiple small files.I feel the same.But that said, with something like an IDE, it shouldn't matter either way since the computer should be able to work with both equally well.Well, a friend of mine would disagree on this, with his resource-hog Electron-based IDE. :) Depending on the IDE, people will complain about the size of the file, but I'm not one of them, because I'm not the IDE type of guy. Matheus.
Jun 08 2019
On Saturday, 8 June 2019 at 14:58:17 UTC, Adam D. Ruppe wrote:On Saturday, 8 June 2019 at 14:53:20 UTC, Nicholas Wilson wrote:I never understand complaints about where files are located and what goes where. That's for the computer to know and for me to pretty much never care about. Likewise, I don't understand why anyone would want a tree view of their filesystem in their editor. When I know the file name and care, I use fuzzy matching to open it instead of clicking on directories. Do you know when I care about code organisation? When it impacts compile times negatively and/or affect coupling.OTOH, a more sparse working set will accelerate development since it lessens cognitive load.I personally find it is a LOT easier to work with one big file than multiple small files. But that said, with something like an IDE, it shouldn't matter either way since the computer should be able to work with both equally well.
Jun 08 2019
On 6/8/19 10:48 AM, Andrei Alexandrescu wrote:On 6/8/19 5:23 AM, H. S. Teoh wrote:Sheesh, this is *exactly* the sort of "Perfection is the enemy of the good" foot-shooting that's been keeping D years behind where it could be. In D1 days, things got fixed and improved quickly. But now, problems that have a working solution available remain unfixed for years just because "there's a better way to do it, but it takes far more work, isn't going to be ready anytime soon, and there may or may not even be anyone actively working on it." A fix in master is worth two in the bush.It should have been. The old std.algorithm was a monster of 10,000 LOC that caused the compiler to exhaust my RAM and thrash on swap before dying horribly, when building unittests. It was an embarrassment.The appropriate response would have been (and still is) to fix the compiler. A more compact working set will also accelerate execution due to better locality.
Jun 08 2019
On 6/8/19 6:19 PM, Nick Sabalausky (Abscissa) wrote:On 6/8/19 10:48 AM, Andrei Alexandrescu wrote:I did allow the breaking up of std.algorithm.On 6/8/19 5:23 AM, H. S. Teoh wrote:Sheesh, this is *exactly* the sort of "Perfection is the enemy of the good" foot-shooting that's been keeping D years behind where it could be.It should have been. The old std.algorithm was a monster of 10,000 LOC that caused the compiler to exhaust my RAM and thrash on swap before dying horribly, when building unittests. It was an embarrassment.The appropriate response would have been (and still is) to fix the compiler. A more compact working set will also accelerate execution due to better locality.
Jun 09 2019
On 6/9/19 3:56 AM, Andrei Alexandrescu wrote:On 6/8/19 6:19 PM, Nick Sabalausky (Abscissa) wrote:Yes, and that's definitely good - after all, it gave us a stopgap fix for the immediate problem, while the proper solution is(was?) still in-the-works. Besides, much of the time, once a "proper solution" does become available, the old stopgap can then be rolled back/deprecated if necessary. (Not sure whether or not rolling back the change would be appropriate in std.algorithm's case, but ATM, I'm not too terribly concerned with that either way.) To clarify, in my previous post, I wasn't really talking *specifically* about the breaking up of std.algorithm (after all, like you said, that DID go through). I was just speaking in general about the overall strategy you were promoting: Preferring to forego stopgap measures in the interims before correct solutions are available. (Unless I misunderstood?) (Of course, in the cases where a correct solution is just as quick-n-simple as any stopgap, well, then yes, certainly the correct solution should just be done instead. Again, not saying this was or wasn't the case with the breaking up of std.algorithm, I'm just speaking in general terms.)On 6/8/19 10:48 AM, Andrei Alexandrescu wrote:I did allow the breaking up of std.algorithm.On 6/8/19 5:23 AM, H. S. Teoh wrote:Sheesh, this is *exactly* the sort of "Perfection is the enemy of the good" foot-shooting that's been keeping D years behind where it could be.It should have been. The old std.algorithm was a monster of 10,000 LOC that caused the compiler to exhaust my RAM and thrash on swap before dying horribly, when building unittests. It was an embarrassment.The appropriate response would have been (and still is) to fix the compiler. A more compact working set will also accelerate execution due to better locality.
Jun 09 2019
On 6/9/19 2:52 PM, Nick Sabalausky (Abscissa) wrote:I was just speaking in general about the overall strategy you were promoting: Preferring to forego stopgap measures in the interims before correct solutions are available. (Unless I misunderstood?)I can tell at least what I tried to do - use good judgment for each such decision. More often than not I propose workarounds that the community turns its nose at - see e.g. the lazy import. To this day I think that would have been a great thing to do. But no, we need to "wait" for the full lazy import that will never materialize.
Jun 09 2019
On Sunday, 9 June 2019 at 19:51:14 UTC, Andrei Alexandrescu wrote:On 6/9/19 2:52 PM, Nick Sabalausky (Abscissa) wrote:Has anyone merged all of phobos into one large file, removed all the modules (or modified the compiler to handle multiple modules per file and internally break them up) and seen where the bottleneck truly is? Is it specific template use or just all templates? Is there a specific template design pattern that is often used that kills D? And alternatively, break phobos up into many more files, one per template... and see the performance of it. Maybe there are specific performance blocks in the compiler and those could be rewritten, such as parallel compilation or rewriting that part of the compiler to be faster or whatever... What I'm seeing is that it seems no one really knows the true culprit. Is it the file layout or templates? and each issue then branches... After all, maybe it is a combination and all the areas could be optimized better. Even a 1% increase is 1% and if it stacks with another 1% then one has 2%. A journey starts with the first 1%.I was just speaking in general about the overall strategy you were promoting: Preferring to forego stopgap measures in the interims before correct solutions are available. (Unless I misunderstood?)I can tell at least what I tried to do - use good judgment for each such decision. More often than not I propose workarounds that the community turns its nose at - see e.g. the lazy import. To this day I think that would have been a great thing to do. But no, we need to "wait" for the full lazy import that will never materialize.
Jun 10 2019
On 6/10/19 4:54 AM, Amex wrote:On Sunday, 9 June 2019 at 19:51:14 UTC, Andrei Alexandrescu wrote:Not if that 1% costs 99% of your budget.On 6/9/19 2:52 PM, Nick Sabalausky (Abscissa) wrote:Has anyone merged all of phobos in to one large file, removed all the modules(or modify the compile to handle multiple modules per file and internally break them up) and see where the bottle neck truly is? Is it specific template use or just all templates? Is there a specific template design pattern that is often used that kills D? And alternatively, break phobos up in to many more files, one per template... and see the performance of it. Maybe there is specific performance blocks in the compiler and those could be rewritten such as parallel compilation or rewriting that part of the compiler to be faster or whatever.. What I'm seeing is that it seems no one really knows the true culprit. Is it the file layout or templates? and each issue then branches... After all, maybe it is a combination and all the areas could be optimized better. Even a 1% increase is 1% and if it stacks with another 1% then one has 2%. A journey starts with the first 1%.I was just speaking in general about the overall strategy you were promoting: Preferring to forego stopgap measures in the interims before correct solutions are available. (Unless I misunderstood?)I can tell at least what I tried to do - use good judgment for each such decision. More often than not I propose workarounds that the community turns its nose at - see e.g. the lazy import. To this day I think that would have been a great thing to do. But no, we need to "wait" for the full lazy import that will never materialize.
Jun 10 2019
On Monday, 21 January 2019 at 21:38:21 UTC, Steven Schveighoffer wrote:One note -- I don't think modules like std.datetime were split up for the sake of the compiler parsing speedYeah, I think std.datetime was about the size of unittest runs, but std.range, as I recall at least, was specifically to separate stuff to get quicker builds by avoiding the majority of the import for common cases via std.range.primitives which doesn't need to bring in as much code. The local import pattern also helps with this goal - lazy imports to only get what you need when you need it, so the compiler doesn't have to do as much work. Maybe I am wrong about that, but still, two files that import each other aren't actually two modules. Phobos HAS been making a LOT of progress toward untangling that import mess and addressing specific compile time problems. A few years ago, any phobos import would cost you like a half second. It is down to a quarter second for the hello world example. Which is IMO still quite poor, but a lot better than it was. But is this caused by finding the files?

$ cd dmd2/src/phobos
$ time find .
real 0m0.003s

I find that very hard to believe. And let us remember, old D1 phobos wasn't this slow:

$ cat hi.d
import std.stdio;
void main() { writefln("Hello!"); }

$ time dmd-1.0 hi.d
real 0m0.042s
user 0m0.032s
sys 0m0.005s

$ time dmd hi.d
real 0m0.434s
user 0m0.383s
sys 0m0.044s

Using the old D compilers reminds me what quick compiles REALLY are. Sigh.
Jan 21 2019
BTW

dmd2/src/phobos$ time cat `find . | grep -E '\.d$'` > catted.d
real 0m0.015s
user 0m0.006s
sys 0m0.009s

$ wc catted.d
319707 1173911 10889167 catted.d

If it were the filesystem at fault, shouldn't that I/O heavy operation take a significant portion of the dmd runtime? Yes, I know the kernel is caching these things and deferring writes and so on. But it does that for dmd too! Blaming the filesystem doesn't pass the prima facie test, at least on Linux. Maybe Windows is different, I will try that tomorrow, but I remain exceedingly skeptical.
Jan 21 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow. So looking up fewer files would make it faster. If phobos.zip is opened as a memory mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster,Speaking from Linux, the kernel already caches the file (after the first read) unless `echo 3 > /proc/sys/vm/drop_caches` is triggered. I've tested with the entire phobos cached and the compilation is still slow. IO is not the bottleneck here. The compilation needs to be sped up. If you still think the file read is the culprit, why does recompilation take the same amount of time as the first compilation (despite the kernel file cache)?being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)We already have std.experimental.all for convenience.
Jan 21 2019
On Monday, January 21, 2019 5:46:32 PM MST Arun Chandrasekaran via Digitalmars-d wrote:If I understand correctly, that's an orthogonal issue. What Walter is proposing wouldn't change how any code imported anything. Rather, it would just change how the compiler reads the files. So, anyone wanting to import all of Phobos at once would still need something like std.experimental.all, but regardless of how much you were importing from Phobos, dmd would read in all of Phobos at once, because it would be a single zip file. It would then only actually compile what it needed to for the imports in your program, but it would have read the entire zip file into memory so that it would only have to open one file instead of searching for and opening each file individually. - Jonathan M Davisbeing very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)We already have std.experimental.all for convenience.
Jan 21 2019
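The single-archive idea, as Jonathan describes it, can be sketched with an ordinary zip reader. This is illustrative Python only; the phobos.zip name and layout are assumptions from the thread, not an existing format:

```python
import io
import zipfile

def load_source_archive(zip_bytes):
    """One open of the whole library: read every .d member into a
    module-name -> source map, so each import becomes a dictionary
    lookup in memory instead of a per-file filesystem probe."""
    sources = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith(".d"):
                module = name[:-2].replace("/", ".")  # std/stdio.d -> std.stdio
                sources[module] = zf.read(name).decode("utf-8")
    return sources
```

With an archive laid out this way, `load_source_archive(open("phobos.zip", "rb").read())["std.stdio"]` would yield the module source; the compiler would then only parse the modules actually imported.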
On Tue, 22 Jan 2019 00:46:32 +0000, Arun Chandrasekaran wrote:If you still think the file read is the culprit, why does recompilation take the same amount of time as the first compilation (despite the kernel file cache)?And another quick way to test this is to use import() and a hard-coded switch statement instead of IO. Just get rid of all disk access and see how fast you can compile code. I'm betting you'll save 10% at most.
Jan 21 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow.The topic of import speed when an implementation is spread over multiple files came up again on a PR I'm working on, so I wanted to share an idea I had. Why not remove the arbitrary limitation that a module or package must be tied to a file or directory respectively? That is, within one file you could have something like this:

---
package thePackage
{
    module thePackage.module1
    {
        ...
    }

    module thePackage.module2
    {
        ...
    }
}

package thePackage2
{
    ....
}
---

Then when one distributes their library, they could concatenate all the files into one, potentially reducing the overhead required to query the filesystem. After loading the file, the entire library would essentially be cached in memory. It would also help with a number of other helpful patterns. For example, I currently use the following pattern for creating register maps of memory-mapped IO:

---
final abstract class MyRegisterBank
{
    final abstract class MyRegister
    {
        //static properties for bit-fields
    }
}
// See https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/spi.d for the real thing
---

I could avoid doing silly things like that and use modules in a single file instead, which would be a more appropriate model for that use case. I currently don't use modules simply because I'd end up with 100's of files to manage for a single MCU. Another trivial use case would be the ability to test multiple packages and modules on run.dlang.io without the need for any additional infrastructure. I'm sure the creativity of this community would find other interesting use cases.
It seems like an arbitrary limitation to have the package/module system hard-coded to the underlying storage technology, and removing that limitation may help with the import speed while also enabling a few interesting use cases. Mike
Jun 07 2019
zip-archive allows you to unpack the file in its original form. unpacking allows to see source code. ur version - join sources to one file - is more complicated:

// file pack/a.d
module pack.one;
//...

// file pack/b.d
module pack.one;
//...

// ur package
package pack
{
    module one
    {
        // both files here?
    }

    file( "pack/b.d" ); // ???
    module one
    {
        // or separate?
    }

    // how to restore individual files? need some special comment or attrs
}
Jun 07 2019
On Friday, 7 June 2019 at 12:50:27 UTC, KnightMare wrote:zip-archive allows you to unpack the file in its original form. unpacking allows to see source code.u can unzip w/o any special tools - OSes can usually work with zip. so, yes, stage 1 - using ZIP as one file - is ok.
stage 2: pack sources to 1 file as AST
pro:
- no need for an unpack stage from mapped files
- all string spans can be stored as string-table indices, allowing to pack 2x-3x times
- no need for a parsing/verifying stage - u already have checked AST
contra: need special tools (or commands to the compiler) to unpack source files
Jun 07 2019
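The string-table part of that idea can be illustrated with a toy packer. This is pure illustration; the format is invented here, and real AST serialization would be far more involved:

```python
def pack_with_string_table(token_streams):
    """Store each distinct token once in a table; each module body
    becomes a list of indices into it. Repeated identifiers such as
    'import' or 'std' then cost one small integer per occurrence,
    which is where the claimed size reduction would come from."""
    table = {}   # token -> index, in order of first appearance
    packed = []
    for tokens in token_streams:
        packed.append([table.setdefault(tok, len(table)) for tok in tokens])
    strings = sorted(table, key=table.get)   # table keys in index order
    return strings, packed

def unpack(strings, packed):
    """Inverse operation: restore the original token streams."""
    return [[strings[i] for i in row] for row in packed]
```

The round trip is lossless for the token streams themselves, which matches the "contra" above: restoring the original source files, with comments and formatting, would need extra bookkeeping.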
On 6/7/19 8:50 AM, KnightMare wrote:zip-archive allows you to unpack the file in its original form. unpacking allows to see source code.That's an interesting approach to the issue: Just allow a package to be either a directory tree OR a tar/tarball archive of a directory tree. Then, the language itself wouldn't need any provisions at all to support multiple packages in one file, and the compiler could still read an entire multi-module package by accessing only one file.
Jun 07 2019
On Friday, 7 June 2019 at 19:56:27 UTC, Nick Sabalausky (Abscissa) wrote:

That's an interesting approach to the issue: Just allow a package to be either a directory tree OR a tar/tarball archive of a directory tree. Then, the language itself wouldn't need any provisions at all to support multiple packages in one file, and the compiler could still read an entire multi-module package by accessing only one file.

Isn't that what Java does? A jar file is nothing more than a zip file.
Jun 07 2019
On Fri, Jun 07, 2019 at 09:47:34AM +0000, Mike Franklin via Digitalmars-d wrote: [...]It would also help with a number of other helpful patterns. For example, I currently use the following pattern for creating register maps of memory-mapped IO: --- final abstract class MyRegisterBank { final abstract class MyRegister { //static properties for bit-fields } } // See https://github.com/JinShil/stm32f42_discovery_demo/blob/master/source/stm32f42/spi.d for the real thing ---Why final abstract class? If all you have are static properties, you could use structs instead. Though of course, it still doesn't really address your fundamental concerns here. [...]It seems like an arbitrary limitation to have the package/module system hard-coded to the underlying storage technology, and removing that limitation may help with the import speed while also enabling a few interesting use cases.[...] The flip side of this coin is that having one file per module helps with code organization in typical use cases. As opposed to the chaotic situation in C/C++ where you can have one .h file with arbitrary numbers of .c/.cc files that implement the function prototypes, with no obvious mapping between them, or multiple .h files with a single amorphous messy .c/.cc file that implements everything. Or where there are arbitrary numbers of modules nested inside any number of .c/.cc files with no clear separation between them, and where filenames bear no relation with contents, making code maintenance challenging at best, outright nightmarish at worst. T -- Real men don't take backups. They put their source on a public FTP-server and let the world mirror it. -- Linus Torvalds
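Teoh's struct alternative might look something like this - a minimal sketch, where the register name, address, and bit-field are hypothetical, not taken from the real spi.d:

```d
// A non-instantiable struct of static members can stand in for the
// `final abstract class` pattern. All names/values below are made up.
struct MyRegister
{
    @disable this(); // structs can't be abstract, but we can forbid instantiation

    enum uint ADDRESS = 0x4000_3800; // hypothetical peripheral address

    // hypothetical bit-field accessor stub
    static @property bool enabled() { return false; }
}
```

Whether this fits the real use case depends on details (e.g. inheritance between register banks) that the class pattern may rely on.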
Jun 07 2019
On Friday, 7 June 2019 at 14:45:34 UTC, H. S. Teoh wrote:

The flip side of this coin is that having one file per module helps with code organization in typical use cases. As opposed to the chaotic situation in C/C++ where you can have one .h file with arbitrary numbers of .c/.cc files that implement the function prototypes, with no obvious mapping between them, or multiple .h files with a single amorphous messy .c/.cc file that implements everything. Or where there are arbitrary numbers of modules nested inside any number of .c/.cc files with no clear separation between them, and where filenames bear no relation with contents, making code maintenance challenging at best, outright nightmarish at worst.

What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others, all packages/modules are concatenated into one file so it's faster for users to import. Or you could even concatenate some packages/modules but not others, depending on how users typically interact with the library.

There may be use cases (like mine with the memory-mapped IO registers) where users may want to actually develop with more than one module per file. Or when one wants to share multi-package/module cut-and-pastable code examples. But those are special cases.

What's nice is that it changes nothing for users today. It just removes an arbitrary limitation to help improve import speed while enabling some more flexibility for those who may want it.

Mike
Jun 07 2019
On Friday, 7 June 2019 at 15:34:22 UTC, Mike Franklin wrote:What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others all packages/modules are concatenated into one file so it's faster for users to import. Or you could even concatenate some packages/modules, but not others depending on how users typically interact with the library. There may be use cases (like mine with the memory-mapped IO registers) where users may want to actually develop with more than one module per file. Or when one wants to share multi-package/module cut-and-pastable code examples. But those are special cases. What's nice is that it changes nothing for users today. It just removes an arbitrary limitation to help improve import speed while enabling some more flexibility for those who may it. MikeHow would compilation even work with multiple modules per file? Wouldn't the compiler have to parse all .d files in the whole search path then? That would be the opposite of faster compile times.
Jun 07 2019
On Friday, 7 June 2019 at 15:46:03 UTC, Gregor Mückl wrote:

How would compilation even work with multiple modules per file? Wouldn't the compiler have to parse all .d files in the whole search path then? That would be the opposite of faster compile times.

It doesn't have to search, because you pass the modules to the compiler, just like we do now in the general case. The search path and filename conventions are - today - just conventions; there's no requirement that the filename and module name match, which means you don't actually know which module a file contains until it is parsed (which is quick and simple, btw).
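A quick way to see this behavior from the command line - the file and module names here are made up, and this assumes dmd is installed:

```shell
mkdir -p demo && cd demo

# The module name comes from the `module` declaration, not the file name.
printf 'module mylib.core;\nint answer() { return 42; }\n' > oddly_named.d
printf 'import mylib.core;\nvoid main() { assert(answer() == 42); }\n' > app.d

# This works because both files are passed to the compiler explicitly;
# no search-path lookup of "mylib/core.d" ever happens.
dmd app.d oddly_named.d && ./app
```

Only when a module is imported but *not* passed on the command line does the compiler fall back to probing the import path by filename.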
Jun 07 2019
On Fri, Jun 07, 2019 at 03:34:22PM +0000, Mike Franklin via Digitalmars-d wrote: [...]What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others all packages/modules are concatenated into one file so it's faster for users to import. Or you could even concatenate some packages/modules, but not others depending on how users typically interact with the library.[...] What *would* be really nice is if dmd could read .zip archives. Then all you need to use a library is to download the .zip into your source tree and run `dmd -i`. T -- I'm still trying to find a pun for "punishment"...
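Phobos already ships the machinery a prototype of this would need: std.zip can both build and read archives entirely in memory. A rough sketch (the member name and contents are made up):

```d
import std.stdio : writeln;
import std.zip : ZipArchive, ArchiveMember;

void main()
{
    // Build a tiny "library archive" in memory (a stand-in for phobos.zip).
    auto archive = new ZipArchive();
    auto m = new ArchiveMember();
    m.name = "std/stdio.d";
    m.expandedData = cast(ubyte[]) "module std.stdio; /* ... */".dup;
    archive.addMember(m);
    auto data = archive.build();

    // Compiler side: resolve an import by member name - one hash lookup,
    // no per-candidate filesystem probe.
    auto zip = new ZipArchive(data);
    if (auto member = "std/stdio.d" in zip.directory)
        writeln(cast(string) zip.expand(*member));
}
```

The open question from the thread - whether this beats the OS file cache in practice - would still need measuring.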
Jun 07 2019
What *would* be really nice is if dmd could read .zip archives. Then all you need to use a library is to download the .zip into your source tree and run `dmd -i`.

Or use LZ4, for no dependency on zlib and smaller code: decompression is ~50 LOC, plus probably 200 LOC for processing the "directory entries" inside the file. It's about 10x faster than zip - decompression works without hash tables, unlike LZW. Contra: you cannot just unzip the files.
Jun 07 2019
On Friday, 7 June 2019 at 15:34:22 UTC, Mike Franklin wrote:

On Friday, 7 June 2019 at 14:45:34 UTC, H. S. Teoh wrote:

The flip side of this coin is that having one file per module helps with code organization in typical use cases. As opposed to the chaotic situation in C/C++ where you can have one .h file with arbitrary numbers of .c/.cc files that implement the function prototypes, with no obvious mapping between them, or multiple .h files with a single amorphous messy .c/.cc file that implements everything. Or where there are arbitrary numbers of modules nested inside any number of .c/.cc files with no clear separation between them, and where filenames bear no relation with contents, making code maintenance challenging at best, outright nightmarish at worst.

What I'm proposing is that a library's organization can be one file per module while it is being developed, but once it is published for consumption by others all packages/modules are concatenated into one file so it's faster for users to import. Or you could even concatenate some packages/modules, but not others depending on how users typically interact with the library. There may be use cases (like mine with the memory-mapped IO registers) where users may want to actually develop with more than one module per file. Or when one wants to share multi-package/module cut-and-pastable code examples. But those are special cases. What's nice is that it changes nothing for users today. It just removes an arbitrary limitation to help improve import speed while enabling some more flexibility for those who may want it. Mike

Reading files is really cheap; evaluating templates and running CTFE isn't. That's why importing Phobos modules is slow - not because of the files it imports, but because of all the CTFE these imports trigger. This means we can, e.g.:

- improve CTFE performance
- cache templates over multiple compilations (there's a DMD PR for this)
- make imports lazy
- reduce all the big CTFE bottlenecks in Phobos (e.g. std.uni)
- convert more Phobos code into templates with local imports to reduce the baseline import overhead

Ordered from hardest to easiest.
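Seb's claim - that CTFE and template evaluation, rather than file I/O, dominate import cost - can be probed roughly from the command line (assumes dmd is on PATH; the actual timings will vary by machine):

```shell
# -o- stops before code generation, so the time measured is
# parsing + semantic analysis + CTFE only, not codegen or linking.
echo 'import std.uni;' > probe.d
time dmd -o- probe.d      # std.uni triggers heavy CTFE at import time

echo 'import std.stdio;' > probe.d
time dmd -o- probe.d      # usually noticeably faster than std.uni
```

If file lookups were the bottleneck, the two runs would be close; the gap (if any) is the CTFE/template cost.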
Jun 07 2019
On Friday, 7 June 2019 at 16:38:56 UTC, Seb wrote:

Reading files is really cheap, evaluating templates and running CTFE isn't. That's why importing Phobos modules is slow - not because of the files it imports, but because of all the CTFE these imports trigger.

Yes, that makes much more sense to me. But, if that's the case, what's all the concern from Walter and Andrei expressed in this thread and in the conversations linked below?

https://forum.dlang.org/post/q7dpmg$29oq$1 digitalmars.com
https://github.com/dlang/druntime/pull/2634#issuecomment-499494019
https://github.com/dlang/druntime/pull/2222#issuecomment-398390889

Mike
Jun 07 2019
On Saturday, 8 June 2019 at 00:00:17 UTC, Mike Franklin wrote:

On Friday, 7 June 2019 at 16:38:56 UTC, Seb wrote:

Reading files is really cheap, evaluating templates and running CTFE isn't. That's why importing Phobos modules is slow - not because of the files it imports, but because of all the CTFE these imports trigger.

Yes that makes much more sense to me. But, if that's the case, what's all the concern from Walter and Andrei expressed in this thread and in the conversations linked below? https://forum.dlang.org/post/q7dpmg$29oq$1 digitalmars.com https://github.com/dlang/druntime/pull/2634#issuecomment-499494019 https://github.com/dlang/druntime/pull/2222#issuecomment-398390889

Another recent comment from Walter: https://github.com/dlang/dmd/pull/9814#issuecomment-493773769

It seems there is disagreement about what the actual issue is.

Mike
Jun 08 2019
On Saturday, 8 June 2019 at 00:00:17 UTC, Mike Franklin wrote:

On Friday, 7 June 2019 at 16:38:56 UTC, Seb wrote:

Reading files is really cheap, evaluating templates and running CTFE isn't. That's why importing Phobos modules is slow - not because of the files it imports, but because of all the CTFE these imports trigger.

Yes that makes much more sense to me. But, if that's the case, what's all the concern from Walter and Andrei expressed in this thread and in the conversations linked below? https://forum.dlang.org/post/q7dpmg$29oq$1 digitalmars.com https://github.com/dlang/druntime/pull/2634#issuecomment-499494019 https://github.com/dlang/druntime/pull/2222#issuecomment-398390889 Mike

If they wanted to make DMD faster, compiling it with LDC would make it truly faster (it more than halves the runtime!). As mentioned above, the real reason for the import overhead is the ton of templates and CTFE evaluations, and these need to be cached, made faster, made lazier, or reduced if any significant performance gains are expected. Tweaking the file tree won't help.
Jun 13 2019
On 6/7/19 5:47 AM, Mike Franklin wrote:

Why not remove the arbitrary limitation that a module or package must be tied to a file or directory respectively?

We don't really have that limitation. The compiler gets the package/module name from the `module ...;` statement (if any) at the beginning of a *.d file. It's only when the `module` statement is absent that the package/module name is inferred from the filepath. (There *might* also be such a requirement when importing *but not compiling* a given module, but I'm not sure on that.) Beyond that, any other requirement for packages/modules to match the filesystem is purely a convention relied upon by certain build systems, like rdmd (and now, `dmd -i`), and otherwise has nothing to do with the compiler.

TBH, I've always kinda wanted to just do away with the ability to have modules/packages that DON'T match the filesystem. I never saw any sensible use cases for supporting such weird mismatches that couldn't already be accomplished via -I (not to be confused with the new -i) or version/static if.

HOWEVER, that said, you do bring up an interesting point I'd never thought of: If concatenating a bunch of modules/packages into one file would improve compile time, then that would certainly be a feature worth considering.
Jun 07 2019
On Fri, Jun 07, 2019 at 03:48:53PM -0400, Nick Sabalausky (Abscissa) via Digitalmars-d wrote: [...]TBH, I've always kinda wanted to just do away with the ability to have modules/packages that DON'T match the filesystem. I never saw any sensible use-cases for supporting such weird mismatches that couldn't already be accomplished via -I (not to be confused with the new -i) or version/static if. HOWEVER, that said, you do bring up an interesting point I'd never thought of: If concating a bunch of modules/packages into one file would improve compile time, than that would certainly be a feature worth considering.I honestly doubt it would improve compilation time that much. Reading files from the filesystem is pretty fast, compared to the rest of the stuff the compiler has to do afterwards. If anything, it would *slow down* the process if you really only needed to import one module but it was concatenated with a whole bunch of others in a single file, thereby requiring the compiler to parse the whole thing just to find it. There's also the problem that if multiple -I's were given, and more than 1 of those paths contain concatenated modules, then the compiler would potentially have to parse *everything* in *every import path* just to be sure it will find the one import you asked for. I don't know, it just seems like there are too many disadvantages to justify doing this. T -- Philosophy: how to make a career out of daydreaming.
Jun 07 2019
I honestly doubt it would improve compilation time that much. Reading files from the filesystem is pretty fast, compared to the rest of the stuff the compiler has to do afterwards.

The compiler needs to search for existing files in the directory tree, with patterns like "datetime.d*", for each folder in -I. I tried compiling (on Windows) one small program from today's issue https://issues.dlang.org/show_bug.cgi?id=19947 under Procmon.exe (a SysInternals tool that shows what a process is doing with the network, filesystem, registry, and processes/threads). The result, for the filesystem only:

3497 requests to the FS for create/open, query, close, read/write (libs and DLLs counted too)
768 (!) requests that were just "not found" (DLLs and libs counted too)

2nd try with option "-c" - compile only (no linking):
2693 requests to the FS
727 of them "not found"

So for a clean benchmark one would need to compare compiling some medium-sized program (without any dub packages, because they add more paths to search - maybe N!, factorial, many) with an SSD and with a RAM disk. IMO we could win 1-2 seconds of compilation time.
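For those on Linux, a rough analogue of the Procmon experiment above (assumes strace and dmd are installed; the test file name is made up):

```shell
# Count file-related syscalls made by the compiler, including the
# failed "not found" probes: they show up in the errors column of
# strace's -c summary (mostly ENOENT).
echo 'import std.stdio; void main() {}' > hello.d
strace -f -e trace=%file -c dmd -c hello.d
```

Comparing the errors column against the total gives a feel for how much of the compiler's filesystem traffic is speculative lookups.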
Jun 07 2019
On Fri, Jun 07, 2019 at 09:09:48PM +0000, KnightMare via Digitalmars-d wrote: [...]I tried compile (in Windows) one small program from today issue https://issues.dlang.org/show_bug.cgi?id=19947 under Procmon.exe (tool from SysInternals that allow to see what doing some process with Net,FS,Registry,Process&Threads). Result for FileSystem only is: 3497 requests to FS for create/open,query,close,read/write (libs and dll counted too) 768! requests just "not found" (dlls and libs counted too)[...] This is a known issue that has been discussed before. The proposed solution was to cache the contents of each directory in the import path (probably lazily, so that we don't incur up-front costs) so that the compiler can subsequently find a module pathname with just a single hash lookup. I don't know if anyone set about implementing this, though. T -- Error: Keyboard not attached. Press F1 to continue. -- Yoon Ha Lee, CONLANG
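A hedged sketch of that proposed cache, in D - the names and structure are illustrative only, not dmd's actual code:

```d
// Scan each import directory once (lazily), then answer later module
// lookups from a hash table instead of issuing a stat() per candidate.
import std.file : dirEntries, exists, isDir, SpanMode;
import std.path : buildPath;

struct ImportPathCache
{
    private bool[string][string] entries; // dir -> set of paths under it

    // Returns the full path of module file `rel` under `dir`, or null.
    string lookup(string dir, string rel)
    {
        auto set = dir in entries;
        if (set is null)
        {
            bool[string] s;
            if (dir.exists && dir.isDir)
                foreach (e; dirEntries(dir, SpanMode.depth))
                    s[e.name] = true; // e.name includes the dir prefix
            entries[dir] = s;
            set = dir in entries;
        }
        auto full = buildPath(dir, rel);
        return (full in *set) ? full : null;
    }
}
```

The up-front directory scan costs something, which is why the proposal makes it lazy: a directory is only read the first time an import actually probes it.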
Jun 07 2019
On Saturday, 19 January 2019 at 08:45:27 UTC, Walter Bright wrote:

Andrei and I were talking on the phone today, trading ideas about speeding up importation of Phobos files. Any particular D file tends to import much of Phobos, and much of Phobos imports the rest of it. We've both noticed that file size doesn't seem to matter much for importation speed, but file lookups remain slow. So looking up fewer files would make it faster.

Here's the idea: Place all Phobos source files into a single zip file, call it phobos.zip (duh). Then, dmd myfile.d phobos.zip and the compiler will look in phobos.zip to resolve, say, std/stdio.d. If phobos.zip is opened as a memory mapped file, whenever std/stdio.d is read, the file will be "faulted" into memory rather than doing a file lookup / read. We're speculating that this should be significantly faster, besides being very convenient for the user to treat Phobos as a single file rather than a blizzard. (phobos.lib could also be in the same file!)

It doesn't have to be just phobos, this can be a general facility. People can distribute their D libraries as a zip file that never needs unzipping. We already have https://dlang.org/phobos/std_zip.html to do the dirty work. We can experiment to see if compressed zips are faster than uncompressed ones.

This can be a fun challenge! Anyone up for it?

P.S. dmd's ability to directly manipulate object library files, rather than going through lib or ar, has been a nice success.

Why not compile Phobos to an object file? Basically, store the AST directly into a file and just load it. Phobos never changes, so why recompile it over and over and over and over? This should be done with all files - sorta like rdmd, so to speak. It might take some work to figure out how to get it all working, but maybe the time has come to stop using ancient design patterns and move on to a higher level. After all, the issue is mainly templates, since they cannot be compiled to a library... but if they could, then they wouldn't be an issue.

.til -> .lib -> .exe

A .til would be a higher-level library that includes objectified templates from D code; it is basically an extension of a lib file, and eventually the lib is compiled into the exe.
Jun 07 2019
On Saturday, 8 June 2019 at 06:29:16 UTC, Amex wrote:

Why not compile phobos to an object file? Basically store the AST directly into a file and just load it. Phobos never changes so why recompile it over and over and over and over?

+1 for AST. And even more: make the AST builder available as a runtime module. Bonuses:

- storing packages as AST assemblies: no parsing or AST-building for packages, only for user code - that improves compile speed.
- an AST can be stored more compactly than source, because words (variable names, keywords) are repeated many times throughout the source. Many source files can be stored in one "assembly/AST package" with one string-literal table and one typeinfo table.
- DSL metaprogramming moves to a higher level: the steps "parse DSL - generate D code - parse D code - generate AST" become just "parse DSL - generate AST". That improves compile time and would help many DSLs appear for JSON/XML/DB schemas, UI description (same as QML), and more.
- LDC (dunno about DMD/GCC) can already generate code dynamically at runtime (it probably stores LLVM IR now), so it could generate code from an AST at runtime, not only at compile time for metaprogramming: the same bonuses and possibilities as Expression Trees and scripting for .NET. Yes, D can use 3rd-party script engines, but they are not native - interop is not free, with many wrappers/dispatchers to and from, two different GCs, and failures with threads and TLS - so today that's a toy, not more. With native scripting there is no interop at all: execution speed as compiled C++ (LLVM generates fast code), one GC for all objects, one thread pool with the same TLS.

I see only one minus: not exposing the AST for druntime and as a module for programmers. In any case an AST builder already exists in the compiler; just bring it out into public space and allow storing packages as AST too.
Jun 08 2019
- can be stored more compactly, because words (var names, keywords) are repeated many times through the source. And

there is no need to store unittests or docs/comments in AST packages: docs are generated to HTML, and a package's unittests matter only to the builders of that package. If you need them, take the package's sources, build it, and run its unittests.
Jun 08 2019
On Saturday, 8 June 2019 at 10:35:58 UTC, KnightMare wrote:

+1 for AST.

This doesn't make a significant difference. Parsing D source into an AST is quick and easy. The problem is applying that AST to user types.
Jun 08 2019
On Saturday, 8 June 2019 at 11:58:36 UTC, Adam D. Ruppe wrote:

This doesn't make a significant difference. Parsing D source into an AST is quick and easy. The problem is applying that AST to user types.

One more idea: allow storing user (or any) metadata in AST packages. LDC could store optimized LLVM IR for the AST tree there, so compilation time becomes: compile user code only, link, and write the result to the FS.
Jun 08 2019
On Saturday, 8 June 2019 at 12:26:56 UTC, KnightMare wrote:

LDC can store there optimized LLVM-IR code for AST-tree, so compilation time will be: compile user code only, link and write result to FS

Template code will still be taken from the AST. (We just can't compile a template for every type in the universe to IR code; also, maybe the IR can store some generics - dunno.)
Jun 08 2019
On Saturday, 8 June 2019 at 12:32:54 UTC, KnightMare wrote:

On Saturday, 8 June 2019 at 12:26:56 UTC, KnightMare wrote:

I just watched a video from the conference, https://youtu.be/OsOfTVm2ExY?t=2107 , where Walter said that deserializing an AST can be about the same speed as parsing source. Something to think about.
Jun 08 2019