digitalmars.D.bugs - [Issue 4721] New: compilation slow when compiling unittests on dcollections
- d-bugmail puremagic.com (53/53) Aug 24 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (66/66) Aug 24 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (13/13) Aug 24 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (10/20) Aug 24 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (6/6) Aug 24 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (8/11) Aug 24 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (23/23) Aug 24 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (13/13) Aug 25 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (27/27) Aug 26 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (31/40) Aug 26 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (12/12) Aug 26 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
- d-bugmail puremagic.com (12/12) Sep 20 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4721
http://d.puremagic.com/issues/show_bug.cgi?id=4721 Summary: compilation slow when compiling unittests on dcollections Product: D Version: D2 Platform: Other OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: DMD AssignedTo: nobody puremagic.com ReportedBy: schveiguy yahoo.com 06:44:06 PDT --- When compiling dcollections unit tests, it takes over 1 minute to compile the small library. I am unsure what the exact trigger is to make it slow. At the request of Walter, I profiled the compiler to try and see where the slowdown was. Here are the first few lines of the results using gprof: Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 77.76 6.68 6.68 2952 2.26 2.26 elf_findstr(Outbuffer*, char const*, char const*) 2.10 6.86 0.18 4342 0.04 0.04 searchfixlist 1.28 6.97 0.11 663755 0.00 0.00 ScopeDsymbol::search(Loc, Identifier*, int) 1.05 7.06 0.09 2623497 0.00 0.00 isType(Object*) 0.76 7.12 0.07 911667 0.00 0.00 match(Object*, Object*, TemplateDeclaration*, Scope*) 0.76 7.19 0.07 656268 0.00 0.00 _aaGetRvalue(AA*, void*) 0.58 7.24 0.05 2507041 0.00 0.00 isTuple(Object*) Walter said it looks like elf_findstr is a problem area and asked me to file this bug. The relevant discussion is here: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=116007 All I can say to duplicate this is to download the latest svn of dcollections and compile unit tests. If you are running the profiled version, it takes a long long time to complete (I gave up after 5 minutes). I suggest compiling just one class, HashMap. The compile time for this takes 4 seconds. The appropriate commands to run are: svn co http://svn.dsource.org/projects/dcollections/branches/d2 dcollections cd dcollections dmd -unittest dcollections/HashMap.d dcollections/Hash.d dcollections/Iterators.d dcollections/model/* The compilation will fail, because there is no main function, but it will demonstrate the problem well enough. If you want to compile the entire unit test suite, use the command: ./build-lib-linux.sh unittest Which builds and runs the unit tests for the whole library. This takes over a minute to compile. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 24 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 Andrej Mitrovic <andrej.mitrovich gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |andrej.mitrovich gmail.com 07:31:19 PDT --- I don't know about Linux (I might try it in a VM if you want me to), but on Windows I can compile the collection in 27 seconds: C:\d2\dcollections>timeit build-lib-win32.bat unittest unittest C:\d2\dcollections>dmd -unittest unit_test.d dcollections\ArrayList.d dcollections\DefaultAllocator .d dcollections\DefaultFunctions.d dcollections\Hash.d dcollections\HashMap.d dcollections\HashMulti set.d dcollections\HashSet.d dcollections\Iterators.d dcollections\Link.d dcollections\LinkList.d dc ollections\RBTree.d dcollections\TreeMap.d dcollections\TreeMultiset.d dcollections\TreeSet.d dcolle ctions\model\Addable.d dcollections\model\Iterator.d dcollections\model\Keyed.d dcollections\model\L ist.d dcollections\model\Map.d dcollections\model\Multiset.d dcollections\model\Set.d running unit tests... Version Number: Windows NT 5.1 (Build 2600) Exit Time: 4:29 pm, Tuesday, August 24 2010 Elapsed Time: 0:00:27.328 Process Time: 0:00:00.015 System Calls: 161445 Context Switches: 36381 Page Faults: 165602 Bytes Read: 10062897 Bytes Written: 5792330 Bytes Other: 131570 This is with DMD 2.046 as stated in the Readme. Do you need my system specs? With the newer 2.048 I get errors: C:\d2>build-lib-win32.bat unittest unittest C:\d2>dmd -unittest unit_test.d dcollections\ArrayList.d dcollections\DefaultAllocator.d ctions\Hash.d dcollections\HashMap.d dcollections\HashMultiset.d dcollections\HashSet.d d nk.d dcollections\LinkList.d dcollections\RBTree.d dcollections\TreeMap.d dcollections\Tr llections\model\Addable.d dcollections\model\Iterator.d dcollections\model\Keyed.d dcolle Map.d dcollections\model\Multiset.d dcollections\model\Set.d std.contracts has been scheduled for deprecation. Please use std.exception instead. dcollections\HashSet.d(28): Error: template std.algorithm.find(alias pred = "a == b",R,E) ryFun!(pred)(haystack.front,needle)) : bool)) does not match any function template declar dcollections\HashSet.d(28): Error: template std.algorithm.find(alias pred = "a == b",R,E) ryFun!(pred)(haystack.front,needle)) : bool)) cannot deduce template function from argume ),ubyte) dcollections\HashSet.d(301): Error: template instance dcollections.HashSet.rangeEqual!(ub dcollections\HashSet.d(996): instantiated from here: HashSet!(ubyte) running unit tests... '.\unit_test' is not recognized as an internal or external command, operable program or batch file. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 24 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 07:49:15 PDT --- It might be a linux-only problem, I'm not sure. I also have an older machine, maybe yours is twice as fast? 27 seconds still seems quite long for a small library. Re the errors, I'm working on them now, I just downloaded 2.048. Apparently there is some issue with AssumeSorted not working as a range anymore... It's not critical to the library, but the unit tests use it. I'm just removing the assumeSorted for now (as unit test performance isn't important, it was more the "correct" thing to do), asked about it in d.learn. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 24 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 09:55:25 PDT ---It might be a linux-only problem, I'm not sure. I also have an older machine, maybe yours is twice as fast? 27 seconds still seems quite long for a small library. Re the errors, I'm working on them now, I just downloaded 2.048. Apparently there is some issue with AssumeSorted not working as a range anymore... It's not critical to the library, but the unit tests use it. I'm just removing the assumeSorted for now (as unit test performance isn't important, it was more the "correct" thing to do), asked about it in d.learn.Yeah I guess it could be rather long, there's not that many size and they're mostly small from what I can tell. Btw your updated d2.0b library now compiles in 11 seconds on my system with DMD 2.048. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 24 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 09:55:55 PDT --- * not that many files, not size. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 24 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 10:13:43 PDT --- y small from what I can tell.Btw your updated d2.0b library now compiles in 11 seconds on my system with DMD 2.048.Still over a minute for me on Linux 2.048. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 24 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 14:10:26 PDT --- More testing, I did some printf debugging. According to the comments and the code of the function, it's a *linear* search through the symbol table for a match to a given symbol + suffix. The symbol table is a null-separated single buffer. So not only is it linear through the strings, but it's linear on the *characters*. That is, when comparing the current symbol and it's a mismatch, the code must iterate through all the characters anyways to find the next null character. I added some printouts to determine the eventual size of the symbol table, and the number of times the function is called. I also added printouts to show the number of matches found (those should theoretically not do a linear search through the entire table, but likely would search through most of it). The eventual numbers for dcollections: symbol table size: 4,253,953 number of calls: 12,677 number of matches: 648 doing the math, thats probably conservatively about 20 billion loop iterations -- way unacceptable for something that should be done via a hash lookup, or at least a tree/trie/binary search. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 24 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 Walter Bright <bugzilla digitalmars.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED CC| |bugzilla digitalmars.com Resolution| |FIXED 23:13:11 PDT --- Changed to a hash table. http://www.dsource.org/projects/dmd/changeset/628 -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 25 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 06:22:33 PDT --- This helps, but only reduces it to 20 seconds (but a 66% reduction is pretty good!). I ran another round of profiling, and found we have a new bottleneck: Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 75.79 6.51 6.51 8103 0.80 0.80 TemplateDeclaration::toJsonBuffer(OutBuffer*) 3.14 6.78 0.27 1668093 0.00 0.00 StructDeclaration::semantic(Scope*) 2.10 6.96 0.18 1 180.00 180.00 do32bit(FL, evc*, int) 1.98 7.13 0.17 15445 0.01 0.01 EnumDeclaration::toJsonBuffer(OutBuffer*) 0.70 7.19 0.06 656268 0.00 0.00 Port::isSignallingNan(long double) 0.47 7.23 0.04 915560 0.00 0.00 StructDeclaration::toCBuffer(OutBuffer*, HdrGenState*) 0.47 7.27 0.04 Dsymbol::searchX(Loc, Scope*, Identifier*) This time, I question whether toJsonBuffer should be called at all, as I'm not outputting any JSON data? Walter, reopen if you think this could still use improvement. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 26 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 11:42:02 PDT ---Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 75.79 6.51 6.51 8103 0.80 0.80 TemplateDeclaration::toJsonBuffer(OutBuffer*) 3.14 6.78 0.27 1668093 0.00 0.00 StructDeclaration::semantic(Scope*)That table of functions is invalid -- I somehow compiled dmd wrong when running that profile. I think I used a unit test build. However, that was only the profiled version, the non-profiled version still takes 20 seconds to compile dcollections. In any case, here is the correctly profiled version's heavy hitters: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 80.31 11.99 11.99 19000 0.63 0.63 searchfixlist 0.67 12.09 0.10 203173 0.00 0.00 StringTable::search(char const*, unsigned int) 0.60 12.18 0.09 369389 0.00 0.00 Lexer::scan(Token*) 0.54 12.26 0.08 953613 0.00 0.00 ScopeDsymbol::search(Loc, Identifier*, int) 0.47 12.33 0.07 1449798 0.00 0.00 calccodsize Note, this profile is different from the first in that I was compiling the *entire* dcollections library, not just HashMap (which is now bearable due to compared to the original. So the last significant performance problem here is searchfixlist. Looking back, it was 2% of the runtime before, now it's jumped to 80%. Looking at the function in backend/cod3.c, it appears to be another linear search through a linked list. I understand much less about this function than the other, so I'm not sure how/if it should be solved. Perhaps another hash map? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 26 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 bearophile_hugs eml.cc changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |REOPENED CC| |bearophile_hugs eml.cc Resolution|FIXED | Reopened, because you have added here a second different performance bug. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 26 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4721 Steven Schveighoffer <schveiguy yahoo.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|REOPENED |RESOLVED Resolution| |FIXED 05:21:19 PDT --- Closing this, opened the new issue as bug 4900. Fixed in 2.049 -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 20 2010