www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - DMD internal: appendToModuleMember performance

reply Johan Engelen <j j.nl> writes:
Hi all,
   When building a unittest case in Weka.io's codebase, I measure 
that appendToModuleMember takes about 10% of the total compile 
time, and about 25% of total semantic analysis time. 
"appendToModuleMember" is the only function that pops up in 
profiling with such a large time fraction spent in it.

The culprit is the linear search at the end of 
appendToModuleMember, because the members list with Dsymbols gets 
pretty large (50k+ members).
https://github.com/D-Programming-Language/dmd/blob/master/src/dtemplate.d#L8012-L8026

I've modified the code to also store the list of members as an 
rbtree, and voila, much faster compiles: from 26sec to 20sec 
(semantic only). The old data structure is still there and the 
only change is to do a lookup into the tree instead of the linear 
search.
The hack is here: 
https://github.com/ldc-developers/ldc/compare/master...JohanEngelen:speed_up

I couldn't use symtab as it isn't populated with all symbols and 
I wasn't sure if it would break things.

I'm thinking about removing the old array all-together.
My question is: is it essential to keep an ordered list? (I see a 
`.shift(...)` call on the array, to put something in first 
position. If so, could that be solved by having *two* trees, 
where "in-order" iteration first iterates over the first one and 
then the second. The "high priority" symbols can then be put in 
the first tree, normal ones in the second tree (if that is how 
order matters...).

Thanks,
   Johan
Apr 15 2016
next sibling parent reply David Nadlinger <code klickverbot.at> writes:
On Friday, 15 April 2016 at 18:33:44 UTC, Johan Engelen wrote:
 Hi all,
   When building a unittest case in Weka.io's codebase, I 
 measure that appendToModuleMember takes about 10% of the total 
 compile time, and about 25% of total semantic analysis time. 
 "appendToModuleMember" is the only function that pops up in 
 profiling with such a large time fraction spent in it.
Yep, see also: https://issues.dlang.org/show_bug.cgi?id=15323. When I tried to get rid of the ordered array, I ran into several interesting issues, but I don't remember many details.
 I'm thinking about removing the old array all-together.
 My question is: is it essential to keep an ordered list? (I see 
 a `.shift(...)` call on the array, to put something in first 
 position. If so, could that be solved by having *two* trees, 
 where "in-order" iteration first iterates over the first one 
 and then the second. The "high priority" symbols can then be 
 put in the first tree, normal ones in the second tree (if that 
 is how order matters...).
Another "quick fix" if we have to keep the order would be to add a Bloom filter/… on the side to eliminate most array searches. — David
Apr 15 2016
parent reply Johan Engelen <j j.nl> writes:
On Friday, 15 April 2016 at 19:32:46 UTC, David Nadlinger wrote:
 
 Another "quick fix" if we have to keep the order would be to 
 add a Bloom filter/… on the side to eliminate most array 
 searches.
In rare cases, symbols are removed from the members list, so the shadow data structure needs the ability to delete elements.
Apr 16 2016
parent reply David Nadlinger <code klickverbot.at> writes:
On Saturday, 16 April 2016 at 13:58:28 UTC, Johan Engelen wrote:
 On Friday, 15 April 2016 at 19:32:46 UTC, David Nadlinger wrote:
 Another "quick fix" if we have to keep the order would be to 
 add a Bloom filter/… on the side to eliminate most array 
 searches.
In rare cases, symbols are removed from the members list, so the shadow data structure needs the ability to delete elements.
Bloom filters can have false positives anyway. As long as elements are not removed too frequently (what do your numbers say?), the performance impact of doing a full linear search in those cases shouldn't be too bad. — David
Apr 22 2016
next sibling parent reply David Nadlinger <code klickverbot.at> writes:
On Friday, 22 April 2016 at 18:51:57 UTC, David Nadlinger wrote:
 As long as elements are not removed too frequently (what do 
 your numbers say?), the performance impact of doing a full 
 linear search in those cases shouldn't be too bad.
Note that a (properly tuned) probabilistic data structure like a Bloom filter allows you to avoid having to keep a full second copy of the list around, hopefully making it possible to keep the optimisation on by default without regrets. (For tiny modules with only a small number of members, you can still do the linear search to skip the hash lookup.) — David
Apr 22 2016
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/22/2016 11:55 AM, David Nadlinger wrote:
 On Friday, 22 April 2016 at 18:51:57 UTC, David Nadlinger wrote:
 As long as elements are not removed too frequently (what do your numbers
 say?), the performance impact of doing a full linear search in those cases
 shouldn't be too bad.
Note that a (properly tuned) probabilistic data structure like a Bloom filter allows you to avoid having to keep a full second copy of the list around, hopefully making it possible to keep the optimisation on by default without regrets. (For tiny modules with only a small number of members, you can still do the linear search to skip the hash lookup.) — David
Why not just use a hash table? D's builtin one?
Apr 22 2016
parent reply David Nadlinger <code klickverbot.at> writes:
On Friday, 22 April 2016 at 21:48:09 UTC, Walter Bright wrote:
 Why not just use a hash table? D's builtin one?
Apparently, some parts rely on the insertion order, although I'm not convinced they should. Module::importAll is one of them, but in a way that's trivial to avoid. I didn't check any other locations. If order is not important, a hash set should be fine. — David
Apr 22 2016
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/22/2016 3:07 PM, David Nadlinger wrote:
 On Friday, 22 April 2016 at 21:48:09 UTC, Walter Bright wrote:
 Why not just use a hash table? D's builtin one?
Apparently, some parts rely on the insertion order, although I'm not convinced they should. Module::importAll is one of them, but in a way that's trivial to avoid. I didn't check any other locations. If order is not important, a hash set should be fine.
The linear search is checking for membership. Another way is to have the 'this' have a field that says which says which members[] it is in, if any. Voila, the membership becomes an O(1) search.
Apr 22 2016
parent Johan Engelen <j j.nl> writes:
On Friday, 22 April 2016 at 23:33:24 UTC, Walter Bright wrote:
 On 4/22/2016 3:07 PM, David Nadlinger wrote:
 On Friday, 22 April 2016 at 21:48:09 UTC, Walter Bright wrote:
 Why not just use a hash table? D's builtin one?
Apparently, some parts rely on the insertion order, although I'm not convinced they should. Module::importAll is one of them, but in a way that's trivial to avoid. I didn't check any other locations. If order is not important, a hash set should be fine.
The linear search is checking for membership. Another way is to have the 'this' have a field that says which says which members[] it is in, if any. Voila, the membership becomes an O(1) search.
Excellent. I didn't dare assume the code actually worked that simply.
Apr 23 2016
prev sibling parent reply Johan Engelen <j j.nl> writes:
On Friday, 22 April 2016 at 18:51:57 UTC, David Nadlinger wrote:
 On Saturday, 16 April 2016 at 13:58:28 UTC, Johan Engelen wrote:
 In rare cases, symbols are removed from the members list, so 
 the shadow data structure needs the ability to delete elements.
Bloom filters can have false positives anyway. As long as elements are not removed too frequently (what do your numbers say?), the performance impact of doing a full linear search in those cases shouldn't be too bad.
The false-positive nature of Bloom filters makes them unsuited, I think, unless we can make the chance of a false-positive very low for lists that are very big, my testcase has a list size of 140k elements. It needs something scalable, otherwise there is no size-benefit. Wikipedia leads me to this: http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf I don't understand exactly what you mean; do you propose to resort to linear search after a removal happened? Or only do a linear search when the shadow data structure says the item is present? I don't know how often removals happen, but for the 140k elements list removals happens _very_ often. While compiling phobos, removals happen not for all modules, but for quite a few of them. In any case, I wrote the code such that it is easy to change the underlying data structure and test the impact.
Apr 22 2016
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/22/2016 2:38 PM, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to resort to linear
 search after a removal happened? Or only do a linear search when the shadow
data
 structure says the item is present?
 I don't know how often removals happen, but for the 140k elements list removals
 happens _very_ often. While compiling phobos, removals happen not for all
 modules, but for quite a few of them.
I did a grep for "members.remove" and got no hits. Where are the removes happening?
Apr 22 2016
parent reply David Nadlinger <code klickverbot.at> writes:
On Friday, 22 April 2016 at 22:10:47 UTC, Walter Bright wrote:
 On 4/22/2016 2:38 PM, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to 
 resort to linear
 search after a removal happened? Or only do a linear search 
 when the shadow data
 structure says the item is present?
 I don't know how often removals happen, but for the 140k 
 elements list removals
 happens _very_ often. While compiling phobos, removals happen 
 not for all
 modules, but for quite a few of them.
I did a grep for "members.remove" and got no hits. Where are the removes happening?
One of them is https://github.com/dlang/dmd/blob/5ea445c68451152d43595c9de4797b6ec1e4f57d/sr /dtemplate.d#L6503, I think. — David
Apr 22 2016
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/22/2016 3:16 PM, David Nadlinger wrote:
 One of them is
 https://github.com/dlang/dmd/blob/5ea445c68451152d43595c9de4797b6ec1e4f57d/src/dtemplate.d#L6503,
 I think.
Definitely one.
Apr 22 2016
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/22/2016 4:31 PM, Walter Bright wrote:
 On 4/22/2016 3:16 PM, David Nadlinger wrote:
 One of them is
 https://github.com/dlang/dmd/blob/5ea445c68451152d43595c9de4797b6ec1e4f57d/src/dtemplate.d#L6503,

 I think.
Definitely one.
BTW, this looks like a particularly bad piece of engineering. The trouble is, it saves an index to the member, then does a bunch of semantic processing, then deletes what is on that index. But what if the members[] in the meantime shifts around? There is an assert for that case, so it at least won't cause corruption, but it looks bad.
Apr 22 2016
parent David Nadlinger <code klickverbot.at> writes:
On Friday, 22 April 2016 at 23:49:22 UTC, Walter Bright wrote:
 BTW, this looks like a particularly bad piece of engineering. 
 The trouble is, it saves an index to the member, then does a 
 bunch of semantic processing, then deletes what is on that 
 index. But what if the members[] in the meantime shifts around?

 There is an assert for that case, so it at least won't cause 
 corruption, but it looks bad.
IIRC I was involved in this somehow. It has been some time, but I think the deal is that it only does so when all the other members that might have been added in the meantime have also been removed. I suppose it seemed like a good idea to save that local information locally instead of increasing the size of TemplateInstance by a pointer, of which there might be a bajillion instances in a typical "modern" D program. — David
Apr 22 2016
prev sibling parent reply David Nadlinger <code klickverbot.at> writes:
On Friday, 22 April 2016 at 21:38:41 UTC, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to […] 
 do a linear search when the shadow data structure says the item 
 is present?
That's the idea. As long as you can reduce the need to do a full linear search by, say, 1 : 1000, it would be enough to make the quadratic behaviour disappear into the noise. That being said, I did some estimates, and since the keys are only 8 bytes in size, Bloom filters in particular might not be worth it in this case, unfortunately. Quite sad, actually. I like Bloom filters. — David
Apr 22 2016
parent reply Johan Engelen <j j.nl> writes:
On Friday, 22 April 2016 at 22:37:49 UTC, David Nadlinger wrote:
 On Friday, 22 April 2016 at 21:38:41 UTC, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to 
 […] do a linear search when the shadow data structure says the 
 item is present?
That's the idea. As long as you can reduce the need to do a full linear search by, say, 1 : 1000, it would be enough to make the quadratic behaviour disappear into the noise.
Well, this 140k list has a removal very early on, so the linear search is going to happen >100k times from then on ;)
 That being said, I did some estimates, and since the keys are 
 only 8 bytes in size, Bloom filters in particular might not be 
 worth it in this case, unfortunately. Quite sad, actually. I 
 like Bloom filters.
I figured :-)
Apr 23 2016
parent David Nadlinger <code klickverbot.at> writes:
On Saturday, 23 April 2016 at 09:13:01 UTC, Johan Engelen wrote:
 Well, this 140k list has a removal very early on, so the linear 
 search is going to happen >100k times from then on ;)
No – only for searching the particular removed item (or other false positives due to hash collisions). — David
Apr 23 2016
prev sibling next sibling parent David Nadlinger <code klickverbot.at> writes:
On Friday, 15 April 2016 at 18:33:44 UTC, Johan Engelen wrote:
 I'm thinking about removing the old array all-together.
 My question is: is it essential to keep an ordered list? (I see 
 a `.shift(...)` call on the array, to put something in first 
 position. If so, could that be solved by having *two* trees, 
 where "in-order" iteration first iterates over the first one 
 and then the second. The "high priority" symbols can then be 
 put in the first tree, normal ones in the second tree (if that 
 is how order matters...).
Did you have a closer look at where this is done? One place where order matters is for `object` in `importAll`, but that's trivially rewritten a different way. — David
Apr 22 2016
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/15/2016 11:33 AM, Johan Engelen wrote:
 The culprit is the linear search at the end of appendToModuleMember, because
the
 members list with Dsymbols gets pretty large (50k+ members).
 https://github.com/D-Programming-Language/dmd/blob/master/src/dtemplate.d#L8012-L8026
https://github.com/dlang/dmd/pull/5693
Apr 22 2016