digitalmars.D - DMD internal: appendToModuleMember performance

Johan Engelen (29/29) Apr 15 2016 Hi all,

David Nadlinger (7/21) Apr 15 2016 Yep, see also: https://issues.dlang.org/show_bug.cgi?id=15323.

Johan Engelen (3/7) Apr 16 2016 In rare cases, symbols are removed from the members list, so the

David Nadlinger (6/12) Apr 22 2016 Bloom filters can have false positives anyway. As long as

David Nadlinger (8/11) Apr 22 2016 Note that a (properly tuned) probabilistic data structure like a

Walter Bright (2/12) Apr 22 2016 Why not just use a hash table? D's builtin one?

David Nadlinger (7/8) Apr 22 2016 Apparently, some parts rely on the insertion order, although I'm

Walter Bright (4/10) Apr 22 2016 The linear search is checking for membership. Another way is to have the...

Johan Engelen (3/18) Apr 23 2016 Excellent. I didn't dare assume the code actually worked that

Johan Engelen (16/24) Apr 22 2016 The false-positive nature of Bloom filters makes them unsuited, I

Walter Bright (2/8) Apr 22 2016 I did a grep for "members.remove" and got no hits. Where are the removes...

David Nadlinger (4/17) Apr 22 2016 One of them is

Walter Bright (2/5) Apr 22 2016 Definitely one.

Walter Bright (7/13) Apr 22 2016 BTW, this looks like a particularly bad piece of engineering. The troubl...

David Nadlinger (9/15) Apr 22 2016 IIRC I was involved in this somehow. It has been some time, but I

David Nadlinger (9/12) Apr 22 2016 That's the idea. As long as you can reduce the need to do a full

Johan Engelen (4/15) Apr 23 2016 Well, this 140k list has a removal very early on, so the linear

David Nadlinger (4/6) Apr 23 2016 No – only for searching the particular removed item (or other

David Nadlinger (5/13) Apr 22 2016 Did you have a closer look at where this is done? One place where
Walter Bright (2/5) Apr 22 2016 https://github.com/dlang/dmd/pull/5693

Johan Engelen <j j.nl> writes:

Hi all,
   When building a unittest case in Weka.io's codebase, I measure 
that appendToModuleMember takes about 10% of the total compile 
time, and about 25% of total semantic analysis time. 
"appendToModuleMember" is the only function that pops up in 
profiling with such a large time fraction spent in it.

The culprit is the linear search at the end of 
appendToModuleMember, because the members list with Dsymbols gets 
pretty large (50k+ members).
https://github.com/D-Programming-Language/dmd/blob/master/src/dtemplate.d#L8012-L8026

I've modified the code to also store the list of members as an 
rbtree, and voila, much faster compiles: from 26sec to 20sec 
(semantic only). The old data structure is still there and the 
only change is to do a lookup into the tree instead of the linear 
search.
The hack is here: 
https://github.com/ldc-developers/ldc/compare/master...JohanEngelen:speed_up

I couldn't use symtab as it isn't populated with all symbols and 
I wasn't sure if it would break things.

I'm thinking about removing the old array all-together.
My question is: is it essential to keep an ordered list? (I see a 
`.shift(...)` call on the array, to put something in first 
position. If so, could that be solved by having *two* trees, 
where "in-order" iteration first iterates over the first one and 
then the second. The "high priority" symbols can then be put in 
the first tree, normal ones in the second tree (if that is how 
order matters...).

Thanks,
   Johan

Apr 15 2016

David Nadlinger <code klickverbot.at> writes:

On Friday, 15 April 2016 at 18:33:44 UTC, Johan Engelen wrote:
 Hi all,
   When building a unittest case in Weka.io's codebase, I 
 measure that appendToModuleMember takes about 10% of the total 
 compile time, and about 25% of total semantic analysis time. 
 "appendToModuleMember" is the only function that pops up in 
 profiling with such a large time fraction spent in it.

Yep, see also: https://issues.dlang.org/show_bug.cgi?id=15323. 
When I tried to get rid of the ordered array, I ran into several 
interesting issues, but I don't remember many details.

 I'm thinking about removing the old array all-together.
 My question is: is it essential to keep an ordered list? (I see 
 a `.shift(...)` call on the array, to put something in first 
 position. If so, could that be solved by having *two* trees, 
 where "in-order" iteration first iterates over the first one 
 and then the second. The "high priority" symbols can then be 
 put in the first tree, normal ones in the second tree (if that 
 is how order matters...).

Another "quick fix" if we have to keep the order would be to add 
a Bloom filter/… on the side to eliminate most array searches.

  — David

Apr 15 2016

Johan Engelen <j j.nl> writes:

On Friday, 15 April 2016 at 19:32:46 UTC, David Nadlinger wrote:
 
 Another "quick fix" if we have to keep the order would be to 
 add a Bloom filter/… on the side to eliminate most array 
 searches.

In rare cases, symbols are removed from the members list, so the 
shadow data structure needs the ability to delete elements.

Apr 16 2016

David Nadlinger <code klickverbot.at> writes:

On Saturday, 16 April 2016 at 13:58:28 UTC, Johan Engelen wrote:
 On Friday, 15 April 2016 at 19:32:46 UTC, David Nadlinger wrote:
 Another "quick fix" if we have to keep the order would be to 
 add a Bloom filter/… on the side to eliminate most array 
 searches.

 In rare cases, symbols are removed from the members list, so 
 the shadow data structure needs the ability to delete elements.

Bloom filters can have false positives anyway. As long as 
elements are not removed too frequently (what do your numbers 
say?), the performance impact of doing a full linear search in 
those cases shouldn't be too bad.

  — David

Apr 22 2016

David Nadlinger <code klickverbot.at> writes:

On Friday, 22 April 2016 at 18:51:57 UTC, David Nadlinger wrote:
 As long as elements are not removed too frequently (what do 
 your numbers say?), the performance impact of doing a full 
 linear search in those cases shouldn't be too bad.

Note that a (properly tuned) probabilistic data structure like a 
Bloom filter allows you to avoid having to keep a full second 
copy of the list around, hopefully making it possible to keep the 
optimisation on by default without regrets. (For tiny modules 
with only a small number of members, you can still do the linear 
search to skip the hash lookup.)

  — David

Apr 22 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 4/22/2016 11:55 AM, David Nadlinger wrote:
 On Friday, 22 April 2016 at 18:51:57 UTC, David Nadlinger wrote:
 As long as elements are not removed too frequently (what do your numbers
 say?), the performance impact of doing a full linear search in those cases
 shouldn't be too bad.

 Note that a (properly tuned) probabilistic data structure like a Bloom filter
 allows you to avoid having to keep a full second copy of the list around,
 hopefully making it possible to keep the optimisation on by default without
 regrets. (For tiny modules with only a small number of members, you can still
do
 the linear search to skip the hash lookup.)

   — David

Why not just use a hash table? D's builtin one?

Apr 22 2016

David Nadlinger <code klickverbot.at> writes:

On Friday, 22 April 2016 at 21:48:09 UTC, Walter Bright wrote:
 Why not just use a hash table? D's builtin one?

Apparently, some parts rely on the insertion order, although I'm 
not convinced they should. Module::importAll is one of them, but 
in a way that's trivial to avoid. I didn't check any other 
locations.

If order is not important, a hash set should be fine.

  — David

Apr 22 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 4/22/2016 3:07 PM, David Nadlinger wrote:
 On Friday, 22 April 2016 at 21:48:09 UTC, Walter Bright wrote:
 Why not just use a hash table? D's builtin one?

 Apparently, some parts rely on the insertion order, although I'm not convinced
 they should. Module::importAll is one of them, but in a way that's trivial to
 avoid. I didn't check any other locations.

 If order is not important, a hash set should be fine.

The linear search is checking for membership. Another way is to have the 'this' 
have a field that says which says which members[] it is in, if any. Voila, the 
membership becomes an O(1) search.

Apr 22 2016

Johan Engelen <j j.nl> writes:

On Friday, 22 April 2016 at 23:33:24 UTC, Walter Bright wrote:
 On 4/22/2016 3:07 PM, David Nadlinger wrote:
 On Friday, 22 April 2016 at 21:48:09 UTC, Walter Bright wrote:
 Why not just use a hash table? D's builtin one?

 Apparently, some parts rely on the insertion order, although 
 I'm not convinced
 they should. Module::importAll is one of them, but in a way 
 that's trivial to
 avoid. I didn't check any other locations.

 If order is not important, a hash set should be fine.

 The linear search is checking for membership. Another way is to 
 have the 'this' have a field that says which says which 
 members[] it is in, if any. Voila, the membership becomes an 
 O(1) search.

Excellent. I didn't dare assume the code actually worked that 
simply.

Apr 23 2016

Johan Engelen <j j.nl> writes:

On Friday, 22 April 2016 at 18:51:57 UTC, David Nadlinger wrote:
 On Saturday, 16 April 2016 at 13:58:28 UTC, Johan Engelen wrote:
 In rare cases, symbols are removed from the members list, so 
 the shadow data structure needs the ability to delete elements.

 Bloom filters can have false positives anyway. As long as 
 elements are not removed too frequently (what do your numbers 
 say?), the performance impact of doing a full linear search in 
 those cases shouldn't be too bad.

The false-positive nature of Bloom filters makes them unsuited, I 
think, unless we can make the chance of a false-positive very low 
for lists that are very big, my testcase has a list size of 140k 
elements. It needs something scalable, otherwise there is no 
size-benefit. Wikipedia leads me to this: 
http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf

I don't understand exactly what you mean; do you propose to 
resort to linear search after a removal happened? Or only do a 
linear search when the shadow data structure says the item is 
present?
I don't know how often removals happen, but for the 140k elements 
list removals happens _very_ often. While compiling phobos, 
removals happen not for all modules, but for quite a few of them.

In any case, I wrote the code such that it is easy to change the 
underlying data structure and test the impact.

Apr 22 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 4/22/2016 2:38 PM, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to resort to linear
 search after a removal happened? Or only do a linear search when the shadow
data
 structure says the item is present?
 I don't know how often removals happen, but for the 140k elements list removals
 happens _very_ often. While compiling phobos, removals happen not for all
 modules, but for quite a few of them.

I did a grep for "members.remove" and got no hits. Where are the removes
happening?

Apr 22 2016

David Nadlinger <code klickverbot.at> writes:

On Friday, 22 April 2016 at 22:10:47 UTC, Walter Bright wrote:
 On 4/22/2016 2:38 PM, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to 
 resort to linear
 search after a removal happened? Or only do a linear search 
 when the shadow data
 structure says the item is present?
 I don't know how often removals happen, but for the 140k 
 elements list removals
 happens _very_ often. While compiling phobos, removals happen 
 not for all
 modules, but for quite a few of them.

 I did a grep for "members.remove" and got no hits. Where are 
 the removes happening?

One of them is 
https://github.com/dlang/dmd/blob/5ea445c68451152d43595c9de4797b6ec1e4f57d/sr
/dtemplate.d#L6503, I think.

  — David

Apr 22 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 4/22/2016 3:16 PM, David Nadlinger wrote:
 One of them is
 https://github.com/dlang/dmd/blob/5ea445c68451152d43595c9de4797b6ec1e4f57d/src/dtemplate.d#L6503,
 I think.

Definitely one.

Apr 22 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 4/22/2016 4:31 PM, Walter Bright wrote:
 On 4/22/2016 3:16 PM, David Nadlinger wrote:
 One of them is
 https://github.com/dlang/dmd/blob/5ea445c68451152d43595c9de4797b6ec1e4f57d/src/dtemplate.d#L6503,

 I think.

 Definitely one.

BTW, this looks like a particularly bad piece of engineering. The trouble is,
it 
saves an index to the member, then does a bunch of semantic processing, then 
deletes what is on that index. But what if the members[] in the meantime shifts 
around?

There is an assert for that case, so it at least won't cause corruption, but it 
looks bad.

Apr 22 2016

David Nadlinger <code klickverbot.at> writes:

On Friday, 22 April 2016 at 23:49:22 UTC, Walter Bright wrote:
 BTW, this looks like a particularly bad piece of engineering. 
 The trouble is, it saves an index to the member, then does a 
 bunch of semantic processing, then deletes what is on that 
 index. But what if the members[] in the meantime shifts around?

 There is an assert for that case, so it at least won't cause 
 corruption, but it looks bad.

IIRC I was involved in this somehow. It has been some time, but I 
think the deal is that it only does so when all the other members 
that might have been added in the meantime have also been 
removed. I suppose it seemed like a good idea to save that local 
information locally instead of increasing the size of 
TemplateInstance by a pointer, of which there might be a 
bajillion instances in a typical "modern" D program.

  — David

Apr 22 2016

David Nadlinger <code klickverbot.at> writes:

On Friday, 22 April 2016 at 21:38:41 UTC, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to […] 
 do a linear search when the shadow data structure says the item 
 is present?

That's the idea. As long as you can reduce the need to do a full 
linear search by, say, 1 : 1000, it would be enough to make the 
quadratic behaviour disappear into the noise.

That being said, I did some estimates, and since the keys are 
only 8 bytes in size, Bloom filters in particular might not be 
worth it in this case, unfortunately. Quite sad, actually. I like 
Bloom filters.

  — David

Apr 22 2016

Johan Engelen <j j.nl> writes:

On Friday, 22 April 2016 at 22:37:49 UTC, David Nadlinger wrote:
 On Friday, 22 April 2016 at 21:38:41 UTC, Johan Engelen wrote:
 I don't understand exactly what you mean; do you propose to 
 […] do a linear search when the shadow data structure says the 
 item is present?

 That's the idea. As long as you can reduce the need to do a 
 full linear search by, say, 1 : 1000, it would be enough to 
 make the quadratic behaviour disappear into the noise.

Well, this 140k list has a removal very early on, so the linear 
search is going to happen >100k times from then on ;)

 That being said, I did some estimates, and since the keys are 
 only 8 bytes in size, Bloom filters in particular might not be 
 worth it in this case, unfortunately. Quite sad, actually. I 
 like Bloom filters.

I figured :-)

Apr 23 2016

David Nadlinger <code klickverbot.at> writes:

On Saturday, 23 April 2016 at 09:13:01 UTC, Johan Engelen wrote:
 Well, this 140k list has a removal very early on, so the linear 
 search is going to happen >100k times from then on ;)

No – only for searching the particular removed item (or other 
false positives due to hash collisions).

  — David

Apr 23 2016

David Nadlinger <code klickverbot.at> writes:

On Friday, 15 April 2016 at 18:33:44 UTC, Johan Engelen wrote:
 I'm thinking about removing the old array all-together.
 My question is: is it essential to keep an ordered list? (I see 
 a `.shift(...)` call on the array, to put something in first 
 position. If so, could that be solved by having *two* trees, 
 where "in-order" iteration first iterates over the first one 
 and then the second. The "high priority" symbols can then be 
 put in the first tree, normal ones in the second tree (if that 
 is how order matters...).

Did you have a closer look at where this is done? One place where 
order matters is for `object` in `importAll`, but that's 
trivially rewritten a different way.

  — David

Apr 22 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 4/15/2016 11:33 AM, Johan Engelen wrote:
 The culprit is the linear search at the end of appendToModuleMember, because
the
 members list with Dsymbols gets pretty large (50k+ members).
 https://github.com/D-Programming-Language/dmd/blob/master/src/dtemplate.d#L8012-L8026

https://github.com/dlang/dmd/pull/5693

Apr 22 2016

D Programming

C/C++ Programming

Other

digitalmars.D - DMD internal: appendToModuleMember performance