digitalmars.D - DList and SList missing length property?
- Paulo Pinto (12/12) May 28 2013 Hi,
- Jonathan M Davis (25/39) May 28 2013 It never says that. It lists what complexity each operation must have in...
- Paulo Pinto (9/48) May 28 2013 Ok, maybe the documentation should explicitly state to which containers
- monarch_dodra (19/34) May 28 2013 SList can be spliced, but the semantics are a bit more
- Jonathan M Davis (11/14) May 28 2013 I do remember hearing something about that, though that will actually re...
- Steven Schveighoffer (11/30) May 28 2013 I actually had performance bugs the OTHER way -- I needed the length of ...
- monarch_dodra (6/42) May 28 2013 A proper implementation should be able to track length anyway:
- Steven Schveighoffer (4/8) May 28 2013 I'm not sure how that works, can you explain/have a link?
- monarch_dodra (39/48) May 29 2013 Well, the basic idea is to give your list an "m_size" member.
- Steven Schveighoffer (8/29) May 29 2013 This is O(n) length, not amortized O(1) length, as it is highly dependen...
- monarch_dodra (16/50) May 29 2013 That's why I said "amortized" with quotes. Not the exact term, I
- Steven Schveighoffer (4/8) May 30 2013 That is true, and a good point.
- monarch_dodra (16/21) May 28 2013 I think having two lists in list would be overkill.
Hi, maybe this is more indicated for d.learn, but I am not sure if it is not a bug in the documentation/implementation. According to the phobos documentation, all containers in std.container should provide a length property. However it is only available for arrays. Sure one could use the std.algorithm.count() or std.range.walkLength(), but I don't see a reason why SList and DList aren't aware of their current size. Is this on purpose? -- Paulo
May 28 2013
On Tuesday, May 28, 2013 20:03:49 Paulo Pinto wrote:Hi, maybe this is more indicated for d.learn, but I am not sure if it is not a bug in the documentation/implementation. According to the phobos documentation, all containers in std.container should provide a length property.It never says that. It lists what complexity each operation must have in order for a container to have that operation. It doesn't guarantee that any particular operation will be present on all containers (though any that's O(n) or worse is pretty much bound to be on all containers unless in does something fundamentally incompatible with a particular container's data structure).However it is only available for arrays. Sure one could use the std.algorithm.count() or std.range.walkLength(), but I don't see a reason why SList and DList aren't aware of their current size. Is this on purpose?Yes. It's on purpose. When dealing with lists, because you can insert at any point, you have two choices: 1. Make length efficent. 2. Make splicing efficient. If you keep track of the length, then you have to count all of the elements when you splice two lists, so splicing is O(n) instead of O(1). If you don't keep track of the length, then splicing is O(1), but then length is O(n), and length can't be O(n) per std.container's complexity requirements. Now, that being said, I don't see a splice operation on either SList or DList (and I don't think that splicing an SList can be done in O(1) anyway), so it would appear that the implementation is optimizing for an operation that it doesn't currently support. The same situation exists with C++'s std::list type though - size() is O(n) and splicing is O(1). However, we opted for not having containers implement inefficient operations like that (which C++ mostly did as well, but not in this case). If you want the length of a container which doesn't have a length property, then use walkLength on its range - e.g. walkLength(container[]). - Jonathan M Davis
May 28 2013
Am 28.05.2013 20:16, schrieb Jonathan M Davis:On Tuesday, May 28, 2013 20:03:49 Paulo Pinto wrote:Ok, maybe the documentation should explicitly state to which containers each operation applies to. Currently it is a bit confusing to see the complexity listed, but not finding the methods. Thanks for the lengthy explanation, it is actually in sync with what I was thinking might be the reason. -- PauloHi, maybe this is more indicated for d.learn, but I am not sure if it is not a bug in the documentation/implementation. According to the phobos documentation, all containers in std.container should provide a length property.It never says that. It lists what complexity each operation must have in order for a container to have that operation. It doesn't guarantee that any particular operation will be present on all containers (though any that's O(n) or worse is pretty much bound to be on all containers unless in does something fundamentally incompatible with a particular container's data structure).However it is only available for arrays. Sure one could use the std.algorithm.count() or std.range.walkLength(), but I don't see a reason why SList and DList aren't aware of their current size. Is this on purpose?Yes. It's on purpose. When dealing with lists, because you can insert at any point, you have two choices: 1. Make length efficent. 2. Make splicing efficient. If you keep track of the length, then you have to count all of the elements when you splice two lists, so splicing is O(n) instead of O(1). If you don't keep track of the length, then splicing is O(1), but then length is O(n), and length can't be O(n) per std.container's complexity requirements. Now, that being said, I don't see a splice operation on either SList or DList (and I don't think that splicing an SList can be done in O(1) anyway), so it would appear that the implementation is optimizing for an operation that it doesn't currently support. The same situation exists with C++'s std::list type though - size() is O(n) and splicing is O(1). However, we opted for not having containers implement inefficient operations like that (which C++ mostly did as well, but not in this case). If you want the length of a container which doesn't have a length property, then use walkLength on its range - e.g. walkLength(container[]). - Jonathan M Davis
May 28 2013
On Tuesday, 28 May 2013 at 18:16:48 UTC, Jonathan M Davis wrote:Now, that being said, I don't see a splice operation on either SList or DList (and I don't think that splicing an SList can be done in O(1) anyway), so it would appear that the implementation is optimizing for an operation that it doesn't currently support.SList can be spliced, but the semantics are a bit more complicated. In most cases, if you need to splice, you'd use a DList anyway: SList is really for particular use cases. I think that if splice is not in DList, it is only because it is not *yet* in DList. I have an implementation ready to submit, which adheres to Andrei's proposed container semantics. I have not yet submitted it though, because fixing DList is more important that inserting more functionality. It's current semantics are neither value nor ref. It's... something else. My fault for inserting it when trying to fix previous clusterfuck. I've since submitted fix that gives it classic semantics, but it's kind of just sitting there... We should *really* see to getting it (or something else) pulled through.The same situation exists with C++'s std::list type though - size() is O(n) and splicing is O(1). However, we opted for not having containers implement inefficient operations like that (which C++ mostly did as well, but not in this case). - Jonathan M DavisThat was in C++03. In C++11, length was reduced to 0(1). And splice was increased to 0(N) (I think, I don't remember the exact details, but it was definitely changed). I'll look up the exact requirement later tonight.
May 28 2013
On Tuesday, May 28, 2013 20:51:55 monarch_dodra wrote:That was in C++03. In C++11, length was reduced to 0(1). And splice was increased to 0(N) (I think, I don't remember the exact details, but it was definitely changed).I do remember hearing something about that, though that will actually result in some nasty performance changes to some of the code out there (including code that I've written). Regardless, you have the choice of making length O(1) or splicing O(1) and not both, and C++98/03 made splicing O(1), which I'm inclined to believe is the better choice, though having size when it's O(n) was a definite mistake on C++'s part IMHO. I hate to think of how many times I've seen programmer's write container.size() == 0 instead of empty(), particularly when size() was O(n)... - Jonathan M Davis
May 28 2013
On Tue, 28 May 2013 15:06:53 -0400, Jonathan M Davis <jmdavisProg gmx.com> wrote:On Tuesday, May 28, 2013 20:51:55 monarch_dodra wrote:I actually had performance bugs the OTHER way -- I needed the length of a list, and was using .size() without realizing it was O(n). In profiling my code, I found that std::list<T>::iterator::operator++ was the most used function, and I had no idea why for a LONG time :) You know, it is trivial to have an O(1) splice O(n) length list wrapped into an o(n) splice O(1) length list. All you need is to add a tracked length. And this is a REAL pain when you have to do it manually. There should just be two list types in <list> -SteveThat was in C++03. In C++11, length was reduced to 0(1). And splice was increased to 0(N) (I think, I don't remember the exact details, but it was definitely changed).I do remember hearing something about that, though that will actually result in some nasty performance changes to some of the code out there (including code that I've written). Regardless, you have the choice of making length O(1) or splicing O(1) and not both, and C++98/03 made splicing O(1), which I'm inclined to believe is the better choice, though having size when it's O(n) was a definite mistake on C++'s part IMHO. I hate to think of how many times I've seen programmer's write container.size() == 0 instead of empty(), particularly when size() was O(n)...
May 28 2013
On Tuesday, 28 May 2013 at 19:24:05 UTC, Steven Schveighoffer wrote:On Tue, 28 May 2013 15:06:53 -0400, Jonathan M Davis <jmdavisProg gmx.com> wrote:A proper implementation should be able to track length anyway: provide 0(1) splice, and an "amortized" 0(1) length. I've always wondered why the standard didn't decide to do *that*? I think *we* should provide that...On Tuesday, May 28, 2013 20:51:55 monarch_dodra wrote:I actually had performance bugs the OTHER way -- I needed the length of a list, and was using .size() without realizing it was O(n). In profiling my code, I found that std::list<T>::iterator::operator++ was the most used function, and I had no idea why for a LONG time :) You know, it is trivial to have an O(1) splice O(n) length list wrapped into an o(n) splice O(1) length list. All you need is to add a tracked length. And this is a REAL pain when you have to do it manually. There should just be two list types in <list> -SteveThat was in C++03. In C++11, length was reduced to 0(1). And splice was increased to 0(N) (I think, I don't remember the exact details, but it was definitely changed).I do remember hearing something about that, though that will actually result in some nasty performance changes to some of the code out there (including code that I've written). Regardless, you have the choice of making length O(1) or splicing O(1) and not both, and C++98/03 made splicing O(1), which I'm inclined to believe is the better choice, though having size when it's O(n) was a definite mistake on C++'s part IMHO. I hate to think of how many times I've seen programmer's write container.size() == 0 instead of empty(), particularly when size() was O(n)...
May 28 2013
On Tue, 28 May 2013 17:11:06 -0400, monarch_dodra <monarchdodra gmail.com> wrote:A proper implementation should be able to track length anyway: provide 0(1) splice, and an "amortized" 0(1) length. I've always wondered why the standard didn't decide to do *that*? I think *we* should provide that...I'm not sure how that works, can you explain/have a link? -Steve
May 28 2013
On Tuesday, 28 May 2013 at 22:29:02 UTC, Steven Schveighoffer wrote:On Tue, 28 May 2013 17:11:06 -0400, monarch_dodra <monarchdodra gmail.com> wrote:Well, the basic idea is to give your list an "m_size" member. This starts at 0. Whenever the user does a push_back/pop_back or whatnot operation, the the "m_size" attribute gets correctly upgraded. At that point, calling "size()" simply returns "m_size". Now, once a splice operation gets called, the m_size gets reset to a magic value, to indicate that tracking of the size has been lost. Operations will seize upgrading m_size, until a call to size is made, at which point it will be re-calculated, once. This, I think is the best solution, since it keeps people that use length in a safe position, while users of splice are also satisfied. Also, I *think* people who use splice tend to be more aware of the situation, and avoid calling length entirely. Implementation wise, it would mostly look like this: template <typename T> class list { size_t m_size = 0; size_t push_back(T other) { //Normal Code if ( m_size != std::numeric_limits<size_t>::max() ) ++m_size; } size_t splice(list::iterator first, list::iterator last) { //Normal Code //Reset length m_size == std::numeric_limits<size_t>::max(); } size_t size() const { if ( m_size == std::numeric_limits<size_t>::max() ) //Re-evaluate m_size m_size = std::distance( cbegin(), cend() ); return m_size; } };A proper implementation should be able to track length anyway: provide 0(1) splice, and an "amortized" 0(1) length. I've always wondered why the standard didn't decide to do *that*? I think *we* should provide that...I'm not sure how that works, can you explain/have a link? -Steve
May 29 2013
On Wed, 29 May 2013 05:15:00 -0400, monarch_dodra <monarchdodra gmail.com> wrote:On Tuesday, 28 May 2013 at 22:29:02 UTC, Steven Schveighoffer wrote:This is O(n) length, not amortized O(1) length, as it is highly dependent on usage. For example, in a project I am currently working on, I most frequently am using splice to move one list element at a time. This degrades your solution to basically O(n) length for every move, and I move a lot. -SteveOn Tue, 28 May 2013 17:11:06 -0400, monarch_dodra <monarchdodra gmail.com> wrote:Well, the basic idea is to give your list an "m_size" member. This starts at 0. Whenever the user does a push_back/pop_back or whatnot operation, the the "m_size" attribute gets correctly upgraded. At that point, calling "size()" simply returns "m_size". Now, once a splice operation gets called, the m_size gets reset to a magic value, to indicate that tracking of the size has been lost. Operations will seize upgrading m_size, until a call to size is made, at which point it will be re-calculated, once.A proper implementation should be able to track length anyway: provide 0(1) splice, and an "amortized" 0(1) length. I've always wondered why the standard didn't decide to do *that*? I think *we* should provide that...I'm not sure how that works, can you explain/have a link? -Steve
May 29 2013
On Thursday, 30 May 2013 at 01:56:03 UTC, Steven Schveighoffer wrote:On Wed, 29 May 2013 05:15:00 -0400, monarch_dodra <monarchdodra gmail.com> wrote:That's why I said "amortized" with quotes. Not the exact term, I know. I should have been clearer. "You can get 0(1) length, except in certain situations, where you'll have to pay once an 0(n) to get back to 0(1)". And yeah, it is dependent on usage. Like vector's push_back. (although incidentally *that* is true amortized)On Tuesday, 28 May 2013 at 22:29:02 UTC, Steven Schveighoffer wrote:This is O(n) length, not amortized O(1) length, as it is highly dependent on usage.On Tue, 28 May 2013 17:11:06 -0400, monarch_dodra <monarchdodra gmail.com> wrote:Well, the basic idea is to give your list an "m_size" member. This starts at 0. Whenever the user does a push_back/pop_back or whatnot operation, the the "m_size" attribute gets correctly upgraded. At that point, calling "size()" simply returns "m_size". Now, once a splice operation gets called, the m_size gets reset to a magic value, to indicate that tracking of the size has been lost. Operations will seize upgrading m_size, until a call to size is made, at which point it will be re-calculated, once.A proper implementation should be able to track length anyway: provide 0(1) splice, and an "amortized" 0(1) length. I've always wondered why the standard didn't decide to do *that*? I think *we* should provide that...I'm not sure how that works, can you explain/have a link? -SteveFor example, in a project I am currently working on, I most frequently am using splice to move one list element at a time. This degrades your solution to basically O(n) length for every move, and I move a lot.http://www.cplusplus.com/reference/list/list/splice/ Note that 1 element splice (2) is not a problem. It has always been 0(1), regardless of scheme. In my scheme, it wouldn't invalidate legnth. Only [iterator .. iterator] splice (3) is is problematic. Full list splice (1) status is more complex. It is actually 0(1) in the scheme where length is 0(1). With my scheme, the guarantees depends on where you want to place the best compromise.
May 29 2013
On Thu, 30 May 2013 02:09:41 -0400, monarch_dodra <monarchdodra gmail.com> wrote:http://www.cplusplus.com/reference/list/list/splice/ Note that 1 element splice (2) is not a problem. It has always been 0(1), regardless of scheme. In my scheme, it wouldn't invalidate legnth. Only [iterator .. iterator] splice (3) is is problematic.That is true, and a good point. -Steve
May 30 2013
On Tuesday, 28 May 2013 at 19:24:05 UTC, Steven Schveighoffer wrote:You know, it is trivial to have an O(1) splice O(n) length list wrapped into an o(n) splice O(1) length list. All you need is to add a tracked length. And this is a REAL pain when you have to do it manually. There should just be two list types in <list>I think having two lists in list would be overkill. I *do* find the standard's choice odd though: The *primary* reason I ever use list, ever, is for its splice ability to move elements without copying them, and still having their addresses valid. No other container can do that. I don't get why they borked list's primary strength for a function you really shouldn't be using anyway. Worst, as you say, it is easy to implement one in terms of the other... but not the other way around. By doing what they just did, they effectively closed *any* way to splice in 0(1)... If you don't need splice, then most of the time, deque is a better choice of a container, and *that* has 0(1) length (along with a few neat properties). All this really doesn't make sense to me.
May 28 2013