digitalmars.D.announce - New linked list
- Chris Miller (3/3) May 11 2006 A new linked list module has been released at
- Frank Benoit (9/12) May 11 2006 Chris, thanks for your work and that you give it to the community.
- Sean Kelly (9/12) May 11 2006 Very cool. One thing... does this work:
- Walter Bright (5/14) May 11 2006 It's undefined behavior. foreach is entitled to assume that the
- Sean Kelly (13/28) May 11 2006 Even for a class that defines an opApply? What about an alternate synta...
- Walter Bright (4/36) May 11 2006 Yes.
- Chris Miller (4/23) May 11 2006 Makes sense. Perhaps I should add a filter() function that calls back a ...
- Sean Kelly (13/51) May 11 2006 *sigh* I can see the reason for this, but it dramatically reduces the
- Walter Bright (7/24) May 11 2006 The idea is to enable the compiler to do aggressive loop optimizations,
- Sean Kelly (44/52) May 11 2006 I think I could for Thread, though doing so may require an iterator
- Sean Kelly (38/38) May 11 2006 You know, this whole sequence optimization business seems to be linked
- James Dunne (6/53) May 11 2006 The simplest answer is to trust the user to supply some unreserved words...
- Sean Kelly (7/10) May 11 2006 Yeah. I wondered briefly if 'volatile' might suit, but I need to deepen...
- Dave (18/65) May 11 2006 Maybe I'm taking this out of context because I just jumped into this
- Sean Kelly (15/37) May 12 2006 I think you're right. I'll admit much of my confusion here is a result
- Sean Kelly (12/12) May 12 2006 For what it's worth, I've been following a derailed thread on
- Kyle Furlong (9/26) May 12 2006 Straight from the, pardon the phrase, horse's mouth:
- Sean Kelly (5/13) May 12 2006 Personally, I'll wait to rejoice until this actually happens, as it
- Sean Kelly (83/99) May 12 2006 To play the Devil's advocate... doesn't this imply that the body of an
- Walter Bright (9/100) May 12 2006 If you don't mess with the iteration state, it will probably work. But
- Sean Kelly (27/51) May 12 2006 Exactly. And this was the motivation for my suggestion above. If there...
- Walter Bright (9/36) May 16 2006 I have been thinking along the lines of making 'const' a promise *by the...
- kris (8/44) May 16 2006 Is there a way you can do this whereby the source-level "tagging" of
- Dave (37/84) May 16 2006 What I think most programmers would settle for is that const would be
- Sean Kelly (19/29) May 16 2006 How would the programmer know whether data was being modified? For exam...
- R閙y Mou雤a (51/51) May 11 2006 It seems that this kind of use of mixins make the term "mixin" inappropr...
- Derek Parnell (68/83) May 11 2006 This bit my bum yesterday too. I had code inside the foreach that added ...
- Derek Parnell (12/24) May 10 2006 I fell for this one just yesterday. I had ...
- Sean Kelly (7/33) May 12 2006 If People is a dynamic array and adding a person to the list results in
- Boris Wang (3/17) May 23 2006 why not make a feature:
- Frits van Bommel (10/28) May 23 2006 If such a feature was to be implemented, I can think of a better
- Derek Parnell (11/42) May 24 2006 Yes!
- Oskar Linde (23/61) May 24 2006 By reevaluation, I presume you don't mean that given:
- BCS (11/30) May 11 2006 how about:
- Andrew Fedoniouk (8/11) May 13 2006 Yep, intrusive lists are way to go.
- Klaus Oberhofer (21/39) May 16 2006 Since I read an article from Jiri Soukup about intrusive data structures
- Chris Miller (4/4) May 16 2006 Some updates to the linked list module:
A new linked list module has been released at http://www.dprogramming.com/list.php You may be wondering why another linked list; check out the FAQ to see!
May 11 2006
Chris Miller schrieb:A new linked list module has been released at http://www.dprogramming.com/list.php You may be wondering why another linked list; check out the FAQ to see!Chris, thanks for your work and that you give it to the community. I see you use the mixin feature for this list. http://d.puremagic.com/bugzilla/show_bug.cgi?id=106 If the project grows, one will sooner or later get this error, which will block your further development. The mixin feature is only save, if you have the declaration and the instantiation in the same file.
May 11 2006
Chris Miller wrote:A new linked list module has been released at http://www.dprogramming.com/list.php You may be wondering why another linked list; check out the FAQ to see!Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach? Sean
May 11 2006
Sean Kelly wrote:Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 11 2006
Walter Bright wrote:Sean Kelly wrote:Even for a class that defines an opApply? What about an alternate syntax: foreach( Person p; personList ) { if( p.age > 50 ) personList.remove( p ); } Assuming the list code supports this operation (say the 'next' pointer isn't set to null when p is removed, and thus the iteration should continue without any problems), is the behavior still undefined? If so, I assume it would be okay to store a list of 'removed' items until the iteration ends and them remove them all at once? SeanVery cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 11 2006
Sean Kelly wrote:Walter Bright wrote:Yes. The idea is to apply uniform semantics.Sean Kelly wrote:Even for a class that defines an opApply?Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.What about an alternate syntax: foreach( Person p; personList ) { if( p.age > 50 ) personList.remove( p ); } Assuming the list code supports this operation (say the 'next' pointer isn't set to null when p is removed, and thus the iteration should continue without any problems), is the behavior still undefined?Yes.If so, I assume it would be okay to store a list of 'removed' items until the iteration ends and them remove them all at once?Yes.
May 11 2006
On Thu, 11 May 2006 15:06:23 -0400, Walter Bright <newshound digitalmars.com> wrote:Sean Kelly wrote:Makes sense. Perhaps I should add a filter() function that calls back a delegate for each item and allows you to remove items safely.Walter Bright wrote:Yes. The idea is to apply uniform semantics.Sean Kelly wrote:Even for a class that defines an opApply?Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 11 2006
Walter Bright wrote:Sean Kelly wrote:Understandable.Walter Bright wrote:Yes. The idea is to apply uniform semantics.Sean Kelly wrote:Even for a class that defines an opApply?Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.*sigh* I can see the reason for this, but it dramatically reduces the utility of foreach for me, as it's a very common idiom to want to remove elements during an iteration. In fact, I already use this technique quite a bit in Thread and ThreadGroup. That aside, does this restriction also exist for processing AA elements? I ask because this is done on a dynamically allocated list of references rather than against the AA itself, though I suppose that's implementation defined.What about an alternate syntax: foreach( Person p; personList ) { if( p.age > 50 ) personList.remove( p ); } Assuming the list code supports this operation (say the 'next' pointer isn't set to null when p is removed, and thus the iteration should continue without any problems), is the behavior still undefined?Yes.Thanks. Assuming I did this, would it be technically legal to destroy this list after the iteration ends but before opApply returns? Otherwise, I'm not entirely certain how to go about processing the removals. SeanIf so, I assume it would be okay to store a list of 'removed' items until the iteration ends and them remove them all at once?Yes.
May 11 2006
Sean Kelly wrote:*sigh* I can see the reason for this, but it dramatically reduces the utility of foreach for me, as it's a very common idiom to want to remove elements during an iteration. In fact, I already use this technique quite a bit in Thread and ThreadGroup. That aside, does this restriction also exist for processing AA elements?Yes - everything used as an aggregate in a foreach.I ask because this is done on a dynamically allocated list of references rather than against the AA itself, though I suppose that's implementation defined.The idea is to enable the compiler to do aggressive loop optimizations, regardless of the aggregate type used, and regardless of whether it is user-defined or built-in.I'm a bit reluctant to say yes on that, worrying that I'm forgetting something. What you can do is not use foreach, but just a for loop.Thanks. Assuming I did this, would it be technically legal to destroy this list after the iteration ends but before opApply returns? Otherwise, I'm not entirely certain how to go about processing the removals.If so, I assume it would be okay to store a list of 'removed' items until the iteration ends and them remove them all at once?Yes.
May 11 2006
Walter Bright wrote:I think I could for Thread, though doing so may require an iterator implementation (threads are now stored in a linked list). I have fewer options for ThreadGroup as it's implemented using an AA, but perhaps I can come up with something. However, using foreach offers a tremendous advantage that other options don't: I can use a 'synchronized' block inside opApply and get thread safety without any user involvement. Better yet, on platforms with reentrant locks (ie. all platforms D currently runs on) I can synchronize on the same object in container method calls and pay for only a single lock/unlock for the entire loop, regardless of what the user does to the container. The alternative with traditional loops would be to use the iterator as an RAII mechanism for this lock, but that seems a disaster waiting to happen. Alternately, I could follow the traditional approach of not locking within the container at all, but that seems a tad odd for code specifically intended for multithreaded programming. To redirect discussion a bit, what about something akin to iterator categories in C++. For example, it seems likely that a compiler might try to optimize loops involving a random_access_iterator, but far less likely that this would occur for other iterator types (forward_only, input_iterator, etc). Perhaps there's a way to indicate to the compiler what sort of sequence is being represented as a way to suggest (or restrict) possible optimizations that may be performed? Thus I may be overjoyed to have the compiler figure out something clever to do with my BitArray iteration and may even be willing to provide hints so that it can do so, but I wouldn't necessarily want or expect this to occur for my SqlCursor object. This would also allow me to restrict optimizations in instances when dynamic manipulation of the sequence is more important to me than iteration performance. This idea just came to me so I don't have any brilliant suggestions for how best to implement it, but one obvious method would be to have separate opApply signatures for each optimization category. However, it would be nice if there were some way to represent in code how the built in types should be handled as well. Herb Sutter's Concur project attempts to solve this by including an additional qualifier in the loops themselves: for_each( c.depth_first(), f ); // sequential for_each( c.depth_first(), f, parallel ); // fully parallel for_each( c.depth_first(), f, ordered ); // ordered parallel And while I don't entirely like that the user must specify the optimization/parallelization level to use, I do like that the choice isn't completely taken out of the hands of the programmer. Does the general idea seem reasonable? SeanThanks. Assuming I did this, would it be technically legal to destroy this list after the iteration ends but before opApply returns? Otherwise, I'm not entirely certain how to go about processing the removals.I'm a bit reluctant to say yes on that, worrying that I'm forgetting something. What you can do is not use foreach, but just a for loop.
May 11 2006
You know, this whole sequence optimization business seems to be linked to mutability: if no mutating operations are performed then optimize away! Unfortunately, I haven't seen a high-level construct for method classification that can't be violated by the user. And while this may be irritating in some cases, it could lead to nasty bugs in the face of such optimizations. In some respects, it would be nice if each operation in the language could be labeled as mutating or non-mutating and then functions could be analyzed for atomicity during compilation, so a function would considered atomic if it does not perform a mutating operation on non-local data. But this may be overly restrictive. For example, the body of a foreach loop must be able to perform some mutating operations, but different categorites would impose different restrictions on optimization: int sum = 0; foreach( int val; values ) { sum += val; } Here, the code could be heavily parallelized because the result doesn't depend on the sequence in which operations are performed. However: foreach( int val; values ) { printf( "val: %d\n", val ); } Here, the code could be optimized somewhat but must still be executed sequentially. By comparison: void fn() { values ~= 5; } foreach( int val; values ) { if( val > 10 ) fn(); } No optimizations may be performed here because the iteration sequence is being modified during processing. To make matters worse, this could occur through an untraceable sequence of pointers and such that the compiler can't make sense of. What would be wonderful is if there were a way to distinguish these three categories of code somehow. I know, that's the million dollar question, but I harbor a secret hope that if it's asked frequently enough then perhaps someone will think of an answer. Sean
May 11 2006
Sean Kelly wrote:You know, this whole sequence optimization business seems to be linked to mutability: if no mutating operations are performed then optimize away! Unfortunately, I haven't seen a high-level construct for method classification that can't be violated by the user. And while this may be irritating in some cases, it could lead to nasty bugs in the face of such optimizations. In some respects, it would be nice if each operation in the language could be labeled as mutating or non-mutating and then functions could be analyzed for atomicity during compilation, so a function would considered atomic if it does not perform a mutating operation on non-local data. But this may be overly restrictive. For example, the body of a foreach loop must be able to perform some mutating operations, but different categorites would impose different restrictions on optimization: int sum = 0; foreach( int val; values ) { sum += val; } Here, the code could be heavily parallelized because the result doesn't depend on the sequence in which operations are performed. However: foreach( int val; values ) { printf( "val: %d\n", val ); } Here, the code could be optimized somewhat but must still be executed sequentially. By comparison: void fn() { values ~= 5; } foreach( int val; values ) { if( val > 10 ) fn(); } No optimizations may be performed here because the iteration sequence is being modified during processing. To make matters worse, this could occur through an untraceable sequence of pointers and such that the compiler can't make sense of. What would be wonderful is if there were a way to distinguish these three categories of code somehow. I know, that's the million dollar question, but I harbor a secret hope that if it's asked frequently enough then perhaps someone will think of an answer. SeanThe simplest answer is to trust the user to supply some unreserved words as hints to the compiler about what the loop's behavior is. -- Regards, James Dunne
May 11 2006
James Dunne wrote:The simplest answer is to trust the user to supply some unreserved words as hints to the compiler about what the loop's behavior is.Yeah. I wondered briefly if 'volatile' might suit, but I need to deepen my understanding of optimizer techniques before going any further. One pretty basic question I had was whether a volatile statement is basically a manually specified basic block or whether the restriction on loads and stores doesn't apply in quite the same way to basic blocks. Sean
May 11 2006
Sean Kelly wrote:You know, this whole sequence optimization business seems to be linked to mutability: if no mutating operations are performed then optimize away! Unfortunately, I haven't seen a high-level construct for method classification that can't be violated by the user. And while this may be irritating in some cases, it could lead to nasty bugs in the face of such optimizations. In some respects, it would be nice if each operation in the language could be labeled as mutating or non-mutating and then functions could be analyzed for atomicity during compilation, so a function would considered atomic if it does not perform a mutating operation on non-local data. But this may be overly restrictive. For example, the body of a foreach loop must be able to perform some mutating operations, but different categorites would impose different restrictions on optimization: int sum = 0; foreach( int val; values ) { sum += val; } Here, the code could be heavily parallelized because the result doesn't depend on the sequence in which operations are performed. However: foreach( int val; values ) { printf( "val: %d\n", val ); } Here, the code could be optimized somewhat but must still be executed sequentially. By comparison: void fn() { values ~= 5; } foreach( int val; values ) { if( val > 10 ) fn(); } No optimizations may be performed here because the iteration sequence is being modified during processing. To make matters worse, this could occur through an untraceable sequence of pointers and such that the compiler can't make sense of. What would be wonderful is if there were a way to distinguish these three categories of code somehow. I know, that's the million dollar question, but I harbor a secret hope that if it's asked frequently enough then perhaps someone will think of an answer. SeanMaybe I'm taking this out of context because I just jumped into this thread, but... I think a D compiler (because foreach is built-in) could safely optimize those three w/o a lot of magic right now. outside of the loop scope. So I think a compiler could determine that and fully optimize w/ parallelization or whatever. The 2nd one calls outside of the loop scope, so the compiler could keep track of that to determine that it can't safely do any optimization where the elements may be unordered. Even if it is a call to something like putc which is then inlined, it must eventually make a call and/or write a value to a hardware reference, either of which - call or dereference - the optimizer could flag as "can't use an unordered optimization". IIRC, the 3rd one is undefined behavior because the aggregate is modified inside the loop. But even if it wasn't, the same restrictions
May 11 2006
Dave wrote:Maybe I'm taking this out of context because I just jumped into this thread, but... I think a D compiler (because foreach is built-in) could safely optimize those three w/o a lot of magic right now. outside of the loop scope. So I think a compiler could determine that and fully optimize w/ parallelization or whatever. The 2nd one calls outside of the loop scope, so the compiler could keep track of that to determine that it can't safely do any optimization where the elements may be unordered. Even if it is a call to something like putc which is then inlined, it must eventually make a call and/or write a value to a hardware reference, either of which - call or dereference - the optimizer could flag as "can't use an unordered optimization". IIRC, the 3rd one is undefined behavior because the aggregate is modified inside the loop. But even if it wasn't, the same restrictionsI think you're right. I'll admit much of my confusion here is a result of not knowing just how aggressive current optimization strategies are. Also, this seems like a good opportunity to explore how this might extend to implicit parallelization and such, which seems to follow similar rules but isn't at all common (I think the Intel compiler does parallel loop unrolling in certain instances, but that's about it). Ideally, it would be nice if there were some means of allowing optimizations to take place for which there currently isn't enough information, similar to the various levels of parallelization in the C++ example. But at the very least I'd be happy if I didn't have to give up on foreach altogether in instances where the result should be but my linked list example should not be). Sean
May 12 2006
For what it's worth, I've been following a derailed thread on comp.lang.c++.moderated where Walter has been discussing foreach (curse the long Google Groups links): http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/d6695737a74e1853/33810ced7bb0924c It makes complete sense that the data structure should be considered the loop invariant in foreach and so this may be a lost cause, but I don't want to give up on it without a bit more research, mostly because the design of foreach is so useful that I very much prefer it over the alternatives from a semantic perspective. If there is a way that optimizations of foreach could be made more flexible without compromising its design intent then fantastic. If not... well, who knows. Sean
May 12 2006
Sean Kelly wrote:For what it's worth, I've been following a derailed thread on comp.lang.c++.moderated where Walter has been discussing foreach (curse the long Google Groups links): http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/d6695737a74e1 53/33810ced7bb0924c It makes complete sense that the data structure should be considered the loop invariant in foreach and so this may be a lost cause, but I don't want to give up on it without a bit more research, mostly because the design of foreach is so useful that I very much prefer it over the alternatives from a semantic perspective. If there is a way that optimizations of foreach could be made more flexible without compromising its design intent then fantastic. If not... well, who knows. SeanStraight from the, pardon the phrase, horse's mouth: "That said, D will probably get some notion of const. But it will be one that works, i.e. can be relied upon by both the programmer and the optimizer." And there was much rejoicing. -- Kyle Furlong // Physics Undergrad, UCSB "D is going wherever the D community wants it to go." - Walter Bright
May 12 2006
Kyle Furlong wrote:Straight from the, pardon the phrase, horse's mouth: "That said, D will probably get some notion of const. But it will be one that works, i.e. can be relied upon by both the programmer and the optimizer." And there was much rejoicing.Personally, I'll wait to rejoice until this actually happens, as it seems a difficult problem to solve. See the bottom of my other post for a few comments on why. Sean
May 12 2006
Walter Bright wrote:Sean Kelly wrote:To play the Devil's advocate... doesn't this imply that the body of an opApply function is also not allowed to make any modifications to the underlying data structure? As a contrived example, suppose I have a SqlCursor object that performs lazy reads and caches this data once read so the read must only occur once, and that this data is stored in an SList which is used for the actual iteration. From a code optimization standpoint this seems entirely reasonable. However, it does seem to violate the general rule you've outlined for what is legal in foreach loops. You've said that the point of foreach is to allow the compiler to treat the aggregate as a loop invariant, but my understanding of this term implies that only optimizations on the referent of v may be optimized. For example: int[] v; ... foreach( int i; v ) { ... } may be transformed into: t1 = v.ptr; t2 = v.len; t3 = 0; L1: if( t2 <= t3 ) goto L2; ... t3 += 1; goto L1; L2: ... This would explain Derek's experience where sequence modifications were not observed within the loop. However, without a viable 'const' signifier, I don't think any optimization regarding the sequence itself may be performed: t1 = v.ptr; t2 = v.len; t3 = 0; if( t2 <= 0 ) goto L2; t4 = alloca( t2 ) t4[] = t1[0 .. t2] L1: if( t2 <= t3 ) goto L2; ... t3 += 1; goto L1; L2: ... Is this correct? The first case would explain why rehashing an AA within a foreach loop would likely result in undefined behavior, as would an append-type operation on dynamic arrays. But more structurally cohesive data structures (ie. those for which modifications don't result in iterator invalidation in C++: map, list, etc) should be fairly compatible with foreach even in the face of such changes regardless of whether such detailed rules would be advisable within a language spec. Regarding the 'const' label. The only way I can see such an idea working in a way that supports compiler optimizations is to make it a dynamic storage attribute of sorts. That is, the const-ness must be applied to the data, not to the reference. I simply can't think of any option where the traditional approach would work: char[] ptr = ""; const char[] ref = ptr; ... foreach( char c; ref ) { ... } If ptr is ever passed by reference to an opaque function then there is no way to determine whether the data referenced by ref may change within the life of the program. In fact, the same could be said if ref were ever passed by reference to an opaque function implemented in any language other than D, as one must assume that the const attribute could be ignored. Frankly, this has me wondering whether it's possible to enforce any sort of const behavior without data movement in the face of aliasing. For example, in instances where const data is being passed to an opaque code segment it should be possible to instead make a copy in a write-protected memory page and pass a reference to that instead, similar to how passing structs by value to functions works now. This would result in a run-time error if a modification occurred to such data, similar to what happens for modifications to const data in D now. But as far as I know it isn't possible to mark arbitrary memory ranges as read-only. In the long-term, if transactional memory ever becomes a reality then it may be possible to exploit this as a lightweight way of enforcing const-ness: cache the data somewhere and if the cache is ever invalidated then signal an error. But this may not be an option for const data that has a long lifetime. Sean*sigh* I can see the reason for this, but it dramatically reduces the utility of foreach for me, as it's a very common idiom to want to remove elements during an iteration. In fact, I already use this technique quite a bit in Thread and ThreadGroup. That aside, does this restriction also exist for processing AA elements?Yes - everything used as an aggregate in a foreach.I ask because this is done on a dynamically allocated list of references rather than against the AA itself, though I suppose that's implementation defined.The idea is to enable the compiler to do aggressive loop optimizations, regardless of the aggregate type used, and regardless of whether it is user-defined or built-in.
May 12 2006
Sean Kelly wrote:Walter Bright wrote:Yes, it does imply that.Sean Kelly wrote:To play the Devil's advocate... doesn't this imply that the body of an opApply function is also not allowed to make any modifications to the underlying data structure?*sigh* I can see the reason for this, but it dramatically reduces the utility of foreach for me, as it's a very common idiom to want to remove elements during an iteration. In fact, I already use this technique quite a bit in Thread and ThreadGroup. That aside, does this restriction also exist for processing AA elements?Yes - everything used as an aggregate in a foreach.I ask because this is done on a dynamically allocated list of references rather than against the AA itself, though I suppose that's implementation defined.The idea is to enable the compiler to do aggressive loop optimizations, regardless of the aggregate type used, and regardless of whether it is user-defined or built-in.As a contrived example, suppose I have a SqlCursor object that performs lazy reads and caches this data once read so the read must only occur once, and that this data is stored in an SList which is used for the actual iteration. From a code optimization standpoint this seems entirely reasonable. However, it does seem to violate the general rule you've outlined for what is legal in foreach loops. You've said that the point of foreach is to allow the compiler to treat the aggregate as a loop invariant, but my understanding of this term implies that only optimizations on the referent of v may be optimized. For example: int[] v; ... foreach( int i; v ) { ... } may be transformed into: t1 = v.ptr; t2 = v.len; t3 = 0; L1: if( t2 <= t3 ) goto L2; ... t3 += 1; goto L1; L2: ... This would explain Derek's experience where sequence modifications were not observed within the loop. However, without a viable 'const' signifier, I don't think any optimization regarding the sequence itself may be performed: t1 = v.ptr; t2 = v.len; t3 = 0; if( t2 <= 0 ) goto L2; t4 = alloca( t2 ) t4[] = t1[0 .. t2] L1: if( t2 <= t3 ) goto L2; ... t3 += 1; goto L1; L2: ... Is this correct? The first case would explain why rehashing an AA within a foreach loop would likely result in undefined behavior, as would an append-type operation on dynamic arrays. But more structurally cohesive data structures (ie. those for which modifications don't result in iterator invalidation in C++: map, list, etc) should be fairly compatible with foreach even in the face of such changes regardless of whether such detailed rules would be advisable within a language spec.If you don't mess with the iteration state, it will probably work. But the spec won't guarantee it will work, as the intention of foreach is to enable aggressive compiler optimizations. I want to leave the door as wide open to that as possible.If ptr is ever passed by reference to an opaque function then there is no way to determine whether the data referenced by ref may change within the life of the program. In fact, the same could be said if ref were ever passed by reference to an opaque function implemented in any language other than D, as one must assume that the const attribute could be ignored. Frankly, this has me wondering whether it's possible to enforce any sort of const behavior without data movement in the face of aliasing. For example, in instances where const data is being passed to an opaque code segment it should be possible to instead make a copy in a write-protected memory page and pass a reference to that instead, similar to how passing structs by value to functions works now. This would result in a run-time error if a modification occurred to such data, similar to what happens for modifications to const data in D now. But as far as I know it isn't possible to mark arbitrary memory ranges as read-only.It isn't possible to prevent the programmer from subverting the rules and modifying memory. The only thing that can be done is to label such as "undefined behavior".
May 12 2006
Walter Bright wrote:If you don't mess with the iteration state, it will probably work. But the spec won't guarantee it will work, as the intention of foreach is to enable aggressive compiler optimizations. I want to leave the door as wide open to that as possible.Understood.Exactly. And this was the motivation for my suggestion above. If there is no way to prevent the user from subverting a system built upon reference attributes, then the only alternative I can think of would be to build it upon the data itself. D already has the const storage type which does much the same thing: if the user manages to outfox the compiler and modify data in ROM an error will occur. A similar measure of protection could be extended to dynamic data using the page protection features supplied by the memory manager, but there are some obvious problems: - To add data to a protected page the page must either be temporarily unprotected, making it vulnerable to modification by another thread, or it must be modified and the resulting error ignored (which is quite slow). - Such protection is far from granular, as it requires copying data the compiler cannot guarantee will not be modified by user code into a protected memory page. But as far as I'm aware, there is no other way to obtain low-level support for read-only data. I can think of a few possible workarounds to the above problems, but all of them would require special handing of references to const data, and this seems unacceptable for a language like D. But a data-oriented const system seems the only option for something that could actually be exploited by an optimizer. It's just too easy to work around any such attempt built into the language syntax itself. Heck, inline assembler is an obvious and huge back-door to any syntax-oriented protection mechanism. There's simply no point in even considering that route. SeanIf ptr is ever passed by reference to an opaque function then there is no way to determine whether the data referenced by ref may change within the life of the program. In fact, the same could be said if ref were ever passed by reference to an opaque function implemented in any language other than D, as one must assume that the const attribute could be ignored. Frankly, this has me wondering whether it's possible to enforce any sort of const behavior without data movement in the face of aliasing. For example, in instances where const data is being passed to an opaque code segment it should be possible to instead make a copy in a write-protected memory page and pass a reference to that instead, similar to how passing structs by value to functions works now. This would result in a run-time error if a modification occurred to such data, similar to what happens for modifications to const data in D now. But as far as I know it isn't possible to mark arbitrary memory ranges as read-only.It isn't possible to prevent the programmer from subverting the rules and modifying memory. The only thing that can be done is to label such as "undefined behavior".
May 12 2006
Sean Kelly wrote:Exactly. And this was the motivation for my suggestion above. If there is no way to prevent the user from subverting a system built upon reference attributes, then the only alternative I can think of would be to build it upon the data itself. D already has the const storage type which does much the same thing: if the user manages to outfox the compiler and modify data in ROM an error will occur. A similar measure of protection could be extended to dynamic data using the page protection features supplied by the memory manager, but there are some obvious problems: - To add data to a protected page the page must either be temporarily unprotected, making it vulnerable to modification by another thread, or it must be modified and the resulting error ignored (which is quite slow). - Such protection is far from granular, as it requires copying data the compiler cannot guarantee will not be modified by user code into a protected memory page. But as far as I'm aware, there is no other way to obtain low-level support for read-only data. I can think of a few possible workarounds to the above problems, but all of them would require special handing of references to const data, and this seems unacceptable for a language like D. But a data-oriented const system seems the only option for something that could actually be exploited by an optimizer. It's just too easy to work around any such attempt built into the language syntax itself. Heck, inline assembler is an obvious and huge back-door to any syntax-oriented protection mechanism. There's simply no point in even considering that route.I have been thinking along the lines of making 'const' a promise *by the programmer* not to modify the data, and then the compiler will *assume* the promise is kept. Your idea is a way to enforce the issue. The problem is that changing page protection attributes is very slow - but one could afford the cost in "debug compiles." This would be a cool way to implement "const-correctness" without the hassle and cruft. There are still problems, though. You can't make part of an object readonly with hardware.
May 16 2006
Walter Bright wrote:Sean Kelly wrote:Is there a way you can do this whereby the source-level "tagging" of these promises is explicit enough for a lint-like tool to operate upon in a (fully) deterministic manner? I ask because there seems to be two opposing needs involved to support the notion of 'const': the complexity and performance of the compiler, and the facility for a programmer to be informed when they invariably make a mistake. As language users, we tend to rely upon the latter?Exactly. And this was the motivation for my suggestion above. If there is no way to prevent the user from subverting a system built upon reference attributes, then the only alternative I can think of would be to build it upon the data itself. D already has the const storage type which does much the same thing: if the user manages to outfox the compiler and modify data in ROM an error will occur. A similar measure of protection could be extended to dynamic data using the page protection features supplied by the memory manager, but there are some obvious problems: - To add data to a protected page the page must either be temporarily unprotected, making it vulnerable to modification by another thread, or it must be modified and the resulting error ignored (which is quite slow). - Such protection is far from granular, as it requires copying data the compiler cannot guarantee will not be modified by user code into a protected memory page. But as far as I'm aware, there is no other way to obtain low-level support for read-only data. I can think of a few possible workarounds to the above problems, but all of them would require special handing of references to const data, and this seems unacceptable for a language like D. But a data-oriented const system seems the only option for something that could actually be exploited by an optimizer. It's just too easy to work around any such attempt built into the language syntax itself. Heck, inline assembler is an obvious and huge back-door to any syntax-oriented protection mechanism. There's simply no point in even considering that route.I have been thinking along the lines of making 'const' a promise *by the programmer* not to modify the data, and then the compiler will *assume* the promise is kept.
May 16 2006
kris wrote:Walter Bright wrote:What I think most programmers would settle for is that const would be enforced in the 'easy' cases like: int foo(const SomeObject o, const int[] arr) { o.i = 20; // error, is not an lvalue arr[10] = 30; // error, is not an lvalue o.bar(); // Ok o.baz(); // error, discards qualifiers o = new SomeObject; // Ok // ... } class SomeObject { int i; int bar() const // a promise object isn't modified (like C++) { //i = 30; error, i is not an lvalue return i; // Ok } void baz() { i = 40; } } The compiler would just check that a member of any type of aggregate/object cannot be on the LHS of any assignment operator, or that a call to an object method is const, that's it. If they modify a const param. through the reference, then all bets are off. To reinforce the idea of const only for aggregates/objects, and to make the implementation as simple as possible, don't allow naked pointer variables to be const if that avoids a significant amount of complexity. Then write the doc. (and spec.) to reflect this, so what const is expected to do is not ambiguous for either programmers or compiler developers. Just a thought... - DaveSean Kelly wrote:Is there a way you can do this whereby the source-level "tagging" of these promises is explicit enough for a lint-like tool to operate upon in a (fully) deterministic manner? I ask because there seems to be two opposing needs involved to support the notion of 'const': the complexity and performance of the compiler, and the facility for a programmer to be informed when they invariably make a mistake. As language users, we tend to rely upon the latter?Exactly. And this was the motivation for my suggestion above. If there is no way to prevent the user from subverting a system built upon reference attributes, then the only alternative I can think of would be to build it upon the data itself. D already has the const storage type which does much the same thing: if the user manages to outfox the compiler and modify data in ROM an error will occur. A similar measure of protection could be extended to dynamic data using the page protection features supplied by the memory manager, but there are some obvious problems: - To add data to a protected page the page must either be temporarily unprotected, making it vulnerable to modification by another thread, or it must be modified and the resulting error ignored (which is quite slow). - Such protection is far from granular, as it requires copying data the compiler cannot guarantee will not be modified by user code into a protected memory page. But as far as I'm aware, there is no other way to obtain low-level support for read-only data. I can think of a few possible workarounds to the above problems, but all of them would require special handing of references to const data, and this seems unacceptable for a language like D. But a data-oriented const system seems the only option for something that could actually be exploited by an optimizer. It's just too easy to work around any such attempt built into the language syntax itself. Heck, inline assembler is an obvious and huge back-door to any syntax-oriented protection mechanism. There's simply no point in even considering that route.I have been thinking along the lines of making 'const' a promise *by the programmer* not to modify the data, and then the compiler will *assume* the promise is kept.
May 16 2006
Walter Bright wrote:I have been thinking along the lines of making 'const' a promise *by the programmer* not to modify the data, and then the compiler will *assume* the promise is kept.How would the programmer know whether data was being modified? For example: class MyClass { char[] name() { if( !m_name ) m_name = readNameFromDB(); return m_name; } } const MyClass ref = new MyClass; printf( "%.*s\n", ref.name ); It seems this could impose maintenance problems and may require code the user wants to const-qualify to be inspectable. Or are you suggesting there would be some language support for checking this? And if so, how would you avoid a Cpp-like syntax?Your idea is a way to enforce the issue. The problem is that changing page protection attributes is very slow - but one could afford the cost in "debug compiles." This would be a cool way to implement "const-correctness" without the hassle and cruft. There are still problems, though. You can't make part of an object readonly with hardware.Yeah it's kind of a sledgehammer for delicate work. But it might prove useful for tracking down some const violations. I may experiment with it from a library perspective and see if I can find a usable syntax. Sean
May 16 2006
It seems that this kind of use of mixins make the term "mixin" inappropriate: it looks like traits (as on http://www.iam.unibe.ch/~scg/Research/Traits/). A class has two competing role: being a mold to generate instances and being a mean of code reuse. The first goal lead to full featured classes, the second to small sized ones, wich turns to be paradoxal. Traits give a way to address this problem. Not that in object engineering litterature, a mixin class is a parameterized subclass. It can be used to implement some features of aspect oriented programming. Mixin classes semantics in D would be : template MixinClass ( T ) { class MyClass : T { this () { super (); } void foo () { super (); doSomethingMore (); } void doSomethingMore () { addFeatures (); } } } Mixin classes can create conflicts ( methods having the same signatures ) when they are composed together. Using named mixins, we can select wich implementation to use but it would be more interesting to make mixins inherit from the other composed mixins. As the current scope has precedence over the mixin code we could do something like ( I never tested it ): class MyMixedInClass { mixin MyClass first ; mixin MyClass2 second ; void doSomethingMore () { second.doSomethingMore (); // Select the priority between mixins. first.doSomethingMore (); } } Or we could make a feature request to have built in mixin classes, something like : mixin class ToBeComposed ( S /* superclass */, /* template params */ ... ) { void feature () { ... } } class Composit { mixin !( T ) ToBeComposed ; mixin OtherMixinClass ; void feature () { mixin ToBeComposed, OtherMixinClass ; // set mixin precedence. } } The syntax has to be improved but it seems that D has all the features to make that kind of syntactic sugar. BTW, this linked list is a great idea.
May 11 2006
On Thu, 11 May 2006 09:52:10 -0700, Walter Bright wrote:Sean Kelly wrote:This bit my bum yesterday too. I had code inside the foreach that added to the list, only to discover that even though the elements were added, they were ignored inside the same foreach loop. --------- example.d ----------- import std.stdio; import std.string; void tester(char[] x) { writefln("Adding '%s'", x); vList ~= x; } char[][] vList; void main() { tester("abc"); tester("def"); tester("ghi"); version(FOR) { for(int i; i < vList.length; i++) { char[] p = vList[i]; writefln("Found '%s'", p); if (std.string.find(p, "e") != -1) { tester("orang"); } } } version(EACH) { foreach(char[] p; vList) { writefln("Found '%s'", p); if (std.string.find(p, "e") != -1) { tester("orang"); } } } } ---------- end of example.d ---------- c:\temp>build example -full -exec -version=FOR Path and Version : y:\util\build.exe v2.10(1556) built on Thu Apr 6 13:13:32 2006 Adding 'abc' Adding 'def' Adding 'ghi' Found 'abc' Found 'def' Adding 'orang' Found 'ghi' Found 'orang' c:\temp>build example -full -exec -version=EACH Path and Version : y:\util\build.exe v2.10(1556) built on Thu Apr 6 13:13:32 2006 Adding 'abc' Adding 'def' Adding 'ghi' Found 'abc' Found 'def' Adding 'orang' Found 'ghi' -- Derek (skype: derek.j.parnell) Melbourne, Australia "Down with mediocracy!" 12/05/2006 10:08:16 AMVery cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 11 2006
On Fri, 12 May 2006 02:52:10 +1000, Walter Bright <newshound digitalmars.com> wrote:Sean Kelly wrote:I fell for this one just yesterday. I had ... foreach( Person p; People) { Foo(); SomeFunc(); // Potentially adds an item to the People list. } but the new items were never processed by Foo() ! -- Derek Parnell Melbourne, AustraliaVery cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 10 2006
Derek Parnell wrote:On Fri, 12 May 2006 02:52:10 +1000, Walter Bright <newshound digitalmars.com> wrote:If People is a dynamic array and adding a person to the list results in a reallocation then I'd expect this. I would actually consider this undefined behavior for dynamic arrays because whether a reallocation occurs is implementation defined. But for a linked list it should work, darnit :-) SeanSean Kelly wrote:I fell for this one just yesterday. I had ... foreach( Person p; People) { Foo(); SomeFunc(); // Potentially adds an item to the People list. } but the new items were never processed by Foo() !Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 12 2006
why not make a feature: safe foreach( Person p; ....) "Walter Bright" <newshound digitalmars.com> 写入消息新闻:e3vq3l$1rp5$3 digitaldaemon.com...Sean Kelly wrote:Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 23 2006
Boris Wang wrote: [edited upside-down post]"Walter Bright" <newshound digitalmars.com> 写入消息新闻:e3vq3l$1rp5$3 digitaldaemon.com...If such a feature was to be implemented, I can think of a better (cleaner and more consistent with the rest of the language) syntax: foreach(Peron p; inout people) { // ... } Why invent a new keyword when an old one (though in a different place) perfectly expresses what's going on? Frits van BommelSean Kelly wrote:why not make a feature: safe foreach( Person p; ....)Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 23 2006
On Wed, 24 May 2006 08:47:21 +0200, Frits van Bommel wrote:Boris Wang wrote: [edited upside-down post]Yes! The meaning of this would be that "people" would be 're-evaluated' prior to each iteration under the assumption that something in the body of the loop modified the array in some manner. -- Derek (skype: derek.j.parnell) Melbourne, Australia "Down with mediocracy!" 24/05/2006 5:07:14 PM"Walter Bright" <newshound digitalmars.com> 写入消息新闻:e3vq3l$1rp5$3 digitaldaemon.com...> why not make a feature: > > safe foreach( Person p; ....) If such a feature was to be implemented, I can think of a better (cleaner and more consistent with the rest of the language) syntax: foreach(Peron p; inout people) { // ... } Why invent a new keyword when an old one (though in a different place) perfectly expresses what's going on?Sean Kelly wrote:Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 24 2006
Derek Parnell skrev:On Wed, 24 May 2006 08:47:21 +0200, Frits van Bommel wrote:By reevaluation, I presume you don't mean that given: foreach(var; inout Expression), Expression is reevaluated on each iteration? What would the semantics be for iteration over a regular array when the array is applied the different changing operations? I.e. what should foreach do if the following is done inside the loop body: * arr ~= element; * arr = arr[0..i-1] ~ arr[i+1..$]; // (legal?) * arr[i] = arr[$-1], arr.length = arr.length-1; * arr = arr[-1..$]; * arr = otherArr ~ arr; * more? I guess the only feasible equivalent of foreach(x ; inout arr()) would be: { auto array = arr(); for (int i = 0; i < array.length; i++) { /*inout*/ auto x = array[i]; But that goes against the "each" in foreach by skipping the next element following a deletion and iterating doubly over elements if the array is appended to while the loop is running. /OskarBoris Wang wrote: [edited upside-down post]Yes! The meaning of this would be that "people" would be 're-evaluated' prior to each iteration under the assumption that something in the body of the loop modified the array in some manner."Walter Bright" <newshound digitalmars.com> 写入消息新闻:e3vq3l$1rp5$3 digitaldaemon.com...> why not make a feature: > > safe foreach( Person p; ....) If such a feature was to be implemented, I can think of a better (cleaner and more consistent with the rest of the language) syntax: foreach(Peron p; inout people) { // ... } Why invent a new keyword when an old one (though in a different place) perfectly expresses what's going on?Sean Kelly wrote:Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach?It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
May 24 2006
Sean Kelly wrote:Chris Miller wrote:how about: alias void delegate(void) vvd; vvd outs[]; foreach( Person p; per.each ) { if( p.age > 50 ) outs ~= &p.listRemove; } foreach (act; outs) act();A new linked list module has been released at http://www.dprogramming.com/list.php You may be wondering why another linked list; check out the FAQ to see!Very cool. One thing... does this work: foreach( Person p; per.each ) { if( p.age > 50 ) p.listRemove(); } ie. can you remove elements within a foreach? Sean
May 11 2006
"Chris Miller" <chris dprogramming.com> wrote in message news:op.s9dvy3lhpo9bzi moe...A new linked list module has been released at http://www.dprogramming.com/list.php You may be wondering why another linked list; check out the FAQ to see!Yep, intrusive lists are way to go. BTW: There was another one http://www.digitalmars.com/d/archives/digitalmars/D/dtl/378.html , also intrusive list with support of hierarchical collections - trees. Andrew Fedoniouk. http://terrainformatica.com
May 13 2006
"Andrew Fedoniouk" <news terrainformatica.com> wrote in news:e4689r$cmm$1 digitaldaemon.com:"Chris Miller" <chris dprogramming.com> wrote in message news:op.s9dvy3lhpo9bzi moe...Since I read an article from Jiri Soukup about intrusive data structures back in 1998, I'm a lttle bit biased towards them. My Nova project on dsource hosts currently implementations for - intrusive single linked list - intrusive double linked list - intrusive doublelink - intrusive red/black tree See http://www.dsource.org/projects/nova/wiki/WikiStart Previously I had implemented some of them in C++, but the template system of D and the support for delegates improves the usability a lot. Further references for intrusive data structures are: boost::intrusive - the idea to use accessor classes for Nova comes from this project. In the meantime Jiri Soukup drives his own open source project dedicated exclusively to intrusive data structures. It is called incode and hosted on sourceforge (http://incode.sourceforge.net). Klaus OberhoferA new linked list module has been released at http://www.dprogramming.com/list.php You may be wondering why another linked list; check out the FAQ to see!Yep, intrusive lists are way to go. BTW: There was another one http://www.digitalmars.com/d/archives/digitalmars/D/dtl/378.html , also intrusive list with support of hierarchical collections - trees. Andrew Fedoniouk. http://terrainformatica.com
May 16 2006
Some updates to the linked list module: http://www.dprogramming.com/list.php filter() added to allow removing list items, since foreach does not allow it; ListHead added; and more.
May 16 2006