digitalmars.D - chevrons for D templates in emacs
- Andrei Alexandrescu (21/21) Oct 12 2008 With the help of Pascal J. Bourguignon on the gnu.emacs.help newsgroup I...
- Walter Bright (3/4) Oct 12 2008 I guess this is why I don't use emacs. I don't want to hear any more
- Andrei Alexandrescu (14/19) Oct 12 2008 I agree about that, but why don't you use std.algorithm? :o)
- Anon (4/28) Oct 12 2008
- Andrei Alexandrescu (3/30) Oct 12 2008 A winner but gets penalty points for not admitting being superdan :o).
- Janderson (7/35) Oct 12 2008 This is a similar trick I've used in the past for A*. Basically you
- Benji Smith (24/48) Oct 12 2008 I smell a clever trick to divert us all from rolling our eyes about the
- Andrei Alexandrescu (6/63) Oct 12 2008 Winner!
- Benji Smith (14/17) Oct 12 2008 I should have mentioned...
- Christopher Wright (7/30) Oct 13 2008 Neither Tango nor Phobos contains a heap implementation. I've submitted
- Andrei Alexandrescu (5/32) Oct 13 2008 There are heap primitives in std.algorithm, just not published yet.
- Klaus Oberhofer (9/14) Oct 13 2008 Some time ago I implemented a binary heap because I needed it to create
- Christopher Wright (12/35) Oct 13 2008 There isn't a lot you can do to mess up a heap, but there are some minor...
- KlausO (8/23) Oct 14 2008 I admit, the implementation has a special purpose. It sorts codes by
- Sergey Gromov (9/30) Oct 12 2008 A straight-forward solution would be putting name-size pairs into a tree...
- Andrei Alexandrescu (5/36) Oct 12 2008 The trick here was to figure that you don't really need a sorted
- KennyTM~ (14/38) Oct 12 2008 Straightforward solution: Use insertion sort.
- Andrei Alexandrescu (8/52) Oct 12 2008 One point taken off for not using the new syntax
- KennyTM~ (4/63) Oct 12 2008 I don't know :). I haven't written a priority queue in D (only stacks
- KennyTM~ (14/59) Oct 12 2008 Sorry, wrong solution. Forgot to compare before pop.
- bearophile (50/51) Oct 14 2008 Very nice, thank you. Few quick comments (no deep comments, I haven't us...
- Christopher Wright (43/88) Oct 14 2008 Agreed. The implementation that I submitted had a push method that took
- bearophile (9/15) Oct 14 2008 Right.
- Christopher Wright (6/6) Oct 14 2008 Actually, I STRONGLY urge that nobody use this heap.
- Steven Schveighoffer (5/12) Oct 14 2008 Unless such value types have copy constructors which make a copy of thei...
- Christopher Wright (9/19) Oct 14 2008 Correction: as partial value types. If they were completely value types,...
- Christopher Wright (5/20) Oct 14 2008 Argh, I'm making more of an idiot of myself. If they're value types by
- Andrei Alexandrescu (4/14) Oct 14 2008 I'd say this statement reflects a lack of understanding of value types,
- Benji Smith (7/23) Oct 14 2008 I'll bite.
- Andrei Alexandrescu (3/25) Oct 14 2008 As the Mexican would say: why not?
- Benji Smith (14/28) Oct 14 2008 Well... here are a few reasons why not:
- Andrei Alexandrescu (21/50) Oct 15 2008 I don't want to seem like picking on you, but I would like to answer
- Sergey Gromov (6/9) Oct 15 2008 How about if you want pass-by-value you use structs, otherwise you don't...
- Andrei Alexandrescu (4/14) Oct 15 2008 A struct can choose to implement value or reference semantics as it
- Sergey Gromov (6/20) Oct 15 2008 I can imagine the documentation: "Ignore the fact it's a struct, it's
- Steven Schveighoffer (8/31) Oct 15 2008 There's nothing requiring that a struct must implement value semantics. ...
- Andrei Alexandrescu (6/26) Oct 15 2008 No. "Don't submit to the dogma that struct == value semantics, because
- Sergey Gromov (16/43) Oct 15 2008 I don't understand. Structs are value types, everything else are
- Bill Baxter (5/30) Oct 15 2008 I think the point is that value *type* doesn't necessarily imply value
- Sergey Gromov (6/37) Oct 15 2008 Arrays are very widely used and very well documented to the point they'r...
- Bill Baxter (13/50) Oct 15 2008 And I think the issues are pretty much the same with all struct-based
- Sergey Gromov (18/73) Oct 15 2008 No. Built-in array is transparent. Appender is not. Because if
- bearophile (4/7) Oct 15 2008 Just a note: my ArrayBuilder is a struct (with 3 fields inside, each siz...
- Steven Schveighoffer (11/59) Oct 15 2008 First, there is no compiler required semantics. It is up to the designe...
- Benji Smith (45/75) Oct 15 2008 I don't feel like you're picking on me! Without disagreements, there
- Andrei Alexandrescu (14/28) Oct 15 2008 I know you're going to hate this, but I'm unable to follow through with
- Benji Smith (9/28) Oct 15 2008 Naw, I don't hate it. I'm up to my eyeballs too, and I had no delusions
- Jarrett Billingsley (3/51) Oct 14 2008 Bearophile, you're doing it again. Why are you so hung up on how
- bearophile (4/6) Oct 15 2008 Not all coding styles are created equal. Some are better.
- Steven Schveighoffer (3/7) Oct 15 2008 Yes, but clearly not yours. Mine is much better.
With the help of Pascal J. Bourguignon on the gnu.emacs.help newsgroup I got some chevrons working for D templates. The current version has quite a few shortcomings (no proper nesting and electric behavior could be better) but it does get us started towards finding a comprehensive solution. Paste this into your ~/.emacs (assuming you already use Bill Baxter's d-mode and of course emacs): (defun d-chevrons () (font-lock-add-keywords nil '(("\\(!(\\)[^()]*\\()\\)" 0 (progn (compose-region (match-beginning 1) (match-end 1) ?« 'decompose-region) (compose-region (match-beginning 2) (match-end 2) ?» 'decompose-region) nil)))) (font-lock-mode 1)) (add-hook 'd-mode-hook 'd-chevrons) The result looks pretty darn yummy. Andrei
Oct 12 2008
Andrei Alexandrescu wrote:nil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Walter Bright wrote:Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Andrei Alexandrescu Wrote:Walter Bright wrote:implement a max heap on top of a fixed size array. "insert" all pairs to the heap (if the current is smaller than everything on the heap and the heap is full the pair is discarded) insert is O(log(heap height)) =O(log (1_000_000)) for each file in google's cache, remove top is O(1).Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Anon wrote:Andrei Alexandrescu Wrote:A winner but gets penalty points for not admitting being superdan :o). AndreiWalter Bright wrote:implement a max heap on top of a fixed size array. "insert" all pairs to the heap (if the current is smaller than everything on the heap and the heap is full the pair is discarded) insert is O(log(heap height)) =O(log (1_000_000)) for each file in google's cache, remove top is O(1).Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Anon wrote:Andrei Alexandrescu Wrote:This is a similar trick I've used in the past for A*. Basically you only care about the top item(s) so you only sort them those. With A* of course there's the potential that you might run out of top items. However items that where candidates for the queue but where pushed out are already partially sorted so you can save some cycles time there too. -JoelWalter Bright wrote:implement a max heap on top of a fixed size array. "insert" all pairs to the heap (if the current is smaller than everything on the heap and the heap is full the pair is discarded) insert is O(log(heap height)) =O(log (1_000_000)) for each file in google's cache, remove top is O(1).Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Andrei Alexandrescu wrote:Walter Bright wrote:I smell a clever trick to divert us all from rolling our eyes about the emacs chevron trick :-) Fine by me. I'll bite... auto heap = new BinaryHeap!(File, ulong)(1_000_000); while (googleApi.hasMoreFiles()) { File f = googleApi.getNextFile(); ulong size = googleApi.sizeOfFile(f); heap.add(f, size); } while (heap.size() > 0) { ulong size = heap.getTopValue(); File f = heap.removeTopItem(); Stdout.formatln("file: {0}, size: {1}", f.getName(), size); } The BinaryHeap class I'm using is described pretty well here: http://en.wikipedia.org/wiki/Binary_heap Items with the highest value automatically bubble to the top of the heap, and items with lower values sink to the bottom, eventually falling completely out of the collection. The performance is n log m, where 'n' is the total number of documents, and 'm' is the number of <filename, size> pairs that we care about (in this case 1,000,000). --benjiAndrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Benji Smith wrote:Andrei Alexandrescu wrote:Winner! One nice touch is that you don't even bother to sort the heap; you only remove from it. I wonder if this will be overall sensibly faster than just heapsorting the heap. Thoughts? AndreiWalter Bright wrote:I smell a clever trick to divert us all from rolling our eyes about the emacs chevron trick :-) Fine by me. I'll bite... auto heap = new BinaryHeap!(File, ulong)(1_000_000); while (googleApi.hasMoreFiles()) { File f = googleApi.getNextFile(); ulong size = googleApi.sizeOfFile(f); heap.add(f, size); } while (heap.size() > 0) { ulong size = heap.getTopValue(); File f = heap.removeTopItem(); Stdout.formatln("file: {0}, size: {1}", f.getName(), size); } The BinaryHeap class I'm using is described pretty well here: http://en.wikipedia.org/wiki/Binary_heap Items with the highest value automatically bubble to the top of the heap, and items with lower values sink to the bottom, eventually falling completely out of the collection. The performance is n log m, where 'n' is the total number of documents, and 'm' is the number of <filename, size> pairs that we care about (in this case 1,000,000). --benjiAndrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Andrei Alexandrescu wrote:One nice touch is that you don't even bother to sort the heap; you only remove from it. I wonder if this will be overall sensibly faster than just heapsorting the heap. Thoughts?I should have mentioned... The heap doesn't need to be sorted. Anytime you remove an element from the heap, it's guaranteed to have the highest value. That's the nice thing about the heap. Multiple consecutive removals of the topmost item is functionally identical to sorting it in descending order. Of course, every time you remove the topmost item, it has a cost of log n, where n is the number of remaining items in the heap. But sorting the whole heap would cost n log n anyhow, so the iterative removal process is actually slightly less expensive (since n decreases with every iteration). I actually don't have the code implemented in D, but I've attached my Java implementation, so you can get an idea of how it works. --benji
Oct 12 2008
Benji Smith wrote:Andrei Alexandrescu wrote:Neither Tango nor Phobos contains a heap implementation. I've submitted one to Tango, but I don't think anyone's done anything about it. My implementation's rather smaller, but that may be due to the number of comments yours has. The comment lines and code lines are approaching parity in yours. I believe that mine also releases memory if you add a lot to it and subsequently remove a lot from it.One nice touch is that you don't even bother to sort the heap; you only remove from it. I wonder if this will be overall sensibly faster than just heapsorting the heap. Thoughts?I should have mentioned... The heap doesn't need to be sorted. Anytime you remove an element from the heap, it's guaranteed to have the highest value. That's the nice thing about the heap. Multiple consecutive removals of the topmost item is functionally identical to sorting it in descending order. Of course, every time you remove the topmost item, it has a cost of log n, where n is the number of remaining items in the heap. But sorting the whole heap would cost n log n anyhow, so the iterative removal process is actually slightly less expensive (since n decreases with every iteration). I actually don't have the code implemented in D, but I've attached my Java implementation, so you can get an idea of how it works. --benji
Oct 13 2008
Christopher Wright wrote:Benji Smith wrote:There are heap primitives in std.algorithm, just not published yet. Anyhow, in case you or anyone would be interested in putting your ideas in Phobos, I encourage you to share some code with me so I can look over it. AndreiAndrei Alexandrescu wrote:Neither Tango nor Phobos contains a heap implementation. I've submitted one to Tango, but I don't think anyone's done anything about it.One nice touch is that you don't even bother to sort the heap; you only remove from it. I wonder if this will be overall sensibly faster than just heapsorting the heap. Thoughts?I should have mentioned... The heap doesn't need to be sorted. Anytime you remove an element from the heap, it's guaranteed to have the highest value. That's the nice thing about the heap. Multiple consecutive removals of the topmost item is functionally identical to sorting it in descending order. Of course, every time you remove the topmost item, it has a cost of log n, where n is the number of remaining items in the heap. But sorting the whole heap would cost n log n anyhow, so the iterative removal process is actually slightly less expensive (since n decreases with every iteration). I actually don't have the code implemented in D, but I've attached my Java implementation, so you can get an idea of how it works. --benji
Oct 13 2008
Andrei Alexandrescu schrieb:There are heap primitives in std.algorithm, just not published yet. Anyhow, in case you or anyone would be interested in putting your ideas in Phobos, I encourage you to share some code with me so I can look over it. AndreiSome time ago I implemented a binary heap because I needed it to create Huffman codes (Warning: Maybe the code does not catch up to the current Tango library.). Parts of the code are inspired by the MINTL library, which contains an implementation, too. See my code at http://www.dsource.org/projects/nova/browser/trunk/nova/ds/priorityqueue.d MINTL seems to be abandoned, but I found it in the necrophilia repos: http://necrophilia.googlecode.com/svn/trunk/tests/shared/mintl/arrayheap.d KlausO
Oct 13 2008
Klaus Oberhofer wrote:Andrei Alexandrescu schrieb:There isn't a lot you can do to mess up a heap, but there are some minor issues with yours: 1. It only has one direction (which isn't documented). 2. It has a key/value pair as its element type. 3. It's only keyed on ints. My heap was apparently accepted into Tango last week, with some minor fixes and cleanup (and switched into a struct rather than a class; not sure I like that). The only issue I see with the fixed version is that it uses recursion in a couple places where it could loop instead. Anyway, the code's here: http://dsource.org/projects/tango/browser/trunk/tango/util/container/more/Heap.d?rev=3959There are heap primitives in std.algorithm, just not published yet. Anyhow, in case you or anyone would be interested in putting your ideas in Phobos, I encourage you to share some code with me so I can look over it. AndreiSome time ago I implemented a binary heap because I needed it to create Huffman codes (Warning: Maybe the code does not catch up to the current Tango library.). Parts of the code are inspired by the MINTL library, which contains an implementation, too. See my code at http://www.dsource.org/projects/nova/browser/trunk/nova/ds/priorityqueue.d MINTL seems to be abandoned, but I found it in the necrophilia repos: http://necrophilia.googlecode.com/svn/trunk/tests/shared/mintl/arrayheap.d KlausO
Oct 13 2008
Christopher Wright schrieb:There isn't a lot you can do to mess up a heap, but there are some minor issues with yours: 1. It only has one direction (which isn't documented). 2. It has a key/value pair as its element type. 3. It's only keyed on ints.I admit, the implementation has a special purpose. It sorts codes by their propability to create Huffman codes. All these three points are owed to this context. So the class name "PriorityQueue" is misleading, maybe I had to choose a better name like "CodeByProbabilitySortingPriorityQueue" :)My heap was apparently accepted into Tango last week, with some minor fixes and cleanup (and switched into a struct rather than a class; not sure I like that). The only issue I see with the fixed version is that it uses recursion in a couple places where it could loop instead. Anyway, the code's here: http://dsource.org/projects/tango/browser/trunk/tango/util/container/ ore/Heap.d?rev=3959Thank you for the link. I try to use your general purpose implementation. KlausO
Oct 14 2008
Sun, 12 Oct 2008 19:11:00 -0500, Andrei Alexandrescu wrote:Walter Bright wrote:A straight-forward solution would be putting name-size pairs into a tree sorted by size, which is O(1e6 log 1e6). Then, if the tree contains more than a million elements, remove the smallest which is also O(1e6 log 1e6). You can optimize it by skipping elements smaller than the smallest element in the tree if the tree is full. I haven't found any trees in Phobos though. Am I missing something?Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments.nil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Sergey Gromov wrote:Sun, 12 Oct 2008 19:11:00 -0500, Andrei Alexandrescu wrote:The trick here was to figure that you don't really need a sorted container; an "almost sorted" container suffices, and a heap is exactly what the doctor prescribed. AndreiWalter Bright wrote:A straight-forward solution would be putting name-size pairs into a tree sorted by size, which is O(1e6 log 1e6). Then, if the tree contains more than a million elements, remove the smallest which is also O(1e6 log 1e6). You can optimize it by skipping elements smaller than the smallest element in the tree if the tree is full. I haven't found any trees in Phobos though. Am I missing something?Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments.nil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Andrei Alexandrescu wrote:Walter Bright wrote:Straightforward solution: Use insertion sort. auto list = new PriorityQueue!(Documents); // Documents.opCmp returns difference of two documents' length. // the SHORTEST document gets priority. foreach (doc; google_cache) { // O(N) if (list.length >= 1_000_000) list.pop(); // O(log 1M) list.push(doc); // O(log 1M) } auto array = convertToArray(list); // O(1M log 1M) array.reverse(); // O(1M) write(array); // O(1M). Total time complexity: O(N log 1M).Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
KennyTM~ wrote:Andrei Alexandrescu wrote:One point taken off for not using the new syntax PriorityQueue!Documents. Ahem. On a serious note, you should specify the unusual comparison predicate: auto list = new PriorityQueue!(Documents, "a > b"); as I supposed the default version would use less-than. Or does it?Walter Bright wrote:Straightforward solution: Use insertion sort. auto list = new PriorityQueue!(Documents);Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!// Documents.opCmp returns difference of two documents' length. // the SHORTEST document gets priority. foreach (doc; google_cache) { // O(N) if (list.length >= 1_000_000) list.pop(); // O(log 1M) list.push(doc); // O(log 1M) } auto array = convertToArray(list); // O(1M log 1M) array.reverse(); // O(1M) write(array); // O(1M). Total time complexity: O(N log 1M).Great! I'm glad there's so much interest and competency around here. Andrei
Oct 12 2008
Andrei Alexandrescu wrote:KennyTM~ wrote:Will use it when it's announced :)Andrei Alexandrescu wrote:One point taken off for not using the new syntax PriorityQueue!Documents. Ahem. On a serious note, you should specify the unusual comparison predicate:Walter Bright wrote:Straightforward solution: Use insertion sort. auto list = new PriorityQueue!(Documents);Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!auto list = new PriorityQueue!(Documents, "a > b"); as I supposed the default version would use less-than. Or does it?I don't know :). I haven't written a priority queue in D (only stacks and queues) :)// Documents.opCmp returns difference of two documents' length. // the SHORTEST document gets priority. foreach (doc; google_cache) { // O(N) if (list.length >= 1_000_000) list.pop(); // O(log 1M) list.push(doc); // O(log 1M) } auto array = convertToArray(list); // O(1M log 1M) array.reverse(); // O(1M) write(array); // O(1M). Total time complexity: O(N log 1M).Great! I'm glad there's so much interest and competency around here. Andrei
Oct 12 2008
KennyTM~ wrote:Andrei Alexandrescu wrote:Sorry, wrong solution. Forgot to compare before pop. auto list = new PriorityQueue!(Documents, "a < b"); // Documents.opCmp returns difference of two documents' length. // the SHORTEST document gets priority. foreach (doc; google_cache) { // O(N) if (list.length >= 1_000_000 && list.top() < doc) list.pop(); // O(log 1M) list.push(doc); // O(log 1M) } auto array = convertToArray(list); // O(1M log 1M) array.reverse(); // O(1M) write(array); // O(1M). Total time complexity: O(N log 1M).Walter Bright wrote:Straightforward solution: Use insertion sort. auto list = new PriorityQueue!(Documents); // Documents.opCmp returns difference of two documents' length. // the SHORTEST document gets priority. foreach (doc; google_cache) { // O(N) if (list.length >= 1_000_000) list.pop(); // O(log 1M) list.push(doc); // O(log 1M) } auto array = convertToArray(list); // O(1M log 1M) array.reverse(); // O(1M) write(array); // O(1M). Total time complexity: O(N log 1M).Andrei Alexandrescu wrote:I agree about that, but why don't you use std.algorithm? :o) Speaking of which, here's another challenge for everybody who'd like to waste some cycles. Say you have a simple API for accessing a large collection of files - e.g. all of Google's cached documents. The task, should you accept it, is to find the 1,000,000 largest ones of those files. The output should be filenames sorted in decreasing order of size. Assume the API gives you <filename, size> pairs in a serial fashion. You can't use parallel processing (e.g. map/reduce on clusters), but you are allowed to use threads on one machine if if fancies you. Speed is highly desirable. Devise your algorithm and explain how fast it runs with practical and/or theoretical arguments. Andreinil '(("\\(!(\\)[^()]*\\()\\)"I guess this is why I don't use emacs. I don't want to hear any more grousing about !( ) after that!!!
Oct 12 2008
Christopher Wright:http://dsource.org/projects/tango/browser/trunk/tango/util/container/more/Heap.d?rev=3959Very nice, thank you. Few quick comments (no deep comments, I haven't used it nor read it carefully): In this lines: assert (other.pop is 4); I think it's better to use assert(other.pop == 4); This lines show that a bulk addition method may be useful, that accepts any lazy/eagar iterable: MaxHeap!(uint) h; h ~= 1; h ~= 3; h ~= 2; h ~= 4; The following aliases can be moved just below their respective methods, with a /// ditto before them: alias pop remove; alias push opCatAssign; This style of comments: /** Inserts all elements in the given array into the heap. */ Can sometimes be replaced by: /// Inserts all elements in the given array into the heap. The class ddoc misses the explanation for the Min template argument: struct Heap(T, bool Min) { Some lines of comments are too much long, they may be broken at 80-95 chars long. An optional 'key' callable can be added; it can be used as sorting key mapper (Swartz-style). If not specified it can be "null" that means the current behaviour. Few other handy methods may be added: A merge() (~ and ~=) method can be added, maybe. heapreplace(heap, item) (pop and return the smallest item from the heap, and also push the new item. The heap size doesn't change) heapify(iterable) nlargest(n, iterable) nsmallest(n, iterable) The following is not a critic, just a note, I think programs with such low code density are harder to read to me, they seem like fresh snow: /** Inserts all elements in the given array into the heap. */ void push (T[] array) { while (heap.length < next + array.length) { heap.length = 2 * heap.length + 32; } foreach (t; array) push (t); } I'd write that as this, allowing me to have more code on the screen: /// Inserts all elements in the given array into the heap. void push(T[] array) { while (heap.length < next + array.length) heap.length = 2 * heap.length + 32; foreach (t; array) push(t); } In the future FibonacciHeap too may be useful to be added to the Std libraries. Bye, bearophile
Oct 14 2008
bearophile wrote:Christopher Wright:Any particular reason?http://dsource.org/projects/tango/browser/trunk/tango/util/container/more/Heap.d?rev=3959Very nice, thank you. Few quick comments (no deep comments, I haven't used it nor read it carefully): In this lines: assert (other.pop is 4); I think it's better to use assert(other.pop == 4);This lines show that a bulk addition method may be useful, that accepts any lazy/eagar iterable: MaxHeap!(uint) h; h ~= 1; h ~= 3; h ~= 2; h ~= 4;Agreed. The implementation that I submitted had a push method that took an array. If I had given it more thought, I would've accepted any Tango container, as well, and maybe have a vararg version. (Except, ugh, the new containers are all structs. That means writing a template method for the task rather than using an interface.)The following aliases can be moved just below their respective methods, with a /// ditto before them: alias pop remove; alias push opCatAssign;That was changed after I submitted the code.This style of comments: /** Inserts all elements in the given array into the heap. */ Can sometimes be replaced by: /// Inserts all elements in the given array into the heap.And then it takes slightly more work to change it into a multiline comment.The class ddoc misses the explanation for the Min template argument: struct Heap(T, bool Min) {The version I submitted didn't have the Min template argument; it had a virtual comparison method instead. That method was documented. The added argument was not documented.Some lines of comments are too much long, they may be broken at 80-95 chars long.Dunno how that slipped by me.An optional 'key' callable can be added; it can be used as sorting key mapper (Swartz-style). If not specified it can be "null" that means the current behaviour.I actually need this behavior. The version I submitted had a virtual comparison method, so that would have been reasonably simple to implement: class ComparatorHeap (T, alias comparator) : Heap!(T) { override int comp (T left, T right) { return comparator (left, right); } } Or: class HeapMap (TKey, TValue) : Heap!(KeyValuePair!(TKey, TValue)) { // add methods for inserting a key and a value without having to // construct a KeyValuePair } As it stands, you'd need to use composition and forward a lot of methods to an internal struct -- that's just hideous. Tango has been on a struct rampage. It's part of their campaign against extensibility. I guess making everything private or final just wasn't enough. I jest. But it is a serious concern of mine. If I see anything labeled "package" rather than "protected", my liver will burst. At the very least, I'll write a script to change anything private to protected, and anything protected or package to public.Few other handy methods may be added: A merge() (~ and ~=) method can be added, maybe. heapreplace(heap, item) (pop and return the smallest item from the heap, and also push the new item. The heap size doesn't change) heapify(iterable) nlargest(n, iterable) nsmallest(n, iterable) The following is not a critic, just a note, I think programs with such low code density are harder to read to me, they seem like fresh snow:<snip>I'd write that as this, allowing me to have more code on the screen:I use a large monitor. Whitespace makes things more readable for me. Though the giant tabs are annoying; I use a tabstop of four. Still, this is only a display issue.In the future FibonacciHeap too may be useful to be added to the Std libraries. Bye, bearophileThanks for pointing out those issues. I'll correct those I can, but even after that, the Tango heap will not suffice for my needs.
Oct 14 2008
Christopher Wright:Any particular reason?<'is' is for object equivalence. But you have values there.And then it takes slightly more work to change it into a multiline comment.<Right.I actually need this behavior.<I'll probably add that key mapping function to your code in the following days. If you want I may show you the code later...The version I submitted had a virtual comparison method, so that would have been reasonably simple to implement:<Interesting.I use a large monitor. Whitespace makes things more readable for me.<I think older people tend to write more compact code, because in the past monitors were smaller (and probably because of other factors, like the amount of code you can put on the screen at once) while the new style (that can be sparse lines on the screen. I think my style is midway.the Tango heap will not suffice for my needs.<Well, if you are the author you can change the situation, can't you? :-) Bye, bearophile
Oct 14 2008
Actually, I STRONGLY urge that nobody use this heap. It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap. If it were a class, this would not be possible. As I implemented it, it would not be possible. Containers as value types is such an idiotic idea for precisely this reason.
Oct 14 2008
"Christopher Wright" wroteActually, I STRONGLY urge that nobody use this heap. It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap. If it were a class, this would not be possible. As I implemented it, it would not be possible. Containers as value types is such an idiotic idea for precisely this reason.Unless such value types have copy constructors which make a copy of their elements... Or are you calling the authors of STL idiots ;) -Steve
Oct 14 2008
Christopher Wright wrote:Actually, I STRONGLY urge that nobody use this heap. It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap. If it were a class, this would not be possible. As I implemented it, it would not be possible. Containers as value types is such an idiotic idea for precisely this reason.Correction: as partial value types. If they were completely value types, it'd be stupid for a much different reason -- namely, that they're very difficult to resize. But this class of bugs would not be an issue. D's builtin arrays have the same problem. In their case, the only real way around the issue is setting the first sizeof(size_t) bytes to the length rather than putting that on the stack. Any situation in which only part of your state is shared is a potential for getting data structures into an erroneous state.
Oct 14 2008
Christopher Wright wrote:Christopher Wright wrote:Argh, I'm making more of an idiot of myself. If they're value types by merit of being entirely on the stack, yes. If they're value types by merit of the behavior of assignment, then I'm full of crap and you should pay me no mind.Actually, I STRONGLY urge that nobody use this heap. It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap. If it were a class, this would not be possible. As I implemented it, it would not be possible. Containers as value types is such an idiotic idea for precisely this reason.Correction: as partial value types. If they were completely value types, it'd be stupid for a much different reason -- namely, that they're very difficult to resize. But this class of bugs would not be an issue.
Oct 14 2008
Christopher Wright wrote:Actually, I STRONGLY urge that nobody use this heap. It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap. If it were a class, this would not be possible. As I implemented it, it would not be possible. Containers as value types is such an idiotic idea for precisely this reason.I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea. Andrei
Oct 14 2008
Andrei Alexandrescu wrote:Christopher Wright wrote:I'll bite. Why is it a good idea that containers are implements as structs? I too was pretty astonished to discover this pattern in tango. And why just for a few container types? Stack, Heap, and Vector are structs, while all other container types are classes. --benjiActually, I STRONGLY urge that nobody use this heap. It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap. If it were a class, this would not be possible. As I implemented it, it would not be possible. Containers as value types is such an idiotic idea for precisely this reason.I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea. Andrei
Oct 14 2008
Benji Smith wrote:Andrei Alexandrescu wrote:As the Mexican would say: why not? AndreiChristopher Wright wrote:I'll bite. Why is it a good idea that containers are implements as structs?Actually, I STRONGLY urge that nobody use this heap. It's a struct. If you access it by value and append to it, then access it from any other alias, you will erase a random element from the heap. If it were a class, this would not be possible. As I implemented it, it would not be possible. Containers as value types is such an idiotic idea for precisely this reason.I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea. Andrei
Oct 14 2008
Andrei Alexandrescu wrote:Benji Smith wrote:Well... here are a few reasons why not: Structs can't implement interfaces, and a collection API is the poster-child of good interface usage. Structs can't inherit from abstract classes. A good deal of functionality in a collection class can be implemented once in an abstract base class, making it simpler for subclass authors to implement the API without duplicating a bunch of boilerplate. Conceptually -- and there's no hard and fast rule about this -- structs usually either represent small logically-atomic values (DateTime), or some fixed-size collection of logically-atomic values (Vector3). Using a struct as an arbitrarily-sized container seems, on the face of it, to break those conventions. --benjiAndrei Alexandrescu wrote:As the Mexican would say: why not?Christopher Wright wrote:I'll bite. Why is it a good idea that containers are implements as structs?Containers as value types is such an idiotic idea for precisely this reason.I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea.
Oct 14 2008
Benji Smith wrote:Andrei Alexandrescu wrote:I don't want to seem like picking on you, but I would like to answer this message too. In my humble opinion, collections APIs are the poster-child of the misguided joy of the beginnings of object-orientation, right next to the mammal isa animal example. Organizing collections in hierarchies is of occasional benefit, but the real benefit comes from decoupling containers from algorithms via iterators, the way the STL did. The savings there were that an O(m * n) problem has been solved with writing O(m + n) code. Container hierarchies do not achieve such a reduction unless efforts are being made that have little to do with hierarchical organization. At most they factor out some code in the upper classes, but hierarchies help when types have many commonalities and few difference, and that is very rarely the case with containers.Benji Smith wrote:Well... here are a few reasons why not: Structs can't implement interfaces, and a collection API is the poster-child of good interface usage.Andrei Alexandrescu wrote:As the Mexican would say: why not?Christopher Wright wrote:I'll bite. Why is it a good idea that containers are implements as structs?Containers as value types is such an idiotic idea for precisely this reason.I'd say this statement reflects a lack of understanding of value types, rather than value containers are an idiotic idea.Structs can't inherit from abstract classes. A good deal of functionality in a collection class can be implemented once in an abstract base class, making it simpler for subclass authors to implement the API without duplicating a bunch of boilerplate.This also reveals a beginner's view of inheritance. You don't inherit to reuse. You inherit to be reused. If all you want is to reuse, use composition.Conceptually -- and there's no hard and fast rule about this -- structs usually either represent small logically-atomic values (DateTime), or some fixed-size collection of logically-atomic values (Vector3). Using a struct as an arbitrarily-sized container seems, on the face of it, to break those conventions.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't. Andrei
Oct 15 2008
Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.
Oct 15 2008
Sergey Gromov wrote:Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism. AndreiI see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.
Oct 15 2008
Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally." So how do I return it? Should I return it by value, hoping that the implied reference semantics will stay that way in future releases? How does your Array!() behave?Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.
Oct 15 2008
"Sergey Gromov" wroteWed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:There's nothing requiring that a struct must implement value semantics. The fact that you are expecting such just because it is a struct is a mistake. The reality is that only structs can implement value semantics because of the inability to override A = A in a class. That being said, I think containers work best as a mixture of interface and compile-time typing. See http://www.dsource.org/projects/dcollections. -SteveSergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally." So how do I return it? Should I return it by value, hoping that the implied reference semantics will stay that way in future releases? How does your Array!() behave?Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.
Oct 15 2008
Sergey Gromov wrote:Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:No. "Don't submit to the dogma that struct == value semantics, because that was never true".Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally."Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.So how do I return it? Should I return it by value, hoping that the implied reference semantics will stay that way in future releases? How does your Array!() behave?That's not finished, but e.g. Appender can be copied around and always refers to the same underlying array. Andrei
Oct 15 2008
Wed, 15 Oct 2008 14:04:44 -0500, Andrei Alexandrescu wrote:Sergey Gromov wrote:I don't understand. Structs are value types, everything else are implementation details. And yes, I assume that struct copying yields a separate, independent object unless documented otherwise. I don't usually need such documentation in my code because I either remember what I'm doing or the reference components of structs are obvious. But with a library struct with undocumented, private implementation it becomes an issue.Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:No. "Don't submit to the dogma that struct == value semantics, because that was never true".Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally."Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.This means that without reading docs I won't be able to tell whether my code is going to work or not. Because the only truly copyable struct- based solution I can imagine is a struct with a single pointer field pointing at a heap-allocated implementation. And I doubt that your Appender is implemented this way. I'm afraid I'll have to actually look into library sources to make sure Appender does what I expect.So how do I return it? Should I return it by value, hoping that the implied reference semantics will stay that way in future releases? How does your Array!() behave?That's not finished, but e.g. Appender can be copied around and always refers to the same underlying array.
Oct 15 2008
On Thu, Oct 16, 2008 at 5:46 AM, Sergey Gromov <snake.scaly gmail.com> wrote:Wed, 15 Oct 2008 14:04:44 -0500, Andrei Alexandrescu wrote:I think the point is that value *type* doesn't necessarily imply value *semantics*. Like with D's built-in arrays. --bbSergey Gromov wrote:I don't understand. Structs are value types, everything else are implementation details.Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:No. "Don't submit to the dogma that struct == value semantics, because that was never true".Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally."Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.
Oct 15 2008
Thu, 16 Oct 2008 06:44:30 +0900, Bill Baxter wrote:On Thu, Oct 16, 2008 at 5:46 AM, Sergey Gromov <snake.scaly gmail.com> wrote:Arrays are very widely used and very well documented to the point they're quite obvious. There are no hidden implementation details. Nevertheless, there are well-known problems which directly follow from the arrays' sort-of-reference semantics.Wed, 15 Oct 2008 14:04:44 -0500, Andrei Alexandrescu wrote:I think the point is that value *type* doesn't necessarily imply value *semantics*. Like with D's built-in arrays.Sergey Gromov wrote:I don't understand. Structs are value types, everything else are implementation details.Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:No. "Don't submit to the dogma that struct == value semantics, because that was never true".Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally."Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.
Oct 15 2008
On Thu, Oct 16, 2008 at 7:47 AM, Sergey Gromov <snake.scaly gmail.com> wrote:Thu, 16 Oct 2008 06:44:30 +0900, Bill Baxter wrote:And I think the issues are pretty much the same with all struct-based container. So if you understand the issues with D's arrays, then all you have to know is "same thing for other struct containers".On Thu, Oct 16, 2008 at 5:46 AM, Sergey Gromov <snake.scaly gmail.com> wrote:Arrays are very widely used and very well documented to the point they're quite obvious. There are no hidden implementation details.Wed, 15 Oct 2008 14:04:44 -0500, Andrei Alexandrescu wrote:I think the point is that value *type* doesn't necessarily imply value *semantics*. Like with D's built-in arrays.Sergey Gromov wrote:I don't understand. Structs are value types, everything else are implementation details.Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:No. "Don't submit to the dogma that struct == value semantics, because that was never true".Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally."Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.Nevertheless, there are well-known problems which directly follow from the arrays' sort-of-reference semantics.Yeh, these do make me wonder about whether it was a good idea to make D's arrays behave the way they do. But since they do, and since you can't really avoid having to learn how they work and what their pitfalls are, I don't think it's so bad to make other containers behave the same way. Less to learn that way. But I'd definitely be willing to entertain new ground-up designs in which arrays behaved more like pure references, or were not copyable, or similar. --bb
Oct 15 2008
Thu, 16 Oct 2008 08:42:23 +0900, Bill Baxter wrote:On Thu, Oct 16, 2008 at 7:47 AM, Sergey Gromov <snake.scaly gmail.com> wrote:No. Built-in array is transparent. Appender is not. Because if Appender is transparent, too, and there is a change in GC which makes its current implementation slow, you won't be able to fix that. Therefore Appender must be opaque and must specify its interface. And copy semantics becomes a part of that interface. It may transfer ownership on assignment for instance. Or declare behavior undefined if there are multiple copies of it. You see, I can't even say "multiple references" because they are not, and "multiple copies" sounds ambiguous. The bottom line is, your knowledge of built-in arrays is useless here. You must study docs for every container you use.Thu, 16 Oct 2008 06:44:30 +0900, Bill Baxter wrote:And I think the issues are pretty much the same with all struct-based container. So if you understand the issues with D's arrays, then all you have to know is "same thing for other struct containers".On Thu, Oct 16, 2008 at 5:46 AM, Sergey Gromov <snake.scaly gmail.com> wrote:Arrays are very widely used and very well documented to the point they're quite obvious. There are no hidden implementation details.Wed, 15 Oct 2008 14:04:44 -0500, Andrei Alexandrescu wrote:I think the point is that value *type* doesn't necessarily imply value *semantics*. Like with D's built-in arrays.Sergey Gromov wrote:I don't understand. Structs are value types, everything else are implementation details.Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:No. "Don't submit to the dogma that struct == value semantics, because that was never true".Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally."Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.They cannot behave the same way unless they're very similar to the built- in arrays, which makes them pretty useless. Even built-in AA behaves differently. BTW I was quite surprised to find out that it was implemented exactly as a struct with a single pointer inside. Well, there could have been reasons to implement a built-in type like this. I cannot see such reasons for Appender.Nevertheless, there are well-known problems which directly follow from the arrays' sort-of-reference semantics.Yeh, these do make me wonder about whether it was a good idea to make D's arrays behave the way they do. But since they do, and since you can't really avoid having to learn how they work and what their pitfalls are, I don't think it's so bad to make other containers behave the same way. Less to learn that way.But I'd definitely be willing to entertain new ground-up designs in which arrays behaved more like pure references, or were not copyable, or similar. --bb
Oct 15 2008
Sergey Gromov:Or declare behavior undefined if there are multiple copies of it. You see, I can't even say "multiple references" because they are not, and "multiple copies" sounds ambiguous.Just a note: my ArrayBuilder is a struct (with 3 fields inside, each sizeof 1 CPU word). I have already used it a lot in my libs, probably about 50 times, but I've never had to copy one of them. I have used a struct because a class isn't necessary, and because the struct methods are faster at my benchmarks. Bye, bearophile
Oct 15 2008
"Sergey Gromov" wroteWed, 15 Oct 2008 14:04:44 -0500, Andrei Alexandrescu wrote:First, there is no compiler required semantics. It is up to the designer. If the designer says 'this is return by reference', then the designer is telling you what it is. Second, just because you have your expectations doesn't mean they should be right ;) The only things that are guaranteed are done so by the compiler. Everything else is up to the designer.Sergey Gromov wrote:I don't understand. Structs are value types, everything else are implementation details. And yes, I assume that struct copying yields a separate, independent object unless documented otherwise. I don't usually need such documentation in my code because I either remember what I'm doing or the reference components of structs are obvious. But with a library struct with undocumented, private implementation it becomes an issue.Wed, 15 Oct 2008 09:49:08 -0500, Andrei Alexandrescu wrote:No. "Don't submit to the dogma that struct == value semantics, because that was never true".Sergey Gromov wrote:I can imagine the documentation: "Ignore the fact it's a struct, it's actually a reference internally."Wed, 15 Oct 2008 09:17:57 -0500, Andrei Alexandrescu wrote:A struct can choose to implement value or reference semantics as it pleases. The only thing it can't readily implement is dynamic polymorphism.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.How about if you want pass-by-value you use structs, otherwise you don't? My argument against structs-collections is that I want to toss collections around and build other collections out of them without messing with pointers.All you should need to do is look at the docs. If the docs suck, you have a gripe. If the implementation isn't what you would have done, but there is nothing wrong with it, tough. Go write your own. -SteveThis means that without reading docs I won't be able to tell whether my code is going to work or not. Because the only truly copyable struct- based solution I can imagine is a struct with a single pointer field pointing at a heap-allocated implementation. And I doubt that your Appender is implemented this way. I'm afraid I'll have to actually look into library sources to make sure Appender does what I expect.So how do I return it? Should I return it by value, hoping that the implied reference semantics will stay that way in future releases? How does your Array!() behave?That's not finished, but e.g. Appender can be copied around and always refers to the same underlying array.
Oct 15 2008
Andrei Alexandrescu wrote:I don't want to seem like picking on you, but I would like to answer this message too.I don't feel like you're picking on me! Without disagreements, there wouldn't be much to discuss :) Though I do think you should lay off with the "beginner" business. The people who disagree with you disagree for legitimate reasons, not because they're inexperienced. Perhaps Josh Bloch is a beginner?In my humble opinion, collections APIs are the poster-child of the misguided joy of the beginnings of object-orientation, right next to the mammal isa animal example. Organizing collections in hierarchies is of occasional benefit, but the real benefit comes from decoupling containers from algorithms via iterators, the way the STL did. The savings there were that an O(m * n) problem has been solved with writing O(m + n) code. Container hierarchies do not achieve such a reduction unless efforts are being made that have little to do with hierarchical organization. At most they factor out some code in the upper classes, but hierarchies help when types have many commonalities and few difference, and that is very rarely the case with containers.Nonsense. Iterators are not the only way to "decouple containers from algorithms". An iterator in STL is just a compile-time interface for accessing the elements in a container. You can argue that using type hierarchies or interface inheritance is an undue runtime burden and that iterators solve that problem effectively. But, from a data-modeling perspective, there is absolutely no difference between these two: sort(List<T> list); sort(iterator<T> start, iterator<T> end);When the rubber hits the road, I'll take my "beginners view of inheritance any time". One of my favorite tiny classes in Java is this little nugget: import java.util.LinkedHashMap; import java.util.Map; public class CacheMap<K, V> extends LinkedHashMap<K, V> { int maxCapacity; public CacheMap(int maxCapacity) { this.maxCapacity = maxCapacity; } protected boolean removeEldestEntry(Map.Entry<K, V> eldest) { return size() > maxCapacity; } } Now I have a fully-functional Map class that automatically removes entries (in FIFO order) when the collection reaches some maximum size. In every other way, it performs exactly like any other LinkedHashMap (which always iterates in insertion-order). If I used composition, and if I wanted to comply with with Map interface, I'd have to manually re-implement every single method, wrapping the underlying functionality and delegating the calls myself. If the type system can do it for me, why not? And since I've implemented Map<K, V>, I can pass my container to any method that cares about "mappiness", regardless of whether it cares about "cachiness".Structs can't inherit from abstract classes. A good deal of functionality in a collection class can be implemented once in an abstract base class, making it simpler for subclass authors to implement the API without duplicating a bunch of boilerplate.This also reveals a beginner's view of inheritance. You don't inherit to reuse. You inherit to be reused. If all you want is to reuse, use composition.The D documentation implies otherwise: http://www.digitalmars.com/d/2.0/struct.html Though I suppose you're free to use whatever conventions you like. --benjiConceptually -- and there's no hard and fast rule about this -- structs usually either represent small logically-atomic values (DateTime), or some fixed-size collection of logically-atomic values (Vector3). Using a struct as an arbitrarily-sized container seems, on the face of it, to break those conventions.I see how it breaks those conventions, but I've never heard of them. The convention I heard of is that if you want dynamic polymorphism you use classes, otherwise you don't.
Oct 15 2008
Benji Smith wrote:Andrei Alexandrescu wrote:I know you're going to hate this, but I'm unable to follow through with this discussion. There is much background and many aspects to discuss, and I simply can't afford the time. Since not too long ago it has dawned on the programming community that OOP is not all it's cracked to be, and that there are quite a few amendments and qualifications that need to be made to what once seemed to be very attractive principles. But such developments are quite recent, and they may sound downright heretic and indeed nonsensical to many. That's why there's need for a lot of explaining. There's a book I do recommend though to bring some good info on the subject: Agile Software Development by Robert Martin. The book is good, and for this particular subject I recommend Chapter 10. AndreiIn my humble opinion, collections APIs are the poster-child of the misguided joy of the beginnings of object-orientation, right next to the mammal isa animal example. Organizing collections in hierarchies is of occasional benefit, but the real benefit comes from decoupling containers from algorithms via iterators, the way the STL did. The savings there were that an O(m * n) problem has been solved with writing O(m + n) code. Container hierarchies do not achieve such a reduction unless efforts are being made that have little to do with hierarchical organization. At most they factor out some code in the upper classes, but hierarchies help when types have many commonalities and few difference, and that is very rarely the case with containers.Nonsense.
Oct 15 2008
Andrei Alexandrescu wrote:Benji Smith wrote:Naw, I don't hate it. I'm up to my eyeballs too, and I had no delusions that I'd convince you anyhow. I just wanted to put those ideas on the table, since I think there are a lot of other people like me out there. Those people might like to use D too, and their development styles are valid too, so I'm standing up for them. Anyhoo... catcha later! --benjiAndrei Alexandrescu wrote:I know you're going to hate this, but I'm unable to follow through with this discussion. There is much background and many aspects to discuss, and I simply can't afford the time.In my humble opinion, collections APIs are the poster-child of the misguided joy of the beginnings of object-orientation, right next to the mammal isa animal example. Organizing collections in hierarchies is of occasional benefit, but the real benefit comes from decoupling containers from algorithms via iterators, the way the STL did. The savings there were that an O(m * n) problem has been solved with writing O(m + n) code. Container hierarchies do not achieve such a reduction unless efforts are being made that have little to do with hierarchical organization. At most they factor out some code in the upper classes, but hierarchies help when types have many commonalities and few difference, and that is very rarely the case with containers.Nonsense.
Oct 15 2008
On Tue, Oct 14, 2008 at 5:53 PM, bearophile <bearophileHUGS lycos.com> wrote:Christopher Wright:Bearophile, you're doing it again. Why are you so hung up on how other people format their code? Please, please get over it.http://dsource.org/projects/tango/browser/trunk/tango/util/container/more/Heap.d?rev=3959Very nice, thank you. Few quick comments (no deep comments, I haven't used it nor read it carefully): In this lines: assert (other.pop is 4); I think it's better to use assert(other.pop == 4); This lines show that a bulk addition method may be useful, that accepts any lazy/eagar iterable: MaxHeap!(uint) h; h ~= 1; h ~= 3; h ~= 2; h ~= 4; The following aliases can be moved just below their respective methods, with a /// ditto before them: alias pop remove; alias push opCatAssign; This style of comments: /** Inserts all elements in the given array into the heap. */ Can sometimes be replaced by: /// Inserts all elements in the given array into the heap. The class ddoc misses the explanation for the Min template argument: struct Heap(T, bool Min) { Some lines of comments are too much long, they may be broken at 80-95 chars long. An optional 'key' callable can be added; it can be used as sorting key mapper (Swartz-style). If not specified it can be "null" that means the current behaviour. Few other handy methods may be added: A merge() (~ and ~=) method can be added, maybe. heapreplace(heap, item) (pop and return the smallest item from the heap, and also push the new item. The heap size doesn't change) heapify(iterable) nlargest(n, iterable) nsmallest(n, iterable) The following is not a critic, just a note, I think programs with such low code density are harder to read to me, they seem like fresh snow: /** Inserts all elements in the given array into the heap. */ void push (T[] array) { while (heap.length < next + array.length) { heap.length = 2 * heap.length + 32; } foreach (t; array) push (t); } I'd write that as this, allowing me to have more code on the screen: /// Inserts all elements in the given array into the heap. void push(T[] array) { while (heap.length < next + array.length) heap.length = 2 * heap.length + 32; foreach (t; array) push(t); }
Oct 14 2008
Jarrett Billingsley:Why are you so hung up on how other people format their code?Not all coding styles are created equal. Some are better. Bye, bearophile
Oct 15 2008
"bearophile" wroteJarrett Billingsley:Yes, but clearly not yours. Mine is much better. -SteveWhy are you so hung up on how other people format their code?Not all coding styles are created equal. Some are better.
Oct 15 2008