digitalmars.D.announce - cashew.utils.array 0.1a.2
- Chris Nicholson-Sauls (11/11) Sep 12 2006 I've been pushing for some array utilities to get into Phobos, yes, but ...
- Chris Nicholson-Sauls (4/4) Sep 12 2006 And just a few minutes later, I issue another release... I'd neglected ...
- Lionello Lunesu (7/14) Sep 13 2006 Nice.. I hope something like this will make it into Phobos at some
- Chris Nicholson-Sauls (7/27) Sep 13 2006 True... and fixed. :) Functions indexOf, rindexOf, indexOfSub, and rin...
- clayasaurus (2/631) Sep 13 2006 Phobos does need something like this.
- J Duncan (5/6) Sep 13 2006 very badly, everyone rolls their own so it would be nice to have stuff
- Mikola Lysenko (33/77) Sep 14 2006 These algorithms can be implemented more efficiently. The running time
- Chris Nicholson-Sauls (5/93) Sep 14 2006 True, and not a bad idea. I'm not sure how it would be with arrays of c...
- Chris Nicholson-Sauls (10/107) Sep 14 2006 Well I just tried it and... woo, intersect is notably faster indeed. I ...
- Nikita Kalaganov (3/8) Sep 13 2006 Ahh, thanks!
- Reiner Pope (76/93) Sep 13 2006 It looks good. Thanks for doing this.
- Reiner Pope (1/1) Sep 13 2006 Oh, and I forgot: a reverse iterator, like mintl.array has.
- Chris Nicholson-Sauls (9/118) Sep 14 2006 Because I'm mildly paranoid and wanted to avoid memory allocation where ...
- Reiner Pope (8/54) Sep 14 2006 Passing by 'inout' is probably less efficient and certainly doesn't save...
- Chris Nicholson-Sauls (8/70) Sep 16 2006 So it does... although I could swear that it didn't behave right before....
I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs. That said: if people think this is a decent collection, and Walter would take it, I would be willing to release it to public domain for Phobos' sake. The array module is attached, and the docs are at: http://www.codemeu.com:81/~pontiff/projects/cashew/doc/array.html -- Christopher Nicholson-Sauls
Sep 12 2006
And just a few minutes later, I issue another release... I'd neglected a little something from rotl() and rotr(), namely I hadn't accounted for iterations greater than the array's length. Heh. Fixed now. New version attached. -- Chris Nicholson-Sauls
Sep 12 2006
Chris Nicholson-Sauls wrote:I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs.Nice.. I hope something like this will make it into Phobos at some point, since everybody seems to have implemented these functions anyway.. One remark though: your removeAll is pretty slow. It should be enough to iterate the array only once. A indexOf that takes a starting_index would be a simple fix. L.
Sep 13 2006
Lionello Lunesu wrote:Chris Nicholson-Sauls wrote:True... and fixed. :) Functions indexOf, rindexOf, indexOfSub, and rindexOfSub all take an optional 'start' parameter now. This effected the design of functions removeAll and unique. I did, however, have to do something weird with removeAll. It just didn't want to work right any other way... must play with it some more later. -- Chris Nicholson-SaulsI've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs.Nice.. I hope something like this will make it into Phobos at some point, since everybody seems to have implemented these functions anyway.. One remark though: your removeAll is pretty slow. It should be enough to iterate the array only once. A indexOf that takes a starting_index would be a simple fix. L.
Sep 13 2006
Chris Nicholson-Sauls wrote:Lionello Lunesu wrote:Phobos does need something like this.Chris Nicholson-Sauls wrote:True... and fixed. :) Functions indexOf, rindexOf, indexOfSub, and rindexOfSub all take an optional 'start' parameter now. This effected the design of functions removeAll and unique. I did, however, have to do something weird with removeAll. It just didn't want to work right any other way... must play with it some more later. -- Chris Nicholson-Sauls ------------------------------------------------------------------------ /***************************************************************************************** * Utility template functions for arrays. Example: * * $(CASHEW_HEAD) *---------------------------------------------------------------------------------------- * import cashew .utils .array ; * // ... * int[] foo = ...; * foo.remove(4); * foo.eat(foo.indexOf(2)); *---------------------------------------------------------------------------------------- */ module cashew.utils.array ; /*********************************************************************************** * Unit tests support. */ version (Unittest) { import std .stdio ; } unittest { writefln("Unittest: cashew.utils.array: begin"); } /*********************************************************************************** * Create an array. */ T[] array (T) (T[] arr ...) body { return arr.dup; } unittest { writef("\t array(...) --------> "); auto foo = array!(int)(1, 2, 3); assert(foo.length == 3U); assert(typeid(typeof(foo[0])) == typeid(typeof(1))); writefln("Pass"); } /*********************************************************************************** * Verify that a given value is present in an array. */ bool contains (T) (inout T[] haystack, T needle) body { return haystack.indexOf(needle) != size_t.max; } unittest { writef("\t.contains(T) -------> "); auto foo = array!(int)(1, 2, 3); assert( foo.contains(2)); assert(! foo.contains(4)); writefln("Pass"); } /*********************************************************************************** * Compose a new array of the values in the first parameter not present in the second. */ T[] diff (T) (inout T[] alpha, inout T[] beta) body { T[] result = alpha.dup; foreach (x; alpha) { if (beta.contains(x)) { while (result.contains(x)) { result.remove(x); } } } return result; } unittest { writef("\t.diff(T[]) ---------> "); auto foo = array!(int)(1, 2, 3, 4, 5); auto bar = array!(int)(1, 3, 5); assert(foo.diff(bar) == array!(int)(2, 4)); writefln("Pass"); } /*********************************************************************************** * Remove some of the beginning of an array. */ void eat (T) (inout T[] haystack, size_t count) body { if (count > haystack.length) { haystack.length = 0; } else { haystack = haystack[count .. haystack.length]; } } unittest { writef("\t.eat(N) ------------> "); auto foo = array!(int)(1, 2, 3, 4, 5); foo.eat(3U); assert(foo == array!(int)(4, 5)); writefln("Pass"); } /*********************************************************************************** * Search an array for a given item, and return its index, or size_t.max if not found. */ size_t indexOf (T) (inout T[] haystack, T needle, size_t start = 0U) body { size_t result = size_t.max ; foreach (index, item; haystack[start .. haystack.length]) { if (item is needle) { result = index; break; } } if (result != size_t.max) { result += start; } return result; } unittest { writef("\t.indexOf(T) --------> "); auto foo = array!(int)(1, 2, 3); assert(foo.indexOf(2) == 1U); writefln("Pass"); } /*********************************************************************************** * Search an array for a given sub-array, and return its index, or size_t.max if not found. */ size_t indexOfSub (T) (inout T[] haystack, inout T[] bale, size_t start = 0U) body { size_t result = size_t.max , wall = haystack.length - bale.length ; for (size_t i = start; i < wall; i++) { if (haystack[i .. i + bale.length] == bale) { result = i; break; } } return result; } unittest { writef("\t.indexOfSub(T[]) ---> "); auto foo = array!(int)(1, 2, 3, 4, 5); auto sub = array!(int)( 3, 4 ); assert(foo.indexOfSub(sub) == 2U); writefln("Pass"); } /*********************************************************************************** * Same as indexOf but work backwards from the end. */ size_t rindexOf (T) (inout T[] haystack, T needle, size_t start = size_t.max) body { size_t result = size_t.max ; if (start == size_t.max) { start = haystack.length - 1U; } for (size_t i = start; i >= 0U; i--) { if (haystack[i] is needle) { result = i; break; } } return result; } unittest { writef("\t.rindexOf(T) -------> "); auto foo = array!(int)(1, 9, 2, 9, 3); assert(foo.rindexOf(9) == 3U); writefln("Pass"); } /*********************************************************************************** * Same as indexOf but work backwards from the end. */ size_t rindexOfSub (T) (inout T[] haystack, inout T[] bale, size_t start = size_t.max) body { size_t result = size_t.max ; if (start == size_t.max) { start = haystack.length; } for (size_t i = start - bale.length; i >= 0; i--) { if (haystack[i .. i + bale.length] == bale) { result = i; break; } } return result; } unittest { writef("\t.rindexOfSub(T[]) --> "); auto foo = array!(int)(1, 2, 9, 8, 1, 2, 9, 8); auto sub = array!(int)(1, 2 ); assert(foo.rindexOfSub(sub) == 4U); writefln("Pass"); } /*********************************************************************************** * Return an array composed of common elements of two arrays. */ T[] intersect (T) (inout T[] alpha, inout T[] beta) body { T[] result; foreach (x; alpha) { if (beta.contains(x)) { result ~= x; } } return result; } unittest { writef("\t.intersect(T[]) ----> "); auto foo = array!(int)(1, 2, 3, 4, 5 ); auto bar = array!(int)( 3, 4, 5, 6, 7); assert(foo.intersect(bar) == array!(int)(3, 4, 5)); writefln("Pass"); } /*********************************************************************************** * Remove an item from an array. */ void remove (T) (inout T[] haystack, T needle) body { size_t index = haystack.indexOf(needle) ; if (index != size_t.max) { haystack.removeIndex(index); } } unittest { writef("\t.remove(T) ---------> "); auto foo = array!(int)(1, 2, 3); foo.remove(2); assert(foo == array!(int)(1, 3)); writefln("Pass"); } /*********************************************************************************** * Remove all occurances of an item from an array. */ void removeAll (T) (inout T[] haystack, T needle) body { size_t index = 0U ; size_t[] indices ; while ((index = haystack.indexOf(needle, index)) != size_t.max) { indices ~= index; index++; } foreach (i, x; indices) { haystack.removeIndex(x - i); } } unittest { writef("\t.removeAll(T) ------> "); auto foo = array!(int)(1, 2, 1, 3, 1, 4, 1, 5); foo.removeAll(1); assert(foo == array!(int)(2, 3, 4, 5)); writefln("Pass"); } /*********************************************************************************** * Remove the item at a given index. */ void removeIndex (T) (inout T[] haystack, size_t index) body { haystack = haystack[0 .. index] ~ haystack[index + 1 .. haystack.length]; } unittest { writef("\t.removeIndex(N) ----> "); auto foo = array!(int)(1, 2, 3, 4); foo.removeIndex(2U); assert(foo == array!(int)(1, 2, 4)); writefln("Pass"); } /*********************************************************************************** * Build an array by repeating an item. */ void repeat (T) (inout T[] haystack, T needle, size_t len = size_t.max) body { if (len == size_t.max) { len = haystack.length; } haystack.length = len; haystack[] = needle; } unittest { writef("\t.repeat(T [, N]) ---> "); int[] foo ; foo.repeat(3, 3U); assert(foo == array!(int)(3, 3, 3)); auto bar = array!(int)(0, 0, 0, 0, 0, 0, 0); bar.repeat(7); assert(bar == array!(int)(7, 7, 7, 7, 7, 7, 7)); writefln("Pass"); } /*********************************************************************************** * Build an array by repeating a smaller array. */ void repeatSub (T) (inout T[] haystack, inout T[] bale, size_t count) body { haystack.length = 0; for (int i; i < count; i++) { haystack ~= bale; } } unittest { writef("\t.repeatSub(T[], N) -> "); int[] foo , sub = array!(int)(4, 2) ; foo.repeatSub(sub, 3U); assert(foo == array!(int)(4, 2, 4, 2, 4, 2)); writefln("Pass"); } /*********************************************************************************** * Fill an array with a given smaller array. */ void fill (T) (inout T[] haystack, inout T[] bale, size_t len = size_t.max) body { if (len == size_t.max) { len = haystack.length; } haystack.length = 0; while (haystack.length < len) { haystack ~= bale; } haystack.length = len; } unittest { writef("\t.fill (T[] [, N]) --> "); auto foo = array!(int)(1, 2, 3, 4, 5); auto sub = array!(int)(3, 2, 1 ); foo.fill(sub); assert(foo == array!(int)(3, 2, 1, 3, 2)); writefln("Pass"); } /*********************************************************************************** * Remove duplicate values from an array. */ void unique (T) (inout T[] haystack) body { size_t ridx ; for (int i = 0; i < haystack.length; i++) { ridx = size_t.max; while ((ridx = haystack.rindexOf(haystack[i], ridx)) != i) { haystack.removeIndex(ridx); } } } unittest { writef("\t.unique() ----------> "); auto foo = array!(int)(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5); foo.unique(); assert(foo == array!(int)(1, 2, 3, 4, 5)); writefln("Pass"); } /*********************************************************************************** * Pull a number of items from one array into another. */ T[] pull (T) (inout T[] haystack, size_t idx) body { T[] result ; if (idx >= haystack.length) { idx = haystack.length; } result = haystack[0 .. idx ]; haystack = haystack[idx .. haystack.length]; return result; } unittest { writef("\t.pull(N) -----------> "); auto foo = array!(int)(1, 2, 3, 4, 5); auto bar = foo.pull(3U); assert(foo == array!(int)( 4, 5)); assert(bar == array!(int)(1, 2, 3 )); writefln("Pass"); } /*********************************************************************************** * Same as pull, but indexed from the end. */ T[] rpull (T) (inout T[] haystack, size_t idx) body { T[] result ; if (idx >= haystack.length) { result = haystack; haystack.length = 0; } else { idx--; result = haystack[idx .. haystack.length]; haystack = haystack[0 .. idx ]; } return result; } unittest { writef("\t.rpull(N) ----------> "); auto foo = array!(int)(1, 2, 3, 4, 5); auto bar = foo.rpull(3U); assert(foo == array!(int)(1, 2 )); assert(bar == array!(int)( 3, 4, 5)); auto alpha = array!(int)(1, 2, 3); auto beta = alpha.rpull(4U); assert(alpha == array!(int)( )); assert(beta == array!(int)(1, 2, 3)); writefln("Pass"); } /*********************************************************************************** * Rotate an array's contents toward the left. */ void rotl (T) (inout T[] haystack, size_t iter) body { iter %= haystack.length; haystack = haystack[iter .. haystack.length] ~ haystack[0 .. iter]; } unittest { writef("\t.rotl(N) -----------> "); auto foo = array!(int)(1, 2, 3, 4, 5, 6); foo.rotl(2U); assert(foo == array!(int)(3, 4, 5, 6 ,1, 2)); auto bar = array!(int)(1, 2, 3); bar.rotl(4U); assert(bar == array!(int)(2, 3, 1)); writefln("Pass"); } /*********************************************************************************** * Rotate an array's contents toward the right. */ void rotr (T) (inout T[] haystack, size_t iter) body { size_t idx ; iter %= haystack.length ; idx = haystack.length - iter ; haystack = haystack[idx .. haystack.length] ~ haystack[0 .. idx]; } unittest { writef("\t.rotr(N) -----------> "); auto foo = array!(int)(1, 2, 3, 4, 5, 6); foo.rotr(2U); assert(foo == array!(int)(5, 6, 1, 2, 3, 4)); auto bar = array!(int)(1, 2, 3); bar.rotr(4U); assert(bar == array!(int)(3, 1, 2)); writefln("Pass"); } /*********************************************************************************** * Append to an array. */ void push (T) (inout T[] haystack, T[] bale ...) body { haystack ~= bale; } unittest { writef("\t.push(...) ---------> "); auto foo = array!(int)(1, 2, 3); foo.push(4, 5); assert(foo == array!(int)(1, 2, 3, 4, 5)); writefln("Pass"); } /*********************************************************************************** * Retrieve and remove the last value of an array. */ T pop (T) (inout T[] haystack) body { T result = haystack[haystack.length - 1]; haystack.length = haystack.length - 1; return result; } unittest { writef("\t.pop() -------------> "); auto foo = array!(int)(1, 2, 3); auto elm = foo.pop(); assert(foo == array!(int)(1, 2)); assert(elm == 3); writefln("Pass"); } /*********************************************************************************** * Prepend to an array. */ void backpush (T) (inout T[] haystack, T[] bale ...) body { haystack ~= bale; haystack.rotr(bale.length); } unittest { writef("\t.backpush(...) -----> "); auto foo = array!(int)(3, 4); foo.backpush(1, 2); assert(foo == array!(int)(1, 2, 3, 4)); writefln("Pass"); } /*********************************************************************************** * Retrieve and remove the first value of an array. */ T backpop (T) (inout T[] haystack) body { T result = haystack[0]; haystack = haystack[1 .. haystack.length]; return result; } unittest { writef("\t.backpop() ---------> "); auto foo = array!(int)(1, 2, 3); auto elm = foo.backpop(); assert(foo == array!(int)(2, 3)); assert(elm == 1); writefln("Pass"); } /*********************************************************************************** * Drop values from the beginning of an array. */ T[] shift (T) (inout T[] haystack, size_t count) body { T[] result ; if (count >= haystack.length) { result = haystack; haystack.length = 0; } else { result = haystack[0 .. count]; haystack = haystack[count .. haystack.length]; } return result; } unittest { writef("\t.shift(N) ----------> "); auto foo = array!(int)(1, 2, 3, 4, 5); auto sub = foo.shift(3U); assert(foo == array!(int)( 4, 5)); assert(sub == array!(int)(1, 2, 3 )); writefln("Pass"); } /*********************************************************************************** * Drop values from the end of an array. */ T[] rshift (T) (inout T[] haystack, size_t count) body { T[] result ; if (count >= haystack.length) { result = haystack; haystack.length = 0; } else { result = haystack[haystack.length - count .. haystack.length]; haystack.length = haystack.length - count; } return result; } unittest { writef("\t.rshift(N) ---------> "); auto foo = array!(int)(1, 2, 3, 4, 5); auto sub = foo.rshift(3U); assert(foo == array!(int)(1, 2 )); assert(sub == array!(int)( 3, 4, 5)); writefln("Pass"); } /*********************************************************************************** * Unit tests support. */ unittest { writefln("Unittest: cashew.utils.array: end\n"); }I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs.Nice.. I hope something like this will make it into Phobos at some point, since everybody seems to have implemented these functions anyway.. One remark though: your removeAll is pretty slow. It should be enough to iterate the array only once. A indexOf that takes a starting_index would be a simple fix. L.
Sep 13 2006
clayasaurus wrote:Phobos does need something like this.very badly, everyone rolls their own so it would be nice to have stuff like that consolidated in the community.... I also think this would give a more 'complete' feeling to the built-in arrays
Sep 13 2006
Chris Nicholson-Sauls wrote:/*********************************************************************************** * Return an array composed of common elements of two arrays. */ T[] intersect (T) (inout T[] alpha, inout T[] beta) body { T[] result; foreach (x; alpha) { if (beta.contains(x)) { result ~= x; } } return result; } unittest { writef("\t.intersect(T[]) ----> "); auto foo = array!(int)(1, 2, 3, 4, 5 ); auto bar = array!(int)( 3, 4, 5, 6, 7); assert(foo.intersect(bar) == array!(int)(3, 4, 5)); writefln("Pass"); } /*********************************************************************************** * Remove duplicate values from an array. */ void unique (T) (inout T[] haystack) body { size_t ridx ; for (int i = 0; i < haystack.length; i++) { ridx = size_t.max; while ((ridx = haystack.rindexOf(haystack[i], ridx)) != i) { haystack.removeIndex(ridx); } } } unittest { writef("\t.unique() ----------> "); auto foo = array!(int)(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5); foo.unique(); assert(foo == array!(int)(1, 2, 3, 4, 5)); writefln("Pass"); }These algorithms can be implemented more efficiently. The running time for the intersect is O(n * m) (where n,m are the length of the two arrays). However, what about the case where alpha and beta are ordered? Here is a simple algorithm for getting the intersect: T[] intersect (T) (inout T[] alpha, inout T[] beta) body { /* Assume alpha, beta are ordered such that alpha[i] < alpha[i+1] * and beta[i] < beta[i+1] for all i. */ T[] result; size_t a_pos = 0; size_t b_pos = 0; while(a_pos < alpha.length && b_pos < beta.length) { if(alpha[a_pos] == beta[b_pos]) { result ~= alpha[a_pos]; ++a_pos; ++b_pos; } else if(alpha[a_pos] < beta[b_pos]) ++a_pos; else if(beta[b_pos] < alpha[a_pos]) ++b_pos; } return result; } The cost for this algorithm is only O(n + m). However, ordering the two arrays is effectively an O(nlogn + mlogm) operation. Therefore the running time for this modified algorithm is overall O(nlogn + mlogm). A similar argument can be made for the unique function.
Sep 14 2006
Mikola Lysenko wrote:Chris Nicholson-Sauls wrote:True, and not a bad idea. I'm not sure how it would be with arrays of classes/structs which have no defined order. I'll try a rewrite and post it later. :) Maybe I should just ask for a DSource entry for Cashew. -- Chris Nicholson-Sauls/*************************************************************** ******************* * Return an array composed of common elements of two arrays. */ T[] intersect (T) (inout T[] alpha, inout T[] beta) body { T[] result; foreach (x; alpha) { if (beta.contains(x)) { result ~= x; } } return result; } unittest { writef("\t.intersect(T[]) ----> "); auto foo = array!(int)(1, 2, 3, 4, 5 ); auto bar = array!(int)( 3, 4, 5, 6, 7); assert(foo.intersect(bar) == array!(int)(3, 4, 5)); writefln("Pass"); } /*************************************************************** ******************* * Remove duplicate values from an array. */ void unique (T) (inout T[] haystack) body { size_t ridx ; for (int i = 0; i < haystack.length; i++) { ridx = size_t.max; while ((ridx = haystack.rindexOf(haystack[i], ridx)) != i) { haystack.removeIndex(ridx); } } } unittest { writef("\t.unique() ----------> "); auto foo = array!(int)(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5); foo.unique(); assert(foo == array!(int)(1, 2, 3, 4, 5)); writefln("Pass"); }These algorithms can be implemented more efficiently. The running time for the intersect is O(n * m) (where n,m are the length of the two arrays). However, what about the case where alpha and beta are ordered? Here is a simple algorithm for getting the intersect: T[] intersect (T) (inout T[] alpha, inout T[] beta) body { /* Assume alpha, beta are ordered such that alpha[i] < alpha[i+1] * and beta[i] < beta[i+1] for all i. */ T[] result; size_t a_pos = 0; size_t b_pos = 0; while(a_pos < alpha.length && b_pos < beta.length) { if(alpha[a_pos] == beta[b_pos]) { result ~= alpha[a_pos]; ++a_pos; ++b_pos; } else if(alpha[a_pos] < beta[b_pos]) ++a_pos; else if(beta[b_pos] < alpha[a_pos]) ++b_pos; } return result; } The cost for this algorithm is only O(n + m). However, ordering the two arrays is effectively an O(nlogn + mlogm) operation. Therefore the running time for this modified algorithm is overall O(nlogn + mlogm). A similar argument can be made for the unique function.
Sep 14 2006
Chris Nicholson-Sauls wrote:Mikola Lysenko wrote:Well I just tried it and... woo, intersect is notably faster indeed. I did a benchmark with arrays of 1000 elements (random values), reset and intersected 1000 times, and the numbers are striking: <Benchmark old> 19.330000 <Benchmark new> 3.080000 I'll rewrite unique later, for now here's an attachment of the module with the rewritten intersect, which is actually almost identical to yours, so I'll be sure and include credit for it. :) -- Chris Nicholson-SaulsChris Nicholson-Sauls wrote:True, and not a bad idea. I'm not sure how it would be with arrays of classes/structs which have no defined order. I'll try a rewrite and post it later. :) Maybe I should just ask for a DSource entry for Cashew. -- Chris Nicholson-Sauls/*************************************************************** ******************* * Return an array composed of common elements of two arrays. */ T[] intersect (T) (inout T[] alpha, inout T[] beta) body { T[] result; foreach (x; alpha) { if (beta.contains(x)) { result ~= x; } } return result; } unittest { writef("\t.intersect(T[]) ----> "); auto foo = array!(int)(1, 2, 3, 4, 5 ); auto bar = array!(int)( 3, 4, 5, 6, 7); assert(foo.intersect(bar) == array!(int)(3, 4, 5)); writefln("Pass"); } /*************************************************************** ******************* * Remove duplicate values from an array. */ void unique (T) (inout T[] haystack) body { size_t ridx ; for (int i = 0; i < haystack.length; i++) { ridx = size_t.max; while ((ridx = haystack.rindexOf(haystack[i], ridx)) != i) { haystack.removeIndex(ridx); } } } unittest { writef("\t.unique() ----------> "); auto foo = array!(int)(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5); foo.unique(); assert(foo == array!(int)(1, 2, 3, 4, 5)); writefln("Pass"); }These algorithms can be implemented more efficiently. The running time for the intersect is O(n * m) (where n,m are the length of the two arrays). However, what about the case where alpha and beta are ordered? Here is a simple algorithm for getting the intersect: T[] intersect (T) (inout T[] alpha, inout T[] beta) body { /* Assume alpha, beta are ordered such that alpha[i] < alpha[i+1] * and beta[i] < beta[i+1] for all i. */ T[] result; size_t a_pos = 0; size_t b_pos = 0; while(a_pos < alpha.length && b_pos < beta.length) { if(alpha[a_pos] == beta[b_pos]) { result ~= alpha[a_pos]; ++a_pos; ++b_pos; } else if(alpha[a_pos] < beta[b_pos]) ++a_pos; else if(beta[b_pos] < alpha[a_pos]) ++b_pos; } return result; } The cost for this algorithm is only O(n + m). However, ordering the two arrays is effectively an O(nlogn + mlogm) operation. Therefore the running time for this modified algorithm is overall O(nlogn + mlogm). A similar argument can be made for the unique function.
Sep 14 2006
On Wed, 13 Sep 2006 07:56:36 +0400, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about.Ahh, thanks!
Sep 13 2006
Chris Nicholson-Sauls wrote:I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs. That said: if people think this is a decent collection, and Walter would take it, I would be willing to release it to public domain for Phobos' sake. The array module is attached, and the docs are at: http://www.codemeu.com:81/~pontiff/projects/cashew/doc/array.html -- Christopher Nicholson-SaulsIt looks good. Thanks for doing this. Some comments/questions: Why does indexOfSub have bale as 'inout', not simply 'in'? Same question for fill Couldn't indexOfSub be named indexOf? Same question for repeatSub and repeat. Also, thinking about its possible use as the starting point for a standard library, I've got some other thoughts for functions which are implemented in other array processing libraries (std.string, Oskar Linde's templated array functions, mintl.array...). Here is what I was thinking could be useful: splitting/joining: split join string processing: stripl/stripr/strip/chomp string matching/replacing: replace insert count squeeze // from std.string startsWith endsWith searching: findmax findmin find(delegate matches) findAll(delegate matches) I've written some code: T[][] split (T) (T[] arr, T[] delim ...) in { assert (delim.length > 0, "Splitting with no delimiters is useless"); } body { T[][] results; T[] token; foreach (i, x; arr) { if (delim.contains(x)) { if (token.length > 0) { results ~= token; token.length = 0; } } else { token ~= x; } } if (token.length > 0) results ~= token; return results; } unittest { writef("\t.split(...) --------> "); /+ auto foo = "a b cde fg"; auto result = foo.split(' '); assert(result == array!(char[])("a", "b", "cde", "fg")); +/ auto bar = "cashew casehew"; auto result2 = bar.split('s', 'h'); assert(result2 == array!(char[])("ca", "ew ca", "e", "ew")); writefln("Pass"); } Also, couldn't you avoid the temporary in removeAll by doing this: void removeAll (T) (inout T[] haystack, T needle) { size_t index = 0U; size_t indices; while ((index = haystack.indexOf(needle, index)) != size_t.max) { haystack.removeIndex(index); } } If you're interested, I'll write some more code. Cheers, Reiner
Sep 13 2006
Oh, and I forgot: a reverse iterator, like mintl.array has.
Sep 13 2006
Reiner Pope wrote:Chris Nicholson-Sauls wrote:Because I'm mildly paranoid and wanted to avoid memory allocation where I could. :) I do suppose it could safely be 'in' though.I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs. That said: if people think this is a decent collection, and Walter would take it, I would be willing to release it to public domain for Phobos' sake. The array module is attached, and the docs are at: http://www.codemeu.com:81/~pontiff/projects/cashew/doc/array.html -- Christopher Nicholson-SaulsIt looks good. Thanks for doing this. Some comments/questions: Why does indexOfSub have bale as 'inout', not simply 'in'? Same question for fillCouldn't indexOfSub be named indexOf? Same question for repeatSub and repeat.That's a D limitation. At the moment, function templates cannot be overloaded. *snort stomp* Originally I /did/ have them named the same, but alas.Also, thinking about its possible use as the starting point for a standard library, I've got some other thoughts for functions which are implemented in other array processing libraries (std.string, Oskar Linde's templated array functions, mintl.array...). Here is what I was thinking could be useful: splitting/joining: split join string processing: stripl/stripr/strip/chomp string matching/replacing: replace insert count squeeze // from std.string startsWith endsWith searching: findmax findmin find(delegate matches) findAll(delegate matches) I've written some code: T[][] split (T) (T[] arr, T[] delim ...) in { assert (delim.length > 0, "Splitting with no delimiters is useless"); } body { T[][] results; T[] token; foreach (i, x; arr) { if (delim.contains(x)) { if (token.length > 0) { results ~= token; token.length = 0; } } else { token ~= x; } } if (token.length > 0) results ~= token; return results; } unittest { writef("\t.split(...) --------> "); /+ auto foo = "a b cde fg"; auto result = foo.split(' '); assert(result == array!(char[])("a", "b", "cde", "fg")); +/ auto bar = "cashew casehew"; auto result2 = bar.split('s', 'h'); assert(result2 == array!(char[])("ca", "ew ca", "e", "ew")); writefln("Pass"); } Also, couldn't you avoid the temporary in removeAll by doing this: void removeAll (T) (inout T[] haystack, T needle) { size_t index = 0U; size_t indices; while ((index = haystack.indexOf(needle, index)) != size_t.max) { haystack.removeIndex(index); } }That's exactly how it was written (once I'd modified .indexOf) but it doesn't act quite right for some reason. Until I figure out what the bug is, I figured I could slap together a less pretty but working version.If you're interested, I'll write some more code. Cheers, Reiner-- Chris Nicholson-Sauls
Sep 14 2006
Chris Nicholson-Sauls wrote:Reiner Pope wrote:Passing by 'inout' is probably less efficient and certainly doesn't save any memory allocation. 'inout' is equivalent to passing a pointer to the array reference, which itself contains a pointer to the data. The extra level of indirection can only add computation time.Chris Nicholson-Sauls wrote:Because I'm mildly paranoid and wanted to avoid memory allocation where I could. :) I do suppose it could safely be 'in' though.I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs. That said: if people think this is a decent collection, and Walter would take it, I would be willing to release it to public domain for Phobos' sake. The array module is attached, and the docs are at: http://www.codemeu.com:81/~pontiff/projects/cashew/doc/array.html -- Christopher Nicholson-SaulsIt looks good. Thanks for doing this. Some comments/questions: Why does indexOfSub have bale as 'inout', not simply 'in'? Same question for fillThis passes the unit tests, though. Cheers, ReinerAlso, couldn't you avoid the temporary in removeAll by doing this: void removeAll (T) (inout T[] haystack, T needle) { size_t index = 0U; size_t indices; while ((index = haystack.indexOf(needle, index)) != size_t.max) { haystack.removeIndex(index); } }That's exactly how it was written (once I'd modified .indexOf) but it doesn't act quite right for some reason. Until I figure out what the bug is, I figured I could slap together a less pretty but working version.
Sep 14 2006
Reiner Pope wrote:Chris Nicholson-Sauls wrote:Possibly true, so for now I took out some of the 'inout's.Reiner Pope wrote:Passing by 'inout' is probably less efficient and certainly doesn't save any memory allocation. 'inout' is equivalent to passing a pointer to the array reference, which itself contains a pointer to the data. The extra level of indirection can only add computation time.Chris Nicholson-Sauls wrote:Because I'm mildly paranoid and wanted to avoid memory allocation where I could. :) I do suppose it could safely be 'in' though.I've been pushing for some array utilities to get into Phobos, yes, but in the meantime I updated the array module in my personal Cashew lib. Included in the updates are the rotl/rotr that I recall someone asking about. In the process I've discovered two bugs, as well: the behavior of 'is' versus '==' is incompatable if the operands are arrays, and DDoc for abbreviated function templates is borked. For an example of the latter, just look at Cashew's own docs. That said: if people think this is a decent collection, and Walter would take it, I would be willing to release it to public domain for Phobos' sake. The array module is attached, and the docs are at: http://www.codemeu.com:81/~pontiff/projects/cashew/doc/array.html -- Christopher Nicholson-SaulsIt looks good. Thanks for doing this. Some comments/questions: Why does indexOfSub have bale as 'inout', not simply 'in'? Same question for fillSo it does... although I could swear that it didn't behave right before. Ah well, it works fine now. Also I've written a new version of .unique(), in the process of which I've added a .removeRange() function. A quick note on .removeRange() though: it mimics the slicing logic of D arrays. So, foo.removeRange(1, 4) actually removes [1, 2, 3], not [1, 2, 3, 4]. -- Chris Nicholson-SaulsThis passes the unit tests, though. Cheers, ReinerAlso, couldn't you avoid the temporary in removeAll by doing this: void removeAll (T) (inout T[] haystack, T needle) { size_t index = 0U; size_t indices; while ((index = haystack.indexOf(needle, index)) != size_t.max) { haystack.removeIndex(index); } }That's exactly how it was written (once I'd modified .indexOf) but it doesn't act quite right for some reason. Until I figure out what the bug is, I figured I could slap together a less pretty but working version.
Sep 16 2006