digitalmars.D.bugs - [Issue 6718] New: "nWayUnion" => "nWayMerge", plus true nWayUnion
- d-bugmail puremagic.com (46/46) Sep 22 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6718
- d-bugmail puremagic.com (11/11) Feb 27 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6718
- d-bugmail puremagic.com (39/41) Feb 27 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6718
- d-bugmail puremagic.com (11/11) Feb 27 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6718
- d-bugmail puremagic.com (10/10) Feb 27 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6718
- d-bugmail puremagic.com (7/8) Feb 27 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6718
- d-bugmail puremagic.com (13/13) Sep 30 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6718
http://d.puremagic.com/issues/show_bug.cgi?id=6718 Summary: "nWayUnion" => "nWayMerge", plus true nWayUnion Product: D Version: D2 Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Phobos AssignedTo: nobody puremagic.com ReportedBy: bearophile_hugs eml.cc This is a bug report plus an optional enhancement request. In std.algorithm there is a section of set functions that work on ordered iterables, like nWayUnion: http://www.d-programming-language.org/phobos/std_algorithm.html#nWayUnion This is an example of nWayUnion usage: import std.algorithm; void main() { auto data = [[1, 2, 3], [1, 3, 5]]; auto expected = [1, 1, 2, 3, 3, 5]; assert(equal(nWayUnion(data), expected)); } But this is not a set operation, and the result is not a set. A set never contains repeated items (a certain definition of "duplication" is possible only if you define your own equality relation). So in my opinion a nWayUnion function has to return: auto data = [[1, 2, 3], [1, 3, 5]]; auto expected = [1, 2, 3, 5]; assert(equal(nWayUnion(data), expected)); While the current nWayUnion function has to be renamed "nWayMerge". auto data = [[1, 2, 3], [1, 3, 5]]; auto expected = [1, 1, 2, 3, 3, 5]; assert(equal(nWayMerge(data), expected)); In my opinion both nWayUnion and nWayMerge are very useful operations useful in various situations, but they are different things. In my opinion confusing the two leads to bugs in programs that use arrays to represent sets (I have just hit a but caused by this). nWayUnion is probably uniq(nWayMerge). If you don't want to add the "nWayUnion" function, then I suggest to just rename "nWayUnion" => "nWayMerge" (and put it outside the documentation section about set functions) to avoid some mistakes. Names are important, and classifications too. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 22 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6718 Andrei Alexandrescu <andrei erdani.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |andrei erdani.com PST --- Here "union" has the meaning in relational algebra, not in set theory. I'll keep this opened until I add a mention to the documentation. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 27 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6718Here "union" has the meaning in relational algebra, not in set theory. I'll keep this opened until I add a mention to the documentation.Thank you for the answer. So all those set functions turn out not being about _sets_. Adding a mention in the documentation is not enough. So: 1) I suggest to rename "nWayUnion" as "nWayMerge", to avoid any confusion in D programmers like me. In the Merge Sort that algorithm is named "Merge", so calling it "merge" has a very long tradition in computer science (http://en.wikipedia.org/wiki/Merge_algorithm ). 2) In this page: http://dlang.org/phobos/std_algorithm.html The "Set operations" column is misnamed because those are NOT sets. In Computer Science and in Mathematics those are named bags or multisets (http://en.wikipedia.org/wiki/Multiset ). But those functions work with a comparison operator and work on ordered sequences, so I suggest to call that column "Multiset operations" or "Ordered bags operations". 3) setIntersection is equally badly named. And it should be renamed "bagsIntersection", or something similar. Also the examples given in the site for setIntersection show only true sets, so this contribute to the confusion, it's even more easy to miss that it is actually a bags intersection: int[] a = [ 1, 2, 4, 5, 7, 9 ]; int[] b = [ 0, 1, 2, 4, 7, 8 ]; int[] c = [ 0, 1, 4, 5, 7, 8 ]; assert(equal(setIntersection(a, a), a)); assert(equal(setIntersection(a, b), [1, 2, 4, 7][])); assert(equal(setIntersection(a, b, c), [1, 4, 7][])); Those examples need to show clearly some repetitions, like this: int[] a = [1, 1, 2, 4, 5, 5, 5, 5, 7, 9]; int[] b = [0, 1, 2, 4, 7, 7, 8]; int[] c = [0, 1, 1, 1, 1, 4, 4, 5, 7, 8]; Extra) Is it worth to add a true nWayUnion (in set theory sense) to std.algorithm? It's a useful operation. A nWayUnion is possibly computed as: seqs.nWayMerge.uniq.array So maybe it's not worth it, I don't know. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 27 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6718 bearophile_hugs eml.cc changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|normal |major Raised importance to Major, because this naming cancer in std.algorithm is worse than I expected in 2011. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 27 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6718 Andrei Alexandrescu <andrei erdani.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|major |minor PST --- No need to raise importance out of proportion. Thanks. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 27 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6718No need to raise importance out of proportion. Thanks.OK, sorry. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 27 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6718 Denis Shelomovskij <verylonglogin.reg gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |verylonglogin.reg gmail.com Severity|minor |normal 18:00:41 MSD --- Bearophile, what about to start NG thread about it? Bad names are always causes of user code bugs so this issue should be thoroughly considered. Also restored "normal" importance as terminology isn't a minor issue. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 30 2013