digitalmars.D - Few things II
- bearophile (152/152) Aug 07 2007 Here are some more notes, that I think are less important/significant th...
- Sean Kelly (42/121) Aug 07 2007 I disagree. Since the key in an AA is equivalent to the index in an
- Bruno Medeiros (11/23) Aug 09 2007 I think some amount of people in the NG knew that already :)
- BCS (10/42) Aug 07 2007 D used to have this after a sorts. It was removed. I forget why.
- Jari-Matti =?ISO-8859-1?Q?M=E4kel=E4?= (18/60) Aug 07 2007 It does a bit more type inference than D at the moment. Currently you ei...
- Jari-Matti =?ISO-8859-1?Q?M=E4kel=E4?= (36/53) Aug 08 2007 I lost my train of thought a bit here. I'm not sure if this works, but
- Walter Bright (3/45) Aug 07 2007 D's ulong is 64 bits, while C's unsigned long is 32 bits. This is likely...
- Oskar Linde (92/136) Aug 07 2007 In that case, writefln(1) should not print the same thing as
- Sean Kelly (6/24) Aug 07 2007 Another advantage is that the property syntax can simplify template code...
- Chris Nicholson-Sauls (16/34) Aug 07 2007 Agreed. Incidentally, consider this little thing I did in Ruby a few
- Bill Baxter (10/24) Aug 07 2007 This is the heart of the repr/str distinction from Python that was
- bearophile (32/53) Aug 09 2007 I was looking for something O(1).
- Oskar Linde (27/47) Aug 09 2007 This is much better, but I would still recommend you to reply to each
- Sean Kelly (4/12) Aug 09 2007 For what it's worth, Tango contains a Fiber implementation that can be
- BCS (30/45) Aug 09 2007 I am currently working on a project that is written in C# but which we a...
- bearophile (190/194) Aug 11 2007 I understand. We may collect a large repository of code/snippets that us...
- BCS (76/116) Aug 11 2007 [dropped D code]
- bearophile (33/35) Aug 16 2007 Thank you. I'm learning D, so I can comment your code; this is the first...
- Oskar Linde (42/56) Aug 16 2007 As far as I can tell, the following should work. Maybe I am missing
- BCS (22/48) Aug 16 2007 that is because it can't, not directly at least. This is because the arg...
Here are some more notes, that I think are less important/significant than the first ones. Probably some of the following notes are silly/useless. There are various things I don't know about D yet. 1) The Ddoc has some bugs. For example I'd like to write in the docs that a function takes an input in [0,1), but that produces a wrong html output. ---------------------- 2) Regarding the printing functions: 2a) I think that the names writef e writefln are a bit too much long and not easy to remember. So better names can be found. For example show/shownl, show/showr, show/showl, put/putr, write/writel, print/printnl, print/printl, etc. I have seen Tango uses print. 2b) I like the idea of the writefl/write functions that don't perform formatting. I was printing a string that casually contained an % and that has produced a bug. So I've created a better printing function, that avoids that bug too. 2c) writefln may print structs too in a default way, my basic (non-recursive!) implementation (that represents structs as: <field1, field2, ...> ): char[] sstr(TyStruct)(TyStruct s) { auto result = new char[][s.tupleof.length]; foreach (i, field; s.tupleof) result[i] = str(field); return "<" ~ result.join(", ") ~ ">"; } 2d) Regarding the printing I think it's not nice that: writefln(["1", "2"]) prints the same thing as: writefln([1, 2]) So maybe writefln can use "" when printing strings inside arrays/AAs, to differentiate them. (To sort out the ideas you may also take a look at the difference between repr() and str() in Python). 2e) I also think writefln when in printing arrays/AAs can put spaces between the commas. I think [el1, el2, el3] and [1:"a", 2:"b"] are more readable than [el1,el2,el3] and [1:a,2:b]. 2f) I think this is a writefln inconsistency: import std.stdio; void main() { char[3] a = "abc"; writefln([a, a]); // Prints: [abc,abc] writefln([a: 1]); // Prints: [[a,b,c]:1] } ---------------------- 3) Given an array x1, like a char[], it seems to me that x1.dup is a bit slower compared to: char[] x2 = new char[x1.length]; x2[] = x1; I hope this can be fixed (or maybe someone can tell me why). ---------------------- 4) The __builtin_expect is a method that gcc (versions >= 2.96) offer for programmers to indicate branch prediction information to the compiler. The return value of __builtin_expect is the first argument (which could only be an integer) passed to it. This may be useful for dmd too, maybe for far future times, when dmd has solved most of its bugs and the D syntax is more finished. ---------------------- 5) I think built-in sets can be quite useful (I use them often in Python, I can show some examples if you want). Few set operations among the keys of AAs may be quite useful too (Python doesn't have them). (A first version may be inside Phobos, implemented using the AAs with empty values.) ---------------------- 6) I suggest to add a third optional to the AAs, the numerical progressive index, it seems easy to add, the foreach already supports more than two variables: foreach(index, key, value; someAA) {...} But I don't like much the fact that the foreach with one parameter scans the values: foreach(value; someAA) {...} because most times you have to iterate on the keys (in Python too the default iteration on a dict is on its keys). With the key you can find the value, while I believe with the value you can't find the key. I'd like a way to find the key if you have a pointer to the actual value contained inside the AA. (I have seen you can cast the pointer to the value to a struct and you can start to walk back with that, but I have not succeed yet to find the key in a reliable way, you may suggest it to me. Maybe a function to do this can be added to Phobos.) I can use it to create a "ordered associative array" data structure (using the built-in AAs) that keeps a linked list of the AA pairs, according to the original insertion order inside the AA. To do that linked list of values you may need to find the key of a given value pointer (or a given key-value pair). ---------------------- 7) A "len" property instead of "length" may be useful (even if it's a bit less clear) because it's used so much often it's boring to write such long word over and over. (And because 'length' isn't an easy to write word for non-English speaking people). ---------------------- 8) From the docs: Overlapping copies are an error: s[0..2] = s[1..3]; // error, overlapping copy s[1..3] = s[0..2]; // error, overlapping copy Disallowing overlapping makes it possible for more aggressive parallel code optimizations than possible with the serial semantics of C. But the programmer may need such operations anyway now and then, so maybe a different syntax can be added... I don't know. Or maybe other solutions can be found. ---------------------- 9) Bit fields of structs: D is a practical language, so I think it can support them too, to allow people to translate C => D code more simply (D has already chosen similar practical compromises regarding C for other things). ---------------------- 10) I think the compiler has to accept a line like this too, converting the strings as necessary: char[][int] a = [1:"abba", 2:"hello"]; It may even accept this too (probably the values have to be converted to dynamic arrays): auto a = [1:"abba", 2:"hello"]; auto aa1 = [1:2, 3:4]; auto sa = ["0x","1x"]; auto a2 = [[1, 2], [3, 4]]; While this works already: auto a = [1, 2, 3, 4]; auto ca = ['0','1']; int[int] aa2 = [1:2, 3:4]; int[][] a3 = [[1, 2], [3, 4]]; ---------------------- 11) I have seen that D AAs are quite efficient, but probably there are very good hash implementations around. So "superfasthash" or a modified version of the super refined and optimized Python/Perl hashes can be useful to replace the currently used by DMD. (Python source code is fully open source but code licenses may be a problem anyway, I don't know). ---------------------- 12) I have seen D allows to pass anonymous functions like: (int x){return x*x;} x => x*x That syntax isn't bad, but I don't know how the compiler automatically manages the types in such situations, it can be a bit complex. ---------------------- 13) The Haskell language community is quite organized, they have a large page on the main Haskell site that compares the best ways to create the entries for the Shootout site. They have even improved the language because one of such tests has shown a bad (slow) side of Haskell. I have recently posted few D entries into the Shootout site, with mixed results (one entry is at the top). Regarding the shootout code the following code may be used by DMD developers to improve the compiler, it shows a bad side of DMD compared to GCC: C code: #include <stdio.h> #include <stdlib.h> unsigned long fib(unsigned long n) { return( (n < 2) ? 1 : (fib(n-2) + fib(n-1)) ); } int main(int argc, char *argv[]) { int N = ((argc == 2) ? atoi(argv[1]) : 1); printf("%ld\n", fib(N)); return(0); } The D code: import std.stdio, std.conv; ulong fib(ulong n) { return (n < 2) ? 1 : (fib(n-2) + fib(n-1)); } void main(string[] argv) { ulong n = ((argv.length == 2) ? toUlong(argv[1]) : 1); writefln(fib(n)); } Used: gcc version 3.4.2 (mingw-special) DMD Digital Mars D Compiler v1.020 (Windows) Compiled with: gcc -O2 fiboC.c -o fiboC.exe dmd -O -release -inline fiboD.d Timings with n = 38: C: 4.38 s D: 5.28 s I think such diffece is big enough to justify an improvement. ---------------------- 14) This code gives: problem.d(5): Error: cannot evaluate StrType() at compile time But I'd like to use typeid().toString at compile time (it's useful inside compile time functions, etc): import std.stdio; char[] StrType(T)() { return typeid(T).toString; } const char[] r = StrType!(typeof(1))(); void main() { writefln(r); } Later I have solved similar problems with mixins, compile-time demangle, etc, but I think the toString of typeid() may just work at compile time. ---------------------- 15) On the newsgroup I have seen a list of "wishes", among them I like: - support struct&array in switch - Multi-Dimensional Allocation (of dynamic arrays too) - Named keyword arguments - Iterators and Generators (yield is a good starting point) - Templates in classes - !in (probably easy to implement) - Multiple return values (tuples) (Quite useful, but I have found other solutions) - Statically check for == null (maybe not easy to implement) - Implicit New or short syntax for New Beside them I like list generators from Python, but their syntax isn't much compatibile with D syntax, so I don't know how they can be written. ---------------------- 16) Beside C++ and Python, I think D can copy Erlang e Fortress languages a bit too. Erlang is nice for its easy parallel capabilities, and the Fortress language made by Sun shows lot of collections literals (bags, sets, lists, etc), and an easy management of parallel processing (I think it looks somewhat similar to the ParallelPascal one). ---------------------- 17) I am following this newsgroup for some weeks, I have seen now and then it pops out some DMD problems regarding type inferencing (I too have found a 'bug' related to that, I have shown it on the learn newsgroup). I am helping the development of "ShedSkin" it's a compiler for an implicitly static subset of Python to C++. This program contains a really powerful type inferencer able to find fully by itself the types of all the variables used inside a 3000-lines long Python program (that contains no explicit type annotations). ShedSkin is rather slow and I think D will never need all its type inferencing capabilities, but I think its main point is to show that you can produce very fast code even if you start with code that's written with a very easy hi-level syntax. This 'discovery' may be useful to D too in the future. ---------------------- 18) I don't like still that string literals are immutable while the others not. I suggest to make the string literals mutable as the other ones (automatic dup during program creation). This will avoid some of my bugs (and it can probably avoid problems in porting code Win <=>Linux that uses strings) and help me avoid some tricks to de-constant strings when they are constant. In few situations you may need to create a program that avoids heap activity, so in those situations you may need the normal string literals anyway, in such situations you may add a certain letter after the string literal (like c?) that tells the compile to not perform the automatic dup. ---------------------- 19) I have created many very general function templates like map, imap, map, zip, sum, max, min, filter, ifilter, all, any, etc, that accept arrays, AAs and iterable classes. It may be useful to add a bit of functional-style programming to D, inside a Phobos module. Such functions may be fit for the 90% of the code where max running speed isn't necessary. They shorten the code, reduce bugs, speed up programming, make the program simpler to read, etc, and being hi-level they may even end helping the compiler produce efficient code (as foreach sometimes does). ---------------------- 20) After few test of mine I think the default way dmd computes struct hashes has to be improved (or removed), to make it compute the hash looking inside the dynamic arrays contained into the struct too. Otherwise it silently gives the "wrong" results. ---------------------- 21) In this newsgroup I've seen that people like a lot the ability of doing: txt.split() But I don't like it much because it's a syntactical sugar that masks the fact that those split are functions (that pollute the namespace) and they aren't true methods (object functions). That namespace pollution may cause problems, if you have already defined elsewhere a function with the same name and similar signature. I think that's all, my successive post will probably be much smaller and mostly comments to specific posts :-) Bear hugs, bearophile
Aug 07 2007
bearophile wrote:6) I suggest to add a third optional to the AAs, the numerical progressive index, it seems easy to add, the foreach already supports more than two variables: foreach(index, key, value; someAA) {...} But I don't like much the fact that the foreach with one parameter scans the values: foreach(value; someAA) {...} because most times you have to iterate on the keys (in Python too the default iteration on a dict is on its keys). With the key you can find the value, while I believe with the value you can't find the key.I disagree. Since the key in an AA is equivalent to the index in an array, the current behavior of foreach is consistent for all types passed to it. One parameter gives you the value, two parameters gives you the 'key' and the value. And since the key for an AA is equivalent to the index value, I'm not sure I like the three parameter idea either. It's just as easy to use a separate variable for that when counting entries is important anyway.I'd like a way to find the key if you have a pointer to the actual value contained inside the AA. (I have seen you can cast the pointer to the value to a struct and you can start to walk back with that, but I have not succeed yet to find the key in a reliable way, you may suggest it to me. Maybe a function to do this can be added to Phobos.) I can use it to create a "ordered associative array" data structure (using the built-in AAs) that keeps a linked list of the AA pairs, according to the original insertion order inside the AA. To do that linked list of values you may need to find the key of a given value pointer (or a given key-value pair).It's hardly efficient, but: Key getKeyOf(Value, Key)( Value[Key] aa, Value val ) { foreach( k, v; aa ) { if( v == val ) return k; } return Key.init; }---------------------- 7) A "len" property instead of "length" may be useful (even if it's a bit less clear) because it's used so much often it's boring to write such long word over and over. (And because 'length' isn't an easy to write word for non-English speaking people).It would bed nice if there were a way to support this within the current syntax. Though I suppose a property method could be created: void copyTo(T)( T src, T dst ) in { assert( src.length == dst.length ); } body { memmove( dst.ptr, src.ptr, src.length ); } myArray[1 .. 3].copyTo( myArray[2 .. 4] );---------------------- 11) I have seen that D AAs are quite efficient, but probably there are very good hash implementations around. So "superfasthash" or a modified version of the super refined and optimized Python/Perl hashes can be useful to replace the currently used by DMD. (Python source code is fully open source but code licenses may be a problem anyway, I don't know).Not hard to do. I've considered even using a "power of two" number of buckets for the D AA but because the degenerate case is so much worse than with a prime number of buckets I think the current implementation is more generally useful. A library hashtable is probably the best option for more specialized needs.---------------------- 12) I have seen D allows to pass anonymous functions like: (int x){return x*x;} x => x*x---------------------- 13) The Haskell language community is quite organized, they have a large page on the main Haskell site that compares the best ways to create the entries for the Shootout site. They have even improved the language because one of such tests has shown a bad (slow) side of Haskell. I have recently posted few D entries into the Shootout site, with mixed results (one entry is at the top). Regarding the shootout code the following code may be used by DMD developers to improve the compiler, it shows a bad side of DMD compared to GCC: C code: #include <stdio.h> #include <stdlib.h> unsigned long fib(unsigned long n) { return( (n < 2) ? 1 : (fib(n-2) + fib(n-1)) ); } int main(int argc, char *argv[]) { int N = ((argc == 2) ? atoi(argv[1]) : 1); printf("%ld\n", fib(N)); return(0); } The D code: import std.stdio, std.conv; ulong fib(ulong n) { return (n < 2) ? 1 : (fib(n-2) + fib(n-1)); } void main(string[] argv) { ulong n = ((argv.length == 2) ? toUlong(argv[1]) : 1); writefln(fib(n)); } Used: gcc version 3.4.2 (mingw-special) DMD Digital Mars D Compiler v1.020 (Windows) Compiled with: gcc -O2 fiboC.c -o fiboC.exe dmd -O -release -inline fiboD.d Timings with n = 38: C: 4.38 s D: 5.28 s I think such diffece is big enough to justify an improvement.Unless this was run on a 64-bit Unix machine, the major differernce in performance is that 'long' in the C code is a 32-but value, while 'long' in the D code is a 64-bit value. Someone should verify that the sizes match.---------------------- 14) This code gives: problem.d(5): Error: cannot evaluate StrType() at compile time But I'd like to use typeid().toString at compile time (it's useful inside compile time functions, etc): import std.stdio; char[] StrType(T)() { return typeid(T).toString; } const char[] r = StrType!(typeof(1))(); void main() { writefln(r); }What about .stringof? Sean
Aug 07 2007
Sean Kelly wrote:bearophile wrote:I think some amount of people in the NG knew that already :) There were several mentions of it some time ago (more than year), already supports delegate literals, with a syntax nearly identical to D, and with full support for outer variables (ie, detects outer variables and allocates them in heap instead of the stack). -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D---------------------- 12) I have seen D allows to pass anonymous functions like: (int x){return x*x;} x => x*x
Aug 09 2007
Reply to bearophile,Here are some more notes, that I think are less important/significant than the first ones. Probably some of the following notes are silly/useless. There are various things I don't know about D yet. 1) The Ddoc has some bugs. For example I'd like to write in the docs that a function takes an input in [0,1), but that produces a wrong html output.It has a few more than that <G>---------------------- 9) Bit fields of structs: D is a practical language, so I think it can support them too, to allow people to translate C => D code more simply (D has already chosen similar practical compromises regarding C for other things).D used to have this after a sorts. It was removed. I forget why.---------------------- 15) On the newsgroup I have seen a list of "wishes", among them I like: - Iterators and Generators (yield is a good starting point)under the covers, yeild is a disaster IMO21) In this newsgroup I've seen that people like a lot the ability of doing: txt.split() But I don't like it much because it's a syntactical sugar that masks the fact that those split are functionsHow? Just about anything after a dot is a funcion (excluding member access and with regars to types) even method cals are function.(that pollute the namespace)If that were removed then nothing could be used with that syntax on a char[], so what pollution is their?and they aren't true methods (object functions).In what way?That namespace pollution may cause problems, if you have already defined elsewhere a function with the same name and similar signature.Could you give an example?
Aug 07 2007
bearophile wrote:12) I have seen D allows to pass anonymous functions like: (int x){return x*x;} x => x*x That syntax isn't bad, but I don't know how the compiler automatically manages the types in such situations, it can be a bit complex.It does a bit more type inference than D at the moment. Currently you either get return type or parameter type inference, but not both: (int x){return x*x;} int mul(T)(T x) { return x*x; }17) I am following this newsgroup for some weeks, I have seen now and then it pops out some DMD problems regarding type inferencing (I too have found a 'bug' related to that, I have shown it on the learn newsgroup). I am helping the development of "ShedSkin" it's a compiler for an implicitly static subset of Python to C++. This program contains a really powerful type inferencer able to find fully by itself the types of all the variables used inside a 3000-lines long Python program (that contains no explicit type annotations). ShedSkin is rather slow and I think D will never need all its type inferencing capabilities, but I think its main point is to show that you can produce very fast code even if you start with code that's written with a very easy hi-level syntax. This 'discovery' may be useful to D too in the future.At least the compile times are quite fast currently so there's much potential for this.19) I have created many very general function templates like map, imap, map, zip, sum, max, min, filter, ifilter, all, any, etc, that accept arrays, AAs and iterable classes. It may be useful to add a bit of functional-style programming to D, inside a Phobos module. Such functions may be fit for the 90% of the code where max running speed isn't necessary. They shorten the code, reduce bugs, speed up programming, make the program simpler to read, etc, and being hi-level they may even end helping the compiler produce efficient code (as foreach sometimes does).It's hard to predict how much generic D code will still change in the near future, but a functional library sounds useful. I've also written both compile and runtime versions of generic map, fold, min, max, ... functions. Probably some dsource project already implements them too, but it's faster to write a simple algorithm from scratch than search the whole dsource and worry about possible IP issues. But in overall I assume a lot of developer time is wasted on reinventing the wheel :-/21) In this newsgroup I've seen that people like a lot the ability of doing: txt.split() But I don't like it much because it's a syntactical sugar that masks the fact that those split are functions (that pollute the namespace) and they aren't true methods (object functions). That namespace pollution may cause problems, if you have already defined elsewhere a function with the same name and similar signature.There was discussion about this some time ago. For example Scala has an interesting language construct we could use in D.I think that's all, my successive post will probably be much smaller and mostly comments to specific posts :-) Bear hugs, bearophile--
Aug 07 2007
Jari-Matti Mäkelä wrote:bearophile wrote:I lost my train of thought a bit here. I'm not sure if this works, but (T)(T x) { return x; } could allow type inference in both ways. Still on the parameter side the polymorphism is explicit and you can't "chain" the literals because there's no implicit parametric type in D afaik: auto myfun = (x) { return "foo"; } auto myval = myfun(myfun(42)); A bad example, but this kind of code would also benefit a lot template foo(T : int) { alias char[] foo; } template foo(T : char[]) { alias int foo; } template zzz(T : int) { const zzz = "hello"; } template zzz(T : char[]) { const zzz = 0; } U bar(T, U = foo!(T))(T t) { return zzz!(T); } /* with proper type inference def bar(t) { static if (is(t : int) return "hello"; else return 0; } */ void main() { auto a = bar(42); auto b = bar("hi"); pragma(msg, typeof(a).stringof); // char[] pragma(msg, typeof(b).stringof); // int } I've even seen wicked code that in the end looked pretty much like this U bar(T, U = inferType!(parseAST!(scanSource!(traits!(getSourceOfThisFun ())))) )(T t) { ... } What can I say - somebody must have had fun :P12) I have seen D allows to pass anonymous functions like: (int x){return x*x;} x => x*x That syntax isn't bad, but I don't know how the compiler automatically manages the types in such situations, it can be a bit complex.It does a bit more type inference than D at the moment. Currently you either get return type or parameter type inference, but not both: (int x){return x*x;} int mul(T)(T x) { return x*x; }
Aug 08 2007
bearophile wrote:13) The Haskell language community is quite organized, they have a large page on the main Haskell site that compares the best ways to create the entries for the Shootout site. They have even improved the language because one of such tests has shown a bad (slow) side of Haskell. I have recently posted few D entries into the Shootout site, with mixed results (one entry is at the top). Regarding the shootout code the following code may be used by DMD developers to improve the compiler, it shows a bad side of DMD compared to GCC: C code: #include <stdio.h> #include <stdlib.h> unsigned long fib(unsigned long n) { return( (n < 2) ? 1 : (fib(n-2) + fib(n-1)) ); } int main(int argc, char *argv[]) { int N = ((argc == 2) ? atoi(argv[1]) : 1); printf("%ld\n", fib(N)); return(0); } The D code: import std.stdio, std.conv; ulong fib(ulong n) { return (n < 2) ? 1 : (fib(n-2) + fib(n-1)); } void main(string[] argv) { ulong n = ((argv.length == 2) ? toUlong(argv[1]) : 1); writefln(fib(n)); } Used: gcc version 3.4.2 (mingw-special) DMD Digital Mars D Compiler v1.020 (Windows) Compiled with: gcc -O2 fiboC.c -o fiboC.exe dmd -O -release -inline fiboD.d Timings with n = 38: C: 4.38 s D: 5.28 s I think such diffece is big enough to justify an improvement.D's ulong is 64 bits, while C's unsigned long is 32 bits. This is likely the source of the speed difference.
Aug 07 2007
bearophile wrote:2d) Regarding the printing I think it's not nice that: writefln(["1", "2"]) prints the same thing as: writefln([1, 2]) So maybe writefln can use "" when printing strings inside arrays/AAs, to differentiate them.In that case, writefln(1) should not print the same thing as writefln("1") either. But I agree that it would be useful to have a function that prints strings with "". Preferably, it should be able to convert any expression of basic types into a string representation, that when parsed by the compiler as a literal yields an identical value.2e) I also think writefln when in printing arrays/AAs can put spaces between the commas. I think [el1, el2, el3] and [1:"a", 2:"b"] are more readable than [el1,el2,el3] and [1:a,2:b].Ideally, a formatter should allow you to define any format you want.2f) I think this is a writefln inconsistency: import std.stdio; void main() { char[3] a = "abc"; writefln([a, a]); // Prints: [abc,abc] writefln([a: 1]); // Prints: [[a,b,c]:1] }This is not a writefln inconsistency as much as it is a DMD inconsistency. note that typeof([a, a]) == char[][2], while typeof([a: 1]) == int[char[3]], This feels like a bug, but somehow, I have a feeling it is intentional to allow string array literals such as: ["a","ab","abc"] without giving a stupid compiler error about type incompatibility. IMHO, it would be much better if array literals instead picked the best common type, allowing expressions such as: [1, 1.5, 2, 2.5]3) Given an array x1, like a char[], it seems to me that x1.dup is a bit slower compared to: char[] x2 = new char[x1.length]; x2[] = x1; I hope this can be fixed (or maybe someone can tell me why)..dup should definitely not be any slower (unless you have a GC compiled with DbC that does a memcmp after each .dup). Do you have a benchmark showing this?6) I suggest to add a third optional to the AAs, the numerical progressive index, it seems easy to add, the foreach already supports more than two variables: foreach(index, key, value; someAA) {...} But I don't like much the fact that the foreach with one parameter scans the values: foreach(value; someAA) {...} because most times you have to iterate on the keys (in Python too the default iteration on a dict is on its keys). With the key you can find the value, while I believe with the value you can't find the key.I agree with Sean here. The current behavior is the most consistent one.I'd like a way to find the key if you have a pointer to the actual value contained inside the AA. (I have seen you can cast the pointer to the value to a struct and you can start to walk back with that, but I have not succeed yet to find the key in a reliable way, you may suggest it to me. Maybe a function to do this can be added to Phobos.) I can use it to create a "ordered associative array" data structure (using the built-in AAs) that keeps a linked list of the AA pairs, according to the original insertion order inside the AA. To do that linked list of values you may need to find the key of a given value pointer (or a given key-value pair).Something like this should work with current DMD and derivations that keep the memory layout of AA's: template getKeyFromPtr(Key) { Key getKeyFromPtr(Val)(Val *ptr) { auto offset = (Key.sizeof + size_t.sizeof - 1) & ~(size_t.sizeof - 1); return *cast(Key *)(cast(void*)ptr - offset); } } (Not really tested though.)9) Bit fields of structs: D is a practical language, so I think it can support them too, to allow people to translate C => D code more simply (D has already chosen similar practical compromises regarding C for other things).Using string mixins and some CTFE it is not that hard implementing a bit field replacement.10) I think the compiler has to accept a line like this too, converting the strings as necessary: char[][int] a = [1:"abba", 2:"hello"];I agree. My solution would be to make string literals dynamic arrays instead of static. It doesn't make much sense to put the length into the type. "a" and "ab" should be of the same type.11) I have seen that D AAs are quite efficient, but probably there are very good hash implementations around. So "superfasthash" or a modified version of the super refined and optimized Python/Perl hashes can be useful to replace the currently used by DMD. (Python source code is fully open source but code licenses may be a problem anyway, I don't know).I usually make my own containers when the built in AA doesn't cut it, such as: Map!(K,V), OrderedMap!(K,V), DiskStoredMap!(K,V), and so on. The only big problem is D's lack of a reference return type.12) I have seen D allows to pass anonymous functions like: (int x){return x*x;} x => x*x That syntax isn't bad, but I don't know how the compiler automatically manages the types in such situations, it can be a bit complex.It is not bad, but it can never evaluate to a single function. It would have to evaluate to a template alias or similar. But I too wish D had a more convenient syntax for single expression delegate literals, such as: (int x) => x*x or something.Beside them I like list generators from Python, but their syntax isn't much compatibile with D syntax, so I don't know how they can be written.They could probably be written something like: int x; // dummy List(x*x, x, range(10)) or List((int x) { return x*x; }, range(10)); or (as I have implemented them): map(range(10), (int x) { return x*x; }); range(10).map((int x) { return x*x; }); or equivalently as a lazy expression: a = range(1_000_000_000_000_000L).mapLazy((int x) { return x*x; }); writefln(a[555_555_555_555]);18) I don't like still that string literals are immutable while the others not. I suggest to make the string literals mutable as the other ones (automatic dup during program creation). This will avoid some of my bugs (and it can probably avoid problems in porting code Win <=>Linux that uses strings) and help me avoid some tricks to de-constant strings when they are constant. In few situations you may need to create a program that avoids heap activity, so in those situations you may need the normal string literals anyway, in such situations you may add a certain letter after the string literal (like c?) that tells the compile to not perform the automatic dup.Most strings will never be changed, so this will make lots of unnecessary dups. D2.0's strings are constant (pronounced constant but spelled invariant) is a better solution. Maybe an automatic dup from constant strings into char[] would be a good thing.19) I have created many very general function templates like map, imap, map, zip, sum, max, min, filter, ifilter, all, any, etc, that accept arrays, AAs and iterable classes. It may be useful to add a bit of functional-style programming to D, inside a Phobos module. Such functions may be fit for the 90% of the code where max running speed isn't necessary. They shorten the code, reduce bugs, speed up programming, make the program simpler to read, etc, and being hi-level they may even end helping the compiler produce efficient code (as foreach sometimes does).I have implemented heaps of such functions too and I believe most people have implemented at least a few of those. I made a proposal to standardize a number of such functions a long time ago. It is a shame there is no standard regarding such functions. If not only because it would make it so much easier to publish short snippets of code without having to constantly define all the basic functions. Luckily, Tango has some really well implemented and solid functions in tango.core.Array. A collection that can only grow. BTW, I really like your naming convention for in place versions (as I assume they are). I might steal that right away :). Or does i mean those are iterator functions?20) After few test of mine I think the default way dmd computes struct hashes has to be improved (or removed), to make it compute the hash looking inside the dynamic arrays contained into the struct too. Otherwise it silently gives the "wrong" results.This is just another example that treating structs as plain old data may be a too simplistic view. Another example is comparing two structs for equality. One could argue that two structs should be equal if all members are equal. That is not how it works today.21) In this newsgroup I've seen that people like a lot the ability of doing: txt.split() But I don't like it much because it's a syntactical sugar that masks the fact that those split are functions (that pollute the namespace) and they aren't true methods (object functions). That namespace pollution may cause problems, if you have already defined elsewhere a function with the same name and similar signature.I disagree. First, the name space "pollution" is no problem in D. You can never accidentally call an unwanted overload of a function from another module. D will issue an ambiguity error when the same identifier matches instances in several different modules. Secondly, I have really grown to like the way of chaining functions on arrays. The function call order and the data flow is read left to right. An example from my own code: references ~= refs .splitIter() .filterIter((char[] c) { return c.length > 2; }) .map((char[] c) { return a[1..$-1]; }); Writing this as: references ~= map(filterIterate(refs.splitIterate(), (char[] c) { return c.length > 2; }), (char[] c) { return a[1..$-1]; }); Is both much harder to read and parse (to me atleast) and also hard to indent in any meaningful way. -- Oskar
Aug 07 2007
Oskar Linde wrote:bearophile wrote:Another advantage is that the property syntax can simplify template code in some cases because it allows arrays to be treated as if they were UDTs so long as the proper support routine is available. To me, this goes a long way towards supporting arrays as viable containers. Sean21) In this newsgroup I've seen that people like a lot the ability of doing: txt.split() But I don't like it much because it's a syntactical sugar that masks the fact that those split are functions (that pollute the namespace) and they aren't true methods (object functions). That namespace pollution may cause problems, if you have already defined elsewhere a function with the same name and similar signature.I disagree. First, the name space "pollution" is no problem in D. You can never accidentally call an unwanted overload of a function from another module. D will issue an ambiguity error when the same identifier matches instances in several different modules. Secondly, I have really grown to like the way of chaining functions on arrays. The function call order and the data flow is read left to right.
Aug 07 2007
Oskar Linde wrote:Secondly, I have really grown to like the way of chaining functions on arrays. The function call order and the data flow is read left to right. An example from my own code: references ~= refs .splitIter() .filterIter((char[] c) { return c.length > 2; }) .map((char[] c) { return a[1..$-1]; }); Writing this as: references ~= map(filterIterate(refs.splitIterate(), (char[] c) { return c.length > 2; }), (char[] c) { return a[1..$-1]; }); Is both much harder to read and parse (to me atleast) and also hard to indent in any meaningful way.Agreed. Incidentally, consider this little thing I did in Ruby a few weeks back as an answer to a programming challenge (hey, technically just two lines): We can actually come quite close to this in D. -- Chris Nicholson-Sauls
Aug 07 2007
Oskar Linde wrote:bearophile wrote:This is the heart of the repr/str distinction from Python that was mentioned. In Python instead of just one toString method, you have __str__ and __repr__. __str__ is supposed to return a version that looks nice for "human reading" __repr__ is supposed to return a string that (if possible) would recreate the object if passed to the interpreter. So the str of "1" prints without quote characters, but repr of "1" prints with quotes. --bb2d) Regarding the printing I think it's not nice that: writefln(["1", "2"]) prints the same thing as: writefln([1, 2]) So maybe writefln can use "" when printing strings inside arrays/AAs, to differentiate them.In that case, writefln(1) should not print the same thing as writefln("1") either. But I agree that it would be useful to have a function that prints strings with "". Preferably, it should be able to convert any expression of basic types into a string representation, that when parsed by the compiler as a literal yields an identical value.
Aug 07 2007
I have changed the way I answer, hopefully this is better, now I group answers to a person. Sean Kelly:It's hardly efficient, but:<I was looking for something O(1).Though I suppose a property method could be created:<Nice, I'll try that. But maybe a syntax like this one is simpler: myArray[1 .. 3].copyTo(2, 4);Not hard to do.<Well copying large portions of hashing Python/Perl C sourcecode isn't that easy, it has to be adapted.A library hashtable is probably the best option for more specialized needs.<I know, but I was talking about improving the D built in AA copying the already refined implementations that come from 15+ years of community efforts to produce the best.What about .stringof?<Thank you. -------------------------- BCS <ao at pathlink dot com>:under the covers, yeild is a disaster IMO<Maybe this is true for a low level language, I don't know. But in Python it's very useful and quite commonly used (but I don't like its V.2.5 use for coroutines, it's too much confusing and its syntax is dirty because it looks cause no disasters :-) It avoids you to mess with the opApply. -------------------------- Oskar Linde:But I agree that it would be useful to have a function that prints strings with "".<I have already written it, it's called put/putr, and it solves all the problems I was talking of :-)Preferably, it should be able to convert any expression of basic types into a string representation, that when parsed by the compiler as a literal yields an identical value.<This may require some extra testing to see if it's true.Ideally, a formatter should allow you to define any format you want.<You can define all the formats you want, but sensible defaults are necessary anyway. I was talking about the standard tools in standard configuration, that you use when you debug, when you write things quickly, etc.This is not a writefln inconsistency as much as it is a DMD inconsistency.<I see, and I think it deserves to be fixed :-).dup should definitely not be any slower (unless you have a GC compiled with DbC that does a memcmp after each .dup). Do you have a benchmark showing this?<I have just written a benchmark, and now .dup is quite faster than the other (about 50% faster, and it's a repeatable result), so I shut up.Something like this should work with current DMD and derivations that keep the memory layout of AA's:<Thank you, I'll test it.Using string mixins and some CTFE it is not that hard implementing a bit field replacement.<Okay. Maybe this can be standardized inside a Phobos lib (unrelated: bitvectors of Phobos are nice, but they are slow compared to the old 'bit' type).My solution would be to make string literals dynamic arrays instead of static.<I'd like that. I don't know what's the point of having them a different kind of strings (but I like to have a way to use static array of chars anyway, they may be useful now and then).I usually make my own containers when the built in AA doesn't cut it, such as:<Designing and tuning a very good associative array/set requires months or years, so I was talking about the built-in one.I have implemented heaps of such functions too and I believe most people have implemented at least a few of those.<I am sure lot of people have done similar things, because when used (and not over-used) they can help (expecially if you are a bit used to functional programming). It's just I think I have implemented lot of them, and mine are probably more flexible (they usually accept arrays, AAs and iterable objects, more than one when possible, mixed types when possible, etc. There are some limitations that I have tries to push, some can't be removed) :-)Luckily, Tango has some really well implemented and solid functions in tango.core.Array. A collection that can only grow.<I have seen them, and I think they are nice, but I was talking about something for functional-style coding.BTW, I really like your naming convention for in place versions (as I assume they are). I might steal that right away :). Or does i mean those are iterator functions?<You are free to steal. I can probably offer my functional module to Tango developers too. My "isomething" has the same meaning as in the itertools module of Python, they are related to lazy iterable objects.Another example is comparing two structs for equality.<I see :-)First, the name space "pollution" is no problem in D. You can never accidentally call an unwanted overload of a function from another module. D will issue an ambiguity error when the same identifier matches instances in several different modules.<I see.Is both much harder to read and parse (to me atleast) and also hard to indent in any meaningful way.<I think both versions aren't much readable (but the first is better), I suggest you to split your lines in more parts, so you can give those parts a name and you can comment the lines too. Thank you for the many answers, and the two snippets of code I may find useful. Bear hugs, bearophile
Aug 09 2007
bearophile <bearophile wrote:I have changed the way I answer, hopefully this is better, now I group answers to a person.This is much better, but I would still recommend you to reply to each post(and person) in a separate message. Not doing that breaks the message threading (the References header in your post will not refer to the correct messages) making it much harder for others to follow the discussion. The web-news reader has lots of issues in itself, so I would really recommend you to use an NNTP client if possible. Thunderbird seems to be the most popular one.Sean Kelly:This would also have the advantage of actually being implementable with current DMD.Though I suppose a property method could be created:<Nice, I'll try that. But maybe a syntax like this one is simpler: myArray[1 .. 3].copyTo(2, 4);BCS <ao at pathlink dot com>:yield would not be any worse than current opApply. In fact, it would be identical on the low level side. To improve the low level side of this, the compiler would have to be better at inlining and const-folding delegate calls.under the covers, yeild is a disaster IMO<Maybe this is true for a low level language, I don't know. But in Python it's very useful and quite commonly used (but I don't like its V.2.5 use for coroutines, it's too much confusing and its syntax is dirty because it looks cause no disasters :-) It avoids you to mess with the opApply.Many times people have been posting improvements to the built in AA. The problem is proving that the improvements really are better for the general case. The hardest part has proven to be defining the "general case".I usually make my own containers when the built in AA doesn't cut it, such as:<Designing and tuning a very good associative array/set requires months or years, so I was talking about the built-in one.I have also made quite general functions and implemented iterator versions for many of them too. I have really not had much time lately (as in the last 10 months), but would still be interested to see your code. There are quite a few new features in D since I worked on this last that should make some things much easier now.I have implemented heaps of such functions too and I believe most people have implemented at least a few of those.<I am sure lot of people have done similar things, because when used (and not over-used) they can help (expecially if you are a bit used to functional programming). It's just I think I have implemented lot of them, and mine are probably more flexible (they usually accept arrays, AAs and iterable objects, more than one when possible, mixed types when possible, etc. There are some limitations that I have tries to push, some can't be removed) :-)Thank you for your comments on my personal coding style ;p. Cheers, -- OskarIs both much harder to read and parse (to me atleast) and also hard to indent in any meaningful way.<I think both versions aren't much readable (but the first is better), I suggest you to split your lines in more parts, so you can give those parts a name and you can comment the lines too.
Aug 09 2007
bearophile <bearophile wrote:-------------------------- BCS <ao at pathlink dot com>:For what it's worth, Tango contains a Fiber implementation that can be used for coroutines and such. Seanunder the covers, yeild is a disaster IMO<Maybe this is true for a low level language, I don't know. But in Python it's very useful and quite commonly used (but I don't like its V.2.5 use for coroutines, it's too much confusing and its syntax is dirty because it looks cause no disasters :-) It avoids you to mess with the opApply.
Aug 09 2007
Reply to bearophile,-------------------------- BCS <ao at pathlink dot com>:considering converting to D. The use of yield has turned out to be a significant performance problem. This is the reson I don't like it.under the covers, yeild is a disaster IMO<Maybe this is true for a low level language, I don't know. But in Python it's very useful and quite commonly used (but I don't like its V.2.5 use for coroutines, it's too much confusing and its syntax is dirty because it looks like a statement used as an expression). It'sIt avoids you to mess with the opApply.D's opApply syntax IMO seems much cleaner and easier to understand. Also it can work without any heap allocations. OTOH you can fake it quiet well with an delegate that stores most of it's state in an object. |class Walk(T) |{ | T[] dat; | int at; | | this(T[] d){dat = d; at = 0;} | | int opApply(int delegate(inout int, inout T) dg) | { | while(at < dat.length) | { | int j = at; | if(int ret = dg(j, dat[at])) return ret; //yeild somtimes | at++; | } | } |} | |auto walker = new Walk!(char)("hello world"); |foreach(inout int i, inout char c; walker) if(c == ' ') break; //skip hello |foreach(inout int i, inout char c; walker) writef("%s", c); //prints world--------------------------
Aug 09 2007
Oskar Linde:Many times people have been posting improvements to the built in AA. The problem is proving that the improvements really are better for the general case. The hardest part has proven to be defining the "general case".<I understand. We may collect a large repository of code/snippets that use AAs, so they can be tuned against that repository. ----------------- BCS:considering converting to D. The use of yield has turned out to be a significant performance problem. This is the reson I don't like it.<I see. Let's assume you are right, that yield is too much slow for a low level language (as D. D is a low leven language that is able to be used as medium-level language too). I think yield is good if you use Python, that's a high level language. I also think a language like D can have both some hi-level constructs and some lower level ones. Often many parts of a program don't need max speed (a profiler usually shows just few hot spots in a program), so there you may use yield. Where speed is more important the language can offer you means to go faster. This looks like a win-win situation to me (but the programmer must know what language constructs are slow, so they don't have to be used in speed critical parts of the code). After developing ShedSkin I think languages fit for both low level and high level programming can be designed (but a programmer has to know them well to avoid using the slow features in the speed/memory critical parts of the code).D's opApply syntax IMO seems much cleaner and easier to understand.<I can't agree (but I know Python much more than D, so I am biased), yield of Python is very simple :-) Here are some examples (to reduce space I have reduced the indent from 4 to 2 spaces, I have removed most comments and all the docstrings): Space-efficient and lazy version of sieve of Eratosthene, by Eppstein: def xsieve(): roots = xsieve() root = roots.next() square = root*root D = {} n = 3 while True: D[square] = root+root root = roots.next() square = root*root yield n p = D[n] q = n+p while q in D: q += p del D[n] D[q] = p Generator that returns the integer codons of the given (string) sequence: def xcodons(seq, overlap=0): assert 0 <= overlap <= 2 len_seq = len(seq) pos = 0 while len_seq - pos >= 3: yield seq[pos : pos+3] pos += 3 - overlap Divides a given 2D mat in equal blocks, of given size, and yields them: def xblockfy(mat, blocknrows, blockncols): if blocknrows <= 0: raise ValueError("blocknrows must be >= 1") if blockncols <= 0: raise ValueError("blockncols must be >= 1") if mat and mat[0]: nx = len(mat[0]) ny = len(mat) for y in xrange(0, ny, blocknrows): ymax = min(y+blocknrows, ny) for x in xrange(0, nx, blockncols): block = [] for yy in xrange(y, ymax): block.extend(mat[yy][x: x+blockncols]) yield block Arithmetic progression generator, that accepts noninteger increments too: def xfrange(start, end=None, inc=None): if inc is None: inc = 1.0 if end is None: end = start + 0.0 start = 0.0 else: start = float(start) count = int( ceil((end-start)/inc) ) for i in xrange(count): yield start + i * inc Lazy iterable split, with key function too (I have created a similar thing for D): def xsplit(seq, key=bool, keepkeys=True): group = [] for el in seq: if key(el): if group: yield group group = [] if keepkeys: group.append(el) else: group.append(el) yield group Yields all the distinct pairs of the given seq: def xpairs(seq): lung = len(seq) for i,e1 in enumerate(seq): for j in xrange(i+1, lung): yield e1, seq[j] Yields all the positions of item in a given iterable: def xpositions(seq, item): for pos, el in enumerate(seq): if el == item: yield pos I know you can do those things with D too, this is an example of a translation from Python that uses yield to D, a lazy generator of all the permutations of a sequence, algorithm from Phillip Paul Fuchs, modified: def xpermutations(items, copy=True): if copy: items = list(items) n = len(items) p = range(n+1) i = 1 yield items[:] while i < n: p[i] -= 1 j = (p[i] if i & 1 else 0) items[j], items[i] = items[i], items[j] yield items[:] i = 1 while p[i] == 0: p[i] = i i += 1 else: items = items[:] n = len(items) p = range(n+1) i = 1 yield items while i < n: p[i] -= 1 j = (p[i] if i & 1 else 0) items[j], items[i] = items[i], items[j] yield items i = 1 while p[i] == 0: p[i] = i i += 1 Xpermutations!(TyItem) xpermutations(TyItem)(TyItem[] items, bool copy=true) { return new Xpermutations!(TyItem)(items, copy); } class Xpermutations(TyItem) { int[] range(int stop) { int[] result; if (stop > 0) { result.length = stop; for(int i = 0; i < stop; i++) result[i] = i; } return result; } TyItem[] items; bool copy; this(TyItem[] items, bool copy=true) { this.items = items.dup; this.copy = copy; } int opApply(int delegate(ref TyItem[]) dg) { int result; int n = items.length; TyItem aux; auto p = range(n + 1); int i = 1; if (copy) { // yield duplicated arrays (safer) auto items2 = items.dup; result = dg(items2); if (result) goto END; // yield items2 while (i < n) { p[i] -= 1; int j = (i & 1) ? p[i] : 0; aux = items[j]; items[j] = items[i]; items[i] = aux; // swap items2 = items.dup; result = dg(items2); if (result) goto END; // yield items2 i = 1; while (p[i] == 0) { p[i] = i; i++; } } } else { // yield just references (faster) result = dg(items); if (result) goto END; // yield items while (i < n) { p[i] -= 1; int j = (i & 1) ? p[i] : 0; aux = items[j]; items[j] = items[i]; items[i] = aux; // swap result = dg(items); if (result) goto END; // yield items i = 1; while (p[i] == 0) { p[i] = i; i++; } } } END: return result; } } (If you spot problems in that D code you are encouraged to say it, I am a newbie of D still). They do similar things, but I can't see how you can say that opApply of D "seems much cleaner and easier to understand" than Python yield :-)OTOH you can fake it quiet well with an delegate that stores most of it's state in an object.<Something quite similar to a working version of your code is the builtin Python function iter(), that given an iterable (like an array, an iterable object, etc) return a lazy iterable object that allows to scan its elements in a flexible way (I have created some similar functions with D). But iter() doesn't replace yield. Bye and thank you for the answers, bearophile
Aug 11 2007
Reply to bearophile,Oskar Linde:[dropped python code]Many times people have been posting improvements to the built in AA. The problem is proving that the improvements really are better for the general case. The hardest part has proven to be defining the "general case".<I understand. We may collect a large repository of code/snippets that use AAs, so they can be tuned against that repository. ----------------- BCS:D's opApply syntax IMO seems much cleaner and easier to understand.<I can't agree (but I know Python much more than D, so I am biased), yield of Python is very simple :-) Here are some examplesI know you can do those things with D too, this is an example of a translation from Python that uses yield to D, a lazy generator of all the permutations of a sequence, algorithm from Phillip Paul Fuchs, modified:[dropped D code](If you spot problems in that D code you are encouraged to say it, I am a newbie of D still).I don't see anything that is a problem (I don't follow the details though). I'll put a copy at the end that has some edits, mostly for code space and editablility.They do similar things, but I can't see how you can say that opApply of D "seems much cleaner and easier to understand" than Python yield :-)Maybe I under-spoke, the opApply is much more transparent. the system is doing, and hiding, a lot less complexity. IMO a system hiding to much complexity is not a good thing. Also, opApply is working within the existing constraints of the language.|Xpermutations!(TyItem) xpermutations(TyItem)(TyItem[] items, bool copy=true) |{ | if(copy) | return new Xpermutations!(TyItem, true)(items); | else | return new Xpermutations!(TyItem, false)(items); |} | |class Xpermutations(TyItem, bool copy) |{ | int[] range(int stop) | { | int[] result; | if (stop > 0) | { | result.length = stop; | for (int i = 0; i < stop; i++) | result[i] = i; | } | return result; | } | | TyItem[] items; | bool copy; | | this(TyItem[] items, bool copy=true) | { | this.items = items.dup; | this.copy = copy; | } | | int opApply(int delegate(ref TyItem[]) dg) | { | TyItem[] items2 = items; | | static if(copy) items2 = items.dup; | if (int result = dg2(items2)) return result; // yield items2 | | int n = items.length; | int i = 1; | auto p = range(n + 1); | TyItem aux; | | while (i < n) | { | p[i] -= 1; | int j = (i & 1) ? p[i] : 0; | aux = items[j]; | items[j] = items[i]; | items[i] = aux; // swap | | static if(copy) items2 = items.dup; | if (int result = dg2(items2)) return result; // yield items2 | | i = 1; | | while (p[i] == 0) | { | p[i] = i; | i++; | } | } | return 0; | } |}OTOH you can fake it quiet well with an delegate that stores most of it's state in an object.<Something quite similar to a working version of your code is the builtin Python function iter(), that given an iterable (like an array, an iterable object, etc) return a lazy iterable object that allows to scan its elements in a flexible way (I have created some similar functions with D). But iter() doesn't replace yield. Bye and thank you for the answers, bearophile
Aug 11 2007
Sorry for my answering delay. BCS:I'll put a copy at the end that has some edits, mostly for code space and editablility.<Thank you. I'm learning D, so I can comment your code; this is the first line: Xpermutations!(TyItem) xpermutations(TyItem)(TyItem[] items, bool copy=true) I think it has a problem, this templated class (that't the resulting object of the helper function) doesn't receive the boolean value: Xpermutations!(TyItem) And this line: if (int result = dg2(items2)) return result; May raise an error like this: modulename.d(linenumber): Error: '=' does not give a boolean resultMaybe I under-spoke, the opApply is much more transparent. the system is doing, and hiding, a lot less complexity. IMO a system hiding to much complexity is not a good thing.<I understand now. Because of their nature and usage, in dynamic languages like Python/Ruby it's normal to hide part of the complexity from the user, but you are right asserting that in lower level languages like C++/D it's probably better to show most complexities to the programmer, to avoid problems, reduce "mysteries", to improve the capability to optimize the code, etc. (Maybe a language like D can be designed to have some hi-level constructs to be used in the spots where you don't care of speed; beside other lower lever ones to be used where you need more speed and you need to exactly know what's happening under the cover. Other people may argue you can't fit both things in the same language). Thank you for your code version and your answers. ---------------------- In the meantime I have found some problems translating lazy recursive generators from Py=>D, like this one, that yields the combinations of the given items, taking k distinct elements (this kind of code may look slow). You may help me: def xcombinations(items, k): if k: for i in xrange(len(items) - k + 1): for cc in xcombinations(items[i+1:], k-1): yield [items[i]] + cc else: yield [] Usage example: xcombinations("abcd", 1) ==> "a" "b" "c" "d" xcombinations("abcd", 2) ==> "ab" "ac" "ad" "bc" "bd" "cd" xcombinations("abcd", 3) ==> "abc" "abd" "acd" "bcd" Note that: xrange(n) just yields int numbers from 0 to n-1. + among arrays equals to ~. array[a:] equals to: (a >= cast(long)array.length) ? new typeof(array)[0] : array[max(a,0) .. $]. Bear hugs, bearophile
Aug 16 2007
bearophile wrote:In the meantime I have found some problems translating lazy recursive generators from Py=>D, like this one, that yields the combinations of the given items, taking k distinct elements (this kind of code may look slow). You may help me: def xcombinations(items, k): if k: for i in xrange(len(items) - k + 1): for cc in xcombinations(items[i+1:], k-1): yield [items[i]] + cc else: yield [] Usage example: xcombinations("abcd", 1) ==> "a" "b" "c" "d" xcombinations("abcd", 2) ==> "ab" "ac" "ad" "bc" "bd" "cd" xcombinations("abcd", 3) ==> "abc" "abd" "acd" "bcd"As far as I can tell, the following should work. Maybe I am missing something fundamental, as this is just a straight forward implementation of the above. Note that yield(x) is: { auto tmp = x; if (auto r = dg(tmp)) return r; } (Might be a nice macro in the future.) Code: struct XRange { int n; int opApply(int delegate(inout int v) dg) { for (int i = 0; i < n; i++) if (auto r = dg(i)) return r; return 0; } } XRange xrange(int n) { XRange r; r.n = n; return r; } struct XCombinations(T) { T[] items; int k; int opApply(int delegate(inout T[] arr) dg) { if (k > 0) foreach (i; xrange(items.length - k + 1)) foreach(cc; xcombinations(items[i+1..$], k-1)) { auto tmp = items[i] ~ cc; // must be an lvalue if (auto r = dg(tmp)) return r; } else { T[] tmp = null; // need an lvalue return dg(tmp); } return 0; } } XCombinations!(T) xcombinations(T)(T[] items, int k) { XCombinations!(T) ret; ret.items = items; ret.k = k; return ret; } -- Oskar
Aug 16 2007
Reply to bearophile,Sorry for my answering delay. BCS:that is because it can't, not directly at least. This is because the argument is a runtime value and template parameter must be compile time values. However because bools can only have two values it is trivial to generate both versions and have the run time code select between them. OTOH your correct about the return type. this could be solved by using an abstract base class that uses the (TyItem) and a derived class that uses both args. Another solution would be to return a delegate to the opApply function: |int delegate(int delegate(ref TyItem[])) xpermutations(TyItem)(TyItem[] items, bool copy=true) |{ | if(copy) | return &(new Xpermutations!(TyItem, true)(items)).opApply; | else | return &(new Xpermutations!(TyItem, false)(items)).opApply; |}I'll put a copy at the end that has some edits, mostly for code space and editablility.<Thank you. I'm learning D, so I can comment your code; this is the first line: Xpermutations!(TyItem) xpermutations(TyItem)(TyItem[] items, bool copy=true) I think it has a problem, this templated class (that't the resulting object of the helper function) doesn't receive the boolean value: Xpermutations!(TyItem)And this line: if (int result = dg2(items2)) return result; May raise an error like this: modulename.d(linenumber): Error: '=' does not give a boolean resultit doesn't. assignment expressions cause that error, variable delectations don't(Maybe a language like D can be designed to have some hi-level constructs to be used in the spots where you don't care of speed; beside other lower lever ones to be used where you need more speed and you need to exactly know what's happening under the cover. Other people may argue you can't fit both things in the same language).high level". ff
Aug 16 2007