digitalmars.D - Code layout for range-intensive D code
- bearophile (86/86) Jun 09 2012 The introduction of UFCS in D offers new ways to format D code,
- Denis Shelomovskij (11/13) Jun 09 2012 I have to mention that one shouldn't write range-intensive D code for
- Nick Sabalausky (4/29) Jun 10 2012 [...]
- Dmitry Olshansky (9/19) Jun 10 2012 //hopefully I'm not alone in that this:
- ixid (4/4) Jun 10 2012 Having to append .array() all the time is rather annoying. I
- Dmitry Olshansky (5/9) Jun 10 2012 And then try to get lazy version back. Now that would be harder then
- Peter Alexander (10/14) Jun 10 2012 Problem with eager versions is that you end up doing multiple
- Jonathan M Davis (21/25) Jun 10 2012 Eager versions are only good if you don't want to pass the result to ano...
- ixid (4/4) Jun 10 2012 Thank you. May I ask though, is the argument against
- Jonathan M Davis (26/30) Jun 10 2012 D doesn't do much of anything like that automatically, Aside from the fa...
- ixid (3/3) Jun 10 2012 I must say that this is what I love about the D community-
- Philippe Sigaud (6/9) Jun 11 2012 auto answers = questions.filter!(isBasic)
- Jonathan M Davis (8/33) Jun 10 2012 "D doesn't do much of anything like that automatically_._ Aside from the...
- bearophile (12/16) Jun 11 2012 In most cases having a lazy range is the right default. On the
The introduction of UFCS in D offers new ways to format D code, especially when your code uses many higher-order functions. What is a good layout for D code in such situations? I have tried several alternative layouts, and in the end I have come to appreciate a layout similar to the one used in [...]. This is something of an extreme example :-)

A textual matrix of bits like this is the input of a little nonogram puzzle:

    0 1 1 1 1 0
    1 0 0 1 1 1
    1 0 1 1 1 1
    1 1 1 1 1 1
    0 1 1 1 1 0

A program has to produce an output like this. In the first part of the output it looks at the columns and counts the lengths of the groups of "1", and in the second part it does the same for the rows:

    3
    1 2
    1 3
    5
    5
    3

    4
    1 3
    1 4
    6
    4

This is a possible solution program:

    import std.stdio, std.algorithm, std.string, std.range, std.conv;

    void main() {
        auto t = "table.txt"
                 .File()
                 .byLine()
                 .map!(r => r.removechars("^01".dup))()
                 .array();

        const transposed = t[0]
                           .length
                           .iota()
                           .map!(i => t.transversal(i).array())()
                           .array();

        (t ~ [(char[]).init] ~ transposed)
        .map!(r => r
                   .group()
                   .filter!(p => p[0] == '1')()
                   .map!(p => p[1].text())()
                   .join(" ")
             )()
        .join("\n")
        .writeln();
    }

(Note: the second argument of removechars is "^01".dup because removechars is a bit stupid: it requires both arguments to have the same type, and the 'r' given by byLine() is a char[]. Here the code performs the string->char[] conversion many times because the typical inputs for this program are small; avoiding it would be a premature optimization.)

As you see, you have to break the lines, because the processing chains often become too long for single lines. At first I put the dots at the end of the lines, but later I found that putting the dots at the start is better: it reminds me that we are still inside a processing chain. Putting a single operation on each line (instead of two or three) helps readability, while still allowing a bit of single-line nesting like in ".map!(i => t.transversal(i).array())()".

Aligning the dots and the first part vertically helps the eye find which chain we are in. In the last part of the program you see a nested chain too, inside a map.

[...] to use this kind of layout for this kind of higher-order-function-heavy code. I have found that breaking the chains and giving variable names to the intermediate parts of those processing chains doesn't help readability much, and the names for those intermediate temporary variables tend to be dull and repetitive. On the other hand, putting just one processing step on each row leaves space for a short comment on each row, where you think you need it.

Bye,
bearophile
Jun 09 2012
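The last point of the post above (one processing step per row leaves room for a short comment on each row) can be illustrated with a minimal sketch. It is not the author's code: `runLengths` is a hypothetical helper name, the input is an in-memory string rather than "table.txt", and a plain `filter` stands in for `removechars`, which may not be available in current Phobos releases.

```d
import std.algorithm : filter, group, map;
import std.array : join;
import std.conv : text;
import std.stdio : writeln;

// Hypothetical helper (not from the thread): lengths of the runs of '1'
// in one row of the bit matrix, joined with spaces.
string runLengths(string row)
{
    return row
        .filter!(c => c == '0' || c == '1') // drop separators such as spaces
        .group                              // (value, count) pair for each run
        .filter!(p => p[0] == '1')          // keep only the runs of ones
        .map!(p => p[1].text)               // run length as text
        .join(" ");                         // lengths separated by spaces
}

void main()
{
    writeln(runLengths("1 0 0 1 1 1")); // prints "1 3"
}
```

Each step stays on its own line, so a comment can sit next to exactly the operation it describes.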
09.06.2012 14:43, bearophile wrote:
> The introduction of UFCS in D offers new ways to format D code, especially when your code uses many high order functions.

I have to mention that one shouldn't write range-intensive D code for now. Using high-level functions in D is too risky because it can result in any kind of memory corruption, at least because of "Issue 7965 - Invalid outer function scope pointer in some cases":
http://d.puremagic.com/issues/show_bug.cgi?id=7965

I'm not sure about other such issues, but that one alone makes high-level functions almost unusable because of the danger.

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
Jun 09 2012
"bearophile" <bearophileHUGS lycos.com> wrote in message news:lkplokawtyisvwhvfsew forum.dlang.org...
> The introduction of UFCS in D offers new ways to format D code, especially when your code uses many high order functions. What is a good layout of the D code in such situations? I have tried several alternative layouts, and in the end I found to appreciate a layout similar to the one used in
> [...]
>     auto t = "table.txt"
>              .File()
>              .byLine()
>              .map!(r => r.removechars("^01".dup))()
>              .array();
>
>     const transposed = t[0]
>                        .length
>                        .iota()
>                        .map!(i => t.transversal(i).array())()
>                        .array();
>
>     (t ~ [(char[]).init] ~ transposed)
>     .map!(r => r
>                .group()
>                .filter!(p => p[0] == '1')()
>                .map!(p => p[1].text())()
>                .join(" ")
>          )()
>     .join("\n")
>     .writeln();
> }

That's basically what I do, except I always indent by exactly one tab.
Jun 10 2012
On 09.06.2012 14:43, bearophile wrote:
[snip]
> import std.stdio, std.algorithm, std.string, std.range, std.conv;
>
> void main() {
>     auto t = "table.txt"
>              .File()
>              .byLine()
>              .map!(r => r.removechars("^01".dup))()
>              .array();
>
>     const transposed = t[0]
>                        .length
>                        .iota()

//hopefully I'm not alone in that this:

    iota(t[0].length)
    .map!(...)
    ...

is so much more readable.

-- 
Dmitry Olshansky
Jun 10 2012
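The two spellings discussed in the post above are the same call written differently, which a small sketch can confirm. The helper names `doubledUfcs` and `doubledCall` are hypothetical and exist only for this illustration.

```d
import std.algorithm : equal, map;
import std.range : iota;

// Hypothetical helpers: the same chain written with both spellings.
auto doubledUfcs(size_t n)
{
    return n.iota.map!(i => i * 2);  // UFCS spelling: n.iota
}

auto doubledCall(size_t n)
{
    return iota(n).map!(i => i * 2); // conventional spelling: iota(n)
}

void main()
{
    // Both spellings produce the same elements.
    assert(equal(doubledUfcs(4), doubledCall(4)));
    assert(equal(doubledCall(4), [0, 2, 4, 6]));
}
```

Which spelling reads better at the head of a chain is the matter of taste under discussion; the compiler treats them identically.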
Having to append .array() all the time is rather annoying. I can't help but feel that there's a better solution than this. Are lazy Result methods really the default way of doing things? I'd rather have eager versions.
Jun 10 2012
On 11.06.2012 1:49, ixid wrote:
> Having to append .array() all the time is rather annoying. I can't help but feel that there's a better solution than this. Are lazy Result methods really the default way of doing things? I'd rather have eager versions.

And then try to get the lazy version back. Now that would be harder than appending .array(), wouldn't it? ;)

-- 
Dmitry Olshansky
Jun 10 2012
On Sunday, 10 June 2012 at 21:49:14 UTC, ixid wrote:
> Having to append .array() all the time is rather annoying. I can't help but feel that there's a better solution than this. Are lazy Result methods really the default way of doing things? I'd rather have eager versions.

The problem with eager versions is that you end up doing multiple memory allocations and passes when combining range manipulators, e.g.

    r.take(n).filter!(blah).map!(whatever).array()

only allocates once, and is only one linear pass over the memory. If they were all eager, 'take' would create a new array, then 'filter' would pass over that and create a new array, then 'map' would pass over that and create yet another array. Not very efficient.
Jun 10 2012
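The single-pass behaviour described in the post above can be observed with a call counter. This is a minimal sketch under stated assumptions: `lazyPipeline` is a hypothetical name, and the even-number predicate and squaring step are arbitrary stand-ins for `blah` and `whatever`.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : take;
import std.typecons : Tuple, tuple;

// Hypothetical helper: runs a take/filter/map chain and reports how many
// times the mapping step was actually invoked.
Tuple!(int[], size_t) lazyPipeline(int[] data, size_t n)
{
    size_t mapCalls = 0;
    auto result = data
        .take(n)                                 // at most n elements
        .filter!(x => x % 2 == 0)                // keep the even ones
        .map!((x) { ++mapCalls; return x * x; }) // square, counting calls
        .array;                                  // the only array built
    return tuple(result, mapCalls);
}

void main()
{
    auto r = lazyPipeline([1, 2, 3, 4, 5, 6, 7, 8], 6);
    assert(r[0] == [4, 16, 36]); // squares of 2, 4, 6
    assert(r[1] == 3);           // map ran only for elements that passed the filter
}
```

No intermediate arrays are created for the results of `take` or `filter`; only the final `.array` materializes anything.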
On Sunday, June 10, 2012 23:49:12 ixid wrote:
> Having to append .array() all the time is rather annoying. I can't help but feel that there's a better solution than this. Are lazy Result methods really the default way of doing things? I'd rather have eager versions.

Eager versions are only good if you don't want to pass the result to another function. And since Phobos (and plenty of user code) uses ranges all over the place, you'd end up allocating memory all over the place, whereas right now, you only allocate it when you explicitly call a function which allocates memory to hold the data - e.g. array. Something like

    foreach(e; map!"to!string(a)"(filter!"a <= 50"(arr)))
    {
        doSomething(e);
        if(someCondition)
            break;
    }

would, with eager versions, end up allocating for both the result of filter and map, when it doesn't actually need to allocate for _either_ of them. Right now, it can process them lazily, only filtering and mapping for the elements that actually get processed. By using lazy ranges, you avoid both unnecessary allocations and having to process all of the elements in a range if you don't need to.

On top of all of that, if a function doesn't return a lazy range, you _can't_ use it with infinite ranges, so if range-based functions all eagerly operated on ranges, infinite ranges would be useless.

- Jonathan M Davis
Jun 10 2012
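The point about infinite ranges in the post above can be sketched as follows. The helper name `squaresOfMultiplesOfSeven` and the choice of `recurrence` as the infinite source are assumptions made for this illustration only.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : recurrence, take;

// Hypothetical helper: squares of the first `count` multiples of 7,
// drawn from an infinite source of natural numbers.
int[] squaresOfMultiplesOfSeven(size_t count)
{
    // 1, 2, 3, ... without end; elements are computed only on request.
    auto naturals = recurrence!((a, n) => a[n - 1] + 1)(1);

    return naturals
        .filter!(x => x % 7 == 0) // still infinite, still lazy
        .map!(x => x * x)         // still infinite, still lazy
        .take(count)              // bounds the chain
        .array;                   // materializes only `count` elements
}

void main()
{
    assert(squaresOfMultiplesOfSeven(3) == [49, 196, 441]);
}
```

If `filter` or `map` were eager, the chain would never get as far as `take`, because each step would try to consume the whole infinite input first.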
Thank you. May I ask though, is the argument against automatically appending .array() when a single or chain of lazy functions are used to set a variable or set of variables just syntactic salt against accidentally doing it eagerly?
Jun 10 2012
On Monday, June 11, 2012 02:11:24 ixid wrote:
> Thank you. May I ask though, is the argument against automatically appending .array() when a single or chain of lazy functions are used to set a variable or set of variables just syntactic salt against accidentally doing it eagerly?

D doesn't do much of anything like that automatically, Aside from the fact tha foreach supports the ranges, _nothing_ in the compiler supports them. They're entirely a library artifact. So, the compiler isn't going to do _anything_ to them automatically.

And besides, it's not necessarily the case that you want the result of a range-based function to be an array. What if I want to keep the result of map on the stack?

    auto a = map!"to!string(a)"(arr);

I could pass it to multiple functions later without having to allocate anything. Forcing it to be an array would stop that. And it would be downright nasty for that to happen when dealing with containers, because a number of containers require that you pass them the _exact_ range type that you give them (e.g. remove does this). Converting those ranges to arrays just because you assigned them to a variable would be a _big_ problem.

And as for infinite ranges again, if they automatically were converted to arrays when assigned to variables, then you couldn't have variables for them at all, because they _have_ to be eager. Take a random number generator for instance. If all ranges were automatically converted to arrays when assigned to variables, then you couldn't do

    auto generator = rndGen();

You'd be stuck in an infinite loop if you tried.

Having to use std.array.array to explicitly convert a range to an array really doesn't cost you much, and trying to do it automatically would be incredibly error-prone and costly. And really, the more that you use range-based functions, the less that you need to convert ranges to arrays.

- Jonathan M Davis
Jun 10 2012
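Keeping a lazy range in a variable and handing it to several consumers, as the post above describes, might look like the following minimal sketch. The `Stats` struct and `squareStats` function are hypothetical, and a numeric mapping is used instead of `to!string` so that the mapped elements themselves need no allocation.

```d
import std.algorithm : map, maxElement, sum;

// Hypothetical result type for this illustration.
struct Stats
{
    int total;
    int largest;
}

// Hypothetical helper: assumes `arr` is non-empty, since maxElement
// requires at least one element.
Stats squareStats(int[] arr)
{
    // `squares` is a small struct describing the computation;
    // no array of squares is built.
    auto squares = arr.map!(x => x * x);

    // The same variable is passed to two consumers. Each call receives
    // its own copy of the range, so the second call starts from the beginning.
    return Stats(squares.sum, squares.maxElement);
}

void main()
{
    auto s = squareStats([1, 2, 3]);
    assert(s.total == 14);  // 1 + 4 + 9
    assert(s.largest == 9);
}
```

If assignment to `squares` silently produced an array, the code would still work here, but it would pay for an allocation that neither consumer needs.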
I must say that this is what I love about the D community- getting a very thorough answer to a rather basic and newbie question from someone who knows what they're talking about.
Jun 10 2012
On Mon, Jun 11, 2012 at 2:36 AM, ixid <nuaccount gmail.com> wrote:
> I must say that this is what I love about the D community- getting a very thorough answer to a rather basic and newbie question from someone who knows what they're talking about.

    auto answers = questions.filter!(isBasic)
                            .filter!(fromNewbie)
                            .map!(jonathanMDavis)
                            .array();

Hey, you can even get lazy answers :)
Jun 11 2012
On Sunday, June 10, 2012 17:31:37 Jonathan M Davis wrote:
> On Monday, June 11, 2012 02:11:24 ixid wrote:
> > Thank you. May I ask though, is the argument against automatically appending .array() when a single or chain of lazy functions are used to set a variable or set of variables just syntactic salt against accidentally doing it eagerly?
>
> D doesn't do much of anything like that automatically, Aside from the fact tha foreach supports the ranges, _nothing_ in the compiler supports them.

Ouch, I obviously need to do a better job of re-reading my posts.

"D doesn't do much of anything like that automatically_._ Aside from the fact _that_ foreach supports ranges, nothing in the compiler supports them."

> They're entirely a library artifact. So, the compiler isn't going to do _anything_ to them automatically. And besides, it's not necessarily the case that you want the result of a range-based function to be an array. What if I want to keep the result of map on the stack?
>
>     auto a = map!"to!string(a)"(arr);
>
> I could pass it to multiple functions later without having to allocate anything. Forcing it to be an array would stop that. And it would be downright nasty for that to happen when dealing with containers, because a number of containers require that you pass them the _exact_ range type that you give them (e.g. remove does this).

"a number of containers require that you pass them the exact range type that _they_ give _you_..."

> Converting those ranges to arrays just because you assigned them to a variable would be a _big_ problem. And as for infinite ranges again, if they automatically were converted to arrays when assigned to variables, then you couldn't have variables for them at all, because they _have_ to be eager.

"because they have to be _lazy_."

- Jonathan M Davis
Jun 10 2012
ixid:
> Having to append .array() all the time is rather annoying. I can't help but feel that there's a better solution than this. Are lazy Result methods really the default way of doing things? I'd rather have eager versions.

In most cases having a lazy range is the right default. On the other hand, in D there are many times when you can't use lazy ranges and need arrays, so the ".array()" is quite common after a map or filter. So I propose adding to Phobos two small functions named amap/afilter that produce arrays:
http://d.puremagic.com/issues/show_bug.cgi?id=5756

If you think those are redundant, then first of all take a look here:
http://d.puremagic.com/issues/show_bug.cgi?id=8155

Bye,
bearophile
Jun 11 2012
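For readers wondering what the amap/afilter proposal in the post above amounts to, here is a minimal user-level sketch. These two functions are hypothetical and are not part of Phobos; the signatures are assumptions for illustration, and the linked enhancement requests may specify something different.

```d
import std.algorithm : filter, map;
import std.array : array;

// Hypothetical eager wrappers in the spirit of the proposal (not Phobos API).
auto amap(alias fun, R)(R r)
{
    return r.map!fun.array;     // lazy map, then materialize
}

auto afilter(alias pred, R)(R r)
{
    return r.filter!pred.array; // lazy filter, then materialize
}

void main()
{
    assert([1, 2, 3].amap!(x => x * 10) == [10, 20, 30]);
    assert([1, 2, 3, 4].afilter!(x => x % 2 == 0) == [2, 4]);
}
```

Wrappers like these save the trailing `.array` for a single step, but chaining them reintroduces the intermediate arrays that the lazy versions avoid.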