digitalmars.D.learn - Collect Statistics efficiently and easily
- Brett (13/13) Sep 16 2019 Many times I have to get statistical info which is simply compute
- Paul Backus (6/19) Sep 17 2019 You can use `std.algorithm.fold` to compute multiple results in a
- Brett (13/37) Sep 18 2019 That may work but I'm already iterating and doing it inside a
I often need to compute statistics on a data set that is either already generated or still being generated. The code is usually something like

    M = max(M, v);
    m = min(m, v);

but other statistics, such as the mean and standard deviation, may also be needed, sometimes on several data sets simultaneously. Is there a way to compute them in one line that is efficient, probably using ranges? I'd like to avoid looping through a data set multiple times, as that would be quite inefficient.
Sep 16 2019
On Tuesday, 17 September 2019 at 01:53:39 UTC, Brett wrote:
> I often need to compute statistics on a data set that is either already generated or still being generated. The code is usually something like M = max(M, v); m = min(m, v); but other statistics, such as the mean and standard deviation, may also be needed, sometimes on several data sets simultaneously. Is there a way to compute them in one line that is efficient, probably using ranges? I'd like to avoid looping through a data set multiple times, as that would be quite inefficient.

You can use `std.algorithm.fold` to compute multiple results in a single pass:

    auto stats = v.fold!(max, min);
    M = stats[0];
    m = stats[1];
Sep 17 2019
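For readers who want to try the `fold` suggestion above, here is a minimal, self-contained sketch. The helper name `maxAndMin` and the sample data are made up for illustration; only `fold`, `max`, and `min` come from Phobos. It assumes the input is non-empty, because without an explicit seed `fold` uses the first element as the starting value.

```d
import std.algorithm.comparison : max, min;
import std.algorithm.iteration : fold;
import std.stdio : writeln;

// Hypothetical helper, introduced only for this sketch.
// Returns a tuple (max, min) computed in a single pass over v.
auto maxAndMin(double[] v)
{
    // No seed is given, so each function starts from the first element;
    // an empty input would therefore be an error.
    return v.fold!(max, min);
}

void main()
{
    // Made-up sample data.
    auto v = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0];

    auto stats = maxAndMin(v);
    auto M = stats[0]; // largest value
    auto m = stats[1]; // smallest value

    writeln("max = ", M, ", min = ", m);
}
```

The same pattern should extend to further statistics by passing more functions to `fold`, as long as each one can be written as a step that combines an accumulator with the next element.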
On Tuesday, 17 September 2019 at 14:06:41 UTC, Paul Backus wrote:
> On Tuesday, 17 September 2019 at 01:53:39 UTC, Brett wrote:
> > I often need to compute statistics on a data set that is either already generated or still being generated. The code is usually something like M = max(M, v); m = min(m, v); but other statistics, such as the mean and standard deviation, may also be needed, sometimes on several data sets simultaneously. Is there a way to compute them in one line that is efficient, probably using ranges? I'd like to avoid looping through a data set multiple times, as that would be quite inefficient.
>
> You can use `std.algorithm.fold` to compute multiple results in a single pass: auto stats = v.fold!(max, min); M = stats[0]; m = stats[1];

That may work, but I'm already iterating and doing it inside a loop. What I'm specifically talking about is a way to abstract the computation of each statistic type. If I converted my algorithm into a range, then maybe I could do something similar to what you are suggesting, but I would still need more than min and max (such as avg, std, and others). It may be viable, but I'll have to think about it.

I tend to find myself writing the same abstract code to compute the same statistics quite often (sometimes it deals with a history and sometimes not; e.g., I might want to compute the average and keep the last 5, or the 5 largest).
Sep 18 2019
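One possible way to get the abstraction asked for above, without converting the existing loop into a range, is a small accumulator that is fed one value per iteration. The sketch below is not something proposed in the thread: `RunningStats` and its members are hypothetical names, it is not part of Phobos, and it uses Welford's online update for the mean and variance as a swapped-in technique. It does not cover the history cases (last 5, 5 largest).

```d
import std.algorithm.comparison : max, min;
import std.math : sqrt;
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): tracks count, min, max,
/// mean and variance one value at a time using Welford's update.
struct RunningStats
{
    size_t count;
    double minimum = double.infinity;
    double maximum = -double.infinity;
    double mean = 0.0;
    double m2 = 0.0; // sum of squared deviations from the current mean

    /// Feed one observation; meant to be called from inside an existing loop.
    void put(double x)
    {
        ++count;
        minimum = min(minimum, x);
        maximum = max(maximum, x);
        immutable delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    /// Population variance; 0 when no data has been seen.
    double variance() const
    {
        return count > 0 ? m2 / count : 0.0;
    }

    /// Population standard deviation.
    double stdDev() const
    {
        return sqrt(variance());
    }
}

void main()
{
    RunningStats a, b; // one accumulator per data set

    foreach (i; 0 .. 5)
    {
        a.put(i);     // feeds 0, 1, 2, 3, 4
        b.put(i * i); // feeds 0, 1, 4, 9, 16
    }

    writeln("a: min=", a.minimum, " max=", a.maximum,
            " mean=", a.mean, " std=", a.stdDev());
    writeln("b: min=", b.minimum, " max=", b.maximum,
            " mean=", b.mean, " std=", b.stdDev());
}
```

Each data set gets its own accumulator, so several sets can be handled in the same pass, and the loop body stays at one `put` call per set.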