digitalmars.D - phobos src level stats
- Bruce Carneal (53/53) Sep 22 2020 Below you'll find category line percentages and total line counts
- Bruce Carneal (16/28) Sep 22 2020 The empty line numbers seem a little high to me. I may have a
- DlangUser38 (7/21) Sep 22 2020 you can count empty lines using a sliding window of two token
- Bruce Carneal (4/26) Sep 22 2020 So, a way to stay in "token space" then. Don't see a problem
- Bruce Carneal (5/19) Sep 22 2020 Ah, the problem would be empty lines within docs or comments that
Below you'll find category line percentages and total line counts for the 20 biggest files in phobos. The "line" counts following the file names are of the dscanner/libdparse variety rather than the 'wc' variety.

On a 2.4 GHz Zen 1, libdparse managed all of phobos in a little under 1.5 seconds. For comparison, note that all files were read in a little under 10 milliseconds (from file cache). I really enjoyed my first interaction with libdparse, but I'm guessing that the maintainers there strongly favor clarity/correctness over speed.

One other note: compiling with dub's --combined option cut the execution time of the ldc2/release exe by about 2X.

total bytes 10918340
empty 18% comments 9% docs 17% utst 32% src 24%
range/package.d, 9610
empty 18% comments 3% docs 12% utst 54% src 13%
datetime/systime.d, 9351
empty 18% comments 3% docs 16% utst 52% src 11%
datetime/date.d, 8496
empty 12% comments 10% docs 19% utst 18% src 41%
uni/package.d, 8335
empty 21% comments 3% docs 34% utst 37% src 6%
datetime/interval.d, 8215
empty 13% comments 13% docs 17% utst 24% src 35%
math.d, 7679
empty 16% comments 8% docs 10% utst 36% src 30%
format.d, 6920
empty 16% comments 7% docs 16% utst 42% src 19%
traits.d, 6868
empty 17% comments 11% docs 18% utst 35% src 20%
typecons.d, 6784
empty 16% comments 4% docs 19% utst 32% src 29%
string.d, 5442
empty 19% comments 9% docs 14% utst 35% src 23%
algorithm/iteration.d, 5235
empty 16% comments 11% docs 11% utst 39% src 23%
conv.d, 4939
empty 16% comments 7% docs 25% utst 22% src 30%
stdio.d, 4396
empty 14% comments 6% docs 43% utst 6% src 31%
net/curl.d, 4167
empty 16% comments 8% docs 21% utst 29% src 26%
algorithm/searching.d, 4074
empty 14% comments 9% docs 18% utst 22% src 36%
algorithm/sorting.d, 4073
empty 20% comments 4% docs 23% utst 25% src 28%
file.d, 4071
empty 17% comments 8% docs 16% utst 38% src 22%
array.d, 3601
empty 20% comments 9% docs 29% utst 12% src 30%
parallelism.d, 3594
empty 20% comments 4% docs 14% utst 43% src 19%
bitmanip.d, 3457
Sep 22 2020
On Tuesday, 22 September 2020 at 20:53:17 UTC, Bruce Carneal wrote:
> Below you'll find category line percentages and total line counts for the 20 biggest files in phobos. [...]

The empty line numbers seem a little high to me. I may have a bug in the code for that:

    import std.string : lineSplitter;

    ulong countEmptyLines(string rawText) @nogc nothrow pure @safe
    {
        ulong empties;
    lineLoop:
        foreach (line; lineSplitter(rawText))
        {
            // Any character other than a space or tab makes the line non-empty.
            foreach_reverse (ch; line)
                if (ch != ' ' && ch != '\t')
                    continue lineLoop;
            ++empties;
        }
        return empties;
    }
Sep 22 2020
On Tuesday, 22 September 2020 at 21:01:17 UTC, Bruce Carneal wrote:
> The empty line numbers seem a little high to me. I may have a bug in the code for that:
> [...]

You can count empty lines using a sliding window of two tokens over the token range. The gap between the line positions of the two tokens gives the empty lines between them. String literals and comments require special processing, but otherwise this is quite straightforward to implement.
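
A minimal sketch of the sliding-window idea, for illustration only. It does not use libdparse's actual token type: `Tok`, `firstLine`, `lastLine`, and `emptyLinesBetween` are hypothetical names, and the sketch assumes comments and multi-line string literals have already been turned into records that carry their full line span (the "special processing" mentioned above). Empty lines before the first token or after the last one are not counted.

```d
/// Hypothetical token record; only the line span matters for this sketch.
struct Tok
{
    size_t firstLine; // 1-based line on which the token starts
    size_t lastLine;  // 1-based line on which the token ends
}

/// Sums the lines that fall strictly between each pair of adjacent tokens.
/// Assumes `toks` is ordered by position in the source.
size_t emptyLinesBetween(const(Tok)[] toks) @nogc nothrow pure @safe
{
    size_t empties;
    foreach (i; 1 .. toks.length)
    {
        immutable prevEnd = toks[i - 1].lastLine;
        immutable nextStart = toks[i].firstLine;
        // Adjacent or same-line tokens leave no gap.
        if (nextStart > prevEnd + 1)
            empties += nextStart - prevEnd - 1;
    }
    return empties;
}
```

For example, tokens on lines 1, 2, 5 through 7, and 9 leave lines 3, 4, and 8 without any token, so `emptyLinesBetween([Tok(1, 1), Tok(2, 2), Tok(5, 7), Tok(9, 9)])` returns 3.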
Sep 22 2020
On Tuesday, 22 September 2020 at 23:22:42 UTC, DlangUser38 wrote:
> On Tuesday, 22 September 2020 at 21:01:17 UTC, Bruce Carneal wrote:
>> The empty line numbers seem a little high to me. I may have a bug in the code for that:
>> [...]
>
> You can count empty lines using a sliding window of two tokens over the token range. [...]

So, a way to stay in "token space" then. Don't see a problem with the above but will note for future apps that I need not drop back to raw text.
Sep 22 2020
On Wednesday, 23 September 2020 at 00:48:16 UTC, Bruce Carneal wrote:
> On Tuesday, 22 September 2020 at 23:22:42 UTC, DlangUser38 wrote:
>> [...]
>
> So, a way to stay in "token space" then. Don't see a problem with the above but will note for future apps that I need not drop back to raw text.

Ah, the problem would be empty lines within docs or comments that are counted when they've already been accounted for in the 'docs' and 'comments' sections. I'll revise the code.
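
The revised code was not posted in the thread. One possible shape for the fix, as a sketch under stated assumptions: the caller supplies the line ranges already attributed to docs and comments, sorted and non-overlapping, and blank lines inside those ranges are skipped. `LineSpan`, `isBlank`, and `countEmptyLinesOutside` are hypothetical names introduced here, not part of libdparse or the author's program.

```d
import std.string : lineSplitter;

/// Hypothetical 1-based, inclusive line range covered by a comment or doc block.
struct LineSpan
{
    size_t first;
    size_t last;
}

/// True when the line holds nothing but spaces and tabs.
bool isBlank(const(char)[] line) @nogc nothrow pure @safe
{
    foreach (ch; line)
        if (ch != ' ' && ch != '\t')
            return false;
    return true;
}

/// Counts blank lines that are not inside any of the given spans.
/// Assumes `spans` is sorted by line and non-overlapping.
size_t countEmptyLinesOutside(string rawText, const(LineSpan)[] spans)
{
    size_t empties;
    size_t lineNo; // 1-based number of the current line
    size_t s;      // index of the first span not yet passed
    foreach (line; lineSplitter(rawText))
    {
        ++lineNo;
        // Drop spans that end before the current line.
        while (s < spans.length && spans[s].last < lineNo)
            ++s;
        immutable inside = s < spans.length && spans[s].first <= lineNo;
        if (!inside && isBlank(line))
            ++empties;
    }
    return empties;
}
```

With the text `"a\n\n/**\n\n*/\n\nb"` and a single span covering lines 3 to 5, the blank line 4 belongs to the doc comment and is skipped, so the function returns 2 where a raw-text count would report 3.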
Sep 22 2020