digitalmars.D - Is phobos too fluffy?

Andrei Alexandrescu (22/22) Sep 17 2020 As wc -l counts, phobos has some 330 KLOC:

Andrei Alexandrescu (4/5) Sep 17 2020 This should be:
H. S. Teoh (45/76) Sep 17 2020 IMO, it depends. Code formatted into logically-grouped paragraphs is

jmh530 (7/14) Sep 17 2020 I.e., does having extra empty lines have an impact on the time it

Andrei Alexandrescu (3/20) Sep 17 2020 This is exclusively about readability. Too much fluff means too little

jmh530 (8/11) Sep 17 2020 When I looked at some of those examples above, I agree that some

Andrei Alexandrescu (10/19) Sep 17 2020 I was all in favor of that, until I had to refactor a dozen two-liners

H. S. Teoh (40/58) Sep 17 2020 Yeah, when it's just 2 lines like these, a blank line is just an eyesore

Paul Backus (14/25) Sep 17 2020 This seems totally backwards to me. Why implement something as a

H. S. Teoh (29/50) Sep 17 2020 [...]

jmh530 (7/15) Sep 17 2020 [snip]
Abdulhaq (17/31) Sep 21 2020 A general comment on this thread. Each of us has his preferred

Jonathan M Davis (30/42) Sep 17 2020 I do the same. Similarly, I usually put blank lines after declarations,

Seb (2/4) Sep 17 2020 No impact at all. Not even for lexing.
H. S. Teoh (9/15) Sep 17 2020 Lexing/parsing is the fastest part of the compilation process. The

Andrei Alexandrescu (40/61) Sep 17 2020 Only everywhere I look. Just opened std.algorithm.iteration but really
Timon Gehr (3/8) Sep 17 2020 (The discussed aspects of the Phobos code style make it annoying to
Jacob Carlborg (42/51) Sep 18 2020 I hate that style. But I wouldn't mind if D supported expression body

Seb (13/39) Sep 17 2020 We shouldn't have these discussions.

bachmeier (3/14) Sep 17 2020 What would be nice is if we didn't have Github shouting at us
Andrei Alexandrescu (3/50) Sep 17 2020 Great point. We have uncrustify working passably well at Symmetry,
Paolo Invernizzi (2/16) Sep 18 2020 I think I hear Kenji chuckling in the distance ...

Arun (12/38) Sep 17 2020 Good that someone from the core team pays attention to these
Andrej Mitrovic (5/6) Sep 17 2020 I'm sorry but the real problem is the mammoth modules. There are

Martin (11/17) Sep 19 2020 I totally agree. It should be possible to have N files in one

Martin (4/6) Sep 19 2020 OMG! No way! I just realized that there is a package called

Imperatorn (5/9) Sep 17 2020 Not to criticize, but, if it has reached 330 KLOC there might be

mw (4/7) Sep 18 2020 Sometimes for non-trivial program, `unittest` can easily be

Imperatorn (3/11) Sep 18 2020 Sure, but wouldn't you separate the code and tests in different

H. S. Teoh (20/32) Sep 18 2020 Why would you? Keeping tests and code side-by-side makes it more likely
Jonathan M Davis (17/29) Sep 18 2020 No. It's very common practice in D code to put the tests immediately aft...

Imperatorn (6/23) Sep 19 2020 "It's very common practice"

Jonathan M Davis (6/33) Sep 19 2020 I specifically said that it was very common practice _in D code_. You

Imperatorn (5/23) Sep 19 2020 Sure, I get why you would want to structure it like that. Just
Patrick Schluter (5/22) Sep 19 2020 I introduced it in the C project at work thanks to the magic of

Russel Winder (13/20) Sep 19 2020 Rust has unit tests in the source modules, just as D does. It works very...

Imperatorn (4/19) Sep 19 2020 I get the point in doing so. What I mean is that explains why

Russel Winder (14/17) Sep 19 2020 I suspect getting data before coming to a view is probably the wise thin...

Imperatorn (3/13) Sep 19 2020 What I meant is maybe it's not top priority

Russel Winder (12/14) Sep 19 2020 I can agree that that may well be the case. Though the unit test code ma...

Imperatorn (3/11) Sep 20 2020 True

Andrei Alexandrescu (3/21) Sep 19 2020 I'm bothered by the prevalence of empty lines in source code, not so

Andrei Alexandrescu (27/51) Sep 19 2020 I just read that Python's doctest endorses the same:

Imperatorn (12/52) Sep 19 2020 Thanks for a good reply!

data pulverizer (6/16) Sep 19 2020 At the risk of starting WW3, I'd like to propose using 2 space

Andrei Alexandrescu (3/21) Sep 19 2020 Jesting aside, that would actually increase average complexity of the

data pulverizer (15/21) Sep 19 2020 I guess you could set a line width guide to make the code more

Uknown (14/18) Sep 19 2020 the linux kernel has a rule: all code has to be indented with

Andrei Alexandrescu (5/22) Sep 19 2020 Yes indeed! I do mention that in a couple of talks. Their indent size is...

Daniel N (13/17) Sep 19 2020 They recently switched to 100:

Patrick Schluter (3/14) Sep 19 2020 Yeah, 2 wide and no tabs. The only right rule. ;-)

wjoe (5/21) Sep 19 2020 I think the only right rule is to use a code formatting tool and

matheus (4/8) Sep 19 2020 May I ask how this works when comparing differences between files

wjoe (8/16) Sep 22 2020 Shouldn't be too hard to come up with a script that can

wjoe (3/6) Sep 19 2020 You could have configured your tabs to width of 2 :)

Paul Backus (15/35) Sep 20 2020 Just for fun, I decided to run these calculations on sumtype,

Adam D. Ruppe (12/18) Sep 20 2020 omg ima do it too:

Andrei Alexandrescu (3/24) Sep 20 2020 I think you'd need to adjust for tabs - you need to multiply tabs by 4.

Adam D. Ruppe (13/15) Sep 20 2020 Oh yeah, that'd pretty well account for the difference.

mate (3/6) Sep 20 2020 Interesting. How long are your longest lines?

Adam D. Ruppe (23/24) Sep 20 2020 $ cat $(git ls-files '*.d') | awk '{print length}' | sort -rn |

mate (3/27) Sep 20 2020 Thanks. So it does look like you have an implicit soft line

Adam D. Ruppe (32/34) Sep 20 2020 Well, it isn't so much a limit as just a lack of demand or some

Andrei Alexandrescu (67/76) Sep 20 2020 Ah, interesting test. I just ran this modified script against phobos:

mate (3/8) Sep 21 2020 stdx.allocator seems overrepresented doesn't it?

DlangUser38 (2/12) Sep 21 2020 nice shot.

mate (2/15) Sep 21 2020 Hey that was not my intention!

DlangUser38 (3/19) Sep 21 2020 NVM The effect is the same. stdx.allocator is mostly written by

Andrei Alexandrescu (4/23) Sep 21 2020 Haha, nice. Actually quite a few people worked on the allocator,

Steven Schveighoffer (8/50) Sep 20 2020 I put blank lines everywhere. I need the fluff for it to look

Andrei Alexandrescu (4/5) Sep 20 2020 Not to worry, there's no action item here (save for "we should set up an...

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

As wc -l counts, phobos has some 330 KLOC:

$ wc -l $(git ls-files '*.d') | tail -1
   331378 total

I noticed many contributors are fond of inserting empty lines 
discretionarily, sometimes even in the middle of 2-5 line functions, or 
right after opening an "if" statement. The total number of empty lines:

$ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

So Phobos has 11.62% empty lines in it, on average one on every 9 lines 
of code. I find that a bit excessive, particularly given that our coding 
convention uses brace-on-its-own line, which already adds a lot of 
vertical space. Here's the number of lines consisting of only one brace:

git grep '^ *[{}] *$' **/*.d | wc -l
    53126

That's 16% of the total. Combined with empty lines, we're looking at a 
27.65% fluff factor. Isn't that quite a bit, even considering that 
documentation requires empty lines for paragraphs etc?
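
The three counts above can be folded into one pass. This is only a minimal sketch, assuming a POSIX shell with awk; `fluff_report` is a hypothetical helper name, not something that exists in the Phobos tooling. It counts lines by awk records rather than with `wc -l`, so a file missing its final newline can shift the totals by one.

```shell
#!/bin/sh
# fluff_report FILE...: print the total, empty, and brace-only line counts
# for the given files, plus the combined "fluff" percentage.
# (Hypothetical helper; mirrors the grep patterns used in this post.)
fluff_report() {
    cat -- "$@" | awk '
        /^$/         { empty++ }    # truly empty lines, as with grep "^$"
        /^ *[{}] *$/ { brace++ }    # lines consisting of a single brace
        END {
            if (NR == 0) { print "no input"; exit 1 }
            printf "total=%d empty=%d brace=%d fluff=%.2f%%\n",
                   NR, empty, brace, 100 * (empty + brace) / NR
        }'
}
```

Run from the root of a checkout, the call would be `fluff_report $(git ls-files '*.d')`. With the numbers quoted here, (38503 + 53126) / 331378 comes to 27.65%.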

Today's monitors favor width over height and I didn't yet get to the 
point of rotating my monitor for coding purposes. (It's also curved, 
which would make it awkward.) Would it be reasonable to curb a bit on 
the fluff factor? E.g. there should never be two consecutive empty 
lines, and code blocks shorter than x lines should have no newlines inside.

Sep 17 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/17/20 11:51 AM, Andrei Alexandrescu wrote:
 git grep '^ *[{}] *$' **/*.d | wc -l

This should be:

git grep '^ *[{}] *$' $(git ls-files '*.d') | wc -l

(The **/*.d syntax works only with zsh.)

Sep 17 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Sep 17, 2020 at 11:51:18AM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 As wc -l counts, phobos has some 330 KLOC:
 
 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total
 
 I noticed many contributors are fond of inserting empty lines
 discretionarily, sometimes even in the middle of 2-5 line functions,
 or right after opening an "if" statement.

Do you have a concrete example of this?


 The total number of empty lines:
 
 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503
 
 So Phobos has 11.62% empty lines in it, on average one on every 9
 lines of code. I find that a bit excessive, particularly given that
 our coding convention uses brace-on-its-own line, which already adds a
 lot of vertical space. Here's the number of lines consisting of only
 one brace:
 
 git grep '^ *[{}] *$' **/*.d | wc -l
    53126
 
 That's 16% of the total. Combined with empty lines, we're looking at a
 27.65% fluff factor. Isn't that quite a bit, even considering that
 documentation requires empty lines for paragraphs etc?

IMO, it depends. Code formatted into logically-grouped paragraphs is
easier to read, esp. to quickly scan to get a quick overview of what the
function does. If it's all smushed into a solid 10-line block, it would
be much harder to read.  But I don't think there's a fixed number of
lines where the cutoff is -- it depends on what the code is trying to
do, and so has to be judged on a case-by-case basis.

OTOH, if there's a bunch of 1-line functions, formatting it in Phobos
style with braces on separate lines would occupy 4-5 lines per function,
which is rather excessive:

	// Too much whitespace
	struct MyRange(E)
	{
	    @property bool empty()
	    {
	    	return _isEmpty;
	    }

	    @property E front()
	    {
	    	return _front;
	    }

	    void popFront()
	    {
	    	r.popFront;
	    }
	}

as opposed to:

	// Better
	struct MyRange(E)
	{
	    @property bool empty() { return _isEmpty; }
	    @property E front() { return _front; }
	    void popFront() { r.popFront; }
	}



 Today's monitors favor width over height and I didn't yet get to the
 point of rotating my monitor for coding purposes. (It's also curved,
 which would make it awkward.) Would it be reasonable to curb a bit on
 the fluff factor?

Sure, but why are we fussing over formatting, when there are bigger fish
to fry?  Like code quality issues: pushing sig constraints into static
ifs when they are more suitable, merging unnecessary overloads, fixing
documentation issues, etc..  IMO, removing empty lines is last on that
long list of action items.


 E.g. there should never be two consecutive empty lines, and code
 blocks shorter than x lines should have no newlines inside.

You mean no empty lines?  Otherwise we'd have to pack multiple
statements onto 1 line, which I don't think is what you want. ;-)


T

-- 
Don't drink and derive. Alcohol and algebra don't mix.

Sep 17 2020

jmh530 <john.michael.hall gmail.com> writes:

On Thursday, 17 September 2020 at 16:34:33 UTC, H. S. Teoh wrote:
 [snip]

 Sure, but why are we fussing over formatting, when there are 
 bigger fish to fry?  Like code quality issues: pushing sig 
 constraints into static ifs when they are more suitable, 
 merging unnecessary overloads, fixing documentation issues, 
 etc..  IMO, removing empty lines is last on that long list of 
 action items.


I.e., does having extra empty lines have an impact on the time it 
takes to compile phobos?

I don't really have a good sense of whether the time spent 
compiling is more related to number of lines or number of 
characters. An empty line with no characters or one character may 
not take long to parse.

Sep 17 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 9/17/20 1:06 PM, jmh530 wrote:
 On Thursday, 17 September 2020 at 16:34:33 UTC, H. S. Teoh wrote:
 [snip]

 Sure, but why are we fussing over formatting, when there are bigger 
 fish to fry?  Like code quality issues: pushing sig constraints into 
 static ifs when they are more suitable, merging unnecessary overloads, 
 fixing documentation issues, etc..  IMO, removing empty lines is last 
 on that long list of action items.

 
 
 I.e., does having extra empty lines have an impact on the time it takes 
 to compile phobos?
 
 I don't really have a good sense of whether the time spent compiling is 
 more related to number of lines or number of characters. An empty line 
 with no characters or one character may not take long to parse.

This is exclusively about readability. Too much fluff means too little 
context in front of you.

Sep 17 2020

jmh530 <john.michael.hall gmail.com> writes:

On Thursday, 17 September 2020 at 17:16:42 UTC, Andrei 
Alexandrescu wrote:
 [snip]

 This is exclusively about readability. Too much fluff means too 
 little context in front of you.

When I looked at some of those examples above, I agree that some 
are extraneous (which should get flagged by the autotester)...but 
I almost always put an empty line after imports. I find that 
helps with readability and even maintainability in short 
functions/scopes in that I don't need to add them back after 
adding more lines and have more trouble finding them.

Sep 17 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 9/17/20 1:49 PM, jmh530 wrote:
 On Thursday, 17 September 2020 at 17:16:42 UTC, Andrei Alexandrescu wrote:
 [snip]

 This is exclusively about readability. Too much fluff means too little 
 context in front of you.

 
 When I looked at some of those examples above, I agree that some are 
 extraneous (which should get flagged by the autotester)...but I almost 
 always put an empty line after imports.

I was all in favor of that, until I had to refactor a dozen two-liners 
consisting of... let me paste some code:

auto put(ref GGPlotD gg, GeomBar def) {
     import ggplot.backend.ggplot.geom_bar : geomBar;

     return gg.put(geomBar(def));
}

And so it goes for pages.

There is something to be said about a guideline versus a mindless dogma. 
Things like enum strings, each, and autodecoding are more of the latter.

Sep 17 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Sep 17, 2020 at 02:15:32PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 9/17/20 1:49 PM, jmh530 wrote:

[...]
 When I looked at some of those examples above, I agree that some are
 extraneous (which should get flagged by the autotester)...but I
 almost always put an empty line after imports.

 
 I was all in favor of that, until I had to refactor a dozen two-liners
 consisting of... let me paste some code:
 
 auto put(ref GGPlotD gg, GeomBar def) {
     import ggplot.backend.ggplot.geom_bar : geomBar;
 
     return gg.put(geomBar(def));
 }

Yeah, when it's just 2 lines like these, a blank line is just an eyesore
that serves no purpose.  I say we kill 'em.


 And so it goes for pages.
 
 There is something to be said about a guideline versus a mindless
 dogma.  Things like enum strings, each, and autodecoding are more of
 the latter.

I'm on the fence about `each`.  I totally agree that the implementation
is a bit overblown -- do we *really* need to support static arrays,
opApply iterables, iteration indices, and all of that jazz?  Whenever we
end up with a combinatorial explosion of (sig constraint helper
templates, or any kind of code construct, really), it tells me that we
have failed to separate different concerns properly. Instead of writing
O(2^N) cases explicitly, at the most it ought to be O(N) cases, or,
better yet, O(1) (i.e., push tangential concerns out of the
implementation to the caller (or someone else)).

OTOH, I see why it might sometimes be a handy thing to have: if you have
a long UFCS chain, it can be an annoyance to have to wrap the entire
block with a `foreach (x; a.b.c.d.e.f.g./*...*/.z)` instead of being
able to just tack on `.each!(...)` at the end.  However, I'd say in this
case .each really should only support ranges, since UFCS chains are the
primary use case, and leave the other stuff like static arrays (not even
a range by current definitions), opApply-iterables, and AA's alone.

Of course, once you take all of that other fluff out, the raison d'etre
of .each becomes a little shaky. :-P  It's probably pushing it, but I'm
tempted to suggest that perhaps this is one of those opportunities
Andrei was talking about to improve the language: what if the language
equates:

	iterable.foreach(i => { ... });

with

	foreach (i; iterable) { ... }

?  I.e., the former lowers to the latter.  This seems to be a natural
(logical?) next step in the spirit of UFCS, except applied to built-in
loop constructs.

Then there would be no need of .each, and UFCS functional-style code
would be nicely unified with the good ole traditional loops.  And it
doesn't even need further compiler complications, being merely a simple
syntactic rewrite.

(Or perhaps this is illogical and just wayyyy out there... let the
rotten fruit fly. :-D)


T

-- 
Lottery: tax on the stupid. -- Slashdotter

Sep 17 2020

Paul Backus <snarwin gmail.com> writes:

On Thursday, 17 September 2020 at 18:44:09 UTC, H. S. Teoh wrote:
 Of course, once you take all of that other fluff out, the 
 raison d'etre of .each becomes a little shaky. :-P  It's 
 probably pushing it, but I'm tempted to suggest that perhaps 
 this is one of those opportunities Andrei was talking about to 
 improve the language: what if the language equates:

 	iterable.foreach(i => { ... });

 with

 	foreach (i; iterable) { ... }

 ?  I.e., the former lowers to the latter.  This seems to be a 
 natural (logical?) next step in the spirit of UFCS, except 
 applied to built-in loop constructs.

This seems totally backwards to me. Why implement something as a 
language feature when we can already do it with library code?

Also, "foreach but it works in range pipelines" is absolutely a 
worthwhile reason on its own to keep .each around. I understand 
some people don't want "trivial" stuff in Phobos, but there is a 
real value to convenience. (Not to mention that even 
seemingly-trivial generic code can have edge-cases that take some 
work to get right.)

To give a negative example: C++'s <random> library makes 
essentially no concessions to convenience, and as a result is 
gratuitously annoying to use. This blog post by Martin Hořeňovský 
goes into some of the details:

https://codingnest.com/generating-random-numbers-using-c-standard-library-the-solutions/

Sep 17 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Sep 17, 2020 at 08:36:38PM +0000, Paul Backus via Digitalmars-d wrote:
 On Thursday, 17 September 2020 at 18:44:09 UTC, H. S. Teoh wrote:

[...]
 what if the language equates:
 
 	iterable.foreach(i => { ... });
 
 with
 
 	foreach (i; iterable) { ... }


[...]
 This seems totally backwards to me. Why implement something as a
 language feature when we can already do it with library code?

Oh well, it was worth a shot. :-P


 Also, "foreach but it works in range pipelines" is absolutely a
 worthwhile reason on its own to keep .each around. I understand some
 people don't want "trivial" stuff in Phobos, but there is a real value
 to convenience. (Not to mention that even seemingly-trivial generic
 code can have edge-cases that take some work to get right.)

Convenience, esp. for small wrappers like .each, is determined by
frequency of use.  Is there an easy way to count the number of uses of
.each in public D projects?  It'd be good to have some data to back up
our decisions, otherwise we're just sailing blind based on pure
speculation.
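
For a rough first number, a textual count over a checked-out project might be enough. This is a sketch only, assuming a grep that supports `-o` (GNU and BSD grep do); `count_each` is a hypothetical helper name. It matches `.each!` and `.each(` anywhere in the text, so hits inside comments and string literals are counted too, and other members that happen to be called `each` are not told apart.

```shell
#!/bin/sh
# count_each FILE...: count textual occurrences of ".each!" or ".each("
# in the given files. A crude proxy for "uses of .each", nothing more.
count_each() {
    cat -- "$@" | grep -oE '\.each[!(]' | wc -l | tr -d ' '
}
```

Inside a project's repository the call would be `count_each $(git ls-files '*.d')`; a plain `foreach` does not match because the pattern requires the leading dot.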


 To give a negative example: C++'s <random> library makes essentially
 no concessions to convenience, and as a result is gratuitously
 annoying to use.  This blog post by Martin Hořeňovský goes into some
 of the details:
 
 https://codingnest.com/generating-random-numbers-using-c-standard-library-the-solutions/

Oy, don't get me started on C++... Once, I had an old C++ project where
I identified a particular performance bottleneck that could be improved
by using hashtables.  Should be easy, right?  Esp. since C++11
finally(!) has hashtables in the standard library.  Long story short,
yeah there are hashtables, all right, but the interface is so klunky to
use and the defaults so unhelpful that it would require major code
refactoring just to get it to work (not to mention an already
significant amount of code cleanup to make the original C++98 code
compile with C++11 -- all just to get hashtables!).

After fighting with it for more hours than I was prepared to spend on
what I thought would be a "trivial" change, I threw in the towel... and
later decided to rewrite the entire code in D -- and have been very
happy with this latter decision ever since!  One of the major sources of
happiness was the way D has AA's as part of the language, and using
structs as AA keys Just Worked(tm) out-of-the-box.  Whatever warts AA
may have had (or still have), it's still orders of magnitude better than
the totally-unfriendly C++ API.


T

-- 
ASCII stupid question, getty stupid ANSI.

Sep 17 2020

jmh530 <john.michael.hall gmail.com> writes:

On Thursday, 17 September 2020 at 18:15:32 UTC, Andrei 
Alexandrescu wrote:
 [snip]

 I was all in favor of that, until I had to refactor a dozen 
 two-liners consisting of... let me paste some code:

 auto put(ref GGPlotD gg, GeomBar def) {
     import ggplot.backend.ggplot.geom_bar : geomBar;

     return gg.put(geomBar(def));
 }

 And so it goes for pages.

[snip]

Oh I probably have lots of code that does stuff like that (even 
more commonly in unittests).

But I also agree with the points above on dfmt and having this 
done automatically.

Sep 17 2020

Abdulhaq <alynch4047 gmail.com> writes:

On Thursday, 17 September 2020 at 18:15:32 UTC, Andrei 
Alexandrescu wrote:
 On 9/17/20 1:49 PM, jmh530 wrote:
 On Thursday, 17 September 2020 at 17:16:42 UTC, Andrei 
 Alexandrescu wrote:
 [snip]


 I was all in favor of that, until I had to refactor a dozen 
 two-liners consisting of... let me paste some code:

 auto put(ref GGPlotD gg, GeomBar def) {
     import ggplot.backend.ggplot.geom_bar : geomBar;

     return gg.put(geomBar(def));
 }

 And so it goes for pages.

 There is something to be said about a guideline versus a 
 mindless dogma. Things like enum strings, each, and 
 autodecoding are more of the latter.

A general comment on this thread. Each of us has his preferred 
code style guidelines (standards are great, everyone should have 
their own standards!). What we have done is we have trained our 
visual system / brain to easily pick out the semantically 
important information based on the styling/layout of the code. 
This leaves us feeling good or bad about certain formatting 
styles, depending on how well your trained visual system/rules 
can cooperate with the style of the code at hand. Then we say 
that python-style indents stink, K&R brackets are the only way to 
go etc., etc.

In reality we need to agree to a single standard for D code and 
everyone has to stick to it whether it panders to the style of 
their choice or not. After a month or two our visual cortex / D 
brain will accommodate the new style and all will be well again 
in the world.

Sep 21 2020

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, September 17, 2020 11:49:05 AM MDT jmh530 via Digitalmars-d 
wrote:
 On Thursday, 17 September 2020 at 17:16:42 UTC, Andrei

 Alexandrescu wrote:
 [snip]

 This is exclusively about readability. Too much fluff means too
 little context in front of you.

 When I looked at some of those examples above, I agree that some
 are extraneous (which should get flagged by the autotester)...but
 I almost always put an empty line after imports. I find that
 helps with readability and even maintainability in short
 functions/scopes in that I don't need to add them back after
 adding more lines and have more trouble finding them.

I do the same. Similarly, I usually put blank lines after declarations,
because that makes the code easier to read, but I'm also somewhat flexible
about it, since following strict rules about some of that stuff can
certainly result in way too many blank lines in some code. About the only
time I'll consider not putting a blank line after an import statement is
when the function is otherwise a one-liner, but even then, I'm likely to
prefer having the blank line (which Andrei clearly wouldn't like).

Either way, while I won't claim that Phobos' code always does a good job
with where it has blank lines (and there certainly are plenty of times that
I haven't liked code formatting in Phobos), I haven't personally found that
the number of blank lines in Phobos causes problems with being able to see
code on a screen even on my laptop, and in my experience, trying to minimize
blank lines often makes code harder to read. It's a very subjective thing
though and is going to vary from person to person. Using a code formatter
would make it so that you don't have to argue about it in PRs, but it also
tends to result in uglier code. Personally, I'd prefer that we just not be
picky about it, but if we're going to be, then ultimately, using a code
formatter is probably the way to go.

That being said, unless the code formatter is smart enough to deal with very
small functions differently, removing blank lines after something like
import statements in order to reduce the number of blank lines in small
functions will definitely make longer functions harder to read - especially
when we're doing a lot with local import statements. You can get quite a
list of those in a row, in which case, I think that it's pretty bad not to
have a blank line after it. I don't know how smart any code formatting
tools we currently have at our disposal are though, since I generally avoid
them.

- Jonathan M Davis

Sep 17 2020

Seb <seb wilzba.ch> writes:

On Thursday, 17 September 2020 at 17:06:12 UTC, jmh530 wrote:
 I.e., does having extra empty lines have an impact on the time 
 it takes to compile phobos?

No impact at all. Not even for lexing.

Sep 17 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Sep 17, 2020 at 05:06:12PM +0000, jmh530 via Digitalmars-d wrote:
[...]
 I.e., does having extra empty lines have an impact on the time it
 takes to compile phobos?
 
 I don't really have a good sense of whether the time spent compiling
 is more related to number of lines or number of characters. An empty
 line with no characters or one character may not take long to parse.

Lexing/parsing is the fastest part of the compilation process.  The
majority of compilation time (by far) is spent on semantic analysis and
codegen.  A couple of empty lines makes a difference so minute I daresay
it's not even measurable.


T

-- 
If it tastes good, it's probably bad for you.

Sep 17 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 9/17/20 12:34 PM, H. S. Teoh wrote:
 On Thu, Sep 17, 2020 at 11:51:18AM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
    331378 total

 I noticed many contributors are fond of inserting empty lines
 discretionarily, sometimes even in the middle of 2-5 line functions,
 or right after opening an "if" statement.

 
 Do you have a concrete example of this?

Only everywhere I look. Just opened std.algorithm.iteration but really 
no need to cherry-pick:

Four-liner unittest with a gratuitous empty line: 
https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L278. 
Also uses some vertical alignment that maintainers are going to just love.

Empty line BEFORE the closing "}", that's just sloppy: 
https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L2267

An empty line between the cases and the default of a switch: 
https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L5863. 
The switch itself is four lines. Also: the default case uses return on 
the same line, the others don't.

An empty line inexplicably just before the return of a four-liner. A 
four-liner! Who thinks this makes anything easier to read? 
https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L5902

An empty line in the middle of a three-line block, again before the 
return (what's the deal with that)? 
https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L6787.

Three-liner unittest with an empty one added: 
https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L7004

The list goes on...

(I recall being in a similar debate with a colleague, and he mentioned 
he separates with a newline all non-expression statements (or something 
to that effect). So if there's an if, a loop, an import, a return, a 
definition, etc. - they have newlines in between. I'd think that gets 
into career-limiting stuff, especially because said colleague had the 
air that that's really fundamental, like the theorem of structure.)

This all is a bit of a bummer. I reckon it's a popular style because 
there seems to be a lot of code in phobos like that, but it kinda is 
excessive and limits the context one can put on the screen. Worse, I saw 
that as a side-effect there's a lot of "cheating" on the full bracing - 
many places that require no full braces dispense with them. However, it 
turns out that full-bracing is useful during maintenance (you can easily 
insert or remove code), whereas empty lines aren't.

 Sure, but why are we fussing over formatting, when there are bigger fish
 to fry?  Like code quality issues: pushing sig constraints into static
 ifs when they are more suitable, merging unnecessary overloads, fixing
 documentation issues, etc..  IMO, removing empty lines is last on that
 long list of action items.

Per the Romanian saying, "til you get to God the saints will eat you 
up". Hard to translate. Meaning you try the big thing but the little 
things are going to get you first. Yah, I'm trying to do the good things 
while constantly getting sand in the eye because of the little things.

 E.g. there should never be two consecutive empty lines, and code
 blocks shorter than x lines should have no newlines inside.

 
 You mean no empty lines?  Otherwise we'd have to pack multiple
 statements onto 1 line, which I don't think is what you want. ;-)

I don't mean "no two consecutive newlines". I mean no two consecutive empty 
lines, i.e. no three consecutive newlines, \n\n\n.
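
A check for that rule is short enough to sketch. This assumes POSIX awk; `find_double_blanks` is a hypothetical name, not an existing Phobos or dfmt check. It prints `file:line` once per run of empty lines, at the second empty line of the run.

```shell
#!/bin/sh
# find_double_blanks FILE...: report each spot where two or more
# consecutive empty lines occur (three newlines in a row in the raw text).
find_double_blanks() {
    awk '
        FNR == 1 { run = 0 }    # reset the counter at the start of each file
        /^$/     { if (++run == 2) print FILENAME ":" FNR; next }
                 { run = 0 }    # any non-empty line ends the run
    ' "$@"
}
```

A single empty line between two statements is left alone; only the second empty line in a row is flagged.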

Sep 17 2020

Timon Gehr <timon.gehr gmx.ch> writes:

On 17.09.20 18:34, H. S. Teoh wrote:
 Sure, but why are we fussing over formatting, when there are bigger fish
 to fry?  Like code quality issues: pushing sig constraints into static
 ifs when they are more suitable, merging unnecessary overloads, fixing
 documentation issues, etc..  IMO, removing empty lines is last on that
 long list of action items.

(The discussed aspects of the Phobos code style make it annoying to 
contribute.)

Sep 17 2020

Jacob Carlborg <doob me.com> writes:

On 2020-09-17 18:34, H. S. Teoh wrote:

 as opposed to:
 
 	// Better
 	struct MyRange(E)
 	{
 	    @property bool empty() { return _isEmpty; }
 	    @property E front() { return _front; }
 	    void popFront() { r.popFront; }
 	}

I hate that style. But I wouldn't mind if D supported expression bodies:


struct MyRange(E)
{
     @property bool empty() => _isEmpty;
     @property E front() => _front;
     void popFront() => r.popFront;
}

It would look even better in Scala:

class MyRange[E]
{
   def empty = _isEmpty
   def front = _front
   def popFront() = r.popFront
}

In most languages in the C family, the curly braces around the body of 
`if`, `for`, `while` and so on can be dropped if the body contains only 
a single statement. Compared to Java, D extends this and allows dropping 
the curly braces for `try`, `catch` and `finally` as well. It just makes 
sense to allow dropping them for function bodies too.

Scala goes even further and allows dropping the curly braces for classes:

class Foo

Not a very useful class but something like this also works in Scala:

class Point(var x: Int, var y: Int)

The above will automatically generate instance variables, 
getters/setters and a constructor for the specified `x` and `y`.

Being able to drop the curly braces is extra useful in Scala because 
the last expression in a method is returned automatically and 
all statements are actually expressions:

class Foo
{
   def bar(x: Int) =
     if (x == 3)
       "3"
     else if (x == 4)
       "4"
     else
       "other"
}

-- 
/Jacob Carlborg

Sep 18 2020

Seb <seb wilzba.ch> writes:

On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 I noticed many contributors are fond of inserting empty lines 
 discretionarily, sometimes even in the middle of 2-5 line 
 functions, or right after opening an "if" statement. The total 
 number of empty lines:

 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

 So Phobos has 11.62% empty lines in it, on average one on every 
 9 lines of code. I find that a bit excessive, particularly 
 given that our coding convention uses brace-on-its-own line, 
 which already adds a lot of vertical space. Here's the number 
 of lines consisting of only one brace:

 git grep '^ *[{}] *$' **/*.d | wc -l
    53126

 That's 16% of the total. Combined with empty lines, we're 
 looking at a 27.65% fluff factor. Isn't that quite a bit, even 
 considering that documentation requires empty lines for 
 paragraphs etc?

 Today's monitors favor width over height and I didn't yet get 
 to the point of rotating my monitor for coding purposes. (It's 
 also curved, which would make it awkward.) Would it be 
 reasonable to curb a bit on the fluff factor? E.g. there should 
 never be two consecutive empty lines, and code blocks shorter 
 than x lines should have no newlines inside.

We shouldn't have these discussions.
We should just use a code formatting tool (see e.g. how 
successful black was in the Python world - 
https://github.com/psf/black) and be done with it.

So IMHO the only productive discussion here is how we can get 
dfmt (https://github.com/dlang-community/dfmt) (or a new tool) 
into a shape such that it can be applied automatically.

Alternatively, for your specific request, there was [1], but I 
have since given up on enforcing such style issues manually. As I 
mentioned, a tool should handle this task for you.

[1] https://github.com/dlang-community/D-Scanner/pull/447

Sep 17 2020

bachmeier <no spam.net> writes:

On Thursday, 17 September 2020 at 17:49:39 UTC, Seb wrote:

 We shouldn't have these discussions.
 We should just use a code formatting tool (see e.g. how 
 successful black was in the Python world - 
 https://github.com/psf/black) and be done with.

 So IMHO the only productive discussion here is how we can get 
 dfmt (https://github.com/dlang-community/dfmt)  (or a new tool) 
 in a shape, s.t. it can be applied automatically.

 Alternatively, for your specific request, there was [1], but I 
 since gave up on enforcing such style issues manually. As I 
 mentioned, a tool should handle this task for you.

 [1] https://github.com/dlang-community/D-Scanner/pull/447

What would be nice is if we didn't have Github shouting at us 
because of whitespace errors. We're not using Python after all.

Sep 17 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 9/17/20 1:49 PM, Seb wrote:
 On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 I noticed many contributors are fond of inserting empty lines 
 discretionarily, sometimes even in the middle of 2-5 line functions, 
 or right after opening an "if" statement. The total number of empty 
 lines:

 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

 So Phobos has 11.62% empty lines in it, on average one on every 9 
 lines of code. I find that a bit excessive, particularly given that 
 our coding convention uses brace-on-its-own line, which already adds a 
 lot of vertical space. Here's the number of lines consisting of only 
 one brace:

 git grep '^ *[{}] *$' **/*.d | wc -l
    53126

 That's 16% of the total. Combined with empty lines, we're looking at a 
 27.65% fluff factor. Isn't that quite a bit, even considering that 
 documentation requires empty lines for paragraphs etc?

 Today's monitors favor width over height and I didn't yet get to the 
 point of rotating my monitor for coding purposes. (It's also curved, 
 which would make it awkward.) Would it be reasonable to curb a bit on 
 the fluff factor? E.g. there should never be two consecutive empty 
 lines, and code blocks shorter than x lines should have no newlines 
 inside.

 
 We shouldn't have these discussions.
 We should just use a code formatting tool (see e.g. how successful black 
 was in the Python world - https://github.com/psf/black) and be done with.
 
 So IMHO the only productive discussion here is how we can get dfmt 
 (https://github.com/dlang-community/dfmt)  (or a new tool) in a shape, 
 s.t. it can be applied automatically.
 
 Alternatively, for your specific request, there was [1], but I since 
 gave up on enforcing such style issues manually. As I mentioned, a tool 
 should handle this task for you.
 
 [1] https://github.com/dlang-community/D-Scanner/pull/447

Great point. We have uncrustify working passably well at Symmetry; 
making it available for Phobos is a matter of putting in the time.

Sep 17 2020

Paolo Invernizzi <paolo.invernizzi gmail.com> writes:

On Thursday, 17 September 2020 at 17:49:39 UTC, Seb wrote:
 On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
 Alexandrescu wrote:
 [...]

 We shouldn't have these discussions.
 We should just use a code formatting tool (see e.g. how 
 successful black was in the Python world - 
 https://github.com/psf/black) and be done with.

 So IMHO the only productive discussion here is how we can get 
 dfmt (https://github.com/dlang-community/dfmt)  (or a new tool) 
 in a shape, s.t. it can be applied automatically.

 Alternatively, for your specific request, there was [1], but I 
 since gave up on enforcing such style issues manually. As I 
 mentioned, a tool should handle this task for you.

 [1] https://github.com/dlang-community/D-Scanner/pull/447


I think I hear Kenji chuckling in the distance ...

Sep 18 2020

Arun <aruncxy gmail.com> writes:

On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 I noticed many contributors are fond of inserting empty lines 
 discretionarily, sometimes even in the middle of 2-5 line 
 functions, or right after opening an "if" statement. The total 
 number of empty lines:

 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

 So Phobos has 11.62% empty lines in it, on average one on every 
 9 lines of code. I find that a bit excessive, particularly 
 given that our coding convention uses brace-on-its-own line, 
 which already adds a lot of vertical space. Here's the number 
 of lines consisting of only one brace:

 git grep '^ *[{}] *$' **/*.d | wc -l
    53126

 That's 16% of the total. Combined with empty lines, we're 
 looking at a 27.65% fluff factor. Isn't that quite a bit, even 
 considering that documentation requires empty lines for 
 paragraphs etc?

 Today's monitors favor width over height and I didn't yet get 
 to the point of rotating my monitor for coding purposes. (It's 
 also curved, which would make it awkward.) Would it be 
 reasonable to curb a bit on the fluff factor? E.g. there should 
 never be two consecutive empty lines, and code blocks shorter 
 than x lines should have no newlines inside.

Good that someone from the core team pays attention to these 
things. We should encourage the use of dfmt for such things, come 
up with a standard .editorconfig and commit that to the repo. 
Vim, emacs, VSCode - pretty much the majority of the editors 
support dfmt + .editorconfig. The more dogfooding, the better it 
is. That's not happening at the moment.

druntime uses different style from Phobos as well.

I wish all the community projects used a common D style (even the 
one recommended officially). But I understand where that 
discussion will lead.

Sep 17 2020

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

I'm sorry but the real problem is the mammoth modules. There are 
multiple modules reaching over 10K lines. Fixing the whitespace 
won't improve things much.

Sep 17 2020

Martin <martin.brzenska googlemail.com> writes:

On Friday, 18 September 2020 at 03:01:26 UTC, Andrej Mitrovic 
wrote:
 On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
 Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 I'm sorry but the real problem is the mammoth modules. There 
 are multiple modules reaching over 10K lines. Fixing the 
 whitespace won't improve things much.

I totally agree. It should be possible to have N files in one 
module. Yes, this can be achieved with package.d to some extent. 
But that is rather makeup for the "import" line (imo).

Regarding the code formatting: how about a dfmt included in the 
standard installation package? It could be implemented as a lib in 
Phobos. And if this dfmt could load its rules from an 
EditorConfig-like file, this file could define dfmt's default 
behavior. Projects could ship their own ruleset but still use 
a standard tool for it that would come with every installation.

Sep 19 2020

Martin <martin.brzenska googlemail.com> writes:

On Saturday, 19 September 2020 at 10:47:01 UTC, Martin wrote:
 How about a dfmt [...] if this dfmt could load its rules from a 
 EditorConfig-like file [...]

OMG! No way! I just realized that there is a package called 
dfmt and it does exactly this LOL - SRY for OT but.. If you could 
see my face right now...

Sep 19 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 [...]

Not to criticize, but, if it has reached 330 KLOC there might be 
another problem :S

How did it get that big?

Sep 17 2020

mw <mingwu gmail.com> writes:

On Friday, 18 September 2020 at 06:50:48 UTC, Imperatorn wrote:
 Not to criticize, but, if it has reached 330 KLOC there might 
 be another problem :S

 How did it get that big?

Sometimes, for a non-trivial program, the `unittest` blocks can easily be 
longer than the program itself in order to cover all the test paths & 
cases.

Sep 18 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Friday, 18 September 2020 at 07:41:25 UTC, mw wrote:
 On Friday, 18 September 2020 at 06:50:48 UTC, Imperatorn wrote:
 Not to criticize, but, if it has reached 330 KLOC there might 
 be another problem :S

 How did it get that big?

 Sometimes for non-trivial program, `unittest` can easily be 
 longer than the program itself to cover all the test paths & 
 cases.

Sure, but wouldn't you separate the code and tests in different 
projects?

Sep 18 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Sep 18, 2020 at 12:05:39PM +0000, Imperatorn via Digitalmars-d wrote:
 On Friday, 18 September 2020 at 07:41:25 UTC, mw wrote:
 On Friday, 18 September 2020 at 06:50:48 UTC, Imperatorn wrote:
 Not to criticize, but, if it has reached 330 KLOC there might be
 another problem :S
 
 How did it get that big?

 
 Sometimes for non-trivial program, `unittest` can easily be longer
 than the program itself to cover all the test paths & cases.

 
 Sure, but wouldn't you separate the code and tests in different
 projects?

Why would you?  Keeping tests and code side-by-side makes it more likely
that the two are in sync, and therefore that the tests are relevant to
the current version of the code. The larger the separation, the more
likely the two are out-of-sync, which in practice usually means the
tests become largely irrelevant and fail to test significant
functionality, and code quality drops.  Keeping tests in a separate
module or a separate project altogether exacerbates this likelihood.
Keeping them right next to the function being tested increases the
chances of being relevant.

Besides, you really should be writing tests alongside the code as you're
coding anyway.  IME, doing that has a high chance of catching bugs early
and increasing code quality as a result. Postponing the writing of
tests, or having to switch to a different file/project to write the
test, makes it less likely tests will be written in the first place, and
more likely that the tests will be general and fail to cover corner
cases.


T

-- 
Doubtless it is a good thing to have an open mind, but a truly open mind should
be open at both ends, like the food-pipe, with the capacity for excretion as
well as absorption. -- Northrop Frye

Sep 18 2020

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Friday, September 18, 2020 6:05:39 AM MDT Imperatorn via Digitalmars-d 
wrote:
 On Friday, 18 September 2020 at 07:41:25 UTC, mw wrote:
 On Friday, 18 September 2020 at 06:50:48 UTC, Imperatorn wrote:
 Not to criticize, but, if it has reached 330 KLOC there might
 be another problem :S

 How did it get that big?

 Sometimes for non-trivial program, `unittest` can easily be
 longer than the program itself to cover all the test paths &
 cases.

 Sure, but wouldn't you separate the code and tests in different
 projects?

No. It's very common practice in D code to put the tests immediately after
the code that they're testing. It makes it far easier to make sure that
everything has tests, as well as making it easier to maintain the code and
ensure that the code and tests are properly in sync. In addition to that, if
a unittest block immediately after a symbol is marked with a ddoc comment,
then it gets added to the documentation for that symbol, making it easy to
add examples to the documentation and have those examples be tested whenever
you run the unit tests without having to duplicate the examples and worry
about whether they're in sync between the documentation and the tests.
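
A minimal sketch of such a documented unittest follows. The function `add` is a hypothetical example, not something taken from Phobos:

```d
/// Returns the sum of `a` and `b`.
int add(int a, int b)
{
    return a + b;
}

/// The ddoc comment on this unittest attaches it to `add` as an example.
unittest
{
    assert(add(2, 3) == 5);
}
```

Compiling with `-unittest` runs the block as a test, and generating documentation with `-D` should render its body as an example under `add`.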

Of course, not everyone likes to have the tests with the code, and some
people will put them separately, but it's generally recommended that they go
together, and that's what Phobos does. And since we try to have Phobos be
both well-documented and well-tested, the vast majority of its LOC are made
up of documentation and unit tests.

- Jonathan M Davis

Sep 18 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M Davis 
wrote:
 On Friday, September 18, 2020 6:05:39 AM MDT Imperatorn via 
 Digitalmars-d wrote:
 On Friday, 18 September 2020 at 07:41:25 UTC, mw wrote:
 On Friday, 18 September 2020 at 06:50:48 UTC, Imperatorn



 No. It's very common practice in D code to put the tests 
 immediately after the code that they're testing. It makes it 
 far easier to make sure that everything has tests as well 
 making it easier to maintain the code and ensure that the code 
 and tests are properly in sync. In addition to that, if a 
 unittest block immediately after a symbol is marked with a ddoc 
 comment, then it gets added to the documentation for that 
 symbol, making it easy to add examples to the documentation and 
 have those examples be tested whenever you run the unit tests 
 without having to duplicate the examples and worry about 
 whether they're in sync between the documentation and the tests.
...

 - Jonathan M Davis

"It's very common practice"

Actually no it is not. D is the only example I've seen that 
routinely does this. Virtually all other languages separate code 
and tests.

Sep 19 2020

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Saturday, September 19, 2020 3:15:01 AM MDT Imperatorn via Digitalmars-d 
wrote:
 On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M Davis

 wrote:
 On Friday, September 18, 2020 6:05:39 AM MDT Imperatorn via

 Digitalmars-d wrote:
 On Friday, 18 September 2020 at 07:41:25 UTC, mw wrote:
 On Friday, 18 September 2020 at 06:50:48 UTC, Imperatorn


 No. It's very common practice in D code to put the tests
 immediately after the code that they're testing. It makes it
 far easier to make sure that everything has tests as well
 making it easier to maintain the code and ensure that the code
 and tests are properly in sync. In addition to that, if a
 unittest block immediately after a symbol is marked with a ddoc
 comment, then it gets added to the documentation for that
 symbol, making it easy to add examples to the documentation and
 have those examples be tested whenever you run the unit tests
 without having to duplicate the examples and worry about
 whether they're in sync between the documentation and the tests.

...

 - Jonathan M Davis

 "It's very common practice"

 Actually no it is not. D is the only example I've seen that
 routinely does this. Virtually all other languages separate code
 and tests.

I specifically said that it was very common practice _in D code_. You
couldn't do it with most other languages even if you wanted to, because they
don't have unit test functionality built into the language.

- Jonathan M Davis

Sep 19 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Saturday, 19 September 2020 at 09:24:05 UTC, Jonathan M Davis 
wrote:
 On Saturday, September 19, 2020 3:15:01 AM MDT Imperatorn via 
 Digitalmars-d wrote:
 On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M 
 Davis

 wrote:
 [...]

 "It's very common practice"

 Actually no it is not. D is the only example I've seen that 
 routinely does this. Virtually all other languages separate 
 code and tests.

 I specifically said that it was very common practice _in D 
 code_. You couldn't do it with most other languages even if you 
 wanted to, because they don't have unit test functionality 
 built into the language.

 - Jonathan M Davis

Sure, I get why you would want to structure it like that. Just 
saying that it's probably the reason for Phobos' "fluffiness"/330 
KLOC then?

Sep 19 2020

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Saturday, 19 September 2020 at 09:24:05 UTC, Jonathan M Davis 
wrote:
 On Saturday, September 19, 2020 3:15:01 AM MDT Imperatorn via 
 Digitalmars-d wrote:
 On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M 
 Davis

 wrote:
 [...]

 "It's very common practice"

 Actually no it is not. D is the only example I've seen that 
 routinely does this. Virtually all other languages separate 
 code and tests.

 I specifically said that it was very common practice _in D 
 code_. You couldn't do it with most other languages even if you 
 wanted to, because they don't have unit test functionality 
 built into the language.

I introduced it in the C project at work thanks to the magic of 
preprocessor macros and GCC's __attribute__((constructor)). Works 
like a charm.

Sep 19 2020

Russel Winder <russel winder.org.uk> writes:

On Sat, 2020-09-19 at 09:15 +0000, Imperatorn via Digitalmars-d wrote:
 On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M Davis

 […]
 
 "It's very common practice"
 
 Actually no it is not. D is the only example I've seen that
 routinely does this. Virtually all other languages separate code
 and tests.

Rust has unit tests in the source modules, just as D does. It works very well.

Rust also has ways of doing integration and system testing, where the tests
are separate. It works very well.

-- 
Russel.
===========================================
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk

Sep 19 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Saturday, 19 September 2020 at 09:45:13 UTC, Russel Winder 
wrote:
 On Sat, 2020-09-19 at 09:15 +0000, Imperatorn via Digitalmars-d 
 wrote:
 On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M 
 Davis

 […]
 
 "It's very common practice"
 
 Actually no it is not. D is the only example I've seen that 
 routinely does this. Virtually all other languages separate 
 code and tests.

 Rust has unit tests in the source modules, just as D does. It 
 works very well.

 Rust also has ways of doing integration and system testing, 
 where the tests are separate. It works very well.

I get the point in doing so. What I mean is that explains why 
Phobos is "fluffy" then, and it's probably not a problem

Sep 19 2020

Russel Winder <russel winder.org.uk> writes:

On Sat, 2020-09-19 at 09:55 +0000, Imperatorn via Digitalmars-d wrote:
 […]
 I get the point in doing so. What I mean is that explains why
 Phobos is "fluffy" then, and it's probably not a problem

I suspect getting data before coming to a view is probably the wise thing to
do. Andrei's "fluffiness" may be an issue of unit tests, but given the posts
in the thread I get the feeling it is more than just that.

-- 
Russel.
===========================================
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk

Sep 19 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Saturday, 19 September 2020 at 11:29:37 UTC, Russel Winder 
wrote:
 On Sat, 2020-09-19 at 09:55 +0000, Imperatorn via Digitalmars-d 
 wrote:
 

 […]
 I get the point in doing so. What I mean is that explains why 
 Phobos is "fluffy" then, and it's probably not a problem

 I suspect getting data before coming to a view is probably the 
 wise thing to do. Andrei's "fluffiness" may be an issue of unit 
 tests, but given the posts in the thread I get the feeling it 
 is more than just that.

What I meant is maybe it's not top priority

Sep 19 2020

Russel Winder <russel winder.org.uk> writes:

On Sat, 2020-09-19 at 11:44 +0000, Imperatorn via Digitalmars-d wrote:
 […]
 What I meant is maybe it's not top priority

I can agree that that may well be the case. Though the unit test code may
suffer some of the same "fluffiness" that the library code does.

-- 
Russel.
===========================================
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk

Sep 19 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Saturday, 19 September 2020 at 12:27:27 UTC, Russel Winder 
wrote:
 On Sat, 2020-09-19 at 11:44 +0000, Imperatorn via Digitalmars-d 
 wrote:
 

 […]
 What I meant is maybe it's not top priority

 I can agree that that may well be the case. Though the unit 
 test code may
 suffer some of the same "fluffiness" that the library code does.

True

Sep 20 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/19/20 5:55 AM, Imperatorn wrote:
 On Saturday, 19 September 2020 at 09:45:13 UTC, Russel Winder wrote:
 On Sat, 2020-09-19 at 09:15 +0000, Imperatorn via Digitalmars-d wrote:
 On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M Davis

 […]
 "It's very common practice"

 Actually no it is not. D is the only example I've seen that routinely 
 does this. Virtually all other languages separate code and tests.

 Rust has unit tests in the source modules, just as D does. It works 
 very well.

 Rust also has ways of doing integration and system testing, where the 
 tests are separate. It works very well.

 
 I get the point in doing so. What I mean is that explains why Phobos is 
 "fluffy" then, and it's probably not a problem

I'm bothered by the prevalence of empty lines in source code, not so 
much unittests or documentation.

Sep 19 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/19/20 5:15 AM, Imperatorn wrote:
On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M Davis wrote:
On Friday, September 18, 2020 6:05:39 AM MDT Imperatorn via
Digitalmars-d wrote:
On Friday, 18 September 2020 at 07:41:25 UTC, mw wrote:
On Friday, 18 September 2020 at 06:50:48 UTC, Imperatorn

No. It's very common practice in D code to put the tests immediately
after the code that they're testing. It makes it far easier to make
sure that everything has tests as well making it easier to maintain
the code and ensure that the code and tests are properly in sync. In
addition to that, if a unittest block immediately after a symbol is
marked with a ddoc comment, then it gets added to the documentation
for that symbol, making it easy to add examples to the documentation
and have those examples be tested whenever you run the unit tests
without having to duplicate the examples and worry about whether
they're in sync between the documentation and the tests.
...

- Jonathan M Davis

"It's very common practice"

Actually no it is not. D is the only example I've seen that routinely
does this. Virtually all other languages separate code and tests.

I just read that Python's doctest endorses the same:

* https://docs.python.org/2/library/doctest.html

It seems a matter in which reasonable people may disagree. I know Atila
Neves doesn't like it, and he's reasonable. His reason is actually good
too, of a practical nature at scale - build times/memory issues. I do
agree with that reason, and that's about the only con I can ever think of.

There are also a few discussions on that online:

*
https://stackoverflow.com/questions/9022547/should-test-code-be-separate-from-source-production-code

*
https://softwareengineering.stackexchange.com/questions/188316/is-there-a-reason-that-tests-arent-written-inline-with-the-code-that-they-test

The arguments against seem to come from folks who either never attempted
to use inline tests and are doing a gedankenexperiment, or people who
did try and were deterred by language limitations.

This is a short article in favor of doing it, or so it seems (not sure
what the context is):

*
https://techbeacon.com/app-dev-testing/6-reasons-co-locate-your-app-automation-code

For my money, the troika documentation-implementation-unittest is
sacrosanct. It's the unit of progress. I honestly consider it unkind of
someone to not provide all three together. And there's a synergy of
sorts: in the projects with separate unittests, guess what -
documentation is absent, too.

The advent of documentation unittests has made the matter all but
motherhood and apple pie. Not providing such is almost going out of
one's way to be a jerk to the user.

Sep 19 2020

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Saturday, 19 September 2020 at 15:04:25 UTC, Andrei
Alexandrescu wrote:
On 9/19/20 5:15 AM, Imperatorn wrote:
On Saturday, 19 September 2020 at 06:59:54 UTC, Jonathan M
Davis wrote:
[...]

[...]

"It's very common practice"

Actually no it is not. D is the only example I've seen that
routinely does this. Virtually all other languages separate
code and tests.

I just read that Python's doctest endorses the same:

* https://docs.python.org/2/library/doctest.html

It seems a matter in which reasonable people may disagree. I
know Atila Neves doesn't like it, and he's reasonable. His
reason is actually good too, of a practical nature at scale -
build times/memory issues. I do agree with that reason, and
that's about the only con I can every think of.

There are also a few discussion on that online:

*
https://stackoverflow.com/questions/9022547/should-test-code-be-separate-from-source-production-code

*
https://softwareengineering.stackexchange.com/questions/188316/is-there-a-reason-that-tests-arent-written-inline-with-the-code-that-they-test

The arguments against seem to come from folks who either never
attempted to use inline tests and are doing a
gedankenexperiment, or people who did try and were deterred by
language limitations.

This is a short article in favor of doing it, or so it seems
(not sure what the context is):

*
https://techbeacon.com/app-dev-testing/6-reasons-co-locate-your-app-automation-code

For my money, the troika documentation-implementation-unittest
is sacrosanct. It's the unit of progress. I honestly consider
it unkind of someone to not provide all three together. And
there's a synergy of sorts: in the projects with separate
unittests, guess what - documentation is absent, too.

The advent of documentation unittests has made the matter all
but motherhood and apple pie. Not providing such is almost
going out of one's way to be a jerk to the user.

Thanks for a good reply!

I am not really saying it's a bad idea. I actually think it might
be wise to do so. I mis-read the comment earlier about it being
common (in other languages).
I have just completed a marathon of going through about 50
languages, so I felt I had something to say about it, that's all.

Anyway, the fact that I'm here at the D forum is kind of a good sign,
right? :) I do believe D is a great language with a good deal of
hope. In the coming months I will (at our company) actually port

Sep 19 2020

data pulverizer <data.pulverizer gmail.com> writes:

On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 I noticed many contributors are fond of inserting empty lines 
 discretionarily, sometimes even in the middle of 2-5 line 
 functions, or right after opening an "if" statement. The total 
 number of empty lines:

 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

 So Phobos has 11.62% empty lines in it [...]

At the risk of starting WW3, I'd like to propose using 2 space 
code indent over 4 spaces or tabs. Indents just don't need that 
much space.

:-)

Sep 19 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/19/20 9:00 AM, data pulverizer wrote:
 On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 I noticed many contributors are fond of inserting empty lines 
 discretionarily, sometimes even in the middle of 2-5 line functions, 
 or right after opening an "if" statement. The total number of empty 
 lines:

 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

 So Phobos has 11.62% empty lines in it [...]

 
 At the risk of starting WW3, I'd like to propose using 2 space code 
 indent over 4 spaces or tabs. Indents just don't need that much space.

Jesting aside, that would actually increase average complexity of the 
code because it would allow people to nest more.

Sep 19 2020

data pulverizer <data.pulverizer gmail.com> writes:

On Saturday, 19 September 2020 at 15:08:12 UTC, Andrei 
Alexandrescu wrote:
 On 9/19/20 9:00 AM, data pulverizer wrote:
 At the risk of starting WW3, I'd like to propose using 2 space 
 code indent over 4 spaces or tabs. Indents just don't need 
 that much space.

 Jesting aside, that would actually increase average complexity 
 of the code because it would allow people to nest more.

I guess you could set a line width guide to make the code more 
readable. My issue with 4 space indents is in the best case 
scenario, you end up with a large rectangular block of white 
space running straight down the left hand side of the 
function/struct/class/whatever, and you can get these side-ways 
mountains of white space where you have to scroll to the right to 
actually see the code.

Some time ago I used to use tabs (yikes!) to delimit my code 
until I took a terrible contract job. The only good thing that 
came out of it was that the client insisted that the code be 
2-space delimited, which I didn't even know was a thing. Lol! 
After that job I don't do fixed cost contracts, and I indent code 
with 2 spaces.

Sep 19 2020

Uknown <sireeshkodali1 gmail.com> writes:

On Saturday, 19 September 2020 at 16:12:52 UTC, data pulverizer 
wrote:
 On Saturday, 19 September 2020 at 15:08:12 UTC, Andrei 
 Alexandrescu wrote:
 On 9/19/20 9:00 AM, data pulverizer wrote:
 [...]



The Linux kernel has a rule: all code has to be indented with 
tabs. It's for a single reason: more left-leaning code is faster 
(heh Andrei). As Andrei already mentioned, a narrower indent 
allows people to nest more, which makes for arguably worse 
code.[1]

What you really want is a column limit too. The Linux kernel 
enforces 80 columns; for Phobos something like 100-120 probably 
makes more sense. It would make sure no one writes overly nested 
code, and at the same time make sure you don't have to scroll all 
the way to the right.

[1]: 
https://www.kernel.org/doc/html/v4.10/process/coding-style.html

Sep 19 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/19/20 12:20 PM, Uknown wrote:
 On Saturday, 19 September 2020 at 16:12:52 UTC, data pulverizer wrote:
 On Saturday, 19 September 2020 at 15:08:12 UTC, Andrei Alexandrescu 
 wrote:
 On 9/19/20 9:00 AM, data pulverizer wrote:
 [...]



 
 The Linux kernel has a rule: all code has to be indented with tabs. It's 
 for a single reason: more left-leaning code is faster (heh Andrei). As 
 Andrei already mentioned, a narrower indent allows people to nest more, 
 which makes for arguably worse code.[1]
 
 What you really want is a column limit too. The Linux kernel enforces 80 
 columns; for Phobos something like 100-120 probably makes more sense. It 
 would make sure no one writes overly nested code, and at the same time 
 make sure you don't have to scroll all the way to the right.
 
 [1]: https://www.kernel.org/doc/html/v4.10/process/coding-style.html

Yes indeed! I do mention that in a couple of talks. Their indent size is 
8, too. You can't do a lot of shenanigans with indent size 8 and max 
columns 80.

Many animals are beautiful because they live hard lives.
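
A back-of-the-envelope sketch of why a wide indent combined with a column limit discourages nesting: every extra level eats a fixed slice of the line. The helper name `columns_left` is hypothetical and only illustrates the arithmetic.

```shell
# columns_left LIMIT INDENT DEPTH
# Columns still available for code at a given nesting depth.
# (Hypothetical helper, plain integer arithmetic.)
columns_left() {
    echo $(( $1 - $2 * $3 ))
}

# Example:
#   columns_left 80 8 3    # 8-column indent, 3 levels deep
#   columns_left 80 2 3    # 2-column indent, 3 levels deep
```

With an 80-column limit, three levels of nesting leave 56 columns at indent 8 but 74 columns at indent 2, which is the extra room for nesting being discussed.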

Sep 19 2020

Daniel N <no public.email> writes:

On Saturday, 19 September 2020 at 16:32:13 UTC, Andrei 
Alexandrescu wrote:
 Yes indeed! I do mention that in a couple of talks. Their 
 indent size is 8, too. You can't do a lot of shenanigans with 
 indent size 8 and max columns 80.

 Many animals are beautiful because they live hard lives.

They recently switched to 100:
https://lkml.org/lkml/2020/5/29/1038

I fully agree with your statement of complexity based on nesting; 
however, I totally disagree with the solution.

Modern "linters" are sufficiently capable to issue an error on 3 
levels of _nesting_ instead of an arbitrarily chosen max column.

Just consider features such as named parameters. If anything, 
they are more likely to reduce complexity than to increase it, 
yet they result in very long lines. Even when a line is a single 
statement without nesting, being forced to break it reduces 
readability.
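
To make the idea of checking nesting rather than columns concrete, here is a deliberately naive sketch. It only counts `{` and `}` characters, so braces inside string literals or comments will fool it; a real linter would work from a parser. The function name `max_nesting` is hypothetical.

```shell
# max_nesting FILE...
# Prints the deepest brace nesting level seen in the input.
# Naive: counts every '{' and '}' character, including those in
# strings and comments.
max_nesting() {
    awk '
        {
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = substr($0, i, 1)
                if (c == "{") {
                    depth++
                    if (depth > max) { max = depth }
                } else if (c == "}" && depth > 0) {
                    depth--
                }
            }
        }
        END { print max + 0 }' "$@"
}

# Example: report a file whose nesting exceeds 3 levels.
#   [ "$(max_nesting std/array.d)" -le 3 ] || echo "too deeply nested"
```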

Sep 19 2020

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Saturday, 19 September 2020 at 16:12:52 UTC, data pulverizer 
wrote:
 On Saturday, 19 September 2020 at 15:08:12 UTC, Andrei 
 Alexandrescu wrote:
 [...]

 I guess you could set a line width guide to make the code more 
 readable. My issue with 4 space indents is in the best case 
 scenario, you end up with a large rectangular block of white 
 space running straight down the left hand side of the 
 function/struct/class/whatever, and you can get these side-ways 
 mountains of white space where you have to scroll to the right 
 to actually see the code.

 [...]

Yeah, 2 wide and no tabs. The only right rule. ;-)

Sep 19 2020

wjoe <invalid example.com> writes:

On Saturday, 19 September 2020 at 17:52:27 UTC, Patrick Schluter 
wrote:
 On Saturday, 19 September 2020 at 16:12:52 UTC, data pulverizer 
 wrote:
 On Saturday, 19 September 2020 at 15:08:12 UTC, Andrei 
 Alexandrescu wrote:
 [...]

 I guess you could set a line width guide to make the code more 
 readable. My issue with 4 space indents is in the best case 
 scenario, you end up with a large rectangular block of white 
 space running straight down the left hand side of the 
 function/struct/class/whatever, and you can get these 
 side-ways mountains of white space where you have to scroll to 
 the right to actually see the code.

 [...]

 Yeah, 2 wide and no tabs. The only right rule. ;-)

I think the only right rule is to use a code formatting tool and 
provide the configuration so everyone can use their preferred 
style and format the code before committing it.

Sep 19 2020

matheus <matheus gmail.com> writes:

On Sunday, 20 September 2020 at 01:00:43 UTC, wjoe wrote:
 ...
 I think the only right rule is to use a code formatting tool 
 and provide the configuration so everyone can use their 
 preferred style and format the code before committing it.

May I ask how this works when comparing differences between files 
from server with local?

Matheus.

Sep 19 2020

wjoe <invalid example.com> writes:

On Sunday, 20 September 2020 at 02:31:53 UTC, matheus wrote:
 On Sunday, 20 September 2020 at 01:00:43 UTC, wjoe wrote:
 ...
 I think the only right rule is to use a code formatting tool 
 and provide the configuration so everyone can use their 
 preferred style and format the code before committing it.

 May I ask how this works when comparing differences between 
 files from server with local?

 Matheus.

Shouldn't be too hard to come up with a script that can 
conveniently handle that scenario.
Like a diff wrapper that either formats the server version to 
your favorite style or your local file to the server style.
That's how I'd do it anyways.

But since this is about newlines, i.e. whitespace, a lot of diff 
tools can be configured to ignore whitespace.
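
A minimal sketch of such a wrapper, assuming the only style differences are blank lines and leading/trailing whitespace. It does not call a real formatter, and the names `normalize` and `layout_diff` are hypothetical. For the simpler case, `git diff -w` and `git diff --ignore-blank-lines` already exist.

```shell
# normalize FILE
# Strips leading/trailing spaces and tabs and drops blank lines.
normalize() {
    awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); if ($0 != "") print }' "$1"
}

# layout_diff OLD NEW
# Diffs two files after normalizing both; exit status is diff's.
layout_diff() {
    a=$(mktemp) && b=$(mktemp) || return 2
    normalize "$1" > "$a"
    normalize "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Example:
#   layout_diff server/array.d local/array.d
```

Because indentation is stripped too, a change in indent width or brace depth produces no diff, but a statement that was split or joined across lines still shows up.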

Sep 22 2020

wjoe <invalid example.com> writes:

On Saturday, 19 September 2020 at 16:12:52 UTC, data pulverizer 
wrote:
 [...]
 After that job I don't do fixed cost contracts, and I indent 
 code with 2 spaces.

You could have configured your tabs to a width of 2 :)

Sep 19 2020

Paul Backus <snarwin gmail.com> writes:

On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei 
Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 I noticed many contributors are fond of inserting empty lines 
 discretionarily, sometimes even in the middle of 2-5 line 
 functions, or right after opening an "if" statement. The total 
 number of empty lines:

 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

 So Phobos has 11.62% empty lines in it, on average one on every 
 9 lines of code. I find that a bit excessive, particularly 
 given that our coding convention uses brace-on-its-own line, 
 which already adds a lot of vertical space. Here's the number 
 of lines consisting of only one brace:

 git grep '^ *[{}] *$' **/*.d | wc -l
    53126

 That's 16% of the total. Combined with empty lines, we're 
 looking at a 27.65% fluff factor. Isn't that quite a bit, even 
 considering that documentation requires empty lines for 
 paragraphs etc?

Just for fun, I decided to run these calculations on sumtype, 
which uses my own personal formatting style:

Lines: 2389 total
Blank: 435 (18%)
Brace: 133 (6%)
Fluff factor: 24%

I'm clearly a lot less shy about using blank lines in my code 
than the average Phobos contributor, but I don't put opening 
braces on their own lines, so I end up with about the same level 
of fluff overall.

I wonder if this is a coincidence, or if "readable" code in 
curly-brace languages naturally converges to around 25% fluff? 
Further research is needed.
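
For anyone who wants to repeat the measurement on their own code, here is a minimal sketch that computes all three figures in one pass. It assumes that lines holding only spaces or tabs count as blank and that brace-only lines may be indented with either; the function name `fluff_stats` is hypothetical.

```shell
# fluff_stats FILE...
# Prints total, blank, and brace-only line counts, plus the combined
# "fluff factor", each as a percentage of the total line count.
# Assumption: whitespace-only lines are blank; brace lines may be
# indented with spaces or tabs.
fluff_stats() {
    cat -- "$@" | awk '
        /^[ \t]*$/           { blank++ }
        /^[ \t]*[{}][ \t]*$/ { brace++ }
        END {
            if (NR == 0) { print "no input"; exit 1 }
            printf "Lines: %d total\n", NR
            printf "Blank: %d (%.2f%%)\n", blank, 100 * blank / NR
            printf "Brace: %d (%.2f%%)\n", brace, 100 * brace / NR
            printf "Fluff factor: %.2f%%\n", 100 * (blank + brace) / NR
        }'
}

# Example:
#   fluff_stats $(git ls-files '*.d')
```

As a sanity check on the arithmetic, the Phobos numbers quoted above give (38503 + 53126) / 331378 = 27.65%.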

Sep 20 2020

Adam D. Ruppe <destructionator gmail.com> writes:

On Sunday, 20 September 2020 at 15:04:35 UTC, Paul Backus wrote:
 Just for fun, I decided to run these calculations on sumtype, 
 which uses my own personal formatting style:

 Lines: 2389 total
 Blank: 435 (18%)
 Brace: 133 (6%)
 Fluff factor: 24%

omg ima do it too:

Lines: 156097 total
Blank: 22552 (14%)
Brace: 5 (0%)
Fluff factor: 14%


Would be kinda interesting to compare byte size and number of 
lexemes too, to get an idea of average line length. 4780494 is 
what wc says to me which makes my average populated line 36 chars 
long.

Phobos wc gives 11320840 which makes its average populated line 
47 chars long. Interesting but prolly meaningless.

Sep 20 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/20/20 11:15 AM, Adam D. Ruppe wrote:
 On Sunday, 20 September 2020 at 15:04:35 UTC, Paul Backus wrote:
 Just for fun, I decided to run these calculations on sumtype, which 
 uses my own personal formatting style:

 Lines: 2389 total
 Blank: 435 (18%)
 Brace: 133 (6%)
 Fluff factor: 24%

 
 omg ima do it too:
 
 Lines: 156097 total
 Blank: 22552 (14%)
 Brace: 5 (0%)
 Fluff factor: 14%

Interesting, thanks.

 Would be kinda interesting to compare byte size and number of lexemes 
 too, to get an idea of average line length. 4780494 is what wc says to 
 me which makes my average populated line 36 chars long.
 
 Phobos wc gives 11320840 which makes its average populated line 47 chars 
 long. Interesting but prolly meaningless.

I think you'd need to adjust for tabs - you need to multiply tabs by 4.

Sep 20 2020

Adam D. Ruppe <destructionator gmail.com> writes:

On Sunday, 20 September 2020 at 15:28:53 UTC, Andrei Alexandrescu 
wrote:
 I think you'd need to adjust for tabs - you need to multiply 
 tabs by 4.

Oh yeah, that'd pretty well account for the difference.

I also need to fix the braces thing, that should be 2917, not 5 
due to spaces and tabs. So that adds 2% to my fluff bringing it 
to 16%.

Anyway, I have 217080 precious, beautiful, perfect tabs in there. 
So that makes my total length in spaces 5431734 / 130628 
corrected non-blank lines = 42 average. Still slightly shorter 
but pretty close.

Like I said, it is probably meaningless, but a bit amusing since 
I'm so staunchly anti line length limits... but it seems to 
converge around the same length anyway.
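
The tab adjustment above can be written out as arithmetic. This sketch assumes a tab is displayed 4 columns wide, so each tab adds 3 columns on top of the single byte `wc` already counted. The function name `avg_line_length` is hypothetical.

```shell
# avg_line_length BYTES TABS CODE_LINES [TAB_WIDTH]
# Average width of a populated line once tabs are expanded.
# TAB_WIDTH defaults to 4 (an assumption taken from the thread).
avg_line_length() {
    awk -v bytes="$1" -v tabs="$2" -v lines="$3" -v width="${4:-4}" 'BEGIN {
        printf "%.1f\n", (bytes + tabs * (width - 1)) / lines
    }'
}

# Example with the figures from this post:
#   avg_line_length 4780494 217080 130628
```

Here 130628 is 156097 total lines minus 22552 blank lines minus 2917 brace-only lines, and the result rounds to the 42 quoted above.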

Sep 20 2020

mate <aiueo aiueo.aiueo> writes:

On Sunday, 20 September 2020 at 16:11:44 UTC, Adam D. Ruppe wrote:

 Like I said, it is probably meaningless, but a bit amusing 
 since I'm so staunchly anti line length limits... but it seems 
 to converge around the same length anyway.

Interesting. How long are your longest lines?

awk '{print length}' | sort -rn | head

Sep 20 2020

Adam D. Ruppe <destructionator gmail.com> writes:

On Sunday, 20 September 2020 at 20:46:42 UTC, mate wrote:
 Interesting. How long are your longest lines?

$ cat $(git ls-files '*.d') | awk '{print length}' | sort -rn | 
head
3036
826
826
819
793
610
593
568
488
488


lol, but I'd note that's almost certainly skewed by documentation 
comments, which I usually let automatically word-wrap (and if you 
don't, you should btw, it is so so so so so much nicer to work 
with).

Of the 156,000 lines though:

400 are > 180. (0.3%, likely documentation paragraphs)
1,600 are between 120 and 180. (1%, likely docs again)
7,700 are between 80 and 120. (5%, more likely actual code)

The remainder are < 80. (94%)


I did NOT convert tabs to spaces for this.
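
A sketch of how such a breakdown could be produced. The exact bucket boundaries are an assumption (up to 80, 81-120, 121-180, over 180), tabs count as one character as in the figures above, and the function name `length_buckets` is hypothetical.

```shell
# length_buckets FILE...
# Counts lines per length bucket and prints each bucket's share.
# Assumed boundaries: <= 80, 81-120, 121-180, > 180.
length_buckets() {
    cat -- "$@" | awk '
        {
            n = length($0)
            if (n > 180)      { over180++ }
            else if (n > 120) { to180++ }
            else if (n > 80)  { to120++ }
            else              { upto80++ }
        }
        END {
            if (NR == 0) { exit 1 }
            printf "<= 80: %d (%.1f%%)\n",   upto80,  100 * upto80  / NR
            printf "81-120: %d (%.1f%%)\n",  to120,   100 * to120   / NR
            printf "121-180: %d (%.1f%%)\n", to180,   100 * to180   / NR
            printf "> 180: %d (%.1f%%)\n",   over180, 100 * over180 / NR
        }'
}

# Example:
#   length_buckets $(git ls-files '*.d')
```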

Sep 20 2020

mate <aiueo aiueo.aiueo> writes:

On Sunday, 20 September 2020 at 21:13:53 UTC, Adam D. Ruppe wrote:
 On Sunday, 20 September 2020 at 20:46:42 UTC, mate wrote:
 Interesting. How long are your longest lines?

 $ cat $(git ls-files '*.d') | awk '{print length}' | sort -rn | 
 head
 3036
 826
 826
 819
 793
 610
 593
 568
 488
 488


 lol, but I'd note that's almost certainly skewed by 
 documentation comments, which I usually let automatically 
 word-wrap (and if you don't, you should btw, it is so so so so 
 so much nicer to work with).

 Of the 156,000 lines though:

 400 are > 180. (0.3%, likely documentation paragraphs)
 1,600 are between 120 and 180. (1%, likely docs again)
 7,700 are between 80 and 120. (5%, more likely actual code)

 The remainder are < 80. (94%)


 I did NOT convert tabs to spaces for this.

Thanks. So it does look like you have an implicit soft line 
length limit for code, but not documentation.

Sep 20 2020

Adam D. Ruppe <destructionator gmail.com> writes:

On Sunday, 20 September 2020 at 21:42:54 UTC, mate wrote:
 Thanks. So it does look like you have an implicit soft line 
 length limit for code, but not documentation.

Well, it isn't so much a limit as just a lack of demand or some 
other kind of natural traffic calming.

This makes me think about cities. Locally, the city planning 
department has been doing a "road diet". They are changing the 
designs of roads to make them narrower, planting more trees, and 
other changes that just generally make them more difficult to 
navigate in an effort to make them safer.

It might sound weird that a strait path with a narrow gate would 
make the road safer, but it has been proven to have that positive 
safety effect because it makes drivers more likely to naturally 
slow down and pay attention. Whereas a posted speed limit on a 
wide, straight road is just a suggestion you only really think 
about when you see a police car, the designed-in "speed limit" 
created by trees, curves, narrowness, obstacles, etc. is 
something you think about for pure self-preservation if nothing 
else. At that point, you don't really need a legislated/posted 
speed limit at all (though you might keep one anyway just in case 
there is a particularly reckless or inexperienced driver who 
needs the tip, it would rarely actually need active enforcement).


Well, to bring this back to code, excessively long lines are 
already disincentivized by design. There's really no need in the 
majority of cases, and when there is, the natural benefits of 
two-dimensional layout (and the fact so many programming tools 
are line-based anyway) create an incentive for the author - for 
their own self-interested benefit - to break it up as 
appropriate. They don't have to be TOLD to by some kind of 
rubocop.

And if they choose to not break it up, it is probably because 
they judged it to not actually be a benefit in this case and 
someone complaining about the line length is more likely to be 
seen as patronizing than helpful.

Sep 20 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/20/20 4:46 PM, mate wrote:
 On Sunday, 20 September 2020 at 16:11:44 UTC, Adam D. Ruppe wrote:
 
 Like I said, it is probably meaningless, but a bit amusing since I'm 
 so staunchly anti line length limits... but it seems to converge 
 around the same length anyway.

 
 Interesting. How long are your longest lines?
 
 awk '{print length}' | sort -rn | head

Ah, interesting test. I just ran this modified script against phobos:

awk '{print length " " FILENAME}' $(git ls-files '*.d') | sort -rn

These are all lines longer than 120 characters. Not too bad.

215 std/functional.d
199 std/signals.d
180 std/experimental/allocator/package.d
169 std/experimental/allocator/building_blocks/region.d
163 std/experimental/allocator/building_blocks/region.d
163 std/experimental/allocator/building_blocks/region.d
162 std/signals.d
162 std/experimental/allocator/building_blocks/region.d
162 std/experimental/allocator/building_blocks/region.d
160 std/experimental/allocator/building_blocks/bitmapped_block.d
154 std/experimental/allocator/building_blocks/package.d
154 std/experimental/allocator/building_blocks/ascending_page_allocator.d
153 std/experimental/allocator/building_blocks/region.d
152 std/experimental/allocator/building_blocks/region.d
150 std/windows/registry.d
149 std/socket.d
147 std/range/package.d
147 std/format.d
145 std/digest/sha.d
144 std/internal/digest/sha_SSSE3.d
142 std/experimental/allocator/building_blocks/bitmapped_block.d
142 std/experimental/allocator/building_blocks/bitmapped_block.d
138 std/variant.d
138 std/range/package.d
137 std/internal/math/gammafunction.d
137 std/digest/murmurhash.d
136 std/experimental/allocator/building_blocks/free_list.d
136 std/algorithm/comparison.d
135 std/signals.d
134 std/numeric.d
134 std/experimental/allocator/building_blocks/free_tree.d
133 std/experimental/allocator/mallocator.d
131 std/signals.d
130 std/process.d
129 std/experimental/allocator/building_blocks/region.d
129 std/experimental/allocator/building_blocks/region.d
129 std/experimental/allocator/building_blocks/affix_allocator.d
127 std/socket.d
127 std/range/package.d
127 std/exception.d
126 std/windows/registry.d
126 std/range/package.d
126 std/experimental/allocator/typed.d
126 std/experimental/allocator/mallocator.d
125 std/regex/package.d
125 std/process.d
125 std/internal/math/biguintcore.d
124 std/uni/package.d
124 std/signals.d
124 std/regex/internal/ir.d
124 std/random.d
124 std/internal/math/gammafunction.d
123 std/signals.d
122 std/net/isemail.d
122 std/experimental/allocator/building_blocks/scoped_allocator.d
122 std/array.d
122 std/algorithm/iteration.d
121 std/typecons.d
121 std/socket.d
121 std/signals.d
121 std/exception.d
121 std/exception.d
121 std/bitmanip.d
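
A variation on the command above, sketched for anyone who wants a per-file tally instead of one row per line. It counts the lines over a given limit in each file and sorts by that count; the function name `long_lines_by_file` is hypothetical.

```shell
# long_lines_by_file LIMIT FILE...
# Prints "COUNT FILE" for every file that has lines longer than
# LIMIT characters, most offenders first.
long_lines_by_file() {
    limit=$1
    shift
    awk -v limit="$limit" '
        length($0) > limit { count[FILENAME]++ }
        END { for (f in count) print count[f], f }' "$@" |
        sort -k1,1nr -k2
}

# Example:
#   long_lines_by_file 120 $(git ls-files '*.d')
```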

Sep 20 2020

mate <aiueo aiueo.aiueo> writes:

On Monday, 21 September 2020 at 00:19:28 UTC, Andrei Alexandrescu 
wrote:
 On 9/20/20 4:46 PM, mate wrote:
 [...]

 Ah, interesting test. I just ran this modified script against 
 phobos:

 [...]

stdx.allocator seems overrepresented doesn't it?

Sep 21 2020

DlangUser38 <DlangUser38 nowhere.se> writes:

On Monday, 21 September 2020 at 11:21:50 UTC, mate wrote:
 On Monday, 21 September 2020 at 00:19:28 UTC, Andrei 
 Alexandrescu wrote:
 On 9/20/20 4:46 PM, mate wrote:
 [...]

 Ah, interesting test. I just ran this modified script against 
 phobos:

 [...]

 stdx.allocator seems overrepresented doesn't it?

nice shot.

Sep 21 2020

mate <aiueo aiueo.aiueo> writes:

On Monday, 21 September 2020 at 11:26:40 UTC, DlangUser38 wrote:
 On Monday, 21 September 2020 at 11:21:50 UTC, mate wrote:
 On Monday, 21 September 2020 at 00:19:28 UTC, Andrei 
 Alexandrescu wrote:
 On 9/20/20 4:46 PM, mate wrote:
 [...]

 Ah, interesting test. I just ran this modified script against 
 phobos:

 [...]

 stdx.allocator seems overrepresented doesn't it?

 nice shot.

Hey that was not my intention!

Sep 21 2020

DlangUser38 <DlangUser38 nowhere.se> writes:

On Monday, 21 September 2020 at 11:43:48 UTC, mate wrote:
 On Monday, 21 September 2020 at 11:26:40 UTC, DlangUser38 wrote:
 On Monday, 21 September 2020 at 11:21:50 UTC, mate wrote:
 On Monday, 21 September 2020 at 00:19:28 UTC, Andrei 
 Alexandrescu wrote:
 On 9/20/20 4:46 PM, mate wrote:
 [...]

 Ah, interesting test. I just ran this modified script 
 against phobos:

 [...]

 stdx.allocator seems overrepresented doesn't it?

 nice shot.

 Hey that was not my intention!

NVM The effect is the same. stdx.allocator is mostly written by 
Andrei.

Sep 21 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/21/20 7:48 AM, DlangUser38 wrote:
 On Monday, 21 September 2020 at 11:43:48 UTC, mate wrote:
 On Monday, 21 September 2020 at 11:26:40 UTC, DlangUser38 wrote:
 On Monday, 21 September 2020 at 11:21:50 UTC, mate wrote:
 On Monday, 21 September 2020 at 00:19:28 UTC, Andrei Alexandrescu 
 wrote:
 On 9/20/20 4:46 PM, mate wrote:
 [...]

 Ah, interesting test. I just ran this modified script against phobos:

 [...]

 stdx.allocator seems overrepresented doesn't it?

 nice shot.

 Hey that was not my intention!

 
 NVM The effect is the same. stdx.allocator is mostly written by Andrei.

Haha, nice. Actually quite a few people worked on the allocator, 
including a couple of my students. It's likely the long lines aren't 
mine - it's not how I write code.

Sep 21 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 9/20/20 11:04 AM, Paul Backus wrote:
 On Thursday, 17 September 2020 at 15:51:18 UTC, Andrei Alexandrescu wrote:
 As wc -l counts, phobos has some 330 KLOC:

 $ wc -l $(git ls-files '*.d') | tail -1
   331378 total

 I noticed many contributors are fond of inserting empty lines 
 discretionarily, sometimes even in the middle of 2-5 line functions, 
 or right after opening an "if" statement. The total number of empty 
 lines:

 $ git grep '^$' $(git ls-files '*.d') | wc -l
    38503

 So Phobos has 11.62% empty lines in it, on average one on every 9 
 lines of code. I find that a bit excessive, particularly given that 
 our coding convention uses brace-on-its-own line, which already adds a 
 lot of vertical space. Here's the number of lines consisting of only 
 one brace:

 git grep '^ *[{}] *$' **/*.d | wc -l
    53126

 That's 16% of the total. Combined with empty lines, we're looking at a 
 27.65% fluff factor. Isn't that quite a bit, even considering that 
 documentation requires empty lines for paragraphs etc?

 
 Just for fun, I decided to run these calculations on sumtype, which uses 
 my own personal formatting style:
 
 Lines: 2389 total
 Blank: 435 (18%)
 Brace: 133 (6%)
 Fluff factor: 24%
 
 I'm clearly a lot less shy about using blank lines in my code than the 
 average Phobos contributor, but I don't put opening braces on their own 
 lines, so I end up with about the same level of fluff overall.
 
 I wonder if this is a coincidence, or if "readable" code in curly-brace 
 languages naturally converges to around 25% fluff? Further research is 
 needed.


I put blank lines everywhere. I need the fluff for it to look 
reasonable. You will also see a lot of comments in my code too. I used 
to do a 3-line comment using 2 blank // lines above and below, but I now 
think a blank line (without the //) before the comment suffices. Meh, I 
like readability, and I don't see the harm in it.

This is one of the largest rubble-bouncing threads I've seen in a while.

-Steve

Sep 20 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/20/20 11:18 AM, Steven Schveighoffer wrote:
 This is one of the largest rubble-bouncing threads I've seen in a while.

Not to worry, there's no action item here (save for "we should set up an 
automatic formatter sometime"). In particular, my latest refactorings 
aim at reducing code size "for real", not by just removing blank lines.

Sep 20 2020
