digitalmars.D - About the Expressiveness of D
- Jonas Drewsen (5/5) Apr 02 2013 Article about the expressiveness of languages with D included as
- Paulo Pinto (2/7) Apr 02 2013 And me with the one about Go.
- Walter Bright (3/6) Apr 02 2013 It's an interesting metric, but there are too many obvious confounding v...
- Joseph Rushton Wakeling (3/5) Apr 02 2013 Between your response and mine, I think we have a rather good illustrati...
- Joseph Rushton Wakeling (21/23) Apr 02 2013 Personal feeling here -- there's a difference between how expressive a l...
- Walter Bright (4/8) Apr 02 2013 Consider also that this LOC numbers are not lines of code - they're also...
- Peter Alexander (2/13) Apr 02 2013 Not to mention that idiomatic D bracing style adds more lines.
- Jesse Phillips (5/16) Apr 02 2013 While I don't know what this specific report used, but comments
- Walter Bright (4/6) Apr 02 2013 Often, in pulls for D, the LOC of the unittests exceeds the LOC of the f...
- H. S. Teoh (7/16) Apr 02 2013 And I'm inordinately pleased with how many careless mistakes have been
- Jonathan M Davis (12/20) Apr 02 2013 Yes, though I've had complaints before about a pull being too much code ...
- Andrei Alexandrescu (5/19) Apr 02 2013 I think it leads to writing less repetitive unittests.
- Jonathan M Davis (32/54) Apr 02 2013 I very much doubt that you could do that unless you specifically formatt...
- Walter Bright (9/11) Apr 02 2013 Currently, the datetime unittest coverage is 95%. Some of the 0 cases su...
- Jonathan M Davis (8/22) Apr 03 2013 Yes. That's one of the things that I need to improve. std.datetime has a...
- Jonathan M Davis (10/12) Apr 03 2013 I should take another look at those. I thought that I had it at more lik...
- Walter Bright (2/5) Apr 03 2013 Why not just mark them as nothrow? Let the compiler statically check it.
- Jonathan M Davis (14/20) Apr 03 2013 It's for cases where the compiler _can't_ check. For instance, if you ha...
- Walter Bright (2/5) Apr 03 2013 Agreed.
- Walter Bright (2/6) Apr 03 2013 I'd be shocked if running -cov for the first time *didn't* come up with ...
- Jonathan M Davis (8/16) Apr 03 2013 Yes. My point was that 100% should be the goal, whereas I know a number ...
- Walter Bright (3/9) Apr 03 2013 Cov testing also has a tendency to expose dead code - not just insuffici...
- Jonathan M Davis (9/20) Apr 03 2013 Good point. That's not something that I typically think of - though in a...
- Jacob Carlborg (52/83) Apr 02 2013 Since he wrote "2000 lines for all functionality", I don't think he
- Andrei Alexandrescu (4/52) Apr 03 2013 The way I see it, the first is terrible and the second asks for better
- Jacob Carlborg (4/6) Apr 03 2013 Stupid me, posting on Ruby.
- Andrei Alexandrescu (4/8) Apr 03 2013 I was referring to the repeatability of the code used in testing, which
- Jacob Carlborg (4/6) Apr 04 2013 I think the first one is far more readable then the one using the loop.
- Andrei Alexandrescu (3/7) Apr 04 2013 I understand. And I think you are very wrong about that.
- Jonathan M Davis (20/123) Apr 03 2013 That may be, but he does seem to have a habit of including the unit test...
- Jacob Carlborg (7/22) Apr 03 2013 I do refactor tests, but mostly the data. At work I think we have pretty...
- Andrei Alexandrescu (3/6) Apr 03 2013 Well that's quite the assumption.
- Jonathan M Davis (11/17) Apr 03 2013 If you push for the lines of unit testing code to be kept to a minimum, ...
- Walter Bright (6/8) Apr 03 2013 My idea of perfection would be 100% coverage with zero redundancy in the...
- Jonathan M Davis (24/35) Apr 03 2013 Well, determining what's actually redundant isn't always easy. If a test...
- Walter Bright (5/11) Apr 03 2013 We can exploit mathematics to reduce the test cases while testing thorou...
- Jonathan M Davis (9/22) Apr 03 2013 Definitely, though in some cases, figuring the bounds cases can be quite...
- Peter Alexander (10/12) Apr 03 2013 I think you are massively underestimating the complexity and
- Dmitry Olshansky (4/14) Apr 03 2013 --
- Walter Bright (9/10) Apr 03 2013 Stylistic nit:
- Steven Schveighoffer (15/23) Apr 03 2013 I couldn't disagree more. The given +1 had 4 lines of context. There w...
- Andrei Alexandrescu (14/39) Apr 04 2013 I'm with Walter. The top context was fine for that message. The bottom
- Steven Schveighoffer (13/54) Apr 04 2013 Mac mail fixed this problem for me. All previously received text is
- SomeDude (9/11) Apr 04 2013 So there is a lot of visual noise for nothing, and you like it ?
- Steven Schveighoffer (18/30) Apr 05 2013 I like that I don't have to deal with it. I also don't have to deal wit...
- SomeDude (5/12) Apr 04 2013 +1
- Andrei Alexandrescu (13/23) Apr 03 2013 May as well. I recall before I approved std.datetime I looked at the
- Brad Anderson (7/21) Apr 03 2013 Boost datetime is 27k. Just the headers comes to 17k. A 2k
- Jonathan M Davis (8/9) Apr 03 2013 I really should strip out the unit tests and documentation to see what t...
- Andrei Alexandrescu (4/22) Apr 03 2013 Agreed. I just pulled that number randomly without having looked at the
- Simen Kjaeraas (5/18) Apr 03 2013 Removed all comments, unittests, and empty lines from std.datetime. File
- Jesse Phillips (7/25) Apr 03 2013 cloc doesn't support /+ comments... But using your number, cloc,
- Jacob Carlborg (5/7) Apr 04 2013 std.datetime contains mostly /+ and // comments. It only contains a
- Jesse Phillips (4/10) Apr 04 2013 I realize that, reason I had to use math. Cloc reports 11598
- Jacob Carlborg (5/7) Apr 04 2013 Heheh, that's more reasonable. That's also why I don't like to have unit...
- Chad Joan (7/11) Apr 05 2013 My problem with datetime is that it is too monolithic. I really wish it...
- Steven Schveighoffer (5/9) Apr 05 2013 What if the docs were split up?
- Jonathan M Davis (13/29) Apr 05 2013 If/Once some variant of DIPs 15 or 16 is implemented, we'll be able to
- Brad Roberts (7/36) Apr 05 2013 I believe it's really not a module issue at all, but a doc issue. The
- Jacob Carlborg (5/14) Apr 02 2013 The problem is having the unit tests in the same file. Yes, I know, most...
- Andrej Mitrovic (16/18) Apr 02 2013 One thing I noticed is that having unittests in separate files can
- Jacob Carlborg (9/24) Apr 03 2013 Most likely not, but there's nothing wrong with it. We do have modules
- Andrej Mitrovic (8/11) Apr 02 2013 I wonder if there's a way to mitigate that problem with a language
- Chad Joan (3/10) Apr 05 2013 I think this has made me a much better programmer. And it did so a long...
- SomeDude (3/20) Apr 04 2013 He certainly didn't factor out comments for all languages,
- bearophile (5/8) Apr 02 2013 I think D is quite expressive:
- Andrei Alexandrescu (7/17) Apr 02 2013 I meant to comment on this - it's a terrific walkthrough. I think
- Walter Bright (2/19) Apr 02 2013 I agree, it's terrific. But perhaps we can just submit it to reddit as i...
- Andrei Alexandrescu (6/25) Apr 04 2013 Pinging bearophile on this again - do you want to adapt this into a blog...
- bearophile (15/23) Apr 04 2013 Thank you for your interest. I like to write articles, but there
- Zach the Mystic (5/25) Apr 04 2013 I just wanted to say that I also liked the article and I
- Andrei Alexandrescu (7/32) Apr 05 2013 I, too, understand that, with the amendment that it's an unwarranted
- renoX (4/9) Apr 02 2013 Yep, the sorting seems quite random to me, AFAIK Vala is nothing
- Joseph Rushton Wakeling (4/6) Apr 02 2013 To be fair, the author does say that results for what he calls "third ti...
- Kagamin (2/6) Apr 05 2013 Did I get it right, that expressiveness means trading
- Jonathan M Davis (15/21) Apr 05 2013 To some extent, I agree. I'm quite able to maintain it as one module (th...
Article about the expressiveness of languages with D included as one of the contestants.

http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

I tend to agree with the first comment to the article though :)

/Jonas
Apr 02 2013
On Tuesday, 2 April 2013 at 07:59:17 UTC, Jonas Drewsen wrote:
> Article about the expressiveness of languages with D included as one of the contestants.
>
> http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/
>
> I tend to agree with the first comment to the article though :)
>
> /Jonas

And me with the one about Go.
Apr 02 2013
On 4/2/2013 12:59 AM, Jonas Drewsen wrote:
> Article about the expressiveness of languages with D included as one of the contestants.
>
> http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

It's an interesting metric, but there are too many obvious confounding variables to assume that expressiveness has the first order effect.
Apr 02 2013
On 04/02/2013 11:15 AM, Walter Bright wrote:
> It's an interesting metric, but there are too many obvious confounding variables to assume that expressiveness has the first order effect.

Between your response and mine, I think we have a rather good illustration of this for the English language, never mind programming ... :-)
Apr 02 2013
On 04/02/2013 09:59 AM, Jonas Drewsen wrote:
> Article about the expressiveness of languages with D included as one of the contestants.

Personal feeling here -- there's a difference between how expressive a language can be (even, how expressive it can _easily_ be) versus how expressively programmers tend to use it. I think my own use of D tends to be heavily biased by my background in C/C++ and my lack of training in more expressively-focused development styles. D allows me to write in those paradigms I feel comfortable with -- and so my use of it is almost certainly less expressive than it could be.

That feeling is supported by how wide D's error bars are in those plots -- that diversity may well reflect the number of styles of programming one can adopt within the language. I'm surprised that the extreme lower values for the statistic still seem high relative to other languages, but that in turn might reflect the state of development of the language, with new features being added fairly regularly to the standard library (probably larger commits).

I also have a strong feeling that LOC per commit reflects too many different factors to be really reliable as a comparison, e.g. it probably depends quite strongly on the age/maturity of a project, the rate of development, and other factors.

Reading some later posts on the same blog, the author acknowledges some of these kinds of complications:

http://redmonk.com/dberkholz/2013/03/26/what-does-expressiveness-via-loc-per-commit-measure-in-practice/
Apr 02 2013
On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
> I also have a strong feeling that LOC per commit reflects too many different factors to be really reliable as a comparison, e.g. it probably depends quite strongly on the age/maturity of a project, the rate of development, and other factors.

Consider also that these LOC numbers are not just lines of code - they're also lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.
Apr 02 2013
On Tuesday, 2 April 2013 at 17:33:13 UTC, Walter Bright wrote:
> On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
>> I also have a strong feeling that LOC per commit reflects too many different factors to be really reliable as a comparison, e.g. it probably depends quite strongly on the age/maturity of a project, the rate of development, and other factors.
>
> Consider also that these LOC numbers are not just lines of code - they're also lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.

Not to mention that idiomatic D bracing style adds more lines.
Apr 02 2013
On Tuesday, 2 April 2013 at 17:33:13 UTC, Walter Bright wrote:
> On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
>> I also have a strong feeling that LOC per commit reflects too many different factors to be really reliable as a comparison, e.g. it probably depends quite strongly on the age/maturity of a project, the rate of development, and other factors.
>
> Consider also that these LOC numbers are not just lines of code - they're also lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.

I don't know what this specific report used, but comments are generally factored out of LOC and have their own count. I usually find the built-in unittests to cause more skew, since those are counted as LOC.
Apr 02 2013
On 4/2/2013 4:55 PM, Jesse Phillips wrote:
> I usually find the built-in unittests to cause more skew, since those are counted as LOC.

Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.
Apr 02 2013
On Tue, Apr 02, 2013 at 05:01:32PM -0700, Walter Bright wrote:
> On 4/2/2013 4:55 PM, Jesse Phillips wrote:
>> I usually find the built-in unittests to cause more skew, since those are counted as LOC.
>
> Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

And I'm inordinately pleased with how many careless mistakes have been caught by unittests in my D code while coding, as opposed to afterwards when I'm actually using the program for something and bugs show up.

T

--
Тише едешь, дальше будешь. (The slower you go, the further you'll get.)
Apr 02 2013
On Tuesday, April 02, 2013 17:01:32 Walter Bright wrote:
> On 4/2/2013 4:55 PM, Jesse Phillips wrote:
>> I usually find the built-in unittests to cause more skew, since those are counted as LOC.
>
> Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

Yes, though I've had complaints before about a pull being too much code where the unit tests were considered part of the code, and the reviewer thought that number of lines was too great to be worth adding, even if the number of lines of normal code was relatively small. And that sort of attitude would just lead to not properly unit testing stuff.

And while we do some great unit testing (the built in unit test feature is a _huge_ success in that regard), there are at least some areas where we really need to step up our game on that (with ranges in particular given all of the variations of them there are and how many static if branches many range-based functions have). So, what we've got is great, but we can do better.

- Jonathan M Davis
Apr 02 2013
On 4/2/13 10:13 PM, Jonathan M Davis wrote:
> On Tuesday, April 02, 2013 17:01:32 Walter Bright wrote:
>> On 4/2/2013 4:55 PM, Jesse Phillips wrote:
>>> I usually find the built-in unittests to cause more skew, since those are counted as LOC.
>>
>> Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.
>
> Yes, though I've had complaints before about a pull being too much code where the unit tests were considered part of the code, and the reviewer thought that number of lines was too great to be worth adding, even if the number of lines of normal code was relatively small. And that sort of attitude would just lead to not properly unit testing stuff.

I think it leads to writing less repetitive unittests. If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.

Andrei
Apr 02 2013
On Tuesday, April 02, 2013 22:44:15 Andrei Alexandrescu wrote:
> On 4/2/13 10:13 PM, Jonathan M Davis wrote:
>> On Tuesday, April 02, 2013 17:01:32 Walter Bright wrote:
>>> On 4/2/2013 4:55 PM, Jesse Phillips wrote:
>>>> I usually find the built-in unittests to cause more skew, since those are counted as LOC.
>>>
>>> Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.
>>
>> Yes, though I've had complaints before about a pull being too much code where the unit tests were considered part of the code, and the reviewer thought that number of lines was too great to be worth adding, even if the number of lines of normal code was relatively small. And that sort of attitude would just lead to not properly unit testing stuff.
>
> I think it leads to writing less repetitive unittests. If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.

I very much doubt that you could do that unless you specifically formatted the code to take up as few lines as possible and didn't count the unit tests or documentation in that line count. Otherwise, you couldn't do anything even close to what std.datetime does in that few lines. Sure, some functionality could be stripped, but you'd end up with something that did a lot less if it were that small. The unit tests and documentation do make it seem like a lot more code than it is, since they take up well over half the file (probably 3/4), but you'd definitely lose functionality with that few lines of code, and you'd end up with something very poor IMHO if those 2000 lines included the documentation and unit tests. You'd either end up with something that was very bare-bones and/or something which was poorly tested, and given how easy it is to screw up some of those date/time calculations, having only a few tests would be a very bad idea.

std.datetime's unit tests do need some refactoring (some of which I've done, but there's still a fair bit of work to do there), which will definitely reduce the number of LOC that they take up, but I don't agree at all with considering the unit tests as part of the LOC of a file when discussing keeping LOC to a minimum. And while it's good to avoid repetitive unit tests, I'd much rather have repetitive unit tests which are thorough than short ones which aren't. I find your focus on trying to keep unit tests to a minimum to be disturbing and likely to lead to poorly tested code. If anything, we need to be more thorough, not less.

That doesn't mean that the tests need to look like what std.datetime has (particularly since I purposefully avoided loops and other more complicated constructs when I wrote them originally in order to make them as simple and as far from error-prone as possible), but unit tests need to be thorough, and while we're getting better, Phobos' unit tests frequently aren't thorough enough (particularly in std.range and std.algorithm when it comes to testing a variety of range types). Too many of them just test a few cases to make sure that the most obvious stuff works rather than making sure they test corner cases and whatnot.

- Jonathan M Davis
Apr 02 2013
On 4/2/2013 8:03 PM, Jonathan M Davis wrote:
> Too many of them just test a few cases to make sure that the most obvious stuff works rather than making sure they test corner cases and whatnot.

Currently, the datetime unittest coverage is 95%. Some of the 0 cases suggest low hanging fruit.

Despite what I just said, datetime has one of the highest unittest coverages of any phobos module. Pretty much all of the phobos module unittest coverage testing indicates more work is needed.

Minor perf improvement: the order of the tests in yearIsLeapYear() should be reversed, especially since signed divide is a very slow operation, and it is called 20 million times by the unittests!!!
Apr 02 2013
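To illustrate the test-ordering point above, here is a minimal sketch of a leap-year check that puts the test most years fail first. The name `isLeapYear` is hypothetical and this is not the actual std.datetime source; it only assumes the usual Gregorian rule (divisible by 4, except century years not divisible by 400).

```d
// Hypothetical stand-alone helper for illustration; not the std.datetime code.
bool isLeapYear(int year) pure nothrow @safe
{
    // Three out of four years are rejected by this first test alone.
    if (year % 4 != 0)
        return false;
    // Of the remaining years, only century years need the final check.
    if (year % 100 != 0)
        return true;
    return year % 400 == 0;
}

unittest
{
    assert(isLeapYear(2000));   // divisible by 400
    assert(isLeapYear(2012));   // divisible by 4, not a century year
    assert(!isLeapYear(1900));  // century year not divisible by 400
    assert(!isLeapYear(2013));  // not divisible by 4
}
```

Whether this ordering is measurably faster depends on the compiler and the input distribution, so it would need to be profiled rather than assumed.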
On Tuesday, April 02, 2013 20:41:23 Walter Bright wrote:
> On 4/2/2013 8:03 PM, Jonathan M Davis wrote:
>> Too many of them just test a few cases to make sure that the most obvious stuff works rather than making sure they test corner cases and whatnot.
>
> Currently, the datetime unittest coverage is 95%. Some of the 0 cases suggest low hanging fruit.
>
> Despite what I just said, datetime has one of the highest unittest coverages of any phobos module. Pretty much all of the phobos module unittest coverage testing indicates more work is needed.
>
> Minor perf improvement: the order of the tests in yearIsLeapYear() should be reversed, especially since signed divide is a very slow operation, and it is called 20 million times by the unittests!!!

Yes. That's one of the things that I need to improve. std.datetime has a lot of tests, so it needs to do a better job of ordering stuff within unittest blocks in a manner which minimizes their cost. They need to be thorough, but they should also be efficient, or the tests will end up taking too long (which is why it doesn't do a lot of testing with exceptions right now, since they slow the tests down considerably).

- Jonathan M Davis
Apr 03 2013
On Tuesday, April 02, 2013 20:41:23 Walter Bright wrote:
> Currently, the datetime unittest coverage is 95%. Some of the 0 cases suggest low hanging fruit.

I should take another look at those. I thought that I had it at more like 98% (with most or all of the missing lines being due to stuff like catching Exception and asserting 0 in the catch block for making a function nothrow when you know that the code being called will never throw), but that was quite a while ago, and it sounds like it's now missing some stuff.

I'm very much in favor of having 100% test coverage on every line that _can_ be tested (there may be rare exceptions to that, but I don't think that std.datetime has any of them).

- Jonathan M Davis
Apr 03 2013
On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
> (with most or all of the missing lines being due to stuff like catching Exception and asserting 0 in the catch block for making a function nothrow when you know that the code being called will never throw)

Why not just mark them as nothrow? Let the compiler statically check it.
Apr 03 2013
On Wednesday, April 03, 2013 11:27:38 Walter Bright wrote:
> On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
>> (with most or all of the missing lines being due to stuff like catching Exception and asserting 0 in the catch block for making a function nothrow when you know that the code being called will never throw)
>
> Why not just mark them as nothrow? Let the compiler statically check it.

It's for cases where the compiler _can't_ check. For instance, if you had code like

    string foo(int i, int j) nothrow
    {
        try
            return format("%s: %s", i, j);
        catch(Exception e)
            assert(0, "format threw when it should have been impossible.");
    }

the catch is necessary in order to mark the function as nothrow, because format _could_ throw. It's just that given the arguments, you know that it never will.

- Jonathan M Davis
Apr 03 2013
On 4/3/2013 11:44 AM, Jonathan M Davis wrote:
> the catch is necessary in order to mark the function as nothrow, because format _could_ throw. It's just that given the arguments, you know that it never will.

Agreed.
Apr 03 2013
On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
> I'm very much in favor of having 100% test coverage on every line that _can_ be tested (there may be rare exceptions to that, but I don't think that std.datetime has any of them).

I'd be shocked if running -cov for the first time *didn't* come up with issues.
Apr 03 2013
On Wednesday, April 03, 2013 11:29:53 Walter Bright wrote:
> On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
>> I'm very much in favor of having 100% test coverage on every line that _can_ be tested (there may be rare exceptions to that, but I don't think that std.datetime has any of them).
>
> I'd be shocked if running -cov for the first time *didn't* come up with issues.

Yes. My point was that 100% should be the goal, whereas I know a number of developers who consider something like 70% to be sufficient - and these are folks who actually believe in writing unit tests. Certainly, expecting to hit 100% with -cov on the first try isn't generally very realistic unless you're always extremely thorough with your tests, and even then, it's easy to miss a line or two on rarer branches, especially as functions become more complex.

- Jonathan M Davis
Apr 03 2013
On 4/3/2013 11:44 AM, Jonathan M Davis wrote:
> Yes. My point was that 100% should be the goal, whereas I know a number of developers who consider something like 70% to be sufficient - and these are folks who actually believe in writing unit tests. Certainly, expecting to hit 100% with -cov on the first try isn't generally very realistic unless you're always extremely thorough with your tests, and even then, it's easy to miss a line or two on rarer branches, especially as functions become more complex.

Cov testing also has a tendency to expose dead code - not just insufficient unit tests.
Apr 03 2013
On Wednesday, April 03, 2013 11:58:20 Walter Bright wrote:
> On 4/3/2013 11:44 AM, Jonathan M Davis wrote:
>> Yes. My point was that 100% should be the goal, whereas I know a number of developers who consider something like 70% to be sufficient - and these are folks who actually believe in writing unit tests. Certainly, expecting to hit 100% with -cov on the first try isn't generally very realistic unless you're always extremely thorough with your tests, and even then, it's easy to miss a line or two on rarer branches, especially as functions become more complex.
>
> Cov testing also has a tendency to expose dead code - not just insufficient unit tests.

Good point. That's not something that I typically think of - though in a lot of cases (for me personally at least), I think that the greater risk would be functions which weren't called at all by other code but _were_ properly tested, and -cov wouldn't catch that. But finding dead code with cov is definitely something to remember.

I should cov more often anyway. Too often, given how thorough I generally am with unit tests, I tend to assume that the code coverage is there - and it probably is, but it's best to be sure.

- Jonathan M Davis
Apr 03 2013
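As a small illustration of the coverage discussion above, the sketch below has one branch that its unittest never reaches. The function `clampToPercent` and the file name `clamp.d` are invented for this example; the `-cov`, `-unittest` and `-main` switches are dmd's own.

```d
// Assuming this file is saved as clamp.d, build and run it with:
//     dmd -cov -unittest -main clamp.d
//     ./clamp
// The run writes clamp.lst, a listing with an execution count per line.

// Hypothetical function used only for illustration.
int clampToPercent(int value) pure nothrow @safe
{
    if (value < 0)
        return 0;
    if (value > 100)
        return 100;  // not reached by the unittest below, so its count stays at zero
    return value;
}

unittest
{
    assert(clampToPercent(-5) == 0);
    assert(clampToPercent(42) == 42);
    // No case above 100: the listing makes this gap visible.
}
```

A line that stays at zero is either missing a test, as here, or is code that can never run, which is the dead-code case mentioned in the thread.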
On 2013-04-03 05:03, Jonathan M Davis wrote:
> I very much doubt that you could do that unless you specifically formatted the code to take up as few lines as possible and didn't count the unit tests or documentation in that line count. Otherwise, you couldn't do anything even close to what std.datetime does in that few lines. Sure, some functionality could be stripped, but you'd end up with something that did a lot less if it were that small. The unit tests and documentation do make it seem like a lot more code than it is, since they take up well over half the file (probably 3/4), but you'd definitely lose functionality with that few lines of code, and you'd end up with something very poor IMHO if those 2000 lines included the documentation and unit tests. You'd either end up with something that was very bare-bones and/or something which was poorly tested, and given how easy it is to screw up some of those date/time calculations, having only a few tests would be a very bad idea.

Since he wrote "2000 lines for all functionality", I don't think he included unit tests or docs/comments.

> std.datetime's unit tests do need some refactoring (some of which I've done, but there's still a fair bit of work to do there), which will definitely reduce the number of LOC that they take up, but I don't agree at all with considering the unit tests as part of the LOC of a file when discussing keeping LOC to a minimum. And while it's good to avoid repetitive unit tests, I'd much rather have repetitive unit tests which are thorough than short ones which aren't. I find your focus on trying to keep unit tests to a minimum to be disturbing and likely to lead to poorly tested code. If anything, we need to be more thorough, not less.
>
> That doesn't mean that the tests need to look like what std.datetime has (particularly since I purposefully avoided loops and other more complicated constructs when I wrote them originally in order to make them as simple and as far from error-prone as possible), but unit tests need to be thorough, and while we're getting better, Phobos' unit tests frequently aren't thorough enough (particularly in std.range and std.algorithm when it comes to testing a variety of range types). Too many of them just test a few cases to make sure that the most obvious stuff works rather than making sure they test corner cases and whatnot.
>
> - Jonathan M Davis

I actually prefer to have repetitive unit tests and not using loops to make it clear what they actually do. Here's an example from our code base, in Ruby:

    describe "Swedish" do
      subject { build(:address) { |a| a.country_id = Country::SWEDEN } }

      it { should validate_postal_code(12345) }
      it { should validate_postal_code(85412) }

      it { should_not validate_postal_code(123) }
      it { should_not validate_postal_code(123456) }

      it { should_not validate_postal_code("05412") }
      it { should_not validate_postal_code("fooba") }
    end

    describe "Finnish" do
      subject { build(:address) { |a| a.country_id = Country::FINLAND } }

      it { should validate_postal_code(12345) }
      it { should validate_postal_code(12354) }
      it { should validate_postal_code(41588) }
      it { should validate_postal_code("00123") }
      it { should validate_postal_code("01588") }
      it { should validate_postal_code("00000") }

      it { should_not validate_postal_code(1234) }
      it { should_not validate_postal_code(123456) }
      it { should_not validate_postal_code("fooba") }
    end

It could be written less repetitive, like this:

    postal_codes = {
      Country::SWEDEN => {
        valid: [12345, 85412],
        invalid: [123, 123456, "05412", "fooba"]
      },
      Country::FINLAND => {
        valid: [12345, 12354, 41588],
        invalid: ["00123", "01588", "00000", 1234, 123456, "fooba"]
      }
    }

    postal_codes.each do |country_id, postal_codes|
      describe c.english_name do
        subject { build(:address) { |a| a.country_id = country_id } }

        postal_codes[:valid].each do |postal_code|
          it { should validate_postal_code(postal_code) }
        end

        postal_codes[:invalid].each do |postal_code|
          it { should_not validate_postal_code(postal_code) }
        end
      end
    end

But I don't think that looks any better. I think it's much worse.

--
/Jacob Carlborg
Apr 02 2013
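For readers who want the two testing styles from the post above side by side in D, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Country` enum, the `isValidPostalCode` function and its rule (five digits, Swedish codes not starting with '0') are invented to mirror the Ruby cases and are not taken from any real code base.

```d
enum Country { sweden, finland }

// Hypothetical validator, invented for this sketch.
bool isValidPostalCode(Country country, string code) pure nothrow @safe
{
    if (code.length != 5)
        return false;
    foreach (char c; code)
        if (c < '0' || c > '9')
            return false;
    // Assumed rule: Swedish codes may not start with '0'.
    return country != Country.sweden || code[0] != '0';
}

// Style 1: one assertion per case, no loops.
unittest
{
    assert( isValidPostalCode(Country.sweden,  "12345"));
    assert(!isValidPostalCode(Country.sweden,  "05412"));
    assert( isValidPostalCode(Country.finland, "00123"));
    assert(!isValidPostalCode(Country.finland, "fooba"));
}

// Style 2: the same cases as a table, checked by a single loop.
unittest
{
    static struct Case { Country country; string code; bool expected; }

    immutable Case[] cases = [
        Case(Country.sweden,  "12345", true),
        Case(Country.sweden,  "05412", false),
        Case(Country.finland, "00123", true),
        Case(Country.finland, "fooba", false),
    ];

    foreach (c; cases)
        // The failing input is passed as the assertion message.
        assert(isValidPostalCode(c.country, c.code) == c.expected, c.code);
}
```

The first style reports the failing source line directly. The second keeps adding a case down to one table row, at the cost of a loop that itself has to be correct. Which trade-off is better is exactly what the thread is arguing about.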
On 4/3/13 2:53 AM, Jacob Carlborg wrote:
> On 2013-04-03 05:03, Jonathan M Davis wrote:
>
> I actually prefer to have repetitive unit tests and not using loops to make it clear what they actually do. Here's an example from our code base, in Ruby:
>
>     describe "Swedish" do
>       subject { build(:address) { |a| a.country_id = Country::SWEDEN } }
>
>       it { should validate_postal_code(12345) }
>       it { should validate_postal_code(85412) }
>
>       it { should_not validate_postal_code(123) }
>       it { should_not validate_postal_code(123456) }
>
>       it { should_not validate_postal_code("05412") }
>       it { should_not validate_postal_code("fooba") }
>     end
>
>     describe "Finnish" do
>       subject { build(:address) { |a| a.country_id = Country::FINLAND } }
>
>       it { should validate_postal_code(12345) }
>       it { should validate_postal_code(12354) }
>       it { should validate_postal_code(41588) }
>       it { should validate_postal_code("00123") }
>       it { should validate_postal_code("01588") }
>       it { should validate_postal_code("00000") }
>
>       it { should_not validate_postal_code(1234) }
>       it { should_not validate_postal_code(123456) }
>       it { should_not validate_postal_code("fooba") }
>     end
>
> It could be written less repetitive, like this:
>
>     postal_codes = {
>       Country::SWEDEN => {
>         valid: [12345, 85412],
>         invalid: [123, 123456, "05412", "fooba"]
>       },
>       Country::FINLAND => {
>         valid: [12345, 12354, 41588],
>         invalid: ["00123", "01588", "00000", 1234, 123456, "fooba"]
>       }
>     }
>
>     postal_codes.each do |country_id, postal_codes|
>       describe c.english_name do
>         subject { build(:address) { |a| a.country_id = country_id } }
>
>         postal_codes[:valid].each do |postal_code|
>           it { should validate_postal_code(postal_code) }
>         end
>
>         postal_codes[:invalid].each do |postal_code|
>           it { should_not validate_postal_code(postal_code) }
>         end
>       end
>     end
>
> But I don't think that looks any better. I think it's much worse.

The way I see it, the first is terrible and the second asks for better focus on a data-driven approach.

Andrei
Apr 03 2013
On 2013-04-03 19:39, Andrei Alexandrescu wrote:
> The way I see it, the first is terrible and the second asks for better focus on a data-driven approach.

Stupid me, posting on Ruby.

--
/Jacob Carlborg
Apr 03 2013
On 4/3/13 2:55 PM, Jacob Carlborg wrote:
> On 2013-04-03 19:39, Andrei Alexandrescu wrote:
>> The way I see it, the first is terrible and the second asks for better focus on a data-driven approach.
>
> Stupid me, posting on Ruby.

I was referring to the repeatability of the code used in testing, which is language-independent.

Andrei
Apr 03 2013
On 2013-04-03 22:50, Andrei Alexandrescu wrote:
> I was referring to the repeatability of the code used in testing, which is language-independent.

I think the first one is far more readable than the one using the loop.

--
/Jacob Carlborg
Apr 04 2013
On 4/4/13 10:26 AM, Jacob Carlborg wrote:
> On 2013-04-03 22:50, Andrei Alexandrescu wrote:
>> I was referring to the repeatability of the code used in testing, which is language-independent.
>
> I think the first one is far more readable than the one using the loop.

I understand. And I think you are very wrong about that.

Andrei
Apr 04 2013
On Wednesday, April 03, 2013 08:53:09 Jacob Carlborg wrote:On 2013-04-03 05:03, Jonathan M Davis wrote:That may be, but he does seem to have a habit of including the unit tests in the line count when he doesn't like how many lines of code a new piece of functionality takes up.I very much doubt that you could do that unless you specifically formatted the code to take up as few lines as possible and didn't count the unit tests or documentation in that line count. Otherwise, you couldn't do anything even close to what std.datetime does in that few lines. Sure, some functionality could be stripped, but you'd end up with something that did a lot less if it were that small. The unit tests and documentation do make it seem like a lot more code than it is, since they take up well over half the file (probably 3/4), but you'd definitely lose functionality with that few lines of code, and you'd end up with something very poor IMHO if those 2000 lines included the documentation and unit tests. You'd either end up with something that was very bare-bones and/or something which was poorly tested, and given how easy it is to screw up some of those date/time calculations, having only a few tests would be a very bad idea.Since he wrote "2000 lines for all functionality", I don't think he included unit tests or docs/comments.In general, I agree, because I think that straight-forward tests that avoid loops and the like are far less error-prone, and you need the tests to not be buggy. I don't want to have to test my test code to make sure that it works correctly. However, I _do_ think that there's something to be said for refactoring the tests later (after the code supposedly fully works) to use loops and other more complicated constructs, because not only can that lead to more compact tests, but it also makes it much easier to make the tests more thorough (without taking many more lines of code). 
I just think that _starting out_ with the more complicated tests is not necessarily a good idea. Treating unit testing code as if it were the same is normal code doesn't make sense to me, if nothing else, because that would indicate that you're going to have to test your test code, since normal code is complicated enough to require testing. But Andrei and I have argued about this before, and I don't expect us to agree ever on it. - Jonathan M Davisstd.datetime's unit tests do need some refactoring (some of which I've done, but there's still a fair bit of work to do there), which will definitely reduce the number of LOC that they take up, but I don't agree at all with considering the unit tests as part of the LOC of file when discussing keeping LOC to a minimum. And while it's good to avoid repetitive unit tests, I'd much rather have repetitive unit tests which are thorough than short ones which aren't. I find your focus on trying to keep unit tests to a minimum to be disturbing and likely to lead to poorly tested code. If anything, we need to be more thorough, not less. That doesn't mean that the tests need to look like what std.datetime has (particularly since I purposefully avoided loops and other more complicated constructs when I wrote them originally in order to make them as simple and as far from error-prone as possible), but unit tests need to be thorough, and while we're getting better, Phobos' unit tests frequently aren't thorough enough (particularly in std.range and std.algorithm when it comes to testing a variety of range types). Too many of them just test a few cases to make sure that the most obvious stuff works rather than making sure they test corner cases and whatnot. - Jonathan M DavisI actually prefer to have repetitive unit tests and not using loops to make it clear what they actually do. 
Here's an example from our code base, in Ruby: describe "Swedish" do subject { build(:address) { |a| a.country_id = Country::SWEDEN } } it { should validate_postal_code(12345) } it { should validate_postal_code(85412) } it { should_not validate_postal_code(123) } it { should_not validate_postal_code(123456) } it { should_not validate_postal_code("05412") } it { should_not validate_postal_code("fooba") } end describe "Finnish" do subject { build(:address) { |a| a.country_id = Country::FINLAND } } it { should validate_postal_code(12345) } it { should validate_postal_code(12354) } it { should validate_postal_code(41588) } it { should validate_postal_code("00123") } it { should validate_postal_code("01588") } it { should validate_postal_code("00000") } it { should_not validate_postal_code(1234) } it { should_not validate_postal_code(123456) } it { should_not validate_postal_code("fooba") } end It could be written less repetitive, like this: postal_codes = { Country::SWEDEN => { valid: [12345, 85412], invalid: [123, 123456, "05412", "fooba"] }, Country::FINLAND => { valid: [12345, 12354, 41588], invalid: ["00123", "01588", "00000", 1234, 123456, "fooba"] } } postal_codes.each do |country_id, postal_codes| describe c.english_name do subject { build(:address) { |a| a.country_id = country_id } } postal_codes[:valid].each do |postal_code| it { should validate_postal_code(postal_code) } end postal_codes[:invalid].each do |postal_code| it { should_not validate_postal_code(postal_code) } end end end But I don't think that looks any better. I think it's much worse.
Apr 03 2013
On 2013-04-03 20:08, Jonathan M Davis wrote:In general, I agree, because I think that straight-forward tests that avoid loops and the like are far less error-prone, and you need the tests to not be buggy. I don't want to have to test my test code to make sure that it works correctly. However, I _do_ think that there's something to be said for refactoring the tests later (after the code supposedly fully works) to use loops and other more complicated constructs, because not only can that lead to more compact tests, but it also makes it much easier to make the tests more thorough (without taking many more lines of code). I just think that _starting out_ with the more complicated tests is not necessarily a good idea. Treating unit testing code as if it were the same is normal code doesn't make sense to me, if nothing else, because that would indicate that you're going to have to test your test code, since normal code is complicated enough to require testing. But Andrei and I have argued about this before, and I don't expect us to agree ever on it.I do refactor tests, but mostly the data. At work I think we have pretty DRY tests, mostly the data. Using factories and other functionality to keep the code simple and DRY. "validate_postal_code" is a function written specifically for the tests above to keep it DRY. -- /Jacob Carlborg
Apr 03 2013
On 4/2/13 11:03 PM, Jonathan M Davis wrote:
> I find your focus on trying to keep unit tests to a minimum to be disturbing and likely to lead to poorly tested code.

Well that's quite the assumption.

Andrei
Apr 03 2013
On Wednesday, April 03, 2013 13:37:40 Andrei Alexandrescu wrote:
> On 4/2/13 11:03 PM, Jonathan M Davis wrote:
>> I find your focus on trying to keep unit tests to a minimum to be disturbing and likely to lead to poorly tested code.
>
> Well that's quite the assumption.

If you push for the lines of unit testing code to be kept to a minimum, I don't see how you can possibly expect stuff to be thoroughly tested. There are times that better written tests take up less space, but testing isn't free, and if anything, we need more of it, not less, if we want to make sure that all of Phobos works correctly. And on multiple occasions now, you've balked at what I would consider to be properly thorough unit tests and wanted them to be reduced in size. And since that generally means testing fewer things, I think that it's pretty much a sure thing that it's generally going to lead to poorer testing and increase the risk of code being buggy.

- Jonathan M Davis
Apr 03 2013
On 4/3/2013 10:58 AM, Jonathan M Davis wrote:
> If you push for the lines of unit testing code to be kept to a minimum, I don't see how you can possibly expect stuff to be thoroughly tested.

My idea of perfection would be 100% coverage with zero redundancy in the unittests.

In my experience with testing, the "quantity has a quality all its own" style of testing does not produce adequate test coverage. It simply takes a lot of time to run (which makes it less useful, as one then tends to avoid running the tests).
Apr 03 2013
On Wednesday, April 03, 2013 11:36:54 Walter Bright wrote:On 4/3/2013 10:58 AM, Jonathan M Davis wrote:Well, determining what's actually redundant isn't always easy. If a test is clearly redundant, then it makes sense to remove it, but if you're not careful with that (especially if you're basing your tests off of what the current code looks like), then it can be easy to remove tests which were basically unnecessary with the current implementation but which would have caught bugs when the code was refactored. So, while in principle, I agree that having zero redundancy would be good, in practice, I don't think that it's that straightforward. I also don't think that code coverage means much beyond the fact that if you don't have 100% (minus the lines of code that can never be hit - e.g. assert(0);), then obviously some stuff isn't tested properly. You need to hit all of the corner cases and whatnot which may not work correctly yet or which may get broken when refactoring, and often, 100% test coverage doesn't get you there, much as it's an important milestone. Certainly, I agree that having the minimal tests required to test everything that needs testing should be the goal, but figuring out which tests are and aren't really needed is a bit of art. Personally, I do tend to err on the side of over-testing rather than under-testing though, as that does a better job of ensuring that the code is correct. Actually, I'd argue that in perfect world, you'd test absolutely every possible input to make sure that it had the correct output, but that's obviously impossible in all but the most simplistic code, and actually attempting that approach just leads to unit tests which take too long to run. - Jonathan M DavisIf you push for the lines of unit testing code to be kept to a minimum, I don't see how you can possibly expect stuff to be thoroughly tested.My idea of perfection would be 100% coverage with zero redundancy in the unittests. 
In my experience with testing, the technique of "quantity has a quality all its own" style of testing does not produce adequate test coverage - it just simply takes a lot of time to run (which makes it less useful, as one then tends to avoid running them).
Apr 03 2013
On 4/3/2013 11:56 AM, Jonathan M Davis wrote:
> Certainly, I agree that having the minimal tests required to test everything that needs testing should be the goal, but figuring out which tests are and aren't really needed is a bit of art.

That's why we are engineers, and not mere code monkeys.

> Actually, I'd argue that in perfect world, you'd test absolutely every possible input to make sure that it had the correct output, but that's obviously impossible in all but the most simplistic code,

We can exploit mathematics to reduce the test cases while testing thoroughly. In physics I learned to test one's solution with the boundary cases and a couple of known cases. Mathematically, that was sufficient.
Apr 03 2013
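The approach Walter describes, boundary cases plus a couple of known cases, can be illustrated with the kind of date calculation the thread says is easy to get wrong. The sketch below is in Ruby; `leap_year?` is a hypothetical helper written for this example and is not taken from std.datetime. The table covers the boundaries of the Gregorian rule (century years) and a few ordinary years.

```ruby
# Hypothetical helper implementing the Gregorian leap-year rule:
# divisible by 4, except century years, unless divisible by 400.
def leap_year?(year)
  (year % 4).zero? && (!(year % 100).zero? || (year % 400).zero?)
end

# Boundary cases sit where the rule changes; known cases are ordinary years.
CASES = {
  1900 => false, # boundary: century year not divisible by 400
  2000 => true,  # boundary: century year divisible by 400
  2100 => false, # boundary: next century year
  2012 => true,  # known leap year
  2013 => false, # known common year
  2016 => true   # known leap year
}.freeze

failures = CASES.reject { |year, expected| leap_year?(year) == expected }.keys

puts failures.empty? ? "all leap-year cases pass" : "failed for: #{failures.inspect}"
```

Six cases exercise every branch of the rule, which is the sense in which a small, well-chosen set can stand in for testing every year. As noted in the thread, the hard part in real code is finding all the boundaries in the first place.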
On Wednesday, April 03, 2013 12:03:39 Walter Bright wrote:
> On 4/3/2013 11:56 AM, Jonathan M Davis wrote:
>> Certainly, I agree that having the minimal tests required to test everything that needs testing should be the goal, but figuring out which tests are and aren't really needed is a bit of art.
>
> That's why we are engineers, and not mere code monkeys.

True.

>> Actually, I'd argue that in perfect world, you'd test absolutely every possible input to make sure that it had the correct output, but that's obviously impossible in all but the most simplistic code,
>
> We can exploit mathematics to reduce the test cases while testing thoroughly. In physics I learned to test one's solution with the boundary cases and a couple of known cases. Mathematically, that was sufficient.

Definitely, though in some cases, figuring out the boundary cases can be quite tricky - e.g. as thorough as std.datetime's unit tests are, I still missed some in one instance and got a bug report early on for that (though on the whole, there have been very few bugs reported on std.datetime, so I think that the unit tests have been quite effective). But getting good at figuring that sort of thing out _is_ part of our job description.

- Jonathan M Davis
Apr 03 2013
On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:
> If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.

I think you are massively underestimating the complexity and subtleties of dates and time.

For comparison, min and max in std.algorithm come to nearly 200 lines on their own, and their unittests are hopelessly lacking. Things like min(uint.min, int.max) are not tested, even though there's specific code to handle them.

To suggest that date and time handling is a mere 10x more complex than min/max is a bit naive in my opinion.
Apr 03 2013
03-Apr-2013 19:55, Peter Alexander wrote:
> On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:
>> If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.
>
> I think you are massively underestimating the complexity and subtleties of dates and time.

+1

> For comparison, min and max in std.algorithm come to nearly 200 lines on their own, and their unittests are hopelessly lacking. Things like min(uint.min, int.max) are not tested, even though there's specific code to handle them. To suggest that date and time handling is a mere 10x more complex than min/max is a bit naive in my opinion.

--
Dmitry Olshansky
Apr 03 2013
On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:
> +1

Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.

Not to pick on you, but I see this a lot here from many of our participants and finally felt compelled to speak up!

And yes, I know that sometimes people complain that I do the opposite in not quoting enough of the parent.
Apr 03 2013
On Wed, 03 Apr 2013 14:42:12 -0400, Walter Bright <newshound2 digitalmars.com> wrote:On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:I couldn't disagree more. The given +1 had 4 lines of context. There was some straggling text after it, but this was only an additional 5 lines. My newsreader highlights replied-to text in different colors depending on the level of indent. I can immediately pick out new replies, and if I don't want to read the re-posted stuff, I don't have to, unless I want to for context. Newsreaders are known not to thread things properly, and some people's posts don't thread properly ANYWHERE. Context is important.+1Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.Not to pick on you, but I see this a lot here from many of our participants and finally felt compelled to speak up!I find posts that are solely about how you didn't "post properly" annoying. Kind of like compulsively telling someone they didn't use correct grammar (for which I have to fight my instincts in order to remain married). Sorry, I had to say something ;) -Steve
Apr 03 2013
On 4/3/13 11:24 PM, Steven Schveighoffer wrote:On Wed, 03 Apr 2013 14:42:12 -0400, Walter Bright <newshound2 digitalmars.com> wrote:I'm with Walter. The top context was fine for that message. The bottom was not seeing as the poster had nothing to say about it. Deleting the bottom is good common courtesy. Walter himself used to leave vast amounts of trailing context in our communication, and it saved me significant time when he started to consistently trim it. With trailing chaff, essentially every reader needs to scroll down to find "is there anything more this guy wanted to add"? Some don't even insert an empty line.On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:I couldn't disagree more. The given +1 had 4 lines of context. There was some straggling text after it, but this was only an additional 5 lines.+1Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.My newsreader highlights replied-to text in different colors depending on the level of indent. I can immediately pick out new replies, and if I don't want to read the re-posted stuff, I don't have to, unless I want to for context.Mine too, but that doesn't make the problem go away.Newsreaders are known not to thread things properly, and some people's posts don't thread properly ANYWHERE. Context is important.Yes, just not trailing chaff.Such posts are good because netiquette is not as widespread and as agreed upon as grammar. AndreiNot to pick on you, but I see this a lot here from many of our participants and finally felt compelled to speak up!I find posts that are solely about how you didn't "post properly" annoying. Kind of like compulsively telling someone they didn't use correct grammar (for which I have to fight my instincts in order to remain married). Sorry, I had to say something ;)
Apr 04 2013
On Thu, 04 Apr 2013 09:25:30 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 4/3/13 11:24 PM, Steven Schveighoffer wrote:Mac mail fixed this problem for me. All previously received text is folded out, no need to look at it.On Wed, 03 Apr 2013 14:42:12 -0400, Walter Bright <newshound2 digitalmars.com> wrote:I'm with Walter. The top context was fine for that message. The bottom was not seeing as the poster had nothing to say about it. Deleting the bottom is good common courtesy. Walter himself used to leave vast amounts of trailing context in our communication, and it saved me significant time when he started to consistently trim it. With trailing chaff, essentially every reader needs to scroll down to find "is there anything more this guy wanted to add"? Some don't even insert an empty line.On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:I couldn't disagree more. The given +1 had 4 lines of context. There was some straggling text after it, but this was only an additional 5 lines.+1Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.It doesn't? It pretty much fixes it for me. I can see exactly what the new text is via it's color.My newsreader highlights replied-to text in different colors depending on the level of indent. I can immediately pick out new replies, and if I don't want to read the re-posted stuff, I don't have to, unless I want to for context.Mine too, but that doesn't make the problem go away.I agree, it's not necessary. But it's not worth a public scolding either.Newsreaders are known not to thread things properly, and some people's posts don't thread properly ANYWHERE. Context is important.Yes, just not trailing chaff.Such posts are annoying precisely because there is no agreed upon netiquette. There is no "Right way" to post. 
It's actually kind of ironic that grammar is NOT policed here as much, simply because we all agree to post in English, and that's not always the author's native language. -SteveSuch posts are good because netiquette is not as widespread and as agreed upon as grammar.Not to pick on you, but I see this a lot here from many of our participants and finally felt compelled to speak up!I find posts that are solely about how you didn't "post properly" annoying. Kind of like compulsively telling someone they didn't use correct grammar (for which I have to fight my instincts in order to remain married). Sorry, I had to say something ;)
Apr 04 2013
On Thursday, 4 April 2013 at 18:00:27 UTC, Steven Schveighoffer wrote:Mac mail fixed this problem for me. All previously received text is folded out, no need to look at it.So there is a lot of visual noise for nothing, and you like it ? And what if one uses the web forum, like me ? Or Thunderbird ? Do we need to buy a mac and use your newsreader ? Seriously, the netiquette *demands* that you trim previous mails to keep only the necessary. If everybody was doing like you, we would end up having posts hundreds of lines long, most of which being noise.
Apr 04 2013
On Fri, 05 Apr 2013 02:16:02 -0400, SomeDude <lovelydear mailmetrash.com> wrote:On Thursday, 4 April 2013 at 18:00:27 UTC, Steven Schveighoffer wrote:I like that I don't have to deal with it. I also don't have to deal with it if the person deletes the replied-to text. In other words, it takes all forms, and gives me what I need to read.Mac mail fixed this problem for me. All previously received text is folded out, no need to look at it.So there is a lot of visual noise for nothing, and you like it ?And what if one uses the web forum, like me ? Or Thunderbird ? Do we need to buy a mac and use your newsreader ?No, I'm just stating that I don't have that problem. That is with email though, mac mail doesn't do newsgroups. It's not a solution for you, it's just that I realized I don't have to ever deal with this anymore, which I hadn't thought about.Seriously, the netiquette *demands* that you trim previous mails to keep only the necessary.There is no technical requirement for this. I don't think any of this would be grounds for banning here, so as long as you get your point across, I don't see a problem. There is the notion that if you make your posts annoying to read, less people will read them. But for this specific instance, I found 9 lines of context not to be a burden, even though 5 lines were unneeded.If everybody was doing like you, we would end up having posts hundreds of lines long, most of which being noise.I typically trim down my posts to the relevant information. I do it because it makes my point come across much better. -Steve
Apr 05 2013
On Wednesday, 3 April 2013 at 18:42:14 UTC, Walter Bright wrote:
> On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:
>> +1
>
> Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.

+1

I hate having to scroll down just to read a one-liner that adds nearly nothing to a long post. It gives an impression of laziness on the part of the author.
Apr 04 2013
On 4/3/13 11:55 AM, Peter Alexander wrote:On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:May as well. I recall before I approved std.datetime I looked at the implementation sizes of similar functionality in other languages; they were all rather bulky, but std.datetime was at the high end of the range.If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.I think you are massively underestimating the complexity and subtleties of dates and time.For comparison, min and max in std.algorithm come to nearly 200 lines on their own, and their unittests are hopelessly lacking. Things like min(uint.min, int.max) are not tested, even though there's specific code to handle them. To suggest that date and time handling is a mere 10x more complex than min/max is a bit naive in my opinion.To put things in perspective, std.datetime has 34K lines, whereas std.algorithm has under 12K lines. The entire std/ has 191K lines. I'd be hard pressed to assess that that high proportion is justified. Say we set out to fit std.datetime in e.g. 20K lines without loss in functionality or testing, which I'd find more reasonable. I think the result would force overall better engineering of the entire thing (and in particular better use of data structures) - constraints may be liberating. Andrei
Apr 03 2013
On Wednesday, 3 April 2013 at 17:08:57 UTC, Andrei Alexandrescu wrote:On 4/3/13 11:55 AM, Peter Alexander wrote:Boost datetime is 27k. Just the headers comes to 17k. A 2k budget for a date time library is unreasonable unless you don't want anyone using D for anything serious involving dates and times. They are complex and require a lot of code to get right. Perhaps 34k is too large but 2k is laughable.On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:May as well. I recall before I approved std.datetime I looked at the implementation sizes of similar functionality in other languages; they were all rather bulky, but std.datetime was at the high end of the range.If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.I think you are massively underestimating the complexity and subtleties of dates and time.
Apr 03 2013
On Wednesday, April 03, 2013 19:59:37 Brad Anderson wrote:
> Perhaps 34k is too large but 2k is laughable.

I really should strip out the unit tests and documentation to see what the line count of actual code is, as something like 75% of that is unit tests and documentation, and IIRC, std.datetime provides most of the functionality that Boost does plus some more, though it does some weird, complicated stuff with its header files from what I recall. I'd hate to be the maintainer of Boost's datetime stuff.

- Jonathan M Davis
Apr 03 2013
On 4/3/13 1:59 PM, Brad Anderson wrote:On Wednesday, 3 April 2013 at 17:08:57 UTC, Andrei Alexandrescu wrote:Agreed. I just pulled that number randomly without having looked at the current line count. AndreiOn 4/3/13 11:55 AM, Peter Alexander wrote:Boost datetime is 27k. Just the headers comes to 17k. A 2k budget for a date time library is unreasonable unless you don't want anyone using D for anything serious involving dates and times. They are complex and require a lot of code to get right. Perhaps 34k is too large but 2k is laughable.On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:May as well. I recall before I approved std.datetime I looked at the implementation sizes of similar functionality in other languages; they were all rather bulky, but std.datetime was at the high end of the range.If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.I think you are massively underestimating the complexity and subtleties of dates and time.
Apr 03 2013
On 2013-04-03, 20:04, Jonathan M Davis wrote:
> On Wednesday, April 03, 2013 19:59:37 Brad Anderson wrote:
>> Perhaps 34k is too large but 2k is laughable.
>
> I really should strip out the unit tests and documentation to see what the line count of actual code is, as something like 75% of that is unit tests and documentation, and IIRC, std.datetime provides most of the functionality that Boost does plus some more, though it does some weird, complicated stuff with its header files from what I recall. I'd hate to be the maintainer of Boost's datetime stuff.

Removed all comments, unittests, and empty lines from std.datetime. File went from 34070 to 5843 lines.

--
Simen
Apr 03 2013
On Wednesday, 3 April 2013 at 19:28:56 UTC, Simen Kjaeraas wrote:
> On 2013-04-03, 20:04, Jonathan M Davis wrote:
>> On Wednesday, April 03, 2013 19:59:37 Brad Anderson wrote:
>>> Perhaps 34k is too large but 2k is laughable.
>>
>> I really should strip out the unit tests and documentation to see what the line count of actual code is, as something like 75% of that is unit tests and documentation, and IIRC, std.datetime provides most of the functionality that Boost does plus some more, though it does some weird, complicated stuff with its header files from what I recall. I'd hate to be the maintainer of Boost's datetime stuff.
>
> Removed all comments, unittests, and empty lines from std.datetime. File went from 34070 to 5843 lines.

cloc doesn't support /+ comments... But using your number, cloc, and some math:

loc: 5843
comments: 6255
unittest: 16503
blank: 5469
Apr 03 2013
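As a worked check of the figures in the breakdown above, the four counts sum to the 34070-line total Simen reported, and each category can be expressed as a share of the file. A minimal Ruby sketch of that arithmetic (the variable names are illustrative only; the numbers are the ones posted in the thread):

```ruby
# Line counts for std.datetime as posted in the thread.
counts = {
  code:     5843,
  comments: 6255,
  unittest: 16503,
  blank:    5469
}

total = counts.values.reduce(:+)

# Percentage of the file taken by each category, rounded to one decimal.
share = counts.map { |kind, n| [kind, (100.0 * n / total).round(1)] }.to_h

# Unit tests plus comments, with and without blank lines.
tests_and_docs = counts[:comments] + counts[:unittest]
non_code       = total - counts[:code]

puts "total lines: #{total}"
share.each { |kind, pct| puts format("%-9s %5.1f%%", kind, pct) }
puts format("tests + comments: %.1f%%", 100.0 * tests_and_docs / total)
puts format("everything but code: %.1f%%", 100.0 * non_code / total)
```

By these numbers the code itself is about 17% of the file. Comments plus unit tests come to about 67%, or about 83% once blank lines are included, which brackets the "something like 75%" estimate given earlier in the thread.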
On 2013-04-04 03:47, Jesse Phillips wrote:
> cloc doesn't support /+ comments... But using your number, cloc, and some math

std.datetime contains mostly /+ and // comments. It only contains a single /* comment.

--
/Jacob Carlborg
Apr 04 2013
On Thursday, 4 April 2013 at 14:31:36 UTC, Jacob Carlborg wrote:
> On 2013-04-04 03:47, Jesse Phillips wrote:
>> cloc doesn't support /+ comments... But using your number, cloc, and some math
>
> std.datetime contains mostly /+ and // comments. It only contains a single /* comment.

I realize that; that's the reason I had to use math. cloc reports 11598 (or something near that) as code, so subtracting the actual LOC from that gives me the /+ comments.
Apr 04 2013
On 2013-04-03 21:28, Simen Kjaeraas wrote:
> Removed all comments, unittests, and empty lines from std.datetime. File went from 34070 to 5843 lines.

Heheh, that's more reasonable. That's also why I don't like to have unit tests inline.

--
/Jacob Carlborg
Apr 04 2013
On 04/02/2013 10:44 PM, Andrei Alexandrescu wrote:
> I think it leads to writing less repetitive unittests.
>
> If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.
>
> Andrei

My problem with datetime is that it is too monolithic. I really wish it was split into about 3 different modules. This is frustrating from a user's perspective. The docs for that thing can easily make someone's eyes glaze over.

If you split it up, then the LOC per module would become smaller too, as a side effect.
Apr 05 2013
On Fri, 05 Apr 2013 13:13:29 -0400, Chad Joan <chadjoan gmail.com> wrote:My problem with datetime is that it is too monolithic. I really wish it was split into about 3 different modules. This is frustrating from a user-perspective. The docs for that thing can easily make someone's eyes gloss over.What if the docs were split up? E.g. http://vibed.org/temp/d-programming-language.org/phobos/std/datetime.html -Steve
Apr 05 2013
On Friday, April 05, 2013 13:13:29 Chad Joan wrote:On 04/02/2013 10:44 PM, Andrei Alexandrescu wrote:If/Once some variant of DIPs 15 or 16 is implemented, we'll be able to transparently turn modules into packages - making the package have the same name as the old module and split what was in the old module across multiple modules in the new package. Code will then work exactly as before, importing the package as it were a module but allowing you to import the modules in the package directly in new code if you want to. Then we'll be able to split up larger modules like std.algorithm or std.datetime if we want to - without breaking anyone's code. Once that's available, I have every intention of splitting up std.datetime into separate modules, but doing so before that would break code, which I don't want to do. - Jonathan M DavisI think it leads to writing less repetitive unittests. If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better. AndreiMy problem with datetime is that it is too monolithic. I really wish it was split into about 3 different modules. This is frustrating from a user-perspective. The docs for that thing can easily make someone's eyes gloss over. If you split it up, then the LOC per module would become smaller too, as a side-effect.
Apr 05 2013
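The module-to-package split described above can be sketched with the `package.d` convention that later D compilers support. This is only an illustration under assumptions: the package `mylib.datetime`, its submodule `date`, and the `Date` struct are hypothetical names, not the actual Phobos layout, and the listing shows three separate files rather than one compilable unit.

```d
// --- file: mylib/datetime/date.d (hypothetical submodule) ---
module mylib.datetime.date;

/// Part of what used to live in the single large module.
struct Date
{
    int year, month, day;
}

// --- file: mylib/datetime/package.d ---
// The package module carries the old module's name and re-exports
// the submodules, so existing imports keep compiling unchanged.
module mylib.datetime;

public import mylib.datetime.date;

// --- file: app.d ---
import mylib.datetime;            // old-style import still works
// import mylib.datetime.date;    // new code may import one submodule directly

void main()
{
    auto d = Date(2013, 4, 5);    // found through the public import
    assert(d.year == 2013);
}
```

Built with something like `dmd app.d mylib/datetime/package.d mylib/datetime/date.d`, user code that only ever wrote `import mylib.datetime;` would not need to change after the split.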
On 4/5/13 11:17 AM, Jonathan M Davis wrote:
> On Friday, April 05, 2013 13:13:29 Chad Joan wrote:
>> On 04/02/2013 10:44 PM, Andrei Alexandrescu wrote:
>>> I think it leads to writing less repetitive unittests. If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.
>>> Andrei
>> My problem with datetime is that it is too monolithic. I really wish it was split into about 3 different modules. This is frustrating from a user-perspective. The docs for that thing can easily make someone's eyes gloss over. If you split it up, then the LOC per module would become smaller too, as a side-effect.
> If/Once some variant of DIPs 15 or 16 is implemented, we'll be able to transparently turn modules into packages - making the package have the same name as the old module and split what was in the old module across multiple modules in the new package. Code will then work exactly as before, importing the package as it were a module but allowing you to import the modules in the package directly in new code if you want to. Then we'll be able to split up larger modules like std.algorithm or std.datetime if we want to - without breaking anyone's code. Once that's available, I have every intention of splitting up std.datetime into separate modules, but doing so before that would break code, which I don't want to do.
> - Jonathan M Davis

I believe it's really not a module issue at all, but a doc issue. The two are directly tied today, but I have _no_ problem with importing the module and using it as is. Yes, it's large in terms of lines in the file, but really, who's affected by that, and how often? Few and seldom. Breaking it up just because of docs is like ripping a book into 10 books just because you want to only carry one chapter around.
Apr 05 2013
On 2013-04-03 04:13, Jonathan M Davis wrote:
> Yes, though I've had complaints before about a pull being too much code where the unit tests were considered part of the code, and the reviewer thought that number of lines was too great to be worth adding, even if the number of lines of normal code was relatively small. And that sort of attitude would just lead to not properly unit testing stuff. And while we do some great unit testing (the built-in unit test feature is a _huge_ success in that regard), there are at least some areas where we really need to step up our game on that (with ranges in particular given all of the variations of them there are and how many static if branches many range-based functions have).

The problem is having the unit tests in the same file. Yes, I know, most of you love it, I don't.
-- /Jacob Carlborg
Apr 02 2013
On 4/3/13, Jacob Carlborg <doob me.com> wrote:
> The problem is having the unit tests in the same file. Yes, I know, most of you love it, I don't.

One thing I noticed is that having unittests in separate files can catch issues with template mixins. If you have any private or protected functions that are used by a mixin template, the mixin template will not compile once the user tries to use it in his own code. There are workarounds, of course, like putting functions inside of the template. But the point still stands that you need to also test the library externally.

Another thing local unittests don't test for is symbol clashes. If a user imports lib.a and lib.b from your library, he probably doesn't expect to get symbol clashes. In fact Phobos has had symbol clashes before, and we're working on getting rid of them (e.g. through deprecation stages). But if Phobos also had external test-cases then we could have avoided symbol clashes to begin with.
Apr 02 2013
On 2013-04-03 08:45, Andrej Mitrovic wrote:
> One thing I noticed is that having unittests in separate files can catch issues with template mixins. If you have any private or protected functions that are used by a mixin template, the mixin template will not compile once the user tries to use it in his own code. There are workarounds, of course, like putting functions inside of the template. But the point still stands that you need to also test the library externally.

I didn't think about that.

> Another thing local unittests don't test are symbol clashes. If a user imports lib.a and lib.b from your library, he probably doesn't expect to get symbol clashes.

Most likely not, but there's nothing wrong with it. We do have modules for a reason. It's fairly easy for the user to solve if the issue comes up. If there are some common names that always clash, then there are some problems.

> In fact Phobos has had symbol clashes before, and we're working on getting rid of them (e.g. through deprecation stages). But if Phobos also had external test-cases then we could have avoided symbol clashes to begin with.

I don't know if that's something unit tests should explicitly test for.
-- /Jacob Carlborg
Apr 03 2013
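The user-side fixes for a symbol clash mentioned above are D's renamed, static, and selective imports. The sketch below uses Phobos modules only as stand-ins; it assumes no particular clash between them and simply shows the three ways of keeping imported names apart.

```d
import alg = std.algorithm;   // renamed import: symbols are reachable only as alg.name
static import std.math;       // static import: symbols need the full module path
import std.string : strip;    // selective import: only strip is brought into scope

unittest
{
    // Each call names its module explicitly or was imported selectively,
    // so an identically named symbol in another module cannot collide.
    assert(alg.max(3, 7) == 7);
    assert(std.math.abs(-5) == 5);
    assert(strip("  d  ") == "d");
}
```

Compiling with `dmd -unittest -main` runs the block. Note that these are fixes applied at the import site; they do nothing to detect a clash inside the library itself.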
On 4/3/13, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
> If you have any private or protected functions

I meant private or package.

> One thing I noticed is that having unittests in separate files can catch issues with template mixins.

I wonder if there's a way to mitigate that problem with a language feature. Perhaps marking the unittest as 'extern' would make the unittest only have access to public symbols in the module. That way you never get into the situation where testing something from within a unittest seems to work only because you forgot that you're calling a private or package function.
Apr 02 2013
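The situation described above is easy to reproduce in a single file. In this minimal sketch the names `square` and `sumOfSquares` are made up for illustration; the point is that an in-module unittest can call a private helper that no importing module could reach.

```d
// Imagine this file is a library module.

// Private helper: not accessible to code that imports this module.
private int square(int x) pure nothrow @safe
{
    return x * x;
}

/// Public API built on top of the private helper.
int sumOfSquares(int a, int b) pure nothrow @safe
{
    return square(a) + square(b);
}

unittest
{
    // Compiles only because the unittest lives in the same module;
    // the same call from a user's module would be rejected.
    assert(square(3) == 9);

    // This is the only call an external user could actually make.
    assert(sumOfSquares(3, 4) == 25);
}
```

Run with `dmd -unittest -main`. Moving the unittest block into a separate module that imports this one would make the first assertion fail to compile, which is the kind of external check being asked for.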
On 04/02/2013 08:01 PM, Walter Bright wrote:
> On 4/2/2013 4:55 PM, Jesse Phillips wrote:
>> I usually find the built-in unittests to cause more skew since those are counted as LOC.
> Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

I think this has made me a much better programmer. And it did so a long time ago. Big win!
Apr 05 2013
On Tuesday, 2 April 2013 at 23:55:19 UTC, Jesse Phillips wrote:
> On Tuesday, 2 April 2013 at 17:33:13 UTC, Walter Bright wrote:
>> On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
>>> I also have a strong feeling that LOC per commit reflects too many different factors to be really reliable as a comparison, e.g. it probably depends quite strongly on the age/maturity of a project, the rate of development, and other factors.
>> Consider also that these LOC numbers are not lines of code - they're also lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.
> While I don't know what this specific report used, comments are generally factored out of LOC and have their own count. I usually find the built-in unittests to cause more skew since those are counted as LOC.

He certainly didn't factor out comments for all languages, meaning that he didn't do it at all.
Apr 04 2013
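To make the LOC skew concrete, here is a small sketch in which the ddoc comment and the unittest each take more lines than the function they accompany. The function `daysInMonth` is a made-up example, not code from std.datetime, and it deliberately ignores leap years.

```d
/**
 * Returns the number of days in a month of a non-leap year.
 *
 * Params:
 *     month = month number, from 1 (January) to 12 (December)
 *
 * Returns: the day count for that month.
 */
int daysInMonth(int month) pure nothrow @safe
{
    static immutable int[12] days = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
    return days[month - 1];
}

unittest
{
    assert(daysInMonth(1) == 31);
    assert(daysInMonth(2) == 28);
    assert(daysInMonth(4) == 30);
    assert(daysInMonth(12) == 31);
}
```

A line counter that does not separate comments and unittest blocks from the function body would attribute roughly four times as many lines to this change as the function itself needs.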
Jonas Drewsen:
> Article about the expressiveness of languages with D included as one of the contestants. http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org
Bye, bearophile
Apr 02 2013
On 4/2/13 6:04 AM, bearophile wrote:
> Jonas Drewsen:
>> Article about the expressiveness of languages with D included as one of the contestants. http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/
> I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org
> Bye, bearophile

I meant to comment on this - it's a terrific walkthrough. I think bearophile should convert it into a blog post/article. I think reddit would love it. The suggestions included (such as enumerate()) are also well worth looking into.
Andrei
Apr 02 2013
On 4/2/2013 1:59 PM, Andrei Alexandrescu wrote:
> On 4/2/13 6:04 AM, bearophile wrote:
>> Jonas Drewsen:
>>> Article about the expressiveness of languages with D included as one of the contestants. http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/
>> I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org
>> Bye, bearophile
> I meant to comment on this - it's a terrific walkthrough. I think bearophile should convert it into a blog post/article. I think reddit would love it. The suggestions included (such as enumerate()) are also very worth looking into.

I agree, it's terrific. But perhaps we can just submit it to reddit as is?
Apr 02 2013
On 4/2/13 4:59 PM, Andrei Alexandrescu wrote:
> On 4/2/13 6:04 AM, bearophile wrote:
>> Jonas Drewsen:
>>> Article about the expressiveness of languages with D included as one of the contestants. http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/
>> I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org
>> Bye, bearophile
> I meant to comment on this - it's a terrific walkthrough. I think bearophile should convert it into a blog post/article. I think reddit would love it. The suggestions included (such as enumerate()) are also very worth looking into.

Pinging bearophile on this again - do you want to adapt this into a blog entry? It may be worth posting the link to reddit as is, but one adaptation pass for a larger audience shouldn't hurt. Let us know!
Andrei
Apr 04 2013
Andrei Alexandrescu:
> The suggestions included (such as enumerate()) are also very worth looking into.

I think the enumerate() was discussed mostly elsewhere.

> Pinging bearophile on this again - do you want to adapt this into a blog entry? It may be worth posting the link to reddit as is, but one adaptation pass for a larger audience shouldn't hurt. Let us know!

Thank you for your interest. I like to write articles, but there are significant problems related to that post:
- It's a soup of very different things;
- It suggests things like stream fusion that I think aren't yet discussed in the D community;
- I think it's not good for consumption outside the D community, it focuses on details mostly important for the development of D/Phobos;
- I think some of its contents are half cooked and need some more of my reflection;
- I do not like to show a text two times.
Bye, bearophile
Apr 04 2013
On Friday, 5 April 2013 at 01:55:06 UTC, bearophile wrote:
> Andrei Alexandrescu:
>> Pinging bearophile on this again - do you want to adapt this into a blog entry? It may be worth posting the link to reddit as is, but one adaptation pass for a larger audience shouldn't hurt. Let us know!
> Thank you for your interest. I like to write articles, but there are significant problems related to that post:
> - It's a soup of very different things;
> - It suggests things like stream fusion that I think aren't yet discussed in the D community;
> - I think it's not good for consumption outside the D community, it focuses on details mostly important for the development of D/Phobos;
> - I think some of its contents are half cooked and need some more of my reflection;
> - I do not like to show a text two times.
> Bye, bearophile

I just wanted to say that I also liked the article and I understand why the others would want you to repost it. I think the strengths outweigh the weaknesses you mention, but I do understand not wanting to show the thing twice.
Apr 04 2013
On 4/4/13 10:36 PM, Zach the Mystic wrote:
> On Friday, 5 April 2013 at 01:55:06 UTC, bearophile wrote:
>> Andrei Alexandrescu:
>>> Pinging bearophile on this again - do you want to adapt this into a blog entry? It may be worth posting the link to reddit as is, but one adaptation pass for a larger audience shouldn't hurt. Let us know!
>> Thank you for your interest. I like to write articles, but there are significant problems related to that post:
>> - It's a soup of very different things;
>> - It suggests things like stream fusion that I think aren't yet discussed in the D community;
>> - I think it's not good for consumption outside the D community, it focuses on details mostly important for the development of D/Phobos;
>> - I think some of its contents are half cooked and need some more of my reflection;
>> - I do not like to show a text two times.
>> Bye, bearophile
> I just wanted to say that I also liked the article and I understand why the others would want you to repost it. I think the strengths outweigh the weaknesses you mention, but I do understand not wanting to show the thing twice.

I, too, understand that, with the amendment that it's an unwarranted concern. I used to worry about that, too (e.g. not give the same talk twice) until I understood that the overlap in audiences is very small, and the people comprising the overlap understand and approve of the reasons for repeating.
Andrei
Apr 05 2013
On Tuesday, 2 April 2013 at 07:59:17 UTC, Jonas Drewsen wrote:
> Article about the expressiveness of languages with D included as one of the contestants. http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/
> I tend to agree with the first comment to the article though :)
> /Jonas

Yep, the sorting seems quite random to me. AFAIK Vala is nothing special, yet it is ranked very high in this article.
renoX
Apr 02 2013
On 04/02/2013 03:24 PM, renoX wrote:
> Yep, the sorting seems quite random to me, AFAIK Vala is nothing special yet it is ranked very high in this article..

To be fair, the author does say that results for what he calls "third tier" languages (like Vala) should be considered with a great deal of skepticism: http://redmonk.com/dberkholz/2013/03/26/what-does-expressiveness-via-loc-per-commit-measure-in-practice/
Apr 02 2013
> It won’t tell you how readable the resulting code is (Hello, lambda functions) or how long it takes to write it (APL anyone?), so it’s not a measure of maintainability or productivity.

Did I get it right, that expressiveness means trading maintainability for keystroke saving?
Apr 05 2013
On Friday, April 05, 2013 14:36:07 Brad Roberts wrote:
> I believe it's really not a module issue at all, but a doc issue. The two are directly tied today, but I have _no_ problem with importing the module and using it as is. Yes, it's large in terms of lines in the file, but really, who's affected by that and how often. Few and seldom. Breaking it up just because of docs is like ripping a book into 10 books just because you want to only carry one chapter around.

To some extent, I agree. I'm quite able to maintain it as one module (though to be fair to anyone arguing that it should be broken up for maintainability - as sometimes happens - it's large enough that if large portions of it get changed, you can't see the diff on github). I'm not sure that it would _hurt_ maintainability to break it up, though. And I know exactly how I'd break it up if I were to do so, and it would break up quite cleanly, I think. The main reason that it's not broken up in the first place is that I did a horrible job of breaking it up when I first introduced it, and everyone's reaction was that it should just be one module (the code has changed quite a bit since then though, so breaking it up would be much easier now).

But regardless, with ddoc, breaking up the module would be the only way to break up the documentation, so we're kind of stuck in that regard (though if we start using ddox for dlang.org, that does change things).
- Jonathan M Davis
Apr 05 2013