digitalmars.D - [GSOC] regular expressions beta is here
- Dmitry Olshansky (29/29) Aug 10 2011 In case I failed to mention it before, I'm working on the project
- Vladimir Panteleev (7/8) Aug 10 2011 Hi, does this rewrite cover compile-time regex compilation?
- Dmitry Olshansky (11/16) Aug 10 2011 Yes, I've dubbed it static regex. In fact it will be something similar
- Vladimir Panteleev (8/12) Aug 10 2011 Awesome stuff. D's codegen abilities have the potential to put regex
- Jacob Carlborg (7/34) Aug 10 2011 I have a suggestion, make RegexMatch implicitly convertible to bool,
- Dmitry Olshansky (14/51) Aug 10 2011 Interesting idea, one problem with it is that I want this:
- Steven Schveighoffer (12/57) Aug 10 2011 Without actually looking at the code, why wouldn't something like this
- Dmitry Olshansky (4/65) Aug 10 2011 Thanks, I'll give it a try.
- Jacob Carlborg (12/26) Aug 10 2011 No, that won't be any problem:
- Dmitry Olshansky (8/35) Aug 10 2011 That may be all well, but try writeln on it, what will it print?
- Jacob Carlborg (6/33) Aug 10 2011 Oh, I didn't know that it would work implicitly in conditionals. Then
- Steven Schveighoffer (7/14) Aug 10 2011 alias this has lots of problems, but it doesn't mean it's *design* is
- Andrei Alexandrescu (5/36) Aug 10 2011 That's pretty cool actually because it naturally extends the built-in
- Jacob Carlborg (5/9) Aug 10 2011 Cool, I always thought that opCast was for explicit casts, but maybe
- Andrei Alexandrescu (4/29) Aug 10 2011 If alias this is any more blunt than regular subtyping (inheritance),
- bearophile (7/9) Aug 10 2011 When you write some English text you don't write a single block of text,...
- Dmitry Olshansky (16/25) Aug 10 2011 While I haven't asked for review, I do appreciate comments. I have to
- bearophile (8/20) Aug 10 2011 Think about reading a book without the half lines between paragraphs. In...
- Adam D. Ruppe (5/5) Aug 10 2011 bearophile:
- bearophile (4/6) Aug 10 2011 This "you" is a group that includes people like Guido V. Rossum, Rob Pik...
- Marco Leise (17/22) Aug 10 2011 I think a blank line makes code easier on the eyes. When you scroll over...
- Jonathan M Davis (20/46) Aug 10 2011 This sort of thing has been discussed by the Phobos dev team previously,...
- simendsjo (2/4) Aug 10 2011 There is? Parallelism and json uses braces on the same line.
- Jonathan M Davis (4/9) Aug 10 2011 It was agreed upon, and where it has been noticed, it has been fixed. Bu...
- simendsjo (4/13) Aug 11 2011 Damn - I've been changing my D style to braces on the same line. It's
- Jonathan M Davis (5/21) Aug 11 2011 You're free to do your braces however you'd like in your own code, but a...
- simendsjo (8/29) Aug 11 2011 I actually like that a language has a "default" style. Java, C# and
- Jonathan M Davis (5/41) Aug 11 2011 Well, you're free to follow Phobos' style too. It's entirely up to you. ...
- Marco Leise (4/5) Aug 10 2011 You see, and that is why we should make that explicit rather than implic...
- Dmitry Olshansky (16/34) Aug 10 2011 Braces *are* paragraphs of code, with proper indention it's more then
- Vladimir Panteleev (10/22) Aug 10 2011 I agree with bearophile; I find code that leaves a blank line between
- Dmitry Olshansky (5/26) Aug 10 2011 Lucky you, hm... probably turning my monitor on 90 degrees can get me in...
- bearophile (25/29) Aug 10 2011 They sometimes are, but inside functions there are other kinds of "parag...
- Don (10/18) Aug 10 2011 You're conflating a couple of things here. Invariants are tremendously
- Jonathan M Davis (8/11) Aug 10 2011 That would be great, but several bugs need to be fixed before that's pos...
- Lutger Blijdestijn (3/26) Aug 10 2011 What about out contracts on interfaces in a library (where you use the
- Don (3/29) Aug 10 2011 That involves inheritance. But I don't think there are any cases in
- bearophile (27/28) Aug 11 2011 I see three different situations where postconditions are useful in D:
- Don (16/55) Aug 11 2011 Sorry, but personally I don't believe that this is useful outside of toy...
- Adam D. Ruppe (11/11) Aug 11 2011 If it's worth anything, I use the out contracts in dom.d more as
- Marco Leise (46/57) Aug 11 2011 I've been wondering for a while if selective unit tests could be include...
- bearophile (38/50) Aug 11 2011 Putting a simpler algorithm in the post-condition implements a third pos...
- Don (43/113) Aug 12 2011 Conditions required for this to be true:
- Timon Gehr (71/216) Aug 12 2011 If the difference is not an asymptotic one, it can well be time critical...
- bearophile (43/51) Aug 12 2011 This code of mine is a real-world example. This is a struct method with ...
- Dmitry Olshansky (5/25) Aug 11 2011 I stand corrected about invariants, somehow I wasn't considering them a
- Jacob Carlborg (4/9) Aug 10 2011 I always add a blank line before and after statements.
- Dmitry Olshansky (19/45) Aug 16 2011 Meanwhile the new beta is up:
- bearophile (44/46) Aug 16 2011 I have not patched DMD, but it gives me some problem here:
- Dmitry Olshansky (8/52) Aug 17 2011 Yes, that's a bug. But it's not a regression, I assume you started to
- bearophile (6/9) Aug 17 2011 I suggest Phobos devs to use -w too.
- bearophile (1/4) Aug 17 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6518
In case I failed to mention it before, I'm working on the project codenamed FReD, which is aimed at a ~100%* source-level compatible overhaul of std.regex that uses better implementation techniques and provides modern Unicode support and common syntax riches.

I think it's time for a public beta release, since it _should_ be ready for mainstream usage. There are some rough edges, and a couple of issues that I'm aware of, but they don't show up in realistic use cases. In order to avoid unexpected regressions I'd be glad if current std.regex users tried it on their projects/tests.

To get a small no-crap-included beta package, see the download section of https://github.com/blackwhale/FReD for .7zs. I'll upload newer packages as bugs get exposed and fixed. Alternatively, if you are comfortable with git you may just git clone the entire repo.

Some helpful notes (same as README) can be found here: https://github.com/blackwhale/FReD/wiki/Beta-release

Caveats:
In order for it to compile, a tiny change to the 2.054 source is needed (no need to recompile Phobos! It's only in templates): patch std.algorithm.cmp according to this diff https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4631 <https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4633> and, to get CTFE features working, add the if(!__ctfe) listed in the next diff on the same webpage.
(This is already upstream, so if you're using a fork of Phobos just pull this in.)

* some API problems might lead to a breaking change, though it didn't happen in this release

-- 
Dmitry Olshansky
Aug 10 2011
On Wed, 10 Aug 2011 13:42:25 +0300, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> and to get CTFE features working add if(!__ctfe) listed in the next diff

Hi, does this rewrite cover compile-time regex compilation? E.g. regex!`^a` compiling to s.length&&s[0]=='a' or something like that.

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Aug 10 2011
On 10.08.2011 15:16, Vladimir Panteleev wrote:
> Hi, does this rewrite cover compile-time regex compilation? E.g. regex!`^a` compiling to s.length&&s[0]=='a' or something like that.

Yes, I've dubbed it static regex. In fact it will be something similar to this, though it will do a heap allocation for backtracking points on the first call to match. Heap allocations are definitely going away in the final release.
You can pass -version=fred_ct -debug to dmd to see the generated programs.
At the moment it's more a proof of concept than a speed demon, something I might see about changing once the CTFE bugs are worked out. Anyway, when it doesn't crash the compiler, it's pretty fast :)

-- 
Dmitry Olshansky
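To make the "static regex" idea above concrete, here is a minimal sketch of the kind of specialised matcher a pattern such as `^a` could reduce to. It is not FReD's generated code: `anchoredLiteral` is a hypothetical name, and it handles only a literal anchored at the start of the input. The point is simply that the pattern is a compile-time parameter, so each instantiation is a plain function for one pattern, and it can also be evaluated through CTFE.

```d
// Hypothetical sketch, not FReD's output: a matcher for "^<literal>"
// where the literal is a compile-time template parameter.
bool anchoredLiteral(string lit)(string s)
{
    if (s.length < lit.length)
        return false;
    foreach (i, c; lit)          // compare code unit by code unit
    {
        if (s[i] != c)
            return false;
    }
    return true;
}

// The same function also runs at compile time (CTFE).
static assert(anchoredLiteral!"a"("abc"));
static assert(!anchoredLiteral!"a"("xbc"));
static assert(!anchoredLiteral!"a"(""));

void main()
{
    assert(anchoredLiteral!"ab"("abc"));
    assert(!anchoredLiteral!"ab"("a"));
}
```

A real static regex has to cover alternation, repetition and backtracking, which is where the heap allocation mentioned above comes in; this sketch sidesteps all of that.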
Aug 10 2011
On Wed, 10 Aug 2011 14:44:44 +0300, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> Yes, I've dubbed it static regex. In fact it will be something similar to this, though it will do a heap allocation for backtracking points, on first call to match. Heap allocations are definitely going away in final release.

Awesome stuff. D's codegen abilities have the potential to put regex matching way ahead of any C/C++ libraries that don't JIT or stuff like that.

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Aug 10 2011
On 2011-08-10 12:42, Dmitry Olshansky wrote:
> In case I failed to mention it before, I'm working on the project codenamed FReD that is aimed at ~100%* source level compatible overhaul of std.regex, that uses better implementation techniques, provides modern Unicode support and common syntax riches.

I have a suggestion: make RegexMatch implicitly convertible to bool, indicating if there was a match or not.

Aren't there a lot of things that should be declared as private in the fred.d module?

-- 
/Jacob Carlborg
Aug 10 2011
On 10.08.2011 15:34, Jacob Carlborg wrote:
> I have a suggestion, make RegexMatch implicitly convertible to bool, indicating if there was a match or not.

Interesting idea, one problem with it is that I want this:
auto m = match("bleh", "bleh");
writeln(m);
to actually print "bleh", not true.
Right now, due to a carry-over bug from std.regex (interface thing), writeln(m) will just overflow the stack; m.hit however works.

> Aren't there a lot of things that should be declared as private in the fred.d module?

Yes, it's a side effect of me having a lot of debugging tools that do need these internals. If only the package protection attribute or something was working... Not to mention that the whole module should work in SafeD with a couple of @trusted here and there.

-- 
Dmitry Olshansky
Aug 10 2011
On Wed, 10 Aug 2011 07:51:32 -0400, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> Interesting idea, one problem with it is that I want this:
> auto m = match("bleh", "bleh");
> writeln(m);
> to actually print "bleh", not true

Without actually looking at the code, why wouldn't something like this work?

struct RegexMatch
{
    ...
    string toString() {...}
    opCast(T : bool)() {...}
}

This isn't an implicit cast, but it will work for conditional statements.

-Steve
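A filled-in version of the opCast suggestion above might look like the following. This is a sketch under stated assumptions: `Match` is a hypothetical stand-in rather than FReD's RegexMatch, and an empty `hit` is assumed to mean "no match". It shows that the struct formats through toString while conditionals go through opCast!bool, with no implicit conversion to bool anywhere else.

```d
import std.stdio;

// Hypothetical stand-in for RegexMatch; only the two members
// under discussion are modelled.
struct Match
{
    string hit;   // matched slice; assumed empty when nothing matched

    string toString() const
    {
        return hit;
    }

    // Used by if/while/! and by explicit cast(bool), never implicitly.
    bool opCast(T : bool)() const
    {
        return hit.length != 0;
    }
}

// Plain assignment to bool is rejected: there is no implicit conversion.
static assert(!__traits(compiles, { bool b = Match("x"); }));

void main()
{
    auto m = Match("bleh");
    if (m)
        writeln(m);              // formatted through toString

    auto none = Match("");
    if (!none)
        writeln("no match");

    assert(cast(bool) m);
    assert(!cast(bool) none);
}
```

The design choice here is that truthiness stays opt-in at the use site: the value still formats as text when handed to writeln, which is the behaviour asked for earlier in the thread.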
Aug 10 2011
On 10.08.2011 16:54, Steven Schveighoffer wrote:
> Without actually looking at the code, why wouldn't something like this work?
>
> struct RegexMatch
> {
>     ...
>     string toString() {...}
>     opCast(T : bool)() {...}
> }
>
> This isn't an implicit cast, but it will work for conditional statements.

Thanks, I'll give it a try.

-- 
Dmitry Olshansky
Aug 10 2011
> Interesting idea, one problem with it is that I want this:
> auto m = match("bleh", "bleh");
> writeln(m);
> to actually print "bleh", not true
> Right now due to a carry over bug from std.regex (interface thing) writeln(m) will just do a stackoverflow, m.hit however works.

No, that won't be any problem:

struct Foo
{
    bool b;
    alias b this;
}

auto f = Foo();
static assert(is(typeof(f) == Foo));

The above assert passes as expected.

>> Aren't there a lot of things that should be declared as private in the fred.d module?
> Yes, it's a side effect of me having a lot of debugging tools that do need these internals. If only the package protection attribute or something was working... Not to mention that the whole module should work in SafeD with a couple of @trusted here and there.

Ok, I see.

-- 
/Jacob Carlborg
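For comparison with the opCast route, the alias this approach above can be extended slightly to show what "implicitly convertible" means in practice. This is only a sketch built on the same `Foo` from the post; it deliberately says nothing about what writeln prints for such a type, since that is exactly the point under dispute in the thread.

```d
struct Foo
{
    bool b;
    alias b this;   // Foo converts implicitly to bool through b
}

// The static type is unchanged, but Foo is now accepted wherever
// a bool is expected.
static assert(is(Foo : bool));

void main()
{
    auto f = Foo(true);
    static assert(is(typeof(f) == Foo));

    bool flag = f;      // implicit conversion, no cast needed
    assert(flag);

    if (f)              // also usable directly in a condition
        assert(f.b);
}
```

Unlike opCast!bool, this conversion applies everywhere a bool fits, not only in conditions.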
Aug 10 2011
On 10.08.2011 18:54, Jacob Carlborg wrote:
> No, that won't be any problem:
> struct Foo { bool b; alias b this; }
> auto f = Foo();
> static assert(is(typeof(f) == Foo));
> The above assert passes as expected.

That may be all well, but try writeln on it, what will it print?
After some experience with alias this I had to conclude that it's a rather blunt tool, and I'd rather stay away from it. Actually I like Steven's opCast suggestion, so that it works in conditionals.

-- 
Dmitry Olshansky
Aug 10 2011
On 2011-08-10 17:55, Dmitry Olshansky wrote:
> That may be all well, but try writeln on it, what will it print?

Hmm, it doesn't print anything, I think it looks like a bug in writeln.

> After some experience with alias this I had to conclude that it's a rather blunt tool, and I'd rather stay away from it. Actually I like Steven's opCast suggestion, so that it works in conditionals.

Oh, I didn't know that it would work implicitly in conditionals. Then I'm happy with opCast :)

-- 
/Jacob Carlborg
Aug 10 2011
On Wed, 10 Aug 2011 12:46:25 -0400, Jacob Carlborg <doob me.com> wrote:
> On 2011-08-10 17:55, Dmitry Olshansky wrote:
>> After some experience with alias this I had to conclude that it's a rather blunt tool, and I'd rather stay away from it. Actually I like Steven's opCast suggestion, so that it works in conditionals.

alias this has lots of problems, but that doesn't mean its *design* is blunt, just that the implementation of it is not too good.

> Oh, I didn't know that it would work implicitly in conditionals. Then I'm happy with opCast :)

http://www.d-programming-language.org/operatoroverloading.html#Cast

Note that it only works for structs (not sure if that return type is a struct or not...)

-Steve
Aug 10 2011
On 8/10/11 10:46 AM, Jacob Carlborg wrote:
> Oh, I didn't know that it would work implicitly in conditionals. Then I'm happy with opCast :)

That's pretty cool actually because it naturally extends the built-in approach. When you do e.g. if (pointer) that's really equivalent to if (cast(bool) pointer) and so on.

Andrei
Aug 10 2011
On 2011-08-10 19:45, Andrei Alexandrescu wrote:
> That's pretty cool actually because it naturally extends the built-in approach. When you do e.g. if (pointer) that's really equivalent to if (cast(bool) pointer) and so on.

Cool, I always thought that opCast was for explicit casts, but maybe it's explicit in this case, in some way.

-- 
/Jacob Carlborg
Aug 10 2011
On 8/10/11 9:55 AM, Dmitry Olshansky wrote:
> That may be all well, but try writeln on it, what will it print?
> After some experience with alias this I had to conclude that it's a rather blunt tool, and I'd rather stay away from it.

If alias this is any more blunt than regular subtyping (inheritance), that would be a bug. Feel free to submit if you find such issues.

Andrei
Aug 10 2011
Dmitry Olshansky:
> To get a small no-crap-included beta package see download section of https://github.com/blackwhale/FReD for .7zs.

When you write some English text you don't write a single block of text: you organize it into paragraphs, paragraphs into chapters, chapters into sections, sections into books, etc. Some time ago I came to understand that paragraphs are very good in source code too. So I suggest you add a blank line here and there inside your functions to separate them into paragraphs. I can't give you a style rule, you will need to create your own style, but often a function that's more than 10 lines long needs one or more blank lines inside. (Some people say that every time you see one of these paragraphs in a function, especially if it has a comment before it, you need to perform an "extract method" to improve the code. I believe this is bad advice.)

I see no contracts in the code (I mean the ones with assert inside, instead of enforce). I suggest Walter fix this situation. One idea is to include two versions of the Phobos lib in the zip of the dmd distribution, one with asserts compiled in and one without, and let DMD import from the correct library according to the compilation flags. Some solution to this problem is getting urgent, because Phobos is growing without the use of one of the nicest features of D (contract programming). Solving this problem is more urgent than having an excellent regex library in Phobos.

If people don't use contract programming much, it is because you can't use it in Phobos.

Bye,
bearophile
Aug 10 2011
On 10.08.2011 20:02, bearophile wrote:
> So I suggest you to add a blank line here and there inside your functions to separate them into paragraphs.

While I haven't asked for review, I do appreciate comments. I have to say I did no cleanup or otherwise shape up the code, I'm still working on the semantic side of the problems :)
Honestly I can't get why you are so nervous about code style anyway, you seem to bring this up way too often. About spaces: personally I dislike eating extra vertical space for "clarity", curly braces on their own line is already way too much.

> I see no contracts in the code (I mean the ones with assert inside, instead of enforce).
> If people don't use contract programming much, is because you can't use it in Phobos.

Have to respectfully disagree on this, don't try to nail everything on contracts. They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here. And there are lots of asserts here, but much more of the input is enforced, since it's totally expected to supply a wrong pattern (or have an outside user type in the pattern).

-- 
Dmitry Olshansky
Aug 10 2011
Dmitry Olshansky:
> Honestly I can't get why you are so nervous about code style anyway, you seem to bring this up way too often.

I bring it up often because many D programmers seem half blind to this problem. I am not willing to go to the extremes the Go language goes to in solving it, but I'd like more recognition of this problem among D programmers. A somewhat more common style is quite helpful for creating an ecology of D programmers who share single modules. I guess D programmers are used to C/C++, where there are no modules and where programs are usually made of many files, so they don't see why sharing single modules in the pool is so useful.

> About spaces personally I dislike eating extra vertical space for "clarity", curly braces on their own line is already way too much.

Think about reading a book without the half lines between paragraphs. In code it's the same. Some empty lines are good for the readability of the code. Curly braces are not always present; sometimes a paragraph ends before, after, or right on a curly brace.

> Have to respectfully disagree on this, don't try to nail everything on contracts.

Contracts don't replace unittests, they complement each other.

> They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here.

It's easy to forget to test the output of a function; the "out" contracts help here. In structs the invariant saves you from having to remember to call a sanity test function manually every time you enter and leave a method.

> And there are lots of asserts here, but much more of input is enforced since it's totally expected to supply wrong pattern (or have an outside user to type in the pattern).

The idea is to replace those enforces with asserts, and allow user programs to import Phobos code that still contains asserts (from a secondary Phobos lib). Enforces are for certain kinds of user code, I don't think they fit in Phobos.

Bye,
bearophile
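The two contract features named above, "out" contracts and struct invariants, can be sketched as follows. Everything here is illustrative: `isqrt` and `Counter` are hypothetical names and have nothing to do with FReD, and the syntax uses the current `do` keyword, whereas code from the time of this thread would have written `body`. The checks are compiled out under -release, which is the behaviour the Phobos packaging discussion is about.

```d
// Hypothetical example of in/out contracts on a free function.
int isqrt(int x)
in
{
    assert(x >= 0, "isqrt needs a non-negative argument");
}
out (r)
{
    // The result is verified on every return path.
    assert(r * r <= x && (r + 1) * (r + 1) > x);
}
do
{
    int r = 0;
    while ((r + 1) * (r + 1) <= x)
        ++r;
    return r;
}

// Hypothetical example of a struct invariant: it runs on entry to
// and exit from every public member function.
struct Counter
{
    private int count;

    invariant()
    {
        assert(count >= 0, "count must never go negative");
    }

    void add(int n)
    in
    {
        assert(n > 0);
    }
    do
    {
        count += n;
    }

    int value() const
    {
        return count;
    }
}

void main()
{
    assert(isqrt(17) == 4);

    Counter c;
    c.add(3);
    assert(c.value == 3);
}
```

Whether such checks should be assert-based contracts or enforce calls is the open question in the thread: asserts vanish in release builds, enforce does not.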
Aug 10 2011
bearophile: The thing is just because you call it a problem a lot doesn't mean everyone else sees it that way. A lot of us have many years of experience and just don't see it the same way you do.
Aug 10 2011
Adam D. Ruppe:
> A lot of us have many years of experience and just don't see it the same way you do.

This "you" is a group that includes people like Guido V. Rossum, Rob Pike, Ken Thompson and R. Hettinger (they have feelings even stronger than mine on this topic).

Bye,
bearophile
Aug 10 2011
On 10.08.2011, 19:24, Adam D. Ruppe <destructionator gmail.com> wrote:
> bearophile: The thing is just because you call it a problem a lot doesn't mean everyone else sees it that way. A lot of us have many years of experience and just don't see it the same way you do.

I think a blank line makes code easier on the eyes. When you scroll over it you easily recognize where you are from the size and shape of the paragraphs. So I totally understand that. On the other hand my laptop screen is 1280x800, and sometimes I think I have scrolled past the end of a function body when there is just a blank line in a block of code. So usually I go with the approach of inserting a comment line instead of a blank line, which is usually italic and in a brighter color.

If I were working on a Phobos module I would try to mimic the existing code style (and probably find out that there is no common style :p ). Anyway, such things can be put to a vote, just like the idea of not using only single capital letters for template type placeholders (i.e. T, S). Google's code style wiki is nice. It lists all the rules and also offers an explanation. We can have that for Phobos, too, so that topics like these don't come up over and over again. The D style guide is a good start: http://www.digitalmars.com/d/2.0/dstyle.html
Aug 10 2011
On Wednesday, August 10, 2011 21:42:01 Marco Leise wrote:
> Google's code style wiki is nice. It lists all the rules and also offers an explanation. We can have that for Phobos, too. So topics like these don't come up over and over again. The D style guide is a good start: http://www.digitalmars.com/d/2.0/dstyle.html

This sort of thing has been discussed by the Phobos dev team previously, and the general consensus was not to enforce much in the way of formatting in a style guide. There are a few things that were agreed upon (such as always putting braces on their own line), but on the whole, the style guide is supposed to focus on the API (so, things like function and variable names) rather than how code is formatted. I have an update to the style guide as a pull request which is currently being reviewed to make sure that the style guide on the site is in line with what we do:
https://github.com/D-Programming-Language/d-programming-language.org/pull/16

But I'm certain that you're not going to get the Phobos devs to agree on a style guide like Bearophile wants. And honestly, I'm a bit tired of the topic coming up. The style guide does need some updates, but it's mostly correct. It's essentially what we've decided on, and I don't see any reason to keep discussing it over and over.

Personally, I'd prefer that Dmitry had more blank lines in his code, but it's up to him how he does that as long as his code falls within the rules set down by the D style guide. And for any of his code which isn't going into Phobos, it's completely up to him how to format it.

- Jonathan M Davis
Aug 10 2011
On 10.08.2011 22:12, Jonathan M Davis wrote:
> There are a few things that were agreed upon (such as always putting braces on their own line),

There are? Parallelism and json use braces on the same line.
Aug 10 2011
> On 10.08.2011 22:12, Jonathan M Davis wrote:
>> There are a few things that were agreed upon (such as always putting braces on their own line),
> There are? Parallelism and json use braces on the same line.

It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them.

- Jonathan M Davis
Aug 10 2011
On 10.08.2011 23:16, Jonathan M Davis wrote:
>> On 10.08.2011 22:12, Jonathan M Davis wrote:
>>> There are a few things that were agreed upon (such as always putting braces on their own line),
>> There are? Parallelism and json use braces on the same line.
> It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them.
> - Jonathan M Davis

Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)
Aug 11 2011
On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
> On 10.08.2011 23:16, Jonathan M Davis wrote:
>>> On 10.08.2011 22:12, Jonathan M Davis wrote:
>>>> There are a few things that were agreed upon (such as always putting braces on their own line),
>>> There are? Parallelism and json use braces on the same line.
>> It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them.
>> - Jonathan M Davis
> Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)

You're free to do your braces however you'd like in your own code, but any code submitted to Phobos or druntime needs to have the braces on their own line.

- Jonathan M Davis
Aug 11 2011
On 11.08.2011 11:04, Jonathan M Davis wrote:
> On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
>> On 10.08.2011 23:16, Jonathan M Davis wrote:
>>>> On 10.08.2011 22:12, Jonathan M Davis wrote:
>>>>> There are a few things that were agreed upon (such as always putting braces on their own line),
>>>> There are? Parallelism and json use braces on the same line.
>>> It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them.
>>> - Jonathan M Davis
>> Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)
> You're free to do your braces however you'd like in your own code, but any code submitted to Phobos or druntime needs to have the braces on their own line.
> - Jonathan M Davis

Python all has a default style that makes code easy to read regardless of who wrote it (of course, python has some enforced stuff with indentation). You can, for instance, break the style as much as you'd

But then again.. Unless it's written in an obfuscated style, it doesn't really matter that much..
Aug 11 2011
On Thursday, August 11, 2011 11:34:30 simendsjo wrote:
> On 11.08.2011 11:04, Jonathan M Davis wrote:
>> On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
>>> On 10.08.2011 23:16, Jonathan M Davis wrote:
>>>>> On 10.08.2011 22:12, Jonathan M Davis wrote:
>>>>>> There are a few things that were agreed upon (such as always putting braces on their own line),
>>>>> There are? Parallelism and json use braces on the same line.
>>>> It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them.
>>>> - Jonathan M Davis
>>> Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)
>> You're free to do your braces however you'd like in your own code, but any code submitted to Phobos or druntime needs to have the braces on their own line.
>> - Jonathan M Davis
> Python all has a default style that makes code easy to read regardless of who wrote it (of course, python has some enforced stuff with indentation). You can, for instance, break the style as much as you'd
> But then again.. Unless it's written in an obfuscated style, it doesn't really matter that much..

Well, you're free to follow Phobos' style too. It's entirely up to you. But bracing style is the sort of thing that's likely to vary quite a bit from programmer to programmer (especially among those with a C or C++ background).

- Jonathan M Davis
Aug 11 2011
On 10.08.2011, 22:12, Jonathan M Davis <jmdavisProg gmx.com> wrote:
> [...] I don't see any reason to keep discussing it over and over.

You see, and that is why we should make that explicit rather than implicit in the style guide. An additional point, "personal preference", could list "blank lines to group logical blocks of code".
Aug 10 2011
On 10.08.2011 21:11, bearophile wrote:
> Dmitry Olshansky:
>> Honestly I can't get why you are so nervous about code style anyway, you seem to bring this up way too often.
> I bring it up often because many D programmers seem half blind to this problem. I am not willing to go to the extremes the Go language goes to in order to solve this problem, but I'd like more recognition of this problem among D programmers. A somewhat more common style is quite helpful for creating an ecology of D programmers that share single modules. I guess D programmers are used to C/C++, where there are no modules and where programs are usually made of many files. So they don't see why sharing single modules in the pool is so useful.
>> About spaces personally I dislike eating extra vertical space for "clarity", curly braces on its own line is already way too much.
> Think about reading a book without the half lines between paragraphs. In code it's the same. Some empty lines are good to improve the readability of the code. Curly braces are not always present; sometimes a paragraph ends before or after or right on a curly brace.

Braces *are* paragraphs of code; with proper indentation that is more than enough to feel the structure. If I really need to stop in the middle of a function, it's to explain something, and then a single comment line is good enough, instead of a meaningless empty line (which leaves the reader clueless as to why). Except that I'm not opposed to spaces at global scope.

>> Have to respectfully disagree on this, don't try to nail everything on contracts.
> Contracts don't replace unittests, they complement each other.

unittest != assert, though the former do contain asserts.

>> They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here.
> It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to manually call a sanity test function every time you come in and out of a method.
>> And there are lots of asserts here, but much more of the input is enforced, since it's totally expected that someone supplies a wrong pattern (or has an outside user type in the pattern).
> The idea is to replace those enforces with asserts, and allow user programs to import Phobos stuff that still contains asserts (from a secondary Phobos lib). Enforces are for certain kinds of user code, I don't think they fit in Phobos.

Not gonna work: file I/O is certainly in Phobos, as are network sockets, etc. You can't assert that something external won't fail, while you'd normally assert on your local logical invariants. As for other things, I thought that e.g. ranges are already hooked on asserts, as are other templates. If you have a list of modules where you find the lack of compiled-in contracts/asserts unbearable, do tell.

I hate being dragged into these discussions, but just can't resist.

-- 
Dmitry Olshansky
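The split described above, enforce for input that may legitimately be wrong and assert for local logical invariants, can be sketched roughly as follows. This is only an illustration under assumptions: `compilePattern` and its bracket counting are hypothetical and are not taken from the regex module under discussion.

```d
import std.exception : enforce;

/// Hypothetical stand-in for a pattern compiler; not the std.regex API.
/// Returns the number of '(' groups found in the pattern.
size_t compilePattern(string pattern)
{
    // A bad pattern typed in by a user is an expected runtime condition,
    // so it is checked with enforce, which throws even with -release.
    enforce(pattern.length > 0, "empty pattern");

    size_t depth = 0, groups = 0;
    foreach (c; pattern)
    {
        if (c == '(')
        {
            ++depth;
            ++groups;
        }
        else if (c == ')')
        {
            enforce(depth > 0, "unmatched ')' in pattern");
            --depth;
        }
    }
    enforce(depth == 0, "unmatched '(' in pattern");

    // A local logical invariant: only a bug in this function could break
    // it, so a plain assert (compiled out with -release) is enough.
    assert(groups <= pattern.length);
    return groups;
}
```

A caller that passes a malformed pattern gets an exception it can catch and report, while the assert only guards against mistakes inside the function itself.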
Aug 10 2011
On Wed, 10 Aug 2011 20:59:27 +0300, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
>>> About spaces personally I dislike eating extra vertical space for "clarity", curly braces on its own line is already way too much.
>> Think about reading a book without the half lines between paragraphs. In code it's the same. Some empty lines are good to improve the readability of the code. Curly braces are not always present; sometimes a paragraph ends before or after or right on a curly brace.
> Braces *are* paragraphs of code; with proper indentation that is more than enough to feel the structure. If I really need to stop in the middle of a function, it's to explain something, and then a single comment line is good enough, instead of a meaningless empty line (which leaves the reader clueless as to why). Except that I'm not opposed to spaces at global scope.

I agree with bearophile; I find that leaving a blank line between groups of closely-related lines makes the code much more readable. I don't understand what's with the craving for maximum vertical terseness either, but that may be because the resolution of my primary monitor is currently 1200x1920 :)

-- 
Best regards,
Vladimir
mailto:vladimir thecybershadow.net
Aug 10 2011
On 10.08.2011 22:11, Vladimir Panteleev wrote:
> On Wed, 10 Aug 2011 20:59:27 +0300, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
>>>> About spaces personally I dislike eating extra vertical space for "clarity", curly braces on its own line is already way too much.
>>> Think about reading a book without the half lines between paragraphs. In code it's the same. Some empty lines are good to improve the readability of the code. Curly braces are not always present; sometimes a paragraph ends before or after or right on a curly brace.
>> Braces *are* paragraphs of code; with proper indentation that is more than enough to feel the structure. If I really need to stop in the middle of a function, it's to explain something, and then a single comment line is good enough, instead of a meaningless empty line (which leaves the reader clueless as to why). Except that I'm not opposed to spaces at global scope.
> I agree with bearophile; I find that leaving a blank line between groups of closely-related lines makes the code much more readable. I don't understand what's with the craving for maximum vertical terseness either, but that may be because the resolution of my primary monitor is currently 1200x1920 :)

Lucky you, hm... probably turning my monitor by 90 degrees can get me into this league of abundant vertical space :)

-- 
Dmitry Olshansky
Aug 10 2011
Dmitry Olshansky:
> Braces *are* paragraphs of code,

They sometimes are, but inside functions there are other kinds of "paragraphs". As an example, this is first-quality C code (partially written by R. Hettinger):
http://hg.python.org/cpython/file/d5b274a0b0a5/Modules/_collectionsmodule.c

If you take a random function from that page, like:

    static int
    deque_del_item(dequeobject *deque, Py_ssize_t i)
    {
        PyObject *item;

        assert (i >= 0 && i < deque->len);
        if (_deque_rotate(deque, -i) == -1)
            return -1;

        item = deque_popleft(deque, NULL);
        assert (item != NULL);
        Py_DECREF(item);

        return _deque_rotate(deque, i);
    }

You see a blank line after "Py_DECREF(item);" even though there is no closing brace. The purpose of those blank lines is to help the person who reads the code tell apart the various things done by that function. This C code is well written.

> No gonna work, file I/O is certainly in Phobos, as are network sockets, etc. You can't assert that something external won't fail.

OK.

> I hate being drugged in these discussions, but just can't resist.

I am sorry, but thank you for answering :-)

Bye,
bearophile
Aug 10 2011
bearophile wrote:
> Contracts don't replace unittests, they complement each other.
>> They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here.
> It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.

You're conflating a couple of things here. Invariants are tremendously helpful for structs as well as classes.

"out" contracts seem to be almost useless, unless you have a theorem prover. The reason is that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting. They have very little in common with 'in' contracts.

I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.
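As a rough illustration of the kind of struct invariant argued for here, a minimal sketch follows. `Span` and its members are hypothetical names chosen for the example, not an existing Phobos type.

```d
/// Hypothetical half-open index range [lo, hi); illustrative only.
struct Span
{
    size_t lo, hi;

    // Checked around calls to public member functions in builds
    // that keep contracts (i.e. built without -release).
    invariant()
    {
        assert(lo <= hi, "Span bounds out of order");
    }

    size_t length() const
    {
        return hi - lo;
    }

    void shrinkFront(size_t n)
    {
        assert(n <= length, "cannot shrink past the end");
        lo += n;
    }
}
```

The sanity check is written once and no method has to remember to call it, which is the benefit bearophile describes for structs.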
Aug 10 2011
On Thursday, August 11, 2011 06:58:51 Don wrote:
> I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.

That would be great, but several bugs need to be fixed before that's possible, including

http://d.puremagic.com/issues/show_bug.cgi?id=1251
http://d.puremagic.com/issues/show_bug.cgi?id=5039
http://d.puremagic.com/issues/show_bug.cgi?id=5058
http://d.puremagic.com/issues/show_bug.cgi?id=5500

- Jonathan M Davis
Aug 10 2011
Don wrote:
> bearophile wrote:
>> Contracts don't replace unittests, they complement each other.
>>> They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here.
>> It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.
> You're conflating a couple of things here. Invariants are tremendously helpful for structs as well as classes. "out" contracts seem to be almost useless, unless you have a theorem prover. The reason is that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting. They have very little in common with 'in' contracts. I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.

What about out contracts on interfaces in a library (where you use the library by implementing them)?
Aug 10 2011
Lutger Blijdestijn wrote:
> Don wrote:
>> bearophile wrote:
>>> Contracts don't replace unittests, they complement each other.
>>>> They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here.
>>> It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.
>> You're conflating a couple of things here. Invariants are tremendously helpful for structs as well as classes. "out" contracts seem to be almost useless, unless you have a theorem prover. The reason is that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting. They have very little in common with 'in' contracts. I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.
> What about out contracts on interfaces in a library (where you use the library by implementing them)?

That involves inheritance. But I don't think there are any cases in Phobos where that is currently applicable.
Aug 10 2011
Don:
> "out" contracts seem to be almost useless, unless you have a theorem prover. The reason is, that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting.

I see three different situations where postconditions are useful in D:

1) Sometimes the result of your function/method must satisfy some simple condition to be correct. For example, it must be a nonnegative number. Then you add assert(result >= 0, "..."); in the out. For a Phobos example, the std.algorithm.countUntil postcondition is allowed to test assert(result >= -1, "..."); Other possible conditions are that the output string can't be longer than a certain amount (like longer than the input string), and so on. In certain cases the program that finds the solution is slow, but testing the correctness of the result is fast. I have hit many situations like this. As an example, you test that the result of a complex sorting algorithm is ordered and has the same length as the input (but maybe you don't test that the output items are the same as the input items).

2) I have found many situations where I am able to solve a problem with both a simple, slow brute-force solver and a complex, fast algorithm. The little program may be too slow for normal usage, but it's just a few lines long (especially if I use a lot of std.algorithm stuff) and it's much less likely to contain bugs. You can't verify the result of the fast algorithm with the slow algorithm on every call; that would defeat the purpose. In such situations I write the postcondition like this:

    in {
        // ...
    }
    out(result) {
        // some fast postcondition tests here
        debug {
            assert(result == slowAlgorithm(input));
        }
    }
    body {
        // fast algorithm here
    }

This way, in release mode it tests nothing, in a nonrelease build it tests the fast postconditions, and in debug mode it also verifies that the fast algorithm gives the same results as the slow algorithm. Generally, solving a problem in two quite different ways helps catch problems in the algorithms.

3) When D gets the prestate ("old" in some contract programming implementations), I will be able to use the prestate inside the postcondition to better verify that the function/method has changed the globals or instance attributes in a correct way. You can't put such tests in the class/struct invariant, or in the precondition.

I'm using postconditions often in my code (less often than preconditions, but often enough). A theorem prover is not strictly necessary for them to be useful.

Bye,
bearophile
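A compilable sketch of situation 2, under some assumptions: `fastSort` and `slowSort` are made-up names, the "fast algorithm" simply delegates to Phobos' `sort`, and the contract is written with `do`, which current compilers use in place of the `body` keyword seen in this thread.

```d
import std.algorithm : isSorted, sort;

/// Slow but easy-to-check reference: O(n^2) insertion sort on a copy.
int[] slowSort(const(int)[] input)
{
    auto a = input.dup;
    foreach (i; 1 .. a.length)
    {
        for (size_t j = i; j > 0 && a[j - 1] > a[j]; --j)
        {
            immutable tmp = a[j];
            a[j] = a[j - 1];
            a[j - 1] = tmp;
        }
    }
    return a;
}

/// Stand-in for the "complex and fast" algorithm.
int[] fastSort(const(int)[] input)
out (result)
{
    // Cheap checks, active in any build that keeps contracts.
    assert(result.length == input.length);
    assert(isSorted(result));

    // Expensive cross-check, compiled in only with -debug.
    debug assert(result == slowSort(input));
}
do
{
    auto a = input.dup;
    sort(a);
    return a;
}
```

Built with `-release`, the whole `out` block disappears; a plain build keeps the cheap checks; adding `-debug` also runs the cross-check against the slow reference on every call.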
Aug 11 2011
bearophile wrote:Don:Sorry, but personally I don't believe that this is useful outside of toy examples. The question is, what bugs does it find that aren't found by a trivial unit test?"out" contracts seem to be almost useless, unless you have a theorem prover. The reason is, that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting.<I see three different situations where postconditions are useful in D: 1) Sometimes the result of your function/method must satisfy some simple condition to be correct. As example, it must be a nonnegative number. Then you add assert(result >= 0, "..."); in the out. For a Phobos example, std.algorithm.countUntil postcondition is allowed to test assert(result >= -1, "..."); Other possible conditions are the output string can't be longer than a certain amount (like longer than the input string), and so on. In certain cases the program the finds the solution is slow, but testing the correctness of a function is fast. I have hit many situations like this. As an example you test if the result of a complex sorting algorithm is ordered, and with the same length of the input (but maybe you don't test for the output items to be the same of the input). 2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm to solve a problem. The little program maybe is too much slow for normal usage, but it's just few lines long (especially if I use lot of std.algorithm stuff) but it's much less likely to contain bugs.You can't always verify the result of the fast algorithm with the slow algorithm, this is not useful. In such situations I write the postcondition like this: in { // ... 
} out(result) { // some fast postconditon tests here debug { assert(result == slowAlgorithm(input)); } body { // fast algorithm here } This way, in release mode it tests nothing, in nonrelease build it tests the fast postconditions, and in debug mode it also verifies the fast algorithm gives the same results as the slow algorithm. Generally solving a problem in two quite different ways helps catch problems in the algorithms. 3) When D will get the prestate ("old" in some contract programming implementations), I will be able to use the prestate inside the postcondition to verify better than the function/method has changed the globals, or instance attributes in a correct way. You can't put such tests in the class/struct invariant, or in the precondition.There are two cases: (1) it's a very tight test. In which case, it's essentially a unit test. or (2) it's a very loose test. In which case, it doesn't find bugs.I'm using postconditions often in my code (less often than preconditions, but often enough). A theorem prover is not strictly necessary for them to be useful.I would like to see an example of a good postcondition. The crucial feature is, they do NOTHING except find bugs in the function they are attached to. So it's very difficult to invent a plausible one. For starters, it really needs to be a function with multiple return values. Otherwise, you can just stick asserts just before your return statement, and you don't need __old or any such thing. Under what circumstances are they are more valuable than any other assert inside a function?
Aug 11 2011
If it's worth anything, I use the out contracts in dom.d more as checked documentation than for serious bug-finding. For example:

    Element appendChild(Element newChild)
    out (ret) {
        assert(ret is newChild);
    }
    body {
        ...
    }

I also use it from time to time to assert that a return value is not null. The check itself isn't particularly useful, but I think it's a nice bit of documentation.

Actually, IMO, in and out contracts should be in the generated ddoc too.
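For readers who want to try the "checked documentation" idea, here is a self-contained sketch. The `Element` class below is a hypothetical, pared-down stand-in and not the real class from dom.d; the contract uses `do`, the current spelling of the `body` keyword used in this thread.

```d
/// Hypothetical minimal element type; illustrative only.
class Element
{
    Element parent;
    Element[] children;

    /// Appends newChild and returns that same object.
    Element appendChild(Element newChild)
    in
    {
        assert(newChild !is null, "newChild must not be null");
    }
    out (ret)
    {
        // Reads as documentation: the return value is the argument itself.
        assert(ret is newChild);
    }
    do
    {
        newChild.parent = this;
        children ~= newChild;
        return newChild;
    }
}
```

The `out` block states, in checked form, what a caller may rely on without reading the body.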
Aug 11 2011
On 11.08.2011, 19:56, Adam D. Ruppe <destructionator gmail.com> wrote:
> If it's worth anything, I use the out contracts in dom.d more as checked documentation than for serious bug-finding. For example: Element appendChild(Element newChild) out (ret) { assert(ret is newChild); } body { ... } I also use it from time to time to assert that a return value is not null. The check itself isn't particularly useful, but I think it's a nice bit of documentation. Actually, IMO, in and out contracts should be in the generated ddoc too.

I've been wondering for a while if selected unit tests could be included in DDOC somehow. Most of the 'examples' in the Phobos documentation look like they were taken right out of a unittest block below the function. Like BinaryHeap in std.containers:

----------------------------------------------------------------------
DDOC:

// Example from "Introduction to Algorithms" Cormen et al, p 146
int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = heapify(a);
// largest element
assert(h.front == 16);
// a has the heap property
assert(equal(a, [ 16, 14, 10, 9, 8, 7, 4, 3, 2, 1 ]));
----------------------------------------------------------------------
std/containers.d:

unittest
{
    {
        // example from "Introduction to Algorithms" Cormen et al., p 146
        int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
        auto h = heapify(a);
        assert(h.front == 16);
        assert(a == [ 16, 14, 10, 8, 7, 9, 3, 2, 4, 1 ]);
        auto witness = [ 16, 14, 10, 9, 8, 7, 4, 3, 2, 1 ];
        for (; !h.empty; h.removeFront(), witness.popFront())
        {
            assert(!witness.empty);
            assert(witness.front == h.front);
        }
        assert(witness.empty);
    }
    {
        int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
        int[] b = new int[a.length];
        BinaryHeap!(int[]) h = BinaryHeap!(int[])(b, 0);
        foreach (e; a)
        {
            h.insert(e);
        }
        assert(b == [ 16, 14, 10, 8, 7, 3, 9, 1, 4, 2 ], text(b));
    }
}
----------------------------------------------------------------------

bearophile, you are the expert with the DRY buzz word ;)
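For what it's worth, later D compiler releases support something close to this wish: a unittest block that directly follows a declaration and carries its own documentation comment is picked up by Ddoc as an example for that declaration. A minimal sketch, where `largest` is a made-up function and not Phobos code:

```d
/// Returns the largest element of a non-empty array.
int largest(const(int)[] a)
{
    assert(a.length > 0);
    int best = a[0];
    foreach (e; a[1 .. $])
    {
        if (e > best)
            best = e;
    }
    return best;
}

///
unittest
{
    // The bare doc comment above marks this block as a documented example.
    assert(largest([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]) == 16);
}
```

With this arrangement the example in the generated documentation and the test that actually runs are the same text, so they cannot drift apart.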
Aug 11 2011
Don:
>> 2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm to solve a problem. The little program maybe is too much slow for normal usage, but it's just few lines long (especially if I use lot of std.algorithm stuff) but it's much less likely to contain bugs.
> Sorry, but personally I don't believe that this is useful outside of toy examples. The question is, what bugs does it find that aren't found by a trivial unit test?
> There are two cases: (1) it's a very tight test. In which case, it's essentially a unit test. or (2) it's a very loose test. In which case, it doesn't find bugs.

Putting a simpler algorithm in the postcondition implements a third possibility you are missing. Usually unit tests verify some specific cases (you are also able to add generic testing code in the unit test, but this is just like moving the postcondition elsewhere). If you put an alternative algorithm in the postcondition (under debug{} if you want), you have some advantages:
- It's tight, because the second algorithm is supposed to always give the same results as the function.
- It works with the real inputs the program is run on, not just the cases you have put in the unit test.

Sometimes you forget to add certain cases to the unittests. Putting the test in the postcondition makes sure it always runs, for all the inputs your function is run on (unless you disable it), so you will catch the cases you didn't think of in the unittests.

> The crucial feature is, they do NOTHING except find bugs in the function they are attached to.

In Eiffel you have the prestate too (the old), so the postcondition is the only place where such information is usable. I hope prestate will be added to D DbC, because it's a major sub-feature of DbC. But I don't agree that postconditions are useless in D.

> For starters, it really needs to be a function with multiple return values. Otherwise, you can just stick asserts just before your return statement, and you don't need __old or any such thing.

If a function has multiple return values the out(result) helps make sure all the return paths are verified. If the function has only one return value it helps anyway, because it helps you not forget to verify the result.

> Under what circumstances are they are more valuable than any other assert inside a function?

I have already given some answers. Another answer is this:

    int foo(int x)
    in {
        // ...
    }
    out(result) {
        auto y = computeSomething(result);
        assert(y ...);
        assert(y ...);
    }
    body {
        // ...
    }

The out{} helps you organize your code, separating the tests of the body from the postcondition tests. Also, in the postcondition you are allowed to define new variables and call things. All this out(){} code vanishes in release mode. How do you do that with just asserts inside the body?

If you do this the asserts will vanish in release mode, but the y will still be computed, wasting computations (a smart compiler is able to see that y is not used, etc., but it's not certain that this optimization happens if the computation of y is complex and it's done in-place):

    int foo(int x)
    in {
        // ...
    }
    body {
        result = ...;
        auto y = computeSomething(result);
        assert(y ...);
        assert(y ...);
        return result;
    }

I presume there are ways to disable the computation of y in release mode, but I don't want to think about them. I just stick the y computation in the postcondition and the compiler will take care of it.

Bye,
bearophile
Aug 11 2011
bearophile wrote:Don:2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm to solve a problem. The little program maybe is too much slow for normal usage, but it's just few lines long (especially if I use lot of std.algorithm stuff) but it's much less likely to contain bugs.Sorry, but personally I don't believe that this is useful outside of toy examples. The question is, what bugs does it find that aren't found by a trivial unit test?There are two cases: (1) it's a very tight test. In which case, it's essentially a unit test. or (2) it's a very loose test. In which case, it doesn't find bugs.Putting a simpler algorithm in the post-condition implements a third possibility you are missing. Usually unit tests verify some specific cases (you are also able to add generic testing code in the unit test, but this is just like moving the postcondition elsewhere). If you put an alternative algorithm in the postcondition (under debug{} if you want), you have some advantages: - It's tight, because the second algorithm is supposed to always give the same results as the function.- It works with the real examples the program is run too, not just the cases you have put in the unit test.Conditions required for this to be true: (1) the function must not be time critical; (2) an alternative algorithm must exist; (3) the alternative algorithm must be bug-free; (4) the function must not have been tested properly; (5) the faulty test cases must occur during debugging (they won't be caught during production); (6) the programmer must remember to put the asserts in the 'out' contract, but not put them into the body of the function. This doesn't leave much. Sometimes you forget to add certain cases in the unittests. 
Putting the test in the postcondition makes sure it always run, for all the inputs your function is run on (unless you disable it), so you will catch the cases you didn't think of in the unittests.??? Does that relate to my sentence in any way?The crucial feature is, they do NOTHING except find bugs in the function they are attached to.In Eiffel you have the prestate too (the old), so the postcondition is the only place where such information is usable. I hope prestate will be added to D DbC, because it's a majob sub-feature of DbC. But I don't agree that postconditions are useless in D.That's what I said.For starters, it really needs to be a function with multiple return values. Otherwise, you can just stick asserts just before your return statement, and you don't need __old or any such thing.If a function has multiple return values the out(result) helps make sure all the return paths are verified.If the function has only one return value it helps anyway, because it helps you not forget to verify the result.???? Why would you remember to put an assert in the postcondition, when you didn't put it into the function?No you haven't.Under what circumstances are they are more valuable than any other assert inside a function?I have already given some answers.Another answer is this: int foo(int x) in { // ... } out(result) { auto y = computeSomething(result); assert(y ...); assert(y ...); } body { // ... } The out{} helps you organize your code, separating the tests of the body from the postcondition tests. Also in the postcondition you are allowed to define new variables and call things. All this out(){} code vanishes in release mode. Ho do you do that with just asserts inside the body? 
If you do this the asserts will vanish in release mode, but the y will be computed still, wasting computations (a smart compiler is able to see y is not used and etc, but it's not sure this optimization happens if the computation of y is complex and it's done in-place):

    int foo(int x)
    in {
        // ...
    }
    body {
        result = ...;
        auto y = computeSomething(result);
        assert(y ...);
        assert(y ...);
        return result;
    }

I presume there are ways to disable the computation of y in release mode, but I don't want to think about them. I just stick the y computation in the postcondition and the compiler will take care of it.

Trivial! Make the postcondition a nested function. (You can even make it a delegate literal, if it's only used in one place).

I'll explain my original statement further: If you have a theorem prover, then the theorem prover can use the 'out' contract in any function which calls that function. Eg,

    int square(int x)
    out(result) {
        assert(result >= 0);
    }
    body {
        return x*x;
    }

    void foo()
    {
        int q = square(-5);
        if (q < 0) {
            ....
        }
    }

The theorem prover knows that q >= 0, even if it doesn't have access to the body of 'square'. So it detects unreachable code in foo(). So in this case, the 'out' contract can be used to find bugs in code that the author of the contract didn't write.

Otherwise, out contracts only find bugs in the local function, which doesn't have much value, since unit testing already performs that role (and does it better). By contrast, 'in' contracts ALWAYS find external bugs rather than local ones, so they're an order of magnitude more valuable in the current implementation.
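The nested-function suggestion can be sketched like this. All names are hypothetical (`clampPercent`, `distanceFromRange`); the point is only that the check lives in a nested function whose single call sits inside an assert, so the call is not evaluated when the code is built with `-release`.

```d
/// Hypothetical helper standing in for an expensive verification step.
int distanceFromRange(int r)
{
    return r < 0 ? -r : (r > 100 ? r - 100 : 0);
}

/// Clamps x into the range 0 .. 100 inclusive.
int clampPercent(int x)
{
    // The postcondition as a nested function; it may declare its own
    // variables and call other functions, as an out block could.
    bool postcondition(int result)
    {
        auto y = distanceFromRange(result);
        return y == 0;
    }

    int result = x < 0 ? 0 : (x > 100 ? 100 : x);

    // The only call to postcondition is inside the assert, so with
    // -release neither the assert nor the call is executed.
    assert(postcondition(result));
    return result;
}
```

Unlike an `out` block, this form has to be repeated (or called) before every return statement, which is the trade-off the two posters are arguing about.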
Aug 12 2011
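A minimal sketch of the nested-function approach suggested in the post above, for comparison with the out(result) form. The helper computeSomething and the check on y are hypothetical stand-ins, since the thread elides them. The sketch relies on the fact that the argument of assert is not evaluated in -release builds, so the extra computation disappears there just as an out contract would.

```d
import std.stdio : writeln;

// Hypothetical helper standing in for an expensive derived value.
int computeSomething(int result)
{
    return result % 7;
}

int foo(int x)
{
    // Nested function that holds all postcondition checks in one place.
    bool postcondition(int result)
    {
        auto y = computeSomething(result);
        return y >= 0 && y < 7;
    }

    int result = x * x;

    // assert's argument is not evaluated in -release builds, so the call
    // to postcondition (and to computeSomething) goes away there.
    assert(postcondition(result));
    return result;
}

void main()
{
    writeln(foo(5)); // 25
}
```

With several return statements the assert has to be repeated before each of them, which is the maintenance cost the out(result) form avoids.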
On 08/12/2011 01:31 PM, Don wrote:

bearophile wrote:

If the difference is not an asymptotic one, it can well be time critical (then the debug version will just not be as responsive as would be desirable for a finished product, which is often the case anyways.)

Don:

2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm. The little program may be much too slow for normal usage, but it's just a few lines long (especially if I use a lot of std.algorithm stuff) and it's much less likely to contain bugs.

Sorry, but personally I don't believe that this is useful outside of toy examples. The question is, what bugs does it find that aren't found by a trivial unit test?

There are two cases:
(1) it's a very tight test. In which case, it's essentially a unit test.
or (2) it's a very loose test. In which case, it doesn't find bugs.

Putting a simpler algorithm in the post-condition implements a third possibility you are missing. Usually unit tests verify some specific cases (you are also able to add generic testing code in the unit test, but this is just like moving the postcondition elsewhere). If you put an alternative algorithm in the postcondition (under debug{} if you want), you have some advantages:
- It's tight, because the second algorithm is supposed to always give the same results as the function.
- It works with the real examples the program is run on too, not just the cases you have put in the unit test.

Conditions required for this to be true:
(1) the function must not be time critical;

(2) an alternative algorithm must exist;

If an optimized version exists, a slower one exists too.

(3) the alternative algorithm must be bug-free;

That is often trivial. Also, if it is buggy, the discrepancy will be caught by the contract and the bug can be fixed.

(4) the function must not have been tested properly;

Usually, large software that has been 'tested properly' still contains bugs.
For mission critical tasks, a form of testing related to this one is used heavily (multiple teams implement the same specification and the result of each query to the software is determined by majority vote).

(5) the faulty test cases must occur during debugging (they won't be caught during production);

Sure. This can catch e.g. regressions during development. If there is a large team of programmers involved, contracts are more useful than if there is only a single developer.

(6) the programmer must remember to put the asserts in the 'out' contract, but not put them into the body of the function.

Well, if they are a seasoned contract programmer, this is not a problem at all. ;)

This doesn't leave much.

I disagree.

Sometimes you forget to add certain cases in the unittests. Putting the test in the postcondition makes sure it always runs, for all the inputs your function is run on (unless you disable it), so you will catch the cases you didn't think of in the unittests.

They specify what the function is supposed to do, in a way that is always up to date because it gets checked.

The crucial feature is, they do NOTHING except find bugs in the function they are attached to.

Yes. He says that once prestate is available, out contracts will be more useful. But he thinks they are already quite valuable without it.

In Eiffel you have the prestate too (the old), so the postcondition is the only place where such information is usable. I hope prestate will be added to D DbC, because it's a major sub-feature of DbC. But I don't agree that postconditions are useless in D.

??? Does that relate to my sentence in any way?

Don wrote:

That's what I said.

For starters, it really needs to be a function with multiple return values.
Otherwise, you can just stick asserts just before your return statement, and you don't need __old or any such thing.

If a function has multiple return values the out(result) helps make sure all the return paths are verified.

If the function has only one return value it helps anyway, because it helps you not forget to verify the result.

???? Why would you remember to put an assert in the postcondition, when you didn't put it into the function?

bearophile wrote:

Exactly that reason:

    int foo(){
        // some code
        if(condition) return 37; // added after 2h of debugging
        // more code
        result=...;
        assert(condition(result));
        return result;
    }

    int foo()
    out(result){assert(condition(result));}
    body{
        //some code
        if(condition) return 37;
        // more code
        return ...;
    }

It is both more convenient (you don't have to change your program logic) and less error-prone. Furthermore, all other programmers on the project can immediately check the postcondition and rely on it holding for the result of any call of foo, even if the compiler does not use the out contract for any theorem proving. They can even do that before the respective function is implemented correctly. Out contracts are particularly useful when they are written before the function is implemented completely.

If a function has multiple return values the out(result) helps make sure all the return paths are verified.

That's what I said.

Because everyone who is working on the project wants to check nested functions? Sure it works, but it is not the best way to implement contract programming. That is why D has language support that goes beyond that.

No you haven't.

Under what circumstances are they more valuable than any other assert inside a function?

I have already given some answers.

Another answer is this:

    int foo(int x)
    in {
        // ...
    }
    out(result) {
        auto y = computeSomething(result);
        assert(y ...);
        assert(y ...);
    }
    body {
        // ...
    }

The out{} helps you organize your code, separating the tests of the body from the postcondition tests.
Also in the postcondition you are allowed to define new variables and call things. All this out(){} code vanishes in release mode. How do you do that with just asserts inside the body? If you do this the asserts will vanish in release mode, but the y will still be computed, wasting computations (a smart compiler is able to see y is not used etc., but it's not certain this optimization happens if the computation of y is complex and it's done in-place):

    int foo(int x)
    in {
        // ...
    }
    body {
        result = ...;
        auto y = computeSomething(result);
        assert(y ...);
        assert(y ...);
        return result;
    }

I presume there are ways to disable the computation of y in release mode, but I don't want to think about them. I just stick the y computation in the postcondition and the compiler will take care of it.

Trivial! Make the postcondition a nested function. (You can even make it a delegate literal, if it's only used in one place).

I'll explain my original statement further: If you have a theorem prover, then the theorem prover can use the 'out' contract in any function which calls that function. Eg,

    int square(int x)
    out(result) { assert(result >= 0); }
    body { return x*x; }

    void foo() {
        int q = square(-5);
        if (q < 0) { .... }
    }

The theorem prover knows that q>=0, even if it doesn't have access to the body of 'square'. So it detects unreachable code in foo().

Theorem prover detects bug in square.

So in this case, the 'out' contract can be used to find bugs in code that the author of the contract didn't write. Otherwise, out contracts only find bugs in the local function, which doesn't have much value, since unit testing already performs that role (and does it better).

In this case, obviously all the unit tests tested square with an input that was less than 2^^16 in absolute value. Writing the postcondition sometimes also allows you to reflect properly on what the precondition should be.
Also, it will be tested on possibly unexpected input.

By contrast, 'in' contracts ALWAYS find external bugs rather than local ones, so they're an order of magnitude more valuable in the current implementation.

Not always. Sometimes they find bugs in the specification or the in contract itself. Contracts are not only an instrument of verification, but also one of specification.

http://en.wikipedia.org/wiki/Design_by_contract

The out contract is not there to verify some internal consistency conditions, but to specify what the function should compute, in an exact way, that is always up to date. The out contract is for programmers too, not only for compilers. Contract programming is one of these Software Engineering things. :)

The crucial difference between an out contract and an assert at the end of the function is how they are supposed to be used, not how they will work. This is reflected by the fact that DMD *.di generation will keep the contracts around.

-Timon
Aug 12 2011
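The square example from the post above can be tried directly. This is only a sketch: it uses the `do` keyword that current compilers expect where the thread writes `body`, and it catches AssertError purely to make the contract failure visible, which is not something production code should do. The input 46_341 is chosen here because its square no longer fits in a 32-bit int.

```d
import core.exception : AssertError;
import std.stdio : writeln;

int square(int x)
out (result) { assert(result >= 0, "square overflowed"); }
do
{
    // int arithmetic wraps around silently once |x| exceeds 46_340.
    return x * x;
}

void main()
{
    writeln(square(-5)); // 25

    try
    {
        auto r = square(46_341); // wraps to a negative value
        writeln("no contract violation reported (release build?): ", r);
    }
    catch (AssertError e)
    {
        writeln("postcondition failed: ", e.msg);
    }
}
```

The contract is not a complete overflow check, since some overflowing inputs wrap to non-negative values, but it does report a class of inputs that unit tests with small arguments would miss.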
Don:

bearophile wrote:

This code of mine is a real-world example. This is a struct method with comments removed; the postcondition contains both fast loose tests and a tight slow O(n^2) version that thanks to std.algorithm is just 2 lines long (unfortunately because of DMD bug 6417 it's a bit longer than 2 lines). It's asymptotically slower than the fast algorithm, so I've put it into a debug{}.

    void foo(in int[] p, int[] q) nothrow
    in {
        assert(p.length == vectorLen);
        assert(q.length == vectorLen);
        assert(equal(p.dup.sort(), iota(1, vectorLen+1)));
    } out {
        foreach (i, qi; q)
            assert(qi >= 0 && qi < (vectorLen - i));
        debug
            foreach (j; 1 .. (q.length + 1))
                assert(q[j-1] == count!((int k){ return p[k] > j; })(iota(countUntil(cast()p, j) + 1)));
    } body {
        op[0] = &items[0];
        foreach (i, pi; p) {
            items[i] = Item(pi, 0);
            op[i + 1] = &items[i + 1];
        }
        foreach_reverse (k; 0 .. (lim + 1)) {
            xs[0 .. ((vectorLen >> (k + 1)) + 1)] = 0;
            foreach (j; 0 .. vectorLen) {
                int r = (op[j].space >> k) % 2;
                int s = op[j].space >> (k + 1);
                if (r)
                    xs[s]++;
                else
                    op[j].digit += xs[s];
            }
        }
        foreach (i; 0 .. vectorLen)
            q[op[i].space - 1] = op[i].digit;
    }

This postcondition has caught a simple mistake I had put in the fast algorithm. Probably there are ways to catch the same bug with unittests too. The ugly empty cast() inside the postcondition is another workaround, because countUntil doesn't work with a const p.

If you write those two postcondition lines in Python3 it becomes less noisy:

    assert q == [sum(p[k] > j for k in range(p.index(j) + 1)) for j in range(1, len(q)+1)]

Instead of:

    foreach (j; 1 .. q.length+1)
        assert(q[j-1] == count!((int k){ return p[k] > j; })(iota(countUntil(cast()p, j) + 1)));

Here using assert(equal(q, map!...)) becomes too much puzzle-code. It's already too deeply nested. If you program in functional style it's hard to keep lines to 70 chars. In Haskell too, lines of code are often long.
Bye,
bearophile

2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm. The little program may be much too slow for normal usage, but it's just a few lines long (especially if I use a lot of std.algorithm stuff) and it's much less likely to contain bugs.

Sorry, but personally I don't believe that this is useful outside of toy examples.
Aug 12 2011
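A small, self-contained sketch of the pattern discussed above: a fast algorithm whose out contract re-computes the answer with a slow brute-force version under debug. The function maxSubarraySum and its data are illustrative assumptions, not code from the thread. The brute-force check only runs when compiled with -debug, and the whole contract disappears with -release.

```d
import std.algorithm : max;
import std.stdio : writeln;

// Fast O(n) maximum subarray sum (Kadane's algorithm), checked against
// an O(n^2) brute-force version in the postcondition.
int maxSubarraySum(const int[] a)
in { assert(a.length > 0); }
out (result)
{
    debug
    {
        int best = a[0];
        foreach (i; 0 .. a.length)
        {
            int sum = 0;
            foreach (j; i .. a.length)
            {
                sum += a[j];
                best = max(best, sum);
            }
        }
        assert(result == best);
    }
}
do
{
    int best = a[0];
    int cur = a[0];
    foreach (x; a[1 .. $])
    {
        cur = max(x, cur + x);   // extend the current run or start a new one
        best = max(best, cur);
    }
    return best;
}

void main()
{
    writeln(maxSubarraySum([-2, 1, -3, 4, -1, 2, 1, -5, 4])); // 6
}
```

Because the check is attached to the function, it runs on whatever inputs the program actually produces during development, not only on the cases listed in a unittest block.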
On 11.08.2011 8:58, Don wrote:

bearophile wrote:

I stand corrected about invariants, somehow I wasn't considering them a part of contracts.

Contracts don't replace unittests, they complement each other.

You're conflating a couple of things here. Invariants are tremendously helpful for structs as well as classes.

They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here.

It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.

"out" contracts seem to be almost useless, unless you have a theorem prover. The reason is, that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting. They have very little in common with 'in' contracts. I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.

--
Dmitry Olshansky
Aug 11 2011
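To illustrate the point about struct invariants made above, here is a minimal sketch with a hypothetical Fraction type (not taken from Phobos or FReD). In non-release builds the invariant is checked automatically around public member functions, which replaces a manually called sanity-check function.

```d
import std.stdio : writeln;

struct Fraction
{
    private int num;
    private int den = 1;

    // Checked automatically around public member functions
    // in non-release builds.
    invariant()
    {
        assert(den != 0, "denominator must not be zero");
    }

    this(int n, int d)
    {
        num = n;
        den = d;
    }

    void scale(int k)
    {
        num *= k;
        den *= k; // scale(0) would violate the invariant on exit
    }

    double value() const
    {
        return cast(double) num / den;
    }
}

void main()
{
    auto f = Fraction(1, 2);
    f.scale(3);
    writeln(f.value); // 0.5
}
```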
On 2011-08-10 18:02, bearophile wrote:

Dmitry Olshansky:

I always add a blank line before and after statements.

--
/Jacob Carlborg

To get a small no-crap-included beta package see download section of https://github.com/blackwhale/FReD for .7zs.

When you write some English text you don't write a single block of text, you organize it into paragraphs, and paragraphs into chapters, chapters into sections, sections into books, etc. Some time ago I came to understand that paragraphs are very good in source code too. So I suggest you add a blank line here and there inside your functions to separate them into paragraphs. I can't give you a style rule, you will need to create your own style, but often a function that's more than 10 lines long needs one or more blank lines inside (some people say that every time you see one such paragraph in a function, especially if it has a comment before it, you need to perform an "extract method" to improve the code. I believe this is bad advice).
Aug 10 2011
On 10.08.2011 14:42, Dmitry Olshansky wrote:

In case I failed to mention it before, I'm working on the project codenamed FReD that is aimed at a ~100%* source-level-compatible overhaul of std.regex, that uses better implementation techniques, provides modern Unicode support and common syntax riches.

I think it's time for a public beta release, since it _should_ be ready for mainstream usage. There are some rough edges, and a couple of issues that I'm aware of, but none of them show up in realistic use cases. In order to avoid unexpected regressions I'd be glad if current std.regex users try it for their projects/tests.

To get a small no-crap-included beta package see download section of https://github.com/blackwhale/FReD for .7zs. I'll upload newer packages as bugs get exposed and fixed. Alternatively, if you are comfortable with git you may just git clone the entire repo.

Some helpful notes (same as README) can be found here: https://github.com/blackwhale/FReD/wiki/Beta-release

Caveats: In order for it to compile a tiny change to the 2.054 source is needed (no need to recompile Phobos! it's only in templates): patch std.algorithm.cmp according to this diff https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4631 <https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4633> and to get CTFE features working add the if(!__ctfe) listed in the next diff on the same webpage.
(this is already upstream, so if you're using a fork of phobos just pull this in)

* some API problems might lead to a breaking change, though it didn't happen in this release

Meanwhile the new beta is up:
https://github.com/downloads/blackwhale/FReD/FReD_beta1.7z
or check out the "stable" branch
https://github.com/blackwhale/FReD/tree/stable
(as dawgfoto noticed, the master branch tends to break on 64-bit as I develop primarily on 32-bit)

With prominent changes being:
- fixed a horrible memory corruption with regex having certain groups/backrefs in lookaround
- no GC heap activity during matching in all engines, except as a workaround for bug http://d.puremagic.com/issues/show_bug.cgi?id=6199
- new prefix searcher, featuring up to 40x search speed-up on patterns with semi-fixed prefixes, e.g.
  \b(https?|ftp|file)://\S+
  and
  ([0-9][0-9]?)/([0-9][0-9]?)/([0-9][0-9]([0-9][0-9])?)
- bool opCast for RegexMatch for a nice "test if not empty" syntax, as suggested by Steven
- lots of small fixes and optimizations

--
Dmitry Olshansky
Aug 16 2011
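The "test if not empty" syntax mentioned in the changelog above relies on D rewriting a struct used in a condition to a call of opCast!bool. A minimal sketch with a hypothetical Match struct follows; it is not FReD's actual RegexMatch, only an illustration of the language mechanism.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a match result; not FReD's RegexMatch.
struct Match
{
    string[] hits;

    // Called implicitly when a Match is used in a condition
    // such as if (m) or !m.
    bool opCast(T : bool)() const
    {
        return hits.length != 0;
    }
}

void main()
{
    auto found = Match(["ftp://example.org"]);
    auto none = Match();

    if (found)
        writeln("matched: ", found.hits[0]);

    if (!none)
        writeln("no match");
}
```

Since opCast is otherwise only reached through an explicit cast, the type does not silently convert to bool in other contexts such as function arguments.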
Dmitry Olshansky:

I have not patched DMD, but it gives me some problem here:

    void parseFlags(S)(S flags)
    {
        foreach(ch; flags)//flags are ASCII anyway
        {
            switch(ch)
            {
                foreach(i, op; __traits(allMembers, RegexOption))
                {
                    case RegexOptionNames[i]:
                        if(re_flags & mixin("RegexOption."~op))
                            throw new RegexException(text("redundant flag specified: ",ch));
                        re_flags |= mixin("RegexOption."~op);
                        break;
                }
                default:
                    if(__ctfe)
                        assert(text("unknown regex flag '",ch,"'"));
                    else
                        new RegexException(text("unknown regex flag '",ch,"'"));
            }

To better see the situation I have written a small test case:

    import std.typetuple: TypeTuple;

    enum RegexOption : uint { A, B, C } // no need to put a semicolon here

    alias TypeTuple!(RegexOption.A, RegexOption.B, RegexOption.C) RegexOptionNames;

    void main() {
        RegexOption ch;

        switch (ch) {
            foreach (i, op; __traits(allMembers, RegexOption))
                case RegexOptionNames[i]:
                    break;
            default:
                assert(0);
        }
    }

    test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
    test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
    test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
    test.d(14): Error: switch case fallthrough - use 'goto default;' if intended

This used to work, I think. The new DMD switch analysis seems to have a bug.

-------------

If you want a benchmark, to compare it with other implementations, there is this one:
http://shootout.alioth.debian.org/debian/program.php?test=regexdna&lang=gdc&id=4

Bye,
bearophile

To get a small no-crap-included beta package see download section of https://github.com/blackwhale/FReD for .7zs.
Aug 16 2011
On 17.08.2011 3:47, bearophile wrote:

Dmitry Olshansky:

Yes, that's a bug. But it's not a regression, I assume you started to compile with -w, that's when it happens IIRC. I almost forgot about it, thanks for uncovering it again, you may as well file it.

I have not patched DMD, but it gives me some problem here:

    void parseFlags(S)(S flags)
    {
        foreach(ch; flags)//flags are ASCII anyway
        {
            switch(ch)
            {
                foreach(i, op; __traits(allMembers, RegexOption))
                {
                    case RegexOptionNames[i]:
                        if(re_flags& mixin("RegexOption."~op))
                            throw new RegexException(text("redundant flag specified: ",ch));
                        re_flags |= mixin("RegexOption."~op);
                        break;
                }
                default:
                    if(__ctfe)
                        assert(text("unknown regex flag '",ch,"'"));
                    else
                        new RegexException(text("unknown regex flag '",ch,"'"));
            }

To better see the situation I have written a small test case:

    import std.typetuple: TypeTuple;

    enum RegexOption : uint { A, B, C } // no need to put a semicolon here

    alias TypeTuple!(RegexOption.A, RegexOption.B, RegexOption.C) RegexOptionNames;

    void main() {
        RegexOption ch;

        switch (ch) {
            foreach (i, op; __traits(allMembers, RegexOption))
                case RegexOptionNames[i]:
                    break;
            default:
                assert(0);
        }
    }

    test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
    test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
    test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
    test.d(14): Error: switch case fallthrough - use 'goto default;' if intended

This used to work, I think. The new DMD switch analysis seems to have a bug.

-------------

To get a small no-crap-included beta package see download section of https://github.com/blackwhale/FReD for .7zs.

If you want a benchmark, to compare it with other implementations, there is this one:
http://shootout.alioth.debian.org/debian/program.php?test=regexdna&lang=gdc&id=4

All in due time, though this one involves semi-fixed patterns, hm ... very promising.

--
Dmitry Olshansky
Aug 17 2011
Dmitry Olshansky:

Yes, that's a bug. But it's not a regression,

I think it's a DMD regression, probably introduced with the recent changes in switch semantics. DMD 2.042 doesn't have this bug.

I assume you started to compile with -w,

I suggest that Phobos devs use -w too.

thanks for uncovering it again, you may as well file it.

OK, I'll add it to Bugzilla.

Bye,
bearophile
Aug 17 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6518

thanks for uncovering it again, you may as well file it.

OK, I'll add it to Bugzilla.
Aug 17 2011