digitalmars.D - Bye bye, fast compilation times
- H. S. Teoh (37/37) Feb 05 2018 One of my D projects for the past while has been taking unusually long
- psychoticRabbit (11/17) Feb 05 2018 regex is not the only one I avoid..
- psychoticRabbit (4/13) Feb 05 2018 oh.. and for an even bigger laugh... -O -release (ldc2 took ~10
- Steven Schveighoffer (8/32) Feb 05 2018 I was surprised at this, then I looked at the first line of isEmail:
- rikki cattermole (3/39) Feb 05 2018 On that note, we really should remove it performance-aside, you cannot
- Dmitry Olshansky (6/38) Feb 05 2018 That’s really bad idea - isEmail is template so the burden of
- Steven Schveighoffer (7/48) Feb 05 2018 Obviously it is horrible. On my mac, it took about 2.5 seconds to
- Dmitry Olshansky (6/18) Feb 05 2018 Just use the run-time version, it’s not that much slower. But
- Nathan S. (6/23) Feb 06 2018 FYI I've made a pull request that replaces uses of regexes in
- Dmitry Olshansky (4/14) Feb 06 2018 Then again if you may not need regex for IPv4 / IPv6.
- Nick Sabalausky (Abscissa) (3/21) Feb 06 2018 If the regex string isn't dependent on the template's params, just move
- H. S. Teoh (32/54) Feb 06 2018 Yeah, ctRegex is a bear at compile-time. Why can't we just use a
- Steven Schveighoffer (10/38) Feb 06 2018 You may not realize that this actually compiles it for ALL modules that
- Walter Bright (54/56) Feb 06 2018 std.string.isEmail() in D1 was a simple function. Maybe regex is just th...
- Steven Schveighoffer (7/16) Feb 06 2018 The regex in question I think is to ensure an email address like
- Timothee Cour (6/25) Feb 06 2018 another weird gotcha:
- Walter Bright (4/9) Feb 06 2018 Regex is well known to not always be the best solution for string proces...
- H. S. Teoh (8/19) Feb 06 2018 Are you sure? What about lex and its successors, like flex?
- Nathan S. (18/19) Feb 07 2018 Some years ago I was surprised when I saw this in Clojure's
- Walter Bright (6/30) Feb 07 2018 Yes, I'm sure somebody does it. And now that regex has produced a match,...
- bauss (12/16) Feb 09 2018 An invalid IP is not necessarily an invalid email though.
- aliak (4/10) Feb 11 2018 +1.
- Adam D. Ruppe (5/7) Feb 11 2018 The isemail function isn't about validating email addresses. It
- aliak (22/26) Feb 11 2018 *valid email format... (is better? :) )
- Jacob Carlborg (5/7) Feb 06 2018 If I recall correctly, the current implementation of std.net.isEmail was...
- Walter Bright (3/10) Feb 06 2018 Regardless of whether it was requested by me or not, if the current vers...
- Steven Schveighoffer (4/15) Feb 06 2018 The regex problem is being solved:
- Andres Clari (15/31) Feb 06 2018 That's fixing just the "isEmail" issue which is good I guess.
- H. S. Teoh (21/29) Feb 06 2018 I seem to vaguely recall that in some cases, ctRegex might even perform
- Andres Clari (6/29) Feb 06 2018 Well I'm using vibe.d, but not templates on this project, just a
- H. S. Teoh (26/28) Feb 06 2018 Not that I know of. Basically what I did was:
- Walter Bright (2/5) Feb 06 2018 Great!
- psychoticRabbit (2/5) Feb 06 2018 C .. D style. I love it! (bugs and all).
- Dmitry Olshansky (15/44) Feb 05 2018 There is a fuckton of templates involved, plus a couple of tries
- H. S. Teoh (27/39) Feb 06 2018 Heh, dmd's famous memory usage is causing me tons of grief on low-memory
- Stefan Koch (17/24) Feb 07 2018 There are some good news for you.
- Stefan Koch (17/24) Feb 07 2018 There are some good news for you.
- Bastiaan Veelo (4/10) Feb 07 2018 What is the preferred place for this?
- Stefan Koch (4/15) Feb 08 2018 I'd prefer a pr against
- Stefan Koch (4/15) Feb 08 2018 Corrected link:
- Bastiaan Veelo (7/13) Feb 11 2018 Is this on someone's agenda? It probably needs an enhancement
- Dmitry Olshansky (12/19) Feb 12 2018 Was once on my together with other OS memory manager functions,
- Stefan Koch (8/19) Feb 13 2018 Since dmd is only targeting x86/x86_64 there is really just one
- Martin Tschierschke (16/42) Feb 08 2018 Thank you for this finding!
- Nick Sabalausky (Abscissa) (6/13) Feb 08 2018 Unfortunately that depends completely on what buildsystem you're using.
One of my D projects for the past while has been taking unusually long times to compile. This morning, I finally decided to sit down and figure out exactly why. What I found was rather disturbing:

------
import std.regex;
void main() {
    auto re = regex(``);
}
------

Compile command: time dmd -c test.d

Output:
------
real 0m3.113s
user 0m2.884s
sys 0m0.226s
------

Comment out the call to `regex()`, and I get:
------
real 0m0.285s
user 0m0.262s
sys 0m0.023s
------

Clearly, something is wrong if the mere act of compiling a regex causes a 4-line program to take *3 seconds* to compile, where normally dmd takes less than a second. Apparently, the offending Phobos PR was merged late last year:

https://issues.dlang.org/show_bug.cgi?id=18378

This is a serious slap in the face to dmd's reputation of super-fast compilation. Makes our "fast code, fast" slogan look more and more ironic. :-(

(Note: this particular regression is in *compilation* times; it's not directly related to the *performance* of the regex code itself. The latter department has also suffered a regression; see for example: https://github.com/dlang/phobos/pull/5981.)

T

--
Little kids, little troubles.
Feb 05 2018
On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
> Comment out the call to `regex()`, and I get:
> ------
> real 0m0.285s
> user 0m0.262s
> sys 0m0.023s
> ------

regex is not the only one I avoid. How long do you think this takes to compile? (Try ldc2 too, just for laughs ;-)

----
import std.net.isemail;
void main() {
    auto checkEmail = "someone@somewhere.com".isEmail();
}
----
Feb 05 2018
On Tuesday, 6 February 2018 at 04:09:24 UTC, psychoticRabbit wrote:
> how long you think this takes to compile? (try ldc2 too ..just for laughs ;-)
> ----
> import std.net.isemail;
> void main() {
>     auto checkEmail = "someone@somewhere.com".isEmail();
> }
> ----

oh.. and for an even bigger laugh... -O -release (ldc2 took ~10 seconds)
Feb 05 2018
On 2/5/18 11:09 PM, psychoticRabbit wrote:
> On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
>> Comment out the call to `regex()`, and I get:
>> ------
>> real 0m0.285s
>> user 0m0.262s
>> sys 0m0.023s
>> ------
> regex is not the only one I avoid.. how long you think this takes to compile? (try ldc2 too ..just for laughs ;-)
> ----
> import std.net.isemail;
> void main() {
>     auto checkEmail = "someone@somewhere.com".isEmail();
> }
> ----

I was surprised at this, then I looked at the first line of isEmail:

    static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~
        `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[]));

So it's really still related to regex.

-Steve
Feb 05 2018
On 06/02/2018 4:35 AM, Steven Schveighoffer wrote:On 2/5/18 11:09 PM, psychoticRabbit wrote:On that note, we really should remove it performance-aside, you cannot really trust it.On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:I was surprised at this, then I looked at the first line of isEmail: static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~ `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[])); So it's really still related to regex. -SteveComment out the call to `regex()`, and I get: ------ real 0m0.285s user 0m0.262s sys 0m0.023s ------regex is not the only one I avoid.. how long you think this takes to compile? (try ldc2 too ..just for laughs ;-) ---- import std.net.isemail; void main() { auto checkEmail = "someone somewhere.com".isEmail(); } ----
Feb 05 2018
On Tuesday, 6 February 2018 at 04:35:42 UTC, Steven Schveighoffer wrote:
> On 2/5/18 11:09 PM, psychoticRabbit wrote:
>> On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
>>> Comment out the call to `regex()`, and I get:
>>> ------
>>> real 0m0.285s
>>> user 0m0.262s
>>> sys 0m0.023s
>>> ------
>> regex is not the only one I avoid.. how long you think this takes to compile? (try ldc2 too ..just for laughs ;-)
>> ----
>> import std.net.isemail;
>> void main() {
>>     auto checkEmail = "someone@somewhere.com".isEmail();
>> }
>> ----
> I was surprised at this, then I looked at the first line of isEmail:
>
>     static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~
>         `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[]));
>
> So it's really still related to regex.
>
> -Steve

That’s a really bad idea: isEmail is a template, so the burden of the freaking slow ctRegex is paid on a per-instantiation basis. Could be horrible with separate compilation.
Feb 05 2018
On 2/6/18 12:35 AM, Dmitry Olshansky wrote:On Tuesday, 6 February 2018 at 04:35:42 UTC, Steven Schveighoffer wrote:Obviously it is horrible. On my mac, it took about 2.5 seconds to compile this one line. I'm not sure how to fix it though... I suppose you could make it 3 overloads, but this defeats a lot of the purpose of having templates in the first place. -SteveOn 2/5/18 11:09 PM, psychoticRabbit wrote:That’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:I was surprised at this, then I looked at the first line of isEmail: static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~ `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[])); So it's really still related to regex.Comment out the call to `regex()`, and I get: ------ real 0m0.285s user 0m0.262s sys 0m0.023s ------regex is not the only one I avoid.. how long you think this takes to compile? (try ldc2 too ..just for laughs ;-) ---- import std.net.isemail; void main() { auto checkEmail = "someone somewhere.com".isEmail(); } ----
Feb 05 2018
On Tuesday, 6 February 2018 at 05:45:35 UTC, Steven Schveighoffer wrote:
> On 2/6/18 12:35 AM, Dmitry Olshansky wrote:
>> That’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.
> Obviously it is horrible. On my mac, it took about 2.5 seconds to compile this one line. I'm not sure how to fix it though... I suppose you could make it 3 overloads, but this defeats a lot of the purpose of having templates in the first place.
>
> -Steve

Just use the run-time version, it’s not that much slower. But then again, `static ipRegex = regex(...)` will parse and build the regex at CTFE. Maybe lazy init?
Feb 05 2018
On Tuesday, 6 February 2018 at 06:11:55 UTC, Dmitry Olshansky wrote:On Tuesday, 6 February 2018 at 05:45:35 UTC, Steven Schveighoffer wrote:FYI I've made a pull request that replaces uses of regexes in std.net.isemail. It turns out they weren't being used for anything indispensable. Import benchmark results were encouraging. https://github.com/dlang/phobos/pull/6129On 2/6/18 12:35 AM, Dmitry Olshansky wrote:Just use the run-time version, it’s not that much slower. But then again static ipRegex = regex(...) will parse and build regex at CTFE. Maybe lazy init?That’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.Obviously it is horrible. On my mac, it took about 2.5 seconds to compile this one line. I'm not sure how to fix it though... I suppose you could make
Feb 06 2018
On Tuesday, 6 February 2018 at 13:51:01 UTC, Nathan S. wrote:
>> Just use the run-time version, it’s not that much slower. But then again static ipRegex = regex(...) will parse and build regex at CTFE. Maybe lazy init?
> FYI I've made a pull request that replaces uses of regexes in std.net.isemail. It turns out they weren't being used for anything indispensable. Import benchmark results were encouraging.
> https://github.com/dlang/phobos/pull/6129

Then again, you may not need a regex for IPv4 / IPv6 at all. In theory it should have been the go-to case for ctRegex, but not at the cost of such horrible compile times.
Feb 06 2018
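To illustrate the point that an IPv4 check does not need a regex, here is a minimal hand-written sketch. The function name `isDottedQuad` is hypothetical and this is not the code from the pull request mentioned above; it only assumes that the check in question is "four dot-separated decimal fields, each between 0 and 255", which is roughly what the quoted `ipRegex` pattern accepts.

```d
import std.algorithm.iteration : splitter;
import std.ascii : isDigit;

/// Hypothetical helper (not the Phobos implementation): returns true if `s`
/// consists of four dot-separated decimal fields, each in the range 0 .. 255.
bool isDottedQuad(const(char)[] s)
{
    size_t fields;
    foreach (field; s.splitter('.'))
    {
        // Each field must be 1 to 3 characters long.
        if (field.length == 0 || field.length > 3)
            return false;

        uint value;
        foreach (char c; field)
        {
            if (!isDigit(c))
                return false;
            value = value * 10 + (c - '0');
        }
        if (value > 255)
            return false;
        ++fields;
    }
    return fields == 4;
}

void main()
{
    assert(isDottedQuad("192.168.0.5"));
    assert(!isDottedQuad("256.0.0.1"));
    assert(!isDottedQuad("1.2.3"));
}
```

A loop like this instantiates almost no templates, so it should avoid the std.regex compile-time cost discussed in this thread, at the price of a few more lines of code.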
On 02/06/2018 01:11 AM, Dmitry Olshansky wrote:On Tuesday, 6 February 2018 at 05:45:35 UTC, Steven Schveighoffer wrote:If the regex string isn't dependent on the template's params, just move the regex outside the template.On 2/6/18 12:35 AM, Dmitry Olshansky wrote:Just use the run-time version, it’s not that much slower. But then again static ipRegex = regex(...) will parse and build regex at CTFE. Maybe lazy init?That’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.Obviously it is horrible. On my mac, it took about 2.5 seconds to compile this one line. I'm not sure how to fix it though... I suppose you could make
Feb 06 2018
On Tue, Feb 06, 2018 at 05:35:44AM +0000, Dmitry Olshansky via Digitalmars-d wrote:
> On Tuesday, 6 February 2018 at 04:35:42 UTC, Steven Schveighoffer wrote:
[...]
>> On 2/5/18 11:09 PM, psychoticRabbit wrote:
>>> ----
>>> import std.net.isemail;
>>> void main() {
>>>     auto checkEmail = "someone@somewhere.com".isEmail();
>>> }
>>> ----
>> I was surprised at this, then I looked at the first line of isEmail:
>>
>>     static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~
>>         `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[]));
>>
>> So it's really still related to regex.

Yeah, ctRegex is a bear at compile-time. Why can't we just use a runtime regex? It will at least take "only" 3 seconds to compile. :-D Or just don't use a regex at all.

> That’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.
[...]

I'm not sure I'm seeing the value of using ctRegex here. What's wrong with a module static runtime regex initialized by a static this()?

And before anyone complains about initializing the regex if user code never actually uses it, it's possible to use static this() on an as-needed basis:

    template ipRegex()
    {
        // Eponymous templates FTW!
        Regex!char ipRegex;

        static this()
        {
            ipRegex = regex(`blah blah blah`);
        }
    }

    auto isEmail(... blah blah ...)
    {
        ...
        if (ipRegex.match(...)) ...
        ...
    }

Basically, if `ipRegex` is never referenced, the template is never instantiated and the static this() basically doesn't exist. :-D Pay-as-you-go FTW!

T

--
If you want to solve a problem, you need to address its root cause, not just its symptoms. Otherwise it's like treating cancer with Tylenol...
Feb 06 2018
On 2/6/18 2:07 PM, H. S. Teoh wrote:
> I'm not sure I'm seeing the value of using ctRegex here. What's wrong with a module static runtime regex initialized by a static this()?

No, I'd rather have it initialized on first call.

> And before anyone complains about initializing the regex if user code never actually uses it, it's possible to use static this() on an as-needed basis:
>
>     template ipRegex()
>     {
>         // Eponymous templates FTW!
>         Regex!char ipRegex;
>
>         static this()
>         {
>             ipRegex = regex(`blah blah blah`);
>         }
>     }
>
>     auto isEmail(... blah blah ...)
>     {
>         ...
>         if (ipRegex.match(...)) ...
>         ...
>     }
>
> Basically, if `ipRegex` is never referenced, the template is never instantiated and the static this() basically doesn't exist. :-D Pay-as-you-go FTW!

You may not realize that this actually compiles it for ALL modules that use it, and the compiler puts in a gate to prevent it from running more than once. So you pay every time anyways (compile-time wise at least).

It also makes any importing module now a module that defines a static ctor, so cycles are much more likely.

In any case, there is a PR in the works that should eliminate the need for regex altogether: https://github.com/dlang/phobos/pull/6129

-Steve
Feb 06 2018
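For readers wondering what "initialized on first call" could look like, here is a minimal sketch. The names `ipv4Regex` and `looksLikeIPv4` are hypothetical and not part of Phobos; the pattern is the one quoted earlier in the thread, anchored with `^` instead of `\b` as a simplifying assumption. The regex is built at run time the first time it is needed, so there is no CTFE construction and no static constructor.

```d
import std.regex : Regex, regex, matchFirst;

/// Hypothetical accessor: builds the runtime regex on first use.
/// Function-local statics are thread-local in D, so each thread
/// builds its own copy once.
Regex!char ipv4Regex()
{
    static Regex!char re;
    static bool initialized;
    if (!initialized)
    {
        re = regex(`^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}` ~
                   `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`);
        initialized = true;
    }
    return re;
}

/// Hypothetical caller that pays for the regex only when actually invoked.
bool looksLikeIPv4(string s)
{
    return !matchFirst(s, ipv4Regex()).empty;
}

void main()
{
    assert(looksLikeIPv4("192.168.0.5"));
    assert(!looksLikeIPv4("256.1.1.1"));
}
```

Note that this only moves the regex construction to run time. Importing and instantiating std.regex still costs compile time, which is the slowdown reported at the start of this thread.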
On 2/5/2018 9:35 PM, Dmitry Olshansky wrote:
> That’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.

std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem.

---------------------- std.string.isEmail --------------------

/***************************
 * Does string s[] start with an email address?
 * Returns:
 *      null    it does not
 *      char[]  it does, and this is the slice of s[] that is that email address
 * References:
 *      RFC2822
 */
char[] isEmail(char[] s)
{
    size_t i;

    if (!isalpha(s[0]))
        goto Lno;

    for (i = 1; 1; i++)
    {
        if (i == s.length)
            goto Lno;
        auto c = s[i];
        if (isalnum(c))
            continue;
        if (c == '-' || c == '_' || c == '.')
            continue;
        if (c != '@')
            goto Lno;
        i++;
        break;
    }
    //writefln("test1 '%s'", s[0 .. i]);

    /* Now do the part past the '@' */
    size_t lastdot;
    for (; i < s.length; i++)
    {
        auto c = s[i];
        if (isalnum(c))
            continue;
        if (c == '-' || c == '_')
            continue;
        if (c == '.')
        {
            lastdot = i;
            continue;
        }
        break;
    }
    if (!lastdot || (i - lastdot != 3 && i - lastdot != 4))
        goto Lno;

    return s[0 .. i];

Lno:
    return null;
}
Feb 06 2018
On 2/6/18 3:11 PM, Walter Bright wrote:
> On 2/5/2018 9:35 PM, Dmitry Olshansky wrote:
>> That’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.
> std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem.

The regex in question I think is to ensure an email address like abc@192.168.0.5 has a valid IP address. The D1 function doesn't support that requirement.

I admit, I've never used it, so I don't know why it needs to be so complex. But I assume some people depend on that functionality.

-Steve
Feb 06 2018
another weird gotcha: auto s="foo".isEmail; writeln(s.toString); // ok writeln(s); // compile error On Tue, Feb 6, 2018 at 12:30 PM, Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 2/6/18 3:11 PM, Walter Bright wrote:On 2/5/2018 9:35 PM, Dmitry Olshansky wrote:The regex in question I think is to ensure an email address like abc 192.168.0.5 has a valid IP address. The D1 function doesn't support that requirement. I admit, I've never used it, so I don't know why it needs to be so complex. But I assume some people depend on that functionality. -SteveThat’s really bad idea - isEmail is template so the burden of freaking slow ctRegex is paid on per instantiation basis. Could be horrible with separate compilation.std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem.
Feb 06 2018
On 2/6/2018 12:30 PM, Steven Schveighoffer wrote:
> The regex in question I think is to ensure an email address like abc@192.168.0.5 has a valid IP address. The D1 function doesn't support that requirement. I admit, I've never used it, so I don't know why it needs to be so complex. But I assume some people depend on that functionality.

Regex is well known to not always be the best solution for string processing tasks. For example, it does not work well at all where recursion is desired, and nobody uses regex for lexer in a compiler.
Feb 06 2018
On Tue, Feb 06, 2018 at 02:29:07PM -0800, Walter Bright via Digitalmars-d wrote:On 2/6/2018 12:30 PM, Steven Schveighoffer wrote:Are you sure? What about lex and its successors, like flex? Of course, one could argue that the generated code isn't strictly a regex implementation in the same way as std.regex... but isn't that just a QoI issue? T -- Life would be easier if I had the source code. -- YHLThe regex in question I think is to ensure an email address like abc 192.168.0.5 has a valid IP address. The D1 function doesn't support that requirement. I admit, I've never used it, so I don't know why it needs to be so complex. But I assume some people depend on that functionality.Regex is well known to not always be the best solution for string processing tasks. For example, it does not work well at all where recursion is desired, and nobody uses regex for lexer in a compiler.
Feb 06 2018
On Tuesday, 6 February 2018 at 22:29:07 UTC, Walter Bright wrote:
> nobody uses regex for lexer in a compiler.

Some years ago I was surprised when I saw this in Clojure's source code. It appears to still be there today:

https://github.com/clojure/clojure/blob/1215ba346ffea3fe48def6ec70542e3300b6f9ed/src/jvm/clojure/lang/LispReader.java#L66-L73

---
static Pattern symbolPat = Pattern.compile("[:]?([\\D&&[^/]].*/)?(/|[\\D&&[^/]][^/]*)");
//static Pattern varPat = Pattern.compile("([\\D&&[^:\\.]][^:\\.]*):([\\D&&[^:\\.]][^:\\.]*)");
//static Pattern intPat = Pattern.compile("[-+]?[0-9]+\\.?");
static Pattern intPat =
        Pattern.compile(
                "([-+]?)(?:(0)|([1-9][0-9]*)|0[xX]([0-9A-Fa-f]+)|0([0-7]+)|([1-9][0-9]?)[rR]([0-9A-Za-z]+)|0[0-9]+)(N)?");
static Pattern ratioPat = Pattern.compile("([-+]?[0-9]+)/([0-9]+)");
static Pattern floatPat = Pattern.compile("([-+]?[0-9]+(\\.[0-9]*)?([eE][-+]?[0-9]+)?)(M)?");
---
Feb 07 2018
On 2/7/2018 1:07 PM, Nathan S. wrote:
> On Tuesday, 6 February 2018 at 22:29:07 UTC, Walter Bright wrote:
>> nobody uses regex for lexer in a compiler.
> Some years ago I was surprised when I saw this in Clojure's source code. It appears to still be there today:
>
> https://github.com/clojure/clojure/blob/1215ba346ffea3fe48def6ec70542e3300b6f9ed/src/jvm/clojure/lang/LispReader.java#L66-L73
>
> ---
> static Pattern symbolPat = Pattern.compile("[:]?([\\D&&[^/]].*/)?(/|[\\D&&[^/]][^/]*)");
> //static Pattern varPat = Pattern.compile("([\\D&&[^:\\.]][^:\\.]*):([\\D&&[^:\\.]][^:\\.]*)");
> //static Pattern intPat = Pattern.compile("[-+]?[0-9]+\\.?");
> static Pattern intPat =
>         Pattern.compile(
>                 "([-+]?)(?:(0)|([1-9][0-9]*)|0[xX]([0-9A-Fa-f]+)|0([0-7]+)|([1-9][0-9]?)[rR]([0-9A-Za-z]+)|0[0-9]+)(N)?");
> static Pattern ratioPat = Pattern.compile("([-+]?[0-9]+)/([0-9]+)");
> static Pattern floatPat = Pattern.compile("([-+]?[0-9]+(\\.[0-9]*)?([eE][-+]?[0-9]+)?)(M)?");
> ---

Yes, I'm sure somebody does it. And now that regex has produced a match, you have to scan it again to turn it into a number, making for slow lexing. And if regex doesn't produce a match, you get a generic error message rather than something specific like "character 'A' is not allowed in a numeric literal". (Generic error messages are one of the downsides of using tools like lex and yacc.)
Feb 07 2018
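A small sketch of the hand-written alternative described above may help: the literal is scanned once, its value is accumulated during the scan, and a specific diagnostic is produced on a bad character. The `LexResult` struct and `lexDecimal` function are hypothetical names invented for this illustration (they are not taken from dmd or any other compiler), and overflow checking is deliberately left out.

```d
import std.ascii : isAlpha, isDigit;
import std.format : format;

/// Hypothetical result type for this sketch.
struct LexResult
{
    bool ok;          // true if a decimal literal was recognized
    ulong value;      // numeric value, computed during the single scan
    size_t consumed;  // number of characters examined before stopping
    string error;     // specific diagnostic when ok is false
}

/// Scans a decimal integer literal at the start of `src` in one pass.
LexResult lexDecimal(string src)
{
    LexResult r;
    size_t i;
    for (; i < src.length; ++i)
    {
        immutable c = src[i];
        if (isDigit(c))
        {
            // Accumulate the value while scanning: no second pass needed.
            r.value = r.value * 10 + (c - '0');
            continue;
        }
        if (isAlpha(c))
        {
            // A letter glued to the digits gets a specific message.
            r.error = format("character '%s' is not allowed in a numeric literal", c);
            r.consumed = i;
            return r;
        }
        break; // any other character simply ends the literal
    }
    r.consumed = i;
    r.ok = i > 0;
    if (!r.ok)
        r.error = "expected a digit";
    return r;
}

void main()
{
    auto good = lexDecimal("123+4");
    assert(good.ok && good.value == 123 && good.consumed == 3);

    auto bad = lexDecimal("12A");
    assert(!bad.ok);
    assert(bad.error == "character 'A' is not allowed in a numeric literal");
}
```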
On Tuesday, 6 February 2018 at 20:30:42 UTC, Steven Schveighoffer wrote:
> The regex in question I think is to ensure an email address like abc@192.168.0.5 has a valid IP address. The D1 function doesn't support that requirement.
>
> -Steve

An invalid IP is not necessarily an invalid email though.

You'd be surprised how much __garbage__ a valid email actually can contain.

https://www.w3.org/Protocols/rfc822/

Generally the best way to validate an email is just to check if there is a value before the @ and a value after it.

The real way to validate an email is to check if the email exists on an SMTP server, BUT some SMTP servers will not provide such information (such as gmail, I think?) and thus you can't really rely on that either.
Feb 09 2018
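The "value before and value after" check suggested above can be sketched in a few lines. This assumes the separator meant is the `@` sign (the archive appears to have stripped that character from posts), and the function name `hasValueAroundAt` is hypothetical.

```d
import std.string : indexOf;

/// Hypothetical minimal check: there is at least one character before the
/// first '@' and at least one character after it. Nothing else is validated.
bool hasValueAroundAt(string s)
{
    immutable at = s.indexOf('@');
    // at > 0 means something precedes '@'; the second test means something follows it.
    return at > 0 && cast(size_t) at + 1 < s.length;
}

void main()
{
    assert(hasValueAroundAt("someone@somewhere.com"));
    assert(!hasValueAroundAt("@somewhere.com"));
    assert(!hasValueAroundAt("someone@"));
    assert(!hasValueAroundAt("no separator here"));
}
```

A check this loose accepts plenty of strings that no mail server would, which is the trade-off being argued for: reject only the obviously malformed input and leave real validation to actually sending mail.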
On Friday, 9 February 2018 at 14:19:56 UTC, bauss wrote:
> Generally the best way to validate an email is just to check if there is a value before and a value after. The real way to validate an email is to check if the email exists on a SMTP server, BUT some SMTP servers will not provide such information (Such as gmail I think?) and thus you can't really rely on that either.

+1.

If anyone wants to do email validation this should be read first:
https://hackernoon.com/the-100-correct-way-to-validate-email-addresses-7c4818f24643
Feb 11 2018
On Sunday, 11 February 2018 at 16:26:19 UTC, aliak wrote:
> If anyone wants to do email validation this should be read first:

The isemail function isn't about validating email addresses. It is just about recognizing something that looks like one. Just like isurl doesn't actually try to fetch the site to see if it is broken, it just sees if it looks like one as a first step.
Feb 11 2018
On Sunday, 11 February 2018 at 16:35:35 UTC, Adam D. Ruppe wrote:
> The isemail function isn't about validating email addresses. It is just about recognizing something that looks like one. just like isurl doesn't actually try to fetch the site to see if it is broken, it just sees if it looks like one as a first step.

*valid email format... (is better? :) )

When someone says isurl checks if a string is a valid url, I don't think the general assumption is that it makes a network call to check if it is a resolvable url. (could be mistaken of course, but not to me at least). Isurl checks that the format is correct. Same for isemail. The isemail API and the docs all use the term valid as well.

Plus, to further see how hard it is to validate an email, these are apparently all erroneous results (granted wikipedia could be wrong as well):

import std.net.isemail, std.stdio;
void main() {
    isEmail("john.smith(comment)@example.com").valid.writeln; // is valid, prints false
    isEmail("user@[2001:DB8::1]").valid.writeln; // is valid, prints false
    isEmail(`" "@example.org`).valid.writeln; // not valid, prints true
    isEmail(`"very.unusual.@.unusual.com"@example.com`).valid.writeln; // not valid, prints true
}
Feb 11 2018
On 2018-02-06 21:11, Walter Bright wrote:
> std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem.

If I recall correctly, the current implementation of std.net.isEmail was requested by you.

--
/Jacob Carlborg
Feb 06 2018
On 2/6/2018 2:03 PM, Jacob Carlborg wrote:On 2018-02-06 21:11, Walter Bright wrote:Regardless of whether it was requested by me or not, if the current version is not working for us, we need to explore alternatives.std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem.If I recall correctly, the current implementation of std.net.isEmail was requested by you.
Feb 06 2018
On 2/6/18 5:23 PM, Walter Bright wrote:On 2/6/2018 2:03 PM, Jacob Carlborg wrote:The regex problem is being solved: https://github.com/dlang/phobos/pull/6129 -SteveOn 2018-02-06 21:11, Walter Bright wrote:Regardless of whether it was requested by me or not, if the current version is not working for us, we need to explore alternatives.std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem.If I recall correctly, the current implementation of std.net.isEmail was requested by you.
Feb 06 2018
On Tuesday, 6 February 2018 at 22:51:51 UTC, Steven Schveighoffer wrote:
> On 2/6/18 5:23 PM, Walter Bright wrote:
>> On 2/6/2018 2:03 PM, Jacob Carlborg wrote:
>>> On 2018-02-06 21:11, Walter Bright wrote:
>>>> std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem.
>>> If I recall correctly, the current implementation of std.net.isEmail was requested by you.
>> Regardless of whether it was requested by me or not, if the current version is not working for us, we need to explore alternatives.
> The regex problem is being solved:
> https://github.com/dlang/phobos/pull/6129
>
> -Steve

That's fixing just the "isEmail" issue, which is good I guess.

But after reading this thread, I ran some tests on one of my code bases, which uses about 6 regexes throughout. Switching from ctRegex! to regex yielded a 50% build time reduction, and from what I read even the normal regexes are slowing things down considerably.

Might need a warning in the docs for ctRegex! explaining it'll screw your build times if you use it, unless there's some way to speed that up to something normal.

Btw, my project, which is 3517 lines of D, builds in 20s with ctRegex disabled, on an i7 4770k at 4.3 GHz. So I'd say once you start doing some more complex usages, D's build speed goes out the door.
Feb 06 2018
On Tue, Feb 06, 2018 at 11:20:53PM +0000, Andres Clari via Digitalmars-d wrote:
[...]
> Switching from ctRegex! to regex yielded a 50% build time reduction, and from what I read even the normal regex are slowing things down considerably.

I seem to vaguely recall that in some cases, ctRegex might even perform slower than regex(). But either way, my use cases for regexes generally aren't performance-sensitive enough to be worth the trouble of huge compilation time slowdown -- I just use regex() instead of ctRegex.

[...]
> Btw, my project which is 3517 lines of D builds in 20s disabling the ctRegex on an i7 4770k at 4.3Ghz. So I'd say once you start doing some more complex usages, D's build speed goes out the door.

That depends on what you're doing with it, and also how you're building it. 3500+ lines isn't a lot of code; it ought to compile pretty fast unless you're using a lot of (1) templates, (2) CTFE. Also, I find that dub builds are excruciatingly slow compared to just invoking dmd directly, due to network access and rescanning dependencies on every invocation.

I have a 4700+ line vibe.d project; Diet templates are template/CTFE-heavy and generally take the longest to build. (I dumped dub and went back to an SCons-based system with separate compilation for major subsystems -- as long as I don't recompile Diet templates, the whole thing can build within seconds; with Diet templates it takes about 30 seconds :-/.)

T

--
Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn
Feb 06 2018
On Wednesday, 7 February 2018 at 00:36:22 UTC, H. S. Teoh wrote:On Tue, Feb 06, 2018 at 11:20:53PM +0000, Andres Clari via Digitalmars-d wrote: [...]Well I'm using vibe.d, but not templates on this project, just a minimal rest service, and a few timers and runTasks. So yeah I don't see why it should slow down that much. Is there some tutorial or example for using SCons with dub dependencies?[...]I seem to vaguely recall that in some cases, ctRegex might even perform slower than regex(). But either way, my use cases for regexes generally aren't performance-sensitive enough to be worth the trouble of huge compilation time slowdown -- I just use regex() instead of ctRegex. [...][...]That depends on what you're doing with it, and also how you're building it. 3500+ lines isn't a lot of code; it ought to compile pretty fast unless you're using a lot of (1) templates, (2) CTFE. Also, I find that dub builds are excruciatingly slow compared to just invoking dmd directly, due to network access and rescanning dependencies on every invocation. I have a 4700+ line vibe.d project; Diet templates are template/CTFE-heavy and generally take the longest to build. (I dumped dub and went back to an SCons-based system with separate compilation for major subsystems -- as long as I don't recompile Diet templates, the whole thing can build within seconds; with Diet templates it takes about 30 seconds :-/.) T
Feb 06 2018
On Wed, Feb 07, 2018 at 01:22:02AM +0000, Andres Clari via Digitalmars-d wrote:
[...]
> Is there some tutorial or example for using SCons with dub dependencies?

Not that I know of. Basically what I did was:

- Create a dummy dub project in a subdirectory, containing a dummy source file containing an empty main().

- Declare whatever dub dependencies you need in this dummy project.

- Run `dub build -v` inside this subdirectory to make dub fetch dependencies, build libraries, etc..

- Parse the output, esp. the last few lines that show which include paths, linker flags, and libraries are required to build the main program.

- Specify these include paths, linker flags, and libraries in your SConstruct file for building your real project.

- Build away.

- If you need to refresh dependencies, go into the dummy project and run `dub build --force` to rebuild all dependencies, then run scons in your real project.

Arguably, some/all of the above could be automated by SCons. Though the whole point is to *not* run dub every single time you build, so I'd keep them separate, or as a non-default build target that only triggers when you explicitly want it to.

Also, none of this is specific to SCons; you could use whatever other build system you wish with the above steps.

T

--
This sentence is false.
Feb 06 2018
On 2/6/2018 2:51 PM, Steven Schveighoffer wrote:The regex problem is being solved: https://github.com/dlang/phobos/pull/6129Great!
Feb 06 2018
On Tuesday, 6 February 2018 at 20:11:56 UTC, Walter Bright wrote:std.string.isEmail() in D1 was a simple function. Maybe regex is just the wrong solution for this problem. [...]C .. D style. I love it! (bugs and all).
Feb 06 2018
On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
> One of my D projects for the past while has been taking unusually long times to compile. This morning, I finally decided to sit down and figure out exactly why. What I found was rather disturbing:
>
> ------
> import std.regex;
> void main() {
>     auto re = regex(``);
> }
> ------
>
> Compile command: time dmd -c test.d
>
> Output:
> ------
> real 0m3.113s
> user 0m2.884s
> sys 0m0.226s
> ------
>
> Comment out the call to `regex()`, and I get:
>
> ------
> real 0m0.285s
> user 0m0.262s
> sys 0m0.023s
> ------
>
> Clearly, something is wrong if the mere act of compiling a regex causes a 4-line program to take *3 seconds* to compile,

There is a fuckton of templates involved, plus a couple of tries are built at CTFE. The regression is curious though, maybe something gets recomputed at CTFE over and over again.

> where normally dmd takes less than a second.

Honestly I’m tired to hell of working with our compiler and its compile time features. When it doesn’t pee itself due to OOM I’m almost happy.

In retrospect I should have just provided a C interface and compiled the whole thing separately. And CTFE could easily be replaced by a small custom JIT compiler, it would also work at run-time(!).

Especially considering that it’s been 6 years but it’s still not practical to use ctRegex.

> The latter department has also suffered a regression; see for example: https://github.com/dlang/phobos/pull/5981.)

Yup, Martin seems on top of it, thankfully.

> T
Feb 05 2018
On Tue, Feb 06, 2018 at 05:44:17AM +0000, Dmitry Olshansky via Digitalmars-d wrote:
[...]
> Honestly I’m tired to hell of working with our compiler and its compile time features. When it doesn’t pee itself due to OOM I’m almost happy.

Heh, dmd's famous memory usage is causing me tons of grief on low-memory systems, too. Basically if you have anything less than 2GB of RAM, you might as well give up trying to compile anything non-trivial. We need to get a serious handle on dmd's memory consumption -- at least let there be an option or something that will turn on the GC or whatever. It's better for dmd to be (gosh) slow, than for it not to be able to compile anything at all due to it provoking the kernel OOM killer.

> In retrospect I should have just provided a C interface and compiled the whole thing separately. And CTFE could easily be replaced by a small custom JIT compiler, it would also work at run-time(!).

We seriously need to get newCTFE finished and merged. Stefan is very busy with other stuff ATM; I wonder if a few of us can continue his work and get newCTFE into a mergeable state. Given how much D's "compile-time" features are advertised, and D's new (ick) slogan of being fast or whatever, it's high time we actually delivered on our promises by actually making CTFE more usable.

On that note, though, I think a JIT regex compiler totally makes sense. I'd totally support that.

> Especially considering that it’s been 6 years but it’s still not practical to use ctRegex.

I find that using just plain `regex` is Good Enough(tm) for my purposes. Do we really need ctRegex? The idea of generating an optimal FSM at compile-time is rather appealing, but in the grand scheme of things, doesn't seem like an absolute must-have.

>> The latter department has also suffered a regression; see for example: https://github.com/dlang/phobos/pull/5981.)
> Yup, Martin seems on top of it, thankfully.
[...]

Unfortunately, Martin's PR is only to improve runtime performance. It's still dog-slow to *compile* std.regex. :-(

T

--
Dogs have owners ... cats have staff. -- Krista Casada
Feb 06 2018
On Tuesday, 6 February 2018 at 18:56:44 UTC, H. S. Teoh wrote:
> We seriously need to get newCTFE finished and merged. Stefan is very busy with other stuff ATM; I wonder if a few of us can continue his work and get newCTFE into a mergeable state. Given how much D's "compile-time" features are advertised, and D's new (ick) slogan of being fast or whatever, it's high time we actually delivered on our promises by actually making CTFE more usable.

There is some good news for you. I've recently allocated a few more resources to newCTFE again.

I have to stress that it is not enough to get newCTFE feature complete. It is also vital to make a performance-related pass through the code.

newCTFE is currently still at a Proof-Of-Concept quality level.

That said, newCTFE is designed with performance and JIT in mind. It can achieve a 10-30x speed-up when implemented properly.

One thing that I really need in druntime is a cross-platform way to allocate executable memory-pages, this can be done by someone else.

Another Thing that can be done is reviewing the code and alerting me to potential problems. i.e. Missing or indecipherable comments as well as spelling mistakes. (with the correction please (just telling me something is wrong, will not help since I obliviously don't know how to spell it))
Feb 07 2018
On Wednesday, 7 February 2018 at 09:27:47 UTC, Stefan Koch wrote:
> Another Thing that can be done is reviewing the code and alerting me to potential problems. i.e. Missing or indecipherable comments as well as spelling mistakes. (with the correction please (just telling me something is wrong, will not help since I obliviously don't know how to spell it))

What is the preferred place for this? https://github.com/dlang/dmd/pull/7073 or do you want PRs against a fork of yours?
Feb 07 2018
On Wednesday, 7 February 2018 at 22:00:48 UTC, Bastiaan Veelo wrote:
> On Wednesday, 7 February 2018 at 09:27:47 UTC, Stefan Koch wrote:
>> Another Thing that can be done is reviewing the code and alerting me to potential problems. i.e. Missing or indecipherable comments as well as spelling mistakes. (with the correction please (just telling me something is wrong, will not help since I obliviously don't know how to spell it))
> What is the preferred place for this? https://github.com/dlang/dmd/pull/7073 or do you want PRs against a fork of yours?

I'd prefer a pr against https://github.com/UplinkCoder/dmd/newCTFE_reboot
Feb 08 2018
On Wednesday, 7 February 2018 at 22:00:48 UTC, Bastiaan Veelo wrote:
> On Wednesday, 7 February 2018 at 09:27:47 UTC, Stefan Koch wrote:
>> Another Thing that can be done is reviewing the code and alerting me to potential problems. i.e. Missing or indecipherable comments as well as spelling mistakes. (with the correction please (just telling me something is wrong, will not help since I obliviously don't know how to spell it))
> What is the preferred place for this? https://github.com/dlang/dmd/pull/7073 or do you want PRs against a fork of yours?

Corrected link: https://github.com/UplinkCoder/dmd/tree/newCTFE_reboot
Feb 08 2018
On Wednesday, 7 February 2018 at 09:27:47 UTC, Stefan Koch wrote:
> One thing that I really need in druntime is a cross-platform way to allocate executable memory-pages, this can be done by someone else.

Is this on someone's agenda? It probably needs an enhancement request at the very least, I don't think it's there yet [1].

> Another Thing that can be done is reviewing the code and alerting me to potential problems. i.e. Missing or indecipherable comments as well as spelling mistakes.

I had a go at this [2].

[1] https://issues.dlang.org/buglist.cgi?bug_severity=enhancement&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&component=druntime&list_id=219522&product=D&query_format=advanced
[2] https://github.com/UplinkCoder/dmd/pull/3
Feb 11 2018
On Monday, 12 February 2018 at 00:24:32 UTC, Bastiaan Veelo wrote:
> On Wednesday, 7 February 2018 at 09:27:47 UTC, Stefan Koch wrote:
>> One thing that I really need in druntime is a cross-platform way to allocate executable memory-pages, this can be done by someone else.
> Is this on someone's agenda? It probably needs an enhancement request at the very least, I don't think it's there yet [1].

Was once on mine, together with other OS memory manager functions, but I postponed the work indefinitely.

https://github.com/dlang/druntime/pull/1549

If someone is willing to revive that I’d gladly assist with review. Lastly, on Windows it would need a FlushInstructionCache call before executing new memory.

And ofc JIT is cool, but it would be more cool to have a sane interpreter that doesn’t leak sooner. Simply put, JIT is 5x the work due to different architectures, and seeing first-hand how it goes I’m not sure we want that in our compiler yet.
Feb 12 2018
On Tuesday, 13 February 2018 at 05:47:10 UTC, Dmitry Olshansky wrote:
> Was once on mine, together with other OS memory manager functions, but I postponed the work indefinitely.
>
> https://github.com/dlang/druntime/pull/1549
>
> If someone is willing to revive that I’d gladly assist with review. Lastly, on Windows it would need a FlushInstructionCache call before executing new memory.
>
> And ofc JIT is cool, but it would be more cool to have a sane interpreter that doesn’t leak sooner. Simply put, JIT is 5x the work due to different architectures, and seeing first-hand how it goes I’m not sure we want that in our compiler yet.

Since dmd is only targeting x86/x86_64 there is really just one arch to support for now. All the others can fall back to either the interpreter or generated C code compiled into a shared lib :)

newCTFE already provides a very low-level IR that should be trivially translatable to machine code. (famous last words :o) )
Feb 13 2018
On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
> One of my D projects for the past while has been taking unusually long times to compile. This morning, I finally decided to sit down and figure out exactly why. What I found was rather disturbing:
>
> ------
> import std.regex;
> void main() {
>     auto re = regex(``);
> }
> ------
>
> Compile command: time dmd -c test.d
>
> Output:
> ------
> real 0m3.113s
> user 0m2.884s
> sys 0m0.226s
> ------
>
> Comment out the call to `regex()`, and I get:
>
> ------
> real 0m0.285s
> user 0m0.262s
> sys 0m0.023s
> ------
>
> Clearly, something is wrong if the mere act of compiling a regex causes a 4-line program to take *3 seconds* to compile, where normally dmd takes less than a second.

Thank you for this finding! I was wondering why my little vibe.d project started to take approximately twice the time to compile, and because of a mistake in my test setup, even my minimal program still included the file containing the regex. So even after reducing the code to a minimum, the compilation time was ~7 sec compared to less than 4 seconds.

Would be cool if we could get fast compilation of regex. I am coming from scripting languages (Perl and Ruby) where I use regex a lot, so this is really disappointing for me.

Beginner question: How to split my project, to compile the regex part separately as a lib and just link them?
Feb 08 2018
On 02/08/2018 06:21 AM, Martin Tschierschke wrote:
> Beginner question: How to split my project, to compile the regex part separately as a lib and just link them?

Unfortunately that depends completely on what buildsystem you're using. But if you're just calling the compiler directly, then it's really easy:

dmd -lib -of=myLib.a [all other flags your project may need] fileYouWantInLib.d anyOtherFileYouAlsoWant.d

dmd myLib.a [your project's usual flags, and all the rest of your .d files]

If on windows, then just replace ".a" with ".lib".
Feb 08 2018