digitalmars.D.announce - DMD 0.147 release

Walter Bright (2/2) Feb 15 2006 Added match expressions.

Chr. Grade (4/6) Feb 15 2006 Nifty feature. Would be handy if regex searches be included as well - fo...

Walter Bright (3/5) Feb 15 2006 Not sure what you mean?

Derek Parnell (10/17) Feb 15 2006 Oh, I'm positive I don't know what Chr. means :-)
Chr. Grade (22/28) Feb 15 2006 I obviously lack the terminology necessary.

Derek Parnell (11/41) Feb 15 2006 I'm really sorry, but this has just made it worse for me. I have absolut...

Chr. Grade (8/20) Feb 15 2006 Yes, whole list in one operation, indexing matches. The regexp engine wo...

clayasaurus (5/11) Feb 15 2006 Nice, these match expressions make things really handy. At first I was

clayasaurus (14/26) Feb 15 2006 One thing I forgot to ask, do we have
Walter Bright (19/23) Feb 15 2006 It's the other way around, the regexp is on the left. Also, operating sy...

Derek Parnell (11/18) Feb 15 2006 Should that be ...

Walter Bright (3/13) Feb 15 2006 Yes.
clayasaurus (2/20) Feb 16 2006 Hrm. The compiler tells me it is an unidentified escape sequence.

Sean Kelly (3/23) Feb 16 2006 Try "\.wav$"r :-)

Walter Bright (3/4) Feb 16 2006 r"\.wav$"
pragma (4/26) Feb 17 2006 Or use backticks instead:

Chr. Grade (12/16) Feb 15 2006 Static regex? Umm...

Walter Bright (9/21) Feb 15 2006 ---------------------------

Sean Kelly (5/6) Feb 15 2006 Interesting. So where can I find documentation on pattern syntax? The

Walter Bright (5/10) Feb 15 2006 There's a link in the std_regexp page to it:

Derek Parnell (16/19) Feb 15 2006 There is a couple of problems with this link. It doesn't work when one u...

Sean Kelly (5/14) Feb 15 2006 Got me. I'm looking at the online docs

Sean Kelly (7/19) Feb 15 2006 Awesome! This will take some getting used to, but it promises to be of

Walter Bright (5/8) Feb 15 2006 I don't really know why either. std.regexp has been in D since day 1, bu...

jicman (3/11) Feb 16 2006 Most of the programs that I've done with D use std.regexp, so I use it a...

Derek Parnell (8/9) Feb 15 2006 Too lazy to test sorry. Do match expressions support Unicode or just ASC...

Walter Bright (7/11) Feb 15 2006 I know it works with ASCII, and it's supposed to work with UTF. I wouldn...

Derek Parnell (17/32) Feb 15 2006 Seems to be working, but more unittests could be written.

Walter Bright (4/14) Feb 15 2006 You can use !~ for the fails cases.

Derek Parnell (56/59) Feb 15 2006 And here they are ...
Sean Kelly (9/16) Feb 15 2006 As cool as this is, I don't entirely like the prospect of cutting yet

Kris (14/30) Feb 15 2006 I agree. And it's hard to fathom what the sudden rush to get this is abo...

Sean Kelly (46/75) Feb 16 2006 I'm willing to let the new language feature mature in place. And while

kris (3/96) Feb 16 2006 Amen.
Kyle Furlong (5/90) Feb 16 2006 We (the Titan team) have run into this sort of issue with Titan. When tr...

Sean Kelly (7/15) Feb 16 2006 In the meantime, I suggest using Ares as a starting point. It still

Kris (3/15) Feb 16 2006 I'll second that. Ares is as good an isolation of D compiler support as

Walter Bright (3/10) Feb 16 2006 Sure, but I'm not sure why this is a problem.

Kyle Furlong (5/21) Feb 16 2006 It basically forces us to write our own libc. And yes one can argue that...

Walter Bright (7/27) Feb 16 2006 Since D is designed to interface directly to C, the C runtime is kind of...

Kyle Furlong (3/36) Feb 16 2006 So your dmd compiler emits references to the _d_whatever extern(C) func...

Walter Bright (6/10) Feb 17 2006 Yes.

Sean Kelly (21/29) Feb 16 2006 I think Kyle is wondering whether a compiler writer could simply sit

Kyle Furlong (2/36) Feb 16 2006 Well put Sean, I would be very interested in Walter's take on these issu...
Walter Bright (4/22) Feb 17 2006 For the language implementor, the stuff in std.gc. How operator new

Sean Kelly (88/101) Feb 17 2006 But what if the user wants to employ a non-standard GC? There have

Walter Bright (14/46) Feb 17 2006 I don't know what you mean by non-standard. It must implement the interf...

Sean Kelly (55/105) Feb 18 2006 True enough.

Walter Bright (3/10) Feb 20 2006 You have made some good points.

Kyle Furlong (2/8) Feb 15 2006 Is it possible to drop in compile-time regex support? (i.e. Eric's solut...

pragma (4/12) Feb 15 2006 IMHO, it's not quite ready for prime-time yet. In fact, some parts of i...

James Dunne (11/29) Feb 15 2006 Not to knock Eric's great efforts at compile-time regex (which is

BCS (13/46) Feb 16 2006 I am not commenting on the regex support in particular (I haven't used i...

Georg Wrede (14/20) Feb 17 2006 Probably nobody thinks that compile time regexes should be implemented

James Dunne (8/38) Feb 18 2006 DMD has the speed; that's great and all, but we simply can't assume all

clayasaurus (6/43) Feb 18 2006 No, but we can assume implementations of D will be faster than C++ since...

Tom (8/10) Feb 15 2006 A question: I wonder, do you fix the regressions that arise on each of t...

James Dunne (5/22) Feb 15 2006
Walter Bright (4/12) Feb 15 2006 I try to do the most important ones first.

huangliang (5/5) Feb 16 2006 MatchExpression is a robust feather, but too robust.
Georg Wrede (8/11) Feb 16 2006 Cool! You really have to be working like 25 hours a day at this!

Charles (6/20) Feb 16 2006 .... Trouble.
Walter Bright (5/12) Feb 16 2006 It's explained in the IfStatement and WhileStatement sections.

Stewart Gordon (15/18) Feb 16 2006 So this

James Dunne (11/28) Feb 16 2006 If you're not using whitespace to delineate your tokens in the first
Walter Bright (7/16) Feb 16 2006 That's right. Neither will:

Stewart Gordon (26/49) Feb 16 2006 At least neither of those two is syntactically valid now.

Sean Kelly (4/37) Feb 16 2006 I think because MatchExpression injects a _Match* object into the
Walter Bright (3/4) Feb 16 2006 Because IfStatement handles them differently.

Wang Zhen (5/11) Feb 17 2006 Although syntactically correct, MatchExpression in StaticIfCondition or

Walter Bright (4/8) Feb 17 2006 The problem is that getting it to work requires the compiler itself to

Craig Black (3/5) Feb 17 2006 You could also perhaps use compile-time templates to evaluate static if
Georg Wrede (4/16) Feb 17 2006 Intriguing. I'd sure love to hear more about this.

Walter Bright (8/24) Feb 17 2006 If the compiler is to constant fold regular expressions, then it needs t...

Georg Wrede (6/32) Feb 21 2006 Yes. IMHO in essence, the binary machine code, which the runtime also

Walter Bright (8/15) Feb 21 2006 Consider the strlen() function. Compiling a strlen() function and genera...

Lionello Lunesu (15/31) Feb 22 2006 Interesting indeed.

Oskar Linde (31/36) Feb 22 2006 Disclaimer: I don't know much about this. Most is pure speculation.

Deewiant (3/5) Feb 22 2006 This is what I've always thought declaring a function as "const", like c...
Lionello Lunesu (14/35) Feb 22 2006 Good point. Completely forgot about that.

Oskar Linde (12/24) Feb 22 2006 The function has to halt also. An infinite loop can be impossible for

Sean Kelly (7/12) Feb 22 2006 If the function can be inlined and the operations it contains are also

Georg Wrede (20/40) Feb 22 2006 Either I'm getting too old for this business, or you're only giving

Don Clugston (43/74) Feb 23 2006 That would be OK. The issue is that the compiler is a tool for

Georg Wrede (12/46) Feb 23 2006 Aaaaaah... heureka.

Don Clugston (20/67) Feb 23 2006 Oh dear, I think I've just confused you. I was only referring to strlen,...

Georg Wrede (2/2) Feb 23 2006 (I put stuff in D.dtl.)

Stewart Gordon (22/40) Feb 21 2006 A problem is that there are a number of dialects of regexp. The spec

"Walter Bright" <newshound digitalmars.com> writes:

Added match expressions.

http://www.digitalmars.com/d/changelog.html

Feb 15 2006

Chr. Grade <Chr._member pathlink.com> writes:

Nifty feature. Would be handy if regex searches be included as well - for
continuous buffers and for chunked buffers.

Chr. Grade

In article <dt088d$1svm$1 digitaldaemon.com>, Walter Bright says...
Added match expressions.

http://www.digitalmars.com/d/changelog.html

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Chr. Grade" <Chr._member pathlink.com> wrote in message 
news:dt0ait$1v78$1 digitaldaemon.com...
 Nifty feature. Would be handy if regex searches be included as well - for
 continuous buffers and for chunked buffers.

Not sure what you mean?

Feb 15 2006

Derek Parnell <derek psych.ward> writes:

On Wed, 15 Feb 2006 14:49:14 -0800, Walter Bright wrote:

 "Chr. Grade" <Chr._member pathlink.com> wrote in message 
 news:dt0ait$1v78$1 digitaldaemon.com...
 Nifty feature. Would be handy if regex searches be included as well - for
 continuous buffers and for chunked buffers.

 
 Not sure what you mean?

Oh, I'm positive I don't know what Chr. means :-)

What is a "continuous buffer" in this context?
What is a "chunked buffer"?

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 10:11:17 AM

Feb 15 2006

Chr. Grade <Chr._member pathlink.com> writes:

I obviously lack the terminology necessary.
Trying with pseudo-code:

// --- Situation: find matches, chunks of data in a list, no continuous
// buffer, memcpys to duplicate and concatenate data inefficient

slist List    = ...; // containers with some data differing in size
rxres Results = List ~~ "regex+";

// Hopefully indexed all potentially dangling matches between two
// chunks (?)...

while( !Results )
.. = Results.nFirst,
.. = Results.nLast,
.. = Results.get_ptr,
Results++;

// --- Situation: find matches in a continuous buffer:

utf16 Text[]  = ...;
rxres Results = Text ~~ "foo+";

while( !Results )
print( Results++ );

---

Maybe this explains what I meant, maybe it is just absurd.

Chr. Grade


In article <dt0b6l$1vpe$1 digitaldaemon.com>, Walter Bright says...
"Chr. Grade" <Chr._member pathlink.com> wrote in message 
news:dt0ait$1v78$1 digitaldaemon.com...
 Nifty feature. Would be handy if regex searches be included as well - for
 continuous buffers and for chunked buffers.

Not sure what you mean?

Feb 15 2006

Derek Parnell <derek psych.ward> writes:

On Thu, 16 Feb 2006 00:13:16 +0000 (UTC), Chr. Grade wrote:

 I obviously lack the terminology necessary.
 Trying with pseudo-code:
 
 // --- Situation: find matches, chunks of data in a list, no continuous
 // buffer, memcpys to duplicate and concatenate data inefficient
 
 slist List    = ...; // containers with some data differing in size
 rxres Results = List ~~ "regex+";
 
 // Hopefully indexed all potentially dangling matches between two
 // chunks (?)...
 
 while( !Results )
 .. = Results.nFirst,
 .. = Results.nLast,
 .. = Results.get_ptr,
 Results++;
 
 // --- Situation: find matches in a continuous buffer:
 
 utf16 Text[]  = ...;
 rxres Results = Text ~~ "foo+";
 
 while( !Results )
 print( Results++ );
 
 ---
 
 Maybe this explains what I meant, maybe it is just absurd.
 

I'm really sorry, but this has just made it worse for me. I have absolutely
no idea what you are trying to do or say.

Are you talking about a list of pointers to strings and searching over the
referenced strings in one ~~ operation?

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 11:20:32 AM

Feb 15 2006

Chr. Grade <Chr._member pathlink.com> writes:

In article <4z8zsk5s3ozv$.1xsunk1521nn9.dlg 40tude.net>, Derek Parnell says...

 Maybe this explains what I meant, maybe it is just absurd.
 

I'm really sorry, but this has just made it worse for me. I have absolutely
no idea what you are trying to do or say.

Are you talking about a list of pointers to strings and searching over the
referenced strings in one ~~ operation?

Yes, whole list in one operation, indexing matches. The regexp engine would have
to do the pointer hopping as needed.

Here's an example of what I mean, but it can't handle discontinuous buffers:
www.boost.org/libs/regex/example/snippets/regex_search_example.cpp
The code there could be wrapped up in a class/struct which only exposes the
iteration through the map/list with matches via overloaded operators.
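
As a rough illustration of the chunked search described here, the following is a minimal sketch in Python (not D, since the `~~` syntax under discussion is specific to DMD 0.147). The helper name `find_in_chunks` is hypothetical. The sketch simply joins the chunks before searching, which is exactly the copy a chunk-aware engine would avoid. It only shows how a match that straddles two chunks could be reported as a (chunk index, offset) pair.

```python
import bisect
import re
from itertools import accumulate


def find_in_chunks(pattern, chunks):
    """Yield (matched_text, (chunk_index, offset_in_chunk)) for each match.

    Hypothetical helper: joins the chunks, then maps each match start
    back to the chunk it begins in.
    """
    joined = "".join(chunks)  # the copy a real chunk-aware engine would skip
    # Start position of every chunk inside the joined text.
    starts = [0] + list(accumulate(len(c) for c in chunks))[:-1]
    for m in re.finditer(pattern, joined):
        idx = bisect.bisect_right(starts, m.start()) - 1
        yield m.group(0), (idx, m.start() - starts[idx])


# The first match begins in chunk 0 and ends in chunk 1.
results = list(find_in_chunks(r"fo+", ["abfo", "oo x", "foo"]))
```

A real implementation would need the regex engine itself to follow the chunk boundaries, as the post suggests.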

Chr. Grade

Feb 15 2006

clayasaurus <clayasaurus gmail.com> writes:

Nice, these match expressions make things really handy. At first I was 
confused on what I would use them for, but I started programming for a 
little bit and already found a use for them. Namely, assert(filename ~~ 
"*.wav"), assuming I understand it correctly.

Walter Bright wrote:
 Added match expressions.
 
 http://www.digitalmars.com/d/changelog.html

Feb 15 2006

clayasaurus <clayasaurus gmail.com> writes:

One thing I forgot to ask, do we have

this()
in
{
}
out
{
}
body
{
}

Yet? Thanks.
~ Clay

clayasaurus wrote:
 Nice, these match expressions make things really handy. At first I was 
 confused on what I would use them for, but I started programming for a 
 little bit and already found a use for them. Namely, assert(filename ~~ 
 "*.wav"), assuming I understand it correctly.
 
 Walter Bright wrote:
 Added match expressions.

 http://www.digitalmars.com/d/changelog.html

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"clayasaurus" <clayasaurus gmail.com> wrote in message 
news:dt0dm6$228a$1 digitaldaemon.com...
 Nice, these match expressions make things really handy. At first I was 
 confused on what I would use them for, but I started programming for a 
 little bit and already found a use for them. Namely, assert(filename ~~ 
 "*.wav"), assuming I understand it correctly.

It's the other way around, the regexp is on the left. Also, operating system 
wildcard thing isn't the one used, it's real regular expressions from 
std.regexp. So you'd write it as:

    assert(".wav$" ~~ filename);

which means any string ending in ".wav".

std.path.fnmatch() does operating system style wildcards like "*.wav" - I 
could make that work with the match expressions too if there's a desire 
(because operator overloading works with it!).

Another example of things you can do:

    assert("^abc" ~~ string);    // (1)

matches any string that starts with the string "abc". It's a little klunky 
to do otherwise,

    assert(string.length >= 3 && string[0..3] == "abc");   // (2)

Currently, evaluating ("^abc"~~string) invokes the full std.regexp 
machinery. But a compiler is free to optimize (1) into (2). I'm thinking of 
Eric and Don's examples of generating custom recognizers for static regex 
strings. This could make D's regex support into a real screamer.
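
To make the equivalence between forms (1) and (2) concrete, here is a small sketch in Python's `re` module (an assumption for illustration; the thread itself concerns D's std.regexp). The function names are hypothetical. It also shows the separate wildcard style of matching, which in Python lives in `fnmatch`.

```python
import fnmatch
import re


def starts_with_abc_regex(s):
    # Form (1): an anchored regular expression.
    return re.search(r"^abc", s) is not None


def starts_with_abc_slice(s):
    # Form (2): the hand-written check an optimizer could substitute.
    return len(s) >= 3 and s[0:3] == "abc"


def is_wav_by_wildcard(filename):
    # OS-style wildcard matching, a different syntax from regular expressions.
    return fnmatch.fnmatchcase(filename, "*.wav")


# Both forms agree on every sample, which is what makes the rewrite safe.
samples = ["abc", "abcdef", "ab", "xabc", ""]
agree = all(starts_with_abc_regex(s) == starts_with_abc_slice(s) for s in samples)
```

The point of the sketch is only that a constant pattern such as "^abc" carries enough information for a cheaper, specialized check.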

Feb 15 2006

Derek Parnell <derek psych.ward> writes:

On Wed, 15 Feb 2006 15:50:03 -0800, Walter Bright wrote:


 Also, operating system 
 wildcard thing isn't the one used, it's real regular expressions from 
 std.regexp. So you'd write it as:
 
     assert(".wav$" ~~ filename);
 
 which means any string ending in ".wav".

Should that be ...

       assert("\.wav$" ~~ filename);

otherwise it would also match things like "somefile.awav", because doesn't
the "." in the regexp represent 'any-character'?
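
The difference is easy to check. The sketch below uses Python's `re` module as a stand-in (an assumption; D's std.regexp is the engine actually under discussion), where "." likewise means any character unless escaped.

```python
import re

unescaped = re.compile(r".wav$")   # "." matches any single character
escaped = re.compile(r"\.wav$")    # "\." matches a literal dot only

# Both accept a genuine .wav name.
both_match_wav = bool(unescaped.search("somefile.wav")) and bool(escaped.search("somefile.wav"))

# Only the unescaped pattern is fooled by "awav".
unescaped_false_positive = unescaped.search("somefile.awav") is not None
escaped_rejects = escaped.search("somefile.awav") is None
```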

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 11:16:13 AM

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:1fyp16zonzb9q$.1qxsxpiy1s1ry.dlg 40tude.net...
 On Wed, 15 Feb 2006 15:50:03 -0800, Walter Bright wrote:
 So you'd write it as:

     assert(".wav$" ~~ filename);

 which means any string ending in ".wav".

 Should that be ...

       assert("\.wav$" ~~ filename);

 otherwise it would also match things like "somefile.awav", because doesn't
 the "." in the regexp represent 'any-character'?

Yes. <g>

Feb 15 2006

clayasaurus <clayasaurus gmail.com> writes:

Derek Parnell wrote:
 On Wed, 15 Feb 2006 15:50:03 -0800, Walter Bright wrote:
 
 
 Also, operating system 
 wildcard thing isn't the one used, it's real regular expressions from 
 std.regexp. So you'd write it as:

     assert(".wav$" ~~ filename);

 which means any string ending in ".wav".

 
 Should that be ...
 
        assert("\.wav$" ~~ filename);
 
 otherwise it would also match things like "somefile.awav", because doesn't
 the "." in the regexp represent 'any-character'?
 

Hrm. The compiler tells me it is an unidentified escape sequence.

Feb 16 2006

Sean Kelly <sean f4.ca> writes:

clayasaurus wrote:
 Derek Parnell wrote:
 On Wed, 15 Feb 2006 15:50:03 -0800, Walter Bright wrote:


 Also, operating system wildcard thing isn't the one used, it's real 
 regular expressions from std.regexp. So you'd write it as:

     assert(".wav$" ~~ filename);

 which means any string ending in ".wav".

 Should that be ...

        assert("\.wav$" ~~ filename);

 otherwise it would also match things like "somefile.awav", because doesn't
 the "." in the regexp represent 'any-character'?

 
 Hrm. The compiler tells me it is an unidentified escape sequence.

Try "\.wav$"r :-)


Sean

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:dt2uj7$1e4l$1 digitaldaemon.com...
 Try "\.wav$"r :-)

r"\.wav$"

Feb 16 2006

pragma <pragma_member pathlink.com> writes:

In article <dt2uj7$1e4l$1 digitaldaemon.com>, Sean Kelly says...
clayasaurus wrote:
 Derek Parnell wrote:
 On Wed, 15 Feb 2006 15:50:03 -0800, Walter Bright wrote:


 Also, operating system wildcard thing isn't the one used, it's real 
 regular expressions from std.regexp. So you'd write it as:

     assert(".wav$" ~~ filename);

 which means any string ending in ".wav".

 Should that be ...

        assert("\.wav$" ~~ filename);

 otherwise it would also match things like "somefile.awav", because doesn't
 the "." in the regexp represent 'any-character'?

 
 Hrm. The compiler tells me it is an unidentified escape sequence.

Try "\.wav$"r :-)

Or use backticks instead:

assert(`\.wav$` ~~ filename);
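
The same escaping problem exists in most languages with backslash escapes in string literals. As a loose analogy only, here is how it looks in Python, whose `r"..."` prefix plays the role of D's `r"..."` and backtick strings. Nothing here is D code.

```python
import re

raw_pattern = r"\.wav$"        # raw string: the backslash reaches the regex engine
escaped_pattern = "\\.wav$"    # ordinary string: the backslash must be doubled

# Both spellings produce the same six-character pattern.
same_pattern = raw_pattern == escaped_pattern

matches_wav = re.search(raw_pattern, "track01.wav") is not None
rejects_awav = re.search(raw_pattern, "track01.awav") is None
```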

- Eric Anderton at yahoo

Feb 17 2006

Chr. Grade <Chr._member pathlink.com> writes:

Currently, evaluating ("^abc"~~string) invokes the full std.regexp 
machinery. But a compiler is free to optimize (1) into (2). I'm thinking of 
Eric and Don's examples of generating custom recognizers for static regex 
strings. This could make D's regex support into a real screamer. 

Static regex? Umm...
Again, this might be absurd, but there could be a type "regex".

regex rxSome  = "�|&|=";
regex rxMore  = "[a-n]";
regex rxMerge = "foo($rxSome)?($rxMore)+";

Whereas...
char[] cpSome  = "...";
char[] cpMore  = "...";
.. would lead to a less readable:
char[] cpMerge = "foo(" ~ cpSome ~ ")?(" ~ cpMore ~ ")+";

---

Chr. Grade

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Chr. Grade" <Chr._member pathlink.com> wrote in message 
news:dt0gmk$24v3$1 digitaldaemon.com...
Currently, evaluating ("^abc"~~string) invokes the full std.regexp
machinery. But a compiler is free to optimize (1) into (2). I'm thinking of
Eric and Don's examples of generating custom recognizers for static regex
strings. This could make D's regex support into a real screamer.

 Static regex? Umm...
 Again, this might be absurd, but there could be a type "regex".

 regex rxSome  = "�|&|=";
 regex rxMore  = "[a-n]";
 regex rxMerge = "foo($rxSome)?($rxMore)+";

---------------------------
import std.regexp;

auto rxSome = RegExp("�|&|=");
if (rxSome ~~ "string")
    ...
-----------------------------
works now.
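
For comparison, composing a larger pattern from named pieces and compiling it once can be sketched as follows in Python's `re` module. This is an illustration only, not D. The sub-patterns are assumptions: `rx_some` uses just "&|=" because the first alternative in the original post is unreadable.

```python
import re

# Hypothetical sub-patterns, kept as plain strings so they can be reused.
rx_some = r"&|="
rx_more = r"[a-n]"

# Interpolate the pieces into the merged pattern, then compile it once.
rx_merge = re.compile(rf"foo({rx_some})?({rx_more})+")

accepts_with_separator = rx_merge.fullmatch("foo&abc") is not None
accepts_without_separator = rx_merge.fullmatch("fooabc") is not None
rejects_out_of_range = rx_merge.fullmatch("fooz") is None
```

The compiled object can then be reused for many inputs, much like the `RegExp` instance in the post.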

Feb 15 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Added match expressions.

Interesting.  So where can I find documentation on pattern syntax?  The 
docs for std.regexp don't seem to mention it.  Is it just the classic
textbook syntax, or are there differences?


Sean

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:dt0ds9$226k$1 digitaldaemon.com...
 Walter Bright wrote:
 Added match expressions.

 Interesting.  So where can I find documentation on pattern syntax?  The 
 docs for std.regexp don't seem to mention it.  Is it just the classic
 textbook syntax, or are there differences?

There's a link in the std_regexp page to it: 
www.digitalmars.com/ctg/regular.html

It's the classic syntax.

Feb 15 2006

Derek Parnell <derek psych.ward> writes:

On Wed, 15 Feb 2006 15:51:40 -0800, Walter Bright wrote:

 
 There's a link in the std_regexp page to it: 
 www.digitalmars.com/ctg/regular.html

There are a couple of problems with this link. It doesn't work when one uses
the downloaded html docs. This is because it uses a link to a file that is
not a part of the downloaded stuff. But more importantly, the syntax is
wrong.

The actual html you use is (notice the twin double quotes)

	<a href=""../../ctg/regular.html"">Regular expressions</a>

but it would be better to use something like ...

	<a href="http://www.digitalmars.com/ctg/regular.html">Regular
expressions</a>

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 10:59:41 AM

Feb 15 2006

Sean Kelly <sean f4.ca> writes:

Derek Parnell wrote:
 On Wed, 15 Feb 2006 15:51:40 -0800, Walter Bright wrote:
 
 There's a link in the std_regexp page to it: 
 www.digitalmars.com/ctg/regular.html

 
 There are a couple of problems with this link. It doesn't work when one uses
 the downloaded html docs. This is because it uses a link to a file that is
 not a part of the downloaded stuff. But more importantly, the syntax is
 wrong.

Got me.  I'm looking at the online docs 
(http://www.digitalmars.com/d/phobos/std_regexp.html) and both links at 
the top of the page just link to std_regexp.html.  Thus my question.


Sean

Feb 15 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:dt0ds9$226k$1 digitaldaemon.com...
 Walter Bright wrote:
 Added match expressions.

 Interesting.  So where can I find documentation on pattern syntax?  The 
 docs for std.regexp don't seem to mention it.  Is it just the classic
 textbook syntax, or are there differences?

 
 There's a link in the std_regexp page to it: 
 www.digitalmars.com/ctg/regular.html
 
 It's the classic syntax. 

Awesome!  This will take some getting used to, but it promises to be of 
tremendous use.  Don't ask me why a built-in feature seems preferable to 
the same thing in library code, but it does :-p  Perhaps some of it is 
that this will work for both compile-time and run-time evaluation, while 
the library version would likely be different for each.


Sean

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:dt0fp5$23qb$1 digitaldaemon.com...
 Awesome!  This will take some getting used to, but it promises to be of 
 tremendous use.  Don't ask me why a built-in feature seems preferable to 
 the same thing in library code, but it does :-p

I don't really know why either. std.regexp has been in D since day 1, but 
it's been completely overlooked, and I regularly get comments about D not 
doing regular expressions. If this is what it takes, then so be it <g>.

Feb 15 2006

jicman <jicman_member pathlink.com> writes:

Walter Bright says...
"Sean Kelly" <sean f4.ca> wrote in message 
news:dt0fp5$23qb$1 digitaldaemon.com...
 Awesome!  This will take some getting used to, but it promises to be of 
 tremendous use.  Don't ask me why a built-in feature seems preferable to 
 the same thing in library code, but it does :-p

I don't really know why either. std.regexp has been in D since day 1, but 
it's been completely overlooked, and I regularly get comments about D not 
doing regular expressions. If this is what it takes, then so be it <g>. 

Most of the programs that I've done with D use std.regexp, so I use it all the
time.

Feb 16 2006

Derek Parnell <derek psych.ward> writes:

On Wed, 15 Feb 2006 13:52:12 -0800, Walter Bright wrote:

 Added match expressions.

Too lazy to test sorry. Do match expressions support Unicode or just ASCII?

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 10:44:23 AM

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message
news:k0lbfijz1ng3$.7oaf5rf2w9ut$.dlg 40tude.net...
 On Wed, 15 Feb 2006 13:52:12 -0800, Walter Bright wrote:

 Added match expressions.

 Too lazy to test sorry. Do match expressions support Unicode or just
 ASCII?

I know it works with ASCII, and it's supposed to work with UTF. I wouldn't
be surprised if the latter is buggy, though, since I haven't written test
cases for it.

It's designed, however, so the compiler itself need know nothing about 
regular expressions. The work is all done by std.regexp.

Feb 15 2006

Derek Parnell <derek psych.ward> writes:

On Wed, 15 Feb 2006 16:29:21 -0800, Walter Bright wrote:

 "Derek Parnell" <derek psych.ward> wrote in message
 news:k0lbfijz1ng3$.7oaf5rf2w9ut$.dlg 40tude.net...
 On Wed, 15 Feb 2006 13:52:12 -0800, Walter Bright wrote:

 Added match expressions.

 Too lazy to test sorry. Do match expressions support Unicode or just
 ASCII?

 
 I know it works with ASCII, and it's supposed to work with UTF. I wouldn't
 be surprised if the latter is buggy, though, since I haven't written test
 cases for it.
 
 It's designed, however, so the compiler itself need know nothing about 
 regular expressions. The work is all done by std.regexp.

Seems to be working, but more unittests could be written.

void main()
{
    assert( "\uff16" ~~ "\u2341\uff16" );  // succeeds correctly
    //assert( "\xff" ~~ "\u2341\uff16" );  // fails correctly
    //assert( "^\uff16" ~~ "\u2341\uff16" );  // fails correctly
    assert( "\uff16$" ~~ "\u2341\uff16" );  // succeeds correctly
}

BTW, one side effect of the new matching syntax is that you don't have to
explicitly import std.regexp.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 11:56:43 AM

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:fdfenjm7wj46.1xmq12pyjxp8c$.dlg 40tude.net...
 Seems to be working, but more unittests could be written.

 void main()
 {
    assert( "\uff16" ~~ "\u2341\uff16" );  // succeeds correctly
    //assert( "\xff" ~~ "\u2341\uff16" );  // fails correctly
    //assert( "^\uff16" ~~ "\u2341\uff16" );  // fails correctly
    assert( "\uff16$" ~~ "\u2341\uff16" );  // succeeds correctly
 }

You can use !~ for the fails cases.

 BTW, one side effect of the new matching syntax is that you don't have to
 explicitly import std.regexp.

That was on purpose. It uses a proxy.

Feb 15 2006

Derek Parnell <derek psych.ward> writes:

On Wed, 15 Feb 2006 17:13:48 -0800, Walter Bright wrote:

 "Derek Parnell" <derek psych.ward> wrote in message 
 news:fdfenjm7wj46.1xmq12pyjxp8c$.dlg 40tude.net...
 Seems to be working, but more unittests could be written.


And here they are ...

void main()
{
    char[] target = "\u2341\u2201\uff16";

    assert( "\xff" !~  target );  // fails correctly
    assert( "\x22" !~  target );  // fails correctly

    assert( ".\x22." !~  target );  // fails correctly

    assert( "\uff16" ~~ target );  // succeeds correctly
    assert( "^\uff16" !~ target );  // fails correctly
    assert( "\uff16$" ~~ target );  // succeeds correctly

    assert( "\u2341" ~~ target );  // succeeds correctly
    assert( "^\u2341" ~~ target );  // succeeds correctly
    assert( "\u2341$" !~ target );  // fails correctly

    assert( "\u2201" ~~ target );  // succeeds correctly
    assert( "^\u2201" !~ target );  // fails correctly
    assert( "\u2201$" !~ target );  // fails correctly

    assert( "\u2201\uff16" ~~ target );  // succeeds correctly
    assert( "^\u2201\uff16" !~ target );  // fails correctly
    assert( "\u2201\uff16$" ~~ target );  // succeeds correctly

    assert( "\u2341\u2201" ~~ target );  // succeeds correctly
    assert( "^\u2341\u2201" ~~ target );  // succeeds correctly
    assert( "\u2341\u2201$" !~ target );  // fails correctly

    assert( "\u2341\u2201\uff16" ~~ target );  // succeeds correctly
    assert( "^\u2341\u2201\uff16" ~~ target );  // succeeds correctly
    assert( "\u2341\u2201\uff16$" ~~ target );  // succeeds correctly
    assert( "^\u2341\u2201\uff16$" ~~ target );  // succeeds correctly

    //assert( "\u2341.\uff16" ~~ target );  // fails
    //assert( "^\u2341.\uff16" ~~ target );  // fails
    //assert( "\u2341.\uff16$" ~~ target );  // fails
    //assert( "^\u2341.\uff16$" ~~ target );  // fails

    assert( "\u2341.." ~~ target );  // succeeds correctly
    assert( "^\u2341.." ~~ target );  // succeeds correctly
    //assert( "\u2341..$" ~~ target );  // fails
    //assert( "^\u2341..$" ~~ target );  // fails

    assert( ".." ~~ target );  // succeeds correctly
    assert( "^.." ~~ target );  // succeeds correctly
    assert( "..$" ~~ target );  // succeeds correctly
    assert( "^..$" !~ target );  // fails correctly

    assert( "..\uff16" ~~ target );  // succeeds correctly
    //assert( "^..\uff16" ~~ target );  // fails
    assert( "..\uff16$" ~~ target );  // succeeds correctly
    //assert( "^..\uff16$" ~~ target );  // fails

    assert( "..." ~~ target );  // succeeds correctly
    assert( "^..." ~~ target );  // succeeds correctly
    assert( "...$" ~~ target );  // succeeds correctly
    //assert( "^...$" ~~ target );  // fails

}

It seems that the pattern "." only tries to match a single byte and not a
single character.
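
The byte-versus-character behaviour can be reproduced outside D. The sketch below uses Python's `re` module (an assumption for illustration, not the std.regexp engine) on the same three-character target: matching on the decoded string treats "." as one character, while matching on the UTF-8 bytes treats "." as one byte, which mirrors the failing cases above.

```python
import re

target = "\u2341\u2201\uff16"    # three characters
data = target.encode("utf-8")    # nine bytes, three per character

# Character-wise: "..." spans the whole string.
three_chars = re.fullmatch(r"...", target, re.DOTALL) is not None

# Byte-wise: "..." covers only three of the nine bytes.
three_bytes = re.fullmatch(rb"...", data, re.DOTALL) is not None
nine_bytes = re.fullmatch(rb".{9}", data, re.DOTALL) is not None

# first character, any one unit, last character
char_middle = re.search("\u2341.\uff16", target) is not None
byte_pattern = re.escape("\u2341".encode("utf-8")) + b"." + re.escape("\uff16".encode("utf-8"))
byte_middle = re.search(byte_pattern, data, re.DOTALL) is not None
```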

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 1:16:12 PM

Feb 15 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 "Derek Parnell" <derek psych.ward> wrote in message 
 news:fdfenjm7wj46.1xmq12pyjxp8c$.dlg 40tude.net...
 
 BTW, one side effect of the new matching syntax is that you don't have to
 explicitly import std.regexp.

 
 That was on purpose. It uses a proxy. 

As cool as this is, I don't entirely like the prospect of cutting yet 
more ties between standard library components and runtime code.  My 
approach with Ares has been to separate the two, which until now has 
meant moving only std.utf into the DMD runtime.  Now it looks like 
std.regexp will end up there as well (along with std.outbuffer perhaps).
  With the new language features, is there any reason to continue regex 
library support?  Just how much can't be done by the built-in syntax?


Sean

Feb 15 2006

"Kris" <fu bar.com> writes:

"Sean Kelly" <sean f4.ca> wrote ...
 Walter Bright wrote:
 "Derek Parnell" <derek psych.ward> wrote in message 
 news:fdfenjm7wj46.1xmq12pyjxp8c$.dlg 40tude.net...

 BTW, one side effect of the new matching syntax is that you don't have to
 explicitly import std.regexp.

 That was on purpose. It uses a proxy.

 As cool as this is, I don't entirely like the prospect of cutting yet more 
 ties between standard library components and runtime code.  My approach 
 with Ares has been to separate the two, which until now has meant moving 
 only std.utf into the DMD runtime.  Now it looks like std.regexp will end
 up there as well (along with std.outbuffer perhaps). With the new language 
 features, is there any reason to continue regex library support?  Just how 
 much can't be done by the built-in syntax?

I agree. And it's hard to fathom what the sudden rush to get this is about. 
I listed a number of (IMO) serious issues on the main forum, so I'll add my 
support here that hooking RegExp (and all its various imports) into the 
compiler is just bad news *at this point in time*.

Let's just suppose for a minute that the regex-templates work out well. It 
seems to me that any built-in support for regex (within the D grammar) would 
be nothing more than a thin veneer over the template syntax (for 
regex-templates), to make it somewhat more palatable for the masses? That
may not come to pass, but it seems that we should at least wait until 
there's a bit of education and experience in this regard, rather than 
hurriedly tie the grammar to something which clearly has a number of 
fundamental problems. Again; what's the big rush here?

- Kris

Feb 15 2006

Sean Kelly <sean f4.ca> writes:

Kris wrote:
 "Sean Kelly" <sean f4.ca> wrote ...
 Walter Bright wrote:
 "Derek Parnell" <derek psych.ward> wrote in message 
 news:fdfenjm7wj46.1xmq12pyjxp8c$.dlg 40tude.net...

 BTW, one side effect of the new matching syntax is that you don't have 
 to
 explicitly import std.regexp.

 That was on purpose. It uses a proxy.

 As cool as this is, I don't entirely like the prospect of cutting yet more 
 ties between standard library components and runtime code.  My approach 
 with Ares has been to separate the two, which until now has meant moving 
 only std.utf into the DMD runtime.  Now it looks like std.regex will end 
 up there as well (along with std.outbuffer perhaps). With the new language 
 features, is there any reason to continue regex library support?  Just how 
 much can't be done by the built-in syntax?

 
 I agree. And it's hard to fathom what the sudden rush to get this is about. 
 I listed a number of (IMO) serious issues on the main forum, so I'll add my 
 support here that hooking RegExp (and all its various imports) into the 
 compiler is just bad news *at this point in time*

I'm willing to let the new language feature mature in place.  And while 
I think it's unnecessary given that it can be done just as well in 
library code, there's something about making regex handling a first 
class citizen that increases its appeal.  However, though I can 
understand Walter's desire to leverage existing code, I think it's a 
terrible mistake to make language features rely on library code, even if 
the relationship is concealed.  I don't think this is an issue for D in 
general (since the language spec obviously doesn't require this) so much 
as DMD specifically however.  For example, the GC code currently imports 
std.thread to do various things.  Now let's say that the private 
implementation of std.thread changes, and the changes have an impact on 
inlinable functions.  If the GC code isn't rebuilt, and if it was 
compiled with the -inline option set, all hell could break loose.

More generally however, such ties make it very difficult for third party 
library writers to provide alternate standard library implementations to 
work with DMD (similar to STLPort or STLSoft for C++), because the 
compiler runtime must be rebuilt to operate with any new library used. 
And it's difficult to be certain just what low-level features the 
runtime may rely on without well-defined points of interaction.  This is 
something I'm completely unfamiliar with coming from a C/C++ background, 
and it makes me wonder if any other compiled languages are like this as 
well.

Personally, I would love it if more attention were paid to defining 
necessary library interaction in D.  This is probably the most 
significant thing I've done in Ares and is what I think gives Ares the 
most credibility as a replacement standard library.  And while I would 
love for Walter to assume control of the DMD runtime and GC portions, 
doing so would require some care (and discussion) given to how language 
features such as regex are implemented: does the runtime truly need to 
interact with the standard library?  If so, how?

Implicit UTF conversions during foreach seems a reasonable language 
feature as such code is relatively simple to implement, but regular 
expression processing is somewhat complicated.  Is this a language 
feature that may be ignored by compilers that target embedded 
processors, simply because of size/complexity?  Can I expect to see 
shoddy regex implementations in some compilers such that I'm inclined to 
use a library implementation anyway?  My real concern isn't D now so 
much as D five years from now--I like the language enough that I really 
want it to succeed :-)

 Let's just suppose for a minute that the regex-templates work out well. It 
 seems to me that any built-in support for regex (within the D grammar) would 
 be nothing more than a thin veneer over the template syntax (for 
 regex-templates), to make it somewhat  more palatable for the masses? That 
 may not come to pass, but it seems that we should at least wait until 
 there's a bit of education and experience in this regard, rather than 
 hurriedly tie the grammar to something which clearly has a number of 
 fundamental problems. Again; what's the big rush here?

It makes an odd sort of sense that a language designed by a compiler 
writer should have built-in regex support.  And it seems like Walter has 
been thinking about this for a while, so I'm willing to see how it goes. 
  But as a library writer, the way that Walter implements these changes 
gives me fits ;-)


Sean

Feb 16 2006

kris <fu bar.org> writes:

Sean Kelly wrote:
 Kris wrote:
 
 "Sean Kelly" <sean f4.ca> wrote ...

 Walter Bright wrote:

 "Derek Parnell" <derek psych.ward> wrote in message 
 news:fdfenjm7wj46.1xmq12pyjxp8c$.dlg 40tude.net...

 BTW, one side effect of the new matching syntax is that you don't 
 have to
 explicitly import std.regexp.

 That was on purpose. It uses a proxy.

 As cool as this is, I don't entirely like the prospect of cutting yet 
 more ties between standard library components and runtime code.  My 
 approach with Ares has been to separate the two, which until now has 
 meant moving only std.utf into the DMD runtime.  Now it looks like 
 std.regex will end up there as well (along with std.outbuffer 
 perhaps). With the new language features, is there any reason to 
 continue regex library support?  Just how much can't be done by the 
 built-in syntax?


 I agree. And it's hard to fathom what the sudden rush to get this is 
 about. I listed a number of (IMO) serious issues on the main forum, so 
 I'll add my support here that hooking RegExp (and all its various 
 imports) into the compiler is just bad news *at this point in time*

 
 
 I'm willing to let the new language feature mature in place.  And while 
 I think it's unnecessary given that it can be done just as well in 
 library code, there's something about making regex handling a first 
 class citizen that increases its appeal.  However, though I can 
 understand Walter's desire to leverage existing code, I think it's a 
 terrible mistake to make language features rely on library code, even if 
 the relationship is concealed.  I don't think this is an issue for D in 
 general (since the language spec obviously doesn't require this) so much 
 as DMD specifically however.  For example, the GC code currently imports 
 std.thread to do various things.  Now let's say that the private 
 implementation of std.thread changes, and the changes have an impact on 
 inlinable functions.  If the GC code isn't rebuilt, and if it was 
 compiled with the -inline option set, all hell could break loose.
 
 More generally however, such ties make it very difficult for third party 
 library writers to provide alternate standard library implementations to 
 work with DMD (similar to STLPort or STLSoft for C++), because the 
 compiler runtime must be rebuilt to operate with any new library used. 
 And it's difficult to be certain just what low-level features the 
 runtime may rely on without well-defined points of interaction.  This is 
 something I'm completely unfamiliar with coming from a C/C++ background, 
 and it makes me wonder if any other compiled languages are like this as 
 well.
 
 Personally, I would love it if more attention were paid to defining 
 necessary library interaction in D.  This is probably the most 
 significant thing I've done in Ares and is what I think gives Ares the 
 most credibility as a replacement standard library.  And while I would 
 love for Walter to assume control of the DMD runtime and GC portions, 
 doing so would require some care (and discussion) given to how language 
 features such as regex are implemented: does the runtime truly need to 
 interact with the standard library?  If so, how?
 
 Implicit UTF conversions during foreach seems a reasonable language 
 feature as such code is relatively simple to implement, but regular 
 expression processing is somewhat complicated.  Is this a language 
 feature that may be ignored by compilers that target embedded 
 processors, simply because of size/complexity?  Can I expect to see 
 shoddy regex implementations in some compilers such that I'm inclined to 
 use a library implementation anyway?  My real concern isn't D now so 
 much as D five years from now--I like the language enough that I really 
 want it to succeed :-)
 
 Let's just suppose for a minute that the regex-templates work out 
 well. It seems to me that any built-in support for regex (within the D 
 grammar) would be nothing more than a thin veneer over the template 
 syntax (for regex-templates), to make it somewhat  more palatable for 
 the masses? That may not come to pass, but it seems that we should at 
 least wait until there's a bit of education and experience in this 
 regard, rather than hurriedly tie the grammar to something which 
 clearly has a number of fundamental problems. Again; what's the big 
 rush here?

 
 
 It makes an odd sort of sense that a language designed by a compiler 
 writer should have built-in regex support.  And it seems like Walter has 
 been thinking about this for a while, so I'm willing to see how it goes. 
  But as a library writer, the way that Walter implements these changes 
 gives me fits ;-)
 
 
 Sean

Amen.

I made an eerily similar post on the D forum.

Feb 16 2006

Kyle Furlong <kylefurlong gmail.com> writes:

Sean Kelly wrote:
 Kris wrote:
 "Sean Kelly" <sean f4.ca> wrote ...
 Walter Bright wrote:
 "Derek Parnell" <derek psych.ward> wrote in message 
 news:fdfenjm7wj46.1xmq12pyjxp8c$.dlg 40tude.net...

 BTW, one side effect of the new matching syntax is that you don't 
 have to
 explicitly import std.regexp.

 That was on purpose. It uses a proxy.

 As cool as this is, I don't entirely like the prospect of cutting yet 
 more ties between standard library components and runtime code.  My 
 approach with Ares has been to separate the two, which until now has 
 meant moving only std.utf into the DMD runtime.  Now it looks like 
 std.regex will end up there as well (along with std.outbuffer 
 perhaps). With the new language features, is there any reason to 
 continue regex library support?  Just how much can't be done by the 
 built-in syntax?

 I agree. And it's hard to fathom what the sudden rush to get this is 
 about. I listed a number of (IMO) serious issues on the main forum, so 
 I'll add my support here that hooking RegExp (and all its various 
 imports) into the compiler is just bad news *at this point in time*

 
 I'm willing to let the new language feature mature in place.  And while 
 I think it's unnecessary given that it can be done just as well in 
 library code, there's something about making regex handling a first 
 class citizen that increases its appeal.  However, though I can 
 understand Walter's desire to leverage existing code, I think it's a 
 terrible mistake to make language features rely on library code, even if 
 the relationship is concealed.  I don't think this is an issue for D in 
 general (since the language spec obviously doesn't require this) so much 
 as DMD specifically however.  For example, the GC code currently imports 
 std.thread to do various things.  Now let's say that the private 
 implementation of std.thread changes, and the changes have an impact on 
 inlinable functions.  If the GC code isn't rebuilt, and if it was 
 compiled with the -inline option set, all hell could break loose.
 
 More generally however, such ties make it very difficult for third party 
 library writers to provide alternate standard library implementations to 
 work with DMD (similar to STLPort or STLSoft for C++), because the 
 compiler runtime must be rebuilt to operate with any new library used. 
 And it's difficult to be certain just what low-level features the 
 runtime may rely on without well-defined points of interaction.  This is 
 something I'm completely unfamiliar with coming from a C/C++ background, 
 and it makes me wonder if any other compiled languages are like this as 
 well.
 
 Personally, I would love it if more attention were paid to defining 
 necessary library interaction in D.  This is probably the most 
 significant thing I've done in Ares and is what I think gives Ares the 
 most credibility as a replacement standard library.  And while I would 
 love for Walter to assume control of the DMD runtime and GC portions, 
 doing so would require some care (and discussion) given to how language 
 features such as regex are implemented: does the runtime truly need to 
 interact with the standard library?  If so, how?
 
 Implicit UTF conversions during foreach seems a reasonable language 
 feature as such code is relatively simple to implement, but regular 
 expression processing is somewhat complicated.  Is this a language 
 feature that may be ignored by compilers that target embedded 
 processors, simply because of size/complexity?  Can I expect to see 
 shoddy regex implementations in some compilers such that I'm inclined to 
 use a library implementation anyway?  My real concern isn't D now so 
 much as D five years from now--I like the language enough that I really 
 want it to succeed :-)
 
 Let's just suppose for a minute that the regex-templates work out 
 well. It seems to me that any built-in support for regex (within the D 
 grammar) would be nothing more than a thin veneer over the template 
 syntax (for regex-templates), to make it somewhat  more palatable for 
 the masses? That may not come to pass, but it seems that we should at 
 least wait until there's a bit of education and experience in this 
 regard, rather than hurriedly tie the grammar to something which 
 clearly has a number of fundamental problems. Again; what's the big 
 rush here?

 
 It makes an odd sort of sense that a language designed by a compiler 
 writer should have built-in regex support.  And it seems like Walter has 
 been thinking about this for a while, so I'm willing to see how it goes. 
  But as a library writer, the way that Walter implements these changes 
 gives me fits ;-)
 
 
 Sean

We (the Titan team) have run into this sort of issue with Titan. When 
trying to untangle the dmd runtime code, we have found huge reliance on 
library code, both libc and phobos. Another issue that makes porting 
difficult is the lack of a standard definition of what language features 
expand to what runtime functions. These two things have made dealing 
with the dmd runtime extremely hackish and untenable. Walter, for the 
love of Bob, do something about this.

Feb 16 2006

Sean Kelly <sean f4.ca> writes:

Kyle Furlong wrote:
 
 We (the Titan team) have run into this sort of issue with Titan. When 
 trying to untangle the dmd runtime code, we have found huge reliance on 
 library code, both libc and phobos. Another issue that makes porting 
 difficult is the lack of a standard definition of what language features 
 expand to what runtime functions. These two things have made dealing 
 with the dmd runtime extremely hackish and untenable. Walter, for the 
 love of Bob, do something about this.

In the meantime, I suggest using Ares as a starting point.  It still 
uses libc functionality (which I think is unavoidable), but should 
otherwise be much cleaner to build off of.  If you have any questions 
about library interaction (since I've yet to document a lot of this) 
feel free to ask in the forums.


Sean

Feb 16 2006

"Kris" <fu bar.com> writes:

"Sean Kelly" <sean f4.ca> wrote
 Kyle Furlong wrote:
 We (the Titan team) have run into this sort of issue with Titan. When 
 trying to untangle the dmd runtime code, we have found huge reliance on 
 library code, both libc and phobos. Another issue that makes porting 
 difficult is the lack of a standard definition of what language features 
 expand to what runtime functions. These two things have made dealing with 
 the dmd runtime extremely hackish and untenable. Walter, for the love of 
 Bob, do something about this.

 In the meantime, I suggest using Ares as a starting point.  It still uses 
 libc functionality (which I think is unavoidable), but should otherwise be 
 much cleaner to build off of.

I'll second that. Ares is as good an isolation of D compiler support as 
you're likely to see anywhere.

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Kyle Furlong" <kylefurlong gmail.com> wrote in message 
news:dt2mkm$17aa$1 digitaldaemon.com...
 We (the Titan team) have run into this sort of issue with Titan. When 
 trying to untangle the dmd runtime code, we have found huge reliance on 
 library code, both libc and phobos.

Sure, but I'm not sure why this is a problem.

 Another issue that makes porting difficult is the lack of a standard 
 definition of what language features expand to what runtime functions. 
 These two things have made dealing with the dmd runtime extremely hackish 
 and untenable. Walter, for the love of Bob, do something about this.

Feb 16 2006

Kyle Furlong <kylefurlong gmail.com> writes:

Walter Bright wrote:
 "Kyle Furlong" <kylefurlong gmail.com> wrote in message 
 news:dt2mkm$17aa$1 digitaldaemon.com...
 We (the Titan team) have run into this sort of issue with Titan. When 
 trying to untangle the dmd runtime code, we have found huge reliance on 
 library code, both libc and phobos.

 
 Sure, but I'm not sure why this is a problem.
 
 Another issue that makes porting difficult is the lack of a standard 
 definition of what language features expand to what runtime functions. 
 These two things have made dealing with the dmd runtime extremely hackish 
 and untenable. Walter, for the love of Bob, do something about this.

It basically forces us to write our own libc. And yes, one can argue that 
libc is as necessary as air for any platform, but I'm of the purist type, 
and feel that the runtime should be self-contained.

You also did not respond to my request for a runtime standard. Is this 
something you are unwilling to do? That is, is each vendor's runtime going 
to be completely incompatible with others?

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Kyle Furlong" <kylefurlong gmail.com> wrote in message
news:dt3gn8$1sq0$1 digitaldaemon.com...
 Walter Bright wrote:
 "Kyle Furlong" <kylefurlong gmail.com> wrote in message
 news:dt2mkm$17aa$1 digitaldaemon.com...
 We (the Titan team) have run into this sort of issue with Titan. When
 trying to untangle the dmd runtime code, we have found huge reliance on
 library code, both libc and phobos.

 Sure, but I'm not sure why this is a problem.

 Another issue that makes porting difficult is the lack of a standard
 definition of what language features expand to what runtime functions.
 These two things have made dealing with the dmd runtime extremely
 hackish and untenable. Walter, for the love of Bob, do something about
 this.


 It basically forces us to write our own libc. And yes one can argue that
 libc is as necessary as air for any platform, but I'm of the purist type,
 and feel that the runtime should be self contained.

Since D is designed to interface directly to C, the C runtime is kind of a
given. I also don't see much point in reimplementing things like strlen,
strtod, etc. These have been around for decades, they're well optimized, and 
bug free.

 You also did not respond to my request for a runtime standard, is this
 something you are unwilling to do? i.e. is each vendor's runtime going to
 be completely incompatible with others?

I'm not sure what you're asking. Are you asking if Phobos is a D standard?

Feb 16 2006

Kyle Furlong <kylefurlong gmail.com> writes:

Walter Bright wrote:
 "Kyle Furlong" <kylefurlong gmail.com> wrote in message
 news:dt3gn8$1sq0$1 digitaldaemon.com...
 Walter Bright wrote:
 "Kyle Furlong" <kylefurlong gmail.com> wrote in message
 news:dt2mkm$17aa$1 digitaldaemon.com...
 We (the Titan team) have run into this sort of issue with Titan. When
 trying to untangle the dmd runtime code, we have found huge reliance on
 library code, both libc and phobos.

 Sure, but I'm not sure why this is a problem.

 Another issue that makes porting difficult is the lack of a standard
 definition of what language features expand to what runtime functions.
 These two things have made dealing with the dmd runtime extremely
 hackish and untenable. Walter, for the love of Bob, do something about
 this.


 It basically forces us to write our own libc. And yes one can argue that
 libc is as necessary as air for any platform, but I'm of the purist type,
 and feel that the runtime should be self contained.

 
 Since D is designed to interface directly to C, the C runtime is kind of a
 given. I also don't see much point in reimplementing things like strlen,
 strtod, etc. These have been around for decades, they're well optimized, and 
 bug free.
 
 You also did not respond to my request for a runtime standard, is this
 something you are unwilling to do? i.e. is each vendor's runtime going to
 be completely incompatible with others?

 
 I'm not sure what you're asking. Are you asking if Phobos is a D standard?

So your dmd compiler emits references to the _d_whatever extern (C) 
functions in the runtime, correct? I'm asking if this is going to be a 
standard, part of the spec, or vendor specific.

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Kyle Furlong" <kylefurlong gmail.com> wrote in message 
news:dt3oer$22lg$1 digitaldaemon.com...
 So your dmd compiler emits references to the _d_whatever extern(C) 
 functions in the runtime, correct?

Yes.

 I'm asking if this is going to be a standard, part of the spec, or vendor 
 specific.

All the stuff in the internal package is meant to be vendor specific. So, 
yes. For another vendor, the names may change, there may be more or fewer 
such routines.

Feb 17 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 "Kyle Furlong" <kylefurlong gmail.com> wrote in message
 news:dt3gn8$1sq0$1 digitaldaemon.com...
 
 You also did not respond to my request for a runtime standard, is this
 something you are unwilling to do? i.e. is each vendor's runtime going to
 be completely incompatible with others?

 
 I'm not sure what you're asking. Are you asking if Phobos is a D standard?

I think Kyle is wondering whether a compiler writer could simply sit 
down and write a D compiler, sans standard library, given the D spec.  I 
think this is possible insofar as language features are concerned, but 
it may be less clear regarding any points of contact between the runtime 
and the GC or standard library code.  For example, internal/gc/gc.d 
contains a bunch of extern (C) functions (prefixed with "_d_") which 
probably really belong to the runtime code.  But if this is true, what 
are the points of contact between the runtime and GC?  Using Phobos as a 
guide, one might think a compiler writer must provide a garbage 
collector as a part of the runtime, while I consider these logically 
separate libraries.  I think the real goal here is to define a clear 
separation of labor, so a compiler writer can do his part, a library 
writer his part, a platform writer his part, and each can be assured 
that if they follow the spec then their libraries should link against 
any implementation of the other pieces and work without error.  This is 
really what I'm making an effort to define, and why I fussed so much 
over this RegExp business.  Until this last release, things had been 
distilled to a few well-defined extern (C) functions--no need for 
imports at all :-)


Sean

Feb 16 2006

Kyle Furlong <kylefurlong gmail.com> writes:

Sean Kelly wrote:
 Walter Bright wrote:
 "Kyle Furlong" <kylefurlong gmail.com> wrote in message
 news:dt3gn8$1sq0$1 digitaldaemon.com...

 You also did not respond to my request for a runtime standard, is this
 something you are unwilling to do? i.e. is each vendor's runtime 
 going to
 be completely incompatible with others?

 I'm not sure what you're asking. Are you asking if Phobos is a D 
 standard?

 
 I think Kyle is wondering whether a compiler writer could simply sit 
 down and write a D compiler, sans standard library, given the D spec.  I 
 think this is possible insofar as language features are concerned, but 
 it may be less clear regarding any points of contact between the runtime 
 and the GC or standard library code.  For example, internal/gc/gc.d 
 contains a bunch of extern (C) functions (prefixed with "_d_") which 
 probably really belong to the runtime code.  But if this is true, what 
 are the points of contact between the runtime and GC?  Using Phobos as a 
 guide, one might think a compiler writer must provide a garbage 
 collector as a part of the runtime, while I consider these logically 
 separate libraries.  I think the real goal here is to define a clear 
 separation of labor, so a compiler writer can do his part, a library 
 writer his part, a platform writer his part, and each can be assured 
 that if they follow the spec then their libraries should link against 
 any implementation of the other pieces and work without error.  This is 
 really what I'm making an effort to define, and why I fussed so much 
 over this RegExp business.  Until this last release, things had been 
 distilled to a few well-defined extern (C) functions--no need for 
 imports at all :-)
 
 
 Sean

Well put Sean, I would be very interested in Walter's take on these issues.

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:dt3phq$23iu$1 digitaldaemon.com...
 I think Kyle is wondering whether a compiler writer could simply sit down 
 and write a D compiler, sans standard library, given the D spec.  I think 
 this is possible insofar as language features are concerned, but it may be 
 less clear regarding any points of contact between the runtime and the GC 
 or standard library code.  For example, internal/gc/gc.d contains a bunch 
 of extern (C) functions (prefixed with "_d_") which probably really belong 
 to the runtime code.  But if this is true, what are the points of contact 
 between the runtime and GC?

For the language implementor, the stuff in std.gc. How operator new 
interfaces with the gc is up to the language implementor.

 Using Phobos as a guide, one might think a compiler writer must provide a 
 garbage collector as a part of the runtime, while I consider these 
 logically separate libraries.  I think the real goal here is to define a 
 clear separation of labor, so a compiler writer can do his part, a library 
 writer his part, a platform writer his part, and each can be assured that 
 if they follow the spec then their libraries should link against any 
 implementation of the other pieces and work without error.  This is really 
 what I'm making an effort to define, and why I fussed so much over this 
 RegExp business.  Until this last release, things had been distilled to a 
 few well-defined extern (C) functions--no need for imports at all :-)

Feb 17 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:dt3phq$23iu$1 digitaldaemon.com...
 I think Kyle is wondering whether a compiler writer could simply sit down 
 and write a D compiler, sans standard library, given the D spec.  I think 
 this is possible insofar as language features are concerned, but it may be 
 less clear regarding any points of contact between the runtime and the GC 
 or standard library code.  For example, internal/gc/gc.d contains a bunch 
 of extern (C) functions (prefixed with "_d_") which probably really belong 
 to the runtime code.  But if this is true, what are the points of contact 
 between the runtime and GC?

 
 For the language implementor, the stuff in std.gc. How operator new 
 interfaces with the gc is up to the language implementor.

But what if the user wants to employ a non-standard GC?  There have 
already been questions about this for real-time programming and other 
specialized applications.

While I'm coming to understand your argument about the necessary 
reliance of runtime code on library code, I do believe that D can only 
benefit if the scope of this reliance and the means of interaction are 
well-defined.  You've mentioned that, according to their specs, the 
C/C++ libraries are inextricably intertwined with the compiler 
definition, and have said that you consider this something you've sought 
to fix in D.  And while I don't have the experience with writing C/C++ 
compilers that you do (and therefore have little exposure to this 
particular issue), it does seem we somewhat agree on what the correct 
approach for library design should be.  As a point of discussion, I'd 
like to outline what I've done with Ares thus far.

First, it's important to note that I consider the runtime to be a 
distinct library containing anything required for basic language 
support, the garbage collector similarly separated and devoted to memory 
management, and the standard library as a third distinct entity which 
contains all components and interfaces the user is expected to actually 
interact with.  Phobos already has this basic separation, but the points 
of interaction between each component aren't particularly well-defined. 
  For example, if someone wants to provide a specialized garbage 
collector, what does he do?  A bit of research will reveal that some 
modules from internal/gc should be removed and a new class of type GC 
should be created, but this requires more interaction with low-level 
code than most users want to have.

Second, I believe that importing modules across these library boundaries 
should be avoided if at all possible, as doing so creates a compile-time 
dependency between them.  Also, it seems logical to assume that the 
runtime and GC code might not be written in D at all, so the points of 
interaction should be equally accessible from other languages, implying 
that all such points of interaction should be extern (C) functions.  This 
also has the side benefit of allowing the functions to be declared in the 
module where they're called, as the name mangling scheme ignores 
declaration placement.
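
To make the point about declaration placement concrete, here is a minimal 
sketch in present-day D syntax (not the 2006 dialect, and not code from 
Ares). It assumes only the C library's strlen; dLength and demoLength are 
hypothetical names introduced for illustration.

```d
// Local prototype for a C function: no import is needed, because an
// extern (C) symbol is identified by its plain name alone.
extern (C) size_t strlen(const(char)* s) nothrow @nogc;

// A comparable function with D linkage, for contrast (hypothetical name).
size_t dLength(const(char)[] s) { return s.length; }

// The C-linkage name carries no module information...
static assert(strlen.mangleof == "strlen");
// ...whereas the D-linkage name is mangled and so differs from the identifier.
static assert(dLength.mangleof != "dLength");

// Hypothetical caller that uses the locally declared prototype.
size_t demoLength()
{
    // D string literals are zero-terminated and convert to const(char)*.
    return strlen("hello");
}
```

Because the prototype can live in any module, a runtime could call into a 
separately built library this way without importing any of its modules.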

Since the purpose of a garbage collector is to allocate and manage 
memory, I see little need to extend its interface beyond this. 
Therefore I consider the "_d_" prefixed calls in internal/gc/gc.d to 
really belong to the runtime, where I've moved them.  Currently, a GC 
library in Ares is required to expose these functions (which are 
wrapped in a static class instance for user access in the standard library):

     extern (C) void gc_init();
     extern (C) void gc_term();

     alias void function( void *p, void *dummy ) gc_finalizer;
     extern (C) void gc_setFinalizer( void *p, gc_finalizer fn );

     extern (C) void gc_enable();
     extern (C) void gc_disable();
     extern (C) void gc_collect();

     extern (C) void* gc_malloc( size_t sz );
     extern (C) void* gc_calloc( size_t nm, size_t sz );
     extern (C) void* gc_realloc( void* p, size_t sz );
     extern (C) void gc_free( void* p );

     extern (C) size_t gc_sizeOf( void* p );
     extern (C) size_t gc_capacityOf( void* p );

     extern (C) void gc_addRoot( void* p );
     extern (C) void gc_addRange( void* pbeg, void* pend );

     extern (C) void gc_removeRoot( void* p );
     extern (C) void gc_removeRange( void* pbeg, void* pend );

     extern (C) void gc_pin( void* p );
     extern (C) void gc_unpin( void* p );
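
As a rough illustration of how small a library exposing such entry points 
could be, the sketch below backs three of them with the C heap and never 
collects anything. It is a toy under stated assumptions, not the Ares 
implementation: the toy_ prefix and the Header struct are hypothetical, and 
the prefix is there so the names cannot clash with symbols a real runtime 
may already export. It is written in present-day D syntax.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical bookkeeping record stored directly in front of each block.
private struct Header { size_t size; }

// Allocate sz bytes and remember the requested size in the header.
// The payload is aligned only as strictly as Header in this sketch.
extern (C) void* toy_gc_malloc(size_t sz)
{
    auto h = cast(Header*) malloc(Header.sizeof + sz);
    if (h is null)
        return null;
    h.size = sz;
    return cast(void*)(h + 1); // payload starts right after the header
}

// Report the size that was requested for a block from toy_gc_malloc.
extern (C) size_t toy_gc_sizeOf(void* p)
{
    return p is null ? 0 : (cast(Header*) p - 1).size;
}

// Release a block; a null pointer is ignored.
extern (C) void toy_gc_free(void* p)
{
    if (p !is null)
        free(cast(Header*) p - 1);
}
```

A caller needs nothing more than matching extern (C) prototypes, so the 
library behind them could be swapped at link time.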

I've also considered requiring that the runtime expose an 
os_getStaticDataSegment function and potentially other OS-specific 
memory related functions, but haven't gotten around to it so far.

The remaining points of interaction are all provided by the standard 
library.  First, the callbacks for runtime errors, all of which are 
expected to throw exceptions as default behavior (though onAssert can be 
hooked at run-time if the user wishes to signal the debugger or 
something similar):

     extern (C) void onAssert( char[] file, uint line );
     extern (C) void onOutOfMemory();
     extern (C) void onArrayBoundsError( char[] file, size_t line );
     extern (C) void onSwitchError( char[] file, size_t line );
     extern (C) void onInvalidUtfError( char[] msg, size_t idx );

And then a way to monitor and control multithreading for debugging or GC 
use:

     extern (C) bit multiThreaded();
     extern (C) void suspendAllThreads();
     extern (C) void resumeAllThreads();
     extern (C) void scanAllThreads( void delegate( void*, void* ) fn );

Also, I suspect I'll now be adding a set of functions for RegExp 
interaction, but haven't done that yet.

So you can see that, so far, there has been no need to import any 
modules across library boundaries--all imports are either internal or of 
C headers (which can be easily declared in the module they're called if 
desired).  I think Phobos could ultimately benefit from such an 
arrangement, but it's really not critical at this point.


Sean

Feb 17 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:dt5cbc$i49$1 digitaldaemon.com...
 Walter Bright wrote:
 For the language implementor, the stuff in std.gc. How operator new 
 interfaces with the gc is up to the language implementor.

 But what if the user wants to employ a non-standard GC?  There have 
 already been questions about this for real-time programming and other 
 specialized applications.

I don't know what you mean by non-standard. It must implement the interface 
in std.gc, and operator new and delete need to work. Other than that, there 
are a wide range of gc implementation strategies one can use.

 First, it's important to note that I consider the runtime to be a distinct 
 library containing anything required for basic language support, the 
 garbage collector similarly separated and devoted to memory management, 
 and the standard library as a third distinct entity which contains all 
 components and interfaces the user is expected to actually interact with. 
 Phobos already has this basic separation, but the points of interaction 
 between each component aren't particularly well-defined. For example, if 
 someone wants to provide a specialized garbage collector, what does he 
 do?  A bit of research will reveal that some modules from internal/gc 
 should be removed and a new class of type GC should be created, but this 
 requires more interaction with low-level code than most users want to 
 have.

Writing a gc is non-trivial, and someone who is up to that task I doubt will 
have much difficulty with the interface to it. You're right in that one 
can't casually create a GC class, but I don't see that as a fault in the 
interface.

 Second, I believe it's important that the need to import modules across 
 these library boundaries should be avoided if at all possible, as doing so 
 creates a compile-time dependency between them.  Also, it seems logical to 
 assume that the runtime and GC code might not be written in D at all, so 
 the points of interaction should be equally accessible from other 
 languages, implying that all such points of interaction should be extern 
 (C) functions.  This also has the side benefit of allowing the functions 
 to be declared in the module they're called, as the name mangling scheme 
 ignores declaration placement.

I don't see the reason why one would want to write a new GC that is not in 
D. If one wants to use an existing one, say the Boehm GC which is in C, all 
one needs is a simple wrapper of D functions around the Boehm ones.

 So you can see that, so far, there has been no need to import any modules 
 across library boundaries--all imports are either internal or of C headers 
 (which can be easily declared in the module they're called if desired).  I 
 think Phobos could ultimately benefit from such an arrangement, but it's 
 really not critical at this point.

I see what you're doing, but what is the advantage of avoiding doing the 
import if you're going to need that code anyway?

Feb 17 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 
 news:dt5cbc$i49$1 digitaldaemon.com...
 Walter Bright wrote:
 For the language implementor, the stuff in std.gc. How operator new 
 interfaces with the gc is up to the language implementor.

 But what if the user wants to employ a non-standard GC?  There have 
 already been questions about this for real-time programming and other 
 specialized applications.

 
 I don't know what you mean by non-standard. It must implement the interface 
 in std.gc, and operator new and delete need to work. Other than that, there 
 are a wide range of gc implementation strategies one can use.

By non-standard I simply meant a different implementation.

 First, it's important to note that I consider the runtime to be a distinct 
 library containing anything required for basic language support, the 
 garbage collector similarly separated and devoted to memory management, 
 and the standard library as a third distinct entity which contains all 
 components and interfaces the user is expected to actually interact with. 
 Phobos already has this basic separation, but the points of interaction 
 between each component aren't particularly well-defined. For example, if 
 someone wants to provide a specialized garbage collector, what does he 
 do?  A bit of research will reveal that some modules from internal/gc 
 should be removed and a new class of type GC should be created, but this 
 requires more interaction with low-level code than most users want to 
 have.

 
 Writing a gc is non-trivial, and someone who is up to that task I doubt will 
 have much difficulty with the interface to it. You're right in that one 
 can't casually create a GC class, but I don't see that as a fault in the 
 interface.

True enough.

 Second, I believe it's important that the need to import modules across 
 these library boundaries should be avoided if at all possible, as doing so 
 creates a compile-time dependency between them.  Also, it seems logical to 
 assume that the runtime and GC code might not be written in D at all, so 
 the points of interaction should be equally accessible from other 
 languages, implying that all such points of interaction should be extern 
 (C) functions.  This also has the side benefit of allowing the functions 
 to be declared in the module they're called, as the name mangling scheme 
 ignores declaration placement.

 
 I don't see the reason why one would want to write a new GC that is not in 
 D. If one wants to use an existing one, say the Boehm GC which is in C, all 
 one needs is a simple wrapper of D functions around the Boehm ones.

I meant that more as a general statement rather than regarding the GC 
specifically--I think it's more likely that portions of the runtime code 
will not be written in D than the GC.  But as D code can call C 
functions directly, why not use that for library interaction?  It seems 
more straightforward than creating wrappers.  Also, I think the thread 
control functions may be useful for a debugger (which may well be 
written in C/C++), and the GC functions might be useful in mixed-code 
applications.  Wrappers could again be created, but I don't see the point.

 So you can see that, so far, there has been no need to import any modules 
 across library boundaries--all imports are either internal or of C headers 
 (which can be easily declared in the module they're called if desired).  I 
 think Phobos could ultimately benefit from such an arrangement, but it's 
 really not critical at this point.

 
 I see what you're doing, but what is the advantage of avoiding doing the 
 import if you're going to need that code anyway? 

Largely to avoid compile-time dependencies between libraries, as I feel 
it's important that a user should be able to download an alternate 
standard library or GC and use it simply by linking it in.  And while 
this could also be accomplished by documenting that UDTs should perhaps 
not be used and compiling against header modules, it seems more 
straightforward to simply define things at the code level.

Another benefit I discovered is that this approach allows specialized 
functionality to be exposed or code paths to be optimized specifically 
for library use.  For example, I'd originally defined a Thread.count 
method which I knew was being called by the GC.  But when I got around 
to looking at the GC code I realized that it didn't actually care how 
many threads were running so much as whether critical sections were 
necessary to ensure correct behavior.  And this revealed that the way I 
was tracking thread count--modified by the newly created thread before 
entering user code--was not only incorrect, but the fact that 
Thread.count passed through a critical section of its own made it 
effectively useless to the GC code.  The redesigned function serves one 
purpose: to indicate whether Thread.start has ever been called by the 
application, and thus whether memory synchronization issues might be 
present or mutual exclusion might be necessary.  No critical sections 
are used, and indeed, a count of threads isn't even maintained--just a 
bit flag.  This approach was obvious in light of what the GC needed, but 
it was not at all apparent from the context of what a standard library 
user might be interested in.
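
A minimal sketch of the "flag instead of a count" idea, written in present-day D rather than the 2006 dialect. The names sketch_noteThreadStart and sketch_multiThreaded are hypothetical stand-ins, not the Ares API, and the atomic load/store is an assumption about how such a flag might be kept safe without a critical section:

```d
import core.atomic : atomicLoad, atomicStore;

// Hypothetical flag: set once when any thread is started, never cleared.
// No thread count is kept at all.
private shared bool threadEverStarted = false;

// Hypothetical hook that a Thread.start implementation would call
// before the new thread begins running user code.
void sketch_noteThreadStart()
{
    atomicStore(threadEverStarted, true);
}

// Stand-in for the multiThreaded() entry point listed earlier in the
// thread (declared there as returning bit; bool is used here).
extern (C) bool sketch_multiThreaded()
{
    return atomicLoad(threadEverStarted);
}

unittest
{
    assert(!sketch_multiThreaded());   // nothing started yet
    sketch_noteThreadStart();
    assert(sketch_multiThreaded());    // stays true from now on
}
```

A GC written against this only has to ask one yes/no question before deciding whether to take its locks. The block can be tried with dmd -unittest -main -run on a file containing it.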

Finally, defining specific means of interaction allows behavior to be 
modified quite easily.  When a system error occurs in Ares, rather than 
throwing an exception directly the runtime instead passes relevant 
information to a callback exposed by the standard library.  Thus the 
runtime has no dependence on the exception object definition (aside from 
the requirements imposed by the stack unwinding code itself), and the 
user has a clear means of hooking the error handling mechanism if 
different behavior is desired--the behavior of onAssertError can be 
modified, for example, so the user can signal the debugger immediately 
instead of waiting for an exception to propagate.  If the modules were 
imported and exceptions thrown directly, this would obviously not be 
possible.
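
The callback arrangement described above can be sketched roughly as follows, again in present-day D. Everything prefixed sketch_ and the AssertHandler alias are hypothetical names for illustration; the actual Ares declarations quoted earlier use char[] and uint, and the real hooking mechanism may differ:

```d
// Hypothetical handler type for user-installed assert hooks.
alias AssertHandler = void function(string file, size_t line);

// Hypothetical storage for the user's handler; null means "use default".
private __gshared AssertHandler userAssertHandler = null;

// Hypothetical standard-library call that lets the user hook the error.
void sketch_setAssertHandler(AssertHandler h)
{
    userAssertHandler = h;
}

// Stand-in for the onAssert callback. The runtime only needs this
// extern (C) symbol; it never sees the exception class itself.
extern (C) void sketch_onAssert(string file, size_t line)
{
    if (userAssertHandler !is null)
    {
        userAssertHandler(file, line);   // e.g. signal a debugger
        return;
    }
    // Default behaviour: throw, carrying the failing location.
    throw new Exception("assertion failure", file, line);
}

unittest
{
    // Default path throws.
    bool threw = false;
    try { sketch_onAssert("demo.d", 10); }
    catch (Exception e) { threw = (e.line == 10); }
    assert(threw);

    // Hooked path calls the user's function instead.
    static size_t seenLine;
    sketch_setAssertHandler(function(string f, size_t l) { seenLine = l; });
    sketch_onAssert("demo.d", 42);
    assert(seenLine == 42);
    sketch_setAssertHandler(null);
}
```

The point of the design is visible in the signatures: the runtime side depends only on a C-linkage function name, so swapping the standard library changes what happens on failure without recompiling the runtime.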

Since this approach seems to provide at least marginal benefit, I would 
like to turn things around and ask what the advantage is of importing 
modules directly as Phobos does?  I can see that it offers immediate 
relief if the library writer decides he needs more functionality than 
has been predetermined, but with a prototype standard library already in 
place I would think that such needs should already be obvious.  Are 
there other advantages as well?


Sean

Feb 18 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:dt7u8d$2o5g$1 digitaldaemon.com...
 Since this approach seems to provide at least marginal benefit, I would 
 like to turn things around and ask what the advantage is of importing 
 modules directly as Phobos does?  I can see that it offers immediate 
 relief if the library writer decides he needs more functionality than has 
 been predetermined, but with a prototype standard library already in place 
 I would think that such needs should already be obvious.  Are there other 
 advantages as well?

You have made some good points.

Feb 20 2006

Kyle Furlong <kylefurlong gmail.com> writes:

Walter Bright wrote:
 Added match expressions.
 
 http://www.digitalmars.com/d/changelog.html
 
 
 

Is it possible to drop in compile-time regex support? (i.e. Eric's solution)

Feb 15 2006

pragma <pragma_member pathlink.com> writes:

In article <dt0k9b$27jr$1 digitaldaemon.com>, Kyle Furlong says...
Walter Bright wrote:
 Added match expressions.
 
 http://www.digitalmars.com/d/changelog.html
 
 
 

Is it possible to drop in compile-time regex support? (i.e. Eric's solution)

IMHO, it's not quite ready for prime-time yet.  In fact, some parts of it are
still somewhat incomplete. :(

- Eric Anderton at yahoo

Feb 15 2006

James Dunne <james.jdunne gmail.com> writes:

pragma wrote:
 In article <dt0k9b$27jr$1 digitaldaemon.com>, Kyle Furlong says...
 
Walter Bright wrote:

Added match expressions.

http://www.digitalmars.com/d/changelog.html

Is it possible to drop in compile-time regex support? (i.e. Eric's solution)

 
 
 IMHO, it's not quite ready for prime-time yet.  In fact, some parts of it are
 still somewhat incomplete. :(
 
 - Eric Anderton at yahoo

Not to knock Eric's great efforts at compile-time regex (which is 
seriously cool, btw), but I would be more impressed at code generation 
of regex parsing.  Have the compiler itself write out some highly 
optimized goto-like code and have it parse known regex strings at 
runtime in the fastest way possible.  Reminds me of the approach of the 
Ragel state machine (link on D Links page), but doesn't have to be 
anywhere near as complicated.

-- 
Regards,
James Dunne

Feb 15 2006

BCS <BCS_member pathlink.com> writes:

James Dunne wrote:
 pragma wrote:
 
 In article <dt0k9b$27jr$1 digitaldaemon.com>, Kyle Furlong says...

 Walter Bright wrote:

 Added match expressions.

 http://www.digitalmars.com/d/changelog.html

 Is it possible to drop in compile-time regex support? (i.e. Eric's 
 solution)



 IMHO, it's not quite ready for prime-time yet.  In fact, some parts of 
 it are
 still somewhat incomplete. :(

 - Eric Anderton at yahoo

 
 
 Not to knock Eric's great efforts at compile-time regex (which is 
 seriously cool, btw), but I would be more impressed at code generation 
 of regex parsing.  Have the compiler itself write out some highly 
 optimized goto-like code and have it parse known regex strings at 
 runtime in the fastest way possible.  Reminds me of the approach of the 
 Ragel state machine (link on D Links page), but doesn't have to be 
 anywhere near as complicated.
 

I am not commenting on the regex support in particular (I haven't used it yet); 
however, I think that the introduction of this _type_ of feature is a good thing, 
if it is done carefully.

To elaborate, the use of templates to implement compile time regex just seems 
like an error prone mess (a fantastic, made by a genius mess, but still a mess). 
While templates can be very powerful and get a lot of stuff done, I think that 
the language should include support for compile time programming that is not 
just a side effect of other features. As an example of what I would like to see 
more of, I posted a while ago proposing a witheach statement.

http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/32232

There are a few other tasks that I think should be easily done at compile time: 
construction of a balanced binary search tree, for instance.

Feb 16 2006

Georg Wrede <georg.wrede nospam.org> writes:

BCS wrote:
 To elaborate, the use of templates to implement compile time regex just 
 seems like an error prone mess (a fantastic, made by a genius mess, but 
 still a mess). 

Probably nobody thinks that compile time regexes should be implemented 
with template metaprogramming.

While D template programming screams compared to C++, shoving work on 
the template system does slow down compilation unnecessarily, compared 
to doing things 'the proper way'. This would erode the absolutely 
coolest feature DMD has: a blazing compilation speed.

 While templates can be very powerful and get a lot of 
 stuff done, I think that the language should include support for compile 
 time programming that is not just a side effect of other features.

At the time, I think they just served to demonstrate a few things:

  - that you (or actually, Don) really can do most amazing things with 
templates

  - that showing this would motivate Walter to add effort and priority 
to implementing them properly (i.e. non-template)

  - serve as a vehicle to demonstrate the utility (of both template 
programming in itself, and the utility of compile-time regexes)

Feb 17 2006

James Dunne <james.jdunne gmail.com> writes:

Georg Wrede wrote:
 BCS wrote:
 
 To elaborate, the use of templates to implement compile time regex 
 just seems like an error prone mess (a fantastic, made by a genius 
 mess, but still a mess). 

 
 
 Probably nobody thinks that compile time regexes should be implemented 
 with template metaprogramming.
 
 While D template programming screams compared to C++, shoving work on 
 the template system does slow down compilation unnecessarily, compared 
 to doing things 'the proper way'. This would erode the absolutely 
 coolest feature DMD has: a blazing compilation speed.
 
 While templates can be very powerful and get a lot of stuff done, I 
 think that the language should include support for compile time 
 programming that is not just a side effect of other features.

 
 
 At the time, I think they just served to demonstrate a few things:
 
  - that you (or actually, Don) really can do most amazing things with 
 templates
 
  - that showing this would motivate Walter to add effort and priority to 
 implementing them properly (i.e. non-template)
 
  - serve as a vehicle to demonstrate the utility (of both template 
 programming in itself, and the utility of compile-time regexes)

DMD has the speed; that's great and all, but we simply can't assume all 
implementations of the D language will be equivalent in performance. 
(Someone is going to write one in Java, I just know it).  IMO, basing 
language decisions off reference implementations is a Bad Thing.

-- 
Regards,
James Dunne

Feb 18 2006

clayasaurus <clayasaurus gmail.com> writes:

James Dunne wrote:
 Georg Wrede wrote:
 BCS wrote:

 To elaborate, the use of templates to implement compile time regex 
 just seems like an error prone mess (a fantastic, made by a genius 
 mess, but still a mess). 


 Probably nobody thinks that compile time regexes should be implemented 
 with template metaprogramming.

 While D template programming screams compared to C++, shoving work on 
 the template system does slow down compilation unnecessarily, compared 
 to doing things 'the proper way'. This would erode the absolutely 
 coolest feature DMD has: a blazing compilation speed.

 While templates can be very powerful and get a lot of stuff done, I 
 think that the language should include support for compile time 
 programming that is not just a side effect of other features.


 At the time, I think they just served to demonstrate a few things:

  - that you (or actually, Don) really can do most amazing things with 
 templates

  - that showing this would motivate Walter to add effort and priority 
 to implementing them properly (i.e. non-template)

  - serve as a vehicle to demonstrate the utility (of both template 
 programming in itself, and the utility of compile-time regexes)

 
 DMD has the speed; that's great and all, but we simply can't assume all 
 implementations of the D language will be equivalent in performance. 
 (Someone is going to write one in Java, I just know it).  IMO, basing 
 language decisions off reference implementations is a Bad Thing.
 

No, but we can assume implementations of D will be faster than C++ since 
Walter's DMD is twice as fast as DMC for building Empire, even though 
they share the same optimizer, code gen, and linker. The D frontend, 
which is open source, gives D its speed.

For me, the fast compile times compared to C++ are a big feature of D.

Feb 18 2006

Tom <Tom_member pathlink.com> writes:

In article <dt088d$1svm$1 digitaldaemon.com>, Walter Bright says...
Added match expressions.

http://www.digitalmars.com/d/changelog.html

A question: I wonder, do you fix the regressions that arise in each of these
releases? (I really ask myself, 'cos I don't see those fixes in the changelog, or
maybe I'm wrong)

Thanks in advance,

P.S.: Another little question (I know, it's a second one :-D), sorry about my
ignorance of common emoticons and stuff, what does <g> mean?

Tom;

Feb 15 2006

James Dunne <james.jdunne gmail.com> writes:

Tom wrote:
 In article <dt088d$1svm$1 digitaldaemon.com>, Walter Bright says...
 
Added match expressions.

http://www.digitalmars.com/d/changelog.html

 
 
 A question: I wonder, do you fix the regressions that arise in each of these
 releases? (I really ask myself, 'cos I don't see those fixes in the changelog, or
 maybe I'm wrong)
 
 Thanks in advance,
 
 P.S.: Another little question (I know, it's a second one :-D), sorry about my
 ignorance of common emoticons and stuff, what does <g> mean?
 
 Tom;

<grin>

-- 
Regards,
James Dunne

Feb 15 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Tom" <Tom_member pathlink.com> wrote in message 
news:dt0m6n$29jv$1 digitaldaemon.com...
 A question: I wonder, do you fix the regressions that arise in each of these
 releases? (I really ask myself, 'cos I don't see those fixes in the changelog, or
 maybe I'm wrong)

I try to do the most important ones first.

 P.S.: Another little question (I know, it's a second one :-D), sorry about my
 ignorance of common emoticons and stuff, what does <g> mean?

grin

Feb 15 2006

huangliang <huangliang_member pathlink.com> writes:

MatchExpression is a robust feature, but too robust.
We do not need another text-oriented language; Perl already fills that place.

D is complex enough, please don't give it more syntax.
I suggest freezing features and improving the existing ones.

How about 'implicit template instantiation', 'function and delegate', etc.?

Feb 16 2006

Georg Wrede <georg.wrede nospam.org> writes:

Walter Bright wrote:
 Added match expressions.
 
 http://www.digitalmars.com/d/changelog.html

Cool! You really have to be working like 25 hours a day at this!

What does

    "When a MatchExpression is the operand of an IfStatement
    or WhileStatement, special handling happens."

in the doc mean?


And another question: I assume all literal regexes will some day be 
compiled at compile time, right? Are we there yet?

Feb 16 2006

"Charles" <noone nowhere.com> writes:

 What does

     "When a MatchExpression is the operand of an IfStatement
     or WhileStatement, special handling happens."

.... Trouble.


Just curious, was this 'built in regex' on anyone's wish list besides
Matthew's ?

Charlie


"Georg Wrede" <georg.wrede nospam.org> wrote in message
news:43F48BBB.1050001 nospam.org...
 Walter Bright wrote:
 Added match expressions.

 http://www.digitalmars.com/d/changelog.html

 Cool! You really have to be working like 25 hours a day at this!

 What does

     "When a MatchExpression is the operand of an IfStatement
     or WhileStatement, special handling happens."

 in the doc mean?


 And another question: I assume all literal regexes will some day be
 compiled at compile time, right? Are we there yet?

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Georg Wrede" <georg.wrede nospam.org> wrote in message 
news:43F48BBB.1050001 nospam.org...
 What does

    "When a MatchExpression is the operand of an IfStatement
    or WhileStatement, special handling happens."

 in the doc mean?

It's explained in the IfStatement and WhileStatement sections.


 And another question: I assume all literal regexes will some day be 
 compiled at compile time, right?

Yes.

 Are we there yet?

Not even close :-(

Feb 16 2006

Stewart Gordon <smjg_1998 yahoo.com> writes:

Walter Bright wrote:
 Added match expressions.
 
 http://www.digitalmars.com/d/changelog.html

So this

     int[] x, y;
     ...
     x=y~~42;

won't work anymore....

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- C++  a->--- UB  P+ L E  W++  N+++ o K-  w++  O? M V? PS- 
PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on 
the 'group where everyone may benefit.

Feb 16 2006

James Dunne <james.jdunne gmail.com> writes:

Stewart Gordon wrote:
 Walter Bright wrote:
 
 Added match expressions.

 http://www.digitalmars.com/d/changelog.html

 
 
 So this
 
     int[] x, y;
     ...
     x=y~~42;
 
 won't work anymore....
 
 Stewart.
 

If you're not using whitespace to delineate your tokens in the first 
place, you should expect things like this.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O 
M--  V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e 
h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Stewart Gordon" <smjg_1998 yahoo.com> wrote in message 
news:dt2476$ii2$1 digitaldaemon.com...
 Walter Bright wrote:
 Added match expressions.

 http://www.digitalmars.com/d/changelog.html

 So this

     int[] x, y;
     ...
     x=y~~42;

 won't work anymore....

That's right. Neither will:

    x = !~y;

It's in the same vein that:

    x = y/*p;

never worked, either.
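
All three cases come from the lexer taking the longest token it can at each position. A rough sketch of that rule in present-day D, using a hypothetical operator table and helper name (this is not DMD's actual lexer, just the longest-match idea):

```d
// Hypothetical operator table: only the operators needed for the
// examples in this thread, including the ~~ and !~ added in 0.147.
immutable string[] operators = ["~~", "!~", "/*", "~", "!", "/", "="];

// Returns the longest operator that src starts with, or "" if none.
string longestOperator(string src)
{
    string best = "";
    foreach (op; operators)
    {
        if (src.length >= op.length
            && src[0 .. op.length] == op
            && op.length > best.length)
        {
            best = op;
        }
    }
    return best;
}

unittest
{
    assert(longestOperator("~~42") == "~~");  // y~~42: match, not y ~ (~42)
    assert(longestOperator("~ ~42") == "~");  // whitespace splits the tokens
    assert(longestOperator("!~y") == "!~");   // not-match, not !(~y)
    assert(longestOperator("/*p") == "/*");   // comment start, not y / (*p)
    assert(longestOperator("abc") == "");
}
```

So x = y ~ ~42; with a space between the two tildes would still lex as concatenation followed by a complement, which is the whitespace point made elsewhere in the thread.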

Feb 16 2006

Stewart Gordon <smjg_1998 yahoo.com> writes:

Walter Bright wrote:
 "Stewart Gordon" <smjg_1998 yahoo.com> wrote in message 
 news:dt2476$ii2$1 digitaldaemon.com...
 Walter Bright wrote:
 Added match expressions.

 http://www.digitalmars.com/d/changelog.html

 So this

     int[] x, y;
     ...
     x=y~~42;

 won't work anymore....

 
 That's right. Neither will:
 
     x = !~y;
 
 It's in the same vein that:
 
     x = y/*p;
 
 never worked, either. 

At least neither of those two is syntactically valid now.

Why are MatchExpression and NotMatchExpression separate nonterminals? 
Why not simply

MatchExpression:
	EqualExpression ~~ RelExpression
	EqualExpression !~ RelExpression

or even

EqualExpression:
	RelExpression
	EqualExpression == RelExpression
	EqualExpression != RelExpression
	EqualExpression is RelExpression
	EqualExpression !is RelExpression
	EqualExpression ~~ RelExpression
	EqualExpression !~ RelExpression

?

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- C++  a->--- UB  P+ L E  W++  N+++ o K-  w++  O? M V? PS- 
PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on 
the 'group where everyone may benefit.

Feb 16 2006

Sean Kelly <sean f4.ca> writes:

Stewart Gordon wrote:
 Walter Bright wrote:
 "Stewart Gordon" <smjg_1998 yahoo.com> wrote in message 
 news:dt2476$ii2$1 digitaldaemon.com...
 Walter Bright wrote:
 Added match expressions.

 http://www.digitalmars.com/d/changelog.html

 So this

     int[] x, y;
     ...
     x=y~~42;

 won't work anymore....

 That's right. Neither will:

     x = !~y;

 It's in the same vein that:

     x = y/*p;

 never worked, either. 

 
 At least neither of those two is syntactically valid now.
 
 Why are MatchExpression and NotMatchExpression separate nonterminals? 
 Why not simply
 
 MatchExpression:
     EqualExpression ~~ RelExpression
     EqualExpression !~ RelExpression

I think because MatchExpression injects a _Match* object into the 
following scope, while NotMatchExpression does not.


Sean

Feb 16 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Stewart Gordon" <smjg_1998 yahoo.com> wrote in message 
news:dt2j3t$13tb$1 digitaldaemon.com...
 Why are MatchExpression and NotMatchExpression separate nonterminals?

Because IfStatement handles them differently.

Feb 16 2006

Wang Zhen <nehzgnaw gmail.com> writes:

Although syntactically correct, MatchExpression in StaticIfCondition or 
StaticAssert does not compile. For example:

void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}

Is this intended or an unimplemented feature?


Walter Bright wrote:
 Added match expressions.
 
 http://www.digitalmars.com/d/changelog.html

Feb 17 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Wang Zhen" <nehzgnaw gmail.com> wrote in message 
news:dt49iv$2hm5$1 digitaldaemon.com...
 Although syntactically correct, MatchExpression in StaticIfCondition or 
 StaticAssert does not compile. For example:

 void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}

 Is this intended or an unimplemented feature?

The problem is that getting it to work requires the compiler itself to 
understand regular expressions. Currently, it does not.

Feb 17 2006

"Craig Black" <cblack ara.com> writes:

 The problem is that getting it to work requires the compiler itself to 
 understand regular expressions. Currently, it does not.

You could also perhaps use compile-time templates to evaluate static if 
regexes.  However, it would be another compiler dependency on a library.

-Craig

Feb 17 2006

Georg Wrede <georg.wrede nospam.org> writes:

Walter Bright wrote:
 "Wang Zhen" <nehzgnaw gmail.com> wrote in message 
 news:dt49iv$2hm5$1 digitaldaemon.com...
 
 Although syntactically correct, MatchExpression in
 StaticIfCondition or StaticAssert does not compile. For example:
 
 void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}
 
 Is this intended or an unimplemented feature?

 
 The problem is that getting it to work requires the compiler itself
 to understand regular expressions. Currently, it does not.

Intriguing. I'd sure love to hear more about this.

I take it understanding regular expressions is much more than just 
compiling them? (Like what the runtime does, or Perl, etc.?)

Feb 17 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Georg Wrede" <georg.wrede nospam.org> wrote in message 
news:43F658D2.2000608 nospam.org...
 Walter Bright wrote:
 "Wang Zhen" <nehzgnaw gmail.com> wrote in message 
 news:dt49iv$2hm5$1 digitaldaemon.com...

 Although syntactically correct, MatchExpression in
 StaticIfCondition or StaticAssert does not compile. For example:

 void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}

 Is this intended or an unimplemented feature?

 The problem is that getting it to work requires the compiler itself
 to understand regular expressions. Currently, it does not.

 Intriguing. I'd sure love to hear more about this.

If the compiler is to constant fold regular expressions, then it needs to 
build in to the compiler exactly what would happen if the regex code was 
evaluated at runtime.

 I take it understanding regular expressions is much more than just 
 compiling them? (Like what the runtime does, or Perl, etc.?)

I think the confusion here is the difference between compiling a string 
literal, and compiling the regular expression within the string literal. DMD 
currently does the former, the latter is done at runtime by std.regexp.

Feb 17 2006

Georg Wrede <georg.wrede nospam.org> writes:

Walter Bright wrote:
 "Georg Wrede" <georg.wrede nospam.org> wrote in message 
 news:43F658D2.2000608 nospam.org...
 
 Walter Bright wrote:
 
 "Wang Zhen" <nehzgnaw gmail.com> wrote in message 
 news:dt49iv$2hm5$1 digitaldaemon.com...
 
 
 Although syntactically correct, MatchExpression in 
 StaticIfCondition or StaticAssert does not compile. For example:
 
 void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}
 
 Is this intended or an unimplemented feature?

 
 The problem is that getting it to work requires the compiler
 itself to understand regular expressions. Currently, it does not.
 

 
 Intriguing. I'd sure love to hear more about this.

 
 
 If the compiler is to constant fold regular expressions, then it
 needs to build in to the compiler exactly what would happen if the
 regex code was evaluated at runtime.

Yes. IMHO in essence, the binary machine code, which the runtime also 
would build. What I have a hard time seeing is, how this differs from 
building a normal function at compile time?

And eventually storing both in the executable image.

(I'd give you more intelligent questions, but I'm too baffled.)

Feb 21 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Georg Wrede" <georg.wrede nospam.org> wrote in message 
news:43FB25FC.8090806 nospam.org...
 Walter Bright wrote:
 If the compiler is to constant fold regular expressions, then it
 needs to build in to the compiler exactly what would happen if the
 regex code was evaluated at runtime.

 Yes. IMHO in essence, the binary machine code, which the runtime also 
 would build. What I have a hard time seeing is, how this differs from 
 building a normal function at compile time?

Consider the strlen() function. Compiling a strlen() function and generating 
machine code for it is a very different thing from the compiler knowing what 
strlen is and replacing:

    strlen("abc")

with:

    3
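
As a rough illustration of the two cases, here is a sketch in modern D (D2) syntax, which is an assumption and not the compiler of this release. The length of a literal is something the compiler itself knows, while the C library strlen is just an external function to the front end.

```d
// Sketch only: modern D (D2) syntax, not the 2006 compiler discussed here.
import core.stdc.string : strlen;

void main()
{
    // The length of a string literal is a compile-time constant, so the
    // compiler can fold it without knowing anything about strlen.
    enum folded = "abc".length;
    static assert(folded == 3);

    // The C library call is opaque to the compiler front end: it is
    // resolved at link time and evaluated when the program runs.
    auto atRuntime = strlen("abc");
    assert(atRuntime == 3);
}
```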

Feb 21 2006

"Lionello Lunesu" <lio remove.lunesu.com> writes:

Interesting indeed.

Is there no way to "fold constants" in this kind of code too? If you know 
the inputs to a function are all constant, can't you simply replace the 
inputs + function call with the function's output?

Would be really cool if this kind of general constant folding could take 
place. The compiler would need to keep track of all constant variables, and 
flagging outputs of operations with constants as constants too. In your 
example, since the input to the strlen function is a constant, the compiler 
could just call the strlen-code itself and replace the actual call with that 
call's output.

I have no experience what-so-ever with compiler writing, so I'm probably 
overlooking MANY things :-)

Lio.


Feb 22 2006

Oskar Linde <oskar.lindeREM OVEgmail.com> writes:

Lionello Lunesu skrev:
 Interesting indeed.
 
 Is there no way to "fold constants" in this kind of code too? If you know 
 the inputs to a function are all constant, can't you simply replace the 
 inputs + function call with the function's output?

Disclaimer: I don't know much about this. Most is pure speculation.

I guess it is theoretically possible, but the compiler has to know that 
the function is pure. That is:

a) The function can not have any side effects.
b) The result has to be deterministic and only depend on the arguments.

This means that the function can not call any function not fulfilling a 
and b, and that it can not rely on things like floating point rounding 
state etc.

In the general case, the compiler has no way of knowing this. The 
function may be externally defined, and only resolved at link time. For 
stdlib-functions the compiler could of course be given this knowledge 
beforehand (like strlen).

For functions fully known to the compiler, inlining followed by constant 
folding could theoretically have the same effect, but I don't think any 
compilers are smart enough to identify pure blocks of code in a general 
fashion and evaluate them at compile time. Somewhat easier 
would be to identify pure functions and evaluate them at compile time. I 
guess this is going much further than current constant folding. The 
problems I see are:

a) Hard for the compiler to tell if a function is pure. In many cases it 
is not even possible (The halting problem has an example of such an 
undecidable function).
b) The compiler needs a way to evaluate the function at compile time.
c) The compiler has no way of knowing the function space and time 
complexity.

It would be interesting if there was a way to flag functions as being 
pure. The compiler could then try to evaluate the function at compile 
time or reduce the number of calls to the function at run time similar 
to what a common sub-expression removal optimization would do.
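
For what it is worth, a flag of this kind could look like the following sketch. It uses the `pure` attribute and compile-time function evaluation of later D versions (D2), neither of which exists in DMD 0.147; `square` is a made-up example function.

```d
// Sketch in modern D (D2); `pure` and compile-time function evaluation
// are assumptions here, they are not part of DMD 0.147.

// `square` is a hypothetical example. `pure` promises that the result
// depends only on the arguments and that no global state is touched.
pure int square(int x)
{
    return x * x;
}

void main()
{
    // An enum initializer is a compile-time context, so the compiler
    // has to evaluate the call itself.
    enum atCompileTime = square(7);
    static assert(atCompileTime == 49);

    // The same function remains an ordinary runtime function.
    int n = 6;
    assert(square(n) == 36);
}
```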

/Oskar

Feb 22 2006

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

Oskar Linde wrote:
 It would be interesting if there was a way to flag functions as being
 pure. 

This is what I've always thought declaring a function as "const", as can be
done in C++, should do. Optimisation avenues galore.

Feb 22 2006

"Lionello Lunesu" <lio remove.lunesu.com> writes:

"Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message 
news:dthe8b$1jg1$1 digitaldaemon.com...
 Lionello Lunesu skrev:
 Interesting indeed.

 Is there no way to "fold constants" in this kind of code too? If you know 
 the inputs to a function are all constant, can't you simply replace the 
 inputs + function call with the function's output?

 Disclaimer: I don't know much about this. Most is pure speculation.

 I guess it is theoretically possible, but the compiler has to know that 
 the function is pure. That is:

 a) The function can not have any side effects.

Good point. Completely forgot about that.

 b) The result has to be deterministic and only depend on the arguments.

Yeah, imagine the compiler calling rand(), taking a void (very constant), 
returning 123 or so... and assuming it's constant! :-)

 a) Hard for the compiler to tell if a function is pure. In many cases it 
 is not even possible (The halting problem has an example of such an 
 undecidable function).

Let's see. If the function only uses the inputs, without even dereferencing 
them, then it's pretty clear I suppose. But you're right, it's complex.

 b) The compiler needs a way to evaluate the function at compile time.

That's easy, by just calling it.

 c) The compiler has no way of knowing the function space and time 
 complexity.

How is this different from a) ?

 It would be interesting if there was a way to flag functions as being 
 pure. The compiler could then try to evaluate the function at compile time 
 or reduce the number of calls to the function at run time similar to what 
 a common sub-expression removal optimization would do.

Indeed. Something like C++ "const", but then for real, and not removable by 
a cast. A "pure" function would simply have a number of restrictions, I 
suppose something like: not allowed to reference any data outside the 
function (globals, class members, etc).

Lio.

Feb 22 2006

Oskar Linde <oskar.lindeREM OVEgmail.com> writes:

Lionello Lunesu skrev:
 "Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message 
 news:dthe8b$1jg1$1 digitaldaemon.com...
 a) Hard for the compiler to tell if a function is pure. In many cases it 
 is not even possible (The halting problem has an example of such an 
 undecidable function).

 
 Let's see. If the function only uses the inputs, without even dereferencing 
 them, then it's pretty clear I suppose. But you're right, it's complex.

The function has to halt also. An infinite loop can be impossible for 
the compiler to detect. One would not want the compilation to hang.

 c) The compiler has no way of knowing the function space and time 
 complexity.

 
 How is this different from a) ?

This is similar to a), but since a) is provably undecidable, probably 
not harder. :)

If the function call takes five hours to complete, the compilation would 
take five hours times the number of times the function got called with 
different arguments. Also, if the function uses 2 gb of stack space, the 
compiler might run out of memory...
The compiler would have to execute the function for a certain amount of 
time, and break it if it doesn't return.

/ Oskar

Feb 22 2006

Sean Kelly <sean f4.ca> writes:

Lionello Lunesu wrote:
 Interesting indeed.
 
 Is there no way to "fold constants" in this kind of code too? If you know 
 the inputs to a function are all constant, can't you simply replace the 
 inputs + function call with the function's output?

If the function can be inlined and the operations it contains are also 
subject to constant folding then the optimizer should already do this. 
Otherwise, while it's possible in some cases I don't know of a compiler 
that does this.  I believe this has been talked about on the C++ forums 
as "atomic functions."


Sean

Feb 22 2006

Georg Wrede <georg.wrede nospam.org> writes:

Walter Bright wrote:
 "Georg Wrede" <georg.wrede nospam.org> wrote
 Walter Bright wrote:
 
 If the compiler is to constant fold regular expressions, then it 
 needs to build in to the compiler exactly what would happen if
 the regex code was evaluated at runtime.

 
 Yes. IMHO in essence, the binary machine code, which the runtime
 also would build. What I have a hard time seeing is, how this
 differs from building a normal function at compile time?

 
 Consider the strlen() function. Compiling a strlen() function and
 generating machine code for it is a very different thing from the
 compiler knowing what strlen is and replacing:
 
 strlen("abc")
 
 with:
 
 3

Either I'm getting too old for this business, or you're only giving 
pseudo answers.

(1) If we were to stop the compiler dead in its tracks, and I compiled 
the function "manually" and returned it to the compiler, would we still 
have a problem here?

(2) {-- and this I've so far avoided to bring up, out of courtesy --},
if Don can do it with templates, what's so impossible doing it the 
regular way??

-------------------

Just a cross-check: [I think] we're talking about compiling a single 
regular expression.

My definition: "a compiled regular expression" is any piece of machine 
code that takes *one string* as the argument, and returns (depending on 
which of the 2 kinds it is) either a boolean (as in found or not), or an 
integer denoting position of First Match.

Such a piece of machine code is a function that conforms to one of the 
following signatures:

bool foo(char[]);      // match

int bar(char[]);       // search
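
To make the two signatures concrete, here is a hand-written sketch of what such functions might look like for the fixed pattern "abc". It is written in modern D, the parameter type const(char)[] and the -1 "not found" convention are assumptions, and nothing here is generated from a regexp.

```d
// Hypothetical hand-written equivalents of the regexp "abc", in modern D.

// match: does the text contain the pattern at all?
bool foo(const(char)[] s)
{
    return bar(s) >= 0;
}

// search: index of the first match, or -1 if there is none
// (the -1 convention is an assumption of this sketch).
int bar(const(char)[] s)
{
    enum pattern = "abc";
    for (size_t i = 0; i + pattern.length <= s.length; ++i)
    {
        if (s[i .. i + pattern.length] == pattern)
            return cast(int) i;
    }
    return -1;
}

void main()
{
    assert(foo("xxabcxx"));
    assert(!foo("xxabxcx"));
    assert(bar("xxabcxx") == 2);
    assert(bar("nothing") == -1);
}
```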

Feb 22 2006

Don Clugston <dac nospam.com.au> writes:

Georg Wrede wrote:
 Walter Bright wrote:
 "Georg Wrede" <georg.wrede nospam.org> wrote
 Walter Bright wrote:

 If the compiler is to constant fold regular expressions, then it 
 needs to build in to the compiler exactly what would happen if
 the regex code was evaluated at runtime.

 Yes. IMHO in essence, the binary machine code, which the runtime
 also would build. What I have a hard time seeing is, how this
 differs from building a normal function at compile time?

 Consider the strlen() function. Compiling a strlen() function and
 generating machine code for it is a very different thing from the
 compiler knowing what strlen is and replacing:

 strlen("abc")

 with:

 3

 
 Either I'm getting too old for this business, or you're only giving 
 pseudo answers.
 
 (1) If we were to stop the compiler dead in its tracks, and I compiled 
 the function "manually" and returned it to the compiler, would we still 
 have a problem here?

That would be OK. The issue is that the compiler is a tool for 
converting text to machine code. It has no mechanism for executing the 
machine code.

 (2) {-- and this I've so far avoided to bring up, out of courtesy --},
 if Don can do it with templates, what's so impossible doing it the 
 regular way??

The compiler does have a mechanism for executing the "template language" 
at compile time, which is what my code is using. But, the template 
language (which I'll call Double D (DD) :-) ) is fundamentally different 
to the ordinary D language (eg, it has no variables). Conceivably, a 
compiler could convert a D function into a DD metafunction, provided 
that it doesn't write to any variables except at initialisation, and 
doesn't use any control structures other than "if-else" and "return",
and all of its parameters are compile-time constants. But that would be 
so restricted as to be almost useless.

Of course the compiler itself could have the DD code built into it, but 
DD is a horribly inefficient language, and it would be hideous to 
program from inside the compiler.

What could perhaps be done is to allow functions with all-constant 
parameters to be converted into overloads.

eg we have the DD metafunction

int strlenT!(char [] s)

Then, if we could define some kind of syntax like
const alias int strlen(char [] s) strlenT!(s);

as an overload of strlen, so that if all parameters are compile-time 
constants, then the reference to strlen becomes a template instantiation.

More generally, if the lookup mechanism for functions was changed to be:
If the first n parameters of a function are all compile-time constants, 
C1, C2, ... with the remainder being variables or constants, V1, V2, ...
try to find a matching template.
eg, given
   func(C1, C2, C3, V1, V2, C4)
the following functions are looked for, in this order:
func!(C1, C2, C3)(V1, V2, C4);
func!(C1, C2)(C3, V1, V2, C4);
func!(C1)(C2, C3, V1, V2, C4);
func(C1, C2, C3, V1, V2, C4);

Note that as soon as a template is found, the search stops.
eg if there is a
func!(C1, C2, C3) which doesn't have a (V1, V2, C4) member function, 
compilation will fail even if a function func(p1, p2, p3, p4, p5, p6) 
exists.

This is superficially akin to overloading 'const' parameters in C++, but 
unlike C++ "const" would actually mean "constant" and not just "I'm not 
_supposed_ to change it".
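
A small sketch of the two forms such a lookup would choose between, written in modern D. The names `scale` and `scaleT` are invented for the example, and the automatic selection itself is only a proposal, so here the template is instantiated by hand.

```d
// Hypothetical names throughout; modern D syntax.

// Ordinary function: both arguments are runtime values.
int scale(int factor, int value)
{
    return factor * value;
}

// Template form: `factor` is a compile-time constant, `value` is not.
int scaleT(int factor)(int value)
{
    static if (factor == 0)
        return 0;               // decided while compiling this instance
    else
        return factor * value;
}

void main()
{
    int v = 21;
    assert(scale(2, v) == 42);      // what the programmer writes today
    assert(scaleT!(2)(v) == 42);    // what the proposed lookup would pick
    assert(scaleT!(0)(v) == 0);
}
```

With the proposed rule, a call written as scale(2, v) would be routed to the template form because its first argument is a compile-time constant.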

Feb 23 2006

Georg Wrede <georg.wrede nospam.org> writes:

Don Clugston wrote:
 Georg Wrede wrote:
 Walter Bright wrote:
 Georg Wrede wrote:
 Walter Bright wrote:

 If the compiler is to constant fold regular expressions, then it 
 needs to build in to the compiler exactly what would happen if
 the regex code was evaluated at runtime.

 Yes. IMHO in essence, the binary machine code, which the runtime
 also would build. What I have a hard time seeing is, how this
 differs from building a normal function at compile time?

 Consider the strlen() function. Compiling a strlen() function and
 generating machine code for it is a very different thing from the
 compiler knowing what strlen is and replacing:

 strlen("abc")

 with:

 3

 Either I'm getting too old for this business, or you're only giving 
 pseudo answers.

 (1) If we were to stop the compiler dead in its tracks, and I compiled 
 the function "manually" and returned it to the compiler, would we 
 still have a problem here?

 
 
 That would be OK. The issue is that the compiler is a tool for 
 converting text to machine code. It has no mechanism for executing the 
 machine code.

Aaaaaah... heureka.

So there's a wavelength problem here!

What I've been talking about all along is a regexp compiled into a function, 
but _not_run_ at compile time.

** So, Don's regexps can be both "compiled" and "run" at compile time, 
whereas what I've been wishing all along is a "compile-time compiled but 
not compile-time run" regexp!

In other words, a profoundly normal function, just that it happens to be 
written in RegexpLanguage instead of vanilla D (Or C, or asm).

(Gees, I hope this same wavelength problem wasn't the reason for last 
winter's unsuccessful regexp discussions.) :-(

Feb 23 2006

Don Clugston <dac nospam.com.au> writes:

Georg Wrede wrote:
 Don Clugston wrote:
 Georg Wrede wrote:
 Walter Bright wrote:
 Georg Wrede wrote:
 Walter Bright wrote:

 If the compiler is to constant fold regular expressions, then it 
 needs to build in to the compiler exactly what would happen if
 the regex code was evaluated at runtime.

 Yes. IMHO in essence, the binary machine code, which the runtime
 also would build. What I have a hard time seeing is, how this
 differs from building a normal function at compile time?

 Consider the strlen() function. Compiling a strlen() function and
 generating machine code for it is a very different thing from the
 compiler knowing what strlen is and replacing:

 strlen("abc")

 with:

 3

 Either I'm getting too old for this business, or you're only giving 
 pseudo answers.

 (1) If we were to stop the compiler dead in its tracks, and I 
 compiled the function "manually" and returned it to the compiler, 
 would we still have a problem here?


 That would be OK. The issue is that the compiler is a tool for 
 converting text to machine code. It has no mechanism for executing the 
 machine code.

 
 Aaaaaah... heureka.
 
 So there's a wavelength problem here!
 
 What I've been talking about all along is a regexp compiled into a function, 
 but _not_run_ at compile time.

Oh dear, I think I've just confused you. I was only referring to strlen, 
not to regexps. I was trying to explain Walter's statement about why 
it's difficult for a compiler writer.

 ** So, Don's regexps can be both "compiled" and "run" at compile time, 
 whereas what I've been wishing all along is a "compile-time compiled but 
 not compile-time run" regexp!

No, you were right the first time. At compile time, the regexp pattern 
string is compiled into an ordinary function.

Example: the trivial case

bool b = test!("abc")(str);

compiles to something like:

bool test_a(char [] str)
{
   return str.length>=3 && str[0..3]=="abc";
}

bool b = test_a(str);

It doesn't actually call the test_a function at compile time.

It's only something like strlen!("abc"), where all of the parameters are 
known at compile time, which is "run" at compile time. In the regexp case, 
it's the "make a regexp engine" code which is run at compile time. The 
engine itself is only run at runtime.
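
A compilable sketch of the trivial case, in modern D. This `test` handles literal patterns only and is an illustration of the idea, not the actual template regexp code.

```d
// Sketch in modern D; `test` here handles only literal patterns and is
// an illustration, not the real compile-time regexp implementation.
bool test(string pattern)(const(char)[] str)
{
    // `pattern` is a template argument, so its length and contents are
    // compile-time constants baked into this instantiation.
    return str.length >= pattern.length
        && str[0 .. pattern.length] == pattern;
}

void main()
{
    const(char)[] input = "abcdef";   // stands in for runtime data
    bool b = test!("abc")(input);     // instantiated at compile time
    assert(b);                        // the comparison runs at run time
    assert(!test!("abc")("ab"));
    assert(!test!("abc")("xbcdef"));
}
```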

 In other words, a profoundly normal function, just that it happens to be 
 written in RegexpLanguage instead of vanilla D (Or C, or asm).

Exactly.

Feb 23 2006

Georg Wrede <georg.wrede nospam.org> writes:

(I put stuff in D.dtl.)

georg

Feb 23 2006

Stewart Gordon <smjg_1998 yahoo.com> writes:

Georg Wrede wrote:
 Walter Bright wrote:
 "Wang Zhen" <nehzgnaw gmail.com> wrote in message 
 news:dt49iv$2hm5$1 digitaldaemon.com...

 Although syntactically correct, MatchExpression in
 StaticIfCondition or StaticAssert do not compile. For example:

 void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}

 Is this intended or an unimplemented feature?

 The problem is that getting it to work requires the compiler itself
 to understand regular expressions. Currently, it does not.

 
 Intriguing. I'd sure love to hear more about this.
 
 I take it understanding regular expressions is much more than just 
 compiling them? (Like what the runtime does, or Perl, etc.?)

A problem is that there are a number of dialects of regexp.  The spec 
doesn't seem to indicate which dialect is being used.

Among the differences between them is whether subexpressions are 
parenthesised by \(...\) or simply (...).  Another issue is whether we 
expect implementations to support the Unicode extensions to regexps 
described here

http://www.textpad.info/forum/viewtopic.php?t=4778

No doubt there are other differences....

Whichever we choose, the behaviour of using std.regexp directly, ~~ 
evaluated at runtime and ~~ evaluated at compile time must be consistent. 
But that isn't hard - the compiler would just call the same code that 
std.regexp uses.

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- C++  a->--- UB  P+ L E  W++  N+++ o K-  w++  O? M V? PS- 
PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on 
the 'group where everyone may benefit.

Feb 21 2006
