digitalmars.D - $`, $', $&, $n - sugar or cyclamates?
- Walter Bright (19/19) Feb 15 2006 D dramatically improves the convenience of string handling over C++. But...
- Hasan Aljudy (2/28) Feb 15 2006 I don't have much to do with regexes .. but please .. the $ sign is ugly!...
- Ameer Armaly (2/25) Feb 15 2006
- Trevor Parscal (4/23) Feb 15 2006 Leave the $ sign for scripting languages...
- Derek Parnell (23/46) Feb 15 2006 Thanks for this Walter. Although it adds no new functionality to
- Sean Kelly (5/28) Feb 15 2006 And this was my concern too. But perhaps this is a bridge best left
- Walter Bright (5/16) Feb 15 2006 You're right in that all it really does is offer an easier way to get at...
- Tom (8/27) Feb 15 2006 On the contrary I think "$" is a very valuable symbol and should be used...
- John Demme (4/15) Feb 15 2006 Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and su...
- Walter Bright (6/11) Feb 15 2006 <g>. I considered setting this up as a vote:
- pragma (28/39) Feb 15 2006 Well, assuming that your mind is made up on this way or no way, I'd have...
- Walter Bright (4/17) Feb 15 2006 Yes, opMatch. Already done!
- Ivan Senji (23/33) Feb 16 2006 Walter!! You are really crazy! (In a really really good way)
- Walter Bright (3/4) Feb 16 2006 It's supposed to work <g>.
- Dave (12/23) Feb 15 2006 I think both apply and are not mutually exclusive <g>
- Unknown W. Brackets (16/33) Feb 15 2006 I personally don't see why it has to be 1 or 2. I think compromise is a...
- jicman (2/13) Feb 16 2006
- jicman (2/17) Feb 16 2006 I agree. Perl is perl, D is D.
- S. Chancellor (8/33) Feb 15 2006 With this you've essentially bound syntax to the RegExp class, or are
- Derek Parnell (15/17) Feb 15 2006 I use regular expression matching a lot in the type of programming I do,
- Walter Bright (4/10) Feb 15 2006 All you need to use it with your own custom type is provide an opMatch()...
- Kris (39/58) Feb 15 2006 There seem to be multiple issues here. The first one, which you ask abou...
- Walter Bright (22/63) Feb 15 2006 Nothing, really. But are they more readable than _match.pre, etc.?
- Oskar Linde (30/42) Feb 16 2006 Have you considered making this more general? I.e. for all if statements...
- Walter Bright (3/32) Feb 16 2006 I never thought of that. It's an intriguing idea.
- pragma (3/45) Feb 16 2006 Something along these lines would *most certainly* get my vote!
- kris (2/55) Feb 16 2006 Yes ~ mine too
- Sean Kelly (3/28) Feb 16 2006 Mine too.
- Sean Kelly (30/58) Feb 16 2006 Hold on. Walter, can you explain this injection business a bit? For
- Oskar Linde (11/42) Feb 16 2006 Those are AndAndExpression and OrOrExpression and will not inject anythi...
- Sean Kelly (11/56) Feb 16 2006 Very weird. So a MatchExpression by itself has a boolean result but
- Oskar Linde (6/19) Feb 16 2006 No, not boolean. A MatchExpression has a _Match* result. This result is ...
- Sean Kelly (4/15) Feb 16 2006 Oh right. And pointers can be implicitly evaluated as logical
- Julio César Carrascal Urquijo (2/5) Feb 16 2006 This is a great idea. I like it.
- Walter Bright (16/21) Feb 16 2006 There is one problem with it: every time an IfStatement is added to exis...
- Oskar Linde (45/70) Feb 17 2006 Coding guidelines would probably say that $ should be assigned to a
- pragma (8/20) Feb 17 2006 Sort of like an auto 'auto' declaration? I gather that the point is tha...
- Oskar Linde (9/29) Feb 17 2006 Yes, exactly so. The scope of such variables declared in the operand of
- Walter Bright (9/24) Feb 17 2006 Yes, that's a problem.
- Fredrik Olsson (7/45) Feb 17 2006 Yes! This one I like.
- Sean Kelly (4/36) Feb 17 2006 I like it. Assuming this were implemented, would it affect all
- Sai (1/7) Feb 17 2006 I personally like the former, it does not need special 'if' syntax.
- Ivan Senji (8/45) Feb 17 2006 How would this scale to something like
- Georg Wrede (6/38) Feb 17 2006 I'm uneasy with this. We're playing with fundamental constructs here.
- Kris (6/18) Feb 17 2006 I'm all for getting some kind of regex sugar in the grammar, but also fe...
- Sean Kelly (6/11) Feb 17 2006 As long as these new features don't break old code, I'm fine with Walter...
- Kris (2/12) Feb 17 2006 That would be cool.
- Sean Kelly (16/25) Feb 17 2006 True enough. However, the above syntax is currently illegal, so there's...
- Deewiant (5/12) Feb 17 2006 That would sort of make the whole token pointless IMO - easier just to d...
- kris (23/106) Feb 16 2006 Well, there's always "in" ...
- Walter Bright (32/85) Feb 16 2006 Fair enough. Let's see what others think.
- kris (54/92) Feb 16 2006 That doesn't mean D should adopt arbitrary symbols, Walter. If you want
- Sean Kelly (19/58) Feb 16 2006 I'm branching Ares before I check in this last block of changes. In the...
- Thomas Kuehne (14/23) Feb 16 2006 -----BEGIN PGP SIGNED MESSAGE-----
- Sean Kelly (8/30) Feb 16 2006 The same as the problems with std::vector<bool> in C++ (though I don't
- Kris (2/5) Feb 16 2006 Easy fix ~ change the bool alias to byte, instead of bit :-)
- Sean Kelly (7/13) Feb 16 2006 I already use byte in some cases :-) But it lacks the boolean value
- Kris (4/15) Feb 16 2006 Yes, you're right of course. Would be just great if Walter would add a t...
- Regan Heath (34/53) Feb 16 2006 A true bool would make several people happy.. but once one existed peopl...
- Sean Kelly (10/49) Feb 16 2006 This is only a slippery slope if we want it to be ;-) I think the
- Walter Bright (4/7) Feb 16 2006 I regularly do bit masking and shifting on ints. I'm so used to it, I do...
- Derek Parnell (10/18) Feb 16 2006 YOU ARE DEAD WRONG! Sheesh!!! Not all us are blessed with your abilities...
- Walter Bright (9/24) Feb 16 2006 What about using some functions instead:
- Derek Parnell (9/35) Feb 16 2006 You mean like std.regexp library functions? Oh that's right ... we have ...
- Kris (2/9) Feb 16 2006 Besides, its easy to use op-overloads for such things as necessary.
- Derek Parnell (15/44) Feb 16 2006 I regard the syntax
- Bruno Medeiros (11/69) Feb 17 2006 An interesting idea, but maybe, to avoid syntax conflicts, we
- Regan Heath (4/15) Feb 17 2006 I like it.
- Walter Bright (51/105) Feb 16 2006 Those had to go because === was indistinguishable from == in many fonts.
- Sean Kelly (24/37) Feb 16 2006 This is really more of a library issue than a compiler issue. My
- Walter Bright (20/43) Feb 16 2006 I was concerned that code that did not use MatchExpressions might
- Sean Kelly (6/13) Feb 16 2006 Perhaps I'm being idealistic, as I simply don't believe the runtime
- Walter Bright (10/22) Feb 16 2006 Consider that there's no way to implement C, D, etc., without some runti...
- Sean Kelly (9/31) Feb 17 2006 Just to be clear, by "standard library code" I actually meant D code
- Walter Bright (4/12) Feb 17 2006 I think it's implied by it being part of the language spec. Regardless, ...
- Kris (31/71) Feb 16 2006 Can't say that I agree, but my opinion matters rather little anyway
- Walter Bright (3/14) Feb 16 2006 How do you interpret the fact that it has failed to gain traction among ...
- Kris (21/37) Feb 16 2006 I noted a few reasons previously, regarding differing approaches and
- Walter Bright (14/33) Feb 16 2006 This might be a circular result - people don't use regex in C because
- Sean Kelly (16/47) Feb 16 2006 For what it's worth, the latest release of Ares trims a lot of fat out
- Walter Bright (4/9) Feb 16 2006 Very little actually changed, what I did was resort the order so it was ...
- Georg Wrede (15/22) Feb 16 2006 Hmm.
- Walter Bright (6/14) Feb 16 2006 There are a lot of cool things you can do in script languages because th...
- Georg Wrede (3/21) Feb 17 2006 Neither do I.
- Walter Bright (3/11) Feb 17 2006 My answer is because they're inconvenient to use in C/C++.
- James Dunne (31/58) Feb 20 2006 My answer is that regular expressions simply aren't powerful enough for
- Georg Wrede (6/10) Feb 16 2006 Would it be correct to assume that if we had compile-time regexps, then
- Georg Wrede (40/78) Feb 16 2006 There are 2 things reducing its usage.
- Walter Bright (3/6) Feb 16 2006 I think the $` is pretty much dead now <g>.
- Sean Kelly (4/11) Feb 16 2006 I'm half inclined to suggest -> for ~~, though there doesn't seem to be
- Walter Bright (7/18) Feb 16 2006 Two cons:
- James Dunne (6/32) Feb 15 2006 I'd rather make my code easier to read than write. I don't use regexps
- Roberto Mariottini (17/37) Feb 15 2006 No.
- Oskar Linde (19/28) Feb 16 2006 ` is not readily available on all keyboards. Some fonts also have proble...
- Walter Bright (7/22) Feb 16 2006 $1, $2, $3, ...
- Oskar Linde (40/51) Feb 17 2006 So why does _match have to be a pointer? Would something like this not w...
- Walter Bright (7/21) Feb 17 2006 I wanted it to work with both pointers to structs and to class reference...
- bobef (10/36) Feb 16 2006 It is nice feature but I don't think such thing should be part of the
- Charles (5/24) Feb 16 2006 Sweet jesus ... the horror.
- David Medlock (16/42) Feb 16 2006 I havent read this whole thread, but pardon if this has been suggested.
- Walter Bright (4/6) Feb 16 2006 Why, indeed. Oskar has brought it up, and he and you are right. I'm goin...
- Hasan Aljudy (6/17) Feb 16 2006 I agree with the "foreach" point/suggestion ..
D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have:

    if (regular_expression ~~ string)
    {
        _match.pre
        _match.post
        _match.match(n)
    }

Should we do some aliases:

    $` => _match.pre
    $' => _match.post
    $& => _match.match(0)
    $n => _match.match(n)

? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?
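For readers who want to try the pre/post/match idea today, here is a minimal sketch. It assumes present-day D and the Phobos module std.regex, which postdates this thread; it does not use the `~~` operator or `_match` discussed here. The helper name `matchParts` is made up for illustration.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical helper: returns [pre, hit, post, group 1], or null when
/// nothing matches. The pattern must contain at least one capture group.
string[] matchParts(string input, string pattern)
{
    // matchFirst yields a Captures value that tests false when there is no match.
    if (auto m = matchFirst(input, regex(pattern)))
    {
        // m.pre  plays the role of $` / _match.pre
        // m.hit  plays the role of $& / _match.match(0)
        // m.post plays the role of $' / _match.post
        // m[1]   plays the role of $1 / _match.match(1)
        return [m.pre, m.hit, m.post, m[1]];
    }
    return null;
}

void main()
{
    writeln(matchParts("order 42 shipped", `(\d+)`));
}
```

The named members read much like the `_match.pre` spelling in the post, which is one data point for the "readable names versus `$` tokens" question raised above.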
Feb 15 2006
Walter Bright wrote:
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have: if (regular_expression ~~ string) { _match.pre _match.post _match.match(n) } Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

I don't have much to do with regexes .. but please .. the $ sign is ugly!!
Feb 15 2006
"Walter Bright" <newshound digitalmars.com> wrote in message news:dt088e$1svm$2 digitaldaemon.com...
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have: if (regular_expression ~~ string) { _match.pre _match.post _match.match(n) } Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

Hmm... I don't do much with regular expressions, but the presence of too much sugar can be counterproductive; I personally think the standard lib is the place for that kind of thing.
Feb 15 2006
In article <dt088e$1svm$2 digitaldaemon.com>, Walter Bright says...
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have: if (regular_expression ~~ string) { _match.pre _match.post _match.match(n) } Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

Leave the $ sign for scripting languages...

Thanks,
Trevor Parscal
Feb 15 2006
On Wed, 15 Feb 2006 13:59:33 -0800, Walter Bright wrote:
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have: if (regular_expression ~~ string) { _match.pre _match.post _match.match(n) }

Thanks for this Walter. Although it adds no new functionality to applications, it does say that D is a serious player in making string handling programs easier to write and maintain. I expect that std.regexp will still stay around and that this new feature is merely a portal into that library.

> Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

My first thought was "ouch! - not pleasant". After some consideration I'm now leaning towards the idea that we should hold off on implementing these shortcuts for now and wait to see if they are actually required or not. And then, if there is a crying need for them, to come up with a set of shortcuts that will be acceptable enough.

Currently the '$' symbol is associated with arrays and lengths, and not as a general purpose lead-in character to symbol values. To mix these two disparate concepts in coders minds might not be fruitful. However, there may be other alternatives yet to be discovered, so the concept ought not to be totally abandoned just yet.

--
Derek (skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 10:32:50 AM
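The existing meaning of `$` that Derek refers to can be sketched as follows. This is a small illustration of array slicing, not part of the proposal; the function name `dropEnds` is made up.

```d
import std.stdio : writeln;

/// Hypothetical helper: drops the first and last element.
/// Inside the brackets, $ stands for the length of the array being indexed.
int[] dropEnds(int[] a)
{
    return a[1 .. $ - 1];
}

void main()
{
    int[] a = [10, 20, 30, 40];
    writeln(dropEnds(a));
    writeln(a[$ - 1]); // last element
}
```

Here `$` only has meaning inside an index or slice expression, which is why giving it a second, regex-related meaning elsewhere raised concern.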
Feb 15 2006
Derek Parnell wrote:
> On Wed, 15 Feb 2006 13:59:33 -0800, Walter Bright wrote:
>> Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?
> My first thought was "ouch! - not pleasant". After some consideration I'm now leaning towards the idea that we should hold off on implementing these shortcuts for now and wait to see if they are actually required or not. And then, if there is a crying need for them, to come up with a set of shortcuts that will be acceptable enough.

Agreed.

> Currently the '$' symbol is associated with arrays and lengths, and not as a general purpose lead-in character to symbol values. To mix these two disparate concepts in coders minds might not be fruitful. However, there may be other alternatives yet to be discovered, so the concept ought not to be totally abandoned just yet.

And this was my concern too. But perhaps this is a bridge best left ignored until there's a reason to jump.

Sean
Feb 15 2006
"Derek Parnell" <derek psych.ward> wrote in message news:15w2x5659i8ey$.p4zbzif24wfw$.dlg 40tude.net...
> Thanks for this Walter. Although it adds no new functionality to applications, it does say that D is a serious player in making string handling programs easier to write and maintain. I expect that std.regexp will still stay around and that this new feature is merely a portal into that library.

You're right in that all it really does is offer an easier way to get at std.regexp.

> My first thought was "ouch! - not pleasant". After some consideration I'm now leaning towards the idea that we should hold off on implementing these shortcuts for now and wait to see if they are actually required or not. And then, if there is a crying need for them, to come up with a set of shortcuts that will be acceptable enough.

That's why I didn't do them yet.
Feb 15 2006
In article <dt088e$1svm$2 digitaldaemon.com>, Walter Bright says...
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have: if (regular_expression ~~ string) { _match.pre _match.post _match.match(n) } Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

On the contrary I think "$" is a very valuable symbol and should be used. Though the symbol "`" is very inconvenient (at least for spanish keyboard layout), ugly and could lead to confusion with "'" symbol - as I've seen many times and which I personally don't like to see used in such a way as "$'" -. Maybe "$[" and "$]", don't know.

Just my opinion,
Tom;
Feb 15 2006
Walter Bright wrote:
> Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and such, but please no random symbols. I like $match.pre and $length, ect... but $& and $` don't mean anything to me!
Feb 15 2006
"John Demme" <me teqdruid.com> wrote in message news:dt0fvp$23bj$1 digitaldaemon.com...
> Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and such, but please no random symbols. I like $match.pre and $length, ect... but $& and $` don't mean anything to me!

<g>. I considered setting this up as a vote:

Vote for 1:
(1) If I wanted to write ugly programs I'd use Perl, not D.
(2) Cool! I can now dump my Perl scripts and use D!
Feb 15 2006
In article <dt0hbb$25iq$2 digitaldaemon.com>, Walter Bright says...
> "John Demme" <me teqdruid.com> wrote in message news:dt0fvp$23bj$1 digitaldaemon.com...
>> Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and such, but please no random symbols. I like $match.pre and $length, ect... but $& and $` don't mean anything to me!
> <g>. I considered setting this up as a vote: Vote for 1: (1) If I wanted to write ugly programs I'd use Perl, not D. (2) Cool! I can now dump my Perl scripts and use D!

Well, assuming that your mind is made up on this way or no way, I'd have to lean toward (2). It's there to be used, but if I object to it personally, I can abstain from using it.

Just some food for thought, as I think there's plenty left to be worked out in this concept. :)

IMHO, using "~~" as a token doesn't look right yet, but that's probably because this would be the first time that token has been used in a programming language (unless I'm mistaken). The only thing I could possibly suggest to use differently would be the at-cost ("@") symbol:

    if("regular expression" @ "operand"){ /*...*/ }

This looks a little more arithmetic to my eye than "~~". :)

The dollar-sign operators look good, but "$n" seems limited to me. Why not open this up to array-indexing so it's more compatible with foreach, arrays and other things D? Also, what about if I want to pass the set of matches as an array?

The '$x' tokens are sure to lex great, but isn't this running the risk of overloading the '$' symbol a bit much (from a visual standpoint)?

    if("$\w*" ~~ "hello world"){
        mystring[0..$&.length] = $&; //eek!
    }

Also, am I to assume that we'll get an "opProcess" operator overload to use on our classes? As long as _match is flexible enough to accept any type, this could really work. To my eye, the compiler could accept a custom class or struct as the _match value (kind of like an internal 'auto') so long as its namespace provides the .pre, .post, .match members.

All-in-all, it would be a rather nice side effect of all this, as things like Spirit have been difficult to implement as D has fewer operator overloads than C++.

- Eric Anderton at yahoo
Feb 15 2006
"pragma" <pragma_member pathlink.com> wrote in message news:dt0mfk$29qc$1 digitaldaemon.com...
> Also, am I to assume that we'll get an "opProcess" operator overload to use on our classes?

Yes, opMatch. Already done!

> As long as _match is flexible enough to accept any type, this could really work. To my eye, the compiler could accept a custom class or struct as the _match value (kind of like an internal 'auto') so long as its namespace provides the .pre, .post, .match members.

Already done!

> All-in-all, it would be a rather nice side effect of all this, as things like Spirit have been difficult to implement as D has fewer operator overloads than C++. - Eric Anderton at yahoo
Feb 15 2006
Walter Bright wrote:
> "pragma" <pragma_member pathlink.com> wrote in message news:dt0mfk$29qc$1 digitaldaemon.com...
>> Also, am I to assume that we'll get an "opProcess" operator overload to use on our classes?
> Yes, opMatch. Already done!

Walter!! You are really crazy! (In a really really good way)

I just tried this for fun and it works:

<code>
import std.stdio;

class ArrayBeginsWith0and1
{
    static bool opMatch(int[] nums)
    {
        if(nums.length < 2)return false;
        if(nums[0] == 0 && nums[1] == 1)
            return true;
        else
            return false;
    }
}

void main()
{
    static int[] somearray1 = [0,1,2];
    static int[] somearray2 = [2,1,2];
    writefln(ArrayBeginsWith0and1 ~~ somearray1); //prints true
    writefln(ArrayBeginsWith0and1 ~~ somearray2); //prints false
}
</code>

I hope this isn't a bug that this works?
Feb 16 2006
"Ivan Senji" <ivan.senji_REMOVE_ _THIS__gmail.com> wrote in message news:dt1u3t$d97$1 digitaldaemon.com...
> I hope this isn't a bug that this works?

It's supposed to work <g>.
Feb 16 2006
In article <dt0hbb$25iq$2 digitaldaemon.com>, Walter Bright says...
> "John Demme" <me teqdruid.com> wrote in message news:dt0fvp$23bj$1 digitaldaemon.com...
>> Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and such, but please no random symbols. I like $match.pre and $length, ect... but $& and $` don't mean anything to me!
> <g>. I considered setting this up as a vote: Vote for 1: (1) If I wanted to write ugly programs I'd use Perl, not D. (2) Cool! I can now dump my Perl scripts and use D!

I think both apply and are not mutually exclusive <g>

For me, the big part of supporting the most common regex operation in the language itself is that quick scripts using it can be kicked out without having to import something or remember the details of the RegExp class. Crazy (or lazy?), but I find that appealing when comparing it to a scripting language. So that's a vote for (2).

I've never been a big fan of most of Perl's syntactical sugar - just too easy to miss something when you're reading it, so that's a vote for (1).

And besides, one will never be able to copy and paste much of anything from Perl into D so there isn't any 'sweet' benefit there either <g>

- Dave
Feb 15 2006
I personally don't see why it has to be 1 or 2. I think compromise is a great thing.

I should note first that I actually like $ in scripting languages, because it tends to make variables stand out (not hide them.)

You seem to be suggesting either using _match.match(0) (ick!) or $&.... why? Why can't it be:

    $pre => _match.pre
    $post => _match.post
    $match => _match.match(0)
    $5 => _match.match(5)

Yes, yes, I realize this looks more like those scripting-language variables, but it's also clearer than Perl's syntax, and almost as easy to type. I would spend more time making sure I'm pressing the right symbol than typing "pre" or some such.

Just my opinion.

-[Unknown]

> "John Demme" <me teqdruid.com> wrote in message news:dt0fvp$23bj$1 digitaldaemon.com...
>> Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and such, but please no random symbols. I like $match.pre and $length, ect... but $& and $` don't mean anything to me!
> <g>. I considered setting this up as a vote: Vote for 1: (1) If I wanted to write ugly programs I'd use Perl, not D. (2) Cool! I can now dump my Perl scripts and use D!
Feb 15 2006
1

Walter Bright says...
> "John Demme" <me teqdruid.com> wrote in message news:dt0fvp$23bj$1 digitaldaemon.com...
>> Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and such, but please no random symbols. I like $match.pre and $length, ect... but $& and $` don't mean anything to me!
> <g>. I considered setting this up as a vote: Vote for 1: (1) If I wanted to write ugly programs I'd use Perl, not D. (2) Cool! I can now dump my Perl scripts and use D!
Feb 16 2006
John Demme says...
> Walter Bright wrote:
>> Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?
> Oh Bob no... Don't turn D into Perl. I like the $ for short cuts and such, but please no random symbols. I like $match.pre and $length, ect... but $& and $` don't mean anything to me!

I agree. Perl is perl, D is D.
Feb 16 2006
On 2006-02-15 13:59:33 -0800, "Walter Bright" <newshound digitalmars.com> said:
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have: if (regular_expression ~~ string) { _match.pre _match.post _match.match(n) } Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

With this you've essentially bound syntax to the RegExp class, or are you not using that for this? I do believe I recall some statements by you in the past against standard libraries being an integral part of the computer language. Though, I'm too lazy to dig them up right now.

My preference is that this match syntax be removed, and the aliases never see the light of day. I use perl for this sort of stuff.

-S.
Feb 15 2006
On Wed, 15 Feb 2006 18:06:45 -0800, S. Chancellor wrote:
> My preference is that this match syntax be removed, and the aliases never see the light of day. I use perl for this sort of stuff.

I use regular expression matching a lot in the type of programming I do, e.g. Build, and I suspect I'd find perl far too slow for the purpose. I haven't used the std.regexp library because it doesn't really support Unicode correctly so I've written simple functions to do some pattern matching for my needs. And as I've just found out, the new pattern matching just uses the standard library and Unicode support is not there, so I still can't use it.

--
Derek (skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 1:38:45 PM
Feb 15 2006
"Derek Parnell" <derek psych.ward> wrote in message news:sedgdqrvihce.1s7xzb5qubodc$.dlg 40tude.net...
> I haven't used the std.regexp library because it doesn't really support Unicode correctly so I've written simple functions to do some pattern matching for my needs. And as I've just found out, the new pattern matching just uses the standard library and Unicode support is not there, so I still can't use it.

All you need to use it with your own custom type is provide an opMatch() overload.
Feb 15 2006
"Walter Bright" <newshound digitalmars.com> wrote...
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have: if (regular_expression ~~ string) { _match.pre _match.post _match.match(n) } Should we do some aliases: $` => _match.pre $' => _match.post $& => _match.match(0) $n => _match.match(n) ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

There seem to be multiple issues here. The first one, which you ask about, is related to the syntax. At first blush, the ~~ looks like an approximate approximation, and then making D look like a malformed Perl is surely a mistake.

What the heck is wrong with $match.pre, $match.post, $match.index(n) instead? At least they're readable :-)

Additionally, I thought '~' was used for concatenation? Because '+' is overloaded in other languages? Isn't that just exactly what you're now doing with '~' ? I mean, what does a "pattern within" operation have to do with concatenation?

Then, you say this is applicable only to char[]. What about wchar[] and dchar[]? Are they now relegated to second-class citizens? It's no use converting those arrays into char[] on the fly ~ apart from the heap activity and conversion that would ensue (for both operands; one of which could be rather substantial), $match.pre and friends would also have to do conversions back into the original format. Ugghh.

Yet another issue is with respect to case-folding (which is often used with regex expressions). You see, unicode case-folding does not follow the trivial rules of ASCII ~ you can't just call tolower() and hope for the best. Thus, there needs to be some mechanism to support alternate, more appropriate, converters.

In retrospect, much of this should probably be handled via template usage (for the different UTF types). And the converter issue can be resolved by supporting some kind of assignable or plug-in module. All of this can be handled by a templated class. I attempted to do just this with your RegExp class, but ran into problems related to how patterns are stored in the "instruction" stream (size differences between char and dchar, for example).

I'm an advocate for potentially getting regex support into the grammar but, on the face of it, your approach just doesn't appear to be considered in a particularly thorough manner. There again, perhaps you've already addressed the above issues, and the resolution is just not currently visible?

Perhaps this whole thing should wait until after we see what can be done with the regex templates, so that there's some experience behind the grammar? I mean, that would surely be better than having to remove the above at some point in the future.

What's the big rush with built-in regex anyway? I really do think it should wait until we have some solid experience with regex templates ~ don't you think it's rather likely we'll learn something really useful that applies directly to a built-in grammar?

- Kris
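Kris's point about case folding can be made concrete with a small sketch. It assumes present-day D and the Phobos module std.uni, which did not exist in this form when the thread was written; the wrapper names `equalFullFold` and `equalSimpleFold` are made up for illustration.

```d
import std.stdio : writeln;
import std.uni : icmp, sicmp;

/// Hypothetical wrapper: comparison under full case folding,
/// which handles one-to-many mappings such as German sharp s to "ss".
bool equalFullFold(string a, string b)
{
    return icmp(a, b) == 0;
}

/// Hypothetical wrapper: comparison under simple (1:1) case folding only.
bool equalSimpleFold(string a, string b)
{
    return sicmp(a, b) == 0;
}

void main()
{
    writeln(equalSimpleFold("HELLO", "hello"));      // plain ASCII case difference
    writeln(equalFullFold("Rußland", "Russland"));   // needs the one-to-many mapping
    writeln(equalSimpleFold("Rußland", "Russland")); // a 1:1 mapping cannot equate these
}
```

A character-by-character tolower() is at best equivalent to the simple variant, which is why a regex engine with an "ignore case" option needs a pluggable or Unicode-aware folding step.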
Feb 15 2006
"Kris" <fu bar.com> wrote in message news:dt0q7n$2cuo$1 digitaldaemon.com...
> There seem to be multiple issues here. The first one, which you ask about, is related to the syntax. At first blush, the ~~ looks like an approximate approximation, and then making D look like a malformed Perl is surely a mistake.

If you've got a better idea for tokens ~~ and !~ ?

> What the heck is wrong with $match.pre, $match.post, $match.index(n) instead? At least they're readable :-)

Nothing, really. But are they more readable than _match.pre, etc.?

> Additionally, I thought '~' was used for concatenation?

It is.

> Because '+' is overloaded in other languages? Isn't that just exactly what you're now doing with '~' ?

'=' and '==' mean entirely different things. So does / and /*. I don't think ~~ need have anything to do with complement or concatenation.

> I mean, what does a "pattern within" operation have to do with concatenation?

Nothing at all.

> Then, you say this is applicable only to char[]. What about wchar[] and dchar[]? Are they now relegated to second-class citizens? It's no use converting those arrays into char[] on the fly ~ apart from the heap activity and conversion that would ensue (for both operands; one of which could be rather substantial), $match.pre and friends would also have to do conversions back into the original format. Ugghh.

That is a problem, one that would get solved when RegExp can do wchar and dchar. That isn't a technical problem, it's more of a getting around to it problem.

> Yet another issue is with respect to case-folding (which is often used with regex expressions). You see, unicode case-folding does not follow the trivial rules of ASCII ~ you can't just call tolower() and hope for the best. Thus, there needs to be some mechanism to support alternate, more appropriate, converters.

I agree that case is an issue. That's why this also works:

    if (RegExp("string", "i") ~~ "string") ...

and can work with any class type as the left operand, as long as it overloads opMatch.

> In retrospect, much of this should probably be handled via template usage (for the different UTF types). And the converter issue can be resolved by supporting some kind of assignable or plug-in module. All of this can be handled by a templated class. I attempted to do just this with your RegExp class, but ran into problems related to how patterns are stored in the "instruction" stream (size differences between char and dchar, for example).

I don't agree. The problem I ran into with this approach is the injection of the declaration _match into the current scope.

> I'm an advocate for potentially getting regex support into the grammar but, on the face of it, your approach just doesn't appear to be considered in a particularly thorough manner. There again, perhaps you've already addressed the above issues, and the resolution is just not currently visible?

I considered many ways of doing it, and have actually been thinking about it for months. This seemed to be the most practical. I hope I answered your questions about it.

> Perhaps this whole thing should wait until after we see what can be done with the regex templates, so that there's some experience behind the grammar? I mean, that would surely be better than having to remove the above at some point in the future. What's the big rush with built-in regex anyway? I really do think it should wait until we have some solid experience with regex templates ~ don't you think it's rather likely we'll learn something really useful that applies directly to a built-in grammar?

I don't think this takes away from the regex templates. I hope to use the regex templates in conjunction with this syntactic sugar to create optimized regex evaluation.
Feb 15 2006
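For comparison with the RegExp("string", "i") form mentioned above, here is a minimal sketch of a case-insensitive match written against the std.regex module of present-day Phobos. It does not use the 2006 RegExp class or the proposed ~~ operator, and the helper name looksLikeWav is made up for illustration.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: true if `name` ends in ".wav", ignoring case.
// The "i" flag asks std.regex for case-insensitive matching.
bool looksLikeWav(string name)
{
    auto m = matchFirst(name, regex(`\.wav$`, "i"));
    return !m.empty;
}

void main()
{
    writeln(looksLikeWav("SONG.WAV"));
    writeln(looksLikeWav("song.mp3"));
}
```

The flag is attached to the pattern object rather than to the match operator, which is the same division of labour as passing a RegExp instance as the left operand.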
Walter Bright wrote:
> "Kris" <fu bar.com> wrote in message news:dt0q7n$2cuo$1 digitaldaemon.com...
>> In retrospect, much of this should probably be handled via template usage (for the different UTF types). And the converter issue can be resolved by supporting some kind of assignable or plug-in module. All of this can be handled by a templated class. I attempted to do just this with your RegExp class, but ran into problems related to how patterns are stored in the "instruction" stream (size differences between char and dchar, for example).
> I don't agree. The problem I ran into with this approach is the injection of the declaration _match into the current scope.

Have you considered making this more general? I.e. for all if statements, inject a variable that takes the value of the entire condition expression. (Using _result as a placeholder for such an identifier.)

    if ("..." ~~ "...") { _result.match(0); }
    if (myFunc()) { _result.whatever(); }

Why should this behavior be reserved for opMatch() only? Isn't this a very common coding pattern that could also become less verbose by this:

    SomeType result;
    if ( (result = getSomething())) { doSomethingWith(result); }

(becoming:

    if (getSomething()) { doSomethingWith(_result); }
)

One suggestion would be to call _result $. Giving $ the semantics of a "scope injected value". This would go hand in hand with an earlier suggestion of changing the $ for index operations too: Assume [] introduces a new scope, then a $ within [] would refer to whatever is being indexed.

    char[] cutHeadAndTail = myString[1 .. $.length-1];
    Image subImage = myImage[$.upperLeft .. $.middle];
    char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];

/Oskar
Feb 16 2006
"Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1aif$2qpd$1 digitaldaemon.com...Have you considered making this more general? I.e. for all if statements, inject a variable that takes the value of the entire condition expression. (Using _result as a placeholder for such an identifier.) if ("..." ~~ "...) { _result.match(0); } if (myFunc()) { _result.whatever(); } Why should this behavior be reserved for opMatch() only? Isn't this a very common coding pattern that could also become less verbose by this: SomeType result; if ( (result = getSomething())) { doSomethingWith(result); } (becoming: if (getSomething()) { doSomethingWith(_result); } ) One suggestion would be to call _result $. Giving $ the semantics of a "scope injected value". This would go hand in hand with an earlier suggestion of changing the $ for index operations too: Assume [] introduces a new scope, then a $ within [] would refer to whatever is being indexed. char[] cutHeadAndTail = myString[1 .. $.length-1]; Image subImage = myImage[$.upperLeft .. $.middle]; char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];I never thought of that. It's an intriguing idea.
Feb 16 2006
In article <dt1eje$2uvu$2 digitaldaemon.com>, Walter Bright says...
> "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1aif$2qpd$1 digitaldaemon.com...
>> Have you considered making this more general? I.e. for all if statements, inject a variable that takes the value of the entire condition expression. (Using _result as a placeholder for such an identifier.)
>>
>>     if ("..." ~~ "...") { _result.match(0); }
>>     if (myFunc()) { _result.whatever(); }
>>
>> Why should this behavior be reserved for opMatch() only? Isn't this a very common coding pattern that could also become less verbose by this:
>>
>>     SomeType result;
>>     if ( (result = getSomething())) { doSomethingWith(result); }
>>
>> (becoming:
>>
>>     if (getSomething()) { doSomethingWith(_result); }
>> )
>>
>> One suggestion would be to call _result $. Giving $ the semantics of a "scope injected value". This would go hand in hand with an earlier suggestion of changing the $ for index operations too: Assume [] introduces a new scope, then a $ within [] would refer to whatever is being indexed.
>>
>>     char[] cutHeadAndTail = myString[1 .. $.length-1];
>>     Image subImage = myImage[$.upperLeft .. $.middle];
>>     char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];
> I never thought of that. It's an intriguing idea.

Something along these lines would *most certainly* get my vote!

- Eric Anderton at yahoo
Feb 16 2006
pragma wrote:
> In article <dt1eje$2uvu$2 digitaldaemon.com>, Walter Bright says...
>> "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1aif$2qpd$1 digitaldaemon.com...
>>> Have you considered making this more general? I.e. for all if statements, inject a variable that takes the value of the entire condition expression. (Using _result as a placeholder for such an identifier.)
>>>
>>>     if ("..." ~~ "...") { _result.match(0); }
>>>     if (myFunc()) { _result.whatever(); }
>>>
>>> Why should this behavior be reserved for opMatch() only? Isn't this a very common coding pattern that could also become less verbose by this:
>>>
>>>     SomeType result;
>>>     if ( (result = getSomething())) { doSomethingWith(result); }
>>>
>>> (becoming:
>>>
>>>     if (getSomething()) { doSomethingWith(_result); }
>>> )
>>>
>>> One suggestion would be to call _result $. Giving $ the semantics of a "scope injected value". This would go hand in hand with an earlier suggestion of changing the $ for index operations too: Assume [] introduces a new scope, then a $ within [] would refer to whatever is being indexed.
>>>
>>>     char[] cutHeadAndTail = myString[1 .. $.length-1];
>>>     Image subImage = myImage[$.upperLeft .. $.middle];
>>>     char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];
>> I never thought of that. It's an intriguing idea.
> Something along these lines would *most certainly* get my vote! - Eric Anderton at yahoo

Yes ~ mine too
Feb 16 2006
kris wrote:
> pragma wrote:
>> In article <dt1eje$2uvu$2 digitaldaemon.com>, Walter Bright says...
>>> "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1aif$2qpd$1 digitaldaemon.com...
>>>> One suggestion would be to call _result $. Giving $ the semantics of a "scope injected value". This would go hand in hand with an earlier suggestion of changing the $ for index operations too: Assume [] introduces a new scope, then a $ within [] would refer to whatever is being indexed.
>>>>
>>>>     char[] cutHeadAndTail = myString[1 .. $.length-1];
>>>>     Image subImage = myImage[$.upperLeft .. $.middle];
>>>>     char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];
>>> I never thought of that. It's an intriguing idea.
>> Something along these lines would *most certainly* get my vote!
> Yes ~ mine too

Mine too.

Sean
Feb 16 2006
Sean Kelly wrote:
> kris wrote:
>> pragma wrote:
>>> In article <dt1eje$2uvu$2 digitaldaemon.com>, Walter Bright says...
>>>> "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1aif$2qpd$1 digitaldaemon.com...
>>>>> One suggestion would be to call _result $. Giving $ the semantics of a "scope injected value". This would go hand in hand with an earlier suggestion of changing the $ for index operations too: Assume [] introduces a new scope, then a $ within [] would refer to whatever is being indexed.
>>>>>
>>>>>     char[] cutHeadAndTail = myString[1 .. $.length-1];
>>>>>     Image subImage = myImage[$.upperLeft .. $.middle];
>>>>>     char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];
>>>> I never thought of that. It's an intriguing idea.
>>> Something along these lines would *most certainly* get my vote!
>> Yes ~ mine too
> Mine too.

Hold on. Walter, can you explain this injection business a bit? For example, the effect here seems clear:

    if( "x" ~~ "y" ) { _match.blah; }

But what about this:

    if( "x" ~~ "y" && "y" ~~ "z" ) { _match.blah; }

And this:

    if( "x" ~~ "y" || "y" ~~ "z" ) { _match.blah; }

Does _match represent the result of the last match sub-expression evaluated? And is there any way to know which expression succeeded? Does the fact that the injected value is a _Match* mean that I might potentially have an array of objects I could iterate through? And finally, could you clarify the spec in this regard?

Also, with respect to the above proposal, how might this work:

    int numStudents();
    float avgGrade();
    if( numStudents() < 10 || avgGrade() > 50.0 ) { }

While the result of each subexpression is actually boolean (just as in the match expression above), the values we'd be interested in are the integer and float. But in the above example, the float might not be evaluated at all. I'd merely like to voice this as a qualifier to my initial support of this idea above :-)

Sean
Feb 16 2006
Sean Kelly wrote:
> Hold on. Walter, can you explain this injection business a bit? For example, the effect here seems clear:
>
>     if( "x" ~~ "y" ) { _match.blah; }
>
> But what about this:
>
>     if( "x" ~~ "y" && "y" ~~ "z" ) { _match.blah; }
>
> And this:
>
>     if( "x" ~~ "y" || "y" ~~ "z" ) { _match.blah; }

Those are AndAndExpression and OrOrExpression and will not inject anything. Only a pure if(MatchExpression) injects anything.

> Also, with respect to the above proposal, how might this work:
>
>     int numStudents();
>     float avgGrade();
>     if( numStudents() < 10 || avgGrade() > 50.0 ) { }

In this case, $ would always refer to the value of (numStudents() < 10 || avgGrade() > 50.0), which is bool and must always be true. (It would be interesting to change the || expression into returning the left value if it is nonzero and the right value otherwise, without converting anything to bool, but I'm not fully sure what implications that would have...)

> While the result of each subexpression is actually boolean (just as in the match expression above), the values we'd be interested in are the integer and float. But in the above example, the float might not be evaluated at all. I'd merely like to voice this as a qualifier to my initial support of this idea above :-)

This is probably impossible. How would the compiler know what subexpressions are interesting and how would those be referred to?

/Oskar
Feb 16 2006
Oskar Linde wrote:
> Sean Kelly wrote:
>> Hold on. Walter, can you explain this injection business a bit? For example, the effect here seems clear:
>>
>>     if( "x" ~~ "y" ) { _match.blah; }
>>
>> But what about this:
>>
>>     if( "x" ~~ "y" && "y" ~~ "z" ) { _match.blah; }
>>
>> And this:
>>
>>     if( "x" ~~ "y" || "y" ~~ "z" ) { _match.blah; }
> Those are AndAndExpression and OrOrExpression and will not inject anything. Only a pure if(MatchExpression) injects anything.

Very weird. So a MatchExpression by itself has a boolean result but injects a value into the following scope?

>> Also, with respect to the above proposal, how might this work:
>>
>>     int numStudents();
>>     float avgGrade();
>>     if( numStudents() < 10 || avgGrade() > 50.0 ) { }
> In this case, $ would always refer to the value of (numStudents() < 10 || avgGrade() > 50.0), which is bool and must always be true. (It would be interesting to change the || expression into returning the left value if it is nonzero and the right value otherwise, without converting anything to bool, but I'm not fully sure what implications that would have...)

So based on the above, your suggestion would only be useful for single call expressions:

    if( numStudents() )
        printf( "%i students\n", $.whatever );

Seems reasonable I suppose.

>> While the result of each subexpression is actually boolean (just as in the match expression above), the values we'd be interested in are the integer and float. But in the above example, the float might not be evaluated at all. I'd merely like to voice this as a qualifier to my initial support of this idea above :-)
> This is probably impossible. How would the compiler know what subexpressions are interesting and how would those be referred to?

That's fine. I was merely trying to sort out the implications of this new feature.

Sean
Feb 16 2006
Sean Kelly wrote:
> Oskar Linde wrote:
>> Those are AndAndExpression and OrOrExpression and will not inject anything. Only a pure if(MatchExpression) injects anything.
> Very weird. So a MatchExpression by itself has a boolean result but injects a value into the following scope?

No, not boolean. A MatchExpression has a _Match* result. This result is what gets injected into the following scope. My suggestion is just a generalization of this.

> So based on the above, your suggestion would only be useful for single call expressions:
>
>     if( numStudents() )
>         printf( "%i students\n", $.whatever );

Yes.

/Oskar
Feb 16 2006
Oskar Linde wrote:
> Sean Kelly wrote:
>> Oskar Linde wrote:
>>> Those are AndAndExpression and OrOrExpression and will not inject anything. Only a pure if(MatchExpression) injects anything.
>> Very weird. So a MatchExpression by itself has a boolean result but injects a value into the following scope?
> No, not boolean. A MatchExpression has a _Match* result. This result is what gets injected into the following scope. My suggestion is just a generalization of this.

Oh right. And pointers can be implicitly evaluated as logical expressions. Makes sense now.

Sean
Feb 16 2006
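The point about pointers being usable as conditions can be shown with a small sketch. The Match struct and findWord function below are hypothetical stand-ins for illustration only, not the _Match type from the proposal.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical stand-in for the _Match type mentioned in the thread.
struct Match
{
    string hit;
}

// Returns a pointer to a Match on success, or null on failure.
Match* findWord(string haystack, string needle)
{
    if (haystack.indexOf(needle) < 0)
        return null;
    return new Match(needle);
}

void main()
{
    // A non-null pointer counts as true in a condition, null as false.
    if (findWord("hello world", "world"))
        writeln("found");
    if (!findWord("hello world", "moon"))
        writeln("not found");
}
```

This is why a MatchExpression can yield a pointer and still sit directly inside an if condition.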
Oskar Linde wrote:
>     char[] cutHeadAndTail = myString[1 .. $.length-1];
>     Image subImage = myImage[$.upperLeft .. $.middle];
>     char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];

This is a great idea. I like it.
Feb 16 2006
"Julio César Carrascal Urquijo" <jcesar phreaker.net> wrote in message news:dt28a3$o2q$1 digitaldaemon.com...Oskar Linde wrote:There is one problem with it: every time an IfStatement is added to existing code, it will break all uses of $ in the ThenStatement: ----- before -------- if (foo()) $.bar = 3; ------ after --------- if (foo()) { if (abc()) $.bar = 3; // uh-oh! } ---------------------- This is of course a trivial example, but consider if the $ appeared in a large block of code.char[] cutHeadAndTail = myString[1 .. $.length-1]; Image subImage = myImage[$.upperLeft .. $.middle]; char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];This is a great idea. I like it.
Feb 16 2006
Walter Bright wrote:
> "Julio César Carrascal Urquijo" <jcesar phreaker.net> wrote in message news:dt28a3$o2q$1 digitaldaemon.com...
>> Oskar Linde wrote:
>>>     char[] cutHeadAndTail = myString[1 .. $.length-1];
>>>     Image subImage = myImage[$.upperLeft .. $.middle];
>>>     char[] contents = text[$.indexOf('{')+1 .. $.indexOf('}')];
>> This is a great idea. I like it.
> There is one problem with it: every time an IfStatement is added to existing code, it will break all uses of $ in the ThenStatement:
>
> ----- before --------
> if (foo())
>     $.bar = 3;
> ------ after ---------
> if (foo())
> {
>     if (abc())
>         $.bar = 3; // uh-oh!
> }
> ----------------------
>
> This is of course a trivial example, but consider if the $ appeared in a large block of code.

Coding guidelines would probably say that $ should be assigned to a named variable for all but the simplest blocks:

    if (foo()) {
        auto myvar = $;
        ...
    }

The $ would be kind of elusive and only usable in its outermost scope. But the MatchExpression injected _match has the same problem. Consider the following hypothetical refactoring example:

    const char[] two_argument_function_call = r"([_a-zA-Z][_0-9a-zA-Z]*)\(([^,\(\)]+),([^,\(\)]+)\)";

    // Find function-calls
    if (two_argument_function_call ~~ str) {
        // Swap the order of arguments for functions named array_*
        if ("array_(.+)" ~~ _match.match(1)) {
            // Need access to results from outer _match.
        }
        ...
    }

And here is something the current MatchExpression behavior suffers from that a general scope variable would not:

    if (a ~~ b) {
        if (c == d && e ~~ f) {
            do_something(_match.match(0)); // (*)
        }
    }

*) here e ~~ f is not injecting its result and _match refers to the result of a ~~ b

The apparent innocent change of removing the condition c == d from the if-statement will suddenly and silently have a side effect of injecting a shadowing _match variable and thus alter the argument to do_something().

Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread)

    if (m := a ~~ b) {
        ...
        if (n := c ~~ m.match(0)) {
            ...
        }
    }

/Oskar
Feb 17 2006
In article <dt4nqs$2erg$1 digitaldaemon.com>, Oskar Linde says...
> Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread)
>
>     if (m := a ~~ b) {
>         ...
>         if (n := c ~~ m.match(0)) {
>             ...
>         }
>     }

Sort of like an auto 'auto' declaration? I gather that the point is that the lvalue to the := expression is transparent to the context in which it is used (kind of inlining a variable creation and assignment)?

Also, how about using $.outer instead?

Link for "SSO" thread (with syntax examples at bottom of post): http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/33645

- Eric Anderton at yahoo
Feb 17 2006
pragma wrote:
> In article <dt4nqs$2erg$1 digitaldaemon.com>, Oskar Linde says...
>> Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread)
>>
>>     if (m := a ~~ b) {
>>         ...
>>         if (n := c ~~ m.match(0)) {
>>             ...
>>         }
>>     }
> Sort of like an auto 'auto' declaration? I gather that the point is that the lvalue to the := expression is transparent to the context in which it is used (kind of inlining a variable creation and assignment)?

Yes, exactly so. The scope of such variables declared in the operand of for example if-statements should probably be similar to the scope of variables declared in the init-part of a for-declaration.

> Also, how about using $.outer instead?

$.outer could collide with a member identifier. Maybe using the keyword super somehow... or append another $, like $$ for outer, $$$ for outer(outer). I don't think it's very necessary when you can do

    auto outer = $;

before starting the inner scope.

/Oskar
Feb 17 2006
"Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message news:dt4nqs$2erg$1 digitaldaemon.com...The apparent innocent change of removing the condition c == d from the if-statement will suddenly and silently have a side effect of injecting a shadowing _match variable and thus alter the argument to do_something().Yes, that's a problem.Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread) if (m := a ~~ b) { ... if (n := c ~~ m.match(0)) { ... } }It's a workable proposal. But it overlaps the functionality of 'auto' declarations a bit much. And: if (auto m = a ~~ b) might be a little wordy? Perhaps: if (m; a ~~ b) sort of along the lines of foreach?
Feb 17 2006
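The named-result form, if (auto m = ...), can be tried out in present-day D with the std.regex module in place of the proposed ~~ operator. The sketch below is an illustration under that assumption; the helper name arraySuffix and the simplified patterns are made up, loosely following the nested-match refactoring example earlier in the thread.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: for a call such as "array_swap(a,b)", return the
// part of the function name after "array_", or null if nothing matches.
string arraySuffix(string str)
{
    // Outer match: name(arg1,arg2)
    if (auto m = matchFirst(str, regex(`(\w+)\((\w+),(\w+)\)`)))
    {
        // The inner match gets its own name, so the outer result `m`
        // stays reachable and nothing is silently shadowed.
        if (auto n = matchFirst(m[1], regex(`^array_(.+)`)))
            return n[1];
    }
    return null;
}

void main()
{
    writeln(arraySuffix("array_swap(a,b)"));
}
```

Because each result has an explicit name, adding or removing a condition in the inner if cannot change which match an outer reference points to, which is the failure mode described for the injected _match.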
Walter Bright wrote:
> "Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message news:dt4nqs$2erg$1 digitaldaemon.com...
>> The apparent innocent change of removing the condition c == d from the if-statement will suddenly and silently have a side effect of injecting a shadowing _match variable and thus alter the argument to do_something().
> Yes, that's a problem.
>> Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread)
>>
>>     if (m := a ~~ b) {
>>         ...
>>         if (n := c ~~ m.match(0)) {
>>             ...
>>         }
>>     }
> It's a workable proposal. But it overlaps the functionality of 'auto' declarations a bit much. And: if (auto m = a ~~ b) might be a little wordy? Perhaps: if (m; a ~~ b) sort of along the lines of foreach?

Yes! This one I like. I have shuddered a lot while reading this thread, I do not like too much magic happening in my code. This one is neat and simple, consistent with existing syntax. And most importantly; makes it quite hard to write incorrect code.

// Fredrik Olsson
Feb 17 2006
Walter Bright wrote:
> "Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message news:dt4nqs$2erg$1 digitaldaemon.com...
>> The apparent innocent change of removing the condition c == d from the if-statement will suddenly and silently have a side effect of injecting a shadowing _match variable and thus alter the argument to do_something().
> Yes, that's a problem.
>> Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread)
>>
>>     if (m := a ~~ b) {
>>         ...
>>         if (n := c ~~ m.match(0)) {
>>             ...
>>         }
>>     }
> It's a workable proposal. But it overlaps the functionality of 'auto' declarations a bit much. And: if (auto m = a ~~ b) might be a little wordy? Perhaps: if (m; a ~~ b) sort of along the lines of foreach?

I like it. Assuming this were implemented, would it affect all conditional expressions except foreach?

Sean
Feb 17 2006
> if (auto m = a ~~ b) might be a little wordy? Perhaps: if (m; a ~~ b)

I personally like the former, it does not need special 'if' syntax.
Feb 17 2006
Walter Bright wrote:
> "Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message news:dt4nqs$2erg$1 digitaldaemon.com...
>> The apparent innocent change of removing the condition c == d from the if-statement will suddenly and silently have a side effect of injecting a shadowing _match variable and thus alter the argument to do_something().
> Yes, that's a problem.
>> Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread)
>>
>>     if (m := a ~~ b) {
>>         ...
>>         if (n := c ~~ m.match(0)) {
>>             ...
>>         }
>>     }
> It's a workable proposal. But it overlaps the functionality of 'auto' declarations a bit much. And: if (auto m = a ~~ b) might be a little wordy? Perhaps: if (m; a ~~ b) sort of along the lines of foreach?

How would this scale to something like

    if((a ~~ b) && (c ~~ d))

would it be:

    if( m; a~~b && n; c~~d) ?

This looks confusing to me. Wouldn't ':' look better here:

    if( m: a~~b && n: c~~d) ?

But I think I like Ben's declare and init := operator best in this case.
Feb 17 2006
Walter Bright wrote:
> "Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message news:dt4nqs$2erg$1 digitaldaemon.com...
>> The apparent innocent change of removing the condition c == d from the if-statement will suddenly and silently have a side effect of injecting a shadowing _match variable and thus alter the argument to do_something().
> Yes, that's a problem.
>> Maybe this is a good time to consider Ben Hinkle's suggested declare-and-init operator := as a non-verbose way of naming sub-expressions. http://www.digitalmars.com/d/archives/digitalmars/D/28198.html (Also similar to Serg Kovrovs suggestion on the Semantic Scope Operator thread)
>>
>>     if (m := a ~~ b) {
>>         ...
>>         if (n := c ~~ m.match(0)) {
>>             ...
>>         }
>>     }
> It's a workable proposal. But it overlaps the functionality of 'auto' declarations a bit much. And: if (auto m = a ~~ b) might be a little wordy? Perhaps: if (m; a ~~ b) sort of along the lines of foreach?

I'm uneasy with this. We're playing with fundamental constructs here. if( ; ) is something so pivotal, that we should give this careful thought. If it took us 4 years of hard work to get rid of bit, what will happen when this gets rushed into the language without due diligence?
Feb 17 2006
"Georg Wrede" <georg.wrede nospam.org> wrote [snip]I'm all for getting some kind of regex sugar in the grammar, but also feel a bit alarmed about the sudden rush to 'slam' all this into the language. Seems like it would be wiser to approach this whole thing in smaller steps: let's see how foreach() goes first?if (auto m = a ~~ b) might be a little wordy? Perhaps: if (m; a ~~ b) sort of along the lines of foreach?I'm uneasy with this. We're playing with fundamental constructs here. if( ; ) is something so pivotal, that we should give this careful thought. If it took us 4 years of hard work to get rid of bit, what will happen when this gets rushed into the language without due diligence?
Feb 17 2006
Kris wrote:
> I'm all for getting some kind of regex sugar in the grammar, but also feel a bit alarmed about the sudden rush to 'slam' all this into the language. Seems like it would be wiser to approach this whole thing in smaller steps: let's see how foreach() goes first?

As long as these new features don't break old code, I'm fine with Walter trying things out. After all, the best way to solicit input is often to give people something to play with. But it would be nice if there were a way to have these features flagged as "experimental."

Sean
Feb 17 2006
"Sean Kelly" <sean f4.ca> wrote...Kris wrote:That would be cool.I'm all for getting some kind of regex sugar in the grammar, but also feel a bit alarmed about the sudden rush to 'slam' all this into the language. Seems like it would be wiser to approach this whole thing in smaller steps: let's see how foreach() goes first?As long as these new features don't break old code, I'm fine with Walter trying things out. After all, the best way to solicit input is often to give people something to play with. But it would be nice if there were a way to have these features flagged as "experimental."
Feb 17 2006
Georg Wrede wrote:
> I'm uneasy with this. We're playing with fundamental constructs here. if( ; ) is something so pivotal, that we should give this careful thought. If it took us 4 years of hard work to get rid of bit, what will happen when this gets rushed into the language without due diligence?

True enough. However, the above syntax is currently illegal, so there's no chance of something breaking, and C/C++ already allow declarations in if blocks via the traditional method:

    if( int x = foo() ) {}

One of Walter's other suggestions was to use this syntax, with the qualification that it was a bit verbose. One thing I like about the proposed syntax is that it's already how foreach works, so the semantic meaning is mostly just being extended to if and while blocks. The 'for' syntax doesn't match this however, which may be one argument in favor of the more traditional 'auto' method.

Personally, my primary interest is that the syntax be both consistent and obvious. Both of the above work for me, but I favor "if( x; foo() )" if implicit type determination is mandatory. If it's not, I'm ambivalent.

Sean
Feb 17 2006
Oskar Linde wrote:
> Coding guidelines would probably say that $ should be assigned to a named variable for all but the simplest blocks:
>
>     if (foo()) {
>         auto myvar = $;
>         ...
>     }
>
> The $ would be kind of elusive and only usable in its outermost scope.

That would sort of make the whole token pointless IMO - easier just to do something like:

    if ((myvar = foo()) != 0)

or whatever, I'm not sure exactly how the syntax currently works for this.
Feb 17 2006
Walter Bright wrote:
> "Kris" <fu bar.com> wrote in message news:dt0q7n$2cuo$1 digitaldaemon.com...
>> There seem to be multiple issues here. The first one, which you ask about, is related to the syntax. At first blush, the ~~ looks like an approximate approximation, and then making D look like a malformed Perl is surely a mistake.
> If you've got a better idea for tokens ~~ and !~ ?

Well, there's always "in" ...

    if (".wav$" in filename) ...

plus the !in variation. Don't you find that somewhat more appealing?

>> What the heck is wrong with $match.pre, $match.post, $match.index(n) instead? At least they're readable :-)
> Nothing, really. But are they more readable than _match.pre, etc.?

I believe the shortened versions ($pre, $post, $group[n] etc) are much more readable. This type of thing is why some of us were so adamant about saving the $ sign as a prefix for meta-tags, vis-a-vis $time, $file, $line and, of course, $length

>> Additionally, I thought '~' was used for concatenation?
> It is.
>> Because '+' is overloaded in other languages? Isn't that just exactly what you're now doing with '~' ?
> '=' and '==' mean entirely different things. So does / and /*. I don't think ~~ need have anything to do with complement or concatenation.

The first two are at least related. But the argument is flawed: choosing arbitrary symbols for operators does not make the language easier to grasp. At least "in" has some relevant meaning to it.

>> Then, you say this is applicable only to char[]. What about wchar[] and dchar[]? Are they now relegated to second-class citizens? It's no use converting those arrays into char[] on the fly ~ apart from the heap activity and conversion that would ensue (for both operands; one of which could be rather substantial), $match.pre and friends would also have to do conversions back into the original format. Ugghh.
> That is a problem, one that would get solved when RegExp can do wchar and dchar. That isn't a technical problem, it's more of a getting around to it problem.

Well, since grammar supported regex has elevated itself to the top of the priority list, perhaps wchar/dchar support might tag along with it?

>> Yet another issue is with respect to case-folding (which is often used with regex expressions). You see, unicode case-folding does not follow the trivial rules of ASCII ~ you can't just call tolower() and hope for the best. Thus, there needs to be some mechanism to support alternate, more appropriate, converters.
> I agree that case is an issue. That's why this also works: if (RegExp("string", "i") ~~ "string") ... and can work with any class type as the left operand, as long as it overloads opMatch.

That's a good solution. Do you have a unicode 'folder' ?

>> In retrospect, much of this should probably be handled via template usage (for the different UTF types). And the converter issue can be resolved by supporting some kind of assignable or plug-in module. All of this can be handled by a templated class. I attempted to do just this with your RegExp class, but ran into problems related to how patterns are stored in the "instruction" stream (size differences between char and dchar, for example).
> I don't agree. The problem I ran into with this approach is the injection of the declaration _match into the current scope.

I don't understand the relevance of that, Walter. What does _match have to do with the need to support utf8, utf16 and utf32?

>> I'm an advocate for potentially getting regex support into the grammar but, on the face of it, your approach just doesn't appear to be considered in a particularly thorough manner. There again, perhaps you've already addressed the above issues, and the resolution is just not currently visible?
> I considered many ways of doing it, and have actually been thinking about it for months. This seemed to be the most practical. I hope I answered your questions about it.

No, but the opMatch() is a good solution for that aspect.

>> Perhaps this whole thing should wait until after we see what can be done with the regex templates, so that there's some experience behind the grammar? I mean, that would surely be better than having to remove the above at some point in the future. What's the big rush with built-in regex anyway? I really do think it should wait until we have some solid experience with regex templates ~ don't you think it's rather likely we'll learn something really useful that applies directly to a built-in grammar?
> I don't think this takes away from the regex templates. I hope to use the regex templates in conjunction with this syntactic sugar to create optimized regex evaluation.

Perhaps, but I really don't see the need for this sudden rush to get regex support into the grammar. Experience with regex templates is almost certain to uncover some conflict in this regard ~ one that will likely have to be compromised to fit in with the current syntax. That's just Murphy's law. What's the big hurry?
Feb 16 2006
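For comparison with the `$match.pre` / `_match.pre` spellings debated above, here is a minimal sketch of the same information obtained through a library call. It assumes the present-day Phobos `std.regex` module (`regex`, `matchFirst`, and the `pre`, `hit`, `post` properties of the result), not the 2006 `std.regexp` class or the proposed `~~` operator. The helper name `matchWav` and the sample file name are made up for illustration.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical helper: match a ".wav" file name and capture its stem.
auto matchWav(string filename)
{
    // The dot is escaped so that it only matches a literal '.'.
    return matchFirst(filename, regex(`(\w+)\.wav$`));
}

void main()
{
    auto m = matchWav("music/track01.wav");

    if (!m.empty)
    {
        writeln("pre:   ", m.pre);   // text before the match
        writeln("hit:   ", m.hit);   // the matched text itself
        writeln("post:  ", m.post);  // text after the match
        writeln("group: ", m[1]);    // first parenthesised group
    }
}
```

Nothing is injected into the enclosing scope here; the match result is an ordinary local variable, which is the trade-off the thread is arguing about.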
"kris" <fu bar.org> wrote in message news:dt1cm1$2t76$1 digitaldaemon.com...Walter Bright wrote: Well, there's always "in" ... if (".wav$" in filename) ... plus the !in variation. Don't you find that somewhat more appealing?Not really. I think it also conflicts with 'in' already.Fair enough. Let's see what others think.I believe the shortened versions ($pre, $post, $group[n] etc) are much more readable. This type of thing is why some of us were so adamant about saving the $ sign as a prefix for meta-tags, vis-a-vis $time, $file, $line and, of course, $lengthWhat the heck is wrong with $match.pre, $match.post, $match.index(n) instead? At least they're readable :-)Nothing, really. But are they more readable than _match.pre, etc.?It's all a matter of what you're used to. Who'd have thought that '!' for 'not' would feel natural? It was a kludge invented for C. Now it's standard.The first two are at least related. But the argument is flawed: choosing arbitrary symbols for operators does not make the language easier to grasp.Additionally, I thought '~' was used for concatenation?It is.Because '+' is overloaded in other languages? Isn't that just exactly what you're now doing with '~' ?'=' and '==' mean entirely different things. So does / and /*. I don't think ~~ need have anything to do with complement or concatenation.At least "in" has some relevant meaning to it.It would be overloading its existing meaning, which means that it'll take semantic, rather than syntactic, analysis to disambiguate. This is potential trouble.The thing is, RegExp has been in there from the beginning, but it has gone unused and even its existence is overlooked. I don't believe that's because it isn't useful - look at Ruby, Perl, Javascript, etc. Those languages heavilly use regex. Is there something inherent about *script* languages that make them nice for regex? I don't believe there is, I think it gets heavilly used in those languages because the syntactic sugar makes it easy to use. 
I've been blasted for putting strings in the language (instead of as a library String class), for putting complex numbers in, and for associative arrays. I think the results speak for these being a success. If regex's are heavilly used, then the extra sugar for them becomes worthwhile as well. Who uses regex in C++? Hardly anyone. I'm betting it's because using them sucks in C++, not because people don't use regex's.That is a problem, one that would get solved when RegExp can do wchar and dchar. That isn't a technical problem, it's more of a getting around to it problem.Well, since grammar supported regex has elevated itself to the top of the priority list, perhaps wchar/dchar support might tag along with it?No. But that's a library issue, not a language issue. Match expressions are set up so that one can completely control their behavior with a custom class.I agree that case is an issue. That's why this also works: if (RegExp("string", "i") ~~ "string") ... and can work with any class type as the left operand, as long as it overloads opMatch.That's a good solution. Do you have a unicode 'folder' ?Nothing. But _match *does* have a lot to do with the inadequacy of a pure template solution. Not even mixins will work in a nice way here.I don't understand the relevance of that, Walter. What does _match have to do with the need to support utf8,utf16 and utf32?In retrospect, much of this should probably be handled via template usage (for the different UTF types). And the converter issue can be resolved by supporting some kind of assignable or plug-in module. All of this can be handled by a templated class. I attempted to do just this with your RegExp class, but ran into problems related to how patterns are stored in the "instruction" stream (size differences between char and dchar, for example).I don't agree. 
The problem I ran into with this approach is the injection of the declaration _match into the current scope.I thought it fit in well with D's new capability of being runnable in a script-like fashion. If this opens up a reasonably broad new range of applications that D is a good fit for, that's good. I might be wrong, of course, as I've been with the bit data type (a complete botch). Match expressions don't break anything, were not expensive to implement, and the only way to see how they'll work out is to try them.I don't think this takes away from the regex templates. I hope to use the regex templates in conjunction with this syntactic sugar to create optimized regex evaluation.Perhaps, but I really don't see the need for this sudden rush to get regex support into the grammar. Experience with regex templates is almost certain to uncover some conflict in this regard ~ one that will likely have to be compromised to fit in with the current syntax. That's just Murphy's law. What's the big hurry?
Feb 16 2006
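The case-insensitive test `RegExp("string", "i") ~~ "string"` and the wchar/dchar question can both be sketched with a template. This is only an illustration under the assumption that the present-day `std.regex` module is available; it accepts an `"i"` flag and patterns of all three string types. The helper name `matchesIgnoreCase` is hypothetical, and the sketch says nothing about full Unicode case folding, which is the harder issue raised above.

```d
import std.regex : matchFirst, regex;

/// Case-insensitive match, generic over UTF-8, UTF-16 and UTF-32 strings.
/// Text and pattern must use the same string type.
bool matchesIgnoreCase(S)(S text, S pattern)
{
    return !matchFirst(text, regex(pattern, "i")).empty;
}

void main()
{
    assert(matchesIgnoreCase("STRING", "string"));    // char[]
    assert(matchesIgnoreCase("STRING"w, "string"w));  // wchar[]
    assert(matchesIgnoreCase("STRING"d, "string"d));  // dchar[]
    assert(!matchesIgnoreCase("other", "string"));
}
```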
Walter Bright wrote:"kris" <fu bar.org> wrote in message news:dt1cm1$2t76$1 digitaldaemon.com...but not from the users standpointWalter Bright wrote: Well, there's always "in" ... if (".wav$" in filename) ... plus the !in variation. Don't you find that somewhat more appealing?Not really. I think it also conflicts with 'in' already.It's all a matter of what you're used to. Who'd have thought that '!' for 'not' would feel natural? It was a kludge invented for C. Now it's standard.That doesn't mean D should adopt arbitrary symbols, Walter. If you want rapid adoption, then the more you can do to make the language "approachable", the more success you'll have. There was a similar issue with === and !==, and you thankfully deprecated them :-)I can see that there "might" be trouble for the compiler and, if so, that would be an issue. However, for a developer, the meaning of "in" with respect to its use with AA and potentially regex-patterns is consistent. One is asking the question "does this thing on the left exist within the thing on the right". It even takes care of getting the operand ordering correct. Thus, I'd urge you to at least see if there's actually a notable problem for the compiler to handle this before writing the idea off.At least "in" has some relevant meaning to it.It would be overloading its existing meaning, which means that it'll take semantic, rather than syntactic, analysis to disambiguate. This is potential trouble.The thing is, RegExp has been in there from the beginning, but it has gone unused and even its existence is overlooked. I don't believe that's because it isn't useful - look at Ruby, Perl, Javascript, etc. Those languages heavilly use regex. Is there something inherent about *script* languages that make them nice for regex? I don't believe there is, I think it gets heavilly used in those languages because the syntactic sugar makes it easy to use.Heck, I've used regex in all manner of ways. 
I don't think visibility is the problem; rather, I suspect there's a limited set of domains where it applies in a systems language. Some of the those can be addressed in other ways, particularly where performance is a concern; hence regex may not get used as much as it might. In scripting languages there's often a need for Q & D pattern-matching, with little regard for a potentially more efficient mechanism. Horses for courses.I've been blasted for putting strings in the language (instead of as a library String class), for putting complex numbers in, and for associative arrays. I think the results speak for these being a success. If regex's are heavilly used, then the extra sugar for them becomes worthwhile as well.That's getting a bit off topic, isn't it? OK, I'll go with it: I'm an advocate for getting regex support in the grammar, but I'm certainly not an advocate for tying Phobos to the compiler (RegExp has a notable resultant import set; because of this I refactored it for Ares and Mango). Without a clearly defined means to decouple Phobos from the compiler, you're effectively erecting barriers for other solutions to clamber over (as Sean vaguely intimated earlier). What's missing from all this built-in stuff is a clean and documented means to have it supported outside of Phobos. After all, the compiler is injecting explicit references for AA code, utf conversion code, regex code, and a variety of other things. What's next? In short: you're (a) building more and more library functionality directly into the language without providing a means to cleanly support alternate implementations, extensions, or otherwise decouple the compiler. And (b) by doing so, you're (perhaps inadvertantly) stifling some innovation and causing some headaches for the very people who are trying to help D along the road to acceptance. It would really help if you'd be somewhat sensitive to these aspects rather than persistently ignoring them. 
For instance, how does one change .sort to use a different sorting algorithm? How does one change the hashing function for non-classes? How can one unhook RegExp+OutBuffer+String+Others, and replace it? etc. etc. If D is intended to be a closed-shop, Phobos-only environment, then some of us are presumably wasting our time supporting the language; right? I don't suppose that was the answer you were looking for <g>Who uses regex in C++? Hardly anyone. I'm betting it's because using them sucks in C++, not because people don't use regex's.Again, it's horses for courses. BTW, regex does not suck in C, so why C++ ?I thought it fit in well with D's new capability of being runnable in a script-like fashion. If this opens up a reasonably broad new range of applications that D is a good fit for, that's good. I might be wrong, of course, as I've been with the bit data type (a complete botch). Match expressions don't break anything, were not expensive to implement, and the only way to see how they'll work out is to try them.I figured that was the motivation. The "cost" you speak of considers only how much effort it takes you to get the functionality into the compiler, test it a bit, document the usage, and respond to the flak ;-) BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.
Feb 16 2006
kris wrote:Walter Bright wrote:I'm branching Ares before I check in this last block of changes. In the new branch I'm simply going to move all necessary Phobos std code required into dmdrt/util and will plan to trim it down over time. Not ideal, I know, but better than trying to play catch-up with heavily modified code such as the version of RegExp you provided. For the rest, I agree completely, but then I've already said as much in d.D.announce :-)I've been blasted for putting strings in the language (instead of as a library String class), for putting complex numbers in, and for associative arrays. I think the results speak for these being a success. If regex's are heavilly used, then the extra sugar for them becomes worthwhile as well.That's getting a bit off topic, isn't it? OK, I'll go with it: I'm an advocate for getting regex support in the grammar, but I'm certainly not an advocate for tying Phobos to the compiler (RegExp has a notable resultant import set; because of this I refactored it for Ares and Mango). Without a clearly defined means to decouple Phobos from the compiler, you're effectively erecting barriers for other solutions to clamber over (as Sean vaguely intimated earlier). What's missing from all this built-in stuff is a clean and documented means to have it supported outside of Phobos. After all, the compiler is injecting explicit references for AA code, utf conversion code, regex code, and a variety of other things. What's next?The lack of a standard library component is a significant factor IMO. As is the widely divergent syntaxes supported by third party libraries. Personally, I haven't used regular expressions in D because I haven't needed to yet, not because they weren't a language feature. But I can't help liking this being built-in from a language perspective, even if this is balanced by practical concerns.Who uses regex in C++? Hardly anyone. 
I'm betting it's because using them sucks in C++, not because people don't use regex's.Again, it's horses for courses. BTW, regex does not suck in C, so why C++ ?If it helps, I'll send you a case of beer or something ;-) But if there's universal agreement that packed bit arrays were a mistake then they need to be out pre-1.0 and broken code be damned. I really don't want to see a 1.0 D release containing features that even the designer thinks should not exist. SeanI thought it fit in well with D's new capability of being runnable in a script-like fashion. If this opens up a reasonably broad new range of applications that D is a good fit for, that's good. I might be wrong, of course, as I've been with the bit data type (a complete botch). Match expressions don't break anything, were not expensive to implement, and the only way to see how they'll work out is to try them.I figured that was the motivation. The "cost" you speak of considers only how much effort it takes you to get the functionality into the compiler, test it a bit, document the usage, and respond to the flak ;-) BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.
Feb 16 2006
Sean Kelly wrote on 2006-02-16:
> kris wrote:
> [snip]
>> BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.
> If it helps, I'll send you a case of beer or something ;-) But if there's universal agreement that packed bit arrays were a mistake then they need to be out pre-1.0 and broken code be damned. I really don't want to see a 1.0 D release containing features that even the designer thinks should not exist.

What is the cost of keeping bit[] in the language? Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?

Thomas
Feb 16 2006
Thomas Kuehne wrote:
> Sean Kelly wrote on 2006-02-16:
>> kris wrote:
>> [snip]
>>> BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.
>> If it helps, I'll send you a case of beer or something ;-) But if there's universal agreement that packed bit arrays were a mistake then they need to be out pre-1.0 and broken code be damned. I really don't want to see a 1.0 D release containing features that even the designer thinks should not exist.
> What is the cost of keeping bit[] in the language? Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?

The same as the problems with std::vector<bool> in C++ (though I don't have any specific references handy). I think the true ramifications of this in D won't be completely apparent until the language has been in use a bit longer however. One thought I had was to leave bit in place, perhaps deprecated, and add 'bool' as a non-packed but otherwise equivalent type.

Sean
Feb 16 2006
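The generic-programming concern can be made concrete with a small sketch. It assumes present-day D, where `bool` is a one-byte, non-packed type and packed storage lives in the library as `std.bitmanip.BitArray`; neither of those is the 2006 `bit` type under discussion. The helper `firstElement` is hypothetical.

```d
import std.bitmanip : BitArray;

/// Generic code that relies on every element of a T[] being addressable.
T* firstElement(T)(T[] items)
{
    return items.length ? &items[0] : null;
}

void main()
{
    bool[] flags = [true, false, true];

    // Works: each bool occupies its own byte, so it has an address.
    bool* p = firstElement(flags);
    *p = false;
    assert(flags[0] == false);
    assert(bool.sizeof == 1);

    // A packed representation stores eight flags per byte...
    auto packed = BitArray([true, false, true]);
    assert(packed[0]);

    // ...so indexing hands back a copy and there is no bool* to take.
    static assert(!__traits(compiles, &packed[0]));
}
```

This is the same shape of problem usually cited for `std::vector<bool>`: code written against "an array of T" quietly assumes addressable elements.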
"Thomas Kuehne" <thomas-dloop kuehne.cn> wrote ...Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?Easy fix ~ change the bool alias to byte, instead of bit :-)
Feb 16 2006
Kris wrote:
> "Thomas Kuehne" <thomas-dloop kuehne.cn> wrote ...
>> Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?
> Easy fix ~ change the bool alias to byte, instead of bit :-)

I already use byte in some cases :-) But it lacks the boolean value safety of bit, so I tend to litter my code with asserts just to be sure something didn't get screwed up... or simply make sure I'm only comparing to zero and not-zero. Either way, it's more error prone than I'd like.

Sean
Feb 16 2006
"Sean Kelly" <sean f4.ca> wroteKris wrote:Yes, you're right of course. Would be just great if Walter would add a true *cough* bool *cough* type that doesn't try to pack itself when used with arrays. Packed bits are great too, but for different reasons."Thomas Kuehne" <thomas-dloop kuehne.cn> wrote ...I already use byte in some cases :-) But it lacks the boolean value safety of bit, so I tend to litter my code with asserts just to be sure something didn't get screwed up... or simply make sure I'm only comparing to zero and not-zero. Either way, it's more error prone than I'd like.Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?Easy fix ~ change the bool alias to byte, instead of bit :-)
Feb 16 2006
On Thu, 16 Feb 2006 16:36:48 -0800, Kris <fu bar.com> wrote:
> "Sean Kelly" <sean f4.ca> wrote
>> Kris wrote:
>>> "Thomas Kuehne" <thomas-dloop kuehne.cn> wrote ...
>>>> Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?
>>> Easy fix ~ change the bool alias to byte, instead of bit :-)
>> I already use byte in some cases :-) But it lacks the boolean value safety of bit, so I tend to litter my code with asserts just to be sure something didn't get screwed up... or simply make sure I'm only comparing to zero and not-zero. Either way, it's more error prone than I'd like.
> Yes, you're right of course. Would be just great if Walter would add a true *cough* bool *cough* type that doesn't try to pack itself when used with arrays.

A true bool would make several people happy.. but once one existed people would then want:

    class A {}
    A a = new A();
    if (a) //error not boolean result.

right? That would bother me.

> Packed bits are great too, but for different reasons.

Indeed, I can think of several uses for packed bits, e.g.
- Using them as a bunch of flags, generally boolean on/off flags.
- Representing/dissecting packed data, e.g. tcp headers.
- Assembling/converting data, e.g. 8bit to 7bit characters for SMS messages.

All of these can be done with & | ^ etc. but it would be nicer, i.e. more readable and easier to write, if we could index the data.

I've suggested this before, but is it perhaps possible to allow us to perform array operations on the basic types: byte, short, int, long? For the same reason that bit[] does not work, these could not provide a full set of array functionality, but they could provide much that would be of use, I suspect. Examples:

    int flags;
    ...
    if (flags[5]) //check for flag
        flags[5] = 1; //set flag

    void foo(long header) {
        int length = header[0..5]; //copy bits to lvalue.
        ...
    }

For the 3rd task, converting from 8bit to 7bit, some sort of stream that allowed bits to be sent to it and assembled would be the ideal way, I suspect.

In the end it's just syntactic sugar for & | and ^. The question is: does it make the code clearer? I think so. Does it make bit manipulation easier to code? I think so. Is that enough to make it a valuable feature?

Regan
Feb 16 2006
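The indexing syntax proposed above can be approximated today with operator overloads on a wrapper type. The following is a sketch only: the struct name `Bits` and its layout are hypothetical, bit 0 is assumed to be the least significant bit, and the two-argument `opSlice` relies on the older one-dimensional slicing overload that present-day D still accepts.

```d
/// Hypothetical wrapper giving array-style access to the bits of a uint.
struct Bits
{
    uint value;

    /// flags[i]: test bit i (bit 0 is the least significant bit).
    bool opIndex(size_t i) const
    {
        assert(i < 32);
        return ((value >> i) & 1) != 0;
    }

    /// flags[i] = on: set or clear bit i.
    void opIndexAssign(bool on, size_t i)
    {
        assert(i < 32);
        if (on)
            value |= 1u << i;
        else
            value &= ~(1u << i);
    }

    /// header[lo .. hi]: extract bits lo up to, but not including, hi.
    uint opSlice(size_t lo, size_t hi) const
    {
        assert(lo <= hi && hi <= 32);
        immutable width = hi - lo;
        if (width == 0)
            return 0;
        immutable uint mask = width == 32 ? uint.max : (1u << width) - 1;
        return (value >> lo) & mask;
    }
}

void main()
{
    auto flags = Bits(0);
    flags[5] = true;                     // set flag
    assert(flags[5]);                    // check flag
    assert(flags.value == 0b10_0000);

    auto header = Bits(0b1011_0110);
    assert(header[0 .. 5] == 0b1_0110);  // the low five bits
}
```

A wrapper like this keeps the plain integer types untouched, at the cost of an explicit conversion at the point of use.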
Regan Heath wrote:On Thu, 16 Feb 2006 16:36:48 -0800, Kris <fu bar.com> wrote:This is only a slippery slope if we want it to be ;-) I think the intent behind adding 'bool' was twofold: first, 'bit' loses meaning if it never actually refers to a bit, and second, it allows 'bit' to be deprecated for a while so people can change their code."Sean Kelly" <sean f4.ca> wroteA true bool would make several people happy.. but once one existed people would then want: class A {} A a = new a(); if (a) //error not boolean result. right? That would bother me.Kris wrote:Yes, you're right of course. Would be just great if Walter would add a true *cough* bool *cough* type that doesn't try to pack itself when used with arrays."Thomas Kuehne" <thomas-dloop kuehne.cn> wrote ...I already use byte in some cases :-) But it lacks the boolean value safety of bit, so I tend to litter my code with asserts just to be sure something didn't get screwed up... or simply make sure I'm only comparing to zero and not-zero. Either way, it's more error prone than I'd like.Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?Easy fix ~ change the bool alias to byte, instead of bit :-)Aye. I like the idea of packed bit arrays in general. I just don't want them to be mandatory for the built-in boolean type--I run into too many situations where I want to do something that the existing syntax doesn't support and I'm stuck using an array of bytes instead. SeanPacked bits are great too, but for different reasons.Indeed, I can think of several uses for packed bits. i.e. - Using them as a bunch of flags, generally boolean on/off flags. - Representing/disecting packed data, i.e. tcp headers. - Assembling/converting data i.e. 8bit to 7bit characters for SMS messages. all of these can be done with & | ^ etc but it would be nice, i.e. 
more readable, easier to write if we could index the data.
Feb 16 2006
"Regan Heath" <regan netwin.co.nz> wrote in message news:ops43d5lmc23k2f5 nrage.netwin.co.nz...In the end it's just syntactic sugar for & | and ^. The question is, does it make the code clearer, I think so. Does it make bit manipulation easier to code, I think so. Is that enough to make it a valuable feature?I regularly do bit masking and shifting on ints. I'm so used to it, I don't think that adding sugar for it would help any.
Feb 16 2006
On Thu, 16 Feb 2006 17:25:23 -0800, Walter Bright wrote:
> "Regan Heath" <regan netwin.co.nz> wrote in message news:ops43d5lmc23k2f5 nrage.netwin.co.nz...
>> In the end it's just syntactic sugar for & | and ^. The question is, does it make the code clearer, I think so. Does it make bit manipulation easier to code, I think so. Is that enough to make it a valuable feature?
> I regularly do bit masking and shifting on ints. I'm so used to it, I don't think that adding sugar for it would help any.

YOU ARE DEAD WRONG! Sheesh!!! Not all of us are blessed with your abilities. That's why I don't do Assembler anymore and that's why we use higher level languages than machine code.

--
Derek (skype: derek.j.parnell) Melbourne, Australia "Down with mediocracy!" 17/02/2006 12:40:36 PM
Feb 16 2006
"Derek Parnell" <derek psych.ward> wrote in message news:edpqlnztl599.19xc3uf14ntbh.dlg 40tude.net...On Thu, 16 Feb 2006 17:25:23 -0800, Walter Bright wrote:What about using some functions instead: int setBit(inout v, int b) { return v |= 1 << b; } ?"Regan Heath" <regan netwin.co.nz> wrote in message news:ops43d5lmc23k2f5 nrage.netwin.co.nz...YOU ARE DEAD WRONG! Sheesh!!! Not all us are blessed with your abilities.In the end it's just syntactic sugar for & | and ^. The question is, does it make the code clearer, I think so. Does it make bit manipulation easier to code, I think so. Is that enough to make it a valuable feature?I regularly do bit masking and shifting on ints. I'm so used to it, I don't think that adding sugar for it would help any.That's why I don't do Assembler anymore and that's why we use higher level languages than machine code.<g>
Feb 16 2006
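The function-based alternative suggested above can be filled out into a small set of helpers. This is a sketch in present-day D, where `ref` is used for by-reference parameters (the `inout` spelling in the post was the keyword for that at the time). The names `clearBit` and `testBit` are additions for illustration and do not come from the thread.

```d
/// Set bit b of v in place and return the new value.
int setBit(ref int v, int b)
{
    return v |= 1 << b;
}

/// Clear bit b of v in place and return the new value.
int clearBit(ref int v, int b)
{
    return v &= ~(1 << b);
}

/// Report whether bit b of v is set.
bool testBit(int v, int b)
{
    return (v & (1 << b)) != 0;
}

void main()
{
    int flags = 0;

    assert(setBit(flags, 5) == 32);   // 1 << 5
    assert(testBit(flags, 5));
    assert(!testBit(flags, 4));

    assert(clearBit(flags, 5) == 0);
    assert(!testBit(flags, 5));
}
```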
On Thu, 16 Feb 2006 17:48:54 -0800, Walter Bright wrote:
> "Derek Parnell" <derek psych.ward> wrote in message news:edpqlnztl599.19xc3uf14ntbh.dlg 40tude.net...
>> On Thu, 16 Feb 2006 17:25:23 -0800, Walter Bright wrote:
>>> "Regan Heath" <regan netwin.co.nz> wrote in message news:ops43d5lmc23k2f5 nrage.netwin.co.nz...
>>>> In the end it's just syntactic sugar for & | and ^. The question is, does it make the code clearer, I think so. Does it make bit manipulation easier to code, I think so. Is that enough to make it a valuable feature?
>>> I regularly do bit masking and shifting on ints. I'm so used to it, I don't think that adding sugar for it would help any.
>> YOU ARE DEAD WRONG! Sheesh!!! Not all of us are blessed with your abilities.
> What about using some functions instead: int setBit(inout int v, int b) { return v |= 1 << b; } ?

You mean like std.regexp library functions? Oh that's right ... we have ~~ now; silly me.

--
Derek (skype: derek.j.parnell) Melbourne, Australia "Down with mediocracy!" 17/02/2006 1:49:07 PM
Feb 16 2006
"Walter Bright" <newshound digitalmars.com> wrote"Regan Heath" <regan netwin.co.nz> wrote in message news:ops43d5lmc23k2f5 nrage.netwin.co.nz...Besides, its easy to use op-overloads for such things as necessary.In the end it's just syntactic sugar for & | and ^. The question is, does it make the code clearer, I think so. Does it make bit manipulation easier to code, I think so. Is that enough to make it a valuable feature?I regularly do bit masking and shifting on ints. I'm so used to it, I don't think that adding sugar for it would help any.
Feb 16 2006
On Fri, 17 Feb 2006 13:54:47 +1300, Regan Heath wrote:On Thu, 16 Feb 2006 16:36:48 -0800, Kris <fu bar.com> wrote:I regard the syntax if ( <identifier> ) as shorthand for if ( <identifier> != 0 ) or if ( <identifier> !is null) as appropriate, so this would not fall foul of a native boolean implementation. -- Derek (skype: derek.j.parnell) Melbourne, Australia "Down with mediocracy!" 17/02/2006 12:37:40 PM"Sean Kelly" <sean f4.ca> wroteA true bool would make several people happy.. but once one existed people would then want: class A {} A a = new a(); if (a) //error not boolean result. right? That would bother me.Kris wrote:Yes, you're right of course. Would be just great if Walter would add a true *cough* bool *cough* type that doesn't try to pack itself when used with arrays."Thomas Kuehne" <thomas-dloop kuehne.cn> wrote ...I already use byte in some cases :-) But it lacks the boolean value safety of bit, so I tend to litter my code with asserts just to be sure something didn't get screwed up... or simply make sure I'm only comparing to zero and not-zero. Either way, it's more error prone than I'd like.Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?Easy fix ~ change the bool alias to byte, instead of bit :-)
Feb 16 2006
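The reading of `if (<identifier>)` as shorthand for a comparison against zero or null can be illustrated as follows. This is a sketch of how present-day D behaves, not a statement about the 2006 compiler; the helper names are hypothetical.

```d
class A {}

/// A class reference in a condition is tested for being non-null.
string describeRef(A a)
{
    // Same effect as writing `if (a !is null)`.
    if (a)
        return "non-null";
    return "null";
}

/// An integer in a condition is tested against zero.
string describeInt(int n)
{
    // Same effect as writing `if (n != 0)`.
    if (n)
        return "non-zero";
    return "zero";
}

void main()
{
    assert(describeRef(new A) == "non-null");
    assert(describeRef(null) == "null");
    assert(describeInt(3) == "non-zero");
    assert(describeInt(0) == "zero");
}
```

Under this reading a separate boolean type does not have to forbid the idiom, which is the point being made in the post above.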
Regan Heath wrote:On Thu, 16 Feb 2006 16:36:48 -0800, Kris <fu bar.com> wrote:I favor a true bool, but would still like to keep the if(<int>) idiom."Sean Kelly" <sean f4.ca> wroteA true bool would make several people happy.. but once one existed people would then want: class A {} A a = new a(); if (a) //error not boolean result. right? That would bother me.Kris wrote:Yes, you're right of course. Would be just great if Walter would add a true *cough* bool *cough* type that doesn't try to pack itself when used with arrays."Thomas Kuehne" <thomas-dloop kuehne.cn> wrote ...I already use byte in some cases :-) But it lacks the boolean value safety of bit, so I tend to litter my code with asserts just to be sure something didn't get screwed up... or simply make sure I'm only comparing to zero and not-zero. Either way, it's more error prone than I'd like.Currently, every type - including void - can be used as the type on an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?Easy fix ~ change the bool alias to byte, instead of bit :-)An interesting idea, but maybe, to avoid conflicts syntax conflicts, we should have: if (flags.bits[5]) flags.bits[5] = 0; (the name "bits" could maybe be other) -- Bruno Medeiros - CS/E student "Certain aspects of D are a pathway to many abilities some consider to be... unnatural."Packed bits are great too, but for different reasons.Indeed, I can think of several uses for packed bits. i.e. - Using them as a bunch of flags, generally boolean on/off flags. - Representing/disecting packed data, i.e. tcp headers. - Assembling/converting data i.e. 8bit to 7bit characters for SMS messages. all of these can be done with & | ^ etc but it would be nice, i.e. more readable, easier to write if we could index the data. I've suggested this before but is it perhaps possible to allow us to perform array operations on the basic types: byte, short, int, long. 
For the same reason that bit[] does not work, these could not provide a full set of array functionality, but it could provide much that would be of use, I suspect. Examples: int flags; .... if (flags[5]) //check for flag flag[5] = 1; //set flag void foo(long header) { int length = header[0..5]; //copy bits to lvalue. ....
Feb 17 2006
On Fri, 17 Feb 2006 15:38:38 +0000, Bruno Medeiros <daiphoenixNO SPAMlycos.com> wrote:
>> if (flags[5]) //check for flag
>>     flags[5] = 1; //set flag
>> void foo(long header) { int length = header[0..5]; //copy bits to lvalue. ....
> An interesting idea, but maybe, to avoid syntax conflicts, we should have: if (flags.bits[5]) flags.bits[5] = 0; (the name "bits" could maybe be other)

I like it.

Regan
Feb 17 2006
"kris" <fu bar.org> wrote in message news:dt1jhm$1m3$1 digitaldaemon.com...Walter Bright wrote:Can't separate the two.Not really. I think it also conflicts with 'in' already.but not from the users standpointThat doesn't mean D should adopt arbitrary symbols, Walter. If you want rapid adoption, then the more you can do to make the language "approachable", the more success you'll have. There was a similar issue with === and !==, and you thankfully deprecated them :-)Those had to go because === was indistinguishable from == in many fonts.The trouble starts happening when you overload the operators. Doing this with 'in' will result in similar problems that C++ has with '+' being sometimes plus, sometimes concatenate.It would be overloading its existing meaning, which means that it'll take semantic, rather than syntactic, analysis to disambiguate. This is potential trouble.I can see that there "might" be trouble for the compiler and, if so, that would be an issue. However, for a developer, the meaning of "in" with respect to its use with AA and potentially regex-patterns is consistent.One is asking the question "does this thing on the left exist within the thing on the right". It even takes care of getting the operand ordering correct. Thus, I'd urge you to at least see if there's actually a notable problem for the compiler to handle this before writing the idea off.It's not a problem with the compiler. It's a conceptual problem for the user. When I see 'in' I think of containers. That's completely different from regex.Heck, I've used regex in all manner of ways. I don't think visibility is the problem; rather, I suspect there's a limited set of domains where it applies in a systems language. Some of the those can be addressed in other ways, particularly where performance is a concern; hence regex may not get used as much as it might. In scripting languages there's often a need for Q & D pattern-matching, with little regard for a potentially more efficient mechanism. 
Horses for courses.Scripting languages have 3 main programming characteristics: 1) dynamic typing 2) great string handling 3) runtime script generation & execution A lot of people turn to them because of (2). There's no reason C++ and D can't do (2) as well. C++ doesn't because the C++ community has adopted the principle of "if it can be done as a library, it must be done as a library, no matter how unbelievably wretched that might turn out." So when C++ programmers want to do strings, they switch to Perl, Ruby, Python, etc. As to string manipulation in a systems app - is a compiler a systems app? I believe it is, and there's a bunch of tedious string manipulation in it. Everything from handling the command line arguments to manipulating file names to formatting error messages to reading config files. It's astonishing how that stuff shrinks and becomes a pleasure to code rather than tedium when the string handling sugar is applied. I also write a number of garden variety string processing apps, such as the one that turns newsgroup postings into the "D archives". I want to do them in D. I don't want to install/learn Ruby/Python/Perl. I see no reason why D cannot dominate that problem space well.I'm an advocate for getting regex support in the grammar,I thought you were arguing against that <g>.but I'm certainly not an advocate for tying Phobos to the compiler (RegExp has a notable resultant import set; because of this I refactored it for Ares and Mango). Without a clearly defined means to decouple Phobos from the compiler, you're effectively erecting barriers for other solutions to clamber over (as Sean vaguely intimated earlier). What's missing from all this built-in stuff is a clean and documented means to have it supported outside of Phobos. After all, the compiler is injecting explicit references for AA code, utf conversion code, regex code, and a variety of other things. What's next?The compiler actually does not emit any explicit references to RegExp. 
It's all done by a reference to object._Match. _Match operates as a proxy to RegExp, but the compiler knows nothing about that.In short: you're (a) building more and more library functionality directly into the language without providing a means to cleanly support alternate implementations, extensions, or otherwise decouple the compiler. And (b) by doing so, you're (perhaps inadvertantly) stifling some innovation and causing some headaches for the very people who are trying to help D along the road to acceptance. It would really help if you'd be somewhat sensitive to these aspects rather than persistently ignoring them. For instance, how does one change .sort to use a different sorting algorithm? How does one change the hashing function for non-classes? How can one unhook RegExp+OutBuffer+String+Others, and replace it? etc. etc. If D is intended to be a closed-shop, Phobos-only environment, then some of us are presumably wasting our time supporting the language; right?Regex is non-trivial. There's no way to have any sort of language support for it without it being in the library. Anyone working on D libraries or other things is welcome to use RegExp, so I am just not understanding what the problem is. Phobos isn't a closed shop, the license on the files allows anyone to do pretty much anything they want with it. Also, let me reiterate that the compiler does *not* emit any hardcoded references to RegExp, nor does it know anything at all about regex's. It uses object._Match, which is a proxy to whatever the language implementor wants to use. RegExp could probably remove its dependence on OutBuffer, though.It sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. 
There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.Who uses regex in C++? Hardly anyone. I'm betting it's because using them sucks in C++, not because people don't use regex's.Again, it's horses for courses. BTW, regex does not suck in C, so why C++ ?BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.I know Stewart is using bit[], I want to hear his opinion first. If he says dump it, I'm agreeable.
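The disagreement over `in` above can be made concrete with a minimal sketch in present-day D (not the 2006 compiler under discussion). It assumes today's `std.regex` module; `Pattern` is a hypothetical wrapper invented for this illustration and is not part of Phobos. It shows the same keyword meaning "key is present in container" in one place and "pattern matches somewhere in text" in another, which is the overloading the posters disagree about.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

// Hypothetical wrapper type, not part of Phobos: it exists only so that
// `pattern in text` can be spelled with the ordinary `in` operator.
struct Pattern
{
    private Regex!char re;

    this(string source)
    {
        re = regex(source);
    }

    // `p in text` is rewritten by the compiler to p.opBinary!"in"(text).
    bool opBinary(string op)(string text) if (op == "in")
    {
        return !matchFirst(text, re).empty;
    }
}

void main()
{
    // Container use: `in` on an associative array yields a pointer to the
    // value, or null when the key is absent.
    int[string] postCount = ["walter": 19, "kris": 39];
    writeln(("walter" in postCount) !is null);
    writeln(("nobody" in postCount) !is null);

    // Pattern use: same keyword, but now it means "matches somewhere in".
    auto digits = Pattern(`[0-9]+`);
    writeln(digits in "dmd 0.147");
    writeln(digits in "no version here");
}
```

Whether a reader finds the two uses consistent (Kris's position) or confusing (Walter's position) is exactly the judgment call debated in the thread; the sketch only shows that both spellings are expressible.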
Feb 16 2006
Walter Bright wrote:
> The compiler actually does not emit any explicit references to RegExp. It's all done by a reference to object._Match. _Match operates as a proxy to RegExp, but the compiler knows nothing about that.

This is really more of a library issue than a compiler issue. My concern is that, since internal/object.d now imports std.regexp, the runtime code can no longer be built without at least a skeleton regexp module available. And if the regexp implementation changes then the runtime must be rebuilt. I'll admit that the current approach is probably best given that std.regexp exists and code duplication is a Bad Thing, but it still creates a language dependency on library code, even if the compiler isn't emitting RegExp calls directly.

> Regex is non-trivial. There's no way to have any sort of language support for it without it being in the library. Anyone working on D libraries or other things is welcome to use RegExp, so I am just not understanding what the problem is. Phobos isn't a closed shop, the license on the files allows anyone to do pretty much anything they want with it.

I agree. And this works fine for Phobos. But if Phobos is to be a template for future standard library implementations, then it should be designed in a way that allows for closed-source compiler implementations as well.

Also, what if a library writer decides to exploit the regular expression support provided by the language, and merely implements his RegExp class as a veneer over the built-in functionality? It creates an odd sort of circular dependency. I'd originally considered the same thing for UTF transcoding using the built-in foreach mechanism, but as that code is relatively simple it's not as much of an issue.

I assume there's no plan to remove std.regexp from Phobos now that language support is in place?

> Also, let me reiterate that the compiler does *not* emit any hardcoded references to RegExp, nor does it know anything at all about regex's. It uses object._Match, which is a proxy to whatever the language implementor wants to use.

Understood. In fact I'll vouch for this since I've had a close look at the code.

Sean
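The proxy arrangement described above can be sketched in present-day D. This is an illustration under stated assumptions, not the actual 2006 runtime code: the thread never lists the members of `object._Match`, so the interface `MatchEngine`, its four member names, and both backend classes are hypothetical. The point is only that client code written against a small interface does not care which regex implementation sits behind it.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical interface: these four member names are invented for
// illustration and are not taken from the real object._Match.
interface MatchEngine
{
    bool test(string text); // run the match and remember the result
    string pre();           // text before the match
    string hit();           // the matched text
    string post();          // text after the match
}

// One possible backend, built on today's std.regex.
final class RegexEngine : MatchEngine
{
    private Regex!char re;
    private string pre_, hit_, post_;

    this(string pattern) { re = regex(pattern); }

    bool test(string text)
    {
        auto m = matchFirst(text, re);
        if (m.empty)
            return false;
        pre_ = m.pre;
        hit_ = m.hit;
        post_ = m.post;
        return true;
    }

    string pre() { return pre_; }
    string hit() { return hit_; }
    string post() { return post_; }
}

// A second backend with no regex dependency at all: plain substring search.
final class LiteralEngine : MatchEngine
{
    private string needle;
    private string pre_, hit_, post_;

    this(string needle) { this.needle = needle; }

    bool test(string text)
    {
        immutable at = text.indexOf(needle);
        if (at < 0)
            return false;
        pre_ = text[0 .. at];
        hit_ = text[at .. at + needle.length];
        post_ = text[at + needle.length .. $];
        return true;
    }

    string pre() { return pre_; }
    string hit() { return hit_; }
    string post() { return post_; }
}

// Client code sees only the interface, so either backend can be swapped in.
string describe(MatchEngine engine, string text)
{
    return engine.test(text)
        ? engine.pre() ~ "[" ~ engine.hit() ~ "]" ~ engine.post()
        : "no match";
}

void main()
{
    writeln(describe(new RegexEngine(`[0-9]+`), "build 147 of dmd"));
    writeln(describe(new LiteralEngine("147"), "build 147 of dmd"));
}
```

Sean's concern still applies to a design like this: whichever module defines the default backend has to be available when the runtime is built, even if the compiler itself never names it.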
Feb 16 2006
"Sean Kelly" <sean f4.ca> wrote in message news:dt394j$1n2j$1 digitaldaemon.com...This is really more of a library issue than a compiler issue. My concern is that, since internal/object.d now imports std.regexp, the runtime code can no longer be built without at least a skeleton regexp module available. And if the regexp implementation changes then the runtime must be rebuilt. I'll admit that the current approach is probably best given that std.regexp exists and code duplication is a Bad Thing, but it still creates a language dependency on library code, even if the compiler isn't emitting RegExp calls directly.I was concerned that code that did not use MatchExpressions might inadvertantly link in the std.regexp module, which would be a Bad Thing. It does not, so I'm not convinced this is a bad thing.Sure, and std.regexp's license allows it to be used in closed source. It's a different license from dmd's source code, and the reason for the difference is so that people can use it for just the purpose you suggest. If one wanted to reimplement (or better, extend) RegExp in order to support, say, Perl 6 regex, all that object._Match needs are about 4 trival members, which shouldn't be a burden. Other than that, why reimplement RegExp?Regex is non-trivial. There's no way to have any sort of language support for it without it being in the library. Anyone working on D libraries or other things is welcome to use RegExp, so I am just not understanding what the problem is. Phobos isn't a closed shop, the license on the files allows anyone to do pretty much anything they want with it.I agree. And this works fine for Phobos. 
But if Phobos is to be a template for future standard library implementations, then it should be designed in a way that allows for closed-source compiler implementations as well.Also, what if a library writer decides to exploit the regular expression support provided by the language, and merely implements his RegExp class as a veneer over the built-in functionality? It creates an odd sort of circular dependency.At some point, he'll need a regex implementation. And the license for std.RegExp allows him to use/adapt it as required.I assume there's no plan to remove std.regexp from Phobos now that language support is in place?I'm just not getting it - why should it be removed? There never was a plan to remove it. And why would an implementation of a D runtime library not want to do a regex implementation? Of course, it's a lot of work to implement a regex, but one can just copy over std.RegExp and use/adapt it as required, as the license allows that. So I am just not getting what the problem is.
Feb 16 2006
Walter Bright wrote:
> I'm just not getting it - why should it be removed? There never was a plan to remove it. And why would an implementation of a D runtime library not want to do a regex implementation? Of course, it's a lot of work to implement a regex, but one can just copy over std.RegExp and use/adapt it as required, as the license allows that. So I am just not getting what the problem is.

Perhaps I'm being idealistic, as I simply don't believe the runtime should rely on standard library code. Up to now that's been achievable, but the solution for this particular feature is less clear. But I'll drop the issue for now and mull it over a bit.

Sean
Feb 16 2006
"Sean Kelly" <sean f4.ca> wrote in message news:dt3hev$1taf$1 digitaldaemon.com...Walter Bright wrote:Consider that there's no way to implement C, D, etc., without some runtime library. Just doing a long divide relies on library code. There's the startup code (you can't just jmp to main()), shutdown code, exception handling support, etc. C/C++ have gone the odd route of making the library *part of the language*, so, for example, a compiler can recognize strlen and replace it with custom code. To my mind this gives the worst of both worlds - no syntactic sugar and no library flexibility.I'm just not getting it - why should it be removed? There never was a plan to remove it. And why would an implementation of a D runtime library not want to do a regex implementation? Of course, it's a lot of work to implement a regex, but one can just copy over std.RegExp and use/adapt it as required, as the license allows that. So I am just not getting what the problem is.Perhaps I'm being idealistic, as I simply don't believe the runtime should rely on standard library code. Up to now that's been achievable, but the solution for this particular feature is less clear. But I'll drop the issue for now and mull it over a bit.
Feb 16 2006
Walter Bright wrote:"Sean Kelly" <sean f4.ca> wrote in message news:dt3hev$1taf$1 digitaldaemon.com...Just to be clear, by "standard library code" I actually meant D code specifically. I fully expect the standard C library to be used by the D runtime. But as the C runtime likely calls C standard library functions, I suppose there's little reason to expect otherwise from D.Walter Bright wrote:Consider that there's no way to implement C, D, etc., without some runtime library. Just doing a long divide relies on library code. There's the startup code (you can't just jmp to main()), shutdown code, exception handling support, etc.I'm just not getting it - why should it be removed? There never was a plan to remove it. And why would an implementation of a D runtime library not want to do a regex implementation? Of course, it's a lot of work to implement a regex, but one can just copy over std.RegExp and use/adapt it as required, as the license allows that. So I am just not getting what the problem is.Perhaps I'm being idealistic, as I simply don't believe the runtime should rely on standard library code. Up to now that's been achievable, but the solution for this particular feature is less clear. But I'll drop the issue for now and mull it over a bit.C/C++ have gone the odd route of making the library *part of the language*, so, for example, a compiler can recognize strlen and replace it with custom code. To my mind this gives the worst of both worlds - no syntactic sugar and no library flexibility.I've heard this mentioned before and it seems a bit odd to me. Does the spec actually mention this anywhere, or is it merely implied by having the library spec be a part of the language spec? Sean
Feb 17 2006
"Sean Kelly" <sean f4.ca> wrote in message news:dt5d4e$i49$2 digitaldaemon.com...Walter Bright wrote:I think it's implied by it being part of the language spec. Regardless, it is true, and many compilers (including DMC) take advantage of it.C/C++ have gone the odd route of making the library *part of the language*, so, for example, a compiler can recognize strlen and replace it with custom code. To my mind this gives the worst of both worlds - no syntactic sugar and no library flexibility.I've heard this mentioned before and it seems a bit odd to me. Does the spec actually mention this anywhere, or is it merely implied by having the library spec be a part of the language spec?
Feb 17 2006
"Walter Bright" <newshound digitalmars.com> wrote ... [snip]Can't say that I agree, but my opinion matters rather little anyway <g>One is asking the question "does this thing on the left exist within the thing on the right". It even takes care of getting the operand ordering correct. Thus, I'd urge you to at least see if there's actually a notable problem for the compiler to handle this before writing the idea off.It's not a problem with the compiler. It's a conceptual problem for the user. When I see 'in' I think of containers. That's completely different from regex.Not at all. I've been an advocate for it in the past also. It's certain other aspects of built-in functionality that I consistently have a beef with.I'm an advocate for getting regex support in the grammar,I thought you were arguing against that <g>.It's one thing to hear you say that; yet the proof is in the pudding. It's actually quite tricky to disentangle the compiler from Phobos. Some parts simply cannot be decoupled at all (at this time). It's not a critisism of you personally, but the above concerns are very real and the frustration is something you perhaps need to know about. If I read your answer a particular way, it can be interpreted as saying "why would you *not* want to use Phobos?". That would be an example of stifling innovation, for all kind of reasons.In short: you're (a) building more and more library functionality directly into the language without providing a means to cleanly support alternate implementations, extensions, or otherwise decouple the compiler. And (b) by doing so, you're (perhaps inadvertantly) stifling some innovation and causing some headaches for the very people who are trying to help D along the road to acceptance. It would really help if you'd be somewhat sensitive to these aspects rather than persistently ignoring them. For instance, how does one change .sort to use a different sorting algorithm? How does one change the hashing function for non-classes? 
How can one unhook RegExp+OutBuffer+String+Others, and replace it? etc. etc. If D is intended to be a closed-shop, Phobos-only environment, then some of us are presumably wasting our time supporting the language; right?Regex is non-trivial. There's no way to have any sort of language support for it without it being in the library. Anyone working on D libraries or other things is welcome to use RegExp, so I am just not understanding what the problem is. Phobos isn't a closed shop, the license on the files allows anyone to do pretty much anything they want with it.Also, let me reiterate that the compiler does *not* emit any hardcoded references to RegExp, nor does it know anything at all about regex's. It uses object._Match, which is a proxy to whatever the language implementor wants to use. RegExp could probably remove its dependence on OutBuffer, though.Probably. On the same topic, you've often 'lectured' about the need to decouple such that the "libraries don't end up like Java" . Yet RegExp imports String too, which in turn imports all these (std.format in particular): private import std.stdio; private import std.utf; private import std.uni; private import std.array; private import std.format; private import std.ctype; private import std.stdarg; It's quite easy to eliminate OutBuffer and String from RegExp. There's an adjusted version of it in circulation, if you'd like to forego the effort.It sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.I'm surprised that you'd interpret it that way. I've used regex in C for decades. 
There was one great implementation from, uhhh, Ian somebody from Edinburgh Uni, which generated x86 code on the fly. I used that to great effect ~ a truly impressive utility.
Feb 16 2006
"Kris" <fu bar.com> wrote in message news:dt3cc2$1pc7$1 digitaldaemon.com...How do you interpret the fact that it has failed to gain traction among the general C population?It sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.I'm surprised that you'd interpret it that way. I've used regex in C for decades. There was one great implementation from, uhhh, Ian somebody from Edinburgh Uni, which generated x86 code on the fly. I used that to great effect ~ a truly impressive utility.
Feb 16 2006
"Walter Bright" <newshound digitalmars.com> wrote..."Kris" <fu bar.com> wrote in message news:dt3cc2$1pc7$1 digitaldaemon.com...I noted a few reasons previously, regarding differing approaches and mindsets between script developers and systems developers. Even when the same person does both. George Wrede just posted some very similar reasoning too. The upshot is that (IMO) the general C population rarely have a compelling need for regex. Where regex might seem (perhaps mistakenly) like using a sledgehammer to crack a nut in C, it's usage is often not given a second thought in scripts. Speaking personally, I don't expect high performance out of a script, and don't give two hoots about Q & D hacking therein. That's not the case with systems-programming (for me), where I'm likely to use something more lightweight as appropriate. On the other hand, I've written a lot of the type of code that really benefits from the state-machinery exposed by a good regex engine. Other times I've hand-tuned my own state-machines to do the work instead. Sometimes in assembly. As noted previously, I don't think it's a question of visibility at all ~ more a question of task, applicability, priorities, and various other cost factors. One has to wonder how much script-regex actually leverages the power within? I'd bet a large % are completely trivial. The kind which can easily be handled by other (more efficient) means in systems languages.How do you interpret the fact that it has failed to gain traction among the general C population?It sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. 
There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.I'm surprised that you'd interpret it that way. I've used regex in C for decades. There was one great implementation from, uhhh, Ian somebody from Edinburgh Uni, which generated x86 code on the fly. I used that to great effect ~ a truly impressive utility.
Feb 16 2006
"Kris" <fu bar.com> wrote in message news:dt3foh$1s1d$1 digitaldaemon.com...I noted a few reasons previously, regarding differing approaches and mindsets between script developers and systems developers. Even when the same person does both. George Wrede just posted some very similar reasoning too. The upshot is that (IMO) the general C population rarely have a compelling need for regex. Where regex might seem (perhaps mistakenly) like using a sledgehammer to crack a nut in C, it's usage is often not given a second thought in scripts.This might be a circular result - people don't use regex in C because regex's suck in C, so there is no incentive to improve it because there aren't any users. People just get used to going to another language to use regex, and never stop to think it doesn't have to be that way.Speaking personally, I don't expect high performance out of a script, and don't give two hoots about Q & D hacking therein. That's not the case with systems-programming (for me), where I'm likely to use something more lightweight as appropriate.There's a lot of string processing work done in C that is not performance sensitive - like dealing with the command line arguments.On the other hand, I've written a lot of the type of code that really benefits from the state-machinery exposed by a good regex engine. Other times I've hand-tuned my own state-machines to do the work instead. Sometimes in assembly.Sure. And building in some syntactic sugar for regex isn't going to sabotage optimization.One has to wonder how much script-regex actually leverages the power within? I'd bet a large % are completely trivial.I agree with that.The kind which can easily be handled by other (more efficient) means in systems languages.I'm not sure that efficiency is the only goal here - productivity is a big one, too, and one often uses regex in parts of the program that don't need performance. I know I sure get tired of strlen/strcmp/memcpy for routine non-performance-critical code.
Feb 16 2006
Kris wrote:"Walter Bright" <newshound digitalmars.com> wrote ...For what it's worth, the latest release of Ares trims a lot of fat out of std.string, so far as runtime dependencies are concerned. The only modules that are actually required by some portion of the runtime are: std.ctype std.outbuffer std.regexp std.string std.utf And outbuffer should be easy enough to remove from this list. I'd have continued to use your modified std.regexp for this release except the deltas between the 146 and 147 versions of std.regexp were tremendous. It would have taken hours to sort out a workable merge of that file, so falling back on the new Phobos version seemed preferable.RegExp could probably remove its dependence on OutBuffer, though.Probably. On the same topic, you've often 'lectured' about the need to decouple such that the "libraries don't end up like Java" . Yet RegExp imports String too, which in turn imports all these (std.format in particular): private import std.stdio; private import std.utf; private import std.uni; private import std.array; private import std.format; private import std.ctype; private import std.stdarg; It's quite easy to eliminate OutBuffer and String from RegExp. There's an adjusted version of it in circulation, if you'd like to forego the effort.That sounds pretty cool. SeanIt sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.I'm surprised that you'd interpret it that way. I've used regex in C for decades. There was one great implementation from, uhhh, Ian somebody from Edinburgh Uni, which generated x86 code on the fly. 
I used that to great effect ~ a truly impressive utility.
Feb 16 2006
"Sean Kelly" <sean f4.ca> wrote in message news:dt3nss$22bj$1 digitaldaemon.com...And outbuffer should be easy enough to remove from this list. I'd have continued to use your modified std.regexp for this release except the deltas between the 146 and 147 versions of std.regexp were tremendous. It would have taken hours to sort out a workable merge of that file, so falling back on the new Phobos version seemed preferable.Very little actually changed, what I did was resort the order so it was more appealing in Ddoc format, and add the Ddoc comments.
Feb 16 2006
Walter Bright wrote:
> It sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.

Hmm. Regexes being a big thing for interpreted languages is much thanks to the Q&D convenience. Also systems scripting needs it for nontrivial filtering, and of course complicated line rewriting.

C folks tend to "peek directly" into the strings because it's cheap, and you have a sense of complete control.

Using regexps in C needs a total change of paradigm. Regexps are kind of "top down" things, whereas traditionally "peeking into strings" is bottom-up programming. You'd also have to learn regexps. The trivial things are trivial in C-style too, and the non-trivial stuff gets avoided because of the up-front investment. Folks rather do nested ifs and stuff.

Conversely, many interpreted languages make it inefficient to do "peek" kind of programming, as compared to using regexps.
Feb 16 2006
"Georg Wrede" <georg.wrede nospam.org> wrote in message news:43F53BE5.8020900 nospam.org...Using regexps in C needs a total change of paradigm. Regexps are kind of "top down" things, wherease traditionally "peeking into strings" is bottom-up programming. You'd also have to learn regexps. The trivial things are trivial in C-style too, and the non-trivial stuff gets avoided because of the up-front investment. Folks rather do nested ifs and stuff. Conversely, many interpreted languages make it inefficient to do "peek" kind of programming, as compared to using regexps.There are a lot of cool things you can do in script languages because they are interpreted, and one doesn't care about efficiency. Those things are simply incompatible with D. But I don't see any inherent advantages script languages should have in implementing regex.
Feb 16 2006
Walter Bright wrote:"Georg Wrede" <georg.wrede nospam.org> wrote in message news:43F53BE5.8020900 nospam.org...Neither do I. But the question was, how come regexps aren't _used_ as much as we'd expect.Using regexps in C needs a total change of paradigm. Regexps are kind of "top down" things, wherease traditionally "peeking into strings" is bottom-up programming. You'd also have to learn regexps. The trivial things are trivial in C-style too, and the non-trivial stuff gets avoided because of the up-front investment. Folks rather do nested ifs and stuff. Conversely, many interpreted languages make it inefficient to do "peek" kind of programming, as compared to using regexps.There are a lot of cool things you can do in script languages because they are interpreted, and one doesn't care about efficiency. Those things are simply incompatible with D. But I don't see any inherent advantages script languages should have in implementing regex.
Feb 17 2006
"Georg Wrede" <georg.wrede nospam.org> wrote in message news:43F58CDB.9090504 nospam.org...Walter Bright wrote:My answer is because they're inconvenient to use in C/C++.There are a lot of cool things you can do in script languages because they are interpreted, and one doesn't care about efficiency. Those things are simply incompatible with D. But I don't see any inherent advantages script languages should have in implementing regex.Neither do I. But the question was, how come regexps aren't _used_ as much as we'd expect.
Feb 17 2006
Georg Wrede wrote:
> Walter Bright wrote:
>> "Georg Wrede" <georg.wrede nospam.org> wrote in message news:43F53BE5.8020900 nospam.org...
>>> Using regexps in C needs a total change of paradigm. Regexps are kind of "top down" things, whereas traditionally "peeking into strings" is bottom-up programming. You'd also have to learn regexps. The trivial things are trivial in C-style too, and the non-trivial stuff gets avoided because of the up-front investment. Folks rather do nested ifs and stuff. Conversely, many interpreted languages make it inefficient to do "peek" kind of programming, as compared to using regexps.
>> There are a lot of cool things you can do in script languages because they are interpreted, and one doesn't care about efficiency. Those things are simply incompatible with D. But I don't see any inherent advantages script languages should have in implementing regex.
> Neither do I. But the question was, how come regexps aren't _used_ as much as we'd expect.

My answer is that regular expressions simply aren't powerful enough for the kinds of string processing that I need to do regularly (no pun intended). Regular expressions represent regular languages. Not all languages are regular, of course.

<rant>
My other beef with regular expressions is that there are so many competing standards for them, and on top of that some are not even standardized (i.e. MS Visual Studio .NET 2003). You never know if one implementation uses longest-match or one uses shortest-match; you never know how newlines are handled; you never know if Unicode is supported; you never know the run-time performance of your regex; you never know the syntax for selecting match indices (0 based or 1 based, use '\1'? Record match with {} or with \(\) or with () ??) etc. There are simply too many variables with regular expressions as they exist in all their forms to be relied upon. Finally, they're just plain ugly and nearly impossible to debug.
</rant>

Following that rant, I can put a positive spin here and say that the Ragel state machine compiler is an excellent model to work from! One can insert custom code between state transitions for debugging and even for complex logic! Why can't we have compiler-support for this type of power? :)

--
-----BEGIN GEEK CODE BLOCK----- Version: 3.1 GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O M-- V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e h>--->++ r+++ y+++ ------END GEEK CODE BLOCK------
James Dunne
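James Dunne's point that not all languages are regular has a standard small example: checking that parentheses are balanced to arbitrary depth, which a classical regular expression cannot do because it has no way to count. The sketch below, in present-day D, is a hand-written scanner with one extra piece of state (a depth counter), loosely in the spirit of attaching custom code to state transitions. The function name `balanced` is invented for this illustration, and the code is not generated by or modeled on Ragel's actual output.

```d
import std.stdio : writeln;

// Returns true when every ')' closes an earlier '(' and nothing is left
// open at the end. The depth counter is the "custom logic" a plain
// finite-state regex matcher has no room for.
bool balanced(string text)
{
    int depth = 0;
    foreach (c; text)
    {
        if (c == '(')
        {
            ++depth;           // transition action: entering a nesting level
        }
        else if (c == ')')
        {
            if (depth == 0)
                return false;  // a close with nothing open
            --depth;           // transition action: leaving a nesting level
        }
        // every other character leaves the state unchanged
    }
    return depth == 0;
}

void main()
{
    writeln(balanced("f(a, g(b))"));
    writeln(balanced("f(a, g(b)"));
    writeln(balanced(")("));
}
```

Many regex dialects add extensions that go beyond regular languages, which feeds directly into the rant above about implementations differing from one another.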
Feb 20 2006
kris wrote:
> I'm an advocate for getting regex support in the grammar, but I'm certainly not an advocate for tying Phobos to the compiler (RegExp has a notable resultant import set; because of this I refactored it for Ares and Mango).

Would it be correct to assume that if we had compile-time regexps, then the resultant import set would be effectively zero? (As long as we of course don't also use regexps that aren't compile-time compilable?)

Since (IMHO) most shortish programs only use literal regexes, this would be quite important.
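For readers wondering what a compile-time regexp looks like in practice, here is a minimal sketch using `ctRegex` from present-day D's `std.regex`. This facility did not exist in the 2006 compiler being discussed, and the sketch does not settle the import-set question raised above, since the module still has to be imported. The function `parseIsoDate` and the date pattern are hypothetical examples of the "literal regex in a shortish program" case.

```d
import std.conv : to;
import std.regex : ctRegex, matchFirst;
import std.stdio : writeln;

// Parses "YYYY-MM-DD" into its numeric parts. Because the pattern is a
// literal, ctRegex can build the matcher while the program is compiled
// instead of parsing the pattern at run time.
bool parseIsoDate(string text, out int year, out int month, out int day)
{
    auto isoDate = ctRegex!(`^(\d{4})-(\d{2})-(\d{2})$`);
    auto m = matchFirst(text, isoDate);
    if (m.empty)
        return false;
    year = m[1].to!int;
    month = m[2].to!int;
    day = m[3].to!int;
    return true;
}

void main()
{
    int y, mo, d;
    if (parseIsoDate("2006-02-16", y, mo, d))
        writeln(y, " / ", mo, " / ", d);
    writeln(parseIsoDate("Feb 16 2006", y, mo, d));
}
```

A pattern that is only known at run time cannot take this route and has to go through the ordinary `regex` function, which matches the caveat in the question above.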
Feb 16 2006
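For readers comparing this with later D: Phobos eventually gained a `std.regex` module whose `ctRegex` builds the matcher for a literal pattern at compile time. The sketch below only illustrates that literal-pattern case in present-day D. It says nothing about the import-set question raised above, and `firstNumber` is a hypothetical helper name, not something from the thread.

```d
import std.regex : ctRegex, matchFirst;
import std.stdio : writeln;

// Hypothetical helper: returns the first run of digits in `s`, or "" if none.
string firstNumber(string s)
{
    // The pattern is a literal, so ctRegex can build the matcher at compile time.
    auto re = ctRegex!(`[0-9]+`);
    auto m = matchFirst(s, re);
    return m.empty ? "" : m.hit;
}

void main()
{
    writeln(firstNumber("build 147 passed"));
    writeln(firstNumber("no digits here").length);
}
```

A pattern that is only known at run time would go through `regex(...)` instead, which is the case Georg excludes with "regexps that aren't compile-time compilable".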
Walter Bright wrote: "kris" <fu bar.org> wrote: Walter Bright wrote:

Sad. "in" did sound good. :-)

At least "in" has some relevant meaning to it.

It would be overloading its existing meaning, which means that it'll take semantic, rather than syntactic, analysis to disambiguate. This is potential trouble.

There are 2 things reducing its usage. First, using it has been awkward. Second, and more important, most real-world uses of regex involve literals. And that implies compile-time compilation, if they are to be perceived as efficient.

The thing is, RegExp has been in there from the beginning, but it has gone unused and even its existence is overlooked. I don't believe that's because it isn't useful - look at Ruby, Perl, Javascript, etc. Those languages heavily use regex. Is there something inherent about *script* languages that makes them nice for regex? I don't believe there is, I think it gets heavily used in those languages because the syntactic sugar makes it easy to use.

That is a problem, one that would get solved when RegExp can do wchar and dchar. That isn't a technical problem, it's more of a getting around to it problem.

Well, since grammar supported regex has elevated itself to the top of the priority list, perhaps wchar/dchar support might tag along with it?

Experience has shown that using D as a scripting language in a production environment currently needs some method of compiler-version-locking. In other words, if a script is written for D.130, then something should ensure that it stays compiled with that version, even after the system D compiler gets updated. If this is not done, then system scripts break at unexpected times (i.e. the first time that particular script is run after the compiler is updated to the first version that breaks the script). In a production environment it is plain impossible to search and test-run each D script any time the compiler gets updated.
This problem is made even worse by the run-time library not having any version identifier. It sure would be nice if one could leave the old run-time libraries as-is, and only add the new one next to them. The binaries should choose the right one automagically. The way we are using D scripting (digitalmars.D.announce:2674) is version independent (meaning we can use _any_ DMD), but of course the individual D scripts introduce compiler version dependencies by themselves. One solution to all the above-mentioned problems would of course be a "dscript.d" binary that takes care of everything. (A good starting point would be to use the above-mentioned scripting script.) Then every D script would start with but that would then totally obviate the DMD -run parameter!

I thought it fit in well with D's new capability of being runnable in a script-like fashion.

I don't think this takes away from the regex templates. I hope to use the regex templates in conjunction with this syntactic sugar to create optimized regex evaluation.

Perhaps, but I really don't see the need for this sudden rush to get regex support into the grammar. Experience with regex templates is almost certain to uncover some conflict in this regard ~ one that will likely have to be compromised to fit in with the current syntax. That's just Murphy's law. What's the big hurry?

If this opens up a reasonably broad new range of applications that D is a good fit for, that's good. I might be wrong, of course, as I've been with the bit data type (a complete botch). Match expressions don't break anything, were not expensive to implement, and the only way to see how they'll work out is to try them.

I think the current implementation is good. I don't like to see any $whatever (or even worse, $` $´ $' $") implemented!!!! We don't like to see D become Perl. And hey, Perl itself has been moving away from the $-unrememberable-fly-droppings stuff. AND even _bash_ has been starting to avoid them lately! (See man bash.)
Syntactic sugar is ok in general. But not "semantic" or "hieroglyphic" sugar. Let's see how the brand new stuff works, and whether any additional sugar ever becomes needed here!
Feb 16 2006
"Georg Wrede" <georg.wrede nospam.org> wrote in message news:43F530AC.9010101 nospam.org...

Syntactic sugar is ok in general. But not "semantic" or "hieroglyphic" sugar. Let's see how the brand new stuff works, and whether any additional sugar ever becomes needed here!

I think the $` is pretty much dead now <g>.
Feb 16 2006
Walter Bright wrote: "Kris" <fu bar.com> wrote in message news:dt0q7n$2cuo$1 digitaldaemon.com...

I'm half inclined to suggest -> for ~~, though there doesn't seem to be an obvious corresponding 'not' version.

Sean

There seem to be multiple issues here. The first one, which you ask about, is related to the syntax. At first blush, the ~~ looks like an approximate approximation, and then making D look like a malformed Perl is surely a mistake.

If you've got a better idea for tokens ~~ and !~ ?
Feb 16 2006
"Sean Kelly" <sean f4.ca> wrote in message news:dt2cra$ssu$2 digitaldaemon.com... Walter Bright wrote:

Two cons:
1) people see -> and they're going to think the C/C++ meaning. Heck, I often mistakenly use -> in D instead of '.'. For that reason -> should never result in valid D code.
2) as you suggested, !-> doesn't look too hot :-(

"Kris" <fu bar.com> wrote in message news:dt0q7n$2cuo$1 digitaldaemon.com...

I'm half inclined to suggest -> for ~~, though there doesn't seem to be an obvious corresponding 'not' version.

There seem to be multiple issues here. The first one, which you ask about, is related to the syntax. At first blush, the ~~ looks like an approximate approximation, and then making D look like a malformed Perl is surely a mistake.

If you've got a better idea for tokens ~~ and !~ ?
Feb 16 2006
Walter Bright wrote:

D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have:

    if (regular_expression ~~ string)
    {
        _match.pre
        _match.post
        _match.match(n)
    }

Should we do some aliases:

    $` => _match.pre
    $' => _match.post
    $& => _match.match(0)
    $n => _match.match(n)

? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

I'd rather make my code easier to read than write. I don't use regexps just for that reason.

--
Regards, James Dunne
Feb 15 2006
In article <dt088e$1svm$2 digitaldaemon.com>, Walter Bright says...

D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have:

    if (regular_expression ~~ string)
    {
        _match.pre
        _match.post
        _match.match(n)
    }

Fairly good.

Should we do some aliases:

    $` => _match.pre
    $' => _match.post
    $& => _match.match(0)
    $n => _match.match(n)

?

No. That's why I hate perl. I have to look in the manual to know what the hell $` means, and be careful about it really being a ` and not a '.

Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

Yes. All those $'$`$&$3 are useful only to make my eyes cross. If you want to use $, use it as an abbreviation of 'match', so you'll get:

    $pre  => _match.pre
    $post => _match.post
    $(0)  => _match.match(0)
    $(n)  => _match.match(n)

So once I know that $ stands for 'match', I can easily infer what $pre, $post, $(0) and $(3) stand for.

Ciao
---
http://www.mariottini.net/roberto/
Feb 15 2006
Walter Bright wrote:

Should we do some aliases:

    $` => _match.pre
    $' => _match.post

` is not readily available on all keyboards. Some fonts also have problems differentiating between the three Latin-1 ticks (` ' ´) (the straight tick (apostrophe) (') looks like a right tick (acute accent) (´) in many fonts).

    $& => _match.match(0)
    $n => _match.match(n)

Is n meant to be an integer expression or a numeric literal?

? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

IMHO, both. It makes D less readable for sure. I also think this repulses more people in general than it attracts some odd perl hackers. :) In this case, I don't even think the syntactic sugar makes the code much faster to write (which in reality, I think, is psychological more than a real problem). If verbosity is to be avoided, I would suggest (as in my earlier reply to this thread) that $ replaces _match. This would give:

    $.pre
    $.post
    $[0]
    $[n]   (or $.match(n), but why not overload opIndex?)

/Oskar
Feb 16 2006
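As a point of comparison for the `$.pre $.post $[0] $[n]` shape suggested above: the `std.regex` module in present-day Phobos (which postdates this thread) exposes a match result with `pre`, `post`, `hit` and index access. The sketch below assumes that module. `splitAroundPair` and the sample input are made up for illustration and are not part of any proposal here.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: splits `s` around the first "name=value" pair it finds.
// Returns [pre, whole match, name, value, post], or null when nothing matches.
string[] splitAroundPair(string s)
{
    auto m = matchFirst(s, regex(`(\w+)=(\w+)`));
    if (m.empty)
        return null;
    // m[0] is the whole match; m[1] and m[2] are the two capture groups.
    return [m.pre, m[0], m[1], m[2], m.post];
}

void main()
{
    writeln(splitAroundPair("set key=value; done"));
}
```

Here the result is an ordinary named variable rather than an implicit `_match` or `$`, which is the readability trade-off this part of the thread is arguing about.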
"Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1ccm$2ssg$1 digitaldaemon.com...

$1, $2, $3, ...

    $& => _match.match(0)
    $n => _match.match(n)

Is n meant to be an integer expression or a numeric literal?

I'm a little surprised at the uniformly negative reaction to the perl-ish notation. But that's good, as it makes the right way to go for D clear.

? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

IMHO, Both. It makes D less readable for sure. I also think this repulses more people in general than it attracts some odd perl hackers. :)

If verbosity is to be avoided, I would suggest (as in my earlier reply to this thread) that $ replaces _match. This would give: $.pre $.post $[0] $[n] (or $.match(n), but why not overload opIndex?)

That was the original plan, but when _match is of type T*, the [ ] cannot be overloaded.
Feb 16 2006
Walter Bright wrote: "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1ccm$2ssg$1 digitaldaemon.com...

So why does _match have to be a pointer? Would something like this not work? (from object.d, added void *_this, opIndex and changed this->_this)

    /* ***************************** _Match **************************** */
    /* **
     * Default type for _match.
     * Implemented as a proxy for RegExp, so that object doesn't pull in
     * the entire std.regexp.
     */
    import std.regexp;

    struct _Match
    {
        void *_this;

        char[] match(size_t n)
        {
            return (cast(RegExp)_this).match(n);
        }

        char[] opIndex(size_t n)
        {
            return match(n);
        }

        _Match opNext()
        {
            RegExp r = (cast(RegExp)_this).opNext();
            if (r)
                return cast(_Match)_this;
            r = cast(RegExp)_this;
            delete r;
            return null;
        }

        char[] pre()
        {
            return (cast(RegExp)_this).pre();
        }

        char[] post()
        {
            return (cast(RegExp)_this).post();
        }
    }

/Oskar

$.pre $.post $[0] $[n] (or $.match(n), but why not overload opIndex?)

That was the original plan, but when _match is of type T*, the [ ] cannot be overloaded.
Feb 17 2006
"Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt41c8$2a9a$1 digitaldaemon.com... Walter Bright wrote:

I wanted it to work with both pointers to structs and to class references.

"Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1ccm$2ssg$1 digitaldaemon.com...

So why does _match have to be a pointer?

$.pre $.post $[0] $[n] (or $.match(n), but why not overload opIndex?)

That was the original plan, but when _match is of type T*, the [ ] cannot be overloaded.

Would something like this not work?

The problem with that is testing:

    _Match m;
    if (m)

doesn't work if _Match is a struct.
Feb 17 2006
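A note for readers on current compilers: the `if (m)` limitation described above applies to the D of this thread. Present-day D (D2) lets a struct define `opCast` for `bool`, which is then used for `if (m)` and `!m`. The sketch below assumes a D2 compiler. `Match` and `findPlain` are hypothetical stand-ins for illustration, not the `_Match` from object.d.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical stand-in for a match result; not the _Match from object.d.
struct Match
{
    string hit;    // matched text, empty when nothing matched
    bool matched;  // true only for a successful match

    // In D2 this is what `if (m)` and `!m` call on a struct value.
    bool opCast(T : bool)() const
    {
        return matched;
    }
}

// Hypothetical helper: plain substring search standing in for a regex match.
Match findPlain(string haystack, string needle)
{
    auto i = haystack.indexOf(needle);
    return i < 0 ? Match() : Match(needle, true);
}

void main()
{
    Match none;
    if (!none)
        writeln("default-initialized Match tests false");

    auto m = findPlain("hello world", "world");
    if (m)
        writeln("matched: ", m.hit);
}
```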
Walter Bright wrote:

D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have:

    if (regular_expression ~~ string)
    {
        _match.pre
        _match.post
        _match.match(n)
    }

Should we do some aliases:

    $` => _match.pre
    $' => _match.post
    $& => _match.match(0)
    $n => _match.match(n)

? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

It is a nice feature, but I don't think such a thing should be part of the language. I don't think it is so common. Maybe I am wrong... The other thing I don't like is having too many reserved words... Personally, I wouldn't try to catch up with Ruby or Perl. I believe a comparison between D/C/C++ and a virtual machine or scripting language is foolish. But it depends on what the goals of D are: a larger audience or higher quality. Because, in my opinion, trying to catch up with a scripting language is a regression. But as I said, it is a very nice feature. I will use it myself, but I wouldn't judge a language by this...
Feb 16 2006
Sweet jesus ... the horror.

"Walter Bright" <newshound digitalmars.com> wrote in message news:dt088e$1svm$2 digitaldaemon.com...

D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have:

    if (regular_expression ~~ string)
    {
        _match.pre
        _match.post
        _match.match(n)
    }

Should we do some aliases:

    $` => _match.pre
    $' => _match.post
    $& => _match.match(0)
    $n => _match.match(n)

? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?
Feb 16 2006
Walter Bright wrote:

D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support. So, now we have:

    if (regular_expression ~~ string)
    {
        _match.pre
        _match.post
        _match.match(n)
    }

Should we do some aliases:

    $` => _match.pre
    $' => _match.post
    $& => _match.match(0)
    $n => _match.match(n)

? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable?

I haven't read this whole thread, so pardon me if this has been suggested. Why doesn't the regular expression stuff use foreach?

    struct Match
    {
        short start, end;
    }

    foreach( Match m ; "[0-9]" ~~ mystring )
    {
        writefln( "Found number:%s", mystring[m.start..m.end] );
    }

Basically this implements a callback methodology for regexes, similar to:

    void match( char[] regex, char[] str, bool delegate( Match m, char[] s ) dg );

Obviously this doesn't cover all cases, but I'm just curious why it isn't used.

-DavidM
Feb 16 2006
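The foreach idea above can be tried in present-day D without any `~~` operator, assuming the `std.regex` module from current Phobos (which postdates this thread): `matchAll` returns a range of matches that `foreach` can walk. This is only a sketch of the iteration style, not the design discussed here, and `findNumbers` is a hypothetical helper name.

```d
import std.regex : matchAll, regex;
import std.stdio : writefln;

// Hypothetical helper: collects every run of digits found in `s`.
string[] findNumbers(string s)
{
    string[] found;
    // Each `m` is one match; `m.hit` is the matched slice of `s`.
    foreach (m; matchAll(s, regex(`[0-9]+`)))
        found ~= m.hit;
    return found;
}

void main()
{
    foreach (n; findNumbers("a1b22c333"))
        writefln("Found number:%s", n);
}
```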
"David Medlock" <noone nowhere.com> wrote in message news:dt2mpk$17aj$1 digitaldaemon.com...

I havent read this whole thread, but pardon if this has been suggested. Why doesnt the regular expression stuff use foreach?

Why, indeed. Oskar has brought it up, and he and you are right. I'm going to reevaluate this based on the feedback in this thread.
Feb 16 2006
Walter Bright wrote: "David Medlock" <noone nowhere.com> wrote in message news:dt2mpk$17aj$1 digitaldaemon.com...

I agree with the "foreach" point/suggestion. IMO building regex into the language to the point where a ~~ expression automatically generates a "_match" variable is just going too far. A Match struct/class and a foreach implementation make it much more consistent and clean.

I havent read this whole thread, but pardon if this has been suggested. Why doesnt the regular expression stuff use foreach?

Why, indeed. Oskar has brought it up, and he and you are right. I'm going to reevaluate this based on the feedback in this thread.
Feb 16 2006