digitalmars.D - Regex in ctfe?

reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
Is it possible to use std.regex in CTFE?
I kinda had it in my mind that ctRegex was ctfe-able, but it seems
it's not. It just generates an efficient regex engine at compile time
which only actually works on data at runtime.
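
A minimal sketch of the split described above, assuming a hypothetical `key=value` input string: the pattern is turned into a specialized engine at compile time, while the match itself runs at runtime. The commented-out line shows the compile-time match that the thread reports as not working.

```d
import std.regex : ctRegex, matchFirst;

void main()
{
    // The pattern is compiled into a specialized matcher at compile time...
    auto re = ctRegex!(`(\w+)=(\d+)`);

    // ...but the matcher is only applied to data at runtime.
    auto m = matchFirst("width=42", re);
    assert(!m.empty);
    assert(m[1] == "width");
    assert(m[2] == "42");

    // Forcing the match itself into CTFE is what this thread reports as
    // failing (as of early 2016):
    // enum value = matchFirst("width=42", re)[2];
}
```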

I have a string import destined for a mixin, and I want to parse it
with regex, but I haven't been able to make it work. The docs mention
nothing about this possibility.
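
For illustration only, here is a sketch of the use case with a hand-written scanner instead of regex, since plain slicing and loops do run in CTFE. The `key=value` format, the names `source` and `generateDecls`, and the file name in the comment are all assumptions; a real string import would use `import("config.txt")` together with the `-J` compiler switch.

```d
// Stand-in for a string import such as import("config.txt") (hypothetical file).
enum source = "width=42\nheight=7\n";

// Builds D declarations from "key=value" lines using only slicing and
// concatenation, which the CTFE interpreter supports.
string generateDecls(string text)
{
    string code;
    size_t lineStart = 0;
    foreach (i; 0 .. text.length + 1)
    {
        if (i == text.length || text[i] == '\n')
        {
            string line = text[lineStart .. i];
            lineStart = i + 1;
            foreach (j, c; line)
            {
                if (c == '=')
                {
                    code ~= "enum int " ~ line[0 .. j] ~ " = "
                          ~ line[j + 1 .. $] ~ ";\n";
                    break;
                }
            }
        }
    }
    return code;
}

// generateDecls runs at compile time because mixin needs a constant string.
mixin(generateDecls(source));

static assert(width == 42 && height == 7);

void main() {}
```

This does not replace regex for anything complicated, but it shows that the mixin pipeline itself works at compile time once the parsing code avoids constructs CTFE rejects.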
Jan 25 2016
next sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
  What exactly are you trying to do, I wonder?

  If it uses enough of a subset then maybe... I'll assume you 
tried and failed already though.

  It seems more likely that an external library like PCRE would be 
compiled into the compiler instead and you could hook into it; 
but with probable mismatches of the regex engine and features I'd 
wonder if it would cause problems (at least until the compiler is 
converted to D and can implement its own D libraries).
Jan 25 2016
parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 26 January 2016 at 12:58, Era Scarecrow via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
  What exactly are you trying to do, I wonder?

  If it uses enough of a subset then maybe... I'll assume you tried and
 failed already though.

  It seems more likely that an external library like PCRE would be compiled
 into the compiler instead and you could hook into it; but with probable
 mismatches of the regex engine and features I'd wonder if it would cause
 problems (at least until the compiler is converted to D and can implement
 its own D libraries).
I thought DMD was fully converted to D? The compiler could make use of std.regex if it wanted to...
Jan 25 2016
parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Tuesday, 26 January 2016 at 05:10:58 UTC, Manu wrote:
 I thought DMD was fully converted to D? The compiler could make 
 use of std.regex if it wanted to...
My experience is at least a year or two out of date. In one of the DConf 2015 talks, someone was working on a C++-to-D converter and planned to swap over from C++ to D once it was complete enough. If that already happened, I was unaware of it.
Jan 26 2016
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 26 January 2016 at 11:35:24 UTC, Era Scarecrow wrote:
 On Tuesday, 26 January 2016 at 05:10:58 UTC, Manu wrote:
 I thought DMD was fully converted to D? The compiler could 
 make use of std.regex if it wanted to...
My experience is at least a year or two out of date. In one of the DConf 2015 talks, someone was working on a C++-to-D converter and planned to swap over from C++ to D once it was complete enough. If that already happened, I was unaware of it.
Yes, the swap happened. In a recent git checkout you will have the .cpp files and the .d files side by side. Matching regex at CTFE should work with a bit of modification.
Jan 26 2016
parent reply w0rp <devw0rp gmail.com> writes:
Unless I'm mistaken, I think the compiler for regex currently 
works at compile time, but not the matcher. Maybe someone who 
knows the module could add support for that.
Jan 26 2016
parent reply Pierre Krafft <kpierre+dlang outlook.com> writes:
On Tuesday, 26 January 2016 at 12:47:26 UTC, w0rp wrote:
 Unless I'm mistaken, I think the compiler for regex currently 
 works at compile time, but not the matcher. Maybe someone who 
 knows the module could add support for that.
That's correct. I looked into this a while ago and found out that the matcher uses @trusted code with pointers. Getting support for CTFE would require a rewrite of most of the matching engine and would probably result in a rather big performance hit.
Jan 27 2016
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Jan 27, 2016 at 06:47:59PM +0000, Pierre Krafft via Digitalmars-d wrote:
 On Tuesday, 26 January 2016 at 12:47:26 UTC, w0rp wrote:
Unless I'm mistaken, I think the compiler for regex currently works
at compile time, but not the matcher. Maybe someone who knows the
module could add support for that.
That's correct. I looked into this a while ago and found out that the matcher uses @trusted code with pointers. Getting support for CTFE would require a rewrite of most of the matching engine and would probably result in a rather big performance hit.
An alternative would be to use `if (__ctfe)` branches in the code to switch to a different matching engine when in CTFE, and leave the runtime code untouched.

T -- Don't throw out the baby with the bathwater. Use your hands...
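
A minimal sketch of the `if (__ctfe)` idea, assuming a hypothetical helper `countDigits` that stands in for a matching engine (it is not part of std.regex). The CTFE branch uses plain iteration, while the runtime branch keeps its pointer-based loop unchanged.

```d
// Hypothetical helper: counts ASCII digits in a string.
size_t countDigits(const(char)[] s)
{
    if (__ctfe)
    {
        // Compile-time path: simple iteration that CTFE can interpret.
        size_t n;
        foreach (c; s)
        {
            if (c >= '0' && c <= '9')
                ++n;
        }
        return n;
    }
    else
    {
        // Runtime path: the pointer-based loop stays as it was.
        size_t n;
        auto p = s.ptr;
        auto end = p + s.length;
        for (; p != end; ++p)
        {
            if (*p >= '0' && *p <= '9')
                ++n;
        }
        return n;
    }
}

// Initializing an enum forces evaluation through the CTFE branch.
enum atCompileTime = countDigits("a1b22c333");
static assert(atCompileTime == 6);

void main()
{
    // The same call at runtime takes the pointer-based branch.
    assert(countDigits("a1b22c333") == 6);
}
```

Because `__ctfe` is only true inside the interpreter, the runtime branch costs nothing at compile time and the CTFE branch adds nothing to the generated code path that matters for speed.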
Jan 27 2016
parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Wednesday, 27 January 2016 at 19:05:01 UTC, H. S. Teoh wrote:
 An alternative would be to use `if (__ctfe)` branches in the 
 code to switch to a different matching engine when in CTFE, and 
 leave the runtime code untouched.
Hmmm... As I recall there are two major engines that run regex (DFA & NFA). If you don't need the more complex engine (which has forward/back referencing, named capturing, etc.) then maybe the simpler engine would work. But it would seem easier/simpler if the compiler just hooked in the regex that's compiled into it instead and passed the data back and forth. It does mean you can't test regex changes during the CTFE stage (assuming you're actively changing/building it), but it wouldn't have the performance hit either.
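
To give a feel for how small a CTFE-friendly "simpler engine" can be, here is a sketch of a tiny backtracking matcher in the style of the well-known Kernighan and Pike example. It is not std.regex's engine and supports only literals, `.`, `*`, `^` and `$`; the function names are made up for this illustration.

```d
// Matches re at the very start of text.
bool matchHere(string re, string text)
{
    if (re.length == 0)
        return true;
    if (re.length >= 2 && re[1] == '*')
        return matchStar(re[0], re[2 .. $], text);
    if (re == "$")
        return text.length == 0;
    if (text.length != 0 && (re[0] == '.' || re[0] == text[0]))
        return matchHere(re[1 .. $], text[1 .. $]);
    return false;
}

// Matches zero or more occurrences of c followed by re, shortest first.
bool matchStar(char c, string re, string text)
{
    while (true)
    {
        if (matchHere(re, text))
            return true;
        if (text.length == 0 || (c != '.' && text[0] != c))
            return false;
        text = text[1 .. $];
    }
}

// Searches for re anywhere in text unless it is anchored with '^'.
bool simpleMatch(string re, string text)
{
    if (re.length != 0 && re[0] == '^')
        return matchHere(re[1 .. $], text);
    foreach (i; 0 .. text.length + 1)
    {
        if (matchHere(re, text[i .. $]))
            return true;
    }
    return false;
}

// All of these are evaluated by the compiler via CTFE.
static assert(simpleMatch("^wid.*=4", "width=42"));
static assert(!simpleMatch("^x", "width=42"));
static assert(simpleMatch("h=4*2$", "width=42"));

void main() {}
```

It uses only slicing and recursion, so there is nothing here for CTFE to reject, but it also has none of the captures or lookaround that the full engines provide.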
Jan 27 2016
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Jan 27, 2016 at 07:54:21PM +0000, Era Scarecrow via Digitalmars-d wrote:
 On Wednesday, 27 January 2016 at 19:05:01 UTC, H. S. Teoh wrote:
An alternative would be to use `if (__ctfe)` branches in the code to
switch to a different matching engine when in CTFE, and leave the
runtime code untouched.
Hmmm... As I recall there are two major engines that run regex (DFA & NFA). If you don't need the more complex engine (which has forward/back referencing, named capturing, etc.) then maybe the simpler engine would work. But it would seem easier/simpler if the compiler just hooked in the regex that's compiled into it instead and passed the data back and forth. It does mean you can't test regex changes during the CTFE stage (assuming you're actively changing/building it), but it wouldn't have the performance hit either.
Currently, AFAIK, (d)dmd does not use anything from Phobos. I'm not sure if this is a temporary situation, or there's a strong reason for it.

T -- Those who don't understand D are condemned to reinvent it, poorly. -- Daniel N
Jan 27 2016
parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Wednesday, 27 January 2016 at 20:25:33 UTC, H. S. Teoh wrote:
 On Wed, Jan 27, 2016 at 07:54:21PM +0000, Era Scarecrow via 
 Digitalmars-d wrote:
  But it would seem easier/simpler if the compiler just hooks 
 in the  Regex that's compiled into it instead and passes the 
 data back and forth. It does mean you can't test Regex changes 
 during the CTFE stage (assuming you're actively 
 changing/building it) but it wouldn't  have the performance 
 hit either.
Currently, AFAIK, (d)dmd does not use anything from Phobos. I'm not sure if this is a temporary situation, or there's a strong reason for it.
Probably because it was converted from C++ to D so recently that it just hasn't tried to include anything. And leaving the two projects (library and compiler) separate does allow a certain level of purity.
Jan 27 2016
parent tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 27 January 2016 at 21:09:19 UTC, Era Scarecrow 
wrote:
 On Wednesday, 27 January 2016 at 20:25:33 UTC, H. S. Teoh wrote:
 Currently, AFAIK, (d)dmd does not use anything from Phobos. 
 I'm not sure if this is a temporary situation, or there's a 
 strong reason for it.
Probably because it was converted from C++ to D so recently that it just hasn't tried to include anything. And leaving the two projects (library and compiler) separate does allow a certain level of purity.
This was a deliberate decision to avoid creating circular dependencies between the compiler and Phobos, which are arguably too tightly coupled already. (I remember seeing it discussed somewhere around the time of the transition to DDMD; I'm too lazy to dig up a link right now.)
Jan 31 2016
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tuesday, 26 January 2016 at 01:34:09 UTC, Manu wrote:
 Is it possible to use std.regex in CTFE?
(...)
 I have a string import destined for a mixin, and I want to 
 parse it with regex, but I haven't been able to make it work. 
 The docs mention nothing about this possibility.
Hi Manu, a possible solution for that would be to use my Pegged (https://github.com/PhilippeSigaud/Pegged) project. It's a parser generator that creates compile-time (and runtime) parsers. It's not directly compatible with Phobos regexes, but it's quite easy to extract the matched strings. Bastiaan Veelo recently added support for left-recursive grammars, which means we can now generate parsers for many more grammars (although if your parsing need is covered by regexes, that probably won't interest you...) Philippe
Jan 31 2016