www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - About CTFE and pointers

reply bearophile <bearophileHUGS lycos.com> writes:
I have seen this C++11 program:
http://kaizer.se/wiki/log/post/C++_constexpr/

I have translated it to this D code:


bool notEnd(const char *s, const int n) {
    return s && s[n];
}
bool strPrefix(const char *s, const char *t, const int ns, const int nt) {
    return (s == t) ||
           !t[nt] ||
           (s[ns] == t[nt] && (strPrefix(s, t, ns+1, nt+1)));
}
bool contains(const char *s, const char *needle, const int n=0) {
    // Works only with C-style 0-terminated strings
    return notEnd(s, n) &&
           (strPrefix(s, needle, n, 0) || contains(s, needle, n+1));
}
enum int x = contains("froogler", "oogle");
void main() {
//    assert(contains("froogler", "oogle"));
}


If I run the version of the code with the run-time, it generates no errors.

If I run the version with enum with the latest dmd it gives:

test.d(6): Error: string index 5 is out of bounds [0 .. 5]
test.d(7):        called from here: strPrefix(s,t,ns + 1,nt + 1)
test.d(4):        5 recursive calls to function strPrefix
test.d(12):        called from here: strPrefix(s,needle,n,0)
test.d(12):        called from here: contains(s,needle,n + 1)
test.d(12):        called from here: contains(s,needle,n + 1)
test.d(14):        called from here: contains("froogler","oogle",0)


At first sight it looks like a CTFE bug, but studying the code a little it
seems there is a off-by-one bug in the code
(http://en.wikipedia.org/wiki/Off-by-one_error ). A quick translation to D
arrays confirms it:


bool notEnd(in char[] s, in int n) {
    return s && s[n];
}
bool strPrefix(in char[] s, in char[] t, in int ns, in int nt) {
    return (s == t) ||
           !t[nt] ||
           (s[ns] == t[nt] && (strPrefix(s, t, ns+1, nt+1)));
}
bool contains(in char[] s, in char[] needle, in int n=0) {
    // Works only with C-style 0-terminated strings
    return notEnd(s, n) &&
           (strPrefix(s, needle, n, 0) || contains(s, needle, n+1));
}
//enum int x = contains("froogler", "oogle");
void main() {
    assert(contains("froogler", "oogle"));
}


It gives at run-time:

core.exception.RangeError test(6): Range violation
----------------
...\test.d(6): bool test.strPrefix(const(char[]), const(char[]), const(int),
const(int))
...
----------------


So it seems that Don, when he has implemented the last parts of the CTFE
interpreter, has done something curious, because in some cases it seems able to
find out of bounds even when you use just raw pointers :-)

Bye,
bearophile
Feb 24 2012
parent reply =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= <xtzgzorex gmail.com> writes:
On 24-02-2012 15:08, bearophile wrote:
 I have seen this C++11 program:
 http://kaizer.se/wiki/log/post/C++_constexpr/

 I have translated it to this D code:


 bool notEnd(const char *s, const int n) {
      return s&&  s[n];
 }
 bool strPrefix(const char *s, const char *t, const int ns, const int nt) {
      return (s == t) ||
             !t[nt] ||
             (s[ns] == t[nt]&&  (strPrefix(s, t, ns+1, nt+1)));
 }
 bool contains(const char *s, const char *needle, const int n=0) {
      // Works only with C-style 0-terminated strings
      return notEnd(s, n)&&
             (strPrefix(s, needle, n, 0) || contains(s, needle, n+1));
 }
 enum int x = contains("froogler", "oogle");
 void main() {
 //    assert(contains("froogler", "oogle"));
 }


 If I run the version of the code with the run-time, it generates no errors.

 If I run the version with enum with the latest dmd it gives:

 test.d(6): Error: string index 5 is out of bounds [0 .. 5]
 test.d(7):        called from here: strPrefix(s,t,ns + 1,nt + 1)
 test.d(4):        5 recursive calls to function strPrefix
 test.d(12):        called from here: strPrefix(s,needle,n,0)
 test.d(12):        called from here: contains(s,needle,n + 1)
 test.d(12):        called from here: contains(s,needle,n + 1)
 test.d(14):        called from here: contains("froogler","oogle",0)


 At first sight it looks like a CTFE bug, but studying the code a little it
seems there is a off-by-one bug in the code
(http://en.wikipedia.org/wiki/Off-by-one_error ). A quick translation to D
arrays confirms it:


 bool notEnd(in char[] s, in int n) {
      return s&&  s[n];
 }
 bool strPrefix(in char[] s, in char[] t, in int ns, in int nt) {
      return (s == t) ||
             !t[nt] ||
             (s[ns] == t[nt]&&  (strPrefix(s, t, ns+1, nt+1)));
 }
 bool contains(in char[] s, in char[] needle, in int n=0) {
      // Works only with C-style 0-terminated strings
      return notEnd(s, n)&&
             (strPrefix(s, needle, n, 0) || contains(s, needle, n+1));
 }
 //enum int x = contains("froogler", "oogle");
 void main() {
      assert(contains("froogler", "oogle"));
 }


 It gives at run-time:

 core.exception.RangeError test(6): Range violation
 ----------------
 ....\test.d(6): bool test.strPrefix(const(char[]), const(char[]), const(int),
const(int))
 ....
 ----------------


 So it seems that Don, when he has implemented the last parts of the CTFE
interpreter, has done something curious, because in some cases it seems able to
find out of bounds even when you use just raw pointers :-)

 Bye,
 bearophile
It's not at all unlikely that the CTFE interpreter represents blocks of memory as a pointer+length pair internally. -- - Alex
Feb 24 2012
parent Don Clugston <dac nospam.com> writes:
On 24/02/12 15:18, Alex Rønne Petersen wrote:
 On 24-02-2012 15:08, bearophile wrote:
 I have seen this C++11 program:
 http://kaizer.se/wiki/log/post/C++_constexpr/

 I have translated it to this D code:


 bool notEnd(const char *s, const int n) {
 return s&& s[n];
 }
 bool strPrefix(const char *s, const char *t, const int ns, const int
 nt) {
 return (s == t) ||
 !t[nt] ||
 (s[ns] == t[nt]&& (strPrefix(s, t, ns+1, nt+1)));
 }
 bool contains(const char *s, const char *needle, const int n=0) {
 // Works only with C-style 0-terminated strings
 return notEnd(s, n)&&
 (strPrefix(s, needle, n, 0) || contains(s, needle, n+1));
 }
 enum int x = contains("froogler", "oogle");
 void main() {
 // assert(contains("froogler", "oogle"));
 }


 If I run the version of the code with the run-time, it generates no
 errors.

 If I run the version with enum with the latest dmd it gives:

 test.d(6): Error: string index 5 is out of bounds [0 .. 5]
 test.d(7): called from here: strPrefix(s,t,ns + 1,nt + 1)
 test.d(4): 5 recursive calls to function strPrefix
 test.d(12): called from here: strPrefix(s,needle,n,0)
 test.d(12): called from here: contains(s,needle,n + 1)
 test.d(12): called from here: contains(s,needle,n + 1)
 test.d(14): called from here: contains("froogler","oogle",0)


 At first sight it looks like a CTFE bug, but studying the code a
 little it seems there is a off-by-one bug in the code
 (http://en.wikipedia.org/wiki/Off-by-one_error ). A quick translation
 to D arrays confirms it:


 bool notEnd(in char[] s, in int n) {
 return s&& s[n];
 }
 bool strPrefix(in char[] s, in char[] t, in int ns, in int nt) {
 return (s == t) ||
 !t[nt] ||
 (s[ns] == t[nt]&& (strPrefix(s, t, ns+1, nt+1)));
 }
 bool contains(in char[] s, in char[] needle, in int n=0) {
 // Works only with C-style 0-terminated strings
 return notEnd(s, n)&&
 (strPrefix(s, needle, n, 0) || contains(s, needle, n+1));
 }
 //enum int x = contains("froogler", "oogle");
 void main() {
 assert(contains("froogler", "oogle"));
 }


 It gives at run-time:

 core.exception.RangeError test(6): Range violation
 ----------------
 ....\test.d(6): bool test.strPrefix(const(char[]), const(char[]),
 const(int), const(int))
 ....
 ----------------


 So it seems that Don, when he has implemented the last parts of the
 CTFE interpreter, has done something curious, because in some cases it
 seems able to find out of bounds even when you use just raw pointers :-)

 Bye,
 bearophile
It's not at all unlikely that the CTFE interpreter represents blocks of memory as a pointer+length pair internally.
Yes, that's exactly what it does. That's how it's able to implement pointers safely. That's a nice story, Thanks, bearophile.
Feb 24 2012