digitalmars.D - Another prayer for invariant strings
- Robert Fraser (8/8) Jul 12 2007 Invariant strings have been discussed before (briefly) in discussions of...
- Christian Kamm (8/10) Jul 12 2007 I was under the impression that invariant(char)[] was the same type as
- torhu (3/15) Jul 12 2007 That's my understanding too, but I'm a bit confused by that fact that
- Robert Fraser (2/18) Jul 13 2007
- Christian Kamm (9/11) Jul 13 2007 Yep, dynamic arrays behave very much like pointers or classes:
- 0ffh (8/17) Jul 13 2007 I don't quite see this point. The way I understand D2.0 strings (which
- Regan Heath (81/107) Jul 13 2007 This makes sense if you think about it from the compilers point of view.
- Robert Fraser (5/140) Jul 13 2007 Oh, didn't see your message. That's awesome, thanks! No, I didn't want t...
Invariant strings have been discussed before (briefly) in discussions of constness, however I wish to bring up the subject again more directly. The "string" alias as it is now (in D 2.0) is an odd beast. The problem is that it is invariant(char)[] instead of invariant(char[]), so that while the characters themselves are invariant, the array is mutable. This has two main problems:

1. It's confusing. There have been quite a few topics both in this newsgroup and in digitalmars.D.learn about how exactly to use the 2.0 string alias and where it's immutable/where it's not.

2. Performance. While writing my own code, I can pretend "string" is invariant (or use my own invariant(char[]) alias), but when passing to, or receiving code from library functions, this is not possible. This means that in each of these situations I must take two, performance-draining precautionary measures:
   i. Duplicate the string every time it's passed in or out of my code.
   ii. Synchronize multithreaded access to strings/acquire locks/etc.

Invariant strings have precedent: they're used in Java, .NET, Perl, Python, Ruby and quite a few other languages. And for when multiple string operations are going down, there's always char[] and .idup to fall back on, which are far better than Java's StringBuffer, etc.

So, please, Walter... consider Andrei's proposal and make "string" an alias to invariant(char[]). It'll make a lot of happiness happen.
Jul 12 2007
> The problem is that it is invariant(char)[] instead of invariant(char[])

I was under the impression that invariant(char)[] was the same type as invariant(char[]) as invariant/const never apply to the declaration itself? So

    invariant(int) == int,
    invariant(int*) == invariant(int)*
    invariant(int**) == invariant(int*)* != invariant(int)**

Or is that incorrect?

Christian
Jul 12 2007
Christian Kamm wrote:
>> The problem is that it is invariant(char)[] instead of invariant(char[])
> I was under the impression that invariant(char)[] was the same type as invariant(char[]) as invariant/const never apply to the declaration itself? So
>     invariant(int) == int,
>     invariant(int*) == invariant(int)*
>     invariant(int**) == invariant(int*)* != invariant(int)**
> Or is that incorrect?

That's my understanding too, but I'm a bit confused by the fact that Walter's examples use both variants.
Jul 12 2007
Oh, sorry, guess I was quite wrong. So does this mean I don't need to be making defensive copies of every string?

torhu Wrote:
> Christian Kamm wrote:
>>> The problem is that it is invariant(char)[] instead of invariant(char[])
>> I was under the impression that invariant(char)[] was the same type as invariant(char[]) as invariant/const never apply to the declaration itself? So
>>     invariant(int) == int,
>>     invariant(int*) == invariant(int)*
>>     invariant(int**) == invariant(int*)* != invariant(int)**
>> Or is that incorrect?
> That's my understanding too, but I'm a bit confused by the fact that Walter's examples use both variants.
Jul 13 2007
> So does this mean I don't need to be making defensive copies of every string?

Yep, dynamic arrays behave very much like pointers or classes:

    void foo(const(char)[] str)
    {
        // valid since str is not final
        // only changes local copy of array pointer and length
        str = "abc";

        // illegal! can't change the data of the array
        str[] = "abc";
    }
Jul 13 2007
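A minimal sketch of the same point in present-day D, assuming a current compiler in which `invariant` is spelled `immutable` and `string` is an alias for `immutable(char)[]`. The function names `rebind` and `rebindRef` are made up for illustration and do not come from the thread. It shows that assigning to a string parameter only rebinds the callee's copy of the pointer and length, unless the parameter is `ref`:

```d
import std.stdio : writeln;

// Hypothetical example: the parameter is a copy of the caller's slice,
// so assigning to it rebinds only the local pointer and length.
void rebind(string s)
{
    s = "changed";
    // s[0] = 'X'; // would not compile: the characters are immutable
}

// Hypothetical example: with `ref`, the caller's slice itself is rebound.
void rebindRef(ref string s)
{
    s = "changed";
}

void main()
{
    string a = "original";

    rebind(a);
    assert(a == "original"); // caller's slice is untouched

    rebindRef(a);
    assert(a == "changed");  // caller's slice now refers to other data

    writeln(a);
}
```

Since the characters can never be written through a `string`, a callee that receives one by value has no way to alter what the caller sees, which is why no defensive copy is needed in that case.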
Robert Fraser wrote:
> 2. Performance. While writing my own code, I can pretend "string" is invariant (or use my own invariant(char[]) alias), but when passing to, or receiving code from library functions, this is not possible. This means that in each of these situations I must take two, performance-draining precautionary measures:
>    i. Duplicate the string every time it's passed in or out of my code.
>    ii. Synchronize multithreaded access to strings/acquire locks/etc.

I don't quite see this point. The way I understand D2.0 strings (which may be like so much wrong, but still), with invariant(char)[] you can be sure the characters will never change, so there is totally no reason to duplicate that string. Only the pointer to the characters and the length information are mutable.

> Invariant strings have precedent: they're used in Java, .NET, Perl, Python, Ruby and quite a few other languages.

In my book, precedent in itself is no argument - except for lemmings. ;-)

Regards, Frank
Jul 13 2007
(disclaimer, I have done only the testing shown at the end of this post)

Robert Fraser wrote:
> Invariant strings have been discussed before (briefly) in discussions of constness, however I wish to bring up the subject again more directly. The "string" alias as it is now (in D 2.0) is an odd beast. The problem is that it is invariant(char)[] instead of invariant(char[]), so that while the characters themselves are invariant, the array is mutable.

This makes sense if you think about it from the compiler's point of view. It has placed the characters themselves in ROM but the array reference is in RAM so its pointer and length can change.

So, this is valid:

    invariant(char)[] a = "foo";
    invariant(char)[] b = "bar";
    b = a;

But these are invalid:

    char[] p;
    a[0] = 'a';  //for any given rvalue
    b[] = a[];   //and slicing variants
    p = a;       //p cannot point to invariant(char)

If you want to prevent the reference from changing make it 'final', eg.

    final invariant(char)[] a;

> This has two main problems:
> 1. It's confusing. There have been quite a few topics both in this newsgroup and in digitalmars.D.learn about how exactly to use the 2.0 string alias and where it's immutable/where it's not.

I won't argue as to whether it's confusing, but to me it seems the basic concept is: "A 'string' reference isn't immutable (or rather 'final'), but its data is (immutable)".

> 2. Performance. While writing my own code, I can pretend "string" is invariant (or use my own invariant(char[]) alias), but when passing to, or receiving code from library functions, this is not possible.

When you pass a string to a function, that function gets a /copy/ of the reference. So, there is technically no need for the copied reference to be invariant (or rather 'final'). Changes to the reference in the function *do not* propagate back to the caller.

Unless, however, the parameter is 'ref', in which case changes to the reference propagate back to the caller. In this case, if your reference is final, DMD will error; see the test case below.

In short, if you use 'final' on your strings then even if you call a library function which takes a 'ref' the compiler will protect you.

> This means that in each of these situations I must take two, performance-draining precautionary measures:
>    i. Duplicate the string every time it's passed in or out of my code.
>    ii. Synchronize multithreaded access to strings/acquire locks/etc.

You do not need to sync access to invariant data, but you may need to sync access to an array reference (if your code, or library code, might change it). To prevent changes make your strings final.

> Invariant strings have precedent: they're used in Java, .NET, Perl, Python, Ruby and quite a few other languages. And for when multiple string operations are going down, there's always char[] and .idup to fall back on, which are far better than Java's StringBuffer, etc.

Does Java prevent you re-assigning an invariant string reference? If so, are they implicitly 'final' then?

> So, please, Walter... consider Andrei's proposal and make "string" an alias to invariant(char[]). It'll make a lot of happiness happen.

I think a greater understanding of the current system is required before we start opting for changes.

- Regan Heath

Test cases:

    void main()
    {
        invariant(char)[] p1 = "one";
        invariant(char[]) p2 = "two";
        final invariant(char[]) p3 = "three";
        char[] p4 = "four".dup;
        const(char)[] p5 = "five";
        const(char[]) p6 = "six";

        //p1[0] = 'a'; //Error: p1[0] is not mutable
        //p2[0] = 'a'; //Error: p2[0] is not mutable
        //p3[0] = 'a'; //Error: p3[0] is not mutable
        p4[0] = 'a'; //ok
        //p5[0] = 'a'; //Error: p5[0] is not mutable
        //p6[0] = 'a'; //Error: p6[0] is not mutable

        //p1[] = p2[]; //Error: slice p1[] is not mutable
        //p2[] = p1[]; //Error: slice p2[] is not mutable
        //p3[] = p1[]; //Error: slice p3[] is not mutable
        p4[] = p1[]; //ok
        //p5[] = p1[]; //Error: slice p5[] is not mutable
        //p6[] = p1[]; //Error: slice p6[] is not mutable

        p1 = p2; //ok
        p2 = p1; //ok
        //p3 = p1; //variable invariant.p3 cannot modify final/const/invariant variable 'p3'
        //p4 = p1; //Error: cannot implicitly convert expression (p1) of type invariant(char)[] to char[]
        p5 = p1; //ok
        p6 = p1; //ok

        foo(p3); //variable invariant.main.p3 cannot modify final/const/invariant variable 'p3'
    }

    /*
    void foo(final invariant(char)[] param)
    {
        //param = "test"; //variable invariant.foo.param cannot modify final/const/invariant variable 'param'
    }
    */

    void foo(ref invariant(char)[] param)
    {
        param = "test"; //variable invariant.foo.param cannot modify final/const/invariant variable 'param'
    }
Jul 13 2007
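For comparison, here is a rough sketch of how the distinction discussed above looks in present-day D. It assumes a current compiler, where `invariant` has become `immutable`, the `final` storage class no longer exists, and a type constructor wrapping the whole array type (as in `immutable(char[])`) also makes the variable itself non-rebindable, so some results differ from the 2007 test output. The helper `freeze` is a hypothetical name introduced only for this example:

```d
// Hypothetical helper: take an immutable snapshot of a possibly mutable buffer.
string freeze(const(char)[] buf)
{
    return buf.idup;
}

void main()
{
    immutable(char)[] tail = "one"; // rebindable slice of immutable chars
    immutable(char[]) full = "two"; // in current D the slice is immutable too

    tail = full; // ok: only the local pointer and length are rebound
    static assert(!__traits(compiles, full = tail));   // cannot rebind full
    static assert(!__traits(compiles, tail[0] = 'a')); // cannot write the chars

    char[] buf = "four".dup;
    static assert(!__traits(compiles, buf = tail)); // no implicit immutable -> mutable

    string snap = freeze(buf);
    buf[0] = 'x';           // mutating the buffer afterwards...
    assert(snap == "four"); // ...does not affect the snapshot
}
```

A copy via `.idup` is only needed when the source is a mutable `char[]` that someone else may still write to; data that is already immutable can be shared as-is.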
Oh, didn't see your message. That's awesome, thanks! No, I didn't want the references to be final, just the data. Basically, I want to ensure that functions I call won't mess around with my data.

Thanks! All the best, Fraser
Jul 13 2007