digitalmars.D - Implicit conversion of concatenation result to immutable
- Per Nordlöw (28/28) Apr 01 2021 Can somebody explain the logic behind the compiler disallowing
- Steven Schveighoffer (5/7) Apr 01 2021 That makes no sense. The compiler should allow that conversion. It can
- Ali Çehreli (5/12) Apr 01 2021 It should even return char[], no? Freshly copied data should belong to ...
- Per Nordlöw (2/4) Apr 01 2021 Precisely.
- Per Nordlöw (16/17) Apr 01 2021 I think it returning `char[]` is sound.
- Steven Schveighoffer (19/25) Apr 01 2021 It shouldn't lack this knowledge. That's not soundness, it's just annoyi...
- H. S. Teoh (10/23) Apr 01 2021 [...]
- Ali Çehreli (10/35) Apr 01 2021 I admit I've been neglecting indirections. If we are dealing with
- Steven Schveighoffer (4/25) Apr 02 2021 It is considered pure, note that I'm using concatenation inside the
- Q. Schroll (14/27) Apr 01 2021 Found by accident that the code does not compile with -dip1000 on
- H. S. Teoh (28/48) Apr 01 2021 It is illegal to implicitly convert const to immutable, because there
- Ali Çehreli (6/13) Apr 01 2021 Yes, that's tricky for append because one of many slices does own the
- H. S. Teoh (9/18) Apr 01 2021 [...]
- Steven Schveighoffer (4/20) Apr 01 2021 Yes, always an allocation. See point 5 here:
- Per Nordlöw (4/6) Apr 01 2021 Good, then the implicit conversion should be allowed. Anybody up
- Ali Çehreli (6/12) Apr 01 2021 As I mentioned elsewhere in this thread, the element type must not have ...
- Per Nordlöw (17/21) Apr 01 2021 Thanks.
- Per Nordlöw (13/14) Apr 01 2021 That creates an unnecessary GC allocation in cases such
- Per Nordlöw (8/16) Apr 01 2021 The alternative
- Steven Schveighoffer (9/22) Apr 01 2021 But it's not.
- Per Nordlöw (4/5) Apr 01 2021 In the meanwhile `.assumeUnique` from `std.exception` should be
- Per Nordlöw (18/20) Apr 01 2021 I'm very intrigued by the fact the this issue hasn't been
Can somebody explain the logic behind the compiler disallowing both line 3 and 4 in

```d
const(char)[] x;
string y;
string z1 = x ~ y; // errors
string z2 = y ~ x; // errors
```

erroring as

```
Error: cannot implicitly convert expression `x ~ cast(const(char)[])y` of type `char[]` to `string`
Error: cannot implicitly convert expression `cast(const(char)[])y ~ x` of type `char[]` to `string`
```

Has this something to do with the compiler being defensive about possible in-place appending or prepending to the arguments (in this case `x` and `y`) passed to the array concatenation expression? For instance, could `x ~ y` return either

- a back-extended slice `x[0 .. x.length + y.length]` with `y` appended at the back or
- a front-extended slice `y[-x.length .. y.length]` with `x` prepended to the front

provided the GC has information about available free memory there?

This problem regularly crops up for me when assembling strings passed as a string parameter to, for instance, an exception constructor.
Apr 01 2021
On 4/1/21 5:21 PM, Per Nordlöw wrote:
> Error: cannot implicitly convert expression `x ~ cast(const(char)[])y` of type `char[]` to `string`

That makes no sense. The compiler should allow that conversion. It can clearly prove that the result doesn't derive from the parameters.

I'm surprised it doesn't return const(char)[].

-Steve
Apr 01 2021
On 4/1/21 2:41 PM, Steven Schveighoffer wrote:
> On 4/1/21 5:21 PM, Per Nordlöw wrote:
>> Error: cannot implicitly convert expression `x ~ cast(const(char)[])y` of type `char[]` to `string`
>
> That makes no sense. The compiler should allow that conversion. It can clearly prove that the result doesn't derive from the parameters.
>
> I'm surprised it doesn't return const(char)[].
>
> -Steve

It should even return char[], no? Freshly copied data should belong to the programmer.

Ali
Apr 01 2021
On Thursday, 1 April 2021 at 21:45:08 UTC, Ali Çehreli wrote:
> It should even return char[], no? Freshly copied data should belong to the programmer.

Precisely.
Apr 01 2021
On Thursday, 1 April 2021 at 21:41:06 UTC, Steven Schveighoffer wrote:
> I'm surprised it doesn't return const(char)[].

I think it returning `char[]` is sound. What's unsound is that the compiler lacks knowledge of it being a unique slice (no aliasing). The situation is analogous with

```d
alias T = int;
const n = 42;

auto x = new T[n]; // no conversion
static assert(is(typeof(x) == T[]));

immutable(T)[] y = new T[n]; // implicit conversion to immutable allowed
static assert(is(typeof(y) == immutable(T)[]));
```

which is currently accepted by the compiler.
Apr 01 2021
On 4/1/21 5:51 PM, Per Nordlöw wrote:
> On Thursday, 1 April 2021 at 21:41:06 UTC, Steven Schveighoffer wrote:
>> I'm surprised it doesn't return const(char)[].
>
> I think it returning `char[]` is sound.

Not saying it's unsound, it just surprises me.

> What's unsound is that the compiler lacks knowledge of it being a unique slice (no aliasing).

It shouldn't lack this knowledge. That's not soundness, it's just annoying. I'm not saying I disagree with you, it's a limitation and I think there's no good explanation for the problem.

To illustrate the point more cleanly:

```d
auto concat(T, U)(T[] x, U[] y) pure
{
    return x ~ y;
}

void main()
{
    string x;
    const(char)[] y;
    string z = concat(x, y); // compiles
}
```

-Steve
Apr 01 2021
On Thu, Apr 01, 2021 at 06:34:04PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
[...]
> ```d
> auto concat(T, U)(T[] x, U[] y) pure
> {
>     return x ~ y;
> }
>
> void main()
> {
>     string x;
>     const(char)[] y;
>     string z = concat(x, y); // compiles
> }
> ```
[...]

Put this way, the solution becomes obvious: `~` should be considered a pure operation. Then the compiler (in theory) ought to be able to infer uniqueness from `x ~ y`, and consequently allow implicit conversion to immutable.

T

-- 
Never criticize a man until you've walked a mile in his shoes. Then when you do criticize him, you'll be a mile away and he won't have his shoes.
Apr 01 2021
On 4/1/21 3:55 PM, H. S. Teoh wrote:
> On Thu, Apr 01, 2021 at 06:34:04PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> [...]
>> ```d
>> auto concat(T, U)(T[] x, U[] y) pure
>> {
>>     return x ~ y;
>> }
>>
>> void main()
>> {
>>     string x;
>>     const(char)[] y;
>>     string z = concat(x, y); // compiles
>> }
>> ```
> [...]
>
> Put this way, the solution becomes obvious: `~` should be considered a pure operation. Then the compiler (in theory) ought to be able to infer uniqueness from `x ~ y`, and consequently allow implicit conversion to immutable.
>
> T

I admit I've been neglecting indirections. If we are dealing with const(S)[], S being a user defined type with indirections, then concatenation cannot be S[] because the original S object would not allow mutation through them. This is still within compiler's attribute inference, right?

On the other hand, this probably would complicate template code: I can imagine a template code is tested with simple types and works but fails as soon as used with a const(S)[] type at a customer site.

Ali
Apr 01 2021
On 4/1/21 6:55 PM, H. S. Teoh wrote:
> On Thu, Apr 01, 2021 at 06:34:04PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> [...]
>> ```d
>> auto concat(T, U)(T[] x, U[] y) pure
>> {
>>     return x ~ y;
>> }
>>
>> void main()
>> {
>>     string x;
>>     const(char)[] y;
>>     string z = concat(x, y); // compiles
>> }
>> ```
> [...]
>
> Put this way, the solution becomes obvious: `~` should be considered a pure operation. Then the compiler (in theory) ought to be able to infer uniqueness from `x ~ y`, and consequently allow implicit conversion to immutable.

It is considered pure, note that I'm using concatenation inside the function (which is marked pure).

-Steve
Apr 02 2021
On Thursday, 1 April 2021 at 22:34:04 UTC, Steven Schveighoffer wrote:
> To illustrate the point more cleanly:
>
> ```d
> auto concat(T, U)(T[] x, U[] y) pure
> {
>     return x ~ y;
> }
>
> void main()
> {
>     string x;
>     const(char)[] y;
>     string z = concat(x, y); // compiles
> }
> ```

Found by accident that the code does not compile with -dip1000 on 2.095 (run.dlang.org). Actually, I tried inlining the function template and to my surprise, whether it passes or not depends on -dip1000:

```D
string z1 = (() => x ~ y)(); // fails with and without -dip1000
string z2 = ((x, y) => x ~ y)(x, y); // passes without, fails with -dip1000
```

It makes no difference adding `pure` to any of those since it's inferred.
Apr 01 2021
On Thu, Apr 01, 2021 at 09:21:02PM +0000, Per Nordlöw via Digitalmars-d wrote:
> Can somebody explain the logic behind the compiler disallowing both line 3 and 4 in
>
> ```d
> const(char)[] x;
> string y;
> string z1 = x ~ y; // errors
> string z2 = y ~ x; // errors
> ```
>
> erroring as
>
> ```
> Error: cannot implicitly convert expression `x ~ cast(const(char)[])y` of type `char[]` to `string`
> Error: cannot implicitly convert expression `cast(const(char)[])y ~ x` of type `char[]` to `string`
> ```

It is illegal to implicitly convert const to immutable, because there may be a mutable alias to the data somewhere. If so, it will violate immutability. For example:

    char[] evil;
    const(char)[] x = evil; // x now aliases evil
    string y = x;           // y now aliases evil <----
    evil[0] = 'a';          // Oops, immutability violated

The implicit conversion on the line marked `<----` is the cause of the problem.

Now, when you append a string to a const(char)[], the compiler has to promote `string` to `const(char)[]` first, so that the operands of ~ have the same type. And obviously, the result of concatenating two const(char)[] must be const(char)[], since you don't know if one of them may have mutable aliases somewhere else. So the result must likewise be const(char)[].

One may argue that appending in general will reallocate, and once reallocated it will be unique, and therefore safe to implicitly convert to immutable. However, in general we cannot guarantee this, e.g., one of the strings could be empty and not reallocate at runtime, so it may continue to be aliased by some mutable reference somewhere else. So the result must be typed as const(char)[], along with the restriction that it cannot implicitly convert to immutable.

[...]
> This problem regularly crops up for me during assembling of strings passed as a string parameter for instance an exception constructor.

Just use .idup on the result.

T

-- 
What's a "hot crossed bun"? An angry rabbit.
Apr 01 2021
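The aliasing hazard and the `.idup` workaround described in the post above can be sketched as follows. This is only an illustration: `freeze` is a hypothetical helper name, not something from the thread or from Phobos.

```d
// Hypothetical helper: take any const view and return an independent
// immutable copy. `.idup` always allocates and copies the elements.
string freeze(const(char)[] view)
{
    return view.idup;
}

void main()
{
    char[] evil = "abc".dup;      // mutable data
    const(char)[] view = evil;    // const view that aliases `evil`

    string frozen = freeze(view); // explicit copy, no aliasing

    evil[0] = 'z';                // mutate through the mutable alias

    assert(view == "zbc");        // the const view observes the change
    assert(frozen == "abc");      // the immutable copy does not
}
```

The copy is what makes the conversion legal: `frozen` owns its own storage, so no mutable reference to it can exist.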
On 4/1/21 2:59 PM, H. S. Teoh wrote:
> the result of concatenating two const(char)[] must be const(char)[], since you don't know if one of them may have mutable aliases somewhere else. So the result must likewise be const(char)[].
>
> One may argue that appending in general will reallocate, and once reallocated it will be unique, and therefore safe to implicitly convert to immutable. However, in general we cannot guarantee this

Yes, that's tricky for append because one of many slices does own the potential bytes after the array and will append elements in there. However, concatenation always makes a new array, right? I think the result can be char[] in that case.

Ali
Apr 01 2021
On Thu, Apr 01, 2021 at 03:07:09PM -0700, Ali Çehreli via Digitalmars-d wrote:
> On 4/1/21 2:59 PM, H. S. Teoh wrote:
> [...]
>> One may argue that appending in general will reallocate, and once reallocated it will be unique, and therefore safe to implicitly convert to immutable. However, in general we cannot guarantee this
>
> Yes, that's tricky for append because one of many slices does own the potential bytes after the array and will append elements in there. However, concatenation always makes a new array, right? I think the result can be char[] in that case.
[...]

If one of the arguments is an empty array, does concatenation allocate a new array anyway? Or does it simply return the other argument? (I don't know.) If not, then we cannot make it implicitly convertible.

T

-- 
Genius may have its limitations, but stupidity is not thus handicapped. -- Elbert Hubbard
Apr 01 2021
On 4/1/21 6:34 PM, H. S. Teoh wrote:
> On Thu, Apr 01, 2021 at 03:07:09PM -0700, Ali Çehreli via Digitalmars-d wrote:
>> On 4/1/21 2:59 PM, H. S. Teoh wrote:
>> [...]
>>> One may argue that appending in general will reallocate, and once reallocated it will be unique, and therefore safe to implicitly convert to immutable. However, in general we cannot guarantee this
>>
>> Yes, that's tricky for append because one of many slices does own the potential bytes after the array and will append elements in there. However, concatenation always makes a new array, right? I think the result can be char[] in that case.
> [...]
>
> If one of the arguments is an empty array, does concatenation allocate a new array anyway? Or does it simply return the other argument? (I don't know.) If not, then we cannot make it implicitly convertible.

Yes, always an allocation. See point 5 here: https://dlang.org/spec/arrays.html#array-concatenation

-Steve
Apr 01 2021
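The spec behaviour cited above (concatenation copies even when one operand is empty) can be checked with a small sketch. It assumes a normal GC-enabled build, and `concatIsFresh` is a hypothetical helper written for this illustration.

```d
// Hypothetical helper: returns true when `a ~ b` yields storage that is
// distinct from both (non-empty) operands.
bool concatIsFresh(const(char)[] a, const(char)[] b)
{
    auto c = a ~ b;
    // A fresh allocation shares its base pointer with neither operand.
    return (a.length == 0 || c.ptr !is a.ptr)
        && (b.length == 0 || c.ptr !is b.ptr);
}

void main()
{
    char[] text = "hello".dup;
    char[] empty;

    assert(concatIsFresh(text, empty)); // empty right operand still copies
    assert(concatIsFresh(empty, text)); // empty left operand still copies

    auto copy = text ~ empty;
    copy[0] = 'j';                      // mutate the concatenation result
    assert(text == "hello");            // the original is untouched
}
```

This is the property that distinguishes `~` from `~=`, which may extend a slice in place when the runtime finds room after it.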
On Thursday, 1 April 2021 at 22:35:57 UTC, Steven Schveighoffer wrote:
> Yes, always an allocation. See point 5 here: https://dlang.org/spec/arrays.html#array-concatenation

Good, then the implicit conversion should be allowed. Anybody up for the job? If not, I'm gonna look into it.
Apr 01 2021
On 4/1/21 3:44 PM, Per Nordlöw wrote:
> On Thursday, 1 April 2021 at 22:35:57 UTC, Steven Schveighoffer wrote:
>> Yes, always an allocation. See point 5 here:
>> https://dlang.org/spec/arrays.html#array-concatenation
>
> Good, then the implicit conversion should be allowed. Anybody up for the job? If not, I'm gonna look into it.

As I mentioned elsewhere in this thread, the element type must not have indirections though. If S is a struct with indirections, concatenating const(S)[] should still produce const(S)[].

Ali
Apr 01 2021
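The caveat about indirections can be illustrated with a sketch. The struct `S` and the helper `joinConst` are hypothetical names chosen for this example. The point is that the concatenated array is new, but whatever its elements point to is still shared with mutable code.

```d
struct S
{
    int* p; // indirection: copying an S copies the pointer, not the pointee
}

// Hypothetical helper: concatenates two const slices. The array storage is
// fresh, but the pointees reachable through the elements are not.
const(S)[] joinConst(const(S)[] a, const(S)[] b)
{
    return a ~ b;
}

void main()
{
    int value = 1;
    const S s = S(&value);
    const(S)[] a = [s];

    const(S)[] joined = joinConst(a, a);
    assert(joined.ptr !is a.ptr);  // fresh array storage
    assert(joined[0].p is &value); // but the same pointee

    value = 2;                     // legal: `value` itself is mutable
    assert(*joined[0].p == 2);     // visible through the "new" array
}
```

Had `joined` been allowed to convert to `immutable(S)[]`, the write to `value` would have changed data reachable through an immutable reference.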
On Thursday, 1 April 2021 at 23:21:56 UTC, Ali Çehreli wrote:
> As I mentioned elsewhere in this thread, the element type must not have indirections though. If S is a struct with indirections, concatenating const(S)[] should still produce const(S)[].

Thanks. Got something working that allows the implicit conversion at https://github.com/dlang/dmd/pull/12341 for the sample code

```d
@safe pure unittest
{
    const(char)[] x;
    string y;
    string z1 = x ~ y; // now passes
    string z2 = y ~ x; // now passes
}
```

but nothing else. Feel perfectly free to fill in the details or even take over the PR.
Apr 01 2021
On Thursday, 1 April 2021 at 21:59:21 UTC, H. S. Teoh wrote:
> Just use .idup on the result.

That creates an unnecessary GC allocation in cases such as

```d
string x;
const(char)[] y;
throw new Exception(x ~ y.idup);
```

which, imho, is very much worth avoiding the need for. The implicit conversion could be special-cased in the compiler to be allowed only on a `CatExp` being an r-value, which is naturally non-aliased. As mentioned above, this is analogous to the conversion rules of new expressions.
Apr 01 2021
On Thursday, 1 April 2021 at 22:17:08 UTC, Per Nordlöw wrote:
> On Thursday, 1 April 2021 at 21:59:21 UTC, H. S. Teoh wrote:
>> Just use .idup on the result.
>
> That creates an unnecessary GC allocation in cases such as
>
> ```d
> string x;
> const(char)[] y;
> throw new Exception(x ~ y.idup);
> ```

The alternative

```d
throw new Exception((x ~ y).idup);
```

, which I now realize you referred to, is also unsound because `x ~ y` is already a unique, freshly allocated, unaliased slice that can be freely implicitly converted to `immutable`.
Apr 01 2021
On 4/1/21 5:59 PM, H. S. Teoh wrote:
> Now, when you append a string to a const(char)[], the compiler has to promote `string` to `const(char)[]` first, so that the operands of ~ have the same type. And obviously, the result of concatenating two const(char)[] must be const(char)[], since you don't know if one of them may have mutable aliases somewhere else. So the result must likewise be const(char)[].

But it's not.

    auto z = x ~ y;
    pragma(msg, typeof(z)); // char[]

This is what's confusing me. The compiler somehow knows it can do this implicit cast, but doesn't know that the result is unique. It should be obvious.

> One may argue that appending in general will reallocate, and once reallocated it will be unique, and therefore safe to implicitly convert to immutable. However, in general we cannot guarantee this, e.g., one of the strings could be empty and not reallocate at runtime, so it may continue to be aliased by some mutable reference somewhere else. So the result must be typed as const(char)[], along with the restriction that it cannot implicitly convert to immutable.

a ~ b will always allocate new memory, it's in the spec.

-Steve
Apr 01 2021
On Thursday, 1 April 2021 at 21:59:21 UTC, H. S. Teoh wrote:
> Just use .idup on the result.

In the meanwhile `.assumeUnique` from `std.exception` should be preferred over `.idup` unless duplicate memory allocations are of intrinsic value. ;)
Apr 01 2021
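A minimal sketch of the `assumeUnique` workaround suggested above. `buildMessage` is a hypothetical helper name. `assumeUnique` performs no check, so it is only valid here because the concatenation result has no other references.

```d
import std.exception : assumeUnique;

// Hypothetical helper: builds an immutable message from a const prefix and
// a string without the second allocation that `.idup` would make.
string buildMessage(const(char)[] prefix, string detail)
{
    auto buffer = prefix ~ detail; // freshly allocated, referenced only here
    return assumeUnique(buffer);   // reinterprets as immutable, no copy
}

void main()
{
    char[] prefix = "error: ".dup;
    string msg = buildMessage(prefix, "file not found");

    prefix[0] = 'E'; // mutating the input afterwards...
    assert(msg == "error: file not found"); // ...does not affect the message
}
```

Since `assumeUnique` is an unchecked promise by the programmer, keeping the temporary confined to a small function like this makes the uniqueness easy to audit.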
On Thursday, 1 April 2021 at 21:21:02 UTC, Per Nordlöw wrote:
> Can somebody explain the logic behind the compiler disallowing both line 3 and 4 in

I'm very intrigued by the fact that this issue hasn't been discussed more. It smells to me like this behaviour is somewhat by design...because I don't think it is at all difficult to fix. During implicit conversion checking, just peek into the expression and see if it's a `CatExp` and if so treat it as a unique reference and allow conversion from mutable to immutable.

However, note that to solve this in the general case we have to involve data flow and escape analysis, and have a tag associated with a variable that indicates whether it is aliased or not (no, yes, maybe). Consider, for instance,

```d
auto c = a ~ b;
e = d; // `d` aliased to `e`
f = foo(e); // `e` maybe aliased to `f`
immutable g = e; // allowed?
```
Apr 01 2021