digitalmars.D.learn - std.string.assumeUTF() silently casting mutable to immutable?

Forest <forest example.com> writes:
I may have found a bug in assumeUTF(), but being new to D, I'm 
not sure.

The description:

> Assume the given array of integers arr is a well-formed UTF
> string and return it typed as a UTF string.
> ubyte becomes char, ushort becomes wchar and uint becomes
> dchar. Type qualifiers are preserved.

The declaration:

```d
auto assumeUTF(T)(T[] arr)
if (staticIndexOf!(immutable T, immutable ubyte, immutable ushort, immutable uint) != -1)
```

Shouldn't that precondition's `immutable T` be simply `T`?

As it stands, I can do this with no complaints from the compiler...

```d
string test(ubyte[] arr)
{
    import std.string;
    return arr.assumeUTF;
}
```

...and accidentally end up with a "string" pointing at mutable data.

Am I missing something?
Feb 12 2024
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, February 13, 2024 12:40:57 AM MST Forest via Digitalmars-d-learn 
wrote:
> I may have found a bug in assumeUTF(), but being new to D, I'm
> not sure.
>
> The description:
>
> > Assume the given array of integers arr is a well-formed UTF
> > string and return it typed as a UTF string.
> > ubyte becomes char, ushort becomes wchar and uint becomes
> > dchar. Type qualifiers are preserved.
>
> The declaration:
>
> ```d
> auto assumeUTF(T)(T[] arr)
> if (staticIndexOf!(immutable T, immutable ubyte, immutable ushort, immutable uint) != -1)
> ```
>
> Shouldn't that precondition's `immutable T` be simply `T`?
>
> As it stands, I can do this with no complaints from the compiler...
>
> ```d
> string test(ubyte[] arr)
> {
>     import std.string;
>     return arr.assumeUTF;
> }
> ```
>
> ...and accidentally end up with a "string" pointing at mutable data.
>
> Am I missing something?

It's not a bug in assumeUTF. If you changed your code to

    string test(ubyte[] arr)
    {
        import std.string;
        pragma(msg, typeof(arr.assumeUTF));
        return arr.assumeUTF;
    }

then the compiler will output char[], because assumeUTF retains the type qualifier of the original type (as its documentation explains).

Rather, it looks like the problem here is that dmd will implicitly change the constness of a return value when it thinks that it can do so to make the code work. Essentially, that means that the function has to be pure and that the return value can't have come from any of the function's arguments. And at a glance, that would be true here, because no char[] was passed into assumeUTF. However, casting from ubyte[] to char[] is safe, so dmd should be taking that possibility into account, and it's apparently not.

So, there's definitely a bug here, but it's a dmd bug. Its checks for whether it can safely change the constness of the return type apparently aren't sophisticated enough to catch this case.

- Jonathan M Davis
Feb 13 2024
Johan <j j.nl> writes:
On Tuesday, 13 February 2024 at 08:10:20 UTC, Jonathan M Davis 
wrote:
 
> So, there's definitely a bug here, but it's a dmd bug. Its
> checks for whether it can safely change the constness of the
> return type apparently aren't sophisticated enough to catch
> this case.

This is a pretty severe bug.
Some test cases: https://d.godbolt.org/z/K1fjdj76M

```d
ubyte[] pure_ubyte(ubyte[] arr) pure @safe;
ubyte[] pure_void(void[] arr) pure @safe;
ubyte[] pure_int(int[] arr) pure @safe;
int[] pure_ubyte_to_int(ubyte[] arr) pure @safe;

// All cases below should not compile, yet some do.

immutable(ubyte)[] test(ubyte[] arr) @safe
{
    // return pure_ubyte(arr); // ERROR: OK
    return pure_void(arr); // No error: NOK!
}

immutable(ubyte)[] test(int[] arr) @safe
{
    return pure_int(arr); // No error: NOK!
}

immutable(int)[] test2(ubyte[] arr) @safe
{
    return pure_ubyte_to_int(arr); // No error: NOK!
}
```

-Johan
Feb 13 2024
Forest <forest example.com> writes:
On Tuesday, 13 February 2024 at 14:05:03 UTC, Johan wrote:
> On Tuesday, 13 February 2024 at 08:10:20 UTC, Jonathan M Davis wrote:
> > So, there's definitely a bug here, but it's a dmd bug. Its
> > checks for whether it can safely change the constness of the
> > return type apparently aren't sophisticated enough to catch
> > this case.
>
> This is a pretty severe bug.

Thanks, gents. Reported on the tracker: https://issues.dlang.org/show_bug.cgi?id=24394
Feb 13 2024
RazvanN <razvan.nitu1305 gmail.com> writes:
On Wednesday, 14 February 2024 at 02:13:08 UTC, Forest wrote:
> On Tuesday, 13 February 2024 at 14:05:03 UTC, Johan wrote:
> > On Tuesday, 13 February 2024 at 08:10:20 UTC, Jonathan M Davis wrote:
> > > So, there's definitely a bug here, but it's a dmd bug. Its
> > > checks for whether it can safely change the constness of the
> > > return type apparently aren't sophisticated enough to catch
> > > this case.
> >
> > This is a pretty severe bug.
>
> Thanks, gents. Reported on the tracker: https://issues.dlang.org/show_bug.cgi?id=24394

This has already been fixed; you just need to use -preview=fixImmutableConv. The fix was put behind a preview flag because it introduces a breaking change.
Feb 14 2024
Forest <forest example.com> writes:
On Wednesday, 14 February 2024 at 10:57:42 UTC, RazvanN wrote:
> This has already been fixed, you just need to use
> -preview=fixImmutableConv. This was put behind a preview flag
> as it introduces a breaking change.

I just tried that flag on run.dlang.org, and although it fixes the case I posted earlier, it doesn't fix this one:

```d
string test(const(ubyte)[] arr)
{
    import std.string;
    return arr.assumeUTF;
}
```

Shouldn't this be rejected as well?
Feb 14 2024
RazvanN <razvan.nitu1305 gmail.com> writes:
On Wednesday, 14 February 2024 at 11:56:29 UTC, Forest wrote:
> On Wednesday, 14 February 2024 at 10:57:42 UTC, RazvanN wrote:
> > This has already been fixed, you just need to use
> > -preview=fixImmutableConv. This was put behind a preview flag
> > as it introduces a breaking change.
>
> I just tried that flag on run.dlang.org, and although it fixes the case I posted earlier, it doesn't fix this one:
>
> ```d
> string test(const(ubyte)[] arr)
> {
>     import std.string;
>     return arr.assumeUTF;
> }
> ```
>
> Shouldn't this be rejected as well?

Indeed, that should be rejected as well; otherwise you can modify immutable data. This code currently happily compiles:

```d
string test(const(ubyte)[] arr)
{
    import std.string;
    return arr.assumeUTF;
}

void main()
{
    import std.stdio;
    ubyte[] arr = ['a', 'b', 'c'];
    auto t = test(arr); // typed as string, but still aliases arr's memory
    writeln(t);
    arr[0] = 'x';       // mutates the data behind the "immutable" string
    writeln(t);
}
```

And prints:

```
abc
xbc
```

However, this seems to be a different issue.
Feb 14 2024