digitalmars.D.learn - A debug class has started
- forkit (34/34) Dec 13 2021 ok. one line in this code is causing a problem (as noted in
- forkit (8/10) Dec 13 2021 oh...
- forkit (4/5) Dec 13 2021 char* w = cast(char*)str.toStringz; // this seems to be the
- drug (6/12) Dec 13 2021 That's because `str` is initialized by a literal and you can not change
- ag0aep6g (5/13) Dec 13 2021 That still has undefined behavior. You cannot mutate the characters in a...
- drug (2/18) Dec 13 2021 You're right. I forget to change `string str` to `auto str` or `char[] s...
- WebFreak001 (14/28) Dec 13 2021 important: toStringz _may_ do a copy with the current
- forkit (7/12) Dec 13 2021 (this produces an unpredictable result??)
- H. S. Teoh (7/14) Dec 13 2021 Shouldn't you be using:
- forkit (5/19) Dec 13 2021 that also produces the same unpredictable result.
- H. S. Teoh (8/24) Dec 13 2021 [...]
- forkit (37/51) Dec 13 2021 so here are all the possible options I've tried. only 2 of these
- H. S. Teoh (13/24) Dec 13 2021 This line is your problem: you have a raw pointer `str` and
- forkit (42/47) Dec 13 2021 oh.. so many different ways...(to both produce the same bug, and
- H. S. Teoh (11/24) Dec 13 2021 [...]
- WebFreak001 (54/64) Dec 14 2021 this is also undefined behavior (toStringz returns an
- forkit (5/46) Dec 14 2021 This was of course just me 'playing around with pointer casting
- Stanislav Blinov (4/9) Dec 13 2021 You're calling a `to` on a char pointer, which, ostensibly, would
ok. one line in this code is causing a problem (as noted in comments) an explanation as to the cause, is welcome ;-) // ------------------------ module test; import std : writeln, writefln; import std.conv : to; void main() { string str = "abc;def;ab"; char* w = cast(char*)str; writeln(replaceChar(w, str.length, ';', 'X')); } immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine // problem is with this next line: // the line works fine using DMD on windows - but crashes if using LDC on windows // On Linux, both DMD and LDC -> sigsegv //str[i] = ch2; // seems simple enough // - just replace the char currently in element str[i]) with 'X' } } return to!(immutable(char)[])(str); } // ------------------------
Dec 13 2021
On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:oh... Windows - dmd version is 2.098.0-dirty - ldc2 version is 1.28 (based on dmd v2.098.0) Linux - dmd version is 2.098 - ldc2 version is 1.20.1 (based on dmd v2.090.1)
Dec 13 2021
On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:....char* w = cast(char*)str.toStringz; // this seems to be the solution class has ended ;-)
Dec 13 2021
On 13.12.2021 13:49, forkit wrote:On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:That's because `str` is initialized by a literal and you can not change it by definition. When you call `toStringz` it duplicates that literal (adding terminating zero at the end) and the duplicate is mutable. I would recommend do not use `toStringz` and just make duplicate of the literal - https://run.dlang.io/is/vaosW0....char* w = cast(char*)str.toStringz; // this seems to be the solution class has ended ;-)
Dec 13 2021
On 13.12.21 12:09, drug wrote:That's because `str` is initialized by a literal and you can not change it by definition. When you call `toStringz` it duplicates that literal (adding terminating zero at the end) and the duplicate is mutable. I would recommend do not use `toStringz` and just make duplicate of the literal - https://run.dlang.io/is/vaosW0From the link:string str = "abc;def;ab".dup; // allocates the string in the heap char* w = cast(char*)str; writeln(replaceChar(w, str.length, ';', 'X'));That still has undefined behavior. You cannot mutate the characters in a `string`. It doesn't matter if it came from a literal or `.dup`. Use `char[]` instead of `string`.
Dec 13 2021
On 13.12.2021 14:26, ag0aep6g wrote:On 13.12.21 12:09, drug wrote:You're right. I forget to change `string str` to `auto str` or `char[] str`.That's because `str` is initialized by a literal and you can not change it by definition. When you call `toStringz` it duplicates that literal (adding terminating zero at the end) and the duplicate is mutable. I would recommend do not use `toStringz` and just make duplicate of the literal - https://run.dlang.io/is/vaosW0From the link:string str = "abc;def;ab".dup; // allocates the string in the heap char* w = cast(char*)str; writeln(replaceChar(w, str.length, ';', 'X'));That still has undefined behavior. You cannot mutate the characters in a `string`. It doesn't matter if it came from a literal or `.dup`. Use `char[]` instead of `string`.
Dec 13 2021
On Monday, 13 December 2021 at 11:09:18 UTC, drug wrote:On 13.12.2021 13:49, forkit wrote:important: toStringz _may_ do a copy with the current implementation but nothing in the docs states it actually does so. In fact there is commented out code where it [in the past](https://github.com/dlang/phobos/commit/bc412e7c3fa3f124d7f2785223318b45edd4b3e6#diff-b94766ba288f9b4b05ef1a4874e26724750e614afdcddaf4c 071d0f19d91595L217) just dereferenced the memory after the string and checked if it was 0. You should really use `.dup` if you want to mutate your string. (You would need to duplicate anyway if you don't want an unsafe cast) pro-tip for bugs like this: just slap ` safe:` at the start of every file, the compiler will tell you everything that is risky and the bug will 9/10 times just sort itself out by fixing what the compiler complains about. (just recently helped someone with this again, was a 3 minute fix for a big code-base where manually searching the issue would have taken much longer)On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:That's because `str` is initialized by a literal and you can not change it by definition. When you call `toStringz` it duplicates that literal (adding terminating zero at the end) and the duplicate is mutable. I would recommend do not use `toStringz` and just make duplicate of the literal - https://run.dlang.io/is/vaosW0....char* w = cast(char*)str.toStringz; // this seems to be the solution class has ended ;-)
Dec 13 2021
On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:You should really use `.dup` if you want to mutate your string. (You would need to duplicate anyway if you don't want an unsafe cast)(this produces an unpredictable result??) char* w = cast(char*)str.dup; (but this seems to work - as expected) char* w = strdup(cast(char*)str); // import core.stdc.string : strdup;pro-tip for bugs like this: just slap ` safe:` at the start of every file, the compiler will tell you everything that is riskyI like this idea. Thanks ;-)
Dec 13 2021
On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via Digitalmars-d-learn wrote:On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:Shouldn't you be using: char* w = str.dup.ptr; instead?? T -- It is impossible to make anything foolproof because fools are so ingenious. -- SammyYou should really use `.dup` if you want to mutate your string. (You would need to duplicate anyway if you don't want an unsafe cast)(this produces an unpredictable result??) char* w = cast(char*)str.dup;
Dec 13 2021
On Monday, 13 December 2021 at 20:28:26 UTC, H. S. Teoh wrote:On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via Digitalmars-d-learn wrote:that also produces the same unpredictable result. i.e (an extra character from 'somewhere' appears in the output from line below) writeln(replaceChar(w, str.length, ';', 'X'));On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:Shouldn't you be using: char* w = str.dup.ptr; instead?? TYou should really use `.dup` if you want to mutate your string. (You would need to duplicate anyway if you don't want an unsafe cast)(this produces an unpredictable result??) char* w = cast(char*)str.dup;
Dec 13 2021
On Mon, Dec 13, 2021 at 08:47:12PM +0000, forkit via Digitalmars-d-learn wrote:On Monday, 13 December 2021 at 20:28:26 UTC, H. S. Teoh wrote:[...]On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via Digitalmars-d-learn wrote:[...](this produces an unpredictable result??) char* w = cast(char*)str.dup;Shouldn't you be using: char* w = str.dup.ptr; instead??that also produces the same unpredictable result. i.e (an extra character from 'somewhere' appears in the output from line below) writeln(replaceChar(w, str.length, ';', 'X'));[...] That's weird. Can you post the minimal code that exhibits this problem? T -- Prosperity breeds contempt, and poverty breeds consent. -- Suck.com
Dec 13 2021
On Monday, 13 December 2021 at 20:28:26 UTC, H. S. Teoh wrote:On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via Digitalmars-d-learn wrote:so here are all the possible options I've tried. only 2 of these actually produce the expected result. // ------ module test; import std : writeln, writefln; import std.conv : to; import core.stdc.string : strdup; import std.string : toStringz; void main() { string str = "abc;def;ab"; //char* w = cast(char*)str; // NOPE! A string constant is immutable, so expect undefined behaviour. char* w = strdup(cast(char*)str); // ok //char* w = cast(char*)str.toStringz; // also ok // all these below result in an extra character from 'somewhere' appearing in the writeln output //char* w = cast(char*)str.dup; // nope //char* w = str.dup.ptr; // nope //char* w = &dup(cast(const(char)[])str)[0]; // nope writeln(replaceChar(w, str.length, ';', 'X')); } immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } return to!(immutable(char)[])(str); } // ----------------On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:Shouldn't you be using: char* w = str.dup.ptr; instead?? TYou should really use `.dup` if you want to mutate your string. (You would need to duplicate anyway if you don't want an unsafe cast)(this produces an unpredictable result??) char* w = cast(char*)str.dup;
Dec 13 2021
On Mon, Dec 13, 2021 at 08:58:42PM +0000, forkit via Digitalmars-d-learn wrote: [...]immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } return to!(immutable(char)[])(str);This line is your problem: you have a raw pointer `str` and force-casting it to an array without specifying its length. Do not expect anything good to come of this. (In fact I'm surprised it .to even accepts such a thing!) What you should be doing is: return to!string(str[0 .. len]); Or just: return str[0 .. len].idup; T -- Being forced to write comments actually improves code, because it is easier to fix a crock than to explain it. -- G. Steele
Dec 13 2021
On Monday, 13 December 2021 at 21:13:25 UTC, H. S. Teoh wrote:What you should be doing is: return to!string(str[0 .. len]); Or just: return str[0 .. len].idup; Toh.. so many different ways...(to both produce the same bug, and also to produce the correct output). ... it's a little mind boggling ;-) // ---------- module test; import std : writeln, writefln, assumeUnique; import std.conv : to; import core.stdc.string : strdup; import std.string : toStringz; void main() { string str = "abc;def;ab"; //char* w = cast(char*)str; // nope. a pointer to a string constant is // (supposed to be) immutable, so expect undefined behaviour. char* w = strdup(cast(char*)str); // ok //char* w = cast(char*)str.toStringz; // also ok //char* w = cast(char*)str.dup; // also ok //char* w = str.dup.ptr; // also ok writeln(replaceChar(w, str.length, ';', 'X')); } immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } //return to!(immutable(char)[])(str); // nope .. issue with null terminator perhaps ?? return str[0 .. len].idup; // ok //return str[0 .. len].dup; // also ok //return to!string(str[0 .. len]); // also ok //return assumeUnique(str[0..len]); // also ok } // ---------------------
Dec 13 2021
On Mon, Dec 13, 2021 at 10:43:14PM +0000, forkit via Digitalmars-d-learn wrote:On Monday, 13 December 2021 at 21:13:25 UTC, H. S. Teoh wrote:[...]What you should be doing is: return to!string(str[0 .. len]); Or just: return str[0 .. len].idup;oh.. so many different ways...(to both produce the same bug, and also to produce the correct output). ... it's a little mind boggling ;-)[...] It's very simple. In C, an array decays to a pure pointer. In D, an array is a pointer + length pair. Given a D array, if you want a pointer to its first element, you use .ptr. Given a D pointer, if you want an array, you slice it with [0 .. length]. That's all there is to it. T -- Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
Dec 13 2021
On Monday, 13 December 2021 at 22:43:14 UTC, forkit wrote:[...] //char* w = cast(char*)str; // nope. a pointer to a string constant is // (supposed to be) immutable, so expect undefined behaviour.note://char* w = cast(char*)str.toStringz; // also okthis is also undefined behavior (toStringz returns an immutable(char)* which you cast away)char* w = strdup(cast(char*)str); // okthis is a C library function - this is risky if your string is not a string literal (may copy too much or segfault) - I would recommend not using this. This will only work properly when you have string literals (strings that are created using `""` in code, no other strings like those that are for example read from user input, from files or dynamically created)//char* w = cast(char*)str.dup; // also ok //char* w = str.dup.ptr; // also ok [...]the last two here are equivalent, I personally prefer the last one. I think these are the idiomatic way how to duplicate a string into writable memory and get the pointer to it. The best way would be not doing this at all - when you manipulate strings/arrays in D you can do so by just assigning the elements like this: ```d immutable(char)[] replaceChar(char[] str, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } return str.idup; } ``` then when you call it: ```d replaceChar(str.dup, ';', 'X'); ``` or the function more idiomatically: ```d string replaceChar(scope char[] str, char ch1, char ch2) { // ref makes the `c` variable an l-value / assignable and modifies the character when assigned foreach (i, ref c; str) { if (c == ch1) { writefln("Found %s at str[%s]", c, i); c = ch2; } } return str.idup; // you could also not .idup and return char[] and let the caller .idup it when needed } ``` You only really need to work with pointers when you interface with a C library that needs them.
Dec 14 2021
On Tuesday, 14 December 2021 at 08:07:43 UTC, WebFreak001 wrote:The best way would be not doing this at all - when you manipulate strings/arrays in D you can do so by just assigning the elements like this: ```d immutable(char)[] replaceChar(char[] str, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } return str.idup; } ``` then when you call it: ```d replaceChar(str.dup, ';', 'X'); ``` or the function more idiomatically: ```d string replaceChar(scope char[] str, char ch1, char ch2) { // ref makes the `c` variable an l-value / assignable and modifies the character when assigned foreach (i, ref c; str) { if (c == ch1) { writefln("Found %s at str[%s]", c, i); c = ch2; } } return str.idup; // you could also not .idup and return char[] and let the caller .idup it when needed } ``` You only really need to work with pointers when you interface with a C library that needs them.This was of course just me 'playing around with pointer casting in D', and not code that I would have deployed. Debugging that code used up an hour of my life .. that I cannot get back I might try out safe instead ;-)
Dec 14 2021
On Monday, 13 December 2021 at 20:58:42 UTC, forkit wrote:immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2) //snip return to!(immutable(char)[])(str); }You're calling a `to` on a char pointer, which, ostensibly, would look for null terminator. Which there may not be any if you do a .dup.
Dec 13 2021