digitalmars.D - [Proposal] Add module for C-strings support in Phobos
- Denis Shelomovskij (20/20) Mar 20 2014 It's filed as enhancement 12418 [2]:
- Rikki Cattermole (10/29) Mar 20 2014 Looks like it wouldn't be really useful with Windows API. Given
- Denis Shelomovskij (12/41) Mar 20 2014 You misunderstand the terminology. C string is a zero-terminated string....
- Rikki Cattermole (12/67) Mar 20 2014 I understand how c strings work. It would be nice to have more
- Denis Shelomovskij (12/62) Mar 20 2014 I'd say most unittests do test UTF-16 & UTF-32 versions. As for
- angel (5/5) Mar 21 2014 Going slightly beyond a new module code, it might, possibly, be
- Adam D. Ruppe (13/17) Mar 21 2014 The core language already knows zero-terminated strings:
- Adam D. Ruppe (4/4) Mar 21 2014 You could also write:
- Nordlöw (4/6) Mar 22 2014 In this case "bar" is already zero-terminated right?
- Nordlöw (11/15) Mar 22 2014 DMD currently cannot infer aliases to be callable using UFCS
- Nordlöw (9/15) Mar 22 2014 Correction:
- Adam D. Ruppe (11/12) Mar 22 2014 That's because you made the alias local, UFCS only works with
- Nordlöw (5/17) Mar 22 2014 Ok. Great.
- Nordlöw (1/4) Mar 22 2014 What were the motivations behind this choice of design?
- Andrei Alexandrescu (2/18) Mar 22 2014 Please bugzilla, thanks! -- Andrei
- Nordlöw (1/2) Mar 22 2014 Does anybody know if there is an Issue for this?
- Andrej Mitrovic (13/15) Mar 22 2014 Actually you're running into UFCS not working for module-scoped
- Nordlöw (2/3) Mar 22 2014 Ok.
- Nordlöw (1/2) Mar 22 2014 Do you have a reference to this bugzilla issue?
- Andrej Mitrovic (2/4) Mar 22 2014 https://d.puremagic.com/issues/show_bug.cgi?id=6185
- Andrej Mitrovic (3/8) Mar 22 2014 Oops, that's slightly different and solved. I'm not sure if the alias
- Andrej Mitrovic (5/8) Mar 22 2014 Looks like what happened was I filed the 'alias' version as a
It's filed as enhancement 12418 [2]:

C-string processing is a special and common case, so:
1. C-strings should be supported with both performance and usability in mind.
2. There should be a dedicated module for C-strings (instead of adding such functions here and there in other modules).

Current state: there is no good support for C-strings in Phobos. There is the slow and broken `toStringz` (Issue 12417 [3]), and no standard way to perform many common operations, like converting a returned C-string to a string and releasing its memory, or creating a C-string from a string using an allocation function.

So I propose adding the `unstd.c.string` [1] module to Phobos. It covers all the use cases I have seen while implementing C library wrappers, and it is correct and fast, in contrast to existing wrappers like GtkD (yes, it's both incorrect and slow because of tons of GC allocations).

[1] http://denis-sh.bitbucket.org/unstandard/unstd.c.string.html
[2] https://d.puremagic.com/issues/show_bug.cgi?id=12418
[3] https://d.puremagic.com/issues/show_bug.cgi?id=12417

--
Денис В. Шеломовский
Denis V. Shelomovskij
Mar 20 2014
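For readers unfamiliar with the two operations named in the proposal (creating a C-string with an allocation function, and converting a returned C-string to a D string while releasing its memory), here is a minimal sketch using only druntime's C bindings. It is not the API of `unstd.c.string`; the helper names `toMallocedCString` and `takeCString` are hypothetical, and the sketch assumes the C side allocates with `malloc`.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : strlen;

/// Hypothetical helper: copies `s` into malloc'ed memory and appends
/// the terminating zero. The caller owns the result and must `free` it.
char* toMallocedCString(const(char)[] s)
{
    auto p = cast(char*) malloc(s.length + 1);
    if (p is null)
        return null;
    p[0 .. s.length] = s[];   // element-wise copy into the C buffer
    p[s.length] = '\0';
    return p;
}

/// Hypothetical helper: copies a C string into a GC-owned D string,
/// then frees the C buffer. Assumes `p` came from `malloc`.
string takeCString(char* p)
{
    if (p is null)
        return null;
    scope (exit) free(p);          // runs after the copy below is made
    return p[0 .. strlen(p)].idup;
}

void main()
{
    char* c = toMallocedCString("hello");
    assert(c !is null && strlen(c) == 5);

    string d = takeCString(c);     // `c` must not be used after this
    assert(d == "hello");
}
```

The point of the sketch is that neither direction needs a GC allocation for the C-side buffer; only the final D string is GC-owned.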
On Thursday, 20 March 2014 at 08:24:30 UTC, Denis Shelomovskij wrote:
> It's filed as enhancement 12418 [2]: [...]

Looks like it wouldn't be really useful with the Windows API, given that wstrings are more common there.

Another thing that would be nice to have is a wrapper struct for the pointer that allows access via e.g. opIndex and opSlice, etc. Use case: store the struct on the D side to make sure the GC doesn't clean it up, and still be able to access and modify it like a normal string easily.
Mar 20 2014
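The wrapper idea above could look roughly like the following. This is only a sketch of one possible reading of the suggestion: `CStringRef` and its members are hypothetical names, not part of `unstd.c.string` or Phobos, and the sketch does not deal with ownership or with keeping the memory alive.

```d
import core.stdc.string : strlen;

/// Hypothetical wrapper around a zero-terminated `char*`.
/// The length is measured once, when the wrapper is constructed.
struct CStringRef
{
    private char* ptr;
    private size_t len;

    this(char* p)
    {
        ptr = p;
        len = p is null ? 0 : strlen(p);
    }

    size_t length() const { return len; }
    size_t opDollar() const { return len; }

    /// Bounds-checked element access; returns by ref so it is writable.
    ref char opIndex(size_t i)
    {
        assert(i < len, "index out of range");
        return ptr[i];
    }

    /// Bounds-checked slice of the underlying buffer (no copy).
    char[] opSlice(size_t a, size_t b)
    {
        assert(a <= b && b <= len, "slice out of range");
        return ptr[a .. b];
    }

    /// Whole string without the terminating zero (no copy).
    char[] opSlice() { return ptr[0 .. len]; }
}

void main()
{
    char[6] buf = "hello\0";
    auto s = CStringRef(buf.ptr);

    s[0] = 'j';                     // modify through opIndex
    assert(s[] == "jello");
    assert(s[1 .. $] == "ello");
    assert(s.length == 5);
}
```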
20.03.2014 13:20, Rikki Cattermole wrote:
> On Thursday, 20 March 2014 at 08:24:30 UTC, Denis Shelomovskij wrote:
>> It's filed as enhancement 12418 [2]: [...]
>
> Looks like it wouldn't be really useful with the Windows API, given that wstrings are more common there.

You misunderstand the terminology. A C string is a zero-terminated string. Also, it looks like you didn't even go to the docs page, as the second example is a WinAPI one.

> Another thing that would be nice to have is a wrapper struct for the pointer that allows access via e.g. opIndex and opSlice, etc. Use case: store the struct on the D side to make sure the GC doesn't clean it up, and still be able to access and modify it like a normal string easily.

I don't understand the use case. If you have implemented some C library wrappers and have personal experience, I'd like to hear your opinion on the problem of calling C functions and your proposal to solve it, if you dislike mine. Also with examples, please, where my solution fails and yours rocks. )

--
Денис В. Шеломовский
Denis V. Shelomovskij
Mar 20 2014
On Thursday, 20 March 2014 at 09:32:33 UTC, Denis Shelomovskij wrote:
> You misunderstand the terminology. A C string is a zero-terminated string. Also, it looks like you didn't even go to the docs page, as the second example is a WinAPI one.

I understand how C strings work. It would be nice to have more unittests for dstring/wstring, because it looks more geared towards char/string. That is why, at first glance, it looks less likely to work.

> I don't understand the use case. [...] Also with examples, please, where my solution fails and yours rocks. )

I don't dislike your approach at all. I just feel that it needs to allow for a few more use cases, given that the proposal is for Phobos. What you have done looks fine for most cases involving C libraries. I'm just worried that it has fewer use cases than it could have. I'm just nitpicking, so don't mind me too much :)
Mar 20 2014
20.03.2014 13:52, Rikki Cattermole wrote:
> I understand how C strings work. It would be nice to have more unittests for dstring/wstring, because it looks more geared towards char/string. [...]

I'd say most unittests do test the UTF-16 & UTF-32 versions. As for documentation, the function signatures contain a template parameter for the character type, but there is probably a lack of ddoc unittests and/or documentation.

> I don't dislike your approach at all. I just feel that it needs to allow for a few more use cases [...] I'm just nitpicking, so don't mind me too much :)

Thanks. So the algorithm is like this: find a C library which needs more love and file me an issue [1], as I have just added all the common use cases I have seen.

[1] https://bitbucket.org/denis-sh/unstandard/issues

--
Денис В. Шеломовский
Denis V. Shelomovskij
Mar 20 2014
Going slightly beyond a new module's code, it might possibly be useful to enable zero-terminated string creation at the core language level, with:

    auto mystr = "hello"z;

The 'z' at the end is much the same as the 'L' in '5L' ...
Mar 21 2014
On Friday, 21 March 2014 at 19:59:51 UTC, angel wrote:
> Going slightly beyond a new module's code, it might possibly be useful to enable zero-terminated string creation at the core language level, with:
>
>     auto mystr = "hello"z;

The core language already knows zero-terminated strings:

    void main() {
        immutable(char)* s = "lol";
    }

Regular 8-bit string literals implicitly convert to pointers without needing to explicitly call the .ptr property, and they are always zero-terminated automatically. This is why you can write printf("foo"); in D and have it just work without complaining about needing toStringz.

You can also write:

    const char* s = "lol";

and that works too. Not quite auto, but not a big hassle.
Mar 21 2014
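The zero terminator applies to string literals, not to arbitrary D strings. The following sketch contrasts a literal with a slice of a longer string, for which `std.string.toStringz` is still needed before handing the data to C. The variable names are illustrative only.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    // A string literal converts implicitly to a C pointer and is
    // followed by a zero byte in memory.
    const(char)* lit = "lol";
    assert(strlen(lit) == 3);

    // A slice of a larger string has no terminator of its own:
    // in memory, "hello" is followed by ' ', not by a zero byte.
    string whole = "hello world";
    string part = whole[0 .. 5];
    assert(part == "hello");

    // Passing part.ptr to C would read past the end of the slice.
    // toStringz returns a pointer to properly zero-terminated data.
    const(char)* z = part.toStringz;
    assert(strlen(z) == 5);
}
```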
You could also write:

    alias toStringz z;
    auto foo = "bar".z;

and that would work too!
Mar 21 2014
> alias toStringz z;
> auto foo = "bar".z;

In this case "bar" is already zero-terminated, right? See "String literals already have a 0 appended to them" in http://dlang.org/arrays.html
Mar 22 2014
> You could also write:
>
>     alias toStringz z;
>     auto foo = "bar".z;
>
> and that would work too!

DMD currently cannot infer aliases to be callable using UFCS, unfortunately:

    unittest {
        import std.stdio: wln = writeln;
        import std.string;
        wln(typeof("a".z).stringof);
    }

errors with

    t_string.d(19,19): Error: no property 'z' for type 'string'

Shouldn't be too hard to fix, though.
Mar 22 2014
> unittest {
>     import std.stdio: wln = writeln;
>     import std.string;
>     wln(typeof("a".z).stringof);
> }

Correction:

    unittest {
        import std.stdio: wln = writeln;
        import std.string;
        alias z = toStringz;
        wln(typeof("a".z).stringof);
    }

gives the same error:

    t_string.d(7,19): Error: no property 'z' for type 'string'
Mar 22 2014
On Saturday, 22 March 2014 at 13:01:11 UTC, Nordlöw wrote:
> gives same error

That's because you made the alias local. UFCS only works with global symbols right now (which is actually by design, though I don't think it is a great design). So this works:

    // move these out to module scope
    import std.string;
    alias z = toStringz;

    unittest {
        import std.stdio: wln = writeln;
        wln(typeof("a".z).stringof); // now we're good
    }
Mar 22 2014
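As a compact restatement of the scoping point, the sketch below uses a module-scope alias for the UFCS call and shows that a function-local alias remains usable with ordinary call syntax. The alias name `zz` is made up for the example, and the exact UFCS lookup rules may differ between compiler versions.

```d
import std.string : toStringz;

// Module-scope alias: visible to UFCS lookup.
alias z = toStringz;

void main()
{
    // UFCS call through the module-scope alias.
    static assert(is(typeof("a".z) == immutable(char)*));

    // A function-local alias can still be called with plain call syntax,
    // which does not depend on UFCS lookup at all.
    alias zz = toStringz;
    static assert(is(typeof(zz("a")) == immutable(char)*));

    auto p = "bar".z;
    assert(p[0] == 'b' && p[3] == '\0');
}
```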
On Saturday, 22 March 2014 at 13:21:45 UTC, Adam D. Ruppe wrote:
> That's because you made the alias local. UFCS only works with global symbols right now [...]

Ok. Great. Still... I believe a warning hint should be output. This is not obvious.

/Per
Mar 22 2014
> That's because you made the alias local, UFCS only works with global symbols right now (which is actually by design, though I don't think it is a great design).

What were the motivations behind this choice of design?
Mar 22 2014
On 3/22/14, 6:01 AM, "Nordlöw" wrote:
> Correction:
>
>     unittest {
>         import std.stdio: wln = writeln;
>         import std.string;
>         alias z = toStringz;
>         wln(typeof("a".z).stringof);
>     }
>
> gives same error
>
>     t_string.d(7,19): Error: no property 'z' for type 'string'

Please bugzilla, thanks! -- Andrei
Mar 22 2014
> Shouldn't be too hard to fix, though.

Does anybody know if there is an Issue for this?
Mar 22 2014
On 3/22/14, "Nordlöw" <per.nordlow gmail.com> wrote:
> DMD currently cannot infer aliases to be callable using UFCS unfortunately

Actually you're running into UFCS not working for module-scoped imports. The following will work:

-----
import std.stdio: wln = writeln;
import std.string;
alias toStringz z;

void main()
{
    wln(typeof("a".z).stringof); // works ok
}
-----

UFCS not working for module-scoped imports is a filed bug.
Mar 22 2014
> UFCS not working for module-scoped imports is a filed bug.

Ok. Great!
Mar 22 2014
> UFCS not working for module-scoped imports is a filed bug.

Do you have a reference to this bugzilla issue?
Mar 22 2014
On 3/22/14, "Nordlöw" <per.nordlow gmail.com> wrote:
>> UFCS not working for module-scoped imports is a filed bug.
>
> Do you have a reference to this bugzilla issue?

https://d.puremagic.com/issues/show_bug.cgi?id=6185
Mar 22 2014
On 3/22/14, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
> https://d.puremagic.com/issues/show_bug.cgi?id=6185

Oops, that's slightly different and solved. I'm not sure if the alias version is filed.
Mar 22 2014
On 3/22/14, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
> Oops, that's slightly different and solved. I'm not sure if the alias version is filed.

Looks like what happened was I filed the 'alias' version as a duplicate of 6185, then 6185 was fixed but not the test-case in 9515. Gonna reopen it now:

https://d.puremagic.com/issues/show_bug.cgi?id=9515
Mar 22 2014