digitalmars.D - More complexity creep in Phobos

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I started an alphabetical search through std modules, and as luck would 
have it I got as far as the first module: std/algorithm/comparison.d. In 
there, there's these overloads:

size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
     (Range1 s, Range2 t)
if (isForwardRange!(Range1) && isForwardRange!(Range2));

size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
     (auto ref Range1 s, auto ref Range2 t)
if (isConvertibleToString!Range1 || isConvertibleToString!Range2)

(similar for levenshteinDistanceAndPath)

What's with the second overload nonsense? The Levenshtein algorithm 
works on forward ranges. And that's about it, in a profound sense: 
Platonically what the algorithm needs is two forward ranges to operate. 
(We ought to be pretty proud of it, too: in all other languages I looked 
at, it's misimplemented to require random access.)

The second overload comes from a warped sense of DWIM: well if this type 
converts to a string, we commit to support that too. Really. How about a 
struct that converts to an array? Should we put in a pull request to 
that, too? Where do we even stop?

I hope there's not much more of this nonsense, because it all should be 
deprecated with fire.
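
A minimal sketch of the forward-range-only view, assuming a hypothetical `Entry` type that stands in for something DirEntry-like (it is not the real std.file.DirEntry). The caller names the string explicitly, so the algorithm only ever sees a forward range; `distanceTo` is likewise just an illustrative helper:

```d
import std.algorithm.comparison : levenshteinDistance;

// Hypothetical DirEntry-like type: converts to string via alias this,
// but name() returns an rvalue, so the struct itself is not a range.
struct Entry
{
    string path;
    @property string name() { return path; }
    alias name this;
}

// The caller performs the conversion; the algorithm sees plain strings.
size_t distanceTo(Entry e, string other)
{
    return levenshteinDistance(e.name, other);
}

void main()
{
    // "kitten" -> "sitting" needs two substitutions and one insertion.
    assert(distanceTo(Entry("kitten"), "sitting") == 3);
}
```

With the conversion spelled out at the call site, only the isForwardRange overload is needed for this call.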
Mar 27
next sibling parent reply Seb <seb wilzba.ch> writes:
On Thursday, 28 March 2019 at 02:17:35 UTC, Andrei Alexandrescu 
wrote:
 I started an alphabetical search through std modules, and as 
 luck would have it I got as far as the first module: 
 std/algorithm/comparison.d. In there, there's these overloads:

 size_t levenshteinDistance(alias equals = (a,b) => a == b, 
 Range1, Range2)
     (Range1 s, Range2 t)
 if (isForwardRange!(Range1) && isForwardRange!(Range2));

 size_t levenshteinDistance(alias equals = (a,b) => a == b, 
 Range1, Range2)
     (auto ref Range1 s, auto ref Range2 t)
 if (isConvertibleToString!Range1 || 
 isConvertibleToString!Range2)

 (similar for levenshteinDistanceAndPath)

 What's with the second overload nonsense? The Levenshtein 
 algorithm works on forward ranges. And that's about it, in a 
 profound sense: Platonically what the algorithm needs is two 
 forward ranges to operate. (We ought to be pretty proud of it, 
 too: in all other languages I looked at, it's misimplemented to 
 require random access.)

 The second overload comes from a warped sense of DWIM: well if 
 this type converts to a string, we commit to support that too. 
 Really. How about a struct that converts to an array? Should we 
 put in a pull request to that, too? Where do we even stop?
See https://github.com/dlang/phobos/pull/3770 for the historical reason.
 I hope there's not much more of this nonsense, because it all 
 should be deprecated with fire.
Then we would have to deprecate more than half of Phobos, because a lot of similar cruft got aggregated over the last ten years. Many of these crufts can't be as easily deprecated as this example...
Mar 27
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/27/19 10:45 PM, Seb wrote:
 See https://github.com/dlang/phobos/pull/3770 for the historical reason.
Thanks. Yep, the proverbial good intentions paving the road to hell.
 I hope there's not much more of this nonsense, because it all should 
 be deprecated with fire.
Then we would have to deprecate more than half of Phobos, because a lot of similar cruft got aggregated over the last ten years. Many of these crufts can't be as easily deprecated as this example...
"More than half" would be an exaggeration and as such of limited usefulness. I looked at a few more modules and they're in better shape.
Mar 27
prev sibling next sibling parent reply Meta <jared771 gmail.com> writes:
On Thursday, 28 March 2019 at 02:17:35 UTC, Andrei Alexandrescu 
wrote:
 I started an alphabetical search through std modules, and as 
 luck would have it I got as far as the first module: 
 std/algorithm/comparison.d. In there, there's these overloads:

 size_t levenshteinDistance(alias equals = (a,b) => a == b, 
 Range1, Range2)
     (Range1 s, Range2 t)
 if (isForwardRange!(Range1) && isForwardRange!(Range2));

 size_t levenshteinDistance(alias equals = (a,b) => a == b, 
 Range1, Range2)
     (auto ref Range1 s, auto ref Range2 t)
 if (isConvertibleToString!Range1 || 
 isConvertibleToString!Range2)

 (similar for levenshteinDistanceAndPath)

 What's with the second overload nonsense? The Levenshtein 
 algorithm works on forward ranges. And that's about it, in a 
 profound sense: Platonically what the algorithm needs is two 
 forward ranges to operate. (We ought to be pretty proud of it, 
 too: in all other languages I looked at, it's misimplemented to 
 require random access.)

 The second overload comes from a warped sense of DWIM: well if 
 this type converts to a string, we commit to support that too. 
 Really. How about a struct that converts to an array? Should we 
 put in a pull request to that, too? Where do we even stop?

 I hope there's not much more of this nonsense, because it all 
 should be deprecated with fire.
Maybe the implementation of the fix is not ideal, but the reason it was added in the first place is valid, IMO. Looking at the original issue[1], the minimized example is as follows:

    void popFront(T)(ref T[] a) { a = a[1..$]; }

    enum bool isInputRange(R) = is(typeof(
    {
        R r;
        r.popFront();
    }));

    struct DirEntry
    {
        @property string name() { return ""; }
        alias name this;
    }

    pragma(msg, isInputRange!DirEntry); // prints 'false'
    pragma(msg, isInputRange!(typeof(DirEntry.init.name))); // prints 'true'

    bool isDir(R)(R r) if (isInputRange!R) { return true; }

    void main()
    {
        DirEntry de;
        bool c = isDir(de); // Error: isDir cannot deduce function from argument types !()(DirEntry)
    }

And trying this code out on dmd-nightly[2], it still fails today. Personally, I think it would be a bad thing if code that uses a DirEntry like above stopped compiling, but I agree that the fix could be implemented differently.

It seems like if you see some weird code in Phobos but you don't understand why it was written that way, there are 3 main reasons that account for 99% of these cases:

- string (autodecoding)
- alias this
- enums

1. https://issues.dlang.org/show_bug.cgi?id=15027
2. https://run.dlang.io/is/zMtHs0
Mar 27
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/27/19 11:22 PM, Meta wrote:
 It seems like if you see some weird code in Phobos but you don't 
 understand why it was written that way, there are 3 main reasons that 
 account for 99% of these cases:
 
 - string (autodecoding)
 - alias this
 - enums
"Mistakes made by people" is missing from that list. Which include mine of course.

There's this nice notion of reasoning by first principles vs. reasoning by analogy:

https://fs.blog/2018/04/first-principles/

A very nice essay - recommended outside this discussion's context, too. The levenshteinDistance issue is a direct application of the two kinds of reasoning. This is reasoning by analogy:

"DirEntry converts to string and is used by some people as such. They pass it to functions, and some accept it but some don't. It follows by analogy that the Levenshtein distance algorithm should be worked out to accept it, too."

In contrast, the reasoning from first principles should powerfully override that:

"Levenshtein distance operates on forward ranges. All that stuff that changes the signature of levenshteinDistance to accept other artifacts is nonsense. Whoever wants to use it should carry the conversion to forward range themselves."

It follows that the correct answer to this generalization frenzy should be: "We work with ranges and character types. If you've got enums and alias this and whatnot, more power to you but to use the standard library you must convert those to the stuff we support."
Mar 27
parent reply Meta <jared771 gmail.com> writes:
On Thursday, 28 March 2019 at 04:20:55 UTC, Andrei Alexandrescu 
wrote:
 There's this nice notion of reasoning by first principles vs. 
 reasoning by analogy:

 https://fs.blog/2018/04/first-principles/

 A very nice essay - recommended outside this discussion's 
 context, too. The levenshteinDistance issue is a direct 
 application of the two kinds of reasoning. This is reasoning by 
 analogy:

 "DirEntry converts to string and is used by some people as 
 such. They pass it to functions, and some accept it but some 
 don't. It follows by analogy that the Levenshtein distance 
 algorithm should be worked out to accept it, too."
 In contrast, the reasoning from first principles should 
 powerfully override that:

 "Levenshtein distance operates on forward ranges. All that 
 stuff that changes the signature of levenshteinDistance to 
 accept other artifacts is nonsense. Whoever wants to use it 
 should carry the conversion to forward range themselves."

 It follows that the correct answer to this generalization 
 frenzy should be: "We work with ranges and character types. If 
 you've got enums and alias this and whatnot, more power to you 
 but to use the standard library you must convert those to the 
 stuff we support."
I agree with your first principles arguments, but not your conclusion. I believe you are starting with the faulty base assumption that DirEntry is not a range. DirEntry IS a range, by the rules of the current language.

DirEntry states that it is a subtype of `string` by declaring `alias name this`. It follows that a DirEntry may be transparently substituted wherever a string is accepted. As a string is a range, and DirEntry is a subtype of string, then DirEntry is also a range. `levenshteinDistance` accepts two ranges as its arguments, either of which may be strings; therefore, either of its arguments may also be a DirEntry substituted in the place of a string. By writ, we must support passing a DirEntry to levenshteinDistance.

The real problem here is with `alias this`. `alias this` claims to allow one type to become a subtype of another, but that's not true; this language feature violates the Liskov Substitution Principle (a fact that I have mentioned before, and likely others). `alias this` fails the substitutability test - this thread and the defect linked are proof of that.

Because `alias this` claims to allow one type A to subtype another type B, but does not actually make good on that promise and implements this subtyping improperly, it follows that any code working with type B that wants to also support type A has to use these ugly workarounds. The code in question is working around a defect in the language, and would not be necessary if the language itself were fixed.
Mar 27
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/28/19 1:36 AM, Meta wrote:
 DirEntry states that it is a subtype of `string`
No, please refer to the "Generality creep" thread.
Mar 28
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/28/2019 12:32 AM, Andrei Alexandrescu wrote:
 No, please refer to the "Generality creep" thread.
https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html
Mar 28
prev sibling next sibling parent Kagamin <spam here.lot> writes:
On Thursday, 28 March 2019 at 07:32:33 UTC, Andrei Alexandrescu 
wrote:
 On 3/28/19 1:36 AM, Meta wrote:
 DirEntry states that it is a subtype of `string`
No, please refer to the "Generality creep" thread.
DirEntry is implicitly convertible to string because it needs to be usable with std.file, which is a big untyped ball of strings. Compare with scriptlike https://github.com/abscissa/scriptlike#filepaths that has typed wrappers for paths (though I prefer separate wrappers for files and folders).
Mar 28
prev sibling parent reply Meta <jared771 gmail.com> writes:
On Thursday, 28 March 2019 at 07:32:33 UTC, Andrei Alexandrescu 
wrote:
 On 3/28/19 1:36 AM, Meta wrote:
 DirEntry states that it is a subtype of `string`
No, please refer to the "Generality creep" thread.
Last time I checked, curt dismissal is not an argument. I maintain that this is not "generality creep", but working around a broken language feature.
Mar 28
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/28/19 10:49 AM, Meta wrote:
 On Thursday, 28 March 2019 at 07:32:33 UTC, Andrei Alexandrescu wrote:
 On 3/28/19 1:36 AM, Meta wrote:
 DirEntry states that it is a subtype of `string`
No, please refer to the "Generality creep" thread.
Last time I checked, curt dismissal is not an argument.
Sorry. It was a pertinent response dressed as a curt dismissal. The opening post of the "Generality creep" is trivial to find and contains an explanation of exactly this matter. Walter posted a link to it, too (thanks). Here it is: https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html What remains is that I copy and paste it for you, which I'll do at the end of this message. I actually stopped reading your post following your assertion, because I inferred it would be increasingly wrong because it is based on a wrong assumption. Just now I went and read it, and indeed it is so.
 I maintain that 
 this is not "generality creep", but working around a broken language 
 feature.
You are wrong in a sense (factually that is not subtyping, it's coercion) and right in another (it's questionable to allow odd forms of alias this in the language in the first place). I wouldn't disagree either way. Depends on the angle. Here's the text explaining why alias this to an rvalue is not subtyping:
 Walter and I were looking at indexOf in the standard library. It has several
overloads, of which let's look at the first two:
 
 https://dlang.org/library/std/string/index_of.html
 
 Now, why couldn't they be merged into a single function with this constraint
(their bodies are identical):
 
 ptrdiff_t indexOf(Range) (
   Range s,
   dchar c,
   CaseSensitive cs = Yes.caseSensitive
 )
 if (isInputRange!Range && isSomeChar!(ElementType!Range));
 
 It makes sense to look for a character in any range of characters.
 
 Attempting to build that fails because the following type does not work with it:
 
     struct TestAliasedString
     {
         string get() @safe @nogc pure nothrow { return _s; }
         alias get this;
         @disable this(this);
         string _s;
     }
 
 The intuition is that the function should be general enough to figure out
that, hey, TestAliasedString is kinda sorta a subtype of string. So the call
should work.
 
 So let's see why it doesn't work - i.e. why isn't TestAliasedString an input
range? The definition of isInputRange (in std.range.primitives) is:
 
 enum bool isInputRange(R) =
     is(typeof(R.init) == R)
     && is(ReturnType!((R r) => r.empty) == bool)
     && is(typeof((return ref R r) => r.front))
     && !is(ReturnType!((R r) => r.front) == void)
     && is(typeof((R r) => r.popFront));
 
 Turns out the second clause fails. That takes us to the definition of empty in
the same module:
 
 @property bool empty(T)(auto ref scope const(T) a)
 if (is(typeof(a.length) : size_t))
 {
     return !a.length;
 }
 
 The intent is fairly clear - if a range defines length as a size_t (somewhat
oddly relaxed to "convertible to size_t"), then empty can be nicely defined in
terms of length. Cool. But empty doesn't work with TestAliasedString due to an
overlooked matter: the "const". A mutable TestAliasedString converts to a
string, but a const or immutable TestAliasedString does NOT convert to a const
string! So this fixes that matter:
 
     struct TestAliasedString
     {
         string get() @safe @nogc pure nothrow { return _s; }
         const(string) get() @safe @nogc pure nothrow const { return _s; }
         alias get this;
         @disable this(this);
         string _s;
     }
 
 That makes empty() work, but also raises a nagging question: what was the
relationship of TestAliasedString to string before this change? Surely that
wasn't subtyping. (My response would be: "Odd.") And why was Phobos under the
obligation to cater for such a type and its tenuous relationship to a range?
 
 But wait, there's more. Things still don't work because of popFront. Looking
at its definition:
 
 void popFront(C)(scope ref inout(C)[] str) @trusted pure nothrow
 if (isNarrowString!(C[]))
 { ... }
 
 So, reasonably this function takes the range by reference so it can modify its
internals. HOWEVER! The implementation of TestAliasedString.get() returns an
rvalue, i.e. it's the equivalent of a conversion involving a temporary. Surely
that's not to match, whether in the current language or the one after the
rvalue DIP.
 
 The change that does make the code work is:
 
     struct TestAliasedString
     {
         ref string get() @safe @nogc pure nothrow { return _s; }
         ref const(string) get() @safe @nogc pure nothrow const { return _s; }
         alias get this;
         @disable this(this);
         string _s;
     }
 
 This indeed does implement a subtyping relationship, and passes the
isInputRange test.
 
 What's the moral of the story here? Generality is good, but it seems in
several places in phobos (of which this is just one example), a combination of
vague specification and aiming for a nice ideal of "work with anything remotely
reasonable" has backfired into a morass of inconsistently defined and supported
corner cases.
 
 For this case in particular - I don't think we should support all types that
support some half-hearted form of subtyping, at the cost of reducing generality
and deprecating working code.
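
The const and rvalue points above can be checked in isolation, without Phobos. The following is only a sketch under default compiler settings (no rvalue-reference preview switch); `dropFirst`, `ByValue` and `ByRef` are hypothetical names, with `dropFirst` standing in for a primitive like popFront that must take its argument by reference:

```d
// Stand-in for a range primitive such as popFront: it mutates its
// argument, so it takes the string by reference.
void dropFirst(ref string s) { s = s[1 .. $]; }

// alias this to a getter that returns an rvalue: a coercion.
struct ByValue
{
    string s;
    string get() { return s; }
    alias get this;
}

// alias this to a getter that returns a reference to the stored string.
struct ByRef
{
    string s;
    ref string get() return { return s; }
    alias get this;
}

// A mutable ByValue converts to string...
static assert( is(ByValue : string));
// ...but a const one does not, because get() is not a const method.
static assert(!is(const(ByValue) : const(string)));

// The rvalue produced by ByValue.get() cannot bind to a ref parameter.
static assert(!__traits(compiles, { ByValue v; dropFirst(v); }));
// The ref-returning getter can, so the call modifies the wrapped string.
static assert( __traits(compiles, { ByRef r; dropFirst(r); }));

void main()
{
    auto r = ByRef("abc");
    dropFirst(r);
    assert(r.s == "bc");
}
```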
Mar 28
parent reply Meta <jared771 gmail.com> writes:
On Thursday, 28 March 2019 at 16:12:34 UTC, Andrei Alexandrescu 
wrote:
 On 3/28/19 10:49 AM, Meta wrote:
 On Thursday, 28 March 2019 at 07:32:33 UTC, Andrei 
 Alexandrescu wrote:
 On 3/28/19 1:36 AM, Meta wrote:
 DirEntry states that it is a subtype of `string`
No, please refer to the "Generality creep" thread.
Last time I checked, curt dismissal is not an argument.
Sorry. It was a pertinent response dressed as a curt dismissal. The opening post of the "Generality creep" is trivial to find and contains an explanation of exactly this matter. Walter posted a link to it, too (thanks). Here it is: https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html What remains is that I copy and paste it for you, which I'll do at the end of this message. I actually stopped reading your post following your assertion, because I inferred it would be increasingly wrong because it is based on a wrong assumption. Just now I went and read it, and indeed it is so.
Thanks, stopped reading your post here. I'll leave the discussion to those more qualified.
Mar 28
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/28/19 12:23 PM, Meta wrote:
 On Thursday, 28 March 2019 at 16:12:34 UTC, Andrei Alexandrescu wrote:
 On 3/28/19 10:49 AM, Meta wrote:
 On Thursday, 28 March 2019 at 07:32:33 UTC, Andrei Alexandrescu wrote:
 On 3/28/19 1:36 AM, Meta wrote:
 DirEntry states that it is a subtype of `string`
No, please refer to the "Generality creep" thread.
Last time I checked, curt dismissal is not an argument.
Sorry. It was a pertinent response dressed as a curt dismissal. The opening post of the "Generality creep" is trivial to find and contains an explanation of exactly this matter. Walter posted a link to it, too (thanks). Here it is: https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html What remains is that I copy and paste it for you, which I'll do at the end of this message. I actually stopped reading your post following your assertion, because I inferred it would be increasingly wrong because it is based on a wrong assumption. Just now I went and read it, and indeed it is so.
Thanks, stopped reading your post here. I'll leave the discussion to those more qualified.
No need to convert this into an exchange of broadsides. The appropriate response is as simple as "oh ok so that's not subtyping, interesting" - which was my reaction as well. It's a subtle matter but once seen it is clear.
Mar 28
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/28/19 12:23 PM, Meta wrote:
 On Thursday, 28 March 2019 at 16:12:34 UTC, Andrei Alexandrescu wrote:
 On 3/28/19 10:49 AM, Meta wrote:
 On Thursday, 28 March 2019 at 07:32:33 UTC, Andrei Alexandrescu wrote:
 On 3/28/19 1:36 AM, Meta wrote:
 DirEntry states that it is a subtype of `string`
No, please refer to the "Generality creep" thread.
Last time I checked, curt dismissal is not an argument.
Sorry. It was a pertinent response dressed as a curt dismissal. The opening post of the "Generality creep" is trivial to find and contains an explanation of exactly this matter. Walter posted a link to it, too (thanks). Here it is: https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html What remains is that I copy and paste it for you, which I'll do at the end of this message. I actually stopped reading your post following your assertion, because I inferred it would be increasingly wrong because it is based on a wrong assumption. Just now I went and read it, and indeed it is so.
Thanks, stopped reading your post here. I'll leave the discussion to those more qualified.
The author of this post wrote me a very nice private note in lieu of an unpleasant public response to my nasty way of handling this exchange. He refrained from posting in the spirit of "write letters to people you hate, then burn them" (as the joke goes, "okay, then what do I do to the letters?")

Anyhow, I wanted to apologize to Meta in the same place where the offense took place. I am sorry for the unkind tone I took.

An explanation (but in no way a justification) of this came from a friend: "The more sure you are of being right, the more insufferable you become."
Mar 30
parent Meta <jared771 gmail.com> writes:
On Saturday, 30 March 2019 at 22:28:32 UTC, Andrei Alexandrescu 
wrote:
 The author of this post wrote me a very nice private note in 
 lieu of an unpleasant public response to my nasty way of 
 handling this exchange. He refrained from posting in the spirit 
 of "write letters to people you hate, then burn them" (as the 
 joke goes, "okay, then what do I do to the letters?")

 Anyhow, I wanted to apologize to Meta in the same place where 
 the offense took place. I am sorry for the unkind tone I took.

 An explanation (but in no way a justification) of this came 
 from a friend: "The more sure you are of being right, the more 
 insufferable you become."
I appreciate the gesture Andrei, and have the utmost respect for people who are willing to apologize.
Apr 01
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
BTW the intended title was "More generality creep in Phobos". Turns out 
generality creep begets complexity creep, too...
Mar 28
next sibling parent David Gileadi <gileadisNOSPM gmail.com> writes:
On 3/28/19 5:38 AM, Andrei Alexandrescu wrote:
 BTW the intended title was "More generality creep in Phobos". Turns out 
 generality creep begets complexity creep, too...
Creepy!
Mar 28
prev sibling parent touchaa <yasiin.jakorey bullbeer.net> writes:
On Thursday, 28 March 2019 at 12:38:58 UTC, Andrei Alexandrescu 
wrote:
 BTW the intended title was "More generality creep in Phobos". 
 Turns out generality creep begets complexity creep, too...
DirEntry is implicitly convertible to string because it needs to be usable with std.file, which is a big untyped ball of strings.
May 02
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, March 27, 2019 8:17:35 PM MDT Andrei Alexandrescu via 
Digitalmars-d wrote:
 I started an alphabetical search through std modules, and as luck would
 have it I got as far as the first module: std/algorithm/comparison.d. In
 there, there's these overloads:

 size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
 (Range1 s, Range2 t)
 if (isForwardRange!(Range1) && isForwardRange!(Range2));

 size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
 (auto ref Range1 s, auto ref Range2 t)
 if (isConvertibleToString!Range1 || isConvertibleToString!Range2)

 (similar for levenshteinDistanceAndPath)

 What's with the second overload nonsense? The Levenshtein algorithm
 works on forward ranges. And that's about it, in a profound sense:
 Platonically what the algorithm needs is two forward ranges to operate.
 (We ought to be pretty proud of it, too: in all other languages I looked
 at, it's misimplemented to require random access.)

 The second overload comes from a warped sense of DWIM: well if this type
 converts to a string, we commit to support that too. Really. How about a
 struct that converts to an array? Should we put in a pull request to
 that, too? Where do we even stop?

 I hope there's not much more of this nonsense, because it all should be
 deprecated with fire.
This particular case is one that's a bit tricky. The correct solution with stuff like this IMHO is to require that you pass actual forward ranges and not types that convert to types that are forward ranges (like strings). The problem is that a lot of code like this was originally written to take strings and then generalized later, and when a function takes something like string or const(char)[], it's going to implicitly convert various types to the target type at the call site. In contrast, you get no such implicit conversion with a template constraint.

The result is that a bunch of functions in Phobos now use isConvertibleToString on a separate overload to then convert the argument to a string and pass it to the primary function. This is very, very wrong even if it's not obvious. The problem is that the conversion then happens inside the function instead of at the call site, which means that you get fun stuff like when a static array is passed in, it's sliced, and if the function returns any portion of the range it's given, then you're returning a slice of the stack that is then invalid.

The only way to make this work is to have two overloads - the primary one, and one that accepts arrays specifically. e.g.

    auto foo(R)(R range)
        if(isForwardRange!R && ...)
    {
        return fooImpl(range);
    }

    auto foo(C)(C[] str)
    {
        return fooImpl(str);
    }

    private auto fooImpl(R)(R range)
    {
    }

That way, the conversion happens at the call site like it did when the function accepted string. Similar problems exist in general with accepting implicit conversions with a template constraint, and doing so is generally not something that we should be doing unless it's for something really simple like bool. This is not at all a nice way to have to write this code, but it is required to avoid code breakage when making the function work on ranges rather than just strings.

isConvertibleToString _could_ be used internally with static if to avoid the additional overload, but again, it runs into the problem that the implicit conversion is done internally. So, to support the implicit conversion, a second overload is required.

The way that these functions _should_ have been written was to have them operate on ranges in the first place, and then they wouldn't accept any implicit conversions from stuff like enums, but many of these functions weren't. I don't know if that's quite the problem with levenshteinDistance, but it's definitely causing a lot of the ugly overloads in Phobos with anything string-related.

Unfortunately, I don't think that it's actually possible to just deprecate the overload that accepts implicit conversions, because that's also the overload that accepts strings. So, for many of these functions, I think that we're stuck with having two overloads unless we do something like std.v2 and have more of a hard breakage.

Either way, isConvertibleToString must go. Using it in a template constraint is just begging for bugs, because the implicit conversion won't happen at the call site like it needs to. I've done some work towards removing it from Phobos, but due to health issues, I haven't had enough time to finish the job.

- Jonathan M Davis
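
The difference between a string parameter and a template constraint can be sketched as follows. `PathLike`, `viaString` and `viaRange` are hypothetical names used only for illustration; the static asserts reflect default compiler settings, where the rvalue produced by alias this cannot be passed to popFront by reference:

```d
import std.range.primitives;

// Hypothetical type that converts to string through alias this.
struct PathLike
{
    string s;
    string name() { return s; }
    alias name this;
}

// Takes string: the implicit conversion happens at the call site.
size_t viaString(string s) { return s.length; }

// Takes any forward range: no implicit conversion is attempted.
size_t viaRange(R)(R r)
    if (isForwardRange!R)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

static assert( __traits(compiles, viaString(PathLike("abc"))));
static assert(!__traits(compiles, viaRange(PathLike("abc"))));

void main()
{
    auto p = PathLike("abc");
    assert(viaString(p) == 3);      // converted by the caller's context
    assert(viaRange(p.name) == 3);  // conversion written out explicitly
}
```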
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/30/19 3:22 PM, Jonathan M Davis wrote:
 Either way, isConvertibleToString must go.
Word. We should have one rcstring type that is reference counted, uses UTF8, is not a range, offers ranges (bytes, codepoints, graphemes), and consolidate around it. Algorithms should only use ranges, and whenever someone wants to use an algorithm with a string, they choose the iteration mode that fits the application.
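
No rcstring type exists in this thread's context, so the following is only a sketch of the "choose the iteration mode" idea using adaptors that Phobos already provides for built-in strings (std.utf.byCodeUnit, std.utf.byDchar, std.uni.byGrapheme). The three helper functions are hypothetical names:

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byDchar;

// Each helper counts elements under one explicit iteration mode.
size_t codeUnits(string s)  { return s.byCodeUnit.walkLength; }
size_t codePoints(string s) { return s.byDchar.walkLength; }
size_t graphemes(string s)  { return s.byGrapheme.walkLength; }

void main()
{
    // Three precomposed letters, two UTF-8 code units each.
    string precomposed = "\u00E4\u00F6\u00FC";
    assert(codeUnits(precomposed) == 6);
    assert(codePoints(precomposed) == 3);
    assert(graphemes(precomposed) == 3);

    // 'e' followed by a combining acute accent: one visible character.
    string combining = "e\u0301";
    assert(codeUnits(combining) == 3);
    assert(codePoints(combining) == 2);
    assert(graphemes(combining) == 1);
}
```

The same text gives three different lengths depending on the mode, which is why the choice is left to the caller.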
Mar 30
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 30, 2019 at 06:31:05PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 3/30/19 3:22 PM, Jonathan M Davis wrote:
 Either way, isConvertibleToString must go.
Word. We should have one rcstring type that is reference counted, uses UTF8, is not a range, offers ranges (bytes, codepoints, graphemes), and consolidate around it. Algorithms should only use ranges, and whenever someone wants to use an algorithm with a string, they choose the iteration mode that fits the application.
+1. That's the correct approach. If std.v2 ever happens, this is definitely the way to go.

T

-- 
Insanity is doing the same thing over and over again and expecting different results.
Apr 01
prev sibling parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Saturday, 30 March 2019 at 22:31:05 UTC, Andrei Alexandrescu 
wrote:
 We should have one rcstring type that is reference counted, 
 uses UTF8, is not a range, offers ranges (bytes, codepoints, 
 graphemes), and consolidate around it. Algorithms should only 
 use ranges, and whenever someone wants to use an algorithm with 
 a string, they choose the iteration mode that fits the 
 application.
Why?

Strings have the advantage of being extremely simple constructs that represent exactly the right abstraction: they're a slice of chars, period. They can be scoped, sliced and concatenated just like any other range.

Getting rid of auto-decoding would be good, but adding a whole new layer of abstraction and special cases on top of it, with a dedicated allocation strategy C++ style, seems superfluous at best.

Honestly, it sounds like the kind of thing that will make reference-counting the new class, where 10 years from now people will be told "Yeah, it would be more convenient to do things this way, but back when this standard library was written we were using reference-counting everywhere, and now we're stuck with it".

tl;dr Keep strings and reference-counting separate.
Apr 01
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/1/19 2:13 PM, Olivier FAURE wrote:
 tl;dr Keep strings and reference-counting separate.
You can't do that. New strings are created frequently by concatenation, they're best thought of as units - like int. They need to handle their own allocation, there's no two ways about it.

Autodecoding hurts "half" of the users - those who don't need it for the task. De-encapsulating string as an unstructured slice hurts the other "half" - those who need things like Unicode semantics, simple string manipulation, and good use of memory.

The view that strings must be char[] and that that's the "simple" choice has been a disaster for D. (I believe I have started to convince Walter of that.) It's simple like Go is simple - forcing complexity realities on users. That disaster is incomparably larger than autodecoding. It has forced virtually all string-processing code in D to use the garbage collector.

Immutable was a shot of adrenaline that prolonged the lifetime of strings by ten years. Before that, having unstructured strings that were also weirdly mutable was a double disaster.

We must have a UTF8 reference counted string that assists built-in literals of type immutable(char)[] and rally the entire language ecosystem around those.
Apr 01
prev sibling parent FeepingCreature <feepingcreature gmail.com> writes:
On Monday, 1 April 2019 at 18:13:55 UTC, Olivier FAURE wrote:
 Why?

 Strings have the advantage of being extremely simple constructs 
 that represent exactly the right abstraction: they're a slice 
 of chars, period. They can be scoped, sliced and concatenated 
 just like any other range.
Well, you see, strings have the disadvantage of being constructs that represent exactly the wrong abstraction: they're a slice of chars, period.

To see the problem with this, consider:

    writefln(
        "äöü are %s letters; the second one is %s",
        "äöü".length, cast(ubyte[]) "äöü"[1..2]);
May 03