
digitalmars.D.learn - Fast removal of character

Johan Engelen <j j.nl> writes:
std.string.removechars is now deprecated.
https://dlang.org/changelog/2.075.0.html#pattern-deprecate

What is now the most efficient way to remove characters from a 
string, if only one type of character needs to be removed?

```
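import std.algorithm : filter;
import std.conv : to;
import std.string : removechars; // deprecated as of 2.075

// old
auto old(string s) {
    return s.removechars(",").to!int;
}

// new?
auto newnew(string s) {
    return s.filter!(a => a != ',').to!int;
}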
```

cheers,
    Johan
Oct 11 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
Well, in general, I'd guess that the fastest way to remove all instances of a character from a string would be std.array.replace with the replacement being the empty string, but if you're feeding it to std.conv.to rather than really using the resultant string, then filter probably is faster, because it won't allocate. Really though, you'd have to test for your use case and see how fast a given solution is.

- Jonathan M Davis
Oct 11 2017
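
For readers who want something concrete to benchmark, here is a minimal sketch of the two approaches compared in the reply above. The function names `viaReplace` and `viaFilter` are illustrative and do not come from the thread, and no claim is made here about which one is faster on a given input.

```d
import std.algorithm.iteration : filter;
import std.array : replace;
import std.conv : to;

// Builds a new string with every "," removed, then parses that string.
int viaReplace(string s)
{
    return s.replace(",", "").to!int;
}

// Parses straight from the lazily filtered range, with no intermediate string.
int viaFilter(string s)
{
    return s.filter!(c => c != ',').to!int;
}

unittest
{
    assert(viaReplace("1,234,567") == 1_234_567);
    assert(viaFilter("1,234,567") == 1_234_567);
}
```

Compiling with `-unittest -main` runs the checks; timing the two functions on representative input is left to the reader, as the reply suggests.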
Johan Engelen <j j.nl> writes:
On Wednesday, 11 October 2017 at 22:45:14 UTC, Jonathan M Davis 
wrote:
> Well, in general, I'd guess that the fastest way to remove all instances of a character from a string would be std.array.replace with the replacement being the empty string,

Is that optimized for empty replacement?

> but if you're feeding it to std.conv.to rather than really
> using the resultant string, then filter probably is faster,
> because it won't allocate. Really though, you'd have to test
> for your use case and see how fast a given solution is.

Yeah :(

I am disappointed to see functions being deprecated, without an extensive documentation of how to rewrite them for different usage of the deprecated function. It makes me feel that no deep thought went into removing them (perhaps there was, I can't tell).

One has to go and browse through the different version _release notes_ to find any documentation on how to rewrite them. It would have been much better to add it (as well) to the deprecated function documentation.

I have the same problem for std.string.squeeze. The release notes only say how to rewrite the `squeeze()` case, but not the `squeeze("_")` use case. I guess `uniq!("a=='_' && a == b")`? Great improvement?

- Johan
Oct 11 2017
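
As a rough check of the `uniq` guess in the post above, the following sketch collapses runs of `_` and leaves other repeated characters alone. The helper name `squeezeUnderscores` is hypothetical, the predicate is written as a lambda rather than the string form used in the post, and this has only been compared against the old `squeeze("_")` behaviour for the simple inputs shown.

```d
import std.algorithm.iteration : uniq;
import std.array : array;
import std.conv : to;

// Treats two neighbouring elements as "equal" only when both are '_',
// so only runs of underscores are collapsed.
string squeezeUnderscores(string s)
{
    return s.uniq!((a, b) => a == '_' && a == b).array.to!string;
}

unittest
{
    assert(squeezeUnderscores("a__b___c") == "a_b_c");
    assert(squeezeUnderscores("aabb__") == "aabb_");
}
```

Note that this version decodes the string and allocates both a `dchar[]` and the result string, so it is meant as a behavioural reference rather than a fast replacement.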
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/11/17 7:06 PM, Johan Engelen wrote:
> On Wednesday, 11 October 2017 at 22:45:14 UTC, Jonathan M Davis wrote:
>> Well, in general, I'd guess that the fastest way to remove all instances of a character from a string would be std.array.replace with the replacement being the empty string,
>
> Is that optimized for empty replacement?
>
>> but if you're feeding it to std.conv.to rather than really using the
>> resultant string, then filter probably is faster, because it won't
>> allocate. Really though, you'd have to test for your use case and see
>> how fast a given solution is.

Performance-wise, I would use neither, as both autodecode (removechars by using the opaque/slow foreach decoding) and re-encode. That cost is especially wasteful if you are removing a single ASCII char.

> I am disappointed to see functions being deprecated, without an
> extensive documentation of how to rewrite them for different usage of
> the deprecated function. It makes me feel that no deep thought went into
> removing them (perhaps there was, I can't tell).
>
> One has to go and browse through the different version _release notes_
> to find any documentation on how to rewrite them. It would have been
> much better to add it (as well) to the deprecated function documentation.

This should have been done. A deprecation in this manner where there is no exact path forward is confusing and unnecessary. If we are deprecating a function, the person deprecating the function should have had a replacement in mind. The only way the message could be worse is if it said "please use other functions in Phobos".

-Steve
Oct 11 2017
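
One way to act on the autodecoding point in the post above is to walk the string by UTF-8 code unit instead of by decoded character. The sketch below is an illustration under that assumption, not code from the thread: the name `removeAsciiChar` is hypothetical, it is only valid when the character to remove is ASCII, and unlike the `filter` version it allocates a new string.

```d
import std.array : appender;
import std.conv : to;

// Copies every code unit except `unwanted`. `foreach (char c; s)` iterates
// UTF-8 code units directly, so nothing is decoded to dchar or re-encoded.
string removeAsciiChar(string s, char unwanted)
{
    assert(unwanted < 0x80, "only valid for ASCII characters");
    auto result = appender!string();
    result.reserve(s.length); // the output is never longer than the input
    foreach (char c; s)
    {
        if (c != unwanted)
            result.put(c);
    }
    return result.data;
}

unittest
{
    assert(removeAsciiChar("1,234,567", ',') == "1234567");
    assert(removeAsciiChar("1,234,567", ',').to!int == 1_234_567);
}
```

Because every byte of a multi-byte UTF-8 sequence is 0x80 or above, comparing code units against an ASCII character cannot split a non-ASCII character.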
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, October 11, 2017 23:06:13 Johan Engelen via Digitalmars-d-learn wrote:
> I am disappointed to see functions being deprecated, without an
> extensive documentation of how to rewrite them for different
> usage of the deprecated function. It makes me feel that no deep
> thought went into removing them (perhaps there was, I can't tell).
>
> One has to go and browse through the different version _release
> notes_ to find any documentation on how to rewrite them. It would
> have been much better to add it (as well) to the deprecated
> function documentation.
>
> I have the same problem for std.string.squeeze. The release notes
> only say how to rewrite the `squeeze()` case, but not the
> `squeeze("_")` use case. I guess `uniq!("a=='_' && a == b")`?
> Great improvement?

Normally, when something is deprecated, replacing it is fairly straightforward, and documentation usually isn't needed at all (simply pointing someone to the new function generally suffices). Unfortunately, that isn't really the case here.

It was decided years ago that the pattern functions in std.string should be replaced by regex stuff, but no one ever did it. It was recently decided to just rip them out anyway, which I have very mixed feelings about, since I agree that they should go, but how to replace their functionality in your own code is not necessarily obvious. We certainly didn't provide functions that did the same thing but took regexes, which was originally the idea for what would replace them but was never implemented.

IIRC, the only reason that there's _any_ explanation is because the person who created the PR was pushed to create some examples. The way this was handled is not very typical of how deprecations are handled.

- Jonathan M Davis
Oct 11 2017
Laeeth Isharc <laeeth laeeth.com> writes:
There's always this: https://github.com/dlang/undeaD/blob/master/src/undead/string.d
Oct 12 2017