digitalmars.D.learn - path matching problem
- Charles Hixson (20/20) Nov 27 2012 Is there a better way to do this? (I want to find files that match any
- Joshua Niehus (17/39) Nov 27 2012 maybe this:?
- Charles Hixson (16/55) Nov 27 2012 That's a good approach, except that I want to step through the matching
- Joshua Niehus (20/24) Nov 27 2012 Ignorance...
- jerro (6/28) Nov 27 2012 You could replace the inner loop with something like:
- Charles Hixson (11/39) Nov 27 2012 std.algorithm seems to generally be running the match in the opposite
- jerro (20/32) Nov 27 2012 I don't understand what you mean with running the match in the
- Charles Hixson (13/45) Nov 28 2012 Thanks for the tutorial link, I'll give it a try. (Whee! A 182 page
- Philippe Sigaud (1/3) Nov 28 2012 Well, it *started* as a tutorial. Then people sent me code :)
- jerro (2/7) Nov 28 2012 It's documented here:
Is there a better way to do this? (I want to find files that match any of some extensions and don't match any of several other strings, or are not in some directories.):

    import std.file;
    ...
    string exts = "*.{txt,utf8,utf-8,TXT,UTF8,UTF-8}";
    string[] exclude = ["/template/", "biblio.txt", "categories.txt",
                        "subjects.txt", "/toCDROM/"];
    int limit = 1;
    // Iterate a directory in depth
    foreach (string name; dirEntries(sDir, exts, SpanMode.depth)) {
        bool excl = false;
        foreach (string part; exclude) {
            if (part in name) { excl = true; break; }
        }
        if (excl) break;
        etc.
Nov 27 2012
On Tuesday, 27 November 2012 at 19:40:56 UTC, Charles Hixson wrote:
> Is there a better way to do this? (I want to find files that match any of some extensions and don't match any of several other strings, or are not in some directories.):
>
>     import std.file;
>     ...
>     string exts = "*.{txt,utf8,utf-8,TXT,UTF8,UTF-8}";
>     string[] exclude = ["/template/", "biblio.txt", "categories.txt",
>                         "subjects.txt", "/toCDROM/"];
>     int limit = 1;
>     // Iterate a directory in depth
>     foreach (string name; dirEntries(sDir, exts, SpanMode.depth)) {
>         bool excl = false;
>         foreach (string part; exclude) {
>             if (part in name) { excl = true; break; }
>         }
>         if (excl) break;
>         etc.

maybe this:?

    import std.algorithm, std.array, std.regex;
    import std.stdio, std.file;

    void main()
    {
        enum string[] exts = [`".txt"`, `".utf8"`, `".utf-8"`, `".TXT"`, `".UTF8"`, `".UTF-8"`];
        enum string exclude =
            `r"/template/|biblio\.txt|categories\.txt|subjects\.txt|/toCDROM/"`;

        auto x = dirEntries("/path", SpanMode.depth)
            .filter!(`endsWith(a.name,` ~ exts.join(",") ~ `)`)
            .filter!(`std.regex.match(a.name,` ~ exclude ~ `).empty`);

        writeln(x);
    }
Nov 27 2012
On 11/27/2012 01:31 PM, Joshua Niehus wrote:
> On Tuesday, 27 November 2012 at 19:40:56 UTC, Charles Hixson wrote:
>> Is there a better way to do this? (I want to find files that match any of some extensions and don't match any of several other strings, or are not in some directories.):
>>
>>     import std.file;
>>     ...
>>     string exts = "*.{txt,utf8,utf-8,TXT,UTF8,UTF-8}";
>>     string[] exclude = ["/template/", "biblio.txt", "categories.txt",
>>                         "subjects.txt", "/toCDROM/"];
>>     int limit = 1;
>>     // Iterate a directory in depth
>>     foreach (string name; dirEntries(sDir, exts, SpanMode.depth)) {
>>         bool excl = false;
>>         foreach (string part; exclude) {
>>             if (part in name) { excl = true; break; }
>>         }
>>         if (excl) break;
>>         etc.
>
> maybe this:?
>
>     import std.algorithm, std.array, std.regex;
>     import std.stdio, std.file;
>
>     void main()
>     {
>         enum string[] exts = [`".txt"`, `".utf8"`, `".utf-8"`, `".TXT"`, `".UTF8"`, `".UTF-8"`];
>         enum string exclude =
>             `r"/template/|biblio\.txt|categories\.txt|subjects\.txt|/toCDROM/"`;
>
>         auto x = dirEntries("/path", SpanMode.depth)
>             .filter!(`endsWith(a.name,` ~ exts.join(",") ~ `)`)
>             .filter!(`std.regex.match(a.name,` ~ exclude ~ `).empty`);
>
>         writeln(x);
>     }

That's a good approach, except that I want to step through the matching paths rather than accumulate them in a vector... though the filter documentation could mean that it returns an iterator. So I could replace

    writeln(x);

by

    foreach (string name; x) { ... }

and x wouldn't have to hold all the matching strings at the same time.

But why the chained filters, rather than using the option provided by dirEntries for one of them? Is it faster? Just the way you usually do things? (Which I accept as a legitimate answer. I can see that that approach would be more flexible.)
Nov 27 2012
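On the question of whether filter accumulates its results: std.algorithm's filter returns a lazy range, so a foreach over it pulls one element at a time. A minimal sketch, assuming sample paths in place of real dirEntries output; the helper name withoutTemplates is made up for illustration and does not come from the thread:

```d
import std.algorithm : canFind, equal, filter;
import std.stdio : writeln;

// Hypothetical helper: lazily keeps the names that do not contain "/template/".
auto withoutTemplates(R)(R names)
{
    return names.filter!(n => !n.canFind("/template/"));
}

void main()
{
    // Sample paths standing in for what dirEntries would yield.
    auto names = ["/a/x.txt", "/a/template/y.txt", "/b/z.txt"];

    // No element is tested until the loop asks for the next one,
    // so the matches are never stored together in an array.
    foreach (name; withoutTemplates(names))
        writeln(name);
}
```

The same foreach works when the input is a dirEntries range instead of an array, since filter only needs an input range.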
On Tuesday, 27 November 2012 at 23:43:43 UTC, Charles Hixson wrote:
> But why the chained filters, rather than using the option provided by dirEntries for one of them? Is it faster? Just the way you usually do things? (Which I accept as a legitimate answer. I can see that that approach would be more flexible.)

Ignorance... You're right, I didn't realize that dirEntries had that filter option; you should use that. I doubt the double .filter would affect performance at all (might even slow it down for all I know :)

//update:

    import std.algorithm, std.array, std.regex;
    import std.stdio, std.file;

    void main()
    {
        string exts = "*.{txt,utf8,utf-8,TXT,UTF8,UTF-8}";
        enum string exclude =
            `r"/template/|biblio\.txt|categories\.txt|subjects\.txt|/toCDROM/"`;

        dirEntries("/path", exts, SpanMode.depth)
            .filter!(` std.regex.match(a.name,` ~ exclude ~ `).empty `)
            .writeln();
    }
Nov 27 2012
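The same idea can be written with a lambda instead of a string predicate, which avoids building D source code out of strings. This is only a sketch under assumptions: it uses matchFirst from std.regex, which was added to Phobos after this thread was written, the helper isWanted is a made-up name, and "." is a placeholder for the real root directory.

```d
import std.algorithm : filter;
import std.file : dirEntries, SpanMode;
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: true if the path matches none of the excluded fragments.
bool isWanted(string path, Regex!char excluded)
{
    return matchFirst(path, excluded).empty;
}

void main()
{
    // Same exclusion pattern as in the post above.
    auto excluded = regex(`/template/|biblio\.txt|categories\.txt|subjects\.txt|/toCDROM/`);

    // The glob handles the extensions, the lambda handles the exclusions.
    // "." is a placeholder; substitute the directory to scan.
    auto wanted = dirEntries(".", "*.{txt,utf8,utf-8,TXT,UTF8,UTF-8}", SpanMode.depth)
        .filter!(e => isWanted(e.name, excluded));

    foreach (entry; wanted)
        writeln(entry.name);
}
```

Note that the pattern assumes forward slashes in the paths, as the original exclude list does.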
On Tuesday, 27 November 2012 at 19:40:56 UTC, Charles Hixson wrote:
> Is there a better way to do this? (I want to find files that match any of some extensions and don't match any of several other strings, or are not in some directories.):
>
>     import std.file;
>     ...
>     string exts = "*.{txt,utf8,utf-8,TXT,UTF8,UTF-8}";
>     string[] exclude = ["/template/", "biblio.txt", "categories.txt",
>                         "subjects.txt", "/toCDROM/"];
>     int limit = 1;
>     // Iterate a directory in depth
>     foreach (string name; dirEntries(sDir, exts, SpanMode.depth)) {
>         bool excl = false;
>         foreach (string part; exclude) {
>             if (part in name) { excl = true; break; }
>         }
>         if (excl) break;
>         etc.

You could replace the inner loop with something like:

    bool excl = exclude.any!(part => name.canFind(part));

There may be an even easier way to do it; take a look at std.algorithm.
Nov 27 2012
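For reference, here is how that one-liner might sit inside the loop from the original question. It is a sketch only: the sample paths stand in for dirEntries results, the helper isExcluded is a name invented here, and it uses continue to skip an excluded file, on the assumption that the loop is meant to keep going rather than stop at the first excluded name.

```d
import std.algorithm : any, canFind;
import std.stdio : writeln;

// Hypothetical helper: true if name contains any of the excluded fragments.
bool isExcluded(string name, string[] exclude)
{
    return exclude.any!(part => name.canFind(part));
}

void main()
{
    string[] exclude = ["/template/", "biblio.txt", "categories.txt",
                        "subjects.txt", "/toCDROM/"];

    // Sample paths standing in for the dirEntries results.
    foreach (name; ["/docs/a.txt", "/docs/template/b.txt", "/docs/biblio.txt"])
    {
        if (isExcluded(name, exclude))
            continue; // skip this file but keep iterating
        writeln(name);
    }
}
```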
On 11/27/2012 01:34 PM, jerro wrote:
> On Tuesday, 27 November 2012 at 19:40:56 UTC, Charles Hixson wrote:
>> Is there a better way to do this? (I want to find files that match any of some extensions and don't match any of several other strings, or are not in some directories.):
>>
>>     import std.file;
>>     ...
>>     string exts = "*.{txt,utf8,utf-8,TXT,UTF8,UTF-8}";
>>     string[] exclude = ["/template/", "biblio.txt", "categories.txt",
>>                         "subjects.txt", "/toCDROM/"];
>>     int limit = 1;
>>     // Iterate a directory in depth
>>     foreach (string name; dirEntries(sDir, exts, SpanMode.depth)) {
>>         bool excl = false;
>>         foreach (string part; exclude) {
>>             if (part in name) { excl = true; break; }
>>         }
>>         if (excl) break;
>>         etc.
>
> You could replace the inner loop with something like:
>
>     bool excl = exclude.any!(part => name.canFind(part));
>
> There may be an even easier way to do it; take a look at std.algorithm.

std.algorithm seems to generally be running the match in the opposite direction, if I'm understanding it properly. (Dealing with D templates is always confusing to me.) OTOH, I couldn't find the string any method, so I'm not really sure what you're proposing, though it does look attractive.

Still, though your basic approach sounds good, the suggestion of Joshua Niehus would let me filter out the strings that didn't fit before entering the loop. There's probably no real advantage to doing it that way, but it does seem more elegant. (You were right, though. That is in std.algorithm.)
Nov 27 2012
>> You could replace the inner loop with something like:
>>
>>     bool excl = exclude.any!(part => name.canFind(part));
>
> std.algorithm seems to generally be running the match in the opposite direction, if I'm understanding it properly. (Dealing with D templates is always confusing to me.) OTOH, I couldn't find the string any method, so I'm not really sure what you're proposing, though it does look attractive.

I don't understand what you mean by running the match in the opposite direction, but I'll explain how my line of code works. First of all, it is equivalent to:

    any!(part => canFind(name, part))(exclude);

The feature that lets you write it the way I did in my previous post is called uniform function call syntax (often abbreviated to UFCS) and is described at http://www.drdobbs.com/cpp/uniform-function-call-syntax/232700394.

canFind(name, part) returns true if name contains part.

(part => canFind(name, part)) is a short syntax for (part){ return canFind(name, part); }

any!(condition)(range) returns true if condition is true for any element of range.

So the line of code in my previous post sets excl to true if name contains any of the strings in exclude.

If you know all the strings you want to exclude in advance, it is easier to do that with a regex like Joshua did.

If you want to learn about D templates, try this tutorial: https://github.com/PhilippeSigaud/D-templates-tutorial/blob/master/dtemplates.pdf?raw=true

> Still, though your basic approach sounds good, the suggestion of Joshua Niehus would let me filter out the strings that didn't fit before entering the loop. There's probably no real advantage to doing it that way, but it does seem more elegant.

I agree, it is more elegant.
Nov 27 2012
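The three spellings described in the post above can be checked side by side. A small sketch, in which the function name allForms and the sample path are invented for illustration:

```d
import std.algorithm : any, canFind;
import std.stdio : writeln;

// Hypothetical helper: evaluates the same test in three equivalent spellings.
bool[3] allForms(string name, string[] exclude)
{
    // UFCS form: the first argument is written in front of the dot.
    bool ufcs = exclude.any!(part => name.canFind(part));

    // Ordinary call form of the same expression.
    bool plain = any!(part => canFind(name, part))(exclude);

    // The short lambda written out as a function literal with a body.
    bool longhand = any!((string part) { return canFind(name, part); })(exclude);

    return [ufcs, plain, longhand];
}

void main()
{
    // All three entries should agree for any input.
    writeln(allForms("/data/toCDROM/readme.txt", ["/template/", "/toCDROM/"]));
}
```

In each form, any is a template from std.algorithm that takes the predicate as a compile-time argument after the exclamation mark and the range as a run-time argument.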
On 11/27/2012 06:45 PM, jerro wrote:
>>> You could replace the inner loop with something like:
>>>
>>>     bool excl = exclude.any!(part => name.canFind(part));
>>
>> std.algorithm seems to generally be running the match in the opposite direction, if I'm understanding it properly. (Dealing with D templates is always confusing to me.) OTOH, I couldn't find the string any method, so I'm not really sure what you're proposing, though it does look attractive.
>
> I don't understand what you mean by running the match in the opposite direction, but I'll explain how my line of code works. First of all, it is equivalent to:
>
>     any!(part => canFind(name, part))(exclude);
>
> The feature that lets you write it the way I did in my previous post is called uniform function call syntax (often abbreviated to UFCS) and is described at http://www.drdobbs.com/cpp/uniform-function-call-syntax/232700394.
>
> canFind(name, part) returns true if name contains part.
>
> (part => canFind(name, part)) is a short syntax for (part){ return canFind(name, part); }
>
> any!(condition)(range) returns true if condition is true for any element of range.
>
> So the line of code in my previous post sets excl to true if name contains any of the strings in exclude.
>
> If you know all the strings you want to exclude in advance, it is easier to do that with a regex like Joshua did.
>
> If you want to learn about D templates, try this tutorial: https://github.com/PhilippeSigaud/D-templates-tutorial/blob/master/dtemplates.pdf?raw=true
>
>> Still, though your basic approach sounds good, the suggestion of Joshua Niehus would let me filter out the strings that didn't fit before entering the loop. There's probably no real advantage to doing it that way, but it does seem more elegant.
>
> I agree, it is more elegant.

Thanks for the tutorial link, I'll give it a try. (Whee! A 182 page tutorial!) Those things, though, don't seem to stick in my mind. I learned programming in FORTRAN IV, and I don't seem to be able to force either templates, Scheme, or Haskell into my way of thinking about programming. (Interestingly, classes and structured programming fit without problems.)

The link to the Walter article in Dr. Dobb's is interesting. I intend to read it first.

OTOH, I still don't know where "any" is documented. It's clearly some sort of template instantiation, but it doesn't seem to be defined in either std.string or std.object (or anywhere else I've thought to check). And it looks as if it would be something very useful to know.
Nov 28 2012
> Thanks for the tutorial link, I'll give it a try. (Whee! A 182 page tutorial!)

Well, it *started* as a tutorial. Then people sent me code :)
Nov 28 2012
> OTOH, I still don't know where "any" is documented. It's clearly some sort of template instantiation, but it doesn't seem to be defined in either std.string or std.object (or anywhere else I've thought to check). And it looks as if it would be something very useful to know.

It's documented here: http://dlang.org/phobos/std_algorithm.html#any
Nov 28 2012