digitalmars.D.learn - std.file.dirEntries unsorted
- Timothee Cour (6/6) Dec 10 2013 dirEntries depends on readdir, which has undefined order (eg:
- Jesse Phillips (10/18) Dec 10 2013 It should only be documented. In my experience processing files
- Marco Leise (7/17) Dec 11 2013 Does 2.jpg come after 10.jpg ? What's the order of
- Timothee Cour (8/25) Dec 11 2013 yes, I agree sorting should be explicit as there's no natural order.
- Jesse Phillips (10/21) Dec 11 2013 Why is it too late, the file name includes the full path so
- Jonathan M Davis (13/20) Dec 11 2013 You can use SpanMode.shallow if you just want to look at a directory at ...
dirEntries depends on readdir, which has undefined order (e.g. http://stackoverflow.com/questions/8977441/does-readdir-guarantee-an-order; I have also seen dirEntries return entries in non-alphabetical order). Shouldn't we make dirEntries return entries in alphabetical order by default, with an option to return them in unspecified native order for efficiency? At the very least, the documentation should state that the order is undefined.
Dec 10 2013
On Wednesday, 11 December 2013 at 02:11:51 UTC, Timothee Cour wrote:
> dirEntries depends on readdir, which has undefined order (eg: http://stackoverflow.com/questions/8977441/does-readdir-guarantee-an-order, and I've experienced as well dirEntries in non-alphabetical order) shouldn't we make dirEntries return in alphabetical order by default, with an option to return in unspecified native order for efficiency? at least, it should be specified in doc that order is undefined.

It should only be documented. In my experience, processing files doesn't usually need a particular order, and when sorting is needed it may not be by name.

Returning a sorted directory is difficult to define. Should directories come first or be mixed with the files? Is uppercase grouped together? Does A come before a? Should the extension be included or postponed for later?

I think sorting should be explicit.
Dec 10 2013
On Wed, 11 Dec 2013 08:00:27 +0100, "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> wrote:
> It should only be documented. In my experience processing files don't need a particular order and sorting may not be needed by name.
>
> Returning a sorted directory is difficult to define, should directories come first or be mixed with the files. Is uppercase grouped together? Does A come before a. Should the extension be included or postponed for later.
>
> I think sorting should be explicit.

Does 2.jpg come after 10.jpg? What's the order of Arabic-Indic "one" ۱ compared to "Latin one" 1? And so on and so forth.

--
Marco
Dec 11 2013
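To make the ambiguity above concrete, here is a minimal sketch of two plausible "alphabetical" orders for the same names. The file names are made up for illustration; the only library calls assumed are std.algorithm.sort and std.uni.icmp.

```d
import std.algorithm : sort;
import std.stdio : writeln;
import std.uni : icmp;

void main()
{
    // Hypothetical file names, chosen only to show the ordering questions.
    auto names = ["2.jpg", "10.jpg", "a.txt", "B.txt"];

    // Default comparison is by code unit: digits are compared character by
    // character, and uppercase ASCII sorts before lowercase.
    names.sort();
    writeln(names); // ["10.jpg", "2.jpg", "B.txt", "a.txt"]

    // A case-insensitive comparison changes the letter order but still
    // puts "10.jpg" before "2.jpg".
    names.sort!((a, b) => icmp(a, b) < 0);
    writeln(names); // ["10.jpg", "2.jpg", "a.txt", "B.txt"]
}
```

Neither result is the numeric order a user might expect for the .jpg files, which is one reason a single default order is hard to pick.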
Yes, I agree sorting should be explicit, as there's no natural order. However, sorting after calling dirEntries is not great: typically one wants to sort within a given directory level, and it's too late to sort once all the directory levels are flattened.

So how about having an extra argument in dirEntries that takes a lambda (e.g. binaryFun!"a<b"), or an additional function in std.file that takes such a lambda?

On Wed, Dec 11, 2013 at 2:31 AM, Marco Leise <Marco.Leise gmx.de> wrote:
> On Wed, 11 Dec 2013 08:00:27 +0100, "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> wrote:
>> It should only be documented. In my experience processing files don't need a particular order and sorting may not be needed by name.
>>
>> Returning a sorted directory is difficult to define, should directories come first or be mixed with the files. Is uppercase grouped together? Does A come before a. Should the extension be included or postponed for later.
>>
>> I think sorting should be explicit.
>
> Does 2.jpg come after 10.jpg? What's the order of Arabic-Indic "one" ۱ compared to "Latin one" 1? And so on and so forth.
>
> --
> Marco
Dec 11 2013
On Wednesday, 11 December 2013 at 18:34:54 UTC, Timothee Cour wrote:
> yes, I agree sorting should be explicit as there's no natural order. However sorting after calling dirEntries is not great as typically one wants to sort within a given directory level and it's too late to sort once all the directory levels are flattened. so how about having an extra argument that takes a lambda (eg binaryFun!"a<b") in dirEntries, or, having an additional function in std.file that takes such lambda.

Why is it too late? The file name includes the full path, so sorting will still sort sibling directories separately.

    foreach (de; dirEntries(".", SpanMode.depth).array.sort!((a, b) => a.name < b.name)) ...

This seems reasonable for your need, but I didn't test it to check the behavior.

dirEntries isn't random access, so we can't sort it directly. I don't think placing the sort in dirEntries saves much, and it would hide the required array allocation.
Dec 11 2013
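The approach described above (copy the lazy range into an array, then sort on the full path) could be sketched as follows. The helper name sortedPaths is hypothetical and not part of std.file; the comparison is plain code-unit string order, which is an assumption about what "sorted" should mean here.

```d
import std.algorithm : map, sort;
import std.array : array;
import std.file : dirEntries, SpanMode;

/// Hypothetical helper: every path under `root`, sorted by plain string comparison.
string[] sortedPaths(string root)
{
    // dirEntries is lazy and not random access, so the names are copied
    // into an array first. This is the allocation the thread refers to.
    auto paths = dirEntries(root, SpanMode.depth).map!(e => e.name).array;
    paths.sort(); // code-unit order on the full path
    return paths;
}

void main()
{
    import std.stdio : writeln;

    foreach (p; sortedPaths("."))
        writeln(p);
}
```

Because every entry below a directory shares that directory's path as a prefix, the contents of one directory stay contiguous in the result. The directory's own entry is not necessarily adjacent to its contents, since characters such as '-' compare lower than the path separator '/'.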
On Wednesday, December 11, 2013 10:34:29 Timothee Cour wrote:
> yes, I agree sorting should be explicit as there's no natural order. However sorting after calling dirEntries is not great as typically one wants to sort within a given directory level and it's too late to sort once all the directory levels are flattened. so how about having an extra argument that takes a lambda (eg binaryFun!"a<b") in dirEntries, or, having an additional function in std.file that takes such lambda.

You can use SpanMode.shallow if you just want to look at one directory at a time, or, as Jesse points out, you could sort on the entire path.

Regardless, dirEntries can't sort for you. To do that it would have to allocate a container of some kind (be it an array or something else) to hold all of the entries and then sort them, whereas dirEntries is lazy and doesn't hold anything other than the info on where it currently is in the list of directories. The actual file list is held by the OS.

So you might as well just implement what you want on top of dirEntries. What you're asking for is already essentially a wrapper around dirEntries (one that would have to allocate on the heap, no less), so it really makes more sense for it to be done outside of dirEntries.

- Jonathan M Davis
Dec 11 2013
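A wrapper of the kind discussed above, sorting within each directory level on top of SpanMode.shallow, might look like the sketch below. walkSorted and its parameters are hypothetical names, not part of std.file, and the default predicate (plain comparison of names) is an assumption.

```d
import std.algorithm : sort;
import std.array : array;
import std.file : dirEntries, DirEntry, SpanMode;

/// Hypothetical helper, not part of std.file: visits `root` recursively,
/// sorting the entries of each directory with `less` before descending.
void walkSorted(alias less = "a.name < b.name")
               (string root, scope void delegate(DirEntry) visit)
{
    // One array per directory level: this is the heap allocation that
    // dirEntries itself avoids by staying lazy.
    auto level = dirEntries(root, SpanMode.shallow).array;
    level.sort!less;

    foreach (e; level)
    {
        visit(e);
        // Skip symlinked directories so a link cycle cannot recurse forever.
        if (e.isDir && !e.isSymlink)
            walkSorted!less(e.name, visit);
    }
}

void main()
{
    import std.stdio : writeln;

    walkSorted(".", delegate(DirEntry e) { writeln(e.name); });
}
```

A caller wanting another order would pass its own predicate, for example walkSorted!((a, b) => a.size < b.size)(".", visit). The allocation stays visible in the wrapper, which matches the argument for keeping this outside dirEntries.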