digitalmars.D - std.path review: update
- Lars T. Kyllingstad (29/29) Jul 17 2011 Based on your comments, I have made some changes to my std.path
- bearophile (4/5) Jul 17 2011 compatibleStrings is a template still.
- Lars T. Kyllingstad (9/14) Jul 18 2011 I know. Sorry, forgot to mention that. For now, I'd like to keep it th...
- Jonathan M Davis (9/21) Jul 18 2011 And it _should_ be a template. All of the stuff like that are templates....
- bearophile (25/32) Jul 18 2011 This seems to work:
- Jonathan M Davis (12/52) Jul 18 2011 Okay. Yes, you could do that. But what you're doing is basically the sam...
- bearophile (4/14) Jul 18 2011 The gain of my version is that it doesn't generate tons of templates. Fr...
- Andrej Mitrovic (10/12) Jul 17 2011 Actually I withdraw that feature request. Some tools will work with
- Lars T. Kyllingstad (3/10) Jul 18 2011 Noted. :)
- Steven Schveighoffer (6/19) Jul 18 2011 Hum... I wonder if normalize should do this...
- Lars T. Kyllingstad (11/36) Jul 18 2011 normalize does this on Windows, where '/' is also a directory separator,...
- Steven Schveighoffer (13/47) Jul 18 2011 OK, this is what I meant. By canonical path, I mean I should be able to...
- Jesse Phillips (7/15) Jul 17 2011 I'm not sure my opinion on this. It seems like a useful idea, but as
- Brian Schott (3/3) Jul 17 2011 The documentation comments for driveName say that the return value will
- Jonathan M Davis (13/16) Jul 17 2011 The fun part with that is that "" == null and a null string is empty per...
- Lars T. Kyllingstad (4/22) Jul 18 2011 Pending a decision on the null vs. empty issue, I have now standardised
- torhu (17/38) Jul 18 2011 I'd like to make a case for null as the 'nothing here' value.
- Lars T. Kyllingstad (8/52) Jul 18 2011 True, but the question was not whether one should use null or "" for the...
- torhu (7/31) Jul 18 2011 I meant to imply that null and empty should not be used to mean two
- Jonathan M Davis (17/49) Jul 18 2011 There are definitely situations where it is valuable to differentiate be...
- Andrei Alexandrescu (7/37) Jul 18 2011 Note that there are two aspects: generating 'nothing here' values, and
- Lars T. Kyllingstad (17/57) Jul 18 2011 Some have argued that there is an extra dimension to this, namely the
- Vladimir Panteleev (6/8) Jul 18 2011 Is it still an implementation detail if it's documented behavior?
- Vladimir Panteleev (1/1) Jul 18 2011 Sorry, I thought you meant the old getExt().
- Steven Schveighoffer (12/48) Jul 18 2011 The one that's kind of nice is the if(path.extension), which reads not
- Lars T. Kyllingstad (4/11) Jul 18 2011 It seems I forgot about the CTFEability tests. I'll fix that too, and
- Lars T. Kyllingstad (6/18) Jul 18 2011 Done. Most functions were CTFEable without any modifications (thanks,
- Steven Schveighoffer (27/31) Jul 18 2011 This is a review of the docs/design. I'll review the code separately:
- Lars T. Kyllingstad (30/75) Jul 18 2011 Oops. Thanks!
- Jonathan M Davis (7/97) Jul 18 2011 I suggest that you do what I did in std.file (e.g. with getTimesWin). I ...
- Steven Schveighoffer (53/106) Jul 18 2011 It is and it isn't. It's *not* a normal directory, because only shares ...
- Lars T. Kyllingstad (27/162) Jul 20 2011 Then driveName() should probably return the full share path. But, of th...
- Steven Schveighoffer (15/104) Jul 20 2011 It is in that if you open explorer and type in \\servername, it will giv...
- Lars T. Kyllingstad (6/9) Jul 20 2011 Any .NET programmers out there? Can you please tell me what the
- Jussi Jumppanen (36/42) Jul 21 2011 This code:
- Lars T. Kyllingstad (13/61) Jul 21 2011 Thanks, this is very helpful. Now we know that MS's APIs treat \\foo\ba...
- Rainer Schuetze (7/10) Jul 21 2011 If that's true for the bare open() without going through possible
- Lars T. Kyllingstad (4/17) Jul 21 2011 All right, I'll remove "//path" support again. That simplifies things
- Lars T. Kyllingstad (6/112) Jul 20 2011 Actually, I realise now, it doesn't. :) Since joinPath/buildPath needs
- Rainer Schuetze (5/36) Jul 21 2011 I like the direction that this is heading. If the idea gets extended to
- Nick Sabalausky (16/41) Jul 19 2011 I don't know whether or not it's "never" a valid path, but "dir \\server...
- Andrej Mitrovic (2/2) Jul 19 2011 Here's some relevant info:
- Steven Schveighoffer (21/32) Jul 20 2011 I've done it before, mounted a windows share on a linux box via cifs.
- Lars T. Kyllingstad (9/66) Jul 20 2011 That check would probably be orders of magnitude more expensive than a
- Alix Pexton (6/11) Jul 20 2011 Wikipedia says Windows long file names are up to 255 UTF-16 characters
- Lars T. Kyllingstad (3/17) Jul 21 2011 Thanks! In other words, fcmp() needs to do UTF-16 decoding...
- Rainer Schuetze (5/9) Jul 20 2011 I just tried a few examples: Using umlauts works as expected, i.e. upper...
- torhu (6/10) Jul 21 2011 I guess you've already thought about this, but one solution is to just
Based on your comments, I have made some changes to my std.path proposal. A list of the changes I have made can be found at the following address (look at the commits dated 2011-07-17):

https://github.com/kyllingstad/phobos/commits/std-path

I believe I have covered most of your requests, with a few exceptions:

Firstly, Jonathan argued very convincingly that the contents of the current std.path should be put back in, marked as "scheduled for deprecation". I intend to do this when the review is over, if my submission gets accepted. For now, ignore the bottommost deprecated: block.

Secondly, David and Jonathan suggested I optimise functions like setExtension() using ~= to append when possible. I have tried doing so for setExtension(), and I'm not convinced the extra complexity is worth the relatively modest gain. The specialised, optimised version can be found here:

https://github.com/kyllingstad/phobos/blob/std-path/std/path.d#L529

Finally, there are some requests with which I don't personally agree. Therefore, I'd like to get more opinions before making any changes:

- Should I add toNativePath(), which replaces '/' with '\' on Windows and vice versa on POSIX?

- Should it be specified/documented whether a function returns "" or null? Specifically, is it important that

    extension("foo") is null
    extension("foo.") !is null && extension("foo.") == ""

- Do people agree with Jonathan's views on function names?

As before, code and docs can be found here:

https://github.com/kyllingstad/phobos/blob/std-path/std/path.d
http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

-Lars
Jul 17 2011
Lars T. Kyllingstad:

> I believe I have covered most of your requests, with a few exceptions:

compatibleStrings is a template still.

Bye,
bearophile
Jul 17 2011
On Sun, 17 Jul 2011 17:43:42 -0400, bearophile wrote:

> compatibleStrings is a template still.

I know. Sorry, forgot to mention that. For now, I'd like to keep it the way it is. I can't find any precedent in Phobos for turning these kinds of tests into CTFEable functions, and if compatibleStrings were to end up in std.traits, for instance, it would stand out as being different from everything else in there.

If it is decided that it is better to write these tests as ordinary functions, that should probably be done throughout Phobos.

-Lars
Jul 18 2011
On Monday 18 July 2011 09:35:17 Lars T. Kyllingstad wrote:

> I know. Sorry, forgot to mention that. For now, I'd like to keep it the way it is. I can't find any precedent in Phobos for turning these kinds of tests into CTFEable functions, and if compatibleStrings were to end up in std.traits, for instance, it would stand out as being different from everything else in there.
>
> If it is decided that it is better to write these tests as ordinary functions, that should probably be done throughout Phobos.

And it _should_ be a template. All of the stuff like that are templates. And I'm not even sure that it _can_ be a function. And even if it can, what would we gain by making it a function anyway? It's operating on types. It's of no use at runtime. It's a perfect candidate for an eponymous template. std.traits, std.range, etc. do this sort of thing in pretty much exactly the same way. There may be a cleaner way to write it than it currently is, but using an eponymous template like that is the correct thing to do.

- Jonathan M Davis
Jul 18 2011
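For readers unfamiliar with the idiom being debated, here is a minimal sketch of a recursive eponymous template that operates purely on types. The name sameCharType and its exact semantics are assumptions made for illustration; this is not the compatibleStrings from the proposal.

```d
import std.traits : isSomeString, Unqual;

// Hypothetical eponymous template: true only if every type in Strings is a
// string type and all of them share the same unqualified character type.
template sameCharType(Strings...)
{
    static if (Strings.length == 0)
        enum sameCharType = false;
    else static if (!isSomeString!(Strings[0]))
        enum sameCharType = false;
    else static if (Strings.length == 1)
        enum sameCharType = true;
    else static if (!isSomeString!(Strings[1]))
        enum sameCharType = false;
    else
        // Compare the first two character types, then recurse on the tail.
        enum sameCharType =
            is(Unqual!(typeof(Strings[0].init[0])) ==
               Unqual!(typeof(Strings[1].init[0])))
            && sameCharType!(Strings[1 .. $]);
}

static assert( sameCharType!(char[], const(char)[], string));
static assert(!sameCharType!(char[], wchar[]));
static assert(!sameCharType!(int[]));

void main() {}
```

Because the result is an enum, it can be used directly in a template constraint such as `if (sameCharType!(S1, S2))` without any function call.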
Jonathan M Davis:

> And it _should_ be a template. All of the stuff like that are templates. And I'm not even sure that it _can_ be a function. And even if it can, what would we gain by making it a function anyway? It's operating on types. It's of no use at runtime. It's a perfect candidate for an eponymous template. std.traits, std.range, etc. do this sort of thing in pretty much exactly the same way. There may be a cleaner way to write it than it currently is, but using an eponymous template like that is the correct thing to do.

This seems to work:

    import std.traits: isSomeChar, Unqual, isSomeString;

    bool compatibleStrings(Strings...)() if (Strings.length) {
        static if (isSomeString!(Strings[0])) {
            alias Unqual!(typeof(Strings[0].init[0])) TC;
            foreach (s; Strings[1 .. $])
                static if (isSomeString!s && !is(TC == Unqual!(typeof(s.init[0]))))
                    return false;
            return true;
        } else
            return false;
    }

    version (unittest) {
        static assert (compatibleStrings!(char[], const(char)[], string)());
        static assert (compatibleStrings!(wchar[], const(wchar)[], wstring)());
        static assert (compatibleStrings!(dchar[], const(dchar)[], dstring)());
        static assert (!compatibleStrings!(int[], const(int)[], immutable(int)[])());
        static assert (!compatibleStrings!(char[], wchar[])());
        static assert (!compatibleStrings!(char[], dstring)());
    }

    void main() {}

I have written tons of such things in dlibs1, and generally I have seen that recursive templates are slower and need more RAM than similar functions.

Bye,
bearophile
Jul 18 2011
On Monday 18 July 2011 06:28:50 bearophile wrote:

> This seems to work:
>
>     import std.traits: isSomeChar, Unqual, isSomeString;
>
>     bool compatibleStrings(Strings...)() if (Strings.length) {
>         static if (isSomeString!(Strings[0])) {
>             alias Unqual!(typeof(Strings[0].init[0])) TC;
>             foreach (s; Strings[1 .. $])
>                 static if (isSomeString!s && !is(TC == Unqual!(typeof(s.init[0]))))
>                     return false;
>             return true;
>         } else
>             return false;
>     }
>
>     version (unittest) {
>         static assert (compatibleStrings!(char[], const(char)[], string)());
>         static assert (compatibleStrings!(wchar[], const(wchar)[], wstring)());
>         static assert (compatibleStrings!(dchar[], const(dchar)[], dstring)());
>         static assert (!compatibleStrings!(int[], const(int)[], immutable(int)[])());
>         static assert (!compatibleStrings!(char[], wchar[])());
>         static assert (!compatibleStrings!(char[], dstring)());
>     }
>
>     void main() {}
>
> I have written tons of such things in dlibs1, and generally I have seen that recursive templates are slower and need more RAM than similar functions.

Okay. Yes, you could do that. But what you're doing is basically the same as the eponymous template except that it's saving the value in a function so that it can be called at runtime. The gain is 0 and potentially confusing. It's no better than

    bool compatibleStringsFunc(Strings...)()
    {
        enum retval = compatibleStrings!Strings;
        return retval;
    }

But you _did_ find a way to turn it into a function.

- Jonathan M Davis
Jul 18 2011
Jonathan M Davis:

> But what you're doing is basically the same as the eponymous template except that it's saving the value in a function so that it can be called at runtime. The gain is 0 and potentially confusing. It's no better than
>
>     bool compatibleStringsFunc(Strings...)()
>     {
>         enum retval = compatibleStrings!Strings;
>         return retval;
>     }

The gain of my version is that it doesn't generate tons of templates. From my experience such functions lead to faster compile times and less memory used by the compiler compared to using recursive templates. And for me a foreach is usually less confusing than recursive templates :-)

Bye,
bearophile
Jul 18 2011
On 7/17/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:

> - Should I add toNativePath(), which replaces '/' with '\' on Windows and vice versa on POSIX?

Actually I withdraw that feature request. Some tools will work with only forward slashes, others only backward slashes, but this is regardless of what platform they're on. E.g. some tools don't work with forward slashes, while GIT doesn't work with backward slashes when running on Windows.

I think .replace(r"\", "/") and .replace("/", r"\") are good enough, but maybe an alias to each version wouldn't be bad. E.g. "toForwardSlash" and "toBackslash". It's not hard to define this in our own code, so it's not really a feature request.
Jul 17 2011
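The two helpers suggested above could be sketched as follows. The function names come from the post; everything else, including the choice of std.array.replace, is an assumption for illustration and not part of the std.path proposal.

```d
import std.array : replace;

// Hypothetical helpers; they simply swap one separator for the other,
// regardless of the platform the program runs on.
string toForwardSlash(string path) { return path.replace(r"\", "/"); }
string toBackslash(string path)    { return path.replace("/", r"\"); }

void main()
{
    assert(toForwardSlash(r"C:\dir\file.txt") == "C:/dir/file.txt");
    assert(toBackslash("dir/sub/file.txt") == r"dir\sub\file.txt");
}
```

Note that on POSIX a backslash is an ordinary filename character, so calling toForwardSlash there can change which file a path refers to.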
On Mon, 18 Jul 2011 00:24:30 +0200, Andrej Mitrovic wrote:

> Actually I withdraw that feature request. Some tools will work with only forward slashes, others only backward slashes, but this is regardless of what platform they're on.

Noted. :)

-Lars
Jul 18 2011
On Sun, 17 Jul 2011 18:24:30 -0400, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:

> Actually I withdraw that feature request. Some tools will work with only forward slashes, others only backward slashes, but this is regardless of what platform they're on. E.g. some tools don't work with forward slashes, while GIT doesn't work with backward slashes when running on Windows.
>
> I think .replace(r"\", "/") and .replace("/", r"\") are good enough, but maybe an alias to each version wouldn't be bad. E.g. "toForwardSlash" and "toBackslash". It's not hard to define this in our own code, so it's not really a feature request.

Hum... I wonder if normalize should do this... Is normalize supposed to create a canonical path? If so, then this needs to happen.

-Steve
Jul 18 2011
On Mon, 18 Jul 2011 13:26:08 -0400, Steven Schveighoffer wrote:

> Hum... I wonder if normalize should do this... Is normalize supposed to create a canonical path? If so, then this needs to happen.

normalize does this on Windows, where '/' is also a directory separator, but not on POSIX, where '\' is an ordinary filename character.

I am not entirely sure what the exact definition of "canonical path" is, but according to some it entails resolving symlinks. normalize does not do this, but it does everything else:

- resolves . and .. to the extent possible
- collapses redundant directory separators
- changes '/' to '\' on Windows

-Lars
Jul 18 2011
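The string-only steps listed above can be sketched as follows for POSIX-style paths. This is an illustration under stated assumptions, not the proposed std.path normalize: the name sketchNormalize is hypothetical, '/' is assumed to be the only separator, and symlinks and '~' are deliberately not resolved.

```d
import std.array : join, split;

// Hypothetical sketch: collapse redundant separators and resolve "." and
// ".." as far as pure string manipulation allows.
string sketchNormalize(string path)
{
    immutable rooted = path.length > 0 && path[0] == '/';
    string[] stack;

    foreach (seg; path.split("/"))
    {
        if (seg.length == 0 || seg == ".")
            continue;                       // redundant separator or "."
        if (seg == ".." && stack.length > 0 && stack[$ - 1] != "..")
            stack = stack[0 .. $ - 1];      // ".." cancels the previous segment
        else if (seg == ".." && rooted)
            continue;                       // "/.." is still "/"
        else
            stack ~= seg;                   // ordinary name, or a leading ".."
    }

    auto joined = stack.join("/");
    if (rooted)
        return "/" ~ joined;
    return joined.length ? joined : ".";
}

void main()
{
    assert(sketchNormalize("/usr//lib/./../bin") == "/usr/bin");
    assert(sketchNormalize("a/b/../../..") == "..");
    assert(sketchNormalize("/..") == "/");
}
```

A leading ".." in a relative path is kept because it cannot be resolved without knowing the current directory, which is what "to the extent possible" amounts to in this sketch.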
On Mon, 18 Jul 2011 14:30:51 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:

> normalize does this on Windows, where '/' is also a directory separator, but not on POSIX, where '\' is an ordinary filename character.
>
> I am not entirely sure what the exact definition of "canonical path" is, but according to some it entails resolving symlinks. normalize does not do this, but it does everything else:
>
> - resolves . and .. to the extent possible
> - collapses redundant directory separators
> - changes '/' to '\' on Windows

OK, this is what I meant. By canonical path, I mean I should be able to take two paths that point to the same filename and normalize should output the same string for both. I agree that the posix version should not replace \ with /, since that's a Windows specific issue.

I realize there are some limitations when all you are doing is string manipulation. For example ~steves/blah resolves to the canonical path /home/steves/blah. Same thing with symlinks. I guess normalize is the best term for it, don't want to confuse it with full canonical.

-Steve
Jul 18 2011
On Sun, 17 Jul 2011 21:27:41 +0000, Lars T. Kyllingstad wrote:

> - Should I add toNativePath(), which replaces '/' with '\' on Windows and vice versa on POSIX?

I'm not sure of my opinion on this. It seems like a useful idea, but as Andrej points out it may just cause other issues.

> - Should it be specified/documented whether a function returns "" or null? Specifically, is it important that
>
>     extension("foo") is null
>     extension("foo.") !is null && extension("foo.") == ""

I don't think it is important, but probably should be documented.

> - Do people agree with Jonathan's views on function names?

I think I did.
Jul 17 2011
The documentation comments for driveName say that the return value will be an empty string in some circumstances, but the code and unit tests both say that the behavior is to return null.
Jul 17 2011
On Sunday 17 July 2011 22:08:27 Brian Schott wrote:

> The documentation comments for driveName say that the return value will be an empty string in some circumstances, but the code and unit tests both say that the behavior is to return null.

The fun part with that is that "" == null and a null string is empty per std.array.empty, so it _is_ the empty string. The only difference is that "" !is null. So, if the function says that it returns null, then it needs to return null. Since it says that it returns the empty string, it could return either.

Now, in spite of all that, there's still a problem since the tests verify that the return value is null, not empty. Either the documentation should say that it returns null, or the tests should be checking for empty, not null. But still, the documentation isn't incorrect. And the tests are perfectly valid, but they really shouldn't be testing for is null instead of empty when the function is supposed to return empty.

- Jonathan M Davis
Jul 17 2011
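The relationships described above can be checked with a few assertions. This is a minimal sketch; the variable names are made up for illustration, and the only library symbol used is the empty range primitive.

```d
import std.range : empty;

void main()
{
    string none  = null;   // null string: no pointer, length 0
    string blank = "";     // empty literal: non-null pointer, length 0

    assert(none == blank);              // == compares contents only
    assert(none is null);
    assert(blank !is null);             // "is" also compares the pointer
    assert(none.empty && blank.empty);  // both count as empty

    // A null string can be appended to like any other string.
    none ~= "x";
    assert(none == "x");
}
```

A test written with `is null` therefore pins down more than a test written with `.empty`, which is why documentation and unit tests can disagree even when both pass.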
On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:

> The fun part with that is that "" == null and a null string is empty per std.array.empty, so it _is_ the empty string. The only difference is that "" !is null. So, if the function says that it returns null, then it needs to return null. Since it says that it returns the empty string, it could return either.
>
> Now, in spite of all that, there's still a problem since the tests verify that the return value is null, not empty. Either the documentation should say that it returns null, or the tests should be checking for empty, not null. But still, the documentation isn't incorrect. And the tests are perfectly valid, but they really shouldn't be testing for is null instead of empty when the function is supposed to return empty.

Pending a decision on the null vs. empty issue, I have now standardised on using empty() for testing whether functions return empty strings.

-Lars
Jul 18 2011
On 18.07.2011 11:42, Lars T. Kyllingstad wrote:

> Pending a decision on the null vs. empty issue, I have now standardised on using empty() for testing whether functions return empty strings.

I'd like to make a case for null as the 'nothing here' value. The advantage of using null is that all possible ways of testing for 'nothingness' (is, ==, as a boolean condition, empty range) will work. But if you return an empty string, you can't do 'str is null', because that will be false. With null there's just no doubt, and no way to get the test wrong.

As far as I can tell by the testing I've done, you can use a null string in every way that you can use an empty string, even append to it with ~=.

The distinction between null and empty strings is significant in C and Java, but in D it's not, and the tiny difference that actually exists mainly serves to confuse people. It doesn't help that the actual differences are largely undocumented either. One difference is that a statically allocated empty string is null terminated, but I think that can be safely ignored in the case of return values.

By the way, did you read my post in the other thread?
Jul 18 2011
On Mon, 18 Jul 2011 14:23:18 +0200, torhu wrote:

> I'd like to make a case for null as the 'nothing here' value. The advantage of using null is that all possible ways of testing for 'nothingness' (is, ==, as a boolean condition, empty range) will work. But if you return an empty string, you can't do 'str is null', because that will be false. With null there's just no doubt, and no way to get the test wrong.
>
> As far as I can tell by the testing I've done, you can use a null string in every way that you can use an empty string, even append to it with ~=.
>
> The distinction between null and empty strings is significant in C and Java, but in D it's not, and the tiny difference that actually exists mainly serves to confuse people. It doesn't help that the actual differences are largely undocumented either. One difference is that a statically allocated empty string is null terminated, but I think that can be safely ignored in the case of return values.

True, but the question was not whether one should use null or "" for the "nothing here" return value of a function. The question was whether the function returning null should mean something different than it returning "".

> By the way, did you read my post in the other thread?

Yes, I read it, but I forgot to answer. Sorry about that. I've answered now.

-Lars
Jul 18 2011
On 18.07.2011 16:18, Lars T. Kyllingstad wrote:

> True, but the question was not whether one should use null or "" for the "nothing here" return value of a function. The question was whether the function returning null should mean something different than it returning "".

I meant to imply that null and empty should not be used to mean two different things, sorry if I didn't make myself clear.

AFAIK, none of the Phobos functions that take string arguments care about the difference. If the length is zero, the pointer value is ignored. In light of this, I don't know what different meanings null and empty would or should have.
Jul 18 2011
On 2011-07-18 10:51, torhu wrote:

> I meant to imply that null and empty should not be used to mean two different things, sorry if I didn't make myself clear.
>
> AFAIK, none of the Phobos functions that take string arguments care about the difference. If the length is zero, the pointer value is ignored. In light of this, I don't know what different meanings null and empty would or should have.

There are definitely situations where it is valuable to differentiate between null and empty, but in the case of D arrays, they really aren't designed for it, because nearly everything in the language treats them as being the same thing. There may be some value in differentiating them in spite of that, but it doesn't generally work very well.

One of the few places would be the return value of a function. So, if there could reasonably be a difference between "" and null for the return value of a function, then it could be reasonable to have null mean something different than "". But the truth is that that's going to be error prone, because people are likely to use == null instead of is null, not realizing that == null doesn't do what they want (in fact, arguably, == null merits a warning). So, if there's no clear gain in returning null, the documentation should just say that it returns empty, and then it doesn't matter whether it returns "" or null.

It _is_ a bit of a conundrum though. I'm not sure that making null and "" virtually identical was ultimately a good idea, but we're stuck with it at this point.

- Jonathan M Davis
Jul 18 2011
On 7/18/11 7:23 AM, torhu wrote:

> I'd like to make a case for null as the 'nothing here' value. The advantage of using null is that all possible ways of testing for 'nothingness' (is, ==, as a boolean condition, empty range) will work. But if you return an empty string, you can't do 'str is null', because that will be false. With null there's just no doubt, and no way to get the test wrong.

Note that there are two aspects: generating 'nothing here' values, and testing for 'nothing here'.

In keeping with the "be generous with what you receive and conservative with what you send" mantra, good functions should test string inputs with str.empty and return null strings.

Andrei
Jul 18 2011
On Mon, 18 Jul 2011 09:38:08 -0500, Andrei Alexandrescu wrote:
> On 7/18/11 7:23 AM, torhu wrote:
>> On 18.07.2011 11:42, Lars T. Kyllingstad wrote:
>>> On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:
>>>> On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
>>>>> The documentation comments for driveName say that the return value will be an empty string in some circumstances, but the code and unit tests both say that the behavior is to return null.
>>>>
>>>> The fun part with that is that "" == null and a null string is empty per std.array.empty, so it _is_ the empty string. The only difference is that "" !is null. So, if the function says that it returns null, then it needs to return null. Since it says that it returns the empty string, it could return either. Now, in spite of all that, there's still a problem since the tests verify that the return value is null, not empty. Either the documentation should say that it returns null, or the tests should be checking for empty, not null. But still, the documentation isn't incorrect. And the tests are perfectly valid, but they really shouldn't be testing for is null instead of empty when the function is supposed to return empty.
>>>
>>> Pending a decision on the null vs. empty issue, I have now standardised on using empty() for testing whether functions return empty strings.
>>
>> I'd like to make a case for null as the 'nothing here' value. The advantage of using null is that all possible ways of testing for 'nothingness' (is, ==, as a boolean condition, empty range) will work. But if you return an empty string, you can't do 'str is null', because that will be false. With null there's just no doubt, and no way to get the test wrong.
>
> Note that there are two aspects: generating 'nothing here' values, and testing for 'nothing here'.

Some have argued that there is an extra dimension to this, namely the distinction between "nothing here" and "something here, but that something is an empty string". I am not convinced we should make that distinction.

> In keeping with the "be generous with what you receive and conservative with what you send" mantra, good functions should test string inputs with str.empty and return null strings.

The specific example which spurred the debate was the following: While there is no doubt that extension("foo") should return null, Vladimir Panteleev argued that extension("foo.") should be *specified* to return "" (specifically, an empty slice from the end of the input string) to indicate that there is an "empty extension". I disagree; I don't think null and "" should have different semantics. The fact that extension() currently *does* behave as Vladimir wants is, in my opinion, an implementation detail.

Note that extension() seems to be the only function for which the controversy has arisen so far, so it may not be worth taking this discussion too far.

-Lars
Jul 18 2011
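To make the extension("foo") versus extension("foo.") distinction concrete, here is a small sketch of how slicing naturally produces the two different "nothing" values. `extensionSketch` is a hypothetical stand-in written for this illustration; it is not the proposal's extension() and it assumes the extension is whatever follows the last dot.

```d
import std.array; // for `empty`

/// Hypothetical stand-in for an extension() function.
/// Returns the text after the last '.', or null if there is no dot
/// in the last path segment.
string extensionSketch(string path)
{
    foreach_reverse (i, char c; path)
    {
        if (c == '.')
            return path[i + 1 .. $];   // may be an empty, non-null slice
        if (c == '/' || c == '\\')
            break;                     // reached a directory separator
    }
    return null;
}

unittest
{
    assert(extensionSketch("foo") is null);      // no extension at all
    assert(extensionSketch("foo.") !is null);    // "empty extension": slice at the end
    assert(extensionSketch("foo.").empty);       // ...yet still empty
    assert(extensionSketch("foo") == extensionSketch("foo.")); // equal by content
}
```

Callers that test with `empty` or `==` cannot tell the two cases apart; only `is null` can, which is the implementation detail under dispute.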
On Mon, 18 Jul 2011 18:07:12 +0300, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
> The fact that extension() currently *does* behave as Vladimir wants is, in my opinion, an implementation detail.

Is it still an implementation detail if it's documented behavior?

--
Best regards,
 Vladimir mailto:vladimir thecybershadow.net
Jul 18 2011
Sorry, I thought you meant the old getExt().
Jul 18 2011
On Mon, 18 Jul 2011 08:23:18 -0400, torhu <no spam.invalid> wrote:
> On 18.07.2011 11:42, Lars T. Kyllingstad wrote:
>> On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:
>>> On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
>>>> The documentation comments for driveName say that the return value will be an empty string in some circumstances, but the code and unit tests both say that the behavior is to return null.
>>>
>>> The fun part with that is that "" == null and a null string is empty per std.array.empty, so it _is_ the empty string. The only difference is that "" !is null. So, if the function says that it returns null, then it needs to return null. Since it says that it returns the empty string, it could return either. Now, in spite of all that, there's still a problem since the tests verify that the return value is null, not empty. Either the documentation should say that it returns null, or the tests should be checking for empty, not null. But still, the documentation isn't incorrect. And the tests are perfectly valid, but they really shouldn't be testing for is null instead of empty when the function is supposed to return empty.
>>
>> Pending a decision on the null vs. empty issue, I have now standardised on using empty() for testing whether functions return empty strings.
>
> I'd like to make a case for null as the 'nothing here' value. The advantage of using null is that all possible ways of testing for 'nothingness' (is, ==, as a boolean condition, empty range) will work. But if you return an empty string, you can't do 'str is null', because that will be false. With null there's just no doubt, and no way to get the test wrong.

The one that's kind of nice is if(path.extension), which not only reads much better than if(path.extension == null), but is also a very common idiom in many languages (using if to test a string's emptiness). People are likely to get this wrong (in fact, it may make sense for *all* empty arrays to evaluate as false for an if condition).

I personally think if there's no real difference, returning null is the better option based on these points. However, if there is some performance/maintenance advantage to not returning null, then just return an empty non-null array and specify in the API docs that the function returns an empty string.

-Steve
Jul 18 2011
On Sun, 17 Jul 2011 21:27:41 +0000, Lars T. Kyllingstad wrote:
> Based on your comments, I have made some changes to my std.path proposal. A list of the changes I have made can be found at the following address (look at the commits dated 2011-07-17): https://github.com/kyllingstad/phobos/commits/std-path
>
> I believe I have covered most of your requests, with a few exceptions:

It seems I forgot about the CTFEability tests. I'll fix that too, and push the updated code later today.

-Lars
Jul 18 2011
On Mon, 18 Jul 2011 10:05:07 +0000, Lars T. Kyllingstad wrote:
> On Sun, 17 Jul 2011 21:27:41 +0000, Lars T. Kyllingstad wrote:
>> Based on your comments, I have made some changes to my std.path proposal. A list of the changes I have made can be found at the following address (look at the commits dated 2011-07-17): https://github.com/kyllingstad/phobos/commits/std-path
>>
>> I believe I have covered most of your requests, with a few exceptions:
>
> It seems I forgot about the CTFEability tests. I'll fix that too, and push the updated code later today.

Done. Most functions were CTFEable without any modifications (thanks, Don!). :) The exceptions are relativePath (because of std.algorithm.cmp) and expandTilde (which is strictly a run-time function).

-Lars
Jul 18 2011
On Sun, 17 Jul 2011 17:27:41 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
> Based on your comments, I have made some changes to my std.path proposal. A list of the changes I have made can be found at the following address (look at the commits dated 2011-07-17): https://github.com/kyllingstad/phobos/commits/std-path

This is a review of the docs/design. I'll review the code separately:

basename's standards section says: (with suitable adaptions for Windows paths)
adaptions => adaptations
This occurs twice.

In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)

joinPath: Does this normalize the paths? For example: joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ? If so, the docs should reflect that. If not, maybe it should :) If it doesn't, at least the docs should state that it doesn't.

pathSplitter: I think this should be a bi-directional range (no technical limitation I can think of).

fcmp: "On Windows, fcmp is an alias for std.string.icmp, which yields a case insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp, i.e. a case sensitive comparison." What about comparing c:/foo with c:\foo? This isn't going to be equal with icmp.

expandTilde: I've commented on expandTilde from the other posts, but if it is kept a POSIX-only function, the documentation should reflect that.
Jul 18 2011
On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
> On Sun, 17 Jul 2011 17:27:41 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
>> Based on your comments, I have made some changes to my std.path proposal. A list of the changes I have made can be found at the following address (look at the commits dated 2011-07-17): https://github.com/kyllingstad/phobos/commits/std-path
>
> This is a review of the docs/design. I'll review the code separately:
>
> basename's standards section says: (with suitable adaptions for Windows paths) adaptions => adaptations

Oops. Thanks!

> This occurs twice.

Copy+paste. :)

> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)

Yes, std.path is supposed to support UNC paths. For instance, the following works now:

    assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));

I guess you would rather have that

    assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?

As I understand it, some POSIX systems also mount network drives using similar paths. Does anyone know whether "//foo" is a valid path on these systems, or does it have to be "//foo/bar"?

> joinPath: Does this normalize the paths? For example: joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ? If so, the docs should reflect that. If not, maybe it should :) If it doesn't, at least the docs should state that it doesn't.

No, it doesn't, and I don't think it should. It is better to let the user choose whether they want the overhead of normalization by calling normalize() explicitly. I will specify this in the docs.

> pathSplitter: I think this should be a bi-directional range (no technical limitation I can think of).

It is more of a complexity vs. benefit thing, but as you are the second person to ask for this, I will look into it. A convincing use case would be nice, though. :)

> fcmp: "On Windows, fcmp is an alias for std.string.icmp, which yields a case insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp, i.e. a case sensitive comparison." What about comparing c:/foo with c:\foo? This isn't going to be equal with icmp.

I am a bit unsure what to do about the comparison functions (fcmp, pathCharMatch and globMatch). Aside from the issue with directory separators it is, as was pointed out by someone else, entirely possible to mount case-sensitive file systems on Windows and case-insensitive file systems on POSIX. (The latter is not uncommon on OSX, I believe.) I am open to suggestions.

> expandTilde: I've commented on expandTilde from the other posts, but if it is kept a POSIX-only function, the documentation should reflect that.

It does; look at the "Returns" section. Perhaps it should be moved to a more prominent location?

-Lars
Jul 18 2011
On 2011-07-18 11:25, Lars T. Kyllingstad wrote:
> On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
>> On Sun, 17 Jul 2011 17:27:41 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
>>> Based on your comments, I have made some changes to my std.path proposal. A list of the changes I have made can be found at the following address (look at the commits dated 2011-07-17): https://github.com/kyllingstad/phobos/commits/std-path
>>
>> This is a review of the docs/design. I'll review the code separately:
>>
>> basename's standards section says: (with suitable adaptions for Windows paths) adaptions => adaptations
>
> Oops. Thanks!
>
>> This occurs twice.
>
> Copy+paste. :)
>
>> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)
>
> Yes, std.path is supposed to support UNC paths. For instance, the following works now:
>
>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
>
> I guess you would rather have that
>
>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
>
> then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?
>
> As I understand it, some POSIX systems also mount network drives using similar paths. Does anyone know whether "//foo" is a valid path on these systems, or does it have to be "//foo/bar"?
>
>> joinPath: Does this normalize the paths? For example: joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ? If so, the docs should reflect that. If not, maybe it should :) If it doesn't, at least the docs should state that it doesn't.
>
> No, it doesn't, and I don't think it should. It is better to let the user choose whether they want the overhead of normalization by calling normalize() explicitly. I will specify this in the docs.
>
>> pathSplitter: I think this should be a bi-directional range (no technical limitation I can think of).
>
> It is more of a complexity vs. benefit thing, but as you are the second person to ask for this, I will look into it. A convincing use case would be nice, though. :)
>
>> fcmp: "On Windows, fcmp is an alias for std.string.icmp, which yields a case insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp, i.e. a case sensitive comparison." What about comparing c:/foo with c:\foo? This isn't going to be equal with icmp.
>
> I am a bit unsure what to do about the comparison functions (fcmp, pathCharMatch and globMatch). Aside from the issue with directory separators it is, as was pointed out by someone else, entirely possible to mount case-sensitive file systems on Windows and case-insensitive file systems on POSIX. (The latter is not uncommon on OSX, I believe.) I am open to suggestions.
>
>> expandTilde: I've commented on expandTilde from the other posts, but if it is kept a POSIX-only function, the documentation should reflect that.
>
> It does; look at the "Returns" section. Perhaps it should be moved to a more prominent location?

I suggest that you do what I did in std.file (e.g. with getTimesWin). I put this at the very top of the ddoc comment:

    $(BLUE This function is Windows-Only.)

or if it's only on Posix:

    $(BLUE This function is Posix-Only.)

- Jonathan M Davis
Jul 18 2011
On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
> On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
>> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)
>
> Yes, std.path is supposed to support UNC paths. For instance, the following works now:
>
>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
>
> I guess you would rather have that
>
>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
>
> then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?

It is and it isn't. It's *not* a normal directory, because only shares can be in that directory. In other words, the point at which a UNC path turns into normal directory structure is after the share name. An easy way to compare is, you can only map drive letters to shares, not to servers.

> As I understand it, some POSIX systems also mount network drives using similar paths. Does anyone know whether "//foo" is a valid path on these systems, or does it have to be "//foo/bar"?

Typically, Linux uses URLs, i.e. smb://server/share

URL parsing is probably not in std.path's charter. However, I have used a command like:

    mount -t cifs //server/share /mnt/serverfiles

But this is only in very special contexts. In general I don't think //foo should be considered a server path on Posix systems.

>> joinPath: Does this normalize the paths? For example: joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ? If so, the docs should reflect that. If not, maybe it should :) If it doesn't, at least the docs should state that it doesn't.
>
> No, it doesn't, and I don't think it should. It is better to let the user choose whether they want the overhead of normalization by calling normalize() explicitly. I will specify this in the docs.

In fact, if you do not normalize during the join, it's *more* overhead to normalize afterwards. If normalization is done while joining, then you only build one string. There's no need to build a non-normalized string, then build a normalized string based on that. Plus the data is only iterated once.

I think it's at least worth an option, but I'm not going to hold back my vote based on this :)

>> pathSplitter: I think this should be a bi-directional range (no technical limitation I can think of).
>
> It is more of a complexity vs. benefit thing, but as you are the second person to ask for this, I will look into it. A convincing use case would be nice, though. :)

Well a path is more like a stack than a queue. You are usually operating more on the back side of it. To provide back and popBack makes a lot of sense to me.

For example, to implement the command cd ../foo, you need to popBack the topmost directory.

>> fcmp: "On Windows, fcmp is an alias for std.string.icmp, which yields a case insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp, i.e. a case sensitive comparison." What about comparing c:/foo with c:\foo? This isn't going to be equal with icmp.
>
> I am a bit unsure what to do about the comparison functions (fcmp, pathCharMatch and globMatch). Aside from the issue with directory separators it is, as was pointed out by someone else, entirely possible to mount case-sensitive file systems on Windows and case-insensitive file systems on POSIX. (The latter is not uncommon on OSX, I believe.) I am open to suggestions.

It's definitely something to think about. At the very least, I think the default file system case sensitivity should be mapped to a certain function. It doesn't hurt to expose the opposite sensitivity as an alternate (you need to implement both anyway).

A template with all options defaulted for the current OS makes good sense I think. Actually, expanding/renaming pathCharMatch provides a perfect way to default these, e.g.:

    version(Windows)
    {
        enum defaultOSSensitivity = false;
        enum defaultOSDirSeps = `\/`;
    }
    else version(Posix)
    {
        enum defaultOSSensitivity = true;
        enum defaultOSDirSeps = "/";
    }

    // replaces pathCharMatch
    int pathCharCmp(bool caseSensitive = defaultOSSensitivity, string dirseps = defaultOSDirSeps)(dchar a, dchar b);

    int fcmp(alias pred = "pathCharCmp(a, b)", S1, S2)(S1 filename1, S2 filename2);

Anyone who wants to do alternate comparisons is free to do so using other options from pathCharCmp.

>> expandTilde: I've commented on expandTilde from the other posts, but if it is kept a POSIX-only function, the documentation should reflect that.
>
> It does; look at the "Returns" section. Perhaps it should be moved to a more prominent location?

Yes. It should say (Posix-only). I believe technically that it should fail to compile on Windows if it does not map to a "home" directory there. Note that as named, it's possible to confuse with expanding the DOS 8.3 name of a file, i.e. Progra~1

-Steve
Jul 18 2011
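The pathCharCmp declarations in the post above are signatures only. One way the bodies might look is sketched below, assuming that every character listed in `dirseps` should compare equal to every other separator. `fcmpSketch` is a hypothetical simplification that takes the two compile-time options directly rather than an `alias pred`, so it is not the API proposed in the thread.

```d
import std.algorithm.searching : canFind;
import std.range.primitives : empty, front, popFront;
import std.uni : toLower;

version (Windows)
{
    enum defaultOSSensitivity = false;
    enum defaultOSDirSeps = `\/`;
}
else
{
    enum defaultOSSensitivity = true;
    enum defaultOSDirSeps = "/";
}

/// Compares two path characters; returns <0, 0 or >0.
int pathCharCmp(bool caseSensitive = defaultOSSensitivity,
                string dirseps = defaultOSDirSeps)(dchar a, dchar b)
{
    static assert(dirseps.length > 0, "at least one separator is required");

    // Map every separator to the first one so that '/' == '\' on Windows.
    if (dirseps.canFind(a)) a = dirseps[0];
    if (dirseps.canFind(b)) b = dirseps[0];

    static if (!caseSensitive)
    {
        a = toLower(a);
        b = toLower(b);
    }
    return a < b ? -1 : (a > b ? 1 : 0);
}

/// Hypothetical file name comparison built on pathCharCmp.
int fcmpSketch(bool caseSensitive = defaultOSSensitivity,
               string dirseps = defaultOSDirSeps)(string p1, string p2)
{
    while (!p1.empty && !p2.empty)
    {
        immutable c = pathCharCmp!(caseSensitive, dirseps)(p1.front, p2.front);
        if (c != 0)
            return c;
        p1.popFront();
        p2.popFront();
    }
    // The shorter string sorts first when one is a prefix of the other.
    return p1.empty ? (p2.empty ? 0 : -1) : 1;
}
```

With Windows-style options passed explicitly, `fcmpSketch!(false, `\/`)(`c:/foo`, `C:\FOO`)` compares equal on any platform, which addresses the c:/foo versus c:\foo question raised in the review.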
On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:
> On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
>> On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
>>> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)
>>
>> Yes, std.path is supposed to support UNC paths. For instance, the following works now:
>>
>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
>>
>> I guess you would rather have that
>>
>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
>>
>> then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?
>
> It is and it isn't.

Well, that certainly cleared things up. ;)

> It's *not* a normal directory, because only shares can be in that directory. In other words, the point at which a UNC path turns into normal directory structure is after the share name. An easy way to compare is, you can only map drive letters to shares, not to servers.

Then driveName() should probably return the full share path. But, of the following asserts, which should pass?

    assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);
    assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo`);

    assert (baseName(`\\foo\bar`) == `\\foo\bar`);
    assert (baseName(`\\foo\bar`) == "bar");

    assert (dirName(`\\foo\bar`) == `\\foo\bar`);
    assert (dirName(`\\foo\bar`) == `\\foo`);

Note that if you replace `\\foo\bar` with `c:\` in the above, the first assert in each pair will pass. Same with "/" on POSIX. Basically, that choice corresponds to treating `\\foo\bar` as a filesystem root.

>> As I understand it, some POSIX systems also mount network drives using similar paths. Does anyone know whether "//foo" is a valid path on these systems, or does it have to be "//foo/bar"?
>
> Typically, Linux uses URLs, i.e. smb://server/share
>
> URL parsing is probably not in std.path's charter. However, I have used a command like:
>
>     mount -t cifs //server/share /mnt/serverfiles
>
> But this is only in very special contexts. In general I don't think //foo should be considered a server path on Posix systems.

I actually got a request on the Phobos list that std.path should support such paths. Furthermore, the POSIX standard explicitly mentions "//" paths (though it basically says it is implementation-defined whether to bother dealing with them).

>>> joinPath: Does this normalize the paths? For example: joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ? If so, the docs should reflect that. If not, maybe it should :) If it doesn't, at least the docs should state that it doesn't.
>>
>> No, it doesn't, and I don't think it should. It is better to let the user choose whether they want the overhead of normalization by calling normalize() explicitly. I will specify this in the docs.
>
> In fact, if you do not normalize during the join, it's *more* overhead to normalize afterwards. If normalization is done while joining, then you only build one string. There's no need to build a non-normalized string, then build a normalized string based on that. Plus the data is only iterated once.
>
> I think it's at least worth an option, but I'm not going to hold back my vote based on this :)

If it doesn't turn out to be a huge undertaking, I think I'll replace joinPath() with a function buildPath() that takes an input range of path segments and joins them together, with optional normalization. Then, normalize(path) can be implemented as:

    buildPath(pathSplitter(path));

Does that sound sensible?

>>> pathSplitter: I think this should be a bi-directional range (no technical limitation I can think of).
>>
>> It is more of a complexity vs. benefit thing, but as you are the second person to ask for this, I will look into it. A convincing use case would be nice, though. :)
>
> Well a path is more like a stack than a queue. You are usually operating more on the back side of it. To provide back and popBack makes a lot of sense to me.
>
> For example, to implement the command cd ../foo, you need to popBack the topmost directory.

Ok, I'll see what I can do about it. :)

>>> fcmp: "On Windows, fcmp is an alias for std.string.icmp, which yields a case insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp, i.e. a case sensitive comparison." What about comparing c:/foo with c:\foo? This isn't going to be equal with icmp.
>>
>> I am a bit unsure what to do about the comparison functions (fcmp, pathCharMatch and globMatch). Aside from the issue with directory separators it is, as was pointed out by someone else, entirely possible to mount case-sensitive file systems on Windows and case-insensitive file systems on POSIX. (The latter is not uncommon on OSX, I believe.) I am open to suggestions.
>
> It's definitely something to think about. At the very least, I think the default file system case sensitivity should be mapped to a certain function. It doesn't hurt to expose the opposite sensitivity as an alternate (you need to implement both anyway).
>
> A template with all options defaulted for the current OS makes good sense I think. Actually, expanding/renaming pathCharMatch provides a perfect way to default these, e.g.:
>
>     version(Windows)
>     {
>         enum defaultOSSensitivity = false;
>         enum defaultOSDirSeps = `\/`;
>     }
>     else version(Posix)
>     {
>         enum defaultOSSensitivity = true;
>         enum defaultOSDirSeps = "/";
>     }
>
>     // replaces pathCharMatch
>     int pathCharCmp(bool caseSensitive = defaultOSSensitivity, string dirseps = defaultOSDirSeps)(dchar a, dchar b);
>
>     int fcmp(alias pred = "pathCharCmp(a, b)", S1, S2)(S1 filename1, S2 filename2);
>
> Anyone who wants to do alternate comparisons is free to do so using other options from pathCharCmp.

Good idea. I'll probably implement something like that.

>>> expandTilde: I've commented on expandTilde from the other posts, but if it is kept a POSIX-only function, the documentation should reflect that.
>>
>> It does; look at the "Returns" section. Perhaps it should be moved to a more prominent location?
>
> Yes. It should say (Posix-only). I believe technically that it should fail to compile on Windows if it does not map to a "home" directory there. Note that as named, it's possible to confuse with expanding the DOS 8.3 name of a file, i.e. Progra~1

I agree. I'll put it inside a version(Posix) block.

-Lars
Jul 20 2011
On Wed, 20 Jul 2011 13:36:51 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
> On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:
>> On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
>>> On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
>>>> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)
>>>
>>> Yes, std.path is supposed to support UNC paths. For instance, the following works now:
>>>
>>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
>>>
>>> I guess you would rather have that
>>>
>>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
>>>
>>> then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?
>>
>> It is and it isn't.
>
> Well, that certainly cleared things up. ;)

It is in that if you open explorer and type in \\servername, it will give you a list of shares you can try. But I don't think it's a valid *path*, except in explorer. So my intuition is to declare it never a valid path. I'm not sure how \\server interacts with the low level functions of Windows (such as CreateFile). Some research/experimentation is probably warranted.

>> It's *not* a normal directory, because only shares can be in that directory. In other words, the point at which a UNC path turns into normal directory structure is after the share name. An easy way to compare is, you can only map drive letters to shares, not to servers.
>
> Then driveName() should probably return the full share path. But, of the following asserts, which should pass?
>
>     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);
>     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo`);
>
>     assert (baseName(`\\foo\bar`) == `\\foo\bar`);
>     assert (baseName(`\\foo\bar`) == "bar");
>
>     assert (dirName(`\\foo\bar`) == `\\foo\bar`);
>     assert (dirName(`\\foo\bar`) == `\\foo`);
>
> Note that if you replace `\\foo\bar` with `c:\` in the above, the first assert in each pair will pass. Same with "/" on POSIX. Basically, that choice corresponds to treating `\\foo\bar` as a filesystem root.

Yes, I think this sounds right (pending research/experimentation cited above).

>>> As I understand it, some POSIX systems also mount network drives using similar paths. Does anyone know whether "//foo" is a valid path on these systems, or does it have to be "//foo/bar"?
>>
>> Typically, Linux uses URLs, i.e. smb://server/share
>>
>> URL parsing is probably not in std.path's charter. However, I have used a command like:
>>
>>     mount -t cifs //server/share /mnt/serverfiles
>>
>> But this is only in very special contexts. In general I don't think //foo should be considered a server path on Posix systems.
>
> I actually got a request on the Phobos list that std.path should support such paths. Furthermore, the POSIX standard explicitly mentions "//" paths (though it basically says it is implementation-defined whether to bother dealing with them).

ls //root lists the contents of /root. I'd guess that opening //root with open() would simply open /root. Given that context, they should not be considered to be a server path IMO.

>>>> joinPath: Does this normalize the paths? For example: joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ? If so, the docs should reflect that. If not, maybe it should :) If it doesn't, at least the docs should state that it doesn't.
>>>
>>> No, it doesn't, and I don't think it should. It is better to let the user choose whether they want the overhead of normalization by calling normalize() explicitly. I will specify this in the docs.
>>
>> In fact, if you do not normalize during the join, it's *more* overhead to normalize afterwards. If normalization is done while joining, then you only build one string. There's no need to build a non-normalized string, then build a normalized string based on that. Plus the data is only iterated once.
>>
>> I think it's at least worth an option, but I'm not going to hold back my vote based on this :)
>
> If it doesn't turn out to be a huge undertaking, I think I'll replace joinPath() with a function buildPath() that takes an input range of path segments and joins them together, with optional normalization. Then, normalize(path) can be implemented as:
>
>     buildPath(pathSplitter(path));
>
> Does that sound sensible?

That sounds good.

-Steve
Jul 20 2011
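For readers following the join-versus-normalize question: the std.path module that later shipped in Phobos provides both behaviours as separate functions, buildPath and buildNormalizedPath. The sketch below uses that released API, which may differ from the proposal as it stood during this review, and the asserts are restricted to POSIX because the separators differ on Windows.

```d
import std.path : buildNormalizedPath, buildPath;

unittest
{
    version (Posix)
    {
        // Plain joining keeps the ".." segment as written.
        assert(buildPath("/home/steves", "../lars") == "/home/steves/../lars");

        // Joining with normalization resolves it while building the result.
        assert(buildNormalizedPath("/home/steves", "../lars") == "/home/lars");
    }
}
```

The caller chooses whether to pay for normalization, and when it is requested it happens during the join rather than as a second pass over an intermediate string.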
On Wed, 20 Jul 2011 14:16:04 -0400, Steven Schveighoffer wrote:
> I'm not sure how \\server interacts with the low level functions of Windows (such as CreateFile). Some research/experimentation is probably warranted.

Any .NET programmers out there? Can you please tell me what the following functions return?

    System.IO.Path.GetDirectoryName("\\foo\bar")
    System.IO.Path.GetPathRoot("\\foo\bar\baz")

-Lars
Jul 20 2011
Lars T. Kyllingstad Wrote:
> On Wed, 20 Jul 2011 14:16:04 -0400, Steven Schveighoffer wrote:
>
> Any .NET programmers out there? Can you please tell me what the following functions return?
>
>     System.IO.Path.GetDirectoryName("\\foo\bar")
>     System.IO.Path.GetPathRoot("\\foo\bar\baz")

This code:

    using System;

    namespace Test
    {
        static class Program
        {
            [STAThread]
            static void Main()
            {
                string test;

                // Verbatim strings (@"...") keep the backslashes literal.
                test = @"\\foo\bar\";
                Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
                Console.WriteLine(System.IO.Path.GetDirectoryName(test));

                test = @"\\foo\bar";
                Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
                Console.WriteLine(System.IO.Path.GetDirectoryName(test));

                // Note: the label printed here says GetDirectoryName,
                // but the call below is GetPathRoot.
                test = @"\\foo\bar\baz";
                Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
                Console.WriteLine(System.IO.Path.GetPathRoot(test));
            }
        }
    }

produced this output:

    C:\temp>test.exe
    System.IO.Path.GetDirectoryName(\\foo\bar\)
    \\foo\bar
    System.IO.Path.GetDirectoryName(\\foo\bar)

    System.IO.Path.GetDirectoryName(\\foo\bar\baz)
    \\foo\bar

Cheers
Jussi
Jul 21 2011
On Thu, 21 Jul 2011 03:36:37 -0400, Jussi Jumppanen wrote:
> Lars T. Kyllingstad Wrote:
>> On Wed, 20 Jul 2011 14:16:04 -0400, Steven Schveighoffer wrote:
>>
>> Any .NET programmers out there? Can you please tell me what the following functions return?
>>
>>     System.IO.Path.GetDirectoryName("\\foo\bar")
>>     System.IO.Path.GetPathRoot("\\foo\bar\baz")
>
> This code:
>
>     using System;
>
>     namespace Test
>     {
>         static class Program
>         {
>             [STAThread]
>             static void Main()
>             {
>                 string test;
>
>                 test = @"\\foo\bar\";
>                 Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
>                 Console.WriteLine(System.IO.Path.GetDirectoryName(test));
>
>                 test = @"\\foo\bar";
>                 Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
>                 Console.WriteLine(System.IO.Path.GetDirectoryName(test));
>
>                 test = @"\\foo\bar\baz";
>                 Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
>                 Console.WriteLine(System.IO.Path.GetPathRoot(test));
>             }
>         }
>     }
>
> produced this output:
>
>     C:\temp>test.exe
>     System.IO.Path.GetDirectoryName(\\foo\bar\)
>     \\foo\bar
>     System.IO.Path.GetDirectoryName(\\foo\bar)
>
>     System.IO.Path.GetDirectoryName(\\foo\bar\baz)
>     \\foo\bar
>
> Cheers
> Jussi

Thanks, this is very helpful. Now we know that MS's APIs treat \\foo\bar as a root directory, so we should do the same. This means that, once I get around to implementing it, the following asserts will pass on Windows:

    assert (baseName(`\\foo\bar`) == `\\foo\bar`);
    assert (dirName(`\\foo\bar`) == `\\foo\bar`);
    assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);

This is analogous to the following on POSIX (where the behaviour mimics that of the basename and dirname shell utilities):

    assert (baseName("/") == "/");
    assert (dirName("/") == "/");
    assert (pathSplitter("/").front == "/");

-Lars
Jul 21 2011
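The rule discussed above (on Windows, `\\server\share` as a whole acts as the root) can be sketched in a few lines. This is only an illustration under stated assumptions, not the std.path implementation: `uncRoot` is a hypothetical helper name, and it accepts both `\` and `/` as separators.

```d
/// Hypothetical helper (not part of std.path): returns the `\\server\share`
/// prefix of a UNC path, or null if the path does not have that form.
string uncRoot(string path)
{
    static bool isSep(char c) { return c == '\\' || c == '/'; }

    // A UNC path starts with exactly two separators followed by a server name.
    if (path.length < 3 || !isSep(path[0]) || !isSep(path[1]) || isSep(path[2]))
        return null;

    // Skip the server name.
    size_t i = 2;
    while (i < path.length && !isSep(path[i])) ++i;
    if (i == path.length) return null;   // `\\server` alone: no share

    // Skip the separator(s) between server and share.
    while (i < path.length && isSep(path[i])) ++i;
    if (i == path.length) return null;   // `\\server\`: still no share

    // Skip the share name; everything up to here is the root.
    while (i < path.length && !isSep(path[i])) ++i;
    return path[0 .. i];
}

void main()
{
    assert(uncRoot(`\\foo\bar`) == `\\foo\bar`);
    assert(uncRoot(`\\foo\bar\baz`) == `\\foo\bar`);
    assert(uncRoot(`\\foo`) is null);
    assert(uncRoot(`c:\dir`) is null);
}
```

With such a helper, a dirName-like function could return its input unchanged whenever the whole path equals its own UNC root, which is the behaviour the asserts in the post describe.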
On 20.07.2011 20:16, Steven Schveighoffer wrote:
> ls //root lists the contents of /root. I'd guess that opening //root with open() would simply open /root. Given that context, they should not be considered to be a server path IMO.

If that's true for the bare open() without going through possible translations in "ls", I'd guess that "//server/share" would look for a file/directory "share" in "/server", so std.path should treat it this way for POSIX, too. Sorry if my previous comments on the Phobos list caused confusion; I must have confused the mount share with a directory specification.
Jul 21 2011
On Thu, 21 Jul 2011 09:09:52 +0200, Rainer Schuetze wrote:
> On 20.07.2011 20:16, Steven Schveighoffer wrote:
>> ls //root lists the contents of /root. I'd guess that opening //root with open() would simply open /root. Given that context, they should not be considered to be a server path IMO.
>
> If that's true for the bare open() without going through possible translations in "ls", I'd guess that "//server/share" would look for a file/directory "share" in "/server", so std.path should treat it this way for posix, too. Sorry, if my previous comments in the phobos-list caused confusion, I must have confused the mount share with a directory specification.

All right, I'll remove "//path" support again. That simplifies things for POSIX, at least.

-Lars
Jul 21 2011
On Wed, 20 Jul 2011 17:36:51 +0000, Lars T. Kyllingstad wrote:
> On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:
>> On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
>>> On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
>>>> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)
>>>
>>> Yes, std.path is supposed to support UNC paths. For instance, the following works now:
>>>
>>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
>>>
>>> I guess you would rather have that
>>>
>>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
>>>
>>> then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?
>>
>> It is and it isn't.
>
> Well, that certainly cleared things up. ;)
>
>> It's *not* a normal directory, because only shares can be in that directory. In other words, the point at which a UNC path turns into normal directory structure is after the share name. An easy way to compare is, you can only map drive letters to shares, not to servers.
>
> Then driveName() should probably return the full share path. But, of the following asserts, which should pass?
>
>     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);
>     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo`);
>
>     assert (baseName(`\\foo\bar`) == `\\foo\bar`);
>     assert (baseName(`\\foo\bar`) == "bar");
>
>     assert (dirName(`\\foo\bar`) == `\\foo\bar`);
>     assert (dirName(`\\foo\bar`) == `\\foo`);
>
> Note that if you replace `\\foo\bar` with `c:\` in the above, the first assert in each pair will pass. Same with "/" on POSIX. Basically, that choice corresponds to treating `\\foo\bar` as a filesystem root.
>
>>> As I understand it, some POSIX systems also mount network drives using similar paths. Does anyone know whether "//foo" is a valid path on these systems, or does it have to be "//foo/bar"?
>>
>> Typically, linux uses URL's, i.e. smb://server/share URL parsing is probably not in std.path's charter. However, I have used a command like:
>>
>>     mount -t cifs //server/share /mnt/serverfiles
>>
>> But this is only in very special contexts. In general I don't think //foo should be considered a server path on Posix systems.
>
> I actually got a request on the Phobos list that std.path should support such paths. Furthermore, the POSIX standard explicitly mentions "//" paths (though it basically says it is implementation-defined whether to bother dealing with them).
>
>>>> joinPath: Does this normalize the paths? For example:
>>>>
>>>>     joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ?
>>>>
>>>> If so, the docs should reflect that. If not, maybe it should :) If it doesn't, at least the docs should state that it doesn't.
>>>
>>> No, it doesn't, and I don't think it should. It is better to let the user choose whether they want the overhead of normalization by calling normalize() explicitly. I will specify this in the docs.
>>
>> In fact, if you do not normalize during the join, it's *more* overhead to normalize afterwards. If normalization is done while joining, then you only build one string. There's no need to build a non-normalized string, then build a normalized string based on that. Plus the data is only iterated once. I think it's at least worth an option, but I'm not going to hold back my vote based on this :)
>
> If it doesn't turn out to be a huge undertaking, I think I'll replace joinPath() with a function buildPath() that takes an input range of path segments and joins them together, with optional normalization. Then, normalize(path) can be implemented as:
>
>     buildPath(pathSplitter(path));
>
> Does that sound sensible?

Actually, I realise now, it doesn't. :) Since joinPath/buildPath needs to support path segments containing multiple directories, normalize would just be buildPath(path)

-Lars
Jul 20 2011
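To make the "normalize while joining" idea from the exchange above concrete, here is a minimal sketch. It is not the proposed buildPath(): `joinNormalized` is a hypothetical name, only the POSIX separator "/" is handled, and drive letters and UNC roots are ignored. Each segment may itself contain several directories, and "." and ".." are resolved as components are appended, so no non-normalized intermediate string is ever built.

```d
import std.array : join, split;

/// Hypothetical helper, POSIX separators only: joins the segments and
/// resolves "." and ".." components while doing so.
string joinNormalized(string[] segments...)
{
    string[] parts;          // normalized components collected so far
    bool absolute = false;

    foreach (seg; segments)
    {
        if (seg.length == 0) continue;
        if (seg[0] == '/')
        {
            // An absolute segment discards everything joined before it.
            absolute = true;
            parts.length = 0;
        }
        foreach (comp; seg.split("/"))
        {
            if (comp.length == 0 || comp == ".") continue;
            if (comp == "..")
            {
                if (parts.length > 0 && parts[$ - 1] != "..")
                    parts = parts[0 .. $ - 1];   // step back one directory
                else if (!absolute)
                    parts ~= comp;               // keep leading ".." of a relative path
                // ".." directly under the root of an absolute path is dropped
            }
            else
                parts ~= comp;
        }
    }

    string joined = parts.join("/");
    if (absolute) return "/" ~ joined;
    return joined.length ? joined : ".";
}

void main()
{
    assert(joinNormalized("/home/steves", "../lars") == "/home/lars");
    assert(joinNormalized("a", "../../b") == "../b");
    assert(joinNormalized("/", "..") == "/");
    assert(joinNormalized("foo/bar/../baz") == "foo/baz");
}
```

The last assert shows the point made in the reply: because a single segment may contain several directories, normalizing one path is the same as "joining" that one path.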
On 20.07.2011 19:36, Lars T. Kyllingstad wrote:
> On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:
>> It's definitely something to think about. At the very least, I think the default file system case sensitivity should be mapped to a certain function. It doesn't hurt to expose the opposite sensitivity as an alternate (you need to implement both anyway). A template with all options defaulted for the current OS makes good sense I think. Actually, expanding/renaming pathCharMatch provides a perfect way to default these: e.g.:
>>
>>     version(Windows)
>>     {
>>         enum defaultOSSensitivity = false;
>>         enum defaultOSDirSeps = `\/`;
>>     }
>>     else version(Posix)
>>     {
>>         enum defaultOSSensitivity = true;
>>         enum defaultOSDirSeps = "/";
>>     }
>>
>>     // replaces pathCharMatch
>>     int pathCharCmp(bool caseSensitive = defaultOSSensitivity, string dirseps = defaultOSDirSeps)(dchar a, dchar b);
>>
>>     int fcmp(alias pred = "pathCharCmp(a, b)", S1, S2)(S1 filename1, S2 filename2);
>>
>> Anyone who wants to do alternate comparisons is free to do so using other options from pathCharCmp.
>
> Good idea. I'll probably implement something like that.

I like the direction that this is heading. If the idea gets extended to other functions as well, you won't have to reimplement std.path if you have to deal with posix paths on windows and vice versa, e.g. when transferring data containing paths between different systems.
Jul 21 2011
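The quoted proposal only gives declarations, so here is one way such a comparison could be fleshed out. It is a sketch under assumptions, not the std.path code: the template parameters are passed explicitly instead of being defaulted per OS, `pathCmp` is a hypothetical stand-in for fcmp, and case folding is done per character with std.uni.toLower, which only approximates what a real file system does.

```d
import std.algorithm.searching : canFind;
import std.range.primitives : empty, front, popFront;
import std.uni : toLower;

/// Compares two path characters. Every character listed in `dirseps` is
/// treated as the same separator; case is folded unless `caseSensitive`.
int pathCharCmp(bool caseSensitive, string dirseps)(dchar a, dchar b)
{
    // Map any separator to the first one so that `/` and `\` can compare equal.
    if (dirseps.canFind(a)) a = dirseps[0];
    if (dirseps.canFind(b)) b = dirseps[0];
    static if (!caseSensitive)
    {
        a = toLower(a);
        b = toLower(b);
    }
    return a < b ? -1 : (a > b ? 1 : 0);
}

/// Hypothetical stand-in for fcmp: compares two paths code point by code point.
int pathCmp(bool caseSensitive, string dirseps)(string p1, string p2)
{
    while (!p1.empty && !p2.empty)
    {
        immutable c = pathCharCmp!(caseSensitive, dirseps)(p1.front, p2.front);
        if (c != 0) return c;
        p1.popFront();
        p2.popFront();
    }
    // When one path is a prefix of the other, the shorter one sorts first.
    if (p1.empty && p2.empty) return 0;
    return p1.empty ? -1 : 1;
}

void main()
{
    // Windows-style options: case-insensitive, both separators equivalent.
    assert(pathCmp!(false, `\/`)(`c:/foo`, `C:\FOO`) == 0);
    // POSIX-style options: case-sensitive, backslash is an ordinary character.
    assert(pathCmp!(true, "/")("Foo", "foo") < 0);
    assert(pathCmp!(true, "/")(`a\b`, "a/b") != 0);
}
```

Because the options are template parameters, code that has to handle POSIX paths on Windows (or the reverse) can instantiate the other variant explicitly, which is the use case raised in the reply above.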
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message news:j01trl$2ia$6 digitalmars.com...
> On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
>> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)
>
> Yes, std.path is supposed to support UNC paths. For instance, the following works now:
>
>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
>
> I guess you would rather have that
>
>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
>
> then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?

I don't know whether or not it's "never" a valid path, but "dir \\server" always fails and "dir \\server\share" always works (assuming it exists, at least). So treating the whole thing as a drive might be the right thing to do. (Of course, it's completely moronic that Windows works that way...)

>> fcmp: "On Windows, fcmp is an alias for std.string.icmp, which yields a case insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp, i.e. a case sensitive comparison." What about comparing c:/foo with c:\foo? This isn't going to be equal with icmp.
>
> I am a bit unsure what to do about the comparison functions (fcmp, pathCharMatch and globMatch). Aside from the issue with directory separators it is, as was pointed out by someone else, entirely possible to mount case-sensitive file systems on Windows and case-insensitive file systems on POSIX. (The latter is not uncommon on OSX, I believe.) I am open to suggestions.

If such mountings are possible, it would seem that there must be some way to check the sensitivity (otherwise the OS itself would probably crap out on it). Although, at least in the case of case-insensitive mountings on posix, doesn't that mean such paths would have both case-sensitive and case-insensitive parts? Ex:

    /mount/damnWinDrive/dir/subdir

Wouldn't the "mount/damnWinDrive" part be case-sensitive and the "dir/subdir" part be insensitive? (I'm starting to really despise case-insensitive filesystems.)
Jul 19 2011
Here's some relevant info: http://msdn.microsoft.com/en-us/library/aa365247%28v=vs.85%29.aspx
Jul 19 2011
On Tue, 19 Jul 2011 15:55:29 -0400, Nick Sabalausky <a a.a> wrote:
> If such mountings are possible, it would seem that there must be some way to check the sensitivity (otherwise the OS itself would probably crap out on it).

I've done it before, mounted a windows share on a linux box via cifs. What happens is, everything thinks it's case sensitive (i.e. any user-space tools), but when you go to open a file, write a file, rename a file, the share performs as if it were case insensitive. For example:

    ls /mnt/winshare
    File.txt

    find /mnt/winshare -name FILE.TXT
    No files found

    touch /mnt/winshare/FILE.TXT => updates date/time on File.txt
    cat /mnt/winshare/FILE.TXT => outputs File.txt

So as long as you are performing operations *blindly*, the case insensitivity kicks in. For example, open a file without first searching for it. But if you start reading directories, tools have no idea it's on a case-insensitive filesystem.

> Although, at least in the case of case-insensitive mountings on posix, doesn't that mean such paths would have both case-sensitive and case-insensitive parts? Ex: /mount/damnWinDrive/dir/subdir Wouldn't the "mount/damnWinDrive" part be case-sensitive and the "dir/subdir" part be insensitve?

Yes, actually, this is a very good point. And there's no way for std.path to make that distinction.

> (I'm starting to really despise case-insensitive filesystems.)

I've never understood why they have any benefits whatsoever. The only reason I can think of them having any use is legacy.

-Steve
Jul 20 2011
On Tue, 19 Jul 2011 15:55:29 -0400, Nick Sabalausky wrote:
> "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message news:j01trl$2ia$6 digitalmars.com...
>> On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
>>> In driveName: Should std.path handle UNC paths? i.e. \\servername\share\path (I think if it does, it should specify \\servername\share as the drive)
>>
>> Yes, std.path is supposed to support UNC paths. For instance, the following works now:
>>
>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
>>
>> I guess you would rather have that
>>
>>     assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
>>
>> then? I am not very familiar with Windows network shares; is \\foo never a valid path on its own?
>
> I don't know whether or not it's "never" a valid path, but "dir \\server" always fails and "dir \\server\share" always works (assuming it exists, at least). So treating the whole thing as a drive might be the right thing to do. (Of course, it's completely moronic that Windows works that way...)
>
>>> fcmp: "On Windows, fcmp is an alias for std.string.icmp, which yields a case insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp, i.e. a case sensitive comparison." What about comparing c:/foo with c:\foo? This isn't going to be equal with icmp.
>>
>> I am a bit unsure what to do about the comparison functions (fcmp, pathCharMatch and globMatch). Aside from the issue with directory separators it is, as was pointed out by someone else, entirely possible to mount case-sensitive file systems on Windows and case-insensitive file systems on POSIX. (The latter is not uncommon on OSX, I believe.) I am open to suggestions.
>
> If such mountings are possible, it would seem that there must be some way to check the sensitivity (otherwise the OS itself would probably crap out on it).

That check would probably be orders of magnitude more expensive than a simple string operation.

> Although, at least in the case of case-insensitive mountings on posix, doesn't that mean such paths would have both case-sensitive and case-insensitive parts? Ex: /mount/damnWinDrive/dir/subdir Wouldn't the "mount/damnWinDrive" part be case-sensitive and the "dir/subdir" part be insensitive?

Argh.

> (I'm starting to really despise case-insensitive filesystems.)

Me too. Does anyone know whether Windows' case insensitivity is limited to ASCII? If not, is the filesystem Unicode-aware, or does it use some locale-specific codepage to compare file names?

-Lars
Jul 20 2011
On 20/07/2011 20:57, Lars T. Kyllingstad wrote:
> Does anyone know whether Windows' case insensitivity is limited to ASCII? If not, is the filesystem Unicode-aware, or does it use some locale-specific codepage to compare file names?
>
> -Lars

Wikipedia says Windows long file names are up to 255 UTF-16 characters (or code points, depending which article you refer to >< ) Seems consistent with Microsoft's approach to character encoding throughout the rest of the Windows API.

http://en.wikipedia.org/wiki/Long_filename

A...
Jul 20 2011
On Wed, 20 Jul 2011 22:20:16 +0100, Alix Pexton wrote:
> On 20/07/2011 20:57, Lars T. Kyllingstad wrote:
>> Does anyone know whether Windows' case insensitivity is limited to ASCII? If not, is the filesystem Unicode-aware, or does it use some locale-specific codepage to compare file names?
>>
>> -Lars
>
> Wikipedia says Windows long file names are up to 255 UTF-16 characters (or code points, depending which article you refer to >< ) Seems consistent with Microsoft's approach to character encoding throughout the rest of the Windows API.
>
> http://en.wikipedia.org/wiki/Long_filename

Thanks! In other words, fcmp() needs to do UTF-16 decoding...

-Lars
Jul 21 2011
> Does anyone know whether Windows' case insensitivity is limited to ASCII? If not, is the filesystem Unicode-aware, or does it use some locale-specific codepage to compare file names?
>
> -Lars

I just tried a few examples: Using umlauts works as expected, i.e. upper and lower case characters are treated as the same. I then used the Greek omega (\u3a9 and \u3c9); files with upper and lower case are still treated as the same, even on a FAT-16 USB drive (even though some ~-magic is going on there which might not work in Windows 3.1-).
Jul 20 2011
On 17.07.2011 23:27, Lars T. Kyllingstad wrote:
> - Should it be specified/documented whether a function returns "" or null? Specifically, is it important that
>
>     extension("foo") is null
>     extension("foo.") !is null && extension("foo.") == ""

I guess you've already thought about this, but one solution is to just return the dot as part of the extension. Then you get extension("foo.") == ".". I noticed that .NET's GetExtension method does this. setExtension and defaultExtension would probably have to change to at least accept extensions that include the dot, if extension() is changed.
Jul 21 2011
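The suggestion above (return the dot as part of the extension) removes the need to tell null from "" at all, which a short sketch may make clearer. `extWithDot` is a hypothetical helper, not the std.path function; as a simplification it does not treat dotfiles such as ".bashrc" specially.

```d
/// Hypothetical helper (not std.path.extension): returns the extension
/// including the leading dot, or null if the file name contains no dot.
string extWithDot(string name)
{
    foreach_reverse (i, char c; name)
    {
        if (c == '.') return name[i .. $];
        if (c == '/' || c == '\\') break;   // do not look past the file name
    }
    return null;
}

void main()
{
    // "No extension" and "empty extension" are now different non-tricky values.
    assert(extWithDot("foo") is null);
    assert(extWithDot("foo.") == ".");
    assert(extWithDot("foo.txt") == ".txt");
    assert(extWithDot("dir.d/foo") is null);

    // Background to the original question: in D an empty string literal
    // compares equal to null with ==, yet is not null under `is`.
    string e = "";
    assert(e == null);
    assert(e !is null);
}
```

The last three lines show why the original question arises: callers who test with `==` cannot distinguish the two cases, so an API that relies on null versus "" forces everyone to use `is`.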