digitalmars.D - What is the rationale behind std.file.setAttributes ?

reply Marco Leise <Marco.Leise gmx.de> writes:
I know that code is from Martin, and I don't mean this as
pointing with the finger. There are code reviews, too.

This is a case of the proverbial thin wrapper around a system
function, as public API of Phobos. Amongst the large set of
operating system abstractions, this one is somewhat deceiving,
because it looks the same on each platform, but the parameter
has a different meaning on each system. Instead of taking e.g.
octal!777 it should really be a D specific enum, IMHO. It is a
coincidence that both Windows and POSIX use a 32-bit integer
for file attributes, but not technically sound to confuse them
in a public API.

I'm posting this because concerning the parameter we have a
similar situation with our file mode specifiers, which are
forwarded to C library functions as is. Depending on the
platform they change their meaning as well. Although there
are many standard flags, "don't inherit file handle in child
processes" has never been standardized and cannot be
retrofitted in Phobos right now, because an enum wasn't
used in the first place. </rant>

-- 
Marco
Dec 27 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/27/13 3:12 AM, Marco Leise wrote:
 I know that code is from Martin, and I don't mean this as
 pointing with the finger. There are code reviews, too.
True. Once code is accepted it is owned by the team.
 This is a case of the proverbial thin wrapper around a system
 function, as public API of Phobos. Amongst the large set of
 operating system abstractions, this one is somewhat deceiving,
 because it looks the same on each platform, but the parameter
 has a different meaning on each system. Instead of taking e.g.
 octal!777 it should really be a D specific enum, IMHO. It is a
 coincidence that both Windows and POSIX use a 32-bit integer
 for file attributes, but not technically sound to confuse them
 in a public API.
I also dislike the integer-based std.file API. The layout of the integral is inherently system-dependent, so almost all code based on it is non-portable unless it uses more ad-hoc APIs (such as attrIsFile). My recent work on rdmd has revealed the std.file attribute APIs to be sorely wanting.

I picture two ways out of this:

1. Improve the DirEntry abstraction by e.g. allowing it to fetch attributes only etc.

2. Define a lightweight FileAttributes structure that encapsulates the integral and offers portable queries such as isDir, isFile etc.

Any takers?

As a general note, we are suffering from a deflation of reviewers compared to contributors. I've looked at Phobos closely again in the past couple of weeks and saw quite a few things in there that I wouldn't have approved. On the other hand, nobody can be expected to babysit Phobos 24/7, so the right response would be closer scrutiny of pull requests by the entire community. To foster that, one possibility would be to automatically post new pull requests to this group. What do you all think?

Thanks,

Andrei
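
A minimal sketch of what the second option might look like, for illustration only. The struct name comes from the post; the `of` factory and the private `raw` field are assumptions made here, not an existing Phobos API. It relies only on `std.file.getAttributes`, `attrIsDir` and `attrIsFile`, which already hide the platform-specific bit layout.

```d
import std.file : attrIsDir, attrIsFile, getAttributes;

/// Hypothetical wrapper: callers never see or interpret the raw bits.
struct FileAttributes
{
    private uint raw; // platform-specific layout, kept opaque

    /// Assumed factory name; follows symlinks like getAttributes does.
    static FileAttributes of(string path)
    {
        return FileAttributes(getAttributes(path));
    }

    /// Portable queries, delegating to the existing std.file helpers.
    @property bool isDir() const { return attrIsDir(raw); }
    @property bool isFile() const { return attrIsFile(raw); }
}
```

With something like this, `FileAttributes.of(path).isDir` reads the same on every platform, and the integer never leaks into user code.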
Dec 27 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Fri, 27 Dec 2013 09:04:39 -0800,
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 I picture two ways out of this:
 
 1. Improve the DirEntry abstraction by e.g. allowing it to fetch 
 attributes only etc.
 
 2. Define a lightweight FileAttributes structure that encapsulates 
 the integral and offers portable queries such as isDir, isFile etc.
 
 Any takers?
Wait a second, what about *setting* attributes? Some difficult ones are:

o toggling read-only (for whom? user, group, others?)
o executable flag
o hidden flag

On Windows 'executable' is implicit and based on the extension. On Posix 'hidden' is implicit for a file name beginning with a dot. We can read the hidden bit on POSIX, but we cannot toggle it for example. So we can either not expose these attributes at all, ignore them where not applicable when setting attributes or add a third state "ignore".

Or looking at it another way:
DOS attr < POSIX chmod < ACLs

How do other programming languages find a common ground? Should Phobos add file attributes on a case by case basis? E.g. someone needs to toggle read-only, so we think about how we handle that on POSIX and decide that when a file is made writable we set the writable bits according to umask. (http://en.wikipedia.org/wiki/Umask)

These properties can extend a DirEntry, but it is somewhat limiting without free functions to supplement it.
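
The read-only case mentioned above could be sketched roughly as follows. This is only an illustration of the umask idea, not a proposed Phobos function: the name `setReadOnly` is hypothetical, and the sketch assumes that "read-only" on POSIX means clearing the write bit for user, group and others alike.

```d
import std.file : getAttributes, setAttributes;

/// Hypothetical portable toggle; not part of Phobos.
void setReadOnly(string path, bool readOnly)
{
    version (Posix)
    {
        import core.sys.posix.sys.stat : S_IWGRP, S_IWOTH, S_IWUSR, umask;

        enum uint writeBits = S_IWUSR | S_IWGRP | S_IWOTH;
        // Keep only the permission bits (octal 7777), drop the file type.
        uint mode = getAttributes(path) & 0xFFF;

        if (readOnly)
            mode &= ~writeBits;
        else
        {
            // umask can only be read by setting it, so set and restore.
            // This is process-wide and therefore not thread-safe.
            immutable mask = umask(0);
            umask(mask);
            mode |= writeBits & ~cast(uint) mask;
        }
        setAttributes(path, mode);
    }
    else version (Windows)
    {
        import core.sys.windows.winnt : FILE_ATTRIBUTE_NORMAL,
            FILE_ATTRIBUTE_READONLY;

        uint attrs = getAttributes(path);
        attrs = readOnly ? (attrs | FILE_ATTRIBUTE_READONLY)
                         : (attrs & ~FILE_ATTRIBUTE_READONLY);
        if (attrs == 0)
            attrs = FILE_ATTRIBUTE_NORMAL; // "no attributes" on Windows
        setAttributes(path, attrs);
    }
    else
        static assert(0, "unsupported platform");
}
```

The platform-specific integer still exists, but it stays inside the function; the caller only ever passes a `bool`.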
 As a general note, we are suffering from a deflation of reviewers 
 compared to contributors. I've looked at Phobos closely again in the 
 past couple of weeks and saw quite a few things in there that I wouldn't 
 have approved. On the other hand, nobody can be expected to babysit 
 Phobos 24/7 so the right response would be closer scrutiny of pull 
 requests by the entire community. To foster that, one possibility would 
 be to automatically post new pull requests to this group. What do you 
 all think?
Actually I'm opposed to this. It creates noise (especially if dmd/druntime changes are included that only a handful of people understand) and GitHub already offers this functionality in the form of notifications.

"Notifications are based on the repositories you are watching. If you are watching a repository, you will receive notifications for all discussions, including:

o Issues and their comments
o Pull Requests and their comments
o Comments on any commits"

It is very coarse though, so unless GitHub is open to limiting notifications to pull requests only, it could still be useful to post pull requests to this list.
 Thanks,
 
 Andrei
-- Marco
Dec 27 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-12-28 03:46, Marco Leise wrote:

 Wait a second, what about *setting* attributes? Some difficult
 ones are:

 o toggling read-only (for whom? user, group, others?)
 o executable flag
 o hidden flag

 On Windows 'executable' is implicit and based on the extension.
 On Posix 'hidden' is implicit for a file name beginning with a
 dot. We can read the hidden bit on POSIX, but we cannot toggle
 it for example. So we can either not expose these attributes
 at all, ignore them where not applicable when setting
 attributes or add a third state "ignore".
In addition to that, Mac OS X has an additional way of indicating whether a file is hidden or not. It is more similar to how it works on Windows than to the naming scheme from Posix.
 Or looking at it another way:
 DOS attr < POSIX chmod < ACLs

 How do other programming languages find a common ground?
At the top of the documentation of the File class in Ruby, it says the following:

"In the description of File methods, permission bits are a platform-specific set of bits that indicate permissions of a file."

And

"On non-Posix operating systems, there may be only the ability to make a file read-only or read-write. In this case, the remaining permission bits will be synthesized to resemble typical values."

--
/Jacob Carlborg
Dec 28 2013
next sibling parent "Ola Fosheim Grøstad" writes:
As much as I dislike PHP, I think they are onto something with 
having an optional stream context object that you can pass onto 
various file functions. It allows you to treat HTTP and the 
regular file system in the same manner (by having authentication 
etc in the context object).

Rather than setting attributes you would populate the context 
object (you could use a OS-specific factory function/class) with 
the settings you want and then pass it around. I guess you could 
then precreate one context for ReadOnly, another for 
ExecutableBinary, another for MIME text/html etc…
Dec 28 2013
prev sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 28 Dec 2013 15:23:55 +0100,
Jacob Carlborg <doob me.com> wrote:

 On 2013-12-28 03:46, Marco Leise wrote:
 
 Wait a second, what about *setting* attributes? Some difficult
 ones are:

 o toggling read-only (for whom? user, group, others?)
 o executable flag
 o hidden flag

 On Windows 'executable' is implicit and based on the extension.
 On Posix 'hidden' is implicit for a file name beginning with a
 dot. We can read the hidden bit on POSIX, but we cannot toggle
 it for example. So we can either not expose these attributes
 at all, ignore them where not applicable when setting
 attributes or add a third state "ignore".
In addition to that, Mac OS X has an additional way of indicating whether a file is hidden or not. It is more similar to how it works on Windows than to the naming scheme from Posix.
Oh right, they have a hidden attribute as well. I guess if Phobos should expose the 'hidden' state of a file it should write to this attribute, but read both. E.g. for a file named ".hidden" you could do:

    attribs.hidden = false;

and still

    assert(attribs.hidden == true);

due to the POSIX file naming taking precedence then.
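
For illustration, the read side of that idea might look like the sketch below. Both function names are hypothetical. The sketch makes a simplifying assumption: on Windows it reads only the hidden attribute, elsewhere only the leading-dot convention. The Mac OS X hidden flag mentioned earlier is deliberately left out.

```d
import std.path : baseName;

/// True for names like ".hidden"; "." and ".." are not treated as hidden.
bool isDotHidden(string path)
{
    immutable name = baseName(path);
    return name.length > 1 && name[0] == '.' && name != "..";
}

/// Hypothetical portable query; not part of Phobos.
bool isHidden(string path)
{
    version (Windows)
    {
        import core.sys.windows.winnt : FILE_ATTRIBUTE_HIDDEN;
        import std.file : getAttributes;

        return (getAttributes(path) & FILE_ATTRIBUTE_HIDDEN) != 0;
    }
    else
        return isDotHidden(path); // the name decides, nothing to toggle
}
```

Because the POSIX answer is derived from the name alone, a setter could not change it without renaming the file, which is exactly the asymmetry described above.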
 Or looking at it another way:
 DOS attr < POSIX chmod < ACLs

 How do other programming languages find a common ground?
On the top of the documentation of the File class in Ruby, it says the following: "In the description of File methods, permission bits are a platform-specific set of bits that indicate permissions of a file." And "On non-Posix operating systems, there may be only the ability to make a file read-only or read-write. In this case, the remaining permission bits will be synthesized to resemble typical values."
Ruby assumes POSIX semantics and directly exposes chmod() and chown(). Probably because the level of permission control that POSIX offers is good enough? (Even though that only allows setting the read-only bit on Windows.)

o they offer one method to set all permission bits in one go (chmod)
o other bits (e.g. hidden) cannot be set
o attributes can be queried through a series of separate methods: readable, readable_real, world_readable, writable, writable_real, world_writable, executable, owned, grpowned.

On Windows grpowned is just: return false;

--
Marco
Dec 29 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-12-29 11:36, Marco Leise wrote:

 Oh right, they have a hidden attribute as well. I guess if
 Phobos should expose the 'hidden' state of a file it should
 write to this attribute, but read both. E.g. for a file
 named ".hidden" you could do:

    attribs.hidden = false;

 and still

    assert(attribs.hidden == true);

 due to the POSIX file naming taking precedence then.
Yeah, but that seems quite confusing. I don't know if it would be wise to use the same function for that. It would probably also be confusing if "attribs.hidden = false" renamed the file to remove the leading dot, if present. -- /Jacob Carlborg
Dec 29 2013
parent Marco Leise <Marco.Leise gmx.de> writes:
On Sun, 29 Dec 2013 13:44:02 +0100,
Jacob Carlborg <doob me.com> wrote:

 On 2013-12-29 11:36, Marco Leise wrote:
 
 Oh right, they have a hidden attribute as well. I guess if
 Phobos should expose the 'hidden' state of a file it should
 write to this attribute, but read both. E.g. for a file
 named ".hidden" you could do:

    attribs.hidden = false;

 and still

    assert(attribs.hidden == true);

 due to the POSIX file naming taking precedence then.
Yeah, but that seems quite confusing. I don't know if it would be wise to use the same function for that. It would probably also be confusing if "attribs.hidden = false" renamed the file to remove the leading dot, if present.
File renaming is out of the question. Just like you wouldn't remove the .exe extension to make a file non-executable on Windows.

--
Marco
Dec 29 2013
prev sibling next sibling parent reply Martin Nowak <code dawg.eu> writes:
On 12/27/2013 12:12 PM, Marco Leise wrote:
 This is a case of the proverbial thin wrapper around a system
 function, as public API of Phobos. Amongst the large set of
 operating system abstractions, this one is somewhat deceiving,
 because it looks the same on each platform, but the parameter
 has a different meaning on each system.
Yep, totally ugly but it's the counterpart to So if I save these attributes in a std.zip archive I was missing the possibility to restore them.
Dec 27 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 28 Dec 2013 03:50:45 +0100,
Martin Nowak <code dawg.eu> wrote:

 On 12/27/2013 12:12 PM, Marco Leise wrote:
 This is a case of the proverbial thin wrapper around a system
 function, as public API of Phobos. Amongst the large set of
 operating system abstractions, this one is somewhat deceiving,
 because it looks the same on each platform, but the parameter
 has a different meaning on each system.
Yep, totally ugly but it's the counterpart to So if I save these attributes in a std.zip archive I was missing the possibility to restore them.
Hi, thanks for replying here. So .zip files store file attributes as ints? Then they must also have a data member that denotes the originating operating system, right? Otherwise it would be impossible to correctly restore a .zip file from one system on another. (Given relative path names and compatible file name character sets.) And the file attributes need to be converted between systems as well. Otherwise it would create *very* bizarre effects when applying POSIX chmod attributes on a Windows machine. -- Marco
Dec 27 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 28 Dec 2013 04:44:30 +0100,
Marco Leise <Marco.Leise gmx.de> wrote:

 On Sat, 28 Dec 2013 03:50:45 +0100,
 Martin Nowak <code dawg.eu> wrote:
 
 On 12/27/2013 12:12 PM, Marco Leise wrote:
 This is a case of the proverbial thin wrapper around a system
 function, as public API of Phobos. Amongst the large set of
 operating system abstractions, this one is somewhat deceiving,
 because it looks the same on each platform, but the parameter
 has a different meaning on each system.
Yep, totally ugly but it's the counterpart to So if I save these attributes in a std.zip archive I was missing the possibility to restore them.
Hi, thanks for replying here. So .zip files store file attributes as ints? Then they must also have a data member that denotes the originating operating system, right? Otherwise it would be impossible to correctly restore a .zip file from one system on another. (Given relative path names and compatible file name character sets.) And the file attributes need to be converted between systems as well. Otherwise it would create *very* bizarre effects when applying POSIX chmod attributes on a Windows machine.
Ok, so there is a compatibility field for the file attributes in a .zip file. So a .zip extractor has to version(Windows/Posix) anyway to check if the attributes for a given file are compatible with the host OS. Couldn't SetFileAttributes and chmod be called right there on the spot? -- Marco
Dec 27 2013
parent reply Martin Nowak <code dawg.eu> writes:
On 12/28/2013 05:01 AM, Marco Leise wrote:
 Ok, so there is a compatibility field for the file attributes
 in a .zip file. So a .zip extractor has to version(Windows/Posix)
 anyway to check if the attributes for a given file are
 compatible with the host OS. Couldn't SetFileAttributes and
 chmod be called right there on the spot?
It does in-memory extraction, so you have to set the attributes after writing out the file.

https://github.com/D-Programming-Language/installer/pull/31
https://d.puremagic.com/issues/show_bug.cgi?id=11789
https://github.com/D-Programming-Language/installer/pull/33
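
The extract-then-restore sequence described here could look roughly like the sketch below. The function name `extractWithAttributes` is hypothetical. The sketch assumes that `ArchiveMember.fileAttributes` returns 0 when the stored attributes do not apply to the host OS, and it does not guard against member names containing "../".

```d
import std.file : mkdirRecurse, read, setAttributes, write;
import std.path : buildPath, dirName;
import std.zip : ZipArchive;

/// Hypothetical helper: unpack an archive and restore file attributes.
void extractWithAttributes(string zipPath, string destDir)
{
    auto zip = new ZipArchive(read(zipPath));
    foreach (name, member; zip.directory)
    {
        // NOTE: member names are trusted here; real code should validate.
        auto target = buildPath(destDir, name);
        if (name.length && name[$ - 1] == '/')
        {
            mkdirRecurse(target); // directory entry, nothing to write
            continue;
        }
        mkdirRecurse(dirName(target));
        write(target, zip.expand(member)); // extraction happens in memory

        // Attributes can only be applied once the file exists on disk.
        immutable attrs = member.fileAttributes;
        if (attrs != 0)
            setAttributes(target, attrs);
    }
}
```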
Dec 27 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 28 Dec 2013 05:13:57 +0100,
Martin Nowak <code dawg.eu> wrote:

 On 12/28/2013 05:01 AM, Marco Leise wrote:
 Ok, so there is a compatibility field for the file attributes
 in a .zip file. So a .zip extractor has to version(Windows/Posix)
 anyway to check if the attributes for a given file are
 compatible with the host OS. Couldn't SetFileAttributes and
 chmod be called right there on the spot?
It does in memory extraction, you have to set the attributes after writing out the file. https://github.com/D-Programming-Language/installer/pull/31 https://d.puremagic.com/issues/show_bug.cgi?id=11789 https://github.com/D-Programming-Language/installer/pull/33
Some backlog! So getAttributes was there already, the dmd installer script used a custom "setAttributes" to extract ZIP files, and so it was decided that setAttributes should be in Phobos, too. (in short)

Don't kill me, but I think this version of setAttributes should stay local to create_dmd_release (or std.zip if it was extended to support extraction to the file system). In the most general use case it still requires a code block like this:

    bool useFileAttr = false;
    version (Posix)
    {
        useFileAttr = data.isMeantForPosix;
    }
    else version (Windows)
    {
        useFileAttr = data.isMeantForWindows;
    }
    if (useFileAttr)
        std.file.setAttributes(data.fileName, data.fileAttribs);

Meaning it is (as far as I can see) only of use for external data blocks that contain OS specific file attributes in a shared 32-bit field and at that point it doesn't gain much over:

    version (Posix)
    {
        if (data.isMeantForPosix)
            chmod( toUTFz!(char*)(data.fileName), data.fileAttr );
    }
    else version (Windows)
    {
        if (data.isMeantForWindows)
            SetFileAttributesW( toUTFz!(wchar*)(data.fileName), data.fileAttr );
    }

For Phobos we need portable solutions. But it is also clear that std.file.setAttributes cannot be replaced to 100% by a portable solution. The question is: Does portable code _care_ about setting each possible chmod flag or Windows file attribute? Or are the use cases much more limited there, like making a file read-only or querying if it is a directory?

--
Marco
Dec 27 2013
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, December 28, 2013 06:54:53 Marco Leise wrote:
 For Phobos we need portable solutions. But it is also clear that
 std.file.setAttributes cannot be replaced to 100% by a portable
 solution. The question is: Does portable code _care_ about setting
 each possible chmod flag or Windows file attribute? Or are the use
 cases much more limited there, like making a file read-only or
 querying if it is a directory?
We need to try hard to make Phobos cross-platform and portable, but some stuff just can't be, and std.file already has some functions which fall in that category (e.g. anything symlink related or dealing with file times). And having a D wrapper around that functionality can make code much cleaner, so I'm all for having a limited number of system-specific functions in std.file if that's what it takes to get the job done (and file permissions tend to fall in that category). - Jonathan M Davis
Dec 27 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Fri, 27 Dec 2013 22:12:37 -0800,
Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Saturday, December 28, 2013 06:54:53 Marco Leise wrote:
 For Phobos we need portable solutions. But it is also clear that
 std.file.setAttributes cannot be replaced to 100% by a portable
 solution. The question is: Does portable code _care_ about setting
 each possible chmod flag or Windows file attribute? Or are the use
 cases much more limited there, like making a file read-only or
 querying if it is a directory?
We need to try hard to make Phobos cross-platform and portable, but some stuff just can't be, and std.file already has some functions which fall in that category (e.g. anything symlink related or dealing with file times). And having a D wrapper around that functionality can make code much cleaner, so I'm all for having a limited number of system-specific functions in std.file if that's what it takes to get the job done (and file permissions tend to fall in that category). - Jonathan M Davis
But SetFileAttributes doesn't set file permissions (except for "read-only")! Chmod does. If you want a D wrapper for POSIX chmod in std.file, it should be exactly that. For Windows you need to work with ACLs to change permissions. -- Marco
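
A wrapper that is "exactly that" might be as small as the following sketch. The name `setPosixMode` is made up for illustration. The point is only that a POSIX-only function is declared under `version (Posix)` and does not pretend to exist elsewhere.

```d
/// Hypothetical POSIX-only wrapper; deliberately absent on other systems.
version (Posix)
void setPosixMode(string path, uint mode)
{
    import core.sys.posix.sys.stat : chmod;
    import core.sys.posix.sys.types : mode_t;
    import std.exception : errnoEnforce;
    import std.string : toStringz;

    // Throws ErrnoException with the system error text on failure.
    errnoEnforce(chmod(path.toStringz, cast(mode_t) mode) == 0,
                 "chmod failed for " ~ path);
}
```

A call such as `setPosixMode("script.sh", octal!755)` (with `std.conv.octal`) then compiles only where chmod semantics actually apply.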
Dec 27 2013
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, December 28, 2013 07:21:28 Marco Leise wrote:
 Jonathan M Davis <jmdavisProg gmx.com> wrote:
 We need to try hard to make Phobos cross-platform and portable, but some
 stuff just can't be, and std.file already has some functions which fall
 in that category (e.g. anything symlink related or dealing with file
 times). And having a D wrapper around that functionality can make code
 much cleaner, so I'm all for having a limited number of system-specific
 functions in std.file if that's what it takes to get the job done (and
 file permissions tend to fall in that category).
 
 - Jonathan M Davis
But SetFileAttributes doesn't set file permissions (except for "read-only")! Chmod does. If you want a D wrapper for POSIX chmod in std.file, it should be exactly that. For Windows you need to work with ACLs to change permissions.
I'm not arguing the exact API for setFileAttributes, since I haven't spent the time as of yet to look it over and therefore do not feel qualified to comment on it specifically. My main point was that some stuff in std.file is system-specific and that if it has to be, it's better to have it in std.file as system-specific rather than not having it at all just because it couldn't be completely cross-platform and portable.

So, just because setFileAttributes is doing something system-specific does not mean that it's bad. But its validity may very well be argued on other counts (e.g. whether it's actually a good API for what it does).

- Jonathan M Davis
Dec 28 2013
parent reply "Ola Fosheim Grøstad" writes:
On Saturday, 28 December 2013 at 08:18:44 UTC, Jonathan M Davis 
wrote:
 on it specifically. My main point was that some stuff in 
 std.file is system-
 specific and that if it has to be, it's better to have it in 
 std.file as system-
 specific rather than not having it at all just because it 
 couldn't be
 completely cross-platform and portable. So, just because 
 setFileAttributes is
 doing something system-specific does not mean that it's bad.
It is bad. If you want system-specific behaviour you should have a separate interface that provides all the advantages that going to a lower level provides. Having half-assed OS-specific support is too pragmatic and will lead to a legacy mess in the long run when those interfaces become obsolete.

A good file abstraction should also support newer file systems like Google Cloud Storage, though. GCS does not support append() or directories.

A good file abstraction should also provide mechanisms to deal with different levels of consistency on the underlying filesystem and caching-mechanism:
- read after write might return an old version
- read after write returns the new version if read from the same computer
- read after write always returns the new version

As a new language D should support Cloud based environments out-of-the-box with a hierarchy of functionality down to the peculiarities of Windows/Posix etc so that you can decide to code in a manner that supports all platforms without having to litter your code with ifs and version-statements… and also restrict you to the subset you have "authorized" so that you can develop on your Windows system and port it to the cloud with just a recompile without knowing the details of the cloud file system. (i.e. the compiler warns you when you are using system specific functionality).

I also think good file system support implies built-in caching and letting the library decide whether to turn library-level caching on or off based on what the operating system or underlying interface supports. E.g. the ability to tell the system that you want successive reads of a file to be available fast at the cost of consistency.
Dec 28 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Saturday, 28 December 2013 at 08:18:44 UTC, Jonathan M Davis
wrote:
 on it specifically. My main point was that some stuff in
 std.file is system-
 specific and that if it has to be, it's better to have it in
 std.file as system-
 specific rather than not having it at all just because it
 couldn't be
 completely cross-platform and portable. So, just because
 setFileAttributes is
 doing something system-specific does not mean that it's bad.
Agreed. And really I didn't want to make a point about how it should be done right when I started this thread. This particular function just seemed not in line with Phobos principles as I remember them. :)

On Sat, 28 Dec 2013 09:02:01 +0000, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com> wrote:
 It is bad. If you want system-specific behaviour you should have
 a separate interface that provide all the advantages that going
 to a lower level provides. Having half-assed OS-specific support
 is too pragmatic and will lead to a legacy mess in the long run
 when those interfaces become obsolete.

 A good file abstraction should also support newer file systems
 like Google Cloud Storage, though. GCS does not support append()
 or directories.

 A good file abstraction should also provide mechanisms to deal
 with different levels of consistency on the underlying filesystem
 and caching-mechanism:
 - read after write might return an old version
 - read after write returns the new version if read from the same
 computer
 - read after write always returns the new version

 As a new language D should support Cloud based environments
 out-of-the-box with a hierarchy of functionality down to the
 peculiarities of Windows/Posix etc so that you can decide to code
 in a manner that supports all platforms without having to litter
 your code with ifs and version-statements… and also restrict you
 to the subset you have "authorized" so that you can develop on
 your Windows system and port it to the cloud with just a
 recompile without knowing the details of the cloud file system.
 (i.e. the compiler warns you when you are using system specific
 functionality).

 I also think good file system support implies built-in caching
 and let the library decide whether to turn library-level caching
 on or off based on what the operating system or underlying
 interface supports. E.g. the ability to tell the system that you
 want successive reads of a file to be available fast at the cost
 of consistency.
Are you hijacking this thread to ask for Google Cloud support? ;) Don't forget FTP, SSH, WebDAV, ...

Implementing all of this can take quite some time and will result in something like the "Gnome virtual file-system". It might become so large that std.file is easily reimplemented as a tiny part of it for the native file-system module.

Basic OS level file-system I/O support is useful on its own, especially in a systems programming language. You don't need to pull in a whole bunch of dependencies to read a text file.

--
Marco
Dec 29 2013
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Sunday, 29 December 2013 at 11:01:05 UTC, Marco Leise wrote:
 Basic OS level file-system I/O support is useful on its own,
 especially in a systems programming language. You don't need
 to pull in a whole bunch of dependencies to read a text file.
Yes, it is useful to have good bindings for OS-level APIs. However, Posix is no longer an adequate abstraction for cross-platform file systems. In the cloud or in clusters you mount network drives with other properties than a local drive.

It is much better to encourage a high-level interface for general file access. That encourages a more portable design for an application, which makes porting much easier. Porting software that assumes a local drive to the cloud is tedious.

You don't pull in a whole bunch of dependencies. You include the modules you need and they share the same interface. That's a clean design. The file module reminds me of Perl and old PHP. That was ok in the 1990s, but it is the wrong way of creating a file system abstraction in 2014.

If you just want to read and write a file with a specific path there is no reason for using a non-portable interface. Encourage a portable interface and porting to the cloud becomes trivial: you just swap out the context-object.
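
To make the "swap out the context-object" idea concrete, here is a minimal sketch. Every name in it (`StorageContext`, `LocalContext`, `MemoryContext`, `saveReport`) is invented for illustration, and the in-memory class merely stands in for a remote backend.

```d
import std.file : mkdirRecurse, read, write;
import std.path : buildPath, dirName;

/// Hypothetical minimal storage interface shared by all backends.
interface StorageContext
{
    const(ubyte)[] load(string key);
    void store(string key, const(ubyte)[] data);
}

/// Local file system backend rooted at a directory.
final class LocalContext : StorageContext
{
    private string root;
    this(string root) { this.root = root; }

    const(ubyte)[] load(string key)
    {
        return cast(const(ubyte)[]) read(buildPath(root, key));
    }

    void store(string key, const(ubyte)[] data)
    {
        auto path = buildPath(root, key);
        mkdirRecurse(dirName(path));
        write(path, data);
    }
}

/// In-memory backend, standing in for e.g. a cloud store.
final class MemoryContext : StorageContext
{
    private const(ubyte)[][string] blobs;

    const(ubyte)[] load(string key)
    {
        if (auto p = key in blobs)
            return *p;
        return null; // missing key
    }

    void store(string key, const(ubyte)[] data) { blobs[key] = data; }
}

/// Application code depends only on the interface.
void saveReport(StorageContext ctx, string text)
{
    ctx.store("reports/latest.txt", cast(const(ubyte)[]) text);
}
```

`saveReport` runs unchanged against either backend; only the object passed in differs.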
Dec 29 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/13 4:20 AM, "Ola Fosheim Grøstad"
<ola.fosheim.grostad+dlang gmail.com> wrote:
 On Sunday, 29 December 2013 at 11:01:05 UTC, Marco Leise wrote:
 Basic OS level file-system I/O support is useful on its own,
 especially in a systems programming language. You don't need
 to pull in a whole bunch of dependencies to read a text file.
Yes, it is useful to have good bindings for OS-level apis. However, Posix is no longer an adequate abstraction for cross-platform file systems. In the cloud or in clusters you mount network drives with other properties than a local drive.
Well one question is what other successful designs could we use as precedent? I don't know of any successful unified APIs for regular/remote filesystems that also allow full local file functionality. The closest abstraction I know of is the FUSE-installed filesystems, but those adapt foreign file systems to the Posix interface. Andrei
Dec 29 2013
parent reply "Ola Fosheim Grøstad" writes:
On Sunday, 29 December 2013 at 15:05:31 UTC, Andrei Alexandrescu 
wrote:
 Well one question is what other successful designs could we use 
 as precedent? I don't know of any successful unified APIs for 
 regular/remote filesystems that also allow full local file 
 functionality. The closest abstraction I know of is the 
 FUSE-installed filesystems, but those adapt foreign file 
 systems to the Posix interface.
I guess file storage design is currently in flux and a bit lacking in terms of APIs, partially because of cloud computing and SSDs, so my suggestion would be to:

1. Provide os.linux, os.windows, os.osx, os.ios, os.posix for low level local-storage access which closely resembles the native/standard APIs.

2. Provide a novel hierarchy of abstraction for basic file I/O that is highly portable and affords modular and extensible implementation, and make those really easy to use in order to encourage portable libraries for D.

I don't think the overhead of function calls through interfaces matters all that much because the cost associated with context switches for system calls will dominate that by a solid margin (my assumption).

So basically on the most abstract and limited level you get a key/value cache (memcache). The next level is key/value datastore (couchdb/mongodb/amazon/google etc). Then the ability to list all keys. Then the ability to list keys based on hierarchy (directories) etc.

If you can specify what kind of functionality your application needs and set those constraints in a file-system context object then you can set the level of portability for your application. I think this would be a real advantage. Then let that file-system object have a factory for "file-mode" and let that do the right thing when you attempt to set the filetype as a MIME-constant (e.g. set the right extension or set the right attribute).

Another advantage is that you could create "safe virtual filesystems", e.g. create a file-system context object that resolves ".." and prefixes all paths with "/tmp/" and sets a default "file-mode" for created objects (like read-only and owner). That would make it easy to write more secure web-servers. You could even specify a name-mangler and create a db-like filesystem for arbitrary keys with automatic creation of directories (for efficiency) if the abstraction level is right.

I don't think the overhead is all that much if it is done in a modular fashion. You only import what you need. Finding the right abstraction is not trivial, I agree. So you need to do some analysis of what current storage-solutions provide to find the common denominators.

(Having done some work with ndb on App Engine and the HttpRequest stuff in Dart I see the advantage of wrapping up resource loading as Futures. Mostly sugar, but uniform and easy to remember if used on all resources. E.g. request all resources as futures, then process them as they become available.)
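
The "safe virtual filesystem" part can be illustrated with a small path resolver. This is a sketch under stated assumptions: the name `resolveInside` is hypothetical, symbolic links are not resolved, and a root of "/" is not handled.

```d
import std.algorithm.searching : startsWith;
import std.exception : enforce;
import std.path : absolutePath, buildNormalizedPath, dirSeparator;

/// Hypothetical helper: map a user-supplied path into a sandbox root,
/// rejecting anything that would escape it via ".." or an absolute path.
/// Symlinks are NOT resolved, so this is purely a lexical check.
string resolveInside(string root, string userPath)
{
    immutable base = buildNormalizedPath(absolutePath(root));
    // Normalization collapses "." and ".." segments lexically.
    immutable full = buildNormalizedPath(base, userPath);
    enforce(full == base || full.startsWith(base ~ dirSeparator),
            "path escapes the sandbox: " ~ userPath);
    return full;
}
```

A context object along the lines described above could run every incoming path through such a check before touching the underlying storage.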
Dec 29 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/13 7:43 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com> wrote:
 On Sunday, 29 December 2013 at 15:05:31 UTC, Andrei Alexandrescu wrote:
 Well one question is what other successful designs could we use as
 precedent? I don't know of any successful unified APIs for
 regular/remote filesystems that also allow full local file
 functionality. The closest abstraction I know of is the FUSE-installed
 filesystems, but those adapt foreign file systems to the Posix interface.
 I guess file storage design is currently in flux and a bit lacking in
 terms of APIs, partially because of cloud computing and SSDs, so my
 suggestion would be to:

 1. Provide os.linux, os.windows, os.osx, os.ios, os.posix for low level
 local-storage access which closely resembles the native/standard APIs.
I think we (and others) have done a fine job so far at abstracting away e.g. very different ways of figuring whether an entry is a directory on Windows vs. Posix. Now we're to throw all that away and go back to silex stones and bear claws?
 2. Provide a novel hierarchy of abstraction for basic file I/O that is
 highly portable and affords modular and extensible implementations, and
 make those really easy to use in order to encourage portable libraries
 for D.
Yah, it would be great if that were awfully more detailed :o).
 I don't think the overhead of function calls through interfaces
 matters all that much because the cost associated with context shifts
 for system calls will dominate that by a solid margin (my assumption).
That's exactly what they said when they designed iostreams.
 So basically on the most abstract and limited level you get a key/value
 cache (memcache). The next level is key/value datastore
 (couchdb/mongodb/amazon/google etc).
 Then the ability to list all keys.
 Then the ability to list keys based on hierarchy (directories) etc.

 If you can specify what kind of functionality your application needs and
 set those constraints in a file-system context object then you can set
 the level of portability for your application. I think this would be a
 real advantage. Then let that file-system object have a factory for
 "file-mode" and let that do the right thing when you attempt to set the
 filetype as a MIME-constant (e.g set the right extension or set the
 right attribute).

 Another advantage is that you could create "safe virtual filesystems"
 e.g. create a file-system context object that resolves ".." and prefixes
 all paths with "/tmp/" and sets a default "file-mode" for created objects
 (like read-only and owner). Which would make it easy to write more
 secure web-servers. You could even specify a name-mangler and create a
 db-like filesystem for arbitrary keys with automatic creation of
 directories (for efficiency) if the abstraction level is right.

 I don't think the overhead is all that much if it is done in a modular
 fashion. You only import what you need. Finding the right abstraction is
 not trivial, I agree. So you need to do some analysis of what current
 storage-solutions provide to find the common denominators.

 (Having done some work with ndb on App Engine and the HttpRequest stuff
 in Dart I see the advantage of wrapping up resource loading as Futures.
 Mostly sugar, but uniform and easy to remember if used on all resources.
 E.g. request all resources as futures, then process them as they become
 available.)
Again I think we'd need a ton of detail here to even assess whether this has merit. So far it's an interesting brain dump.

Andrei
Dec 29 2013
parent "Ola Fosheim Grøstad" writes:
On Sunday, 29 December 2013 at 16:06:32 UTC, Andrei Alexandrescu 
wrote:
 I think we (and others) have done a fine job so far at 
 abstracting away e.g. very different ways of figuring whether 
 an entry is a directory on Windows vs. Posix. Now we're to 
 throw all that away and go back to silex stones and bear claws?
I am not telling anyone what to do. I am saying what I would prefer. I would prefer a standard low-level layer and a flexible high-level layer.
 Yah, it would be great if that were awfully more detailed :o).
I am not coming with a finished design, but I think it is doable.
 That's exactly what they said when they designed iostreams.
Buffered i/o is another issue because you either have a high interface-call/system call ratio, or make too many system-calls.
 Again I think we'd need a ton of detail here to even assess 
 whether this has merit. So far it's an interesting brain dump.
Ok, let me give you some motivation:

Old software bases tend to evolve and make assumptions that do not hold over time. I recently ported phpbb to App Engine, which provides a read-only filesystem. Phpbb used the following:

1. templates: on the filesystem, or in mysql if modified
2. avatar images: on the filesystem
3. processed templates: cached on the filesystem

In order to port it I had to create my own filesystem abstraction and map processed templates onto memcache and avatar images onto Cloud Storage. Basically sifting through the code and hoping I didn't overlook anything.

This file-system port would have been trivial if phpbb had instantiated separate "virtual filesystems" for templates, avatars and processed templates. I could then have specified that the avatar-filesystem should default to MIME-type 'image/jpeg', which would let me serve the images directly from Cloud Storage (through Google's efficient infrastructure) with little extra work (low chance of creating bugs).

In most cases you don't need to know what filesystem different entities reside on. When moving into the cloud you will most likely gain access to a diverse set of storage mechanisms that are optimized for different usage patterns (like on-the-fly rescaling of images, direct http interfaces, memcaches, redundant high-availability storage etc).
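
To make the "separate virtual filesystems" idea a bit more concrete, here is a minimal sketch of the capability levels as D interfaces, with each entity declaring only the weakest level it needs. All names (KeyValueCache, KeyValueStore, ListableStore, MemoryStore, ForumStorage) are made up for illustration and are not an existing Phobos or Tango API. MemoryStore is an in-memory stand-in so the sketch compiles and runs.

```d
import std.string : representation;

/// Level 1: entries may be evicted at any time (memcache-like).
interface KeyValueCache
{
    const(ubyte)[] get(string key); // null when the key is absent
    void put(string key, const(ubyte)[] value);
}

/// Level 2: same operations, but entries persist (datastore-like).
interface KeyValueStore : KeyValueCache {}

/// Level 3: keys can additionally be enumerated.
interface ListableStore : KeyValueStore
{
    string[] keys();
}

/// In-memory stand-in, only here to make the sketch runnable.
final class MemoryStore : ListableStore
{
    private const(ubyte)[][string] data;

    const(ubyte)[] get(string key)
    {
        auto p = key in data;
        return p ? *p : null;
    }

    void put(string key, const(ubyte)[] value) { data[key] = value; }

    string[] keys() { return data.keys; }
}

/// The application states what each entity needs, not where it lives.
struct ForumStorage
{
    ListableStore templates; // must be enumerable
    KeyValueStore avatars;   // persistent, looked up by key only
    KeyValueCache processed; // processed templates; safe to lose
}

unittest
{
    auto mem = new MemoryStore;
    auto storage = ForumStorage(mem, mem, mem);
    storage.processed.put("index", "<html>".representation);
    assert(storage.processed.get("index") == "<html>".representation);
    assert(storage.avatars.get("missing") is null);
}
```

With this shape, a port like the one described would swap the object behind avatars or processed for a cloud-backed implementation without touching the code that uses them. Whether the interface-call overhead is acceptable is exactly the open question from earlier in the thread.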
Dec 29 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-12-29 12:00, Marco Leise wrote:

 Are you hijacking this thread to ask for Google Cloud
 support? ;) Don't forget FTP, SSH, WebDAV, ...
 Implementing all of this can take quite some time and will
 result in something like the "Gnome virtual file-system". It
 might become so large that std.file is easily reimplemented as
 a tiny part of it for the native file-system module.
Tango already contains a virtual file system interface. If I recall correctly it supports FTP and regular, local, file systems.

-- 
/Jacob Carlborg
Dec 29 2013
prev sibling parent Martin Nowak <code dawg.eu> writes:
On 12/27/2013 12:12 PM, Marco Leise wrote:
 It is a
 coincidence that both Windows and POSIX use a 32-bit integer
 for file attributes, but not technically sound to confuse them
 in a public API.
On Posix this should use mode_t instead, which is 16-bit on some platforms.
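
For illustration, a small sketch of the width difference. NativeFileAttributes and toNative are hypothetical names and not something std.file provides. mode_t comes from druntime's core.sys.posix.sys.types, and on Windows the attribute word is a 32-bit DWORD.

```d
version (Posix)
{
    import core.sys.posix.sys.types : mode_t;
    alias NativeFileAttributes = mode_t; // 16 or 32 bits, platform dependent
}
else version (Windows)
{
    alias NativeFileAttributes = uint;   // DWORD, always 32 bits
}

// A uint parameter is wide enough for either representation ...
static assert(NativeFileAttributes.sizeof <= uint.sizeof);

/// ... but narrowing back to the native type should be checked.
NativeFileAttributes toNative(uint attributes)
{
    import std.conv : to;
    // std.conv.to throws ConvOverflowException if the value does not fit.
    return attributes.to!NativeFileAttributes;
}
```

So a uint parameter works on both systems, but only by widening on platforms where mode_t is narrower.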
Dec 27 2013