digitalmars.D - dmdz
- Ellery Newcomer (29/29) Mar 11 2010 So I'm toying with a prototype, which is proving nice enough, but there
- Andrei Alexandrescu (17/46) Mar 11 2010 To me this looks like a definite V2 thing honed by experience. For now
- Ellery Newcomer (24/57) Mar 11 2010 It is. I suppose the name isn't so important, but I really hate zip
- Walter Bright (11/15) Mar 11 2010 What I'd like to see is the creation of a library file interface, say:
- Nick Sabalausky (20/22) Mar 11 2010 This is a bit of a "vim vs emacs" or "static vs dynamic" sort of issue.
- Lars T. Kyllingstad (14/39) Mar 12 2010 I don't really disagree, but it's not always that simple. Take tar, for...
- Bernard Helyer (2/9) Mar 12 2010 The right click 'extract here' under GNOME does *exactly* this.
- Lutger (4/16) Mar 12 2010 Same under KDE: Dolphin right click 'extract here, autodetect subfolder'...
- Chad J (2/20) Mar 13 2010
- Bill Baxter (7/12) Mar 12 2010 WinRAR has an option for that if the zip file and the single folder
- Ellery Newcomer (5/28) Mar 12 2010 I rarely come across a zip file that doesn't follow that convention, and...
- Walter Bright (12/14) Mar 11 2010 How about:
- Lars T. Kyllingstad (4/6) Mar 12 2010 Cool! Looking forward to using it. :)
- Ellery Newcomer (3/9) Mar 12 2010 I have no idea why it's called dmdz and not zdmd. My guess is so you can...
- Ellery Newcomer (12/12) Mar 15 2010 Hello.
- Nick Sabalausky (3/15) Mar 15 2010 I'd just require a setting in dmd.conf for that.
- Ellery Newcomer (5/26) Mar 15 2010 Of course it turns out to be a screwy zip file. Nevermind..
- Ellery Newcomer (11/11) Mar 16 2010 Anyone want to play with dmdz, here it is:
- Lutger (3/15) Mar 16 2010 You might like TRAP:
- Robert Clipsham (10/21) Mar 16 2010 $SOMECMD
- Andrei Alexandrescu (20/31) Mar 16 2010 This is solid work, but I absolutely refuse to believe the solution must...
- Ellery Newcomer (25/47) Mar 17 2010 I count 2 modules and about 800 loc. 2 to 300 of which implements
- Andrei Alexandrescu (111/160) Mar 17 2010 Thanks for replying to this. I'd been afraid that I was coming off too
- Ellery Newcomer (40/88) Mar 17 2010 dang right you are. If you're going to count the antlr runtime, then
- Andrei Alexandrescu (28/127) Mar 17 2010 I meant the antlr grammar for the task. I gave two counts, one excluding...
- BCS (8/10) Mar 17 2010 The difference in speed between disk IO and CPU /might/ be high enough t...
- Andrei Alexandrescu (5/14) Mar 18 2010 That works on zsh, I'm not sure whether it works with other shells.
- Walter Bright (4/15) Mar 18 2010 I'd argue that for this case, caching the extracted files is not worth t...
- Andrei Alexandrescu (6/21) Mar 18 2010 Of course not, but the typical scenario is to just run a program off its...
- Walter Bright (3/4) Mar 18 2010 Caching the executable, sure, but I'm not sure that translates into a ca...
- Andrei Alexandrescu (4/8) Mar 18 2010 I see. It should be fine to cache the exe and regenerate only if the
- BCS (7/25) Mar 18 2010 The only case I can think of where putting a zip file in the middle of t...
- Walter Bright (3/7) Mar 18 2010 It might even be practical to have dmdz compile from a zip file specifie...
- Andrei Alexandrescu (3/10) Mar 18 2010 In that case I do think caching would be helpful :o).
- Lutger (4/12) Mar 18 2010 Just like dsss did...(and still does for D1 I guess)
- Walter Bright (2/6) Mar 18 2010 Anyone can revive it if they're motivated to!
- Ellery Newcomer (8/78) Mar 18 2010 Yeah, you're right there.
- Robert Clipsham (4/6) Mar 18 2010 That seems like a tad too much for it... Surely it would only take a few...
- Ellery Newcomer (4/10) Mar 18 2010 Sure. I could write it in 100 loc. My concern is they would be a buggy
- Andrei Alexandrescu (3/15) Mar 18 2010 You could write it in 5 loc.
- Andrei Alexandrescu (32/49) Mar 18 2010 My bad for not being able to see that in the code. I read through and
- Clemens (2/10) Mar 18 2010 I think it would be a good idea to stay well away from gratuitous portab...
- Andrei Alexandrescu (4/18) Mar 18 2010 Yah, I agree. Well `` don't need to be used in the command line, a
- Walter Bright (5/11) Mar 18 2010 dmd will already read switches out of a file:
- Lionello Lunesu (8/12) Mar 18 2010 and I'm out..
- Andrei Alexandrescu (9/21) Mar 18 2010 I looked around. basename and dirname suggest that the ones in phobos
- Robert Clipsham (10/13) Mar 18 2010 I'm usually one of those, but seen as you asked... It looks good :) I
- Ellery Newcomer (5/18) Mar 18 2010 It would only involve building support for those formats into phobos :)
- Andrei Alexandrescu (10/29) Mar 18 2010 Heh, incidentally I just needed a tar reader a few days ago, so I wrote
- Walter Bright (10/23) Mar 18 2010 That's great, but I only suggest that this not be added to Phobos until ...
- Andrei Alexandrescu (15/39) Mar 18 2010 The archive type should be a D class inheriting ArchiveReader, so no
- Walter Bright (16/22) Mar 18 2010 The reasons for reading the file to determine the archive type are:
- Andrei Alexandrescu (31/55) Mar 18 2010 It is not necessary, only vital.
- Walter Bright (4/6) Mar 18 2010 I understand your point.
- Andrei Alexandrescu (7/14) Mar 18 2010 Makes sense.
- Walter Bright (8/27) Mar 18 2010 Maybe a better way to do it is to just pass a delegate that encapsulates...
- Walter Bright (5/6) Mar 18 2010 Another thing needed for the interface is an associative array that maps...
- Andrei Alexandrescu (4/10) Mar 18 2010 Emphatically NO. Archives work with streams. You can build indexing on
- Michel Fortin (17/28) Mar 18 2010 Andrei, have you took a look at the Zip file format? It's not streamable...
- Andrei Alexandrescu (7/31) Mar 18 2010 Interesting, thank you. I still think generally a random-access
- Walter Bright (4/15) Mar 18 2010 Such an interface won't work with .lib or .a archives. Both have an embe...
- Andrei Alexandrescu (6/21) Mar 18 2010 Now I understand why linkers thrash the disk.
- Walter Bright (8/31) Mar 18 2010 I think this is incorrect. The table of contents in the .lib files was d...
- Robert Clipsham (4/5) Mar 18 2010 http://en.wikipedia.org/wiki/Xz - A lot of linux distro's seem to be
So I'm toying with a prototype, which is proving nice enough, but there are a few things that I'm not quite sure which way to go with. Currently I have the general pattern

dmdz [global flags] foo1.zip [foo1 local flags] foo2.zip [foo2 local flags] ...

although when given multiple zips it just compiles them independently. My thought was that when foo_i.zip compiles to a lib file, the result should be made available to all subsequent zip files, so you could do something like

dmdz lib1.zip lib2.zip main.zip

where lib2 can depend on lib1 and main can depend on either lib. But then most if not all of lib1's flags need to be forwarded to lib2 and main. The other alternative I thought of is that all the zip files get extracted and then all compiled at once. Or are multiple zip files even a good idea? For the more specific case

dmdz [global flags] foo.zip [local flags]

it expects all the relevant content in foo.zip to be located inside a directory foo, and doesn't extract anything else unless you explicitly tell it to. Also, there can be a file 'cmd' (name?) inside foo.zip which contains additional flags for the compile, with local flags overriding global flags, which in turn override flags found in cmd. At least for dmdz flags; dmd flags get filtered out and forwarded to dmd. The current strategy for compiling just involves giving dmd every compilable thing that was extracted. There's also an option to compile each source file separately (which I put in after hitting an odd Out of Memory Error). Comments? Also, are there any plans for std.zip, e.g. with regard to ranges, input/output streams, etc.? The current API seems a smidge spartan.
Mar 11 2010
On 03/11/2010 12:11 PM, Ellery Newcomer wrote:So I'm toying with a prototype, which is proving nice enough, but there be a few things that I'm not quite sure which way to go with.I was eagerly waiting for you to get back regarding this project. Thank you!Currently I have the general pattern dmdz [global flags] foo1.zip [foo1 local flags] foo2.zip [foo2 local flags] ... although when given multiple zips it just compiles them independently. My thought was when fooi.zip compiles a lib file, the result should be made available to all subsequent zip files, so you could do something like dmdz lib1.zip lib2.zip main.zip where lib2 can depend on lib1 and main can depend on either lib. But then most if not all of lib1's flags need to be forwarded to lib2 and main. The other alternative I thought of is all the zip files get extracted and then all compiled at once. Or is multiple zip files even a good idea?To me this looks like a definite V2 thing honed by experience. For now the focus is distributing entire programs as one zip file.For the more specific case dmdz [global flags] foo.zip [local flags] it expects all the relevant content in foo.zip to be located inside directory foo, and doesn't extract anything else unless you explicitly tell it to.I don't understand this. Does the program foo.zip have to contain an actual directory called "foo"? That's a bit restrictive. My initial plan revolved around expanding foo.zip somewhere in a unique subdir of the temp directory and considering that a full-blown project resides inside that subdir.Also, there can be a file 'cmd' (name?) inside foo.zip which contains additional flags for the compile, with local flags overriding global flags overriding flags found in cmd. At least for dmdz flags.How about dmd.conf?dmd flags get filtered out and forwarded to dmd. The current strategy for compiling just involves giving every compilable thing extracted to dmd. 
There's also an option to compile each source file separately (which I put in after hitting an odd Out of Memory Error). Comments?That sounds about right. One thing I want is to stay reasonably KISS (e.g. like rdmd is), i.e. not invent a lot of arcana. rdmd has many heuristics and limitations but has the virtue that it gets a specific job done without requiring its user to learn most anything. I hope dmdz turns out similarly simple.Also, are there any plans for std.zip, e.g. with regard to ranges, input/output streams, etc? The current api seems a smidge spartan.I've hoped to rewrite std.zip forever, but found no time to do so. Andrei
Mar 11 2010
On 03/11/2010 12:29 PM, Andrei Alexandrescu wrote:It is. I suppose the name isn't so important, but I really hate zip files whose contents aren't contained inside a single directory. Also, there would be a bit of a dichotomy if dmdz foo.zip resulted in a directory 'foo' wherever, but unzip foo.zip resulted in what would be the contents of 'foo' above. Another thing: do you envision this just being a build-this-completed-project, or do you see this as an actual development tool? Because I've been approaching it more from the latter perspective. Zip file is a roadmap: look, all the files you need to compile are here, here, here, and here. So use them. Compile. But if the zip file is a complete project, then you would expect to see source code, test code, test data, licenses, documentation, etc, which would likely require filtering anyways and possibly multiple compiles for different pieces. And you'd expect the result of the compile to end up somewhere in the directory you just created. Alright, I think I'm seeing less and less value in foo.zip/foo as a req.For the more specific case dmdz [global flags] foo.zip [local flags] it expects all the relevant content in foo.zip to be located inside directory foo, and doesn't extract anything else unless you explicitly tell it to.I don't understand this. Does the program foo.zip have to contain an actual directory called "foo"? That's a bit restrictive. My initial plan revolved around expanding foo.zip somewhere in a unique subdir of the temp directory and considering that a full-blown project resides inside that subdir.Sounds good.Also, there can be a file 'cmd' (name?) inside foo.zip which contains additional flags for the compile, with local flags overriding global flags overriding flags found in cmd. At least for dmdz flags.How about dmd.conf?Well, heck. Maybe I'll see what I can do with it. Do you want it to conform to any interface in particular? Also: test whether a file [path?] 
is contained within a specific directory [path?]. Does such functionality exist somewhere in phobos?dmd flags get filtered out and forwarded to dmd. The current strategy for compiling just involves giving every compilable thing extracted to dmd. There's also an option to compile each source file separately (which I put in after hitting an odd Out of Memory Error). Comments?That sounds about right. One thing I want is to stay reasonably KISS (e.g. like rdmd is), i.e. not invent a lot of arcana. rdmd has many heuristics and limitations but has the virtue that it gets a specific job done without requiring its user to learn most anything. I hope dmdz turns out similarly simple.Also, are there any plans for std.zip, e.g. with regard to ranges, input/output streams, etc? The current api seems a smidge spartan.I've hoped to rewrite std.zip forever, but found no time to do so.Andrei
Mar 11 2010
Ellery Newcomer wrote:What I'd like to see is the creation of a library file interface, say: std.archive and then have implementations of it: std.archive.zip std.archive.tar std.archive.lha std.archive.7zip etc. Pass a file name to a factory method of std.archive, and it figures out what kind of archive it is, instantiates the appropriate implementation, etc.I've hoped to rewrite std.zip forever, but found no time to do so.Well, heck. Maybe I'll see what I can do with it. Do you want it to conform to any interface in particular?
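To make the proposal above concrete, here is a rough sketch of what such an interface and its format-sniffing factory might look like. Everything here is hypothetical (no std.archive exists in Phobos); only the magic-byte signatures are real:

```d
module std.archive; // hypothetical module name from the proposal above

// One abstract reader; each format (zip, tar, lha, 7zip, ...) would be a
// concrete class hidden behind the factory.
interface ArchiveReader
{
    string[] memberNames();                          // paths stored in the archive
    void extractMember(string name, string destDir); // decompress one member
}

// The factory's first job is to identify the format from the file's
// leading bytes. These signatures are real; the surrounding API is invented.
string detectArchiveFormat(const(ubyte)[] header)
{
    if (header.length >= 4 && header[0 .. 4] == cast(const(ubyte)[]) "PK\x03\x04")
        return "zip";   // zip local file header signature
    if (header.length >= 6 && header[0 .. 6] == cast(const(ubyte)[]) "7z\xBC\xAF\x27\x1C")
        return "7z";    // 7-Zip signature
    if (header.length >= 262 && header[257 .. 262] == cast(const(ubyte)[]) "ustar")
        return "tar";   // POSIX tar magic at offset 257
    return "unknown";
}
```

A factory method in std.archive would then read the first few hundred bytes of the named file, call something like detectArchiveFormat, and instantiate the matching ArchiveReader implementation.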
Mar 11 2010
"Ellery Newcomer" <ellery-newcomer utulsa.edu> wrote in message news:hnc4o3$2lms$1 digitalmars.com...I suppose the name isn't so important, but I really hate zip files whose contents aren't contained inside a single directory.This is a bit of a "vim vs emacs" or "static vs dynamic" sort of issue. Most of the archive programs I've used, including the one I currently use, put an "Extract to new directory" option into my file manager's right-click menu. I *always* use that, and consider it downright silly not to. But every once in a while I'll get an archive that follows the "nothing but one dir" convention, so I get a useless extra subfolder that I have to either delete or allow it to clutter up my filesystem, and that just irritates the hell out of me. Personally, I'm convinced that any archive program that doesn't allow you to automatically create a subfolder by default is a bad archive program. And I'm convinced that a convention that places restrictions on the top-level of a zip is, well, rediculous. But obviously there are people that disagree with me on that. So, I guess it's a "vim vs emacs" kind of thing. What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.
Mar 11 2010
Nick Sabalausky wrote:"Ellery Newcomer" <ellery-newcomer utulsa.edu> wrote in message news:hnc4o3$2lms$1 digitalmars.com...I don't really disagree, but it's not always that simple. Take tar, for instance, which has been around since forever, and which has a legacy you can't drop just like that. (I wonder if it's even part of the POSIX standard?) There are literally thousands of applications that depend on tar working in exactly the same way as it has always done, on all systems. And that way is to automatically extract all files into the current directory unless otherwise specified. As long as tar is the most common archive format on *NIX (and it is, by far), one must expect people to be true gentlemen and -women who put their files in a subdirectory inside the archive -- i.e. make tarballs and not tarbombs. :)I suppose the name isn't so important, but I really hate zip files whose contents aren't contained inside a single directory.This is a bit of a "vim vs emacs" or "static vs dynamic" sort of issue. Most of the archive programs I've used, including the one I currently use, put an "Extract to new directory" option into my file manager's right-click menu. I *always* use that, and consider it downright silly not to. But every once in a while I'll get an archive that follows the "nothing but one dir" convention, so I get a useless extra subfolder that I have to either delete or allow it to clutter up my filesystem, and that just irritates the hell out of me. Personally, I'm convinced that any archive program that doesn't allow you to automatically create a subfolder by default is a bad archive program. And I'm convinced that a convention that places restrictions on the top-level of a zip is, well, rediculous. But obviously there are people that disagree with me on that. 
So, I guess it's a "vim vs emacs" kind of thing.What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.If you do, let me know. I'd like that too. :) -Lars
Mar 12 2010
On 12/03/10 18:09, Nick Sabalausky wrote:"Ellery Newcomer"<ellery-newcomer utulsa.edu> wrote in message news:hnc4o3$2lms$1 digitalmars.com... What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.The right click 'extract here' under GNOME does *exactly* this.
Mar 12 2010
Bernard Helyer wrote:On 12/03/10 18:09, Nick Sabalausky wrote:Same under KDE: Dolphin right click 'extract here, autodetect subfolder' Perhaps Dolphin will also function under XP, last time I checked KDE was still a bit buggy under windows though."Ellery Newcomer"<ellery-newcomer utulsa.edu> wrote in message news:hnc4o3$2lms$1 digitalmars.com... What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.The right click 'extract here' under GNOME does *exactly* this.
Mar 12 2010
Lutger wrote:Bernard Helyer wrote:Yes, I love this feature.On 12/03/10 18:09, Nick Sabalausky wrote:Same under KDE: Dolphin right click 'extract here, autodetect subfolder'"Ellery Newcomer"<ellery-newcomer utulsa.edu> wrote in message news:hnc4o3$2lms$1 digitalmars.com... What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.The right click 'extract here' under GNOME does *exactly* this.Perhaps Dolphin will also function under XP, last time I checked KDE was still a bit buggy under windows though.
Mar 13 2010
On Thu, Mar 11, 2010 at 9:09 PM, Nick Sabalausky <a a.a> wrote:What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.WinRAR has an option for that if the zip file and the single folder inside are named the same thing. So if Foo.zip contains just a top level folder called Foo, then it just extracts Foo. Otherwise it makes a "Foo" folder and puts the contents of Foo.zip into that. --bb
Mar 12 2010
On 03/11/2010 11:09 PM, Nick Sabalausky wrote:"Ellery Newcomer"<ellery-newcomer utulsa.edu> wrote in message news:hnc4o3$2lms$1 digitalmars.com...I rarely come across a zip file that doesn't follow that convention, and I never extract to new directory, but I do always check the contents of the zip file manually.I suppose the name isn't so important, but I really hate zip files whose contents aren't contained inside a single directory.This is a bit of a "vim vs emacs" or "static vs dynamic" sort of issue. Most of the archive programs I've used, including the one I currently use, put an "Extract to new directory" option into my file manager's right-click menu. I *always* use that, and consider it downright silly not to. But every once in a while I'll get an archive that follows the "nothing but one dir" convention, so I get a useless extra subfolder that I have to either delete or allow it to clutter up my filesystem, and that just irritates the hell out of me.Personally, I'm convinced that any archive program that doesn't allow you to automatically create a subfolder by default is a bad archive program. And I'm convinced that a convention that places restrictions on the top-level of a zip is, well, rediculous. But obviously there are people that disagree with me on that. So, I guess it's a "vim vs emacs" kind of thing. What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.Yeah, I'm thinking I'm going to do that with dmdz
Mar 12 2010
Ellery Newcomer wrote:So I'm toying with a prototype, which is proving nice enough, but there be a few things that I'm not quite sure which way to go with.How about: dmdz ...stuff... foo.zip ...morestuff... being semantically identical to: dmdz ...stuff... (expanded contents of foo.zip) ...morestuff... In other words, it works just like wildcard expansion: dmd ...stuff... *.d ...morestuff... Just think of foo.zip as a macro that expands to a list of the files that are the contents of foo.zip (while ignoring files that are not usable as input to dmd). The neato thing is that, for a user, there's nothing to learn about using dmdz.
Mar 11 2010
Ellery Newcomer wrote:So I'm toying with a prototype, which is proving nice enough, but there be a few things that I'm not quite sure which way to go with.Cool! Looking forward to using it. :) But can we please call it zdmd, so there is some consistency with rdmd? -Lars
Mar 12 2010
On 03/12/2010 06:15 AM, Lars T. Kyllingstad wrote:Ellery Newcomer wrote:I have no idea why it's called dmdz and not zdmd. My guess is so you can have rdmdz.So I'm toying with a prototype, which is proving nice enough, but there be a few things that I'm not quite sure which way to go with.Cool! Looking forward to using it. :) But can we please call it zdmd, so there is some consistency with rdmd? -Lars
Mar 12 2010
Hello. I've run into a problem.

dmd foo/bar/bizz.d

bizz.d:
module bar.bizz;
...

dmd thinks it's looking at module foo.bar.bizz and generally gets confused unless supplied with -Ifoo. As a user, I'm not manually specifying that -Ifoo. So I need some bare-bones lexing capabilities. I have an ANTLR lexer grammar, which will do fine, unless the module name contains unicode characters. Any other suggestions?
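For what it's worth, once the module declaration has been lexed out of the file, the -I directory can be deduced with a little path arithmetic. A sketch of that step (the helper name is invented; it uses the old std.path.sep separator and assumes the declared module name is already in hand):

```d
import std.algorithm : endsWith;
import std.path : sep;              // directory separator, "/" or "\"
import std.string : join, split;

// Given a source path like "foo/bar/bizz.d" whose first declaration is
// "module bar.bizz;", return the directory to pass as -I ("foo").
// Returns null when the path doesn't end in the module's package path.
string deduceImportRoot(string sourcePath, string moduleName)
{
    auto tail = join(split(moduleName, "."), sep) ~ ".d"; // "bar/bizz.d"
    if (!endsWith(sourcePath, tail))
        return null;
    auto root = sourcePath[0 .. $ - tail.length];         // "foo/"
    if (endsWith(root, sep))
        root = root[0 .. $ - sep.length];                 // drop trailing sep
    return root;                                          // "" means cwd
}
```

dmdz could then pass "-I" ~ deduceImportRoot(path, modname) whenever the file's path and its module declaration disagree.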
Mar 15 2010
"Ellery Newcomer" <ellery-newcomer utulsa.edu> wrote in message news:hnmbkl$2rsj$1 digitalmars.com...Hello. I've run into a problem. dmd foo/bar/bizz.d bizz.d: module bar.bizz; ... dmd thinks it's looking at module foo.bar.bizz and generally gets confused unless supplied with -Ifoo. As a user, I'm not manually specifying that -Ifoo. So I need some bare-bones lexing capabilities. I have an ANTLR lexer grammar, which will do fine, unless the module name contains unicode characters. Any other suggestions?I'd just require a setting in dmd.conf for that.
Mar 15 2010
On 03/15/2010 10:04 PM, Nick Sabalausky wrote:"Ellery Newcomer"<ellery-newcomer utulsa.edu> wrote in message news:hnmbkl$2rsj$1 digitalmars.com...Of course it turns out to be a screwy zip file. Nevermind.. Is dmd.conf really a good name for that file? I'm of the opinion now that it isn't, since it isn't the same thing and it does confuse dmd when executed in the directory containing it. dmdz.conf?Hello. I've run into a problem. dmd foo/bar/bizz.d bizz.d: module bar.bizz; ... dmd thinks it's looking at module foo.bar.bizz and generally gets confused unless supplied with -Ifoo. As a user, I'm not manually specifying that -Ifoo. So I need some bare-bones lexing capabilities. I have an ANTLR lexer grammar, which will do fine, unless the module name contains unicode characters. Any other suggestions?I'd just require a setting in dmd.conf for that.
Mar 15 2010
Anyone want to play with dmdz, here it is: http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip Haven't tested it much, especially on windows. Don't know what it will do with multiple zip files. piecemeal flag doesn't know how to stop when you tell it to. dmd's run flag isn't handled correctly (I don't know how it's supposed to work). Does anyone know of a way to tell whether a command in bash or whatever segfaults? And I modified std.path.dirname and std.path.basename, so I just included them in dmdz.d. Otherwise, it should work okay. It can compile itself under 2.040.
Mar 16 2010
Ellery Newcomer wrote:Anyone want to play with dmdz, here it is: http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip Haven't tested it much, especially on windows. Don't know what it will do with multiple zip files. piecemeal flag doesn't know how to stop when you tell it to. dmd's run flag isn't handled correctly (I don't know how it's supposed to work). Does anyone know of a way to tell whether a command in bash or whatever segfaults?You might like TRAP: http://www.davidpashley.com/articles/writing-robust-shell-scripts.html
Mar 16 2010
On 16/03/10 22:55, Ellery Newcomer wrote:Anyone want to play with dmdz, here it is: http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip Haven't tested it much, especially on windows. Don't know what it will do with multiple zip files. piecemeal flag doesn't know how to stop when you tell it to. dmd's run flag isn't handled correctly (I don't know how it's supposed to work). Does anyone know of a way to tell whether a command in bash or whatever segfaults?

$SOMECMD
if [ $? -eq 139 ]; then
    echo "Segfault: $SOMECMD"
fi

$SOMECMD
if [ $? -ge 1 ]; then
    echo Error
fi

And I modified std.path.dirname and std.path.basename, so I just included them in dmdz.d. Otherwise, it should work okay. It can compile itself under 2.040.
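(The 139 here is the shell convention for a child killed by a signal: 128 plus the signal number, and SIGSEGV is signal 11, hence 128 + 11 = 139. A quick way to see it, assuming a POSIX shell on a Unix-like system:)

```shell
# Run a child that delivers SIGSEGV to itself, then inspect $?.
sh -c 'kill -SEGV $$'
status=$?
echo "exit status: $status"   # 128 + 11 = 139
if [ "$status" -eq 139 ]; then
    echo "child segfaulted"
fi
```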
Mar 16 2010
On 03/16/2010 05:55 PM, Ellery Newcomer wrote:Anyone want to play with dmdz, here it is: http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip Haven't tested it much, especially on windows. Don't know what it will do with multiple zip files. piecemeal flag doesn't know how to stop when you tell it to. dmd's run flag isn't handled correctly (I don't know how it's supposed to work). Does anyone know of a way to tell whether a command in bash or whatever segfaults? And I modified std.path.dirname and std.path.basename, so I just included them in dmdz.d. Otherwise, it should work okay. It can compile itself under 2.040.This is solid work, but I absolutely refuse to believe the solution must be as complicated as this. Recall that the baseline is a 30-line script. I can't bring myself to believe that a four-module, over-a-thousand-line solution justifies the added complexity. Besides, what happened to std.getopt? You don't need to recognize dmd's options any more than rdmd does. rdmd dedicates only a few lines to argument parsing; dmdz makes it a science. Don't take this the wrong way, the work is absolutely a tour de force. I'm just saying that things could be dramatically simpler with just a little loss of features. I'm looking over the code and am puzzled about the kind of firepower that seems to be necessary for achieving the task. Recall what's needed: someone who is able and willing would like to distribute a multi-module solution as a zip file. dmdz must provide a means to do so. Simple as that. The "able and willing" part is important - you don't need to cope with arbitrarily-formatted archives, you can impose on people how the zip must be formatted. If you ask them to provide a file called "main.d" in the root of the zip, then so be it if it reduces the size of dmdz by a factor of ten. Andrei
Mar 16 2010
On 03/16/2010 08:13 PM, Andrei Alexandrescu wrote:This is solid work, but I absolutely refuse to believe the solution must be as complicated as this. Recall that the baseline is a 30-lines script. I can't bring myself to believe that a four-modules, over thousand lines solution justifies the added complexity.I count 2 modules and about 800 loc, 200 to 300 of which implement functionality which doesn't exist in std.path but should. The ANTLR crap could be replaced by a hundred lines of handwritten code, but the grammar already existed and took less time.Besides, what happened to std.getopt? You don't need to recognize dmd's options any more than rdmd does. rdmd dedicates only a few lines to argument parsing, dmdz makes it a science.It started when I said, "huh. when is this thing building an executable, and when is it building a library?", and parsing dmd's options seemed like the most generally useful way of finding that out. I rather like the way it's turned out. eg during development:

$ dmdz dxl.zip -unittest
...
$ ./dxl/bin/dxl
...
"alright, unittests pass"
$ dmdz dxl.zip
...
"now for the release executable"

fwiw, I've never used rdmd due to bug 3860.Don't take this the wrong way, the work is absolutely a tour de force. I'm just saying that things could be dramatically simpler with just a little loss of features. I'm looking over the code and am puzzled about the kind of gunpower that seems to be necessary for achieving the task.Huh. When all you have is a harquebus ..Recall what's needed: someone who is able and willing would like to distribute a multi-module solution as a zip file. dmdz must provide a means to do so. Simple as that. The "able and willing" part is important - you don't need to cope with arbitrarily-formatted archives, you can impose people how the zip must be formatted. If you ask for them to provide a file called "main.d" in the root of the zip, then so be it if it reduces the size of dmdz by a factor of ten. 
Andrei

By restricting the format of the zip file a bit and moving the directory dmd gets run in, I might save 100 loc. Maybe. Does adding main.d to root help with the run flag? It doesn't do anything for dmdz that I can see. By introducing path2list et al into std.path or wherever (really, it is quite handy) and fixing basename and dirname, I could save 200-300 loc. By removing piecemeal and getting rid of dmd flags, I could cut 200-300 loc plus the ANTLR modules. Except I find both of those features occasionally useful. Given the choice, I'd keep them.
Mar 17 2010
On 03/17/2010 03:01 PM, Ellery Newcomer wrote:On 03/16/2010 08:13 PM, Andrei Alexandrescu wrote:Thanks for replying to this. I'd been afraid that I was coming off too critical. (I counted the ANTLR files as modules, and I think that's fair.) To give you an idea on where I come from, distributing dmdz with dmd is also a message to users on how things are getting done in D. For the problem "Compile a D file and all its dependents, link, and run" the solution rdmd has 469 lines. It seems like quite a lot to me, but I couldn't find ways to make it much smaller. For the problem "Given a zip file containing a D program, build it" the dmdz solution is quite large. If we count everything:

$ wc --lines dmdz.d import/antlrrt/*.d lexd.g opts.d sed.sh
  782 dmdz.d
  891 import/antlrrt/collections.d
  551 import/antlrrt/exceptions.d
 1253 import/antlrrt/lexing.d
 2085 import/antlrrt/parsing.d
   10 import/antlrrt/runtime.d
  600 import/antlrrt/utils.d
  436 lexd.g
   88 opts.d
   13 sed.sh
 6709 total

Arguably we can discount the import stuff, although I'd already raise some objections:

$ wc --lines dmdz.d lexd.g opts.d sed.sh
  782 dmdz.d
  436 lexd.g
   88 opts.d
   13 sed.sh
 1319 total

That would suggest that it's about three times as difficult to build stuff present in a zip file as to deduce dependencies and build stuff not in a zip file. I find that difficult to swallow because to me building stuff in a zip file should be in some ways easier because there are no dependencies to deduce - they can be assumed to be in the zip file. I looked more through the program and it looks like it uses the zip library (honestly I would have used system("unzip...")), which does add some aggravation for arguably a good reason. (But I also see there's no caching, which is an important requirement.) In my mind it was all about check cache, unzip, and build. True there are details such as lib vs.
executable that can be messy, but I don't think anything could blow complexity up too hard.

This is solid work, but I absolutely refuse to believe the solution must be as complicated as this. Recall that the baseline is a 30-line script. I can't bring myself to believe that a four-module, over-a-thousand-line solution justifies the added complexity.

I count 2 modules and about 800 loc, 2 to 300 of which implement functionality which doesn't exist in std.path but should. The ANTLR crap could be replaced by a hundred lines of handwritten code, but the grammar already existed and took less time.

Nice, but I don't know why you need to understand dmd's flags instead of simply forwarding them to dmd. You could define dmdz-specific flags which you parse and understand, and then dump everything else to dmd, which will figure its own checking and error messages and all that.

Besides, what happened to std.getopt? You don't need to recognize dmd's options any more than rdmd does. rdmd dedicates only a few lines to argument parsing; dmdz makes it a science.

It started when I said, "huh. when is this thing building an executable, and when is it building a library?", and parsing dmd's options seemed like the most generally useful way of finding that out. I rather like the way it's turned out. eg during development:

$ dmdz dxl.zip -unittest
> ...
$ ./dxl/bin/dxl
> ...
"alright, unittests pass"
$ dmdz dxl.zip
> ...
"now for the release executable"

fwiw, I've never used rdmd due to bug 3860.

I didn't mean you to use it as much as look through it for examples of patterns that may be useful to dmdz (such as the one above).

Hehe :o). Well definitely you need to submit your stdlib additions to e.g. bugzilla.

Don't take this the wrong way, the work is absolutely a tour de force. I'm just saying that things could be dramatically simpler with just a little loss of features. I'm looking over the code and am puzzled about the kind of gunpowder that seems to be necessary for achieving the task.

Huh.
When all you have is a harquebus...

I think it would be great to remove all stuff that's not necessary. I paste at the end of this message my two baselines: a shell script and a D program. They compare poorly with your program, but are extremely simple. I think it may be useful to see how much impact each feature that these programs lack is adding to the size of your solution.

Andrei

EXTENSIONS=(d di a o)
ZIP=$1
TGT=/tmp/$ZIP
BIN=${ZIP/.zip/}

if [[ ! -f $ZIP ]]; then
    echo "Zip file missing: \`$ZIP'" >&2
    echo "Usage: dmdz file.zip" >&2
    exit 1
fi

if [[ ! -d $TGT ]] || [[ $ZIP -nt $TGT ]]; then
    mkdir --parents $TGT
    unzip $ZIP -d $TGT >/dev/null
fi

FIND="find . -type f -false "
for EXT in "${EXTENSIONS[@]}"; do
    FIND="$FIND -or -iname '*.$EXT'"
done

(cd $TGT && dmd -of$BIN `eval $FIND`)

import std.date, std.file, std.process, std.stdio, std.string;

// Accepted extensions
auto extensions = [ "d", "di", "a", "o" ];

int main(string[] args)
{
    // The one and only parameter is the zip file
    auto zip = args[1];
    if (!exists(zip))
    {
        stderr.writeln("Zip file missing: `", zip, "'");
        stderr.writeln("Usage: dmdz file.zip");
        return 1;
    }
    // Target directory
    auto tgt = "/tmp/" ~ zip;
    // Binary result is the name of the zip without the .zip
    auto bin = replace(zip, ".zip", "");
    // Was the zip file already extracted? If not, extract it
    if (lastModified(zip) >= lastModified(tgt, d_time.min))
    {
        system("mkdir --parents " ~ tgt);
        system("unzip " ~ zip ~ " -d " ~ tgt ~ " >/dev/null");
    }
    // Compile all files with accepted extensions
    auto find = "find . -type f -false ";
    foreach (ext; extensions)
    {
        find ~= " -or -iname '*." ~ ext ~ "'";
    }
    return system("cd " ~ tgt ~ " && dmd -of" ~ bin ~ " `eval " ~ find ~ "`");
}

Recall what's needed: someone who is able and willing would like to distribute a multi-module solution as a zip file. dmdz must provide a means to do so. Simple as that. The "able and willing" part is important - you don't need to cope with arbitrarily-formatted archives; you can impose on people how the zip must be formatted.
If you ask them to provide a file called "main.d" in the root of the zip, then so be it if it reduces the size of dmdz by a factor of ten.

Andrei

By restricting the format of the zip file a bit and moving the directory dmd gets run in, I might save 100 loc. Maybe. Does adding main.d to root help with the run flag? It doesn't do anything for dmdz that I can see. By introducing path2list et al into std.path or wherever (really, it is quite handy) and fixing basename and dirname, I could save 2 - 300 loc. By removing piecemeal and getting rid of dmd flags, I could cut 2 - 300 loc plus the ANTLR modules. Except I find both of those features occasionally useful. Given the choice, I'd keep them.
Mar 17 2010
On 03/17/2010 03:53 PM, Andrei Alexandrescu wrote:

Thanks for replying to this. I'd been afraid that I was coming off too critical. (I counted the ANTLR files as modules, and I think that's fair.) To give you an idea on where I come from, distributing dmdz with dmd is also a message to users on how things are getting done in D.

dang right you are. If you're going to count the antlr runtime, then maybe you should also be counting druntime and the sections of phobos that I used?

For the problem "Compile a D file and all its dependents, link, and run" the solution rdmd has 469 lines. It seems quite much to me, but I couldn't find ways to make it much smaller.

user wouldn't know that from any dmd distribution I've ever seen.

For the problem "Given a zip file containing a D program, build it" the dmdz solution is quite large. If we count everything:

$ wc --lines dmdz.d import/antlrrt/*.d lexd.g opts.d sed.sh
  782 dmdz.d
  891 import/antlrrt/collections.d
  551 import/antlrrt/exceptions.d
 1253 import/antlrrt/lexing.d
 2085 import/antlrrt/parsing.d
   10 import/antlrrt/runtime.d
  600 import/antlrrt/utils.d
  436 lexd.g
   88 opts.d
   13 sed.sh
 6709 total

forgot generated/*.d

should bump it up to 11 or 12 k.

Arguably we can discount the import stuff, although I'd already raise some objections:

$ wc --lines dmdz.d lexd.g opts.d sed.sh
  782 dmdz.d
  436 lexd.g
   88 opts.d
   13 sed.sh
 1319 total

lexd.g and sed.sh are only there for reference. I hate it when machine generated source code is in a project, but the source grammar isn't.

That would suggest that it's about three times as difficult to build stuff present in a zip file than to deduce dependencies and build stuff not in a zip file. I find that difficult to swallow because to me building stuff in a zip file should be in some ways easier because there are no dependencies to deduce - they can be assumed to be in the zip file.
I looked more through the program and it looks like it uses the zip library (honestly I would have used system("unzip...")), which does add some aggravation for arguably a good reason. (But I also see there's no caching, which is an important requirement.)

eh?

Nice, but I don't know why you need to understand dmd's flags instead of simply forwarding them to dmd. You could define dmdz-specific flags which you parse and understand, and then dump everything else to dmd, which will figure its own checking and error messages and all that.

filtering out flags that screw things up for the build in question; knowing where the resultant executable is supposed to be;

I think it would be great to remove all stuff that's not necessary. I paste at the end of this message my two baselines: a shell script and a D program. They compare poorly with your program, but are extremely simple. I think it may be useful to see how much impact each feature that these programs lack is adding to the size of your solution.

Andrei

You come at this problem like "It should be an eloquent showcase of what D has to offer." I come at it like "I want this to be generally useful. To me."

In my opinion, how well it works trumps how many lines of code it took to write. But for the aforementioned bug, I never would have looked at rdmd's source, and even then I didn't notice how many lines of code it was. The way dmdz was written is based on the needs that presented themselves to me at the time. So far I've run it against three different projects and I'm happy with the way it's turned out.

1. dmdz toy example. not much here.

2. dexcelapi port of jexcelapi, 90k loc (that thing must have shrunk when I wasn't looking, I was sure it was 200k), ~ 400 source files. Big. Dumping everything to dmd is easy enough to implement one way or another, but when I hit an Out of Memory Error I need what -piecemeal has to offer.
I found the offending file (still don't know what's up with it), commented it out, and I can dump everything to dmd again. Without it, I probably would have given up on D for another year and a half.

3. dcrypt

Today, I wanted to play with it, so I checked it out, popped dmdz.conf and a main.d in the directory and zipped the whole thing up.

dmdz dcrypt.zip

It worked. Without me doing anything to dmdz or dcrypt (except adding a string alias, &&^%^ tango).

I was kind of hoping others would try it and give their opinions, but apparently nobody else cares. Or they're on vacation, like I should be. Or they're giving the infamous 'silent approval'. Who knows.
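For illustration, the -piecemeal behavior described here - compile each module separately so dmd never holds more than one in memory, then link the objects - is easy to sketch as a command plan. This is a hypothetical helper in Python, not dmdz's actual code, and the flag layout is only an assumption:

```python
import os

def piecemeal_commands(sources, out="app"):
    """Sketch of a piecemeal build plan: one 'dmd -c' per source file,
    then a single link step over the resulting object files."""
    compile_steps = [["dmd", "-c", src] for src in sources]
    objects = [os.path.splitext(src)[0] + ".o" for src in sources]
    link_step = ["dmd", "-of" + out] + objects
    return compile_steps + [link_step]
```

Each returned list would be handed to the process spawner in turn; a failing file (like the dexcelapi one above) then shows up as a single failing step rather than one monolithic dmd invocation running out of memory.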
Mar 17 2010
On 03/17/2010 08:17 PM, Ellery Newcomer wrote:

On 03/17/2010 03:53 PM, Andrei Alexandrescu wrote:

I meant the antlr grammar for the task. I gave two counts, one excluding the antlr runtime, and based the rest of my discussion on that. I sadly note the irony. There is no need to get defensive, really.

Thanks for replying to this. I'd been afraid that I was coming off too critical. (I counted the ANTLR files as modules, and I think that's fair.) To give you an idea on where I come from, distributing dmdz with dmd is also a message to users on how things are getting done in D.

dang right you are. If you're going to count the antlr runtime, then maybe you should also be counting druntime and the sections of phobos that I used?

Well that's generated. I counted what's needed to get things going. Unless you meant that ironically...

For the problem "Compile a D file and all its dependents, link, and run" the solution rdmd has 469 lines. It seems quite much to me, but I couldn't find ways to make it much smaller.

user wouldn't know that from any dmd distribution I've ever seen.

For the problem "Given a zip file containing a D program, build it" the dmdz solution is quite large. If we count everything:

$ wc --lines dmdz.d import/antlrrt/*.d lexd.g opts.d sed.sh
  782 dmdz.d
  891 import/antlrrt/collections.d
  551 import/antlrrt/exceptions.d
 1253 import/antlrrt/lexing.d
 2085 import/antlrrt/parsing.d
   10 import/antlrrt/runtime.d
  600 import/antlrrt/utils.d
  436 lexd.g
   88 opts.d
   13 sed.sh
 6709 total

forgot generated/*.d

should bump it up to 11 or 12 k.

My understanding is that lexd.g is your code so it should be included in the size of the solution, whereas the generated code should not.

Arguably we can discount the import stuff, although I'd already raise some objections:

$ wc --lines dmdz.d lexd.g opts.d sed.sh
  782 dmdz.d
  436 lexd.g
   88 opts.d
   13 sed.sh
 1319 total

lexd.g and sed.sh are only there for reference.
I hate it when machine generated source code is in a project, but the source grammar isn't.

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

That would suggest that it's about three times as difficult to build stuff present in a zip file than to deduce dependencies and build stuff not in a zip file. I find that difficult to swallow because to me building stuff in a zip file should be in some ways easier because there are no dependencies to deduce - they can be assumed to be in the zip file. I looked more through the program and it looks like it uses the zip library (honestly I would have used system("unzip...")), which does add some aggravation for arguably a good reason. (But I also see there's no caching, which is an important requirement.)

eh?

The tool shouldn't be a showcase. Obviously the primary purpose is for the tool to be useful. The shell script and the D script are useful. I am sure your tool is useful, but I think it doesn't hit the right balance. I simply don't think it takes that much code to achieve what the tool needs to achieve.

Nice, but I don't know why you need to understand dmd's flags instead of simply forwarding them to dmd. You could define dmdz-specific flags which you parse and understand, and then dump everything else to dmd, which will figure its own checking and error messages and all that.

filtering out flags that screw things up for the build in question; knowing where the resultant executable is supposed to be;

I think it would be great to remove all stuff that's not necessary. I paste at the end of this message my two baselines: a shell script and a D program. They compare poorly with your program, but are extremely simple. I think it may be useful to see how much impact each feature that these programs lack is adding to the size of your solution.

Andrei

You come at this problem like "It should be an eloquent showcase of what D has to offer."
I come at it like "I want this to be generally useful. To me."

In my opinion, how well it works trumps how many lines of code it took to write. But for the aforementioned bug, I never would have looked at rdmd's source, and even then I didn't notice how many lines of code it was. The way dmdz was written is based on the needs that presented themselves to me at the time. So far I've run it against three different projects and I'm happy with the way it's turned out.

1. dmdz toy example. not much here.

2. dexcelapi port of jexcelapi, 90k loc (that thing must have shrunk when I wasn't looking, I was sure it was 200k), ~ 400 source files. Big. Dumping everything to dmd is easy enough to implement one way or another, but when I hit an Out of Memory Error I need what -piecemeal has to offer. I found the offending file (still don't know what's up with it), commented it out, and I can dump everything to dmd again. Without it, I probably would have given up on D for another year and a half.

3. dcrypt

Today, I wanted to play with it, so I checked it out, popped dmdz.conf and a main.d in the directory and zipped the whole thing up.

dmdz dcrypt.zip

It worked. Without me doing anything to dmdz or dcrypt (except adding a string alias, &&^%^ tango).

I'm not contending the tool is not useful. I'm just saying it is too big for what it does, and that that does matter with regard to distributing it with dmd.

I was kind of hoping others would try it and give their opinions, but apparently nobody else cares. Or they're on vacation, like I should be. Or they're giving the infamous 'silent approval'. Who knows.

It looks like we're getting into a little diatribe, which is very sad because you've clearly done a good amount of work and I didn't intend to make it look any other way. All I can say is that the tool is very far removed from what I think it should look like; for my money, the moment it gets larger than one simple module it would mean I took a few wrong turns along the way.
BTW Walter made a very nice suggestion: make a .zip file in the command line be equivalent to listing all files in that zip in the command line. I think it's this kind of idea that greatly simplifies things. Andrei
Mar 17 2010
Hello Andrei,

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the "<(cmd)" syntax to do it) so you could work with that.

-- 
... <IXOYE><
Mar 17 2010
On 03/17/2010 10:30 PM, BCS wrote:

Hello Andrei,

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the "<(cmd)" syntax to do it) so you could work with that.

That works on zsh, I'm not sure whether it works with other shells. Also, dmd refuses to compile such streams because they don't end in .d. The file must be written to the file system, so caching would always help.

Andrei
Mar 18 2010
BCS wrote:

Hello Andrei,

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the "<(cmd)" syntax to do it) so you could work with that.

I'd argue that for this case, caching the extracted files is not worth the effort, complexity, or speed. If you're in an edit/compile/debug loop, I can't see working off of a zip file of the sources.
Mar 18 2010
On 03/18/2010 02:36 PM, Walter Bright wrote:

BCS wrote:

Of course not, but the typical scenario is to just run a program off its .zip file every so often. In that case, extraction makes for an unpleasant latency. FWIW, for rdmd caching makes a big, big difference.

Andrei

Hello Andrei,

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the "<(cmd)" syntax to do it) so you could work with that.

I'd argue that for this case, caching the extracted files is not worth the effort, complexity, or speed. If you're in an edit/compile/debug loop, I can't see working off of a zip file of the sources.
Mar 18 2010
Andrei Alexandrescu wrote:

FWIW, for rdmd caching makes a big, big difference.

Caching the executable, sure, but I'm not sure that translates into a case for caching the intermediate files (i.e. the extracted source).
Mar 18 2010
On 03/18/2010 04:14 PM, Walter Bright wrote:

Andrei Alexandrescu wrote:

I see. It should be fine to cache the exe and regenerate only if the archive is newer.

Andrei

FWIW, for rdmd caching makes a big, big difference.

Caching the executable, sure, but I'm not sure that translates into a case for caching the intermediate files (i.e. the extracted source).
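The rule the two converge on - cache the executable and rebuild only when the archive is newer - comes down to one timestamp comparison. A sketch for illustration (Python; the helper name is hypothetical):

```python
import os

def needs_rebuild(archive, exe):
    """Rebuild when the exe is missing or the archive is at least as new."""
    if not os.path.exists(exe):
        return True
    return os.path.getmtime(archive) >= os.path.getmtime(exe)
```

Using >= rather than > errs on the side of rebuilding when the two timestamps are equal, which matters on filesystems with coarse mtime resolution.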
Mar 18 2010
Hello Walter,

BCS wrote:

The only case I can think of where putting a zip file in the middle of that loop is even remotely reasonable would be for a remote build farm. The other use cases for build-from-zip are building someone else's code where you aren't editing the parts in the zip file.

-- 
... <IXOYE><

Hello Andrei,

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the "<(cmd)" syntax to do it) so you could work with that.

I'd argue that for this case, caching the extracted files is not worth the effort, complexity, or speed. If you're in an edit/compile/debug loop, I can't see working off of a zip file of the sources.
Mar 18 2010
BCS wrote:

The only case I can think of where putting a zip file in the middle of that loop is even remotely reasonable would be for a remote build farm. The other use cases for build-from-zip are building someone else's code where you aren't editing the parts in the zip file.

It might even be practical to have dmdz compile from a zip file specified by a URL! That would be cool.
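The compile-from-URL idea combines naturally with caching: fetch the zip once, keyed off the URL, and reuse the local copy on later runs. A sketch of the fetch side (Python for illustration; the helper and its cache naming scheme are hypothetical, not anything dmdz does):

```python
import hashlib
import os
import tempfile
import urllib.request

def fetch_zip(url, cache_dir=None):
    """Download a zip to a cache path derived from its URL,
    skipping the download when a cached copy already exists."""
    cache_dir = cache_dir or tempfile.gettempdir()
    name = hashlib.sha1(url.encode()).hexdigest() + ".zip"
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path
```

After the fetch, the local path can be handed to the ordinary zip-building logic; a real version would also want to honor the server's modification date instead of caching forever.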
Mar 18 2010
On 03/18/2010 04:15 PM, Walter Bright wrote:

BCS wrote:

In that case I do think caching would be helpful :o).

Andrei

The only case I can think of where putting a zip file in the middle of that loop is even remotely reasonable would be for a remote build farm. The other use cases for build-from-zip are building someone else's code where you aren't editing the parts in the zip file.

It might even be practical to have dmdz compile from a zip file specified by a URL! That would be cool.
Mar 18 2010
Walter Bright wrote:

BCS wrote:

Just like dsss did... (and still does for D1 I guess) I like dmdz and rdmd, but it's a pity dsss isn't revived yet. I still really miss it, always thought it would become the ruby gems / CPAN of D.

The only case I can think of where putting a zip file in the middle of that loop is even remotely reasonable would be for a remote build farm. The other use cases for build-from-zip are building someone else's code where you aren't editing the parts in the zip file.

It might even be practical to have dmdz compile from a zip file specified by a URL! That would be cool.
Mar 18 2010
Lutger wrote:

Just like dsss did... (and still does for D1 I guess) I like dmdz and rdmd, but it's a pity dsss isn't revived yet. I still really miss it, always thought it would become the ruby gems / CPAN of D.

Anyone can revive it if they're motivated to!
Mar 18 2010
On 03/17/2010 08:49 PM, Andrei Alexandrescu wrote:

Well that's generated. I counted what's needed to get things going. Unless you meant that ironically...

Yes I was speaking in jest up to this point.

Yeah, you're right there.

should bump it up to 11 or 12 k.

My understanding is that lexd.g is your code so it should be included in the size of the solution, whereas the generated code should not.

Arguably we can discount the import stuff, although I'd already raise some objections:

$ wc --lines dmdz.d lexd.g opts.d sed.sh
  782 dmdz.d
  436 lexd.g
   88 opts.d
   13 sed.sh
 1319 total

lexd.g and sed.sh are only there for reference. I hate it when machine generated source code is in a project, but the source grammar isn't.

It does that, but on a per-file basis.

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

That would suggest that it's about three times as difficult to build stuff present in a zip file than to deduce dependencies and build stuff not in a zip file. I find that difficult to swallow because to me building stuff in a zip file should be in some ways easier because there are no dependencies to deduce - they can be assumed to be in the zip file. I looked more through the program and it looks like it uses the zip library (honestly I would have used system("unzip...")), which does add some aggravation for arguably a good reason. (But I also see there's no caching, which is an important requirement.)

eh?

All right. I'll try cutting things out and see where I end up.

The tool shouldn't be a showcase. Obviously the primary purpose is for the tool to be useful. The shell script and the D script are useful. I am sure your tool is useful, but I think it doesn't hit the right balance. I simply don't think it takes that much code to achieve what the tool needs to achieve.

Nice, but I don't know why you need to understand dmd's flags instead of simply forwarding them to dmd.
You could define dmdz-specific flags which you parse and understand, and then dump everything else to dmd, which will figure its own checking and error messages and all that.

filtering out flags that screw things up for the build in question; knowing where the resultant executable is supposed to be;

I think it would be great to remove all stuff that's not necessary. I paste at the end of this message my two baselines: a shell script and a D program. They compare poorly with your program, but are extremely simple. I think it may be useful to see how much impact each feature that these programs lack is adding to the size of your solution.

Andrei

You come at this problem like "It should be an eloquent showcase of what D has to offer." I come at it like "I want this to be generally useful. To me."

I'm not contending the tool is not useful. I'm just saying it is too big for what it does, and that that does matter with regard to distributing it with dmd.

I still don't see why (other than lexd.g adds ~ 10k loc just to get the line 'module foo.bar;' out of a source file)

BTW Walter made a very nice suggestion: make a .zip file in the command line be equivalent to listing all files in that zip in the command line. I think it's this kind of idea that greatly simplifies things.

Andrei

Fair enough.
Mar 18 2010
On 18/03/10 16:28, Ellery Newcomer wrote:

I still don't see why (other than lexd.g adds ~ 10k loc just to get the line 'module foo.bar;' out of a source file)

That seems like a tad too much for it... Surely it would only take a few (here meaning far less than 10k) lines to parse away comments/whitespace at the start of the file then read the module declaration if there is one?
Mar 18 2010
On 03/18/2010 11:36 AM, Robert Clipsham wrote:

On 18/03/10 16:28, Ellery Newcomer wrote:

Sure. I could write it in 100 loc. My concern is they would be a buggy 100 loc that would take a good deal of effort to get right. lexd.g already existed and has been pretty heavily tested.

I still don't see why (other than lexd.g adds ~ 10k loc just to get the line 'module foo.bar;' out of a source file)

That seems like a tad too much for it... Surely it would only take a few (here meaning far less than 10k) lines to parse away comments/whitespace at the start of the file then read the module declaration if there is one?
Mar 18 2010
On 03/18/2010 11:48 AM, Ellery Newcomer wrote:

On 03/18/2010 11:36 AM, Robert Clipsham wrote:

You could write it in 5 loc.

Andrei

On 18/03/10 16:28, Ellery Newcomer wrote:

Sure. I could write it in 100 loc. My concern is they would be a buggy 100 loc that would take a good deal of effort to get right. lexd.g already existed and has been pretty heavily tested.

I still don't see why (other than lexd.g adds ~ 10k loc just to get the line 'module foo.bar;' out of a source file)

That seems like a tad too much for it... Surely it would only take a few (here meaning far less than 10k) lines to parse away comments/whitespace at the start of the file then read the module declaration if there is one?
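A scanner of roughly the size Robert suggests does cover the job: skip an optional shebang line, whitespace, and D's three comment forms, then read the module declaration. For illustration, a sketch in Python rather than D (hypothetical helper, deliberately handling nothing beyond those cases - in particular no string literals, which D doesn't allow before 'module' anyway):

```python
import re

def module_name(src):
    """Return the name in a leading 'module a.b.c;' declaration, or None.
    Skips a shebang line, whitespace, and //, /* */ and nesting /+ +/
    comments -- the only things allowed before the declaration."""
    i, n = 0, len(src)
    if src.startswith("#!"):                      # script line
        j = src.find("\n")
        i = n if j < 0 else j + 1
    while i < n:
        if src[i].isspace():
            i += 1
        elif src.startswith("//", i):             # line comment
            j = src.find("\n", i)
            i = n if j < 0 else j + 1
        elif src.startswith("/*", i):             # block comment
            j = src.find("*/", i + 2)
            i = n if j < 0 else j + 2
        elif src.startswith("/+", i):             # nesting comment
            depth, i = 1, i + 2
            while i < n and depth:
                if src.startswith("/+", i):
                    depth, i = depth + 1, i + 2
                elif src.startswith("+/", i):
                    depth, i = depth - 1, i + 2
                else:
                    i += 1
        else:
            break
    m = re.match(r"module\s+([\w.]+)\s*;", src[i:])
    return m.group(1) if m else None
```

Not five lines, but nowhere near a generated lexer either, and the nesting-comment case is the only part that needs any care.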
Mar 18 2010
On 03/18/2010 11:28 AM, Ellery Newcomer wrote:

On 03/17/2010 08:49 PM, Andrei Alexandrescu wrote:

My bad for not being able to see that in the code. I read through and also searched for "cache", "date", "time"... couldn't find it. I now find it by looking for "last".

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

It does that, but on a per-file basis.

If a casual user downloads the dmd distro and says, hey, let me see how this rdmd tool is implemented, I wouldn't be afraid. If they take a look at dmdz, they may be daunted.

The example you gave is perfect. Right now rdmd runs dmd -v to figure out dependencies, but before it was parsing the file for lines that begin with "import". That was problematic, so I'm glad I now use the compiler. Your task is much simpler - nothing is allowed before the module line aside from the shebang line and comments, and you should feel free to restrict modules to e.g. not include recursive comments or anything that aggravates your job. So, I'm very glad you mentioned it: 10K of code to detect "module" is absolute overkill. I now confess that I couldn't figure out why you needed the lexer for dmdz and didn't have the time to sift through the code and figure that out. I thought there must be some solid reason, and so I was ashamed to even ask. I did know you want to find "module", but in my naivete, I wasn't thinking that just that would ever inspire you to include a lexer.

To be frank, I even think you shouldn't worry at all about "module". Just extract the blessed thing with caching and call it a day. I was also thinking of simplifying options etc. by requiring a file "dmdflags.txt" in the archive and then doing this when you run dmd:

dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff

i.e. simply expand the file in the command line. No need for any extravaganza. But even dmdflags.txt I'd think would be a bit much.
And speaking of cmdline stuff, assume find, zip, etc. are present on the host system if you need them.

Thank you for considering changing your program.

Andrei

BTW Walter made a very nice suggestion: make a .zip file in the command line be equivalent to listing all files in that zip in the command line. I think it's this kind of idea that greatly simplifies things.

Andrei

Fair enough.
Mar 18 2010
Andrei Alexandrescu Wrote:

To be frank, I even think you shouldn't worry at all about "module". Just extract the blessed thing with caching and call it a day. I was also thinking of simplifying options etc. by requiring a file "dmdflags.txt" in the archive and then do this when you run dmd:

dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff

i.e. simply expand the file in the command line.

I think it would be a good idea to stay well away from gratuitous portability barriers like this or that system("unzip") suggestion if the portable alternative isn't too much more work. I don't see why you wouldn't want this thing to work on Windows too.
Mar 18 2010
On 03/18/2010 12:28 PM, Clemens wrote:

Andrei Alexandrescu Wrote:

Yah, I agree. Well `` don't need to be used in the command line, a std.file.readText("dmdflags") should suffice.

Andrei

To be frank, I even think you shouldn't worry at all about "module". Just extract the blessed thing with caching and call it a day. I was also thinking of simplifying options etc. by requiring a file "dmdflags.txt" in the archive and then do this when you run dmd:

dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff

i.e. simply expand the file in the command line.

I think it would be a good idea to stay well away from gratuitous portability barriers like this or that system("unzip") suggestion if the portable alternative isn't too much more work. I don't see why you wouldn't want this thing to work on Windows too.
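The portable variant being proposed - read the flags file yourself and splice its contents into the argument list, no shell or `cat` involved - is only a couple of lines in any language. A sketch in Python for illustration (the readText-based D equivalent would look much the same; the helper name is hypothetical):

```python
import shlex

def dmd_command(flags_text, sources):
    """Build a dmd argv by expanding the contents of a flags file.
    shlex.split honors quoting, so flags with spaces survive intact."""
    return ["dmd"] + shlex.split(flags_text) + sources
```

The resulting list would go straight to a process-spawning call that takes an argument vector, which sidesteps shell quoting differences between Windows and POSIX entirely.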
Mar 18 2010
Andrei Alexandrescu wrote:

To be frank, I even think you shouldn't worry at all about "module". Just extract the blessed thing with caching and call it a day. I was also thinking of simplifying options etc. by requiring a file "dmdflags.txt" in the archive and then do this when you run dmd:

dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff

dmd will already read switches out of a file:

dmd @cmdfile ...

So there's no need to parse the command file or do any shell expansion on it. Just pass it, and precede it with an @.
Mar 18 2010
On 19-3-2010 1:18, Andrei Alexandrescu wrote:

i.e. simply expand the file in the command line. No need for any extravaganza. But even dmdflags.txt I'd think would be a bit much. And speaking of cmdline stuff, assume find, zip, etc. are present on the host system if you need them.

and I'm out.. I'm using Windows and don't have any of those (well, I have MS's FIND.EXE but that has nothing in common with posix')

Anyway, Ellery is right: general stuff that dmdz needs could probably be moved into Phobos at some point.

As for "module", couldn't dmd include an option to output these, similar to the way it outputs deps?

L.
Mar 18 2010
On 03/18/2010 06:43 PM, Lionello Lunesu wrote:

On 19-3-2010 1:18, Andrei Alexandrescu wrote:

You're right.

i.e. simply expand the file in the command line. No need for any extravaganza. But even dmdflags.txt I'd think would be a bit much. And speaking of cmdline stuff, assume find, zip, etc. are present on the host system if you need them.

and I'm out.. I'm using Windows and don't have any of those (well, I have MS's FIND.EXE but that has nothing in common with posix')

Anyway, Ellery is right: general stuff that dmdz needs could probably be moved into Phobos at some point.

I looked around. basename and dirname suggest that the ones in phobos have issues (what are those?), and some other functions rely on path2list which I'd hope to replace with a range so as to not allocate memory without necessity.

As for "module", couldn't dmd include an option to output these, similar to the way it outputs deps?

I think that would be a natural thing to ask for. Until then I don't think there's a real need for supporting module declarations in dmdz.

Andrei
Mar 18 2010
On 18/03/10 01:17, Ellery Newcomer wrote:
> I was kind of hoping others would try it and give their opinions, but apparently nobody else cares. Or they're on vacation, like I should be. Or they're giving the infamous 'silent approval'. Who knows.

I'm usually one of those, but seeing as you asked... It looks good :) I haven't had a chance to try it yet, but a simple tool like this could be really useful. I don't have the same reservations as Andrei about the amount of code/how it's done... If it does its job, it's good enough for me :)

One thing I would like to know: are there plans for file formats other than .zip? You can generally get files less than half the size, with faster compression/decompression times, using other formats... would adding support for them (.tar.xz, .tar.gz) be too much extra hassle?
Mar 18 2010
On 03/18/2010 11:32 AM, Robert Clipsham wrote:
> On 18/03/10 01:17, Ellery Newcomer wrote:
>> I was kind of hoping others would try it and give their opinions, but apparently nobody else cares. Or they're on vacation, like I should be. Or they're giving the infamous 'silent approval'. Who knows.
>
> I'm usually one of those, but seen as you asked... It looks good :) I haven't had chance to try it yet, but a simple tool like this could be really useful. I don't have the same reservations as Andrei about the amount of code/how it's done... If it does its job it's good enough for me :)
>
> One thing I would like to know, are there plans for file formats other than .zip? You can generally get files less than half the size with faster compression/decompression times using other formats... would adding support for them (.tar.xz, .tar.gz) be too much extra hassle?

It would only involve building support for those formats into phobos :) I actually had the same thought after I saw Walter's suggestion for a std.archive. If I have time, I'd like to make it happen.

Wait, what's tar.xz?
Mar 18 2010
On 03/18/2010 11:39 AM, Ellery Newcomer wrote:
> On 03/18/2010 11:32 AM, Robert Clipsham wrote:
>> One thing I would like to know, are there plans for file formats other than .zip? You can generally get files less than half the size with faster compression/decompression times using other formats... would adding support for them (.tar.xz, .tar.gz) be too much extra hassle?
>
> It would only involve building support for those formats into phobos :) I actually had the same thought after I saw Walter's suggestion for a std.archive. If I have time, I'd like to make it happen.

Heh, incidentally I just needed a tar reader a few days ago, so I wrote an embryo of a base class etc. I'll add it soon. The basic interface is: (a) open the archive; (b) get an input range for it. The range iterates over archive entries. (c) You can look at archive info, and if you want to extract you can get a .byChunk() range to extract it. That's also an input range.

For now I'm only concerned with reading... writing needs to be added.

Andrei
Mar 18 2010
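The interface Andrei describes - open the archive, iterate an input range of entries, and stream each entry through a .byChunk() range - might be sketched in D roughly as follows. All the names here (Entry, Archive) are hypothetical illustrations, not the actual Phobos API; a real implementation would read lazily from a stream rather than from arrays.

```d
import std.range : chunks;

// Hypothetical sketch of the archive-reading interface described above.
// An Archive is an input range of entries; each entry exposes metadata
// and a byChunk() input range for streaming extraction.
struct Entry
{
    string name;           // archive member name
    ulong size;            // uncompressed size
    const(ubyte)[] data;   // backing store for this toy in-memory version

    // Stream the entry's contents in fixed-size chunks.
    auto byChunk(size_t chunkSize)
    {
        return data.chunks(chunkSize);
    }
}

struct Archive
{
    Entry[] entries;  // a real reader would pull these from a stream lazily
    size_t index;

    // Input range primitives over the archive's entries.
    @property bool empty() const { return index >= entries.length; }
    @property Entry front() { return entries[index]; }
    void popFront() { ++index; }
}
```

Usage would then be `foreach (entry; archive) { foreach (chunk; entry.byChunk(4096)) ... }`, which keeps only one chunk in memory at a time.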
Andrei Alexandrescu wrote:
> Heh, incidentally I just needed a tar reader a few days ago, so I wrote an embryo of a base class etc. I'll add it soon. The basic interface is: (a) open the archive (b) get an input range for it. The range iterates over archive entries. (c) You can look at archive info, and if you want to extract you can get a .byChunk() range to extract it. That's also an input range. For now I'm only concerned with reading... writing needs to be added.

That's great, but I only suggest that this not be added to Phobos until a generic archive interface is also added. That way, we can constantly add support for new archive formats without requiring users to change their code.

Some suggestions for that:

1. The archive type should be represented by a string literal, not an enum. This way, users can add other archive types without having to touch the Phobos source code.

2. The reader should auto-detect the archive type based on the file contents, not the file name, and then call the appropriate factory method.
Mar 18 2010
On 03/18/2010 02:49 PM, Walter Bright wrote:
> That's great, but I only suggest that this not be added to Phobos until a generic archive interface is also added. That way, we can constantly add support for new archive formats without requiring users to change their code.

Yah.

> Some suggestions for that: 1. The archive type should be represented by a string literal, not an enum. This way, users can add other archive types without having to touch the Phobos source code. 2. The reader should auto-detect the archive type based on the file contents, not the file name, and then call the appropriate factory method.

The archive type should be a D class inheriting ArchiveReader, so no enum and no string need be involved. The rest is a matter of registry - a new archiver registers itself into a database of archivers that maps file header data to (pointers to) factory methods. Typical file extensions should help, too, because they'd ease matching.

Reading the file header (e.g. first 512 bytes) and then matching against archive signatures is, I think, a very nice touch. (I was only thinking of matching by file name.) There is a mild complication - you can't close and reopen the archive, so you need to pass those 512 bytes to the archiver along with the rest of the stream. This is because the stream may not be rewindable, as is the case with pipes.

Sounds great!

Andrei
Mar 18 2010
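The registry idea - archivers registering factory functions that get matched against leading header bytes - could look roughly like the sketch below. The names (registerArchiver, detect, Factory) are invented for illustration; the magic numbers shown are the real ones for zip ("PK\x03\x04") and gzip (0x1f 0x8b).

```d
// Sketch of a self-registering archiver database: each format maps its
// magic-number prefix to a factory. A real Factory would construct an
// ArchiveReader; a string stands in for it here.
alias Factory = string function();

Factory[immutable(ubyte)[]] registry;

void registerArchiver(immutable(ubyte)[] magic, Factory f)
{
    registry[magic] = f;
}

// Match the first bytes of a stream against every registered signature.
Factory detect(const(ubyte)[] header)
{
    foreach (magic, f; registry)
        if (header.length >= magic.length && header[0 .. magic.length] == magic)
            return f;
    return null;  // unknown format
}

static this()
{
    registerArchiver([0x50, 0x4b, 0x03, 0x04],
                     function string() { return "zip"; });
    registerArchiver([0x1f, 0x8b],
                     function string() { return "gzip"; });
}
```

Note the header bytes already consumed for detection would have to be handed to the factory along with the rest of the stream, exactly as Andrei points out for non-rewindable inputs like pipes.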
Andrei Alexandrescu wrote:
> Reading the file header (e.g. first 512 bytes) and then matching against archive signatures is, I think, a very nice touch. (I was only thinking of matching by file name.) There is a mild complication - you can't close and reopen the archive, so you need to pass those 512 bytes to the archiver along with the rest of the stream. This is because the stream may not be rewindable, as is the case with pipes.

The reasons for reading the file to determine the archive type are:

1. Files sometimes lose their extensions when being transferred around. I sometimes have this problem when downloading files from the internet - Windows will store it without an extension.

2. Sometimes I have to remove the extension when sending a file via email, as stupid email readers block certain email messages based on file attachment extensions.

3. People don't always put the right extension onto the file.

4. Passing an archive of one type to a reader for another type causes the reader to crash (yes, I know, readers should be more robust that way, but reality is reality).

Is it really necessary to support streaming archives? The reason I ask is we can nicely separate building/reading archives from file I/O. The archives can be entirely done in memory. Perhaps if an archive is being streamed, the program can simply accumulate it all in memory, then call the archive library functions.
Mar 18 2010
On 03/18/2010 03:11 PM, Walter Bright wrote:
> The reasons for reading the file to determine the archive type are: 1. Files sometimes lose their extensions when being transferred around. I sometimes have this problem when downloading files from the internet - Windows will store it without an extension. 2. Sometimes I have to remove the extension when sending a file via email, as stupid email readers block certain email messages based on file attachment extensions. 3. People don't always put the right extension onto the file. 4. Passing an archive of one type to a reader for another type causes the reader to crash (yes, I know, readers should be more robust that way, but reality is reality).

Makes sense.

> Is it really necessary to support streaming archives?

It is not necessary, only vital.

> The reason I ask is we can nicely separate building/reading archives from file I/O. The archives can be entirely done in memory. Perhaps if an archive is being streamed, the program can simply accumulate it all in memory, then call the archive library functions.

This is completely nonscalable! 90% of all my archive manipulation involves streaming, and I wouldn't dream of thinking of loading most of those files in RAM. They are huge! I paste from a script I'm working on right now:

if [[ ! -f $D/sentences.num.gz ]]; then
    ./txt2num.d $D/voc.txt \
        < <(pv $D/sentences.txt.gz | gunzip) \
        > >(gzip >$D/sentences.num.tmp.gz)
    mv $D/sentences.num.tmp.gz $D/sentences.num.gz
fi

That takes a good amount of time to run because the .gz involved is 2,180,367,456 bytes _after_ compression. Note how zipping is done both ways - on reading and writing.

It would be great if we all went to the utmost possible lengths to distance ourselves from such nonscalable thinking. It's the root reason for which the wc sample program on digitalmars.com is _inappropriate_ and _damaging_ to the reputation of the language, and also the reason for which hash tables' implementation performs so poorly on large data - i.e., exactly when it matters. It's the kind of thinking stemming from "But I don't have _one_ file larger than 1GB anywhere on my hard drive!" which you repeatedly claimed as if it were a solid argument. Well if you don't have one you better get some. Nobody's going to give us a cookie if we process 50KB files 10 times faster than Perl or Python. Where it does matter is large data, and I'd be in a much better mood if I didn't feel my beard growing while I'm waiting next to a program that uses hashes to build a large index file.

Andrei
Mar 18 2010
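For what it's worth, Phobos can already do this kind of chunk-at-a-time (de)compression via std.zlib's Compress and UnCompress classes, so the full data set never has to be resident in memory. A rough round-trip sketch (the two-chunk input is just for demonstration; real code would feed chunks straight from a byChunk file range, and the exact std.zlib signatures may differ across Phobos versions):

```d
import std.zlib : Compress, UnCompress;

// Compress data chunk by chunk, then decompress it the same way,
// never holding more than one chunk plus the codec state per side.
const(void)[] roundTrip(const(ubyte)[][] chunks)
{
    auto c = new Compress;
    const(void)[] compressed;
    foreach (chunk; chunks)
        compressed ~= c.compress(chunk);   // feed one chunk at a time
    compressed ~= c.flush();               // emit the trailing compressed bytes

    auto u = new UnCompress;
    const(void)[] plain;
    plain ~= u.uncompress(compressed);
    plain ~= u.flush();
    return plain;
}
```

In a streaming program the `compressed ~=` and `plain ~=` accumulations would of course be replaced by writes to an output stream.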
Andrei Alexandrescu wrote:
>> Is it really necessary to support streaming archives?
>
> It is not necessary, only vital.

I understand your point. But I still would like a way to build and read archives entirely in memory. One reason is that's how dmd is able to generate libraries so quickly.
Mar 18 2010
On 03/18/2010 04:22 PM, Walter Bright wrote:
> I understand your point. But I still would like a way to build and read archives entirely in memory. One reason is that's how dmd is able to generate libraries so quickly.

Makes sense. (On the read side, reading in memory is not a problem if reading from a stream is defined - just use the streaming interface to load stuff in memory. For the writing part we need the mythical streaming abstraction that replaces current streams...)

Andrei
Mar 18 2010
Andrei Alexandrescu wrote:
> Makes sense. (On the read side, reading in memory is not a problem if reading from a stream is defined - just use the streaming interface to load stuff in memory. For the writing part we need the mythical streaming abstraction that replaces current streams...)

Maybe a better way to do it is to just pass a delegate that encapsulates a reader, and a delegate for the writing. That way, both streams and in-memory buffers will work with the same interface, and the archiver need know nothing about streams or memory. Some default delegates can be provided that interface to streams, files, and memory buffers.

Or maybe just pass a range!
Mar 18 2010
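The delegate idea might be sketched like so: the archiver pulls bytes through a reader delegate and pushes bytes through a writer delegate, so files, pipes, and memory buffers all look alike to it. Every name below (copyArchive, memoryReader, etc.) is invented for illustration only.

```d
// Sketch: an archiver that knows nothing about streams or memory.
// The reader returns the next chunk (an empty slice means end of input);
// the writer consumes one output chunk at a time.
alias Reader = const(ubyte)[] delegate();
alias Writer = void delegate(const(ubyte)[]);

// A trivial pass-through "archiver" to show the plumbing.
void copyArchive(Reader read, Writer write)
{
    for (auto chunk = read(); chunk.length != 0; chunk = read())
        write(chunk);
}

// One of the "default delegates" Walter mentions: a reader over an
// in-memory buffer. Equivalents for files and streams would wrap
// their respective read calls.
Reader memoryReader(const(ubyte)[] data, size_t chunkSize)
{
    return delegate const(ubyte)[]()
    {
        auto n = data.length < chunkSize ? data.length : chunkSize;
        auto chunk = data[0 .. n];
        data = data[n .. $];
        return chunk;
    };
}
```

The closure captures the remaining-data slice, so each call to the reader advances through the buffer; a pipe-backed reader would have the same signature and the archiver couldn't tell the difference.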
Andrei Alexandrescu wrote:
> The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).
Mar 18 2010
On 03/18/2010 05:11 PM, Walter Bright wrote:
> Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).

Emphatically NO. Archives work with streams. You can build indexing on top of them.

Andrei
Mar 18 2010
On 2010-03-18 18:17:26 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:
> Emphatically NO. Archives work with streams. You can build indexing on top of them.

Andrei, have you taken a look at the Zip file format? It's not streamable. To be exact, zip is not streamable because you need to read the central directory at the end of the archive to get the actual file list. This has its benefits: it makes it easy to peek at the content without loading everything, and it makes it possible to completely change the archive's logical content just by appending to the file. It's like a mini-database in a way.

<http://en.wikipedia.org/wiki/ZIP_(file_format)#Technical_information>

I agree it is essential to have streaming support for archive formats that work with streaming. But offering only that is not a solution for archives in general.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Mar 18 2010
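Michel's point can be made concrete: before a zip reader can list anything, it has to hunt backwards from the end of the file for the end-of-central-directory record, whose signature is 0x06054b50 ("PK\x05\x06" in byte order). A minimal sketch of that search over an in-memory buffer (findEocd is an invented name; a real reader scans at most 64K + 22 bytes back, to skip past any trailing zip comment):

```d
// Find the zip end-of-central-directory record by scanning backwards
// for its 4-byte signature "PK\x05\x06" (0x06054b50 stored little-endian).
// Returns the record's offset, or -1 if no signature is found.
ptrdiff_t findEocd(const(ubyte)[] file)
{
    if (file.length < 4)
        return -1;
    for (ptrdiff_t i = file.length - 4; i >= 0; --i)
        if (file[i] == 0x50 && file[i + 1] == 0x4b &&
            file[i + 2] == 0x05 && file[i + 3] == 0x06)
            return i;
    return -1;  // not a zip, or the comment is longer than the scanned span
}
```

Since the search starts at the tail of the file, a pipe-fed zip can only be handled by buffering, which is exactly why zip resists a pure streaming interface.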
On 03/18/2010 05:32 PM, Michel Fortin wrote:
> Andrei, have you took a look at the Zip file format? It's not streamable. To be exact, zip is not streamable because you need to read the central directory at the end of the archive to get the actual file list. This has its benefits: it makes it easy to peak at the content without loading everything, and it makes it possible to completely change the archive's logical content just by appending to the file. It's like a mini-database in a way.
>
> I agree it is essential to have streaming support for archives formats that works with streaming. But offering only that is not a solution for archives in general.

Interesting, thank you. I still think generally a random-access interface is not the charter of the Archive interface. A zip archive should open the archive, seek to the end of it once, build an index, and then rewind the file for sequential access. But we shouldn't ask for such miracles from all archives.

Andrei
Mar 18 2010
Andrei Alexandrescu wrote:
> Emphatically NO. Archives work with streams. You can build indexing on top of them.

Such an interface won't work with .lib or .a archives. Both have an embedded table of contents that is such an associative array - it's not a list of file names, either; that's separate.
Mar 18 2010
On 03/18/2010 06:00 PM, Walter Bright wrote:
> Such an interface won't work with .lib or .a archives. Both have an embedded table of contents that is such an associative array - it's not a list of file names, either, that's separate.

Now I understand why linkers thrash the disk.

Anyway, my point is: indexing the archive should not be part of the basic interface. Such capabilities should be in an enhanced interface that builds upon the basic one.

Andrei
Mar 18 2010
Andrei Alexandrescu wrote:
> Now I understand why linkers thrash the disk.

I think this is incorrect. The table of contents in the .lib files was designed to work with a floppy disk system, and to minimize the number of disk reads. The design of .a libraries is equivalent. The thrashing of linkers came about on limited-memory systems as the linker's in-memory data set often exceeded physical ram. A typical linker run also simply needs to read a lot of files.

> Anyway, my point is: indexing the archive should be not part of the basic interface. Such capabilities should be in an enhanced interface that builds upon the basic one.

That would be fine.
Mar 18 2010
On 18/03/10 16:39, Ellery Newcomer wrote:
> Wait, what's tar.xz?

http://en.wikipedia.org/wiki/Xz - A lot of Linux distros seem to be moving to it for packaging from .tar.gz. I'm on Arch Linux, and the updates are a fraction of the size they used to be :)
Mar 18 2010