digitalmars.D - [due diligence] std.xml
- Justin Johansson (9/9) Oct 19 2010 This module should be removed altogether from Phobos forthwith.
- so (7/16) Oct 19 2010 Man, you sure getting on my nerves, on your last strike i lost it and go...
- Emil Madsen (6/27) Oct 19 2010 I agree with so, be polite, tho the code might not be as good as one wou...
- Steven Schveighoffer (8/12) Oct 19 2010 I agree with the sentiment that you should respect someone else's hard
- Justin Johansson (4/18) Oct 19 2010 I'm sorry and regret the impoliteness of my post. Please all
- Yao G. (7/10) Oct 19 2010 I don't think that Walter or Andrei would take kindly being called like ...
- Daniel Gibson (3/5) Oct 19 2010 I already wondered about that when he posted that strange joke about fuc...
- Andrei Alexandrescu (6/15) Oct 19 2010 I haven't worked with XML all that much. Please make me understand the
- Kagamin (2/5) Oct 19 2010 It needs rewrite.
- Denis Koroskin (3/22) Oct 19 2010 I use it, but design is bad and performance is awful.
- Andrei Alexandrescu (6/32) Oct 19 2010 More detail about the design please? I browsed through the code and the
- Jacob Carlborg (11/16) Oct 19 2010 It has a kind of annoying API:
- Michael Rynn (22/38) Oct 26 2010 There is a xml parser and document structure that follows DOM 2-3 interf...
- div0 (17/23) Oct 19 2010 Well one obvious problem is you have to read the document into memory
- sybrandy (31/33) Oct 19 2010 I think that depends on the type of XML library we create. A SAX
- div0 (19/24) Oct 19 2010
- Michel Fortin (34/41) Oct 19 2010 Many people have different needs for XML, it's hard to come with
- Andrei Alexandrescu (4/9) Oct 19 2010 Looks like a simple and clean API. We should be able to adopt the code
- Simen kjaeraas (4/6) Oct 19 2010 I'd love to give this a spin.
- Michel Fortin (6/12) Oct 19 2010 Great. I'll post that code tomorrow.
- Michel Fortin (9/19) Oct 19 2010 Not yet tomorrow, but it's ready. Have fun.
- Simen kjaeraas (4/19) Oct 20 2010 Thank you. I'll have a look-see.
This module should be removed altogether from Phobos forthwith. The code was obviously submitted and accepted without peer review, either that or the peers were idiots as well. It would be better to say that Phobos does not have an XML library yet, and to seek submissions, rather than maintain this piece of codswallop in the latest distribution. Let's not even talk of deprecation. Any D user currently using std.xml is completely misguided. Justin
Oct 19 2010
Man, you sure getting on my nerves, on your last strike i lost it and gone berserk in one of your threads. Now this... You know a human being wrote it, have some respect until you come up with something better. Thanks!

On Tue, 19 Oct 2010 16:06:31 +0300, Justin Johansson <no spam.com> wrote:
This module should be removed altogether from Phobos forthwith. The code was obviously submitted and accepted without peer review, either that or the peers were idiots as well. It would be better to say that Phobos does not have an XML library yet, and to seek submissions, rather than maintain this piece of codswallop in the latest distribution. Let's not even talk of deprecation. Any D user currently using std.xml is completely misguided. Justin

-- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Oct 19 2010
I agree with so, be polite, tho the code might not be as good as one would wish.

2010/10/19 so <so so.do>
Man, you sure getting on my nerves, on your last strike i lost it and gone berserk in one of your threads. Now this... You know a human being wrote it, have some respect until you come up with something better. Thanks!
On Tue, 19 Oct 2010 16:06:31 +0300, Justin Johansson <no spam.com> wrote:
This module should be removed altogether from Phobos forthwith. The code was obviously submitted and accepted without peer review, either that or the peers were idiots as well. It would be better to say that Phobos does not have an XML library yet, and to seek submissions, rather than maintain this piece of codswallop in the latest distribution. Let's not even talk of deprecation. Any D user currently using std.xml is completely misguided. Justin
-- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

-- // Yours sincerely // Emil 'Skeen' Madsen
Oct 19 2010
On Tue, 19 Oct 2010 09:36:59 -0400, so <so so.do> wrote:
Man, you sure getting on my nerves, on your last strike i lost it and gone berserk in one of your threads. Now this... You know a human being wrote it, have some respect until you come up with something better. Thanks!

I agree with the sentiment that you should respect someone else's hard work. Having said that, I agree std.xml should be removed until something replaces it. Fixing bugs in it makes no sense since 1) the author no longer is around and 2) I think it has serious design flaws. Removing it has been discussed on the phobos list before. -Steve
Oct 19 2010
On 20/10/2010 12:44 AM, Steven Schveighoffer wrote:
On Tue, 19 Oct 2010 09:36:59 -0400, so <so so.do> wrote:
Man, you sure getting on my nerves, on your last strike i lost it and gone berserk in one of your threads. Now this... You know a human being wrote it, have some respect until you come up with something better. Thanks!
I agree with the sentiment that you should respect someone else's hard work. Having said that, I agree std.xml should be removed until something replaces it. Fixing bugs in it makes no sense since 1) the author no longer is around and 2) I think it has serious design flaws. Removing it has been discussed on the phobos list before. -Steve

I'm sorry and regret the impoliteness of my post. Please all accept my apology for any offense caused by my careless remarks. -Justin
Oct 19 2010
On Tue, 19 Oct 2010 08:06:31 -0500, Justin Johansson <no spam.com> wrote:
This module should be removed altogether from Phobos forthwith. The code was obviously submitted and accepted without peer review, either that or the peers were idiots as well.

I don't think that Walter or Andrei would take kindly being called like that, as they are some of the "peers" that review Phobos submissions. :p And yes, I agree that std.xml is not really good. P.D. Are you drunk? -- Yao G.
Oct 19 2010
Yao G. wrote:
P.D. Are you drunk?

I already wondered about that when he posted that strange joke about fucking the D type system in a canoe.
Oct 19 2010
On 10/19/10 8:06 CDT, Justin Johansson wrote:
This module should be removed altogether from Phobos forthwith. The code was obviously submitted and accepted without peer review, either that or the peers were idiots as well. It would be better to say that Phobos does not have an XML library yet, and to seek submissions, rather than maintain this piece of codswallop in the latest distribution. Let's not even talk of deprecation. Any D user currently using std.xml is completely misguided. Justin

I haven't worked with XML all that much. Please make me understand the matter better - is std.xml's speed the only concern, or is the module generally obtuse to work with? Thanks, Andrei
Oct 19 2010
Andrei Alexandrescu Wrote:
I haven't worked with XML all that much. Please make me understand the matter better - is std.xml's speed the only concern, or is the module generally obtuse to work with?

It needs rewrite.
Oct 19 2010
On Tue, 19 Oct 2010 22:47:56 +0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
On 10/19/10 8:06 CDT, Justin Johansson wrote:
This module should be removed altogether from Phobos forthwith. The code was obviously submitted and accepted without peer review, either that or the peers were idiots as well. It would be better to say that Phobos does not have an XML library yet, and to seek submissions, rather than maintain this piece of codswallop in the latest distribution. Let's not even talk of deprecation. Any D user currently using std.xml is completely misguided. Justin
I haven't worked with XML all that much. Please make me understand the matter better - is std.xml's speed the only concern, or is the module generally obtuse to work with? Thanks, Andrei

I use it, but design is bad and performance is awful.
Oct 19 2010
On 10/19/10 14:30 CDT, Denis Koroskin wrote:
On Tue, 19 Oct 2010 22:47:56 +0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
On 10/19/10 8:06 CDT, Justin Johansson wrote:
This module should be removed altogether from Phobos forthwith. The code was obviously submitted and accepted without peer review, either that or the peers were idiots as well. It would be better to say that Phobos does not have an XML library yet, and to seek submissions, rather than maintain this piece of codswallop in the latest distribution. Let's not even talk of deprecation. Any D user currently using std.xml is completely misguided. Justin
I haven't worked with XML all that much. Please make me understand the matter better - is std.xml's speed the only concern, or is the module generally obtuse to work with? Thanks, Andrei
I use it, but design is bad and performance is awful.

More detail about the design please? I browsed through the code and the main issue seems to be heavy reliance on granular delegates to do pretty much anything. Would fixing that improve usability? (It would most likely improve performance.) Andrei
Oct 19 2010
On 2010-10-19 21:37, Andrei Alexandrescu wrote:
More detail about the design please? I browsed through the code and the main issue seems to be heavy reliance on granular delegates to do pretty much anything. Would fixing that improve usability? (It would most likely improve performance.) Andrei

It has a kind of annoying API:
* Attributes are handled as an associative array instead of classes like the rest of the nodes
* You cannot create an empty Document, you must either create one from existing XML data or create one with a root element
* There is no way to access the parent of a node
* No XPath
These are a few I came up with for now. -- /Jacob Carlborg
Oct 19 2010
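The API complaints in the list above can be made concrete with a small sketch. It is written in Python rather than D, and every class and function name in it (`Node`, `Attribute`, `Element`, `Document`, `path_to_root`) is hypothetical; it is not std.xml's API or any proposed replacement, only an illustration of a tree in which attributes are nodes, every node knows its parent, and a document may start out empty. XPath is not covered.

```python
class Node:
    """Common base: every node, attributes included, records its parent."""

    def __init__(self):
        self.parent = None


class Attribute(Node):
    """An attribute modelled as a node instead of an associative-array entry."""

    def __init__(self, name, value):
        super().__init__()
        self.name = name
        self.value = value


class Element(Node):
    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.attributes = []
        self.children = []

    def set_attribute(self, name, value):
        attr = Attribute(name, value)
        attr.parent = self
        self.attributes.append(attr)
        return attr

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


class Document:
    """A document may be created empty; the root is attached later."""

    def __init__(self, root=None):
        self.root = root


def path_to_root(node):
    """Walk the parent links upward and return the element tags, root first."""
    tags = []
    while node is not None:
        if isinstance(node, Element):
            tags.append(node.tag)
        node = node.parent
    return "/".join(reversed(tags))
```

With this shape, `Document()` is valid on its own, and code holding any node (even an attribute) can navigate back up the tree through `parent`.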
On Tue, 19 Oct 2010 21:53:56 +0200, Jacob Carlborg wrote:
On 2010-10-19 21:37, Andrei Alexandrescu wrote:
More detail about the design please? I browsed through the code and the main issue seems to be heavy reliance on granular delegates to do pretty much anything. Would fixing that improve usability? (It would most likely improve performance.) Andrei
It has a kind of annoying API:
* Attributes are handled as an associative array instead of classes like the rest of the nodes
* You cannot create an empty Document, you must either create one from existing XML data or create one with a root element
* There is no way to access the parent of a node
* No XPath

There is a xml parser and document structure that follows DOM 2-3 interfaces on DSource. dsource.org/projects/xmlp. It uses InputRanges to manage the parsing (sort of layered). Handles DTD. Has DOM level 2/3 interfaces (Really, so much of this XML DOM interface seems directly taken off the Java design). DOMString type can be aliased as string or wstring. Handles a lot of XML corner cases, and validates against XML test suite. Can read multiple source files in different encodings and external DTDs.

The first version also handled namespaces, but I haven't checked / updated namespaces in the current version, because no-one seems interested, and so I drifted off and did other things. The resulting parser and code seems too big and messy for Phobos. I imagine people just want something that reads in already validated xml, quick and dirty. I can imagine no one would have the patience to know where to start using this, although there are some example test and validation programs.

There's also a short xpath runtime analyser based on it, and a make tool using this, XML config files and variables, to build D programs, and run commands. --- Michael.
Oct 26 2010
On 19/10/2010 20:37, Andrei Alexandrescu wrote:
On 10/19/10 14:30 CDT, Denis Koroskin wrote:
More detail about the design please? I browsed through the code and the main issue seems to be heavy reliance on granular delegates to do pretty much anything. Would fixing that improve usability? (It would most likely improve performance.) Andrei

Well one obvious problem is you have to read the document into memory first, which clearly isn't good enough for large documents. Secondly it doesn't handle xml namespaces properly. namespaces are critically important for parsing most xml documents in practice. Otherwise it should be a heavily template based design sort of like boost::spirit so you can process tags/attributes as they are parsed for maximum performance.

If we have XML in the std library, it's really got to be a 100% standards conformant implementation; otherwise it's a waste of space. I spent a couple of weeks writing an XML parser; but I skipped the more obtuse bits. Doing it properly is a large chunk of work and it's dull as f*ck.

-- My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk
Oct 19 2010
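The two points raised above, processing a document without holding all of it in memory and resolving namespaces properly, can be illustrated with Python's standard `xml.etree.ElementTree.iterparse`. This is only a sketch of the behaviour being asked for, not D code and not a proposal for Phobos; the Atom namespace URI, the `urn:example:other` namespace, and the `count_entries` helper are assumptions picked for the example.

```python
import io
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"  # namespace URI chosen for illustration


def count_entries(stream):
    """Count Atom <entry> elements while reading the stream incrementally."""
    count = 0
    # iterparse reads the stream in chunks and reports each element as it ends.
    for event, elem in ET.iterparse(stream, events=("end",)):
        # Tags are reported namespace-qualified as "{uri}localname", so the
        # prefix used in the file does not matter, only the URI it maps to.
        if elem.tag == f"{{{ATOM}}}entry":
            count += 1
            # Drop the finished element's children and text; the emptied
            # element itself remains attached to its parent.
            elem.clear()
    return count


sample = io.BytesIO(
    b'<a:feed xmlns:a="http://www.w3.org/2005/Atom" xmlns:x="urn:example:other">'
    b"<a:entry><a:title>one</a:title></a:entry>"
    b"<x:entry>same local name, different namespace</x:entry>"
    b"<a:entry><a:title>two</a:title></a:entry>"
    b"</a:feed>"
)
print(count_entries(sample))
```

The `x:entry` element shares a local name with the Atom entries but lives in a different namespace, so a parser that compares raw tag strings would count it by mistake; comparing on the resolved URI does not.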
Well one obvious problem is you have to read the document into memory first, which clearly isn't good enough for large documents.

I think that depends on the type of XML library we create. A SAX library doesn't require the whole document in memory, however a DOM library typically does as, from what I can tell, they create an in-memory representation that's tree-like. If you don't read it into memory, I'm not really sure how you would be able to, for example, write XPath queries to access some random nodes that are not grouped together in a relatively efficient manner. I say relatively because yes, the memory layout can be very scattered, however it's still better than having to perform random access from disk.

I guess one question we need to ask is what do we expect from this library? Do we want a full DOM implementation or is a SAX parser good enough? Or do we need something in between? In PHP or Perl, perhaps both, I saw a library where an XML document was essentially transformed into nested associative arrays. It made it very easy to read data from the XML, however I don't know how much of the official standards it complied with.

The current std.xml looks like it tries to be both a DOM library and a SAX library. Personally, I'd rather break them up into two libraries, though it may make sense for the DOM library to leverage the SAX library to build up its objects.

IMHO, I love a good SAX parser. I've used them in the past and I think they work great, so having one in D I think would be ideal, especially in those situations where the XML file is essentially read-only. Do we need a DOM parser? I honestly don't know. Personally, I'd be happy with the associative array approach as it's simple. I don't need to learn a new API just to navigate through XML. Yes, I know there are advantages to using the DOM and XPath, which I also like, but for the most part, I don't need either.
Of course, I personally would love to just let XML die and use better data formats, but that's an unrealistic dream :) Casey
Oct 19 2010
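The idea of layering a tree builder on top of a SAX parser, and of exposing the result as nested associative arrays, can be sketched with Python's standard `xml.sax` module. The `DictBuilder` class, the `to_dict` helper, and the dictionary layout (`tag`, `attrs`, `text`, `children`) are all assumptions made for this illustration; it is not the PHP or Perl library mentioned above, and it simplifies mixed content by concatenating all text into the enclosing element.

```python
import xml.sax


class DictBuilder(xml.sax.ContentHandler):
    """SAX handler that assembles nested dictionaries from parse events."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # elements that are currently open, outermost first

    def startElement(self, name, attrs):
        node = {
            "tag": name,
            "attrs": dict(attrs.items()),
            "text": "",
            "children": [],
        }
        if self._stack:
            self._stack[-1]["children"].append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        # May be called several times for one run of text, so accumulate.
        if self._stack:
            self._stack[-1]["text"] += content

    def endElement(self, name):
        self._stack.pop()


def to_dict(xml_bytes):
    """Parse a whole document and return its root as nested dictionaries."""
    builder = DictBuilder()
    xml.sax.parseString(xml_bytes, builder)
    return builder.root


config = to_dict(b'<config><db host="localhost"><port>5432</port></db></config>')
print(config["children"][0]["attrs"]["host"])
```

The SAX layer stays usable on its own for read-only streaming, while callers who prefer the simple nested-array view pay for the in-memory tree only when they ask for it.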
On 19/10/2010 21:43, sybrandy wrote:
Well one obvious problem is you have to read the document into memory first, which clearly isn't good enough for large documents.
<snip>

Nobody said anything about DOM, and the current std.xml doesn't in any way support a DOM implementation. That's a whole 'nother ball game. For my money supporting DOM is not and should not be a goal for phobos; DOM is irrelevant for the basics of XML data handling.

Of course, I personally would love to just let XML die and use better data formats, but that's an unrealistic dream :) Casey

Damn straight. XML is a fugly crap format. But too many people have invested too much in it; and these days it is pretty much the de facto standard for a lot of data interchange and a lot of 'file formats' so ignoring it isn't really an option. .NET gives good support for XML out of the box and if phobos is going to have XML, it should do at least as well as the .NET implementation or it's not going to be realistically usable for serious use. I don't think we need to go as far as XSLT, but XML 1.1 and Namespaces is a must.

-- My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk
Oct 19 2010
On 2010-10-19 16:43:04 -0400, sybrandy <sybrandy gmail.com> said:
I guess one question we need to ask is what do we expect from this library? Do we want a full DOM implementation or is a SAX parser good enough? Or do we need something in between? In PHP or Perl, perhaps both, I saw a library where an XML document was essentially transformed into nested associative arrays. It made it very easy to read data from the XML, however I don't know how much of the official standards it complied with.

Many people have different needs for XML, it's hard to come up with something that pleases everyone. I might have the solution to that however: a template that makes it easy to implement any kind of parser.

I've made two xml modules a little while ago. The first is a tokenizer template that can work either as a pull-parser or callback-parser, or even a mix of both, and is reentrant (you can invoke the tokenizer inside a callback to parse new tokens). The implementation has been written based on the XML spec so I'm confident that the parser is pretty much standard. In regard to the standard, the tokenizer lacks support for DTD internal subsets and user-defined character entities, and leaves some well-formedness checks to the upper layers (like checking if tag name matches) where it should be less costly for those checks to happen.

The second module is a basic tree model based on the tokenizer. It doesn't try to be DOM-conformant, but it shows how the tokenizer can be used and implements the higher-level well-formedness checks (matching tag names). Building a SAX parser on top of the tokenizer would be a piece of cake too.

It might be incomplete, but this code works: it's already in production in a small program (script?) of mine. I don't really have the time to work on it at the moment, but if anyone wants to take it and improve upon it, then it could probably become Phobos's XML parser.
One thing that should be done is make the tokenizer accept ranges, something I started a couple of months ago but which I never finished. Here's the (slightly outdated) documentation. If someone wants to proceed I'll extract the code from the rest of my code and release it under the boost license. http://michelf.com/docs/d/mfr/xmltok.html http://michelf.com/docs/d/mfr/xml.html -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 19 2010
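The layering described in the post above (one tokenizer usable in pull or callback style, with tag-name matching left to a higher layer) can be sketched roughly as follows. This is a toy in Python, not the D modules linked in the post: the `tokens`, `parse_with_callbacks`, and `check_nesting` names are hypothetical, and the regular expression handles only element tags and text, ignoring comments, processing instructions, CDATA, entities, and DTDs.

```python
import re

# Matches an element tag (opening, closing, or self-closing) or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z_:][\w:.-]*)[^<>]*?(/?)>|([^<]+)")


def tokens(text):
    """Pull style: lazily yield ('open' | 'close' | 'text', value) pairs."""
    for match in TOKEN.finditer(text):
        closing, name, empty, chars = match.groups()
        if chars is not None:
            yield ("text", chars)
        elif closing:
            yield ("close", name)
        else:
            yield ("open", name)
            if empty:  # a self-closing tag counts as open followed by close
                yield ("close", name)


def parse_with_callbacks(text, on_open=None, on_close=None, on_text=None):
    """Callback style, built on the same tokenizer."""
    handlers = {"open": on_open, "close": on_close, "text": on_text}
    for kind, value in tokens(text):
        handler = handlers[kind]
        if handler is not None:
            handler(value)


def check_nesting(text):
    """Upper-layer check: every closing tag must match the innermost open tag."""
    stack = []
    for kind, value in tokens(text):
        if kind == "open":
            stack.append(value)
        elif kind == "close":
            if not stack or stack.pop() != value:
                return False
    return not stack
```

The tokenizer itself never compares tag names; `check_nesting` does that with a stack it needs anyway, which is the cost argument the post makes for pushing that check upward. A consumer that only wants, say, the text of certain elements can use `tokens` directly and skip the check entirely.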
On 10/19/10 17:16 CDT, Michel Fortin wrote:
Here's the (slightly outdated) documentation. If someone wants to proceed I'll extract the code from the rest of my code and release it under the boost license. http://michelf.com/docs/d/mfr/xmltok.html http://michelf.com/docs/d/mfr/xml.html

Looks like a simple and clean API. We should be able to adopt the code into Phobos if an owner and champion is found. Andrei
Oct 19 2010
Michel Fortin <michel.fortin michelf.com> wrote:
If someone wants to proceed I'll extract the code from the rest of my code and release it under the boost license.

I'd love to give this a spin. -- Simen
Oct 19 2010
On 2010-10-19 19:28:18 -0400, "Simen kjaeraas" <simen.kjaras gmail.com> said:
Michel Fortin <michel.fortin michelf.com> wrote:
If someone wants to proceed I'll extract the code from the rest of my code and release it under the boost license.
I'd love to give this a spin.

Great. I'll post that code tomorrow. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 19 2010
On 2010-10-19 21:31:04 -0400, Michel Fortin <michel.fortin michelf.com> said:
On 2010-10-19 19:28:18 -0400, "Simen kjaeraas" <simen.kjaras gmail.com> said:
Michel Fortin <michel.fortin michelf.com> wrote:
If someone wants to proceed I'll extract the code from the rest of my code and release it under the boost license.
I'd love to give this a spin.
Great. I'll post that code tomorrow.

Not yet tomorrow, but it's ready. Have fun. <http://michelf.com/docs/d/mfr-xml-2010-10-19.zip> I've included some notes in the archive about what each module does and what's missing. Feel free to ask if you have questions. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 19 2010
Michel Fortin <michel.fortin michelf.com> wrote:
On 2010-10-19 21:31:04 -0400, Michel Fortin <michel.fortin michelf.com> said:
On 2010-10-19 19:28:18 -0400, "Simen kjaeraas" <simen.kjaras gmail.com> said:
Michel Fortin <michel.fortin michelf.com> wrote:
If someone wants to proceed I'll extract the code from the rest of my code and release it under the boost license.
I'd love to give this a spin.
Great. I'll post that code tomorrow.
Not yet tomorrow, but it's ready. Have fun. <http://michelf.com/docs/d/mfr-xml-2010-10-19.zip> I've included some notes in the archive about what each module does and what's missing. Feel free to ask if you have questions.

Thank you. I'll have a look-see. -- Simen
Oct 20 2010