
c++.stlsoft - XMLSTL progress: Testers / users wanted in a week or two; some opinions wanted now

reply "Matthew" <matthew hat.stlsoft.dot.org> writes:
Eric Beyeler's recent post has had a positive effect.

I had some good progress last night, in particular the node class is now
bristling with methods and properties corresponding to those available on
the IXMLDOMNode interface. And, yes, I did say properties. It's using my
technique for C++ Properties (described in Chapter 35 of Imperfect C++),
facilitating code such as the following:

        try
        {
            using namespace xmlstl::msxml::dom;

            IXMLDOMDocument_ptr     doc_ptr     =   DOM_load_document(::comstl::ex::bstr(xmlFile));

            document                doc(doc_ptr);

            // The following are all property invocations, i.e. they call into the
            // property methods get_text(), get_nodeName(), get_nodeValue()

            string_t                text        =   doc.text;
            string_t                nodename    =   doc.nodeName;
            variant_t               nodeValue   =   doc.nodeValue;
        }
        catch(xmlstl::msxml::dom::parse_error &x)
        {
            // ... report the parse error
        }
        catch(xmlstl::msxml::dom::exception &x)
        {
            // ... report the error
        }

I'm interested in whether anyone would like to volunteer to be a tester for
the early implementations, in a week or two? At the moment, the code's using
some Synesis libraries - this will not be the case down the line (when
XMLSTL 1.0 would be released), but I'm focusing on what's new/unknown
first - so it'd mean having a few DLLs on your system. Don't worry, there's
no spyware here. :-)

For now, I'm interested in some input on naming conventions:

1. Type names. The Synesis classes from which this stuff derives use
CamelCase naming, representative of the MSXML interfaces from which they
derive. For example, the wrapper for the IXMLDOMDocument is called
XMLDOMDocument. In the initial XMLSTL implementation I'm using nested
namespaces to express the type "family", e.g. the document class is called

  xmlstl::msxml::dom::document

and resides in the file

 #include <xmlstl/msxml/dom/document.hpp>

The other types currently implemented are:

    - node
    - named_node_map
    - parse_error

and so on.

2. Property names. The Synesis classes from which this stuff derives don't
have properties (since they were developed largely for VC++ 6 consumption). In
XMLSTL I'm currently following the naming convention of the COM properties
from the MSXML types, i.e. the node value property is called nodeValue, rather
than NodeValue or node_value.

Since the properties are method properties, i.e. they are implemented in
terms of methods on the class, the naming convention must allow for the get_
and/or set_ methods to co-exist. For example, here's an extract from
xmlstl::msxml::dom::node, for the property nodeValue:

    variant_t get_nodeValue() const;
    void  put_nodeValue(VARIANT const &);

    STLSOFT_METHOD_PROPERTY_GETSET(variant_t, variant_t, VARIANT const &,
        class_type, get_nodeValue, put_nodeValue, nodeValue);
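
For readers who have not met method properties before, the general idea can be sketched as follows. This is a minimal illustration only, not the STLSoft implementation: method_property, node_sketch and demo_property are hypothetical names, std::string stands in for variant_t, and the proxy here simply stores a pointer to its owner. The real macro is not reproduced and may well work differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical read/write "method property": a proxy member that forwards
// reads and writes to get_/put_ member functions of the owning object.
template <typename Owner, typename Value,
          Value (Owner::*Getter)() const,
          void (Owner::*Setter)(Value const&)>
class method_property
{
public:
    explicit method_property(Owner& owner) : owner_(&owner) {}

    // Reading the property calls the getter on the owner.
    operator Value() const { return (owner_->*Getter)(); }

    // Assigning a value to the property calls the setter on the owner.
    method_property& operator=(Value const& v)
    {
        (owner_->*Setter)(v);
        return *this;
    }

    // A copied proxy would still point at the original owner, so forbid copying.
    method_property(method_property const&) = delete;
    method_property& operator=(method_property const&) = delete;

private:
    Owner* owner_;
};

// Hypothetical stand-in for a node class exposing a nodeValue property.
class node_sketch
{
public:
    node_sketch() : nodeValue(*this) {}

    std::string get_nodeValue() const { return value_; }
    void put_nodeValue(std::string const& v) { value_ = v; }

    // The property shares its name with the COM-style get_/put_ pair.
    method_property<node_sketch, std::string,
                    &node_sketch::get_nodeValue,
                    &node_sketch::put_nodeValue> nodeValue;

private:
    std::string value_;
};

// Usage: property syntax and accessor methods reach the same state.
inline void demo_property()
{
    node_sketch n;
    n.nodeValue = std::string("some text");   // calls put_nodeValue()
    std::string v = n.nodeValue;              // calls get_nodeValue()
    assert(v == n.get_nodeValue());
}
```

The sketch shows why the naming question matters: the property name and the get_/put_ method names live in the same class scope, so whichever case convention is chosen for one constrains the other.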

Naturally, classes have methods too, which are also (currently) following
the MSXML naming conventions, as in:

    SynesisCom::BStr transformNode(IXMLDOMNode_ptr styleSheet);

The examples I presented in Imperfect C++ used ThisCase for properties, and
I wrote at the time that that's my preferred convention, which it is. Also,
since I've been doing a lot of .NET recently, I'm liking that case
convention for (MS)XML more as well. However, both ThisCase for properties,
and get_thisCase or get_ThisCase for property accessor methods conflict with
the STLSoft convention of this_case().

So, I guess I'm looking for input from interested parties. There are thus
four issues:

1. How are classes to be named: named_node_map or NamedNodeMap?  (the XMLDOM
bit is redundant, due to the namespace)
2. How are "normal" methods to be named: method_name(), or methodName(), or
MethodName()?
3. How are properties to be named: property_name, or propertyName, or
PropertyName?
4. How are property methods to be named: get_property_name,
get_propertyName, or get_PropertyName()?

I think property methods should follow properties, i.e. if it's PropertyName
then it should be get_PropertyName()

I _think_ - though I'm wide open to offers - that I might prefer one of the
following two schemes:

A. class_name, methodName(), PropertyName, get_PropertyName()
B. class_name, method_name(), PropertyName, get_PropertyName()

The current scheme - class_name, methodName(), propertyName,
get_propertyName() - is not too bad, but it doesn't feel quite right.

(One thing I've currently got reservations about is how the naming of methods
and properties on collections, such as named_node_map, might affect these
conventions, but I've yet to do one of those.)

So, what are your thoughts?

Cheers

Matthew
Nov 28 2005
next sibling parent reply Jan Knepper <jan smartsoft.us> writes:
Matthew,

I hope you are not trying to write your own XML parser. That has already
been done. Just do a search on Expat, which is used by quite a few major
players in the Unix market.
On top of that, there are several DOMs already out there. Apache
(www.apache.org) seems to have a very good one.

Jan

-- 
ManiaC++
Jan Knepper

But as for me and my household, we shall use Mozilla... www.mozilla.org
Nov 29 2005
parent "Matthew" <matthew hat.stlsoft.dot.org> writes:
"Jan Knepper" <jan smartsoft.us> wrote in message
news:dmiih8$2969$1 digitaldaemon.com...
 Matthew, I hope you are not trying to write your own XML parser.

Heavens no! That would be insanity.

 That
 already has been done. Just do a search on expat used by quite a few
 major players in the Unix market.
 On top of that there are several DOM's already out there. Apache
 www.apache.org seems to have a very good one.

The library, like all STLSoft sub-projects, is a wrapper for existing
functionality, with the purpose(s) of:

1. Improving ease of use. I doubt there'd be many C++ programmers who'd
contend that MSXML is easy to use.

2. Unifying the syntax between libraries. Once MSXML is done, I'll be
wrapping other libs, including Xerces (Apache) and maybe Expat (though AFAIK,
that's SAX, and this first effort is wrapping DOMC). Just this morning I'm
amending a little XML editor that I wrote a few years ago to compile and run
with XMLSTL as well as my original Synesis libs, and also with Xerces. XMLSTL
currently contains only the xmlstl::msxml::dom namespace, but I plan
xmlstl::xerces::dom, and so on.

3. STL-ifying the wrapped libraries. I've got collections such as
child_node_sequence already written, which work for nodes and attributes.

Cheers

Matthew
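
One way the "unified syntax" goal could look from client code is sketched below. It is an assumption-laden illustration, not XMLSTL itself: the xml_sketch namespaces, the XML_SKETCH_USE_XERCES macro, the xml_dom alias and describe() are all made up here, and the two node types are empty stand-ins rather than real wrappers.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for two back-end wrapper families that expose
// identically named types and members.
namespace xml_sketch {
namespace msxml { namespace dom {
    struct node
    {
        std::string get_nodeName() const { return "msxml-node"; }
    };
}}
namespace xerces { namespace dom {
    struct node
    {
        std::string get_nodeName() const { return "xerces-node"; }
    };
}}
}

// Client code picks a back end once, via a namespace alias.
#ifdef XML_SKETCH_USE_XERCES
namespace xml_dom = xml_sketch::xerces::dom;
#else
namespace xml_dom = xml_sketch::msxml::dom;
#endif

// Everything below is written once against the alias and compiles
// unchanged with either back end, because the member names match.
inline std::string describe(xml_dom::node const& n)
{
    return n.get_nodeName();
}

inline void demo_backend()
{
    xml_dom::node n;
    assert(!describe(n).empty());
}
```

This only works to the extent that the wrapper families really do share names and signatures, which is why the naming-convention decision in this thread affects every later back end.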
Nov 29 2005
prev sibling parent reply "Eric Beyeler" <ebeyeler svresearch.com> writes:
"Matthew" <matthew hat.stlsoft.dot.org> wrote in message
news:dmft3a$2vkf$1 digitaldaemon.com...
Just my thoughts, especially if you will be extending this to other libraries
and one of your goals is to "STL-ize" this functionality (as you indicate in
another post)... I am beginning to really like the naming_convention() of the
STL. Lower case and underscores.

I may be willing to test the alphas, but it depends on how much effort it
will be to get a baseline to compile (which compiler? VC6 or VC7 / 8?). I am
using MSXML4 as the parser. I won't have much time to test until January, but
should be able to get a little bit in this month.

Eric
Nov 30 2005
parent reply "Matthew" <matthew stlsoft.com> writes:
 (One thing I've currently got reservations about is how the naming 
 methods
 and properties on collections, such as named_node_map, might affect these
 conventions, but I've yet to do one of those.)

 So, what are your thoughts?

 Matthew
 Just my thoughts, especially if you will be extending this to other
 libraries and one of your goals is to "STL-ize" this functionality (as you
 indicate in another post)... I am beginning to really like the
 naming_convention() of the STL. Lower case and underscores.

I know. It's addictive, isn't it? ;-)

The thing is, I think that you/we like this simply because we're using it. I
know I started out using function_naming in C many moons ago, and then
detested MethodName when I started using C++. I then detested method_name
when I started using STL, and then detested methodName in Java and Ruby.

I think one benefit with an underscore free/limited form is that the names
stand out, which is a good thing for properties.

Perhaps, if my contention that all conventions are equally bad holds, it
might be best to follow whatever convention the W3C uses in the DOM
specification? I'll look into this ...

... they're defined in http://www.w3.org/TR/DOM-Level-2-Core/core.html as
"previousSibling", "nodeValue" for the properties, and "getNamedItem()",
"removeNamedItem()" for the methods. Both MSXML DOM and Xerces DOM already
use this naming convention, so I'm inclined to go with it (in part because
that means I don't have to change anything). The property methods will
therefore be named get_previousSibling(), put_nodeValue().

This would also mean that I'd have to change the class names, i.e.
entity_reference => EntityReference, which is a bit of work. But I think
conformance to the W3C std outweighs the STL convention, unless someone can
persuade me otherwise.
 I won't have much time to test until January,
 but should be able to get a little bit in this month.
Today I've rewritten a horrid but non-trivial app - XmlEd - such that it now
compiles with three configurations: with Synesis XML libraries; with XMLSTL
(MSXML DOM); with Xerces. That's proved a fair amount of the MSXML DOM
wrapping, although it's by no means full coverage. I've also just written a
small and relatively simple program that reads in an XML file and outputs
nodes and attributes.

I'm going to have to move off XMLSTL for a while soon, so maybe the best
thing is to do a couple more simple tests over the next few days, and then
release an alpha lib for people to play with.
 I may be willing to test the alphas, but it depends on how much effort it
 will be to get a baseline to compile (which compiler? VC6 or VC7 / 8?)  I am
 using MSXML4 as the parser.

Regarding the compiler support, the properties are enabled for VC++7.1 and
other compilers that support my C++ Properties technique. For other
compilers, the properties are not enabled, and therefore client code would
have to use the property methods. Hence, the following two forms are
equivalent:

    xmlstl::msxml::dom::node n = . . .

    // VC++ 6.0 and equivalent
    string_t name = n.get_nodeName();

    // VC++ 7.1 and equivalent
    string_t name = n.nodeName;

The latter (.nodeName) simply invokes the former (.get_nodeName()).

Cheers

Matthew
Nov 30 2005
parent reply "Eric Beyeler" <ebeyeler svresearch.com> writes:
"Matthew" <matthew stlsoft.com> wrote in message
news:dmkc8j$n0g$1 digitaldaemon.com...
 This would also mean that I'd have to change the class names, i.e.
 entity_reference => EntityReference, which is a bit of work. But I think
 conformance to the W3C std outweighs the STL convention, unless someone can
 persuade me otherwise.

Naming convention isn't a huge deal for me, as long as it's consistent.
 Regarding the compiler support, the properties are enabled for VC++7.1 and
 other compilers that support my C++ Properties technique. For other
 compilers, the properties are not enabled, and therefore client code would
 have to use the property methods. Hence, the following code is equivalent:

     xmlstl::msxml::dom::node n = . . .

     // VC++ 6.0 and equivalent
     string_t name = n.get_nodeName();

     // VC++ 7.1 and equivalent
     string_t name = n.nodeName;

 The latter (.nodeName) simply invokes the former (.get_nodeName()).

Cool. One thing I will comment, though. You said that your node class has
methods / properties corresponding to the IXMLDOMNode interface. How will
implementing other parsers affect the xmlstl interface? In particular, one of
the reasons I am looking for a wrapper is that the interface for iteration is
cumbersome. How will that be presented?

Eric
Nov 30 2005
parent reply "Matthew" <matthew hat.stlsoft.dot.org> writes:
 naming convention isn't a huge deal for me, as long as it's consistent.

Cool. I think applying the Principle of Least Surprise in this case will be
helpful to the library's acceptance.

 One thing I will comment, though. You said that your node class has methods
 / properties corresponding to the IXMLDOMNode interface.  How will
 implementing other parsers affect the xmlstl interface?

Hopefully they should all be the same, or as near as. This remains to be
seen.

 In particular, one of the reasons I am looking for a wrapper is that the
 interface for iteration is cumbersome. How will that be presented?
Excellent point.

At the moment, I've got the xmlstl::msxml::dom::* classes just using the (get_)length and (get_)item, as in the specified interface for NodeList:

    interface NodeList {
        Node item(in unsigned long index);
        readonly attribute unsigned long length;
    };

So, to enumerate the child nodes of a node n, we currently have two options:

1. We can use the "childNodes" property on the node to obtain the node_list, and then use the "length" and "item" properties on the node_list instance. This is the W3C recommended practice. Here's an extract from the dom_node_print program I wrote yesterday (that also does attributes, and comments).

    static void dump_node(int depth, xmlstl::msxml::dom::node const &n)
    {
        stlsoft::simple_wstring  prefix(depth, ' ');

        wcout << prefix << L"<" << n.nodeName;

        xmlstl::msxml::dom::node_list childNodes = n.childNodes;
        { for(size_t i = 0; i < childNodes.length; ++i)
        {
            dump_node(1 + depth, childNodes[i]);
            // or: dump_node(1 + depth, childNodes.get_item(i));
            // but, since I've not yet done parameterised properties, there's not:
            //     dump_node(1 + depth, childNodes.item[i]);
        }}

        wcout << prefix << L"</" << n.nodeName << L">" << endl;
    }

I think that's neat enough, although there's a subtlety regarding the fact that the item returned by the subscript operator (or get_item()) is not a node, but rather an instance of IXMLDOMNode_ptr, which is stlsoft::ref_ptr<IXMLDOMNode>. That type is convertible to xmlstl::msxml::dom::node, however, which is why the call to dump_node is well-formed. But the following code would not be well-formed, since stlsoft::ref_ptr<IXMLDOMNode> does not have a property "nodeName":

    childNodes[i].nodeName;

At the moment, I've a number of ideas about this, but I need to think about it some more.

2. We can pass the node into an instance of a class I wrote several years ago - child_node_sequence - which would afford the following use pattern:

    static void dump_node(int depth, xmlstl::msxml::dom::node const &n)
    {
        stlsoft::simple_wstring  prefix(depth, ' ');

        wcout << prefix << L"<" << n.nodeName;

        xmlstl::msxml::dom::child_node_sequence children(xmlstl::get_ref(n));
        { for(xmlstl::msxml::dom::child_node_sequence::iterator begin = children.begin(); begin != children.end(); ++begin)
        {
            dump_node(1 + depth, *begin);
        }}

        wcout << prefix << L"</" << n.nodeName << L">" << endl;
    }

The child_node_sequence class and the node class currently have no relationship. They exchange instances of IXMLDOMNode_ptr. (This is the Ref element of the Handle::Ref pattern, which I'm busily writing up at long last at the moment.) The ugly use of the xmlstl::get_ref() shim is needed only until I build that into child_node_sequence.

3. I've several ideas about how to make things a little more succinct:

a. Have node_list present begin() and end() methods, i.e. turn it into an STL sequence. And the same for named_node_map.
b. Have the node class present enum_children(), enum_attributes(), which would return instances of child_node_sequence and attribute_sequence (which is another sequence class from the early XMLSTL attempts some time back).
c. Have the node class present child_begin(), child_end(), attr_begin(), attr_end().
d. Something else I've not yet thought of ...

At the moment, I'm leaning towards (a). For both (a) and (b), however, I'll have to ensure that iterators from disparate sequence instances are compatible, i.e. the "state" they hold will have to pertain to the underlying DOM instances (for MSXML this is IXMLDOMNodeList* and IXMLDOMNamedNodeMap*).

Whatever the end result, the implementation should be relatively straightforward. It's deciding on the interface that's the key. I think that will be informed by other people using the current lib, and by my mapping other XML libs, such as Xerces.

I'll try and get an alpha out in the next few days, after I've written another test program. It'll be a lot easier to talk about once people can get their hands on some code.

Cheers

Matthew
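
As a rough illustration of option (a) above, here is a minimal sketch of how a list exposing only length and item could be given begin() and end(). Everything in it is an assumption made for illustration: fake_node_list stands in for the real DOM node list, and node_list_iterator, node_list_sequence and join_names are hypothetical names, not part of XMLSTL. The iterator's state is the underlying list plus an index, which is one way to keep iterators from separate sequence instances comparable.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM NodeList: only length() and item().
struct fake_node_list
{
    std::vector<std::string> names;

    std::size_t length() const { return names.size(); }
    std::string const& item(std::size_t index) const { return names[index]; }
};

// Hypothetical iterator whose state is (underlying list, index), not the
// wrapper that created it.
class node_list_iterator
{
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef std::string               value_type;
    typedef std::ptrdiff_t            difference_type;
    typedef std::string const*        pointer;
    typedef std::string const&        reference;

    node_list_iterator(fake_node_list const* list, std::size_t index)
        : m_list(list), m_index(index)
    {}

    reference operator *() const { return m_list->item(m_index); }
    pointer operator ->() const { return &m_list->item(m_index); }

    node_list_iterator& operator ++() { ++m_index; return *this; }
    node_list_iterator operator ++(int)
    {
        node_list_iterator previous(*this);
        ++m_index;
        return previous;
    }

    friend bool operator ==(node_list_iterator const& lhs, node_list_iterator const& rhs)
    {
        return lhs.m_list == rhs.m_list && lhs.m_index == rhs.m_index;
    }
    friend bool operator !=(node_list_iterator const& lhs, node_list_iterator const& rhs)
    {
        return !(lhs == rhs);
    }

private:
    fake_node_list const* m_list;
    std::size_t           m_index;
};

// Hypothetical sequence wrapper presenting begin()/end() over the list.
class node_list_sequence
{
public:
    typedef node_list_iterator const_iterator;

    explicit node_list_sequence(fake_node_list const& list) : m_list(&list) {}

    const_iterator begin() const { return const_iterator(m_list, 0); }
    const_iterator end() const { return const_iterator(m_list, m_list->length()); }

private:
    fake_node_list const* m_list;
};

// Example use: join the item names with commas via STL-style iteration.
std::string join_names(fake_node_list const& list)
{
    node_list_sequence seq(list);
    std::string result;

    for (node_list_sequence::const_iterator it = seq.begin(); it != seq.end(); ++it)
    {
        if (!result.empty())
        {
            result += ',';
        }
        result += *it;
    }
    return result;
}
```

Because equality compares the underlying list pointer and the index, two node_list_sequence instances wrapping the same list hand out interchangeable iterators. A real MSXML version would presumably hold the IXMLDOMNodeList pointer in the same role.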
Nov 30 2005
parent "Eric Beyeler" <ebeyeler svresearch.com> writes:
 In particular, one
 of the reasons I am looking for a wrapper is that the interface for
 iteration is cumbersome. How will that be presented?
 Excellent point.

 At the moment, I've got the xmlstl::msxml::dom::* classes just using the
 (get_)length and (get_)item, as in the specified interface for NodeList:

     interface NodeList {
         Node item(in unsigned long index);
         readonly attribute unsigned long length;
     };

 So, to enumerate the child nodes of a node n, we currently have two
 options:

 1. We can use the "childNodes" property on the node to obtain the
 node_list, and then use the "length" and "item" properties on the
 node_list instance. This is the W3C recommended practice. Here's an
 extract from the dom_node_print program I wrote yesterday (that also does
 attributes, and comments).


 2. We can pass the node into an instance of a class I wrote several years
 ago - child_node_sequence - which would afford the following use pattern:

     static void dump_node(int depth, xmlstl::msxml::dom::node const &n)
     {
         stlsoft::simple_wstring  prefix(depth, ' ');

         wcout << prefix << L"<" << n.nodeName;

         xmlstl::msxml::dom::child_node_sequence children(xmlstl::get_ref(n));
         { for(xmlstl::msxml::dom::child_node_sequence::iterator begin = children.begin(); begin != children.end(); ++begin)
         {
             dump_node(1 + depth, *begin);
         }}


         wcout << prefix << L"</" << n.nodeName << L">" << endl;
     }

 The child_node_sequence class and the node class currently have no
 relationship. They exchange instances of IXMLDOMNode_ptr. (This is the Ref
 element of the Handle::Ref pattern, which I'm busily writing up at long
 last at the moment.) The ugly use of the xmlstl::get_ref() shim is needed
 only until I build that into child_node_sequence.

 3. I've several ideas about how to make things a little more succinct

 a. Have node_list present begin() and end() methods, i.e. turn it into an
 STL sequence. And the same for named_node_map
 b. Have the node class present enum_children(), enum_attributes(), which
 would return instances of child_node_sequence and attribute_sequence
 (which is another sequence class from the early XMLSTL attempts some time
 back)
 c. Have the node class present child_begin(), child_end(), attr_begin(),
 attr_end()
 d. Something else I've not yet thought of ...

 At the moment, I'm leaning towards (a). For both (a) and (b), however,
 I'll have to ensure that iterators from disparate sequence instances are
 compatible, i.e. the "state" they hold will have to pertain to the
 underlying DOM instances (for MSXML this is IXMLDOMNodeList* and
 IXMLDOMNamedNodeMap*)
I was thinking along the lines of 3-c or 3-b. That's what I'm looking for - STL-style iteration through the children. It wouldn't hurt to have 3-a as well.

Another method of iteration people may want is a depth-first iteration of all subelements of a node, not just its immediate children. Don't know how that would factor in, but that shouldn't affect the decisions about the shallow child iteration.

Eric
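
For what the depth-first iteration mentioned above might look like, here is a small sketch over a stand-in tree. It assumes nothing about XMLSTL or MSXML: fake_node and descendants_depth_first are hypothetical names used only for illustration, and the traversal is plain pre-order using an explicit stack, which is roughly the state a depth-first iterator would have to carry.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node: a name plus its immediate children.
struct fake_node
{
    std::string            name;
    std::vector<fake_node> children;
};

// Returns the names of all descendants of root (not root itself) in
// depth-first, pre-order sequence.
std::vector<std::string> descendants_depth_first(fake_node const& root)
{
    std::vector<std::string>      visited;
    std::vector<fake_node const*> pending; // explicit stack of nodes still to visit

    // Push children in reverse so that the first child is popped first.
    for (std::size_t i = root.children.size(); i != 0; --i)
    {
        pending.push_back(&root.children[i - 1]);
    }

    while (!pending.empty())
    {
        fake_node const* current = pending.back();
        pending.pop_back();

        visited.push_back(current->name);

        for (std::size_t i = current->children.size(); i != 0; --i)
        {
            pending.push_back(&current->children[i - 1]);
        }
    }

    return visited;
}
```

The shallow child iteration stays untouched by this: each step only needs a node's immediate children, so a deep traversal could be layered on top of whichever of 3-a, 3-b or 3-c is chosen.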
Dec 01 2005