Digital Mars - c++ - Open-RJ 1.3.1 released

"Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:

This is a minor update, primarily comprising the addition of record
comment information to the API. (Previously, the comments used on
records within Open-RJ database files were not available at the
programmatic level.)

This release does not contain the write functionality that is
currently underway in collaboration with Lars Ivar Igesund.

Download from http://openrj.org/ (redirects to SourceForge)

Details:

================================================================================
23rd May 2005 : Open-RJ 1.2.1 => 1.3.1
--------------------------------------------------------------------------------

The main change is that record comments are now made available in the API, in
the form of the 'comment' member of the ORJRecordA structure, and the
ORJ_Record_GetCommentA() function.

There is also a new auto-link header, which causes compilers that support such
behaviour to insert comment records that direct their linker to link to the
appropriate Open-RJ library, e.g. openrj.vc71.mt.debug.lib, without requiring
explicit specification in the linker command. Some developers prefer this kind
of automatic behaviour, and now have this facility if they so wish just by
including openrj_implicit_link.h.

Other changes are as follows:

 - c_str_ptr() shims are added for ORJStringA. (See chapter 19 of Imperfect C++
   (http://imperfectcplusplus.com/) for an explanation of the Shims concept.)
 - c_str_data() and c_str_len() shims are provided in addition to c_str_ptr().
   These are of general use, but are particularly useful with STLSoft's
   basic_string_view template.

Changes to the mappings are as follows:

 C++ mapping:
  - the headers are separated out, one per class. They are all fully configured,
    i.e. including openrj/cpp/filedatabase.hpp includes openrj/cpp/record.hpp
    and openrj/cpp/field.hpp
  - addition of (in-)equality operators (operator == and operator !=) for Field
    class
  - addition of String typedef within openrj::cpp namespace, which is actually
    a typedef for stlsoft::basic_string_view<char> (for providing views onto
    the string information within database fields).
  - addition of Record::GetComment(), Record::HasField(char const *name),
    Record::HasField(char const *name, char const *value),
    Record::HasFieldWithValue(char const *value) methods
  - Record::operator []() overloads taking string objects now return String
    rather than ORJString

 STL mapping:
  - addition of record::comment() method

 Python mapping:
  - addition of comment property to record type

 Ruby mapping:
  - addition of comment property to record type

================================================================================
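
For readers unfamiliar with the shims mentioned in the release notes, here is a minimal sketch of the idea. It assumes a hypothetical length-plus-pointer struct (DemoString) standing in for ORJStringA; the overloads and the to_std_string() helper are illustrative only and are not the Open-RJ or STLSoft declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C string-view struct; NOT the real ORJStringA.
// The characters it refers to need not be NUL-terminated.
struct DemoString
{
    std::size_t len;
    char const* ptr;
};

// Access shims: one overloaded name per operation, one overload per string type.
inline char const* c_str_data(char const* s)        { return s; }
inline std::size_t c_str_len(char const* s)         { return std::strlen(s); }

inline char const* c_str_data(std::string const& s) { return s.data(); }
inline std::size_t c_str_len(std::string const& s)  { return s.size(); }

inline char const* c_str_data(DemoString const& s)  { return s.ptr; }
inline std::size_t c_str_len(DemoString const& s)   { return s.len; }

// Generic code written against the shims accepts any of the types above.
template <typename S>
std::string to_std_string(S const& s)
{
    char const* const p = c_str_data(s);
    assert(p != nullptr);
    return std::string(p, c_str_len(s));
}
```

Because the data/length pair does not rely on a terminating NUL, it suits view-like types that refer to a slice of a larger buffer.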

May 22 2005

"Rajiv Bhagwat" <dataflow vsnl.com> writes:

Matthew,
Open-RJ seems to be such a simple spec. I wonder why the code to access it
has to be so big in size and not so evidently clear to look at. This is a
simpler spec for a simple database (actually not even that, just a table) so
I was expecting much simpler code.

Obviously the use of templates means it is not targeted to 'c' users. To
me, the simpler use would be:
<pre>
#include "oneheader.h"
// Table taken from the original sample ..
const char *str = "%% Sample Open-RJ database - Cats and Dogs\n"
"%% Created:   28th September 2004\n"
"%% Updated:   29th September 2004\n"
"Name:  Barney\n"
"Species: Dog\n"
"%%\n"
...

"Name:  Sparky\n"
"Species: Cat\n"
"%%\n";

// Emit the table contents as specified by 'str'
void emit(const char *str){
    Table   table;
    Record  rec;
    Field   fld;

    table.read_from_memory(str);
    cout << table.comment();
    foreach(table){                     // Loop over the records of the table
        rec = table.current();          // grab the 'current' record
        foreach(rec){                   // Loop over the fields
            fld = rec.current();        // grab the 'current' field
            cout << fld.name() << "= '" << fld.value() << "'" << endl; // use!
            }
        cout << "%% " << rec.comment() << endl;
        }
    }

int main(){
    try {
        emit(str);                      // from memory
        }
    catch (const std::exception & e){
        cout << e.what() << endl;
        return 1;
        }
    return 0;
    }

</pre>
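
A minimal sketch of the kind of parser that could sit behind such a one-header interface. This is neither Rajiv's implementation (which was not posted) nor the Open-RJ parser. Field, Record, trim() and parse_records() are hypothetical names, and the sketch assumes that a line starting with "%%" ends the record in progress, that any text after "%%" is a comment attached to the following record, and that there are no line continuations.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Field  { std::string name; std::string value; };
struct Record { std::string comment; std::vector<Field> fields; };

// Strips leading and trailing blanks, tabs and carriage returns.
inline std::string trim(std::string const& s)
{
    std::size_t const first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos)
        return std::string();
    std::size_t const last = s.find_last_not_of(" \t\r");
    assert(last >= first);
    return s.substr(first, last - first + 1);
}

// Splits 'text' into records of "name: value" fields separated by "%%" lines.
inline std::vector<Record> parse_records(std::string const& text)
{
    std::vector<Record> records;
    Record              current;
    std::istringstream  in(text);
    std::string         line;

    while (std::getline(in, line))
    {
        if (line.compare(0, 2, "%%") == 0)
        {
            // A separator closes the record in progress, if it has any fields.
            if (!current.fields.empty())
            {
                records.push_back(current);
                current = Record();
            }
            // Assumption: comment text belongs to the record that follows.
            std::string const comment = trim(line.substr(2));
            if (!comment.empty())
            {
                if (!current.comment.empty())
                    current.comment += '\n';
                current.comment += comment;
            }
            continue;
        }
        std::size_t const colon = line.find(':');
        if (colon == std::string::npos)
            continue;                       // blank or malformed line: skipped
        current.fields.push_back(
            Field{trim(line.substr(0, colon)), trim(line.substr(colon + 1))});
    }
    if (!current.fields.empty())
        records.push_back(current);         // final record without a separator
    return records;
}
```

The result is a plain std::vector, so it can be walked with ordinary begin()/end() iteration without any further library support.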

Any comments? Am I missing something deeper? Is there a place in Open-RJ for
such a 'simpler', essentially 1 header version?
- Rajiv
PS: Yet to grab your book, but liked your DDJ articles very much. That's why
it is painful to see complex code coming from you..


May 23 2005

"Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:

"Rajiv Bhagwat" <dataflow vsnl.com> wrote in message 
news:d6stkt$1ee6$1 digitaldaemon.com...
 Matthew,
 OpenRJ seems to be such a simple spec.


It is

 I wonder why the code to access it
 has to be so big in size and not so evidently clear to look at.


What's big?

The source files are small

23/05/2005  10:52:26 AM  R----A--      38949  H:\freelibs\openrj\1.3.x\src\orjapi.c
18/02/2005  02:04:51 PM  R----A--       3319  H:\freelibs\openrj\1.3.x\src\orjmem.c
11/04/2005  11:12:53 PM  R----A--       6223  H:\freelibs\openrj\1.3.x\src\orjstr.c
23/05/2005  12:29:20 PM  R----A--      30756  H:\freelibs\openrj\1.3.x\include\openrj\openrj.h
08/04/2005  04:34:24 PM  R----A--       3285  H:\freelibs\openrj\1.3.x\include\openrj\openrj_assert.h
08/04/2005  04:47:59 PM  R----A--       5543  H:\freelibs\openrj\1.3.x\include\openrj\openrj_implicit_link.h
08/04/2005  04:34:30 PM  R----A--       3399  H:\freelibs\openrj\1.3.x\include\openrj\openrj_memory.h

Effectively everything is declared within openrj.h, and implemented 
within orjapi.c.

It compiles to ~10K.

 This is a
 simpler spec for a simple database (actually not even that, just a table) so
 I was expecting much simpler code.


Maybe you're mistaking the extent of the code for the other 
mappings - C++, Ch, D, .NET, Python, Ruby, STL, etc. - with the base 
API?

 Obviously the use of templates means it is not targeted to 'c' users. To
 me, the simpler use would be:
 <pre>
 #include "oneheader.h"
 // Table taken from the original sample ..
 const char *str = "%% Sample Open-RJ database - Cats and Dogs\n"
 "%% Created:   28th September 2004\n"
 "%% Updated:   29th September 2004\n"
 "Name:  Barney\n"
 "Species: Dog\n"
 "%%\n"
 ...

 "Name:  Sparky\n"
 "Species: Cat\n"
 "%%\n";

 // Emit the table contents as specified by 'str'
 void emit(const char *str){
     Table   table;
     Record  rec;
     Field   fld;

     table.read_from_memory(str);
     cout << table.comment();
     foreach(table){                     // Loop over the records of the table
         rec = table.current();          // grab the 'current' record
         foreach(rec){                   // Loop over the fields
             fld = rec.current();        // grab the 'current' field
             cout << fld.name() << "= '" << fld.value() << "'" << endl; // use!
             }
         cout << "%% " << rec.comment() << endl;
         }
     }

 int main(){
     try {
         emit(str);                      // from memory
         }
     catch (const std::exception & e){
         cout << e.what() << endl;
         return 1;
         }
     return 0;
     }

 </pre>

 Any comments? Am I missing something deeper? Is there a place in Open-RJ for
 such a 'simpler', essentially 1 header version?


I don't really get your point. It pretty much does that.

I've attached a fully functioning program adapted from your example 
code. The only Open-RJ #include it requires is

    #include <openrj/stl/database.hpp>

Can you explain where the program fails to satisfy your (and my!) 
desire for simplicity?

 - Rajiv
 PS: Yet to grab your book, but liked your DDJ articles very much.


Thanks.

 Thats why
 it is painful to see complex code coming from you..


Ok. I guess I see complexity as something different. I'm keen to 
hear more, though, about what/why you think it's complex.

Cheers

Matthew

May 23 2005

"Rajiv Bhagwat" <dataflow vsnl.com> writes:

"Matthew" <admin stlsoft.dot.dot.dot.dot.org> wrote in message
news:d6tl85$27ok$1 digitaldaemon.com...
 The previous one used the STL mapping. Here's one using the C++
 mapping.

 I'm not being a smart-arse. I really don't see where the complexity
 is, and I'm keen to understand your position.


Matthew, I have a lot of respect for the way you have analysed things,
that's why I dared to point out the simplicity issues to you. With others, I
would just let it pass.

No, I am not confusing with the other language mappings: I am talking about
the basic code.
The code which I sent you compiles as is and runs (with an alternate
implementation -  before I put my foot in the mouth, I had to come up with
the so called 'simpler' solution).

I am pointing out several things which need attention:
(Btw: this could be a wrong newsgroup for such analysis, but I believe most
of the readers would benefit. I have learned a lot of things from such
comments.) I am outlining the thought train when people look for the use of
third party source:

1. When I saw something like:
static ORJRC    ORJ_ExpandBlockAndParseA_(  ORJDatabaseA                *db
                                        ,   const size_t                cbDbStruct
                                        ,   size_t                      cbData
                                        ,   const size_t                cbAll
                                        ,   unsigned                    flags
                                        ,   IORJAllocator               *ator
                                        ,   ORJDatabaseA const          **pdatabase
                                        ,   ORJError                    *error
                                        ,   size_t                      size);

the first thought that crossed my mind was: 'ohmygosh! what's this?'. I would
have reservations about using such functions, however proven, along with my
code. The mere length of the function makes it non-obvious.

Do we need all such allocations and reallocations? Can't we really use
simpler STL constructs? After all, we want to handle a small table with less
than 100 records, anything more and we will use SQLite or MySQL.

2. Ok, I said, maybe all that is well tested. How to use it? In the test
directory, the C program is 450 lines, the C++ one is 475, STL more than
500. Not counting the test table of around 25 lines, it is still quite a
bit. The examples have to be much smaller, skimpier. If found suitable at
first sight, the programmers won't mind going through the code & the
documentation (in that order!) to see how to adapt it to more complex use.

Do we have to include all of this in every program? I wondered.
#include <openrj/cpp/openrj.hpp>
#include <openrj/cpp/field.hpp>
#include <openrj/cpp/record.hpp>
#include <openrj/cpp/database.hpp>

..
using std::cerr;
using std::endl;
using std::cout;
#if !defined(ORJ_NO_NAMESPACE)
using openrj::ORJRC;
#endif /* !ORJ_NO_NAMESPACE */
using openrj::cpp::FileDatabase;
using openrj::cpp::MemoryDatabase;
using openrj::cpp::DatabaseException;
using openrj::cpp::Field;
using openrj::cpp::Record;

That's why my comment:
#include "oneheader.h"
and it should handle all the required includes and namespaces. Your later
examples do use a single header, but we, the potential users of
the library, would look at the provided test programs. A simple, quick one
and another more exhaustive one for deeper understanding would help.

If the library uses STL, why have a C program? Initially it gave the
impression that C only implementations can use this library.

3. The test code uses something like 'catch(DatabaseException &x)' - why
can't these people use the standard std::exceptions ? Is this one derived
from that? Does it mean I have to handle this AND the standard exceptions?
No immediate answers. What if I am writing a throw-away program and don't
want all the exception handling?

On top of this, this guy keeps on confusing Databases with Tables
everywhere... Unless the spec will be enhanced, why is he calling a Table as
a Database? Aren't fields, records, tables and databases clear to everyone??
Still stuck with DBF?

Oh - later on there seems to be a 'real' main, and it shows std::exception
and catch(...). Is he expecting other exceptions?
Matthew, we look for simpler 'demo' programs...

4. What parts are 'Wizard-generated' for this program?? Would that not be a
real, usable demo program, if it uses OpenRJ?
And where is 'execute_unittest();' - it's not in this file.. Will this
compile then?
What's a 'goto' doing here??? Do I trust this code? Are there any in the
base library?


5. I see the use of:
    memory_database db(contents);

    for(memory_database::c......
That means there must be a separate class for 'file_database'... Why two? Is
there a difference in the interfaces of the two classes? What if one is
amended and the other one is not? Also, in case I have to switch over from
one to the other, do I need to edit every use? After all, the name is used in
every 'for' ...

6. Matthew had a great article about iterating over arrays & containers in
many languages.. (that's all I remember now!) but why is he still using:
    memory_database db(contents);
    for(memory_database::const_iterator ri = db.begin(); ri != db.end(); ++ri) // Loop over the records

followed by - (*ri).comment()
Yes, with a lot of code again and again using 'begin' and 'end' it looks
'natural' - but how come he is not tired of typing the whole thing again and
again and inventing 'ri' and 'fi', i1 and i2 and so on... How come he hasn't
stumbled upon simpler 'foreach' which hides these things? (Does he love
typing? Not worried about Carpal Tunnel?)
foreach == begin..end construct
current= (*ri)


Etc, etc. This was a typical line of thinking. I have also typed much more
than required, surely, sorry. Am not looking for real, line by line answers, got
them (it compiling to around 10k was also not obvious, it is nice to know.)
This was wondering aloud. My only expectation is that a good, prolific programmer
like you should keep much simpler users in mind. We all will benefit
from slimmer software with simpler examples.

Thanks and regards,
- Rajiv

PS: Do I leave the implementation of the simpler version to diligent
readers? My Field and Record classes are almost empty, the Table class has 3
routines to speak of, 2 of which handle using the memory and the file. All
within one header. Read the example earlier, it compiles and runs.


May 24 2005

"Matthew" <admin.hat stlsoft.dot.org> writes:

Wow! A lot to think about, and a lot of very useful feedback that's already got
me thinking.

Even though you say at the end you don't expect it, I have answered most
points, partly to show you where your feedback has really got me thinking, and
probably changing some of my ways of doing things.

 No, I am not confusing with the other language mappings: I am talking about
 the basic code.
 The code which I sent you compiles as is and runs (with an alternate
 implementation -  before I put my foot in the mouth, I had to come up with
 the so called 'simpler' solution).

 I am pointing out several things which need attention:
 (Btw: this could be a wrong newsgroup for such analysis, but I believe most
 of the readers would benefit. I have learned a lot of things from such
 comments.) I am outlining the thought train when people look for the use of
 third party source:

 1. When I saw something like:
 static ORJRC    ORJ_ExpandBlockAndParseA_(  ORJDatabaseA                *db
                                         ,   const size_t                cbDbStruct
                                         ,   size_t                      cbData
                                         ,   const size_t                cbAll
                                         ,   unsigned                    flags
                                         ,   IORJAllocator               *ator
                                         ,   ORJDatabaseA const          **pdatabase
                                         ,   ORJError                    *error
                                         ,   size_t                      size);

 the first thought that crossed the mind: 'ohmygosh! whats this?'. I would
 have reservations about using such functions, however proven, along with my
 code. The mere length of the function makes it non-obvious.

 Do we need all such allocations and reallocations? Can't we really use
 simpler STL constructs? After all, we want to handle a small table with less
 than 100 records, anything more and we will use Sqlite or Mysql.


First, although I like STL and use it a lot, I think libraries benefit from
being written in C wherever possible and practicable. Since Open-RJ is a pretty
straightforward (or I thought it was! <g>) thing, and small, I thought a C
implementation appropriate.

As to that function, it's a worker function that's only used inside the
implementation. It entered the codebase when I added support for memory
databases along with file databases. I preferred to keep as much code as
possible common between the parsing of the memory and file databases for
reasons of maintainability.

The reason that it does reallocation is that I am pathologically opposed to
inefficiency, and so Open-RJ only makes two allocations, resulting in the one
block. Naturally I accept that this is likely unnecessary on performance
grounds for Open-RJ, but it does have the nice side-effect that closing a
database is simply a case of one call to free().
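
A minimal sketch of the single-block idea described above, not the Open-RJ layout. Block, Entry, make_block() and free_block() are hypothetical names: the total size is computed first, one allocation is made, and the header, the entry array and the character data are all carved out of it, so that releasing everything is one call to free().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>
#include <vector>

struct Entry { char const* text; std::size_t len; };

struct Block
{
    std::size_t count;
    Entry*      entries;    // points into the same allocation as the header
};

static_assert(alignof(Block) >= alignof(Entry), "entries must stay aligned");
static_assert(sizeof(Block) % alignof(Entry) == 0, "entries must stay aligned");

// Builds header, entry array and character data in ONE malloc'd block.
inline Block* make_block(std::vector<std::string> const& items)
{
    std::size_t chars = 0;
    for (std::size_t i = 0; i != items.size(); ++i)
        chars += items[i].size() + 1;               // +1 for the NUL

    std::size_t const total =
        sizeof(Block) + items.size() * sizeof(Entry) + chars;
    void* const raw = std::malloc(total);
    if (raw == nullptr)
        throw std::bad_alloc();

    unsigned char* const base      = static_cast<unsigned char*>(raw);
    unsigned char* const entry_mem = base + sizeof(Block);
    char* cursor =
        reinterpret_cast<char*>(entry_mem + items.size() * sizeof(Entry));

    Block* const block = new (base) Block();
    block->count   = items.size();
    block->entries = reinterpret_cast<Entry*>(entry_mem);

    for (std::size_t i = 0; i != items.size(); ++i)
    {
        std::memcpy(cursor, items[i].c_str(), items[i].size() + 1);
        new (entry_mem + i * sizeof(Entry)) Entry{cursor, items[i].size()};
        cursor += items[i].size() + 1;
    }
    assert(cursor == reinterpret_cast<char*>(base) + total);
    return block;
}

// Everything lives in one block, so cleanup is a single free().
inline void free_block(Block* block)
{
    std::free(block);
}
```

The cost of this layout is the up-front size calculation; the benefit is that no per-record or per-field cleanup is needed.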

The reason that an allocator may be specified is that I am also a big
flexibility fan. Again, this might be largely moot, but when one's writing an
open-source library that will be mapped to many languages, as few assumptions
as possible is the best.

In summary, I've good reasons for the decisions made, although I'm _not_ saying
that they're necessarily optimal.

 2. Ok, I said, maybe all that is well tested. How to use it? In the test
 directory, the C program is 450 lines, the c++ one is 475, STL more than
 500. Not counting the test table of around 25 lines, it is still quite a
 bit. The examples have to be much smaller, skimpier. If found suitable at
 the first sight, the programmers won't mind going thru the code & the
 documentation (in that order!) to see how to adopt it to more complex use.

 Do we have to include this one every program? I wondered.
 #include <openrj/cpp/openrj.hpp>
 #include <openrj/cpp/field.hpp>
 #include <openrj/cpp/record.hpp>
 #include <openrj/cpp/database.hpp>

 ..
 using std::cerr;
 using std::endl;
 using std::cout;
 #if !defined(ORJ_NO_NAMESPACE)
 using openrj::ORJRC;
 #endif /* !ORJ_NO_NAMESPACE */
 using openrj::cpp::FileDatabase;
 using openrj::cpp::MemoryDatabase;
 using openrj::cpp::DatabaseException;
 using openrj::cpp::Field;
 using openrj::cpp::Record;

 Thats why my comment:
 #include "oneheader.h"
 and it should handle all the required includes and namespaces. Your later
 examples do have the use of the single header, but we the potential users of
 the library, would look at the provided test programs. A simple, quick one
 and another more exhaustive one for deeper understanding would help.


You're 100% correct on this.

I avoid using directives like the plague, and I believe there are very good
reasons for doing so in production code, but I can (now) see that for
pedagogical purposes it's not exactly helpful. I'm so chagrined I'm changing
this right now! :-)

 If the library uses STL, why have a C program? Initially it gave the
 impression that C only implementations can use this library.


Here I disagree/diverge. The library is C, and has a C API. And it has mappings
to other languages. So it's appropriate to have a C test program, and to have
test programs in other languages.

Nonetheless, I'm keen to hear how the (mis-)impression is given. If you can
spare further advice on this issue, I'll definitely listen.

 3. The test code uses something like 'catch(DatabaseException &x)' - why
 can't these people use the standard std::exceptions ? Is this one derived
 from that? Does it mean I have to handle this AND the standard exceptions?
 No immediate answers. What if I am writing a throw-away program and don't
 want all the exception handling?


Fair enough. I guess I wanted to show the full functionality. I don't think
this is black-and-white, so maybe the answer is to have separate examples - a
minimal example and a full-functionality example?
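
On the question of handling a library exception AND the standard ones: the thread does not say whether DatabaseException derives from std::exception. The sketch below only shows the general technique, using a hypothetical demo_database_error: when a library exception derives from a standard one, a minimal client needs a single catch clause, and a fuller example can still catch the specific type first.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical library exception; NOT Open-RJ's DatabaseException.
class demo_database_error : public std::runtime_error
{
public:
    demo_database_error(std::string const& message, int code)
        : std::runtime_error(message)
        , code_(code)
    {
        assert(code != 0);  // in this sketch, 0 is reserved for "no error"
    }
    int code() const { return code_; }
private:
    int code_;
};

// Hypothetical loader that rejects empty input.
inline void open_demo_database(std::string const& contents)
{
    if (contents.empty())
        throw demo_database_error("no records", 1);
}

// Minimal client: one catch clause covers library and standard exceptions.
inline std::string minimal_client(std::string const& contents)
{
    try
    {
        open_demo_database(contents);
        return "ok";
    }
    catch (std::exception const& e)
    {
        return std::string("error: ") + e.what();
    }
}
```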

 On top of this, this guy keeps on confusing Databases with Tables
 everywhere... Unless the spec will be enhanced, why is he calling a Table as
 a Database? Aren't fields, records, tables and databases clear to everyone??
 Still stuck with DBF?


The records in a database don't have to have the same structure, so I don't
think it's reasonable to call it a table (which would imply uniformity of
structure).

Convinced? (Maybe I should have documented that somewhere ...)

 Oh - later on there seems to be a 'real' main, and it shows std::exception
 and catch(...). Is he expecting other exceptions?
 Matthew, we look for simpler 'demo' programs...


Again, the separation of main_() and main() is an artefact of real
applications, rather than something helpful in an example program, that's crept
in as a result of habit. Maybe I'll trim that out.

 4. What parts are 'Wizard-generated' for this program??


The skeleton. Not useful information for potential users. It's gone.

 Would that not be a
 real, usable demo program, if it uses OpenRJ?
 And where is 'execute_unittest();' - its not in this file.. Will this
 compile then?


Erm, vestigial/unfinished. Getting fixed now ...

 What's a 'goto' doing here??? Do I trust this code? Are there any in the
 base library?


He he. Point taken. Going, going, gone ...

 5. I see the use of:
    memory_database db(contents);

    for(memory_database::c......
 That means there must be a separate class for 'file_database'... Why two?


One for files, one for memory, both derived from a common base which has the
'main' interface, i.e. access to records and fields.

 Is there a difference in the interfaces of the two classes? What if one is
 amended and the other one is not? Also, in case I have to switch over from
 one to the other, do I need to edit every use? After all, the name is used in
 every 'for' ...


I disagree here, as far as I follow your argument. The common aspects of the
two classes are in the base class. The derived classes merely differ in their
constructors, as appropriate to the source of the database. I still think
that's the best design choice.
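
A minimal sketch of that arrangement, using hypothetical demo_* classes rather than the real Open-RJ headers: the base class owns the contents and the iteration interface, the derived classes differ only in their constructors, and client code written against the base type is unaffected when the source changes.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical classes; the names echo the thread, not the Open-RJ API.
class demo_database_base
{
public:
    typedef std::vector<std::string>::const_iterator const_iterator;

    const_iterator begin() const { return lines_.begin(); }
    const_iterator end() const   { return lines_.end(); }
    std::size_t    size() const  { return lines_.size(); }

protected:
    demo_database_base() {}
    ~demo_database_base() {}

    // Shared loading code: keeps every non-empty line of the input.
    void load(std::istream& in)
    {
        std::string line;
        while (std::getline(in, line))
            if (!line.empty())
                lines_.push_back(line);
    }

private:
    std::vector<std::string> lines_;
};

class demo_memory_database : public demo_database_base
{
public:
    explicit demo_memory_database(std::string const& contents)
    {
        std::istringstream in(contents);
        load(in);
    }
};

class demo_file_database : public demo_database_base
{
public:
    explicit demo_file_database(std::string const& path)
    {
        std::ifstream in(path.c_str());
        if (!in)
            throw std::runtime_error("cannot open " + path);
        load(in);
    }
};

// Written against the base type, so it works for either source unchanged.
inline std::size_t count_lines(demo_database_base const& db)
{
    std::size_t n = 0;
    for (demo_database_base::const_iterator it = db.begin(); it != db.end(); ++it)
        ++n;
    assert(n == db.size());
    return n;
}
```

With this shape, switching a test from an in-memory string to a file changes only the line that constructs the object; loops that name the base class's iterator type stay as they are.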

 6. Matthew had a great article about iterating over arrays & containers in
 many languages.. (that's all I remember now!) but why is he still using:
    memory_database db(contents);
    for(memory_database::const_iterator ri = db.begin(); ri != db.end(); ++ri) // Loop over the records

 followed by - (*ri).comment()
 Yes, with a lot of code again and again using 'begin' and 'end' it looks
 'natural' - but how come he is not tired of typing the whole thing again and
 again and inventing 'ri' and 'fi', i1 and i2 and so on... How come he hasn't
 stumbled upon simpler 'foreach' which hides these things? (Does he love
 typing? Not worried about Carpal Tunnel?)
 foreach == begin..end construct
 current= (*ri)


Well, this one's the desire not to prescribe that users of one library should
have to use another, or should have to learn concepts that are not (yet)
accepted into the mainstream of C++ programming practice. I still think that's
the right choice.

 Etc, etc. This was a typical line of thinking. I have also typed much more
 than reqd, surely, sorry.


No apologies. Indeed, I'm very grateful for the input, and will change my libs
in all those areas you've pointed out that I agree on. Thanks!

 Am not looking for real, line by line answers, got
 them (it compiling to around 10k was also not obvious, it is nice to know.)


Maybe I should mention that somewhere in the docs.

 This was wondering aloud. My only expectation is a good, prolific programmer
 like you should keep much more simpler users in mind. We all will benefit
 from slimmer software with simpler examples.


Yes, I agree. Again, thanks.

I'm touched by your sentiments, albeit I _know_ that I'm a good engineer, and a
half-decent author. What I'm not good at, as I know all too well, is
documentation and so I'm very grateful for your feedback. Please feel free to
make as much similar feedback about Open-RJ (and recls and STLSoft) as you are
able in future.

 Thanks and regards,
 - Rajiv

 PS: Do I leave the implementation of the simpler version to diligent
 readers? My Field and Record classes are almost empty, the Table class has 3
 routines to speak of, 2 of which handle using the memory and the file. All
 within one header. Read the example earlier, it compiles and runs.


I'll happily take a look at it, if you want to post/send it to me.

May 24 2005

"Rajiv Bhagwat" <dataflow vsnl.com> writes:

Thanks for taking the comments in the right spirit.

1. Open source:
The success of the open source model lies in (competent) others being able
to (relatively) easily understand the code and hence develop confidence, use
it and further be able to enhance it. Goes perfectly well with the idiom
that programmers write code for other programmers (could be 'new' you, after
3 months!). Got to keep it simple, if can't, got to put comments next to it.
I strictly follow that 95% of code be less than 20 line routines, but don't
want to give others any advice on this issue. (Those who are sold on it,
don't need the advice; others are going to call me impractical, anyway!).
And, I have written and am maintaining reasonably big programs for a number
of years (decades). Oops, giving away my age..
So, others are going to look at your code first to see if they can
understand it .. that decides whether they will use it. So, 'internal /
worker' routines also need to be 'good'!

2. 'c' and 'stl'
With practically every platform having a c++ compiler, today I use C only
for the embedded projects, where it is not possible/often not required to
have c++. Thus, the mention of 'c' usability gave me the hopes of using this
with the smaller 8031 based 'c' projects. The small 'file system' would be a
better standard to follow than a proprietary one for every project. From
this perspective, the first look at the library is disappointing.

The thick STL book convinced me that STL beats 'hand coding' in every
situation. So, it should be used without reservation. What I am looking for,
however, is 'minimum' use of STL to avoid problems with various
'differently' conforming compilers. Even here, as written, my 'desired'
sample program effectively hides the use of STL, so it is largely usable
as-is on the smaller configurations. So, no boost, no stlsoft. Not for such
a small thing..

Even then, aggressively using STL for every situation still eludes me. Hats
off to guys like you who can understand the complex templates and 'meta'
programming. Just keep on explaining and we will (hopefully) pick it up.
But, use it for the wrong things and we will be driven away. (After saying
this, I have made an attempt to use nothing else but STL, and I am aware
that there must be umpteen ways to do it right!)

3. Other design choices:
This task of picking up records can be done easily (if the setup is right)
using lex/yacc, boost spirit or any of dozens of similar parsing
mechanisms, but all are overkill for separating 'name: value'. This looks
like a simple C thing. What's the big deal about it? Yes, this action needs
to be center-stage. The table - record comments, count of records,
selectively ignoring spaces are all peripheral. The present implementation
needs to segregate these in obvious ways.

Fields are the smallest operable entities, each record has multiple fields
(whether 2 records have same named fields is a design decision, Open-RJ is
attractive because it does not force us to), many records constitute a table
(same size=rectangular table. Otherwise 'round'??), many tables make a
database. A real database is a much bigger beast. This is a table handler,
and it is useful as one. Pretend it to be anything else and it gets
ridiculed. Anything slightly bigger and programmers would use SQLite, MySQL
or other 'in-memory' databases capable of understanding SQL. Too much prior
art there to ignore.

One class or two derived from a base? That's your choice. The possibility of
testing using in-memory strings and then switching over to files is very
real. Users want minimum change then. That's why the suggestion about a
single class. And that is where writing a real application before the
library comes into the picture. So, please don't say 'vestigial/unfinished'
!! Please start with the app, first.

Also, I am talking about the use of 'foreach' in the example code. The
library does not use it. This is a perfect way to illustrate a technique, in
this case, the emphasis would be on the library classes 'capable' of being
traversed in this way. In my illustrative design, the same ones may be
traversed in the 'usual' begin-end way!! I needed your opinion on this as a
researcher of iterations over containers. Are you aware of an implementation
of 'foreach' for c++? I would find it hard to believe that what I have been
using for quite some time (even before I started using STL) has not been in
circulation...
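
As an illustration of a class being 'capable' of both kinds of traversal, here is a minimal C++ sketch. It is not the implementation referred to above: it relies on the range-based for statement that the language gained with C++11, and the Record class and the two function names are hypothetical, made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type, for illustration only; not the Open-RJ class.
class Record
{
public:
    typedef std::vector<std::string>::const_iterator const_iterator;

    void add(const std::string& field) { fields_.push_back(field); }

    // Exposing begin()/end() is what makes the class traversable
    // in either style below.
    const_iterator begin() const { return fields_.begin(); }
    const_iterator end() const { return fields_.end(); }

private:
    std::vector<std::string> fields_;
};

// The 'usual' begin-end traversal.
std::size_t total_length_iter(const Record& r)
{
    std::size_t n = 0;
    for (Record::const_iterator it = r.begin(); it != r.end(); ++it)
        n += it->size();
    return n;
}

// 'foreach'-style traversal of the same class (range-based for, C++11).
std::size_t total_length_foreach(const Record& r)
{
    std::size_t n = 0;
    for (const std::string& field : r)
        n += field.size();
    return n;
}
```

Both functions visit the same elements in the same order; the library class itself needs no knowledge of which loop form the caller prefers.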

Already, we have discussed too much regarding this ;-) Only excusable, if we
consider the principles in general, not for this project.
The only reason I have still not posted any code is to make sure that this
is read! I know I would rush to the code first and ignore everything ;-)

Regards,
- Rajiv



"Matthew" <admin.hat stlsoft.dot.org> wrote in message
news:d6v7jf$15om$1 digitaldaemon.com...
 Wow! A lot to think about, and a lot of very useful feedback that's

May 24 2005

"Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:

 1. Open source:
 The success of the open source model lies in (competent) others being able
 to (relatively) easily understand the code and hence develop confidence, use
 it and further be able to enhance it. Goes perfectly well with the idiom
 that programmers write code for other programmers (could be 'new' you, after
 3 months!). Got to keep it simple, if can't, got to put comments next to it.
 I strictly follow that 95% of code be less than 20 line routines, but don't
 want to give others any advice on this issue. (Those who are sold on it,
 don't need the advice; others are going to call me impractical, anyway!).
 And, I have written and am maintaining reasonably big programs for a number
 of years (decades). Oops, giving away my age..
 So, others are going to look at your code first to see if they can
 understand it .. that decides whether they will use it. So, 'internal /
 worker' routines also need to be 'good'!


Agreed, and well taken.

I've updated Open-RJ - now 1.3.2 (http://openrj.org/) - in line with 
this good advice. I shall do the same with recls in a few days, and 
STLSoft for its next release.

Thanks for this

 2. 'c' and 'stl'
 With practically every platform having a c++ compiler, today I use C only
 for the embedded projects, where it is not possible/often not required to
 have c++. Thus, the mention of 'c' usability gave me the hopes of using this
 with the smaller 8031 based 'c' projects. The small 'file system' would be a
 better standard to follow than a proprietary one for every project. From
 this perspective, the first look at the library is disappointing.


I don't agree here, as the aims for Open-RJ are not commensurate 
with that. Specifically:

- I wanted it to be as attractive to pure-C (particularly Linux) 
folks, as C++; there are a lot of C programmers who aren't C++ 
programmers.
- I wanted it to be super small, and flexible, with as few 
dependencies as possible. In its chopped down state - where you 
bring your own allocator - it requires only stddef.h!
- A library that's to be mapped to many different languages must 
have a C-API. Given that, and the fact that the impl is pretty 
simple - the original was written in a couple of hours - it seems 
pointless not to implement in C. This is informed, in part, by my 
experience with recls, which _is_ implemented in C++ (though still 
presents a C-API).

 The thick STL book convinced me that STL beats 'hand coding' in every
 situation. So, it should be used without reservation.


Agreed, for application code, and for library code that's only going 
to be used by C++ client code.

 What I am looking for,
 however, is 'minimum' use of STL to avoid problems with various
 'differently' conforming compilers. Even here, as written, my 'desired'
 sample program effectively hides the use of STL, so it is largely usable
 as-is on the smaller configurations. So, no boost, no stlsoft. Not for such
 a small thing..


Can't respond to this, until I see the code.

Walter wrote a very small impl in D, which was great as far as it 
went, but it didn't go further than the parsing into fields and 
records. That's fine, if that's all that's wanted, but it's not a 
fully representative comparison. (FYI: May's instalment of "Positive 
Integration" http://www.cuj.com/documents/s=9784/cuj0505wilson/, 
discusses this very subject, and shows how the 100% D version is 
slightly slower. I expect the same from a pure C++ version, although 
I wouldn't bet the house on it. Hmmm, maybe I'll try it with 
string_view, that might be as fast.)

 Even then, aggressively using STL for every situation still eludes me. Hats
 off to guys like you who can understand the complex templates and 'meta'
 programming. Just keep on explaining and we will (hopefully) pick it up.


The only reason I write about it is because I find it confusing as 
well. :-)

 3. Other design choices:
 This task of picking up records can be done easily (if the setup is right)
 using lex/yacc, boost spirit or any of dozens of similar parsing
 mechanisms, but all are overkill for separating 'name: value'. This looks
 like a simple C thing. What's the big deal about it?


There is no big deal. It's just a simple little library, with 
mappings to many languages. It's useful, small, portable. Just badly 
documented. (Or maybe 'was', as I'm hoping the 1.3.2 release answers 
a lot of your concerns. Please let me know if/where it doesn't.)

FYI: I read about the Record-JAR in Raymond's "The Art Of UNIX 
Programming" - a book I'd recommend to *everyone*. It is one of many 
formats he discusses, each of which has its own virtues and niches. 
I've found Open-RJ ideal for the configuration files of various 
programs where they need more than a few command-line args, and 
where XML would be overkill.

 Yes, this action needs
 to be center-stage. The table - record comments, count of records,
 selectively ignoring spaces are all peripheral. The present implementation
 needs to segregate these in obvious ways.

 Fields are the smallest operable entities, each record has multiple fields
 (whether two records have the same named fields is a design decision, OpenRj
 is attractive because it does not force us to), many records constitute a
 table (same size=rectangular table. Otherwise 'round'??), many tables make a
 database. A real database is a much bigger beast. This is a table handler,
 and it is useful as one. Pretend it is anything else and it gets
 ridiculed. Anything slightly bigger and programmers would use SQLite, MySQL
 or other 'in-memory' databases capable of understanding SQL. Too much prior
 art there to ignore.


Well, I think anything more complex and one would use XML, but then 
I see it as a configuration format, rather than a database format. 
(I guess using the term Database was somewhat ill-chosen, then <g>)

 One class or two derived from a base? That's your choice. The possibility of
 testing using in-memory strings and then switching over to files is very
 real. Users want minimum change then. That's why the suggestion about a
 single class.


Because they inherit, there is no confusion/complexity here. The 
only difference is the constructors for the derived (file and 
memory) classes. I think it's very simple, orthogonal, discoverable.
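
A minimal C++ sketch of that arrangement follows. The class names DatabaseBase, MemoryDatabase and FileDatabase, and the helper count_lines, are hypothetical stand-ins rather than the actual Open-RJ/C++ classes, and the 'parsing' is reduced to splitting the text into lines so that the shape of the design stays visible.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical base class: all shared behaviour lives here.
class DatabaseBase
{
public:
    std::size_t num_lines() const { return lines_.size(); }
    const std::string& line(std::size_t i) const { return lines_[i]; }

protected:
    // Derived classes only have to supply the text.
    void load(const std::string& contents)
    {
        std::istringstream in(contents);
        std::string ln;
        while (std::getline(in, ln))
            lines_.push_back(ln);
    }

private:
    std::vector<std::string> lines_;
};

// Built from an in-memory string, e.g. in a unit test.
class MemoryDatabase : public DatabaseBase
{
public:
    explicit MemoryDatabase(const std::string& contents) { load(contents); }
};

// Built from a file; a file that cannot be opened yields an empty database
// in this sketch (real code would report the failure).
class FileDatabase : public DatabaseBase
{
public:
    explicit FileDatabase(const char* path)
    {
        std::ifstream f(path, std::ios::binary);
        std::string contents((std::istreambuf_iterator<char>(f)),
                             std::istreambuf_iterator<char>());
        load(contents);
    }
};

// Client code written against the base works with either source.
std::size_t count_lines(const DatabaseBase& db) { return db.num_lines(); }
```

Under these assumptions, moving a test from an in-memory string to a file changes only the line that constructs the object.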

 And that is where writing a real application before the
 library comes into the picture. So, please don't say 'vestigial/unfinished'!!
 Please start with the app, first.


Point taken.

 Also, I am talking about the use of 'foreach' in the example code. The
 library does not use it. This is a perfect way to illustrate a technique, in
 this case, the emphasis would be on the library classes 'capable' of being
 traversed in this way. In my illustrative design, the same ones may be
 traversed in the 'usual' begin-end way!! I needed your opinion on this as a
 researcher of iterations over containers. Are you aware of an implementation
 of 'foreach' for c++? I would find it hard to believe that what I have been
 using for quite some time (even before I started using STL) has not been in
 circulation...


Maybe I'll plug in "Ranges" into the Open-RJ/C++ and /STL mappings. 
A 1.4 feature, perhaps ...

 Already, we have discussed too much regarding this ;-) Only excusable, if we
 consider the principles in general, not for this project.


Sure

 The only reason I have still not posted any code is to make sure that this
 is read! I know I would rush to the code first and ignore everything ;-)


It is read. Now post! :-)

May 25 2005

"Rajiv Bhagwat" <dataflow vsnl.com> writes:

I have taken Walter's advice - instead of posting files here, I have posted
an article on CodeProject.
"Implementing 'foreach' for c++ as a Design Pattern"
http://www.codeproject.com/

Have attached the alternate OpenRj implementation as the 'sample project'.
It has been tested (obviously) on DMC, as well as on VC6 & gcc under linux.

- Rajiv

May 26 2005

"Walter" <newshound digitalmars.com> writes:

"Matthew" <admin stlsoft.dot.dot.dot.dot.org> wrote in message
news:d73k78$dkh$1 digitaldaemon.com...
 Walter wrote a very small impl in D, which was great as far as it
 went, but it didn't go further than the parsing into fields and
 records.


Here it is, for comparison purposes:
------------------------------------------

// openrj.d
// placed into Public Domain

module std.openrj;

import std.string;

alias char[][] [char[]] [] openrj_t;

class OpenrjException : Exception
{
    uint linnum;

    this(uint linnum, char[] msg)
    {
        this.linnum = linnum;
        super(std.string.format("OpenrjException line %s: %s", linnum, msg));
    }
}

openrj_t parse(char[] db)
{
    openrj_t rj;
    char[][] lines;
    char[][] [char[]] record;

    lines = std.string.splitlines(db);

    for (uint linnum = 0; linnum < lines.length; linnum++)
    {
        char[] line = lines[linnum];

        // Splice lines ending with backslash
        while (line.length && line[length - 1] == '\\')
        {
            if (++linnum == lines.length)
                throw new OpenrjException(linnum, "no line after \\ line");
            line = line[0 .. length - 1] ~ lines[linnum];
        }

        // Check the length first, so that a line shorter than two characters
        // reaches the 'key : value' test instead of an array bounds error
        if (line.length >= 2 && line[0 .. 2] == "%%")
        {
            // Comment lines separate records
            if (record)
                rj ~= record;
            record = null;
            line = null;
            continue;
        }

        int colon = std.string.find(line, ':');
        if (colon == -1)
            throw new OpenrjException(linnum, "'key : value' expected");

        char[] key = std.string.strip(line[0 .. colon]);
        char[] value = std.string.strip(line[colon + 1 .. length]);

        // A key may repeat within a record, so each key maps to a list of values
        char[][] fields = record[key];
        fields ~= value;
        record[key] = fields;
    }
    // Keep the final record if the input does not end with a %% line
    if (record)
        rj ~= record;
    return rj;
}
--------------------------------------------------------------------
And here's a simple driver for it:
--------------------------------------------------------------------
import std.stdio;
import std.file;
import std.openrj;

int main()
{
    char[] db = cast(char[])std.file.read("test.rj");
    openrj_t rj = std.openrj.parse(db);

    foreach (char[][] [char[]] record; rj)
    {
        foreach (char[] key, char[][] fields; record)
        {
            writefln(key, ":");
            foreach (char[] field; fields)
            {
                writefln("\t", field);
            }
        }
        writefln("---------------------");
    }

    return 0;
}
----------------------------------------------------
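
For readers who want to set the D version beside a C++ one, the following is a rough sketch of the same logic using only the standard library. It is not the Open-RJ implementation and not the CodeProject sample mentioned earlier in the thread; the names Record, Database, strip and parse are chosen here to mirror the D code, it assumes '\n' line endings, and it reports errors with std::runtime_error without tracking line numbers.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One record maps each key to the list of values seen for that key.
typedef std::map<std::string, std::vector<std::string> > Record;
typedef std::vector<Record> Database;

// Strip leading and trailing whitespace.
std::string strip(const std::string& s)
{
    const char* const ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();
    return s.substr(first, s.find_last_not_of(ws) - first + 1);
}

Database parse(const std::string& db)
{
    Database rj;
    Record record;
    std::istringstream in(db);
    std::string line;

    while (std::getline(in, line))
    {
        // Splice lines ending with backslash
        while (!line.empty() && line[line.size() - 1] == '\\')
        {
            std::string next;
            if (!std::getline(in, next))
                throw std::runtime_error("no line after \\ line");
            line.erase(line.size() - 1);
            line += next;
        }

        // Lines starting with %% separate records
        if (line.compare(0, 2, "%%") == 0)
        {
            if (!record.empty())
                rj.push_back(record);
            record.clear();
            continue;
        }

        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos)
            throw std::runtime_error("'key : value' expected");

        // operator[] creates the value list the first time a key is seen
        record[strip(line.substr(0, colon))].push_back(strip(line.substr(colon + 1)));
    }
    if (!record.empty())
        rj.push_back(record);
    return rj;
}
```

Like the D code, this stops at fields and records, so it says nothing about the rest of what the library offers and is no basis for a speed comparison on its own.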

May 26 2005