D - D symbol table extension proposal
- whitis freelabs.com (219/219) Apr 20 2004 Hi, I was a user of Zortech C++ v1.x and v2.x; I just read the complete ...
- Dave Sieber (12/13) Apr 20 2004 This is one of my biggest wishes. I almost switched to Java/C# just so t...
Hi, I was a user of Zortech C++ v1.x and v2.x; I just read the complete D language specification and the language looks very interesting.

D has a weakness found in most other programming languages: the symbol table is thrown away after compilation. My suspicion is that keeping it available at run time would add only minor complexity to the compiler. It is not the kind of language feature that causes serious scaling issues, and there would be many benefits.

I have already written a library that tries to shoehorn symbol tables into the C language, but there is considerable redundancy in coding since you don't have access to the compiler's symbol tables. Even so, it is of considerable benefit. If it were built into the language, it would be easier to use and have additional documentation benefits.

To add compiler support:

- Create a runtime class (even if it doesn't actually do anything yet).
- Add one pointer to each compiler symbol table entry, which can be used to add extra detail in another structure.
- Add the keyword symtab_of(). When you use this keyword, it causes the compiler symtab format to be copied and converted into the runtime symtab format, an object of that type to be put into static storage, and a pointer to that object to be returned.
- If the compiler is sneaky, it will read the precompiled run time class file and use that to copy members by name, allowing the class to be completely reimplemented without changing the compiler.
- Add the extra_info keyword. When this is found, everything enclosed in the following { } will be parsed as if it was an initializer for a symtab_t class, and the pointer to that class will be stored in the extra pointer in the compiler symbol table. At some point, the info in the compiler symbol table is copied into the symtab_t class add-on before the symtab_t object is made a part of the program.
- The compiler could even be extended to make the default .print and .format methods for classes which do not specify them automatically call the symbol table routines.
- Symbol table objects which have extra_info but are never used by symtab_of() can be discarded by the linker.

One of the cool aspects of this is that a lot of documentation now becomes structured as part of the symbol table extras instead of random comments, and thus can be parsed by utilities such as code annotation programs (doxygen, etc.) and debuggers. Symbol tables often work a lot better than other object oriented paradigms which are much harder to implement.

The C version is at http://www.freelabs.com/~whitis/software/symbol/ If you have access to a copy of the first edition of _Linux Programming Unleashed_, I wrote a chapter which gives a tutorial on using the symbol table package.

Symbol tables can be used in a number of ways:

- Parsing command line parameters
- Reading configuration files: rfc822 (name: value), name=value, XML, windoze style
- writing configuration files
- reading data files
- writing data files
- web forms
- GUI forms (preferences, etc.)
- remote procedure call protocols, network transactions, etc.
- dumping data structures for debugging
- external utilities also could benefit from the additional info about each object:
  - code browsers: doxygen, kernel browser, etc.
  - debuggers

----------------------begin sample code----------------------------
import symtab;

int debug_level extra_info {
    external_name: "debug";
    description: "Debugging Level (0=none, 1=some, ... 9=lots)";
};

char[] save_options_filename extra_info {
    description: "If set, options will be written to file";
} = "";

int show_options extra_info {
    external_name: "show";
};

struct foo_t {
    real x extra_info { description: "X Coordinate"; };
    real y extra_info { description: "Y Coordinate"; };
    real z extra_info { description: "Z Coordinate"; };
} extra_info {
    // standard keywords
    external_name: "foo";
    description: "A 3D Coordinate";
    xml_is_tag: true;
    extra_pairs { {"html_css_class", "coordinate"} }
    symtab_protocol_send_message
}

symtab_t foo_st = symtab_of(foo_t);
foo_t center;
foo_t viewpoint;
smart_pointer_t center_sp = smart_pointer(&foo_st, &center);

// this symbol table lists variables accessible on the command
// line and in the config file
symtab_t[] parameters_st = {
    symtab_of(show_options),
    symtab_of(save_options_filename),
    symtab_of(center),
    symtab_of(viewpoint),
    symtab_of(debug_level),
};

// This symbol table lists variables sent as part of a protocol
// remote procedure call message
symtab_t[] protocol_st = {
    symtab_of(center),
    symtab_of(viewpoint)
};

// this symbol table lists variables received in response to
// protocol remote procedure call message.
struct results_t {
    enum {
        STATUS_OK extra_info { external_name: "OK" },
        STATUS_PERMANENT_ERROR extra_info { external_name: "ERROR" },
        STATUS_TEMPORARY_ERROR extra_info { external_name: "TRYAGAIN" },
        STATUS_WARNING extra_info { external_name: "WARNING" }
    } status;
    int lineno;
    char[] error_line;
    char[] error_text;
}
symtab_t results_t_st = symtab_of(results_t);
results_t results;
// we don't define a symbol table for results (which would include the address
// of the struct) so we can illustrate how smart pointers allow the address
// and symbol table data to be combined later. This would be more useful
// if we had many variables of type results_t.

int main(char[][] args)
{
    parse_cmd_options(parameters_st, args);
    // --viewpoint.x=1.0 --viewpoint.y=2.0 --viewpoint.z=3.0 --debug=2
    // --viewpoint={1.0,2.0,3.0} --debug=2
    // --viewpoint={x=1.0, y=2.0, z=3.0} --debug=2

    symtab_read_config_options_t rc_options = symtab_read_file_options_t_defaults;
    rc_options.format = READ_CONFIG_FORMAT_XML;
    symtab_write_config_options_t wc_options = symtab_write_file_options_t_defaults;
    wc_options.format = WRITE_CONFIG_FORMAT_XML;

    parameters_st.read_file("~/.myprog", rc_options);

    if (show_options) {
        parameters_st.write_file("-", wc_options);
    }

    if (save_options_filename.length > 0) {
        // clear the option before writing so the saved file does not
        // itself contain the save request
        char[] junk;
        junk = save_options_filename;
        save_options_filename = "";
        parameters_st.write_file(junk, wc_options);
    }

    if (debug_level > 0) {
        protocol_st.write_file(stderr, wc_options);
    }

    protocol_send_message(
        smart_pointer(&protocol_st, null),
        smart_pointer(&results_t_st, &results)
    );

    if (debug_level > 0) {
        smart_pointer(&results_t_st, &results).write_file(stderr, wc_options);
    }
    return 0;
}
----------------------end sample code----------------------------

To make things a little cleaner, symtab_of() should probably return a smart_pointer.

When I wrote the original symbol table routines in C, I discovered that the compiler would not let you initialize arbitrary byte streams containing data types of mixed sizes. I had to force everything to 4 bytes.
This would be a useful feature in the D language:

byte_stream[] symbol_table = {
    (token) ST_BEGIN,
        (token) ST_BEGIN,
            (token) ST_IDENTIFER, (char[]) "x",
            (token) ST_TYPE, (symtab_t *) &int32u_st,
            (token) ST_AT, (far void*) &x,
            (token) ST_MIN, (long) MIN_ULONG,
            (token) ST_MAX, (long) MAX_ULONG,
        (token) ST_END,
        (token) ST_BEGIN,
            (token) ST_IDENTIFER, (char[]) "y",
            (token) ST_TYPE, (symtab_t *) &int32u_st,
            (token) ST_AT, (far void*) &y,
            (token) ST_MIN, (long) MIN_ULONG,
            (token) ST_MAX, (long) MAX_ULONG,
        (token) ST_END,
    (token) ST_END,
}

--
Mark Whitis http://www.freelabs.com/~whitis/ NO SPAM
Author of many open source software packages.
Coauthor: Linux Programming Unleashed (1st Edition)
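
For readers who have not seen the C approach, here is a minimal sketch of what a hand-maintained symbol table can look like and why it involves the redundancy mentioned in the post. Every name in it (field_info, options_t, options_fields, set_from_pair) is hypothetical and made up for illustration; it is not the API of the library linked above. It assumes only int and double fields and "name=value" input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout: an illustration of a hand-written
   symbol table, not the interface of any existing library. */
typedef enum { FT_INT, FT_DOUBLE } field_type;

typedef struct {
    const char *external_name; /* name seen on the command line or in a config file */
    const char *description;   /* documentation carried with the symbol */
    field_type  type;
    size_t      offset;        /* position of the field inside the struct */
} field_info;

typedef struct {
    int    debug_level;
    double x, y, z;
} options_t;

/* Each field is spelled out twice: once in the struct and once here.
   A compiler-generated table would remove this duplication. */
static const field_info options_fields[] = {
    { "debug", "Debugging level (0=none, 9=lots)", FT_INT,    offsetof(options_t, debug_level) },
    { "x",     "X coordinate",                     FT_DOUBLE, offsetof(options_t, x) },
    { "y",     "Y coordinate",                     FT_DOUBLE, offsetof(options_t, y) },
    { "z",     "Z coordinate",                     FT_DOUBLE, offsetof(options_t, z) },
};

#define OPTIONS_FIELD_COUNT (sizeof options_fields / sizeof options_fields[0])

/* Store one "name=value" pair into obj using the table.
   Returns 0 on success, -1 for an unknown name or a malformed value. */
static int set_from_pair(const field_info *table, size_t count,
                         void *obj, const char *pair)
{
    assert(table != NULL && obj != NULL && pair != NULL);

    const char *eq = strchr(pair, '=');
    if (eq == NULL)
        return -1;
    size_t name_len = (size_t)(eq - pair);

    for (size_t i = 0; i < count; i++) {
        if (strlen(table[i].external_name) != name_len ||
            strncmp(table[i].external_name, pair, name_len) != 0)
            continue;

        char *field = (char *)obj + table[i].offset;
        char *end;
        if (table[i].type == FT_INT) {
            long v = strtol(eq + 1, &end, 10);
            if (end == eq + 1 || *end != '\0')
                return -1;
            *(int *)field = (int)v;
        } else {
            double v = strtod(eq + 1, &end);
            if (end == eq + 1 || *end != '\0')
                return -1;
            *(double *)field = v;
        }
        return 0;
    }
    return -1; /* no field with that external name */
}
```

Calling set_from_pair(options_fields, OPTIONS_FIELD_COUNT, &opts, "debug=2") from a program's main would set opts.debug_level through the table rather than through field-specific code. The same table could drive a config file reader or writer, which is the reuse the post argues for.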
Apr 20 2004
whitis freelabs.com wrote:
- dumping data structures for debugging

I could automate dumping of huge numbers of structs in a recent project. And this was not for debugging, it was for data comparisons across large numbers of files. Maintaining it all by hand was incredibly tedious and really brought home to me the fact that our languages are not as helpful as they could be -- especially because the compiler HAS the information! Even a simple API to access the symbol table/debug info would be helpful. Why insist that only an external application can access information which is right there and readily available?

-- dave
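
The struct dumping and comparison described above might be sketched in C roughly as follows, with the field table written out by hand because the compiler's own table is not available. The names field_info, record_t, record_fields, dump_struct and records_differ are hypothetical, and the sketch assumes only int and double fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names: a hand-written field table standing in for
   information the compiler already has. */
typedef enum { FT_INT, FT_DOUBLE } field_type;

typedef struct {
    const char *name;
    field_type  type;
    size_t      offset;
} field_info;

typedef struct {
    int    lineno;
    double x;
    double y;
} record_t;

static const field_info record_fields[] = {
    { "lineno", FT_INT,    offsetof(record_t, lineno) },
    { "x",      FT_DOUBLE, offsetof(record_t, x) },
    { "y",      FT_DOUBLE, offsetof(record_t, y) },
};

#define RECORD_FIELD_COUNT (sizeof record_fields / sizeof record_fields[0])

/* Write one "name=value" line per field into buf.
   Returns the number of characters written, or -1 if buf is too small. */
static int dump_struct(const field_info *table, size_t count,
                       const void *obj, char *buf, size_t size)
{
    assert(table != NULL && obj != NULL && buf != NULL);
    if (size == 0)
        return -1;
    buf[0] = '\0';

    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        const char *field = (const char *)obj + table[i].offset;
        int n;
        if (table[i].type == FT_INT)
            n = snprintf(buf + used, size - used, "%s=%d\n",
                         table[i].name, *(const int *)field);
        else
            n = snprintf(buf + used, size - used, "%s=%g\n",
                         table[i].name, *(const double *)field);
        if (n < 0 || (size_t)n >= size - used)
            return -1; /* output would have been truncated */
        used += (size_t)n;
    }
    return (int)used;
}

/* Compare two records by their text dumps.
   Returns 1 if they differ, 0 if identical, -1 on a dump failure. */
static int records_differ(const record_t *a, const record_t *b)
{
    char da[256], db[256];
    if (dump_struct(record_fields, RECORD_FIELD_COUNT, a, da, sizeof da) < 0 ||
        dump_struct(record_fields, RECORD_FIELD_COUNT, b, db, sizeof db) < 0)
        return -1;
    return strcmp(da, db) != 0;
}
```

Adding a field to record_t means remembering to add a matching row to record_fields, which is the by-hand maintenance the post complains about. Comparing text dumps rather than raw bytes avoids being misled by struct padding, at the cost of the precision that %g discards.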
Apr 20 2004