@q% file: yacco2_sym_tbl.w@> @q% Copyright Dave Bone 1998 - 2015@> @q% /*@> @q% This Source Code Form is subject to the terms of the Mozilla Public@> @q% License, v. 2.0. If a copy of the MPL was not distributed with this@> @q% file, You can obtain one at http://mozilla.org/MPL/2.0/.@> @q% */@> @q% This file is part of YACC02 compiler/compiler project.@> \input "eplain" \input "supp-pdf" \input "/usr/local/yacco2/diagrams/o2mac.tex" %\tracingall=1 %\tracingall=1 %\tracingmacros=1 %\tracingcommands=2 @i "/usr/local/yacco2/license.w" \def\ptindent#1{\line{\hskip.5in{#1}\hfill}} \def\fbreak{\hfill\break} @** Summary of Yacco2 and Linker's Symbol Table. Basic hash table for Yacco2's symbols. It is also used by Yacco2's linker companion who deals in the terminal first sets of threads. Yacco2's ratatouille:\fbreak \ptindent{$\backslash$yacco2$\backslash$compiler$\backslash$stbl - NT symbol table directory} \ptindent{|yacco2stbl.w| - cweb generator file} \ptindent{|hash_table_size| - current space allocator} \ptindent{|yacco2_stbl.h| - header file} \ptindent{|yacco2_stbl.cpp| - implementation} \ptindent{|yacco2_stbl| - symbol table namespace} \ptindent{|yacco2_stbl::stbl| - global symbol table variable} \ptindent{|T_sym_report_card| - defined in Yacco2's terminal alphabet} Dependency files from other Yacco2 sub-systems:\fbreak \ptindent{|yacco2.h| - basic definitions used by Yacco2} \ptindent{|yacco2_err_symbols.h| - error definitions from Yacco2's grammar alphabet} \ptindent{|yacco2_terminals.h| - regular terminal definitions from Yacco2's grammar alphabet} Symbol table procedures:\fbreak \ptindent{|yacco2::add_sym_to_stbl|} \ptindent{|yacco2::find_sym_in_stbl|} \ptindent{|yacco2::get_sym_entry_by_sub|} \ptindent{|yacco2::test_program| --- see `how to run the test' section} Note: |yacco2_stbl.cpp| gets compiled with Yacco2 parse library's grammars and placed into the runtime library. @*2 Synopsis of table entries.\fbreak Is it a Forest or a tree?\fbreak As u the reader are probably IT types and so trees can have a different meaning, the question can be rephrased: is it a group or an individual? The symbol table takes a group approach to symbols. What does this mean? Each specific symbol (tree) is classified as either a rule, terminal, keyword, or a thread thus pigeon holeing the entry into a forest. This generalization makes parsing the grammar easier. See |rule_in_stbl| and |T_in_stbl| defined in the terminal grammar as classified examples. The weakness in this 1 to 1 relationship is in error processing where the point of reference is not where the point of entry into the symbol table occured. To correct this weakness, the symbol table's entry has 2 parts: the symbol that created the definition and a cross reference list as referenced within the subrule's right-hand-side string of symbols. Each symbol in this xref list provides the source file coordinates for error processing purposes. The referenced symbol is entered in LILO order. @*2 Special note.\fbreak Scratch pad coordinates of |rule_in_stbl| and |T_in_stbl|\fbreak When the symbol table is accessed, the current token's coordinates overrides the found table entry's original birthings of |rule_in_stbl| or |T_in_stbl|. The other entries |th_in_stbl| or |kw_in_stbl| are similarly serviced. Using this returned generalization terminal with its scratch pad origins allows the grammar writer to create other tokens like |refered_rule| or |refered_T| containing the current coordinates. There are 2 create contexts for a symbol table entry: at definition time for all terminals and sometimes rules, and referenced time for rules of a subrule definition. All symbols in a subrule are references but i relaxed the requirement that rules must be defined before use as in the formal definition of a grammar. The waiving of this formality allows Productions to forward reference to-be-defined rules. Of course, Yacco2 must post process the declarations of rule use against their definitions to report referenced but not defined rules within the Productions. @** Introduction to Yacco2's Symbol Table. This is Yacco2's family of symbol table routines. It uses the {\it Linear probing and insertion} hashing |Algorithm L| from {\it Vol. 3 Sorting and Searcing} P. 526 of {\it The Art of Computer Programming} by Donald E. Knuth. As the symbols from Yacco2 are variable in length, the |Hash procedure code| uses a scheme suggested by R. Sedgewick from his book Algorithms 2nd edition p. 233 to modulo per character. A grammar's symbols are the nonterminals and terminals. The terminals are constant across all grammars being defined as a thread or a stand-alone unit. Usually they are brought into the grammar by Yacco2's ``@@'' include file operator. Nonterminals (rules) are local to the grammar being defined allowing for same name usuage across multiple grammars. The only globalness to a grammar's name is its namespace and possibly its thread name. All grammar symbols are mutually exclusive to one another: there is no commonness across these enities. From a grammar's perspective, |Terminals| are classified into 4 categories: errors, raw characters, lr constant symbols, and finally the regular terminals. It is their literal string that one uses throughout the grammar's productions which makes it global in scope. From a syntax-directed code perspective, these categories are emitted using their namespace qualifiers. This decision was made to lower global namespace pollution. Hence the syntax-directed c++ code must use their namespace qualifier. From a symbol table management point of view, symbol handling is very easy as a symbol scoping stack is not required as in the Pascal language. Using |cweb| allows me to converse with myself on various considerations which hopefully is not too verbous or symptoms of paranoia. An array of symbol entry records is used. Yes I know what is written about the merits of object oriented programming but... As a package all-inclusive-deal the class is safe --- no need to remember cleanup duties like routines that are loosely coupled. But do I want to include all the other mechanisms required like the operator[] or use a type from the Standard Template Library (STL) like Vector? No I don't but I want a containment facility of all its loosely declared elements. So how to do it? Just wrap it in a namespace construct. Now the safe way is to not use a pointer mechanism requiring a destructor for cleanup but an array of symbol entry records which does not grow out the memory dynamically but takes the maximum size at startup time. No memory leakage can occur and no cleanup required. This requires a change to the symbol entry to indicate vacancy which had used the null pointer. The cost of table initialization should be done at the compiler/linker/startup time. Each occupied entry in the table records how the symbol has been used and declared with a pointer to the symbol's definition. As this is the oracle of Yacco2's symbol table management, I want it to be safe and fast. It is the safe part that becomes intriguing. There are many ways to handle errors:\fbreak \ptindent{1) abort} \ptindent{2) throw/catch an error using c++ constructs} \ptindent{3) return some form of status} Abort is a drastic last-resort measure. Flow control is shutdown with a message directed to the user. Throw/catch has merit but I'm not wanting to get into it for efficiency purposes. The degree of error processing for this family of routines is single level calls: not recursive. Typically point 3 is a singular entity: an integer. When things evolve, this strategy is too simplistic when more points of evidence need reporting. So, lets look at a report card approach which must be safe and efficient. When I went to school and teachers could write or would take the time to properly evaluate the student through comments, the take home bulletin had to be signed by the parent or a proxy... Now the report card allows multiple statuses to be marked up and returned to the caller. It becomes the caller's responsibility to check for irregularities and take appropriate action: not like the old days when an unsigned report card meant go back home until a signature acknowledges receipt. So how can efficiency be put into place? A globally defined variable accessed by the caller/called routines has no discipline. The caller and the called do their thing with the implicit understanding that each routine deposits and checks the global variable. It is this implicit coupling that makes it highly efficient in a one-to-one action/response narration. I do not like implicitness as it forces one to build up all these untold truths, constraints, pre-post conditions. By importing all components via parameters to the called routine removes misunderstanding or forgetfullness. Now how to make the report card approach efficient? Returning a result by use of a function call is the usual way of doing things. Established between the caller / called routines is the explicit relationship that the result is returned. Return-by-value depends on the optimization of the compiler regarding efficiency on ``the copy of objects'' syndrome. Return-by-address thrashs the memory heap. Both methods leaves one open to not deal with the returned result. In fact the function can be called with the returned parameter being dropped: the returned parameter is not assigned to a caller's variable though the compiler should shriek at you. This form of calling still leaves the potential of memory leaks and sloppy programming. Elimination of memory leaks can be programmed out but not the sloppy programming; you still need to deal with the report card's verdict. Well, 50\% improvement is not bad. So the only alternative is to pass the report card as a parameter: either by reference or by pointer. My past experience with threads and pass-by-reference have not been good. In fact, depending on the compiler used, the program just did not work consistently! Passing by address pointer, though it contains the potential of a bad pointer, has never failed me. So my leanings are towards the pointer approach. Therefore parameter passing makes the global variable a local scope issue for the called routine with its parametric pre-conditions forced onto the calling routine. This pro-con debate is now settled for my routines that are multi-dimensional in findings: even a singular dimension gives the mechanism to build out extensions using new axes for actions, errors, specific results and ... In my mind's eye, this concludes how I will continue developing code. The hard part is to maintain this discipline. October 2004 - grumblings from within. Optimism leads to re-trying. We'll have again a go of passing parameters by reference. I hope the compilers are more reliable this time in a multi-threaded environment than in the past. It certainly lowers constraint checks. Here goes. You'll get a post evaluation report if it's not successful. Now for the good stuff --- the report card. It should contain the following items:\fbreak \ptindent{1) status indicator --- okay or failure} \ptindent{2) action taken --- aborted,not fnd,fnd,inserted} \ptindent{3) $\uparrow$ error terminal when an error occured} \ptindent{4) $\uparrow$ symbol-table-entry if found or inserted} \ptindent{5) subscript value used} The report card is defined in Yacco2's grammar Terminal section that acts as a dictionary of terms. @*2 Namespace sections. @= namespace yacco2_stbl{ @ @= }// end of namspace |yacco2_stbl| @ @= using namespace yacco2_stbl; @ Yacco2's symbol table blueprint. Output of the code. @(yacco2_stbl.cpp@>= @; using namespace NS_yacco2_err_symbols; using namespace NS_yacco2_terminals; using namespace yacco2; @; @*2 Maintenance.\fbreak The following subsections explain what to do in various cases. @ How to compile the symbol table. @^How to compile the symbol table@> Originally the project uses Microsoft's Visual Studio c++. Now it's Unix compiled. The below notes are in a Microsoft context. U can easily substitute a Unix context. In project $\backslash$yacco2$\backslash$compiler$\backslash$grammars, the emitted code |yacco2_stbl.cpp| is compiled with the grammars used by Yacco2. @ How to print out |cweave| report. @^How to print |cweave| report@> Fire up the DOS batch command processor and set the default directory as follows:\fbreak cd $\backslash$yacco2$\backslash$compiler$\backslash$symbol table \fbreak To produce a PDF document, type in the following at the DOS PROMPT:\fbreak pdftex yacco2stbl \fbreak To produce a postscript document, for the visually impaired like my aging self due to the PDF styled greyed section numbers within the table of contents and index, type in the following at the DOS PROMPT:\fbreak tex yacco2stbl \fbreak dvips yacco2stbl \fbreak Use Ghostscript to print out the yacco2stbl.ps file and Adobe Reader for the \.pdf version. @*2 Salient Points established. @^Salient points @> Key points concluded:\fbreak \ptindent{1) Procedure calls only used. Results are returned thru a passed Report card parameter.} \ptindent{2) c++ native types used for efficiency with no memory leaks.} \ptindent{3) Namespace eliminates loosely defined entities spread across different spectrums.} \ptindent{4) Report card is the scratch pad of evolutionary results used throughout all the symbol table phrases.} @ How to expand the symbol table. @^How to expand the symbol table@> Defined as a cweb macro, |hash_table_size| is the size of the table expressed as a prime number. If you need to enlarge it, use a Google search on prime number and find the Java applelet and run it. Adjust this macro definition and re-generate the project symbol table in Visual Studio which creates a new ctangle |yacco2_stbl.cpp| c++ file. This file then needs to be recompiled. See |How to compile| @^How to compile the symbol table@> above. @*2 How to run the test. @^How to run the test@> Create the following DOS batch file:\fbreak rem file: t1.bat\fbreak @@ECHO OFF\fbreak rem testdriver batch harness\fbreak rem first parm indicated the option to test\fbreak rem 1 - cmd line only\fbreak rem 2 - cmd line then P1 lex\fbreak rem 3 - cmd line then P3 syntax\fbreak rem 4 - runs the symbol table test\fbreak echo on\fbreak cd $\backslash$yacco2$\backslash$testdriver$\backslash$debug\fbreak rem echo Test out symbol table\fbreak testdriver 4 $\backslash$yacco2$\backslash$compiler $\backslash$grammars$\backslash$fsm\_class\_phrase6.dat\fbreak Now double click on the batch file created ``t1.bat''. NT will launch the batch process and execute the ``testdriver'' with its passed parameters. The appropriate tests are displayed on the console of the batch process. The test run should produce the following approximate facsimile:\fbreak C:$\backslash$yacco2$\backslash$testdriver$>$rem file: t1.bat\fbreak C:$\backslash$yacco2$\backslash$testdriver$>$ rem note: remed out lines that have been tested okay\fbreak C:$\backslash$yacco2$\backslash$testdriver$>$ cd $\backslash$yacco2$\backslash$testdriver$\backslash$debug\fbreak C:$\backslash$yacco2$\backslash$testdriver$\backslash$Debug$>$ rem echo test out symbol table\fbreak C:$\backslash$yacco2$\backslash$testdriver$\backslash$Debug$>$ testdriver 4 /yacco2/compiler/grammars/fsm\_class\_phrase6.dat\fbreak Prime number used: 20011\fbreak Test zero len key\fbreak zero\_len\_name: zero-len-symbol\fbreak 1st entry test\fbreak duplicate entry test\fbreak Error duplicate entry: dup-entry-in-sym-table\fbreak Same key address test\fbreak Error same key address: same-address-as-key-in-symbol-table\fbreak Fill up balance of table\fbreak Overflow test\fbreak Error overflow: sym-table-full\fbreak Find symbol test\fbreak Not found test\fbreak Get symbol by subscript test\fbreak Error subscript out of range: -1 subscript-out-of-range\fbreak Error not found if this message appears more than once: 17482\fbreak Remember 1 spot is left vacant in the algorithm!\fbreak Error subscript out of range: 20011 subscript-out-of-range\fbreak Clean up\fbreak enter any key then press Enter: x\fbreak @*2 Terminology. @^Terminology@> {\bf Symbol's literal} @^Symbol's literal@> --- a variable length array of characters delimited by the null character ``00''. In c++, it's a |const char*| type. It is the value used to find a symbol in Yacco2's symbol table expressed by the string of characters bracketed by the double quotes: example is |"#user-declaration"|. Go see Yacco2's terminal definitions. @*2 Facts. @^Facts@> Forget the tabloids and graffitti. Here is the gospel truth:\fbreak \ptindent{1) |1..prime number| is the total spots in the symbol table} \ptindent{2) algorithm leaves 1 spot vacant.} \ptindent{3) |prime number-1| is the number of spots for insertion.} \ptindent{4) |0..prime number-1| subscript range due to modulo calculation} \ptindent{5) Grammar symbols fall into lrk, err, raw character, or terminals} \ptindent{6) Each terminal group has its own namespace} The vacant spot only surfaces when the table is full and is pseudo-random in nature; this is due to value of the keys entered and where the hash function @^hash function@> positions them. @*2 Anatomy of symbol table. \fbreak \fbreak \convertMPtoPDF{"/usr/local/yacco2/diagrams/yacco2_stbl.1"}{1}{1} \fbreak \fbreak The above figure shows the symbol table (stbl) as an array of fixed size records of table$\_$entry having one entry in it. I use a quasi Pascal definition to describe the stbl cuz its more intuitively descriptive than its ``c++'' cousin. A hash function on the symbol's literal name is used where it stores the key in its own character pool (char$\_$pool). Access to the symbol table is thru the hash function or by subscript. \fbreak \fbreak Various table entries:\fbreak \fbreak \convertMPtoPDF{"/usr/local/yacco2/diagrams/yacco2_stbl.2"}{1}{1} @*2 Create header file for symbol table environment.\fbreak Please note the appropriate include files that have been generated by other Yacco2's |cweb| sub-systems. @(yacco2_stbl.h@>= @h #ifndef yacco2_stbl_ #define yacco2_stbl_ 0 #include "yacco2.h" #include "yacco2_err_symbols.h" #include "yacco2_terminals.h" #include namespace yacco2_stbl{ using namespace NS_yacco2_terminals; using namespace yacco2; extern table_entry stbl[hash_table_size]; void hash_fnct(T_sym_tbl_report_card& Report,const char& Key); extern void add_sym_to_stbl@/ (T_sym_tbl_report_card& Report ,const char& Name@/ ,yacco2::CAbs_lr1_sym& Sym@/ ,table_entry::defined_or_used_typ Why@/ ,table_entry::entry_typ What); extern void find_sym_in_stbl(T_sym_tbl_report_card& Report,const char& Name); extern void test_program(); extern void get_sym_entry_by_sub(T_sym_tbl_report_card& Report,int Sub); extern char char_pool_[char_pool_size]; extern int char_pool_idx_; }; #endif @ Accrue source code. @= table_entry yacco2_stbl::stbl[hash_table_size] = {table_entry()};@/ char yacco2_stbl::char_pool_[char_pool_size] = {};@/ int yacco2_stbl::char_pool_idx_(0);@/ @ @= yacco2_stbl::hash_fnct(Report,Name); @ Include header file. @= #include "yacco2_stbl.h" @** Notes regarding hashing.\fbreak @^hash function@> Due to the variable length of a symbol's name, the hash function goes across the character list whereby each character's value is added to the previous calculated hashed value shifted by some amount and then divided by the |hash_table_size| to get the remainder. Moduloing per character prevents a potential overflow. The reason for not making the hash function external to the world is that its output is not universal. The calculated value is just the starting seed for the |add_sym_to_stbl| to find a vacant spot. Hence it is locally accessible by |yacco2_stbl|'s namespace procedures. Constraints:\fbreak \ptindent{ip1: report card to contain the result} \ptindent{ip2: symbol's literal} Errors:\fbreak \ptindent{1) Zero len symbol} A potential error could occur if the symbol literal is not delimited by a ``00'' the null character. This could lead to an address violation. As Yacco2 creates this using the c++ string facility, this type of error can only be caused by the c++ compiler. Originally I had programmed an artificial maximun len of 512 characters but decided that this was overkill. @d hash_table_size 20011 /* prime of course */ @d hash_table_full hash_table_size - 1 @d str_delimiter '\x00' @d char_pool_size 100*1024 @*2 The hash procedure code: |hash_fnct|. The length of key includes the delimiter. This allows the complete key to be moved into the character pool. @+= void yacco2_stbl::hash_fnct(T_sym_tbl_report_card& Report,const char& Key) { Report.pos_ = -1; Report.key_len_ = strlen(&Key); if(Report.key_len_ == 0) { Report.status_ = T_sym_tbl_report_card::failure; Report.err_entry_ = new Err_zero_len_sym; return; } Report.key_len_ += 1; const char* k = &Key; Report.pos_ = *k++; for(;*k != 0;++k) { // walk the plank characters Report.pos_ = ((Report.pos_ *257)+*k) % hash_table_size; } Report.status_ = T_sym_tbl_report_card::okay; } @*2 Add symbol to table routine: |add_sym_to_stbl|.\fbreak |Algorithm L|. Watch for the subtlety of the algorithm. It leaves one spot vacant so as to not check upfront on every entry whether the table is full. This check is done when the item is to be inserted which can produce a table overflow error. I was caught by this due to my assumptions going into the reading of the algorithm as opposed to the remarks on L4 which refines the assumptions to stop the looping due to the one spot always left empty. Again I prefer the upfront statement of expectations but this was done for teaching purposes. As C is not restriced to label length as in MIX, I use the label name to indicate intent rather than MIX's letter-number combination. The ``leave one spot vacant'' optimization benefits finding an entry. Constraints:\fbreak \ptindent{ip1: report card to contain the result} \ptindent{ip2: symbol's literal} \ptindent{ip3: symbol object} \ptindent{ip4: reason for entry --- defined or used} \ptindent{ip5: what type of symbol --- terminal or rule} Errors:\fbreak \ptindent{1) duplicate entry} \ptindent{3) hash function errors} \ptindent{4) same key address between new symbol and item in symbol table} @+= void yacco2_stbl::add_sym_to_stbl@/ (T_sym_tbl_report_card& Report@/ ,const char& Name@/ ,yacco2::CAbs_lr1_sym& Sym@/ ,table_entry::defined_or_used_typ Why@/ ,table_entry::entry_typ What) @/ { static int guest_cnt(0); hash: @; @; compare: @; advance: @; insert: @; } @ Check the report card for appropriate behavior. @= if(Report.status_ == T_sym_tbl_report_card::failure) return;@/ if(Report.status_ == T_sym_tbl_report_card::fatal) return;@/ @ Find a free spot for the symbol. @= table_entry* te = &yacco2_stbl::stbl[Report.pos_]; int r(0); if(te->vacant_ == true) goto insert; r = strcmp(te->key_,&Name); if(r == 0){ Report.status_ = T_sym_tbl_report_card::failure; Report.err_entry_ = new Err_dup_entry_in_sym_table; Report.tbl_entry_ = te; Report.key_len_ = 0; return; } @ Linear probing to find spot. @= --Report.pos_; if(Report.pos_ < 0) Report.pos_ += hash_table_size; goto compare; @ Insert the symbol into the table. @= if(guest_cnt == hash_table_full){@/ Report.status_ = T_sym_tbl_report_card::failure; Report.err_entry_ = new Err_sym_tbl_full; return; } if(char_pool_idx_ + Report.key_len_ > char_pool_size){@/ Report.status_ = T_sym_tbl_report_card::failure; Report.err_entry_ = new Err_sym_tbl_char_pool_full; return; } ++guest_cnt; te->pos_ = Report.pos_; te->vacant_ = false; char* key_name = &char_pool_[char_pool_idx_]; strncpy(key_name,&Name,Report.key_len_); te->key_ = key_name; te->key_len_ = Report.key_len_; char_pool_idx_+=Report.key_len_; te->symbol_ = &Sym; te->type_ = What; if(Why == table_entry::used) te->used_ = true; else te->defined_ = true; Report.status_ = T_sym_tbl_report_card::okay; Report.action_ = T_sym_tbl_report_card::inserted; Report.tbl_entry_ = te; @*2 Find symbol in table routine: |find_sym_in_stbl|.\fbreak This routine's only purpose is to verification whether a symbol exists in the symbol table. It allows the programmer to be proactive rather than reactive to a symbol table problem. How so? Reactive would be when a duplicate error occured when adding a symbol to the table. As an error has been posted to the report card, now the programmer has to deal with this situation requiring cleanup if corrective action is taken that is different to the posted error. If the error is acceptable, then one can piggy back off the posted error. Having options is what it's all about. Constraints:\fbreak \ptindent{ip1: report card to contain the result} \ptindent{ip2: symbol's literal} Errors:\fbreak \ptindent{1) hash function errors} @+= void yacco2_stbl::find_sym_in_stbl(T_sym_tbl_report_card& Report,const char& Name) { hash: @; @; compare: @; advance: @; } @ Find a free spot for the symbol. @+= Report.tbl_entry_ = &yacco2_stbl::stbl[Report.pos_]; table_entry* te = Report.tbl_entry_; if(te->vacant_ == true) { Report.status_ = T_sym_tbl_report_card::okay; Report.action_ = T_sym_tbl_report_card::not_fnd; return; } int r = strcmp(te->key_,&Name); if(r == 0){ Report.status_ = T_sym_tbl_report_card::okay; Report.action_ = T_sym_tbl_report_card::fnd; return; } @ Find Advance to next. Linear probing to find spot. @= --Report.pos_; if(Report.pos_<0) Report.pos_ += hash_table_size; goto compare; @*2 Get table entry by subscript: |get_sym_entry_by_sub|. Constraints:\fbreak \ptindent{ip1: report card to contain the result} \ptindent{ip2: subscript where range 0..|hash_table_size| - 1} Errors:\fbreak \ptindent{1) subscript out of range} @+= void yacco2_stbl::get_sym_entry_by_sub(T_sym_tbl_report_card& Report,int Sub) { Report.pos_ = Sub; if((Sub < 0) || (Sub > hash_table_full)){@/ Report.status_ = T_sym_tbl_report_card::failure; Report.err_entry_ = new Err_subscript_out_of_range; return; } Report.tbl_entry_ = &yacco2_stbl::stbl[Sub]; Report.status_ = T_sym_tbl_report_card::okay; if(Report.tbl_entry_->vacant_ == false) Report.action_ = T_sym_tbl_report_card::fnd; else Report.action_ = T_sym_tbl_report_card::not_fnd; } @** Test program: |test_program|. A generator manufactures all the symbol literals up to table full. The literate routines below describe well each type of test with their appropriate range of errors. @+= void yacco2_stbl::test_program()@/ { @; @; @; @; @; @; @; @; } @ Initial variables for test. @= const char* zero_len_name = ""; T_eol* a_good_sym = new T_eol; int cnt(0); T_sym_tbl_report_card report_card; std::cout << "Prime number used: " << hash_table_size << std::endl; std::cout << "Test zero len key" << std::endl; add_sym_to_stbl(report_card,*zero_len_name,*a_good_sym,table_entry::used,table_entry::terminal); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error zero_len_name: " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; }else{ std::cout << "Should have been a zero_len_name error " << std::endl; } @ Test 1st entry, duplicate, and same key address. @= char buf[256]; std::string* sbuf[hash_table_size];// |sbuf[0]| symbolic vacant spot sbuf[1] = new string("1"); const char* f1st_dup = "1"; std::cout << "1st entry test" << std::endl; add_sym_to_stbl(report_card,*sbuf[1]->c_str(),*a_good_sym,table_entry::defed,table_entry::rule); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error first entry failed: " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; } std::cout << "duplicate entry test" << std::endl; add_sym_to_stbl(report_card,*f1st_dup,*a_good_sym,table_entry::defed,table_entry::rule); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error duplicate entry: " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; }else{ std::cout << "Should have been a duplicate entry error " << std::endl; } cout << "Same key address test " << endl; add_sym_to_stbl(report_card,*sbuf[1]->c_str(),*a_good_sym,table_entry::defed,table_entry::rule); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error same key address: " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; }else{ std::cout << "Should have been a same key address error " << std::endl; } @ Fill up balance of table. @= const char* numeric_key = "%i"; std::cout << "Fill up balance of table" << std::endl; for(cnt=2;cnt <= hash_table_full;++cnt){ sprintf(buf,numeric_key,cnt); sbuf[cnt] = new string(buf); add_sym_to_stbl(report_card,*sbuf[cnt]->c_str() ,*a_good_sym,table_entry::defed,table_entry::rule); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error entry failed: " << cnt << " " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; } } @ Test table overflow error. @= const char* overflow_key = "overflow"; std::cout << "Overflow test" << std::endl; add_sym_to_stbl(report_card,*overflow_key,*a_good_sym,table_entry::defed,table_entry::rule); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error overflow: " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; }else{ std::cout << "Should be overflow error" << std::endl; } @ Test find symbol in table. It walks all the keys just entered whose range is 1..|hash_table_full|. The last test is ``not found'' on the overflow key. @= std::cout << "Find symbol test" << std::endl; for(cnt=1;cnt <= hash_table_full;++cnt){ find_sym_in_stbl(report_card,*sbuf[cnt]->c_str()); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error Should have been found: " << cnt << " " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; }else{ if(report_card.action_ == T_sym_tbl_report_card::not_fnd){ std::cout << "Error Action should have been found: " << cnt << std::endl; } } } std::cout << "Not found test" << std::endl; find_sym_in_stbl(report_card,*overflow_key); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Should not be an error: " << cnt << " " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; }else{ if(report_card.action_ == T_sym_tbl_report_card::fnd){ std::cout << "Error should be not found: " << std::endl; } } @ Test get entry by subscript. Walk thru -1..|hash_table_size|. This should produce 2 out-of-bound errors: one at each end of the spectrum traveled --- -1 and |hash_table_full|. The range 0..|hash_table_full| should have all spots occupied except 1 vacant spot as explained in the @^Facts@> Facts section. @= std::cout << "get symbol by subscript test" << std::endl; for(cnt=-1;cnt <= hash_table_size;++cnt){ get_sym_entry_by_sub(report_card,cnt); if(report_card.status_ == T_sym_tbl_report_card::failure){ std::cout << "Error subscript out of range: " << cnt << " " << report_card.err_entry_->id__ << std::endl; delete report_card.err_entry_; }else{ if(report_card.action_ == T_sym_tbl_report_card::not_fnd){ std::cout << "Error not found if this message appears more than once: " << cnt << std::endl; std::cout << "Remember 1 spot is left vacant in the algorithm!" << std::endl; } } } @ Cleanup. @= std::cout << "Clean up" << std::endl; delete a_good_sym; for(cnt=1;cnt <= hash_table_full;++cnt){ delete sbuf[cnt]; } @* Notes: bric-a-brac. @ Remove key address dependency and copy key to a global |char_pool_| area.\fbreak Originally I used the passed key's address also as storage and guarded against the same key address passed by an error. Now what if one drains its info from cin into a common storage area? So I am more robust but with more overhead. The symbol table can now be used elsewhere. The |CAbs_lr1_sym| value item would be changed to some generic or specific object. Possibly |void*| with type casting or adjusted to the local requirements. \fbreak Improvement: 8 Mar. 2005 @* Index.