/*
 Copyright Dave Bone 1998 - 2014
 All Rights Reserved.
 No part of this document may be reproduced without written consent from the author.

 FILE:    eol.lex
 Dates:   17 Juin 2003
 Purpose: end-of-line recognizer
*/
/@
@i "/usr/local/yacco2/copyright.w"
@** Eol Thread.\fbreak
This thread's claim to fame is its end-of-line matching for the Unix, Mac, and Windows variants.
See the |Rdelimiters| rule for how each variant is recognized.

The other point of interest is its use of the meta terminal \invisibleshift.
Without it in the subrule, the traditional approach is to subtract ``x0a'' from |eolr|,
which represents ``all terminals'' in the Terminal alphabet,
within the lookahead expression of ``parallel-thread-function''.
This prevents the shift / reduce conflict caused by the two subrules sharing the common prefix ``x0d''.
The subtraction works, but the generated lookahead set is very large
because it holds nearly every terminal in the Terminal alphabet.

The \invisibleshift approach adds a shift to the subrule but leaves only ``eolr'' in its reduce set,
whereas the traditional approach must binary search a lookahead set of approximately .5k terminals
to see whether the current token is a member.
Under the current set implementation this is expensive:
the partition number is binary searched first, followed by the element within the 8 member set.

How does \invisibleshift work?
Being a meta terminal, it is not part of the token stream.
It is one of the parsing conditionals that the parser tests for by its presence
within the finite automaton's current state.
\allshift is another example of such a meta terminal.
Use a global pointer to it, as it is just an indicator:
a new / delete cycle for each use is too expensive.
@/
fsm
(fsm-id "eol.lex",fsm-filename eol,fsm-namespace NS_eol
,fsm-class Ceol
,fsm-version "1.0"
,fsm-date "17 Juin 2003",fsm-debug "false"
,fsm-comments "end-of-line recognizer --- Unix, Mac, and Microsoft supported styles.")
parallel-parser
(
  parallel-thread-function
    TH_eol
  ***
  parallel-la-boundary
    eolr
  ***
)
@"/usr/local/yacco2/compiler/grammars/yacco2_T_includes.T"
/@
@** Rules.\fbreak
@/
rules{
Reol (){
  -> Rdelimiters {
    /@
    Return the |eol| token back to the caller.
    @/
    op
      CAbs_lr1_sym* sym = NS_yacco2_terminals::PTR_eol__;
      sym->set_rc(*rule_info__.parser__->start_token__,__FILE__,__LINE__);
      RSVP(sym);
    ***
  }
}
/@
@** |Rdelimiters| rule.\fbreak
@/
Rdelimiters
/@
The comments indicate the associated end-of-line variants.
The Mac subrule shows the use of the \invisibleshift to escape from
a shift / reduce conflict due to the common prefix ``x0d''.
As it is not in the token stream, it is a very effective way to deal with this type of conflict.
@/
() {
  -> /@ lf: Unix\fbreak@/ "x0a"
  -> /@ cr: Mac.
     Note use of the \invisibleshift to remove the shift / reduce conflict.
     This is caused by the general lookahead boundary of |eolr|. \fbreak @/
     "x0d" |.|
  -> /@ cr:lf Windows\fbreak@/ "x0d" "x0a"
}
}// end of rules