• "Bootstrapping yacc in yacc" -> "Bootstrapping yacc in lex"!

    From Rock Brentwood@21:1/5 to All on Sun Mar 14 17:54:23 2021
    It's a recurrent question that's come up in other forums "can yacc be bootstrapped in yacc?" Now, I'm adding a twist.

    I'll repeat one of my recent replies here. In the syntax for yacc files, laid out by the POSIX standard, there is no mandatory semi-colon at the ends of rules, so an extra look-ahead is required to determine whether an identifier
    is followed by a colon. If so, then this indicates the left-hand side of a new rule.

    A grammar rule has the form

    left-hand-side ":" stuff on the right optional ";"'s.

    If you see a ":" in the middle of the rules on the right, then you've actually sneaked on over into the *next* rule.

    Bison hacks the syntax, by making left-hand-side + ":" into a single token.

    It may, in fact, be possible to parse yacc *even with* this issue, without having to hack yacc grammar like bison did - by just simply not using yacc at all, but using only lex!

    The grammar specified by POSIX is actually a *regular* grammar that specifies
    a finite state transducer. The transducer is deterministic, precisely because the identifier-colon ambiguity can be resolved. Therefore, if you set up the right kind of finite state machine in lex making use of lex's start condition facility in the right way, then you should be able to parse a yacc grammar
    with lex and to bootstrap an implementation of yacc with just lex.

    A specification possibly suitable for utilities like lex (using UNICODE, in UTF-8 format) - working off the POSIX syntax - can be found in comp.theory
    here
    https://groups.google.com/g/comp.theory/c/jSkl9ey7iM8

    It comp.compilers can handle UTF-8, I can repeat it here, as well.
    [Yes, messages are in UTF-8. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Rock Brentwood on Mon Mar 15 02:37:16 2021
    On 2021-03-15, Rock Brentwood <rockbrentwood@gmail.com> wrote:
    It's a recurrent question that's come up in other forums "can yacc be bootstrapped in yacc?" Now, I'm adding a twist.

    I'll repeat one of my recent replies here. In the syntax for yacc files, laid out by the POSIX standard, there is no mandatory semi-colon at the ends of rules, so an extra look-ahead is required to determine whether an identifier is followed by a colon. If so, then this indicates the left-hand side of a new
    rule.

    A grammar rule has the form

    left-hand-side ":" stuff on the right optional ";"'s.

    If you see a ":" in the middle of the rules on the right, then you've actually
    sneaked on over into the *next* rule.

    Bison hacks the syntax, by making left-hand-side + ":" into a single token.

    You could simply allow rules of this form

    ":" right side ...


    With a semantic restrction that this must be preceded by a rule that
    is not terminated with a semicolon, whose last element is a symbol:

    blah ":" previous rule material ending in symbol /* no semicolon */

    ":" next rule ";"

    Then we make the AST transformation of moving "symbol" to be the head
    of the following rule:

    -->


    blah ":" previous rule material ending in

    symbol ":" next rule ";"

    Situations where it's not a symbol, or the prior rule has ended in
    a semicolon (which is thus followed by a colon) are not syntax
    errors, but are diagnosed semantically.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)