• C scanner

    From luser droog@21:1/5 to All on Mon Dec 13 15:20:58 2021
    Sticking with PS version 12 of the parser combinators, I finished the
    usual 3 examples (regex, PS scanner, JSON parser) and they seemed
    pretty good and concise. So I translated my C scanner over from
    the C version 9. It looks pretty good to me, especially the helper
    function `tokendef`, which makes the parser add a tag to the return value.
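
    To make the tagging idea concrete without pulling in pc12.ps, here is a
    minimal sketch in plain PostScript. The name `tagged` is hypothetical and
    is not part of the library; the real `tokendef` goes through `using`,
    `curry`, `cons` and `one`, which are not reproduced here. It only shows
    the general shape: wrap a procedure so that its result comes back as
    [ /name result ].

```postscript
% Hypothetical helper, not from pc12.ps.
% /name proc  tagged  proc'
% proc' runs proc, then wraps its single result as [ /name result ].
/tagged {
    [ 3 1 roll exch                     % [ proc /name
      /exec cvx exch                    % [ proc exec /name
      /exch cvx 2 /array cvx /astore cvx
    ] cvx                               % { proc exec /name exch 2 array astore }
} def

/integer { 3 7 add } tagged exec ==     % prints [/integer 10]
```

    The tag is baked into the new procedure when it is built, so the wrapped
    procedure itself does not need to know its own name.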

    Wrapping one lazy-input function around another lazy-input function is
    just weird. It seems to work when I run it stepwise in my head, but it
    still looks weird the way it's written. It makes more sense when you
    look at how `lazy-input` builds the function. But that part isn't new so
    I won't include it here.
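
    For anyone who hasn't seen the earlier posts, the wrapping can be
    sketched in plain PostScript. This is not how `lazy-input` in pc12.ps is
    written; the names `lz-first`, `lz-next`, `ints-from` and `lz-map` are
    made up for illustration. A node is [ head tail ], where tail is a
    procedure with its arguments baked in, and `lz-map` builds a second lazy
    stream whose tail forces the first one only when asked.

```postscript
% Hypothetical names; a sketch of one lazy stream wrapped around another.
% A node is a 2-element array [ head tail ]; tail is a procedure
% that builds the next node only when it is executed.
/lz-first { 0 get } def                 % node -> head
/lz-next  { 1 get exec } def            % node -> next node (forces the tail)

/ints-from {                            % n -> [ n { n+1 ints-from } ]
    dup 1 add                           % n n+1
    [ exch /ints-from cvx ] cvx         % n { n+1 ints-from }
    2 array astore
} def

/lz-map {                               % node proc -> node'
    1 index lz-first 1 index exec       % node proc head'
    3 1 roll                            % head' node proc
    [ 3 1 roll                          % head' [ node proc
      /lz-next cvx exch                 % head' [ node lz-next proc
      /lz-map cvx ] cvx                 % head' { node lz-next proc lz-map }
    2 array astore                      % [ head' { ... } ]
} def

1 ints-from { dup mul } lz-map
dup lz-first ==                         % prints 1
lz-next dup lz-first ==                 % prints 4
lz-next dup lz-first ==                 % prints 9
pop
```

    The outer tail holds the inner node, so nothing past the current element
    of either stream is computed until `lz-next` is called.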

    The big idea is at the bottom. Calling `token-input` with a string-input
    and 2 zeros gives you a lazy stream of tagged token structures.
    Calling `string-input` needs its own 2 zeros. So there's a lot of zeros
    to put 'em together.
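
    Judging from the `r c` in the code and the [0 6] pairs in the output
    below, the zeros look like a starting row and column. Here is a rough
    sketch of a position-tagged character stream in plain PostScript, under
    that assumption. `pos-chars` is a hypothetical stand-in, not the real
    `string-input`, and both the end-of-input value (null) and the newline
    handling are guesses.

```postscript
% Hypothetical stand-in for a position-tracking character stream.
% row col str  pos-chars  [ [ (ch) [ row col ] ] { row' col' (rest) pos-chars } ]
% An empty string yields null (an assumption made for this sketch).
/pos-chars {
    dup length 0 eq {
        pop pop pop null
    }{
        3 dict begin
        /s exch def /c exch def /r exch def
        [ s 0 1 getinterval [ r c ] ]           % head: [ (ch) [ r c ] ]
        [ s 0 get 10 eq                         % newline starts a new row
              { r 1 add 0 }{ r c 1 add } ifelse
          s 1 s length 1 sub getinterval        % rest of the string
          /pos-chars cvx ] cvx                  % tail: { r' c' (rest) pos-chars }
        2 array astore
        end
    } ifelse
} def

0 0 (a\nb) pos-chars
dup 0 get ==                            % prints [(a) [0 0]]
1 get exec 1 get exec 0 get ==          % prints [(b) [1 0]]
```

    A token stream layered on top would need its own starting position as
    well, which is where the second pair of zeros would come from.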

    %errordict/typecheck{ps pe quit}put
    (pc12.ps)run {
        tokendef{ 1 index cvlit { exch cons one } curry using def }
        cvsstr{ dup length string cvs }
        strcat{ 2 copy length exch length add string % a b s
                3 2 roll 2 copy 0 exch putinterval % b s a
                length 3 2 roll 3 copy putinterval pop pop }
        prefix{ exch strcat cvn }
    } pairs-begin

    /keywords {
    int char
    float double struct
    auto extern
    register static
    goto return sizeof
    break continue
    if else
    for do while
    switch case default
    } cvlit def
    keywords { cvsstr dup (k_) prefix exch str tokendef } forall
    /keyword-names keywords { cvsstr (k_) prefix } map def

    /symbols {
    star (*) plusplus (++) plus (+) dot (.)
    arrow (->) minusminus (--) minus (-)
    bangeq (!=) bang (!) tilde (~)
    ampamp (&&) amp (&) eqeq (==) equal (=)
    caret (^) pipepipe (||) pipe (|)
    slant (/) percent (%)
    ltlt (<<) lteq (<=) less (<)
    gtgt (>>) gteq (>=) greater (>)
    lparen (\() rparen (\))
    comma (,) semi (;) colon (:) quest (?)
    lbrace ({) rbrace (}) lbrack ([) rbrack (])
    } cvlit def
    symbols 2 { aload pop str tokendef } fortuple
    /symbol-names [ symbols 2 { first } fortuple ] def

    /assignops {
    pluseq (+=) minuseq (-=)
    stareq (*=) slanteq (/=) percenteq (%=)
    gtgteq (>>=) ltlteq (<<=)
    ampeq (&=) careteq (^=) pipeeq (|=)
    } cvlit def
    assignops 2 { aload pop str tokendef } fortuple

    /comment (/*) str (*) noneof many (*) char then some then (/) then def
    /space ( \t\n) anyof //comment alt many def

    /alpha_ (a)(z)range (A)(Z)range alt (_)char alt def
    /digit (0)(9)range def
    /identifier //alpha_ //alpha_ //digit alt many then tokendef

    /integer //digit some tokendef
    /floating //digit some (.) char then //digit many then
    (.) char //digit some then alt
    (eE) anyof (+-) anyof maybe then //digit some then maybe then tokendef

    /escape (\\) char
    //digit //digit maybe then //digit maybe then
    ('"bnrt\\) anyof alt then def
    /char_ //escape ('\n) noneof alt def
    /schar_ //escape ("\n) noneof alt def
    /character (') char //char_ then (') char then tokendef
    /astring (") char //schar_ many then (") char then tokendef

    /constant //floating //integer alt //character alt //astring alt tokendef

    /symbolic [ keyword-names {load} forall
    symbol-names {load} forall
    assignops 2{first load} fortuple
    counttomark 1 sub {alt} repeat exch pop def

    /ctoken //space //constant //symbolic alt //identifier alt xthen def

    /token-input{r c in}
    { in dup //ctoken exec +not-ok { true }{ exch pop second xs-x false } ifelse }
    { 4 3 roll } % xs [x[r c]] r' c' -> [x[r c]] r' c' xs
    { token-input } lazy-input def

    0 0 ( aname another) string-input //ctoken exec report
    0 0 ( ++ / * ) string-input //ctoken exec report
    0 0 ( 37,x,y ) string-input //ctoken exec report
    0 0 0 0 ( 37,x,y{12+q;} ) string-input token-input
    dup first ==
    next dup first ==
    next dup first ==
    next dup first ==
    next dup first ==
    next dup first ==
    pc

    quit


    $ gsnd -q -dNOSAFER pc12ctok.ps
    OK
    [[/identifier [(a) (n) (a) (m) (e)]]]
    remainder:[[( ) [0 6]] {0 7 (another) string-input}]
    OK
    [[/plusplus [(+) (+)]]]
    remainder:{0 3 ( / * ) string-input}
    OK
    [[/constant [[/integer [(3) (7)]]]]]
    remainder:[[(,) [0 3]] {0 4 (x,y ) string-input}]
    [[[/constant [[/integer [(3) (7)]]]]] [0 0]]
    [[[/comma (,)]] [0 1]]
    [[[/identifier (x)]] [0 2]]
    [[[/comma (,)]] [0 3]]
    [[[/identifier (y)]] [0 4]]
    [[[/lbrace ({)]] [0 5]]
    stack:
    [[[[/lbrace ({)]] [0 5]] {0 6 {0 8 (12+q;} ) string-input} token-input}]
