• Parser combinator rewrite status

    From luser droog@21:1/5 to All on Sun Dec 6 01:48:35 2020
    I started redesigning the Parser Combinators following the
    paper by Hutton where he describes adding error reports by
    rewriting everything to operate on a 3-state object. A parser
    can result in an [OK ...] state or it can be [Error ...] or [Fail ...].
    This way you can record some local state at the point that
    an error is discovered and propagate that back down the
    call stack wrapping more local knowledge at each step.
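
    To make the wrapping step concrete, here is a minimal PostScript sketch
    of the idea. It is not the code from pc11a.ps: the names ok, fail and
    wrap, the [/OK value rest] and [/Fail report] layouts, and the report
    contents are all assumptions made up for illustration.

```postscript
% Hypothetical sketch: these names and array layouts are assumptions,
% not taken from the actual parser combinator source.

% value rest  ok    ->  [/OK value rest]
/ok   { [ /OK 4 2 roll ] } def

% report      fail  ->  [/Fail report]
/fail { [ /Fail 3 -1 roll ] } def

% result context  wrap  ->  result'
% An OK result passes through untouched.  For Fail (or Error) the report
% is replaced by [context report], so every caller on the way back can
% add what it knew at that point.
/wrap {
    exch dup 0 get /OK eq
    { exch pop }                      % context result -> result
    {
        dup 0 get                     % context result tag
        3 1 roll                      % tag context result
        1 get                         % tag context report
        2 array astore                % tag [context report]
        2 array astore                % [tag [context report]]
    } ifelse
} def

% Innermost failure first, then each enclosing step adds its context.
[ (expected c) (not satisfied) ] fail
[ (after) (b) ] wrap
[ (after) (a) ] wrap
==                                    % print the nested report
```

    Run under Ghostscript, this should print one nested array with the
    innermost report at the deepest level and the outermost context first.
    A real implementation would presumably also carry the input position
    along in the report.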

    So for the simple tests I've put it to, viz.

    0 0 (abcd\ne) string-input
    (abc) str exec
    report
    pc

    0 0 (abed\ne) string-input
    (abd) str
    (abc) str alt exec
    report
    pc

    0 0 (abed\ne) string-input
    (a)(c) range
    (a)(c) range then
    exec
    report
    pq

    That is two success tests and one failure test. I get the following
    (promising) output:

    $ gsnd -q -dNOSAFER pc11a.ps
    OK
    [(a) (b) (c)]
    remainder:[[(d) [0 3]] {0 4 (\ne) string-input}]
    stack:
    :stack
    Fail
    [[(after) [(a) []]] [[(after) [(b) []]] [[{(c) eq} (not satisfied)] [[(e) [0 2]] {0 3 (d\ne) string-input}]]]]
    stack:
    :stack
    OK
    [(a) (b)]
    remainder:{0 2 (ed\ne) string-input}
    stack:
    :stack

    The "stack:...:stack" parts are just demonstrating that the stacks
    are left nice and clean after each test. And the Failure report has
    the pieces for a nice error message, but it's not quite there yet IMO.

    I suppose the next logical step is to try to build a regex engine or
    a tokenizer with it. To be continued...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jeffrey H. Coffield@21:1/5 to luser droog on Mon Dec 7 09:58:18 2020
    Not sure what you are trying to accomplish here, but some work was done
    parsing PostScript using Antlr which has some interesting error
    handling. I have just started using Antlr4 myself and at some point will probably look at using it to parse PostScript forms.

    Jeff Coffield

  • From luser droog@21:1/5 to Jeffrey H. Coffield on Sat Dec 12 01:23:14 2020
    On Monday, December 7, 2020 at 11:58:21 AM UTC-6, Jeffrey H. Coffield wrote:
    Not sure what you are trying to accomplish here, but some work was done parsing PostScript using Antlr which has some interesting error
    handling. I have just started using Antlr4 myself and at some point will probably look at using it to parse PostScript forms.

    Jeff Coffield

    I want to write a parser *in* PostScript, not just *for* PostScript. And I
    think the parser combinators make for a nice way to do it, despite the
    difficulty in translating them to a non-lazy language. But I did a whole
    bunch of work before realizing that the lack of error messages made the
    whole thing rather unusable. It's quite possible I'm overlooking a simpler
    way to go about it. It's also possible that PostScript isn't the right
    tool for the job.

    But in the interim, I've had some success with implementing lazy evaluation
    and translating the result to working (complicated) C. So if I can get a prototype that produces nice error messages easily from some kind of
    annotated grammar, I can translate that to C and have a nice interface
    for writing parsers in C.
