• Improved accuracy in diagnostics. Is it worthwhile?

    From Ev. Drikos@21:1/5 to All on Fri Mar 18 15:05:49 2022
    Hello,

    This is mainly a parsing question, but it's also Fortran related.

    When I run a syntax check with the command 'fcheck' on the code below,
    the error message doesn't contain a '(' in the expected tokens. This
    happens due to default actions, although the parser is basically LALR. A
    pure LALR parser wouldn't make reductions without examining the lookahead.

    Default actions are useful because they save a lot of space in parsing
    tables, at the cost of missing expected tokens in the error messages
    printed by the command 'fcheck'. This is the relevant BNF rule for the
    example given at the end of this message:

    implicit-stmt ::=
    IMPLICIT implicit-spec-list
    | IMPLICIT NONE [ ( [ implicit-none-spec-list ] ) ]
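
    A minimal sketch of the effect, assuming a hand-built table fragment for
    a simplified version of the rule above. The state numbers, the 'opt_spec'
    nonterminal and the function name are made up for illustration; these
    are not the tables that Syntaxis generates for 'fcheck':

```python
# Hypothetical table fragment (an assumption, not the real 'fcheck' tables) for:
#   stmt     ::= IMPLICIT NONE opt_spec ';'
#   opt_spec ::= <empty> | '(' ')'
# State 2 is the state reached after "IMPLICIT NONE" has been shifted.
ACTIONS = {
    2: {"(": ("shift", 3), ";": ("reduce", "opt_spec")},
    4: {";": ("shift", 5)},  # state entered once opt_spec has been reduced
}
GOTO = {(2, "opt_spec"): 4}

# With default actions, state 2 reduces the empty opt_spec on ANY token
# that has no explicit entry, without examining the lookahead.
DEFAULT_REDUCE = {2: "opt_spec"}


def expected_at_error(state, lookahead, use_defaults):
    """Return the tokens reported as expected when `lookahead` is erroneous."""
    while True:
        row = ACTIONS[state]
        if lookahead in row:
            raise ValueError("lookahead is valid in this state")
        if use_defaults and state in DEFAULT_REDUCE:
            # Only empty rules are reduced in this sketch, so nothing is
            # popped from the stack; we just follow the goto entry.
            state = GOTO[(state, DEFAULT_REDUCE[state])]
            continue
        return sorted(row)


print(expected_at_error(2, "?", use_defaults=True))
print(expected_at_error(2, "?", use_defaults=False))
```

    With the default reduction, the '?' is only detected after the empty
    'opt_spec' has been reduced, so just ';' is reported. Without it, the
    error is detected one state earlier and both '(' and ';' are reported.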


    Disabling default actions for the command 'fcheck' is fairly simple,
    just a button click in Syntaxis, but at the moment I can't estimate
    how many error messages would be improved, whereas a parsing table
    increase (50%) would be certain. The command 'fcheck' can be found at
    https://github.com/drikosev/Fortran

    So far, my approach has been that improved diagnostics shouldn't slow
    down the processing of correct programs. Is it worthwhile to improve
    diagnostics by disabling default actions in a LALR parser?


    Thanks,
    Ev. Drikos

    ----------------------------------------------------------------------
    $ cat default-actions.f90 && fcheck default-actions.f90
    IMPLICIT NONE ? (type, external)
    PRINT *, "Only ';', not a '(', in the expected tokens in diagnostics."
    END

    default-actions.f90:1: error: syntax:Unexpected: '?'. Expected: ";".

    Parsed with Errors: default-actions.f90
    $

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Daniel Feenberg@21:1/5 to Ev. Drikos on Fri Mar 18 06:18:08 2022
    On Friday, March 18, 2022 at 9:05:54 AM UTC-4, Ev. Drikos wrote:
    ...
    Default actions are useful because they save a lot of space in parsing tables, at the cost of missing expected tokens in the error messages
    printed by the command 'fcheck'. This is the relevant BNF rule for the example given at the end of this message:
    ...
    So far, my approach has been that improved diagnostics shouldn't slow
    down the processing of correct programs. Is it worthwhile to improve diagnostics by disabling default actions in a LALR parser?
    ...

    Thanks,
    Ev. Drikos

    .

    Improved diagnostics might reduce the number of compilations required
    before a program is correct. Hence the concern about improved diagnostics
    slowing down the processing of correct programs might be misplaced.
    Personally, I find good diagnostics a very pleasant surprise when
    encountered and a strong incentive to use that compiler.

    Daniel Feenberg

  • From gah4@21:1/5 to Ev. Drikos on Fri Mar 18 13:54:29 2022
    On Friday, March 18, 2022 at 6:05:54 AM UTC-7, Ev. Drikos wrote:

    This is mainly a parsing question, but it's also Fortran related.

    You might ask in comp.compilers, where parsing questions go.

    When I run a syntax check with the command 'fcheck' on the code below,
    the error message doesn't contain a '(' in the expected tokens. This
    happens due to default actions, although the parser is basically LALR. A
    pure LALR parser wouldn't make reductions without examining the lookahead.

    This is an interesting case. Since the ( isn't required, it isn't so obvious that the message should mention it.

    On the other hand, if a common error was to put in a letter and forget
    the (), then it is a nice reminder.

    Seems to me that the question is more user experience, and less
    parsing tables. Users can make some strange mistakes, and it is
    nice to help them in those cases. But you can't find all the cases.

  • From Ev. Drikos@21:1/5 to All on Tue Mar 22 08:42:06 2022
    On 18/03/2022 22:54, gah4 wrote:

    This is an interesting case. Since the ( isn't required, it isn't so obvious that the message should mention it.

    There are more vague cases (IMHO of course), e.g. the "." (dot) and the
    newline character (ASCII decimal code 10).

    - GNU Fortran, e.g., accepts the dot as an alternative to "%" only if the
    command line option "-fdec" has been specified (legacy support).

    - The newline character is an alternative to ";" in Fortran lines but
    the only character that ends an OpenMP statement in Fortran programs.

    Not sure if users would like to see all the above duplicate choices
    or just a "\n" instead of a "; or \n" and a "%" instead of ". or %".
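
    To picture the second option, here is a small sketch of a formatter
    that collapses interchangeable tokens into one representative before
    printing the expected list. The alias table and the function name are
    assumptions for illustration only; nothing like this is claimed to
    exist in 'fcheck':

```python
# Hypothetical alias table: token -> the single representative to display.
# ";" is shown as "\n" and "." (legacy "-fdec" spelling) is shown as "%".
ALIASES = {";": "\\n", ".": "%"}


def format_expected(tokens, collapse=True):
    """Join expected tokens with ' or ', optionally merging aliases."""
    shown = []
    for tok in tokens:
        rep = ALIASES.get(tok, tok) if collapse else tok
        if rep not in shown:  # keep first occurrence, preserve order
            shown.append(rep)
    return " or ".join(shown)


print(format_expected([";", "\\n"], collapse=False))
print(format_expected([";", "\\n"]))
print(format_expected([".", "%"]))
```

    The first call keeps both spellings ("; or \n"), while the collapsed
    calls print a single choice each. Whether the shorter list is less
    disturbing is exactly the open question.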

    ...

    Seems to me that the question is more user experience, and less
    parsing tables.


    IMHO, one has to figure out what programmers find less disturbing.
    Also, the command 'fcheck' is a LALR based parser, which means that the
    lookahead sets may contain spurious tokens (in a few cases, I hope).
