• What to use for finding as many syntax errors as possible.

    From Antoon Pardon@21:1/5 to All on Sun Oct 9 12:09:17 2022
    I would like a tool that tries to find as many syntax errors as possible
    in a python file. I know there is the risk of false positives when a
    tool tries to recover from a syntax error and proceeds but I would
prefer that over the current python strategy of quitting after the first
syntax error. I just want a tool for syntax errors. No style
enforcements. Any recommendations? -- Antoon Pardon

  • From Avi Gross@21:1/5 to antoon.pardon@vub.be on Sun Oct 9 11:49:55 2022
Antoon

There likely are such programs out there, but is there universal agreement
on how to figure out where a new safe zone of code starts, where error
checking can resume?

For example, a file full of function definitions might find an error in
function 1 and try to find the end of that function and resume checking the
next function. But what if a function defines local functions within it?
What if the mistake in one line of code could still allow checking the next
line rather than skipping it all?

My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away. If you spell a variable name
wrong when declaring it, a dozen uses of the right name may cause errors.
Should you fix the first or change all later ones?



    On Sun, Oct 9, 2022, 6:11 AM Antoon Pardon <antoon.pardon@vub.be> wrote:

    I would like a tool that tries to find as many syntax errors as possible
    in a python file. I know there is the risk of false positives when a
    tool tries to recover from a syntax error and proceeds but I would
prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style
enforcements. Any recommendations? -- Antoon Pardon
    --
    https://mail.python.org/mailman/listinfo/python-list


  • From Peter J. Holzer@21:1/5 to Antoon Pardon on Sun Oct 9 18:17:18 2022
    On 2022-10-09 12:09:17 +0200, Antoon Pardon wrote:
    I would like a tool that tries to find as many syntax errors as possible in
a python file. I know there is the risk of false positives when a tool tries to recover from a syntax error and proceeds but I would prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style enforcements. Any recommendations?

There seems to have been increased interest in good error recovery in
recent years. I thought I had bookmarked a bunch of projects, but the
    only one I can find right now is Lezer (https://marijnhaverbeke.nl/blog/lezer.html) which is part of the
    CodeMirror (https://codemirror.net/) editor. Python is listed as a
    currently supported language, so you might want to check that out.

    Disclaimer: I haven't used CodeMirror, so I can't say anything about
    its quality. The blog entry about Lezer was interesting, though.

    hp

    --
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

  • From Antoon Pardon@21:1/5 to All on Sun Oct 9 18:59:36 2022
    Op 9/10/2022 om 17:49 schreef Avi Gross:
    My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.

    At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error.

    --
    Antoon.

  • From Karsten Hilbert@21:1/5 to All on Sun Oct 9 19:23:41 2022
    Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon:

    Op 9/10/2022 om 17:49 schreef Avi Gross:
My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.

    At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits
    after having found one syntax error.

    But the point is: you can't (there is no way to) be sure the
    9+ errors really are errors.

    Unless you further constrict what sorts of errors you are
    looking for and what margin of error or leeway for false
    positives you want to allow.

    Karsten
    --
    GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

  • From Thomas Passin@21:1/5 to Peter J. Holzer on Sun Oct 9 12:59:09 2022
    https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it

    People seemed especially enthusiastic about the one-liner from jmd_dk.
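
    The gist of that approach is a compile-only check: parse the file
    without running it. A sketch of the same idea spelled out (a
    hypothetical script; like the normal parser, it still stops at the
    first error):

        import ast
        import sys

        def check(path):
            """Parse the file without executing it; report the first syntax error."""
            with open(path, encoding="utf-8") as f:
                source = f.read()
            try:
                ast.parse(source, filename=path)
            except SyntaxError as err:
                print(f"{path}:{err.lineno}:{err.offset}: {err.msg}")
                return 1
            return 0

        if __name__ == "__main__":
            sys.exit(check(sys.argv[1]))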

    On 10/9/2022 12:17 PM, Peter J. Holzer wrote:
    On 2022-10-09 12:09:17 +0200, Antoon Pardon wrote:
I would like a tool that tries to find as many syntax errors as possible in a python file. I know there is the risk of false positives when a tool tries to recover from a syntax error and proceeds but I would prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style enforcements. Any recommendations?

    There seems to have been increased interest in good error recovery over
    the last years. I thought I had bookmarked a bunch of projects, but the
    only one I can find right now is Lezer (https://marijnhaverbeke.nl/blog/lezer.html) which is part of the
    CodeMirror (https://codemirror.net/) editor. Python is listed as a
    currently supported language, so you might want to check that out.

    Disclaimer: I haven't used CodeMirror, so I can't say anything about
    its quality. The blog entry about Lezer was interesting, though.

    hp



  • From Peter J. Holzer@21:1/5 to Karsten Hilbert on Sun Oct 9 19:36:48 2022
    On 2022-10-09 19:23:41 +0200, Karsten Hilbert wrote:
    Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon:
    Op 9/10/2022 om 17:49 schreef Avi Gross:
    My guess is that finding 100 errors might turn out to be misleading. If you
    fix just the first, many others would go away.

    At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits
    after having found one syntax error.

    But the point is: you can't (there is no way to) be sure the
    9+ errors really are errors.

    As a human who knows Python in many cases you can be sure. Sometimes you
    aren't sure, then you leave that one for the next iteration. No big
    deal. This isn't the 1960s when you sent your punched cards in and got
    the result back next week. So neither the parser nor you need to be
    perfect. Just better than one error at a time.

    hp

    --
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

  • From Peter J. Holzer@21:1/5 to Thomas Passin on Sun Oct 9 19:29:32 2022
    On 2022-10-09 12:59:09 -0400, Thomas Passin wrote:
    https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it

    People seemed especially enthusiastic about the one-liner from jmd_dk.

    I don't think that one-liner solves Antoon's requirement of continuing
    after an error. It uses just the normal python parser so it has exactly
    the same limitations.

    Some of the mentioned tools may do what Antoon wants, though.

    hp

    --
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

  • From Antoon Pardon@21:1/5 to All on Sun Oct 9 19:51:12 2022
    Op 9/10/2022 om 19:23 schreef Karsten Hilbert:
    Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon:

    Op 9/10/2022 om 17:49 schreef Avi Gross:
My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.
At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error.
    But the point is: you can't (there is no way to) be sure the
    9+ errors really are errors.

    Unless you further constrict what sorts of errors you are
    looking for and what margin of error or leeway for false
    positives you want to allow.

Look, when I was at university we had to program in Pascal, and the
compiler we used continued parsing until the end. Sure, there were
times when, after a number of reported errors, the number of false
positives became so high that it was useless trying to find the
remaining true ones, but it was still more efficient to correct the
obvious ones than to correct only the first one.

I don't need to be sure. Even the occasional wrong correction
is probably still more efficient than quitting after the first
syntax error.

    --
    Antoon.

  • From Weatherby,Gerard@21:1/5 to All on Sun Oct 9 18:00:51 2022
    PyCharm.

Does a good job of separating "these are really errors" from "do you really mean that" warnings from "this word is spelled right".

    https://www.jetbrains.com/pycharm/

    From: Python-list <python-list-bounces+gweatherby=uchc.edu@python.org> on behalf of Antoon Pardon <antoon.pardon@vub.be>
    Date: Sunday, October 9, 2022 at 6:11 AM
    To: python-list@python.org <python-list@python.org>
    Subject: What to use for finding as many syntax errors as possible.

    I would like a tool that tries to find as many syntax errors as possible
    in a python file. I know there is the risk of false positives when a
    tool tries to recover from a syntax error and proceeds but I would
prefer that over the current python strategy of quitting after the first
syntax error. I just want a tool for syntax errors. No style
enforcements. Any recommendations? -- Antoon Pardon
-- https://mail.python.org/mailman/listinfo/python-list

  • From MRAB@21:1/5 to Antoon Pardon on Sun Oct 9 19:51:19 2022
    On 2022-10-09 18:51, Antoon Pardon wrote:


    Op 9/10/2022 om 19:23 schreef Karsten Hilbert:
    Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon:

    Op 9/10/2022 om 17:49 schreef Avi Gross:
My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.
At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error.
    But the point is: you can't (there is no way to) be sure the
    9+ errors really are errors.

    Unless you further constrict what sorts of errors you are
    looking for and what margin of error or leeway for false
    positives you want to allow.

Look, when I was at university we had to program in Pascal, and the
compiler we used continued parsing until the end. Sure, there were
times when, after a number of reported errors, the number of false
positives became so high that it was useless trying to find the
remaining true ones, but it was still more efficient to correct the
obvious ones than to correct only the first one.

I don't need to be sure. Even the occasional wrong correction
is probably still more efficient than quitting after the first
syntax error.

    When I did some programming in COBOL, a single omitted "." would
    completely confuse the compiler and it was best to fix that one error
    and then try again.

On the other hand, Turbo Pascal would also stop on the first error and
    put the cursor at the error position in the IDE, but as it compiled
    quickly, it wasn't a problem. It was no slower than it would've been if
    it had found multiple errors and you pressed a key to advance to the
    next error.

  • From Avi Gross@21:1/5 to antoon.pardon@vub.be on Sun Oct 9 15:18:19 2022
    Antoon, it may also relate to an interpreter versus compiler issue.

    Something like a compiler for C does not do anything except write code in
    an assembly language. It can choose to keep going after an error and start looking some more from a less stable place.

    Interpreters for Python have to catch interrupts as they go and often run
    code in small batches. Continuing to evaluate after an error could cause
    weird effects.

    So what you want is closer to a lint program that does not run code at all,
    or merely writes pseudocode to a file to be run faster later.

Many languages now have blocks of code that are not really evaluated
till later. Some code is built on the fly. And some errors are not errors
at first. Many languages let you not declare a variable before using it or allow it to change types. In some, the text is lazily evaluated as late as possible.

    I will say that often enough a program could report more possible errors. Putting your code into multiple files and modules may mean you could
    cleanly evaluate the code and return multiple errors from many modules as
    long as they are distinct. Finding all errors is not possible if recovery
    from one is not guaranteed.

Take a language that uses a semicolon to end a statement. If one is absent
there would usually be some error, though often reported on the next line. Your
evaluator could do an experiment and add a semicolon and try again. This
might work 90% of the time but sometimes the error was not ending the line
with a backslash to make it continue properly, or an indentation issue, or
even a spelling error. No guarantees.

    Is it that onerous to fix one thing and run it again? It was once when you handed in punch cards and waited a day or on very busy machines.

    On Sun, Oct 9, 2022, 1:03 PM Antoon Pardon <antoon.pardon@vub.be> wrote:



    Op 9/10/2022 om 17:49 schreef Avi Gross:
My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away.

    At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits
    after having found one syntax error.

    --
    Antoon.

    --
    https://mail.python.org/mailman/listinfo/python-list


  • From Avi Gross@21:1/5 to antoon.pardon@vub.be on Sun Oct 9 15:44:56 2022
I will say that those of us (meaning me) who express reservations are not arguing it is a bad idea to get more info in one sweep. Many errors come in bunches.

If I keep calling some function with the wrong number or type of arguments,
it may be the same in a dozen places in my code. The first error report may make me search for the other places so I fix it all at once. Telling me
where some instances are might speed that a bit.

    As long as it is understood that further errors are a heuristic and
    possibly misleading, fine.

But an error like setting the size of a fixed-length data structure to the wrong size may result in oodles of errors about being out of range that magically get fixed by one change. Sometimes too much info just gives you a headache.

    But a tool like you described could have uses even if imperfect. If you are teaching a course and students submit programs, could you grade the one
    with a single error higher than one with 5 errors shown imperfectly and
    fail the one with 600?

    On Sun, Oct 9, 2022, 1:53 PM Antoon Pardon <antoon.pardon@vub.be> wrote:



    Op 9/10/2022 om 19:23 schreef Karsten Hilbert:
    Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon:

    Op 9/10/2022 om 17:49 schreef Avi Gross:
My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away.
At this moment I would prefer a tool that reported 100 errors, which would
allow me to easily correct 10 real errors, over the python strategy which
quits after having found one syntax error.
But the point is: you can't (there is no way to) be sure the
9+ errors really are errors.

    Unless you further constrict what sorts of errors you are
    looking for and what margin of error or leeway for false
    positives you want to allow.

Look, when I was at university we had to program in Pascal, and the
compiler we used continued parsing until the end. Sure, there were
times when, after a number of reported errors, the number of false
positives became so high that it was useless trying to find the
remaining true ones, but it was still more efficient to correct the
obvious ones than to correct only the first one.

I don't need to be sure. Even the occasional wrong correction
is probably still more efficient than quitting after the first
syntax error.

    --
    Antoon.
    --
    https://mail.python.org/mailman/listinfo/python-list


  • From Antoon Pardon@21:1/5 to All on Sun Oct 9 21:46:13 2022
    Op 9/10/2022 om 21:18 schreef Avi Gross:
    Antoon, it may also relate to an interpreter versus compiler issue.

    Something like a compiler for C does not do anything except write code in
    an assembly language. It can choose to keep going after an error and start looking some more from a less stable place.

    Interpreters for Python have to catch interrupts as they go and often run code in small batches. Continuing to evaluate after an error could cause weird effects.

    So what you want is closer to a lint program that does not run code at all, or merely writes pseudocode to a file to be run faster later.

I just want a parser that doesn't give up on encountering the first syntax error. Maybe do some semantic checking like checking the number of parameters.
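
    For what it's worth, the parameter-count part is doable today with the
    ast module, on a file that already parses. A rough, hypothetical
    sketch, deliberately crude: it only checks positional calls to
    module-level functions, ignoring *args, methods and imports:

        import ast

        def check_arity(source, filename="<string>"):
            """Flag calls to module-level functions whose positional
            argument count cannot match the def."""
            tree = ast.parse(source, filename)
            bounds = {}  # function name -> (min positional args, max positional args)
            for node in ast.walk(tree):
                if isinstance(node, ast.FunctionDef) and not node.args.vararg:
                    nargs = len(node.args.posonlyargs) + len(node.args.args)
                    bounds[node.name] = (nargs - len(node.args.defaults), nargs)
            for node in ast.walk(tree):
                if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                        and not node.keywords
                        and not any(isinstance(a, ast.Starred) for a in node.args)):
                    lo_hi = bounds.get(node.func.id)
                    if lo_hi and not lo_hi[0] <= len(node.args) <= lo_hi[1]:
                        print(f"{filename}:{node.lineno}: {node.func.id}() called "
                              f"with {len(node.args)} args, expects "
                              f"{lo_hi[0]}..{lo_hi[1]}")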

    I will say that often enough a program could report more possible errors. Putting your code into multiple files and modules may mean you could
    cleanly evaluate the code and return multiple errors from many modules as long as they are distinct. Finding all errors is not possible if recovery from one is not guaranteed.

I don't need it to find all errors. As long as it reasonably accurately
finds a significant number of them.

    Is it that onerous to fix one thing and run it again? It was once when you handed in punch cards and waited a day or on very busy machines.

    Yes I find it onerous, especially since I have a pipeline with unit tests
    and other tools that all have to redo their work each time a bug is corrected.

    --
    Antoon.

  • From Peter J. Holzer@21:1/5 to Avi Gross on Sun Oct 9 21:46:47 2022
    On 2022-10-09 15:18:19 -0400, Avi Gross wrote:
    Antoon, it may also relate to an interpreter versus compiler issue.

    Something like a compiler for C does not do anything except write code in
    an assembly language. It can choose to keep going after an error and start looking some more from a less stable place.

    Interpreters for Python have to catch interrupts as they go and often run code in small batches. Continuing to evaluate after an error could cause weird effects.

    I don't think this is really an issue. A python file is completely
    compiled to byte code before execution starts.

    It's true that a syntax error before an import prevents that import, but
    since imports are usually at the start of a file, a syntax error will
    only rarely prevent the import (and files intended to be imported
    generally don't have weird side effects anyway).
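
    This is easy to demonstrate with a hypothetical demo.py:

        print("you will never see this")
        def broken(:    # syntax error

    Running it reports only the SyntaxError; the print on the first line
    never executes, because the whole file is compiled to byte code before
    anything runs.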

One issue could be that compilers which generate executables are
generally thorough and slow, while the compilers which generate
byte-code for immediate consumption by an interpreter are generally
simple and fast. So there is more incentive for the former to discover
as many errors as possible, and they are also better equipped to do so.

    hp

    --
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

  • From Antoon Pardon@21:1/5 to All on Sun Oct 9 22:02:47 2022
    Op 9/10/2022 om 21:44 schreef Avi Gross:
    But an error like setting the size of a fixed length data structure to the right size may result in oodles of errors about being out of range that magically get fixed by one change. Sometimes too much info just gives you a headache.

So? The user of such a tool doesn't need to go through all the provided information.
If, after correcting a few errors, the user finds the rest of the information gives
him a headache, he can just ignore all that and run a new iteration.

    --
    Antoon Pardon

  • From Barry@21:1/5 to All on Sun Oct 9 21:09:51 2022
    On 9 Oct 2022, at 18:54, Antoon Pardon <antoon.pardon@vub.be> wrote:


    Op 9/10/2022 om 19:23 schreef Karsten Hilbert:
    Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon:

    Op 9/10/2022 om 17:49 schreef Avi Gross:
    My guess is that finding 100 errors might turn out to be misleading. If you
    fix just the first, many others would go away.
At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits
    after having found one syntax error.
    But the point is: you can't (there is no way to) be sure the
    9+ errors really are errors.

    Unless you further constrict what sorts of errors you are
    looking for and what margin of error or leeway for false
    positives you want to allow.

Look, when I was at university we had to program in Pascal, and the
compiler we used continued parsing until the end. Sure, there were
times when, after a number of reported errors, the number of false
positives became so high that it was useless trying to find the
remaining true ones, but it was still more efficient to correct the
obvious ones than to correct only the first one.

If it’s very fast to syntax check then one at a time is fine.
Python is very fast to syntax check, so I personally do not need the multi-error version.
My editor has syntax check on a key and it’s instant to drop me a syntax error.

    Barry


I don't need to be sure. Even the occasional wrong correction
is probably still more efficient than quitting after the first
syntax error.

    --
    Antoon.
    --
    https://mail.python.org/mailman/listinfo/python-list


  • From Karsten Hilbert@21:1/5 to All on Sun Oct 9 22:46:44 2022
    Am Sun, Oct 09, 2022 at 07:51:12PM +0200 schrieb Antoon Pardon:

    But the point is: you can't (there is no way to) be sure the
    9+ errors really are errors.

    Unless you further constrict what sorts of errors you are
    looking for and what margin of error or leeway for false
    positives you want to allow.

Look, when I was at university we had to program in Pascal, and the
compiler we used continued parsing until the end. Sure, there were
times when, after a number of reported errors, the number of false
positives became so high that it was useless trying to find the
remaining true ones, but it was still more efficient to correct the
obvious ones than to correct only the first one.

I don't need to be sure. Even the occasional wrong correction
is probably still more efficient than quitting after the first
syntax error.

    A-ha, so you further defined your context.

    Under which I can agree to the objective :-)

    Best,
    Karsten
    --
    GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

  • From Cameron Simpson@21:1/5 to Antoon Pardon on Mon Oct 10 09:45:12 2022
    On 09Oct2022 21:46, Antoon Pardon <antoon.pardon@vub.be> wrote:
Is it that onerous to fix one thing and run it again? It was once when
you handed in punch cards and waited a day or on very busy machines.

Yes I find it onerous, especially since I have a pipeline with unit tests
and other tools that all have to redo their work each time a bug is
corrected.

It is easy to get the syntax right before submitting to such a pipeline.
I usually run a linter on my code for serious commits, and I've got a
`lint1` alias which basically runs the short fast flavour of that, which
does a syntax check and the very fast, less thorough lint phase.

    I say this just to ease your write/run-tests cycle.

    Regarding your main request, had you considered writing your own wrapper
    tool? Something which ran something like:

    python -We:invalid -m py_compile your_python_file.py

If there's an error, report it, then make a new file commencing with the
next unindented line after the error, with all preceding lines
commented out (to keep the line numbers the same). Then run the check
again. Repeat until the file's empty or there are no errors.

    This doesn't sound very complex.
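
    A rough in-process sketch of that loop, using the builtin compile()
    instead of a py_compile subprocess (hypothetical code; as noted,
    errors after the first may be artifacts of the commenting-out):

        import sys

        def report_syntax_errors(path):
            """Report a syntax error, comment out everything up to the next
            unindented line after it (keeping line numbers stable), retry."""
            lines = open(path, encoding="utf-8").read().splitlines()
            while True:
                try:
                    compile("\n".join(lines) + "\n", path, "exec")
                    return
                except SyntaxError as e:
                    print(f"{path}:{e.lineno}: {e.msg}")
                    # e.lineno is 1-based, so as a 0-based index this is
                    # the first line after the error line.
                    resume = e.lineno or len(lines)
                    while resume < len(lines) and (
                            not lines[resume] or lines[resume][:1] in (" ", "\t")):
                        resume += 1
                    if resume >= len(lines):
                        return
                    # Comment out all preceding lines; line numbers stay the same.
                    lines[:resume] = ["# " + ln for ln in lines[:resume]]

        if __name__ == "__main__":
            report_syntax_errors(sys.argv[1])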

    Cheers,
    Cameron Simpson <cs@cskk.id.au>

  • From Chris Angelico@21:1/5 to Antoon Pardon on Mon Oct 10 09:23:27 2022
    On Mon, 10 Oct 2022 at 06:50, Antoon Pardon <antoon.pardon@vub.be> wrote:
I just want a parser that doesn't give up on encountering the first syntax error. Maybe do some semantic checking like checking the number of parameters.

    That doesn't make sense though. It's one thing to keep going after
    finding a non-syntactic error, but an error of syntax *by definition*
    makes parsing the rest of the file dubious. What would it even *mean*
    to not give up? How should it interpret the following lines of code?
    All it can do is report the error.

    You know, if you'd not made this thread, the time you saved would have
    been enough for quite a few iterations of "fix one syntactic error,
    run it again to find the next".

    ChrisA

  • From Thomas Passin@21:1/5 to Peter J. Holzer on Sun Oct 9 21:13:16 2022
    On 10/9/2022 1:29 PM, Peter J. Holzer wrote:
    On 2022-10-09 12:59:09 -0400, Thomas Passin wrote:
    https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it

    People seemed especially enthusiastic about the one-liner from jmd_dk.

    I don't think that one-liner solves Antoon's requirement of continuing
    after an error. It uses just the normal python parser so it has exactly
    the same limitations.

    Yes, of course. Interesting, though. py_compile tends to be what I use
    for a quick check. I linked to the page mostly for the other
    possibilities, as you mentioned below:

    Some of the mentioned tools may do what Antoon wants, though.

    hp



  • From avi.e.gross@gmail.com@21:1/5 to Antoon Pardon on Mon Oct 10 00:41:28 2022
    Cameron,

    Your suggestion makes me shudder!

Removing all earlier lines of code is often guaranteed to generate errors as variables you are using are not declared or initialized, modules are not
imported and so on.

    Removing just the line or three where the previous error happened would also have a good chance of invalidating something.

Someone who really wants to be able to isolate large parts of their code so that an error in one does not compromise lots of remaining code, might
    build their code in small units on the level of single functions per file
    and do lots of imports. They can then ask for all the files to be pseudo-compiled to byte-code and that might provide lots of errors to look
    at in one pass.

    But asking for a one-file version to find errors and somehow go past them
    and look for more is more daunting but of course can be done with partial accuracy and usefulness at best.

    As an analogy, if tolerated, think of a spell-checker on a document that can find oodles of words spelled wrong. Unfortunately, a spell corrector can
    drive us nuts if it knows little about context. If it sees a word like
    "reid" should it just change it to "read" or "red" or perhaps "reed" or look
    to see if the real problem is it is supposed to be unified (no space) with a word before or after? Will it know if the word appears in a context where a language like Latin or French or German or Hungarian is being quoted and perhaps it is spelled right, or if wrong, has other more likely corrections?

    Now if you add a grammar detector, and it knows you are looking for an adjective or a verb or a noun, it may do better.

    I use Google translate quite a bit as a tool as I often have to type in various languages and it provides a handy keyboard or lets me check if I
    used the right grammar especially in languages with silly ideas that objects can have 2 or even three genders. So putting in phrases like "this xyz" can result in language-specific text that tells me if it is masculine or
    feminine or perhaps neuter. But the reason I mention it is how often it is WRONG. I mean many languages have multiple words that are spelled the same
    but used and pronounced differently in various contexts. The English word "read" can sound like reed or like red so past tense sounds different as in
    I read that book last week versus please read it to me now. But some
    languages such as Hebrew which generally may not show the vowels, can get totally confused in this program as humans often need lots of context to
    figure out whether the current short word is in a context where it means
    "you: feminine and singular and is pronounced aht or it is a way of showing what follows is a direct object and loosely means "the" in a redundant way
    and is pronounced as "eht". Quite a few words have three or more possible
    ways to pronounce the same letters and without vowel guides need context and sometimes some spreadsheet-like ingenuity as multiple other words are also
    in limbo and once resolved can impact what other words may now mean.
    Obviously adding back the vowels makes things clear so people who are used
    to seeing books written that old way can get hopelessly lost reading a
    modern newspaper.

    End of digression, just assume I could have gone on for many pages
    describing my annoyances at what Google translate does to many other
    languages that show the imperfections in what is really a great and powerful tool.

    Well parsing any program in most languages can be equally complex and
    require lots of context. For example, you can often use the same identifier
    to be the name of a regular variable or the name of a function and sometimes other things such as the name of a module. They can often be disambiguated
in context. Perhaps the same name followed by parentheses should be a
    function call while a name followed by :: or ::: might in that language
    require it to be the name of a module/package. If followed by [ it might
    need to be something indexable such as an array or list and so on. So say
there is an error in the variable. Can the interpreter or linter figure out what the error is and almost repair it? Can it see a variable name like "alpXha" and note there is no such identifier in the current namespace but there is one called "alpha" that might be the one without the X? But what if what is missing is an open paren, or maybe the matching close paren? Does it
know if the problem is a bad variable name or a bad function invocation or
one of many other possible problems? Code with a random blemish is often not easily figured out. If I type the name of a function without parentheses, it could be an attempt to call the function with no arguments (an error, though,
in many languages) or it could be that I want to pass the function itself as an argument in functional programming. But if I have another variable of type array, might it not be missing parentheses but square brackets?

The compiler or interpreter often cannot fix it so it often tries to skip forward till it finds something unambiguous that marks the beginning of a new section. That might be something like an unquoted semicolon at the end of a line or a matching close bracket. Depending on such choices, again, varying amounts of the program may be ignored in evaluating what follows. But this
    is not the same as a human speedreading or daydreaming who misses a bit here and there and just hopes it was not crucial and that what follows probably remains worthy and valid. I have sometimes missed something like a name and then seen pages of pronouns like "she" and eventually give up as no more
    hints arrive and I have to go back or ask someone lest a big bunch of the
    text makes no sense to me.

Someone wants to treat code from a spell-checker perspective and
    wants all possible mistakes thrown at them at once. As I pointed out, in
    real life many kinds of context can matter and a really good checker might
    even consult a personal list of words it has learned you want ignored, like people's names or some abbreviations like LOL. It may even read marked-up
    text in say HTML or XML or similar formats that is marked with the language they supposedly contain and calls up a spell-checker appropriate for each region.

    But if they want a really intelligent program that recovers enough from
    errors to reliably continue, maybe not easy.

They have explained and amended that they understand some of these issues
and are willing to get lots of false positives or red herrings, and their
real goal is to have a chance to detect and maybe fix a few things per round rather than just one. Not a bad wish. Just not a trivial wish to grant and satisfy.

    -----Original Message-----
    From: Python-list <python-list-bounces+avi.e.gross=gmail.com@python.org> On Behalf Of Cameron Simpson
    Sent: Sunday, October 9, 2022 6:45 PM
    To: python-list@python.org
    Subject: Re: What to use for finding as many syntax errors as possible.

    On 09Oct2022 21:46, Antoon Pardon <antoon.pardon@vub.be> wrote:
    Is it that onerous to fix one thing and run it again? It was once when
    you handed in punch cards and waited a day or on very busy machines.

    Yes I find it onerous, especially since I have a pipeline with unit
    tests and other tools that all have to redo their work each time a bug
    is corrected.

    It is easy to get the syntax right before submitting to such a pipeline.
    I usually run a linter on my code for serious commits, and I've got a
`lint1` alias which basically runs the short fast flavour of that, which does a syntax check and the very fast, less thorough lint phase.

    I say this just to ease your write/run-tests cycle.

    Regarding your main request, had you considered writing your own wrapper
    tool? Something which ran something like:

    python -We:invalid -m py_compile your_python_file.py

    If there's an error, report it, then make a new file commencing with the
next unindented line after the error, with all preceding lines commented
    out (to keep the line numbers the same). Then run the check again. Repeat
    until the file's empty or there are no errors.

    This doesn't sound very complex.

    Cheers,
    Cameron Simpson <cs@cskk.id.au>
    --
    https://mail.python.org/mailman/listinfo/python-list

  • From Cameron Simpson@21:1/5 to avi.e.gross@gmail.com on Mon Oct 10 17:33:47 2022
    On 10Oct2022 00:41, avi.e.gross@gmail.com <avi.e.gross@gmail.com> wrote:
    Your suggestion makes me shudder!

    And fair enough too. I don't do this for me, I'm just suggesting an
    approach which might bring something to Antoon's objective.

Removing all earlier lines of code is often guaranteed to generate errors as variables you are using are not declared or initialized, modules are not imported and so on.

    Antoon's interested in syntax errors.

Removing just the line or three where the previous error happened would also have a good chance of invalidating something.

    Doubtless. He accepts that any such resume-the-parse can bring
    misleading error messages. Antoon is not expecting magic, just getting
    several complaints instead of just the first syntax error.

    I must admit I sympathise a bit, as one of my own major irks is command
    line tools which moan about the first bad option instead of noting it
    and moving on to complain about other things as well, then quitting
    after the command line parse. Pure laziness a lot of the time IMO; I've
    done it myself, but do like to make multiple complaints when it's
    feasible.

    Cheers,
    Cameron Simpson <cs@cskk.id.au>

  • From Antoon Pardon@21:1/5 to All on Mon Oct 10 09:04:18 2022
    Op 10/10/2022 om 00:45 schreef Cameron Simpson:
    On 09Oct2022 21:46, Antoon Pardon <antoon.pardon@vub.be> wrote:
Is it that onerous to fix one thing and run it again? It was once
when you handed in punch cards and waited a day or on very busy machines.

Yes I find it onerous, especially since I have a pipeline with unit tests
and other tools that all have to redo their work each time a bug is
corrected.

    It is easy to get the syntax right before submitting to such a
    pipeline.  I usually run a linter on my code for serious commits, and
I've got a `lint1` alias which basically runs the short fast flavour of
    that which does a syntax check and the very fast less thorough lint phase.

    If you have a linter that doesn't quit after the first syntax error,
    please provide a link. I already tried pylint and it also quits after
    the first syntax error.

    --
    Antoon Pardon

  • From Cameron Simpson@21:1/5 to Antoon Pardon on Mon Oct 10 19:50:38 2022
    On 10Oct2022 09:04, Antoon Pardon <antoon.pardon@vub.be> wrote:
    It is easy to get the syntax right before submitting to such a
    pipeline.  I usually run a linter on my code for serious commits, and
I've got a `lint1` alias which basically runs the short fast flavour of
    that which does a syntax check and the very fast less thorough lint
    phase.

    If you have a linter that doesn't quit after the first syntax error,
    please provide a link. I already tried pylint and it also quits after
    the first syntax error.

    I don't have such a linter. I did outline an approach for you to write
    one of your own by wrapping an existing parser program.

    I have a personal "lint" script which runs a few linters. The first
    check is `py_compile` which quits at the first syntax error. The other
    linters are not even tried if that fails.

I do not know what your editing environment is; I'd have thought that
some IDEs should make the first syntax error very obvious and easy to go
to, and give an obvious indication of whether the file as a whole is
syntactically good or bad. If you have such, between them you could fairly
easily resolve syntax errors rapidly, perhaps rapidly enough to make up
for a stop-at-the-first-fail syntax check.

    Cheers,
    Cameron Simpson <cs@cskk.id.au>

  • From Michael F. Stemper@21:1/5 to Avi Gross on Mon Oct 10 08:21:41 2022
    On 09/10/2022 10.49, Avi Gross wrote:
Antoon

    There likely are such programs out there but are there universal agreements on how to figure out when a new safe zone of code starts where error
    testing can begin?

    For example a file full of function definitions might find an error in function 1 and try to find the end of that function and resume checking the next function. But what if a function defines local functions within it? What if the mistake in one line of code could still allow checking the next line rather than skipping it all?

    My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away. If you spell a variable name wrong when declaring it, a dozen uses of the right name may cause errors. Should you fix the first or change all later ones?

    How does one declare a variable in python? Sometimes it'd be nice to
    be able to have declarations and any undeclared variable be flagged.

    When I was writing F77 for a living, I'd (temporarily) put:
    IMPLICIT CHARACTER*3
    at the beginning of a program or subroutine that I was modifying,
    in order to have any typos flagged.

    I'd love it if there was something similar that I could do in python.

    --
    Michael F. Stemper
    87.3% of all statistics are made up by the person giving them.

  • From Robert Latest@21:1/5 to Michael F. Stemper on Mon Oct 10 17:06:37 2022
    Michael F. Stemper wrote:
    How does one declare a variable in python? Sometimes it'd be nice to
    be able to have declarations and any undeclared variable be flagged.

    To my knowledge, the closest to that is using __slots__ in class definitions. Many a time have I assigned to misspelled class members until I discovered __slots__.
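
    For readers who haven't met it, a minimal illustration; note the catch
    happens at run time, on assignment, not at parse time:

        class Point:
            __slots__ = ("x", "y")   # only these attribute names are allowed

            def __init__(self, x, y):
                self.x = x
                self.y = y

        p = Point(1, 2)
        p.x = 10   # fine
        p.z = 3    # AttributeError: 'Point' object has no attribute 'z'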

  • From Robert Latest@21:1/5 to Antoon Pardon on Mon Oct 10 17:08:32 2022
    Antoon Pardon wrote:
    I would like a tool that tries to find as many syntax errors as possible
    in a python file.

    I'm puzzled as to when such a tool would be needed. How many syntax errors can you realistically put into a single Python file before compiling it for the first time?

  • From Robert Latest@21:1/5 to avi.e.gross@gmail.com on Mon Oct 10 17:14:13 2022
    <avi.e.gross@gmail.com> wrote:
    Cameron,

    Your suggestion makes me shudder!

    Me, too

Removing all earlier lines of code is often guaranteed to generate errors as variables you are using are not declared or initialized, modules are not imported and so on.

    all of which aren't syntax errors, so the method should still work. Ugly as hell though. I can't think of a reason to want to find multiple syntax errors in a file.

  • From Peter J. Holzer@21:1/5 to Chris Angelico on Mon Oct 10 21:32:56 2022
    On 2022-10-10 09:23:27 +1100, Chris Angelico wrote:
    On Mon, 10 Oct 2022 at 06:50, Antoon Pardon <antoon.pardon@vub.be> wrote:
I just want a parser that doesn't give up on encountering the first syntax error. Maybe do some semantic checking like checking the number of parameters.

    That doesn't make sense though.

    I think you disagree with most compiler authors here.

    It's one thing to keep going after finding a non-syntactic error, but
    an error of syntax *by definition* makes parsing the rest of the file dubious.

    Dubious but still useful.

    What would it even *mean* to not give up?

    Read the blog post on Lezer for some ideas: https://marijnhaverbeke.nl/blog/lezer.html

    This is in the context of an editor. But the same problem applies to
    compilers. It's not very important if a compile run only takes a second
    or so but even then it might be helpful to see several error messages
    and not only one at a time. It becomes much more important as compile
    times get longer (as an extreme[1] example, when I worked on a largeish
    cobol program in the 1980s, compiling the thing took about half an hour.
    I really wanted to fix *everything* before starting the compiler again.)

    Marijn isn't the only person who revisited this problem recently[2].
    I've read a few other blog posts and papers on that topic at about the
    same time.

    hp

    [1] Yes, there are programs where a full compile takes much longer than
    that. But you can usually get away with recompiling only a small
    part, so you don't have to wait that long during normal development.
    That cobol compiler couldn't do that.

    [2] "Recently" means "in the last 10 years or so".

    --
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

  • From Chris Angelico@21:1/5 to Peter J. Holzer on Tue Oct 11 08:02:17 2022
    On Tue, 11 Oct 2022 at 06:34, Peter J. Holzer <hjp-python@hjp.at> wrote:

    On 2022-10-10 09:23:27 +1100, Chris Angelico wrote:
    On Mon, 10 Oct 2022 at 06:50, Antoon Pardon <antoon.pardon@vub.be> wrote:
I just want a parser that doesn't give up on encountering the first syntax error. Maybe do some semantic checking like checking the number of parameters.

    That doesn't make sense though.

    I think you disagree with most compiler authors here.

    It's one thing to keep going after finding a non-syntactic error, but
    an error of syntax *by definition* makes parsing the rest of the file dubious.

    Dubious but still useful.

    There's a huge difference between non-fatal errors and syntactic
    errors. The OP wants the parser to magically skip over a fundamental
    syntactic error and still parse everything else correctly. That's
    never going to work perfectly, and the OP is surprised at this.

    What would it even *mean* to not give up?

    Read the blog post on Lezer for some ideas: https://marijnhaverbeke.nl/blog/lezer.html

    This is in the context of an editor.

    Incidentally, that's actually where I would expect to see that kind of
    feature show up the most - syntax highlighters will often be designed
    to "carry on, somehow" after a syntax error, even though it often
    won't make any sense (just look at what happens to your code
    highlighting when you omit a quote character). It still won't always
    be any use, but you do see *some* attempt at it.

    But if the OP would be satisfied with that, I rather doubt that this
    thread would even have happened. Unless, of course, the OP still lives
    in the dark ages when no text editor available had any suitable
    features for code highlighting.

    ChrisA

  • From Cameron Simpson@21:1/5 to Avi Gross on Tue Oct 11 09:22:25 2022
    On 09/10/2022 10.49, Avi Gross wrote:
My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away. If you spell a variable name
wrong when declaring it, a dozen uses of the right name may cause errors.
Should you fix the first or change all later ones?

    Just to this, these are semantic errors, not syntax errors. Linters do
    an ok job of spotting these. Antoon is after _syntax errors_.

    On 10Oct2022 08:21, Michael F. Stemper <michael.stemper@gmail.com> wrote:
    How does one declare a variable in python? Sometimes it'd be nice to
    be able to have declarations and any undeclared variable be flagged.

    Linters do pretty well at this. They can trace names and their use
    compared to their first definition/assignment (often - there are of
    course some constructs which are correct but unclear to a static
    analysis - certainly one of my linters occasionally says "possible
    undefine use" to me because there may be a path to use before set). This
    is particularly handy for typos, which often make for "use before set"
    or "set and not used".

    I'd love it if there was something similar that I could do in python.

    Have you used any lint programmes? My "lint" script runs pyflakes and
    pylint.

    Cheers,
    Cameron Simpson <cs@cskk.id.au>

  • From Cameron Simpson@21:1/5 to Chris Angelico on Tue Oct 11 09:17:13 2022
    On 11Oct2022 08:02, Chris Angelico <rosuav@gmail.com> wrote:
There's a huge difference between non-fatal errors and syntactic
errors. The OP wants the parser to magically skip over a fundamental
syntactic error and still parse everything else correctly. That's
never going to work perfectly, and the OP is surprised at this.

    The OP is not surprised by this, and explicitly expressed awareness that resuming a parse had potential for "misparsing" further code.

    I remain of the opinion that one could resume a parse at the next
    unindented line and get reasonable results a lot of the time.

    In fact, I expect that one could resume tokenising at almost any line
    which didn't seem to be inside a string and often get reasonable
    results.

I grew up with C and Pascal compilers which would _happily_ produce many complaints, usually accurate, about all manner of syntax errors. They
didn't stop at the first syntax error.

    All you need in principle is a parser which goes "report syntax error
    here, continue assuming <some state>". For Python that might mean
    "pretend a missing final colon" or "close open brackets" etc, depending
    on the context. If you make conservative implied corrections you can get
    a reasonable continued parse, enough to find further syntax errors.

    I remember the Pascal compiler in particular had a really good "you
    missed a semicolon _back there_" mode which was almost always correct, a
    nice boon when correcting mistakes.

    Cheers,
    Cameron Simpson <cs@cskk.id.au>

  • From Chris Angelico@21:1/5 to Cameron Simpson on Tue Oct 11 09:47:52 2022
    On Tue, 11 Oct 2022 at 09:18, Cameron Simpson <cs@cskk.id.au> wrote:

    On 11Oct2022 08:02, Chris Angelico <rosuav@gmail.com> wrote:
There's a huge difference between non-fatal errors and syntactic
errors. The OP wants the parser to magically skip over a fundamental
syntactic error and still parse everything else correctly. That's
never going to work perfectly, and the OP is surprised at this.

    The OP is not surprised by this, and explicitly expressed awareness that resuming a parse had potential for "misparsing" further code.

    I remain of the opinion that one could resume a parse at the next
    unindented line and get reasonable results a lot of the time.

    The next line at the same indentation level as the line with the
    error, or the next flush-left line? Either way, there's a weird and
    arbitrary gap before you start parsing again, and you still have no
    indication of what could make sense. Consider:

    if condition # no colon
        code
    else:
        code

    To actually "restart" parsing, you have to make a guess of some sort.
    Maybe you can figure out what the user meant to do, and parse
    accordingly; but if that's the case, keep going immediately, don't
wait for an unindented line. If you wait for a blank line followed by
    an unindented line, that might help with a notion of "next logical
    unit of code", but it's very much dependent on the coding style, and
    if you have a codebase that's so full of syntax errors that you
    actually want to see more than one, you probably don't have a codebase
    with pristine and beautiful code layout.

    In fact, I expect that one could resume tokenising at almost any line
    which didn't seem to be inside a string and often get reasonable
    results.

    "Seem to be"? On what basis?

    I grew up with C and Pascal compilers which would _happily_ produce
    many complaints, usually accurate, about all manner of syntactic
    errors. They didn't stop at the first syntax error.

    Yes, because they work with a much simpler grammar. But even then,
    most syntactic errors (again, this is not to be confused with semantic
    errors - if you say "char *x = 1.234;" then there's no parsing
    ambiguity but it's not going to compile) cause a fair degree of
    nonsense afterwards.

    The waters are a bit muddied by some things being called "syntax
    errors" when they're actually nothing at all to do with the parser.
    For instance:

    >>> def f():
    ...     await q
    ...
      File "<stdin>", line 2
    SyntaxError: 'await' outside async function

    This is not what I'm talking about; there's no parsing ambiguity here,
    and therefore no difficulty whatsoever in carrying on with the
    parsing. You could ast.parse() this code without an error. But
    resuming after a parsing error is fundamentally difficult, impossible
    without guesswork.
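
    A quick way to see that split in current CPython (a sketch; "<demo>"
    is just a placeholder filename):

        import ast

        src = "def f():\n    await q\n"
        tree = ast.parse(src)       # grammar is satisfied, no error here
        try:
            compile(tree, "<demo>", "exec")
        except SyntaxError as e:
            print(e.msg)            # 'await' outside async function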

    All you need in principle is a parser which goes "report syntax error
    here, continue assuming <some state>". For Python that might mean
    "pretend a missing final colon" or "close open brackets" etc, depending
    on the context. If you make conservative implied corrections you can get
    a reasonable continued parse, enough to find further syntax errors.

    And, more likely, you'll generate a lot of nonsense. Take something like this:

    items = [
        item[1],
        item2],
        item[3],
    ]

    As a human, you can easily see what the problem is. Try teaching a
    parser how to handle this. Most likely, you'll generate a spurious
    error - maybe the indentation, maybe the intended end of the list -
    but there's really only one error here. Reporting multiple errors
    isn't actually going to be at all helpful.

    I remember the Pascal compiler in particular had a really good "you
    missed a semicolon _back there_" mode which was almost always correct, a
    nice boon when correcting mistakes.


    Ahh yes. Design a language with strict syntactic requirements, and
    it's not too hard to find where the programmer has omitted them. Thing
    is.... Python just doesn't HAVE those semicolons. Let's say that a
    variant Python required you to put a U+251C ├ at the start of every statement, and U+2524 ┤ at the end of the statement. A whole lot of
    classes of error would be extremely easy to notice and correct, and
    thus you could resume parsing; but that isn't benefiting the
    programmer any. When you don't have that kind of information
    duplication, it's a lot harder to figure out how to cheat the fix and
    go back to parsing.
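
    A toy checker for that invented dialect shows how cheap recovery
    becomes once every statement carries redundant delimiters (the dialect
    and the checker are both hypothetical, of course):

        def check_markers(lines):
            """Report line numbers not bracketed by ├ ... ┤."""
            return [n for n, line in enumerate(lines, 1)
                    if line.strip()
                    and not (line.strip().startswith("├")
                             and line.strip().endswith("┤"))]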

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Passin@21:1/5 to Michael F. Stemper on Mon Oct 10 18:25:42 2022
    On 10/10/2022 9:21 AM, Michael F. Stemper wrote:
    On 09/10/2022 10.49, Avi Gross wrote:
    Anton

    There likely are such programs out there but are there universal
    agreements on how to figure out when a new safe zone of code starts
    where error testing can begin?

    For example a file full of function definitions might find an error in
    function 1 and try to find the end of that function and resume checking
    the next function. But what if a function defines local functions within
    it? What if the mistake in one line of code could still allow checking
    the next line rather than skipping it all?

    My guess is that finding 100 errors might turn out to be misleading. If
    you fix just the first, many others would go away. If you spell a
    variable name wrong when declaring it, a dozen uses of the right name
    may cause errors. Should you fix the first or change all later ones?

    How does one declare a variable in python? Sometimes it'd be nice to
    be able to have declarations and any undeclared variable be flagged.

    When I was writing F77 for a living, I'd (temporarily) put:
          IMPLICIT CHARACTER*3
    at the beginning of a program or subroutine that I was modifying,
    in order to have any typos flagged.

    I'd love it if there was something similar that I could do in python.

    The Leo editor (https://github.com/leo-editor/leo-editor) will notify
    you of undeclared variables (and some syntax errors) each time you save
    your (Python) file.
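
    For anyone not using Leo, the kind of check Michael asks for can be
    sketched with the stdlib ast module. This is a toy: it only tracks
    module-level bindings and ignores ordering, whereas real linters such
    as pyflakes handle scopes and use-before-set properly:

        import ast, builtins

        def flag_unknown_names(source):
            """Report names that are read but never bound anywhere."""
            tree = ast.parse(source)
            known = set(dir(builtins))
            for node in ast.walk(tree):
                if isinstance(node, ast.Name) and isinstance(node.ctx,
                                                             ast.Store):
                    known.add(node.id)
                elif isinstance(node, (ast.FunctionDef,
                                       ast.AsyncFunctionDef, ast.ClassDef)):
                    known.add(node.name)
                elif isinstance(node, (ast.Import, ast.ImportFrom)):
                    for alias in node.names:
                        known.add((alias.asname or alias.name).split(".")[0])
                elif isinstance(node, ast.arg):
                    known.add(node.arg)
            return sorted((n.lineno, n.id) for n in ast.walk(tree)
                          if isinstance(n, ast.Name)
                          and isinstance(n.ctx, ast.Load)
                          and n.id not in known)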

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From avi.e.gross@gmail.com@21:1/5 to Avi Gross on Mon Oct 10 22:09:06 2022
    Michael,

    A reasonable question. Python lets you initialize variables but has no
    explicit declarations. Languages differ and I juggle attributes of many in
    my mind and am reacting to the original question NOT about whether and how Python should report many possible errors all at once but how ANY language
    can be expected to do this well. Many others do have a variable declaration phase or an optional declaration or perhaps just a need to declare a
    function prototype so it can be used by others even if the formal function creation will happen later in the code.

    But what I meant in a Python context was something like this:

    Wronk = who cares # this should fail
    ...
    if (Wronk > 5): ...
    ...
    Wronger = Wronk + 1
    ...
    X = minimum(Wronk, Wronger, 12)

    The first line does not parse well so you have an error. But in any case as
    the line makes no sense, Wronk is not initialized to anything. Later code
    may use it in various ways and some of those may be seen as errors for an assortment of reasons, then at one point the code does provide a value for Wronk and suddenly code beyond that has no seeming errors. The above
    examples are not meant to be real but just give a taste that programs with holes in them for any reason may not be consistent. The only relatively guaranteed test for sanity has to start at the top and encounter no errors
    or missing parts based on anything such as I/O errors.

    And I suggest there are some things sort of declared in python such as:

    import numpy as np

    Yes, that brings in code from a module if it works and initializes a
    variable called np to sort of point at the module or its namespace or whatever, depending on the language. It is an assignment but also a way to
    let the program know things. If the above is:

    import grumpy as np

    Then what happens if the code tries to find a file named "grumpy" somewhere
    and cannot locate it and this is considered a syntax error rather than a run-time error for whatever reason? Can you continue when all kinds of functionality is missing and code asking to make a np.array([1,2,3]) clearly fails?

    Many of us here are talking past each other.

    Yes, it would be nice to get lots of info and arguably we may eventually
    have machine-learning or AI programs a bit more like SPAM detectors that
    look for patterns commonly found and try to fix your program from common
    errors or at least do a temporary patch so they can continue searching for
    more errors. This could result in the best case in guessing right every
    time. If you allowed it to actually fix your code, it might be like people
    who let their spelling be corrected and do not proofread properly and send
    out something embarrassing or just plain wrong!

    And it will compile or be interpreted without complaint albeit not do
    exactly what it is supposed to!





    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to avi.e.gross@gmail.com on Tue Oct 11 13:41:40 2022
    On Tue, 11 Oct 2022 at 13:10, <avi.e.gross@gmail.com> wrote:
    If the above is:

    import grumpy as np

    Then what happens if the code tries to find a file named "grumpy" somewhere and cannot locate it and this is considered a syntax error rather than a run-time error for whatever reason? Can you continue when all kinds of functionality is missing and code asking to make a np.array([1,2,3]) clearly fails?

    That's not a syntax error. Syntax is VERY specific. It is an error in
    Python to attempt to add 1 to "one", it is an error to attempt to look
    up the upper() method on None, it is an error to try to use a local
    variable you haven't assigned to yet, and it is an error to open a
    file that doesn't exist. But not one of these is a *syntax* error.

    Syntax errors are detected at the parsing stage, before any code gets
    run. The vast majority of syntax errors are grammar errors, where the
    code doesn't align with the parseable text of a Python program. (Non-grammatical parsing errors include using a "nonlocal" statement
    with a name that isn't found in any surrounding scope, using "await"
    in a non-async function, and attempting to import braces from the
    future.)

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From avi.e.gross@gmail.com@21:1/5 to Chris Angelico on Mon Oct 10 23:11:33 2022
    Cameron, or OP if you prefer,

    I think by now you have seen a suggestion that languages make choices and highly structured ones can be easier to "recover" from errors and try to continue than some with way more complex possibilities that look rather unstructured.

    What is the error in code like this?

    a, b, c, d = 1, 2,

    Or is it an error at all?

    Many languages have no concept of doing anything like the above and some tolerate a trailing comma and some set anything not found to some form of
    NULL or uninitialized and some ...
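
    For what it is worth, in Python the example parses fine, since a
    trailing comma is legal in a tuple display; it fails only at run time,
    which is exactly the syntax-versus-runtime split this thread keeps
    circling:

        a, b, c, d = 1, 2,
        # ValueError: not enough values to unpack (expected 4, got 2)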

    If you look at human language, some are fairly simple and some are way too organized. But in a way it can make sense. Languages with gender will often
    ask you to change the spelling and often how you pronounce things not only based on whether a noun is male/female or even neuter but also insist you change the form of verbs or adjectives and so on that in effect give
    multiple signals that all have to line up to make a valid and understandable sentence. Heck, in conversations, people can often leave out parts of a sentence such as whether you are talking about "I" or "you" or "she" or "we" because the rest of the words in the sentence redundantly force only one
    choice to be possible.

    So some such annoying grammars (in my opinion) are error
    detection/correction codes in disguise. In days before microphones and speakers, it was common to not hear people well, like on a stage a hundred
    feet away with other ambient noises. Missing a word or two might still allow you to get the point as other parts of the sentence did such redundancies.
    Many languages have similar strictures letting you know multiple times if something is singular or plural. And I think another reason was what I call stranger detection. People who learn some vocabulary might still not speak correctly and be identifiable as strangers, as in spies.

    Do we need this in the modern age? Who knows! But it makes me prefer some languages over others albeit other reasons may ...

    With the internet today, we are used to expecting error correction to come
    for free. Do you really need one of every 8 bits to be a parity bit, which
    only catches maybe half of the errors, when the internals of your computer are relatively error free and even the outside is protected by things like
    various protocols used in making and examining packets and demanding some be sent again if some checksum does not match? Tons of checking is built in so
    at your level you rarely think about it. If you get a message, it usually is either 99.9999% accurate, or you do not have it shown to you at all. I am
    not talking about SPAM but about errors of transmission.

    So my analogies are that if you want a very highly structured language that
    can recover somewhat from errors, Python may not be it.

    And over the years as features are added or modified, the structure tends to get more complex. And R is not alone. Many surviving languages continue to evolve and borrow from each other and any program that you run today that
    could partially recover and produce pages of possible errors, may blow up
    when new features are introduced.

    And with UNICODE, the number of possible "errors" in what is placed in code
    for languages like Julia that allow them in most places ...



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to avi.e.gross@gmail.com on Tue Oct 11 14:24:28 2022
    On Tue, 11 Oct 2022 at 14:13, <avi.e.gross@gmail.com> wrote:
    With the internet today, we are used to expecting error correction to come for free. Do you really need one of every 8 bits to be a parity bit, which only catches maybe half of the errors...

    Fortunately, we have WAY better schemes than simple parity, which was
    only really a thing in the modem days. (Though I would say that
    there's still a pretty clear distinction between a good message where everything has correct parity, and line noise where half of them
    don't.) Hamming codes can correct one-bit errors (and detect two-bit
    errors) at a price of log2(size)+1 bits of space. Here's a great
    rundown:

    https://www.youtube.com/watch?v=X8jsijhllIA

    There are other schemes too, but Hamming codes are beautifully elegant
    and easy to understand.
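
    As a small illustration, here is my sketch of the classic Hamming(7,4)
    layout, with parity bits at codeword positions 1, 2 and 4:

        def hamming74_encode(d):
            """4 data bits -> codeword [p1, p2, d1, p3, d2, d3, d4]."""
            p1 = d[0] ^ d[1] ^ d[3]
            p2 = d[0] ^ d[2] ^ d[3]
            p3 = d[1] ^ d[2] ^ d[3]
            return [p1, p2, d[0], p3, d[1], d[2], d[3]]

        def hamming74_syndrome(c):
            """0 if the word is clean, else the 1-based position of a
            single flipped bit (so it can be corrected in place)."""
            s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
            s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
            s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
            return s1 + 2 * s2 + 4 * s3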

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From avi.e.gross@gmail.com@21:1/5 to avi.e.gross@gmail.com on Mon Oct 10 23:24:42 2022
    I stand corrected Chris, and others, as I pay the sin tax.

    Yes, there are many kinds of errors that logically fall into different categories or phases of evaluation of a program and some can be determined
    by a more static analysis almost on a line by line (or "statement" or "expression", ...) basis and others need to sort of simulate some things
    and look back and forth to detect possible incompatibilities and yet others
    can only be detected at run time and likely way more categories depending on the language.

    But when I run the Python interpreter on code, aren't many such phases done interleaved and at once as various segments of code are parsed and examined
    and perhaps compiled into block code and eventually executed?

    So is the OP asking for something other than a Python Interpreter that
    normally halts after some kind of error? Tools like a linter may indeed fit that mold.

    This may limit some of the objections of when an error makes it hard for the parser to find some recovery point to continue from as no code is being run
    and no harmful side effects happen by continuing just an analysis.

    Time to go read some books about modern ways to evaluate a language based on more mathematical rules including more precisely what is syntax versus ...

    Suggestions?


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to avi.e.gross@gmail.com on Tue Oct 11 14:55:12 2022
    On Tue, 11 Oct 2022 at 14:26, <avi.e.gross@gmail.com> wrote:

    I stand corrected Chris, and others, as I pay the sin tax.

    Yes, there are many kinds of errors that logically fall into different categories or phases of evaluation of a program and some can be determined
    by a more static analysis almost on a line by line (or "statement" or "expression", ...) basis and others need to sort of simulate some things
    and look back and forth to detect possible incompatibilities and yet others can only be detected at run time and likely way more categories depending on the language.

    But when I run the Python interpreter on code, aren't many such phases done interleaved and at once as various segments of code are parsed and examined and perhaps compiled into block code and eventually executed?

    Hmm, depends what you mean. Broadly speaking, here's how it goes:

    0) Early pre-parse steps that don't really matter to most programs,
    like checking character set. We'll ignore these.
    1) Tokenize the text of the program into a sequence of
    potentially-meaningful units.
    2) Parse those tokens into some sort of meaningful "sentence".
    3) Compile the syntax tree into actual code.
    4) Run that code.

    Example:
    code = """def f():
    ... print("Hello, world", 1>=2)
    ... print(Ellipsis, ...)
    ... return True
    ... """


    In step 1, all that happens is that a stream of characters (or bytes,
    depending on your point of view) gets broken up into units.

    >>> import tokenize
    >>> for t in tokenize.tokenize(iter(code.encode().split(b"\n")).__next__):
    ...     print(tokenize.tok_name[t.exact_type], t.string)

    It's pretty spammy, but you can see how the compiler sees the text.
    Note that, at this stage, there's no real difference between the NAME
    "def" and the NAME "print" - there are no language keywords yet.
    Basically, all you're doing is figuring out punctuation and stuff.

    Step 2 is what we'd normally consider "parsing". (It may well happen concurrently and interleaved with tokenizing, and I'm giving a
    simplified and conceptualized pipeline here, but this is broadly what
    Python does.) This compares the stream of tokens to the grammar of a
    Python program and attempts to figure out what it means. At this
    point, the linear stream turns into a recursive syntax tree, but it's
    still very abstract.

    >>> import ast
    >>> ast.dump(ast.parse(code))
    "Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[],
    args=[], kwonlyargs=[], kw_defaults=[], defaults=[]),
    body=[Expr(value=Call(func=Name(id='print', ctx=Load()),
    args=[Constant(value='Hello, world'), Compare(left=Constant(value=1),
    ops=[GtE()], comparators=[Constant(value=2)])], keywords=[])),
    Expr(value=Call(func=Name(id='print', ctx=Load()),
    args=[Name(id='Ellipsis', ctx=Load()), Constant(value=Ellipsis)],
    keywords=[])), Return(value=Constant(value=True))],
    decorator_list=[])], type_ignores=[])"

    (Side point: I would rather like to be able to
    pprint.pprint(ast.parse(code)) but that isn't a thing, at least not
    currently.)

    This is where the vast majority of SyntaxErrors come from. Your code
    is a sequence of tokens, but those tokens don't mean anything. It
    doesn't make sense to say "print(def f[return)]" even though that'd
    tokenize just fine. The trouble with the notion of "keeping going
    after finding an error" is that, when you find an error, there are
    almost always multiple possible ways that this COULD have been
    interpreted differently. It's as likely to give nonsense results as
    actually useful ones.

    (Note that, in contrast to the tokenization stage, this version
    distinguishes between the different types of word. The "def" has
    resulted in a FunctionDef node, the "print" is a Name lookup, and both
    "..." and "True" have now become Constant nodes - previously, "..."
    was a special Ellipsis token, but "True" was just a NAME.)

    Step 3: the abstract syntax tree gets parsed into actual runnable
    code. This is where that small handful of other SyntaxErrors come
    from. With these errors, you absolutely _could_ carry on and report
    multiple; but it's not very likely that there'll actually *be* more
    than one of them in a file. Here's some perfectly valid AST parsing:

    >>> ast.dump(ast.parse("from __future__ import the_past"))
    "Module(body=[ImportFrom(module='__future__',
    names=[alias(name='the_past')], level=0)], type_ignores=[])"
    >>> ast.dump(ast.parse("from __future__ import braces"))
    "Module(body=[ImportFrom(module='__future__',
    names=[alias(name='braces')], level=0)], type_ignores=[])"
    >>> ast.dump(ast.parse("def f():\n\tdef g():\n\t\tnonlocal x\n"))
    "Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[],
    args=[], kwonlyargs=[], kw_defaults=[], defaults=[]),
    body=[FunctionDef(name='g', args=arguments(posonlyargs=[], args=[],
    kwonlyargs=[], kw_defaults=[], defaults=[]),
    body=[Nonlocal(names=['x'])], decorator_list=[])],
    decorator_list=[])], type_ignores=[])"

    If you were to try to actually compile those to bytecode, they would fail:

    >>> compile(ast.parse("from __future__ import braces"), "-", "exec")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "-", line 1
    SyntaxError: not a chance

    And finally, step 4 is actually running the compiled bytecode. Any
    errors that happen at THIS stage are going to be run-time errors, not
    syntax errors (a SyntaxError raised at run time would be from
    compiling other code).

    So is the OP asking for something other than a Python Interpreter that normally halts after some kind of error? Tools like a linter may indeed fit that mold.

    Yes, but linters are still going to go through the same process laid
    out above. So if you have a huge pile of code that misuses "await" in
    non-async functions, sure! Maybe a linter could half-compile the code,
    then probe it repeatedly until it gets past everything. That's not
    exactly a common case, though. More likely, you'll have parsing
    errors, and the only way to "move past" a parsing error is to guess at
    what token should be added or removed to make it "kinda work".

    Alternatively, you'll get some kind of messy heuristics to try to
    restart parsing part way down, but that's pretty imperfect too.

    This may limit some of the objections of when an error makes it hard for the parser to find some recovery point to continue from as no code is being run and no harmful side effects happen by continuing just an analysis.

    It's pretty straight-forward to ensure that no code is run - just
    compile it without running it. It's still possible to attack the
    compiler itself, but far less concerning than running arbitrary code.
    Attacks on the compiler are usually deliberate; code you don't want to
    run yet might be a perfectly reasonable call to os.unlink()...

    Time to go read some books about modern ways to evaluate a language based on more mathematical rules including more precisely what is syntax versus ...

    Suggestions?


    I'd recommend looking at Python's compile() function, the ast and
    tokenize modules, and everything that they point to.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From avi.e.gross@gmail.com@21:1/5 to avi.e.gross@gmail.com on Tue Oct 11 02:42:16 2022
    I think we are in agreement here, Chris. My point is that the error
    detection and correction is now done at levels where there is not much need
    to use earlier and inefficient methods like parity bits set aside. We use protocols like TCP and IP and layers above them and above those to maintain
    the integrity of packets and sessions and forms of encryption allowing
    things like authentication. There is tons of overhead, even when some is
    fairly efficient, but we hardly notice it unless things go wrong.

    So written language sent (as in this email/post) does not need lots of redundancy and all the extra effort is, IMNSHO opinion, largely wasted. If I see a bear, I do not wish to check their genitals or DNA to determine their irrelevant gender before asking someone to run from it. If I happen to know
    the gender, as in a zoo, gender only matters for things like breeding
    purposes. I do not want to memorize terms in languages that have not only
    words like lion and lioness or duck and drake and goose and gander, but for EVERYTHING in some sense so I can say the equivalent of ANIMAL-male and ANIMAL-female with unique words. Life would be so much simpler if I could
    say your dog was nice and not be corrected that it was a bitch and I used
    the wrong word endings. If I really wanted to say it was a female dog, well
    I could just add a qualifier. Most of the time, who cares?

    The same applies to so much grammatical nonsense which is also usually
    riddled with endless exceptions to the many rules. Make the languages simple with little redundancy and thus far easier to learn.

    I can say similar things about some programming languages that either have
    way too many rules or too few of the right ones.

    There are tradeoffs and if you want a powerful language it will likely not
    be easy to control. If you want a very regulated language, you may find it
    not very useful as many things are hard to do ad others not possible. I know that strongly typed languages often have to allow some method of cheating
    such as unions of data types, or using a parent class as the sort of object-type to allow disparate objects to live together. Python is far from
    the most complex but as noted, it is not trivial to evaluate even the syntax past errors.

    But I admit it is fun and a challenge to learn both kinds and I spent much
    of my time doing so. I like the flexibility of seeing different approaches
    and holding contradictions in my mind while accepting both and yet neither! LOL!



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From avi.e.gross@gmail.com@21:1/5 to avi.e.gross@gmail.com on Tue Oct 11 03:10:51 2022
    Thanks for a rather detailed explanation of some of what we have been discussing, Chris. The overall outline is about what I assumed was there but some of the details were, to put it politely, fuzzy.

    I see resemblances to something like how a web page is loaded and operated.
    I mean very different but at some level not so much.

    I mean a typical web page is read in as HTML with various keyword regions expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
    often cleanly nested in others. The browser makes nodes galore in some kind
    of tree format with an assortment of objects whose attributes or methods represent aspects of what it sees. The resulting treelike structure has
    names like DOM.

    To a certain approximation, this tree starts a certain way but is regularly being manipulated (or perhaps a copy is) as it regularly is looked at to see how to display it on the screen at the moment based on the current tree contents and another set of rules in Cascading Style Sheets. But bits and pieces of JavaScript are also embedded or imported that can read aspects of
    the tree (and more) and modify the contents and arrange for all kinds of asynchronous events when bits of code are invoked such as when you click a button or hover or when an image finishes loading or every 100 milliseconds.
    It can insert new objects into the DOM too. And of course there can be interactions with restricted local storage as well as with servers and code running there.

    It is quite a mess but in some ways I see analogies. Your program reads a stream of data and looks for tokens and eventually turns things into a tree
    of sorts that represents relationships to a point. Additional structures eventually happen at run time that let you store collections of references
    to variables such as environments or namespaces and the program derived from the trees makes changes as it goes and in a language like Python can even possibly change the running program in some ways.

    These are not at all the same thing but share a certain set of ideas and methods and can be very powerful as things interact. In the web case, the
    CSS may search for regions with some class or ID or that are the third
    element of a bullet list and more, using powerful tools like jQuery, and
    make changes. A CSS rule that previously ignored some region as not having a particular class, might start including it after a JavaScript segment is aroused while waiting on an event listener for say a mouse hovering over an area and then changes that part of the DOM (like a node) to be in that
    class. Suddenly the area on your screen changes background or whatever the
    CSS now dictates. We have multiple systems written in an assortment of "languages" that complement each other. Some running programs, especially
    ones that use asynchronous methods like threads or callbacks on events, such
    as a GUI, can effectively do similar things.

    In effect the errors in the web situation have such analogies too as in what happens if a region of HTML is not well-formed or uses a keyword not recognized. This becomes even more interesting in XML where anything can be
    a keyword and you often need other kinds of files (often also in ML) to
    define what the XML can be like and what restrictions it may have such as
    can a <BOOK> have multiple authors but only one optional publication date
    and so on. It can be fascinating and highly technical. So I am up for a challenge of studying anything from early compilers for languages of my
    youth to more recent ways including some like what you show.

    I have time to kill and this might be more fun than other things, for a
    while.

    There was a guy around a few years ago who suggested he would create a
    system where you could create a series of some kind of configuration files
    for ANY language and his system would then compile or run programs for each
    and every such language? Was that on this forum? What ever happened to him?

    But although what he promised seemed a bit too much, I can see from your
    comments how in some ways a limited amount of that might be done for


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Roel Schroeven@21:1/5 to All on Tue Oct 11 10:00:21 2022
    Op 10/10/2022 om 19:08 schreef Robert Latest via Python-list:
    Antoon Pardon wrote:
    I would like a tool that tries to find as many syntax errors as possible
    in a python file.

    I'm puzzled as to when such a tool would be needed. How many syntax errors can
    you realistically put into a single Python file before compiling it for the first time?
    I've been following the discussion from a distance and the whole time
    I've been wondering the same thing. Especially when you have unit tests,
    as Antoon said he has, I can't really imagine a situation where you add
    so much code in one go without running it that you introduce a painful
    amount of syntax errors.

    My solution would be to use a modern IDE with a linter, possibly with
    style warnings disabled, which will flag syntax errors as soon as you
    type them. Possibly combined with a TDD-style tactic which also prevents
    large amounts of errors (any errors) from building up. But I have the
    impression that any of those doesn't fit in Antoon's workflow.

    --
    "Peace cannot be kept by force. It can only be achieved through understanding."
    -- Albert Einstein

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Antoon Pardon@21:1/5 to All on Tue Oct 11 11:53:17 2022
    Op 10/10/2022 om 19:08 schreef Robert Latest via Python-list:
    Antoon Pardon wrote:
    I would like a tool that tries to find as many syntax errors as possible
    in a python file.
    I'm puzzled as to when such a tool would be needed. How many syntax errors can
    you realistically put into a single Python file before compiling it for the first time?

    Why are you puzzled? I don't need to make that many syntax errors to find
    such a tool useful.

    --
    Antoon Pardon

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Weatherby,Gerard@21:1/5 to All on Tue Oct 11 11:26:15 2022
    Sure it does. They’re optional and not enforced at runtime, but I find them useful when writing code in PyCharm:

    import os
    from os import DirEntry

    de: DirEntry
    for de in os.scandir('/tmp'):
        print(de.name)

    de = 7
    print(de)

    Predeclaring de allows me to do the tab completion thing with DirEntry fields / methods

    From: Python-list <python-list-bounces+gweatherby=uchc.edu@python.org> on behalf of avi.e.gross@gmail.com <avi.e.gross@gmail.com>
    Date: Monday, October 10, 2022 at 10:11 PM
    To: python-list@python.org <python-list@python.org>
    Subject: RE: What to use for finding as many syntax errors as possible.

    Michael,

    A reasonable question. Python lets you initialize variables but has no
    explicit declarations.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to avi.e.gross@gmail.com on Tue Oct 11 22:39:07 2022
    On Tue, 11 Oct 2022 at 18:12, <avi.e.gross@gmail.com> wrote:

    Thanks for a rather detailed explanation of some of what we have been discussing, Chris. The overall outline is about what I assumed was there but some of the details were, to put it politely, fuzzy.

    I see resemblances to something like how a web page is loaded and operated.
    I mean very different but at some level not so much.

    I mean a typical web page is read in as HTML with various keyword regions expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
    often cleanly nested in others. The browser makes nodes galore in some kind of tree format with an assortment of objects whose attributes or methods represent aspects of what it sees. The resulting treelike structure has
    names like DOM.

    Yes. The basic idea of "tokenize, parse, compile" can be used for
    pretty much any language - even English, although its grammar is a bit
    more convoluted than most programming languages, with many weird
    backward compatibility features! I'll parse your last sentence above:

    LETTERS The
    SPACE
    LETTERS resulting
    SPACE
    ... you get the idea
    LETTERS like
    SPACE
    LETTERS DOM
    FULLSTOP # or call this token PERIOD if you're American

    Now, we can group those tokens into meaningful sets.

    Sentence(type=Statement,
        subject=Noun(name="structure", addenda=[
            Article(type=The),
            Adjective(name="treelike"),
        ]),
        verb=Verb(type=Being, name="has", addenda=[]),
        object=Noun(name="name", plural=True, addenda=[
            Adjective(phrase=Phrase(verb=Verb(name="like"),
                                    object=Noun(name="DOM"))),
        ]),
    )

    Grammar nerds will probably dispute some of the awful shorthanding I
    did here, but I didn't want to devise thousands of AST nodes just for
    this :)

    To a certain approximation, this tree starts a certain way but is regularly being manipulated (or perhaps a copy is) as it regularly is looked at to see how to display it on the screen at the moment based on the current tree contents and another set of rules in Cascading Style Sheets.

    Yep; the DOM tree is initialized from the HTML (usually - it's
    possible to start a fresh tree with no HTML) and then can be
    manipulated afterwards.

    These are not at all the same thing but share a certain set of ideas and methods and can be very powerful as things interact.

    Oh absolutely. That's why there are languages designed to help you
    define other languages.

    In effect the errors in the web situation have such analogies too as in what happens if a region of HTML is not well-formed or uses a keyword not recognized.

    Aaaaand they're horribly horribly messy, due to a few decades of
    sloppy HTML programmers and the desire to still display the page even
    if things are messed up :) But, again, there's a huge difference
    between syntactic errors (like omitting a matching angle bracket) and
    semantic errors (a keyword not known, like using <spam> when you
    should have used <span>). In the latter case, you can still build a
    DOM tree, but you have an unknown element; in the former case, you
    have to guess at what the author meant, just to get anything going at
    all.

    There was a guy around a few years ago who suggested he would create a
    system where you could create a series of some kind of configuration files for ANY language and his system would them compile or run programs for each and every such language? Was that on this forum? What ever happened to him?

    That was indeed on this forum, and I have no idea what happened to
    him. Maybe he realised that all he'd invented was the Unix shebang?

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Passin@21:1/5 to avi.e.gross@gmail.com on Tue Oct 11 14:11:56 2022
    On 10/11/2022 3:10 AM, avi.e.gross@gmail.com wrote:
    I see resemblances to something like how a web page is loaded and operated.
    I mean very different but at some level not so much.

    I mean a typical web page is read in as HTML with various keyword regions expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
    often cleanly nested in others. The browser makes nodes galore in some kind of tree format with an assortment of objects whose attributes or methods represent aspects of what it sees. The resulting treelike structure has
    names like DOM.

    To bring things back to the context of the original post, actual web
    browsers are extremely tolerant of HTML syntax errors (including
    incorrect nesting of tags) in the documents they receive. They usually
    recover silently from errors and are able to display the rest of the
    page. Usually they manage this correctly. The OP would like to have a
    parser or checker that could do the same, plus giving an output showing
    where each of the errors happened.

    I can imagine such a parser also reporting which lines it had to skip
    before it was able to recover.
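
    Python's stdlib HTML parser shows that tolerance in miniature: it
    reports what it can and keeps going past malformed markup rather than
    raising (a small sketch):

        from html.parser import HTMLParser

        class Dump(HTMLParser):
            def handle_starttag(self, tag, attrs):
                print("start", tag)
            def handle_endtag(self, tag):
                print("end", tag)

        Dump().feed("<div><p>hi</div>")  # unclosed <p>: no exception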

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to Thomas Passin on Wed Oct 12 07:00:50 2022
    On Wed, 12 Oct 2022 at 05:23, Thomas Passin <list1@tompassin.net> wrote:

    On 10/11/2022 3:10 AM, avi.e.gross@gmail.com wrote:
    I see resemblances to something like how a web page is loaded and operated. I mean very different but at some level not so much.

    I mean a typical web page is read in as HTML with various keyword regions expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things often cleanly nested in others. The browser makes nodes galore in some kind of tree format with an assortment of objects whose attributes or methods represent aspects of what it sees. The resulting treelike structure has names like DOM.

    To bring things back to the context of the original post, actual web
    browsers are extremely tolerant of HTML syntax errors (including
    incorrect nesting of tags) in the documents they receive. They usually recover silently from errors and are able to display the rest of the
    page. Usually they manage this correctly.

    Having had to debug tiny errors in HTML pages that resulted in
    extremely weird behaviour, I'm not sure that I agree that they usually
    manage correctly. Fundamentally, they guess, and guesswork is never
    reliable.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Passin@21:1/5 to Chris Angelico on Tue Oct 11 17:09:27 2022
    On 10/11/2022 4:00 PM, Chris Angelico wrote:
    On Wed, 12 Oct 2022 at 05:23, Thomas Passin <list1@tompassin.net> wrote:

    On 10/11/2022 3:10 AM, avi.e.gross@gmail.com wrote:
    I see resemblances to something like how a web page is loaded and
    operated. I mean very different but at some level not so much.

    I mean a typical web page is read in as HTML with various keyword
    regions expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV>
    with things often cleanly nested in others. The browser makes nodes
    galore in some kind of tree format with an assortment of objects whose
    attributes or methods represent aspects of what it sees. The resulting
    treelike structure has names like DOM.

    To bring things back to the context of the original post, actual web
    browsers are extremely tolerant of HTML syntax errors (including
    incorrect nesting of tags) in the documents they receive. They usually
    recover silently from errors and are able to display the rest of the
    page. Usually they manage this correctly.

    Having had to debug tiny errors in HTML pages that resulted in
    extremely weird behaviour, I'm not sure that I agree that they usually
    manage correctly. Fundamentally, they guess, and guesswork is never
    reliable.

    Still, browsers generally do a very decent job of recovery, even though perfection isn't possible. The OP wants to get help with problems in
    his files even if it isn't perfect, and I think that's reasonable to
    wish for. The link to a post about the lezer parser in a recent message
    on this thread is partly about how a real, practical parser can do some
    error correction in mid-flight, for the purposes of a programming editor
    (as opposed to one that has to build a correct program).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Passin@21:1/5 to Thomas Passin on Tue Oct 11 17:45:25 2022
    On 10/11/2022 5:09 PM, Thomas Passin wrote:
    <snip>
    The OP wants to get help with problems in
    his files even if it isn't perfect, and I think that's reasonable to
    wish for.  The link to a post about the lezer parser in a recent message
    on this thread is partly about how a real, practical parser can do some
    error correction in mid-flight, for the purposes of a programming editor
    (as opposed to one that has to build a correct program).

    One editor that seems to do what the OP wants is Visual Studio Code. It
    will mark apparent errors - not just syntax errors - not limited to one
    per page. Sometimes it can even suggest corrections. I personally
    dislike the visual clutter the markings impose, but I imagine I could
    get used to it.

    VSC uses a Microsoft system they call "Pylance" - see

    https://devblogs.microsoft.com/python/announcing-pylance-fast-feature-rich-language-support-for-python-in-visual-studio-code/

    Of course, you don't get something complex for free, and in this case
    the cost is having to run a separate server to do all this analysis on
    the fly. However, VSC handles all of that behind the scenes so you
    don't have to.

    Personally, I'd most likely go for a decent programming editor that can
    be set up to run a program on your file, use that to run a checker like
    pyflakes, and run it from time to time - for example, whenever you save
    the file. Even if it only showed one error at a time, it would make
    quick work of correcting mistakes. And it wouldn't need to trigger an
    entire tool chain each time.

    My editor of choice for setting up helper "tools" like this on Windows
    is Editplus (non-free but cheap and very worth it), and I have both
    py_compile and pyflakes set up this way in it. However, as I mentioned
    in an earlier post, the Leo Editor
    (https://github.com/leo-editor/leo-editor) does this for you
    automatically when you save, so it's very convenient. That's what I
    mostly work in.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Cameron Simpson@21:1/5 to Thomas Passin on Wed Oct 12 09:29:42 2022
    On 11Oct2022 17:45, Thomas Passin <list1@tompassin.net> wrote:
    Personally, I'd most likely go for a decent programming editor that can
    be set up to run a program on your file, use that to run a checker like
    pyflakes, and run it from time to time - for example, whenever you save
    the file. Even if it only showed one error at a time, it would make
    quick work of correcting mistakes. And it wouldn't need to trigger an
    entire tool chain each time.

    Aye.

    I've got my editor (vim) configured to run an autoformatter on my code
    when I save (this can be turned off, and parse errors prevent any
    reformatting).

    Linters I run by hand from the adjacent shell window, via a small script
    which runs my preferred linters with their preferred options.
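    (Something along these lines, say - the linter choices and options here
    are my guesses, not Cameron's actual script:)

        #!/usr/bin/env python3
        # lint.py - run the preferred linters over the files given on the
        # command line; exit nonzero if any of them complained.
        import subprocess
        import sys

        # Both tools are third-party and assumed installed; pylint runs in
        # errors-only mode so this stays a correctness check, not style.
        CHECKS = [
            [sys.executable, "-m", "pyflakes"],
            [sys.executable, "-m", "pylint", "--errors-only"],
        ]

        status = 0
        for cmd in CHECKS:
            result = subprocess.run(cmd + sys.argv[1:])
            status = status or result.returncode
        sys.exit(status)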

    My current workplace triggers the CI workflow when you push commits
    upstream, and you can make branch names which do not trigger the CI
    stuff.

    So there's a decent separation between saving (and testing or locally
    running the dev code) from the CI cycle.

    Cheers,
    Cameron Simpson <cs@cskk.id.au>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Peter J. Holzer@21:1/5 to Chris Angelico on Thu Oct 13 02:14:23 2022
    On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
    On Tue, 11 Oct 2022 at 09:18, Cameron Simpson <cs@cskk.id.au> wrote:

    Consider:

    if condition  # no colon
        code
    else:
        code

    To actually "restart" parsing, you have to make a guess of some sort.

    Right. At least one of the papers on parsing I read over the last few
    years (yeah, I really should try to find them again) argued that the
    vast majority of syntax errors are either a missing token, a
    superfluous token, or a combination of the two. So one strategy with
    good results is to heuristically try to insert or delete single tokens
    and check which attempt results in the longest distance to the next
    error.

    Checking multiple possible fixes has its cost, especially since you
    have to do that at every error. So you can argue that it is better for
    productivity if you discover one error in 0.1 seconds than 10 errors in
    5 seconds.
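    (A toy sketch of that heuristic, using only the stdlib ast module.
    Real recovery parsers work on token streams rather than whole lines,
    so this is only to make the idea concrete:)

        import ast

        def error_line(src):
            """Line of the first syntax error, or None if src parses."""
            try:
                ast.parse(src)
                return None
            except SyntaxError as exc:
                return exc.lineno

        def candidate_repairs(src):
            """Yield (description, repaired_src) single-edit candidates."""
            lineno = error_line(src)
            if lineno is None:
                return  # nothing to repair
            lines = src.splitlines()
            bad = lines[lineno - 1]
            # 'missing token' guess: append a colon, often forgotten
            yield "append ':'", "\n".join(
                lines[:lineno - 1] + [bad + ":"] + lines[lineno:])
            # 'superfluous token' stand-in: drop the offending line
            yield "drop line", "\n".join(
                lines[:lineno - 1] + lines[lineno:])

        def best_repair(src):
            """Prefer the repair whose next error is furthest away."""
            def score(item):
                nxt = error_line(item[1])
                return float("inf") if nxt is None else nxt
            return max(candidate_repairs(src), key=score)

        broken = "if x > 0\n    print(x)\nelse:\n    print(-x)\n"
        print(best_repair(broken)[0])   # -> append ':'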


    I grew up with C and Pascal compilers which would _happily_ produce
    many complaints, usually accurate, about all manner of syntactic
    errors. They didn't stop at the first syntax error.

    Yes, because they work with a much simpler grammar.

    I very much doubt that. Python doesn't have a particularly complicated
    grammar, and C certainly doesn't have a particularly simple one.

    The argument that it's impossible in Python (unlike in any other
    language) because Python is oh so special doesn't hold water.

    hp

    --
       _  | Peter J. Holzer    | Story must make more sense than reality.
    |_|_) |                    |
    | |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
    __/   | http://www.hjp.at/ |       challenge!"

  • From Chris Angelico@21:1/5 to PythonList@danceswithmice.info on Thu Oct 13 11:29:22 2022
    On Thu, 13 Oct 2022 at 11:23, dn <PythonList@danceswithmice.info> wrote:
    # add an extra character within identifier, as if 'new' identifier
    28 assert expected_value == fyibonacci_number
              UUUUUUUUUUUUUU    UUUUUUUUUUUUUUUUU

    # these all trivial SYNTAX errors - could have tried leaving-out a
    keyword, but ...

    Just to be clear, this last one is not actually a *syntax* error -
    it's a misspelled name, but contextually, that is clearly a name and
    nothing else. These are much easier to report multiples of, and
    typical syntax highlighters will do so.

    Your other two examples were both syntactic discrepancies though.
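    (The difference is easy to demonstrate with the stdlib: the
    misspelling parses fine and only a linter or the run itself would
    catch it, while a genuine syntax error stops the parser cold:)

        import ast

        # a misspelled name is still syntactically valid Python
        ast.parse("assert expected_value == fyibonacci_number")

        # a true syntax error raises immediately
        try:
            ast.parse("i, j = 0;, 1")
        except SyntaxError as exc:
            print(exc.msg)   # e.g. 'invalid syntax'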

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Angelico@21:1/5 to Peter J. Holzer on Thu Oct 13 11:23:40 2022
    On Thu, 13 Oct 2022 at 11:19, Peter J. Holzer <hjp-python@hjp.at> wrote:

    On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
    On Tue, 11 Oct 2022 at 09:18, Cameron Simpson <cs@cskk.id.au> wrote:

    Consider:

    if condition  # no colon
        code
    else:
        code

    To actually "restart" parsing, you have to make a guess of some sort.

    Right. At least one of the papers on parsing I read over the last few
    years (yeah, I really should try to find them again) argued that the
    vast majority of syntax errors are either a missing token, a
    superfluous token, or a combination of the two. So one strategy with
    good results is to heuristically try to insert or delete single tokens
    and check which attempt results in the longest distance to the next
    error.

    Checking multiple possible fixes has its cost, especially since you
    have to do that at every error. So you can argue that it is better for
    productivity if you discover one error in 0.1 seconds than 10 errors in
    5 seconds.

    Maybe; but what if you report 10 errors in 5 seconds, but 8 of them
    are spurious? You've reported two useful errors in a sea of noise.
    Even if it's the other way around (8 where you nailed it and correctly
    reported the error, 2 that are nonsense), is it actually helpful? Bear
    in mind that, if you can discover one syntax error in 0.1 seconds, you
    can do that check *the moment the user types a key* in the editor
    (which is more-or-less what happens with most syntax-highlighting
    editors - some have a small delay to avoid being too noisy with error
    reporting, but same difference). Why report false errors when you can
    report errors one by one and know that they're true?
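    (For scale, the 0.1-second figure is easy to test yourself -
    CPython's own compile() stops at the first error, so timing it gives
    the cost of one first-error-only check:)

        # time_first_error.py - how long does a first-error check take?
        import sys
        import time

        source = open(sys.argv[1]).read()
        start = time.perf_counter()
        try:
            compile(source, sys.argv[1], "exec")
            print("no syntax errors")
        except SyntaxError as exc:
            print(f"{exc.filename}:{exc.lineno}: {exc.msg}")
        print(f"checked in {time.perf_counter() - start:.4f}s")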

    I grew up with C and Pascal compilers which would _happily_ produce
    many complaints, usually accurate, about all manner of syntactic
    errors. They didn't stop at the first syntax error.

    Yes, because they work with a much simpler grammar.

    I very much doubt that. Python doesn't have a particularly complicated
    grammar, and C certainly doesn't have a particularly simple one.

    The argument that it's impossible in Python (unlike in any other
    language) because Python is oh so special doesn't hold water.


    Never said it's because Python is special; there are a LOT of
    languages that are at least as complicated. Try giving multiple useful
    errors when there's a syntactic problem in SQL, for instance. But I do
    think that Pascal, especially, has a significantly simpler grammar
    than Python does.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Peter J. Holzer@21:1/5 to Chris Angelico on Thu Oct 13 02:45:22 2022
    On 2022-10-13 11:23:40 +1100, Chris Angelico wrote:
    On Thu, 13 Oct 2022 at 11:19, Peter J. Holzer <hjp-python@hjp.at> wrote:
    On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
    On Tue, 11 Oct 2022 at 09:18, Cameron Simpson <cs@cskk.id.au> wrote:

    Consider:

    if condition  # no colon
        code
    else:
        code

    To actually "restart" parsing, you have to make a guess of some sort.

    Right. At least one of the papers on parsing I read over the last few
    years (yeah, I really should try to find them again) argued that the
    vast majority of syntax errors are either a missing token, a
    superfluous token, or a combination of the two. So one strategy with
    good results is to heuristically try to insert or delete single tokens
    and check which attempt results in the longest distance to the next
    error.

    Checking multiple possible fixes has its cost, especially since you
    have to do that at every error. So you can argue that it is better for
    productivity if you discover one error in 0.1 seconds than 10 errors in
    5 seconds.

    Maybe; but what if you report 10 errors in 5 seconds, but 8 of them
    are spurious? You've reported two useful errors in a sea of noise.
    Even if it's the other way around (8 where you nailed it and correctly
    reported the error, 2 that are nonsense), is it actually helpful?

    Humans are pattern-matching animals. It is quite possible that seeing a
    bunch of related errors makes the fix more obvious than seeing them in
    isolation.

    No, I haven't done any studies on this. Yes, it is possible that all
    those compiler writers who spent lots of work on error recovery over the
    last 50 years (or longer) are delusional.


    I grew up with C and Pascal compilers which would _happily_ produce
    many complaints, usually accurate, about all manner of syntactic
    errors. They didn't stop at the first syntax error.

    Yes, because they work with a much simpler grammar.

    I very much doubt that. Python doesn't have a particularly complicated
    grammar, and C certainly doesn't have a particularly simple one.

    The argument that it's impossible in Python (unlike in any other
    language) because Python is oh so special doesn't hold water.


    Never said it's because Python is special; there are a LOT of
    languages that are at least as complicated.

    And almost all of their compilers do try to recover from errors.

    But I do think that Pascal, especially, has a significantly simpler
    grammar than Python does.

    Incidentally, Turbo Pascal was the one other example of a compiler
    which *didn't* try to recover.

    hp

    --
       _  | Peter J. Holzer    | Story must make more sense than reality.
    |_|_) |                    |
    | |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
    __/   | http://www.hjp.at/ |       challenge!"

  • From dn@21:1/5 to Antoon Pardon on Thu Oct 13 13:22:08 2022
    On 09/10/2022 23.09, Antoon Pardon wrote:
    I would like a tool that tries to find as many syntax errors as possible
    in a python file. I know there is the risk of false positives when a
    tool tries to recover from a syntax error and proceeds but I would
    prefer that over the current python strategy of quiting after the first syntax error. I just want a tool for syntax errors. No style
    enforcements. Any recommandations? -- Antoon Pardon


    Am not sure if I have really understood the problem being addressed,
    because it seems 'answered' - perhaps the question says more about the
    tool-set being utilised...


    As someone who used to manually check and re-check code (first
    punched-cards, and later edited files) before submitting the source to
    a compiler, it took some re-education to learn what to expect from a
    modern/language-intelligent IDE.

    The topic was a major interest back in the days of batch-compilers. Plus
    we had other tools, eg CREF/XREF utilities which produced
    cross-references of identifier usage - and illustrated typos in
    identifiers, usage before value-assignment, etc (per request from one respondent).


    Using an IDE which is inspecting source-code as it is being typed (or
    when an existing file is opened) will suggest what might/should be
    typed 'next' (a mixed blessing IMHO!), and secondly will highlight
    errors until they are noticed and dealt-with. Some, especially
    warnings, can be safely ignored - and yes, some are spurious and SHOULD
    be ignored!

    PyCharm* displays a number of indicators. The least intrusive appears in
    the top-right corner of the editor-tab listing, eg 8 errors, 2 warnings.
    So, apparently not 'stopping' at first error found.

    Within the source-code itself, there are high-lights and under-lines
    (in and amongst the syntax highlighting presentation/theme) - which I
    suppose are easier to notice during data-entry if one is a
    touch-typist. Accordingly, not much of a context for multiple errors
    to be committed during a single coding-session, but remaining
    un-noticed until 'the end'.


    For illustration, I took a simple tutorial* routine and deliberately
    introduced some/many of the types of error discussed within this
    thread. It would have been ideal to attach a graphic, but here are some
    lines of code, under which I have attempted to represent a highlighted
    character (related to the line above) with an "H", and a (red)
    under-lined token with a "U". So, this is a feeble-attempt to show how
    the source is displayed and annotated by the IDE:

    # mis-type the tuple-assignment by adding semi-colon
    # which might also confuse Python into thinking of a second instruction
    17 i, j = 0;, 1
               H UH

    # replace under-line/under-score with space: s/b expected_value
    25 for expected value, fibonacci_number in \
           HHHHHHHH UUUUUU UUUUUUUUUUUUUUUU

    # mis-type the name of the zip built-in function
    26 z ip( SERIES, fibonacci_generator() ):
       U UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU

    # add an extra character within identifier, as if 'new' identifier
    28 assert expected_value == fyibonacci_number
              UUUUUUUUUUUUUU    UUUUUUUUUUUUUUUUU

    # these all trivial SYNTAX errors - could have tried leaving-out a
    keyword, but ...
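    (For reference, a guess at the intact routine these errors were
    injected into - SERIES and fibonacci_generator are merely the names
    visible in the excerpt above; the actual tutorial code may differ:)

        def fibonacci_generator():
            i, j = 0, 1              # line 17, stray semicolon removed
            while True:
                yield i
                i, j = j, i + j

        SERIES = (0, 1, 1, 2, 3, 5, 8, 13)

        for expected_value, fibonacci_number in \
                zip(SERIES, fibonacci_generator()):   # lines 25-26 fixed
            assert expected_value == fibonacci_number  # line 28 fixed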


    Assuming the problem is not noticed/handled as the text is being typed,
    and in addition to the coder reviewing the work, recognising problems,
    and dealing with them him-/her-self; the IDE offers two follow-up
    mechanisms:

    1 a means to jump 'focus' from the site of one error to the next,
    whereupon a pop-up will describe the error, eg (line 28) "Unresolved
    reference 'expected_value'"; which illustrates one problem in-isolation.
    In this case, line 28 is 'at fault' despite the fact that the 'error' is
    a consequence of THE problem on line 25!

    2 a "Problems" Tool Window can be displayed, which will list every error
    and warning, with pretty, colored, icons, and the same message per
    example above, together with the relevant line-number, (the first two
    entries, as-listed, are 'warnings', and the rest are described as "errors"):

    Need more values to unpack:17
    Statement seems to have no effect:17
    # so it has picked-up both of my nefarious intentions

    Statement expected, found Py:COMMA:17
    # as above
    # NB the "Py:COMMA" is from tokenize (per @Chris contribution(s))
    'in' expected:25
    # logical, but confused by the space
    Unresolved reference 'value':25
    # pretty-much had no chance with so many faults in one statement!
    Unresolved reference 'fibonacci_number':25
    # ditto
    Unresolved reference 'z':26
    # absolutely!
    ':' expected:26
    # evidently re-started after the "in" and did what it could with the "z"
    Unresolved reference 'expected_value':28
    # it would be "resolved" but for the first error on line 25
    Unresolved reference 'fyibonacci_number':28
    # ahah! Apparently trying to use an identifier before declaring/defining
    # in reality, just another typo
    # that said, I created the issue by inserting the "y"
    # if I'd mistyped the entire identifier upon first-entry,
    # the IDE's code-completion facility would have given choices and 'saved me'

    NB the content displayed in the Problems Tool Window is dynamic.
    Accordingly, because of the way Python works, if one 'fixes' errors in
    line-number sequence, closing an error 'higher up' may well cause
    a(nother) error 'lower down' to disappear - thus reducing the number of
    spurious errors one is likely to encounter as a limitation of the
    automated code-evaluation!
    --
    Regards,
    =dn

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Peter J. Holzer@21:1/5 to Thomas Passin on Thu Oct 13 02:48:54 2022
    On 2022-10-11 14:11:56 -0400, Thomas Passin wrote:
    To bring things back to the context of the original post, actual web
    browsers are extremely tolerant of HTML syntax errors (including
    incorrect nesting of tags) in the documents they receive.

    HTML5 actually specifies exactly how to recover from errors. So, since
    every sequence of bytes results in a well-defined DOM tree, you might
    argue (a bit tongue in cheek) that there are no syntax errors in HTML5.
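    (Python's stdlib html.parser is not an HTML5 tree builder, but it
    shows the same tolerant philosophy - it walks badly nested markup
    without ever raising:)

        from html.parser import HTMLParser

        class ShowTags(HTMLParser):
            def handle_starttag(self, tag, attrs):
                print("start", tag)
            def handle_endtag(self, tag):
                print("end", tag)

        # wrongly nested and unclosed tags: no exception, just events
        ShowTags().feed("<b><i>bold-italic?</b></i><p>unclosed")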

    hp

    --
       _  | Peter J. Holzer    | Story must make more sense than reality.
    |_|_) |                    |
    | |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
    __/   | http://www.hjp.at/ |       challenge!"

  • From Alex Hall@21:1/5 to Antoon Pardon on Tue Nov 8 12:09:44 2022
    On Sunday, October 9, 2022 at 12:09:45 PM UTC+2, Antoon Pardon wrote:
    I would like a tool that tries to find as many syntax errors as possible
    in a python file. I know there is the risk of false positives when a
    tool tries to recover from a syntax error and proceeds but I would
    prefer that over the current python strategy of quiting after the first syntax error. I just want a tool for syntax errors. No style
    enforcements. Any recommandations? -- Antoon Pardon

    Bit late here, coming from the Pycoder's Weekly email newsletter, but
    I'm surprised that I don't see any mentions of
    [parso](https://parso.readthedocs.io/en/latest/):

    Parso is a Python parser that supports error recovery and round-trip
    parsing for different Python versions (in multiple Python versions).
    Parso is also able to list multiple syntax errors in your python file.
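    (A minimal sketch of that usage, per the parso docs - install with
    `pip install parso`; the exact Issue attributes may vary by version:)

        import parso

        code = "i, j = 0;, 1\nif x > 0\n    print(x)\n"
        grammar = parso.load_grammar()
        module = grammar.parse(code)   # error recovery is on by default
        for issue in grammar.iter_errors(module):
            print(issue.start_pos, issue.message)
        # unlike compile()/ast.parse(), this lists all the syntax errors
        # instead of stopping at the first one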

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)