• Basic Lexing Question

    From Jon Forrest@21:1/5 to All on Wed Jun 29 10:11:54 2022
    The following line is from a makefile accepted by gmake:

    onefile: $(AVAR)

    I'm wondering what the ramifications are of lexing what's on the right of the colon as a single string and then breaking it apart later, as opposed to returning a more detailed sequence of tokens, such as DOLLAR LPAREN NAME RPAREN.

    gmake appears to do the former, I'm guessing because it means a simpler
    grammar but that seems like just postponing the hard work until later.
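
    The two options can be sketched in Python as follows. This is only an illustration, not how gmake is implemented: the function names, the STRING and COLON token names, and the set of characters allowed in a NAME are assumptions made for the example.

```python
import re

def lex_coarse(line):
    """Coarse lexing: everything right of the colon is one STRING token."""
    target, _, rest = line.partition(":")
    return [("NAME", target.strip()), ("COLON", ":"), ("STRING", rest.strip())]

# One alternative per token kind; whitespace is matched so it can be dropped.
TOKEN_RE = re.compile(
    r"(?P<DOLLAR>\$)|(?P<LPAREN>\()|(?P<RPAREN>\))|(?P<COLON>:)"
    r"|(?P<NAME>[A-Za-z0-9_.\-]+)|(?P<WS>\s+)"
)

def lex_fine(line):
    """Fine lexing: DOLLAR, LPAREN, NAME, RPAREN are separate tokens.

    Characters that match no alternative are silently skipped (sketch only).
    """
    return [(m.lastgroup, m.group())
            for m in TOKEN_RE.finditer(line)
            if m.lastgroup != "WS"]
```

    With the coarse lexer the parser sees three tokens and the STRING has to be scanned again later; with the fine lexer the grammar itself must describe variable references.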

    Cordially,
    Jon Forrest

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to nob...@gmail.com on Wed Jun 29 16:27:06 2022
    On Wednesday, June 29, 2022 at 2:02:08 PM UTC-7, nob...@gmail.com wrote:
    > The following line is from a makefile accepted by gmake:
    >
    > onefile: $(AVAR)
    >
    > I'm wondering what the ramifications are of lexing what's on the right of the colon as a single string and then breaking it apart later, as opposed to returning a more detailed sequence of tokens, such as DOLLAR LPAREN NAME RPAREN.

    I suspect that the question is more complicated than it looks.

    Well, first, you might look at the gmake manual, and especially here:

    https://www.gnu.org/software/make/manual/html_node/Flavors.html#Flavors

    Often in interpreted languages, and also in languages that use a preprocessor, you have to consider that things might be parsed more than once.

    As far as I know, when processing that line gmake scans it for $ and (mostly) ignores everything else. (I am also not sure how it treats string constants.) Variables are replaced, and then the line
    is executed. Except when it isn't.

    It seems that in a variable assignment:

    bvar = $(AVAR)

    the reference isn't expanded yet; the text $(AVAR) itself is the value of bvar.
    Later, when $(bvar) is referenced, $(AVAR) is substituted for it,
    and then the value of AVAR is substituted in turn.

    In addition, gmake has

    cvar ::= $(AVAR)

    where $(AVAR) is expanded immediately, at the point of assignment.
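
    The difference between the two assignment forms can be imitated with a purely textual expander. This is a minimal Python sketch, not gmake's algorithm: the expand helper and the variable values are made up, only the $(NAME) form is recognized, and (unlike real make) stored values are always scanned again when referenced.

```python
import re

REF = re.compile(r"\$\((\w+)\)")

def expand(text, variables):
    """Replace each $(NAME) with the fully expanded value of NAME."""
    return REF.sub(
        lambda m: expand(variables.get(m.group(1), ""), variables), text)

variables = {"AVAR": "first"}
variables["bvar"] = "$(AVAR)"                     # like bvar = $(AVAR): keep the text
variables["cvar"] = expand("$(AVAR)", variables)  # like cvar ::= $(AVAR): expand now

variables["AVAR"] = "second"                      # AVAR changes afterwards

late = expand("$(bvar)", variables)    # sees the current AVAR
early = expand("$(cvar)", variables)   # kept the value from assignment time
```

    Here late is "second" while early is still "first", which is the order-of-expansion effect described above.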

    I first thought about this with PHP, which is a preprocessor (meant for)
    HTML. The processor doesn't know about HTML at all; it just looks for

    <?php

    such that:

    <?php
    echo "Hi, I'm a PHP script!";
    ?>

    is processed by PHP, with the result sent out by the server for the
    web browser to process. I am not sure of the exact rules, so it might
    be that it is processed differently in quoted strings, but I suspect not.
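
    The "look only for the opening tag" idea can be sketched in Python. This is a deliberately naive splitter and makes no attempt to reproduce PHP's actual rules (in particular it knows nothing about quoted strings or comments); split_template and the chunk labels are hypothetical names.

```python
def split_template(source, open_tag="<?php", close_tag="?>"):
    """Split source into ("text", ...) and ("code", ...) chunks."""
    chunks = []
    pos = 0
    while True:
        start = source.find(open_tag, pos)
        if start == -1:
            # No more code regions: the remainder is passed through as text.
            if pos < len(source):
                chunks.append(("text", source[pos:]))
            return chunks
        if start > pos:
            chunks.append(("text", source[pos:start]))
        body_start = start + len(open_tag)
        end = source.find(close_tag, body_start)
        if end == -1:
            # Unterminated region: treat everything to the end as code.
            chunks.append(("code", source[body_start:]))
            return chunks
        chunks.append(("code", source[body_start:end]))
        pos = end + len(close_tag)
```

    The surrounding markup is never parsed; it is only copied through between code regions.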

    The gmake manual has this example, which it recommends against using:

    foo = c
    prog.o : prog.$(foo)
            $(foo)$(foo) -$(foo) prog.$(foo)

    Note that $(foo)$(foo) is replaced by cc, so the recipe runs the C compiler as cc -c prog.c.

    Some of the more interesting parsing examples come with TeX, which allows
    one to change, while it is running, which characters are letters. Only letters
    can be used in a control-sequence name longer than one character.
    (Note that, unlike in many languages, digits are not allowed ... unless they have been made letters!)
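
    A small Python sketch of that rule, with the set of letters passed in so it can change between calls. This is a simplification, not TeX's real tokenizer (for example, it does not skip spaces after a name); read_control_sequence is a hypothetical helper.

```python
import string

def read_control_sequence(text, pos, letters):
    """Read a control sequence starting at the backslash at text[pos].

    Returns (name, next_pos). A name is either a run of letters or a
    single non-letter character.
    """
    assert text[pos] == "\\"
    i = pos + 1
    if i < len(text) and text[i] in letters:
        while i < len(text) and text[i] in letters:
            i += 1
    else:
        i = min(i + 1, len(text))
    return text[pos + 1:i], i

default_letters = set(string.ascii_letters)
```

    With default_letters, the input \foo1 yields the name foo and leaves the 1 behind; after adding "1" to the set, the same input yields foo1.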

    TeX also has \expandafter, which allows for delaying the expansion of something until what follows it has been expanded.

    In any case, when input is parsed more than once, often by parsers with different rules, the exact order of processing is very important!

  • From Johann Klammer@21:1/5 to Jon Forrest on Thu Jun 30 02:38:44 2022
    On 06/29/2022 07:11 PM, Jon Forrest wrote:
    > The following line is from a makefile accepted by gmake:
    >
    > onefile: $(AVAR)

    Don't they have functions that go inside the parens to do stuff like wildcard searches and
    string operations?
    [Sure, but they have to run during one of the phases that scans the input -John]

  • From Kaz Kylheku@21:1/5 to gah4@u.washington.edu on Fri Jul 1 04:44:08 2022
    On 2022-06-29, gah4 <gah4@u.washington.edu> wrote:
    > On Wednesday, June 29, 2022 at 2:02:08 PM UTC-7, nob...@gmail.com wrote:
    >> The following line is from a makefile accepted by gmake:
    >>
    >> onefile: $(AVAR)
    >>
    >> I'm wondering what the ramifications are of lexing what's on the right of the
    >> colon as a single string and then breaking it apart later, as opposed to
    >> returning a more detailed sequence of tokens, such as DOLLAR LPAREN NAME
    >> RPAREN.
    >
    > I suspect that the question is more complicated than it looks.
    >
    > Well, first, you might look at the gmake manual, and especially here:
    >
    > https://www.gnu.org/software/make/manual/html_node/Flavors.html#Flavors
    >
    > Often in interpreted languages, and also in languages that use a preprocessor,
    > you have to consider that things might be parsed more than once.
    >
    > As far as I know, when processing that line gmake scans it for $ and (mostly)
    > ignores everything else. (I am also not sure how it treats string constants.)
    > Variables are replaced, and then the line is executed. Except when it isn't.
    >
    > It seems that in a variable assignment:
    >
    > bvar = $(AVAR)
    >
    > the reference isn't expanded yet; the text $(AVAR) itself is the value of bvar.
    > Later, when $(bvar) is referenced, $(AVAR) is substituted for it,
    > and then the value of AVAR is substituted in turn.

    I believe that = versus := variables can all be handled at the semantic
    level, after an abstract parse.

    bvar = $(AVAR)

    is like a macro. When we call $(bvar), it must expand to $(AVAR), which
    then expands to the current value of AVAR. That does not mean we have
    to scan any tokens again; bvar can exist in a translated form.

    Whereas

    bvar := $(AVAR)

    can produce exactly the same representation for the right hand side,
    but then evaluate it immediately and capture the resulting string into
    bvar.
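
    A minimal Python sketch of that idea, assuming a tiny tree with only literal text and variable references. The class and function names (Text, Ref, assign_recursive, assign_simple) are invented for the example and are not taken from gmake.

```python
from dataclasses import dataclass

@dataclass
class Text:
    value: str
    def evaluate(self, env):
        return self.value

@dataclass
class Ref:
    name: str
    def evaluate(self, env):
        # Look up the stored node and evaluate it; no characters are rescanned.
        node = env.get(self.name)
        return node.evaluate(env) if node is not None else ""

def assign_recursive(env, name, node):
    env[name] = node                      # like "=": keep the parsed form

def assign_simple(env, name, node):
    env[name] = Text(node.evaluate(env))  # like ":=": evaluate now, store the string

env = {}
assign_recursive(env, "AVAR", Text("first"))
rhs = Ref("AVAR")                         # parsed form of $(AVAR), same for both
assign_recursive(env, "lazy", rhs)
assign_simple(env, "eager", rhs)
assign_recursive(env, "AVAR", Text("second"))

lazy_value = Ref("lazy").evaluate(env)
eager_value = Ref("eager").evaluate(env)
```

    Both assignments start from the same parsed right-hand side; only the moment of evaluation differs, so lazy_value is "second" and eager_value is "first".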

    The one Gmake feature which, I suspect, *must* re-scan the code at
    the character level is $(eval ...). Because, look what you can do:

    DOLLAR := $$
    LPAREN := (
    RPAREN := )

    CODE := $(DOLLAR)$(LPAREN)info CC is $(DOLLAR)$(LPAREN)CC$(RPAREN)$(RPAREN)

    $(eval $(CODE))

    .PHONY: all
    all:

    Output:

    CC is cc

    Other than eval, I don't suspect anything else requires rescanning;
    and note that $(CODE) without eval will not rescan!
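
    The rescanning point can be illustrated with a one-pass textual expander in Python. This is a sketch under simplifying assumptions: it handles only plain $(NAME) references (no functions such as info), and expand_once is a made-up helper, not a gmake internal.

```python
import re

REF = re.compile(r"\$\((\w+)\)")

def expand_once(text, variables):
    """One left-to-right pass; substituted values are not scanned again."""
    return REF.sub(lambda m: variables.get(m.group(1), ""), text)

variables = {"DOLLAR": "$", "LPAREN": "(", "RPAREN": ")", "CC": "cc"}

# The first pass assembles the characters of a reference but does not expand it.
code = expand_once("$(DOLLAR)$(LPAREN)CC$(RPAREN)", variables)

# A second scan over the result plays the role of eval.
rescanned = expand_once(code, variables)
```

    After the first pass, code is the literal text $(CC); only the second scan turns it into cc.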

    So this will not work without eval either:

    $(shell echo foo.o: foo.c)

    but this will produce a dependency rule:

    $(eval $(shell echo foo.o: foo.c))

    Make can be given a bona-fide "waterfall model of compiling" treatment. :)
