• What happens at the end of the file for lex?

    From Philipp Klaus Krause@21:1/5 to All on Wed Jun 3 10:04:50 2020
    I wonder what is supposed to happen when a lex lexer reaches the end of
    the input during a call to input().

    The only information I found in the flex 2.6.4 manual states:

    If 'input()' encounters an end-of-file the normal 'yywrap()' processing
    is done. A "real" end-of-file is returned by 'input()' as 'EOF'.

    What is the difference between an "end-of-file" and a "'real' end-of-file"?

    I did a quick test using this .lex file:


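    %{
    /* Test: after matching one character, print the next 8 values returned by input(). */
    #include <stdio.h>
    %}
    %%
    . { for (int i = 0; i < 8; i++) { int ch = input(); printf("%d\n", ch); } }

    %%
    int main(void)
    {
        yylex();
        return 0;
    }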

    And using a single-character input-file, I see:

    philipp@notebook5:/tmp$ ./a.out < test.c
    10
    0
    0
    0
    0
    0
    0
    0

    So apparently input() just returns 0 (and keeps doing so).

    Is input() supposed to always return 0 at the end? Could input() return 0
    in some other situation? When would input() return EOF?

    Philipp
    [The convention in lex and flex is that input() returns 0 at the end of
    input. You can use a <<EOF>> rule if you want your lexer to do something
    other than return when it gets to EOF. The yywrap() routine is used for
    file switching if your lexer handles multiple input files in a single
    run. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Philipp Klaus Krause@21:1/5 to All on Thu Jun 4 19:16:09 2020
    Further investigation shows that this was an intentional but
    undocumented change in flex in 2015 (in flex 2.5.4, input() returns EOF
    at the end of the file; in flex 2.6.4, it returns 0). However, I still
    have no idea why this change was made.

    I guess the only portable way to handle the end of the file is to set a
    flag in yywrap() and check it each time input() is called.

    Philipp
    [Returning EOF was a bug. The lex input() always returned 0 at end of file. -John]
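
The flag idea from the post above could be sketched roughly as follows, in C for the user-code section of the .l file. This is only an illustration, not something posted in the thread: `reached_eof` and `normalize_input()` are hypothetical names, and the sketch assumes that yywrap() runs before input() hands back its end-of-input value, as the manual excerpt quoted earlier suggests. It also assumes a user-supplied yywrap() (so no `%option noyywrap`).

```c
#include <stdio.h>

/* Hypothetical flag: set once the scanner has run out of input. */
static int reached_eof = 0;

/* User-supplied yywrap(): record that the end of input was seen and
   return 1 to say there is no further input file to switch to. */
int yywrap(void)
{
    reached_eof = 1;
    return 1;
}

/* Hypothetical helper: raw is the value input() just returned.
   After yywrap() has run, both end-of-input conventions (0 and EOF)
   are mapped to EOF; every other value is passed through unchanged,
   as is a 0 seen before the end of input was reached. */
int normalize_input(int raw)
{
    if (reached_eof && (raw == 0 || raw == EOF))
        return EOF;
    return raw;
}
```

Inside an action this would be used as `int ch = normalize_input(input());`. The argument is evaluated first, so any yywrap() call triggered by input() has already set the flag when the helper inspects it. The flag is what lets a 0 that arrives before the end of input pass through as an ordinary value.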
