• Converting a lex scanner to flex, help needed

    From Aharon Robbins@21:1/5 to All on Wed Dec 29 20:54:01 2021
    Hi.

    I am trying to convert a V7 Unix vintage lex scanner to flex.

    The rule

    #.* {fixval(); xxbp = -1; return(xxcom); }

    seems to be consuming as much as it can instead of stopping at
    the first newline. When I look at the collected buffer, it
    has multiple lines in it:

    (gdb) p xxbuf
    $7 = "# ========== ratfor in fortran for bootstrap ==========\n#\n# block data - initialize global variables\n#\nblock data\ncommon /cchar/ extdig(10), intdig(10), extlet(26), intlet(26), extbig(26), intbig(26"...

    The program I am trying to modernize is 'struct', which reads Fortran
    and produces Ratfor. The lex scanner is in the 'beautify' part. The whole
    thing is at https://github.com/arnoldrobbins/struct. If you clone the
    repo, check out the 'modernize' branch, and fix the makefile to compile
    with gcc -m32, you will get working binaries. (64 bit and cleaning up
    the warnings is work in progress.)

    What am I doing wrong?

    Thanks,

    Arnold
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com
    [In flex a . doesn't match a newline. What do you see when you look at
    yytext, which is the token it matched? The input buffer doesn't tell you
    anything very useful about individual matched tokens. -John]
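
The moderator's point, that the matched token and the input buffer are different things, can be sketched in plain C. This is a hand-written stand-in for what the pattern `#.*` matches, not flex's generated code; the helper name `comment_match_len` is hypothetical and does not appear in the struct sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of flex or lex: returns the length of the
 * text that the pattern  #.*  would match at the start of buf, or 0 if
 * buf does not start with '#'. Like '.', it never crosses a newline. */
size_t comment_match_len(const char *buf)
{
    assert(buf != NULL);
    if (buf[0] != '#')
        return 0;
    /* strcspn counts the characters before the first '\n' (or the end
     * of the string if there is no newline). */
    return strcspn(buf, "\n");
}
```

Given a buffer holding several lines, such as the one printed from gdb above, the function reports only the length of the first line. The rest of the buffer is input that has been read but not yet matched, which is why inspecting the buffer says little about a single token.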

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Aharon Robbins@21:1/5 to Aharon Robbins on Thu Dec 30 08:10:27 2021
    In article <21-12-019@comp.compilers>,
    Aharon Robbins <arnold@skeeve.com> wrote:
    >I am trying to convert a V7 Unix vintage lex scanner to flex.
    >....
    >[In flex a . doesn't match a newline. What do you see when you look at
    >yytext, which is the token it matched? The input buffer doesn't tell you
    >anything very useful about individual matched tokens. -John]

    You're right, it looks like yytext is fine. There seems to be
    other stuff going on between the grammar and the scanner, with the
    grammar poking around inside the input buffer and expecting things to
    work the way they did in lex.

    I will probably have to dive into the code more deeply, instead of
    just mechanically fixing compilation warnings, which is mostly
    what I've been doing so far.

    As an aside, the original code very cavalierly converted int to pointer
    and back, all over. Over 40 years later, it's really hard to have
    to mess with code like this.

    Interestingly enough, though, when compiled in 32 bit, where int
    and pointer are the same size, things seem to actually work!
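
Why the int-to-pointer conversions happen to work in a 32-bit build can be shown with a small C sketch. It assumes a platform that provides `intptr_t` (optional in the C standard, but present on common systems); the helper names `int_holds_pointer` and `roundtrip` are hypothetical and are not taken from the struct sources.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if an int is wide enough to hold a pointer on this platform,
 * which is the assumption the old code silently relied on. This is
 * typically true for 32-bit builds and false for 64-bit builds. */
int int_holds_pointer(void)
{
    return sizeof(int) >= sizeof(void *);
}

/* Round-trips a pointer through an integer type that is defined to be
 * wide enough for it. Using intptr_t instead of int keeps the value
 * intact regardless of the width of int. */
void *roundtrip(void *p)
{
    assert(sizeof(intptr_t) >= sizeof(void *));
    intptr_t as_int = (intptr_t)p;   /* pointer -> integer */
    return (void *)as_int;           /* integer -> pointer */
}
```

On typical 64-bit platforms `int` is 32 bits and pointers are 64 bits, so storing a pointer in an `int` can drop the upper half. Replacing such `int` variables with `intptr_t` is one possible way to keep the same code shape while making it size-safe.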

    Thanks,

    Arnold
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com
    [Urrgh. The file handling in flex is quite different from lex. In lex
    it's very simple; I think it just read a line at a time into a buffer.
    In flex it reads large blocks and uses pointers to keep track of where
    it is, with some cleverness if a token spans a block boundary. In lex
    yytext is an array; in flex it's normally a pointer. -John]
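
Because flex's yytext normally points into the scanner's own buffer, code that wants to keep a token beyond the current rule is safer copying it out. The sketch below assumes the caller passes the token text and length (in a flex action these would be yytext and yyleng); the function name `save_token` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies tok_len bytes of the current token into freshly allocated,
 * NUL-terminated storage. The caller owns the result and must free() it.
 * Returns NULL if allocation fails. */
char *save_token(const char *tok_text, size_t tok_len)
{
    assert(tok_text != NULL);
    char *copy = malloc(tok_len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, tok_text, tok_len);  /* copy only the matched token */
    copy[tok_len] = '\0';
    return copy;
}
```

In a flex action this might be called as `save_token(yytext, yyleng)`. Flex also documents a `%array` directive that makes yytext an array, as in lex; whether that alone is enough for this scanner is not something the thread settles.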
