• Looking for Unix lex for modern systems

    From Aharon Robbins@21:1/5 to All on Thu Jan 6 20:17:12 2022
    Can anyone point me at a version of Unix lex that will run on Linux?

    Thanks,

    Arnold
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com
    [I wouldn't hold my breath. Perhaps someone has a retrocomputing
    Vax or PDP-11 that can run an antique lex and then you can use the
    output. Or maybe it might be easier to dig into the ugly lex
    application and figure out what it's doing to the insides of
    the old lex scanner. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Aharon Robbins on Thu Jan 6 16:42:32 2022
    On Thursday, January 6, 2022 at 4:09:53 PM UTC-8, Aharon Robbins wrote:
    Can anyone point me at a version of Unix lex that will run on Linux?

    On my Linux system, /usr/bin/lex is a symbolic link to /usr/bin/flex

    On FreeBSD, they are both hard links to the same file.

    On OS X, they are two different files (cmp -l shows differences)
    of the same size.
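
    The link checks above can also be done programmatically. A minimal
    sketch, assuming a POSIX system; same_file() is a hypothetical helper
    name, not part of any tool mentioned in this thread. It compares the
    device and inode numbers that stat() reports, so it returns 1 for both
    the symbolic-link case (Linux) and the hard-link case (FreeBSD), and 0
    for two distinct files (as on OS X).

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths resolve to the same file,
 * 0 if they resolve to different files, and -1 if either path cannot be
 * examined. stat() follows symbolic links, so a symlink and its target
 * compare equal, as do two hard links to one file. */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

    For example, same_file("/usr/bin/lex", "/usr/bin/flex") would report
    whether the two names lead to one binary. Using lstat() instead would
    be needed to tell a symbolic link from a hard link.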

    A web search shows the Oracle lex man page for Solaris, which does not mention flex, and so might not be a link of any kind.

    I have hardware that can run SunOS and Solaris. (It should be easy to find hardware to run Solaris-x86 versions.)

    As for the actual copyrighted AT&T lex, that might be a little harder.
    [Flex can take the same input as lex but its internals are totally different.

    Bell Labs long ago released the code to early Unix systems. The source
    for lex is here: https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/lex or on
    the 4.2BSD src archive at https://www.tuhs.org/Archive/Distributions/UCB/4.2BSD/
    I tried to compile the 4.2BSD version on FreeBSD and the errors were
    ugly. -John]

  • From gah4@21:1/5 to Aharon Robbins on Fri Jan 7 02:39:57 2022
    On Thursday, January 6, 2022 at 4:09:53 PM UTC-8, Aharon Robbins wrote:
    Can anyone point me at a version of Unix lex that will run on Linux?

    A web search for lex source found this:

    http://heirloom.sourceforge.net/devtools.html

    which sounds like exactly what you want. It is supposed to compile on Linux, and seems to be derived from Solaris source, and has the CDDL license:

    http://www.opensolaris.org/os/licensing

    Otherwise, as noted previously, Solaris-x86 should run on easily found x86 systems.
    (Or in a virtual machine on such systems, if you don't have one available.)

  • From gah4@21:1/5 to our moderator on Fri Jan 7 15:36:44 2022
    (snip, our moderator wrote)

    [Flex can take the same input as lex but its internals are totally different.

    Bell Labs long ago released the code to early Unix systems. The source
    for lex is here: https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/lex or on
    the 4.2BSD src archive at https://www.tuhs.org/Archive/Distributions/UCB/4.2BSD/
    I tried to compile the 4.2BSD version on FreeBSD and the errors were
    ugly. -John]

    It seems that real lex knows about RATFOR, and I suspect that actual flex doesn't.
    Is that a good test for which source you have?

    In any case, with

    gcc -std=c89 -Dunix

    there are few actual errors (as opposed to warnings).

    The warnings come from conversions between incompatible pointer types,
    or between integers and pointers. I am not sure how well current
    systems handle the latter. (That seems to be usual for C from those years.)

    Fixing the actual errors, including removing the initialization
    of *errorf with stdout, and not declaring calloc, it compiles and
    (with the -t option) runs.

    It then stops with:

    (Error) output table overflow
    5/1000 nodes(%e), 10/2500 positions(%p), 3/500 (%n), 254 transitions
    , 2/1000 packed char classes(%k), 3/2000 packed transitions(%a), 0/0 output slots(%o)

    (I have the sample file from the Wikipedia page for input.)
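
    The errorf fix mentioned above can be sketched as follows. This is an
    assumption about what the compiler objected to, not a patch taken from
    the lex source: on glibc, stdout is an ordinary object rather than a
    constant expression, so it cannot be used to initialize a variable at
    file scope. The name init_streams() is hypothetical. The calloc error
    is presumably a clash between an old-style local declaration and the
    prototype in <stdlib.h>; including the header and deleting the local
    declaration is the usual remedy.

```c
#include <stdio.h>

/* Old style, which modern Linux compilers reject at file scope:
 *     FILE *errorf = stdout;
 * because stdout is not a constant expression there. */
static FILE *errorf;            /* starts out as a null pointer */

/* Hypothetical helper: do the assignment at run time instead.
 * Call it once at the start of main(). */
static void init_streams(void)
{
    errorf = stdout;
}
```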

    This reminds me: in the days of OS/2 1.0, I was compiling the GNU utilities,
    especially grep and diff, for OS/2. In many cases they mixed int and (char*),
    especially in function arguments. Replacing 0 with (char*)0 fixed those, but
    I also complained to the GNU people. The reply was, pretty much, that
    any system with sizeof(int) not equal to sizeof(char*) was broken, and it wasn't their problem to fix.
    [If the comments in the source code say "written by Eric Schmidt", it's lex, otherwise, it's flex. Yes, that Eric Schmidt. -John]
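
    The 0 versus (char*)0 problem from the OS/2 anecdote can be shown with
    a small sketch. It assumes a variadic function, where (as in C without
    prototypes) the compiler cannot convert the argument for you;
    total_length() is a hypothetical helper, not code from grep or diff. A
    bare 0 in the last position would be passed as an int, which is the
    wrong size wherever sizeof(int) differs from sizeof(char*).

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sums the lengths of its string arguments.
 * The argument list must end with a null pointer of type char *. */
static size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;
    const char *s;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}
```

    A correct call is total_length("grep", "diff", (char *)0). Writing the
    terminator as plain 0 may appear to work where int and char* have the
    same size, and fail elsewhere.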

  • From Aharon Robbins@21:1/5 to gah4@u.washington.edu on Sun Jan 9 19:04:34 2022
    In article <22-01-025@comp.compilers>, gah4 <gah4@u.washington.edu> wrote:
    On Thursday, January 6, 2022 at 4:09:53 PM UTC-8, Aharon Robbins wrote:
    Can anyone point me at a version of Unix lex that will run on Linux?

    A web search for lex source found this:

    http://heirloom.sourceforge.net/devtools.html

    which sounds like exactly what you want.

    I got this to build and run, but it ran out of buffer space. :-(

    I have since made good progress with flex. The original lexer
    was doing its own token buffering. I moved to using yytext, and
    also changed YY_INPUT to get one character of input at a time
    as lex used to do. These two together have allowed me to make
    real progress.

    Performance isn't an issue, so doing one character at a time is fine.
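
    A minimal sketch of a one-character-at-a-time YY_INPUT, modeled on the
    example in the flex manual; it is not the code actually used here. In a
    real scanner the macro would go in the %{ ... %} part of the
    definitions section, and yyin and YY_NULL would come from the generated
    code. They are declared below only so the sketch compiles on its own,
    and fill_buffer() is a hypothetical stand-in for the scanner's refill
    code.

```c
#include <stdio.h>

/* Stand-ins for what the flex-generated scanner normally provides. */
#ifndef YY_NULL
#define YY_NULL 0
#endif
static FILE *yyin;

/* Read at most one character per refill, ignoring max_size.
 * result is set to 1 on success or YY_NULL at end of input. */
#define YY_INPUT(buf, result, max_size) \
    do { \
        int c_ = getc(yyin); \
        (void)(max_size); \
        if (c_ == EOF) { \
            (result) = YY_NULL; \
        } else { \
            (buf)[0] = (char)c_; \
            (result) = 1; \
        } \
    } while (0)

/* Hypothetical stand-in for the place where the scanner refills
 * its buffer; returns the number of characters stored. */
static int fill_buffer(char *buf, int max_size)
{
    int n;

    YY_INPUT(buf, n, max_size);
    return n;
}
```

    Because the scanner never reads ahead by more than one character, any
    code that shares the input stream with it sees the stream in roughly
    the state the old lex would have left it.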

    Thanks everyone for the help.
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com

  • From gah4@21:1/5 to Aharon Robbins on Sun Jan 9 17:03:02 2022
    On Sunday, January 9, 2022 at 1:45:11 PM UTC-8, Aharon Robbins wrote:
    In article <22-0...@comp.compilers>, gah4 <ga...@u.washington.edu> wrote:
    On Thursday, January 6, 2022 at 4:09:53 PM UTC-8, Aharon Robbins wrote:
    Can anyone point me at a version of Unix lex that will run on Linux?

    A web search for lex source found this:

    http://heirloom.sourceforge.net/devtools.html

    which sounds like exactly what you want.
    I got this to build and run, but it ran out of buffer space. :-(

    I compiled what I believe is actual lex on Linux. There were two compile
    time errors to fix, and a bunch of warnings that I didn't fix.

    The warnings are related to pointer conversions, so I hope it
    does it right.

    I then ran it with the sample program in the Wikipedia lex article,
    and it ran out of buffer space. It isn't very big, either.

    But then I ran it with the sample from the Solaris lex man page,
    and it works. It even works with -r to generate ratfor output.
    (As far as I know, flex doesn't have the -r option.)

    In any case, I don't understand the buffer space message.
    [AT&T lex was a student summer project and it has a bunch of fixed
    size buffers. -John]

  • From gah4@21:1/5 to All on Wed Jan 12 14:45:57 2022
    On Sunday, January 9, 2022 at 5:46:03 PM UTC-8, gah4 wrote:

    (snip on running actual lex)

    I then ran it with the sample program in the Wikipedia lex article,
    and it ran out of buffer space. It isn't very big, either.

    (snip)

    In any case, I don't understand the buffer space message.
    [AT&T lex was a student summer project and it has a bunch of fixed
    size buffers. -John]

    OK, the sample in the Wikipedia lex article has these lines:

    /* This tells flex to read only one input file */
    %option noyywrap

    It turns out that if you give that line to lex, it sets the size of the
    output buffer to zero. (I got suspicious when the comment
    mentioned flex, but had already found the output buffer
    size was zero.)

    Since I have the O'Reilly "Lex & Yacc" book, I could look up
    lex options. It seems that

    %o (number)

    sets the output buffer size in lex, and sets it to zero if there is no number.

    The rest of the syntax might be the same between lex and flex,
    but the option syntax is not! (Hint to those working with old files.)
    [It's on page 159, %e %p %n %k %a %o. That last flag is the number
    of "output slots" whatever they were. -John]
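
    How "%option noyywrap" can end up meaning "%o with size zero" can be
    illustrated with a small sketch. This is a hypothetical reconstruction,
    not the lex source: it assumes the directive is recognized by its first
    letter only and that the number is whatever digits follow on the line.
    The function name output_slots_from_directive() is made up for the
    example.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: returns the size requested by a "%o" directive,
 * or -1 if the line is not a "%o" directive. Only the character after
 * '%' is examined, so "%option ..." is treated as "%o". If no digits
 * follow, the result is 0. */
static long output_slots_from_directive(const char *line)
{
    const char *p;

    if (line[0] != '%' || (line[1] != 'o' && line[1] != 'O'))
        return -1;
    p = line + 2;
    while (*p != '\0' && !isdigit((unsigned char)*p))
        p++;                    /* skip everything up to the first digit */
    return strtol(p, NULL, 10); /* an empty remainder converts to 0 */
}
```

    Under these assumptions "%o 3000" yields 3000, while
    "%option noyywrap" yields 0, which matches the 0/0 output slots in the
    overflow message earlier in the thread.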
