I am after a good explanation of Recursive Ascent Parsing, as I wish to implement a parser generator that generates recursive ascent C/C++ code from an LR(1) grammar.
Regards,
Aaron
[The Wikipedia article isn't bad and has several promising references,
but I wonder if it's worth the effort. RA parsing does what yacc or
bison does, but by turning each state into a function. It's supposed
to be faster than a table driven parser, but LALR parsers are already
so fast, how much faster will it be? Maybe it'll even be slower since
there is a lot more code so it's less likely to fit in a CPU cache.
-John]
On 2022-09-29, Aaron Gray <> wrote:
John, I suspect the idea is that this is suited to languages that
compile well and have good tail call support, so that
the recursion is stackless.
A table-driven parser is an interpreter for a table. Tables can
be compiled to a goto graph, and a goto graph can be represented
by tail recursion which compiles to goto. This can be faster than the original table-interpreting loop.
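To make the state-per-function shape concrete, here is a minimal sketch (all names invented, and the toy grammar E -> E '+' n | n with single-digit n) using one common recursive-ascent convention: each LR(0) state is a function, a reduction of length k returns k-1 so that the right number of frames unwind, and the state left on top performs the goto.

```c
#include <assert.h>

/* Toy scanner for E -> E '+' n | n, where n is a single digit. */
static const char *input;
static char lookahead;            /* 'n', '+', or 0 at end of input */
static int  lookahead_val;

static void next_token(void) {
    char c = *input;
    if (c >= '0' && c <= '9') { lookahead = 'n'; lookahead_val = c - '0'; }
    else lookahead = c;
    if (c) input++;
}

static int accepted;              /* set when the start symbol is recognized */
static int result;                /* semantic value of the most recent E */

static int state1(int e);
static int state3(int e);

/* state2: E -> n .   Reduce: this frame pops, 0 further pops;
   the caller performs the goto on E. */
static int state2(int v) { result = v; return 0; }

/* state4: E -> E '+' n .   Reduce: pop 3 states, i.e. this frame
   plus 2 more. */
static int state4(int v) { result = v; return 2; }

/* state3: E -> E '+' . n */
static int state3(int e) {
    int v;
    assert(lookahead == 'n' && "syntax error");
    v = lookahead_val;
    next_token();
    return state4(e + v) - 1;     /* shift n; pass one pop upward */
}

/* state1: S' -> E .  |  E -> E . '+' n */
static int state1(int e) {
    if (lookahead == 0) {         /* end of input: accept */
        accepted = 1;
        result = e;
        return 1000000;           /* unwind all remaining frames */
    }
    assert(lookahead == '+' && "syntax error");
    next_token();
    return state3(e) - 1;         /* shift '+' */
}

/* state0: S' -> . E   (the only state with a goto on E) */
static int state0(void) {
    int pops, v;
    assert(lookahead == 'n' && "syntax error");
    v = lookahead_val;
    next_token();
    pops = state2(v);             /* shift n */
    while (pops == 0)             /* a reduction landed here: goto on E */
        pops = state1(result);
    assert(accepted);
    return result;
}

static int parse(const char *s) {
    input = s;
    accepted = 0;
    next_token();
    return state0();
}
```

With tail call optimization, the calls from shift actions and the goto loop compile to jumps, giving the goto graph described above. For example, parse("1+2+3") yields 6.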
Those improvements will succumb to Amdahl's law, though; you would
best be able to demonstrate the improvement in a program which does
nothing but parse a canned token stream: no lexing is going on and the reductions do nothing (basically the syntax is being validated and
that's all).
It would be useful in applications like, say, validating some syntax that is arriving as real-time input.
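A harness of the shape described above might look like this sketch (all names invented): lex once into a canned token array, then time parse-only passes over it, with a trivial stand-in validator whose reductions do nothing. Any of the parsers under discussion could be dropped in where the validator sits.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

enum { TOK_N, TOK_PLUS, TOK_EOF };

/* Stand-in parser: validates n ('+' n)* over a canned token stream;
   reductions do nothing, so only recognition is being timed. */
static int validate(const int *toks) {
    int i = 0;
    if (toks[i++] != TOK_N) return 0;
    while (toks[i] == TOK_PLUS) {
        i++;
        if (toks[i++] != TOK_N) return 0;
    }
    return toks[i] == TOK_EOF;
}

/* Lex once, then time many parse-only passes. */
static int bench(void) {
    enum { TERMS = 1000, PASSES = 10000 };
    int toks[2 * TERMS];          /* 1000 n, 999 '+', 1 EOF */
    int i, n = 0, ok = 1;
    clock_t t0, t1;

    for (i = 0; i < TERMS; i++) { /* canned stream: 1+1+...+1 */
        if (i) toks[n++] = TOK_PLUS;
        toks[n++] = TOK_N;
    }
    toks[n] = TOK_EOF;

    t0 = clock();
    for (i = 0; i < PASSES; i++)
        ok &= validate(toks);
    t1 = clock();
    printf("%d passes: %.3f s\n", PASSES,
           (double)(t1 - t0) / CLOCKS_PER_SEC);
    return ok;
}
```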
I'd ping the GNU Bison mailing list to see if they are interested
in the technique. It's doable in C; C compilers have good tail
call support. And in any case, the technique doesn't strictly
require functions: a giant yyparse() can be generated which just
contains a self-contained goto graph.
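A sketch of what that self-contained goto graph could look like, again for the toy grammar E -> E '+' n | n with single-digit n (this is not Bison's actual output; all names are invented):

```c
#include <assert.h>

/* One function, one goto graph: each LR(0) state is a block; shifts
   fall through or jump, and reductions jump to the goto dispatch. */
static int yyparse(const char *p) {
    int stack[64], vals[64];      /* state stack and parallel value stack */
    int sp = 0, val, tv = 0;
    char tok;

#define NEXT() do { char c = *p; \
        if (c >= '0' && c <= '9') { tok = 'n'; tv = c - '0'; } \
        else tok = c; \
        if (c) p++; } while (0)

    NEXT();
    stack[0] = 0;

    /* state0: E -> . E '+' n  |  E -> . n */
    assert(tok == 'n' && "syntax error");
    ++sp; stack[sp] = 2; vals[sp] = tv; NEXT();    /* shift n */

    /* state2: E -> n .   reduce, pop 1 */
    val = vals[sp]; sp -= 1;
    goto goto_E;

state1:                            /* S' -> E .  |  E -> E . '+' n */
    if (tok == 0) return vals[sp];                 /* accept */
    assert(tok == '+' && "syntax error");
    ++sp; stack[sp] = 3; vals[sp] = 0; NEXT();     /* shift '+' */

    /* state3: E -> E '+' . n */
    assert(tok == 'n' && "syntax error");
    ++sp; stack[sp] = 4; vals[sp] = tv; NEXT();    /* shift n */

    /* state4: E -> E '+' n .   reduce, pop 3 */
    val = vals[sp - 2] + vals[sp]; sp -= 3;
    /* fall through to the goto dispatch */

goto_E:                            /* goto on nonterminal E */
    switch (stack[sp]) {
    case 0: ++sp; stack[sp] = 1; vals[sp] = val; goto state1;
    default: assert(0 && "bad goto"); return -1;
    }
#undef NEXT
}
```

The explicit state stack is still needed so the goto dispatch knows where a reduction lands, but control flow is pure jumps, with no table-interpreting loop. For example, yyparse("1+2+3") returns 6.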
Kaz, this sounds potentially very efficient, thank you. I will have
a serious play and see if I can get this to work!
[Maybe it'll even be slower since
there is a lot more code so it's less likely to fit in a CPU cache.
-John]
My experience is: Don't worry about I-cache misses until you have
them. I have seen cases where a lot of code was produced, but it had
enough temporal and spatial locality that I-cache misses were no
problem, as well as a case where we applied a technique that reduced
the code size (with the intent of reducing I-cache misses), but as a
result we saw a slowdown from increased I-cache misses, because the
resulting code had worse spatial locality.
Looking at the numbers in Thomas J. Pennello (1986), "Very fast LR
parsing", one need not worry much about the I-cache: he describes an experiment with a grammar with 126 productions, resulting in 192
states, 305 terminal transitions and 226 nonterminal transitions in
the LALR(1) parser; the resulting parse tables for the table-driven
parser occupy 2022 bytes, while the recursive-ascent parser takes 5602
bytes (including 397 bytes in common with the table-driven parser).
I-cache sizes these days are typically 32KB, so this parser would fit completely within the I-cache (even if we assume a factor of two from
going from 80286 code to AMD64 code), so one does not even need to
think about locality. Pennello also reports that a 158-production
grammar for standard Pascal with a few minor extensions yielded a
5326-line assembly program (which MASM 3.0 failed to assemble); this
should also easily fit into 32KB.
Of course, modern machines tend to have larger caches than the
old ones. But modern machines are also throughput oriented
and may suffer more from latency.