What is the LR(0) state machine for this grammar?
S' -> S $ S -> S S + S -> int
This material is summarized from Chapter 8 of lex & yacc (Levine).
Conflicts can occur because of grammatical ambiguities or just because not every CFG is LALR(1). Part of the reason for understanding the LALR parsing algorithm is so that you can read the output of a parser generator and understand how the comments came about. The function print-lalr-states in lalr.cl is useful for this: look through the states, and see where there are. For example,
CL-USER> (def-jyacc foo (S (S + S)) (S (int))) Warning, Not LALR(1)!!: STATE-3, S --> S + S CL-USER> (print-lalr-states) ; ... skip some states ... STATE-3: S --> S . + S, *EOF* + S --> S + S ., + *EOF* On + shift STATE-2 On + *EOF* reduce S + S --> S
This is a shift-reduce conflict: on +, we can't decide whether
to shift + onto the stack, or perform a reduction. The other type
of conflict is a reduce-reduce conflict. When shift-reduce or
reduce-reduce conflicts occur,
a disambiguation rule is invoked, like use the earlier production,
or prefer shift to reduce.
Ideally, you want not to have these
conflicts; even if you disambiguate, you run the danger of hiding some
other problem. But sometimes it's convenient to let the parser generator
handle them. In this case, we'd want extra declarations to tell the
parser generator what option to choose: these declarations usually specify
precedence and associativity.
Common examples of conflicts include:
E -> E + E E -> E * E E -> int
This will give you a few shift-reduce conflicts. They can be removed by specifying the associativity and precedence of the operators... or by writing an unambiguous grammar, as we've been doing.
stmt -> if id then id stmt -> if id then id else id
The problem: how do you decide you want the latter?
(if id then (if id then id) else id) (if id then (if id then id else id))Can assign precedence (choose the second production on conflict), or rewrite to be unambiguous.
S -> S T | T T -> T P | P P -> int
S -> id opt [ id ] opt -> [ keyword ] opt ->
Rewrite the grammar -- and if it gets too burdensome, punt and post-process the parse tree.
The BNF accepted by Johnson's YACC is limited. Wouldn't it be nice if we could write productions like these?
(class-head (class id [ extends id ]?)) (stmts ([ stmt ]*)) (E (E + Tm or E - Tm))How might we do this?