[Template fetch failed for http://spivey.oriel.ox.ac.uk/corner/Template:Sitenotice?action=render: HTTP 404]
How to use yacc to test for ambiguity: Difference between revisions
(Created page with "There is no algorithm that decides definitively whether a given context free grammar is ambiguous, but that does not prevent us from using yacc pragmatically. If we submit a...") |
No edit summary |
||
Line 9: | Line 9: | ||
| IF expr THEN inner ELSE stmt | | IF expr THEN inner ELSE stmt | ||
</pre> | </pre> | ||
In order to submit this to yacc, we must surround it with some apparatus declaring various tokens, and supply semantic actions, all of which can be written as the trivial unit expression @{ () }@. A complete input is shown towards the end of this note. Invoking @ocamlyacc@ with | In order to submit this to yacc, we must surround it with some apparatus declaring various tokens, and supply semantic actions, all of which can be written as the trivial unit expression @{()}@. A complete input is shown towards the end of this note. Invoking @ocamlyacc@ with | ||
$ ''ocamlyacc -v cati.mly'' | $ ''ocamlyacc -v cati.mly'' | ||
produces the error message | produces the error message |
Revision as of 18:31, 26 October 2021
There is no algorithm that decides definitively whether a given context free grammar is ambiguous, but that does not prevent us from using yacc pragmatically. If we submit a grammar (unannotated with precedence declarations) to yacc, and yacc accepts it, then we know the grammar is in the class LALR(1), and all such grammars are unambiguous. If yacc says No and reports parsing conflicts, then either the grammar is ambiguous, or it is unambiguous, but "too difficult" for yacc to handle. In the latter case, we can look at yacc's diagnostic output and use it to find a situation where a conflict is encountered: that can help us to find a string that can be derived in more than one way.
Consider, for example, the following interesting grammar provided by Matteo Cati. It purports to be an unambiguous grammar for the classic Algol--60 dangling else problem.
stmt : inner | IF expr THEN stmt inner : BASIC | IF expr THEN inner ELSE stmt
In order to submit this to yacc, we must surround it with some apparatus declaring various tokens, and supply semantic actions, all of which can be written as the trivial unit expression {()}
. A complete input is shown towards the end of this note. Invoking ocamlyacc
with
$ ocamlyacc -v cati.mly
produces the error message
1 shift/reduce conflict
and also (because of the -v
flag) a report file cati.output
that gives details of parsing automaton ocamlyacc
compiled, and where it discovered the conflict. In what follows, I've used (because why wouldn't I?) the output of my own yacc variant, but the output of ocamlyacc
is equivalent, even if the states are numbered differently. The whole of the report file is shown at the end of this note.
Examining the report, we see the conflict occurs in state 11:
State 11: shift/reduce conflict (shift 12, reduce 1) on ELSE State 11 inner --> IF expr THEN inner . ELSE stmt (4) stmt --> inner . (1) EOF reduce 1 ELSE shift 12
If the machine reaches state 11 and the lookahead is ELSE
, then the machine cannot decide between shifting the ELSE
(presumably in the hope of reducing with the production inner --> IF expr THEN inner ELSE stmt
, or reducing immediately with the production stmt --> inner
. In order to find out what's happening, we should find out how state 11 can be reached in the machine: it is bound to be reachable, because yacc creates states only when they are needed as the target of some transition. Some searching reveals the following transitions.
9 inner goto 11 8 THEN shift 9 4 expr goto 8
%token BASIC EXPR IF THEN ELSE %type<unit> stmt %start stmt %% stmt : inner {()} | IF expr THEN stmt {()} ; inner : BASIC {()} | IF expr THEN inner ELSE stmt {()} ; expr : EXPR {()} ;