[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3 Operator Precedence

Another situation where shift/reduce conflicts appear is in arithmetic expressions. Here shifting is not always the preferred resolution; the Bison declarations for operator precedence allow you to specify when to shift and when to reduce.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3.1 When Precedence is Needed

Consider the following ambiguous grammar fragment (ambiguous because the input ‘1 - 2 * 3’ can be parsed in two different ways):

 
expr:
  expr '-' expr
| expr '*' expr
| expr '<' expr
| '(' expr ')'
…
;

Suppose the parser has seen the tokens ‘1’, ‘-’ and ‘2’; should it reduce them via the rule for the subtraction operator? It depends on the next token. Of course, if the next token is ‘)’, we must reduce; shifting is invalid because no single rule can reduce the token sequence ‘- 2 )’ or anything starting with that. But if the next token is ‘*’ or ‘<’, we have a choice: either shifting or reduction would allow the parse to complete, but with different results.

To decide which one Bison should do, we must consider the results. If the next operator token op is shifted, then it must be reduced first in order to permit another opportunity to reduce the difference. The result is (in effect) ‘1 - (2 op 3)’. On the other hand, if the subtraction is reduced before shifting op, the result is ‘(1 - 2) op 3’. Clearly, then, the choice of shift or reduce should depend on the relative precedence of the operators ‘-’ and op: ‘*’ should be shifted first, but not ‘<’.

What about input such as ‘1 - 2 - 5’; should this be ‘(1 - 2) - 5’ or should it be ‘1 - (2 - 5)’? For most operators we prefer the former, which is called left association. The latter alternative, right association, is desirable for assignment operators. The choice of left or right association is a matter of whether the parser chooses to shift or reduce when the stack contains ‘1 - 2’ and the lookahead token is ‘-’: shifting makes right-associativity.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3.2 Specifying Operator Precedence

Bison allows you to specify these choices with the operator precedence declarations %left and %right. Each such declaration contains a list of tokens, which are operators whose precedence and associativity is being declared. The %left declaration makes all those operators left-associative and the %right declaration makes them right-associative. A third alternative is %nonassoc, which declares that it is a syntax error to find the same operator twice “in a row”. The last alternative, %precedence, allows to define only precedence and no associativity at all. As a result, any associativity-related conflict that remains will be reported as an compile-time error. The directive %nonassoc creates run-time error: using the operator in a associative way is a syntax error. The directive %precedence creates compile-time errors: an operator can be involved in an associativity-related conflict, contrary to what expected the grammar author.

The relative precedence of different operators is controlled by the order in which they are declared. The first precedence/associativity declaration in the file declares the operators whose precedence is lowest, the next such declaration declares the operators whose precedence is a little higher, and so on.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3.3 Specifying Precedence Only

Since POSIX Yacc defines only %left, %right, and %nonassoc, which all defines precedence and associativity, little attention is paid to the fact that precedence cannot be defined without defining associativity. Yet, sometimes, when trying to solve a conflict, precedence suffices. In such a case, using %left, %right, or %nonassoc might hide future (associativity related) conflicts that would remain hidden.

The dangling else ambiguity (see section Shift/Reduce Conflicts) can be solved explicitly. This shift/reduce conflicts occurs in the following situation, where the period denotes the current parsing state:

 
if e1 then if  e2 then s1 . else s2

The conflict involves the reduction of the rule ‘IF expr THEN stmt’, which precedence is by default that of its last token (THEN), and the shifting of the token ELSE. The usual disambiguation (attach the else to the closest if), shifting must be preferred, i.e., the precedence of ELSE must be higher than that of THEN. But neither is expected to be involved in an associativity related conflict, which can be specified as follows.

 
%precedence THEN
%precedence ELSE

The unary-minus is another typical example where associativity is usually over-specified, see Infix Notation Calculator: calc. The %left directive is traditionally used to declare the precedence of NEG, which is more than needed since it also defines its associativity. While this is harmless in the traditional example, who knows how NEG might be used in future evolutions of the grammar…


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3.4 Precedence Examples

In our example, we would want the following declarations:

 
%left '<'
%left '-'
%left '*'

In a more complete example, which supports other operators as well, we would declare them in groups of equal precedence. For example, '+' is declared with '-':

 
%left '<' '>' '=' "!=" "<=" ">="
%left '+' '-'
%left '*' '/'

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3.5 How Precedence Works

The first effect of the precedence declarations is to assign precedence levels to the terminal symbols declared. The second effect is to assign precedence levels to certain rules: each rule gets its precedence from the last terminal symbol mentioned in the components. (You can also specify explicitly the precedence of a rule. See section Context-Dependent Precedence.)

Finally, the resolution of conflicts works by comparing the precedence of the rule being considered with that of the lookahead token. If the token’s precedence is higher, the choice is to shift. If the rule’s precedence is higher, the choice is to reduce. If they have equal precedence, the choice is made based on the associativity of that precedence level. The verbose output file made by ‘-v’ (see section Invoking Bison) says how each conflict was resolved.

Not all rules and not all tokens have precedence. If either the rule or the lookahead token has no precedence, then the default is to shift.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3.6 Using Precedence For Non Operators

Using properly precedence and associativity directives can help fixing shift/reduce conflicts that do not involve arithmetics-like operators. For instance, the “dangling else” problem (see section Shift/Reduce Conflicts) can be solved elegantly in two different ways.

In the present case, the conflict is between the token "else" willing to be shifted, and the rule ‘if_stmt: "if" expr "then" stmt’, asking for reduction. By default, the precedence of a rule is that of its last token, here "then", so the conflict will be solved appropriately by giving "else" a precedence higher than that of "then", for instance as follows:

 
%precedence "then"
%precedence "else"

Alternatively, you may give both tokens the same precedence, in which case associativity is used to solve the conflict. To preserve the shift action, use right associativity:

 
%right "then" "else"

Neither solution is perfect however. Since Bison does not provide, so far, “scoped” precedence, both force you to declare the precedence of these keywords with respect to the other operators your grammar. Therefore, instead of being warned about new conflicts you would be unaware of (e.g., a shift/reduce conflict due to ‘if test then 1 else 2 + 3’ being ambiguous: ‘if test then 1 else (2 + 3)’ or ‘(if test then 1 else 2) + 3’?), the conflict will be already “fixed”.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]

This document was generated by Rick Perry on December 29, 2013 using texi2html 1.82.