Bison 3.0.2: 3.1 Outline of a Bison Grammar

3.1 Outline of a Bison Grammar

A Bison grammar file has four main sections, shown here with the appropriate delimiters:

%{
  Prologue
%}

Bison declarations

%%
Grammar rules
%%

Epilogue

Comments enclosed in ‘/* … */’ may appear in any of the sections. As a GNU extension, ‘//’ introduces a comment that continues until end of line.

3.1.1 The prologue

The Prologue section contains macro definitions and declarations of functions and variables that are used in the actions in the grammar rules. These are copied to the beginning of the parser implementation file so that they precede the definition of yyparse. You can use ‘#include’ to get the declarations from a header file. If you don’t need any C declarations, you may omit the ‘%{’ and ‘%}’ delimiters that bracket this section.

The Prologue section is terminated by the first occurrence of ‘%}’ that is outside a comment, a string literal, or a character constant.
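The termination rule above can be illustrated with a small scanner. This is a minimal sketch in C, not Bison's actual scanner: the function name find_prologue_end is hypothetical, and the input is assumed to begin just after the opening ‘%{’. It skips block comments, ‘//’ comments, string literals, and character constants (honoring backslash escapes) before accepting a ‘%}’.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first "%}" in `s` that
   lies outside a comment, string literal, or character constant, or -1
   if there is none.  `s` is assumed to start just after the opening "%{". */
long find_prologue_end (const char *s)
{
  size_t i = 0;
  assert (s != NULL);
  while (s[i] != '\0')
    {
      if (s[i] == '/' && s[i + 1] == '*')
        {
          /* Block comment: skip to the closing star-slash. */
          i += 2;
          while (s[i] != '\0' && !(s[i] == '*' && s[i + 1] == '/'))
            i++;
          if (s[i] != '\0')
            i += 2;
        }
      else if (s[i] == '/' && s[i + 1] == '/')
        {
          /* Line comment: skip to the end of the line. */
          while (s[i] != '\0' && s[i] != '\n')
            i++;
        }
      else if (s[i] == '"' || s[i] == '\'')
        {
          /* String literal or character constant: skip to the matching
             quote, stepping over backslash escapes. */
          char quote = s[i++];
          while (s[i] != '\0' && s[i] != quote)
            {
              if (s[i] == '\\' && s[i + 1] != '\0')
                i++;
              i++;
            }
          if (s[i] != '\0')
            i++;
        }
      else if (s[i] == '%' && s[i + 1] == '}')
        return (long) i;
      else
        i++;
    }
  return -1;
}
```

For instance, a ‘%}’ that appears inside a string literal in the Prologue is passed over, and only a later ‘%}’ in ordinary code ends the section.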

You may have more than one Prologue section, intermixed with the Bison declarations. This allows you to have C and Bison declarations that refer to each other. For example, the %union declaration may use types defined in a header file, and you may wish to prototype functions that take arguments of type YYSTYPE. This can be done with two Prologue blocks, one before and one after the %union declaration.

%{
  #define _GNU_SOURCE
  #include <stdio.h>
  #include "ptypes.h"
%}

%union {
  long int n;
  tree t;  /* tree is defined in ‘ptypes.h’. */
}

%{
  static void print_token_value (FILE *, int, YYSTYPE);
  #define YYPRINT(F, N, L) print_token_value (F, N, L)
%}

…

When in doubt, it is usually safer to put prologue code before all Bison declarations, rather than after. For example, any definitions of feature test macros like _GNU_SOURCE or _POSIX_C_SOURCE should appear before all Bison declarations, as feature test macros can affect the behavior of Bison-generated #include directives.

3.1.2 Prologue Alternatives

The functionality of Prologue sections can often be subtle and inflexible. As an alternative, Bison provides a %code directive with an explicit qualifier field, which identifies the purpose of the code and thus the location(s) where Bison should generate it. For C/C++, the qualifier can be omitted for the default location, or it can be one of requires, provides, or top. See section %code Summary.

Look again at the example of the previous section:

%{
  #define _GNU_SOURCE
  #include <stdio.h>
  #include "ptypes.h"
%}

%union {
  long int n;
  tree t;  /* tree is defined in ‘ptypes.h’. */
}

%{
  static void print_token_value (FILE *, int, YYSTYPE);
  #define YYPRINT(F, N, L) print_token_value (F, N, L)
%}

…

Notice that there are two Prologue sections here, but there’s a subtle distinction between their functionality. For example, if you decide to override Bison’s default definition for YYLTYPE, in which Prologue section should you write your new definition? You should write it in the first since Bison will insert that code into the parser implementation file before the default YYLTYPE definition. In which Prologue section should you prototype an internal function, trace_token, that accepts YYLTYPE and yytokentype as arguments? You should prototype it in the second since Bison will insert that code after the YYLTYPE and yytokentype definitions.

This distinction in functionality between the two Prologue sections is established by the appearance of the %union between them. This behavior raises a few questions. First, why should the position of a %union affect definitions related to YYLTYPE and yytokentype? Second, what if there is no %union? In that case, the second kind of Prologue section is not available. This behavior is not intuitive.

To avoid this subtle %union dependency, rewrite the example using a %code top and an unqualified %code. Let’s go ahead and add the new YYLTYPE definition and the trace_token prototype at the same time:

%code top {
  #define _GNU_SOURCE
  #include <stdio.h>

  /* WARNING: The following code really belongs
   * in a '%code requires'; see below.  */

  #include "ptypes.h"
  #define YYLTYPE YYLTYPE
  typedef struct YYLTYPE
  {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
    char *filename;
  } YYLTYPE;
}

%union {
  long int n;
  tree t;  /* tree is defined in ‘ptypes.h’. */
}

%code {
  static void print_token_value (FILE *, int, YYSTYPE);
  #define YYPRINT(F, N, L) print_token_value (F, N, L)
  static void trace_token (enum yytokentype token, YYLTYPE loc);
}

…

In this way, %code top and the unqualified %code achieve the same functionality as the two kinds of Prologue sections, but it’s always explicit which kind you intend. Moreover, both kinds are always available even in the absence of %union.

The %code top block above logically contains two parts. The first two lines before the warning need to appear near the top of the parser implementation file. The first line after the warning is required by YYSTYPE and thus also needs to appear in the parser implementation file. However, if you’ve instructed Bison to generate a parser header file (see section %defines), you probably want that line to appear before the YYSTYPE definition in that header file as well. The YYLTYPE definition should also appear in the parser header file to override the default YYLTYPE definition there.

In other words, in the %code top block above, all but the first two lines are dependency code required by the YYSTYPE and YYLTYPE definitions. Thus, they belong in one or more %code requires:

%code top {
  #define _GNU_SOURCE
  #include <stdio.h>
}

%code requires {
  #include "ptypes.h"
}

%union {
  long int n;
  tree t;  /* tree is defined in ‘ptypes.h’. */
}

%code requires {
  #define YYLTYPE YYLTYPE
  typedef struct YYLTYPE
  {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
    char *filename;
  } YYLTYPE;
}

%code {
  static void print_token_value (FILE *, int, YYSTYPE);
  #define YYPRINT(F, N, L) print_token_value (F, N, L)
  static void trace_token (enum yytokentype token, YYLTYPE loc);
}

…

Now Bison will insert #include "ptypes.h" and the new YYLTYPE definition before the Bison-generated YYSTYPE and YYLTYPE definitions in both the parser implementation file and the parser header file. (By the same reasoning, %code requires would also be the appropriate place to write your own definition for YYSTYPE.)
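To see why the ‘#define YYLTYPE YYLTYPE’ line matters, consider the following sketch in plain C. The guarded block is a simplified stand-in for the default location type that generated code would otherwise supply; its exact shape is an assumption here, not copied from Bison output. make_location is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* User's override, as it would appear in a %code requires block. */
#define YYLTYPE YYLTYPE
typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
  char *filename;
} YYLTYPE;

/* Simplified stand-in for the default definition in generated code
   (assumed shape).  Because the macro YYLTYPE is already defined, the
   preprocessor skips this block, so the two typedefs never clash. */
#if !defined YYLTYPE && !defined YYLTYPE_IS_DECLARED
typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;
# define YYLTYPE_IS_DECLARED 1
#endif

/* Hypothetical helper: builds a one-point location.  It compiles only
   if the user's struct, with its extra filename member, is in effect. */
YYLTYPE make_location (char *filename, int line, int column)
{
  YYLTYPE loc;
  assert (filename != NULL);
  loc.first_line = loc.last_line = line;
  loc.first_column = loc.last_column = column;
  loc.filename = filename;
  return loc;
}
```

The ordering is the point: the override has to be seen before the guarded default, which is what placing it in %code requires arranges.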

When you are writing dependency code for YYSTYPE and YYLTYPE, you should prefer %code requires over %code top regardless of whether you instruct Bison to generate a parser header file. When you are writing code that you need Bison to insert only into the parser implementation file and that has no special need to appear at the top of that file, you should prefer the unqualified %code over %code top. These practices will make the purpose of each block of your code explicit to Bison and to other developers reading your grammar file. Following these practices, we expect the unqualified %code and %code requires to be the most important of the four Prologue alternatives.

At some point while developing your parser, you might decide to provide trace_token to modules that are external to your parser. Thus, you might wish for Bison to insert the prototype into both the parser header file and the parser implementation file. Since this function is not a dependency required by YYSTYPE or YYLTYPE, it doesn’t make sense to move its prototype to a %code requires. More importantly, since it depends upon YYLTYPE and yytokentype, %code requires is not sufficient. Instead, move its prototype from the unqualified %code to a %code provides:

%code top {
  #define _GNU_SOURCE
  #include <stdio.h>
}

%code requires {
  #include "ptypes.h"
}

%union {
  long int n;
  tree t;  /* tree is defined in ‘ptypes.h’. */
}

%code requires {
  #define YYLTYPE YYLTYPE
  typedef struct YYLTYPE
  {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
    char *filename;
  } YYLTYPE;
}

%code provides {
  void trace_token (enum yytokentype token, YYLTYPE loc);
}

%code {
  static void print_token_value (FILE *, int, YYSTYPE);
  #define YYPRINT(F, N, L) print_token_value (F, N, L)
}

…

Bison will insert the trace_token prototype into both the parser header file and the parser implementation file after the definitions for yytokentype, YYLTYPE, and YYSTYPE.

The above examples are careful to write directives in an order that reflects the layout of the generated parser implementation and header files: %code top, %code requires, %code provides, and then %code. While your grammar files may generally be easier to read if you also follow this order, Bison does not require it. Instead, Bison lets you choose an organization that makes sense to you.

You may declare any of these directives multiple times in the grammar file. In that case, Bison concatenates the contained code in declaration order. This is the only way in which the position of one of these directives within the grammar file affects its functionality.
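The concatenation rule can be pictured with a toy model in C. This is only a sketch of the behavior described above, not Bison's implementation: it models the parser implementation file alone, omits the Bison-generated definitions that sit between the groups, and every name in it (enum qualifier, struct code_block, layout_blocks) is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Qualifiers, in the order their code is laid out in the
   implementation file. */
enum qualifier { Q_TOP, Q_REQUIRES, Q_PROVIDES, Q_UNQUALIFIED, Q_COUNT };

struct code_block
{
  enum qualifier qualifier;
  const char *code;
};

/* Toy model: writes to `out` the code of every block, grouped by
   qualifier in layout order; within one qualifier, grammar-file
   (declaration) order is preserved.  Returns 0 on success, or -1 if
   `out` (of `size` bytes) is too small. */
int layout_blocks (const struct code_block *blocks, size_t n,
                   char *out, size_t size)
{
  size_t used = 0;
  assert (blocks != NULL && out != NULL && size > 0);
  out[0] = '\0';
  for (int q = Q_TOP; q < Q_COUNT; q++)
    for (size_t i = 0; i < n; i++)
      if (blocks[i].qualifier == (enum qualifier) q)
        {
          size_t len = strlen (blocks[i].code);
          if (used + len + 1 > size)
            return -1;
          /* Copy the code together with its terminating null byte. */
          memcpy (out + used, blocks[i].code, len + 1);
          used += len;
        }
  return 0;
}
```

In this model, moving a block relative to blocks with a different qualifier leaves the output unchanged; only its position among blocks with the same qualifier matters.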

The result of the previous two properties is greater flexibility in how you may organize your grammar file. For example, you may organize semantic-type-related directives by semantic type:

%code requires { #include "type1.h" }
%union { type1 field1; }
%destructor { type1_free ($$); } <field1>
%printer { type1_print (yyoutput, $$); } <field1>

%code requires { #include "type2.h" }
%union { type2 field2; }
%destructor { type2_free ($$); } <field2>
%printer { type2_print (yyoutput, $$); } <field2>

You could even place each of the above directive groups in the rules section of the grammar file next to the set of rules that uses the associated semantic type. (In the rules section, you must terminate each of those directives with a semicolon.) And you don’t have to worry that some directive (like a %union) in the Bison declarations section is going to adversely affect their functionality in some counter-intuitive manner just because it comes first. Such an organization is not possible using Prologue sections.

This section has been concerned with explaining the advantages of the four Prologue alternatives over the original Yacc Prologue. However, in most cases when using these directives, you shouldn’t need to think about all the low-level ordering issues discussed here. Instead, you should simply use these directives to label each block of your code according to its purpose and let Bison handle the ordering. %code is the most generic label. Move code to %code requires, %code provides, or %code top as needed.

3.1.3 The Bison Declarations Section

The Bison declarations section contains declarations that define terminal and nonterminal symbols, specify precedence, and so on. In some simple grammars you may not need any declarations. See section Bison Declarations.

3.1.4 The Grammar Rules Section

The grammar rules section contains one or more Bison grammar rules, and nothing else. See section Syntax of Grammar Rules.

There must always be at least one grammar rule, and the first ‘%%’ (which precedes the grammar rules) may never be omitted even if it is the first thing in the file.

3.1.5 The epilogue

The Epilogue is copied verbatim to the end of the parser implementation file, just as the Prologue is copied to the beginning. This is the most convenient place to put anything that you want to have in the parser implementation file but which need not come before the definition of yyparse. For example, the definitions of yylex and yyerror often go here. Because C requires functions to be declared before being used, you often need to declare functions like yylex and yyerror in the Prologue, even if you define them in the Epilogue. See section Parser C-Language Interface.
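The declare-first, define-later arrangement looks like this in plain C. This is a sketch of the file layout only: parse_stub is a hypothetical stand-in for the generated parser (it is not Bison output and merely accepts a run of ‘a’ characters), and input and error_count are illustrative variables.

```c
#include <assert.h>
#include <stdio.h>

/* What the Prologue would contribute: declarations only. */
int yylex (void);
void yyerror (char const *msg);

/* Hypothetical stand-in for the generated parser.  It calls yylex and
   yyerror before their definitions appear, which is legal only because
   of the declarations above.  Returns 0 on success and 1 on error. */
int parse_stub (void)
{
  int token;
  while ((token = yylex ()) != 0)   /* 0 signals end of input */
    if (token != 'a')
      {
        yyerror ("syntax error");
        return 1;
      }
  return 0;
}

/* What the Epilogue would contribute: the definitions. */
static const char *input = "";      /* illustrative input buffer */
static int error_count = 0;

int yylex (void)
{
  assert (input != NULL);
  return *input ? *input++ : 0;
}

void yyerror (char const *msg)
{
  error_count++;
  fprintf (stderr, "%s\n", msg);
}
```

Without the two declarations at the top, the calls inside parse_stub would refer to undeclared functions, which is the situation the Prologue declarations prevent.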

If the Epilogue is empty, you may omit the second ‘%%’ that separates it from the grammar rules.

The Bison parser itself contains many macros and identifiers whose names start with ‘yy’ or ‘YY’, so it is a good idea to avoid using any such names (except those documented in this manual) in the epilogue of the grammar file.

This document was generated by Rick Perry on December 29, 2013 using texi2html 1.82.

3.1.1 The prologue		Syntax and usage of the prologue.
3.1.2 Prologue Alternatives		Syntax and usage of alternatives to the prologue.
3.1.3 The Bison Declarations Section		Syntax and usage of the Bison declarations section.
3.1.4 The Grammar Rules Section		Syntax and usage of the grammar rules section.
3.1.5 The epilogue		Syntax and usage of the epilogue.