2.5 Multi-Function Calculator: mfcalc
Now that the basics of Bison have been discussed, it is time to move on to
a more advanced problem. The above calculators provided only five
functions, ‘+’, ‘-’, ‘*’, ‘/’ and ‘^’. It would be nice to have a
calculator that provides other mathematical functions such as sin, cos, etc.
It is easy to add new operators to the infix calculator as long as they are
only single-character literals. The lexical analyzer yylex passes back all
nonnumeric characters as tokens, so new grammar rules suffice for adding a
new operator. But we want something more flexible: built-in functions whose
syntax has this form:

    function_name (argument)
At the same time, we will add memory to the calculator, by allowing you to create named variables, store values in them, and use them later. Here is a sample session with the multi-function calculator:
    $ mfcalc
    pi = 3.141592653589
    ⇒ 3.1415926536
    sin(pi)
    ⇒ 0.0000000000
    alpha = beta1 = 2.3
    ⇒ 2.3000000000
    alpha
    ⇒ 2.3000000000
    ln(alpha)
    ⇒ 0.8329091229
    exp(ln(beta1))
    ⇒ 2.3000000000
    $
Note that multiple assignment and nested function calls are permitted.
2.5.1 Declarations for mfcalc: Bison declarations for multi-function calculator.
2.5.2 Grammar Rules for mfcalc: Grammar rules for the calculator.
2.5.3 The mfcalc Symbol Table: Symbol table management subroutines.
2.5.4 The mfcalc Lexer: The lexical analyzer.
2.5.5 The mfcalc Main: The controlling function.
2.5.1 Declarations for mfcalc
Here are the C and Bison declarations for the multi-function calculator.
    %{
      #include <stdio.h>  /* For printf, etc. */
      #include <math.h>   /* For pow, used in the grammar.  */
      #include "calc.h"   /* Contains definition of 'symrec'.  */
      int yylex (void);
      void yyerror (char const *);
    %}

    %define api.value.type union /* Generate YYSTYPE from these types:  */
    %token <double>  NUM         /* Simple double precision number.  */
    %token <symrec*> VAR FNCT    /* Symbol table pointer: variable and function.  */
    %type  <double>  exp

    %precedence '='
    %left '-' '+'
    %left '*' '/'
    %precedence NEG /* negation--unary minus */
    %right '^'      /* exponentiation */
The above grammar introduces only two new features of the Bison language. These features allow semantic values to have various data types (see section More Than One Value Type).
The special union value assigned to the %define variable api.value.type
specifies that the symbols are defined with their data types. Bison will
generate an appropriate definition of YYSTYPE to store these values.
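As a rough illustration, the value type derived from these declarations behaves much like the hand-written union below. This is only a sketch: the real definition is emitted by Bison into the generated parser, and the names `demo_yystype` and `demo_store_num`, as well as the reduced `symrec` stand-in, are assumptions made so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for the 'symrec' type from calc.h (assumption).  */
typedef struct symrec { double var; } symrec;

/* Hand-written approximation of the union that results from
   '%define api.value.type union': one member per typed symbol,
   named after that symbol.  */
union demo_yystype
{
  double NUM;    /* %token <double>  NUM  */
  symrec *VAR;   /* %token <symrec*> VAR  */
  symrec *FNCT;  /* %token <symrec*> FNCT */
  double exp;    /* %type  <double>  exp  */
};

/* Store a number the way a lexer would for a NUM token, and read it
   back through the same member.  */
double
demo_store_num (union demo_yystype *val, double d)
{
  assert (val != NULL);
  val->NUM = d;
  return val->NUM;
}
```

Only one member is meaningful at a time; which one depends on the kind of symbol the value belongs to.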
Since values can now have various types, it is necessary to associate a type
with each grammar symbol whose semantic value is used. These symbols are
NUM, VAR, FNCT, and exp. Their declarations are augmented with their data
type (placed between angle brackets). For instance, values of NUM are stored
in double.
The Bison construct %type is used for declaring nonterminal symbols, just as
%token is used for declaring token types. We did not use %type before
because nonterminal symbols are normally declared implicitly by the rules
that define them. But exp must be declared explicitly so we can specify its
value type. See section Nonterminal Symbols.
2.5.2 Grammar Rules for mfcalc
Here are the grammar rules for the multi-function calculator. Most of them
are copied directly from calc; three rules, those which mention VAR or FNCT,
are new.
    %% /* The grammar follows.  */
    input:
      %empty
    | input line
    ;

    line:
      '\n'
    | exp '\n'   { printf ("%.10g\n", $1); }
    | error '\n' { yyerrok;                }
    ;

    exp:
      NUM                { $$ = $1;                         }
    | VAR                { $$ = $1->value.var;              }
    | VAR '=' exp        { $$ = $3; $1->value.var = $3;     }
    | FNCT '(' exp ')'   { $$ = (*($1->value.fnctptr))($3); }
    | exp '+' exp        { $$ = $1 + $3;                    }
    | exp '-' exp        { $$ = $1 - $3;                    }
    | exp '*' exp        { $$ = $1 * $3;                    }
    | exp '/' exp        { $$ = $1 / $3;                    }
    | '-' exp  %prec NEG { $$ = -$2;                        }
    | exp '^' exp        { $$ = pow ($1, $3);               }
    | '(' exp ')'        { $$ = $2;                         }
    ;
    /* End of grammar.  */
    %%
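The actions of the two most interesting new rules can be tried outside the parser. The sketch below mirrors them in plain C; `demo_symrec`, `demo_twice`, `demo_apply` and `demo_assign` are hypothetical names introduced for illustration, and the record is reduced to its value union.

```c
#include <assert.h>
#include <stddef.h>

/* Function type, as in calc.h.  */
typedef double (*func_t) (double);

/* Reduced copy of 'symrec' keeping only the value union (assumption:
   the name, type and link fields are omitted for brevity).  */
typedef struct demo_symrec
{
  union
  {
    double var;
    func_t fnctptr;
  } value;
} demo_symrec;

/* Hypothetical one-argument function standing in for sin, cos, etc.  */
double
demo_twice (double x)
{
  return 2 * x;
}

/* Mirrors the action of "FNCT '(' exp ')'": call the stored function
   on the value of the argument expression.  */
double
demo_apply (demo_symrec const *fn, double arg)
{
  assert (fn != NULL && fn->value.fnctptr != NULL);
  return (*(fn->value.fnctptr)) (arg);
}

/* Mirrors the action of "VAR '=' exp": store the value, then yield it
   so that the assignment can itself be used as an expression.  */
double
demo_assign (demo_symrec *var, double val)
{
  assert (var != NULL);
  var->value.var = val;
  return val;
}
```

Because the assignment yields the stored value, a nested call such as `demo_assign (&a, demo_assign (&b, 2.3))` sets both records, which is the behavior seen with `alpha = beta1 = 2.3` in the sample session.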
2.5.3 The mfcalc Symbol Table

The multi-function calculator requires a symbol table to keep track of the names and meanings of variables and functions. This doesn’t affect the grammar rules (except for the actions) or the Bison declarations, but it requires some additional C functions for support.
The symbol table itself consists of a linked list of records. Its definition, which is kept in the header ‘calc.h’, is as follows. It provides for either functions or variables to be placed in the table.
    /* Function type.  */
    typedef double (*func_t) (double);

    /* Data type for links in the chain of symbols.  */
    struct symrec
    {
      char *name;  /* name of symbol */
      int type;    /* type of symbol: either VAR or FNCT */
      union
      {
        double var;      /* value of a VAR */
        func_t fnctptr;  /* value of a FNCT */
      } value;
      struct symrec *next;  /* link field */
    };

    typedef struct symrec symrec;

    /* The symbol table: a chain of 'struct symrec'.  */
    extern symrec *sym_table;

    symrec *putsym (char const *, int);
    symrec *getsym (char const *);
The new version of main will call init_table to initialize the symbol table:
    struct init
    {
      char const *fname;
      double (*fnct) (double);
    };

    struct init const arith_fncts[] =
    {
      { "atan", atan },
      { "cos",  cos  },
      { "exp",  exp  },
      { "ln",   log  },
      { "sin",  sin  },
      { "sqrt", sqrt },
      { 0, 0 },
    };

    /* The symbol table: a chain of 'struct symrec'.  */
    symrec *sym_table;

    /* Put arithmetic functions in table.  */
    static void
    init_table (void)
    {
      int i;
      for (i = 0; arith_fncts[i].fname != 0; i++)
        {
          symrec *ptr = putsym (arith_fncts[i].fname, FNCT);
          ptr->value.fnctptr = arith_fncts[i].fnct;
        }
    }
By simply editing the initialization list and adding the necessary include files, you can add additional functions to the calculator.
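As a sketch of such an extension, the list below registers two user-defined functions instead of library ones. All `demo_` names are hypothetical and not part of mfcalc; library functions such as tan or log10 from <math.h> could be listed in exactly the same way. The lookup helper exists only to make the example testable in isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef double (*demo_func_t) (double);

/* Same layout as 'struct init' in the text.  */
struct demo_init
{
  char const *fname;
  demo_func_t fnct;
};

/* Hypothetical user-defined functions.  */
double demo_square (double x) { return x * x; }
double demo_half (double x) { return x / 2; }

/* Extended initialization list, terminated by a null entry just like
   arith_fncts.  */
struct demo_init const demo_fncts[] =
{
  { "square", demo_square },
  { "half",   demo_half   },
  { 0, 0 },
};

/* Find a function by name in the list; return a null pointer if it is
   not listed.  */
demo_func_t
demo_find (char const *name)
{
  int i;
  assert (name != NULL);
  for (i = 0; demo_fncts[i].fname != 0; i++)
    if (strcmp (demo_fncts[i].fname, name) == 0)
      return demo_fncts[i].fnct;
  return 0;
}
```

Any function with the signature `double (double)` fits the table; functions taking more arguments would need a different record layout and new grammar rules.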
Two important functions allow look-up and installation of symbols in the
symbol table. The function putsym is passed a name and the type (VAR or
FNCT) of the object to be installed. The object is linked to the front of
the list, and a pointer to the object is returned. The function getsym is
passed the name of the symbol to look up. If found, a pointer to that symbol
is returned; otherwise zero is returned.
    #include <stdlib.h> /* malloc. */
    #include <string.h> /* strlen, strcpy, strcmp. */

    symrec *
    putsym (char const *sym_name, int sym_type)
    {
      symrec *ptr = (symrec *) malloc (sizeof (symrec));
      ptr->name = (char *) malloc (strlen (sym_name) + 1);
      strcpy (ptr->name, sym_name);
      ptr->type = sym_type;
      ptr->value.var = 0; /* Set value to 0 even if it is a function.  */
      ptr->next = sym_table; /* Link at the front of the list.  */
      sym_table = ptr;
      return ptr;
    }

    symrec *
    getsym (char const *sym_name)
    {
      symrec *ptr;
      for (ptr = sym_table; ptr != 0; ptr = ptr->next)
        if (strcmp (ptr->name, sym_name) == 0)
          return ptr;
      return 0;
    }
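The version above does not check the result of malloc. A more defensive variant might look like the following sketch, which reports allocation failure to the caller. The `demo_` names, the flattened record, and the token codes are assumptions chosen so the fragment stands alone; in mfcalc the codes VAR and FNCT come from Bison.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary stand-ins for the token codes that Bison would assign.  */
enum { DEMO_VAR = 1, DEMO_FNCT = 2 };

typedef struct demo_sym
{
  char *name;
  int type;
  double var;
  struct demo_sym *next;
} demo_sym;

demo_sym *demo_table = NULL;

/* Like putsym, but return NULL instead of dereferencing a null
   pointer when memory is exhausted.  */
demo_sym *
demo_putsym (char const *name, int type)
{
  demo_sym *ptr;
  assert (name != NULL);
  ptr = malloc (sizeof *ptr);
  if (ptr == NULL)
    return NULL;
  ptr->name = malloc (strlen (name) + 1);
  if (ptr->name == NULL)
    {
      free (ptr);
      return NULL;
    }
  strcpy (ptr->name, name);
  ptr->type = type;
  ptr->var = 0;
  ptr->next = demo_table;  /* link at the front of the list */
  demo_table = ptr;
  return ptr;
}

/* Same linear search as getsym.  */
demo_sym *
demo_getsym (char const *name)
{
  demo_sym *ptr;
  assert (name != NULL);
  for (ptr = demo_table; ptr != NULL; ptr = ptr->next)
    if (strcmp (ptr->name, name) == 0)
      return ptr;
  return NULL;
}
```

Since new records go to the front, the most recently installed symbol is found first; look-up time grows linearly with the number of symbols, which is acceptable for a calculator session.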
2.5.4 The mfcalc Lexer

The function yylex must now recognize variables, numeric values, and the
single-character arithmetic operators. Strings of alphanumeric characters
with a leading letter are recognized as either variables or functions
depending on what the symbol table says about them.
The string is passed to getsym for look up in the symbol table. If the name
appears in the table, a pointer to its location and its type (VAR or FNCT)
are returned to yyparse. If it is not already in the table, then it is
installed as a VAR using putsym. Again, a pointer and its type (which must
be VAR) are returned to yyparse.
No change is needed in the handling of numeric values and arithmetic
operators in yylex.
    #include <ctype.h>

    int
    yylex (void)
    {
      int c;

      /* Ignore white space, get first nonwhite character.  */
      while ((c = getchar ()) == ' ' || c == '\t')
        continue;

      if (c == EOF)
        return 0;

      /* Char starts a number => parse the number.  */
      if (c == '.' || isdigit (c))
        {
          ungetc (c, stdin);
          scanf ("%lf", &yylval.NUM);
          return NUM;
        }
Bison generated a definition of YYSTYPE with a member named NUM to store the
value of NUM symbols.
      /* Char starts an identifier => read the name.  */
      if (isalpha (c))
        {
          /* Initially make the buffer long enough
             for a 40-character symbol name.  */
          static size_t length = 40;
          static char *symbuf = 0;
          symrec *s;
          int i;
          if (!symbuf)
            symbuf = (char *) malloc (length + 1);

          i = 0;
          do
            {
              /* If buffer is full, make it bigger.  */
              if (i == length)
                {
                  length *= 2;
                  symbuf = (char *) realloc (symbuf, length + 1);
                }
              /* Add this character to the buffer.  */
              symbuf[i++] = c;
              /* Get another character.  */
              c = getchar ();
            }
          while (isalnum (c));

          ungetc (c, stdin);
          symbuf[i] = '\0';

          s = getsym (symbuf);
          if (s == 0)
            s = putsym (symbuf, VAR);
          *((symrec**) &yylval) = s;
          return s->type;
        }

      /* Any other character is a token by itself.  */
      return c;
    }
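The buffer-doubling loop is easier to experiment with when it reads from a string rather than from standard input. The following sketch isolates it under that assumption; `demo_read_ident` is a hypothetical helper, not part of mfcalc, and unlike the lexer it checks the result of realloc and starts with a deliberately tiny buffer so that the growth path is exercised.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Copy the identifier at the start of INPUT into a freshly allocated
   string, doubling the buffer whenever it fills up.  Return NULL if
   INPUT does not start with a letter or if memory runs out.  The
   caller frees the result.  */
char *
demo_read_ident (char const *input)
{
  size_t length = 4;  /* small on purpose; the lexer starts at 40 */
  size_t i = 0;
  char *buf;

  assert (input != NULL);
  if (!isalpha ((unsigned char) input[0]))
    return NULL;

  buf = malloc (length + 1);
  if (buf == NULL)
    return NULL;

  while (isalnum ((unsigned char) input[i]))
    {
      /* If buffer is full, make it bigger.  */
      if (i == length)
        {
          char *bigger;
          length *= 2;
          bigger = realloc (buf, length + 1);
          if (bigger == NULL)
            {
              free (buf);
              return NULL;
            }
          buf = bigger;
        }
      buf[i] = input[i];
      i++;
    }

  /* There is always room for the terminator: the buffer holds
     length + 1 bytes and i never exceeds length.  */
  buf[i] = '\0';
  return buf;
}
```

Keeping the old pointer until realloc succeeds avoids leaking the buffer on failure, which the compact version in the lexer does not attempt.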
2.5.5 The mfcalc Main

The error reporting function is unchanged, and the new version of main
includes a call to init_table and sets yydebug on user demand (see section
Tracing Your Parser for details):
    /* Called by yyparse on error.  */
    void
    yyerror (char const *s)
    {
      fprintf (stderr, "%s\n", s);
    }

    int
    main (int argc, char const* argv[])
    {
      int i;
      /* Enable parse traces on option -p.  Note that yydebug is only
         declared when the parser is built with tracing enabled, for
         instance with '%define parse.trace'.  */
      for (i = 1; i < argc; ++i)
        if (!strcmp(argv[i], "-p"))
          yydebug = 1;
      init_table ();
      return yyparse ();
    }
This program is both powerful and flexible. You may easily add new
functions, and it is a simple job to modify this code to install predefined
variables such as pi or e as well.
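One possible way to do this, sketched below, is a second initialization list that parallels arith_fncts. Everything prefixed `pv_` is hypothetical, including the compact copy of the symbol table that makes the fragment self-contained; in mfcalc itself the loop would call putsym with the token code VAR and store into value.var.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { PV_VAR = 1 };  /* arbitrary stand-in for Bison's VAR code */

typedef struct pv_sym
{
  char *name;
  int type;
  double var;
  struct pv_sym *next;
} pv_sym;

pv_sym *pv_table = NULL;

/* Compact equivalent of putsym; returns NULL if out of memory.  */
pv_sym *
pv_putsym (char const *name, int type)
{
  pv_sym *ptr;
  assert (name != NULL);
  ptr = malloc (sizeof *ptr);
  if (ptr == NULL)
    return NULL;
  ptr->name = malloc (strlen (name) + 1);
  if (ptr->name == NULL)
    {
      free (ptr);
      return NULL;
    }
  strcpy (ptr->name, name);
  ptr->type = type;
  ptr->var = 0;
  ptr->next = pv_table;
  pv_table = ptr;
  return ptr;
}

/* Initialization list for predefined variables.  */
struct pv_init
{
  char const *name;
  double value;
};

struct pv_init const pv_consts[] =
{
  { "pi", 3.14159265358979323846 },
  { "e",  2.71828182845904523536 },
  { 0, 0 },
};

/* Install the predefined variables; return how many were installed.  */
int
pv_init_vars (void)
{
  int i;
  for (i = 0; pv_consts[i].name != 0; i++)
    {
      pv_sym *ptr = pv_putsym (pv_consts[i].name, PV_VAR);
      if (ptr == NULL)
        break;
      ptr->var = pv_consts[i].value;
    }
  return i;
}
```

Variables installed this way are ordinary VAR symbols, so a user could still reassign pi during a session; making them read-only would take a separate symbol type and a matching change to the grammar.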
This document was generated by Rick Perry on December 29, 2013 using texi2html 1.82.