%option align
%option ecs
Directs flex to construct equivalence classes, i.e., sets of characters which have identical lexical properties (for example, if the only appearance of digits in the flex input is in the character class “[0-9]” then the digits '0', '1', ..., '9' will all be put in the same equivalence class). Equivalence classes usually give dramatic reductions in the final table/object file sizes (typically a factor of 2-5) and are pretty cheap performance-wise (one array look-up per character scanned).
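To make the idea of an equivalence class concrete, here is a minimal sketch in C. It is not flex's own implementation: build_ecs() and demo_digit_classes() are hypothetical helpers written for this illustration, and character classes are assumed to be given as simple 256-entry membership tables. Two bytes land in the same class exactly when they belong to the same character classes.

```c
#include <assert.h>
#include <string.h>

#define NCHARS 256

/* Hypothetical helper (not part of flex): assign an equivalence-class
 * number to every byte value. sets[i][c] != 0 means byte c belongs to
 * character class i. Two bytes share an equivalence class iff they are
 * members of exactly the same character classes.
 * Returns the number of classes; ec[c] receives the class of byte c. */
int build_ecs(unsigned char sets[][NCHARS], int nsets,
              unsigned char ec[NCHARS])
{
    int rep[NCHARS];   /* one representative byte per class found so far */
    int nclasses = 0;

    for (int c = 0; c < NCHARS; c++) {
        int found = -1;
        for (int k = 0; k < nclasses && found < 0; k++) {
            int same = 1;
            for (int i = 0; i < nsets; i++) {
                if ((sets[i][c] != 0) != (sets[i][rep[k]] != 0)) {
                    same = 0;
                    break;
                }
            }
            if (same)
                found = k;
        }
        if (found < 0) {          /* c starts a new class */
            rep[nclasses] = c;
            found = nclasses++;
        }
        ec[c] = (unsigned char)found;
    }
    return nclasses;
}

/* Demo for the case described above: the only character class in the
 * input is [0-9], so all ten digits should share one class. */
int demo_digit_classes(unsigned char ec[NCHARS])
{
    unsigned char sets[1][NCHARS];

    memset(sets, 0, sizeof sets);
    for (int c = '0'; c <= '9'; c++)
        sets[0][c] = 1;
    return build_ecs(sets, 1, ec);
}
```

With such a table, a scanner's transition tables need one column per class rather than one per character, and mapping an input byte to its class is the single array look-up (ec[c]) mentioned above.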
‘-Cf’ specifies that the full scanner tables should be generated: flex should not compress the tables by taking advantage of similar transition functions for different states.
%option meta-ecs
Directs flex to construct meta-equivalence classes, which are sets of equivalence classes (or characters, if equivalence classes are not being used) that are commonly used together. Meta-equivalence classes are often a big win when using compressed tables, but they have a moderate performance impact (one or two “if” tests and one array look-up per character scanned).
%option read
Causes the generated scanner to bypass use of the standard I/O library (stdio) for input. Instead of calling fread() or getc(), the scanner will use the read() system call, resulting in a performance gain which varies from system to system, but in general is probably negligible unless you are also using ‘-Cf’ or ‘-CF’. Using ‘-Cr’ can cause strange behavior if, for example, you read from yyin using stdio prior to calling the scanner (because the scanner will miss whatever text your previous reads left in the stdio input buffer). ‘-Cr’ has no effect if you define YY_INPUT() (see Generated Scanner).
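The stdio buffering pitfall can be reproduced outside of flex. The following sketch assumes a POSIX system; bytes_left_for_read() is a hypothetical helper written for this illustration. It reads one character with getc() and then counts how many bytes a direct read() on the same file descriptor can still see.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes of `text` to a temporary file,
 * consume one character through stdio, then count the bytes that read()
 * still finds on the underlying descriptor. Returns -1 on error. */
long bytes_left_for_read(const char *text, size_t len)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    if (fwrite(text, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }
    rewind(fp);                      /* flush and seek back to the start */

    if (getc(fp) == EOF) {           /* stdio fills its buffer here */
        fclose(fp);
        return -1;
    }

    char buf[256];
    long total = 0;
    ssize_t n;
    /* read() starts at the descriptor's file offset, which stdio has
     * usually advanced past everything it buffered. */
    while ((n = read(fileno(fp), buf, sizeof buf)) > 0)
        total += (long)n;

    fclose(fp);
    return n < 0 ? -1 : total;
}
```

For a short input, stdio will typically have pulled the whole file into its buffer on the first getc(), so read() sees fewer bytes than remain unread (often none). That text is exactly what a ‘-Cr’ scanner would miss.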
The options ‘-Cf’ or ‘-CF’ and ‘-Cm’ do not make sense together: there is no opportunity for meta-equivalence classes if the table is not being compressed. Otherwise the options may be freely mixed, and are cumulative.
The default setting is ‘-Cem’, which specifies that flex should generate equivalence classes and meta-equivalence classes. This setting provides the highest degree of table compression. You can trade larger tables for faster-executing scanners, with the following ordering generally holding:
    slowest & smallest
          -Cem
          -Cm
          -Ce
          -C
          -C{f,F}e
          -C{f,F}
          -C{f,F}a
    fastest & largest
Note that scanners with the smallest tables are usually generated and compiled the quickest, so during development you will usually want to use the default, maximal compression.
‘-Cfe’ is often a good compromise between speed and size for production scanners.
%option full
Specifies the full table representation: no table compression is done and stdio is bypassed. The result is large but fast. This option is equivalent to ‘-Cfr’.
%option fast
Specifies that the fast scanner table representation should be used (and stdio bypassed). This representation is about as fast as the full table representation ‘--full’, and for some sets of patterns will be considerably smaller (and for others, larger). In general, if the pattern set contains both keywords and a catch-all identifier rule, such as in the set:
    "case"    return TOK_CASE;
    "switch"  return TOK_SWITCH;
    ...
    "default" return TOK_DEFAULT;
    [a-z]+    return TOK_ID;
then you're better off using the full table representation. If only the identifier rule is present and you then use a hash table or some such to detect the keywords, you're better off using ‘--fast’.
This option is equivalent to ‘-CFr’. It cannot be used with ‘--c++’.
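For the case where only the identifier rule is kept and keywords are detected separately, the lookup might look like the sketch below. It uses a sorted table with bsearch() rather than a hash table (the "or some such" variant). The token names follow the rule set above, but their numeric values and the classify_identifier() helper are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Token codes: names as in the rule set above; the numeric values are
 * placeholders chosen for this sketch. */
enum { TOK_ID = 256, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name, because bsearch() requires sorted input. */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE    },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH  },
};

/* bsearch() comparison: key is the matched text, elem a table entry. */
static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key,
                  ((const struct keyword *)elem)->name);
}

/* Hypothetical helper, intended to be called from the action of the
 * single identifier rule, for example:
 *     [a-z]+    return classify_identifier(yytext);
 * Returns the keyword's token code, or TOK_ID for a plain identifier. */
int classify_identifier(const char *text)
{
    const struct keyword *kw =
        bsearch(text, keywords,
                sizeof keywords / sizeof keywords[0],
                sizeof keywords[0], cmp_keyword);
    return kw ? kw->token : TOK_ID;
}
```

The scanner then needs only one pattern for all words, which is the situation in which ‘--fast’ is described above as the better choice. The cost of telling keywords apart moves into the rule's action.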