This chapter describes how flex handles dynamic memory, and how you can override the default behavior.
21.1 The Default Memory Management
21.2 Overriding The Default Memory Management
21.3 A Note About yytext And Memory
21.1 The Default Memory Management
Flex allocates dynamic memory during initialization, and once in a while from
within a call to yylex(). Initialization takes place during the first call to
yylex(). Thereafter, flex may reallocate more memory if it needs to enlarge a
buffer. As of version 2.5.9, flex will clean up all memory when you call
yylex_destroy(). See faq-memory-leak.
Flex allocates dynamic memory for the purposes listed below:
Flex allocates memory for the character buffer used to perform pattern
matching. Flex must read ahead from the input stream and store it in a large
character buffer. This buffer is typically the largest chunk of dynamic memory
flex consumes. This buffer will grow if necessary, doubling the size each time.
Flex frees this memory when you call yylex_destroy(). The default size of this
buffer (16384 bytes) is almost always too large. The ideal size for this
buffer is the length of the longest token expected, in bytes, plus a little
more. Flex will allocate a few extra bytes for housekeeping. Currently, to
override the size of the input buffer you must #define YY_BUF_SIZE to whatever
number of bytes you want. We don’t plan to change this in the near future, but
we reserve the right to do so if we ever add a more robust memory management
API.
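Flex allocates memory for the state buffer used by REJECT. This buffer is only
allocated if the scanner uses REJECT. The size is large enough to hold the same
number of states as characters in the input buffer. If you override the size of
the input buffer (via YY_BUF_SIZE), then you automatically override the size of
this buffer as well.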
Flex allocates memory for the start condition stack. This is the stack used
for pushing start states, i.e., with yy_push_state(). It will grow if
necessary. Since the states are simply integers, this stack doesn’t consume
much memory. This stack is not present if %option stack is not
specified. You will rarely need to tune this buffer. The ideal size for this
stack is the maximum depth expected. The memory for this stack is
automatically destroyed when you call yylex_destroy(). See option-stack.
Flex allocates memory for each YY_BUFFER_STATE. The buffer state itself is about 40 bytes, plus an additional large character buffer (described above). The initial buffer state is created during initialization, and another is created with each call to yy_create_buffer(). You can’t tune the size of the buffer state, but you can tune the character buffer as described above. Any buffer state that you explicitly create by calling yy_create_buffer() is NOT destroyed automatically; you must call yy_delete_buffer() to free the memory. The exception to this rule is that flex will delete the current buffer automatically when you call yylex_destroy(). If you delete the current buffer yourself, be sure to set it to NULL so that flex will not try to delete the buffer a second time (possibly crashing your program). At the time of this writing, flex does not provide a growable stack for the buffer states; you have to manage that yourself. See section Multiple Input Buffers.
Flex allocates about 84 bytes for the reentrant scanner structure when you call yylex_init(). It is destroyed when the user calls yylex_destroy().
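The buffer-state item above notes that a growable stack for buffer states is left to you. The following is a minimal standalone sketch of such a stack, not flex code. The names buffer_handle, buffer_stack, buffer_stack_push, buffer_stack_pop, and buffer_stack_free are hypothetical, and buffer_handle is a void pointer standing in for YY_BUFFER_STATE so that the sketch compiles on its own.

```c
#include <stdlib.h>

/* Stand-in so the sketch compiles without a generated scanner; in a real
   scanner this would be YY_BUFFER_STATE. */
typedef void *buffer_handle;

struct buffer_stack {
    buffer_handle *items;
    size_t depth;      /* number of handles currently stored */
    size_t capacity;   /* number of slots allocated */
};

/* Push a handle, doubling the array when it is full. Returns 0 on success,
   or -1 if memory could not be obtained (the stack is left unchanged). */
int buffer_stack_push(struct buffer_stack *s, buffer_handle b)
{
    if (s->depth == s->capacity) {
        size_t new_cap = s->capacity ? s->capacity * 2 : 4;
        buffer_handle *p = realloc(s->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        s->items = p;
        s->capacity = new_cap;
    }
    s->items[s->depth++] = b;
    return 0;
}

/* Pop the most recent handle, or NULL if the stack is empty. The caller
   remains responsible for deleting the buffer the handle refers to. */
buffer_handle buffer_stack_pop(struct buffer_stack *s)
{
    return s->depth ? s->items[--s->depth] : NULL;
}

/* Release the array itself, not the buffers the handles point to. */
void buffer_stack_free(struct buffer_stack *s)
{
    free(s->items);
    s->items = NULL;
    s->depth = s->capacity = 0;
}
```

In a scanner, one plausible use is to push YY_CURRENT_BUFFER before switching to a buffer obtained from yy_create_buffer(), and on end of input to pop the saved handle, call yy_delete_buffer() on the finished buffer, and switch back.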
21.2 Overriding The Default Memory Management
Flex calls the functions yyalloc, yyrealloc, and yyfree when it needs to
allocate or free memory. By default, these functions are wrappers around the
standard C functions malloc, realloc, and free, respectively. You can override
the default implementations by telling flex that you will provide your own
implementations.
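To override the default implementations, you must do two things:
1. Suppress the default implementations by specifying one or more of the options noyyalloc, noyyrealloc, and noyyfree.
2. Provide your own implementations of the corresponding functions, with the following prototypes:

// For a non-reentrant scanner
void * yyalloc (size_t bytes);
void * yyrealloc (void * ptr, size_t bytes);
void yyfree (void * ptr);

// For a reentrant scanner
void * yyalloc (size_t bytes, void * yyscanner);
void * yyrealloc (void * ptr, size_t bytes, void * yyscanner);
void yyfree (void * ptr, void * yyscanner);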
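As a simpler illustration of the non-reentrant prototypes, the following sketch wraps malloc, realloc, and free and keeps a count of live blocks. It is only one possible shape for such replacements. The counter live_blocks and the accessor live_block_count are hypothetical names, and the code is written so that it compiles on its own, outside a generated scanner.

```c
#include <stdlib.h>

/* Number of blocks handed out and not yet freed (illustration only). */
static long live_blocks;

void *yyalloc(size_t bytes)
{
    void *p = malloc(bytes);
    if (p != NULL)
        live_blocks++;
    return p;
}

void *yyrealloc(void *ptr, size_t bytes)
{
    void *p = realloc(ptr, bytes);
    /* realloc(NULL, n) acts like malloc(n), so it creates a new block. */
    if (ptr == NULL && p != NULL)
        live_blocks++;
    return p;
}

void yyfree(void *ptr)
{
    if (ptr != NULL)
        live_blocks--;
    free(ptr);              /* free(NULL) is a no-op */
}

/* Hypothetical accessor: a nonzero value after the scanner has been
   destroyed would suggest a leak. */
long live_block_count(void)
{
    return live_blocks;
}
```

In a real scanner these definitions would live in the user code section of the scanner file, with the default implementations suppressed, and live_block_count() could be checked after yylex_destroy().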
In the following example, we will override all three memory routines. We assume
that there is a custom allocator with garbage collection. In order to make this
example interesting, we will use a reentrant scanner, passing a pointer to the
custom allocator through yyextra.
%{
#include "some_allocator.h"
%}

/* Suppress the default implementations. */
%option noyyalloc noyyrealloc noyyfree
%option reentrant

/* The allocator travels in yyextra. Create the scanner with
   yylex_init_extra (allocator_create (), &scanner) so that yyextra
   is already set when flex makes its first allocation. */
%{
#define YY_EXTRA_TYPE struct allocator*
%}

%%
.|\n   ;
%%

/* Provide our own implementations. */
void * yyalloc (size_t bytes, void* yyscanner) {
    return allocator_alloc (yyget_extra (yyscanner), bytes);
}

void * yyrealloc (void * ptr, size_t bytes, void* yyscanner) {
    return allocator_realloc (yyget_extra (yyscanner), ptr, bytes);
}

void yyfree (void * ptr, void * yyscanner) {
    /* Do nothing -- we leave it to the garbage collector. */
}
21.3 A Note About yytext And Memory
When flex finds a match, yytext points to the first character of the
match in the input buffer. The string itself is part of the input buffer, and
is NOT allocated separately. The value of yytext will be overwritten the next
time yylex() is called. In short, the value of yytext is only valid from within
the matched rule’s action.
Often, you want the value of yytext to persist for later processing, e.g., by a parser with non-zero lookahead. In order to preserve yytext, you will have to copy it with strdup() or a similar function. But this introduces a complication, because your parser is now responsible for freeing the copy of yytext. If you use a yacc or bison parser (commonly used with flex), you will discover that the error recovery mechanisms can cause memory to be leaked.
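The difference between keeping the pointer and keeping a copy can be seen in a small standalone simulation. This is not flex code: input_buffer, fake_match, and keep_token are hypothetical names, and fake_match only imitates the way yytext refers to storage inside a shared buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the scanner's input buffer; not flex code. */
static char input_buffer[32];

/* Pretend to match a token: like yytext, the returned pointer refers to
   storage inside the shared buffer, not to a separate allocation. */
const char *fake_match(const char *token)
{
    strncpy(input_buffer, token, sizeof input_buffer - 1);
    input_buffer[sizeof input_buffer - 1] = '\0';
    return input_buffer;
}

/* Make a private copy that survives later matches. The caller must free it. */
char *keep_token(const char *text)
{
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, text, n);
    return copy;
}
```

If a caller saves the pointer returned by fake_match("alpha") and then calls fake_match("beta"), the saved pointer now reads "beta", while a copy made with keep_token() before the second call still reads "alpha".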
To prevent memory leaks from strdup’d yytext, you will have to track the memory somehow. Our experience has shown that a garbage collection mechanism or a pooled memory mechanism will save you a lot of grief when writing parsers.
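One way to track the copies is a simple pool that remembers every string it hands out and frees them all at once. The sketch below is an illustration under that assumption, not a mechanism provided by flex; str_node, str_pool, pool_strndup, and pool_release are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool: every copy is linked into a list so that all of
   them can be released in one call, whatever the parser did with them. */
struct str_node {
    struct str_node *next;
    char text[];              /* C99 flexible array member */
};

struct str_pool {
    struct str_node *head;
};

/* Copy `len` bytes of `src` into the pool and NUL-terminate the copy.
   Returns NULL if memory could not be obtained. */
char *pool_strndup(struct str_pool *pool, const char *src, size_t len)
{
    struct str_node *n = malloc(sizeof *n + len + 1);
    if (n == NULL)
        return NULL;
    memcpy(n->text, src, len);
    n->text[len] = '\0';
    n->next = pool->head;
    pool->head = n;
    return n->text;
}

/* Free every string handed out by the pool, for example after parsing
   ends or after an unrecoverable error. */
void pool_release(struct str_pool *pool)
{
    struct str_node *n = pool->head;
    while (n != NULL) {
        struct str_node *next = n->next;
        free(n);
        n = next;
    }
    pool->head = NULL;
}
```

In a rule's action, a call such as pool_strndup(&pool, yytext, yyleng) would replace strdup(yytext), and the parser would never free individual strings. Because the pool owns every copy, tokens discarded during error recovery are still released by pool_release().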
This document was generated by Rick Perry on January 7, 2013 using texi2html 1.82.