Lexical Analysis

Lexical analysis reads the characters of the source program from left to right and groups them into tokens. A simple way to build a lexical analyzer is to construct a diagram that illustrates the structure of the tokens of the source language, and then to translate that diagram by hand into a program that finds tokens. We can also produce a lexical analyzer automatically by specifying the lexeme patterns to a lexical-analyzer generator, which compiles those patterns into code that functions as a lexical analyzer. This approach makes it easier to modify a lexical analyzer, since we only have to rewrite the affected patterns, not the entire program.
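
To make the pattern-driven idea concrete, here is a minimal sketch in Python. It is not a real lexical-analyzer generator: it only imitates one by combining a table of regular expressions into a single matcher using the standard `re` module. The token names and patterns (`NUMBER`, `ID`, `OP`, `SKIP`) and the function name `tokenize` are assumptions chosen for illustration. Note also that Python's alternation picks the first alternative that matches, whereas Lex prefers the longest match, so the order of the table matters here.

```python
import re

# Hypothetical token patterns for a tiny expression language.
# Earlier entries win when more than one pattern could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
]

# Combine all patterns into one regular expression with named groups.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs found in text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Changing what counts as a token means editing one entry of `TOKEN_SPEC`; the scanning loop itself stays the same.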

There are three general approaches to implementing a lexical analyzer:

i. Use a lexical-analyzer generator, such as Lex, to produce the lexical analyzer from a regular-expression-based specification. In this case, the generator provides the routines for reading and buffering the input.

ii. Write the lexical analyzer in a conventional programming language, using the I/O facilities of that language to read the input.

iii. Write the lexical analyzer in assembly language and explicitly manage the reading of the input.
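
As an illustration of approach ii, the following is a small hand-written scanner in Python. It is a sketch under stated assumptions, not a definitive implementation: the token classes (`ID`, `NUMBER`, `OP`) and the function name `scan` are hypothetical, and the input is assumed to be already in memory as a string rather than read through an I/O routine. Each branch of the loop corresponds to following one path through a diagram of the token structure.

```python
def scan(text):
    """Hand-coded scanner: return a list of (token_name, lexeme) pairs."""
    tokens = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            # Whitespace only separates tokens.
            i += 1
        elif ch.isalpha() or ch == "_":
            # Identifier: a letter or underscore, then letters, digits, underscores.
            # (str.isalpha/isalnum also accept non-ASCII letters.)
            start = i
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            tokens.append(("ID", text[start:i]))
        elif ch.isdigit():
            # Number: one or more digits.
            start = i
            while i < n and text[i].isdigit():
                i += 1
            tokens.append(("NUMBER", text[start:i]))
        elif ch in "+-*/=()":
            # Single-character operators and parentheses.
            tokens.append(("OP", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at position {i}")
    return tokens
```

Compared with a pattern-based specification, every change to the token structure here requires editing the control flow itself, which is the maintenance cost that a generator avoids.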


The speed of lexical analysis is a concern in compiler design, because this is the only phase that reads the source program character by character.
