The role of the lexical analyzer

Since the lexical analyzer is the part of the compiler that reads the source text, it may perform certain other tasks besides identification of lexemes. One such task is stripping out comments and whitespace (blank, newline, tab, and perhaps other characters that are used to separate tokens in the input). Another task is correlating error messages generated by the compiler with the source program. For instance, the lexical analyzer may keep track of the number of newline characters seen, so it can associate a line number with each error message. In some compilers, the lexical analyzer makes a copy of the source program with the error messages inserted at the appropriate positions. If the source program uses a macro-preprocessor, the expansion of macros may also be performed by the lexical analyzer.

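Lexical analysis is the first phase of a compiler. The lexical analyzer reads the source code as input and produces a sequence of tokens as output, which the parser then uses as input during syntax analysis. Upon receiving a ‘getNextToken’ request from the parser, the lexical analyzer reads input characters until it has identified the next token, and returns that token to the parser.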
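The request-and-reply interaction between parser and lexical analyzer can be sketched roughly as follows. This is only an illustration under stated assumptions: the token classes (NUMBER, ID, OP), the `Lexer` class, and the `all_tokens` helper standing in for the parser are all hypothetical names for a toy language, not part of any particular compiler.

```python
import re

# Hypothetical token classes for a toy language (an assumption for illustration).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"[ \t\r\n]+"),   # whitespace separates tokens but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def get_next_token(self):
        """Return the next (kind, lexeme) pair, or ("EOF", "") at end of input."""
        while self.pos < len(self.text):
            m = MASTER.match(self.text, self.pos)
            if m is None:
                raise SyntaxError(f"unexpected character {self.text[self.pos]!r}")
            self.pos = m.end()
            if m.lastgroup != "SKIP":
                return (m.lastgroup, m.group())
        return ("EOF", "")


def all_tokens(text):
    """Stand-in for the parser: keep requesting tokens until EOF."""
    lexer = Lexer(text)
    tokens = []
    while True:
        tok = lexer.get_next_token()
        if tok[0] == "EOF":
            return tokens
        tokens.append(tok)


print(all_tokens("count = count + 1"))
# [('ID', 'count'), ('OP', '='), ('ID', 'count'), ('OP', '+'), ('NUMBER', '1')]
```

The lexer does no work until it is asked for a token, which mirrors the way the parser drives the lexical analyzer on demand instead of receiving the whole token stream up front.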

Some additional tasks are: eliminating comments, blanks, tabs, and newline characters; providing line numbers to associate with error messages; and making a copy of the source program with the error messages inserted.
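As a minimal sketch of the first two tasks, the function below skips whitespace and comments while counting newlines, so that a later error message can cite a line number. It assumes `//` line comments, which is a choice made for this example; the function name `skip_blanks_and_comments` is likewise hypothetical.

```python
def skip_blanks_and_comments(text, pos, line):
    """Advance past whitespace and '//' comments starting at pos.

    Returns the new position and the updated line number.
    """
    while pos < len(text):
        ch = text[pos]
        if ch == "\n":
            line += 1          # count newlines so errors can cite a line
            pos += 1
        elif ch in " \t\r":
            pos += 1
        elif text.startswith("//", pos):
            # Skip to end of line; the newline itself is counted on the next pass.
            while pos < len(text) and text[pos] != "\n":
                pos += 1
        else:
            break              # start of a real token
    return pos, line


source = "  // set up\n\n\tx = 1"
pos, line = skip_blanks_and_comments(source, 0, 1)
print(pos, line, source[pos])   # 14 3 x
```

Here the first real token, `x`, is found at offset 14 on line 3, which is the line number the lexical analyzer would attach to any error reported at that point.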


There are several reasons for separating lexical analysis from parsing: the design of each phase becomes simpler, compiler efficiency is improved, and compiler portability is enhanced.
