Analysis: source program to intermediate
representation (front end)
Synthesis: intermediate representation to
target program (back end)
The analysis part breaks
up the source program into constituent pieces and imposes a grammatical
structure on them. It then uses this structure to create an intermediate
representation of the source program. If the analysis part detects that the
source program is either syntactically ill formed or semantically unsound,
then it must provide informative messages, so the user can take corrective
action. The analysis part also collects information about the source
program and stores it in a data structure called a symbol table, which
is passed along with the intermediate representation to the synthesis part.
The synthesis part
constructs the desired target program from the intermediate
representation and the information in the symbol table. The analysis part
is often called the front end of the compiler; the
synthesis part is the back end.
The phases of a compiler are: lexical
analyzer (scanning, or linear analysis), syntax analyzer (parsing, or
hierarchical analysis), semantic analyzer, intermediate code generator,
machine-independent code optimizer, code generator, and machine-dependent
code optimizer. The symbol table manager and the error handler are two independent
modules that interact with all phases of compilation. A symbol table
is a data structure containing a record for each identifier with fields
for the attributes of the identifier. When an identifier in the
source program is detected by the lexical analyzer, the identifier is
entered into the symbol table. Each phase can encounter errors. After
detecting an error, a phase must somehow deal with that error, so that
compilation can proceed, allowing further errors in the source program
to be detected.
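As a rough illustration of the symbol table described above, the following sketch keeps one record per identifier in a dictionary. The class name SymbolTable, its method names, and the "type" field are assumptions made for this example, not part of any particular compiler.

```python
class SymbolTable:
    """Maps each identifier to a record holding its attributes."""

    def __init__(self):
        self._records = {}

    def enter(self, name):
        """Add the identifier if it is new; return its record either way."""
        return self._records.setdefault(name, {"name": name, "type": None})

    def set_attribute(self, name, key, value):
        """Fill in an attribute field, e.g. the type found by a later phase."""
        self._records[name][key] = value

    def lookup(self, name):
        """Return the record for name, or None if it was never entered."""
        return self._records.get(name)

    def __len__(self):
        return len(self._records)


table = SymbolTable()
table.enter("position")
table.enter("rate")
table.enter("position")                       # second sighting reuses the record
table.set_attribute("rate", "type", "float")  # attribute added by a later phase
```

A dictionary is used here only because it gives fast lookup by name; real compilers often add scoping on top of this.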
The first phase of a compiler is called lexical
analysis or scanning. The lexical analyzer
reads the stream of characters making up the source program and groups the
characters into meaningful sequences called lexemes. For
each lexeme, the lexical analyzer produces as output a token of
the form
(token-name, attribute-value)
that it passes on to the subsequent phase,
syntax analysis. In the token, the first component token-name is
an abstract symbol that is used during syntax analysis, and the
second component attribute-value points to an entry in
the symbol table for this token. Information from the symbol-table entry
is needed for semantic analysis and code generation.
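The grouping of characters into lexemes and (token-name, attribute-value) pairs can be sketched as follows. This is a minimal example under stated assumptions: the token classes in TOKEN_SPEC, the function name tokenize, and the use of a plain list as the symbol table are all choices made for illustration.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("number", r"\d+(?:\.\d+)?"),
    ("id",     r"[A-Za-z_]\w*"),
    ("op",     r"[+\-*/=()]"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source, symbols):
    """Return a list of (token-name, attribute-value) pairs.

    For identifiers, the attribute value is an index into `symbols`,
    which stands in for the symbol table in this sketch.
    """
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = match.end()
        kind, lexeme = match.lastgroup, match.group()
        if kind == "skip":
            continue                      # whitespace produces no token
        if kind == "id":
            if lexeme not in symbols:
                symbols.append(lexeme)    # first sighting: enter into the table
            tokens.append(("id", symbols.index(lexeme)))
        elif kind == "number":
            tokens.append(("number", lexeme))
        else:
            tokens.append((lexeme, None)) # operators need no attribute value
    return tokens


symbols = []
tokens = tokenize("position = initial + rate * 60", symbols)
```

Note that operators carry no attribute value here, while each identifier token points back at its symbol-table entry.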
The second phase of the compiler is syntax
analysis or parsing. The parser uses the
first components of the tokens produced by the lexical analyzer to create
a tree-like intermediate representation that depicts the grammatical structure
of the token stream. A typical representation is a syntax tree in
which each interior node represents an operation and the children of the
node represent the arguments of the operation.
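To make the idea of a syntax tree concrete, here is a small recursive-descent parser sketch for expressions with + and *. The grammar, the function name parse_expression, and the tuple encoding of tree nodes are assumptions for this example. Interior nodes are (operator, left, right) and leaves are the tokens themselves.

```python
def parse_expression(tokens):
    """Parse  expr -> term ('+' term)*  and  term -> factor ('*' factor)*."""
    pos = 0

    def peek():
        # Only the token name (first component) drives parsing decisions.
        return tokens[pos][0] if pos < len(tokens) else None

    def factor():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        name, attr = tokens[pos]
        if name not in ("id", "number"):
            raise SyntaxError(f"unexpected token {name!r}")
        pos += 1
        return (name, attr)

    def term():
        nonlocal pos
        node = factor()
        while peek() == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tokens = [("id", 1), ("+", None), ("id", 2), ("*", None), ("number", "60")]
tree = parse_expression(tokens)
```

Because term is called from expr, multiplication ends up deeper in the tree than addition, which is how the tree records operator precedence.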
The semantic analyzer uses
the syntax tree and the information in the symbol table to check the
source program for semantic consistency with the language definition. It also
gathers type information and saves it in either the syntax tree or the
symbol table, for subsequent use during intermediate-code generation.
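One kind of semantic check, type checking, might look roughly like the sketch below. The rule that mixing int and float yields float is an assumed language rule for this example, as are the function name check_types and the dictionary standing in for the symbol table.

```python
def check_types(node, types):
    """Return the type of a syntax-tree node.

    `types` maps identifier names to declared types (a stand-in for the
    symbol table). Undeclared identifiers are reported as errors.
    """
    kind = node[0]
    if kind == "number":
        return "float" if "." in node[1] else "int"
    if kind == "id":
        name = node[1]
        if name not in types:
            raise NameError(f"undeclared identifier {name!r}")
        return types[name]
    # Interior node: (operator, left, right)
    left = check_types(node[1], types)
    right = check_types(node[2], types)
    # Assumed rule: an int operand is widened when the other is a float.
    return "float" if "float" in (left, right) else "int"


types = {"initial": "float", "rate": "float"}
tree = ("+", ("id", "initial"), ("*", ("id", "rate"), ("number", "60")))
result_type = check_types(tree, types)
```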
In the process of translating a source
program into target code, a compiler may construct one or more
intermediate representations, which can have a variety of forms. Syntax trees
are a form of intermediate representation; they are commonly used during
syntax and semantic analysis.
The machine-independent code-optimization
phase attempts to improve the intermediate code so that better target code
will result. Usually better means faster, but other objectives may be desired,
such as shorter code, or target code that consumes less power.
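Constant folding is one simple example of a machine-independent improvement: operations whose operands are all known at compile time are evaluated by the compiler. The sketch below works on a tuple-encoded syntax tree. The encoding, the function name fold_constants, and the restriction to + and * are assumptions made for illustration.

```python
import operator

# Only these two operators are handled in this sketch.
OPS = {"+": operator.add, "*": operator.mul}


def fold_constants(node):
    """Replace operations on two numeric constants by their computed value."""
    if node[0] in ("id", "number"):
        return node                       # leaves are returned unchanged
    op = node[0]
    left = fold_constants(node[1])
    right = fold_constants(node[2])
    if left[0] == "number" and right[0] == "number":
        return ("number", OPS[op](left[1], right[1]))
    return (op, left, right)


tree = ("+", ("id", "initial"), ("*", ("number", 2), ("number", 30)))
optimized = fold_constants(tree)
```

The multiplication disappears from the tree, so the generated target code has one instruction fewer to execute.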
The code generator takes as input an
intermediate representation of the source program and maps it into the
target language. If the target language is machine code, registers or memory locations
are selected for each of the variables used by the program. Then, the
intermediate instructions are translated into sequences of machine
instructions that perform the same task. A crucial aspect of code
generation is the judicious assignment of registers to hold
variables.
Commonly used compiler-construction tools include parser generators,
scanner generators, syntax-directed translation engines, code-generator
generators, data-flow analysis engines, and compiler-construction toolkits.
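Returning to the code generator discussed earlier, the sketch below shows how a syntax tree might be mapped to instructions for an imaginary register machine. The instruction mnemonics (LD, ADD, MUL), the register names, and the naive allocation strategy are all assumptions for this example. It does not handle running out of registers, which a real code generator must do by spilling values to memory.

```python
def generate(node, code, free_registers):
    """Emit instructions for node; return the register holding its value."""
    kind = node[0]
    if kind in ("id", "number"):
        reg = free_registers.pop(0)       # take the next free register
        operand = node[1] if kind == "id" else f"#{node[1]}"
        code.append(f"LD {reg}, {operand}")
        return reg
    left = generate(node[1], code, free_registers)
    right = generate(node[2], code, free_registers)
    mnemonic = {"+": "ADD", "*": "MUL"}[kind]
    code.append(f"{mnemonic} {left}, {left}, {right}")
    free_registers.insert(0, right)       # right operand register is free again
    return left


code = []
tree = ("+", ("id", "initial"), ("*", ("id", "rate"), ("number", 60)))
result = generate(tree, code, ["R1", "R2", "R3"])
```

Reusing the left operand's register for the result keeps the number of live registers small, which hints at why register assignment matters so much.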