Intermediate Code
The task of a compiler is to translate the source program into machine code. However, it is not always practical to generate machine code directly in a single pass. Compilers therefore typically first translate the source program into an intermediate language: a representation that is simple to produce from the source and simple to translate further. Working from this intermediate form makes it easier to generate efficient target code.
Benefits of Intermediate Code Generation
Generating machine-independent intermediate code has several benefits:
A compiler for a different machine can be created by attaching a different back end to the existing front end.
A compiler for a different source language (on the same machine) can be created by providing a different front end for that language and reusing the existing back end.
A machine-independent code optimizer can be applied to the intermediate code to improve the code that is finally generated.
The role of the intermediate code generator in a compiler is depicted below.
Properties of Intermediate Languages
The intermediate language should be a simple representation of the source program that the compiler can generate efficiently.
The intermediate language should lead to efficient generation of target code.
The intermediate language should act as an effective mediator between the front end and the back end.
The intermediate language should be flexible enough to allow optimized code to be generated.
Forms of Intermediate Code
Abstract Syntax Tree
Polish Notation
Three Address Code
Abstract Syntax Tree
A syntax tree represents the natural hierarchical structure of the source program. A directed acyclic graph (DAG) is very similar to a syntax tree but more compact, because a subexpression that occurs more than once is represented by a single shared node.
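The difference between a syntax tree and a DAG can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name DagBuilder, its node method, and the example expression (b + c) * (b + c) are hypothetical and are not part of any particular compiler. Each node is looked up in a table before it is created, so an identical subexpression is reused instead of duplicated.

```python
class DagBuilder:
    """Hypothetical helper that shares identical expression nodes."""

    def __init__(self):
        self.nodes = []   # node id -> (op, left id, right id)
        self.index = {}   # (op, left id, right id) -> node id

    def node(self, op, left=None, right=None):
        """Return the id of the node (op, left, right), creating it only once."""
        key = (op, left, right)
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]


def build_example():
    """Build the DAG for (b + c) * (b + c)."""
    dag = DagBuilder()
    b = dag.node("b")
    c = dag.node("c")
    first_sum = dag.node("+", b, c)
    second_sum = dag.node("+", b, c)   # found in the table, not created again
    root = dag.node("*", first_sum, second_sum)
    return dag, first_sum, second_sum, root
```

A plain syntax tree for the same expression would need seven nodes (four leaves, two additions, one multiplication), while the DAG built here needs four, because both operands of the multiplication refer to the same addition node.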
Polish Notation
The linearization of a syntax tree is known as Polish notation. In this form each operator can be easily associated with its operands, which makes expression evaluation straightforward. It is also called prefix notation, because the operator comes first, followed by its operands.
Example —
(a+b)*(c-d) // Can be written as
*+ab-cd
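As a minimal sketch of this linearization, the following Python function walks an expression tree in preorder (operator first, then left operand, then right operand) and produces the prefix form. The tuple encoding of the tree and the function name to_prefix are assumptions made for this illustration.

```python
def to_prefix(node):
    """Return the prefix (Polish) form of an expression tree.

    A leaf is a string such as "a"; an inner node is a tuple
    (operator, left subtree, right subtree).
    """
    if isinstance(node, str):
        return node
    op, left, right = node
    # Preorder traversal: operator, then left operand, then right operand.
    return op + to_prefix(left) + to_prefix(right)


# Tree for (a+b)*(c-d)
tree = ("*", ("+", "a", "b"), ("-", "c", "d"))
prefix = to_prefix(tree)   # "*+ab-cd"
```

The outermost operator of the expression, here the multiplication, is the root of the tree and therefore appears first in the output.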
Three Address Code
Three address code is a type of intermediate code that is easy to generate and easy to convert to machine code. Each instruction uses at most three addresses and one operator, and the value computed by each instruction is stored in a temporary variable generated by the compiler. The compiler decides the order of the operations, and that order is made explicit by the sequence of three address instructions.
General representation
a = b op c
Where a, b and c represent operands such as names, constants or compiler-generated temporaries, and op represents the operator.
Example —
// Writing a * -(b + c) with Three Address Code
t1 = b + c
t2 = uminus t1
t3 = a * t2
// Writing A=-B *(C/D) with Three Address Code
T1 = - B
T2 = C / D
T3 = T1 * T2
A = T3
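The translation into three address code can be sketched as a postorder walk over an expression tree: the operands of a node are translated first, and then one instruction with a fresh temporary is emitted for the node itself. The sketch below is an illustration only; the class TacGenerator, the function translate, and the tuple encoding of the tree are assumptions, not a prescribed implementation.

```python
class TacGenerator:
    """Hypothetical three address code generator for expression trees."""

    def __init__(self):
        self.code = []    # emitted instructions, in execution order
        self.count = 0    # number of temporaries created so far

    def new_temp(self):
        """Return a fresh compiler-generated temporary name: t1, t2, ..."""
        self.count += 1
        return f"t{self.count}"

    def gen(self, node):
        """Emit code for node and return the name that holds its value."""
        if isinstance(node, str):        # leaf: a name or a constant
            return node
        if len(node) == 2:               # unary node: (op, operand)
            op, operand = node
            arg = self.gen(operand)
            temp = self.new_temp()
            self.code.append(f"{temp} = {op} {arg}")
            return temp
        op, left, right = node           # binary node: (op, left, right)
        left_name = self.gen(left)
        right_name = self.gen(right)
        temp = self.new_temp()
        self.code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp


def translate(tree):
    """Return the list of three address instructions for an expression tree."""
    generator = TacGenerator()
    generator.gen(tree)
    return generator.code


# Tree for a * -(b + c), with unary minus written as "uminus"
tree = ("*", "a", ("uminus", ("+", "b", "c")))
for instruction in translate(tree):
    print(instruction)
```

For this tree the sketch prints t1 = b + c, then t2 = uminus t1, then t3 = a * t2, which is the same sequence as the first example above. Every instruction has at most one operator on its right-hand side, and the order of the instructions fixes the order of evaluation.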