Why Intermediate Code is the Compiler's best friend

Intermediate Code

The task of a compiler is to translate a source program into machine code. It is not always possible, however, to generate machine code directly in a single pass. Compilers therefore typically first translate the source into a simpler, easier-to-manipulate representation called intermediate code (or an intermediate language), from which efficient machine code can then be generated.

Benefits of Intermediate Code Generation

Generating machine-independent intermediate code has several benefits:

  1. A compiler for a different target machine can be created by attaching a new back end for that machine to the existing front end.

  2. A compiler for a different source language (on the same machine) can be created by providing a new front end for that language and reusing the existing back end.

  3. A machine-independent code optimizer can be applied to the intermediate code, so the same optimizations improve the generated code for every target machine.
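As a rough illustration of the third benefit, the sketch below folds constants in a tiny intermediate representation without knowing anything about the target machine. The tuple layout (result, arg1, op, arg2), the "copy" operator, and the function name fold_constants are assumptions made for this example, not a standard format.

```python
import operator

# Operators this sketch knows how to evaluate at compile time.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(code):
    """Return a new instruction list in which every operation on two
    integer literals is replaced by a copy of the computed value.

    Each instruction is a hypothetical (result, arg1, op, arg2) tuple.
    """
    out = []
    for result, arg1, op, arg2 in code:
        if op in FOLDABLE and isinstance(arg1, int) and isinstance(arg2, int):
            out.append((result, FOLDABLE[op](arg1, arg2), "copy", None))
        else:
            out.append((result, arg1, op, arg2))
    return out

# t1 = 4 * 8 ; t2 = x + t1
ir = [("t1", 4, "*", 8), ("t2", "x", "+", "t1")]
optimized = fold_constants(ir)
for instr in optimized:
    print(instr)
```

Because the pass only reads and writes intermediate code, any back end attached afterwards would receive the already simplified instructions.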

The role of intermediate code generator in compiler is depicted below.

Properties of Intermediate Languages

  1. The intermediate language should be a simple form of the source program that the compiler can generate efficiently.

  2. It should lend itself to the generation of efficient target code.

  3. It should act as an effective mediator between the front end and the back end.

  4. It should be flexible enough to allow optimized code to be generated.

Forms of Intermediate Code

  1. Abstract Syntax Tree

  2. Polish Notation

  3. Three Address Code

Abstract Syntax Tree

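A syntax tree represents the natural hierarchical structure of the source program: each interior node is an operator and its children are its operands. A directed acyclic graph (DAG) is very similar to a syntax tree but more compact, because a common subexpression is represented by a single shared node.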
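The difference between the two forms can be sketched in a few lines. The Node class and DagBuilder helper below are hypothetical names chosen for this example; the builder simply returns an already existing node whenever an identical one is requested, which is one common way to obtain a DAG.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    op: str                          # operator symbol, or the name of a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

class DagBuilder:
    """Hands back an existing node when an identical one was built before."""
    def __init__(self):
        self._nodes = {}

    def make(self, op, left=None, right=None):
        key = (op, left, right)
        if key not in self._nodes:
            self._nodes[key] = Node(op, left, right)
        return self._nodes[key]

    def size(self):
        return len(self._nodes)

def count_tree_nodes(node):
    """Number of nodes the expression would need as a plain syntax tree."""
    if node is None:
        return 0
    return 1 + count_tree_nodes(node.left) + count_tree_nodes(node.right)

# Build (a + b) * (a + b); the subexpression a + b occurs twice.
dag = DagBuilder()
a = dag.make("a")
b = dag.make("b")
s1 = dag.make("+", a, b)
s2 = dag.make("+", a, b)             # same object as s1
root = dag.make("*", s1, s2)

print(count_tree_nodes(root))        # nodes in the tree form
print(dag.size())                    # nodes in the DAG form
```

For this expression the tree form needs seven nodes, while the DAG stores only four, since a, b, and a + b are each kept once.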

Polish Notation

Polish notation is a linearization of the syntax tree. It is also called prefix notation because each operator appears first, followed by its operands. An operator is therefore easily associated with its operands and no parentheses are needed, which makes this form convenient for expression evaluation.

Example

(a+b)*(c-d) // Can be written as

*+ab-cd
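The linearization can be sketched as a preorder walk of the syntax tree. The nested-tuple tree encoding and the function names to_prefix and eval_prefix are assumptions made for this example; the second function shows how a prefix string can be evaluated from left to right without parentheses.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def to_prefix(node):
    """Preorder walk: operator first, then the left and right operands."""
    if isinstance(node, str):        # leaf: a variable name
        return node
    op, left, right = node
    return f"{op} {to_prefix(left)} {to_prefix(right)}"

def eval_prefix(tokens, env):
    """Evaluate a list of prefix tokens, looking variables up in env."""
    it = iter(tokens)

    def parse():
        tok = next(it)
        if tok in OPS:               # an operator is followed by two operands
            left = parse()
            right = parse()
            return OPS[tok](left, right)
        return env[tok]

    return parse()

# (a + b) * (c - d) as a nested tuple (op, left, right)
tree = ("*", ("+", "a", "b"), ("-", "c", "d"))
prefix = to_prefix(tree)
print(prefix)                        # * + a b - c d

value = eval_prefix(prefix.split(), {"a": 1, "b": 2, "c": 7, "d": 3})
print(value)                         # (1 + 2) * (7 - 3) = 12
```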

Three Address Code

Three address code is a type of intermediate code that is easy to generate and easy to convert to machine code. Each instruction uses at most three addresses and one operator, and the value it computes is stored in a temporary variable generated by the compiler. The sequence of three address instructions makes the order of evaluation explicit.

General representation

a = b op c

where a, b, and c are operands such as names, constants, or compiler-generated temporaries, and op is an operator.

Example

// Writing a * - (b + c) in three address code

t1 = b + c
t2 = uminus t1
t3 = a * t2

// Writing A = -B * (C / D) in three address code

T1 = uminus B
T2 = C / D
T3 = T1 * T2
A = T3
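A translation like the ones above can be produced by a postorder walk of the syntax tree that allocates a fresh temporary for every operator. The sketch below is a minimal illustration; the nested-tuple tree encoding and the names gen_tac and translate are assumptions made for this example.

```python
from itertools import count

def gen_tac(node, code, temps):
    """Append instructions for node to code and return the name
    (variable or temporary) that holds its value."""
    if isinstance(node, str):                # leaf: already a name
        return node
    if len(node) == 2:                       # unary: (op, operand)
        op, operand = node
        arg = gen_tac(operand, code, temps)
        result = f"t{next(temps)}"
        code.append(f"{result} = {op} {arg}")
        return result
    op, left, right = node                   # binary: (op, left, right)
    lhs = gen_tac(left, code, temps)
    rhs = gen_tac(right, code, temps)
    result = f"t{next(temps)}"
    code.append(f"{result} = {lhs} {op} {rhs}")
    return result

def translate(node):
    """Return the three address instructions for an expression tree."""
    code = []
    gen_tac(node, code, count(1))            # temporaries are numbered from t1
    return code

# a * - (b + c)
expr = ("*", "a", ("uminus", ("+", "b", "c")))
for line in translate(expr):
    print(line)
```

Run on a * - (b + c), the sketch emits t1 = b + c, t2 = uminus t1, and t3 = a * t2, because operands are always translated before the operator that uses them.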