Skip to content

Lecture No. 02

Dated: 13-03-2025

Two Pass Compiler

cs606_e_2_1.svg

Structure of a 2 pass compiler.

Immediate advantage is that it admits multiple frontends and multiple passes.
The algorithms employed in the frontend have polynomial time complexity1

cs606_e_2_2.svg

Cross Compilation

Imagine three machines

  • Windows IA-32 (32 bit version of x86_64 architecture)
  • Mac OS x86_64
  • Android ARM

Now imagine the source code on windows machine. We use MSVC to compile the code for the windows platform. Then we use gcc to compile this into x86_64 platform.
Now this x86_64 version can be recompiled into ARM code on the Mac machine.

Front End

cs606_e_2_3.svg

Front end

The front end rejects the illegal code and reports the error in a useful way. For the legal code, it produces IR and storage allocation maps for various data structures declared in the program.
The front end is made up for following parts

Scanner

The word is the basic unit of syntax.
A token is a pair of word and its part of speech(<token type, word>).
Typical tokens are:

  • Numbers
  • Identifiers
  • new
  • while
  • if
  • +
  • -
    etc

The scanner takes the input program and breaks it down to tokens.

Example

x = x + y

is converted to

<id, x>
<assign, =>
<id, x>
<op, +>
<id, y>

Parser

The parser takes in the stream of tokens, recognizes context free syntax and reports errors. It guides the context-sensitive (semantic) analysis for tasks like type checking and outputs intermediate representation.

The syntax of most programming languages is specified using Context Free Grammars(CFG).
Context Free syntax is specified with grammar \(G=(S, N, T, P)\).

  • \(S\) is for starting symbol.
  • \(N\) is a set2 of non terminal symbols.
  • \(T\) is a set2 of terminal symbols.
  • \(P\) is a set2 of productions or rewrite rules.

Example

Context Free Grammar for arithmetic expressions is

1: goal -> expr
2: expr -> expr op term
3:      |  term
4: term -> number
5:      |  id
6: op   -> +
7:      |  -
\[S = \text{goal}\]
\[T = \{\text{number}, \text{id}, +, -\}\]
\[N= \{\text{goal}, \text{expr}, \text{term}, \text{op}\}\]
\[P = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 0\}\]

Consider the sentence

\[x + 2 - y\]
Sr Production Result
goal
1 goal → expr expr
2 expr → expr op term expr op term
5 term → id expr op y
7 op → - expr - y
2 expr → expr op term expr op term - y
4 term → number expr op 2 - y
6 op → + expr + 2 - y
3 expr → term term + 2 - y
5 term → id x + 2 - y

To recognize a valid sentence in some CFG, we reverse this process and buildup a parse.

References


  1. Read more about running time analysis

  2. Read more about sets