20. Perspective Projection
Dated: 14-05-2025
How do we map a 3D point in space onto a pixel?
To make the task easier, we will impose the following restrictions:
- The point of view (POV) must lie on the \(z\) axis.
- The screen plane must be parallel to the \(xy\) plane.
- The left and right edges of the screen should be parallel to the \(y\) axis.
- The top and bottom edges of the screen should be parallel to the \(x\) axis.
- The \(z\) axis should pass through the middle of the screen.
There are 2 approaches (using the left-handed coordinate system):
- The screen center lies at the origin and the POV lies at \((0, 0, -z)\).
- The screen center lies at \((0, 0, z)\) and the POV lies at the origin.
The 2nd approach is more convenient once we add features that let the POV move around the 3D world or let objects move around in it.
\(\Delta ABS \sim \Delta ACP\)
\(|\overline{AB}|\) is also called the scaling factor.
Now the task is to map the world coordinates \((x, y, z)\) onto the screen pixels \((x^\prime, y^\prime)\).
Here \(s = |\overline{AB}|\)
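The relations implied by the similar triangles can be written out explicitly. This is a reconstruction under assumptions: it uses the 2nd approach (POV at the origin), and it takes \(A\) to be the POV, \(B\) the screen center, \(S\) the projected point on the screen, \(P\) the world point and \(C\) the point on the \(z\) axis at the same depth as \(P\). Comparing corresponding sides then gives:

\[
\frac{x^\prime}{s} = \frac{x}{z} \;\Longrightarrow\; x^\prime = \frac{s\,x}{z},
\qquad
\frac{y^\prime}{s} = \frac{y}{z} \;\Longrightarrow\; y^\prime = \frac{s\,y}{z}
\]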
Because both \(x^\prime\) and \(y^\prime\) depend on the same \(s\), these equations are limited to square screens.
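The division by \(z\) used above is commonly called the perspective divide. Z buffering is a related but separate technique: it stores a depth value per pixel so that nearer surfaces hide farther ones.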
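The mapping from \((x, y, z)\) to \((x^\prime, y^\prime)\) can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `project_point` is hypothetical, the POV is assumed to sit at the origin, and the formulas \(x^\prime = s x / z\) and \(y^\prime = s y / z\) are the usual similar-triangle relations.

```python
def project_point(x, y, z, s):
    """Project a point onto a screen plane at distance s from the POV.

    Assumes the POV is at the origin and looks down the +z axis
    (left-handed system), so only points with z > 0 are visible.
    """
    if z <= 0:
        raise ValueError("point must lie in front of the POV (z > 0)")
    # Similar triangles: x' / s = x / z and y' / s = y / z.
    return (s * x / z, s * y / z)


# The same offset from the axis shrinks on screen as the point moves away.
near_point = project_point(1.0, 2.0, 2.0, s=1.0)
far_point = project_point(1.0, 2.0, 4.0, s=1.0)
```

Doubling the depth halves both screen coordinates, which is the foreshortening effect that perspective projection is meant to produce.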
The Perspective Projection Matrix
Instead of using \(s\), we will use the horizontal field of view (fov).
This allows us to calculate the screen height based on the screen width and aspect ratio.
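As a rough sketch of how these quantities relate, the snippet below assumes that the aspect ratio means width divided by height and that the screen's half-width is normalized to 1, so the scale factor becomes \(1 / \tan(\text{fov} / 2)\). Both function names are hypothetical.

```python
import math


def screen_height(width, aspect_ratio):
    """Height of the screen, assuming aspect_ratio = width / height."""
    return width / aspect_ratio


def scale_from_fov(fov_degrees):
    """Scale factor for a horizontal field of view.

    Assumes the screen spans [-1, 1] horizontally, so the distance from
    the POV to the screen is 1 / tan(fov / 2).
    """
    return 1.0 / math.tan(math.radians(fov_degrees) / 2.0)


height = screen_height(1920, 16 / 9)
scale_90 = scale_from_fov(90.0)
scale_60 = scale_from_fov(60.0)
```

A narrower field of view gives a larger scale factor, which matches the intuition that zooming in magnifies the scene.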
With z buffering, depth values are only meaningful inside the range \([z_\text{near}, z_\text{far}]\), so objects outside this range are clipped.
With these parameters, the following projection matrix can be made.
Let's perform a sanity check.
To extract \((x, y, z)\) from \(\vec P^\prime\), the homogeneous component \(\vec P^\prime_{14}\) should be \(1\), but in this case it is set to \(z\).
To normalize this, divide every component of \(\vec P^\prime\) by \(z\).
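The sanity check and the normalization step can be tried out numerically. The sketch below is an assumption-laden illustration rather than the matrix from this page: it uses one common row-vector, left-handed layout in which the fourth component of the result equals the input \(z\), and all function names are hypothetical.

```python
import math


def perspective_matrix(fov_deg, aspect, z_near, z_far):
    """One common row-vector, left-handed perspective matrix (assumed layout)."""
    sx = 1.0 / math.tan(math.radians(fov_deg) / 2.0)  # horizontal scale
    sy = sx * aspect               # vertical scale, aspect = width / height
    q = z_far / (z_far - z_near)   # sends z_near to 0 and z_far to 1 after the divide
    return [
        [sx, 0.0, 0.0, 0.0],
        [0.0, sy, 0.0, 0.0],
        [0.0, 0.0, q, 1.0],
        [0.0, 0.0, -q * z_near, 0.0],
    ]


def transform(point, m):
    """Multiply the row vector (x, y, z, 1) by the 4x4 matrix m."""
    row = (point[0], point[1], point[2], 1.0)
    return tuple(sum(row[k] * m[k][j] for k in range(4)) for j in range(4))


def perspective_divide(p):
    """Divide by the homogeneous component so that it becomes 1."""
    x, y, z, w = p
    return (x / w, y / w, z / w)


m = perspective_matrix(90.0, 1.0, 1.0, 10.0)
clip = transform((2.0, 1.0, 4.0), m)   # fourth component equals the input z
ndc = perspective_divide(clip)         # normalized coordinates
depth_near = perspective_divide(transform((0.0, 0.0, 1.0), m))[2]
depth_far = perspective_divide(transform((0.0, 0.0, 10.0), m))[2]
```

With this layout, points on the near plane end up at depth 0 and points on the far plane at depth 1, which is a convenient range for depth comparisons.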
Now the last piece of the puzzle is to develop the pipeline.
To render a scene, you set up three matrices:
- World Matrix - Responsible for transforming local coordinates of the object to global coordinates of the world.
- View Matrix - Responsible for transforming global coordinates of the world to space relative to the viewer.
- Projection Matrix - Responsible for transforming viewer space coordinates to 2D screen pixel coordinates.
We can use composition here as well.
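To illustrate the composition, the sketch below multiplies a world, a view and a projection matrix into a single matrix and checks that it gives the same result as applying the three in sequence. It assumes the row-vector convention (the point is multiplied on the left), uses plain translations for the world and view matrices, and uses the simplest possible perspective matrix. All names and values are made up for the example.

```python
def mat_mul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    """Row-vector translation matrix: the offsets live in the last row."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [tx, ty, tz, 1.0],
    ]


def transform(row, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return tuple(sum(row[k] * m[k][j] for k in range(4)) for j in range(4))


world = translation(0.0, 0.0, 5.0)   # place the object 5 units ahead
view = translation(-1.0, 0.0, 0.0)   # camera sits at x = 1, so shift the world by -1
projection = [                       # simplest perspective: copies z into w
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Compose once, then reuse the single matrix for every vertex.
world_view_projection = mat_mul(mat_mul(world, view), projection)

local_point = (1.0, 2.0, 0.0, 1.0)
step_by_step = transform(transform(transform(local_point, world), view), projection)
composed = transform(local_point, world_view_projection)
```

Composing the matrices once per object means each vertex needs only one matrix multiplication instead of three.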
The Perspective Projection Matrix Used by Microsoft Direct3D
The projection matrix is typically a scale and perspective projection.
The projection transformation converts the viewing frustum into a cuboid shape.
Because the near end of the viewing frustum is smaller than the far end, this has the effect of expanding objects that are near to the camera.
The Viewing Frustum
The composition looks like
We can redefine the following parameters as
\(\text{fov}_w\) and \(\text{fov}_h\) represent the viewport's horizontal and vertical fields of view.
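For reference, the form commonly given for the Direct3D perspective matrix can be written as follows. This is a sketch in the row-vector convention; \(z_f\) (the far clipping plane) and the shorthand names \(w\), \(h\) and \(Q\) are assumed here and may not match the notation used in the original equations.

\[
\begin{bmatrix}
w & 0 & 0 & 0 \\
0 & h & 0 & 0 \\
0 & 0 & Q & 1 \\
0 & 0 & -Q z_n & 0
\end{bmatrix},
\qquad
w = \cot\!\left(\frac{\text{fov}_w}{2}\right),
\quad
h = \cot\!\left(\frac{\text{fov}_h}{2}\right),
\quad
Q = \frac{z_f}{z_f - z_n}
\]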
Here \(z_n\) represents the \(z\) position of the near clipping plane.
\(V_h\) and \(V_w\) represent the height and width of the viewport respectively, in camera space.
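The two ways of choosing the scale terms, from the field-of-view angles or from the viewport size, can be compared numerically. The sketch below assumes the commonly documented Direct3D relations \(w = \cot(\text{fov}_w / 2) = 2 z_n / V_w\) and \(h = \cot(\text{fov}_h / 2) = 2 z_n / V_h\), with the viewport taken to lie on the near clipping plane. The function names and the far plane value are hypothetical.

```python
import math


def d3d_projection(w, h, z_near, z_far):
    """Row-vector, left-handed perspective matrix in the Direct3D style."""
    q = z_far / (z_far - z_near)
    return [
        [w, 0.0, 0.0, 0.0],
        [0.0, h, 0.0, 0.0],
        [0.0, 0.0, q, 1.0],
        [0.0, 0.0, -q * z_near, 0.0],
    ]


def scales_from_fov(fov_w, fov_h):
    """Scale terms from the fields of view (in radians): cot(fov / 2)."""
    return 1.0 / math.tan(fov_w / 2.0), 1.0 / math.tan(fov_h / 2.0)


def scales_from_viewport(v_w, v_h, z_near):
    """Scale terms from the viewport size measured on the near plane."""
    return 2.0 * z_near / v_w, 2.0 * z_near / v_h


z_near, z_far = 1.0, 100.0
fov_w, fov_h = math.radians(90.0), math.radians(60.0)

# Viewport size on the near plane that corresponds to these angles.
v_w = 2.0 * z_near * math.tan(fov_w / 2.0)
v_h = 2.0 * z_near * math.tan(fov_h / 2.0)

from_fov = d3d_projection(*scales_from_fov(fov_w, fov_h), z_near, z_far)
from_viewport = d3d_projection(*scales_from_viewport(v_w, v_h, z_near), z_near, z_far)
```

Both parameterizations produce the same matrix up to rounding, so the choice between them is a matter of which inputs are more convenient to specify.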