
42. Least Squares Solution

Dated: 20-06-2025

Least Squares Solution

\[A \vec x = \vec b\]

The least squares solution is the value of \(\vec x\) which makes \(A \vec x\) the closest point to \(\vec b\) in \(\text{Col } A\).

Solution of the General Least Squares Problem

Apply the best approximation theorem1 to the subspace2 \(\text{Col }A\):

\[\vec b^\prime = \text{Proj}_{\text{Col }A} \vec b\]

Since \(\vec b^\prime \in \text{Col } A\), the equation \(A \vec x = \vec b^\prime\) is consistent3 and there is a solution \(\vec x^\prime \in \mathbb R^n\) such that

\[A \vec x^\prime = \vec b^\prime\]

Since \(\vec b^\prime\) is the closest point in \(\text{Col }A\) to \(\vec b\), a vector \(\vec x^\prime\) is a least squares solution of \(A \vec x = \vec b\) if and only if \(\vec x^\prime\) satisfies \(A \vec x^\prime = \vec b^\prime\).
Such an \(\vec x^\prime \in \mathbb R^n\) is a list of weights that builds \(\vec b^\prime\) out of the columns of \(A\).
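As a quick numerical illustration (a minimal NumPy sketch; the particular \(A\) and \(\vec b\) are made-up for this example, not from the note), the overdetermined system below has no exact solution, and the least squares weights \(\vec x^\prime\) combine the columns of \(A\) into the projection \(\vec b^\prime\):

```python
import numpy as np

# Made-up overdetermined system: 3 equations, 2 unknowns, no exact solution.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Least squares solution x': the weights that build b' = Proj_{Col A} b
# out of the columns of A.
x_prime, *_ = np.linalg.lstsq(A, b, rcond=None)
b_prime = A @ x_prime                 # projection of b onto Col A

print(x_prime)                        # [ 5. -3.]
print(b_prime)                        # [ 5.  2. -1.], closest point to b in Col A
print(np.linalg.norm(b - b_prime))    # least squares error, sqrt(6)
```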

Normal Equations for \(\vec x^\prime\)

Suppose that \(\vec x^\prime\) satisfies \(A \vec x^\prime = \vec b^\prime\). Then, by the orthogonal decomposition theorem,4 \(\vec b\) has the property that \((\vec b - \vec b^\prime) \perp \text{Col }A\).

\[\because A \vec x^\prime = \vec b^\prime\]
\[\therefore (\vec b - A \vec x^\prime) \perp \text{Col } A\]

This means that \(\vec b - A \vec x^\prime\) is perpendicular to each column of \(A\).
Let \(a_j\) be a column of \(A\).

\[\implies a_j \cdot (\vec b - A \vec x^\prime) = 0\]
\[\implies a_j^T \cdot (\vec b - A \vec x^\prime) = 0\]

Since \(a^T_j\) is a row of \(A^T\), collecting these equations over all columns of \(A\) gives

\[A^T (\vec b - A \vec x^\prime) = \vec 0\]
\[A^T \vec b - A^TA \vec x^\prime = \vec 0\]
\[A^T \vec b = A^TA \vec x^\prime\]

This matrix equation represents a system of linear equations5 called the normal equations for \(\vec x^\prime\).
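The normal equations give a direct way to compute \(\vec x^\prime\): form \(A^TA\) and \(A^T \vec b\) and solve the resulting \(n \times n\) system. A minimal sketch, reusing the made-up \(A\) and \(\vec b\) from above:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Form and solve the normal equations  A^T A x' = A^T b.
AtA = A.T @ A          # 2x2
Atb = A.T @ b          # length 2
x_prime = np.linalg.solve(AtA, Atb)

print(x_prime)         # [ 5. -3.], same as np.linalg.lstsq(A, b, rcond=None)
```

Note that forming \(A^TA\) squares the condition number of \(A\), which is why the QR-based theorem further below is often preferred in practice.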

Decomposition of \(\vec b\) into the sum of a vector6 from \(\text{Col } A\) and a vector6 perpendicular to \(\text{Col }A\)

\[\vec b = A \vec x^\prime + (\vec b - A \vec x^\prime)\]
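A quick numerical check of this decomposition (same assumed \(A\) and \(\vec b\) as above): the residual \(\vec b - A \vec x^\prime\) should satisfy \(A^T(\vec b - A \vec x^\prime) = \vec 0\), i.e. it is orthogonal to every column of \(A\).

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

x_prime = np.linalg.solve(A.T @ A, A.T @ b)
b_hat = A @ x_prime                   # the part of b that lies in Col A
r = b - b_hat                         # the part of b perpendicular to Col A

print(np.allclose(b, b_hat + r))      # True: b = A x' + (b - A x')
print(A.T @ r)                        # ~[0. 0.]: residual is orthogonal to Col A
```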

Definition

If \(A\) is an \(m \times n\) matrix7 and \(\vec b \in \mathbb R^m\), a least squares solution of \(A \vec x = \vec b\) is an \(\vec x^\prime \in \mathbb R^n\) such that

\[||\vec b - A \vec x^\prime|| \le ||\vec b - A \vec x|| \quad \forall \vec x \in \mathbb R^n\]

Theorem

The matrix7 \(A^TA\) is invertible8 if and only if the columns of \(A\) are linearly independent.9 In this case,

\[\vec x^\prime = (A^TA)^{-1} A^T \vec b\]
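A direct transcription of this formula (a sketch only, reusing the made-up \(A\) and \(\vec b\); forming the explicit inverse is fine for a tiny example, but numerically it is better to use a solver or np.linalg.lstsq):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Columns of A are linearly independent, so A^T A is invertible and
# the least squares solution is unique: x' = (A^T A)^{-1} A^T b.
x_prime = np.linalg.inv(A.T @ A) @ A.T @ b
print(x_prime)    # [ 5. -3.]
```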

Theorem

Given an \(m \times n\) matrix7 \(A\) with linearly independent columns,9 let \(A = QR\) be a QR factorization of \(A\). Then for each \(\vec b \in \mathbb R^m\), the equation \(A \vec x = \vec b\) has a unique least squares solution, given by

\[\vec x^\prime = R^{-1} Q^T \vec b\]
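A sketch of the QR route using NumPy's reduced QR factorization (same made-up \(A\) and \(\vec b\)); since \(R\) is upper triangular, the system \(R \vec x^\prime = Q^T \vec b\) is solved directly rather than forming \(R^{-1}\) explicitly:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Reduced QR: Q (3x2) has orthonormal columns, R (2x2) is upper triangular.
Q, R = np.linalg.qr(A)

# x' = R^{-1} Q^T b, computed by solving the triangular system R x' = Q^T b.
x_prime = np.linalg.solve(R, Q.T @ b)
print(x_prime)    # [ 5. -3.]
```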

References

Read more about notations and symbols.


  1. Read more about best approximation theorem

  2. Read more about vector spaces

  3. Read more about consistency of a linear system

  4. Read more about orthogonal decomposition

  5. Read more about linear systems

  6. Read more about vectors

  7. Read more about matrices

  8. Read more about invertible matrices

  9. Read more about linear dependency