1. 投影(Projections)
1.1 投影到直线上(Projection onto a Line)
P r o j L ( x ⃗ ) = c v ⃗ x ⃗ − P r o j L ( x ⃗ ) i s o r t h o g o n a l t o v ⃗ ( x ⃗ − P r o j L ( x ⃗ ) ) ⋅ v ⃗ = 0 ( x ⃗ − c v ⃗ ) ⋅ v ⃗ = 0 x ⃗ ⋅ v ⃗ = c v ⃗ ⋅ v ⃗ c = x ⃗ ⋅ v ⃗ v ⃗ ⋅ v ⃗ P r o j L ( x ⃗ ) = c v ⃗ = ( x ⃗ ⋅ v ⃗ v ⃗ ⋅ v ⃗ ) v ⃗ Proj_L(\vec{x})=c \vec{v} \\ ~\\ \vec{x}-Proj_L(\vec{x})\ is\ orthogonal\ to\ \vec{v}\\ ~\\ (\vec{x}-Proj_L(\vec{x}))\cdot\vec{v}=0\\ ~\\ (\vec{x}-c \vec{v})\cdot\vec{v}=0\\ ~\\ \vec{x}\cdot\vec{v}=c \vec{v}\cdot\vec{v}\\ ~\\ c=\frac{\vec{x}\cdot\vec{v}}{\vec{v}\cdot\vec{v}}\\ ~\\ Proj_L(\vec{x})=c \vec{v}=(\frac{\vec{x}\cdot\vec{v}}{\vec{v}\cdot\vec{v}})\vec{v} ProjL(x)=cv x−ProjL(x) is orthogonal to v (x−ProjL(x))⋅v=0 (x−cv)⋅v=0 x⋅v=cv⋅v c=v⋅vx⋅v ProjL(x)=cv=(v⋅vx⋅v)v
P r o j L ( x ⃗ ) = ( x ⃗ ⋅ v ⃗ v ⃗ ⋅ v ⃗ ) v ⃗ = ( x ⃗ ⋅ v ⃗ ∣ ∣ v ⃗ ∣ ∣ 2 ) v ⃗ = x ⃗ ⋅ v ⃗ ∣ ∣ v ⃗ ∣ ∣ ⋅ v ⃗ ∣ ∣ v ⃗ ∣ ∣ = ( x ⃗ ⋅ u ^ ) u ^ ( w h e r e u ^ i s a u n i t v e c t o r ) Proj_L(\vec{x})=(\frac{\vec{x}\cdot\vec{v}}{\vec{v}\cdot\vec{v}})\vec{v}=(\frac{\vec{x}\cdot\vec{v}}{||\vec{v}||^2})\vec{v}=\vec{x}\cdot\frac{\vec{v}}{||\vec{v}||}\cdot\frac{\vec{v}}{||\vec{v}||}=(\vec{x}\cdot\hat{u})\hat{u}(where\ \hat{u}\ is\ a\ unit\ vector ) ProjL(x)=(v⋅vx⋅v)v=(∣∣v∣∣2x⋅v)v=x⋅∣∣v∣∣v⋅∣∣v∣∣v=(x⋅u^)u^(where u^ is a unit vector)
现将投影公式写为矩阵乘积形式
P r o j L ( x ⃗ ) = ( x ⃗ ⋅ u ^ ) u ^ ( w h e r e u ^ i s a u n i t v e c t o r ) P r o j L : R 2 → R 2 u ^ = [ u 1 u 2 ] [ 1 0 0 1 ] A = [ ( [ 1 0 ] ⋅ [ u 1 u 2 ] ) [ u 1 u 2 ] ( [ 0 1 ] ⋅ [ u 1 u 2 ] ) [ u 1 u 2 ] ] A = [ u 1 [ u 1 u 2 ] u 2 [ u 1 u 2 ] ] = [ u 1 2 u 2 u 1 u 1 u 2 u 2 2 ] Proj_L(\vec{x})=(\vec{x}\cdot\hat{u})\hat{u}(where\ \hat{u}\ is\ a\ unit\ vector )\\ ~\\ Proj_L:\mathbb{R}^2\rightarrow\mathbb{R}^2\\ ~\\ \hat{u}=\begin{bmatrix}u_1\\ u_2\end{bmatrix}\\ ~\\ \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\\ ~\\ A=\begin{bmatrix} \bigg(\begin{bmatrix}1\\ 0\end{bmatrix}\cdot\begin{bmatrix}u_1\\u_2\end{bmatrix}\bigg)\begin{bmatrix}u_1\\ u_2\end{bmatrix}\quad \bigg(\begin{bmatrix}0\\ 1\end{bmatrix}\cdot\begin{bmatrix}u_1\\u_2\end{bmatrix}\bigg)\begin{bmatrix}u_1\\ u_2\end{bmatrix} \end{bmatrix}\\ ~\\ A=\begin{bmatrix} u_1\begin{bmatrix}u_1\\u_2\end{bmatrix}\quad u_2\begin{bmatrix}u_1\\u_2\end{bmatrix} \end{bmatrix}=\begin{bmatrix}u_1^2 & u_2u_1\\ u_1u_2 & u_2^2\end{bmatrix} ProjL(x)=(x⋅u^)u^(where u^ is a unit vector) ProjL:R2→R2 u^=[u1u2] [1001] A=[([10]⋅[u1u2])[u1u2]([01]⋅[u1u2])[u1u2]] A=[u1[u1u2]u2[u1u2]]=[u12u1u2u2u1u22]
例子:
v ⃗ = [ 2 1 ] x ⃗ = [ 2 3 ] ∣ ∣ v ⃗ ∣ ∣ = 2 2 + 1 2 = 5 u ^ = v ⃗ ∣ ∣ v ⃗ ∣ ∣ = 1 5 [ 2 1 ] = [ 2 / 5 1 / 5 ] A = [ 4 / 5 2 / 5 2 / 5 1 / 5 ] P r o j L ( x ⃗ ) = A x ⃗ = [ 4 / 5 2 / 5 2 / 5 1 / 5 ] [ 2 3 ] = [ 14 / 5 7 / 5 ] = 7 5 [ 2 1 ] \vec{v}=\begin{bmatrix}2\\ 1\end{bmatrix}\quad \vec{x}=\begin{bmatrix}2\\ 3\end{bmatrix}\\ ~\\ ||\vec{v}||=\sqrt{2^2+1^2}=\sqrt{5}\\ ~\\ \hat{u}=\frac{\vec{v}}{||\vec{v}||}=\frac{1}{\sqrt{5}}\begin{bmatrix}2\\ 1\end{bmatrix}=\begin{bmatrix}2/\sqrt{5}\\ 1/\sqrt{5}\end{bmatrix}\\ ~\\ A=\begin{bmatrix}4/5 & 2/5\\ 2/5 & 1/5 \end{bmatrix}\\ ~\\ Proj_L(\vec{x})=A\vec{x}=\begin{bmatrix}4/5 & 2/5\\ 2/5 & 1/5 \end{bmatrix}\begin{bmatrix}2\\ 3\end{bmatrix}=\begin{bmatrix}14/5\\ 7/5\end{bmatrix}=\frac{7}{5}\begin{bmatrix}2\\ 1\end{bmatrix} v=[21]x=[23] ∣∣v∣∣=22+12=5 u^=∣∣v∣∣v=51[21]=[2/51/5] A=[4/52/52/51/5] ProjL(x)=Ax=[4/52/52/51/5][23]=[14/57/5]=57[21]
1.2 投影到子空间上(Projection onto a subspace)
下图展示了某个矩阵A的行空间和零空间
下图展示了某个矩阵A的列空间和左零空间
上面两张图只是为了对子空间中投影能有个直观了解,高维度空间毕竟无法进行直观可视化,所以下面只能以文字描述
笔记来源:A projection onto a subspace is a linear transformation | Linear Algebra | Khan Academy
V V V is subspace of R n \mathbb{R}^n Rn、 { b ⃗ 1 , b ⃗ 2 , ⋯ , b ⃗ k } \{\vec{b}_1,\vec{b}_2,\cdots,\vec{b}_k\} {b1,b2,⋯,bk} basis for V V V
子空间 V V V 中任一向量 a ⃗ ∈ V \vec{a}\in V a∈V
a ⃗ = y 1 b ⃗ 1 + y 2 b ⃗ 2 + ⋯ + y k b ⃗ k \vec{a}=y_1\vec{b}_1+y_2\vec{b}_2+\cdots+y_k\vec{b}_k a=y1b1+y2b2+⋯+ykbk
A y ⃗ = [ ⋮ ⋮ ⋮ ⋮ b ⃗ 1 b ⃗ 2 ⋯ b ⃗ k ⋮ ⋮ ⋮ ⋮ ] [ y 1 y 2 ⋮ y k ] = a ⃗ A\vec{y}=\begin{bmatrix}\vdots & \vdots & \vdots &\vdots\\ \vec{b}_1 & \vec{b}_2 & \cdots & \vec{b}_k\\ \vdots & \vdots & \vdots &\vdots \end{bmatrix} \begin{bmatrix} y_1\\ y_2\\ \vdots\\ y_k \end{bmatrix}=\vec{a} Ay=⎣⎢⎢⎡⋮b1⋮⋮b2⋮⋮⋯⋮⋮bk⋮⎦⎥⎥⎤⎣⎢⎢⎢⎡y1y2⋮yk⎦⎥⎥⎥⎤=a
For some vector x ⃗ ∈ R n \vec{x}\in\mathbb{R}^n x∈Rn、 y ⃗ ∈ R k \vec{y}\in\mathbb{R}^k y∈Rk、 a ⃗ ∈ V \vec{a}\in V a∈V、 P r o j V ( x ⃗ ) ∈ V Proj_{V}(\vec{x})\in V ProjV(x)∈V
P r o j V ( x ⃗ ) = A y ⃗ = a ⃗ Proj_{V}(\vec{x})=A\vec{y}=\vec{a} ProjV(x)=Ay=a
P r o j V ( x ⃗ ) ∈ V Proj_{V}(\vec{x})\in V ProjV(x)∈V、 x ⃗ − P r o j V ( x ⃗ ) ∈ V ⊥ \vec{x}-Proj_{V}(\vec{x})\in V^{\perp} x−ProjV(x)∈V⊥
x ⃗ = P r o j V ( x ⃗ ) + P r o j V ⊥ ( x ⃗ ) \vec{x}=Proj_V(\vec{x})+Proj_{V^{\perp}}(\vec{x}) x=ProjV(x)+ProjV⊥(x)
我们假设这里子空间 V V V 为列空间,其正交补集 V ⊥ V^{\perp} V⊥ 就为左零空间
P r o j V ( x ⃗ ) ∈ C ( A ) Proj_{V}(\vec{x})\in C(A) ProjV(x)∈C(A)、 x ⃗ − P r o j V ( x ⃗ ) ∈ N ( A T ) \vec{x}-Proj_{V}(\vec{x})\in N(A^T) x−ProjV(x)∈N(AT)
由上向量 x ⃗ − P r o j V ( x ⃗ ) \vec{x}-Proj_{V}(\vec{x}) x−ProjV(x) 在左零空间中,故:
A T ( x ⃗ − P r o j V ( x ⃗ ) ) = 0 ⃗ A T ( x ⃗ − A y ⃗ ) = 0 ⃗ A T x ⃗ = A T A y ⃗ y ⃗ = ( A T A ) − 1 A T x ⃗ P r o j V ( x ⃗ ) = A y ⃗ = A ( A T A ) − 1 A T x ⃗ P r o j e c t i o n M a t r i x P = A ( A T A ) − 1 A T A^T(\vec{x}-Proj_{V}(\vec{x}))=\vec{0}\\ ~\\ A^T(\vec{x}-A\vec{y})=\vec{0}\\ ~\\ A^T\vec{x}=A^TA\vec{y}\\ ~\\ \vec{y}=(A^TA)^{-1}A^T\vec{x}\\ ~\\ Proj_{V}(\vec{x})=A\vec{y}=A(A^TA)^{-1}A^T\vec{x}\\ ~\\ Projection\ Matrix\ P=A(A^TA)^{-1}A^T AT(x−ProjV(x))=0 AT(x−Ay)=0 ATx=ATAy y=(ATA)−1ATx ProjV(x)=Ay=A(ATA)−1ATx Projection Matrix P=A(ATA)−1AT
例子:
笔记来源:Subspace projection matrix example | Linear Algebra | Khan Academy
子空间 V V V
V = S p a n ( [ 1 0 0 1 ] , [ 0 1 0 1 ] ) , x ⃗ ∈ R 4 A = [ 1 0 0 1 0 0 1 1 ] P r o j V ( x ⃗ ) = A ( A T A ) − 1 A T x ⃗ V=Span(\begin{bmatrix}1\\ 0\\ 0\\ 1\end{bmatrix},\begin{bmatrix}0\\ 1\\ 0\\ 1\end{bmatrix}),\vec{x}\in\mathbb{R}^4\\ ~\\ A=\begin{bmatrix}1 & 0\\ 0&1\\ 0&0\\ 1&1\end{bmatrix}\\ ~\\ Proj_{V}(\vec{x})=A(A^TA)^{-1}A^T\vec{x}\\ V=Span(⎣⎢⎢⎡1001⎦⎥⎥⎤,⎣⎢⎢⎡0101⎦⎥⎥⎤),x∈R4 A=⎣⎢⎢⎡10010101⎦⎥⎥⎤ ProjV(x)=A(ATA)−1ATx
A T = [ 1 0 0 1 0 1 0 1 ] A A T = [ 1 0 0 1 0 0 1 1 ] [ 1 0 0 1 0 1 0 1 ] = [ 2 1 1 2 ] A^T=\begin{bmatrix}1 & 0 & 0 & 1\\ 0 & 1 & 0 &1\end{bmatrix}\\ ~\\ AA^T=\begin{bmatrix}1 & 0\\ 0&1\\ 0&0\\ 1&1\end{bmatrix}\begin{bmatrix}1 & 0 & 0 & 1\\ 0 & 1 & 0 &1\end{bmatrix}=\begin{bmatrix}2 & 1\\ 1 & 2\end{bmatrix} AT=[10010011] AAT=⎣⎢⎢⎡10010101⎦⎥⎥⎤[10010011]=[2112]
( A A T ) − 1 = [ 2 1 1 2 ] − 1 = 1 3 [ 2 − 1 − 1 2 ] (AA^T)^{-1}=\begin{bmatrix}2 & 1\\ 1 & 2\end{bmatrix}^{-1}=\frac{1}{3}\begin{bmatrix}2 & -1\\ -1 & 2\end{bmatrix} (AAT)−1=[2112]−1=31[2−1−12]
P r o j V ( x ⃗ ) = A ( A T A ) − 1 A T x ⃗ P r o j V ( x ⃗ ) = [ 1 0 0 1 0 0 1 1 ] 1 3 [ 2 − 1 − 1 2 ] [ 1 0 0 1 0 1 0 1 ] x ⃗ = 1 3 [ 2 − 1 0 1 − 1 2 0 1 0 0 0 0 1 1 0 2 ] x ⃗ Proj_{V}(\vec{x})=A(A^TA)^{-1}A^T\vec{x}\\ ~\\ Proj_{V}(\vec{x})=\begin{bmatrix}1 & 0\\ 0&1\\ 0&0\\ 1&1\end{bmatrix}\frac{1}{3}\begin{bmatrix}2 & -1\\ -1 & 2\end{bmatrix}\begin{bmatrix}1 & 0 & 0 & 1\\ 0 & 1 & 0 &1\end{bmatrix}\vec{x}=\frac{1}{3}\begin{bmatrix}2 & -1 & 0 & 1\\ -1 & 2 & 0 &1\\ 0 & 0 & 0 & 0\\ 1 & 1 & 0 & 2\end{bmatrix}\vec{x} ProjV(x)=A(ATA)−1ATx ProjV(x)=⎣⎢⎢⎡10010101⎦⎥⎥⎤31[2−1−12][10010011]x=31⎣⎢⎢⎡2−101−120100001102⎦⎥⎥⎤x
证明子空间投影矩阵和其正交补集投影矩阵的关系
P r o j V ( x ⃗ ) = A ( A T A ) − 1 A T x ⃗ = P x ⃗ Proj_{V}(\vec{x})=A(A^TA)^{-1}A^T\vec{x}=P\vec{x} ProjV(x)=A(ATA)−1ATx=Px
假设
P r o j V ⊥ ( x ⃗ ) = C x ⃗ Proj_{V^{\perp}}(\vec{x})=C\vec{x} ProjV⊥(x)=Cx
x ⃗ = P r o j V ( x ⃗ ) + P r o j V ⊥ ( x ⃗ ) = P x ⃗ + C x ⃗ I x ⃗ = P x ⃗ + C x ⃗ = ( P + C ) x ⃗ I = P + C C = I − P P = I − C \vec{x}=Proj_V(\vec{x})+Proj_{V^{\perp}}(\vec{x})=P\vec{x}+C\vec{x}\\ ~\\ I\vec{x}=P\vec{x}+C\vec{x}=(P+C)\vec{x}\\ ~\\ I=P+C\\ ~\\ C=I-P\\ P=I-C x=ProjV(x)+ProjV⊥(x)=Px+Cx Ix=Px+Cx=(P+C)x I=P+C C=I−PP=I−C
1.3 某向量在子空间中的投影是离其本身最近的子空间中的向量
笔记来源:Projection is closest vector in subspace | Linear Algebra | Khan Academy
∣ ∣ a ⃗ ∣ ∣ = ∣ ∣ x ⃗ − P r o j V ( x ⃗ ) ∣ ∣ ∣ ∣ x ⃗ − P r o j V ( x ⃗ ) ∣ ∣ ≤ ∣ ∣ x ⃗ − v ⃗ ∣ ∣ ||\vec{a}||=||\vec{x}-Proj_V(\vec{x})||\\ ~\\ ||\vec{x}-Proj_V(\vec{x})|| \leq ||\vec{x}-\vec{v}|| ∣∣a∣∣=∣∣x−ProjV(x)∣∣ ∣∣x−ProjV(x)∣∣≤∣∣x−v∣∣
证明:
∣ ∣ x ⃗ − v ⃗ ∣ ∣ 2 = ∣ ∣ b ⃗ + a ⃗ ∣ ∣ 2 = ( b ⃗ + a ⃗ ) ( b ⃗ + a ⃗ ) = b ⃗ ⋅ b ⃗ + 2 a ⃗ ⋅ b ⃗ + a ⃗ ⋅ a ⃗ = ∣ ∣ b ⃗ ∣ ∣ 2 + ∣ ∣ a ⃗ ∣ ∣ 2 ∣ ∣ x ⃗ − v ⃗ ∣ ∣ 2 = ∣ ∣ b ⃗ ∣ ∣ 2 + ∣ ∣ a ⃗ ∣ ∣ 2 ≥ ∣ ∣ a ⃗ ∣ ∣ 2 ∣ ∣ x ⃗ − v ⃗ ∣ ∣ 2 ≥ ∣ ∣ a ⃗ ∣ ∣ 2 ∣ ∣ x ⃗ − v ⃗ ∣ ∣ ≥ ∣ ∣ a ⃗ ∣ ∣ ||\vec{x}-\vec{v}||^2=||\vec{b}+\vec{a}||^2=(\vec{b}+\vec{a})(\vec{b}+\vec{a}) =\vec{b}\cdot\vec{b}+2\vec{a}\cdot\vec{b}+\vec{a}\cdot\vec{a}=||\vec{b}||^2+||\vec{a}||^2\\ ~\\ ||\vec{x}-\vec{v}||^2=||\vec{b}||^2+||\vec{a}||^2 \geq ||\vec{a}||^2\\ ~\\ ||\vec{x}-\vec{v}||^2 \geq ||\vec{a}||^2\\ ~\\ ||\vec{x}-\vec{v}|| \geq ||\vec{a}|| ∣∣x−v∣∣2=∣∣b+a∣∣2=(b+a)(b+a)=b⋅b+2a⋅b+a⋅a=∣∣b∣∣2+∣∣a∣∣2 ∣∣x−v∣∣2=∣∣b∣∣2+∣∣a∣∣2≥∣∣a∣∣2 ∣∣x−v∣∣2≥∣∣a∣∣2 ∣∣x−v∣∣≥∣∣a∣∣