Analysis of Linear Functions
\[
\newcommand{\ra}{\rightarrow}
\newcommand{\kC}{\mathcal{C}}
\newcommand{\IR}{\mathbb{R}}
\newcommand{\AINF}{\mathbb{A}}
\newcommand{\CINF}{\kC^\infty}
\newcommand{\half}{\frac{1}{2}}
\newcommand{\IFF}{\Leftrightarrow}
\newcommand{\inf}{\infty}
\newcommand{\Ind}{\mathbb{1}}
\newcommand{\IA}{\mathbb{A}}
\newcommand{\IB}{\mathbb{B}}
\newcommand{\IC}{\mathbb{C}}
\newcommand{\ID}{\mathbb{D}}
\newcommand{\IF}{\mathbb{F}}
\newcommand{\IH}{\mathbb{H}}
\newcommand{\II}{\mathbb{I}}
\newcommand{\IL}{\mathbb{L}}
\newcommand{\IN}{\mathbb{N}}
\newcommand{\IP}{\mathbb{P}}
\newcommand{\IQ}{\mathbb{Q}}
\newcommand{\IS}{\mathbb{S}}
\newcommand{\IV}{\mathbb{V}}
\newcommand{\IZ}{\mathbb{Z}}
\newcommand{\floor}[1]{\lfloor #1 \rfloor}
\newcommand{\ceil}[1]{\lceil #1 \rceil}
\newcommand{\set}[2]{\{\, #1 \;\vert\; #2 \,\}}
\newcommand{\Set}[2]{\left\{\, #1 \;\vert\; #2 \,\right\}}
\newcommand{\C}{\,\#}
\newcommand{\CSet}[2]{\#\{\, #1 \;\vert\; #2 \,\}}
\newcommand{\qtext}[1]{\quad\text{#1}\quad}
\newcommand{\stext}[1]{\;\text{#1}\;}
\newcommand{\dbrackets}[1]{ [\![ #1 ]\!]}
\newcommand{\KR}{\mathcal{R}}
\newcommand{\KA}{\mathcal{A}}
\newcommand{\KB}{\mathcal{B}}
\newcommand{\KC}{\mathcal{C}}
\newcommand{\KD}{\mathcal{D}}
\newcommand{\KF}{\mathcal{F}}
\newcommand{\KH}{\mathcal{H}}
\newcommand{\KI}{\mathcal{I}}
\newcommand{\KL}{\mathcal{L}}
\newcommand{\KN}{\mathcal{N}}
\newcommand{\KP}{\mathcal{P}}
\newcommand{\KQ}{\mathcal{Q}}
\newcommand{\KS}{\mathcal{S}}
\newcommand{\KV}{\mathcal{V}}
\newcommand{\KZ}{\mathcal{Z}}
\newcommand{\gc}{\mathfrak{C}}
\newcommand{\gd}{\mathfrak{D}}
\newcommand{\gM}{\mathfrak{M}}
\newcommand{\gm}{\mathfrak{m}}
\newcommand{\gf}{\mathfrak{f}}
\newcommand{\gu}{\mathfrak{U}}
\newcommand{\fa}{\mathfrak{a}}
\newcommand{\fg}{\mathfrak{g}}
\newcommand{\fn}{\mathfrak{n}}
\newcommand{\fk}{\mathfrak{k}}
\newcommand{\fm}{\mathfrak{m}}
\newcommand{\fp}{\mathfrak{p}}
\newcommand{\curly}[1]{\mathcal{#1}}
\newcommand{\op}[1]{\mathrm{#1}}
\newcommand{\Cat}[1]{\mathfrak{#1}}
\newcommand{\cat}[1]{\mathbf{#1}}
\newcommand{\vphi}{\varphi}
\newcommand{\sphi}{\phi}
\newcommand{\eps}{\varepsilon}
\newcommand{\tensor}{\otimes}
\newcommand{\tensors}{\tensor\dots\tensor}
\newcommand{\Tensor}{\bigotimes}
\newcommand{\lra}{\longrightarrow}
\newcommand{\la}{\leftarrow}
\newcommand{\lla}{\longleftarrow}
\newcommand{\isom}{\cong}
\newcommand{\epi}{\twoheadrightarrow}
\newcommand{\mono}{\hookrightarrow}
\newcommand{\del}{\partial}
\newcommand{\union}{\cup}
\newcommand{\dotcup}{\ensuremath{\mathaccent\cdot\cup}}
\newcommand{\dunion}{\dotcup}
\newcommand{\<}{\langle}
\renewcommand{\>}{\rangle}
\newcommand{\inpart}[1]{\in\text{\part}(#1)}
\newcommand{\Vsum}{\bigoplus}
\newcommand{\vsum}{\oplus}
\renewcommand{\S}{\mathfrak{S}}
\newcommand{\id}{\mathrm{id}}
\newcommand{\rk}{\mathrm{rk}}
\newcommand{\Diff}{\mathrm{Diff}}
\newcommand{\Hom}{\mathrm{Hom}}
\newcommand{\Pic}{\mathrm{Pic}}
\newcommand{\Spec}{\mathrm{Spec}}
\newcommand{\End}{\mathrm{End}}
\newcommand{\Ext}{\mathrm{Ext}}
\DeclareMathOperator{\Supp}{\mathrm{Supp}}
\DeclareMathOperator{\Sym}{Sym}
\DeclareMathOperator{\Alt}{\Lambda}
\DeclareMathOperator{\ad}{ad}
\DeclareMathOperator{\ch}{ch}
\DeclareMathOperator{\td}{td}
\DeclareMathOperator{\pr}{pr}
\newcommand{\Jac}{\mathcal{J}}
\]
Let \(\psi : \IR^K \ra \IR^M, \phi: \IR^M \ra \IR^N, a: \IR^N \ra \IR\) be continuously differentiable functions.
We have the following notions of derivatives:
-
Jacobi Matrix
- \(D_p \phi \in \IR[N,M]\) is an \(N \times M\) matrix of partial derivatives: \(D_p \phi[n,m] = (\del_m \phi^n) (p)\).
- \(D_q a \in \IR[1,N]\) is a \(1 \times N\) row vector of partial derivatives: \(D_q a[1,n] = (\del_n a) (q)\).
- Nabla Operator. For a real-valued function \(a\), we get a vector-valued function \(\nabla a (q) = (D_q a)^* \in \IR[N,1]=\IR^N\) as the adjoint of the Jacobi Matrix with respect to the standard inner product:
\[ D_q a(v) = \<\nabla a(q), v\> = (\nabla a(q))^* \cdot v \]
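As a small illustration of the two notions (the function below is a hypothetical example, not taken from the text above), take \(N=2\) and \(a(q) = q_1^2 + 3 q_2\). Then
\[ D_q a = \begin{pmatrix} 2 q_1 & 3 \end{pmatrix} \in \IR[1,2], \qquad \nabla a(q) = \begin{pmatrix} 2 q_1 \\ 3 \end{pmatrix} \in \IR^2, \]
and for \(v = (v_1, v_2)\) both sides of the identity give \(D_q a(v) = 2 q_1 v_1 + 3 v_2 = \<\nabla a(q), v\>\).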
The following computational rules hold true:
Chain Rule
Let \(q=\psi(p)\), then
\[ D_p(\phi \circ \psi) = ( D_{\psi(p)} \phi ) \cdot (D_p \psi) = D_q \phi \cdot D_p \psi \]
For the real-valued function \(a\) and \(q=\phi(p)\), this gives
\[ D_p (a \circ \phi) = (D_{\phi(p)} a) \cdot D_p \phi = D_q a \cdot D_p \phi \]
\[ \nabla(a \circ \phi)(p) = (D_p \phi)^* \cdot \nabla a(\phi(p)) = (D_p \phi)^* \cdot \nabla a(q) \]
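A quick sanity check of the gradient form of the chain rule, under the assumption (made only for this example) that \(\phi(p) = Ap\) is linear with \(A \in \IR[N,M]\) and \(a(q) = \half \|q\|^2\). Then \(D_p \phi = A\) and \(\nabla a(q) = q\), so
\[ \nabla (a \circ \phi)(p) = (D_p \phi)^* \cdot \nabla a(\phi(p)) = A^* \cdot (A p) = A^* A p. \]
This agrees with differentiating \(\half \|Ap\|^2 = \half \<A^*A p, p\>\) directly.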
Product Rule
For two real-valued functions \(a,b\) we have the identity:
\[ \nabla (a b)(q) = a(q) . \nabla b(q) + b(q) . \nabla a(q) \]
Here \(a . v\) denotes the scalar multiplication of a scalar \(a\) with a vector \(v\).
For a real-valued function \(a: \IR^M \ra \IR\) and a vector-valued function \(\phi: \IR^M \ra \IR^N\) we have:
\[ D_p (a . \phi) = a(p) . D_p \phi + \phi(p) \cdot (D_p a) \]
Here \(D_p (a . \phi)\) and \(D_p \phi\) are both \(N \times M\) matrices.
The \(N \times 1\) vector \(\phi(p)\) is multiplied with the \(1 \times M\) row vector \(D_p a\) to give an \(N \times M\) matrix (an outer product).
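To see the shapes at work, here is a hypothetical example with \(M=N=2\), \(a(p) = p_1\) and \(\phi(p) = p\), so that \((a . \phi)(p) = (p_1^2, p_1 p_2)\). With \(D_p \phi = I_2\) and \(D_p a = \begin{pmatrix} 1 & 0 \end{pmatrix}\) the rule gives
\[ D_p(a . \phi) = p_1 . \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 p_1 & 0 \\ p_2 & p_1 \end{pmatrix}, \]
which matches the partial derivatives of \((p_1^2, p_1 p_2)\) computed entry by entry.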
Quotient Rule
For two real-valued functions \(a,b\) with \(b \neq 0\) we have:
\[ \nabla \left( \frac{a}{b} \right) = \frac{1}{b^2} (b \nabla a - a \nabla b) \]
For a real-valued function \(a\) with \(a(p) \neq 0\) and a vector-valued function \(\phi\) we have:
\[
D_p (\phi / a)
= \frac{1}{a(p)^2} (a(p). D_p \phi - \phi(p) \cdot D_p a)
\]
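The vector-valued quotient rule can be recovered from the product rule; the following sketch assumes \(a(p) \neq 0\). Writing \(\phi / a = (1/a) . \phi\) and using \(D_p (1/a) = -\frac{1}{a(p)^2} . D_p a\), we get
\[ D_p (\phi/a) = \frac{1}{a(p)} . D_p \phi + \phi(p) \cdot \left( -\frac{1}{a(p)^2} . D_p a \right) = \frac{1}{a(p)^2} \left( a(p) . D_p \phi - \phi(p) \cdot D_p a \right). \]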
Linear Maps
-
For \(v \in \IR^N\) and \(f(p)=\<v,p\>\) we have:
\[ D_p f = f = v^* = \< v,\_\>, \qquad \nabla f (p) = v. \]
Quadratic Forms
-
Let \(f(p) = \half \| p\|^2 = \half \< p, p \>\), then
\[ D_p f = p^* = \< p, \_\> , \qquad \nabla f (p) = p. \]
-
Let \(f(p) = \half \<G p, p\>\), then
\[ D_p f = \half p^* (G+G^*) = \half \< (G+G^*)p, \_ \>, \qquad \nabla f(p) = \half (G+G^*)p. \]
If \(G=G^*\), this simplifies to:
\[ D_p f = p^* G = \< G p, \_\> , \qquad \nabla f (p) = G p. \]
-
Let \(f(p) = \half \| A p \|^2\), then
\[ D_p f = p^* A^*A = \<A^* Ap,\_ \>, \qquad \nabla f (p) = A^*A p. \]
-
Let \(f(p) = \half \| A p - t \|^2\), then
\[ D_p f = (A p - t)^* A = \< A^* Ap - A^* t,\_\>, \qquad \nabla f(p) = A^* A p - A^* t. \]
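The formulas in this list can be derived in a few lines; the following is a sketch rather than a full proof. For the general quadratic form \(f(p) = \half \<Gp, p\>\), expanding directly gives
\[ f(p+v) - f(p) = \half \<Gp, v\> + \half \<Gv, p\> + \half \<Gv, v\> = \half \<(G+G^*)p, v\> + \half \<Gv,v\>, \]
and the term linear in \(v\) is \(D_p f(v)\). For the least squares case, taking the residual to be \(\phi(p) = Ap - t\) (an assumption about the intended function) and \(h(q) = \half \|q\|^2\), we have \(D_p \phi = A\) and \(\nabla h(q) = q\), hence by the chain rule
\[ \nabla (h \circ \phi)(p) = (D_p \phi)^* \cdot \nabla h(\phi(p)) = A^* (A p - t) = A^* A p - A^* t. \]
Setting \(t = 0\) recovers \(\nabla \left( \half \|Ap\|^2 \right) = A^* A p\).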
Example: Rayleigh Quotient
-
Let \(a(p)=\|p\| = \sqrt{p_1^2 + \dots + p_N^2}\), then for \(p \neq 0\)
\[ D_p a = \frac{p^*}{\|p\|}, \qquad \nabla a (p) = \frac{p}{\|p\|} \]
Note that \(D_p a\) is a linear map \(\IR^N \ra \IR\), whereas \(\nabla a (p)\) is a vector in \(\IR^N\).
-
Consider \(r(x) = x/\|x\|\). Then \(r = \phi/a\) with \(\phi (p) = p \in \IR^N\) and \(a(p) = \|p\| \in \IR\), so by the quotient rule
\[ D_p(r) = D_p (\phi / a)
= \frac{1}{\|p\|^2} (\|p\| I_N - p \cdot p^* / \|p\|)
= \frac{1}{\|p\|}( I_N - \hat{p} \cdot \hat{p}^* )
\]
Here \(\hat{p} = p / \|p\|\) is the unit vector in the direction of \(p\), and \(I_N\) is the \(N \times N\) identity matrix.
The matrix \(I_N - \hat{p} \cdot \hat{p}^*\) is the orthogonal projection onto the hyperplane orthogonal to \(\hat{p}\).
The scalar factor \(\frac{1}{\|p\|}\) accounts for the scaling from the sphere of radius \(\|p\|\) to the unit sphere.
-
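For a symmetric matrix \(A \in \IR[N,N], A^* = A\), consider the function \(r(x) = \<x, Ax\>/\|x\|^2\) for \(x \neq 0\).
This is called the Rayleigh quotient.
We have \(r(x) = x^* Ax / x^* x\), so we can write \(r(x)=a(x)/b(x)\) with \(a(x) = x^* Ax\) and \(b(x) = x^* x\).
By the quadratic form rules above, \(\nabla a(x) = 2Ax\) and \(\nabla b(x) = 2x\), so the quotient rule gives
\[
\nabla r(x) = \frac{1}{\|x\|^4} \left( 2\|x\|^2 . Ax - 2 (x^* A x) . x \right) = \frac{2}{\|x\|^2} (Ax - r(x)x)
\]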
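One consequence worth spelling out: since the gradient is a nonzero multiple of \(Ax - r(x)x\), we have \(\nabla r(x) = 0\) exactly when \(Ax = r(x) x\), that is, when \(x\) is an eigenvector of \(A\) with eigenvalue \(r(x)\). As a hypothetical check (the matrix is chosen only for illustration), take \(A = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}\):
\[ x = \begin{pmatrix} 0 \\ 1 \end{pmatrix}: \quad r(x) = 3, \quad Ax - r(x) x = \begin{pmatrix} 0 \\ 3 \end{pmatrix} - 3 \begin{pmatrix} 0 \\ 1 \end{pmatrix} = 0, \]
\[ x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}: \quad r(x) = \frac{1+3}{2} = 2, \quad Ax - r(x) x = \begin{pmatrix} 1 \\ 3 \end{pmatrix} - 2 \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix} \neq 0. \]
So the eigenvector \((0,1)\) is a critical point of \(r\), while \((1,1)\) is not.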