Analysis of Linear Functions

\[ \newcommand{\ra}{\rightarrow} \newcommand{\kC}{\mathcal{C}} \newcommand{\IR}{\mathbb{R}} \newcommand{\AINF}{\mathbb{A}} \newcommand{\CINF}{\kC^\infty} \newcommand{\half}{\frac{1}{2}} \newcommand{\IFF}{\Leftrightarrow} \newcommand{\inf}{\infty} \newcommand{\Ind}{\mathbb{1}} \newcommand{\IR}{\mathbb{R}} \newcommand{\IA}{\mathbb{A}} \newcommand{\IB}{\mathbb{B}} \newcommand{\IC}{\mathbb{C}} \newcommand{\ID}{\mathbb{D}} \newcommand{\IF}{\mathbb{F}} \newcommand{\IH}{\mathbb{H}} \newcommand{\II}{\mathbb{I}} \newcommand{\IL}{\mathbb{L}} \newcommand{\IN}{\mathbb{N}} \newcommand{\IP}{\mathbb{P}} \newcommand{\IQ}{\mathbb{Q}} \newcommand{\IR}{\mathbb{R}} \newcommand{\IS}{\mathbb{S}} \newcommand{\IV}{\mathbb{V}} \newcommand{\IZ}{\mathbb{Z}} \newcommand{\floor}[1]{\lfloor #1 \rfloor} \newcommand{\ceil}[1]{\lceil #1 \rceil} \newcommand{\set}[2]{\{\, #1 \;\vert\; #2 \,\}} \newcommand{\Set}[2]{\left\{\, #1 \;\vert\; #2 \,\right\}} \newcommand{\C}{\,\#} \newcommand{\CSet}[2]{\#\{\, #1 \;\vert\; #2 \,\}} \newcommand{\qtext}[1]{\quad\text{#1}\quad} \newcommand{\stext}[1]{\;\text{#1}\;} \newcommand{\dbrackets}[1]{ [\![ #1 ]\!]} \newcommand{\KR}{\mathcal{R}} \newcommand{\KA}{\mathcal{A}} \newcommand{\KB}{\mathcal{B}} \newcommand{\KC}{\mathcal{C}} \newcommand{\KD}{\mathcal{D}} \newcommand{\KF}{\mathcal{F}} \newcommand{\KH}{\mathcal{H}} \newcommand{\KI}{\mathcal{I}} \newcommand{\KL}{\mathcal{L}} \newcommand{\KN}{\mathcal{N}} \newcommand{\KP}{\mathcal{P}} \newcommand{\KQ}{\mathcal{Q}} \newcommand{\KR}{\mathcal{R}} \newcommand{\KS}{\mathcal{S}} \newcommand{\KV}{\mathcal{V}} \newcommand{\KZ}{\mathcal{Z}} \newcommand{\gc}{\mathfrak{C}} \newcommand{\gd}{\mathfrak{D}} \newcommand{\gM}{\mathfrak{M}} \newcommand{\gm}{\mathfrak{m}} \newcommand{\gf}{\mathfrak{f}} \newcommand{\gu}{\mathfrak{U}} \newcommand{\fa}{\mathfrak{a}} \newcommand{\fg}{\mathfrak{g}} \newcommand{\fn}{\mathfrak{n}} \newcommand{\fk}{\mathfrak{k}} \newcommand{\fm}{\mathfrak{m}} \newcommand{\fp}{\mathfrak{p}} 
\newcommand{\curly}[1]{\mathcal{#1}} \newcommand{\op}[1]{\mathrm{#1}} \newcommand{\Cat}[1]{\mathfrak{#1}} \newcommand{\cat}[1]{\mathbf{#1}} \newcommand{\vphi}{\varphi} \newcommand{\sphi}{\phi} \newcommand{\eps}{\varepsilon} \newcommand{\tensor}{\otimes} \newcommand{\tensors}{\tensor\dots\tensor} \newcommand{\Tensor}{\bigotimes} \newcommand{\ra}{\rightarrow} \newcommand{\lra}{\longrightarrow} \newcommand{\la}{\leftarrow} \newcommand{\lla}{\longleftarrow} \newcommand{\isom}{\cong} \newcommand{\epi}{\twoheadrightarrow} \newcommand{\mono}{\hookrightarrow} \newcommand{\del}{\partial} \newcommand{\union}{\cup} \newcommand{\dotcup}{\ensuremath{\mathaccent\cdot\cup}} \newcommand{\dunion}{\dotcup} \newcommand{\<}{\langle} \renewcommand{\>}{\rangle} \newcommand{\inpart}[1]{\in\text{\part}(#1)} \newcommand{\Vsum}{\bigoplus} \newcommand{\vsum}{\oplus} \renewcommand{\S}{\mathfrak{S}} \newcommand{\id}{\mathrm{id}} \newcommand{\rk}{\mathrm{rk}} \newcommand{\Diff}{\mathrm{Diff}} \newcommand{\Hom}{\mathrm{Hom}} \newcommand{\Pic}{\mathrm{Pic}} \newcommand{\Spec}{\mathrm{Spec}} \newcommand{\End}{\mathrm{End}} \newcommand{\Ext}{\mathrm{Ext}} \DeclareMathOperator{\Supp}{\mathrm{Supp}} \DeclareMathOperator{\Sym}{Sym} \DeclareMathOperator{\Alt}{\Lambda} \DeclareMathOperator{\ad}{ad} \DeclareMathOperator{\ch}{ch} \DeclareMathOperator{\td}{td} \DeclareMathOperator{\pr}{pr} \newcommand{\Jac}{\mathcal{J}} \]

Let \(\psi : \IR^K \ra \IR^M, \phi: \IR^M \ra \IR^N, a: \IR^N \ra \IR\) be continuously differentiable functions. We have the following notions of derivatives:

Jacobi Matrix
- \(D_p \phi \in \IR[N,M]\) is the \(N \times M\) matrix of partial derivatives: \(D_p \phi[n,m] = (\del_m \phi^n) (p)\).
- \(D_q a \in \IR[1,N]\) is the \(1 \times N\) row vector of partial derivatives: \(D_q a[1,n] = (\del_n a) (q)\).

Gradients
For a real-valued function \(a\), the gradient \(\nabla a (q) \in \IR[N,1]=\IR^N\) is the adjoint (transpose) of the Jacobi matrix \(D_q a\) with respect to the standard inner product:

\[ D_q a(v) = \<\nabla a(q), v\> = (\nabla a(q))^* \cdot v \]
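These definitions can be checked numerically. The following is a minimal sketch in plain Python, assuming central finite differences as the approximation; the helper names `jacobian` and `gradient` and the two test functions are illustrative choices, not part of the text above.

```python
def jacobian(f, p, h=1e-6):
    """Central-difference approximation of D_p f.

    f maps a list of M numbers to a list of N numbers.
    Returns N rows with M entries each: D[n][m] ~ (del_m f^n)(p).
    """
    cols = []
    for m in range(len(p)):
        plus, minus = list(p), list(p)
        plus[m] += h
        minus[m] -= h
        cols.append([(u - v) / (2 * h) for u, v in zip(f(plus), f(minus))])
    return [list(row) for row in zip(*cols)]  # transpose columns into rows


def gradient(a, q, h=1e-6):
    """Gradient of a real-valued a: the transpose of its 1xN Jacobi matrix."""
    return jacobian(lambda x: [a(x)], q, h)[0]


# Hypothetical example functions (assumptions made for this sketch).
def phi(p):            # phi : R^2 -> R^3, so D_p phi is a 3x2 matrix
    x, y = p
    return [x * y, x + y, x * x]


def a(q):              # a : R^2 -> R
    return q[0] ** 2 + 3.0 * q[1]


D = jacobian(phi, [1.0, 2.0])   # exact value: [[2, 1], [1, 1], [2, 0]]
g = gradient(a, [1.0, 2.0])     # exact value: [2, 3]
```

The row index of `D` runs over the components of \(\phi\) and the column index over the coordinates of \(p\), matching the convention \(D_p \phi \in \IR[N,M]\).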

Chain Rule

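Let \(q\) denote the image of \(p\) under the inner function, that is \(q=\psi(p)\) in the first identity and \(q=\phi(p)\) in the second and third. Then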

\[ D_p(\phi \circ \psi) = ( D_{\psi(p)} \phi ) \cdot (D_p \psi) = D_q \phi \cdot D_p \psi \]

\[ D_p (a \circ \phi) = (D_{\phi(p)} a) \cdot D_p \phi = D_q a \cdot D_p \phi \]

\[ \nabla(a \circ \phi)(p) = (D_p \phi)^* \cdot \nabla a(\phi(p)) = (D_p \phi)^* \cdot \nabla a(q) \]
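The gradient form of the chain rule can be verified numerically. This is a sketch in plain Python under the assumption that central finite differences are accurate enough; the functions `phi` and `a` and the helper `jacobian` are hypothetical examples chosen for this check.

```python
def jacobian(f, p, h=1e-6):
    """Central-difference approximation of D_p f, returned as a list of rows."""
    cols = []
    for m in range(len(p)):
        plus, minus = list(p), list(p)
        plus[m] += h
        minus[m] -= h
        cols.append([(u - v) / (2 * h) for u, v in zip(f(plus), f(minus))])
    return [list(row) for row in zip(*cols)]


def phi(p):            # phi : R^2 -> R^3 (illustrative)
    x, y = p
    return [x * y, x + y, x * x]


def a(q):              # a : R^3 -> R (illustrative)
    return q[0] * q[1] + q[2] ** 2


p = [1.0, 2.0]
q = phi(p)

D_phi = jacobian(phi, p)                       # 3x2 matrix D_p phi
grad_a = jacobian(lambda x: [a(x)], q)[0]      # gradient of a at q = phi(p)

# Right-hand side: (D_p phi)^* . grad a(q), a vector in R^2.
rhs = [sum(D_phi[n][m] * grad_a[n] for n in range(3)) for m in range(2)]

# Left-hand side: gradient of the composition a o phi, differentiated directly.
lhs = jacobian(lambda x: [a(phi(x))], p)[0]
```

For this choice \(a(\phi(p)) = p_1^2 p_2 + p_1 p_2^2 + p_1^4\), so both sides should be close to \((12, 5)\) at \(p = (1, 2)\).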

Product Rule

For two real valued functions \(a,b\) we have the identity:

\[ \nabla (a b)(q) = a(q) . \nabla b(q) + b(q) . \nabla a(q) \]

Here \(a . v\) denotes the multiplication of a vector \(v\) by a scalar \(a\).

The product rule can be extended to the case of a real valued function \(a: \IR^M \ra \IR\) and a vector valued function \(\phi: \IR^M \ra \IR^N\). In this case we find:

\[ D_p (a . \phi) = a(p) . D_p \phi + \phi(p) \cdot (D_p a) \]

Here \(D_p (a . \phi)\) and \(D_p \phi\) are both \(N \times M\) matrices. \(\phi(p)\) is an \(N \times 1\) vector that is multiplied with the \(1 \times M\) row vector \((D_p a)\) to give an \(N \times M\) matrix.
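As a small worked example (the functions are chosen here purely for illustration), take \(M=2\), \(N=3\), \(a(p) = p_1\) and \(\phi(p) = (p_1, p_2, p_1 p_2)\). Then \(a . \phi = (p_1^2, p_1 p_2, p_1^2 p_2)\), and the two terms of the product rule are:

\[ a(p) . D_p \phi = \begin{pmatrix} p_1 & 0 \\ 0 & p_1 \\ p_1 p_2 & p_1^2 \end{pmatrix}, \qquad \phi(p) \cdot (D_p a) = \begin{pmatrix} p_1 \\ p_2 \\ p_1 p_2 \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} p_1 & 0 \\ p_2 & 0 \\ p_1 p_2 & 0 \end{pmatrix} \]

Their sum is the \(3 \times 2\) matrix

\[ \begin{pmatrix} 2 p_1 & 0 \\ p_2 & p_1 \\ 2 p_1 p_2 & p_1^2 \end{pmatrix} \]

which agrees with differentiating the components of \(a . \phi\) directly.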

Quotient Rule

For two real valued functions \(a,b\) we have:

\[ \nabla \left( \frac{a}{b} \right) = \frac{1}{b^2} (b \nabla a - a \nabla b) \]

For a real valued function \(a\) and a vector valued function \(\phi\) we have:

\[ D_p (\phi / a) = \frac{1}{a(p)^2} (a(p). D_p \phi - \phi(p) \cdot D_p a) \]

Linear Maps

If \(\phi=A : \IR^M \ra \IR^N\) is a linear map, represented by a matrix \(A \in \IR[N,M]\), then

\[ D_p A = A \]

Let \(f(p)=\<a,p\> = a^* p\) for some fixed \(a \in \IR^N\). Then:

\[ \nabla f (p) = a, \qquad D_p f = f = a^* \]

Quadratic Forms

All functions considered in this paragraph have values in \(\IR\). Hence we get derivatives in \(\IR[1,N]\) and gradients in \(\IR^N\).

Let \(f(p) = \half \| p\|^2 = \half \< p, p \>\), then

\[ \nabla f (p) = p \]

Let \(f(p) = \half \< G p, p \>\) for a square matrix \(G\), then

\[ \nabla f(p) = \half (G+G^*)p \]

If \(G=G^*\), this simplifies to:

\[ \nabla f (p) = G p \]
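One way to see the formula for general \(G\) is to expand \(f\) along a direction \(v\) and collect the terms that are linear in \(v\):

\[ f(p+v) - f(p) = \half \< Gp, v \> + \half \< Gv, p \> + \half \< Gv, v \> = \< \half (G + G^*) p, v \> + \half \< Gv, v \> \]

The last term is of second order in \(v\), so the linear part identifies the gradient as \(\half (G+G^*) p\).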

Let \(f(p) = \half \| A p \|^2 = \half \< A^* A p, p \>\), then

\[ \nabla f (p) = A^*A p \]

Let \(f(p) = \half \| A p - t \|^2 = \half \< Ap-t, Ap-t \> = \half \< A^* Ap, p \> - \< A^*t,p \> + \half \| t \|^2\), then

\[ \nabla f(p) = A^* A p - A^* t \]
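The least-squares gradient can be compared against finite differences. The sketch below uses plain Python lists; the matrix `A`, the target `t`, the point `p` and all helper names are assumptions made for this example. It uses the equivalent form \(A^* A p - A^* t = A^*(Ap - t)\).

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Illustrative data: A is 3x2, t is in R^3, p is in R^2.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
t = [1.0, 0.0, -1.0]
p = [0.5, -1.0]


def f(p):
    """f(p) = 1/2 * ||A p - t||^2"""
    residual = [u - v for u, v in zip(matvec(A, p), t)]
    return 0.5 * sum(r * r for r in residual)


def grad_f(p):
    """Closed form A^*(A p - t), equal to A^* A p - A^* t."""
    residual = [u - v for u, v in zip(matvec(A, p), t)]
    A_T = [list(col) for col in zip(*A)]
    return matvec(A_T, residual)


def numeric_gradient(f, p, h=1e-6):
    """Central-difference approximation of the gradient of f at p."""
    grad = []
    for m in range(len(p)):
        plus, minus = list(p), list(p)
        plus[m] += h
        minus[m] -= h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad


exact = grad_f(p)                  # [-22.5, -30.0] for the data above
approx = numeric_gradient(f, p)    # should agree up to rounding error
```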

Example: Rayleigh Quotient

Let \(a(p)=\|p\| = \sqrt{p_1^2 + \dots + p_N^2}\), then

\[ D_p a = \frac{p^*}{\|p\|}, \qquad \nabla a (p) = \frac{p}{\|p\|} \]

Note that \(D_p a\) is a linear map \(\IR^N \ra \IR\), whereas \(\nabla a (p)\) is a vector in \(\IR^N\).

Consider the normalization map \(r(x) = x/\|x\|\). We can write \(r = \phi/a\) with \(\phi (p) = p \in \IR^N\) and \(a(p) = \|p\| \in \IR\). The quotient rule then gives

\[ D_p(r) = D_p (\phi / a) = \frac{1}{\|p\|^2} (\|p\| I_N - p \cdot p^* / \|p\|) = \frac{1}{\|p\|}( I_N - \hat{p} \cdot \hat{p}^* ) \]

Here \(\hat{p} = p / \|p\|\) is the unit vector in the direction of \(p\), and \(I_N\) is the \(N \times N\) identity matrix. The matrix \(I_N - \hat{p} \cdot \hat{p}^*\) is the orthogonal projection onto the hyperplane orthogonal to \(\hat{p}\). The scalar factor \(\frac{1}{\|p\|}\) accounts for the scaling from the sphere of radius \(\|p\|\) to the unit sphere.
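The projection formula can be checked with a short numerical experiment. This is a sketch in plain Python, assuming central finite differences; the function names and the test point \(p = (3, 4)\) are illustrative. Since the matrix projects onto the hyperplane orthogonal to \(p\), applying it to \(p\) itself should give zero.

```python
import math


def normalize(p):
    """r(p) = p / ||p||"""
    norm = math.sqrt(sum(x * x for x in p))
    return [x / norm for x in p]


def normalize_jacobian(p):
    """Closed form (I_N - p_hat p_hat^*) / ||p||."""
    norm = math.sqrt(sum(x * x for x in p))
    p_hat = [x / norm for x in p]
    n = len(p)
    return [[((1.0 if i == j else 0.0) - p_hat[i] * p_hat[j]) / norm
             for j in range(n)] for i in range(n)]


def jacobian(f, p, h=1e-6):
    """Central-difference approximation of D_p f, returned as a list of rows."""
    cols = []
    for m in range(len(p)):
        plus, minus = list(p), list(p)
        plus[m] += h
        minus[m] -= h
        cols.append([(u - v) / (2 * h) for u, v in zip(f(plus), f(minus))])
    return [list(row) for row in zip(*cols)]


p = [3.0, 4.0]                       # ||p|| = 5, p_hat = (0.6, 0.8)
D_exact = normalize_jacobian(p)      # [[0.128, -0.096], [-0.096, 0.072]]
D_numeric = jacobian(normalize, p)

# Moving along p does not change the direction, so D_p r applied to p vanishes.
radial = [sum(D_exact[i][j] * p[j] for j in range(2)) for i in range(2)]
```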

For a symmetric matrix \(A \in \IR[N,N], A^* = A\), consider the function \(r(x) = \<x, Ax\>/\|x\|^2\). This is called the Rayleigh quotient.

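We have \(r(x) = x^* Ax / x^* x\). We can write \(r(x)=a(x)/b(x)\) with \(a(x) = x^* Ax\) and \(b(x) = x^* x\). Since \(A\) is symmetric, \(\nabla a(x) = 2Ax\) and \(\nabla b(x) = 2x\), so the quotient rule gives

\[ \nabla r(x) = \frac{1}{\|x\|^4} \left( \|x\|^2 \, 2Ax - (x^* A x) \, 2x \right) = \frac{2}{\|x\|^2} (Ax - r(x)x) \]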
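A numerical check of the Rayleigh quotient gradient, as a sketch in plain Python. It uses the closed form \(\nabla r(x) = \frac{2}{\|x\|^2}(Ax - r(x)x)\), which follows from the quotient rule with \(\nabla (x^* A x) = 2Ax\) and \(\nabla (x^* x) = 2x\). The symmetric matrix `A`, the test points and the helper names are assumptions made for this example.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


A = [[2.0, 1.0], [1.0, 2.0]]   # illustrative symmetric matrix, eigenvalues 1 and 3


def rayleigh(x):
    """r(x) = <x, A x> / ||x||^2"""
    return dot(x, matvec(A, x)) / dot(x, x)


def rayleigh_gradient(x):
    """Closed form 2 / ||x||^2 * (A x - r(x) x)."""
    r = rayleigh(x)
    Ax = matvec(A, x)
    scale = 2.0 / dot(x, x)
    return [scale * (ax_i - r * x_i) for ax_i, x_i in zip(Ax, x)]


def numeric_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for m in range(len(x)):
        plus, minus = list(x), list(x)
        plus[m] += h
        minus[m] -= h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad


x = [1.0, 2.0]
exact = rayleigh_gradient(x)              # (0.48, -0.24) for this A and x
approx = numeric_gradient(rayleigh, x)

# At an eigenvector the gradient vanishes: v = (1, 1) satisfies A v = 3 v.
at_eigenvector = rayleigh_gradient([1.0, 1.0])
```

The gradient is zero exactly where \(Ax = r(x)x\), that is, at the eigenvectors of \(A\), with \(r(x)\) equal to the corresponding eigenvalue.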