Analysis of Linear Functions

Let \(\psi : \IR^K \ra \IR^M, \phi: \IR^M \ra \IR^N, a: \IR^N \ra \IR\) be continuously differentiable functions. We have the following notions of derivatives:

  • Jacobi Matrix

    • \(D_p \phi \in \IR[N,M]\) is an \(N \times M\) matrix of partial derivatives: \(D_p \phi[n,m] = (\del_m \phi^n) (p)\).
    • \(D_q a \in \IR[1,N]\) is a \(1 \times N\) row vector of partial derivatives: \(D_q a[1,n] = (\del_n a) (q)\).
  • Nabla Operator. For a real-valued function, we get a vector valued function \(\nabla a (q) = (D_q a)^* \in \IR[N,1]=\IR^N\) as adjoint of the Jacobi Matrix with respect to the standard inner product:
\[ D_q a(v) = \<\nabla a(q), v\> = (\nabla a(q))^* \cdot v \]
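
The definitions above can be checked numerically. The following is a minimal sketch, assuming central differences are accurate enough for the purpose; the functions `phi` and `a` are hypothetical examples chosen here, not taken from the text. It builds the \(N \times M\) Jacobi matrix column by column and obtains \(\nabla a\) as the transpose of the \(1 \times N\) Jacobi matrix.

```python
def jacobian(phi, p, h=1e-6):
    """Central-difference approximation of the N x M Jacobi matrix D_p phi."""
    N, M = len(phi(p)), len(p)
    J = [[0.0] * M for _ in range(N)]
    for m in range(M):
        p_plus = list(p)
        p_plus[m] += h
        p_minus = list(p)
        p_minus[m] -= h
        f_plus, f_minus = phi(p_plus), phi(p_minus)
        for n in range(N):
            # Entry [n, m] is the partial derivative of component n along coordinate m.
            J[n][m] = (f_plus[n] - f_minus[n]) / (2 * h)
    return J


def gradient(a, q, h=1e-6):
    """Nabla a(q): the 1 x N Jacobi matrix of a scalar function, read as a vector."""
    return jacobian(lambda x: [a(x)], q, h)[0]


# Hypothetical example functions (assumptions for illustration only).
def phi(p):  # IR^2 -> IR^3
    return [p[0] * p[1], p[0] ** 2, 3.0 * p[1]]


def a(q):  # IR^3 -> IR
    return q[0] + q[1] * q[2]


J = jacobian(phi, [1.0, 2.0])      # approximately [[2, 1], [2, 0], [0, 3]]
g = gradient(a, [1.0, 2.0, 3.0])   # approximately [1, 3, 2]
```

The matrix `J` has one row per output component and one column per input coordinate, matching the convention \(D_p \phi \in \IR[N,M]\).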

The following computational rules hold true:

Chain Rule

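Let \(q\) denote the image of \(p\) under the inner map, that is \(q=\psi(p)\) in the first identity and \(q=\phi(p)\) in the other two. Then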

\[ D_p(\phi \circ \psi) = ( D_{\psi(p)} \phi ) \cdot (D_p \psi) = D_q \phi \cdot D_p \psi \]
\[ D_p (a \circ \phi) = (D_{\phi(p)} a) \cdot D_p \phi = D_q a \cdot D_p \phi \]
\[ \nabla(a \circ \phi)(p) = (D_p \phi)^* \cdot \nabla a(\phi(p)) = (D_p \phi)^* \cdot \nabla a(q) \]
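
As a small worked illustration of the gradient form of the chain rule (the functions are chosen here for the example and are not part of the text above), take \(M = N = 2\) with

\[ \phi(p) = \begin{pmatrix} p_1 p_2 \\ p_1 + p_2 \end{pmatrix}, \qquad a(q) = q_1^2 + q_2 \]

Then

\[ D_p \phi = \begin{pmatrix} p_2 & p_1 \\ 1 & 1 \end{pmatrix}, \qquad \nabla a(q) = \begin{pmatrix} 2 q_1 \\ 1 \end{pmatrix} \]

\[ \nabla(a \circ \phi)(p) = (D_p \phi)^* \cdot \nabla a(\phi(p)) = \begin{pmatrix} p_2 & 1 \\ p_1 & 1 \end{pmatrix} \begin{pmatrix} 2 p_1 p_2 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 p_1 p_2^2 + 1 \\ 2 p_1^2 p_2 + 1 \end{pmatrix} \]

This agrees with differentiating \((a \circ \phi)(p) = p_1^2 p_2^2 + p_1 + p_2\) directly.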

Product Rule

For two real valued functions \(a,b\) we have the identity:

\[ \nabla (a b)(q) = a(q) . \nabla b(q) + b(q) . \nabla a(q) \]

Here "a.v" denotes the scalar multiplication of a scalar "a" with a vector "v".

For a real valued function \(a: \IR^M \ra \IR\) and a vector valued function \(\phi: \IR^M \ra \IR^N\) we have:

\[ D_p (a . \phi) = a(p) . D_p \phi + \phi(p) \cdot (D_p a) \]

Here \(D_p (a . \phi)\) and \(D_p \phi\) are both \(N \times M\) matrices. \(\phi(p)\) is an \(N \times 1\) vector that is multiplied with the \(1 \times M\) vector \((D_p a)\) to give an \(N \times M\) matrix.
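
The matrix form of the product rule can be checked numerically. This is a sketch under the assumption that central differences are a good enough reference; the functions `a` and `phi` are hypothetical examples with \(M = 2\) and \(N = 3\). The term \(\phi(p) \cdot (D_p a)\) is formed as an outer product of an \(N \times 1\) column with a \(1 \times M\) row.

```python
def num_jacobian(f, p, h=1e-6):
    """Central-difference Jacobi matrix of f: IR^M -> IR^N (one row per output)."""
    cols = []
    for m in range(len(p)):
        up = list(p)
        up[m] += h
        dn = list(p)
        dn[m] -= h
        cols.append([(u - d) / (2 * h) for u, d in zip(f(up), f(dn))])
    return [list(row) for row in zip(*cols)]


# Hypothetical example functions (assumptions for illustration only).
def a(p):  # IR^2 -> IR
    return p[0] + 2.0 * p[1]


def phi(p):  # IR^2 -> IR^3
    return [p[0] * p[1], p[1] ** 2, p[0]]


def scaled(p):  # the product a . phi
    return [a(p) * c for c in phi(p)]


p = [1.0, 2.0]
N, M = 3, 2
Da = num_jacobian(lambda x: [a(x)], p)[0]  # 1 x M row
Dphi = num_jacobian(phi, p)                # N x M matrix
phi_p = phi(p)                             # N x 1 column

# Right-hand side: a(p) . D_p phi + phi(p) . (D_p a), an N x M matrix.
rhs = [[a(p) * Dphi[n][m] + phi_p[n] * Da[m] for m in range(M)] for n in range(N)]
# Left-hand side: differentiate the product directly.
lhs = num_jacobian(scaled, p)
max_err = max(abs(lhs[n][m] - rhs[n][m]) for n in range(N) for m in range(M))
```

A small `max_err` indicates that both sides agree up to discretization and rounding error.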

Quotient Rule

For two real valued functions \(a,b\) we have:

\[ \nabla \left( \frac{a}{b} \right) = \frac{1}{b^2} (b \nabla a - a \nabla b) \]

For a real valued function \(a\) and a vector valued function \(\phi\) we have:

\[ D_p (\phi / a) = \frac{1}{a(p)^2} (a(p). D_p \phi - \phi(p) \cdot D_p a) \]

Linear Maps

  • If \(\phi=A\) is a linear map, represented by a matrix \(A \in \IR[N,M]\), then

    \[ D_p A = A \]
  • For \(v \in \IR^N\) and \(f(p)=\<v,p\>\) we have:

    \[ D_p f = f = v^* = \< v,\_\>, \qquad \nabla f (p) = v. \]

Quadratic Forms

  • Let \(f(p) = \half \| p\|^2 = \half \< p, p \>\), then

    \[ D_p f = p^* = \< p, \_\> , \qquad \nabla f (p) = p. \]
  • Let \(f(p) = \half \< G p, p \>\), then

    \[ D_p f = \half ((G+G^*)p)^* = \half \< (G+G^*)p, \_ \>, \qquad \nabla f(p) = \half (G+G^*)p \]

    If \(G=G^*\), this simplifies to:

    \[ D_p f = (G p)^* = \< G p, \_\> , \qquad \nabla f (p) = G p. \]
  • Let \(f(p) = \half \| A p \|^2\), then

    \[ D_p f = p^* A^*A = \<A^* Ap,\_ \> \qquad \nabla f (p) = A^*A p. \]
  • Let \(f(p) = \half \| A p - t \|^2\), then

    \[ D_p f = (A^* A p - A^* t)^* = \< A^* Ap - A^* t,\_\>, \qquad \nabla f(p) = A^* A p - A^* t. \]
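
Two of these gradients can be compared against finite differences. The sketch below assumes a non-symmetric example matrix `G` for \(f(p) = \half \< G p, p \>\) and a \(3 \times 2\) example matrix `A` with target `t` for \(f(p) = \half \| A p - t \|^2\); all numerical values are made up for illustration.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def num_grad(f, p, h=1e-6):
    """Central-difference gradient of a scalar function f."""
    g = []
    for m in range(len(p)):
        up = list(p)
        up[m] += h
        dn = list(p)
        dn[m] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g


# Example data (assumptions for illustration only).
G = [[1.0, 2.0], [0.0, 3.0]]              # deliberately not symmetric
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 x 2
t = [1.0, 0.0, -1.0]
p = [0.5, -1.5]


def quad(x):  # f(p) = 1/2 <G p, p>
    return 0.5 * sum(gi * xi for gi, xi in zip(matvec(G, x), x))


def least_squares(x):  # f(p) = 1/2 ||A p - t||^2
    r = [ri - ti for ri, ti in zip(matvec(A, x), t)]
    return 0.5 * sum(ri * ri for ri in r)


# Closed forms: 1/2 (G + G^*) p and A^*(A p - t) = A^* A p - A^* t.
grad_quad = [0.5 * (u + v) for u, v in zip(matvec(G, p), matvec(transpose(G), p))]
residual = [ri - ti for ri, ti in zip(matvec(A, p), t)]
grad_ls = matvec(transpose(A), residual)

err_quad = max(abs(u - v) for u, v in zip(grad_quad, num_grad(quad, p)))
err_ls = max(abs(u - v) for u, v in zip(grad_ls, num_grad(least_squares, p)))
```

Using a non-symmetric `G` makes the check sensitive to the symmetrization \(\half (G+G^*)\): the shortcut \(G p\) would fail it.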

Example: Rayleigh Quotient

  • Let \(a(p)=\|p\| = \sqrt{p_1^2 + \dots + p_N^2}\), then

    \[ D_p a = \frac{p^*}{\|p\|}, \qquad \nabla a (p) = \frac{p}{\|p\|} \]

    Note that \(D_p a\) is a linear map \(\IR^N \ra \IR\), whereas \(\nabla a (p)\) is a vector in \(\IR^N\).

  • Consider \(r(p) = p/\|p\|\). Then \(r = \phi/a\) with \(\phi (p) = p \in \IR^N\) and \(a(p) = \|p\| \in \IR\), so the quotient rule gives

    \[ D_p(r) = D_p (\phi / a) = \frac{1}{\|p\|^2} (\|p\| I_N - p \cdot p^* / \|p\|) = \frac{1}{\|p\|}( I_N - \hat{p} \cdot \hat{p}^* ) \]

    where \(\hat{p} = p / \|p\|\) is the unit vector in the direction of \(p\), and \(I_N\) is the \(N \times N\) identity matrix. The matrix \(I_N - \hat{p} \cdot \hat{p}^*\) is the projection matrix onto the hyperplane orthogonal to \(\hat{p}\). The scalar factor \(\frac{1}{\|p\|}\) accounts for the scaling from the sphere of radius \(\|p\|\) to the unit sphere.

  • For a symmetric matrix \(A \in \IR[N,N], A^* = A\), consider the function \(r(x) = \<x, Ax\>/\|x\|^2\). This is called the Rayleigh quotient.

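    We have \(r(x) = x^* Ax / x^* x\). We can write \(r(x)=a(x)/b(x)\) with \(a(x) = x^* Ax\) and \(b(x) = x^* x\). Since \(A\) is symmetric, \(\nabla a(x) = 2Ax\), and \(\nabla b(x) = 2x\). Hence, by the quotient rule,

    \[ \nabla r(x) = \frac{1}{b(x)^2} (2 b(x) Ax - 2 a(x) x) = \frac{2}{\|x\|^2} (Ax - r(x)x) \]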
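
Both quotient-rule results in this example can be checked numerically. The sketch below assumes central differences as the reference. It evaluates the Jacobi matrix \(\frac{1}{\|p\|}( I_N - \hat{p} \cdot \hat{p}^* )\) of the normalization map, and the Rayleigh quotient gradient obtained from the quotient rule with \(\nabla (x^* A x) = 2Ax\) (for symmetric \(A\)) and \(\nabla (x^* x) = 2x\). The matrix `A` and the points `p`, `x` are made-up example values.

```python
import math


def num_jacobian(f, p, h=1e-6):
    """Central-difference Jacobi matrix of f: IR^M -> IR^N (one row per output)."""
    cols = []
    for m in range(len(p)):
        up = list(p)
        up[m] += h
        dn = list(p)
        dn[m] -= h
        cols.append([(u - d) / (2 * h) for u, d in zip(f(up), f(dn))])
    return [list(row) for row in zip(*cols)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def normalize(p):
    n = math.sqrt(dot(p, p))
    return [x / n for x in p]


def normalize_jacobian(p):
    """(I_N - p_hat p_hat^*) / ||p||."""
    n = math.sqrt(dot(p, p))
    u = normalize(p)
    N = len(p)
    return [[((1.0 if i == j else 0.0) - u[i] * u[j]) / n for j in range(N)]
            for i in range(N)]


def rayleigh(A, x):
    return dot(x, matvec(A, x)) / dot(x, x)


def rayleigh_gradient(A, x):
    """Quotient rule with grad(x^* A x) = 2 A x and grad(x^* x) = 2 x."""
    r = rayleigh(A, x)
    return [2.0 * (ax - r * xi) / dot(x, x) for ax, xi in zip(matvec(A, x), x)]


p = [3.0, 0.0, 4.0]
J = normalize_jacobian(p)
J_num = num_jacobian(normalize, p)
radial = matvec(J, p)  # close to zero: the projection removes the component along p

A = [[2.0, 1.0], [1.0, 3.0]]  # symmetric example matrix
x = [1.0, 2.0]
g = rayleigh_gradient(A, x)   # approximately [0.16, -0.08]
g_num = num_jacobian(lambda y: [rayleigh(A, y)], x)[0]
```

In this example `g` is orthogonal to `x`, as expected for a function that does not change when `x` is rescaled.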