Unipotent Systems
by Heinrich Hartmann / 2025-07-19 / Stemwede
Introduction
This document develops a modern approach to solving linear systems using unipotent structures, rank-revealing decompositions, and explicit factorization techniques. We focus on fully constructive, index-driven methods that expose the solvable structure within arbitrary matrices, even in rank-deficient cases.
We work in the space \(\IR[N]\) of real-valued functions on the finite index set \([N] = \{0, 1, \dots, N-1\}\), consistent with zero-based indexing used in computational environments. For matrices, we use \(A \in \IR[M,N]\) with entries \(A[m,n]\).
Unipotent Matrices
We begin with a class of matrices that are "close to the identity," either algebraically or analytically. These matrices support explicit inversion formulas via finite or convergent series and connect classical substitution with analytic expansions.
Definition
Let \(A \in \IR[N,N]\) where \([N] = \{0, 1, \dots, N-1\}\). We define:
- Nilpotent: \(A\) is nilpotent if there is an integer \(k > 0\) such that \(A^k = 0\).
- Unipotent: \(A = I - N\) with \(I\) the identity matrix and \(N\) nilpotent.
- Unicontractive: \(A = I - C\) where \(C\) is a contraction, i.e. \(\|C\|_2 < 1\).
Theorem - von Neumann Series
- Let \(A = I - N\) be a unipotent matrix, with \(N\) nilpotent (\(N^k = 0\) for some \(k \leq N\)). Then: $$ A^{-1} = \sum_{j=0}^{k-1} N^j. $$
- Let \(A = I - C\) be unicontractive, with \(\|C\|_2 < 1\), then: $$ A^{-1} x = \sum_{j=0}^{\infty} C^j x $$ for all \(x \in \IR[N]\).
Proof
Part 1 (Unipotent case): Let \(A = I - N\) with \(N\) nilpotent, so \(N^k = 0\) for some \(k \leq N\).
Consider the finite geometric series: \(S = \sum_{j=0}^{k-1} N^j = I + N + N^2 + \cdots + N^{k-1}\)
We compute: \((I - N) \cdot S = N^0 - N^k = I - 0 = I\). Similarly, \(S \cdot (I - N) = I\). Therefore \(A^{-1} = S = \sum_{j=0}^{k-1} N^j\).
Part 2 (Unicontractive case): Let \(A = I - C\) with \(\|C\|_2 < 1\).
The series \(\sum_{j=0}^{\infty} C^j x\) converges absolutely: \(\sum_{j=0}^{\infty} \|C^j x\| \leq \|x\| \sum_{j=0}^{\infty} \|C\|^j = \frac{\|x\|}{1-\|C\|} < \infty\)
For any finite partial sum \(S_n x = \sum_{j=0}^{n} C^j x\), we have: \((I - C) \cdot S_n x = S_n x - C \cdot S_n x = \sum_{j=0}^{n} C^j x - \sum_{j=1}^{n+1} C^j x = x - C^{n+1} x\)
Taking the limit as \(n \to \infty\) and using \(\|C^{n+1} x\| \leq \|C\|^{n+1} \|x\| \to 0\), we get \((I - C) \cdot \sum_{j=0}^{\infty} C^j x = x\)
Therefore \(A^{-1} x = \sum_{j=0}^{\infty} C^j x\) for all \(x \in \IR[N]\).
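As a concrete illustration, here is a minimal numerical sketch (assuming NumPy). The function name neumann_inverse, the truncation tolerance, and the random test matrices are illustrative choices, not part of the theorem.

```python
import numpy as np

def neumann_inverse(A, max_terms=1000, tol=1e-12):
    """Invert A = I - N via the von Neumann series sum_j N^j.

    For nilpotent N the series is finite; for ||N||_2 < 1 we truncate
    once the terms drop below tol.
    """
    n = A.shape[0]
    N = np.eye(n) - A
    S = np.eye(n)
    term = np.eye(n)
    for _ in range(max_terms):
        term = term @ N                      # term = N^j
        if np.linalg.norm(term, 2) < tol:
            break
        S += term
    return S

# Unipotent example: N strictly upper triangular, hence nilpotent.
N = np.triu(np.random.randn(4, 4), k=1)
A = np.eye(4) - N
print(np.allclose(neumann_inverse(A), np.linalg.inv(A)))   # True

# Unicontractive example: rescale a random matrix so that ||C||_2 < 1.
C = np.random.randn(4, 4)
C *= 0.5 / np.linalg.norm(C, 2)
print(np.allclose(neumann_inverse(np.eye(4) - C), np.linalg.inv(np.eye(4) - C)))  # True
```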
Triangular Systems
In this section we will see that unipotent matrices occur naturally in triangular systems.
Definition Let \(A\in \IR[N,M]\) be a general matrix. We say:
- \(A\) is upper triangular (ut) if \(A[n,m] = 0\) whenever \(n > m\).
- \(A\) is strictly upper triangular (sut) if \(A[n,m] = 0\) whenever \(n \geq m\).
- \(A\) is lower triangular (lt) if \(A^*\) is upper triangular.
- \(A\) is strictly lower triangular (slt) if \(A^*\) is strictly upper triangular.
Proposition Consider \(\IR[K] \subset \IR[N]\) for \(K \leq N\) via inclusion of the first coordinates. Let \(A \in \IR[N,N]\), then:
- \(A\) is upper triangular if and only if \(A * \IR[K] \subset \IR[K]\) for all \(0 \leq K \leq N\).
- \(A\) is strictly upper triangular if and only if \(A * \IR[K] \subset \IR[K - 1]\) for all \(0 < K \leq N\).
Proof
Note that for an upper triangular matrix \(A\) we have \(A * e_k = \sum_n e_n A[n,k] = \sum_{n \leq k} e_n A[n,k] \in \IR[k+1]\); hence \(A * \IR[K] \subset \IR[K]\) for all \(K\), since \(e_k \in \IR[K]\) implies \(k+1 \leq K\). The rest follows similarly.
Proposition - Nilpotent matrices are Triangular.
Let \(A \in \IR[N,N]\).
- If \(A\) is strictly upper triangular, then \(A\) is nilpotent with \(A^N = 0\).
- If \(A\) is nilpotent, then there exists an orthogonal matrix \(Q \in \IR[N,N]\) with \(Q^* Q = I\), such that \(Q^* A Q\) is strictly upper triangular.
Proof
Part 1: Let \(A\) be strictly upper triangular. By the previous proposition, \(A * \IR[K] \subset \IR[K-1]\) for all \(K\). Therefore \(A^N * \IR[N] \subset A^{N-1} * \IR[N-1] \subset \cdots \subset A * \IR[1] \subset \IR[0] = \{0\}\), so \(A^N = 0\).
Part 2: Let \(A\) be nilpotent. We proceed by induction on \(N\).
For \(N=1\): If \(a \in \IR\) and \(a^k = 0\) for some \(k > 0\), then \(a = 0\), which is already strictly upper triangular.
For \(N>1\): Since \(A\) is nilpotent, \(A\) is not invertible, so \(\ker(A) \neq \{0\}\). Choose a unit vector \(q_1 \in \ker(A)\), so \(A q_1 = 0\). Extend \(q_1\) to an orthonormal basis \([q_1 \mid Q_2]\) where \(Q_2 \in \IR[N,N-1]\). Now consider \(A' = Q_2^* A Q_2\), this is an \((N-1)\times(N-1)\) matrix which is still nilpotent. By the induction hypothesis, there exists orthogonal \(Q' \in \IR[N-1,N-1]\) such that \(Q'^* A' Q'\) is strictly upper triangular. Hence \(Q = [q_1 \mid Q_2 Q']\) has the desired property.
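To illustrate the construction in Part 2, here is a minimal recursive sketch (assuming NumPy). Using the SVD to pick a unit kernel vector and a QR factorization to extend it to an orthonormal basis are implementation choices of this sketch, not prescribed by the proof.

```python
import numpy as np

def triangularize_nilpotent(A):
    """Return an orthogonal Q such that Q.T @ A @ Q is strictly upper triangular.

    Follows the induction in the proof: pick a unit kernel vector q1,
    extend it to an orthonormal basis [q1 | Q2], and recurse on Q2.T @ A @ Q2.
    """
    n = A.shape[0]
    if n == 1:
        return np.eye(1)
    # A unit kernel vector: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    q1 = Vt[-1]
    # Extend q1 to an orthonormal basis via QR of [q1 | I].
    basis, _ = np.linalg.qr(np.column_stack([q1, np.eye(n)]))
    q1, Q2 = basis[:, 0], basis[:, 1:]
    A_prime = Q2.T @ A @ Q2               # (n-1) x (n-1), still nilpotent
    Q_prime = triangularize_nilpotent(A_prime)
    return np.column_stack([q1, Q2 @ Q_prime])

# Example: a nilpotent matrix that is not triangular in the standard basis.
n = 4
N_sut = np.triu(np.random.randn(n, n), k=1)
R, _ = np.linalg.qr(np.random.randn(n, n))
A = R @ N_sut @ R.T                       # orthogonally conjugated, hence nilpotent
Q = triangularize_nilpotent(A)
T = Q.T @ A @ Q
print(np.allclose(np.tril(T), 0, atol=1e-8))  # strictly upper triangular (diagonal included)
```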
Algorithm - Backsubstitution
Let \(A = I-N\) be a unipotent matrix where \(N\) is strictly upper triangular. We can solve the system \(Ax = b\) via backsubstitution as follows:
- Initialize: Set \(x = b\) (since \((I-N)x = b\) implies \(x - Nx = b\), so \(x = b + Nx\)).
- Iterative step: For \(j = 0, 1, 2, \ldots\) until convergence: $$ x^{(j+1)} = b + N \cdot x^{(j)} $$
- Termination: Since \(N\) is nilpotent with \(N^k = 0\), the algorithm terminates in at most \(k\) steps with exact solution: $$ x = \sum_{j=0}^{k-1} N^j \cdot b $$
Alternative - Direct computation: For strictly upper triangular \(N\), solve component-wise from the last index to the first; see the sketch below.
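Here is a minimal sketch of both variants (assuming NumPy); the function names and the random test data are illustrative.

```python
import numpy as np

def solve_unipotent_iterative(N, b):
    """Iterate x <- b + N @ x; exact after at most len(b) steps since N is nilpotent."""
    x = b.copy()
    for _ in range(len(b)):
        x = b + N @ x
    return x

def solve_unipotent_direct(N, b):
    """Component-wise backsubstitution from the last index to the first."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # (I - N) x = b  =>  x[i] = b[i] + sum_{j > i} N[i, j] * x[j]
        x[i] = b[i] + N[i, i + 1:] @ x[i + 1:]
    return x

N = np.triu(np.random.randn(5, 5), k=1)   # strictly upper triangular
b = np.random.randn(5)
x = solve_unipotent_direct(N, b)
print(np.allclose((np.eye(5) - N) @ x, b))              # True
print(np.allclose(x, solve_unipotent_iterative(N, b)))  # True
```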
Unipotent Factorizations
We study factorizations of the form \(A = Q * U\), where:
- \(Q \in \IR[N,N]\) is explicitly invertible,
- \(U \in \IR[N,N]\) is unipotent (upper triangular with ones on the diagonal).
Proposition - QU Decomposition
Let \(A \in \IR[N,N]\) be invertible. Then there exist matrices \(Q \in \IR[N,N]\) (invertible) and \(U \in \IR[N,N]\) (unipotent) such that \(A = Q * U\).
In this case, write \(U = I_N - N\) with \(N\) nilpotent (\(N^k = 0\)); then: $$ A^{-1} = \left(\sum_{j=0}^{k-1} N^j\right) * Q^{-1}. $$
Notes
\(Q\) may be constructed using Gaussian elimination, Householder reflections, or Givens rotations.
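For concreteness, here is a minimal sketch (assuming NumPy) of one such construction: an ordinary QR factorization is rescaled so that the triangular factor becomes unipotent. The helper name qu_decomposition and the random test matrix are illustrative, not a library API.

```python
import numpy as np

def qu_decomposition(A):
    """Factor an invertible A as A = Q @ U with U unipotent (upper triangular, unit diagonal)."""
    Q_orth, R = np.linalg.qr(A)     # A = Q_orth @ R, R upper triangular
    d = np.diag(R)                  # nonzero for invertible A
    Q = Q_orth * d                  # scale column j of Q_orth by d[j]
    U = R / d[:, None]              # divide row i of R by d[i] -> unit diagonal
    return Q, U

A = np.random.randn(4, 4)
Q, U = qu_decomposition(A)
print(np.allclose(A, Q @ U))                 # True
print(np.allclose(np.diag(U), np.ones(4)))   # True

# A^{-1} via the Neumann series of U = I - N (N strictly upper triangular, N^4 = 0).
N = np.eye(4) - U
S = sum(np.linalg.matrix_power(N, j) for j in range(4))     # I + N + N^2 + N^3
print(np.allclose(S @ np.linalg.inv(Q), np.linalg.inv(A)))  # True
```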
Inverse Functions
Inverting unipotent maps via the von Neumann series gives us clean constructive proofs of the Inverse Function Theorem. Again we consider functions of the form
$$ F = I - N, $$
where \(N\) is close to zero, and we obtain an inverse via the von Neumann series \(F^{-1} = \sum_k N^k\). If \(N\) is small in the formal sense, i.e. the first-degree Taylor polynomial vanishes, we get a formal power series inverse. If we have Lipschitz-type bounds on \(N\) and \(DN\), we can construct differentiable inverses that converge in the \(C^k\) norm.
Formal Inverse Functions
Let \(F = (F_1, \dots, F_n)\), with each \(F_i \in \IR[[x_1, \dots, x_n]]\), be a "formal" map, with \(F(0) = 0\). \(F\) induces a homomorphism of formal power series rings \(F^*: \IR[[y_1, \dots, y_n]] \to \IR[[x_1, \dots, x_n]]\) via \(F^*(y_i) := F_i(x_1, \dots, x_n)\).
Theorem (Formal Inverse via von Neumann)
Assume that \(F\) has a "unipotent structure", i.e. \(F = I - N\) where \(I_i(x) = x_i\) and each component \(N_i \in \mathfrak{m}^2\), i.e., \(N\) vanishes to order at least 2.
Then the formal sum \(G := \sum_{k=0}^\infty N^k\) converges in the \(\mathfrak{m}\)-adic topology, and defines a formal inverse to \(F\). Thus, \(G \circ F = F \circ G = I\) in \(\IR[[x_1, \dots, x_n]]\).
Proof
Let \(\mathfrak{m} \subset \IR[[x_1, \dots, x_n]]\) be the maximal ideal \(\mathfrak{m} := (x_1, \dots, x_n)\) of series with vanishing constant term. By assumption \(F^* \mathfrak{m} \subset \mathfrak{m}\), and we get an induced map on the finite dimensional rings \(\IR[[x_1, \dots, x_n]]/\mathfrak{m}^{k+1}\) of degree \(\leq k\) polynomials.
We note that the operator \(N_k\) induced on \(\IR[[x_1, \dots, x_n]]/\mathfrak{m}^{k+1}\) is nilpotent, as \(N^*(\mathfrak{m}) \subset \mathfrak{m}^2\). Hence \(F_k = I - N_k\) is unipotent with inverse \(G_k = \sum_l N_k^l\). Passing to the projective limit over \(k\) yields the result.
Remarks If \(F\) is a formal function as above with invertible differential \(D = DF(0) \in GL_n(\IR)\), we can reduce to the case \(DF(0) = I\) by composing with the linear inverse \(D^{-1}\): \(\tilde{F}(x) := D^{-1} F(x) = x - N(x),\) where \(N \in \mathfrak{m}^2\). An explicit inverse for \(F\) is hence provided by the formula: $$ F^{-1}(x) = \sum_k N^k D^{-1} x. $$
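As a one-variable illustration (assuming SymPy), the following sketch realizes the \(\mathfrak{m}\)-adic construction as the fixed-point iteration \(G \mapsto x + N \circ G\), each pass of which is exact to one order higher in the maximal ideal. The helper formal_inverse and the truncation order are illustrative choices.

```python
import sympy as sp

x = sp.symbols('x')

def formal_inverse(N_expr, order=6):
    """Compositional inverse of F = x - N(x), with N in m^2, truncated at x**order."""
    G = x
    for _ in range(order):
        # Each pass agrees with the true inverse to one order higher.
        G = sp.expand(sp.series(x + N_expr.subs(x, G), x, 0, order + 1).removeO())
    return G

# Example: F(x) = x - x**2, whose inverse has Catalan-number coefficients.
N_expr = x**2
G = formal_inverse(N_expr, order=6)
print(G)                                                   # x + x**2 + 2*x**3 + 5*x**4 + ...
F_of_G = sp.series((x - x**2).subs(x, G), x, 0, 7).removeO()
print(sp.expand(F_of_G))                                   # x  (inverse up to the truncation order)
```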
Differentiable Inverse Functions
Theorem (\(C^k\) Inverse via von Neumann Series and Derivative Control)
Let \(F : U \subset \mathbb{R}^n \to \mathbb{R}^n\) be a continuously differentiable function satisfying \(F(0) = 0\) and \(DF(0) = I\). Define \(N(x) := x - F(x)\), so that \(N(0) = 0\) and \(DN(0) = 0\).
Assume that there are constants \(L > 0\), \(\delta > 0\) such that for all \(x \in B_\delta(0)\), \(\|DN(x)\| \leq L \|x\|.\)
Then, for any such \(L, \delta\), set \(\rho := \min\left( \delta, \frac{1}{2L} \right)\). The von Neumann series
$$ G(x) := \sum_{j=0}^\infty N^j(x) $$ converges absolutely in the \(C^1\) norm on \(B_\rho(0)\) and defines a continuously differentiable local inverse of \(F\) on \(B_\rho(0)\).
Moreover:
- If \(F\) is \(C^2\), the constants \(L,\delta\) can always be found.
- If \(F\) is \(C^k\) then \(G\) is of class \(C^{k-1}\).
Proof
Step 0 (Bounds on \(N\) and \(DN\)):
By the integral form of Taylor's theorem (or mean value estimate), \(N(x) = N(x) - N(0) = \int_0^1 DN(tx)\, x \, dt\), so for \(x \in B_\delta(0)\):
$$
|N(x)| \leq \int_0^1 L\, t\, |x| \cdot |x| \, dt = \frac{1}{2} L |x|^2.
$$
Step 1 (Convergence in \(C^0\) norm):
Define \(a_j := \|N^j(x)\|\) with \(a_0 = \|x\|\). Using the bound on \(N\), we have:
$$
a_{j+1} = |N(N^j(x))| \leq \frac{L}{2} a_j^2
$$
This implies double-exponential decay for \(a_j\), provided \(a_0 < \frac{1}{2L}\).
Hence \(\sum_j a_j\) converges absolutely for \(\|x\| < \rho\), and \(G(x) := \sum_j N^j(x)\) converges in norm.
Step 2 (First derivative):
Let \(b_j := \|D(N^j)(x)\|\). The chain rule gives: \(D(N^{j+1})(x) = DN(N^j(x)) \cdot D(N^j)(x)\)
so: \(b_{j+1} \leq \|DN(N^j(x))\| \cdot b_j \leq L a_j \cdot b_j.\)
Inductively: \(b_j \leq \prod_{i=0}^{j-1} L a_i.\) Since \(a_i\) decays doubly exponentially, this product decays exponentially, and \(\sum_j b_j\) converges. Therefore, \(DG(x) = \sum_j D(N^j)(x)\) converges absolutely, and \(G\) is \(C^1\).
Conclusion:
The map \(G(x) = \sum_j N^j(x)\) defines a \(C^1\) function on \(B_\rho(0)\) with \(F(G(x)) = x\). Since \(F\) is injective near \(0\), this also gives \(G(F(x)) = x\) for \(x \in B_\rho(0)\), so \(G\) is the local \(C^1\) inverse of \(F\).
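As a numerical illustration (assuming NumPy), the sketch below computes the local inverse via the Banach fixed-point iteration \(x \mapsto y + N(x)\) rather than by summing the series termwise; the quadratic perturbation N is an illustrative choice satisfying the hypotheses of the theorem.

```python
import numpy as np

def N(x):
    # A concrete quadratic perturbation with N(0) = 0 and DN(0) = 0 (illustrative choice).
    return 0.25 * np.array([x[0] * x[1], x[0] ** 2])

def F(x):
    return x - N(x)

def local_inverse(y, iterations=50):
    """Solve F(x) = y near 0 by the contraction iteration x <- y + N(x)."""
    x = y.copy()
    for _ in range(iterations):
        x = y + N(x)
    return x

y = np.array([0.1, -0.2])
x = local_inverse(y)
print(np.allclose(F(x), y))   # True
```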
Differential Equations
BACKUP
Underdetermined Systems
Let \(F \in \IR[M,N]\) be rectangular with \(M < N\) (i.e., \([M] = \{0, 1, \dots, M-1\}, [N] = \{0, 1, \dots, N-1\}\) with \(M < N\)). We partition \([N] = N_0 \cup N_1\) with \(|N_0| = M\), and correspondingly partition: \(F = [F_0 \mid F_1]\) where \(F_0 \in \IR[M,N_0]\) and \(F_1 \in \IR[M,N_1]\).
Assume \(F_0\) is invertible. Then the general solution to \(F * x = b\) is:
- Homogeneous case (\(b = 0\)): $$ x = (-F_0^{-1} * F_1 * y_1, y_1) $$ with \(y_1 \in \IR[N_1]\) arbitrary.
- Inhomogeneous case: $$ x = (F_0^{-1} * b - F_0^{-1} * F_1 * y_1, y_1) $$ with \(y_1 \in \IR[N_1]\) arbitrary.
If \(F_0 = Q * U\) is a QU factorization with \(U = I_{N_0} - N\), \(N\) nilpotent, then: $$ F_0^{-1} = \left(\sum_{j=0}^{k-1} N^j\right) * Q^{-1}. $$
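A minimal numerical sketch (assuming NumPy): np.linalg.solve stands in for the explicit QU-based inverse of \(F_0\), and the random data is illustrative (a random square block \(F_0\) is invertible with probability one).

```python
import numpy as np

M, N = 3, 5
F = np.random.randn(M, N)
b = np.random.randn(M)
F0, F1 = F[:, :M], F[:, M:]      # partition [N] = N0 ∪ N1 with N0 the first M indices

def solution(y1):
    """General solution x = (F0^{-1} b - F0^{-1} F1 y1, y1), y1 free in R^{N-M}."""
    x0 = np.linalg.solve(F0, b - F1 @ y1)
    return np.concatenate([x0, y1])

for _ in range(3):
    y1 = np.random.randn(N - M)
    print(np.allclose(F @ solution(y1), b))   # True for every choice of y1
```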
Rank-Revealing QU Factorization
Algorithm
Given \(A \in \IR[N,N]\) where \([N] = \{0, 1, \dots, N-1\}\), construct:
- Integer \(r \leq N\) (the rank),
- \(Q \in \IR[N,N]\) invertible (representing row operations),
- \(P \in \IR[N,N]\) permutation matrix (representing column operations),
- \(U \in \IR[N,N]\) with structure:
- \(U[i,j]\) for \(i,j \in [r]\): unipotent block \(U_{00}\),
- \(U[i,j]\) for \(i \in [r], j \in \{r, \dots, N-1\}\): arbitrary block \(U_{01}\),
- \(U[i,j] = 0\) for \(i \in \{r, \dots, N-1\}\) and all \(j \in [N]\) (zero rows below the rank),
such that \(A = Q * U * P\).
Remark
To isolate an \(r \times r\) invertible submatrix, we find \(r\) independent rows and \(r\) independent columns, which may differ. We resolve this via the permutation matrix \(P\).
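A minimal numerical sketch (assuming SciPy): column-pivoted QR (scipy.linalg.qr with pivoting=True) serves as the rank-revealing engine, and the rescaling of \(R\) into a unipotent leading block as well as the tolerance tol are choices of this sketch, not part of the algorithm above.

```python
import numpy as np
from scipy.linalg import qr

def rank_revealing_qu(A, tol=1e-12):
    """Sketch of A = Q @ U @ P with Q invertible, P a permutation, U structured as above."""
    n = A.shape[0]
    Q_orth, R, piv = qr(A, pivoting=True)        # A[:, piv] = Q_orth @ R
    r = int(np.sum(np.abs(np.diag(R)) > tol))    # numerical rank
    d = np.ones(n)
    d[:r] = np.diag(R)[:r]
    U = R / d[:, None]                           # unit diagonal on the leading r entries
    U[r:, :] = 0.0                               # rows below the rank are (numerically) zero
    Q = Q_orth * d                               # invertible: row operations
    P = np.eye(n)[:, piv].T                      # permutation matrix: column operations
    return r, Q, U, P

# Example usage on a rank-deficient matrix.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
r, Q, U, P = rank_revealing_qu(A)
print(r)                               # 2
print(np.allclose(A, Q @ U @ P))       # True
```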