Unipotent Systems
by Heinrich Hartmann / 2025-07-19 / Stemwede
Introduction
This document develops a modern approach to solving linear systems using unipotent structures, rank-revealing decompositions, and explicit factorization techniques. We focus on fully constructive, index-driven methods that expose the solvable structure within arbitrary matrices, even in rank-deficient cases.
We work in the space \(\IR[N]\) of real-valued functions on the finite index set \([N] = \{0, 1, \dots, N-1\}\), consistent with zero-based indexing used in computational environments. For matrices, we use \(A \in \IR[M,N]\) with entries \(A[m,n]\).
Unipotent Matrices
We begin with a class of matrices that are "close to the identity," either algebraically or analytically. These matrices support explicit inversion formulas via finite or convergent series and connect classical substitution with analytic expansions.
Definition
Let \(A \in \IR[N,N]\) where \([N] = \{0, 1, \dots, N-1\}\). We define:
- Nilpotent: \(A\) is nilpotent if there is an integer \(k > 0\) such that \(A^k = 0\).
- Unipotent: \(A = I - N\) with \(I\) the identity matrix and \(N\) nilpotent.
- Unicontractive: \(A = I - C\) where \(C\) is a contraction, i.e. \(\|C\|_2 < 1\).
Theorem - von Neumann Series
- Let \(A = I - N\) be a unipotent matrix, with \(N\) nilpotent (\(N^k = 0\) for some \(k \leq N\)). Then: $$ A^{-1} = \sum_{j=0}^{k-1} N^j. $$
- Let \(A = I - C\) be unicontractive, with \(\|C\|_2 < 1\), then: $$ A^{-1} x = \sum_{j=0}^{\infty} C^j x $$ for all \(x \in \IR[N]\).
Proof
Part 1 (Unipotent case): Let \(A = I - N\) with \(N\) nilpotent, so \(N^k = 0\) for some \(k \leq N\).
Consider the finite geometric series: \(S = \sum_{j=0}^{k-1} N^j = I + N + N^2 + \cdots + N^{k-1}\)
We compute: \((I - N) \cdot S = N^0 - N^k = I - 0 = I\). Similarly, \(S \cdot (I - N) = I\). Therefore \(A^{-1} = S = \sum_{j=0}^{k-1} N^j\).
Part 2 (Unicontractive case): Let \(A = I - C\) with \(\|C\|_2 < 1\).
The series \(\sum_{j=0}^{\infty} C^j x\) converges absolutely: \(\sum_{j=0}^{\infty} \|C^j x\| \leq \|x\| \sum_{j=0}^{\infty} \|C\|^j = \frac{\|x\|}{1-\|C\|} < \infty\)
For any finite partial sum \(S_n x = \sum_{j=0}^{n} C^j x\), we have: \((I - C) \cdot S_n x = S_n x - C \cdot S_n x = \sum_{j=0}^{n} C^j x - \sum_{j=1}^{n+1} C^j x = x - C^{n+1} x\)
Taking the limit as \(n \to \infty\) and using \(\|C^{n+1} x\| \leq \|C\|^{n+1} \|x\| \to 0\), we get \((I - C) \cdot \sum_{j=0}^{\infty} C^j x = x\)
Therefore \(A^{-1} x = \sum_{j=0}^{\infty} C^j x\) for all \(x \in \IR[N]\).
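As a concrete illustration, here is a minimal numerical sketch (assuming NumPy). The function name neumann_inverse, the truncation tolerance, and the random test matrices are illustrative choices, not part of the theorem.

```python
import numpy as np

def neumann_inverse(A, max_terms=1000, tol=1e-12):
    """Invert A = I - N via the von Neumann series sum_j N^j.

    For nilpotent N the series is finite; for ||N||_2 < 1 we truncate
    once the terms drop below tol.
    """
    n = A.shape[0]
    N = np.eye(n) - A
    S = np.eye(n)
    term = np.eye(n)
    for _ in range(max_terms):
        term = term @ N                      # term = N^j
        if np.linalg.norm(term, 2) < tol:
            break
        S += term
    return S

# Unipotent example: N strictly upper triangular, hence nilpotent.
N = np.triu(np.random.randn(4, 4), k=1)
A = np.eye(4) - N
print(np.allclose(neumann_inverse(A), np.linalg.inv(A)))   # True

# Unicontractive example: rescale a random matrix so that ||C||_2 < 1.
C = np.random.randn(4, 4)
C *= 0.5 / np.linalg.norm(C, 2)
print(np.allclose(neumann_inverse(np.eye(4) - C), np.linalg.inv(np.eye(4) - C)))  # True
```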
Triangular Systems
In this section we will see that unipotent matrices occur naturally in triangular systems.
Definition Let \(A\in \IR[N,M]\) be a general matrix. We say:
- \(A\) is upper triangular (ut) if \(A[n,m] = 0\) whenever \(n > m\).
- \(A\) is strictly upper triangular (sut) if \(A[n,m] = 0\) whenever \(n \geq m\).
- \(A\) is lower triangular (lt) if \(A^*\) is upper triangular.
- \(A\) is strictly lower triangular (slt) if \(A^*\) is strictly upper triangular.
Proposition Consider \(\IR[K] \subset \IR[N]\) for \(K \leq N\) via inclusion of the first coordinates. Let \(A \in \IR[N,N]\), then:
- \(A\) is upper triangular if and only if \(A * \IR[K] \subset \IR[K]\) for all \(0 \leq K \leq N\).
- \(A\) is strictly upper triangular if and only if \(A * \IR[K] \subset \IR[K - 1]\) for all \(0 < K \leq N\).
Proof
Note that for an upper triangular matrix \(A\) we have \(A * e_k = \sum_n e_n A[n,k] = \sum_{n \leq k} e_n A[n,k] \in \IR[k+1]\); hence \(A * \IR[K] \subset \IR[K]\) for all \(K\), since \(e_k \in \IR[K]\) implies \(k+1 \leq K\). The rest follows similarly.
Proposition - Nilpotent matrices are Triangular.
Let \(A \in \IR[N,N]\).
- If \(A\) is strictly upper triangular, then \(A\) is nilpotent with \(A^N = 0\).
- If \(A\) is nilpotent, then there exists an orthogonal matrix \(Q \in \IR[N,N]\) with \(Q^* Q = I\), such that \(Q^* A Q\) is strictly upper triangular.
Proof
Part 1: Let \(A\) be strictly upper triangular. By the previous proposition, \(A * \IR[K] \subset \IR[K-1]\) for all \(K\). Therefore \(A^N * \IR[N] \subset A^{N-1} * \IR[N-1] \subset \cdots \subset A * \IR[1] \subset \IR[0] = \{0\}\), so \(A^N = 0\).
Part 2: Let \(A\) be nilpotent. We proceed by induction on \(N\).
For \(N=1\): If \(a \in \IR\) and \(a^k = 0\) for some \(k > 0\), then \(a = 0\), which is already strictly upper triangular.
For \(N>1\): Since \(A\) is nilpotent, \(A\) is not invertible, so \(\ker(A) \neq \{0\}\). Choose a unit vector \(q_1 \in \ker(A)\), so \(A q_1 = 0\). Extend \(q_1\) to an orthonormal basis \([q_1 \mid Q_2]\) where \(Q_2 \in \IR[N,N-1]\). Now consider \(A' = Q_2^* A Q_2\), this is an \((N-1)\times(N-1)\) matrix which is still nilpotent. By the induction hypothesis, there exists orthogonal \(Q' \in \IR[N-1,N-1]\) such that \(Q'^* A' Q'\) is strictly upper triangular. Hence \(Q = [q_1 \mid Q_2 Q']\) has the desired property.
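To illustrate the construction in Part 2, here is a minimal recursive sketch (assuming NumPy). Using the SVD to pick a unit kernel vector and a QR factorization to extend it to an orthonormal basis are implementation choices of this sketch, not prescribed by the proof.

```python
import numpy as np

def triangularize_nilpotent(A):
    """Return an orthogonal Q such that Q.T @ A @ Q is strictly upper triangular.

    Follows the induction in the proof: pick a unit kernel vector q1,
    extend it to an orthonormal basis [q1 | Q2], and recurse on Q2.T @ A @ Q2.
    """
    n = A.shape[0]
    if n == 1:
        return np.eye(1)
    # A unit kernel vector: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    q1 = Vt[-1]
    # Extend q1 to an orthonormal basis via QR of [q1 | I].
    basis, _ = np.linalg.qr(np.column_stack([q1, np.eye(n)]))
    q1, Q2 = basis[:, 0], basis[:, 1:]
    A_prime = Q2.T @ A @ Q2               # (n-1) x (n-1), still nilpotent
    Q_prime = triangularize_nilpotent(A_prime)
    return np.column_stack([q1, Q2 @ Q_prime])

# Example: a nilpotent matrix that is not triangular in the standard basis.
n = 4
N_sut = np.triu(np.random.randn(n, n), k=1)
R, _ = np.linalg.qr(np.random.randn(n, n))
A = R @ N_sut @ R.T                       # orthogonally conjugated, hence nilpotent
Q = triangularize_nilpotent(A)
T = Q.T @ A @ Q
print(np.allclose(np.tril(T), 0, atol=1e-8))  # strictly upper triangular (diagonal included)
```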
Algorithm - Backsubstitution
Let \(A = I-N\) be a unipotent matrix where \(N\) is strictly upper triangular. We can solve the system \(Ax = b\) via backsubstitution as follows:
- Initialize: Set \(x = b\) (since \((I-N)x = b\) implies \(x - Nx = b\), so \(x = b + Nx\)).
- Iterative step: For \(j = 0, 1, 2, \ldots\) until convergence: $$ x^{(j+1)} = b + N \cdot x^{(j)} $$
- Termination: Since \(N\) is nilpotent with \(N^k = 0\), the algorithm terminates in at most \(k\) steps with exact solution: $$ x = \sum_{j=0}^{k-1} N^j \cdot b $$
Alternative - Direct computation: For strictly upper triangular \(N\), solve component-wise from the last index to the first; see the sketch below.
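Here is a minimal sketch of both variants (assuming NumPy); the function names and the random test data are illustrative.

```python
import numpy as np

def solve_unipotent_iterative(N, b):
    """Iterate x <- b + N @ x; exact after at most len(b) steps since N is nilpotent."""
    x = b.copy()
    for _ in range(len(b)):
        x = b + N @ x
    return x

def solve_unipotent_direct(N, b):
    """Component-wise backsubstitution from the last index to the first."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # (I - N) x = b  =>  x[i] = b[i] + sum_{j > i} N[i, j] * x[j]
        x[i] = b[i] + N[i, i + 1:] @ x[i + 1:]
    return x

N = np.triu(np.random.randn(5, 5), k=1)   # strictly upper triangular
b = np.random.randn(5)
x = solve_unipotent_direct(N, b)
print(np.allclose((np.eye(5) - N) @ x, b))              # True
print(np.allclose(x, solve_unipotent_iterative(N, b)))  # True
```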
Unipotent Factorizations
We study factorizations of the form \(A = Q * U\), where:
- \(Q \in \IR[N,N]\) is explicitly invertible,
- \(U \in \IR[N,N]\) is unipotent (upper triangular with ones on the diagonal).
Proposition - QU Decomposition
Let \(A \in \IR[N,N]\) be invertible. Then there exist matrices \(Q \in \IR[N,N]\) (invertible) and \(U \in \IR[N,N]\) (unipotent) such that \(A = Q * U\).
In this case, write \(U = I_N - N\) with \(N\) nilpotent (\(N^k = 0\)); then: $$ A^{-1} = \left(\sum_{j=0}^{k-1} N^j\right) * Q^{-1}. $$
Notes
\(Q\) may be constructed using Gaussian elimination, Householder reflections, or Givens rotations.
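For concreteness, here is a minimal sketch (assuming NumPy) of one such construction: an ordinary QR factorization is rescaled so that the triangular factor becomes unipotent. The helper name qu_decomposition and the random test matrix are illustrative, not a library API.

```python
import numpy as np

def qu_decomposition(A):
    """Factor an invertible A as A = Q @ U with U unipotent (upper triangular, unit diagonal)."""
    Q_orth, R = np.linalg.qr(A)     # A = Q_orth @ R, R upper triangular
    d = np.diag(R)                  # nonzero for invertible A
    Q = Q_orth * d                  # scale column j of Q_orth by d[j]
    U = R / d[:, None]              # divide row i of R by d[i] -> unit diagonal
    return Q, U

A = np.random.randn(4, 4)
Q, U = qu_decomposition(A)
print(np.allclose(A, Q @ U))                 # True
print(np.allclose(np.diag(U), np.ones(4)))   # True

# A^{-1} via the Neumann series of U = I - N (N strictly upper triangular, N^4 = 0).
N = np.eye(4) - U
S = sum(np.linalg.matrix_power(N, j) for j in range(4))     # I + N + N^2 + N^3
print(np.allclose(S @ np.linalg.inv(Q), np.linalg.inv(A)))  # True
```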
Inverse Functions
Inverting unipotent maps via the von Neumann series gives us clean constructive proofs of the Inverse Function Theorem. Again we consider functions of the form
$$ F = I - N, $$
where \(N\) is close to zero, and we obtain an inverse via the von Neumann series \(F^{-1} = \sum_k N^k\). If \(N\) is small in the formal sense, i.e. the first-degree Taylor polynomial vanishes, we get a formal power series inverse. If we have Lipschitz-type bounds on \(N\) and \(DN\), we can construct differentiable inverses that converge in the \(C^k\) norm.
Formal Inverse Functions
Let \(F = (F_1, \dots, F_n)\), with each \(F_i \in \IR[[x_1, \dots, x_n]]\), be a "formal" map, with \(F(0) = 0\). \(F\) induces a homomorphism of formal power series rings \(F^*: \IR[[y_1, \dots, y_n]] \to \IR[[x_1, \dots, x_n]]\) via \(F^*(y_i) := F_i(x_1, \dots, x_n)\).
Theorem (Formal Inverse via von Neumann)
Assume that \(F\) has a "unipotent structure", i.e. \(F = I - N\) where \(I_i(x) = x_i\) and each component \(N_i \in \mathfrak{m}^2\), i.e., \(N\) vanishes to order at least 2.
Then the formal sum \(G := \sum_{k=0}^\infty N^k\) converges in the \(\mathfrak{m}\)-adic topology, and defines a formal inverse to \(F\). Thus, \(G \circ F = F \circ G = I\) in \(\IR[[x_1, \dots, x_n]]\).
Proof
Let \(\mathfrak{m} \subset \IR[[x_1, \dots, x_n]]\) be the maximal ideal \(\mathfrak{m} := (x_1, \dots, x_n)\) of series with vanishing constant term. By assumption \(F^* \mathfrak{m} \subset \mathfrak{m}\), and we get an induced map on the finite dimensional rings \(\IR[[x_1, \dots, x_n]]/\mathfrak{m}^{k+1}\) of degree \(\leq k\) polynomials.
We note that the operator \(N_k\) induced on \(\IR[[x_1, \dots, x_n]]/\mathfrak{m}^{k+1}\) is nilpotent, as \(N^*(\mathfrak{m}) \subset \mathfrak{m}^2\). Hence \(F_k = I - N_k\) is unipotent with inverse \(G_k = \sum_l N_k^l\). Passing to the projective limit over \(k\) yields the result.
Remarks If \(F\) is a formal function as above with invertible differential \(D = DF(0) \in GL_n(\IR)\), we can reduce to the case \(DF(0) = I\) by composing with the linear inverse \(D^{-1}\): \(\tilde{F}(x) := D^{-1} F(x) = x - N(x),\) where \(N \in \mathfrak{m}^2\). An explicit inverse for \(F\) is hence provided by the formula: $$ F^{-1}(x) = \sum_k N^k D^{-1} x. $$
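As a one-variable illustration (assuming SymPy), the following sketch realizes the \(\mathfrak{m}\)-adic construction as the fixed-point iteration \(G \mapsto x + N \circ G\), each pass of which is exact to one order higher in the maximal ideal. The helper formal_inverse and the truncation order are illustrative choices.

```python
import sympy as sp

x = sp.symbols('x')

def formal_inverse(N_expr, order=6):
    """Compositional inverse of F = x - N(x), with N in m^2, truncated at x**order."""
    G = x
    for _ in range(order):
        # Each pass agrees with the true inverse to one order higher.
        G = sp.expand(sp.series(x + N_expr.subs(x, G), x, 0, order + 1).removeO())
    return G

# Example: F(x) = x - x**2, whose inverse has Catalan-number coefficients.
N_expr = x**2
G = formal_inverse(N_expr, order=6)
print(G)                                                   # x + x**2 + 2*x**3 + 5*x**4 + ...
F_of_G = sp.series((x - x**2).subs(x, G), x, 0, 7).removeO()
print(sp.expand(F_of_G))                                   # x  (inverse up to the truncation order)
```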
Differentiable Inverse Functions
Theorem (\(C^k\) Inverse via von Neumann Series and Derivative Control)
Let \(F : U \subset \mathbb{R}^n \to \mathbb{R}^n\) be a continuously differentiable function satisfying \(F(0) = 0\) and \(DF(0) = I\). Define \(N(x) := x - F(x)\), so that \(N(0) = 0\) and \(DN(0) = 0\).
Assume that there are constants \(L > 0\), \(\delta > 0\) such that for all \(x \in B_\delta(0)\), \(\|DN(x)\| \leq L \|x\|.\)
Then, for any such \(L, \delta\), set \(\rho := \min\left( \delta, \frac{1}{2L} \right)\). The von Neumann series
$$ G(x) := \sum_{j=0}^\infty N^j(x) $$ converges absolutely in the \(C^1\) norm on \(B_\rho(0)\) and defines a continuously differentiable local inverse of \(F\) on \(B_\rho(0)\).
Moreover:
- If \(F\) is \(C^2\), the constants \(L,\delta\) can always be found.
- If \(F\) is \(C^k\) then \(G\) is of class \(C^{k-1}\).
Proof
Step 0 (Bounds on \(N\) and \(DN\)):
By the integral form of Taylor's theorem (or mean value estimate), \(N(x) = N(x) - N(0) = \int_0^1 DN(tx)\, x \, dt\), so for \(x \in B_\delta(0)\):
$$
|N(x)| \leq \int_0^1 L\, t\, |x| \cdot |x| \, dt = \frac{1}{2} L |x|^2.
$$
Step 1 (Convergence in \(C^0\) norm):
Define \(a_j := \|N^j(x)\|\) with \(a_0 = \|x\|\). Using the bound on \(N\), we have:
$$
a_{j+1} = |N(N^j(x))| \leq \frac{L}{2} a_j^2
$$
This implies double-exponential decay for \(a_j\), provided \(a_0 < \frac{1}{2L}\).
Hence \(\sum_j a_j\) converges absolutely for \(\|x\| < \rho\), and \(G(x) := \sum_j N^j(x)\) converges in norm.
Step 2 (First derivative):
Let \(b_j := \|D(N^j)(x)\|\). The chain rule gives: \(D(N^{j+1})(x) = DN(N^j(x)) \cdot D(N^j)(x)\)
so: \(b_{j+1} \leq \|DN(N^j(x))\| \cdot b_j \leq L a_j \cdot b_j.\)
Inductively: \(b_j \leq \prod_{i=0}^{j-1} L a_i.\) Since \(a_i\) decays doubly exponentially, this product decays exponentially, and \(\sum_j b_j\) converges. Therefore, \(DG(x) = \sum_j D(N^j)(x)\) converges absolutely, and \(G\) is \(C^1\).
Conclusion:
The map \(G(x) = \sum_j N^j(x)\) defines a \(C^1\) function on \(B_\rho(0)\) with \(F(G(x)) = x\). Since \(F\) is injective near \(0\), this also gives \(G(F(x)) = x\) for \(x \in B_\rho(0)\), so \(G\) is the local \(C^1\) inverse of \(F\).
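As a numerical illustration (assuming NumPy), the sketch below computes the local inverse via the Banach fixed-point iteration \(x \mapsto y + N(x)\) rather than by summing the series termwise; the quadratic perturbation N is an illustrative choice satisfying the hypotheses of the theorem.

```python
import numpy as np

def N(x):
    # A concrete quadratic perturbation with N(0) = 0 and DN(0) = 0 (illustrative choice).
    return 0.25 * np.array([x[0] * x[1], x[0] ** 2])

def F(x):
    return x - N(x)

def local_inverse(y, iterations=50):
    """Solve F(x) = y near 0 by the contraction iteration x <- y + N(x)."""
    x = y.copy()
    for _ in range(iterations):
        x = y + N(x)
    return x

y = np.array([0.1, -0.2])
x = local_inverse(y)
print(np.allclose(F(x), y))   # True
```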
Differential Equations
BACKUP
Underdetermined Systems
Let \(F \in \IR[M,N]\) be rectangular with \(M < N\) (i.e., \([M] = \{0, 1, \dots, M-1\}, [N] = \{0, 1, \dots, N-1\}\) with \(M < N\)). We partition \([N] = N_0 \cup N_1\) with \(|N_0| = M\), and correspondingly partition: \(F = [F_0 \mid F_1]\) where \(F_0 \in \IR[M,N_0]\) and \(F_1 \in \IR[M,N_1]\).
Assume \(F_0\) is invertible. Then the general solution to \(F * x = b\) is:
- Homogeneous case (\(b = 0\)): $$ x = (-F_0^{-1} * F_1 * y_1, y_1) $$ with \(y_1 \in \IR[N_1]\) arbitrary.
- Inhomogeneous case: $$ x = (F_0^{-1} * b - F_0^{-1} * F_1 * y_1, y_1) $$ with \(y_1 \in \IR[N_1]\) arbitrary.
If \(F_0 = Q * U\) is a QU factorization with \(U = I_{N_0} - N\), \(N\) nilpotent, then: $$ F_0^{-1} = \left(\sum_{j=0}^{k-1} N^j\right) * Q^{-1}. $$
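A minimal numerical sketch (assuming NumPy): np.linalg.solve stands in for the explicit QU-based inverse of \(F_0\), and the random data is illustrative (a random square block \(F_0\) is invertible with probability one).

```python
import numpy as np

M, N = 3, 5
F = np.random.randn(M, N)
b = np.random.randn(M)
F0, F1 = F[:, :M], F[:, M:]      # partition [N] = N0 ∪ N1 with N0 the first M indices

def solution(y1):
    """General solution x = (F0^{-1} b - F0^{-1} F1 y1, y1), y1 free in R^{N-M}."""
    x0 = np.linalg.solve(F0, b - F1 @ y1)
    return np.concatenate([x0, y1])

for _ in range(3):
    y1 = np.random.randn(N - M)
    print(np.allclose(F @ solution(y1), b))   # True for every choice of y1
```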
Rank-Revealing QU Factorization
Algorithm
Given \(A \in \IR[N,N]\) where \([N] = \{0, 1, \dots, N-1\}\), construct:
- Integer \(r \leq N\) (the rank),
- \(Q \in \IR[N,N]\) invertible (representing row operations),
- \(P \in \IR[N,N]\) permutation matrix (representing column operations),
- \(U \in \IR[N,N]\) with structure:
- \(U[i,j]\) for \(i,j \in [r]\): unipotent block \(U_{00}\),
- \(U[i,j]\) for \(i \in [r], j \in \{r, \dots, N-1\}\): arbitrary block \(U_{01}\),
- \(U[i,j] = 0\) for \(i \in \{r, \dots, N-1\}\) and all \(j \in [N]\) (zero rows below the rank),
such that \(A = Q * U * P\).
Remark
To isolate an \(r \times r\) invertible submatrix, we find \(r\) independent rows and \(r\) independent columns, which may differ. We resolve this via the permutation matrix \(P\).
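A minimal numerical sketch (assuming SciPy): column-pivoted QR (scipy.linalg.qr with pivoting=True) serves as the rank-revealing engine, and the rescaling of \(R\) into a unipotent leading block as well as the tolerance tol are choices of this sketch, not part of the algorithm above.

```python
import numpy as np
from scipy.linalg import qr

def rank_revealing_qu(A, tol=1e-12):
    """Sketch of A = Q @ U @ P with Q invertible, P a permutation, U structured as above."""
    n = A.shape[0]
    Q_orth, R, piv = qr(A, pivoting=True)        # A[:, piv] = Q_orth @ R
    r = int(np.sum(np.abs(np.diag(R)) > tol))    # numerical rank
    d = np.ones(n)
    d[:r] = np.diag(R)[:r]
    U = R / d[:, None]                           # unit diagonal on the leading r entries
    U[r:, :] = 0.0                               # rows below the rank are (numerically) zero
    Q = Q_orth * d                               # invertible: row operations
    P = np.eye(n)[:, piv].T                      # permutation matrix: column operations
    return r, Q, U, P

# Example usage on a rank-deficient matrix.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
r, Q, U, P = rank_revealing_qu(A)
print(r)                               # 2
print(np.allclose(A, Q @ U @ P))       # True
```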