Skip to content

Polynomials and Their Roots

We work throughout with the ring of polynomials in one variable with complex coefficients.

We work towards showing the set of roots of a polynomial depend continuously on the coefficients.

We use the unusual notation \(P[X]\) since \(\IR[X]\) is reserved for the vector space with basis set \(X\).

The Polynomial Ring

  • We define the ring of complex polynomials in one "formal" variable \(X\) as:

    \[ P[X] := \left\{ \sum_{k=0}^N a_k X^k \;\middle|\; N \in \IN,\; a_k \in \IC \right\} = \Vsum_{k \geq 0} \IC \cdot X^k \]

    As we will see, \(P[X]\) carries the structure of a commutative \(\IC\) algebra.

  • The multiplication of two polynomials is defined by the \(\IC\)-bilinear rule:
    If \(p = \sum_{i=0}^K a_k X^k\) and \(q = \sum_{j=0}^L b_l X^l\), then

    \[ p \cdot q = \sum_{n=0}^{N} \left( \sum_{k+l=n} a_k b_l \right) X^n \]

    where \(N = K+L\). This operation is commutative and associative.

  • Within \(P[X]\) we find the sub-vectors space of degree-N polynomials as:

    \[ P[X]_{\leq N} := \left\{ \sum_{k=0}^N a_k X^k \;\middle|\; a_k \in \IC \right\} = \Vsum_{0 \leq k \leq N} \IC \cdot X^k \]

    This is an \(N+1\) dimensional vectors space with basis \(X_0,\dots,X^N\).

  • Within \(P[X]\) we find the affine subspace of monic degree-N polynomials as:

    \[ P[X]_{N} := \left\{ \sum_{k=0}^N a_k X^k \;\middle|\; a_k \in \IC, a_N = 1 \right\}\]

    This is an affine vectors space over \(P[X]_{\leq N-1}\), that is parametrized by \(X^N + p\) where \(p \in P[X]_{\leq N-1}\).

Polynomial Division

Definition (Degree Function)

For a non-zero polynomial \(p = \sum_{k=0}^N a_k X^k \in P[X]\) with \(a_N \neq 0\), we define its degree as \(\deg(p) := N\).

By convention, we set \(\deg(0) := -\infty\).

Proposition (Basic Properties of Degree)

For polynomials \(f, g \in P[X]\):

  • \(\deg(fg) = \deg(f) + \deg(g)\) for non-zero polynomials \(f, g\)
  • For any \(f, g \in P[X]\) with \(g \neq 0\), there exist unique polynomials \(g', r \in P[X]\) such that

    \[ f = g \cdot g' + r \]

    with either \(r = 0\) or \(\deg(r) < \deg(g)\).

Proof of Division Algorithm

Existence: We prove by strong induction on \(\deg(f)\).

If \(\deg(f) < \deg(g)\), take \(q = 0\) and \(r = f\).

Otherwise, let \(f = a_n X^n + \text{lower terms}\) and \(g = b_m X^m + \text{lower terms}\) with \(n \geq m\) and \(a_n, b_m \neq 0\).

Consider \(f_1 = f - \frac{a_n}{b_m} X^{n-m} \cdot g\). Then \(\deg(f_1) < \deg(f)\).

By induction hypothesis: \(f_1 = q_1 g + r\) with \(\deg(r) < \deg(g)\).

Setting \(q = \frac{a_n}{b_m} X^{n-m} + q_1\) gives \(f = qg + r\).

Uniqueness: If \(f = q_1 g + r_1 = q_2 g + r_2\) with \(\deg(r_1), \deg(r_2) < \deg(g)\), then \((q_1 - q_2)g = r_2 - r_1\). Since \(\deg(r_2 - r_1) < \deg(g)\) but \(\deg((q_1 - q_2)g) = \deg(q_1 - q_2) + \deg(g) \geq \deg(g)\) (unless \(q_1 = q_2\)), we must have \(q_1 = q_2\) and \(r_1 = r_2\).

Proposition (Greatest Common Divisor)

For polynomials \(f, g \in P[X]\) with \(\deg(f) > 0\) and \(\deg(g) > 0\), there exists a unique monic polynomial \(d \in P[X]\) such that for any polynomial \(h \in P[X]\), the following are equivalent:

  • \(h | f\) and \(h | g\)
  • \(h | d\)

We call \(d = \gcd(f,g)\) the greatest common divisor of \(f\) and \(g\).

Furthermore, there exist polynomials \(a, b \in P[X]\) such that:

\[\gcd(f,g) = sf + tg\]

This is Bézout's identity.

Proof

Construction via Extended Euclidean Algorithm:

We construct \(\gcd(f,g)\) as reminder \(r_i\) and the Bézout coefficients \(s_i, t_i\) simultaneously. The key observation is that if \(d|f\) and \(d|g\) then \(d|r\) where \(r = f - g g'\) is the reminder of \(f\) divided by \(g\).

  1. Initialize: \(r_{-1} = f\), \(r_0\), and \((s_{-1}, t_{-1}) = (1, 0)\), \((s_0, t_0) = (0, 1)\) Note that:

    • \(deg(r_0) = deg(g) > 0\)
    • \(s_{-1} f + t_{-1} g = r_{-1} = f\)
    • \(s_0 f + t_0 g = r_0 = g\)
  2. For \(i = 0, 1, 2, \ldots\) until \(r_{i+1} = 0\):

    • Define \(r_{i+1}\) as a reminder \(r_{i-1} = r_i' r_i + r_{i+1}\) of polynomial division of \(r_{i-1}\) by \(r_i\).
    • \(s_{i+1} = s_{i-1} - r_i' s_i\)
    • \(t_{i+1} = t_{i-1} - r_i' t_i\)

    Note that after this step:

    • \(\deg(r_{i+1}) < \deg(r_i)\)
    • \(s_{i+1} f + t_{i+1} g = (s_{i-1} - r_i' s_i) f + (t_{i-1} - r_i' t_i) g = (s_{i-1}f + t_{i-1}g) - r_i' (s_i f + t_i g) = r_{i-1} - r_i' r_i = r_{i+1}\)
  3. If \(r_{k+1} = 0\) for some \(k\) we terminate. Let \(c\) be the leading coefficient of \(r_k\) and

    • Set \(d := r_k / c\)
    • Set \(s := s_k /c\), \(t := t_k / c\)

Termination: Since \(\deg(r_0) > \deg(r_1) > \deg(r_2) > \cdots \geq 0\), the algorithm must terminate.

Correctness: When the algorithm terminates, we have \(r_k \neq 0\) and \(r_{k+1} = 0\). Therefore \(r_{k-1} = r_k' r_k\), and \(d|r_k\) and \(d | r_{k-1}\). Similarly \(d\) divides \(r_{k-2} = r_{k-1} r_{k-1}' - r_k\) all previous remainders, including \(r_0 = g\) and \(r_{-1} = f\).

Conversely, any common divisor of \(f\) and \(g\) must divide \(d = s f + t g\).

Uniqueness: If \(d_1\) and \(d_2\) both satisfy the universal property, then \(d_1 | d_2\) and \(d_2 | d_1\), so \(d_1 = c \cdot d_2\) for some constant \(c\). Since both are monic, \(c = 1\).

Corollary (Fundamental Divisibility Property)

  • Polynomials \(f, g \in P[X]\) are coprime (i.e. \(gcd(f,g) = 1\)) if and only if there exist polynomials \(s, t \in P[X]\) such that \(sf + tg = 1\).
  • If \(\gcd(f,g) = 1\) and \(f | gh\) for some polynomial \(h \in P[X]\), then \(f | h\).
Proof

Since \(f\) and \(g\) are coprime, by Bézout's identity there exist \(s, t \in P[X]\) such that \(sf + tg = 1\). Multiplying by \(h\): \(sfh + tgh = h\).

Since \(f | gh\), we have \(gh = fk\) for some \(k \in P[X]\). Substituting: \(sfh + tfk = f(sh + tk) = h\).

Therefore \(f | h\).

The Resultant

We introduce the following notation, to identify \(P[X]_N\) with the linear space \(\IC[N]\):

  • For a vector \(a \in \IC[N]\) we write \(a[X] := \sum a[i] X^i\) for the corresponing polynomial.
  • For a polynomial \(p \in P[X]_{\leq N}\) we write \(\Coef(p) = [p_0,\dots, p_N] \in \IC[N]\).

We multiplication map is a differentiable (quadratic) map between vector spaces:

\[ M : P[X]_{\leq K} \times P[X]_{\leq L} \lra P[X]_{\leq N}, \quad (p,q) \mapsto p \cdot q \]

This is a differentiable map between vectors spaces of dimension \(K+1+L+1 = N+2\) and \(N+1\). We identify the tangent space of \(P[X]_{\leq K}\) at \(p\) with \(\IC[K+1]\) via \(p + ta[x]\).

We also get an induced map between the affine spaces of monic polynomials:

\[ \tilde{M} : P[X]_K \times P[X]_L \lra P[X]_N, \quad (p,q) \mapsto p \cdot q \]

This is a differentiable map between affine spaces of the same dimension \(K+L= N\). We identify the tangent space of \(P[X]_{K}\) at \(p\) with \(\IC[K]\) via \(p + ta[x]\).

Proposition

  • The differential of \(M\) at a point \(p \in P[X]_{\leq K},q \in P[X]_{\leq L}\) in a direction \(a \in \IC[K+1]\), \(b \in \IC[L+1]\) is given by

    $$ DM(p,q) [a|b] = \Coef( p \cdot b[x] + b \cdot a[x]) \quad \in \IC[N+1] $$

    Note, that this is a non-square matrix \((N+1) \times (N+2)\) matrix, and always has a non-zero kernel element \([a|-b]\).

  • The differential of \(\tilde{M}\) at a point \(p \in P[X]_{K},q \in P[X]_{L}\) in a tangent direction \(a \in \IC[K], b \in \IC[L]\) is given by the same formula (but on different spaces):

    $$ D\tilde{M}(p,q) [a|b] = \Coef( p \cdot b[x] + b \cdot a[x]) \quad \in \IC[N] $$

    This \(N \times N\) square matrix.

Proof

\(DM(p,q)[a|b] = \frac{\partial}{\partial t}\bigg|_{t=0} M(p+ta[x],q+tb[x]) = \frac{\partial}{\partial t}\bigg|_{t=0}\Coef( (p + t a[x]) ( q + t b[x]) ) = \Coef(p \cdot b[x] + q \cdot a[x])\).

Example: Example 1×1 Case

For a polynomials \(p,q \in P[X]_{\leq 1}, p = p_1 X + p_0, q = q_1 X + q_0\) we find:

\[ DM(p,q) = \begin{bmatrix} q_0 & 0. & p_0 & 0 \\ q_1 & q_0 & p_1 & p_0 \\ 0 & q_1 & 0. & p_1 \end{bmatrix} \]

For a polynomials \(p,q \in P[X]_{1}, p = X + p_0, q = X + q_0\) we have:

\[ D\tilde{M}(p,q) = \begin{bmatrix} q_0 & p_0 \\ 1 & 1 \end{bmatrix} \]
Example: Example 2×1 Case

For polynomials \(p \in P[X]_{\leq 2},\ q \in P[X]_{\leq 1}\), with \(p = p_2X^2 + p_1 X + p_0\), \(q = q_1 X + q_0\):

\[ DM(p,q) = \begin{bmatrix} q_0 & & & p_0 & \\ q_1 & q_0 & & p_1 & p_0 \\ & q_1 & q_0 & p_2 & p_1 \\ & & q_1 & & p_2 \end{bmatrix} \]

For monic polynomials \(p = X^2 + p_1 X + p_0\), \(q = X + q_0\):

\[ D\tilde{M}(p,q) = \begin{bmatrix} q_0 & & p_0 \\ 1 & q_0 & p_1 \\ & 1 & p_1 \\ \end{bmatrix} \]
Example: Example 3×2 Case

For polynomials \(p \in P[X]_{\leq 3},\ q \in P[X]_{\leq 2}\), with \(p = p_3 X^3 + p_2 X^2 + p_1 X + p_0\), \(q = q_2 X^2 + q_1 X + q_0\):

\[ DM(p,q) = \begin{bmatrix} q_0 & & & & p_0 & & \\ q_1 & q_0 & & & p_1 & p_0 & \\ q_2 & q_1 & q_0 & & p_2 & p_1 & p_0 \\ & q_2 & q_1 & q_0 & p_3 & p_2 & p_1 \\ & & q_2 & q_1 & & p_3 & p_2 \\ & & & q_2 & & & p_3 \end{bmatrix} \]

For monic polynomials \(p = X^3 + p_2 X^2 + p_1 X + p_0\), \(q = X^2 + q_1 X + q_0\):

\[ D\tilde{M}(p,q) = \begin{bmatrix} q_0 & & & p_0 & \\ q_1 & q_0 & & p_1 & p_0 \\ 1 & q_1 & q_0 & p_2 & p_1 \\ & 1 & q_1 & 1 & p_2 \\ & & 1 & & 1 \\ \end{bmatrix} \]

Definition

  • Sylvester Matrix. For \(p \in P[X]_{\leq K},q \in P[X]_{\leq L}\) the Sylvester Matrix \(S(a,b) \in \IR[N,N]\), \(N=K+L\) is $$ S(p,q)[a|b] = [a|b] = \Coef( p \cdot b[x] + b \cdot a[x]) \quad \in \IC[N] $$ where \(a \in \IC[K]\) and \(b \in \IC[L]\).
  • Resultant. The determinant of the Sylvester Matrix is called Resultant \(Res(a,b) = \det(S(a,b))\).

    The resultant is a polynomial in the entries \(a[k],b[l]\) of degree \(N\).

Proposition

Let \(p,q \in P[X]\) be two polynomials, of degree \(K>0\), \(L>0\) respectively. Let \(f = p \cdot q\). The following are equivalent:

  • (a) \(p,q\) are coprime, i.e. \(gcd(p,q) = 1\).
  • (b) \(Res(p,q) \neq 0\)
  • (c) \(D\tilde{M}(p,q)\) is an invertible \(N \times N\) matrix.
  • (d) \(\tilde{M} : P[X]_K \times P[X]_L \lra P[X]_N,\ (p,q) \mapsto p \cdot q\) is a local diffeomorphism near \(p,q\).
  • (e) There are differentiable functions \(P: P[X]_N \ra P[X]_K\), \(Q:P[X]_N \ra P[X]_L\) defined near \(f\), so that \(f'=P(f') \cdot Q(f')\) for all \(f'\) near \(f\) in the Euklidean topology. We say "the factorization \(f = p \cdot q\) is stable under deformations of \(f\)".

We will later see that \(P,Q\) are in-fact polynomial functions (not just differentiable, complex analytic or rational).

Proof

Notation: The equivalence of b,c,d,e are given by the implicit function theorem applied to \(S(p,q) = D\tilde{M}(p,q)\).

(\(a \Rightarrow b\)) Let \(d\) with \(\deg(d) > 0\) be a common factor, with \(p = d \cdot p'\) and \(q = d \cdot q'\) with \(\deg(p') < K\) and \(\deg(q') < L\). Consider the vector \(v = [p'|-q'] \in \IC[K+L]\), we find:

\[S(p,q) * v = p \cdot (-q') + q \cdot p' = - d \cdot p' \cdot q' + d \cdot q' \cdot p' = 0\]

Since \(v \neq 0\), the matrix \(S(p,q)\) has a non-trivial kernel, hence \(\det(S(p,q)) = 0\).

(\(b \Rightarrow a\)) Suppose \(Res(p,q) = 0\), and for a contradiction that \(p,q\) are co-prime. Then \(S(p,q)\) is singular, so there exists a non-zero vector \(v = [u|w] \in \IC[K+L]\) such that \(S(p,q) * v = 0\). This means \(p \cdot w + q \cdot u = 0\), where \(\deg(w) < L\) and \(\deg(u) < K\).
Rearranging: \(p \cdot w = -q \cdot u\), which shows that \(p | q \cdot u\). As \(p,q\) are coprime polynomials, we conclude that \(p | u\), contradicting \(\deg(u) < K = \deg(p)\).

Higher Resultants

We saw above that the divisibility properties of polynomials are closely linked to the stability of products under deformation. The arguments carry over verbatim to the case of an \(n\)-fold product.

Definition

Let \(p_1 \in P[X]_{K_1}, \ldots, p_n \in P[X]_{K_n}\) be monic polynomials with \(N = K_1 + \cdots + K_n\).

  • Let \(M: P[X]_{K_1} \times \cdots \times P[X]_{K_n} \lra P[X]_N, \quad (p_1,\ldots,p_n) \mapsto p_1 \cdot p_2 \cdot \ldots \cdot p_n\)
  • Let \(S(p_1,\ldots,p_n) = DM(p_1,\ldots,p_n)\) be the Higher Sylvester Matrix
  • Let \(Res(p_1,\ldots,p_n) = \det(S(p_1,\ldots,p_n))\) be the Higher Resultant

Proposition

The differential of \(M\) at \((p_1,\ldots,p_n)\) in direction \((a_1,\ldots,a_n)\) is:

\[DM(p_1,\ldots,p_n)[a_1|\ldots|a_n] = \sum_{i=1}^n a_i[x] \hat{p}_i\]

where \(\hat{p}_i = \prod_{j \neq i} p_j\) is the product of all \(p_j\) except \(p_i\).

Proposition

The following are equivalent:

  • (a) \(p_1, \ldots, p_n\) are mutually coprime, i.e., \(\gcd(p_i,p_j) = 1\) for all \(i \neq j\)
  • (b) \(Res(p_1,\ldots,p_n) \neq 0\)
  • (c) \(DM(p_1,\ldots,p_n)\) is an invertible \(N \times N\) matrix
  • (d) \(M\) is a local diffeomorphism near \((p_1,\ldots,p_n)\)
Proof

As before the equivalence of b,c,d,e is given by the implicit function theorem.

We only show the equivalence of (a) and (b):

(\(a \Rightarrow b\)) If \(p_1, \ldots, p_n\) are not mutually coprime, then there exist indices with a nontrivial common factor \(d\) with \(\deg(d) > 0\). After reordering, we can assume \(d | p_1\) and \(d | p_2\).

Write \(p_1 = d \cdot p_1'\) and \(p_2 = d \cdot p_2'\) where \(\deg(p_1') < K_1\) and \(\deg(p_2') < K_2\).

Consider the vector \(v = (p_2', -p_1', 0, \ldots, 0) \in \IC^N\) where \(p_2'\) is in the first position and \(-p_1'\) is in the second position.

Then \(S(p_1,\ldots,p_n) \cdot v = p_2' \hat{p}_1 - p_1' \hat{p}_2 =(p_2' p_2 - p_1' p_1) (p_3 \cdots p_n) = 0\)

Since \(v \neq 0\), we have \(\det S = 0\).

(\(b \Rightarrow a\)) Let \(Res(p_1, \ldots, p_n) = 0\), and assume for a contradiction that \(p_1, \dots, p_n\) are mutually coprime. As \(S(p_1, \ldots, p_n)\) is singular there exists a nonzero vector \((u_1, \ldots, u_n) \in \IC^N\) such that \(\sum_{i=1}^n u_i[x] \hat{p}_i = 0\).

Assume wlog. \(u_1 \neq 0\), then \(u_1 \hat{p}_1 = -\sum_{i=2}^n u_i[x] \hat{p}_i\). As \(p_1 | \hat{p_i}\) for all \(i>1\), we find \(p_1 | u_1 \hat{p}_1\), as \(p_i\) are coprime we must have \(p_1 | u_1\) which contradicts \(deg(u_1) < K_1\).

Example: 3×2×2 Case

Consider three monic polynomials:

  • \(p_1 = X^3 + p_{12} X^2 + p_{11} X + p_{10} \in P[X]_3\)
  • \(p_2 = X^2 + p_{21} X + p_{20} \in P[X]_2\)
  • \(p_3 = X^2 + p_{31} X + p_{30} \in P[X]_2\)

Introduce the notation for the "skip-products":

  • \(q_1 = p_2 p_3 = X^4 + q_{13} X^3 + q_{12} X^2 + q_{11} X + q_{10}\)
  • \(q_2 = p_1 p_3 = X^5 + q_{24} X^4 + q_{23} X^3 + q_{22} X^2 + q_{21} X + q_{20}\)
  • \(q_3 = p_1 p_2 = X^5 + q_{34} X^4 + q_{33} X^3 + q_{32} X^2 + q_{31} X + q_{30}\)

The resulting \(7 \times 7\) Sylvester matrix looks as follows:

\[S(p_1,p_2,p_3) = \begin{bmatrix} q_{10} & & & q_{20} & & q_{30} & \\ q_{11} & q_{10} & & q_{21} & q_{20} & q_{31} & q_{30} \\ q_{12} & q_{11} & q_{10} & q_{22} & q_{21} & q_{32} & q_{31} \\ q_{13} & q_{12} & q_{11} & q_{23} & q_{22} & q_{33} & q_{32} \\ 1 & q_{13} & q_{12} & q_{24} & q_{23} & q_{34} & q_{33} \\ & 1 & q_{13} & 1 & q_{24} & 1 & q_{34} \\ & & 1 & & 1 & & 1 \end{bmatrix}.\]
Example: 1×1×1 Case

Consider three monic linear polynomials:

  • \(p_1 = X - r_1 \in P[X]_1\)
  • \(p_2 = X - r_2 \in P[X]_1\)
  • \(p_3 = X - r_3 \in P[X]_1\)

The skip-products are:

  • \(q_1 = p_2 p_3 = (X - r_2)(X - r_3) = X^2 - (r_2 + r_3) X + r_2 r_3\)
  • \(q_2 = p_1 p_3 = (X - r_1)(X - r_3) = X^2 - (r_1 + r_3) X + r_1 r_3\)
  • \(q_3 = p_1 p_2 = (X - r_1)(X - r_2) = X^2 - (r_1 + r_2) X + r_1 r_2\)

The resulting \(3 \times 3\) Sylvester matrix is:

\[S(p_1,p_2,p_3) = \begin{bmatrix} r_2 r_3 & r_1 r_3 & r_1 r_2 \\ -(r_2 + r_3) & -(r_1 + r_3) & -(r_1 + r_2) \\ 1 & 1 & 1 \end{bmatrix}\]

We calculate \(\det S(p_1, p_2, p_3) = (r_1 - r_2)(r_1 - r_3)(r_2 - r_3)\)

Roots of Polynomials

Let \(Sym: \IC[N] \ra P[X]_N\) be the product map \(Sym(r_1, \dots, r_N) = (X - r_1) \cdot \dots \cdot (X - r_N)\). Note that the coefficients of \(Sym\) are the elementary symmetric polynomials (up to sign):

  • \(Sym_{N-1} = (-1)^{N-1} (r_1 + \dots + r_N)\)
  • \(Sym_{N-2} = (-1)^{N-2} \sum_{i < j} r_i r_j\)
  • \(Sym_0 = (-1)^N r_1 \cdot \dots \cdot r_N\)

This map is invariant under re-ordering of the "roots" \(r_i\) and hence descends to the quotient \(S: \IC[N]/S_N \ra P[X]_N\). The quotient \(\IC[N]/S_N\) inherits a topology from \(\IC[N]\) making it a topological space.

Proposition

The map \(S: \IC[N]/S_N \ra P[X]_N\) is a set-theoretic bijection:

  • If \(S(r_1, \dots, r_N) = p = S(s_1, \dots, s_N)\), then \(r_i\) and \(s_i\) are a permutation of each other.
  • Each monic polynomial \(p \in P[X]_N\) can be factored into \(p = (X - r_1) \cdot \dots \cdot (X - r_n)\).
Proof

a) We can recover the multiplicity of a root \(r\) as vanishing order of \(p\) at \(r\). \(\# \{ i \ |\ r_i = r \} = ord_r(p) = \# max \{ k\ |\ p(r) = p'(r) = \dots = p^{(k-1)}(r) = 0 \}\).

b) This is the Fundamental Theorem of Algebra.

Theorem (Continuity of the Roots)

Let \(R: P[X]_N \lra \IC[N]/S_N\) the set-theoretic inverse to \(S\), mapping a polynomial \(p\) to it's unordered tuple of roots \([r_1,\dots,r_N]\).

Then, \(R\) is continous.

Proof

Fix \(p \in P[X]_N\). We shows that there is a neighbourhood \(U\) around \(p\) where \(R\) is continous.

  • Factorize \(p = \prod_{i=1}^n (X-r_i)^{m_i}\) where \(r_i \neq r_j\) if \(i \neq j\), and let \(p_i = (X-r_i)^{m_i}\).
  • The factors \(p_i\) are co-prime by construction.
  • Therefore \(M(p_1, \dots, p_n)\) is a local diffeomorphism and we can find local inverse \(P_1(p'), \dots, P_N(p')\), with \(P_1(p')\cdot \dots \cdot P_N(p') = p'\) and \(P_i(p) = p_i\).
  • Now \(R(P_1(p')\cdot \dots \cdot P_N(p')) = R(P_1(p')) + \dots + R(P_N(p'))\), where \(+\) is defined by concatenation (see below).
  • It hence suffices to show that \(R(P_i(p'))\) is continuous near \(p\):
    • As \(P_i(p')\) is continuous near \(p\), it suffices to show that \(R\) is continuous near \(p_i = (X-r_i)^{m_i}\).
    • By change of coordinates, this statement is equivalent to \(R(X^m_i)\) is continuous near \(0\).
    • This is the content of the Lemma 2 below.
  • QED.

Lemma 1 - Continuity of Concatenation

Define an addition operator "+" on \(\Union_{N\geq 0} \IC[N]/S_N\) by concatenation of points: \([r_1,\dots,r_K] + [s_1,\dots,s_L] = [r_1,\dots,r_K, s_1, \dots, s_L]\). This operation is continuous.

Proof

Consider $$ \IC[K] \times \IC[L] \to \IC[K+L]/S_{K+L}, \quad (r_1, \ldots, r_K), (s_1, \ldots, s_L) \mapsto [r_1, \ldots, r_K, s_1, \ldots, s_L] $$

Since this map is well-defined on the quotient spaces \(\IC[K]/S_K \times \IC[L]/S_L\) (concatenation respects equivalence classes), it descends to a continuous map on the quotients by the universal property of quotient topology.

Lemma 2 - Root Bound for Nearly Monic Polynomials

\(R: P[X]_N \lra \IC[N]/S_N\) is continuous near \(X^N\).

Proof

We have to show that for all open neighbourhoods \(U\) of \([0,\dots,0]\) in \(\IC[N]/S_N\) we find an open neighbourhood \(V\) of \(X^N\) in \(P[X]_N\) that maps into \(U\).

Fix \(\eps > 0\). We construct a \(\delta > 0\) so that all roots "\(r \in R(p)\)" have \(|r| < \eps\), if the coefficients of \(p\) satisfy \(|p_i| < \delta\).

Assume that \(0 = p(z) = z^N + p_{N-1}z^{N-1} + \dots p_0\).

Then \(|z|^N = |p_{N-1}z^{N-1} + \dots + p_0| \leq |p_{N-1}z^{N-1}| + \dots |z| |p_1| + |p_0| \leq \delta (1 + |z| + \dots |z|^{N-1})\)

If \(|z| \geq 1\), then \(|z| \leq \delta (1/|z|^{N-1} + 1/|z|^{N-2} + \dots + 1) \leq N \delta\). Setting \(\delta < 1/N\) will force a contradiction.

If \(|z| < 1\), then \(|z| \leq (\delta N)^{1/N}\), hence setting \(\delta < \eps^N/N\) will be sufficient.

Corollary

  • The set of polynomials where all roots are in an open subset is open.
  • The set of polynomials where all roots are in a closed subset is closed.
  • The set of polynomials where two roots collide is closed.
  • The set of polynomials where all roots are distinct is open.

The Discriminat Locus

Proposition

Consider the map \(Sym: \IC[N] \ra P[X]_N\) near \(r=(r_1, \dots, r_N)\) with \(Sym(r) = p\).

The following are equivalent:

  • The roots \(r_i\) are distinct: \(r_i \neq r_j\) for all \(i \neq j\).
  • The Discriminant \(Disc(p) = Res(p,p')\) does not vanish.
  • The Higher Resultant \(Res(X-r_1, \dots, X-r_N)\) does not vanish.
  • \(Sym\) is a local diffeomorphism near \(p\).
Proof

(1) ⟺ (2): The resultant \(Res(p,p')\) measures whether \(p\) and \(p'\) have common roots. Since \(p'(r_i) = \prod_{j \neq i}(r_i - r_j)\), we see that \(p\) and \(p'\) share a root if and only if \(r_i = r_j\) for some \(i \neq j\). Thus \(Res(p,p') \neq 0\) if and only if all roots are distinct.

(1) ⟺ (3): By definition, \(Res(X-r_1, \ldots, X-r_N)\) is the determinant of the higher Sylvester matrix for the linear factors \((X-r_i)\). From our Example 1×1×1 case, we know this determinant equals \(\prod_{i<j}(r_i - r_j)\). This product is non-zero if and only if all \(r_i\) are distinct.

(2) ⟺ (4): The differential \(D Sym(r)\) at \(r = (r_1, \ldots, r_N)\) is given by the higher Sylvester matrix for \((X-r_1, \ldots, X-r_N)\). By our general theory of higher resultants, this matrix is invertible if and only if \(Res(X-r_1, \ldots, X-r_N) \neq 0\). By the inverse function theorem, \(Sym\) is a local diffeomorphism if and only if \(D Sym(r)\) is invertible.

Theorem

Let

  • \(D := \{ p \in P[X]_N \mid \mathrm{Disc}(p) = 0 \}\) be the discriminant locus,
  • \(U := P[X]_N \setminus D\) the open subset of monic degree-\(N\) polynomials with distinct roots.
  • \(V := \mathrm{Sym}^{-1}(U) \subset \mathbb{C}^N\) the space of \(N\)-tuples of distinct roots.
  • Fix \(p \in U\) as base-point, with fiber \(F_0 = \mathrm{Sym}^{-1}(p_0)\).

Then

  • The map \(\mathrm{Sym}: V \to U, (r_1, \dots, r_N) \mapsto \Coef(\prod_{i=1}^N (X - r_i))\) is a finite Galois covering of degree \(N!\):

    • All fibers \(F_p = \mathrm{Sym}^{-1}(p)\) have same cardinality \(|F_p| = N!\).
    • For each point \(p \in U\) there is a neighbourhood \(D\) of \(p \in U\) so that \(\Sym^{-1}(D) \isom F_p \times D\).
    • The group of deck transformations \(\mathrm{Deck}(V/U)\) is canonically isomorphic to \(S_N\).
    • \(S_N \isom \mathrm{Deck}(V/U)\) acts transitively on the fiber \(F_p\).
  • The action of of \(S_N\) on each fiber \(F_p\) is free, making \(V\) into a principal \(S_N\)-bundle.
  • The fundamental group \(\pi_1(U,p)\) is isomorphic to the braid group in N strands \(B_N\).
  • The monodromy representation \(\rho : \pi_1(U,p) \to \mathrm{Aut}(\mathrm{Sym}^{-1}(p)) \cong Aut(S_N)\) factors as the canonical projection \(B_N \twoheadrightarrow S_N\) followed by the left-action of \(S_N\) on itself.
Proof

- To show that \(\Sym\) is a covering space. It suffices to show that \(\Sym\) is proper. To see this we can exploit the continuity of the inverse function \(R\) and the fact that continuous images of compact sets are compact. - The \(S_N\) action is free on the area \(V\) where no points collide and gives us the morphism \(S_N \ra \mathrm{Deck}(V/U)\). - We have seen that \(V/S_N = U\) giving us a Galois cover and Principal \(S_N\)-bundle structure.

- We consider the Braid group \(B_N\) as an abstract group with generators \(\sigma_1,\dots,\sigma_{N-1}\) and relations \(\sigma_i \sigma_j = \sigma_j \sigma_i\) for \(|i - j| > 1\) and \(\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}\) for \(i=1,\dots,N-2\) - The map \(B_N \ra \pi_1(U,p)\) can be constructed by mapping geneators \(\sigma_1\) to loops that realize adjacent transpositions "\((i i+1) \in S_N\)". For example if we choose \(p=Sym(0,2,4,\dots,2*N)\) we can map \(\sigma_1\) to \(\gamma_1=\Sym(1-\exp(\pi i t), 1+\exp(\pi i t),4,6,\dots,2N)\). This is a loop that flips \(r_1\) and \(r_2\) in \(V\). - One can check that the braid relations are satisfied. - To show that this is an isomorphism is beyond the scope of this note. A proof can be found in Fadell–Neuwirth`1962 paper.

Example N=2

For \(N=2\) we get the following picture:

  • Denote by \(a,b\) the roots of a polynomial \((X-a)(X-b) = X^2-(a+b)X+ab\)
  • Denote by \(p,q\) the cofficients \(p = -(a+b), q=ab\), so \((X-a)(X-b) = X^2+pX+q\).

The map \(\mathrm{Sym}: \IC[2]_{a,b} \ra \IC[2]_{p,q}, \mathrm{Sym}(a,b) = [-(a+b), ab]\) is S^2 invariant, and descends to a map \(S: \IC[2]_{a,b}/S_2 \ra \IC[2]_{p,q}\) which is a continous bijection.

The inverse \(R\) is given by the pq-Formula and maps a pair of coefficients \(p,q\) to an unordered pair of points \(R(p,q)=[\half(-p \pm \sqrt{\Delta})]\), where \(\Delta = p^2 - 4q\). Note that this is well defined as a map to \(\IC[2]_{a,b}/S_2\) but not as a map to \(\IC[2]_{a,b}\). Any continous lift to \(\IC[2]\) would require use to make a consitent choice of a square root, but following a small loop around \(\Delta = 0\) will flip any such choice.

Symmetric Functions

We have seen above that \(\mathrm{Sym}: \IC[N] \ra P[X]_N\) induces a bijection of sets \(\IC[N]/S_N \ra P[X]_N\).

That means, that every map \(f: \IC[N] \ra X\) to an arbitrary set \(X\), that is \(S_N\) invariant: $$ f(r_1, \dots, r_n) = f(r_{\pi 1}, \dots, r_{\pi n}), \quad \pi in S_N $$ factors through \(\mathrm{Sym}\), i.e. there is a unique \(F:P[X]_N \ra X\), so that \(f = F \circ \mathrm{Sym}\). Indeed, we can set \(F(a_0,\dots,a_{N-1}):=f(r_1,\dots,r_n) = f(R(a))\) where \(r_1, \dots, r_n\) is any ordering of the root set \(R(a)\).

Now let's assume that \(f: \IC[N] \ra X\) has additional structure, we can ask does the map \(F\) have the same structure?

Proposition

Let \(f: \IC[N] \ra \IC\) be an \(S_N\) invariant map, and let \(F = f \circ R\) the induced map on \(P[X]_N \ra X\).

  1. If \(f\) is continuous, then \(F\) continuous.
  2. If \(f\) is analytic, then \(F\) is analytic.
  3. If \(f\) is a polynomial, then \(F\) is a polynomial.
  4. If \(f\) is a rational function, then \(F\) is a rational function.

However, for \(N=2\) the differentiable function \(f(a,b) = |a - b|^2\) but the composition \(F(p,q) = (f \circ R)(p,q) = |p^2/2 - q|\) is not differentiable.

Proof
  1. Continuity: We have shown that \(R: P[X]_N \to \IC[N]/S_N\) is continuous above. Since \(f\) is \(S_N\)-invariant, it descends to the quotient \(\IC[N]/S_N\) as a continuous map \(\tilde{f}: \IC[N]/S_N \to \IC\). Hence \(F = \tilde{f} \circ R\) is continuous as the composition of continuous maps.

  2. Analyticity: The map \(F = f \circ R: P[X]_N \to \IC\) is continuous everywhere and analytic outside the discriminant locus \(\Delta = \{p \in P[X]_N : \text{Disc}(p) = 0\}\). Since \(f\) is analytic and \(R\) is analytic outside \(\Delta\), the composition \(F\) is analytic outside \(\Delta\). Moreover, \(F\) extends continuously across \(\Delta\), so it is locally bounded around each point of \(\Delta\). By Riemann's removable singularity theorem, \(F\) is analytic everywhere.

  3. Polynomial case: If \(f\) is a polynomial of degree \(d\), then there exist constants \(M, K > 0\) such that \(|f(r)| \leq M (1 + \|r\|_2)^K\) for all \(r \in \IC[N]\). For any polynomial \(p = X^N + p_{N-1}X^{N-1} + \cdots + p_0 \in P[X]_N\) with roots \(r_1, \ldots, r_N\), Cauchy's bound gives: \(|r_i| \leq 1 + \max_{0 \leq j \leq N-1} |p_j| \leq 1 + \sqrt{N} \|p\|_2\) where \(\|p\|_2 = \sqrt{|p_0|^2 + \cdots + |p_{N-1}|^2}\). Therefore \(\|R(p)\|_2 = \sqrt{\sum_{i=1}^N |r_i|^2} \leq \sqrt{N}(1 + \sqrt{N}\|p\|_2) \leq C(1 + \|p\|_2)\) for some constant \(C\). Hence \(|F(p)| = |f(R(p))| \leq M(1 + C(1 + \|p\|_2))^K \leq M'(1 + \|p\|_2)^K\) for some \(M'\). Since \(F\) is analytic and polynomially bounded, it must be a polynomial.

  4. Rational case: If \(f = P/Q\) where \(P, Q\) are polynomials and \(Q\) is \(S_N\)-invariant with \(Q \not\equiv 0\), then by parts (3), there exist polynomials \(P', Q'\) such that \(P' = P \circ R\) and \(Q' = Q \circ R\). Since \(Q\) is \(S_N\)-invariant and non-zero, \(Q'\) is non-zero (as \(R\) is surjective). Therefore \(F = P'/Q'\) is a rational function.

Corollary (Symmetric Function Theorem)

Every \(S_N\) invariant polynomial \(f\) in variables \(x_1,\dots,x_n\) can be expressed as a polynomial in the elementary symmetric functions: \(s_k = \prod_{i_1 < \dots < i_k} x_{i_1} \dots x_{i_k}\)

\[ s_0 = x_0 + \dots + x_n, \dots, s_n = s_0 \dots x_n. \]
Proof