Final Assessment - MATH0058
All questions must be attempted.
You must submit your solution as a single pdf file. The following options are allowed:
• Write all formulas and code together into one Jupyter Notebook and export to pdf. This may
not work on every platform where Jupyter is installed, so you should test it.
• Separately write the theory part into a Word or similar document, and add the code and
output as screenshots to it. Then convert to pdf. This is slightly more complex but should
work on most setups.
Question 1
In this question we want to explore the effects of floating point accuracy and the sensitivity of
problems with respect to small perturbations.
Q1a State the definitions of backward error, forward error, and condition number. How are these
quantities connected? Which of these are due to the sensitivity of a problem and which are due to
a specific algorithm?
Q1b We consider the root finding problem P(a, z) = 0 with P(a, z) = z^2 a_2 + z a_1 + a_0. Here,
a = [a_0 a_1 a_2]^T is the vector of coefficients of the polynomial. Let z be a solution of P(a, z) = 0
for given coefficients a. Show that if we perturb a single coefficient a_j by an amount ∆a_j, then to
first order the zero z is perturbed by ∆z = −z^j ∆a_j / P′(z). As a slightly more advanced question, try to
derive this perturbation result for polynomials of arbitrary degree and not just degree 2. Based
on the perturbation result, what is the resulting condition number of a zero of P with respect to
a perturbation in the coefficient aj? (Hint: You need not give a formal proof. Consider that the
condition number is the ratio of relative forward and relative backward error.) What happens
with the condition number if the polynomial has two zeros that are close together? Comment on
the ability to accurately compute the roots of polynomials with nearby roots.
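As a sketch of the intended first-order argument (an outline only, assuming z is a simple zero so that P′(z) ≠ 0):

```latex
0 = P(a + \Delta a_j e_j,\; z + \Delta z)
\approx \underbrace{P(a, z)}_{=\,0} + z^{j}\,\Delta a_j + P'(z)\,\Delta z
\quad\Longrightarrow\quad
\Delta z = -\frac{z^{j}\,\Delta a_j}{P'(z)}.
```

Dividing the relative forward error |∆z|/|z| by the relative backward error |∆a_j|/|a_j| then gives the condition number asked for above.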
Q1c We now want to visualize the effect of small perturbations in the coefficients. Implement the
following function.
[1]: def perturbation_roots(coeffs, eps=1E-10, nrepeats=10):
         """This function takes a polynomial z**n * coeffs[0] + ... + z**1 * coeffs[n-1] + coeffs[n]
         and computes the roots of random perturbations of this polynomial using
         the np.roots function.

         The perturbation is a vector of normally distributed random numbers with
         2-norm eps. The function does nrepeats repetitions and at the end
         returns a single one-dimensional Numpy array of type complex128
         that contains all the nrepeats * n roots of
         all perturbed polynomials.
         """
         # Delete the following line to implement your own code.
         raise NotImplementedError
Once you have implemented your function you should test it with the famous Wilkinson
polynomial. It is defined as P(z) = ∏_{j=1}^{20} (z − j). Hence, its zeros are the integers
from 1 to 20. Use the perturbation_roots function to generate a single plot that shows
all the roots of this polynomial and its perturbations for the value eps=1E-10. For this
you will need the coefficients of the Wilkinson polynomial. You can find them, for example, in a blog post on the
Wilkinson polynomial by Cleve Moler (the inventor of Matlab). We also give them below. The
zeros should be plotted as blue crosses. Also plot the exact zeros of the Wilkinson polynomial as
red circles.
Numerically evaluate the condition number of the root z = 15 with respect to perturbations in each
coefficient a_j separately and print the results. Interpret what you see in the graph with respect
to the computed condition numbers. To evaluate the derivative you can use the Numpy function
numpy.polyder.
[150]: # The following code gives the coefficients of the leading order monomial first,
       # i.e. P(z) = z^{20} ...
       wilkinson_coeffs = [
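For reference, the perturbation step described in Q1c might be sketched roughly as follows. This is an illustrative sketch following the skeleton's signature, not a model solution:

```python
import numpy as np

def perturbation_roots(coeffs, eps=1e-10, nrepeats=10):
    """Roots of nrepeats random perturbations of the polynomial coeffs
    (leading coefficient first, as expected by np.roots)."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    all_roots = []
    for _ in range(nrepeats):
        # Normally distributed perturbation, rescaled to have 2-norm eps.
        delta = np.random.randn(len(coeffs))
        delta *= eps / np.linalg.norm(delta)
        all_roots.append(np.roots(coeffs + delta))
    return np.concatenate(all_roots).astype(np.complex128)
```

The returned array can then be scattered in the complex plane (real parts against imaginary parts) together with the exact zeros.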
Question 2
Let A ∈ Rn×n be symmetric and positive definite, that is A = AT and all eigenvalues of A are
positive. In this question we want to investigate a special variant of the LU Decomposition, called
the Cholesky Decomposition, which exists for symmetric positive definite matrices.
The Cholesky Decomposition is defined as

A = C^T C

for C a nonsingular upper triangular matrix with positive entries on its diagonal.
Q2a Show that the Cholesky Decomposition exists if and only if A is positive definite. In order
to show that the Cholesky Decomposition exists if A is positive definite proceed by induction.
Assume that the Cholesky Decomposition exists for matrices of dimension n − 1 and show that
you can construct a Cholesky Decomposition for a matrix of dimension n of the form

[ Â    b ]   [ Ĉ^T  0 ] [ Ĉ  c ]
[ b^T  α ] = [ c^T  δ ] [ 0  δ ]

with δ > 0, where Â = Ĉ^T Ĉ is a Cholesky Decomposition of dimension n − 1. Be careful to also
mention the induction start. In order to show that δ > 0 you may find the following formula for
the determinant of a block matrix helpful:

det [ A  B ]
    [ C  D ] = det A · det(D − C A^{−1} B).
Q2b Use the Singular Value Decomposition to show that ‖A‖_2 = ‖C‖_2^2.
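One possible outline (a sketch only, using A = C^T C and the SVD of the square matrix C):

```latex
C = U \Sigma V^T
\;\Longrightarrow\;
A = C^T C = V \Sigma^2 V^T,
\qquad
\|A\|_2 = \lambda_{\max}(A) = \sigma_{\max}(C)^2 = \|C\|_2^2.
```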
Q2c From the lectures you know that the LU Decomposition of a matrix can have a large backward
error if the growth factor ρ is large. Based on this, give a heuristic argument for why you expect the
Cholesky Decomposition to be backward stable.
Q2d The decomposition in a) suggests an algorithm for computing the Cholesky Decomposition
by starting at the upper left element a11 of A and successively computing the Cholesky
Decomposition for larger and larger submatrices until you have computed the Cholesky
Decomposition of the whole matrix.
Implement a function def cholesky(A) that for a given positive definite matrix A implements
this algorithm. The skeleton code below will help you with the implementation and testing. The
printed relative residual should be a small multiple of machine precision.
[ ]: import numpy

     def cholesky(A):
         """Compute the Cholesky decomposition of A.

         Return the upper triangular matrix C of
         the Cholesky decomposition A = C^T C
         of a symmetric positive definite matrix A.
         """
         # Delete the following line for your implementation.
         raise NotImplementedError

     n = 50
     B = numpy.random.randn(n, n)
     A = B.T @ B  # Computes a positive definite random matrix.
     C = cholesky(A)
     relative_residual = numpy.linalg.norm(C.T @ C - A) / numpy.linalg.norm(A)
     print(f"The relative residual is: {relative_residual}")
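The bordered construction from Q2a translates fairly directly into code. The following is a minimal sketch under the hypothetical name cholesky_sketch (not the required solution): at step k, the border column c solves Ĉ^T c = b and δ = sqrt(α − c^T c).

```python
import numpy as np

def cholesky_sketch(A):
    """Upper triangular C with A = C^T C, built column by column
    following the bordered construction of Q2a."""
    n = A.shape[0]
    C = np.zeros_like(A, dtype=np.float64)
    for k in range(n):
        if k > 0:
            # Border column: solve C[:k,:k]^T c = A[:k,k].
            c = np.linalg.solve(C[:k, :k].T, A[:k, k])
        else:
            c = np.array([])
        C[:k, k] = c
        # Diagonal entry: delta = sqrt(alpha - c^T c), positive for SPD A.
        C[k, k] = np.sqrt(A[k, k] - c @ c)
    return C
```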
Question 3
Consider the following algorithm:
1.) Start with a random x_0 ∈ R^n with ‖x_0‖_2 = 1.
2.) Until convergence do the following steps:
• Compute y = A x_j
• Compute z = A^T y
• Compute γ_j = x_j^T z
• Compute x_{j+1} = z/‖z‖_2.
Q3a Use the Singular Value Decomposition of A to investigate the convergence of this algorithm.
In particular, show under which conditions the algorithm converges and what x_j and γ_j converge
to.
Q3b Implement the algorithm. Use the following skeleton for your implementation.
[ ]:
[ ]: def fun(A, nsteps=100):
         """Perform nsteps steps of the algorithm.

         Return a vector containing the values
         \gamma_1, \gamma_2, \dots
         """
         # Delete the following line for your implementation.
         raise NotImplementedError
Now plot the convergence for the algorithm in the following two cases:
• A random 50 × 50 matrix whose entries are uniformly distributed random numbers.
• A random 50 × 50 matrix whose entries are normally distributed random numbers.
For the convergence plot the values |γ_j − γ|/|γ| using a semilogarithmic plot. Here γ is the
theoretical limit of the iteration as j → ∞. If you don't know what this is you can use instead the last
value γ_end that was returned in the output vector from fun.
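The loop in Q3 is short enough to sketch directly. The following illustrative implementation (under the hypothetical name fun_sketch, matching the skeleton's signature) makes the structure explicit; it is power iteration applied to A^T A:

```python
import numpy as np

def fun_sketch(A, nsteps=100):
    """Perform nsteps steps of the Q3 iteration and return the gamma values."""
    n = A.shape[1]
    x = np.random.randn(n)
    x /= np.linalg.norm(x)           # random unit-norm start x_0
    gammas = np.empty(nsteps)
    for j in range(nsteps):
        y = A @ x
        z = A.T @ y                  # z = A^T A x_j
        gammas[j] = x @ z            # gamma_j = x_j^T A^T A x_j
        x = z / np.linalg.norm(z)    # x_{j+1}
    return gammas
```

Generically (when the largest singular value is simple and x_0 is not orthogonal to the corresponding singular vector) γ_j approaches σ_max(A)^2.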
Q3c Explain the observed convergence behaviour as accurately as possible.
Question 4
In this question we want to investigate quadratic eigenvalue problems of the form
T(λ)x = 0
with T(λ) being a complex matrix-valued function T : C → C^{n×n} defined by

T(λ) = A_2 λ^2 + A_1 λ + A_0,

where A_0, A_1, A_2 are complex n × n matrices.
We say that λ_j is an eigenvalue of T if there exists an associated vector x_j ≠ 0 such that T(λ_j)x_j = 0.
Q4a Show that the quadratic eigenvalue problem T(λ)x = 0 is equivalent to the following
generalised linear eigenvalue problem of dimension 2n:

( [  0     I  ]     [ I   0  ] ) [  x ]
( [ −A0  −A1 ] − λ  [ 0   A2 ] ) [ λx ] = 0.
(Note that a generalized linear eigenvalue problem is an eigenvalue problem of the form Ax =
λBx. Hence, a standard eigenvalue problem is obtained for B being the identity matrix. The Scipy
eig and eigvals functions can solve generalized linear eigenvalue problems.)
Q4b In all the following parts we investigate a specific quadratic eigenvalue problem defined by
the matrices in the following Python code.
[131]: A0 = numpy.array(
           [[121, 18.9, 15.9],
            [0, 2.7, 0.145],
            [11.9, 3.64, 15.5]])

       A1 = numpy.array(
           [[7.66, 2.45, 2.1],
            [0.23, 1.04, 0.223],
            [0.6, 0.756, 0.658]])

       A2 = numpy.array(
           [[17.6, 1.28, 2.89],
            [1.28, 0.824, 0.413],
            [2.89, 0.413, 0.725]])
Numerically compute the eigenvalues of this quadratic eigenvalue problem and plot them in the complex plane.
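The linearization from Q4a can be assembled and solved with Scipy's generalized eigenvalue solver. A minimal sketch (the helper name quadratic_eigenvalues is our own):

```python
import numpy as np
from scipy.linalg import eigvals

def quadratic_eigenvalues(A0, A1, A2):
    """Eigenvalues of A2*lam^2 + A1*lam + A0 via the companion
    linearization from Q4a."""
    n = A0.shape[0]
    I = np.eye(n)
    Z = np.zeros((n, n))
    A = np.block([[Z, I], [-A0, -A1]])   # left-hand side matrix
    B = np.block([[I, Z], [Z, A2]])      # right-hand side matrix
    return eigvals(A, B)                 # generalized eigenvalues
```

The returned eigenvalues can then be scattered with their real parts on the x-axis and imaginary parts on the y-axis.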
Q4c One way to interpret the numerical stability of computed eigenvalues for quadratic eigenvalue
problems is to consider the resolvent function f : z → T(z)^{−1} in the complex plane. In order to
visualize this function evaluate ‖f(z)‖_2 for complex z = x + iy with x ∈ [−2, 2] and
y ∈ [−10, 10]. Use a plot grid of 100 points in the x-direction and 200 points in the y-direction.
Then plot log10 ‖f(z)‖_2 (the log allows you to distinguish magnitudes in the values). You can
create the plot grid with numpy.meshgrid, and plot the result with matplotlib.pyplot.imshow. Also
display a colorbar (matplotlib.pyplot.colorbar). Explain heuristically why your numerical
results suggest that the eigenvalues are stable with respect to perturbations.
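Evaluating the resolvent on the grid might be sketched as below (the function name resolvent_norms is our own; the plotting call is left as a comment since it needs a display):

```python
import numpy as np

def resolvent_norms(A0, A1, A2, nx=100, ny=200):
    """Evaluate ||T(z)^{-1}||_2 on the grid described in Q4c."""
    x = np.linspace(-2, 2, nx)
    y = np.linspace(-10, 10, ny)
    X, Y = np.meshgrid(x, y)
    Z = X + 1j * Y
    vals = np.empty_like(X)
    for i in np.ndindex(Z.shape):
        T = A2 * Z[i]**2 + A1 * Z[i] + A0
        vals[i] = np.linalg.norm(np.linalg.inv(T), 2)  # spectral norm
    return vals

# Plot, e.g. with:
# plt.imshow(np.log10(vals), extent=[-2, 2, -10, 10], origin="lower")
# plt.colorbar()
```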
Q4d Finally, we want to investigate a simple iteration to solve the quadratic eigenvalue problem
that can also be generalised to other types of matrix functions T(λ) beyond the quadratic case.
The following method goes back to Ruhe.
Instead of the problem T(λ)x = 0 we perform a Taylor expansion around some approximate
eigenvalue λ^(j) and solve the resulting linear problem

T(λ)x ≈ ( T(λ^(j)) − η T′(λ^(j)) ) x = 0

with λ = λ^(j) − η. In order to compute η we find the smallest magnitude eigenvalue η of the
generalized eigenvalue problem

T(λ^(j)) x = η T′(λ^(j)) x

and update λ^(j+1) = λ^(j) − η.
Implement this iteration in Python using the following skeleton.
[155]: def ruhe_method(A0, A1, A2, lam0, nsteps=100):
           """Implement the Ruhe method to find a single eigenvalue of a
           quadratic eigenvalue problem.

           A0, A1, A2 are the coefficient matrices for the quadratic eigenvalue
           problem. The value lam0 is the initial eigenvalue approximation and
           nsteps is the number of iteration steps.

           The method returns a vector containing the eigenvalue approximation
           \lambda^{(j)} in each iteration step.
           """
           # Delete the following line for your implementation.
           raise NotImplementedError
Now pick an exact eigenvalue λ of the given quadratic eigenvalue problem and as starting value
for the iteration choose λ^(0) = λ + 0.1. Run Ruhe's method and make a semilog plot of the
convergence of the relative error |λ^(j) − λ|/|λ|. By inspecting the values of the relative errors make a
conjecture about the rate of convergence of Ruhe's method.
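The update rule described in Q4d maps onto code almost line by line. The following sketch (under the hypothetical name ruhe_sketch; not a model solution) assumes T′(λ) is nonsingular near the eigenvalue so that the generalized eigenvalue problem is well posed:

```python
import numpy as np
from scipy.linalg import eigvals

def ruhe_sketch(A0, A1, A2, lam0, nsteps=20):
    """Ruhe's iteration for the quadratic eigenvalue problem of Q4."""
    lam = lam0
    history = []
    for _ in range(nsteps):
        T = A2 * lam**2 + A1 * lam + A0   # T(lam)
        Tprime = 2 * A2 * lam + A1        # T'(lam)
        etas = eigvals(T, Tprime)
        eta = etas[np.argmin(np.abs(etas))]  # smallest magnitude eigenvalue
        lam = lam - eta                      # lam^{(j+1)} = lam^{(j)} - eta
        history.append(lam)
    return np.array(history)
```

In the scalar case T(λ) = λ² − 4 this reduces to Newton's method on λ² − 4, which hints at the convergence rate the question asks you to conjecture.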