CMDA3606 - Spring 2023 Serkan Gugercin
Term Project
Learning Dynamical Systems from Data using SVD
Due Date for Part I : April 7, 2023, 11:59pm
Abstract
This semester we have already used (and will continue to use) the SVD for solving fundamental problems in data science such as low-rank approximation and dimensionality reduction, least-squares problems, and regularizing ill-conditioned inverse problems. In this project, we will use it to learn dynamical systems from data. The project will combine your knowledge of discrete-time linear dynamical systems from CMDA 3605 with the numerical linear algebra tools from CMDA 3606. The project will have two parts. In Part I, you will work through some details of the Frobenius norm and then, using this norm, we will set up a minimization problem to learn an underlying dynamical system. Part II will work on extensions of these learning approaches.
Instructions on the structure of the term project and uploading requirements
• The discussion below explains the required concepts in detail. Throughout the text, several problems are listed. You need to read the whole document before trying to answer the problems. If you simply jump straight to the problems, you will not be able to answer them.
• You are required to type your term project using your favorite editing software. Hand-written solutions are not acceptable. The proper typesetting of the equations and proper presentation of the solutions (in addition to their correctness) will be part of the grading.
• Students must submit their solutions electronically as a pdf upload to the course Canvas site.
• The entire assignment must be submitted as ONE pdf file.
• The MATLAB scripts need to be attached to the pdf file. Include the code after the corresponding question. Also, the plots should be included in the pdf file, once again in the answer to the corresponding problem, not at the end. Do not simply attach a screenshot of the scripts. We will run these scripts if needed. We should be able to copy-paste the code to run it.
• For those who are using LaTeX: The mcode package is great for including MATLAB code in LaTeX.
The graphicx package can be used to include images in LaTeX.
• Submitted documents must answer questions in order. If you do not answer a question or part of a
question, please indicate this on your assignment.
• Two students can submit one joint term project.
1 PART 1: Frobenius Norm and Learning Dynamics via Dynamic Mode Decomposition
In class, we studied SVD and its power in producing optimal low-rank approximations. Given a matrix A
(representing an image, or a database of images), we used its SVD to compute its low-rank approximation.
Here, we will use SVD to learn a discrete-time dynamical system from its time-domain data. Thus, we will
not be given the matrix A; rather we will be given time-domain simulation data from a dynamical system
and use SVD to construct the underlying transition matrix A.
1.1 Data-Driven Learning of Dynamical Systems
Consider the following discrete-time linear dynamical system:
x(k + 1) = Ax(k), x(0) = x0, for k = 0, 1, 2, . . . , (1.1)
where x(k), x0 ∈ Rn and A ∈ Rn×n. Unlike the analysis in CMDA 3605, in this term project we will not have access to A; i.e., we do not know the underlying transition matrix. However, we have access to measurements of the state x(k), i.e., we have the measurements/observations x(0), x(1), x(2), x(3), · · · , x(N), where N is the final time. This is a common occurrence in practice. Many physical systems and structures are instrumented with a large number of sensors, measuring vibrations, temperature, voltages/currents, rotational speeds, etc. Therefore, we have access to an abundant amount of dynamical system data. Thus, as opposed to knowing the matrix A in (1.1), we have the data

{x(0), x(1), x(2), · · · , x(N)} ⇔ X = [x(0) x(1) x(2) · · · x(N)] ∈ Rn×(N+1). (1.2)
Now, given this data (1.2) (called measurements or state history), our goal is to find the state-transition matrix A that has produced this data. In other words, our goal is to find the underlying model (the state-transition matrix A) that produced this data.
1.2 Setting up the optimization problem to learn A
Let us formulate this data-driven learning problem formally. If we were to find the transition matrix A exactly, it should satisfy:

x(1) = Ax(0), x(2) = Ax(1), x(3) = Ax(2), · · · , x(N) = Ax(N − 1). (1.3)
The N equations in (1.3) can be written as a single matrix equation:

[x(1) x(2) x(3) . . . x(N)] = A [x(0) x(1) x(2) . . . x(N − 1)] ⇐⇒ Xs = AYs, (1.4)

where Xs := [x(1) x(2) x(3) . . . x(N)] ∈ Rn×N and Ys := [x(0) x(1) x(2) . . . x(N − 1)] ∈ Rn×N.
Therefore, the problem becomes: Given the data (1.2), find A such that Xs = AYs. If Ys were an invertible matrix, the solution would be clear, namely A = XsYs⁻¹. However, this does not apply here since Ys is not even a square matrix. Moreover, we only have measurement data, and the underlying system might not even be a linear dynamical system. Therefore, we usually cannot find A such that Xs is exactly AYs. The next best thing is to find the best A that minimizes the error. Then, the problem is:

Given the data (1.2), find A such that ‖Xs − AYs‖ is minimized. (1.5)
First, we need to choose an appropriate norm to solve this problem; i.e., how to measure ‖Xs−AYs‖. We
will use the Frobenius norm of a matrix. (Do not worry, the solution of the optimization problem will still
be determined by the SVD, our best friend!).
The Frobenius norm of a matrix M ∈ Rn×m, denoted by ‖M‖F, is defined as

‖M‖F = √( Σ_{i=1..n} Σ_{j=1..m} |mij|² ), where mij = M(i, j). (1.6)
In other words, we take the absolute value squared of every element of the matrix, add them up, and then take the square root.
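As a quick illustration (the matrix M below is an arbitrary placeholder), the element-wise computation in (1.6) is a single line in MATLAB:

    % Element-wise Frobenius norm, exactly as in (1.6).
    % M is just a placeholder matrix.
    M = [1 2; 3 4];
    fro = sqrt(sum(abs(M(:)).^2))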
Problem 1 Using (1.6) (and paper-and-pencil only), compute the Frobenius norm of

N = [ 1 1 3 ; 3 2 −1 ].
Next, we will discuss two more ways to compute the Frobenius norm. First, let's define the trace of a square matrix B ∈ Rn×n. The trace of a square matrix is the sum of its diagonal elements:

trace(B) = b11 + b22 + · · · + bnn = Σ_{i=1..n} bii, where bii = B(i, i). (1.7)
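For reference, a minimal MATLAB sketch of (1.7), with B an arbitrary placeholder square matrix:

    % Trace as in (1.7): sum of the diagonal entries.
    B = magic(4);           % any square matrix; magic(4) is a placeholder
    t = sum(diag(B))        % same value as the built-in trace(B)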
Problem 2 Let M ∈ Rn×m. Show that the Frobenius norm defined in (1.6) is also equivalent to

‖M‖F = √( trace(MMᵀ) ). (1.8)

In other words, one forms the matrix MMᵀ, which is a square matrix, computes its trace, and then takes its square root. It is enough if you prove this result for the simple case

M = [ m11 m12 m13 ; m21 m22 m23 ],

i.e., for n = 2 and m = 3.
One can also similarly show (you do NOT have to) that ‖M‖F = √( trace(MᵀM) ); meaning that we can also work with MᵀM as opposed to MMᵀ.
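If you want to sanity-check these identities numerically (on a random placeholder matrix, not on the matrix N from Problem 1), a minimal sketch:

    % Numerical sanity check of (1.6) vs. (1.8) on a placeholder matrix.
    M  = randn(2,3);
    f1 = sqrt(sum(abs(M(:)).^2));   % element-wise definition (1.6)
    f2 = sqrt(trace(M*M'));         % trace formula (1.8)
    f3 = sqrt(trace(M'*M));         % the transposed variant
    disp([f1 f2 f3])                % all three agree to machine precision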
Problem 3 For the matrix N in Problem 1, now use the formula (1.8) and numerically verify that you obtain the same result.
So far, SVD has not entered the picture. Well, it is time to do it then.
Problem 4 Let M ∈ Rn×m. For simplicity, assume that n ≤ m (this is not necessary; we can do it for n > m as well, but you do not have to). Show that the Frobenius norm formula in (1.8) is equivalent to

‖M‖F = √( σ1² + σ2² + · · · + σn² ) = √( Σ_{i=1..n} σi² ), (1.9)

where σi is the i-th singular value of the matrix M.

Hint: Let M = USVᵀ be the SVD of M and plug this into (1.8). You may also use the fact that trace(BC) = trace(CB), where B and C are two matrices of appropriate sizes.
Problem 5 For the matrix N in Problem 1, now use the formula (1.9) and numerically verify that you obtain the same result as before. You should use the svd command in MATLAB to compute the singular values.
Now that you have learned the theoretical formulas behind the Frobenius norm computation, from now on you can simply use the command norm(M,'fro') to compute the Frobenius norm of a matrix.
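As a quick consistency check (on a random placeholder matrix), the built-in command and the singular-value formula (1.9) agree to machine precision:

    % All three Frobenius-norm routes on one placeholder matrix.
    M     = randn(4,6);
    s     = svd(M);                 % singular values, as in (1.9)
    f_svd = sqrt(sum(s.^2));
    f_fro = norm(M,'fro');          % the built-in command
    disp(abs(f_svd - f_fro))        % ~ machine precision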
1.3 Solving the minimization/learning problem (1.5) in the Frobenius norm via SVD
We will solve the minimization problem (1.5) in the Frobenius norm. In other words, our problem has become the following: Given the state measurements

{x(0), x(1), x(2), x(3), · · · , x(N)}, (1.10)

construct the two data matrices

Xs = [x(1) x(2) x(3) . . . x(N)] and Ys = [x(0) x(1) x(2) . . . x(N − 1)]. (1.11)

Then, find the matrix A ∈ Rn×n that matches the data best in the Frobenius norm, i.e.,

find A such that ‖Xs − AYs‖F is minimized. (1.12)
Problem 6 Let Aopt denote the solution of the optimization problem (1.12). Show that the optimal solution is given by

Aopt = XsYs†, (1.13)

where Ys† denotes the pseudoinverse of Ys.
Once again, as in the optimal low-rank approximation by SVD, the SVD solves this optimization
problem, this time via the pseudoinverse. Therefore, the algorithm for learning the dynamical system from
the measurements is clear:
Algorithm 1.1 Learning the transition matrix Aopt from data

Data: Given are the time-domain data / measurements of the underlying dynamics:

{x(0), x(1), x(2), x(3), · · · , x(N)}, x(k) ∈ Rn for k = 0, 1, . . . , N.

1. Construct the two data matrices:

Xs = [x(1) x(2) x(3) . . . x(N)] and Ys = [x(0) x(1) x(2) . . . x(N − 1)].

2. Then the best-fit dynamical system x(k + 1) = Aopt x(k) to this data is:

Aopt = XsYs†.
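In MATLAB, Algorithm 1.1 takes only a few lines. A minimal sketch, assuming the data matrix (1.2) is stored in a workspace variable named X (the name is an assumption, not part of the algorithm):

    % Minimal sketch of Algorithm 1.1; X is the n-by-(N+1) data matrix (1.2).
    Xs   = X(:, 2:end);       % [x(1) x(2) ... x(N)]
    Ys   = X(:, 1:end-1);     % [x(0) x(1) ... x(N-1)]
    Aopt = Xs * pinv(Ys);     % pseudoinverse solution (1.13)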
Algorithm 1.1 is at the core of the Dynamic Mode Decomposition (DMD) method for fitting a linear dynamical system to a given data set. It has a wide spectrum of applications ranging from flow control to neuroscience and many others; you may check [4, 2, 5, 3] and the references therein for more details. First, we will apply this algorithm to synthetic data.
Problem 7 In this problem, we will apply DMD, i.e., Algorithm 1.1, to a model where we know A explicitly. This is a simple example to make sure that our algorithm works. We will check whether we are able to recover A within machine precision.
We are given a discrete-time dynamical system as in (1.1) with n = 2 and the matrix
A = [ 0.95 0 ; 0 0.5 ]. (1.14)
(a) First, we need to generate the data Xs and Ys in (1.11) that the DMD algorithm requires. Choose the initial condition as x0 = [1 1]ᵀ. Pick N = 10 and simulate (1.1) to construct the data matrices Xs and Ys. (A minimal sketch of this simulation loop appears after this problem.)

(b) Then, apply Algorithm 1.1 to learn the dynamics Aopt. Have you recovered the original dynamics given in (1.14) (within machine precision)? Do not simply write yes or no; measure the distance between the true quantities and your learned quantities.
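A minimal sketch of the data-generation loop in part (a); the variable names are illustrative only:

    % Simulate (1.1) with A from (1.14) and x0 = [1; 1].
    A = [0.95 0; 0 0.5];
    N = 10;
    X = zeros(2, N+1);
    X(:,1) = [1; 1];              % x(0) = x0
    for k = 1:N
        X(:,k+1) = A * X(:,k);    % the recursion (1.1)
    end
    Xs = X(:, 2:end);  Ys = X(:, 1:end-1);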
Now, think about what you achieved here. Using only the data, you were able to extract/learn the underlying transition matrix A. Of course, this was an academic example where we knew the original quantities. In the next example, you will only be provided the data.
Problem 8 We will now apply Algorithm 1.1 to a data set coming from the model of a Los Angeles
University Hospital building. Details of this model can be found in [1]. In this model, we have n = 48, i.e.,
x(k) ∈ R48×1.
(a) Load the data file TrainingData.mat into your MATLAB workspace. This file contains the data matrix X ∈ R48×1001 in (1.2). Therefore, in this case, we have n = 48 and N = 1000. The simulation was run from t = 0 to t = N = 1000, leading to 1001 time samples. A small 1% random noise was added to the data during the simulation. The initial condition x(0) is a vector of ones, i.e., x(0) = [1 1 1 · · · 1]ᵀ ∈ R48. Apply Algorithm 1.1 to this data and compute Aopt ∈ R48×48. The underlying building model is clearly asymptotically stable. Is your learned discrete-time dynamical system x(k + 1) = Aopt x(k) asymptotically stable? (One way to check is sketched after this problem.) To test how accurate your model is, simulate your learned model x(k + 1) = Aopt x(k) for the same initial condition x(0) = [1 1 1 · · · 1]ᵀ. Comparing the full data is difficult since the state has dimension n = 48. So, compare how the first component of x(k), i.e., x1(k), is approximated by your learned model.
(b) Hopefully your approximation was accurate in Part (a), since you compared your solution to the training data you used to learn the model. Now, we will test your model on testing data. Load the data file TestingData.mat into your MATLAB workspace. This file contains the data matrix Xtest ∈ R48×1001 in (1.2), which was obtained by simulating the original building model with the new initial condition x(0) = [1 0 0 · · · 0]ᵀ for the same time interval, from t = 0 to t = N = 1000. Simulate your learned model x(k + 1) = Aopt x(k) for this new initial condition x(0) = [1 0 0 · · · 0]ᵀ and check how the first component of x(k) is approximated in this case. Is the approximation exact? Do NOT simply say yes or no; comment on your results and give reasons/explanations for your observations/results.
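For the stability question in part (a): recall that a discrete-time linear system x(k + 1) = Ax(k) is asymptotically stable if and only if every eigenvalue of A has magnitude strictly less than one. A minimal sketch of the check, assuming your learned matrix is stored as Aopt:

    % Spectral radius test for discrete-time asymptotic stability.
    rho = max(abs(eig(Aopt)));
    fprintf('spectral radius = %.6f\n', rho);   % stable if rho < 1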
References
[1] A. C. Antoulas, D.C. Sorensen, and S. Gugercin, A survey of model reduction methods for large scale
systems, Contemporary Mathematics, AMS Publications, 280, pp. 193–219, 2001.
[2] Z. Drmač, Dynamic Mode Decomposition – A Numerical Linear Algebra Perspective, in The Koopman Operator in Systems and Control, pp. 161–194, Springer, 2020.
[3] M. Krishnan, S. Gugercin, P. A. Tarazaga, A wavelet-based dynamic mode decomposition for modeling
mechanical systems from partial observations, arXiv:2110.12990, 2021.
[4] N. Kutz, S. Brunton, B. Brunton, and J. Proctor, Dynamic Mode Decomposition: Data-Driven Mod-
eling of Complex Systems, SIAM, 2016.
[5] P. J. Schmid, Dynamic mode decomposition of numerical and experimental data, Journal of Fluid Mechanics, vol. 656, pp. 5–28, 2010.