python代写-COMP0078

COMP0078: Supervised Learning (20/21)
2. Kernels and Regularisation
Exercise Solutions
Exercise 1
(slide 6) - More generally 〈x, y〉M := x>My where x, y ∈ Rn and M is an n×n
symmetric positive definite matrix. - check that this is an inner product.
Recall that an inner product has the following properties:
1. 〈x, x〉 = 0⇔ x = 0 and 〈x, x〉 ≥ 0
2. 〈x, y〉 = 〈y, x〉
3. 〈γx, y〉 = γ〈x, y〉 and 〈x+ y, z〉 = 〈x, z〉+ 〈y, z〉
Observe that since M is PD then M = AA> for some n × n matrix A with
Ax = 0 iff x = 0. We may then define xˆ := A>x ∈ Rn and yˆ := A>y ∈ Rn.
This gives the following
〈x, y〉M = x>My = x>AA>y = 〈A>x,A>y〉 = 〈xˆ, yˆ〉
which of course satisfies the above properties.
We can also show that if M is not positive definite, then property 1 is violated
since 〈x, x〉M = x>Mx ≤ 0 for some x 6= 0.
1
Exercise 2
(slide 7) - Here the space is the set of all functions from [0,∞) → R with the
following three properties:
1. f(0) = 0
2. f is absolutely continuous (hence f(b)− f(a) = ∫ b
a
f ′(x)dx)
3.
∫∞
0
[f ′(x)]2dx <∞
Addition ‘+’ is the usual addition of functions similarly multiplication by a
scalar. The dot product is defined as
〈f, g〉 :=
∫ ∞
0
f ′(x)g′(x)dx
Argue that this is an inner product space.
Denote this space by F . Symmetry and linearity of 〈f, g〉 hold by definition of
the integral.
We now prove the positive definite property of 〈f, f〉 for f ∈ F . Clearly 〈f, f〉 =∫∞
0
[f ′(x)]2dx ≥ 0 since we are integrating over a squared function.
Furthermore if 〈f, f〉 = ∫∞
0
[f ′(x)]2dx = 0 then this, along with property 2
implies that f ′(x) is zero everywhere (rather than almost everywhere). This
then corresponds to f(x) = c for some value c, except f ∈ F iff c = 0 due to
propery 1 above. Therefore f must be 0 everywhere.
2
Exercise 3
(slide 9) - “Some results on convexity”
Recall a function f : X → R is convex iff ∀p, q ∈ X and α ∈ (0, 1) we have
f(αp+ (1− α)q) ≤ αf(p) + (1− α)f(q) (1)
1. If f and g are convex then f + g is convex.
Proof. Let h(x) := f(x) + g(x). Then
h(αp+ (1− α)q) = f(αp+ (1− α)q) + g(αp+ (1− α)q)
≤ αf(p) + (1− α)f(q) + αg(p) + (1− α)g(q)
= α(f(p) + g(p)) + (1− α)(f(q) + g(q))
= αh(p) + (1− α)h(q) ,
where the inequality arises from the convexity of f and g.
2. If f is convex and g is affine (linear + a constant) then f(g(·)) is convex.
Proof. Since g(x) is affine then g(αp+ (1− α)q) = αg(p) + (1− α)g(q) (Equa-
tion (1) holds with equality for affine functions). Let h(x) := f(g(x)). Then
h(αp+ (1− α)q) = f(g(αp+ (1− α)q))
= f(αg(p) + (1− α)g(q))
≤ αf(g(p)) + (1− α)f(g(q))
= αh(p) + (1− α)h(q) ,
where the inequality uses the convexity of f .
3
Exercise 4
(Problem set 1 - Question 3) - Given a kernel K : R2×R2 → R where K(x, t) :=
(1 + 〈x, t〉)2. Find a feature map φ : R2 → R6 which corresponds to the kernel.
Expanding and ‘matching’ terms gives
(1 + 〈x, t〉)2 = 1 + 2〈x, t〉+ 〈x, t〉2
= 1 + 2(x1t1 + x2t2) + (x1t1 + x2t2)(x1t1 + x2t2)
= 1 + (x1t1)
2 + (x2t2)
2 + 2x1t1x2t2 + 2x1t1 + 2x2t2
=

1
x21
x22√
2x1x2√
2x1√
2x2

>
1
t21
t22√
2t1t2√
2t1√
2t2

= 〈φ(x), φ(t)〉 .
Thus
φ(x) =

1
x21
x22√
2x1x2√
2x1√
2x2
 .
4
Exercise 5
(Problem set 1 - Question 5) - Consider a Gaussian kernel function K : R2 ×
R2 → R defined by K(x, z) := e−‖x−z‖2 , does there exist a finite-dimensional
feature map representation? I.e., does there exist a φ : Rn → Rd such that
K(x, z) = 〈φ(x), φ(z)〉? Indicate an answer “yes” or “no” and provide an
No.
5
Exercise 6
(Problem set 2 - Question 2) - Kernels between sets. Let X be a finite set define
K : 2X × 2X → R as
K(A,B) := 2|A∩B|
where A,B ⊆ X. Prove that K is a kernel.
Recall that 2X is the power set of X (i.e., the set of all subsets of X, including
the empty set), and that |2X | = 2|X|.
Proof. Let [pred] = 1 if pred is true and 0 otherwise. Then for i ∈ 2X define
φ(A)i = [i ⊆ A]
(note that i here is used to index φ and also represent elements of 2X) we then
have a binary feature vector φ ∈ {0, 1}2|X| . Finally then
K(A,B) = 〈φ(A), φ(B)〉 =

i∈2X
φ(A)iφ(B)i =

i∈2X
[i ⊆ A][i ⊆ B] = 2|A∩B|
as required.
6
Exercise 7
(Problem set 2 - Question 3.1) - Argue that min(x, t) (where x, t ∈ [0,∞)) is a
kernel...
Recall that the inner product space from Exercise 2 (which we will denote F)
has the following inner product for f, g ∈ F :
〈f, g〉 :=
∫ ∞
0
f ′(x)g′(x)dx . (2)
Define the following feature map φ : [0,∞)→ F s.t
φ(x) = f(p) :=
{
p, p ≤ x
x otherwise .
(3)
Then f ′(p) are step-functions:
f ′(p) =
{
1, p ≤ x
0 otherwise ,
(4)
with the following property∫ ∞
0
f ′(p)dp =
∫ x
0
(1)dp+
∫ ∞
x
(0)dp = x
(see Figure 1).
Therefore by letting φ(x) = f and φ(t) = g as defined in (3) we have the
following by combining (2) and (4)
〈φ(x), φ(y)〉 =
∫ ∞
0
f ′(p)g′(p)dp =
∫ min(x,t)
0
(1)dp+
∫ ∞
min(x,t)
(0)dp = min(x, t)
and min(x, t) is therefore a kernel.
7
Figure 1: Problem set 2 - Question 3.1
8  