xuebaunion@vip.163.com

3551 Trousdale Rkwy, University Park, Los Angeles, CA

留学生论文指导和课程辅导

无忧GPA：https://www.essaygpa.com

工作时间：全年无休-早上8点到凌晨3点

微信客服：xiaoxionga100

微信客服：ITCS521

python代写-COMP0078

时间：2020-12-16

COMP0078: Supervised Learning (20/21)

2. Kernels and Regularisation

Exercise Solutions

Exercise 1

(slide 6) - More generally 〈x, y〉M := x>My where x, y ∈ Rn and M is an n×n

symmetric positive definite matrix. - check that this is an inner product.

Recall that an inner product has the following properties:

1. 〈x, x〉 = 0⇔ x = 0 and 〈x, x〉 ≥ 0

2. 〈x, y〉 = 〈y, x〉

3. 〈γx, y〉 = γ〈x, y〉 and 〈x+ y, z〉 = 〈x, z〉+ 〈y, z〉

Observe that since M is PD then M = AA> for some n × n matrix A with

Ax = 0 iff x = 0. We may then define xˆ := A>x ∈ Rn and yˆ := A>y ∈ Rn.

This gives the following

〈x, y〉M = x>My = x>AA>y = 〈A>x,A>y〉 = 〈xˆ, yˆ〉

which of course satisfies the above properties.

We can also show that if M is not positive definite, then property 1 is violated

since 〈x, x〉M = x>Mx ≤ 0 for some x 6= 0.

1

Exercise 2

(slide 7) - Here the space is the set of all functions from [0,∞) → R with the

following three properties:

1. f(0) = 0

2. f is absolutely continuous (hence f(b)− f(a) = ∫ b

a

f ′(x)dx)

3.

∫∞

0

[f ′(x)]2dx <∞

Addition ‘+’ is the usual addition of functions similarly multiplication by a

scalar. The dot product is defined as

〈f, g〉 :=

∫ ∞

0

f ′(x)g′(x)dx

Argue that this is an inner product space.

Denote this space by F . Symmetry and linearity of 〈f, g〉 hold by definition of

the integral.

We now prove the positive definite property of 〈f, f〉 for f ∈ F . Clearly 〈f, f〉 =∫∞

0

[f ′(x)]2dx ≥ 0 since we are integrating over a squared function.

Furthermore if 〈f, f〉 = ∫∞

0

[f ′(x)]2dx = 0 then this, along with property 2

implies that f ′(x) is zero everywhere (rather than almost everywhere). This

then corresponds to f(x) = c for some value c, except f ∈ F iff c = 0 due to

propery 1 above. Therefore f must be 0 everywhere.

2

Exercise 3

(slide 9) - “Some results on convexity”

Recall a function f : X → R is convex iff ∀p, q ∈ X and α ∈ (0, 1) we have

f(αp+ (1− α)q) ≤ αf(p) + (1− α)f(q) (1)

1. If f and g are convex then f + g is convex.

Proof. Let h(x) := f(x) + g(x). Then

h(αp+ (1− α)q) = f(αp+ (1− α)q) + g(αp+ (1− α)q)

≤ αf(p) + (1− α)f(q) + αg(p) + (1− α)g(q)

= α(f(p) + g(p)) + (1− α)(f(q) + g(q))

= αh(p) + (1− α)h(q) ,

where the inequality arises from the convexity of f and g.

2. If f is convex and g is affine (linear + a constant) then f(g(·)) is convex.

Proof. Since g(x) is affine then g(αp+ (1− α)q) = αg(p) + (1− α)g(q) (Equa-

tion (1) holds with equality for affine functions). Let h(x) := f(g(x)). Then

h(αp+ (1− α)q) = f(g(αp+ (1− α)q))

= f(αg(p) + (1− α)g(q))

≤ αf(g(p)) + (1− α)f(g(q))

= αh(p) + (1− α)h(q) ,

where the inequality uses the convexity of f .

3

Exercise 4

(Problem set 1 - Question 3) - Given a kernel K : R2×R2 → R where K(x, t) :=

(1 + 〈x, t〉)2. Find a feature map φ : R2 → R6 which corresponds to the kernel.

Expanding and ‘matching’ terms gives

(1 + 〈x, t〉)2 = 1 + 2〈x, t〉+ 〈x, t〉2

= 1 + 2(x1t1 + x2t2) + (x1t1 + x2t2)(x1t1 + x2t2)

= 1 + (x1t1)

2 + (x2t2)

2 + 2x1t1x2t2 + 2x1t1 + 2x2t2

=

1

x21

x22√

2x1x2√

2x1√

2x2

>

1

t21

t22√

2t1t2√

2t1√

2t2

= 〈φ(x), φ(t)〉 .

Thus

φ(x) =

1

x21

x22√

2x1x2√

2x1√

2x2

.

4

Exercise 5

(Problem set 1 - Question 5) - Consider a Gaussian kernel function K : R2 ×

R2 → R defined by K(x, z) := e−‖x−z‖2 , does there exist a finite-dimensional

feature map representation? I.e., does there exist a φ : Rn → Rd such that

K(x, z) = 〈φ(x), φ(z)〉? Indicate an answer “yes” or “no” and provide an

argument supporting your answer.

No.

5

Exercise 6

(Problem set 2 - Question 2) - Kernels between sets. Let X be a finite set define

K : 2X × 2X → R as

K(A,B) := 2|A∩B|

where A,B ⊆ X. Prove that K is a kernel.

Recall that 2X is the power set of X (i.e., the set of all subsets of X, including

the empty set), and that |2X | = 2|X|.

Proof. Let [pred] = 1 if pred is true and 0 otherwise. Then for i ∈ 2X define

φ(A)i = [i ⊆ A]

(note that i here is used to index φ and also represent elements of 2X) we then

have a binary feature vector φ ∈ {0, 1}2|X| . Finally then

K(A,B) = 〈φ(A), φ(B)〉 =

∑

i∈2X

φ(A)iφ(B)i =

∑

i∈2X

[i ⊆ A][i ⊆ B] = 2|A∩B|

as required.

6

Exercise 7

(Problem set 2 - Question 3.1) - Argue that min(x, t) (where x, t ∈ [0,∞)) is a

kernel...

Recall that the inner product space from Exercise 2 (which we will denote F)

has the following inner product for f, g ∈ F :

〈f, g〉 :=

∫ ∞

0

f ′(x)g′(x)dx . (2)

Define the following feature map φ : [0,∞)→ F s.t

φ(x) = f(p) :=

{

p, p ≤ x

x otherwise .

(3)

Then f ′(p) are step-functions:

f ′(p) =

{

1, p ≤ x

0 otherwise ,

(4)

with the following property∫ ∞

0

f ′(p)dp =

∫ x

0

(1)dp+

∫ ∞

x

(0)dp = x

(see Figure 1).

Therefore by letting φ(x) = f and φ(t) = g as defined in (3) we have the

following by combining (2) and (4)

〈φ(x), φ(y)〉 =

∫ ∞

0

f ′(p)g′(p)dp =

∫ min(x,t)

0

(1)dp+

∫ ∞

min(x,t)

(0)dp = min(x, t)

and min(x, t) is therefore a kernel.

7

Figure 1: Problem set 2 - Question 3.1

8

2. Kernels and Regularisation

Exercise Solutions

Exercise 1

(slide 6) - More generally 〈x, y〉M := x>My where x, y ∈ Rn and M is an n×n

symmetric positive definite matrix. - check that this is an inner product.

Recall that an inner product has the following properties:

1. 〈x, x〉 = 0⇔ x = 0 and 〈x, x〉 ≥ 0

2. 〈x, y〉 = 〈y, x〉

3. 〈γx, y〉 = γ〈x, y〉 and 〈x+ y, z〉 = 〈x, z〉+ 〈y, z〉

Observe that since M is PD then M = AA> for some n × n matrix A with

Ax = 0 iff x = 0. We may then define xˆ := A>x ∈ Rn and yˆ := A>y ∈ Rn.

This gives the following

〈x, y〉M = x>My = x>AA>y = 〈A>x,A>y〉 = 〈xˆ, yˆ〉

which of course satisfies the above properties.

We can also show that if M is not positive definite, then property 1 is violated

since 〈x, x〉M = x>Mx ≤ 0 for some x 6= 0.

1

Exercise 2

(slide 7) - Here the space is the set of all functions from [0,∞) → R with the

following three properties:

1. f(0) = 0

2. f is absolutely continuous (hence f(b)− f(a) = ∫ b

a

f ′(x)dx)

3.

∫∞

0

[f ′(x)]2dx <∞

Addition ‘+’ is the usual addition of functions similarly multiplication by a

scalar. The dot product is defined as

〈f, g〉 :=

∫ ∞

0

f ′(x)g′(x)dx

Argue that this is an inner product space.

Denote this space by F . Symmetry and linearity of 〈f, g〉 hold by definition of

the integral.

We now prove the positive definite property of 〈f, f〉 for f ∈ F . Clearly 〈f, f〉 =∫∞

0

[f ′(x)]2dx ≥ 0 since we are integrating over a squared function.

Furthermore if 〈f, f〉 = ∫∞

0

[f ′(x)]2dx = 0 then this, along with property 2

implies that f ′(x) is zero everywhere (rather than almost everywhere). This

then corresponds to f(x) = c for some value c, except f ∈ F iff c = 0 due to

propery 1 above. Therefore f must be 0 everywhere.

2

Exercise 3

(slide 9) - “Some results on convexity”

Recall a function f : X → R is convex iff ∀p, q ∈ X and α ∈ (0, 1) we have

f(αp+ (1− α)q) ≤ αf(p) + (1− α)f(q) (1)

1. If f and g are convex then f + g is convex.

Proof. Let h(x) := f(x) + g(x). Then

h(αp+ (1− α)q) = f(αp+ (1− α)q) + g(αp+ (1− α)q)

≤ αf(p) + (1− α)f(q) + αg(p) + (1− α)g(q)

= α(f(p) + g(p)) + (1− α)(f(q) + g(q))

= αh(p) + (1− α)h(q) ,

where the inequality arises from the convexity of f and g.

2. If f is convex and g is affine (linear + a constant) then f(g(·)) is convex.

Proof. Since g(x) is affine then g(αp+ (1− α)q) = αg(p) + (1− α)g(q) (Equa-

tion (1) holds with equality for affine functions). Let h(x) := f(g(x)). Then

h(αp+ (1− α)q) = f(g(αp+ (1− α)q))

= f(αg(p) + (1− α)g(q))

≤ αf(g(p)) + (1− α)f(g(q))

= αh(p) + (1− α)h(q) ,

where the inequality uses the convexity of f .

3

Exercise 4

(Problem set 1 - Question 3) - Given a kernel K : R2×R2 → R where K(x, t) :=

(1 + 〈x, t〉)2. Find a feature map φ : R2 → R6 which corresponds to the kernel.

Expanding and ‘matching’ terms gives

(1 + 〈x, t〉)2 = 1 + 2〈x, t〉+ 〈x, t〉2

= 1 + 2(x1t1 + x2t2) + (x1t1 + x2t2)(x1t1 + x2t2)

= 1 + (x1t1)

2 + (x2t2)

2 + 2x1t1x2t2 + 2x1t1 + 2x2t2

=

1

x21

x22√

2x1x2√

2x1√

2x2

>

1

t21

t22√

2t1t2√

2t1√

2t2

= 〈φ(x), φ(t)〉 .

Thus

φ(x) =

1

x21

x22√

2x1x2√

2x1√

2x2

.

4

Exercise 5

(Problem set 1 - Question 5) - Consider a Gaussian kernel function K : R2 ×

R2 → R defined by K(x, z) := e−‖x−z‖2 , does there exist a finite-dimensional

feature map representation? I.e., does there exist a φ : Rn → Rd such that

K(x, z) = 〈φ(x), φ(z)〉? Indicate an answer “yes” or “no” and provide an

argument supporting your answer.

No.

5

Exercise 6

(Problem set 2 - Question 2) - Kernels between sets. Let X be a finite set define

K : 2X × 2X → R as

K(A,B) := 2|A∩B|

where A,B ⊆ X. Prove that K is a kernel.

Recall that 2X is the power set of X (i.e., the set of all subsets of X, including

the empty set), and that |2X | = 2|X|.

Proof. Let [pred] = 1 if pred is true and 0 otherwise. Then for i ∈ 2X define

φ(A)i = [i ⊆ A]

(note that i here is used to index φ and also represent elements of 2X) we then

have a binary feature vector φ ∈ {0, 1}2|X| . Finally then

K(A,B) = 〈φ(A), φ(B)〉 =

∑

i∈2X

φ(A)iφ(B)i =

∑

i∈2X

[i ⊆ A][i ⊆ B] = 2|A∩B|

as required.

6

Exercise 7

(Problem set 2 - Question 3.1) - Argue that min(x, t) (where x, t ∈ [0,∞)) is a

kernel...

Recall that the inner product space from Exercise 2 (which we will denote F)

has the following inner product for f, g ∈ F :

〈f, g〉 :=

∫ ∞

0

f ′(x)g′(x)dx . (2)

Define the following feature map φ : [0,∞)→ F s.t

φ(x) = f(p) :=

{

p, p ≤ x

x otherwise .

(3)

Then f ′(p) are step-functions:

f ′(p) =

{

1, p ≤ x

0 otherwise ,

(4)

with the following property∫ ∞

0

f ′(p)dp =

∫ x

0

(1)dp+

∫ ∞

x

(0)dp = x

(see Figure 1).

Therefore by letting φ(x) = f and φ(t) = g as defined in (3) we have the

following by combining (2) and (4)

〈φ(x), φ(y)〉 =

∫ ∞

0

f ′(p)g′(p)dp =

∫ min(x,t)

0

(1)dp+

∫ ∞

min(x,t)

(0)dp = min(x, t)

and min(x, t) is therefore a kernel.

7

Figure 1: Problem set 2 - Question 3.1

8