Coding assignment sample: ECE 515

Posted: 2021-04-15

ECE 515 Image Analysis & Computer Vision II Spring 2021

Mid-term Exam

Instructor: Rashid Ansari

Date: 03/08/2021

1. Image resizing (8 points)

Consider the rectangular image of the green leaf of size 960 × 1280. The blue and red components of the color image are zero. A museum wants you to process it and display it as a 320 × 320 square image with the leaf proportionally wider and red as shown below. Aliasing in the processed image should be minimal.

[Figure: the original 960 × 1280 image and the processed 320 × 320 image]

(a) Briefly explain your procedure.

Solution:

➢ Copy each pixel's green value into its red component, then set the green and blue components to zero.

➢ Decimate the image horizontally by a factor of 3 (960 → 320), low-pass filtering before downsampling.

➢ Decimate the image vertically by a factor of 4 (1280 → 320), low-pass filtering before downsampling.
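The procedure above can be sketched in plain Python on a toy image. This is only an illustration under stated assumptions: the function names are hypothetical, and block averaging stands in for the anti-aliasing filter (it is a crude low-pass filter, not the ideal one).

```python
def green_to_red(image):
    """Move the green component of every (r, g, b) pixel into red."""
    return [[(g, 0, 0) for (_r, g, _b) in row] for row in image]


def decimate_horizontally(image, factor):
    """Average each run of `factor` pixels along a row (crude low-pass + downsample)."""
    out = []
    for row in image:
        new_row = []
        for start in range(0, len(row) - factor + 1, factor):
            block = row[start:start + factor]
            new_row.append(tuple(sum(p[ch] for p in block) / factor for ch in range(3)))
        out.append(new_row)
    return out


def decimate_vertically(image, factor):
    """Average each run of `factor` pixels down a column, via a transpose."""
    transposed = [list(col) for col in zip(*image)]
    reduced = decimate_horizontally(transposed, factor)
    return [list(row) for row in zip(*reduced)]


def process(image):
    """Recolor, then decimate by 3 horizontally and by 4 vertically."""
    red = green_to_red(image)
    return decimate_vertically(decimate_horizontally(red, 3), 4)


# Toy input: 8 rows x 6 columns, green only (stands in for 1280 rows x 960 columns).
toy = [[(0, 12, 0)] * 6 for _ in range(8)]
result = process(toy)
print(len(result), len(result[0]), result[0][0])  # 2 2 (12.0, 0.0, 0.0)
```

The toy image shrinks from 8 × 6 to 2 × 2, mirroring the reduction of 1280 rows and 960 columns to 320 each.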

(b) Specify the ideal frequency and impulse responses needed for the anti-aliasing filter.

Solution:

h_h[m] = (1/3) sinc(m/3): horizontal LPF with cutoff frequency ω_c = π/3

h_v[n] = (1/4) sinc(n/4): vertical LPF with cutoff frequency ω_c = π/4

h[m, n] = h_h[m] h_v[n] = (1/12) sinc(m/3) sinc(n/4)
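To make the separable response concrete, the following sketch evaluates samples of the horizontal and vertical filters and their product. It assumes the normalized convention sinc(x) = sin(πx)/(πx), under which (1/L) sinc(m/L) is the ideal low-pass impulse response with cutoff π/L; the helper names are illustrative.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def ideal_lowpass(index, factor):
    """Sample of the ideal low-pass impulse response with cutoff pi / factor."""
    return sinc(index / factor) / factor


def h2d(m, n):
    """Separable 2D response: horizontal cutoff pi/3, vertical cutoff pi/4."""
    return ideal_lowpass(m, 3) * ideal_lowpass(n, 4)


print(h2d(0, 0))                          # centre tap, equal to 1/12
print(abs(ideal_lowpass(3, 3)) < 1e-12)   # zero crossing at a multiple of 3
print(abs(ideal_lowpass(4, 4)) < 1e-12)   # zero crossing at a multiple of 4
```

The ideal responses extend over all m and n, so any practical K × K filter is a truncated or windowed approximation.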

(c) A K × K separable anti-aliasing filter is used to obtain the final image. Determine the approximate number of multiplications needed for the separable processing to obtain the reduced size image.

Solution: Let M = number of image columns = 960 and N = number of image rows = 1280.

Consider vertical decimation first using the separable filter. Without exploiting symmetry, we need K multiplications per output sample, with MN/4 output samples:

N1 = KMN/4

Next, for horizontal decimation, we need K multiplications per output sample, with MN/12 output samples:

N2 = KMN/12

Total number of multiplications: N1 + N2 = KMN/3 = K × 960 × 1280 / 3 = 409,600 K
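The count above can be checked with a few lines of Python. The sketch also evaluates the opposite processing order and a direct (non-separable) K × K filter for comparison; those two cases and all function names are additions for illustration, not part of the original solution.

```python
def separable_multiplications(columns, rows, taps, vertical_first=True):
    """Approximate multiplies for separable filtering with decimation by 3 (horizontal) and 4 (vertical)."""
    total = columns * rows
    if vertical_first:
        first = taps * total // 4    # K per sample, MN/4 samples after vertical decimation
    else:
        first = taps * total // 3    # K per sample, MN/3 samples after horizontal decimation
    second = taps * total // 12      # K per sample, MN/12 samples in the final image
    return first + second


def direct_multiplications(columns, rows, taps):
    """Non-separable K x K filter evaluated only at the MN/12 retained samples."""
    return taps * taps * columns * rows // 12


K = 7  # hypothetical filter length
print(separable_multiplications(960, 1280, K))                        # vertical first: K*M*N/3
print(separable_multiplications(960, 1280, K, vertical_first=False))  # horizontal first: 5*K*M*N/12
print(direct_multiplications(960, 1280, K))                           # K^2*M*N/12
```

Decimating by the larger factor first leaves fewer samples for the second pass, which is why the vertical-first order gives the smaller count.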

2. Edge detection (6 points)

(a) Describe briefly why smoothing is applied to an image prior to edge detection. Explain any adverse

effect of this smoothing in edge detection.

Solution:

Smoothing is applied to an image prior to edge detection to reduce the noise in the image (typically Gaussian noise), since noise can produce large values in the derivatives computed during edge detection. A low-pass filter (LPF) is typically used for smoothing. The adverse effect is that smoothing decreases the strength of the edges and widens them.
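The adverse effect can be seen on a one-dimensional step edge. The sketch below is a minimal illustration, assuming a 3-tap moving average as the smoothing filter and a first difference as the gradient; both choices and the names are hypothetical.

```python
def moving_average(signal, width=3):
    """Valid-region moving average (a simple low-pass filter)."""
    return [sum(signal[i:i + width]) / width for i in range(len(signal) - width + 1)]


def first_difference(signal):
    """Discrete derivative used as a stand-in for the gradient."""
    return [b - a for a, b in zip(signal, signal[1:])]


step = [0] * 5 + [9] * 5
raw_gradient = first_difference(step)
smooth_gradient = first_difference(moving_average(step))

# Peak gradient and number of samples with a non-zero gradient.
print(max(raw_gradient), sum(1 for g in raw_gradient if g != 0))        # 9 1
print(max(smooth_gradient), sum(1 for g in smooth_gradient if g != 0))  # 3.0 3
```

After smoothing, the peak gradient drops from 9 to 3 and the edge spreads over three samples instead of one.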

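[Figure for Problem 1(b): ideal frequency response of the anti-aliasing filter on the (ω1, ω2) plane over [−π, π] × [−π, π], with gain 1 in the rectangular passband bounded by ±π/3 and ±π/4]

[Figure for Problem 1(c): M × N samples → decimate vertically by 4 → M × (N/4) samples → decimate horizontally by 3 → (M/3) × (N/4) samples]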

(b) The Canny edge detector uses a hysteresis thresholding technique that employs two thresholds tH (high) and tL (low). After initial gradient computation, the gradient magnitude array for a 12 × 8 image is shown below. Assume tH = 6 and tL = 3. Show the final edge pixels identified by the algorithm after hysteresis thresholding. Note that for easy spotting all edges are horizontal (along rows). Mark the final edge pixels with an X in the pixel box.

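0 0 0 1 0 1 1 0 0 1 0 0
1 1 0 4 5 7 8 5 4 2 2 1
1 1 0 0 0 0 1 1 0 0 1 2
0 7 4 5 2 8 7 9 4 4 2 1
1 0 0 2 2 2 1 1 2 0 1 2
1 2 7 8 9 9 8 7 8 1 1 1
0 1 0 0 0 0 1 1 0 0 1 1
1 1 0 4 5 5 5 5 4 2 2 1

Solution (X marks a final edge pixel):

0 0 0 1 0 1 1 0 0 1 0 0
1 1 0 X X X X X X 2 2 1
1 1 0 0 0 0 1 1 0 0 1 2
0 X X X 2 X X X X X 2 1
1 0 0 2 2 2 1 1 2 0 1 2
1 2 X X X X X X X 1 1 1
0 1 0 0 0 0 1 1 0 0 1 1
1 1 0 4 5 5 5 5 4 2 2 1

The last row contains pixels above tL but none above tH, and it is not connected to any strong pixel, so it yields no edge pixels.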
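The hysteresis step can be sketched as a breadth-first search that starts from the strong pixels and grows through connected weak pixels. This is a minimal sketch under assumptions not stated in the exam: 8-connectivity, and inclusive comparisons (≥ tH for strong, ≥ tL for weak). Neither choice changes the result for this array, since the values 3 and 6 do not occur in it.

```python
from collections import deque

GRADIENT = [
    [0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 4, 5, 7, 8, 5, 4, 2, 2, 1],
    [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 2],
    [0, 7, 4, 5, 2, 8, 7, 9, 4, 4, 2, 1],
    [1, 0, 0, 2, 2, 2, 1, 1, 2, 0, 1, 2],
    [1, 2, 7, 8, 9, 9, 8, 7, 8, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 4, 5, 5, 5, 5, 4, 2, 2, 1],
]


def hysteresis(grad, t_high, t_low):
    """Keep strong pixels plus weak pixels connected to them (8-connectivity)."""
    rows, cols = len(grad), len(grad[0])
    edge = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grad[r][c] >= t_high:        # strong pixels seed the search
                edge[r][c] = True
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not edge[nr][nc] and grad[nr][nc] >= t_low:
                    edge[nr][nc] = True     # weak pixel linked to an edge pixel
                    queue.append((nr, nc))
    return edge


mask = hysteresis(GRADIENT, t_high=6, t_low=3)
for row in mask:
    print("".join("X" if e else "." for e in row))
```

The printed mask has X in columns 4 to 9 of row 2, columns 2 to 4 and 6 to 10 of row 4, and columns 3 to 9 of row 6 (1-based), and no X in the last row.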

3. Corners and SIFT (6 points)

(a) Briefly describe the importance of the smallest eigenvalue of the corner matrix M in defining a

Harris corner point.

Solution:

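A point is a Harris corner only if the smallest eigenvalue of the corner matrix M is large. Both eigenvalues are then large, meaning the intensity changes significantly in every direction. If only the larger eigenvalue is large, the point lies on an edge; if both are small, the region is flat.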
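The role of the smallest eigenvalue can be illustrated with the closed-form eigenvalues of a symmetric 2 × 2 matrix. In this sketch M is written as [[a, b], [b, c]]; the threshold value and the function names are hypothetical, and thresholding the smallest eigenvalue directly is only one way to apply the criterion.

```python
import math


def smallest_eigenvalue(a, b, c):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mean - spread


def classify(a, b, c, threshold):
    """Label a point from the eigenvalues of its corner matrix."""
    lam_min = smallest_eigenvalue(a, b, c)
    lam_max = (a + c) - lam_min   # the eigenvalues sum to the trace
    if lam_min > threshold:
        return "corner"
    if lam_max > threshold:
        return "edge"
    return "flat"


print(classify(10.0, 0.0, 8.0, threshold=1.0))   # corner: both eigenvalues large
print(classify(10.0, 0.0, 0.1, threshold=1.0))   # edge: only one eigenvalue large
print(classify(0.2, 0.0, 0.1, threshold=1.0))    # flat: both eigenvalues small
```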

(b) Explain a key shortcoming of a Harris corner detector that is overcome in a SIFT interest point

detector.

Solution:

A key shortcoming of the Harris corner detector is that it is not scale-invariant: changing the scale of the image can change which corners are detected. SIFT, on the other hand, is scale-invariant, so a change in image scale does not affect the detection of its interest points.

(c) In SIFT, what is the purpose of using a Difference of Gaussian (DoG)?

Solution:

The Difference of Gaussian (DoG) approximates the Laplacian of Gaussian that SIFT needs to identify interest points across scales. In addition, it is cheap to compute, since it only requires subtracting adjacent Gaussian-blurred images.

(d) Briefly explain how RANSAC works to avoid the inclusion of outliers in fitting a model.

Solution:

RANSAC works by randomly selecting a small set of points and fitting the model to them. Using the computed model parameters and a suitable error tolerance threshold, the remaining points are classified as inliers or outliers and the inliers are counted. The model with the highest number of inliers so far is saved. The process is repeated with different random points until a satisfactory number of inliers is achieved. The tolerance threshold excludes most of the outliers from the final fit.
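The loop described above can be sketched for line fitting. This is a minimal sketch with several assumptions: the model is y = a·x + b, residuals are vertical distances, a fixed iteration count replaces the "satisfactory number of inliers" stopping rule, and the data set, seed, and names are made up for the example.

```python
import random


def ransac_line(points, iterations=200, tolerance=0.5, seed=0):
    """Fit y = a*x + b by RANSAC; returns (a, b, inliers)."""
    rng = random.Random(seed)
    best = (None, None, [])
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # random minimal sample
        if x1 == x2:
            continue  # a vertical candidate cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tolerance]
        if len(inliers) > len(best[2]):              # keep the best-supported model
            best = (a, b, inliers)
    return best


line_points = [(x, 2 * x + 1) for x in range(10)]   # ten points on y = 2x + 1
outliers = [(1, 9), (4, 0), (7, 3)]                 # three gross outliers
a, b, inliers = ransac_line(line_points + outliers)
print(a, b, len(inliers))
```

Because every candidate model is fitted only to its two sampled points, an outlier can spoil an individual candidate but not the best-supported one, which is what keeps outliers out of the final fit.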

4. Correlation and template matching (6 points)

A template h and pixel values of an image f are shown below. Use suitable reasoning to explain your

answers. Explicit computations may not be necessary.

Template h:

 1  1  1
-1 -1 -1
 1 -1  0

Image f:

 3  3  3  0  0  1  1  1  0  0  0  0
-3 -3 -3  0  0  1  1  1  0  0  0  0
 3 -3  0  0  0  1  1  1  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0
 4  4  4  0  0  0  8  8  8  0  0  0
-4 -4 -4  0  0  0 -8 -8 -8  0  0  0
 4 -4  0  0  0  0  9 -9  0  0  0  0

(a) Identify (by shading the interior) the 3 × 3 block(s) in the image that yield the highest value of the cross-correlation function.

(b) Identify (by darkening the border) the 3 × 3 block(s) in the image that yield the highest value of the normalized cross-correlation function.
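Both parts can be checked numerically on the template and image given in this problem. The sketch below is an illustration, not the exam's own solution: it assumes the normalized cross-correlation is the correlation divided by the product of the patch and template norms (no mean subtraction; the template and the candidate blocks have zero mean, so the mean-subtracted variant ranks them the same way), and the function names are hypothetical.

```python
import math

H = [[1, 1, 1], [-1, -1, -1], [1, -1, 0]]
F = [
    [3, 3, 3, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [-3, -3, -3, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [3, -3, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [4, 4, 4, 0, 0, 0, 8, 8, 8, 0, 0, 0],
    [-4, -4, -4, 0, 0, 0, -8, -8, -8, 0, 0, 0],
    [4, -4, 0, 0, 0, 0, 9, -9, 0, 0, 0, 0],
]


def block(image, r, c, size=3):
    """Extract the size x size patch whose top-left corner is (r, c)."""
    return [row[c:c + size] for row in image[r:r + size]]


def dot(a, b):
    """Sum of element-wise products of two equally sized 2D lists."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def correlation_maps(image, template):
    """Cross-correlation and normalized cross-correlation at every valid offset."""
    size = len(template)
    cc, ncc = {}, {}
    for r in range(len(image) - size + 1):
        for c in range(len(image[0]) - size + 1):
            patch = block(image, r, c, size)
            score = dot(patch, template)
            energy = dot(patch, patch) * dot(template, template)
            cc[(r, c)] = score
            ncc[(r, c)] = score / math.sqrt(energy) if energy else 0.0
    return cc, ncc


def argmax_all(scores, eps=1e-9):
    """All offsets whose score is within eps of the maximum."""
    top = max(scores.values())
    return sorted(pos for pos, s in scores.items() if top - s <= eps)


cc, ncc = correlation_maps(F, H)
print(argmax_all(cc))    # zero-based (row, col) of the top-left corner: [(5, 6)]
print(argmax_all(ncc))   # [(0, 0), (5, 0)]
```

Under these assumptions, the cross-correlation peaks (value 66) at the block with entries 8 and ±9, because it has the largest amplitude. The normalized cross-correlation equals 1 only for the two blocks that are exact scaled copies of h (3h and 4h), while the high-amplitude block scores slightly below 1 because its last row is 9, −9 rather than 8, −8.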

5. Hough transform (6 points)

(a) In applying the Hough transform the polar coordinate representation of a line, ρ = x cos θ + y sin θ

is preferred over the standard line representation using y = mx + b. Explain why this is so.

Solution:

The polar representation ρ = x cos θ + y sin θ is preferred over y = mx + b because the slope m is unbounded: it becomes infinite for vertical lines, so the (m, b) parameter space cannot be covered by a finite accumulator. In the polar form both parameters are bounded: θ spans a finite angular range and |ρ| is limited by the image diagonal.

(b) Give a rough plot of the curves ρ = xi cos θ + yi sin θ corresponding to any assumed three points for i = 1, 2, 3 that are collinear (lie on a straight line), such as (1,1), (2,2), (3,3).

Solution: If we use points {(1,1), (2,2), (3,3)}, then ρ = i (cos θ + sin θ), i = 1, 2, 3.

[Figure: rough plot of the three curves ρ = i (cos θ + sin θ) against θ]
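The three curves can be tabulated numerically. Each curve is a sinusoid, ρ = i√2 sin(θ + π/4), and all three cross at θ = 3π/4, ρ = 0, which is the polar parameter pair of the line y = x through the three points. The sketch below is illustrative; the function name and the sampled angles are arbitrary choices.

```python
import math


def rho(point, theta):
    """Hough curve of one image point in (theta, rho) space."""
    x, y = point
    return x * math.cos(theta) + y * math.sin(theta)


points = [(1, 1), (2, 2), (3, 3)]
theta_star = 3 * math.pi / 4   # normal direction of the line y = x

for theta in (0.0, math.pi / 4, math.pi / 2, theta_star):
    print(round(theta, 3), [round(rho(p, theta), 3) for p in points])

# At theta_star all three curves meet (rho is zero up to rounding error).
print(all(abs(rho(p, theta_star)) < 1e-12 for p in points))
```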

(c) If Hough transform is used in determining the presence of circles of all sizes in a two-dimensional

image, what is the dimensionality of the Hough accumulator array? List the accumulator variables.

Solution:

The dimensionality of the Hough accumulator array is 3.

The accumulator variables are a, b, and r: the center coordinates and the radius of the circle (x − a)² + (y − b)² = r².
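A three-dimensional accumulator over (a, b, r) can be sketched as follows. This is a brute-force illustration under assumptions: candidate centers lie on the integer pixel grid, the radius is binned by rounding to the nearest integer, and the test circle and names are made up for the example.

```python
import math
from collections import Counter


def hough_circles(edge_points, width, height):
    """Vote in a 3D accumulator indexed by (a, b, r)."""
    votes = Counter()
    for x, y in edge_points:
        for a in range(width):
            for b in range(height):
                # Radius of the circle centred at (a, b) that passes through (x, y).
                r = round(math.sqrt((x - a) ** 2 + (y - b) ** 2))
                if r > 0:
                    votes[(a, b, r)] += 1
    return votes


# Integer points on the circle (x - 10)^2 + (y - 10)^2 = 5^2.
offsets = [(5, 0), (0, 5), (3, 4), (4, 3)]
circle = {(10 + sx * dx, 10 + sy * dy)
          for dx, dy in offsets for sx in (1, -1) for sy in (1, -1)}

votes = hough_circles(circle, width=21, height=21)
print(votes.most_common(1))   # [((10, 10, 5), 12)]
```

Each edge point votes once per candidate center, so the cost grows with the number of (a, b) cells; the peak cell collects one vote from every point on the circle.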

6. 2D filters and directional filtering (8 points)

Consider a system with impulse response:

h[m, n] = h1D[m] h1D[n], where h1D[m] = (1/2) sinc(m/2)

(a) Define an impulse response h1[m, n] = h[m, n] + (−1)^(m+n) h[m, n]. Sketch its frequency response H1(e^{jω1}, e^{jω2}) (top view), indicating the passband and the stopband in the region [−π, π] × [−π, π].

(b) Let h2[m, n] be the impulse response of a checkerboard filter whose frequency response H2(e^{jω1}, e^{jω2}) (top view) is shown below.

Define h3[m, n] = (h1 ∗ h2)[m, n], where ∗ denotes convolution. Sketch the frequency response H3(e^{jω1}, e^{jω2}) (top view), indicating the passband and the stopband in the region [−π, π] × [−π, π].

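[Figures: top views of the frequency responses, including that of h2[m, n], over [−π, π] × [−π, π], with axes ω1 and ω2 marked at ±π/2 and ±π and the passband regions labeled 1]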

(c) An input signal f [m, n] = f1[m, n] + f2[m, n] is applied to the filter with impulse response h3,

where f1[m, n] = 2 cos((m + n)π/4) and f2[m, n] = 2 cos((m − n)π/4). Determine the output g[m, n].

Solution:

Signal f1 corresponds to the frequency (ω1, ω2) = (π/4, π/4), which is in the passband of the filter with impulse response h3. Signal f2 corresponds to the frequency (ω1, ω2) = (π/4, −π/4), which is in the stopband of that filter. Therefore f1 passes through and f2 is removed:

g[m, n] = f1[m, n] = 2 cos((m + n)π/4)
