ECE 515 Image Analysis & Computer Vision II Spring 2021
Mid-term Exam
Instructor: Rashid Ansari
Date: 03/08/2021
1. Image resizing (8 points)
Consider the rectangular image of the green leaf of size 960 × 1280. The blue and red components of
the color image are zero. A museum wants you to process it and display it as a 320 × 320 square image with
the leaf proportionally wider and red as shown below. Aliasing in the processed image should be
minimal.
[Figure: the original 960 × 1280 image and the processed 320 × 320 image]
(a) Briefly explain your procedure.
Solution:
➢ Copy each pixel's green value into its red channel, then set the green and blue
channels to zero.
➢ Decimate the image horizontally by a factor of 3 (960 → 320 columns).
➢ Decimate the image vertically by a factor of 4 (1280 → 320 rows).
Each decimation consists of an anti-aliasing low-pass filter followed by downsampling.
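The procedure above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the ideal filters asked for in part (b) are replaced by Hamming-windowed truncations, the array layout is assumed to be (rows, columns, RGB), and the names `lowpass_taps`, `decimate` and `green_to_red_square` are hypothetical.

```python
import numpy as np

def lowpass_taps(factor, half_width=12):
    """Hamming-windowed truncation of the ideal low-pass filter with cutoff pi/factor (assumed design)."""
    n = np.arange(-half_width, half_width + 1)
    taps = np.sinc(n / factor) / factor * np.hamming(n.size)
    return taps / taps.sum()  # normalize to unit DC gain

def decimate(channel, factor, axis):
    """Low-pass filter along `axis`, then keep every `factor`-th sample."""
    taps = lowpass_taps(factor)
    filtered = np.apply_along_axis(np.convolve, axis, channel, taps, mode="same")
    index = [slice(None)] * channel.ndim
    index[axis] = slice(None, None, factor)
    return filtered[tuple(index)]

def green_to_red_square(rgb):
    """rgb: array of shape (1280, 960, 3) holding the green-only leaf image."""
    green = rgb[..., 1].astype(float)
    red = decimate(green, 4, axis=0)   # vertical decimation: 1280 -> 320 rows
    red = decimate(red, 3, axis=1)     # horizontal decimation: 960 -> 320 columns
    out = np.zeros(red.shape + (3,))
    out[..., 0] = red                  # the leaf now lives in the red channel
    return out
```

Filtering before downsampling is what keeps aliasing small; dropping samples without the filter would fold high frequencies into the reduced image.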
(b) Specify the ideal frequency and impulse responses needed for the anti-aliasing filter.
Solution:
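$h_h[m] = \frac{1}{3}\,\mathrm{sinc}\!\left(\frac{m}{3}\right)$: horizontal LPF with cutoff frequency $\omega_c = \frac{\pi}{3}$
$h_v[n] = \frac{1}{4}\,\mathrm{sinc}\!\left(\frac{n}{4}\right)$: vertical LPF with cutoff frequency $\omega_c = \frac{\pi}{4}$
$h[m, n] = h_h[m]\,h_v[n] = \frac{1}{12}\,\mathrm{sinc}\!\left(\frac{m}{3}\right)\mathrm{sinc}\!\left(\frac{n}{4}\right)$
Here $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$, so that $h_h[m] = \sin(\pi m/3)/(\pi m)$ and $h_v[n] = \sin(\pi n/4)/(\pi n)$.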
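As a quick numerical check of these impulse responses, the sketch below samples them on a short index range. It relies on `np.sinc`, which uses the normalized convention sin(πx)/(πx); the helper name `ideal_lowpass` and the index range are assumptions made only for illustration.

```python
import numpy as np

def ideal_lowpass(n, factor):
    """Samples of the ideal low-pass impulse response with cutoff pi/factor."""
    return np.sinc(n / factor) / factor  # np.sinc(x) = sin(pi x) / (pi x)

idx = np.arange(-6, 7)               # sample indices -6..6, index 0 sits at position 6
h_h = ideal_lowpass(idx, 3)          # horizontal filter, cutoff pi/3
h_v = ideal_lowpass(idx, 4)          # vertical filter, cutoff pi/4
h_2d = np.outer(h_v, h_h)            # separable 2-D response: rows follow n, columns follow m
```

The center tap of `h_2d` is 1/12, and `h_h` vanishes at every nonzero multiple of 3, as expected for a cutoff of π/3.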
(c) A K × K separable anti-aliasing filter is used to obtain the final image. Determine the approximate
number of multiplications needed for the separable processing to obtain the reduced size image.
Solution: Let M = number of image columns = 960 and N = number of image rows = 1280.
Perform the vertical decimation first, using the length-K vertical factor of the separable filter. Without exploiting symmetry, we need K
multiplications per output sample, and only the MN/4 retained output samples are computed:
N1 = KMN/4
Next, for the horizontal decimation, we need K multiplications per output sample with MN/12 output samples:
N2 = KMN/12
Total number of multiplications: N1 + N2 = KMN/3 = K × 960 × 1280 / 3 = 409,600 K
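The count can be reproduced with a few lines of Python. The function name and the `factors` parameter are hypothetical; the sketch assumes, as in the solution, that only retained output samples are computed and that filter symmetry is not exploited.

```python
def separable_decimation_multiplies(K, M=960, N=1280, factors=(4, 3)):
    """Multiplications for successive 1-D decimation passes, applied in the order given."""
    samples = M * N
    total = 0
    for factor in factors:
        samples //= factor       # output samples that survive this pass
        total += K * samples     # K multiplications per retained output sample
    return total
```

With the default order (vertical by 4, then horizontal by 3) the result is 409,600 K. Swapping the order to `factors=(3, 4)` gives 512,000 K, so applying the larger decimation factor first is the cheaper choice here.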
2. Edge detection (6 points)
(a) Describe briefly why smoothing is applied to an image prior to edge detection. Explain any adverse
effect of this smoothing in edge detection.
Solution:
Smoothing is applied to an image prior to edge detection to reduce noise (typically modeled as
Gaussian noise), because the derivatives computed in edge detection amplify noise and produce large
spurious responses. A low-pass filter (LPF) is most commonly used for smoothing. The adverse effect is
that smoothing weakens the edges and widens them, which degrades edge localization.
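The adverse effect can be seen on a one-dimensional step edge. The sketch below is a toy illustration, assuming a 5-tap moving average as the smoother and a first difference as the derivative; neither choice comes from the exam.

```python
import numpy as np

step = np.concatenate([np.zeros(10), np.ones(10)])   # ideal step edge
box = np.ones(5) / 5                                 # 5-tap moving-average smoother (assumed)
smoothed = np.convolve(step, box, mode="valid")      # "valid" avoids zero-padding artifacts

grad_raw = np.abs(np.diff(step))
grad_smooth = np.abs(np.diff(smoothed))

# Edge strength (peak gradient) and edge width (number of nonzero gradient samples).
peak_raw, width_raw = grad_raw.max(), np.count_nonzero(grad_raw > 1e-12)
peak_smooth, width_smooth = grad_smooth.max(), np.count_nonzero(grad_smooth > 1e-12)
```

The unsmoothed edge has peak gradient 1 over a single sample, while the smoothed edge has peak gradient 0.2 spread over 5 samples: weaker and wider.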
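[Figure (Problem 1): ideal anti-aliasing frequency response over [−π, π] × [−π, π], with gain 1 in the passband |ω1| ≤ π/3, |ω2| ≤ π/4; block diagram: M × N samples → decimate vertically by 4 → M × (N/4) samples → decimate horizontally by 3 → (M/3) × (N/4) samples]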
(b) The Canny edge detector uses hysteresis thresholding, which employs two
thresholds tH (high) and tL (low). After the initial gradient computation, the gradient magnitude array for
a 12 × 8 image is shown below. Assume tH = 6 and tL = 3. Show the final edge pixels identified by the
algorithm after hysteresis thresholding. Note that for easy spotting all edges are horizontal (along
rows). Mark the final edge pixels with an X in the pixel box.
0 0 0 1 0 1 1 0 0 1 0 0
1 1 0 4 5 7 8 5 4 2 2 1
1 1 0 0 0 0 1 1 0 0 1 2
0 7 4 5 2 8 7 9 4 4 2 1
1 0 0 2 2 2 1 1 2 0 1 2
1 2 7 8 9 9 8 7 8 1 1 1
0 1 0 0 0 0 1 1 0 0 1 1
1 1 0 4 5 5 5 5 4 2 2 1
Solution (final edge pixels marked with X):
0 0 0 1 0 1 1 0 0 1 0 0
1 1 0 X X X X X X 2 2 1
1 1 0 0 0 0 1 1 0 0 1 2
0 X X X 2 X X X X X 2 1
1 0 0 2 2 2 1 1 2 0 1 2
1 2 X X X X X X X 1 1 1
0 1 0 0 0 0 1 1 0 0 1 1
1 1 0 4 5 5 5 5 4 2 2 1
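Hysteresis thresholding keeps every pixel at or above tH and then grows the edges through connected pixels at or above tL. A minimal sketch on the exam's array follows, assuming 8-connectivity and inclusive thresholds (the array contains no value equal to 3 or 6, so the inclusive choice does not matter here); the function name `hysteresis` is hypothetical.

```python
import numpy as np

G = np.array([
    [0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 4, 5, 7, 8, 5, 4, 2, 2, 1],
    [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 2],
    [0, 7, 4, 5, 2, 8, 7, 9, 4, 4, 2, 1],
    [1, 0, 0, 2, 2, 2, 1, 1, 2, 0, 1, 2],
    [1, 2, 7, 8, 9, 9, 8, 7, 8, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 4, 5, 5, 5, 5, 4, 2, 2, 1],
])

def hysteresis(grad, t_high, t_low):
    """Return a boolean edge map: strong pixels plus weak pixels connected to them."""
    strong = grad >= t_high
    candidate = grad >= t_low
    edges = strong.copy()
    rows, cols = grad.shape
    stack = list(zip(*np.nonzero(strong)))        # start from every strong pixel
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and candidate[rr, cc] and not edges[rr, cc]:
                    edges[rr, cc] = True          # weak pixel linked to an edge
                    stack.append((rr, cc))
    return edges

edges = hysteresis(G, t_high=6, t_low=3)
```

The last row contains pixels above tL but none at or above tH, and it is not connected to any strong pixel, so it contributes no edge pixels.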
3. Corners and SIFT (6 points)
(a) Briefly describe the importance of the smallest eigenvalue of the corner matrix M in defining a
Harris corner point.
Solution:
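A point is declared a Harris corner only when the smallest eigenvalue of the corner matrix M is large.
A large smallest eigenvalue means that both eigenvalues are large, so the intensity changes strongly in
every direction. If it is small, the point lies on an edge (one large eigenvalue) or in a flat region
(both eigenvalues small).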
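The role of the smallest eigenvalue can be illustrated on three synthetic patches. This is a simplified sketch: M is summed uniformly over the whole patch (no Gaussian window), gradients come from `np.gradient`, and the helper name `min_eigenvalue` is hypothetical.

```python
import numpy as np

def min_eigenvalue(patch):
    """Smallest eigenvalue of the 2x2 corner matrix M accumulated over the patch."""
    iy, ix = np.gradient(patch.astype(float))     # derivatives along rows, then columns
    M = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    return np.linalg.eigvalsh(M)[0]               # eigvalsh returns ascending eigenvalues

flat = np.zeros((9, 9))
edge = np.zeros((9, 9))
edge[:, 5:] = 1.0                                 # vertical step edge
corner = np.zeros((9, 9))
corner[5:, 5:] = 1.0                              # one bright quadrant

lam_flat, lam_edge, lam_corner = (min_eigenvalue(p) for p in (flat, edge, corner))
```

Only the corner patch yields a clearly positive smallest eigenvalue; the flat and edge patches both give zero.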
(b) Explain a key shortcoming of a Harris corner detector that is overcome in a SIFT interest point
detector.
Solution:
A key shortcoming of the Harris corner detector is that it is not invariant to scale:
changing the scale of the image can change which corners are detected. SIFT, on the other
hand, is scale-invariant because it searches for interest points across scales, so a change in
image scale does not affect interest point detection.
(c) In SIFT, what is the purpose of using a Difference of Gaussian (DoG)?
Solution:
The purpose of using a Difference of Gaussian (DoG) is to approximate the Laplacian of Gaussian that is needed
in SIFT to identify the interest points. In addition, it is easy to implement: it only requires
subtracting Gaussian-smoothed images at neighboring scales.
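The approximation can be checked numerically. The sketch below works in one dimension for brevity, where the corresponding relation is g(x; kσ) − g(x; σ) ≈ (k − 1) σ² g''(x; σ); the values σ = 1 and k = 1.1 and the function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def gaussian_second_derivative(x, sigma):
    """Second derivative of the Gaussian (the 1-D analogue of the Laplacian of Gaussian)."""
    return gaussian(x, sigma) * (x**2 / sigma**4 - 1 / sigma**2)

x = np.linspace(-10, 10, 2001)
sigma, k = 1.0, 1.1
dog = gaussian(x, k * sigma) - gaussian(x, sigma)
scaled_log = (k - 1) * sigma**2 * gaussian_second_derivative(x, sigma)

# Cosine similarity between the two kernels; values near 1 mean nearly identical shape.
cosine = np.dot(dog, scaled_log) / (np.linalg.norm(dog) * np.linalg.norm(scaled_log))
```

A cosine similarity close to 1 indicates that the DoG kernel has essentially the same shape as the scaled second-derivative kernel.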
(d) Briefly explain how RANSAC works to avoid the inclusion of outliers in fitting a model.
Solution:
RANSAC works by repeatedly selecting a small random subset of points and fitting the model to it. Using the
resulting model parameters and a suitable error tolerance threshold, every point is classified as an inlier
or an outlier, and the inliers are counted. The model with the highest number of inliers is saved. The process is
repeated with different random subsets until a satisfactory number of inliers is achieved. Because points
outside the tolerance threshold never count as support for a model, most of the outliers are excluded.
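A minimal RANSAC sketch for fitting a line y = ax + b is given below. Several details are assumptions for illustration only: a fixed number of iterations replaces the "satisfactory number of inliers" stopping rule, the error is the vertical distance to the line, and the name `ransac_line` and its parameters are hypothetical.

```python
import numpy as np

def ransac_line(points, n_iters=100, tol=0.5, seed=0):
    """Fit y = a*x + b; return (a, b, inlier_mask) of the best-supported model."""
    rng = np.random.default_rng(seed)
    best = (0.0, 0.0, np.zeros(len(points), dtype=bool))
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)   # minimal random sample
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue                                            # vertical pair: skipped in this sketch
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = np.abs(points[:, 1] - (a * points[:, 0] + b)) < tol   # tolerance test
        if inliers.sum() > best[2].sum():
            best = (a, b, inliers)                              # keep the largest consensus set
    return best

# Usage: 20 points on y = 2x + 1 plus 5 gross outliers.
x = np.arange(20.0)
pts = np.vstack([
    np.column_stack([x, 2 * x + 1]),
    [[2.5, 30.0], [5.5, -20.0], [8.5, 40.0], [11.5, -15.0], [14.5, 50.0]],
])
a, b, mask = ransac_line(pts)
```

A common refinement, not shown here, is to refit the model to all inliers of the best consensus set.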
4. Correlation and template matching (6 points)
A template h and pixel values of an image f are shown below. Use suitable reasoning to explain your
answers. Explicit computations may not be necessary.
Template h:
 1  1  1
-1 -1 -1
 1 -1  0
Image f:
 3  3  3  0  0  1  1  1  0  0  0  0
-3 -3 -3  0  0  1  1  1  0  0  0  0
 3 -3  0  0  0  1  1  1  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0
 4  4  4  0  0  0  8  8  8  0  0  0
-4 -4 -4  0  0  0 -8 -8 -8  0  0  0
 4 -4  0  0  0  0  9 -9  0  0  0  0
(a) Identify (by shading the interior) the 3 × 3 block(s) in the image that yield the highest value of the
cross-correlation function.
(b) Identify (by darkening the border) the 3 × 3 block(s) in the image that yield the highest value of
the normalized cross-correlation function.
[Figure: solution grids for (a) and (b), repeating f with the selected block(s) shaded and bordered; the markings are not recoverable]
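The two criteria can be evaluated directly on the arrays above. The sketch assumes the normalized cross-correlation divides the cross-correlation by the product of the L2 norms of the template and the block, without mean subtraction (the template already has zero mean); the exam's exact formula is not reproduced in this text, and the name `block_scores` is hypothetical.

```python
import numpy as np

h = np.array([[1, 1, 1], [-1, -1, -1], [1, -1, 0]], dtype=float)
f = np.array([
    [3, 3, 3, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [-3, -3, -3, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [3, -3, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [4, 4, 4, 0, 0, 0, 8, 8, 8, 0, 0, 0],
    [-4, -4, -4, 0, 0, 0, -8, -8, -8, 0, 0, 0],
    [4, -4, 0, 0, 0, 0, 9, -9, 0, 0, 0, 0],
], dtype=float)

def block_scores(f, h):
    """Cross-correlation and normalized cross-correlation for every fully overlapping block."""
    kr, kc = h.shape
    rows, cols = f.shape[0] - kr + 1, f.shape[1] - kc + 1
    cc = np.zeros((rows, cols))
    ncc = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            block = f[r:r + kr, c:c + kc]
            cc[r, c] = np.sum(h * block)
            energy = np.sum(h * h) * np.sum(block * block)
            if energy > 0:                       # all-zero blocks keep a score of 0
                ncc[r, c] = cc[r, c] / np.sqrt(energy)
    return cc, ncc

cc, ncc = block_scores(f, h)
best_cc = np.argwhere(np.isclose(cc, cc.max()))     # top-left corners as (row, col), 0-based
best_ncc = np.argwhere(np.isclose(ncc, ncc.max()))
```

Under these assumptions, the cross-correlation peaks at the block with top-left corner (5, 6), which has the largest amplitudes, while the normalized version peaks at (0, 0) and (5, 0), the two blocks that are exact positive multiples of h (3h and 4h).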
5. Hough transform (6 points)
(a) In applying the Hough transform the polar coordinate representation of a line, ρ = x cos θ + y sin θ
is preferred over the standard line representation using y = mx + b. Explain why this is so.
Solution:
The polar representation ρ = x cos θ + y sin θ is preferred because both of its parameters are bounded:
θ covers a finite interval and |ρ| cannot exceed the image diagonal, so the accumulator array is finite.
In the representation y = mx + b the slope m is unbounded and becomes infinite for vertical lines, which
cannot be represented in a finite accumulator.
(b) Give a rough plot of the curves ρ = x_i cos θ + y_i sin θ corresponding to any three assumed points
(x_i, y_i), i = 1, 2, 3, that are collinear (lie on a straight line), such as (1,1), (2,2), (3,3).
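Solution: If we use the points {(1,1), (2,2), (3,3)}, then $\rho_i = i(\cos\theta + \sin\theta)$, $i = 1, 2, 3$.
The three sinusoids have amplitudes √2, 2√2 and 3√2 and intersect at ρ = 0, θ = 3π/4 (equivalently θ = −π/4),
which are the parameters of the line y = x through the three points.
[Figure: plot of the three curves in the (θ, ρ) plane]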
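The common intersection of the three curves can be located numerically. The sketch below samples θ in 1-degree steps over [0, π], which is an assumed discretization, and looks for the angle at which the three curves coincide.

```python
import numpy as np

points = [(1, 1), (2, 2), (3, 3)]
theta = np.linspace(0, np.pi, 181)                    # 1-degree steps (assumed grid)
rho = np.array([x * np.cos(theta) + y * np.sin(theta) for x, y in points])

spread = rho.max(axis=0) - rho.min(axis=0)            # gap between the curves at each theta
k = int(np.argmin(spread))                            # angle where all three curves meet
theta_star, rho_star = theta[k], rho[0, k]
```

The curves meet at θ = 135° with ρ = 0, the accumulator cell that would collect one vote from each of the three points.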
(c) If Hough transform is used in determining the presence of circles of all sizes in a two-dimensional
image, what is the dimensionality of the Hough accumulator array? List the accumulator variables.
Solution:
The Hough accumulator array is three-dimensional.
The accumulator variables are a, b, and r, the center coordinates and the radius of the circle
$(x - a)^2 + (y - b)^2 = r^2$
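A small sketch of the three-dimensional voting follows. It assumes integer-valued centers on a square grid and radii rounded to the nearest integer; the function name `hough_circles` and the test circle (center (10, 12), radius 5) are hypothetical choices for illustration.

```python
import numpy as np

def hough_circles(edge_points, size, r_max):
    """Vote in a 3-D accumulator indexed by (a, b, r) for circles (x-a)^2 + (y-b)^2 = r^2."""
    acc = np.zeros((size, size, r_max + 1), dtype=int)
    a, b = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    for x, y in edge_points:
        r = np.rint(np.hypot(x - a, y - b)).astype(int)   # radius implied by each candidate center
        valid = (r >= 1) & (r <= r_max)
        np.add.at(acc, (a[valid], b[valid], r[valid]), 1)
    return acc

# Usage: 12 integer points lying exactly on the circle with center (10, 12) and radius 5.
offsets = [(5, 0), (-5, 0), (0, 5), (0, -5), (3, 4), (3, -4),
           (-3, 4), (-3, -4), (4, 3), (4, -3), (-4, 3), (-4, -3)]
pts = [(10 + dx, 12 + dy) for dx, dy in offsets]
acc = hough_circles(pts, size=25, r_max=12)
peak = np.unravel_index(np.argmax(acc), acc.shape)        # (a, b, r) with the most votes
```

Each edge point votes once for every candidate center, so the cost grows with the number of centers times the number of edge points; this is the price of the third accumulator dimension.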
6. 2D filters and directional filtering (8 points)
Consider a system with impulse response:
$h[m, n] = h_{1D}[m]\,h_{1D}[n]$, where $h_{1D}[m] = \frac{1}{2}\,\mathrm{sinc}\!\left(\frac{1}{2} m\right)$
(a) Define an impulse response $h_1[m, n] = h[m, n] + (-1)^{m+n} h[m, n]$. Sketch its frequency response
$H_1(e^{j\omega_1}, e^{j\omega_2})$ (top view), indicating the passband and the stopband in the region [−π, π] × [−π, π].
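Multiplying by (−1)^(m+n) = e^{jπ(m+n)} shifts the frequency response by π in both ω1 and ω2, so the frequency response of h1 is the response of h plus a copy of it shifted by (π, π). If h is taken to be the ideal low-pass filter with cutoff π/2 in each direction, the passband of h1 would then be the central square plus the four corner squares of [−π, π] × [−π, π]. The shift property itself can be checked with the DFT; the sketch below uses a random 8 × 8 array as a stand-in for h, which is an assumption made purely for the check.

```python
import numpy as np

N = 8
rng = np.random.default_rng(0)
h = rng.standard_normal((N, N))                  # arbitrary stand-in for h[m, n]
m, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")

H = np.fft.fft2(h)
H_mod = np.fft.fft2((-1.0) ** (m + n) * h)       # spectrum of (-1)^(m+n) h[m, n]
H_shifted = np.roll(H, shift=(N // 2, N // 2), axis=(0, 1))   # H shifted by pi on both axes

same = np.allclose(H_mod, H_shifted)
```

On an N-point DFT grid with even N, a shift by π corresponds to a circular shift of N/2 bins, which is what `np.roll` applies here.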
(b) Let $h_2[m, n]$ be the impulse response of a checkerboard filter whose frequency response $H_2(e^{j\omega_1}, e^{j\omega_2})$
(top view) is shown below.
Define $h_3[m, n] = (h_1 * h_2)[m, n]$, where $*$ denotes convolution. Sketch the frequency response
$H_3(e^{j\omega_1}, e^{j\omega_2})$ (top view), indicating the passband and the stopband in the region [−π, π] × [−π, π].
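[Figures: top views of the frequency responses over [−π, π] × [−π, π] (axes ω1 and ω2, ticks at ±π/2 and ±π, unit-gain passband cells marked 1), including $H_2(e^{j\omega_1}, e^{j\omega_2})$ of the checkerboard filter h2[m, n]; the passband layouts are not recoverable]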
(c) An input signal f[m, n] = f1[m, n] + f2[m, n] is applied to the filter with impulse response h3,
where f1[m, n] = 2 cos((m + n)π/4) and f2[m, n] = 2 cos((m − n)π/4). Determine the output g[m, n].
Note that signal f1 corresponds to frequency ω1 = π/4, ω2 = π/4, which is in the passband of the
filter with impulse response h3. Signal f2 corresponds to frequency ω1 = π/4, ω2 = −π/4, which is in the
stopband of the filter with impulse response h3.
Solution:
$g[m, n] = f_1[m, n] = 2\cos((m + n)\pi/4)$