Student number: ______________

Semester 2, 2021
Computing and Information Systems
COMP90086 - Computer Vision

Reading time: 15 minutes
Writing time: 2 hours

Permitted Materials
• Calculator

Instructions to Students
• This paper has 7 pages including this cover page.
• There are 9 questions in the exam worth a total of 120 marks, making up 50% of the total assessment for the subject.
• Please answer all questions in this examination paper in the spaces provided.
• Your writing should be clear; illegible answers will not be marked.
• You may not remove any part of this examination paper from the examination room.

Section A. Short Answer Questions

Answer each of the questions in this section as briefly as possible. Expect to answer the text response questions in no more than 2-3 sentences.

Question 1: Short Answer Questions [35 marks]

(a) (3 marks) Why are image borders a problem for convolution? Explain two options for handling the image borders when doing convolution.

(b) (3 marks) Why is a Gaussian filter preferred to a box filter (e.g., the filter shown below) for blurring images?

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

(c) (3 marks) Suppose that you convolve an image I with a filter f, then convolve that output with a second filter g:

    (I ∗ f) ∗ g     (1)

Which of the following would do the equivalent filtering operation on this image in the Fourier domain? (There may be multiple answers; select all that apply.) Notation: ∗ denotes convolution, ⊙ denotes element-wise multiplication, FT[x] is the Fourier transform of x, and FT⁻¹[x] is the inverse Fourier transform of x.

○ FT[I] ⊙ FT[f ⊙ g]
○ FT[I] ∗ FT[f ⊙ g]
○ FT[I ∗ f] ⊙ FT[g]
○ FT[I] ⊙ FT[f] ⊙ FT[g]

(d) (4 marks) If two different objects are photographed under exactly the same lighting conditions with the same camera and produce the same RGB values, can we conclude that both objects have the same spectral power distribution? Why or why not?
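[Aside on (d): the effect at issue, metamerism, is easy to demonstrate numerically. The sensor curves and reflectance spectra below are invented purely for illustration; the two surfaces differ only between 550 and 600 nm, a band where all three toy sensors have zero sensitivity, so both surfaces integrate to identical RGB triples despite having different spectra.]

```python
import numpy as np

# Wavelength samples (nm) and three toy camera sensitivity curves.
wl = np.arange(400, 701, 10)
r_sens = np.where(wl >= 610, 1.0, 0.0)                   # "red" sensor
g_sens = np.where((wl >= 500) & (wl <= 540), 1.0, 0.0)   # "green" sensor
b_sens = np.where(wl <= 480, 1.0, 0.0)                   # "blue" sensor

# Two different reflectance spectra that disagree only in 550-600 nm,
# where every sensor above is zero -> metamers for this camera.
surf_a = np.ones_like(wl, dtype=float)
surf_b = surf_a.copy()
surf_b[(wl >= 550) & (wl <= 600)] = 0.2

def rgb(reflectance):
    # Flat illuminant; each channel integrates reflectance x sensitivity.
    return tuple((reflectance * s).sum() for s in (r_sens, g_sens, b_sens))

print(rgb(surf_a), rgb(surf_b))  # identical RGB triples, different spectra
```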
(e) (3 marks) Which of the following statements are TRUE of the ReLU activation function? (There may be multiple answers. Select all that apply.)

○ It helps reduce the vanishing gradient problem, compared to other activation functions like sigmoid.
○ It allows for faster training of deep CNNs compared to other activation functions like sigmoid.
○ It adds a non-linearity to the output of a convolutional kernel.
○ It reduces the dimensions of the output of a convolutional kernel.

(f) (3 marks) The Canny edge detector has two threshold parameters that must be set by the user. What is the role of each threshold?

(g) (4 marks) Consider the image of a golf ball shown below. Describe two cues that can be used to infer the 3D surface shape of the golf ball from the image and explain what can be computed from each cue.

(h) (4 marks) Assume that you have trained a deep CNN on an image dataset for classification, and you observe that the training accuracy is very high but the testing accuracy is very low. Name three possible strategies that can address this problem.

(i) (4 marks) Why are generative adversarial networks (GANs) prone to mode collapse? Describe a method to detect mode collapse.

(j) (4 marks) Compare and contrast the region-merging and normalised cuts approaches to image segmentation. How are they similar and where do they differ?

Section B. Methodological Questions

In this section you are asked to demonstrate your conceptual understanding of a subset of the methods that we have studied in this subject.

Question 2: Corner detection [12 marks]

The following questions relate to corner detection.

(a) (6 marks) Would the patch w in the image above be considered a "corner" by the Harris corner detection algorithm? Why or why not? Justify your answer in terms of the corner response function.

(b) (6 marks) Is the corner response function invariant to:
• translation?
• image-plane rotation?
• scale?
In addition to providing a yes/no response for each property, briefly justify your answer.

Question 3: Convolutional neural networks [12 marks]

The architecture of U-Net is shown below.

(a) (4 marks) Give an example of a task that is suited to the U-Net architecture and explain why.

(b) (4 marks) Why does the U-Net architecture contain downsampling stages followed by upsampling stages?

(c) (4 marks) Why does the U-Net architecture contain the "bypass" arrows that concatenate activations from the downsampling side with upsampled activations?

Question 4: Texture synthesis [13 marks]

(a) (5 marks) How does parametric texture synthesis differ from non-parametric texture synthesis? What is an advantage of each approach over the other?

(b) (4 marks) The non-parametric texture synthesis algorithms discussed in class (Efros & Leung, 1999; Efros & Freeman, 2001) have patch size as a free parameter. What is the effect of decreasing patch size?

(c) (4 marks) How would you choose an appropriate patch size to correctly synthesize the texture shown below? Be specific, referencing the image.

Question 5: Object detection [13 marks]

(a) (4 marks) Briefly explain the difference between region-proposal-based and single-stage object detectors and the relative advantage of each approach.

(b) (4 marks) In a region-proposal-based network, how is a region of interest different from a bounding box prediction, and how are they related?

(c) (5 marks) Why is class imbalance a problem for CNN-based object detectors? Explain how this problem is handled by a region-proposal-based method and a single-stage method.

Section C. Algorithmic Questions

In this section you are asked to demonstrate your understanding of a subset of the methods that we have studied in this subject, in being able to perform algorithmic calculations.

Question 6: Stereo disparity [6 marks]

Assume the two images shown below were taken from a calibrated pair of stereo cameras.
Each camera has a focal length of 30 mm and produces a 100 x 100 mm image. The two cameras are at the same height and each has its optical centre in the centre of the image (at the point (50,50) in the image). The image planes of the cameras are parallel to each other and to the baseline, which is 500 mm. What is the depth (distance to the baseline) of the indicated point x, which is located at (31,30) in the left camera's image and (29,30) in the right camera's image? Show your work.

Question 7: Epipolar Geometry [8 marks]

The essential matrix for a pair of cameras, mapping points in camera 1 to lines in camera 2, is:

    E = [  3  −4   4 ]
        [  5   0   0 ]
        [ −4  −3   3 ]

There are three points of interest:

    p1 = (0, 0),  p2 = (1, 0),  p3 = (0, 1)

(a) (6 marks) Which of these three points is the epipole in camera 1? Show your working.

(b) (2 marks) Which of these three points corresponds to the point q = (3, −4) in camera 2? Show your working.

Question 8: Convolutional neural networks [15 marks]

In the following CNN network, the input is an RGB image with both height and width equal to 224. The convolution operation in the convolutional layer is standard 2D convolution (i.e., each kernel has the same number of channels as the input, and the kernel slides over local patches of the input, computing an element-wise multiplication and taking the sum). "FC10" denotes a fully connected layer with 10 units. Answer the following questions (show your working):

(a) (3 marks) Compute the size of the feature maps output by the convolutional layer and max-pooling layer. (The output size should be in the format height × width × number of channels.)

(b) (6 marks) Compute the number of parameters and multiplications of the convolutional layer and max-pooling layer (ignore the bias).
(c) (6 marks) If the standard 2D convolution in the convolutional layer is replaced with depthwise separable convolution (with the same padding, stride, and output feature map size), what is the number of parameters and multiplications in the layer?

Question 9: Transposed convolution [6 marks]

Compute the result of performing a transposed convolution on the 2 × 2 input with the 3 × 3 kernel shown below

(a) (3 marks) with a stride of 2
(b) (3 marks) with a stride of 1

Express each result as a matrix and include the trimming step.

Input:
4 7
6 3

Kernel:
0 1 0
1 2 1
0 1 0
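[Aside on Question 9: transposed convolution can be sanity-checked by pasting a copy of the kernel, scaled by each input value, at stride-spaced offsets and summing the overlaps. The numpy sketch below (not part of the paper) prints the untrimmed outputs; the trimming step, whose convention follows the course notes, is deliberately omitted.]

```python
import numpy as np

def transposed_conv2d(x, k, stride):
    """Transposed convolution: for each input value, add a scaled copy of
    the kernel into the output at a stride-spaced offset."""
    n, m = x.shape
    kh, kw = k.shape
    out = np.zeros((stride * (n - 1) + kh, stride * (m - 1) + kw))
    for i in range(n):
        for j in range(m):
            out[stride*i : stride*i + kh, stride*j : stride*j + kw] += x[i, j] * k
    return out

x = np.array([[4., 7.], [6., 3.]])
k = np.array([[0., 1., 0.], [1., 2., 1.], [0., 1., 0.]])

print(transposed_conv2d(x, k, stride=2))  # 5x5 output before trimming
print(transposed_conv2d(x, k, stride=1))  # 4x4 output before trimming
```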
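[Aside on Question 7: the two checks the question relies on are generic. In homogeneous coordinates, the camera-1 epipole e satisfies Ee = 0, and a camera-2 point q can correspond to a camera-1 point p only if q lies on the epipolar line l = Ep, i.e. qᵀ(Ep) = 0. A minimal numpy sketch of both checks, using the matrix from the question (illustrative, not a model answer):]

```python
import numpy as np

# Essential matrix from Question 7.
E = np.array([[ 3., -4.,  4.],
              [ 5.,  0.,  0.],
              [-4., -3.,  3.]])

def homog(p):
    """Lift a 2D image point to homogeneous coordinates (x, y, 1)."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def is_epipole(E, p):
    # The camera-1 epipole satisfies E p = 0 (every epipolar line passes
    # through it, so its image in camera 2 is degenerate).
    return np.allclose(E @ homog(p), 0.0)

def on_epipolar_line(E, p, q):
    # q (camera 2) can match p (camera 1) only if q lies on l = E p.
    return np.isclose(homog(q) @ (E @ homog(p)), 0.0)
```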