Programming assignment example: ENGN6528

Date: 2022-06-07


ENGN6528 Final Exam S1 2021

The Australian National University College of Engineering and Computer Science Final Examination, First Semester 2021

ENGN6528 Computer Vision

Question Booklet



Allotted Time

You will have xx hours to complete the exam plus 15 minutes of reading time. An additional 15 minutes has
also been allowed to accommodate the additional task of uploading your completed exam to the final exam
turnitin submission portal on the ENGN6528 Wattle site. Thus, you have xx hours to complete the exam. NO
late exams will be accepted. You may begin the exam as soon as you download it.

Minimal requirements:

You may attempt all questions

You SHOULD NOT include an assignment cover sheet

You must type your ANU student identification number at the top of the first page of your submission

You must monitor your own time (i.e. there is no invigilator to tell you how many minutes are left).

Your answers must be clear enough that another person can read, understand and mark your answer. 11 or
12 point font with 1.5 spacing is preferred. Scanned images of handwritten equations or diagrams must be
legible and of a suitable size.

Numbering questions
● You must specify the question you are answering by typing the relevant question number at the top
of the page
● Each question should begin on a new page
● Multi-part questions (e.g. question 1 parts a and b) may be addressed on the same page but should
be clearly labelled (e.g. 1a, 1b )
● Questions should be answered in order

You must upload your completed answers in a single document file within the allotted time using a compatible
file type for Turnitin (preference: MS Word’s .doc or .docx format). It is the student’s responsibility to
check that the file has uploaded correctly within Turnitin. No late exams will be accepted.

Academic integrity

Students are reminded of the declaration that they agree to when submitting this exam paper via Turnitin:
I declare that this work:
● upholds the principles of academic integrity as defined in the University Academic Misconduct Rules;
● is original, except where collaboration (for example group work) has been authorised in writing by the
course convener in the course outline and/or Wattle site;


● is produced for the purposes of this assessment task and has not been submitted for assessment in
any other context, except where authorised in writing by the course convener;
● gives appropriate acknowledgement of the ideas, scholarship and intellectual property of others
insofar as these have been used;
● in no part involves copying, cheating, collusion, fabrication, plagiarism or recycling.


There are 5 questions in total.
(Q1-Q5)

Please name your submission as
ENGN6528_exam_u1234567.docx




Q1: (21 marks) [3D SFM and Image formation question]

Answer the following questions concisely. Write down working, and if you are unsure
about some part along the way, state your best assumption and use it for the remaining
parts. Similarly, if you think some aspect is ambiguous, state your assumption and write the
answer as clearly as you can.

(a) Consider two calibrated cameras, C1 and C2. C1 has a focal length of 500 in x and 375 in y
(in pixel units), a resolution of 512x512, and no skew; its camera centre projects to the
image at (249, 249). C2 has the same image resolution and focal length as C1, but its camera
centre projects to the image at (251, 252). Write down the calibration matrices K1 and K2
for C1 and C2 respectively. (Hint: please only write down the final two 3x3 matrices.)
[3 marks]
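As a reminder of the layout that part (a) refers to, the sketch below builds a pinhole calibration matrix from focal lengths, principal point and skew. The helper name `calibration_matrix` is hypothetical, and the sketch assumes the usual convention K = [[fx, s, cx], [0, fy, cy], [0, 0, 1]] with the "camera centre projected to image" read as the principal point.

```python
def calibration_matrix(fx, fy, cx, cy, skew=0.0):
    """Return the 3x3 intrinsic matrix K as a list of rows.

    fx, fy : focal lengths in pixels along the image x and y axes
    cx, cy : principal point in pixels
    skew   : skew term (0 for cameras with no skew)
    """
    return [
        [fx, skew, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]


# Values taken from part (a); both cameras share the same focal lengths.
K1 = calibration_matrix(500.0, 375.0, 249.0, 249.0)
K2 = calibration_matrix(500.0, 375.0, 251.0, 252.0)

for name, K in (("K1", K1), ("K2", K2)):
    print(name)
    for row in K:
        print("  ", row)
```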

(b) Suppose that a 3D world coordinate system ((X,Y,Z) coordinates as in the diagram below
from the lecture notes) is aligned with the camera coordinate system of C1. More specifically,
the world origin is at the camera centre of C1, the Z axis is aligned with the optical
(principal) axis, and the X and Y world axes are parallel to the x and y axes of the image
of C1. Write down the matrix K[R|t] that defines the projection of a point in the world
coordinate system to the image of C1. (Hint: please only write down the final 3x4 matrix.)
[3 marks]

(c) Suppose that the scene contains a point, P1, that lies at (39, 35, 100) in the world
coordinate system defined above. Note that points in the world coordinate system are
measured in cm. What location (to the nearest pixel) will P1 map to in the image of
C1? [2 marks]
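The projection asked for in parts (b) and (c) can be sketched in a few lines. This is an illustration under stated assumptions, not a model answer: it assumes R = I and t = 0 for C1 (world frame aligned with the camera), so that the projection matrix is K1[I|0], and the names `project`, `P_C1` and `point_P1` are hypothetical.

```python
def project(P, X):
    """Project a 3D point X = (X, Y, Z) with a 3x4 matrix P.

    Returns the inhomogeneous pixel coordinates (u, v).
    """
    Xh = list(X) + [1.0]  # homogeneous world point
    u, v, w = (sum(p * x for p, x in zip(row, Xh)) for row in P)
    return u / w, v / w


# ASSUMPTION: C1 sits at the world origin with R = I, so P = K1 [I | 0].
P_C1 = [
    [500.0, 0.0, 249.0, 0.0],
    [0.0, 375.0, 249.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

point_P1 = (39.0, 35.0, 100.0)  # world coordinates in cm, from part (c)
u, v = project(P_C1, point_P1)
pixel = (round(u), round(v))
print(u, v, pixel)  # 444.0 380.25 (444, 380)
```

Because the division by the third homogeneous coordinate cancels the length unit, the result is the same whether the point is expressed in cm or m, as long as all three coordinates use the same unit.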


(d) Suppose that, with respect to the world coordinate system aligned with camera C1,
camera C2 starts out aligned with C1 and is then rotated by 45 degrees about its
vertical axis (Y axis), as shown below. Subsequently the centre of C2 is translated
by 0.2 m to the left of C1 (along the X axis of C1), then moved forward by 0.2 m parallel
to the optical axis of C1.

Write down the matrix K[R|t] that defines the projection of points in the world system (i.e.,
the coordinate system of C1) to the image of C2. (Hint: please only write down the final
3x4 matrix.) [3 marks]
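The relationship between a camera's pose and its [R|t] can be sketched as follows. Several choices here are assumptions rather than facts from the question: the rotation is taken as a positive right-handed rotation about Y (the referenced figure fixes the actual direction), "left" is taken as the negative X direction, and 0.2 m is converted to 20 cm to match the units of part (c). All function names are hypothetical.

```python
import math


def rot_y(theta):
    """Right-handed rotation by theta (radians) about the Y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def projection_matrix(K, R_cam_to_world, centre):
    """Build P = K [R | t] from the camera orientation and centre.

    R_cam_to_world describes how the camera was rotated in the world;
    the extrinsic rotation is its transpose, and t = -R C.
    """
    R = transpose(R_cam_to_world)
    t = [-x for x in matvec(R, centre)]
    Rt = [row + [ti] for row, ti in zip(R, t)]
    return matmul(K, Rt), R, t


K2 = [[500.0, 0.0, 251.0], [0.0, 375.0, 252.0], [0.0, 0.0, 1.0]]
theta = math.radians(45.0)   # ASSUMPTION: sign of the rotation
C2 = [-20.0, 0.0, 20.0]      # ASSUMPTION: left = -X, forward = +Z, in cm
P2, R2, t2 = projection_matrix(K2, rot_y(theta), C2)

for row in P2:
    print([round(x, 3) for x in row])
```

A quick sanity check on any [R|t] built this way is that the camera centre maps to the zero vector, since R C + t = 0 by construction.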

(e) What is the location (to the nearest pixel) that P1 maps to in the image of Camera C2?
(Hint: Please write down only the final result.) [2 marks]

(f) Define the term epipole. [2 marks]


(g) For camera C1, there is an epipole (or epipolar point) that relates to camera C2. For this
two-camera structure-from-motion setup, what is the position in camera C1 of the
epipole relating to camera C2? (Hint: it is a point in the image coordinates of
camera C1.) [2 marks]
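One way to compute an epipole numerically is to project the other camera's centre into the image, which the sketch below does for C1. It assumes C1's projection matrix is K1[I|0] and that C2's centre lies at (-20, 0, 20) cm in the world frame (0.2 m read as 20 cm, "left" read as negative X); both are interpretations of the question, and the names are hypothetical.

```python
def project(P, X):
    """Project a 3D point X with a 3x4 matrix P; return pixel (u, v)."""
    Xh = list(X) + [1.0]
    u, v, w = (sum(p * x for p, x in zip(row, Xh)) for row in P)
    return u / w, v / w


# ASSUMPTION: C1 at the world origin with R = I, so P = K1 [I | 0].
P_C1 = [
    [500.0, 0.0, 249.0, 0.0],
    [0.0, 375.0, 249.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# ASSUMPTION: centre of C2 in world coordinates (cm), left = -X, forward = +Z.
C2_centre = (-20.0, 0.0, 20.0)

epipole_in_C1 = project(P_C1, C2_centre)
print(epipole_in_C1)  # (-251.0, 249.0) under the assumptions above
```

Note that the epipole in C1 depends only on where C2's centre is, not on how C2 is rotated, and that it may fall outside the 512x512 image.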

(h) Suppose a point P2 appears in camera C1 at image location (x1, y1) and in camera C2
at image location (x2, y2). How would you find the world coordinates of point P2? [4
marks]
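Triangulation from two calibrated views can be sketched with the midpoint method: back-project a ray through each pixel and take the midpoint of the shortest segment joining the two rays. This is only one of several possible approaches (a linear DLT solve is another), and the camera pose used for C2 below is an assumed example, not a value confirmed by the question. All names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def backproject(K, R, pixel):
    """World-frame direction of the ray through `pixel` (zero-skew K)."""
    d_cam = [(pixel[0] - K[0][2]) / K[0][0], (pixel[1] - K[1][2]) / K[1][1], 1.0]
    # R maps world to camera, so R^T maps camera directions back to the world.
    return [sum(R[r][c] * d_cam[r] for r in range(3)) for c in range(3)]


def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + s*d1 and o2 + t*d2."""
    w = [a - b for a, b in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth is undetermined")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * x for o, x in zip(o1, d1)]
    p2 = [o + t * x for o, x in zip(o2, d2)]
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]


def project(K, R, centre, X):
    """Pinhole projection of world point X for a camera at `centre`."""
    Xc = [dot(row, [x - c for x, c in zip(X, centre)]) for row in R]
    return (K[0][0] * Xc[0] / Xc[2] + K[0][2], K[1][1] * Xc[1] / Xc[2] + K[1][2])


# Example set-up (ASSUMED pose for C2: 45 degrees about Y, centre in cm).
K1 = [[500.0, 0.0, 249.0], [0.0, 375.0, 249.0], [0.0, 0.0, 1.0]]
K2 = [[500.0, 0.0, 251.0], [0.0, 375.0, 252.0], [0.0, 0.0, 1.0]]
R1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c45 = s45 = math.sqrt(0.5)
R2 = [[c45, 0.0, -s45], [0.0, 1.0, 0.0], [s45, 0.0, c45]]  # world-to-camera
C1, C2 = [0.0, 0.0, 0.0], [-20.0, 0.0, 20.0]

truth = [39.0, 35.0, 100.0]
pix1 = project(K1, R1, C1, truth)
pix2 = project(K2, R2, C2, truth)
estimate = triangulate_midpoint(C1, backproject(K1, R1, pix1),
                                C2, backproject(K2, R2, pix2))
print([round(x, 6) for x in estimate])
```

With noise-free pixels the two rays intersect and the midpoint coincides with the true point; with real measurements the rays are generally skew, which is why a midpoint or least-squares estimate is used.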


Q2: (7 marks) [Shape-from-X, Stereo]

(a) Shape-from-Shading approaches predict the brightness of an image pixel. Given a point
light source at infinity (distant light source), write down the equation that defines the
brightness at an image pixel, assuming that the camera views a Lambertian surface.
Please also define the terms of the equation. [2 marks]

(b) Suppose that we have used some other method to determine the brightness of the lighting,
its direction and the reflectance properties of the surface in the above scenario, but we
only have intensity information about this particular pixel for this surface. What can we
say about the surface orientation? [2 marks]
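The single-pixel ambiguity that part (b) is probing can be illustrated numerically. The sketch below assumes the common Lambertian model, intensity = albedo x source strength x max(0, n . s), with unit normal n and unit light direction s; the function name, the albedo of 0.8 and the 30 degree angle are illustrative assumptions.

```python
import math


def lambertian_intensity(albedo, source_strength, normal, light_dir):
    """Lambertian brightness for unit vectors `normal` and `light_dir`."""
    n_dot_s = sum(n * s for n, s in zip(normal, light_dir))
    return albedo * source_strength * max(0.0, n_dot_s)


light_dir = (0.0, 0.0, 1.0)   # distant light along +Z (assumed for the demo)
angle = math.radians(30.0)    # angle between each normal and the light

intensities = []
for azimuth_deg in (0, 90, 180, 270):
    phi = math.radians(azimuth_deg)
    # Unit normals lying on a cone of half-angle 30 degrees around the light.
    normal = (math.sin(angle) * math.cos(phi),
              math.sin(angle) * math.sin(phi),
              math.cos(angle))
    intensities.append(lambertian_intensity(0.8, 1.0, normal, light_dir))

print(intensities)
```

All four normals produce the same brightness, so one intensity measurement constrains only the angle between the normal and the light direction, not the normal itself.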

(c) The images (a and b) shown below are the left, and the right image of an ideal stereo
pair, taken with two identical cameras (A and B) mounted at the same horizontal level
and with their optical axes parallel. [3 marks]

Draw a plan view (i.e., a top-down bird's-eye view) of the scene showing roughly what
the spatial arrangement of the three objects is. Only relative (rather than accurate)
positions are required.


Q3: (8 marks) [Deep neural network]

Given below is a single node in a neural network. Suppose that d is 4, x={2,1,2,3},
w={0.3,0.4,0.1,-0.4}, b=0.1, and that the activation function is a standard ReLU, that is,
f(x)=max(0,x), where x is the input to the activation function.

(a) What is the output of this node? [2 marks]
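The computation performed by such a node can be sketched as a weighted sum plus bias followed by the ReLU. The sketch assumes the node computes ReLU(w . x + b), which is the standard form but is not confirmed by the missing figure; the function names are hypothetical.

```python
def relu(z):
    """Standard ReLU: max(0, z)."""
    return max(0.0, z)


def node_output(x, w, b):
    """Output of a single neuron, assumed to compute relu(w . x + b)."""
    pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return relu(pre_activation)


x = [2.0, 1.0, 2.0, 3.0]
w = [0.3, 0.4, 0.1, -0.4]
b = 0.1

# 0.6 + 0.4 + 0.2 - 1.2 + 0.1 = 0.1 (up to floating-point rounding)
output = node_output(x, w, b)
print(round(output, 6))  # 0.1
```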

(b) Describe the difference between recognition and detection in terms of how you would
use a Deep Convolutional Network to solve each problem. [2 marks]

(c) Two cascaded 3x3 layers and a single 5x5 layer result in the same number of pixels in the
input image impacting the result. Why might you prefer one representation over the
other? [2 marks]
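The receptive-field claim in part (c), and one quantity that differs between the two designs, can be checked with a short calculation. The sketch assumes stride 1, no dilation, the same number of input and output channels in every layer, and no bias terms; the channel count of 64 is an arbitrary illustrative choice.

```python
def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1, undilated convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf


def conv_weight_count(kernel_sizes, channels):
    """Number of weights (biases ignored) with `channels` in and out per layer."""
    return sum(k * k * channels * channels for k in kernel_sizes)


channels = 64  # illustrative assumption
for layout in ([3, 3], [5]):
    print(layout,
          "receptive field:", receptive_field(layout),
          "weights:", conv_weight_count(layout, channels))
```

Under these assumptions both layouts see a 5x5 input window, while the stacked 3x3 version uses 18 weights per channel pair instead of 25 and allows an extra non-linearity between the two layers.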

Q4: (2 marks) (questions with short answers) Given a dataset that consists of images
of the Eiffel Tower, your task is to learn a classifier to detect the Eiffel Tower in new
images. You implement PCA to reduce the dimensionality of your data, but find that
your performance in detecting the Eiffel Tower significantly drops in comparison to
your method on the original input data. Samples of your input training images are given
in the following figures. Why is the performance suffering? [Hint: answer in two
sentences.]

Figure 1. Images in the dataset

Q5: (10 marks) [algorithm design] Turn your phone into a GPS in an art museum or a
library. GPS usually does not work well in an indoor environment. The goal of
designing this algorithm is to localize your position by taking a few images around you
in the museum. Please briefly describe the key steps of your method.

Localize yourself

======= END of ALL QUESTIONS in the EXAM ===========
