CSC263 Winter 2022 Problem Set 1
Michelle Craig and Samar Sabie
Due: February 7, 2022 4pm ET on MarkUs
• Your problem sets are graded on both correctness and clarity of communication. Solutions that
are technically correct but poorly written will not receive full marks.
• Each problem set may be completed as an individual or in a group of two students. Partners do
not need to be registered in the same section of the course.
• Solutions must be typeset electronically, and submitted to MarkUs as a PDF with the filename
ps1.pdf. Handwritten submissions will receive a grade of zero.
• Your submitted file should not be larger than 9MB. Your file might exceed this limit if you use
a word processor like Microsoft Word to create a PDF; if it does, you should look into PDF
compression tools to make your PDF smaller, although please make sure that your PDF is still
legible before submitting!
• The work you submit must be that of your group; you may not use or copy from the work of
other groups, or from external sources like websites or textbooks. We recognize that although
it should not be necessary to do so, students may wish to read a different textbook or
reference material about course topics. This is not prohibited, but you must declare it. (See the
next bullet). All the information you need has been covered in lectures and/or in our textbook.
• On the front page of your submission, you must list every source of information you used to
complete this homework (other than your lecture notes, our textbook, or conversations from office
hours or lectures). For example, indicate clearly the name of every student from another group
with whom you had any discussions, the title and sections of every textbook you consulted
(excluding the course textbook), the source of every web document you used or website you
visited (other than our own CSC263 site.)
Question 1 [15 marks]
Consider the following Python code:
1. def find_position(A: List[object], target: object) -> int:
2. """Return the index of target in A or -1 if target not found"""
3. for i in range(len(A)):
4. if A[i] == target:
5. return i
6. return -1
7. def encode(L: List[int], M: List[int], k: int) -> int:
8. """Return an integer based on L, M and k
9. Precondition: len(M) >= len(L) """
10. result = 0
11. for i in range(len(L)):
12. if L[i] == k:
13. result += M[i]
14. else:
15. result += find_position(M, L[i])
16. return result
In this question you will perform a best-case, a worst-case, and two average-case analyses of method
encode. Count item comparisons and give exact counts before you simplify to an asymptotic
expression. You may assume that the python operation len() takes constant time and does not
involve any comparisons that need to be counted in your analyses.
(a) Provide a best-case analysis.
(b) Provide a worse-case analysis.
(c) For the first average-case analysis, assume that the elements of both L and M , and the value for
k, are each chosen independently and uniformly at random from the integers 0 through 9.
(d) For the second average-case analysis, assume that lists L and M are both length n and are
random permutations of the lists [0, 1, 2, ..., n − 1]. k is chosen uniformly at random from the
set {0, 1, 2, ..., n.} Notice that when k = n, it will not match any values in L or M since these
lists do not contain n.
Question 2 [6 marks]
This problem deals with the challenge of re-distributing Covid-19 vaccination doses to match need.
Suppose that there are a very large number of distribution centres (pharmacies, doctors’ offices, hospi-
tals, clinics, etc.) and each one has either a surplus of dosages or a shortage. The central office requests
that each centre reports their status, but many distribution centres are short staffed and so the reports
arrive over a period of days. During that time, the central office staff member wants to start to address
the imbalance. So they do the following: After some initial waiting period, the staff member finds the
centre with the largest surplus and the centre with the largest shortage. They contact these centres
and arrange for some number of dosages to be transferred from one to the other. (For the sake of
this problem, we aren’t worried about how much they decide to transfer.) Once this is arranged, they
repeat this process many times. For the repeats, there are two important constraints:
1. Although this may be unrealistic and non-optimal, once a distribution centre has participated in
a transfer (either as a supplier or a recipient), they are not considered for any future transfers.
2. More distribution centres may have filed their status reports during the time that has elapsed
while the staff member was processing the previous transfer. Each time a staff member starts to
plan a new transfer, they consider all the centres that have filed reports at that time (other than
those disregarded because of constraint 1.)
(a) Describe the ADT that can be used to represent this problem.
(b) Describe a data-structure to efficiently implement your ADT. For each operation, give a brief
explanation (in English sentences). Provide and justify the complexity of each operation.
Question 3 [9 marks]
We define a ExtractMin-Insert-Stable (EIS) heap as a min heap with no duplicate elements where the
result of calling ExtractMin and immediately re-inserting that same element is the original heap.
(a) Provide an example of an EIS min heap with seven elements. Display both the original heap
(which is of course also the final heap) and the intermediate heap (after the call to ExtractMin)
as trees.
(b) Provide an example of a min heap with seven distinct elements that is not an EIS heap. Display
the original heap, the intermediate heap and the final heap all as trees.
(c) Formally describe the relationship between the elements that must hold for a heap to be an EIS
min heap.
(d) Prove that your description in part (c) describes all EIS min heaps and does not hold for min
heaps that are not EIS.
Question 4 [15 marks]
Consider a complete binary-tree with 2k−1 nodes and no duplicate keys. The keys in the tree obey the
binary-search-tree ordering property. Suppose that you are searching the tree for the data associated
with key v which is in the tree.
(a) What is the best-case number of nodes which must be examined to find the data associated with
v? Briefly justify your answer.
(b) What is the worst-case number of nodes which must be examined to find the data associated
with v? Prove your answer.
(c) Calculate the average-case number of nodes which must be examined. Assume that key v is in
the tree and is equally likely to be in any location in the tree. Hint: Do a sanity check on your
final formula by calculating by hand the expected number of nodes to be examined for a small
complete binary tree (say height 2 or 3).
(d) Redo the average-case analysis, under a different assumption about the probability distribution
over possible locations for v. Now assume that there is a 40% chance that v is not in the tree
and a 20% chance that it is at the root of the tree. If neither of these are the case, v is equally
likely to be in any other (non-root) location in the tree.
Question 5 [5 marks]
Consider that you have a binary max heap that holds the consecutive integers from 1 to n inclusive.
If the value y is in the second level of the heap, i.e. y is a child of the root, what is the possible range
of n (expressed as a function of y)? Carefully justify your answer.