Module 1: Foundations
Aleksandar Ignjatovic (K17 504)
Course Admins: cs3121@cse.unsw.edu.au
School of Computer Science and Engineering
UNSW Sydney
Term 2, 2024
Module Learning Outcomes
Identify the key features of an algorithm
Solve problems by creatively applying standard data structures
and searching and sorting algorithms
Communicate algorithmic ideas at different abstraction levels
Evaluate the efficiency of algorithms and justify their
correctness
Apply the LaTeX typesetting system to produce high-quality technical documents
Table of Contents
1. Time Complexity
2. Revision of Data Structures and Algorithms
3. Proofs
4. Puzzle
Fast and slow algorithms
You should be familiar with true-ish statements such as:
Heap sort is faster than bubble sort.
Linear search is slower than binary search.
We would like to make such statements more precise.
We also want to understand when they are wrong, and why it
matters.
Best and worst case performance
The statements on the previous slide are most commonly made when comparing the worst case performance of these algorithms.
They are actually false in the best case!
In most problems in this course, we are concerned with worst
case performance, so as to be robust to maliciously created
(or simply unlucky) instances.
We will hardly ever discuss the best case.
Average case performance
In some circumstances, one might accept occasional poor
performance as a trade-off for good average performance over
all possible inputs (or a large sample of them).
Analysing the average case or expected performance of an
algorithm requires probabilistic methods and is beyond the
scope of this course, except where we rely purely on results of
this type from prior courses (e.g. hash table operations,
quicksort).
Rates of growth
We need a way to compare two functions, in this case
representing the worst case runtime of each algorithm.
A natural starting point is to compare function values directly.
Question
If f(100) > g(100), then does f represent a greater running time, i.e. a slower algorithm?
Answer
Not necessarily! n = 100 might be an outlier, or too small to appreciate the efficiencies of algorithm f. We care more about which algorithm scales better.
Rates of growth
We prefer to talk in terms of asymptotics, i.e. long-run
behaviour.
For example, if the size of the input doubles, the function
value could (approximately) double, quadruple, etc.
A function which quadruples will eventually exceed a function
which doubles, regardless of the values for small inputs.
We’d like to categorise functions (i.e. runtimes, and therefore
algorithms) by their asymptotic rate of growth.
We’ll primarily talk about the worst-case performance of
algorithms, but the same method of analysis could be applied
to the best case or average case performance of an algorithm.
Big-Oh notation
Definition
We say f(n) = O(g(n)) if, for large enough n, f(n) is at most a constant multiple of g(n).
g(n) is said to be an asymptotic upper bound for f(n).
This means that the rate of growth of function f is no greater than that of function g.
An algorithm whose running time is f(n) scales at least as well as one whose running time is g(n).
It’s true-ish to say that the former algorithm is ‘at least as fast as’ the latter.
Big-Oh notation
Useful to (over-)estimate the complexity of a particular
algorithm.
For uncomplicated functions, such as those that typically arise
as the running time of an algorithm, we usually only have to
consider the dominant term.
Big-Oh notation: Examples
Example 1
Let f(n) = 100n. Then f(n) = O(n), because f(n) is at most 100 times n for large n.
Example 2
Let f(n) = 2n + 7. Then f(n) = O(n²), because f(n) is at most 1 times n² for large n. Note that f(n) = O(n) is also true in this case.
Example 3
Let f(n) = 0.001n³. Then f(n) ≠ O(n²), because for any constant multiple of n², f(n) will eventually exceed it.
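To see Example 3 concretely, here is a short script (a sketch, not from the slides) checking that 0.001n³ exceeds c·n² once n > 1000c, no matter how large the constant c is:

    # Sketch: 0.001 * n**3 > c * n**2 exactly when n > 1000 * c,
    # so no constant multiple of n^2 can bound 0.001 * n^3 forever.
    for c in [1, 10, 100]:
        n = 1000 * c + 1  # the first integer past the crossover point
        assert 0.001 * n**3 > c * n**2
        print(f"0.001·n³ first exceeds {c}·n² at n = {n}")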
Big-Oh notation: Examples
Recall from prior courses that:
inserting into a binary search tree takes O(h) time in the worst case, where h is the height of the tree, and
insertion sort runs in O(n²) time in the worst case, but O(n) in the best case.
Note that these statements are true regardless of:
the details of how the algorithm is implemented, and
the hardware used to execute the algorithm.
Both of these issues change the actual running time, but the dominant term of the running time will still be a constant times h, n² or n respectively.
Big-Oh notation
Question
The definition states that we only care about the function values for “large enough” n. How large is “large enough”?
Put differently, how small is “small enough” that it can be safely ignored in assessing whether the definition is satisfied?
Answer
It doesn’t matter how f(1) compares to g(1), or even how f(1,000,000) compares to g(1,000,000). We only care that f(n) is bounded above by a multiple of g(n) eventually.
Big-Oh notation
Disclaimer
Of course, how f(1,000,000) compares to g(1,000,000) is also important.
When choosing algorithms for a particular application with inputs of size at most 1,000,000, the behaviour beyond this size doesn’t matter at all!
Asymptotic analysis is one tool by which to compare algorithms. It
is not the only consideration; the actual input sizes used and the
constant factors hidden by the O(·) should also be taken into
account.
Big-Omega notation
Definition
We say f(n) = Ω(g(n)) if, for large enough n, f(n) is at least a constant multiple of g(n).
g(n) is said to be an asymptotic lower bound for f(n).
This means that the rate of growth of function f is no less than that of function g.
An algorithm whose running time is f(n) scales at least as badly as one whose running time is g(n).
It’s true-ish to say that the former algorithm is ‘no faster than’ the latter.
Big-Omega notation
Useful to (under-)estimate the complexity of any algorithm
solving a particular problem.
For example, finding the maximum element of an unsorted
array takes Ω(n) time, because you must consider every
element.
Once again, for every function that we care about, only the
dominant term will be relevant.
Challenge
We didn’t present the formal definitions of O(·) and Ω(·), but they
can be found in either textbook.
Using these formal definitions, prove that if f(n) = O(g(n)) then g(n) = Ω(f(n)).
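For reference, the usual formal statement (as in CLRS and other standard texts, stated here as a supplement to the slides) is: f(n) = O(g(n)) if and only if there exist constants c > 0 and N such that f(n) ≤ c · g(n) for all n ≥ N. Likewise, f(n) = Ω(g(n)) if and only if there exist constants c > 0 and N such that f(n) ≥ c · g(n) for all n ≥ N.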
Big-Theta notation
Definition
We say f(n) = Θ(g(n)) if f(n) = O(g(n)) and f(n) = Ω(g(n)).
f(n) and g(n) are said to have the same asymptotic growth rate.
An algorithm whose running time is f(n) scales as well as one whose running time is g(n).
It’s true-ish to say that the former algorithm is ‘as fast as’ the latter.
Big-Theta notation
Question
You’ve previously seen statements such as
bubble sort runs in O(n²) time in the worst case.
Should these statements be written using Θ(·) instead of O(·)?
Answer
They can, but they don’t have to be. The statements
bubble sort runs in O(n²) time in the worst case
and
bubble sort runs in Θ(n²) time in the worst case
are both true: they claim that the worst case running time is at most quadratic and exactly quadratic respectively.
Big-Theta notation
Answer (continued)
The Θ(·) statement conveys slightly more information than the
O(·) statement. However in most situations we just want to be
sure that the running time hasn’t been underestimated, so O(·) is
the important part.
Sum property
Fact
If f₁ = O(g₁) and f₂ = O(g₂), then f₁ + f₂ = O(g₁ + g₂).
The last term O(g₁ + g₂) is often rewritten as O(max(g₁, g₂)), since g₁ + g₂ ≤ 2·max(g₁, g₂).
The same property applies if O is replaced by Ω or Θ.
Sum property
This property justifies ignoring non-dominant terms: if f₂ has a lower asymptotic bound than f₁, then the bound on f₁ also applies to f₁ + f₂.
For example, if f₂ is linear but f₁ is quadratic, then f₁ + f₂ is also quadratic.
This is useful for analysing algorithms that have two or more stages executed sequentially.
If f₁ is the running time of the first stage and f₂ of the second stage, then we can bound each stage and add the bounds, or simply take the most ‘expensive’ stage.
Product property
Fact
If f₁ = O(g₁) and f₂ = O(g₂), then f₁ · f₂ = O(g₁ · g₂).
In particular, if f = O(g) and λ is a constant (i.e. λ = O(1)), then λ · f = O(g) also.
The same property applies if O is replaced by Ω or Θ.
This is useful for analysing algorithms that have two or more nested parts.
If each execution of the inner part takes f₂ time, and it is executed f₁ times, then we can bound each part and multiply the bounds.
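As an illustration (a sketch, not from the slides), the following Python function has two sequential stages handled by the sum property, with the second stage’s nested loops handled by the product property:

    # Sketch: analysing sequential stages and nested loops.
    def stages_and_nesting(arr):
        n = len(arr)
        # Stage 1: one pass over the input, taking O(n) time.
        total = 0
        for x in arr:
            total += x
        # Stage 2: the O(n) inner loop is executed n times, so by the
        # product property this stage takes O(n · n) = O(n²) time.
        pairs = 0
        for i in range(n):
            for j in range(n):
                if arr[i] + arr[j] == total:
                    pairs += 1
        # By the sum property, O(n) + O(n²) = O(n²) overall: the most
        # 'expensive' stage determines the bound.
        return pairs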
Table of Contents
1. Time Complexity
2. Revision of Data Structures and Algorithms
3. Proofs
4. Puzzle
(Binary) Heaps
Store items in a complete binary tree, with every parent comparing ≥ all its children.
This is a max heap; replace ≥ with ≤ for a min heap.
Used to implement a priority queue.
Operations
Build heap: O(n)
Find maximum: O(1)
Delete maximum: O(log n)
Insert: O(log n)
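For instance, Python’s heapq module (a min heap) exposes exactly these operations; the snippet below is an illustrative sketch, not part of the course materials:

    # Sketch: priority queue operations via Python's heapq (a min heap).
    import heapq

    items = [31, 91, 25, 79, 36, 62, 10, 23, 47]
    heapq.heapify(items)        # build heap: O(n)
    smallest = items[0]         # find minimum: O(1)
    heapq.heappush(items, 5)    # insert: O(log n)
    m = heapq.heappop(items)    # delete minimum: O(log n)
    # For a max heap, a common trick is to push negated keys.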
(Binary) Heaps
[Figure: a max heap drawn as a complete binary tree. The root 91 has children 79 and 36; 79 has children 62 and 31; 62 has children 23 and 47; 36 has children 25 and 10.]
The same heap stored as an array, where A[1] is the root and the children of A[i] are A[2i] and A[2i + 1]:
i       1   2   3   4   5   6   7   8   9
A[i]   91  79  36  62  31  25  10  23  47
Binary search trees
Store (comparable) keys or key-value pairs in a binary tree, where each node has at most two children, designated as left and right.
Each node’s key compares greater than all keys in its left subtree, and less than all keys in its right subtree.
Operations
Let h be the height of the tree, that is, the length of the longest path from the root to a leaf.
Search: O(h)
Insert/delete: O(h)
Binary search trees
[Figure: a binary search tree. The root 78 has left child 33 and right child 98; 33 has children 11 and 65; 65 has children 50 and 74; 98 has left child 80, which has right child 87.]
Self-balancing binary search trees
In the best case, h ≈ log₂ n. Such trees are said to be balanced.
In the worst case, h ≈ n, e.g. if keys were inserted in
increasing order.
Fortunately, there are several ways to make a self-balancing
binary search tree (e.g. B-tree, AVL tree, red-black tree).
Each of these maintains certain invariants through local restructuring (such as rotations), in order to guarantee that h = O(log n) and therefore that all tree operations run in O(log n) time.
Red-black trees are detailed in CLRS, but in this course it is
sufficient to write “self-balancing binary search tree” without
specifying any particular scheme.
Frequency tables
Problem
You are given an array A of n − 1 elements, containing each number from 1 to n exactly once except for one number, which is missing. Find the missing number.
Idea
Record whether you’ve seen each possible value.
This leads to a linear time, linear space algorithm.
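A sketch of this idea in Python (names are illustrative, not from the slides):

    # Sketch: linear time, linear space solution with a 'seen' table.
    def missing_number(A, n):
        seen = [False] * (n + 1)  # seen[v] records whether value v appeared
        for value in A:
            seen[value] = True
        for v in range(1, n + 1):
            if not seen[v]:
                return v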
Challenge
How could you solve the problem in linear time and constant
space?
Frequency tables
To count the occurrences of integers from 1 to m, simply make a zero-initialised array of frequencies and increment the corresponding entry every time a value is encountered.
The real question is how to count the occurrences of other kinds of items, such as large integers, strings, and so on.
Try to do the same thing: map these items to integers from 1 to m, and count the occurrences of the mapped values.
Hash tables
Store values indexed by keys.
A hash function maps keys to indices in a fixed-size table.
Ideally, no two keys map to the same index.
Operations (expected)
Search for the value associated to a given key: O(1)
Update the value associated to a given key: O(1)
Insert/delete: O(1)
Hash tables
A situation where two (or more) keys have the same hash
value is called a collision.
When mapping from a large key space to a small table, it’s
impossible to completely avoid collisions.
There are several ways to manage collisions – for example,
separate chaining stores a linked list of all colliding key-value
pairs at each index of the hash table.
Operations (worst case)
Search for the value associated to a given key: O(n)
Update the value associated to a given key: O(n)
Insert/delete: O(n)
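A minimal sketch of separate chaining (illustrative only; a real implementation would also resize the table as it fills):

    # Sketch: hash table with separate chaining; Python lists stand in
    # for the linked lists of colliding key-value pairs.
    class ChainedHashTable:
        def __init__(self, size=1024):
            self.buckets = [[] for _ in range(size)]

        def _bucket(self, key):
            return self.buckets[hash(key) % len(self.buckets)]

        def insert(self, key, value):
            bucket = self._bucket(key)
            for i, (k, _) in enumerate(bucket):
                if k == key:              # key already present: update it
                    bucket[i] = (key, value)
                    return
            bucket.append((key, value))   # otherwise chain a new pair

        def search(self, key):
            for k, v in self._bucket(key):
                if k == key:
                    return v
            return None                   # key absent

With a good hash function the chains stay short on average, giving the expected O(1) bounds; if all keys collide into one bucket, every operation degrades to the O(n) worst case above.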
Table of Contents
1. Time Complexity
2. Revision of Data Structures and Algorithms
3. Proofs
4. Puzzle
Proof by induction
Used to prove a sequence of propositions P₁, P₂, P₃, ….
If the first proposition is true, and each one implies the next, then they are all true.
If we make an infinite line of dominoes so that each one can knock over the next, and we tip over the first domino, then all the dominoes will fall.
Often we use a fast induction of the form
[P(0) ∧ ∀n (P(⌊n/2⌋) → P(n))] → ∀n P(n).
Proof by contradiction
Used to prove a proposition P by considering the alternative.
If the assumption ¬P leads to a contradiction (Q ∧ ¬Q for some other proposition Q), then ¬P is impossible, so P holds.
Example: double-move chess, where each side moves twice on
their turn.
With perfect play, White can at least draw.
Suppose otherwise. Then Black must have a winning strategy
that works no matter what White does.
White could waste their first turn by moving a knight out and
back, then implement the winning strategy themselves.
So Black’s strategy is not winning after all!
Reasoning about algorithms
Whenever we present an algorithm, we must justify its
correctness and efficiency.
We need to provide reasoning for the claims that:
it always gives the right answer, and
it runs in the time complexity we claim.
Straightforward claims might need only a sentence of
reasoning, but for a less obvious claim the burden is higher.
Example: mergesort
Algorithm
We recursively apply the following algorithm to the subarray A[ℓ..r].
If ℓ = r, i.e. the subarray has only one element, we simply exit. This is the base case of our recursion.
Otherwise:
first define m = ⌊(ℓ + r)/2⌋, the midpoint of the subarray,
then apply the algorithm recursively to A[ℓ..m] and A[m + 1..r], and
finally merge the sorted subarrays A[ℓ..m] and A[m + 1..r].
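A sketch of this algorithm in Python (0-indexed and using a temporary merge buffer; illustrative rather than official course code):

    # Sketch: mergesort on the subarray A[lo..hi], inclusive.
    def merge_sort(A, lo, hi):
        if lo >= hi:                    # base case: at most one element
            return
        mid = (lo + hi) // 2            # midpoint of the subarray
        merge_sort(A, lo, mid)          # recursively sort the left half
        merge_sort(A, mid + 1, hi)      # recursively sort the right half
        merged, i, j = [], lo, mid + 1
        while i <= mid and j <= hi:     # merge the two sorted halves
            if A[i] <= A[j]:
                merged.append(A[i]); i += 1
            else:
                merged.append(A[j]); j += 1
        merged.extend(A[i:mid + 1])     # append whichever half remains
        merged.extend(A[j:hi + 1])
        A[lo:hi + 1] = merged

Calling merge_sort(A, 0, len(A) - 1) sorts A in place.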
Example: mergesort
The depth of the recursion in mergesort is about log₂ n.
On each level of recursion, merging all the intermediate arrays takes O(n) steps in total.
Thus, mergesort always terminates, and in fact it terminates in O(n log₂ n) steps.
Merging two sorted arrays always produces a sorted array; thus, the output of mergesort will be a sorted array.
This is actually a proof by fast induction on the size of the
array!
Where we can communicate the idea succinctly, we should do
so. An argument can be rigorous without notation for its own
sake.
The role of proofs in algorithm design
Sometimes it is not easy to gather from the description of an
algorithm:
that it terminates, rather than entering an infinite loop,
how many steps it takes to terminate, and whether this is
acceptably small, or
whether the answer it reports is actually correct.
We always need to justify the claims we make about
algorithms, and occasionally that requires a detailed proof.
Stable Matching Problem
Suppose you have hired n frontend engineers and n backend
engineers to build n apps in pairs.
Every frontend engineer submits a list of preferences, which
ranks all the backend engineers from their most preferred
partner to least preferred, and vice versa.
We’d like to make n separate pairs in order to “make everyone
happy” in some sense.
Stable Matching Problem
Definition
A matching is an assignment of engineers to pairs, each consisting of one frontend and one backend engineer, so that no-one belongs to two or more pairs.
Definition
A perfect matching is a matching involving all frontend engineers
and all backend engineers, i.e. where everyone belongs to exactly
one pair.
Our task is to design a perfect matching that somehow satisfies the frontend and backend engineers, with regard to their preferences.
Stable Matching Problem
Question
Can we assign every engineer their first preference partner?
Answer
Not necessarily!
We’ll have to lower our expectations. Let’s aim for an
allocation that doesn’t make anyone too unhappy.
If a frontend engineer and a backend engineer were both very
unhappy with their allocated partner, they might quit their
partners and go build an app together.
Stable Matching Problem
Definition
A stable matching is a perfect matching in which there are no two pairs (f₁, b₁) and (f₂, b₂) such that:
f₁ prefers b₂ to b₁, and
b₂ prefers f₁ to f₂.
Stable matchings are self-enforcing: no frontend engineer f₁ and backend engineer b₂ who are not matched to each other will quit their partners to pair up.
We will design an algorithm which produces a stable matching.
Stable Matching Problem: Examples
In these examples, we’ll label the engineers for ease of reference.
Frontend: Flora and Fred
Backend: Betty and Bilal
In each example, all four engineers will rank their two counterparts
from most preferred to least preferred.
Stable Matching Problem: Example 1
Preferences:
Flora: Betty, Bilal    Betty: Flora, Fred
Fred: Betty, Bilal     Bilal: Flora, Fred
Of the two possible perfect matchings, pairing Flora with Betty and Fred with Bilal is stable, while pairing Flora with Bilal and Fred with Betty is not: Flora and Betty prefer each other to their assigned partners.
Stable Matching Problem: Example 2
Preferences:
Flora: Betty, Bilal    Betty: Fred, Flora
Fred: Bilal, Betty     Bilal: Flora, Fred
Here both perfect matchings are stable: pairing Flora with Betty and Fred with Bilal gives both frontend engineers their first choice, while pairing Flora with Bilal and Fred with Betty gives both backend engineers theirs.
Stable Matching Problem: Gale-Shapley algorithm
Question
Given n frontend and n backend engineers, how many ways are there to match them, without regard for preferences?
Answer
n! ≈ (n/e)ⁿ, which is more than exponentially many in n (e ≈ 2.71).
Question
Is it true that for every possible collection of n lists of preferences
provided by all frontend engineers, and n lists of preferences
provided by all backend engineers, a stable matching always exists?
Answer
Perhaps surprisingly, yes!
Stable Matching Problem: Gale-Shapley algorithm
Question
Can we find a stable matching in a reasonable amount of time?
Answer
Yes, using the Gale-Shapley algorithm.
Produces pairs in stages, with possible revisions.
Frontend engineers (chosen in arbitrary order) pitch to backend engineers, who decide whether to accept each pitch.
A frontend engineer who is not currently in a pair will be called solo.
Start with all frontend engineers solo.
Stable Matching Problem: Gale-Shapley algorithm
While there is a solo frontend engineer who has not pitched to all backend engineers, pick any such solo frontend engineer, say Femi.
Femi pitches to the highest-ranking backend engineer on his list to whom he has not yet pitched, say Brad.
If Brad is not yet paired, he accepts Femi’s pitch (at least tentatively).
Otherwise, if Brad is already paired with someone else, say Frida:
if Brad prefers Femi to Frida, he turns down Frida and makes a new pair with Femi (making Frida solo again);
otherwise, Brad turns down Femi and stays with Frida.
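A sketch of the algorithm in Python (assuming each preference list ranks the other side’s engineers, numbered 0 to n − 1, from most to least preferred; names are illustrative):

    # Sketch: Gale-Shapley with frontend engineers pitching.
    def gale_shapley(front_prefs, back_prefs):
        n = len(front_prefs)
        # rank[b][f]: position of f in b's list (lower is better).
        rank = [{f: i for i, f in enumerate(prefs)} for prefs in back_prefs]
        partner = [None] * n   # partner[b]: current partner of backend b
        next_pitch = [0] * n   # next preference each frontend will try
        solo = list(range(n))  # all frontend engineers start solo
        while solo:
            f = solo.pop()
            b = front_prefs[f][next_pitch[f]]    # best not yet pitched to
            next_pitch[f] += 1
            current = partner[b]
            if current is None:                  # b accepts his first pitch
                partner[b] = f
            elif rank[b][f] < rank[b][current]:  # b prefers f: switch
                partner[b] = f
                solo.append(current)             # b's old partner is solo again
            else:
                solo.append(f)                   # b turns f down
        return [(partner[b], b) for b in range(n)]

For Example 1’s preferences, gale_shapley([[0, 1], [0, 1]], [[0, 1], [0, 1]]) returns [(0, 0), (1, 1)], i.e. the stable matching pairing Flora with Betty and Fred with Bilal (labelling Flora, Fred, Betty, Bilal as 0, 1, 0, 1 respectively).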
Stable Matching Problem: Gale-Shapley algorithm
Claim 1
The algorithm terminates after ≤ n² rounds.
Proof
In every round of the algorithm, one frontend engineer pitches to one backend engineer.
Every frontend engineer can pitch to a particular backend engineer at most once.
Thus, every frontend engineer can make at most n pitches.
There are n frontend engineers, so in total they can make ≤ n² pitches.
Thus there can be no more than n² rounds.
Stable Matching Problem: Gale-Shapley algorithm
Claim 2
The algorithm produces a perfect matching, i.e. when the
algorithm terminates, the 2n engineers form n separate pairs.
Proof
Assume that the algorithm has terminated, but a frontend
engineer Felix is still solo.
This means that Felix has already pitched to all the backend
engineers.
A backend engineer is not paired only if no-one has pitched to
them. Since Felix has pitched to everyone, all n backend
engineers must be paired.
No frontend engineer can have more than one partner, so all n
frontend engineers must also be paired, including Felix.
This is a contradiction, completing the proof.
Stable Matching Problem: Gale-Shapley algorithm
Claim 3
The matching produced by the algorithm is stable.
Recall that each frontend engineer pitches in order from their
most preferred to their least preferred backend counterpart.
Therefore, the sequence of backend engineers paired with a
particular frontend engineer is in this order also.
Each backend engineer is initially not paired, then accepts the
first pitch made to them, and only ever switches to a different
frontend counterpart who they prefer to their current partner.
Thus, each backend engineer is paired with frontend engineers
in order from their least preferred to their most preferred.
Stable Matching Problem: Gale-Shapley algorithm
Proof
We will prove that the matching is stable using proof by contradiction.
Assume that the matching is not stable. Thus, there are two pairs,
Feng and Bina, and
Farah and Bayo,
such that:
Feng prefers Bayo over Bina, and
Bayo prefers Feng over Farah.
Stable Matching Problem: Gale-Shapley algorithm
Proof (continued)
Feng: …, Bayo, …, Bina, …
Bayo: …, Feng, …, Farah, …
Since Feng prefers Bayo over Bina, he must have pitched to Bayo before Bina.
Bayo must have either:
rejected Feng because he was already paired with a frontend engineer he prefers to Feng, or
accepted Feng, only to later rescind this and accept a pitch from someone he prefers to Feng.
Stable Matching Problem: Gale-Shapley algorithm
Proof (continued)
Feng: …, Bayo, …, Bina, …
Bayo: …, Feng, …, Farah, …
In both cases, Bayo would become paired with someone higher than Feng in his preferences.
From then on, Bayo only ever switches to frontend engineers he prefers even more. However, he ends up with Farah, whom he prefers less than Feng.
This is a contradiction, completing the proof.
Table of Contents
1. Time Complexity
2. Revision of Data Structures and Algorithms
3. Proofs
4. Puzzle
Puzzle
On a circular highway there are n petrol stations, unevenly spaced, each containing a different quantity of petrol. It is known that the total amount of petrol from all stations is exactly enough to go around the highway once.
You want to start at a petrol station, take all its fuel, and drive clockwise around the highway once. Each time you reach a petrol station, you will take all its fuel too.¹ The only problem is the possibility that you might run out of fuel before reaching the next petrol station!
Prove that there is always at least one starting petrol station from which you can make it all the way around the highway.
¹ Your petrol tank is large enough to hold all the fuel.
That’s All, Folks!!