1
ECOM90009
Quantitative Methods for Business
Seminar Three - Probability and Discrete
Distributions
SSK eds 7 & 8
Sections 6.1 to 6.4, Sections 7.1 to 7.3, and Section 7.6.

2
SEMINAR THREE - OUTLINE
1. Introduction to probability: the basis for statistics.
2. Rules of probability.
3. Decision tree analysis.
4. Univariate and bivariate probability distributions.
5. Discrete random variables.
6. The binomial distribution.
The Excel file for Seminar 3 is Binomial_Dist_sem3.

3

Why do you need to know about Probability Based Models?

1. Managers must work in a world where the outcomes of their
actions are often not known in advance. If we have a list of
possible outcomes and their probabilities, we call this a
RISK type of situation. When we do not know the possible
outcomes, we have an UNCERTAIN type of situation.
We only look at the RISK type of situation.

2. We have to work with SAMPLE ESTIMATES /
STATISTICS and these can have a wide range of possible
values.
4
Random Variables and Experiments
A random experiment is a process that results in one of several possible
outcomes that cannot be predicted with certainty.

A random variable (RV) is a function or rule that assigns a numerical value
to each possible outcome of an experiment.

A random variable is discrete if it can only assume a countable number of
possible values, e.g. the number of telephone calls received in a given hour.

A random variable that can assume an uncountable number of values is
continuous, e.g. the time taken by a student to complete an exam.



5
Examples of Experiments
Experiment 1: flip a coin
Possible outcomes: head, tail
Experiment 2: Roll a die
Possible outcomes: 1, 2, 3, 4, 5, 6
Experiment 3:
Student response to “Overall, I had a very good
learning experience in this subject”
Possible outcomes:
strongly disagree, disagree, neither agree nor
disagree, agree, strongly agree

Sample Space: S = {O1, O2, ..., On}
The list of all possible outcomes Oi of the experiment (n possible outcomes).
Outcomes listed must be both mutually exclusive and exhaustive.
6
Probabilities
Probability of an Outcome:
The chance that an outcome Oi will occur, denoted by P(Oi),
where i = 1, 2, ..., n.

• P(Oi) = 0 means there is no chance of the outcome occurring.
• P(Oi) = 0.5 means there is a 50% chance of it occurring.
• P(Oi) = 1 means there is a 100% chance of it occurring.

Requirements of Probabilities:

(1) Probability of each outcome must lie between 0 and 1 inclusive,
i.e. 0 ≤ P(Oi) ≤ 1 for i = 1, 2, ..., n.
(2) Sum of probabilities of all possible outcomes must equal 1,
i.e. P(O1) + P(O2) + ... + P(On) = 1.
7
Assigning Probabilities - Three Ways
Classical approach - use mathematics and logic.
Example: if a fair die is rolled, P(3) = 1/6.

Relative Frequency approach - the long run relative frequency with which
the outcome has occurred in the past.
Example: if 153 out of 170 QMB students last semester liked the
subject, the probability that a randomly selected student likes the
subject equals 153 / 170 = 0.9.

Subjective approach - a personal evaluation (intuition) reflecting the degree
to which you believe the outcome will occur.
It is used in many practical situations when the classical and relative
frequency approaches cannot be used.

8
Probability of an Event
Event: any collection of one or more possible outcomes from an experiment.
Probability of an Event, denoted P(E):
– it equals the sum of probabilities assigned to outcomes Oi contained in E.

Example: if you draw one card from a deck of 52 standard playing cards,
what is the probability of drawing a red card?

P(red) = P(diamond or heart) = P(diamond) + P(heart) = 13/52 + 13/52 = 1/2

If all outcomes are equally likely, the probability of an event E occurring is:

P(E) = (number of outcomes in E) / (total number of possible outcomes in S)

9
Combining Events
The complement of event A (denoted Ā) is the set of all outcomes in S that
do not belong to A.

The probability of this complement occurring is P(Ā) = 1 − P(A)

The intersection of any two events A and B (denoted A ∩ B) is the joint
event consisting of all outcomes common to both A and B.

If there are no outcomes common to both A and B, then A and
B are called disjoint or mutually exclusive events.

The union of any two events A or B (denoted A ∪ B) is the compound event
consisting of all outcomes in A, in B or in both.
10
Univariate Example - Rolling a Die
Consider the outcome of rolling a single die once.
This is a univariate probability distribution - one variable.

Outcome roll a 1 roll a 2 roll a 3 roll a 4 roll a 5 roll a 6 Sum
Probability 1/6 1/6 1/6 1/6 1/6 1/6 1

What is the probability:
1. you roll a 5 or a 6? = 2 / 6 = 1 / 3
2. you roll an odd number? = 3 / 6 = 1 / 2
3. you roll a 2 given that you have rolled an even number? = 1 / 3.
11
Example continued
Let A be the event that the number is odd, i.e. A = {1, 3, 5}

Let B be the event that the number is 5 or 6, i.e. B = {5, 6}

Then the union of A or B, written A ∪ B, is

A ∪ B = {1, 3, 5, 6}

For our events A and B, P(A ∪ B) = 4/6 = 2/3

The intersection of A and B, written A ∩ B, is

A ∩ B = {5}

For our events A and B, P(A ∩ B) = 1/6
12
Conditional Probabilities
The probability that A occurs, given that B has occurred, is a
conditional probability.

Here B is the new Sample Space and the symbol for given that is |
Any conditional probability is written as P(A | B) and is calculated as

P(A | B) = P(A ∩ B) / P(B)

Now P(A ∩ B) = 1/6 and P(B) = 1/3, so:

P(A | B) = (1/6) / (1/3) = 1/2
Intuition
– as we know that event B has occurred, we know that the outcome of the roll
of the die is either a 5 or a 6, and each of these outcomes is equally likely.
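The die example can be checked in a few lines of Python (a minimal sketch, not part of the slides; the `prob` helper and event names are illustrative):

```python
# Conditional probability for one roll of a fair die, using sets of outcomes.
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space
A = {1, 3, 5}            # event: the number is odd
B = {5, 6}               # event: the number is 5 or 6

def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

# P(A | B) = P(A ∩ B) / P(B); set intersection gives the joint event.
p_A_given_B = prob(A & B) / prob(B)   # Fraction(1, 2)
```

Using exact fractions avoids any floating-point rounding in the check.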
13
Bivariate Example - Students
Suppose a class consists of 100 students:

– 60 are 23 years old or less, 40 are older than 23.
– one half of each age group is male, the other half female.

We can classify students by age and gender – these are random variables.

A bivariate probability distribution can be represented by a
contingency table.


14

Bivariate Probability Distributions
In general, a bivariate probability distribution in the form of a contingency
table shows us:
A B Total
C P(A ∩ C) P(B ∩ C) P(C)
D P(A ∩ D) P(B ∩ D) P(D)
Total P(A) P(B) 1
The four probabilities such as P(A ∩ C) in the middle of the table of
probabilities are joint probabilities, and the sum of all four is 1.

The sub-totals, i.e. P(A), ..., P(D), are known as marginal probabilities
(unconditional). Each set of marginal probabilities (the column totals, or
likewise the row totals) also sums to 1.
15
Questions Using the Student Example
For the class of 100 students, answer the following questions.
If we pick a student at random:
1. What is the probability that the student is male?
2. What is the probability that the student is older than 23 and male?
3. Given that the student is male, what is the probability that he is 23
or younger?

4. What is the probability that the student is 23 or younger, or female, or
both?

16
Calculating the Answers

         ≤ 23    > 23    Total
Male     0.30    0.20    0.50
Female   0.30    0.20    0.50
Total    0.60    0.40    1.00
1. P(male) is a marginal probability. Answer: 0.5 in final column

2. P(male ∩ > 23) is a joint probability. Answer: 0.2 in the table

3. P(≤ 23 | male) is a conditional probability. Answer: 0.6
= P(≤ 23 ∩ male) / P(male) = 0.3 / 0.5 = 0.6

4. P(female ∪ ≤ 23) is a compound probability. Answer: 0.8
Compound Probability Alternatives
To find P(female OR ≤ 23) we cannot just add P(≤ 23) and P(female),
because then we would be double counting the students who are both
female and ≤ 23.

To avoid this double counting we add the separate probabilities and
subtract the probability that a student is both female and ≤ 23.

To calculate the required probability we find:
P(≤ 23) + P(female) − P(female and ≤ 23)
= 0.6 + 0.5 − 0.3 = 0.8

Alternative approach: use the Complement Rule:
The student who is everything other than “23 or younger, or female, or
both” is a male and aged over 23. The probability that a randomly
selected student is male and aged over 23 is 0.2, which gives the required
probability of 1 - 0.2 = 0.8.
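All four answers for the student example follow from the joint probabilities alone; a small Python sketch (the dictionary layout and names are illustrative, not from the slides):

```python
# Contingency-table probabilities for the class of 100 students.
# Keys are (gender, age group); values are joint probabilities.
joint = {
    ("male", "<=23"): 0.3, ("male", ">23"): 0.2,
    ("female", "<=23"): 0.3, ("female", ">23"): 0.2,
}

# 1. Marginal probability: sum the row for males.
p_male = sum(p for (g, a), p in joint.items() if g == "male")

# 2. Joint probability: read straight from the table.
p_male_over23 = joint[("male", ">23")]

# 3. Conditional probability: joint / marginal.
p_le23_given_male = joint[("male", "<=23")] / p_male

# 4. Compound probability via the addition rule.
p_le23 = sum(p for (g, a), p in joint.items() if a == "<=23")
p_female = sum(p for (g, a), p in joint.items() if g == "female")
p_le23_or_female = p_le23 + p_female - joint[("female", "<=23")]
```

The same numbers (0.5, 0.2, 0.6, 0.8) drop out as in the slides.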
Rules Established So Far
The addition rule (for compound probabilities with OR):
P(A ∪ B) = P(A)+ P(B) − P(A ∩ B)
Formula for conditional probabilities with GIVEN THAT:
P(A | B) = P(A ∩ B) / P(B)

This gives the multiplication rule (for joint probabilities with AND):
P(A ∩ B) = P(A | B) · P(B) = P(B | A) · P(A)
Mutually Exclusive Events
Events which cannot occur simultaneously are called
mutually exclusive events.

Example: roll a fair die. Let A = {2} and B = {odd number}.
These events are mutually exclusive (their intersection is a null or empty
set).

Let C = {even number}. Then A and C are not mutually exclusive as the
value 2 belongs to both A and C.

Consider the addition rule:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

If A and B are mutually exclusive, P(A ∩ B) = 0, and the rule becomes:
P(A ∪ B) = P(A) + P(B)
Independent Events
For the student example, note that the conditional probability:

P(≤ 23 | male) = P(≤ 23 ∩ male) / P(male) = 0.3 / 0.5 = 0.6

and the marginal probability P(≤ 23) = 0.6.

Thus, whether or not we know a student’s gender gives us no new
information about the probability that the student will be ≤ 23.

If the occurrence of one event gives no new information about the
likelihood of another, i.e. if P(A | B) = P(A), we say those events are
independent.
If two events are not independent, then we say they are dependent.
Independent Events continued
If events A and B are independent, then P(A | B) = P(A) and P(B | A) = P(B).

If events A and B are dependent, then P(A | B) ≠ P(A).

Consider the multiplication rule:
P(A ∩ B) = P(A | B) · P(B)

If two events are independent, P(A | B) = P(A), so the rule becomes:
P(A ∩ B) = P(A) · P(B)
Applications of the Multiplication Rule
The Multiplication Rule has very important practical applications.
It is used to develop systems that reduce the risk of bad outcomes.

1. In modern cars there are usually 2 independent braking systems,
used to ensure the probability of brake failure is very small. If A is
the event that one braking system fails, the probability of failure
might be P(A) = 0.001, or 1 in a thousand.
The probability that the first and the second independent systems
will both fail has a much smaller value of 1 in a million, as
P(A ∩ A) = P(A) P(A) = (0.001)(0.001) = 0.000001

2. Corporations are expected to have independent directors. If the
probability that a manager acts badly is P(A) = 0.1 and the
probability the board acts badly is P(B) = 0.1, the probability both
act badly is
P(A ∩ B) = P(A) P(B) = (0.1)(0.1) = 0.01
Probability Tree Analysis
If we have information on marginal probabilities P(B) and joint
probabilities P(A ∩ B) in a contingency table, to find conditional
probabilities like P(A | B) we use the formula P(A ∩ B) / P(B).

If we have information on conditional probabilities, we can construct a
probability tree to help us find joint and marginal probabilities.

Probability (or decision) trees are particularly useful for analysing
sequential experiments, i.e. a series of experiments performed one after another.

Example - Eating habits
On Friday nights, 30% of families eat at home (H), and 70% of families
buy their dinner.

Of the 70% of families that buy dinner, half get takeaway food from a
restaurant (T), and the other half eat in at a restaurant (E).

60% of families that buy dinner eat a large meal (L), and 30% of families
that eat at home eat a small meal (S).

What is the probability that
• a family has a large meal ?
• a family buys a small meal ?
• a family buys a large takeaway meal ?
Probability Tree Example

What does the final column add up to?
Answers:
P(L) = P(H ∩ L) + P(E ∩ L) + P(T ∩ L)
= 0.21 + 0.21 + 0.21
= 0.63

P(S ∩ buys) = P(E ∩ S) + P(T ∩ S)
= 0.14 + 0.14
= 0.28

P(T ∩ L) = 0.35 × 0.6 = 0.21
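The tree can be built programmatically: multiply along each branch to get the joint probabilities, then add across branches. A Python sketch of the eating-habits tree (the dictionaries and names are illustrative; note that since 30% of home-eaters have a small meal, 70% have a large one):

```python
# Probability tree for the Friday-night dinner example.
# First stage: where the family eats (H = home, E = eat in, T = takeaway).
p_first = {"H": 0.3, "E": 0.35, "T": 0.35}

# Second stage: P(large meal | first-stage branch).
p_large_given = {"H": 0.7, "E": 0.6, "T": 0.6}

# Joint probabilities: multiply along each branch of the tree.
joint = {(b, "L"): p_first[b] * p_large_given[b] for b in p_first}
joint.update({(b, "S"): p_first[b] * (1 - p_large_given[b]) for b in p_first})

p_large = sum(joint[(b, "L")] for b in p_first)        # P(L)
p_small_buys = joint[("E", "S")] + joint[("T", "S")]   # P(S ∩ buys)
p_large_takeaway = joint[("T", "L")]                   # P(T ∩ L)
```

The six joint probabilities (the final column of the tree) sum to 1, which is a useful consistency check.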
Sequential Experiments
If you roll a fair die twice, what is the probability that:
1. you roll two sixes?
2. you roll one six?
3. you roll no sixes?
ANSWERS: Use the multiplication rule for independent events
P (A ∩ A) = P (A) P (A)

1. You roll a 6 followed by another 6 ⇒ 1/6 × 1/6 = 1/36

2. Either you roll a 6 followed by a number that is not 6, or you roll a
number that is not 6 followed by a 6. The answer is thus:
1/6 × 5/6+5/6 × 1/6 = 10/36

3. You roll a number that is not 6, followed by another number that is not
6.
  Thus: 5/6 × 5/6 = 25/36
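The three dice answers can be verified by enumerating all 36 equally likely outcomes (an illustrative sketch; the `prob` helper is not from the slides):

```python
# Exact probabilities for two independent rolls of a fair die.
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # all 36 equally likely pairs

def prob(event):
    """Fraction of equally likely outcomes satisfying the event predicate."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

p_two_sixes = prob(lambda r: r.count(6) == 2)   # 1/36
p_one_six = prob(lambda r: r.count(6) == 1)     # 10/36
p_no_sixes = prob(lambda r: r.count(6) == 0)    # 25/36
```

The three probabilities sum to 1, as they must, since the events are mutually exclusive and exhaustive.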
Random Variables and Probability Distributions
On slide 4, a definition for a random variable (RV) was given:
“a function or rule that assigns a numerical value
to each outcome of an experiment”.
RVs are what we analyse in statistics.
The data sets we investigate AND the statistics we construct from them
are usually RVs.
Once we know the possible values a RV can take, and the probabilities
that each value occurs, we have the probability distribution for that RV.
The probability distribution for some population is what we need to
analyse a RV (statistical inference is about populations).
Discrete Probability Distributions
We will focus on discrete RVs for the rest of this seminar, and look at
continuous RVs next seminar.

Discrete Probability Distribution definition:
a table, formula or graph that lists all possible values that a discrete RV
can take, along with their associated probabilities.

Requirements
If an RV X can take values x1, x2, ..., xk, where the probability that X takes
the value xi is P(X = xi), or simply p(xi), then:

0 ≤ p(xi) ≤ 1 for each i, and p(x1) + p(x2) + ... + p(xk) = 1
Example - Tossing Coins
Consider the situation where a fair coin is tossed 2 times in such a
way that each outcome H or T is an independent event.

If the RV from this experiment is the number of heads we get
from the two tosses, then the possible values are 0, 1 and 2.

And the probabilities of the possible values are:

p(0) = 1/4, p(1) = 1/2, p(2) = 1/4
Expected Value of a RV
If we weight all possible outcomes by their probabilities and sum them
together, we have the expected value of X, denoted E(X) or µ:

E(X) = µ = x1 p(x1) + x2 p(x2) + ... + xk p(xk)

We refer to E(X) as the mean of the population distribution.
One way of thinking of E(X) is that it is the average value we would expect
to see over a large number of repeated experiments.

For our example:

E(X) = 0 × 1/4 + 1 × 1/2 + 2 × 1/4 = 1

Variance of a RV
If we calculate the squared deviation of each outcome from µ, weight each
deviation by the corresponding probability and sum together, we have the
variance of X, denoted V(X):

V(X) = σ² = Σ (xi − µ)² p(xi), summing over i = 1, ..., k

An alternative formula (shortcut calculation) is:

V(X) = σ² = Σ xi² p(xi) − µ²

The standard deviation is: σ = √V(X)
Variance for our Example
If X is the number of heads from tossing a coin twice, we have seen that
µ = 1. Thus:

V(X) = Σ xi² p(xi) − µ²
= 0² × 1/4 + 1² × 1/2 + 2² × 1/4 − 1²
= 1.5 − 1
= 0.5

The standard deviation is:

σ = √V(X) = √0.5 ≈ 0.707
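The coin-toss mean and variance can be reproduced directly from the probability distribution (an illustrative Python sketch, using the shortcut formula):

```python
# Mean, variance and standard deviation of X = number of heads in two tosses.
from math import sqrt

dist = {0: 0.25, 1: 0.5, 2: 0.25}   # p(x) for each possible value x

mu = sum(x * p for x, p in dist.items())              # E(X)
var = sum(x**2 * p for x, p in dist.items()) - mu**2  # shortcut: E(X^2) - mu^2
sd = sqrt(var)
```

The same pattern works for any discrete distribution stored as a value-to-probability mapping.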

Data Transformations
Consider the data set (used in Seminar 1):
3 9 14 19 25
We know x̄ = 14 and sx = 8.54
What happens if we transform the data?

1. Suppose we add 10 to each value.
2. Suppose we multiply each value by 4.
3. Suppose we multiply each value by 4 then add 10.

What happens to the mean and variance?

Laws of Expected Value and Variance
Suppose we have a RV denoted X and two constants c and d.
Laws of expected value:
1. E(c) = c
2. E(X + c) = E(X) + c
3. E(cX) = c E(X)
4. E(cX + d) = c E(X) + d

Laws of variance:
1. V(c) = 0
2. V(X + c) = V(X)
3. V(cX) = c² V(X)
4. V(cX + d) = c² V(X)
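The laws can be checked numerically by transforming a distribution exactly as in the data-transformation questions (multiply by 4, add 10). An illustrative Python sketch using the coin-toss distribution:

```python
# Numerical check of E(cX + d) = cE(X) + d and V(cX + d) = c^2 V(X).
dist = {0: 0.25, 1: 0.5, 2: 0.25}   # p(x) for the coin-toss example
c, d = 4, 10

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((x - m)**2 * p for x, p in pmf.items())

# Transforming X into Y = cX + d moves the values, not the probabilities.
transformed = {c * x + d: p for x, p in dist.items()}

assert mean(transformed) == c * mean(dist) + d
assert variance(transformed) == c**2 * variance(dist)
```

Adding a constant shifts the mean but leaves the variance unchanged; scaling multiplies the variance by the square of the scale factor.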
Jointly Distributed Random Variables
Suppose we have two RVs denoted X and Y .
Let the joint probability that X has a value of xi and Y has a value of yj be
denoted by p(xi, yj).
Bivariate (joint) distribution: a table, graph or formula that lists the joint
probabilities for all pairs of values of X and Y.

Requirements:
1. 0 ≤ p(xi, yj) ≤ 1 for all pairs of values (xi, yj).
2. Σi Σj p(xi, yj) = 1
Bivariate Example
Consider the following joint distribution, with the three possible values for
X along the top row and the two possible values for Y in the left hand
column:
y \x 1 2 3 Sum
2 0.34 0.18 0.11 0.63
3 0.21 0.11 0.05 0.37
Sum 0.55 0.29 0.16 1.00

The entry when Y takes the value 3 and X takes the value 2 is 0.11.
This means:
P(X = 2 and Y = 3) = p(2,3) = 0.11
Marginals for Bivariate Example
The column totals give the marginal distribution for X,
and the row totals give the marginal distribution for Y .
For example:
P(X = 2) = P(X = 2 and Y = 2) + P(X = 2 and Y = 3)
= 0.18 + 0.11 = 0.29
and
P (Y = 3) = p(1, 3) + p(2, 3) + p(3, 3)
= 0.21+0.11+0.05 = 0.37

The sum of the column sums is 1, as is the sum of the row sums.
Covariance of Two Discrete RVs
Suppose there are n possible values of X (denoted x1, x2, ..., xn) and m
possible values of Y (denoted y1, y2, ..., ym); there will be a total of nm
possible combinations of X and Y values.

The covariance of X and Y is:

COV(X, Y) = σxy = Σi Σj (xi − µx)(yj − µy) p(xi, yj) = Σi Σj xi yj p(xi, yj) − µx µy

The coefficient of correlation between X and Y is:

ρ = COV(X, Y) / (σx σy)
NOTE - independent RVs have zero covariance and correlation.
Covariance Calculation Example
For our joint distribution,

µx = 0.55 × 1 + 0.29 × 2 + 0.16 × 3 = 1.61
σx² = 0.55 × 1² + 0.29 × 2² + 0.16 × 3² − 1.61² = 0.5579

You should check that µy = 2.37 and σy² = 0.2331.

COV(X, Y) = 1 × 2 × 0.34 + 1 × 3 × 0.21 + ... + 3 × 3 × 0.05 − µx µy
= 3.8 − 3.8157 = −0.0157

ρ = COV(X, Y) / (σx σy) = −0.0157 / (√0.5579 × √0.2331) = −0.0435
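The covariance and correlation calculation is tedious by hand but mechanical in code. A sketch for the joint distribution above (the dictionary layout is illustrative; keys are (x, y) pairs from the table):

```python
# Covariance and correlation for the bivariate example table.
from math import sqrt

joint = {(1, 2): 0.34, (2, 2): 0.18, (3, 2): 0.11,
         (1, 3): 0.21, (2, 3): 0.11, (3, 3): 0.05}

# Marginal means from the joint probabilities.
mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())

# Shortcut formulas: E(XY) - mu_x mu_y, E(X^2) - mu_x^2, E(Y^2) - mu_y^2.
cov = sum(x * y * p for (x, y), p in joint.items()) - mu_x * mu_y
var_x = sum(x**2 * p for (x, y), p in joint.items()) - mu_x**2
var_y = sum(y**2 * p for (x, y), p in joint.items()) - mu_y**2

rho = cov / (sqrt(var_x) * sqrt(var_y))   # coefficient of correlation
```

The small negative correlation (about −0.04) matches the hand calculation: larger X values are very slightly associated with smaller Y values.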
Laws for Bivariate Distributions
We can also develop the probability distribution of any combination of
two variables X and Y .

Example from above:
P[X + Y = 4] = p(1,3)+ p(2,2) = 0.21+0.18 = 0.39
Some laws of expected value and variance:

1. E(X + Y ) = E(X) + E(Y )
2. V(X + Y ) = V (X) + V (Y ) + 2 · COV (X,Y )
3. E(X − Y ) = E(X) − E(Y )
4. V(X − Y ) = V(X) + V(Y ) − 2 · COV (X,Y )

Recall – if X and Y are independent, COV (X,Y ) = 0.
Binomial Distribution
A binomial experiment:

1. consists of a fixed number n of repeated trials,
2. each trial has two possible outcomes: success and failure,
3. the probability of success is p and of failure is q = 1 − p, and
4. the trials are independent – the outcome of one trial does not affect the
outcome of any other trial.

A binomial RV represents the total number of successes in the n trials.

The probability distribution of this RV gives us the probability that a
success will occur x times in n trials.
Binomial Example
Let X be a binomial RV representing the number of times you roll a 6 in n
rolls of a fair die.

For different numbers of trials the probability of rolling a single 6 is:
• 1/6 for n = 1

• 2 × 1/6 × 5/6 = 10/36 for n = 2

To find the probability of rolling a single 6 when n = 3 we have to find:

The number of ways of getting just one six when we have 3 trials.
The probability of any of these different ways of getting exactly one 6.

Binomial Example continued
To find the number of ways we can get a single 6 in 3 rolls of a fair die we
could just list every possible way. These are:

a 6 no 6 no 6
no 6 a 6 no 6
no 6 no 6 a 6
With a fair die and independent events the 3 probabilities are all equal:
1/6 × 5/6 × 5/6 = (1/6) × (5/6)²
5/6 × 1/6 × 5/6 = (1/6) × (5/6)²
5/6 × 5/6 × 1/6 = (1/6) × (5/6)²

Thus, probability of a single 6 in 3 rolls: 3 × 1/6 × (5/6)² = 75/216
We need a general formula to avoid listing all possible outcomes.
Calculating Binomial Probabilities
First, note that the probability for each of the possible sequences (branches
of the tree) that represent x independent successes and (n − x) independent
failures equals p^x (1 − p)^(n − x).

Now we need to know how many branches or sequences yield x successes
and (n − x) failures.

For example, there were two ways of getting exactly one head (H) when
tossing two fair coins: H then T and T then H.

We can use the Counting Rule (combinatorial formula):

nCx = n! / (x!(n − x)!)

where n! = n × (n − 1) × (n − 2) × ... × 2 × 1 is called ‘n factorial’, with 0!
defined to equal 1.
Counting Rule Example
Consider arranging 2 letters F and 3 letters S in a row, e.g. FSSFS.

In how many different ways can this be done? Answer = 5C2.

• If first F is in position 1, there are 4 possible positions for second F.
• If first F is in position 2, there are 3 possible positions for second F.
• If first F is in position 3, there are 2 possible positions for second F.
• If first F is in position 4, there is 1 possible position for second F.

The total number of arrangements is therefore 4+3+2+1 = 10

Shortcut calculation: 5C2 = 5! / (2!3!) = 120 / (2 × 6) = 10
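Python's standard library exposes both the factorial and the counting rule directly, so 5C2 can be checked either way:

```python
# The counting rule nCx = n! / (x!(n - x)!) via the standard library.
from math import comb, factorial

n, x = 5, 2
by_formula = factorial(n) // (factorial(x) * factorial(n - x))

# math.comb computes the same binomial coefficient in one call.
assert by_formula == comb(n, x) == 10
```

`math.comb(n, x)` (Python 3.8+) is the idiomatic way to get nCx without writing out the factorials.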

Calculating Binomial Probabilities
Putting the probability and counting rule pieces together:

p(x) = [n! / (x!(n − x)!)] p^x (1 − p)^(n − x)

For the tossing two coins example and getting exactly one head:

p(1) = [2! / (1!(2 − 1)!)] (0.5)^1 (1 − 0.5)^(2 − 1) = 2 × 0.5 × 0.5 = 0.5

For the rolling exactly one six in three rolls of a fair die example:

p(1) = [3! / (1!(3 − 1)!)] (1/6)^1 (5/6)^2 = 3 × (1/6) × (25/36) = 75/216
Stock Price Example
Suppose a stock price has an equal chance of rising or falling each year, and
the change in any year is independent of any previous rise or fall.

If you observe the stock price movements over 10 years, what is the
probability that the stock price rises in exactly two of those years?

If we let the random variable X denote the number of years in which the
stock price increases, we require P(X = 2), or p(2), which we calculate as:

P(X = 2) = p(2) = [10! / (2!8!)] (0.5)^2 (0.5)^8 = 45 × (0.5)^10 ≈ 0.0439

We simplify this type of calculation by noting that:

10! / 8! = (10 × 9 × 8 × ... × 1) / (8 × 7 × ... × 1) = 10 × 9 = 90,
so 10! / (2!8!) = 90 / 2 = 45.
Using the Computer
We can use Excel to calculate probabilities for binomial random variables.

To calculate the above probability p(2) we use:

= BINOM.DIST (2, 10, 0.5, FALSE)

The first argument is the number of successes (x), the next two arguments
are n and p, and the final argument is either TRUE or FALSE.

If we had set the final argument to TRUE, the value calculated would have
been P (X ≤ 2) = p(0) + p(1) + p(2) = 0.0547.

This is called a cumulative probability.
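A plain-Python equivalent of BINOM.DIST makes the FALSE/TRUE distinction explicit (an illustrative sketch; the function name is assumed, not an Excel or library API):

```python
# A plain-Python analogue of Excel's BINOM.DIST(x, n, p, cumulative).
from math import comb

def binom_dist(x, n, p, cumulative):
    def pmf(k):
        return comb(n, k) * p**k * (1 - p)**(n - k)
    if cumulative:
        # P(X <= x): sum the point probabilities p(0) + ... + p(x).
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)   # P(X = x)

p2 = binom_dist(2, 10, 0.5, False)     # like BINOM.DIST(2, 10, 0.5, FALSE)
p_le2 = binom_dist(2, 10, 0.5, True)   # like BINOM.DIST(2, 10, 0.5, TRUE)
```

As on the slide, the point probability is about 0.0439 and the cumulative probability P(X ≤ 2) is about 0.0547.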
Binomial Distribution Plot: X ∼ b(10,0.5)

Binomial Distribution: n=10, p=0.5

This distribution with p = 0.5 is symmetric.
Binomial Distribution Plot: X ∼ b(10,0.3)

Binomial Distribution: n=10, p=0.3

This distribution with p < 0.5 is not symmetric but positively skewed.
With p > 0.5 it is not symmetric but now it will be negatively skewed.
Binomial Distribution Plot: X ∼ b(50,0.3)

Binomial Distribution: n=50, p=0.3
This distribution is not symmetric.
However, it has the bell shape of the normal distribution, which we
discuss next week. Key issue: are both np and n(1 − p) greater than 5?
Binomial Means and Variances
If X is a binomial random variable:
E(X) = µ = np
V(X) = σ² = npq
SD(X) = σ = √(npq)

Note that statisticians denote “distributed as” by the ∼ character,
e.g. X ∼ b(n,p) denotes that X is distributed as a binomial (b for binomial),
where n is the number of trials and p is the probability of success.
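The shortcut formulas np and npq can be verified against a direct calculation from the pmf, here for X ∼ b(10, 0.5) (an illustrative Python sketch):

```python
# Check E(X) = np and V(X) = npq directly from the binomial pmf.
from math import comb

n, p = 10, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Weighted sums over all possible values k = 0, ..., n.
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k**2 * pk for k, pk in enumerate(pmf)) - mean**2

assert abs(mean - n * p) < 1e-9            # np = 5
assert abs(var - n * p * (1 - p)) < 1e-9   # npq = 2.5
```

Because the pmf sums to 1, this also doubles as a sanity check on the pmf itself.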

Excel file: Binomial_Dist_sem3



What You Need to Know
• Concepts: sample space, outcomes, events, experiments, RVs.
• How to define and calculate probabilities for simple, joint, compound,
conditional and complementary events.
• The concepts of mutually exclusive and independent events.
• How to construct & apply contingency tables & probability trees.
• How to construct and interpret discrete probability distributions, their
expected value and variance.
• How to calculate marginal distributions and measures of association for
bivariate distributions.
• How to calculate probabilities for a binomially distributed RV.

NEXT SEMINAR - Continuous Probability Distributions and
Sampling Distributions.

