ST305 Statistical Inference
Bayes' rule
Bayes' rule: Let F1, F2, . . . , Fn be a partition of the sample
space, and let E be any event. Then,
P(Fj | E) = P(EFj)/P(E) = P(E | Fj)P(Fj) / [∑_{i=1}^n P(E | Fi)P(Fi)].
Example 3L
Suppose that we have 3 cards that are identical in form, except
that both sides of the first card are colored red, both sides of the
second card are colored black, and one side of the third card is
colored red and the other side black. The 3 cards are mixed up
in a hat, and 1 card is randomly selected and put down on the
ground. If the upper side of the chosen card is colored red,
what is the probability that the other side is colored black?
Solution
R – the upturned side of the chosen card is red
Rr (Bb, Rb) – the chosen card is all red (all black, red-black)
The desired probability: P(Rb | R)
P(Rb | R) = P(Rb ∩ R)/P(R)
= P(R | Rb)P(Rb) / [P(R | Rr)P(Rr) + P(R | Rb)P(Rb) + P(R | Bb)P(Bb)]
= (1/2)(1/3) / [(1)(1/3) + (1/2)(1/3) + (0)(1/3)] = 1/3. (Activity)
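A quick way to sanity-check this answer is a small Monte Carlo simulation. The sketch below is plain Python; the card encoding and function names are my own assumptions, not part of the original example.

import random

# Each card is a pair of side colors; the third card is red on one
# side and black on the other.
cards = [("red", "red"), ("black", "black"), ("red", "black")]

def estimate(trials=100_000):
    red_up = 0        # draws whose upturned side is red
    other_black = 0   # of those, draws whose hidden side is black
    for _ in range(trials):
        card = random.choice(cards)
        up, down = random.sample(card, 2)  # pick which side faces up
        if up == "red":
            red_up += 1
            if down == "black":
                other_black += 1
    return other_black / red_up

print(estimate())  # hovers around 1/3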
Independence between two events
Definition: E is (statistically) independent of F if
P(E | F) = P(E) or, equivalently, P(EF) = P(E)P(F).
Definition: Two events E and F that are not independent
are said to be dependent.
If E and F are independent, then the following pairs are
also independent:
E and F^c.
Proof. WTS: P(EF^c) = P(E)P(F^c).
Note P(E) = P(EF) + P(EF^c) = P(E)P(F) + P(EF^c) ⟹
P(EF^c) = P(E) − P(E)P(F) = P(E)(1 − P(F)) = P(E)P(F^c).
E^c and F.
E^c and F^c.
Example
Suppose that we toss 2 fair dice. Let E1 (E2) denote the event
that the sum of the dice is 6 (7), and F denote the event that
the first die equals 4. (i) Is E1 independent of F? (ii) Is E2
independent of F?
Sol. (i) P(E1F) = P({(4, 2)}) = 1/36.
P(E1)P(F) = P({(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}) · P(F) = (5/36)(1/6) = 5/216.
So P(E1F) ≠ P(E1)P(F) ⟹ E1 and F are NOT independent.
(ii) P(E2F) = P({(4, 3)}) = 1/36.
P(E2)P(F) = P({(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}) · P(F) = (6/36)(1/6) = 1/36.
So P(E2F) = P(E2)P(F) ⟹ E2 and F are independent.
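With only 36 equally likely outcomes, both checks can also be done by brute force. A minimal sketch in plain Python (the event encodings and helper names are mine):

import itertools
from fractions import Fraction

# All 36 equally likely outcomes of rolling two fair dice.
omega = list(itertools.product(range(1, 7), repeat=2))

def prob(event):
    # Probability of an event, given as a set of outcomes.
    return Fraction(len(event), len(omega))

F = {w for w in omega if w[0] == 4}        # first die equals 4
E1 = {w for w in omega if sum(w) == 6}     # sum of the dice is 6
E2 = {w for w in omega if sum(w) == 7}     # sum of the dice is 7

print(prob(E1 & F) == prob(E1) * prob(F))  # False: E1 and F dependent
print(prob(E2 & F) == prob(E2) * prob(F))  # True: E2 and F independent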
Independence among multiple events
Definition: Three events E, F, and G are said to be
independent if
P(EFG) = P(E)P(F)P(G),
P(EF) = P(E)P(F),
P(EG) = P(E)P(G),
P(FG) = P(F)P(G).
The events E1, E2, . . . , En are said to be independent if, for
every subset E_{i1}, E_{i2}, . . . , E_{ir}, r ≤ n, of these events,
P(E_{i1} E_{i2} · · · E_{ir}) = P(E_{i1}) P(E_{i2}) · · · P(E_{ir}).
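The definition quantifies over every subset of the events, which translates directly into code. A sketch in plain Python for a finite, equally likely sample space (the function and example names are mine):

import itertools
from fractions import Fraction
from functools import reduce

def is_independent(events, omega):
    # Check P(E_{i1} ... E_{ir}) = P(E_{i1}) ... P(E_{ir}) for every
    # subset of size r >= 2, assuming equally likely outcomes.
    def prob(event):
        return Fraction(len(event), len(omega))
    for r in range(2, len(events) + 1):
        for subset in itertools.combinations(events, r):
            inter = reduce(set.intersection, subset)
            if prob(inter) != reduce(lambda a, b: a * b,
                                     (prob(e) for e in subset)):
                return False
    return True

# Example: two fair coin tosses; "first is heads" and "second is heads".
omega = [(a, b) for a in "ht" for b in "ht"]
E = {w for w in omega if w[0] == "h"}
F = {w for w in omega if w[1] == "h"}
print(is_independent([E, F], omega))  # True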
Random variables
A random variable is a function from a sample space S
into the real numbers.
Example Suppose that our experiment consists of tossing
3 fair coins. If we let Y denote the number of heads that
appear, then Y is a random variable.
Suppose we have a sample space S = {s1, · · · , sn} with a
probability function P. Define a random variable X with range
{x1, · · · , xm}. Let PX be defined by
PX(X = xi) = P({sj ∈ S : X(sj) = xi}).
Then,
PX is an induced probability function on X, defined in terms
of the original function P.
We will simply write P(X = xi) rather than PX(X = xi).
Example
Suppose that our experiment consists of tossing 3 fair coins. If
we let Y denote the number of heads that appear, then Y is a
random variable.
Solution.
The random variable Y defined as above takes one of the
values 0, 1, 2, and 3 with respective probabilities:
PY(Y = 0) = P({(t, t, t)}) = 1/8
PY(Y = 1) = P({(t, t, h), (t, h, t), (h, t, t)}) = 3/8
PY(Y = 2) = P({(t, h, h), (h, t, h), (h, h, t)}) = 3/8
PY(Y = 3) = P({(h, h, h)}) = 1/8
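These probabilities can be read off by enumerating the 8 equally likely outcomes. A minimal sketch in plain Python (the variable names are mine):

import itertools
from collections import Counter
from fractions import Fraction

outcomes = list(itertools.product("ht", repeat=3))   # 8 equally likely
counts = Counter(o.count("h") for o in outcomes)     # Y = number of heads
pmf = {y: Fraction(c, len(outcomes)) for y, c in sorted(counts.items())}
print(pmf)  # fractions 1/8, 3/8, 3/8, 1/8 for y = 0, 1, 2, 3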
Cumulative distribution function
Definition: The function defined by
FX(x) = PX(X ≤ x), −∞ < x < ∞
is called the cumulative distribution function (cdf) or,
more simply, the distribution function of X .
Example - Tossing three coins
Consider the experiment of tossing three fair coins, and let X =
number of heads observed.
The cdf of X:
FX(x) =
  0    if −∞ < x < 0
  1/8  if 0 ≤ x < 1
  1/2  if 1 ≤ x < 2
  7/8  if 2 ≤ x < 3
  1    if 3 ≤ x < ∞
This is a step function; its graph is a staircase with jumps at
x = 0, 1, 2, 3.
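The cdf is just the running total of the pmf, as the short check below illustrates (plain Python, reusing the pmf computed above; the names are mine):

from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x):
    # F_X(x) = P(X <= x): sum the pmf over all values not exceeding x.
    return sum(p for value, p in pmf.items() if value <= x)

print(cdf(-1), cdf(0), cdf(1.5), cdf(2.5), cdf(10))  # 0, 1/8, 1/2, 7/8, 1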
Example - Some observations
FX is defined for all values of x, not just those in
X = {0, 1, 2, 3}. For example,
FX(2.5) = P(X ≤ 2.5) = P(X = 0, 1, or 2) = 7/8.
FX has jumps at the values xi ∈ X, and the size of the
jump at xi is equal to P(X = xi). More generally, the size of
the jump at any point x is equal to P(X = x).
FX(x) = 0 for x < 0, since X cannot be negative.
FX(x) = 1 for x ≥ 3, since X is certain to be less than or
equal to such a value.
FX can be discontinuous, with jumps at certain values of x.
By the way in which FX is defined, at the jump points FX
takes the value at the top of the jump; in other words,
FX is continuous when a point is approached from the
right.
Cumulative distribution function
Theorem: The function F(x) is a cdf if and only if the
following three conditions hold:
lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1.
F(x) is a nondecreasing function of x.
F(x) is right-continuous; that is, for every number x0,
lim_{x↓x0} F(x) = F(x0).
Exercise Show that the function defined by
F(x) = 0 for x < 0 and F(x) = 1 − e^{−x} for x ≥ 0
is a cdf.
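The three conditions can be probed numerically for this F (a sanity check, not a proof; plain Python, and the grid choices are mine):

import math

def F(x):
    return 0.0 if x < 0 else 1.0 - math.exp(-x)

xs = [k / 10 for k in range(-100, 101)]
print(F(-1e9), F(1e9))                                   # limits: 0.0 and 1.0
print(all(F(a) <= F(b) for a, b in zip(xs, xs[1:])))     # nondecreasing on grid
print(all(abs(F(x + 1e-12) - F(x)) < 1e-9 for x in xs))  # right-continuity probe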
Discrete Random Variables
Definition: A random variable X is discrete if FX(x) is a
step function of x.
A discrete random variable can take on at most a
countable number of possible values.
For a discrete r.v. X, the probability mass function (pmf) is
defined by:
fX(a) = P(X = a)
If X must assume one of the values x1, x2, . . ., then
fX(xi) ≥ 0 for i = 1, 2, . . .
fX(x) = 0 for all other values of x
∑_{i=1}^∞ fX(xi) = 1
A discrete random variable's mass function values are equal
to the jump sizes in its distribution function.
Example 2a
The probability mass function of a random variable X is given by
f(i) = e^{−λ} λ^i / i!, i = 0, 1, 2, . . ., for some λ > 0.
Find (a) P(X = 0) and (b) P(X > 2).
Solution.
a.
P(X = 0) = f(0) = e^{−λ} λ^0 / 0! = e^{−λ}
b.
P(X > 2) = 1 − P(X = 0) − P(X = 1) − P(X = 2)
= 1 − e^{−λ} − λe^{−λ} − λ²e^{−λ}/2
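This pmf is that of a Poisson(λ) random variable, so the complement calculation is easy to verify numerically. A sketch in plain Python (the choice λ = 2 is an arbitrary assumption of mine):

import math

lam = 2.0

def f(i):
    # pmf: P(X = i) = e^{-lam} * lam^i / i!
    return math.exp(-lam) * lam**i / math.factorial(i)

closed_form = 1 - math.exp(-lam) * (1 + lam + lam**2 / 2)
print(closed_form, 1 - f(0) - f(1) - f(2))  # the two values agree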
Continuous random variables
Definition: A random variable X is continuous if FX(x) is a
continuous function of x.
For a continuous random variable X, its distribution
function can be expressed in the form
F(a) := P(X ≤ a) = ∫_{−∞}^a fX(x) dx, −∞ < a < ∞,
with a function fX ≥ 0.
Then
fX is the probability density function (pdf).
If X is a continuous random variable, fX(x) = F′X(x).
Example
Let X be a random variable with the pdf:
f(x) =
  0,         if x ≤ 0
  λe^{−λx},  if x > 0
Then, its cdf:
F(x) =
  0,            if x ≤ 0
  1 − e^{−λx},  if x > 0
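The cdf here is the integral of the density; a symbolic check, assuming SymPy is available (the variable names are mine):

import sympy as sp

x, t = sp.symbols("x t", positive=True)
lam = sp.symbols("lambda", positive=True)

pdf = lam * sp.exp(-lam * t)
cdf = sp.integrate(pdf, (t, 0, x))  # F(x) = integral of f(t) from 0 to x
print(sp.simplify(cdf))             # 1 - exp(-lambda*x)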
Properties of pdf and pmf
Theorem: A function fX(x) is a pdf (or pmf) of a random
variable X if and only if
fX(x) ≥ 0 for all x
∑_x fX(x) = 1 (pmf) or ∫_{−∞}^∞ fX(x) dx = 1 (pdf)
Identically distributed random variables
Definition: Let B1 be the smallest sigma algebra containing
all the intervals of real numbers of the form (a, b), [a, b),
(a, b], and [a, b]. The random variables X and Y are
identically distributed if, for every set A ∈ B1,
P(X ∈ A) = P(Y ∈ A).
Theorem. The following two statements are equivalent:
The random variables X and Y are identically distributed.
FX (x) = FY (x) for every x .
Identically distributed random variables
Note: two random variables that are identically distributed
are not necessarily equal.
Example (Identically distributed random variables)
Consider the experiment of tossing a fair coin three times.
Define the random variables X and Y by:
X = number of heads observed and Y = number of tails
observed.
We can verify P(X = k) = P(Y = k) for k = 0, 1, 2, 3
⟹ X and Y are identically distributed.
However, for no sample point s do we have X(s) = Y(s).
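Direct enumeration makes both claims concrete (plain Python; the encodings are mine):

import itertools
from collections import Counter

outcomes = list(itertools.product("ht", repeat=3))
X = {s: s.count("h") for s in outcomes}  # number of heads
Y = {s: s.count("t") for s in outcomes}  # number of tails

print(Counter(X.values()) == Counter(Y.values()))  # True: same distribution
print(any(X[s] == Y[s] for s in outcomes))         # False: never pointwise equal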
A note on notation
The expression "X has a distribution given by FX(x)" is
abbreviated symbolically by "X ∼ FX(x)". Or we can
similarly write X ∼ fX(x).
The symbol "∼" should be read as "is distributed as."
X ∼ Y means: X and Y are identically distributed.
Expected value - discrete random variable
If X is a discrete r.v. with a probability mass function p(x), the
expectation or the expected value or the mean of X is
denoted by E[X]:
E[X] = ∑_{x: p(x)>0} x p(x)
If X must take on one of the values x1, x2, . . ., xn with
respective probabilities p(x1), p(x2), . . ., p(xn), then
E[X] = ∑_{i=1}^n xi p(xi)
The expectation: (a center of mass) a weighted average of
the possible values xi with weights p(xi).
Example
Find E[X], where X is the outcome when we roll a fair die.
Solution. The probability mass function is
p(1) = p(2) = · · · = p(6) = 1/6.
E[X] = 1(1/6) + 2(1/6) + · · · + 6(1/6) = 7/2.
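The same weighted average in code, kept exact with Fraction (a plain-Python sketch; the names are mine):

from fractions import Fraction

E = sum(x * Fraction(1, 6) for x in range(1, 7))  # each face has weight 1/6
print(E)  # 7/2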
Example
Let X ∼ Binom(n, p). Then X takes the values 0, 1, . . . , n, and
p(i) = P{X = i} = C(n, i) p^i (1 − p)^{n−i}, i = 0, 1, . . . , n,
where C(n, i) denotes the binomial coefficient.
Note ∑_{i=0}^n p(i) = ∑_{i=0}^n C(n, i) p^i (1 − p)^{n−i} = [p + (1 − p)]^n = 1.
We will use a cute trick: i = (d/dt) t^i |_{t=1}.
E[X] = ∑_{i=0}^n i C(n, i) p^i (1 − p)^{n−i}
= ∑_{i=0}^n C(n, i) [(d/dt) t^i |_{t=1}] p^i (1 − p)^{n−i}
= (d/dt) [∑_{i=0}^n C(n, i) (tp)^i (1 − p)^{n−i}] |_{t=1}
= (d/dt) (tp + 1 − p)^n |_{t=1}
= n(tp + 1 − p)^{n−1} p |_{t=1} = np.
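A quick numerical confirmation of E[X] = np by direct summation (plain Python; the values of n and p are arbitrary choices of mine):

from math import comb

n, p = 10, 0.3
mean = sum(i * comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1))
print(mean, n * p)  # both 3.0, up to floating point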
Expected value - continuous random variable
The expected value, expectation, or mean of a
continuous random variable X is defined by
E[X] = ∫_{−∞}^∞ x f(x) dx
Example Consider a random variable X with the following
density:
f(x) =
  0,              if x ≤ 0
  (1/λ)e^{−x/λ},  if x > 0
Find E[X].
Solution.
E[X] = ∫_0^∞ x (1/λ) e^{−x/λ} dx = λ
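A symbolic check of this integral, assuming SymPy is available (the names are mine):

import sympy as sp

x = sp.symbols("x", positive=True)
lam = sp.symbols("lambda", positive=True)

mean = sp.integrate(x * sp.exp(-x / lam) / lam, (x, 0, sp.oo))
print(mean)  # lambda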
Expectation of a Function of a Random Variable
If X is a discrete random variable that takes on one of the
values xi, i ≥ 1, with respective probabilities f(xi), then, for
any real-valued function g,
E[g(X)] = ∑_i g(xi) f(xi)
If X is a continuous random variable with probability
density function fX(x), then, for any real-valued function g,
E[g(X)] = ∫_{−∞}^∞ g(x) fX(x) dx
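For instance, with X a fair die and g(x) = x², this rule gives E[X²] directly from the pmf (a plain-Python sketch; the names are mine):

from fractions import Fraction

E_g = sum(x * x * Fraction(1, 6) for x in range(1, 7))  # E[X^2] = sum of g(x) p(x)
print(E_g)  # 91/6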
Properties of expectation
The expectation operator can be expressed in integral or
summation form.
The properties of the integral or sum lead to many
properties of the expectation operator that can help ease
calculational effort.
Linearity: The process of taking expectations is a linear
operation ⟹
If a, b and c are constants, then
E[aX + b] = aE[X] + b
E[a g1(X) + b g2(X) + c] = a E[g1(X)] + b E[g2(X)] + c
Proof. If X is continuous:
E[a g1(X) + b g2(X) + c] = ∫_{−∞}^∞ (a g1(x) + b g2(x) + c) fX(x) dx
= a ∫_{−∞}^∞ g1(x) fX(x) dx + b ∫_{−∞}^∞ g2(x) fX(x) dx + c ∫_{−∞}^∞ fX(x) dx
= a E[g1(X)] + b E[g2(X)] + c.
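A numeric spot check of linearity using the fair-die pmf (plain Python; the choices of g1, g2, a, b, c are mine):

from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def E(g):
    # E[g(X)] = sum of g(x) p(x) over the support.
    return sum(g(x) * p for x, p in pmf.items())

a, b, c = 2, 3, 5
g1 = lambda x: x * x
g2 = lambda x: x
print(E(lambda x: a * g1(x) + b * g2(x) + c) == a * E(g1) + b * E(g2) + c)  # True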