STA347H1: Probability (Week 2) — The Basics of Probability & Random Variables
Mohammad Kaviul Anam Khan, Department of Statistical Science, University of Toronto, sta347@utoronto.ca
Week 2 (Sept 20-24, 2021)

Outline
Last week we discussed:
• Some introductory set theory
• The probability triple
• Some fundamentals of probability
• Continuity of probability
This week we are going to discuss:
• Continuity of probability (with proofs on the white board)
• Random variables (discrete and continuous)
• Some common distributions
• PMF, PDF and CDF
• Calculating the median and mode
• Joint distributions (will continue next week)

Continuity in Probability
• Suppose A1, A2, ... is a sequence of events that are getting "close" (in some sense) to another event A, as in the following two Venn diagrams.
[Figure: two Venn diagrams — an increasing sequence An ↗ A, with A1 ⊆ A2 ⊆ A3 ⊆ ... growing towards A, and a decreasing sequence An ↘ A, with A1 ⊇ A2 ⊇ A3 ⊇ ... shrinking towards A]
• Increasing sequence: the sequence {An} increases to A (An ↗ A). This means A1 ⊆ A2 ⊆ A3 ⊆ ..., and also that ⋃∞n=1 An = A.
• Decreasing sequence: the sequence {An} decreases to A (An ↘ A). This means A1 ⊇ A2 ⊇ A3 ⊇ ..., and also that ⋂∞n=1 An = A.
• Since {An} converges to A, we would expect that limn→∞ P(An) = P(A). (Proof: E&R Section 1.7.)

Boole's Inequality
Let A1, A2, ... be any events. Then P(⋃∞k=1 Ak) ≤ ∑∞k=1 P(Ak).
Proof. We know that P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2). This implies P(A1 ∪ A2) ≤ P(A1) + P(A2). By induction, for any finite n, P(⋃nk=1 Ak) ≤ ∑nk=1 P(Ak). Now let B1 = A1, B2 = A1 ∪ A2, and in general Bn = ⋃nk=1 Ak; further let B = ⋃∞k=1 Ak. We can see that Bn ↗ B.
Thus, using continuity of probability, P(⋃∞k=1 Ak) = P(B) = limn→∞ P(Bn) ≤ limn→∞ ∑nk=1 P(Ak) = ∑∞k=1 P(Ak). ∎

Bonferroni's Inequality
Let A1, A2, ... be any events. Then P(⋂∞k=1 Ak) ≥ 1 − ∑∞k=1 P(Aᶜk).
Proof. By extending De Morgan's law, (⋂∞k=1 Ak)ᶜ = ⋃∞k=1 Aᶜk. Thus
P((⋂∞k=1 Ak)ᶜ) = P(⋃∞k=1 Aᶜk) ≤ ∑∞k=1 P(Aᶜk), by Boole's inequality,
⇒ 1 − P((⋂∞k=1 Ak)ᶜ) ≥ 1 − ∑∞k=1 P(Aᶜk) ⇒ P(⋂∞k=1 Ak) ≥ 1 − ∑∞k=1 P(Aᶜk). ∎
Solve all problems of Chapter 1.6 from E&R.

Random Variable
• Assume we are tossing three coins.
• The sample space is S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
• Sometimes we are not interested in these kinds of outcomes (HHH, etc.); rather, we might want to count the number of heads after three tosses.
• This produces a much simpler sample space, S1 = {0, 1, 2, 3}.
• This gives rise to the concept of a Random Variable.

Definition. A random variable is a function from the sample space S to the set of all real numbers R. That is, X(s) : S → R.
Examples
• Tossing three coins and counting the number of heads: X = number of heads.
• Throwing two dice and adding the two numbers: X = number from die one + number from die two.
• Consider rolling a fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}. Let X be the number showing, so that X(s) = s for s ∈ S. Let Y be three more than the number showing, so that Y(s) = s + 3. Let Z = X² + Y. Then Z(s) = X(s)² + Y(s) = s² + s + 3, so Z(1) = 5, Z(2) = 9, etc.

Types of Random Variable
• Basically there are two types of random variable:
1. Discrete random variable: number of heads if a coin is tossed n times, number of accidents on a street, etc. (Countable)
2.
Continuous random variable: height of students, blood pressure of patients, etc. (Uncountable)
• A random variable has a probability distribution.
• The important part is the 'probability' distribution: certain intervals of values of the random variable have a certain probability of occurring.
• We will discuss three types of distribution: 1. discrete, 2. continuous, 3. mixture.
• All of these can be further divided into two types: (a) univariate, (b) multivariate.

Types of Distribution
[Figure: a diagram representing the types of distributions — discrete, continuous and mixture, each of which can be univariate or multivariate, and bounded or unbounded]

Discrete Distributions
• The definition of a discrete variable and its distribution:
Definition. A random variable X is discrete if there is a finite or countable sequence x1, x2, ... of distinct real numbers, and a corresponding sequence p1, p2, ... of non-negative real numbers, such that P(X = xi) = pi ∀i, and ∑i pi = 1.
• Another important concept associated with discrete random variables is the 'probability function', or 'probability mass function (PMF)'.
Definition. For a discrete random variable X, its probability function is the function pX : R → [0, 1] defined by pX(x) = P(X = x).

Bernoulli Trials
• Suppose we are tossing a coin. There are two outcomes, {H, T}.
• P(H) = θ, and thus P(T) = 1 − θ.
• Let X be the count of heads, so x = 0, 1 (read as x ∈ {0, 1}).
• Let's toss the coin once.
Then what is the probability function?
The Bernoulli Distribution: P(X = x) = θ^x (1 − θ)^(1−x); x = 0, 1
• A very important concept for 'binary' variables.
• This gives rise to the Binomial distribution.

Binomial Distribution
• Suppose we are tossing n coins.
• For each toss (trial), P(H) = θ and thus P(T) = 1 − θ.
• Let X be the count of heads, so x = 0, 1, 2, ..., n (read as x ∈ {0, 1, 2, ..., n}).
Binomial Distribution: P(X = x) = (n choose x) θ^x (1 − θ)^(n−x); x = 0, 1, 2, ..., n
• Here the number of trials is fixed and the number of heads is the variable.
• What if we just want 1 head and want to observe how many trials we need?

Geometric Distribution
• In this case we want to know the number of trials needed until we get the first head.
• What is fixed here? The number of heads: 1.
• What is variable here? The number of trials.
• Let X be the random variable, with x = 1, 2, 3, .... Then:
Geometric Distribution: P(X = x) = θ(1 − θ)^(x−1), x = 1, 2, 3, ...
Letting k = x − 1: pX(k) = θ(1 − θ)^k, k = 0, 1, 2, ...
• The second form means we are counting the number of failures.
• Why "geometric"?

Negative Binomial Distribution
• What if, instead of 1 success (head), we want to observe r successes?
• In a way this is exactly the opposite of the Binomial.
Negative Binomial Distribution: Let r be a positive integer, and let Y be the number of tails that appear before the r-th head. Then for k ≥ 0, Y = k if and only if the coin shows exactly r − 1 heads (and k tails) on the first r − 1 + k flips, and then shows a head on the (r + k)-th flip. The probability of this is
pY(k) = ((r − 1 + k) choose (r − 1)) θ^(r−1) (1 − θ)^k · θ = ((r − 1 + k) choose (r − 1)) θ^r (1 − θ)^k, k = 0, 1, 2, 3, ...
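As a quick numerical sanity check (a minimal Python sketch, not part of the course materials), the binomial, geometric, and negative binomial PMFs above can be computed directly; each sums to 1 over its support, and setting r = 1 in the negative binomial recovers the geometric PMF in its "number of failures" form:

```python
from math import comb

def binomial_pmf(x, n, theta):
    # P(X = x) for X ~ Binomial(n, theta); n = 1 gives the Bernoulli PMF
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def geometric_pmf_failures(k, theta):
    # probability of k failures (tails) before the first head
    return theta * (1 - theta)**k

def negbin_pmf(k, r, theta):
    # probability of k failures before the r-th head
    return comb(r - 1 + k, r - 1) * theta**r * (1 - theta)**k

theta = 0.3
# each PMF sums to 1 over its support (truncated for the infinite supports)
print(sum(binomial_pmf(x, 10, theta) for x in range(11)))           # ~1.0
print(sum(geometric_pmf_failures(k, theta) for k in range(500)))    # ~1.0
# r = 1 reduces the negative binomial to the geometric
print(all(abs(negbin_pmf(k, 1, theta) - geometric_pmf_failures(k, theta)) < 1e-15
          for k in range(50)))                                      # True
```

Truncating the geometric sum at 500 terms is harmless here, since the omitted tail has total mass (1 − θ)^500.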
• When r = 1, this becomes the Geometric distribution.

Poisson Distribution
• Again consider the binomial distribution, i.e. P(X = x) = (n choose x) θ^x (1 − θ)^(n−x), x = 0, 1, 2, ..., n.
• What happens when n → ∞, i.e. what is lim_{n→∞} P(X = x)?
Let λ = nθ be held fixed, so θ = λ/n. Then
P(X = x) = (n choose x) (λ/n)^x (1 − λ/n)^(n−x)
= [n! / ((n − x)! x!)] (λ/n)^x (1 − λ/n)^(n−x)
= [n(n − 1)···(n − x + 1) / x!] (λ/n)^x (1 − λ/n)^(n−x).
Rearranging some of the terms,
P(X = x) = [n(n − 1)···(n − x + 1) / n^x] (λ^x / x!) (1 − λ/n)^n (1 − λ/n)^(−x)
= (n/n)((n − 1)/n)((n − 2)/n)···((n − x + 1)/n) (λ^x / x!) (1 − λ/n)^n (1 − λ/n)^(−x)
= 1·(1 − 1/n)(1 − 2/n)···(1 − (x − 1)/n) (λ^x / x!) (1 − λ/n)^n (1 − λ/n)^(−x).
Thus, taking the limit as n → ∞,
lim_{n→∞} P(X = x) = (λ^x / x!) lim_{n→∞} 1·(1 − 1/n)(1 − 2/n)···(1 − (x − 1)/n)(1 − λ/n)^n (1 − λ/n)^(−x).
We know lim_{n→∞} (1 − 1/n) = 1, and in the same way lim_{n→∞} (1 − λ/n)^(−x) = 1, so
lim_{n→∞} P(X = x) = (λ^x / x!) lim_{n→∞} (1 − λ/n)^n.
Since lim_{n→∞} (1 − λ/n)^n = e^(−λ) = exp(−λ), we get, as n → ∞,
P(X = x) = (λ^x / x!) exp(−λ), x = 0, 1, 2, ...
• This is the PMF of the Poisson distribution.
• Very useful when the number of trials is large and the probability of success is very small (θ → 0).
• Please do Exercise 2.3.18.

Hypergeometric Distribution
• Assume we have N fish in a pond.
• We catch M fish, mark them, and release them back into the pond.
• Now we catch n fish. How many of them are marked?
• First note that N, M and n are fixed quantities. Also, there are M marked fish and N − M unmarked fish.
• Let x out of the n caught fish be marked. Thus X is the random variable here, and X ≥ 0 and X ≥ n − (N − M) (WHY?).
• Thus X ≥ max(0, n − (N − M)) and X ≤ min(M, n).
Hypergeometric Distribution: n fish from N can be chosen in (N choose n) ways; x of them can be marked in (M choose x) ways, with the remaining n − x unmarked fish chosen in ((N − M) choose (n − x)) ways. Thus the probability function is
P(X = x) = (M choose x)((N − M) choose (n − x)) / (N choose n); max(0, n − (N − M)) ≤ x ≤ min(M, n)

Continuous Distribution
• If X is a discrete random variable, then P(X = x) > 0 for some values of x.
• However, when X is continuous (uncountable), P(X = x) = 0.
• For example, the height or weight of an individual is continuous. Thus:
Continuous Distribution: A random variable X is continuous if P(X = x) = 0, ∀x ∈ R.
• However, X can have a positive probability of being in a certain interval, i.e. P(a ≤ X ≤ b) > 0 for some intervals [a, b].

Uniform Distribution
• One of the most common and useful continuous distributions is the Uniform distribution.
• Let the sample space be S = [0, 1].
• Then for every interval of length b − a, with 0 ≤ a < b ≤ 1, the probability is P(a ≤ X ≤ b) = b − a.
• Thus the probability of X being in the interval [0, 1/2] is 1/2 − 0 = 1/2, and so on.
• Now let the sample space be S = [L, R], with L, R ∈ R and −∞ < L < R < ∞.
• What is P(a ≤ X ≤ b), for L ≤ a < b ≤ R?
• This probability is given by P(a ≤ X ≤ b) = (b − a)/(R − L).

Probability Density Function (PDF)
• For a continuous variable P(X = x) = 0, but P(a < X < b) ≥ 0 for intervals (a, b).
• Then how can we develop a probability function, as for a discrete variable?
• This gives rise to the concept of the Probability Density Function (PDF).
Let X be a random variable and ∆x a very small interval. Let f : R → R be a function.
Then f is a probability density function if f(x) ≥ 0 ∀x ∈ R,
f(x) = lim_{∆x→0} P(x < X < x + ∆x)/∆x,
and ∫_{−∞}^{∞} f(x) dx = 1.

Probability Density Function (PDF)
This gives rise to two important mathematical definitions.
Definition. A random variable X is absolutely continuous if there is a density function f such that P(a ≤ X ≤ b) = ∫_a^b f(x) dx whenever a ≤ b.
Theorem. Let X be an absolutely continuous random variable. Then X is a continuous random variable, i.e. P(X = a) = 0, ∀a ∈ R.
Proof. P(X = a) = P(a ≤ X ≤ a) = ∫_a^a f(x) dx = 0.

Uniform Distribution
• Going back to the Uniform distribution, the sample space is S = [L, R], with L, R ∈ R and −∞ < L < R < ∞.
• Thus P(X > R) = P(X < L) = 0.
• Thus the PDF of the Uniform distribution is defined as
f(x) = 1/(R − L), L ≤ x ≤ R; 0 otherwise.
• It is also easy to verify that the cumulative probability is ∫_L^R f(x) dx = ∫_L^R dx/(R − L) = (R − L)/(R − L) = 1.

Normal Distribution
• The most common and important distribution in statistics.
[Figure: the bell-shaped plot of f(x) over x, centred at µ, with P(µ − σ ≤ X ≤ µ + σ) ≈ 0.683 and P(µ − 2σ ≤ X ≤ µ + 2σ) ≈ 0.954]
• This shape is the famous bell shape.
• The distribution is based on the population mean µ and the population variance σ².
The Normal(µ, σ²) Distribution PDF: Let X be a normally distributed random variable with a finite mean µ ∈ R and a positive finite variance σ² ∈ R+.
Then the PDF is
f(x) = (1/(σ√(2π))) exp(−(x − µ)²/(2σ²)); −∞ < x < ∞
Challenge: show that the integral of f(x) over the range of x is 1.
• This distribution has some very nice properties (see the graph).
• If µ = 0 and σ = 1, we have a very specific normal distribution, called the Standard Normal distribution.

Exponential Distribution
• The Uniform and Normal are both symmetric distributions.
• However, there are many other possible distributions which are skewed.
• One such distribution is the Exponential distribution.
• An exponentially distributed variable is always non-negative. Thus:
Exponential Distribution: Let X be a non-negative random variable, i.e. X ≥ 0. If X is exponentially distributed, then the PDF is given by
f(x) = λe^(−λx), x ≥ 0; 0, x < 0
• Here λ is defined as the rate parameter.
• Famous for modelling lifetimes (survival times).
Challenge: show the PDF integrates to 1 over the range of X.

Gamma Distribution
• The PDF of the exponential distribution can also be written as f(x) = λx^(1−1)e^(−λx). What does the right-hand side look like?
• This suggests the Gamma function: Γ(α) = ∫_0^∞ t^(α−1) e^(−t) dt, α > 0.
• Γ(α + 1) = αΓ(α).
• If n is a positive integer, then Γ(n) = (n − 1)!; also, Γ(1/2) = √π.
• Thus the PDF of a Gamma distribution is given by:
Gamma Distribution: Let X be a positive random variable which is distributed as Gamma.
The PDF is given by
f(x) = (λ^α x^(α−1) / Γ(α)) e^(−λx); α, λ > 0, x > 0

Cumulative Distribution Function
• One problem with PDFs is that they are not unique.
• Also, density functions are difficult to interpret, even though they are capable of generating probabilities.
• However, if X is any random variable (discrete or continuous), then its distribution consists of the values P(X ∈ B) for all subsets B of the real numbers.
• One special choice of subset is B = (−∞, x] for some real-valued x. This motivates the following definition.
Cumulative Distribution Function: Given a random variable X, its cumulative distribution function (CDF) is the function FX : R → [0, 1] defined by FX(x) = P(X ≤ x).
• Instead of FX(x) we may just write F(x).

Cumulative Distribution Function
• An important property of CDFs is that the probability of any interval can be expressed in terms of the CDF, e.g. P(a < X ≤ b) = FX(b) − FX(a).
• This motivates a very important theorem about CDFs.
Theorem. Let X be any random variable with cumulative distribution function FX, and let B be any subset of the real numbers. Then P(X ∈ B) can be determined solely from the values of FX(x).
• For proofs and details see the book (pages 62-63).

Properties of CDF
Let FX be the CDF of some random variable X. Then:
1. 0 ≤ FX(x) ≤ 1, ∀x.
2. FX(x) is an increasing function: if x ≤ y then FX(x) ≤ FX(y).
3. lim_{x→−∞} FX(x) = 0 and lim_{x→∞} FX(x) = 1.
For proofs please see the book.

CDFs for Discrete Distributions
Theorem. Let X be a discrete random variable. Then the CDF is FX(x) = ∑_{y≤x} pX(y).
• Let's think about throwing a die, with x = 1, 2, 3, 4, 5, 6.
Then
FX(x) = 0 for x < 1; 1/6 for 1 ≤ x < 2; 2/6 for 2 ≤ x < 3; 3/6 for 3 ≤ x < 4; 4/6 for 4 ≤ x < 5; 5/6 for 5 ≤ x < 6; 1 for x ≥ 6.
• Thus for the discrete case FX is a step function.
• That is, the function jumps only at certain points and remains flat in between.
• See Figure 2.5.1 to visualize.
• Try this for known distributions such as the binomial, Poisson, etc.

CDFs of Absolutely Continuous Distributions
Theorem. Let X be an absolutely continuous random variable with density function fX. Then the cumulative distribution function FX of X satisfies FX(x) = ∫_{−∞}^x fX(t) dt.
• What is the relationship between the PDF and the CDF? (What does the fundamental theorem of calculus indicate?)
Theorem. (d/dx) FX(x) = fX(x).

CDF of Normal Distribution
• It is easy to see how we can calculate the CDFs of the Uniform and Exponential distributions.
• What about the Gamma and the Normal?
• Let's try to calculate the CDF of the normal:
F(x) = ∫_{−∞}^x (1/(σ√(2π))) exp(−(t − µ)²/(2σ²)) dt = ??
• Not easy. What about the standard normal?
Φ(z) = ∫_{−∞}^z (1/√(2π)) e^(−t²/2) dt = ??
• This can be calculated numerically.
• Once we know the CDF of the standard normal, Φ(z), we can calculate the CDF of any normally distributed variable by F(x) = Φ((x − µ)/σ).

Mixture Distribution
• Let Yi ∼ Fi(y), i = 1, 2, ..., k.
• Let the pi be positive real numbers such that ∑i pi = 1.
• Then define G(x) = p1F1(x) + p2F2(x) + ... + pkFk(x).
• Then G is also a CDF (proof: Exercise 2.5.6). The corresponding distribution is called a Mixture Distribution.
Simple example: let Y1 ∼ N(µ1, σ²) and Y2 ∼ N(µ2, σ²). Toss a coin with P(H) = θ; if a head occurs we choose Y1, otherwise Y2. Then the CDF of the resulting Y is
G(y) = θΦ((y − µ1)/σ) + (1 − θ)Φ((y − µ2)/σ)

Median of Distribution
• What is the median?
• In theory we can calculate the median using CDFs.
• It is a little difficult for discrete distributions.
• If the CDF of an absolutely continuous random variable is known, the median can be calculated easily.
Median of a Continuous Random Variable: Let X be an absolutely continuous random variable and FX its CDF. If M ∈ R is the median, then
FX(M) = ∫_{−∞}^M fX(x) dx = 0.5
Solving this equation, we can obtain the value of M.

Median of Distribution
• What is the median of the exponential distribution with PDF f(x) = λ exp(−λx)?
• First we need to calculate the CDF: F(x) = 1 − exp(−λx).
• Thus F(M) = 1 − exp(−Mλ) = 0.5. Then M = ?
M = (1/λ) log(2)
• Here log is the natural log, i.e. ln(2).
• Can the median of the Gamma distribution be calculated? What is its CDF?

Mode of Continuous Distribution
• What is the mode?
• In general, the mode is the value that occurs the most (has the highest probability of occurrence).
• For a discrete variable it can easily be found by checking where the PMF attains its maximum value.
• However, how would you calculate the mode of a continuous random variable?
• We need to find the maximum value of the PDF!
• In calculus, how do we obtain the maximum of a continuous function?
• Thus we have to find the maxima of the PDF.

Mode of the Normal distribution
• If Z is a standard normal variate, what is the mode of Z?
• Maximize (1/√(2π)) exp(−z²/2) with respect to z (hint: take the log of the PDF).
• Is the second derivative negative?
• The mode is 0.
• Thus if X ∼ N(µ, σ²), what would the mode be?

Mode of the Gamma distribution
• X ∼ Gamma(α, λ), so f(x) = λ^α x^(α−1) e^(−λx) / Γ(α).
• The mode is (α − 1)/λ.
• What if 0 < α < 1? Can the mode be negative for X?
• If α < 1, then (α − 1)/λ is negative, which is impossible since X ≥ 0; the density is then decreasing on (0, ∞), so the mode is taken to be 0.

Joint Distribution
• Let X and Y be two random variables. Often we are interested in their relationship.
• Even if we know the distributions of X and Y exactly, we may still know nothing about how they are related.
• This motivates the concept of a "joint distribution".
Joint Distribution: If X and Y are random variables, then the joint distribution of X and Y is the collection of probabilities P((X, Y) ∈ B), for all subsets B ⊆ R² of pairs of real numbers.

Joint CDF
Let X and Y be random variables. Then their joint cumulative distribution function is the function FX,Y : R² → [0, 1] defined by
FX,Y(x, y) = P(X ≤ x, Y ≤ y) = P({X ≤ x} ∩ {Y ≤ y})
Again, as in the univariate case, joint probabilities can be determined solely from the joint CDF.
Theorem. Let X and Y be any random variables with joint cumulative distribution function FX,Y, and suppose a ≤ b and c ≤ d. Then
P(a < X ≤ b, c < Y ≤ d) = FX,Y(b, d) − FX,Y(a, d) − FX,Y(b, c) + FX,Y(a, c)
For the proof, please check page 81 of "Evans & Rosenthal".

Marginal Distribution
• If the joint CDF of multiple random variables is known, then the marginal CDFs of those variables can be determined using the following theorem.
Marginal Distributions: Let X and Y be two random variables with joint cumulative distribution function FX,Y. Then the cumulative distribution function FX of X satisfies
FX(x) = lim_{y→∞} FX,Y(x, y)
for all x ∈ R. Similarly we can find the marginal distribution of Y.
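The rectangle formula and the marginal-as-limit theorem can both be illustrated numerically. The sketch below (mine, not from the slides) takes X and Y to be independent standard normals, so that FX,Y(x, y) = Φ(x)Φ(y) is a valid joint CDF; independence also gives a second, direct way to compute the rectangle probability, which must agree with the four-term CDF formula:

```python
from math import erf, sqrt

def Phi(z):
    # standard normal CDF, written via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def F(x, y):
    # joint CDF of independent X, Y ~ N(0, 1): F(x, y) = Phi(x) * Phi(y)
    return Phi(x) * Phi(y)

a, b, c, d = -1.0, 0.5, 0.0, 2.0
# rectangle probability P(a < X <= b, c < Y <= d) from the joint CDF
rect = F(b, d) - F(a, d) - F(b, c) + F(a, c)
# by independence it must equal (Phi(b) - Phi(a)) * (Phi(d) - Phi(c))
direct = (Phi(b) - Phi(a)) * (Phi(d) - Phi(c))
print(abs(rect - direct) < 1e-12)               # True

# marginal CDF as a limit: F(x, y) -> Phi(x) as y -> infinity
print(abs(F(0.7, 40.0) - Phi(0.7)) < 1e-12)     # True
```

Taking y = 40 stands in for the limit y → ∞, since Φ(40) is 1 to machine precision.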
(Proof: page 82.) In-class example: 2.7.3.

Joint Probability Function
• So far we have discussed CDFs.
• However, in real-life data analysis we are often interested in PMFs or PDFs.
• The joint probability function, or joint PMF, is defined as follows.
Joint Probability Function: Let X and Y be discrete random variables. Then their joint probability function pX,Y is a function from R² → R, defined by
pX,Y(x, y) = P(X = x, Y = y)
• And the marginal PMF is defined as follows.
Marginal Probability Function: Let X and Y have joint PMF pX,Y(x, y). Then the marginal PMF of X is given by
pX(x) = ∑_y pX,Y(x, y)

Joint PDF
• For continuous random variables, the joint PDF is defined as follows.
Joint Probability Density Function: Let X and Y be jointly absolutely continuous random variables. Then their joint PDF fX,Y is a function from R² → R; f is a joint PDF if fX,Y(x, y) ≥ 0 for all x and y, and ∫∫ fX,Y(x, y) dx dy = 1.
• And the marginal PDF is defined as follows.
Marginal Probability Density Function: From the previous definition, the marginal PDF of X can be obtained by
fX(x) = ∫ fX,Y(x, y) dy

In-class exercises
• Examples: 2.7.5, 2.7.6, 2.7.7, 2.7.8
• Exercises: 2.7.1, 2.7.4, 2.7.7
• Problems: 2.7.11, 2.7.16

Beta Distribution
• Like the Gamma function, the Beta function also has some important applications in probability theory and statistics.
• The Beta function of two positive quantities α and β is
B(α, β) = ∫_0^1 t^(α−1)(1 − t)^(β−1) dt = Γ(α)Γ(β)/Γ(α + β)
• This motivates the Beta distribution.
Beta Distribution: Let X be an absolutely continuous random variable with X ∈ [0, 1].
Then X is distributed as Beta with PDF
f(x) = (1/B(α, β)) x^(α−1)(1 − x)^(β−1) = (Γ(α + β)/(Γ(α)Γ(β))) x^(α−1)(1 − x)^(β−1), 0 ≤ x ≤ 1

Dirichlet Distribution
• The Beta function can be extended to more than two arguments; this is called the multivariate Beta function:
B(α1, α2, ..., αk) = ∫∫...∫ t1^(α1−1) t2^(α2−1) ··· t_{k−1}^(α_{k−1}−1) (1 − t1 − t2 − ... − t_{k−1})^(αk−1) dt1 ··· dt_{k−1} = ∏_{i=1}^k Γ(αi) / Γ(α1 + α2 + ... + αk)
• With the help of this multivariate Beta function, the Dirichlet distribution was developed.
• It is a joint distribution of multiple variables X1, X2, ....
• Exercise 2.7.17.

Bivariate Normal
Let X and Y be two absolutely continuous random variables with means µ1 and µ2 respectively, and variances σ1² and σ2². Let ρ, with −1 < ρ < 1, be their correlation. Then X and Y are bivariate normally distributed joint random variates if their joint PDF is
f(x, y) = (1/(2πσ1σ2√(1 − ρ²))) exp{ −(1/(2(1 − ρ²))) [ ((x − µ1)/σ1)² + ((y − µ2)/σ2)² − 2ρ((x − µ1)/σ1)((y − µ2)/σ2) ] }
Exercise: 2.7.13.
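Two of the facts above lend themselves to a quick numerical check (a Python sketch under my own choice of parameter values, not part of the course materials): the Beta-function identity B(α, β) = Γ(α)Γ(β)/Γ(α + β), verified against a direct Riemann-sum integral, and the property that the bivariate normal PDF with ρ = 0 factors into the product of its two normal marginal densities:

```python
from math import gamma, pi, sqrt, exp

# Beta function two ways: the Gamma-function identity vs. a numerical integral
def beta_via_gammas(a, b):
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_via_integral(a, b, n=200_000):
    # midpoint Riemann sum of t^(a-1) (1-t)^(b-1) over (0, 1)
    h = 1.0 / n
    return sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
               for i in range(n)) * h

# B(2, 3) = Gamma(2)Gamma(3)/Gamma(5) = 1*2/24 = 1/12
print(abs(beta_via_integral(2, 3) - beta_via_gammas(2, 3)) < 1e-6)   # True

def bvn_pdf(x, y, m1, m2, s1, s2, rho):
    # bivariate normal PDF exactly as on the slide
    z = ((x - m1) / s1) ** 2 + ((y - m2) / s2) ** 2 \
        - 2 * rho * ((x - m1) / s1) * ((y - m2) / s2)
    return exp(-z / (2 * (1 - rho ** 2))) / (2 * pi * s1 * s2 * sqrt(1 - rho ** 2))

def norm_pdf(x, m, s):
    # univariate Normal(m, s^2) PDF
    return exp(-((x - m) ** 2) / (2 * s ** 2)) / (s * sqrt(2 * pi))

# with rho = 0 the joint density is the product of the marginal densities
lhs = bvn_pdf(0.3, -0.8, 0.0, 1.0, 1.0, 2.0, 0.0)
rhs = norm_pdf(0.3, 0.0, 1.0) * norm_pdf(-0.8, 1.0, 2.0)
print(abs(lhs - rhs) < 1e-12)                                        # True
```

For ρ ≠ 0 the joint density no longer factors, which is exactly why the correlation parameter carries the dependence between X and Y.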