Instructor's Solutions Manual
Third Edition
Fundamentals of Probability With Stochastic Processes
SAEED GHAHRAMANI
Western New England College
Upper Saddle River, New Jersey 07458

Contents

1 Axioms of Probability
1.2 Sample Space and Events
1.4 Basic Theorems
1.7 Random Selection of Points from Intervals
Review Problems

2 Combinatorial Methods
2.2 Counting Principle
2.3 Permutations
2.4 Combinations
2.5 Stirling's Formula
Review Problems

3 Conditional Probability and Independence
3.1 Conditional Probability
3.2 Law of Multiplication
3.3 Law of Total Probability
3.4 Bayes' Formula
3.5 Independence
3.6 Applications of Probability to Genetics
Review Problems

4 Distribution Functions and Discrete Random Variables
4.2 Distribution Functions
4.3 Discrete Random Variables
4.4 Expectations of Discrete Random Variables
4.5 Variances and Moments of Discrete Random Variables
4.6 Standardized Random Variables
Review Problems

5 Special Discrete Distributions
5.1 Bernoulli and Binomial Random Variables
5.2 Poisson Random Variable
5.3 Other Discrete Random Variables
Review Problems

6 Continuous Random Variables
6.1 Probability Density Functions
6.2 Density Function of a Function of a Random Variable
6.3 Expectations and Variances
Review Problems

7 Special Continuous Distributions
7.1 Uniform Random Variable
7.2 Normal Random Variable
7.3 Exponential Random Variables
7.4 Gamma Distribution
7.5 Beta Distribution
7.6 Survival Analysis and Hazard Function
Review Problems

8 Bivariate Distributions
8.1 Joint Distribution of Two Random Variables
8.2 Independent Random Variables
8.3 Conditional Distributions
8.4 Transformations of Two Random Variables
Review Problems

9 Multivariate Distributions
9.1 Joint Distribution of n > 2 Random Variables
9.2 Order Statistics
9.3 Multinomial Distributions
Review Problems

10 More Expectations and Variances
10.1 Expected Values of Sums of Random Variables
10.2 Covariance
10.3 Correlation
10.4 Conditioning on Random Variables
10.5 Bivariate Normal Distribution
Review Problems

11 Sums of Independent Random Variables and Limit Theorems
11.1 Moment-Generating Functions
11.2 Sums of Independent Random Variables
11.3 Markov and Chebyshev Inequalities
11.4 Laws of Large Numbers
11.5 Central Limit Theorem
Review Problems

12 Stochastic Processes
12.2 More on Poisson Processes
12.3 Markov Chains
12.4 Continuous-Time Markov Chains
12.5 Brownian Motion
Review Problems

Chapter 1 Axioms of Probability

1.2 SAMPLE SPACE AND EVENTS

1. For $1 \le i, j \le 3$, by $(i, j)$ we mean that Vann's card number is $i$ and Paul's card number is $j$. Clearly, $A = \{(1,2), (1,3), (2,3)\}$ and $B = \{(2,1), (3,1), (3,2)\}$.
(a) Since $A \cap B = \emptyset$, the events $A$ and $B$ are mutually exclusive.
(b) None of $(1,1)$, $(2,2)$, $(3,3)$ belongs to $A \cup B$. Hence $A \cup B$ not being the sample space shows that $A$ and $B$ are not complements of one another.
2. $S = \{RRR, RRB, RBR, RBB, BRR, BRB, BBR, BBB\}$.
3. $\{x : 0$ […] $0$. This proves that Mia is wrong. Note that the probability of the simultaneous occurrence of any number of $A_i^c$'s is nonzero. Furthermore, consider any set $E$ consisting of $n$ ($n \le 50$) of the $A_i^c$'s. It is reasonable to assume that the probability of the simultaneous occurrence of the events of $E$ is strictly less than the probability of the simultaneous occurrence of the events of any subset of $E$.
Using these facts, it is straightforward to conclude from the inclusion-exclusion principle that
$$P\Big(\bigcup_{i=1}^{50} A_i^c\Big) < \sum_{i=1}^{50} P(A_i^c) = \sum_{i=1}^{50} \frac{1}{50} = 1.$$
Thus, by De Morgan's law,
$$P\Big(\bigcap_{i=1}^{50} A_i\Big) = 1 - P\Big(\bigcup_{i=1}^{50} A_i^c\Big) > 1 - 1 = 0.$$
31. $Q$ satisfies Axioms 1 and 2, but not necessarily Axiom 3. So it is not, in general, a probability on $S$. Let $S = \{1, 2, 3\}$ and let $P(\{1\}) = P(\{2\}) = P(\{3\}) = 1/3$. Then $Q(\{1\}) = Q(\{2\}) = 1/9$, whereas $Q(\{1,2\}) = P(\{1,2\})^2 = 4/9$. Therefore, $Q(\{1,2\}) \ne Q(\{1\}) + Q(\{2\})$. $R$ is not a probability on $S$ because it does not satisfy Axiom 2; that is, $R(S) \ne 1$.
32. Let BRB mean that a blue hat is placed on the first player's head, a red hat on the second player's head, and a blue hat on the third player's head, with similar representations for other cases. The sample space is $S = \{BBB, BRB, BBR, BRR, RRR, RRB, RBR, RBB\}$. This shows that the probability that two of the players will have hats of the same color and the third player's hat will be of the opposite color is $6/8 = 3/4$. The following improvement, based on this observation, explained by Sara Robinson in the Tuesday, April 10, 2001, issue of the New York Times, is due to Professor Elwyn Berlekamp of the University of California at Berkeley. Three-fourths of the time, two of the players will have hats of the same color and the third player's hat will be the opposite color. The group can win every time this happens by using the following strategy: Once the game starts, each player looks at the other two players' hats.
If the two hats are different colors, he [or she] passes. If they are the same color, the player guesses his [or her] own hat is the opposite color. This way, every time the hat colors are distributed two and one, one player will guess correctly and the others will pass, and the group will win the game. When all the hats are the same color, however, all three players will guess incorrectly and the group will lose.

1.7 RANDOM SELECTION OF POINTS FROM INTERVALS

1. $\dfrac{30 - 10}{30 - 0} = \dfrac{2}{3}$.
2. $\dfrac{0.0635 - 0.04}{0.12 - 0.04} = 0.294$.
3. (a) False; in the experiment of choosing a point at random from the interval $(0, 1)$, let $A = (0, 1) - \{1/2\}$. $A$ is not the sample space, but $P(A) = 1$.
(b) False; in the same experiment $P(\{1/2\}) = 0$ while $\{1/2\} \ne \emptyset$.
4. $P(A \cup B) \ge P(A) = 1$, so $P(A \cup B) = 1$. This gives $P(AB) = P(A) + P(B) - P(A \cup B) = 1 + 1 - 1 = 1$.
5. The answer is $P(\{1, 2, \ldots, 1999\}) = \sum_{i=1}^{1999} P(\{i\}) = \sum_{i=1}^{1999} 0 = 0$.
6. For $i = 0, 1, 2, \ldots, 9$, the probability that $i$ appears as the first digit of the decimal representation of the selected point is the probability that the point falls into the interval $\big[\frac{i}{10}, \frac{i+1}{10}\big)$. Therefore, it equals
$$\frac{\frac{i+1}{10} - \frac{i}{10}}{1 - 0} = \frac{1}{10}.$$
This shows that all numerals are equally likely to appear as the first digit of the decimal representation of the selected point.
7. No, it is not. Let $S = \{w_1, w_2, \ldots\}$. Suppose that for some $p > 0$, $P(\{w_i\}) = p$, $i = 1, 2, \ldots$. Then, by Axioms 2 and 3, $\sum_{i=1}^{\infty} p = 1$. This is impossible.
8. Use induction. For $n = 1$, the theorem is trivial. Exercise 4 proves the theorem for $n = 2$. Suppose that the theorem is true for $n$.
We show it for $n + 1$:
$$P(A_1 A_2 \cdots A_n A_{n+1}) = P(A_1 A_2 \cdots A_n) + P(A_{n+1}) - P(A_1 A_2 \cdots A_n \cup A_{n+1}) = 1 + 1 - 1 = 1,$$
where $P(A_1 A_2 \cdots A_n) = 1$ is true by the induction hypothesis, and $P(A_1 A_2 \cdots A_n \cup A_{n+1}) \ge P(A_{n+1}) = 1$ implies that $P(A_1 A_2 \cdots A_n \cup A_{n+1}) = 1$.
9. (a) Clearly, $\frac12 \in \bigcap_{n=1}^{\infty} \big(\frac12 - \frac1{2n}, \frac12 + \frac1{2n}\big)$. If $x \in \bigcap_{n=1}^{\infty} \big(\frac12 - \frac1{2n}, \frac12 + \frac1{2n}\big)$, then, for all $n \ge 1$, $\frac12 - \frac1{2n}$ […] $c$, and $a = c$, it can be checked that there are 73, 73, and 27 cases in which $b^2 < 4ac$, respectively. Therefore, the desired probability is
$$\frac{73 + 73 + 27}{216} = \frac{173}{216}.$$

Chapter 2 Combinatorial Methods

2.2 COUNTING PRINCIPLE

1. The total number of six-digit numbers is $9 \times 10 \times 10 \times 10 \times 10 \times 10 = 9 \times 10^5$ since the first digit cannot be 0. The number of six-digit numbers without the digit five is $8 \times 9 \times 9 \times 9 \times 9 \times 9 = 8 \times 9^5$. Hence there are $9 \times 10^5 - 8 \times 9^5 = 427{,}608$ six-digit numbers that contain the digit five.
2. (a) $5^5 = 3125$. (b) $5^3 = 125$.
3. There are $26 \times 26 \times 26 = 17{,}576$ distinct sets of initials. Hence in any town with more than 17,576 inhabitants, there are at least two persons with the same initials. The answer to the question is therefore yes.
4. $4^{15} = 1{,}073{,}741{,}824$.
5. $\dfrac{2}{2^{23}} = \dfrac{1}{2^{22}} \approx 0.00000024$.
6. (a) $52^5 = 380{,}204{,}032$. (b) $52 \times 51 \times 50 \times 49 \times 48 = 311{,}875{,}200$.
7. $6/36 = 1/6$.
8. (a) $\dfrac{4 \times 3 \times 2 \times 2}{12 \times 8 \times 8 \times 4} = \dfrac{1}{64}$. (b) $1 - \dfrac{8 \times 5 \times 6 \times 2}{12 \times 8 \times 8 \times 4} = \dfrac{27}{32}$.
9. $\dfrac{1}{4^{15}} \approx 0.00000000093$.
10. $26 \times 25 \times 24 \times 10 \times 9 \times 8 = 11{,}232{,}000$.
11. There are $26^3 \times 10^2 = 1{,}757{,}600$ such codes; so the answer is positive.
12. $2^{nm}$.
13. $(2 + 1)(3 + 1)(2 + 1) = 36$. (See the solution to Exercise 24.)
14. There are $(2^6 - 1)2^3 = 504$ possible sandwiches. So the claim is true.
15. (a) $5^4 = 625$. (b) $5^4 - 5 \times 4 \times 3 \times 2 = 505$.
16. $2^{12} = 4096$.
17. $1 - \dfrac{48 \times 48 \times 48 \times 48}{52 \times 52 \times 52 \times 52} = 0.274$.
18. $10 \times 9 \times 8 \times 7 = 5040$.
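The complement count in Exercise 1 of this section can be sanity-checked by direct enumeration. The sketch below is not part of the original solution; the helper name `count_with_digit` is a hypothetical name chosen for illustration. It lists every six-digit number, counts those containing the digit five, and compares the result with $9 \times 10^5 - 8 \times 9^5$.

```python
def count_with_digit(digit: str, length: int) -> int:
    """Count positive integers with exactly `length` digits (no leading zero)
    whose decimal representation contains `digit`."""
    low, high = 10 ** (length - 1), 10 ** length
    return sum(1 for n in range(low, high) if digit in str(n))


# Complement counting as in Exercise 1:
# all six-digit numbers minus those that avoid the digit five.
by_formula = 9 * 10**5 - 8 * 9**5

# Brute-force enumeration over all 900,000 six-digit numbers.
by_enumeration = count_with_digit("5", 6)

print(by_formula, by_enumeration)
```

The enumeration is slow compared with the formula, but it makes no use of the counting principle, so agreement between the two is an independent check.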
(a) $9 \times 9 \times 8 \times 7 = 4536$; (b) $5040 - 1 \times 1 \times 8 \times 7 = 4984$.
19. $1 - \dfrac{(N - 1)^n}{N^n}$.
20. By Example 2.6, the probability is 0.507 that among Jenny and the next 22 people she meets randomly there are two with the same birthday. However, it is quite possible that one of these two persons is not Jenny. Let $n$ be the minimum number of people Jenny must meet so that the chances are better than even that someone shares her birthday. To find $n$, let $A$ denote the event that among the next $n$ people Jenny meets randomly someone's birthday is the same as Jenny's. We have
$$P(A) = 1 - P(A^c) = 1 - \frac{364^n}{365^n}.$$
To have $P(A) > 1/2$, we must find the smallest $n$ for which $1 - \dfrac{364^n}{365^n} > \dfrac12$, or $\dfrac{364^n}{365^n} < \dfrac12$. This gives
$$n > \frac{\log \frac12}{\log \frac{364}{365}} = 252.652.$$
Therefore, for the desired probability to be greater than 0.5, $n$ must be 253. To some this might seem counterintuitive.
21. Draw a tree diagram for the situation in which the salesperson goes from I to B first. In this situation, you will find that in 7 out of 23 cases, she will end up staying at island I. By symmetry, if she goes from I to H, D, or F first, in each of these situations in 7 out of 23 cases she will end up staying at island I. So there are $4 \times 23 = 92$ cases altogether and in $4 \times 7 = 28$ of them the salesperson will end up staying at island I. Since $28/92 = 0.3043$, the answer is 30.43%. Note that the probability that the salesperson will end up staying at island I is not 0.3043 because not all of the cases are equiprobable.
22. He is at 0 first, next he goes to 1 or $-1$. If at 1, then he goes to 0 or 2. If at $-1$, then he goes to 0 or $-2$, and so on. Draw a tree diagram. You will find that after walking 4 blocks, he is at one of the points 4, 2, 0, $-2$, or $-4$. There are 16 possible cases altogether. Of these 6 end up at 0, none at 1, and none at $-1$. Therefore, the answer to (a) is $6/16$ and the answer to (b) is 0.
23.
We can think of a number less than 1,000,000 as a six-digit number by allowing it to start with 0 or 0's. With this convention, it should be clear that there are $9^6$ such numbers without the digit five. Hence the desired probability is $1 - (9^6/10^6) = 0.469$.
24. Divisors of $N$ are of the form $p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, where $e_i = 0, 1, 2, \ldots, n_i$, $1 \le i \le k$. Therefore, the answer is $(n_1 + 1)(n_2 + 1) \cdots (n_k + 1)$.
25. There are $6^4$ possibilities altogether. In $5^4$ of these possibilities there is no 3. In $5^3$ of these possibilities only the first die lands 3. In $5^3$ of these possibilities only the second die lands 3, and so on. Therefore, the answer is
$$\frac{5^4 + 4 \times 5^3}{6^4} = 0.868.$$
26. Any subset of the set {salami, turkey, bologna, corned beef, ham, Swiss cheese, American cheese} except the empty set can form a reasonable sandwich. There are $2^7 - 1$ possibilities. To every sandwich a subset of the set {lettuce, tomato, mayonnaise} can also be added. Since there are 3 possibilities for bread, the final answer is $(2^7 - 1) \times 2^3 \times 3 = 3048$ and the advertisement is true.
27. $\dfrac{11 \times 10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4}{11^8} = 0.031$.
28. For $i = 1, 2, 3$, let $A_i$ be the event that no one departs at stop $i$. The desired quantity is $P(A_1^c A_2^c A_3^c) = 1 - P(A_1 \cup A_2 \cup A_3)$. Now
$$\begin{aligned}
P(A_1 \cup A_2 \cup A_3) &= P(A_1) + P(A_2) + P(A_3) - P(A_1 A_2) - P(A_1 A_3) - P(A_2 A_3) + P(A_1 A_2 A_3) \\
&= \frac{2^6}{3^6} + \frac{2^6}{3^6} + \frac{2^6}{3^6} - \frac{1}{3^6} - \frac{1}{3^6} - \frac{1}{3^6} + 0 = \frac{7}{27}.
\end{aligned}$$
Therefore, the desired probability is $1 - (7/27) = 20/27$.
29. For $0 \le i \le 9$, the sum of the first two digits is $i$ in $(i + 1)$ ways. Therefore, there are $(i + 1)^2$ numbers in the given set with the sum of the first two digits equal to the sum of the last two digits and equal to $i$. For $i = 10$, there are $9^2$ numbers in the given set with the sum of the first two digits equal to the sum of the last two digits and equal to 10. For $i = 11$, the corresponding number is $8^2$, and so on.
Therefore, there are altogether
$$1^2 + 2^2 + \cdots + 10^2 + 9^2 + 8^2 + \cdots + 1^2 = 670$$
numbers with the desired property and hence the answer is $670/10^4 = 0.067$.
30. Let $A$ be the event that the number selected contains at least one 0. Let $B$ be the event that it contains at least one 1 and $C$ be the event that it contains at least one 2. The desired quantity is $P(ABC) = 1 - P(A^c \cup B^c \cup C^c)$, where
$$\begin{aligned}
P(A^c \cup B^c \cup C^c) &= P(A^c) + P(B^c) + P(C^c) - P(A^c B^c) - P(A^c C^c) - P(B^c C^c) + P(A^c B^c C^c) \\
&= \frac{9^r}{9 \times 10^{r-1}} + \frac{8 \times 9^{r-1}}{9 \times 10^{r-1}} + \frac{8 \times 9^{r-1}}{9 \times 10^{r-1}} - \frac{8^r}{9 \times 10^{r-1}} - \frac{8^r}{9 \times 10^{r-1}} - \frac{7 \times 8^{r-1}}{9 \times 10^{r-1}} + \frac{7^r}{9 \times 10^{r-1}}.
\end{aligned}$$

2.3 PERMUTATIONS

1. The answer is $\dfrac{1}{4!} = \dfrac{1}{24} \approx 0.0417$.
2. $3! = 6$.
3. $\dfrac{8!}{3!\,5!} = 56$.
4. The probability that John will arrive right after Jim is $7!/8!$ (consider Jim and John as one arrival). Therefore, the answer is $1 - (7!/8!) = 0.875$.
Another Solution: If Jim is the last person, John will not arrive after Jim. Therefore, the remaining seven can arrive in $7!$ ways. If Jim is not the last person, the total number of possibilities in which John will not arrive right after Jim is $7 \times 6 \times 6!$. So the answer is
$$\frac{7! + 7 \times 6 \times 6!}{8!} = 0.875.$$
5. (a) $3^{12} = 531{,}441$. (b) $\dfrac{12!}{6!\,6!} = 924$. (c) $\dfrac{12!}{3!\,4!\,5!} = 27{,}720$.
6. ${}_6P_2 = 30$.
7. $\dfrac{20!}{4!\,3!\,5!\,8!} = 3{,}491{,}888{,}400$.
8. $\dfrac{(5 \times 4 \times 7) \times (4 \times 3 \times 6) \times (3 \times 2 \times 5)}{3!} = 50{,}400$.
9. There are $8!$ schedule possibilities. By symmetry, in $8!/2$ of them Dr. Richman's lecture precedes Dr. Chollet's and in $8!/2$ of them Dr. Chollet's lecture precedes Dr. Richman's. So the answer is $8!/2 = 20{,}160$.
10. $\dfrac{11!}{3!\,2!\,3!\,3!} = 92{,}400$.
11. $1 - (6!/6^6) = 0.985$.
12. (a) $\dfrac{11!}{4!\,4!\,2!} = 34{,}650$.
(b) Treating all P's as one entity, the answer is $\dfrac{10!}{4!\,4!} = 6300$.
(c) Treating all I's as one entity, the answer is $\dfrac{8!}{4!\,2!} = 840$.
(d) Treating all P's as one entity, and all I's as another entity, the answer is $\dfrac{7!}{4!} = 210$.
(e) By (a) and (c), the answer is $840/34650 = 0.024$.
13. $\Big(\dfrac{8!}{2!\,3!\,3!}\Big) \Big/ 6^8 = 0.000333$.
14. $\Big(\dfrac{9!}{3!\,3!\,3!}\Big) \Big/ 52^9 = 6.043 \times 10^{-13}$.
15. $\dfrac{m!}{(n + m)!}$.
16. Each girl and each boy has the same chance of occupying the 13th chair. So the answer is $12/20 = 0.6$. This can also be seen from $\dfrac{12 \times 19!}{20!} = \dfrac{12}{20} = 0.6$.
17. $\dfrac{12!}{12^{12}} = 0.000054$.
18. Look at the five math books as one entity. The answer is $\dfrac{5! \times 18!}{22!} = 0.00068$.
19. $1 - \dfrac{{}_9P_7}{9^7} = 0.962$.
20. $\dfrac{2 \times 5! \times 5!}{10!} = 0.0079$.
21. $n!/n^n$.
22. $1 - (6!/6^6) = 0.985$.
23. Suppose that A and B are not on speaking terms. ${}_{134}P_4$ committees can be formed in which neither A serves nor B; $4 \times {}_{134}P_3$ committees can be formed in which A serves and B does not. The same number of committees can be formed in which B serves and A does not. Therefore, the answer is ${}_{134}P_4 + 2(4 \times {}_{134}P_3) = 326{,}998{,}056$.
24. (a) $m^n$. (b) ${}_mP_n$. (c) $n!$.
25. $\Big(\dfrac{3 \cdot 8!}{2!\,3!\,2!\,1!}\Big) \Big/ 6^8 = 0.003$.
26. (a) $\dfrac{20!}{39 \times 37 \times 35 \times \cdots \times 5 \times 3 \times 1} = 7.61 \times 10^{-6}$.
(b) $\dfrac{1}{39 \times 37 \times 35 \times \cdots \times 5 \times 3 \times 1} = 3.13 \times 10^{-24}$.
27. Thirty people can sit in $30!$ ways at a round table. But for each way, if they rotate 30 times (everybody moving one chair to the left at a time) no new situations will be created. Thus in $30!/30 = 29!$ ways 15 married couples can sit at a round table. Think of each married couple as one entity and note that in $15!/15 = 14!$ ways 15 such entities can sit at a round table. We have that the 15 couples can sit at a round table in $(2!)^{15} \cdot 14!$ different ways because if the couples of each entity change positions between themselves, a new situation will be created. So the desired probability is
$$\frac{14!\,(2!)^{15}}{29!} = 3.23 \times 10^{-16}.$$
The answer to the second part is
$$\frac{24!\,(2!)^{5}}{29!} = 2.25 \times 10^{-6}.$$
28. In $13!$ ways the balls can be drawn one after another.
The number of those in which the first white appears in the second or in the fourth or in the sixth or in the eighth draw is calculated as follows. (These are Jack's turns.)
$$8 \times 5 \times 11! + 8 \times 7 \times 6 \times 5 \times 9! + 8 \times 7 \times 6 \times 5 \times 4 \times 5 \times 7! + 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 5 \times 5! = 2{,}399{,}846{,}400.$$
Therefore, the answer is $2{,}399{,}846{,}400/13! = 0.385$.

2.4 COMBINATIONS

1. $\dbinom{20}{6} = 38{,}760$.
2. $\displaystyle\sum_{i=51}^{100} \binom{100}{i} = 583{,}379{,}627{,}841{,}332{,}604{,}080{,}945{,}354{,}060 \approx 5.8 \times 10^{29}$.
3. $\dbinom{20}{6}\dbinom{25}{6} = 6{,}864{,}396{,}000$.
4. $\dfrac{\binom{12}{3}\binom{40}{2}}{\binom{52}{5}} = 0.066$.
5. $\dbinom{N-1}{n-1} \Big/ \dbinom{N}{n} = \dfrac{n}{N}$.
6. $\dbinom{5}{3}\dbinom{2}{2} = 10$.
7. $\dbinom{8}{3}\dbinom{5}{2}\dbinom{3}{3} = 560$.
8. $\dbinom{18}{6} + \dbinom{18}{4} = 21{,}624$.
9. $\dbinom{10}{5} \Big/ \dbinom{12}{7} = 0.318$.
10. The coefficient of $2^3 x^9$ in the expansion of $(2 + x)^{12}$ is $\dbinom{12}{9}$. Therefore, the coefficient of $x^9$ is $2^3 \dbinom{12}{9} = 1760$.
11. The coefficient of $(2x)^3 (-4y)^4$ in the expansion of $(2x - 4y)^7$ is $\dbinom{7}{4}$. Thus the coefficient of $x^3 y^4$ in this expansion is $2^3 (-4)^4 \dbinom{7}{4} = 71{,}680$.
12. $\dbinom{9}{3}\Big[\dbinom{6}{4} + 2\dbinom{6}{3}\Big] = 4620$.
13. (a) $\dbinom{10}{5} \Big/ 2^{10} = 0.246$; (b) $\displaystyle\sum_{i=5}^{10} \binom{10}{i} \Big/ 2^{10} = 0.623$.
14.
If their minimum is larger than 5, they are all from the set $\{6, 7, 8, \ldots, 20\}$. Hence the answer is $\dbinom{15}{5} \Big/ \dbinom{20}{5} = 0.194$.
15. (a) $\dfrac{\binom{6}{2}\binom{28}{4}}{\binom{34}{6}} = 0.228$; (b) $\dfrac{\binom{6}{6} + \binom{6}{6} + \binom{10}{6} + \binom{12}{6}}{\binom{34}{6}} = 0.00084$.
16. $\dfrac{\binom{50}{5}\binom{150}{45}}{\binom{200}{50}} = 0.00206$.
17. $\displaystyle\sum_{i=0}^{n} 2^i \binom{n}{i} = \sum_{i=0}^{n} \binom{n}{i} 2^i\, 1^{n-i} = (2 + 1)^n = 3^n$.
$\displaystyle\sum_{i=0}^{n} x^i \binom{n}{i} = \sum_{i=0}^{n} \binom{n}{i} x^i\, 1^{n-i} = (x + 1)^n$.
18. $\Big[\dbinom{6}{2} 5^4\Big] \Big/ 6^6 = 0.201$.
19. $2^{12} \Big/ \dbinom{24}{12} = 0.00151$.
20. Royal flush: $\dfrac{4}{\binom{52}{5}} = 0.0000015$.
Straight flush: $\dfrac{36}{\binom{52}{5}} = 0.000014$.
Four of a kind: $\dfrac{13 \times 12\binom{4}{1}}{\binom{52}{5}} = 0.00024$.
Full house: $\dfrac{13\binom{4}{3} \cdot 12\binom{4}{2}}{\binom{52}{5}} = 0.0014$.
Flush: $\dfrac{4\binom{13}{5} - 40}{\binom{52}{5}} = 0.002$.
Straight: $\dfrac{10(4)^5 - 40}{\binom{52}{5}} = 0.0039$.
Three of a kind: $\dfrac{13\binom{4}{3} \cdot \binom{12}{2} 4^2}{\binom{52}{5}} = 0.021$.
Two pairs: $\dfrac{\binom{13}{2}\binom{4}{2}\binom{4}{2} \cdot 11\binom{4}{1}}{\binom{52}{5}} = 0.048$.
One pair: $\dfrac{13\binom{4}{2} \cdot \binom{12}{3} 4^3}{\binom{52}{5}} = 0.42$.
None of the above: $1 -$ the sum of all of the above cases $= 0.5034445$.
21. The desired probability is $\dfrac{\binom{12}{6}\binom{12}{6}}{\binom{24}{12}} = 0.3157$.
22. The answer is the solution of the equation $\dbinom{x}{3} = 20$. This equation is equivalent to $x(x - 1)(x - 2) = 120$ and its solution is $x = 6$.
23. There are $9 \times 10^3 = 9000$ four-digit numbers. From every 4-combination of the set $\{0, 1, \ldots, 9\}$, exactly one four-digit number can be constructed in which its ones place is less than its tens place, its tens place is less than its hundreds place, and its hundreds place is less than its thousands place. Therefore, the number of such four-digit numbers is $\dbinom{10}{4} = 210$. Hence the desired probability is 0.023333.
24.
$$\begin{aligned}
(x + y + z)^2 &= \sum_{n_1 + n_2 + n_3 = 2} \frac{n!}{n_1!\, n_2!\, n_3!}\, x^{n_1} y^{n_2} z^{n_3} \\
&= \frac{2!}{2!\,0!\,0!} x^2 y^0 z^0 + \frac{2!}{0!\,2!\,0!} x^0 y^2 z^0 + \frac{2!}{0!\,0!\,2!} x^0 y^0 z^2 + \frac{2!}{1!\,1!\,0!} x^1 y^1 z^0 + \frac{2!}{1!\,0!\,1!} x^1 y^0 z^1 + \frac{2!}{0!\,1!\,1!} x^0 y^1 z^1 \\
&= x^2 + y^2 + z^2 + 2xy + 2xz + 2yz.
\end{aligned}$$
25. The coefficient of $(2x)^2 (-y)^3 (3z)^2$ in the expansion of $(2x - y + 3z)^7$ is $\dfrac{7!}{2!\,3!\,2!}$. Thus the coefficient of $x^2 y^3 z^2$ in this expansion is $2^2 (-1)^3 (3)^2 \dfrac{7!}{2!\,3!\,2!} = -7560$.
26. The coefficient of $(2x)^3 (-y)^7 (3)^3$ in the expansion of $(2x - y + 3)^{13}$ is $\dfrac{13!}{3!\,7!\,3!}$. Therefore, the coefficient of $x^3 y^7$ in this expansion is $2^3 (-1)^7 (3)^3 \dfrac{13!}{3!\,7!\,3!} = -7{,}413{,}120$.
27. In $\dfrac{52!}{13!\,13!\,13!\,13!} = \dfrac{52!}{(13!)^4}$ ways 52 cards can be dealt among four people. Hence the sample space contains $52!/(13!)^4$ points. Now in $4!$ ways the four different suits can be distributed among the players; thus the desired probability is $4!/[52!/(13!)^4] \approx 4.47 \times 10^{-28}$.
28.
The theorem is valid for $k = 2$; it is the binomial expansion. Suppose that it is true for all integers $\le k - 1$. We show it for $k$. By the binomial expansion,
$$\begin{aligned}
(x_1 + x_2 + \cdots + x_k)^n &= \sum_{n_1=0}^{n} \binom{n}{n_1} x_1^{n_1} (x_2 + \cdots + x_k)^{n - n_1} \\
&= \sum_{n_1=0}^{n} \binom{n}{n_1} x_1^{n_1} \sum_{n_2 + n_3 + \cdots + n_k = n - n_1} \frac{(n - n_1)!}{n_2!\, n_3! \cdots n_k!}\, x_2^{n_2} x_3^{n_3} \cdots x_k^{n_k} \\
&= \sum_{n_1 + n_2 + \cdots + n_k = n} \binom{n}{n_1} \frac{(n - n_1)!}{n_2!\, n_3! \cdots n_k!}\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k} \\
&= \sum_{n_1 + n_2 + \cdots + n_k = n} \frac{n!}{n_1!\, n_2! \cdots n_k!}\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}.
\end{aligned}$$
29. We must have 8 steps. Since the distance from M to L is ten 5-centimeter intervals and the first step is made at M, there are 9 spots left at which the remaining 7 steps can be made. So the answer is $\dbinom{9}{7} = 36$.
30. (a) $\dfrac{\binom{2}{1}\binom{98}{49} + \binom{98}{48}}{\binom{100}{50}} = 0.753$; (b) $2^{50} \Big/ \dbinom{100}{50} = 1.16 \times 10^{-14}$.
31. (a) It must be clear that
$$\begin{aligned}
n_1 &= \binom{n}{2}, \\
n_2 &= \binom{n_1}{2} + n\,n_1, \\
n_3 &= \binom{n_2}{2} + n_2 (n + n_1), \\
n_4 &= \binom{n_3}{2} + n_3 (n + n_1 + n_2), \\
&\ \ \vdots \\
n_k &= \binom{n_{k-1}}{2} + n_{k-1} (n + n_1 + \cdots + n_{k-2}).
\end{aligned}$$
(b) For $n = 25{,}000$, successive calculations of the $n_k$'s yield
$n_1 = 312{,}487{,}500$,
$n_2 = 48{,}832{,}030{,}859{,}381{,}250$,
$n_3 = 1{,}192{,}283{,}634{,}186{,}401{,}370{,}231{,}933{,}886{,}715{,}625$,
$n_4 = 710{,}770{,}132{,}174{,}366{,}339{,}321{,}713{,}883{,}042{,}336{,}781{,}236{,}550{,}151{,}462{,}446{,}793{,}456{,}831{,}056{,}250$.
For $n = 25{,}000$, the total number of all possible hybrids in the first four generations, $n_1 + n_2 + n_3 + n_4$, is 710,770,132,174,366,339,321,713,883,042,337,973,520,184,337,863,865,857,421,889,665,625.
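The successive values in part (b) are too large for hand calculation, but the recurrence of part (a) is easy to evaluate with arbitrary-precision integers. The following is a minimal sketch, not part of the original solution; the function name `hybrid_counts` and its parameters are hypothetical names chosen for illustration, and the recurrence is read as each generation pairing members of the previous generation with one another and with every earlier generation, which is what the formulas for $n_2$, $n_3$, and $n_4$ above show.

```python
from math import comb


def hybrid_counts(n: int, generations: int) -> list:
    """Return [n_1, ..., n_generations] using the recurrence of part (a):
    n_1 = C(n, 2) and n_k = C(n_{k-1}, 2) + n_{k-1} * (n + n_1 + ... + n_{k-2})."""
    counts = []
    prev = n     # n_{k-1}; only used from the second generation on
    earlier = n  # n + n_1 + ... + n_{k-2}: everything older than `prev`
    for k in range(1, generations + 1):
        if k == 1:
            cur = comb(n, 2)
        else:
            cur = comb(prev, 2) + prev * earlier
            earlier += prev  # `prev` now joins the older generations
        counts.append(cur)
        prev = cur
    return counts


counts = hybrid_counts(25_000, 4)
for k, value in enumerate(counts, start=1):
    print(f"n_{k} = {value:,}")
print(f"total = {sum(counts):,}")
```

Python integers do not overflow, so no special big-number library is needed; the printed values can be compared digit by digit with those listed in part (b).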
This number is approximately $710 \times 10^{63}$.
32. For $n = 1$, we have the trivial identity
$$x + y = \binom{1}{0} x^0 y^{1-0} + \binom{1}{1} x^1 y^{1-1}.$$
Assume that
$$(x + y)^{n-1} = \sum_{i=0}^{n-1} \binom{n-1}{i} x^i y^{n-1-i}.$$
This gives
$$\begin{aligned}
(x + y)^n &= (x + y) \sum_{i=0}^{n-1} \binom{n-1}{i} x^i y^{n-1-i} \\
&= \sum_{i=0}^{n-1} \binom{n-1}{i} x^{i+1} y^{n-1-i} + \sum_{i=0}^{n-1} \binom{n-1}{i} x^i y^{n-i} \\
&= \sum_{i=1}^{n} \binom{n-1}{i-1} x^i y^{n-i} + \sum_{i=0}^{n-1} \binom{n-1}{i} x^i y^{n-i} \\
&= x^n + \sum_{i=1}^{n-1} \Big[\binom{n-1}{i-1} + \binom{n-1}{i}\Big] x^i y^{n-i} + y^n \\
&= x^n + \sum_{i=1}^{n-1} \binom{n}{i} x^i y^{n-i} + y^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}.
\end{aligned}$$
33. The desired probability is computed as follows.
$$\binom{12}{6}\Big[\binom{30}{2}\binom{28}{2}\binom{26}{2}\binom{24}{2}\binom{22}{2}\binom{20}{2}\binom{18}{3}\binom{15}{3}\binom{12}{3}\binom{9}{3}\binom{6}{3}\binom{3}{3}\Big] \Big/ 12^{30} \approx 0.000346.$$
34. (a) $\dfrac{\binom{10}{6} 2^6}{\binom{20}{6}} = 0.347$; (b) $\dfrac{\binom{10}{1}\binom{9}{4} 2^4}{\binom{20}{6}} = 0.520$; (c) $\dfrac{\binom{10}{2}\binom{8}{2} 2^2}{\binom{20}{6}} = 0.130$; (d) $\dfrac{\binom{10}{3}}{\binom{20}{6}} = 0.0031$.
35.
$\dfrac{\binom{26}{13}\binom{26}{13}}{\binom{52}{26}} = 0.218$.
36. Let a 6-element combination of a set of integers be denoted by $\{a_1, a_2, \ldots, a_6\}$, where $a_1 < a_2 < \cdots$ […] $P_{N-1}$ if and only if $(N - t)(N - n) > N(N - t - n + m)$ or, equivalently, if and only if $N \le nt/m$. So $P_N$ is increasing if and only if $N \le nt/m$. This shows that the maximum of $P_N$ is at $[nt/m]$, where by $[nt/m]$ we mean the greatest integer $\le nt/m$.
45. The sample space consists of $(n + 1)^4$ elements. Let the elements of the sample be denoted by $x_1$, $x_2$, $x_3$, and $x_4$. To count the number of samples $(x_1, x_2, x_3, x_4)$ for which $x_1 + x_2 = x_3 + x_4$, let $y_3 = n - x_3$ and $y_4 = n - x_4$. Then $y_3$ and $y_4$ are also random elements from the set $\{0, 1, 2, \ldots, n\}$. The number of cases in which $x_1 + x_2 = x_3 + x_4$ is identical to the number of cases in which $x_1 + x_2 + y_3 + y_4 = 2n$. By Example 2.23, the number of nonnegative integer solutions to this equation is $\dbinom{2n + 3}{3}$. However, this also counts the solutions in which one of $x_1$, $x_2$, $y_3$, and $y_4$ is greater than $n$. Because of the restrictions $0 \le x_1, x_2, y_3, y_4 \le n$, we must subtract, from this number, the total number of the solutions in which one of $x_1$, $x_2$, $y_3$, and $y_4$ is greater than $n$. Such solutions are obtained by finding all nonnegative integer solutions of the equation $x_1 + x_2 + y_3 + y_4 = n - 1$, and then adding $n + 1$ to exactly one of $x_1$, $x_2$, $y_3$, and $y_4$. Their count is 4 times the number of nonnegative integer solutions of $x_1 + x_2 + y_3 + y_4 = n - 1$; that is, $4\dbinom{n + 2}{3}$. Therefore, the desired probability is
$$\frac{\binom{2n+3}{3} - 4\binom{n+2}{3}}{(n + 1)^4} = \frac{2n^2 + 4n + 3}{3(n + 1)^3}.$$
46.
(a) The $n - m$ unqualified applicants are "ringers." The experiment is not affected by their inclusion, so that the probability of any one of the qualified applicants being selected is the same as it would be if there were only qualified applicants. That is, $1/m$. This is because in a random arrangement of $m$ qualified applicants, the probability that a given applicant is the first one is $1/m$.
(b) Let $A$ be the event that a given qualified applicant is hired. We will show that $P(A) = 1/m$. Let $E_i$ be the event that the given qualified applicant is the $i$th applicant interviewed, and he or she is the first qualified applicant to be interviewed. Clearly,
$$P(A) = \sum_{i=1}^{n-m+1} P(E_i), \quad \text{where} \quad P(E_i) = \frac{{}_{n-m}P_{i-1} \cdot 1 \cdot (n - i)!}{n!}.$$
Therefore,
$$\begin{aligned}
P(A) &= \sum_{i=1}^{n-m+1} \frac{{}_{n-m}P_{i-1} \cdot (n - i)!}{n!} = \sum_{i=1}^{n-m+1} \frac{(n - m)!}{(n - m - i + 1)!} \cdot \frac{(n - i)!}{n!} \\
&= \sum_{i=1}^{n-m+1} \frac{1}{m!} \cdot \frac{1}{\dfrac{n!}{m!\,(n - m)!}} \cdot \frac{(n - i)!}{(n - m - i + 1)!\,(m - 1)!} \cdot (m - 1)! \\
&= \sum_{i=1}^{n-m+1} \frac{1}{m} \cdot \frac{1}{\binom{n}{m}} \binom{n - i}{m - 1} = \frac{1}{m} \cdot \frac{1}{\binom{n}{m}} \sum_{i=1}^{n-m+1} \binom{n - i}{m - 1}. \qquad (4)
\end{aligned}$$
To calculate $\sum_{i=1}^{n-m+1} \binom{n-i}{m-1}$, note that $\binom{n-i}{m-1}$ is the coefficient of $x^{m-1}$ in the expansion of $(1 + x)^{n-i}$. Therefore, $\sum_{i=1}^{n-m+1} \binom{n-i}{m-1}$ is the coefficient of $x^{m-1}$ in the expansion of
$$\sum_{i=1}^{n-m+1} (1 + x)^{n-i} = \frac{(1 + x)^n - (1 + x)^{m-1}}{x}.$$
This shows that $\sum_{i=1}^{n-m+1} \binom{n-i}{m-1}$ is the coefficient of $x^m$ in the expansion of $(1 + x)^n - (1 + x)^{m-1}$, which is $\dbinom{n}{m}$. So (4) implies that
$$P(A) = \frac{1}{m} \cdot \frac{1}{\binom{n}{m}} \cdot \binom{n}{m} = \frac{1}{m}.$$
47.
Clearly, $N = 6^{10}$, $N(A_i) = 5^{10}$, $N(A_i A_j) = 4^{10}$, $i \ne j$, and so on. So $S_1$ has $\dbinom{6}{1}$ equal terms, $S_2$ has $\dbinom{6}{2}$ equal terms, and so on. Therefore, the solution is
$$6^{10} - \binom{6}{1} 5^{10} + \binom{6}{2} 4^{10} - \binom{6}{3} 3^{10} + \binom{6}{4} 2^{10} - \binom{6}{5} 1^{10} + \binom{6}{6} 0^{10} = 16{,}435{,}440.$$
48. $|A_0| = \dfrac12 \dbinom{n}{3}\dbinom{n-3}{3}$, $|A_1| = \dfrac12 \dbinom{n}{3}\dbinom{3}{1}\dbinom{n-3}{2}$, $|A_2| = \dfrac12 \dbinom{n}{3}\dbinom{3}{2}\dbinom{n-3}{1}$. The answer is
$$\frac{|A_0|}{|A_0| + |A_1| + |A_2|} = \frac{(n - 4)(n - 5)}{n^2 + 2}.$$
49. The coefficient of $x^n$ in $(1 + x)^{2n}$ is $\dbinom{2n}{n}$. Its coefficient in $(1 + x)^n (1 + x)^n$ is
$$\binom{n}{0}\binom{n}{n} + \binom{n}{1}\binom{n}{n-1} + \binom{n}{2}\binom{n}{n-2} + \cdots + \binom{n}{n}\binom{n}{0} = \binom{n}{0}^2 + \binom{n}{1}^2 + \binom{n}{2}^2 + \cdots + \binom{n}{n}^2,$$
since $\dbinom{n}{i} = \dbinom{n}{n-i}$, $0 \le i \le n$.
50. Consider a particular set of $k$ letters. Let $M$ be the number of possibilities in which only these $k$ letters are addressed correctly. The desired probability is the quantity $\dbinom{n}{k} M \Big/ n!$. All we have to do is find $M$. To do so, note that the remaining $n - k$ letters are all addressed incorrectly. For these $n - k$ letters, there are $n - k$ addresses.
But the addresses are written on the envelopes at random. The probability that none is addressed correctly is, on one hand, $M/(n-k)!$ and, on the other hand, by Example 2.24,
$$1 - \sum_{i=1}^{n-k} \frac{(-1)^{i-1}}{i!} = \sum_{i=2}^{n-k} \frac{(-1)^{i}}{i!}.$$
So $M$ satisfies
$$\frac{M}{(n-k)!} = \sum_{i=2}^{n-k} \frac{(-1)^{i}}{i!}, \quad \text{and hence} \quad M = (n-k)! \sum_{i=2}^{n-k} \frac{(-1)^{i}}{i!}.$$
The final answer is
$$\frac{\binom{n}{k} M}{n!} = \frac{\binom{n}{k} (n-k)! \sum_{i=2}^{n-k} \frac{(-1)^{i}}{i!}}{n!} = \frac{1}{k!} \sum_{i=2}^{n-k} \frac{(-1)^{i}}{i!}.$$

51. The set of all sequences of H's and T's of length $i$ with no successive H's is obtained either by adding a T to the tails of all such sequences of length $i - 1$, or a TH to the tails of all such sequences of length $i - 2$. Therefore, $x_i = x_{i-1} + x_{i-2}$, $i \ge 2$. Clearly, $x_1 = 2$ and $x_2 = 3$. For consistency, we define $x_0 = 1$. From the theory of recurrence relations we know that the solution of $x_i = x_{i-1} + x_{i-2}$ is of the form $x_i = A r_1^i + B r_2^i$, where $r_1$ and $r_2$ are the solutions of $r^2 = r + 1$. Therefore, $r_1 = \frac{1 + \sqrt{5}}{2}$ and $r_2 = \frac{1 - \sqrt{5}}{2}$, and so
$$x_i = A \left( \frac{1 + \sqrt{5}}{2} \right)^i + B \left( \frac{1 - \sqrt{5}}{2} \right)^i.$$
Using the initial conditions $x_0 = 1$ and $x_1 = 2$, we obtain $A = \frac{5 + 3\sqrt{5}}{10}$ and $B = \frac{5 - 3\sqrt{5}}{10}$. Hence the answer is
$$\frac{x_n}{2^n} = \frac{1}{2^n} \left[ \left( \frac{5 + 3\sqrt{5}}{10} \right) \left( \frac{1 + \sqrt{5}}{2} \right)^n + \left( \frac{5 - 3\sqrt{5}}{10} \right) \left( \frac{1 - \sqrt{5}}{2} \right)^n \right] = \frac{1}{10 \times 2^{2n}} \left[ \left( 5 + 3\sqrt{5} \right) \left( 1 + \sqrt{5} \right)^n + \left( 5 - 3\sqrt{5} \right) \left( 1 - \sqrt{5} \right)^n \right].$$

52. For this exercise, a solution is given by Abramson and Moser in the October 1970 issue of the American Mathematical Monthly.

2.5 STIRLING'S FORMULA

1. (a) $$\binom{2n}{n} \frac{1}{2^{2n}} = \frac{(2n)!}{n!\,n!} \cdot \frac{1}{2^{2n}} \sim \frac{\sqrt{4\pi n}\,(2n)^{2n} e^{-2n}}{(2\pi n)\, n^{2n} e^{-2n}\, 2^{2n}} = \frac{1}{\sqrt{\pi n}}.$$

(b) $$\frac{\left[ (2n)! \right]^3}{(4n)!\,(n!)^2} \sim \frac{\left[ \sqrt{4\pi n}\,(2n)^{2n} e^{-2n} \right]^3}{\sqrt{8\pi n}\,(4n)^{4n} e^{-4n}\,(2\pi n)\, n^{2n} e^{-2n}} = \frac{\sqrt{2}}{4^n}.$$

REVIEW PROBLEMS FOR CHAPTER 2

1. The desired quantity is equal to the number of subsets of all seven varieties of fruit minus 1 (the empty set); so it is $2^7 - 1 = 127$.

2. The number of choices Virginia has is equal to the number of subsets of $\{1, 2, 5, 10, 20\}$ minus 1 (for the empty set). So the answer is $2^5 - 1 = 31$.

3. $(6 \times 5 \times 4 \times 3)/6^4 = 0.278$.

4. $10 \big/ \binom{10}{2} = 0.222$.

5. $\dfrac{9!}{3!\,2!\,2!\,2!} = 7560$.

6. $5!/5 = 4! = 24$.

7. $3! \cdot 4! \cdot 4! \cdot 4! = 82{,}944$.

8. $1 - \dfrac{\binom{23}{6}}{\binom{30}{6}} = 0.83$.

9. Since the refrigerators are identical, the answer is 1.

10. $6! = 720$.

11. (Draw a tree diagram.) In 18 out of 52 possible cases the tournament ends because John wins 4 games without winning 3 in a row. So the answer is 34.62%.

12. Yes, it is because the probability of what happened is $1/7^2 = 0.02$.

13. $9^8 = 43{,}046{,}721$.

14. (a) $26 \times 25 \times 24 \times 23 \times 22 \times 21 = 165{,}765{,}600$;
(b) $26 \times 25 \times 24 \times 23 \times 22 \times 5 = 39{,}468{,}000$;
(c) $\binom{5}{2} 26 \binom{3}{1} 25 \binom{2}{1} 24 \binom{1}{1} 23 = 21{,}528{,}000$.

15. $\dfrac{\binom{6}{3} + \binom{6}{1} + \binom{6}{1} + \binom{6}{1}\binom{2}{1}\binom{2}{1}}{\binom{10}{3}} = 0.467$.

Another Solution: $\dfrac{\binom{6}{3} + \binom{6}{1}\binom{4}{2}}{\binom{10}{3}} = 0.467$.

16. $\dfrac{8 \times 4 \times {}_6P_4}{{}_8P_6} = 0.571$.

17. $1 - \dfrac{27^8}{28^8} = 0.252$.

18. $\dfrac{(3!/3)(5!)^3}{15!/15} = 0.000396$.

19. $3^{12} = 531{,}441$.

20.
$$\frac{\binom{4}{1}\binom{48}{12}\binom{3}{1}\binom{36}{12}\binom{2}{1}\binom{24}{12}\binom{1}{1}\binom{12}{12}}{\dfrac{52!}{13!\,13!\,13!\,13!}} = 0.1055.$$

21. Let $A_1$, $A_2$, $A_3$, and $A_4$ be the events that there is no professor, no associate professor, no assistant professor, and no instructor in the committee, respectively. The desired probability is
$$P(A_1^c A_2^c A_3^c A_4^c) = 1 - P(A_1 \cup A_2 \cup A_3 \cup A_4),$$
where $P(A_1 \cup A_2 \cup A_3 \cup A_4)$ is calculated using the inclusion-exclusion principle:
$$\begin{aligned}
P(A_1 \cup A_2 \cup A_3 \cup A_4) &= P(A_1) + P(A_2) + P(A_3) + P(A_4) \\
&\quad - P(A_1 A_2) - P(A_1 A_3) - P(A_1 A_4) - P(A_2 A_3) - P(A_2 A_4) - P(A_3 A_4) \\
&\quad + P(A_1 A_2 A_3) + P(A_1 A_3 A_4) + P(A_1 A_2 A_4) + P(A_2 A_3 A_4) - P(A_1 A_2 A_3 A_4) \\
&= \left[ 1 \Big/ \binom{34}{6} \right] \left[ \binom{28}{6} + \binom{28}{6} + \binom{24}{6} + \binom{22}{6} - \binom{22}{6} - \binom{18}{6} - \binom{16}{6} - \binom{18}{6} - \binom{16}{6} - \binom{12}{6} \right. \\
&\qquad \left. + \binom{12}{6} + \binom{6}{6} + \binom{10}{6} + \binom{6}{6} - 0 \right] = 0.621.
\end{aligned}$$
Therefore, the desired probability equals $1 - 0.621 = 0.379$.

22. $\dfrac{(15!)^2}{30!/(2!)^{15}} = 0.0002112$.

23. $(N - n + 1) \Big/ \binom{N}{n}$.

24.
(a) $\dfrac{\binom{4}{2}\binom{48}{24}}{\binom{52}{26}} = 0.390$;
(b) $\dfrac{\binom{40}{1}}{\binom{52}{13}} = 6.299 \times 10^{-11}$;
(c) $\dfrac{\binom{13}{5}\binom{39}{8}\binom{8}{8}\binom{31}{5}}{\binom{52}{13}\binom{39}{13}} = 0.00000261$.

25. $12!/(3!)^4 = 369{,}600$.

26. There is a one-to-one correspondence between all cases in which the eighth outcome obtained is not a repetition and all cases in which the first outcome obtained will not be repeated. The answer is
$$\frac{6 \times 5 \times 5 \times 5 \times 5 \times 5 \times 5 \times 5}{6 \times 6 \times 6 \times 6 \times 6 \times 6 \times 6 \times 6} = \left( \frac{5}{6} \right)^7 = 0.279.$$

27. There are $9 \times 10^3 = 9{,}000$ four-digit numbers. To count the number of desired four-digit numbers, note that if 0 is to be one of the digits, then the thousands place of the number must be 0, but this cannot be the case since the first digit of an $n$-digit number is nonzero. Keeping this in mind, it must be clear that from every 4-combination of the set $\{1, 2, \ldots, 9\}$, exactly one four-digit number can be constructed in which its ones place is greater than its tens place, its tens place is greater than its hundreds place, and its hundreds place is greater than its thousands place. Therefore, the number of such four-digit numbers is $\binom{9}{4} = 126$. Hence the desired probability is $126/9000 = 0.014$.

28. Since the sum of the digits of 100,000 is 1, we ignore 100,000 and assume that all of the numbers have five digits by placing 0's in front of those with fewer than five digits.
The following process establishes a one-to-one correspondence between such numbers $d_1 d_2 d_3 d_4 d_5$ with $\sum_{i=1}^{5} d_i = 8$ and placements of 8 identical objects into 5 distinguishable cells: Put $d_1$ of the objects into the first cell, $d_2$ of the objects into the second cell, $d_3$ into the third cell, and so on. Since this can be done in
$$\binom{8 + 5 - 1}{5 - 1} = \binom{12}{8} = 495$$
ways, the number of integers from the set $\{1, 2, 3, \ldots, 100000\}$ in which the sum of the digits is 8 is 495. Hence the desired probability is $495/100{,}000 = 0.00495$.

Chapter 3 Conditional Probability and Independence

3.1 CONDITIONAL PROBABILITY

1. $P(W \mid U) = \dfrac{P(UW)}{P(U)} = \dfrac{0.15}{0.25} = 0.60$.

2. Let $E$ be the event that A antigen is found in the blood of the randomly selected soldier. Let $F$ be the event that the blood type of the soldier is A. We have
$$P(F \mid E) = \frac{P(FE)}{P(E)} = \frac{0.41}{0.41 + 0.04} = 0.911.$$

3. $\dfrac{0.20}{0.32} = 0.625$.

4. The reduced sample space is $\{(1, 4), (2, 3), (3, 2), (4, 1), (4, 6), (5, 5), (6, 4)\}$; therefore, the desired probability is $1/7$.

5. $\dfrac{30 - 20}{30 - 15} = \dfrac{2}{3}$.

6. Both of the inequalities are equivalent to $P(AB) > P(A)P(B)$.

7. $\dfrac{1/3}{(1/3) + (1/2)} = \dfrac{2}{5}$.

8. $4/30 = 0.133$.

9. $$\frac{\dfrac{\binom{40}{2}\binom{65}{6}}{\binom{105}{8}}}{1 - \displaystyle\sum_{i=0}^{2} \dfrac{\binom{40}{8-i}\binom{65}{i}}{\binom{105}{8}}} = 0.239.$$

10. $$P(\alpha = i \mid \beta = 0) = \begin{cases} 1/19 & \text{if } i = 0 \\ 2/19 & \text{if } i = 1, 2, 3, \ldots, 9 \\ 0 & \text{if } i = 10, 11, 12, \ldots, 18. \end{cases}$$

11. Let $b^* g b$ mean that the oldest child of the family is a boy, the second oldest is a girl, the youngest is a boy, and the boy found in the family is the oldest child, with similar representations for other cases.
The reduced sample space is
$$S = \{ggb^*, gb^*g, b^*gg, b^*bg, bb^*g, gb^*b, gbb^*, bgb^*, b^*gb, b^*bb, bb^*b, bbb^*\}.$$
Note that the outcomes of the sample space are not equiprobable. We have that
$$\begin{aligned}
P(\{ggb^*\}) &= P(\{gb^*g\}) = P(\{b^*gg\}) = 1/7, \\
P(\{b^*bg\}) &= P(\{bb^*g\}) = 1/14, \\
P(\{gb^*b\}) &= P(\{gbb^*\}) = 1/14, \\
P(\{bgb^*\}) &= P(\{b^*gb\}) = 1/14, \\
P(\{b^*bb\}) &= P(\{bb^*b\}) = P(\{bbb^*\}) = 1/21.
\end{aligned}$$
The solutions to (a), (b), (c) are as follows.
(a) $P(\{bb^*g\}) = 1/14$;
(b) $P(\{bb^*g, gbb^*, bgb^*, bb^*b, bbb^*\}) = 13/42$;
(c) $P(\{b^*bg, bb^*g, gb^*b, gbb^*, bgb^*, b^*gb\}) = 3/7$.

12. $P(A) = 1$ implies that $P(A \cup B) = 1$. Hence, by $P(A \cup B) = P(A) + P(B) - P(AB)$, we have that $P(B) = P(AB)$. Therefore,
$$P(B \mid A) = \frac{P(AB)}{P(A)} = \frac{P(B)}{1} = P(B).$$

13. $P(A \mid B) = \dfrac{P(AB)}{b}$, where
$$P(AB) = P(A) + P(B) - P(A \cup B) \ge P(A) + P(B) - 1 = a + b - 1.$$

14. (a) $P(AB) \ge 0$, $P(B) > 0$. Therefore, $P(A \mid B) = \dfrac{P(AB)}{P(B)} \ge 0$.

(b) $P(S \mid B) = \dfrac{P(SB)}{P(B)} = \dfrac{P(B)}{P(B)} = 1$.

(c) $$P\left( \bigcup_{i=1}^{\infty} A_i \,\Big|\, B \right) = \frac{P\left( \left( \bigcup_{i=1}^{\infty} A_i \right) B \right)}{P(B)} = \frac{P\left( \bigcup_{i=1}^{\infty} A_i B \right)}{P(B)} = \frac{\sum_{i=1}^{\infty} P(A_i B)}{P(B)} = \sum_{i=1}^{\infty} \frac{P(A_i B)}{P(B)} = \sum_{i=1}^{\infty} P(A_i \mid B).$$
Note that $P\left( \bigcup_{i=1}^{\infty} A_i B \right) = \sum_{i=1}^{\infty} P(A_i B)$, since mutual exclusiveness of the $A_i$'s implies that of the $A_i B$'s; i.e., $A_i A_j = \emptyset$, $i \neq j$, implies that $(A_i B)(A_j B) = \emptyset$, $i \neq j$.

15.
The given inequalities imply that $P(EF) \ge P(GF)$ and $P(EF^c) \ge P(GF^c)$. Thus
$$P(E) = P(EF) + P(EF^c) \ge P(GF) + P(GF^c) = P(G).$$

16. Reduce the sample space: Marlon chooses two at random from six dramas and seven comedies. What is the probability that they are both comedies? The answer is $\binom{7}{2} \Big/ \binom{13}{2} = 0.269$.

17. Reduce the sample space: There are 21 crayons of which three are red. Seven of these crayons are selected at random and given to Marty. What is the probability that three of them are red? The answer is $\binom{18}{4} \Big/ \binom{21}{7} = 0.0263$.

18. (a) The reduced sample space is $S = \{1, 3, 5, 7, 9, \ldots, 9999\}$. There are 5000 elements in $S$. Since the set $\{5, 7, 9, 11, 13, 15, \ldots, 9999\}$ includes exactly $4998/3 = 1666$ odd numbers that are divisible by three, the reduced sample space has 1667 odd numbers that are divisible by 3. So the answer is $1667/5000 = 0.3334$.

(b) Let $O$ be the event that the number selected at random is odd. Let $F$ be the event that it is divisible by 5 and $T$ be the event that it is divisible by 3. The desired probability is calculated as follows.
$$P(F^c T^c \mid O) = 1 - P(F \cup T \mid O) = 1 - P(F \mid O) - P(T \mid O) + P(FT \mid O) = 1 - \frac{1000}{5000} - \frac{1667}{5000} + \frac{333}{5000} = 0.5332.$$

19. Let $A$ be the event that during this period he has hiked in Oregon Ridge Park at least once. Let $B$ be the event that during this period he has hiked in this park at least twice. We have
$$P(B \mid A) = \frac{P(B)}{P(A)},$$
where
$$P(A) = 1 - \frac{5^{10}}{6^{10}} = 0.838 \quad \text{and} \quad P(B) = 1 - \frac{5^{10}}{6^{10}} - \frac{10 \times 5^9}{6^{10}} = 0.515.$$
So the answer is $0.515/0.838 = 0.615$.

20. The numbers of 333 red and 583 blue chips are divisible by 3. Thus the reduced sample space has $333 + 583 = 916$ points. Of these numbers, $[1000/15] = 66$ belong to red balls and are divisible by 5 and $[1750/15] = 116$ belong to blue balls and are divisible by 5.
Thus the desired probability is $182/916 = 0.199$.

21. Reduce the sample space: There are two types of animals in a laboratory, 15 type I and 13 type II. Six animals are selected at random; what is the probability that at least two of them are type II? The answer is
$$1 - \frac{\binom{15}{6} + \binom{13}{1}\binom{15}{5}}{\binom{28}{6}} = 0.883.$$

22. Reduce the sample space: 30 students, of which 12 are French and nine are Korean, are divided randomly into two classes of 15 each. What is the probability that one of them has exactly four French and exactly three Korean students? The solution to this problem is
$$\frac{\binom{12}{4}\binom{9}{3}\binom{9}{8}}{\binom{30}{15}\binom{15}{15}} = 0.00241.$$

23. This sounds puzzling because apparently the only deduction from the name "Mary" is that one of the children is a girl. But the crucial difference between this and Example 3.2 is reflected in the implicit assumption that both girls cannot be Mary. That is, the same name cannot be used for two children in the same family. In fact, any other identifying feature that cannot be shared by both girls would do the trick.

3.2 LAW OF MULTIPLICATION

1. Let $G$ be the event that Susan is guilty. Let $L$ be the event that Robert will lie. The probability that Robert will commit perjury is
$$P(GL) = P(G)P(L \mid G) = (0.65)(0.25) = 0.1625.$$

2. The answer is $\dfrac{11}{14} \times \dfrac{10}{13} \times \dfrac{9}{12} \times \dfrac{8}{11} \times \dfrac{7}{10} \times \dfrac{6}{9} = 0.15$.

3. By the law of multiplication, the answer is
$$\frac{52}{52} \times \frac{50}{51} \times \frac{48}{50} \times \frac{46}{49} \times \frac{44}{48} \times \frac{42}{47} = 0.72.$$

4. (a) $\dfrac{8}{20} \times \dfrac{7}{19} \times \dfrac{6}{18} \times \dfrac{5}{17} = 0.0144$;

(b) $\dfrac{8}{20} \times \dfrac{7}{19} \times \dfrac{12}{18} + \dfrac{8}{20} \times \dfrac{12}{19} \times \dfrac{7}{18} + \dfrac{12}{20} \times \dfrac{8}{19} \times \dfrac{7}{18} + \dfrac{8}{20} \times \dfrac{7}{19} \times \dfrac{6}{18} = 0.344$.

5. (a) $\dfrac{6}{11} \times \dfrac{5}{10} \times \dfrac{5}{9} \times \dfrac{4}{8} \times \dfrac{4}{7} \times \dfrac{3}{6} \times \dfrac{3}{5} \times \dfrac{2}{4} \times \dfrac{2}{3} \times \dfrac{1}{2} \times \dfrac{1}{1} = 0.00216$.

(b) $\dfrac{5}{11} \times \dfrac{4}{10} \times \dfrac{3}{9} \times \dfrac{2}{8} \times \dfrac{1}{7} = 0.00216$.

6.
$$\frac{3}{8} \times \frac{5}{10} \times \frac{5}{13} \times \frac{8}{15} + \frac{5}{8} \times \frac{3}{11} \times \frac{8}{13} \times \frac{5}{16} = 0.0712.$$

7. Let $A_i$ be the event that the $i$th person draws the "you lose" paper. Clearly,
$$\begin{aligned}
P(A_1) &= \frac{1}{200}, \\
P(A_2) &= P(A_1^c A_2) = P(A_1^c) P(A_2 \mid A_1^c) = \frac{199}{200} \cdot \frac{1}{199} = \frac{1}{200}, \\
P(A_3) &= P(A_1^c A_2^c A_3) = P(A_1^c) P(A_2^c \mid A_1^c) P(A_3 \mid A_1^c A_2^c) = \frac{199}{200} \cdot \frac{198}{199} \cdot \frac{1}{198} = \frac{1}{200},
\end{aligned}$$
and so on. Therefore, $P(A_i) = 1/200$ for $1 \le i \le 200$. This means that it makes no difference if you draw first, last or anywhere in the middle. Here is Marilyn Vos Savant's intuitive solution to this problem:

It makes no difference if you draw first, last, or anywhere in the middle. Look at it this way: Say the robbers make everyone draw at once. You'd agree that everyone has the same chance of losing (one in 200), right? Taking turns just makes that same event happen in a slow and orderly fashion. Envision a raffle at a church with 200 people in attendance, each person buys a ticket. Some buy a ticket when they arrive, some during the event, and some just before the winner is drawn. It doesn't matter. At the party the end result is this: all 200 guests draw a slip of paper, and, regardless of when they look at the slips, the result will be identical: one will lose. You can't alter your chances by looking at your slip before anyone else does, or waiting until everyone else has looked at theirs.

8. Let $B$ be the event that a randomly selected person from the population at large has a poor credit report. Let $I$ be the event that the person selected at random will improve his or her credit rating within the next three years. We have
$$P(B \mid I) = \frac{P(BI)}{P(I)} = \frac{P(I \mid B)P(B)}{P(I)} = \frac{(0.30)(0.18)}{0.75} = 0.072.$$
The desired probability is $1 - 0.072 = 0.928$. Therefore, 92.8% of the people who will improve their credit records within the next three years are the ones with good credit ratings.

9.
For $1 \le n \le 39$, let $E_n$ be the event that none of the first $n - 1$ cards is a heart or the ace of spades. Let $F_n$ be the event that the $n$th card drawn is the ace of spades. Then the event of "no heart before the ace of spades" is $\bigcup_{n=1}^{39} E_n F_n$. Clearly, $\{E_n F_n, 1 \le n \le 39\}$ forms a sequence of mutually exclusive events. Hence
$$P\left( \bigcup_{n=1}^{39} E_n F_n \right) = \sum_{n=1}^{39} P(E_n F_n) = \sum_{n=1}^{39} P(E_n) P(F_n \mid E_n) = \sum_{n=1}^{39} \frac{\binom{38}{n-1}}{\binom{52}{n-1}} \times \frac{1}{53 - n} = \frac{1}{14},$$
a result which is not unexpected.

10. $P(F)P(E \mid F) = \dfrac{\binom{13}{3}\binom{39}{6}}{\binom{52}{9}} \times \dfrac{10}{43} = 0.059$.

11. By the law of multiplication,
$$P(A_n) = \frac{2}{3} \times \frac{3}{4} \times \frac{4}{5} \times \cdots \times \frac{n+1}{n+2} = \frac{2}{n+2}.$$
Now since $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots \supseteq A_n \supseteq A_{n+1} \supseteq \cdots$, by Theorem 1.8,
$$P\left( \bigcap_{i=1}^{\infty} A_i \right) = \lim_{n \to \infty} P(A_n) = 0.$$

3.3 LAW OF TOTAL PROBABILITY

1. $\dfrac{1}{2} \times 0.05 + \dfrac{1}{2} \times 0.0025 = 0.02625$.

2. $(0.16)(0.60) + (0.20)(0.40) = 0.176$.

3. $\dfrac{1}{3}(0.75) + \dfrac{1}{3}(0.68) + \dfrac{1}{3}(0.47) = 0.633$.

4. $\dfrac{12}{51} \times \dfrac{13}{52} + \dfrac{13}{51} \times \dfrac{39}{52} = \dfrac{1}{4}$.

5. $\dfrac{11}{50} \times \dfrac{\binom{13}{2}}{\binom{52}{2}} + \dfrac{12}{50} \times \dfrac{\binom{13}{1}\binom{39}{1}}{\binom{52}{2}} + \dfrac{13}{50} \times \dfrac{\binom{39}{2}}{\binom{52}{2}} = \dfrac{1}{4}$.

6. $(0.20)(0.40) + (0.35)(0.60) = 0.290$.

7. $(0.37)(0.80) + (0.63)(0.65) = 0.7055$.

8. $\dfrac{1}{6}(0.6) + \dfrac{1}{6}(0.5) + \dfrac{1}{6}(0.7) + \dfrac{1}{6}(0.9) + \dfrac{1}{6}(0.7) + \dfrac{1}{6}(0.8) = 0.7$.

9. $(0.50)(0.04) + (0.30)(0.02) + (0.20)(0.04) = 0.034$.

10. Let $B$ be the event that the randomly selected child from the countryside is a boy. Let $E$ be the event that the randomly selected child is the first child of the family and $F$ be the event that he or she is the second child of the family.
Clearly, $P(E) = 2/3$ and $P(F) = 1/3$. By the law of total probability,
$$P(B) = P(B \mid E)P(E) + P(B \mid F)P(F) = \frac{1}{2} \times \frac{2}{3} + \frac{1}{2} \times \frac{1}{3} = \frac{1}{2}.$$
Therefore, assuming that sex distributions are equally probable, in the Chinese countryside, the distribution of sexes will remain equal. Here is Marilyn Vos Savant's intuitive solution to this problem:

The distribution of sexes will remain roughly equal. That's because, no matter how many or how few children are born anywhere, anytime, with or without restriction, half will be boys and half will be girls: Only the act of conception (not the government!) determines their sex. One can demonstrate this mathematically. (In this example, we'll assume that women with firstborn girls will always have a second child.) Let's say 100 women give birth, half to boys and half to girls. The half with boys must end their families. There are now 50 boys and 50 girls. The half with girls (50) give birth again, half to boys and half to girls. This adds 25 boys and 25 girls, so there are now 75 boys and 75 girls. Now all must end their families. So the result of the policy is that there will be fewer children in number, but the boy/girl ratio will not be affected.

11. The probability that the first person gets a gold coin is $3/5$. The probability that the second person gets a gold coin is
$$\frac{2}{4} \times \frac{3}{5} + \frac{3}{4} \times \frac{2}{5} = \frac{3}{5}.$$
The probability that the third person gets a gold coin is
$$\frac{3}{5} \times \frac{2}{4} \times \frac{1}{3} + \frac{3}{5} \times \frac{2}{4} \times \frac{2}{3} + \frac{2}{5} \times \frac{3}{4} \times \frac{2}{3} + \frac{2}{5} \times \frac{1}{4} \times \frac{3}{3} = \frac{3}{5},$$
and so on. Therefore, they are all equal.

12. A Probabilistic Solution: Let $n$ be the number of adults in the town. Let $x$ be the number of men in the town. Then $n - x$ is the number of women in the town. Since the numbers of married men and married women are equal, we have
$$x \cdot \frac{7}{9} = (n - x) \cdot \frac{3}{5}.$$
This relation implies that $x = (27/62)n$. Therefore, the probability that a randomly selected adult is male is $(27/62)n \big/ n = 27/62$.
The probability that a randomly selected adult is female is $1 - (27/62) = 35/62$. Let $A$ be the event that a randomly selected adult is married. Let $M$ be the event that the randomly selected adult is a man, and let $W$ be the event that the randomly selected adult is a woman. By the law of total probability,
$$P(A) = P(A \mid M)P(M) + P(A \mid W)P(W) = \frac{7}{9} \cdot \frac{27}{62} + \frac{3}{5} \cdot \frac{35}{62} = \frac{42}{62} = \frac{21}{31} \approx 0.677.$$
Therefore, $21/31$ of the adults are married.

An Arithmetical Solution: The common numerator of the two fractions is 21. Hence $21/27$ of the men and $21/35$ of the women are married. We find the common numerator because the number of married men and the number of married women are equal. This shows that of every $27 + 35 = 62$ adults, $21 + 21 = 42$ are married. Hence $42/62 = 21/31$ of the adults in the town are married.

13. The answer is clearly 0.40. This can also be computed from
$$(0.40)(0.75) + (0.40)(0.25) = 0.40.$$

14. Let $A$ be the event that a randomly selected child is the $k$th born of his or her family. Let $B_j$ be the event that he or she is from a family with $j$ children. Then
$$P(A) = \sum_{j=k}^{c} P(A \mid B_j) P(B_j),$$
where, clearly, $P(A \mid B_j) = 1/j$. To find $P(B_j)$, note that there are $\alpha_j N$ families with $j$ children. Therefore, the total number of children in the world is $\sum_{i=0}^{c} i(\alpha_i N)$, of which $j(N\alpha_j)$ are from families with $j$ children. Hence
$$P(B_j) = \frac{j(N\alpha_j)}{\sum_{i=0}^{c} i(\alpha_i N)} = \frac{j\alpha_j}{\sum_{i=0}^{c} i\alpha_i}.$$
This shows that the desired fraction is given by
$$P(A) = \sum_{j=k}^{c} P(A \mid B_j) P(B_j) = \sum_{j=k}^{c} \frac{1}{j} \cdot \frac{j\alpha_j}{\sum_{i=0}^{c} i\alpha_i} = \sum_{j=k}^{c} \frac{\alpha_j}{\sum_{i=0}^{c} i\alpha_i} = \frac{\sum_{j=k}^{c} \alpha_j}{\sum_{i=0}^{c} i\alpha_i}.$$

15. $$Q(E \mid F) = \frac{Q(EF)}{Q(F)} = \frac{P(EF \mid B)}{P(F \mid B)} = \frac{\dfrac{P(EFB)}{P(B)}}{\dfrac{P(FB)}{P(B)}} = \frac{P(EFB)}{P(FB)} = P(E \mid FB).$$

16.
Let $M$, $C$, and $F$ denote the events that the random student is married, is married to a student at the same campus, and is female, respectively. We have that
$$P(F \mid M) = P(F \mid MC)P(C \mid M) + P(F \mid MC^c)P(C^c \mid M) = (0.40)\frac{1}{3} + (0.30)\frac{2}{3} = 0.333.$$

17. Let $p(k, n)$ be the probability that exactly $k$ of the first $n$ seeds planted in the farm germinated. Using induction on $n$, we will show that $p(k, n) = 1/(n - 1)$ for all $k$ […]

[…] $m$, if the $i$th suitor is the best, then Avril chooses him if and only if among the first $i - 1$ suitors Avril dates, the best is one of the first $m$. So
$$P(E_m \mid B_i) = \frac{m}{i - 1}.$$
Therefore,
$$P(E_m) = \frac{1}{n} \sum_{i=m+1}^{n} \frac{m}{i - 1} = \frac{m}{n} \sum_{i=m+1}^{n} \frac{1}{i - 1}.$$
Now
$$\sum_{i=m+1}^{n} \frac{1}{i - 1} \approx \int_m^n \frac{1}{x}\,dx = \ln\left( \frac{n}{m} \right).$$
Thus
$$P(E_m) \approx \frac{m}{n} \ln\left( \frac{n}{m} \right).$$
To find the maximum of $P(E_m)$, consider the differentiable function
$$h(x) = \frac{x}{n} \ln\left( \frac{n}{x} \right).$$
Since
$$h'(x) = \frac{1}{n} \ln\left( \frac{n}{x} \right) - \frac{1}{n} = 0$$
implies that $x = n/e$, the maximum of $P(E_m)$ is at $m = [n/e]$, where $[n/e]$ is the greatest integer less than or equal to $n/e$. Hence Avril should dump the first $[n/e]$ suitors she dates and marry the first suitor she dates afterward who is better than all those preceding him. The probability that with such a strategy she selects the best suitor of all $n$ is approximately
$$h\left( \frac{n}{e} \right) = \frac{1}{e} \ln e = \frac{1}{e} \approx 0.368.$$

23. Let $\mathbf{N}$ be the set of nonnegative integers. The domain of $f$ is $\{(g, r) \in \mathbf{N} \times \mathbf{N} : 0 \le g \le N,\ 0 \le r \le N,\ 0$ […]

[…] $1.424) = P(X > 1.125) = \dfrac{1.25 - 1.125}{1.25 - 1} = \dfrac{1}{2}.$

5. $P(X < 1) = F(1-) = 1/2$.
$P(X = 1) = F(1) - F(1-) = 1/6$.
$P(1 \le X < 2) = F(2-) - F(1-) = 1/4$.
$P(X > 1/2) = 1 - F(1/2) = 1 - 1/2 = 1/2$.
$P(X = 3/2) = 0$.
$P(1$ […]

[…] $t) = 1 - P(|X| \le t) = 1 - \left[ 2F(t) - 1 \right] = 2\left[ 1 - F(t) \right].$
(c) $$\begin{aligned}
P(X = t) &= 1 + P(X = t) - 1 = P(X \le t) + P(X > t) + P(X = t) - 1 \\
&= P(X \le t) + P(X \ge t) - 1 = P(X \le t) + P(X \le -t) - 1 = F(t) + F(-t) - 1.
\end{aligned}$$

10. $F$ is a distribution function because $F(-\infty) = 0$, $F(\infty) = 1$, $F$ is right continuous, and $F'(t) = \dfrac{1}{\pi} e^{-t} > 0$ implies that $F$ is nondecreasing.

11. $F$ is a distribution function because $F(-\infty) = 0$, $F(\infty) = 1$, $F$ is right continuous, and $F'(t) = \dfrac{1}{(1 + t)^2} > 0$ implies that it is nondecreasing.

12. Clearly, $F$ is right continuous. On $t < 0$ and on $t \ge 0$, it is increasing, $\lim_{t \to \infty} F(t) = 1$, and $\lim_{t \to -\infty} F(t) = 0$. It looks like $F$ satisfies all of the conditions necessary to make it a distribution function. However, $F(0-) = 1/2 > F(0+) = 1/4$ shows that $F$ is not nondecreasing. Therefore, $F$ is not a probability distribution function.

13. Let the departure time of the last flight before the passenger arrives be 0. Then $Y$, the arrival time of the passenger, is a random number from $(0, 45)$. The waiting time is $X = 45 - Y$. We have that for $0 \le t \le 45$,
$$P(X \le t) = P(45 - Y \le t) = P(Y \ge 45 - t) = \frac{45 - (45 - t)}{45} = \frac{t}{45}.$$
So $F$, the distribution function of $X$, is
$$F(t) = \begin{cases} 0 & t < 0 \\ t/45 & 0 \le t < 45 \\ 1 & t \ge 45. \end{cases}$$

14. Let $X$ be the first two-digit number selected from the set $\{00, 01, 02, \ldots, 99\}$ which is between 4 and 18. Since for $i = 4, 5, \ldots, 18$,
$$P(X = i \mid 4 \le X \le 18) = \frac{P(X = i)}{P(4 \le X \le 18)} = \frac{1/100}{15/100} = \frac{1}{15},$$
we have that $X$ is chosen randomly from the set $\{4, 5, \ldots, 18\}$.

15. Let $X$ be the minimum of the three numbers; then
$$P(X < 5) = 1 - P(X \ge 5) = 1 - \frac{\binom{36}{3}}{\binom{40}{3}} = 0.277.$$

16. $$P(X^2 - 5X + 6 > 0) = P\big( (X - 2)(X - 3) > 0 \big) = P(X < 2) + P(X > 3) = \frac{2 - 0}{3 - 0} + 0 = \frac{2}{3}.$$

17. $$F(t) = \begin{cases} 0 & t < 0 \\ \dfrac{t}{1 - t} & 0 \le t < 1/2 \\ 1 & t \ge 1/2. \end{cases}$$

18.
The distribution function of $X$ is $F(t) = 0$ if $t < 1$; $F(t) = 1 - (89/90)^n$ if $n \le t$ […]

[…] $i$ occurs if and only if Liz has not played with Bob since $i$ Sundays ago, and the earliest she will play with him is next Sunday. Now the probability is $i/k$ that Liz will play with Bob if the last time they played was $i$ Sundays ago; hence
$$P(Z > i) = 1 - \frac{i}{k}, \quad i = 1, 2, \ldots, k - 1.$$
Let $p$ be the probability mass function of $Z$. Then, using this fact for $1 \le i \le k$, we obtain
$$p(i) = P(Z = i) = P(Z > i - 1) - P(Z > i) = \left( 1 - \frac{i - 1}{k} \right) - \left( 1 - \frac{i}{k} \right) = \frac{1}{k}.$$

13. The possible values of $X$ are 0, 1, 2, 3, 4, and 5. For $i$, $0 \le i \le 5$,
$$P(X = i) = \frac{\binom{5}{i}\, {}_6P_i \cdot {}_9P_{5-i} \cdot 10!}{15!}.$$
The numerical values of these probabilities are as follows.

| $i$ | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| $P(X = i)$ | 42/1001 | 252/1001 | 420/1001 | 240/1001 | 45/1001 | 2/1001 |

14. For $i = 0, 1, 2$, and 3, we have
$$P(X = i) = \frac{\binom{10}{i}\binom{10 - i}{6 - 2i}\, 2^{6 - 2i}}{\binom{20}{6}}.$$
The numerical values of these probabilities are as follows.

| $i$ | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| $p(i)$ | 112/323 | 168/323 | 42/323 | 1/323 |

15. Clearly,
$$P(X > n) = P\left( \bigcup_{i=1}^{6} E_i \right).$$
To calculate $P(E_1 \cup E_2 \cup \cdots \cup E_6)$, we use the inclusion-exclusion principle. To do so, we must calculate the probabilities of all possible intersections of the events from $E_1, \ldots, E_6$, add the probabilities that are obtained by intersecting an odd number of events, and subtract all the probabilities that are obtained by intersecting an even number of events. Clearly, there are $\binom{6}{1}$ terms of the form $P(E_i)$, $\binom{6}{2}$ terms of the form $P(E_i E_j)$, $\binom{6}{3}$ terms of the form $P(E_i E_j E_k)$, and so on. Now for all $i$, $P(E_i) = (5/6)^n$; for all $i$ and $j$, $P(E_i E_j) = (4/6)^n$; for all $i$, $j$, and $k$, $P(E_i E_j E_k) = (3/6)^n$; and so on.
Thus
$$\begin{aligned}
P(X > n) &= P(E_1 \cup E_2 \cup \cdots \cup E_6) \\
&= \binom{6}{1}\left( \frac{5}{6} \right)^n - \binom{6}{2}\left( \frac{4}{6} \right)^n + \binom{6}{3}\left( \frac{3}{6} \right)^n - \binom{6}{4}\left( \frac{2}{6} \right)^n + \binom{6}{5}\left( \frac{1}{6} \right)^n \\
&= 6\left( \frac{5}{6} \right)^n - 15\left( \frac{4}{6} \right)^n + 20\left( \frac{3}{6} \right)^n - 15\left( \frac{2}{6} \right)^n + 6\left( \frac{1}{6} \right)^n.
\end{aligned}$$
Let $p$ be the probability mass function of $X$. The set of all possible values of $X$ is $\{6, 7, 8, \ldots\}$, and
$$\begin{aligned}
p(n) &= P(X = n) = P(X > n - 1) - P(X > n) \\
&= \left( \frac{5}{6} \right)^{n-1} - 5\left( \frac{4}{6} \right)^{n-1} + 10\left( \frac{3}{6} \right)^{n-1} - 10\left( \frac{2}{6} \right)^{n-1} + 5\left( \frac{1}{6} \right)^{n-1}, \quad n \ge 6.
\end{aligned}$$

16. Put the students in some random order. Suppose that the first two students form the first team, the third and fourth students form the second team, the fifth and sixth students form the third team, and so on. Let F stand for "female" and M stand for "male." Since our only concern is the gender of the students, the total number of ways we can form 13 teams, each consisting of two students, is equal to the number of distinguishable permutations of a sequence of 23 M's and three F's. By Theorem 2.4, this number is $\dfrac{26!}{23!\,3!} = \binom{26}{3}$. The set of possible values of the random variable $X$ is $\{2, 4, \ldots, 26\}$. To calculate the probabilities associated with these values, note that for $k = 1, 2, \ldots, 13$, $X = 2k$ if and only if one of the following events occurs:

A: One of the first $k - 1$ teams is a female-female team, the $k$th team is either a male-female or a female-male team, and the remaining teams are all male-male teams.

B: The first $k - 1$ teams are all male-male teams, and the $k$th team is either a male-female team or a female-male team.
To find $P(A)$, note that for $A$ to occur, there are $k - 1$ possibilities for one of the first $k - 1$ teams to be a female-female team, two possibilities for the $k$th team (male-female and female-male), and one possibility for the remaining teams to be all male-male teams. Therefore,
$$P(A) = \frac{2(k - 1)}{\binom{26}{3}}.$$
To find $P(B)$, note that for $B$ to occur, there is one possibility for the first $k - 1$ teams to be all male-male, and two possibilities for the $k$th team: male-female and female-male. The number of possibilities for the remaining $13 - k$ teams is equal to the number of distinguishable permutations of two F's and $(26 - 2k) - 2$ M's, which, by Theorem 2.4, is $\dfrac{(26 - 2k)!}{2!\,(26 - 2k - 2)!} = \binom{26 - 2k}{2}$. Therefore,
$$P(B) = \frac{2\binom{26 - 2k}{2}}{\binom{26}{3}}.$$
Hence, for $1 \le k \le 13$,
$$P(X = 2k) = P(A) + P(B) = \frac{2(k - 1) + 2\binom{26 - 2k}{2}}{\binom{26}{3}} = \frac{1}{650}k^2 - \frac{1}{26}k + \frac{81}{325}.$$

4.4 EXPECTATIONS OF DISCRETE RANDOM VARIABLES

1. Yes, of course there is a fallacy in Dickens' argument. If, in England, at that time there were exactly two train accidents each month, then Dickens would have been right. Usually, for all $n > 0$ and for any two given days, the probability of $n$ train accidents on day 1 is equal to the probability of $n$ accidents on day 2. Therefore, in all likelihood the risk of train accidents on the final day in March and the risk of such accidents on the first day in April would have been about the same. The fact that train accidents occurred on random days, two per month on the average, implies that in some months more than two and in other months two or fewer accidents were occurring.

2. Let $X$ be the fine that the citizen pays on a random day. Then
$$E(X) = 25(0.60) + 0(0.40) = 15.$$
Therefore, it is much better to park legally.

3.
The expected value of the winning amount is
$$30\Bigl(\frac{4000}{2{,}000{,}000}\Bigr) + 800\Bigl(\frac{500}{2{,}000{,}000}\Bigr) + 1{,}200{,}000\Bigl(\frac{1}{2{,}000{,}000}\Bigr) = 0.86.$$
Considering the cost of the ticket, the expected value of the player's gain in one game is $-1 + 0.86 = -0.14$.

4. Let $X$ be the amount that the player gains in one game. Then
$$P(X=4) = \frac{\binom{4}{3}\binom{6}{1}}{\binom{10}{4}} = 0.114, \qquad P(X=9) = \frac{1}{\binom{10}{4}} = 0.005,$$
and $P(X=-1) = 1 - 0.114 - 0.005 = 0.881$. Thus $E(X) = -1(0.881) + 4(0.114) + 9(0.005) = -0.38$. Therefore, on the average, the player loses 38 cents per game.

5. Let $X$ be the net gain in one play of the game. The set of possible values of $X$ is $\{-8, -4, 0, 6, 10\}$. The probabilities associated with these values are
$$p(-8) = p(0) = \frac{1}{\binom{5}{2}} = \frac{1}{10}, \qquad p(-4) = \frac{\binom{2}{1}\binom{2}{1}}{\binom{5}{2}} = \frac{4}{10}, \qquad p(6) = p(10) = \frac{\binom{2}{1}}{\binom{5}{2}} = \frac{2}{10}.$$
Hence
$$E(X) = -8\cdot\frac{1}{10} - 4\cdot\frac{4}{10} + 0\cdot\frac{1}{10} + 6\cdot\frac{2}{10} + 10\cdot\frac{2}{10} = \frac{4}{5}.$$
Since $E(X) > 0$, the game is not fair.

6. The expected number of defective items is
$$\sum_{i=0}^{3} i\cdot\frac{\binom{5}{i}\binom{15}{3-i}}{\binom{20}{3}} = 0.75.$$

7. For $i = 4, 5, 6, 7$, let $X_i$ be the profit if $i$ magazines are ordered. Then
$$E(X_4) = \frac{4a}{3},$$
$$E(X_5) = \frac{2a}{3}\cdot\frac{6}{18} + \frac{5a}{3}\cdot\frac{12}{18} = \frac{4a}{3},$$
$$E(X_6) = 0\cdot\frac{6}{18} + a\cdot\frac{5}{18} + \frac{6a}{3}\cdot\frac{7}{18} = \frac{19a}{18},$$
$$E(X_7) = -\frac{2a}{3}\cdot\frac{6}{18} + \frac{a}{3}\cdot\frac{5}{18} + \frac{4a}{3}\cdot\frac{4}{18} + \frac{7a}{3}\cdot\frac{3}{18} = \frac{10a}{18}.$$
Since $4a/3 > 19a/18$ and $4a/3 > 10a/18$, either 4 or 5 magazines should be ordered to maximize the profit in the long run.

8. (a) $\displaystyle\sum_{x=1}^{\infty} \frac{6}{\pi^2 x^2} = \frac{6}{\pi^2}\sum_{x=1}^{\infty}\frac{1}{x^2} = \frac{6}{\pi^2}\cdot\frac{\pi^2}{6} = 1.$
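As a numerical sanity check on part (a), the partial sums of $p(x) = 6/(\pi^2 x^2)$ can be computed directly. The following is a minimal sketch in Python; the helper name `partial_mass` is an illustrative assumption and does not come from the text. Because the tail beyond $N$ is smaller than $6/(\pi^2 N)$, the partial sums approach 1 only at rate $1/N$.

```python
import math

def partial_mass(N):
    """Sum of p(x) = 6 / (pi^2 x^2) for x = 1..N (hypothetical helper)."""
    return sum(6.0 / (math.pi ** 2 * x * x) for x in range(1, N + 1))

for N in (10, 1_000, 100_000):
    s = partial_mass(N)
    # The mass beyond N is below 6 / (pi^2 N), so 1 - s shrinks like 1/N.
    print(N, s, 1.0 - s)
```

The printed gaps $1 - s$ stay positive and fall roughly tenfold for each tenfold increase in $N$, consistent with the total mass being exactly 1.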
(b) $\displaystyle E(X) = \sum_{x=1}^{\infty} x\,\frac{6}{\pi^2 x^2} = \frac{6}{\pi^2}\sum_{x=1}^{\infty}\frac{1}{x} = \infty.$

9. (a) $\displaystyle\sum_{x=-2}^{2} p(x) = \frac{9}{27} + \frac{4}{27} + \frac{1}{27} + \frac{4}{27} + \frac{9}{27} = 1.$

(b) $E(X) = \sum_{x=-2}^{2} x\,p(x) = 0$, $E(|X|) = \sum_{x=-2}^{2} |x|\,p(x) = 44/27$, and $E(X^2) = \sum_{x=-2}^{2} x^2 p(x) = 80/27$. Hence $E(2X^2 - 5X + 7) = 2(80/27) - 5(0) + 7 = 349/27$.

10. Let $R$ be the radius of the randomly selected disk; then
$$E(2\pi R) = 2\pi\sum_{i=1}^{10} i\cdot\frac{1}{10} = 11\pi.$$

11. The probability mass function $p(x)$ of $X$ is given by
$$\begin{array}{c|cccc} x & -3 & 0 & 3 & 4 \\ \hline p(x) & 3/8 & 1/8 & 1/4 & 1/4 \end{array}$$
Hence
$$E(X) = -3\cdot\frac{3}{8} + 0\cdot\frac{1}{8} + 3\cdot\frac{1}{4} + 4\cdot\frac{1}{4} = \frac{5}{8},$$
$$E(X^2) = 9\cdot\frac{3}{8} + 0\cdot\frac{1}{8} + 9\cdot\frac{1}{4} + 16\cdot\frac{1}{4} = \frac{77}{8},$$
$$E(|X|) = 3\cdot\frac{3}{8} + 0\cdot\frac{1}{8} + 3\cdot\frac{1}{4} + 4\cdot\frac{1}{4} = \frac{23}{8},$$
$$E(X^2 - 2|X|) = \frac{77}{8} - 2\Bigl(\frac{23}{8}\Bigr) = \frac{31}{8},$$
$$E(X|X|) = -9\cdot\frac{3}{8} + 0\cdot\frac{1}{8} + 9\cdot\frac{1}{4} + 16\cdot\frac{1}{4} = \frac{23}{8}.$$

12. $\displaystyle E(X) = \sum_{i=1}^{10} i\cdot\frac{1}{10} = \frac{11}{2}$ and $\displaystyle E(X^2) = \sum_{i=1}^{10} i^2\cdot\frac{1}{10} = \frac{77}{2}$. So
$$E\bigl[X(11-X)\bigr] = E(11X - X^2) = 11\cdot\frac{11}{2} - \frac{77}{2} = 22.$$

13. Let $X$ be the number of different birthdays; we have
$$P(X=4) = \frac{365\times 364\times 363\times 362}{365^4} = 0.9836,$$
$$P(X=3) = \frac{\binom{4}{2}\,365\times 364\times 363}{365^4} = 0.0163,$$
$$P(X=2) = \frac{\frac{1}{2}\binom{4}{2}\,365\times 364 + \binom{4}{3}\,365\times 364}{365^4} = 0.00005,$$
$$P(X=1) = \frac{365}{365^4} = 0.000000021.$$
In $P(X=2)$ the factor $\frac{1}{2}\binom{4}{2} = 3$ counts the ways to split the four people into two unordered pairs, and $\binom{4}{3} = 4$ counts the ways to choose the three people who share a birthday. Thus
$$E(X) = 4(0.9836) + 3(0.0163) + 2(0.00005) + 1(0.000000021) = 3.98.$$

14. Let $X$ be the number of children they should continue to have until they have one of each sex. For $i \ge 2$, clearly, $X = i$ if and only if either all of their first $i-1$ children are boys and the $i$th child is a girl, or all of their first $i-1$ children are girls and the $i$th child is a boy. Therefore, by independence,
$$P(X=i) = \Bigl(\frac{1}{2}\Bigr)^{i-1}\cdot\frac{1}{2} + \Bigl(\frac{1}{2}\Bigr)^{i-1}\cdot\frac{1}{2} = \Bigl(\frac{1}{2}\Bigr)^{i-1}, \quad i \ge 2.$$
So
$$E(X) = \sum_{i=2}^{\infty} i\Bigl(\frac{1}{2}\Bigr)^{i-1} = -1 + \sum_{i=1}^{\infty} i\Bigl(\frac{1}{2}\Bigr)^{i-1} = -1 + \frac{1}{(1-1/2)^2} = 3.$$
Note that for $|r| < 1$, $\sum_{i=1}^{\infty} i r^{i-1} = 1/(1-r)^2$.

15. Let $A_j$ be the event that the person belongs to a family with $j$ children. Then
$$P(K=k) = \sum_{j=0}^{c} P(K=k \mid A_j)P(A_j) = \sum_{j=k}^{c}\frac{1}{j}\,\alpha_j.$$
Therefore,
$$E(K) = \sum_{k=1}^{c} k\,P(K=k) = \sum_{k=1}^{c} k\sum_{j=k}^{c}\frac{\alpha_j}{j} = \sum_{k=1}^{c}\sum_{j=k}^{c}\frac{k\alpha_j}{j}.$$

16. Let $X$ be the number of cards to be turned face up until an ace appears. Let $A$ be the event that no ace appears among the first $i-1$ cards that are turned face up. Let $B$ be the event that the $i$th card turned face up is an ace. We have
$$P(X=i) = P(AB) = P(B \mid A)P(A) = \frac{4}{52-(i-1)}\cdot\frac{\binom{48}{i-1}}{\binom{52}{i-1}}.$$
Therefore,
$$E(X) = \sum_{i=1}^{49}\frac{4i\binom{48}{i-1}}{\binom{52}{i-1}(53-i)} = 10.6.$$
To some, this answer might be counterintuitive.

17. Let $X$ be the largest number selected. Clearly,
$$P(X=i) = P(X\le i) - P(X\le i-1) = \Bigl(\frac{i}{N}\Bigr)^n - \Bigl(\frac{i-1}{N}\Bigr)^n, \quad i = 1, 2, \ldots, N.$$
Hence
$$E(X) = \sum_{i=1}^{N}\Bigl[\frac{i^{n+1}}{N^n} - \frac{i(i-1)^n}{N^n}\Bigr] = \frac{1}{N^n}\sum_{i=1}^{N}\bigl[i^{n+1} - i(i-1)^n\bigr] = \frac{1}{N^n}\sum_{i=1}^{N}\bigl[i^{n+1} - (i-1)^{n+1} - (i-1)^n\bigr] = \frac{N^{n+1} - \sum_{i=1}^{N}(i-1)^n}{N^n}.$$
For large $N$,
$$\sum_{i=1}^{N}(i-1)^n \approx \int_0^N x^n\,dx = \frac{N^{n+1}}{n+1}.$$
Therefore,
$$E(X) \approx \frac{N^{n+1} - \dfrac{N^{n+1}}{n+1}}{N^n} = \frac{nN}{n+1}.$$

18. (a) Note that $\dfrac{1}{n(n+1)} = \dfrac{1}{n} - \dfrac{1}{n+1}$.
So
$$\sum_{n=1}^{k}\frac{1}{n(n+1)} = \sum_{n=1}^{k}\Bigl(\frac{1}{n} - \frac{1}{n+1}\Bigr) = 1 - \frac{1}{k+1}.$$
This implies that
$$\sum_{n=1}^{\infty} p(n) = \lim_{k\to\infty}\sum_{n=1}^{k}\frac{1}{n(n+1)} = 1 - \lim_{k\to\infty}\frac{1}{k+1} = 1.$$
Therefore, $p$ is a probability mass function.

(b) $\displaystyle E(X) = \sum_{n=1}^{\infty} n\,p(n) = \sum_{n=1}^{\infty}\frac{1}{n+1} = \infty$, where the last equality follows since we know from calculus that the harmonic series, $1 + 1/2 + 1/3 + \cdots$, is divergent. Hence $E(X)$ does not exist.

19. By the solution to Exercise 16, Section 4.3, it should be clear that for $1 \le k \le n$,
$$P(X=2k) = \frac{2(k-1) + 2\binom{2n-2k}{2}}{\binom{2n}{3}}.$$
Hence
$$E(X) = \sum_{k=1}^{n} 2k\,P(X=2k) = \sum_{k=1}^{n}\frac{4k(k-1) + 4k\binom{2n-2k}{2}}{\binom{2n}{3}}$$
$$= \frac{4}{\binom{2n}{3}}\Bigl[2\sum_{k=1}^{n}k^3 - (4n-2)\sum_{k=1}^{n}k^2 + (2n^2-n-1)\sum_{k=1}^{n}k\Bigr]$$
$$= \frac{4}{\binom{2n}{3}}\Bigl[2\cdot\frac{n^2(n+1)^2}{4} - (4n-2)\cdot\frac{n(n+1)(2n+1)}{6} + (2n^2-n-1)\frac{n(n+1)}{2}\Bigr] = \frac{(n+1)^2}{2n-1}.$$

4.5 VARIANCES AND MOMENTS OF DISCRETE RANDOM VARIABLES

1. On average, in the long run, the two businesses have the same profit. The one that has a profit with lower standard deviation should be chosen by Mr. Jones because he is interested in steady income. Therefore, he should choose the first business.

2. The one with lower standard deviation, namely, the second device.

3. $E(X) = \sum_{x=-3}^{3} x\,p(x) = -1$ and $E(X^2) = \sum_{x=-3}^{3} x^2 p(x) = 4$. Therefore, $\mathrm{Var}(X) = 4 - 1 = 3$.

4. The probability mass function $p$ of $X$ is given by
$$\begin{array}{c|ccc} x & -3 & 0 & 6 \\ \hline p(x) & 3/8 & 3/8 & 2/8 \end{array}$$
Thus
$$E(X) = -\frac{9}{8} + \frac{12}{8} = \frac{3}{8}, \qquad E(X^2) = \frac{27}{8} + \frac{72}{8} = \frac{99}{8},$$
$$\mathrm{Var}(X) = \frac{99}{8} - \frac{9}{64} = \frac{783}{64} = 12.234, \qquad \sigma_X = \sqrt{12.234} = 3.498.$$

5.
By straightforward calculations,
$$E(X) = \sum_{i=1}^{N} i\cdot\frac{1}{N} = \frac{1}{N}\cdot\frac{N(N+1)}{2} = \frac{N+1}{2},$$
$$E(X^2) = \sum_{i=1}^{N} i^2\cdot\frac{1}{N} = \frac{1}{N}\cdot\frac{N(N+1)(2N+1)}{6} = \frac{(N+1)(2N+1)}{6},$$
$$\mathrm{Var}(X) = \frac{(N+1)(2N+1)}{6} - \frac{(N+1)^2}{4} = \frac{N^2-1}{12}, \qquad \sigma_X = \sqrt{\frac{N^2-1}{12}}.$$

6. Clearly,
$$E(X) = \sum_{i=0}^{5} i\cdot\frac{\binom{13}{i}\binom{39}{5-i}}{\binom{52}{5}} = 1.25, \qquad E(X^2) = \sum_{i=0}^{5} i^2\cdot\frac{\binom{13}{i}\binom{39}{5-i}}{\binom{52}{5}} = 2.426.$$
Therefore, $\mathrm{Var}(X) = 2.426 - (1.25)^2 = 0.864$, and hence $\sigma_X = \sqrt{0.864} = 0.9295$.

7. By the Corollary of Theorem 4.2, $E(X^2 - 2X) = 3$ implies that $E(X^2) - 2E(X) = 3$. Substituting $E(X) = 1$ in this relation gives $E(X^2) = 5$. Hence, by Theorem 4.3,
$$\mathrm{Var}(X) = E(X^2) - \bigl[E(X)\bigr]^2 = 5 - 1 = 4.$$
By Theorem 4.5, $\mathrm{Var}(-3X+5) = 9\,\mathrm{Var}(X) = 9\times 4 = 36$.

8. Let $X$ be Harry's net gain. Then
$$X = \begin{cases} -2 & \text{with probability } 1/8 \\ 0.25 & \text{with probability } 3/8 \\ 0.50 & \text{with probability } 3/8 \\ 0.75 & \text{with probability } 1/8. \end{cases}$$
Thus
$$E(X) = -2\cdot\frac{1}{8} + 0.25\cdot\frac{3}{8} + 0.50\cdot\frac{3}{8} + 0.75\cdot\frac{1}{8} = 0.125,$$
$$E(X^2) = (-2)^2\cdot\frac{1}{8} + 0.25^2\cdot\frac{3}{8} + 0.50^2\cdot\frac{3}{8} + 0.75^2\cdot\frac{1}{8} = 0.6875.$$
These show that the expected value of Harry's net gain is 12.5 cents. Its variance is $\mathrm{Var}(X) = 0.6875 - 0.125^2 = 0.671875$.

9. Note that $E(X) = E(Y) = 0$. Clearly,
$$P\bigl(|X-0|\le t\bigr) = \begin{cases} 0 & \text{if } t<1 \\ 1 & \text{if } t\ge 1, \end{cases} \qquad P\bigl(|Y-0|\le t\bigr) = \begin{cases} 0 & \text{if } t<10 \\ 1 & \text{if } t\ge 10. \end{cases}$$
These relations clearly show that for all $t>0$,
$$P\bigl(|Y-0|\le t\bigr) \le P\bigl(|X-0|\le t\bigr).$$
Therefore, $X$ is more concentrated about 0 than $Y$ is.

10. (a) Let $X$ be the number of trials required to open the door. Clearly,
$$P(X=x) = \Bigl(1-\frac{1}{n}\Bigr)^{x-1}\frac{1}{n}, \quad x = 1, 2, 3, \ldots.$$
Thus
$$E(X) = \sum_{x=1}^{\infty} x\Bigl(1-\frac{1}{n}\Bigr)^{x-1}\frac{1}{n} = \frac{1}{n}\sum_{x=1}^{\infty} x\Bigl(1-\frac{1}{n}\Bigr)^{x-1}. \tag{10}$$
We know from calculus that for all $r$ with $|r| < 1$,
$$\sum_{x=1}^{\infty} x r^{x-1} = \frac{1}{(1-r)^2}. \tag{11}$$
Thus
$$\sum_{x=1}^{\infty} x\Bigl(1-\frac{1}{n}\Bigr)^{x-1} = \frac{1}{\bigl[1-\bigl(1-\frac{1}{n}\bigr)\bigr]^2} = n^2. \tag{12}$$
Substituting (12) in (10), we obtain $E(X) = n$. To calculate $\mathrm{Var}(X)$, first we find $E(X^2)$. We have
$$E(X^2) = \sum_{x=1}^{\infty} x^2\Bigl(1-\frac{1}{n}\Bigr)^{x-1}\Bigl(\frac{1}{n}\Bigr) = \frac{1}{n}\sum_{x=1}^{\infty} x^2\Bigl(1-\frac{1}{n}\Bigr)^{x-1}. \tag{13}$$
Now to calculate this sum, we multiply both sides of (11) by $r$ and then differentiate with respect to $r$; we get
$$\sum_{x=1}^{\infty} x^2 r^{x-1} = \frac{1+r}{(1-r)^3}.$$
Using this relation in (13), we obtain
$$E(X^2) = \frac{1}{n}\cdot\frac{1 + 1 - \frac{1}{n}}{\bigl[1-\bigl(1-\frac{1}{n}\bigr)\bigr]^3} = 2n^2 - n.$$
Therefore, $\mathrm{Var}(X) = (2n^2 - n) - n^2 = n(n-1)$.

(b) Let $A_i$ be the event that on the $i$th trial the door opens. Let $X$ be the number of trials required to open the door. Then
$$P(X=1) = \frac{1}{n},$$
$$P(X=2) = P(A_1^c A_2) = P(A_2 \mid A_1^c)P(A_1^c) = \frac{1}{n-1}\cdot\frac{n-1}{n} = \frac{1}{n},$$
$$P(X=3) = P(A_1^c A_2^c A_3) = P(A_3 \mid A_2^c A_1^c)P(A_2^c A_1^c) = P(A_3 \mid A_2^c A_1^c)P(A_2^c \mid A_1^c)P(A_1^c) = \frac{1}{n-2}\cdot\frac{n-2}{n-1}\cdot\frac{n-1}{n} = \frac{1}{n}.$$
Similarly, $P(X=i) = 1/n$ for $1 \le i \le n$. Therefore, $X$ is a random number selected from $\{1, 2, 3, \ldots, n\}$. By Exercise 5, $E(X) = (n+1)/2$ and $\mathrm{Var}(X) = (n^2-1)/12$.

11.
For $E(X^3)$ to exist, we must have $E\bigl(|X^3|\bigr) < \infty$. Now
$$\sum_{n=1}^{\infty} x_n^3\,p(x_n) = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{(-1)^n n\sqrt{n}}{n^2} = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{(-1)^n}{\sqrt{n}} < \infty,$$
whereas
$$E\bigl(|X^3|\bigr) = \sum_{n=1}^{\infty} |x_n^3|\,p(x_n) = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{n\sqrt{n}}{n^2} = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{\sqrt{n}} = \infty.$$

12. For 0 ~~n) ≥ 0.50 or $1 - (0.98)^n \ge 0.50$. This gives $(0.98)^n \le 0.50$ or $n \ge \ln 0.50/\ln 0.98 = 34.31$. Therefore, $n = 35$.

4. Let $F$ be the distribution function of $X$; then
$$F(t) = 1 - \Bigl(1 + \frac{t}{200}\Bigr)e^{-t/200}, \quad t \ge 0.$$
Using this, we obtain
$$P(200 \le X \le 300) = P(X \le 300) - P(X < 200) = F(300) - F(200-) = F(300) - F(200) = 0.442 - 0.264 = 0.178.$$

5. Let $X$ be the number of sections that will get a hard test. We want to calculate $E(X)$. The random variable $X$ can only assume the values 0, 1, 2, 3, and 4; its probability mass function is given by
$$p(i) = P(X=i) = \frac{\binom{8}{i}\binom{22}{4-i}}{\binom{30}{4}}, \quad i = 0, 1, 2, 3, 4,$$
where the numerical values of the $p(i)$'s are as follows.
$$\begin{array}{c|ccccc} i & 0 & 1 & 2 & 3 & 4 \\ \hline p(i) & 0.2669 & 0.4496 & 0.2360 & 0.0450 & 0.0026 \end{array}$$
Thus
$$E(X) = 0(0.2669) + 1(0.4496) + 2(0.2360) + 3(0.0450) + 4(0.0026) = 1.067.$$

6. (a) $1 - F(6) = 5/36$. (b) $F(9) = 76/81$. (c) $F(7) - F(2) = 44/49$.

7. We have that
$$E(X) = (15.85)(0.15) + (15.9)(0.21) + (16)(0.35) + (16.1)(0.15) + (16.2)(0.14) = 16,$$
$$\mathrm{Var}(X) = (15.85-16)^2(0.15) + (15.9-16)^2(0.21) + (16-16)^2(0.35) + (16.1-16)^2(0.15) + (16.2-16)^2(0.14) = 0.013,$$
$$E(Y) = (15.85)(0.14) + (15.9)(0.05) + (16)(0.64) + (16.1)(0.08) + (16.2)(0.09) = 16,$$
$$\mathrm{Var}(Y) = (15.85-16)^2(0.14) + (15.9-16)^2(0.05) + (16-16)^2(0.64) + (16.1-16)^2(0.08) + (16.2-16)^2(0.09) = 0.008.$$
These show that, on the average, companies A and B fill their bottles with 16 fluid ounces of soft drink.
However, the amount of soda in bottles from company A varies more than the amount in bottles from company B.

8. Let $F$ be the distribution function of $X$. Then
$$F(t) = \begin{cases} 0 & t < 58 \\ 7/30 & 58 \le t < 62 \\ 13/30 & 62 \le t < 64 \\ 18/30 & 64 \le t < 76 \\ 23/30 & 76 \le t < 80 \\ 1 & t \ge 80. \end{cases}$$

9. (a) To determine the value of $k$, note that $\sum_{i=0}^{\infty} k\dfrac{(2t)^i}{i!} = 1$. Therefore, $k\sum_{i=0}^{\infty}\dfrac{(2t)^i}{i!} = 1$. This implies that $ke^{2t} = 1$ or $k = e^{-2t}$. Thus $p(i) = \dfrac{e^{-2t}(2t)^i}{i!}$.

(b) $\displaystyle P(X<4) = \sum_{i=0}^{3} P(X=i) = e^{-2t}\bigl[1 + 2t + 2t^2 + (4t^3/3)\bigr]$,

$P(X>1) = 1 - P(X=0) - P(X=1) = 1 - e^{-2t} - 2te^{-2t}$.

10. Let $p$ be the probability mass function, and $F$ be the distribution function of $X$. We have
$$p(0) = p(3) = \frac{1}{8}, \qquad p(1) = p(2) = \frac{3}{8},$$
and
$$F(t) = \begin{cases} 0 & t < 0 \\ 1/8 & 0 \le t < 1 \\ 4/8 & 1 \le t < 2 \\ 7/8 & 2 \le t < 3 \\ 1 & t \ge 3. \end{cases}$$

11. (a) The sample space has $52!$ elements because when the cards are dealt face down, any ordering of the cards is a possibility. To find $p(j)$, the probability that the 4th king will appear on the $j$th card, we claim that in $\binom{4}{1}\cdot{}_{j-1}P_3\cdot 48!$ ways the 4th king will appear on the $j$th card, and the remaining 3 kings earlier. To see this, note that we have $\binom{4}{1}$ combinations for the king that appears on the $j$th card, and ${}_{j-1}P_3$ different permutations for the remaining 3 kings that appear earlier. The last term, $48!$, is for the remaining 48 cards that can appear in any order in the remaining 48 positions. Therefore,
$$p(j) = \frac{\binom{4}{1}\cdot{}_{j-1}P_3\cdot 48!}{52!} = \frac{\binom{j-1}{3}}{\dfrac{52!}{4!\,48!}} = \frac{\binom{j-1}{3}}{\binom{52}{4}}.$$

(b) The probability that the player wins is $p(52) = \dbinom{51}{3}\Big/\dbinom{52}{4} = 1/13$.
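The formula for $p(j)$ in part (a) and the value in part (b) can be checked with exact rational arithmetic. The following is a minimal sketch in Python; the function name `fourth_king_pmf` and its default parameters are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from math import comb

def fourth_king_pmf(j, deck=52, kings=4):
    """P(the last king is the j-th card) = C(j-1, kings-1) / C(deck, kings)."""
    if j < kings or j > deck:
        return Fraction(0)
    return Fraction(comb(j - 1, kings - 1), comb(deck, kings))

# The probabilities over j = 4..52 should add up to exactly 1.
total = sum(fourth_king_pmf(j) for j in range(4, 53))
print(total)

# Probability that the fourth king is the very last card, as in part (b).
print(fourth_king_pmf(52))
```

Using `Fraction` avoids rounding, so the comparison with $1/13$ is exact rather than approximate.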
(c) To find
$$E = \sum_{j=4}^{52} j\,p(j) = \frac{1}{\binom{52}{4}}\sum_{j=4}^{52} j\binom{j-1}{3},$$
the expected length of the game, we use a technique introduced by Jenkyns and Muller in Mathematics Magazine, 54 (1981), page 203. We have the following relation, which can be readily checked:
$$j\binom{j-1}{3} = \frac{4}{5}\Bigl[(j+1)\binom{j}{4} - j\binom{j-1}{4}\Bigr], \quad j \ge 5.$$
This gives
$$\sum_{j=5}^{52} j\binom{j-1}{3} = \frac{4}{5}\Bigl[\sum_{j=5}^{52}(j+1)\binom{j}{4} - \sum_{j=5}^{52} j\binom{j-1}{4}\Bigr] = \frac{4}{5}\Bigl[53\binom{52}{4} - 5\binom{4}{4}\Bigr] = 11{,}478{,}736,$$
where the next-to-the-last equality follows because terms cancel out in pairs. Thus
$$E = \frac{1}{\binom{52}{4}}\sum_{j=4}^{52} j\binom{j-1}{3} = \frac{1}{\binom{52}{4}}\Bigl[4 + \sum_{j=5}^{52} j\binom{j-1}{3}\Bigr] = \frac{1}{\binom{52}{4}}(4 + 11{,}478{,}736) = 42.4.$$
As Jenkyns and Muller have noted, "This relatively high expectation value is what makes the game interesting. However, the low probability of winning makes it frustrating!"

Chapter 5 Special Discrete Distributions

5.1 BERNOULLI AND BINOMIAL RANDOM VARIABLES

1. $\dbinom{8}{4}\Bigl(\dfrac{1}{4}\Bigr)^4\Bigl(\dfrac{3}{4}\Bigr)^4 = 0.087.$

2. (a) $64\times\frac{1}{2} = 32$. (b) $6\times\frac{1}{2} + 1 = 4$ (note that we should count the mother of the family as well).

3. $\dbinom{6}{3}\Bigl(\dfrac{1}{6}\Bigr)^3\Bigl(\dfrac{5}{6}\Bigr)^3 = 0.054.$

4. $\dbinom{6}{2}\Bigl(\dfrac{1}{10}\Bigr)^2\Bigl(\dfrac{9}{10}\Bigr)^4 = 0.098.$

5.
$\dbinom{5}{2}\Bigl(\dfrac{10}{30}\Bigr)^2\Bigl(\dfrac{20}{30}\Bigr)^3 = 0.33.$

6. Let $X$ be the number of defective nails. If the manufacturer's claim is true, we have
$$P(X\ge 2) = 1 - P(X=0) - P(X=1) = 1 - \binom{24}{0}(0.03)^0(0.97)^{24} - \binom{24}{1}(0.03)(0.97)^{23} = 0.162.$$
This shows that there is a 16.2% chance that two or more defective nails are found. Therefore, it is not fair to reject the company's claim.

7. Let $p$ and $q$ be the probability mass functions of $X$ and $Y$, respectively. Then
$$p(x) = \binom{4}{x}(0.60)^x(0.40)^{4-x}, \quad x = 0, 1, 2, 3, 4;$$
$$q(y) = P(Y=y) = P\Bigl(X = \frac{y-1}{2}\Bigr) = \binom{4}{\frac{y-1}{2}}(0.60)^{(y-1)/2}(0.40)^{4-[(y-1)/2]}, \quad y = 1, 3, 5, 7, 9.$$

8. $\displaystyle\sum_{i=0}^{8}\binom{15}{i}(0.8)^i(0.2)^{15-i} = 0.142.$

9. $\dbinom{10}{5}\Bigl(\dfrac{11}{36}\Bigr)^5\Bigl(\dfrac{25}{36}\Bigr)^5 = 0.108.$

10. (a) $1 - \dbinom{5}{0}\Bigl(\dfrac{1}{3}\Bigr)^0\Bigl(\dfrac{2}{3}\Bigr)^5 - \dbinom{5}{1}\Bigl(\dfrac{1}{3}\Bigr)^1\Bigl(\dfrac{2}{3}\Bigr)^4 = 0.539.$

(b) $\dbinom{5}{2}\Bigl(\dfrac{1}{10}\Bigr)^2\Bigl(\dfrac{9}{10}\Bigr)^3 = 0.073.$

11. We know that $p(x)$ is maximum at $[(n+1)p]$. If $(n+1)p$ is an integer, $p(x)$ is maximum at $[(n+1)p] = np + p$. But in such a case, some straightforward algebra shows that
$$\binom{n}{np+p}p^{np+p}(1-p)^{n-np-p} = \binom{n}{np+p-1}p^{np+p-1}(1-p)^{n-np-p+1},$$
implying that $p(x)$ is also maximum at $np + p - 1$.

12. The probability of a royal or straight flush is $40\Big/\dbinom{52}{5}$. If Ernie plays $n$ games, he will get, on the average, $n\Bigl[40\Big/\dbinom{52}{5}\Bigr]$ royal or straight flushes.
We want to have $40n\Big/\dbinom{52}{5} = 1$; this gives $n = \dbinom{52}{5}\Big/40 = 64{,}974$.

13. $\dbinom{6}{3}\Bigl(\dfrac{1}{3}\Bigr)^3\Bigl(\dfrac{2}{3}\Bigr)^3 = 0.219.$

14. $1 - (999/1000)^{100} = 0.095.$

15. The maximum occurs at $k = [11(0.45)] = 4$. The maximum probability is
$$\binom{10}{4}(0.45)^4(0.55)^6 = 0.238.$$

16. Call the event of obtaining a full house a success. $X$, the number of full houses in $n$ independent poker hands, is a binomial random variable with parameters $(n, p)$, where $p$ is the probability that a random poker hand is a full house. To calculate $p$, note that there are $\dbinom{52}{5}$ possible poker hands and $\dbinom{4}{3}\dbinom{4}{2}\dfrac{13!}{11!} = 3744$ full houses. Thus $p = 3744\Big/\dbinom{52}{5} \approx 0.0014$. Hence $E(X) = np \approx 0.0014n$ and $\mathrm{Var}(X) = np(1-p) \approx 0.00144n$. Note that if $n$ is approximately 715, then $E(X) = 1$. Thus we should expect to find, on the average, one full house in every 715 random poker hands.

17. $1 - \dbinom{6}{6}\Bigl(\dfrac{1}{4}\Bigr)^6\Bigl(\dfrac{3}{4}\Bigr)^0 - \dbinom{6}{5}\Bigl(\dfrac{1}{4}\Bigr)^5\Bigl(\dfrac{3}{4}\Bigr) \approx 0.995.$

18. $1 - \dbinom{3000}{0}(0.0005)^0(0.9995)^{3000} - \dbinom{3000}{1}(0.0005)(0.9995)^{2999} \approx 0.442.$

19. The expected value of the expenses if sent in one parcel is
$$45.20\times 0.07 + 5.20\times 0.93 = 8.$$
The expected value of the expenses if sent in two parcels is
$$(23.30\times 2)(0.07)^2 + (23.30 + 3.30)\binom{2}{1}(0.07)(0.93) + (6.60)(0.93)^2 = 9.4.$$
Therefore, it is preferable to send in a single parcel.

20. Let $n$ be the minimum number of children they should plan to have.
Since the probability of all girls is $(1/2)^n$ and the probability of all boys is $(1/2)^n$, we must have $1 - (1/2)^n - (1/2)^n \ge 0.95$. This gives $(1/2)^{n-1} \le 0.05$ or
$$n - 1 \ge \frac{\ln 0.05}{\ln(0.5)} = 4.32,$$
or $n \ge 5.32$. Therefore, $n = 6$.

21. (a) For this to happen, exactly one of the $N$ stations has to attempt transmitting a message. The probability of this is $\dbinom{N}{1}p(1-p)^{N-1} = Np(1-p)^{N-1}$.

(b) Let $f(p) = Np(1-p)^{N-1}$. The value of $p$ which maximizes the probability of a message going through with no collision is the root of the equation $f'(p) = 0$. Now
$$f'(p) = N(1-p)^{N-1} - Np(N-1)(1-p)^{N-2} = 0.$$
Noting that $p \ne 1$, this equation gives $p = 1/N$. This answer makes a lot of sense because at every "suitable instance," on average, $Np = 1$ station will transmit a message.

(c) By part (b), the maximum probability is
$$f\Bigl(\frac{1}{N}\Bigr) = N\Bigl(\frac{1}{N}\Bigr)\Bigl(1-\frac{1}{N}\Bigr)^{N-1} = \Bigl(1-\frac{1}{N}\Bigr)^{N-1}.$$
As $N \to \infty$, this probability approaches $1/e$, showing that for large numbers of stations (in reality 20 or more), the probability of a successful transmission is approximately $1/e$ independently of the number of stations if $p = 1/N$.

22. The $k$ students whose names have been called are not standing. Let $A_1, A_2, \ldots, A_{n-k}$ be the students whose names have not been called. For $i$, $1 \le i \le n-k$, call $A_i$ a "success" if he or she is standing; a "failure" otherwise. Therefore, whether $A_i$ is standing or sitting is a Bernoulli trial, and hence the random variable $X$ is the number of successes in $n-k$ Bernoulli trials. For $X$ to be binomial, for $i \ne j$, the event that $A_i$ is a success must be independent of the event that $A_j$ is a success. Furthermore, the probability that $A_i$ is a success must be the same for all $i$, $1 \le i \le n-k$. The latter condition is satisfied since $A_i$ is standing if and only if his original seat was among the first $k$.
This happens with probability $p = k/n$ regardless of $i$. However, the former condition is not valid. The relation
$$P\bigl(A_j \text{ is standing} \mid A_i \text{ is standing}\bigr) = \frac{k-1}{n}$$
shows that knowing $A_i$ is a success changes the probability that $A_j$ is a success. That is, $A_i$ being a success is not independent of $A_j$ being a success. This shows that $X$ is not a binomial random variable.

23. Let $X$ be the number of undecided voters who will vote for abortion. The desired probability is
$$P\bigl(b + (n-X) > a + X\bigr) = P\Bigl(X < \frac{n+(b-a)}{2}\Bigr) = \sum_{i=0}^{\left[\frac{n+(b-a)}{2}\right]}\binom{n}{i}\Bigl(\frac{1}{2}\Bigr)^i\Bigl(\frac{1}{2}\Bigr)^{n-i} = \Bigl(\frac{1}{2}\Bigr)^n\sum_{i=0}^{\left[\frac{n+(b-a)}{2}\right]}\binom{n}{i}.$$

24. Let $X$ be the net gain of the player per unit of stake. $X$ is a discrete random variable with possible values $-1$, 1, 2, and 3. We have
$$P(X=-1) = \binom{3}{0}\Bigl(\frac{1}{6}\Bigr)^0\Bigl(\frac{5}{6}\Bigr)^3 = \frac{125}{216}, \qquad P(X=1) = \binom{3}{1}\Bigl(\frac{1}{6}\Bigr)\Bigl(\frac{5}{6}\Bigr)^2 = \frac{75}{216},$$
$$P(X=2) = \binom{3}{2}\Bigl(\frac{1}{6}\Bigr)^2\Bigl(\frac{5}{6}\Bigr) = \frac{15}{216}, \qquad P(X=3) = \binom{3}{3}\Bigl(\frac{1}{6}\Bigr)^3\Bigl(\frac{5}{6}\Bigr)^0 = \frac{1}{216}.$$
Hence
$$E(X) = -1\cdot\frac{125}{216} + 1\cdot\frac{75}{216} + 2\cdot\frac{15}{216} + 3\cdot\frac{1}{216} \approx -0.08.$$
Therefore, the player loses 0.08 per unit stake.

25. $\displaystyle E(X^2) = \sum_{x=1}^{n} x^2\binom{n}{x}p^x(1-p)^{n-x} = \sum_{x=1}^{n}(x^2 - x + x)\binom{n}{x}p^x(1-p)^{n-x} = \sum_{x=1}^{n} x(x-1)\binom{n}{x}p^x(1-p)^{n-x} + \sum_{x=1}^{n} x\binom{n}{x}p^x(1-p)^{n-x} = \sum_{x=2}^{n}\frac{n!}{(x-2)!\,(n-x)!}$
$\displaystyle p^x(1-p)^{n-x} + E(X) = n(n-1)p^2\sum_{x=2}^{n}\binom{n-2}{x-2}p^{x-2}(1-p)^{n-x} + np = n(n-1)p^2\bigl[p + (1-p)\bigr]^{n-2} + np = n^2p^2 - np^2 + np.$

26. (a) A four-engine plane is preferable to a two-engine plane if and only if
$$1 - \binom{4}{0}p^0(1-p)^4 - \binom{4}{1}p(1-p)^3 > 1 - \binom{2}{0}p^0(1-p)^2.$$
This inequality gives $p > 2/3$. Hence a four-engine plane is preferable if and only if $p > 2/3$. If $p = 2/3$, it makes no difference.

(b) A five-engine plane is preferable to a three-engine plane if and only if
$$\binom{5}{5}p^5(1-p)^0 + \binom{5}{4}p^4(1-p) + \binom{5}{3}p^3(1-p)^2 > \binom{3}{2}p^2(1-p) + p^3.$$
Simplifying this inequality, we get $3(p-1)^2(2p-1) > 0$, which implies that a five-engine plane is preferable if and only if $2p - 1 > 0$. That is, for $p > 1/2$, a five-engine plane is preferable; for $p < 1/2$, a three-engine plane is preferable; for $p = 1/2$ it makes no difference.

27. Clearly, 8 bits are transmitted. A parity check will not detect an error in the 7-bit character received erroneously if and only if the number of bits received incorrectly is even. Therefore, the desired probability is
$$\sum_{n=1}^{4}\binom{8}{2n}(1-0.999)^{2n}(0.999)^{8-2n} = 0.000028.$$

28. The message is erroneously received but the errors are not detected by the parity check if, for some $j$ with $1 \le j \le 6$, $j$ of the characters are erroneously received but not detected by the parity check, and the remaining $6-j$ characters are all transmitted correctly. By the solution of the previous exercise, the probability of this event is
$$\sum_{j=1}^{6}\binom{6}{j}(0.000028)^j(0.999)^{8(6-j)} = 0.000161.$$

29. The probability of a straight flush is $40\Big/\dbinom{52}{5} \approx 0.000015391$.
Hence we must have
$$1 - \binom{n}{0}(0.000015391)^0(1 - 0.000015391)^n \ge \frac{3}{4}.$$
This gives $(1 - 0.000015391)^n \le \dfrac{1}{4}$. So
$$n \ge \frac{\log(1/4)}{\log(1 - 0.000015391)} \approx 90071.06.$$
Therefore, $n \approx 90{,}072$.

30. Let $p$, $q$, and $r$ be the probabilities that a randomly selected offspring is AA, Aa, and aa, respectively. Note that both parents of the offspring are AA with probability $(\alpha/n)^2$, they are both Aa with probability $\bigl[1 - (\alpha/n)\bigr]^2$, and the probability is $2(\alpha/n)\bigl[1 - (\alpha/n)\bigr]$ that one parent is AA and the other is Aa. Therefore, by the law of total probability,
$$p = 1\cdot\Bigl(\frac{\alpha}{n}\Bigr)^2 + \frac{1}{4}\cdot\Bigl(1-\frac{\alpha}{n}\Bigr)^2 + \frac{1}{2}\cdot 2\Bigl(\frac{\alpha}{n}\Bigr)\Bigl(1-\frac{\alpha}{n}\Bigr) = \frac{1}{4}\Bigl(\frac{\alpha}{n}\Bigr)^2 + \frac{1}{2}\Bigl(\frac{\alpha}{n}\Bigr) + \frac{1}{4},$$
$$q = 0\cdot\Bigl(\frac{\alpha}{n}\Bigr)^2 + \frac{1}{2}\Bigl(1-\frac{\alpha}{n}\Bigr)^2 + \frac{1}{2}\cdot 2\Bigl(\frac{\alpha}{n}\Bigr)\Bigl(1-\frac{\alpha}{n}\Bigr) = \frac{1}{2} - \frac{1}{2}\Bigl(\frac{\alpha}{n}\Bigr)^2,$$
$$r = 0\cdot\Bigl(\frac{\alpha}{n}\Bigr)^2 + \frac{1}{4}\Bigl(1-\frac{\alpha}{n}\Bigr)^2 + 0\cdot 2\Bigl(\frac{\alpha}{n}\Bigr)\Bigl(1-\frac{\alpha}{n}\Bigr) = \frac{1}{4}\Bigl(1-\frac{\alpha}{n}\Bigr)^2.$$
The probability that at most two of the offspring are aa is
$$\sum_{i=0}^{2}\binom{m}{i}r^i(1-r)^{m-i}.$$
The probability that exactly $i$ of the offspring are AA and the remaining are all Aa is $\dbinom{m}{i}p^i q^{m-i}$.

31. The desired probability is the sum of three probabilities: the probability of no customer served and two new arrivals, the probability of one customer served and three new arrivals, and the probability of two customers served and four new arrivals.
These quantities, respectively, are
$$(0.4)^4\cdot\binom{4}{2}(0.45)^2(0.55)^2, \qquad \binom{4}{1}(0.6)(0.4)^3\cdot\binom{4}{3}(0.45)^3(0.55), \qquad \text{and} \qquad \binom{4}{2}(0.6)^2(0.4)^2\cdot(0.45)^4.$$
The sum of these quantities, which is the answer, is 0.054.

32. (a) Let $S$ be the event that the first trial is a success and $E$ be the event that in $n$ trials, the number of successes is even. Then
$$P(E) = P(E \mid S)P(S) + P(E \mid S^c)P(S^c).$$
Thus
$$r_n = (1 - r_{n-1})p + r_{n-1}(1-p).$$
Using this relation, induction, and $r_0 = 1$, we find that
$$r_n = \frac{1}{2}\bigl[1 + (1-2p)^n\bigr].$$

(b) The left sum is the probability of $0, 2, 4, \ldots,$ or $[n/2]$ successes. Thus it is the probability of an even number of successes in $n$ Bernoulli trials and hence it is equal to $r_n$.

33. For $0 \le i \le n$, let $B_i$ be the event that $i$ of the balls are red. Let $A$ be the event that in drawing $k$ balls from the urn, successively, and with replacement, no red balls appear. Then
$$P(B_0 \mid A) = \frac{P(A \mid B_0)P(B_0)}{\sum_{i=0}^{n} P(A \mid B_i)P(B_i)} = \frac{1\times\bigl(\frac{1}{2}\bigr)^n}{\sum_{i=0}^{n}\bigl(\frac{n-i}{n}\bigr)^k\binom{n}{i}\bigl(\frac{1}{2}\bigr)^n} = \frac{1}{\sum_{i=0}^{n}\binom{n}{i}\bigl(\frac{n-i}{n}\bigr)^k}.$$

34. Let $E$ be the event that Albert's statement is the truth and $F$ be the event that Donna tells the truth. Since Rose agrees with Donna and Rose always tells the truth, Donna is telling the truth as well. Therefore, the desired probability is $P(E \mid F) = P(EF)/P(F)$. To calculate $P(F)$, observe that for Rose to agree with Donna, none, two, or all four of Albert, Brenda, Charles, and Donna should have lied.
Since these four people lie independently, this will happen with probability
$$\Bigl(\frac{1}{3}\Bigr)^4 + \binom{4}{2}\Bigl(\frac{2}{3}\Bigr)^2\Bigl(\frac{1}{3}\Bigr)^2 + \Bigl(\frac{2}{3}\Bigr)^4 = \frac{41}{81}.$$
To calculate $P(EF)$, note that $EF$ is the event that Albert tells the truth and Rose agrees with Donna. This happens if all of them tell the truth, or Albert tells the truth but exactly two of Brenda, Charles, and Donna lie. Hence
$$P(EF) = \Bigl(\frac{1}{3}\Bigr)^4 + \frac{1}{3}\cdot\binom{3}{2}\Bigl(\frac{2}{3}\Bigr)^2\Bigl(\frac{1}{3}\Bigr) = \frac{13}{81}.$$
Therefore,
$$P(E \mid F) = \frac{P(EF)}{P(F)} = \frac{13/81}{41/81} = \frac{13}{41} = 0.317.$$

5.2 POISSON RANDOM VARIABLES

1. $\lambda = (0.05)(60) = 3$; the answer is $1 - \dfrac{e^{-3}3^0}{0!} = 1 - e^{-3} = 0.9502$.

2. $\lambda = 1.8$; the answer is $\displaystyle\sum_{i=0}^{3}\frac{e^{-1.8}(1.8)^i}{i!} \approx 0.89$.

3. $\lambda = 0.025\times 80 = 2$; the answer is $1 - \dfrac{e^{-2}2^0}{0!} - \dfrac{e^{-2}2^1}{1!} = 1 - 3e^{-2} = 0.594$.

4. $\lambda = (500)(0.0014) = 0.7$. The answer is $1 - \dfrac{e^{-0.7}(0.7)^0}{0!} - \dfrac{e^{-0.7}(0.7)^1}{1!} \approx 0.156$.

5. We call a room a "success" if it is vacant next Saturday; we call it a "failure" if it is occupied. Assuming that next Saturday is a random day, $X$, the number of vacant rooms on that day, is approximately Poisson with rate $\lambda = 35$. Thus the desired probability is
$$1 - \sum_{i=0}^{29}\frac{e^{-35}(35)^i}{i!} = 0.823.$$

6. $\lambda = (3/10)35 = 10.5$. The probability of 10 misprints in a given chapter is $\dfrac{e^{-10.5}(10.5)^{10}}{10!} = 0.124$. Therefore, the desired probability is $(0.124)^2 = 0.0154$.

7. $P(X=1) = P(X=3)$ implies that $e^{-\lambda}\lambda = \dfrac{e^{-\lambda}\lambda^3}{3!}$, from which we get $\lambda = \sqrt{6}$. The answer is $\dfrac{e^{-\sqrt{6}}\bigl(\sqrt{6}\bigr)^5}{5!} = 0.063$.

8. The probability that a bun contains no raisins is $\dfrac{e^{-n/k}(n/k)^0}{0!} = e^{-n/k}$. So the answer is $\dbinom{4}{2}e^{-2n/k}(1 - e^{-n/k})^2$.

9. Let $X$ be the number of times the randomly selected kid has hit the target.
We are given that $P(X = 0) = 0.04$; this implies that $\dfrac{e^{-\lambda}\lambda^0}{0!} = 0.04$, or $e^{-\lambda} = 0.04$. So $\lambda = -\ln 0.04 = 3.22$. Now
$$P(X \ge 2) = 1 - P(X = 0) - P(X = 1) = 1 - 0.04 - \frac{e^{-\lambda}\lambda}{1!} = 1 - 0.04 - (0.04)(3.22) = 0.83.$$
Therefore, 83% of the kids have hit the target at least twice.

10. First we calculate the $p_i$'s from the binomial probability mass function with $n = 26$ and $p = 1/365$. Then we calculate them from the Poisson probability mass function with parameter $\lambda = np = 26/365$. For different values of $i$, the results are as follows.

i    Binomial   Poisson
0    0.93115    0.93125
1    0.06651    0.06634
2    0.00228    0.00236
3    0.00005    0.00006

Remark: In this example, since success is very rare, the Poisson distribution gives a good approximation to the binomial even for small $n$. The following table demonstrates this fact for $n = 5$.

i    Binomial   Poisson
0    0.9864     0.9864
1    0.0136     0.0136
2    0.00007    0.00009

11. Let $N(t)$ be the number of shooting stars observed up to time $t$. Let one minute be the unit of time. Then $\{N(t) : t \ge 0\}$ is a Poisson process with $\lambda = 1/12$. We have that
$$P\big(N(30) = 3\big) = \frac{e^{-30/12}(30/12)^3}{3!} = 0.21.$$

12. $P\big(N(2) = 0\big) = e^{-3(2)} = e^{-6} = 0.00248$.

13. Let $N(t)$ be the number of wrong calls up to $t$. If one day is taken as the time unit, it is reasonable to assume that $\{N(t) : t \ge 0\}$ is a Poisson process with $\lambda = 1/7$. By the independent increment property and stationarity, the desired probability is $P\big(N(1) = 0\big) = e^{-(1/7)\cdot 1} = 0.87$.

14. Choose one month as the unit of time. Then $\lambda = 5$ and the probability of no crimes during any given month of a year is $P\big(N(1) = 0\big) = e^{-5} = 0.0067$. Hence the desired probability is
$$\binom{12}{2}(0.0067)^2(1 - 0.0067)^{10} = 0.0028.$$

15. Choose one day as the unit of time. Then $\lambda = 3$ and the probability of no accidents in one day is $P\big(N(1) = 0\big) = e^{-3} = 0.0498$.
The number of days without any accidents in January is approximately another Poisson random variable with approximate rate $31(0.05) = 1.55$. Hence the desired probability is $\dfrac{e^{-1.55}(1.55)^3}{3!} \approx 0.13$.

16. Choosing one hour as the time unit, we have that $\lambda = 6$. Therefore, the desired probability is
$$\begin{aligned}
P\big(N(0.5) = 1 \text{ and } N(2.5) = 10\big) &= P\big(N(0.5) = 1 \text{ and } N(2.5) - N(0.5) = 9\big) \\
&= P\big(N(0.5) = 1\big)\,P\big(N(2.5) - N(0.5) = 9\big) \\
&= P\big(N(0.5) = 1\big)\,P\big(N(2) = 9\big) \\
&= \frac{3^1 e^{-3}}{1!} \cdot \frac{12^9 e^{-12}}{9!} \approx 0.013.
\end{aligned}$$

17. The expected number of fractures per meter is $\lambda = 1/60$. Let $N(t)$ be the number of fractures in $t$ meters of wire. Then
$$P\big(N(t) = n\big) = \frac{e^{-t/60}(t/60)^n}{n!}, \qquad n = 0, 1, 2, \ldots.$$
In a ten-minute period, the machine turns out 70 meters of wire. The desired probability, $P\big(N(70) > 1\big)$, is calculated as follows:
$$P\big(N(70) > 1\big) = 1 - P\big(N(70) = 0\big) - P\big(N(70) = 1\big) = 1 - e^{-70/60} - \frac{70}{60}e^{-70/60} \approx 0.325.$$

18. Let the epoch at which the traffic light for the left-turn lane turns red be labeled $t = 0$. Let $N(t)$ be the number of cars that arrive at the junction at or prior to $t$ trying to turn left. Since cars arrive at the junction according to a Poisson process, clearly, $\{N(t) : t \ge 0\}$ is a stationary and orderly process which possesses independent increments. Therefore, $\{N(t) : t \ge 0\}$ is also a Poisson process. Its parameter is given by $\lambda = E\big[N(1)\big] = 4(0.22) = 0.88$. (For a rigorous proof, see the solution to Exercise 9, Section 12.2.) Thus $P\big(N(t) = n\big) = \dfrac{e^{-(0.88)t}\big[(0.88)t\big]^n}{n!}$
, and the desired probability is
$$P\big(N(3) \ge 4\big) = 1 - \sum_{n=0}^{3} \frac{e^{-(0.88)3}\big[(0.88)3\big]^n}{n!} \approx 0.273.$$

19. Let $X$ be the number of earthquakes of magnitude 5.5 or higher on the Richter scale during the next 60 years. Clearly, $X$ is a Poisson random variable with parameter $\lambda = 6(1.5) = 9$. Let $A$ be the event that the earthquakes will not damage the bridge during the next 60 years. Since the events $\{X = i\}$, $i = 0, 1, 2, \ldots$, are mutually exclusive and $\bigcup_{i=0}^{\infty}\{X = i\}$ is the sample space, by the Law of Total Probability (Theorem 3.4),
$$\begin{aligned}
P(A) &= \sum_{i=0}^{\infty} P(A \mid X = i)P(X = i) = \sum_{i=0}^{\infty} (1 - 0.015)^i \frac{e^{-9}9^i}{i!} = \sum_{i=0}^{\infty} (0.985)^i \frac{e^{-9}9^i}{i!} \\
&= e^{-9}\sum_{i=0}^{\infty} \frac{\big[(0.985)(9)\big]^i}{i!} = e^{-9}e^{(0.985)(9)} = 0.873716.
\end{aligned}$$

20. Let $N$ be the total number of letter carriers in America. Let $n$ be the total number of dog bites letter carriers sustain. Let $X$ be the number of bites a randomly selected letter carrier, say Karl, sustains in a given year. Call a bite "success" if it is Karl that is bitten and "failure" if anyone but Karl is bitten. Since the letter carriers are bitten randomly, it is reasonable to assume that $X$ is approximately a binomial random variable with parameters $n$ and $p = 1/N$. Given that $n$ is large (it was more than 7000 in 1983 and at least 2,795 in 1997), $1/N$ is small, and $n/N$ is moderate, $X$ can be approximated by a Poisson random variable with parameter $\lambda = n/N$. We know that $P(X = 0) = 0.94$. This implies that $(e^{-\lambda} \cdot \lambda^0)/0! = 0.94$. Thus $e^{-\lambda} = 0.94$, and hence $\lambda = -\ln 0.94 = 0.061875$. Therefore, $X$ is a Poisson random variable with parameter 0.061875. Now
$$P\big(X > 1 \mid X \ge 1\big) = \frac{P(X > 1)}{P(X \ge 1)} = \frac{1 - P(X = 0) - P(X = 1)}{1 - P(X = 0)} = \frac{1 - 0.94 - 0.0581625}{1 - 0.94} = 0.030625,$$
where
$$P(X = 1) = \frac{e^{-\lambda} \cdot \lambda^1}{1!} = \lambda e^{-\lambda} = (0.061875)(0.94) = 0.0581625.$$
Therefore, approximately 3.06% of the letter carriers who sustained one bite will be bitten again.

21. We should find $n$ so that $1 - \dfrac{e^{-nM/N}(nM/N)^0}{0!} \ge \alpha$. This gives $n \ge -N\ln(1 - \alpha)/M$. The answer is the least integer greater than or equal to $-N\ln(1 - \alpha)/M$.

22. (a) For each $k$-combination $n_1, n_2, \ldots, n_k$ of $1, 2, \ldots, n$, there are $(n - 1)^{n-k}$ distributions with exactly $k$ matches, where the matches occur at $n_1, n_2, \ldots, n_k$. This is because each of the remaining $n - k$ balls can be placed into any of the cells except the cell that has the same number as the ball. Since there are $\binom{n}{k}$ $k$-combinations $n_1, n_2, \ldots, n_k$ of $1, 2, \ldots, n$, the total number of ways we can place the $n$ balls into the $n$ cells so that there are exactly $k$ matches is $\binom{n}{k}(n - 1)^{n-k}$. Hence the desired probability is
$$\frac{\binom{n}{k}(n - 1)^{n-k}}{n^n}.$$
(b) Let $X$ be the number of matches. We will show that $\lim_{n\to\infty} P(X = k) = e^{-1}/k!$; that is, $X$ is Poisson with parameter 1. We have
$$\begin{aligned}
\lim_{n\to\infty} P(X = k) &= \lim_{n\to\infty} \frac{\binom{n}{k}(n - 1)^{n-k}}{n^n} = \lim_{n\to\infty} \binom{n}{k}\left(\frac{n - 1}{n}\right)^n (n - 1)^{-k} \\
&= \lim_{n\to\infty} \frac{1}{k!} \cdot \frac{n!}{(n - k)!} \cdot \left(1 - \frac{1}{n}\right)^n \cdot \frac{1}{(n - 1)^k} = \frac{1}{k!}e^{-1}.
\end{aligned}$$
Note that $\lim_{n\to\infty}\left(1 - \dfrac{1}{n}\right)^n = e^{-1}$, and $\lim_{n\to\infty} \dfrac{n!}{(n - k)!\,(n - 1)^k} = 1$, since by Stirling's formula,
$$\begin{aligned}
\lim_{n\to\infty} \frac{n!}{(n - k)!\,(n - 1)^k} &= \lim_{n\to\infty} \frac{\sqrt{2\pi n} \cdot n^n \cdot e^{-n}}{\sqrt{2\pi(n - k)} \cdot (n - k)^{n-k} \cdot e^{-(n-k)} \cdot (n - 1)^k} \\
&= \lim_{n\to\infty} \sqrt{\frac{n}{n - k}} \cdot \frac{n^n}{(n - k)^n} \cdot \frac{(n - k)^k}{(n - 1)^k} \cdot \frac{1}{e^k} = 1 \cdot e^k \cdot 1 \cdot \frac{1}{e^k} = 1,
\end{aligned}$$
where $\dfrac{n^n}{(n - k)^n} \to e^k$ because $\dfrac{(n - k)^n}{n^n} = \left(1 - \dfrac{k}{n}\right)^n \to e^{-k}$.

23. (a) The probability of an even number of events in $(t, t + \alpha)$ is
$$\sum_{n=0}^{\infty} \frac{e^{-\lambda\alpha}(\lambda\alpha)^{2n}}{(2n)!} = e^{-\lambda\alpha}\sum_{n=0}^{\infty} \frac{(\lambda\alpha)^{2n}}{(2n)!}$$
$$= e^{-\alpha\lambda}\left[\frac{1}{2}\sum_{n=0}^{\infty} \frac{(\lambda\alpha)^n}{n!} + \frac{1}{2}\sum_{n=0}^{\infty} \frac{(-\lambda\alpha)^n}{n!}\right] = e^{-\alpha\lambda}\left[\frac{1}{2}e^{\lambda\alpha} + \frac{1}{2}e^{-\lambda\alpha}\right] = \frac{1}{2}\big(1 + e^{-2\lambda\alpha}\big).$$
(b) The probability of an odd number of events in $(t, t + \alpha)$ is
$$\begin{aligned}
\sum_{n=1}^{\infty} \frac{e^{-\lambda\alpha}(\lambda\alpha)^{2n-1}}{(2n - 1)!} &= e^{-\lambda\alpha}\sum_{n=1}^{\infty} \frac{(\lambda\alpha)^{2n-1}}{(2n - 1)!} = e^{-\lambda\alpha}\left[\frac{1}{2}\sum_{n=0}^{\infty} \frac{(\lambda\alpha)^n}{n!} - \frac{1}{2}\sum_{n=0}^{\infty} \frac{(-\lambda\alpha)^n}{n!}\right] \\
&= e^{-\lambda\alpha}\left[\frac{1}{2}e^{\lambda\alpha} - \frac{1}{2}e^{-\lambda\alpha}\right] = \frac{1}{2}\big(1 - e^{-2\lambda\alpha}\big).
\end{aligned}$$

24. We have that
$$\begin{aligned}
P\big(N_1(t) = n, N_2(t) = m\big) &= \sum_{i=0}^{\infty} P\big(N_1(t) = n, N_2(t) = m \mid N(t) = i\big)\,P\big(N(t) = i\big) \\
&= P\big(N_1(t) = n, N_2(t) = m \mid N(t) = n + m\big)\,P\big(N(t) = n + m\big) \\
&= \binom{n + m}{n}p^n(1 - p)^m \cdot \frac{e^{-\lambda t}(\lambda t)^{n+m}}{(n + m)!}.
\end{aligned}$$
Therefore,
$$\begin{aligned}
P\big(N_1(t) = n\big) &= \sum_{m=0}^{\infty} P\big(N_1(t) = n, N_2(t) = m\big) \\
&= \sum_{m=0}^{\infty} \binom{n + m}{n}p^n(1 - p)^m \cdot \frac{e^{-\lambda t}(\lambda t)^{n+m}}{(n + m)!} \\
&= \sum_{m=0}^{\infty} \frac{(n + m)!}{n!\,m!}\,p^n(1 - p)^m\,\frac{e^{-\lambda tp}e^{-\lambda t(1-p)}(\lambda t)^n(\lambda t)^m}{(n + m)!} \\
&= \sum_{m=0}^{\infty} \frac{e^{-\lambda tp}e^{-\lambda t(1-p)}(\lambda tp)^n\big[\lambda t(1 - p)\big]^m}{n!\,m!} \\
&= \frac{e^{-\lambda tp}(\lambda tp)^n}{n!}\sum_{m=0}^{\infty} \frac{e^{-\lambda t(1-p)}\big[\lambda t(1 - p)\big]^m}{m!} = \frac{e^{-\lambda tp}(\lambda tp)^n}{n!}.
\end{aligned}$$
It can easily be argued that the other properties of a Poisson process are also satisfied for the process $\{N_1(t) : t \ge 0\}$. So $\{N_1(t) : t \ge 0\}$ is a Poisson process with rate $\lambda p$. By symmetry, $\{N_2(t) : t \ge 0\}$ is a Poisson process with rate $\lambda(1 - p)$.

25. Let $N(t)$ be the number of females entering the store between 0 and $t$.
By Exercise 24, $\{N(t) : t \ge 0\}$ is a Poisson process with rate $1 \cdot (2/3) = 2/3$. Hence the desired probability is
$$P\big(N(15) = 15\big) = \frac{e^{-15(2/3)}\big[15(2/3)\big]^{15}}{15!} = 0.035.$$

26. (a) Let $A$ be the region whose points have a (positive) distance $d$ or less from the given tree. The desired probability is the probability of no trees in this region and is equal to
$$\frac{e^{-\lambda\pi d^2}(\lambda\pi d^2)^0}{0!} = e^{-\lambda\pi d^2}.$$
(b) We want to find the probability that the region $A$ has at most $n - 1$ trees. The desired quantity is
$$\sum_{i=0}^{n-1} \frac{e^{-\lambda\pi d^2}(\lambda\pi d^2)^i}{i!}.$$

27. $p(i) = (\lambda/i)\,p(i - 1)$ implies that for $i < \lambda$ the function $p$ is increasing, and for $i > \lambda$ it is decreasing. Hence $i = [\lambda]$ is the maximum.

5.3 OTHER DISCRETE RANDOM VARIABLES

1. Let $D$ denote a defective item drawn, and $N$ denote a nondefective item drawn. The answer is $S = \{NNN, DNN, NDN, NND, NDD, DND, DDN\}$.

2. $S = \{ss, fss, sfs, sffs, ffss, fsfs, sfffs, fsffs, fffss, ffsfs, \ldots\}$.

3. (a) $1/(1/12) = 12$. (b) $\left(\dfrac{11}{12}\right)^2\left(\dfrac{1}{12}\right) \approx 0.07$.

4. (a) $(1 - pq)^{r-1}pq$. (b) $1/(pq)$.

5. $\dbinom{7}{2}(0.2)^3(0.8)^5 \approx 0.055$.

6. (a) $(0.55)^5(0.45) \approx 0.023$. (b) $(0.55)^3(0.45)(0.55)^3(0.45) \approx 0.0056$.

7. $\left[\dbinom{5}{1}\dbinom{45}{7}\right] \Big/ \dbinom{50}{8} = 0.42$.

8. The probability that at least $n$ light bulbs are required is equal to the probability that the first $n - 1$ light bulbs are all defective. So the answer is $p^{n-1}$.

9. We have
$$\frac{P(N = n)}{P(X = x)} = \frac{\binom{n-1}{x-1}p^x(1 - p)^{n-x}}{\binom{n}{x}p^x(1 - p)^{n-x}} = \frac{x}{n}.$$

10. Let $X$ be the number of words the student had to spell until spelling a word correctly. The random variable $X$ is geometric with parameter 0.70.
The desired probability is given by
$$P(X \le 4) = \sum_{i=1}^{4} (0.30)^{i-1}(0.70) = 0.9919.$$

11. The average number of digits until the fifth 3 is $5/(1/10) = 50$. So the average number of digits before the fifth 3 is 49.

12. The probability that a random bridge hand has three aces is
$$p = \frac{\binom{4}{3}\binom{48}{10}}{\binom{52}{13}} = 0.0412.$$
Therefore, the average number of bridge hands until one has three aces is $1/p = 1/0.0412 = 24.27$.

13. Either the $(N + 1)$st success must occur on the $(N + M - m + 1)$st trial, or the $(M + 1)$st failure must occur on the $(N + M - m + 1)$st trial. The answer is
$$\binom{N + M - m}{N}\left(\frac{1}{2}\right)^{N+M-m+1} + \binom{N + M - m}{M}\left(\frac{1}{2}\right)^{N+M-m+1}.$$

14. We have that $X + 10$ is negative binomial with parameters $(10, 0.15)$. Therefore, for all $i \ge 0$,
$$P(X = i) = P(X + 10 = i + 10) = \binom{i + 9}{9}(0.15)^{10}(0.85)^i.$$

15. Let $X$ be the number of good diskettes in the sample. The desired probability is
$$P(X \ge 9) = P(X = 9) + P(X = 10) = \frac{\binom{10}{1}\binom{90}{9}}{\binom{100}{10}} + \frac{\binom{90}{10}\binom{10}{0}}{\binom{100}{10}} \approx 0.74.$$

16. We have that $560(0.35) = 196$ persons make contributions. So the answer is
$$1 - \frac{\binom{364}{15}}{\binom{560}{15}} - \frac{\binom{364}{14}\binom{196}{1}}{\binom{560}{15}} = 0.987.$$

17. The transmission of a message takes more than $t$ minutes if it is garbled the first $[t/2] + 1$ times it is sent, where $[t/2]$ is the greatest integer less than or equal to $t/2$. The probability of this is $p^{[t/2]+1}$.

18.
The probability that the sixth coin is accepted on the $n$th try is $\dbinom{n - 1}{5}(0.10)^6(0.90)^{n-6}$. Therefore, the desired probability is
$$\sum_{n=50}^{\infty} \binom{n - 1}{5}(0.10)^6(0.90)^{n-6} = 1 - \sum_{n=6}^{49} \binom{n - 1}{5}(0.10)^6(0.90)^{n-6} = 0.6346.$$

19. The probability that the station will successfully transmit or retransmit a message is $(1 - p)^{N-1}$. This is because for the station to successfully transmit or retransmit its message, none of the other stations should transmit messages at the same instant. The number of transmissions and retransmissions of a message until the success is geometric with parameter $(1 - p)^{N-1}$. Therefore, on average, the number of transmissions and retransmissions is $1/(1 - p)^{N-1}$.

20. If the fifth tail occurs after the 14th trial, ten or more heads have occurred. Therefore, the fifth tail occurs before the tenth head if and only if the fifth tail occurs before or on the 14th flip. Calling tails success, $X$, the number of flips required to get the fifth tail, is negative binomial with parameters 5 and $1/2$. The desired probability is given by
$$\sum_{n=5}^{14} P(X = n) = \sum_{n=5}^{14} \binom{n - 1}{4}\left(\frac{1}{2}\right)^5\left(\frac{1}{2}\right)^{n-5} \approx 0.91.$$

21. The probability of a straight is
$$\frac{10\,(4)^5 - 40}{\binom{52}{5}} = 0.003924647.$$
Therefore, the expected number of poker hands required until the first straight is $1/0.003924647 = 254.80$.

22. (a) Since
$$\frac{P(X = n - 1)}{P(X = n)} = \frac{1}{1 - p} > 1,$$
$P(X = n)$ is a decreasing function of $n$; hence its maximum is at $n = 1$.
(b) The probability that $X$ is even is given by
$$\sum_{k=1}^{\infty} P(X = 2k) = \sum_{k=1}^{\infty} p(1 - p)^{2k-1} = \frac{p(1 - p)}{1 - (1 - p)^2} = \frac{1 - p}{2 - p}.$$
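The two claims in parts (a) and (b) of Exercise 22 can be checked numerically. The sketch below is only an illustration, not part of the original solution: the function names are hypothetical, the geometric variable is assumed to take values 1, 2, 3, ... with $P(X = n) = p(1 - p)^{n-1}$, and the infinite sum is truncated at an arbitrarily chosen number of terms.

```python
def geometric_pmf(n: int, p: float) -> float:
    """P(X = n) for a geometric random variable on {1, 2, 3, ...}."""
    return p * (1 - p) ** (n - 1)


def prob_even_partial(p: float, terms: int = 2000) -> float:
    """Truncated sum of P(X = 2k) for k = 1, ..., terms (truncation is an assumption)."""
    return sum(geometric_pmf(2 * k, p) for k in range(1, terms + 1))


def prob_even_closed(p: float) -> float:
    """Closed form (1 - p) / (2 - p) from part (b)."""
    return (1 - p) / (2 - p)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        # The truncated sum and the closed form should agree closely.
        print(p, prob_even_partial(p), prob_even_closed(p))
```

For the values of $p$ used here the tail beyond 2000 terms is negligible; a $p$ very close to 0 would need more terms before the truncated sum matches the closed form.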
(c) We want to show the following: Let $X$ be a discrete random variable with the set of possible values $\{1, 2, 3, \ldots\}$. If for all positive integers $n$ and $m$,
$$P(X > n + m \mid X > m) = P(X > n), \tag{17}$$
then $X$ is a geometric random variable. That is, there exists a number $p$, $0$ [...] $\ell - \alpha\big\}$ divided by $\text{area}(S) = \ell^2/2$. But
$$\text{area}(R) = \begin{cases} \dfrac{(3\alpha - \ell)^2}{2} & \text{if } \dfrac{\ell}{3} \le \alpha \le \dfrac{\ell}{2} \\[2ex] \dfrac{\ell^2}{2} - \dfrac{3\ell^2}{2}\left(1 - \dfrac{\alpha}{\ell}\right)^2 & \text{if } \dfrac{\ell}{2} \le \alpha \le \ell. \end{cases}$$
Hence the desired probability is
$$P(E) = \begin{cases} \left(\dfrac{3\alpha}{\ell} - 1\right)^2 & \text{if } \dfrac{\ell}{3} \le \alpha \le \dfrac{\ell}{2} \\[2ex] 1 - 3\left(1 - \dfrac{\alpha}{\ell}\right)^2 & \text{if } \dfrac{\ell}{2} \le \alpha \le \ell. \end{cases}$$

25. $R$ is the square bounded by the lines $x + y = 1$, $-x + y = 1$, $-x - y = 1$, and $x - y = 1$; its area is 2. To find the probability density function of $X$, the $x$-coordinate of the point selected at random from $R$, first we calculate $P(X \le t)$ for all $t$. For $-1 \le t < 0$, $P(X \le t)$ is the area of the triangle bounded by the lines $-x + y = 1$, $-x - y = 1$, and $x = t$, which is $(1 + t)^2$, divided by $\text{area}(R) = 2$. (Draw a figure.) For $0 \le t < 1$, $P(X \le t)$ is the area inside $R$ to the left of the line $x = t$, which is $2 - (1 - t)^2$, divided by $\text{area}(R) = 2$. Therefore,
$$P(X \le t) = \begin{cases} 0 & t < -1 \\[1ex] \dfrac{(1 + t)^2}{2} & -1 \le t < 0 \\[2ex] \dfrac{2 - (1 - t)^2}{2} & 0 \le t < 1 \\[2ex] 1 & t \ge 1, \end{cases}$$
and hence
$$\frac{d}{dt}P(X \le t) = \begin{cases} 1 + t & -1 \le t < 0 \\ 1 - t & 0 \le t < 1 \\ 0 & \text{otherwise.} \end{cases}$$
This shows that $f_X(t)$, the probability density function of $X$, is given by $f_X(t) = 1 - |t|$ for $-1 \le t \le 1$, and $0$ elsewhere.

26. Clearly,
$$P(Z \le z) = \iint_{\{(x,y):\ y/x \le z\}} f(x, y)\,dx\,dy.$$
Now for $x > 0$, $y/x \le z$ if and only if $y \le xz$; for $x < 0$, $y/x \le z$ if and only if $y \ge xz$. Therefore, the integration region is
$$\big\{(x, y) : x < 0,\ y \ge xz\big\} \cup \big\{(x, y) : x > 0,\ y \le xz\big\}.$$
Thus
$$P(Z \le z) = \int_{-\infty}^{0}\left(\int_{xz}^{\infty} f(x, y)\,dy\right)dx + \int_{0}^{\infty}\left(\int_{-\infty}^{xz} f(x, y)\,dy\right)dx.$$
Using the substitution $y = tx$, we get
$$\begin{aligned}
P(Z \le z) &= \int_{-\infty}^{0}\left(\int_{z}^{-\infty} x f(x, tx)\,dt\right)dx + \int_{0}^{\infty}\left(\int_{-\infty}^{z} x f(x, tx)\,dt\right)dx \\
&= \int_{-\infty}^{0}\left(\int_{-\infty}^{z} -x f(x, tx)\,dt\right)dx + \int_{0}^{\infty}\left(\int_{-\infty}^{z} x f(x, tx)\,dt\right)dx \\
&= \int_{-\infty}^{0}\left(\int_{-\infty}^{z} |x| f(x, tx)\,dt\right)dx + \int_{0}^{\infty}\left(\int_{-\infty}^{z} |x| f(x, tx)\,dt\right)dx \\
&= \int_{-\infty}^{\infty}\left(\int_{-\infty}^{z} |x| f(x, tx)\,dt\right)dx = \int_{-\infty}^{z}\left(\int_{-\infty}^{\infty} |x| f(x, tx)\,dx\right)dt.
\end{aligned}$$
Differentiating with respect to $z$, the Fundamental Theorem of Calculus implies that
$$f_Z(z) = \frac{d}{dz}P(Z \le z) = \int_{-\infty}^{\infty} |x| f(x, xz)\,dx.$$

27. Note that there are exactly $n$ such closed semicircular disks because the probability that the diameter through $P_i$ contains any other point $P_j$ is 0. (Draw a figure.) Let $E$ be the event that all the points are contained in a closed semicircular disk. Let $E_i$ be the event that the points are all in $D_i$. Clearly, $E = \bigcup_{i=1}^{n} E_i$. Since there is at most one $D_i$, $1 \le i \le n$, that contains all the $P_i$'s, the events $E_1, E_2, \ldots, E_n$ are mutually exclusive. Hence
$$P(E) = P\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i) = \sum_{i=1}^{n} \left(\frac{1}{2}\right)^{n-1} = n\left(\frac{1}{2}\right)^{n-1},$$
where the next-to-last equality follows because $P(E_i)$ is the probability that $P_1, P_2, \ldots, P_{i-1}, P_{i+1}, \ldots, P_n$ fall inside $D_i$.
The probability that any of these falls inside $D_i$ is (area of $D_i$)/(area of the disk) $= 1/2$, independently of the others. Hence the probability that all of them fall inside $D_i$ is $(1/2)^{n-1}$.

28. We have that
$$\begin{aligned}
f_X(x) &= \frac{\Gamma(\alpha + \beta + \gamma)}{\Gamma(\alpha)\Gamma(\beta)\Gamma(\gamma)}\int_0^{1-x} x^{\alpha-1}y^{\beta-1}(1 - x - y)^{\gamma-1}\,dy \\
&= \frac{1}{B(\alpha, \beta + \gamma)B(\beta, \gamma)}\,x^{\alpha-1}\int_0^{1-x} y^{\beta-1}(1 - x - y)^{\gamma-1}\,dy.
\end{aligned}$$
Let $z = y/(1 - x)$; then $dy = (1 - x)\,dz$, and
$$\int_0^{1-x} y^{\beta-1}(1 - x - y)^{\gamma-1}\,dy = (1 - x)^{\beta+\gamma-1}\int_0^1 z^{\beta-1}(1 - z)^{\gamma-1}\,dz = (1 - x)^{\beta+\gamma-1}B(\beta, \gamma).$$
So
$$f_X(x) = \frac{1}{B(\alpha, \beta + \gamma)B(\beta, \gamma)}\,x^{\alpha-1}(1 - x)^{\beta+\gamma-1}B(\beta, \gamma) = \frac{1}{B(\alpha, \beta + \gamma)}\,x^{\alpha-1}(1 - x)^{\beta+\gamma-1}.$$
This shows that $X$ is beta with parameters $(\alpha, \beta + \gamma)$. A similar argument shows that $Y$ is beta with parameters $(\beta, \gamma + \alpha)$.

29. It is straightforward to check that $f(x, y) \ge 0$, $f$ is continuous, and
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1.$$
Therefore, $f$ is a continuous probability density function. We will show that $\dfrac{\partial F}{\partial x}$ does not exist at $(0, 0)$. Similarly, one can show that $\dfrac{\partial F}{\partial x}$ does not exist at any point on the $y$-axis. Note that for small $\Delta x > 0$,
$$\begin{aligned}
F(\Delta x, 0) - F(0, 0) &= P(X \le \Delta x,\ Y \le 0) - P(X \le 0,\ Y \le 0) \\
&= P(0 \le X \le \Delta x,\ Y \le 0) = \int_{-\infty}^{0}\int_{0}^{\Delta x} f(x, y)\,dx\,dy.
\end{aligned}$$
Now, from the definition of $f(x, y)$, we must have $\Delta x < (1/2)e^{y}$ or, equivalently, $y > \ln(2\Delta x)$. Thus, for small $\Delta x > 0$,
$$F(\Delta x, 0) - F(0, 0) = \int_{\ln(2\Delta x)}^{0}\int_{0}^{\Delta x} (1 - 2xe^{-y})\,dx\,dy = (\Delta x)^2 - \left[(\Delta x)\ln(2\Delta x) + \frac{\Delta x}{2}\right].$$
This implies that
$$\lim_{\Delta x \to 0+} \frac{F(\Delta x, 0) - F(0, 0)}{\Delta x} = \lim_{\Delta x \to 0+}\left[\Delta x - \ln(2\Delta x) - \frac{1}{2}\right] = \infty,$$
showing that $\dfrac{\partial F}{\partial x}$ does not exist at $(0, 0)$.

8.2 INDEPENDENT RANDOM VARIABLES

1. Note that $p_X(x) = (1/25)(3x^2 + 5)$ and $p_Y(y) = (1/25)(2y^2 + 5)$. Now $p_X(1) = 8/25$, $p_Y(0) = 5/25$, and $p(1, 0) = 1/25$.
Since $p(1, 0) \ne p_X(1)p_Y(0)$, $X$ and $Y$ are dependent.

2. Note that
$$p(1, 1) = \frac{1}{7}, \qquad p_X(1) = p(1, 1) + p(1, 2) = \frac{1}{7} + \frac{2}{7} = \frac{3}{7}, \qquad p_Y(1) = p(1, 1) + p(2, 1) = \frac{1}{7} + \frac{5}{7} = \frac{6}{7}.$$
Since $p(1, 1) \ne p_X(1)p_Y(1)$, $X$ and $Y$ are dependent.

3. By the independence of $X$ and $Y$,
$$P(X = 1, Y = 3) = P(X = 1)P(Y = 3) = \frac{1}{2}\left(\frac{2}{3}\right) \cdot \frac{1}{2}\left(\frac{2}{3}\right)^3 = \frac{4}{81}.$$
$$\begin{aligned}
P(X + Y = 3) &= P(X = 1, Y = 2) + P(X = 2, Y = 1) \\
&= \frac{1}{2}\left(\frac{2}{3}\right) \cdot \frac{1}{2}\left(\frac{2}{3}\right)^2 + \frac{1}{2}\left(\frac{2}{3}\right)^2 \cdot \frac{1}{2}\left(\frac{2}{3}\right) = \frac{4}{27}.
\end{aligned}$$

4. No, they are not independent because, for example, $P(X = 0 \mid Y = 8) = 1$ but
$$P(X = 0) = \frac{\binom{39}{8}}{\binom{52}{8}} = 0.08175 \ne 1,$$
showing that $P(X = 0 \mid Y = 8) \ne P(X = 0)$.

5. The answer is
$$\binom{7}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^5 \cdot \binom{8}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^6 = 0.0179.$$

6. We have that
$$P\big(\max(X, Y) \le t\big) = P(X \le t, Y \le t) = P(X \le t)P(Y \le t) = F(t)G(t).$$
$$\begin{aligned}
P\big(\min(X, Y) \le t\big) &= 1 - P\big(\min(X, Y) > t\big) = 1 - P(X > t, Y > t) = 1 - P(X > t)P(Y > t) \\
&= 1 - \big[1 - F(t)\big]\big[1 - G(t)\big] = F(t) + G(t) - F(t)G(t).
\end{aligned}$$

7. Let $X$ and $Y$ be the number of heads obtained by Adam and Andrew, respectively.
The desired probability is
$$\begin{aligned}
\sum_{i=0}^{n} P(X = i, Y = i) &= \sum_{i=0}^{n} P(X = i)P(Y = i) \\
&= \sum_{i=0}^{n} \binom{n}{i}\left(\frac{1}{2}\right)^i\left(\frac{1}{2}\right)^{n-i} \cdot \binom{n}{i}\left(\frac{1}{2}\right)^i\left(\frac{1}{2}\right)^{n-i} \\
&= \left(\frac{1}{2}\right)^{2n}\sum_{i=0}^{n} \binom{n}{i}^2 = \left(\frac{1}{2}\right)^{2n}\binom{2n}{n},
\end{aligned}$$
where the last equality follows by Example 2.28.

An Intuitive Solution: Let $Z$ be the number of tails obtained by Andrew. The desired probability is
$$\begin{aligned}
\sum_{i=0}^{n} P(X = i, Y = i) &= \sum_{i=0}^{n} P(X = i, Z = i) = \sum_{i=0}^{n} P(X = i, Y = n - i) \\
&= P(\text{Adam and Andrew get a total of } n \text{ heads}) \\
&= P(n \text{ heads in } 2n \text{ flips of a fair coin}) = \left(\frac{1}{2}\right)^{2n}\binom{2n}{n}.
\end{aligned}$$

8. For $i, j \in \{0, 1, 2, 3\}$, the sum of the numbers in the $i$th row is $p_X(i)$ and the sum of the numbers in the $j$th column is $p_Y(j)$. We have that
$$p_X(0) = 0.41,\quad p_X(1) = 0.44,\quad p_X(2) = 0.14,\quad p_X(3) = 0.01;$$
$$p_Y(0) = 0.41,\quad p_Y(1) = 0.44,\quad p_Y(2) = 0.14,\quad p_Y(3) = 0.01.$$
Since for all $x, y \in \{0, 1, 2, 3\}$, $p(x, y) = p_X(x)p_Y(y)$, $X$ and $Y$ are independent.

9. They are not independent because
$$f_X(x) = \int_0^x 2\,dy = 2x, \quad 0 \le x \le 1; \qquad f_Y(y) = \int_y^1 2\,dx = 2(1 - y), \quad 0 \le y \le 1;$$
and so $f(x, y) \ne f_X(x)f_Y(y)$.

10. Let $X$ and $Y$ be the amount of cholesterol in the first and in the second sandwiches, respectively. Since $X$ and $Y$ are continuous random variables, $P(X = Y) = 0$ regardless of what the probability density functions of $X$ and $Y$ are.

11.
We have that
$$f_X(x) = \int_0^{\infty} x^2 e^{-x(y+1)}\,dy = xe^{-x}, \quad x \ge 0; \qquad f_Y(y) = \int_0^{\infty} x^2 e^{-x(y+1)}\,dx = \frac{2}{(y + 1)^3}, \quad y \ge 0,$$
where the second integral is calculated by applying integration by parts twice. Now since $f(x, y) \ne f_X(x)f_Y(y)$, $X$ and $Y$ are not independent.

12. Clearly,
$$E(XY) = \int_0^1\int_x^1 (xy)(8xy)\,dy\,dx = \int_0^1\left(\int_x^1 8y^2\,dy\right)x^2\,dx = \frac{4}{9},$$
$$E(X) = \int_0^1\int_x^1 x(8xy)\,dy\,dx = \frac{8}{15}, \qquad E(Y) = \int_0^1\int_x^1 y(8xy)\,dy\,dx = \frac{4}{5}.$$
So $E(XY) \ne E(X)E(Y)$.

13. Since $f(x, y) = e^{-x} \cdot 2e^{-2y} = f_X(x)f_Y(y)$, $X$ and $Y$ are independent exponential random variables with parameters 1 and 2, respectively. Therefore,
$$E(X^2Y) = E(X^2)E(Y) = 2 \cdot \frac{1}{2} = 1.$$

14. The joint probability density function of $X$ and $Y$ is given by
$$f(x, y) = \begin{cases} e^{-(x+y)} & x > 0,\ y > 0 \\ 0 & \text{elsewhere.} \end{cases}$$
Let $G$ be the probability distribution function, and $g$ be the probability density function of $X/Y$. For $t > 0$,
$$G(t) = P\left(\frac{X}{Y} \le t\right) = P(X \le tY) = \int_0^{\infty}\left(\int_0^{ty} e^{-(x+y)}\,dx\right)dy = \frac{t}{1 + t}.$$
Therefore, for $t > 0$,
$$g(t) = G'(t) = \frac{1}{(1 + t)^2}.$$
Note that $G'(t) = 0$ for $t < 0$; $G'(0)$ does not exist.

15. Let $F$ and $f$ be the probability distribution and probability density functions of $\max(X, Y)$, respectively. Clearly,
$$F(t) = P\big(\max(X, Y) \le t\big) = P(X \le t, Y \le t) = (1 - e^{-t})^2, \quad t \ge 0.$$
Thus
$$f(t) = F'(t) = 2e^{-t}(1 - e^{-t}) = 2e^{-t} - 2e^{-2t}.$$
Hence
$$E\big[\max(X, Y)\big] = 2\int_0^{\infty} te^{-t}\,dt - \int_0^{\infty} 2te^{-2t}\,dt = 2 - \frac{1}{2} = \frac{3}{2}.$$
Note that $\int_0^{\infty} te^{-t}\,dt$ is the expected value of an exponential random variable with parameter 1, thus it is 1.
Also, $\int_0^{\infty} 2te^{-2t}\,dt$ is the expected value of an exponential random variable with parameter 2, thus it is $1/2$.

16. Let $F$ and $f$ be the probability distribution and probability density functions of $\max(X, Y)$. For $-1$ [...] $t) = 1 - \int_t^1\int_{t/x}^1 dy\,dx = t - t\ln t$. Hence $f(t) = F'(t) = -\ln t$ for $0$ [...] $0,\ y > 0$. So the desired probability is
$$P(Y > X) = \int_0^{\infty}\left[\int_x^{\infty} \frac{2}{11}e^{-(2y)/11}\,dy\right]\frac{1}{6}e^{-x/6}\,dx = \frac{11}{23}.$$

21. If $I_A$ and $I_B$ are independent, then
$$P(I_A = 1, I_B = 1) = P(I_A = 1)P(I_B = 1).$$
This is equivalent to $P(AB) = P(A)P(B)$, which shows that $A$ and $B$ are independent. On the other hand, if $\{A, B\}$ is an independent set, so are the following: $\{A, B^c\}$, $\{A^c, B\}$, and $\{A^c, B^c\}$. Therefore,
$$P(AB) = P(A)P(B), \quad P(AB^c) = P(A)P(B^c), \quad P(A^cB) = P(A^c)P(B), \quad P(A^cB^c) = P(A^c)P(B^c).$$
These relations, respectively, imply that
$$\begin{aligned}
P(I_A = 1, I_B = 1) &= P(I_A = 1)P(I_B = 1), \\
P(I_A = 1, I_B = 0) &= P(I_A = 1)P(I_B = 0), \\
P(I_A = 0, I_B = 1) &= P(I_A = 0)P(I_B = 1), \\
P(I_A = 0, I_B = 0) &= P(I_A = 0)P(I_B = 0).
\end{aligned}$$
These four relations show that $I_A$ and $I_B$ are independent random variables.

22. The joint probability density function of $B$ and $C$ is $f(b, c) = \dfrac{9b^2c^2}{676}$, $1$ [...] $0$, or, equivalently, $B^2 > 4C$. Let $E = \big\{(b, c) : 1$ [...] $4c\big\}$; the desired probability is
$$\iint_E \frac{9b^2c^2}{676}\,db\,dc = \int_2^3\left(\int_1^{b^2/4} \frac{9b^2c^2}{676}\,dc\right)db \approx 0.12.$$
(Draw a figure to verify the region of integration.)

23. Note that
$$f_X(x) = \int_{-\infty}^{\infty} g(x)h(y)\,dy = g(x)\int_{-\infty}^{\infty} h(y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} g(x)h(y)\,dx = h(y)\int_{-\infty}^{\infty} g(x)\,dx.$$
Now
$$\begin{aligned}
f_X(x)f_Y(y) &= g(x)h(y)\int_{-\infty}^{\infty} h(y)\,dy\int_{-\infty}^{\infty} g(x)\,dx = f(x, y)\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(y)g(x)\,dy\,dx \\
&= f(x, y)\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx = f(x, y).
\end{aligned}$$
This relation shows that $X$ and $Y$ are independent.

24. Let $G$ and $g$ be the probability distribution and probability density functions of $\max(X, Y)\big/\min(X, Y)$. Then $G(t) = 0$ if $t < 1$. For $t \ge 1$,
$$\begin{aligned}
G(t) &= P\left(\frac{\max(X, Y)}{\min(X, Y)} \le t\right) = P\big(\max(X, Y) \le t\min(X, Y)\big) \\
&= P\big(X \le t\min(X, Y),\ Y \le t\min(X, Y)\big) \\
&= P\left(\min(X, Y) \ge \frac{X}{t},\ \min(X, Y) \ge \frac{Y}{t}\right) \\
&= P\left(X \ge \frac{X}{t},\ Y \ge \frac{X}{t},\ X \ge \frac{Y}{t},\ Y \ge \frac{Y}{t}\right) \\
&= P\left(Y \ge \frac{X}{t},\ X \ge \frac{Y}{t}\right) = P\left(\frac{X}{t} \le Y \le tX\right).
\end{aligned}$$
This quantity is the area of the region $\big\{(x, y) : 0$ [...] $|y|$,
$$f_{Y|X}(y \mid x) = \frac{(1/2)e^{-x}}{\int_{-x}^{x} (1/2)e^{-x}\,dy} = \frac{1}{2x}, \qquad -x \text{ [...] } 5.$$
Using these, we have that
$$\begin{aligned}
E(X \mid Y = 5) &= \sum_{x=1}^{\infty} x\,p_{X|Y}(x \mid 5) = \sum_{x=1}^{\infty} x\,\frac{p(x, 5)}{p_Y(5)} \\
&= \sum_{x=1}^{4} \frac{1}{11}\,x\left(\frac{11}{12}\right)^x + \sum_{x=6}^{\infty} x\left(\frac{11}{12}\right)^4\left(\frac{1}{13}\right)\left(\frac{12}{13}\right)^{x-6} \\
&= 0.702932 + \left(\frac{11}{12}\right)^4\left(\frac{1}{13}\right)\sum_{y=0}^{\infty} (y + 6)\left(\frac{12}{13}\right)^y \\
&= 0.702932 + \left(\frac{11}{12}\right)^4\left(\frac{1}{13}\right)\left[\sum_{y=0}^{\infty} y\left(\frac{12}{13}\right)^y + 6\sum_{y=0}^{\infty} \left(\frac{12}{13}\right)^y\right] \\
&= 0.702932 + \left(\frac{11}{12}\right)^4\left(\frac{1}{13}\right)\left[\frac{12/13}{(1/13)^2} + 6\,\frac{1}{1 - (12/13)}\right] = 13.412.
\end{aligned}$$
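The value 13.412 can be reproduced with a short numerical check. The sketch below is an illustration rather than part of the original solution: the function names and the truncation point `max_x` are assumptions, and the conditional mass function is the one used in the two sums above (draws with replacement, first king on the fifth draw, $X$ the number of draws until the first ace).

```python
def cond_pmf(x: int, king_trial: int = 5) -> float:
    """P(X = x | first king occurs on draw `king_trial`), draws with replacement."""
    if x < king_trial:
        # Draws before the king are non-kings, so each is an ace with probability 4/48.
        return (11 / 12) ** (x - 1) * (1 / 12)
    if x == king_trial:
        # The draw that produced the king cannot also be the first ace.
        return 0.0
    # No ace among the non-king draws, then an ordinary geometric wait (p = 1/13).
    return (11 / 12) ** (king_trial - 1) * (12 / 13) ** (x - king_trial - 1) * (1 / 13)


def cond_expectation(king_trial: int = 5, max_x: int = 5000) -> float:
    """Truncated sum of x * P(X = x | Y = king_trial); max_x is an assumed cutoff."""
    return sum(x * cond_pmf(x, king_trial) for x in range(1, max_x + 1))


if __name__ == "__main__":
    head = sum(x * cond_pmf(x) for x in range(1, 5))
    print(round(head, 6))                 # contribution of x = 1, ..., 4
    print(round(cond_expectation(), 3))   # should agree with the 13.412 derived above
```

The tail beyond `max_x = 5000` is far below floating-point precision here, so the truncated sum is a reasonable stand-in for the infinite series.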
Remark: In successive draws of cards from an ordinary deck of 52 cards, one at a time, randomly, and with replacement, the expected value of the number of draws until the first ace is $1/(1/13) = 13$. This exercise shows that knowing the first king occurred on the fifth trial will increase, on the average, the number of trials until the first ace by 0.412 draws.

17. Let $X$ be the number of blue chips in the first 9 draws and $Y$ be the number of blue chips drawn altogether. We have that
$$\begin{aligned}
E(X \mid Y = 10) &= \sum_{x=0}^{9} x\,\frac{p(x, 10)}{p_Y(10)} \\
&= \sum_{x=1}^{9} x\,\frac{\binom{9}{x}\left(\frac{12}{22}\right)^x\left(\frac{10}{22}\right)^{9-x} \cdot \binom{9}{10-x}\left(\frac{12}{22}\right)^{10-x}\left(\frac{10}{22}\right)^{x-1}}{\binom{18}{10}\left(\frac{12}{22}\right)^{10}\left(\frac{10}{22}\right)^8} \\
&= \sum_{x=1}^{9} x\,\frac{\binom{9}{x}\binom{9}{10-x}}{\binom{18}{10}} = \frac{9 \times 10}{18} = 5,
\end{aligned}$$
where the last sum is $(9 \times 10)/18$ because it is the expected value of a hypergeometric random variable with $N = 18$, $D = 9$, and $n = 10$.

18. Clearly,
$$f_X(x) = \int_x^1 n(n - 1)(y - x)^{n-2}\,dy = n(1 - x)^{n-1}.$$
Thus
$$f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)} = \frac{n(n - 1)(y - x)^{n-2}}{n(1 - x)^{n-1}} = \frac{(n - 1)(y - x)^{n-2}}{(1 - x)^{n-1}}.$$
Therefore,
$$E(Y \mid X = x) = \int_x^1 y\,\frac{(n - 1)(y - x)^{n-2}}{(1 - x)^{n-1}}\,dy = \frac{n - 1}{(1 - x)^{n-1}}\int_x^1 y(y - x)^{n-2}\,dy.$$
But
$$\begin{aligned}
\int_x^1 y(y - x)^{n-2}\,dy &= \int_x^1 (y - x + x)(y - x)^{n-2}\,dy = \int_x^1 (y - x)^{n-1}\,dy + \int_x^1 x(y - x)^{n-2}\,dy \\
&= \frac{(1 - x)^n}{n} + \frac{x(1 - x)^{n-1}}{n - 1}.
\end{aligned}$$
Thus
$$E(Y \mid X = x) = \frac{n - 1}{n}(1 - x) + x = \frac{n - 1}{n} + \frac{1}{n}x.$$

19. (a) The area of the triangle is $1/2$. So
$$f(x, y) = \begin{cases} 2 & \text{if } x \ge 0,\ y \ge 0,\ x + y \le 1 \\ 0 & \text{elsewhere.} \end{cases}$$
(b) $f_Y(y) = \int_0^{1-y} 2\,dx = 2(1-y)$, $0$ … $0,\ v > 0\}$. It has the unique solution $x = e^{-u/2}$, $y = e^{-v/2}$. Hence
$$J = \begin{vmatrix} -\frac{1}{2}e^{-u/2} & 0 \\ 0 & -\frac{1}{2}e^{-v/2} \end{vmatrix} = \frac{1}{4}e^{-(u+v)/2} \ne 0.$$
By Theorem 8.8, $g(u,v)$, the joint probability density function of $U$ and $V$, is
$$g(u,v) = f\big(e^{-u/2}, e^{-v/2}\big)\,\Big|\frac{1}{4}e^{-(u+v)/2}\Big| = \frac{1}{4}e^{-(u+v)/2}, \quad u > 0,\ v > 0.$$

2. Let $f(x,y)$ be the joint probability density function of $X$ and $Y$. Clearly,
$$f(x,y) = f_1(x)f_2(y), \quad x > 0,\ y > 0.$$
Let $V = X$ and $g(u,v)$ be the joint probability density function of $U$ and $V$. The probability density function of $U$ is $g_U(u)$, its marginal density function. The system of two equations in two unknowns
$$\begin{cases} x/y = u \\ x = v \end{cases}$$
defines a one-to-one transformation of $R = \{(x,y) : x > 0,\ y > 0\}$ onto the region $Q = \{(u,v) : u > 0,\ v > 0\}$. It has the unique solution $x = v$, $y = v/u$. Hence
$$J = \begin{vmatrix} 0 & 1 \\ -\dfrac{v}{u^2} & \dfrac{1}{u} \end{vmatrix} = \frac{v}{u^2} \ne 0.$$
By Theorem 8.8,
$$g(u,v) = f\Big(v, \frac{v}{u}\Big)\Big|\frac{v}{u^2}\Big| = \frac{v}{u^2}\,f\Big(v, \frac{v}{u}\Big) = \frac{v}{u^2}\,f_1(v)\,f_2\Big(\frac{v}{u}\Big), \quad u > 0,\ v > 0.$$
Therefore,
$$g_U(u) = \int_0^{\infty} \frac{v}{u^2}\,f_1(v)\,f_2\Big(\frac{v}{u}\Big)\,dv, \quad u > 0.$$

3. Let $g(r,\theta)$ be the joint probability density function of $R$ and $\Theta$.
We will show that g(r,θ) = g R (r)g Theta1 (θ). This proves the surprising result that R and Theta1 are independent. Let f(x,y) be the joint probability density function of X and Y. Clearly, f(x,y) = 1 2π e −(x 2 +y 2 )/2 , −∞ 0, 0 <θ<2π bracerightbig . It has the unique solution braceleftBigg x = r cos θ y = r sin θ. Hence J = vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle cos θ −r sin θ sin θrcos θ vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle = r negationslash= 0. By Therorem 8.8, g(r,θ) is given by g(r,θ) = f(rcos θ,r sin θ)|r|= 1 2π re −r 2 /2 0 <θ<2π, r > 0. Now g R (r) = integraldisplay 2π 0 1 2π re −r 2 /2 dθ = re −r 2 /2 ,r>0, and g Theta1 (θ) = integraldisplay ∞ 0 1 2π re −r 2 /2 dr = 1 2π , 0 <θ<2π. Therefore, g(r,θ) = g R (r)g Theta1 (θ), showing that R and Theta1 are independent random variables. The formula for g Theta1 (θ) indicates that Theta1 is a uniform random variable over the interval (0, 2π). The probability density function obtained for R is called Rayleigh. 4. Method 1: By the convolution theorem (Theorem 8.9), g, the probability density function of the sum of X and Y, the two random points selected from (0, 1) is given by g(t) = integraldisplay ∞ −∞ f 1 (x)f 2 (t − x)dx, where f 1 and f 2 are, respectively, the probability density functions of X and Y. Since f 1 (x) = f 2 (x) = braceleftBigg 1 x ∈ (0, 1) 0 elsewhere, the integrand, f 1 (x)f 2 (t − x) is nonzero if 0 t bracerightbig divided by the area of S:1− (2 − t) 2 2 . (Draw figures to verify these regions.) Let G be the probability distribution function of X + Y. We have shown that G(t) = ⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ 0 t<0 t 2 2 0 ≤ t<1 1 − (2 − t) 2 2 1 ≤ t<2 1 t ≥ 2. Therefore, g(t) = G prime (t) = ⎧ ⎪ ⎨ ⎪ ⎩ t 0 ≤ t<1 2 − t 1 ≤ t<2 0 otherwise. 5. (a) Clearly, p X (x) = 1/3 for x =−1, 0, 1 and p Y (y) = 1/3 for y =−1, 0, 1. 
Since
$$P(X + Y = z) = \begin{cases} 1/9 & z = -2, +2 \\ 2/9 & z = -1, +1 \\ 3/9 & z = 0, \end{cases}$$
the relation
$$P(X + Y = z) = \sum_x p_X(x)\,p_Y(z - x)$$
is easily seen to be true.

(b) $p(x,y) = p_X(x)p_Y(y)$ for all possible values $x$ and $y$ of $X$ and $Y$ if and only if $(1/9) + c = 1/9$ and $(1/9) - c = 1/9$; that is, if and only if $c = 0$.

6. Let $h(x,y)$ be the joint probability density function of $X$ and $Y$. Then
$$h(x,y) = \begin{cases} \dfrac{1}{x^2 y^2} & x \ge 1,\ y \ge 1 \\ 0 & \text{elsewhere.} \end{cases}$$
Consider the system of two equations in two unknowns
$$\begin{cases} x/y = u \\ xy = v. \end{cases} \tag{29}$$
This system has the unique solution
$$\begin{cases} x = \sqrt{uv} \\ y = \sqrt{v/u}. \end{cases} \tag{30}$$
We have that
$$x \ge 1 \iff \sqrt{uv} \ge 1 \iff u \ge \frac{1}{v}, \qquad y \ge 1 \iff \sqrt{v/u} \ge 1 \iff v \ge u.$$
Clearly, $x \ge 1$, $y \ge 1$ imply that $v = xy \ge 1$, so $1/v > 0$. Therefore, the system of equations (29) defines a one-to-one transformation of $R = \{(x,y) : x \ge 1,\ y \ge 1\}$ onto the region
$$Q = \Big\{(u,v) : 0 < \frac{1}{v} \le u \le v\Big\}.$$
By (30),
$$J = \begin{vmatrix} \dfrac{1}{2}\sqrt{\dfrac{v}{u}} & \dfrac{1}{2}\sqrt{\dfrac{u}{v}} \\[2ex] -\dfrac{\sqrt{v}}{2u\sqrt{u}} & \dfrac{1}{2\sqrt{uv}} \end{vmatrix} = \frac{1}{2u} \ne 0.$$
Hence, by Theorem 8.8, $g(u,v)$, the joint probability density function of $U$ and $V$, is given by
$$g(u,v) = h\Big(\sqrt{uv}, \sqrt{\frac{v}{u}}\Big)\,|J| = \frac{1}{2uv^2}, \quad 0 < \frac{1}{v} \le u \le v.$$

7. Let $h$ be the joint probability density function of $X$ and $Y$. Clearly,
$$h(x,y) = \begin{cases} e^{-(x+y)} & x > 0,\ y > 0 \\ 0 & \text{elsewhere.} \end{cases}$$
Consider the system of two equations in two unknowns
$$\begin{cases} x + y = u \\ e^{x} = v. \end{cases} \tag{31}$$
This system has the unique solution
$$\begin{cases} x = \ln v \\ y = u - \ln v. \end{cases} \tag{32}$$
We have that
$$x > 0 \iff \ln v > 0 \iff v > 1, \qquad y > 0 \iff u - \ln v > 0 \iff e^{u} > v.$$
Therefore, the system of equations (31) defines a one-to-one transformation of R = braceleftbig (x, y): x>0,y>0 bracerightbig onto the region Q = braceleftbig (u, v): u>0, 1 0, 1 0,y>0. Consider the system of two equations in two unknowns ⎧ ⎨ ⎩ x + y = u x x + y = v. (33) 190 Chapter 8 Bivariate Distributions Clearly, (33) implies that u>0 and v>0. This system has the unique solution braceleftBigg x = uv y = u − uv. (34) We have that x>0 ⇐⇒ uv > 0 ⇐⇒ u>0 and v>0, y>0 ⇐⇒ u − uv > 0 ⇐⇒ v<1. Therefore, the system of equations (33) defines a one-to-one transformation of R = braceleftbig (x, y): x>0,y>0 bracerightbig onto the region Q = braceleftbig (u, v): u>0, 0 0, 0 0, 0 0,y>0. The system of two equations in two unknowns braceleftBigg x + y = u x/y = v Chapter 8 Review Problems 191 defines a one-to-one transformation of R = braceleftbig (x, y): x>0,y>0 bracerightbig onto the region Q = braceleftbig (u, v): u>0,v>0 bracerightbig . It has he unique solution x = uv/(1 + v), y = u/(1 + v). Hence J = vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle v 1 + v u (1 + v) 2 1 1 + v − u (1 + v) 2 vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle =− u (1 + v) 2 negationslash= 0. By Theorem 8.8, g(u, v), the joint probability density function of U and V is g(u, v) = f parenleftBig uv 1 + v , u 1 + v parenrightBig |J|= λ 2 u (1 + v) 2 e −λu ,u>0,v>0. This shows that g(u, v) = g U (u)g V (v), where g U (u) = λ 2 ue −λu ,u>0, and g V (v) = 1 (1 + v) 2 ,v>0. Therefore, U = X + Y and V = X/Y are independent random variables. REVIEW PROBLEMS FOR CHAPTER 8 1. (a) We have that P(XY ≤ 6) = p(1, 2) + p(1, 4) + p(1, 6) + p(2, 2) + p(3, 2) = 0.05 + 0.14 + 0.10 + 0.25 + 0.15 = 0.69. (b) First we calculate p X (x) and p Y (y), the marginal probability mass functions of X and Y. They are given by the following table. 
              x = 1    x = 2    x = 3    p_Y(y)
    y = 2     0.05     0.25     0.15     0.45
    y = 4     0.14     0.10     0.17     0.41
    y = 6     0.10     0.02     0.02     0.14
    p_X(x)    0.29     0.37     0.34

Therefore,
$$E(X) = 1(0.29) + 2(0.37) + 3(0.34) = 2.05; \qquad E(Y) = 2(0.45) + 4(0.41) + 6(0.14) = 3.38.$$

2. (a) and (b) $p(x,y)$, the joint probability mass function of $X$ and $Y$, and $p_X(x)$ and $p_Y(y)$, the marginal probability mass functions of $X$ and $Y$, are given by the following table.

              y = 1    y = 2    y = 3    y = 4    y = 5    y = 6    p_X(x)
    x = 2     1/36     0        0        0        0        0        1/36
    x = 3     0        2/36     0        0        0        0        2/36
    x = 4     0        1/36     2/36     0        0        0        3/36
    x = 5     0        0        2/36     2/36     0        0        4/36
    x = 6     0        0        1/36     2/36     2/36     0        5/36
    x = 7     0        0        0        2/36     2/36     2/36     6/36
    x = 8     0        0        0        1/36     2/36     2/36     5/36
    x = 9     0        0        0        0        2/36     2/36     4/36
    x = 10    0        0        0        0        1/36     2/36     3/36
    x = 11    0        0        0        0        0        2/36     2/36
    x = 12    0        0        0        0        0        1/36     1/36
    p_Y(y)    1/36     3/36     5/36     7/36     9/36     11/36

(c) $E(X) = \sum_{x=2}^{12} x\,p_X(x) = 7$; $E(Y) = \sum_{y=1}^{6} y\,p_Y(y) = 161/36 \approx 4.47$.

3. Let $X$ be the number of spades and $Y$ be the number of hearts in the random bridge hand. The desired probability mass function is
$$p_{X|Y}(x|4) = \frac{p(x,4)}{p_Y(4)} = \frac{\dfrac{\binom{13}{x}\binom{13}{4}\binom{26}{9-x}}{\binom{52}{13}}}{\dfrac{\binom{13}{4}\binom{39}{9}}{\binom{52}{13}}} = \frac{\binom{13}{x}\binom{26}{9-x}}{\binom{39}{9}}, \quad 0 \le x \le 9.$$

4. The set of possible values of both $X$ and $Y$ is $\{0, 1, 2, 3\}$. Let $p(x,y)$ be their joint probability mass function; then
$$p(x,y) = \frac{\binom{13}{x}\binom{13}{y}\binom{26}{3-x-y}}{\binom{52}{3}}, \quad 0 \le x,\ y,\ x + y \le 3.$$

5. Reducing the sample space, the answer is
$$\frac{\binom{13}{x}\binom{13}{6-x}}{\binom{26}{6}}, \quad 0 \le x \le 6.$$

6. (a) $\displaystyle\int_0^2 \Big(\int_0^x \frac{c}{x}\,dy\Big)dx = 1 \Longrightarrow c = 1/2.$
(b) f X (x) = integraldisplay x 0 1 2x dy = 1 2 , 0 0,y>0. Therefore, by symmetry, P(X>2Y)+ P(Y >2X) = 2P(X>2Y)= 2 integraldisplay ∞ 0 parenleftbiggintegraldisplay ∞ 2y 4xye −x 2 e −y 2 dx parenrightbigg dy = 2 5 . 12. We have that f X (x) = integraldisplay 1−x 0 3(x + y)dy =− 3 2 x 2 + 3 2 , 0 1/2) = integraldisplay 1/2 0 bracketleftbiggintegraldisplay 1−x (1/2)−x 3(x + y)dy bracketrightbigg dx + integraldisplay 1 1/2 bracketleftbiggintegraldisplay 1−x 0 3(x + y)dy bracketrightbigg dx = 9 64 + 5 16 = 29 64 . 13. Since f X|Y (x|y) = f(x,y) f Y (y) = e −y integraltext 1 0 e −y dx = 1, 0 0, we have that E(X n | Y = y) = integraldisplay 1 0 x n · 1 dx = 1 n + 1 ,n≥ 1. Chapter 8 Review Problems 195 14. Let p(x,y) be the joint probability mass function of X and Y. We have that p(x,y) = parenleftbigg 10 x parenrightbigg parenleftBig 1 4 parenrightBig x parenleftBig 3 4 parenrightBig 10−x · parenleftbigg 15 y parenrightbigg parenleftBig 1 4 parenrightBig y parenleftBig 3 4 parenrightBig 15−y = parenleftbigg 10 x parenrightbiggparenleftbigg 15 y parenrightbigg parenleftBig 1 4 parenrightBig x+y parenleftBig 3 4 parenrightBig 25−x−y , 0 ≤ x ≤ 10, 0 ≤ y ≤ 15. 15. integraldisplay 1 0 bracketleftbiggintegraldisplay 1 x cx(1 − x)dy bracketrightbigg dx = 1 equal1⇒ c = 12. Clearly, f X (x) = integraldisplay 1 x 12x(1 − x)dy = 12x(1 − x) 2 , 0 Y parenrightBig P(X>Y) = 2P parenleftBig min(X, Y − X, lscript − Y)< lscript 20 vextendsingle vextendsingle vextendsingle Xx bracerightbig ; that is, 17lscript 20 × 17lscript 20 2 ÷ lscript 2 2 = 0.7225. Therefore, the desired probability is 1 − 0.7225 = 0.2775. 20. Let p(x,y) be the joint probability mass function of X and Y. p(x,y) = P(X= x, Y = y) = (0.90) x−1 (0.10)(0.90) y−1 (0.10) = (0.90) x+y−2 (0.10) 2 . 21. We have that f X (x) = integraldisplay x −x dy = 2x, 0 2 RANDOM VARIABLES 1. Let p(h, d, c, s) be the joint probability mass function of the number of hearts, diamonds, clubs, and spades selected. 
We have p(h, d, c, s) = parenleftbigg 13 h parenrightbiggparenleftbigg 13 d parenrightbiggparenleftbigg 13 c parenrightbiggparenleftbigg 13 s parenrightbigg parenleftbigg 52 13 parenrightbigg ,h+ d + c + s = 13, 0 ≤ h, d, c, s ≤ 13. 2. Let p(a, h, n, w) be the joint probability mass function of A, H, N, and W. Clearly, p(a, h, n, w) = parenleftbigg 8 a parenrightbiggparenleftbigg 7 h parenrightbiggparenleftbigg 3 n parenrightbiggparenleftbigg 20 w parenrightbigg parenleftbigg 38 12 parenrightbigg , a + h + n + w = 12, 0 ≤ a ≤ 8, 0 ≤ h ≤ 7, 0 ≤ n ≤ 3, 0 ≤ w ≤ 12. The marginal probability mass function of A is given by p A (a) = parenleftbigg 8 a parenrightbiggparenleftbigg 30 12 − a parenrightbigg parenleftbigg 38 12 parenrightbigg , 0 ≤ a ≤ 8. 3. (a) The desired joint marginal probability mass functions are given by p X,Y (x, y) = 2 summationdisplay z=1 xyz 162 = xy 54 ,x= 4, 5,y= 1, 2, 3. p Y,Z (y, z) = 5 summationdisplay x=4 xyz 162 = yz 18 ,y= 1, 2, 3,z= 1, 2. p X,Z (x, z) = 3 summationdisplay y=1 xyz 162 = xz 27 ,x= 4, 5,z= 1, 2. Section 9.1 Joint Distributions of n > 2 Random Variables 201 (b) E(YZ) = 3 summationdisplay y=1 2 summationdisplay z=1 yzp Y,Z (y, z) = 3 summationdisplay y=1 2 summationdisplay z=1 (yz) 2 18 = 35 9 . 4. (a) The desired marginal joint probability mass functions are given by f X,Y (x, y) = integraldisplay ∞ y 6e −x−y−z dz = 6e −x−2y , 0 0, f Y (y) = integraldisplay ∞ 0 parenleftbiggintegraldisplay ∞ 0 x 2 e −x(1+y+z) dz parenrightbigg dx = 1 (1 + y) 2 ,y>0, and similarly, f Z (z) = 1 (1 + z) 2 ,z>0. Also f X,Y (x, y) = integraldisplay ∞ 0 x 2 e −x(1+y+z) dz = xe −x(1+y) ,y>0. Since f(x,y,z) negationslash= f X (x)f Y (y)f Z (z), X, Y, and Z are not independent. Since f X,Y (x, y) negationslash= f X (x)f Y (y), X, Y, and Z are not pairwise independent either. 202 Chapter 9 Multivariate Distributions 7. 
(a) The marginal probability distribution functions of X, Y, and Z are, respectively, given by F X (x) = F(x,∞, ∞) = 1 − e −λ 1 x ,x>0, F Y (y) = F(∞,y,∞) = 1 − e −λ 2 y ,y>0, F Z (z) = F(∞, ∞,z)= 1 − e −λ 3 z ,z>0. Since F(x,y,z) = F X (x)F Y (y)F Z (z), the random variables X, Y, and Z are independent. (b) From part (a) it is clear that X, Y, and Z are independent exponential random variables with parameters λ 1 , λ 2 , and λ 3 , respectively. Hence their joint probability density functions is given by f(x,y,z) = λ 1 λ 2 λ 3 e −λ 1 x−λ 2 y−λ 3 z . (c) The desired probability is calculated as follows: P(X 2 Random Variables 203 11. Yes, it is because f ≥ 0 and integraldisplay ∞ 0 integraldisplay ∞ x 1 integraldisplay ∞ x 2 ··· integraldisplay ∞ x n−1 e −x n dx n dx n−1 ···dx 1 = integraldisplay ∞ 0 integraldisplay ∞ x 1 integraldisplay ∞ x 2 ··· integraldisplay ∞ x n−2 e −x n−1 dx n−1 ···dx 1 =···= integraldisplay ∞ 0 integraldisplay ∞ x 1 e −x 2 dx 2 dx 1 = integraldisplay ∞ 0 e −x 1 dx 1 = 1. 12. Let f(x 1 ,x 2 ,x 3 ) be the joint probability density function of X 1 , X 2 , and X 3 , the lifetimes of the original, the second, and the third transistors, respectively. We have that f(x 1 ,x 2 ,x 3 ) = 1 5 e −x 1 /5 · 1 5 e −x 2 /5 · 1 5 e −x 3 /5 = 1 125 e −(x 1 +x 2 +x 3 )/5 . Now P(X 1 + X 2 + X 3 < 15) = integraldisplay 15 0 integraldisplay 15−x 1 0 integraldisplay 15−x 1 −x 2 0 1 125 e −(x 1 +x 2 +x 3 )/5 dx 3 dx 2 dx 1 = integraldisplay 15 0 integraldisplay 15−x 1 0 bracketleftbigg 1 25 e −(x 1 +x 2 )/5 − 1 25 e −3 bracketrightbigg dx 2 dx 1 = integraldisplay 15 0 parenleftbigg 1 5 e −x 1 /5 − 4 5 e −3 + 1 25 e −3 x 1 parenrightbigg dx 1 = 1 − 17 2 e −3 = 0.5768. Therefore, the desired probability is P(X 1 + X 2 + X 3 ≥ 15) = 1 − 0.5768 = 0.4232. 13. Let F be the distribution function of X. We have that F(t)= P(X≤ t) = 1 − P(X>t)= 1 − P(X 1 >t,X 2 >t,... 
,X n >t) = 1 − P(X 1 > t)P(X 2 >t)···P(X n >t)= 1 − e −λ 1 t e −λ 2 t ···e −λ n t = 1 − e −(λ 1 +λ 2 +···+λ n )t ,t>0. Thus X is exponential with parameter λ 1 + λ 2 +···+λ n . 14. Let Y be the number of functioning components of the system. The random variable Y is binomial with parameters n and p. The reliability of this system is given by r = P(X= 1) = P(Y ≥ k) = n summationdisplay i=k parenleftbigg n i parenrightbigg p i (1 − p) n−i . 204 Chapter 9 Multivariate Distributions 15. Let X i be the lifetime of the ith part. The time until the item fails is the random variable min(X 1 ,X 2 ,... ,X n ) which by the solution to Exercise 13 is exponentially distributed with parameter nλ. Thus the average life of the item is 1/(nλ). 16. Let X 1 , X 2 , ... be the lifetimes of the transistors selected at random. Clearly, N = min braceleftbig n: X n >s bracerightbig . Note that P parenleftbig X N ≤ t | N = n parenrightbig = P parenleftbig X n ≤ t | X 1 ≤ s, X 2 ≤ s,...,X n−1 ≤ s, X n >s). This shows that for s ≥ t, P parenleftbig X N ≤ t | N = n parenrightbig = 0. For ss) = P(s s) = P(s s) = F(t)− F(s) 1 − F(s) . This relation shows that the probability distribution function of X N given N = n does not depend on n. Therefore, X N and N are independent. 17. Clearly, X = X 1 bracketleftBig 1 − (1 − X 2 )(1 − X 3 ) bracketrightBigbracketleftBig 1 − (1 − X 4 )(1 − X 5 X 6 ) bracketrightBig X 7 = X 1 X 7 parenleftbig X 2 X 4 + X 3 X 4 − X 2 X 3 X 4 + X 2 X 5 X 6 + X 3 X 5 X 6 − X 2 X 3 X 5 X 6 − X 2 X 4 X 5 X 6 − X 3 X 4 X 5 X 6 + X 2 X 3 X 4 X 5 X 6 parenrightbig . The reliability of this system is r = p 1 p 7 parenleftbig p 2 p 4 + p 3 p 4 − p 2 p 3 p 4 + p 2 p 5 p 6 + p 3 p 5 p 6 − p 2 p 3 p 5 p 6 − p 2 p 4 p 5 p 6 − p 3 p 4 p 5 p 6 + p 2 p 3 p 4 p 5 p 6 parenrightbig . 18. Let G and F be the distribution functions of max 1≤i≤n X i and min 1≤i≤n X i , respectively. Let g and f be their probability density functions, respectively. For 0 ≤ t<1, G(t) = P(X 1 ≤ t,X 2 ≤ t,... 
,X n ≤ t) = P(X 1 ≤ t)P(X 2 ≤ t)···P(X n ≤ t) = t n . Section 9.1 Joint Distributions of n > 2 Random Variables 205 So G(t) = ⎧ ⎪ ⎨ ⎪ ⎩ 0 t<0 t n 0 ≤ t<1 1 t ≥ 1. Therefore, g(t) = G prime (t) = braceleftBigg nt n−1 0 t parenrightBig = 1 − P(X 1 > t)P(X 2 >t)···P(X n >t) = 1 − (1 − t) n , 0 ≤ t<1. Hence F(t)= ⎧ ⎪ ⎨ ⎪ ⎩ 0 t<0 1 − (1 − t) n 0 ≤ t<1 1 t ≥ 1, and f(t)= braceleftBigg n(1 − t) n−1 0 t parenrightbig = 1 − P(X 1 >t,X 2 >t,... ,X n >t) = 1 − P(X 1 > t)P(X 2 >t)···P(X n >t) = 1 − bracketleftbig 1 − F(t) bracketrightbig n . 206 Chapter 9 Multivariate Distributions 20. We have that P(Y n >x)= P parenleftBig min(X 1 ,X 2 ,... ,X n )> x n parenrightBig = P parenleftBig X 1 > x n ,X 2 > x n ,...,X n > x n parenrightBig = P parenleftBig X 1 > x n parenrightBig P parenleftBig X 2 > x n parenrightBig ···P parenleftBig X n > x n parenrightBig = parenleftBig 1 − x n parenrightBig n . Thus lim n→∞ P(Y n >x)= lim n→∞ parenleftBig 1 − x n parenrightBig n = e −x ,x>0. 21. We have that P(X 2 Random Variables 207 Thus f X (x) = integraldisplay 1−x 0 parenleftbiggintegraldisplay 1−x−y 0 6 dz parenrightbigg dy = 3(1 − x) 2 , 0 0. Applying convolution theorem, we obtain P parenleftbig B 2 − 4AC ≥ 0 parenrightbig = 1 − P parenleftbig B 2 − 4AC < 0 parenrightbig = 1 − integraldisplay ∞ −∞ F −4AC (0 − x)f B 2 (x)dx = 1 − integraldisplay 1 0 parenleftBig 1 − x 4 + x 4 ln x 4 parenrightBig 1 2 √ x dx. Letting y = √ x/2, we get dy = 1 4 √ x dx.So P parenleftbig B 2 − 4AC ≥ 0 parenrightbig = 1 − integraldisplay 1/2 0 (1 − y 2 + y 2 ln y 2 )2dy = 1 − integraldisplay 1/2 0 2dy + 2 integraldisplay 1/2 0 (y 2 − y 2 ln y 2 )dy = 2 integraldisplay 1/2 0 (y 2 − y 2 ln y 2 )dy. Now by integration by parts (u = ln y 2 , dv = y 2 dy), integraldisplay y 2 ln y 2 dy = 1 3 y 3 ln y 2 − 2 9 y 3 . Thus P parenleftbig B 2 − 4AC ≥ 0 parenrightbig = bracketleftBig 10 9 y 3 − 2 3 y 3 ln y 2 bracketrightBig 1/2 0 = 5 36 + 1 6 ln 2 ≈ 0.25. 25. 
The following solution by Scott Harrington, Duke University, Durham, NC, was given in The College Mathematics Journal, September 1993.

Let $V$ be the set of points $(A,B,C) \in [0,1]^3$ such that $f(x) = x^3 + Ax^2 + Bx + C = 0$ has all real roots. The probability that all of the roots are real is the volume of $V$.

The function is cubic, so it has either one real root and two complex roots or three real roots. Since the coefficient of $x^3$ is positive, $\lim_{x\to-\infty} f(x) = -\infty$ and $\lim_{x\to+\infty} f(x) = +\infty$. The number of real roots of $f$ depends on the nature of its critical points:
$$f'(x) = 3x^2 + 2Ax + B = 0, \quad \text{with roots } x = -\frac{1}{3}A \pm \frac{1}{3}\sqrt{A^2 - 3B}.$$
Let $D = \sqrt{A^2 - 3B}$, $x_1 = -\frac{1}{3}(A + D)$, and $x_2 = -\frac{1}{3}(A - D)$. If $A^2 < 3B$, then the critical points are imaginary, so the graph of $f(x)$ is strictly increasing and there must be exactly one real root. Thus we may assume $A^2 \ge 3B$. In order for there to be three real roots, counting multiplicities, the local maximum $\big(x_1, f(x_1)\big)$ and local minimum $\big(x_2, f(x_2)\big)$ must satisfy $f(x_1) \ge 0$ and $f(x_2) \le 0$; that is,
$$f(x_1) = -\frac{1}{27}(A^3 + 3A^2D + 3AD^2 + D^3) + \frac{1}{9}A(A^2 + 2AD + D^2) - \frac{1}{3}B(A + D) + C \ge 0,$$
$$f(x_2) = -\frac{1}{27}(A^3 - 3A^2D + 3AD^2 - D^3) + \frac{1}{9}A(A^2 - 2AD + D^2) - \frac{1}{3}B(A - D) + C \le 0.$$
Simplifying produces two half-spaces:
$$C \ge \frac{1}{27}\Big(-2A^3 + 9AB - 2(A^2 - 3B)^{3/2}\Big) \quad \text{(constraint surface 1)};$$
$$C \le \frac{1}{27}\Big(-2A^3 + 9AB + 2(A^2 - 3B)^{3/2}\Big) \quad \text{(constraint surface 2)}.$$
These two surfaces intersect at the curve given parametrically by $A = t$, $B = \frac{1}{3}t^2$, and $C = \frac{1}{27}t^3$. Note that all points in the intersection of these two half-spaces satisfy $B \le \frac{1}{3}A^2$.
Surface 2 intersects the plane $C = 0$ at the $A$-axis, but surface 1 intersects the plane $C = 0$ at the curve $B = \frac{1}{4}A^2$, which is a quadratic curve in the plane $C = 0$ located between the $A$-axis and the upper limit $B = \frac{1}{3}A^2$. Therefore, $V$ is the region above the plane $C = 0$ and constraint surface 1, and below constraint surface 2. The volume of $V$ is the volume $V_2$ under surface 2 minus the volume $V_1$ under surface 1. Now
$$\begin{aligned}
V_1 &= \int_{a=0}^{1}\int_{b=(1/4)a^2}^{(1/3)a^2} \frac{1}{27}\Big(-2a^3 + 9ab - 2(a^2 - 3b)^{3/2}\Big)\,db\,da \\
&= \int_0^1 \frac{1}{27}\Big[-2a^3 b + \frac{9}{2}ab^2 + \frac{4}{15}(a^2 - 3b)^{5/2}\Big]_{b=(1/4)a^2}^{(1/3)a^2}\,da \\
&= \int_0^1 \frac{1}{27}\cdot\frac{7}{160}\,a^5\,da = \frac{7}{25{,}920},
\end{aligned}$$
and
$$\begin{aligned}
V_2 &= \int_{a=0}^{1}\int_{b=0}^{(1/3)a^2} \frac{1}{27}\Big(-2a^3 + 9ab + 2(a^2 - 3b)^{3/2}\Big)\,db\,da \\
&= \int_0^1 \frac{1}{27}\Big[-2a^3 b + \frac{9}{2}ab^2 - \frac{4}{15}(a^2 - 3b)^{5/2}\Big]_{b=0}^{(1/3)a^2}\,da \\
&= \int_0^1 \frac{1}{270}\,a^5\,da = \frac{1}{1{,}620}.
\end{aligned}$$
Thus
$$V = V_2 - V_1 = \frac{1}{1{,}620} - \frac{7}{25{,}920} = \frac{1}{2{,}880}.$$

9.2 ORDER STATISTICS

1. By Theorem 9.5, we have that
$$f_3(x) = \frac{4!}{2!\,1!}\,f(x)\big[F(x)\big]^2\big[1 - F(x)\big],$$
where $f(x) = 1$ for $0$ … $x)\,dx$. Now
$$P\big(X_{(n)} > x\big) = 1 - P\big(X_{(n)} \le x\big) = 1 - P(X_1 \le x, X_2 \le x, \ldots, X_n \le x) = 1 - \big[F(x)\big]^n.$$
So
$$E\big[X_{(n)}\big] = \int_0^{\infty} \Big(1 - \big[F(x)\big]^n\Big)\,dx.$$

5. To find $P\big(X_{(i)} = k\big)$, $0 \le k \le n$, note that
$$P\big(X_{(i)} = k\big) = 1 - P\big(X_{(i)} < k\big) - P\big(X_{(i)} > k\big).$$
Let $N$ be the number of $X_j$'s that are less than $k$. Then $N$ is a binomial random variable with parameters $m$ and
$$p_1 = \sum_{l=0}^{k-1} \binom{n}{l} p^l (1-p)^{n-l}.$$
(35) Let L be the number of X j ’s that are greater than k. Then L is a binomial random variable with parameters m and p 2 = n summationdisplay l=k+1 parenleftbigg n l parenrightbigg p l (1 − p) n−l . (36) 212 Chapter 9 Multivariate Distributions Clearly, P parenleftbig X (i) k parenrightbig = P(L≥ m − i + 1) = m summationdisplay j=m−i+1 parenleftbigg m j parenrightbigg p j 2 (1 − p 2 ) m−j . Thus, for 0 ≤ k ≤ n, P parenleftbig X (i) = k parenrightbig = 1 − m summationdisplay j=i parenleftbigg m j parenrightbigg p j 1 (1 − p 1 ) m−j − m summationdisplay j=m−i+1 parenleftbigg m j parenrightbigg p j 2 (1 − p 2 ) m−j , where p 1 and p 2 are given by (35) and (36). 6. By Theorem 9.6, the joint probability density function of X (1) and X (n) is given by f 1n (x, y) = n(n − 1)f (x)f (y) bracketleftbig F(y)− F(x) bracketrightbig n−2 ,x0 bracerightbig . It has the unique solution x = u, y = u + v. Hence J = vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle 10 11 vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle = 1 negationslash= 0. By Thereom 8.8, g(u, v) = f 12 (u, u + v)|J|=2λ 2 e −λ(u+2v) ,u≥ 0,v>0. Since g(u, v) = g U (u)g V (v), where g U (u) = 2λe −2λu ,u≥ 0, and g V (v) = λe −λv ,v>0, we have that U and V are independent. Furthermore, U is exponential with parameter 2λ and V is exponential with parameter λ. 8. Let f 12 (x, y) be the joint probability density function of X (1) and X(2). By Theorem 9.6, f 12 (x, y) = 2! f(x)f(y)= 2 · 1 σ √ 2π e −x 2 /2σ 2 · 1 σ √ 2π e −y 2 /2σ 2 = 1 σ 2 π e −x 2 /2σ 2 · e −y 2 /2σ 2 , −∞ 0 bracerightbig . It has the unique solution x = v − r, y = v. Hence J = vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle −11 01 vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle vextendsingle =−1 negationslash= 0. 
By Theorem 8.8, g(u, v) is given by g(r,v) = f 1n (v − r, v)|J| = n(n − 1)f (v − r)f(v) bracketleftbig F(v)− F(v− r) bracketrightbig n−2 , −∞ 0. This implies g R (r) = integraldisplay ∞ −∞ n(n − 1)f (v − r)f(v) bracketleftbig F(v)− F(v− r) bracketrightbig n−2 dv, r > 0. (37) Section 9.3 Multinomial Distributions 215 (b) The probability density function of n random numbers from (0, 1) is obtained by letting f(v)= ⎧ ⎨ ⎩ 10t)= 1 − P(X 1 >t,X 2 >t,...,X n >t) = 1 − bracketleftbig P(X 1 >t) bracketrightbig n = ⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ 0 t<1 1 − parenleftbig 5 6 parenrightbig n 1 ≤ t<2 1 − parenleftbig 4 6 parenrightbig n 2 ≤ t<3 1 − parenleftbig 3 6 parenrightbig n 3 ≤ t<4 1 − parenleftbig 2 6 parenrightbig n 4 ≤ t<5 1 − parenleftbig 1 6 parenrightbig n 5 ≤ t<6 1 t ≥ 6. The probability mass function of X is p(x) = P(X= x) = parenleftbigg 7 − x 6 parenrightbigg n − parenleftbigg 6 − x 6 parenrightbigg n ,x= 1, 2, 3, 4, 5, 6. 3. Let D 1 , D 2 , ..., D n be the distances of the points selected from the origin. Let D = min(D 1 ,D 2 ,... ,D n ). The desired probability is P(D ≥ r) = P(D 1 ≥ r, D 2 ≥ r,...,D n ≥ r) = bracketleftbig P(D 1 ≥ r) bracketrightbig n = bracketleftbig 1 − P(D 1 0, ¯ F(t)= P parenleftbig min(X 1 ,X 2 ,... ,X n )>t parenrightbig = P(X 1 >t,X 2 >t,... ,X n >t) = P(X 1 > t)P(X 2 >t)···P(X n >t)= ¯ F 1 (t) ¯ F 2 (t) ··· ¯ F n (t). 8. For 1 ≤ i ≤ n, let X i be the lifetime of the ith component. Then max(X 1 ,X 2 ,... ,X n ) is the lifetime of the system. Let ¯ F(t) be the survival function of the system. By the independence of the lifetimes of the components, for all t>0, ¯ F(t)= P parenleftbig max(X 1 ,X 2 ,... ,X n )>t parenrightbig = 1 − P parenleftbig max(X 1 ,X 2 ,... ,X n ) ≤ t parenrightbig = 1 − P(X 1 ≤ t,X 2 ≤ t,... ,X n ≤ t) = 1 − P(X 1 ≤ t)P(X 2 ≤ t)···P(X n ≤ t) = 1 − F 1 (t)F 2 (t) ···F n (t). 9. 
The problem is equivalent to the following: Two points X and Y are selected independently and at random from the interval (0,lscript). What is the probability that the length of at least one 220 Chapter 9 Multivariate Distributions interval is less than lscript/20? The solution to this problem is as follows: P parenleftBig min(X, Y − X, lscript − Y)< lscript 20 vextendsingle vextendsingle vextendsingle XY parenrightBig P(X>Y) = 2P parenleftBig min(X, Y − X, lscript − Y)< lscript 20 vextendsingle vextendsingle vextendsingle Xx bracerightbig ; that is, 17lscript 20 × 17lscript 20 2 ÷ lscript 2 2 = 0.7225. Therefore, the desired probability is 1 − 0.7225 = 0.2775. 10. Let f 13 (x, y) be the joint probability density function of X (1) and X (3) . By Theorem 9.6, f 13 (x, y) = 6(y − x), 0 x,Y > y)dx dy. 18. Clearly N>iif and only if X 1 ≥ X 2 ≥ X 3 ≥···≥X i . Hence for i ≥ 2, P(N >i)= P parenleftbig X 1 ≥ X 2 ≥ X 3 ≥···≥X i−1 ≥ X i parenrightbig = 1 i! because X i ’s are independent and identically distributed. So, by Theorem 10.2, E(N) = ∞ summationdisplay i=1 P(N ≥ i) = ∞ summationdisplay i=0 P(N >i)= P(N >0) + P(N >1) + ∞ summationdisplay i=2 1 i! = 1 + 1 + ∞ summationdisplay i=2 1 i! = ∞ summationdisplay i=0 1 i! = e. 19. If the first red chip is drawn on or before the 10th draw, let N be the number of chips before the first red chip. Otherwise, let N = 10. Clearly, P(N = i) = parenleftBig 1 2 parenrightBig i parenleftBig 1 2 parenrightBig = parenleftBig 1 2 parenrightBig i+1 , 0 ≤ i ≤ 9; P(N = 10) = parenleftBig 1 2 parenrightBig 10 . The desired quantity is E(10 − N) = 9 summationdisplay i=0 (10 − i) parenleftBig 1 2 parenrightBig i+1 + (10 − 10) · parenleftBig 1 2 parenrightBig 10 ≈ 9.001. 20. Clearly, if for some λ ∈ R,X= λY, Cauchy-Schwarz’s inequality becomes equality. We show that the converse of this is also true. Suppose that for random variables X and Y, E(XY) = radicalbig E(X 2 )E(Y 2 ). Then 4 bracketleftbig E(XY) bracketrightbig 2 − 4E(X 2 )E(Y 2 ) = 0. 
Now the left side of this equation is the discriminant of the quadratic equation
$$E(Y^2)\lambda^2 - 2\big[E(XY)\big]\lambda + E(X^2) = 0.$$
Hence this quadratic equation has exactly one root. On the other hand,
$$E(Y^2)\lambda^2 - 2\big[E(XY)\big]\lambda + E(X^2) = E\big[(X - \lambda Y)^2\big].$$
So the equation $E\big[(X - \lambda Y)^2\big] = 0$ has a unique solution. That is, there exists a unique number $\lambda_1 \in \mathbf{R}$ such that
$$E\big[(X - \lambda_1 Y)^2\big] = 0.$$
Since a nonnegative random variable whose expected value is 0 equals 0 with probability 1, this implies that with probability 1, $X - \lambda_1 Y = 0$, or $X = \lambda_1 Y$.

10.2 COVARIANCE

1. Since $X$ and $Y$ are independent random variables, $\mathrm{Cov}(X,Y) = 0$.

2.
$$E(X) = \sum_{x=1}^{3}\sum_{y=3}^{4} \frac{1}{70}\,x^2(x+y) = \frac{17}{7}; \qquad E(Y) = \sum_{x=1}^{3}\sum_{y=3}^{4} \frac{1}{70}\,xy(x+y) = \frac{124}{35};$$
$$E(XY) = \sum_{x=1}^{3}\sum_{y=3}^{4} \frac{1}{70}\,x^2 y(x+y) = \frac{43}{5}.$$
Therefore,
$$\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = \frac{43}{5} - \frac{17}{7}\cdot\frac{124}{35} = -\frac{1}{245}.$$

3. Intuitively, $E(X)$ is the average of $1, 2, \ldots, 6$, which is $7/2$; $E(Y)$ is $(7/2)(1/2) = 7/4$. To show these, note that
$$E(X) = \sum_{x=1}^{6} x\,p_X(x) = \sum_{x=1}^{6} x(1/6) = 7/2.$$
By the table constructed for $p(x,y)$ in Example 8.2,
$$E(Y) = 0\cdot\frac{63}{384} + 1\cdot\frac{120}{384} + 2\cdot\frac{99}{384} + 3\cdot\frac{64}{384} + 4\cdot\frac{29}{384} + 5\cdot\frac{8}{384} + 6\cdot\frac{1}{384} = \frac{7}{4}.$$
By the same table,
$$E(XY) = \sum_{x=1}^{6}\sum_{y=0}^{6} xy\,p(x,y) = 91/12.$$
Therefore,
$$\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = \frac{91}{12} - \frac{7}{2}\cdot\frac{7}{4} = \frac{35}{24} > 0.$$
This shows that $X$ and $Y$ are positively correlated. The higher the outcome from rolling the die, the higher the number of tails obtained, a fact consistent with our intuition.

4. Let $X$ be the number of sheep stolen; let $Y$ be the number of goats stolen. Let $p(x,y)$ be the joint probability mass function of $X$ and $Y$.
Then, for $0 \le x \le 4$, $0 \le y \le 4$, $0 \le x + y \le 4$,
$$p(x,y) = \frac{\binom{7}{x}\binom{8}{y}\binom{5}{4-x-y}}{\binom{20}{4}};$$
$p(x,y) = 0$ for other values of $x$ and $y$. Clearly, $X$ is a hypergeometric random variable with parameters $n = 4$, $D = 7$, and $N = 20$. Therefore,
$$E(X) = \frac{nD}{N} = \frac{28}{20} = \frac{7}{5}.$$
$Y$ is a hypergeometric random variable with parameters $n = 4$, $D = 8$, and $N = 20$. Therefore,
$$E(Y) = \frac{nD}{N} = \frac{32}{20} = \frac{8}{5}.$$
Since
$$E(XY) = \sum_{x=0}^{4}\sum_{y=0}^{4-x} xy\,p(x,y) = \frac{168}{95},$$
we have
$$\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = \frac{168}{95} - \frac{7}{5}\cdot\frac{8}{5} = -\frac{224}{475} < 0.$$
Therefore, $X$ and $Y$ are negatively correlated, as expected.

5. Since $Y = n - X$,
$$\begin{aligned}
E(XY) &= E(nX - X^2) = nE(X) - E(X^2) = nE(X) - \Big[\mathrm{Var}(X) + \big[E(X)\big]^2\Big] \\
&= n\cdot np - \big[np(1-p) + n^2p^2\big] = n(n-1)p(1-p),
\end{aligned}$$
and
$$\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = n(n-1)p(1-p) - np\cdot n(1-p) = -np(1-p).$$
This confirms the (obvious) fact that $X$ and $Y$ are negatively correlated.

6. Both (a) and (b) are straightforward results of relation (10.6).

7. Since $\mathrm{Cov}(X,Y) = 0$, we have
$$\mathrm{Cov}(X, Y + Z) = \mathrm{Cov}(X,Y) + \mathrm{Cov}(X,Z) = \mathrm{Cov}(X,Z).$$

8. By relation (10.6),
$$\begin{aligned}
\mathrm{Cov}(X + Y, X - Y) &= E(X^2 - Y^2) - E(X + Y)E(X - Y) \\
&= E(X^2) - E(Y^2) - \big[E(X)\big]^2 + \big[E(Y)\big]^2 = \mathrm{Var}(X) - \mathrm{Var}(Y).
\end{aligned}$$

9. In Theorem 10.4, let $a = 1$ and $b = -1$.

10. (a) This is an immediate result of Exercise 8 above.

(b) By relation (10.6),
$$\mathrm{Cov}(X, XY) = E(X^2Y) - E(X)E(XY) = E(X^2)E(Y) - \big[E(X)\big]^2E(Y) = E(Y)\,\mathrm{Var}(X).$$

11. The probability density function of $\Theta$ is given by
$$f(\theta) = \begin{cases} \dfrac{1}{2\pi} & \text{if } \theta \in [0, 2\pi] \\ 0 & \text{otherwise.} \end{cases}$$
Therefore,
$$E(XY) = \int_0^{2\pi} \sin\theta\cos\theta\,\frac{1}{2\pi}\,d\theta = 0, \quad E(X) = \int_0^{2\pi} \sin\theta\,\frac{1}{2\pi}\,d\theta = 0, \quad E(Y) = \int_0^{2\pi} \cos\theta\,\frac{1}{2\pi}\,d\theta = 0.$$
Thus $\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = 0$.
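The covariance computations in this section can be checked exactly with rational arithmetic. As one instance, the sketch below enumerates the joint probability mass function of Exercise 4 and recomputes $E(X)$, $E(Y)$, $E(XY)$, and $\mathrm{Cov}(X,Y)$. The function names and the parameter name `other` (for the 5 animals that are neither sheep nor goats, read off from the binomial coefficient $\binom{5}{4-x-y}$) are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from math import comb


def joint_pmf(x, y, sheep=7, goats=8, other=5, stolen=4):
    """P(X = x, Y = y) when `stolen` animals are taken at random."""
    rest = stolen - x - y
    if x < 0 or y < 0 or rest < 0:
        return Fraction(0)
    total = sheep + goats + other
    ways = comb(sheep, x) * comb(goats, y) * comb(other, rest)
    return Fraction(ways, comb(total, stolen))


def moments(stolen=4):
    """Return (E(X), E(Y), E(XY), Cov(X, Y)) as exact fractions."""
    support = [(x, y) for x in range(stolen + 1) for y in range(stolen + 1 - x)]
    ex = sum(x * joint_pmf(x, y) for x, y in support)
    ey = sum(y * joint_pmf(x, y) for x, y in support)
    exy = sum(x * y * joint_pmf(x, y) for x, y in support)
    return ex, ey, exy, exy - ex * ey


if __name__ == "__main__":
    print(moments())
```

Using `Fraction` rather than floating point means the result can be compared with $-224/475$ exactly instead of to a tolerance.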
12. The joint probability density function of $X$ and $Y$ is given by $f(x,y) = \begin{cases} \frac{1}{\pi} & \text{if } x^2 + y^2 \le 1 \\ 0 & \text{elsewhere.} \end{cases}$ $X$ and $Y$ are dependent because, for example, $P\big(0$ […] $0 \iff P(AB) > P(A)P(B) \iff \frac{P(AB)}{P(B)} > P(A) \iff P(A \mid B) > P(A)$. The proof that $I_A$ and $I_B$ are positively correlated if and only if $P(B \mid A) > P(B)$ follows by symmetry.

23. By Exercise 6, $\mathrm{Cov}(aX + bY, cZ + dW) = a\,\mathrm{Cov}(X, cZ + dW) + b\,\mathrm{Cov}(Y, cZ + dW) = ac\,\mathrm{Cov}(X, Z) + ad\,\mathrm{Cov}(X, W) + bc\,\mathrm{Cov}(Y, Z) + bd\,\mathrm{Cov}(Y, W)$.

24. By Exercise 6 and an induction on $n$, $\mathrm{Cov}\Big(\sum_{i=1}^{n} a_iX_i, \sum_{j=1}^{m} b_jY_j\Big) = \sum_{i=1}^{n} a_i\,\mathrm{Cov}\Big(X_i, \sum_{j=1}^{m} b_jY_j\Big)$. By Exercise 6 and an induction on $m$, $\mathrm{Cov}\Big(X_i, \sum_{j=1}^{m} b_jY_j\Big) = \sum_{j=1}^{m} b_j\,\mathrm{Cov}(X_i, Y_j)$. The desired identity follows from these two identities.

25. For $1 \le i \le n$, let $X_i = 1$ if the outcome of the $i$th throw is 1; let $X_i = 0$, otherwise. For $1 \le j \le n$, let $Y_j = 1$ if the outcome of the $j$th throw is 6; let $Y_j = 0$, otherwise. Clearly, $\mathrm{Cov}(X_i, Y_j) = 0$ if $i \ne j$. By Exercise 24, $\mathrm{Cov}\Big(\sum_{i=1}^{n} X_i, \sum_{j=1}^{n} Y_j\Big) = \sum_{j=1}^{n}\sum_{i=1}^{n} \mathrm{Cov}(X_i, Y_j) = \sum_{i=1}^{n} \mathrm{Cov}(X_i, Y_i) = \sum_{i=1}^{n} \big[E(X_iY_i) - E(X_i)E(Y_i)\big] = \sum_{i=1}^{n} \Big(0 - \frac{1}{6}\cdot\frac{1}{6}\Big) = -\frac{n}{36}$. As expected, in $n$ throws of a fair die, the number of ones and the number of sixes are negatively correlated.

26. Let $S_n = \sum_{i=1}^{n} a_iX_i$, $\mu_i = E(X_i)$; then $E(S_n) = \sum_{i=1}^{n} a_i\mu_i$, $S_n - E(S_n) = \sum_{i=1}^{n} a_i(X_i - \mu_i)$.
Thus $\mathrm{Var}(S_n) = E\Big(\Big[\sum_{i=1}^{n} a_i(X_i - \mu_i)\Big]^2\Big) = \sum_{i=1}^{n} a_i^2 E\big[(X_i - \mu_i)^2\big] + 2\sum\sum_{i}$ […] $2$. The time that the admissions office has to wait before doubling its student recruitment efforts is $S_{N+1} = X_1 + X_2 + \cdots + X_{N+1}$. Therefore, $E(S_{N+1}) = E\big[E(S_{N+1} \mid N)\big] = \sum_{i=0}^{\infty} E(S_{N+1} \mid N = i)P(N = i)$. Now, for $i \ge 0$, $E(S_{N+1} \mid N = i) = E(X_1 + X_2 + \cdots + X_{i+1} \mid N = i) = \sum_{j=1}^{i+1} E(X_j \mid N = i) = \Big[\sum_{j=1}^{i} E(X_j \mid X_j \le 2)\Big] + E(X_{i+1} \mid X_{i+1} > 2)$, where by Remark 8.1, $E(X_j \mid X_j \le 2) = \frac{1}{F(2)}\int_0^2 t f(t)\,dt$ and $E(X_{i+1} \mid X_{i+1} > 2) = \frac{1}{1 - F(2)}\int_2^{\infty} t f(t)\,dt$, $F$ and $f$ being the probability distribution and density functions of the $X_i$'s, respectively. That is, for $t \ge 0$, $F(t) = 1 - e^{-5t}$, $f(t) = 5e^{-5t}$. Thus, for $1 \le j \le i$, $E(X_j \mid X_j \le 2) = \frac{1}{1 - e^{-10}}\int_0^2 5te^{-5t}\,dt = (1.0000454)\Big[\Big(-t - \frac{1}{5}\Big)e^{-5t}\Big]_0^2 = (1.0000454)(0.1999001) = 0.1999092$ and, for $j = i + 1$, $E(X_{i+1} \mid X_{i+1} > 2) = \frac{1}{e^{-10}}\int_2^{\infty} 5te^{-5t}\,dt = e^{10}\Big[\Big(-t - \frac{1}{5}\Big)e^{-5t}\Big]_2^{\infty} = 2.2$. Thus, for $i \ge 0$, $E(S_{N+1} \mid N = i) = (0.1999092)i + 2.2$. To find $P(N = i)$, note that for $i \ge 0$, $P(N = i) = P(X_1 \le 2, X_2 \le 2, \ldots, X_i \le 2, X_{i+1} > 2) = \big[F(2)\big]^i\big[1 - F(2)\big] = (0.9999546)^i(0.0000454)$.
Putting all these together, we obtain $E(S_{N+1}) = \sum_{i=0}^{\infty} E(S_{N+1} \mid N = i)P(N = i) = \sum_{i=0}^{\infty} \big[(0.1999092)i + 2.2\big](0.9999546)^i(0.0000454) = (0.00000908)\sum_{i=0}^{\infty} i(0.9999546)^i + (0.00009988)\sum_{i=0}^{\infty} (0.9999546)^i = (0.00000908)\cdot\frac{0.9999546}{(1 - 0.9999546)^2} + (0.00009988)\cdot\frac{1}{1 - 0.9999546} = 4407.286$, where the next to last equality follows from $\sum_{i=1}^{\infty} ir^i = r/(1 - r)^2$ and $\sum_{i=0}^{\infty} r^i = 1/(1 - r)$, $|r| < 1$. Since an academic year is 9 months long, and contains approximately 180 business days, the admission officers should not be concerned about this rule at all. It will take 4,407.286 business days, on average, until there is a lapse of two days between two consecutive applications.

14. Let $X_i$ be the number of calls until Steven has not missed Adam in exactly $i$ consecutive calls. We have that $E(X_i \mid X_{i-1}) = \begin{cases} X_{i-1} + 1 & \text{with probability } p \\ X_{i-1} + 1 + E(X_i) & \text{with probability } 1 - p. \end{cases}$ Therefore, $E(X_i) = E\big[E(X_i \mid X_{i-1})\big] = \big[E(X_{i-1}) + 1\big]p + \big[E(X_{i-1}) + 1 + E(X_i)\big](1 - p)$. Solving this equation for $E(X_i)$, we obtain $E(X_i) = \frac{1}{p}\big[1 + E(X_{i-1})\big]$. Now $X_1$ is a geometric random variable with parameter $p$. So $E(X_1) = 1/p$. Thus $E(X_2) = \frac{1}{p}\big[1 + E(X_1)\big] = \frac{1}{p}\Big(1 + \frac{1}{p}\Big)$, $E(X_3) = \frac{1}{p}\big[1 + E(X_2)\big] = \frac{1}{p}\Big(1 + \frac{1}{p} + \frac{1}{p^2}\Big)$, and, in general, $E(X_k) = \frac{1}{p}\Big(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^{k-1}}\Big) = \frac{1}{p}\cdot\frac{(1/p^k) - 1}{(1/p) - 1} = \frac{1 - p^k}{p^k(1 - p)}$.

15. Let $N$ be the number of games to be played until Emily wins two of the most recent three games. Let $X$ be the number of games to be played until Emily wins a game for the first time.
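As a numerical aside on Exercise 13 above, the series for $E(S_{N+1})$ can be evaluated without rounding the intermediate constants. The sketch below is only an illustration (the function and variable names are made up here). Carried at full precision, the same formula gives about 4405.29, which equals $e^{10}/5$; the printed value 4407.286 appears to come from rounding $0.1999092 \times 0.0000454$ to $0.00000908$. The conclusion of the exercise is unaffected.

```python
import math

rate = 5.0   # the density in the solution is f(t) = 5 e^{-5t}
gap = 2.0    # lapse between consecutive applications, in days

def mean_given_short(rate, gap):
    """E(X | X <= gap) for an exponential X with the given rate."""
    q = math.exp(-rate * gap)
    return (1 / rate - (gap + 1 / rate) * q) / (1 - q)

def mean_given_long(rate, gap):
    """E(X | X > gap) for an exponential X, i.e. gap + 1/rate."""
    return gap + 1 / rate

q = math.exp(-rate * gap)        # 1 - F(2)
r = 1 - q                        # F(2)
a = mean_given_short(rate, gap)  # about 0.1999092
b = mean_given_long(rate, gap)   # 2.2

# E(S_{N+1}) = sum_i (a*i + b) r^i q, using the two series in the solution.
expected_wait = a * q * r / (1 - r) ** 2 + b * q / (1 - r)

print(round(a, 7), round(b, 7), round(expected_wait, 2))
```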
The random variable $X$ is geometric with parameter 0.35. Hence $E(X) = 1/0.35$. First, we find the random variable $E(N \mid X)$ in terms of $X$. Then we obtain $E(N)$ by calculating the expected value of $E(N \mid X)$. Let $W$ be the event that Emily wins the $(X + 1)$st game as well. Let $LW$ be the event that Emily loses the $(X + 1)$st game but wins the $(X + 2)$nd game. Let $LL$ be the event that Emily loses both the $(X + 1)$st and the $(X + 2)$nd games. Given $X = x$, we have $E(N \mid X = x) = (x + 1)P(W) + (x + 2)P(LW) + \big[(x + 2) + E(N)\big]P(LL)$. So $E(N \mid X = x) = (x + 1)(0.35) + (x + 2)(0.65)(0.35) + \big[(x + 2) + E(N)\big](0.65)^2$. This gives $E(N \mid X = x) = x + (0.4225)E(N) + 1.65$. Therefore, $E(N \mid X) = X + (0.4225)E(N) + 1.65$. Hence $E(N) = E\big[E(N \mid X)\big] = E(X) + (0.4225)E(N) + 1.65 = \frac{1}{0.35} + (0.4225)E(N) + 1.65$. Solving this for $E(N)$ gives $E(N) = 7.805$.

16. Since hemophilia is a sex-linked disease, and John is phenotypically normal, John is $H$. Therefore, no matter what Kim's genotype is, none of the daughters has hemophilia. Whether a boy has hemophilia or not depends solely on the genotype of Kim. Let $X$ be the number of the boys who have hemophilia. To find $E(X)$, the expected number of the boys who have hemophilia, let $Z = \begin{cases} 0 & \text{if Kim is } hh \\ 1 & \text{if Kim is } Hh \\ 2 & \text{if Kim is } HH. \end{cases}$ Then $E(X) = E\big[E(X \mid Z)\big] = E(X \mid Z = 0)P(Z = 0) + E(X \mid Z = 1)P(Z = 1) + E(X \mid Z = 2)P(Z = 2) = 4(0.02)(0.02) + 4(1/2)\big[2(0.98)(0.02)\big] + 0\big[(0.98)(0.98)\big] = 0.08$. Therefore, on average, 0.08 of the boys and hence 0.08 of the children are expected to have hemophilia.

17. Let $X$ be the number of bags inspected until an unacceptable bag is found. Let $K_n$ be the number of subsequent bags inspected until $n$ consecutive acceptable bags are found. The number of bags inspected in one inspection cycle is $X + K_m$.
We are interested in $E(X + K_m) = E(X) + E(K_m)$. Clearly, $X$ is a geometric random variable with parameter $\alpha(1 - p)$. So $E(X) = 1/\big[\alpha(1 - p)\big]$. To find $E(K_m)$, note that for all $n$, $E(K_n) = E\big[E(K_n \mid K_{n-1})\big]$. Now

$E(K_n \mid K_{n-1} = i) = (i + 1)p + \big[i + 1 + E(K_n)\big](1 - p) = (i + 1) + (1 - p)E(K_n)$. (41)

To derive this relation, we noted the following. It took $i$ inspections to find $n - 1$ consecutive acceptable bags. If the next bag inspected is also acceptable, we have the $n$ consecutive acceptable bags required in $i + 1$ inspections. This occurs with probability $p$. However, if the next bag inspected is unacceptable, then, on the average, we need an additional $E(K_n)$ inspections $\big[$a total of $i + 1 + E(K_n)$ inspections$\big]$ until we get $n$ consecutive acceptable bags of cinnamon. This happens with probability $1 - p$. From (41), we have $E(K_n \mid K_{n-1}) = (K_{n-1} + 1) + (1 - p)E(K_n)$. Finding the expected values of both sides of this relation gives $E(K_n) = E(K_{n-1}) + 1 + (1 - p)E(K_n)$. Solving for $E(K_n)$, we obtain $E(K_n) = \frac{1}{p} + \frac{E(K_{n-1})}{p}$. Noting that $E(K_1) = 1/p$ and solving recursively, we find that $E(K_n) = \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^n}$. Therefore, the desired quantity is $E(X + K_m) = E(X) + E(K_m) = \frac{1}{\alpha(1 - p)} + \frac{1}{p}\Big(1 + \frac{1}{p} + \cdots + \frac{1}{p^{m-1}}\Big) = \frac{1}{\alpha(1 - p)} + \frac{1}{p}\cdot\frac{(1/p)^m - 1}{(1/p) - 1} = \frac{(1 - \alpha)p^m + \alpha}{\alpha p^m(1 - p)}$.

18. For $0$ […] $0$ and $\rho^2 = \rho\frac{\sigma_Y}{\sigma_X}\cdot\rho\frac{\sigma_X}{\sigma_Y} = \frac{1}{4}$. Therefore $\rho = 1/2$.

6. We use Theorem 8.8 to find the joint probability density function of $X$ and $Y$. The joint probability density function of $Z$ and $W$ is given by $f(z,w) = \frac{1}{2\pi}\exp\Big[-\frac{1}{2}\big(z^2 + w^2\big)\Big]$. Let $h_1(z, w) = \sigma_1 z + \mu_1$ and $h_2(z, w) = \sigma_2\big(\rho z + \sqrt{1 - \rho^2}\,w\big) + \mu_2$.
The system of equations $\begin{cases} \sigma_1 z + \mu_1 = x \\ \sigma_2\big(\rho z + \sqrt{1 - \rho^2}\,w\big) + \mu_2 = y \end{cases}$ defines a one-to-one transformation of $\mathbf{R}^2$ in the $zw$-plane onto $\mathbf{R}^2$ in the $xy$-plane. It has a unique solution $z = \frac{x - \mu_1}{\sigma_1}$, $w = \frac{1}{\sqrt{1 - \rho^2}}\Big[\frac{y - \mu_2}{\sigma_2} - \frac{\rho(x - \mu_1)}{\sigma_1}\Big]$ for $z$ and $w$ in terms of $x$ and $y$. Moreover, $J = \begin{vmatrix} \frac{1}{\sigma_1} & 0 \\ -\frac{\rho}{\sigma_1\sqrt{1 - \rho^2}} & \frac{1}{\sigma_2\sqrt{1 - \rho^2}} \end{vmatrix} = \frac{1}{\sigma_1\sigma_2\sqrt{1 - \rho^2}} \ne 0$. Hence, by Theorem 8.8, the joint probability density function of $X$ and $Y$ is given by $\frac{1}{\sigma_1\sigma_2\sqrt{1 - \rho^2}}\, f\Big(\frac{x - \mu_1}{\sigma_1}, \frac{1}{\sqrt{1 - \rho^2}}\Big[\frac{y - \mu_2}{\sigma_2} - \rho\,\frac{x - \mu_1}{\sigma_1}\Big]\Big)$. Noting that $f(z,w) = \frac{1}{2\pi}\exp\Big[-\frac{1}{2}\big(z^2 + w^2\big)\Big]$, straightforward calculations result in (10.24), showing that the joint probability density function of $X$ and $Y$ is bivariate normal.

7. Using Theorem 8.8, it is straightforward to show that the joint probability density function of $X + Y$ and $X - Y$ is bivariate normal. Since $\rho(X + Y, X - Y) = \frac{\mathrm{Cov}(X + Y, X - Y)}{\sigma_{X+Y}\cdot\sigma_{X-Y}} = \frac{\mathrm{Var}(X) - \mathrm{Var}(Y)}{\sigma_{X+Y}\cdot\sigma_{X-Y}} = 0$, $X + Y$ and $X - Y$ are uncorrelated. But for bivariate normal random variables, uncorrelatedness and independence are equivalent. So $X + Y$ and $X - Y$ are independent.

REVIEW PROBLEMS FOR CHAPTER 10

1. Number the last 10 graduates who will walk on the stage 1 through 10. Let $X_i = 1$ if the $i$th graduate receives his or her own diploma; 0, otherwise. The number of graduates who will receive their own diploma is $X = X_1 + X_2 + \cdots + X_n$.
Since $E(X_i) = 1\cdot\frac{1}{n} + 0\cdot\Big(1 - \frac{1}{n}\Big) = \frac{1}{n}$, we have $E(X) = E(X_1) + E(X_2) + \cdots + E(X_n) = n\cdot\frac{1}{n} = 1$.

2. Since $E(X) = \int_1^2 (2x^2 - 2x)\,dx = \frac{5}{3}$ and $E(X^3) = \int_1^2 (2x^4 - 2x^3)\,dx = \frac{49}{10}$, we have that $E(X^3 + 2X - 7) = \frac{49}{10} + \frac{10}{3} - 7 = \frac{37}{30}$.

3. Since $E(X^2) = \frac{1}{3}\int_0^1\int_0^2 (3x^5 + x^3y)\,dy\,dx = \frac{1}{2}$ and $E(XY) = \frac{1}{3}\int_0^1\int_0^2 (3x^4y + x^2y^2)\,dy\,dx = \frac{94}{135}$, we have that $E(X^2 + 2XY) = \frac{1}{2} + \frac{188}{135} = \frac{511}{270}$.

4. Let $X_1, X_2, \ldots, X_n$ be geometric random variables with parameters $1, (n - 1)/n, (n - 2)/n, \ldots, 1/n$, respectively. The desired quantity is $E(X_1 + X_2 + \cdots + X_n) = 1 + \frac{n}{n - 1} + \frac{n}{n - 2} + \cdots + n = 1 + n\Big(\frac{1}{n - 1} + \frac{1}{n - 2} + \cdots + \frac{1}{2} + 1\Big) = 1 + na_{n-1}$.

5. Let $X$ be the number of tosses until 4 consecutive sixes. Let $Y$ be the number of tosses until the first non-six outcome is obtained. We have $E(X) = E\big[E(X \mid Y)\big] = \sum_{i=1}^{\infty} E(X \mid Y = i)P(Y = i) = \sum_{i=1}^{4} E(X \mid Y = i)P(Y = i) + \sum_{i=5}^{\infty} E(X \mid Y = i)P(Y = i) = \sum_{i=1}^{4} \big[i + E(X)\big]\Big(\frac{1}{6}\Big)^{i-1}\Big(\frac{5}{6}\Big) + \sum_{i=5}^{\infty} 4\Big(\frac{1}{6}\Big)^{i-1}\Big(\frac{5}{6}\Big)$. This equation reduces to $E(X) = \big[1 + E(X)\big]\frac{5}{6} + \big[2 + E(X)\big]\frac{1}{6}\cdot\frac{5}{6} + \big[3 + E(X)\big]\Big(\frac{1}{6}\Big)^2\Big(\frac{5}{6}\Big) + \big[4 + E(X)\big]\Big(\frac{1}{6}\Big)^3\Big(\frac{5}{6}\Big) + 4\Big(\frac{5}{6}\Big)\frac{(1/6)^4}{1 - (1/6)}$. Solving this equation for $E(X)$, we obtain $E(X) = 1554$.

6. $f(x,y,z) = (2x)(2y)(2z)$, $0$ […] $0$, $E(X \mid Y = y) = 1/y$, $E(X \mid Y) = 1/Y$.

15. Let $X$ and $Y$ denote the number of minutes past 10:00 A.M.
that bus A and bus B arrive at the station, respectively. $X$ is uniformly distributed over $(0, 30)$. Given that $X = x$, $Y$ is uniformly distributed over $(0, x)$. Let $f(x,y)$ be the joint probability density function of $X$ and $Y$. We calculate $E(Y)$ by conditioning on $X$: $E(Y) = E\big[E(Y \mid X)\big] = \int_{-\infty}^{\infty} E(Y \mid X = x)f_X(x)\,dx = \int_0^{30} \frac{x}{2}\cdot\frac{1}{30}\,dx = \frac{30}{4}$. Thus the expected arrival time of bus B is 7.5 minutes past 10:00 A.M.

16. To find the distribution function of $\sum_{i=1}^{N} X_i$, note that $P\Big(\sum_{i=1}^{N} X_i \le t\Big) = \sum_{n=1}^{\infty} P\Big(\sum_{i=1}^{N} X_i \le t \,\Big|\, N = n\Big)P(N = n) = \sum_{n=1}^{\infty} P\Big(\sum_{i=1}^{n} X_i \le t \,\Big|\, N = n\Big)P(N = n) = \sum_{n=1}^{\infty} P\Big(\sum_{i=1}^{n} X_i \le t\Big)P(N = n)$, where the last equality follows since $N$ is independent of $\{X_1, X_2, X_3, \ldots\}$. Now $\sum_{i=1}^{n} X_i$ is a gamma random variable with parameters $n$ and $\lambda$. Thus $P\Big(\sum_{i=1}^{N} X_i \le t\Big) = \sum_{n=1}^{\infty} \Big[\int_0^t \frac{\lambda e^{-\lambda x}(\lambda x)^{n-1}}{(n - 1)!}\,dx\Big](1 - p)^{n-1}p = \sum_{n=1}^{\infty} \int_0^t \frac{\lambda p e^{-\lambda x}\big[\lambda(1 - p)x\big]^{n-1}}{(n - 1)!}\,dx = \int_0^t \lambda p e^{-\lambda x}\sum_{n=1}^{\infty} \frac{\big[\lambda(1 - p)x\big]^{n-1}}{(n - 1)!}\,dx = \int_0^t \lambda p e^{-\lambda x}e^{\lambda(1-p)x}\,dx = \int_0^t \lambda p e^{-\lambda p x}\,dx = 1 - e^{-\lambda p t}$. This shows that $\sum_{i=1}^{N} X_i$ is exponential with parameter $\lambda p$.

17. Let $X_1, X_2, \ldots, X_i, \ldots, X_{20}$ be geometric random variables with parameters $1, 19/20, \ldots, \big[20 - (i - 1)\big]/20, \ldots, 1/20$.
The desired quantity is $E\Big(\sum_{i=1}^{20} X_i\Big) = \sum_{i=1}^{20} E(X_i) = \sum_{i=1}^{20} \frac{20}{20 - (i - 1)} = 71.9548$.

Chapter 11 Sums of Independent Random Variables and Limit Theorems

11.1 MOMENT-GENERATING FUNCTIONS

1. $M_X(t) = E\big(e^{tX}\big) = \sum_{x=1}^{5} e^{tx}p(x) = \frac{1}{5}\big(e^t + e^{2t} + e^{3t} + e^{4t} + e^{5t}\big)$.

2. (a) For $t \ne 0$, $M_X(t) = E\big(e^{tX}\big) = \int_{-1}^{3} \frac{1}{4}e^{tx}\,dx = \frac{1}{4}\Big(\frac{e^{3t} - e^{-t}}{t}\Big)$, whereas for $t = 0$, $M_X(0) = 1$. Thus $M_X(t) = \begin{cases} \frac{1}{4}\Big(\frac{e^{3t} - e^{-t}}{t}\Big) & \text{if } t \ne 0 \\ 1 & \text{if } t = 0. \end{cases}$ Since $X$ is uniform over $(-1, 3)$, $E(X) = \frac{-1 + 3}{2} = 1$ and $\mathrm{Var}(X) = \frac{\big[3 - (-1)\big]^2}{12} = \frac{4}{3}$. (b) By the definition of derivative, $E(X) = M_X'(0) = \lim_{h\to 0} \frac{M_X(h) - M_X(0)}{h} = \lim_{h\to 0} \frac{1}{h}\Big(\frac{e^{3h} - e^{-h}}{4h} - 1\Big) = \lim_{h\to 0} \frac{e^{3h} - e^{-h} - 4h}{4h^2} = \lim_{h\to 0} \frac{3e^{3h} + e^{-h} - 4}{8h} = \lim_{h\to 0} \frac{9e^{3h} - e^{-h}}{8} = 1$, where the fifth and sixth equalities follow from L'Hôpital's rule.

3. Note that $M_X(t) = E\big(e^{tX}\big) = \sum_{x=1}^{\infty} e^{tx}\cdot 2\Big(\frac{1}{3}\Big)^x = 2\sum_{x=1}^{\infty} e^{tx}\cdot e^{-x\ln 3} = 2\sum_{x=1}^{\infty} e^{x(t - \ln 3)}$. Restricting the domain of $M_X(t)$ to the set $\{t : t$ […] $1$ for $t \in (0, \infty)$. Therefore, $\sum_{x=1}^{\infty} \frac{e^{tx}}{x^2}$ diverges on $(0, \infty)$ and thus on no interval of the form $(-\delta, \delta)$, $\delta > 0$, $M_X(t)$ exists.

24. For $t < 1/2$, (11.2) implies that $M_X(t) = \sum_{n=0}^{\infty} \frac{E(X^n)}{n!}$
$t^n = \sum_{n=0}^{\infty} (n + 1)(2t)^n = \frac{1}{2}\sum_{n=0}^{\infty} \frac{d}{dt}(2t)^{n+1} = \frac{1}{2}\frac{d}{dt}\Big[\sum_{n=0}^{\infty} (2t)^{n+1}\Big] = \frac{1}{2}\cdot\frac{d}{dt}\Big[\sum_{n=0}^{\infty} (2t)^n - 1\Big] = \frac{1}{2}\cdot\frac{d}{dt}\Big[\frac{1}{1 - 2t} - 1\Big] = \frac{1}{(1 - 2t)^2} = \Big[\frac{1/2}{(1/2) - t}\Big]^2$. We see that for $t < 1/2$, $M_X(t)$ exists; furthermore, it is the moment-generating function of a gamma random variable with parameters $r = 2$ and $\lambda = 1/2$.

25. (a) At the end of the first period, with probability 1, the investment will grow to $A + A\frac{X}{k} = A\Big(1 + \frac{X}{k}\Big)$; at the end of the second period, with probability 1, it will grow to $A\Big(1 + \frac{X}{k}\Big) + A\Big(1 + \frac{X}{k}\Big)\cdot\frac{X}{k} = A\Big(1 + \frac{X}{k}\Big)^2$; and, in general, at the end of the $n$th period, with probability 1, it will grow to $A\Big(1 + \frac{X}{k}\Big)^n$.

(b) Dividing a year into $k$ equal periods allows the banks to compound interest quarterly, monthly, or daily. If we increase $k$, we can compound interest every minute, second, or even fraction of a second. For an infinitesimal $\varepsilon > 0$, suppose that the interest is compounded at the end of each period of length $\varepsilon$. If $\varepsilon \to 0$, then the interest is compounded continuously. Since a year is $1/\varepsilon$ periods, each of length $\varepsilon$, the interest rate per period of length $\varepsilon$ is the random variable $X/(1/\varepsilon) = \varepsilon X$. Suppose that at time $t$, the investment has grown to $A(t)$. Then at $t + \varepsilon$, with probability 1, the investment will be $A(t + \varepsilon) = A(t) + A(t)\cdot\varepsilon X$. This implies that $P\Big(\frac{A(t + \varepsilon) - A(t)}{\varepsilon} = XA(t)\Big) = 1$. Letting $\varepsilon \to 0$ yields $P\Big(\lim_{\varepsilon\to 0} \frac{A(t + \varepsilon) - A(t)}{\varepsilon} = XA(t)\Big) = 1$ or, equivalently, with probability 1, $A'(t) = XA(t)$.

(c) Part (b) implies that, with probability 1, $\frac{A'(t)}{A(t)} = X$.
Integrating both sides of this equation, we obtain that, with probability 1, $\ln[A(t)] = tX + c$, or $A(t) = e^{tX + c}$. Considering the fact that $A(0) = A$, this equation yields $A = e^c$. Therefore, with probability 1, $A(t) = e^{tX}\cdot e^c = Ae^{tX}$. This shows that if the interest rate is compounded continuously, then an initial investment of $A$ dollars will grow, in $t$ years, with probability 1, to the random variable $Ae^{tX}$, whose expected value is $E(Ae^{tX}) = AE(e^{tX}) = AM_X(t)$. We have shown the following: If money is invested in a bank at an annual rate $X$, where $X$ is a random variable, and if the bank compounds interest continuously, then, on average, the money will grow by a factor of $M_X(t)$, the moment-generating function of the interest rate.

26. Since $X_i$ and $X_j$ are binomial with parameters $(n, p_i)$ and $(n, p_j)$, $E(X_i) = np_i$, $E(X_j) = np_j$, $\sigma_{X_i} = \sqrt{np_i(1 - p_i)}$, $\sigma_{X_j} = \sqrt{np_j(1 - p_j)}$. To find $E(X_iX_j)$, note that $M(t_1, t_2) = E\big(e^{t_1X_i + t_2X_j}\big) = \sum_{x_i=0}^{n}\sum_{x_j=0}^{n - x_i} e^{t_1x_i + t_2x_j}P(X_i = x_i, X_j = x_j) = \sum_{x_i=0}^{n}\sum_{x_j=0}^{n - x_i} e^{t_1x_i + t_2x_j}\cdot\frac{n!}{x_i!\,x_j!\,(n - x_i - x_j)!}\,p_i^{x_i}p_j^{x_j}(1 - p_i - p_j)^{n - x_i - x_j} = \sum_{x_i=0}^{n}\sum_{x_j=0}^{n - x_i} \frac{n!}{x_i!\,x_j!\,(n - x_i - x_j)!}\big(e^{t_1}p_i\big)^{x_i}\big(e^{t_2}p_j\big)^{x_j}(1 - p_i - p_j)^{n - x_i - x_j} = \big(p_ie^{t_1} + p_je^{t_2} + 1 - p_i - p_j\big)^n$, where the last equality follows from multinomial expansion (Theorem 2.6). Therefore, $\frac{\partial^2 M}{\partial t_1\,\partial t_2}(t_1, t_2) = n(n - 1)p_ip_je^{t_1}e^{t_2}\big(p_ie^{t_1} + p_je^{t_2} + 1 - p_i - p_j\big)^{n-2}$, and so $E(X_iX_j) = \frac{\partial^2 M}{\partial t_1\,\partial t_2}(0, 0) = n(n - 1)p_ip_j$.
Thus $\rho(X_i, X_j) = \frac{n(n - 1)p_ip_j - (np_i)(np_j)}{\sqrt{np_i(1 - p_i)}\cdot\sqrt{np_j(1 - p_j)}} = -\sqrt{\frac{p_ip_j}{(1 - p_i)(1 - p_j)}}$.

11.2 SUMS OF INDEPENDENT RANDOM VARIABLES

1. $M_{\alpha X}(t) = E\big(e^{t\alpha X}\big) = M_X(t\alpha) = \exp\big[\alpha\mu t + (1/2)\alpha^2\sigma^2t^2\big]$.

2. Since $M_{X_1 + X_2 + \cdots + X_n}(t) = M_{X_1}(t)M_{X_2}(t)\cdots M_{X_n}(t) = \Big[\frac{pe^t}{1 - (1 - p)e^t}\Big]^n$, $X_1 + X_2 + \cdots + X_n$ is negative binomial with parameters $(n, p)$.

3. Since $M_{X_1 + X_2 + \cdots + X_n}(t) = M_{X_1}(t)M_{X_2}(t)\cdots M_{X_n}(t) = \Big(\frac{\lambda}{\lambda - t}\Big)^n$, $X_1 + X_2 + \cdots + X_n$ is gamma with parameters $n$ and $\lambda$.

4. For $1 \le i \le n$, let $X_i$ be negative binomial with parameters $r_i$ and $p$. We have that $M_{X_1 + X_2 + \cdots + X_n}(t) = M_{X_1}(t)M_{X_2}(t)\cdots M_{X_n}(t) = \Big[\frac{pe^t}{1 - (1 - p)e^t}\Big]^{r_1}\Big[\frac{pe^t}{1 - (1 - p)e^t}\Big]^{r_2}\cdots\Big[\frac{pe^t}{1 - (1 - p)e^t}\Big]^{r_n} = \Big[\frac{pe^t}{1 - (1 - p)e^t}\Big]^{r_1 + r_2 + \cdots + r_n}$. Thus $X_1 + X_2 + \cdots + X_n$ is negative binomial with parameters $r_1 + r_2 + \cdots + r_n$ and $p$.

5. Since $M_{X_1 + X_2 + \cdots + X_n}(t) = M_{X_1}(t)M_{X_2}(t)\cdots M_{X_n}(t) = \Big(\frac{\lambda}{\lambda - t}\Big)^{r_1}\Big(\frac{\lambda}{\lambda - t}\Big)^{r_2}\cdots\Big(\frac{\lambda}{\lambda - t}\Big)^{r_n} = \Big(\frac{\lambda}{\lambda - t}\Big)^{r_1 + r_2 + \cdots + r_n}$, $X_1 + X_2 + \cdots + X_n$ is gamma with parameters $r_1 + r_2 + \cdots + r_n$ and $\lambda$.

6. By Theorem 11.4, the total number of underfilled bottles is binomial with parameters 180 and 0.15. Therefore, the desired probability is $\binom{180}{27}(0.15)^{27}(0.85)^{153} = 0.083$.

7.
For $j$ […] $0) = P\Big(\frac{X + Y - 5}{3} > \frac{0 - 5}{3}\Big) = 1 - \Phi(-1.67) = \Phi(1.67) = 0.9525$, $P(X - Y < 2) = P\Big(\frac{X - Y + 3}{3} < \frac{2 + 3}{3}\Big) = \Phi(1.67) = 0.9525$, and $P(3X + 4Y > 20) = P\Big(\frac{3X + 4Y - 19}{\sqrt{130}} > \frac{20 - 19}{\sqrt{130}}\Big) = 1 - \Phi(0.09) = 0.4641$.

11. Theorem 11.7 implies that $\bar{X} \sim N(110, 1.6)$, where $\bar{X}$ is the average of the IQ's of the randomly selected students. Therefore, $P(\bar{X} \ge 112) = P\Big(\frac{\bar{X} - 110}{\sqrt{1.6}} \ge \frac{112 - 110}{\sqrt{1.6}}\Big) = 1 - \Phi(1.58) = 0.0571$.

12. Let $\bar{X}_1$ be the average of the accounts selected at store 1 and $\bar{X}_2$ be the average of the accounts selected at store 2. We have that $\bar{X}_1 \sim N\Big(90, \frac{900}{10}\Big) = N(90, 90)$ and $\bar{X}_2 \sim N\Big(100, \frac{2500}{15}\Big) = N\Big(100, \frac{500}{3}\Big)$. Therefore, $\bar{X}_1 - \bar{X}_2 \sim N\Big(-10, \frac{770}{3}\Big)$ and so $P(\bar{X}_1 > \bar{X}_2) = P(\bar{X}_1 - \bar{X}_2 > 0) = P\Big(\frac{\bar{X}_1 - \bar{X}_2 + 10}{\sqrt{770/3}} > \frac{0 + 10}{\sqrt{770/3}}\Big) = 1 - \Phi(0.62) = 0.2676$.

13. By Exercise 6, Section 10.5, $X$ and $Y$ are sums of independent standard normal random variables. Hence $\alpha X + \beta Y$ is a linear combination of independent standard normal random variables. Thus, by Theorem 11.7, $\alpha X + \beta Y$ is normal.

14. By Exercise 13, $X - Y$ is normal; its mean is $71 - 60 = 11$, and its variance is $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\rho(X, Y)\sigma_X\sigma_Y = 9 + (2.7)^2 - 2(0.45)(3)(2.7) = 9$. Therefore, $P(X - Y \ge 8) = P\Big(\frac{X - Y - 11}{3} \ge \frac{8 - 11}{3}\Big) = 1 - \Phi(-1) = \Phi(1) = 0.8413$.

15. Let $\bar{X}$ be the average of the weights of the 12 randomly selected athletes. Let $X_1, X_2, \ldots, X_{12}$ be the weights of these athletes.
Since $\bar{X} \sim N\Big(225, \frac{25^2}{12}\Big) = N\Big(225, \frac{625}{12}\Big)$, we have that $P(X_1 + X_2 + \cdots + X_{12} \le 2700) = P\Big(\bar{X} \le \frac{2700}{12}\Big) = P(\bar{X} \le 225) = P\Big(\frac{\bar{X} - 225}{\sqrt{625/12}} \le \frac{225 - 225}{\sqrt{625/12}}\Big) = \Phi(0) = \frac{1}{2}$.

16. Let $\bar{X}_1$ and $\bar{X}_2$ be the averages of the final grades of the probability and calculus courses Dr. Olwell teaches, respectively. We have that $\bar{X}_1 \sim N\Big(65, \frac{418}{22}\Big) = N(65, 19)$ and $\bar{X}_2 \sim N\Big(72, \frac{448}{28}\Big) = N(72, 16)$. Therefore, $\bar{X}_1 - \bar{X}_2 \sim N(-7, 35)$ and hence the desired probability is $P\big(|\bar{X}_1 - \bar{X}_2| \ge 2\big) = P(\bar{X}_1 - \bar{X}_2 \ge 2) + P(\bar{X}_1 - \bar{X}_2 \le -2) = P\Big(\frac{\bar{X}_1 - \bar{X}_2 + 7}{\sqrt{35}} \ge \frac{2 + 7}{\sqrt{35}}\Big) + P\Big(\frac{\bar{X}_1 - \bar{X}_2 + 7}{\sqrt{35}} \le \frac{-2 + 7}{\sqrt{35}}\Big) = 1 - \Phi(1.52) + \Phi(0.85) = 1 - 0.9352 + 0.8023 = 0.8671$.

17. Let $X$ and $Y$ be the lifetimes of the mufflers of the first and second cars, respectively. (a) To calculate the desired probability, $P(|X - Y| \ge 1.5)$, note that by symmetry, $P\big(|X - Y| \ge 1.5\big) = 2P(X - Y \ge 1.5)$. Now $X - Y \sim N(0, 2)$, hence $P\big(|X - Y| \ge 1.5\big) = 2P\Big(\frac{X - Y - 0}{\sqrt{2}} \ge \frac{1.5 - 0}{\sqrt{2}}\Big) = 2\big[1 - \Phi(1.06)\big] = 0.289$. (b) Let $Z$ be the lifetime of the first muffler the family buys. By symmetry, the desired probability is $2P(Y > X + Z) = 2P(Y - X - Z > 0)$. Now $Y - X - Z \sim N(-3, 3)$. Hence $2P(Y - X - Z > 0) = 2P\Big(\frac{Y - X - Z + 3}{\sqrt{3}} > \frac{0 + 3}{\sqrt{3}}\Big) = 2\big[1 - \Phi(1.73)\big] = 0.0836$.

18. Let $n$ be the maximum number of passengers who can use the elevator and $X_1, X_2, \ldots, X_n$ be the weights of $n$ random passengers. We must have $P(X_1 + X_2 + \cdots + X_n > 3000) < 0.0003$ or, equivalently, $P(X_1 + X_2 + \cdots + X_n \le 3000) > 0.9997$.
Let $\bar{X}$ be the mean of the weights of the $n$ random passengers. We must have $P\Big(\bar{X} \le \frac{3000}{n}\Big) > 0.9997$. Since $\bar{X} \sim N\Big(155, \frac{625}{n}\Big)$, we must have $P\Big(\frac{\bar{X} - 155}{25/\sqrt{n}} \le \frac{(3000/n) - 155}{25/\sqrt{n}}\Big) > 0.9997$, or $\Phi\Big(\frac{3000}{25\sqrt{n}} - \frac{155\sqrt{n}}{25}\Big) > 0.9997$. Using Table 2 of the Appendix, this gives $\frac{3000}{25\sqrt{n}} - \frac{155\sqrt{n}}{25} \ge 3.49$ or, equivalently, $155n + 87.25\sqrt{n} - 3000 \le 0$. Since the roots of the equation $155n + 87.25\sqrt{n} - 3000 = 0$, which is quadratic in $\sqrt{n}$, are (approximately) $\sqrt{n} = 4.127$ and $\sqrt{n} = -4.69$, the inequality is valid if and only if $\big(\sqrt{n} + 4.69\big)\big(\sqrt{n} - 4.127\big) \le 0$. But $\sqrt{n} + 4.69 > 0$, so the inequality is valid if and only if $\sqrt{n} - 4.127 \le 0$ or $n \le 17.032$. Therefore the answer is $n = 17$.

19. By Remark 9.3, the marginal joint probability mass function of $X_1, X_2, \ldots, X_k$ is multinomial with parameters $n$ and $(p_1, p_2, \ldots, p_k, 1 - p_1 - p_2 - \cdots - p_k)$. Thus, letting $p = p_1 + p_2 + \cdots + p_k$ and $x = x_1 + x_2 + \cdots + x_k$, we have that $p(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!\,(n - x)!}\,p_1^{x_1}p_2^{x_2}\cdots p_k^{x_k}(1 - p)^{n - x}$. This gives $P(X_1 + X_2 + \cdots + X_k = i) = \sum_{x_1 + x_2 + \cdots + x_k = i} \frac{n!}{x_1!\,x_2!\cdots x_k!\,(n - i)!}\,p_1^{x_1}p_2^{x_2}\cdots p_k^{x_k}(1 - p)^{n - i} = \frac{n!}{i!\,(n - i)!}(1 - p)^{n - i}\sum_{x_1 + x_2 + \cdots + x_k = i} \frac{i!}{x_1!\,x_2!\cdots x_k!}\,p_1^{x_1}p_2^{x_2}\cdots p_k^{x_k} = \binom{n}{i}(1 - p)^{n - i}(p_1 + p_2 + \cdots + p_k)^i = \binom{n}{i}p^i(1 - p)^{n - i}$. This shows that $X_1 + X_2 + \cdots + X_k$ is binomial with parameters $n$ and $p = p_1 + p_2 + \cdots + p_k$.

20. First note that if $Y_1$ and $Y_2$ are two exponential random variables each with rate $\lambda$, $\min(Y_1, Y_2)$ is exponential with rate $2\lambda$. Now let $A_1, A_2, \ldots, A_{11}$ be the customers in the line ahead of Kim.
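The elevator bound of Exercise 18 above can also be found by direct search instead of solving the quadratic in $\sqrt{n}$. The sketch below is only an illustration under the same model (weights normal with mean 155 and standard deviation 25, capacity 3000); the helper names `phi` and `within_limit` are made up here, and the normal distribution function is computed from `math.erf` rather than read from Table 2.

```python
import math

def phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def within_limit(n, limit=3000, mean=155, sd=25, level=0.9997):
    """True if the total weight of n passengers stays at or below the
    limit with probability greater than level."""
    z = (limit - n * mean) / (sd * math.sqrt(n))
    return phi(z) > level

# Largest number of passengers that still meets the requirement.
max_n = max(n for n in range(1, 100) if within_limit(n))
print(max_n)
```

Under these assumptions the search agrees with the answer $n = 17$ obtained in the solution.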
Due to the memoryless property of exponential random variables, $X_1$, the time until $A_1$'s turn to make a call, is exponential with rate $2(1/3) = 2/3$. The time until $A_2$'s turn to call is $X_1 + X_2$, where $X_2$ is exponential with rate $2(1/3) = 2/3$. Continuing this argument and considering the fact that Kim is the 12th person waiting in the line, we have that the time until Kim's turn to make a phone call is $X_1 + X_2 + \cdots + X_{12}$, where $\{X_1, X_2, \ldots, X_{12}\}$ is an independent and identically distributed sequence of exponential random variables each with rate $2/3$. Hence the distribution of the waiting time of Kim is gamma with parameters $(12, 2/3)$. Her expected waiting time is $12/(2/3) = 18$.

11.3 MARKOV AND CHEBYSHEV INEQUALITIES

1. Let $X$ be the lifetime (in months) of a randomly selected dollar bill. We are given that $E(X) = 22$. By Markov's inequality, $P(X \ge 60) \le \frac{22}{60} = 0.37$. This shows that at most 37% of the one-dollar bills last 60 or more months; that is, at least five years.

2. We have that $P(X \ge 2) = 2/5$. Hence, by Markov's inequality, $\frac{2}{5} = P(X \ge 2) \le \frac{E(X)}{2}$. This gives $E(X) \ge 4/5$.

3. (a) $P(X \ge 11) \le \frac{E(X)}{11} = \frac{5}{11} = 0.4545$. (b) $P(X \ge 11) = P(X - 5 \ge 6) \le P\big(|X - 5| \ge 6\big) \le \frac{\sigma^2}{36} = \frac{42 - 25}{36} = 0.472$.

4. Let $X$ be the lifetime of the randomly selected light bulb; we have $P(X \le 700) \le P\big(|X - 800| \ge 100\big) \le \frac{2500}{10{,}000} = 0.25$.

5. Let $X$ be the number of accidents that will occur tomorrow. Then (a) $P(X \ge 5) \le \frac{2}{5} = 0.4$. (b) $P(X \ge 5) = 1 - \sum_{i=0}^{4} \frac{e^{-2}2^i}{i!} = 0.053$. (c) $P(X \ge 5) = P(X - 2 \ge 3) \le P\big(|X - 2| \ge 3\big) \le \frac{2}{9} = 0.222$.

6. Let $X$ be the IQ of a randomly selected student from this campus; we have $P(X > 140) \le P\big(|X - 110| > 30\big) \le \frac{15}{900} = 0.017$. Therefore, less than 1.7% of these students have an IQ above 140.

7. Let $X$ be the waiting period from the time Helen orders the book until she receives it.
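Exercise 5 of Section 11.3 above compares the Markov and Chebyshev bounds with the exact Poisson tail. The three numbers can be reproduced in a few lines. The sketch below is only an illustration; the variable names are made up here, and it assumes, as the solution does in part (b), that the count is Poisson with mean 2 (so its variance is also 2).

```python
import math

lam = 2        # mean and variance of the Poisson count in Exercise 5
threshold = 5  # we bound P(X >= 5)

# (a) Markov: P(X >= 5) <= E(X) / 5
markov = lam / threshold

# (b) Exact tail: 1 - sum_{i=0}^{4} e^{-2} 2^i / i!
exact = 1 - sum(math.exp(-lam) * lam**i / math.factorial(i)
                for i in range(threshold))

# (c) Chebyshev: P(|X - 2| >= 3) <= Var(X) / 3^2
chebyshev = lam / (threshold - lam) ** 2

print(round(markov, 3), round(exact, 3), round(chebyshev, 3))
```

The exact tail is far below both bounds, which is typical: the inequalities use only the mean (Markov) or the mean and variance (Chebyshev), not the full distribution.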
We want to find $a$ so that $P(X$ […] $2\mu) = P(X - \mu > \mu) \le P\big(|X - \mu| \ge \mu\big) \le \frac{\mu}{\mu^2} = \frac{1}{\mu}$.

10. We have that $P(38 < \bar{X} < 46) = P(-4 < \bar{X} - 42 < 4) = P\big(|\bar{X} - 42| < 4\big) = 1 - P\big(|\bar{X} - 42| \ge 4\big)$. By (11.3), $P\big(|\bar{X} - 42| \ge 4\big) \le \frac{60}{16(25)} = \frac{3}{20}$. Hence $P(38 < \bar{X} < 46) \ge 1 - \frac{3}{20} = \frac{17}{20} = 0.85$.

11. For $i = 1, 2, \ldots, n$, let $X_i$ be the IQ of the $i$th student selected at random. We want to find $n$, so that $P\Big(-3 < \frac{X_1 + X_2 + \cdots + X_n}{n} - \mu < 3\Big) \ge 0.92$ or, equivalently, $P(|\bar{X} - \mu| \ge 3) \le 0.08$. Since $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = 150$, by (11.3), $P(|\bar{X} - \mu| \ge 3) \le \frac{150}{3^2\cdot n}$. Therefore, all we need to do is to find $n$ for which $150/(9n) \le 0.08$. This gives $n \ge 150/[9(0.08)] = 208.33$. Thus the psychologist should choose a sample of size 209.

12. Let $X_1, X_2, \ldots, X_n$ be the random sample, $\mu$ be the expected value of the distribution, and $\sigma^2$ be the variance of the distribution. We want to find $n$ so that $P(|\bar{X} - \mu| < 2\sigma) \ge 0.98$ or, equivalently, $P(|\bar{X} - \mu| \ge 2\sigma) < 0.02$. By (11.3), $P(|\bar{X} - \mu| \ge 2\sigma) \le \frac{\sigma^2}{(2\sigma)^2\cdot n} = \frac{1}{4n}$. Therefore, all we need to do is to make sure that $1/(4n) \le 0.02$. This gives $n \ge 12.5$. So a sample of size 13 gives a mean which is within 2 standard deviations from the expected value with a probability of at least 0.98.

13. Call a random observation success, if the operator is busy. Call it failure, if he is free. In (11.5), let $\varepsilon = 0.05$ and $\alpha = 0.04$; we have $n \ge \frac{1}{4(0.05)^2(0.04)} = 2500$. Therefore, at least 2500 independent observations should be made to ensure that $(1/n)\sum_{i=1}^{n}$ estimates $p$, the proportion of time that the airline operator is busy, with a maximum error of 0.05 with probability 0.96 or higher.

14. By (11.5), $n \ge \frac{1}{4(0.05)^2(0.06)} = 1666.67$. Therefore, it suffices to flip the coin $n = 1667$ times independently.

15.
\[ P\big(|X - \mu| \ge \alpha\big) = P\big(|X - \mu|^{2n} \ge \alpha^{2n}\big) \le \frac{E\big[(X - \mu)^{2n}\big]}{\alpha^{2n}}. \]

16. By Markov's inequality,
\[ P(X > t) = P\big(e^{kX} > e^{kt}\big) \le \frac{E\big(e^{kX}\big)}{e^{kt}}. \]

17. By the Corollary of the Cauchy-Schwarz Inequality (Theorem 10.3),
\[ \big[E(X - Y)\big]^2 \le E\big[(X - Y)^2\big] = 0. \]
This gives that $E(X - Y) = 0$. Therefore,
\[ \operatorname{Var}(X - Y) = E\big[(X - Y)^2\big] - \big[E(X - Y)\big]^2 = 0. \]
We have shown that $X - Y$ is a random variable with mean 0 and variance 0; by Example 11.16, $P(X - Y = 0) = 1$. So with probability 1, $X = Y$.

18. If $Y = X$ with probability 1, Theorem 10.5 implies that $\rho(X, Y) = 1$. Suppose that $\rho(X, Y) = 1$; we show that $X = Y$ with probability 1. Note that
\[ E(X) = E(Y) = \frac{n + 1}{2}, \qquad \operatorname{Var}(X) = \operatorname{Var}(Y) = \frac{n^2 - 1}{12}, \qquad \sigma_X = \sigma_Y = \sqrt{\frac{n^2 - 1}{12}}. \]
These and
\[ 1 = \rho(X, Y) = \frac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y} \]
imply that $E(XY) = (2n^2 + 3n + 1)/6$. Therefore,
\[
\begin{aligned}
E\big[(X - Y)^2\big] &= E(X^2 - 2XY + Y^2) = E(X^2) + E(Y^2) - 2E(XY) \\
&= \operatorname{Var}(X) + \big[E(X)\big]^2 + \operatorname{Var}(Y) + \big[E(Y)\big]^2 - 2E(XY) \\
&= \frac{n^2 - 1}{12} + \Big(\frac{n + 1}{2}\Big)^2 + \frac{n^2 - 1}{12} + \Big(\frac{n + 1}{2}\Big)^2 - \frac{2n^2 + 3n + 1}{3} = 0.
\end{aligned}
\]
$E\big[(X - Y)^2\big] = 0$ implies that, with probability 1, $X = Y$ (see Exercise 17 above).

19. By Markov's inequality,
\[ P\Big(X \ge \frac{1}{t}\ln\alpha\Big) = P(tX \ge \ln\alpha) = P\big(e^{tX} \ge \alpha\big) \le \frac{E\big(e^{tX}\big)}{\alpha} = \frac{1}{\alpha} M_X(t). \]

20. Using the gamma function introduced in Section 7.4,
\[ E(X) = \frac{1}{n!}\int_0^\infty x^{n+1} e^{-x}\,dx = \frac{\Gamma(n + 2)}{n!} = \frac{(n + 1)!}{n!} = n + 1, \]
\[ E(X^2) = \frac{1}{n!}\int_0^\infty x^{n+2} e^{-x}\,dx = \frac{\Gamma(n + 3)}{n!} = \frac{(n + 2)!}{n!} = (n + 1)(n + 2). \]
Hence $\sigma_X^2 = (n + 1)(n + 2) - (n + 1)^2 = n + 1$. Now $P(0$ … $M$ with probability 1, then $X_2 > M$ with probability 1 since $X_1$ and $X_2$ are identically distributed. Therefore, $X_1 + X_2 > 2M > M$ with probability 1. This argument shows that
\[ \{X_1 > M\} \subseteq \{X_1 + X_2 > M\} \subseteq \{X_1 + X_2 + X_3 > M\} \subseteq \cdots. \]
Therefore, by the continuity of the probability function (Theorem 1.8),
\[ \lim_{n\to\infty} P(X_1 + X_2 + \cdots + X_n > M) = P\big(\lim_{n\to\infty} (X_1 + X_2 + \cdots + X_n) > M\big). \]
By this relation, it suffices to show that, for all $M > 0$,
\[ \lim_{n\to\infty} (X_1 + X_2 + \cdots + X_n) > M \tag{45} \]
with probability 1. Let $S$ be the sample space over which the $X_i$'s are defined. Let $\mu = E(X_i)$; we are given that $\mu > 0$. By the strong law of large numbers,
\[ P\Big(\lim_{n\to\infty} \frac{X_1 + X_2 + \cdots + X_n}{n} = \mu\Big) = 1. \]
Therefore, letting
\[ V = \Big\{\omega \in S : \lim_{n\to\infty} \frac{X_1(\omega) + X_2(\omega) + \cdots + X_n(\omega)}{n} = \mu\Big\}, \]
we have that $P(V) = 1$. To establish (45), it is sufficient to show that, for all $\omega \in V$,
\[ \lim_{n\to\infty} \big(X_1(\omega) + X_2(\omega) + \cdots + X_n(\omega)\big) = \infty. \tag{46} \]
To do so, applying the definition of limit to
\[ \lim_{n\to\infty} \frac{X_1(\omega) + X_2(\omega) + \cdots + X_n(\omega)}{n} = \mu, \]
we have that for $\varepsilon = \mu/2$, there exists a positive integer $N$ (depending on $\omega$) such that, for all $n > N$,
\[ \Big|\frac{X_1(\omega) + X_2(\omega) + \cdots + X_n(\omega)}{n} - \mu\Big| < \varepsilon = \frac{\mu}{2} \]
or, equivalently,
\[ -\frac{\mu}{2} < \frac{X_1(\omega) + X_2(\omega) + \cdots + X_n(\omega)}{n} - \mu < \frac{\mu}{2}. \]
This yields
\[ \frac{X_1(\omega) + X_2(\omega) + \cdots + X_n(\omega)}{n} > \frac{\mu}{2}. \]
Thus, for all $n > N$,
\[ X_1(\omega) + X_2(\omega) + \cdots + X_n(\omega) > \frac{n\mu}{2}, \]
which establishes (46).

3. For $0 < \varepsilon < 1$,
\[ P\big(|Y_n - 0| > \varepsilon\big) = 1 - P\big(|Y_n - 0| \le \varepsilon\big) = 1 - P(X \le n) = 1 - \int_0^n f(x)\,dx. \]
Therefore,
\[ \lim_{n\to\infty} P\big(|Y_n - 0| > \varepsilon\big) = 1 - \int_0^\infty f(x)\,dx = 1 - 1 = 0, \]
showing that $Y_n$ converges to 0 in probability.

4. By the strong law of large numbers, $S_n/n$ converges to $\mu$ almost surely.
Therefore, $S_n/n$ converges to $\mu$ in probability and hence
\[
\begin{aligned}
\lim_{n\to\infty} P\big(n(\mu - \varepsilon) \le S_n \le n(\mu + \varepsilon)\big)
&= \lim_{n\to\infty} P\Big(\mu - \varepsilon \le \frac{S_n}{n} \le \mu + \varepsilon\Big) \\
&= \lim_{n\to\infty} P\Big(\Big|\frac{S_n}{n} - \mu\Big| \le \varepsilon\Big) \\
&= 1 - \lim_{n\to\infty} P\Big(\Big|\frac{S_n}{n} - \mu\Big| > \varepsilon\Big) = 1 - 0 = 1.
\end{aligned}
\]

5. Suppose that the bank will never be empty of customers again. We will show a contradiction. Let $U_n = T_1 + T_2 + \cdots + T_n$. Then $U_n$ is the time the $n$th new customer arrives. Let $W_i$ be the service time of the $i$th new customer served. Clearly, $W_1, W_2, W_3, \ldots$ are independent and identically distributed random variables with $E(W_i) = 1/\mu$. Let $Z_n = T_1 + W_1 + W_2 + \cdots + W_n$. Since the bank will never be empty of customers, $Z_n$ is the departure time of the $n$th new customer served. By the strong law of large numbers,
\[ \lim_{n\to\infty} \frac{U_n}{n} = \frac{1}{\lambda} \]
and
\[ \lim_{n\to\infty} \frac{Z_n}{n} = \lim_{n\to\infty} \Big(\frac{T_1}{n} + \frac{W_1 + W_2 + \cdots + W_n}{n}\Big) = \lim_{n\to\infty} \frac{T_1}{n} + \lim_{n\to\infty} \frac{W_1 + W_2 + \cdots + W_n}{n} = 0 + \frac{1}{\mu} = \frac{1}{\mu}. \]
Clearly, the bank will never remain empty of customers again if and only if, for all $n$, $U_{n+1}$ … $0$, $P\big(|X_n - 0| \ge \varepsilon\big)$ is the probability that the random point selected from $[0, 1]$ is in $\Big[\dfrac{i}{2^k}, \dfrac{i + 1}{2^k}\Big]$. Now $n \to \infty$ implies that $2^k \to \infty$ and the length of the interval $\Big[\dfrac{i}{2^k}, \dfrac{i + 1}{2^k}\Big] \to 0$. Therefore, $\lim_{n\to\infty} P\big(|X_n - 0| \ge \varepsilon\big) = 0$. However, $X_n$ does not converge at any point because, for every positive integer $N$, there are always $m > N$ and $n > N$ such that $X_m = 0$ and $X_n = 1$, making it impossible for $|X_n - X_m|$ to be less than a given $0 < \varepsilon < 1$.

11.5 CENTRAL LIMIT THEOREM

1. Let $X_1, X_2, \ldots, X_{150}$ be the random points selected from the interval $(0, 1)$.
For $1 \le i \le 150$, $X_i$ is uniform over $(0, 1)$. Therefore, $E(X_i) = \mu = 0.5$ and $\sigma_{X_i} = 1/\sqrt{12}$. We have
\[ P\Big(0.48 < \frac{X_1 + X_2 + \cdots + X_{150}}{150} < 0.52\Big) = P(72 \ \ldots \]
… $0) = P\Big(\dfrac{X_1 + X_2 + \cdots + X_n}{n} > 0\Big) = P\Big(\dfrac{X_1 + X_2 + \cdots + X_n - n(0)}{\sqrt{2}\,\sqrt{n}} > 0\Big) = 1 - \Phi(0) = 0.5$.

5. Let $\mu = E(X_i)$ and $\sigma = \sigma_{X_i}$. Clearly, $E(S_n) = n\mu$ and $\sigma_{S_n} = \sigma\sqrt{n}$; thus, by the central limit theorem,
\[
\begin{aligned}
P\big(E(S_n) - \sigma_{S_n} \le S_n \le E(S_n) + \sigma_{S_n}\big)
&= P\big(n\mu - \sigma\sqrt{n} \le S_n \le n\mu + \sigma\sqrt{n}\big) \\
&= P\Big(-1 \le \frac{S_n - n\mu}{\sigma\sqrt{n}} \le 1\Big) \\
&\approx \Phi(1) - \Phi(-1) = 2\Phi(1) - 1 = 0.6826.
\end{aligned}
\]

6. For $1 \le i \le 300$, let $X_i$ be the amount of the $i$th expenditure minus Jim's $i$th record; $X_i$ is approximately uniform over $(-1/2, 1/2)$. Hence $E(X_i) = 0$ and $\sigma_{X_i} = \sqrt{\big[(1/2) - (-1/2)\big]^2/12} = 1/(2\sqrt{3})$. The desired probability is $P(-10$ … $0) = 1 - P(X = 0) = 1 - e^{-1/2} = 0.393$.

7. Note that
\[ M_X^{(n)}(t) = \frac{(-1)^{n+1}(n + 1)!}{(1 - t)^{n+2}}. \]
Therefore, $E(X^n) = M_X^{(n)}(0) = (-1)^{n+1}(n + 1)!$.

8. Let $\bar X$ be the average of the heights of 10 randomly selected men and $\bar Y$ be the average of the heights of 6 randomly selected women. Theorem 10.7 implies that $\bar X \sim N\big(173, \tfrac{40}{10}\big)$ and $\bar Y \sim N\big(160, \tfrac{20}{6}\big)$; thus $\bar X - \bar Y \sim N\big(13, \tfrac{22}{3}\big)$. Therefore,
\[ P(\bar X - \bar Y \ge 5) = P\bigg(\frac{\bar X - \bar Y - 13}{\sqrt{22/3}} \ge \frac{5 - 13}{\sqrt{22/3}}\bigg) = \Phi(2.95) = 0.9984. \]

9. By definition,
\[
\begin{aligned}
E\big(e^{tX}\big) &= \int_{-\infty}^{\infty} \frac{1}{2} e^{-|x|} e^{tx}\,dx = \int_{-\infty}^{0} \frac{1}{2} e^{x} \cdot e^{tx}\,dx + \int_{0}^{\infty} \frac{1}{2} e^{-x} \cdot e^{tx}\,dx \\
&= \frac{1}{2}\int_{-\infty}^{0} e^{(1+t)x}\,dx + \frac{1}{2}\int_{0}^{\infty} e^{x(t-1)}\,dx.
\end{aligned}
\]
Now for these integrals to exist, we must restrict the domain of the moment-generating function of $X$ to $\{t \in \mathbf{R} : -1$ … $15) = P(X - 10 > 5) \le P\big(|X - 10| > 5\big) = P\big(|X - E(X)| > 5\big) \le \dfrac{5/3}{25} = 0.0667$.

16. $P(X \ge 45) \le P\big(|X - 0| \ge 45\big) \le 15^2/45^2 = 1/9$.

17. Suppose that the $i$th randomly selected book is $X_i$ centimeters thick. The desired probability is
\[
P(X_1 + X_2 + \cdots + X_{31} \le 87) = P\bigg(\frac{X_1 + X_2 + \cdots + X_{31} - 3(31)}{1 \cdot \sqrt{31}} \le \frac{87 - 3(31)}{1 \cdot \sqrt{31}}\bigg) \approx \Phi\Big(\frac{87 - 93}{\sqrt{31}}\Big) = \Phi(-1.08) = 1 - 0.8599 = 0.1401.
\]

18. For $1 \le i \le 20$, let $X_i$ denote the outcome of the $i$th roll. We have
\[ E(X_i) = \sum_{i=1}^{6} i \cdot \frac{1}{6} = \frac{7}{2}, \qquad E(X_i^2) = \sum_{i=1}^{6} i^2 \cdot \frac{1}{6} = \frac{91}{6}. \]
Thus $\operatorname{Var}(X_i) = \dfrac{91}{6} - \dfrac{49}{4} = \dfrac{35}{12}$, and hence
\[
\begin{aligned}
P\Big(65 \le \sum_{i=1}^{20} X_i \le 75\Big)
&= P\bigg(\frac{65 - 70}{\sqrt{35/12} \cdot \sqrt{20}} \le \frac{\sum_{i=1}^{20} X_i - 70}{\sqrt{35/12} \cdot \sqrt{20}} \le \frac{75 - 70}{\sqrt{35/12} \cdot \sqrt{20}}\bigg) \\
&\approx \Phi(0.65) - \Phi(-0.65) = 2\Phi(0.65) - 1 = 0.4844.
\end{aligned}
\]

19. By Markov's inequality, $P(X \ge n\mu) \le \dfrac{\mu}{n\mu} = \dfrac{1}{n}$. So $nP(X \ge n\mu) \le 1$.

20. Let $X = \sum_{i=1}^{26} X_i$. We have that
\[ E(X_i) = 26/51 = 0.5098, \qquad E(X_i^2) = E(X_i) = 0.5098, \qquad \operatorname{Var}(X_i) = 0.5098 - (0.5098)^2 = 0.2499, \]
\[ E(X_i X_j) = P(X_i = 1, X_j = 1) = P(X_i = 1)\,P(X_j = 1 \mid X_i = 1) = \frac{26}{51} \cdot \frac{25}{49} = 0.2601, \]
and
\[ \operatorname{Cov}(X_i, X_j) = E(X_i X_j) - E(X_i)E(X_j) = 0.2601 - (0.5098)^2 = 0.0002. \]
Thus $E(X) = 26(0.5098) = 13.2548$ and $\operatorname{Var}(X) = \sum_{i=1}^{26} \operatorname{Var}(X_i) + 2\sum\sum_{i}$ … $15$. The time Linda has to wait before being able to cross the street is 0 if $N = 0$ (i.e., $X_1 > 15$), and is $S_N = X_1 + X_2 + \cdots + X_N$ otherwise.
Therefore,
\[ E(S_N) = E\big[E(S_N \mid N)\big] = \sum_{i=0}^{\infty} E(S_N \mid N = i)\,P(N = i) = \sum_{i=1}^{\infty} E(S_N \mid N = i)\,P(N = i), \]
where the last equality follows since for $N = 0$, we have that $S_N = 0$. Now
\[ E(S_N \mid N = i) = E(X_1 + X_2 + \cdots + X_i \mid N = i) = \sum_{j=1}^{i} E(X_j \mid N = i) = \sum_{j=1}^{i} E(X_j \mid X_j \le 15), \]
where, by Remark 8.1,
\[ E(X_j \mid X_j \le 15) = \frac{1}{F(15)} \int_0^{15} t f(t)\,dt; \]
$F$ and $f$ being the probability distribution and density functions of the $X_i$'s, respectively. That is, for $t \ge 0$, $F(t) = 1 - e^{-t/7}$ and $f(t) = (1/7)e^{-t/7}$. Thus
\[ E(X_j \mid X_j \le 15) = \frac{1}{1 - e^{-15/7}} \int_0^{15} \frac{t}{7} e^{-t/7}\,dt = (1.1329)\Big[-(t + 7)e^{-t/7}\Big]_0^{15} = (1.1329)(4.41898) = 5.00631. \]
This gives $E(S_N \mid N = i) = 5.00631\,i$. To find $P(N = i)$, note that for $i \ge 1$,
\[ P(N = i) = P(X_1 \le 15, X_2 \le 15, \ldots, X_i \le 15, X_{i+1} > 15) = \big[F(15)\big]^i \big[1 - F(15)\big] = (0.8827)^i (0.1173). \]
Putting all these together, we obtain
\[
\begin{aligned}
E(S_N) &= \sum_{i=1}^{\infty} E(S_N \mid N = i)\,P(N = i) = \sum_{i=1}^{\infty} (5.00631\,i)(0.8827)^i (0.1173) \\
&= (0.5872) \sum_{i=1}^{\infty} i(0.8827)^i = (0.5872) \cdot \frac{0.8827}{(1 - 0.8827)^2} = 37.6707,
\end{aligned}
\]
where the next to last equality follows from $\sum_{i=1}^{\infty} i r^i = r/(1 - r)^2$, $|r| < 1$. Therefore, on average, Linda has to wait approximately 38 seconds before she can cross the street.

4. Label the time point 9:00 A.M. as $t = 0$. Then $t = 4$ corresponds to 1:00 P.M. Let $N(t)$ be the number of fish caught at or prior to $t$; $\{N(t) : t \ge 0\}$ is a Poisson process with rate 2. Let $X_1, X_2, \ldots, X_6$ be six independent random variables uniformly distributed over $[0, 4]$. By Theorem 12.4, given that $N(4) = 6$, the time that the fisherman caught the first fish is $Y = \min(X_1, X_2, \ldots, X_6)$.
Therefore, the desired probability is
\[
\begin{aligned}
P(Y < 1) &= 1 - P(Y \ge 1) = 1 - P\big(\min(X_1, X_2, \ldots, X_6) \ge 1\big) \\
&= 1 - P(X_1 \ge 1, X_2 \ge 1, \ldots, X_6 \ge 1) \\
&= 1 - P(X_1 \ge 1)\,P(X_2 \ge 1) \cdots P(X_6 \ge 1) = 1 - \Big(\frac{3}{4}\Big)^6 = 0.822.
\end{aligned}
\]

5. Let $S_1$, $S_2$, and $S_3$ be the number of meters of wire manufactured, after the inspector left, until the first, second, and third fractures appeared, respectively. By Theorem 12.4, given that $N(200) = 3$, the joint probability density function of $S_1$, $S_2$, and $S_3$ is
\[ f_{S_1, S_2, S_3 \mid N(200)}(t_1, t_2, t_3 \mid 3) = \frac{3!}{8{,}000{,}000}, \qquad 0 \ \ldots \]
… $t$ and $S > t) = P(T > t)\,P(S > t) = e^{-\lambda t} \cdot e^{-\mu t} = e^{-(\lambda + \mu)t}$,
\[
\begin{aligned}
P(B) = P(S > T) &= \int_0^\infty P(S > T \mid T = u)\,\lambda e^{-\lambda u}\,du = \int_0^\infty P(S > u \mid T = u)\,\lambda e^{-\lambda u}\,du \\
&= \int_0^\infty P(S > u)\,\lambda e^{-\lambda u}\,du = \lambda \int_0^\infty e^{-\mu u} \cdot e^{-\lambda u}\,du = \frac{\lambda}{\lambda + \mu}.
\end{aligned}
\]
A similar calculation shows that
\[ P(AB) = P(S > T > t) = \int_t^\infty P(S > T \mid T = u)\,\lambda e^{-\lambda u}\,du = \int_t^\infty e^{-\mu u} \cdot \lambda e^{-\lambda u}\,du = \frac{\lambda}{\lambda + \mu} e^{-(\lambda + \mu)t} = P(A)P(B). \]

8. (a) Let $X$ be the number of customers arriving to the queue during a service period $S$. Then
\[
\begin{aligned}
P(X = n) &= \int_0^\infty P(X = n \mid S = t)\,\mu e^{-\mu t}\,dt = \int_0^\infty \frac{e^{-\lambda t}(\lambda t)^n}{n!}\,\mu e^{-\mu t}\,dt \\
&= \frac{\lambda^n \mu}{n!} \int_0^\infty t^n e^{-(\lambda + \mu)t}\,dt = \frac{\lambda^n \mu}{n!\,(\lambda + \mu)} \int_0^\infty t^n (\lambda + \mu) e^{-(\lambda + \mu)t}\,dt.
\end{aligned}
\]
Note that $(\lambda + \mu)e^{-(\lambda + \mu)t}$ is the probability density function of an exponential random variable $Z$ with parameter $\lambda + \mu$. Hence
\[ P(X = n) = \frac{\lambda^n \mu}{n!\,(\lambda + \mu)}\,E(Z^n). \]
By Example 11.4, $E(Z^n) = \dfrac{n!}{(\lambda + \mu)^n}$. Therefore,
\[ P(X = n) = \frac{\lambda^n \mu}{(\lambda + \mu)^{n+1}} = \Big(1 - \frac{\mu}{\lambda + \mu}\Big)^n \Big(\frac{\mu}{\lambda + \mu}\Big), \qquad n \ge 0. \]
This is the probability mass function of a geometric random variable with parameter $\mu/(\lambda + \mu)$.

(b) Due to the memoryless property of exponential random variables, the remaining service time of the customer being served is also exponential with parameter $\mu$.
Hence we want to find the number of new customers arriving during a period that is the sum of $n + 1$ independent exponential random variables. Since during each of these service times the number of new arrivals is geometric with parameter $\mu/(\lambda + \mu)$, the total number of new customers arriving during the entire period under consideration is the sum of $n + 1$ independent geometric random variables, each with parameter $\mu/(\lambda + \mu)$, which is negative binomial with parameters $n + 1$ and $\mu/(\lambda + \mu)$.

9. It is straightforward to check that $M(t)$ is stationary, orderly, and possesses independent increments. Clearly, $M(0) = 0$. Thus $\{M(t) : t \ge 0\}$ is a Poisson process. To find its rate, note that, for $0 \le k < \infty$,
\[
\begin{aligned}
P\big(M(t) = k\big) &= \sum_{n=k}^{\infty} P\big(M(t) = k \mid N(t) = n\big)\,P\big(N(t) = n\big) \\
&= \sum_{n=k}^{\infty} \binom{n}{k} p^k (1 - p)^{n-k} \cdot \frac{e^{-\lambda t}(\lambda t)^n}{n!} \\
&= \frac{e^{-\lambda t} p^k}{k!\,(1 - p)^k} \sum_{n=k}^{\infty} \frac{\big[\lambda t(1 - p)\big]^n}{(n - k)!} \\
&= \frac{e^{-\lambda t} p^k}{k!\,(1 - p)^k} \cdot \big[\lambda t(1 - p)\big]^k \sum_{n=k}^{\infty} \frac{\big[\lambda t(1 - p)\big]^{n-k}}{(n - k)!} \\
&= \frac{e^{-\lambda t} p^k}{k!} (\lambda t)^k e^{\lambda t(1 - p)} = \frac{(\lambda p t)^k}{k!} e^{-\lambda p t}.
\end{aligned}
\]
This shows that the parameter of $\{M(t) : t \ge 0\}$ is $\lambda p$.

10. Note that $P\big(V_i = \min(V_1, V_2, \ldots, V_k)\big)$ is the probability that the first shock occurring to the system is of type $i$. Suppose that the first shock occurs to the system at time $u$. If we label the time point $u$ as $t = 0$, then from that point on, by stationarity and the independent-increments property, probabilistically, the behavior of these Poisson processes is identical to the system considered prior to $u$. So the probability that the second shock is of type $i$ is identical to the probability that the first shock is of type $i$, and so on.
Hence they are all equal to $P\big(V_i = \min(V_1, V_2, \ldots, V_k)\big)$. To find this probability, note that, for $1 \le j \le k$, the $V_j$'s are independent exponential random variables, and the probability density function of $V_j$ is $\lambda_j e^{-\lambda_j t}$. Thus $P(V_j > u) = e^{-\lambda_j u}$. By conditioning on $V_i$, we have
\[
\begin{aligned}
P\big(V_i = \min(V_1, \ldots, V_k)\big)
&= \int_0^\infty P\big(\min(V_1, \ldots, V_k) = V_i \mid V_i = u\big)\,\lambda_i e^{-\lambda_i u}\,du \\
&= \lambda_i \int_0^\infty P\big(\min(V_1, \ldots, V_k) = u \mid V_i = u\big)\,e^{-\lambda_i u}\,du \\
&= \lambda_i \int_0^\infty P(V_1 \ge u, \ldots, V_{i-1} \ge u, V_{i+1} \ge u, \ldots, V_k \ge u \mid V_i = u)\,e^{-\lambda_i u}\,du \\
&= \lambda_i \int_0^\infty P(V_1 \ge u, \ldots, V_{i-1} \ge u, V_{i+1} \ge u, \ldots, V_k \ge u)\,e^{-\lambda_i u}\,du \\
&= \lambda_i \int_0^\infty P(V_1 \ge u) \cdots P(V_{i-1} \ge u)\,P(V_{i+1} \ge u) \cdots P(V_k \ge u)\,e^{-\lambda_i u}\,du \\
&= \lambda_i \int_0^\infty e^{-\lambda_1 u} \cdots e^{-\lambda_{i-1} u} \cdot e^{-\lambda_{i+1} u} \cdots e^{-\lambda_k u} \cdot e^{-\lambda_i u}\,du \\
&= \lambda_i \int_0^\infty e^{-(\lambda_1 + \cdots + \lambda_k)u}\,du = \lambda_i \int_0^\infty e^{-\lambda u}\,du = \frac{\lambda_i}{\lambda},
\end{aligned}
\]
where $\lambda = \lambda_1 + \cdots + \lambda_k$.

12.3 MARKOV CHAINS

1. $\{X_n : n = 1, 2, \ldots\}$ is not a Markov chain. For example, $P(X_4 = 1)$ depends on all the values of $X_1$, $X_2$, and $X_3$, and not just $X_3$. That is, whether or not the fourth person selected is female depends on the genders of all three persons selected prior to the fourth and not only on the gender of the third person selected.

2. For $j \ge 0$,
\[ P(X_n = j) = \sum_{i=0}^{\infty} P(X_n = j \mid X_0 = i)\,P(X_0 = i) = \sum_{i=0}^{\infty} p_{ij}^{n}\,p(i), \]
where $p_{ij}^{n}$ is the $ij$th entry of the matrix $P^n$.

3. The transition probability matrix of this Markov chain is
\[
P = \begin{pmatrix}
0 & 1/2 & 0 & 0 & 0 & 1/2 \\
1/2 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 1/2 & 0 & 1/2 & 0 & 0 \\
0 & 0 & 1/2 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 1/2 & 0 & 1/2 \\
1/2 & 0 & 0 & 0 & 1/2 & 0
\end{pmatrix}.
\]
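Powers of the six-state transition matrix above can be computed exactly rather than by hand. The following is a minimal sketch, assuming exact rational arithmetic with Python's `fractions` module; the helper names `mat_mul` and `mat_pow` are illustrative and do not come from the text, and states 1 through 6 are stored at indices 0 through 5.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, m):
    """Return the m-th power of the square matrix p, for an integer m >= 0."""
    n = len(p)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, p)
    return result


half = Fraction(1, 2)
# From each state the chain moves to either neighbor on a cycle of six states.
P = [[half if (j - i) % 6 in (1, 5) else Fraction(0) for j in range(6)]
     for i in range(6)]

P4 = mat_pow(P, 4)
P5 = mat_pow(P, 5)
print(P4[0][0])             # 3/8: back in state 1 after 4 transitions
print(P5[0][1] + P5[0][5])  # 11/16: in state 2 or 6 after 5 transitions
```

Exact fractions are used instead of floating point so the entries can be compared directly with the fractions quoted in the solution.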
By calculating $P^4$ and $P^5$, we find that (a) the probability that in 4 transitions the Markov chain returns to 1 is $p_{11}^{4} = 3/8$; (b) the probability that, in 5 transitions, the Markov chain enters 2 or 6 is
\[ p_{12}^{5} + p_{16}^{5} = \frac{11}{32} + \frac{11}{32} = \frac{11}{16}. \]

4. Solution 1: Starting at 0, the process eventually enters 1 or 2 with equal probabilities. Since 2 is absorbing, "never entering 1" is equivalent to eventually entering 2 directly from 0. The probability of that is 1/2.

Solution 2: Let $Z$ be the number of transitions until the first visit to 1. Note that state 2 is absorbing. If the process enters 2, it will always remain there. Hence $Z = n$ if and only if the first $n - 1$ transitions are from 0 to 0, and the $n$th transition is from 0 to 1, implying that
\[ P(Z = n) = \Big(\frac{1}{2}\Big)^{n-1}\Big(\frac{1}{4}\Big), \qquad n = 1, 2, \ldots. \]
The probability that the process ever enters 1 is
\[ P(Z < \infty) = \sum_{n=1}^{\infty} \Big(\frac{1}{2}\Big)^{n-1}\Big(\frac{1}{4}\Big) = \frac{1/4}{1 - (1/2)} = \frac{1}{2}. \]
Therefore, the probability that the process never enters 1 is $1 - (1/2) = 1/2$.

5. (a) By the Markovian property, given the present, the future is independent of the past. Thus the probability that tomorrow Emmett will not take the train to work is, simply, $p_{21} + p_{23} = 1/2 + 1/6 = 2/3$.
(b) The desired probability is $p_{21}p_{11} + p_{21}p_{13} + p_{23}p_{31} + p_{23}p_{33} = 1/4$.

6. Let $X_n$ denote the number of balls in urn I after $n$ transfers. The stochastic process $\{X_n : n = 0, 1, \ldots\}$ is a Markov chain with state space $\{0, 1, \ldots, 5\}$ and transition probability matrix
\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
1/5 & 0 & 4/5 & 0 & 0 & 0 \\
0 & 2/5 & 0 & 3/5 & 0 & 0 \\
0 & 0 & 3/5 & 0 & 2/5 & 0 \\
0 & 0 & 0 & 4/5 & 0 & 1/5 \\
0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]
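The sixth power of the urn-model matrix above, and the resulting value of $P(X_6 = 4)$, can be checked with exact arithmetic. This is a sketch under stated assumptions: the initial distribution $p(i) = i/15$ is inferred from the weights that appear in the solution's final sum and is not stated independently here, and the names `mat_mul`, `P6`, and `initial` are illustrative only.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


N = 5  # total number of balls; the state is the number of balls in urn I
P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
for i in range(N + 1):
    if i > 0:
        P[i][i - 1] = Fraction(i, N)      # a ball leaves urn I
    if i < N:
        P[i][i + 1] = Fraction(N - i, N)  # a ball enters urn I

# Sixth power by repeated multiplication.
P6 = P
for _ in range(5):
    P6 = mat_mul(P6, P)

# Assumed initial distribution p(i) = i/15 for i = 0, ..., 5.
initial = [Fraction(i, 15) for i in range(N + 1)]
prob = sum(initial[i] * P6[i][4] for i in range(N + 1))

print(P6[0])        # first row of the sixth power
print(float(prob))  # approximately 0.1308
```

The zero entries of `P6` alternate with the parity of the states, which reflects the fact that this chain has period 2.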
Direct calculations show that
\[
P^{(6)} = P^6 = \begin{pmatrix}
241/3125 & 0 & 2044/3125 & 0 & 168/625 & 0 \\
0 & 5293/15625 & 0 & 9492/15625 & 0 & 168/3125 \\
1022/15625 & 0 & 9857/15625 & 0 & 4746/15625 & 0 \\
0 & 4746/15625 & 0 & 9857/15625 & 0 & 1022/15625 \\
168/3125 & 0 & 9492/15625 & 0 & 5293/15625 & 0 \\
0 & 168/625 & 0 & 2044/3125 & 0 & 241/3125
\end{pmatrix}.
\]
Hence, by Theorem 12.5,
\[ P(X_6 = 4) = 0 \cdot \frac{168}{625} + \frac{1}{15} \cdot 0 + \frac{2}{15} \cdot \frac{4746}{15625} + \frac{3}{15} \cdot 0 + \frac{4}{15} \cdot \frac{5293}{15625} + \frac{5}{15} \cdot 0 = 0.1308. \]

7. By drawing a transition graph, it is readily seen that this Markov chain consists of the recurrent classes $\{0, 3\}$ and $\{2, 4\}$ and the transient class $\{1\}$.

8. Let $Z_n$ be the outcome of the $n$th toss. Then $X_{n+1} = \max(X_n, Z_{n+1})$ shows that $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain. Its state space is $\{1, 2, \ldots, 6\}$, and its transition probability matrix is given by
\[
P = \begin{pmatrix}
1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \\
0 & 2/6 & 1/6 & 1/6 & 1/6 & 1/6 \\
0 & 0 & 3/6 & 1/6 & 1/6 & 1/6 \\
0 & 0 & 0 & 4/6 & 1/6 & 1/6 \\
0 & 0 & 0 & 0 & 5/6 & 1/6 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]
It is readily seen that no two states communicate with each other. Therefore, we have six classes, of which $\{1\}$, $\{2\}$, $\{3\}$, $\{4\}$, $\{5\}$ are transient, and $\{6\}$ is recurrent (in fact, absorbing).

9. This can be achieved more easily by drawing a transition graph. An example of a desired matrix is as follows:
\[
\begin{pmatrix}
0 & 0 & 1/2 & 0 & 1/2 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/3 & 2/3 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 2/5 & 0 & 3/5 \\
0 & 0 & 0 & 0 & 1/2 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 0 & 0 & 3/5 & 0 & 2/5 \\
0 & 0 & 0 & 0 & 1/3 & 0 & 2/3 & 0
\end{pmatrix}.
\]

10. For $1 \le i \le 7$, starting from state $i$, let $x_i$ be the probability that the Markov chain will eventually be absorbed into state 4. We are interested in $x_6$.
Applying the law of total probability repeatedly, we obtain the following system of linear equations:
\[
\begin{cases}
x_1 = (0.3)x_1 + (0.7)x_2 \\
x_2 = (0.3)x_1 + (0.2)x_2 + (0.5)x_3 \\
x_3 = (0.6)x_4 + (0.4)x_5 \\
x_4 = 1 \\
x_5 = x_3 \\
x_6 = (0.1)x_1 + (0.3)x_2 + (0.1)x_3 + (0.2)x_5 + (0.2)x_6 + (0.1)x_7 \\
x_7 = 0.
\end{cases}
\]
Solving this system of equations, we obtain
\[
\begin{cases}
x_1 = x_2 = x_3 = x_4 = x_5 = 1 \\
x_6 = 0.875 \\
x_7 = 0.
\end{cases}
\]
Therefore, the probability is 0.875 that, starting from state 6, the Markov chain will eventually be absorbed into state 4.

11. Let $\pi_1$, $\pi_2$, and $\pi_3$ be the long-run probabilities that the sportsman devotes a vacation day to horseback riding, sailing, and scuba diving, respectively. Then, by Theorem 12.7, $\pi_1$, $\pi_2$, and $\pi_3$ are obtained from solving the system of equations
\[
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}
= \begin{pmatrix}
0.20 & 0.32 & 0.60 \\
0.30 & 0.15 & 0.13 \\
0.50 & 0.53 & 0.27
\end{pmatrix}
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}
\]
along with $\pi_1 + \pi_2 + \pi_3 = 1$. The matrix equation above gives us the following system of equations:
\[
\begin{cases}
\pi_1 = 0.20\pi_1 + 0.32\pi_2 + 0.60\pi_3 \\
\pi_2 = 0.30\pi_1 + 0.15\pi_2 + 0.13\pi_3 \\
\pi_3 = 0.50\pi_1 + 0.53\pi_2 + 0.27\pi_3.
\end{cases}
\]
By choosing any two of these equations along with $\pi_1 + \pi_2 + \pi_3 = 1$, we obtain a system of three equations in three unknowns. Solving that system yields $\pi_1 = 0.38856$, $\pi_2 = 0.200056$, and $\pi_3 = 0.411383$. Hence the long-run probability that on a randomly selected vacation day the sportsman sails is approximately 0.20.

12. For $n \ge 1$, let
\[
X_n = \begin{cases}
1 & \text{if the $n$th fish caught is trout} \\
0 & \text{if the $n$th fish caught is not trout.}
\end{cases}
\]
Then $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain with state space $\{0, 1\}$ and transition probability matrix
\[ \begin{pmatrix} 10/11 & 1/11 \\ 8/9 & 1/9 \end{pmatrix}. \]
Let $\pi_0$ be the fraction of fish in the lake that are not trout, and $\pi_1$ be the fraction of fish in the lake that are trout.
Then, by Theorem 12.7, $\pi_0$ and $\pi_1$ satisfy
\[
\begin{pmatrix} \pi_0 \\ \pi_1 \end{pmatrix}
= \begin{pmatrix} 10/11 & 8/9 \\ 1/11 & 1/9 \end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \end{pmatrix},
\]
which gives us the following system of equations:
\[
\begin{cases}
\pi_0 = (10/11)\pi_0 + (8/9)\pi_1 \\
\pi_1 = (1/11)\pi_0 + (1/9)\pi_1.
\end{cases}
\]
By choosing any one of these equations along with the relation $\pi_0 + \pi_1 = 1$, we obtain a system of two equations in two unknowns. Solving that system yields $\pi_0 = 88/97 \approx 0.907$ and $\pi_1 = 9/97 \approx 0.093$. Therefore, approximately 9.3% of the fish in the lake are trout.

13. Let
\[
X_n = \begin{cases}
1 & \text{if the $n$th card is drawn by player I} \\
2 & \text{if the $n$th card is drawn by player II} \\
3 & \text{if the $n$th card is drawn by player III.}
\end{cases}
\]
$\{X_n : n = 1, 2, \ldots\}$ is a Markov chain with transition probability matrix
\[
P = \begin{pmatrix}
48/52 & 4/52 & 0 \\
0 & 39/52 & 13/52 \\
12/52 & 0 & 40/52
\end{pmatrix}.
\]
Let $\pi_1$, $\pi_2$, and $\pi_3$ be the proportions of cards drawn by players I, II, and III, respectively. $\pi_1$, $\pi_2$, and $\pi_3$ are obtained from
\[
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}
= \begin{pmatrix}
12/13 & 0 & 3/13 \\
1/13 & 3/4 & 0 \\
0 & 1/4 & 10/13
\end{pmatrix}
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}
\]
and $\pi_1 + \pi_2 + \pi_3 = 1$, which gives $\pi_1 = 39/64 \approx 0.61$, $\pi_2 = 12/64 \approx 0.19$, and $\pi_3 = 13/64 \approx 0.20$.

14. For $1 \le i \le 9$, let $\pi_i$ be the probability that the mouse is in cell $i$ at a random time in the future. Then the $\pi_i$'s satisfy
\[
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \\ \pi_4 \\ \pi_5 \\ \pi_6 \\ \pi_7 \\ \pi_8 \\ \pi_9 \end{pmatrix}
= \begin{pmatrix}
0 & 1/3 & 0 & 1/3 & 0 & 0 & 0 & 0 & 0 \\
1/2 & 0 & 1/2 & 0 & 1/4 & 0 & 0 & 0 & 0 \\
0 & 1/3 & 0 & 0 & 0 & 1/3 & 0 & 0 & 0 \\
1/2 & 0 & 0 & 0 & 1/4 & 0 & 1/2 & 0 & 0 \\
0 & 1/3 & 0 & 1/3 & 0 & 1/3 & 0 & 1/3 & 0 \\
0 & 0 & 1/2 & 0 & 1/4 & 0 & 0 & 0 & 1/2 \\
0 & 0 & 0 & 1/3 & 0 & 0 & 0 & 1/3 & 0 \\
0 & 0 & 0 & 0 & 1/4 & 0 & 1/2 & 0 & 1/2 \\
0 & 0 & 0 & 0 & 0 & 1/3 & 0 & 1/3 & 0
\end{pmatrix}
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \\ \pi_4 \\ \pi_5 \\ \pi_6 \\ \pi_7 \\ \pi_8 \\ \pi_9 \end{pmatrix}.
\]
Solving this system of equations along with $\sum_{i=1}^{9} \pi_i = 1$, we obtain
\[ \pi_1 = \pi_3 = \pi_7 = \pi_9 = 1/12, \qquad \pi_2 = \pi_4 = \pi_6 = \pi_8 = 1/8, \qquad \pi_5 = 1/6. \]

15. Let $X_n$ denote the number of balls in urn I after $n$ transfers. The stochastic process $\{X_n : n = 0, 1, \ldots\}$ is a Markov chain with state space $\{0, 1, \ldots, 5\}$ and transition probability matrix
\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
1/5 & 0 & 4/5 & 0 & 0 & 0 \\
0 & 2/5 & 0 & 3/5 & 0 & 0 \\
0 & 0 & 3/5 & 0 & 2/5 & 0 \\
0 & 0 & 0 & 4/5 & 0 & 1/5 \\
0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]
Clearly, $\{X_n : n = 0, 1, \ldots\}$ is an irreducible recurrent Markov chain; since it is finite-state, it is positive recurrent. However, $\{X_n : n = 0, 1, \ldots\}$ is not aperiodic, and the period of each state is 2. Hence the limiting probabilities do not exist. For $0 \le i \le 5$, let $\pi_i$ be the fraction of time urn I contains $i$ balls. Then, with this interpretation, the $\pi_i$'s satisfy the following equations:
\[
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \pi_4 \\ \pi_5 \end{pmatrix}
= \begin{pmatrix}
0 & 1/5 & 0 & 0 & 0 & 0 \\
1 & 0 & 2/5 & 0 & 0 & 0 \\
0 & 4/5 & 0 & 3/5 & 0 & 0 \\
0 & 0 & 3/5 & 0 & 4/5 & 0 \\
0 & 0 & 0 & 2/5 & 0 & 1 \\
0 & 0 & 0 & 0 & 1/5 & 0
\end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \pi_4 \\ \pi_5 \end{pmatrix},
\qquad \sum_{i=0}^{5} \pi_i = 1.
\]
Solving these equations, we obtain
\[ \pi_0 = \pi_5 = 1/32, \qquad \pi_1 = \pi_4 = 5/32, \qquad \pi_2 = \pi_3 = 10/32. \]
Therefore, the fraction of time an urn is empty is $\pi_0 + \pi_5 = 2/32 = 1/16$. Hence the expected number of balls transferred between two consecutive times that an urn becomes empty is 16.

16. Solution 1: Let $X_n$ be the number of balls in urn I immediately before the $n$th game begins. Then $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain with state space $\{0, 1, \ldots, 7\}$ and transition probability matrix
\[
P = \begin{pmatrix}
3/4 & 1/4 & 0 & 0 & 0 & 0 & 0 & 0 \\
1/4 & 1/2 & 1/4 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/4 & 1/2 & 1/4 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/4 & 1/2 & 1/4 & 0 & 0 & 0 \\
0 & 0 & 0 & 1/4 & 1/2 & 1/4 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/4 & 1/2 & 1/4 & 0 \\
0 & 0 & 0 & 0 & 0 & 1/4 & 1/2 & 1/4 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/4 & 3/4
\end{pmatrix}.
\]
Since the transition probability matrix is doubly stochastic (that is, the sum of each column is also 1), for $i = 0, 1, \ldots, 7$, $\pi_i$, the long-run probability that the number of balls in urn I immediately before a game begins is $i$, equals $1/8$ (see Example 12.35). This implies that the long-run probability mass function of the number of balls in urn I or II is $1/8$ for $i = 0, 1, \ldots, 7$.

Solution 2: Let $X_n$ be the number of balls in the urn selected at step 1 of the $n$th game. Then $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain with state space $\{0, 1, \ldots, 7\}$ and transition probability matrix
\[
P = \begin{pmatrix}
1/2 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2 \\
1/4 & 1/4 & 0 & 0 & 0 & 0 & 1/4 & 1/4 \\
0 & 1/4 & 1/4 & 0 & 0 & 1/4 & 1/4 & 0 \\
0 & 0 & 1/4 & 1/4 & 1/4 & 1/4 & 0 & 0 \\
0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 \\
0 & 0 & 1/4 & 1/4 & 1/4 & 1/4 & 0 & 0 \\
0 & 1/4 & 1/4 & 0 & 0 & 1/4 & 1/4 & 0 \\
1/4 & 1/4 & 0 & 0 & 0 & 0 & 1/4 & 1/4
\end{pmatrix}.
\]
Since the transition probability matrix is doubly stochastic (that is, the sum of each column is also 1), for $i = 0, 1, \ldots, 7$, $\pi_i$, the long-run probability that the number of balls in the urn selected at step 1 of a game is $i$, equals $1/8$ (see Example 12.35). This implies that the long-run probability mass function of the number of balls in urn I or II is $1/8$ for $i = 0, 1, \ldots, 7$.

17. For $i \ge 0$, state $i$ is directly accessible from 0. On the other hand, $i$ is accessible from $i + 1$. These two facts make it possible for all states to communicate with each other. Therefore, the Markov chain has only one class. Since 0 is recurrent and aperiodic (note that $p_{00} > 0$ makes 0 aperiodic), all states are recurrent and aperiodic. Let $\pi_k$ be the long-run probability that a computer selected at the end of a semester will last at least $k$ additional semesters. Solving
\[
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \vdots \end{pmatrix}
= \begin{pmatrix}
p_1 & 1 & 0 & 0 & \cdots \\
p_2 & 0 & 1 & 0 & \cdots \\
p_3 & 0 & 0 & 1 & \cdots \\
\vdots & & & &
\end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \vdots \end{pmatrix}
\]
along with $\sum_{i=0}^{\infty} \pi_i = 1$, we obtain
\[ \pi_0 = \frac{1}{1 + \sum_{i=1}^{\infty} (1 - p_1 - p_2 - \cdots - p_i)}, \]
\[ \pi_k = \frac{1 - p_1 - p_2 - \cdots - p_k}{1 + \sum_{i=1}^{\infty} (1 - p_1 - p_2 - \cdots - p_i)}, \qquad k \ge 1. \]

18. Let DN denote the state at which the last movie Mr. Gorfin watched was not a drama, but the one before that was a drama. Define DD, ND, and NN similarly, and label the states DD, DN, ND, and NN by 0, 1, 2, and 3, respectively. Let $X_n = 0$ if the $n$th and $(n - 1)$st movies Mr. Gorfin watched were both dramas. Define $X_n = 1$, 2, and 3 similarly. Then $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain with state space $\{0, 1, 2, 3\}$ and transition probability matrix
\[
P = \begin{pmatrix}
7/8 & 1/8 & 0 & 0 \\
0 & 0 & 1/2 & 1/2 \\
1/2 & 1/2 & 0 & 0 \\
0 & 0 & 1/8 & 7/8
\end{pmatrix}.
\]
(a) If the first two movies Mr.
Gorfin watched last weekend were dramas, the probability that the fourth one is a drama is $p_{00}^{2} + p_{02}^{2}$. Since
\[
P^2 = \begin{pmatrix}
49/64 & 7/64 & 1/16 & 1/16 \\
1/4 & 1/4 & 1/16 & 7/16 \\
7/16 & 1/16 & 1/4 & 1/4 \\
1/16 & 1/16 & 7/64 & 49/64
\end{pmatrix},
\]
the desired probability is $(49/64) + (1/16) = 53/64$.

(b) Let $\pi_0$ denote the long-run probability that Mr. Gorfin watches two dramas in a row. Define $\pi_1$, $\pi_2$, and $\pi_3$ similarly. We have that
\[
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}
= \begin{pmatrix}
7/8 & 0 & 1/2 & 0 \\
1/8 & 0 & 1/2 & 0 \\
0 & 1/2 & 0 & 1/8 \\
0 & 1/2 & 0 & 7/8
\end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}.
\]
Solving this system along with $\pi_0 + \pi_1 + \pi_2 + \pi_3 = 1$, we obtain $\pi_0 = 2/5$, $\pi_1 = 1/10$, $\pi_2 = 1/10$, and $\pi_3 = 2/5$. Hence the probability that Mr. Gorfin watches two dramas in a row is 2/5.

19. Clearly,
\[
X_{n+1} = \begin{cases}
0 & \text{if the $(n + 1)$st outcome is 6} \\
1 + X_n & \text{otherwise.}
\end{cases}
\]
This relation shows that $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain. Its transition probability matrix is given by
\[
P = \begin{pmatrix}
1/6 & 5/6 & 0 & 0 & 0 & \cdots \\
1/6 & 0 & 5/6 & 0 & 0 & \cdots \\
1/6 & 0 & 0 & 5/6 & 0 & \cdots \\
1/6 & 0 & 0 & 0 & 5/6 & \cdots \\
\vdots & & & & &
\end{pmatrix}.
\]
It is readily seen that all states communicate with 0. Therefore, by transitivity of the communication property, all states communicate with each other, and the Markov chain is irreducible. Clearly, 0 is recurrent. Since $p_{00} > 0$, it is aperiodic as well. Hence all states are recurrent and aperiodic. On the other hand, starting at 0, the expected number of transitions until the process returns to 0 is 6. This is because the number of tosses until the next 6 is obtained is a geometric random variable with probability of success $p = 1/6$, and hence expected value $1/p = 6$. Therefore, 0, and hence all other states, are positive recurrent. Next, a simple probabilistic argument shows that
\[ \pi_i = \Big(\frac{5}{6}\Big)^i \Big(\frac{1}{6}\Big), \qquad i = 0, 1, 2, \ldots. \]
This can also be shown by solving the following system of equations:
\[
\begin{cases}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \vdots \end{pmatrix}
= \begin{pmatrix}
1/6 & 1/6 & 1/6 & 1/6 & \cdots \\
5/6 & 0 & 0 & 0 & \cdots \\
0 & 5/6 & 0 & 0 & \cdots \\
0 & 0 & 5/6 & 0 & \cdots \\
\vdots & & & &
\end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \vdots \end{pmatrix} \\[2ex]
\pi_0 + \pi_1 + \pi_2 + \cdots = 1.
\end{cases}
\]

20. (a) Let
\[
X_n = \begin{cases}
1 & \text{if Alberto wins the $n$th game} \\
0 & \text{if Alberto loses the $n$th game.}
\end{cases}
\]
Then $\{X_n : n = 1, 2, \ldots\}$ is a Markov chain with state space $\{0, 1\}$. Its transition probability matrix is
\[ P = \begin{pmatrix} 1 - p & p \\ p & 1 - p \end{pmatrix}. \]
Using induction, we will now show that
\[
P^{(n)} = P^n = \begin{pmatrix}
\frac{1}{2} + \frac{1}{2}(1 - 2p)^n & \frac{1}{2} - \frac{1}{2}(1 - 2p)^n \\[1ex]
\frac{1}{2} - \frac{1}{2}(1 - 2p)^n & \frac{1}{2} + \frac{1}{2}(1 - 2p)^n
\end{pmatrix}.
\]
Clearly, for $n = 1$, $P^{(1)} = P$. Suppose that
\[
P^{(n)} = \begin{pmatrix}
\frac{1}{2} + \frac{1}{2}(1 - 2p)^n & \frac{1}{2} - \frac{1}{2}(1 - 2p)^n \\[1ex]
\frac{1}{2} - \frac{1}{2}(1 - 2p)^n & \frac{1}{2} + \frac{1}{2}(1 - 2p)^n
\end{pmatrix}.
\]
We will show that
\[
P^{(n+1)} = \begin{pmatrix}
\frac{1}{2} + \frac{1}{2}(1 - 2p)^{n+1} & \frac{1}{2} - \frac{1}{2}(1 - 2p)^{n+1} \\[1ex]
\frac{1}{2} - \frac{1}{2}(1 - 2p)^{n+1} & \frac{1}{2} + \frac{1}{2}(1 - 2p)^{n+1}
\end{pmatrix}.
\]
To do so, note that
\[
P^{(n+1)} = \begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix}
\begin{pmatrix} p_{00}^{n} & p_{01}^{n} \\ p_{10}^{n} & p_{11}^{n} \end{pmatrix}
= \begin{pmatrix}
p_{00}p_{00}^{n} + p_{01}p_{10}^{n} & p_{00}p_{01}^{n} + p_{01}p_{11}^{n} \\
p_{10}p_{00}^{n} + p_{11}p_{10}^{n} & p_{10}p_{01}^{n} + p_{11}p_{11}^{n}
\end{pmatrix}.
\]
Thus
\[
\begin{aligned}
p_{11}^{n+1} &= p_{10}p_{01}^{n} + p_{11}p_{11}^{n} = p\Big[\frac{1}{2} - \frac{1}{2}(1 - 2p)^n\Big] + (1 - p)\Big[\frac{1}{2} + \frac{1}{2}(1 - 2p)^n\Big] \\
&= \frac{1}{2}\big[p + (1 - p)\big] + \frac{1}{2}(1 - 2p)^n\big[-p + (1 - p)\big] = \frac{1}{2} + \frac{1}{2}(1 - 2p)^{n+1}.
\end{aligned}
\]
This establishes what we wanted to show. The proof that $p_{00}^{n+1} = \frac{1}{2} + \frac{1}{2}(1 - 2p)^{n+1}$ is identical to what we just showed. We have
\[ p_{01}^{n+1} = 1 - p_{00}^{n+1} = 1 - \Big[\frac{1}{2} + \frac{1}{2}(1 - 2p)^{n+1}\Big] = \frac{1}{2} - \frac{1}{2}(1 - 2p)^{n+1}. \]
Similarly,
\[ p_{10}^{n+1} = 1 - p_{11}^{n+1} = \frac{1}{2} - \frac{1}{2}(1 - 2p)^{n+1}. \]

(b) Let $\pi_0$ and $\pi_1$ be the long-run probabilities that Alberto loses and wins a game, respectively.
Then
\[
\begin{pmatrix} \pi_0 \\ \pi_1 \end{pmatrix}
= \begin{pmatrix} 1 - p & p \\ p & 1 - p \end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \end{pmatrix}
\]
and $\pi_0 + \pi_1 = 1$ imply that $\pi_0 = \pi_1 = 1/2$. Therefore, the expected number of games Alberto will play between two consecutive wins is $1/\pi_1 = 2$.

21. For each $j \ge 0$, $\lim_{n\to\infty} p^n_{ij}$ exists and is independent of $i$ if the following system of equations, in $\pi_0$, $\pi_1$, ..., has a unique solution:
\[
\begin{cases}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \vdots \end{pmatrix}
=
\begin{pmatrix}
1 - p & 1 - p & 0 & 0 & 0 & 0 & \cdots \\
p & 0 & 1 - p & 0 & 0 & 0 & \cdots \\
0 & p & 0 & 1 - p & 0 & 0 & \cdots \\
0 & 0 & p & 0 & 1 - p & 0 & \cdots \\
\vdots & & & & & &
\end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \vdots \end{pmatrix} \\[1ex]
\pi_0 + \pi_1 + \pi_2 + \cdots = 1.
\end{cases}
\]
From the matrix equation, we obtain
\[
\pi_i = \left(\frac{p}{1 - p}\right)^i \pi_0, \quad i = 0, 1, \ldots.
\]
For these quantities to satisfy $\sum_{i=0}^\infty \pi_i = 1$, we need the geometric series $\sum_{i=0}^\infty \left(\frac{p}{1 - p}\right)^i$ to converge. Hence we must have $p < 1 - p$, or $p < 1/2$. Therefore, for $p < 1/2$, this irreducible, aperiodic Markov chain is positive recurrent and has limiting probabilities. Note that, for $p < 1/2$,
\[
\pi_0 \sum_{i=0}^\infty \left(\frac{p}{1 - p}\right)^i = 1
\quad \text{yields} \quad
\pi_0 = 1 - \frac{p}{1 - p}.
\]
Thus the limiting probabilities are
\[
\pi_i = \left(\frac{p}{1 - p}\right)^i \left(1 - \frac{p}{1 - p}\right), \quad i = 0, 1, 2, \ldots.
\]

22. Let $Y_n$ be Carl's fortune after the $n$th game. Let $X_n$ be Stan's fortune after the $n$th game. Let $Z_n = Y_n - X_n$. Then $\{Z_n : n = 0, 1, \ldots\}$ is a random walk with state space $\{0, \pm 2, \pm 4, \ldots\}$. We have that $Z_0 = 0$, and at each step the process moves either two units to the right with probability 0.46 or two units to the left with probability 0.54. Let $A$ be the event that, starting at 0, the random walk will eventually enter 2; $P(A)$ is the desired quantity.
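As a numerical cross-check of this setup (a sketch, not part of the original solution), "eventually entering 2" can be approximated by "entering 2 before falling to a far-away level on the left". The function name `hit_probability` and the cutoff parameter `depth` are assumptions introduced here for illustration. Positions are measured in units of one step (two dollars), so state 2 of the walk is position +1 in the code. The code solves the first-step relation h(k) = 0.46 h(k+1) + 0.54 h(k-1) by a forward recursion from the left boundary.

```python
def hit_probability(p_up: float, depth: int) -> float:
    """Probability that a walk started at 0 reaches +1 before -depth.

    Positions are in units of one step. The first-step relation
    h(k) = p_up * h(k+1) + (1 - p_up) * h(k-1), with h(-depth) = 0 and h(1) = 1,
    is solved by running the recursion forward from the left boundary with an
    arbitrary scale and normalizing at the end.
    """
    q_down = 1.0 - p_up
    prev, cur = 0.0, 1.0              # g(-depth) = 0, g(-depth + 1) = 1 (arbitrary scale)
    for _ in range(depth - 1):        # advance until cur holds g(0)
        prev, cur = cur, (cur - q_down * prev) / p_up
    nxt = (cur - q_down * prev) / p_up  # g(1)
    return cur / nxt                  # h(0) = g(0) / g(1), because h(1) = 1


if __name__ == "__main__":
    # depth = 200 is an illustrative cutoff; returning from that far left is negligible.
    print(hit_probability(0.46, 200))
```

With a large cutoff, the returned value should agree closely with the root of the quadratic equation for $P(A)$ obtained in the solution.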
By the law of total probability,
\[
P(A) = P(A \mid Z_1 = 2)P(Z_1 = 2) + P(A \mid Z_1 = -2)P(Z_1 = -2) = 1 \cdot (0.46) + \bigl[P(A)\bigr]^2 \cdot (0.54).
\]
To show that $P(A \mid Z_1 = -2) = \bigl[P(A)\bigr]^2$, let $E$ be the event of, starting from $-2$, eventually entering 0. It should be clear that $P(E) = P(A)$. By independence of $E$ and $A$, we have
\[
P(A \mid Z_1 = -2) = P(EA) = P(E)P(A) = \bigl[P(A)\bigr]^2.
\]
We have shown that $P(A)$, the quantity we are interested in, satisfies
\[
(0.54)\bigl[P(A)\bigr]^2 - P(A) + 0.46 = 0.
\]
This is a quadratic equation in $P(A)$. Its roots are 1 and $23/27$; solving it gives $P(A) = 23/27 \approx 0.85$.

23. We will use induction on $m$. For $m = 1$, the relation is simply the Markovian property, which is true. Suppose that the relation is valid for $m - 1$. We will show that it is also valid for $m$. We have
\begin{align*}
&P(X_{n+m} = j \mid X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n) \\
&\quad = \sum_{i \in S} P(X_{n+m} = j \mid X_0 = i_0, \ldots, X_n = i_n, X_{n+m-1} = i)\, P(X_{n+m-1} = i \mid X_0 = i_0, \ldots, X_n = i_n) \\
&\quad = \sum_{i \in S} P(X_{n+m} = j \mid X_{n+m-1} = i)\, P(X_{n+m-1} = i \mid X_n = i_n) \\
&\quad = \sum_{i \in S} P(X_{n+m} = j \mid X_{n+m-1} = i, X_n = i_n)\, P(X_{n+m-1} = i \mid X_n = i_n) \\
&\quad = P(X_{n+m} = j \mid X_n = i_n),
\end{align*}
where the following relations are valid from the definition of a Markov chain (given the present state, the process is independent of the past):
\begin{align*}
P(X_{n+m} = j \mid X_0 = i_0, \ldots, X_n = i_n, X_{n+m-1} = i) &= P(X_{n+m} = j \mid X_{n+m-1} = i), \\
P(X_{n+m} = j \mid X_{n+m-1} = i) &= P(X_{n+m} = j \mid X_{n+m-1} = i, X_n = i_n).
\end{align*}

24. Let $(0, 0)$, the origin, be denoted by $O$. It should be clear that, for all $n \ge 0$, $P^{2n+1}_{OO} = 0$. Now, for $n \ge 1$, let $Z_1$, $Z_2$, $Z_3$, and $Z_4$ be the number of transitions to the right, left, up, and down, respectively. The joint probability mass function of $Z_1$, $Z_2$, $Z_3$, and $Z_4$ is multinomial. We have
\begin{align*}
P^{2n}_{OO} &= \sum_{i=0}^n P(Z_1 = i, Z_2 = i, Z_3 = n - i, Z_4 = n - i) \\
&= \sum_{i=0}^n \frac{(2n)!}{i!\, i!\, (n - i)!\, (n - i)!} \left(\frac{1}{4}\right)^i \left(\frac{1}{4}\right)^i \left(\frac{1}{4}\right)^{n-i} \left(\frac{1}{4}\right)^{n-i} \\
&= \sum_{i=0}^n \frac{(2n)!}{n!\, n!} \cdot \frac{n!}{i!\, (n - i)!} \cdot \frac{n!}{i!\, (n - i)!} \left(\frac{1}{4}\right)^{2n}
= \left(\frac{1}{4}\right)^{2n} \binom{2n}{n} \sum_{i=0}^n \binom{n}{i}^2.
\end{align*}
By Example 2.28,
\[
\sum_{i=0}^n \binom{n}{i}^2 = \binom{2n}{n}.
\]
Thus
\[
P^{2n}_{OO} = \left(\frac{1}{4}\right)^{2n} \binom{2n}{n}^2.
\]
Now, by Theorem 2.7 (Stirling's formula),
\[
\left(\frac{1}{4}\right)^{2n} \binom{2n}{n}^2
= \left(\frac{1}{4}\right)^{2n} \left[\frac{(2n)!}{n!\, n!}\right]^2
\sim \frac{1}{4^{2n}} \left[\frac{\sqrt{4\pi n}\,(2n)^{2n} e^{-2n}}{\bigl(\sqrt{2\pi n}\, n^n e^{-n}\bigr)^2}\right]^2
= \frac{1}{\pi n}.
\]
Therefore,
\[
\sum_{n=1}^\infty P^{2n}_{OO} = \sum_{n=1}^\infty \left(\frac{1}{4}\right)^{2n} \binom{2n}{n}^2
\]
is convergent if and only if $\sum_{n=1}^\infty \frac{1}{\pi n}$ is convergent. Since $\frac{1}{\pi}\sum_{n=1}^\infty \frac{1}{n}$ is divergent, $\sum_{n=1}^\infty P^{2n}_{OO}$ is divergent, showing that the state $(0, 0)$ is recurrent.

25. Clearly, $P(X_{n+1} = 1 \mid X_n = 0) = 1$. For $i \ge 1$, given $X_n = i$, either $X_{n+1} = i + 1$, in which case we say that a transition to the right has occurred, or $X_{n+1} = i - 1$, in which case we say that a transition to the left has occurred. For $i \ge 1$, given $X_n = i$, when the $n$th transition occurs, let $S$ be the remaining service time of the customer being served or the service time of a new customer, whichever applies. Let $T$ be the time from the $n$th transition until the next arrival. By the memoryless property of exponential random variables, $S$ and $T$ are exponential random variables with parameters $\mu$ and $\lambda$, respectively.
For $i \ge 1$,
\begin{align*}
P(X_{n+1} = i + 1 \mid X_n = i) = P(T < S)
&= \int_0^\infty P(S > T \mid T = t)\, \lambda e^{-\lambda t}\, dt
= \int_0^\infty P(S > t)\, \lambda e^{-\lambda t}\, dt \\
&= \int_0^\infty e^{-\mu t} \cdot \lambda e^{-\lambda t}\, dt = \frac{\lambda}{\lambda + \mu}.
\end{align*}
Therefore,
\[
P(X_{n+1} = i - 1 \mid X_n = i) = P(T > S) = 1 - \frac{\lambda}{\lambda + \mu} = \frac{\mu}{\lambda + \mu}.
\]
These calculations show that knowing $X_n$, the next transition does not depend on the values of $X_j$ for $j < n$. [...] $p^n_{00} > 0$ only for positive even integers. Since the greatest common divisor of such integers is 2, the period of 0, and hence the period of all other states, is 2.

26. The $ij$th element of $PQ$ is the product of the $i$th row of $P$ with the $j$th column of $Q$. Thus it is $\sum_\ell p_{i\ell} q_{\ell j}$. To show that the sum of each row of $PQ$ is 1, we will now calculate the sum of the elements of the $i$th row of $PQ$, which is
\[
\sum_j \sum_\ell p_{i\ell} q_{\ell j} = \sum_\ell \sum_j p_{i\ell} q_{\ell j} = \sum_\ell \Bigl(p_{i\ell} \sum_j q_{\ell j}\Bigr) = \sum_\ell p_{i\ell} = 1.
\]
Note that $\sum_j q_{\ell j} = 1$ and $\sum_\ell p_{i\ell} = 1$ since the sum of the elements of the $\ell$th row of $Q$ and the sum of the elements of the $i$th row of $P$ are 1.

27. If state $j$ is accessible from state $i$, there is a path $i = i_1, i_2, i_3, \ldots, i_n = j$ from $i$ to $j$. If $n \le K$, we are done. If $n > K$, by the pigeonhole principle, there must exist $k$ and $\ell$ ($k < \ell$) [...]

28. Let $I = \{n \ge 1 : p^n_{ii} > 0\}$ and $J = \{n \ge 1 : p^n_{jj} > 0\}$. Then $d(i)$, the period of $i$, is the greatest common divisor of the elements of $I$, and $d(j)$, the period of $j$, is the greatest common divisor of the elements of $J$. If $d(i) \ne d(j)$, then one of $d(i)$ and $d(j)$ is smaller than the other one. We will prove the theorem for the case in which $d(j) < d(i)$. [...] $p^n_{ij} > 0$ and $p^m_{ji} > 0$. Let $k \in J$; then $p^k_{jj} > 0$. We have
\[
p^{n+m}_{ii} \ge p^n_{ij}\, p^m_{ji} > 0
\quad \text{and} \quad
p^{n+k+m}_{ii} \ge p^n_{ij}\, p^k_{jj}\, p^m_{ji} > 0.
\]
By these inequalities, we have that $d(i)$ divides $n + m$ and $n + k + m$.
Hence it divides $(n + k + m) - (n + m) = k$. We have shown that, if $k \in J$, then $d(i)$ divides $k$. This means that $d(i)$ divides all members of $J$. It contradicts the facts that $d(j)$ is the greatest common divisor of $J$ and $d(j) < d(i)$. [...]

29. [...] $Z > i$ occurs if and only if Liz has not played with Bob since $i$ Sundays ago, and the earliest she will play with him is next Sunday. Now the probability is $i/k$ that Liz will play with Bob if the last time they played was $i$ Sundays ago; hence
\[
P(Z > i) = 1 - \frac{i}{k}, \quad i = 1, 2, \ldots, k - 1.
\]
Using this fact, for $0 \le i \le k - 2$, we obtain
\begin{align*}
p_{i(i+1)} &= P(X_{n+1} = i + 1 \mid X_n = i) = \frac{P(X_n = i, X_{n+1} = i + 1)}{P(X_n = i)} = \frac{P(Z > i + 1)}{P(Z > i)} = \frac{1 - \frac{i + 1}{k}}{1 - \frac{i}{k}} = \frac{k - i - 1}{k - i}, \\
p_{i0} &= P(X_{n+1} = 0 \mid X_n = i) = 1 - \frac{k - i - 1}{k - i} = \frac{1}{k - i}, \\
p_{(k-1)0} &= P(X_{n+1} = 0 \mid X_n = k - 1) = 1.
\end{align*}
Hence the transition probability matrix of $\{X_n : n = 1, 2, \ldots\}$ is given by
\[
P = \begin{pmatrix}
\frac{1}{k} & 1 - \frac{1}{k} & 0 & 0 & 0 & \cdots & 0 & 0 \\
\frac{1}{k-1} & 0 & 1 - \frac{1}{k-1} & 0 & 0 & \cdots & 0 & 0 \\
\frac{1}{k-2} & 0 & 0 & 1 - \frac{1}{k-2} & 0 & \cdots & 0 & 0 \\
\frac{1}{k-3} & 0 & 0 & 0 & 1 - \frac{1}{k-3} & \cdots & 0 & 0 \\
\vdots & & & & & & & \\
\frac{1}{2} & 0 & 0 & 0 & 0 & \cdots & 0 & \frac{1}{2} \\
1 & 0 & 0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}.
\]
It should be clear that the Markov chain under consideration is irreducible, aperiodic, and positive recurrent. For $0 \le i \le k - 1$, let $\pi_i$ be the long-run probability that Liz says no to Bob for $i$ consecutive weeks. $\pi_0$, $\pi_1$, ..., $\pi_{k-1}$ are obtained from solving the following matrix equation along with $\sum_{i=0}^{k-1} \pi_i = 1$:
\[
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \vdots \\ \pi_{k-2} \\ \pi_{k-1} \end{pmatrix}
=
\begin{pmatrix}
\frac{1}{k} & \frac{1}{k-1} & \frac{1}{k-2} & \frac{1}{k-3} & \cdots & \frac{1}{2} & 1 \\
1 - \frac{1}{k} & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 - \frac{1}{k-1} & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 - \frac{1}{k-2} & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & 1 - \frac{1}{k-3} & \cdots & 0 & 0 \\
\vdots & & & & & & \\
0 & 0 & 0 & 0 & \cdots & \frac{1}{2} & 0
\end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \pi_3 \\ \vdots \\ \pi_{k-2} \\ \pi_{k-1} \end{pmatrix}.
\]
The matrix equation gives
\[
\pi_i = \frac{k - i}{k}\, \pi_0, \quad i = 1, 2, \ldots, k - 1.
\]
Using $\sum_{i=0}^{k-1} \pi_i = 1$, we obtain
\[
\pi_0 \sum_{i=0}^{k-1} \frac{k - i}{k} = 1
\quad \text{or, equivalently,} \quad
\frac{\pi_0}{k} \Bigl[\sum_{i=0}^{k-1} k - \sum_{i=0}^{k-1} i\Bigr] = 1.
\]
This implies that
\[
\frac{\pi_0}{k} \Bigl[k^2 - \frac{(k - 1)k}{2}\Bigr] = 1,
\]
which gives $\pi_0 = 2/(k + 1)$. Hence
\[
\pi_i = \frac{2(k - i)}{k(k + 1)}, \quad i = 0, 1, 2, \ldots, k - 1.
\]

30. Let $X_i$ be the amount of money player A has after $i$ games. Clearly, $X_0 = a$ and $\{X_n : n = 0, 1, \ldots\}$ is a Markov chain with state space $\{0, 1, \ldots, a, a + 1, \ldots, a + b\}$. For $0 \le i \le a + b$, let $m_i = E(T \mid X_0 = i)$. Let $F$ be the event that A wins the first game. Then, for $1 \le i \le a + b - 1$,
\[
E(T \mid X_0 = i) = E(T \mid X_0 = i, F)P(F \mid X_0 = i) + E(T \mid X_0 = i, F^c)P(F^c \mid X_0 = i).
\]
This gives
\[
m_i = (1 + m_{i+1})\frac{1}{2} + (1 + m_{i-1})\frac{1}{2}, \quad 1 \le i \le a + b - 1,
\]
or, equivalently,
\[
2m_i = 2 + m_{i+1} + m_{i-1}, \quad 1 \le i \le a + b - 1.
\]
Now rewrite this relation as
\[
m_{i+1} - m_i = -2 + m_i - m_{i-1}, \quad 1 \le i \le a + b - 1,
\]
and, for $1 \le i \le a + b$, let $y_i = m_i - m_{i-1}$. Then
\[
y_{i+1} = -2 + y_i, \quad 1 \le i \le a + b - 1,
\]
and, for $1 \le i \le a + b$,
\[
m_i = y_1 + y_2 + \cdots + y_i.
\]
Clearly, $m_0 = 0$, $m_{a+b} = 0$, $y_1 = m_1$, and
\begin{align*}
y_2 &= -2 + y_1 = -2 + m_1, \\
y_3 &= -2 + y_2 = -2 + (-2 + m_1) = -4 + m_1, \\
&\;\;\vdots \\
y_i &= -2(i - 1) + m_1, \quad 1 \le i \le a + b.
\end{align*}
Hence, for $1 \le i \le a + b$,
\[
m_i = y_1 + y_2 + \cdots + y_i = i m_1 - 2\bigl[1 + 2 + \cdots + (i - 1)\bigr] = i m_1 - i(i - 1) = i(m_1 - i + 1).
\]
This and $m_{a+b} = 0$ imply that
\[
(a + b)(m_1 - a - b + 1) = 0,
\]
or $m_1 = a + b - 1$. Therefore,
\[
m_i = i(a + b - i),
\]
and hence the desired quantity is $E(T \mid X_0 = a) = m_a = ab$.

31. Let $q$ be a positive solution of the equation $x = \sum_{i=0}^\infty \alpha_i x^i$. Then $q = \sum_{i=0}^\infty \alpha_i q^i$. We will show that, for all $n \ge 0$, $P(X_n = 0) \le q$. This implies that
\[
p = \lim_{n\to\infty} P(X_n = 0) \le q.
\]
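The bound can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical offspring distribution $\alpha = (1/4, 1/4, 1/2)$ chosen here only for illustration (it is not from the exercise). For this choice the positive solutions of $x = \sum \alpha_i x^i$ are $q = 1/2$ and $q = 1$. The code computes $P(X_n = 0)$ from the recursion $P(X_{n+1} = 0) = \sum_i \alpha_i \bigl[P(X_n = 0)\bigr]^i$, the same relation the induction step of the solution relies on, starting from $P(X_0 = 0) = 0$. The function name `extinction_iterates` is an assumption.

```python
def extinction_iterates(alpha, n_steps):
    """Return [P(X_0 = 0), ..., P(X_{n_steps} = 0)] for a branching process
    started from one individual, where alpha[i] = P(an individual has i offspring)."""

    def phi(x):
        # Offspring generating function: sum_i alpha_i * x**i
        return sum(a * x**i for i, a in enumerate(alpha))

    probs = [0.0]                      # P(X_0 = 0) = 0, since X_0 = 1
    for _ in range(n_steps):
        probs.append(phi(probs[-1]))   # P(X_{n+1} = 0) = phi(P(X_n = 0))
    return probs


if __name__ == "__main__":
    alpha = [0.25, 0.25, 0.5]          # hypothetical offspring distribution
    probs = extinction_iterates(alpha, 200)
    print(probs[:4], probs[-1])
```

For this illustrative distribution, every iterate stays at or below the smaller positive solution $1/2$, and the sequence approaches it from below.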
To establish that $P(X_n = 0) \le q$, we use induction. For $n = 0$, $P(X_0 = 0) = 0 \le q$ is trivially true. Suppose that $P(X_n = 0) \le q$. We have
\[
P(X_{n+1} = 0) = \sum_{i=0}^\infty P(X_{n+1} = 0 \mid X_1 = i)P(X_1 = i).
\]
It should be clear that
\[
P(X_{n+1} = 0 \mid X_1 = i) = \bigl[P(X_n = 0 \mid X_0 = 1)\bigr]^i.
\]
However, since $P(X_0 = 1) = 1$,
\[
P(X_n = 0 \mid X_0 = 1) = P(X_n = 0).
\]
Therefore,
\[
P(X_{n+1} = 0 \mid X_1 = i) = \bigl[P(X_n = 0)\bigr]^i.
\]
Thus
\[
P(X_{n+1} = 0) = \sum_{i=0}^\infty \bigl[P(X_n = 0)\bigr]^i P(X_1 = i) \le \sum_{i=0}^\infty q^i \alpha_i = q.
\]
This establishes the theorem.

32. Multiplying $P$ successively, we obtain
\begin{align*}
p_{12} &= \frac{1}{13}, \\
p^2_{12} &= \left(\frac{9}{13}\right)\left(\frac{1}{13}\right) + \frac{1}{13}, \\
p^3_{12} &= \left(\frac{9}{13}\right)^2\left(\frac{1}{13}\right) + \left(\frac{9}{13}\right)\left(\frac{1}{13}\right) + \frac{1}{13},
\end{align*}
and in general,
\[
p^n_{12} = \frac{1}{13}\left[\left(\frac{9}{13}\right)^{n-1} + \left(\frac{9}{13}\right)^{n-2} + \cdots + 1\right]
= \frac{1}{13} \cdot \frac{1 - \left(\frac{9}{13}\right)^n}{1 - \frac{9}{13}}
= \frac{1}{4}\left[1 - \left(\frac{9}{13}\right)^n\right].
\]
Hence the desired probability is $\lim_{n\to\infty} p^n_{12} = 1/4$.

33. We will use induction. Let $n = 1$; then, for $1 + j - i$ to be nonnegative, we must have $i - 1 \le j$. For the inequality $\frac{1 + j - i}{2} \le 1$ to be valid, we must have $j \le i + 1$. Therefore, $i - 1 \le j \le i + 1$. But, for $j = i$, $1 + j - i$ is not even. Therefore, if $1 + j - i$ is an even nonnegative integer satisfying $\frac{1 + j - i}{2} \le 1$, we must have $j = i - 1$ or $j = i + 1$. For $j = i - 1$,
\[
\frac{n + j - i}{2} = \frac{1 + i - 1 - i}{2} = 0 \quad \text{and} \quad \frac{n - j + i}{2} = \frac{1 - i + 1 + i}{2} = 1.
\]
Hence
\[
P(X_1 = i - 1 \mid X_0 = i) = 1 - p = \binom{1}{0} p^0 (1 - p)^1,
\]
showing that the relation is valid. For $j = i + 1$,
\[
\frac{n + j - i}{2} = \frac{1 + i + 1 - i}{2} = 1 \quad \text{and} \quad \frac{n - j + i}{2} = \frac{1 - i - 1 + i}{2} = 0.
\]
Hence
\[
P(X_1 = i + 1 \mid X_0 = i) = p = \binom{1}{1} p^1 (1 - p)^0,
\]
showing that the relation is valid in this case as well. Since, for a simple random walk, the only possible transitions from $i$ are to states $i + 1$ and $i - 1$, in all other cases $P(X_1 = j \mid X_0 = i) = 0$.

We have established the theorem for $n = 1$. Now suppose that it is true for $n$. We will show it for $n + 1$ by conditioning on $X_n$:
\begin{align*}
P(X_{n+1} = j \mid X_0 = i)
&= P(X_{n+1} = j \mid X_0 = i, X_n = j - 1)P(X_n = j - 1 \mid X_0 = i) \\
&\quad + P(X_{n+1} = j \mid X_0 = i, X_n = j + 1)P(X_n = j + 1 \mid X_0 = i) \\
&= P(X_{n+1} = j \mid X_n = j - 1)P(X_n = j - 1 \mid X_0 = i) \\
&\quad + P(X_{n+1} = j \mid X_n = j + 1)P(X_n = j + 1 \mid X_0 = i) \\
&= p \cdot \binom{n}{\frac{n + j - 1 - i}{2}} p^{(n+j-1-i)/2} (1 - p)^{(n-j+1+i)/2} \\
&\quad + (1 - p) \binom{n}{\frac{n + j + 1 - i}{2}} p^{(n+j+1-i)/2} (1 - p)^{(n-j-1+i)/2} \\
&= \left[\binom{n}{\frac{n - 1 + j - i}{2}} + \binom{n}{\frac{n + 1 + j - i}{2}}\right] p^{(n+1+j-i)/2} (1 - p)^{(n+1-j+i)/2} \\
&= \binom{n + 1}{\frac{n + 1 + j - i}{2}} p^{(n+1+j-i)/2} (1 - p)^{(n+1-j+i)/2}.
\end{align*}

12.4 CONTINUOUS-TIME MARKOV CHAINS

1. By the Chapman-Kolmogorov equations,
\begin{align*}
p_{ij}(t + h) - p_{ij}(t) &= \sum_{k=0}^\infty p_{ik}(h)p_{kj}(t) - p_{ij}(t) \\
&= \sum_{k \ne i} p_{ik}(h)p_{kj}(t) + p_{ii}(h)p_{ij}(t) - p_{ij}(t) \\
&= \sum_{k \ne i} p_{ik}(h)p_{kj}(t) + p_{ij}(t)\bigl[p_{ii}(h) - 1\bigr].
\end{align*}
Thus
\[
\frac{p_{ij}(t + h) - p_{ij}(t)}{h} = \sum_{k \ne i} \frac{p_{ik}(h)}{h}\, p_{kj}(t) - p_{ij}(t)\, \frac{1 - p_{ii}(h)}{h}.
\]
Letting $h \to 0$, by (12.13) and (12.14), we have
\[
p'_{ij}(t) = \sum_{k \ne i} q_{ik}\, p_{kj}(t) - \nu_i\, p_{ij}(t).
\]

2. Clearly, $\{X(t) : t \ge 0\}$ is a continuous-time Markov chain.
Its balance equations are as follows:
\[
\begin{array}{ll}
\text{State} & \text{Input rate to} = \text{Output rate from} \\
f & \mu\pi_0 = \lambda\pi_f \\
0 & \lambda\pi_f + \mu\pi_1 + \mu\pi_2 + \mu\pi_3 = \mu\pi_0 + \lambda\pi_0 \\
1 & \lambda\pi_0 = \lambda\pi_1 + \mu\pi_1 \\
2 & \lambda\pi_1 = \lambda\pi_2 + \mu\pi_2 \\
3 & \lambda\pi_2 = \mu\pi_3.
\end{array}
\]
Solving these equations along with $\pi_f + \pi_0 + \pi_1 + \pi_2 + \pi_3 = 1$, we obtain
\[
\pi_f = \frac{\mu^2}{\lambda(\lambda + \mu)}, \quad
\pi_0 = \frac{\mu}{\lambda + \mu}, \quad
\pi_1 = \frac{\lambda\mu}{(\lambda + \mu)^2}, \quad
\pi_2 = \frac{\lambda^2\mu}{(\lambda + \mu)^3}, \quad
\pi_3 = \left(\frac{\lambda}{\lambda + \mu}\right)^3.
\]

3. The fact that $\{X(t) : t \ge 0\}$ is a continuous-time Markov chain should be clear. The balance equations are
\[
\begin{array}{ll}
\text{State} & \text{Input rate to} = \text{Output rate from} \\
(0, 0) & \mu\pi_{(1,0)} + \lambda\pi_{(0,1)} = \lambda\pi_{(0,0)} + \mu\pi_{(0,0)} \\
(n, 0) & \mu\pi_{(n+1,0)} + \lambda\pi_{(n-1,0)} = \lambda\pi_{(n,0)} + \mu\pi_{(n,0)}, \quad n \ge 1 \\
(0, m) & \lambda\pi_{(0,m+1)} + \mu\pi_{(0,m-1)} = \lambda\pi_{(0,m)} + \mu\pi_{(0,m)}, \quad m \ge 1.
\end{array}
\]

4. Let $X(t)$ be the number of customers in the system at time $t$. Then the process $\{X(t) : t \ge 0\}$ is a birth and death process with $\lambda_n = \lambda$, $n \ge 0$, and $\mu_n = n\mu$, $n \ge 1$. To find $\pi_0$, the probability that the system is empty, we will first calculate the sum in (12.18). We have
\[
\sum_{n=1}^\infty \frac{\lambda_0 \lambda_1 \cdots \lambda_{n-1}}{\mu_1 \mu_2 \cdots \mu_n}
= \sum_{n=1}^\infty \frac{\lambda^n}{n!\, \mu^n}
= \sum_{n=1}^\infty \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n
= -1 + \sum_{n=0}^\infty \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n
= -1 + e^{\lambda/\mu}.
\]
Hence, by (12.18),
\[
\pi_0 = \frac{1}{1 - 1 + e^{\lambda/\mu}} = e^{-\lambda/\mu}.
\]
By (12.17),
\[
\pi_n = \frac{\lambda^n \pi_0}{n!\, \mu^n} = \frac{(\lambda/\mu)^n e^{-\lambda/\mu}}{n!}, \quad n = 0, 1, 2, \ldots.
\]
This shows that the long-run number of customers in such an M/M/∞ queueing system is Poisson with parameter $\lambda/\mu$. The average number of customers in the system is, therefore, $\lambda/\mu$.

5. Let $X(t)$ be the number of operators busy serving customers at time $t$. Clearly, $\{X(t) : t \ge 0\}$ is a finite-state birth and death process with state space $\{0, 1, \ldots, c\}$, birth rates $\lambda_n = \lambda$, $n = 0, 1, \ldots, c$, and death rates $\mu_n = n\mu$, $n = 0, 1, \ldots, c$. Let $\pi_0$ be the proportion of time that all operators are free.
Let $\pi_c$ be the proportion of time all of them are busy serving customers.

(a) $\pi_c$ is the desired quantity. By (12.22),
\[
\pi_0 = \frac{1}{1 + \sum_{n=1}^c \frac{\lambda^n}{n!\, \mu^n}} = \frac{1}{\sum_{n=0}^c \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n}.
\]
By (12.21),
\[
\pi_c = \frac{\frac{1}{c!}(\lambda/\mu)^c}{\sum_{n=0}^c \frac{1}{n!}(\lambda/\mu)^n}.
\]
This formula is called Erlang's loss formula.

(b) We want to find the smallest $c$ for which
\[
\frac{1/c!}{\sum_{n=0}^c (1/n!)} \le 0.004.
\]
For $c = 5$, the left side is 0.00306748. For $c = 4$, it is 0.01538462. Therefore, the airline must hire at least five operators to reduce the probability of losing a call to a number less than 0.004.

6. No, it is not, because it is possible for the process to enter state 0 directly from state 2. In a birth and death process, from a state $i$, transitions are only possible to the states $i - 1$ and $i + 1$.

7. For $n \ge 0$, let $H_n$ be the time, starting from $n$, until the process enters state $n + 1$ for the first time. Clearly, $E(H_0) = 1/\lambda$ and, by Lemma 12.2,
\[
E(H_n) = \frac{1}{\lambda} + E(H_{n-1}), \quad n \ge 1.
\]
Hence
\[
E(H_0) = \frac{1}{\lambda}, \quad
E(H_1) = \frac{1}{\lambda} + \frac{1}{\lambda} = \frac{2}{\lambda}, \quad
E(H_2) = \frac{1}{\lambda} + \frac{2}{\lambda} = \frac{3}{\lambda}.
\]
Continuing this process, we obtain
\[
E(H_n) = \frac{n + 1}{\lambda}, \quad n \ge 0.
\]
The desired quantity is
\begin{align*}
\sum_{n=i}^{j-1} E(H_n) = \sum_{n=i}^{j-1} \frac{n + 1}{\lambda}
&= \frac{1}{\lambda}\bigl[(i + 1) + (i + 2) + \cdots + j\bigr] \\
&= \frac{1}{\lambda}\bigl[(1 + 2 + \cdots + j) - (1 + 2 + \cdots + i)\bigr] \\
&= \frac{1}{\lambda}\left[\frac{j(j + 1)}{2} - \frac{i(i + 1)}{2}\right] = \frac{j(j + 1) - i(i + 1)}{2\lambda}.
\end{align*}

8. Suppose that a birth occurs each time that an out-of-order machine is repaired and begins to operate, and a death occurs each time that a machine breaks down. The fact that $\{X(t) : t \ge 0\}$ is a birth and death process with state space $\{0, 1, \ldots, m\}$ should be clear. The birth and death rates are
\begin{align*}
\lambda_n &= \begin{cases} k\lambda & n = 0, 1, \ldots, m - k \\ (m - n)\lambda & n = m - k + 1, m - k + 2, \ldots, m, \end{cases} \\
\mu_n &= n\mu, \quad n = 0, 1, \ldots, m.
\end{align*}

9. The birth rates are
\[
\begin{cases} \lambda_0 = \lambda \\ \lambda_n = \alpha_n \lambda, & n \ge 1. \end{cases}
\]
The death rates are
\[
\begin{cases} \mu_0 = 0 \\ \mu_n = \mu + (n - 1)\gamma, & n \ge 1. \end{cases}
\]

10. Let $X(t)$ be the population size at time $t$. Then $\{X(t) : t \ge 0\}$ is a birth and death process with birth rates $\lambda_n = n\lambda + \gamma$, $n \ge 0$, and death rates $\mu_n = n\mu$, $n \ge 1$. For $i \ge 0$, let $H_i$ be the time, starting from $i$, until the population size reaches $i + 1$ for the first time. We are interested in $E(H_0) + E(H_1) + E(H_2)$. Note that, by Lemma 12.2,
\[
E(H_i) = \frac{1}{\lambda_i} + \frac{\mu_i}{\lambda_i} E(H_{i-1}), \quad i \ge 1.
\]
Since $E(H_0) = 1/\gamma$,
\[
E(H_1) = \frac{1}{\lambda + \gamma} + \frac{\mu}{\lambda + \gamma} \cdot \frac{1}{\gamma} = \frac{\mu + \gamma}{\gamma(\lambda + \gamma)},
\]
and
\[
E(H_2) = \frac{1}{2\lambda + \gamma} + \frac{2\mu}{2\lambda + \gamma} \cdot \frac{\mu + \gamma}{\gamma(\lambda + \gamma)} = \frac{\gamma(\lambda + \gamma) + 2\mu(\mu + \gamma)}{\gamma(\lambda + \gamma)(2\lambda + \gamma)}.
\]
Thus the desired quantity is
\[
E(H_0) + E(H_1) + E(H_2) = \frac{(\lambda + \gamma)(2\lambda + \gamma) + (\mu + \gamma)(2\lambda + 2\mu + \gamma) + \gamma(\lambda + \gamma)}{\gamma(\lambda + \gamma)(2\lambda + \gamma)}.
\]

11. Let $X(t)$ be the number of deaths in the time interval $[0, t]$. Since there are no births, by Remark 7.2, it should be clear that $\{X(t) : t \ge 0\}$ is a Poisson process with rate $\mu$ as long as the population is not extinct. Therefore, for $0 < j \le i$,
\[
p_{ij}(t) = \frac{e^{-\mu t}(\mu t)^{i-j}}{(i - j)!}.
\]
For $i > 0$, $j = 0$, we have
\[
p_{i0}(t) = 1 - \sum_{j=1}^i p_{ij}(t) = 1 - \sum_{j=1}^i \frac{e^{-\mu t}(\mu t)^{i-j}}{(i - j)!}.
\]
Letting $k = i - j$ yields
\[
p_{i0}(t) = 1 - \sum_{k=0}^{i-1} \frac{e^{-\mu t}(\mu t)^k}{k!} = \sum_{k=i}^\infty \frac{e^{-\mu t}(\mu t)^k}{k!}.
\]

12. Suppose that a birth occurs whenever a physician takes a break, and a death occurs whenever he or she becomes available to answer patients' calls. Let $X(t)$ be the number of physicians on break at time $t$. Then $\{X(t) : t \ge 0\}$ is a birth and death process with state space $\{0, 1, 2\}$. Clearly, $X(t) = 0$ if at $t$ both of the physicians are available to answer patients' calls, $X(t) = 1$ if at $t$ only one of the physicians is available to answer patients' calls, and $X(t) = 2$ if at $t$ none of the physicians is available to answer patients' calls.
We have that
\[
\lambda_0 = 2\lambda, \quad \lambda_1 = \lambda, \quad \lambda_2 = 0, \qquad
\mu_0 = 0, \quad \mu_1 = \mu, \quad \mu_2 = 2\mu.
\]
Therefore,
\[
\nu_0 = 2\lambda, \quad \nu_1 = \lambda + \mu, \quad \nu_2 = 2\mu.
\]
Also,
\[
p_{01} = p_{21} = 1, \quad p_{02} = p_{20} = 0, \quad p_{10} = \frac{\mu}{\lambda + \mu}, \quad p_{12} = \frac{\lambda}{\lambda + \mu}.
\]
Therefore,
\begin{align*}
q_{01} &= \nu_0 p_{01} = 2\lambda, & q_{10} &= \nu_1 p_{10} = \mu, & q_{12} &= \nu_1 p_{12} = \lambda, \\
q_{21} &= \nu_2 p_{21} = 2\mu, & q_{02} &= q_{20} = 0.
\end{align*}
Substituting these quantities in the Kolmogorov backward equations
\[
p'_{ij}(t) = \sum_{k \ne i} q_{ik}\, p_{kj}(t) - \nu_i\, p_{ij}(t),
\]
we obtain
\begin{align*}
p'_{00}(t) &= 2\lambda p_{10}(t) - 2\lambda p_{00}(t) \\
p'_{01}(t) &= 2\lambda p_{11}(t) - 2\lambda p_{01}(t) \\
p'_{02}(t) &= 2\lambda p_{12}(t) - 2\lambda p_{02}(t) \\
p'_{10}(t) &= \lambda p_{20}(t) + \mu p_{00}(t) - (\lambda + \mu)p_{10}(t) \\
p'_{11}(t) &= \lambda p_{21}(t) + \mu p_{01}(t) - (\lambda + \mu)p_{11}(t) \\
p'_{12}(t) &= \lambda p_{22}(t) + \mu p_{02}(t) - (\lambda + \mu)p_{12}(t) \\
p'_{20}(t) &= 2\mu p_{10}(t) - 2\mu p_{20}(t) \\
p'_{21}(t) &= 2\mu p_{11}(t) - 2\mu p_{21}(t) \\
p'_{22}(t) &= 2\mu p_{12}(t) - 2\mu p_{22}(t).
\end{align*}

13. Let $X(t)$ be the number of customers in the system at time $t$. Then $\{X(t) : t \ge 0\}$ is a birth and death process with $\lambda_n = \lambda$, for $n \ge 0$, and
\[
\mu_n = \begin{cases} n\mu & n = 0, 1, \ldots, c \\ c\mu & n > c. \end{cases}
\]
By (12.21), for $n = 1, 2, \ldots, c$,
\[
\pi_n = \frac{\lambda^n}{n!\, \mu^n}\, \pi_0 = \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n \pi_0;
\]
for $n > c$,
\[
\pi_n = \frac{\lambda^n}{c!\, \mu^c (c\mu)^{n-c}}\, \pi_0 = \frac{\lambda^n}{c!\, c^{n-c} \mu^n}\, \pi_0 = \frac{c^c}{c!}\left(\frac{\lambda}{c\mu}\right)^n \pi_0 = \frac{c^c}{c!}\, \rho^n \pi_0.
\]
Noting that $\sum_{n=0}^c \pi_n + \sum_{n=c+1}^\infty \pi_n = 1$, we have
\[
\pi_0 \sum_{n=0}^c \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n + \pi_0\, \frac{c^c}{c!} \sum_{n=c+1}^\infty \rho^n = 1.
\]
Since $\rho < 1$, we have $\sum_{n=c+1}^\infty \rho^n = \frac{\rho^{c+1}}{1 - \rho}$. Therefore,
\[
\pi_0 = \frac{1}{\sum_{n=0}^c \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n + \frac{c^c}{c!} \sum_{n=c+1}^\infty \rho^n}
= \frac{c!\,(1 - \rho)}{c!\,(1 - \rho) \sum_{n=0}^c \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n + c^c \rho^{c+1}}.
\]

14. Let $s, t > 0$. If $j < i$ [...] $j > i$.
Then
\begin{align*}
\sum_{k=0}^\infty p_{ik}(s)p_{kj}(t) &= \sum_{k=i}^j p_{ik}(s)p_{kj}(t)
= \sum_{k=i}^j \frac{e^{-\lambda s}(\lambda s)^{k-i}}{(k - i)!} \cdot \frac{e^{-\lambda t}(\lambda t)^{j-k}}{(j - k)!} \\
&= \frac{e^{-\lambda(t+s)}}{(j - i)!} \sum_{k=i}^j \frac{(j - i)!}{(k - i)!\,(j - k)!} (\lambda s)^{k-i} (\lambda t)^{j-k} \\
&= \frac{e^{-\lambda(t+s)}}{(j - i)!} \sum_{\ell=0}^{j-i} \frac{(j - i)!}{\ell!\,(j - i - \ell)!} (\lambda s)^\ell (\lambda t)^{(j-i)-\ell} \\
&= \frac{e^{-\lambda(t+s)}}{(j - i)!} \sum_{\ell=0}^{j-i} \binom{j - i}{\ell} (\lambda s)^\ell (\lambda t)^{(j-i)-\ell} \\
&= \frac{e^{-\lambda(t+s)}}{(j - i)!} (\lambda s + \lambda t)^{j-i},
\end{align*}
where the last equality follows by Theorem 2.5, the binomial expansion. Since
\[
\frac{e^{-\lambda(t+s)}}{(j - i)!} \bigl[\lambda(t + s)\bigr]^{j-i} = p_{ij}(s + t),
\]
we have shown that the Chapman-Kolmogorov equations are satisfied.

15. Let $X(t)$ be the number of particles in the shower $t$ units of time after the cosmic particle enters the earth's atmosphere. Clearly, $\{X(t) : t \ge 0\}$ is a continuous-time Markov chain with state space $\{1, 2, \ldots\}$ and $\nu_i = i\lambda$, $i \ge 1$. (In fact, $\{X(t) : t \ge 0\}$ is a pure birth process, but that fact will not help us solve this exercise.) Clearly, for $i \ge 1$, $j \ge 1$,
\[
p_{ij} = \begin{cases} 1 & \text{if } j = i + 1 \\ 0 & \text{if } j \ne i + 1. \end{cases}
\]
Hence
\[
q_{ij} = \begin{cases} \nu_i & \text{if } j = i + 1 \\ 0 & \text{if } j \ne i + 1. \end{cases}
\]
We are interested in finding $p_{1n}(t)$. This is the desired probability. For $n = 1$, $p_{11}(t)$ is the probability that the cosmic particle does not collide with any air particles during the first $t$ units of time in the earth's atmosphere. Since the time it takes the particle to collide with another particle is exponential with parameter $\lambda$, we have $p_{11}(t) = e^{-\lambda t}$. For $n \ge 2$, by Kolmogorov's forward equations,
\begin{align*}
p'_{1n}(t) &= \sum_{k \ne n} q_{kn}\, p_{1k}(t) - \nu_n\, p_{1n}(t) \\
&= q_{(n-1)n}\, p_{1(n-1)}(t) - \nu_n\, p_{1n}(t) \\
&= \nu_{n-1}\, p_{1(n-1)}(t) - \nu_n\, p_{1n}(t).
\end{align*}
Therefore,
\[
p'_{1n}(t) = (n - 1)\lambda\, p_{1(n-1)}(t) - n\lambda\, p_{1n}(t). \tag{49}
\]
For $n = 2$, this gives
\[
p'_{12}(t) = \lambda p_{11}(t) - 2\lambda p_{12}(t)
\]
or, equivalently,
\[
p'_{12}(t) = \lambda e^{-\lambda t} - 2\lambda p_{12}(t).
\]
Solving this first order linear differential equation with boundary condition $p_{12}(0) = 0$, we obtain
\[
p_{12}(t) = e^{-\lambda t}(1 - e^{-\lambda t}).
\]
For $n = 3$, by (49),
\[
p'_{13}(t) = 2\lambda p_{12}(t) - 3\lambda p_{13}(t)
\]
or, equivalently,
\[
p'_{13}(t) = 2\lambda e^{-\lambda t}(1 - e^{-\lambda t}) - 3\lambda p_{13}(t).
\]
Solving this first order linear differential equation with boundary condition $p_{13}(0) = 0$ yields
\[
p_{13}(t) = e^{-\lambda t}(1 - e^{-\lambda t})^2.
\]
Continuing this process, and using induction, we obtain that
\[
p_{1n}(t) = e^{-\lambda t}(1 - e^{-\lambda t})^{n-1}, \quad n \ge 1.
\]

16. It is straightforward to see that
\[
\pi_{(i,j)} = \left(\frac{\lambda}{\mu_1}\right)^i \left(1 - \frac{\lambda}{\mu_1}\right) \left(\frac{\lambda}{\mu_2}\right)^j \left(1 - \frac{\lambda}{\mu_2}\right), \quad i, j \ge 0,
\]
satisfy the following balance equations for the tandem queueing system under consideration:
\[
\begin{array}{ll}
\text{State} & \text{Input rate to} = \text{Output rate from} \\
(0, 0) & \mu_2\pi_{(0,1)} = \lambda\pi_{(0,0)} \\
(i, 0),\ i \ge 1 & \mu_2\pi_{(i,1)} + \lambda\pi_{(i-1,0)} = \lambda\pi_{(i,0)} + \mu_1\pi_{(i,0)} \\
(0, j),\ j \ge 1 & \mu_2\pi_{(0,j+1)} + \mu_1\pi_{(1,j-1)} = \lambda\pi_{(0,j)} + \mu_2\pi_{(0,j)} \\
(i, j),\ i, j \ge 1 & \mu_2\pi_{(i,j+1)} + \mu_1\pi_{(i+1,j-1)} + \lambda\pi_{(i-1,j)} = \lambda\pi_{(i,j)} + \mu_1\pi_{(i,j)} + \mu_2\pi_{(i,j)}.
\end{array}
\]
Hence, by Example 12.43, $\pi_{(i,j)}$ is the product of the long-run probability of an M/M/1 system having $i$ customers in the system and that of another M/M/1 queueing system having $j$ customers in the system. This establishes what we wanted to show.

17. Clearly, $\{X(t) : t \ge 0\}$ is a birth and death process with birth rates $\lambda_i = i\lambda$, $i \ge 0$, and death rates $\mu_i = i\mu + \gamma$, $i > 0$; $\mu_0 = 0$. For some $m \ge 1$, suppose that $X(t) = m$. Then, for infinitesimal values of $h$, by (12.5), the population at $t + h$ is $m + 1$ with probability $m\lambda h + o(h)$, it is $m - 1$ with probability $(m\mu + \gamma)h + o(h)$, and it is still $m$ with probability
\[
1 - m\lambda h - o(h) - (m\mu + \gamma)h - o(h) = 1 - (m\lambda + m\mu + \gamma)h + o(h).
\]
Therefore,
\begin{align*}
E\bigl[X(t + h) \mid X(t) = m\bigr] &= (m + 1)\bigl[m\lambda h + o(h)\bigr] + (m - 1)\bigl[(m\mu + \gamma)h + o(h)\bigr] \\
&\quad + m\bigl[1 - (m\lambda + m\mu + \gamma)h + o(h)\bigr] \\
&= m + \bigl[m(\lambda - \mu) - \gamma\bigr]h + o(h).
\end{align*}
This relation implies that
\[
E\bigl[X(t + h) \mid X(t)\bigr] = X(t) + \bigl[(\lambda - \mu)X(t) - \gamma\bigr]h + o(h).
\]
Equating the expected values of both sides, and noting that
\[
E\Bigl[E\bigl[X(t + h) \mid X(t)\bigr]\Bigr] = E\bigl[X(t + h)\bigr],
\]
we obtain
\[
E\bigl[X(t + h)\bigr] = E\bigl[X(t)\bigr] + h(\lambda - \mu)E\bigl[X(t)\bigr] - \gamma h + o(h).
\]
For simplicity, let $g(t) = E\bigl[X(t)\bigr]$. We have shown that
\[
g(t + h) = g(t) + h(\lambda - \mu)g(t) - \gamma h + o(h)
\]
or, equivalently,
\[
\frac{g(t + h) - g(t)}{h} = (\lambda - \mu)g(t) - \gamma + \frac{o(h)}{h}.
\]
As $h \to 0$, this gives
\[
g'(t) = (\lambda - \mu)g(t) - \gamma.
\]
If $\lambda = \mu$, then $g'(t) = -\gamma$. So $g(t) = -\gamma t + c$. Since $g(0) = n$, we must have $c = n$, or $g(t) = -\gamma t + n$. If $\lambda \ne \mu$, to solve the first order linear differential equation
\[
g'(t) = (\lambda - \mu)g(t) - \gamma,
\]
let $f(t) = (\lambda - \mu)g(t) - \gamma$. Then
\[
\frac{1}{\lambda - \mu} f'(t) = f(t), \quad \text{or} \quad \frac{f'(t)}{f(t)} = \lambda - \mu.
\]
This yields
\[
\ln |f(t)| = (\lambda - \mu)t + c,
\]
or
\[
f(t) = e^{(\lambda - \mu)t + c} = K e^{(\lambda - \mu)t},
\]
where $K = e^c$. Thus
\[
g(t) = \frac{K}{\lambda - \mu}\, e^{(\lambda - \mu)t} + \frac{\gamma}{\lambda - \mu}.
\]
Now $g(0) = n$ implies that $K = n(\lambda - \mu) - \gamma$. Thus
\[
g(t) = E\bigl[X(t)\bigr] = n e^{(\lambda - \mu)t} + \frac{\gamma}{\lambda - \mu}\bigl[1 - e^{(\lambda - \mu)t}\bigr].
\]

18. For $n \ge 0$, let $E_n$ be the event that, starting from state $n$, extinction will eventually occur. Let $\alpha_n = P(E_n)$. Clearly, $\alpha_0 = 1$. We will show that $\alpha_n = 1$ for all $n$. For $n \ge 1$, starting from $n$, let $Z_n$ be the state to which the process will move. Then $Z_n$ is a discrete random variable with set of possible values $\{n - 1, n + 1\}$.
Conditioning on $Z_n$ yields
\[
P(E_n) = P(E_n \mid Z_n = n - 1)P(Z_n = n - 1) + P(E_n \mid Z_n = n + 1)P(Z_n = n + 1).
\]
Hence
\[
\alpha_n = \alpha_{n-1} \cdot \frac{\mu_n}{\lambda_n + \mu_n} + \alpha_{n+1} \cdot \frac{\lambda_n}{\lambda_n + \mu_n}, \quad n \ge 1,
\]
or, equivalently,
\[
\lambda_n(\alpha_{n+1} - \alpha_n) = \mu_n(\alpha_n - \alpha_{n-1}), \quad n \ge 1.
\]
For $n \ge 0$, let $y_n = \alpha_{n+1} - \alpha_n$. We have
\[
\lambda_n y_n = \mu_n y_{n-1}, \quad n \ge 1,
\quad \text{or} \quad
y_n = \frac{\mu_n}{\lambda_n}\, y_{n-1}, \quad n \ge 1.
\]
Therefore,
\begin{align*}
y_1 &= \frac{\mu_1}{\lambda_1}\, y_0, \\
y_2 &= \frac{\mu_2}{\lambda_2}\, y_1 = \frac{\mu_1\mu_2}{\lambda_1\lambda_2}\, y_0, \\
&\;\;\vdots \\
y_n &= \frac{\mu_1\mu_2 \cdots \mu_n}{\lambda_1\lambda_2 \cdots \lambda_n}\, y_0, \quad n \ge 1.
\end{align*}
On the other hand, by $y_n = \alpha_{n+1} - \alpha_n$, $n \ge 0$,
\begin{align*}
\alpha_1 &= \alpha_0 + y_0 = 1 + y_0, \\
\alpha_2 &= \alpha_1 + y_1 = 1 + y_0 + y_1, \\
&\;\;\vdots \\
\alpha_{n+1} &= 1 + y_0 + y_1 + \cdots + y_n.
\end{align*}
Hence
\begin{align*}
\alpha_{n+1} &= 1 + y_0 + \sum_{k=1}^n y_k = 1 + y_0 + y_0 \sum_{k=1}^n \frac{\mu_1\mu_2 \cdots \mu_k}{\lambda_1\lambda_2 \cdots \lambda_k} \\
&= 1 + y_0 \Bigl(1 + \sum_{k=1}^n \frac{\mu_1\mu_2 \cdots \mu_k}{\lambda_1\lambda_2 \cdots \lambda_k}\Bigr)
= 1 + (\alpha_1 - 1)\Bigl(1 + \sum_{k=1}^n \frac{\mu_1\mu_2 \cdots \mu_k}{\lambda_1\lambda_2 \cdots \lambda_k}\Bigr).
\end{align*}
Since $\sum_{k=1}^\infty \frac{\mu_1\mu_2 \cdots \mu_k}{\lambda_1\lambda_2 \cdots \lambda_k} = \infty$, the sequence $\sum_{k=1}^n \frac{\mu_1\mu_2 \cdots \mu_k}{\lambda_1\lambda_2 \cdots \lambda_k}$ increases without bound. For the $\alpha_n$'s to exist, this requires that $\alpha_1 = 1$, which in turn implies that $\alpha_{n+1} = 1$ for $n \ge 1$.

12.5 BROWNIAN MOTION

1. (a) By the independent-increments property of Brownian motions, the desired probability is $P(-1/2 <$ [...]

[...]
\begin{align*}
P\left(\frac{|X(t)|}{t} > \varepsilon\right) &= P\bigl(|X(t)| > \varepsilon t\bigr) = P\bigl(X(t) > \varepsilon t\bigr) + P\bigl(X(t) < -\varepsilon t\bigr) \\
&= P\left(Z > \frac{\varepsilon t}{\sigma\sqrt{t}}\right) + P\left(Z < -\frac{\varepsilon t}{\sigma\sqrt{t}}\right)
= P\left(Z > \frac{\varepsilon\sqrt{t}}{\sigma}\right) + P\left(Z < -\frac{\varepsilon\sqrt{t}}{\sigma}\right) \\
&= 1 - \Phi\bigl(\varepsilon\sqrt{t}/\sigma\bigr) + \Phi\bigl(-\varepsilon\sqrt{t}/\sigma\bigr)
= 1 - \Phi\bigl(\varepsilon\sqrt{t}/\sigma\bigr) + 1 - \Phi\bigl(\varepsilon\sqrt{t}/\sigma\bigr) \\
&= 2 - 2\Phi\bigl(\varepsilon\sqrt{t}/\sigma\bigr).
\end{align*}
This implies that
\[
\lim_{t\to 0} P\left(\frac{|X(t)|}{t} > \varepsilon\right) = 2 - 1 = 1,
\]
whereas
\[
\lim_{t\to\infty} P\left(\frac{|X(t)|}{t} > \varepsilon\right) = 2 - 2 = 0.
\]

4. Let $F$ be the probability distribution function of $1/Y^2$. Let $Z \sim N(0, 1)$. We have
\begin{align*}
F(t) = P\bigl(1/Y^2 \le t\bigr) = P\bigl(Y^2 \ge 1/t\bigr) &= P\bigl(Y \ge 1/\sqrt{t}\bigr) + P\bigl(Y \le -1/\sqrt{t}\bigr) \\
&= P\left(Z \ge \frac{\alpha}{\sigma\sqrt{t}}\right) + P\left(Z \le -\frac{\alpha}{\sigma\sqrt{t}}\right) \\
&= 1 - \Phi\left(\frac{\alpha}{\sigma\sqrt{t}}\right) + \Phi\left(-\frac{\alpha}{\sigma\sqrt{t}}\right)
= 2\left[1 - \Phi\left(\frac{\alpha}{\sigma\sqrt{t}}\right)\right],
\end{align*}
which, by (12.35), is also the distribution function of $T_\alpha$.

5. Clearly, $P(T <$ [...] $> t$, by Theorem 12.10, $P(T$ [...]

[...] $> 0$, the probability density function of $Z(t)$ is
\[
\phi_t(x) = \frac{1}{\sigma\sqrt{2\pi t}} \exp\left[-\frac{x^2}{2\sigma^2 t}\right].
\]
Therefore,
\[
E\bigl[V(t)\bigr] = E\bigl[|Z(t)|\bigr] = \int_{-\infty}^\infty |x|\,\phi_t(x)\, dx = 2\int_0^\infty x\,\phi_t(x)\, dx = 2\int_0^\infty \frac{x}{\sigma\sqrt{2\pi t}}\, e^{-x^2/(2\sigma^2 t)}\, dx.
\]
Making the change of variable $u = \frac{x}{\sigma\sqrt{t}}$ yields
\[
E\bigl[V(t)\bigr] = \sigma\sqrt{\frac{2t}{\pi}} \int_0^\infty u e^{-u^2/2}\, du = \sigma\sqrt{\frac{2t}{\pi}} \Bigl[-e^{-u^2/2}\Bigr]_0^\infty = \sigma\sqrt{\frac{2t}{\pi}}.
\]
Moreover,
\[
\mathrm{Var}\bigl[V(t)\bigr] = E\bigl[V(t)^2\bigr] - \bigl(E\bigl[V(t)\bigr]\bigr)^2 = E\bigl[Z(t)^2\bigr] - \frac{2\sigma^2 t}{\pi} = \sigma^2 t - \frac{2\sigma^2 t}{\pi} = \sigma^2 t\left(1 - \frac{2}{\pi}\right),
\]
since
\[
E\bigl[Z(t)^2\bigr] = \mathrm{Var}\bigl[Z(t)\bigr] + \bigl(E\bigl[Z(t)\bigr]\bigr)^2 = \sigma^2 t + 0 = \sigma^2 t.
\]
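The two closed forms above can be checked by direct numerical integration against the density $\phi_t$. The following is a minimal sketch using a midpoint rule; the function name `abs_normal_moments`, the truncation at `cutoff` standard deviations, and the values $\sigma = 1.5$, $t = 2$ in the usage example are assumptions chosen here for illustration, not part of the exercise.

```python
import math


def abs_normal_moments(sigma, t, n=100_000, cutoff=12.0):
    """Midpoint-rule approximations of E|Z(t)| and Var|Z(t)| for Z(t) ~ N(0, sigma^2 t)."""
    s = sigma * math.sqrt(t)           # standard deviation of Z(t)
    upper = cutoff * s                 # truncate the integral; the tail beyond is negligible
    dx = upper / n
    m1 = 0.0                           # accumulates E|Z(t)|
    m2 = 0.0                           # accumulates E[Z(t)^2]
    for k in range(n):
        x = (k + 0.5) * dx
        w = math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi)) * dx
        m1 += 2 * x * w                # by symmetry, integrate over x > 0 and double
        m2 += 2 * x * x * w
    return m1, m2 - m1 * m1


if __name__ == "__main__":
    sigma, t = 1.5, 2.0                # illustrative values
    mean, var = abs_normal_moments(sigma, t)
    print(mean, sigma * math.sqrt(2 * t / math.pi))
    print(var, sigma**2 * t * (1 - 2 / math.pi))
```

Each printed pair should agree to several decimal places, the first entry being the numerical integral and the second the closed form from the solution.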
To find $P\bigl(V(t) \le z \mid V(0) = z_0\bigr)$, note that, by (12.27),
\begin{align*}
P\bigl(V(t) \le z \mid V(0) = z_0\bigr) &= P\bigl(|Z(t)| \le z \mid V(0) = z_0\bigr)
= P\bigl(-z \le Z(t) \le z \mid V(0) = z_0\bigr) \\
&= \int_{-z}^{z} \frac{1}{\sigma\sqrt{2\pi t}}\,e^{-(u-z_0)^2/(2\sigma^2 t)}\,du.
\end{align*}
Letting $U \sim N(z_0, \sigma^2 t)$ and $Z \sim N(0,1)$, this implies that
\begin{align*}
P\bigl(V(t) \le z \mid V(0) = z_0\bigr) &= P(-z \le U \le z)
= P\Bigl(\frac{-z - z_0}{\sigma\sqrt{t}} \le Z \le \frac{z - z_0}{\sigma\sqrt{t}}\Bigr) \\
&= \Phi\Bigl(\frac{z - z_0}{\sigma\sqrt{t}}\Bigr) - \Phi\Bigl(\frac{-z - z_0}{\sigma\sqrt{t}}\Bigr)
= \Phi\Bigl(\frac{z + z_0}{\sigma\sqrt{t}}\Bigr) + \Phi\Bigl(\frac{z - z_0}{\sigma\sqrt{t}}\Bigr) - 1.
\end{align*}

10. Clearly, $D(t) = \sqrt{X(t)^2 + Y(t)^2 + Z(t)^2}$. Since $X(t)$, $Y(t)$, and $Z(t)$ are independent and identically distributed normal random variables with mean 0 and variance $\sigma^2 t$, we have
\begin{align*}
E\bigl[D(t)\bigr] &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \sqrt{x^2+y^2+z^2}\cdot
\frac{1}{\sigma\sqrt{2\pi t}}e^{-x^2/(2\sigma^2 t)}\cdot
\frac{1}{\sigma\sqrt{2\pi t}}e^{-y^2/(2\sigma^2 t)}\cdot
\frac{1}{\sigma\sqrt{2\pi t}}e^{-z^2/(2\sigma^2 t)}\,dx\,dy\,dz \\
&= \frac{1}{2\pi\sigma^3 t\sqrt{2\pi t}}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
\sqrt{x^2+y^2+z^2}\cdot e^{-(x^2+y^2+z^2)/(2\sigma^2 t)}\,dx\,dy\,dz.
\end{align*}
We now make a change of variables to spherical coordinates: $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$, $\rho^2 = x^2+y^2+z^2$, $dx\,dy\,dz = \rho^2\sin\phi\,d\rho\,d\phi\,d\theta$, $0 \le \rho < \infty$, $0 \le \phi \le \pi$, and $0 \le \theta \le 2\pi$.
We obtain
\begin{align*}
E\bigl[D(t)\bigr] &= \frac{1}{2\pi\sigma^3 t\sqrt{2\pi t}}\int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\infty}
\rho\,e^{-\rho^2/(2\sigma^2 t)}\cdot\rho^2\sin\phi\,d\rho\,d\phi\,d\theta \\
&= \frac{1}{2\pi\sigma^3 t\sqrt{2\pi t}}\int_0^{2\pi}\Bigl[\int_0^{\pi}\Bigl(\int_0^{\infty}\rho^3 e^{-\rho^2/(2\sigma^2 t)}\,d\rho\Bigr)\sin\phi\,d\phi\Bigr]d\theta \\
&= \frac{1}{2\pi\sigma^3 t\sqrt{2\pi t}}\int_0^{2\pi}\Bigl(\int_0^{\pi}\Bigl[-\sigma^2 t\,(\rho^2 + 2\sigma^2 t)\,e^{-\rho^2/(2\sigma^2 t)}\Bigr]_0^{\infty}\sin\phi\,d\phi\Bigr)d\theta \\
&= \frac{1}{2\pi\sigma^3 t\sqrt{2\pi t}}\cdot 2\sigma^4 t^2\int_0^{2\pi}\Bigl(\int_0^{\pi}\sin\phi\,d\phi\Bigr)d\theta
= 2\sigma\sqrt{\frac{2t}{\pi}}.
\end{align*}

11. Noting that $\sqrt{5.29} = 2.3$, we have
\[
V(t) = 95\,e^{-2t + 2.3W(t)},
\]
where $\{W(t)\colon t \ge 0\}$ is a standard Brownian motion. Hence $W(t) \sim N(0,t)$. The desired probability is
\begin{align*}
P\bigl(V(0.75) < 80\bigr) &= P\bigl(95\,e^{-2(0.75) + 2.3W(0.75)} < 80\bigr)
= P\bigl(e^{2.3W(0.75)} < 3.774\bigr) = P\bigl(W(0.75) < 0.577\bigr) \\
&= P\Bigl(\frac{W(0.75) - 0}{\sqrt{0.75}} < \frac{0.577}{\sqrt{0.75}}\Bigr)
= P(Z < 0.67) = \Phi(0.67) = 0.7486.
\end{align*}

REVIEW PROBLEMS FOR CHAPTER 12

1. Label the time point 10:00 as $t = 0$. We are given that $N(180) = 10$ and are interested in $P\bigl(S_{10} \ge 160 \mid N(180) = 10\bigr)$. Let $X_1, X_2, \ldots, X_{10}$ be 10 independent random variables uniformly distributed over the interval $[0, 180]$. Let $Y = \max(X_1, \ldots, X_{10})$. By Theorem 12.4,
\begin{align*}
P\bigl(S_{10} > 160 \mid N(180) = 10\bigr) &= P(Y > 160) = 1 - P(Y \le 160) \\
&= 1 - P\bigl(\max(X_1, \ldots, X_{10}) \le 160\bigr) \\
&= 1 - P(X_1 \le 160)\,P(X_2 \le 160)\cdots P(X_{10} \le 160) \\
&= 1 - \Bigl(\frac{160}{180}\Bigr)^{10} = 0.692.
\end{align*}

2. For all positive integers $n$, we have that
\[
P^{2n} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad\text{and}\quad
P^{2n+1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
Therefore, $\{X_n\colon n = 0, 1, \ldots\}$ is not regular.

3.
By drawing a transition graph, it can be readily seen that, if states 0, 1, 2, 3, and 4 are renamed 0, 4, 2, 1, and 3, respectively, then the transition probability matrix $P_1$ will change to $P_2$.

4. Let $Z$ be the number of transitions until the first visit to 1. Clearly, $Z$ is a geometric random variable with parameter $p = 3/5$. Hence its expected value is $1/p = 5/3$.

5. By drawing a transition graph, it is readily seen that this Markov chain consists of two recurrent classes $\{3, 5\}$ and $\{4\}$, and two transient classes $\{1\}$ and $\{2\}$.

6. We have that
\[
X_{n+1} = \begin{cases}
X_n & \text{if the $(n+1)$st outcome is not 6} \\
1 + X_n & \text{if the $(n+1)$st outcome is 6.}
\end{cases}
\]
This shows that $\{X_n\colon n = 1, 2, \ldots\}$ is a Markov chain with state space $\{0, 1, 2, \ldots\}$. Its transition probability matrix is given by
\[
P = \begin{pmatrix}
5/6 & 1/6 & 0 & 0 & 0 & \cdots \\
0 & 5/6 & 1/6 & 0 & 0 & \cdots \\
0 & 0 & 5/6 & 1/6 & 0 & \cdots \\
0 & 0 & 0 & 5/6 & 1/6 & \cdots \\
\vdots & & & & &
\end{pmatrix}.
\]
All states are transient; no two states communicate with each other. Therefore, we have infinitely many classes; namely, $\{0\}$, $\{1\}$, $\{2\}$, \ldots, and each one of them is transient.

7. The desired probability is
\begin{align*}
& p_{11}p_{11} + p_{11}p_{12} + p_{12}p_{22} + p_{12}p_{21} + p_{21}p_{11} + p_{21}p_{12} + p_{22}p_{21} + p_{22}p_{22} \\
&\quad= (0.20)^2 + (0.20)(0.30) + (0.30)(0.15) + (0.30)(0.32) + (0.32)(0.20) + (0.32)(0.30) + (0.15)(0.32) + (0.15)^2 = 0.4715.
\end{align*}

8. The following is an example of such a transition probability matrix:
\[
P = \begin{pmatrix}
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1/2 & 0 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/3 & 2/3 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}.
\]

9. For $n \ge 1$, let
\[
X_n = \begin{cases}
1 & \text{if the $n$th golfball produced is defective} \\
0 & \text{if the $n$th golfball produced is good.}
\end{cases}
\]
Then $\{X_n\colon n = 1, 2, \ldots\}$ is a Markov chain with state space $\{0, 1\}$ and transition probability matrix
\[
\begin{pmatrix} 15/18 & 3/18 \\ 11/12 & 1/12 \end{pmatrix}.
\]
Let $\pi_0$ be the fraction of golfballs produced that are good, and $\pi_1$ be the fraction of the balls produced that are defective.
Then, by Theorem 12.7, $\pi_0$ and $\pi_1$ satisfy
\[
\begin{pmatrix} \pi_0 \\ \pi_1 \end{pmatrix}
= \begin{pmatrix} 15/18 & 11/12 \\ 3/18 & 1/12 \end{pmatrix}
\begin{pmatrix} \pi_0 \\ \pi_1 \end{pmatrix},
\]
which gives us the following system of equations:
\[
\begin{cases}
\pi_0 = (15/18)\pi_0 + (11/12)\pi_1 \\
\pi_1 = (3/18)\pi_0 + (1/12)\pi_1.
\end{cases}
\]
By choosing any one of these equations along with the relation $\pi_0 + \pi_1 = 1$, we obtain a system of two equations in two unknowns. Solving that system yields $\pi_0 = 11/13 \approx 0.85$ and $\pi_1 = 2/13 \approx 0.15$. Therefore, approximately 15% of the golfballs produced have no logos.

10. Let
\[
X_n = \begin{cases}
1 & \text{if the $n$th ball is drawn by Carmela} \\
2 & \text{if the $n$th ball is drawn by Daniela} \\
3 & \text{if the $n$th ball is drawn by Lucrezia.}
\end{cases}
\]
The process $\{X_n\colon n = 1, 2, \ldots\}$ is an irreducible, aperiodic, positive recurrent Markov chain with transition probability matrix
\[
P = \begin{pmatrix}
7/31 & 11/31 & 13/31 \\
7/31 & 11/31 & 13/31 \\
7/31 & 11/31 & 13/31
\end{pmatrix}.
\]
Let $\pi_1$, $\pi_2$, and $\pi_3$ be the long-run proportions of balls drawn by Carmela, Daniela, and Lucrezia, respectively. Intuitively, it should be clear that these quantities are 7/31, 11/31, and 13/31, respectively. However, that can also be seen by solving the following matrix equation along with $\pi_1 + \pi_2 + \pi_3 = 1$:
\[
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}
= \begin{pmatrix}
7/31 & 7/31 & 7/31 \\
11/31 & 11/31 & 11/31 \\
13/31 & 13/31 & 13/31
\end{pmatrix}
\begin{pmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{pmatrix}.
\]

11. Let $\pi_1$ and $\pi_2$ be the long-run probabilities that Francesco devotes to playing golf and playing tennis, respectively. Then, by Theorem 12.7, $\pi_1$ and $\pi_2$ are obtained from solving the system of equations
\[
\begin{pmatrix} \pi_1 \\ \pi_2 \end{pmatrix}
= \begin{pmatrix} 0.30 & 0.58 \\ 0.70 & 0.42 \end{pmatrix}
\begin{pmatrix} \pi_1 \\ \pi_2 \end{pmatrix}
\]
along with $\pi_1 + \pi_2 = 1$. The matrix equation above gives the following system of equations:
\[
\begin{cases}
\pi_1 = 0.30\pi_1 + 0.58\pi_2 \\
\pi_2 = 0.70\pi_1 + 0.42\pi_2.
\end{cases}
\]
By choosing any one of these equations along with the relation $\pi_1 + \pi_2 = 1$, we obtain a system of two equations in two unknowns.
Solving that system yields $\pi_1 = 0.453125$ and $\pi_2 = 0.546875$. Therefore, the long-run probability that, on a randomly selected day, Francesco plays tennis is approximately 0.55.

12. Suppose that a train leaves the station at $t = 0$. Let $X_1$ be the time until the first passenger arrives at the station after $t = 0$. Let $X_2$ be the additional time it will take until a train arrives at the station, $X_3$ be the time after that until a passenger arrives, and so on. Clearly, $X_1, X_2, \ldots$ are the times between consecutive changes of state. By the memoryless property of exponential random variables, $\{X_1, X_2, \ldots\}$ is a sequence of independent and identically distributed exponential random variables with mean $1/\lambda$. Hence, by Remark 7.2, $\{N(t)\colon t \ge 0\}$ is a Poisson process with rate $\lambda$. Therefore, $N(t)$ is a Poisson random variable with parameter $\lambda t$.

13. Let $X(t)$ be the number of components working at time $t$. Clearly, $\{X(t)\colon t \ge 0\}$ is a continuous-time Markov chain with state space $\{0, 1, 2\}$. Let $\pi_0$, $\pi_1$, and $\pi_2$ be the long-run proportions of time the process is in states 0, 1, and 2, respectively. The balance equations for $\{X(t)\colon t \ge 0\}$ are as follows:
\[
\begin{array}{cl}
\text{State} & \text{Input rate to} = \text{Output rate from} \\
0 & \lambda\pi_1 = \mu\pi_0 \\
1 & 2\lambda\pi_2 + \mu\pi_0 = \mu\pi_1 + \lambda\pi_1 \\
2 & \mu\pi_1 = 2\lambda\pi_2
\end{array}
\]
From these equations, we obtain $\pi_1 = \dfrac{\mu}{\lambda}\pi_0$ and $\pi_2 = \dfrac{\mu^2}{2\lambda^2}\pi_0$. Using $\pi_0 + \pi_1 + \pi_2 = 1$ yields
\[
\pi_0 = \frac{2\lambda^2}{2\lambda^2 + 2\lambda\mu + \mu^2}.
\]
Hence the desired probability is
\[
1 - \pi_0 = \frac{\mu(2\lambda + \mu)}{2\lambda^2 + 2\lambda\mu + \mu^2}.
\]

14. Suppose that every time an out-of-order machine is repaired and is ready to operate a birth occurs. Suppose that a death occurs every time that a machine breaks down. The fact that $\{X(t)\colon t \ge 0\}$ is a birth and death process should be clear. The birth and death rates are
\[
\lambda_n = \begin{cases}
k\lambda & n = 0, 1, \ldots, m+s-k \\
(m+s-n)\lambda & n = m+s-k+1,\ m+s-k+2,\ \ldots,\ m+s \\
0 & n \ge m+s;
\end{cases}
\qquad
\mu_n = \begin{cases}
n\mu & n = 0, 1, \ldots, m \\
m\mu & n = m+1,\ m+2,\ \ldots,\ m+s \\
0 & n > m+s.
\end{cases}
\]

15. Let $X(t)$ be the number of machines operating at time $t$. For $0 \le i \le m$, let $\pi_i$ be the long-run proportion of time that there are exactly $i$ machines operating. Suppose that a birth occurs each time that an out-of-order machine is repaired and begins to operate, and a death occurs each time that a machine breaks down. Then $\{X(t)\colon t \ge 0\}$ is a birth and death process with state space $\{0, 1, \ldots, m\}$, and birth and death rates, respectively, given by $\lambda_i = (m-i)\lambda$ and $\mu_i = i\mu$ for $i = 0, 1, \ldots, m$. To find $\pi_0$, first we will calculate the following sum:
\begin{align*}
\sum_{i=1}^{m}\frac{\lambda_0\lambda_1\cdots\lambda_{i-1}}{\mu_1\mu_2\cdots\mu_i}
&= \sum_{i=1}^{m}\frac{(m\lambda)\bigl[(m-1)\lambda\bigr]\bigl[(m-2)\lambda\bigr]\cdots\bigl[(m-i+1)\lambda\bigr]}{\mu(2\mu)(3\mu)\cdots(i\mu)}
= \sum_{i=1}^{m}\frac{{}_mP_i\,\lambda^i}{i!\,\mu^i} \\
&= \sum_{i=1}^{m}\binom{m}{i}\Bigl(\frac{\lambda}{\mu}\Bigr)^i
= -1 + \sum_{i=0}^{m}\binom{m}{i}\Bigl(\frac{\lambda}{\mu}\Bigr)^i 1^{m-i}
= -1 + \Bigl(1 + \frac{\lambda}{\mu}\Bigr)^m,
\end{align*}
where ${}_mP_i$ is the number of $i$-element permutations of a set containing $m$ objects. Hence, by (12.22),
\[
\pi_0 = \Bigl(1 + \frac{\lambda}{\mu}\Bigr)^{-m} = \Bigl(\frac{\lambda+\mu}{\mu}\Bigr)^{-m} = \Bigl(\frac{\mu}{\lambda+\mu}\Bigr)^{m}.
\]
By (12.21),
\begin{align*}
\pi_i &= \frac{\lambda_0\lambda_1\cdots\lambda_{i-1}}{\mu_1\mu_2\cdots\mu_i}\,\pi_0
= \frac{{}_mP_i\,\lambda^i}{i!\,\mu^i}\,\pi_0
= \binom{m}{i}\Bigl(\frac{\lambda}{\mu}\Bigr)^i\Bigl(\frac{\mu}{\lambda+\mu}\Bigr)^m \\
&= \binom{m}{i}\Bigl(\frac{\lambda}{\mu}\Bigr)^i\Bigl(\frac{\mu}{\lambda+\mu}\Bigr)^i\Bigl(\frac{\mu}{\lambda+\mu}\Bigr)^{m-i}
= \binom{m}{i}\Bigl(\frac{\lambda}{\lambda+\mu}\Bigr)^i\Bigl(1 - \frac{\lambda}{\lambda+\mu}\Bigr)^{m-i}, \quad 0 \le i \le m.
\end{align*}
Therefore, in steady-state, the number of machines that are operating is binomial with parameters $m$ and $\lambda/(\lambda+\mu)$.

16. Let $X(t)$ be the number of cars at the center, either being inspected or waiting to be inspected, at time $t$. Clearly, $\{X(t)\colon t \ge 0\}$ is a birth and death process with rates $\lambda_n = \lambda/(n+1)$, $n \ge 0$, and $\mu_n = \mu$, $n \ge 1$. Since
\[
\sum_{n=1}^{\infty}\frac{\lambda_0\lambda_1\cdots\lambda_{n-1}}{\mu_1\mu_2\cdots\mu_n}
= \sum_{n=1}^{\infty}\frac{\lambda\cdot\dfrac{\lambda}{2}\cdot\dfrac{\lambda}{3}\cdots\dfrac{\lambda}{n}}{\mu^n}
= -1 + \sum_{n=0}^{\infty}\frac{1}{n!}\Bigl(\frac{\lambda}{\mu}\Bigr)^n = e^{\lambda/\mu} - 1,
\]
by (12.18), $\pi_0 = e^{-\lambda/\mu}$. Hence, by (12.17),
\[
\pi_n = \frac{\lambda\cdot\dfrac{\lambda}{2}\cdot\dfrac{\lambda}{3}\cdots\dfrac{\lambda}{n}}{\mu^n}\,e^{-\lambda/\mu}
= \frac{(\lambda/\mu)^n e^{-\lambda/\mu}}{n!}, \quad n \ge 0.
\]
Therefore, the long-run probability that there are $n$ cars at the center for inspection is Poisson with rate $\lambda/\mu$.

17. Let $X(t)$ be the population size at time $t$. Then $\{X(t)\colon t \ge 0\}$ is a birth and death process with birth rates $\lambda_n = n\lambda$, $n \ge 1$, and death rates $\mu_n = n\mu$, $n \ge 0$. For $i \ge 0$, let $H_i$ be the time, starting from $i$, until the population size reaches $i+1$ for the first time. We are interested in $\sum_{i=1}^{4} E(H_i)$. Note that, by Lemma 12.2,
\[
E(H_i) = \frac{1}{\lambda_i} + \frac{\mu_i}{\lambda_i}E(H_{i-1}), \quad i \ge 1.
\]
Since $E(H_0) = 1/\lambda$,
\begin{align*}
E(H_1) &= \frac{1}{\lambda} + \frac{\mu}{\lambda}\cdot\frac{1}{\lambda} = \frac{1}{\lambda} + \frac{\mu}{\lambda^2}, \\
E(H_2) &= \frac{1}{2\lambda} + \frac{2\mu}{2\lambda}\cdot\Bigl(\frac{1}{\lambda} + \frac{\mu}{\lambda^2}\Bigr)
= \frac{1}{2\lambda} + \frac{\mu}{\lambda^2} + \frac{\mu^2}{\lambda^3}, \\
E(H_3) &= \frac{1}{3\lambda} + \frac{3\mu}{3\lambda}\Bigl(\frac{1}{2\lambda} + \frac{\mu}{\lambda^2} + \frac{\mu^2}{\lambda^3}\Bigr)
= \frac{1}{3\lambda} + \frac{\mu}{2\lambda^2} + \frac{\mu^2}{\lambda^3} + \frac{\mu^3}{\lambda^4}, \\
E(H_4) &= \frac{1}{4\lambda} + \frac{4\mu}{4\lambda}\Bigl(\frac{1}{3\lambda} + \frac{\mu}{2\lambda^2} + \frac{\mu^2}{\lambda^3} + \frac{\mu^3}{\lambda^4}\Bigr)
= \frac{1}{4\lambda} + \frac{\mu}{3\lambda^2} + \frac{\mu^2}{2\lambda^3} + \frac{\mu^3}{\lambda^4} + \frac{\mu^4}{\lambda^5}.
\end{align*}
Therefore, the answer is
\[
\sum_{i=1}^{4} E(H_i) = \frac{25\lambda^4 + 34\lambda^3\mu + 30\lambda^2\mu^2 + 24\lambda\mu^3 + 12\mu^4}{12\lambda^5}.
\]

18. Let $X(t)$ be the population size at time $t$. Then $\{X(t)\colon t \ge 0\}$ is a birth and death process with rates $\lambda_n = \gamma$, $n \ge 0$, and $\mu_n = n\mu$, $n \ge 1$.
To find the $\pi_i$'s, we will first calculate the sum in the relation (12.18):
\[
\sum_{n=1}^{\infty}\frac{\lambda_0\lambda_1\cdots\lambda_{n-1}}{\mu_1\mu_2\cdots\mu_n}
= \sum_{n=1}^{\infty}\frac{\gamma^n}{n!\,\mu^n}
= -1 + \sum_{n=0}^{\infty}\frac{1}{n!}\Bigl(\frac{\gamma}{\mu}\Bigr)^n = -1 + e^{\gamma/\mu}.
\]
Thus, by (12.18), $\pi_0 = e^{-\gamma/\mu}$ and, by (12.17), for $i \ge 1$,
\[
\pi_i = \frac{\gamma^i}{i!\,\mu^i}\,e^{-\gamma/\mu} = \frac{(\gamma/\mu)^i e^{-\gamma/\mu}}{i!}.
\]
Hence the steady-state probability mass function of the population size is Poisson with parameter $\gamma/\mu$.

19. By applying Theorem 12.9 to $\{Y(t)\colon t \ge 0\}$ with $t_1 = 0$, $t_2 = t$, $y_1 = 0$, $y_2 = y$, and $t = s$, we have
\[
E\bigl[Y(s) \mid Y(t) = y\bigr] = 0 + \frac{y - 0}{t - 0}(s - 0) = \frac{s}{t}\,y,
\]
and
\[
\operatorname{Var}\bigl[Y(s) \mid Y(t) = y\bigr] = \sigma^2\cdot\frac{(t-s)(s-0)}{t-0} = \sigma^2 (t-s)\frac{s}{t}.
\]

20. First, suppose that $s$ […] $y) = P\bigl(\text{no zeros in } (x, y)\bigr) = 1 - \dfrac{2}{\pi}\arccos\sqrt{\dfrac{x}{y}}$.

22. Let the current price of the stock, per share, be $v_0$. Noting that $\sqrt{27.04} = 5.2$, we have
\[
V(t) = v_0\,e^{3t + 5.2W(t)},
\]
where $\{W(t)\colon t \ge 0\}$ is a standard Brownian motion. Hence $W(t) \sim N(0,t)$. The desired probability is calculated as follows:
\begin{align*}
P\bigl(V(2) \ge 2v_0\bigr) &= P\bigl(v_0\,e^{6 + 5.2W(2)} \ge 2v_0\bigr) = P\bigl(6 + 5.2W(2) \ge \ln 2\bigr) \\
&= P\bigl(W(2) \ge -1.02\bigr) = P\Bigl(\frac{W(2) - 0}{\sqrt{2}} \ge -0.72\Bigr) \\
&= P(Z \ge -0.72) = 1 - P(Z < -0.72) = 1 - \Phi(-0.72) = 0.7642.
\end{align*}
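The two geometric Brownian motion probabilities worked out above (Problem 11 of Section 12.5 and Review Problem 22) follow the same recipe: take logarithms, subtract the drift term, and standardize $W(t) \sim N(0,t)$. The sketch below repeats that recipe in Python; the helper names `std_normal_cdf` and `gbm_prob_below` are hypothetical, and the normal distribution function is computed from `math.erf` rather than read from a table, so the results differ slightly from the manual's table-rounded values.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gbm_prob_below(v0: float, drift: float, vol: float, t: float, level: float) -> float:
    """P(V(t) < level) for V(t) = v0 * exp(drift * t + vol * W(t)), W a standard Brownian motion."""
    # V(t) < level  <=>  W(t) < (ln(level / v0) - drift * t) / vol, and W(t) / sqrt(t) ~ N(0, 1).
    z = (math.log(level / v0) - drift * t) / (vol * math.sqrt(t))
    return std_normal_cdf(z)


if __name__ == "__main__":
    # Problem 11: V(t) = 95 exp(-2t + 2.3 W(t)), P(V(0.75) < 80).
    p11 = gbm_prob_below(v0=95.0, drift=-2.0, vol=2.3, t=0.75, level=80.0)
    # Review Problem 22: V(t) = v0 exp(3t + 5.2 W(t)), P(V(2) >= 2 v0); v0 cancels, so use v0 = 1.
    p22 = 1.0 - gbm_prob_below(v0=1.0, drift=3.0, vol=5.2, t=2.0, level=2.0)
    print(p11, p22)
```

Both values should land within a few thousandths of the manual's 0.7486 and 0.7642; the small gap comes from rounding the standardized value to two decimals before the table lookup.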

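The birth and death solutions in the Chapter 12 review problems all use the same product formula, $\pi_i \propto \lambda_0\lambda_1\cdots\lambda_{i-1}/(\mu_1\mu_2\cdots\mu_i)$. As a cross-check of Review Problem 15, the sketch below builds the stationary distribution from that formula with exact rational arithmetic and compares it with the binomial probabilities. The function names and the sample values of $m$, $\lambda$, and $\mu$ are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def birth_death_stationary(birth, death):
    """Stationary distribution of a finite birth and death process.

    birth[i] is lambda_i and death[i] is mu_{i+1}, for i = 0, ..., m - 1.
    """
    weights = [Fraction(1)]
    for lam, mu in zip(birth, death):
        # weight_{i+1} = weight_i * lambda_i / mu_{i+1}
        weights.append(weights[-1] * Fraction(lam) / Fraction(mu))
    total = sum(weights)
    return [w / total for w in weights]


def machine_repair_stationary(m, lam, mu):
    """Problem 15 rates: lambda_i = (m - i) * lam and mu_i = i * mu."""
    birth = [(m - i) * lam for i in range(m)]   # lambda_0, ..., lambda_{m-1}
    death = [(i + 1) * mu for i in range(m)]    # mu_1, ..., mu_m
    return birth_death_stationary(birth, death)


def binomial_pmf(m, q, i):
    """Binomial(m, q) probability of exactly i successes."""
    return comb(m, i) * q ** i * (1 - q) ** (m - i)


if __name__ == "__main__":
    m, lam, mu = 5, Fraction(2), Fraction(3)    # illustrative values only
    pi = machine_repair_stationary(m, lam, mu)
    q = lam / (lam + mu)
    print(all(pi[i] == binomial_pmf(m, q, i) for i in range(m + 1)))
```

Because `Fraction` arithmetic is exact, the comparison is an equality test rather than a tolerance check. The same `birth_death_stationary` helper can be applied to a truncated version of the rates in Problems 16 and 18 to see the Poisson shape emerge.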
[…] $n + m)\big/P(X > m) = P(X > n)$. Therefore,
\[
P(X > n + m) = P(X > n)\,P(X > m). \tag{19}
\]
Let $p = P(X = 1)$; using induction, we prove that (18) is valid for all positive integers $n$. To show (18) for $n = 2$, note that (19) implies that
\[
P(X > 2) = P(X > 1)\,P(X > 1).
\]
Since $P(X > 1) = 1 - P(X = 1) = 1 - p$, this relation gives
\[
1 - P(X = 1) - P(X = 2) = (1-p)^2, \quad\text{or}\quad 1 - p - P(X = 2) = (1-p)^2,
\]
which yields $P(X = 2) = p(1-p)$, so (18) is also true for $n = 2$. Now assume that (18) is valid for all positive integers $i$, $i \le n$; that is, assume that
\[
P(X = i) = p(1-p)^{i-1}, \quad i \le n. \tag{20}
\]
We will show that (18) is true for $n + 1$. The induction hypothesis [relation (20)] implies that
\[
P(X \le n) = \sum_{i=1}^{n} P(X = i) = \sum_{i=1}^{n} p(1-p)^{i-1} = p\,\frac{1 - (1-p)^n}{1 - (1-p)} = 1 - (1-p)^n.
\]
So $P(X > n) = (1-p)^n$ and, similarly, $P(X > n-1) = (1-p)^{n-1}$. Now (19) yields
\[
P(X > n+1) = P(X > n)\,P(X > 1),
\]
which implies that
\[
1 - P(X \le n) - P(X = n+1) = (1-p)^n (1-p).
\]
Substituting $P(X \le n) = 1 - (1-p)^n$ in this relation, we obtain
\[
P(X = n+1) = p(1-p)^n,
\]
which establishes (18) for $n + 1$. Therefore, we have what we wanted to show.

23. Consider a coin for which the probability of tails is $1 - p$ and the probability of heads is $p$. In successive and independent flips of the coin, let $X_1$ be the number of flips until the first head, $X_2$ be the total number of flips until the second head, $X_3$ be the total number of flips until the third head, and so on. Then the length of the first character of the message and $X_1$ are identically distributed. The total number of the bits forming the first two characters of the message and $X_2$ are identically distributed. The total number of the bits forming the first three characters of the message and $X_3$ are identically distributed, and so on. Therefore, the total number of the bits forming the message has the same distribution as $X_k$. This is negative binomial with parameters $k$ and $p$.
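The argument in Problem 23 amounts to saying that a sum of $k$ independent geometric lengths is negative binomial. A minimal sketch of that claim, assuming each character length is geometric with parameter $p$ as the solution states: convolve the geometric probability mass function with itself $k$ times and compare against $\binom{n-1}{k-1}p^k(1-p)^{n-k}$. All function names here are hypothetical, and exact fractions are used so the comparison is an equality.

```python
from fractions import Fraction
from math import comb


def geometric_pmf(p, n_max):
    """List g with g[i] = P(X = i) = p (1 - p)^(i - 1) for 1 <= i <= n_max, and g[0] = 0."""
    return [Fraction(0)] + [p * (1 - p) ** (i - 1) for i in range(1, n_max + 1)]


def convolve(a, b, n_max):
    """Probability mass function of the sum of two independent variables, truncated at n_max."""
    out = [Fraction(0)] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n_max:
                break
            out[i + j] += ai * bj
    return out


def message_length_pmf(k, p, n_max):
    """Distribution of the total bit count of k characters with geometric lengths."""
    geo = geometric_pmf(p, n_max)
    total = geo
    for _ in range(k - 1):
        total = convolve(total, geo, n_max)
    return total


def negative_binomial_pmf(k, p, n):
    """P(X_k = n): the k-th head occurs on flip n."""
    if n < k:
        return Fraction(0)
    return comb(n - 1, k - 1) * p ** k * (1 - p) ** (n - k)


if __name__ == "__main__":
    p, k, n_max = Fraction(1, 3), 3, 12       # illustrative values only
    pmf = message_length_pmf(k, p, n_max)
    print(all(pmf[n] == negative_binomial_pmf(k, p, n) for n in range(n_max + 1)))
```

Truncating at `n_max` does not affect the entries up to `n_max`, since every term of the convolution that contributes to index $n$ uses indices no larger than $n$.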
24. Let $X$ be the number of cartons to be opened before finding one without rotten eggs. $X$ is not a geometric random variable because the number of cartons is limited, and one carton not having rotten eggs is not independent of another carton not having rotten eggs. However, it should be obvious that a geometric random variable with parameter
\[
p = \binom{1000}{12}\bigg/\binom{1200}{12} = 0.1109
\]
is a good approximation for $X$. Therefore, we should expect approximately $1/p = 1/0.1109 = 9.015$ cartons to be opened before finding one without rotten eggs.

25. Either the $N$th success should occur on the $(2N - M)$th trial or the $N$th failure should occur on the $(2N - M)$th trial. By symmetry, the answer is
\[
2\cdot\binom{2N-M-1}{N-1}\Bigl(\frac{1}{2}\Bigr)^{N}\Bigl(\frac{1}{2}\Bigr)^{N-M}
= \binom{2N-M-1}{N-1}\Bigl(\frac{1}{2}\Bigr)^{2N-M-1}.
\]

26. The desired quantity is 2 times the probability of exactly $N$ successes in $(2N - 1)$ trials and failures on the $(2N)$th and $(2N+1)$st trials:
\[
2\binom{2N-1}{N}\Bigl(\frac{1}{2}\Bigr)^{N}\Bigl(1 - \frac{1}{2}\Bigr)^{(2N-1)-N}\cdot\Bigl(1 - \frac{1}{2}\Bigr)^{2}
= \binom{2N-1}{N}\Bigl(\frac{1}{2}\Bigr)^{2N}.
\]

27. Let $X$ be the number of rolls until Adam gets a six. Let $Y$ be the number of rolls of the die until Andrew rolls an odd number.
Since the events $(X = i)$, $1 \le i < \infty$, form a partition of the sample space, by Theorem 3.4,
\begin{align*}
P(Y > X) &= \sum_{i=1}^{\infty} P(Y > X \mid X = i)\,P(X = i) = \sum_{i=1}^{\infty} P(Y > i)\,P(X = i) \\
&= \sum_{i=1}^{\infty}\Bigl(\frac{1}{2}\Bigr)^{i}\cdot\Bigl(\frac{5}{6}\Bigr)^{i-1}\frac{1}{6}
= \frac{6}{5}\cdot\frac{1}{6}\sum_{i=1}^{\infty}\Bigl(\frac{5}{12}\Bigr)^{i}
= \frac{1}{5}\cdot\frac{\frac{5}{12}}{1 - \frac{5}{12}} = \frac{1}{7},
\end{align*}
where $P(Y > i) = (1/2)^i$ since, for $Y$ to be greater than $i$, Andrew must obtain an even number on each of the first $i$ rolls.

28. The probability of 4 tagged trout among the second 50 trout caught is
\[
p_n = \frac{\dbinom{50}{4}\dbinom{n-50}{46}}{\dbinom{n}{50}}.
\]
It is logical to find the value of $n$ for which $p_n$ is maximum. (In statistics this value is called the maximum likelihood estimate for the number of trout in the lake.) To do this, note that
\[
\frac{p_n}{p_{n-1}} = \frac{(n-50)^2}{n(n-96)}.
\]
Now $p_n \ge p_{n-1}$ if and only if $(n-50)^2 \ge n(n-96)$, or $n \le 625$. Therefore, $n = 625$ makes $p_n$ maximum, and hence there are approximately 625 trout in the lake.

29. (a) Intuitively, it should be clear that the answer is $D/N$. To prove this, let $E_j$ be the event of obtaining exactly $j$ defective items among the first $(k-1)$ draws. Let $A_k$ be the event that the $k$th item drawn is defective. We have
\[
P(A_k) = \sum_{j=0}^{k-1} P(A_k \mid E_j)\,P(E_j)
= \sum_{j=0}^{k-1}\frac{D-j}{N-k+1}\cdot\frac{\dbinom{D}{j}\dbinom{N-D}{k-1-j}}{\dbinom{N}{k-1}}.
\]
Now
\[
(D-j)\binom{D}{j} = D\binom{D-1}{j}
\quad\text{and}\quad
(N-k+1)\binom{N}{k-1} = N\binom{N-1}{k-1}.
\]
Therefore,
\[
P(A_k) = \sum_{j=0}^{k-1}\frac{D\dbinom{D-1}{j}\dbinom{N-D}{k-1-j}}{N\dbinom{N-1}{k-1}}
= \frac{D}{N}\sum_{j=0}^{k-1}\frac{\dbinom{D-1}{j}\dbinom{N-D}{k-1-j}}{\dbinom{N-1}{k-1}} = \frac{D}{N},
\]
where
\[
\sum_{j=0}^{k-1}\frac{\dbinom{D-1}{j}\dbinom{N-D}{k-1-j}}{\dbinom{N-1}{k-1}} = 1
\]
since $\dbinom{D-1}{j}\dbinom{N-D}{k-1-j}\bigg/\dbinom{N-1}{k-1}$ is the probability mass function of a hypergeometric random variable with parameters $N-1$, $D-1$, and $k-1$.

(b) Intuitively, it should be clear that the answer is $(D-1)/(N-1)$. To prove this, let $A_k$ be as before and let $F_j$ be the event of exactly $j$ defective items among the first $(k-2)$ draws. Let $B$ be the event that the $(k-1)$st and the $k$th items drawn are defective. We have
\begin{align*}
P(B) &= \sum_{j=0}^{k-2} P(B \mid F_j)\,P(F_j) \\
&= \sum_{j=0}^{k-2}\frac{(D-j)(D-j-1)}{(N-k+2)(N-k+1)}\cdot\frac{\dbinom{D}{j}\dbinom{N-D}{k-2-j}}{\dbinom{N}{k-2}} \\
&= \sum_{j=0}^{k-2}\frac{D(D-1)\dbinom{D-2}{j}\dbinom{N-D}{k-2-j}}{N(N-1)\dbinom{N-2}{k-2}} \\
&= \frac{D(D-1)}{N(N-1)}\sum_{j=0}^{k-2}\frac{\dbinom{D-2}{j}\dbinom{N-D}{k-2-j}}{\dbinom{N-2}{k-2}} = \frac{D(D-1)}{N(N-1)}.
\end{align*}
Using this, we have that the desired probability is
\[
P(A_k \mid A_{k-1}) = \frac{P(A_k A_{k-1})}{P(A_{k-1})} = \frac{P(B)}{P(A_{k-1})}
= \frac{\dfrac{D(D-1)}{N(N-1)}}{\dfrac{D}{N}} = \frac{D-1}{N-1}.
\]

REVIEW PROBLEMS FOR CHAPTER 5

1. $\displaystyle\sum_{i=12}^{20}\binom{20}{i}(0.25)^i(0.75)^{20-i} = 0.0009$.

2.
$N(t)$, the number of customers arriving at the post office at or prior to $t$, is a Poisson process with $\lambda = 1/3$. Thus
\[
P\bigl(N(30) \le 6\bigr) = \sum_{i=0}^{6} P\bigl(N(30) = i\bigr) = \sum_{i=0}^{6}\frac{e^{-(1/3)30}\bigl[(1/3)30\bigr]^i}{i!} = 0.130141.
\]

3. $4\cdot\dfrac{8}{30} = 1.067$.

4. $\displaystyle\sum_{i=0}^{2}\binom{12}{i}(0.30)^i(0.70)^{12-i} = 0.253$.

5. $\dbinom{5}{2}(0.18)^2(0.82)^3 = 0.179$.

6. $\displaystyle\sum_{i=2}^{1999}\binom{i-1}{2-1}\Bigl(\frac{1}{1000}\Bigr)^2\Bigl(\frac{999}{1000}\Bigr)^{i-2} = 0.59386$.

7. $\displaystyle\sum_{i=7}^{12}\frac{\dbinom{160}{i}\dbinom{200}{12-i}}{\dbinom{360}{12}} = 0.244$.

8. Call a train that arrives between 10:15 A.M. and 10:28 A.M. a success. Then $p$, the probability of success, is
\[
p = \frac{28 - 15}{60} = \frac{13}{60}.
\]
Therefore, the expected value and the variance of the number of trains that arrive in the given period are $10(13/60) = 2.167$ and $10(13/60)(47/60) = 1.697$, respectively.

9. The number of checks returned during the next two days is Poisson with $\lambda = 6$. The desired probability is
\[
P(X \le 4) = \sum_{i=0}^{4}\frac{e^{-6}6^i}{i!} = 0.285.
\]

10. Suppose that 5% of the items are defective. Under this hypothesis, there are $500(0.05) = 25$ defective items. The probability of two defective items among 30 items selected at random is
\[
\frac{\dbinom{25}{2}\dbinom{475}{28}}{\dbinom{500}{30}} = 0.268.
\]
Therefore, under the above hypothesis, having two defective items among 30 items selected at random is quite probable. The shipment should not be rejected.

11. $N$ is a geometric random variable with $p = 1/2$. So $E(N) = 1/p = 2$, and $\operatorname{Var}(N) = (1-p)/p^2 = \bigl[1 - (1/2)\bigr]/(1/4) = 2$.

12. $\Bigl(\dfrac{5}{6}\Bigr)^5\Bigl(\dfrac{1}{6}\Bigr) = 0.067$.

13.
The number of times a message is transmitted or retransmitted is geometric with parameter $1 - p$. Therefore, the expected value of the number of transmissions and retransmissions of a message is $1/(1-p)$. Hence the expected number of retransmissions of a message is
\[
\frac{1}{1-p} - 1 = \frac{p}{1-p}.
\]

14. Call a customer a "success" if he or she will make a purchase using a credit card. Let $E$ be the event that a customer entering the store will make a purchase. Let $F$ be the event that the customer will use a credit card. To find $p$, the probability of success, we use the law of multiplication:
\[
p = P(EF) = P(E)\,P(F \mid E) = (0.30)(0.85) = 0.255.
\]
The random variable $X$ is binomial with parameters 6 and 0.255. Hence
\[
P(X = i) = \binom{6}{i}(0.255)^i(1 - 0.255)^{6-i}, \quad i = 0, 1, \ldots, 6.
\]
Clearly, $E(X) = np = 6(0.255) = 1.53$ and
\[
\operatorname{Var}(X) = np(1-p) = 6(0.255)(1 - 0.255) = 1.13985.
\]

15. $\displaystyle\sum_{i=3}^{5}\frac{\dbinom{18}{i}\dbinom{10}{5-i}}{\dbinom{28}{5}} = 0.772$.

16. By the formula for the expected value of a hypergeometric random variable, the desired quantity is $(5 \times 6)/16 = 1.875$.

17. We want to find the probability that at most 4 of the seeds do not germinate:
\[
\sum_{i=0}^{4}\binom{40}{i}(0.06)^i(0.94)^{40-i} = 0.91.
\]

18. $1 - \displaystyle\sum_{i=0}^{2}\binom{20}{i}(0.06)^i(0.94)^{20-i} = 0.115$.

Let $X$ be the number of requests for reservations at the end of the second day. It is reasonable to assume that $X$ is Poisson with parameter $3 \times 3 \times 2 = 18$. Hence the desired probability is
\[
P(X \ge 24) = 1 - \sum_{i=0}^{23} P(X = i) = 1 - \sum_{i=0}^{23}\frac{e^{-18}(18)^i}{i!} = 1 - 0.89889 = 0.10111.
\]

19. Suppose that the company's claim is correct.
Then the probability of 12 or fewer drivers using seat belts regularly is
\[
\sum_{i=0}^{12}\binom{20}{i}(0.70)^i(0.30)^{20-i} \approx 0.228.
\]
Therefore, under the assumption that the company's claim is true, it is quite likely that out of 20 randomly selected drivers, 12 use seat belts. This is not reasonable evidence to conclude that the insurance company's claim is false.

20. (a) $(0.999)^{999}(0.001)^1 = 0.000368$.
(b) $\dbinom{2999}{2}(0.001)^3(0.999)^{2997} = 0.000224$.

21. Let $X$ be the number of children having the disease. We have that the desired probability is
\[
P(X = 3 \mid X \ge 1) = \frac{P(X = 3)}{P(X \ge 1)} = \frac{\dbinom{5}{3}(0.23)^3(0.77)^2}{1 - (0.77)^5} = 0.0989.
\]

22. (a) $\Bigl(\dfrac{w}{w+b}\Bigr)^{n-1}\Bigl(\dfrac{b}{w+b}\Bigr)$.
(b) $\Bigl(\dfrac{w}{w+b}\Bigr)^{n-1}$.

23. Let $n$ be the desired number of seeds to be planted. Let $X$ be the number of seeds which will germinate. We have that $X$ is binomial with parameters $n$ and 0.75. We want to find the smallest $n$ for which
\[
P(X \ge 5) \ge 0.90,
\]
or, equivalently, $P(X < 5) \le 0.10$. That is, we want to find the smallest $n$ for which
\[
\sum_{i=0}^{4}\binom{n}{i}(0.75)^i(0.25)^{n-i} \le 0.10.
\]
By trial and error, as the following table shows, we find that the smallest $n$ satisfying $P(X < 5) \le 0.10$ is 9. So at least nine seeds are to be planted.
\[
\begin{array}{cc}
n & \sum_{i=0}^{4}\binom{n}{i}(0.75)^i(0.25)^{n-i} \\
5 & 0.7627 \\
6 & 0.4661 \\
7 & 0.2436 \\
8 & 0.1139 \\
9 & 0.0489
\end{array}
\]

24. Intuitively, it must be clear that the answer is $k/n$. To prove this, let $B$ be the event that the $i$th baby born is blonde. Let $A$ be the event that $k$ of the $n$ babies are blondes. We have
\[
P(B \mid A) = \frac{P(AB)}{P(A)}
= \frac{p\cdot\dbinom{n-1}{k-1}p^{k-1}(1-p)^{n-k}}{\dbinom{n}{k}p^k(1-p)^{n-k}}
= \frac{\dbinom{n-1}{k-1}}{\dbinom{n}{k}} = \frac{k}{n}.
\]

25.
The size of a seed is a tiny fraction of the size of the area. Let us divide the area up into many small cells each about the size of a seed. Assume that, when the seeds are distributed, each of them will land in a single cell. Accordingly, the number of seeds distributed will equal the number of nonempty cells. Suppose that each cell has an equal chance of having a seed independent of other cells (this is only approximately true). Since $\lambda$ is the average number of seeds per unit area, the expected number of seeds in the area, $A$, is $\lambda A$. Let us call a cell in $A$ a "success" if it is occupied by a seed. Let $n$ be the total number of cells in $A$ and $p$ be the probability that a cell will contain a seed. Then $X$, the number of cells in $A$ with seeds, is a binomial random variable with parameters $n$ and $p$. Using the formula for the expected number of successes in a binomial distribution ($= np$), we see that $np = \lambda A$ and $p = \lambda A/n$. As $n$ goes to infinity, $p$ approaches zero while $np$ remains finite. Hence the number of seeds that fall on the area $A$ is a Poisson random variable with parameter $\lambda A$ and
\[
P(X = i) = \frac{e^{-\lambda A}(\lambda A)^i}{i!}.
\]

26. Let $D/N \to p$; then, by Remark 5.2, for all $n$,
\[
\frac{\dbinom{D}{x}\dbinom{N-D}{n-x}}{\dbinom{N}{n}} \approx \binom{n}{x}p^x(1-p)^{n-x}.
\]
Now since $n \to \infty$ and $nD/N \to \lambda$, $n$ is large and $np$ is appreciable, thus
\[
\binom{n}{x}p^x(1-p)^{n-x} \approx \frac{e^{-\lambda}\lambda^x}{x!}.
\]

Chapter 6 Continuous Random Variables

6.1 PROBABILITY DENSITY FUNCTIONS

1. (a) $\displaystyle\int_0^{\infty} c\,e^{-3x}\,dx = 1 \implies c = 3$.
(b) P(0
