Applied Statistics
Chapter 8 Sampling Distributions and Estimation

8.1 a. [equation] [equation] b. [equation] [equation] c. [equation] [equation]
8.2 a. [equation] = [equation] or (196.08, 203.92). b. [equation] = [equation] or (990.20, 1009.80). c. [equation] = [equation] or (49.608, 50.392).
8.3 a. [equation] = [equation] or (4.0252, 4.0448). b. [equation] = [equation] or (4.03304, 4.03696). c. In either case, we would conclude that our sample came from a population that did not have a population mean equal to 4.035.
8.4 a. 1. No, for n = 1 the 100 samples do not follow a normal distribution. 2. The distribution of the sample means becomes more nearly normal as n increases. 3. The standard error gets closer to the value predicted by the CLT as the sample size increases. 4. This demonstration shows that if numerous samples were taken and analyzed, we could confirm the CLT. In the real world, we can assess this based on our notion of the true mean: we can generate the 95% range and determine whether our values fall within it. Also, recognize that there is a low probability of this single range not being representative. b. 1. No, for n = 1 the 100 samples do not follow a normal distribution. 2. The distribution of the sample means becomes more nearly normal as n increases. 3. The standard error gets closer to the value predicted by the CLT as the sample size increases. 4. This demonstration shows that if numerous samples were taken and analyzed, we could confirm the CLT. In the real world, we can assess this based on our notion of the true mean: we can generate the 95% range and determine whether our values fall within it. Also, recognize that there is a low probability of this single range not being representative.
8.5 a.
[equation] = [equation] or (11.057, 16.943). b. [equation] = [equation] or (33.675, 40.325). c. [equation] = [equation] or (115.12, 126.88).
8.6 Exam 1: [equation] = [equation] or (70.661, 79.339). Exam 2: [equation] = [equation] or (74.661, 83.339). Exam 3: [equation] = [equation] or (60.661, 69.339). The first two confidence intervals overlap. This suggests that the first two exams may have had the same population mean.
8.7 [equation] = [equation] or (2.4725, 2.4775).
8.8 a. [equation] = [equation] or (21.797, 26.203). b. [equation] = [equation] or (37.901, 46.099). c. [equation] = [equation] or (113.571, 124.429). Note: t values are found using the Excel formula =tinv((1-cc), n-1), where cc is the confidence coefficient. For part a, this would be =tinv((1-.90), 6).
8.9 a. Appendix D = 2.262, Excel = tinv(.05, 9) = 2.2622 b. Appendix D = 2.602, Excel = tinv(.02, 15) = 2.6025 c. Appendix D = 1.678, Excel = tinv(.10, 47) = 1.6779
8.10 a. Appendix D = 2.021, Excel = tinv(.05, 40) = 2.0211 b. Appendix D = 1.990, Excel = tinv(.05, 80) = 1.9901 c. Appendix D = 1.984, Excel = tinv(.05, 100) = 1.984. All are fairly close to 1.96.
8.11 a. [equation] = [equation] or (33.01, 58.31). b. The confidence interval could be made narrower by increasing the sample size or decreasing the confidence level.
8.12 a. [equation] = [equation] or (18.276, 21.474). Note: t values are found using the Excel formula =tinv(((1-cc)/2), n-1), where cc is the confidence coefficient. For part a, this would be =tinv(((1-.90)/2), 15).
8.13 [equation] = [equation] or (742.20, 882.80).
8.14 a. [equation] = [equation] or (11971, 37069). b. Increase the sample size or decrease the confidence level. c.
It is unclear whether this distribution is normal or not. There appear to be outliers.
8.15 a. 1. [equation] = [equation] or (81.873, 88.127). 2. [equation] = [equation] or (82.787, 94.414). 3. [equation] = [equation] or (73.345, 78.655). b. Confidence intervals 1 and 2 overlap. The scores on exam 3 are very different from those on the first two: the average exam score decreased on the third exam. c. Here the standard deviation is not known, so use the t distribution.
8.16 a. [equation] = [equation] = 0.0913 b. [equation] = [equation] = 0.0566 c. [equation] = [equation] = 0.30 d. [equation] = [equation] = .0032. Normality is okay except in d.
8.17 a. [equation] = [equation] = .0620 b. [equation] = [equation] = .0877 c. [equation] = [equation] = .1216
8.18 a. [equation] = [equation] = (.0293, .0667) b. Yes, .048*500 = 24, which is larger than 10. c. The Very Quick Rule would not work well here because p is small.
8.19 a. [equation] = [equation] or (.2556, .4752) b. .3654*52 = 19. The normality assumption is met.
8.20 a. [equation] = [equation] or (.2948, .5424) b. Given np = 18, we can assume normality.
8.21 a. [equation] = [equation] or (.0152, .0768) b. Given np = 11.5, we can assume normality.
8.22 a. [equation] = [equation] = (.3157, .6443) b. Yes, np = 24, which is greater than 10.
8.23 a. [equation] = [equation] = (.1507, .3199) b. Yes, np = 32, which is greater than 10. c. [equation] = [equation] = 136; [equation] = [equation] = 769 d. When the desired error decreases and the desired confidence increases, the sample size must increase.
8.24 Using MegaStat: a. 55 b. 217 c.
865
8.25 Using MegaStat: 25
8.26 Assume a normal distribution. Estimate sigma using σ = (28-20)/4 = 2. From MegaStat: n = 11.
8.27 Assume a normal distribution. Estimate sigma using σ = (200-100)/4 = 25. From MegaStat: n = 97.
8.28 Assume a Poisson distribution. [equation] = 2.1213. From MegaStat: n = 98.
8.29 a. [equation] = [equation] = 47 b. We assumed normality and estimated σ = (3450-3103)/4 = 86.75.
8.30 a. Using MegaStat: 2165. We use π = .5. b. Sampling method: Perhaps a poll via the Internet.
8.31 a. Using MegaStat: 1692. We use π = .5. b. Sampling method: Mailed survey.
8.32 a. Using MegaStat: 601. We use π = .5. b. Sampling method: Direct observation.
8.33 a. Using MegaStat: 2401. We use π = .5. b. Sampling method: Random sample via telephone or Internet survey.
8.34 Students should observe that the raw data histogram shows a uniform distribution, whereas the histogram of sample means is less uniform and closer to a normal distribution. The average of the raw data will be equal to the average of the sample means. The population mean is 49.5, so we would expect our data set to have a mean close to this. The population standard deviation is 28.87, so we would expect the raw data to have a sample standard deviation close to this. We would also expect the standard deviation of the means to be close to 14.435 (28.87/sqrt(4)). The point of this exercise is to observe that the average of the sample means is close to the population mean but the standard deviation of the sample means is smaller.
8.35 a. Because the diameter is continuous, there will always be slight variation in values from nickel to nickel. b. From MegaStat: (.8332, .8355) c. The t distribution assumes a normal population, but in practice this assumption can be relaxed as long as the population is not badly skewed. We assume that here. d. Use [equation] to estimate the sample size. z = 2.577, and so n = 95.
8.36 a. From MegaStat: (3.2283, 3.3813) b.
Use [equation] to estimate the sample size. z = 1.645, and so n = 53. c. The flow of the mixture might be one factor. There are many other possibilities.
8.37 a. From MegaStat: (29.4427, 39.6342) b. The cups may be different sizes. Having different people take the sample and count can cause a lack of consistency in the sampling process. Different boxes may have different numbers of raisins. c. The producer most likely regulates raisins by weight, not count. d. A quality control system would increase consistency by monitoring the process so that the producer knows what to expect from the process and how it varies. The producer can then work on minimizing variation by eliminating its causes.
8.38 a. From MegaStat: (266.76, 426.24) b. Zero values suggest a left-skewed distribution. c. Use [equation] with z = 2.577 to get n = 1927. d. Because n > N, increase the desired error. For example, if E = 50 then n = 78.
8.39 a. From MegaStat: (19.249, 20.689) b. Fuel economy can also vary due to tire pressure and weather. There may be more than sampling variability contributing to differences in sample means.
8.40 a. From MegaStat: (7.292, 9.98) b. There are possible outliers that make the normality assumption questionable. c. Use [equation] with z = 2.326 to get n = 38.
8.41 a. From MegaStat: (33.013, 58.315) b. With repair costs, it is possible that the distribution is skewed to the right. Also, the population size is small relative to the sample size, which might cause problems. c. Use [equation] with z = 1.96 to get n = 119.
8.42 a. From MegaStat: (3230, 3327) b. Use [equation] with z = 1.96 to get n = 91. c. The line chart shows a decrease in the number of steps over time.
8.43 a. From MegaStat: (29.078, 29.982) b. The normal distribution is a common model for height, but at younger ages it is possible to see high outliers. c. Use [equation] with z = 1.96 to get n = 116.
8.44 a. From MegaStat: (74.02, 86.81) b.
The sample is somewhat small, and the length of the commercial could be a function of the type of time out called.
8.45 a. From MegaStat: (48.515, 56.965) b. The distribution is more likely to be skewed to the right, with a few CDs having very long playing times. c. Use [equation] with z = 1.96 to get n = 75.
8.46 a. The standard deviation is estimated with the uniform approximation σ = (b − a)/√12 and with the normal approximation σ = (b − a)/4, where b is the maximum and a is the minimum of the range.

            Uniform Distribution   Normal Distribution
Chromium    0.0635                 0.055
Selenium    0.0004                 0.00035
Barium      0.0043                 0.00375
Fluoride    0.0289                 0.0250

b. An estimate of the standard deviation is necessary to calculate the sample size needed for a desired error and confidence level.
8.47 a. From MegaStat: (.125, .255) b. Normality can be assumed. np = 19. c. Use [equation] with z = 1.645 to get n = 463. d. A quality control manager needs to understand that the sample proportion will usually differ from the population proportion but that the way the sample proportion varies is predictable.
8.48 a. From MegaStat: (.039, .117) b. Different industries may have different quantities and types of records, especially if they have many government contracts.
8.49 a. From MegaStat: (.092, .134) b. Yes, np = 69. c. Use [equation] with z = 1.645 to get n = 677. d. Yes, the results could be very different today. There is a stronger focus on nutrition today than there was 10-15 years ago.
8.50 a. From MegaStat: (.176, .294) b. The normality assumption holds: np = 47 and n(1-p) = 153. c. No, the Very Quick Rule suggests that p should be close to .5. In this example, p = .235. d. n = 304. e. Frequent sampling of the noodle mix would help the manufacturer identify problems and stay on target.
8.51 a. From MegaStat: (.595, .733) b. No, viewers of the late night program are a unique group of television watchers compared to the rest of the population.
Not all TV watchers stay up late to watch the late night programs.
8.52 a. From MegaStat: (.012, .014) b. The sample size is large enough to make the normality assumption valid.
8.53 a. Standard error = [equation] = .0128 b. From MegaStat: (.121, .171) c. No, np = 112. d. VQR: (.1103, .1821)
8.54 a. From MegaStat: (.093, .130) b. The normality assumption holds. c. VQR: (.0753, .1472). This interval is wider than the one found in part a. d. This is one batch of popcorn kernels, not a random sample from the food producer. It is not clear that the person taking the sample used random sampling techniques. We also do not know the age of the popcorn that was popped.
8.55 a. From MegaStat: (.001, .002) b. No, we would like to know the proportion of all mosquitoes killed, not the proportion of mosquitoes killed out of bugs killed.
8.56 a. From MegaStat: (.616, .824) b. Yes, the sample size is large enough to use the normality assumption. c. Contacting the longshoremen's union might help the sampling process.
8.57* With normality: (.011, .087). With binomial: (.0181, .1031).
8.58* With normality: (−.002, .402). With binomial: (.0433, .4809). Normality is not justified because n is too small.
8.59* a. From MegaStat: (.914, 1.006) b. The upper limit of the confidence interval is greater than 1. c. Normality is not justified; therefore, use the binomial approach. d. From MINITAB: (.8629, .9951)
8.60 a. Margin of error = [equation] b. From MegaStat: (.423, .457) c. Because the interval falls below .5, we would conclude that it is unlikely that 50% of the voters opposed the signing.
8.61 a. Margin of error = [equation]. Assume that p = .5 and a 95% confidence level.
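
The proportion answers above (for example 8.50 and the margin-of-error parts of 8.60 and 8.61) can be checked without MegaStat. The following is a minimal sketch, assuming the usual normal-approximation formula p ± z·sqrt(p(1−p)/n); this formula is an assumption here, since the original equations were not recovered. The function names are hypothetical. The check uses x = 47 and n = 200, which follow from np = 47 and n(1−p) = 153 in 8.50, and it assumes a 95% confidence level.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-tailed z value for a confidence level, e.g. 0.95 gives about 1.96."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(p, n, confidence=0.95):
    """Margin of error for a proportion under the normal approximation (assumed formula)."""
    return z_critical(confidence) * sqrt(p * (1 - p) / n)


def proportion_interval(x, n, confidence=0.95):
    """Confidence interval for a proportion from x successes in n trials."""
    p = x / n
    e = margin_of_error(p, n, confidence)
    return p - e, p + e


# 8.50: x = 47, n = 200 (so p = .235), assumed 95% confidence
low, high = proportion_interval(47, 200)
print(round(low, 3), round(high, 3))  # 0.176 0.294
```

The printed interval agrees with the MegaStat result reported in 8.50 a. For a worst-case margin of error as in 8.61, call margin_of_error(0.5, n) with the sample size given in the problem.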