Alyse Fischer
Statistics 201
Project 2
March 17, 2010

1. Random Sample = 200 + 63 (ID: 289463)

2. Chart

According to the charts, a majority of the students are sophomores. Juniors and seniors are about equally numerous and make up the second-largest groups of students in Statistics 201. "Other" has the fewest students.

3. Pareto Plot, 2 Gender=female
   Pareto Plot, 2 Gender=male

The females enjoy watching figure skating the most, while the males enjoy snowboarding the most. No females picked Cross-Country Skiing or Skeleton as their favorite, and no males picked Nordic Combined.

4. Distributions: 12 GPA

[JMP output: histogram with Quantiles and Moments; fitted Normal(3.25042, 0.44252) with Parameter Estimates; -2log(Likelihood) = 1003.5216914615; Goodness-of-Fit Test (Shapiro-Wilk W Test). Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.]

The histogram is roughly bell-shaped but slightly skewed to the left, with several outliers on the left. There are two gaps: one between 2.1 and 2.2, and another between 1.8 and 1.9.

I have chosen to use the median and IQR because my histogram is skewed to the left. The center of the histogram is the median, 3.26. This tells us that 50% of the data is below 3.26 and 50% is above 3.26. The spread of the histogram is the IQR, .6275 (the 75% quartile, 3.6, minus the 25% quartile, 2.9725). The IQR tells us that the middle 50% of the data spans a range of .6275.

My Normal Probability Plot indicates that my data is not normally distributed because the black line does not run straight along the diagonal red line. My Goodness-of-Fit Test also indicates that my data is not normally distributed because its p-value, .0001, is less than .05, so Ho is rejected.

5.

Quantiles (Pairs of Shoes, Gender=female)
100.0%   maximum    200
 99.5%              100
 97.5%              70
 90.0%              50
 75.0%   quartile   30
 50.0%   median     20
 25.0%   quartile   13
 10.0%              10
  2.5%              5.9
  0.5%              2.18
  0.0%   minimum    0

Moments (Pairs of Shoes, Gender=female)
Mean             25.721839
Std Dev          18.359025
Std Err Mean     0.8802479
Upper 95% Mean   27.451918
Lower 95% Mean   23.99176
N                435
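The rows of the Moments output above are tied together by the usual identity Std Err Mean = Std Dev / sqrt(N). The sketch below is only a consistency check on the reported numbers, not part of the JMP output; the function names are made up for illustration. It recovers N from the reported standard deviation and standard error, and backs out the multiplier behind the 95% limits.

```python
# Reported Moments for the female Pairs of Shoes distribution.
MEAN = 25.721839
STD_DEV = 18.359025
STD_ERR = 0.8802479
UPPER_95 = 27.451918
LOWER_95 = 23.99176


def implied_n(std_dev, std_err):
    """Solve Std Err Mean = Std Dev / sqrt(N) for N (hypothetical helper)."""
    return (std_dev / std_err) ** 2


def ci_multiplier(mean, limit, std_err):
    """Number of standard errors between the mean and a confidence limit."""
    return abs(limit - mean) / std_err


n_implied = implied_n(STD_DEV, STD_ERR)
upper_mult = ci_multiplier(MEAN, UPPER_95, STD_ERR)
lower_mult = ci_multiplier(MEAN, LOWER_95, STD_ERR)

print(f"implied N: {n_implied:.2f}")
print(f"multiplier (upper limit): {upper_mult:.4f}")
print(f"multiplier (lower limit): {lower_mult:.4f}")
```

The implied N rounds to 435, and both limits sit a little under two standard errors from the mean, which is what a t-based 95% interval with a sample this large should give.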
[Distributions: Pairs of Shoes, Gender=female; Pairs of Shoes, Gender=male]

Quantiles (Pairs of Shoes, Gender=male)
100.0%   maximum    60
 99.5%              30.3
 97.5%              23
 90.0%              13
 75.0%   quartile   10
 50.0%   median     6
 25.0%   quartile   5
 10.0%              3
  2.5%              2
  0.5%              1
  0.0%   minimum    1

Moments (Pairs of Shoes, Gender=male)
Mean             7.7632242
Std Dev          5.3611356
Std Err Mean     0.2690677
Upper 95% Mean   8.2922039
Lower 95% Mean   7.2342445
N                397

[Oneway Analysis of Pairs of Shoes by Gender: Quantiles]

b. The female distribution is skewed to the right with an outlier at 200. The male distribution is also skewed to the right, with an outlier at 60. Both histograms are unimodal. The median (center) is 20 for females and 6 for males. The IQR (spread) is 17 for females and 5 for males.

c. In general, females have more pairs of shoes than males. The range is also greater for females (0 to 200) than for males (1 to 60).

6.

a. I think that there will be an association between the two variables because someone who reads a text while driving is more likely to answer it than to ignore it.

b. [Contingency Analysis of Read Texts While Driving? By Send Texts While Driving?]
   [Contingency Table: Send Texts While Driving? By Read Texts While Driving?]

c. There is an association between the two variables. Most people who frequently read their texts while driving also send texts while driving, and vice versa.

7.

[Scatterplot Matrix]

Multivariate, Gender=female
Correlations
                                  17 Age Hope to be Married   18 Age Hope to Have First Child
17 Age Hope to be Married         1.0000                      0.6047
18 Age Hope to Have First Child   0.6047                      1.0000
The correlations are estimated by REML method.

[Scatterplot Matrix]

Multivariate, Gender=male
Correlations
                                  17 Age Hope to be Married   18 Age Hope to Have First Child
17 Age Hope to be Married         1.0000                      0.5178
18 Age Hope to Have First Child   0.5178                      1.0000
The correlations are estimated by REML method.

b.
[Scatterplot Matrix and Multivariate Correlations by gender, same output as above: Gender=female, 0.6047; Gender=male, 0.5178. The correlations are estimated by REML method.]

c. Males and females have fairly similar correlations between 17 Age Hope to be Married and 18 Age Hope to Have First Child. The females have a slightly stronger correlation (.6047) than the males (.5178). This is surprising because I would have assumed the males would have a much weaker correlation than my data gives.

8.

a. Bivariate Fit of $ Spent on Haircut By Number Body Piercings

Linear Fit
$ Spent on Haircut = 17.018113 + 4.203038 * (36 Number Body Piercings)

[Summary of Fit, Analysis of Variance, Parameter Estimates]

Distributions: Residuals $ Spent on Haircut

Quantiles
100.0%   maximum    141.17
 99.5%              108.209
 97.5%              56.1697
 90.0%              21.4134
 75.0%   quartile   4.57581
 50.0%   median     -3.8303
 25.0%   quartile   -10.533
 10.0%              -17.018
  2.5%              -29.66
  0.5%              -43.203
  0.0%   minimum    -80.47

Moments
Mean             -3.59e-16
Std Dev          20.967642
Std Err Mean     0.7641033
Upper 95% Mean   1.5000293
Lower 95% Mean   -1.500029
N                753

b. R-Squared = 0.159542; 15.95% of the variation in $ spent on haircuts is accounted for by the number of body piercings.

c. Bivariate Fit of $ Spent on Haircut By Number Body Piercings

Linear Fit
(24 $ Spent on Haircut) = 16.017385 + 4.2323786 * (36 Number Body Piercings)

[Summary of Fit, Analysis of Variance, Parameter Estimates]

Distributions: Residuals $ Spent on Haircut

Quantiles
100.0%   maximum    95.5179
 99.5%              77.0179
 97.5%              51.1719
 90.0%              20.9017
 75.0%   quartile   5.51786
 50.0%   median     -2.9469
 25.0%   quartile   -10.197
 10.0%              -16.017
  2.5%              -28.139
  0.5%              -41.412
  0.0%   minimum    -54.109

Moments
Mean             1.212e-15
Std Dev          17.955605
Std Err Mean     0.6454009
Upper 95% Mean   1.2669463
Lower 95% Mean   -1.266946
N                774

R-Squared = 0.194814.
Excluding the outliers, 19.48% of the variation in $ spent on haircuts is accounted for by the number of body piercings.

9.

a. 112 Levels

b. No. Because it was an open-ended response in a survey, virtually everyone gave the same answer, but people typed their answers in different ways, which produces 112 different levels.

Quantiles (12 GPA)
100.0%   maximum    4
 99.5%              4
 97.5%              4
 90.0%              3.81
 75.0%   quartile   3.6
 50.0%   median     3.26
 25.0%   quartile   2.9725
 10.0%              2.653
  2.5%              2.36825
  0.5%              1.9833
  0.0%   minimum    1.61

Moments (12 GPA)
Mean             3.2504207
Std Dev          0.4425244
Std Err Mean     0.0153418
Upper 95% Mean   3.2805339
Lower 95% Mean   3.2203075
N                832

Parameter Estimates (fitted Normal, 12 GPA)
Type         Parameter   Estimate    Lower 95%   Upper 95%
Location     μ           3.2504207   3.2203075   3.2805339
Dispersion   σ           0.4425244   0.4222354   0.464877

Shapiro-Wilk W Test (12 GPA): the W and Prob values are missing from the output.

Analysis of Variance (Linear Fit in 8 a.; only the total row is present)
Source     DF    Sum of Squares   Prob > F
C. Total   752   393369.63        <.0001*

Parameter Estimates (Linear Fit in 8 a.)
Term                    Estimate    Std Error   t Ratio   Prob>|t|
Intercept               17.018113   0.958948    17.75     <.0001*
Number Body Piercings   4.203038    0.352018    11.94     <.0001*

Summary of Fit (Linear Fit in 8 c.)
RSquare                      0.194814
RSquare Adj                  0.193771
Root Mean Square Error       17.96723
Mean of Response             22.83075
Observations (or Sum Wgts)   774

Analysis of Variance (Linear Fit in 8 c.)
Source     DF    Sum of Squares   Mean Square   F Ratio    Prob > F
Model      1     60298.22         60298.2       186.7851   <.0001*
Error      772   249218.11        322.8
C. Total   773   309516.33

Parameter Estimates (Linear Fit in 8 c.)
Term                       Estimate    Std Error   t Ratio   Prob>|t|
Intercept                  16.017385   0.815851    19.63     <.0001*
36 Number Body Piercings   4.2323786   0.30968     13.67     <.0001*
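The R-Squared values quoted in section 8 can be reproduced from the sums of squares in the output above. The following is a minimal sketch, assuming the standard least-squares identities (R-Squared = SS Model / SS C. Total, F = Mean Square Model / Mean Square Error). The function names are illustrative and are not JMP's. For the fit in 8 a. the Model and Error rows are not available, so the sketch instead assumes the residuals have mean zero (the reported mean is -3.59e-16) and rebuilds the error sum of squares from the residual Std Dev.

```python
import math

# Fit in 8 c. (N = 774): values copied from the Analysis of Variance table.
SS_MODEL_C = 60298.22
SS_ERROR_C = 249218.11
SS_TOTAL_C = 309516.33
DF_MODEL_C = 1
DF_ERROR_C = 772
N_C = 774

# Fit in 8 a. (N = 753): only C. Total and the residual Std Dev are available.
SS_TOTAL_A = 393369.63
RESID_SD_A = 20.967642
N_A = 753


def r_squared(ss_model, ss_total):
    """Share of the response variation explained by the fitted line."""
    return ss_model / ss_total


def adjusted_r_squared(r2, n, n_predictors=1):
    """R-Squared penalized for the number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


def f_ratio(ss_model, df_model, ss_error, df_error):
    """Mean Square Model divided by Mean Square Error."""
    return (ss_model / df_model) / (ss_error / df_error)


def r_squared_from_residual_sd(resid_sd, n, ss_total):
    """Assumes zero-mean residuals, so SS Error = (n - 1) * sd**2."""
    ss_error = (n - 1) * resid_sd ** 2
    return 1 - ss_error / ss_total


r2_c = r_squared(SS_MODEL_C, SS_TOTAL_C)
r2_adj_c = adjusted_r_squared(r2_c, N_C)
f_c = f_ratio(SS_MODEL_C, DF_MODEL_C, SS_ERROR_C, DF_ERROR_C)
rmse_c = math.sqrt(SS_ERROR_C / DF_ERROR_C)
r2_a = r_squared_from_residual_sd(RESID_SD_A, N_A, SS_TOTAL_A)

print(f"8 c.: R-Squared {r2_c:.6f}, adjusted {r2_adj_c:.6f}")
print(f"8 c.: F Ratio {f_c:.4f}, Root Mean Square Error {rmse_c:.5f}")
print(f"8 a.: R-Squared {r2_a:.6f}")
```

Both results agree with the values reported in 8 b. and 8 c. (about 0.1595 and 0.1948), which supports reading R-Squared as the fraction of variation in $ spent on haircuts accounted for by the number of body piercings.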