Spring 2007

STAT 7030 – Mathematical Statistics II

 

Date

Section

Assignment

Jan 8

Brief review and discussion of problems on final.

Two-sample t-test and the F-test for comparing two variances. (sections 9.2, 9.3). Exam file. Baby boom data file.

Summary of Minitab session.

Read 9.2, 9.3. Also read Chapter 8; it is short and informative and we will use it later in the course.

Hw#1 (due Jan 22): 9.2.3, 9.2.4 (plot the data and discuss normality; don’t just do the test using the given summary descriptive statistics. Also check whether you may assume equal variances.), 9.2.9, 9.3.1 (add to the given data: President Reagan: height 6’1”, age 93; President Ford: height 6’, age 93. If the answer in part (b) is No, use the modified test for unequal variances.).

Look for more to come regarding statistics articles to read.

Jan 22

Discussion on designing experiments. See notes.

Remarks on the F distribution. Hypothesis tests and confidence intervals for two population proportions.

Data on gasoline taxes: an example of a matched pairs hypothesis test.

Hw#2 (due Jan 29): 8.2.2, 8.2.4, 8.2.16, 8.2.34, 8.2.36, 9.4.7, 9.5.8.  Exercise 1 from notes.

See article for example of statistical study.

Jan 29

Goodness-of-fit tests. The multinomial distribution. (10.1-10.3)

Mendel data (Minitab project file)

Hw#2 extended: due Wednesday, Jan. 31

See interesting article on Goodness-of-fit tests.

Hw#3 (due Feb 7): 10.2.4, 10.3.2, 10.3.3, 10.3.6 (in this problem you test for a single proportion; would it be any different from a hypothesis test for one proportion?), 10.3.10.

For the binomial data in the linked file perform a goodness of fit test to assess normality by dividing the data into 8 frequency classes. See file for steps to follow. 
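Here is a minimal sketch of what such a test can look like, in case it helps to see the mechanics outside Minitab. It is not the procedure from the linked file, whose steps may differ. The assumptions are mine: the data are simulated (200 draws from a Binomial(50, 0.4), not the linked data), the 8 classes are chosen to be equally likely under the fitted normal, and the helper name `normal_gof_statistic` is made up for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_gof_statistic(data, k=8):
    """Chi-square goodness-of-fit statistic for normality with k equiprobable classes."""
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    # Class boundaries: the 1/k, 2/k, ..., (k-1)/k quantiles of the fitted normal.
    cuts = [fitted.inv_cdf(i / k) for i in range(1, k)]
    observed = [0] * k
    for x in data:
        # The class index of x is the number of cut points at or below x.
        observed[sum(x >= c for c in cuts)] += 1
    expected = n / k  # same expected count in every class
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    df = k - 1 - 2  # two parameters (mean and sd) were estimated
    return chi2, df, observed

random.seed(7030)
# Synthetic stand-in for the linked file: 200 draws from Binomial(50, 0.4).
data = [sum(random.random() < 0.4 for _ in range(50)) for _ in range(200)]
chi2, df, observed = normal_gof_statistic(data)

CRITICAL_05_DF5 = 11.07  # upper 5% point of the chi-square distribution, 5 df
print("observed counts:", observed)
print("chi-square =", round(chi2, 2), "with", df, "df")
print("reject normality at the 5% level:", chi2 > CRITICAL_05_DF5)
```

Because binomial data are discrete, ties pile up in some classes, so the class counts will not look as even as they would for continuous data. Keep that in mind when you interpret the statistic.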

Feb 5

Invited talk: Dr. Brani Vidakovic from Georgia Tech. will speak on Wavelets analysis on biomedical data, in CL 1003.

Chi-square test for independence. 

Data files: education; diet.

Minitab project worked in class: ex. 10.4.5. fitting an exponential distribution.

Feb 12

We will start Ch. 11: The Method of Least Squares.

Fitting data (Minitab file) and some explanations (word file)

Read 10.4 and 10.5.

Hw#4 (due Feb 14): 10.4.12, 10.5.2, 10.5.6, 10.5.8.

Think about the projects you want to present: get data “generous enough” so that we can apply all the tests that we have learned so far, and even more. 

Hw#5: (due Feb 21): 11.2.20, 23, 25, 30.

Plot your data, do a transformation if a linear plot does not seem appropriate, do a residual plot for the fitted model. Write the equation of the regression line and the fitted model.
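The steps above can be sketched as follows, without the plots. The data are hypothetical (roughly exponential growth, not taken from any textbook exercise), the log transformation is only one possible choice, and the helper `least_squares` is a name I made up; the formulas are the usual least-squares estimates for slope and intercept.

```python
import math

def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

# Hypothetical data: y grows roughly like exp(x), so a straight line is a poor fit.
x = [1, 2, 3, 4, 5, 6]
y = [2.7, 7.4, 20.1, 54.6, 148.4, 403.4]

# Transform: fit a line to ln(y) against x.
log_y = [math.log(v) for v in y]
a, b = least_squares(x, log_y)

# Residuals on the transformed scale; these are what goes in the residual plot.
residuals = [ly - (a + b * xi) for xi, ly in zip(x, log_y)]

print(f"regression line: ln(y) = {a:.3f} + {b:.3f} x")
print(f"fitted model:    y = {math.exp(a):.3f} * exp({b:.3f} x)")
print("residuals:", [round(r, 4) for r in residuals])
```

Note the two equations printed at the end: the regression line lives on the transformed scale, while the fitted model is that line carried back to the original scale.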

Still Hw#5: Work the exercise on the handout given in class (12.2 “Never the same amounts”: refer to questions 3&4, but be careful to state the hypotheses, then run separate Chi-square tests). How does this test compare to running separate proportion tests for each color of candy? Consider the entire amount in all five bags.

Feb 14


 

Feb 19

The Linear Model: estimating the parameters in the regression equation and the variance.

Check the updates on Feb 12.

Data file.

Hw#6 (due Feb 26): 11.3.7, 11.3.9 (get the data and do the regression yourselves), 11.3.11. (refer to Theorem 11.3.2 for the distribution of ), 11.3.12, 11.3.13, 11.3.15.

Think about this and we will talk in class: 11.3.4. 

Feb 26

Prediction from regression equations.

Covariance and correlation. The sample correlation coefficient.

Minitab project file: pb. 11.2.30

Exam 1 is now posted. The exam is due Wednesday, March 14, 6:30 pm.

You may e-mail me if you have any questions.

March 5

No class. Spring Break!

Although most of you are working, have a nice and restful break!

March 12

Estimating the coefficients in the quadratic regression model. The Bivariate Normal Distribution.

Fuel data file.

Class notes on the bivariate normal distribution. See also Maple file for some computations and same file exported as html file.

March 19

Skewness and kurtosis.

More on the Bivariate Normal Distribution.

Tests of significance for the correlation coefficient in the bivariate normal distribution.

Discussion on why the test of significance for  based on  makes it hard to compute the power of the test. Simulation (Maple file with explanations) of bivariate normal data that has a given correlation.

Same file exported in html format.
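For those without Maple, here is a rough Python sketch of one standard way to simulate bivariate normal data with a given correlation. It is not a translation of the Maple file, and the function names are hypothetical. The assumption is that both variables are standard normal: take independent standard normals Z1 and Z2, then set X = Z1 and Y = rho*Z1 + sqrt(1 - rho^2)*Z2, so that Corr(X, Y) = rho.

```python
import math
import random

def simulate_bivariate_normal(n, rho, seed=None):
    """Return n pairs (x, y) of standard normals with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        # y mixes z1 and an independent z2 so that Var(y) = 1 and Cov(x, y) = rho.
        pairs.append((z1, rho * z1 + math.sqrt(1 - rho ** 2) * z2))
    return pairs

def sample_correlation(pairs):
    """Sample correlation coefficient r of a list of (x, y) pairs."""
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    syy = sum((y - ybar) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

pairs = simulate_bivariate_normal(20000, rho=0.6, seed=1)
r = sample_correlation(pairs)
print(f"target rho = 0.6, sample r = {r:.3f}")
```

Repeating the simulation many times with a small n gives an empirical picture of the sampling distribution of r, which is one way to approximate the power of the test by brute force.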

Notes to be posted soon.

More to be added

Research skewness and kurtosis and the information they provide for normality of a data set. See Resources page for links to texts and information online.

Read article related to Galton’s data on heights of parents and their children. Also, on the main site for Galton’s biography, you may read the conclusions in his own words in Regression toward Mediocrity in Hereditary Stature (search for the word “regression” in the Search menu; the article is the first hit).

Read sections 11.4 and 11.5 (see class notes above)

HW#7 (due March 21): 11.4.10, 11.4.17, 11.5.8 (Be careful here, how can you decide if the variables are independent? You can decide if they are uncorrelated. But then does it imply they are independent? How must they be so that we can conclude that they are independent?), 11.5.10.

Here is a treat for you: Galton’s (slightly modified) data on heights (Excel file). Data obtained from the Jump CD.

The bivariate normal distribution (MPJ file)

March 21

Spring has come!

 

March 26

Discuss article on the sample size for Z and t confidence intervals.

Introduction to ANOVA.

See article in Minitab online resources on a complete analysis using ANOVA.

Example of ANOVA: toxins on trout.

Hw#8 (due March 28): 11.5.1, 11.5.4, 11.5.5, 11.5.11.

Remember the project I talked about. I would like you to have a proposal with a description of what you plan on doing by April 4th. See guidelines (to be posted).

Here is an article I would like you to start reading. Please read instructions.

April 4

Notice. I am switching class with Dr. Lawson.

I just counted, five more classes left, I am going to miss you guys!

 

Skewness of exponential distribution (Maple file) and Minitab project file.

 

One-way ANOVA.

Comparisons of means in ANOVA: Tukey, Bonferroni and Scheffé’s methods.

Read 12.1. and 12.2. Look at the ANOVA analysis example in the Minitab resources. Here is the article link again.

Let’s continue with the article on “How large should n be…”. Read again from the beginning and continue to section 3.

Hw#9:

1. Devise a way of obtaining the coefficient of skewness of a standard exponential distribution through simulations. How would you estimate it from data? You may use Minitab, Excel or other software of choice. Give a description of your method and report results. I do not have to see the data worksheet.

If you are adventurous enough you may try computing it with Maple or other software of choice.

2.  How would you obtain the .025 and .975 quantiles of Z_n (refer to page 5 in the article)? Try to estimate them from the data you generate in part 1.

3.  Proceed to estimate the coefficient of skewness of the t_n statistic (see page 7). Even in the article it is mentioned that they estimated it from 10000 samples (quite something!).
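To make item 1 concrete, here is one possible shape for such a simulation, as a sketch only. It covers item 1 and nothing from the article. The assumptions are mine: skewness is estimated by the moment ratio m3 / m2^(3/2), the sample size of 100,000 is arbitrary, and the function name is hypothetical. The exact coefficient of skewness of the exponential distribution is 2, which gives you something to check against.

```python
import random

def sample_skewness(data):
    """Moment estimate of the coefficient of skewness: m3 / m2**1.5."""
    n = len(data)
    xbar = sum(data) / n
    m2 = sum((x - xbar) ** 2 for x in data) / n  # second central sample moment
    m3 = sum((x - xbar) ** 3 for x in data) / n  # third central sample moment
    return m3 / m2 ** 1.5

rng = random.Random(2007)
# One large sample from the standard exponential distribution (rate 1).
sample = [rng.expovariate(1.0) for _ in range(100_000)]
estimate = sample_skewness(sample)
print(f"estimated skewness = {estimate:.3f} (theoretical value is 2)")
```

Your own method does not have to look like this. Averaging the estimate over many smaller samples is another reasonable design, and it also shows you how variable the estimate is.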

Remember, by today you should have a clear idea of what your project will be about. Here are some guidelines. If there are any questions you can think of, or you feel I omitted something of utmost importance, I will appreciate any improvements to the guidelines.

April 9

Testing hypotheses with contrasts. Transforming data.

If you have already done something on your project bring the questions to class. We can start talking about your ideas.  

If not, we can start the randomized block design.

Data files: tablets; pb12.4.4 text

Read 12.2, 12.3. and 12.4. Scheffe’s method that I mentioned at the end of last lecture is related to 12.4.

Look at APPENDIX 12.A.1. We obtained Tukey’s confidence intervals without having stacked data! Try to answer this: in the example in class on trout toxins the treatments did not have the same sample size. However, in the proof of Tukey’s method the sample sizes are assumed equal. Does it actually matter? If it does, how would you explain the confidence intervals we obtained for all the pairwise differences between the means in that example?

And, of course, some homework, but it ain’t much! Work on your projects as well!

HW#10: 12.2.3, 12.2.8, 12.3.3. Remember to assess whether the assumptions of the model are satisfied.

April 16

Randomized block design.

More project discussions.

Notes on the problem done in class, and the Minitab project file.

I will have the exam ready for you on Monday and it will be due on April 30.

Work on your projects this weekend.

HW#11: 12.4.6, 12.5.1. We started 12.5.3 in class and decided that, after all, we might not need to transform the data. Run an ANOVA test, but also think of a way of testing the accuracy by conducting a Chi-square test. The data is not normal after all; we decided it is binomial.

Work on your projects.

Enjoy your weekend!

April 23

Nonparametric tests: sign test, rank test, rank-sum test.

Minitab files: rank,

Notes on Wilcoxon rank-sum test: diet_pigs (Minitab mpj file)

 

Project presentations: Sarah Alum, Pete Stafford, Kate Small.

Finish Chapter 13. You only have to cover the paired t-test, which we discussed in the past. Pay attention to the case studies; you never know when you may need some of the information you find in them.

Hw#12: 13.2.3 (do this one using both ANOVA and the paired t-test and see how the value of the F-statistic compares to the value of the t-statistic), 13.2.5. Just “pour la bonne bouche”, which may be the equivalent of the “icing on the cake”, and for extra credit, try 13.2.12.
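If you want to check the ANOVA versus paired t comparison on numbers of your own before doing 13.2.3 in Minitab, here is a small sketch. The data are hypothetical (six made-up pairs, not the textbook's), and the ANOVA is computed by hand as a randomized block design with two treatments, each pair being one block.

```python
import math
from statistics import mean, stdev

# Hypothetical paired measurements, one pair per block; not the data of 13.2.3.
before = [12.1, 11.4, 13.8, 10.9, 12.7, 11.9]
after = [11.3, 11.0, 12.9, 10.8, 11.6, 11.5]
b = len(before)  # number of blocks (pairs)

# Paired t-statistic: mean difference over its standard error.
diffs = [x - y for x, y in zip(before, after)]
t = mean(diffs) / (stdev(diffs) / math.sqrt(b))

# Randomized block ANOVA with 2 treatments and b blocks.
grand = mean(before + after)
treat_means = [mean(before), mean(after)]
block_means = [(x + y) / 2 for x, y in zip(before, after)]
ss_total = sum((v - grand) ** 2 for v in before + after)
ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
ss_block = 2 * sum((m - grand) ** 2 for m in block_means)
ss_error = ss_total - ss_treat - ss_block
# Treatment df = 1, error df = (b - 1)(2 - 1) = b - 1.
F = (ss_treat / 1) / (ss_error / (b - 1))

print(f"t = {t:.4f}, t^2 = {t * t:.4f}, F = {F:.4f}")
```

Compare the last two numbers in the printout, then ask yourself why that relationship should hold for any paired data set.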

Exam 2 is now posted. The exam is due by Tuesday, May 1st, 12 PM.

April 30

This is the day of the final. We can still have presentations, and…

A party too!

It all happened so quickly, I did not get the chance to say good bye.

I want to thank you all for two great semesters and wish you all the best in your personal and professional life.