Formula Sheet for mathematical statistics
1. Probability Cheat Sheet
one page (2 sides) by Peleg Michaeli
peleg.yogiley.org/math/probability/probabilitycs.pdf
2. Probability and Statistics Cookbook
detailed list by Matthias Vallentin
http://matthias.vallentin.net/probability-and-statistics-cookbook/cookbook-en.pdf
Wednesday, May 23, 2012
Tuesday, May 22, 2012
Note on mathematical statistics: 5.7 Chi-Square Test
1. Mathematical basis of Chi-Square test
- If X_1, ..., X_n are independent normal variables with means mu_i and variances sigma_i^2, then Y=Sum( (X_i-mu_i)^2 / sigma_i^2 ) has a Chi-Square(n) distribution.
- By the CLT, the joint distribution of the group (cell) frequencies of a large sample is approximately multivariate normal.
- Hence the statistic Q_k-1 = Sum( Y_i^2 ) = Sum [i=1..k] ( (X_i-E_i)^2/E_i ), with Y_i = (X_i-E_i)/sqrt(E_i), has approximately a Chi-Square(k-1) distribution. One df is lost because the X_i sum to n.
- By grouping results into intervals (frequency approximation), any distribution can be treated as multinomial.
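- Partition the domain of the experiment's results into finitely many mutually disjoint sets A_1, A_2 ... A_k
- Count the number of results falling in A_i as the frequency X_i
- Assign the df (this differs by test)
- Assign the probability p_i of a result falling in A_i (this differs by test)
- Evaluate the statistic Q_k-1
- Test: reject H0 when Q_k-1 exceeds the upper critical value of the Chi-Square distribution with the assigned df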
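The steps above can be sketched in a few lines of Python for the simplest case, a goodness-of-fit test with fully specified cell probabilities. This is only an illustration: the function name `chi_square_gof` and the die-roll counts are made up for the example, and the 5% critical value for 5 df is taken from a standard Chi-Square table.

```python
def chi_square_gof(observed, probs):
    """Return (Q, df) for Pearson's goodness-of-fit statistic.

    observed: cell frequencies X_1..X_k
    probs:    hypothesised cell probabilities p_01..p_0k (fully specified)
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    n = sum(observed)
    expected = [n * p for p in probs]            # E_i = n * p_0i
    q = sum((x - e) ** 2 / e for x, e in zip(observed, expected))
    return q, len(observed) - 1                  # df = k - 1


# Hypothetical data: 120 rolls of a die, H0: the die is fair.
counts = [18, 23, 16, 21, 18, 24]
q, df = chi_square_gof(counts, [1 / 6] * 6)

CRIT_5PCT_DF5 = 11.070  # upper 5% point of Chi-Square(5), from a standard table
print(f"Q = {q:.3f}, df = {df}, reject H0 at 5%: {q > CRIT_5PCT_DF5}")
```

Here every E_i is 20, so Q = 50/20 = 2.5, well below the critical value, and H0 is not rejected.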
Test | Goodness of fit | Homogeneity | Independence |
Example | 5.7.1, 5.7.2 | 5.7.3 | 5.7.4 |
H0 | The results follow the theoretical distribution | The two samples come from the same distribution | The two attributes of the subjects are independent |
Key fact | E(X_i)/n = p_0i | p_1i=p_2i=p_0i, estimated by E_i | P_ij=P_i.*P_.j |
Source of E_i | multinomial model | MLE | MLE |
Formula of E_i (1<=i<=k) | E_i=n*p_0i | E_i=(X_i1+X_i2)/(n1+n2) | E_ij=(X_i./n) * (X_.j/n) |
df | k-1 | k-1=2(k-1)-(k-1) | (a-1)(b-1)=(a*b-1)-(a+b-2) |
Statistic | Sum [i=1..k] ( (X_i-E_i)^2/E_i ) | Sum [j=1,2] Sum [i=1..k] ( (X_ij-n_j*E_i)^2/(n_j*E_i) ) | Sum [i=1..a] Sum [j=1..b] ( (X_ij-n*E_ij)^2/(n*E_ij) ) |
Dataset | x_1, x_2 ... x_n | x_1, x_2 ... x_n1 and y_1, y_2 ... y_n2 | contingency table X_ij, 1<=i<=a, 1<=j<=b, k=a*b |
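The homogeneity and independence columns lead to the same arithmetic: the expected count in a cell is (row total * column total) / n. For two samples over k categories, a k x 2 table gives df = (k-1)(2-1) = k-1, matching the homogeneity column. The sketch below assumes this; the helper name `chi_square_table` and both data sets are hypothetical, and the critical values come from a standard Chi-Square table.

```python
def chi_square_table(table):
    """Return (Q, df) for an a x b table of counts X_ij."""
    a, b = len(table), len(table[0])
    if any(len(row) != b for row in table):
        raise ValueError("all rows must have the same length")
    row_tot = [sum(row) for row in table]                        # X_i.
    col_tot = [sum(row[j] for row in table) for j in range(b)]   # X_.j
    n = sum(row_tot)
    q = 0.0
    for i in range(a):
        for j in range(b):
            expected = row_tot[i] * col_tot[j] / n   # n * (X_i./n) * (X_.j/n)
            q += (table[i][j] - expected) ** 2 / expected
    return q, (a - 1) * (b - 1)


# Independence: hypothetical 2 x 2 contingency table.
q_ind, df_ind = chi_square_table([[30, 20],
                                  [20, 30]])
print(q_ind, df_ind, q_ind > 3.841)    # 3.841 = upper 5% point, df = 1

# Homogeneity: two hypothetical samples of size 60 over k = 3 categories,
# arranged as a k x 2 table (one column per sample).
q_hom, df_hom = chi_square_table([[10, 20],
                                  [20, 20],
                                  [30, 20]])
print(q_hom, df_hom, q_hom > 5.991)    # 5.991 = upper 5% point, df = 2
```

In the first table every expected count is 25, giving Q = 4 with 1 df. In the second, the expected counts are 15, 20 and 25 per column, giving Q = 16/3 with 2 df.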
4. Remarks
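- Chi-Square tests are not exact tests but approximate (large-sample) tests
- The statistic is based on the frequencies of results falling in each interval, not on the results themselves
- Make sure every expected count satisfies E_i > 5, or use Fisher's exact test
- Minimum Chi-Square estimation: choose the parameter values that minimize Q
- Because Q evaluated at the MLE is never smaller than the minimum of Q, MLE-based Chi-Square tests reject at least as often as tests based on the minimum Chi-Square estimator
- Every independently estimated parameter (e.g. a p_0i) costs one df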
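When the expected counts are too small for the approximation, a 2 x 2 table can be handled by Fisher's exact test. The sketch below is a minimal stdlib version: it assumes the usual two-sided convention of summing the probabilities of all tables (with the same margins) that are no more likely than the observed one. The function name `fisher_exact_2x2` and the data are hypothetical.

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    r1, r2 = a + b, c + d          # row totals
    c1 = a + c                     # first column total
    n = r1 + r2
    denom = comb(n, c1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given the fixed margins.
        return comb(r1, x) * comb(r2, c1 - x) / denom

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    # Small relative tolerance so ties are not lost to rounding.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Hypothetical table whose expected counts are all 2, too small for Chi-Square.
p_value = fisher_exact_2x2([[3, 1],
                            [1, 3]])
print(f"p = {p_value:.4f}")
```

For this table the possible top-left values 0..4 have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the two-sided p-value is 34/70.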
Notes on mathematical statistics: textbook
This is a series of notes from my study of mathematical statistics, based on the textbook
Introduction to Mathematical Statistics,
R.V. Hogg, A Craig and J. W. McKean
6th edition, Pearson.
There is an official solutions manual (a PDF file) with answers to part of the exercises.
However, it can be found and downloaded only after an intensive Google search.
I have kept my homework and exercises from Chapter 4 to Chapter 7, which can be used as an "AS IS" solutions manual for the book (covering not only the even-numbered questions in the official solutions, but also some odd-numbered questions).
Contact me with the page number and exercise number.
I will see what I can do for you.