ANDERSON-DARLING TEST

The Anderson-Darling test, named after Theodore Wilbur Anderson, Jr. (1918–?) and Donald A. Darling (1915–?), who invented it in 1952[1], is one of the most powerful statistics for detecting most departures from normality. It may be used with small sample sizes, ''n'' ≤ 25. With very large samples, the test may reject the assumption of normality because of only slight imperfections, although industrial data with sample sizes of 200 and more have passed the Anderson-Darling test.
The Anderson-Darling test assesses whether a sample comes from a specified distribution. The formula for the test statistic A to assess whether the data \{Y_1 < \cdots < Y_N\} (note that the data must be put in order) come from a distribution with cumulative distribution function (CDF) F is
: A^2 = -N-S
where
: S=\sum_{k=1}^N \frac{2k-1}{N}\left[\ln F(Y_k) + \ln\left(1-F(Y_{N+1-k})\right)\right].
The test statistic can then be compared against the critical values of the theoretical distribution (dependent on which F is used) to determine the P-value.
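The formula A^2 = -N - S can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `anderson_darling_statistic`, the helper `exponential_cdf` and the sample values are assumptions made for the example, and the hypothesized CDF is taken to be fully specified in advance.

```python
import math

def anderson_darling_statistic(data, cdf):
    """Return A^2 = -N - S for `data` against the hypothesized CDF `cdf`."""
    y = sorted(data)                  # the formula requires ordered data
    n = len(y)
    s = 0.0
    for k in range(1, n + 1):
        f_low = cdf(y[k - 1])         # F(Y_k)
        f_high = cdf(y[n - k])        # F(Y_{N+1-k})
        s += (2 * k - 1) / n * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - s

def exponential_cdf(x, rate=1.0):
    """CDF of an exponential distribution (hypothetical choice of F)."""
    return 1.0 - math.exp(-rate * x)

# Illustrative data only; not taken from the article.
sample = [0.1, 0.4, 0.7, 1.2, 2.5]
a2 = anderson_darling_statistic(sample, exponential_cdf)
```

The resulting value of `a2` would still have to be compared with the critical values that apply to the chosen F.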
The Anderson-Darling test for normality is a distance or empirical distribution function (EDF) test. It is based upon the concept that, given a hypothesized underlying distribution, the data can be transformed to a uniform distribution. The transformed sample data can then be tested for uniformity with a distance test (Shapiro 1980).
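The transformation to uniformity can be illustrated as follows. In this sketch each observation is mapped through the hypothesized CDF, and the statistic is then computed on the transformed values as a test against the uniform distribution on (0, 1), whose CDF is F(u) = u. The function name `a_squared`, the normal parameters and the sample are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def a_squared(ordered_probs):
    """A^2 for sorted values u_k = F(Y_k), tested against Uniform(0, 1)."""
    n = len(ordered_probs)
    s = 0.0
    for k in range(1, n + 1):
        u_low = ordered_probs[k - 1]      # u_k
        u_high = ordered_probs[n - k]     # u_{N+1-k}
        s += (2 * k - 1) / n * (math.log(u_low) + math.log(1.0 - u_high))
    return -n - s

# Hypothesized distribution with illustrative parameters (not from the article).
hypothesized = NormalDist(mu=10.0, sigma=2.0)
sample = [7.1, 8.4, 9.6, 10.2, 11.5, 13.0]

# Transform the data; if the hypothesis holds, u behaves like a uniform sample.
u = sorted(hypothesized.cdf(x) for x in sample)
a2 = a_squared(u)
```

Because the CDF is monotone, sorting before or after the transformation gives the same ordered values.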
In comparisons of power, Stephens (1974) found A^2 to be one of the best EDF statistics for detecting most departures from normality.[2] The only statistic that came close was W^2, the statistic of the Cramér–von Mises test.

Contents
Procedure
See also
External links
References

Procedure


(If testing for normal distribution of the variable ''X'')
1) The data of the variable ''X'' that should be tested is sorted from low to high.
2) The mean, \bar{X}, and standard deviation, s, are calculated from the sample of ''X''.
3) The values of X are standardized as follows:
::Y_i=\frac{X_i-\bar{X}}{s}
4) With the standard normal CDF \Phi, A^2 is calculated using:

::A^2 = -n -\frac{1}{n} \sum_{i=1}^n (2i-1)\left(\ln \Phi(Y_i)+ \ln(1-\Phi(Y_{n+1-i}))\right).
5) A^{2*}, an approximate adjustment for sample size, is calculated using:
::A^{2*}=A^2\left(1+\frac{0.75}{n}+\frac{2.25}{n^2}\right)
6) If A^{2*} exceeds 0.752 then the hypothesis of normality is rejected for a 5% level test.
Note:
1. If ''s'' = 0, or if any \Phi(Y_i) equals 0 or 1, then A^2 cannot be calculated and is undefined.
2. Above, it was assumed that the variable X_i was being tested for normal distribution. Any other theoretical distribution can be assumed by using its CDF. Each theoretical distribution has its own critical values, and some examples are: lognormal, exponential, Weibull, extreme value type I and logistic distribution.
3. The null hypothesis is that the data follow the specified distribution (in this case, after standardization, N(0, 1)).
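Steps 1 to 6 of the procedure can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than a definitive implementation: the function name `anderson_darling_normality` and the sample values are invented for the example, and the standard deviation is assumed to be the sample standard deviation with an n - 1 denominator, which the procedure does not state explicitly.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normality(x, critical_value=0.752):
    """Return (A^2, adjusted A^2, reject) for a 5% level test of normality."""
    n = len(x)
    xs = sorted(x)                                 # step 1: sort low to high
    x_bar = mean(xs)                               # step 2: sample mean
    s = stdev(xs)                                  # step 2: assumed n - 1 denominator
    if s == 0:
        raise ValueError("A^2 is undefined when s = 0")
    y = [(v - x_bar) / s for v in xs]              # step 3: standardize
    phi = NormalDist().cdf                         # standard normal CDF
    p = [phi(v) for v in y]
    if any(p_i <= 0.0 or p_i >= 1.0 for p_i in p):
        raise ValueError("A^2 is undefined when a CDF value is 0 or 1")
    total = sum(                                   # step 4: the weighted log sum
        (2 * i - 1) * (math.log(p[i - 1]) + math.log(1.0 - p[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    a2_star = a2 * (1 + 0.75 / n + 2.25 / n ** 2)  # step 5: sample-size adjustment
    return a2, a2_star, a2_star > critical_value   # step 6: compare with 0.752

# Illustrative measurements only; not taken from the article.
measurements = [9.8, 10.1, 10.4, 9.5, 10.0, 10.9, 9.2, 10.3, 9.9, 10.6]
a2, a2_star, reject = anderson_darling_normality(measurements)
```

The two `ValueError` checks correspond to note 1; testing against another distribution would require replacing both the CDF and the critical value, as note 2 indicates.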

See also



Kolmogorov-Smirnov test

Shapiro-Wilk test

Smirnov-Cramér-von-Mises test

External links



US NIST Handbook of Statistics

References


1. Anderson, T. W.; Darling, D. A. (1952). "Asymptotic theory of certain 'goodness-of-fit' criteria based on stochastic processes". Annals of Mathematical Statistics.

2. Stephens, M. A. (1974). "EDF Statistics for Goodness of Fit and Some Comparisons". Journal of the American Statistical Association.


This article provided by Wikipedia.