| Title: | Tests for Departure from Normality |
|---|---|
| Description: | A toolkit for assessing data normality using a comprehensive collection of statistical methods. It includes descriptive measures and formal hypothesis tests, such as skewness and kurtosis tests, the Anderson–Darling test, the Shapiro–Wilk test, and the D'Agostino–Pearson K2 omnibus test. |
| Authors: | Joon-Keat Lai [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-9840-5836>) |
| Maintainer: | Joon-Keat Lai <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.0.2 |
| Built: | 2026-06-10 12:01:15 UTC |
| Source: | https://github.com/p10911004-npust/normality |

Performs the Anderson-Darling (A2) normality test, which is based on the empirical distribution function (EDF).
Anderson_Darling_test(x, alpha = 0.05, silent = FALSE)

| Argument | Description |
|---|---|
| x | A numeric vector. |
| alpha | Numeric (default: 0.05). Significance threshold, ranging from 0 to 1. |
| silent | Logical (default: FALSE). If |

A list.
D’Agostino, R.B., 2017. Tests for the Normal Distribution. In: D’Agostino, R.B., Stephens, M.A. (Eds.), Goodness-of-Fit Techniques, 1st ed. Routledge, New York, pp. 372–373. https://doi.org/10.1201/9780203753064
Stephens, M.A., 2017. Tests Based on EDF Statistics. In: D’Agostino, R.B., Stephens, M.A. (Eds.), Goodness-of-Fit Techniques, 1st ed. Routledge, New York, pp. 126–128. https://doi.org/10.1201/9780203753064
Anderson, T.W., Darling, D.A., 1954. A Test of Goodness of Fit. J. Am. Stat. Assoc. 49, 765–769. https://doi.org/10.1080/01621459.1954.10501232
Anderson_Darling_test(leghorn_chick)
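
For orientation, the A2 statistic can be written out in base R. This is a minimal sketch, not the package's implementation: the function name `ad_statistic` is hypothetical, the mean and standard deviation are estimated from the sample, and the small-sample modification factor and the 5% critical value of 0.752 are the commonly tabulated ones for that case (Stephens). No p-value is computed here.

```r
# Sketch of the Anderson-Darling A2 statistic for normality (hypothetical helper).
ad_statistic <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  z <- (x - mean(x)) / sd(x)          # standardise with estimated mean and sd
  i <- seq_len(n)
  log_p <- pnorm(z, log.p = TRUE)                           # log Phi(z_(i))
  log_q <- pnorm(rev(z), lower.tail = FALSE, log.p = TRUE)  # log(1 - Phi(z_(n+1-i)))
  a2 <- -n - mean((2 * i - 1) * (log_p + log_q))
  a2_star <- a2 * (1 + 0.75 / n + 2.25 / n^2)  # small-sample modification
  list(A2 = a2, A2_star = a2_star, reject_5pct = a2_star > 0.752)
}

near_normal <- qnorm(ppoints(20))            # evenly spaced normal quantiles
skewed <- exp(seq(0, 5, length.out = 30))    # strongly right-skewed values
ad_statistic(near_normal)$reject_5pct
ad_statistic(skewed)$reject_5pct
```

Working on the log scale via `log.p = TRUE` avoids taking `log(0)` when an observation lies far in a tail.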
A numeric vector of cholesterol values from a sample of 62 subjects from the Framingham Heart Study (FHS). This dataset was obtained from D'Agostino et al. (1990).
cholesterol
A numeric vector of length 62.
D’Agostino, R.B., Belanger, A., D’Agostino Jr., R.B., 1990. A Suggestion for Using Powerful and Informative Tests of Normality. Am. Stat. 44, 316–321. https://doi.org/10.1080/00031305.1990.10475751
The D'Agostino–Pearson Chi-square (K2) test is a moment test for assessing whether a sample comes from a normal distribution. It combines information from skewness (asymmetry) and kurtosis (tail heaviness) into a single omnibus test statistic.
D.Agostino_Pearson_test(x, alpha = 0.05, alternative = c("two.sided", "less", "greater"), silent = FALSE)

| Argument | Description |
|---|---|
| x | A numeric vector. |
| alpha | Significance threshold (default: 0.05). |
| alternative | Character (default: "two.sided"). The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater"). Note that this applies only to the skewness and kurtosis tests. |
| silent | Logical (default: FALSE). If |

A list.
D’Agostino, R.B., Belanger, A., D’Agostino Jr., R.B., 1990. A Suggestion for Using Powerful and Informative Tests of Normality. Am. Stat. 44, 316–321. https://doi.org/10.1080/00031305.1990.10475751
D.Agostino_Pearson_test(cholesterol)
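
The way skewness and kurtosis are combined into K2 can be sketched in base R using the normalising transformations described in D'Agostino et al. (1990). This is an illustrative sketch only: `dagostino_k2` is a hypothetical name, the package may differ in detail, and the approximations are generally recommended for n >= 8 (skewness) and n >= 20 (kurtosis).

```r
# Sketch of the K2 omnibus statistic (hypothetical helper, not the package code).
dagostino_k2 <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2)
  sqrt_b1 <- mean(d^3) / m2^1.5   # sample skewness
  b2 <- mean(d^4) / m2^2          # sample kurtosis (normal value is 3)

  # Transform skewness to an approximately standard normal score
  y <- sqrt_b1 * sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
  beta2 <- 3 * (n^2 + 27 * n - 70) * (n + 1) * (n + 3) /
    ((n - 2) * (n + 5) * (n + 7) * (n + 9))
  w2 <- -1 + sqrt(2 * (beta2 - 1))
  delta <- 1 / sqrt(log(sqrt(w2)))
  alpha <- sqrt(2 / (w2 - 1))
  z_skew <- delta * asinh(y / alpha)

  # Transform kurtosis to an approximately standard normal score
  e_b2 <- 3 * (n - 1) / (n + 1)
  var_b2 <- 24 * n * (n - 2) * (n - 3) / ((n + 1)^2 * (n + 3) * (n + 5))
  s <- (b2 - e_b2) / sqrt(var_b2)
  sqrt_beta1 <- 6 * (n^2 - 5 * n + 2) / ((n + 7) * (n + 9)) *
    sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3)))
  a <- 6 + 8 / sqrt_beta1 * (2 / sqrt_beta1 + sqrt(1 + 4 / sqrt_beta1^2))
  ratio <- (1 - 2 / a) / (1 + s * sqrt(2 / (a - 4)))
  cube_root <- sign(ratio) * abs(ratio)^(1 / 3)  # real cube root, safe for negatives
  z_kurt <- ((1 - 2 / (9 * a)) - cube_root) / sqrt(2 / (9 * a))

  # Omnibus statistic: approximately chi-square with 2 degrees of freedom
  k2 <- z_skew^2 + z_kurt^2
  list(z_skew = z_skew, z_kurt = z_kurt, K2 = k2,
       pvalue = pchisq(k2, df = 2, lower.tail = FALSE))
}

dagostino_k2(1:20)
```

For a perfectly symmetric sample such as `1:20` the skewness score is zero, so K2 reduces to the squared kurtosis score.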
Tied data
is_tied(x, ratio = 0.3, remove_NA = FALSE)

| Argument | Description |
|---|---|
| x | A numeric vector. |
| ratio | Numeric (default: 0.3). The ratio threshold for the data to be considered tied. The value ranges from 0 to 1. |
| remove_NA | Logical (default: FALSE). Whether or not to remove NAs. |

Logical
is_tied(c(1, 1, 2, 2, 2, 3, 4, 5)) #> TRUE
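
The exact rule behind `is_tied` is not stated here. One plausible reading, shown below purely as an assumption, is that the share of repeated observations is compared against `ratio`. The helper `tie_ratio` is hypothetical and is not part of the package.

```r
# Hypothetical helper: share of observations that repeat an earlier value.
# This is an assumed definition, not necessarily what is_tied() computes.
tie_ratio <- function(x, remove_NA = FALSE) {
  if (remove_NA) x <- x[!is.na(x)]
  sum(duplicated(x)) / length(x)
}

x <- c(1, 1, 2, 2, 2, 3, 4, 5)
tie_ratio(x)        # 3 repeated values out of 8
tie_ratio(x) > 0.3  # consistent with the documented example returning TRUE
```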
Kurtosis test
kurtosis(x, alpha = 0.05, alternative = c("two.sided", "less", "greater"), method = c("G2", "b2", "g2"))

| Argument | Description |
|---|---|
| x | Numeric vector. The input data. |
| alpha | Numeric (default: 0.05). Significance threshold (0 - 1). |
| alternative | Character (default: "two.sided"). The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater"). |
| method | Character (default: "G2"). The kurtosis formula to use. Available options are c("G2", "b2", "g2"). "g2" is the original formula; "G2" and "b2" are sample-size-adjusted variants of "g2" (Joanes and Gill, 1998). |

A list:

| Element | Description |
|---|---|
| is_normal | Is the input data normally distributed? |
| method | The name of the test. |
| alpha | Significance threshold (default: 0.05). |
| alternative | The alternative hypothesis (H1) to test. |
| summary_table | Statistic summary, if any. |
| statistic | The value used to calculate p-value. |
| pvalue | p-value. |
| confidence_interval | The lower and upper bound of CI. |

Joanes, D.N., Gill, C.A., 1998. Comparing measures of sample skewness and kurtosis. J. R. Stat. Soc. D (The Statistician) 47, 183–189. https://doi.org/10.1111/1467-9884.00122
Wright, D.B., Herrington, J.A., 2011. Problematic standard errors and confidence intervals for skewness and kurtosis. Behav. Res. Methods 43, 8–17. https://doi.org/10.3758/s13428-010-0044-x
x <- c(10:17, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14, 15, 15)
kurtosis(x)
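
The three excess-kurtosis estimators compared by Joanes and Gill (1998) can be written out in base R as below. This is a sketch under the assumption that the package's `method` labels "g2", "G2" and "b2" correspond to these formulas; `kurtosis_estimates` is a hypothetical name and no test or confidence interval is computed.

```r
# Sketch of the g2, G2 and b2 excess-kurtosis estimators (hypothetical helper).
kurtosis_estimates <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  g2 <- mean(d^4) / mean(d^2)^2 - 3   # moment-based estimator
  c(
    g2 = g2,
    G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6),
    b2 = (g2 + 3) * (1 - 1 / n)^2 - 3
  )
}

kurtosis_estimates(1:5)
```

All three converge to the same value as n grows; they differ noticeably only in small samples.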
A numeric vector.
leghorn_chick
A numeric vector of length 20.
Stephens, M.A., 2017. Tests Based on EDF Statistics. In: D’Agostino, R.B., Stephens, M.A. (Eds.), Goodness-of-Fit Techniques, 1st ed. Routledge, New York, p. 98. https://doi.org/10.1201/9780203753064
The standard output format for the normality package.
normality_standard_output(method = "what test?", is_normal = NA, alpha = NA_real_, alternative = c("two.sided", "less", "greater"), summary_table = NULL, statistic = NA_real_, pvalue = NA_real_, confidence_interval = c(lower = NA_real_, upper = NA_real_))

| Argument | Description |
|---|---|
| method | Character. The name of the test. |
| is_normal | Logical. Is the input data normally distributed? |
| alpha | Numeric (default: NA_real_). Significance threshold. |
| alternative | Character. The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater"). |
| summary_table | Statistic summary, if any. |
| statistic | Numeric. The value used to calculate the p-value. |
| pvalue | Numeric. The p-value of the test. |
| confidence_interval | Numeric vector of length 2. The lower and upper bounds of the CI. |

A list containing 8 elements.
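
The shape of such a result can be illustrated with a plain base-R list that mirrors the documented fields. This is a sketch only: `make_result` is a hypothetical helper, and setting `is_normal` to `pvalue > alpha` is an assumption about how the flag might be derived.

```r
# Hypothetical constructor mirroring the eight documented fields.
make_result <- function(method, statistic, pvalue, alpha = 0.05,
                        alternative = "two.sided") {
  list(
    method = method,
    is_normal = pvalue > alpha,   # assumed rule: fail to reject at level alpha
    alpha = alpha,
    alternative = alternative,
    summary_table = NULL,         # kept as a NULL element by list()
    statistic = statistic,
    pvalue = pvalue,
    confidence_interval = c(lower = NA_real_, upper = NA_real_)
  )
}

res <- make_result("example test", statistic = 0.42, pvalue = 0.31)
names(res)
```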
Coefficients (ai) for the W test for normality.
Shapiro_Wilk_coef_table
A data frame with 50 rows and 25 variables:
Row names give the sample size (n); column names give the corresponding coefficients (ai).
Shapiro, S.S., Wilk, M.B., 1965. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 52, 591–611. https://doi.org/10.2307/2333709
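
How the coefficients ai enter the W statistic can be sketched in base R. The helper `shapiro_wilk_w` is hypothetical and takes the coefficient vector as an explicit argument rather than reading it from the table, whose exact layout is not assumed here. The example uses n = 3, for which the single tabulated coefficient is a1 = 0.7071.

```r
# Sketch of W = b^2 / SS, with b = sum of a_i * (x_(n+1-i) - x_(i)).
shapiro_wilk_w <- function(x, a) {
  x <- sort(x)
  n <- length(x)
  k <- length(a)
  stopifnot(k == n %/% 2)                     # one coefficient per symmetric pair
  b <- sum(a * (x[n:(n - k + 1)] - x[1:k]))   # weighted spread of order statistics
  b^2 / sum((x - mean(x))^2)
}

a_n3 <- 0.7071                        # a1 for n = 3
shapiro_wilk_w(c(1, 2, 3), a_n3)      # evenly spaced: W close to 1
shapiro_wilk_w(c(1, 1, 4), a_n3)      # close to 0.75, the minimum of W for n = 3
```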
The percentage points (critical values of W) of the W test for n = 3(1)50.
Shapiro_Wilk_pval_table
A data frame with 50 rows and 10 variables:
Row names give the sample size (n); column names give the corresponding p-values.
Shapiro, S.S., Wilk, M.B., 1965. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 52, 591–611. https://doi.org/10.2307/2333709
Performs the Shapiro-Wilk normality test, which is based on regression and correlation techniques.
Shapiro_Wilk_test(x, alpha = 0.05, method = c("SWR", "SF", "SW"), silent = FALSE)

| Argument | Description |
|---|---|
| x | A numeric vector. |
| alpha | Significance threshold (default: 0.05). |
| method | Character (default: "SWR"). Which modification of the test to use. Available options are c("SWR", "SF", "SW"). |
| silent | Logical (default: FALSE). If |

method
"SW": Shapiro-Wilk, the original test (Shapiro and Wilk, 1965). Only applicable when 3 <= n <= 50.
"SF": Shapiro-Francia, modified by Francia (Shapiro and Francia, 1972), and finally simplified and extended by Royston (Royston, 1993). Only applicable when 5 <= n <= 5000.
"SWR": Shapiro-Wilk-Royston, modified by Royston (Royston, 1995). Only applicable when 3 <= n <= 5000.
A list.
Shapiro, S.S., Wilk, M.B., 1965. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 52, 591–611. https://doi.org/10.2307/2333709
Shapiro, S.S., Francia, R.S., 1972. An Approximate Analysis of Variance Test for Normality. J. Am. Stat. Assoc. 67, 215–216. https://doi.org/10.1080/01621459.1972.10481232
Royston, P., 1993. A pocket-calculator algorithm for the Shapiro–Francia test for non-normality: an application to medicine. Stat. Med. 12, 181–184. https://doi.org/10.1002/sim.4780120209
Royston, P., 1992. Approximating the Shapiro–Wilk W-test for non-normality. Stat. Comput. 2, 117–119. https://doi.org/10.1007/BF01891203
Royston, P., 1995. Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality. Appl. Stat. 44, 547–551. https://doi.org/10.2307/2986146
Shapiro_Wilk_test(rnorm(20), method = "SW")
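
As an illustration of the "SF" variant, the Shapiro-Francia statistic and Royston's (1993) normalising approximation can be sketched in base R. The helper `shapiro_francia` is a hypothetical name and is not the package's implementation; the constants are those of Royston's approximation as commonly reproduced. Base R's `shapiro.test()` offers an independent implementation of Royston's version of the Shapiro-Wilk test for comparison.

```r
# Sketch of the Shapiro-Francia test (hypothetical helper).
shapiro_francia <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  stopifnot(n >= 5, n <= 5000)
  m <- qnorm(ppoints(n, a = 3 / 8))   # approximate expected normal order statistics
  w <- cor(x, m)^2                    # squared correlation with the normal scores
  u <- log(n)
  v <- log(u)
  mu <- -1.2725 + 1.0521 * (v - u)
  sigma <- 1.0308 - 0.26758 * (v + 2 / u)
  z <- (log(1 - w) - mu) / sigma      # approximately standard normal under H0
  list(W = w, z = z, pvalue = pnorm(z, lower.tail = FALSE))
}

shapiro_francia(qnorm(ppoints(30)))              # near-normal sample
shapiro_francia(exp(seq(0, 5, length.out = 30))) # strongly skewed sample
```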
Skewness test
skewness(x, alpha = 0.05, alternative = c("two.sided", "less", "greater"), method = c("G1", "b1", "g1"))

| Argument | Description |
|---|---|
| x | Numeric vector. The input data. |
| alpha | Numeric (default: 0.05). Significance threshold (0 - 1). |
| alternative | Character (default: "two.sided"). The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater"). |
| method | Character (default: "G1"). The skewness formula to use. Available options are c("G1", "b1", "g1"). "g1" is the original formula; "G1" and "b1" are sample-size-adjusted variants of "g1" (Joanes and Gill, 1998). |

A list:

| Element | Description |
|---|---|
| is_normal | Is the input data normally distributed? |
| method | The name of the test. |
| alpha | Significance threshold (default: 0.05). |
| alternative | The alternative hypothesis (H1) to test. |
| summary_table | Statistic summary, if any. |
| statistic | The value used to calculate p-value. |
| pvalue | p-value. |
| confidence_interval | The lower and upper bound of CI. |

Joanes, D.N., Gill, C.A., 1998. Comparing measures of sample skewness and kurtosis. J. R. Stat. Soc. D (The Statistician) 47, 183–189. https://doi.org/10.1111/1467-9884.00122
Wright, D.B., Herrington, J.A., 2011. Problematic standard errors and confidence intervals for skewness and kurtosis. Behav. Res. Methods 43, 8–17. https://doi.org/10.3758/s13428-010-0044-x
skewness(cholesterol)
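
The three skewness estimators compared by Joanes and Gill (1998) can be written out in base R as follows. This sketch assumes that the `method` labels "g1", "G1" and "b1" correspond to these formulas; `skewness_estimates` is a hypothetical name and no test or confidence interval is computed.

```r
# Sketch of the g1, G1 and b1 skewness estimators (hypothetical helper).
skewness_estimates <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  g1 <- mean(d^3) / mean(d^2)^1.5   # moment-based estimator
  c(
    g1 = g1,
    G1 = g1 * sqrt(n * (n - 1)) / (n - 2),
    b1 = g1 * ((n - 1) / n)^1.5
  )
}

skewness_estimates(c(1, 2, 3, 4, 10))  # positive: long right tail
```

The three values always share the same sign, since G1 and b1 are positive multiples of g1.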