Package 'normality'

Title: Tests for Departure from Normality
Description: A toolkit for assessing data normality using a comprehensive collection of statistical methods. It includes descriptive measures and formal hypothesis tests, such as skewness and kurtosis tests, the Anderson–Darling test, the Shapiro–Wilk test, and the D'Agostino–Pearson K2 omnibus test.
Authors: Joon-Keat Lai [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-9840-5836>)
Maintainer: Joon-Keat Lai <[email protected]>
License: MIT + file LICENSE
Version: 0.0.2
Built: 2026-06-10 12:01:15 UTC
Source: https://github.com/p10911004-npust/normality

Help Index


Anderson-Darling Normality Test

Description

Performs the Anderson-Darling (A2) normality test, which is based on the empirical distribution function (EDF).

Usage

Anderson_Darling_test(x, alpha = 0.05, silent = FALSE)

Arguments

x

A numeric vector.

alpha

Numeric (default: 0.05). Significance threshold, between 0 and 1.

silent

Logical (default: FALSE). If FALSE, the results are printed.

Value

A list.

References

D’Agostino, R.B., 2017. Tests for the Normal Distribution. In: D’Agostino, R.B., Stephens, M.A. (Eds.), Goodness-of-Fit Techniques, 1st ed. Routledge, New York, pp. 372–373. https://doi.org/10.1201/9780203753064

Stephens, M.A., 2017. Tests Based on EDF Statistics. In: D’Agostino, R.B., Stephens, M.A. (Eds.), Goodness-of-Fit Techniques, 1st ed. Routledge, New York, pp. 126–128. https://doi.org/10.1201/9780203753064

Anderson, T.W., Darling, D.A., 1954. A Test of Goodness of Fit. J. Am. Stat. Assoc. 49, 765–769. https://doi.org/10.1080/01621459.1954.10501232

Examples

Anderson_Darling_test(leghorn_chick)
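
To make the EDF-based statistic concrete, the following base-R sketch computes A2 for a sample standardized by its own mean and standard deviation, together with the small-sample modification given by Stephens. The helper name ad_statistic is hypothetical and is not part of the package; the package's own implementation may differ in details such as NA handling or how the p-value is derived.

```r
# Sketch of the Anderson-Darling A^2 statistic for normality when the mean
# and standard deviation are estimated from the data.
# `ad_statistic` is a hypothetical helper, not a function of this package.
ad_statistic <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  z <- (x - mean(x)) / sd(x)

  # log F(z_(i)) and log(1 - F(z_(n+1-i))) under the standard normal CDF
  log_lower <- pnorm(z, log.p = TRUE)
  log_upper <- pnorm(rev(z), lower.tail = FALSE, log.p = TRUE)

  i <- seq_len(n)
  a2 <- -n - sum((2 * i - 1) * (log_lower + log_upper)) / n

  # Modification for the case where both parameters are estimated
  a2_star <- a2 * (1 + 0.75 / n + 2.25 / n^2)

  c(A2 = a2, A2_star = a2_star)
}

set.seed(42)
ad_statistic(rnorm(30))
```

Because the data are standardized first, the statistic is unchanged by shifting or rescaling the input.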

Cholesterol data

Description

A numeric vector of cholesterol values from a sample of 62 subjects in the Framingham Heart Study (FHS). The dataset was taken from D'Agostino et al. (1990).

Usage

cholesterol

Format

A numeric vector of length 62.

References

D’Agostino, R.B., Belanger, A., D’Agostino Jr., R.B., 1990. A Suggestion for Using Powerful and Informative Tests of Normality. Am. Stat. 44, 316–321. https://doi.org/10.1080/00031305.1990.10475751


D'Agostino-Pearson K2 Normality Test

Description

The D'Agostino–Pearson Chi-square (K2) test is a moment test for assessing whether a sample comes from a normal distribution. It combines information from skewness (asymmetry) and kurtosis (tail heaviness) into a single omnibus test statistic.

Usage

D.Agostino_Pearson_test(
  x,
  alpha = 0.05,
  alternative = c("two.sided", "less", "greater"),
  silent = FALSE
)

Arguments

x

A numeric vector.

alpha

Significance threshold (default: 0.05).

alternative

Character (default: "two.sided"). The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater"). Note that this applies only to the skewness and kurtosis tests.

silent

Logical (default: FALSE). If FALSE, the results are printed.

Value

A list.

References

D’Agostino, R.B., Belanger, A., D’Agostino Jr., R.B., 1990. A Suggestion for Using Powerful and Informative Tests of Normality. Am. Stat. 44, 316–321. https://doi.org/10.1080/00031305.1990.10475751

Examples

D.Agostino_Pearson_test(cholesterol)
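
The omnibus step can be sketched in base R. Following D’Agostino et al. (1990), the standardized skewness and kurtosis statistics are squared and summed, and the result is compared with a chi-square distribution with 2 degrees of freedom. The two z-values below are hypothetical placeholders chosen for illustration; they are not computed from any dataset in this package, and the transformation that produces them is not shown.

```r
# Hypothetical standardized statistics, for illustration only.
z_skewness <- 1.5   # assumed Z score of the skewness statistic
z_kurtosis <- -0.8  # assumed Z score of the kurtosis statistic

# Omnibus statistic: sum of the two squared Z scores
k2 <- z_skewness^2 + z_kurtosis^2

# Upper-tail probability of a chi-square distribution with 2 df
p_value <- pchisq(k2, df = 2, lower.tail = FALSE)

c(K2 = k2, p_value = p_value)
```

Since the signs are lost when squaring, K2 is always an upper-tail test, which is consistent with the note that the alternative argument affects only the skewness and kurtosis tests.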

Tied data

Description

Checks whether the input is considered tied data, based on a ratio threshold.

Usage

is_tied(x, ratio = 0.3, remove_NA = FALSE)

Arguments

x

A numeric vector.

ratio

Numeric (default: 0.3). The ratio threshold above which the data are considered tied. The value ranges from 0 to 1.

remove_NA

Logical (default: FALSE). Whether or not to remove NAs.

Value

Logical

Examples

is_tied(c(1, 1, 2, 2, 2, 3, 4, 5))
#> TRUE
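
The manual does not state how the tie ratio is computed. One plausible reading, shown here purely as an assumption, is the share of observations that repeat an earlier value. The helper tie_ratio is hypothetical and is not part of the package; under this assumed definition the example vector exceeds the default threshold of 0.3, which agrees with the documented output.

```r
# Assumed definition (not confirmed by the package): the proportion of
# observations that duplicate an earlier value.
# `tie_ratio` is a hypothetical helper, not a function of this package.
tie_ratio <- function(x, remove_NA = FALSE) {
  if (remove_NA) x <- x[!is.na(x)]
  sum(duplicated(x)) / length(x)
}

x <- c(1, 1, 2, 2, 2, 3, 4, 5)
tie_ratio(x)        # 3 of 8 values repeat an earlier one: 0.375
tie_ratio(x) > 0.3  # TRUE under this assumed definition
```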

Kurtosis test

Description

Computes the sample kurtosis and tests whether it departs from the value expected under normality.

Usage

kurtosis(
  x,
  alpha = 0.05,
  alternative = c("two.sided", "less", "greater"),
  method = c("G2", "b2", "g2")
)

Arguments

x

Numeric vector. The input data.

alpha

Numeric (default: 0.05). Significance threshold, between 0 and 1.

alternative

Character (default: "two.sided"). The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater").

method

Character (default: "G2"). The kurtosis formula to use. Available options are c("G2", "b2", "g2"). The "g2" is the original moment-based estimator. The "G2" and "b2" are sample-size-adjusted variants of "g2" (Joanes and Gill, 1998).

Value

A list with the following elements:

  • is_normal: Is the input data normally distributed?

  • method: The name of the test.

  • alpha: Significance threshold (default: 0.05).

  • alternative: The alternative hypothesis (H1) to test.

  • summary_table: Statistic summary, if any.

  • statistic: The value used to calculate p-value.

  • pvalue: p-value.

  • confidence_interval: The lower and upper bound of CI.

References

Joanes, D.N., Gill, C.A., 1998. Comparing measures of sample skewness and kurtosis. J. R. Stat. Soc. D (The Statistician) 47, 183–189. https://doi.org/10.1111/1467-9884.00122

Wright, D.B., Herrington, J.A., 2011. Problematic standard errors and confidence intervals for skewness and kurtosis. Behav. Res. Methods 43, 8–17. https://doi.org/10.3758/s13428-010-0044-x

Examples

x <- c(10:17, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14, 15, 15)
kurtosis(x)
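
The three method names match the excess-kurtosis estimators compared by Joanes and Gill (1998). The base-R sketch below writes out those formulas so the relationship between them is visible. The helper kurtosis_estimates is hypothetical; it only computes the point estimates and does not reproduce the package's test, p-value, or confidence interval.

```r
# Excess-kurtosis estimators as defined in Joanes and Gill (1998).
# `kurtosis_estimates` is a hypothetical helper, not a function of this package.
kurtosis_estimates <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2)  # second central moment (divisor n)
  m4 <- mean(d^4)  # fourth central moment (divisor n)

  g2 <- m4 / m2^2 - 3
  G2 <- (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
  b2 <- (g2 + 3) * (1 - 1 / n)^2 - 3

  c(g2 = g2, G2 = G2, b2 = b2)
}

x <- c(10:17, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14, 15, 15)
kurtosis_estimates(x)
```

All three are functions of g2 and n only, so they converge as the sample size grows.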

Leghorn chicken data

Description

A numeric vector of 20 observations, taken from Stephens (2017).

Usage

leghorn_chick

Format

A numeric vector of length 20.

References

Stephens, M.A., 2017. Tests Based on EDF Statistics. In: D’Agostino, R.B., Stephens, M.A. (Eds.), Goodness-of-Fit Techniques, 1st ed. Routledge, New York, p. 98. https://doi.org/10.1201/9780203753064


Standard output format

Description

The standard output format for the normality package.

Usage

normality_standard_output(
  method = "what test?",
  is_normal = NA,
  alpha = NA_real_,
  alternative = c("two.sided", "less", "greater"),
  summary_table = NULL,
  statistic = NA_real_,
  pvalue = NA_real_,
  confidence_interval = c(lower = NA_real_, upper = NA_real_)
)

Arguments

method

Character. The name of the test.

is_normal

Logical. Is the input data normally distributed?

alpha

Numeric (default: NA_real_). Significance threshold.

alternative

Character. The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater").

summary_table

Statistic summary, if any.

statistic

Numeric. The value used to calculate p-value.

pvalue

Numeric. The p-value of the test.

confidence_interval

Numeric vector of length 2. The lower and upper bound of CI.

Value

A list of 8 elements.
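
As an illustration of the shape of such a result, the sketch below builds a list with the eight documented element names in base R. The constructor make_result is a hypothetical stand-in that mirrors the documented arguments; it is not the package function, and the values passed to it are made up.

```r
# Hypothetical stand-in mirroring the documented argument names.
# It is not `normality_standard_output()` itself.
make_result <- function(method = "what test?",
                        is_normal = NA,
                        alpha = NA_real_,
                        alternative = "two.sided",
                        summary_table = NULL,
                        statistic = NA_real_,
                        pvalue = NA_real_,
                        confidence_interval = c(lower = NA_real_, upper = NA_real_)) {
  list(
    is_normal = is_normal,
    method = method,
    alpha = alpha,
    alternative = alternative,
    summary_table = summary_table,  # kept as a NULL element when absent
    statistic = statistic,
    pvalue = pvalue,
    confidence_interval = confidence_interval
  )
}

# Made-up values, for illustration only
res <- make_result(method = "demo test", is_normal = TRUE, alpha = 0.05,
                   statistic = 2.89, pvalue = 0.24)
names(res)
```

Returning every test result in one fixed shape lets downstream code read fields such as res$pvalue without checking which test produced them.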


Shapiro-Wilk normality test (coefficients)

Description

Coefficients (ai) for the W test for normality.

Usage

Shapiro_Wilk_coef_table

Format

A data frame with 50 rows and 25 variables:

Row names are the sample sizes (n); column names are the corresponding coefficients (ai).

References

Shapiro, S.S., Wilk, M.B., 1965. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 52, 591–611. https://doi.org/10.2307/2333709


Shapiro-Wilk normality test (p-values)

Description

The percentage points (critical values of W) of the W test for n = 3(1)50.

Usage

Shapiro_Wilk_pval_table

Format

A data frame with 50 rows and 10 variables:

Row names are the sample sizes (n); column names are the corresponding p-values.

References

Shapiro, S.S., Wilk, M.B., 1965. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 52, 591–611. https://doi.org/10.2307/2333709


Shapiro-Wilk Normality Test

Description

Performs the Shapiro-Wilk normality test, which is based on regression and correlation techniques.

Usage

Shapiro_Wilk_test(
  x,
  alpha = 0.05,
  method = c("SWR", "SF", "SW"),
  silent = FALSE
)

Arguments

x

A numeric vector.

alpha

Significance threshold (default: 0.05).

method

Character (default: "SWR"). Which variant of the test to use. Available options are c("SWR", "SF", "SW").

silent

Logical (default: FALSE). If FALSE, the results are printed.

Details

method

  • "SW": Shapiro-Wilk, the original test (Shapiro and Wilk, 1965). Only applicable when 3 <= n <= 50.

  • "SF": Shapiro-Francia, a modification by Shapiro and Francia (1972), later simplified and extended by Royston (1993). Only applicable when 5 <= n <= 5000.

  • "SWR": Shapiro-Wilk-Royston, a modification by Royston (1995). Only applicable when 3 <= n <= 5000.

Value

A list.

References

Shapiro, S.S., Wilk, M.B., 1965. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 52, 591–611. https://doi.org/10.2307/2333709

Shapiro, S.S., Francia, R.S., 1972. An Approximate Analysis of Variance Test for Normality. J. Am. Stat. Assoc. 67, 215–216. https://doi.org/10.1080/01621459.1972.10481232

Royston, P., 1993. A pocket-calculator algorithm for the Shapiro–Francia test for non-normality: an application to medicine. Stat. Med. 12, 181–184. https://doi.org/10.1002/sim.4780120209

Royston, P., 1992. Approximating the Shapiro–Wilk W-test for non-normality. Stat. Comput. 2, 117–119. https://doi.org/10.1007/BF01891203

Royston, P., 1995. Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality. Appl. Stat. 44, 547–551. https://doi.org/10.2307/2986146

Examples

Shapiro_Wilk_test(rnorm(20), method = "SW")
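
The correlation idea behind this family of tests can be sketched with the Shapiro-Francia statistic, which is the squared correlation between the ordered sample and approximate expected normal order statistics. The helper shapiro_francia_w below is hypothetical and uses Blom's approximation for the normal scores; it returns only the statistic, not a p-value, and is not guaranteed to match the package's "SF" output. Base R's shapiro.test() is shown alongside for comparison.

```r
# Sketch of the Shapiro-Francia W' statistic.
# `shapiro_francia_w` is a hypothetical helper, not a function of this package.
shapiro_francia_w <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  # Blom's approximation to the expected standard normal order statistics
  m <- qnorm((seq_len(n) - 0.375) / (n + 0.25))
  cor(x, m)^2
}

set.seed(1)
x <- rnorm(20)
shapiro_francia_w(x)
shapiro.test(x)$statistic  # base R Shapiro-Wilk W, for comparison
```

Values close to 1 indicate that the ordered data line up well with the normal scores; smaller values point to departure from normality.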

Skewness test

Description

Computes the sample skewness and tests whether it departs from the value expected under normality.

Usage

skewness(
  x,
  alpha = 0.05,
  alternative = c("two.sided", "less", "greater"),
  method = c("G1", "b1", "g1")
)

Arguments

x

Numeric vector. The input data.

alpha

Numeric (default: 0.05). Significance threshold, between 0 and 1.

alternative

Character (default: "two.sided"). The alternative hypothesis (H1) to test. Available options are c("two.sided", "less", "greater").

method

Character (default: "G1"). The skewness formula to use. Available options are c("G1", "b1", "g1"). The "g1" is the original moment-based estimator. The "G1" and "b1" are sample-size-adjusted variants of "g1" (Joanes and Gill, 1998).

Value

A list with the following elements:

  • is_normal: Is the input data normally distributed?

  • method: The name of the test.

  • alpha: Significance threshold (default: 0.05).

  • alternative: The alternative hypothesis (H1) to test.

  • summary_table: Statistic summary, if any.

  • statistic: The value used to calculate p-value.

  • pvalue: p-value.

  • confidence_interval: The lower and upper bound of CI.

References

Joanes, D.N., Gill, C.A., 1998. Comparing measures of sample skewness and kurtosis. J. R. Stat. Soc. D (The Statistician) 47, 183–189. https://doi.org/10.1111/1467-9884.00122

Wright, D.B., Herrington, J.A., 2011. Problematic standard errors and confidence intervals for skewness and kurtosis. Behav. Res. Methods 43, 8–17. https://doi.org/10.3758/s13428-010-0044-x

Examples

skewness(cholesterol)
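
The three method names match the skewness estimators compared by Joanes and Gill (1998). The base-R sketch below writes out those formulas. The helper skewness_estimates is hypothetical; it only computes the point estimates and does not reproduce the package's test, p-value, or confidence interval.

```r
# Skewness estimators as defined in Joanes and Gill (1998).
# `skewness_estimates` is a hypothetical helper, not a function of this package.
skewness_estimates <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2)  # second central moment (divisor n)
  m3 <- mean(d^3)  # third central moment (divisor n)

  g1 <- m3 / m2^1.5
  G1 <- g1 * sqrt(n * (n - 1)) / (n - 2)
  b1 <- g1 * ((n - 1) / n)^1.5

  c(g1 = g1, G1 = G1, b1 = b1)
}

# A small right-skewed sample, for illustration only
skewness_estimates(c(1, 2, 3, 4, 10))
```

The three estimates always share the same sign and differ only by a factor that depends on n.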