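Important Nonparametric Tests
Goodness-of-fit tests
The W test for normality
The Shapiro-Wilk W statistic is used to test whether a sample comes from a normal distribution.
R command: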
shapiro.test(x)
x -- a numeric vector of data values. Missing values are allowed, but the number of non-missing values must be between 3 and 5000.
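As a minimal sketch of how the W test might be used, the following applies shapiro.test to two simulated samples. The data, the seed, and the variable names (x_norm, x_exp) are assumptions made for illustration only and do not come from the original text.

```r
# Illustrative data only: one normal sample and one clearly skewed sample.
set.seed(1)                               # fixed seed so the sketch is reproducible
x_norm <- rnorm(100, mean = 10, sd = 2)   # drawn from a normal distribution
x_exp  <- rexp(100, rate = 1)             # drawn from an exponential distribution

res_norm <- shapiro.test(x_norm)
res_exp  <- shapiro.test(x_exp)

# W lies in (0, 1]; values close to 1 are consistent with normality.
print(res_norm$statistic)
print(res_norm$p.value)

# For the skewed sample the p-value should be very small,
# so normality would be rejected at the usual significance levels.
print(res_exp$statistic)
print(res_exp$p.value)
```

The returned object is a list of class "htest", so the statistic and the p-value can be extracted by name as shown.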
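Kolmogorov-Smirnov test based on the empirical distribution
The test statistic is built from the distance between the hypothesized population distribution function $F_0(x)$ and the empirical distribution function $F_n(x)$: $D = \sup_{-\infty < x < \infty} |F_n(x) - F_0(x)|$. In principle it can be used to test against any fully specified continuous distribution.
(1) One-sample test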
ks.test(x, "pexp", 1/1500)
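To make the definition of D concrete, here is a hedged sketch that computes the statistic by hand for an exponential null distribution with rate 1/1500 and compares it with the value reported by ks.test. The simulated sample and the names rate, F0, D_plus, D_minus and D_manual are assumptions introduced for illustration.

```r
# Illustrative data only: 50 draws from the hypothesized exponential distribution.
set.seed(2)
rate <- 1 / 1500
x <- rexp(50, rate = rate)

n  <- length(x)
F0 <- pexp(sort(x), rate = rate)          # hypothesized CDF at the ordered sample

# The empirical CDF jumps from (i - 1)/n to i/n at the i-th ordered value,
# so the supremum is attained at one side of one of these jumps.
D_plus   <- max(seq_len(n) / n - F0)
D_minus  <- max(F0 - (seq_len(n) - 1) / n)
D_manual <- max(D_plus, D_minus)

res <- ks.test(x, "pexp", rate)
print(D_manual)
print(unname(res$statistic))              # should agree with D_manual
print(res$p.value)
```

Because the empirical distribution function is a step function, checking both sides of each jump is enough to find the supremum.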
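(2) Two-sample test
Suppose $X_1, X_2, \dots, X_{n_1}$ is a sample from a population with distribution $F(x)$, where $F(x)$ is unknown, and $Y_1, Y_2, \dots, Y_{n_2}$ is a sample from a population with distribution $G(x)$, where $G(x)$ is also unknown. To test whether the two distributions are the same, the null hypothesis is $H_0: F(x) = G(x)$.
R command: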
ks.test(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL)
x -- a numeric vector of data values.
y -- either a numeric vector of data values, or a character string naming a cumulative distribution function or an actual cumulative distribution function such as pnorm. Only continuous CDFs are valid.
... -- parameters of the distribution specified (as a character string) by y.
alternative -- indicates the alternative hypothesis and must be one of two.sided (default), less, or greater. You can specify just the initial letter of the value, but the argument name must be given in full. See ‘Details’ for the meanings of the possible values.
exact -- NULL or a logical indicating whether an exact p-value should be computed. See ‘Details’ for the meaning of NULL. Not available in the two-sample case for a one-sided test or if ties are present.
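The two-sample case can be sketched in the same way. The example below assumes two simulated samples whose populations differ by a location shift, computes the largest gap between the two empirical distribution functions, and compares it with ks.test. The data and the names Fx, Gy, pooled and D_manual are illustrative assumptions.

```r
# Illustrative data only: two normal samples with different means.
set.seed(3)
x <- rnorm(80)
y <- rnorm(100, mean = 1.5)

Fx <- ecdf(x)                             # empirical CDF of the first sample
Gy <- ecdf(y)                             # empirical CDF of the second sample

# Both empirical CDFs only change at observed values, so it is enough
# to evaluate the gap at the pooled sample points.
pooled   <- sort(c(x, y))
D_manual <- max(abs(Fx(pooled) - Gy(pooled)))

res <- ks.test(x, y)                      # two-sided test of H0: F(x) = G(x)
print(D_manual)
print(unname(res$statistic))              # should agree with D_manual
print(res$p.value)                        # small here, since the means differ
```

With a shift this large relative to the sample sizes, the null hypothesis of equal distributions is rejected at the usual levels.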
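Pearson's $\chi^2$ goodness-of-fit test
(1) When the theoretical distribution is completely known, the null hypothesis is $H_0$: the variable $X$ follows distribution $A$, and the alternative hypothesis is $H_1$: the variable $X$ does not follow distribution $A$. The test proceeds by dividing the real line $(-\infty, \infty)$ into $m$ intervals: $I_1 = (-\infty, a_1)$, $I_2 = [a_1, a_2)$, $\dots$, $I_m = [a_{m-1}, \infty)$. Denote the theoretical probabilities of these intervals by $p_1, p_2, \dots, p_m$, where $p_i = P(X \in I_i)$, $i = 1, 2, \dots, m$.
Let $n_i$ be the number of observations among $X_1, X_2, \dots, X_n$ that fall in interval $I_i$. Under the null hypothesis the expected value of $n_i$ is $np_i$, so the difference between $n_i$ and $np_i$ measures the deviation between theory and observation. The test statistic combines these deviations: $K = \sum_{i=1}^{m} \frac{(n_i - np_i)^2}{np_i}$. Under the null hypothesis, $K$ converges in distribution to a $\chi^2$ distribution with $m-1$ degrees of freedom as $n \to \infty$. For a given significance level $\alpha$, the null hypothesis is rejected if $K > \chi^2_\alpha(m-1)$, where $\chi^2_\alpha(m-1)$ is the upper $\alpha$ quantile of that distribution. Going further, once $K$ has been computed, the p-value $P = P\{\chi^2(m-1) > K\}$ can be calculated. This p-value can be regarded as the goodness of fit between the observed data and the null hypothesis: the larger it is, the more compatible the data are with the null hypothesis. For a given significance level $\alpha$, the null hypothesis is rejected when $P < \alpha$.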
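The computation of $K$ and its p-value can be sketched as follows. The example assumes a standard normal null distribution and the cut points $-1, 0, 1$ (so $m = 4$); both choices, the simulated data, and the variable names are illustrative assumptions rather than part of the original text. The hand-computed values are compared with those from R's chisq.test.

```r
# Illustrative data only: 200 draws, tested against a fully specified N(0, 1).
set.seed(4)
x <- rnorm(200)

a      <- c(-1, 0, 1)                     # hypothetical cut points a_1, ..., a_{m-1}
breaks <- c(-Inf, a, Inf)

# n_i: observed counts in the intervals (-Inf, a_1), [a_1, a_2), ..., [a_{m-1}, Inf)
counts <- as.vector(table(cut(x, breaks = breaks, right = FALSE)))

# p_i: theoretical interval probabilities under the null distribution
p <- diff(pnorm(breaks))

n <- length(x)
m <- length(p)

K       <- sum((counts - n * p)^2 / (n * p))            # Pearson statistic
p_value <- pchisq(K, df = m - 1, lower.tail = FALSE)    # P{chi^2(m - 1) > K}

res <- chisq.test(counts, p = p)          # same test via the built-in function
print(K)
print(unname(res$statistic))              # should agree with K
print(p_value)
print(res$p.value)                        # should agree with p_value
```

Passing the counts together with p makes chisq.test perform a goodness-of-fit test with $m - 1$ degrees of freedom, which matches the hand calculation.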
R command:
chisq.test(x, y = NULL, correct = TRUE, p = rep(1/length(x), length(x)), rescale.p = FALSE, simulate.p.value = FALSE, B = 2000)
x -- a numeric vector or matrix. x and y can also both be factors.
y -- a numeric vector; ignored if x is