## Friday, April 2, 2010

### Python, statistics : Test of Variances

It is 2:23 AM, April 3 and Blogger is encountering problems saving this blog post. I will make a mirror copy at my alternate site Digital Explorations.

In this post, we discuss the two cases of testing variances. One is testing whether a population variance or standard deviation is equal to some assumed variance. The second case is testing whether two populations have the same variance. By taking samples from the population and computing the sample
variance, we should be able to form decisions whether the hypothesis should be accepted or not.

Case Null Test statistic df
single sample$\sigma^2 = \sigma_0^2$$\chi^2 =(n-1) \frac{s^2}{\sigma_0^2} df = n-1 two sample\sigma_1^2=\sigma_2^2$$f= \frac{s_1^2}{s_2^2}$$df1 =n1-1, df2= n2-1$

Our file "testofvariances.py" consist of two functions to implement the above cases of testing
variances. For case 1, we call vartest, and for case 2, we call twovartest. Both functions return
a tuple (p-value, test stat, critical value(s)), and both functions allows for three alternatives:left-sided, double sided and right-sided tests.

"""
file         testofvariances.py
author    dr.ernesto p. adorio
U.P. Clark
Clark Field, Pampanga
"""
from math import *
from rstats import *

def  vartest(svar, pvar0, n, alpha, side=1):
"""
Performs a basic chisq test for variance with
H0: sigma^2 = \sigma_0^2
Arguments:
svar - sample variance
pvar0 - \sigma_0^2 assumed population variance
n - sample size
"""
teststat = svar/pvar0 * (n -1)
df = n-1
if side == 0:
crit1 = qchisq(alpha/2, df)
crit2 = qchisq(1-alpha/2,df)
pvalue = p(teststat, df)
if pvalue > 0.5:
pvalue = 1-0.5
pvalue *=2
return pvalue, teststat, (crit1, crit2)
elif side == -1:
crit = qchisq(alpha, df)
pvalue = p(teststat, df)
return pvalue, teststat, crit
else:
crit = qchisq(1-alpha, df)
pvalue = 1- pchisq(teststat, df)
return pvalue,  teststat,  crit

def twovartest(svar1, n1, svar2, n2, alpha, side):
"""
Performs an F-test  for testing the null  pvar_1 = pvar2
"""
df1 = n1-1
df2 = n2-1
fteststat = float(svar1)/svar2
if side == 0:
#print fteststat,  df1,  df2
crit1 = qf(alpha/2.0,  df1,  df2)
crit2 = qf(1-alpha/2.0,   df1,  df2)
pvalue = pf(fteststat,  df1,  df2)
if pvalue > 0.5:
pvalue = 1-0.5
pvalue *=2
return pvalue,  fteststat,  (crit1,  crit2)
elif side == -1:
crit = qf(alpha,  df1,  df2 )
pvalue = pf(fteststat,  df1,  df2)
else:
crit = qf(1-alpha,    df1,  df2 )
pvalue = pf(fteststat,  df1, df2)
return pvalue,  fteststat,  crit

def Walpole_tests():
print "Example 8, p.320"
print  vartest(1.2**2,  0.9 * 0.9,  10,  alpha= 0.05, side=1)
print
print "Example 9, p.322"
print twovartest(16, 12, 25,  10, alpha= 0.10,  side = 0)
print

if __name__=="__main__":
Walpole_tests()


When the above file is run under the Python interpreter, it returns the following output:

toto@toto-laptop:~/Blogs/statistics\$ python testofvariances.py
Example 8, p.320
(0.066881587774126672, 16.0, 16.918977604620448)

Example 9, p.322
(0.47864961042191279, 0.64000000000000001, (0.34527730855736571, 3.1024854075283774))


The results looks ok with us at the moment.