Friday, April 2, 2010

Python, statistics : Test of Variances

It is 2:23 AM, April 3 and Blogger is encountering problems saving this blog post. I will make a mirror copy at my alternate site Digital Explorations.

In this post, we discuss the two cases of testing variances. One is testing whether a population variance or standard deviation is equal to some assumed variance. The second case is testing whether two populations have the same variance. By taking samples from the population and computing the sample
variance, we should be able to form decisions whether the hypothesis should be accepted or not.


Case Null Test statistic df
single sample$\sigma^2 = \sigma_0^2$$\chi^2 =(n-1) \frac{s^2}{\sigma_0^2}$ $df = n-1$
two sample$\sigma_1^2=\sigma_2^2$$f= \frac{s_1^2}{s_2^2}$$df1 =n1-1, df2= n2-1$



Our file "testofvariances.py" consist of two functions to implement the above cases of testing
variances. For case 1, we call vartest, and for case 2, we call twovartest. Both functions return
a tuple (p-value, test stat, critical value(s)), and both functions allows for three alternatives:left-sided, double sided and right-sided tests.

"""
file         testofvariances.py
author    dr.ernesto p. adorio
               U.P. Clark
               Clark Field, Pampanga
"""
from math import *
from rstats import *

def  vartest(svar, pvar0, n, alpha, side=1):
    """
    Performs a basic chisq test for variance with
    H0: sigma^2 = \sigma_0^2
    Arguments:
        svar - sample variance
        pvar0 - \sigma_0^2 assumed population variance
        n - sample size
    """
    teststat = svar/pvar0 * (n -1)
    df = n-1
    if side == 0:
        crit1 = qchisq(alpha/2, df)
        crit2 = qchisq(1-alpha/2,df)
        pvalue = p(teststat, df)
        if pvalue > 0.5:
           pvalue = 1-0.5
        pvalue *=2
        return pvalue, teststat, (crit1, crit2)
    elif side == -1:
        crit = qchisq(alpha, df)
        pvalue = p(teststat, df)
        return pvalue, teststat, crit
    else:
         crit = qchisq(1-alpha, df)
         pvalue = 1- pchisq(teststat, df)
    return pvalue,  teststat,  crit
    
def twovartest(svar1, n1, svar2, n2, alpha, side):
    """
    Performs an F-test  for testing the null  pvar_1 = pvar2
    """
    df1 = n1-1
    df2 = n2-1
    fteststat = float(svar1)/svar2
    if side == 0:
        #print fteststat,  df1,  df2       
        crit1 = qf(alpha/2.0,  df1,  df2)
        crit2 = qf(1-alpha/2.0,   df1,  df2)
        pvalue = pf(fteststat,  df1,  df2)
        if pvalue > 0.5:
           pvalue = 1-0.5
        pvalue *=2
        return pvalue,  fteststat,  (crit1,  crit2)
    elif side == -1:
        crit = qf(alpha,  df1,  df2 )
        pvalue = pf(fteststat,  df1,  df2)
    else:
        crit = qf(1-alpha,    df1,  df2 )
        pvalue = pf(fteststat,  df1, df2)
    return pvalue,  fteststat,  crit
    
def Walpole_tests():
    print "Example 8, p.320"
    print  vartest(1.2**2,  0.9 * 0.9,  10,  alpha= 0.05, side=1)
    print
    print "Example 9, p.322"
    print twovartest(16, 12, 25,  10, alpha= 0.10,  side = 0)
    print
    
if __name__=="__main__":
    Walpole_tests()

When the above file is run under the Python interpreter, it returns the following output:


toto@toto-laptop:~/Blogs/statistics$ python testofvariances.py
Example 8, p.320
(0.066881587774126672, 16.0, 16.918977604620448)

Example 9, p.322
(0.47864961042191279, 0.64000000000000001, (0.34527730855736571, 3.1024854075283774))

The results looks ok with us at the moment.

No comments:

Post a Comment