Sunday, August 15, 2010

Python, Statistics: Computing the lower, mid and upper "hinges" of a sample.

The statistical hinges of a sample are designed for easy manual computation. For a sorted sample with an odd number of data points, the midhinge is the middle value. If the number of data points is even, the midhinge is the average of the two middle values. In other words, the midhinge as used here is the sample median.

The lower and upper hinges are the midhinges of the left and right subsamples obtained by splitting the sample in two, with the midhinge removed when it is an actual data point. For example, take the sample [1,2,3,4,5,6,7]. The midhinge is 4, the lower hinge is the midhinge of [1,2,3], which is 2, and the upper hinge is the midhinge of [5,6,7], which is 6.

On the other hand, for the sample [1,2,3,4,5,6], the midhinge is the mean of 3 and 4, which is 3.5.
The lower hinge is then the midhinge of the left subsample [1,2,3], which is 2, and the upper hinge is the midhinge of the right subsample [4,5,6], which is 5.
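The two worked examples can be checked by hand in a few lines. This is only a sketch: split_halves is a hypothetical helper name, not part of hinges.py, and it leans on the standard library's statistics.median for the middle value.

```python
from statistics import median

def split_halves(sample):
    """Return (left, right) halves of a sorted sample.

    When n is odd the middle value is dropped from both halves.
    """
    n = len(sample)
    return sample[:n // 2], sample[(n + 1) // 2:]

for sample in ([1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5, 6]):
    left, right = split_halves(sample)
    # lower hinge, midhinge, upper hinge
    print(sample, "->", median(left), median(sample), median(right))
```

For the seven-point sample the halves are [1,2,3] and [5,6,7]; for the six-point sample they are [1,2,3] and [4,5,6], as described in the text.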

Here is hinges.py, a Python program for computing the hinges of a sample.

"""
file   hinges.py

author Ernesto P. Adorio
       ernesto.adorio@gmail.com
       UPDEPP(UP at Clarkfield)

version 0.0.1  August 15, 2010
"""

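def midhinge(X):
    """Middle value of the sorted sample X (mean of the two middle values if len(X) is even)."""
    n = len(X)
    if n == 1:
        return X[0]
    if n == 0:
        return None  # an empty sample has no midhinge
    if n % 2 == 0:  # even number of points
        a, b = n//2 - 1, n//2
        return (X[a] + X[b]) * 0.5
    else:
        m = (n - 1) // 2
        return X[m]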

def lowerhinge(X):
    n = len(X)
    if n % 2 == 0:
        return midhinge(X[:n//2])
    else:
        return midhinge(X[:(n-1)//2])

def upperhinge(X):
    n = len(X)
    if n % 2 == 0:
        return midhinge(X[n//2:])
    else:
        return midhinge(X[n//2+1:])

def fivenum(X):
    """
    Five-number summary for the sorted sample X. Added Aug. 16, 2010.
    """
    return (min(X), lowerhinge(X), midhinge(X), upperhinge(X), max(X))

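if __name__ == "__main__":
    # Python 3: print is a function and range() must be turned into a list.
    X = list(range(1, 11))
    for i in range(10):
        print("X=", X, ":", "n=", len(X), "hinges=",
              lowerhinge(X), midhinge(X), upperhinge(X))
        X = X[:-1]  # drop the last point and repeat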

When the above program runs, it outputs:

python hinges.py
X= [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] : n= 10 hinges= 3 5.5 8
X= [1, 2, 3, 4, 5, 6, 7, 8, 9] : n= 9 hinges= 2.5 5 7.5
X= [1, 2, 3, 4, 5, 6, 7, 8] : n= 8 hinges= 2.5 4.5 6.5
X= [1, 2, 3, 4, 5, 6, 7] : n= 7 hinges= 2 4 6
X= [1, 2, 3, 4, 5, 6] : n= 6 hinges= 2 3.5 5
X= [1, 2, 3, 4, 5] : n= 5 hinges= 1.5 3 4.5
X= [1, 2, 3, 4] : n= 4 hinges= 1.5 2.5 3.5
X= [1, 2, 3] : n= 3 hinges= 1 2 3
X= [1, 2] : n= 2 hinges= 1 1.5 2
X= [1] : n= 1 hinges= None 1 None


Don't forget to sort the input sample before computing its hinges! The functions above index into X by position and do not sort it themselves.
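One way to guard against unsorted input is a small wrapper that sorts a copy first. The sketch below is an assumption, not part of hinges.py: sorted_hinges is a hypothetical name, and it uses statistics.median in place of midhinge so that it stands alone.

```python
from statistics import median

def sorted_hinges(sample):
    """Sort a copy of the sample, then return (lower, mid, upper) hinges."""
    xs = sorted(sample)  # sorted() returns a new list; the caller's data is untouched
    n = len(xs)
    left, right = xs[:n // 2], xs[(n + 1) // 2:]
    # median() raises on empty input, so map empty pieces to None instead
    mid = median(xs) if xs else None
    lower = median(left) if left else None
    upper = median(right) if right else None
    return lower, mid, upper

print(sorted_hinges([7, 3, 1, 6, 2, 5, 4]))  # (2, 4, 6)
```

Sorting costs O(n log n), so for repeated calls on data that is already sorted it may be preferable to skip the wrapper.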

We are confident that the program is correct, but independent confirmation is still needed.
I would also appreciate any revisions that make the above code more compact.
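As one possible answer to both requests, here is a more compact sketch of the three functions, checked against the output listed above and, for the middle value only, against the standard library's statistics.median. It is a suggestion rather than a replacement for hinges.py; the expected table is copied by hand from the program output.

```python
from statistics import median

def midhinge(X):
    """Middle value of a sorted sample (mean of the two middle values when n is even)."""
    n = len(X)
    if n == 0:
        return None
    return X[n // 2] if n % 2 else (X[n // 2 - 1] + X[n // 2]) * 0.5

def lowerhinge(X):
    # n // 2 equals (n - 1) // 2 when n is odd, so one slice covers both cases
    return midhinge(X[:len(X) // 2])

def upperhinge(X):
    # (n + 1) // 2 skips the middle value only when n is odd
    return midhinge(X[(len(X) + 1) // 2:])

# (lower, mid, upper) for n = 10 down to n = 1, copied from the listed output
expected = [(3, 5.5, 8), (2.5, 5, 7.5), (2.5, 4.5, 6.5), (2, 4, 6), (2, 3.5, 5),
            (1.5, 3, 4.5), (1.5, 2.5, 3.5), (1, 2, 3), (1, 1.5, 2), (None, 1, None)]

X = list(range(1, 11))
for want in expected:
    got = (lowerhinge(X), midhinge(X), upperhinge(X))
    assert got == want, (X, got, want)
    assert midhinge(X) == median(X)  # independent check of the middle value
    X = X[:-1]
print("all 10 cases match")
```

This confirms only that the compact version agrees with the original output. One caveat for anyone comparing against other software: for odd n, Tukey's hinges as computed by R's fivenum keep the median in both halves, so for 1..9 R reports 3 and 7 where the convention used here gives 2.5 and 7.5.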

Update: fivenum has now been incorporated into hinges.py, but the version number stays at 0.0.1.
