## Sunday, May 16, 2010

### Python, Statistics: The nonparametric Mann-Whitney Test

Draft! Untested. Do not use yet.

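    from math import sqrt

    def mannwhitney(S1, S2):
        """
        Returns the Mann-Whitney U statistic of two samples S1 and S2,
        together with the mean and standard deviation of U under the
        null hypothesis.
        """
        # Form a single array with a categorical variable indicating the sample.
        X = [(s, 0) for s in S1]
        X.extend([(s, 1) for s in S2])
        # Rank is not defined in this post; it is assumed to return the
        # 1-based ranks of the entries of X, in the same order as X.
        R = Rank(X)

        # Compute needed parameters.
        n1 = len(S1)
        n2 = len(S2)

        # Compute total ranks for sample 1.
        R1 = sum([R[i] for i, (x, j) in enumerate(X) if j == 0])
        u1 = R1 - n1 * (n1 + 1) / 2.0
        u2 = n1 * n2 - u1
        U = min(u1, u2)

        # Mean and standard deviation of U (no correction for ties).
        mU = n1 * n2 / 2.0
        sigmaU = sqrt((n1 * n2) * (n1 + n2 + 1) / 12.0)
        return U, mU, sigmaU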
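
The listing above relies on a `Rank` helper that is not shown in this post. As a rough, self-contained sketch of the same computation, the version below supplies its own ranking function. The names `average_ranks` and `mann_whitney_u` are made up for this illustration, and giving tied values their average rank is an assumption about what `Rank` is meant to do, not something the post confirms.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value ties with the current one.
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(s1, s2):
    """Return (U, mean of U, standard deviation of U) for samples s1 and s2."""
    n1, n2 = len(s1), len(s2)
    ranks = average_ranks(list(s1) + list(s2))
    r1 = sum(ranks[:n1])                  # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    mean_u = n1 * n2 / 2.0
    sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    return min(u1, u2), mean_u, sigma_u


# Two completely separated samples: every value in the first is smaller.
u, mean_u, sigma_u = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u, mean_u, sigma_u)
```

For the two fully separated samples in the usage line, the rank sum of the first sample is 1 + 2 + 3 = 6, so U comes out as 0 with a null mean of 4.5. The standard deviation here ignores ties, just like the listing above.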


I still need to find resources for computing the exact discrete distribution function of the Mann-Whitney U statistic. Any help would be appreciated. Failing this, the scipy.stats module has a mannwhitneyu function, which returns the U statistic and the p-value of the test.
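
One possible route to the discrete distribution, offered only as a sketch: when there are no ties, the null distribution of the U statistic of one sample can be counted with the classical recurrence that conditions on which sample holds the largest observation. The function names `count_u` and `exact_cdf_u` are invented for this example, `math.comb` assumes Python 3.8 or later, and the recursion is only practical for small samples.

```python
from functools import lru_cache
from math import comb, floor


@lru_cache(maxsize=None)
def count_u(u, n1, n2):
    """Number of orderings of n1 + n2 untied values for which the U
    statistic of sample 1 (pairs with x1 > x2) equals u."""
    if u < 0:
        return 0
    if n1 == 0 or n2 == 0:
        return 1 if u == 0 else 0
    # If the largest value is in sample 1 it beats all n2 values of
    # sample 2; if it is in sample 2 it adds nothing to the count.
    return count_u(u - n2, n1 - 1, n2) + count_u(u, n1, n2 - 1)


def exact_cdf_u(u, n1, n2):
    """P(U1 <= u) under the null hypothesis, assuming no ties."""
    total = comb(n1 + n2, n1)  # all equally likely assignments of ranks
    return sum(count_u(k, n1, n2) for k in range(floor(u) + 1)) / total


# Smallest possible U for two samples of size 3.
p_lower = exact_cdf_u(0, 3, 3)
p_two_sided = min(1.0, 2.0 * p_lower)
print(p_lower, p_two_sided)
```

Because the distribution is symmetric about n1 * n2 / 2, a two-sided p-value for U = min(u1, u2) can be taken as twice the lower tail, capped at 1. For two samples of size 3 with U = 0, only 1 of the 20 possible orderings is that extreme in the lower tail, giving a two-sided p-value of 0.1. With ties in the data this count no longer applies.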