It requires that the data be measured on at least an ordinal scale (ordinal, interval, or ratio); in other words, the data must be sortable so that they can be ranked. Since the values (after transformation) are ranks, we looked for our old routine for ranking and decided to revamp it. See Version 0.0.1.
def rankarray(X, rankstartvalue=1, averageTies=True):
    """
    version 0.0.2 may 16, 2010

    Return the ranks of the values in X, in the original order of X.
    Ranks start at rankstartvalue. If averageTies is True, tied values
    share the average of the ranks they occupy.
    """
    # R[j] is the rank of the j-th smallest value.
    R = [rankstartvalue + i for i in range(len(X))]
    # Pair each value with its original index, then sort by value.
    xi = [(x, i) for i, x in enumerate(X)]
    xi.sort()
    if averageTies:
        start = 0
        end = 0
        for i in range(1, len(X)):
            if xi[i][0] == xi[start][0]:
                end = i
            else:
                # Close the run of ties start..end and average its ranks.
                if end > start:
                    avgRank = (R[start] + R[end]) / 2.0
                    for j in range(start, end + 1):
                        R[j] = avgRank
                start = i
                end = i
        # Adjust for any trailing similar ranks.
        if start != end:
            avgRank = (R[start] + R[end]) / 2.0
            for j in range(start, end + 1):
                R[j] = avgRank
    # Put the ranks back in the original order of X.
    RR = [0] * len(X)
    for j, (x, i) in enumerate(xi):
        RR[i] = R[j]
    return RR


if __name__ == "__main__":
    X = [7, 7, 4, 4, 4, 4, 4, 8, 6, 5, 1, 1]
    print("X= %s" % X)

    print("Rank start value=%d, averageTies=%s" % (0, True))
    R = rankarray(X, rankstartvalue=0, averageTies=True)
    print(R)

    print("Rank start value=%d, averageTies=%s" % (0, False))
    R = rankarray(X, rankstartvalue=0, averageTies=False)
    print(R)

    print("Rank start value=%d, averageTies=%s" % (1, True))
    R = rankarray(X, rankstartvalue=1, averageTies=True)
    print(R)

    print("Rank start value=%d, averageTies=%s" % (1, False))
    R = rankarray(X, rankstartvalue=1, averageTies=False)
    print(R)

When the above script is run, it outputs:

Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3] on toto-laptop, Standard
>>>
X= [7, 7, 4, 4, 4, 4, 4, 8, 6, 5, 1, 1]
Rank start value=0, averageTies=True
[9.5, 9.5, 4.0, 4.0, 4.0, 4.0, 4.0, 11, 8, 7, 0.5, 0.5]
Rank start value=0, averageTies=False
[9, 10, 2, 3, 4, 5, 6, 11, 8, 7, 0, 1]
Rank start value=1, averageTies=True
[10.5, 10.5, 5.0, 5.0, 5.0, 5.0, 5.0, 12, 9, 8, 1.5, 1.5]
Rank start value=1, averageTies=False
[10, 11, 3, 4, 5, 6, 7, 12, 9, 8, 1, 2]
However, we cannot guarantee that the routine is entirely free of errors.
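One way to gain some confidence is to compare the routine against an independent calculation. The sketch below is not part of the original routine: rank_by_counting is a hypothetical helper name, and it assumes the usual definition of a tie-averaged rank, namely the start value plus the number of strictly smaller values plus half of the number of other equal values. It is quadratic in the length of X, so it is meant only as a cross-check on small inputs.

```python
def rank_by_counting(X, rankstartvalue=1):
    """Tie-averaged ranks computed directly from the definition (O(n^2)).

    Hypothetical helper for cross-checking, not part of the original routine.
    """
    ranks = []
    for x in X:
        less = sum(1 for y in X if y < x)    # values strictly below x
        equal = sum(1 for y in X if y == x)  # size of x's tie group, x included
        # The tie group occupies ranks start+less .. start+less+equal-1,
        # whose average is start + less + (equal - 1) / 2.
        ranks.append(rankstartvalue + less + (equal - 1) / 2.0)
    return ranks


X = [7, 7, 4, 4, 4, 4, 4, 8, 6, 5, 1, 1]
expected = rank_by_counting(X, rankstartvalue=1)
print(expected)

# Averaging ties must not change the total: the ranks still sum to
# 1 + 2 + ... + n when the start value is 1.
n = len(X)
assert sum(expected) == sum(range(1, n + 1))
```

Comparing this list with the result of rankarray(X, rankstartvalue=1, averageTies=True) for the same X is a quick way to spot discrepancies. The helper always returns floats, but the comparison is numeric, so 12 and 12.0 count as equal.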