Friday, April 16, 2010

The recursive Durbin formula for the partial autocorrelation function.

This post is only for historical purposes and is superceded by a new direct CORRECTED implementation in Corrected post for partial autocorrelation function.This post may be deleted one week from now. Please visit the revised and corrected version.



Durbin developed a recursive formula for the pacfś given the acf($r_k$'s): \[ \phi_k= \phi_{k,k} = \begin{cases} r_1 & \text{for $k=1$}\\ \frac{r_k - \sum_{j=1}^{k-1} \phi_{k-1} r_{k-j} }{1 -\sum_{j=1}^{k-1}\phi_{k-1,j} r_j} & \text{for $k>1$}\\ \end{cases} \] \[ \[\phi_{k,j} = \begin{cases} \phi_{k-1, j} -\phi_{k}\phi_{k-1,k-j} & \text{for $j = 1,2, \ldots, k-1$}\\ \phi_{k,k} & \text{for $j =k$}\\ \end{cases} \]



Here is our initial Python implementation of the recursive function.

def durbin(r, lag = 1):
    #print"debug: inside durbin, lag=", lag
    
    def phi(k, j):
        #print "calling phi with ", k, j
        if k == j:
           if k == 1:
              return r[1]
           else:
              N= r[k] - sum([phi(k-1, k-1) * r[k-j] for j in range(1,k)])
              D= 1 - sum([phi(k-1, j) * r[j] for j in range(1, k)])
              return N/D
        else:
           return sum([phi(k-1,j) - phi(k,k) * phi(k-1,k-j) for j in range(1, k)])
    return phi(lag,lag) 

def  acf(X, k=1):
    """
    Computes the sample autocorrelation function coeffficient.
    for given lag k and input data X.
    """
    if k == 0:
        return 1.0
    flen = float(len(X))
    xbar = float(sum([x for x in X])) / flen
    D = sum([(x-xbar)**2 for x in X])
    N = sum([ (x-xbar)* (xtpk -xbar) for (x, xtpk) in zip(X[:-k],X[k:])])
    return N/D

def sacf(X,  maxlag):
    return [acf(X, i) for i in range(maxlag+1)] #includes 0 lag

#data from Danao's text. page 323.
X = ['772.9', '909.4', '1080.3', '1276.2', '1380.6', '1354.8', 
   '1096.9', '1066.7', '1108.7', '1109', '1203.7', '1328.2', '1380', 
   '1435.3', '1416.2', '1494.9', '1525.6', '1551.1', '1539.2', 
   '1629.1', '1665.3', '1708.7', '1799.4', '1873.3', '1973.3', 
   '2087.6', '2208.3', '2271.4', '2365.6', '2423.3', '2416.2', 
   '2484.8', '2608.5', '2744.1', '2729.3', '2695', '2826.7', 
   '2958.6', '3115.2', '3192.4', '3187.1', '3248.8', '3166', 
   '3279.1', '3489.9', '3585.2', '3676.5']

def Test1(maxlag):
    """
    Prints out values of acf and pacf for the Danao data set.
    """

    Y = [float(x) for x in X]
    r  = sacf(Y,  maxlag)
    for lag in range(1, 5):
      print r[lag], durbin(r, lag)
 
if __name__== "__main__":
   Test1(17)
 

The recursion is actually performed by an internal function phi() which accepts two arguments k, j. However, the output of the above even for low lag value of 5 is not so good( we may have miscoded the algorithm!). Anyway we present it in the hope that a gentle reader will show us the way to properly implement Durbin's algorithm! We will recheck and recheck the formulas.

A corrected version has been successfully written. Here is the wrong results returned by our first implementation.

$ python durbin.py
python durbin.py
0 0.925682317386 0.925682317386
1 0.852706579655 -0.0292160394675
2 0.787096604484 5.86791773408
3 0.737850083142 -3.27133094039

Notice the very bad value of 5.867+ for i = 2, lag = 3. No partial autocorrelations are ever greater than 1!

duh...

No comments:

Post a Comment