## Friday, April 16, 2010

### The recursive Durbin formula for the partial autocorrelation function.

This post is only for historical purposes and is superceded by a new direct CORRECTED implementation in Corrected post for partial autocorrelation function.This post may be deleted one week from now. Please visit the revised and corrected version.

Durbin developed a recursive formula for the pacfś given the acf($r_k$'s): $\phi_k= \phi_{k,k} = \begin{cases} r_1 & \text{for k=1}\\ \frac{r_k - \sum_{j=1}^{k-1} \phi_{k-1} r_{k-j} }{1 -\sum_{j=1}^{k-1}\phi_{k-1,j} r_j} & \text{for k>1}\\ \end{cases}$ $\[\phi_{k,j} = \begin{cases} \phi_{k-1, j} -\phi_{k}\phi_{k-1,k-j} & \text{for j = 1,2, \ldots, k-1}\\ \phi_{k,k} & \text{for j =k}\\ \end{cases}$

Here is our initial Python implementation of the recursive function.

def durbin(r, lag = 1):
#print"debug: inside durbin, lag=", lag

def phi(k, j):
#print "calling phi with ", k, j
if k == j:
if k == 1:
return r
else:
N= r[k] - sum([phi(k-1, k-1) * r[k-j] for j in range(1,k)])
D= 1 - sum([phi(k-1, j) * r[j] for j in range(1, k)])
return N/D
else:
return sum([phi(k-1,j) - phi(k,k) * phi(k-1,k-j) for j in range(1, k)])
return phi(lag,lag)

def  acf(X, k=1):
"""
Computes the sample autocorrelation function coeffficient.
for given lag k and input data X.
"""
if k == 0:
return 1.0
flen = float(len(X))
xbar = float(sum([x for x in X])) / flen
D = sum([(x-xbar)**2 for x in X])
N = sum([ (x-xbar)* (xtpk -xbar) for (x, xtpk) in zip(X[:-k],X[k:])])
return N/D

def sacf(X,  maxlag):
return [acf(X, i) for i in range(maxlag+1)] #includes 0 lag

#data from Danao's text. page 323.
X = ['772.9', '909.4', '1080.3', '1276.2', '1380.6', '1354.8',
'1096.9', '1066.7', '1108.7', '1109', '1203.7', '1328.2', '1380',
'1435.3', '1416.2', '1494.9', '1525.6', '1551.1', '1539.2',
'1629.1', '1665.3', '1708.7', '1799.4', '1873.3', '1973.3',
'2087.6', '2208.3', '2271.4', '2365.6', '2423.3', '2416.2',
'2484.8', '2608.5', '2744.1', '2729.3', '2695', '2826.7',
'2958.6', '3115.2', '3192.4', '3187.1', '3248.8', '3166',
'3279.1', '3489.9', '3585.2', '3676.5']

def Test1(maxlag):
"""
Prints out values of acf and pacf for the Danao data set.
"""

Y = [float(x) for x in X]
r  = sacf(Y,  maxlag)
for lag in range(1, 5):
print r[lag], durbin(r, lag)

if __name__== "__main__":
Test1(17)



The recursion is actually performed by an internal function phi() which accepts two arguments k, j. However, the output of the above even for low lag value of 5 is not so good( we may have miscoded the algorithm!). Anyway we present it in the hope that a gentle reader will show us the way to properly implement Durbin's algorithm! We will recheck and recheck the formulas.

A corrected version has been successfully written. Here is the wrong results returned by our first implementation.

\$ python durbin.py
python durbin.py
0 0.925682317386 0.925682317386
1 0.852706579655 -0.0292160394675
2 0.787096604484 5.86791773408
3 0.737850083142 -3.27133094039


Notice the very bad value of 5.867+ for i = 2, lag = 3. No partial autocorrelations are ever greater than 1!

duh...