Levinson and Durbin developed a recursive formula for the pacfś given the acf($r_k$'s): \[ \phi_k= \phi_{k,k} = \begin{cases} r_1 & \text{for $k=1$}\\ \frac{r_k - \sum_{j=1}^{k-1} \phi_{k-1} r_{k-j} }{1 -\sum_{j=1}^{k-1}\phi_{k-1,j} r_j} & \text{for $k>1$}\\ \end{cases} \] \[ \[\phi_{k,j} = \begin{cases} \phi_{k-1, j} -\phi_{k}\phi_{k-1,k-j} & \text{for $j = 1,2, \ldots, k-1$}\\ \phi_{k,k} & \text{for $j =k$}\\ \end{cases} \]
The denominator may be considered as the variance $$V(k-1)$$ which satisfies $$V(k) = V(k-1) (1 - phi(k,k)^2$$. It is just a matter of carefully carrying out the computations to compute the current lag values using the previous lag values.
We have done a previous implementation using matrix algebra Computing sample partial autocorrelations and a wrong recursive implementation of the Levinson-Durbin recursive formula in Recursive formulation
It returns an array of the same length as the autocorrelation array.
# -*- coding: utf-8 -*- """ file : levdurbin.py author: dr. ernesto p. adorio version: 0.0.1 May 10, 2010 """ def levinson_durbin(r, maxlag): """ Computes the array of pcf's corresponding to the autocorrelations using the Levinson-Durgin Iterative Algorithm up to maxlag. Arguments r - autocorrelations at lags 1 to maxlag. Set r[0] to 1 ! Reference Hans Gilgen,"Univariate time series in geosciences: theory and examples", pp. 255. """ A = [[0.0]*(maxlag+1) for row in range(maxlag+1)] V = [0.0] * (maxlag+1) A[0][0] = r[0] A[1][1] = r[1]/r[0] V[1] = r[0] -A[1][1] * r[1] for k in range(2, maxlag+1): A[k][k]= (r[k] - sum([A[j][k-1] * r[k-j] for j in range(1, k)]))/ V[k-1] for j in range(1,k): A[j][k] = A[j][k-1] - A[k][k] * A[k-j][k-1] V[k] = V[k-1] * (1 - A[k][k] **2) return [A[k][k] for k in range(maxlag+1)] def acf(X, k=1): """ Computes the sample autocorrelation function coeffficient. for given lag k and input data X. """ if k == 0: return 1.0 flen = float(len(X)) xbar = float(sum([x for x in X])) / flen D = sum([(x-xbar)**2 for x in X]) N = sum([ (x-xbar)* (xtpk -xbar) for (x, xtpk) in zip(X[:-k],X[k:])]) return N/D def sacf(X, maxlag): return [acf(X, i) for i in range(maxlag+1)] #includes 0 lag #data from Danao's text. page 323. X = ['772.9', '909.4', '1080.3', '1276.2', '1380.6', '1354.8', '1096.9', '1066.7', '1108.7', '1109', '1203.7', '1328.2', '1380', '1435.3', '1416.2', '1494.9', '1525.6', '1551.1', '1539.2', '1629.1', '1665.3', '1708.7', '1799.4', '1873.3', '1973.3', '2087.6', '2208.3', '2271.4', '2365.6', '2423.3', '2416.2', '2484.8', '2608.5', '2744.1', '2729.3', '2695', '2826.7', '2958.6', '3115.2', '3192.4', '3187.1', '3248.8', '3166', '3279.1', '3489.9', '3585.2', '3676.5'] def Test1(maxlag): """ Prints out values of acf and pacf for the Danao data set. """ Y = [float(x) for x in X] r = sacf(Y, maxlag) for i, pcor in enumerate(levinson_durbin(r, maxlag)): print i, r[i], pcor if __name__== "__main__": Test1(17)
When the above program runs, it outputs the following html'ed (done separately) table:
Lag | ACF | PCF |
---|---|---|
0 | 1.0 | 1.0 |
1 | 0.925682317386 | 0.925682317386 |
2 | 0.852706579655 | -0.0292160394675 |
3 | 0.787096604484 | 0.0122016750938 |
4 | 0.737850083142 | 0.0784204182616 |
5 | 0.697253316633 | 0.0358005082792 |
6 | 0.64842031925 | -0.071567073034 |
7 | 0.587527096625 | -0.0988314678858 |
8 | 0.519141887224 | -0.0843023226042 |
9 | 0.450228026064 | -0.0629615959671 |
10 | 0.384896320219 | -0.0480711212172 |
11 | 0.32584304195 | -0.0201806609446 |
12 | 0.273845336962 | 0.00621787084071 |
13 | 0.216766465976 | -0.0631790415256 |
14 | 0.156888401912 | -0.0477940531702 |
15 | 0.0992408085419 | -0.0204290294085 |
16 | 0.0477462812535 | -0.0101131483561 |
17 | -0.00206714577028 | -0.0495417448475 |
so far so good. What turned me off was the emphasis on the word "recursion"! Next we will try to implement in the future the Burg algorithm, but we will take our sweet time to do it.
For small sized applications, the space wasted by the triangular matrix for the A coefficients is not alarming. One can use a simple dictionary instead.
No comments:
Post a Comment