Levinson and Durbin developed a recursive formula for the pacfś given the acf($r_k$'s): \[ \phi_k= \phi_{k,k} = \begin{cases} r_1 & \text{for $k=1$}\\ \frac{r_k - \sum_{j=1}^{k-1} \phi_{k-1} r_{k-j} }{1 -\sum_{j=1}^{k-1}\phi_{k-1,j} r_j} & \text{for $k>1$}\\ \end{cases} \] \[ \[\phi_{k,j} = \begin{cases} \phi_{k-1, j} -\phi_{k}\phi_{k-1,k-j} & \text{for $j = 1,2, \ldots, k-1$}\\ \phi_{k,k} & \text{for $j =k$}\\ \end{cases} \]
The denominator may be considered as the variance $$V(k-1)$$ which satisfies $$V(k) = V(k-1) (1 - phi(k,k)^2$$. It is just a matter of carefully carrying out the computations to compute the current lag values using the previous lag values.
We have done a previous implementation using matrix algebra Computing sample partial autocorrelations and a wrong recursive implementation of the Levinson-Durbin recursive formula in Recursive formulation
It returns an array of the same length as the autocorrelation array.
# -*- coding: utf-8 -*-
"""
file     : levdurbin.py
author: dr. ernesto p. adorio
version: 0.0.1   May 10, 2010
"""
def levinson_durbin(r, maxlag):
    """
    Computes the array of pcf's corresponding to the autocorrelations
    using the Levinson-Durgin Iterative Algorithm up to
    maxlag.
    Arguments 
      r - autocorrelations at lags 1 to maxlag. Set r[0] to 1 !
    Reference
     Hans Gilgen,"Univariate time series in geosciences: theory and examples",
       pp. 255.
    """
    A = [[0.0]*(maxlag+1) for row in range(maxlag+1)]
    V = [0.0] * (maxlag+1)
    A[0][0] = r[0]
    A[1][1] = r[1]/r[0]
    V[1] = r[0] -A[1][1] * r[1]
    for k in range(2, maxlag+1):
        A[k][k]= (r[k] - sum([A[j][k-1] * r[k-j] for j in range(1, k)]))/ V[k-1]
        for j in range(1,k):
            A[j][k] = A[j][k-1] - A[k][k] * A[k-j][k-1]
        V[k] = V[k-1] * (1 - A[k][k] **2)
    return [A[k][k] for k in range(maxlag+1)]
def  acf(X, k=1):
    """
    Computes the sample autocorrelation function coeffficient.
    for given lag k and input data X.
    """
    if k == 0:
        return 1.0
    flen = float(len(X))
    xbar = float(sum([x for x in X])) / flen
    D = sum([(x-xbar)**2 for x in X])
    N = sum([ (x-xbar)* (xtpk -xbar) for (x, xtpk) in zip(X[:-k],X[k:])])
    return N/D
def sacf(X,  maxlag):
    return [acf(X, i) for i in range(maxlag+1)] #includes 0 lag
#data from Danao's text. page 323.
X = ['772.9', '909.4', '1080.3', '1276.2', '1380.6', '1354.8', 
   '1096.9', '1066.7', '1108.7', '1109', '1203.7', '1328.2', '1380', 
   '1435.3', '1416.2', '1494.9', '1525.6', '1551.1', '1539.2', 
   '1629.1', '1665.3', '1708.7', '1799.4', '1873.3', '1973.3', 
   '2087.6', '2208.3', '2271.4', '2365.6', '2423.3', '2416.2', 
   '2484.8', '2608.5', '2744.1', '2729.3', '2695', '2826.7', 
   '2958.6', '3115.2', '3192.4', '3187.1', '3248.8', '3166', 
   '3279.1', '3489.9', '3585.2', '3676.5']
def Test1(maxlag):
    """
    Prints out values of acf and pacf for the Danao data set.
    """
    Y = [float(x) for x in X]
    r  = sacf(Y,  maxlag)
    for i, pcor in enumerate(levinson_durbin(r, maxlag)):
       print i, r[i], pcor
 
if __name__== "__main__":
   Test1(17)
When the above program runs, it outputs the following html'ed (done separately) table:
| Lag | ACF | PCF | 
|---|---|---|
| 0 | 1.0 | 1.0 | 
| 1 | 0.925682317386 | 0.925682317386 | 
| 2 | 0.852706579655 | -0.0292160394675 | 
| 3 | 0.787096604484 | 0.0122016750938 | 
| 4 | 0.737850083142 | 0.0784204182616 | 
| 5 | 0.697253316633 | 0.0358005082792 | 
| 6 | 0.64842031925 | -0.071567073034 | 
| 7 | 0.587527096625 | -0.0988314678858 | 
| 8 | 0.519141887224 | -0.0843023226042 | 
| 9 | 0.450228026064 | -0.0629615959671 | 
| 10 | 0.384896320219 | -0.0480711212172 | 
| 11 | 0.32584304195 | -0.0201806609446 | 
| 12 | 0.273845336962 | 0.00621787084071 | 
| 13 | 0.216766465976 | -0.0631790415256 | 
| 14 | 0.156888401912 | -0.0477940531702 | 
| 15 | 0.0992408085419 | -0.0204290294085 | 
| 16 | 0.0477462812535 | -0.0101131483561 | 
| 17 | -0.00206714577028 | -0.0495417448475 | 
so far so good. What turned me off was the emphasis on the word "recursion"! Next we will try to implement in the future the Burg algorithm, but we will take our sweet time to do it.
For small sized applications, the space wasted by the triangular matrix for the A coefficients is not alarming. One can use a simple dictionary instead.
No comments:
Post a Comment