Monday, May 10, 2010

Python, econometrics: Levinson Durbin recursive formula for partial autocorrelations.

Here is a direct Python implementation of the Levinson Durbin recursive formula for partial autocorrelations corresponding to the autocorrelation values for a time series.


Levinson and Durbin developed a recursive formula for the pacfś given the acf($r_k$'s): \[ \phi_k= \phi_{k,k} = \begin{cases} r_1 & \text{for $k=1$}\\ \frac{r_k - \sum_{j=1}^{k-1} \phi_{k-1} r_{k-j} }{1 -\sum_{j=1}^{k-1}\phi_{k-1,j} r_j} & \text{for $k>1$}\\ \end{cases} \] \[ \[\phi_{k,j} = \begin{cases} \phi_{k-1, j} -\phi_{k}\phi_{k-1,k-j} & \text{for $j = 1,2, \ldots, k-1$}\\ \phi_{k,k} & \text{for $j =k$}\\ \end{cases} \]

The denominator may be considered as the variance $$V(k-1)$$ which satisfies $$V(k) = V(k-1) (1 - phi(k,k)^2$$. It is just a matter of carefully carrying out the computations to compute the current lag values using the previous lag values.

We have done a previous implementation using matrix algebra Computing sample partial autocorrelations and a wrong recursive implementation of the Levinson-Durbin recursive formula in Recursive formulation

It returns an array of the same length as the autocorrelation array.

# -*- coding: utf-8 -*-
"""
file     : levdurbin.py
author: dr. ernesto p. adorio
version: 0.0.1   May 10, 2010
"""

def levinson_durbin(r, maxlag):
    """
    Computes the array of pcf's corresponding to the autocorrelations
    using the Levinson-Durgin Iterative Algorithm up to
    maxlag.
    Arguments 
      r - autocorrelations at lags 1 to maxlag. Set r[0] to 1 !
    Reference
     Hans Gilgen,"Univariate time series in geosciences: theory and examples",
       pp. 255.
    """
    A = [[0.0]*(maxlag+1) for row in range(maxlag+1)]
    V = [0.0] * (maxlag+1)
    A[0][0] = r[0]
    A[1][1] = r[1]/r[0]
    V[1] = r[0] -A[1][1] * r[1]
    for k in range(2, maxlag+1):
        A[k][k]= (r[k] - sum([A[j][k-1] * r[k-j] for j in range(1, k)]))/ V[k-1]
        for j in range(1,k):
            A[j][k] = A[j][k-1] - A[k][k] * A[k-j][k-1]
        V[k] = V[k-1] * (1 - A[k][k] **2)
    return [A[k][k] for k in range(maxlag+1)]

def  acf(X, k=1):
    """
    Computes the sample autocorrelation function coeffficient.
    for given lag k and input data X.
    """
    if k == 0:
        return 1.0
    flen = float(len(X))
    xbar = float(sum([x for x in X])) / flen
    D = sum([(x-xbar)**2 for x in X])
    N = sum([ (x-xbar)* (xtpk -xbar) for (x, xtpk) in zip(X[:-k],X[k:])])
    return N/D

def sacf(X,  maxlag):
    return [acf(X, i) for i in range(maxlag+1)] #includes 0 lag

#data from Danao's text. page 323.
X = ['772.9', '909.4', '1080.3', '1276.2', '1380.6', '1354.8', 
   '1096.9', '1066.7', '1108.7', '1109', '1203.7', '1328.2', '1380', 
   '1435.3', '1416.2', '1494.9', '1525.6', '1551.1', '1539.2', 
   '1629.1', '1665.3', '1708.7', '1799.4', '1873.3', '1973.3', 
   '2087.6', '2208.3', '2271.4', '2365.6', '2423.3', '2416.2', 
   '2484.8', '2608.5', '2744.1', '2729.3', '2695', '2826.7', 
   '2958.6', '3115.2', '3192.4', '3187.1', '3248.8', '3166', 
   '3279.1', '3489.9', '3585.2', '3676.5']

def Test1(maxlag):
    """
    Prints out values of acf and pacf for the Danao data set.
    """

    Y = [float(x) for x in X]
    r  = sacf(Y,  maxlag)

    for i, pcor in enumerate(levinson_durbin(r, maxlag)):
       print i, r[i], pcor
 
if __name__== "__main__":
   Test1(17)

When the above program runs, it outputs the following html'ed (done separately) table:



Lag ACF PCF
01.01.0
10.9256823173860.925682317386
20.852706579655-0.0292160394675
30.7870966044840.0122016750938
40.7378500831420.0784204182616
50.6972533166330.0358005082792
60.64842031925-0.071567073034
70.587527096625-0.0988314678858
80.519141887224-0.0843023226042
90.450228026064-0.0629615959671
100.384896320219-0.0480711212172
110.32584304195-0.0201806609446
120.2738453369620.00621787084071
130.216766465976-0.0631790415256
140.156888401912-0.0477940531702
150.0992408085419-0.0204290294085
160.0477462812535-0.0101131483561
17-0.00206714577028-0.0495417448475

so far so good. What turned me off was the emphasis on the word "recursion"! Next we will try to implement in the future the Burg algorithm, but we will take our sweet time to do it.

For small sized applications, the space wasted by the triangular matrix for the A coefficients is not alarming. One can use a simple dictionary instead.

No comments:

Post a Comment