Version 0.0.2, Jun 6: added Leviatan's instrumental variable method.

This blog post will undergo massive revisions. Use the code at your own risk.

Consider the model with an infinite number of lags:

$$Y_t = \alpha + \sum_{i=0}^{\infty} \beta_i X_{t-i}.$$

This model has an infinite number of unknown coefficients. A simplification was obtained by Koyck, who assumed that the coefficients decline geometrically: $$\beta_i = \beta_0 \lambda^i, \quad 0 < \lambda < 1.$$

An equivalent formulation for this case, the Koyck transformation, is given by

$$Y_t = \alpha (1 - \lambda) + \beta_0 X_t + \lambda Y_{t-1} + v_t, \qquad v_t = u_t - \lambda u_{t-1},$$

and we need to determine only three values: $$\alpha, \lambda,$$ and $$\beta_0.$$
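As a worked illustration, the transformed model can be simulated directly. This is a minimal sketch; the helper name `simulate_koyck` and the parameter values are our own illustrative choices, not part of the original post.

```python
import random

# Simulate the transformed Koyck model
#   Y_t = alpha*(1 - lam) + beta0*X_t + lam*Y_{t-1} + e_t
# The parameter values below are illustrative assumptions.
def simulate_koyck(n, alpha=2.0, beta0=0.5, lam=0.8, noise=0.1, seed=1):
    rng = random.Random(seed)
    X = [float(t) for t in range(n)]
    Y = [alpha]                      # start the recursion at the intercept
    for t in range(1, n):
        Y.append(alpha * (1.0 - lam) + beta0 * X[t] + lam * Y[t - 1]
                 + rng.gauss(0.0, noise))
    return Y, X

Y, X = simulate_koyck(50)
```

Setting `noise=0.0` produces an exact realization of the recursion, which is convenient for checking estimation code later.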

The model matrix with a constant column is then $$Z = [\,\mathbf{1} \mid X_t \mid Y_{t-1}\,]$$, with response $$Y_t$$.

Let $$c_0, c_1, c_2$$ be the computed parameters for this model.

Then the original parameters are obtained by the following: \[ \begin{array}{l} \lambda = c_2 \\ \beta_0 = c_1 \\ \alpha = c_0 / (1 - c_2). \end{array} \]
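This back-substitution is a one-liner in code; the helper below (the name `koyck_params` is our own) performs the mapping.

```python
# Map the fitted coefficients (c0, c1, c2) back to the Koyck
# parameters (alpha, beta0, lambda); the function name is our own choice.
def koyck_params(c):
    c0, c1, c2 = c
    lam = c2
    beta0 = c1
    alpha = c0 / (1.0 - c2)
    return alpha, beta0, lam

print(koyck_params((1.0, 0.5, 0.5)))  # -> (2.0, 0.5, 0.5)
```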

The problem is that the right-hand side of this linear model contains the lagged dependent variable $$Y_{t-1}$$, which is endogenous and stochastic and therefore violates the assumptions of the least squares method.

Applying least squares anyway yields estimates that are biased and, because the lagged dependent variable is correlated with the serially correlated composite error, inconsistent. Most results obtained by blindly applying least squares also exhibit high multicollinearity and serially correlated errors.

We present a Python function to generate the Koyck linear model given the input endogenous stochastic $$Y$$ vector and the independent exogenous vector $$X$$.

```python
import random

from matlib import matzero, matInsertConstCol, matprint


def koyckZmatrix(Y, X, withconstcol=True):
    n = len(Y)
    if n != len(X):
        raise ValueError("in koyckZmatrix(): Y, X have different lengths!")
    Z = matzero(n - 1, 2)
    for i in range(n - 1):          # one less row due to the lag
        Z[i][0], Z[i][1] = X[i + 1], Y[i]
    if withconstcol:
        matInsertConstCol(Z, 0, 1)
    return Y[1:], Z


Y = [random.random() for i in range(10)]
X = list(range(10))
print("The input Y, X vectors")
for i in range(len(X)):
    print(i, Y[i], X[i])

Y, Z = koyckZmatrix(Y, X)
print("The input Koyck Y and Z matrix.")
matprint(Z)
```
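To actually obtain $$c_0, c_1, c_2$$ from such a matrix, one can fit it by least squares. The sketch below rebuilds the same design matrix with plain lists and fits it with `numpy.linalg.lstsq`, our choice of solver rather than the post's matlib routines; the data is a noise-free Koyck recursion with illustrative coefficients, so the fit recovers them almost exactly.

```python
import numpy as np

# Least squares on the Koyck design matrix [1 | X_t | Y_{t-1}].
def koyck_ols(Y, X):
    Z = np.array([[1.0, X[i + 1], Y[i]] for i in range(len(Y) - 1)])
    y = np.array(Y[1:])
    c, _, _, _ = np.linalg.lstsq(Z, y, rcond=None)
    return c

# Noise-free data from Y_t = 0.4 + 0.5*X_t + 0.8*Y_{t-1}
# (the coefficient values are illustrative assumptions).
X = [float(t) for t in range(20)]
Y = [1.0]
for t in range(1, 20):
    Y.append(0.4 + 0.5 * X[t] + 0.8 * Y[t - 1])

c = koyck_ols(Y, X)
print(c)  # close to [0.4, 0.5, 0.8]
```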

In spite of the results being biased and inconsistent, we can use the OLS procedure as a generator of starting values for a better scheme such as Maximum Likelihood estimation.

A second approach to solving the Koyck geometrically lagged model is the method of instrumental variables. We define an instrumental variable which is NOT correlated with the error term of the Koyck model.

Leviatan developed the following normal equations, obtained by choosing $$X_{t-1}$$ as a proxy variable. This system is shown on page 678 of Gujarati's text.

$$\begin{array}{ll}\sum Y_t &= n \hat{\alpha}_0 + \hat{\alpha}_1 \sum X_t + \hat{\alpha}_2 \sum Y_{t-1} \\ \sum Y_t X_t &= \hat{\alpha}_0 \sum X_t + \hat{\alpha}_1 \sum X_t^2 + \hat{\alpha}_2 \sum Y_{t-1} X_t \\ \sum Y_t X_{t-1} &= \hat{\alpha}_0 \sum X_{t-1} + \hat{\alpha}_1 \sum X_t X_{t-1} + \hat{\alpha}_2 \sum Y_{t-1} X_{t-1} \end{array}$$

The following Python code sets up the normal equations and solves for the coefficients $$\hat{\alpha}_0, \hat{\alpha}_1, \hat{\alpha}_2$$:

```python
from scipy import linalg


def leviatan(Y, X):
    """Solves the Koyck distributed lag system by the instrumental
    variable technique, using X_{t-1} as the instrument.

    Reference: pp. 678-679, Gujarati, 4e, "Basic Econometrics"
    """
    # Right-hand side vector.
    b = [0.0] * 3
    b[0] = sum(Y[1:])
    b[1] = sum(y * x for y, x in zip(Y[1:], X[1:]))
    b[2] = sum(y * x for y, x in zip(Y[1:], X[:-1]))

    # Normal equations matrix, row by row, matching the system above.
    A = [[0.0] * 3 for _ in range(3)]
    A[0][0] = len(Y) - 1
    A[0][1] = sum(X[1:])
    A[0][2] = sum(Y[:-1])
    A[1][0] = sum(X[1:])
    A[1][1] = sum(x * x for x in X[1:])
    A[1][2] = sum(y * x for y, x in zip(Y[:-1], X[1:]))
    A[2][0] = sum(X[:-1])
    A[2][1] = sum(x * xtm1 for x, xtm1 in zip(X[1:], X[:-1]))
    A[2][2] = sum(ytm1 * xtm1 for ytm1, xtm1 in zip(Y[:-1], X[:-1]))

    # Call the linear solver.
    return linalg.solve(A, b)
```

The code, unfortunately, has not been checked on an example. We will come back to this later, and we add that the estimators returned may be consistent but still suffer from high multicollinearity and low efficiency.

We used the linear solver in scipy. One can also use the code in lm.py, which we described in the post Canned linear solver.

We will present codes for the Maximum Likelihood method later...


TBC
