Does variable X causes Y or vice-versa? We say that X granger cause Y if past values of X helps in predicting future values of Y. In other words, in the the model $$ Y = \sum \alpha_i X_{t-i} + \sum \beta_i Y_{t-i} + u_t$$, computed estimates of $$\alpha_i$$ are significantly non zero.
In a similar manner, Y granger cause X if past values of Y helps in predicting future values of Y. Thus in the model $$ X = \sum \delta_i X_{t-i} + \sum \gamma_i Y_{t-i} + u_t$$, the estimated value of $$\gamma_i$$ are significantly different from zero.
Here is (untested!)Python code for performing a Granger test on whether a variable X causes Y (or X preceeds Y)
""" """ File grangertest.py author Dr. Ernesto P. Adorio UPDEPP, UP CLARK ernesto.adorio@gmail.com version 0.0.1 06.06.2010 version 0.0.2 06.09.2010 """ from matlib import * import lm from scipy import stats def grangertest(Y,X, maxlag): """ Performs a Granger causality test on variables (vectors) Y and X. The null hypothesis is: Does X causes Y? Returned value: pvalue, F, df1, df2 """ # Create linear model involving Y lags only. n = len(Y) if n != len(X): raise ValueError, "grangertest: incompatible Y,X vectors" M = [ [0] * maxlag for i in range(n-maxlag)] for i in range(maxlag, n): for j in range(1, maxlag+1): M[i-maxlag][j-1] = Y[i-j] fit = lm.ols(M, Y[maxlag:]) RSSr = fit.RSS # Create linear model including X lags. for i in range(maxlag, n): xlagged = [X[i-j] for j in range(1, maxlag+1)] M[i-maxlag].extend(xlagged) fit = lm.ols(M, Y[maxlag:]) RSSu = fit.RSS df1 = maxlag df2 = n - 2 * maxlag - 1 F = ((RSSr - RSSu)/df1) /(RSSu/df2) pvalue = 1.0 - stats.f.cdf(F, df1, df2) return pvalue, F, df1, df2, RSSr, RSSu def Test(): D = open("chick-egg.dat").read().split("\n")[1:] years, chicks , eggs = [], [], [] for line in D: splitted = line.split() if len(splitted) == 3: year, chick, egg = splitted years.append(float(year)) chicks.append(float(chick)) eggs.append(float(egg)) lag = 4 print "Ho: do chicks cause eggs?" pvalue, F, df1, df2, RSSr, RSSu =grangertest(eggs, chicks, maxlag = lag) print "pvalue, F, df1, df2, RSSr, RSSu=", pvalue , F, df1, df2, RSSr, RSSu print "Ho: do eggs cause chicks?" pvalue, F, df1, df2, RSSr, RSSu =grangertest(chicks, eggs, maxlag = lag) print "pvalue, F, df1, df2, RSSr, RSSu=", pvalue , F, df1, df2, RSSr, RSSu if __name__ == "__main__": Test()
The test data file may be downloaded from uiuc econometrics group: data eggs.
we are wondering why the data eggs has 77 rows while the example programs in that reference shows only 53 observations! We will go back to this later but for a lag of 4, Here are the results.
python grangertest.py
Ho: do chicks cause eggs?
pvalue, F, df1, df2, RSSr, RSSu= 0.919252836295 0.232357141261 4 68 1401289.91664 1382395.24562
Ho: do eggs cause chicks?
pvalue, F, df1, df2, RSSr, RSSu= 0.00172519529214 4.83191020278 4 68 30442292479.4 23704704139.1
We will be happy if a reader shows us where the discrepancy lies. The published example shows pvalue of
The Null hypothesis is that X does not granger cause Y. If the returned pvalue is less than specified alpha, then the Null hypothesis is rejected, otherwise we fail to reject the hypothesis.
We will provide an example later. Classes will be opening soon and we have to do something else at the moment.
Reference: 696-700 of Gujarati
Wikipedia: Wiki article on Granger Causality
Last WARNING: I still need an alternate published example to check the veracity of this code.
Did you ever find a second published example to test with?
ReplyDeleteAt the moment no. perhaps you can help find one, preferably open source.
ReplyDeleteHi, its interesting, i was looking for something like this. Might interest you: http://tresenraya.postach.io/analisis-de-series-temporales-ejemplos-de-test-de-granger-para-python
ReplyDelete