So far this has included finishing the augmented Dickey-Fuller (ADF) test for unit roots. The big time sink here is that the ADF test statistic has a non-standard distribution in most cases. The ADF test statistic is obtained by running the following regression.
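The regression itself did not survive in this copy of the post; reconstructed here in its standard textbook form (constant plus linear trend, matching the "ct" option used below):

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t
```

The ADF test statistic is the t-statistic on $\hat{\gamma}$, the coefficient on the lagged level $y_{t-1}$.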

One approach to testing for a unit root is to test the t-statistic on the coefficient on the lagged level of *y*. The actual distribution of this statistic, however, is not Student's t. Many software packages use the tables in Fuller (1976; updated in the 1996 edition) to get the critical values for the test statistic, depending on the sample size, and use linear interpolation for sample sizes not included in the table. The p-values for the obtained test statistic are usually computed from MacKinnon's (1994) study, which estimated regression surfaces for these distributions via Monte Carlo simulation.
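To make the table-lookup-plus-interpolation idea concrete, here is a minimal numpy sketch. The table values below are illustrative placeholders, not the actual Fuller entries, and the function name is mine:

```python
import numpy as np

# Hypothetical excerpt of a Fuller-style critical-value table:
# 5% critical values at a handful of tabulated sample sizes.
# (Illustrative numbers only, not the real table entries.)
sample_sizes = np.array([25, 50, 100, 250, 500])
crit_5pct = np.array([-3.60, -3.50, -3.45, -3.43, -3.42])

def interpolated_cv(nobs):
    """Linearly interpolate the 5% critical value for a sample size
    that falls between two tabulated entries."""
    return np.interp(nobs, sample_sizes, crit_5pct)

print(interpolated_cv(75))  # between the n=50 and n=100 table entries
```

Note that `np.interp` clamps to the endpoint values outside the tabulated range, which mirrors using the largest tabulated sample size for very long series.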

While we do use MacKinnon's approximate p-values from the 1994 paper, MacKinnon wrote a note updating that paper in early 2010, which gives new regression surface results for obtaining the critical values, and we use these new results. Therefore, when using our ADF test, it is advised that if the p-value is close to the rejection threshold, the critical values should be used in place of the p-value to make the ultimate decision.
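A MacKinnon-style response surface gives the finite-sample critical value as the asymptotic value plus corrections in 1/T and 1/T². A small sketch of that functional form, with hypothetical coefficients (the real ones come from the tables in MacKinnon, 2010):

```python
def mackinnon_cv(nobs, b_inf, b1, b2):
    """Finite-sample critical value from a MacKinnon-style response
    surface: asymptotic value plus 1/T and 1/T**2 corrections."""
    return b_inf + b1 / nobs + b2 / nobs**2

# Hypothetical coefficients, for illustration only.
b_inf, b1, b2 = -3.41, -4.04, -17.83

for T in (50, 100, 500):
    print(T, mackinnon_cv(T, b_inf, b1, b2))
```

As the sample size grows, the correction terms vanish and the critical value approaches the asymptotic value `b_inf`; for small samples the critical value is more negative, i.e., harder to reject.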

We can illustrate the use of ADF. Note that this version is only in my branch and, although it has now been tested, it remains in the sandbox because the API and returned results may change. We will demonstrate on a series that we can easily guess is non-stationary: real GDP.

In [1]: import scikits.statsmodels as sm

In [2]: from scikits.statsmodels.sandbox.tsa.stattools import adfuller

In [3]: data = sm.datasets.macrodata.load()

In [4]: realgdp = data.data['realgdp']

In [5]: adf = adfuller(realgdp, maxlag=4, autolag=None, regression="ct")

In [6]: adf
Out[6]:
(-1.8566384063254346,
 0.67682917510440099,
 4,
 198,
 {'1%': -4.0052351400496136,
  '10%': -3.1402115863254525,
  '5%': -3.4329000694218998})

The return values are the test statistic, its p-value (the null hypothesis here is that the series *does* contain a unit root), the number of lagged differences used, the number of observations in the regression, and a dictionary containing the critical values at the respective significance levels. The regression option controls the type of regression (i.e., whether to include a constant or a linear or quadratic time trend), and the autolag option offers three ways of choosing the lag length to help correct for serial correlation in the regression: 'AIC', 'BIC', and 't-stat'. The first two choose the lag length that minimizes the information criterion; the last chooses the lag length based on the significance of the highest lag, starting with maxlag and working its way down. The docstring has more detailed information.
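To show what the 't-stat' rule does, here is a minimal numpy sketch, assuming a constant-only ADF regression and a one-sided 5% normal critical value. The function name is mine, and this is a simplification of, not the actual, statsmodels implementation:

```python
import numpy as np

def adf_tstat_lag(y, maxlag, crit=1.6449):
    """Pick the ADF lag length by the 't-stat' rule: start at maxlag
    and drop lags until the longest remaining lagged difference has a
    significant t-statistic.  Constant-only regression; a sketch."""
    dy = np.diff(y)
    for p in range(maxlag, 0, -1):
        n = len(dy) - p
        # Regressors: constant, lagged level y_{t-1}, p lagged differences.
        X = np.column_stack(
            [np.ones(n), y[p:-1]]
            + [dy[p - i:len(dy) - i] for i in range(1, p + 1)]
        )
        resp = dy[p:]
        beta, _, _, _ = np.linalg.lstsq(X, resp, rcond=None)
        resid = resp - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)
        t_last = beta[-1] / np.sqrt(cov[-1, -1])  # t-stat on longest lag
        if abs(t_last) > crit:
            return p
    return 0
```

The 'AIC' and 'BIC' options instead fit the regression at every lag length up to maxlag and keep the one with the smallest information criterion.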

Beyond this, I have been working on an autocorrelation function (acf), a partial autocorrelation function (pacf), and Q-statistics (the Ljung-Box test). Next up for this week is finishing my VAR class with identification schemes. After this, I will work to integrate post-estimation tests into our results classes, most likely using some sort of mix-in class that attaches containers for test results to the results objects. Then it's off to the SciPy conference, where I will hopefully be participating in the stats sprint, helping out with the docs marathon, and discussing what we need for the future of statistics and Python.
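To make the Q-statistic mentioned above concrete, here is a minimal numpy sketch of the sample ACF and the Ljung-Box statistic. The function names are mine and the statsmodels versions differ in their options and return values:

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelations r_1 .. r_nlags (biased estimator)."""
    x = np.asarray(x, float)
    x = x - x.mean()
    denom = x @ x
    return np.array([x[:-k] @ x[k:] / denom for k in range(1, nlags + 1)])

def ljung_box_q(x, nlags):
    """Ljung-Box Q-statistic: n*(n+2) * sum_k r_k**2 / (n - k).
    Under the null of no autocorrelation it is approximately
    chi-squared with nlags degrees of freedom."""
    n = len(x)
    r = acf(x, nlags)
    k = np.arange(1, nlags + 1)
    return n * (n + 2) * np.sum(r**2 / (n - k))
```

For white noise Q stays near its degrees of freedom, while for a highly persistent series such as a random walk it blows up, which is what makes it a useful check for residual serial correlation.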

Fuller, W.A. 1996. Introduction to Statistical Time Series. 2nd ed. Wiley.

MacKinnon, J.G. 1994. "Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests." Journal of Business and Economic Statistics 12, 167-76.

MacKinnon, J.G. 2010. "Critical Values for Cointegration Tests." Queen's University, Department of Economics, Working Papers. Available at http://ideas.repec.org/p/qed/wpaper/1227.html