Friday, April 24, 2009

Getting Started with GSoC and SciPy

I'm trying to resist making the obligatory "hello world" post here. The best I can do is only mentioning the urge.

First, a little bit about myself. My name is Skipper Seabold. I am finishing my first year as a PhD student in economics at American University in Washington, DC, and I have recently been accepted to the Google Summer of Code 2009 to work on the SciPy project. I have been a computer hardware and programming hobbyist since my middle school days. I have always built my own computers, and back in high school I tinkered with Visual Basic (Apps for AOL 3.0 on Windows 3.x and Windows 95, anyone?), Turbo Pascal, C++, and Java, mostly in the context of coursework. Two years ago I was introduced to the Python programming language, and I haven't looked back. Needless to say, I'm very happy to have two of my interests, economics and programming, overlap.

This is where SciPy comes in. For those who are unfamiliar with SciPy, I direct you to the homepage here. In short, SciPy is an open source library of numerical algorithms for people working in engineering and the sciences, broadly defined. The SciPy library depends on NumPy. The Tentative NumPy Tutorial is a good place to start learning about the capabilities of NumPy. Likewise, the Getting Started page has plenty of resources to introduce you to the power of SciPy. In particular, the tutorials, documentation, and cookbook are worth a look.

This summer I will be working on a consistent user interface for statistical models and the appropriate statistical tests in SciPy, similar to those found in other statistics and econometrics software packages. I will also provide a unified development framework for those who would like to add to this effort in the future. Posts may be less regular over the next few weeks, but check here for at least weekly updates on the work over the summer.

3 comments:

  1. Looks like our illustrious host left out an important personal detail found here: http://www.musowls.org/pdfs/mustoday/2009/March09.pdf

    "Skipper Seabold is currently studying
    at American University in Washington,
    DC, where he recently received a master’s
    degree in financial economics and is now
    pursuing a Ph.D. in economics. In his
    studies, Skipper finds that he does actually
    use calculus every day. He lives with Brett
    Meeks ’02, and, together, they fight crime
    at night."

  2. Is there a good tutorial (preferably with examples) on using the SciPy stats? I'm a newbie to Python - and therefore SciPy - and am trying to get up to speed in order to do some analysis of several economic data sets.

    Thanks.

  3. Hey, sorry I didn't see your comment sooner. I thought I had comment notification turned on. If you're still following, you can find the documentation tutorials for stats here
    http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/

    Note that there is a stats folder and a stats.rst file there. These are still a work in progress (incidentally, they are distributed with the code, so if you build the docs you will have them on your computer). If you check back here periodically, I am working on a tutorial for the statsmodels code, and it will cover working with scipy.stats and matplotlib as well, e.g., for residual analysis.
