
SciPy '02 Report

by Walter Vannini

This essay originally appeared in the October 2002 issue of Accent, the newsletter of the Silicon Valley Chapter of the ACCU.

The first "Python for Scientific Computing Workshop" (http://www.scipy.org/) was held at Caltech on September 5 and 6. As one of the few attendees from the commercial world, I got quite an education in the concerns of the scientific world, and in how Python is helping out.

There were roughly 70 attendees, mostly from government organizations, universities, and research institutions. The largest contingents were from the astronomy and bioinformatics communities, although many other fields, such as particle physics, were represented.

One of the big selling points of Python, beyond the fact that it's open source, is its use as a glue language. Many existing C- and Fortran-based applications need to be used in a coordinated way. I got the impression that many organizations had originally tried the scripting language Tcl, but found that it didn't scale. There are also huge compiled libraries, written in C and Fortran, that provide very useful services. Open source tools like SWIG, the "Simplified Wrapper and Interface Generator", and f2py, the "Fortran to Python Interface Generator", can be used to make the C and Fortran code available to Python applications. There is a large Fortran community in the scientific world, and f2py is making Python a useful scripting language for that community. Another big plus is that Python is easy to use and understand, yet powerful. Since most scientific users are not primarily programmers, this is an important advantage.

The unifying theme of the two-day workshop/conference was "SciPy", an open source Python library that includes modules for signal processing, integration, special functions, and of course graphics and plotting. It's built on top of the Numeric module, and is currently supported by Enthought Inc (http://www.enthought.com/). Enthought was a major organizer of the conference [1], and three representatives of the company (Eric Jones, David Morrill, Travis Vaught) attended. They gave several excellent presentations, covering a wide variety of topics: the Numeric module, parallel computing, community development, and Chaco. Chaco is a plotting toolkit that Enthought (primarily David Morrill) originally developed for a client, and has now made available to the open source community.

SciPy and Numeric make it easy to do many of the things that the scientific community wants to do. Many attendees see SciPy with Numeric as an open source alternative to MATLAB, and the ongoing work by the SciPy community is in fact making this a reality. One of the most productive members of that community, Travis Oliphant of Brigham Young University, gave an in-depth tutorial introducing SciPy, and later gave another presentation describing what SciPy still needs. As well as more functionality, more documentation is seen as very important. Travis invited us to help out.

In the scientific world, the speed of numeric computations is often a priority, so compiled code is wrapped not just to reuse a huge amount of existing Fortran code, but also to selectively replace slow portions of Python code. Pat Miller, from the Lawrence Livermore National Laboratory, described some experimental techniques he's working on to directly optimize Python code. If he's successful, even more of the scientific community will have a reason to switch to Python.

There were several talks given by the people at Art Olson's Molecular Graphics Laboratory at Scripps (http://www.scripps.edu/pub/olson-web/). Michel Sanner described some of the molecular visualization tools that are being developed at Scripps using Python. As well as viewing molecular structures via PyOpenGL, researchers at Scripps enjoy actually holding physical models to gain further understanding. Attendees got to handle some of the molecular models manufactured by the 3D printer from Z Corporation (http://www.zcorp.com/). Although the models cost roughly $20 each, a low-end printer costs about $30,000. I'm looking forward to prices dropping a couple of orders of magnitude.

As well as the in-depth tutorials and presentations, there were lightning talks on a variety of uses of Python and SciPy. Along with the expected bioinformatics and astronomical applications, there were applications involving financial analysis, weather research, brain-machine interfacing, and quantum chemistry. There was a report on a metal casting application, and an example of its use involving two uranium hemispheres. A satellite image processing application was described. One of its requirements, based on national security considerations, was that it had to process 2 GB image files in minutes. This is ongoing work, and changes to the Python Imaging Library (PIL) will probably need to be made to properly handle files of that size.

There were many opportunities to chat with people during breaks. When I was asked what I did with Python, I replied that I was a Python enthusiast, and that I used Python as an administrative tool and to help automate parts of C++ programming (and as a handy command-line calculator [2]). But, as a contract programmer, all the paying projects I've found are C++ based, not Python based. Another developer I spoke with told me that he was in a similar position, except that he was a Ruby enthusiast who could only find paying Python projects to work on.

It looks like there will be a SciPy '03, but the dates haven't been set yet.

SciPy Status

The scientific Python (SciPy) package already has:

  • graphics and plotting
  • integration
  • special functions
  • signal processing
  • image processing
  • genetic algorithms
  • ordinary differential equation (ODE) solvers
  • unconstrained optimization
  • parallel programming tools
  • fast Fourier transforms
  • interpolation
  • statistical functions
  • linear algebra and BLAS routines based on LAPACK
  • simulated annealing
  • input/output modules

The SciPy community still wants the following features:

  • unit test cases for existing modules
  • documentation
  • constrained optimization
  • nonlinear conjugate gradient
  • Krylov subspace iterative solvers
  • PDE solvers
  • computational geometry utilities
  • wavelets
  • more input/output modules

A "hot list" will soon be available at http://www.scipy.org/.

1) The conference was also hosted by The National Biomedical Computation Resource and The Center for Advanced Computing Research. Michel Sanner and Michael Aivazis of these institutes played key roles in making the conference a reality.

2) e.g., using Python 2.2 (and later 2.x releases):

python -c "print (10**36)/998999"

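prints 1001002003005008013021034055089. Read in groups of three digits from the right, these are the first eleven Fibonacci numbers (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89).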
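
The one-liner above relies on the Python 2 print statement and on "/" performing integer division. A minimal sketch of the same calculation for Python 3 follows, assuming only the standard interpreter; the variable names (quotient, digits, groups) are illustrative and not from the original note. The trick works because 1000/998999 is the Fibonacci generating function x/(1 - x - x^2) evaluated at x = 1/1000, so each Fibonacci number below 1000 occupies its own three-digit slot.

```python
# Python 3 form of the footnote's calculation: "//" is explicit integer division.
quotient = 10**36 // 998999

# Left-pad to a multiple of three digits, then split into three-digit groups.
digits = str(quotient)
padded_length = -(-len(digits) // 3) * 3  # round the length up to a multiple of 3
digits = digits.zfill(padded_length)
groups = [int(digits[i:i + 3]) for i in range(0, len(digits), 3)]

print(quotient)
print(groups)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Raising the exponent beyond 36 exposes more terms, but only while each Fibonacci number still fits in three digits; from 1597 onward the groups overlap and the pattern breaks down.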