Scientific Applications
Context
Python is frequently used for high-performance scientific applications. It is widely used in academia and scientific projects because it is easy to write and performs well.
Because scientific computing demands high performance, Python code in this field often relies on external libraries, typically written in faster languages (such as C, or Fortran for matrix operations). The main libraries used are NumPy, SciPy and Matplotlib. Going into detail about these libraries is beyond the scope of the Python guide. However, a comprehensive introduction to the scientific Python ecosystem can be found in the Python Scientific Lecture Notes.
Tools
IPython
IPython is an enhanced version of the Python interpreter, which provides features of great interest to scientists. The inline mode allows graphics and plots to be displayed in the terminal (Qt based version). Moreover, the notebook mode supports literate programming and reproducible science by generating a web-based Python notebook. This notebook allows you to store chunks of Python code alongside the results and additional comments (HTML, LaTeX, Markdown). The notebook can then be shared and exported in various file formats.
Libraries
NumPy
NumPy is a low-level library written in C (and Fortran) for high-level mathematical functions. NumPy sidesteps the slowness of loops written in pure Python by providing multidimensional arrays and functions that operate on whole arrays at once. An algorithm expressed as operations on arrays runs its inner loops in compiled code, which allows it to run quickly.
NumPy is part of the SciPy project, and is released as a separate library so people who only need the basic requirements can use it without installing the rest of SciPy.
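As a minimal sketch of the array-based style described above, the following compares an explicit Python loop with the equivalent whole-array expression. The function names and the sample data are illustrative choices, not part of NumPy.

```python
import numpy as np

# Sample data: evenly spaced values (an arbitrary illustrative input).
values = np.linspace(0.0, 1.0, 10_000)


def sum_of_squares_loop(data):
    """Pure-Python approach: the interpreter visits every element."""
    total = 0.0
    for x in data:
        total += x * x
    return total


def sum_of_squares_vectorized(data):
    """Array approach: the loop runs inside NumPy's compiled code."""
    return float(np.sum(data * data))


loop_result = sum_of_squares_loop(values)
vectorized_result = sum_of_squares_vectorized(values)
```

Both functions compute the same quantity; on large arrays the vectorized form is typically much faster because no Python-level loop is involved.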
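Supported Python versions depend on the NumPy release: older releases were compatible with Python versions 2.4 through 2.7.2 and 3.1+, whereas current releases require Python 3.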
Numba
Numba is a NumPy-aware Python compiler (a just-in-time (JIT) specializing compiler) which compiles annotated Python (and NumPy) code to LLVM (Low Level Virtual Machine) through special decorators. Briefly, Numba uses a system that compiles Python code with LLVM to machine code which can be executed natively at runtime.
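The decorator-based workflow might look like the sketch below, which uses Numba's `njit` decorator on a loop-heavy function. The function `pairwise_distance_sum` and its input are hypothetical examples, and the `ImportError` fallback is an assumption added so that the sketch still runs (without any speed-up) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # compiles the decorated function on its first call
except ImportError:
    # Fallback for environments without Numba: a do-nothing decorator.
    def njit(func):
        return func


@njit
def pairwise_distance_sum(points):
    """Sum of Euclidean distances between all pairs of rows of a 2D array."""
    n, dim = points.shape
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            squared = 0.0
            for k in range(dim):
                diff = points[i, k] - points[j, k]
                squared += diff * diff
            total += np.sqrt(squared)
    return total


points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
result = pairwise_distance_sum(points)  # pair distances are 5, 10 and 5
```

Nested loops like these are slow in plain Python, which is the kind of code where a JIT compiler tends to help most.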
SciPy
SciPy is a library that uses NumPy for more mathematical functions. SciPy uses NumPy arrays as the basic data structure, and comes with modules for various commonly used tasks in scientific programming, including linear algebra, integration (calculus), ordinary differential equation solving, and signal processing.
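Two of the tasks listed above, integration and linear algebra, can be sketched as follows using `scipy.integrate.quad` and `scipy.linalg.solve`. The integrand and the small linear system are arbitrary illustrative inputs.

```python
import numpy as np
from scipy import integrate, linalg

# Numerical integration: the integral of sin(x) from 0 to pi is exactly 2.
# quad returns the estimate together with an estimate of the absolute error.
area, abs_error_estimate = integrate.quad(np.sin, 0.0, np.pi)

# Linear algebra: solve A @ x = b, with NumPy arrays as input and output.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)  # the solution of this system is x = [2, 3]
```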
Matplotlib
Matplotlib is a flexible plotting library for creating interactive 2D and 3D plots that can also be saved as manuscript-quality figures. The API in many ways reflects that of MATLAB, easing transition of MATLAB users to Python. Many examples, along with the source code to recreate them, are available in the matplotlib gallery.
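A minimal sketch of producing and saving a figure is shown below. The plotted functions, the output file name, and the choice of the non-interactive `Agg` backend (so the script runs without a display) are all assumptions made for illustration.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this for on-screen plots
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 200)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), linestyle="--", label="cos(x)")
ax.set_xlabel("x (radians)")
ax.set_ylabel("value")
ax.set_title("Sine and cosine")
ax.legend()

# Save at a print-friendly resolution; the location is an arbitrary choice.
output_path = os.path.join(tempfile.gettempdir(), "sine_cosine.png")
fig.savefig(output_path, dpi=300)
plt.close(fig)
```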
Pandas
Pandas is a data manipulation library based on NumPy which provides many useful functions for accessing, indexing, merging, and grouping data easily. The main data structure (DataFrame) is close to what could be found in the R statistical package; that is, heterogeneous data tables with name indexing, time series operations, and auto-alignment of data.
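The operations named above (indexing, grouping, merging, and auto-alignment) can be sketched on a small table. The column names and values are invented purely for illustration.

```python
import pandas as pd

# A heterogeneous table: one text column and one numeric column.
measurements = pd.DataFrame(
    {
        "sample": ["a", "a", "b", "b"],
        "temperature": [20.5, 21.5, 19.0, 18.0],
    }
)
sites = pd.DataFrame({"sample": ["a", "b"], "site": ["north", "south"]})

# Indexing: keep only the rows that satisfy a condition.
warm = measurements[measurements["temperature"] > 20.0]

# Grouping: mean temperature per sample.
mean_by_sample = measurements.groupby("sample")["temperature"].mean()

# Merging: attach the site of each sample, matching on the shared column.
merged = measurements.merge(sites, on="sample")

# Auto-alignment: values are matched by index label, not by position.
# Labels present in only one operand produce NaN in the result.
first = pd.Series([1.0, 2.0], index=["x", "y"])
second = pd.Series([10.0, 20.0], index=["y", "z"])
aligned_sum = first + second
```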
xarray
xarray is similar to Pandas, but it is intended for wrapping multidimensional scientific data. By labelling the data with dimensions, coordinates, and attributes, it makes complex multidimensional operations clearer and more intuitive. It also wraps matplotlib for quick plotting, and can apply most operations in parallel using dask.
Rpy2
Rpy2 is a Python binding for the R statistical package allowing the execution of R functions from Python and passing data back and forth between the two environments. Rpy2 is the object-oriented implementation of the Rpy bindings.
PsychoPy
PsychoPy is a library for cognitive scientists allowing the creation of cognitive psychology and neuroscience experiments. The library handles presentation of stimuli, scripting of experimental design, and data collection.
Resources
Installation of scientific Python packages can be troublesome, as many of these packages are implemented as Python C extensions which need to be compiled. This section lists various so-called scientific Python distributions which provide precompiled and easy-to-install collections of scientific Python packages.
Unofficial Windows Binaries for Python Extension Packages
Many people who do scientific computing are on Windows, yet many of the scientific computing packages are notoriously difficult to build and install on this platform. Christoph Gohlke, however, has compiled a list of Windows binaries for many useful Python packages. The list of packages has grown from a mainly scientific Python resource to a more general list. If you’re on Windows, you may want to check it out.
Anaconda
The Anaconda Python Distribution includes all the common scientific Python packages as well as many packages related to data analytics and big data. Anaconda itself is free, and a number of proprietary add-ons are available for a fee. Free licenses for the add-ons are available for academics and researchers.
Canopy
Canopy is another scientific Python distribution, produced by Enthought. A limited ‘Canopy Express’ variant is available for free, but Enthought charges for the full distribution. Free licenses are available for academics.