Python Data Science Resources


Python distribution

Use Python 3 instead of Python 2; Python 2 has been unsupported since January 2020.
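If you are unsure which interpreter a script will run under, a small guard at the top can catch it early. This is only a sketch using the standard library; the error message wording and the `major_version` name are my own choices.

```python
import sys

# Fail early with a clear message if the script is run under Python 2.
if sys.version_info < (3,):
    raise RuntimeError(
        "Python 3 is required; found %d.%d" % sys.version_info[:2]
    )

# Major version of the running interpreter (3 on any Python 3 install).
major_version = sys.version_info.major
```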

I recommend using Anaconda Python, which features a full suite of data science libraries.

Libraries

The must-have Python resources:

  • NumPy is the package to use for array data and linear algebra, and it is widely supported by other libraries. (Quickstart guide)

  • Matplotlib.pyplot is the standard for plotting in Python. It can be used for line, bar, contour, and 3D plots (and more!). (Tutorials)

  • Pandas is a great tool for working with large tabular data sets. It has good Excel/CSV input/output and lots of indexing, slicing, and grouping functionality.

  • xarray builds on NumPy arrays by adding named dimensions, coordinates, and attributes. It is my go-to for reading/writing netCDF4 and HDF5 data files.
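To give a feel for two of the libraries above, here is a minimal sketch that solves a small linear system with NumPy and then reads, groups, and writes CSV data with pandas. The matrix values, the `site` and `temp` column names, and the in-memory CSV text are made up for illustration; with real data you would pass a file path to `pd.read_csv` instead.

```python
import io

import numpy as np
import pandas as pd

# NumPy: solve the linear system A @ x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# pandas: read CSV text (illustrative data), then group and aggregate.
csv_text = "site,temp\nA,10.0\nA,12.0\nB,20.0\n"
df = pd.read_csv(io.StringIO(csv_text))
mean_temp = df.groupby("site")["temp"].mean()

# Write the grouped result back out as CSV text (in memory, no file created).
csv_out = mean_temp.to_csv()
```

Matplotlib and xarray follow the same pattern: import the package, build or load an array-like object, then call its plotting or I/O methods.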

Beginner's Guides