We have made use of Python’s Pandas package in a variety of posts on the site. These have showcased some of Pandas’ abilities including the following:
- DataFrames for data manipulation with built-in indexing
- Handling of missing data
- Data alignment
- Melting/stacking and Pivoting/unstacking data sets
- Groupby feature allowing split -> apply -> combine operations on data sets
- Data merging and joining
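As a quick taste of the features listed above, here is a minimal sketch that touches several of them in turn. The `sales` and `regions` tables, their column names, and all values are hypothetical, invented purely for illustration; they are not taken from any of our earlier posts.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
sales = pd.DataFrame(
    {
        "store": ["A", "A", "B", "B"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "revenue": [10.0, np.nan, 7.0, 9.0],
    }
)

# Missing data: replace the missing revenue value with 0.
filled = sales.fillna({"revenue": 0.0})

# Groupby: split by store, apply a sum, combine into one Series.
totals = filled.groupby("store")["revenue"].sum()

# Pivoting: one row per store, one column per quarter.
wide = filled.pivot(index="store", columns="quarter", values="revenue")

# Melting: back to one row per (store, quarter) pair.
long = wide.reset_index().melt(
    id_vars="store", var_name="quarter", value_name="revenue"
)

# Merging: attach a region label to each sales row.
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})
merged = filled.merge(regions, on="store", how="left")

# Data alignment: arithmetic matches on index labels, not positions.
# Labels present in only one Series produce NaN in the result.
s1 = pd.Series([1, 2], index=["a", "b"])
s2 = pd.Series([10, 20], index=["b", "c"])
aligned = s1 + s2

print(totals)
print(wide)
print(merged)
print(aligned)
```

Each step returns a new object rather than modifying `sales` in place, which makes it easy to inspect the intermediate results one at a time in a notebook.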
Pandas is also a high-performance library, with much of its code written in Cython or C. Unfortunately, it can have a steep learning curve. In this post, I’ll cover some introductory tips and tricks to help you get started with this excellent package.
- This post was partially inspired by Tom Augspurger’s Pandas tutorial, which has a YouTube video that can be viewed alongside it. Where relevant, we also suggest some other excellent resource materials below.
- The notebook we use below can be downloaded from our GitHub page. Feel free to grab it and follow along.