Pandas groupby aggregate functions

Learn what the possible aggregate functions are for Pandas groupby

Stephen Allwright

What is the groupby function in Pandas?

groupby is a Pandas DataFrame method that splits rows into groups based on one or more columns, so that you can aggregate the data up to a higher level of abstraction.

For example, if you have row-level order data but want to work at the customer level, you can use groupby on the customer identifier. This lets you present calculations such as total revenue per customer and mean revenue per order.
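
The customer example above could look something like the following minimal sketch. The DataFrame, the column names customer_id and revenue, and the output column names are all made up for illustration.

```python
import pandas as pd

# Hypothetical row-level order data: one row per order.
orders = pd.DataFrame(
    {
        "customer_id": ["a", "a", "b", "b", "b"],
        "revenue": [10.0, 20.0, 5.0, 15.0, 40.0],
    }
)

# Group on the customer identifier and compute one row per customer.
# The keyword names become the output column names.
per_customer = orders.groupby("customer_id")["revenue"].agg(
    total_revenue="sum",
    mean_revenue_per_order="mean",
)
print(per_customer)
```

The result is indexed by customer_id, with one column per calculation.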

Groupby aggregate functions in Pandas

When using the groupby function, you must define which columns will be aggregated and what aggregation calculations should be undertaken.
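
One way to state both the columns and the calculations in a single call is to pass a dictionary to agg, mapping each column to the aggregation it should receive. This is a sketch on invented data; the column names revenue and items are assumptions.

```python
import pandas as pd

# Hypothetical order data with two numeric columns.
orders = pd.DataFrame(
    {
        "customer_id": ["a", "a", "b"],
        "revenue": [10.0, 20.0, 5.0],
        "items": [1, 3, 2],
    }
)

# Keys are the columns to aggregate, values are the aggregation to apply.
summary = orders.groupby("customer_id").agg({"revenue": "sum", "items": "max"})
print(summary)
```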

Pandas has a number of built-in aggregations that are very simple to use:

  • count() – Number of non-null observations
  • nunique() – Number of unique values
  • sum() – Sum of values
  • mean() – Mean of values
  • median() – Median of values
  • mad() – Mean absolute deviation of values (deprecated in Pandas 1.5 and removed in 2.0)
  • prod() – Product of values
  • min() – Minimum
  • max() – Maximum
  • mode() – Mode (groupby objects do not provide a mode() method, so this is typically applied by passing pd.Series.mode to agg)
  • std() – Standard deviation
  • var() – Variance
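
Two entries in this list need extra care on recent Pandas versions: mad() was removed in Pandas 2.0, and mode() is not a method on groupby objects. Both can still be computed by passing your own functions to agg. The sketch below assumes made-up data, and the helper names mean_abs_dev and first_mode are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group": ["x", "x", "x", "y", "y"],
        "value": [1.0, 2.0, 6.0, 4.0, 4.0],
    }
)


def mean_abs_dev(s: pd.Series) -> float:
    """Mean absolute deviation around the group's mean."""
    return (s - s.mean()).abs().mean()


def first_mode(s: pd.Series):
    """Series.mode() can return several values; keep the smallest one."""
    return s.mode().iloc[0]


# Custom functions can be mixed into agg just like built-in names.
result = df.groupby("group")["value"].agg([mean_abs_dev, first_mode])
print(result)
```

Taking the first mode is a design choice for this example; when a group has several equally frequent values you may prefer to keep them all.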

Use groupby with a single aggregation

The syntax for using a single built-in Pandas aggregation is:
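
A minimal sketch of such a call, assuming a hypothetical DataFrame with a team column to group on and a points column to aggregate:

```python
import pandas as pd

# Hypothetical data: the column names are illustrative only.
df = pd.DataFrame({"team": ["red", "red", "blue"], "points": [3, 4, 10]})

# Group on one column, select the column to aggregate, then call the aggregation.
total_points = df.groupby("team")["points"].sum()
print(total_points)
```

Any of the built-in aggregations, such as mean() or max(), can be swapped in for sum() in the same position.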


Use groupby with multiple aggregations

It's also possible to use multiple aggregate functions on the same column. To do that, we just need to pass a list of functions to agg, like so:
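
A minimal sketch, again assuming a hypothetical DataFrame with team and points columns:

```python
import pandas as pd

# Hypothetical data: the column names are illustrative only.
df = pd.DataFrame({"team": ["red", "red", "blue"], "points": [3, 4, 10]})

# Passing a list to agg produces one output column per function.
result = df.groupby("team")["points"].agg(["sum", "mean", "max"])
print(result)
```

The output columns are named after the functions in the list, in the order they were given.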




Groupby documentation
Aggregate documentation



I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.