# Pandas groupby aggregate functions

`groupby` is a Pandas DataFrame method that lets you aggregate data up to a higher level of abstraction. For example, if you have row-level order data but want to calculate figures at the customer level, you can group by the customer identifier.

## What is a Pandas groupby aggregate function?

A groupby aggregate function summarizes the rows in each group into a single value per group. With row-level order data grouped on the customer identifier, this lets you present calculations such as `total revenue` and `mean revenue per order` for each customer.
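As a minimal sketch of the customer-level example, the snippet below groups a small, made-up orders table by customer and computes total revenue and mean revenue per order. The DataFrame and its column names (`customer_id`, `order_id`, `revenue`) are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical row-level order data: one row per order.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "order_id": [101, 102, 103, 104, 105],
    "revenue": [10.0, 30.0, 5.0, 15.0, 40.0],
})

# Group the orders by customer and select the column to summarize.
grouped = orders.groupby("customer_id")["revenue"]

# One value per customer for each calculation.
total_revenue = grouped.sum()
mean_revenue_per_order = grouped.mean()

print(total_revenue)
print(mean_revenue_per_order)
```

Each result is a Series indexed by `customer_id`, so the two can be combined into one customer-level table if needed.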

## What are the possible Pandas groupby aggregate functions?

When using groupby, you define which columns to aggregate and which aggregation calculations to apply. You can use functions from separate packages such as NumPy for aggregations, but there are a number of built-in aggregations that are very simple to use:

- count() – Number of non-null observations
- nunique() – Number of unique values
- sum() – Sum of values
- mean() – Mean of values
- median() – Arithmetic median of values
- mad() – Mean absolute deviation of values (deprecated in pandas 1.5 and removed in 2.0)
- prod() – Product of values
- min() – Minimum
- max() – Maximum
- mode() – Mode (not available as a GroupBy method; pass `pd.Series.mode` to `agg()` instead)
- std() – Standard deviation
- var() – Variance
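Aggregations that are not built in can be passed to `agg()` as callables. The sketch below, using made-up data, calls one built-in aggregation directly, wraps a NumPy function in a lambda, and writes the mean absolute deviation by hand, which may be useful on pandas 2.0 and later where `mad()` is no longer available. The column names and the choice of the 90th percentile are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "revenue": [10.0, 20.0, 60.0, 5.0, 15.0],
})

grouped = df.groupby("customer_id")["revenue"]

# Built-in aggregations can be called directly on the grouped column.
median_revenue = grouped.median()

# A NumPy function wrapped in a lambda, applied once per group.
p90_revenue = grouped.agg(lambda s: np.percentile(s, 90))

# Mean absolute deviation written by hand: the average distance
# of each value from its group mean.
mad_revenue = grouped.agg(lambda s: (s - s.mean()).abs().mean())

print(median_revenue)
print(p90_revenue)
print(mad_revenue)
```

Custom callables run once per group in Python, so they are generally slower than the built-in aggregations on large data.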

You can use these aggregations in the following way:

`df.groupby('customer_id').agg({'revenue': ['sum', 'mean', 'std'], 'product_id': ['count', 'nunique']})`
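To make the call above concrete, here is a runnable sketch on a small, made-up DataFrame. It also shows named aggregation, an alternative form of `agg()` that produces flat column names. The data and the output column names (`total_revenue`, `mean_revenue`, `unique_products`) are assumptions for illustration.

```python
import pandas as pd

# Hypothetical order lines, made up for illustration.
df = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "product_id": ["A", "A", "A", "B", "C"],
    "revenue": [10.0, 30.0, 5.0, 15.0, 40.0],
})

# Dictionary form: maps each column to a list of aggregations.
# The result has a two-level column index such as ("revenue", "sum").
summary = df.groupby("customer_id").agg(
    {"revenue": ["sum", "mean", "std"], "product_id": ["count", "nunique"]}
)

# Named aggregation: each keyword becomes a flat output column,
# defined as a (source column, aggregation) pair.
flat = df.groupby("customer_id").agg(
    total_revenue=("revenue", "sum"),
    mean_revenue=("revenue", "mean"),
    unique_products=("product_id", "nunique"),
)

print(summary)
print(flat)
```

The dictionary form is compact when you want several statistics per column. Named aggregation tends to be easier to work with afterwards because the columns do not need flattening.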

