How to use LISTAGG function in Snowflake

Listagg is an aggregation function in Snowflake which concatenates values. In this post I will walk through what it’s useful for and how to use it in your database.

Stephen Allwright
Stephen Allwright

listagg is an aggregation function in Snowflake which concatenates values, in its simplest form all you need to run is select listagg(col_1) from table. In this post I will walk through what it’s useful for and how to use it in your database.

Snowflake listagg, what does it do?

listagg is a column aggregation function which concatenates all values within a column to a list, using a defined delimiter to separate the elements. It’s often used in conjunction with a group by clause to aggregate up to a given level.

This sounds similar to the concat function, but differs as the concat function concatenates along row wise, whilst listagg concatenates up column wise to a list.

listagg with a delimiter

The most common syntax for using listagg in Snowflake is:

listagg( column_1, delimiter )

select listagg( product_name, "," ) from products

This basic syntax will concatenate all of the values in product_name and separate them all by using the delimiter which in our case is “,”.

listagg with distinct values

It’s also possible to return a concatenated list containing just the distinct values in a column. This is done using the following syntax:

listagg( distinct column_1, delimiter )

select listagg( distinct category_name, "|" ) from products

listagg within group

We can take this function one step further by adding the within group function, which will concatenate the values in a given order:

listagg( column1, delimiter) within group (order by column_2 desc)

select 
category_name, listagg( product_name, ",") within group (order by price desc) 
from products
group by category_name

Limitations and common errors when using listagg

There are a few things that you should be aware of when using this function:

  • The delimiter must be of datatype string
  • If no delimiter is provided then an empty string is used
  • There isn’t a defined order of values within the list, unless an order by clause is used in conjunction with within group
  • null values are not included within the output list
  • listagg is an aggregation function, so other columns included in the statement need to either be the subject of the aggregation or aggregated themselves.

Using coalesce in Snowflake
Using replace in Snowflake
Using substring in Snowflake
Using ifnull in Snowflake
Using sample in Snowflake

References

Snowflake listagg documentation

Snowflake

Stephen Allwright Twitter

I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.