Coalesce is a column transformation function in Snowflake, in its simplest form all you need to run is
select coalesce(column1,column2) from table. In this post I will walk through what it’s useful for and how to use it in your database.
Snowflake coalesce, what is it?
Coalesce is a function which takes a collection of columns and returns the first
non null value, or returns
null if there are no
non null values.
How do you use coalesce in Snowflake?
The syntax for using
coalesce in Snowflake is:
coalesce( col_1 , col_2 , ... , col_n )
There are a few things that you should be aware of when using this function:
- All columns used in the function need to have the same data type
- The function searches for the first
non nullvalue from left to right in the list of columns
Example of using coalesce function in Snowflake
Coalesce is a function which on the surface seems to have a niche use case, but once you start to use it quickly becomes powerful. I will show you the benefit of this function by using a real world example you could face.
Imagine that you have a table containing product information where each product has a hierarchy of categories ranging from the highest aggregation to the most granular. The table would something like this:
|Product||Category 1||Category 2||Category 3|
|Butter||Dairy||Butter and margarine||Butter|
|Banana||Fruit & Veg||Fruit|
|Carrot||Fruit & Veg||Vegetable||Root vegetable|
If you were asked to return the most granular category per product then this would be difficult to achieve with a normal
select query as the lowest category isn’t always in the same column. This is a perfect use case for using
coalesce , as we want to find the first
non null column when traveling up the category tree. Our query would look like this:
select product, coalesce(category_3,category_2,category_1) as lowest_category from prodcuts /* | product | lowest_category | |---------|-----------------| | Milk | Milk | | Butter | Butter | | Banana | Fruit | | Carrot | Root vegetable | */