
Use SQLAlchemy to create Snowflake table from Pandas DataFrame
Creating, replacing, or appending a table in Snowflake directly from a Pandas DataFrame in Python reduces the friction of infrastructure and gets the data into the hands of end users faster. In this post I will show how to create a table in Snowflake from Pandas using only SQLAlchemy.
Creating, replacing, or appending a table in Snowflake directly from a Pandas DataFrame in Python reduces the friction of infrastructure and gets the data into the hands of end users faster. In this post I will show how to create a table in Snowflake from Pandas using only SQLAlchemy.
Create Snowflake table from Pandas using SQLAlchemy
The current best practice for creating a Snowflake table using SQLAlchemy is:
- Install the required packages:
pandas sqlalchemy
- Connect to Snowflake using
sqlalchemy
- Create Snowflake table using
to_sql
method from Pandas
Install SQLAlchemy and Pandas to your machine
In order to execute the code described in this post you need to first install some required packages to your environment or machine, these are:
pandas
sqlalchemy
The best practice is to create a separate Python environment for each project, so I will first create a Conda environment and then install the required packages:
conda create -n env_name python=3.9 pandas sqlalchemy
conda activate env_name
Connect to Snowflake using SQLAlchemy
Now onto the Python code. First of all you need to connect to Snowflake using your credentials. We do this by using the SQLAlchemy package to create the engine which allows us to interact with Snowflake.
from sqlalchemy import create_engine
#Create connection to Snowflake using your account and user
account_identifier = '<account_identifier>'
user = '<user_login_name>'
password = '<password>'
database_name = '<database_name>'
schema_name = '<schema_name>'
conn_string = f"snowflake://{user}:{password}@{account_identifier}/{database_name}/{schema_name}"
engine = create_engine(conn_string)
Create table if it already exists using SQLAlchemy
If the table you want to create already exists then we need to know what to do with it. to_sql
has a parameter if_exists
for this eventuality, and there are three options that can be parsed in:
append
. Add the data at the end of the existing tablereplace
. Drop the table and create a new one in it’s placefail
. Return an error
Use to_sql to create Snowflake table from Pandas
Now that we have a connection to Snowflake and decided what to do if the table already exists, we can define the table name. It’s important to note that this table name needs to be lower case, otherwise an error will be returned.
Once this is decided upon then we can let the to_sql
method take our DataFrame, do some SQL magic in the background, and create our new table:
import pandas as pd
#Create your DataFrame
table_name = 'cities'
df = pd.DataFrame(data=[['Stephen','Oslo'],['Jane','Stockholm']],columns=['Name','City'])
#What to do if the table exists? replace, append, or fail?
if_exists = 'replace'
#Write the data to Snowflake
with engine.connect() as con:
df.to_sql(name=table_name.lower(), con=con, if_exists=if_exists)
Snowflake table from Pandas DataFrame using SQLAlchemy, tying it all together
Now that we have explored all aspects of the task as hand, it’s time to bring it all together into one code snippet:
import pandas as pd
from sqlalchemy import create_engine
#Create connection to Snowflake using your account and user
account_identifier = '<account_identifier>'
user = '<user_login_name>'
password = '<password>'
database_name = '<database_name>'
schema_name = '<schema_name>'
conn_string = f"snowflake://{user}:{password}@{account_identifier}/{database_name}/{schema_name}"
engine = create_engine(conn_string)
#Create your DataFrame
table_name = 'cities'
df = pd.DataFrame(data=[['Stephen','Oslo'],['Jane','Stockholm']],columns=['Name','City'])
#What to do if the table exists? replace, append, or fail?
if_exists = 'replace'
#Write the data to Snowflake
with engine.connect() as con:
df.to_sql(name=table_name.lower(), con=con, if_exists=if_exists)
Related articles
Write Pandas data to Snowflake table
Create Snowflake table from Pandas
References
SQLAlchemy create engine documentation
Pandas to_sql documentation