
write_pandas and Snowflake Connector for Python, explained
Write_pandas is a method in the Snowflake Connector for Python which allows the user to append data to a table in Snowflake from a Pandas DataFrame. In this post I will show how to use the method and answer some common questions that users have.
Write_pandas is a method in the Snowflake Connector for Python which allows the user to append data to a table in Snowflake from a Pandas DataFrame. In this post I will show how to use the method and answer some common questions that users have.
write_pandas and Snowflake Connector for Python
The best practice for using write_pandas to write a Pandas DataFrame to Snowflake is:
- Install the required packages:
pandas snowflake-connector-python
- Connect to Snowflake using
connect
method insnowflake-connector-python
- Write DataFrame to Snowflake using
write_pandas
method from Python connector
Install Snowflake Connector for Python and Pandas to your machine
In order to execute the code described in this post you need to first install some required packages to your environment or machine, these are:
pandas
snowflake-connector-python
The best practice is to create a separate Python environment for each project, so I will first create a Conda environment and then install the required packages:
conda create -n env_name python=3.9 pandas
conda activate env_name
pip install "snowflake-connector-python[pandas]"
Connect to Snowflake using the connector for Python
Now onto the Python code. First of all you need to connect to Snowflake using your credentials. We will be doing this of course by using the Snowflake Connector for Python connector:
import os
from snowflake.connector import connect
#Use your Snowfake user credentials, recommended to save these as env variables
conn = connect(
user=os.environ["SNOW_USER"],
password=os.environ["SNOW_PASSWORD"],
account=os.environ["SNOW_ACCOUNT"],
role=os.environ["SNOW_ROLE"],
)
We have now created a connection object with Snowflake that we can use later on to interact with the instance.
Using Snowflake credentials in Python
As you can see, your Snowflake credentials are required to create this connection, and we have entered these using environment variables. You can of course decide what the best insertion method is for your project, but given that this is sensitive information you should remember the golden rule:
Do not enter your credentials in plain text, especially when working within a Git repository
User access rights in Snowflake
OK, one last note on connect to your instance, I promise. Your Snowflake user will have a certain level of access which grants you rights to certain databases, schemas, and tables. So make sure that you have the correct level of access for your needs.
Use write_pandas to write Pandas DataFrame to Snowflake
Now that you have your connection to the Snowflake instance you can start to do the interesting bit, writing the data.
In order to write the data you need to first define the table that you will be working with. One of the quirks, and downsides, of using the Snowflake connector functions is that this table needs to already exist before you can append it.
After the table has been defined I will use the write_pandas
function to append the data, which does some SQL magic behind the scenes. It first uses a PUT
command to upload the data to a temporary storage and then uses COPY INTO
to move the data from that storage to the table. Let’s put the function into action:
from snowflake.connector.pandas_tools import write_pandas
import pandas as pd
#Define the table name, schema, and database you want to write to
#Note: the table, schema, and database need to already exist in Snowflake
table_name = 'CITIES'
schema = 'CUSTOMERS'
database = 'DW'
df = pd.DataFrame(data=[['Stephen','Oslo'],['Jane','Stockholm']],columns=['Name','City'])
#Combine these using the function from the Snowflake connector
success, num_chunks, num_rows, output = write_pandas(
conn=conn,
df=df,
table_name=table_name,
database=database,
schema=schema
)
The write_pandas
function only requires conn
, df
, and table_name
but I have chosen to also define the database
and schema
as this is a best practice to ensure that the correct table is being modified.
Output from write_pandas function
The function will return some useful information for us so that we can understand if the appending has worked as expected, these are:
success
: True/False for if the function was successfulnum_chunks
: The number of data chunks copiednum_rows
: The number of rows appendedoutput
: TheCOPY INTO
command used
Write Pandas DataFrame to table using Snowflake Connector for Python
Now that we have explored all aspects of appending the data, it’s time to bring it all together into one code snippet where we go from connection to writing the data:
from snowflake.connector.pandas_tools import write_pandas
from snowflake.connector import connect
import pandas as pd
import os
#Use your Snowfake user credentials, recommended to save these as env variables
conn = connect(
user=os.environ["SNOW_USER"],
password=os.environ["SNOW_PASSWORD"],
account=os.environ["SNOW_ACCOUNT"],
role=os.environ["SNOW_ROLE"],
)
#Define the table name, schema, and database you want to write to
#Note: the table, schema, and database need to already exist in Snowflake
table_name = 'CITIES'
schema = 'CUSTOMERS'
database = 'DW'
df = pd.DataFrame(data=[['Stephen','Oslo'],['Jane','Stockholm']],columns=['Name','City'])
#Combine these using the function from the Snowflake connector
success, num_chunks, num_rows, output = write_pandas(
conn=conn,
df=df,
table_name=table_name,
database=database,
schema=schema
)
Using Snowflake Connector when table doesn’t exist yet
If you try to run these code snippets for a table that doesn’t exist yet in Snowflake then an error will be returned. If you would like to create the table before you append then I would recommend using the best practice for writing Pandas data to Snowflake as described in this post.
Related articles
Write Pandas data to Snowflake table
Create Snowflake table from Pandas
Create table from Pandas using Python connector
References
write_pandas documentation
Python connector installation documentation