Snowflake Snowpark Python API
The Snowpark library provides intuitive APIs for querying and processing data in a data pipeline. Using this library, you can build applications that process data in Snowflake without having to move data to the system where your application code runs.
Source code | Developer guide | API reference | Product documentation | Samples
Getting started
Have your Snowflake account ready
If you don't have a Snowflake account yet, you can sign up for a 30-day free trial account.
Create a Python virtual environment
You can use miniconda, anaconda, or virtualenv to create a Python 3.8, 3.9 or 3.10 virtual environment.
For the best experience when using Snowpark with UDFs, create a local conda environment with the Snowflake channel.
Install the library to the Python virtual environment
pip install snowflake-snowpark-python
If you want to use pandas-related features, also install pandas in the same environment:
pip install "snowflake-snowpark-python[pandas]"
Create a session and use the APIs
from snowflake.snowpark import Session
connection_parameters = {
    "account": "<your snowflake account>",
    "user": "<your snowflake user>",
    "password": "<your snowflake password>",
    "role": "<snowflake user role>",
    "warehouse": "<snowflake warehouse>",
    "database": "<snowflake database>",
    "schema": "<snowflake schema>"
}
session = Session.builder.configs(connection_parameters).create()
df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
df = df.filter(df.a > 1)
df.show()
pandas_df = df.to_pandas() # this requires pandas installed in the Python environment
result = df.collect()
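The connection parameters above are written inline, including the password. As a minimal sketch of one alternative, the dictionary could be built from environment variables instead. The variable names (SNOWFLAKE_ACCOUNT and so on), the helper names, and the choice of which keys are treated as required are all assumptions made for illustration; they are not part of the Snowpark API.

```python
import os

# Hypothetical environment variable names; pick whatever fits your deployment.
_ENV_KEYS = {
    "account": "SNOWFLAKE_ACCOUNT",
    "user": "SNOWFLAKE_USER",
    "password": "SNOWFLAKE_PASSWORD",
    "role": "SNOWFLAKE_ROLE",
    "warehouse": "SNOWFLAKE_WAREHOUSE",
    "database": "SNOWFLAKE_DATABASE",
    "schema": "SNOWFLAKE_SCHEMA",
}

# Assumption: password-based login, so these three keys must be present.
_REQUIRED = ("account", "user", "password")


def connection_parameters_from_env(environ=None):
    """Build a connection_parameters dict from environment variables.

    Unset or empty variables are skipped, so optional keys such as
    "role" are simply left out of the result.
    """
    environ = os.environ if environ is None else environ
    params = {
        key: environ[var] for key, var in _ENV_KEYS.items() if environ.get(var)
    }
    missing = [_ENV_KEYS[key] for key in _REQUIRED if key not in params]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return params


def create_session_from_env():
    """Create a session the same way as the example above.

    Snowpark is imported lazily so the helper above can be used and
    tested without the library installed.
    """
    from snowflake.snowpark import Session

    return Session.builder.configs(connection_parameters_from_env()).create()
```

With the variables exported in the shell, session = create_session_from_env() can stand in for the explicit dictionary and the Session.builder call.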
Samples
The Developer Guide and API references have basic sample code. Snowflake-Labs has more curated demos.
Logging
Configure the logging level for snowflake.snowpark to see Snowpark Python API logs. Because Snowpark uses the Snowflake Python Connector, you may also want to configure the logging level for snowflake.connector when the error originates in the connector. For instance:
import logging

for logger_name in ('snowflake.snowpark', 'snowflake.connector'):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
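If the loop above runs more than once in the same process (for example, when a notebook cell is re-executed), each run attaches another handler and every log line is printed multiple times. The following is a sketch of one way to avoid that, using only the standard logging module. The function name configure_snowpark_logging and the module-level _handler variable are hypothetical helpers, not part of Snowpark.

```python
import logging

LOG_FORMAT = (
    "%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - "
    "%(funcName)s() - %(levelname)s - %(message)s"
)
LOGGER_NAMES = ("snowflake.snowpark", "snowflake.connector")

# Single shared handler, created on first use (hypothetical helper state).
_handler = None


def configure_snowpark_logging(level=logging.DEBUG):
    """Set the level on both loggers and attach one shared stream handler.

    Calling this again only updates the levels; it never attaches the
    handler to the same logger twice.
    """
    global _handler
    if _handler is None:
        _handler = logging.StreamHandler()
        _handler.setFormatter(logging.Formatter(LOG_FORMAT))
    _handler.setLevel(level)
    for logger_name in LOGGER_NAMES:
        logger = logging.getLogger(logger_name)
        logger.setLevel(level)
        if _handler not in logger.handlers:
            logger.addHandler(_handler)
    return _handler
```

Calling configure_snowpark_logging() once at startup gives the same DEBUG output as the loop above; calling it later with logging.INFO lowers the verbosity without adding handlers.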
Contributing
Please refer to CONTRIBUTING.md.