  • Stars: 122
  • Rank: 292,031 (Top 6%)
  • Language: Python
  • License: Apache License 2.0
  • Created: over 1 year ago
  • Updated: 5 months ago

Repository Details

An LLM-powered Streamlit chatbot for data exploration and question answering on Snowflake

Frosty: Build a LLM Chatbot in Streamlit on your Snowflake Data

Open in GitHub Codespaces

Overview

In this guide, we will build an LLM-powered chatbot named "Frosty" that performs data exploration and question answering by writing and executing SQL queries on Snowflake data. The application uses Streamlit and Snowflake and can be plugged into your LLM of choice, alongside data from Snowflake Marketplace. By the end of the session, you will have an interactive web application chatbot which can converse and answer questions based on a public job listings dataset.
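The "writing and executing SQL queries" step can be pictured with a minimal sketch. It assumes the LLM is prompted to wrap its query in a fenced block tagged `sql`; the names `extract_sql`, `answer_with_data`, and `run_query` are hypothetical and are not taken from the repository.

```python
import re
from typing import Callable, Optional

# Three backticks, built programmatically so this example stays readable.
FENCE = "`" * 3

# Assumption: the model wraps its query in a fenced block tagged "sql".
SQL_BLOCK = re.compile(
    re.escape(FENCE) + r"sql\s*\n(.*?)" + re.escape(FENCE),
    re.DOTALL | re.IGNORECASE,
)


def extract_sql(reply: str) -> Optional[str]:
    """Return the first SQL statement found in an LLM reply, or None."""
    match = SQL_BLOCK.search(reply)
    if match is None:
        return None
    return match.group(1).strip()


def answer_with_data(reply: str, run_query: Callable[[str], list]) -> Optional[list]:
    """Run the extracted query with a caller-supplied executor (hypothetical)."""
    sql = extract_sql(reply)
    return run_query(sql) if sql else None
```

Passing the executor in as a callable keeps the parsing logic independent of any particular Snowflake client, which also makes it easy to test without a live connection.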

View the demo page for a full walkthrough and more material.

Run the app

Once the environment is set up and the secrets are configured, including a connection to a Snowflake environment with the relevant view, run the app with:

streamlit run src/frosty_app.py

App Demo

Run in Codespaces

Press the button above to get started with this guide in GitHub Codespaces. This may be especially useful if you are less comfortable with Python environment setup (or don't feel like wrestling with it today). Notes and tips on using Codespaces with this guide:

  • Once you launch the codespace, dependencies should be installed automatically and the app should launch after a few seconds.
  • The app needs secrets before it can run: configure .streamlit/secrets.toml in Codespaces (or similar). An example file is provided to help you get started. The app will show an exception on launch until the secrets are added.
  • Please ensure codespace use is appropriate for the planned data access and usage. Consider using encrypted secrets for any sensitive credentials.
  • Learn more about GitHub Codespaces free usage limits and billing, and about the lifecycle of a codespace, in the GitHub documentation.
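Because the app raises an exception until its secrets are in place, it can help to check for missing values up front. The following is a rough sketch only: the key names (`OPENAI_API_KEY` and a `connections.snowflake` section) are assumptions for illustration, and the example file shipped with the repository remains the authoritative layout.

```python
from typing import Any, List, Mapping

# Hypothetical layout; check the repository's example secrets file for the real keys.
REQUIRED_TOP_LEVEL = ("OPENAI_API_KEY",)
REQUIRED_SNOWFLAKE = ("account", "user", "password", "warehouse", "database", "schema")


def missing_secrets(secrets: Mapping[str, Any]) -> List[str]:
    """List the dotted names of required secrets that are absent or empty."""
    missing = [key for key in REQUIRED_TOP_LEVEL if not secrets.get(key)]
    snowflake = secrets.get("connections", {}).get("snowflake", {})
    missing += [
        f"connections.snowflake.{key}"
        for key in REQUIRED_SNOWFLAKE
        if not snowflake.get(key)
    ]
    return missing
```

In a Streamlit app, the mapping passed in would most likely be `st.secrets`; a non-empty result could then be shown to the user as a readable message instead of a stack trace.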

Testing

This repo provides automated tests of the Frosty app functionality using Streamlit AppTest. Tests are located in src/test_frosty.py and can be run using pytest. Calls to Snowflake and OpenAI are mocked using Python's unittest.mock, so the tests run quickly and consistently without live credentials, which suits an automated development process.
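The mocking idea can be illustrated with a small, self-contained sketch that uses only `unittest.mock`. It does not reproduce the repository's tests: `summarize_row_count` and the `connection.query` interface are hypothetical stand-ins for code that would normally talk to Snowflake.

```python
from unittest.mock import MagicMock


def summarize_row_count(connection, table: str) -> str:
    """Hypothetical app helper: ask the database how many rows a table has."""
    rows = connection.query(f"SELECT COUNT(*) FROM {table}")
    return f"{table} has {rows[0][0]} rows"


def test_summarize_row_count():
    # Replace the real connection with a mock that returns a canned result.
    fake_connection = MagicMock()
    fake_connection.query.return_value = [(42,)]

    assert summarize_row_count(fake_connection, "JOBS") == "JOBS has 42 rows"
    # Verify the helper issued exactly the query we expect, with no network call.
    fake_connection.query.assert_called_once_with("SELECT COUNT(*) FROM JOBS")
```

Saved in a file whose name starts with `test_`, a function like this would be collected by pytest in the same way as the tests listed in the output below.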

Testing example output

$ pytest -v
================ test session starts ================
platform darwin -- Python 3.10.12, pytest-7.4.2, pluggy-1.3.0 --
cachedir: .pytest_cache
rootdir: /python/Snowflake-Labs/sfguide-frosty-llm-chatbot-on-snowflake
plugins: anyio-3.7.1
collected 3 items

src/test_frosty.py::test_validate_creds PASSED [ 33%]
src/test_frosty.py::test_prompts PASSED [ 66%]
src/test_frosty.py::test_frosty_app PASSED [100%]

================ 3 passed in 1.37s ================
