dolly
Databricks' Dolly, a large language model trained on the Databricks Machine Learning Platform

pyspark-ai
English SDK for Apache Spark

dbx
Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.

dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines

tempo
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation

mosaic
An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.

overwatch
Capture deep metrics on one or all assets within a Databricks workspace

ucx
Automated migrations to Unity Catalog

cicd-templates
Manage your Databricks deployments and CI with code.

automl-toolkit
Toolkit for Apache Spark ML covering feature clean-up, a feature importance calculation suite, information gain selection, distributed SMOTE, model selection and training, hyperparameter optimization and selection, and model interpretability.

migrate
Old scripts for one-off ST-to-E2 migrations. Use "terraform exporter" linked in the readme.

dlt-meta
Metadata driven Databricks Delta Live Tables framework for bronze/silver pipelines

dataframe-rules-engine
Extensible Rules Engine for custom Dataframe / Dataset validation

discoverx
A Swiss-Army-knife for your Data Intelligence platform administration.

geoscan
Geospatial clustering at massive scale

jupyterlab-integration
DEPRECATED: Integrating Jupyter with Databricks via SSH

smolder
HL7 Apache Spark Datasource

feature-factory
Accelerator to rapidly deploy customized features for your business

databricks-sync
An experimental tool to synchronize source Databricks deployment with a target Databricks deployment.

doc-qa

transpiler
SIEM-to-Spark Transpiler

brickster
R Toolkit for Databricks

delta-oms
DeltaOMS is a solution that helps build a centralized repository of Delta Transaction logs and associated operational metrics/statistics for your Delta Lakehouse. Unity Catalog is supported in the v0.7.0-rc1 release. Documentation: https://databrickslabs.github.io/delta-oms/v0.7.0-rc1/

pytester
Python Testing for Databricks

remorph
Cross-compiler and Data Reconciler into Databricks Lakehouse

splunk-integration
Databricks Add-on for Splunk

dbignite

arcuate
Delta Sharing + MLflow for ML model & experiment exchange (arcuate delta - a fan shaped river delta)

databricks-sdk-r
Databricks SDK for R (Experimental)

tika-ocr

sandbox
Experimental or low-maturity things

blueprint
Baseline for Databricks Labs projects written in Python

delta-sharing-java-connector
A Java connector for delta.io/sharing/ that allows you to easily ingest data on any JVM.

partner-connect-api

pylint-plugin
Databricks Plugin for PyLint

lsql
Lightweight SQL execution wrapper only on top of Databricks SDK

waterbear
Automated provisioning of an industry Lakehouse with enterprise data model