SQL Scripts
Useful SQL scripts, split from DevOps Bash tools, for which this is now a submodule.
Hari Sekhon
Cloud & Big Data Contractor, United Kingdom
(you're welcome to connect with me on LinkedIn)
Inventory
DevOps / DBA
aws_athena_cloudtrail_ddl.sql
- AWS Athena DDL to setup up integration to query CloudTrail logs from Athenabigquery_*.sql
- Google BigQuery scripts:bigquery_billing_*.sql
- billing queries for GCP usage eg. highest cost services, most used GCP products, recent charges etc.bigquery_info_*.sql
- information schema queries for datasets, tables, columns, partitioning, clustering etc.
mysql_*.sql
:postgres_*.sql
:- PostgreSQL queries for DBA investigating + performance tuning
- postgres_info.sql - big summary overview, recommend you start here
- tested on PostgreSQL 8.4, 9.x, 10.x, 11.x, 12.x, 13.x
Analytics
bigquery_*.sql
- Google BigQuery scripts:bigquery_billing_*.sql
- billing queries for GCP usage eg. highest cost services, most used GCP products, recent charges etc.bigquery_info_*.sql
- information schema queries for datasets, tables, columns, partitioning, clustering etc.- analytics/
bigquery_*.sql
- ecommerce queries and BigQuery ML machine learning classification logistic regression models and purchasing predictions - for more BigQuery examples, see Data Engineering demos
DevOps SQL tooling
You can quickly test the PostgreSQL / MySQL scripts using postgres.sh
/ mysqld.sh
/ mariadb.sh
in the DevOps Bash tools repo, which boots a docker container and drops straight in to a mysql
/ psql
shell with this directory mounted at /sql
and used as $PWD
for fast easy sourcing eg.
postgres:
\i /sql/postgres_query_times.sql
\i postgres_query_times.sql
mysql:
source /sql/mysql_sessions.sql
\. mysql_sessions.sql
Related scripts
- .psqlrc - advanced PostgreSQL psql client config
- psql.sh - quickly connect to PostgreSQL with command line switches inferred from environment variables
- mysql.sh - quickly connect to MySQL / MariaDB with command line switches inferred from environment variables
- postgres.sh - one-touch PostgreSQL, boots docker container and drops you in to
psql
shell. Version can be given as an argument - mysqld.sh / mariadb.sh - one-touch MySQL / MariaDB, boots docker container and drops you in to
mysql
shell. Version can be given as an argument - postgres_foreach_table.sh / mysql_foreach_table.sh - execute templated SQL queries/statements against all or a subset of tables
- postgres_tables_row_counts.sh / mysql_tables_row_counts.sh - get row counts for all or a subset of tables
- sqlcase.pl - autocases your SQL code
- I use this a lot and call it via hotkey configured in my .vimrc
- there are
*case.pl
specializations for most of the major RDBMS and distributed SQL systems, even several NoSQL systems, using each one's language specific keywords
- Hive & Impala SQL:
- beeline.sh - quickly connect to Hive, auto-determines HiveServer2 address, Kerberos & SSL options, ZooKeeper quorum
- impala_shell.sh - quickly connect to Impala, auto-determines a Hadoop worker node address and Kerberos options (can use an environment variable for a Load Balancer setup)
- hive_foreach_table.sh / impala_foreach_table.sh - execute templated SQL queries/statements against all or a subset of tables
- hive_tables_row_counts.sh / impala_tables_row_counts.sh - get row counts for all or a subset of tables
- hive_tables_column_counts.sh / impala_tables_column_counts.sh - get the column counts for big tables in Hive / Impala
- hive_tables_metadata.sh / impala_tables_metadata.sh / hive_tables_locations.sh / impala_tables_locations.sh - get Hive / Impala metadata for all or a subset of tables, eg. Location to determine where the external tables data is being stored (HDFS / S3 paths)
Related Repositories
-
DevOps Bash Tools - 1000+ DevOps Bash Scripts, Advanced
.bashrc
,.vimrc
,.screenrc
,.tmux.conf
,.gitconfig
, CI configs & Utility Code Library - AWS, GCP, Kubernetes, Docker, Kafka, Hadoop, SQL, BigQuery, Hive, Impala, PostgreSQL, MySQL, LDAP, DockerHub, Jenkins, Spotify API & MP3 tools, Git tricks, GitHub API, GitLab API, BitBucket API, Code & build linting, package management for Linux / Mac / Python / Perl / Ruby / NodeJS / Golang, and lots more random goodies -
Jenkins - Advanced Jenkinsfile & Jenkins Groovy Shared Library
-
GitHub-Actions - GitHub Actions master template & GitHub Actions Shared Workflows library
-
Templates - dozens of Code & Config templates - AWS, GCP, Docker, Jenkins, Terraform, Vagrant, Puppet, Python, Bash, Go, Perl, Java, Scala, Groovy, Maven, SBT, Gradle, Make, GitHub Actions Workflows, CircleCI, Jenkinsfile, Makefile, Dockerfile, docker-compose.yml, M4 etc.
-
Kubernetes configs - Kubernetes YAML configs - Best Practices, Tips & Tricks are baked right into the templates for future deployments
-
Terraform - Terraform templates for AWS / GCP / Azure / GitHub management
-
DevOps Python Tools - 80+ DevOps CLI tools for AWS, GCP, Hadoop, HBase, Spark, Log Anonymizer, Ambari Blueprints, AWS CloudFormation, Linux, Docker, Spark Data Converters & Validators (Avro / Parquet / JSON / CSV / INI / XML / YAML), Elasticsearch, Solr, Travis CI, Pig, IPython
-
DevOps Perl Tools - 25+ DevOps CLI tools for Hadoop, HDFS, Hive, Solr/SolrCloud CLI, Log Anonymizer, Nginx stats & HTTP(S) URL watchers for load balanced web farms, Dockerfiles & SQL ReCaser (MySQL, PostgreSQL, AWS Redshift, Snowflake, Apache Drill, Hive, Impala, Cassandra CQL, Microsoft SQL Server, Oracle, Couchbase N1QL, Dockerfiles, Pig Latin, Neo4j, InfluxDB), Ambari FreeIPA Kerberos, Datameer, Linux...
-
The Advanced Nagios Plugins Collection - 450+ programs for Nagios monitoring your Hadoop & NoSQL clusters. Covers every Hadoop vendor's management API and every major NoSQL technology (HBase, Cassandra, MongoDB, Elasticsearch, Solr, Riak, Redis etc.) as well as message queues (Kafka, RabbitMQ), continuous integration (Jenkins, Travis CI) and traditional infrastructure (SSL, Whois, DNS, Linux)
-
Nagios Plugin Kafka - Kafka API pub/sub Nagios Plugin written in Scala with Kerberos support
-
HAProxy Configs - 80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, Cloudera, Hortonworks, MapR, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, ZooKeeper, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, SSH, RabbitMQ, Redis, Riak, Rancher etc.
-
Dockerfiles - 50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Mesos, Consul, Riak, OpenTSDB, Jython, Advanced Nagios Plugins & DevOps Tools repos on Alpine, CentOS, Debian, Fedora, Ubuntu, Superset, H2O, Serf, Alluxio / Tachyon, FakeS3
-
HashiCorp Packer templates - Linux automated bare-metal installs and portable virtual machines OVA format appliances using HashiCorp Packer, Redhat Kickstart, Debian Preseed and Ubuntu AutoInstaller / Cloud-Init
-
Diagrams-as-Code - Cloud & Open Source architecture diagrams with Python & D2 source code provided - automatically regenerated via GitHub Actions CI/CD - AWS, GCP, Kubernetes, Jenkins, ArgoCD, Traefik, Kong API Gateway, Nginx, Redis, PostgreSQL, Kafka, Spark, web farms, event processing...