  • Stars: 842
  • Rank: 54,118 (top 2%)
  • Language: Dockerfile
  • License: Apache License 2.0
  • Created almost 4 years ago
  • Updated about 2 years ago

Repository Details

The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.

Apache Flink SQL Cookbook

The cookbook is a living document. 🌱

Table of Contents

Foundations

  1. Creating Tables
  2. Inserting Into Tables
  3. Working with Temporary Tables
  4. Filtering Data
  5. Aggregating Data
  6. Sorting Tables
  7. Encapsulating Logic with (Temporary) Views
  8. Writing Results into Multiple Tables
  9. Convert timestamps with timezones

Aggregations and Analytics

  1. Aggregating Time Series Data
  2. Watermarks
  3. Analyzing Sessions in Time Series Data
  4. Rolling Aggregations on Time Series Data
  5. Continuous Top-N
  6. Deduplication
  7. Chained (Event) Time Windows
  8. Detecting Patterns with MATCH_RECOGNIZE
  9. Maintaining Materialized Views with Change Data Capture (CDC) and Debezium
  10. Hopping Time Windows
  11. Window Top-N
  12. Retrieve previous row value without self-join

Other Built-in Functions & Operators

  1. Working with Dates and Timestamps
  2. Building the Union of Multiple Streams
  3. Filtering out Late Data
  4. Overriding table options
  5. Expanding arrays into new rows
  6. Split strings into maps

User-Defined Functions (UDFs)

  1. Extending SQL with Python UDFs

Joins

  1. Regular Joins
  2. Interval Joins
  3. Temporal Table Join between a non-compacted and compacted Kafka Topic
  4. Lookup Joins
  5. Star Schema Denormalization (N-Way Join)
  6. Lateral Table Join

Former Recipes

  1. Aggregating Time Series Data (Before Flink 1.13)

About Apache Flink

Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities.

Learn more about Flink at https://flink.apache.org/.

License

Copyright © 2020-2022 Ververica GmbH

Distributed under Apache License, Version 2.0.

More Repositories

  1. flink-cdc-connectors (Java, 4,956 stars): CDC Connectors for Apache Flink®
  2. flink-training-exercises (Java, 552 stars)
  3. sql-training (Java, 543 stars)
  4. flink-sql-gateway (Java, 489 stars)
  5. stateful-functions (Java, 276 stars): Stateful Functions for Apache Flink
  6. flink-jdbc-driver (Java, 128 stars)
  7. flink-sql-benchmark (Java, 102 stars)
  8. ververica-platform-playground (Shell, 89 stars): Instructions for getting started with Ververica Platform on minikube.
  9. frocksdb (C++, 61 stars)
  10. flink-statefun-workshop (Python, 44 stars)
  11. jupyter-vvp (Python, 41 stars): Jupyter Integration for Flink SQL via Ververica Platform
  12. flink-training-troubleshooting (Java, 40 stars)
  13. lab-fraud-detection (Java, 30 stars): Demo code for implementing and showcasing a Fraud Detection Engine with Apache Flink.
  14. streaming-ledger (Java, 22 stars): Serializable ACID transactions on streaming data
  15. lab-flink-latency (Java, 22 stars): Lab for testing different Flink job latency optimization techniques covered in a Flink Forward 2021 talk
  16. lab-flink-repository-analytics (Java, 18 stars)
  17. lab-sql-vs-datastream (Java, 13 stars): Lab project to showcase Flink's performance differences between using a SQL query and implementing the same logic via the DataStream API
  18. flink-ecosystem (TypeScript, 12 stars): Ecosystem website for Apache Flink
  19. tpc-ds-generators (8 stars): Binaries for TPC-DS data generators
  20. acwern (TypeScript, 6 stars): Flink visualization library for blogposts
  21. ForSt (C++, 4 stars): A Persistent Key-Value Store designed for Streaming processing
  22. demo-vvp-via-azure-pipelines (Java, 3 stars)
  23. pyflink-docs (Python, 2 stars): pyflink documentation
  24. lab-vvp-pyflink (Java, 2 stars)
  25. flink-emr-terraform (HCL, 1 star): Terraform module for creating AWS EMR Flink clusters.