• Stars
    star
    293
  • Rank 141,748 (Top 3 %)
  • Language
    TypeScript
  • License
    MIT License
  • Created over 6 years ago
  • Updated 29 days ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

SQL Parsers for BigData, built with antlr4.

dt-sql-parser

NPM version NPM downloads Chat

English | 简体中文

dt-sql-parser is a SQL Parser project built with ANTLR4, and it's mainly for the BigData domain. The ANTLR4 generated the basic Parser, Visitor, and Listener, so it's easy to complete the syntax validation, tokenizer, traverse the AST, and so on features.

Besides, it provides some helper methods, like split SQL, and filter the -- and /**/ types of comments in SQL.

Supported SQL:

  • Generic SQL (MySQL)
  • Flink SQL
  • Spark SQL
  • Hive SQL
  • PL/SQL
  • PostgreSQL
  • Trino SQL

Tips: This project is the default for Javascript language, also you can try to compile it to other languages if you need.

Integrating SQL Parser with Monaco Editor

We have provided a monaco-sql-languages package, you can integrate with monaco-editor easily.

Installation

// use npm
npm i dt-sql-parser --save

// use yarn
yarn add dt-sql-parser

Usage

Syntax Validation

First, we need to import the Parser object from dt-sql-parser, the different language needs different Parser, so if you need to handle the Flink SQL, you can import the FlinkSQL Parser.

The below is a GenericSQL Parser example:

import GenericSQL from 'dt-sql-parser/dist/parser/generic';

const parser = new GenericSQL();

const correctSql = 'select id,name from user1;';
const errors = parser.validate(correctSql);
console.log(errors); 

Output:

/*
[]
*/

Validate failed:

const incorrectSql = 'selec id,name from user1;'
const errors = parser.validate(incorrectSql);
console.log(errors); 

Output:

/*
[
    {
        endCol: 5,
        endLine: 1,
        startCol: 0,
        startLine: 1,
        message: "mismatched input 'SELEC' expecting {<EOF>, 'ALTER', 'ANALYZE', 'CALL', 'CHANGE', 'CHECK', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DROP', 'EXPLAIN', 'GET', 'GRANT', 'INSERT', 'KILL', 'LOAD', 'LOCK', 'OPTIMIZE', 'PURGE', 'RELEASE', 'RENAME', 'REPLACE', 'RESIGNAL', 'REVOKE', 'SELECT', 'SET', 'SHOW', 'SIGNAL', 'UNLOCK', 'UPDATE', 'USE', 'BEGIN', 'BINLOG', 'CACHE', 'CHECKSUM', 'COMMIT', 'DEALLOCATE', 'DO', 'FLUSH', 'HANDLER', 'HELP', 'INSTALL', 'PREPARE', 'REPAIR', 'RESET', 'ROLLBACK', 'SAVEPOINT', 'START', 'STOP', 'TRUNCATE', 'UNINSTALL', 'XA', 'EXECUTE', 'SHUTDOWN', '--', '(', ';'}"
    }
]
*/

We instanced a Parser object, and use the validate method to check the SQL syntax, if failed returns an array object includes error message.

Tokenizer

Get all tokens by the Parser:

import GenericSQL from 'dt-sql-parser/dist/parser/generic';

const parser = new GenericSQL()
const sql = 'select id,name,sex from user1;'
const tokens = parser.getAllTokens(sql)
console.log(tokens)
/*
[
    {
        channel: 0
        column: 0
        line: 1
        source: [SqlLexer, InputStream]
        start: 0
        stop: 5
        tokenIndex: -1
        type: 137
        _text: null
    },
    ...
]
*/

Visitor

Traverse the tree node by the Visitor:

import GenericSQL from 'dt-sql-parser/dist/parser/generic';
import { SqlParserVisitor } from 'dt-sql-parser/dist/parser/generic/SqlParserVisitor';

const parser = new GenericSQL()
const sql = `select id,name from user1;`
// parseTree
const tree = parser.parse(sql)
class MyVisitor extends SqlParserVisitor {
    // overwrite visitTableName
    visitTableName(ctx) {
        let tableName = ctx.getText().toLowerCase()
        console.log('TableName', tableName)
    }
    // overwrite visitSelectElements
    visitSelectElements(ctx) {
        let selectElements = ctx.getText().toLowerCase()
        console.log('SelectElements', selectElements)
    }
}
const visitor = new MyVisitor()
visitor.visit(tree)

/*
SelectElements id,name
TableName user1
*/

Tips: The node's method name can be found in the Visitor file under the corresponding SQL directory

Listener

Access the specified node in the AST by the Listener

import GenericSQL from 'dt-sql-parser/dist/parser/generic';
import { SqlParserListener } from 'dt-sql-parser/dist/parser/generic/SqlParserListener';

const parser = new GenericSQL();
const sql = 'select id,name from user1;'
// parseTree
const tree = parser.parse(sql)
class MyListener extends SqlParserListener {
    enterTableName(ctx) {
        let tableName = ctx.getText().toLowerCase()
        console.log('TableName', tableName)
    }
    enterSelectElements(ctx) {
        let selectElements = ctx.getText().toLowerCase()
        console.log('SelectElements', selectElements)
    }
}
const listenTableName = new MyListener();
parser.listen(listenTableName, tree);

/*
SelectElements id,name
TableName user1
*/

Tips: The node's method name can be found in the Listener file under the corresponding SQL directory

Clean

Clear the comments and spaces before and after

import { cleanSql } from 'dt-sql-parser';

const sql = `-- comment comment
select id,name from user1; `
const cleanedSql = cleanSql(sql)
console.log(cleanedSql)

/*
select id,name from user1;
*/

Split SQL

When the SQL text is very big, you can think about to split it by ; , and handle it by each line.

import { splitSql } from 'dt-sql-parser';

const sql = `select id,name from user1;
select id,name from user2;`
const sqlList = splitSql(sql)
console.log(sqlList)

/*
["select id,name from user1;", "\nselect id,name from user2;"]
*/

Other API

  • parserTreeToString(input: string)

Parse the input and convert the AST to a List-like tree string.

Roadmap

  • Auto-complete
  • Code formatting

License

MIT

More Repositories

1

chunjun

A data integration framework
Java
3,970
star
2

flinkStreamSQL

基于开源的flink,对其实时sql进行扩展;主要实现了流与维表的join,支持原生flink SQL所有的语法
Java
2,017
star
3

Taier

Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
Java
1,329
star
4

molecule

🚀 A lightweight Web IDE UI framework.
TypeScript
879
star
5

jlogstash

java 版本的logstash
HTML
267
star
6

monaco-sql-languages

SQL languages for monaco-editor
TypeScript
217
star
7

chengying

一款支持标准化schema定义、自动化部署产品包的软件,旨在对产品包下每个服务进行部署、升级、卸载、配置等操作,解放人工运维成本。
Go
197
star
8

dt-react-component

React UI component library based on antd package
TypeScript
84
star
9

jfilebeat

类filebeat的轻量级日志采集工具
Java
68
star
10

DatasourceX

Java
65
star
11

catcher

java性能采集工具
Java
51
star
12

flinkx

48
star
13

code-review-practices

Code Review Practices
48
star
14

yice-performance

易测性能检测平台
TypeScript
46
star
15

doraemon

A management tool to help you organize your daily development
TypeScript
32
star
16

chengying-agent

EasyAgent is an infrastructure component, applied to manage the life-cycle of services on the remote host.
Go
32
star
17

dt-utils

前端常用工具函数
TypeScript
30
star
18

dt-python-parser

Python Parsers for BigData, built with antlr4.
JavaScript
25
star
19

ko

Project toolkit for React Applications
JavaScript
24
star
20

dt-react-monaco-editor

Monaco editor for React.
TypeScript
22
star
21

jlogstash-input-plugin

java 版本 logstash input 插件
Java
21
star
22

jlogstash-performance-testing

jlogstash 与 logstash 性能对比
20
star
23

molecule-examples

The collection of Molecule examples
TypeScript
20
star
24

UED

袋鼠云数栈 UED 团队 - http://ued.dtstack.cn/
TypeScript
20
star
25

ant-design-dtinsight-theme

This is a document of DTInsight-theme based on Ant Design.
Less
20
star
26

dt-form-renderer

Render Interaction Form Via JSON
TypeScript
18
star
27

ant-design-testing

TypeScript
16
star
28

babel-plugin-treasure

Base on babel-plugin-import , committed to achieving the AST optimization requirements of a unified library
HTML
11
star
29

chengying-server

Go
10
star
30

Code-Style-Guide

10
star
31

chunjun-web

ChunJun Offical Website https://dtstack.github.io/chunjun-web/
JavaScript
8
star
32

jlogstash-output-plugin

java 版本 logstash output 插件
Java
7
star
33

jlogstash-filter-plugin

java 版本 logstash filter 插件
Java
7
star
34

easyvc-power-meter

an open source component code for demonstrating @easyv/cli.
JavaScript
6
star
35

chengying-schema

Shell
6
star
36

chengying-front

TypeScript
5
star
37

chengying-web

JavaScript
4
star
38

typescript-migration-helper

Help ES6 + React + Redux project migrates to the Typescript project.
Perl
2
star
39

easyv-cli

EasyV官方组件开发工具
2
star
40

dtstack-log-java-sdk

玳数日志java版本sdk
Java
2
star
41

create-molecule

Create Molecule Application with create-react-app.
TypeScript
2
star
42

dt-react-codemirror-editor

Codemirror editor for React.
TypeScript
2
star
43

maven-repository

1
star
44

elasticsearch-sql-old

Java
1
star
45

dt-monaco-editor-nls-webpack-plugin

Simplified Chinese Support For Monaco Editor
JavaScript
1
star