  • Stars: 340
  • Rank: 124,317 (Top 3 %)
  • Language: Haskell
  • License: Apache License 2.0
  • Created about 3 years ago
  • Updated about 1 year ago

PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models. PICARD is a ServiceNow Research project that was started at Element AI.

ServiceNow completed its acquisition of Element AI on January 8, 2021. All references to Element AI in the materials that are part of this project should refer to ServiceNow.


This is the official implementation of the following paper:

Torsten Scholak, Nathan Schucher, Dzmitry Bahdanau. PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP).

If you use this code, please cite:

@inproceedings{Scholak2021:PICARD,
  author = {Torsten Scholak and Nathan Schucher and Dzmitry Bahdanau},
  title = "{PICARD}: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models",
  booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
  month = nov,
  year = "2021",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2021.emnlp-main.779",
  pages = "9895--9901",
}

Overview

This code implements:

  • The PICARD algorithm for constrained decoding from language models.
  • A text-to-SQL semantic parser based on pre-trained sequence-to-sequence models and PICARD achieving state-of-the-art performance on both the Spider and the CoSQL datasets.

About PICARD

TL;DR: We introduce PICARD -- a new method for simple and effective constrained decoding from large pre-trained language models. On the challenging Spider and CoSQL text-to-SQL datasets, PICARD significantly improves the performance of fine-tuned but otherwise unmodified T5 models. Using PICARD, our T5-3B models achieved state-of-the-art performance on both Spider and CoSQL.

In text-to-SQL translation, the goal is to translate a natural language question into a SQL query. There are two main challenges to this task:

  1. The generated SQL needs to be semantically correct, that is, correctly reflect the meaning of the question.
  2. The SQL also needs to be valid, that is, it must not result in an execution error.

So far, there has been a trade-off between these two goals: The second problem can be solved by using a special decoder architecture that -- by construction -- always produces valid SQL. This is the approach taken by most prior work. Those decoders are called "constrained decoders", and they need to be trained from scratch on the text-to-SQL dataset. However, this limits the generality of the decoders, which is a problem for the first goal.

A better approach would be to use a pre-trained encoder-decoder model and to constrain its decoder to produce valid SQL after fine-tuning the model on the text-to-SQL task. This is the approach taken by the PICARD algorithm.

How is PICARD different from existing constrained decoders?

  • It’s an incremental parsing algorithm that integrates with ordinary beam search.
  • It doesn’t require any training.
  • It doesn’t require modifying the model.
  • It works with any model that generates a sequence of tokens (including language models).
  • It doesn’t require a special vocabulary.
  • It works with character-, sub-word-, and word-level language models.

How does PICARD work?

The following picture shows how PICARD is integrated with beam search.



Decoding starts from the left and proceeds to the right. The algorithm begins with a single token (usually <s>) and then keeps expanding the beam with hypotheses generated token by token by the decoder. At each decoding step and for each hypothesis, PICARD checks whether the next top-k tokens are valid. In the image above, only 3 token predictions are shown, and k is set to 2. Valid tokens (☑) are added to the beam. Invalid ones (☒) are discarded. The (k+1)-th, (k+2)-th, ... tokens are discarded, too. As in normal beam search, the beam is pruned to contain only the top-n hypotheses. n is the beam size, and in the image above it is set to 2 as well. Hypotheses that are terminated with the end-of-sentence token (usually </s>) are not expanded further. The algorithm stops when all hypotheses are terminated or when the maximum number of tokens has been reached.
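The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PICARD implementation: constrained_beam_search, toy_model, and toy_validity are hypothetical names, the "model" is a hand-written lookup table, and the validity check is a stand-in for the incremental parser.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens, cumulative log-probability)


def constrained_beam_search(
    next_token_logprobs: Callable[[List[str]], Dict[str, float]],
    is_valid_prefix: Callable[[List[str]], bool],
    beam_size: int = 2,
    top_k: int = 2,
    max_len: int = 10,
    bos: str = "<s>",
    eos: str = "</s>",
) -> List[Hypothesis]:
    beam: List[Hypothesis] = [([bos], 0.0)]
    for _ in range(max_len):
        candidates: List[Hypothesis] = []
        for tokens, score in beam:
            if tokens[-1] == eos:
                # Terminated hypotheses are carried over, not expanded.
                candidates.append((tokens, score))
                continue
            logprobs = next_token_logprobs(tokens)
            # Only the k most likely next tokens are checked; the rest are dropped.
            top = sorted(logprobs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
            for token, logprob in top:
                extended = tokens + [token]
                if is_valid_prefix(extended):  # invalid tokens never enter the beam
                    candidates.append((extended, score + logprob))
        if not candidates:
            break
        # Prune to the top-n hypotheses, as in ordinary beam search.
        beam = sorted(candidates, key=lambda h: h[1], reverse=True)[:beam_size]
        if all(tokens[-1] == eos for tokens, _ in beam):
            break
    return beam


def toy_model(tokens: List[str]) -> Dict[str, float]:
    # Hypothetical next-token distribution that depends only on the last token.
    table = {
        "<s>": {"select": 0.6, "from": 0.3, "name": 0.1},
        "select": {"from": 0.5, "name": 0.4, "</s>": 0.1},
        "name": {"from": 0.7, "</s>": 0.2, "select": 0.1},
        "from": {"people": 0.8, "</s>": 0.15, "select": 0.05},
        "people": {"</s>": 0.9, "from": 0.06, "name": 0.04},
    }
    return {tok: math.log(p) for tok, p in table[tokens[-1]].items()}


def toy_validity(tokens: List[str]) -> bool:
    # Stand-in for the parser: accept only prefixes of one fixed query.
    expected = ["<s>", "select", "name", "from", "people", "</s>"]
    return tokens == expected[: len(tokens)]


result = constrained_beam_search(toy_model, toy_validity)
best_tokens, best_score = result[0]
```

In this toy setup, the most likely token after "select" is "from", which the validity check rejects, so the second-ranked token "name" enters the beam instead. That is the only place where the sketch differs from plain beam search.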

How does PICARD know whether a token is valid?

In PICARD, checking, accepting, and rejecting tokens and token sequences is achieved through parsing. Parsing means that we attempt to assemble a data structure from the tokens that are currently in the beam or are about to be added to it. This data structure, together with the parsing rules used to build it, encodes the constraints we want to enforce.

In the case of SQL, the data structure we parse to is the abstract syntax tree (AST) of the SQL query. The parsing rules are defined in a computer program called a parser. Database engines, such as PostgreSQL, MySQL, and SQLite, have their own built-in parsers that they use internally to process SQL queries. For Spider and CoSQL, we have implemented a parser that supports a subset of the SQLite syntax and that checks additional constraints on the AST. In our implementation, the parsing rules are composed of simpler rules and primitives that are provided by a third-party parsing library.

PICARD uses a parsing library called attoparsec that supports incremental input. This is a special capability that is not available in many other parsing libraries. You can feed attoparsec a string that represents only part of the expected input to parse. When parsing reaches the end of an input fragment, attoparsec will return a continuation function that can be used to continue parsing. Think of the continuation function as a suspended computation that can be resumed later. Input fragments can be parsed one after the other when they become available until the input is complete.

Herein lies the key to PICARD: Incremental parsing of input fragments is exactly what we need to check tokens one by one during decoding.

In PICARD, parsing is initialized with an empty string, and attoparsec will return the first continuation function. We then call that continuation function with all the token predictions we want to check in the first decoding step. For those tokens that are valid, the continuation function will return a new continuation function that we can use to continue parsing in the next decoding step. For those tokens that are invalid, the continuation function will return a failure value which cannot be used to continue parsing. Such failures are discarded and never end up in the beam. We repeat the process until the end of the input is reached. The input is complete once the model predicts the end-of-sentence token. When that happens, we finalize the parsing by calling the continuation function with an empty string. If the parsing is successful, it will return the final AST. If not, it will return a failure value.
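The continuation scheme described above can be imitated in Python. The sketch below is only an analogy: it is neither attoparsec's API nor the PICARD parser. The names Fail, Done, Partial, start, and check_tokens are made up for illustration (attoparsec's result type has constructors that play similar roles), and the "grammar" is a hypothetical language of two sentences, checked by prefix matching rather than by real parsing rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Fail:
    message: str


@dataclass
class Done:
    value: List[str]  # stand-in for the final AST


@dataclass
class Partial:
    # Continuation: call it with more input; "" signals the end of the input.
    resume: Callable[[str], "Result"]


Result = Union[Fail, Done, Partial]

# Hypothetical toy language: the only two inputs that parse successfully.
LANGUAGE = ["select name from people", "select age from people"]


def start(consumed: str = "") -> Partial:
    """Return a continuation that remembers the input consumed so far."""

    def resume(fragment: str) -> Result:
        if fragment == "":
            # End of input: finalize the parse.
            if consumed in LANGUAGE:
                return Done(consumed.split())
            return Fail(f"incomplete input: {consumed!r}")
        text = consumed + fragment
        if any(sentence.startswith(text) for sentence in LANGUAGE):
            return start(text)  # still a valid prefix: hand back a new continuation
        return Fail(f"unexpected input: {text!r}")

    return Partial(resume)


def check_tokens(state: Partial, tokens: List[str]) -> Dict[str, Partial]:
    """Feed every candidate token to the same continuation; keep the survivors."""
    results = {token: state.resume(token) for token in tokens}
    return {token: r for token, r in results.items() if isinstance(r, Partial)}


state0 = start()                                         # parsing starts from an empty string
survivors = check_tokens(state0, ["sel", "from", "na"])  # one decoding step
state1 = survivors["sel"]
state2 = state1.resume("ect name from people")
final = state2.resume("")                                # end-of-sentence: finalize
too_early = state1.resume("")                            # finalizing "sel" fails
```

Because a continuation only closes over the text consumed so far, the same continuation can be called once per candidate token without the calls interfering with each other. That property is what makes it possible to test all predictions of a decoding step against one hypothesis.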

The parsing rules are described at a high level in the PICARD paper. For details, see the PICARD code, specifically the Language.SQL.SpiderSQL.Parse module.

How well does PICARD work?

Let's look at the numbers:

On Spider

URL                           Based on              Exact-set Match Accuracy  Execution Accuracy
                                                    Dev       Test            Dev       Test
tscholak/cxmefzzi w PICARD    T5-3B                 75.5 %    71.9 %          79.3 %    75.1 %
tscholak/cxmefzzi w/o PICARD  T5-3B                 71.5 %    68.0 %          74.4 %    70.1 %
tscholak/3vnuv1vf w PICARD    t5.1.1.lm100k.large   74.8 %    —               79.2 %    —
tscholak/3vnuv1vf w/o PICARD  t5.1.1.lm100k.large   71.2 %    —               74.4 %    —
tscholak/1wnr382e w PICARD    T5-Large              69.1 %    —               72.9 %    —
tscholak/1wnr382e w/o PICARD  T5-Large              65.3 %    —               67.2 %    —
tscholak/1zha5ono w PICARD    t5.1.1.lm100k.base    66.6 %    —               68.4 %    —
tscholak/1zha5ono w/o PICARD  t5.1.1.lm100k.base    59.4 %    —               60.0 %    —

Click on the links to download the models. tscholak/cxmefzzi and tscholak/1wnr382e are the versions of the model that we used in our experiments for the paper, reported as T5-3B and T5-Large, respectively. tscholak/cxmefzzi, tscholak/3vnuv1vf, and tscholak/1zha5ono were trained to use database content, whereas tscholak/1wnr382e was not.

Note that, without PICARD, 12% of the SQL queries generated by tscholak/cxmefzzi on Spider’s development set resulted in an execution error. With PICARD, this number decreased to 2%.
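To make the notion of an execution error concrete, here is a minimal sketch of how such a rate could be measured with Python's built-in sqlite3 module. It is not the Spider evaluation script: execution_error_rate is a hypothetical helper, and the table and the predicted queries are invented for the example.

```python
import sqlite3
from typing import Iterable


def execution_error_rate(conn: sqlite3.Connection, queries: Iterable[str]) -> float:
    """Fraction of queries that raise an error when executed against conn."""
    queries = list(queries)
    errors = 0
    for query in queries:
        try:
            conn.execute(query).fetchall()
        except sqlite3.Error:
            errors += 1
    return errors / len(queries) if queries else 0.0


# Invented in-memory database and predictions, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
predictions = [
    "SELECT name FROM people",
    "SELECT nme FROM people",       # unknown column
    "SELECT name FROM",             # incomplete statement
    "SELECT count(*) FROM people",
]
rate = execution_error_rate(conn, predictions)
conn.close()
```

Two of the four invented queries fail here, one because of a misspelled column and one because of a syntax error, so the rate comes out at 0.5.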

On CoSQL Dialogue State Tracking

URL                           Based on              Question Match Accuracy   Interaction Match Accuracy
                                                    Dev       Test            Dev       Test
tscholak/2e826ioa w PICARD    T5-3B                 56.9 %    54.6 %          24.2 %    23.7 %
tscholak/2e826ioa w/o PICARD  T5-3B                 53.8 %    51.4 %          21.8 %    21.7 %
tscholak/2jrayxos w PICARD    t5.1.1.lm100k.large   54.2 %    —               —         —
tscholak/2jrayxos w/o PICARD  t5.1.1.lm100k.large   52.5 %    —               —         —

Click on the links to download the models. tscholak/2e826ioa is the version of the model that we used in our experiments for the paper, reported as T5-3B.

Quick Start

Prerequisites

This repository uses git submodules. Clone it like this:

$ git clone git@github.com:ElementAI/picard.git
$ cd picard
$ git submodule update --init --recursive

Training

The training script is located in seq2seq/run_seq2seq.py. You can run it with:

$ make train

The model will be trained on the Spider dataset by default. You can also train on CoSQL by running make train-cosql.

The training script will create the directory train in the current directory. Training artifacts like checkpoints will be stored in this directory.

The default configuration is stored in configs/train.json. The settings are optimized for a GPU with 40GB of memory.

These training settings should result in a model with at least 71% exact-set-match accuracy on the Spider development set. With PICARD, the accuracy should go up to at least 75%.

We have uploaded a model trained on the Spider dataset to the Hugging Face model hub: tscholak/cxmefzzi. A model trained on the CoSQL dialogue state tracking dataset is available as well: tscholak/2e826ioa.

Evaluation

The evaluation script is located in seq2seq/run_seq2seq.py. You can run it with:

$ make eval

By default, the evaluation will be run on the Spider evaluation set. Evaluation on the CoSQL evaluation set can be run with make eval-cosql.

The evaluation script will create the directory eval in the current directory. The evaluation results will be stored there.

The default configuration is stored in configs/eval.json.

Serving

A trained model can be served using the seq2seq/serve_seq2seq.py script. The configuration file can be found in configs/serve.json. You can start serving with:

$ make serve

By default, the 800-million-parameter tscholak/3vnuv1vf model will be loaded. You can also load a different model by specifying the model name in the configuration file. The device to use can be specified as well. The default is to use the first available GPU. CPU can be used by specifying -1.

When the script is called, it uses the folder specified by the db_path option to look for SQL database files. The default folder is database, which will be created in the current directory. Initially, this folder will be empty, and you can add your own SQL files to it. The structure of the folder should be like this:

database/
  my_1st_database/
    my_1st_database.sqlite
  my_2nd_database/
    my_2nd_database.sqlite

where my_1st_database and my_2nd_database are the db_ids of the databases.
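Before starting the server, it can help to check that a folder follows this layout. The sketch below is a hypothetical helper, not part of the repository: discover_databases lists every db_id whose folder contains a file named <db_id>.sqlite, and the demo builds a throwaway folder with empty placeholder files.

```python
import tempfile
from pathlib import Path
from typing import Dict


def discover_databases(db_path: str) -> Dict[str, Path]:
    """Map each db_id to its .sqlite file, following the layout shown above."""
    found: Dict[str, Path] = {}
    root = Path(db_path)
    if not root.is_dir():
        return found
    for folder in sorted(root.iterdir()):
        candidate = folder / f"{folder.name}.sqlite"
        if folder.is_dir() and candidate.is_file():
            found[folder.name] = candidate
    return found


# Demo on a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for db_id in ["my_1st_database", "my_2nd_database"]:
        (Path(tmp) / db_id).mkdir()
        (Path(tmp) / db_id / f"{db_id}.sqlite").touch()
    # A folder whose file name does not match its db_id is skipped.
    (Path(tmp) / "misnamed").mkdir()
    (Path(tmp) / "misnamed" / "other.sqlite").touch()
    db_ids = list(discover_databases(tmp))
```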

Once the server is up and running, use the Swagger UI to test inference with the /ask endpoint. The server will be listening at http://localhost:8000/, and the Swagger UI will be available at http://localhost:8000/docs#/default/ask_ask__db_id___question__get.
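The endpoint can also be called without the Swagger UI. The sketch below rests on an assumption: the anchor in the Swagger URL (ask_ask__db_id___question__get) suggests a GET route of the form /ask/{db_id}/{question}, but the route shape is inferred rather than documented here, so check it against the Swagger page first. ask_url and ask are hypothetical helper names, and the response is returned as raw text because its format is not described above.

```python
from urllib.parse import quote
from urllib.request import urlopen


def ask_url(db_id: str, question: str, base_url: str = "http://localhost:8000") -> str:
    """Build the request URL, assuming the route /ask/{db_id}/{question}."""
    # Percent-encode both path segments so spaces and '?' survive in the URL path.
    return f"{base_url}/ask/{quote(db_id, safe='')}/{quote(question, safe='')}"


def ask(db_id: str, question: str) -> str:
    """Send the question to a running server and return the raw response body."""
    with urlopen(ask_url(db_id, question)) as response:
        return response.read().decode("utf-8")


url = ask_url("my_1st_database", "How many people are there?")
```

Calling ask("my_1st_database", "How many people are there?") requires the server started with make serve to be running and a database with that db_id to exist in the db_path folder.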

Docker

There are three docker images that can be used to run the code:

  • tscholak/text-to-sql-dev: Base image with development dependencies. Use this for development. Pull it from Docker Hub with make pull-dev-image. Rebuild the image with make build-dev-image.
  • tscholak/text-to-sql-train: Training image with development dependencies but without PICARD dependencies. Use this for fine-tuning a model. Pull it from Docker Hub with make pull-train-image. Rebuild the image with make build-train-image.
  • tscholak/text-to-sql-eval: Training/evaluation image with all dependencies. Use this for evaluating a fine-tuned model with PICARD. This image can also be used for training if you want to run evaluation with PICARD during training. Pull it from Docker Hub with make pull-eval-image. Rebuild the image with make build-eval-image.

All images are tagged with the current commit hash. The images are built with the buildx tool which is available in the latest docker-ce. Use make init-buildkit to initialize the buildx tool on your machine. You can then use make build-dev-image, make build-train-image, etc. to rebuild the images. Local changes to the code will not be reflected in the docker images unless they are committed to git.
