
ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation (CVPR20)

This is a PyTorch implementation of the paper "ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation".

Dependencies

  • This work was tested with PyTorch 1.2.0, CUDA 9.0, Python 3.6 and Ubuntu 16.04.
  • Requirements can be found in the file environmentPytorch12.yml. The command to create the environment from this file is: conda env create --name pytorch1.2 --file=environmentPytorch12.yml
  • To activate the environment use: source activate pytorch1.2
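  • Optional sanity check that the activated environment matches the tested configuration: python -c "import torch; print(torch.__version__, torch.cuda.is_available())" should print the PyTorch version (1.2.0) and True on a correctly configured GPU machine.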

Training

  • To view the results during training, start a visdom server on the display port: visdom -port 8192

Supervised Training

python train.py --name_prefix demo --dataname RIMEScharH32W16 --capitalize --display_port 8192 
  • Main arguments (an example command combining several of them follows this list):
    • --name: unless specified in the arguments, the experiment name is determined by the name_prefix, the dataset, and any parameters that differ from the defaults (see the code in options/base_options.py).
    • --name_prefix: the prefix to the automatically generated experiment name.
    • --dataname: name of the dataset, which determines the dataroot path according to data/dataset_catalog.py.
    • --lex: the lexicon used to generate the fake images. A default lexicon for English/French data is specified in options/base_options.py.
    • --capitalize: randomly capitalize the first letter of words in the lexicon used.
    • --display_port: the visdom display port.
    • --checkpoints_dir: the network weights and sample images are saved to checkpoints_dir/experiment_name.
    • --use_rnn: whether to use an LSTM.
    • --seed: set the seed for NumPy and PyTorch instead of using a random one.
    • --gb_alpha: the balance between the recognizer and discriminator loss. Higher alpha means larger weight for the recognizer.
  • Other arguments are explained in the file options/base_options.py and options/train_options.py.
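
For example, a single run that combines several of the arguments above might look like the following (the values are illustrative; check options/base_options.py and options/train_options.py for the exact types and defaults of each flag):

python train.py --name_prefix demo --dataname RIMEScharH32W16 --capitalize --display_port 8192 --checkpoints_dir ./checkpoints --seed 0 --gb_alpha 0.7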

Semi-Supervised Training

python train_semi_supervised.py --dataname IAMcharH32W16rmPunct --unlabeled_dataname CVLtrH32 --disjoint
  • Main arguments:

    • --dataname: name of dataset which will determine the labeled dataroot path according to data/dataset_catalog.py. This data is used to train only the Recognizer (in the disjoint case) or the Recognizer and the Discriminator networks.
    • --unlabeled_dataname: name of dataset which will determine the unlabeled dataroot path according to data/dataset_catalog.py. This data is used to train only the Discriminator network.
    • --disjoint: disjoint training of the discriminator and the recognizer (the discriminator sees only the unlabeled data and the recognizer only the labeled data).
  • Other arguments are explained in the file options/base_options.py and options/train_options.py.

LMDB file generation for training data

Before generating an LMDB file, download the desired dataset into Datasets:

The structure of the directories should be:

  • Datasets
    • IAM
      • wordImages (the downloaded words dataset)
      • lineImages (the downloaded lines dataset)
      • original (the downloaded xml labels data)
      • original_partition (the downloaded partition)
        • te.lst
        • tr.lst
        • va1.lst
        • va2.lst
    • RIMES
      • orig (the downloaded dataset)
        • training_WR
        • groundtruth_training_icdar2011.txt
        • testdataset_ICDAR
        • ground_truth_test_icdar2011.txt
        • valdataset_ICDAR
        • ground_truth_validation_icdar2011.txt
    • CVL
      • cvl-database-1-1 (the downloaded dataset)
        • trainset
        • testset
        • readme.txt
    • Lexicon
      • english_words.txt
      • Lexique383.tsv

To generate an LMDB file for one of the datasets (CVL/IAM/RIMES/GW) for training, use:

cd data
python create_text_data.py
  • Main arguments (set inside the file; an example configuration follows this list):
    • create_Dict = False: create a dictionary of the generated dataset
    • dataset = 'IAM': CVL/IAM/RIMES/gw
    • mode = 'va2': tr/te/va1/va2/all
    • labeled = True: whether to save the labels of the images.
    • top_dir = 'Datasets': The directory containing the folders with the different datasets.
    • words = False: parameter relevant for IAM/RIMES. Use word images; otherwise use line images.
    • parameters relevant for IAM:
    • offline = True: use offline images
    • author_number = -1: use only images of a specific writer. If the value is -1, use all writers, otherwise use the index of this specific writer
    • remove_punc = True: remove images which include only one punctuation mark from the list ['.', '', ',', '"', "'", '(', ')', ':', ';', '!']
    • resize parameters:
    • resize='noResize': charResize|keepRatio|noResize - type of resize. charResize - resize so that each character's width falls within a specific range (within this range the width is chosen randomly); keepRatio - resize to a specific image height while keeping the width-height aspect ratio; noResize - do not resize the image.
    • imgH = 32: height of the resized image
    • init_gap = 0: insert a gap before the beginning of the text with this number of pixels
    • charmaxW = 18: The maximum character width
    • charminW = 10: The minimum character width
    • h_gap = 0: Insert a gap below and above the text
    • discard_wide = True: Discard images which have a character width 3 times larger than the maximum allowed character size (instead of resizing them) - this helps discard outlier images
    • discard_narr = True: Discard images which have a character width 3 times smaller than the minimum allowed character size.
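
For reference, a minimal sketch of how these variables might be set inside data/create_text_data.py (the variable names follow the list above; the values are only an example and should be edited to match your setup):

# inside data/create_text_data.py -- example configuration (illustrative values)
create_Dict = False       # create a dictionary of the generated dataset
dataset = 'IAM'           # CVL / IAM / RIMES / gw
mode = 'tr'               # tr / te / va1 / va2 / all
labeled = True            # save the image labels
top_dir = 'Datasets'      # directory containing the dataset folders
words = True              # IAM/RIMES: use word images rather than line images
offline = True            # IAM only: use offline images
author_number = -1        # IAM only: -1 uses all writers
remove_punc = True        # IAM only: drop images containing only punctuation
resize = 'charResize'     # charResize | keepRatio | noResize
imgH = 32                 # height of the resized image
init_gap = 0              # pixels of gap before the text
charmaxW = 18             # maximum character width
charminW = 10             # minimum character width
h_gap = 0                 # gap above and below the text
discard_wide = True       # discard images with overly wide characters
discard_narr = True       # discard images with overly narrow characters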

The generated LMDB will be saved in the relevant dataset folder and the dictionary will be saved in the Lexicon folder.
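
As an optional sanity check, you can read a few samples back from a generated LMDB with the lmdb Python package. The snippet below assumes a crnn-style key layout ('num-samples', 'image-%09d', 'label-%09d'), which is common for recognizer training data but should be verified against data/create_text_data.py:

import io
import lmdb
from PIL import Image

lmdb_path = 'Datasets/IAM/words.lmdb'  # illustrative path, point this at your generated file
env = lmdb.open(lmdb_path, readonly=True, lock=False)
with env.begin(write=False) as txn:
    num_samples = int(txn.get('num-samples'.encode()))     # assumed key name
    print('number of samples:', num_samples)
    for i in range(1, min(4, num_samples + 1)):            # indices assumed to start at 1
        img_bytes = txn.get(('image-%09d' % i).encode())   # assumed key format
        label = txn.get(('label-%09d' % i).encode())       # assumed key format
        img = Image.open(io.BytesIO(img_bytes))
        print(i, img.size, label.decode())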

Generating an LMDB file with GAN data

python generate_wordsLMDB.py --dataname IAMcharH32rmPunct --results_dir ./lmdb_files/IAM_concat --n_synth 100,200 --name model_name 
  • Main arguments:
    • --dataname: name of dataset which will determine the dataroot path according to data/dataset_catalog.py. Note that data from this dataset will be concatenated with the generated images.
    • --no_concat_dataset: ignore "dataname" (the previous parameter) and do not concatenate real data to the generated images.
    • --results_dir: path to the results folder; "n_synth" will be appended to it.
    • --n_synth: number of examples to generate in thousands
    • --name: name of model used to generate the images
    • --lex: lexicon used to generate the images

Main Folders

The structure of the code is based on the structure of the CycleGAN code.

  1. data/ - Folder containing functions relating to the data, including generation, data loading, alphabets and a catalog which translates dataset names into folder locations. The dataset_catalog should be updated according to the path of the LMDB you are using (see the sketch after this list).
  2. models/ - Folder containing the models (with the forward, backward and optimization functions) and the network architectures. The generator and discriminator architectures are based on BigGAN. The recognizer architecture is based on crnn.
  3. options/ - Files containing the arguments for the training and data generation process.
  4. plots/ - Python notebook files with visualizations of the data.
  5. util/ - General functions used across the packages, such as loss definitions.
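
Since dataset_catalog.py maps dataset names to LMDB locations, adding a dataset usually amounts to adding one entry. The sketch below is hypothetical: it assumes the catalog is a plain name-to-path dictionary, and the paths are illustrative, so adapt it to the actual structure of data/dataset_catalog.py:

# data/dataset_catalog.py -- hypothetical entry, structure and paths are illustrative
datasets = {
    'RIMEScharH32W16': 'Datasets/RIMES/h32char16to17/tr.lmdb',
    'IAMcharH32W16rmPunct': 'Datasets/IAM/words/h32char16to17/tr_removePunc.lmdb',
    'CVLtrH32': 'Datasets/CVL/h32/tr_unlabeled.lmdb',
    'MyNewDatasetH32': 'Datasets/MyNewDataset/h32/tr.lmdb',  # your own generated LMDB
}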

Citation

If you use this code for your research, please cite our paper.

@inproceedings{fogel2020scrabblegan,
    title={ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation},
    author={Sharon Fogel and Hadar Averbuch-Elor and Sarel Cohen and Shai Mazor and Roee Litman},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2020}
}

License

ScrabbleGAN is released under the MIT license. See the LICENSE and THIRD-PARTY-NOTICES.txt files for more information.

Contributing

Your contributions are welcome!
See CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.
