  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created over 7 years ago
  • Updated about 2 years ago

AnimeGAN

A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.

Randomly Generated Images

The images are generated from a DCGAN model trained on 143,000 anime character faces for 100 epochs.

fake_sample_1

Image Interpolation

Manipulating the latent codes enables a smooth transition from the images in the first row to those in the last row.

transition
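
The transition described above can be sketched as a linear interpolation between two latent codes. This is a minimal illustration using NumPy rather than the repo's own code: the function name `interpolate_latents`, the latent size of 100, and the 8 steps are all assumptions. Each row of the result would be fed to the generator to produce one image in the sequence.

```python
import numpy as np

def interpolate_latents(z_start, z_end, n_steps):
    """Return n_steps latent codes moving linearly from z_start to z_end."""
    # Weights from 0.0 to 1.0, shaped (n_steps, 1) so they broadcast over the latent dimension
    t = np.linspace(0.0, 1.0, n_steps).reshape(-1, 1)
    return (1.0 - t) * z_start + t * z_end

rng = np.random.default_rng(0)
nz = 100  # assumed latent size, not taken from this repo
z_first = rng.standard_normal(nz)  # code behind an image in the first row
z_last = rng.standard_normal(nz)   # code behind an image in the last row

path = interpolate_latents(z_first, z_last, n_steps=8)
print(path.shape)
```

The first and last rows of `path` equal the two endpoint codes, and the rows in between blend them in equal steps. The same arithmetic applies unchanged to PyTorch tensors.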

Original Images

The training images are not clean: some outliers can be observed, which degrade the quality of the generated images.

real_sample

Usage

To run the experiment,

$ python main.py --dataRoot path_to_dataset/ 

The pretrained DCGAN model is also included in this repo; you can play with it in the Jupyter notebook.

anime-faces Dataset

Anime-style images for 126 tags were collected from danbooru.donmai.us using the crawler tool gallery-dl. The images were then processed by an anime face detector, python-animeface. The resulting dataset contains ~143,000 anime faces. Note that some of the tags may no longer be meaningful after cropping, e.g. the cropped face images under the 'uniform' tag may not contain visible parts of uniforms.

How to construct the dataset from scratch?

Prerequisites: gallery-dl, python-animeface

  1. Download anime-style images

    # download 1000 images under the tag "misaka_mikoto"
    gallery-dl --images 1000 "https://danbooru.donmai.us/posts?tags=misaka_mikoto"
    
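    # download every tag in tags.txt (assumed to hold one tag per line), 12 downloads in parallel
    # '{}' is the placeholder; a placeholder named 'tag' would also rewrite the "tags=" in the URL
    cat tags.txt | \
    xargs -P 12 -I '{}' \
    gallery-dl --images 1000 "https://danbooru.donmai.us/posts?tags={}"

  2. Extract faces from the downloaded images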

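    import animeface
    from PIL import Image

    im = Image.open('images/anime_image_misaka_mikoto.png')
    faces = animeface.detect(im)  # list of detected faces; may be empty
    if faces:
        # bounding box of the first detected face
        x, y, w, h = faces[0].face.pos
        im = im.crop((x, y, x + w, y + h))  # PIL expects (left, upper, right, lower)
        im.show()  # display the cropped face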

I've cleaned the original dataset; the new version has 115,085 images in 126 tags. You can access the images from:

Non-commercial use please.

Things I've learned

  1. GANs are really hard to train.
  2. DCGAN generally works well; simply adding fully-connected layers causes problems.
  3. In my case, more layers for G yield better images, in the sense that G should be more powerful than D.
  4. Adding noise to D's inputs and labels helps stabilize training.
  5. Using a different input and generated-image resolution (64x64 vs 96x96) makes no obvious difference during training, and the generated images are also very similar.
  6. Binary noise as G's input works surprisingly well, but the images are not as good as those generated from Gaussian noise. Idea credit to @cwhy. (By "binary noise" I mean a sequence of values in {-1, 1} drawn from a Bernoulli distribution with p = 0.5.)
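
To make the last item concrete, here is a small sketch of the two kinds of latent input, written with NumPy for brevity. The function names, the batch size of 4, and the latent size of 100 are assumptions for illustration and are not taken from this repo.

```python
import numpy as np

def gaussian_noise(batch_size, nz, rng):
    """Standard normal latent codes, the usual DCGAN input."""
    return rng.standard_normal((batch_size, nz)).astype(np.float32)

def binary_noise(batch_size, nz, rng):
    """Latent codes whose entries are -1 or +1, each with probability 0.5."""
    bits = rng.integers(0, 2, size=(batch_size, nz))  # Bernoulli(0.5) draws: 0 or 1
    return (2 * bits - 1).astype(np.float32)          # map {0, 1} to {-1, +1}

rng = np.random.default_rng(0)
nz = 100  # assumed latent size
z_gauss = gaussian_noise(4, nz, rng)
z_bin = binary_noise(4, nz, rng)
print(z_gauss.shape, z_bin.shape)
print(np.unique(z_bin))
```

Both functions return arrays of the same shape, so either can be swapped in as the generator input without changing the rest of the training loop.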

I did not carefully verify these observations. If you are looking for general GAN tips, see @soumith's ganhacks.
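
Item 4 in the list above does not say which kind of noise was used, so the following is only one plausible reading: zero-mean Gaussian noise on the images D sees, and soft targets in place of hard 0/1 labels. The function names, the noise scale `std=0.05`, and the label `spread=0.1` are assumed values, and NumPy arrays stand in for image tensors.

```python
import numpy as np

def noisy_inputs(images, std, rng):
    """Add zero-mean Gaussian noise to a batch of images scaled to [-1, 1]."""
    return images + std * rng.standard_normal(images.shape)

def noisy_labels(batch_size, is_real, rng, spread=0.1):
    """Soft targets: close to 1 for real images, close to 0 for fake ones."""
    jitter = rng.uniform(0.0, spread, size=batch_size)
    return (1.0 - jitter) if is_real else jitter

rng = np.random.default_rng(0)
real_images = rng.uniform(-1.0, 1.0, size=(4, 3, 64, 64))  # stand-in for a real batch

noisy = noisy_inputs(real_images, std=0.05, rng=rng)    # what D would receive
real_targets = noisy_labels(4, is_real=True, rng=rng)   # values in (0.9, 1.0]
fake_targets = noisy_labels(4, is_real=False, rng=rng)  # values in [0.0, 0.1)
print(noisy.shape, real_targets, fake_targets)
```

In a training loop, `noisy` would replace the clean batch passed to D, and the soft targets would replace the constant real/fake labels in the loss.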

Others

  1. This project is heavily influenced by chainer-DCGAN and IllustrationGAN. The code is mostly borrowed from the PyTorch DCGAN example; thanks to the authors for the clean code.
  2. Dependencies: pytorch, torchvision
  3. This is a toy project for me to learn PyTorch and GANs and, most importantly, to have fun! :) Any feedback is welcome.

@jayleicn
