Repository Details

The TensorFlow/Keras implementation of Swin Transformer and Swin-UNET

keras-vision-transformer

This repository contains the tensorflow.keras implementation of the Swin Transformer (Liu et al., 2021) and its applications to benchmark datasets.

  • Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S. and Guo, B., 2021. Swin transformer: Hierarchical vision transformer using shifted windows. arXiv preprint arXiv:2103.14030. https://arxiv.org/abs/2103.14030.

  • Cao, H., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q. and Wang, M., 2021. Swin-Unet: Unet-like pure transformer for medical image segmentation. arXiv preprint arXiv:2105.05537. https://arxiv.org/abs/2105.05537.

Notebooks

Note: the Swin-UNET implementation is experimental.

  • MNIST image classification with Swin Transformers [link]
  • Oxford-IIIT Pet image segmentation with Swin-UNET [link]

Dependencies

  • TensorFlow 2.5.0, Keras 2.5.0, NumPy 1.19.5.

Overview

Swin Transformers are Transformer-based computer vision models that compute self-attention within shifted windows. Other vision transformer variants typically compute self-attention globally, across all embedded patches (tokens). The Swin Transformer instead restricts self-attention to subsets of tokens inside non-overlapping local windows, and shifts the window grid in alternating Transformer blocks so that information can flow between neighboring windows. Because each token attends only to the tokens in its own window, the attention cost grows linearly with the number of tokens rather than quadratically, which makes Swin Transformers better suited to high-resolution images. Swin Transformers have been shown to be effective in image classification, object detection, and semantic segmentation (Liu et al., 2021).
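
The windowing idea can be illustrated with a small NumPy sketch. This is not the code used in this repository: the function names (`window_partition`, `cyclic_shift`, `window_self_attention`, `attention_pairs`) are hypothetical, the query/key/value projections are replaced by the identity, and the attention mask and relative position bias that a full Swin block applies to shifted windows are omitted. It only shows how a feature map is split into non-overlapping windows, how the grid is shifted with a cyclic roll, and why windowed attention compares fewer token pairs than global attention.

```python
import numpy as np


def window_partition(x, window_size):
    """Split a (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size * window_size, C).
    """
    H, W, C = x.shape
    assert H % window_size == 0 and W % window_size == 0
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two window-grid axes together
    return x.reshape(-1, window_size * window_size, C)


def cyclic_shift(x, shift_size):
    """Roll the feature map up and to the left so the window grid is displaced."""
    return np.roll(x, shift=(-shift_size, -shift_size), axis=(0, 1))


def window_self_attention(windows):
    """Scaled dot-product self-attention computed independently per window.

    Assumption for illustration: identity projections for queries, keys, values.
    """
    d = windows.shape[-1]
    scores = windows @ windows.transpose(0, 2, 1) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ windows


def attention_pairs(H, W, window_size=None):
    """Count query-key pairs for global (window_size=None) or windowed attention."""
    tokens = H * W
    if window_size is None:
        return tokens * tokens
    return tokens * window_size * window_size


rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 8, 4))  # toy 8x8 grid of 4-dimensional tokens

regular = window_partition(feature_map, window_size=4)
shifted = window_partition(cyclic_shift(feature_map, shift_size=2), window_size=4)
attended = window_self_attention(regular)

print(regular.shape, shifted.shape, attended.shape)
print(attention_pairs(8, 8), attention_pairs(8, 8, window_size=4))
```

In this toy setting the shift is half the window size, which is the choice made in Liu et al. (2021). Alternating between the regular and the shifted partition lets tokens that sit in different windows in one block share a window in the next.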

Contact

Yingkai (Kyle) Sha <[email protected]> <[email protected]>

This work benefited from:

  • The official PyTorch implementation of the Swin Transformer [link].
  • Swin-Transformer-TF [link].

License

MIT License