# Simple Finetuner for Segment Anything
This repository contains simple starter code for finetuning the FAIR Segment Anything (SAM) models, leveraging the convenience of PyTorch Lightning.
## Setup

- **Install dependencies**

  First run

  ```bash
  git clone --recurse-submodules git@github.com:bhpfelix/segment-anything-finetuner.git
  ```

  Then

  ```bash
  cd segment-anything-finetuner
  ```

  Follow the setup instructions of Segment Anything to install the proper dependencies. Then run

  ```bash
  pip install -r requirements.txt
  ```
- **Data preparation**

  The starter code supports COCO-format input with the following layout:

  ```
  ├── dataset_name/
  │   ├── train/
  │   │   ├── _annotations.coco.json      # COCO format annotation
  │   │   ├── 000001.png                  # Images
  │   │   ├── 000002.png
  │   │   ├── ...
  │   ├── val/
  │   │   ├── _annotations.coco.json      # COCO format annotation
  │   │   ├── xxxxxx.png                  # Images
  │   │   ├── ...
  ```
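  As a quick sanity check that the annotations parse, the layout above can be read with `pycocotools` (an assumption of this sketch, not a dependency pinned by this README); the dataset class inside `finetune.py` may load things differently:

  ```python
  from pathlib import Path

  from pycocotools.coco import COCO  # assumed to be installed

  data_root = Path("./dataset_name")
  coco = COCO(str(data_root / "train" / "_annotations.coco.json"))

  # Print a few images and their box annotations ([x, y, w, h] in COCO convention).
  for img_id in coco.getImgIds()[:3]:
      img_info = coco.loadImgs(img_id)[0]
      anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
      print(img_info["file_name"], [ann["bbox"] for ann in anns])
  ```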
- **Download model checkpoints**

  Download the necessary SAM model checkpoints and arrange the repo as follows:

  ```
  ├── dataset_name/              # structure as detailed above
  │   ├── ...
  ├── segment-anything/          # The FAIR SAM repo
  │   ├── ...
  ├── SAM/                       # the SAM pretrained checkpoints
  │   ├── sam_vit_h_4b8939.pth
  │   ├── ...
  ├── finetune.py
  ├── ...
  ```
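  To verify that a checkpoint loads, the upstream `segment_anything` package (the submodule cloned above) can be used directly; this is only a sanity check, not part of `finetune.py`:

  ```python
  import torch
  from segment_anything import sam_model_registry

  # Build the ViT-H variant and load its weights from the layout above.
  sam = sam_model_registry["vit_h"](checkpoint="./SAM/sam_vit_h_4b8939.pth")
  sam.to("cuda" if torch.cuda.is_available() else "cpu")
  print(f"{sum(p.numel() for p in sam.parameters()) / 1e6:.1f}M parameters")
  ```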
## Finetuning (`finetune.py`)

This file contains a simple finetuning script for the Segment Anything model on COCO-format datasets.

Example usage:
```bash
python finetune.py \
    --data_root ./dataset_name \
    --model_type vit_h \
    --checkpoint_path ./SAM/sam_vit_h_4b8939.pth \
    --freeze_image_encoder \
    --batch_size 2 \
    --image_size 1024 \
    --steps 1500 \
    --learning_rate 1.e-5 \
    --weight_decay 0.01
```
The optional `--freeze_image_encoder` flag excludes the image encoder parameters from optimization, which saves GPU memory.
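Conceptually, freezing amounts to disabling gradients for the encoder before building the optimizer. The sketch below is only an illustration of that idea (the hyperparameters mirror the example command above), not the exact code inside `finetune.py`:

```python
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="./SAM/sam_vit_h_4b8939.pth")

# Disable gradients for the image encoder so that only the prompt encoder and
# mask decoder are updated, and no backward graph is built through the encoder.
for param in sam.image_encoder.parameters():
    param.requires_grad_(False)

trainable_params = [p for p in sam.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-5, weight_decay=0.01)
```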
## Notes
- As of now, the image resizing implementation is different from the `ResizeLongestSide` transform in SAM.
- Drop path and layer-wise learning rate decay are not currently applied.
- The finetuning script currently only supports bounding box input prompts (see the sketch after this list).
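For reference, the sketch below shows how a single bounding-box prompt flows through SAM's modules, using the upstream `ResizeLongestSide` transform for coordinate mapping. It illustrates the upstream API only; `finetune.py`'s own pipeline differs, in particular its image resizing, as noted above:

```python
import numpy as np
import torch
from segment_anything import sam_model_registry
from segment_anything.utils.transforms import ResizeLongestSide

sam = sam_model_registry["vit_h"](checkpoint="./SAM/sam_vit_h_4b8939.pth")
sam.eval()

transform = ResizeLongestSide(sam.image_encoder.img_size)  # 1024 for ViT-H

# A dummy HWC uint8 image and one [x1, y1, x2, y2] box in original pixel coordinates.
image = np.zeros((480, 640, 3), dtype=np.uint8)
boxes = np.array([[100.0, 120.0, 300.0, 340.0]])

# Resize the image so its longest side matches the encoder input size, then
# normalize and pad with the model's own preprocess().
input_image = transform.apply_image(image)
input_tensor = torch.as_tensor(input_image).permute(2, 0, 1)[None].float()
input_tensor = sam.preprocess(input_tensor)

# Map the box coordinates into the resized frame.
box_torch = torch.as_tensor(transform.apply_boxes(boxes, image.shape[:2]), dtype=torch.float)

with torch.no_grad():
    image_embeddings = sam.image_encoder(input_tensor)
    sparse_emb, dense_emb = sam.prompt_encoder(points=None, boxes=box_torch, masks=None)
    low_res_masks, iou_predictions = sam.mask_decoder(
        image_embeddings=image_embeddings,
        image_pe=sam.prompt_encoder.get_dense_pe(),
        sparse_prompt_embeddings=sparse_emb,
        dense_prompt_embeddings=dense_emb,
        multimask_output=False,
    )

print(low_res_masks.shape)  # (1, 1, 256, 256) low-resolution mask logits
```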
## Resources

## Citation
```bibtex
@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}
```