  • Stars: 441
  • Rank: 98,861 (Top 2%)
  • Language: Python
  • License: Apache License 2.0
  • Created over 1 year ago
  • Updated over 1 year ago


Repository Details

Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image.

RAM: Relate-Anything-Model

The following developers contributed equally to this project in their spare time. Names are listed in alphabetical order.

Zujin Guo, Bo Li, Jingkang Yang, Zijian Zhou.

Affiliations: MMLab@NTU & VisCom Lab, KCL/TongJi


🚀 🚀 🚀 This is a demo that combines Meta's Segment-Anything model with the ECCV'22 paper: Panoptic Scene Graph Generation.

🔥🔥🔥 Please star our codebase OpenPSG and RAM if you find it useful/interesting.

[Huggingface Demo]

[Gradio Demo (Faster)]

[Dataset]

The Relate Anything Model (RAM) takes an image as input and uses SAM to identify the object masks within it. RAM then analyzes the relationships between any of those object masks.

The object masks are generated using SAM. RAM was trained on the OpenPSG dataset to detect the relationships between the object masks; the details are given in the Method section below.

demo.png

Examples

Our current demo supports:

(1) generating arbitrary object masks and reasoning about the relationships between them.

(2) given coordinates, generating the corresponding object masks and reasoning about the relationships between the given objects and other objects in the image.

We will soon add support for detecting semantic labels of objects with the help of OVSeg.

Here are some examples of the Relate Anything Model in action on images of people playing soccer, dancing, and playing basketball.

Method

Our method is based on the winning solution of the PSG competition, with some modifications. The original report can be found here.

Inference

Our approach uses the Segment Anything Model (SAM) to identify and mask objects in an image. The model then extracts features for each segmented object. We use a Transformer module to enable interaction between the object features, allowing us to compute pairwise object relationships and categorize their interrelations.
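The pairwise step described above can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the single-head attention without learned projections, the bilinear scoring form, the softmax over relation classes, and all names (`self_attention`, `pairwise_relations`, `rel_weights`) are assumptions made for illustration, and the random features stand in for the per-mask features that SAM would provide.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(feats):
    """Single-head self-attention with no learned projections (illustrative only)."""
    scale = math.sqrt(len(feats[0]))
    mixed = []
    for query in feats:
        attn = softmax([dot(query, key) / scale for key in feats])
        mixed.append(
            [sum(a * value[d] for a, value in zip(attn, feats)) for d in range(len(query))]
        )
    return mixed


def pairwise_relations(feats, rel_weights):
    """Score every ordered (subject, object) pair against each relation class.

    rel_weights[r] is a D x D matrix (hypothetical parameters); the logit for
    relation r is subject^T W_r object. Returns {(i, j): probabilities}, i != j.
    """
    mixed = self_attention(feats)
    result = {}
    for i, subj in enumerate(mixed):
        for j, obj in enumerate(mixed):
            if i == j:
                continue  # an object is not related to itself
            logits = [dot(subj, [dot(row, obj) for row in w]) for w in rel_weights]
            result[(i, j)] = softmax(logits)
    return result


# Toy inputs: 3 objects, 4-dim features, 5 relation classes (all made up).
rng = random.Random(0)
num_objects, dim, num_relations = 3, 4, 5
features = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_objects)]
rel_weights = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    for _ in range(num_relations)
]
relations = pairwise_relations(features, rel_weights)
```

For N objects this yields N × (N − 1) ordered pairs, each with one probability per relation class; taking the highest-scoring class per pair gives a predicted relation.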

Training

We train our model using the PSG dataset. For each training PSG image, SAM segments multiple objects, but only a few of them match the ground truth (GT) masks in PSG. We perform a simple matching between SAM's predictions and the GT masks based on their intersection-over-union (IoU) scores, so that (almost) every GT mask is assigned to a SAM mask. We then regenerate the relation map according to SAM's masks. With the GT data prepared, we train our model using cross-entropy loss, as shown in the figure above.
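The matching and relation-map regeneration described above might look roughly like the sketch below. The source only says the matching is "simple" and IoU-based, so the greedy one-to-one strategy, the `min_iou` threshold, the set-of-pixels mask representation, and the function names are all assumptions rather than the authors' code.

```python
def iou(mask_a, mask_b):
    """IoU of two masks given as sets of (row, col) pixel coordinates."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def match_gt_to_sam(gt_masks, sam_masks, min_iou=0.5):
    """Greedily assign each GT mask to the unused SAM mask with the highest IoU.

    Returns {gt_index: sam_index}. GT masks with no SAM mask above min_iou
    (an assumed threshold) stay unmatched.
    """
    candidates = sorted(
        (
            (iou(gt, sam), gi, si)
            for gi, gt in enumerate(gt_masks)
            for si, sam in enumerate(sam_masks)
        ),
        reverse=True,
    )
    assignment, used_sam = {}, set()
    for score, gi, si in candidates:
        if score < min_iou:
            break  # remaining candidates overlap even less
        if gi in assignment or si in used_sam:
            continue
        assignment[gi] = si
        used_sam.add(si)
    return assignment


def remap_relations(gt_triplets, assignment):
    """Rewrite (subject, object, relation) triplets in terms of SAM mask indices.

    Triplets that involve an unmatched GT mask are dropped.
    """
    return [
        (assignment[s], assignment[o], rel)
        for s, o, rel in gt_triplets
        if s in assignment and o in assignment
    ]


# Toy example: two GT masks, three SAM masks, one GT relation (all made up).
gt_masks = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(3, 3), (3, 4)}]
sam_masks = [{(3, 3), (3, 4), (3, 5)}, {(0, 0), (0, 1), (1, 0)}, {(7, 7)}]
assignment = match_gt_to_sam(gt_masks, sam_masks)
sam_triplets = remap_relations([(0, 1, "beside")], assignment)
```

In this toy case GT mask 0 overlaps SAM mask 1 with IoU 0.75 and GT mask 1 overlaps SAM mask 0 with IoU 2/3, so the single GT relation is rewritten from indices (0, 1) to (1, 0).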

Setup

We use Conda to manage dependencies. To select the cudatoolkit version appropriate for your machine, edit the environment.yml file, then create the Conda environment by running the following command:

conda env create -f environment.yml

Make sure to use segment_anything in this repository, which includes the mask feature extraction operation.

Download the pretrained model

  1. SAM: link
  2. RAM: link

Place both models in ./checkpoints/ under the repository root.

Gradio demo

  • We also provide a UI for testing our method, built with Gradio. This demo also supports generating new directions on the fly! Run the following command in a terminal to launch the demo:
    python app.py
    
  • This demo is also hosted on HuggingFace here.

Acknowledgement

We thank Chunyuan Li for his help in setting up the demo.

Citation

If you find this project helpful for your research, please consider citing the following BibTeX entry.

@inproceedings{yang2022psg,
    author = {Yang, Jingkang and Ang, Yi Zhe and Guo, Zujin and Zhou, Kaiyang and Zhang, Wayne and Liu, Ziwei},
    title = {Panoptic Scene Graph Generation},
    booktitle = {ECCV},
    year = {2022}
}

@inproceedings{yang2023pvsg,
    author = {Yang, Jingkang and Peng, Wenxuan and Li, Xiangtai and Guo, Zujin and Chen, Liangyu and Li, Bo and Ma, Zheng and Zhou, Kaiyang and Zhang, Wayne and Loy, Chen Change and Liu, Ziwei},
    title = {Panoptic Video Scene Graph Generation},
    booktitle = {CVPR},
    year = {2023},
}
