SceneSeg LGSS
Codebase for the CVPR 2020 paper "A Local-to-Global Approach to Multi-modal Movie Scene Segmentation".
Introduction
From a video to segmented scenes: two steps are needed, holistic feature extraction and temporal scene segmentation.
A single-stage temporal scene segmentation is also provided in the demo. This is intended to be an easy-to-use tool for plot/story understanding, with the scene as the semantic unit. Currently, it only supports image input.
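To make the two-step flow concrete, here is a minimal sketch, not the model from the paper. It assumes step one has already produced one feature vector per shot, and it swaps in a plain cosine-similarity threshold for the learned segmentation step. The names `cosine`, `segment_scenes`, and `threshold` are hypothetical and do not come from this codebase.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def segment_scenes(shot_features, threshold=0.5):
    """Group consecutive shots into scenes (hypothetical helper).

    A new scene starts whenever the similarity between a shot and the
    previous shot drops below `threshold`. Returns a list of scenes,
    each scene being a list of shot indices.
    """
    if not shot_features:
        return []
    scenes = [[0]]
    for i in range(1, len(shot_features)):
        if cosine(shot_features[i - 1], shot_features[i]) < threshold:
            scenes.append([i])      # low similarity: scene boundary
        else:
            scenes[-1].append(i)    # similar enough: same scene
    return scenes


# Toy per-shot features standing in for the output of step one.
shots = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(segment_scenes(shots))  # [[0, 1], [2, 3]]
```

A threshold on adjacent shots only looks at local evidence; it is used here purely to illustrate the input and output of the segmentation step.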
Features
- Basic video processing tools are provided including shot detection and its parallel version.
- Holistic semantic video feature extractors for place, audio, human, action, and speech are planned; leave a message in the issues if you would like them included. Place and audio are currently supported in `pre`. The full version is located at movienet-tools.
- All-in-one scene segmentation tool with all multi-modal, multi-semantic elements.
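As a rough illustration of what shot detection does, the sketch below flags a shot boundary wherever the grayscale histogram changes sharply between consecutive frames. It is an assumption-laden toy, not the detector shipped with this codebase: frames are flat lists of 0-255 pixel intensities, and `histogram`, `detect_shot_boundaries`, `bins`, and `threshold` are hypothetical names.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of a frame given as a flat pixel list."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // max_value, bins - 1)] += 1
    total = len(frame)
    if total == 0:
        return [0.0] * bins  # empty frame: return an all-zero histogram
    return [c / total for c in counts]


def detect_shot_boundaries(frames, threshold=0.5):
    """Return indices of frames that start a new shot (hypothetical helper).

    The distance between consecutive histograms is half the L1 distance,
    which lies in [0, 1] for normalized histograms.
    """
    boundaries = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            distance = 0.5 * sum(abs(a - b) for a, b in zip(previous, current))
            if distance > threshold:
                boundaries.append(index)
        previous = current
    return boundaries


# Toy frames: two dark frames, two bright frames, then a dark frame.
dark = [10] * 16
bright = [240] * 16
frames = [dark, dark, bright, bright, dark]
print(detect_shot_boundaries(frames))  # [2, 4]
```

Each frame pair is compared independently of the others, which is the property that makes this kind of detection easy to run in parallel over chunks of a video.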
Notice
Installation
Please refer to INSTALL.md for installation and dataset preparation. Pretrained models and the dataset are also explained there.
Get Started
Citation
@inproceedings{rao2020local,
title={A Local-to-Global Approach to Multi-modal Movie Scene Segmentation},
author={Rao, Anyi and Xu, Linning and Xiong, Yu and Xu, Guodong and Huang, Qingqiu and Zhou, Bolei and Lin, Dahua},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2020}
}