• Stars: 573
• Rank: 77,865 (Top 2%)
• Language: Python
• License: Apache License 2.0
• Created: about 3 years ago
• Updated: almost 2 years ago

Repository Details

BMInf

Efficient Inference for Big Models

What's New

  • 2022/07/31 (BMInf 2.0.0) BMInf can now be applied to any transformer-based model.
  • 2021/12/21 (BMInf 1.0.0) The package no longer depends on cupy and now supports PyTorch backpropagation.
  • 2021/10/18 We updated the generate interface and added a new CPM 2.1 demo.
  • 2021/09/24 We publicly released BMInf on the 2021 Zhongguancun Forum (AI and Multidisciplinary Synergy Innovation Forum).

Note: The README for BMInf-1 can be found in the old_docs directory. Examples for CPM-1/2 and EVA will be published soon.

Overview

BMInf (Big Model Inference) is a low-resource inference package for large-scale pretrained language models (PLMs).

At its minimum hardware requirements, BMInf can run models with more than 10 billion parameters on a single NVIDIA GTX 1060 GPU; better GPUs yield better performance. Even when the GPU memory is large enough for full large-model inference (e.g., on a V100 or A100), BMInf still provides a significant performance improvement over the existing PyTorch implementation.

If you use the code, please cite the following paper:

@inproceedings{han2022bminf,
	title={BMInf: An Efficient Toolkit for Big Model Inference and Tuning},
	author={Han, Xu and Zeng, Guoyang and Zhao, Weilin and Liu, Zhiyuan and Zhang, Zhengyan and Zhou, Jie and Zhang, Jun and Chao, Jia and Sun, Maosong},
	booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations},
	pages={224--230},
	year={2022}
}

Installation

  • From pip: pip install bminf

  • From source code: download the package and run python setup.py install
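
As a quick sanity check after installing, the package can be imported and the entry points used later in this README can be verified. This is a minimal sketch; the attribute names are taken from the Quick Start section below.

import bminf

# confirm that the public entry points referenced in this README are present
assert hasattr(bminf, "wrapper")
assert hasattr(bminf, "TransformerBlockList")
assert hasattr(bminf, "QuantizedLinear")
print("BMInf installed; wrapper, TransformerBlockList and QuantizedLinear are available")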

Hardware Requirement

Here we list the minimum and recommended configurations for running BMInf.

          Minimum Configuration         Recommended Configuration
Memory    16GB                          24GB
GPU       NVIDIA GeForce GTX 1060 6GB   NVIDIA Tesla V100 16GB
PCI-E     PCI-E 3.0 x16                 PCI-E 3.0 x16

GPUs with compute capability 6.1 or higher are supported by BMInf. Refer to NVIDIA's compute capability table to check whether your GPU is supported.

Software Requirement

BMInf requires CUDA >= 10.1; the dependencies below are installed automatically during installation.

  • python >= 3.6
  • torch >= 1.7.1
  • cpm_kernels >= 1.0.9

Quick Start

Use bminf.wrapper to automatically convert your model.

import torch
import bminf

# initialize your model on CPU
model = MyModel()

# load the checkpoint before applying the wrapper
model.load_state_dict(model_checkpoint)

# apply the wrapper inside the target CUDA device context
with torch.cuda.device(CUDA_DEVICE_INDEX):
    model = bminf.wrapper(model)
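
After wrapping, the model is still an ordinary PyTorch module, so inference runs as usual. The sketch below is a hypothetical usage example: it assumes the placeholder MyModel takes a batch of token ids, and the input shape and vocabulary size are arbitrary.

import torch

# hypothetical input: one sequence of 32 token ids (vocabulary size 30000 is
# an arbitrary assumption for the placeholder MyModel)
input_ids = torch.randint(0, 30000, (1, 32), device=f"cuda:{CUDA_DEVICE_INDEX}")

with torch.no_grad():
    logits = model(input_ids)  # forward pass runs under BMInf's memory scheduling
print(logits.shape)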

If bminf.wrapper does not fit your model well, you can apply the following replacements manually; a combined sketch is given after the list.

  • Replace torch.nn.ModuleList with bminf.TransformerBlockList.
module_list = bminf.TransformerBlockList([
	# ...
], [CUDA_DEVICE_INDEX])
  • Replace torch.nn.Linear with bminf.QuantizedLinear.
linear = bminf.QuantizedLinear(torch.nn.Linear(...))
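
Putting the two replacements together, a manual conversion looks roughly like the sketch below. TinyBlock and TinyModel are hypothetical illustrative modules, CUDA_DEVICE_INDEX is assumed to be 0, and the forward behavior of bminf.TransformerBlockList is assumed here to simply chain the blocks over the hidden states; consult your actual model code when converting.

import torch
import bminf

CUDA_DEVICE_INDEX = 0  # assumption: use the first GPU


class TinyBlock(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        # wrap each linear layer so its weights are stored in quantized form
        self.ff_in = bminf.QuantizedLinear(torch.nn.Linear(dim, 4 * dim))
        self.ff_out = bminf.QuantizedLinear(torch.nn.Linear(4 * dim, dim))

    def forward(self, hidden):
        return hidden + self.ff_out(torch.nn.functional.gelu(self.ff_in(hidden)))


class TinyModel(torch.nn.Module):
    def __init__(self, dim=1024, num_layers=4):
        super().__init__()
        # use bminf.TransformerBlockList instead of torch.nn.ModuleList so that
        # BMInf can schedule the blocks' parameters between CPU and GPU memory
        self.blocks = bminf.TransformerBlockList(
            [TinyBlock(dim) for _ in range(num_layers)],
            [CUDA_DEVICE_INDEX],
        )

    def forward(self, hidden):
        return self.blocks(hidden)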

Performances

Here we report the CPM-2 encoder and decoder speeds measured on different platforms. You can also run benchmark/cpm2/encoder.py and benchmark/cpm2/decoder.py to test the speed on your own machine.

Implementation   GPU                          Encoder Speed (tokens/s)   Decoder Speed (tokens/s)
BMInf            NVIDIA GeForce GTX 1060      718                        4.4
BMInf            NVIDIA GeForce GTX 1080 Ti   1200                       12
BMInf            NVIDIA GeForce RTX 2080 Ti   2275                       19
BMInf            NVIDIA Tesla V100            2966                       20
BMInf            NVIDIA Tesla A100            4365                       26
PyTorch          NVIDIA Tesla V100            -                          3
PyTorch          NVIDIA Tesla A100            -                          7

Community

We welcome everyone to contribute code by following our contributing guidelines.

You can also find us on other platforms.

License

The package is released under the Apache 2.0 License.

References

  1. CPM-2: Large-scale Cost-efficient Pre-trained Language Models. Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun.
  2. CPM: A Large-scale Generative Chinese Pre-trained Language Model. Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
  3. EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training. Hao Zhou, Pei Ke, Zheng Zhang, Yuxian Gu, Yinhe Zheng, Chujie Zheng, Yida Wang, Chen Henry Wu, Hao Sun, Xiaocong Yang, Bosi Wen, Xiaoyan Zhu, Minlie Huang, Jie Tang.
  4. Language Models are Unsupervised Multitask Learners. Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.

More Repositories

  1. ChatDev (Shell, 24,842 stars): Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
  2. MiniCPM-V (Python, 12,088 stars): MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
  3. XAgent (Python, 8,102 stars): An Autonomous LLM Agent for Complex Task Solving
  4. MiniCPM (Jupyter Notebook, 7,009 stars): MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo
  5. ToolBench (Python, 4,789 stars): [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning
  6. AgentVerse (JavaScript, 4,095 stars): 🤖 AgentVerse 🪐 facilitates the deployment of multiple LLM-based agents in various applications, primarily providing two frameworks: task-solving and simulation
  7. BMTools (Python, 2,884 stars): Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins
  8. CPM-Bee (Python, 2,686 stars): A 10-billion-parameter Chinese-English bilingual base model
  9. VisCPM (Python, 1,075 stars): [ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint), based on the CPM foundation models
  10. ProAgent (Python, 754 stars): An LLM-based Agent for the New Automation Paradigm - Agentic Process Automation
  11. IoA (Python, 556 stars): An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through internet-like connectivity
  12. BMTrain (Python, 554 stars): Efficient Training (including pre-training and fine-tuning) for Big Models
  13. CPM-Live (Python, 512 stars): Live Training for Open-source Big Models
  14. BMList (Python, 339 stars): A List of Big Models
  15. RepoAgent (Python, 336 stars): An LLM-powered repository agent designed to assist developers and teams in generating documentation and understanding repositories quickly
  16. UltraFeedback (Python, 302 stars): A large-scale, fine-grained, diverse preference dataset (and models)
  17. ModelCenter (Python, 234 stars): Efficient, Low-Resource, Distributed transformer implementation based on BMTrain
  18. BMPrinciples (222 stars): A collection of phenomena observed during the scaling of big foundation models, which may develop into consensus, principles, or laws in the future
  19. UltraEval (Python, 215 stars): [ACL 2024 Demo] Official GitHub repo for UltraEval: An open-source framework for evaluating foundation models
  20. InfiniteBench (Python, 105 stars): 100k+ Long-Context Benchmark for Large Language Models (paper upcoming)
  21. OlympiadBench (Python, 89 stars): [ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
  22. MobileCPM (C++, 53 stars): A Toolkit for Running On-device Large Language Models (LLMs) in Apps
  23. RAGEval (Python, 47 stars)
  24. DecT (Python, 42 stars): Source code for the ACL 2023 paper Decoder Tuning: Efficient Language Understanding as Decoding
  25. XAgent-doc (19 stars): Documentation for XAgent
  26. UltraLink (Python, 17 stars): An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset
  27. BMInf-demos (JavaScript, 13 stars): BMInf demos
  28. General-Model-License (6 stars)
  29. VisRAG (Python, 1 star)