  • Stars: 692
  • Rank: 65,341 (Top 2 %)
  • Language: C++
  • License: Apache License 2.0
  • Created about 8 years ago
  • Updated 10 months ago

Repository Details

ThunderGBM: Fast GBDTs and Random Forests on GPUs

Documentations | Installation | Parameters | Python (scikit-learn) interface

What's new?

ThunderGBM won the 2019 Best Paper Award from IEEE Transactions on Parallel and Distributed Systems, awarded by the IEEE Computer Society Publications Board (1 out of 987 submissions), for the work "Zeyi Wen^, Jiashuai Shi*, Bingsheng He, Jian Chen, Kotagiri Ramamohanarao, and Qinbin Li*, Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training, IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 12, 2019, pp. 2706-2717." For more details, see: Best Paper Award Winners from IEEE, News from NUS School of Computing

Overview

The mission of ThunderGBM is to help users easily and efficiently apply GBDTs and Random Forests to solve problems. ThunderGBM exploits GPUs to achieve high efficiency. Key features of ThunderGBM are as follows.

  • Often 10x faster than other libraries.
  • Supports Python (scikit-learn) interfaces.
  • Supported operating systems: Linux and Windows.
  • Supports classification, regression and ranking.

Why accelerate GBDT and Random Forests: A survey conducted by Kaggle in 2017 shows that 50%, 46% and 24% of the data mining and machine learning practitioners are users of Decision Trees, Random Forests and GBMs, respectively.

GBDTs and Random Forests are often used for creating state-of-the-art data science solutions. We've listed three winning solutions using GBDTs below. Please check out the XGBoost website for more winning solutions and use cases. Here are some example successes of GBDTs and Random Forests:

Getting Started

Prerequisites

  • cmake 2.8 or above
    • gcc 4.8 or above for Linux | CUDA 9 or above
    • Visual C++ for Windows | CUDA 10

Quick Install

  • For Linux with CUDA 9.0

    • pip install thundergbm
  • For Windows (64bit)

    • Download the Python wheel file (for Python3 or above)

    • Install the Python wheel file

      • pip install thundergbm-0.3.4-py3-none-win_amd64.whl
  • Currently, only Python 3 is supported.

  • After you have installed thundergbm, you can import and use the classifier (similarly for regressor) by:

from thundergbm import TGBMClassifier
clf = TGBMClassifier()
clf.fit(x, y)
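The snippet above leaves `x` and `y` undefined and stops after `fit`. As a minimal sketch of the scikit-learn style fit/predict workflow, the helper below trains any such classifier and scores it on held-out data. The helper name `holdout_accuracy`, the toy data, and the `MajorityClassStandIn` class are hypothetical and exist only so the example runs without a GPU; with ThunderGBM installed, the assumption is that `TGBMClassifier()` can be passed in place of the stand-in.

```python
class MajorityClassStandIn:
    """Hypothetical stand-in exposing the same fit/predict methods as a
    scikit-learn style classifier; it always predicts the most common label."""

    def fit(self, X, y):
        counts = {}
        for label in y:
            counts[label] = counts.get(label, 0) + 1
        self.majority_ = max(counts, key=counts.get)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


def holdout_accuracy(model, x_train, y_train, x_test, y_test):
    """Fit the model on the training split and return accuracy on the test split."""
    model.fit(x_train, y_train)
    predictions = model.predict(x_test)
    correct = sum(1 for p, t in zip(predictions, y_test) if p == t)
    return correct / len(y_test)


# Toy data, invented for illustration only.
x_train = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
y_train = [1, 0, 1, 1]
x_test = [[0.5, 1.0], [1.0, 0.2]]
y_test = [1, 0]

accuracy = holdout_accuracy(MajorityClassStandIn(), x_train, y_train, x_test, y_test)
print("hold-out accuracy:", accuracy)

# With ThunderGBM installed and a CUDA-capable GPU available (assumption):
#   from thundergbm import TGBMClassifier
#   accuracy = holdout_accuracy(TGBMClassifier(), x_train, y_train, x_test, y_test)
```

Keeping the model as a parameter means the same scoring code can be reused for the regressor or for another scikit-learn compatible library when comparing results.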

Build from source

git clone https://github.com/zeyiwen/thundergbm.git
cd thundergbm
#under the directory of thundergbm
git submodule init cub && git submodule update

Build on Linux (build instructions for Windows)

#under the directory of thundergbm
mkdir build && cd build && cmake .. && make -j

Quick Start

./bin/thundergbm-train ../dataset/machine.conf
./bin/thundergbm-predict ../dataset/machine.conf

After a successful run, you will see RMSE = 0.489562.
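For readers unfamiliar with the metric, RMSE (root mean squared error) is the square root of the average squared difference between predictions and true targets. The sketch below assumes this standard definition; the function name `rmse` and the sample values are illustrative and are not taken from ThunderGBM's source or from the machine.conf dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only: errors are 1, 0 and -2, so RMSE = sqrt(5 / 3).
example = rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print("RMSE:", example)
```

A lower value means predictions are closer to the targets, and a value of 0 means a perfect fit on the evaluated data.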

macOS is not supported, as Apple has suspended support for some NVIDIA GPUs. We will consider supporting macOS based on feedback from our user community. Please stay tuned.

How to cite ThunderGBM

If you use ThunderGBM in your paper, please cite our work (TPDS and JMLR).

@ARTICLE{8727750,
  author={Z. {Wen} and J. {Shi} and B. {He} and J. {Chen} and K. {Ramamohanarao} and Q. {Li}},
  journal={IEEE Transactions on Parallel and Distributed Systems}, 
  title={Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training}, 
  year={2019},
  volume={30},
  number={12},
  pages={2706-2717},
  }

@article{wenthundergbm19,
 author = {Wen, Zeyi and Shi, Jiashuai and He, Bingsheng and Li, Qinbin and Chen, Jian},
 title = {{ThunderGBM}: Fast {GBDTs} and Random Forests on {GPUs}},
 journal = {Journal of Machine Learning Research},
 volume={21},
 year = {2020}
}

Related papers

  • Zeyi Wen, Jiashuai Shi, Bingsheng He, Jian Chen, Kotagiri Ramamohanarao and Qinbin Li. Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training. IEEE Transactions on Parallel and Distributed Systems (TPDS), accepted in May 2019. pdf

  • Zeyi Wen, Hanfeng Liu, Jiashuai Shi, Qinbin Li, Bingsheng He, Jian Chen. ThunderGBM: Fast GBDTs and Random Forests on GPUs. Featured at JMLR MLOSS (Machine Learning Open Source Software). Year: 2020, Volume: 21, Issue: 108, Pages: 1-5. pdf

  • Zeyi Wen, Bingsheng He, Kotagiri Ramamohanarao, Shengliang Lu, and Jiashuai Shi. Efficient Gradient Boosted Decision Tree Training on GPUs. The 32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 234-243, 2018. pdf

Key members of ThunderGBM

  • Zeyi Wen, NUS (now at The University of Western Australia)
  • Hanfeng Liu, GDUFS (a visiting student at NUS)
  • Jiashuai Shi, SCUT (a visiting student at NUS)
  • Qinbin Li, NUS
  • Advisor: Bingsheng He, NUS
  • Collaborators: Jian Chen (SCUT)

Other information

  • This work is supported by a MoE AcRF Tier 2 grant (MOE2017-T2-1-122) and an NUS startup grant in Singapore.

Related libraries

More Repositories

  1. thundersvm (C++, 1,564 stars): ThunderSVM: A Fast SVM Library on GPUs and CPUs
  2. NIID-Bench (Python, 558 stars): Federated Learning Benchmark - Federated Learning on Non-IID Data Silos: An Experimental Study (ICDE 2022)
  3. FedTree (C++, 142 stars): A tree-based federated learning system (MLSys 2023)
  4. ThunderGP (C++, 135 stars): HLS-based Graph Processing Framework on FPGAs
  5. Medusa (Cuda, 61 stars): Medusa: Building GPU-based Parallel Sparse Graph Applications with Sequential C/C++ Code
  6. Awesome-Literature-ILoGs (58 stars): Awesome literature on imbalanced learning on graphs
  7. G3 (Cuda, 42 stars): G3: A Programmable GNN Training System on GPU
  8. briskstream (Java, 39 stars): A Multicore, NUMA Optimised Data Stream Processing System
  9. PyOE (Python, 28 stars): Python library for data stream learning
  10. ThunderRW (C++, 26 stars): Source code of "ThunderRW: An In-Memory Graph Random Walk Engine" published in VLDB'2021 - By Shixuan Sun, Yuhang Chen, Shengliang Lu, Bingsheng He and Yuchen Li
  11. FedSim (Python, 23 stars): A coupled vertical federated learning framework that boosts the model performance with record similarities (NeurIPS 2022)
  12. PrivML (20 stars)
  13. SOFF (Python, 19 stars)
  14. ConsisGAD (Python, 18 stars)
  15. SimFL (C++, 18 stars): Practical Federated Gradient Boosting Decision Trees (AAAI 2020)
  16. ForkGraph (C++, 16 stars)
  17. ReGraph (C++, 16 stars): Scaling Graph Processing on HBM-enabled FPGAs with Heterogeneous Pipelines
  18. ThundeRiNG (C++, 14 stars): Fast Multiple Independent Random Number Sequences Generation on FPGAs
  19. hacc_demo (Shell, 14 stars)
  20. FedOV (Python, 14 stars): Towards Addressing Label Skews in One-Shot Federated Learning (ICLR 2023)
  21. Vine (C++, 14 stars): Accelerating Exact Constrained Shortest Paths on GPUs
  22. PathEnum (C++, 14 stars): Source code of "PathEnum: Towards Real-Time Hop-Constrained s-t Path Enumeration", published in SIGMOD'2021 - By Shixuan Sun, Yuhang Chen, Bingsheng He, and Bryan Hooi
  23. OEBench (Python, 13 stars): OEBench: Investigating Open Environment Challenges in Real-World Relational Data Streams (VLDB 2024)
  24. VertiBench (Jupyter Notebook, 9 stars): Feature partitioner by imbalance or correlation (ICLR 2024)
  25. omniDB (C++, 7 stars): General query processing engine
  26. LightRW (C++, 6 stars)
  27. HashjoinOnHARP (C++, 5 stars): The MAIN project of the paper "Is FPGA useful for Hash Joins?"
  28. PMP (Python, 5 stars)
  29. RUSH (Python, 4 stars): A fast library for real-time burst subgraph detection
  30. On-the-fly-data-shuffling-for-OpenCL-based-FPGAs (JavaScript, 4 stars)
  31. DeltaBoost (C++, 4 stars): GBDT-based model with efficient unlearning (SIGMOD 2023)
  32. ModelGo (TeX, 4 stars)
  33. Pyper (3 stars)
  34. KGraph (3 stars): Concurrent Graph Query Processing with Memoization on Graph
  35. Awesome-Prompt-For-Research (2 stars): Awesome prompts for computer science research including paper editting and code debugging
  36. Melia (C, 2 stars)
  37. Query_on_OpenCL_FPGA (C++, 1 star)
  38. FedGMA (Python, 1 star): Communication-Efficient Generalized Neuron Matching for Federated Learning (ICPP'23)
  39. HashJoin_HMA (C++, 1 star): A hash join implementation optimized for many-core processors with die-stacked HBMs
  40. Clementi (C++, 1 star): Clementi: A Scalable Multi-FPGA Graph Processing Framework