PaddlePaddle/PaddleFleetX

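PaddlePaddle's large-model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.

Introduction

PaddleFleetX aims to be an easy-to-use, high-performance, and powerful end-to-end large-model toolkit. It covers the full large-model workflow, from environment deployment, data processing, pre-training, and fine-tuning to model compression and inference deployment, and supports state-of-the-art large-model algorithms across language, vision, multimodal, and other domains.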

Latest News 🔥

Update (2022-09-21): PaddleFleetX released v0.1.

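Tutorials

Installation

First, prepare the runtime environment that PaddleFleetX requires. We strongly recommend installing the environment with Docker; see the Docker environment deployment guide for details. For other installation methods, such as a bare-metal installation, see the bare-metal deployment guide.

Once the environment is installed, use the following command to download PaddleFleetX locally, then follow the tutorials to run the model code you need.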

git clone https://github.com/PaddlePaddle/PaddleFleetX.git

Model Zoo

Model	Parameters	Pre-trained file
GPT	345M	GPT_345M

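Performance

Compared with the mainstream industry toolkits Megatron-LM¹ and Megatron-DeepSpeed², PaddleFleetX achieves higher training throughput. The table below compares PaddleFleetX with both at the same model scale, on multiple servers each equipped with eight A100-SXM4-40GB GPUs (CUDA version 11.6). The 0.35B, 1.3B, and 175B models are compared against Megatron-LM; the 6.7B model is compared against Megatron-DeepSpeed.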

1. Megatron-LM commit id: 0bb597b42c53355a567aba2a1357cc34b9d99ddd (Commit on Jul 21, 2022)

2. Megatron-DeepSpeed commit id: 54f1cb7c300b05bf4e232c3efb862e5becd9fb53 (Commit on Sep 27, 2022)
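
The source does not say which unit its throughput figures use. Training throughput for language models is often reported in tokens per second, so the following is only a minimal sketch under that assumption. The function names and parameters are hypothetical and are not part of PaddleFleetX, Megatron-LM, or Megatron-DeepSpeed.

```python
def tokens_per_second(global_batch_size: int, seq_len: int, step_time_s: float) -> float:
    """Tokens processed per second for one training step.

    Assumes every sequence in the global batch has exactly `seq_len` tokens.
    """
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    return global_batch_size * seq_len / step_time_s


def speedup(candidate_tps: float, baseline_tps: float) -> float:
    """Throughput of a candidate toolkit relative to a baseline (1.0 means equal)."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return candidate_tps / baseline_tps
```

With made-up values (a global batch of 8 sequences of 1024 tokens and 0.5 s per step), `tokens_per_second(8, 1024, 0.5)` evaluates to 16384.0. A fair comparison would keep the model size, batch size, sequence length, and hardware identical across toolkits, as the paragraph above describes.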

Industrial Applications

License

PaddleFleetX is released under the Apache 2.0 license.

Citation

@misc{paddlefleetx,
    title={PaddleFleetX: An Easy-to-use and High-Performance One-stop Tool for Deep Learning},
    author={PaddleFleetX Contributors},
    howpublished = {\url{https://github.com/PaddlePaddle/PaddleFleetX}},
    year={2022}
}

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

Paddle

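PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)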

PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.

PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is suspended, please wait patiently]

PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

PaddleSeg

Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.

PaddleGAN

PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image editing, photo2cartoon, image style transfer, GPEN, and so on.

Paddle-Lite

PaddlePaddle High Performance Deep Learning Inference Engine for Mobile and Edge

models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

ERNIE

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

PaddleClas

A treasure chest for visual classification and recognition powered by PaddlePaddle

PaddleX

All-in-One Development Tool based on PaddlePaddle (PaddlePaddle low-code, full-workflow development tool)

VisualDL

Deep Learning Visualization Toolkit for PaddlePaddle

PaddleRec

Large-scale recommendation algorithm library, including classic and recent recommender-system algorithms such as LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, ESCMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, DMR, GateNet, NAML, DIFM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, Fibinet, ListWise, DeepRec, ENSFM, TiSAS, and AutoFIS, along with classic recommender-system datasets such as criteo and movielens.

PARL

A high-performance distributed training framework for Reinforcement Learning

awesome-DeepLearning

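Courses, cases, and knowledge of Deep Learning and AI: introductory, advanced, and specialized deep learning courses, academic case studies, industry practice cases, a deep learning knowledge encyclopedia, and an interview question bank.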

Jupyter Notebook

FastDeploy

⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for ☁️Cloud 📱Mobile and 📹Edge. Including Image, Video, Text and Audio 20+ main stream scenarios and 150+ SOTA models with end-to-end optimization, multi-platform and multi-framework support.

book

Deep Learning 101 with PaddlePaddle (introductory tutorial for the PaddlePaddle deep learning framework)

Jupyter Notebook

Research

novel deep learning research works with PaddlePaddle

PGL

Paddle Graph Learning (PGL) is an efficient and flexible graph learning framework based on PaddlePaddle

PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

PaddleVideo

Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton based action recognition model, practical applications for video tagging and sport action detection.

PaddleHelix

Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning (the “螺旋桨” bio-computing toolkit)

Paddle.js

Paddle.js is a web project for Baidu PaddlePaddle, which is an open source deep learning framework running in the browser. Paddle.js can either load a pre-trained model, or transforming a model from paddle-hub with model transforming tools provided by Paddle.js. It could run in every browser with WebGL/WebGPU/WebAssembly supported. It could also run in Baidu Smartprogram and WX miniprogram.

Serving

A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)

RocketQA

🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.

X2Paddle

Deep learning model converter for PaddlePaddle.

Paddle2ONNX

ONNX Model Exporter for PaddlePaddle

Paddle-Lite-Demo

lib, demo, model, data

Knover

Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle

Parakeet

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)

FlyCV

FlyCV is a high-performance library for processing computer visual tasks.

Paddle3D

A 3D computer vision development toolkit based on PaddlePaddle. It supports point-cloud object detection, segmentation, and monocular 3D object detection models.

Quantum

Jupyter Notebook

PaddleYOLO

🚀🚀🚀 YOLO series of PaddlePaddle implementation, PP-YOLOE+, RT-DETR, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv10, YOLOX, YOLOv5u, YOLOv7u, YOLOv6Lite, RTMDet and so on. 🚀🚀🚀

Anakin

High performance Cross-platform Inference-engine, you could run Anakin on x86-cpu,arm, nv-gpu, amd-gpu,bitmain and cambricon devices.

VIMER

Repository of visual pre-trained foundation models

PaddleTS

Awesome Easy-to-Use Deep Time Series Modeling based on PaddlePaddle, including comprehensive functionality modules like TSDataset, Analysis, Transform, Models, AutoTS, and Ensemble, etc., supporting versatile tasks like time series forecasting, representation learning, and anomaly detection, etc., featured with quick tracking of SOTA deep models.

PaddleFL

Federated Deep Learning in PaddlePaddle

ERNIE-SDK

ERNIE Bot Agent is a Large Language Model (LLM) Agent Framework, powered by the advanced capabilities of ERNIE Bot and the platform resources of Baidu AI Studio.

Jupyter Notebook

PaddleSpatial

PaddleSpatial is an open-source spatial-temporal computing tool based on PaddlePaddle.

PaddleRS

Awesome Remote Sensing Toolkit based on PaddlePaddle.

PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

PaddleCloud

PaddlePaddle Docker images and K8s operators for PaddleOCR/Detection developers to use on public/private cloud.

MetaGym

Collection of Reinforcement Learning / Meta Reinforcement Learning Environments.

PASSL

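PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, simsiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DEiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.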

PaddleScience

PaddleScience is SDK and library for developing AI-driven scientific computing applications based on PaddlePaddle.

InterpretDL

InterpretDL: Interpretation of Deep Learning Models, a model interpretability algorithm library based on PaddlePaddle.

docs

Documentations for PaddlePaddle

Paddle-Inference-Demo

PaddleRobotics

PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

TrustAI

PALM

a Fast, Flexible, Extensible and Easy-to-use NLP Large-scale Pretraining and Multi-task Learning Framework.

ElasticCTR

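ElasticCTR, the PaddlePaddle elastic-computing recommendation system, is an open-source, enterprise-grade recommendation system solution based on Kubernetes. It combines high-accuracy CTR models continuously refined in Baidu's business scenarios, the large-scale distributed training capability of the open-source PaddlePaddle framework, and an industrial-grade elastic scheduling service for sparse parameters, helping users deploy a recommendation system on Kubernetes in one step. It offers high performance, industrial-grade deployment, and an end-to-end experience, and as an open-source suite it supports in-depth secondary development.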

AutoDL

PLSC

Paddle Large Scale Classification Tools, supports ArcFace, CosFace, PartialFC, Data Parallel + Model Parallel. Model includes ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, CAE.

CINN

Compiler Infrastructure for Neural Networks

LiteKit

Off-The-Shelf AI Development Kit for APP Developers based on Paddle Lite (PaddlePaddle out-of-the-box mobile AI kit, with Java and Objective-C interface support)

PaddleFlow

PaddleSports

PaddleDTX

Paddle with Decentralized Trust based on Xuperchain

PaConvert

PaddlePaddle Code Convert Toolkit for deep learning code.

XWorld

A C++/Python simulator package for reinforcement learning

community

PaddlePaddle Developer Community

Jupyter Notebook

PaddleSleeve

benchmark

hapi

hapi is a High-level API that supports both static and dynamic execution modes

Jupyter Notebook

Mobile

Embedded and Mobile Deployment

PaddleCustomDevice

PaddlePaddle custom device implementation (custom hardware integration for PaddlePaddle).

PaddleDepth

PaddlePaddle.org

PaddlePaddle.org is the repository for the website of the PaddlePaddle open source project.

PaDiff

Paddle Automatically Diff Precision Toolkits.

EasyData

PaddleTest

PaddlePaddle TestSuite

epep

Easy & Effective Application Framework for PaddlePaddle

paddle-ce-latest-kpis

Paddle Continuous Evaluation, keep updating.

VisionTools

PaddleCraft

Take neural networks as APIs for human-like AI.

Contrib

contribution works with PaddlePaddle from the third party developers

PaddleTransfer

PaddlePaddle transfer learning algorithm library

continuous_evaluation

Macro Continuous Evaluation Platform for Paddle.

recordio

An implementation of the RecordIO file format.

Perf

Paddle-bot

examples

continuous_integration

PaddleSOT

A Bytecode level Implementation of Symbolic OpCode Translator For PaddlePaddle

tape

paddle_upgrade_tool

upgrade paddle-1.x to paddle-2.0

PaddleAPEX

PaddleAPEX: Paddle Accuracy and Performance EXpansion pack

talks

CLA

any

Legacy Repo only for PaddlePaddle with version <= 1.3