huawei-noah/Pretrained-Language-Model

Stars: 2,961
Rank: 15,305 (Top 0.4%)
Language: Python
Created about 5 years ago
Updated 11 months ago

Pretrained language models and their related optimization techniques developed by Huawei Noah's Ark Lab.

Pretrained Language Model

This repository provides the latest pretrained language models and their related optimization techniques developed by Huawei Noah's Ark Lab.

Directory structure

PanGu-α is a large-scale autoregressive pretrained Chinese language model with up to 200B parameters. The models are developed under MindSpore and trained on a cluster of Ascend 910 AI processors.
NEZHA-TensorFlow is a pretrained Chinese language model, developed under TensorFlow, that achieves state-of-the-art performance on several Chinese NLP tasks.
NEZHA-PyTorch is the PyTorch version of NEZHA.
NEZHA-Gen-TensorFlow provides two GPT models: Yuefu (乐府), a Chinese classical poetry generation model, and a general Chinese GPT model.
TinyBERT is a compressed BERT model that is 7.5x smaller and 9.4x faster at inference.
TinyBERT-MindSpore is a MindSpore version of TinyBERT.
DynaBERT is a dynamic BERT model with adaptive width and depth.
BBPE provides a byte-level vocabulary building tool and its corresponding tokenizer.
PMLM is a probabilistically masked language model. Trained without the complex two-stream self-attention, PMLM can be treated as a simple approximation of XLNet.
TernaryBERT is a weight ternarization method for BERT models, developed under PyTorch.
TernaryBERT-MindSpore is the MindSpore version of TernaryBERT.
HyperText is an efficient text classification model based on hyperbolic geometry theories.
BinaryBERT is a weight binarization method for BERT models that uses ternary weight splitting, developed under PyTorch.
AutoTinyBERT provides a model zoo that can meet different latency requirements.
PanGu-Bot is a Chinese pre-trained open-domain dialog model built on the GPU implementation of PanGu-α.
CeMAT is a universal sequence-to-sequence multi-lingual pre-training language model for both autoregressive and non-autoregressive neural machine translation tasks.
Noah_WuKong is a large-scale Chinese vision-language dataset and a group of benchmarking models trained on it.
Noah_WuKong-MindSpore is a MindSpore version of Noah_WuKong.
CAME is a Confidence-guided Adaptive Memory Efficient Optimizer.
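
As a rough illustration of what "byte-level vocabulary" means in the BBPE entry above, the following sketch encodes text as UTF-8 bytes and finds the most frequent adjacent byte pair, which is the kind of statistic a byte-pair merge step would use. This is not the BBPE tool's code: the function names `to_byte_symbols` and `most_frequent_pair` and the two-string corpus are hypothetical, chosen only for this example.

```python
from collections import Counter


def to_byte_symbols(text: str) -> list[int]:
    """Encode text as UTF-8 and return its byte values (each in 0..255)."""
    return list(text.encode("utf-8"))


def most_frequent_pair(sequences: list[list[int]]):
    """Count adjacent symbol pairs over all sequences and return the most
    common pair, or None if no sequence has at least two symbols."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical toy corpus; each of these Chinese characters is 3 bytes in UTF-8.
corpus = ["乐府", "乐府诗"]
byte_seqs = [to_byte_symbols(t) for t in corpus]
pair = most_frequent_pair(byte_seqs)

# The base vocabulary never exceeds 256 symbols, and decoding is lossless.
restored = bytes(byte_seqs[0]).decode("utf-8")
```

Because every string reduces to bytes, a vocabulary built this way has no out-of-vocabulary characters; learned merges only shorten the sequences.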

Efficient-AI-Backbones

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

HEBO

Bayesian optimisation & Reinforcement Learning library developed by Huawei Noah's Ark Lab

Language: Jupyter Notebook

Efficient-Computing

Efficient computing methods developed by Huawei Noah's Ark Lab

Language: Jupyter Notebook

AdderNet

Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"

trustworthyAI

Trustworthy AI related projects

SMARTS

Scalable Multi-Agent RL Training School for Autonomous Driving

bolt

Bolt is a deep learning library with high performance and heterogeneous flexibility.

noah-research

vega

AutoML tools chain

VanillaNet

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Language: Jupyter Notebook

streamDM

Stream Data Mining Library for Spark Streaming

Pretrained-IPT

xingtian

xingtian is a componentized library for the development and verification of reinforcement learning algorithms

benchmark

Disout

Code for AAAI 2020 paper, Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks (Disout).

BGCN

A TensorFlow implementation of "Bayesian Graph Convolutional Neural Networks" (AAAI 2019).

BHT-ARIMA

Code for paper: Block Hankel Tensor ARIMA for Multiple Short Time Series Forecasting (AAAI-20)

multi_hyp_cc

[CVPR2020] A Multi-Hypothesis Approach to Color Constancy

Efficient-NLP

streamDM-Cpp

Stream machine learning in C++

Federated-Learning