thu-pacman/PET

Stars: 112
Rank: 310,464 (top 7%)
Language: C++
License: Apache License 2.0
Created over 3 years ago
Updated over 2 years ago

PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections

PET is the first DNN framework that optimizes tensor programs with partially equivalent transformations and automated corrections. PET discovers and applies program transformations that improve computation efficiency but only maintain partial functional equivalence. PET then automatically corrects results to restore full equivalence. We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. Our evaluation shows that PET outperforms existing systems by up to 2.5x, by unlocking previously missed opportunities from partially equivalent transformations.
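To make the idea of a partially equivalent transformation concrete, here is a minimal sketch in plain Python. It is not PET's API or implementation; every function name below is hypothetical, and the example assumes equal-length signals and an odd-length kernel. A batch of 1D signals is folded into one long signal so that a single convolution covers the whole batch. The result matches the per-signal convolution everywhere except near the seams between signals, and a correction step recomputes only those positions to restore full equivalence.

```python
def conv_at(signal, kernel, i):
    """One output element of a zero-padded 1D convolution (odd-length kernel)."""
    radius = len(kernel) // 2
    total = 0
    for k, weight in enumerate(kernel):
        j = i + k - radius
        if 0 <= j < len(signal):  # positions outside the signal count as zero
            total += weight * signal[j]
    return total


def conv1d_same(signal, kernel):
    """Zero-padded 1D convolution whose output has the same length as the input."""
    return [conv_at(signal, kernel, i) for i in range(len(signal))]


def reference(batch, kernel):
    """Fully equivalent baseline: convolve each signal on its own."""
    return [conv1d_same(signal, kernel) for signal in batch]


def transformed(batch, kernel):
    """Partially equivalent version: fold the batch into one long signal.

    Outputs near the seams are wrong because neighbouring signals leak in
    where the baseline would have used zero padding.
    """
    length = len(batch[0])
    fused = [x for signal in batch for x in signal]
    out = conv1d_same(fused, kernel)
    return [out[b * length:(b + 1) * length] for b in range(len(batch))]


def corrected(batch, kernel):
    """Run the transformed version, then recompute only the edge positions."""
    radius = len(kernel) // 2
    length = len(batch[0])
    result = transformed(batch, kernel)
    for b, signal in enumerate(batch):
        # For simplicity this also recomputes the outer ends of the first and
        # last signal, which were already correct.
        edge = set(range(min(radius, length))) | set(range(max(0, length - radius), length))
        for i in edge:
            result[b][i] = conv_at(signal, kernel, i)
    return result


batch = [[1, 2, 3, 4], [5, 6, 7, 8]]
kernel = [1, 1, 1]

exact = reference(batch, kernel)
fast = transformed(batch, kernel)
fixed = corrected(batch, kernel)

mismatches = [(b, i) for b in range(len(batch)) for i in range(len(batch[b]))
              if fast[b][i] != exact[b][i]]
print(mismatches)      # [(0, 3), (1, 0)]
print(fixed == exact)  # True
```

In this toy case only two of the eight outputs need correcting. The sketch shows only that the correction touches a small region; whether the transformed program is actually faster depends on the hardware and kernels involved, which it does not model.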


Figure 1: End-to-end performance comparison between PET and existing frameworks. For each DNN, the numbers above the PET bars show the speedups over the best baseline. TASO does not support the 3D convolution operators in Resnet3D-18.

Install PET

See README.pdf A.4 to install PET from source.

Publication

Wang, Haojie, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, Liyan Zheng, Yuanzhi Li, Kaiyuan Rong, Yuanyong Chen, and Zhihao Jia. "PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections." In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21), pp. 37-54. 2021.

Contributors

Currently PET is maintained in a private repository. Updates will be synchronized to this repository periodically. Contributors of PET are listed as follows.

GeminiGraph

A computation-centric distributed graph processing system.

GridGraph

Out-of-core graph processing on a single machine.

TriCache

A User-Transparent Block Cache Enabling High-Performance Out-of-Core Processing with In-Memory Programs

FasterMoE

gscholar-citations-crawler

Crawl all your citations from Google Scholar

LiveGraph

LiveGraph: a transactional graph storage system with purely sequential adjacency list scans

HyQuas

A hybrid partitioner based quantum circuit simulation system on GPU

SmartMoE-AE

GraphPi

RisGraph

RisGraph: A Real-Time Streaming System for Evolving Graphs to Support Sub-millisecond Per-update Analysis at Millions Ops/s

Spindle

lab-guide

Everything about PACMAN!

VAPRO

Light-weight Performance Variance Detection for Production-run Parallel Applications

self-checkpoint

An in-memory checkpoint method using less space.

AIPerf

mpi-profiler

A simple and easy-to-use profiler for MPI programs. It profiles CPU time and MPI time for each process. No source code modification is needed; just re-link the program with this library.

LiveGraph-Binary

LiveGraph: a transactional graph storage system with purely sequential adjacency list scans

CYPRESS

CYPRESS: Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression

AIPerf-MoE

MoE Model Benchmark of AIPerf

Mat2Stencil

A Modular Matrix-Based DSL for Explicit and Implicit Matrix-Free PDE Solvers on Structured Grid.

tprint

tprint is a printing library specially designed for the SW architecture. It currently provides C and Fortran APIs.