• Stars
    star
    4
  • Rank 3,304,323 (Top 66 %)
  • Language
    C++
  • License
    MIT License
  • Created over 4 years ago
  • Updated 3 months ago

Repository Details

Tuna centric MIOpen client

More Repositories

1

ROCm

AMD ROCm™ Software - GitHub Home
Shell
4,583
star
2

HIP

HIP: C++ Heterogeneous-Compute Interface for Portability
C++
3,398
star
3

MIOpen

AMD's Machine Intelligence Library
Assembly
1,060
star
4

HIPIFY

HIPIFY: Convert CUDA to Portable C++ Code
C++
505
star
5

hcc

HCC is an Open Source, Optimizing C++ Compiler for Heterogeneous Compute currently for the ROCm GPU Computing Platform
C++
425
star
6

rocBLAS

Next generation BLAS implementation for ROCm platform
C++
308
star
7

composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
C++
285
star
8

omnitrace

Omnitrace: Application Profiling, Tracing, and Analysis
C++
283
star
9

rccl

ROCm Communication Collectives Library (RCCL)
C++
231
star
10

ROCR-Runtime

ROCm Platform Runtime: ROCr a HPC market enhanced HSA based runtime
C++
217
star
11

Tensile

Stretching GPU performance for GEMMs and tensor contractions.
Python
214
star
12

aomp

AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
Fortran
203
star
13

AMDMIGraphX

AMD's graph optimization engine.
C++
185
star
14

rocFFT

Next generation FFT implementation for ROCm
C++
174
star
15

MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
C++
168
star
16

gpufort

GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
Fortran
159
star
17

rocPRIM

ROCm Parallel Primitives
C++
157
star
18

rocm-examples

A collection of examples for the ROCm software stack
C++
154
star
19

omniperf

Advanced Profiling and Analytics for AMD Hardware
Python
132
star
20

rocprofiler

ROC profiler library. Profiling with perf-counters and derived metrics.
C
126
star
21

rocMLIR

C++
120
star
22

rocSPARSE

Next generation SPARSE implementation for ROCm platform
C++
117
star
23

rocm_smi_lib

ROCm SMI LIB
C++
116
star
24

rocRAND

RAND library for HIP programming language
C++
110
star
25

HIP-CPU

An implementation of HIP that works on CPUs, across OSes.
C++
107
star
26

rocThrust

ROCm Thrust - run Thrust dependent software on AMD GPUs
C++
100
star
27

ROCm-Device-Libs

ROCm Device Libraries
C
97
star
28

rocSOLVER

Next generation LAPACK implementation for ROCm platform
C++
91
star
29

rocWMMA

rocWMMA
C++
86
star
30

hipCUB

Reusable software components for ROCm developers
C++
81
star
31

rocALUTION

Next generation library for iterative sparse solvers for ROCm platform
C++
74
star
32

hipfort

Fortran interfaces for ROCm libraries
Fortran
69
star
33

roctracer

ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs
C++
69
star
34

hipSPARSE

ROCm SPARSE marshalling library
C++
67
star
35

atmi

Asynchronous Task and Memory Interface, or ATMI, is a runtime framework and programming model for heterogeneous CPU-GPU systems. It provides a consistent, declarative API to create task graphs on CPUs and GPUs (integrated and discrete).
C++
66
star
36

ROCmValidationSuite

The ROCm Validation Suite is a system administrator's and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform.
C++
61
star
37

rocm-cmake

CMake modules used within the ROCm libraries
CMake
59
star
38

hipFFT

hipFFT is an FFT marshalling library.
C++
52
star
39

ROCgdb

This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger.
C
50
star
40

amd_matrix_instruction_calculator

A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators
Python
48
star
41

ROCm-CompilerSupport

The compiler support repository provides various Lightning Compiler related services.
C++
46
star
42

rpp

AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/OpenCL/CPU back-ends.
C++
46
star
43

ROCclr

44
star
44

rocm_bandwidth_test

Bandwidth test for ROCm
C++
41
star
45

amdsmi

AMD SMI
C++
39
star
46

HIPCC

HIPCC: HIP compiler driver
C++
39
star
47

aotriton

Ahead of Time (AOT) Triton Math Library
Python
37
star
48

Experimental_ROC

Experimental and Intriguing Tools for ROCm
Shell
35
star
49

rocHPCG

HPCG benchmark based on ROCm platform
C++
35
star
50

ROC_SHMEM

ROC_SHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
C++
34
star
51

MISA

Machine Intelligence Shader Autogen. AMDGPU ML shader code generator. (previously iGEMMgen)
Python
34
star
52

ROCm.github.io

ROCm Website
32
star
53

TransferBench

TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)
C++
29
star
54

rocm-blogs

Jupyter Notebook
28
star
55

clang-ocl

OpenCL compilation with clang compiler.
CMake
26
star
56

hipSOLVER

ROCm SOLVER marshalling library
C++
24
star
57

ROCm-OpenCL-Driver

ROCm OpenCL Compiler Tool Driver
C++
24
star
58

rdc

RDC
C++
23
star
59

hipRAND

Random number library that generates pseudo-random and quasi-random numbers.
C++
23
star
60

rccl-tests

RCCL Performance Benchmark Tests
Cuda
21
star
61

ROCdbgapi

The AMD Debugger API is a library that provides all the support necessary for a debugger and other tools to perform low level control of the execution and inspection of execution state of AMD's commercially available GPU architectures.
C++
19
star
62

pyrsmi

Python package of rocm-smi-lib
Python
18
star
63

hip-python

HIP Python Low-level Bindings
Shell
17
star
64

hip-tests

C++
15
star
65

roc-stdpar

C++
14
star
66

pytorch-micro-benchmarking

Python
14
star
67

hipify_torch

Python
13
star
68

rocmProfileData

C++
13
star
69

rocm-docs-core

ROCm Documentation Python package for ReadTheDocs build standardization
CSS
12
star
70

rocAL

The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a processing graph programmable by the user.
C++
11
star
71

half

C++
9
star
72

rocprofiler-sdk

C++
9
star
73

rocBLAS-Examples

Examples illustrating usage of the rocBLAS library
C++
9
star
74

OSU_Microbenchmarks

ROCm - UCX enabled OSU_Benchmarks
C
8
star
75

MITuna

Python
7
star
76

rtg_tracer

C++
7
star
77

Gromacs

ROCm's implementation of Gromacs
C++
6
star
78

rocm-spack-pkgs

Repository to host spack recipes for ROCm
Python
6
star
79

rbuild

ROCm build tool
Python
6
star
80

rocm-core

CMake
5
star
81

rocm-llvm-python

Low-level Cython and Python bindings to the (ROCm) LLVM and AMD COMGR C API. Also ships the official LLVM Clang bindings.
Shell
4
star
82

hip-testsuite

Python
4
star
83

flang

Mirror of flang repo: The source repo is https://github.com/flang-compiler/flang . Once a day the master branch is updated from the upstream source repo and then locked. AOMP or ROCm developers may commit or create PRs on branch aomp-dev.
C++
3
star
84

numba-hip

HIP backend patch for Numba, the NumPy aware dynamic Python compiler using LLVM.
Python
3
star
85

tensorcast

Python
3
star
86

hipSPARSELt

C++
2
star
87

aomp-extras

hostcall services library, math library, and utilities
Shell
2
star
88

MIOpenExamples

MIOpen examples
C++
2
star
89

rocprofiler-register

CMake
2
star
90

rocm-install-on-windows

2
star
91

hipOMB

OSU MPI benchmarks with ROCm support
C
1
star
92

migraphx-benchmark

1
star
93

rocm-recipes

Recipes for ROCm
CMake
1
star
94

hipBLAS-common

Common files shared by hipBLAS and hipBLASLt
CMake
1
star