  • Stars: 1
  • Language: Python
  • License: MIT License
  • Created 5 months ago
  • Updated 28 days ago

Repository Details

HIP backend patch for Numba, the NumPy aware dynamic Python compiler using LLVM.

More Repositories

1

ROCm

AMD ROCm™ Software - GitHub Home
Shell
4,470
star
2

HIP

HIP: C++ Heterogeneous-Compute Interface for Portability
C++
3,398
star
3

MIOpen

AMD's Machine Intelligence Library
Assembly
1,046
star
4

HIPIFY

HIPIFY: Convert CUDA to Portable C++ Code
C++
440
star
5

hcc

HCC is an Open Source, Optimizing C++ Compiler for Heterogeneous Compute currently for the ROCm GPU Computing Platform
C++
425
star
6

rocBLAS

Next generation BLAS implementation for ROCm platform
C++
308
star
7

composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
C++
285
star
8

omnitrace

Omnitrace: Application Profiling, Tracing, and Analysis
C++
283
star
9

rccl

ROCm Communication Collectives Library (RCCL)
C++
231
star
10

Tensile

Stretching GPU performance for GEMMs and tensor contractions.
Python
211
star
11

ROCR-Runtime

ROCm Platform Runtime: ROCr, an HSA-based runtime enhanced for the HPC market
C++
205
star
12

aomp

AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
Fortran
203
star
13

AMDMIGraphX

AMD's graph optimization engine.
C++
181
star
14

rocFFT

Next generation FFT implementation for ROCm
C++
174
star
15

MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
C++
168
star
16

gpufort

GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
Fortran
159
star
17

rocPRIM

ROCm Parallel Primitives
C++
154
star
18

omniperf

Advanced Profiling and Analytics for AMD Hardware
Python
128
star
19

rocprofiler

ROC profiler library. Profiling with perf-counters and derived metrics.
C
122
star
20

rocm-examples

A collection of examples for the ROCm software stack
C++
121
star
21

rocMLIR

C++
120
star
22

rocSPARSE

Next generation SPARSE implementation for ROCm platform
C++
117
star
23

rocm_smi_lib

ROCm SMI LIB
C++
114
star
24

rocRAND

RAND library for HIP programming language
C++
111
star
25

HIP-CPU

An implementation of HIP that works on CPUs, across OSes.
C++
107
star
26

rocThrust

ROCm Thrust - run Thrust dependent software on AMD GPUs
C++
100
star
27

ROCm-Device-Libs

ROCm Device Libraries
C
99
star
28

rocSOLVER

Next generation LAPACK implementation for ROCm platform
C++
91
star
29

hipCUB

Reusable software components for ROCm developers
C++
79
star
30

rocALUTION

Next generation library for iterative sparse solvers for ROCm platform
C++
74
star
31

rocWMMA

rocWMMA
C++
71
star
32

roctracer

ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs
C++
67
star
33

hipSPARSE

ROCm SPARSE marshalling library
C++
67
star
34

hipfort

Fortran interfaces for ROCm libraries
Fortran
66
star
35

atmi

Asynchronous Task and Memory Interface, or ATMI, is a runtime framework and programming model for heterogeneous CPU-GPU systems. It provides a consistent, declarative API to create task graphs on CPUs and GPUs (integrated and discrete).
C++
65
star
36

ROCmValidationSuite

The ROCm Validation Suite is a system administrator’s and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform.
C++
61
star
37

rocm-cmake

CMake modules used within the ROCm libraries
CMake
59
star
38

hipFFT

hipFFT is an FFT marshalling library.
C++
52
star
39

amd_matrix_instruction_calculator

A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators
Python
48
star
40

ROCm-CompilerSupport

The compiler support repository provides various Lightning Compiler related services.
C++
46
star
41

rpp

AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/OpenCL/CPU back-ends.
C++
46
star
42

ROCclr

44
star
43

ROCgdb

This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger.
C
44
star
44

rocm_bandwidth_test

Bandwidth test for ROCm
C++
41
star
45

HIPCC

HIPCC: HIP compiler driver
C++
39
star
46

Experimental_ROC

Experimental and Intriguing Tools for ROCm
Shell
35
star
47

rocHPCG

HPCG benchmark based on ROCm platform
C++
35
star
48

ROC_SHMEM

ROC_SHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
C++
34
star
49

MISA

Machine Intelligence Shader Autogen. AMDGPU ML shader code generator. (previously iGEMMgen)
Python
33
star
50

amdsmi

AMD SMI
C++
32
star
51

ROCm.github.io

ROCm Website
32
star
52

TransferBench

TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)
C++
27
star
53

clang-ocl

OpenCL compilation with clang compiler.
CMake
26
star
54

hipSOLVER

ROCm SOLVER marshalling library
C++
25
star
55

aotriton

Ahead of Time (AOT) Triton Math Library
Python
24
star
56

ROCm-OpenCL-Driver

ROCm OpenCL Compiler Tool Driver
C++
24
star
57

rocm-blogs

Jupyter Notebook
22
star
58

rccl-tests

RCCL Performance Benchmark Tests
Cuda
21
star
59

hipRAND

Random number library that generates pseudo-random and quasi-random numbers.
C++
21
star
60

rdc

RDC
C++
19
star
61

ROCdbgapi

The AMD Debugger API is a library that provides all the support necessary for a debugger and other tools to perform low level control of the execution and inspection of execution state of AMD's commercially available GPU architectures.
C++
19
star
62

pyrsmi

Python package of rocm-smi-lib
Python
17
star
63

hip-python

HIP Python Low-level Bindings
Shell
16
star
64

hip-tests

C++
15
star
65

roc-stdpar

C++
14
star
66

pytorch-micro-benchmarking

Python
14
star
67

hipify_torch

Python
13
star
68

rocmProfileData

C++
13
star
69

rocm-docs-core

ROCm Documentation Python package for ReadTheDocs build standardization
CSS
12
star
70

rocAL

The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a processing graph programmable by the user.
C++
10
star
71

half

C++
9
star
72

rocprofiler-sdk

C++
9
star
73

rocBLAS-Examples

Examples illustrating usage of the rocBLAS library
C++
9
star
74

OSU_Microbenchmarks

ROCm - UCX enabled OSU_Benchmarks
C
8
star
75

MITuna

Python
7
star
76

rtg_tracer

C++
7
star
77

rocm-spack-pkgs

Repository to host spack recipes for ROCm
Python
6
star
78

rbuild

ROCm build tool
Python
6
star
79

Gromacs

ROCm's implementation of Gromacs
C++
5
star
80

rocm-core

CMake
4
star
81

hip-testsuite

Python
4
star
82

MIFin

Tuna centric MIOpen client
C++
4
star
83

rocm-llvm-python

Low-level Cython and Python bindings to the (ROCm) LLVM C API.
Shell
3
star
84

flang

Mirror of flang repo: The source repo is https://github.com/flang-compiler/flang . Once a day the master branch is updated from the upstream source repo and then locked. AOMP or ROCm developers may commit or create PRs on branch aomp-dev.
C++
3
star
85

hipSPARSELt

C++
2
star
86

aomp-extras

hostcall services library, math library, and utilities
Shell
2
star
87

MIOpenExamples

MIOpen examples
C++
2
star
88

hipOMB

OSU MPI benchmarks with ROCm support
C
1
star
89

migraphx-benchmark

1
star
90

tensorcast

Python
1
star
91

rocm-recipes

Recipes for ROCm
CMake
1
star
92

rocprofiler-register

CMake
1
star
93

rocm-install-on-windows

1
star