oneAPI Math Kernel Library (oneMKL) Interfaces

oneMKL Interfaces is an open-source implementation of the oneMKL Data Parallel C++ (DPC++) interface according to the oneMKL specification. It works with multiple devices (backends) using device-specific libraries underneath.

oneMKL is part of oneAPI.

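| User Application | oneMKL Layer | Third-Party Library | Hardware Backend |
|---|---|---|---|
| oneMKL interface | oneMKL selector | Intel(R) oneAPI Math Kernel Library for x86 CPU | x86 CPU |
| oneMKL interface | oneMKL selector | Intel(R) oneAPI Math Kernel Library for Intel GPU | Intel GPU |
| oneMKL interface | oneMKL selector | NVIDIA cuBLAS for NVIDIA GPU | NVIDIA GPU |
| oneMKL interface | oneMKL selector | NVIDIA cuSOLVER for NVIDIA GPU | NVIDIA GPU |
| oneMKL interface | oneMKL selector | NVIDIA cuRAND for NVIDIA GPU | NVIDIA GPU |
| oneMKL interface | oneMKL selector | NVIDIA cuFFT for NVIDIA GPU | NVIDIA GPU |
| oneMKL interface | oneMKL selector | NETLIB LAPACK for x86 CPU | x86 CPU |
| oneMKL interface | oneMKL selector | AMD rocBLAS for AMD GPU | AMD GPU |
| oneMKL interface | oneMKL selector | AMD rocSOLVER for AMD GPU | AMD GPU |
| oneMKL interface | oneMKL selector | AMD rocRAND for AMD GPU | AMD GPU |
| oneMKL interface | oneMKL selector | AMD rocFFT for AMD GPU | AMD GPU |
| oneMKL interface | oneMKL selector | portBLAS | x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU |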

Support and Requirements

Supported Usage Models:

There are two oneMKL selector layer implementations:

  • Run-time dispatching: The application is linked with the oneMKL library and the required backend is loaded at run-time based on device vendor (all libraries should be dynamic).

Example of app.cpp with run-time dispatching:

#include "oneapi/mkl.hpp"

...
auto cpu_dev = sycl::device(sycl::cpu_selector());
auto gpu_dev = sycl::device(sycl::gpu_selector());

sycl::queue cpu_queue(cpu_dev);
sycl::queue gpu_queue(gpu_dev);

oneapi::mkl::blas::column_major::gemm(cpu_queue, transA, transB, m, ...);
oneapi::mkl::blas::column_major::gemm(gpu_queue, transA, transB, m, ...);
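
The gemm calls above elide most of their arguments. As a reminder of what a column-major GEMM computes, here is a minimal plain C++ reference sketch that uses neither SYCL nor oneMKL. The name `reference_gemm` is hypothetical and is not a oneMKL API, and transposition is left out for brevity. It computes C = alpha*A*B + beta*C, where element (i, j) of a matrix with leading dimension ld is stored at index i + j*ld.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical reference routine (not a oneMKL API).
// Computes C = alpha*A*B + beta*C for column-major matrices, no transposition.
// A is m x k (lda >= m), B is k x n (ldb >= k), C is m x n (ldc >= m).
void reference_gemm(std::int64_t m, std::int64_t n, std::int64_t k,
                    float alpha, const std::vector<float>& a, std::int64_t lda,
                    const std::vector<float>& b, std::int64_t ldb,
                    float beta, std::vector<float>& c, std::int64_t ldc) {
    for (std::int64_t j = 0; j < n; ++j) {          // column of C
        for (std::int64_t i = 0; i < m; ++i) {      // row of C
            float acc = 0.0f;
            for (std::int64_t p = 0; p < k; ++p) {  // inner dimension
                acc += a[i + p * lda] * b[p + j * ldb];
            }
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}
```

For example, with A = [1 3; 2 4] stored column by column as {1, 2, 3, 4}, B the 2x2 identity, alpha = 1 and beta = 0, the call `reference_gemm(2, 2, 2, 1.0f, a, 2, b, 2, 0.0f, c, 2)` leaves `c` equal to `a`.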

How to build an application with run-time dispatching:

On Linux, use the icpx compiler; on Windows, use the icx compiler. Linux example:

$> icpx -fsycl -I$ONEMKL/include -c app.cpp
$> icpx -fsycl app.o -L$ONEMKL/lib -lonemkl
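
Linking against the single onemkl library is enough because the backend is chosen when the call executes, based on the device behind the queue. The plain C++ sketch below illustrates only that idea; it is not the oneMKL implementation. Every name in the `toy_runtime` namespace is hypothetical, and a lookup table of function objects stands in for the dynamically loaded backend libraries.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

namespace toy_runtime {

// Hypothetical stand-ins for a SYCL device and queue.
enum class vendor { x86_cpu, nvidia_gpu, amd_gpu };
struct queue { vendor dev; };

// A backend entry point; here it only reports which backend would do the work.
using gemm_fn = std::function<std::string()>;

// Table of available backends. Real run-time dispatching would load
// shared libraries instead of using a hard-coded table.
inline std::map<vendor, gemm_fn>& registry() {
    static std::map<vendor, gemm_fn> table{
        {vendor::x86_cpu, [] { return std::string("mklcpu"); }},
        {vendor::nvidia_gpu, [] { return std::string("cublas"); }},
    };
    return table;
}

// The backend is looked up from the queue's device when the call runs.
inline std::string gemm(const queue& q) {
    auto it = registry().find(q.dev);
    if (it == registry().end()) {
        throw std::runtime_error("no backend registered for this device");
    }
    return it->second();
}

}  // namespace toy_runtime
```

In this sketch the same `gemm` call site serves every device, and a queue whose device has no registered backend fails only at run time, which is the trade-off compared with selecting the backend in the type system.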
  • Compile-time dispatching: The application uses a templated backend selector API whose template parameters specify the required backends and third-party libraries. The application is linked with the required oneMKL backend wrapper libraries (libraries can be static or dynamic).

Example of app.cpp with compile-time dispatching:

#include "oneapi/mkl.hpp"

...
auto cpu_dev = sycl::device(sycl::cpu_selector());
auto gpu_dev = sycl::device(sycl::gpu_selector());

sycl::queue cpu_queue(cpu_dev);
sycl::queue gpu_queue(gpu_dev);

oneapi::mkl::backend_selector<oneapi::mkl::backend::mklcpu> cpu_selector(cpu_queue);

oneapi::mkl::blas::column_major::gemm(cpu_selector, transA, transB, m, ...);
oneapi::mkl::blas::column_major::gemm(oneapi::mkl::backend_selector<oneapi::mkl::backend::cublas> {gpu_queue}, transA, transB, m, ...);

How to build an application with compile-time dispatching:

$> clang++ -fsycl -I$ONEMKL/include -c app.cpp
$> clang++ -fsycl app.o -L$ONEMKL/lib -lonemkl_blas_mklcpu -lonemkl_blas_cublas
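
The link line above names one wrapper library per backend because, with compile-time dispatching, the backend is part of the selector's type and overload resolution picks the implementation during compilation. A minimal plain C++ sketch of that mechanism follows. It is not the oneMKL implementation, and every name in the `toy_static` namespace is hypothetical.

```cpp
#include <string>

namespace toy_static {

enum class backend { mklcpu, cublas };

struct queue {};  // hypothetical stand-in for sycl::queue

// The backend is a template parameter, so it is fixed in the selector's type.
template <backend B>
class backend_selector {
public:
    explicit backend_selector(queue& q) : queue_(q) {}
    queue& get_queue() { return queue_; }

private:
    queue& queue_;
};

// One overload per backend; the compiler chooses from the argument type alone,
// so only the backends actually named in the program need to be linked.
inline std::string gemm(backend_selector<backend::mklcpu>) { return "mklcpu"; }
inline std::string gemm(backend_selector<backend::cublas>) { return "cublas"; }

}  // namespace toy_static
```

In this sketch, passing a selector for a backend that has no `gemm` overload is a compile error rather than a run-time failure.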

Refer to Selecting a Compiler for the choice between icpx/icx and clang++ compilers.

Supported Configurations:

Supported domains: BLAS, LAPACK, RNG, DFT

Linux*

| Domain | Backend | Library | Supported Link Type | Supported Compiler |
|---|---|---|---|---|
| BLAS | x86 CPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++, LLVM*, hipSYCL |
| BLAS | Intel GPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |
| BLAS | NVIDIA GPU | NVIDIA cuBLAS | Dynamic, Static | LLVM*, hipSYCL |
| BLAS | x86 CPU | NETLIB LAPACK | Dynamic, Static | DPC++, LLVM*, hipSYCL |
| BLAS | AMD GPU | AMD rocBLAS | Dynamic, Static | LLVM*, hipSYCL |
| BLAS | x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU | portBLAS | Dynamic, Static | DPC++, LLVM* |
| LAPACK | x86 CPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++, LLVM* |
| LAPACK | Intel GPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |
| LAPACK | NVIDIA GPU | NVIDIA cuSOLVER | Dynamic, Static | LLVM* |
| LAPACK | AMD GPU | AMD rocSOLVER | Dynamic, Static | LLVM* |
| RNG | x86 CPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++, LLVM*, hipSYCL |
| RNG | Intel GPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |
| RNG | NVIDIA GPU | NVIDIA cuRAND | Dynamic, Static | LLVM*, hipSYCL |
| RNG | AMD GPU | AMD rocRAND | Dynamic, Static | LLVM*, hipSYCL |
| DFT | Intel GPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |
| DFT | x86 CPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |
| DFT | NVIDIA GPU | NVIDIA cuFFT | Dynamic, Static | DPC++ |
| DFT | AMD GPU | AMD rocFFT | Dynamic, Static | DPC++ |

Windows*

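| Domain | Backend | Library | Supported Link Type | Supported Compiler |
|---|---|---|---|---|
| BLAS | x86 CPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++, LLVM* |
| BLAS | Intel GPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |
| BLAS | x86 CPU | NETLIB LAPACK | Dynamic, Static | DPC++, LLVM* |
| LAPACK | x86 CPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++, LLVM* |
| LAPACK | Intel GPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |
| RNG | x86 CPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++, LLVM* |
| RNG | Intel GPU | Intel(R) oneAPI Math Kernel Library | Dynamic, Static | DPC++ |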

* LLVM - Intel project for LLVM* technology with support for NVIDIA CUDA


Hardware Platform Support

  • CPU
    • Intel Atom(R) Processors
    • Intel(R) Core(TM) Processor Family
    • Intel(R) Xeon(R) Processor Family
  • Accelerators
    • Intel(R) Processor Graphics GEN9
    • NVIDIA(R) TITAN RTX(TM) (Linux* only. cuRAND backend tested also with Quadro and A100 GPUs. Not tested with other NVIDIA GPU families and products.)
    • AMD(R) GPUs (tested on AMD Vega 20, gfx906)

Supported Operating Systems

Linux*

| Operating System | CPU Host/Target | Integrated Graphics from Intel (Intel GPU) | NVIDIA GPU |
|---|---|---|---|
| Ubuntu | 18.04.3, 19.04 | 18.04.3, 19.10 | 18.04.3, 20.04 |
| SUSE Linux Enterprise Server* | 15 | Not supported | Not supported |
| Red Hat Enterprise Linux* (RHEL*) | 8 | Not supported | Not supported |
| Linux* kernel | N/A | 4.11 or higher | N/A |

Windows*

| Operating System | CPU Host/Target | Integrated Graphics from Intel (Intel GPU) |
|---|---|---|
| Microsoft Windows* | 10 (64-bit version only) | 10 (64-bit version only) |
| Microsoft Windows* Server | 2016, 2019 | Not supported |

Software Requirements

What should I download?

General:

  • All builds
    • Linux*: GNU* GCC 5.1 or higher
    • Windows*: MSVS* 2017 or MSVS* 2019 (version 16.5 or newer)
  • Using Conan
    • Python 3.6 or higher
    • Conan C++ package manager
  • Using CMake directly
    • CMake
    • Ninja (optional)
    • Functional testing: GNU* FORTRAN Compiler, NETLIB LAPACK
    • Build only: no additional packages
    • Documentation: Sphinx

Hardware and OS Specific:

| Operating System | Device | Package | Installed by Conan |
|---|---|---|---|
| Linux*/Windows* | x86 CPU | Intel(R) oneAPI DPC++ Compiler or Intel project for LLVM* technology | No |
| Linux*/Windows* | x86 CPU | Intel(R) oneAPI Math Kernel Library | Yes |
| Linux*/Windows* | Intel GPU | Intel(R) oneAPI DPC++ Compiler | No |
| Linux*/Windows* | Intel GPU | Intel GPU driver | No |
| Linux*/Windows* | Intel GPU | Intel(R) oneAPI Math Kernel Library | Yes |
| Linux* only | NVIDIA GPU | Intel project for LLVM* technology or hipSYCL with CUDA backend and dependencies | No |
| Linux* only | AMD GPU | Intel project for LLVM* technology or hipSYCL with ROCm backend and dependencies | No |

If building with Conan, the packages above marked "No" must be installed manually.

If building with CMake, all of the packages above must be installed manually.

Notice for Use of Conan Package Manager

LEGAL NOTICE: By downloading and using this container or script as applicable (the "Software Package") and the included software or software made available for download, you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software (together, the "Agreements") included in this README file.

If the Software Package is installed through a silent install, your download and use of the Software Package indicates your acceptance of the Agreements.

Product and Version Information:

| Product | Supported Version | Installed by Conan | Conan Package Source | Package Install Location on Linux* | License |
|---|---|---|---|---|---|
| Python | 3.6 or higher | No | N/A | Pre-installed or Installed by user | PSF |
| Conan C++ Package Manager | 1.24 or higher | No | N/A | Installed by user | MIT |
| CMake | 3.13 or higher | Yes (3.15 or higher) | conan-center | ~/.conan/data or $CONAN_USER_HOME/.conan/data | The OSI-approved BSD 3-clause License |
| Ninja | 1.10.0 | Yes | conan-center | ~/.conan/data or $CONAN_USER_HOME/.conan/data | Apache License v2.0 |
| GNU* FORTRAN Compiler | 7.4.0 or higher | Yes | apt | /usr/bin | GNU General Public License, version 3 |
| Intel(R) oneAPI DPC++ Compiler | latest | No | N/A | Installed by user | End User License Agreement for the Intel(R) Software Development Products |
| hipSYCL | later than 2cfa530 | No | N/A | Installed by user | BSD-2-Clause License |
| Intel project for LLVM* technology binary for x86 CPU | Daily builds (experimental) tested with 20200331 | No | N/A | Installed by user | Apache License v2 |
| Intel project for LLVM* technology source for NVIDIA GPU | Daily source releases: tested with 20200421 | No | N/A | Installed by user | Apache License v2 |
| Intel(R) oneAPI Math Kernel Library | latest | Yes | apt | /opt/intel/inteloneapi/mkl | Intel Simplified Software License |
| NVIDIA CUDA SDK | 10.2 | No | N/A | Installed by user | End User License Agreement |
| AMD rocBLAS | 4.5 | No | N/A | Installed by user | AMD License |
| AMD rocRAND | 5.1.0 | No | N/A | Installed by user | AMD License |
| AMD rocSOLVER | 5.0.0 | No | N/A | Installed by user | AMD License |
| AMD rocFFT | rocm-5.4.3 | No | N/A | Installed by user | AMD License |
| NETLIB LAPACK | 3.7.1 | Yes | conan-community | ~/.conan/data or $CONAN_USER_HOME/.conan/data | BSD like license |
| Sphinx | 2.4.4 | Yes | pip | ~/.local/bin (or similar user local directory) | BSD License |
| portBLAS | 0.1 | No | N/A | Installed by user | Apache License v2.0 |

conan-center: https://api.bintray.com/conan/conan/conan-center

conan-community: https://api.bintray.com/conan/conan-community/conan


Documentation


Contributing

See CONTRIBUTING for more information.


License

Distributed under the Apache License 2.0. See LICENSE for more information.


FAQs

oneMKL

  1. What is the difference between the following oneMKL items?

Answer:

  • The oneAPI Specification for oneMKL defines the DPC++ interfaces for performance math library functions. The oneMKL specification can evolve faster and more frequently than implementations of the specification.

  • The oneAPI Math Kernel Library (oneMKL) Interfaces Project is an open source implementation of the specification. The project goal is to demonstrate how the DPC++ interfaces documented in the oneMKL specification can be implemented for any math library and work for any target hardware. While the implementation provided here may not yet be the full implementation of the specification, the goal is to build it out over time. We encourage the community to contribute to this project and help to extend support to multiple hardware targets and other math libraries.

  • The Intel(R) oneAPI Math Kernel Library (oneMKL) product is the Intel product implementation of the specification (with DPC++ interfaces) as well as similar functionality with C and Fortran interfaces, and is provided as part of Intel® oneAPI Base Toolkit. It is highly optimized for Intel CPU and Intel GPU hardware.

Conan

  1. I am behind a proxy. How can Conan download dependencies from an external network?

    • ~/.conan/conan.conf has a [proxies] section where you can add the list of proxies. For details refer to Conan proxy settings.
  2. I get an error while installing packages via APT through Conan.

    dpkg: warning: failed to open configuration file '~/.dpkg.cfg' for reading: Permission denied
    Setting up intel-oneapi-mkl-devel (2021.1-408.beta07) ...
    E: Sub-process /usr/bin/dpkg returned an error code (1)
    
    • Although your user session has permission to install packages via sudo apt, it does not have permission to update the Debian package configuration. This produces error code 1 and causes the conan install command to fail.
    • The package is most likely installed correctly and can be verified by:
      1. Running the conan install command again.
      2. Checking /opt/intel/inteloneapi for mkl and/or tbb directories.
