  • Stars: 331
  • Rank: 127,323 (Top 3%)
  • Language: C++
  • License: MIT License
  • Created: over 3 years ago
  • Updated: 9 months ago

Repository Details

Multi-platform GPU-accelerated interface for compute-intensive pipelines.

💨 CyberEther (Work-In-Progress)

Portable and heterogeneously-accelerated graphical visualization for radio signals.

Features

  • 🦑 Meant to be extremely portable (NVIDIA, Apple M1/M2, Raspberry Pi, AMD).
  • 🖼️ Meant to support the most common heterogeneous APIs (CUDA, Vulkan, Metal, WebGPU).
  • 🔋 Runtime-configurable, hardware-accelerated heterogeneous computing.
  • 🧳 Compatible with GNU Radio via gr-cyber.

Compatibility

Devices

| Device                          | Metal       | Vulkan            | WebGPU      |
|---------------------------------|-------------|-------------------|-------------|
| macOS (Apple Silicon)           | ✅ (Native) | ✅ (via MoltenVK) | ✅ (Dawn*)  |
| iOS/iPadOS                      | ✅ (Native) | ✅ (via MoltenVK) | N/A         |
| Linux (NVIDIA/AMD/Intel)        | N/A         | ✅ (Native)       | ✅ (Dawn*)  |
| Windows (NVIDIA/AMD/Intel)      | N/A         | ✅ (Native*)      | ✅ (Dawn*)  |
| Android                         | N/A         | ✅ (Native*)      | ✅ (Dawn*)  |
| Browser (WebKit/Chrome/Firefox) | N/A         | N/A               | ✅ (Chrome) |

* Not validated yet.

Graphical Modules

| Module        | Metal        | Vulkan         | WebGPU         | CPU+Render         |
|---------------|--------------|----------------|----------------|--------------------|
| Lineplot      | ✅ (Full)    | ✅ (Graphical) | ✅ (Graphical) | ✅ (Full)          |
| Waterfall     | ✅ (Full)    | ✅ (Graphical) | ✅ (Graphical) | ✅ (Full)          |
| Spectrogram   | ✅ (Full)    | ✅ (Graphical) | ✅ (Graphical) | ✅ (Slow but full) |
| Constellation | ❌ (Porting) | ❌ (Porting)   | ❌ (Porting)   | ✅ (Slow but full) |

Compute Modules

| Module    | CPU | CUDA         | Metal | Vulkan    | Description                   |
|-----------|-----|--------------|-------|-----------|-------------------------------|
| Amplitude | ✅  | ❌ (Porting) | ✅    | ❌ (Next) | Complex data to power.        |
| FFT       | ✅  | ❌ (Porting) | ✅    | ❌ (Next) | Channelization.               |
| Multiply  | ✅  | ❌ (Porting) | ✅    | ❌ (Next) | Vector multiplication.        |
| Scale     | ✅  | ❌ (Porting) | ✅    | ❌ (Next) | Scaling vector by factor.     |
| Window    | ✅  | ❌ (Porting) | ✅    | ❌ (Next) | Apply a window to vector.     |
| Filter    | ✅  | ❌ (Porting) | ✅    | ❌ (Next) | Apply a FIR Filter to vector. |
| Soapy     | ✅  | ❌ (Porting) | ✅    | ❌ (Next) | Simple SoapySDR tap.          |

Installation

The only way to use CyberEther at the moment is to compile it from source. This isn't difficult, since there are few dependencies. See INSTALL.md for the installation process.

Required Dependencies

  • C++20 Compiler (GCC-10 or higher)
  • Meson (cmake but better)
  • Ninja (make but faster)

Optional Dependencies

  • GLFW (if compiling graphical modules)
  • FFTW (if compiling FFT module)
  • Metal (if compiling Metal backend)
  • CUDA (if compiling CUDA backend)

Implementation Notes

QT GUI for GNU Radio

CyberEther currently runs inside a GLFW window, but it could also be rendered as a QGLSurface inside a Qt application parent (the standard GNU Radio window). The plan is to add support for this as soon as the API becomes more stable.

Development Speed

This is a side project that I develop in my free time, and it requires a considerable amount of work. To increase development speed I'm making some sacrifices: new features are developed first for Metal (because I have an Apple laptop) and ported to other APIs later, documentation is scarce, and there is currently no font rendering support.

Jetstream Render

An important goal of this project is to be as portable as possible. This requires the code to support multiple heterogeneous computing APIs (Metal, Vulkan, OpenGL, etc.). To reduce code duplication, CyberEther builds on top of a custom Jetstream::Render engine that abstracts vendor-specific methods behind a single coherent API. The render can switch graphical backends at runtime with a simple toggle.

// Initialize Render
Render::Window::Config renderCfg;
renderCfg.size = {3130, 1140};
renderCfg.resizable = true;
renderCfg.imgui = true;
renderCfg.vsync = true;
renderCfg.title = "CyberEther";

// Create a render with the Metal API (Apple GPUs) backend.
Render::Initialize<Device::Metal>(renderCfg);

// -- OR --

// Create a render with the Vulkan API (Generic GPUs) backend.
Render::Initialize<Device::Vulkan>(renderCfg);
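
Since both calls above take the same configuration, the runtime toggle can be a minimal dispatch over a user-selected device. The sketch below only reuses the two Initialize calls shown above; the selectedDevice variable and how it gets chosen are assumptions for illustration.

// Minimal sketch of a runtime backend toggle (selectedDevice is hypothetical).
switch (selectedDevice) {
    case Device::Metal:  Render::Initialize<Device::Metal>(renderCfg);  break;
    case Device::Vulkan: Render::Initialize<Device::Vulkan>(renderCfg); break;
    default: break;
}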

Windowing System

Following the "be as portable as possible" directive, Jetstream renders each plot into a headless framebuffer texture. This means the buffer can be bound to a QtView or a Cocoa window. This is practical, easy to implement, and doesn't force you into our window manager. To make development easier, Jetstream uses GLFW to create a window and displays multiple plots with the ImGui windowing library, but neither is required. Hell, if you want, you can even share the framebuffer texture with another process on Linux using DMA-BUF. Again, ImGui is only used to display the rendered framebuffer; no plot drawing is done with it, unlike other implementations.
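
For illustration, displaying such a headless framebuffer texture through ImGui amounts to handing its handle to ImGui::Image. The getTextureHandle() accessor below is an assumption for the sketch, not the actual Jetstream API.

// Hypothetical sketch: show a plot's headless framebuffer texture in an ImGui window.
// getTextureHandle() is an assumed accessor, not the real Jetstream API.
ImGui::Begin("Lineplot");
ImGui::Image(reinterpret_cast<ImTextureID>(lineplot->getTextureHandle()),
             ImGui::GetContentRegionAvail());
ImGui::End();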

Font Rendering

Font rendering is known to be a pain to implement. This feature will require some engineering work to get right, and once implemented it will be hard to change. Therefore, I'm choosing to put it off for the sake of development speed at this early stage of the project.

Computing While Rendering

Sharing data with the computation pipeline while rendering requires synchronization to make sure the data is not mutated during presentation. This synchronization has to be precise and lightweight so it doesn't cause stutter. It calls for mutually exclusive synchronization (compute vs. present) with priority given to the rendering calls. Blocking mechanisms such as std::condition_variable were tried first, but they turned out to be quite slow on some hardware, particularly ARM devices (Jetson/RPi). The solution appears to be a C++20 addition, std::atomic_flag's wait/notify interface, which allows the underlying implementation to use non-blocking mechanisms. This usually translates to better performance, but it requires C++20. You can check out this implementation here.
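
A minimal sketch of the mechanism, assuming nothing about the actual CyberEther scheduler: a tiny guard built on std::atomic_flag's C++20 wait/notify interface that either side holds while touching the shared buffer (the render-priority policy is omitted here).

#include <atomic>

// Illustrative only: a minimal guard using the C++20 std::atomic_flag
// wait/notify API. The flag is set while one side owns the shared buffer.
struct FrameGuard {
    std::atomic_flag busy;  // default-initialized to clear since C++20

    void acquire() {
        // If the flag is already set, sleep until the owner clears it.
        while (busy.test_and_set(std::memory_order_acquire)) {
            busy.wait(true, std::memory_order_relaxed);
        }
    }

    void release() {
        busy.clear(std::memory_order_release);
        busy.notify_one();  // wake a waiting compute or present call
    }
};

// Usage: the present path wraps its reads in acquire()/release(), and the
// compute path does the same around buffer mutation.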

Data Handling

Turns out, you need data to display data. Our way of handling data is the Jetstream::Vector. This is basically a std::span made with heterogeneous computing in mind. The main problem with std::span is that you don't know which device your data is local to. Sure, you can create a std::span from a GPU memory pointer, but how does the receiver know that when you pass the span along? It takes extra runtime logic to communicate the location of the buffer, which gets confusing. That is not an optimal place to be in an environment with multiple heterogeneous devices using incoherent memory. To solve this, Jetstream::Vector takes a Jetstream::Device template parameter that specifies the locality of the data. For example, data inside a CUDA GPU is represented as Vector<Device::CUDA>. This ensures that whoever receives the vector knows how to handle it properly at compile time (no runtime overhead and less code). The syntax also works in an environment like CUDA Unified Memory with Vector<Device::CUDA | Device::CPU>, where the data can be accessed from either the GPU or the CPU using the same pointer.
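
A minimal sketch of why the device tag helps, assuming a Vector constructor that takes a size; runGpuKernel and runCpuFallback are hypothetical names, not the actual API. Mismatched locality becomes a compile-time error instead of a runtime surprise.

// Hypothetical sketch: device-tagged vectors make locality part of the type.
void runGpuKernel(const Vector<Device::CUDA>& input);   // expects GPU-local data
void runCpuFallback(const Vector<Device::CPU>& input);  // expects CPU-local data

Vector<Device::CUDA> gpuBuffer(2048);
Vector<Device::CPU> cpuBuffer(2048);

runGpuKernel(gpuBuffer);     // OK: locality matches.
// runGpuKernel(cpuBuffer);  // Would not compile: wrong device locality.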

Jetstream Compute

Sometimes, some data processing is required to display a signal. Like the render, Jetstream::Compute offers a way to abstract the computation away from the hardware libraries (CPU, CUDA, Vulkan, Metal, etc.). For this purpose, Jetstream separates computation into blocks. These blocks can be overloaded with different API implementations using the Jetstream::Device enum. For example, below is an FFT block with a CUDA and a CPU (FFTW) implementation.

// Instantiate the FFT block with the CPU (FFTW) implementation.
// This expects the input buffer to be local to the CPU (Vector<Device::CPU>).
fft = Block<FFT, Device::CPU>({
    .size = stream.size(),
}, {
    .buffer = mul->getProductBuffer(),
});

// Instantiate the FFT block with the CUDA implementation.
// This expects the input buffer to be local to the GPU (Vector<Device::CUDA>).
fft = Block<FFT, Device::CUDA>({
    .size = stream.size(),
}, {
    .buffer = mul->getProductBuffer(),
});

To accommodate some APIs that require initialization, the backend is initialized with the following code. This will create a global singleton class that can be accessed by a compute module to register a new kernel or to bind a compute buffer to a render texture. Multiple backends can be initialized at the same time.

// Initialize compute API for Metal (Apple GPUs).
Jetstream::Backend::Initialize<Device::Metal>({});

// Initialize compute API for CUDA (NVIDIA GPUs).
Jetstream::Backend::Initialize<Device::CUDA>({});

Wrangling Different Asynchronous Compute APIs

Every API offers a different way to synchronize work, and you can't synchronize asynchronous work from multiple different APIs at once. CUDA deals with this using streams, while Metal and Vulkan use command encoders. Honestly, this problem gives me a headache and I'm still not sure how to handle it properly. For now, the scheduler groups sequential blocks that use the same API and runs the groups one after another. For example, suppose a pipeline with six blocks: blocks 1, 2, and 3 use CUDA, block 4 uses Metal, and blocks 5 and 6 use CUDA. The scheduler would create a CUDA Graph on asynchronous stream #1, enqueue work (from B1, B2, and B3), wait for completion, create a Metal compute encoder, enqueue work (from B4), wait for completion, create a CUDA Graph on asynchronous stream #2, enqueue work (from B5 and B6), wait for completion, and repeat. It's not clear whether this is the best approach, but it should guarantee that no computation happens while the input buffer is being read.
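
A rough sketch of that grouping policy, purely illustrative; Block, device(), enqueue(), and waitForCompletion() are placeholders, not the actual scheduler API.

#include <vector>

// Illustrative sketch: group sequential blocks that share a device, then run
// each group to completion before starting the next one.
std::vector<std::vector<Block*>> groups;
for (Block* block : pipeline) {
    if (groups.empty() || groups.back().back()->device() != block->device()) {
        groups.push_back({});  // start a new group when the device changes
    }
    groups.back().push_back(block);
}

for (auto& group : groups) {
    // e.g. one CUDA Graph or one Metal compute encoder per group...
    for (Block* block : group) {
        block->enqueue();
    }
    // ...then wait for the whole group before touching the next device.
    waitForCompletion(group);
}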

Samurai

The original idea behind Samurai was to provide a better C++ interface than SoapySDR. The problem is that maintaining SDR driver support is quite difficult, and SoapySDR has become the standard implementation. I quite like the Samurai API, but I might replace it with SoapySDR in the future, or at least write a wrapper around SoapySDR with Samurai's API style.

Development Quicklogs

September 13, 2022

Some refactoring has recently been done to support other compute APIs like Vulkan and Metal. As of now, only the Metal implementation works out of the box, but a CUDA implementation from before the refactor is available and should be ported to the current upstream in the coming weeks.

More Repositories

1. pisdr-image: 🥧 A SDR Linux Distro for the Raspberry Pi and other SBCs. Compatible out of the box with multiple SDRs. (Shell, 889 stars)
2. CyberRadio: 📻 An SDR Based FM/AM Radio For Desktop. Accelerated with #cuSignal and Numba. (Python, 193 stars)
3. kimera: Low-latency hardware-accelerated codec-based video streaming utility. (C, 147 stars)
4. pico-stuff: I add my Pi Pico (RP2040) stuff here. (C, 140 stars)
5. software-radar: Software Defined Radar with GNURadio Companion. (Python, 61 stars)
6. webusb-libusb: Translation layer from libusb to webusb. (C, 56 stars)
7. Raspberry-Pi-RFID: A Socket.io based RC522 RFID Reader Demo for Raspberry Pi. (HTML, 52 stars)
8. radio-core: 📔 Python based DSP for CyberRadio and PyRadio. Accelerated with #cuSignal and Numba. (Python, 50 stars)
9. blade: Beamforming & Stuff™ (C++, 42 stars)
10. PyRadio: 🤓 Python Scripts for WBFM and AM demodulation. Accelerated with #cuSignal and Numba. (40 stars)
11. LatexBot: Automatically generate images from LaTeX expressions on Telegram. (JavaScript, 33 stars)
12. weatherdump: Multi-platform toolkit for weather satellite reception and decoding. Available for Linux, macOS, Windows and Raspberry Pi. (Go, 33 stars)
13. ESP8266-RTOS-HomeKit: ESP8266 HomeKit Accessory Implementation based on FreeRTOS. (C, 24 stars)
14. CyberRadioBlast: A browser based SDR console. (JavaScript, 20 stars)
15. WattsUp (Swift, 11 stars)
16. gr-satellites-demod: GNU Radio Scripts to Demodulate Satellite Signals. (9 stars)
17. gr-cyber (CMake, 8 stars)
18. node-moodle-scraper: Node Moodle Scraper. (JavaScript, 5 stars)
19. gr-teleskop (CMake, 5 stars)
20. khronos-gnss: Khronos GNSS Frequency Standard & NTP Server. (C, 4 stars)
21. weatherdump-desktop (JavaScript, 4 stars)
22. GypsyBot: https://telegram.me/GypsyBot (JavaScript, 3 stars)
23. cpp-goesdump: Satellite Data Stream Parser for XRIT (NOAA-GOES). (C++, 3 stars)
24. DSP-GUITools: X11 GUI Tools for Digital Signal Processing. (C++, 2 stars)
25. Node-Telegram-Bot (JavaScript, 2 stars)
26. DeepSeed-MLDM2019: TensorFlow Convolutional Neural Network, TFRecords Based. (Python, 2 stars)
27. molten-signal: [WIP] Molten Signal is a C++ backed Python library that draws an FFT waterfall on the screen with Vulkan. (C++, 2 stars)
28. Convert-It: Simple Fahrenheit to Celsius using Cocoa. v0.2. (Objective-C, 2 stars)
29. turboseti-stream (Python, 2 stars)
30. prometheus-iio-exporter: Prometheus exporter for the IIO sensor values of the RFSoC. (Go, 2 stars)
31. youtube-dl-info: Simple youtube-dl info module. (JavaScript, 2 stars)
32. holoscan-demos (C++, 2 stars)
33. SETI (Python, 1 star)
34. dfa-minimizer (Python, 1 star)
35. blast (JavaScript, 1 star)
36. ata-datum (Python, 1 star)
37. weatherdump-report: Analytics server for the WeatherDump. (Go, 1 star)
38. Grabber: Download media from a URL with an extension on iOS. (Swift, 1 star)
39. beam-calc (Python, 1 star)
40. gofast (Go, 1 star)
41. cygnus-starfinder: Cygnus Starfinder, a Python OpenCV Sextant. (Python, 1 star)
42. utfpr-sorting-algorithms (C, 1 star)
43. sdrOS (1 star)
44. Mosaic (Swift, 1 star)
45. Minutemen: https://app.luigifreitas.me/minutemen (Swift, 1 star)