  • Stars: 497
  • Rank: 88,652 (Top 2 %)
  • Language: Python
  • License: GNU Lesser Genera...
  • Created about 8 years ago
  • Updated 4 months ago

Backend.AI

Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator support including CUDA GPU, ROCm GPU, TPU, IPU and other NPUs.

It allocates and isolates the underlying computing resources for multi-tenant computation sessions on-demand or in batches with customizable job schedulers with its own orchestrator. All its functions are exposed as REST/GraphQL/WebSocket APIs.

Contents in This Repository

This repository contains all open-source server-side components and the client SDK for Python as a reference implementation of API clients.

Directory Structure

  • src/ai/backend/: Source code
    • manager/: Manager
    • manager/api: Manager API handlers
    • agent/: Agent
    • agent/docker/: Agent's Docker backend
    • agent/k8s/: Agent's Kubernetes backend
    • kernel/: Agent's kernel runner counterpart
    • runner/: Agent's in-kernel prebuilt binaries
    • helpers/: Agent's in-kernel helper package
    • common/: Shared utilities
    • client/: Client SDK
    • cli/: Unified CLI for all components
    • storage/: Storage proxy
    • storage/api: Storage proxy's manager-facing and client-facing APIs
    • web/: Web UI server
    • plugin/: Plugin subsystem
    • test/: Integration test suite
    • testutils/: Shared utilities used by unit tests
    • meta/: Legacy meta package
  • docs/: Unified documentation
  • tests/
    • manager/, agent/, ...: Per-component unit tests
  • configs/
    • manager/, agent/, ...: Per-component sample configurations
  • docker/: Dockerfiles for auxiliary containers
  • fixtures/
    • manager/, ...: Per-component fixtures for development setup and tests
  • plugins/: A directory to place plugins such as accelerators, monitors, etc.
  • scripts/: Scripts to assist development workflows
    • install-dev.sh: The single-node development setup script from the working copy
  • stubs/: Type annotation stub packages written by us
  • tools/: A directory to host Pants-related tooling
  • dist/: A directory to put build artifacts (.whl files) and Pants-exported virtualenvs
  • changes/: News fragments for towncrier
  • pants.toml: The Pants configuration
  • pyproject.toml: Tooling configuration (towncrier, pytest, mypy)
  • BUILD: The root build config file
  • **/BUILD: Per-directory build config files
  • BUILD_ROOT: An indicator to mark the build root directory for Pants
  • requirements.txt: The unified requirements file
  • *.lock, tools/*.lock: The dependency lock files
  • docker-compose.*.yml: Per-version recommended halfstack container configs
  • README.md: This file
  • MIGRATION.md: The migration guide for updating between major releases
  • VERSION: The unified version declaration

Server-side components are licensed under LGPLv3 to promote non-proprietary open innovation in the open-source community, while other shared libraries and client SDKs are distributed under the MIT license.

There is no obligation to open the source code of your service or system if you run the server-side components as-is (e.g., run them as daemons or import them into your code without modification). Please contact us (contact-at-lablup-com) for commercial consulting and more details on licensing options for individual use cases.

Getting Started

Installation for Single-node Development

Run scripts/install-dev.sh after cloning this repository.

This script checks the availability of all required dependencies, such as Docker, and bootstraps a development setup. Note that it requires sudo and a modern Python installation on a Linux (Debian/RHEL-like) or macOS host.
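The kind of dependency check described above can be illustrated with a minimal Python sketch. It is not the actual logic of scripts/install-dev.sh; the function name check_prerequisites, the command list, and the minimum Python version are assumptions chosen for illustration.

```python
import shutil
import sys

# Hypothetical minimum version; the real script's requirement may differ.
MIN_PYTHON = (3, 8)


def check_prerequisites(commands=("docker", "sudo"), min_python=MIN_PYTHON):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for cmd in commands:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(cmd) is None:
            problems.append(f"missing command: {cmd}")
    if sys.version_info[:2] < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info[0]}.{sys.version_info[1]}"
        )
    return problems
```

Calling check_prerequisites() before running the installer gives a quick list of what is missing on the host without changing anything on it.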

Installation for Multi-node Tests & Production

Please consult our documentation for community-supported materials. Contact the sales team ([email protected]) for professional paid support and deployment options.

Accessing Compute Sessions (aka Kernels)

Backend.AI provides WebSocket tunneling into individual computation sessions (containers), so that users can securely access in-container applications directly from their browsers and the client CLI.

  • Jupyter: data scientists' favorite tool
    • Most container images have intrinsic Jupyter and JupyterLab support.
  • Web-based terminal
    • All container sessions have intrinsic ttyd support.
  • SSH
    • All container sessions have intrinsic SSH/SFTP/SCP support with an auto-generated per-user SSH keypair. PyCharm and other IDEs can use on-demand sessions via SSH remote interpreters.
  • VSCode
    • Most container sessions have intrinsic web-based VSCode support.

Working with Storage

Backend.AI provides an abstraction layer on top of existing network-based storage systems (e.g., NFS/SMB), called vfolders (virtual folders). Each vfolder works like cloud storage that can be mounted into any computation session and shared between users and user groups with differentiated privileges.
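To make the idea of sharing with differentiated privileges concrete, here is a small conceptual sketch in Python. It does not reflect the actual Backend.AI data model or API; the VFolder and Permission names and the two privilege levels are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    # Hypothetical privilege levels, not the project's actual ones.
    READ_ONLY = 1
    READ_WRITE = 2


@dataclass
class VFolder:
    name: str
    owner: str
    # Maps a user name to the privilege that user was granted.
    shared_with: dict = field(default_factory=dict)

    def share(self, user: str, permission: Permission) -> None:
        self.shared_with[user] = permission

    def can_read(self, user: str) -> bool:
        # Any granted privilege implies read access.
        return user == self.owner or user in self.shared_with

    def can_write(self, user: str) -> bool:
        if user == self.owner:
            return True
        return self.shared_with.get(user) is Permission.READ_WRITE
```

For example, after folder.share("bob", Permission.READ_ONLY), bob can read the folder but can_write("bob") stays False.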

Major Components

Manager

It routes external API requests from front-end services to individual agents. It also monitors and scales the cluster of multiple agents (a few tens to hundreds).

Agent

It manages individual server instances and launches/destroys Docker containers where REPL daemons (kernels) run. Each agent on a new EC2 instance registers itself with the instance registry via heartbeats.
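Heartbeat-based registration can be sketched as follows. This is a simplified in-memory illustration, not the project's implementation; the InstanceRegistry class, its method names, and the 30-second liveness window are assumptions.

```python
import time


class InstanceRegistry:
    """Tracks agents by the time of their most recent heartbeat."""

    def __init__(self, timeout: float = 30.0, clock=time.monotonic):
        self._timeout = timeout  # hypothetical liveness window in seconds
        self._clock = clock      # injectable so the logic can be tested
        self._last_seen = {}

    def heartbeat(self, agent_id: str) -> bool:
        """Record a heartbeat; return True if it registers a new agent."""
        is_new = agent_id not in self._last_seen
        self._last_seen[agent_id] = self._clock()
        return is_new

    def alive_agents(self) -> list:
        """Return the agents whose last heartbeat is within the timeout."""
        now = self._clock()
        return sorted(
            agent_id
            for agent_id, seen in self._last_seen.items()
            if now - seen <= self._timeout
        )
```

In this sketch the first heartbeat from an unknown agent doubles as its registration, and an agent that stops sending heartbeats simply drops out of alive_agents() once the timeout passes.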

Storage Proxy

It provides a unified abstraction over multiple different network storage devices with vendor-specific enhancements such as real-time performance metrics and filesystem operation acceleration APIs.
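One common way to structure such an abstraction is a shared base interface with optional vendor-specific hooks. The sketch below shows that pattern only; the class names, methods, and metric keys are hypothetical and are not taken from the storage proxy's code.

```python
import os
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical common interface implemented by every storage backend."""

    @abstractmethod
    def create_folder(self, name: str) -> str:
        """Create a folder and return its path."""

    def performance_metrics(self) -> dict:
        # Optional vendor-specific enhancement; generic backends report nothing.
        return {}


class LocalDirBackend(StorageBackend):
    """Generic backend that keeps folders under a plain directory (e.g., an NFS mount)."""

    def __init__(self, root: str):
        self.root = root

    def create_folder(self, name: str) -> str:
        path = os.path.join(self.root, name)
        os.makedirs(path, exist_ok=True)
        return path


class VendorBackend(LocalDirBackend):
    """Stand-in for a backend whose vendor exposes extra metrics."""

    def performance_metrics(self) -> dict:
        # Placeholder values; a real backend would query the vendor's API here.
        return {"read_iops": 0, "write_iops": 0}
```

Callers work against StorageBackend only, so a backend without vendor extensions still behaves correctly and just returns an empty metrics dictionary.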

Webserver

It hosts the SPA (single-page application) packaged from our web UI codebase for end-users and basic administration tasks.

Synchronizing the backend.ai-app repository as a subtree:

$ git remote add webui-package https://github.com/lablup/backend.ai-app  # first time only
$ git subtree pull --squash --prefix=src/ai/backend/web/static webui-package main

Kernels

Computing environment recipes (Dockerfile) to build the container images to execute on top of the Backend.AI platform.

Jail

A programmable sandbox implemented using ptrace-based system call filtering written in Rust.

Hook

A set of libc overrides for resource control and web-based interactive stdin (paired with agents).

Client SDK Libraries

We offer client SDKs in popular programming languages. These SDKs are freely available under the MIT License to ease integration with both commercial and non-commercial software products and services.

Plugins

Legacy Components

These components still exist but are no longer actively maintained.

Media

The front-end support libraries to handle multimedia outputs (e.g., SVG plots, animated vector graphics).

  • The Python package (lablup) is installed inside kernel containers.
  • To interpret and display media generated by the Python package, you need to load the JavaScript part in the front-end.
  • https://github.com/lablup/backend.ai-media

IDE and Editor Extensions

We now recommend using in-kernel applications such as Jupyter Lab, Visual Studio Code Server, or native SSH connection to kernels via our client SDK or desktop apps.

Python Version Compatibility

Backend.AI Core Version | Compatible Python Version
23.03.x                 | 3.11.x
22.03.x / 22.09.x       | 3.10.x
21.03.x / 21.09.x       | 3.8.x
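The compatibility table above can also be expressed as a small lookup, which may be handy in setup scripts. This is an illustrative sketch; the names PYTHON_COMPAT and required_python are made up here, and only the version pairs come from the table.

```python
# Core release series -> compatible Python series, copied from the table above.
PYTHON_COMPAT = {
    "23.03": "3.11",
    "22.09": "3.10",
    "22.03": "3.10",
    "21.09": "3.8",
    "21.03": "3.8",
}


def required_python(core_version: str) -> str:
    """Return the compatible Python series for a core version such as '22.09.4'."""
    # Keep only the 'YY.MM' part, dropping the patch component.
    series = ".".join(core_version.split(".")[:2])
    try:
        return PYTHON_COMPAT[series]
    except KeyError:
        raise ValueError(f"unknown Backend.AI core series: {series}") from None
```

Versions not listed in the table raise ValueError rather than guessing a Python version.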

License

Refer to the LICENSE file.
