• Stars
    star
    750
  • Rank 60,494 (Top 2 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created about 1 year ago
  • Updated 4 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Build real-time multimodal AI applications πŸ€–πŸŽ™οΈπŸ“Ή
The LiveKit icon, the name of the repository and some sample code in the background.

LiveKit Agents

The Agent Framework is designed for building real-time, programmable participants that run on servers. Easily tap into LiveKit WebRTC sessions and process or generate audio, video, and data streams.

The framework includes plugins for common workflows, such as voice activity detection and speech-to-text.

Agents integrates seamlessly with LiveKit server, offloading job queuing and scheduling responsibilities to it. This eliminates the need for additional queuing infrastructure. Agent code developed on your local machine can scale to support thousands of concurrent sessions when deployed to a server in production.

This SDK is currently in Developer Preview mode and not ready for production use. There will be bugs and APIs may change during this period.

We welcome and appreciate any feedback or contributions. You can create issues here or chat live with us in the #dev channel within the LiveKit Community Slack.

Docs & Guides

Examples

KITT

An voice assistant using DeepGram STT, GPT-4, and ElevenLabs TTS

Object Detection

Real-time object detection using DirectAI

Installation

To install the core Agents library:

pip install livekit-agents

Agents includes a set of prebuilt plugins that make it easier to compose together agents. These plugins cover common tasks like converting speech to text or vice versa and running inference on a generative AI model. You can install a plugin as follows:

pip install livekit-plugins-deepgram

The following plugins are available today:

Plugin Features
livekit-plugins-deepgram STT
livekit-plugins-directai Vision, object detection
livekit-plugins-elevenlabs TTS
livekit-plugins-fal Image generation
livekit-plugins-google STT
livekit-plugins-nltk Utilities for working with text
livekit-plugins-openai Dalle 3, STT, TTS
livekit-plugins-silero VAD

Concepts

  • Agent: A function that defines the workflow of a programmable, server-side participant. This is your application code.
  • Worker: A container process responsible for managing job queuing with LiveKit server. Each worker is capable of running multiple agents simultaneously.
  • Plugin: A library class that performs a specific task, like speech-to-text, from a specific provider. An agent can compose multiple plugins together to perform more complex tasks.

Running an agent

The framework exposes a CLI interface to run your agent. To get started, you'll need the following environment variables set:

  • LIVEKIT_URL
  • LIVEKIT_API_KEY
  • LIVEKIT_API_SECRET

Running the worker

This will start the worker and wait for users to connect to your LiveKit server:

python my_agent.py start

Using playground for your agent UI

To ease the process of building and testing an agent, we've developed a versatile web frontend called "playground". You can use or modify this app to suit your specific requirements. It can also serve as a starting point for a completely custom agent application.

Joining a specific room

To join a LiveKit room that's already active, you can use the simulate-job command:

python my_agent.py simulate-job --room-name <my-room>

What happens when I run my agent?

When you follow the steps above to run your agent, a worker is started that opens an authenticated WebSocket connection to a LiveKit server instance(defined by your LIVEKIT_URL and authenticated with an access token).

No agents are actually running at this point. Instead, the worker is waiting for LiveKit server to give it a job.

When a room is created, the server notifies one of the registered workers about a new job. The notified worker can decide whether or not to accept it. If the worker accepts the job, the worker will instantiate your agent as a participant and have it join the room where it can start subscribing to tracks. A worker can manage multiple agent instances simultaneously.

If a notified worker rejects the job or does not accept within a predetermined timeout period, the server will route the job request to another available worker.

What happens when I SIGTERM a worker?

The orchestration system was designed for production use cases. Unlike the typical web server, an agent is a stateful program, so it's important that a worker can't be terminated while it's managing any active agents.

When calling SIGTERM on a worker, the worker will signal to LiveKit server that it no longer wants additional jobs. It will also auto-reject any new job requests that get through before the server signal is received. The worker will remain alive while it manages any agents connected to rooms.

Downloading model files

Some plugins require model files to be downloaded before they can be used. To download all the necessary models for your agent, execute the following command:

python my_agent.py download-files

If you're developing a custom plugin, you can integrate this functionality by implementing a download_files method in your Plugin class:

class MyPlugin(Plugin):
    def __init__(self):
        super().__init__(__name__, __version__)

    def download_files(self):
        _ = torch.hub.load(
            repo_or_dir="my-repo",
            model="my-model",
        )


LiveKit Ecosystem
Real-time SDKsReact Components Β· JavaScript Β· iOS/macOS Β· Android Β· Flutter Β· React Native Β· Rust Β· Python Β· Unity (web) Β· Unity (beta)
Server APIsNode.js Β· Golang Β· Ruby Β· Java/Kotlin Β· Python Β· Rust Β· PHP (community)
Agents FrameworksPython Β· Playground
ServicesLivekit server Β· Egress Β· Ingress Β· SIP
ResourcesDocs Β· Example apps Β· Cloud Β· Self-hosting Β· CLI

More Repositories

1

livekit

End-to-end stack for WebRTC. SFU media server and SDKs.
Go
9,214
star
2

client-sdk-js

LiveKit browser client SDK (javascript)
TypeScript
329
star
3

client-sdk-flutter

Flutter Client SDK for LiveKit
Dart
238
star
4

livekit-cli

Command line interface to LiveKit
Go
199
star
5

server-sdk-go

Client and server SDK for Golang
Go
189
star
6

client-sdk-swift

LiveKit Swift Client SDK. Easily build live audio or video experiences into your mobile app, game or website.
Swift
182
star
7

rust-sdks

LiveKit real-time and server SDKs for Rust
Rust
173
star
8

client-sdk-android

LiveKit SDK for Android
Kotlin
169
star
9

livekit-react

React component and library for LiveKit
TypeScript
169
star
10

egress

Export and record WebRTC sessions and tracks
Go
166
star
11

components-js

Official open source React components and examples for building with LiveKit.
TypeScript
145
star
12

node-sdks

LiveKit real-time and server SDKs for Node.JS
TypeScript
124
star
13

client-sdk-react-native

TypeScript
98
star
14

python-sdks

LiveKit real-time and server SDKs for Python
Python
80
star
15

sip

SIP to WebRTC bridge for LiveKit
Go
76
star
16

protocol

LiveKit protocol. Protobuf definitions for LiveKit's signaling protocol
Go
69
star
17

ingress

Ingest streams (RTMP/WHIP) or files (HLS, MP4) to LiveKit WebRTC
Go
68
star
18

agents-playground

TypeScript
61
star
19

client-sdk-unity-web

Client SDK for Unity WebGL
C#
50
star
20

livekit-helm

LiveKit Helm charts
Smarty
47
star
21

livekit-recorder

Go
32
star
22

client-sdk-unity

C#
31
star
23

track-processors-js

TypeScript
29
star
24

server-sdk-kotlin

Kotlin
28
star
25

livekit-server-sdk-python

LiveKit Server SDK for Python
Python
25
star
26

psrpc

Go
22
star
27

client-example-swift

Example app for LiveKit Swift SDK πŸ‘‰ https://github.com/livekit/client-sdk-swift
Swift
21
star
28

server-sdk-ruby

LiveKit Server SDK for Ruby
Ruby
21
star
29

meet

Open source video conferencing app built on LiveKit Components, LiveKit Cloud, and Next.js.
TypeScript
16
star
30

client-unity-demo

Demo for LiveKit Unity SDK
C#
11
star
31

client-sdk-cpp

C++
11
star
32

WebRTC-swift

Swift package for WebRTC
Objective-C
10
star
33

livekit-docs

JavaScript
10
star
34

deploy

Resources for deploying LiveKit
Go
9
star
35

webrtc-vmaf

VMAF benchmarking tool for WebRTC codecs
Python
9
star
36

chrometester

A livekit tester to simulate a subscriber in a room, uses headless Chromium
JavaScript
8
star
37

gstreamer-publisher

Command-line app that publishes any GStreamer pipeline to LiveKit
Go
8
star
38

nats-test

benchmarking app that emulates our usage patterns
Go
7
star
39

mediatransportutil

Media transport utilities
Go
6
star
40

rtcscore-go

Library to calculate Mean Opinion Score(MOS)
Go
4
star
41

components-swift

Swift
4
star
42

signal-proxy

Go
3
star
43

client-example-collection-swift

A collection of small examples for the LiveKit Swift SDK πŸ‘‰ https://github.com/livekit/client-sdk-swift
Swift
3
star
44

webrtc-chrome

Fork of Google's WebRTC repo, with LiveKit patches.
C++
3
star
45

components-android

Kotlin
3
star
46

gst-plugins

Go
3
star
47

client-sdk-react-native-expo-plugin

TypeScript
2
star
48

webrtc-xcframework

Ruby
2
star
49

mageutil

Go
2
star
50

ios-test-apps

Swift
1
star
51

gstreamer

C
1
star
52

websocket-bridge

Send and Receive Media to LiveKit via WebSocket
Go
1
star