The Agent Framework is designed for building real-time, programmable participants that run on servers. Easily tap into LiveKit WebRTC sessions and process or generate audio, video, and data streams.
The framework includes plugins for common workflows, such as voice activity detection and speech-to-text.
Agents integrates seamlessly with LiveKit server, offloading job queuing and scheduling responsibilities to it. This eliminates the need for additional queuing infrastructure. Agent code developed on your local machine can scale to support thousands of concurrent sessions when deployed to a server in production.
This SDK is currently in Developer Preview and is not ready for production use. Expect bugs, and APIs may change during this period.
We welcome and appreciate any feedback or contributions. You can create issues here or chat live with us in the #dev channel within the LiveKit Community Slack.
A voice assistant using Deepgram STT, GPT-4, and ElevenLabs TTS
Real-time object detection using DirectAI
To install the core Agents library:
pip install livekit-agents
Agents includes a set of prebuilt plugins that make it easier to compose agents. These plugins cover common tasks such as converting speech to text, converting text to speech, and running inference on a generative AI model. You can install a plugin as follows:
pip install livekit-plugins-deepgram
The following plugins are available today:
Plugin | Features |
---|---|
livekit-plugins-deepgram | STT |
livekit-plugins-directai | Vision, object detection |
livekit-plugins-elevenlabs | TTS |
livekit-plugins-fal | Image generation |
livekit-plugins-google | STT |
livekit-plugins-nltk | Utilities for working with text |
livekit-plugins-openai | DALL-E 3, STT, TTS |
livekit-plugins-silero | VAD |
- Agent: A function that defines the workflow of a programmable, server-side participant. This is your application code.
- Worker: A container process responsible for managing job queuing with LiveKit server. Each worker is capable of running multiple agents simultaneously.
- Plugin: A library class that performs a specific task, like speech-to-text, from a specific provider. An agent can compose multiple plugins together to perform more complex tasks.
The framework exposes a CLI to run your agent. To get started, you'll need the following environment variables set:
- LIVEKIT_URL
- LIVEKIT_API_KEY
- LIVEKIT_API_SECRET
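Before launching a worker, it can help to confirm that all three variables are present. The following is a minimal sketch, not part of the framework: the helper name `missing_livekit_vars` is hypothetical, and the URL is a placeholder.

```python
import os

# The three variables listed above.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_livekit_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"LIVEKIT_URL": "wss://example.invalid"}
print(missing_livekit_vars(example_env))
# ['LIVEKIT_API_KEY', 'LIVEKIT_API_SECRET']
```

Calling `missing_livekit_vars()` with no argument checks the real process environment.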
The following command starts the worker and waits for users to connect to your LiveKit server:
python my_agent.py start
To ease the process of building and testing an agent, we've developed a versatile web frontend called "playground". You can use or modify this app to suit your specific requirements. It can also serve as a starting point for a completely custom agent application.
To join a LiveKit room that's already active, you can use the simulate-job command:
python my_agent.py simulate-job --room-name <my-room>
When you follow the steps above to run your agent, a worker is started that opens an authenticated WebSocket connection to a LiveKit server instance (defined by your LIVEKIT_URL and authenticated with an access token).
No agents are actually running at this point. Instead, the worker is waiting for LiveKit server to give it a job.
When a room is created, the server notifies one of the registered workers about a new job. The notified worker can decide whether or not to accept it. If the worker accepts the job, the worker will instantiate your agent as a participant and have it join the room where it can start subscribing to tracks. A worker can manage multiple agent instances simultaneously.
If a notified worker rejects the job or does not accept within a predetermined timeout period, the server will route the job request to another available worker.
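The dispatch behavior described above can be modeled in a few lines. This is a toy sketch of the routing idea only, not LiveKit server's actual implementation: `SketchWorker` and `dispatch` are hypothetical names, and a timeout is modeled simply as a decision of `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class SketchWorker:
    """Hypothetical stand-in for a registered worker."""
    name: str
    # Returns True to accept, False to reject, None to model a timeout.
    decide: Callable[[str], Optional[bool]]


def dispatch(room: str, workers: Sequence[SketchWorker]) -> Optional[str]:
    """Offer the job to each worker in turn; return the first that accepts."""
    for worker in workers:
        if worker.decide(room) is True:
            return worker.name
        # A rejection (False) or timeout (None) routes the job onward.
    return None  # No worker accepted the job.


workers = [
    SketchWorker("worker-a", lambda room: False),  # rejects
    SketchWorker("worker-b", lambda room: None),   # times out
    SketchWorker("worker-c", lambda room: True),   # accepts
]
print(dispatch("demo-room", workers))  # worker-c
```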
The orchestration system was designed for production use cases. Unlike the typical web server, an agent is a stateful program, so it's important that a worker can't be terminated while it's managing any active agents.
When a worker receives SIGTERM, it signals to LiveKit server that it no longer wants additional jobs. It also auto-rejects any new job requests that arrive before the server has processed that signal. The worker remains alive for as long as it is managing agents connected to rooms.
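The draining behavior can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the framework's worker code: `DrainingWorker`, its methods, and `install_handler` are hypothetical names, and notifying the server is omitted.

```python
import signal


class DrainingWorker:
    """Hypothetical model of a worker that drains on SIGTERM."""

    def __init__(self):
        self.draining = False
        self.active_agents = set()

    def handle_sigterm(self, signum=None, frame=None):
        # Stop taking new jobs; agents already running are left untouched.
        self.draining = True

    def on_job_request(self, job_id):
        """Return True if the job is accepted, False if auto-rejected."""
        if self.draining:
            return False
        self.active_agents.add(job_id)
        return True

    def on_agent_finished(self, job_id):
        self.active_agents.discard(job_id)

    def can_exit(self):
        # Safe to exit only once draining and no agents remain.
        return self.draining and not self.active_agents


def install_handler(worker):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, worker.handle_sigterm)


worker = DrainingWorker()
worker.on_job_request("job-1")          # accepted before SIGTERM
worker.handle_sigterm()                 # simulate receiving SIGTERM
print(worker.on_job_request("job-2"))   # False
print(worker.can_exit())                # False
worker.on_agent_finished("job-1")
print(worker.can_exit())                # True
```

In a real process, `install_handler(worker)` would be called once at startup so that the operating system's SIGTERM triggers `handle_sigterm`.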
Some plugins require model files to be downloaded before they can be used. To download all the necessary models for your agent, execute the following command:
python my_agent.py download-files
If you're developing a custom plugin, you can integrate this functionality by implementing a download_files method in your Plugin class:
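import torch
from livekit.agents import Plugin  # import path assumed here

# __version__ is expected to be defined by your plugin package.
class MyPlugin(Plugin):
    def __init__(self):
        super().__init__(__name__, __version__)

    def download_files(self):
        # Loading the model once fetches and caches its files locally.
        _ = torch.hub.load(
            repo_or_dir="my-repo",
            model="my-model",
        )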
LiveKit Ecosystem | |
---|---|
Real-time SDKs | React Components · JavaScript · iOS/macOS · Android · Flutter · React Native · Rust · Python · Unity (web) · Unity (beta) |
Server APIs | Node.js · Golang · Ruby · Java/Kotlin · Python · Rust · PHP (community) |
Agents Frameworks | Python · Playground |
Services | LiveKit server · Egress · Ingress · SIP |
Resources | Docs · Example apps · Cloud · Self-hosting · CLI |