  • Stars: 129
  • Rank: 279,262 (Top 6%)
  • Language: Python
  • Created: over 1 year ago
  • Updated: over 1 year ago

Repository Details

A script that uses the Telnyx API to make bulk calls to a given list of numbers and analyzes the audio recordings with Whisper. This way, we don't have to read all the transcriptions.

Call-and-Transcribe Application Documentation

This application leverages the Telnyx API for call management and the OpenAI API for transcription. It initiates multiple calls to a list of numbers, plays a sound during each call, records the calls, and transcribes the recordings. The transcriptions, along with other call details, are written to a TSV (tab-separated values) file.

You can find the backstory of this app in the tweet below:

https://twitter.com/yigitkonur/status/1654827917845860353

Application Setup

The application is developed in Python and uses the Flask framework for managing incoming webhooks. It also uses the Telnyx and OpenAI APIs, so you need to have valid API keys for these services.

Key Functions

  1. call_and_play_sound(number): Initiates a call to a given number and stores the call information in a global dictionary. The actual sound playback starts after the call is answered.

  2. transcribe_call(call_control_id): Retrieves the recording of a completed call, transcribes it using the OpenAI API, and writes the result (along with the caller and recipient numbers and call duration) to a TSV file.

  3. load_numbers(filename): Loads a list of phone numbers from a text file, where each line contains one number.

  4. process_number(number): A wrapper for call_and_play_sound(number). This function is designed to be used with multi-threading.

  5. multi_threaded_call(numbers): Uses a thread pool to manage multiple calls simultaneously.

  6. display_progress(numbers): Uses the Rich library to display a progress bar and a table summarizing the call results.
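The threading layer described above can be sketched as follows. This is a simplified illustration, not the script's exact code: `process_number` here returns a stub result instead of wrapping the real `call_and_play_sound`, which would hit the Telnyx API.

```python
from concurrent.futures import ThreadPoolExecutor

def load_numbers(filename):
    """Read one phone number per line, skipping blank lines."""
    with open(filename) as f:
        return [line.strip() for line in f if line.strip()]

def process_number(number):
    # In the real script this wraps call_and_play_sound(number),
    # which calls the Telnyx API; stubbed out here for illustration.
    return f"called {number}"

def multi_threaded_call(numbers, max_workers=5):
    """Fan the number list out across a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_number, numbers))
```

Because `pool.map` preserves input order, results line up with the original number list even though the calls run concurrently.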

Webhook Routes

  1. @app.route("/webhook", methods=["POST"]): Handles various call events, such as call initiation, answer, and hangup. It starts audio playback and recording when a call is answered, and triggers transcription when a call is hung up.

  2. @app.route("/webhook/call-recording-saved", methods=["POST"]): Handles the event when a call recording is saved on the Telnyx platform. It downloads the audio file and transcribes it.
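A minimal sketch of the main webhook route, assuming Telnyx's standard event envelope (`data.event_type`, `data.payload.call_control_id`). `start_playback_and_recording` is a hypothetical placeholder for the playback/recording calls; `transcribe_call` refers to the function described above.

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    # Telnyx wraps every event in a "data" envelope.
    event = request.get_json()["data"]
    event_type = event["event_type"]
    call_control_id = event["payload"]["call_control_id"]
    if event_type == "call.answered":
        # Start audio playback and recording (hypothetical helper).
        start_playback_and_recording(call_control_id)
    elif event_type == "call.hangup":
        # The call is over; kick off transcription.
        transcribe_call(call_control_id)
    return "", 200
```

Returning 200 promptly matters here: Telnyx retries webhooks it considers failed, so slow transcription work is better deferred to a background task.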

Usage

To use the application, follow these steps:

  1. Place the list of phone numbers (one per line) in a text file.

  2. Run the application. Make sure to set up an environment where the Telnyx and OpenAI APIs are accessible.

  3. The application will start calling the numbers in the list, playing a sound file on each call, recording the call, and transcribing the call once it's completed.

  4. The transcription results are stored in a TSV file named results.tsv, with the columns "From Number", "To Number", "Text", and "Total Call Duration".
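The TSV output from step 4 can be produced with the standard `csv` module. This helper is a sketch of the idea, not the script's exact code; the column names match those listed above.

```python
import csv
import os

def append_result(path, from_number, to_number, text, duration_secs):
    """Append one call result to the TSV, writing the header row on first use."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        if new_file:
            writer.writerow(["From Number", "To Number", "Text", "Total Call Duration"])
        writer.writerow([from_number, to_number, text, duration_secs])
```

Tab-delimited output keeps the transcription text intact even when it contains commas, which is why the script uses TSV rather than CSV.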

Important Notes

Make sure to replace the placeholder xxxxxxxxxxxxxxx values in the code with your actual API keys and other required values, such as the Telnyx connection_id, the 'from' number for outgoing calls, and the URL of the sound file.

Also, note that this application assumes the Flask server and your Telnyx account are properly configured to handle the webhook events.

Audio Transcription Application Documentation

The second Python script (transcriber.py) transcribes audio files in bulk using OpenAI's API. It searches for audio files in a specified directory, transcribes each one, and stores the transcriptions in a tab-separated values (TSV) file. It supports several audio formats, including mp3, mp4, mpeg, mpga, m4a, wav, and webm.

Key Functions

  1. transcribe_audio(filename): This function transcribes a given audio file using OpenAI's API. It applies an exponential backoff strategy to handle possible exceptions during the transcription process.
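The exponential-backoff strategy can be sketched as a small retry wrapper. The delays double on each failed attempt (1s, 2s, 4s, ...); the `openai.Audio.transcribe` call shown is the legacy OpenAI Python client interface, and the exact retry parameters are illustrative, not the script's actual values.

```python
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying with exponentially growing delays on any exception."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * 2 ** attempt)

def transcribe_audio(filename):
    # Hypothetical wrapper around the OpenAI transcription call.
    def call():
        with open(filename, "rb") as audio:
            return openai.Audio.transcribe("whisper-1", audio)["text"]
    return with_backoff(call)
```

Backoff matters here because the OpenAI API rate-limits bursts of concurrent requests; doubling the wait between attempts lets transient 429 errors clear on their own.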

Workflow of the Script

  1. The script begins by setting the OpenAI API key. Make sure to replace 'sk-xxxxxxxxxx' with your actual OpenAI API key.

  2. The directory where the audio files are stored is specified. Change this to the directory path on your local machine where your audio files are stored.

  3. It then prepares a list of all audio files in the specified directory.

  4. A TSV file named transcriptions.tsv is created to store the transcriptions. Each row in the file represents an audio file and contains two fields: the filename and the transcription of the text in the audio file.

  5. A progress bar is created using the Rich library to track the transcription process.

  6. The script uses a ThreadPoolExecutor to concurrently transcribe multiple audio files. For each completed transcription, the result is written to the TSV file and the progress bar is updated.
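The workflow above can be sketched end to end. This is a simplified version: the `transcribe` function is passed in rather than hard-coded (the real script calls OpenAI there), and the progress display is reduced to a `print` where the real script updates a Rich progress bar.

```python
import csv
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

# Extensions the script treats as audio files.
AUDIO_EXTS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}

def transcribe_directory(directory, transcribe, out_path="transcriptions.tsv", max_workers=3):
    """Transcribe every supported audio file in `directory` concurrently,
    writing a filename/transcription row to the TSV as each future completes."""
    files = [p for p in Path(directory).iterdir() if p.suffix.lower() in AUDIO_EXTS]
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(transcribe, p): p for p in files}
            for done, future in enumerate(as_completed(futures), start=1):
                writer.writerow([futures[future].name, future.result()])
                # The real script advances a Rich progress bar here.
                print(f"{done}/{len(files)} transcribed")
    return len(files)
```

Writing rows from the main thread as futures complete avoids any locking around the TSV file, at the cost of output order depending on which transcriptions finish first.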

Usage

To use this script, ensure that you have the OpenAI Python client set up correctly and that the OPENAI_API_KEY environment variable is set to your OpenAI API key. Also, make sure to replace the directory path with the path of the directory that contains the audio files you want to transcribe.

Then, simply run the script. It will transcribe all the audio files in the specified directory and write the transcriptions to a TSV file named transcriptions.tsv.

Important Note

This script uses a max_workers value of 3 for the ThreadPoolExecutor. This means that it will transcribe up to three audio files concurrently. You can adjust this value as needed based on the capabilities of your machine and the rate limits of your OpenAI API key.

Also, the script assumes that your OpenAI API key has sufficient privileges to use the Audio API for transcribing audio files.

More Repositories

  1. write-into-menubar (Shell, 34 stars)
     Write whatever you want into the OSX menubar by using BitBar + Alfred Powerpack.

  2. context-aware-srt-translation-gpt (Python, 27 stars)
     A repository trying to translate subtitles with GPT 3.5 Turbo without losing context (using the dynamic window context method).

  3. swift-ocr-llm-powered-pdf-to-markdown (Python, 24 stars)
     An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing and batching to deliver high-quality text extraction from complex PDF documents. Ideal for businesses seeking efficient document digitization and data extraction solutions.

  4. data-preparation-for-fine-tuning (Python, 14 stars)
     A Python project for preparing and analyzing datasets from JSONL files. It includes tools for shuffling, categorizing, and generating reports on dataset content.

  5. rust-hyper-load-balanced-api-client (Rust, 12 stars)
     A high-performance Rust tool for sending API requests (to LLMs in my case) with built-in weighted load balancing, retry mechanisms, and rate limiting. Using hyper for fast request handling, it manages large volumes of asynchronous requests and is optimized for 10K requests per second.

  6. fastapi-http-proxy-with-caching (Python, 9 stars)
     A FastAPI-based HTTP proxy with request caching using Redis, designed to forward requests while caching responses for efficient repeated queries.

  7. bulk-openai-embeddings-creator (Python, 8 stars)
     A multi-threaded script & CLI tool for generating embeddings from text using multiple OpenAI endpoints. Supports resuming from previously processed data and customizable thread configurations.

  8. cluster-by-similarity-high-dim-vectors (Python, 7 stars)
     Python script for automated clustering of embeddings with DBSCAN, using cosine similarity for flexible, size-agnostic grouping (I hate K-means); outputs analysis metrics.

  9. copilot-pr-first-comment-updater (Python, 4 stars)
     Python script that appends the copilot:all label to the first comments of closed pull requests in a specified GitHub repository, allowing easy tracking and management of processed PRs.

  10. n8n-docker-ffmpeg (Dockerfile, 3 stars)
      Repository for setting up n8n with ffmpeg using Docker Compose, including beginner-friendly instructions and systemd service configuration for automatic startup.

  11. pineconedb-appscript-integration-for-sheets (JavaScript, 3 stars)
      A Google Apps Script custom function to fetch similar categories from a vector database using OpenAI and Pinecone APIs. This function can be used directly in Google Sheets.

  12. go-native-squid-proxy (Go, 2 stars)
      GoNativeSquidProxy is a high-performance, scalable proxy server fully written in Go, designed to efficiently handle HTTP/HTTPS requests as a modern alternative to Squid.

  13. Go-JSON-AzureSearch-Prepper (Go, 2 stars)
      A Go utility for processing and combining JSON files, making them ready for integration with Azure Search AI.

  14. kv-backup-enhanced (Python, 2 stars)
      A script to help you download 1,200 Cloudflare KV records per 5 minutes (restricted due to Cloudflare's global rate limit).

  15. GoogleSheets-Translator (JavaScript, 1 star)
      Automate translations in Google Sheets using Google Translate, optimized for handling large datasets with efficient batch processing (by using the =GOOGLETRANSLATE formula to handle large amounts of data effectively).

  16. translation-cli-by-openai-api (1 star)
      This CLI tool makes it easy to translate a large number of strings from an XLSX file into many languages using OpenAI. It has powerful features like a progress bar and the ability to customize the file structure.