• Stars: 161
• Rank: 233,470 (Top 5 %)
• Language: JavaScript
• License: MIT License
• Created about 2 years ago
• Updated over 1 year ago

Repository Details

A sample web app using OpenAI Whisper to transcribe audio built on Next.js. It records audio continuously for some time interval then uploads the audio data to the server for transcribing/translating.

openai-whisper

This is a sample web app implementation of OpenAI Whisper, an automatic speech recognition (ASR) system, using Next.js.

It records audio data automatically and uploads it to the server for transcribing/translating, then sends the result back to the front end. It is also possible to play back the recorded audio to verify the output.

Update: If you want to use Next 13 with the experimental appDir feature enabled, please check openai-whisper-api instead. Just set the flag to use the Whisper Python module instead of the Whisper API.


Motivation

It has been said that Whisper itself is not designed to support real-time streaming tasks per se but it does not mean we cannot try, vain as it may be, lol.

So this project is my attempt to make an almost real-time transcriber web application using OpenAI Whisper. How well it works depends on how fast the server can transcribe/translate the audio.

I used Next.js so that I do not have to make separate backend and frontend apps.

As for the backend, I used exec to run a shell command that invokes Whisper. I have not yet found a way to import it as a Node.js module. All the examples that import it seem to use a Python server.

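import { exec } from 'child_process'

// filename is the name of the recorded audio file, relative to the current directory
exec(`whisper './${filename}' --model tiny --language Japanese --task translate`, (err, stdout, stderr) => {
    if (err) {
        // the command could not be run or exited with an error
        console.log(err)
    } else {
        // print whatever whisper wrote to stdout and stderr
        console.log(stdout)
        console.log(stderr)
    }
})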

Notice that I am using only the tiny model so that transcribing is as fast as possible. This is all my system can handle; anything larger brings it to a standstill.
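The same call can be sketched with execFile, which passes the arguments as an array instead of building a shell string, so a filename with spaces or quotes cannot change the command. This is only an illustration, not the code used in this repository: the helper names buildWhisperArgs and runWhisper and the shape of the settings object are assumptions, while the --model, --language and --task flags are the ones used above.

```javascript
const { execFile } = require('child_process')

// Build the whisper CLI argument list from a settings object.
// The settings fields (model, language, task) are hypothetical names.
function buildWhisperArgs(filename, settings = {}) {
    const { model = 'tiny', language = 'Japanese', task = 'translate' } = settings
    return [filename, '--model', model, '--language', language, '--task', task]
}

// Run whisper without a shell and hand the output to a callback.
function runWhisper(filename, settings, callback) {
    execFile('whisper', buildWhisperArgs(filename, settings), (err, stdout, stderr) => {
        if (err) return callback(err)
        callback(null, { stdout, stderr })
    })
}
```

For example, runWhisper('./audio.ogg', { model: 'tiny' }, (err, out) => console.log(err || out.stdout)) would run the same command as the exec call above, assuming whisper is on the PATH.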

The App

I changed the behavior of the app from the previous version. Before, the app recorded audio continuously at a fixed time interval, 5s by default. Now it starts recording only when it detects sound.

There is a threshold setting to keep background noise from triggering the audio capture. By default it is set to -45dB (0dB is the loudest sound). Adjust the variable minDecibels in Settings if you want to set it lower or higher depending on your needs.

In normal human conversation, it is said that we tend to pause, on average, around 2 seconds between sentences. Keeping this in mind, if no sound is detected for longer than the maximum pause, recording stops and the audio data is sent to the backend for transcribing. You can change this by editing the value of maxPause, which is set to 2500ms by default.
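To make the threshold idea concrete, here is a minimal sketch that measures the level of a block of samples in decibels and compares it with minDecibels. It is an assumption about how such a check could work, not the app's actual detection code (the app may rely on the browser's Web Audio API instead), and the function names are made up for illustration.

```javascript
// Level of a block of samples in the range [-1, 1], in decibels relative
// to full scale. A constant full-scale signal gives 0 dB; silence gives -Infinity.
function levelInDecibels(samples) {
    let sumOfSquares = 0
    for (const s of samples) sumOfSquares += s * s
    const rms = Math.sqrt(sumOfSquares / samples.length)
    return rms === 0 ? -Infinity : 20 * Math.log10(rms)
}

// True when the block is louder than the threshold (default -45 dB).
function isSoundDetected(samples, minDecibels = -45) {
    return levelInDecibels(samples) > minDecibels
}
```

With the default threshold, a block at amplitude 0.1 (about -20 dB) counts as sound, while one at amplitude 0.001 (about -60 dB) is treated as background noise.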
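The start/stop behavior described above can be sketched as a small state machine. This is a simplified illustration under my own assumptions, not the component's real code: the class name PauseDetector and its update method are hypothetical, and only maxPause comes from the app's settings.

```javascript
// Decides when a recording should start and stop, given periodic
// "was sound detected?" samples. Names here are hypothetical.
class PauseDetector {
    constructor(maxPause = 2500) {
        this.maxPause = maxPause // ms of silence that ends a recording
        this.recording = false
        this.lastSoundAt = 0
    }

    // now: current time in ms; soundDetected: result of the threshold check.
    // Returns 'start', 'stop', or null when the state does not change.
    update(now, soundDetected) {
        if (soundDetected) {
            this.lastSoundAt = now
            if (!this.recording) {
                this.recording = true
                return 'start'
            }
            return null
        }
        if (this.recording && now - this.lastSoundAt > this.maxPause) {
            this.recording = false
            return 'stop'
        }
        return null
    }
}
```

On 'start' the app would begin capturing audio, and on 'stop' it would upload the captured data for transcribing.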

Output

It is possible to play the uploaded audio and follow the text output since the time period is shown.

As for the code itself, I used a class component (I know, I know...) because I had a difficult time accessing state variables using hooks when I was developing.

Settings

Aside from minDecibels and maxPause, you can also change several Whisper options such as language, model and task from the Settings dialog. Please check Whisper's GitHub repository for an explanation of the options.

There are still lots of things to do so this project is still a work in progress...

Setup

First, you need to install Whisper and its Python dependencies

$ pip install git+https://github.com/openai/whisper.git

You also need ffmpeg installed on your system

# macos
$ brew install ffmpeg

# windows using chocolatey
$ choco install ffmpeg

# windows using scoop
$ scoop install ffmpeg

At this point, you can test Whisper from the command line

$ whisper myaudiofile.ogg --language Japanese --task translate

If that is successful, you can proceed to install this app.

Clone the repository and install the dependencies

$ git clone https://github.com/supershaneski/openai-whisper.git myproject

$ cd myproject

$ npm install

$ npm run dev

Open your browser to http://localhost:3006/ to load the application page.

Using HTTPS

You might want to run this app over HTTPS. This is needed if you want to use a separate device for audio capture and use your machine as the server, since browsers only allow microphone access on secure pages (HTTPS or localhost).

In order to do so, prepare the proper certificate and key files and edit server.js in the root directory.

Then run

$ node server.js

Now, open your browser to https://localhost:3006/ to load the page.

More Repositories

1

openai-whisper-talk

openai-whisper-talk is a sample voice conversation application powered by OpenAI technologies such as Whisper, Completions, Embeddings, and the latest Text-to-Speech. The application is built using Nuxt, a JavaScript framework based on Vue.js.
JavaScript
137
star
2

openai-api-function-call-sample

A sample app to demonstrate Function calling using the latest format in Chat Completions API and also in Assistants API.
JavaScript
86
star
3

openai-whisper-api

A sample speech transcription app implementing the OpenAI Speech to Text API based on Whisper, an automatic speech recognition (ASR) system, built using Next 13, the React framework
JavaScript
75
star
4

openai-assistants-api-multi-user-sample

This sample project demonstrates the OpenAI Assistants API's ability to manage single-threaded multi-user interactions through a full-stack app using Node.js, Vue.js, and socket.io for server-client communication.
Vue
51
star
5

chatgpt-with-image-sample

This sample project integrates OpenAI's GPT-4 Vision, with advanced image recognition capabilities, and DALL·E 3, the state-of-the-art image generation model, with the Chat completions API. This powerful combination allows for simultaneous image creation and analysis.
JavaScript
23
star
6

callcenter-simulator

A sample callcenter simulator using OpenAI technologies such as ChatGPT, Whisper and other APIs to automate and improve callcenter operation. This project is intended to be a sandbox application to test feasibility and other ideas.
JavaScript
23
star
7

openai-assistants-api-streaming

A sample application to demonstrate OpenAI Assistants API streaming, built using Next.js.
JavaScript
17
star
8

openai-chatfriend

A chatbox application built using Nuxt 3, powered by the OpenAI Text completion endpoint. You can select a different personality for your AI friend. The default will respond in Japanese. You can use this app to practice your Nihongo skills!
Vue
14
star
9

openai-chatgpt-api

A sample interactive storytelling narrative chatbot application using ChatGPT API, powered by gpt-3.5-turbo, OpenAI’s advanced language model, built using Next 13, the React framework.
JavaScript
12
star
10

react-three-terrain

A terrain map visualizer based on height maps using React and three.js
JavaScript
9
star
11

openai-chatterbox

A sample Nuxt 3 application that listens to chatter in the background and transcribes it using the powerful OpenAI Whisper, an automatic speech recognition (ASR) system.
Vue
9
star
12

mytrip-hokkaido-app

This sample project is a customizable regional travel planning app that uses artificial intelligence to generate itineraries based on user text, powered by OpenAI Chat Completions API, built using Next.js 13.
JavaScript
7
star
13

vue-quiz-app

A sample quiz application built using Vue 3 + Vite. Implements route views and components using Composition API. Quiz data fetched remotely using Open Trivia DB API (opentdb).
Vue
6
star
14

chatgpt-learning-app

This sample React app aims to be a learning hub that aids students in their studies. By providing topics from their actual course syllabus, users can interact with the AI chatbot tutor and engage in dynamic conversations related to the topics. They can ask questions, explore concepts, and generate quizzes to test their knowledge.
JavaScript
6
star
15

grocery-list-app

This is a sample React project that generates a grocery list of ingredients based on a menu or a list of dishes. It is powered by the OpenAI Chat Completion API and built using Next.js 13.
JavaScript
5
star
16

openai-chatgpt-prompts

Collection of useful custom prompts for OpenAI ChatGPT [プロンプトコレ]
5
star
17

discussion-app

This is a discussion app that allows you to explore various viewpoints on specific topics. Simply provide a topic, and it will generate multiple responses based on different perspectives, enabling you to learn and understand various views. You can customize the attributes of each point of view or character.
JavaScript
4
star
18

three-pipebuilder

Proof of concept for a 3D pipebuilder project. Using React + ThreeJS + Bootstrap
JavaScript
3
star
19

react-moving-dottext

A React component that mimics the LED signage moving text display
JavaScript
2
star
20

openai-chatgpt-desktop

A sample app that enables users to conveniently access OpenAI ChatGPT right from their desktop menubar.
JavaScript
2
star
21

NextJSSampleDataVisualization

First attempt to make a Data Visualization app using NextJS/React
JavaScript
2
star
22

three-samples

Sample exercises for ThreeJS
JavaScript
1
star
23

bun-openai

This repository contains a collection of various applications powered by the OpenAI API, with backend support from Bun server.
JavaScript
1
star
24

openai-tutorial-sample

A sample tutorial app based on modified OpenAI API quickstart tutorial
JavaScript
1
star
25

react-memory-game

A sample Memory Game using React.JS
JavaScript
1
star
26

react-redux-memo

A sample MemoBoard application using React/Redux and Material-UI
JavaScript
1
star
27

react-jest-sample

JavaScript
1
star
28

NextJSBlogSample

Sample Blogsite using Next.JS
JavaScript
1
star
29

react-google-charts-example

A project using react-google-charts module and googledoc as data source
JavaScript
1
star
30

vue-whackamole

A Whack-A-Mole game created using Vue.JS
Vue
1
star
31

React3DXRoomSimulator

React + 3DX sample project
JavaScript
1
star
32

NextJSQuizApp

Sample Quiz App using NextJS/React
JavaScript
1
star
33

react-crud-app

A simple Todo app that serves as the React layer to demonstrate a MERN stack
JavaScript
1
star
34

next-ipsum-app

A placeholder text generator based on excerpts taken from the novel Noli Me Tangere by Jose Rizal.
JavaScript
1
star
35

NextJSSlidePuzzle

A Slide Puzzle App using NextJS/React
JavaScript
1
star
36

react-piano-openai

A piano application using Web Audio api created using React based on the reply given by OpenAI ChatGPT
JavaScript
1
star
37

react-ml5-moviereview

A sample React project to check the favorable score of user reviews for movies using ml5.js and data from TMDb.
JavaScript
1
star
38

react-mosaic-app

A simple image mosaic builder made using React
JavaScript
1
star
39

react-ui-components

A collection of React components and custom hooks based on generated code by OpenAI ChatGPT
JavaScript
1
star
40

openai-structured-output-sample

A sample application to demonstrate how to use Structured Outputs in OpenAI Chat Completions API with streaming, built using Next.js.
JavaScript
1
star