Google Cloud / Dialogflow - Self Service Kiosk Demo

A best practice for streaming audio from a browser microphone to Dialogflow or Google Cloud STT by using websockets.

This Airport Self Service Kiosk demo shows how to stream microphone audio from a web application to GCP.

It makes use of the following GCP resources:

  • Dialogflow & Knowledge Bases
  • Speech to Text
  • Text to Speech
  • Translate API
  • (optionally) App Engine Flex

In this demo, you record your voice; the app then displays the answers on screen and synthesizes them as speech.
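
As a small illustration of the audio side of this flow: the Web Audio API hands out microphone samples as 32-bit floats in the range [-1, 1], while speech APIs are commonly fed 16-bit linear PCM. The sketch below shows that conversion step under the assumption that LINEAR16 is the target encoding. The helper name `floatTo16BitPCM` is hypothetical and not taken from this repo, which may handle the conversion differently.

```javascript
// Hypothetical helper: convert Web Audio float samples to 16-bit PCM.
function floatTo16BitPCM(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] so loud input cannot overflow the 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale towards -32768, positive values towards 32767.
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Example input: silence, half amplitude, full negative, and a clipped value.
const chunk = floatTo16BitPCM(new Float32Array([0, 0.5, -1, 1.5]));
console.log(Array.from(chunk));
```

The resulting `Int16Array` (or its underlying buffer) is what a client would typically send over the websocket in chunks.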

Live demo

A working demo can be found here: http://selfservicedesk.appspot.com/

Blog posts

I wrote a series of detailed blog articles on how to set up your streaming project. Want to learn exactly how this code works? Start here:

Blog 1: Introduction to the GCP conversational AI components, and integrating your own voice AI in a web app.
Blog 2: Building a client-side web application which streams audio from a browser microphone to a server.
Blog 3: Building a web server which receives a browser microphone stream and uses Dialogflow or the Speech to Text API for retrieving text results.
Blog 4: Getting Audio Data from Text (Text to Speech) and play it in your browser.

Slides & Video

There's a presentation and a video that accompany the tutorial.

Slidedeck AudioStreaming

Setup Local Environment

Get a Node.js environment

  1. apt-get install nodejs -y

  2. apt-get install npm -y

Get an Angular environment

  1. sudo npm install -g @angular/cli

Clone Repo

  1. git clone https://github.com/dialogflow/selfservicekiosk-audio-streaming.git selfservicekiosk

  2. Set the PROJECT_ID variable: export PROJECT_ID=[gcp-project-id]

  3. Set the project: gcloud config set project $PROJECT_ID

  4. Download the service account key.

  5. Assign the key to environment var: GOOGLE_APPLICATION_CREDENTIALS

    LINUX/MAC
    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json
    WIN
    set GOOGLE_APPLICATION_CREDENTIALS=c:\path\to\service_account.json

  6. Login: gcloud auth login

  7. Open server/env.txt, change the environment variables and rename the file to server/.env

  8. Enable APIs:

    gcloud services enable \
    appengineflex.googleapis.com \
    containerregistry.googleapis.com \
    cloudbuild.googleapis.com \
    cloudtrace.googleapis.com \
    dialogflow.googleapis.com \
    logging.googleapis.com \
    monitoring.googleapis.com \
    sourcerepo.googleapis.com \
    speech.googleapis.com \
    mediatranslation.googleapis.com \
    texttospeech.googleapis.com \
    translate.googleapis.com

  9. Build the client-side Angular app:

    cd client && sudo npm install
    npm run-script build

  10. Start the server TypeScript app, which is exposed on port 8080:

    cd ../server && sudo npm install
    npm run-script watch

  11. Browse to http://localhost:8080
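
If the server fails to start or cannot authenticate, it can help to verify the variables from the steps above first. The following is a minimal preflight sketch, not part of this repo: the function name `checkEnvironment` is hypothetical, and it only checks that `PROJECT_ID` and `GOOGLE_APPLICATION_CREDENTIALS` are set and that the key file exists.

```javascript
const fs = require("fs");

// Hypothetical preflight check. Pass in process.env, or a stub when testing.
// `fileExists` is injectable so the check can run without touching the disk.
function checkEnvironment(env, fileExists = fs.existsSync) {
  const problems = [];
  if (!env.PROJECT_ID) {
    problems.push("PROJECT_ID is not set");
  }
  const keyPath = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!keyPath) {
    problems.push("GOOGLE_APPLICATION_CREDENTIALS is not set");
  } else if (!fileExists(keyPath)) {
    problems.push(`Service account key not found at ${keyPath}`);
  }
  return problems;
}

// Report the outcome for the current shell environment.
const found = checkEnvironment(process.env);
if (found.length > 0) {
  console.error("Setup incomplete:\n- " + found.join("\n- "));
} else {
  console.log("Environment looks ready.");
}
```

Run it with `node` from the same shell in which you exported the variables, since exports do not carry over to other terminal sessions.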

Setup Dialogflow

  1. Create a Dialogflow agent at: http://console.dialogflow.com

  2. Zip the contents of the dialogflow folder from this repo.

  3. Click Settings > Import, and upload the Dialogflow agent zip you just created.

  4. Caution: Knowledge connector settings are not currently included when exporting, importing, or restoring agents, so set them up manually:

    Make sure you have enabled Beta features in settings.

    1. Select Knowledge from the left menu.
    2. Create a Knowledge Base: Airports
    3. Add the following Knowledge Base FAQs, as text/html documents:
    4. As a response, it requires the following custom payload:
    {
    "knowledgebase": true,
    "QUESTION": "$Knowledge.Question[1]",
    "ANSWER": "$Knowledge.Answer[1]"
    }

    5. To make the Text to Speech version of the answer work, add the following Text SSML response:
    $Knowledge.Answer[1]
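
To illustrate how a client could use the custom payload above, here is a minimal sketch. It assumes the payload has already been decoded into a plain JavaScript object; the helper name `renderKnowledgeAnswer` and the sample question and answer are made up for illustration and do not come from this repo.

```javascript
// Hypothetical helper: `payload` is assumed to be the custom payload from the
// Dialogflow response, already decoded into a plain JavaScript object.
function renderKnowledgeAnswer(payload) {
  // The "knowledgebase" flag marks responses that came from a Knowledge Base.
  if (!payload || payload.knowledgebase !== true) {
    return null;
  }
  return { question: payload.QUESTION, answer: payload.ANSWER };
}

// Made-up sample data, only to show the shape of the result.
const sample = renderKnowledgeAnswer({
  knowledgebase: true,
  QUESTION: "Where can I find the lounge?",
  ANSWER: "The lounge is next to gate 12.",
});
console.log(sample);
```

Returning `null` for other payloads lets the caller fall back to the regular text response.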
    

Deploy with App Engine Flex

This demo makes heavy use of websockets, and the getUserMedia() HTML5 microphone API requires the page to be served over HTTPS. Therefore, I deploy this demo with a custom runtime, so I can include my own Dockerfile.

  1. Edit the app.yaml to tweak the environment variables. Set the correct Project ID.

  2. Deploy with: gcloud app deploy

  3. Browse: gcloud app browse
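
The HTTPS requirement explains why the deployed app needs a secure origin while local testing on http://localhost:8080 still works: browsers generally treat the local machine as a secure context. A rough sketch of that rule follows; the helper name `isMicrophoneFriendlyOrigin` is hypothetical, and the check is a simplification of what browsers actually do.

```javascript
// Hypothetical helper: returns true when a page URL is one where browsers
// typically allow getUserMedia(): HTTPS, or plain HTTP on the local machine.
function isMicrophoneFriendlyOrigin(pageUrl) {
  const { protocol, hostname } = new URL(pageUrl);
  if (protocol === "https:") {
    return true;
  }
  // Simplified local exception; real browsers apply a longer list of rules.
  return protocol === "http:" &&
    (hostname === "localhost" || hostname === "127.0.0.1");
}

console.log(isMicrophoneFriendlyOrigin("http://localhost:8080"));
console.log(isMicrophoneFriendlyOrigin("http://example.com/"));
```

In a real page, checking `window.isSecureContext` is the more direct way to find out whether the microphone API is available.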

Examples

The self-service kiosk is a full end-to-end application. To showcase smaller examples, I've created 6 small demos. Here's how you can get these running:

  1. Install the required libraries by running the following command from the examples folder:

    npm install

  2. Start the simpleserver node app:

    npm --EXAMPLE=1 --PORT=8080 --PROJECT_ID=[your-gcp-project-id] run start

To switch between the examples, set the EXAMPLE variable to one of these:

  • Example 1: Dialogflow Speech Intent Detection
  • Example 2: Dialogflow Speech Detection through streaming
  • Example 3: Dialogflow Speech Intent Detection with Text to Speech output
  • Example 4: Speech to Text Transcribe Recognize Call
  • Example 5: Speech to Text Transcribe Streaming Recognize
  • Example 6: Text to Speech in a browser
  3. Browse to http://localhost:8080. Open the browser inspector to preview the Dialogflow results object.
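
The npm command above passes EXAMPLE, PORT and PROJECT_ID as flags. As a sketch of how a script could read and validate them, the code below assumes that npm exposes `--NAME=value` flags as `npm_config_NAME` environment variables; the exact casing, and whether custom flags are passed at all, can differ between npm versions. The helper names and the default port of 8080 are assumptions, not taken from simpleserver.js.

```javascript
// Assumption: npm exposes `--NAME=value` flags to scripts as `npm_config_NAME`
// environment variables. Both casings are tried because this can vary.
function readFlag(env, name) {
  return env[`npm_config_${name}`] ?? env[`npm_config_${name.toLowerCase()}`];
}

// Hypothetical helper: validates the three flags used by the examples.
function readExampleSettings(env) {
  const example = Number(readFlag(env, "EXAMPLE"));
  if (!Number.isInteger(example) || example < 1 || example > 6) {
    throw new Error("EXAMPLE must be a number from 1 to 6");
  }
  const projectId = readFlag(env, "PROJECT_ID");
  if (!projectId) {
    throw new Error("PROJECT_ID is required");
  }
  // Assumed default: fall back to port 8080 when PORT is not given.
  const port = Number(readFlag(env, "PORT") ?? 8080);
  return { example, port, projectId };
}
```

A server script would call `readExampleSettings(process.env)` once at startup and fail early with a readable message if a flag is missing.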

The server-side code for the different Dialogflow & STT calls can be found in simpleserver.js. The files example1.html to example5.html show the client-side implementations.
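
To give an idea of what a Speech to Text call such as Example 4 needs, here is a sketch of the request object for a non-streaming recognize call. The field names follow the Cloud Speech to Text API; the helper name `buildRecognizeRequest` and the chosen values (LINEAR16, 16000 Hz, en-US) are assumptions and may differ from what simpleserver.js uses.

```javascript
// Hypothetical helper: builds the request object for a Speech to Text
// recognize call. Encoding, sample rate and language are assumed values.
function buildRecognizeRequest(audioBuffer, languageCode = "en-US") {
  return {
    config: {
      encoding: "LINEAR16",
      sampleRateHertz: 16000,
      languageCode,
    },
    // The audio bytes are sent base64-encoded in the `content` field.
    audio: { content: audioBuffer.toString("base64") },
  };
}

// Example with a few placeholder bytes instead of real audio.
const request = buildRecognizeRequest(Buffer.from([0, 1, 2, 3]));
console.log(request.config, request.audio.content.length);
```

With the @google-cloud/speech package installed, an object like this is what you would pass to the `recognize()` method of a `SpeechClient`.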

License

Apache 2.0

This is not an official Google product.
