Connecting Asterisk with Google Speech API

You have your Asterisk application up and running. Now you want to leverage Google APIs to interpret and analyze the audio of your customers’ phone calls (both incoming and outgoing) in real time. Here is how you can do it.

A typical Asterisk application has a Dialplan that controls the flow of all incoming and outgoing calls, as configured in the extensions.conf file. (Read more on how to configure an Asterisk dialplan)

Asterisk’s Dialplan has its own syntax. However, its functionality can be extended in other programming languages through the Asterisk Gateway Interface (AGI), which supports a wide variety of languages (Read more on AGI). This interface can be used to add smart capabilities such as speech to text and natural language processing by integrating with Google Cloud APIs such as the Google Speech API and Dialogflow (formerly API.ai).

Example Application

Consider an Asterisk application that handles incoming/outgoing calls to your server, tries to interpret the caller’s speech, and responds accordingly. This application manages the call flow through the Dialplan in extensions.conf. It records the caller’s audio while he/she is speaking and passes the path of the recorded file to a PHP script via AGI. Parameters are passed to the AGI script as command line arguments, which the PHP script can read.

Record Audio (via Dialplan)

(More details on Record function here)
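As a rough sketch of this step, the Dialplan fragment below answers the call and records the caller into a WAV file. The context name `incoming-speech`, the `RECFILE` variable, and the `/tmp` path are assumptions made for illustration; the silence and duration limits are arbitrary values you would tune.

```ini
; extensions.conf (illustrative fragment, names are hypothetical)
[incoming-speech]
exten => s,1,Answer()
 ; Build a unique file name per call so concurrent calls do not overwrite each other.
 same => n,Set(RECFILE=/tmp/caller-${UNIQUEID}.wav)
 ; Record(filename.format,silence,maxduration,options):
 ; stop after 3 seconds of silence or 10 seconds in total; "k" keeps the file if the caller hangs up.
 same => n,Record(${RECFILE},3,10,k)
```

Asterisk's `wav` format is 16-bit linear PCM at 8 kHz, which is convenient later because the Speech API accepts linear PCM directly.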

Pass Audio Path to AGI (via Dialplan)

(More details on AGI here)
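A minimal sketch of the hand-off, assuming the recording path is already stored in a channel variable named `RECFILE` and that the script is called `speech-intent.php` (both names are hypothetical):

```ini
; Runs speech-intent.php from Asterisk's agi-bin directory.
; Everything after the script name is passed to the script as an argument.
exten => s,n,AGI(speech-intent.php,${RECFILE})
```

The script must be executable by the user Asterisk runs as.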

Receive parameters from AGI (in PHP)
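The opening lines of such a script might look like the sketch below. Besides the command line arguments, Asterisk also writes a block of `agi_*: value` lines to the script's standard input, ending with a blank line, and it is good practice to consume that block before doing anything else. The function names here are made up for this example.

```php
<?php
// Sketch of the start of a hypothetical speech-intent.php AGI script.

/**
 * Read the AGI environment block ("agi_xxx: value" lines, terminated by a blank line)
 * from the given stream and return it as an associative array.
 */
function readAgiEnvironment($stream): array
{
    $env = [];
    while (($line = fgets($stream)) !== false) {
        $line = trim($line);
        if ($line === '') {
            break; // a blank line ends the environment block
        }
        [$key, $value] = array_pad(explode(':', $line, 2), 2, '');
        $env[trim($key)] = trim($value);
    }
    return $env;
}

/** Return the recording path passed as the first AGI argument, or null if it is missing. */
function recordingPathFromArgs(array $argv): ?string
{
    return $argv[1] ?? null;
}

// Typical use inside the AGI script:
//   $env       = readAgiEnvironment(STDIN);
//   $audioPath = recordingPathFromArgs($argv);
```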

Calling Google Speech API

(For connecting with AWS Lex, continue reading Connecting Asterisk with AWS Lex)

This script then calls Google’s Cloud Speech API to convert the speech audio to text. You can use the Google Speech API SDK for PHP for this purpose. (Refer to the Google Cloud API documentation on how to set up the SDK, and to Google Speech API Basics).

If you expect the caller to say certain known words, you can also pass them to the Speech API as hints to improve recognition accuracy.
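To keep the illustration independent of any particular SDK version, the sketch below calls the Speech API's REST endpoint (`speech:recognize`) with cURL rather than going through the PHP SDK. It assumes an 8 kHz, 16-bit PCM WAV recording and authentication with an API key; the function names and the choice of three alternatives are assumptions for this example. Hints go into the `speechContexts` field of the request.

```php
<?php
/**
 * Build the JSON body for a speech:recognize request.
 * $hints is an optional list of words or phrases the caller is likely to say.
 */
function buildRecognizeRequest(string $audioBytes, array $hints = [], string $languageCode = 'en-US'): array
{
    $config = [
        'encoding'        => 'LINEAR16', // assumption: Asterisk "wav" recording (16-bit PCM)
        'sampleRateHertz' => 8000,       // assumption: narrowband telephony audio
        'languageCode'    => $languageCode,
        'maxAlternatives' => 3,
    ];
    if ($hints !== []) {
        $config['speechContexts'] = [['phrases' => array_values($hints)]];
    }
    return [
        'config' => $config,
        'audio'  => ['content' => base64_encode($audioBytes)],
    ];
}

/** POST the request to the Speech API and return the decoded JSON response. */
function recognize(array $request, string $apiKey): array
{
    $ch = curl_init('https://speech.googleapis.com/v1/speech:recognize?key=' . urlencode($apiKey));
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
        CURLOPT_POSTFIELDS     => json_encode($request),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Speech API request failed: ' . curl_error($ch));
    }
    $decoded = json_decode($body, true);
    return is_array($decoded) ? $decoded : [];
}

// Typical use inside the AGI script ($audioPath and $apiKey come from your own setup):
//   $request  = buildRecognizeRequest(file_get_contents($audioPath), ['yes', 'no', 'interested']);
//   $response = recognize($request, $apiKey);
```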

The API returns a transcript of the audio passed to it, along with alternative transcripts. Once the PHP script has this transcript, the text can be analyzed by various NLP tools.
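Here is a small sketch of pulling the transcript out of the decoded JSON response. It assumes the REST response shape, where `results` holds consecutive portions of the audio and each result lists its `alternatives` with the most probable one first. The function name is made up for this example.

```php
<?php
/**
 * Join the top-ranked alternative of every result into one transcript string.
 * Returns an empty string when nothing was recognized.
 */
function transcriptFromResponse(array $response): string
{
    $parts = [];
    foreach ($response['results'] ?? [] as $result) {
        $top = trim($result['alternatives'][0]['transcript'] ?? '');
        if ($top !== '') {
            $parts[] = $top;
        }
    }
    return implode(' ', $parts);
}

// Example with a hand-written response of the expected shape:
$sample = [
    'results' => [[
        'alternatives' => [
            ['transcript' => 'I am interested', 'confidence' => 0.92],
            ['transcript' => 'I am interesting'],
        ],
    ]],
];
$transcript = transcriptFromResponse($sample); // "I am interested"
```

An empty transcript usually means silence or unintelligible audio, which the Dialplan can treat as a cue to ask the caller to repeat.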

Calling DialogFlow (API.ai)

In this example, the text is passed on to Google’s Dialogflow (formerly API.ai) to figure out the caller’s intent. You need to configure a bot on Dialogflow that recognizes the text passed to it and responds with the matched intent. For example, a user may be asked a question (say “Do you need a credit card?”) and he/she responds “I am interested”. The text “I am interested” will be matched with an intent that indicates a “YES” for the user’s interest in the product.

Refer to this guide for installing the PHP SDK for Dialogflow.

Once the bot is set up on the Dialogflow website and the PHP SDK is set up on your server, you can call the Dialogflow API from the PHP script to match the transcript against your intents.
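The call could be sketched as follows. To avoid tying the example to a specific SDK release, it uses the Dialogflow v2 REST method `detectIntent` with cURL instead of the PHP SDK. The project ID, the session ID, and the way you obtain the OAuth access token (typically from a service account) are assumptions left to your setup, and the function names are hypothetical.

```php
<?php
/** Build the detectIntent URL for a project and a per-call session ID. */
function detectIntentUrl(string $projectId, string $sessionId): string
{
    return sprintf(
        'https://dialogflow.googleapis.com/v2/projects/%s/agent/sessions/%s:detectIntent',
        rawurlencode($projectId),
        rawurlencode($sessionId)
    );
}

/** Build the request body carrying the transcript as a text query. */
function buildDetectIntentBody(string $text, string $languageCode = 'en-US'): array
{
    return ['queryInput' => ['text' => ['text' => $text, 'languageCode' => $languageCode]]];
}

/** Return the display name of the matched intent, or null if none matched. */
function matchedIntent(array $response): ?string
{
    return $response['queryResult']['intent']['displayName'] ?? null;
}

/** Send the transcript to Dialogflow and return the decoded JSON response. */
function detectIntent(string $projectId, string $sessionId, string $text, string $accessToken): array
{
    $ch = curl_init(detectIntentUrl($projectId, $sessionId));
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            'Authorization: Bearer ' . $accessToken,
        ],
        CURLOPT_POSTFIELDS     => json_encode(buildDetectIntentBody($text)),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Dialogflow request failed: ' . curl_error($ch));
    }
    $decoded = json_decode($body, true);
    return is_array($decoded) ? $decoded : [];
}
```

Using the Asterisk call's unique ID as the session ID keeps each caller's conversation context separate.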

Playback Audio Response (via Dialplan)

Once the intent is identified, the Asterisk Dialplan can pick an appropriate response, in the form of a pre-recorded audio file, and play it to the caller in real time.

This enables two-way communication between the caller and your Asterisk application, with real-time interpretation of the caller’s speech. The PHP script can also record the caller’s intent on a DB server for further reporting.
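One way to sketch this branching in the Dialplan is shown below. It assumes the AGI script has stored its result in a channel variable named `INTENT` with the value `yes` for a positive answer; the variable name, the sound file names, and the `interested` extension are all hypothetical.

```ini
; extensions.conf (illustrative fragment, names are hypothetical)
[incoming-speech]
exten => s,1,Answer()
 same => n,Set(RECFILE=/tmp/caller-${UNIQUEID}.wav)
 same => n,Record(${RECFILE},3,10,k)
 same => n,AGI(speech-intent.php,${RECFILE})
 ; Branch on the intent reported by the script.
 same => n,GotoIf($["${INTENT}" = "yes"]?interested,1)
 same => n,Playback(custom/please-repeat)
 same => n,Hangup()

exten => interested,1,Playback(custom/credit-card-offer)
 same => n,Hangup()
```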
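For the Dialplan to act on the result, the script has to hand the intent back to Asterisk. A minimal sketch, assuming the script reports it through the AGI `SET VARIABLE` command into a channel variable called `INTENT` (a name chosen for this example), could look like this. AGI commands are written to standard output, and Asterisk answers each one with a reply line on standard input.

```php
<?php
/** Format an AGI "SET VARIABLE" command for the given channel variable. */
function agiSetVariableCommand(string $name, string $value): string
{
    // Escape backslashes and double quotes so the quoted value is parsed as one argument.
    $escaped = addcslashes($value, "\\\"");
    return sprintf('SET VARIABLE %s "%s"', $name, $escaped);
}

/** Send one AGI command and return Asterisk's reply line (for example "200 result=1"). */
function agiSend($out, $in, string $command): string
{
    fwrite($out, $command . "\n");
    fflush($out);
    $reply = fgets($in);
    return $reply === false ? '' : trim($reply);
}

// Typical use at the end of the AGI script:
//   agiSend(STDOUT, STDIN, agiSetVariableCommand('INTENT', 'yes'));
```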

If you have questions and need more details, please share them in the comments section below and we will respond. If you want to integrate with our solution for your business, please reach out to us through our contact us page.

Check out this space again to see a live demo of this application.

 
