Node-RED Library page: node-red-contrib-twilio-ivr
Project source code at GitHub: node-red-contrib-twilio-ivr
Twilio is fast becoming an important part of Internet’s communication infrastructure. What is Twilio? Twilio is a developer platform that allows you to add capabilities like voice, video, and messaging to your applications. This project uses the Twilio Programmable Voice APIs to bridge the gap between the telephone system and the Internet. It connects the mysterious, traditionally closed world of telephony with the open Internet. If you like building things on the Internet, now you can integrate with those things using the phone system, whether it be via voice calls, text messages, or even wireless devices with Twilio SIM cards. Like many API services, Twilio has a pay-as-you-go model, so I’ve been able to explore all this great functionality for very little money. In this project I will show you how you can build an IVR (Interactive Voice Response) system with Twilio and one of my favorite technologies, Node-RED. And I have a great example IVR that you can call yourself to explore!
Twilio + Node-RED = POWER!
In case you don’t know what an IVR is, it’s that thing when you call a customer service phone number and are presented with a menu system that helps you (or tries to). A caller interacts with the IVR by pressing keypad numbers and speaking commands. To start using Twilio in this way, you need to buy a phone number from Twilio for your callers to call. As a developer, you present “content” to the caller by using a Twilio markup language called TwiML which instructs Twilio how to present your IVR content to whomever calls your number. It’s really just like building a web server that returns HTML that is presented by a web browser. But in this case, we build an HTTP server that returns TwiML to Twilio which audibly presents that content to the caller. The caller’s phone is the browser! Twilio handles the job of connecting telephone callers to your HTTP server.
You can build a TwiML server using any web server technology, like PHP on an Apache server, etc. I chose to use Node-RED because I use it for lots of IoT processing flows. Most importantly, by building a library of nodes specifically designed for building a Twilio IVR, I (and now you) can build sophisticated IVRs by dragging and dropping nodes into a Node-RED flow. And since Node-RED has many libraries available for integration into just about anything (including a way to invoke Twilio APIs), your IVR can integrate with databases, other servers, social media, and anything else in Internet-land. I think Node-RED is the perfect tool for creating IVR flows.
Note that Twilio has their own product like this called Twilio Studio, but I built my system before they announced their product. Besides, mine is free and I think you can build much larger and more sophisticated IVRs with my solution :)
Node-RED IVR Components
Here is an overview of the main components of an IVR built with this system. Instructions for setting all this up for your own IVR is later in this article.
Twilio IVR Library
To make all this easy, I built a library of Node-RED nodes that create TwiML for the response. Some nodes are simple and have a one-to-one correspondence with TwiML markup, like the “play” node for playing MP3 files and the “say” node for speaking text. These nodes create TwiML using the <Play> and <Say> tags. Other nodes are for higher-level IVR concepts, like the “menu” node. It lets you create a menu that is spoken to the caller and arranges for the caller input to route to the correct part of the IVR flow.
Twilio IVR Core Flow
Much of the heavy lifting of the IVR is done by a predefined Node-RED flow called Twilio IVR Core. Your IVR will use this Node-RED flow and you won’t need to change it. The main part is the Universal Router. Twilio always invokes the Universal Router HTTP endpoint (which is /router) which figures out which route to follow through the IVR based on the user’s input or routing information specified by the previously invoked route (more on this later). Routes are a key concept in the IVR. The are a particular path that the caller is on, and have names like /main-menu or /account or /agent.
Example IVR: “Customer Service Hell”
The easiest way to demonstrate the many features you can implement in an IVR was to build a great example. The Node-RED flow “Customer Service Hell” is an example that uses many features. It is designed to frustrate any caller with a maze of menu choices and frustrating interactions. You can call this IVR at (612) 999-2812. (if you are international, dial +1-612-999-2812). Please note that it costs us a small amount of money every time you call, so don’t abuse it, but by all means use it to learn and maybe get a good laugh. Here is a huge image of this Node-RED flow that you can use to follow along. Be sure to look at the route called /nightmare-hold. If you are planning on implementing a Twilio IVR, I suggest you start with this flow by import it into your Node-RED implementation (more on setup later).
Of course you have to tell Twilio to invoke your Node-RED server: in the Twilio console, specify your server endpoints in the configuration for your Twilio phone number. Here is an example of how this should look. Node-RED can be configured to protect HTTP endpoints with username/password credentials, so you need to specify these as well. Note that the error webhook does not need the credentials (and doesn’t work if you provide them).
How It All Works
Twilio invokes the IVR over HTTP to get the TwiML response for Twilio to present to the caller. A response to the caller may request input from the caller in the form of a menu or may prompt the caller for other input like an account number or invite the caller to speak a command. The Universal Router controls the flow of a call. First, it finds the caller’s session or creates a new session if the call is new. The session keeps track of anything we want to remember since we last returned TwiML to Twilio. (It’s just like a web server session where we keep track of info before responding to the browser.)
Next, the router determines which route to invoke. For example, if a menu was presented to the caller during the last invocation, then the caller’s input must be matched to the correct route to invoke. The session automatically keeps track of any menu that was presented so that the caller’s input can be used to route correctly. For example, if the last menu told the caller to press “1” for accounting, then if they pressed “1”, we route to the /accounting route. Another important thing to note is that if a call is new, the router will invoke a route called /begin, so your IVR needs to define this route as the starting point for a new call.
Next, the router actually invokes the desired call route. Each route in the IVR begins with a Node-RED HTTP endpoint. Even though Twilio does not invoke these HTTP endpoints directly (Twilio always invokes /router), the Universal Router invokes routes via HTTP. The responsibility of a route is to return the TwiML that is to be returned to Twilio. The outer layer of TwiML (the <Response> and </Response> tags) are taken care of by the router.
Examples of Common Route Patterns
Presenting a Menu
A common pattern in an IVR is to present the caller with a menu of choices. Here is the part of the IVR that implements the /main-menu route, as well as the starting point for the whole IVR, /begin. You can see that if a call is new it says a welcome message before connecting to the /main-menu route.
The pattern for a menu-based route is to start with a gather-begin node which tells Twilio that we want user input. See the Twilio documentation for the <Gather> TwiML element. You can configure the gather-begin node to expect DTMF (touch tones) or speech input, or both. You can also provide speech hints to help Twilio recognize speech input. For example if you are presenting a menu that allows the caller to say “representative”, then specify the string “representative” as a speech hint in the gather-begin node to help Twilio recognize the caller’s utterance.
The menu node creates the TwiML that reads a menu to the caller. This node provides lots of flexibility. For each menu item, the item can be spoken (“Say”) or an audio file can be played (“Play”). The word “For” can be prepended to the name of the item if that makes sense. The caller can press a number to choose the item (“Press” option) or can press or say it (“Press or Say” option). You can also specify speech that can be used to choose the item. For example in the second item below, the caller can press 2, say “two” or say “account” to choose the item. Finally, each item specifies which route in the IVR to invoke if chosen. The Universal Router will inspect the input from the caller (whether spoken or entered) and determine the route to invoke.
Also, don’t forget to use a gather-end node before returning to the router. This adds the ending tag </Gather> to the TwiML response.
Using the set-route node
Although menus are a common pattern, not every route in an IVR presents the caller with a menu of choices. Sometimes you want to prompt the caller for input (like an account number) or prompt them to make a recording of some kind. An example of this is the /account route in the Customer Service Hell IVR. Click the image below to see this route. Note that we still need the gather-begin and gather-end nodes since we are asking for input.
Dynamically building a menu
One more pattern that you may find a use for is the ability to build a menu dynamically instead of specifying it all in a single menu node. Below is a particularly insidious part of our Customer Service Hell IVR. After prompting the caller for a 4-digit PIN, it transposes the digits that they entered and tells the caller that their PIN is wrong. The IVR keeps track of the number of times the caller has failed (in the call session), and after multiple failures we add a menu choice for the caller and invite them to “shock the monkey” by pressing ‘*’. (Relax, PETA, it’s just a sound effect.) This ability to dynamically build the “content” returned to the caller is very powerful.
Here are the steps to take if you want to actually build and IVR. I suggest you get the example “Customer Service Hell” IVR working first so you can see how everything works together.
First it is assumed you have Node-RED installed and running. If you don’t, then get started at the Node-RED website.
In your Node-RED installation, install the Twilio IVR library:
npm install node-red-contrib-twilio-ivr
Next, you’ll need to add a Node-RED flow with the Twilio IVR Core. You can get this flow from the Node-RED Library entry for this project. In Node-RED, use the Import option on the menu and import from the clipboard. Past the Twilio IVR Core content from the clipboard.
To run the Customer Service Hell IVR, create another flow in Node-RED and import from the clipboard just as you did for the Twilio IVR Core. Past the content of the IVR flow into Node-RED. This IVR relies on audio files which are retrieved by Twilio when you tell Twilio to play an audio file using a “play” or “menu” node. You can download the audio files from this link and you will need to host them on a server. You will specify the base URL of all the audio files in the configuration of your Node-RED “play” and “menu” nodes.
Don’t forget to configure your Twilio account to point your phone number at your Node-RED server (this is described earlier in the article).
You will need to make one change in the Twilio IVR Core flow: the Universal Route contains an HTTP node called “invoke route” which makes the HTTP call to the route endpoints. If your server is protected with a username/password you will need to specify them. You will also need to tell this node whether it needs to use SSL/TLS when connecting.