Conversational AI has become ubiquitous in recent years, with millions of smart speakers sold annually. Amazon’s Echo is the market leader, and a common question we are asked is how to integrate PubNub with a user’s Amazon Alexa device.
This how-to guide will assume familiarity with Amazon’s “Alexa Skills” platform, the developer toolkit that allows third-party-developed interactions to be created and deployed to Alexa. To learn more about Alexa Skills please refer to the resources pane of your Amazon Alexa console or Amazon’s ‘build an Alexa Skill’ tutorial.
At a very high level, to develop an application that interfaces with Alexa you would do the following:
Create a ‘skill’. This defines the type of conversation, e.g. ‘quiz of the day’ or ‘today’s horoscope’. The skill is started with an ‘invocation’, e.g. “Alexa, talk to daily horoscopes”.
Define what the user wants to achieve, known as ‘Intents’, e.g. “Tell me the horoscope for Gemini”.
Build the model. This is the most powerful feature of any conversational AI. How does the system know what ‘Intent’ you are trying to invoke? The user might ask “what is my horoscope?” or “Give me Gemini?” or any number of permutations of that; the power of Alexa skills (or any natural language parsing system) is that it gives you a user-friendly way to train your model without wading through the underlying complexity.
Do something in response to the user’s ‘Intent’. For the horoscope example this is trivial, but for most real-world applications this is the hard bit. You might perform a database lookup, but if your use case involves publishing messages in response to a user’s voice request, then PubNub is perfect for you. For example:
Hey Alexa, let my hockey group know that training is cancelled
Hey Alexa, set my chat app status as away
Hey Alexa, do I have any pending messages?
Hey Alexa, who is top of the leaderboard?
Hey Alexa, where are my keys?
Hey Alexa, send a message to Keith in electronics telling him there is a customer waiting
You get the idea…
You could easily achieve any of the above and many other actions using PubNub.
Understanding what the user is trying to do is handled entirely by Alexa, but how you respond to that request can be implemented in a number of ways. When you first create your Skill, you will be asked to choose one of the following hosting options:
Alexa-hosted (Node.js) [recommended by Amazon]
Provision your own. This will involve defining an AWS Lambda function or webhook to handle your request.
This how-to assumes the recommended approach and uses a Node.js hosted function but there is no reason this principle would not work with the other approaches.
The system overview will look as follows:
Everything within the box labeled ‘Amazon Infrastructure’ is standard Alexa skills. The remainder of this article will concentrate on the connection between the AWS Lambda and PubNub.
As stated previously, this how-to will assume a Node.js backend but this same principle would also work with a Python backend.
Open package.json from your Skill’s ‘Code’ tab and manually add the latest version of pubnub to the dependencies. You do not need to run npm install. When you ‘deploy’ your Skill, Amazon will automatically install any dependencies you included in package.json.
In index.js, make sure you import the PubNub library, e.g. const PubNub = require('pubnub');
You then need to create the PubNub object. Remember that the Lambda function is invoked afresh whenever an Intent is handled, so you will need to create the object within the Lambda code each time. Where you do this is up to you. For simplicity, the code below creates the object within the Intent handler, but if you have multiple Intents that use PubNub you will probably want to declare it outside the handler.
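As a minimal sketch of this step, the helper below builds the client from a constructor and a set of keys. It assumes a recent PubNub JavaScript SDK, where the constructor accepts publishKey, subscribeKey and userId (older SDK versions used uuid instead of userId). The helper name createPubNubClient, the key values and the 'alexa-skill' user ID are placeholders invented for illustration. The constructor is passed in as a parameter so the helper can be tried without the SDK installed; in the Skill you would pass the class returned by require('pubnub').

```javascript
// Hypothetical helper: builds a PubNub client from a constructor and a set of keys.
// In the Skill, PubNubClass would be the value returned by require('pubnub').
function createPubNubClient(PubNubClass, keys) {
  return new PubNubClass({
    publishKey: keys.publishKey,     // placeholder, taken from your PubNub keyset
    subscribeKey: keys.subscribeKey, // placeholder, taken from your PubNub keyset
    userId: keys.userId,             // any stable identifier for this Skill
  });
}

// Example keys (placeholders, not real values).
const exampleKeys = {
  publishKey: 'pub-c-your-publish-key',
  subscribeKey: 'sub-c-your-subscribe-key',
  userId: 'alexa-skill',
};
```

You could call createPubNubClient(PubNub, exampleKeys) inside a single Intent handler, or once near the top of index.js if several Intents share the client.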
To get data into the PubNub network, you publish it to a defined channel. Once published, your message will be received by all subscribers with very low latency.
The PubNub documentation recommends the async / await syntax. If you use that approach, remember to declare your handle method as async.
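The sketch below shows what an Intent handler that publishes a message might look like. It is only an illustration under a few assumptions: the factory name makeCancelTrainingHandler, the Intent name CancelTrainingIntent, the channel name hockey-group and the message payload are all hypothetical, and pubnub is assumed to be a client created with new PubNub({...}). The handler shape (canHandle / handle, and responseBuilder.speak(...).getResponse()) follows the ASK SDK for Node.js.

```javascript
// Hypothetical factory: returns an Alexa Intent handler that publishes to PubNub.
// The Intent name, channel name and payload are illustrative placeholders.
function makeCancelTrainingHandler(pubnub) {
  return {
    canHandle(handlerInput) {
      const request = handlerInput.requestEnvelope.request;
      return request.type === 'IntentRequest'
        && request.intent.name === 'CancelTrainingIntent';
    },
    // The method is declared async so that await can be used inside it.
    async handle(handlerInput) {
      let speech;
      try {
        await pubnub.publish({
          channel: 'hockey-group',
          message: { text: 'Training is cancelled' },
        });
        speech = 'I have let your hockey group know.';
      } catch (err) {
        // publish() rejects if the message could not be sent.
        speech = 'Sorry, I could not send that message.';
      }
      return handlerInput.responseBuilder.speak(speech).getResponse();
    },
  };
}
```

Awaiting the publish before building the response means Alexa only confirms the action once PubNub has accepted the message.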
Since the lambda function only exists whilst handling the Skill Intent, it is not possible to maintain a connection with PubNub listening for messages.
One workaround is to query the message history to find any messages that have been missed since the Intent was last invoked:
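A minimal sketch of such a query, assuming Message Persistence is enabled on your PubNub keyset and that pubnub is an already-created client. The helper name fetchLastMessage is hypothetical; fetchMessages is called with a list of channels and a count, and its result is keyed by channel name.

```javascript
// Hypothetical helper: returns the most recent message on a channel, or null.
async function fetchLastMessage(pubnub, channel) {
  const result = await pubnub.fetchMessages({
    channels: [channel],
    count: 1, // only ask for the latest message
  });
  // Results are keyed by channel name; treat a missing key as "no history".
  const messages = (result.channels && result.channels[channel]) || [];
  return messages.length > 0 ? messages[messages.length - 1].message : null;
}
```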
This example uses the fetchMessages API to retrieve the last message on the channel. In production you would probably retrieve all messages published since the last Skill Intent was invoked. The fetchMessages API allows you to specify start and end timetokens to retrieve the messages between them, so you could use a Skill session attribute to keep track of the last time you were invoked.
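The timetoken idea could be sketched roughly as follows. This is an illustration rather than a definitive implementation: the helper name fetchMessagesSinceLastIntent and the lastTimetoken attribute name are hypothetical, and the code assumes that specifying only end makes fetchMessages return messages from that timetoken (inclusive) and newer, which is worth checking against the SDK documentation for your version. The attributesManager argument is the ASK SDK object available as handlerInput.attributesManager.

```javascript
// Hypothetical helper: fetches messages newer than the timetoken stored in the
// Skill's session attributes, then records the newest timetoken seen.
async function fetchMessagesSinceLastIntent(pubnub, attributesManager, channel) {
  const attrs = attributesManager.getSessionAttributes();
  const lastSeen = attrs.lastTimetoken; // hypothetical attribute name

  const params = { channels: [channel], count: 100 };
  // Assumption: with only `end` set, history returns messages from that
  // timetoken (inclusive) and newer.
  if (lastSeen) params.end = lastSeen;

  const result = await pubnub.fetchMessages(params);
  const all = (result.channels && result.channels[channel]) || [];
  // Drop the message already seen, since the `end` bound is inclusive.
  const fresh = all.filter((m) => String(m.timetoken) !== String(lastSeen));

  if (fresh.length > 0) {
    // Timetokens are 17-digit values, so compare them as BigInt, not Number.
    attrs.lastTimetoken = fresh
      .map((m) => String(m.timetoken))
      .reduce((a, b) => (BigInt(a) >= BigInt(b) ? a : b));
    attributesManager.setSessionAttributes(attrs);
  }
  return fresh.map((m) => m.message);
}
```

Session attributes only last for the current Skill session, so tracking the timetoken across separate sessions would need persistent storage instead.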
Conversational AI solutions like Alexa Skills are hugely powerful natural language processing systems that allow you to quickly prototype a conversational solution.
Many customers struggle to move from prototype to production as interfacing with backend data is always non-trivial. PubNub solves this problem, making it easy to send and receive messages at scale.
If you would like a more hands-on tutorial, please refer to Alexa voice controlled Raspberry Pi using PubNub. Although written a couple of years back, the workflow is still relevant.