PubNub SDKs support message encryption, allowing you to encrypt all or part of your message payload end-to-end, from client to client. This article assumes that you are familiar with PubNub, have a PubNub account, and have generated a keyset with a corresponding publish key and subscribe key.
One option is to enable encryption on the top-level pubnub object when making calls through our SDKs, which means that every message you send through PubNub will be encrypted with the same key:
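As a rough sketch in JavaScript, assuming a recent SDK version that accepts the cryptoModule configuration option, the top-level setup might look like the following. The helper name createEncryptedClient and the key and user ID values are placeholders of mine, and the PubNub constructor is passed in as a parameter purely so the sketch does not depend on how you load the SDK.

```javascript
// Hypothetical helper: builds a PubNub client whose messages are all
// encrypted and decrypted with a single cipherKey.
// `PubNub` is the SDK constructor, e.g. const PubNub = require('pubnub');
function createEncryptedClient(PubNub, cipherKey) {
  return new PubNub({
    publishKey: 'myPublishKey',     // placeholder
    subscribeKey: 'mySubscribeKey', // placeholder
    userId: 'myUniqueUserId',       // placeholder
    // Every published and every received message uses this one key.
    cryptoModule: PubNub.CryptoModule.aesCbcCryptoModule({ cipherKey }),
  });
}

// Usage (with the real SDK installed):
// const pubnub = createEncryptedClient(require('pubnub'), 'pubnubenigma');
```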
We also support partial message encryption, where only a part of your message is encrypted, and the rest is sent 'unencrypted' (though still over TLS!)
Examples of why you might use partial message encryption include:
PubNub needs to see your push message metadata (pn_apns or pn_fcm).
You want to use a different key for each conversation, encrypting per-channel. This means the conversation between users A and B has a separate key from that between users A and C.
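To make the idea of partial encryption concrete, here is a minimal sketch that encrypts only the message text and leaves the push metadata readable. It uses Node's built-in crypto module as a stand-in for the PubNub SDK's encryption helpers, and the field name content, the abbreviated pn_fcm payload, and the function names are all assumptions for illustration.

```javascript
const crypto = require('node:crypto');

// Stand-in for the SDK's encryption helper: AES-256-CBC with a random IV
// prepended to the ciphertext, base64-encoded for transport.
function encryptField(plaintext, key) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const body = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, body]).toString('base64');
}

function decryptField(encoded, key) {
  const raw = Buffer.from(encoded, 'base64');
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, raw.subarray(0, 16));
  const body = Buffer.concat([decipher.update(raw.subarray(16)), decipher.final()]);
  return body.toString('utf8');
}

// Only `content` is encrypted; the push metadata stays readable by PubNub.
function buildPartiallyEncryptedMessage(text, key) {
  return {
    pn_fcm: { notification: { title: 'New message', body: 'You have a new message' } },
    content: encryptField(text, key),
  };
}

const conversationKey = crypto.randomBytes(32); // placeholder 256-bit key
const message = buildPartiallyEncryptedMessage('Meet at noon', conversationKey);
console.log(decryptField(message.content, conversationKey)); // the original text
```

Because the key is passed to each call rather than fixed on the client, the same helpers can be used with a different key for every conversation.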
Where did the cipherKey come from?
In the above example code and the code samples given in our documentation, the cipherKey (pubnubenigma above) has magically appeared. How was the cipherKey generated? And how did it securely get to your client? This article will address those questions, first in the abstract that can be applied to any cloud or self-hosted solution, and then we will discuss a more specific, worked example for AWS.
Overall Security Principles and Requirements
The goal is to create a secure and encrypted end-to-end messaging solution using PubNub to exchange messages between clients.
Additionally, messages will be encrypted between users with a unique symmetric key, e.g., AES-256.
Keys should never be stored or transmitted in plaintext.
Clients will perform encryption/decryption locally using the appropriate symmetric conversation key, providing end-to-end encryption.
Every conversation between 2 or more users has its own 'per-conversation' secret key.
User registration and authentication are required before exchanging messages. This article does not cover 'guest users', which, though technically possible, would require special handling.
PubNub cannot see your message data unless you explicitly allow it. For example, the metadata required for PubNub to deliver your mobile push messages could be plaintext, while all other parts of the message are encrypted.
You may or may not trust the server generating the per-conversation key. This article will mostly assume you trust the server, but will also discuss zero-trust approaches.
Users should be able to participate in the same conversation on multiple devices, such as desktop and mobile.
Keys should be able to be rotated as needed, for example, if a vulnerability is discovered.
All network traffic will be encrypted by secure connections (TLS).
A quick note about asymmetric encryption: Although PubNub makes it easy to implement symmetric message encryption, nothing prevents you from implementing asymmetric encryption instead. I'll touch on that briefly at the very end of this article.
Cloud-Agnostic Architecture
Generating a per-conversation encryption key
Every conversation will have a unique key used to encrypt data sent between users.
The first step is to generate that key:
Step 1 - User authentication:
As mentioned earlier, this article will not consider guest users, so for this architecture, all users should register and be authenticated by the cloud solution's identity management system.
Step 2 - Create conversation between user A and user B:
User A and user B want to talk to each other, but how that conversation is initiated will depend on your application architecture:
User B might initiate the conversation as illustrated in the diagram (perhaps by finding User A in a 'search users' dialog).
Your application might offer a 'create group' feature, and creating a group would create the conversation between all group participants (user A, user B, user C, …)
Your app server might automatically create the conversation. For example, User A might be a moderator who can message any user individually, so it makes sense to create this whenever a new user registers.
'Creating a conversation' would do several things, such as creating a PubNub channel, but to generate the conversation key, you would call an API endpoint on your backend server, possibly requiring a separate API key.
Step 3 - Generate random AES symmetric key (conversation key):
Major cloud providers allow you to generate keys in a secure and trusted environment. There will be a provider-specific SDK that you can invoke from within that provider's serverless container (later in this article I'll walk through calling the AWS KMS SDK from within an AWS Lambda function [node.js]).
AES-256 is used for several reasons:
AES-256 provides strong security (128-bit block size with a 256-bit key), offering one of the strongest encryption levels for symmetric cryptography.
It is designed for fast, secure encryption of sensitive data and is highly optimized in hardware.
It is natively supported by a number of platforms and, most importantly, by PubNub's SDKs.
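For a sense of what a conversation key actually is, the sketch below generates 32 random bytes locally with Node's crypto module. This is only a stand-in to show the key size: in the architecture described here the key comes from your cloud provider's key management service, and the function name is my own.

```javascript
const crypto = require('node:crypto');

// 32 random bytes = 256 bits, the key size AES-256 expects.
function generateConversationKey() {
  return crypto.randomBytes(32);
}

const key = generateConversationKey();
console.log(key.length * 8); // 256
```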
Steps 4 & 5 - Encrypt the conversation key for each user:
The conversation key is generated once, then encrypted separately for each participant who needs to use it.
You have two options:
Server-trusted: This is the simpler approach, but it requires that the server know the encryption key. This approach will use the key management component of your cloud solution to re-encrypt the conversation key using a common key for each user. User authentication and role policies control which conversation keys a user can access.
Zero-trust: This is a stronger approach but involves additional configuration steps. Each user has a public-private key pair, with the user's public key shared with and stored by the cloud provider. When a conversation key is generated, it is encrypted separately for each user using that user's public key, meaning that only the user who provided their public key can decrypt it.
Which should you use? Server-trusted is an excellent approach for an MVP or an internal system, but public apps with a strong E2E encryption guarantee will typically opt for the zero-trust approach. The examples in the second half of this article will focus on server-trusted for AWS, because it is easier to explain and does not significantly affect the architecture of key distribution.
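The zero-trust option can be sketched with Node's built-in crypto module. RSA with OAEP padding is my choice for this illustration, not something the architecture mandates, and the function names and throwaway key pairs are hypothetical.

```javascript
const crypto = require('node:crypto');

const oaep = { padding: crypto.constants.RSA_PKCS1_OAEP_PADDING, oaepHash: 'sha256' };

// Server side: wrap the conversation key once per participant.
function wrapKeyForUsers(conversationKey, publicKeysByUser) {
  const wrapped = {};
  for (const [userId, publicKey] of Object.entries(publicKeysByUser)) {
    wrapped[userId] = crypto
      .publicEncrypt({ key: publicKey, ...oaep }, conversationKey)
      .toString('base64');
  }
  return wrapped;
}

// Client side: only the holder of the private key can unwrap their copy.
function unwrapKey(wrappedBase64, privateKey) {
  return crypto.privateDecrypt({ key: privateKey, ...oaep }, Buffer.from(wrappedBase64, 'base64'));
}

// Demo with throwaway key pairs; in practice each client would generate its
// own pair and upload only the public half.
const userA = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
const userB = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
const conversationKey = crypto.randomBytes(32);
const wrapped = wrapKeyForUsers(conversationKey, { A: userA.publicKey, B: userB.publicKey });
console.log(unwrapKey(wrapped.A, userA.privateKey).equals(conversationKey)); // true
```

Each user's stored value is different, yet every user recovers the same plaintext conversation key.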
Step 6 - Store encrypted keys for each user:
Store the encrypted keys in your cloud database with a version number and timestamp. Note that as you rotate the keys used to send messages, you will need access to previous keys to decrypt historical messages.
You will have one encrypted key for each user. Although this article does not cover this, you can also imagine that if a new user is added to an existing conversation, you would add a new entry for this user in the database with their own encrypted key.
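One possible shape for these records is sketched below, with an in-memory Map standing in for the cloud database. The class, its method names, and the record fields are assumptions for illustration only.

```javascript
// In-memory stand-in for the cloud database (hypothetical record shape).
class ConversationKeyStore {
  constructor() {
    this.records = new Map(); // "conversationId#userId" -> array of versions
  }

  // Appends a new version; older versions are kept for historical messages.
  put(conversationId, userId, encryptedKey, createdAt = Date.now()) {
    const id = `${conversationId}#${userId}`;
    const versions = this.records.get(id) || [];
    const record = { version: versions.length + 1, encryptedKey, createdAt };
    versions.push(record);
    this.records.set(id, versions);
    return record;
  }

  // Returns the requested version, or the latest one if none is given.
  get(conversationId, userId, version) {
    const versions = this.records.get(`${conversationId}#${userId}`) || [];
    if (version === undefined) return versions[versions.length - 1];
    return versions.find((r) => r.version === version);
  }
}

const store = new ConversationKeyStore();
store.put('conv-1', 'A', 'encrypted-v1');
store.put('conv-1', 'A', 'encrypted-v2');
console.log(store.get('conv-1', 'A').version); // 2
```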
Retrieving the per-conversation key and sending messages
PubNub supports conversations with any number of users. Whether it is a 1:1 direct message conversation or a group conversation between many participants, everybody in the conversation needs access to that conversation's shared key. All encrypted conversation keys are stored in the cloud database, so each client needs to have a way of obtaining and decrypting that key.
There are several reasons you might want to retrieve the conversation key:
You want to send an encrypted PubNub message
You have received an encrypted PubNub message
The encryption key was updated, and you need the newer version of the key
You have been added to a new group conversation
Step 1 - Request conversation key:
In the architecture diagram, the client initiates the request for a new conversation key by calling an API endpoint on your cloud infrastructure. It is also possible for the conversation key to be pushed to an existing user; for example, you might push the encrypted keys after they are rotated, but that will depend on your architecture.
The request might also include which version of the key is required, or a timestamp at which the returned key should be valid.
Steps 2, 3, & 4 - Retrieve & return encrypted conversation key:
The cloud platform performs a database lookup based on the request. For example, it will return the specific key version requested. The key remains encrypted all the way from the database back to the client.
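The lookup itself could be sketched as follows, given the stored records for one user in one conversation. I am assuming here that a key version is valid from its creation time until the next version is created; the record fields and function names are hypothetical.

```javascript
// Returns the newest record created at or before `timestamp`.
function findKeyValidAt(records, timestamp) {
  return records
    .filter((r) => r.createdAt <= timestamp)
    .sort((a, b) => b.createdAt - a.createdAt)[0];
}

// Resolves a request by explicit version, by timestamp, or else the latest key.
function resolveKeyRequest(records, { version, timestamp } = {}) {
  if (records.length === 0) return undefined;
  if (version !== undefined) return records.find((r) => r.version === version);
  if (timestamp !== undefined) return findKeyValidAt(records, timestamp);
  return records.reduce((latest, r) => (r.version > latest.version ? r : latest));
}

const records = [
  { version: 1, encryptedKey: 'encrypted-v1', createdAt: 1000 },
  { version: 2, encryptedKey: 'encrypted-v2', createdAt: 2000 },
];
console.log(resolveKeyRequest(records, { timestamp: 1500 }).version); // 1
console.log(resolveKeyRequest(records).version); // 2
```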
Step 5 - Decrypt conversation key:
The client decrypts the conversation key. How this is achieved will depend on whether you adopted a server-trusted approach or a zero-trust approach, as described for steps 4 and 5 in the previous workflow.
After decryption, all clients will have the same plaintext key.
The diagram shows two users, A and B, but there can be any number of users in the conversation, all sharing the same plaintext key—it just depends on the size of your conversation group.
Step 6 - Send encrypted PubNub message:
The plaintext conversation key can now be used as the cipherKey to encrypt the PubNub messages before sending them.
To extend the code snippet given in the introduction:
Step 7 - Decrypt received PubNub Message:
Again, using the conversation key that was decrypted in step 5, the received PubNub message contents can be decrypted as follows:
To clarify, the PubNub message has been decrypted using the 'conversation key', which was decrypted on the client in step 5 above.
Rotating the per-conversation encryption key
You will need to have a mechanism in place for rotating keys. How and why key rotation is initiated is up to you, and will depend on your application design. You might choose to rotate keys at a specific time interval, or every N messages exchanged; you might initiate the rotation by a client, or server, or both.
Regardless of how or why you initiate a key rotation, the workflow to rotate your keys will look as follows:
There are similarities between this workflow and the previously discussed workflows for generating and retrieving keys, so I recommend reviewing those sections if you have not already done so.
Step 1 - Rotate conversation key:
Regardless of how the rotation is initiated, you will have a dedicated API endpoint for this that authorized users can call.
Step 2 - Generate new AES symmetric key:
A new per-conversation symmetric key is generated in the same way as described in the workflow for 'Generating a per-conversation encryption key'.
Steps 3 & 4 - Encrypt new conversation keys for each user:
The new key is encrypted for each user in the same way as described in the workflow for 'Generating a per-conversation encryption key'.
Step 5 - Store new encrypted keys for each user:
The new encrypted keys, one for each user, are stored in your cloud storage database with an appropriate version number and timestamp. The existing record stored in the 'Generating a per-conversation encryption key' workflow is augmented with this new key. Clients can retrieve any historical key if they need to decrypt historical messages.
Step 6 - Distribute new encryption key to users:
Every user in the conversation needs a copy of the updated key to send and receive messages. You must trigger each user to run through the 'Retrieving the per-conversation key' workflow. Further, the user must do this on every device (mobile, tablet, desktop) currently logged into your app.
There are many ways you could achieve this. I would recommend:
Whichever process triggers the key rotation should also send a PubNub message/notification to every client in the channel, instructing them to retrieve the latest key.
If needed, you could prevent a client from publishing messages until they complete their key upgrade using PubNub Access Manager, but that is outside the scope of this article.
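On the client, handling such a notification might look something like the sketch below. The notice shape ({ conversationId, version }), the class, and its method names are my assumptions; the idea is simply to mark a conversation as stale until the newer key has been fetched.

```javascript
// Hypothetical client-side cache of decrypted conversation keys.
class ConversationKeyCache {
  constructor() {
    this.keys = new Map();  // conversationId -> { version, key }
    this.stale = new Set(); // conversations that must re-fetch before publishing
  }

  // Stores a freshly fetched key and clears any stale marker.
  set(conversationId, version, key) {
    this.keys.set(conversationId, { version, key });
    this.stale.delete(conversationId);
  }

  // Call this from the PubNub message listener when a rotation notice arrives.
  handleRotationNotice(notice) {
    const current = this.keys.get(notice.conversationId);
    if (!current || current.version < notice.version) {
      this.stale.add(notice.conversationId);
    }
  }

  canPublish(conversationId) {
    return this.keys.has(conversationId) && !this.stale.has(conversationId);
  }
}

const cache = new ConversationKeyCache();
cache.set('conv-1', 1, Buffer.alloc(32));
cache.handleRotationNotice({ conversationId: 'conv-1', version: 2 });
console.log(cache.canPublish('conv-1')); // false until version 2 is fetched
```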
Worked example with AWS
The following section will describe how to configure key management for PubNub encrypted real-time messaging using AWS in more detail. However, the principles described here apply equally to other major cloud providers. As mentioned earlier, this example will show a trusted-server approach for simplicity, but I'll point out the changes needed for a zero-trust approach as needed.
Generating a per-conversation encryption key on AWS
The unique key for each conversation is generated by AWS KMS (Key Management Service).
Step 1 - Authenticate end users with AWS Cognito:
If you build your solution with AWS, your identity provider will typically be AWS Cognito. After logging in, AWS policy management can assign permission for an authenticated user to decrypt data using KMS.
For my simple test, I created an identity pool and user pool, with the appropriate permissions (kms:Decrypt, kms:Encrypt, kms:GenerateDataKey). I am sharing the client-side code below that I used for testing a login that returns an authenticated kmsClient object. This kmsClient object will be used in subsequent steps.
I chose to write my clients in JavaScript, but both AWS and PubNub support many client languages, including Kotlin, Swift, and Java.
Step 2 - Generate a new key for each conversation with API Gateway:
As described earlier when discussing the cloud-agnostic approach, there are a number of ways that the conversation might be initiated, for example user A might initiate the conversation with user B after finding them in your app's directory.
Regardless of how the conversation is initiated, it requires a new per-conversation key to be created, which will be triggered by a call to the AWS API Gateway.
Example HTTP POST to the /session/generate endpoint, requesting a new key be generated for a conversation, 'test-session-001', between two users, A & B.
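A request along these lines could be built as in the sketch below. Only the /session/generate path and the session name come from this article; the host name, the body field names, and the helper function are assumptions.

```javascript
// Builds the arguments for fetch(); host and body field names are hypothetical.
function buildGenerateKeyRequest(apiBaseUrl, sessionId, userIds) {
  return {
    url: `${apiBaseUrl}/session/generate`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ sessionId, userIds }),
    },
  };
}

const request = buildGenerateKeyRequest('https://api.example.com', 'test-session-001', ['A', 'B']);
console.log(request.options.body); // {"sessionId":"test-session-001","userIds":["A","B"]}

// Usage: const response = await fetch(request.url, request.options);
```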
Steps 3, 4 & 5 - Generate a new, encrypted conversation key with AWS KMS and AWS Lambda:
Earlier in the article, the two options for encrypting this conversation key were discussed: 'server-trusted' and 'zero-trust'.
The AWS KMS makes it easy to implement a 'server-trusted' approach by generating a data key that is common to all users, and then controlling each user's ability to access that key through role permissions.
To implement a zero-trust approach, the AWS Lambda would need to use a previously stored user public key which could then be used to encrypt the conversation key. The values stored in the database would be different for each user in that case, since the conversation key is encrypted with a different user-specific key.
To clarify, the zero-trust approach requires two keys: the per-conversation key as described throughout this article, and an entirely separate user-specific public key, meaning only the user with the associated private key is able to decrypt their copy of the per-conversation key.
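For the server-trusted path, the core of the Lambda might be sketched as follows, using GenerateDataKeyCommand from the AWS SDK for JavaScript v3. The function name is hypothetical, and the SDK module and client are passed in as parameters (my choice for this sketch) so it does not depend on how they are created in your Lambda.

```javascript
// `sdk` is the @aws-sdk/client-kms module and `kmsClient` a KMSClient instance.
// `kmsKeyId` identifies the KMS key that protects the data key (an assumption:
// use whichever key ID, ARN, or alias you created).
async function generateEncryptedConversationKey(sdk, kmsClient, kmsKeyId) {
  const command = new sdk.GenerateDataKeyCommand({ KeyId: kmsKeyId, KeySpec: 'AES_256' });
  const { CiphertextBlob } = await kmsClient.send(command);
  // Only the encrypted copy leaves this function; the plaintext copy that
  // KMS also returns is deliberately ignored.
  return Buffer.from(CiphertextBlob).toString('base64');
}

// Usage inside a Lambda (with the real SDK installed):
// const sdk = require('@aws-sdk/client-kms');
// const encryptedKeyBase64 =
//   await generateEncryptedConversationKey(sdk, new sdk.KMSClient({}), 'alias/my-key');
```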
Step 6 - Store the encrypted keys in DynamoDB:
The encrypted per-conversation key, encryptedKeyBase64, is stored in DynamoDB with an associated timestamp and version number. Note that the same encryptedKeyBase64 is stored against each user in this example.
For simplicity, the code below contains some hardcoded assumptions:
No check is made whether the key exists before writing the data.
Only 2 users are ever assigned keys, but this should be dynamic and depend on the number of users in the conversation, which could be passed as a parameter to the Lambda.
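Both simplifications could be lifted along the lines of the sketch below, which builds one conditional put request per user for any number of users. The table name, attribute names, and key schema are assumptions about how you might model the table, not a description of the code used in my test.

```javascript
// Hypothetical table name and key schema.
const TABLE_NAME = 'ConversationKeys';

// Builds one DynamoDB put request per user in the conversation.
function buildKeyPutRequests(sessionId, userIds, encryptedKeyBase64, version = 1) {
  const createdAt = new Date().toISOString();
  return userIds.map((userId) => ({
    TableName: TABLE_NAME,
    Item: {
      sessionId,                               // partition key (assumed)
      userKeyVersion: `${userId}#v${version}`, // sort key (assumed)
      userId,
      version,
      encryptedKeyBase64,                      // same value for every user
      createdAt,
    },
    // Reject the write if this user/version entry already exists.
    ConditionExpression: 'attribute_not_exists(sessionId)',
  }));
}

const requests = buildKeyPutRequests('test-session-001', ['A', 'B', 'C'], 'AQID');
console.log(requests.length); // 3
```

Each returned object is shaped to be passed to a PutCommand from @aws-sdk/lib-dynamodb; the condition makes the write fail rather than silently overwrite an existing entry.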
Retrieving the encryption key and sending messages (AWS)
As explained earlier in this article, when discussing how to retrieve the per-conversation key in a cloud-agnostic way, you will need to retrieve the conversation key whenever you want to send or receive an encrypted PubNub message.
Steps 1 through 4 - Request and receive encrypted key using AWS Lambda & DynamoDB:
Requests from clients to retrieve the encrypted key will be received at the API Gateway, where they will trigger an AWS Lambda, which in turn will look up the values in the DynamoDB table.
Example HTTP POST to the /session/key endpoint, requesting the key associated with user A for session 'test-session-001'.
AWS Lambda to handle the POST. Error handling is omitted for brevity.
Throughout this entire process, the conversation key has remained encrypted.
Step 5 - Decrypt the encrypted per-conversation key using AWS KMS:
As mentioned previously, this article assumes a server-trusted approach.
In step 1 of the previous workflow, 'Generating a per-conversation encryption key on AWS', we logged in using AWS Cognito, which gave us an instance of kmsClient with the appropriate role to decrypt KMS keys. kmsClient will now be used to decrypt the per-conversation key.
If you were implementing a zero-trust approach, you would not be able to take advantage of the AWS KMS to decrypt the key. Instead, each client would decrypt its own copy of the conversation key locally, using the private key that corresponds to the public key you had previously stored for that user.
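For the server-trusted path, the KMS decryption step might be sketched as follows, using DecryptCommand from the AWS SDK for JavaScript v3. The function name is hypothetical, the SDK module is passed in as a parameter for the purposes of this sketch, and Buffer assumes a Node.js environment (a browser client would need an equivalent base64 conversion).

```javascript
// `sdk` is the @aws-sdk/client-kms module; `kmsClient` is the authenticated
// KMSClient obtained after the Cognito login.
async function decryptConversationKey(sdk, kmsClient, encryptedKeyBase64) {
  const command = new sdk.DecryptCommand({
    CiphertextBlob: Buffer.from(encryptedKeyBase64, 'base64'),
  });
  const { Plaintext } = await kmsClient.send(command);
  return Buffer.from(Plaintext); // raw key bytes, 32 of them for AES-256
}

// Usage (with the real SDK installed):
// const conversationKey =
//   await decryptConversationKey(require('@aws-sdk/client-kms'), kmsClient, encryptedKeyBase64);
```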
Step 6 - Send PubNub Message encrypted with conversation key:
Having retrieved the conversation key in plaintext, this can now be used as the cipherKey to encrypt PubNub messages. To clarify, this encryption by the PubNub cryptoModule is a separate process to the encryption in steps 1-5 of this workflow.
PubNub gives you the option to specify the CryptoModule on the SDK's root pubnub object. This has the advantage that PubNub will automatically encrypt and decrypt messages for you, but comes with the caveat that all messages are fully encrypted with the same key, so this example does not use that approach. The code below shows a different technique, 'partial message encryption', which (despite the name) can be applied to the entire message body but also allows you to change the encryption key as needed.
Step 7 - Decrypt PubNub message with conversation key:
The recipient also has a copy of the decrypted conversation key, as described in step 5, above. The received PubNub message has its contents decrypted using the Crypto module as shown below. Note the conversion to and from base64, as some APIs require an array of unsigned integers.
Note that this decryption is a separate process from what was described in steps 1 through 5 of this workflow—it uses the PubNub CryptoModule and not the kmsClient.
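The base64 conversion mentioned above can be sketched with two small helpers. These assume a Node.js environment where Buffer is available, and the function names are my own.

```javascript
// base64 string -> array of unsigned 8-bit integers
function base64ToBytes(base64) {
  return new Uint8Array(Buffer.from(base64, 'base64'));
}

// Uint8Array or ArrayBuffer -> base64 string
function bytesToBase64(bytes) {
  return Buffer.from(bytes).toString('base64');
}

const bytes = base64ToBytes('AQID');
console.log(Array.from(bytes));    // [ 1, 2, 3 ]
console.log(bytesToBase64(bytes)); // AQID
```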
Rotating the encryption key (AWS)
As explained previously, you will need a mechanism to rotate your keys and notify clients that the update has occurred.
Your specific architecture for this on AWS will look something like this:
Step 1 - Rotate conversation key:
Regardless of how the rotation is initiated, the AWS API Gateway will expose an endpoint that authorized users can call.
Example HTTP POST to the /session/rotate endpoint, requesting the key for test-session-001 between users A & B be updated to version 2.
Steps 2, 3 & 4 - Generate new key with AWS KMS and AWS Lambda:
The steps that cover the regeneration of the key will be very similar to the steps used to generate the key initially, covered by the 'Generating a per-conversation encryption key on AWS' workflow.
An AWS Lambda handles the request to securely generate a new key, using the AWS KMS, as follows (error handling omitted for brevity):
As stated previously, this is using a 'server-trusted' approach. See the workflow for initially generating the key for more information about a 'zero-trust' approach.
Step 5 - Update the DynamoDB store with the new key version:
Again, storing version 2 or higher of the key proceeds in a very similar way to how version 1 was stored. The encryptedKeyBase64 output from Step 4 is stored in DynamoDB against each user in the conversation. Note that the version number is accepted as a parameter in the algorithm below.
Error handling in this AWS Lambda, including checks for an existing key, is omitted for brevity.
Step 6 - Distribute updated key to users:
The newly updated key needs to be distributed to all users in the conversation. I described this step in more detail when I discussed the cloud-agnostic approach earlier, and since this step does not depend on AWS, the same recommendations apply. One efficient way of notifying all online clients that a new key is available is to send a PubNub message to each one, instructing them to re-fetch the latest key from the server before they send any more messages.
Conclusion & Next Steps
There is no one-size-fits-all architecture when it comes to creating and securing a real-time application at scale, and every one of our customers has unique use cases that they factor into their solution design. Many customers are happy with the existing security provided by PubNub Access Manager in conjunction with TLS connections; however, for those customers also requiring message encryption, I hope this has given you a good starting point.
The architecture and approach given in this article have met the original requirements stated in the introduction:
Messages are encrypted with a symmetric key that is never stored or transmitted in plaintext.
Clients perform encryption/decryption locally, providing end-to-end encryption, and every conversation has a unique key.
Users are authenticated at the user level (as opposed to the device level), meaning they can send and receive encrypted messages from multiple clients (phone, desktop), provided they are logged in.
Keys can be rotated as needed.
This article did not give detailed instructions on creating and configuring the AWS resources described… the article was already long enough! If you need additional help with any aspect of your PubNub setup, please get in touch with us and we will gladly assist.
What about Asymmetric encryption?
If you want inspiration on implementing asymmetric encryption in your chat application, I highly recommend reading through the WhatsApp Encryption Overview. It contains all the information you need to implement their encryption architecture in your application, but it is quite an involved process.
Note that nothing is preventing you from implementing asymmetric encryption with PubNub messages. Still, unlike symmetric encryption, no helper methods are available in the PubNub SDK (at the time of writing), so you must encrypt and decrypt messages yourself.
Finally, please do get in touch if you have questions; our presales and devrel teams are happy to help.