In this article, we’re going to send PubNub messages really, really fast. Why? This technique is useful for things like simulating app performance with larger user populations. Using PubNub sample data, you can significantly simplify app development and testing.
You could use it with any language, but one that lets you take advantage of parallelism is ideal. We will use Python (no, the GIL won’t limit us). Note that you cannot use the demo key here because it is capped at 2 messages per second (per client) to prevent abuse.
Therefore, you’ll have to sign up for a PubNub account and use your sandbox keys (get these in the portal) when writing your script. Also, we recommend you use this technique only for testing purposes, as it can become expensive at scale if used in production.
The Macro Perspective of Rapidfire Messaging with PubNub Sample Data
With all those warnings out of the way, let’s talk about our methods. Instead of using a client SDK, we’ll merely open sockets and send REST calls to PubNub. One potential solution is to open 100 or so sockets and send messages on them round-robin style. This performs extremely well, at the cost of long-term robustness. The more reliable method we will use approaches the same level of performance, especially for the data simulation use case, with the added cost of multiprocessing.
Our method is to periodically spawn a process that opens a socket, sends a quantity of messages on it, reads the responses in a separate process, and closes the socket. We need to read the responses in parallel because a socket will time out if one sends too many messages without reading the responses. In our demo, we repeat this cycle every 10 seconds, sending a randomized quantity of messages.
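To make "sending REST calls on a socket" concrete, here is a minimal sketch of how one publish call could be turned into the raw bytes written to the socket. The host name, the channel name, the placeholder keys, and the helper name `build_publish_request` are assumptions made for illustration, and the path layout follows PubNub's publish REST endpoint as we understand it; check it against the current REST documentation before relying on it.

```python
import json
from urllib.parse import quote

# Placeholders: substitute your own keys from the portal.
PUB_KEY = "your-pub-key"
SUB_KEY = "your-sub-key"
CHANNEL = "rapidfire"        # hypothetical channel name
HOST = "pubsub.pubnub.com"   # assumption: PubNub REST host


def build_publish_request(message):
    """Return one raw HTTP/1.1 GET request (bytes) that publishes `message`."""
    # The JSON payload travels in the URL path, so it must be percent-encoded.
    payload = quote(json.dumps(message), safe="")
    path = "/publish/%s/%s/0/%s/0/%s" % (
        PUB_KEY, SUB_KEY, quote(CHANNEL, safe=""), payload)
    # HTTP/1.1 connections are persistent by default, so many of these
    # requests can be written back to back on the same socket.
    return ("GET %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (path, HOST)).encode("ascii")
```

A list of such byte strings, one per message, is the kind of batch a single socket can send in one cycle.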
Imports and Constants
The imports above will almost all be used in this article; I have included UUID because this script will serve as the starter code for the next article in this series. The constants are all changeable (in fact, you must enter your pub and sub keys). However, if you use the message counter webpage I included in the GitHub code, make sure the keys and channel match.
You’ll first need to sign up for a PubNub account. Once you sign up, you can get your unique PubNub keys in the PubNub Developer Portal. Once you have them, clone the GitHub repository and enter your unique PubNub keys in the constants.
The Send & Receive Functions
Each cycle of messaging is spawned as a new process. This new process spawns (and later joins) a helper process that reads from the socket in parallel. Using threads instead of processes would not cause any concurrency issues. However, because of Python’s global interpreter lock, the true parallelism gained with multiple processes is worth the overhead of process spawning.
We’ll start by creating our “send” function. It has a single parameter: an array of strings which we will send on that socket.
We then spawn the reading process (using the socket read function we’ll write in a moment). We also record a timestamp to log the time it takes to send our messages this cycle. We start the reader process before sending. Therefore, we rely on the two-second timeout we set for the socket to keep the read going until we begin (and finish) sending.
We can now send our messages, and print the time elapsed when completed. We perform one more socket read to make sure we’ve caught everything and join the reader process we spawned. Now that we’re done, we can close the socket.
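The steps above could be sketched roughly as follows. This is not the original listing: the host, the function names, and the `connect`, `timeout`, and `worker_cls` parameters are hypothetical additions that make the sketch easy to try without a network. `drain` is a minimal stand-in for the reader function, and under Python 3 the requests are passed as bytes rather than strings.

```python
import socket
import time
from multiprocessing import Process

HOST = "pubsub.pubnub.com"  # assumption: PubNub REST host
PORT = 80


def drain(sock, timeout):
    """Stand-in reader: discard responses until timeout, close, or error."""
    sock.settimeout(timeout)  # set here too, so it also holds in a child process
    try:
        while sock.recv(4096):
            pass
    except OSError:  # socket.timeout is a subclass of OSError
        pass


def send(requests, connect=None, timeout=2.0, worker_cls=Process):
    """Send raw HTTP requests (bytes) on one socket while a helper reads it."""
    sock = connect() if connect else socket.create_connection((HOST, PORT))
    sock.settimeout(timeout)

    # Start the reader first; the timeout keeps it waiting until data flows.
    reader = worker_cls(target=drain, args=(sock, timeout))
    reader.start()

    started = time.monotonic()
    for request in requests:
        sock.sendall(request)
    elapsed = time.monotonic() - started
    print("sent %d requests in %.3f s" % (len(requests), elapsed))

    drain(sock, timeout)  # one last read to catch any stragglers
    reader.join()
    sock.close()
    return elapsed
```

When running this as a script with the default `Process` helper, call `send` from under an `if __name__ == "__main__":` guard so that it also works on platforms that do not fork.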
The socket reader function is extremely simple. We merely read from the socket in a while loop until we time out or receive an error (and log the error).
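A reader along these lines might look like the sketch below. The function name `read_responses` and the byte-count return value are illustrative choices, not taken from the original code; the function assumes the caller has already set a timeout on the socket.

```python
import socket


def read_responses(sock, bufsize=4096):
    """Read from `sock` until it times out, closes, or errors.

    Returns the number of bytes read, which is handy for logging.
    """
    total = 0
    while True:
        try:
            chunk = sock.recv(bufsize)
        except socket.timeout:
            break  # nothing arrived within the socket's timeout
        except OSError as err:
            print("socket read error:", err)  # log and stop reading
            break
        if not chunk:
            break  # the peer closed the connection
        total += len(chunk)
    return total
```

The responses themselves are discarded; the point is only to keep the receive buffer empty so the sending side is never stalled.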
The Firing “Loop” with PubNub Sample Data
Now that we’ve written our sending layer, we can write a loop that generates a message array and spawns a sender process every 10 seconds. We use the threading timer to start sending regardless of overhead and the completion of the previous send.
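One way to structure such a loop is sketched below, under a few assumptions: `make_batch`, `fire`, and `AVERAGE_MESSAGES` are hypothetical names, the payloads are placeholder strings, and `send` stands for whatever top-level sender function you have written.

```python
import random
import threading
from multiprocessing import Process

PERIOD_SECONDS = 10
AVERAGE_MESSAGES = 500  # hypothetical average batch size; tune to taste


def make_batch(average=AVERAGE_MESSAGES):
    """Return a randomly sized list of placeholder message strings."""
    count = random.randint(average // 2, average * 3 // 2)
    return ["message %d" % i for i in range(count)]


def fire(send, period=PERIOD_SECONDS, average=AVERAGE_MESSAGES):
    """Spawn one sender process now and schedule the next cycle."""
    # Re-arm the timer first, so the next cycle starts on schedule even if
    # building or sending this batch takes a while.
    timer = threading.Timer(period, fire, args=(send, period, average))
    timer.start()
    Process(target=send, args=(make_batch(average),)).start()
    return timer  # call timer.cancel() to stop the loop


# Typical use, with `send` being a top-level function:
#
# if __name__ == "__main__":
#     fire(send)
```

Because each cycle runs in its own process, a slow cycle overlaps with the next one instead of delaying it.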
That’s it for now. Play around with the average quantity of messages to test performance. You’ll see that to maximize total throughput, it is better to send a smaller number of messages frequently than a large number of messages sporadically.
In the next tutorial, we will modify this code to make the toSend variable more interesting. We will send simulated usage logs defined in our analytics package overview, and also write that simulation. If you’d like to see that the messages are sending, you can use the MessageCounter.html file included in the GitHub repo. Enjoy!