Quick Word Cloud from a Chatroom with D3js

3 min read Michael Carroll on Oct 9, 2014

D3.js is a JavaScript library that lets you bring data to life by creating interactive graphs and charts that run in the browser. It is a powerful tool for creating eye-catching data visualizations. As a quick and easy exercise, in this tutorial we’ll create a colorful word cloud with D3js and the PubNub Storage & Playback history API.

A word cloud is a visual representation of the words used in a particular text. The size of each word indicates its frequency or importance. Or just take a look at the screenshot. A picture is literally worth a thousand words!

Want to see it in action? Check out our D3js word cloud demo here. Additionally, all the D3js word cloud source code you need can be found here.

Now, let’s get started. We’re going to build a word cloud from a chatroom created using PubNub Data Streams. For this tutorial, you’ll need a basic to intermediate knowledge of JavaScript and the Document Object Model (DOM), and a basic understanding of SVG and D3js.

Using d3.layout.cloud

You’ll first need to sign up for a PubNub account. Once you sign up, you can get your unique PubNub keys in the PubNub Developer Portal. Once you have them, clone the GitHub repository and enter your unique PubNub keys in the PubNub initialization code.

To create a word cloud layout, we use D3js along with a third-party plugin for D3js, d3-cloud by Jason Davies. In your HTML file, include both the D3js script and the d3-cloud script.

d3.layout.cloud takes JSON-style data: an array of objects, each holding a word as text and its frequency as size, such as [{"text": "hello", "size": 23}, {"text": "night", "size": 3}, ...]. The layout then populates each object with graphical data, such as the size of each text and its position, as a set of standard attributes, so that all the words fit neatly in place.

d3.layout.cloud runs asynchronously. When it finishes, bind the resulting data to the DOM.
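As a rough illustration of that flow, the sketch below configures the layout, waits for its "end" event, and then binds the placed words to SVG text elements. It follows the pattern of the d3-cloud example rather than the demo's exact code. The function names (layoutCloud, drawCloud, wordTransform), the palette, the "Impact" font, and the padding value are all assumptions made for illustration. The d3 object is passed in as a parameter only so the snippet does not depend on a global; in a page you would pass the global d3 with d3-cloud loaded.

```javascript
// Illustrative palette (an assumption, not taken from the demo).
const PALETTE = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd'];

// Build the SVG transform string for one laid-out word.
function wordTransform(d) {
  return 'translate(' + [d.x, d.y] + ')rotate(' + d.rotate + ')';
}

// Configure and start the layout; onEnd runs once every word is placed.
function layoutCloud(d3, words, width, height, onEnd) {
  d3.layout.cloud()
    .size([width, height])
    .words(words)
    .padding(5)
    // Randomly keep a word horizontal or rotate it by 90 degrees.
    .rotate(function () { return Math.random() < 0.5 ? 0 : 90; })
    .font('Impact')
    .fontSize(function (d) { return d.size; })
    .on('end', onEnd)
    .start();
}

// Bind the laid-out words to SVG <text> elements.
function drawCloud(d3, placedWords, width, height) {
  d3.select('body').append('svg')
      .attr('width', width)
      .attr('height', height)
    .append('g')
      // Layout coordinates are relative to the center of the canvas.
      .attr('transform', 'translate(' + [width / 2, height / 2] + ')')
    .selectAll('text')
      .data(placedWords)
    .enter().append('text')
      .style('font-size', function (d) { return d.size + 'px'; })
      .style('font-family', 'Impact')
      .style('fill', function (d, i) { return PALETTE[i % PALETTE.length]; })
      .attr('text-anchor', 'middle')
      .attr('transform', wordTransform)
      .text(function (d) { return d.text; });
}
```

A call might look like layoutCloud(d3, words, 500, 500, function (placed) { drawCloud(d3, placed, 500, 500); }), where words is an array of {text, size} objects.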

Now we need the word data from the PubNub stream.

Retrieving and Streaming Messages from a Chatroom

To retrieve the last 100 messages stored in a PubNub data stream, you can use the history() API. This function fetches the historical messages of a channel. The Storage & Playback feature provides real-time access to the history of all messages published to PubNub.

Assume you have created a channel called “chat”, which holds the message strings whose words you are going to count to create the word cloud visualization. To pass the data to d3.js to draw the word cloud, you need a function (let’s call it processData()) that returns an array of objects with the text and size keys and their values, as mentioned earlier.

In the success callback, history() returns a list of messages, the start time token, and the end time token.

[['message1', 'message2', ... ], 'Start Time Token', 'End Time Token']
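A minimal sketch of that call is shown here, assuming the SDK version current when this post was written, where history() accepts an options object with channel, count, and callback. The wrapper name fetchRecentMessages and its onMessages parameter are hypothetical, and pubnub is assumed to be an already-initialized client. Later SDK versions may use a different call signature, so check the documentation for the version you have.

```javascript
// Fetch the last 100 messages from the "chat" channel and hand the
// message array to onMessages. `pubnub` is an initialized client
// (assumption: its history() takes {channel, count, callback}).
function fetchRecentMessages(pubnub, onMessages) {
  pubnub.history({
    channel: 'chat',
    count: 100,
    callback: function (response) {
      // response has the shape [messages, startTimeToken, endTimeToken];
      // only the message array is needed for the word cloud.
      const messages = response[0];
      onMessages(messages);
    }
  });
}
```

The start and end time tokens are ignored here because a single page of 100 messages is enough for this exercise.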

Pass the array of messages to processData() to convert it into the proper form, with the keys and values of your word data objects.
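One possible shape for processData() is sketched below. This is not the demo's implementation (that lives in the GitHub repository); it is a simple word counter that assumes each message is a plain string and treats anything other than lowercase letters, digits, and apostrophes as a separator.

```javascript
// Hypothetical processData(): count how often each word appears across
// all messages and return [{text, size}, ...], most frequent first.
function processData(messages) {
  const counts = new Map();
  for (const message of messages) {
    // Lowercase so "Hello" and "hello" are counted together, then split
    // on any run of characters that is not a letter, digit, or apostrophe.
    const words = String(message).toLowerCase().split(/[^a-z0-9']+/);
    for (const word of words) {
      if (word.length === 0) continue; // skip empty pieces from the split
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return Array.from(counts, function (entry) {
    return { text: entry[0], size: entry[1] };
  }).sort(function (a, b) { return b.size - a.size; });
}
```

Because size here is the raw count, words that appear only once would be drawn very small. In practice you would probably scale the counts to a readable font-size range before handing them to the layout.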

After binding the data to d3.layout.cloud, you get a beautiful word cloud from your chat room messages!

For the rest of the code, including the processData() function, the entire source code of the demo is on GitHub!
