Integrating Video Calling in Chat with WebRTC and ChatEngine

What is WebRTC?

WebRTC is a free and open source project that enables web browsers and mobile devices to provide simple realtime communication. This means that app features like peer-to-peer video conferencing can easily be integrated into a web page. A browser-based video chat can be engineered rapidly with HTML and JavaScript, no back-end code required.

ChatEngine WebRTC Video Call

WebRTC allows users to stream peer-to-peer audio and video in modern web browsers. This screenshot is from a WebRTC video call between 2 iOS devices using the Safari web browser.

Making a user’s device a WebRTC client is as simple as initializing a new RTCPeerConnection() object in front-end JavaScript. Nowadays, WebRTC support comes out of the box with web browsers like Chrome, Firefox, Edge, Safari, and Opera on desktop, as well as native iOS and Android web browsers.
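Before constructing a peer connection, it can be worth checking that the environment exposes the API at all. The following is a minimal sketch and is not taken from the example app: the helper name createPeerConnection and its scope parameter are made up for illustration, and the function simply returns null wherever RTCPeerConnection is missing.

```javascript
// Hypothetical helper: returns a new RTCPeerConnection, or null when the
// environment does not provide the API.
// `scope` is the global object to inspect; it defaults to globalThis
// (which is `window` in a browser).
function createPeerConnection(rtcConfig = {}, scope = globalThis) {
    if (typeof scope.RTCPeerConnection !== 'function') {
        return null;
    }
    return new scope.RTCPeerConnection(rtcConfig);
}

const peerConnection = createPeerConnection();
console.log(peerConnection ? 'WebRTC is available' : 'WebRTC is not available');
```

Passing the scope in as a parameter keeps the check easy to exercise outside of a browser.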

WebRTC Streaming Architecture

Video chat is established on two or more client devices using the WebRTC protocol. The connection can be made using one of two modes. The first mode is peer-to-peer, meaning audio and video packets are streamed directly from client to client with UDP. This works as long as both machines have an IP address that is accessible by the public internet.

Relying only on peer-to-peer connections for browser video chat is not wise in production apps. It is common for the Interactive Connectivity Establishment (ICE) framework to fail to establish a direct connection between two users when one or both are behind advanced LAN security.

To mitigate this, you can set your RTCConfiguration to first attempt peer-to-peer, and then fall back to a relayed connection if peer-to-peer fails.

WebRTC Peer to Peer Diagram

If publicly accessible IP addresses are not an option, like on enterprise WiFi networks, a WebRTC connection must be relayed through a TURN server, over UDP or, when UDP is blocked, over TCP. The ICE framework decides whether this is necessary as users try to connect. A TURN server acts as a relay for video and audio data. TURN instances consume bandwidth and machine time, so they are not free like peer-to-peer streaming.

A developer like yourself can make a TURN server using open source solutions and a general web hosting service. You can also use a TURN provider, like Xirsys. Remember that using a third-party TURN provider means that all audio and video data flows through their systems when in transit.
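As a rough sketch of what such a configuration might look like, the helper below assembles an object with one STUN and one TURN server. The host names, username, and credential are placeholders rather than real servers, and buildRtcConfig is a made-up name; the field names (iceServers, urls, username, credential, iceTransportPolicy) come from the browser's RTCConfiguration dictionary.

```javascript
// Hypothetical helper that assembles an RTCConfiguration object.
// Every server URL and credential used below is a placeholder.
function buildRtcConfig({ stunUrl, turnUrl, turnUsername, turnCredential, relayOnly = false }) {
    const iceServers = [];

    if (stunUrl) {
        // STUN helps a client discover its public address for peer-to-peer
        iceServers.push({ urls: [stunUrl] });
    }

    if (turnUrl) {
        // TURN relays media when a direct connection cannot be made
        iceServers.push({
            urls: [turnUrl],
            username: turnUsername,
            credential: turnCredential
        });
    }

    return {
        iceServers,
        // 'all' allows both direct and relayed candidates;
        // 'relay' forces all traffic through the TURN server
        iceTransportPolicy: relayOnly ? 'relay' : 'all'
    };
}

const exampleRtcConfig = buildRtcConfig({
    stunUrl: 'stun:stun.example.org:3478',
    turnUrl: 'turn:turn.example.org:3478',
    turnUsername: 'placeholder-user',
    turnCredential: 'placeholder-secret'
});
console.log(JSON.stringify(exampleRtcConfig, null, 2));
```

An object shaped like this is what would be handed to new RTCPeerConnection(...) in the browser.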

WebRTC Relayed Diagram

Don’t Build a WebRTC Signaling Server – Use PubNub

WebRTC does not specify one very important component of video calling: signaling. A client must use a signaling service to exchange messages with their peer or peers. These messages are for events like:

  • I, User A, would like to call you, User B
  • User A is currently trying to call you, User B
  • I, User B, accept your call User A
  • I, User B, reject your call User A
  • I, User B, would like to end our call User A
  • I, User A, would like to end our call User B
  • Text instant messaging like in Slack, Google Hangouts, Skype, Facebook Messenger, etc.

These messages are part of the Signaling Transaction Flow which is outlined in the Mozilla Developer Network documentation for WebRTC. This diagram illustrates some of the operations that must take place for an audio or video WebRTC call.

 

The WebRTC signaling server is an abstract concept. Many services can become this “signaling server” like WebSockets, Socket.IO or PubNub. If you’re tasked with creating a solution for this, you will end up asking: Should we build or should we buy?
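To make the abstraction concrete, here is a toy, in-process stand-in for a signaling service. It is only a sketch: the message type names and the InMemorySignaling class are invented for illustration, and a real app would deliver these messages over a network service rather than a local Map.

```javascript
// Message types mirroring the call events listed earlier (names are made up)
const SignalType = Object.freeze({
    CALL: 'call',
    ACCEPT: 'accept',
    REJECT: 'reject',
    HANG_UP: 'hang-up'
});

// A stand-in for a real signaling service. It delivers messages between
// users registered in the same process.
class InMemorySignaling {
    constructor() {
        this.handlers = new Map(); // userId -> callback
    }

    // Start receiving messages addressed to userId
    register(userId, onMessage) {
        this.handlers.set(userId, onMessage);
    }

    // Deliver a message; returns false if the recipient is not registered
    send(from, to, type) {
        const handler = this.handlers.get(to);
        if (!handler) {
            return false;
        }
        handler({ from, to, type });
        return true;
    }
}

const signaling = new InMemorySignaling();
const signalLog = [];

// User B records each message and accepts every incoming call
signaling.register('user-b', (msg) => {
    signalLog.push(`${msg.from} -> ${msg.to}: ${msg.type}`);
    if (msg.type === SignalType.CALL) {
        signaling.send('user-b', msg.from, SignalType.ACCEPT);
    }
});

// User A only records what it receives
signaling.register('user-a', (msg) => {
    signalLog.push(`${msg.from} -> ${msg.to}: ${msg.type}`);
});

signaling.send('user-a', 'user-b', SignalType.CALL);
console.log(signalLog);
```

The signaling layer only moves small event messages; audio and video never pass through it.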

PubNub allows a developer like yourself to fully, and cheaply, implement event-driven solutions like a WebRTC signaling service. An Open Source WebRTC library that uses PubNub is available on GitHub. However, the following PubNub solution is even more rapid than building with the WebRTC SDK.

ChatEngine – An Open Source Chat Framework

The market for chat applications is growing rapidly. PubNub is well suited to building chat apps, so it’s advantageous for a developer like yourself to understand how to use it. PubNub is like a global CDN for realtime data. After years of supporting customers building unique chat apps, the engineering team at PubNub has made it even easier.

ChatEngine empowers you to build and scale web, mobile and desktop chat apps for any use case. With an extensive plugin library and programmable serverless functions, ChatEngine gives developers power, speed, and global scalability.

PubNub ChatEngine Sample App in CodePen

Whether you’re a software architect or a beginner programmer, ChatEngine guides you with its opinionated structure. It’s a JavaScript wrapper around PubNub’s many powerful APIs. Think of PubNub as your automated back-end infrastructure, and you build the front-end using the ChatEngine SDK for JavaScript, native iOS or Android.

If you’re building a website support chat like Intercom, or the next Slack with Electron.js, ChatEngine provides all the open source tools to get your app in production in days instead of months. Check out the ChatEngine documentation and the ChatEngine GitHub to start writing code.

Want to include realtime video and audio conferencing with WebRTC in your ChatEngine app? This community supported plugin for ChatEngine will speed up your development process.

Open Source WebRTC Video Chat Example

The open source community has created the WebRTC plugin for ChatEngine. You can now provide your users with a peer-to-peer or relayed WebRTC video chat experience. You can use your own STUN/TURN credentials with the same configuration object shown in the MDN RTCConfiguration documentation (note that Safari only accepts an array of URLs for each server, not a single string).
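One way to stay on the safe side of that Safari note is to normalize the configuration before using it. The helper below is a hedged sketch, with normalizeIceServerUrls being a hypothetical name: it copies the configuration and wraps any string urls value in an array. The server URLs are placeholders.

```javascript
// Hypothetical helper: returns a copy of an RTCConfiguration in which every
// ICE server's `urls` value is an array, even if it was given as a string.
function normalizeIceServerUrls(rtcConfig) {
    const iceServers = (rtcConfig.iceServers || []).map((server) => ({
        ...server,
        urls: Array.isArray(server.urls) ? server.urls : [server.urls]
    }));

    return { ...rtcConfig, iceServers };
}

const normalizedConfig = normalizeIceServerUrls({
    iceServers: [
        { urls: 'stun:stun.example.org:3478' },       // string form
        { urls: ['turn:turn.example.org:3478'],       // already an array
          username: 'placeholder-user',
          credential: 'placeholder-secret' }
    ]
});
console.log(JSON.stringify(normalizedConfig.iceServers));
```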

 

Warning

The WebRTC Plugin referenced in this post is open source and community supported.

Use at your own risk!

 

The plugin uses ChatEngine direct messaging for the WebRTC signaling service. All of the handshakes required by the signaling transaction flow are under the covers of the plugin, so you can focus on your app’s higher level code.

The plugin is available on NPM:

npm install chat-engine-webrtc

 


 

Try the example WebRTC ChatEngine app in this GitHub Repository.

The example app source code is in the example folder.

 


You can now build your own fully functioning WebRTC chat app with ChatEngine by first making an always free PubNub account. This step is required to continue the tutorial.

Once you have globally deployed your auto-scaling ChatEngine back-end with a single button click, create a chat app front-end project on your machine.

In this tutorial, we will use plain old JavaScript, HTML, and CSS. If you want to use a modern front-end framework (like Vue, React, or Angular) to build your chat app, check out the PubNub tutorials page or the examples in the ChatEngine GitHub repository.

WebRTC App Tutorial

You can use the HTML and CSS in my project example. Copy those files into your project folder.

Doing this makes a very generic chat app user interface. The example app has only 1 global chat, and no private 1:1 chats, although they are easy to implement. Make sure that you also copy the PNG images from the example to your project.

Open index.html with your favorite text editor. Replace the script tags beneath the body tag of your HTML file with these 2 CDN scripts. Leave the 3rd script tag that refers to app.js. We will write that file together.

<script src="https://cdn.jsdelivr.net/npm/chat-engine@0.9.18/dist/chat-engine.js"></script>
<script src="https://cdn.jsdelivr.net/npm/chat-engine-webrtc@latest/dist/chat-engine-webrtc.js"></script>

The next step is to create your own app.js file in the same directory as your index.html file. We need a new app.js because the script in my example uses Xirsys, and my private account is wired to my PubNub Functions server. You will need to make your own back-end server and account if you wish to use a TURN provider like Xirsys. My next blog post will contain a tutorial for building WebRTC apps with TURN.

The app.js script we will write together will use only free peer-to-peer WebRTC connections. If you try to do a video call with 2 devices on the same LAN, your app will work. A call between clients on separate networks may fail to connect, because no TURN relay is configured.

app.js

Make a ChatEngine declaration in your app.js file. Use the API keys for your ChatEngine app that you created earlier. They will connect every client to your specific PubNub account.

// Init ChatEngine
const ChatEngine = ChatEngineCore.create({
    publishKey: '__YOUR_PUBLISH_KEY_HERE__',
    subscribeKey: '__YOUR_SUBSCRIBE_KEY_HERE__'
}, {
    globalChannel: 'chat-engine-webrtc-example'
});

// Init the WebRTC plugin and chat interface here
ChatEngine.on('$.ready', (data) => {
    // ...
});

Next, we need to initialize and configure the ChatEngine WebRTC plugin after the ChatEngine connection is ready. To do this, we will add the following code inside the $.ready event callback that we created in the previous step.

let onlineUuids = [];

const onPeerStream = (webRTCTrackEvent) => {
    console.log('Peer a/v stream now available');
    const peerStream = webRTCTrackEvent.streams[0];
    remoteVideo.srcObject = peerStream;
};

const onIncomingCall = (user, callResponseCallback) => {
    console.log('Incoming Call from ', user.state.username);
    incomingCall(user.state.username).then((acceptedCall) => {
        if (acceptedCall) {
            // End an already open call before opening a new one
            ChatEngine.me.webRTC.disconnect();
            videoModal.classList.remove(hide);
            chatInterface.classList.add(hide);
            noVideoTimeout = setTimeout(noVideo, 5000);
        }

        callResponseCallback({ acceptedCall });
    });
};

const onCallResponse = (acceptedCall) => {
    console.log('Call response: ', acceptedCall ? 'accepted' : 'rejected');
    if (acceptedCall) {
        videoModal.classList.remove(hide);
        chatInterface.classList.add(hide);
        noVideoTimeout = setTimeout(noVideo, 5000);
    }
};

const onDisconnect = () => {
    console.log('Call disconnected');
    videoModal.classList.add(hide);
    chatInterface.classList.remove(hide);
    clearTimeout(noVideoTimeout);
};

// add the WebRTC plugin
let config = {
    rtcConfig,
    ignoreNonTurn: false,
    myStream: localStream,
    onPeerStream,
    onIncomingCall,
    onCallResponse,
    onDisconnect
};

const webRTC = ChatEngineCore.plugin['chat-engine-webrtc'];
ChatEngine.me.plugin(webRTC(config));

// Add a user to the online list when they connect
ChatEngine.global.on('$.online.*', (payload) => {
    if (payload.user.name === 'Me') {
        return;
    }

    const userId = payload.user.uuid;
    const name = payload.user.state.username;

    const userListDomNode = createUserListItem(userId, name);

    // Skip users that are already in the online list
    if (onlineUuids.includes(userId)) {
        return;
    }

    onlineUuids.push(userId);

    onlineList.appendChild(userListDomNode);

    userListDomNode.addEventListener('click', (event) => {
        const userId = userListDomNode.id;
        const userToCall = payload.user;

        confirmCall(name).then((yesDoCall) => {
            if (yesDoCall) {
                ChatEngine.me.webRTC.callUser(userToCall, {
                    myStream: localStream
                });
            }
        });
    });
});

// Remove a user from the online list when they disconnect
ChatEngine.global.on('$.offline.*', (payload) => {
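    const index = onlineUuids.findIndex((id) => id === payload.user.uuid);

    // Only remove the UUID if it is in the list; splice(-1, 1) would
    // otherwise remove the last entry
    if (index > -1) {
        onlineUuids.splice(index, 1);
    }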

    const div = document.getElementById(payload.user.uuid);
    if (div) div.remove();
});

// Render up to 20 old messages in the global chat
ChatEngine.global.search({
    reverse: true,
    event: 'message',
    limit: 20
}).on('message', renderMessage);

// Render new messages in realtime
ChatEngine.global.on('message', renderMessage);

The new code that we just added:

  • Declares all of the plugin event handlers for WebRTC call events
  • Initializes the WebRTC plugin for ChatEngine and passes the configuration object to the instance
  • Adds and removes user online list elements as users come on and offline in the app
  • Registers an event handler to make a new video call to a user whenever their name is clicked in the user list UI
  • Retrieves up to 20 latest text messages sent in the ChatEngine app’s global channel
  • Registers an event handler to render new chat messages whenever one is sent to the global chat, in real-time
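The online list bookkeeping described in these bullets can also be expressed with a Set, which makes the duplicate check and the removal explicit. This is an alternative sketch, not the code used in the example app, and OnlineRoster is a made-up name.

```javascript
// Hypothetical helper that tracks which user UUIDs are currently online
class OnlineRoster {
    constructor() {
        this.uuids = new Set();
    }

    // Returns true only the first time a UUID is added, so the caller
    // knows whether to render a new list item
    add(uuid) {
        if (this.uuids.has(uuid)) {
            return false;
        }
        this.uuids.add(uuid);
        return true;
    }

    // Returns true if the UUID was present and has been removed
    remove(uuid) {
        return this.uuids.delete(uuid);
    }
}

const roster = new OnlineRoster();
console.log(roster.add('uuid-1'));    // first time this UUID is seen
console.log(roster.add('uuid-1'));    // duplicate online event
console.log(roster.remove('uuid-2')); // offline event for an unknown user
```

Removing an unknown UUID is a no-op here, so a stray offline event cannot disturb other entries.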

Next, we will need some utility methods to perform UI specific functionality. These do not apply to every ChatEngine app; they exist only to run the specific UI that I designed. Add this code to the bottom of the app.js file.

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// UI Render Functions
// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
function renderMessage(message) {
    const messageDomNode = createMessageHTML(message);

    log.append(messageDomNode);

    // Sort messages in chat log based on their timetoken
    sortNodeChildren(log, 'id');

    chat.scrollTop = chat.scrollHeight;
}

function incomingCall(name) {
    return new Promise((resolve) => {
        acceptCallButton.onclick = function() {
            incomingCallModal.classList.add(hide);
            resolve(true);
        }

        rejectCallButton.onclick = function() {
            incomingCallModal.classList.add(hide);
            resolve(false);
        }

        // textContent keeps a user-supplied name from being parsed as HTML
        callFromSpan.textContent = name;
        incomingCallModal.classList.remove(hide);
    });
}

function confirmCall(name) {
    return new Promise((resolve) => {
        yesCallButton.onclick = function() {
            callConfirmModal.classList.add(hide);
            resolve(true);
        }

        noCallButton.onclick = function() {
            callConfirmModal.classList.add(hide);
            resolve(false);
        }

        // textContent keeps a user-supplied name from being parsed as HTML
        callConfirmUsername.textContent = name;
        callConfirmModal.classList.remove(hide);
    });
}

function getLocalUserName() {
    return new Promise((resolve) => {
        usernameInput.focus();
        usernameInput.value = '';

        usernameInput.addEventListener('keyup', (event) => {
            const nameLength = usernameInput.value.length;

            if (nameLength > 0) {
                joinButton.classList.remove('disabled');
            } else {
                joinButton.classList.add('disabled');
            }

            if (event.key === 'Enter' && nameLength > 0) {
                resolve(usernameInput.value);
            }
        });

        joinButton.addEventListener('click', (event) => {
            const nameLength = usernameInput.value.length;
            if (nameLength > 0) {
                resolve(usernameInput.value);
            }
        });
    });
}

function getLocalStream() {
    return new Promise((resolve, reject) => {
        navigator.mediaDevices
        .getUserMedia({
            audio: true,
            video: true
        })
        .then((avStream) => {
            resolve(avStream);
        })
        .catch((err) => {
            alert('Cannot access local camera or microphone.');
            console.error(err);
            reject(err);
        });
    });
}

function createUserListItem(userId, name) {
    const div = document.createElement('div');
    div.id = userId;

    const img = document.createElement('img');
    img.src = './phone.png';

    const span = document.createElement('span');
    // textContent keeps a user-supplied name from being parsed as HTML
    span.textContent = name;

    div.appendChild(img);
    div.appendChild(span);

    return div;
}

function createMessageHTML(message) {
    const text = message.data.text;
    const user = message.sender.state.username;
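    // The first 13 digits of a timetoken are milliseconds since the epoch
    const jsTime = parseInt(message.timetoken.substring(0, 13), 10);
    const dateString = new Date(jsTime).toLocaleString();

    const div = document.createElement('div');
    const b = document.createElement('b');

    div.id = message.timetoken;
    // Use text nodes so user-supplied names and messages are not parsed as HTML
    b.textContent = `${user} (${dateString}): `;

    div.appendChild(b);
    div.appendChild(document.createTextNode(text));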

    return div;
}


// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// Utility Functions
// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
function sendMessage() {
    const messageToSend = messageInput.value.replace(/\r?\n|\r/g, '');
    const trimmed = messageToSend.replace(/(\s)/g, '');

    if (trimmed.length > 0) {
        ChatEngine.global.emit('message', {
            text: messageToSend
        });
    }

    messageInput.value = '';
}

// Makes a new, version 4, universally unique identifier (UUID). Written by
//     Stack Overflow user broofa
//     (https://stackoverflow.com/users/109538/broofa) in this post
//     (https://stackoverflow.com/a/2117523/6193736).
function newUuid() {
    return ([1e7]+-1e3+-4e3+-8e3+-1e11).replace(
        /[018]/g,
        (c) => (c ^ crypto.getRandomValues(new Uint8Array(1))[0] & 15 >> c / 4)
            .toString(16)
    );
}

// Sorts sibling HTML elements based on an attribute value
function sortNodeChildren(parent, attribute) {
    const length = parent.children.length;
    for (let i = 0; i < length-1; i++) {
        if (parent.children[i+1][attribute] < parent.children[i][attribute]) {
            parent.children[i+1].parentNode
                .insertBefore(parent.children[i+1], parent.children[i]);
            i = -1;
        }
    }
}

function noVideo() {
    const message = 'No peer connection made.\n' +
        'Try adding a TURN server to the WebRTC configuration.';

    if (remoteVideo.paused) {
        alert(message);
    }
}
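A note on the timestamp handling in createMessageHTML: a PubNub timetoken is a 17-digit count of 100-nanosecond intervals since the Unix epoch, so its first 13 digits are milliseconds, which is what the JavaScript Date constructor expects. A standalone sketch of that conversion follows; timetokenToDate is a hypothetical name and the sample timetoken is made up.

```javascript
// Hypothetical helper: converts a 17-digit PubNub timetoken (given as a
// string) into a JavaScript Date by keeping its millisecond part.
function timetokenToDate(timetoken) {
    const milliseconds = parseInt(String(timetoken).substring(0, 13), 10);
    return new Date(milliseconds);
}

const exampleDate = timetokenToDate('15000000000000000');
console.log(exampleDate.toISOString());
```

Because timetokens have a fixed length, comparing them as strings also orders messages chronologically, which is what sortNodeChildren relies on.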

Next, we will add all of the client initialization and event handlers. This code requests camera and microphone access from the user to collect their A/V stream, which makes their device ready for WebRTC calls. It also registers handlers for events like text message submission. Add this code before the $.ready event registration in app.js.

const hide = 'hide';
const uuid = newUuid();

// An RTCConfiguration dictionary from the browser WebRTC API
// Add STUN and TURN server information here for WebRTC calling
const rtcConfig = {};

let username; // local user name
let localStream; // Local audio and video stream
let noVideoTimeout; // Used for checking if video connection succeeded

// Init the audio and video stream on this client
getLocalStream().then((myStream) => {
    localStream = myStream;
    myVideoSample.srcObject = localStream;
    myVideo.srcObject = localStream;
}).catch(() => {
    myVideo.classList.add(hide);
    myVideoSample.classList.add(hide);
    brokenMyVideo.classList.remove(hide);
    brokenSampleVideo.classList.remove(hide);
});

// Prompt user for a username input
getLocalUserName().then((myUsername) => {
    username = myUsername;
    usernameModal.classList.add(hide);

    // Connect ChatEngine after a username and UUID have been made
    ChatEngine.connect(uuid, {
        username
    }, 'auth-key');
});

// Send a message when Enter key is pressed
messageInput.addEventListener('keydown', (event) => {
    if (event.key === 'Enter' && !event.shiftKey) {
        event.preventDefault();
        sendMessage();
        return;
    }
});

// Send a message when the submit button is clicked
submit.addEventListener('click', sendMessage);

// Register a disconnect event handler when the close video button is clicked
closeVideoButton.addEventListener('click', (event) => {
    videoModal.classList.add(hide);
    chatInterface.classList.remove(hide);
    clearTimeout(noVideoTimeout);
    ChatEngine.me.webRTC.disconnect();
});

// Disconnect ChatEngine before a user navigates away from the page
window.onbeforeunload = (event) => {
    ChatEngine.disconnect();
};
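When getUserMedia fails, the error it rejects with carries a name that hints at the cause. If you want something more specific than a single generic alert, a small lookup like the following could be used. This is only a sketch: describeMediaError and the message wording are invented here, while the error names are the ones browsers report for getUserMedia failures.

```javascript
// Hypothetical helper: maps a getUserMedia error to a user-facing message
function describeMediaError(err) {
    switch (err && err.name) {
        case 'NotAllowedError':
            return 'Permission to use the camera or microphone was denied.';
        case 'NotFoundError':
            return 'No camera or microphone was found on this device.';
        case 'NotReadableError':
            return 'The camera or microphone could not be read. ' +
                'Another app may be using it.';
        default:
            return 'Cannot access local camera or microphone.';
    }
}

console.log(describeMediaError({ name: 'NotAllowedError' }));
console.log(describeMediaError(undefined));
```

The default branch keeps the original generic message for any error that is not recognized.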

Lastly, we need to add the variables that cache every DOM element that app.js interacts with. These are elements like the chat log, online list elements, the video objects to display streams, and more. Add the following code to the top of the app.js file.

const chatInterface = document.getElementById('chat-interface');
const myVideoSample = document.getElementById('my-video-sample');
const myVideo = document.getElementById('my-video');
const remoteVideo = document.getElementById('remote-video');
const videoModal = document.getElementById('video-modal');
const closeVideoButton = document.getElementById('close-video');

const brokenMyVideo = document.getElementById('broken-my-video');
const brokenSampleVideo = document.getElementById('broken-sample-video');

const usernameModal = document.getElementById('username-input-modal');
const usernameInput = document.getElementById('username-input');
const joinButton = document.getElementById('join-button');

const callConfirmModal = document.getElementById('call-confirm-modal');
const callConfirmUsername = document.getElementById('call-confirm-username');
const yesCallButton = document.getElementById('yes-call');
const noCallButton = document.getElementById('no-call');

const incomingCallModal = document.getElementById('incoming-call-modal');
const callFromSpan = document.getElementById('call-from');
const acceptCallButton = document.getElementById('accept-call');
const rejectCallButton = document.getElementById('reject-call');

const onlineList = document.getElementById('online-list');
const chat = document.getElementById('chat');
const log = document.getElementById('log');
const messageInput = document.getElementById('message-input');
const submit = document.getElementById('submit');

Done! Now you can deploy your static front-end web files on a web hosting platform like WordPress or GitHub Pages. Browsers only grant camera and microphone access to pages served over HTTPS (or from localhost), so pick a host that supports HTTPS. Your WebRTC chat app will be available for use by anyone in the world. The code is mobile compatible, meaning the latest web browsers on iOS and Android will be able to run the app for face to face video!

Frequently Asked Questions (FAQ) about the WebRTC Plugin

Is the plugin officially a part of ChatEngine?

No. It is an open source project that is community supported. If you have questions or need help, reach out to devrel@pubnub.com. If you want to report a bug, do so on the GitHub Issues page.

Does ChatEngine stream audio or video data?

No. ChatEngine pairs very well with WebRTC as a signaling service. This means that PubNub signals events from client to client using the ChatEngine #direct events. These events include:

  • I, User A, would like to call you, User B
  • User A is currently trying to call you, User B
  • I, User B, accept your call User A
  • I, User B, reject your call User A
  • I, User B, would like to end our call User A
  • I, User A, would like to end our call User B
  • Text instant messaging like in Slack, Google Hangouts, Skype, Facebook Messenger, etc.

Can I make a group call with more than 2 participants?

Group calling is possible to develop with WebRTC and ChatEngine. However, the current ChatEngine WebRTC plugin can connect only 2 users in a private call. The community may develop this feature in the future, but there are no plans for development to date.

I found a bug in the plugin. Where do I report it?

The ChatEngine WebRTC plugin is an open source, community supported project. This means that the best place to report bugs is on the GitHub Issues page for the code repository. The community will tackle the bug fix at will, so there is no guarantee that a fix will be made. If you wish to provide a code fix, fork the GitHub repository to your GitHub account, push fixes, and make a pull request (process documented on GitHub).

For more ChatEngine examples and plugins, check out the ChatEngine GitHub. If you like this plugin, need some help, or want to build something similar, reach out to devrel@pubnub.com. We want to hear your feedback!
