What is WebRTC?

Darryn Campbell on Mar 6, 2024

WebRTC Definition

WebRTC, which stands for Web Real-Time Communication, is an open-source project that provides real-time, peer-to-peer (P2P) communication for data, audio, and video between two endpoints, such as web browsers or native apps, through JavaScript APIs.

Since its inception, the technology has improved significantly, with better network traversal, more capable data streaming, and more efficient bandwidth use. Thanks to these capabilities, WebRTC has been widely adopted in sectors such as e-learning, telehealth, and real-time collaboration tools, solidifying its standing in the market.

Typical WebRTC Data Transfer Performance

Protocol Latency: Typically around 100-300 milliseconds, depending on network conditions.
Data Transfer Speed: Can reach 10-50 Mbps, depending on the peers' network bandwidth and connection quality.
File Size: Generally effective for files up to 1-2 GB, with performance varying with network stability.
Encryption: All traffic between peers is encrypted (DTLS for data channels, SRTP for media), so file transfers are protected in transit.
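
To put the speed and file-size figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits per second) and a perfectly sustained throughput, which real networks rarely deliver; the function name is illustrative only.

```typescript
// Estimate how long a file takes to send at a sustained throughput.
// Assumes decimal units: 1 GB = 1e9 bytes, 1 Mbps = 1e6 bits per second.
function transferSeconds(fileBytes: number, throughputMbps: number): number {
  if (throughputMbps <= 0) {
    throw new Error("throughput must be positive");
  }
  return (fileBytes * 8) / (throughputMbps * 1e6);
}

const ONE_GB = 1e9;
const atTenMbps = transferSeconds(ONE_GB, 10);   // 800 seconds, a little over 13 minutes
const atFiftyMbps = transferSeconds(ONE_GB, 50); // 160 seconds, under 3 minutes
```

Protocol overhead, congestion, and retransmissions are ignored here, so treat the results as a lower bound on transfer time.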

History of WebRTC

WebRTC was first released as an open-source project by Google in 2011. Mozilla and Opera adopted it early, and Microsoft and Apple followed later. Its support across mobile and desktop browsers (such as Chrome, Safari, Firefox, and Edge) enables developers to integrate communication features directly into applications without plugins. Adoption has surged thanks to better internet connectivity and growing demand for real-time communication apps.

Common WebRTC Tech Use Cases

WebRTC serves numerous use cases, from real-time audio and video calling and chat to peer-to-peer file sharing. Advancements in technology have further expanded the application of WebRTC to health care, live streaming platforms, and video conferencing:

In-browser / in-app customer support chat, now enhanced with features like screen sharing and real-time annotations.
Telemedicine applications that combine high-definition doctor-patient video and audio calls, device telemetry (for example, carried over MQTT), and secure data channels for sharing health records.
Peer-to-peer file sharing, now capable of handling larger data volumes at faster transfer speeds.
Real-time gaming and live streaming platforms that rely on WebRTC's low-latency streaming capabilities.
Collaborative tools featuring live video and audio conferencing, document editing, and more.

How Does WebRTC Work?

WebRTC connects two browsers through an RTCPeerConnection. Before media or data can flow, the peers exchange session descriptions in Session Description Protocol (SDP) format over a signaling channel; WebRTC does not mandate a particular signaling protocol, so the application chooses its own. To handle different network scenarios, WebRTC also uses STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT). STUN lets a peer discover its public IP address so that a direct connection can be established even through NATs or firewalls, and TURN servers relay the data when a direct connection is not possible.
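
As a minimal sketch of how STUN and TURN servers are typically handed to a peer connection, the snippet below builds a configuration object. The interfaces are declared locally (mirroring the shape of the browser's RTCIceServer and RTCConfiguration dictionaries) so the example type-checks outside a browser, and the server hostnames and credentials are placeholders, not real services.

```typescript
// Local shapes that mirror the browser's RTCIceServer / RTCConfiguration
// dictionaries, declared here so the sketch is self-contained.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical hosts: replace with STUN/TURN servers you actually operate.
const STUN_URL = "stun:stun.example.org:3478";
const TURN_URL = "turn:turn.example.org:3478";

function buildPeerConfig(
  turnUser: string,
  turnPass: string,
  forceRelay: boolean = false
): PeerConfig {
  return {
    iceServers: [
      { urls: STUN_URL }, // used to discover the public address
      { urls: TURN_URL, username: turnUser, credential: turnPass }, // relay fallback
    ],
    // "relay" restricts ICE to TURN candidates; "all" also allows direct paths.
    iceTransportPolicy: forceRelay ? "relay" : "all",
  };
}

const peerConfig = buildPeerConfig("demo-user", "demo-pass");
// In a browser: const pc = new RTCPeerConnection(peerConfig);
```

Forcing the relay policy can be handy for testing that the TURN fallback works, at the cost of extra latency.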

Using WebRTC through RTCPeerConnection

WebRTC provides the RTCPeerConnection interface for setting up a peer-to-peer connection. RTCPeerConnection establishes low-latency, high-quality voice and video calls by handling signal processing, codec negotiation, peer-to-peer communication, security, and bandwidth management. Its getStats() method returns statistics about the connection, helping developers monitor and optimize call performance.
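
The following sketch shows one way the output of getStats() might be condensed into a few numbers. It assumes the stats entries have already been copied into a plain array; the StatEntry interface lists only the handful of fields used here (from "inbound-rtp" and "candidate-pair" entries), and the summary shape is our own invention rather than part of the API.

```typescript
// Subset of the fields found on entries of the stats report.
interface StatEntry {
  type: string;
  packetsLost?: number;
  packetsReceived?: number;
  nominated?: boolean;
  currentRoundTripTime?: number; // reported in seconds
}

// Illustrative summary shape (not part of WebRTC itself).
interface CallSummary {
  packetsLost: number;
  packetsReceived: number;
  lossRatio: number;
  roundTripTimeMs: number | null;
}

function summarizeStats(entries: StatEntry[]): CallSummary {
  let packetsLost = 0;
  let packetsReceived = 0;
  let roundTripTimeMs: number | null = null;

  for (const entry of entries) {
    if (entry.type === "inbound-rtp") {
      // Sum across all incoming audio and video streams.
      packetsLost += entry.packetsLost || 0;
      packetsReceived += entry.packetsReceived || 0;
    } else if (
      entry.type === "candidate-pair" &&
      entry.nominated === true &&
      typeof entry.currentRoundTripTime === "number"
    ) {
      roundTripTimeMs = Math.round(entry.currentRoundTripTime * 1000);
    }
  }

  const total = packetsLost + packetsReceived;
  const lossRatio = total === 0 ? 0 : packetsLost / total;
  return { packetsLost, packetsReceived, lossRatio, roundTripTimeMs };
}

// In a browser, the entries would come from the report returned by
// pc.getStats(), for example by pushing each one into an array with
// report.forEach(...).
```

Polling this every few seconds and watching the loss ratio and round-trip time is a simple starting point for call-quality monitoring.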

How to Use WebRTC Through RTCDataChannel & MediaStream

WebRTC allows data exchange through RTCDataChannel, which is created from an RTCPeerConnection. The RTCDataChannel API allows bidirectional communication of arbitrary data between peers. It uses the Stream Control Transmission Protocol (SCTP) secured with DTLS, and can be configured for reliable, ordered delivery (useful for file transfer) or for unordered, partially reliable delivery (useful for real-time gaming). The Media Capture and Streams API, best known for its getUserMedia() method, provides access to local cameras and microphones with user consent and returns a MediaStream. Constraints give granular control over the media streams, such as video resolution, frame rate, and audio settings.
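
Large files are usually split into small messages before being sent over a data channel. The sketch below shows one possible chunking scheme; the 16 KiB chunk size is a conservative assumption rather than a limit defined by WebRTC, and the helper names are illustrative.

```typescript
// Conservative per-message size (an assumption; tune for your targets).
const CHUNK_SIZE = 16 * 1024;

// Split a byte buffer into fixed-size views without copying the data.
function splitIntoChunks(
  data: Uint8Array,
  chunkSize: number = CHUNK_SIZE
): Uint8Array[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Receiving side: concatenate the chunks back into one buffer.
function reassemble(chunks: Uint8Array[]): Uint8Array {
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// In a browser, each chunk would be passed to channel.send(chunk) on an
// open RTCDataChannel, pausing while channel.bufferedAmount is high.
```

This relies on the channel being reliable and ordered (the default), so the receiver can simply append chunks as they arrive.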

How to Use PubNub and WebRTC to Build Applications

First, you need some way to transfer the SDP from browser A to browser B. The PubNub network can be used as the signaling server for these apps. Features such as Presence and Storage/Playback can also be used to enhance them.

When calling another user through a WebRTC app, the caller needs to know whether the callee is currently online, what device they’re using, and whether they’re available to accept the call.

Presence gives you all that information and lets you show users who they can connect with. This is essential for signaling, because it prevents users from trying to connect to others who aren’t available.
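
The signaling flow described here can be sketched independently of any particular messaging service. In the example below, SignalingTransport is a hypothetical interface standing in for a publish/subscribe network such as PubNub, InMemoryTransport is a toy implementation for local testing, and the list of online users stands in for whatever a presence feature reports. None of these names come from the PubNub SDK.

```typescript
// Messages exchanged during call setup.
type SignalPayload =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string };

interface SignalMessage {
  from: string;
  to: string;
  payload: SignalPayload;
}

type SignalHandler = (message: SignalMessage) => void;

// Hypothetical abstraction over a pub/sub service used for signaling.
interface SignalingTransport {
  publish(channel: string, message: SignalMessage): void;
  subscribe(channel: string, handler: SignalHandler): void;
}

// Toy in-process transport, for illustration and tests only.
class InMemoryTransport implements SignalingTransport {
  private handlers: { [channel: string]: SignalHandler[] } = {};

  publish(channel: string, message: SignalMessage): void {
    (this.handlers[channel] || []).forEach((handler) => handler(message));
  }

  subscribe(channel: string, handler: SignalHandler): void {
    if (!this.handlers[channel]) {
      this.handlers[channel] = [];
    }
    this.handlers[channel].push(handler);
  }
}

// Assumed naming scheme: each user listens on a personal inbox channel.
const inboxFor = (userId: string): string => "inbox-" + userId;

// Send an SDP offer only if presence says the callee is online.
function sendOffer(
  transport: SignalingTransport,
  onlineUsers: string[],
  from: string,
  to: string,
  sdp: string
): boolean {
  if (onlineUsers.indexOf(to) === -1) {
    return false; // callee is not available, so do not attempt the call
  }
  transport.publish(inboxFor(to), { from, to, payload: { kind: "offer", sdp } });
  return true;
}
```

In a real app, the callee's handler would pass the received SDP to its RTCPeerConnection and publish an answer back to the caller's inbox in the same way.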

A Note on the WebRTC Protocol and Texting

The WebRTC protocol does not provide storage capabilities, so there is no record of the messages that have been sent. With text chat in particular, users expect a history of previous conversations. PubNub’s Storage/Playback feature allows users to see a history of past conversations over a desired period of time.

In short, PubNub provides:

SDP Signaling
Presence status
Chat message storage

WebRTC vs. WebSockets

WebSockets provide a TCP-based, full-duplex communication protocol between client and server. WebRTC, by contrast, supports peer-to-peer exchanges, primarily over UDP, which suits real-time video and audio streams. The two are often complementary rather than competing: many developers use WebSockets to carry signaling messages and WebRTC to carry the media itself.

How to Use Audio & Video with WebRTC

With WebRTC, video and voice communication can be added to any website. This brings richer interaction to a site, allowing users to communicate in context and in real time, either with site operators or with each other. Without it, users have to call a number, download a plugin, or leave your website. For example, a financial institution could embed a WebRTC communication app in its website so that users can quickly speak to a financial representative (rather than being forced to phone in and talk endlessly with an automated representative).

Another example of WebRTC with voice and video is a Skype-like video chat application that runs entirely in the web browser. End users don’t have to install any software or plugins, and can easily connect to one another through video, audio, and text chat, browser to browser. We built WebRTC.co, a JavaScript video chat application that runs entirely on PubNub (see the GitHub WebRTC repo).

Technologies that support WebRTC

WebRTC is based on a combination of several technologies and standards that work together to facilitate real-time communication over the web. Here are the key technologies that underpin WebRTC:

1. Real-Time Protocols (RTP/RTCP)

RTP (Real-time Transport Protocol): WebRTC uses RTP to transmit audio and video streams. It is responsible for delivering media with low latency.
RTCP (RTP Control Protocol): RTCP works alongside RTP to provide feedback on the quality of the media distribution, such as packet loss and jitter, which helps manage the quality of the stream.

2. Session Description Protocol (SDP)

SDP is used to describe multimedia communication sessions. In WebRTC, SDP is used during the signaling process to exchange information about media capabilities (codecs, formats, network information) between peers.
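
To make the codec information in an SDP body more concrete, here is a small sketch that extracts the codecs advertised in a=rtpmap lines. The sample SDP is a shortened, hand-written fragment for illustration (a real offer contains many more attributes), and the listCodecs helper is not part of any WebRTC API.

```typescript
interface CodecLine {
  media: string;       // "audio" or "video", taken from the enclosing m= line
  payloadType: number; // RTP payload type number
  codec: string;       // encoding name, e.g. "opus"
  clockRate: number;   // in Hz
}

function listCodecs(sdp: string): CodecLine[] {
  const result: CodecLine[] = [];
  let currentMedia = "";
  for (const line of sdp.split(/\r?\n/)) {
    const media = /^m=(\w+) /.exec(line);
    if (media) {
      currentMedia = media[1]; // start of a new media section
      continue;
    }
    const rtpmap = /^a=rtpmap:(\d+) ([^\/]+)\/(\d+)/.exec(line);
    if (rtpmap) {
      result.push({
        media: currentMedia,
        payloadType: Number(rtpmap[1]),
        codec: rtpmap[2],
        clockRate: Number(rtpmap[3]),
      });
    }
  }
  return result;
}

// Shortened, hand-written sample for illustration only.
const sampleSdp = [
  "v=0",
  "o=- 46117317 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

const offeredCodecs = listCodecs(sampleSdp);
// offeredCodecs now holds opus and PCMU for audio, and VP8 for video.
```

Applications rarely need to parse SDP by hand, but reading it this way is useful when debugging why two peers failed to agree on a codec.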

3. Interactive Connectivity Establishment (ICE)

ICE is a framework used by WebRTC to find the best path to connect peers, even when they are behind NATs (Network Address Translators) or firewalls. It gathers and tests multiple network candidates (e.g., local host addresses, public addresses discovered through STUN, and relay addresses allocated by TURN) to establish the most reliable connection.
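
ICE ranks candidates by a numeric priority. The sketch below computes that priority using the formula and recommended type preferences from the ICE specification (RFC 8445) and reads the candidate type out of a candidate line. The sample candidate uses documentation-range IP addresses, and the helper names are our own.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Type preferences recommended by RFC 8445 (higher is preferred):
// direct host addresses first, relayed addresses last.
const TYPE_PREFERENCE: Record<CandidateType, number> = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(
  type: CandidateType,
  localPreference: number,
  componentId: number
): number {
  return (
    Math.pow(2, 24) * TYPE_PREFERENCE[type] +
    Math.pow(2, 8) * localPreference +
    (256 - componentId)
  );
}

// Extract the candidate type from the "typ" field of a candidate line.
function parseCandidateType(candidateLine: string): CandidateType | null {
  const match = / typ (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : null;
}

// Illustrative server-reflexive candidate (addresses are placeholders).
const sampleCandidate =
  "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 192.168.1.10 rport 46154";

const sampleType = parseCandidateType(sampleCandidate);    // "srflx"
const hostPriority = candidatePriority("host", 65535, 1);  // 2130706431
const srflxPriority = candidatePriority("srflx", 30, 1);   // 1677729535
```

Because the type preference dominates the formula, a direct host or STUN-discovered path is tried ahead of a TURN relay whenever one is available.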

4. STUN (Session Traversal Utilities for NAT)

STUN is a protocol used to discover the public IP address and port number of a device located behind a NAT. This information helps peers establish a direct connection in peer-to-peer communication.

5. TURN (Traversal Using Relays around NAT)

TURN is used as a relay service when a direct peer-to-peer connection is not possible due to strict NATs or firewalls. It relays the media streams between peers through an intermediate server.

6. DTLS (Datagram Transport Layer Security)

DTLS secures data that is transported over UDP (User Datagram Protocol). In WebRTC, DTLS encrypts data channels directly and negotiates the keys that SRTP uses to encrypt audio and video, ensuring privacy and security.

7. SRTP (Secure Real-Time Transport Protocol)

SRTP is an extension of RTP that provides encryption, message authentication, and integrity, as well as replay protection for RTP data. This ensures that the media streams are secure during transmission.

8. JavaScript APIs

WebRTC provides a set of JavaScript APIs that developers can use to implement real-time communication in web applications. These APIs include:

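getUserMedia(): Captures audio and video from the user's device (called as navigator.mediaDevices.getUserMedia()).
RTCPeerConnection: Manages the connection between peers, handles ICE candidates, and sets up media streams.
RTCDataChannel: Allows for peer-to-peer data transfer; channels are created with RTCPeerConnection.createDataChannel().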
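
As an illustration of the first of these APIs, the sketch below builds the constraints object that is typically passed to getUserMedia(). The interfaces are declared locally to mirror the relevant parts of the browser's constraint dictionaries, and the chosen resolution, frame rate, and audio settings are example values, not recommendations.

```typescript
interface VideoOptions {
  width: number;
  height: number;
  frameRate: number;
}

// Local shapes mirroring part of the browser's media constraint dictionaries.
interface VideoConstraints {
  width: { ideal: number };
  height: { ideal: number };
  frameRate: { ideal: number; max: number };
}

interface CaptureConstraints {
  audio: { echoCancellation: boolean; noiseSuppression: boolean };
  video: false | VideoConstraints;
}

// Pass null for an audio-only call.
function buildConstraints(video: VideoOptions | null): CaptureConstraints {
  return {
    audio: { echoCancellation: true, noiseSuppression: true },
    video:
      video === null
        ? false
        : {
            // "ideal" values are hints; the browser picks the closest match.
            width: { ideal: video.width },
            height: { ideal: video.height },
            frameRate: { ideal: video.frameRate, max: video.frameRate },
          },
  };
}

const hdConstraints = buildConstraints({ width: 1280, height: 720, frameRate: 30 });
// In a browser:
//   const stream = await navigator.mediaDevices.getUserMedia(hdConstraints);
//   stream.getTracks().forEach((track) => pc.addTrack(track, stream));
```

Using "ideal" rather than "exact" values keeps the request from failing on devices whose cameras cannot deliver the requested resolution.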

9. Codec Support

WebRTC supports various audio and video codecs to compress and decompress media streams. Commonly supported codecs include:

WebRTC Audio Codecs:

Opus: Opus is the default audio codec used in WebRTC. It is highly flexible and provides a wide range of bitrates while maintaining good audio quality. Opus supports voice and music and is optimized for low latency, making it ideal for real-time communication.
G.711: This codec is commonly used in traditional telephony systems and is supported by WebRTC for compatibility with legacy systems. G.711 is a narrowband codec with a fixed 64 kbps bitrate, so it uses more bandwidth than Opus for lower audio quality.
G.722: G.722 is another audio codec supported by many WebRTC implementations. It is a wideband codec that offers improved audio quality compared to G.711 and is typically used in high-definition voice applications.
PCMU and PCMA: These are the μ-law and A-law variants of G.711. WebRTC supports them for interoperability, but Opus is the better choice wherever audio quality matters.

WebRTC Video Codecs:

VP8: VP8 is a royalty-free video codec that WebRTC browsers are required to support, and it is commonly the default. It offers good video quality and compression efficiency, and the encoder's bitrate can be adjusted dynamically based on network conditions.
VP9: VP9 is an advanced video codec that provides improved compression efficiency compared to VP8. It offers better video quality at lower bitrates, making it well suited to high-definition video streaming.
H.264: H.264 is a widely used video codec that WebRTC browsers are also required to support, which helps compatibility with existing hardware and legacy systems. It offers good video quality and compression efficiency but is a licensed codec and may require royalty payments.
H.265: H.265, also known as HEVC (High-Efficiency Video Coding), is an advanced video codec that further improves compression efficiency compared to H.264. It offers better video quality at lower bitrates, but like H.264 it is a licensed codec, and browser support for it in WebRTC is still limited.
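
When an application wants to favor one of these codecs, a common approach is to reorder the browser's codec list before negotiation. The sketch below shows only the reordering step; the CodecCapability interface mirrors a subset of the fields the browser reports, the sample list is made up, and preferCodec is an illustrative helper rather than a WebRTC API.

```typescript
// Subset of the fields reported for each codec capability.
interface CodecCapability {
  mimeType: string; // e.g. "video/VP8"
  clockRate: number;
  sdpFmtpLine?: string;
}

// Move every entry matching the preferred MIME type to the front,
// keeping the relative order of everything else.
function preferCodec(
  codecs: CodecCapability[],
  preferredMimeType: string
): CodecCapability[] {
  const wanted = preferredMimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const others = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return preferred.concat(others);
}

// Made-up capability list for illustration.
const availableCodecs: CodecCapability[] = [
  { mimeType: "video/VP8", clockRate: 90000 },
  { mimeType: "video/VP9", clockRate: 90000 },
  { mimeType: "video/H264", clockRate: 90000 },
];

const reorderedCodecs = preferCodec(availableCodecs, "video/H264");
// In a browser, the list would come from
// RTCRtpReceiver.getCapabilities("video") and the result would be passed
// to transceiver.setCodecPreferences(...) before creating the offer.
```

The remote peer must still support the preferred codec; otherwise negotiation falls through to the next entry both sides share.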

What Are the Main WebRTC Alternatives?

SIP (Session Initiation Protocol) is a signaling protocol for initiating, modifying, and terminating real-time sessions involving video, voice, messaging, and other communications applications and services. It is widely used in VoIP (Voice over Internet Protocol) systems.
XMPP (Extensible Messaging and Presence Protocol) is an open-standard communication protocol for instant messaging, presence information, and contact list management. It can be extended to support real-time voice and video communication as well.
MQTT (Message Queuing Telemetry Transport) is a publish-subscribe messaging protocol commonly used in IoT applications. It is designed to be lightweight and efficient, making it suitable for devices with limited processing power and bandwidth. While WebRTC and MQTT are not directly related, they can be used together in certain scenarios. For example, a real-time chat application that needs both audio/video communication and messaging could use WebRTC for audio/video streaming and MQTT for messaging.
WebSockets are often used alongside WebRTC, but they can also serve as an alternative for real-time communication. WebSocket is a communication protocol that provides full-duplex communication channels over a single TCP connection, allowing real-time, bi-directional communication between clients and servers.
Jingle is an extension to the XMPP protocol that adds multimedia session initiation capabilities, including audio, video, and file transfer. It is commonly used in voice and video chat applications.

What is the Future of WebRTC?

As a video and data service, WebRTC is here to stay. With buy-in from the major browser vendors and a commitment to keeping the mobile and desktop APIs up to date, more and more apps will adopt it to keep consumers on their platforms, with tools that support video chat for services such as dating, gaming, and healthcare.