KEY TAKEAWAYS
In this guide, we discuss WebSockets and other types of client-server connections. Learn how they work and the best cases for their application.
In today’s extremely connected and constantly online world, we expect to have information instantly. Think of all the applications that we use to send messages or to receive live, up-to-date notifications in a single day. WebSockets are a useful tool for building web applications that need to provide these instant, real-time updates and communication.
The WebSocket Protocol establishes full-duplex, bidirectional communication between a client and server. This two-way flow over a single persistent connection distinguishes WebSockets from the HTTP request/response model, and it means they can transfer data very quickly and efficiently. While there are many great uses for WebSockets, there are also environments where a different approach, like long polling, works better. This is one reason that PubNub is protocol agnostic, which means that we choose the best solution available for the issue at hand and do not favor a specific API or framework.
In this guide, we will explain what WebSockets are, and detail some of the benefits of using them for your real-time application. We will go over the best use cases to implement WebSockets, and discuss other options that you may want to use instead. By the end of this piece, you will have a clearer understanding of whether or not WebSockets will work for your application’s specific needs.
A bit of background - web servers, HTTP, and polling
To understand the WebSocket API, it is also important to understand the foundation it was built on – HTTP (Hypertext Transfer Protocol) and its request/response model. HTTP is an application layer protocol, and it is the basis for all web-based communication and data transfers.
When using HTTP, clients (such as web browsers) send requests to servers, and the servers send messages back, known as responses. The web as we know it today was built on this basic client-server cycle, although there have been many additions and updates to HTTP to make it more interactive. Several versions of HTTP are in use today, including HTTP/1.1 and HTTP/2, and HTTPS is the secure variant, which carries HTTP over an encrypted TLS connection.
Basic HTTP requests work well for many use cases, such as when someone needs to search on a web page and receive back relevant, non-time-sensitive information on the subject. However, it is not always best suited for web applications that require real-time communication, or for data that needs to update quickly with minimal latency.
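In early versions of HTTP, the default behavior was to open a new connection for every request. HTTP/1.1 can reuse a connection across requests, but every request and response still carries its full set of headers. This is inefficient for frequent small updates because it spends bandwidth on recurring non-payload data and increases latency between data transfers.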
Additionally, HTTP requests can only flow in one direction—from the client side. There is traditionally no mechanism for the server to initiate communication with the client. The server is unable to send data to the client unless the client requests it first. This can create issues for use cases where messaging needs to go out in real time from the server side.
Short polling vs. long polling
One of the first solutions for receiving regular data updates was HTTP polling. Polling is a technique where the client repeatedly sends requests to the server until it responds with an update. As an example—all modern web browsers offer support for XMLHttpRequest, one of the original methods of polling servers.
These earlier solutions were still not ideal for efficient real-time communication. Short polling is resource intensive because every request re-sends non-payload data that must be parsed again, including the HTTP headers, the URL, and other repetitive information.
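To make the cost of short polling concrete, here is a minimal sketch in Python. It does not talk to a real server: the `fetch` callable and the canned `responses` are hypothetical stand-ins for one full HTTP request/response cycle, so the only thing the example shows is how many requests are spent before an update arrives.

```python
import time
from typing import Callable, Optional, Tuple


def short_poll(fetch: Callable[[], Optional[str]],
               interval_s: float = 1.0,
               max_attempts: int = 10,
               sleep: Callable[[float], None] = time.sleep
               ) -> Tuple[Optional[str], int]:
    """Call fetch() repeatedly until it returns an update or attempts run out.

    Returns (update, number_of_requests_made).
    """
    for attempt in range(1, max_attempts + 1):
        update = fetch()        # stands in for one request/response cycle
        if update is not None:
            return update, attempt
        sleep(interval_s)       # nothing new yet, wait and ask again
    return None, max_attempts


# Hypothetical server that has nothing new for the first three requests.
responses = iter([None, None, None, "score: 2-1"])
update, requests_made = short_poll(lambda: next(responses), interval_s=0.0)
print(update, requests_made)    # score: 2-1 4
```

Three of the four requests in this run carried no useful data, which is the waste described above.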
The next logical step to improve latency is HTTP long polling. When long polling, the client polls the server, and that connection remains open until the server has new data. The server sends the response with the relevant information, and then the client immediately opens another request, holding again until the next update. A long-poll connection can be held open for a maximum of 280 seconds, after which the client automatically sends another request. This method effectively emulates an HTTP server push.
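The long-polling cycle can be sketched in a single Python process. This is an illustration under stated assumptions, not a production server: an in-memory `queue.Queue` stands in for the held-open HTTP connection, and the names `handle_long_poll`, `long_poll_client`, and `publish_later` are hypothetical.

```python
import queue
import threading
from typing import List, Optional

# Hypothetical stand-in for data waiting to be delivered to one client.
updates: "queue.Queue[str]" = queue.Queue()


def handle_long_poll(timeout_s: float) -> Optional[str]:
    """Server side: hold the request until data arrives or the timeout expires."""
    try:
        return updates.get(timeout=timeout_s)
    except queue.Empty:
        return None             # empty response, the client simply asks again


def long_poll_client(expected: int, timeout_s: float = 0.5) -> List[str]:
    """Client side: re-issue the request as soon as each response comes back."""
    received: List[str] = []
    while len(received) < expected:
        update = handle_long_poll(timeout_s)
        if update is not None:
            received.append(update)
    return received


def publish_later() -> None:
    for text in ("first", "second"):
        updates.put(text)


# New data shows up 0.1 seconds after the client starts waiting.
threading.Timer(0.1, publish_later).start()
received = long_poll_client(expected=2)
print(received)                 # ['first', 'second']
```

Unlike short polling, the client here makes no request that returns empty until the timeout is reached, but the server has to keep a request parked for the whole wait.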
Long polling can provide fast communication in many environments and is widely used, but it has some drawbacks compared to true push-based methods like WebSocket connections or Server-Sent Events (SSE). Long polling is intensive on the server side, as it requires continuous resources to hold a connection open (although it still uses fewer resources than repeatedly sending requests). Web servers generally work to send responses as quickly and with as little bandwidth as possible, which conflicts with the long polling requirement that connections stay open.
And now... WebSockets!
WebSockets were invented specifically to address shortcomings that developers found while using long polling to facilitate real-time results.
WebSockets work by initiating continuous, full-duplex communication between a client and server. This reduces unnecessary network traffic, as data can immediately travel both ways through a single open connection. This increases speed and real-time capability on the web. WebSockets also enable servers to keep track of clients and “push” data to them as needed, which was not possible using only HTTP.
WebSocket connections enable the streaming of text strings and binary data via messages. Each WebSocket message is carried in one or more frames, and each frame consists of a small header followed by the payload data. Very little non-payload data gets sent across the existing network connection this way, which helps to dramatically reduce latency and overhead, especially when compared to HTTP request and streaming models.
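To see how little non-payload data a WebSocket message needs, the sketch below encodes a single unmasked text frame following the base framing layout in RFC 6455. It is deliberately partial: it covers only the server-to-client direction (frames sent by a client must also be masked) and does not handle fragmentation or control frames. The function name `encode_text_frame` is our own.

```python
import struct


def encode_text_frame(text: str) -> bytes:
    """Encode one unmasked, unfragmented text frame (server-to-client)."""
    payload = text.encode("utf-8")
    first_byte = 0x80 | 0x1     # FIN bit set, opcode 0x1 means "text"
    length = len(payload)
    if length < 126:
        # Short payloads store their length directly in the second byte.
        header = struct.pack("!BB", first_byte, length)
    elif length < 65536:
        # 126 signals that a 16-bit length follows.
        header = struct.pack("!BBH", first_byte, 126, length)
    else:
        # 127 signals that a 64-bit length follows.
        header = struct.pack("!BBQ", first_byte, 127, length)
    return header + payload


frame = encode_text_frame("Hello")
print(frame.hex())                  # 810548656c6c6f
print(len(frame) - len("Hello"))    # 2
```

A five-character message costs two bytes of framing here, compared with the full header block that accompanies every HTTP request and response.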
In 2009, Google Chrome became the first browser to include standard support for WebSockets. RFC 6455, The WebSocket Protocol, was officially published online in 2011. The WebSocket Protocol is standardized by the IETF and the WebSocket API by the W3C, and support across browsers is very common.
How WebSocket connections work
Before a client and server can exchange data, they must establish a connection at the TCP (Transmission Control Protocol) layer. WebSockets effectively run as a thin messaging layer on top of TCP.
The connection starts with an HTTP request/response pair, in which the client uses an HTTP/1.1 mechanism called the Upgrade header to switch the connection from HTTP over to WebSockets. This exchange is the WebSocket handshake, and it takes place over the TCP connection that is already open. During the handshake, the client and server can also agree on which subprotocol will be used for their subsequent interactions. Once the handshake completes, the connection runs on the WebSocket protocol.
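The handshake can be sketched with nothing more than string formatting and a hash. The example below builds the client's upgrade request and computes the `Sec-WebSocket-Accept` value a server returns in its `101 Switching Protocols` response, as described in RFC 6455. The host `example.com`, the path `/live`, and the `chat` subprotocol are placeholders, and the helper names are our own. The key is the sample value used in the RFC.

```python
import base64
import hashlib
from typing import Sequence

# Fixed GUID defined by RFC 6455 for computing the accept value.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host: str, path: str, key: str,
                          subprotocols: Sequence[str] = ()) -> str:
    """Build the HTTP/1.1 request a client sends to start the handshake."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    if subprotocols:
        # Optional: subprotocols the client is willing to speak.
        lines.append("Sec-WebSocket-Protocol: " + ", ".join(subprotocols))
    return "\r\n".join(lines) + "\r\n\r\n"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value the server sends back."""
    raw = (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    return base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")


key = "dGhlIHNhbXBsZSBub25jZQ=="   # sample key from RFC 6455
request = build_upgrade_request("example.com", "/live", key, ["chat"])
print(request)
print(accept_key(key))              # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client checks that the returned accept value matches what it expects for the key it sent. After that check, both sides stop speaking HTTP on this connection and exchange WebSocket frames instead.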
It is important to note that WebSocket endpoints are identified by a uniform resource identifier (URI) with a “ws:” or “wss:” scheme, similar to how HTTP URLs always use an “http:” or “https:” scheme. The “wss:” scheme is the secure variant, which runs the connection over TLS.
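As a small illustration of the two schemes, the sketch below uses Python's standard `urllib.parse` module to turn a WebSocket URI into the host, port, and TLS setting a client would connect with. The default ports (80 for “ws:” and 443 for “wss:”) come from RFC 6455. The function name `websocket_endpoint` and the example URIs are our own.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports for the two WebSocket schemes.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def websocket_endpoint(uri: str) -> Tuple[Optional[str], int, bool]:
    """Return (host, port, use_tls) for a ws: or wss: URI."""
    parts = urlsplit(uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URI: {uri!r}")
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    use_tls = parts.scheme == "wss"     # wss: means WebSocket over TLS
    return parts.hostname, port, use_tls


print(websocket_endpoint("wss://example.com/live"))     # ('example.com', 443, True)
print(websocket_endpoint("ws://example.com:8080/live")) # ('example.com', 8080, False)
```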
The benefits of using WebSockets for real-time communication
WebSockets provide real-time updates and open lines of communication, while avoiding the high overhead of many other data transfer methods.
WebSockets are HTML5 compliant and offer backwards compatibility with older HTML documents. As a result, they are supported by all modern web browsers, including Google Chrome, Mozilla Firefox, and Apple Safari.
WebSockets are also compatible across platforms—Android, iOS, web, and desktop apps.
A single server can have multiple WebSocket connections open simultaneously, and can even have multiple connections with the same client, which opens the door for scalability.
WebSockets can stream through many proxies and firewalls.
Evaluating the drawbacks of WebSockets
While most browsers support WebSockets, there are no included mechanisms for network load balancing or for reconnecting if the connection to a client is lost.
Many proxy servers still do not offer support for WebSockets.
WebSockets do not support caching like HTTP.
It is necessary to have fallback options, like HTTP streaming or long polling, in use cases where WebSockets may not be supported.
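Because the protocol itself does not reconnect or fall back, applications usually add that logic themselves. The sketch below shows one common pattern, retrying with exponential backoff and then moving on to the next transport. It is an assumption-laden illustration: the two `connect` callables are simulated, no real network I/O happens, and all function names are hypothetical.

```python
import time
from typing import Callable, List, Sequence, Tuple


def backoff_delays(retries: int, base_s: float = 0.5,
                   cap_s: float = 30.0) -> List[float]:
    """Exponential backoff schedule (0.5, 1, 2, ...), capped at cap_s seconds."""
    return [min(cap_s, base_s * (2 ** attempt)) for attempt in range(retries)]


def connect_with_fallback(
        transports: Sequence[Tuple[str, Callable[[], None]]],
        retries: int = 3,
        sleep: Callable[[float], None] = time.sleep) -> str:
    """Try each transport in order, retrying with backoff before moving on."""
    for name, connect in transports:
        for delay in backoff_delays(retries):
            try:
                connect()
                return name             # connected, report which transport won
            except ConnectionError:
                sleep(delay)            # wait longer after each failure
    raise ConnectionError("no transport available")


def websocket_blocked() -> None:
    # Simulated failure, e.g. a proxy that drops the upgrade request.
    raise ConnectionError("upgrade rejected")


def long_polling_ok() -> None:
    # Simulated success for the fallback transport.
    return None


chosen = connect_with_fallback(
    [("websocket", websocket_blocked), ("long-polling", long_polling_ok)],
    sleep=lambda _delay: None,          # skip real waiting in this demo
)
print(backoff_delays(4))    # [0.5, 1.0, 2.0, 4.0]
print(chosen)               # long-polling
```

In practice, a real client would usually add random jitter to the delays so that many clients do not reconnect at the same moment.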
Ideal use cases for WebSockets
There are many potential use cases for WebSockets, but they are best suited to certain situations. It is important to weigh the benefits and drawbacks of using WebSockets for real-time connections before deciding if they are right for your application.
One of the most common use cases for WebSockets is push notifications. The open, bidirectional communication lines let the server push updates to the client immediately. The open connections also work well for asynchronous, fast-paced messaging. For this reason, live chat and live audience interaction are also popular use cases.
There are applications where a millisecond difference in timing can have an impact on the outcome – like eSports scores or stock prices. WebSockets are a great consideration for these types of apps.
WebSockets can also be implemented successfully in these environments:
Online auction bids
Real-time location tracking
IoT device updates
Essentially, when deciding if your application should be built on WebSockets, consider how important the “real-time” aspect is for pushing your updates. Also, determine the size of your user base and the number of persistent connections that will be required at any given time. When real-time delivery is necessary across large volumes of connections, WebSockets may be the right way to go.
How PubNub uses (and improves on) WebSockets
PubNub’s platform can be used to enhance WebSocket applications. PubNub offers developers a set of building blocks that work over a standard socket connection, made possible by WebSockets. These features ensure reliability, security, and scalability. The building blocks also provide extra services like analytics, storage, and more. Our data streaming platform and SDKs are also extremely scalable, especially compared to socket frameworks like Socket.IO or SockJS.
PubNub is a real-time communication platform that provides the foundation and infrastructure for live events, chat messages, geolocation, and more. As a protocol agnostic solution, we will always consider the best protocol or framework for any environment. For example, PubNub has also found that long polling can be just as efficient as WebSockets in many real-world, real-time implementations. In fact, we have developed a method for efficient long polling, written in C and with multiple kernel optimizations for scale.
Of course, there are many valuable and best-case uses for WebSockets. As a team, PubNub has also compiled some great, in-depth examples of WebSocket programming in different languages.
To conclude, WebSockets are a very useful protocol for building real-time functionality across web, mobile, and desktop variants, but they are not a one-size-fits-all approach. WebSockets are just one tool that fits into a larger arsenal when developing real-time, communication-based applications. It is possible to build on the basic WebSocket protocol and incorporate other methods like SSE or long polling, or to use a platform like PubNub, and construct an even better, more scalable real-time application.
The PubNub Data Stream Network helps improve on WebSocket programming. Using a platform like PubNub for your backend infrastructure can ultimately save time and decrease the overhead when updating, maintaining, and scaling your live applications.
If you are interested in using PubNub to build and power your real-time application, please contact our team.