Evaluating Long Polling vs WebSockets
Web applications were originally designed around a client-server architecture. A client sends an HTTP/HTTPS request to a designated server to request or modify a piece of information. For example, a basic web application follows a flow like this:
The client requests data from the server
The load balancer routes the requests to the appropriate server
The server queries the applicable database for some data
The database returns the queried data to the server
The server processes the data and sends the data back to the client
A simple HTTP request is the most common way to retrieve information from a server. However, what if you wanted to get data back as soon as it was added to the database or sent to the server? With a standard client-server web application, you would have to repeat this process over and over to check whether new information has been added to the database. This process is known as polling, sometimes referred to as short polling. The downside of this approach is that most responses carry no data, because the server hasn’t received anything new since the previous request. Let's look at how to solve this problem and discuss the advantages and disadvantages of long polling and WebSockets.
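To make the cost of short polling concrete, here is a minimal simulation. The FakeServer class, the short_poll function, and the "price=..." messages are hypothetical names invented for illustration; the server is an in-memory stand-in, not a real HTTP endpoint.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical in-memory stand-in for the server and its database."""
    messages: list[str] = field(default_factory=list)

    def fetch_since(self, cursor: int) -> list[str]:
        # Responds immediately, even when nothing new has arrived.
        return self.messages[cursor:]


def short_poll(server: FakeServer, arrivals: dict[int, str], ticks: int) -> tuple[list[str], int]:
    """Poll once per tick; return (messages received, number of empty responses)."""
    received: list[str] = []
    empty_responses = 0
    for tick in range(ticks):
        # New data lands on the server only at the ticks listed in `arrivals`.
        if tick in arrivals:
            server.messages.append(arrivals[tick])
        batch = server.fetch_since(len(received))
        if batch:
            received.extend(batch)
        else:
            empty_responses += 1
    return received, empty_responses


received, empty = short_poll(FakeServer(), {3: "price=101", 8: "price=99"}, ticks=10)
print(received, empty)  # ['price=101', 'price=99'] 8
```

In this run, ten requests deliver only two updates; the other eight round trips return nothing.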
Overview: Long polling vs WebSockets
Long polling is an approach where the client sends an API request to the server, but instead of responding immediately, the server holds the HTTP connection open. Holding the connection open lets the server reply later, either when data becomes available or when the timeout threshold is reached. After receiving the response, the client immediately sends the next request.
Instead of sending numerous requests until the server has new information, as in short polling, the client only has to send one request to get the latest information. After receiving the data, the client initiates a new request, repeating this process as often as necessary.
A flow for Long polling will look as follows:
The client makes an HTTP request to the server requesting some data
The server does not respond immediately with the requested information but waits until new information is available
When new data becomes available, the server responds with new information
The client receives that data and immediately sends another request to the server, re-starting the process
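The flow above can be sketched in-process with a blocking queue. This is only an illustration under stated assumptions: LongPollServer, handle_request, and client_loop are hypothetical names, the "request" is a plain method call rather than real HTTP, and the timeouts are shortened so the example finishes quickly.

```python
import queue
import threading
import time
from typing import Optional


class LongPollServer:
    """Hypothetical in-process stand-in for a long-polling HTTP endpoint."""

    def __init__(self) -> None:
        self._updates: "queue.Queue[str]" = queue.Queue()

    def publish(self, item: str) -> None:
        self._updates.put(item)

    def handle_request(self, timeout: float) -> Optional[str]:
        # Hold the "request" open until data arrives or the timeout elapses.
        try:
            return self._updates.get(timeout=timeout)
        except queue.Empty:
            return None  # comparable to an empty response after the timeout


def client_loop(server: LongPollServer, rounds: int, timeout: float) -> list[str]:
    received: list[str] = []
    for _ in range(rounds):
        response = server.handle_request(timeout)
        if response is not None:
            received.append(response)
        # The next iteration plays the role of the immediate follow-up request.
    return received


server = LongPollServer()


def producer() -> None:
    # New data becomes available shortly after the client starts waiting.
    for item in ("first", "second"):
        time.sleep(0.05)
        server.publish(item)


threading.Thread(target=producer, daemon=True).start()
results = client_loop(server, rounds=3, timeout=0.2)
print(results)
```

Each call returns as soon as an update is published; only the third call, which finds nothing to deliver, waits out the full timeout.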
WebSocket is a modern protocol layered directly over TCP. Its only relationship to HTTP is that its opening handshake is sent as an HTTP Upgrade request, which HTTP servers interpret to establish the connection. It is a bidirectional, full-duplex, stateful protocol, meaning the connection between the client and server persists until either party decides to terminate it.
Unlike long polling, which is only a half-duplex solution, a WebSocket does not need to repeat the request after receiving the latest information from the server. The connection stays alive after new information has been returned, and updates can flow in both directions. The client can send information to the server and listen for further information over the same connection.
A WebSocket connection flow will look something like this:
The client initiates a WebSocket connection by sending an HTTP request that contains an Upgrade header asking to switch the communication protocol to WebSocket
If the server can establish the connection and agrees to the client's terms, it sends a response acknowledging the WebSocket handshake request (HTTP status 101 Switching Protocols)
Once the client receives the successful handshake response, the client and the server can start sending data in both directions, allowing real-time communication
The server or the client decides to terminate the connection
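The handshake in the first two steps can be illustrated with the accept-key calculation defined in RFC 6455: the server appends a fixed GUID to the client's Sec-WebSocket-Key, hashes the result with SHA-1, and returns it base64-encoded. The helper names accept_value and handshake_response below are hypothetical, and the sketch covers only the handshake, not the framing of messages afterwards.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(request_headers: dict[str, str]) -> str:
    """Build the 101 response for an upgrade request (hypothetical helper)."""
    if request_headers.get("Upgrade", "").lower() != "websocket":
        raise ValueError("not a WebSocket upgrade request")
    key = request_headers["Sec-WebSocket-Key"]
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_value(key)}\r\n\r\n"
    )


# Headers a client might send; the key is the sample value used in RFC 6455.
request = {
    "Upgrade": "websocket",
    "Connection": "Upgrade",
    "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",
    "Sec-WebSocket-Version": "13",
}
response = handshake_response(request)
print(response)
```

In practice a WebSocket library performs this exchange for you; the sketch is only meant to show that the upgrade is an ordinary HTTP request and response.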
When to choose Long Polling vs WebSockets
There is ongoing debate about when to use long polling and when to use the WebSocket protocol. Both have their benefits and limitations and are often used for different purposes. In this section, we will discuss the key benefits of both long polling and WebSockets.
Pros of Long Polling vs WebSockets
Compatibility: Long polling is an older approach, more a technique than a technology, which makes it a more compatible option than WebSockets. It is built on ordinary HTTP requests, traditionally made with XMLHttpRequest, so it works across a broader range of web browsers and network configurations.
Network: People constantly switch networks, from 3G to LTE to WiFi. A WebSocket client must be written to handle a change in the network connection, because the connection has to be re-established with the server and cannot be revived once it has been closed. With long polling, this is not an issue: after the predetermined timeout (usually 20 seconds), the client sends another request, which automatically re-establishes a connection with the server. The reconnect does not have to be handled as an error state, as it does with WebSockets.
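One common way to handle the reconnect case for a persistent connection is to retry with exponential backoff. The sketch below is a generic illustration, not part of any WebSocket library: connect_with_backoff and flaky_connect are hypothetical names, the connect callable stands in for whatever opens your real connection, and the delay values are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_backoff(
    connect: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")


# Hypothetical connection that fails twice (e.g. during a network switch).
attempts = {"count": 0}


def flaky_connect() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("network changed")
    return "connected"


waits: list[float] = []
result = connect_with_backoff(flaky_connect, sleep=waits.append)
print(result, waits)  # connected [0.5, 1.0]
```

Passing the sleep function as a parameter keeps the example fast and testable; a real client would use the default and typically add random jitter to the delays.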
Use cases to choose long polling over WebSockets
Long polling and WebSockets are generally used in cases where real-time updates are required. Some examples include in-app chat, real-time pricing, geo-tracking, and IoT.
Long polling provides benefits over WebSockets in use cases with low-frequency real-time updates. It is a half-real-time solution in which the request has to be re-established after every response, so it fits best when updates are infrequent. Additionally, as mentioned above, if users are in an environment with low bandwidth or an unstable network provider, long polling re-establishes the connection with no additional complications. However, long polling is an older technique for performing real-time updates. As a result, it is less advanced and less flexible than WebSockets but has more support for legacy systems.
When to choose WebSockets vs long polling
Pros of WebSockets vs long polling
Reduced Resource Utilization: WebSockets maintain a persistent connection between the client and the server, reducing the overhead of establishing a new connection for each real-time update. The persistent connection lowers the network bandwidth, memory, and CPU that both the client and the server spend to achieve real-time communication.
Improved Scalability: Because WebSockets support bidirectional communication between the client and the server, the server can push updates to the client in real time, reducing the number of requests sent. Long polling must re-establish a request every time the client needs new information. As the user base scales, this can put a lot of strain on an individual server.
Advanced Functionality: WebSockets provide full-duplex communication channels that achieve real-time data transfer and low latency. Long polling is sometimes considered only a half-real-time solution and is not ideal for high-traffic scenarios or use cases that require real-time updates. The result is a smoother end-user experience, with updates arriving seamlessly in the application.
Use cases to choose WebSockets over long polling
WebSockets are better suited for applications that require high-frequency updates. Examples include chat applications or real-time data feeds. The persistent connection allows for efficient transmission, making for a more seamless experience for the end user. Multiplayer games and collaboration tools generally use WebSockets as well. Bidirectional communication allows the server to signal the client, which is useful for receiving real-time updates from other clients. For example, the server can signal the client, telling it to update another player's position depending on their actions.
As a user base grows, it is usually worth switching to the WebSocket protocol. The strain on an individual server can become too large if every client uses long polling. Having each client send a request every 20 seconds uses server resources poorly and can slow the server down over time, increasing latency per request.
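A rough back-of-the-envelope calculation shows how this load grows. The function below is a hypothetical helper that assumes every client re-issues its request once per timeout and that no data arrives in between, which is the idle baseline; real traffic would add requests on top of it.

```python
def idle_requests_per_second(clients: int, poll_timeout_s: float) -> float:
    """Requests per second generated purely by long-poll timeouts expiring."""
    if poll_timeout_s <= 0:
        raise ValueError("poll_timeout_s must be positive")
    return clients / poll_timeout_s


for clients in (1_000, 100_000):
    rate = idle_requests_per_second(clients, poll_timeout_s=20.0)
    print(f"{clients} clients -> {rate} requests/s while idle")
```

With a 20-second timeout, 1,000 idle clients generate 50 requests per second and 100,000 generate 5,000, before a single update is delivered. An idle WebSocket connection does not generate new HTTP requests in the same way, although it still occupies memory on the server.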
How PubNub fits into the conversation of Long polling vs WebSockets
WebSockets and long polling both offer valuable solutions for building real-time web applications. However, many other considerations come into play when building on these use cases. With today’s technology, nothing is as simple as sending a message or data from one client to another. There is almost always additional functionality required on top of the real-time system that a developer is trying to create. For example, an in-app chat may need presence updates (signalling) when users come online, profanity filtering, or even read and delivery receipts for messages.
On top of adding specific functionality, there are still problems with the underlying infrastructure, such as the complexity of handling scale. Even when using a technology such as Socket.io to create WebSocket connections, developers still have to dynamically spawn servers around the world behind a load balancer and manage the utilization of each server. These problems grow more complex the more you scale.