Rate limiting

PubNub provides multiple strategies for controlling message rates in high‑occupancy scenarios. Rate limiting maintains optimal user experience and manages system usage when thousands of participants interact simultaneously.

This document focuses on PubNub's Functions‑based rate limiter, a flexible solution for controlling message flow in live events, chat applications, and other high‑traffic use cases.

What to consider

When implementing rate limiting, consider how message volume affects user experience and system usage.

Message frequency

Chats become noisy when too many messages arrive from too many participants. In conversations meant for meaningful dialogue, excessive noise erodes attention until the chat reaches saturation and becomes unusable. Reducing the audience size per channel creates opportunities for real conversation.

Experiences focused on engagement rather than dialogue tolerate higher saturation. The appropriate traffic level is a business decision: messages should remain readable, even if users lack time to respond to each one.

Event occupancy

High occupancy increases both message frequency and system usage. As occupancy grows, each message must be delivered to more subscribers. Consider a scenario with 10,000 users celebrating a moment during a live event:

| Scenario | Effect |
| --- | --- |
| Single channel | All 10,000 users receive every message, creating high volume for all participants |
| Ten sharded channels | Each group of 1,000 users receives only messages from their channel, reducing noise |

Channel sharding improves user experience by creating smaller, more manageable conversation spaces.
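The sharding idea above can be sketched with a deterministic shard assignment, so every client derives the same channel for the same user without a central assignment service. This is a minimal illustration; the `chat.shard-` channel naming scheme and the hash function are assumptions, not PubNub conventions.

```javascript
// Assigns each user to one of N sharded channels deterministically,
// so a given user always lands in the same shard.
// The "chat.shard-<i>" naming scheme is illustrative only.
function shardChannel(userId, shardCount, prefix = "chat.shard-") {
  // Simple deterministic string hash (djb2 variant), kept in uint32 range.
  let hash = 5381;
  for (let i = 0; i < userId.length; i++) {
    hash = ((hash * 33) ^ userId.charCodeAt(i)) >>> 0;
  }
  return prefix + (hash % shardCount);
}

// Every client computes the same shard for "user-42",
// so publish and subscribe naturally agree on the channel.
const channel = shardChannel("user-42", 10);
```

Because the assignment is a pure function of the user ID, no coordination is needed between clients, and shard membership survives reconnects.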

Functions rate limiter

Recommended solution

The Functions‑based rate limiter is PubNub's recommended approach for controlling message rates. It provides dynamic throttling without requiring additional infrastructure while preserving the shared experience of being part of a large audience.

When 100,000 fans are celebrating together, you want them to feel the energy of the crowd and not feel separated into smaller rooms. The Functions rate limiter achieves this by intelligently throttling message rates while keeping all users in a single shared channel.

This approach:

  • Preserves the "stadium atmosphere" where users feel part of the full audience
  • Dynamically adjusts throttling based on current message rates
  • Requires no user separation or channel assignment logic
  • Works automatically with minimal configuration

For scenarios where approximate rate control suffices, this solution maintains optimal user experience without fragmenting your community.

Distributed system challenges

Rate limiting in distributed systems introduces complexity:

  • Coordination and synchronization: Multiple nodes must coordinate to enforce limits. State information propagates across nodes with latency, creating temporary discrepancies that allow brief limit violations.

  • Scalability: Managing rate limits grows harder as nodes, clients, and users increase. Solutions must scale efficiently with system growth.

  • System failures: Node or network failures disrupt rate limiting. Robust solutions handle failures gracefully and recover quickly.

  • Performance impact: Rate limiting adds processing overhead per request. Poor optimization degrades latency and throughput.

  • Configuration complexity: Defining and enforcing rules across multiple nodes or services complicates management.

  • Fairness versus efficiency: Balancing equal user opportunities against system performance requires careful tuning and monitoring.

Throttling solution

Absolute rate limiting requires fan‑in‑fan‑out architectures where a single component consumes all messages and republishes at the desired rate. This creates a single point of failure and increases latency.

Most applications need only approximate rates to avoid saturation and maintain user experience. Approximate limiting preserves PubNub's low latency and reliability guarantees.

The solution uses two components: a message throttler and a message sampler.

Message throttler

The throttler is a before‑publish Function that controls message rates based on configuration.

Configuration

Control messages update the Function's throttling configuration. Each Function instance stores configuration and persists it to the KV store for other instances to retrieve.

Periodically, instances check the KV store for configuration updates and synchronize their settings.
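The periodic check can be sketched as a small cache-with-TTL. In a real Function the fetch callback would read from the KV store (PubNub Functions expose it via `require('kvstore')`); here it is injected so the refresh logic stands alone, and the TTL value is an assumption you would tune.

```javascript
// Sketch of per-instance configuration sync: serve a local copy of the
// throttling config, refetching from the shared store once per TTL.
// `fetchConfig` and `now` are injected; in a Function, fetchConfig
// would wrap a KV store read.
function makeConfigCache(fetchConfig, ttlMs, now = Date.now) {
  let cached = null;
  let fetchedAt = -Infinity;
  return {
    get() {
      if (now() - fetchedAt >= ttlMs) {
        cached = fetchConfig(); // refresh from the shared store
        fetchedAt = now();
      }
      return cached; // otherwise serve the local copy
    },
  };
}
```

Each instance serves its local copy between refreshes, so a configuration change propagates to all instances within roughly one TTL.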

Throttling logic

When a non‑control message arrives on a rate‑limited channel, the Function looks up throttling parameters. Using probability and channel configuration, it randomly decides whether to throttle.

Throttled messages route to an analytics channel rather than reaching subscribers. This preserves messages for analysis without impacting user experience. Configuration can also discard throttled messages entirely.

Message sampler

The sampler subscribes to rate‑limited channels and tracks message rates. It generates control messages to adjust throttling dynamically.

Sampling configuration

Configure the sampler to monitor specific channels. Define a sampling period N and target rate per period. During each period, the sampler tallies messages per channel.

Control messages

At period end, the sampler compares counts against target rates and computes throttling adjustments. If throttling changed, it publishes control messages for the throttler to consume.

Complete architecture

The throttler and sampler form a feedback loop that maintains approximate rate limits.

Sample code

Message Sampler

Standalone Node.js application that monitors channels and calculates throttling rates:

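The full application isn't reproduced here, but its core per-period calculation can be sketched as a pure function: tally observed messages per channel, derive a throttle probability, and emit a control message only when it changes. The control-message shape and field names (`type`, `probability`) are illustrative assumptions; subscribing, counting, and publishing via the PubNub SDK are omitted.

```javascript
// End-of-period step of the sampler: compare observed counts against
// the target rate and produce control messages for the throttler.
// `previous` holds the last probability sent per channel, so unchanged
// channels produce no control traffic.
function computeControls(counts, targetPerPeriod, previous = {}) {
  const controls = [];
  for (const [channel, observed] of Object.entries(counts)) {
    // Throttle the excess fraction so the delivered rate
    // approaches the target.
    const p = observed > targetPerPeriod
      ? 1 - targetPerPeriod / observed
      : 0;
    if (previous[channel] !== p) {
      controls.push({ type: "throttle-config", channel, probability: p });
    }
  }
  return controls;
}

// 400 observed vs. a target of 100 → throttle probability 0.75
const controls = computeControls({ stream: 400 }, 100);
```

A channel running at or below target gets probability 0, so quiet channels are never throttled.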

Message Throttler

Before‑publish Function that applies probabilistic throttling based on configuration:

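The Function's decision for a single incoming message can be sketched as follows. In the actual before-publish Function this would run against the channel's stored configuration and end with the event-handler result (deliver, reroute, or abort); here the random source is injected so the probabilistic logic is testable, and the `keepForAnalytics` flag and message shape are assumptions.

```javascript
// One throttling decision: control messages always pass through;
// other messages are throttled with the configured probability and
// either diverted to analytics or discarded.
// `rand` is injectable for testing; production code would use Math.random.
function routeMessage(msg, config, rand = Math.random) {
  if (msg.type === "throttle-config") {
    return "control"; // configuration updates are never throttled
  }
  const { probability = 0, keepForAnalytics = true } = config;
  if (rand() < probability) {
    // Throttled: preserve for analysis, or drop entirely if configured.
    return keepForAnalytics ? "analytics" : "discard";
  }
  return "deliver";
}
```

Because the decision is independent per message, no state is shared between Function instances at publish time; only the configuration is synchronized.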

We recommend this approach because it:

  • Maintains optimal user experience during high traffic
  • Responds to dynamic traffic fluctuations
  • Maintains low publish latency
  • Supports different rates per channel
  • Allows sampling periods to be tuned to match traffic profiles
  • Retains throttled messages for analytics
  • Delivers the same message stream to all users
  • Produces an organic, natural experience through approximate rates

Considerations

Keep in mind that the actual rate may exceed or fall short of the target, and that a poorly chosen sampling period can increase rate variance.

Channel sharding

When to use channel sharding

Channel sharding divides large audiences into separate conversation spaces. Unlike the Functions rate limiter, this approach separates users into distinct rooms rather than keeping them in a shared experience, so most applications benefit more from the Functions rate limiter.

Channel sharding is appropriate only when logical groupings exist, such as:

  • Language-based separation (Spanish-speaking fans chat with other Spanish speakers, etc.)
  • Team affiliation (home and away fan sections)
  • Geographic regions (local fan communities)
  • Premium tiers (VIP rooms for special access)

Avoid channel sharding when you want users to feel part of the full audience (the "100k stadium" experience) and there is no logical way to group them; random distribution would fragment the sense of community. In that case, use the Functions rate limiter to preserve the shared experience.

For detailed implementation guidance, refer to Live Event Rate Limiting.
