Most real-time frontends do not fail all at once. They drift.
At first, the system looks fine. Data updates quickly enough, users rarely notice inconsistencies, and the UI seems stable under light load. Then the product grows. More users connect, more updates happen concurrently, and network variability becomes impossible to ignore. The result is not usually a crash. It is a gradual loss of trust in what the UI is showing.
That breakdown rarely announces itself as a single crash. It shows up as a cluster of symptoms that often get dismissed as rendering issues. In many cases, they are not rendering issues at all. They are temporal consistency issues: the frontend is receiving a changing stream of information, but the architecture is still treating state like a static snapshot.
A more accurate mental model is this: real-time frontend state is not something you simply hold. It is something you continuously derive from events over time.
| What teams see in production | What is actually failing | What an event-driven pipeline adds |
|---|---|---|
| Counters briefly go backward | Out-of-order updates | Version-aware ordering |
| Rows flicker or disappear | Snapshot replacement under concurrent change | Incremental event application |
| Two users see different states | Missed or delayed updates | Replayable, deterministic reduction |
| UI freezes during bursts | Too many immediate state writes | Buffering and batch application |
| Refreshing “fixes” the issue | Frontend drift from source of truth | Periodic snapshot reconciliation |
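The version-aware ordering in the first row can be sketched in a few lines of plain JavaScript (a minimal illustration; the `applyUpdate` function and the event shape are hypothetical):

```javascript
// Reject any update whose version is not newer than what we already hold.
function applyUpdate(state, event) {
  if (event.version <= state.version) {
    return state; // stale or duplicate: ignore, never go backward
  }
  return { ...state, count: event.count, version: event.version };
}

let counter = { count: 0, version: 0 };
const updates = [
  { count: 1, version: 1 },
  { count: 3, version: 3 },
  { count: 2, version: 2 }, // arrives late over the network — must not win
];
for (const e of updates) counter = applyUpdate(counter, e);
// counter ends at count 3, version 3; the late v2 update is discarded
```

Even with out-of-order delivery, the counter never moves backward, which is exactly the property the table row describes.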
Most frontends start with polling because it is easy to ship:
setInterval(async () => {
  const res = await fetch("/api/orders");
  const data = await res.json();
  setOrders(data);
}, 3000);
The polling model assumes that if you fetch often enough, the UI will stay close to reality. That assumption breaks in a few important ways.
Every client asks the same question over and over: “Has anything changed?”
At 10 users, that is usually fine. At a few thousand users, it becomes expensive: every client hits the same endpoint on the same interval, most responses contain nothing new, and backend load scales with connected clients rather than with actual change.
Polling is inefficient because it is pull-based ignorance. The client does not know when something changed, so it keeps asking just in case.
Polling guarantees periods where the UI is wrong.
Imagine updates happen every second, but the frontend polls every five seconds:
t=1s  ORDER_CREATED
t=2s  ORDER_UPDATED
t=3s  ORDER_CANCELLED
t=5s  poll happens
The UI jumps straight from the initial state to the final one:
(no order) → cancelled order
That means the frontend misses intermediate transitions entirely. Those missing transitions break animations, distort aggregates, and make state changes hard to explain.
Polling does not just delay updates. It collapses time.
Polling requests are asynchronous, so responses can arrive out of order. That means you have no built-in ordering guarantees, no deduplication, and no way to know whether the response you just applied is actually the newest one.
At scale, those problems stop being edge cases and start becoming normal behavior.
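One narrow mitigation, still within the polling model, is to tag each request with a sequence number so stale responses can be detected and dropped (a sketch; `startPoll` and `handlePollResponse` are hypothetical names):

```javascript
// Tag each poll with a sequence number; only the newest response may apply.
let latestRequestId = 0;
let orders = null;

function handlePollResponse(requestId, data) {
  if (requestId < latestRequestId) {
    return false; // a newer request exists — this response is stale, drop it
  }
  orders = data;
  return true;
}

function startPoll() {
  latestRequestId += 1;
  const id = latestRequestId;
  // in a real app: fetch("/api/orders").then((res) => res.json())
  //   .then((data) => handlePollResponse(id, data));
  return id;
}

const first = startPoll();
const second = startPoll();
handlePollResponse(second, ["order-2"]); // newer response lands first
handlePollResponse(first, ["order-1"]);  // stale response arrives late — ignored
// orders remains ["order-2"]
```

This patches one symptom, but it still cannot recover the transitions that happened between polls.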
HTTP request/response is built around a simple pattern:
Request → Response → Done
Real-time systems behave more like this:
Event → Event → Event → Event
Request/response gives you snapshots. It does not give you history, causality, or ordering. It answers, “What is the state right now?” Real-time UIs also need to answer, “What changed, in what order, and how did we get here?”
That distinction is what breaks at scale.
Under light load, latency is low, event frequency is low, and ordering problems are rare enough to ignore. The illusion holds.
As load increases, events happen faster than polling intervals, network jitter becomes more visible, and concurrent updates collide. The system starts violating temporal assumptions it never explicitly modeled.
One of the biggest misconceptions in frontend engineering is that state is something you simply have.
In real-time systems, state is something you derive over time from a stream of events. If you do not model that explicitly, inconsistency is not an implementation mistake. It is the default outcome.
At any moment, the UI reflects every event it has processed so far:
state(t) = reduce(events[0...t])
That idea has an important consequence: if two clients process the same events in the same order, they converge to the same state.
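That convergence property is easy to demonstrate with a plain reduction (an illustrative reducer and event shape, not tied to any specific library):

```javascript
// Deterministic reduction: identical event sequences yield identical state.
function reducer(state, event) {
  switch (event.type) {
    case "ADD":
      return { ...state, total: state.total + event.amount };
    case "REMOVE":
      return { ...state, total: state.total - event.amount };
    default:
      return state;
  }
}

const events = [
  { type: "ADD", amount: 5 },
  { type: "ADD", amount: 3 },
  { type: "REMOVE", amount: 2 },
];

// Two independent "clients" reduce the same ordered event log.
const clientA = events.reduce(reducer, { total: 0 });
const clientB = events.reduce(reducer, { total: 0 });
// Both converge to total === 6
```

The hard part in practice is the premise: making sure both clients really do see the same events in the same order.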
To reason about real-time frontends clearly, it helps to separate three things that often get blurred together:
Event log → Event stream → Snapshot
The event log is the authoritative history of what happened. That might be a Kafka topic, a database change log, or an append-only event store.
It provides durability, replayability, and some notion of ordering. The frontend usually does not own it, but its correctness depends on it.
The event stream is what the frontend actually receives:
const events$ = incomingEventsFromSocketOrSSE;
This stream is imperfect. It may arrive late, out of order, or with duplicates. It represents the event log, but it is a lossy real-time projection of it.
The snapshot is what the UI renders:
const [state, setState] = useState(initialState);
A snapshot is fast and convenient, but it has no memory. It cannot tell you how it got there, what changed, or whether it is correct.
Many frontend systems go straight from this:
Event stream → Snapshot
That shortcut is where most inconsistency begins.
This is where RxJS is useful. Not as storage and not as state management in the usual sense, but as a temporal processing layer between incoming events and state updates.
A more robust architecture looks like this:
Event log → Event stream → Reactive stream processing → Reducer → Snapshot
For example:
const processedEvents$ = events$.pipe(
  map(normalize),
  filter(isRelevant),
  distinct(byIdAndVersion),
  bufferTime(50)
);

const state$ = processedEvents$.pipe(
  scan(reducer, initialState)
);
| Concern | Role of reactive streams |
|---|---|
| Event log | Not responsible |
| Event storage | Not responsible |
| Event flow control | Core responsibility |
| Event transformation | Core responsibility |
| State derivation | Via reduction with scan() |
Reactive streams do not replace state. They shape how events become state.
Once you treat the event stream as a first-class layer, the real question becomes clearer: how do you keep a messy asynchronous stream consistent?
The answer is not “trust the transport.” It is “constrain how events are applied.”
That usually means deduplicating redelivered events, enforcing per-entity ordering with version checks, batching bursts before they reach the render path, and reconciling periodically against an authoritative snapshot.
At first glance, event-driven frontends can look like too much machinery. Instead of fetching data and mutating state directly, you now have streams, reducers, event pipelines, and version checks.
In practice, that machinery solves real problems.
Polling gives you snapshots. Event-driven systems give you state transitions. That means you can preserve ordering, reason about causality, and explain why the UI looks the way it does.
In many frontends, components do too much. They fetch data, interpret responses, retry operations, and reconcile live updates. Add real-time behavior and every component starts acting like its own mini distributed system.
An event pipeline centralizes that logic.
If the same processed event stream always produces the same state, your frontend becomes easier to debug, test, and trust.
Because state is derived from events, you can record an event sequence in production, replay it locally, and step through the exact state transitions that led to a bug. That is extremely hard to do with polling and snapshot replacement.
Without a pipeline, every component ends up handling duplicates, stale data, ordering issues, and retry logic on its own. A stream layer gives you one place to apply those rules consistently.
Real systems do not update at a steady pace. They spike. Event pipelines let you batch updates and apply them atomically, which protects the UI from thrashing.
A good event-driven architecture still grows in complexity, but it grows more linearly. The alternative is usually hidden complexity scattered across components, effects, and ad hoc synchronization code.
So far, the core argument is this: polling collapses time into snapshots, real-time systems are streams of events, and consistent UI state has to be derived from those events rather than overwritten by the latest response.
The next question is practical: what should actually happen between receiving an event and updating the UI?
A production-grade pipeline usually looks like this:
Transport → Normalization → Validation → Stream processing → Reduction → Snapshot
Each step has a distinct job.
The transport layer gets events into the browser. Common options include WebSockets, Server-Sent Events (SSE), and GraphQL subscriptions.
For example, with SSE:
const source = new EventSource("/events");
source.onmessage = (e) => {
  rawEvents$.next(JSON.parse(e.data));
};
Different services often emit different payload shapes:
{ "event_type": "order_update", "payload": { ... } }
{ "type": "ORDER_UPDATED", "data": { ... } }
Normalize them immediately:
function normalize(raw) {
  return {
    type: raw.event_type || raw.type,
    id: raw.payload?.id ?? raw.data?.id,
    version: raw.payload?.version ?? raw.data?.version,
    ts: raw.payload?.updated_at ?? raw.data?.updated_at
  };
}
const normalized$ = rawEvents$.pipe(
  map(normalize)
);
That gives the rest of the frontend one event vocabulary.
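For example, two differently shaped payloads collapse into the same event (a self-contained sketch of the normalization above, with the timestamp field omitted for brevity):

```javascript
// Map differing backend payload shapes onto one frontend event vocabulary.
function normalize(raw) {
  return {
    type: raw.event_type || raw.type,
    id: raw.payload?.id ?? raw.data?.id,
    version: raw.payload?.version ?? raw.data?.version,
  };
}

const fromServiceA = { event_type: "order_update", payload: { id: 1, version: 2 } };
const fromServiceB = { type: "order_update", data: { id: 1, version: 2 } };

const a = normalize(fromServiceA);
const b = normalize(fromServiceB);
// Both normalize to { type: "order_update", id: 1, version: 2 }
```

Everything downstream, including validation, deduplication, and reduction, can now assume a single shape.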
Not every event should be trusted blindly.
const valid$ = normalized$.pipe(
  filter((e) => e.id != null && e.type != null)
);
Validation helps prevent silent corruption, reducer crashes, and undefined behavior.
This is where you define how events behave over time before they affect state.
const processed$ = valid$.pipe(
  distinct(byIdAndVersion),
  groupBy((e) => e.id),
  mergeMap((group$) =>
    group$.pipe(
      scan(enforceOrdering, initialPerEntityState)
    )
  ),
  bufferTime(50)
);
A few patterns matter here.
Events may be retried or redelivered:
// RxJS distinct() expects a key selector, so deduplicate on a composite key:
function byIdAndVersion(event) {
  return `${event.id}:${event.version}`;
}
Without deduplication, you can double-apply updates and corrupt derived state.
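The same deduplication can be sketched without a stream library, using a Set keyed on `id:version` (illustrative event shapes):

```javascript
// Plain-JS equivalent of stream deduplication by (id, version) key.
function dedupeByIdAndVersion(events) {
  const seen = new Set();
  const out = [];
  for (const e of events) {
    const key = `${e.id}:${e.version}`;
    if (seen.has(key)) continue; // redelivered event — drop it
    seen.add(key);
    out.push(e);
  }
  return out;
}

const incoming = [
  { id: 1, version: 1 },
  { id: 1, version: 1 }, // duplicate delivery from a retry
  { id: 1, version: 2 },
];
const deduped = dedupeByIdAndVersion(incoming);
// deduped contains two events: v1 and v2 for entity 1
```

Note that the seen-set grows unbounded in this sketch; a production version would cap or expire it.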
Global ordering is difficult. Per-entity ordering is much more manageable.
groupBy((e) => e.id);
That gives each entity its own stream:
Order 1: v1 → v2 → v3
Order 2: v5 → v6 → v7
Then you can enforce local ordering:
function enforceOrdering(state, event) {
  if (event.version <= state.version) {
    return state;
  }
  return {
    ...state,
    version: event.version,
    event
  };
}
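The grouping plus local-ordering idea can be simulated without a stream library (a plain-JS sketch; `applyPerEntity` is a hypothetical helper and the event shape is illustrative):

```javascript
// Group events by entity id, then enforce ordering locally per entity.
function applyPerEntity(events) {
  const entities = {};
  for (const e of events) {
    const current = entities[e.id] ?? { version: 0 };
    if (e.version <= current.version) continue; // stale for this entity — skip
    entities[e.id] = { ...current, ...e };
  }
  return entities;
}

const events = [
  { id: 1, version: 2, status: "shipped" },
  { id: 2, version: 5, status: "pending" },
  { id: 1, version: 1, status: "created" }, // arrives late — loses to v2
];
const result = applyPerEntity(events);
// result[1].status === "shipped"; the late v1 event never regresses entity 1
```

Each entity only needs to agree with itself, which is a much weaker and cheaper guarantee than global ordering.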
Real systems produce bursts:
10 events in 5ms
If you render on every update, the UI thrashes. Batching lets you apply many updates as a single state transition:
bufferTime(50);
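Outside of RxJS, the same batching idea is a buffer plus a timer (a minimal sketch; `createBatcher` is a hypothetical helper, and `flush` is exposed so the batch can be drained deterministically):

```javascript
// A minimal batcher: collect events, then apply them as one state transition.
function createBatcher(applyBatch, windowMs = 50) {
  let buffer = [];
  let timer = null;
  function flush() {
    if (timer) clearTimeout(timer);
    timer = null;
    if (buffer.length === 0) return;
    const batch = buffer;
    buffer = [];
    applyBatch(batch); // one render-triggering update per window
  }
  return {
    push(event) {
      buffer.push(event);
      if (!timer) timer = setTimeout(flush, windowMs);
    },
    flush, // exposed for tests and teardown
  };
}

let applied = [];
const batcher = createBatcher((batch) => applied.push(batch.length));
batcher.push({ type: "A" });
batcher.push({ type: "B" });
batcher.push({ type: "C" });
batcher.flush();
// applied === [3]: three events arrived, but state changed once
```

Three events produced one state transition, which is the whole point: render frequency is decoupled from event frequency.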
Sometimes events do not carry enough context by themselves:
map((event) => ({
  ...event,
  user: userCache[event.userId],
  computedStatus: deriveStatus(event)
}));
That can reduce repeated lookups in reducers and keep components simpler.
After processing, reduce the resulting events into state:
const state$ = processed$.pipe(
  scan((state, batch) => {
    return batch.reduce(reducer, state);
  }, initialState)
);
A reducer for real-time state should be pure, deterministic, version-aware, and idempotent, so that replaying the same events always produces the same state.
For example:
function reducer(state, event) {
  const current = state.orders[event.id];
  if (current && event.version <= current.version) {
    return state;
  }
  return {
    ...state,
    orders: {
      ...state.orders,
      [event.id]: {
        ...current,
        ...event
      }
    }
  };
}
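Because the reducer checks versions, it is idempotent: applying the same event twice is a no-op, which makes at-least-once delivery safe (a self-contained demonstration of the reducer above):

```javascript
// A version-aware reducer is idempotent: re-applying an event changes nothing.
function reducer(state, event) {
  const current = state.orders[event.id];
  if (current && event.version <= current.version) {
    return state; // duplicate or stale — return the same state object
  }
  return {
    ...state,
    orders: { ...state.orders, [event.id]: { ...current, ...event } },
  };
}

const event = { id: 1, version: 1, status: "created" };
const once = reducer({ orders: {} }, event);
const twice = reducer(once, event); // duplicate delivery
// twice === once: the second application returned the exact same state
```

Returning the same object reference on a no-op also plays well with reference-equality checks in UI frameworks, since nothing downstream re-renders.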
The snapshot is what components finally consume:
state$.subscribe(setState);
Or through a hook:
function useOrders() {
  return useObservable(state$);
}
Put together, the full pipeline reads top to bottom:

rawEvents$
  .pipe(
    map(normalize),
    filter(valid),
    distinct(byIdAndVersion),
    groupBy((e) => e.id),
    mergeMap((group$) => group$.pipe(scan(enforceOrdering))),
    bufferTime(50),
    scan(applyBatchReducer, initialState)
  )
  .subscribe(setState);
Real-time frontends rarely fail in obvious ways. They drift. The UI still works, but some updates never apply, values become subtly wrong, and different clients disagree.
Version checks are the simplest starting point.
if (event.version <= current.version) return state;
This is good for idempotency and simpler systems.
If strict ordering matters more than latency:
const ordered$ = events$.pipe(
  bufferTime(50),
  map(sortByVersion)
);
You accept a small delay in exchange for a more consistent update order.
groupBy((e) => e.id);
Enforcing ordering per entity is often the best practical tradeoff.
Most real-world delivery is at least once, not exactly once. Duplicates are normal.
if (event.version === current.version) return state;
if (seenEvents.has(event.eventId)) return state;
seenEvents.add(event.eventId);
This helps when versioning alone is not enough.
Desync is not a rare bug. In real-time systems, it is something you should plan to detect and recover from.
if (event.version > current.version + 1) {
  triggerResync();
}
This is one of the clearest early signals that the frontend missed something.
if (Date.now() - lastEventTs > threshold) {
  triggerResync();
}
This helps when connections silently drop or the backend stalls.
The backend can periodically send a state hash:
{ "type": "STATE_HASH", "hash": "abc123" }
The frontend compares it to its own computed hash:
if (localHash !== remoteHash) {
  triggerResync();
}
This catches silent corruption and missed updates.
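Hash comparison only works if both sides serialize state identically, so key order has to be canonicalized before hashing (a sketch using an FNV-1a hash; the backend would need equivalent logic, and the helper names are hypothetical):

```javascript
// Canonical JSON: sort object keys so equal states serialize identically.
function canonicalJson(value) {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  const keys = Object.keys(value).sort();
  return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalJson(value[k])}`).join(",")}}`;
}

// 32-bit FNV-1a hash, cheap enough to run on every reconciliation check.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

const localState = { orders: { 1: { status: "done" } }, count: 1 };
const remoteState = { count: 1, orders: { 1: { status: "done" } } }; // same data, different key order
const localHash = fnv1a(canonicalJson(localState));
const remoteHash = fnv1a(canonicalJson(remoteState));
// localHash === remoteHash despite the key-order difference
```

Without canonicalization, two states with identical data but different key order would hash differently and trigger spurious resyncs.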
Sometimes state becomes impossible in business terms: negative counts, invalid status transitions, or broken relationships between entities. Those are useful desync signals too.
Detection without recovery is not very useful. A practical recovery flow looks like this.
processingPaused = true; // 1. stop applying live events
buffer.push(event);      // 2. buffer anything that arrives meanwhile
const res = await fetch("/orders/snapshot");
const snapshot = await res.json();
The safest approach is often the simplest one:
state = snapshot;
More advanced systems may merge optimistic local changes.
buffer.forEach((event) => {
  state = reducer(state, event);
});
processingPaused = false;
buffer = [];
Because the system is event-driven, you can store and replay event sequences:
events.forEach((e) => {
  state = reducer(state, e);
});
That makes production bugs reproducible in a way snapshot-based systems rarely are.
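A minimal record-and-replay harness makes that concrete (a sketch; `createRecorder` is a hypothetical helper built around the same reducer idea):

```javascript
// Record events as they are applied, then replay the log to rebuild state.
function createRecorder(reducer, initialState) {
  const log = [];
  let state = initialState;
  return {
    dispatch(event) {
      log.push(event);
      state = reducer(state, event);
      return state;
    },
    getState: () => state,
    export: () => JSON.stringify(log), // ship this from production to a dev machine
    replay(serializedLog) {
      // deterministically rebuild state from the recorded sequence
      return JSON.parse(serializedLog).reduce(reducer, initialState);
    },
  };
}

const reducer = (state, e) => ({ ...state, [e.id]: e.status });
const recorder = createRecorder(reducer, {});
recorder.dispatch({ id: 1, status: "created" });
recorder.dispatch({ id: 1, status: "completed" });

const replayed = recorder.replay(recorder.export());
// replayed deep-equals recorder.getState()
```

Because the reducer is pure, the replayed state matches the live state exactly, so a production bug can be stepped through locally event by event.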
Track at least these signals: how often resyncs trigger, how often version gaps are detected, heartbeat timeouts, and state-hash mismatches.
So far, “events” have been treated as if they simply appear. In reality, the frontend receives them over unreliable networks and imperfect protocols. The practical problem is not just how to process events, but how to get them into the system and keep them flowing reliably.
Browsers do not talk directly to Kafka or NATS. They speak HTTP and WebSockets. That means your architecture usually looks like this:
Backend event bus → Gateway → Browser transport → Event pipeline
That gateway layer is not incidental. It is part of the architecture.
There is no universally best transport. There are tradeoffs.
| | Server-Sent Events | WebSockets | GraphQL subscriptions |
|---|---|---|---|
| Characteristics | One-way, HTTP-based, auto-reconnect built in | Full duplex, persistent connection, lower latency | Usually implemented over WebSockets with a GraphQL layer |
| Strengths | Simple mental model, easier to run through HTTP infrastructure, built-in reconnection | Bidirectional communication, flexible protocol design, efficient for frequent updates | Works well in GraphQL-heavy stacks with typed subscription shapes |
| Limitations | No client-to-server messaging, limited backpressure control, text-based | More complex reconnection, harder horizontal scaling, more care needed around connection state | Inherits most of the operational tradeoffs of WebSockets |
| Best fit | Dashboards, notifications, feeds, analytics panels | Chat, collaboration, multiplayer, bidirectional workflows | Teams already committed to GraphQL across the stack |
Use SSE when simplicity and operational ease matter most.
Use WebSockets when the client also needs to send real-time messages back.
Use GraphQL subscriptions when your stack is already GraphQL-centric and the schema layer is part of the value.
Connections will drop. Tabs will sleep. The frontend will miss events. Your design has to assume that.
SSE gives you browser-level retry behavior:
source.onerror = () => {
  // browser retries automatically
};
That helps, but it is not enough. You also need a way to resume from the last known position.
Every event should carry some cursor or version:
{ "type": "ORDER_UPDATED", "id": 1, "version": 42 }
Then reconnect with context:
const lastSeenVersion = getLastVersion();
fetch(`/events?since=${lastSeenVersion}`);
With WebSockets, you have to implement this yourself:
socket.onopen = () => {
  socket.send(JSON.stringify({
    type: "RESUME",
    since: lastSeenVersion
  }));
};
At scale, the frontend can get overwhelmed:
Server → Network → Browser → Event pipeline → UI
When too many events arrive too quickly, the symptoms show up as input lag, UI freezes, and memory spikes.
A few common mitigation strategies:
events$.pipe(bufferTime(100));
This batches updates and reduces render frequency.
filter((event) => event.priority !== "low");
Not every event deserves equal treatment.
events$.pipe(sampleTime(200));
This is useful when only the latest value in a window matters.
Sometimes the correct fix is upstream: reduce event frequency or aggregate before sending.
A minimal end-to-end flow looks like this:
const rawEvents$ = new Subject();
const source = new EventSource("/events");

source.onmessage = (e) => rawEvents$.next(JSON.parse(e.data));

const state$ = rawEvents$.pipe(
  map(normalize),
  filter(valid),
  distinct(byIdAndVersion),
  bufferTime(50),
  scan(applyReducer, initialState)
);

state$.subscribe(setState);
Imagine you are building a dashboard that shows live order activity. New orders arrive, statuses change, and counts update continuously. The first implementation is the obvious one:
setInterval(async () => {
  const res = await fetch("/api/orders");
  const data = await res.json();
  setOrders(data);
}, 5000);
It works well enough at first. Then users start noticing strange behavior. An order briefly appears as completed, then flips back to pending, then settles again. Two users looking at the same dashboard see different counts. Refreshing the page makes the UI “look correct” again.
Nothing crashes. There are no obvious console errors. But the frontend feels unreliable.
The problem is architectural, not cosmetic. Every five seconds, the UI replaces its entire state with a new snapshot. It has no understanding of what changed, in what order, or whether the newest response is actually newer than the last one it applied.
Now imagine the backend exposes a simple event stream:
const source = new EventSource("/events");
The frontend starts receiving messages like:
{ "type": "ORDER_CREATED", "id": 1, "version": 1 }
{ "type": "ORDER_UPDATED", "id": 1, "status": "completed", "version": 2 }
The tempting implementation is to apply them directly:
source.onmessage = (e) => {
  const event = JSON.parse(e.data);
  applyEventDirectlyToState(event);
};
That looks more real-time, but it quietly reintroduces the same problem in a different form. Events can still be duplicated, delayed, or delivered out of order.
Instead, add a thin processing layer:
const events$ = new Subject();

source.onmessage = (e) => {
  events$.next(JSON.parse(e.data));
};
Then process the stream before updating state:
const state$ = events$.pipe(
  map(normalize),
  distinct(byIdAndVersion),
  bufferTime(50),
  scan((state, batch) => {
    return batch.reduce(reducer, state);
  }, initialState)
);

state$.subscribe(setState);
A few important things change here.
Events are normalized before the rest of the system sees them. Duplicates are filtered before they cause damage. Bursts are batched before they trigger excessive renders. Most importantly, state is no longer replaced wholesale. It is derived.
That changes the behavior in the places that matter. Counts stop jumping backward. Lists stop flickering. Two users are more likely to see the same thing at the same time.
Just as important, the system becomes easier to reason about. You can record events, replay them, and understand exactly how the UI arrived at a particular state.
Even this improved system still assumes the frontend receives every event. In reality, connections drop, tabs sleep, gateways restart, and some events are simply never delivered.
When that happens, the frontend drifts again.
This is where many production systems land. Keep the event stream for low-latency updates, but add periodic reconciliation with an authoritative snapshot:
setInterval(async () => {
  const res = await fetch("/api/orders");
  const snapshot = await res.json();
  replaceState(snapshot);
}, 60000);
At first glance, that can look like a step backward. It is not. Polling is no longer the real-time mechanism. It is the correction mechanism.
The system now has two complementary paths: a low-latency event stream and a periodic authoritative snapshot.
The event stream keeps the interface responsive. The snapshot corrects drift. If the frontend misses an event, the next authoritative snapshot repairs the state.
What started as a simple polling loop has become a more robust system: events drive low-latency updates, reducers derive state deterministically, and periodic snapshots correct drift.
The UI code often gets simpler as a result, because less consistency logic leaks into individual components.
Real-time frontends do not break because developers forgot how to fetch data. They break because time was never modeled directly.
Polling hides time behind intervals. Snapshots erase history. State gets overwritten without any notion of causality or ordering. As systems grow, those shortcuts turn into stale views, race conditions, and frontends that users no longer fully trust.
Event-driven patterns improve that situation by making time explicit. Once the UI is modeled as the result of events flowing through a controlled pipeline, state becomes derivable, updates become easier to reason about, and bugs become reproducible.
For many teams, the most practical production model is not pure streaming or pure polling. It is a hybrid: event streams for responsiveness, plus periodic snapshots for correction. That combination gives you a frontend that is not just fast, but consistently trustworthy over time.
