
It's 2:07 AM. The Asian markets have just opened and something is moving - fast. Your on-call engineer's phone lights up. Then the Slack alerts start. Then the CEO's WhatsApp. The WebSocket connections are dropping. Prices are stale. The reconnection storm has begun - 40,000 clients hammering the server simultaneously, each one retrying every two seconds.

By 2:14 AM, your platform is effectively down. By 2:19 AM, the first user tweets about it. By morning, your support queue has 800 tickets.

- A scenario that plays out, somewhere, every quarter.

Variations of this story have played out at trading platforms of every size - from scrappy fintechs to established brokerages - at exactly the moments that matter most: market open, FOMC announcements, earnings surprises, geopolitical shocks. The moments when your users need you most are the moments most likely to expose the hidden fault lines in your streaming infrastructure.

The painful irony? Most of the time, the platform works fine. It handles the Tuesday afternoon session beautifully. The team gains confidence. Architecture reviews get skipped. Technical debt gets deferred. And then - inevitably - something happens that nobody planned for.

This is the first in a three-part series on real-time data infrastructure for trading platforms. We're going deeper than most of these conversations go - past the benchmarks and the marketing copy, into the actual architectural decisions that separate the platforms that hold under pressure from the ones that fold.

Part 1 (you're reading it): Why streaming infrastructure fails - and the pattern nobody talks about.
Part 2: How Tier-1 firms actually architect real-time data layers at scale.
Part 3: The build vs. buy decision - a framework you can use this quarter.

60% - approx. infrastructure cost reduction at eToro
<1 ms - end-to-end latency at ActivTrades (excluding network)
2M - updates/sec at adesso in production

The Architecture That Works - Until It Doesn't

Here's how most trading platforms arrive at their current streaming architecture: pragmatically. A small team needs to show prices updating in real time. Someone suggests WebSockets. It works. The demo looks great. Users grow. The team scales the same architecture with more servers, CPU, and memory.

For a long time, this is fine. WebSockets are genuinely good technology. They're bidirectional, low-latency, and supported in every browser. At 100 users, 1,000 users, even 10,000 users - the cracks may never show.

The cracks show at 40,000. Or when your mobile app users switch from WiFi to cellular mid-session. Or when your enterprise client behind a corporate proxy can't establish a WebSocket connection because their IT department's firewall strips upgrade headers. Or when a brief network hiccup causes 35,000 clients to try to reconnect within the same 10-second window.

"The build that works in staging never fully prepares you for what happens when 40,000 real humans, on 40,000 different network conditions, all need the same price at the same millisecond."

These conditions aren't just edge cases - they're the normal operating environment of a trading platform at scale. And they expose a fundamental truth about DIY streaming infrastructure: it's optimized for the average case, not the critical case.

The Three Fault Lines

After working with trading platforms across the fintech spectrum, three failure modes appear again and again - not as bugs, but as architectural inevitabilities baked into how most teams build streaming from scratch.

Fault Line 1: The Reconnection Storm. When connectivity drops - even briefly - clients don't politely wait to reconnect. They all try at once. A standard WebSocket server under a reconnection storm experiences exponential load at the exact moment it's least equipped to handle it. Without intelligent backoff, rate limiting, and connection queuing built into the streaming layer itself, this becomes a cascading failure.

Technical note
At 40,000 concurrent clients, even a 5-second network interruption - a routine cloud provider blip - can generate 20,000 simultaneous reconnection attempts. A naive WebSocket implementation has no native defense against this. The fix requires custom retry logic, exponential backoff, jitter, and server-side rate limiting. Most teams build a version of this. Few build it well enough.
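To make the defense concrete, here's a minimal sketch of exponential backoff with full jitter - the function name and parameter defaults are illustrative, not taken from any particular library:

```typescript
// Exponential backoff with full jitter (illustrative sketch).
// Each attempt waits a random delay in [0, min(cap, base * 2^attempt)),
// which spreads 40,000 reconnects across the window instead of
// letting them land on the server in one synchronized wave.
function reconnectDelay(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30_000
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling; // "full jitter" variant
}

// Usage sketch: schedule the next reconnect attempt.
function scheduleReconnect(attempt: number, connect: () => void): void {
  setTimeout(connect, reconnectDelay(attempt));
}
```

Note that jitter alone only solves the client half of the problem; the server still needs rate limiting and connection queuing to survive the attempts that do arrive together.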

Fault Line 2: The Bandwidth Trap. Streaming prices for 300 financial instruments to 40,000 users sounds straightforward until you do the math. If each instrument updates every 100ms, that's 10 updates per second per instrument - 3,000 updates per second per user, and 120 million messages per second across the full user base. Without intelligent data conflation - collapsing rapid-fire updates so that only the latest value is delivered - you're either drowning the network or building custom conflation logic on top of the WebSocket layer. Most teams do neither well.
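The core of a conflation layer is deceptively small - a keep-latest buffer per client. Here's an illustrative sketch (class and method names are hypothetical); in production, the hard part is everything around it: per-client drain scheduling, fairness, and snapshot semantics.

```typescript
// Keep-latest conflation buffer (illustrative sketch).
// Rapid-fire updates for the same instrument overwrite each other,
// so a slow client receives only the newest value per key when drained.
class ConflatingBuffer<T> {
  private latest = new Map<string, T>();

  // Record an update; any undelivered older value for this key is dropped.
  push(key: string, value: T): void {
    this.latest.set(key, value);
  }

  // Take everything pending and reset the buffer.
  drain(): Map<string, T> {
    const out = this.latest;
    this.latest = new Map();
    return out;
  }
}
```

A sender loop would call `drain()` at whatever rate each client's connection can sustain - ten ticks for EURUSD arriving between drains collapse into one outgoing message.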

Fault Line 3: The Last-Mile Reality. The infrastructure might be flawless - the AWS setup, the load balancers, the WebSocket clusters all perfectly configured. And then a user tries to access your platform from inside a bank's corporate network. The proxy server strips the Upgrade: websocket header. The connection silently falls back to polling. Corporate proxies, hotel WiFi, strict firewall environments - these aren't exotic edge cases; they're the standard operating environment of your highest-value enterprise clients.

The Invisible Tax on Your Engineering Team

There's a cost to DIY streaming infrastructure that never shows up on an AWS bill. It shows up in sprint velocity. In the proportion of senior engineers' time spent on infrastructure maintenance versus product development. In the number of on-call incidents that wake people up at 2AM for problems that were already solved - by someone else, years ago.

The Pareto problem of DIY streaming: teams typically spend 80% of their streaming-related engineering effort on infrastructure - reconnection logic, bandwidth management, firewall handling, connection pooling - and 20% on the actual product features that users experience. The math should be the other way around.

"Every hour the best engineers spend maintaining WebSocket infrastructure is an hour they're not building the features that will actually generate revenue."

Consider what "building our own WebSocket infrastructure at scale" actually means in practice: custom reconnection and retry logic, server-side rate limiting, proxy detection and HTTP streaming fallback, delta compression and data conflation, mobile-specific connection handling, bandwidth throttling, multi-server session affinity. And then: maintaining all of it, evolving it as your user base grows, and debugging it at 2AM when it fails in a way you didn't anticipate.

What the Firms That Never Go Down Actually Do

Morgan Stanley. UBS. eToro. IG Group. ActivTrades. adesso - which manages 6 million financial instruments for the German market, pushing 500,000 to 600,000 price updates per second under normal conditions, spiking to 2 million per second at peak.

These firms share something: they've solved the streaming problem at the infrastructure layer rather than the application layer. Instead of building bespoke WebSocket infrastructure and patching it every time production reveals a new failure mode, they've deployed a streaming broker specifically designed for internet-scale last-mile delivery. The technology they share is Lightstreamer.

📊 Case study - eToro

eToro is one of the world's largest social trading platforms. Their VP of Engineering, Israel Kalush, is direct about the decision: "Choosing Lightstreamer was a no-brainer for us. We had an extremely positive experience. The product simply works and delivers on its promise." The outcome: a 60% reduction in infrastructure costs and an engineering team focused on product, not plumbing.

- Israel Kalush, VP Engineering - eToro

60% - approx. infra cost reduction
Zero - streaming incidents at peak
Focus - engineering on product, not plumbing
Case study - ActivTrades

ActivTrades operates a multi-platform trading environment where latency is a direct determinant of trading quality. Senior Developer Rosen Mehanov measured it precisely: "Our ActivTrader platform in combination with the Lightstreamer server can process client trading requests in less than 1 millisecond from the moment of receiving the request in the Lightstreamer server, processing it from ActivTrader platform and sending the notification to the client." Less than 1 millisecond. End to end. In production.

- Rosen Mehanov, Senior Developer - ActivTrades

<1 ms - end-to-end latency
Multi - platform support
Prod - real production numbers
📈 Case study - adesso SE

adesso operates at a scale most trading platforms aspire to. Senior Manager Roland Prosek: "On average, adesso has 25,000–30,000 clients connected, and at peak times we serve up to 80,000–100,000 concurrent clients. In the German market, we manage 6 million financial instruments, providing push updates at a rate of 500,000 to 600,000 per second. During peak periods, adesso's infrastructure handles up to 2 million updates per second." Two million updates per second. In production.

- Roland Prosek, Senior Manager - adesso SE

2M - updates/sec at peak
100K - peak concurrent clients
6M - financial instruments

The Intelligent Streaming Difference

Lightstreamer isn't positioned as a WebSocket replacement. It's more accurate to think of it as a streaming broker that makes WebSockets work the way we always hoped they would - and falls back gracefully when they can't.

Intelligent Streaming and automatic protocol fallback. Lightstreamer automatically detects the best available transport for each client - WebSocket where available, HTTP streaming as a fallback, long polling as a last resort. This happens transparently. The corporate proxy silently killed the WebSocket? Lightstreamer routes around it.

Data conflation. When a financial instrument updates faster than a client can consume it, Lightstreamer automatically conflates updates, delivering only the latest value. This alone can reduce bandwidth consumption by 60–80% during high-volatility sessions.

Delta delivery. Rather than transmitting full instrument records on every update, Lightstreamer transmits only the fields that changed. For a stock tick where only the last price and volume changed, this means sending two values instead of thirty. At 500,000 updates per second, this adds up.
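The idea can be sketched as a diff against the last-sent snapshot - note this is an illustration of the general technique, not Lightstreamer's actual wire format:

```typescript
// Delta delivery sketch: compare each update against the previously
// sent snapshot and emit only the fields that changed.
type Fields = Record<string, string | number>;

function diffFields(prev: Fields, next: Fields): Fields {
  const delta: Fields = {};
  for (const key of Object.keys(next)) {
    if (prev[key] !== next[key]) {
      delta[key] = next[key]; // unchanged fields are simply omitted
    }
  }
  return delta;
}
```

For a thirty-field instrument record where only the last price moved, the delta carries one field instead of thirty - the 95%+ reduction that makes 500,000 updates per second tractable on the wire.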

Bandwidth throttling. A mobile user on 4G has different bandwidth constraints than a desktop trader on fiber. Lightstreamer adapts delivery speed per client - slow connections receive real-time data at a rate their connection can actually sustain.
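The underlying mechanism is per-client rate limiting. A minimal token-bucket sketch - illustrative only, not Lightstreamer's actual implementation - looks like this:

```typescript
// Per-client token bucket (illustrative sketch).
// Updates drain no faster than `ratePerSec`, with short bursts
// allowed up to `burst`; throttled updates would be conflated
// upstream rather than queued indefinitely.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private ratePerSec: number,
    private burst: number,
    now: number = 0
  ) {
    this.tokens = burst;
    this.lastRefill = now;
  }

  // Returns true if an update may be sent at time `now` (in seconds).
  trySend(now: number): boolean {
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.burst, this.tokens + elapsed * this.ratePerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Pairing a bucket like this with the conflation buffer is what turns "the 4G user gets throttled" into "the 4G user gets fewer, fresher updates" instead of a growing backlog of stale ones.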

Technical note
Lightstreamer connects natively to Kafka via the Kafka Connector, to MQTT brokers via the MQTT Connector, and to IBM MQ, ActiveMQ, and other JMS providers via the JMS Connector. If your data pipeline already uses Kafka or MQTT, Lightstreamer slots in as the last-mile delivery layer without requiring a re-architecture of your upstream data infrastructure.

The firms in this article don't worry about 2AM incidents.

Not because they're lucky. Not because they have bigger teams. Because they made one architectural decision - early enough - that moved streaming reliability from a thing they maintain to a thing they depend on.

In Part 2, we go inside the architecture. How does adesso actually handle 2 million updates per second? What does the Lightstreamer–Kafka integration look like in a real trading platform? And what's the answer to the question every CTO eventually asks: could we just build this ourselves?

Is your streaming architecture ready for peak load?

DP5 offers a free architecture consultation for trading platform engineering teams.

Schedule Call for Free Consultation →
Finance Series - All Parts
01
It's 2AM. The Platform Is Down. The Market Doesn't Care. ← you are here
Apr 7, 2026
02
How Tier-1 Firms Architect Real-Time Data Layers
Apr 21, 2026 · read →
03
Build vs. Buy: The Framework Your Engineering Team Needs
May 1, 2026 · coming soon