
In Part 1 we covered the three fault lines that take down trading platforms at scale: the reconnection storm, the bandwidth trap, and the last-mile reality. The firms that avoid 2AM incidents have solved these at the infrastructure layer, not the application layer. Read Part 1 →

In Part 2 we turn to the solution to the last-mile data streaming problem. We will walk through the architecture that Tier-1 trading firms run in production - the stack, the decisions, the trade-offs, and the specific Lightstreamer capabilities that make each layer work at scale.

We'll also touch on the question that every engineering leader asks at some point: could we just build this ourselves, or does it make more sense to buy?

The Stack, Layer by Layer

Most Tier-1 trading platforms share a common architectural pattern that has emerged over years of production experience. It's not the pattern they started with; they arrived at it after enough 2AM incidents to understand what goes wrong and how best to handle it.

The architecture has four layers, each with a specific task. The key insight is that Kafka and Lightstreamer are not competitors. They solve different problems and should be evaluated as complements.

DATA SOURCES
- Market feed (FIX/FAST)
- Internal risk engine
- Order management system

KAFKA LAYER - existing infra, unchanged
- Kafka topics per instrument
- Partitioned by asset class
- Retained for replay / reconciliation

LIGHTSTREAMER LAYER - the layer that changes everything
- Kafka Connector - subscribes to topics
- Intelligent Streaming - per-client transport negotiation
- Data conflation - only latest tick delivered
- Delta delivery - changed fields only
- Bandwidth throttling - adapts per connection

CLIENTS
- Web (WebSocket / HTTP streaming / long-poll)
- iOS & Android (native SDKs)
- Desktop terminals
- Algo trading APIs

Kafka handles the upstream problem: high-throughput, fault-tolerant message ingestion from market feeds, risk engines, and order management systems. It does this exceptionally well. What Kafka was not designed to do is deliver individual ticks to 80,000 concurrent browser and mobile clients with varying network conditions, adaptive bandwidth throttling, and transparent firewall traversal.

That's the last-mile problem. And that's precisely what Lightstreamer is built for.

"Kafka gets the data to the building. Lightstreamer gets it to the desk - on every floor, through every wall, at whatever speed each connection can sustain."

The Kafka–Lightstreamer Integration

The Lightstreamer Kafka Connector is the bridge between these two layers. It subscribes to Kafka topics and streams updates to clients in real time - handling all of the last-mile complexity that Kafka deliberately leaves out of scope.

How it works in production

The connector subscribes to Kafka topics on behalf of all connected clients. When a new message arrives on a topic - say, a price update for AAPL - the connector determines which clients are subscribed to that instrument and delivers the update to each of them, applying conflation and delta delivery along the way.

Technical note
A single Lightstreamer server instance maintains subscriptions to Kafka topics centrally, rather than each client connection opening its own Kafka consumer. This means the Kafka cluster sees a constant, predictable consumer count regardless of how many clients are connected - 100 or 100,000. The fan-out from Kafka message to client delivery happens within Lightstreamer, not within Kafka.
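As an illustration of this fan-out pattern, here is a minimal Python sketch - not Lightstreamer's actual implementation, and the class and method names are invented for illustration. A single central consumer callback dispatches each update only to the clients subscribed to that instrument, so the upstream broker never sees per-client consumers:

```python
from collections import defaultdict
from queue import Queue

class FanOutHub:
    """Illustrative sketch: one central consumer fans out to many clients.

    The Kafka cluster would see a single consumer regardless of client
    count; per-client delivery happens entirely inside this process.
    """

    def __init__(self):
        # instrument -> set of per-client delivery queues
        self.subscribers = defaultdict(set)

    def subscribe(self, client_queue: Queue, instrument: str) -> None:
        self.subscribers[instrument].add(client_queue)

    def unsubscribe(self, client_queue: Queue, instrument: str) -> None:
        self.subscribers[instrument].discard(client_queue)

    def on_kafka_message(self, instrument: str, update: dict) -> None:
        # Only clients subscribed to this instrument receive the update.
        for q in self.subscribers[instrument]:
            q.put((instrument, update))

# Usage: two clients, only one subscribed to AAPL
hub = FanOutHub()
alice, bob = Queue(), Queue()
hub.subscribe(alice, "AAPL")
hub.subscribe(bob, "MSFT")
hub.on_kafka_message("AAPL", {"last": 189.42})
assert alice.qsize() == 1 and bob.qsize() == 0
```

The design point the sketch captures: adding a client adds an entry to an in-process registry, not a consumer on the Kafka cluster.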

This distinction matters at scale. At adesso's production numbers - 6 million instruments, 500,000–600,000 updates per second - each update needs to reach only the clients subscribed to that specific instrument. Without intelligent fan-out and conflation, we would be pushing the full update volume to every client and overwhelming the network.

Conflation in practice

Consider a volatile session. AAPL is updating 50 times per second. A client on a mobile connection can realistically consume 10 updates per second before their experience degrades. Without conflation, 80% of the updates we send arrive too late to matter - while still consuming the client's battery and data plan.

With Lightstreamer's conflation, the server tracks the latest value for each subscribed item and delivers at the rate the client can sustain. The client always sees the most current price. They never see a stale value. They never receive redundant intermediate ticks. The network sees a fraction of the raw update volume.
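The core idea can be illustrated with a small sketch - assuming a simple keep-latest policy, and not representing Lightstreamer's internals. Between deliveries, only the most recent value per item is retained; everything older is silently dropped:

```python
class ConflatingBuffer:
    """Illustrative conflation sketch (not Lightstreamer internals):
    between deliveries, only the latest value per item is retained."""

    def __init__(self):
        self.pending = {}  # item -> latest update

    def on_update(self, item: str, update: dict) -> None:
        # Overwrite: intermediate ticks for the same item are dropped.
        self.pending[item] = update

    def drain(self) -> dict:
        """Called at the rate the client can sustain (e.g. 10x/sec)."""
        batch, self.pending = self.pending, {}
        return batch

buf = ConflatingBuffer()
# 50 raw AAPL ticks arrive between two deliveries...
for seq in range(50):
    buf.on_update("AAPL", {"seq": seq})
# ...but the client receives exactly one, and it is the most recent.
batch = buf.drain()
assert len(batch) == 1
assert batch["AAPL"]["seq"] == 49
```

Note that the client's view is never stale: the dropped ticks were superseded before they could have been delivered anyway.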

Technical note
Delta delivery compounds this further. If AAPL's last tick changed only the last price and volume fields - not the bid, ask, open, high, low, or any of the other fields in the instrument record - Lightstreamer sends only those two changed values. At adesso's scale, this reduces per-message payload size by 70-85% on average, directly translating to lower bandwidth costs and higher concurrency per server.
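To make the idea concrete, a delta computation might look like the following sketch. The field names are illustrative and this is not Lightstreamer's wire protocol - just the general changed-fields-only technique:

```python
def delta(previous: dict, current: dict) -> dict:
    """Illustrative delta-delivery sketch: return only the fields whose
    values changed since the last tick."""
    return {k: v for k, v in current.items() if previous.get(k) != v}

# A full instrument record has many fields...
prev = {"last": 189.40, "bid": 189.39, "ask": 189.41,
        "open": 188.00, "high": 189.80, "low": 187.90, "volume": 1_200_000}
# ...but this tick changed only last price and volume.
curr = dict(prev, last=189.42, volume=1_200_350)

# Only the two changed fields go over the wire.
assert delta(prev, curr) == {"last": 189.42, "volume": 1_200_350}
```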

The Case Studies, Revisited

In Part 1 we mentioned the production numbers. Here we want to put them in architectural context - because the numbers only make sense when we understand the architectural decisions that led to the outcome.

📈Case study - adesso SE - 2M updates/sec

adesso manages the German financial market's real-time price distribution infrastructure. The architecture follows the pattern above exactly: market feeds flow into Kafka, Lightstreamer consumes and fans out to clients. Senior Manager Roland Prosek: "In the German market, we manage 6 million financial instruments, providing push updates at a rate of 500,000 to 600,000 per second. During peak periods, adesso's infrastructure handles up to 2 million updates per second, with each quote being published to the Lightstreamer adapters, ready to be pushed to clients." The key word is 'ready to be pushed' - conflation and delta delivery mean the actual bytes transmitted to clients are a small fraction of the raw update volume.

- Roland Prosek, Senior Manager - adesso SE

2M - Updates/sec at peak
100K - Peak concurrent clients
6M - Instruments managed
🔗Case study - IG Group - multi-platform price distribution

IG Group operates one of the world's largest CFD and spread betting platforms, serving clients across web, mobile, and desktop terminals simultaneously. Their former Head of IT Development, Ivan Gowan, describes the challenge: "We provide our clients with a very broad range of platforms from which to trade. We have worked closely with Lightstreamer to ensure that our clients receive fast, reliable price updates across all our platforms." The architectural challenge here is not just speed - it's consistency. When the same price needs to reach a desktop trader, a mobile app, and an API client simultaneously, the last-mile delivery layer needs to handle three fundamentally different connection environments without any of them seeing a stale or inconsistent price.

- Ivan Gowan, former Head of IT Development - IG Group

Multiple platforms simultaneously
Consistent price delivery
Prod-grade reliability

Could We Just Build This Ourselves?

It's a choice every team has to make. We must estimate not only what it costs to build, but also the ongoing maintenance - including the unknowns that only surface over time. So make realistic estimates for 18 months down the line and beyond before committing. Many of these lessons have already been learned and fixed by Lightstreamer: do we really want to re-learn them ourselves, and at what cost?

Capability | DIY WebSocket | Lightstreamer
Reconnection handling | Custom - we own every bug | Built in - intelligent backoff, queuing
Firewall / proxy traversal | Manual fallback code required | Automatic protocol negotiation
Data conflation | Build it ourselves or skip it | Native - only latest value delivered
Delta delivery | Full records on every tick | Changed fields only
Mobile SDK | Generic WebSocket - we adapt | Native iOS, Android SDKs
Kafka integration | Custom consumer + fan-out logic | Kafka Connector - drop-in
Bandwidth throttling | All clients get same rate | Per-client adaptive delivery
On-call incidents | 2AM is ours to own | Lightstreamer handles the edge cases

The table above isn't a knock on DIY; it's an objective comparison. Every item in the middle column is solvable, and smart teams have solved all of them. The question is not whether we can build it - it's whether our best engineers should spend the next 18 months building it and then maintain it forever, and what that does to the product feature velocity that actually generates revenue.

eToro's 60% infrastructure cost reduction wasn't just about server bills. It was about the engineering hours that stopped going into infrastructure maintenance and started going into product. That's the real ROI of the decision.

"DIY often turns out to be more expensive than it appears at the beginning - and the cost compounds every quarter."

What "Production-Grade" Actually Requires

There's a gap between "WebSocket infrastructure that works" and "WebSocket infrastructure that works at 2AM during a market dislocation." The gap is filled with edge cases. These edge cases follow distinct patterns, which have been thoroughly analyzed and embedded into Lightstreamer.

Network heterogeneity. Not all of our users are on fiber. A meaningful percentage are on mobile networks, corporate proxies, hotel WiFi, and VPNs. Each of these environments breaks standard WebSocket connections in different ways. Handling them gracefully requires protocol fallback logic that works silently and correctly every time - not 99% of the time.
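The fallback idea - try the richest transport first, degrade silently until one works - can be sketched as follows. The transport names match those in the architecture diagram, but the negotiation logic here is a generic illustration, not Lightstreamer's actual algorithm:

```python
from typing import Callable

def negotiate_transport(probe: Callable[[str], bool]) -> str:
    """Illustrative fallback chain: try the richest transport first,
    degrade silently until one works. `probe` stands in for a real
    connection attempt against the current network environment."""
    for transport in ("websocket", "http-streaming", "long-polling"):
        if probe(transport):
            return transport
    raise ConnectionError("no transport available")

# A corporate proxy that strips WebSocket upgrades but passes plain HTTP:
blocked = {"websocket"}
assert negotiate_transport(lambda t: t not in blocked) == "http-streaming"
```

The hard part in production is not this loop - it's making each probe fast, making the downgrade invisible to the application, and re-probing for upgrades when conditions improve.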

Correlated failures. The worst streaming failures are correlated and triggered by market events that cause every user to be active simultaneously. The reconnection storm from Part 1 is a correlated failure. A platform that handles 10,000 concurrent normal connections may fall over under 10,000 simultaneous reconnections, because the load shape is entirely different. Production-grade infrastructure accounts for the worst-case load shape, not the average.
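One standard defense against correlated reconnections is exponential backoff with "full jitter" - randomizing each client's retry delay so a synchronized disconnect does not become a synchronized reconnect. The sketch below shows the general technique, not any particular vendor's implementation:

```python
import random

def backoff_with_jitter(attempt: int, base: float = 0.5,
                        cap: float = 30.0) -> float:
    """'Full jitter' backoff: pick a random delay in [0, min(cap,
    base * 2^attempt)], spreading correlated retries out in time."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

# 10,000 clients disconnected by the same market event no longer
# retry in lockstep:
delays = [backoff_with_jitter(attempt=3) for _ in range(10_000)]
assert all(0 <= d <= 4.0 for d in delays)        # capped at base * 2^3
assert len({round(d, 3) for d in delays}) > 100  # retries are spread out
```

This flattens the worst-case load shape: instead of 10,000 handshakes in the same second, the server sees them smeared across the backoff window.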

Operational visibility. When something goes wrong at 2AM, we need to know what went wrong, why, and what to do about it - in minutes, not hours. DIY streaming infrastructure typically has minimal built-in observability. Lightstreamer ships with monitoring dashboards and diagnostic tooling built into the deployment. That's not a feature we want to build from scratch.

Technical note
Lightstreamer's on-premise deployment option is relevant here for another reason: it means your monitoring, your data residency, and your incident response all live inside your own infrastructure perimeter. For firms operating under MiFID II, Regulation SCI, or other financial services compliance frameworks, this is often a non-negotiable requirement - and one that many cloud-only streaming solutions struggle to meet.

Now that we have discussed the architecture, the next question is the actual build-vs-buy decision.

We've seen how the stack is assembled. We've seen the capability comparison. We have a sense of what the DIY path actually costs over time. But every engineering team's situation is different - different team size, different update frequency, different concurrency targets, different compliance constraints.

In Part 3, we will build a scoring framework that you can run against your own architecture and capabilities. We will explore three scenarios - startup, growth-stage, and enterprise - with objective, measurable criteria to help decide which path makes sense at each stage. The answer may not always be Lightstreamer, but the framework will guide you to an informed decision.

Want to see this architecture applied to your platform?

DP5 offers a free architecture consultation for trading platform engineering teams.

Schedule Call for Free Consultation →
Finance Series - All Parts
01 - It's 2AM. The Platform Is Down.
02 - How Tier-1 Firms Architect Real-Time Data Layers (you are here) - Apr 21, 2026
03 - Build vs. Buy: The Framework Your Engineering Team Needs - May 1, 2026, coming soon
← Part 01
It's 2AM. The Platform Is Down.