Market Data in the Cloud
January 18, 2023 | DiffusionData
By Riaz Mohammed, CTO DiffusionData
Overview and basic concepts
Market data ecosystems are typically hosted on premise.
Market Data is used for real-time trading, risk analysis, analytics and more.
This means maintaining best-fit on-prem infrastructure. This is costly with constant upgrades to meet the increasing traffic and delivery requirements.
Need for Market Data in the Cloud
Market data distribution network – The participants
- Exchanges and trading venues, who produce market data based on trading activity.
- Market data service providers and vendors, who aggregate and normalize market data from multiple sources, and provide value added services like analytics.
- Brokerage firms, security dealers and investors who consume the data.
All participants are required to maintain dedicated infrastructure for Market data management.
This is costly with constant upgrades to meet the increasing traffic and delivery requirements.
The end consumers of market data, such as brokerage firms, must process market data in phases, which increases the complexity and TCO of market data platforms for all participants, including the end consumers.
Market Data in the Cloud Challenges
There are several challenges when successfully delivering market data in the cloud, and we will explore some of the key ones.
- IP multicast: Traditional on-premise based market data networks rely on IP multicast for optimized delivery to various consumers, which is not natively supported in the cloud.
- Scaling: The common practice when using cloud infrastructure is to scale horizontally. However, this creates complexity if ordering of market data is to be maintained.
- Native messaging: Cloud-native messaging products are primarily built for guaranteed delivery, which makes them too slow for market data. Market data requires at-most-once delivery, where a stale update is dropped rather than retried.
- Cost: If not architected well, the cost can spiral out of control. Data distribution / bandwidth cost, which is negligible in on-premise implementations, can be a major cost in the Cloud, especially when delivering data to multiple consumers in different locations. Resiliency requires that the solution is available across multiple cloud regions, which in turn can also drive costs up.
- Multi-cloud: Most major enterprises want a cloud-agnostic solution so that they are not tied to a single provider, and this is also mandated by regulation. Selecting a technology stack that is portable across different cloud providers adds to the complexity, given that most cloud vendor solutions do not support multi-cloud.
- On-prem cloud integration: In most cases availability of data needs to be consistent across on-prem and cloud. This requires efficient site-to-site replication across cloud and on-premise environments.
- Entitlements: An extensive entitlement system is required, which can monitor and control access to market data depending on the type of data and user-level permission including the consuming application or device.
- Filtering & personalization: Not all recipients require all data, nor do they all need it in real time. Different types of customization can therefore be applied to simplify consumption of market data for consumer applications and to reduce the amount of data transmitted over the network. This is also vital to providing a zero-footprint solution to consumers.
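To make the filtering and personalization point concrete, here is a minimal sketch of a per-subscriber filter that drops symbols the subscriber has not asked for and conflates updates so that at most one tick per symbol is delivered per interval. The names `Tick`, `ConflatingFilter`, `offer` and `flush` are hypothetical and chosen for illustration; this is not Diffusion's API, only one way the idea could be expressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tick:
    symbol: str
    price: float
    ts: float  # timestamp in seconds


class ConflatingFilter:
    """Hypothetical per-subscriber filter: keeps only subscribed symbols and
    delivers at most one tick per symbol per `min_interval` seconds."""

    def __init__(self, symbols, min_interval):
        self.symbols = set(symbols)
        self.min_interval = min_interval
        self._last_sent = {}  # symbol -> time of the last delivered tick
        self._pending = {}    # symbol -> latest tick held back

    def offer(self, tick):
        """Return the tick if it should be delivered now, otherwise None."""
        if tick.symbol not in self.symbols:
            return None  # filtered out: subscriber did not ask for this symbol
        last = self._last_sent.get(tick.symbol)
        if last is None or tick.ts - last >= self.min_interval:
            self._last_sent[tick.symbol] = tick.ts
            self._pending.pop(tick.symbol, None)
            return tick
        # Conflate: a newer tick replaces any older one still waiting.
        self._pending[tick.symbol] = tick
        return None

    def flush(self, now):
        """Return held-back ticks whose conflation interval has elapsed."""
        due = [t for s, t in self._pending.items()
               if now - self._last_sent[s] >= self.min_interval]
        for t in due:
            self._last_sent[t.symbol] = now
            del self._pending[t.symbol]
        return due
```

A subscriber that only needs one update per second for a handful of instruments would be given `ConflatingFilter(symbols, 1.0)`; intermediate ticks are overwritten rather than queued, which matches the at-most-once nature of market data described above.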
Introducing Diffusion Intelligent data platform
Diffusion is an advanced, cloud-ready pub-sub platform for internet-scale, hyper-personalized messaging, along with a range of additional features.
Overcoming Challenges with Diffusion
- Diffusion maintains a TCP connection with client applications via WebSockets, ensuring full visibility and access control of market data.
- Market data is structured data, and price details are the main change between messages. This allows Diffusion’s delta compression to reduce bandwidth usage by 70% or more, as it only sends the binary difference between messages for the same instrument.
- This helps to deliver market data to tens of thousands of consumers without incurring huge bandwidth costs.
- Diffusion provides solutions for all participants of the market data network.
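The idea behind sending only the binary difference between successive messages for an instrument can be illustrated with a deliberately simple scheme: strip the bytes that the old and new messages share at the start and end, and send only the middle. This is a sketch under that assumption and not Diffusion's actual delta algorithm; `make_delta` and `apply_delta` are hypothetical names.

```python
import json


def make_delta(old: bytes, new: bytes):
    """Return (prefix_len, suffix_len, middle): the part of `new` that remains
    after removing the prefix and suffix it shares with `old`."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    # The suffix must not overlap the prefix in either message.
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1
    return prefix, suffix, new[prefix:len(new) - suffix]


def apply_delta(old: bytes, delta) -> bytes:
    """Rebuild the new message from the previous one plus the delta."""
    prefix, suffix, middle = delta
    return old[:prefix] + middle + old[len(old) - suffix:]


# Two consecutive quotes for the same instrument: only the prices change.
prev = json.dumps({"symbol": "VOD.L", "bid": 101.25, "ask": 101.27,
                   "venue": "XLON"}).encode()
curr = json.dumps({"symbol": "VOD.L", "bid": 101.26, "ask": 101.28,
                   "venue": "XLON"}).encode()

delta = make_delta(prev, curr)
rebuilt = apply_delta(prev, delta)  # the receiver already holds `prev`
```

Because the receiver keeps the previous value per instrument, only `delta` crosses the network. The saving depends on how much of each message is unchanged, which for quote updates is typically most of it.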
Diffusion Zero-footprint consumption of Market Data
There is no need to host infrastructure to process market data feeds. Instead, Diffusion services running in the cloud filter and adapt market data for direct consumption by the end application or user.
This is a Diffusion architecture designed to meet the demands of efficient and rapid scalability of servers, where data volumes may be both large and unpredictable.
Automatic scale up and down
- Diffusion servers are automatically brought into active service when required and released when no longer needed.
- Scaling decisions are based on your own metrics (message rates, CPU load, number of connections).
Rapid response to demand
- Active servers can be brought online in seconds so there is always capacity to manage unexpected loads.
- Add servers indefinitely to handle large volumes of data ingress.
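A scaling decision driven by the metrics named above (message rates, CPU load, number of connections) could be sketched as follows. The function takes the largest server count demanded by any single metric, so the cluster scales up when any one of them is under pressure and scales down only when all of them allow it. The types, field names and thresholds are hypothetical and serve only to illustrate the decision; they are not Diffusion configuration.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterMetrics:
    messages_per_sec: float  # total across the cluster
    cpu_load: float          # average utilisation per server, 0.0 to 1.0
    connections: int         # total client connections


@dataclass
class PerServerTarget:
    messages_per_sec: float  # rate one server is expected to handle
    cpu_load: float          # utilisation to aim for per server, 0.0 to 1.0
    connections: int         # connections one server is expected to hold


def desired_servers(metrics, target, current, minimum=1):
    """Return the server count that satisfies every metric at once."""
    return max(
        math.ceil(metrics.messages_per_sec / target.messages_per_sec),
        math.ceil(metrics.connections / target.connections),
        # Total CPU demand is (current servers x average load); divide by the
        # target load to find how many servers would bring it back in range.
        math.ceil(current * metrics.cpu_load / target.cpu_load),
        minimum,
    )


target = PerServerTarget(messages_per_sec=50_000, cpu_load=0.5,
                         connections=10_000)
busy = ClusterMetrics(messages_per_sec=250_000, cpu_load=0.75,
                      connections=30_000)
quiet = ClusterMetrics(messages_per_sec=40_000, cpu_load=0.125,
                       connections=5_000)

scale_up_to = desired_servers(busy, target, current=4)
scale_down_to = desired_servers(quiet, target, current=4, minimum=2)
```

In practice a controller would call such a function periodically and add or release servers to close the gap between the current and desired counts, usually with a cool-down period to avoid oscillation.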