Spring World 2018

Conference & Exhibit

Attend The #1 BC/DR Event!

Spring Journal

Volume 31, Issue 1

Full Contents Now Available!

Thursday, 21 December 2017 16:01

Part 1: Best Practices for Managing Enterprise Data Streams

Before big data and fast data, the challenge of data movement was simple: move fields from fairly static databases to an appropriate home in a data warehouse, or move data between databases and apps in a standardized fashion. The process resembled a factory assembly line.

In contrast, the emerging world is many-to-many, with streaming, batch or micro-batched data coming from numerous sources and being consumed by multiple applications. Big data processing operations are more like a city traffic grid — a network of shared resources — than the linear path taken by traditional data. In addition, the sources and applications are controlled by separate parties, perhaps even third parties. So when the schema or semantics inevitably change — something known as data drift — it can wreak havoc with downstream analysis.

Because modern data is so dynamic, dealing with data in motion is not just a design-time problem for developers, but also a run-time problem requiring an operational perspective that must be managed day to day and evolve over time. In this new world, organizations must architect for change and continually monitor and tune the performance of their data movement system.