Stream and Batch Processing Convergence in Apache Flink

The idea of executing streaming and batch jobs with one engine has been around for a while. People often say that batch is a special case of streaming. Conceptually, it is. In practice, however, there are many gaps between streaming and batch processing in resource management, scheduling, failure recovery, aggregation, shuffling, etc. Apache Flink has gone through a long journey to address all of these challenges and has become a leading convergence engine. This talk will introduce these challenges as well as the way Flink tackles them.


From the same track

Session

Beyond Durability: Enhancing Database Resilience and Reducing the Entropy Using Write-Ahead Logging at Netflix

In modern database systems, durability guarantees are crucial but often insufficient in scenarios involving extended system outages or data corruption.

Prudhviraj Karumanchi

Staff Software Engineer at Data Platform @Netflix

Vidhya Arvind

Staff Software Engineer @Netflix

Session

OpenSearch Cluster Topologies for Cost-Saving Autoscaling

The indexing rates of many clusters follow some sort of fluctuating pattern, be it day/night, weekday/weekend, or any other duality in which the cluster alternates between being active and less active. In these cases, how does one scale the cluster?

Amitai Stern

OpenSearch PMC, Managing Observability Data Storage of Petabyte Scale @Logz.io