The idea of executing streaming and batch jobs with one engine has been around for a while. People often say that batch is a special case of streaming. Conceptually, it is. In practice, however, there are many gaps between streaming and batch processing in resource management, scheduling, failure recovery, aggregation, shuffling, and more. Apache Flink has gone through a long journey to address these challenges and has become a leading convergence engine. This talk will introduce these challenges and the ways Flink tackles them.
Speaker
Jiangjie (Becket) Qin
Principal Staff Software Engineer @LinkedIn, Data Infra Engineer, PMC Member of Apache Kafka & Apache Flink, Previously @Alibaba and @IBM
Becket is currently a Principal Staff Software Engineer at LinkedIn. He started working on Apache Kafka at LinkedIn after graduating from Carnegie Mellon University. He then joined Alibaba, where he led the Flink team, focusing on Flink SQL, PyFlink, Flink ML, and connectors, among other areas. He returned to LinkedIn in 2022 to drive the stream and batch unification effort.
Becket is a PMC member of Apache Kafka and Apache Flink.