Scalable Stream Processing - Spark Streaming and Flink

Stateful update trace:
Input: key = b, value = Some(1), state = 1 • Output: key = b, sum = 2

Structured Streaming
▶ Treat a live data stream as a table that is being continuously appended.
▶ The update output mode works for output sinks that can be updated in place, such as a MySQL table.

Structured Streaming Example (1/3)
▶ Assume we receive (id, time, action) events from a mobile app.
▶ We store the result in MySQL.
[https://databricks.com/blog/2016/07/28/structured-streaming-in-apache-spark.html]

Structured Streaming Example (2/3)
▶ We could express it as the following SQL query (a sketch of the query follows below).
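The trace above matches the running-sum pattern of Spark Streaming's mapWithState operator. A minimal sketch, assuming a local socket source on port 9999 (hypothetical) that emits words turned into (key, 1) pairs:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

object RunningSum {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("RunningSum")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint("/tmp/checkpoint") // mapWithState requires checkpointing

    // Hypothetical input source: words from a socket, mapped to (key, 1) pairs
    val pairs = ssc.socketTextStream("localhost", 9999).map(word => (word, 1))

    // State update: add the new value to the running sum kept in state.
    // E.g. key = b, value = Some(1), state = 1  ->  output (b, 2)
    val mappingFunc = (key: String, value: Option[Int], state: State[Int]) => {
      val sum = value.getOrElse(0) + state.getOption.getOrElse(0)
      state.update(sum)
      (key, sum)
    }

    val sums = pairs.mapWithState(StateSpec.function(mappingFunc))
    sums.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```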
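The slide is cut off before the query itself. Below is a sketch of the kind of query the referenced Databricks post builds: hourly counts per action over the unbounded events table, pushed to MySQL in update mode. The input path, schema handling, table name, and JDBC connection details are all assumptions:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object MobileEvents {
  // Each micro-batch of changed rows is written to MySQL over JDBC
  // (connection details are placeholders).
  def writeToMySql(batch: DataFrame, batchId: Long): Unit =
    batch.write
      .format("jdbc")
      .option("url", "jdbc:mysql://localhost/stats")
      .option("dbtable", "action_counts")
      .option("user", "app").option("password", "secret")
      .mode("overwrite")
      .save()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("MobileEvents").getOrCreate()

    // Hypothetical streaming source of (id, time, action) JSON events
    val events = spark.readStream
      .format("json")
      .schema("id STRING, time TIMESTAMP, action STRING")
      .load("/data/events")

    // The stream is just a continuously growing table, so we can query it
    // with plain SQL: hourly count of events per action type.
    events.createOrReplaceTempView("events")
    val counts = spark.sql(
      """SELECT action, WINDOW(time, '1 hour') AS hour, COUNT(*) AS count
        |FROM events
        |GROUP BY action, WINDOW(time, '1 hour')""".stripMargin)

    // Update mode emits only the rows that changed since the last trigger,
    // which suits a sink that can be updated in place.
    val query = counts.writeStream
      .outputMode("update")
      .foreachBatch(writeToMySql _)
      .start()

    query.awaitTermination()
  }
}
```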
Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics, Spring 2020

Once a message is acknowledged by the subscriber, it is removed from the subscription's queue of messages.

Log-structured brokers
Logs as message brokers
• In typical message brokers, once a message is consumed it is deleted; a log-structured broker instead keeps every message in an append-only log, as sketched below.
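To make the contrast concrete, here is a toy in-memory sketch of the log-structured model (not any real broker's API): the broker only appends, and each consumer tracks its own read offset, so consuming never deletes anything.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy log-structured broker: one append-only log.
class Log[A] {
  private val records = ArrayBuffer.empty[A]

  /** Append a record and return its offset (its position in the log). */
  def append(record: A): Long = synchronized {
    records += record
    records.size - 1
  }

  /** Read up to `max` records starting at `offset`; the log is unchanged. */
  def read(offset: Long, max: Int): Seq[A] = synchronized {
    records.slice(offset.toInt, offset.toInt + max).toSeq
  }
}

// A consumer is nothing more than a cursor into the log.
class Consumer[A](log: Log[A], var offset: Long = 0L) {
  def poll(max: Int = 10): Seq[A] = {
    val batch = log.read(offset, max)
    offset += batch.size // "acknowledging" = advancing the cursor
    batch
  }
}

object Demo extends App {
  val log = new Log[String]
  Seq("a", "b", "c").foreach(log.append)

  val c1 = new Consumer(log)
  val c2 = new Consumer(log)
  println(c1.poll()) // List(a, b, c)
  println(c2.poll()) // List(a, b, c) -- still there: reads are non-destructive
}
```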
Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics, Spring 2020

Each partition is an ordered, immutable sequence of records that is continually appended to: a structured commit log. An offset is a sequential id number assigned to records within a partition; it uniquely identifies each record within the partition.
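A minimal consumer using the standard kafka-clients API makes the partition/offset pair visible. It assumes a broker on localhost:9092 and a topic named "events" (both hypothetical):

```scala
import java.time.Duration
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import scala.jdk.CollectionConverters._

object PartitionOffsetDemo extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092") // assumed local broker
  props.put("group.id", "cs591-demo")              // hypothetical group id
  props.put("key.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("auto.offset.reset", "earliest") // start from the log's beginning

  val consumer = new KafkaConsumer[String, String](props)
  consumer.subscribe(Collections.singletonList("events")) // hypothetical topic

  // (topic, partition, offset) uniquely identifies every record: the
  // partition is an append-only log, and the offset is the record's
  // position within that log.
  while (true) {
    for (record <- consumer.poll(Duration.ofMillis(500)).asScala)
      println(s"partition=${record.partition} offset=${record.offset} " +
              s"key=${record.key} value=${record.value}")
  }
}
```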