Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020JobManager disappears. • Flink relies on Apache ZooKeeper for high-availability • coordination and consensus services, e.g. leader election • The JobManager writes the JobGraph and all required metadata | Boston University 2020 • Ensure result correctness • reconfiguration mechanism often relies on fault-tolerance mechanism • State re-partitioning and migration • minimize communication • keep correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 12 • Detect environment changes: external workload and system0 码力 | 41 页 | 4.09 MB | 1 年前3
Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020throughput ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 3 • Detect environment changes: external workload and system and configuration commands that cannot yet be applied Live state migration Can we apply this mechanism in Flink? ??? Vasiliki Kalavri | Boston University 2020 Lecture references • Vasiliki Kalavri0 码力 | 93 页 | 2.42 MB | 1 年前3
Scalable Stream Processing - Spark Streaming and Flinkis an alternative to updateStateByKeys: • Update function (partial updates) • Built in timeout mechanism • Choose the return type • Initial state def mapWithState[StateType, MappedType](spec: StateSpec[K is an alternative to updateStateByKeys: • Update function (partial updates) • Built in timeout mechanism • Choose the return type • Initial state def mapWithState[StateType, MappedType](spec: StateSpec[K0 码力 | 113 页 | 1.22 MB | 1 年前3
High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020Fault-tolerance trade-offs 12 Steady-state overhead • How is performance affected by the fault-tolerance mechanism under normal, failure- free operation? • How much memory or disk space is required to maintain0 码力 | 49 页 | 2.08 MB | 1 年前3
Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020??? Vasiliki Kalavri | Boston University 2020 Remarks on buffer-based rate control • Simple mechanism:the buffer occupancy controls the data rate automatically. • The maximum throughput is limited0 码力 | 43 页 | 2.42 MB | 1 年前3
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020Keep intermediate state in memory • Use Spark's RDDs instead of replication • Parallel recovery mechanism in case of failures 44 input stream time-based micro-batches D-Streams • During an interval0 码力 | 54 页 | 2.83 MB | 1 年前3
Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020Kalavri | Boston University 2020 End-to-end exactly once • Flink’s checkpointing and recovery mechanism only resets the internal state of a streaming application • Some result records might be emitted Kalavri | Boston University 2020 End-to-end exactly once • Flink’s checkpointing and recovery mechanism only resets the internal state of a streaming application • Some result records might be emitted0 码力 | 81 页 | 13.18 MB | 1 年前3
共 7 条
- 1













