Backup and Restore - IT文库 (IT Library): free downloads of e-books and documents on programming, IT, and the internet


Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... automatic if there is a ResourceManager, e.g. in a YARN setup • A manual TaskManager restart or a backup is required in standalone mode • The restart strategy determines how often the JobManager tries ... • State is mapped into key-groups • Key-groups are mapped to subtasks as ranges • On restore, reads are sequential within each key-group, and often across multiple key-groups • The metadata ...

0 credits | 41 pages | 4.09 MB | 1 year ago
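
The snippet above states that state is mapped into key-groups and that key-groups are assigned to subtasks as contiguous ranges. The sketch below illustrates one such range scheme under stated assumptions: the function names are hypothetical, `zlib.crc32` is only a deterministic stand-in for whatever hash a real engine uses, and the integer formulas are an illustration rather than a quotation from the slides.

```python
import zlib


def key_group_for(key: str, max_parallelism: int) -> int:
    """Map a key to one of max_parallelism key-groups (crc32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % max_parallelism


def subtask_for_key_group(key_group: int, max_parallelism: int, parallelism: int) -> int:
    """Return the subtask that owns a key-group under contiguous range assignment."""
    return key_group * parallelism // max_parallelism


def key_group_range(subtask: int, max_parallelism: int, parallelism: int) -> range:
    """Return the contiguous range of key-groups owned by one subtask."""
    start = (subtask * max_parallelism + parallelism - 1) // parallelism
    end = ((subtask + 1) * max_parallelism - 1) // parallelism
    return range(start, end + 1)


if __name__ == "__main__":
    # With 8 key-groups and 3 subtasks, subtask 1 owns key-groups [3, 4, 5].
    print(list(key_group_range(1, 8, 3)))
    # After rescaling to 4 subtasks, subtask 1 owns key-groups [2, 3].
    print(list(key_group_range(1, 8, 4)))
```

Because each subtask owns a contiguous range, a rescaled subtask can read the key-groups it needs as a few sequential spans, which is consistent with the snippet's remark about sequential reads on restore.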
High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... recovery (at-least-once) • It avoids information loss • The output may contain duplicates • A backup needs to rebuild the state of the failed node (Vasiliki Kalavri | Boston University 2020) • Gap recovery (at-most-once) • It drops data during failure • The backup starts from the most recent information ... Can you see any disadvantage in this approach? ... Upstream Backup: Upstream nodes act as backups for their downstream operators by logging tuples in their output ...

0 credits | 49 pages | 2.08 MB | 1 year ago
Scalable Stream Processing - Spark Streaming and Flink

... unique IDs. • Operators send acks when a record has been processed. • Records are dropped from the backup when they have been fully acknowledged. ▶ Fault tolerance in Flink • More coarse-grained approach ...

0 credits | 113 pages | 1.22 MB | 1 year ago
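
The acknowledgement scheme in the snippet above (records carry unique IDs, operators send acks, and a record leaves the backup once it is fully acknowledged) can be sketched as follows. This is a minimal illustration, not the API of any particular system: the class `UpstreamBackup`, its methods, and the idea of tracking one pending-ack set per record are all assumptions made for the example.

```python
from collections import OrderedDict


class UpstreamBackup:
    """Hypothetical output buffer that retains records until every downstream acks them."""

    def __init__(self, downstream_ids):
        self._downstream = frozenset(downstream_ids)
        # record_id -> (record, set of downstream ids that have not acked yet)
        self._pending = OrderedDict()

    def emit(self, record_id, record):
        """Log a record in the backup before it is sent downstream."""
        self._pending[record_id] = (record, set(self._downstream))

    def ack(self, record_id, downstream_id):
        """Register an ack; drop the record once all downstream operators have acked."""
        entry = self._pending.get(record_id)
        if entry is None:
            return  # unknown or already fully acknowledged
        waiting = entry[1]
        waiting.discard(downstream_id)
        if not waiting:
            del self._pending[record_id]

    def replay(self):
        """Return the records that must be re-sent after a downstream failure."""
        return [(rid, rec) for rid, (rec, _) in self._pending.items()]


if __name__ == "__main__":
    backup = UpstreamBackup(["op-a", "op-b"])
    backup.emit(1, "x")
    backup.emit(2, "y")
    backup.ack(1, "op-a")
    backup.ack(1, "op-b")   # record 1 is now fully acknowledged and dropped
    print(backup.replay())  # only record 2 remains to be replayed
```

Replaying unacknowledged records after a failure can re-deliver records that were processed but not yet acked, so a scheme like this may produce duplicates downstream.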
Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... Filter out all compromised passwords? • Remove duplicate tuples on recovery when using upstream backup? The membership problem (Vasiliki Kalavri | Boston University 2020) What data structure ... A hash table requires O(log n) bits per element, which might still be ...

0 credits | 74 pages | 1.06 MB | 1 year ago
State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... checkpoint it, restore it, rescale it ... Unmanaged / Managed: What are the advantages and disadvantages of each approach? (Vasiliki Kalavri | Boston University 2020) • Copy, checkpoint, restore, merge, split ...

0 credits | 24 pages | 914.13 KB | 1 year ago
PyFlink 1.15 Documentation

... restoreInternal(StreamTask.java:687) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task ...

0 credits | 36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation

... restoreInternal(StreamTask.java:687) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task ...

0 credits | 36 pages | 266.80 KB | 1 year ago
Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... Fault-tolerance approaches recap (Vasiliki Kalavri | Boston University 2020) Upstream Backup: Upstream nodes act as backups for their downstream operators by logging tuples in their output ...

0 credits | 81 pages | 13.18 MB | 1 year ago

8 results in total
