Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020JobManager cannot restart the application until enough slots become available. • Restart is automatic if there is a ResourceManager, e.g. in a YARN setup • A manual TaskManager re-start or a backup routed to the same parallel instance • Some kind of hashing is typically used • Maintaining routing tables or an index for all key mappings is usually impractical • Skewed load is challenging to channel of each parallel task • Partitioning function performance • space required to implement routing • lookup cost • Migration performance • re-assignment computation cost • state movement cost0 码力 | 41 页 | 4.09 MB | 1 年前3
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020Profitability: under what conditions does the optimization improve performance? • can the decision be automatic? • Safety: under what conditions does the optimization preserve correctness? • maintain state 2020 33 • if operator is costly enough to bring benefit when parallelized • split incurs a routing overhead • merge might incur overhead if ordering is required • p/s/o: parallel/sequential/overhead0 码力 | 54 页 | 2.83 MB | 1 年前3
Skew mitigation - CS 591 K1: Data Stream Processing and Analytics Spring 2020partitioning 2 w2 w1 w3 round-robin hash-based • Items are perfectly balanced among workers • No routing table required • Key semantics are not preserved: values of the same key might be routed to to different workers • Workers are responsible for roughly the same amount of keys • No routing table is required • Key semantics preserved: values of the same key are always processed by the same both choices: the partitioner sends the item to the worker with the currently lowest load • no routing history required • state needs to be merged to produce the final result: the computation must consist0 码力 | 31 页 | 1.47 MB | 1 年前3
Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020log2(230) = 30 bits. • We need 1024 counters, so m = 210 and we need p = log2m = 10 bits for routing. Space requirements ??? Vasiliki Kalavri | Boston University 2020 16 As we read the stream, it log2(230) = 30 bits. • We need 1024 counters, so m = 210 and we need p = log2m = 10 bits for routing. • Each counter needs to be able to count up to 20 0s, so we need to allocate log220 = 4.32 bits log2(230) = 30 bits. • We need 1024 counters, so m = 210 and we need p = log2m = 10 bits for routing. • Each counter needs to be able to count up to 20 0s, so we need to allocate log220 = 4.32 bits0 码力 | 69 页 | 630.01 KB | 1 年前3
Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020correctness ??? Vasiliki Kalavri | Boston University 2020 Automatic Scaling Control 4 ??? Vasiliki Kalavri | Boston University 2020 The automatic scaling problem 5 Given a logical dataflow with sources S2 π=2 π=3 logical dataflow physical dataflow ??? Vasiliki Kalavri | Boston University 2020 Automatic scaling overview 6 scaling controller detect symptoms decide whether to scale decide much to scale metrics policy scaling action ??? Vasiliki Kalavri | Boston University 2020 Automatic scaling requirements 7 ▸ Accuracy ▸ no over/under-provisioning ▸ Stability ▸ no oscillations0 码力 | 93 页 | 2.42 MB | 1 年前3
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020(historical) and stream processing 6. Ensure availability despite failures 7. Support distribution and automatic elasticity 8. Offer low-latency 7 2005 Vasiliki Kalavri | Boston University 2020 actions, typically exact • Challenges: computation progress, fault-tolerance and result guarantees, automatic scaling and state migration, out-of-order processing 37 Vasiliki Kalavri | Boston University 20200 码力 | 45 页 | 1.22 MB | 1 年前3
共 6 条
- 1













