High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
  ... computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models. State in dataflow computations. Vasiliki Kalavri | Boston University 2020 ... Distributed ...
  49 pages | 2.08 MB | 1 year ago
Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
  ... process of discarding data when input rates increase beyond system capacity. • Load shedding techniques operate in a dynamic fashion: the system detects an overload situation during runtime and selectively ... case of known aggregation functions, results can be scaled using approximate query processing techniques, where accuracy is measured in terms of relative error in the computed query answers. ... streams. (VLDB’06) • N. Tatbul, U. Çetintemel, and S. Zdonik. Staying fit: Efficient load shedding techniques for distributed stream processing. (VLDB’07) • N. R. Katsipoulakis, A. Labrinidis, and P. ...
  43 pages | 2.42 MB | 1 year ago
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
  ... evenly • Data-parallel streaming languages enable fission by construction • Elastic scaling techniques enable dynamic operator fission by adjusting the number of parallel operator instances according ...
  54 pages | 2.83 MB | 1 year ago
Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
  ... Batch API Historic data Kafka, RabbitMQ, ... HDFS, JDBC, ... Event logs ETL, Graphs, Machine Learning Relational, ... Low latency, windowing, aggregations, ... Vasiliki Kalavri | Boston University ...
  26 pages | 3.33 MB | 1 year ago
State management - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
  ... computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models. State in dataflow computations. Vasiliki Kalavri | Boston University 2020 • No explicit ...
  24 pages | 914.13 KB | 1 year ago
PyFlink 1.15 Documentation
  ... workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you’re already familiar with Python and libraries such as Pandas ...
  36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation
  ... workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you’re already familiar with Python and libraries such as Pandas ...
  36 pages | 266.80 KB | 1 year ago
7 results in total