Graph streaming algorithms - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… they are added to V(t+1). Preliminaries. Vasiliki Kalavri | Boston University 2020. Some algorithms model graph streams as a sequence of vertex events. A vertex stream consists of events that contain a vertex and all of its neighbors. Although this model can enable a theoretical analysis of streaming algorithms, it cannot adequately model real-world unbounded streams, as the neighbors cannot be known in … compID = 1, compID = 6 …
• How can we run such algorithms if the graph is continuously generated as a stream of edges?
• How can we perform iterative …
72 pages | 7.77 MB | 1 year ago
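For connected components specifically, one common way to handle a graph that arrives as a stream of edges is to keep a union-find (disjoint-set) summary and update it once per edge. The sketch below is not taken from the slides: the class name `StreamingComponents`, its methods, and the convention of using the smallest vertex ID in a component as its component ID are all assumptions made for illustration.

```python
class StreamingComponents:
    """Hypothetical sketch: connected components over an edge stream.

    Keeps one parent pointer per vertex seen so far, so memory grows with
    the number of vertices, not with the number of edges.
    """

    def __init__(self):
        # parent[v] == v means v is the representative of its component.
        self.parent = {}

    def find(self, v):
        """Return the representative of v's component, registering v if new."""
        self.parent.setdefault(v, v)
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every vertex on the path directly at root.
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def add_edge(self, u, v):
        """Process one edge event (u, v) from the stream."""
        ru, rv = self.find(u), self.find(v)
        if ru != rv:
            # Assumption: the smaller root wins, so the representative is
            # always the minimum vertex ID in the merged component.
            low, high = min(ru, rv), max(ru, rv)
            self.parent[high] = low

    def comp_id(self, v):
        """Component ID of v under the min-vertex-ID convention."""
        return self.find(v)
```

Feeding the edges (1, 2), (2, 3), (6, 7), (7, 8) one at a time leaves two components whose IDs under this convention are 1 and 6. A later edge such as (3, 8) merges them into a single component with ID 1. This sketch handles edge additions only; deletions would need a different summary.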
Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… Marianne, and Philippe Flajolet. Loglog counting of large cardinalities. European Symposium on Algorithms, 2003.
• Flajolet, Philippe, et al. Hyperloglog: the analysis of a near-optimal cardinality …
… summary: the count-min sketch and its applications. Journal of Algorithms (2005).
• Gakhov, Andrii. Probabilistic Data Structures and Algorithms for Big Data Applications. 2019.
Further reading
69 pages | 630.01 KB | 1 year ago
Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… streaming. Fundamental for representing, summarizing, and analyzing data streams. Systems; Algorithms; Architecture and design; Scheduling and load management; Scalability and elasticity; Fault-tolerance management; Operator semantics; Window optimizations; Filtering, counting, sampling; Graph streaming algorithms. Vasiliki Kalavri | Boston University 2020. Tools: Apache Flink: flink.apache.org; Apache Kafka: …
34 pages | 2.53 MB | 1 year ago
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… updates cannot change past entries in A. Useful in theory for the development of streaming algorithms. With limited practical value in distributed, real-world settings. … stream. It is the most general model. Hard to develop space-efficient and time-efficient algorithms. Vasiliki Kalavri | Boston University 2020. Relational Streaming Model …
45 pages | 1.22 MB | 1 year ago
High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… Kalavri | Boston University 2020. Further resources
• Jeong-Hyon Hwang et al. High-Availability Algorithms for Distributed Stream Processing. (ICDE ’05).
• http://cs.brown.edu/research/aurora/hwang…
49 pages | 2.08 MB | 1 year ago
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… computation
• query plans, e.g. order of operators
• scheduling and placement decisions
• different algorithms, e.g. hash-based vs. broadcast join
• What does performance depend on?
• input data, intermediate …
54 pages | 2.83 MB | 1 year ago
6 results in total