combined development support team - IT文库

Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... selectivity > 1 • a filter operator typically has selectivity < 1 • Is selectivity always known at development time? ... Vasiliki Kalavri | Boston University 2020 ... Types of Parallelism ... perform an equivalent computation • Ensure mergeable state: even a simple counter might differ on a combined stream vs. on separate streams ... Redundancy elimination: eliminate redundant operations, aka subgraph ...

0 credits | 54 pages | 2.83 MB | 1 year ago
Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020

• MBs assume a small working set. If consumers are slow, throughput might degrade. • DBs support secondary indexes for efficient search while MBs only offer topic-based subscription. • DB query ... the form of name-value pairs and basic comparison operators. • Constraints can be logically combined to form complex event patterns. • company == 'Uber' and price < 100 • Predecessors of Complex ...

0 credits | 33 pages | 700.14 KB | 1 year ago
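The stream ingestion excerpt above mentions constraints over name-value pairs that are logically combined, with `company == 'Uber' and price < 100` as its example. The following is a minimal sketch of that idea only, not code from the course slides; the helper names `all_of` and `matching` and the sample events are hypothetical.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]               # an event as name-value pairs
Constraint = Callable[[Event], bool]


def all_of(*constraints: Constraint) -> Constraint:
    """Combine constraints with logical AND (hypothetical helper)."""
    return lambda event: all(c(event) for c in constraints)


def matching(events: List[Event], constraint: Constraint) -> List[Event]:
    """Return the events that satisfy a subscription's constraint."""
    return [e for e in events if constraint(e)]


# The subscription from the excerpt: company == 'Uber' and price < 100.
# A missing price is treated as infinitely large, so it never matches.
subscription = all_of(
    lambda e: e.get("company") == "Uber",
    lambda e: e.get("price", float("inf")) < 100,
)

# Made-up sample events, for illustration only.
events = [
    {"company": "Uber", "price": 95},
    {"company": "Uber", "price": 120},
    {"company": "Lyft", "price": 80},
]

matched = matching(events, subscription)
print(matched)
```

Only the first sample event satisfies both constraints. A topic-based subscription, by contrast, would select events by topic name alone and could not express the price condition.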
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020

1. Process events online without storing them 2. Support a high-level language (e.g. StreamSQL) 3. Handle missing, out-of-order, delayed data 4. Guarantee ... Combine batch (historical) and stream processing 6. Ensure availability despite failures 7. Support distribution and automatic elasticity 8. Offer low-latency ... 2005 ... limitation on the stream: updates cannot change past entries in A. ... Useful in theory for the development of streaming algorithms, with limited practical value in distributed, real-world settings ...

0 credits | 45 pages | 1.22 MB | 1 year ago
PyFlink 1.15 Documentation

... contains its own Python executable files and the installed Python packages. It is useful for local development to create a standalone Python environment and also useful when deploying a PyFlink job to production ... Local: This page shows you how to set up a PyFlink development environment on your local machine. This is usually used for local execution or development in an IDE. Set up Python environment: It requires ... ExecNodeBase.translateToPlan(ExecNodeBase.java:134) This is an issue around Java 17. Flink does not yet support Java 17. You can refer to FLINK-15736 for more details. To solve this issue, you need to ...

0 credits | 36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation

... contains its own Python executable files and the installed Python packages. It is useful for local development to create a standalone Python environment and also useful when deploying a PyFlink job to production ... Local: This page shows you how to set up a PyFlink development environment on your local machine. This is usually used for local execution or development in an IDE. Set up Python environment: It requires ... ExecNodeBase.translateToPlan(ExecNodeBase.java:134) This is an issue around Java 17. Flink does not yet support Java 17. You can refer to FLINK-15736 for more details. To solve this issue, you need to ...

0 credits | 36 pages | 266.80 KB | 1 year ago
State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... operations and types ... Consider you are designing a state interface. What operations should state support? What state types can you think of? • Count, sum, list, map, … ... Checkpoints sent to JobManager's heap memory, i.e. the state is lost in case of failure • Use only for development and debugging purposes! FsStateBackend • Stores state on TaskManager's heap but checkpoints it ...

0 credits | 24 pages | 914.13 KB | 1 year ago
Monitoring Apache Flink Applications (Introduction)

... Flink application. I highly recommend that you start monitoring your Flink application early on in the development phase. This way you will be able to improve your dashboards and alerts over time and, more importantly, observe the performance impact of the changes to your application throughout the development phase. By doing so, you can ask the right questions about the runtime behaviour of your application ...

0 credits | 23 pages | 148.62 KB | 1 year ago
Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Java JDK. A Java JRE is not sufficient! • Apache Maven 3.x. • An IDE for Java and/or Scala development, such as IntelliJ IDEA (preferred), Eclipse, or Netbeans with appropriate plugins installed.

0 credits | 34 pages | 2.53 MB | 1 year ago
Windows and triggers - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... iterate over the list of all collected elements when evaluated: • They require more space but support more complex logic. • ProcessWindowFunction ... Window functions ...

0 credits | 35 pages | 444.84 KB | 1 year ago
Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics Spring 2020

... value of B ... What kind of queries can we express and support on data streams? ... Non-blocking (monotonic) queries ...

0 credits | 53 pages | 532.37 KB | 1 year ago
