Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics, Spring 2020, Boston University
Scaling approaches. Metrics: • service time and waiting time per tuple and per task • total time spent processing a tuple and all its derived results • CPU utilization, congestion
Queuing theory models. Metrics: • service time and waiting time per tuple and per task • total time spent processing a tuple and all its derived results. Policy: • each operator as …
93 pages | 2.42 MB | 1 year ago
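The excerpt above lists the per-operator metrics that feed a queuing-theory scaling policy. As a minimal illustration (my own sketch, not taken from the slides), the Java class below treats a single operator as an M/M/1 queue and derives utilization, waiting time, and total time per tuple from assumed arrival and service rates:

// Illustrative only: models one operator as an M/M/1 queue.
// The arrival rate (lambda) and service rate (mu) are assumed example values.
public final class OperatorQueueModel {

    public static void main(String[] args) {
        double lambda = 800.0;  // tuples/sec arriving at the operator (assumed)
        double mu = 1000.0;     // tuples/sec the operator can serve (assumed)

        double utilization = lambda / mu;                       // rho = lambda / mu
        double avgServiceTime = 1.0 / mu;                       // mean service time per tuple (sec)
        double avgWaitingTime = lambda / (mu * (mu - lambda));  // M/M/1 mean waiting time in queue (sec)
        double avgTotalTime = avgWaitingTime + avgServiceTime;  // total time spent per tuple (sec)

        System.out.printf("utilization=%.2f waiting=%.4fs service=%.4fs total=%.4fs%n",
                utilization, avgWaitingTime, avgServiceTime, avgTotalTime);
    }
}

When utilization approaches 1, the waiting time grows without bound, which is the kind of congestion signal a scaling policy reacts to.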
Monitoring Apache Flink Applications (Getting Started)
(reference: /dev/stream/operators/#task-chaining-and-resource-groups)
4 Progress and throughput monitoring. Knowing that your application is running and that checkpointing works is good, but it does not tell you whether the application is actually making progress and keeping up with its upstream systems.
4.1 Throughput. Flink provides several metrics to measure an application's throughput. For each operator or task (remember: a task can contain multiple chained tasks), Flink counts the records and bytes going in and out. Among these metrics, the rate of records emitted by each operator is usually the most intuitive and easiest to understand.
4.2 Key metrics. Metric: numRecordsOutPerSecond | Scope: task | Description: The number of records this operator/task sends.
When enabled, Flink will insert so-called latency markers periodically at all sources. For each sub-task, a latency distribution from each source to this operator will be reported. The granularity of these …
23 pages | 148.62 KB | 1 year ago
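A minimal Java sketch (my own example, not from the guide) showing two small steps that make these metrics easier to use: giving operators explicit names so numRecordsOutPerSecond can be located per operator, and switching on the periodic latency markers mentioned above:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public final class MetricsFriendlyJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Emit latency markers from all sources every 1000 ms
        // (latency tracking is disabled by default).
        env.getConfig().setLatencyTrackingInterval(1000L);

        env.fromElements("a", "b", "c")
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                // An explicit name makes per-operator metrics such as
                // numRecordsOutPerSecond easy to find in the web UI / REST API.
                .name("uppercase-mapper")
                .print()
                .name("stdout-sink");

        env.execute("metrics-friendly-job");
    }
}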
PyFlink 1.15 Documentation
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
at java.net.URLClassLoader …
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
    response = task()
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line …
36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
at java.net.URLClassLoader …
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
    response = task()
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line …
36 pages | 266.80 KB | 1 year ago
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics, Spring 2020, Boston University
Types of parallelism: Pipeline (A || B), Task (B || C), Data (A || A), illustrated with a dataflow of operators A, B, C, D and a split.
Distributed execution in Flink; synergies with scheduling and other optimizations.
Task chaining: fusion in Flink, e.g. StreamExecutionEnvironment … .disableOperatorChaining(); val input: …
Profitability: • Fission might be preferable to pipeline and task parallelism because it balances load more evenly • Data-parallel streaming languages enable fission …
54 pages | 2.83 MB | 1 year ago
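The Scala snippet in the excerpt is truncated; as a minimal sketch in Java (my own example, not the course code), operator chaining can be disabled for the whole job or controlled per operator:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public final class ChainingExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Option 1: disable operator chaining (fusion) for the whole job.
        env.disableOperatorChaining();

        env.fromElements("a", "b", "c")
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.trim();
                    }
                })
                // Option 2 (when global chaining stays on): per-operator hints.
                .startNewChain()     // start a new chain at this operator
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                .disableChaining()   // never chain this operator with its neighbours
                .print();

        env.execute("chaining-example");
    }
}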
State management - CS 591 K1: Data Stream Processing and Analytics, Spring 2020, Boston University
State is all data maintained by a task and used to compute results: a local or instance variable that is accessed by the task's business logic.
• Operator state is scoped to an operator task, i.e. records processed by the same parallel task have access to the same state. It cannot be accessed by other parallel tasks of the same or different operators.
• Keyed state is scoped to a key defined in the operator's input records. Flink maintains one state instance per key value and partitions all records with the same key to the operator task that maintains the state for this key. State access is automatically scoped to the key of the current …
24 pages | 914.13 KB | 1 year ago
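As a minimal Java sketch of keyed state (an illustration of the concept above, not code from the lecture), the function below keeps one counter per key; the ValueState it reads and updates is automatically scoped to the key of the current record:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Counts records per key; Flink keeps one instance of the ValueState per key value.
public class PerKeyCounter extends KeyedProcessFunction<String, Tuple2<String, Long>, Tuple2<String, Long>> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Types.LONG));
    }

    @Override
    public void processElement(Tuple2<String, Long> record,
                               Context ctx,
                               Collector<Tuple2<String, Long>> out) throws Exception {
        // State access is implicitly scoped to record.f0, the key of the current record.
        Long current = count.value();
        long updated = (current == null ? 0L : current) + 1L;
        count.update(updated);
        out.collect(Tuple2.of(record.f0, updated));
    }
}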
Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics, Spring 2020, Boston University
Tuple A enters the system and is processed by Task 1; the result is serialized into an output buffer; the buffer is shipped to Task 2.
• Each produced and consumed stream has a managed buffer … rate automatically.
• The maximum throughput is limited by the processing rate of the slowest task.
• Parallel tasks are connected via virtual channels multiplexed over TCP connections.
• … management in modern, highly-parallel stream processors and is implemented in Apache Flink.
• Each task informs its senders of its buffer availability via credit messages.
• This way, senders always …
43 pages | 2.42 MB | 1 year ago
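The credit mechanism in the last two bullets can be illustrated with a small toy model in Java (my own sketch, not Flink's actual network stack): the receiver announces its free buffers as credits, and the sender only ships data while it holds credits, which is what produces backpressure when the receiver falls behind.

import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of credit-based flow control between one sender and one receiver.
public final class CreditBasedChannel {

    private final Queue<String> receiverBuffers = new ArrayDeque<>();
    private int senderCredits;  // number of buffers the sender may still ship

    CreditBasedChannel(int receiverCapacity) {
        this.senderCredits = receiverCapacity;  // initial credit announcement
    }

    // Sender side: ship a buffer only if a credit is available, otherwise back-pressure.
    boolean trySend(String buffer) {
        if (senderCredits == 0) {
            return false;                // no credit -> the sender must wait
        }
        senderCredits--;
        receiverBuffers.add(buffer);     // stands in for the network transfer
        return true;
    }

    // Receiver side: consuming a buffer frees space and returns a credit to the sender.
    void consumeOne() {
        if (receiverBuffers.poll() != null) {
            senderCredits++;             // credit message sent back to the sender
        }
    }

    public static void main(String[] args) {
        CreditBasedChannel channel = new CreditBasedChannel(2);
        System.out.println(channel.trySend("t1"));  // true
        System.out.println(channel.trySend("t2"));  // true
        System.out.println(channel.trySend("t3"));  // false: receiver has no free buffer
        channel.consumeOne();                       // receiver processes one buffer
        System.out.println(channel.trySend("t3"));  // true: a credit came back
    }
}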
The Past, Present and Future of Apache Flink (Apache Flink的过去、现在和未来)
Architecture diagram: Dispatcher, JobManager, TaskManagers, ResourceManager, and a cluster manager (YARN RM / K8s RM). Job submission flow: 1. Submit job, 2. Start job, 3. Request slots, 4. Allocate container, 5. Start TaskManager, 6. Schedule task.
33 pages | 3.36 MB | 1 year ago
Graph streaming algorithms - CS 591 K1: Data Stream Processing and Analytics, Spring 2020, Boston University
Connected components in Flink:
DataStream …             // partial state
    .flatMap(new Merger())  // global state
    .setParallelism(1);     // merging on one task
Will this scale? How to represent the state?
72 pages | 7.77 MB | 1 year ago
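A minimal Java sketch of what such a Merger could look like (my own illustration, assuming each upstream partition emits a partial vertex-to-component-label map; the course's actual implementation may differ). It folds the partial labelings into one global labeling with a small union-find over labels, which is why the operator must run with parallelism 1:

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.util.Collector;

// Illustrative Merger: folds per-partition vertex -> component-label maps into one
// global labeling; labels assigned to the same vertex by different partitions are
// recognised as the same component via a union-find over labels.
public class Merger implements FlatMapFunction<Map<Long, Long>, Map<Long, Long>> {

    // Global state; safe only because the operator runs with parallelism 1.
    private final Map<Long, Long> vertexLabel = new HashMap<>();
    private final Map<Long, Long> labelParent = new HashMap<>();

    private long find(long label) {
        labelParent.putIfAbsent(label, label);
        long parent = labelParent.get(label);
        if (parent != label) {
            parent = find(parent);                    // path compression
            labelParent.put(label, parent);
        }
        return parent;
    }

    private void union(long a, long b) {
        long ra = find(a), rb = find(b);
        if (ra != rb) {
            labelParent.put(Math.max(ra, rb), Math.min(ra, rb));  // smaller id wins
        }
    }

    @Override
    public void flatMap(Map<Long, Long> partial, Collector<Map<Long, Long>> out) {
        for (Map.Entry<Long, Long> e : partial.entrySet()) {
            long vertex = e.getKey();
            long label = e.getValue();
            Long existing = vertexLabel.get(vertex);
            if (existing == null) {
                vertexLabel.put(vertex, find(label));
            } else {
                union(existing, label);               // same vertex -> labels are equivalent
            }
        }
        // Emit a snapshot with every vertex mapped to its canonical label.
        Map<Long, Long> snapshot = new HashMap<>();
        for (Map.Entry<Long, Long> e : vertexLabel.entrySet()) {
            snapshot.put(e.getKey(), find(e.getValue()));
        }
        out.collect(snapshot);
    }
}

Merging everything on a single task is also what motivates the slide's questions: the global state and its update rate become the bottleneck.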
Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics, Spring 2020, Boston University
1. … input streams; 2. Wait for all in-flight data to be completely processed; 3. Copy the state of each task to a remote, persistent storage; 4. Wait until all tasks have finished their copies; 5. Resume processing.
• Assumptions: a DAG of tasks; epoch change events triggered on each source task (⟨ep1⟩, ⟨ep2⟩, …), issued by a coordinator or generated periodically. We want to snapshot …
• … stream sources to the sink of the dataflow.
• When a source task receives a checkpoint barrier, it pauses emitting records, triggers a checkpoint of its local state …
81 pages | 13.18 MB | 1 year ago
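As a minimal Java sketch (standard configuration API, with assumed interval values), this is how an application turns on the barrier-based, exactly-once checkpointing described in the excerpt:

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public final class CheckpointedJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Inject checkpoint barriers at the sources every 10 seconds,
        // with exactly-once (aligned) semantics.
        env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);

        CheckpointConfig config = env.getCheckpointConfig();
        config.setMinPauseBetweenCheckpoints(500L);  // breathing room between checkpoints
        config.setCheckpointTimeout(60_000L);        // abort checkpoints that take too long
        config.setMaxConcurrentCheckpoints(1);       // one checkpoint in flight at a time

        env.fromElements(1, 2, 3).print();

        env.execute("checkpointed-job");
    }
}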
共 13 条
- 1
- 2













