Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020Passive Standby • Each primary periodically checkpoints its state and sends it to the secondary 6 Ni primary secondary I1 O1 N’i update checkpoint send state ??? Vasiliki Kalavri | Boston University v4893Q12R8=">AB53icbVBNS8NAEJ34WetX1aOXxSJ4Kok I6q3oxWMLxhbaUDbSbt2swm7G6GE/gIvHlS8+pe8+W/ctjlo64OBx3szMwLU8G1cd1vZ2V1bX1js7RV3t7Z3duvHBw+6CRTDH2WiES1Q6pRcIm+4UZgO1VI41BgKxzdTv3WEyrNE3lvxikGMR1IHnFGjZWaca9SdWvuDGSZeAWp QoFGr/LV7Scsi1EaJqjWHc9NTZBTZTgTOCl3M40pZSM6wI6lksaog3x26IScWqVPokTZkobM1N8TOY21Hseh7YypGepFbyr+53UyE10FOZdpZlCy+aIoE8QkZPo16XOFzIixJZQpbm8lbEgVZcZmU7YheIsvLxP/vHZd85oX1fpNk UYJjuEzsCDS6jDHTABwYIz/AKb86j8+K8Ox/z1hWn0 码力 | 81 页 | 13.18 MB | 1 年前3
Graph streaming algorithms - CS 591 K1: Data Stream Processing and Analytics Spring 2020Kalavri | Boston University 2020 6 ??? Vasiliki Kalavri | Boston University 2020 6 ??? Vasiliki Kalavri | Boston University 2020 6 ??? Vasiliki Kalavri | Boston University 2020 6 ??? Vasiliki Kalavri | Boston Boston University 2020 6 ??? Vasiliki Kalavri | Boston University 2020 6 ??? Vasiliki Kalavri | Boston University 2020 6 ??? Vasiliki Kalavri | Boston University 2020 6 ??? Vasiliki Kalavri | Boston Boston University 2020 6 ??? Vasiliki Kalavri | Boston University 2020 7 Let G(t) = (V(t), E(t)) be the graph observed up to timestamp t. For t=0, V(t) = E(t) = {} For every t > 0, we receive one event:0 码力 | 72 页 | 7.77 MB | 1 年前3
Flink如何实时分析Iceberg数据湖的CDC数据RaAlD sMSPBD .R0,T0 T,-L0 mysOl_AHLlMF HC INT NOT N=LL, LamD ;TRING, CDsBPHNRHML ;TRING, UDHFGR /0.I6,L '0,() ) WIT3 'BMLLDBRMP' = 'mysOl-BCB', 'GMsRLamD' = 'lMBalGMsR', 'NMPR' = '((0)', 'SsDPLamD' = 1RO6 mysOl_AHLlMF; FHRGSA.BMm/TDPTDPHBa/ElHLI-BCB-BMLLDBRMPs Flink 原生支持 Change Log Stream A C D E F G INSERT DELETE UPDATE INSERT DELETE UPDATE INSERT F3152 + Icebe7g CDC导入i案 D6w5st7e+4 、gc近实k导入和实k读取。 2、计算a擎原生gcCDCe入,不需要额外的业务 字r设计。 3、统一的h据t存储,多o化的计算模型。 4、读取合并后的历史h据可F分利wI存加速。 5、云原生gc。 6、gc增量b取。 7、nm足够简s,无在线l务节u。 i案评D Cu 如何实时#入读取? #3 s量更新场景 VS +,+写入场景 k比项 s量更新场景 +,+写入场景 典g场景 1.0 码力 | 36 页 | 781.69 KB | 1 年前3
High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020queue primary nodes secondary nodes other apps I1 I2 O1 O2 N’1 N’K N’2 … I’1 I’2 O’1 O’2 6 Vasiliki Kalavri | Boston University 2020 Assumptions Ni primary secondary I1 I2 O1 O2 N’i after recovery 10 Recovery type Before failure After failure Precise t1 t2 t3 t4 t5 t6 … Gap t1 t2 t3 t5 t6 … Rollback-repeating t1 t2 t3 t2 t3 t4 … Rollback-convergent t1 t2 t3 t’2 t’3 t4 result semantics 11 sum 4 3 1 3 3 … 5 6 Vasiliki Kalavri | Boston University 2020 Processing guarantees and result semantics 11 sum 5 4 3 6 6 … 6 7 1 Vasiliki Kalavri | Boston University 20200 码力 | 49 页 | 2.08 MB | 1 年前3
Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020of a 0 is 1/2) • around 25% will end in at least two 0s: • *******00 (1/2 * 1/2) • and so on… 6 ??? Vasiliki Kalavri | Boston University 2020 The hash function h hashes x to any of N values with distinct elements, whereas if two 0s is the maximum we’ve seen, that indicates 4 distinct elements, … 6 ??? Vasiliki Kalavri | Boston University 2020 The hash function h hashes x to any of N values with indicates 4 distinct elements, … It takes 2r hash calls before we encounter a result with r 0s. 6 ??? Vasiliki Kalavri | Boston University 2020 Is this a good estimate? 7 ??? Vasiliki Kalavri |0 码力 | 69 页 | 630.01 KB | 1 年前3
PyFlink 1.15 Documentation1.1.1.2 Local . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.1.1.3 Standalone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 (continued from previous page) # -rw-r--r-- 1 dianfu staff 45K 10 18 20:54 flink-dianfu-python-B-7174MD6R-1908. ˓→local.log Besides, you could also check if the files of the PyFlink package are consistent pyflink;import os;print(os.path.dirname(os.path.abspath(pyflink.__ ˓→file__)))" (continues on next page) 6 Chapter 1. How to build docs locally pyflink-docs, Release release-1.15 (continued from previous page)0 码力 | 36 页 | 266.77 KB | 1 年前3
PyFlink 1.16 Documentation1.1.1.2 Local . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.1.1.3 Standalone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 (continued from previous page) # -rw-r--r-- 1 dianfu staff 45K 10 18 20:54 flink-dianfu-python-B-7174MD6R-1908. ˓→local.log Besides, you could also check if the files of the PyFlink package are consistent pyflink;import os;print(os.path.dirname(os.path.abspath(pyflink.__ ˓→file__)))" (continues on next page) 6 Chapter 1. How to build docs locally pyflink-docs, Release release-1.16 (continued from previous page)0 码力 | 36 页 | 266.80 KB | 1 年前3
Scalable Stream Processing - Spark Streaming and Flink▶ Design issues • Continuous vs. micro-batch processing • Record-at-a-Time vs. declarative APIs 6 / 79 Spark Streaming ▶ Run a streaming computation as a series of very small, deterministic batch connection.close() } } 34 / 79 Word Count in Spark Streaming 35 / 79 Word Count in Spark Streaming (1/6) ▶ First we create a StreamingContex import org.apache.spark._ import org.apache.spark.streaming._ ount") val ssc = new StreamingContext(conf, Seconds(1)) 36 / 79 Word Count in Spark Streaming (2/6) ▶ Create a DStream that represents streaming data from a TCP source. ▶ Specified as hostname (e.g0 码力 | 113 页 | 1.22 MB | 1 年前3
监控Apache Flink应用程序(入门).................................................................................................. 6 3 监控 ............................................................................................. ter文档2。 在这篇博客的其余部分中,我们将介绍一些监控Apache Flink应用程序的最重要的指标。 caolei – 监控Apache Flink应用程序(入门) 健康状况 – 6 2 健康状况 caolei – 监控Apache Flink应用程序(入门) 监控 – 7 3 监控 您要监控的第一件事就是您的作业是否实际处于运行状态。此外,还可以监控重启的次数以及自上次重启之后 5 https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/metrics.html#checkpointing 6 https://issues.apache.org/jira/browse/FLINK-10317 7 https://ci.apache.org/projects/flink/flink-docs-release-10 码力 | 23 页 | 148.62 KB | 1 年前3
Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020dataflow physical dataflow ??? Vasiliki Kalavri | Boston University 2020 Automatic scaling overview 6 scaling controller detect symptoms decide whether to scale decide how much to scale metrics effect of Dhalion’s scaling actions in an initially under-provisioned wordcount dataflow 1 2 3 6 5 4 ??? Vasiliki Kalavri | Boston University 2020 Dataflow worker activities worker 1 worker 2 step for both operators Dhalion scales one operator at a time, and needs six steps in total 1 6 5 4 3 2 and converges in 60s, as soon as it receives the Heron metrics ??? Vasiliki Kalavri |0 码力 | 93 页 | 2.42 MB | 1 年前3
共 22 条
- 1
- 2
- 3













