Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020
University 2020 • Ensure result correctness • reconfiguration mechanism often relies on fault-tolerance mechanism • State re-partitioning and migration • minimize communication • keep duration short
0 points | 41 pages | 4.09 MB | 1 year ago

Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring
and Analytics Vasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 3/24: Exactly-once fault-tolerance in Apache Flink Vasiliki Kalavri | Boston University 2020 Some slides in this lecture have org/smash/get/diva2:1240814/FULLTEXT01.pdf Fault-tolerance approaches recap Upstream Backup Upstream nodes
0 points | 81 pages | 13.18 MB | 1 year ago

Scalable Stream Processing - Spark Streaming and Flink
Fault Tolerance (1/2) ▶ Fault tolerance in Spark • RDD re-computation ▶ Fault tolerance in Storm • Tracks records with unique IDs. • Operators send processed. • Records are dropped from the backup when they have been fully acknowledged. ▶ Fault tolerance in Flink • More coarse-grained approach than Storm. • Based on consistent global snapshots (inspired overhead, stateful exactly-once semantics.
0 points | 113 pages | 1.22 MB | 1 year ago

pandas: powerful Python data analysis toolkit - 0.21.1
Series.reindex(), DataFrame.reindex(), Index.get_indexer() now support list-like argument for tolerance. (GH17367) 1.2.2 Backwards incompatible API changes 1.2.2.1 Dependencies have increased minimum (GH15676) • Bug in pd.merge_asof() where left_index/right_index together caused a failure when tolerance was specified (GH15135) • Bug in DataFrame.pivot_table() where dropna=True would not drop all-NaN parsing (GH7626) • Bug in pd.merge_asof() could not handle timezone-aware DatetimeIndex when a tolerance was specified (GH14844) • Explicit check in to_stata and StataWriter for out-of-range values when
0 points | 2207 pages | 8.59 MB | 1 year ago

PyMuPDF 1.24.2 Documentation
two red points and setting clip accordingly. cluster_drawings(clip=None, drawings=None, x_tolerance=3, y_tolerance=3) Cluster vector graphics (synonyms are line-art or drawings) based on their geometrical output of Page.get_drawings() and joins paths whose path["rect"] are closer to each other than some tolerance values (given in the arguments). The result is a list of rectangles that each wrap things like tables • x_tolerance (float) – find_tables(clip=None, strategy=None, vertical_strategy=None, horizontal_strategy=None, vertical_lines=None, horizontal_lines=None, snap_tolerance=None, snap_x_tolerance=None
0 points | 565 pages | 6.84 MB | 1 year ago

pandas: powerful Python data analysis toolkit - 0.17.0
object In [56]: s.drop_duplicates(keep=False) Out[56]: 2 C 5 D dtype: object • Reindex now has a tolerance argument that allows for finer control of Limits on filling while reindexing (GH10411): In [57]: method='nearest', ....: tolerance=0.2) ....: Out[58]: t x 0.1 2000-01-01 0 1.9 2000-01-03 2 3.5 NaT NaN When used on a DatetimeIndex, TimedeltaIndex or PeriodIndex, tolerance will be coerced into a Timedelta. This allows you to specify tolerance with a string: In [59]: df = df.set_index('t') In [60]: df.reindex(pd.to_datetime(['1999-12-31']), ....: method='nearest', ....: tolerance='1 day') ....: Out[60]:
0 points | 1787 pages | 10.76 MB | 1 year ago

钟阳红-Apache Ballista Introduction
Arrow and DataFusion. It is mainly intended for low-latency interactive queries. • Support DAG and fault tolerance • Support data exchange • Support different kinds of object stores, like HDFS, S3, Azure, etc Execution DAG State Machine Normal Stage State Machine SQL Execution Fault Tolerance Stage State Machine for Executor Lost SQL Execution Task Assignment Task: each execution assignment (Snowflake) • LRU based retirement • Cache aware scheduling • Consistent hashing tolerance-based work stealing • Currently it’s file-level Data Cache Three rounds cache aware task Scheduling:
0 points | 17 pages | 2.66 MB | 1 year ago

pandas: powerful Python data analysis toolkit - 0.19.0
object In [56]: s.drop_duplicates(keep=False) Out[56]: 2 C 5 D dtype: object • Reindex now has a tolerance argument that allows for finer control of Limits on filling while reindexing (GH10411): In [57]: method='nearest', ....: tolerance=0.2) ....: Out[58]: t x 0.1 2000-01-01 0.0 1.9 2000-01-03 2.0 3.5 NaT NaN When used on a DatetimeIndex, TimedeltaIndex or PeriodIndex, tolerance will be coerced into a Timedelta. This allows you to specify tolerance with a string: In [59]: df = df.set_index('t') In [60]: df.reindex(pd.to_datetime(['1999-12-31']), ....: method='nearest', ....: tolerance='1 day') ....: Out[60]:
0 points | 1937 pages | 12.03 MB | 1 year ago

pandas: powerful Python data analysis toolkit - 0.19.1
object In [56]: s.drop_duplicates(keep=False) Out[56]: 2 C 5 D dtype: object • Reindex now has a tolerance argument that allows for finer control of Limits on filling while reindexing (GH10411): In [57]: method='nearest', ....: tolerance=0.2) ....: Out[58]: t x 0.1 2000-01-01 0.0 1.9 2000-01-03 2.0 3.5 NaT NaN When used on a DatetimeIndex, TimedeltaIndex or PeriodIndex, tolerance will be coerced into a Timedelta. This allows you to specify tolerance with a string: In [59]: df = df.set_index('t') In [60]: df.reindex(pd.to_datetime(['1999-12-31']), ....: method='nearest', ....: tolerance='1 day') ....: Out[60]:
0 points | 1943 pages | 12.06 MB | 1 year ago

pandas: powerful Python data analysis toolkit - 0.24.0
contained a DST transition (GH18885) • Bug in merge_asof() when merging on float values within defined tolerance (GH22981) • Bug in pandas.concat() when concatenating a multicolumn DataFrame with tz-aware data perform any checks on the order of the index. Limits on filling while reindexing The limit and tolerance arguments provide additional control over filling while reindexing. Limit specifies the maximum D, dtype: float64 In contrast, tolerance specifies the maximum distance between the index and indexer values: In [234]: ts2.reindex(ts.index, method='ffill', tolerance='1 day') Out[234]: 2000-01-03 -0
0 points | 2973 pages | 9.90 MB | 1 year ago
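
The pandas excerpts above describe `tolerance` as a cap on the distance between an original index label and the label it is used to fill. Since the excerpts are truncated, here is a minimal self-contained sketch of that behaviour. The sample dates and values are made up for illustration and are not taken from the pandas documentation; only the `reindex` parameters `method`, `limit` and `tolerance` come from the excerpts.

```python
import pandas as pd

# Hypothetical sample data: four consecutive days, values chosen arbitrarily.
full_index = pd.date_range("2000-01-03", periods=4, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=full_index)

# Keep only the first and last observations (2000-01-03 and 2000-01-06).
ts2 = ts.iloc[[0, 3]]

# Forward-fill, but only from labels at most one day away from the target.
tolerant = ts2.reindex(ts.index, method="ffill", tolerance="1 day")

# Forward-fill at most one consecutive missing label.
limited = ts2.reindex(ts.index, method="ffill", limit=1)

# Nearest-label matching with a string tolerance, as in the older excerpts.
targets = pd.to_datetime(["1999-12-31", "2000-01-09"])
nearest = ts2.reindex(targets, method="nearest", tolerance="1 day")

print(tolerant)
print(limited)
print(nearest)
```

In this sketch 2000-01-04 is filled from 2000-01-03 (one day away), while 2000-01-05 stays NaN because its closest earlier label is two days away. In the `nearest` case neither target is within one day of an existing label, so both come back as NaN.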