PyFlink 1.15 Documentationconda/lib/python3.7/site-packages/ ˓→pyflink/table/utils.py:55: FutureWarning: Schema passed to names= option, please pass␣ ˓→schema= explicitly. Will raise exception in future return pa.RecordBatch conda/lib/python3.7/site-packages/ ˓→pyflink/table/utils.py:55: FutureWarning: Schema passed to names= option, please pass␣ ˓→schema= explicitly. Will raise exception in future return pa.RecordBatch DataStream API. [7]: from pyflink.common import Row from pyflink.datastream import FlatMapFunction class MyFlatMapFunction(FlatMapFunction): def flat_map(self, value): for s in str(value.data).split('|'):0 码力 | 36 页 | 266.77 KB | 1 年前3
PyFlink 1.16 Documentationconda/lib/python3.7/site-packages/ ˓→pyflink/table/utils.py:55: FutureWarning: Schema passed to names= option, please pass␣ ˓→schema= explicitly. Will raise exception in future return pa.RecordBatch conda/lib/python3.7/site-packages/ ˓→pyflink/table/utils.py:55: FutureWarning: Schema passed to names= option, please pass␣ ˓→schema= explicitly. Will raise exception in future return pa.RecordBatch DataStream API. [7]: from pyflink.common import Row from pyflink.datastream import FlatMapFunction class MyFlatMapFunction(FlatMapFunction): def flat_map(self, value): for s in str(value.data).split('|'):0 码力 | 36 页 | 266.80 KB | 1 年前3
Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020subscribing to a topic implicitly involves subscribing to all sub-topics of that topic, too. • Topic names are represented with URL-like notation and some systems also allow the use of wildcards. 21 Content-based0 码力 | 33 页 | 700.14 KB | 1 年前3
Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics Spring 2020systems • Almost universally supported across streaming systems and languages albeit with various names and semantics • Allow un-blocking the processing of blocking operators by defining bounded portions0 码力 | 53 页 | 532.37 KB | 1 年前3
Streaming in Apache FlinkExamples Tuples Tuple1 through Tuple25 types. POJOs A POJO (plain old Java object) is any Java class that • has an empty default constructor • all fields are either ◦public, or ◦have a default getter Tuple2<>("Fred", 35); // zero based index! String name = person.f0; Integer age = person.f1; public class Person { public String name; public Integer age; public Person() {}; public total fare collected Lab 1 -- Ride Cleansing Transforming Data Transforming Data public static class EnrichedRide extends TaxiRide { public int startCell; public int endCell; public EnrichedRide()0 码力 | 45 页 | 3.00 MB | 1 年前3
Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics Spring 2020Boston University 2020 DataStream API Basics Vasiliki Kalavri | Boston University 2020 case class Reading(id: String, time: Long, temp: Double) object MaxSensorReadings { def main(args: Array[String]) temperature”) } } Example: Sensor Readings 7 Vasiliki Kalavri | Boston University 2020 case class Reading(id: String, time: Long, temp: Double) object MaxSensorReadings { def main(args: Array[String]) temperature reading Example: Sensor Readings 8 Vasiliki Kalavri | Boston University 2020 case class Reading(id: String, time: Long, temp: Double) object MaxSensorReadings { def main(args: Array[String])0 码力 | 26 页 | 3.33 MB | 1 年前3
State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020data types handled by the state are specified as Class or TypeInformation objects. 16 Registering state Vasiliki Kalavri | Boston University 2020 class TemperatureAlertFunction(val threshold: Double) assign name and get the state handle In the operator (FlatMap) class In the open() method Vasiliki Kalavri | Boston University 2020 class TemperatureAlertFunction(val threshold: Double) extends Rich flatMap(new MatchFunction()); Java example 20 Vasiliki Kalavri | Boston University 2020 public static class EnrichmentFunction extends RichCoFlatMapFunction> { 0 码力 | 24 页 | 914.13 KB | 1 年前3
Windows and triggers - CS 591 K1: Data Stream Processing and Analytics Spring 2020average temperature per sensor. // The accumulator holds the sum of temperatures and an event count. class AvgTempFunction extends AggregateFunction [(String, Double), (String, Double, Int), (String, Double)] watermark. ProcessWindowFunction 18 Vasiliki Kalavri | Boston University 2020 public abstract class ProcessWindowFunctionextends AbstractRichFunction { // Evaluates KEY key, Context ctx, Iterable vals, Collector out) throws Exception; public abstract class Context implements Serializable { // Returns the metadata of the window public 0 码力 | 35 页 | 444.84 KB | 1 年前3
Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020Announcements, updates, discussions • Website: vasia.github.io/dspa20 • Syllabus: /syllabus.html • Class schedule: /lectures.html • including today’s slides • Piazza: piazza.com/bu/spring2020/cs591k1/home applications 6 Vasiliki Kalavri | Boston University 2020 Grading Scheme (1) • No Exam • 5 in-class quizzes (10%): • Each quiz contributes 2% to the final grade • 3 hands-on assignments (40%): Kalavri | Boston University 2020 Schedule 9 vasia.github.io/dspa20/ lectures.html deadline no class guest lecture quizzes and announcements Vasiliki Kalavri | Boston University 2020 Guest Lectures0 码力 | 34 页 | 2.53 MB | 1 年前3
Scalable Stream Processing - Spark Streaming and Flinksource: extend the Receiver class. ▶ Implement onStart() and onStop(). ▶ Call store(data) to store received data inside Spark. 16 / 79 Input Operations - Custom Sources (2/3) class CustomReceiver(host: String 79 Basic Operations ▶ Most of operations on DataFrame/Dataset are supported for streaming. case class Call(action: String, time: Timestamp, id: Int) val df: DataFrame = spark.readStream.json("s3://logs")0 码力 | 113 页 | 1.22 MB | 1 年前3
共 13 条
- 1
- 2













