Scalable Stream Processing - Spark Streaming and Flink
    ... foreach(record => connection.send(record)) connection.close() } } ...
    Word Count in Spark Streaming (slides 34-38 of 79)
    (1/6) ▶ First we create a StreamingContext:
        import org...
        ... .setMaster("local[2]").setAppName("NetworkWordCount")
        val ssc = new StreamingContext(conf, Seconds(1))
    (2/6) ▶ Create a DStream that represents streaming data from a TCP source.
    (3/6) ▶ Use flatMap on the stream to split each record's text into words. It creates a new DStream:
        val words = lines.flatMap(_.split(" "))
    ...
    113 pages | 1.22 MB | 1 year ago
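The excerpt above cuts off partway through the word-count walkthrough. For reference, a minimal end-to-end sketch of the NetworkWordCount pipeline the slides step through, in Scala; the socket source on localhost:9999 and the final reduceByKey/print/start steps are assumptions, since the excerpt does not show them:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object NetworkWordCount {
      def main(args: Array[String]): Unit = {
        // Local StreamingContext with two worker threads and a 1-second batch interval
        val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
        val ssc  = new StreamingContext(conf, Seconds(1))

        // DStream of text lines from a TCP source (host/port are assumed; not shown in the excerpt)
        val lines = ssc.socketTextStream("localhost", 9999)

        // Split each record into words, pair each word with 1, and sum the pairs per batch
        val words      = lines.flatMap(_.split(" "))
        val wordCounts = words.map(word => (word, 1)).reduceByKey(_ + _)

        wordCounts.print()

        ssc.start()            // start the computation
        ssc.awaitTermination() // wait for it to terminate
      }
    }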
PyFlink 1.15 Documentation
    ... githubusercontent.com/apache/flink/master/flink-python/pyflink/examples/table/word_count.py -o word_count.py
    python3 word_count.py
    # You will see output like the following:
    # Use --input to specify a file input
    If there are any problems, you can check the logging messages in the ...
    ... :8081 \
        -pyclientexec /path/to/venv/bin/python3 \
        -pyexec /path/to/venv/bin/python3 \
        -py word_count.py
    The example above assumes that a Python virtual environment is already available ...
    36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation
    ... githubusercontent.com/apache/flink/master/flink-python/pyflink/examples/table/word_count.py -o word_count.py
    python3 word_count.py
    If there are any problems, you can check the logging messages in the ...
    ... :8081 \
        -pyclientexec /path/to/venv/bin/python3 \
        -pyexec /path/to/venv/bin/python3 \
        -py word_count.py
    The example above assumes that a Python virtual environment is already available ...
    36 pages | 266.80 KB | 1 year ago
Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics, Spring 2020 (Vasiliki Kalavri, Boston University)
    ... Sources   2. Apply Operators   3. Output to Sinks
    Streaming word count:
        textStream
          .flatMap {_.split("\\W+")}
          .map {(_, 1)}
          .keyBy(0)
          .sum(1)
          .print()
    26 pages | 3.33 MB | 1 year ago
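The excerpt shows only the operator chain of the streaming word count. A minimal runnable sketch around it, using Flink's Scala DataStream API; the socket source and the surrounding environment setup are assumptions (the slides only name textStream), and the tuple-index keyBy(0)/sum(1) style matches the excerpt but is deprecated in recent Flink releases:

    import org.apache.flink.streaming.api.scala._

    object StreamingWordCount {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Text source; a socket on localhost:9999 is assumed here
        val textStream: DataStream[String] = env.socketTextStream("localhost", 9999)

        textStream
          .flatMap { _.split("\\W+") }  // split each line into words
          .map { (_, 1) }               // pair each word with a count of 1
          .keyBy(0)                     // key by the word (tuple field 0)
          .sum(1)                       // running count per word
          .print()

        env.execute("Streaming word count")
      }
    }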
4 results in total













