Demo code:
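    // Only specific classes are imported here; the package's implicits are not in scope.
    import org.apache.flink.streaming.api.scala.{DataStream, StreamExecutionEnvironment}
    import org.apache.flink.streaming.api.windowing.time.Time

    object SocketWindowWordCount {
      def main(args: Array[String]): Unit = {
        val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
        val text: DataStream[String] = env.socketTextStream("10.202.42.36", 9000, '\n')
        val windowCounts = text
          .flatMap(w => w.split("\\s"))
          .map(w => WordWithCount(w, 1))
          .keyBy("word")
          .timeWindow(Time.seconds(5), Time.seconds(1))
          .sum("count")
        windowCounts.print().setParallelism(1)
        env.execute("Socket Window WordCount")
      }

      // Field names must match the strings passed to keyBy("word") and sum("count").
      case class WordWithCount(word: String, count: Long)
    }

This code fails to compile.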
Error analysis: an error like this usually means the call requires an implicit parameter that is not in scope. See the source of map and flatMap in Flink:
    /**
     * Creates a new DataStream by applying the given function to every element and flattening
     * the results.
     */
    def flatMap[R: TypeInformation](fun: T => TraversableOnce[R]): DataStream[R] = {
      if (fun == null) {
        throw new NullPointerException("FlatMap function must not be null.")
      }
      val cleanFun = clean(fun)
      val flatMapper = new FlatMapFunction[T, R] {
        def flatMap(in: T, out: Collector[R]) { cleanFun(in) foreach out.collect }
      }
      flatMap(flatMapper)
    }
    /**
     * Creates a new DataStream by applying the given function to every element of this DataStream.
     */
    def map[R: TypeInformation](fun: T => R): DataStream[R] = {
      if (fun == null) {
        throw new NullPointerException("Map function must not be null.")
      }
      val cleanFun = clean(fun)
      val mapper = new MapFunction[T, R] {
        def map(in: T): R = cleanFun(in)
      }
      map(mapper)
    }

Both definitions carry the context bound [R: TypeInformation], which the compiler expands into an implicit parameter of type TypeInformation[R]. The program does not bring any such implicit into scope, so the compiler cannot obtain a TypeInformation and reports the error above. Solutions:
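1) Declare the implicit value directly in the code, for example:

    implicit val typeInfo = TypeInformation.of(classOf[Int])

The declared type has to match the result type R of the call. The line above covers Int; the demo would need one for String (the flatMap result) and one for WordWithCount (the map result). With matching implicits in scope, the code compiles without the error above.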
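2) This is not what Flink recommends, however. The recommended approach is to import the following package, which brings Flink's implicit TypeInformation generator (createTypeInformation) into scope:

    import org.apache.flink.streaming.api.scala._

If the data is bounded (a static data set), import this package instead:

    import org.apache.flink.api.scala._

Either import resolves the error above.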
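Both fixes work for the same reason: each one puts a value of the required implicit type in scope at the call site. The mechanism can be sketched without Flink at all. In the sketch below, TypeInfo, TypeInfoImplicits, mapLike and mapLikeDesugared are hypothetical stand-ins invented for illustration; they are not Flink classes or methods.

```scala
// Hypothetical stand-in for Flink's TypeInformation; not the real class.
trait TypeInfo[T] { def describe: String }

// Stand-in for a package whose wildcard import supplies the implicits.
object TypeInfoImplicits {
  implicit val stringInfo: TypeInfo[String] =
    new TypeInfo[String] { def describe: String = "String" }
}

object ContextBoundDemo {
  // Context-bound form, shaped like the quoted map/flatMap signatures.
  def mapLike[R: TypeInfo](value: R): String =
    implicitly[TypeInfo[R]].describe

  // Equivalent form with the implicit parameter list written out.
  def mapLikeDesugared[R](value: R)(implicit ev: TypeInfo[R]): String =
    ev.describe

  // Fix 1: declare the implicit by hand, with a type matching R (Long here).
  def withLocalImplicit(): String = {
    implicit val longInfo: TypeInfo[Long] =
      new TypeInfo[Long] { def describe: String = "Long" }
    mapLike(42L)
  }

  // Fix 2: import the object that provides the implicits.
  // Removing this import makes the call fail to compile,
  // because no implicit TypeInfo[String] can be found.
  def withImport(): String = {
    import TypeInfoImplicits._
    mapLikeDesugared("hello")
  }

  def main(args: Array[String]): Unit = {
    println(withLocalImplicit()) // prints "Long"
    println(withImport())        // prints "String"
  }
}
```

The hand-written implicit only helps for the exact type it was declared for, which is why the import is the more convenient option once several result types are involved.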
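Note: most importantly, the Scala version must be correct. The Scala version used by the project should match the Scala version of the Flink dependencies (the _2.11 or _2.12 suffix in the artifact name).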
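Putting the recommended import together with the demo, a corrected version might look like the sketch below. It assumes a Flink 1.x release in which the Scala DataStream API, keyBy with a field name, and timeWindow are still available, with the Flink Scala streaming dependency on the classpath; the host and port are the ones from the demo and need a text server listening there at run time.

```scala
// The wildcard import supplies the implicit TypeInformation for map/flatMap.
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

object SocketWindowWordCount {

  // Field names match the strings used in keyBy("word") and sum("count").
  case class WordWithCount(word: String, count: Long)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Read newline-delimited text from the socket used in the demo.
    val text: DataStream[String] = env.socketTextStream("10.202.42.36", 9000, '\n')

    val windowCounts = text
      .flatMap(w => w.split("\\s"))                 // needs TypeInformation[String]
      .map(w => WordWithCount(w, 1))                // needs TypeInformation[WordWithCount]
      .keyBy("word")
      .timeWindow(Time.seconds(5), Time.seconds(1)) // 5 s window sliding every 1 s
      .sum("count")

    // Print with a single thread rather than in parallel.
    windowCounts.print().setParallelism(1)

    env.execute("Socket Window WordCount")
  }
}
```

The only change relevant to the compile error is the first import; the renamed case class fields address a separate problem, since keyBy("word") and sum("count") look fields up by name.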