I created a Spark SQL function with udf. The code is as follows:
val getKeyWordsFun = udf((con:Array[String],fea:Vector)=>{
// function body
});
It is used like this:
idfDf.withColumn("keywords",getKeyWordsFun(col("contents"),col("idf_features")));
Running it fails with this error:
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$3: (array<string>, vector) => string)
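Solution: change Array[String] to Seq[String]. Spark hands an ArrayType column to a Scala UDF as a scala.collection.Seq (typically a WrappedArray), not as a JVM array, so a parameter declared as Array[String] cannot be cast at runtime. With Seq[String] the problem goes away. The code is as follows: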
val getKeyWordsFun = udf((con:Seq[String],fea:Vector)=>{
// function body
});
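To make the fix concrete, here is a minimal end-to-end sketch. Several things in it are assumptions and not from the original code: the function body (`pickKeyword`, which simply returns the word at the position of the largest weight), the sample row, the local SparkSession, and the choice of `org.apache.spark.ml.linalg.Vector` as the Vector type. Replace them with your own logic and data.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object KeywordsUdfExample {
  // Hypothetical function body, for illustration only: return the word whose
  // position matches the index of the largest weight. Real IDF features are
  // usually indexed by term hash or vocabulary id, not by word position.
  def pickKeyword(con: Seq[String], fea: Vector): String = {
    if (con.isEmpty || fea.size == 0) ""
    else {
      val best = fea.argmax
      if (best >= 0 && best < con.length) con(best) else ""
    }
  }

  // The array<string> column is declared as Seq[String], not Array[String].
  val getKeyWordsFun = udf((con: Seq[String], fea: Vector) => pickKeyword(con, fea))

  def main(args: Array[String]): Unit = {
    // Local session, assumed here only so the sketch can run on its own.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("udf-seq-example")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample row with the same column names as in the post.
    val idfDf = Seq(
      (Seq("spark", "udf", "array"), Vectors.dense(0.1, 0.9, 0.3))
    ).toDF("contents", "idf_features")

    idfDf
      .withColumn("keywords", getKeyWordsFun(col("contents"), col("idf_features")))
      .show(false)

    spark.stop()
  }
}
```

If the function body really needs an Array[String], keep Seq[String] in the signature and call con.toArray inside the function.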