Learning foldLeft in Scala

With some free time on my hands, I was browsing code on Stack Overflow and came across a function I had not used before: foldLeft. My notes are below.
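Before the Spark examples, here is a minimal sketch of foldLeft on an ordinary Scala List. The object and value names are made up for illustration; only foldLeft itself comes from the standard library.

```scala
object FoldLeftBasics {
  // foldLeft(initial)(op) walks the collection from left to right,
  // feeding the running result (the accumulator) into op at each step.
  val numbers = List(1, 2, 3, 4)

  // Evaluates as ((((0 + 1) + 2) + 3) + 4).
  val sum: Int = numbers.foldLeft(0) { (acc, n) => acc + n }

  // The accumulator type may differ from the element type.
  val joined: String = numbers.foldLeft("start") { (acc, n) => s"$acc-$n" }

  def main(args: Array[String]): Unit = {
    println(sum)    // 10
    println(joined) // start-1-2-3-4
  }
}
```

In the Spark examples that follow, the accumulator is a DataFrame and the collection is a list of column names.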

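The first example strips all whitespace from the name and country columns. foldLeft starts from sourceDF and applies withColumn once per column name (run in spark-shell, where spark.implicits._ is already in scope for toDF):

import org.apache.spark.sql.functions.{col, regexp_replace}

val sourceDF = Seq(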
  ("  p a   b l o", "Paraguay"),
  ("Neymar", "B r    asil")
).toDF("name", "country")

val actualDF = Seq(
  "name",
  "country"
).foldLeft(sourceDF) { (memoDF, colName) =>
  memoDF.withColumn(
    colName,
    regexp_replace(col(colName), "\\s+", "")
  )
}

actualDF.show()
+------+--------+
|  name| country|
+------+--------+
| pablo|Paraguay|
|Neymar|  Brasil|
+------+--------+
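
To see what the fold is doing, it can help to unroll it by hand. The sketch below does not use Spark: a Map stands in for a single DataFrame row, and stripWhitespace is a hypothetical helper playing the role of withColumn plus regexp_replace. It only illustrates that folding over the column names gives the same result as chaining the calls one after another.

```scala
object FoldLeftUnrolled {
  // Stand-in for one DataFrame row: column name -> value (an assumption for illustration).
  type Row = Map[String, String]

  // Hypothetical helper: remove all whitespace from one "column" of the row.
  def stripWhitespace(row: Row, colName: String): Row =
    row.updated(colName, row(colName).replaceAll("\\s+", ""))

  val source: Row = Map("name" -> "  p a   b l o", "country" -> "Paraguay")

  // Folding over the column names ...
  val folded: Row = Seq("name", "country").foldLeft(source) { (memo, colName) =>
    stripWhitespace(memo, colName)
  }

  // ... gives the same result as chaining the calls by hand.
  val chained: Row = stripWhitespace(stripWhitespace(source, "name"), "country")

  def main(args: Array[String]): Unit = {
    println(folded == chained) // true
    println(folded("name"))    // pablo
  }
}
```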
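
The second example uses the same pattern to rename every column, converting the names to lowercase and replacing spaces with underscores:

val sourceDF = Seq(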
  ("funny", "joke")
).toDF("A b C", "de F")

sourceDF.show()
+-----+----+
|A b C|de F|
+-----+----+
|funny|joke|
+-----+----+

val actualDF = sourceDF
  .columns
  .foldLeft(sourceDF) { (memoDF, colName) =>
    memoDF
      .withColumnRenamed(
        colName,
        colName.toLowerCase().replace(" ", "_")
      )
  }

actualDF.show()
+-----+----+
|a_b_c|de_f|
+-----+----+
|funny|joke|
+-----+----+
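
The renaming logic can also be wrapped in a reusable function and applied with transform:

import org.apache.spark.sql.DataFrame

// Lowercase a column name and replace spaces with underscores.
def toSnakeCase(str: String): String = {
  str.toLowerCase().replace(" ", "_")
}

// Rename every column of df with toSnakeCase.
def snakeCaseColumns(df: DataFrame): DataFrame = {
  df.columns.foldLeft(df) { (memoDF, colName) =>
    memoDF.withColumnRenamed(colName, toSnakeCase(colName))
  }
}

val sourceDF = Seq(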
  ("funny", "joke")
).toDF("A b C", "de F")
val actualDF = sourceDF.transform(snakeCaseColumns)
actualDF.show()
+-----+----+
|a_b_c|de_f|
+-----+----+
|funny|joke|
+-----+----+
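
The same fold can also run over a list of transformation functions rather than a list of column names. The sketch below is generic and does not depend on Spark; applyAll and the three string steps are hypothetical names chosen for illustration.

```scala
object ComposeSteps {
  // Apply a list of A => A steps in order, threading the result through.
  def applyAll[A](start: A, steps: Seq[A => A]): A =
    steps.foldLeft(start) { (acc, step) => step(acc) }

  val trim: String => String = _.trim
  val lower: String => String = _.toLowerCase
  val underscores: String => String = _.replace(" ", "_")

  val result: String = applyAll("  A b C ", Seq(trim, lower, underscores))

  def main(args: Array[String]): Unit = {
    println(result) // a_b_c
  }
}
```

With Spark, A would be DataFrame and each step a function such as snakeCaseColumns, so a whole pipeline of column operations could be applied in one fold.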

References:

1. How can I concat several float columns into one ArrayType(FloatType()) in spark DataFrame?

2. Performing operations on multiple columns in a Spark DataFrame with foldLeft
