SparkSQL DataFrame Usage in Detail

I. Usage

1. show
  // Read a JSON file into a DataFrame and print its rows in tabular form.
  def show1(ss: SparkSession): Unit = {
    val df = ss.read.json("E:\\data\\spark\\dataframe\\test\\read\\people.json")
    df.show()
  }
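
  The path suggests a local copy of the people.json sample that ships with the Spark distribution. Assuming that content, the file holds one JSON object per line:

  ```json
  {"name":"Michael"}
  {"name":"Andy", "age":30}
  {"name":"Justin", "age":19}
  ```

  With that input, df.show() prints something like the following (columns come out in alphabetical order, and the missing age appears as null):

  ```
  +----+-------+
  | age|   name|
  +----+-------+
  |null|Michael|
  |  30|   Andy|
  |  19| Justin|
  +----+-------+
  ```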
2. select
  // Project the name column and a derived column age + 1.
  def select1(ss: SparkSession): Unit = {
    val df = ss.read.json("E:\\data\\spark\\dataframe\\test\\read\\people.json")
    df.select(df("name"), df("age") + 1).show()
  }
3. filter
  // Keep only the rows whose age is greater than 28.
  def filter1(ss: SparkSession): Unit = {
    val df = ss.read.json("E:\\data\\spark\\dataframe\\test\\read\\people.json")
    df.filter(df("age") > 28).show()
  }
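
  One point worth noting: in Spark SQL, comparing a null age with > yields null, which filter treats as false, so rows with no age are silently dropped. If you wanted to keep them as well, a sketch would be:

  ```scala
  // Keep rows with age > 28, plus rows where age is missing entirely.
  // (Whether to keep null-age rows is an assumption for illustration.)
  df.filter(df("age") > 28 || df("age").isNull).show()
  ```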
4. groupBy
  // Group by age and count the rows in each group.
  def groupBy1(ss: SparkSession): Unit = {
    val df = ss.read.json("E:\\data\\spark\\dataframe\\test\\read\\people.json")
    df.groupBy("age").count().show()
  }
  
  // SQL style: register the DataFrame as a temporary view, then query it with Spark SQL.
  def groupBy11(ss: SparkSession): Unit = {
    val df = ss.read.json("E:\\data\\spark\\dataframe\\test\\read\\people.json")
    df.createOrReplaceTempView("people")
    val groupByDf = ss.sql("select age, count(age) as num from people group by age")
    groupByDf.show()
    //groupByDf.write.format("csv").save("E:\\data\\spark\\dataframe\\test\\write\\groupBy")
  }
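
  Each method above takes a SparkSession as a parameter. A minimal driver that builds the session and runs them might look like the sketch below; the local[*] master, the app name, and the assumption that the methods live in the same object are all illustrative choices, not from the original post:

  ```scala
  import org.apache.spark.sql.SparkSession

  object DataFrameDemo {
    def main(args: Array[String]): Unit = {
      // Build a session that runs locally, using all available cores.
      val ss = SparkSession.builder()
        .appName("DataFrameDemo")
        .master("local[*]")
        .getOrCreate()

      // Run the examples above (assumed to be defined in this object).
      show1(ss)
      select1(ss)
      filter1(ss)
      groupBy1(ss)
      groupBy11(ss)

      ss.stop()
    }
  }
  ```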