Converting between Java and Scala collections
import java.util
import scala.collection.JavaConverters._

object TestJavaScalaCollection {
  def main(args: Array[String]): Unit = {
    // Build a plain Java list
    val arrayList = new util.ArrayList[String]()
    arrayList.add("hello")
    arrayList.add("word")
    for (i <- 0 until arrayList.size()) {
      println(arrayList.get(i))
    }
    // Java -> Scala: asScala exposes the Java list as a Scala collection
    val scalaList = arrayList.asScala
    for (i <- scalaList) {
      println(i)
    }
    // Scala -> Java: asJava converts back to java.util.List
    val javaList = scalaList.asJava
    for (i <- 0 until javaList.size()) {
      println(javaList.get(i))
    }
  }
}
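One detail worth seeing in code is that `asScala` wraps the Java list rather than copying it, so writes through the Scala side are visible on the Java side. The following is a minimal sketch of that behaviour; the object and method names are made up for illustration. It uses `scala.collection.JavaConverters` as in the example above (on Scala 2.13 and later, `scala.jdk.CollectionConverters` is the non-deprecated equivalent).

```scala
import java.util
import scala.collection.JavaConverters._

// Hypothetical helper object, not part of the original example.
object ConversionViewSketch {
  // Appends through the Scala wrapper and returns the underlying Java list.
  def appendViaScala(): util.List[String] = {
    val javaList = new util.ArrayList[String]()
    javaList.add("hello")
    // asScala wraps the Java list; it does not copy the elements.
    val scalaBuffer = javaList.asScala
    scalaBuffer += "scala" // the write goes through to javaList
    javaList
  }
}
```

Calling `ConversionViewSketch.appendViaScala()` returns a Java list that contains both elements, even though the second one was added from the Scala side.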
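Collection transformations: operators
Scala collections provide a rich set of operators for computing over collections and arrays. These operators generally apply to List, Array, Set, Map, Range, Vector, Iterator, and so on.
Sorting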
- sorted
def sorted[B >: String](implicit ord: scala.math.Ordering[B]): List[String]
scala> var list=List("a","c","d","b")
list: List[String] = List(a, c, d, b)
scala> list.sorted
res6: List[String] = List(a, b, c, d)
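Because the standard library already provides an implicit Ordering[String], callers normally do not need to pass one. To sort by a custom rule instead of the default, supply your own Ordering as the argument.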
scala> var order=new Ordering[String]{
| override def compare(x: String, y: String): Int = {
| x.compareTo(y) * -1
| }
| }
order: Ordering[String] = $anon$1@669cb3c7
scala> list.sorted(order)
res8: List[String] = List(d, c, b, a)
scala> case class User(id:Int,name:String,salary:Double)
defined class User
scala> var users=Array(User(1,"張三",1000.0),User(2,"lisi",1500.0),User(3,"wangwu",800.0))
users: Array[User] = Array(User(1,張三,1000.0), User(2,lisi,1500.0), User(3,wangwu,800.0))
// This fails to compile because no implicit Ordering[User] is in scope
scala> users.sorted
<console>:14: error: No implicit Ordering defined for User.
users.sorted
^
scala> implicit var order=new Ordering[User]{
| override def compare(x: User, y: User): Int = {
| x.salary.compareTo(y.salary) * -1
| }
| }
order: Ordering[User] = $anon$1@188842e2
scala> users.sorted
res10: Array[User] = Array(User(2,lisi,1500.0), User(1,張三,1000.0), User(3,wangwu,800.0))
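The anonymous `Ordering[User]` above can also be derived from a key function. The sketch below assumes the same `User` shape (redeclared here, with ASCII names, so the block is self-contained) and uses `Ordering.by` plus `reverse`. It passes the ordering explicitly, which is one way to avoid leaving an implicit in scope.

```scala
// Hypothetical object name; User mirrors the case class defined in the session above.
object OrderingBySketch {
  case class User(id: Int, name: String, salary: Double)

  val users = List(User(1, "zhangsan", 1000.0), User(2, "lisi", 1500.0), User(3, "wangwu", 800.0))

  // Ordering.by derives an Ordering[User] from the salary; reverse makes it descending.
  val bySalaryDesc: Ordering[User] = Ordering.by[User, Double](_.salary).reverse

  // Passing the ordering explicitly means no implicit value is needed.
  val sortedUsers: List[User] = users.sorted(bySalaryDesc)
}
```

Here `OrderingBySketch.sortedUsers` lists the users from highest to lowest salary, matching the order produced by the hand-written comparator.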
- sortBy: sort by a single attribute
def sortBy[B](f: User => B)(implicit ord: scala.math.Ordering[B]): Array[User]
scala> users.sortBy(u=>u.salary)
res13: Array[User] = Array(User(3,wangwu,800.0), User(1,張三,1000.0), User(2,lisi,1500.0))
- sortWith
def sortWith(lt: ((String, Int), (String, Int)) => Boolean): List[(String, Int)]
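// `tuples` is assumed to be a List[(String, Int)] defined earlier in the session; its definition is not shown here
scala> tuples.sortWith((t1,t2)=> {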
| if(t1._1.equals(t2._1)){
| (t1._2.compareTo(t2._2) * -1) > 0
| }else{
| t1._1.compareTo(t2._1) > 0
| }
| })
res19: List[(String, Int)] = List((d,4), (c,3), (b,1), (a,1), (a,2))
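The `sortWith` lambda above encodes "key descending, then value ascending". As a sketch, the same rule can be expressed as a composed `Ordering`. The input list below is hypothetical, since the original `tuples` definition is not shown; any list with the same five elements gives the same result.

```scala
// Hypothetical object name and input data.
object TupleOrderingSketch {
  // Assumed input; the original `tuples` value does not appear in the text.
  val tuples = List(("a", 1), ("b", 1), ("c", 3), ("d", 4), ("a", 2))

  // Key descending, then value ascending: the same rule as the sortWith lambda.
  val keyDescValueAsc: Ordering[(String, Int)] =
    Ordering.Tuple2[String, Int](Ordering[String].reverse, Ordering[Int])

  val sortedTuples: List[(String, Int)] = tuples.sorted(keyDescValueAsc)
}
```

Composing orderings this way keeps the rule declarative, which tends to be easier to extend to a third key than nested `if`/`else` comparisons.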
flatten
Expands the nested elements of a collection; its main use is to reduce the nesting by one level.
def flatten[B](implicit asTraversable: List[String] => scala.collection.GenTraversableOnce[B]): List[B]
scala> var list=List(List("a","b","c"),List("d","e"))
list: List[List[String]] = List(List(a, b, c), List(d, e))
scala> list.flatten
res20: List[String] = List(a, b, c, d, e)
def flatten[B](implicit asTraversable: String => scala.collection.GenTraversableOnce[B]): List[B]
scala> var lines=List("hello word","ni hao")
lines: List[String] = List(hello word, ni hao)
scala> lines.flatten
res26: List[Char] = List(h, e, l, l, o, , w, o, r, d, n, i, , h, a, o)
scala> lines.flatten(line => line.split("\\s+"))
res28: List[String] = List(hello, word, ni, hao)
scala> lines.flatten(line => line.split(" "))
res29: List[String] = List(hello, word, ni, hao)
scala> lines.flatten(_.split(" "))
res37: List[String] = List(hello, word, ni, hao)
map
Applies a function to every element of the collection and returns a new collection of the mapped (transformed) elements.
scala> var list=List(1,2,4,5)
list: List[Int] = List(1, 2, 4, 5)
scala> list.map(item => item *2 )
res35: List[Int] = List(2, 4, 8, 10)
scala> list.map(_ * 2)
res36: List[Int] = List(2, 4, 8, 10)
scala> var lines=List("Hello World","good good study")
lines: List[String] = List(Hello World, good good study)
scala> lines.flatten(_.split("\\s+")).map(w=>(w.toLowerCase,1))
res47: List[(String, Int)] = List((hello,1), (world,1), (good,1), (good,1), (study,1))
flatMap
Transforms each element first, then flattens the result by one level (map followed by flatten).
scala> var lines=List("Hello World","good good study")
lines: List[String] = List(Hello World, good good study)
scala> lines.map(line=> line.split(" ")).flatten
res55: List[String] = List(Hello, World, good, good, study)
scala> lines.flatMap(line=> line.split("\\s+")) // same result as lines.flatten(line=> line.split("\\s+"))
res56: List[String] = List(Hello, World, good, good, study)
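Because `flatMap` flattens whatever the function returns, it also works when the function returns an `Option`: `None` contributes nothing and `Some(x)` contributes `x`. The following sketch (names are illustrative) uses that to keep only the strings that parse as integers.

```scala
import scala.util.Try

// Hypothetical helper object, not from the original text.
object FlatMapOptionSketch {
  // Try(...).toOption yields Some(n) for a valid integer and None otherwise,
  // so flatMap silently drops the entries that fail to parse.
  def parseInts(raw: List[String]): List[Int] =
    raw.flatMap(s => Try(s.trim.toInt).toOption)
}
```

For example, `FlatMapOptionSketch.parseInts(List("1", "x", " 3 "))` keeps `1` and `3` and discards `"x"`.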
filter/filterNot
filter keeps the elements that satisfy the predicate and drops the rest; filterNot does the opposite.
def filter(p: ((Int, String, Double)) => Boolean): List[(Int, String, Double)]
scala> var list=List((1,"zhangsan",1000.0),(2,"lisi",800.0))
list: List[(Int, String, Double)] = List((1,zhangsan,1000.0), (2,lisi,800.0))
scala> list.filter(_._3 >= 1000)
res59: List[(Int, String, Double)] = List((1,zhangsan,1000.0))
scala> list.filterNot(_._3 >= 1000)
res60: List[(Int, String, Double)] = List((2,lisi,800.0))
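When both halves are needed, `partition` returns the `filter` and `filterNot` results together in one pass. A small sketch using the same sample data (the object and value names are made up):

```scala
// Hypothetical object name; the data mirrors the list used above.
object PartitionSketch {
  val staff = List((1, "zhangsan", 1000.0), (2, "lisi", 800.0))

  // partition returns (elements satisfying the predicate, the remaining elements).
  val (high, low) = staff.partition(_._3 >= 1000)
}
```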
distinct
Removes duplicate elements, keeping the first occurrence of each.
scala> val list=List(1,2,2,3)
list: List[Int] = List(1, 2, 2, 3)
scala> list.distinct
res91: List[Int] = List(1, 2, 3)
groupBy
Typically used for statistical analysis: it turns a List or Array into a Map from each key to the elements that share that key.
def groupBy[K](f: String => K): scala.collection.immutable.Map[K,List[String]]
scala> var list=List("a","b","a","c")
scala> list.groupBy(w=>w)
res61: scala.collection.immutable.Map[String,List[String]] = Map(b -> List(b), a -> List(a, a), c -> List(c))
scala> list.groupBy(w=>w).map(t=>(t._1,t._2.size))
res63: scala.collection.immutable.Map[String,Int] = Map(b -> 1, a -> 2, c -> 1)
def groupBy[K](f: Employee => K): scala.collection.immutable.Map[K,List[Employee]]
scala> var emps=List("1,001,zhangsan,1000.0","2,002,lisi,1000.0","3,001,王五,800.0")
scala> case class Employee(id:Int,deptNo:String,name:String,salary:Double)
defined class Employee
scala> emps.map(_.split(",")).map(ts=>Employee(ts(0).toInt,ts(1),ts(2),ts(3).toDouble)).groupBy(emp=>emp.deptNo).map(t=>(t._1,t._2.map(e=>e.salary).sum))
res69: scala.collection.immutable.Map[String,Double] = Map(002 -> 1000.0, 001 -> 1800.0)
scala> emps.map(_.split(",")).map(ts=>Employee(ts(0).toInt,ts(1),ts(2),ts(3).toDouble)).groupBy(emp=>emp.deptNo).map(t=>(t._1,(for(e<- t._2) yield e.salary).sum)).toList.sortBy(t=>t._2).reverse
res1: List[(String, Double)] = List((001,1800.0), (002,1000.0))
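The one-line pipelines above are compact but dense. As a readability sketch, the same per-department total can be written with a small parsing function and pattern-matching lambdas. `parse` and `totalsByDept` are illustrative names, and the parser assumes every record has exactly four well-formed fields.

```scala
// Hypothetical object and method names; Employee mirrors the case class above.
object GroupBySumSketch {
  case class Employee(id: Int, deptNo: String, name: String, salary: Double)

  // Parses one "id,deptNo,name,salary" record; assumes well-formed input.
  def parse(line: String): Employee = {
    val Array(id, deptNo, name, salary) = line.split(",")
    Employee(id.toInt, deptNo, name, salary.toDouble)
  }

  // Total salary per department, largest total first.
  def totalsByDept(lines: List[String]): List[(String, Double)] =
    lines.map(parse)
      .groupBy(_.deptNo)
      .map { case (dept, es) => dept -> es.map(_.salary).sum }
      .toList
      .sortBy { case (_, total) => -total }
}
```

Naming the tuple parts in `case (dept, es)` avoids the `t._1` / `t._2` accessors, which become hard to follow as the chain grows.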
max|min
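Computes the maximum and minimum. Here `list` is assumed to hold the integers 1 to 5, as the sorted output below shows.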
def max[B >: Int](implicit cmp: Ordering[B]): Int
def min[B >: Int](implicit cmp: Ordering[B]): Int
scala> list.max
res5: Int = 5
scala> list.min
res6: Int = 1
scala> list.sorted
res7: List[Int] = List(1, 2, 3, 4, 5)
scala> list.sorted.head
res8: Int = 1
scala> list.sorted.last
res9: Int = 5
maxBy|minBy
Returns the record that holds the largest or smallest value, where the value to compare is chosen by the given function.
def maxBy[B](f: ((Int, String, Int)) => B)(implicit cmp: Ordering[B]): (Int, String, Int)
def minBy[B](f: ((Int, String, Int)) => B)(implicit cmp: Ordering[B]): (Int, String, Int)
scala> var list=List((1,"zhangsan",28),(2,"lisi",20),(3,"wangwu",18))
list: List[(Int, String, Int)] = List((1,zhangsan,28), (2,lisi,20), (3,wangwu,18))
scala> list.maxBy(t=>t._3)
res12: (Int, String, Int) = (1,zhangsan,28)
scala> list.maxBy(t=>t._1)
res13: (Int, String, Int) = (3,wangwu,18)
scala> var emps=List("1,001,zhangsan,1000.0","2,002,lisi,1000.0","3,001,王五,800.0")
emps: List[String] = List(1,001,zhangsan,1000.0, 2,002,lisi,1000.0, 3,001,王五,800.0)
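// Pitfall: the salaries are still Strings here, so maxBy compares them lexicographically ("800.0" > "1000.0") and department 001 gets the wrong answer
scala> emps.map(_.split(",")).map(w=>(w(1),w(3))).groupBy(_._1).map(t=> t._2.maxBy(i=>i._2))
res24: scala.collection.immutable.Map[String,String] = Map(002 -> 1000.0, 001 -> 800.0)
// Converting the salary with toDouble first makes the comparison numeric
scala> emps.map(line=>line.split(",")).map(ts=>(ts(1),ts(3).toDouble)).groupBy(_._1).map(_._2).map(_.maxBy(_._2))
res4: scala.collection.immutable.Iterable[(String, Double)] = List((002,1000.0), (001,1000.0))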
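The difference between the two results comes down to how the values are compared. This minimal sketch isolates it with two hypothetical salary strings: as Strings they are ordered character by character, while as Doubles they are ordered by value.

```scala
// Hypothetical object name; the two strings are the salaries of department 001 above.
object NumericMaxSketch {
  val salaries = List("1000.0", "800.0")

  // String comparison is lexicographic: '8' > '1', so "800.0" is the maximum.
  val maxAsString: String = salaries.max

  // Converting first compares the numeric values instead.
  val maxAsNumber: Double = salaries.map(_.toDouble).max
}
```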
reduce|reduceLeft|reduceRight
def reduce[A1 >: Int](op: (A1, A1) => A1): A1
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)
scala> list.reduce((v1,v2)=>v1+v2)
res7: Int = 15
scala> list.reduceLeft((v1,v2)=>v1+v2)
res8: Int = 15
scala> list.reduceRight((v1,v2)=>v1+v2)
res9: Int = 15
scala> list.reduceRight(_+_)
res17: Int = 15
If the collection is empty, reduce has nothing to start from and throws an exception:
scala> var list=List[Int]()
list: List[Int] = List()
scala> list.reduce((v1,v2)=>v1+v2)
java.lang.UnsupportedOperationException: empty.reduceLeft
at scala.collection.LinearSeqOptimized$class.reduceLeft(LinearSeqOptimized.scala:137)
at scala.collection.immutable.List.reduceLeft(List.scala:84)
at scala.collection.TraversableOnce$class.reduce(TraversableOnce.scala:208)
at scala.collection.AbstractTraversable.reduce(Traversable.scala:104)
... 32 elided
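Two points are easy to miss with addition as the only example: the direction matters for non-associative operators, and `reduceOption` avoids the exception on empty input. A short sketch with illustrative names:

```scala
// Hypothetical object name, not from the original text.
object ReduceSketch {
  val nums = List(1, 5, 3, 4, 2)

  // reduceLeft groups from the left: ((((1 - 5) - 3) - 4) - 2)
  val leftDiff: Int = nums.reduceLeft(_ - _)

  // reduceRight groups from the right: 1 - (5 - (3 - (4 - 2)))
  val rightDiff: Int = nums.reduceRight(_ - _)

  // reduceOption returns None for an empty collection instead of throwing.
  def safeSum(xs: List[Int]): Option[Int] = xs.reduceOption(_ + _)
}
```

With subtraction, `leftDiff` and `rightDiff` differ, which is why plain `reduce` is best reserved for associative operations such as `+`.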
fold|foldLeft|foldRight
def fold[A1 >: Int](z: A1)(op: (A1, A1) => A1): A1
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)
scala> list.fold(0)((z,v)=> z+v)
res12: Int = 15
scala> list.fold(0)(_+_)
res19: Int = 15
// Unlike reduce, fold on an empty collection simply returns the initial value
scala> var list=List[Int]()
list: List[Int] = List()
scala> list.fold(0)((z,v)=> z+v)
res13: Int = 0
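`fold` keeps the accumulator at the element type (or a supertype), whereas `foldLeft` and `foldRight` let the accumulator be any type. The sketch below, with made-up names, uses `foldLeft` to build a `Map` of counts from a list of strings.

```scala
// Hypothetical object name, not from the original text.
object FoldLeftSketch {
  // The accumulator type (Map[String, Int]) differs from the element type (String).
  def countWords(words: List[String]): Map[String, Int] =
    words.foldLeft(Map.empty[String, Int]) { (counts, w) =>
      counts.updated(w, counts.getOrElse(w, 0) + 1)
    }
}
```

For instance, `FoldLeftSketch.countWords(List("a", "b", "a", "c"))` counts `a` twice and the other letters once, and an empty list yields an empty map.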
aggregate
def aggregate[B](z: => B)(seqop: (B, Int) => B,combop: (B, B) => B): B
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)
scala> list.aggregate(0)((z,v)=>z+v,(b1,b2)=>b1+b2)
res29: Int = 15
scala> list.aggregate(0)(_+_,_+_)
res33: Int = 15
scala> var list=List[Int]()
list: List[Int] = List()
scala> list.aggregate(0)((z,v)=>z+v,(b1,b2)=>b1+b2)
res31: Int = 0
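reduce and fold require the result type to match the element type (or a supertype of it), so they are mostly used for sum-like computations. aggregate places no such restriction on the result type, so it can express more complex logic, for example accumulating a count and a sum in order to compute the mean: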
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)
scala> list.aggregate((0,0.0))((z,v)=>(z._1+1,z._2+v),(b1,b2)=> (b1._1+b2._1,b1._2+b2._2))
res34: (Int, Double) = (5,15.0)
Average salary per department:
scala> var emps=List("1,001,zhangsan,1000.0","2,002,lisi,1000.0","3,001,王五,800.0")
emps: List[String] = List(1,001,zhangsan,1000.0, 2,002,lisi,1000.0, 3,001,王五,800.0)
scala> emps.map(line=>line.split(",")).map(ts=>(ts(1),ts(3).toDouble)).groupBy(_._1).map(t=>(t._1,t._2.map(_._2))).map(t=>(t._1,t._2.aggregate((0,0.0))((z,v)=>(z._1+1,z._2+v),(b1,b2)=> (b1._1+b2._1,b1._2+b2._2)))).map(t=>(t._1,t._2._2/t._2._1))
res63: scala.collection.immutable.Map[String,Double] = Map(002 -> 1000.0, 001 -> 900.0)
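The (count, sum) accumulation can be written with `foldLeft` as well, which is worth knowing because `aggregate` is deprecated as of Scala 2.13 in favour of `foldLeft`. The sketch below also performs the final division and returns `None` for an empty list. The names `mean` and `averageByDept` are illustrative, and falling back to 0.0 for an empty group is an arbitrary choice (groupBy never produces empty groups).

```scala
// Hypothetical object and method names, not from the original text.
object AverageSketch {
  // Returns None for an empty list, because the mean is undefined there.
  def mean(xs: List[Double]): Option[Double] =
    if (xs.isEmpty) None
    else {
      // Accumulate (count, sum) in a single pass.
      val (count, sum) = xs.foldLeft((0, 0.0)) { case ((n, s), v) => (n + 1, s + v) }
      Some(sum / count)
    }

  // Average salary per department from "id,deptNo,name,salary" records.
  def averageByDept(lines: List[String]): Map[String, Double] =
    lines.map(_.split(","))
      .map(ts => (ts(1), ts(3).toDouble))
      .groupBy(_._1)
      .map { case (dept, rows) => dept -> mean(rows.map(_._2)).getOrElse(0.0) }
}
```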
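grouped
Splits a one-dimensional collection into fixed-size chunks, adding one level of nesting. The last chunk may be shorter.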
def grouped(size: Int): Iterator[List[Int]]
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)
scala> list.grouped(2)
res73: Iterator[List[Int]] = non-empty iterator
scala> list.grouped(2).toList
res74: List[List[Int]] = List(List(1, 5), List(3, 4), List(2))
zip
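Combines two one-dimensional collections into a single collection of pairs. The result is as long as the shorter input; extra elements of the longer one are dropped.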
scala> var list1=List(1,5,3,4,2)
list1: List[Int] = List(1, 5, 3, 4, 2)
scala> var list2=List("a","b","c")
list2: List[String] = List(a, b, c)
scala> list2.zip(list1)
res75: List[(String, Int)] = List((a,1), (b,5), (c,3))
scala> list1.zip(list2)
res76: List[(Int, String)] = List((1,a), (5,b), (3,c))
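Two related methods cover the common variations on `zip`: `zipAll` pads the shorter side instead of truncating, and `zipWithIndex` pairs each element with its position. A sketch using the same two lists (the padding values `0` and `"?"` are arbitrary choices for illustration):

```scala
// Hypothetical object name; the lists mirror list1 and list2 above.
object ZipSketch {
  val nums = List(1, 5, 3, 4, 2)
  val letters = List("a", "b", "c")

  // zipAll pads the shorter side with the given defaults instead of truncating.
  val padded: List[(Int, String)] = nums.zipAll(letters, 0, "?")

  // zipWithIndex pairs each element with its zero-based position.
  val indexed: List[(String, Int)] = letters.zipWithIndex
}
```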
unzip
Splits a collection of tuples into separate one-dimensional collections, one per tuple position.
scala> var v=List(("a",1),("b",2),("c",3))
v: List[(String, Int)] = List((a,1), (b,2), (c,3))
scala> v.unzip
res90: (List[String], List[Int]) = (List(a, b, c),List(1, 2, 3))
diff|intersect|union
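Compute the difference, intersection, and union of two collections. Note that union on a List is plain concatenation, so duplicates are kept.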
scala> var v=List(1,2,3)
v: List[Int] = List(1, 2, 3)
scala> v.diff(List(2,3,5))
res54: List[Int] = List(1)
scala> var v=List(1,2,3,5)
v: List[Int] = List(1, 2, 3, 5)
scala> v.intersect(List(2,4,6))
res55: List[Int] = List(2)
scala> var v=List(1,2,3,5)
v: List[Int] = List(1, 2, 3, 5)
scala> v.union(List(2,4,6))
res56: List[Int] = List(1, 2, 3, 5, 2, 4, 6)
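Since `union` on lists keeps the repeated `2`, a duplicate-free union needs one more step. The sketch below shows two options under illustrative names: `distinct` after concatenation, which preserves first-seen order, and conversion to `Set`, which gives mathematical set semantics but no guaranteed iteration order.

```scala
// Hypothetical object name; the lists mirror the union example above.
object SetOpsSketch {
  val left = List(1, 2, 3, 5)
  val right = List(2, 4, 6)

  // Concatenation keeps the duplicate 2; distinct removes it, keeping first-seen order.
  val unionNoDup: List[Int] = (left ++ right).distinct

  // Converting to Set gives true set union (iteration order is unspecified).
  val asSets: Set[Int] = left.toSet.union(right.toSet)
}
```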
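sliding
Moves a fixed-size window across the collection and yields each window as a new sub-collection. The first argument is the window size and the second is the step.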
scala> val list=List(1,2,3,4,5,6)
list: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> list.sliding(3,3)
res0: Iterator[List[Int]] = non-empty iterator
scala> list.sliding(3,3).toList
res1: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6))
scala> list.sliding(3,1).toList
res2: List[List[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 6))
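A typical use of a step-1 window is a moving average. The following is a minimal sketch with a made-up function name; it divides by the actual window length, so an input shorter than the window still produces one average over the elements available.

```scala
// Hypothetical object and method names, not from the original text.
object SlidingSketch {
  // Average of each consecutive window of the given size (step 1).
  def movingAverage(xs: List[Int], window: Int): List[Double] =
    xs.sliding(window).map(w => w.sum.toDouble / w.size).toList
}
```

For the list `1` to `6` with a window of 3, the windows are the four lists shown in the `sliding(3,1)` output above, and each is reduced to its mean.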
slice
Extracts a sub-range of the collection: slice(from, until) includes index from and excludes index until.
scala> val list=List(1,2,3,4,5,6)
list: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> list.slice(0,3)
res3: List[Int] = List(1, 2, 3)
scala> list.slice(3,5)
res5: List[Int] = List(4, 5)
Worked examples
① Given the following array:
var arrs=Array("this is a demo","good good study","day day up")
Count how many times each word occurs, and list the words in descending order of count.
scala> arrs.flatMap(_.split(" ")).groupBy(w=>w).map(t=>(t._1,t._2.size)).toList.sortBy(t=>t._2).reverse
res11: List[(String, Int)] = List((day,2), (good,2), (study,1), (a,1), (up,1), (is,1), (demo,1), (this,1))
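② Read a text file and count how many times each word occurs
import scala.collection.mutable.ListBuffer
import scala.io.Source

val source = Source.fromFile("/Users/admin/IdeaProjects/20200203/scala-lang/src/main/resources/t_word")
val lines = ListBuffer[String]()
val reader = source.bufferedReader()
// Read the file line by line until readLine returns null (end of file)
var line = reader.readLine()
while (line != null) {
  lines += line
  line = reader.readLine()
}
lines.flatMap(_.split(" "))      // split every line into words
  .map((_, 1))                   // pair each word with a count of 1
  .groupBy(_._1)                 // group the pairs by word
  .map(x => (x._1, x._2.size))   // word -> number of occurrences
  .toList
  .sortBy(_._2)
  .reverse                       // highest count first
  .foreach(println)
reader.close()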
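The manual `readLine` loop can be replaced by `Source.getLines()`, with `try`/`finally` ensuring the file is closed even if processing fails. The sketch below is one possible arrangement, not the course's own solution. `wordCounts` and `demo` are illustrative names, and `demo` writes a temporary file so that no real path is assumed. Ties are broken alphabetically, which is an added choice to make the output deterministic.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.io.Source

// Hypothetical object and method names, not from the original text.
object FileWordCountSketch {
  // Counts whitespace-separated words in a file, most frequent first.
  def wordCounts(path: Path): List[(String, Int)] = {
    val source = Source.fromFile(path.toFile, "UTF-8")
    try {
      source.getLines()
        .flatMap(_.split("\\s+"))
        .filter(_.nonEmpty)
        .toList // materialise before the source is closed
        .groupBy(identity)
        .map { case (word, hits) => word -> hits.size }
        .toList
        .sortBy { case (word, n) => (-n, word) } // count descending, then word ascending
    } finally source.close()
  }

  // Demo against a temporary file, so no real path is assumed.
  def demo(): List[(String, Int)] = {
    val tmp = Files.createTempFile("t_word", ".txt")
    try {
      Files.write(tmp, "good good study\nday day up".getBytes(StandardCharsets.UTF_8))
      wordCounts(tmp)
    } finally Files.deleteIfExists(tmp)
  }
}
```

Calling `FileWordCountSketch.demo()` runs the whole pipeline on the two sample lines and removes the temporary file afterwards.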
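Course summary
In my view, Scala and Java have a great deal in common, but they tend to be used in different areas of practical development. The point is not that one runs faster than the other, but which language makes a given problem most convenient for the developer to solve. For pure business-domain modelling I personally still prefer Java, because its encapsulation is clear and the resulting code is highly readable. For server-side development, and especially for data processing and analysis, I find Scala the better fit: it offers a rich set of APIs, particularly for collections and network programming, and it has many real-world uses. Spark development is one example, where Scala is the recommended language.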