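Project background
I had been using sbt quite happily, but the Scala projects at my new company are built with Maven, so I had no choice but to switch. Today I found that `mvn test` was not running my scalatest suites, and that the elasticsearch hadoop project assigns randomly generated ids when it writes to an index. This post records how I fixed both problems.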
elastic hadoop
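This project converts between RDDs and Elasticsearch indices in both directions. Some of its parameters are worth checking, though: by default, the `_id` of each document in the index is randomly generated.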
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.hadoop.cfg.ConfigurationOptions
import org.elasticsearch.spark._

/**
 * Created by todd.chen on 16/3/11.
 * email : todd.chen@ximalaya.com
 */
object Application {

  def main(args: Array[String]): Unit = {
    // Expect exactly one argument: the path of the tab-separated input file.
    require(args.length == 1)
    val conf = new SparkConf()
      .set("es.index.auto.create", "true")
      .set("es.nodes", "esnode1,esnode2,esnode3")
      .setMaster("local[*]")
      .setAppName("elastic")
    val sc = new SparkContext(conf)

    /**
     * index
     * mapping userInfo
     * columns (in input order):
     * uid string
     * realName string
     * gender string
     * thirdPartyName string
     */
    // Use the uid field as the document _id instead of a generated one.
    val cfg = Map(ConfigurationOptions.ES_MAPPING_ID -> "uid")
    sc.textFile(args(0))
      .map(_.split("\t"))
      .map(array2Map)
      .saveToEs("test/user", cfg)
  }

  // Pair each column with its field name; zip stops at the shorter side,
  // so a row with extra columns no longer indexes past the name list.
  def array2Map(array: Array[String]): Map[String, String] = {
    val indexName = Vector("uid", "realName", "gender", "thirdPartyName")
    indexName.zip(array).toMap
  }
}
Note the `cfg` map: it makes our own `uid` field the `_id` of each document in the index. Without it, the id is randomly generated.
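As a rough illustration of what that `cfg` entry relies on, here is a minimal sketch in plain Scala, with no Spark or Elasticsearch dependency. `"es.mapping.id"` is the string behind `ConfigurationOptions.ES_MAPPING_ID`; the object `MappingIdSketch` and the helper `hasMappingId` are hypothetical names introduced only for this example. What the connector does with a row whose id field is missing or empty is not covered in this post, so the sketch simply detects such rows so they can be filtered out before writing.

```scala
// Sketch only: plain-Scala stand-ins, not part of elasticsearch-hadoop.
object MappingIdSketch {
  // Same key/value pair as Map(ConfigurationOptions.ES_MAPPING_ID -> "uid").
  val writeCfg: Map[String, String] = Map("es.mapping.id" -> "uid")

  // Hypothetical helper: true when the row has a non-empty value for the
  // field named by "es.mapping.id". If no id field is configured, every
  // row passes, because ids are generated in that case.
  def hasMappingId(row: Map[String, String], cfg: Map[String, String]): Boolean =
    cfg.get("es.mapping.id").forall(field => row.get(field).exists(_.nonEmpty))
}
```

Under these assumptions, the check could sit between `.map(array2Map)` and `.saveToEs("test/user", cfg)` as `.filter(MappingIdSketch.hasMappingId(_, cfg))`.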
scalatest
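In sbt, scalatest suites run directly with the test task, but `mvn test` does not execute them out of the box; it needs plugin support. Add the following plugin to the pom.xml: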
<!-- enable scalatest -->
<plugin>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest-maven-plugin</artifactId>
  <version>1.0</version>
  <configuration>
    <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
    <junitxml>.</junitxml>
    <filereports>WDF TestSuite.txt</filereports>
  </configuration>
  <executions>
    <execution>
      <id>test</id>
      <goals>
        <goal>test</goal>
      </goals>
    </execution>
  </executions>
</plugin>
My test code:
import org.scalatest.{FlatSpec, Matchers}

/**
 * Created by todd.chen on 16/3/11.
 * email : todd.chen@ximalaya.com
 */
class Array2MapSpec extends FlatSpec with Matchers {

  final lazy val array = Array("1", "a", "male", "qq")
  final lazy val app = Application

  "Array" should "format to Map" in {
    val map = app.array2Map(array)
    // Type arguments are erased at runtime, so only the Map shape is checked.
    assert(map.isInstanceOf[Map[_, _]])
  }

  "Map" should "have key 'uid'" in {
    val map = app.array2Map(array)
    map should contain key ("uid")
  }

  "Map" should "have value 'a' with key 'realName'" in {
    val map = app.array2Map(array)
    // Checks presence and value in one step.
    map.get("realName") shouldBe Some("a")
  }
}
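The spec above exercises one complete four-column row. Lines split on tabs can be shorter than that, because `String.split` drops trailing empty fields. The following is a minimal sketch, in plain Scala without scalatest or Spark, of how row length affects the resulting map; `RowShapeSketch`, `array2MapSafe`, and `splitKeepingEmpty` are hypothetical names used only here, not part of the project.

```scala
// Sketch only: shows how row length affects the field map.
object RowShapeSketch {
  private val fieldNames = Vector("uid", "realName", "gender", "thirdPartyName")

  // Hypothetical variant of array2Map: zip stops at the shorter side, so a
  // short row yields fewer keys and extra columns are dropped.
  def array2MapSafe(array: Array[String]): Map[String, String] =
    fieldNames.zip(array).toMap

  // A negative limit makes split keep trailing empty fields.
  def splitKeepingEmpty(line: String): Array[String] = line.split("\t", -1)
}
```

For example, `"1\ta\t\t".split("\t")` has only two elements, so `array2MapSafe` returns just the `uid` and `realName` entries, while `splitKeepingEmpty` on the same line returns all four columns, the last two being empty strings.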
Running it:
[INFO] --- scalatest-maven-plugin:1.0:test (test) @ elstic-bi ---
Discovery starting.
Discovery completed in 195 milliseconds.
Run starting. Expected test count is: 3
Array2MapSpec:
Array
- should format to Map
Map
- should have key 'uid'
Map
- should have value 'a' with key 'realName'
Run completed in 331 milliseconds.
Total number of tests run: 3
Suites: completed 2, aborted 0
Tests: succeeded 3, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.858 s
[INFO] Finished at: 2016-03-11T20:57:41+08:00
[INFO] Final Memory: 30M/512M
[INFO] ------------------------------------------------------------------------
Done.