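Background
I was comfortable with sbt, but the Scala projects at my new company are built with Maven, so I had no choice but to switch. Today I found that mvn test did not run my ScalaTest suites, and that the elasticsearch-hadoop project generates random document ids by default, so I sorted out both problems.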
elastic hadoop
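This project converts between RDDs and indices in both directions, but a few parameters deserve a look: by default, the _id of each document in the index is randomly generated.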
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.hadoop.cfg.ConfigurationOptions
import org.elasticsearch.spark._

/**
  * Created by todd.chen on 16/3/11.
  * email : todd.chen@ximalaya.com
  */
object Application {

  def main(args: Array[String]): Unit = {
    // Expect exactly one argument: the path of the tab-separated input file.
    require(args.length == 1)
    val conf = new SparkConf()
      .set("es.index.auto.create", "true")
      .set("es.nodes", "esnode1,esnode2,esnode3")
      .setMaster("local[*]")
      .setAppName("elastic")
    val sc = new SparkContext(conf)

    /**
      * index
      * mapping userInfo
      * column :
      * uid string
      * realName string
      * thirdPartyName string
      * gender String
      */
    // Use each document's "uid" field as its _id instead of a generated one.
    val cfg = Map(ConfigurationOptions.ES_MAPPING_ID -> "uid")
    sc.textFile(args(0))
      .map(_.split("\t")).map(array2Map).saveToEs("test/user", cfg)
  }

  // Pairs each column value with its field name, by position.
  def array2Map(array: Array[String]): Map[String, String] = {
    val indexName = Vector("uid", "realName", "gender", "thirdPartyName")
    array.zipWithIndex.map { case (v, i) => indexName(i) -> v }.toMap
  }
}
Note the cfg: it makes the custom uid field the document's id in the index. Without it, the id is randomly generated.
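Since uid ends up as the document id, it may be worth dropping rows that have no usable uid before they are written. Below is a minimal sketch of that idea in plain Scala, assuming tab-separated lines with four columns in the same order array2Map uses. UserLineParser and parseLine are hypothetical names made up for this illustration; they are not part of elasticsearch-hadoop or of the code above.

```scala
object UserLineParser {
  // Column order assumed to match array2Map in this post.
  val columns: Vector[String] = Vector("uid", "realName", "gender", "thirdPartyName")

  // Hypothetical helper: returns None when the column count is wrong
  // or when the uid column is empty, so such rows can be filtered out.
  def parseLine(line: String): Option[Map[String, String]] = {
    // A limit of -1 keeps trailing empty fields instead of dropping them.
    val fields = line.split("\t", -1)
    if (fields.length == columns.length && fields(0).nonEmpty)
      Some(columns.zip(fields).toMap)
    else
      None
  }
}
```

With something like this, the pipeline could use flatMap(UserLineParser.parseLine) in place of the split and array2Map steps, so malformed rows are skipped rather than indexed.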
scalatest
As we know, ScalaTest suites run directly under sbt, but mvn test does not execute them without plugin support. Add the following plugin to pom.xml:
<!-- enable scalatest -->
<plugin>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest-maven-plugin</artifactId>
  <version>1.0</version>
  <configuration>
    <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
    <junitxml>.</junitxml>
    <filereports>WDF TestSuite.txt</filereports>
  </configuration>
  <executions>
    <execution>
      <id>test</id>
      <goals>
        <goal>test</goal>
      </goals>
    </execution>
  </executions>
</plugin>
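The ScalaTest plugin is commonly paired with a second fragment that turns off Surefire, so the test phase is handled by ScalaTest alone. This is a sketch of that companion configuration rather than something taken from my own pom; the version number is only an example and should match whatever your build already uses.

```xml
<!-- disable surefire so only the scalatest plugin runs in the test phase -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <!-- example version only; adjust to your build -->
  <version>2.7</version>
  <configuration>
    <skipTests>true</skipTests>
  </configuration>
</plugin>
```

It goes in the same plugins section as the scalatest-maven-plugin entry.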
My test code:
import org.scalatest.{FlatSpec, Matchers}

/**
  * Created by todd.chen on 16/3/11.
  * email : todd.chen@ximalaya.com
  */
class Array2MapSpec extends FlatSpec with Matchers {

  final lazy val array = Array("1", "a", "male", "qq")
  final lazy val app = Application

  "Array" should "format to Map" in {
    // The type ascription is checked by the compiler; isInstanceOf cannot
    // verify the type arguments at runtime because of erasure.
    val map: Map[String, String] = app.array2Map(array)
    map should have size 4
  }

  "Map" should "have key 'uid'" in {
    val map = app.array2Map(array)
    map should contain key "uid"
  }

  "Map" should "have value 'a' with key 'realName'" in {
    val map = app.array2Map(array)
    map.get("realName") shouldBe Some("a")
  }
}
Running it:
[INFO] --- scalatest-maven-plugin:1.0:test (test) @ elstic-bi ---
Discovery starting.
Discovery completed in 195 milliseconds.
Run starting. Expected test count is: 3
Array2MapSpec:
Array
- should format to Map
Map
- should have key 'uid'
Map
- should have value 'a' with key 'realName'
Run completed in 331 milliseconds.
Total number of tests run: 3
Suites: completed 2, aborted 0
Tests: succeeded 3, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.858 s
[INFO] Finished at: 2016-03-11T20:57:41+08:00
[INFO] Final Memory: 30M/512M
[INFO] ------------------------------------------------------------------------
Done.