Creating a Maven Project and Configuring Scala in IntelliJ IDEA on Windows 10

Contents

1. In IDEA, create a new Project --> Maven --> Next

2. GroupId is usually your organization's name; ArtifactId is the project name --> Next

3. Click Finish

4. Directory structure

5. Unzip apache-maven-3.3.9-bin.zip

6. Open settings.xml in the conf directory and change the local repository path

7. In IDEA, open File --> Settings --> search for maven --> point to the unzipped Maven directory and the modified settings file --> OK --> choose Enable Auto-Import in the pop-up at the bottom right

8. Looking up and adding Maven dependencies

9. Configure the Maven environment variables

10. Open cmd --> mvn -v

11. Configure Scala

12. Configure the pom file

13. Test whether the environment works


1. In IDEA, create a new Project --> Maven --> Next

2. GroupId is usually your organization's name; ArtifactId is the project name --> Next

3. Click Finish

4. Directory structure

.idea stores IDEA's project metadata and is tied to the current workspace location; if you copy the project to another machine, delete this directory and reopen the project so IDEA can regenerate it.

src is the directory where you write your code.

pom.xml declares the project's dependencies, i.e. the jar packages the project uses.
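For reference, a freshly generated Maven project typically has a layout like the one below (the .iml module file is named after your ArtifactId; the exact contents depend on the archetype used):

<project root>
├── .idea/              IDEA metadata (see the note above)
├── src/
│   ├── main/java/      application sources
│   └── test/java/      test sources
├── <artifactId>.iml    IDEA module file
└── pom.xml             Maven dependencies and build configuration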

5. Unzip apache-maven-3.3.9-bin.zip

6. Open settings.xml in the conf directory and change the local repository path

Copy the commented-out <localRepository> example (around line 53 of settings.xml), set it to your local repository path, and save; I use <localRepository>D:\maven\repository</localRepository>.
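The element sits directly under the <settings> root element, so after the edit that part of the file looks like this (the path is whatever local folder you chose):

<settings ...>
    <localRepository>D:\maven\repository</localRepository>
    ...
</settings>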

7. In IDEA, open File --> Settings --> search for maven --> point to the unzipped Maven directory and the modified settings file --> OK --> choose Enable Auto-Import in the pop-up at the bottom right

8. Looking up and adding Maven dependencies
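For example, you can look up an artifact's coordinates on https://mvnrepository.com and copy the generated <dependency> block into the <dependencies> section of pom.xml. The junit dependency used further down in this post looks like this:

<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
</dependency>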

9. Configure the Maven environment variables
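For example (the install path below is simply wherever you unzipped Maven), add a MAVEN_HOME system variable and append its bin directory to Path:

MAVEN_HOME = D:\apache-maven-3.3.9
Path       = ...;%MAVEN_HOME%\bin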

10. Open cmd --> run mvn -v to verify the setup; it should print the Maven version, Java version, and OS details

11. Configure Scala

Under main, create a scala folder (do the same under test) --> File --> Project Structure

--> Modules --> select the scala folder under main --> click Sources

--> Modules --> select the scala folder under test --> click Tests

--> Libraries --> click + --> Scala SDK --> OK

12. Configure the pom file

Append the following configuration to the existing pom.xml:

    <properties>
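        <!-- scala.version here is the Scala binary version; it is only used as the artifact-name suffix below -->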
        <spark.version>2.2.0</spark.version>
        <scala.version>2.11</scala.version>
        <hadoop.version>2.7.3</hadoop.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.39</version>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
    </build>
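Note that the <build> section above only tells Maven where the Scala sources live; inside IDEA the Scala plugin does the actual compiling. If you also want to build from the command line with mvn, you would additionally need a Scala compiler plugin inside <build> under <plugins>, for example scala-maven-plugin (the version below is just an example):

        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>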

13. Test whether the environment works

Create a Scala object under src/main/scala with the code below and run it; if the setup is correct, the console prints a batch of (value, value * 2) pairs every second.

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

import scala.collection.mutable

object RDDQueueStream {
  def main(args: Array[String]): Unit = {
    // On Windows, hadoop.home.dir must point to a directory containing bin\winutils.exe
    System.setProperty("hadoop.home.dir", "D:\\temp\\hadoop-2.4.1\\hadoop-2.4.1")
    // Reduce log noise so the streaming output is easier to read
    Logger.getLogger("org.apache.spark").setLevel(Level.ERROR)
    Logger.getLogger("org.eclipse.jetty.server").setLevel(Level.OFF)

    // Local streaming context with two threads and a 1-second batch interval
    val conf = new SparkConf().setAppName("MyNetworkWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Fill a queue with three RDDs; queueStream consumes one RDD per batch by default
    val rddQueue = new mutable.Queue[RDD[Int]]()
    for (i <- 1 to 3) {
      rddQueue += ssc.sparkContext.makeRDD(i to 10)
      Thread.sleep(2000)
    }

    val inputDStream = ssc.queueStream(rddQueue)

    // Map each element to a (value, value * 2) pair and print every batch
    val result = inputDStream.map(x => (x, x * 2))
    result.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

 
