Unit Testing with Embedded ElasticSearch in Flink

Flink version: 1.8.0

ElasticSearch version: 5.1.2

Scala version: 2.11.12

Java version: 1.8
GitHub repository: https://github.com/shirukai/flink-examples-embedded-elasticsearch.git

1 Preface

A while ago a classmate asked in a group chat how to mock ElasticSearch in unit tests. My first thought was to mock the client class that writes to ElasticSearch, but I never worked out how to implement that, and the question kept bothering me. Recently the work I took over required a certain unit-test coverage, and it happened to include unit tests for code that writes to ES. To state my requirement plainly: I have a Flink job that writes to ES, and it needs unit tests. Reading the source of Flink's ES connector made everything clear: its tests start an embedded ES node. This article introduces three ways to start an embedded ElasticSearch service for unit testing:

  1. Start a single Node from the official ElasticSearch packages
  2. Start an ElasticSearch service via a third-party dependency
  3. Start an ElasticSearch service with Testcontainers

I will also record all the pitfalls I hit while using embedded ES with Flink, to save others from falling into them. (It is Wednesday as I write this; I spent two days down in those pits before finally climbing out.)


2 Quickly create a Flink project

As usual, I set up a fresh environment for every technique I want to verify, so that the verification is not polluted by anything else. Without further ado, run the following command to create a Maven project from the official Flink template:

mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-quickstart-scala -DarchetypeVersion=1.8.0 -DgroupId=flink.examples -DartifactId=flink-examples-embedded-elasticsearch -Dversion=1.0 -Dpackage=flink.examples.embedded.elasticsearch -DinteractiveMode=false

Then add the following dependencies to pom.xml:

		<!-- elasticsearch -->
		<dependency>
			<groupId>org.apache.flink</groupId>
			<artifactId>flink-connector-elasticsearch5_${scala.binary.version}</artifactId>
			<version>${flink.version}</version>
		</dependency>

		<!-- json4s -->
		<dependency>
			<groupId>org.json4s</groupId>
			<artifactId>json4s-jackson_${scala.binary.version}</artifactId>
			<version>3.6.7</version>
		</dependency>

3 Write a simple Flink job that writes to ES

Create a Scala object named WriteElasticSearchJob under flink.embedded.elasticsearch.examples. The implementation is straightforward:

  1. Build an event stream from a collection
  2. Write the event stream to ES

The code is as follows:

package flink.embedded.elasticsearch.examples

import java.net.{InetAddress, InetSocketAddress}

import org.apache.flink.api.common.functions.RuntimeContext
import org.apache.flink.api.java.utils.ParameterTool
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.elasticsearch.{ElasticsearchSinkFunction, RequestIndexer}
import org.apache.flink.streaming.connectors.elasticsearch5.ElasticsearchSink
import org.elasticsearch.client.Requests
import org.json4s.{Formats, NoTypeHints}
import org.json4s.jackson.Serialization

import scala.util.Random

/**
 * Write mock data to ES
 *
 * @author shirukai
 */
object WriteElasticSearchJob {

  case class Event(id: String, v: Double, t: Long)

  def main(args: Array[String]): Unit = {
    // Get the execution environment
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment

    // Parse arguments
    val params: ParameterTool = ParameterTool.fromArgs(args)

    // 1. Create a DataStream from a collection
    val events = env.fromCollection(1 to 20).map(i => {
      val v = Random.nextDouble()
      val t = System.currentTimeMillis()
      Event("event-" + i, v, t)
    })

    // 2. Write the events to ES
    val (userConfig, transportAddresses) = parseEsConfigs(params)
    import scala.collection.JavaConversions._
    val esSink = new ElasticsearchSink(userConfig, transportAddresses, new EventSinkFunction)
    events.addSink(esSink)

    env.execute("WriteElasticSearchJob")

  }

  def parseEsConfigs(params: ParameterTool): (Map[String, String], List[InetSocketAddress]) = {
    // Build userConfig; mainly sets the ES cluster name
    val userConfig = Map[String, String](
      "cluster.name" -> params.get("es.cluster.name", "es-test")
    )
    // Build the transport addresses
    val esNodes = params.get("es.cluster.nodes", "127.0.0.1").split(",").toList
    val esPort = params.getInt("es.cluster.port", 9300)
    val transportAddresses = esNodes.map(node => new InetSocketAddress(InetAddress.getByName(node), esPort))
    (userConfig, transportAddresses)
  }

  /**
   * Extend ElasticsearchSinkFunction to implement index-request building
   */
  class EventSinkFunction extends ElasticsearchSinkFunction[Event] {
    override def process(t: Event, runtimeContext: RuntimeContext, requestIndexer: RequestIndexer): Unit = {
      implicit val formats: AnyRef with Formats = Serialization.formats(NoTypeHints)
      val source: String = Serialization.write(t)
      requestIndexer.add(Requests.indexRequest()
        .index("events")
        .`type`("test")
        .id(t.id)
        .source(source)
      )
    }
  }

}

With the Flink job written, we can run its main method to verify it. As the code above shows, the program accepts the following arguments:

  • es.cluster.name cluster name, default: es-test
  • es.cluster.nodes cluster node IPs, comma-separated, default: 127.0.0.1
  • es.cluster.port cluster node port, default: 9300

These arguments can be passed when starting the program; if omitted, the defaults are used.
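For illustration, a hypothetical command line overriding the defaults might look like this sketch (the jar name follows the artifactId and version used in the archetype command above; adjust the path and class to your actual build):

```shell
flink run -c flink.embedded.elasticsearch.examples.WriteElasticSearchJob \
  target/flink-examples-embedded-elasticsearch-1.0.jar \
  --es.cluster.name es-test \
  --es.cluster.nodes 127.0.0.1 \
  --es.cluster.port 9300
```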


4 Three ways to start an embedded ElasticSearch service

4.1 Start a single Node from the official ElasticSearch packages

This is the approach Flink uses in its own unit tests; you can download the Flink source and look at the flink-connectors/flink-connector-elasticsearch5 module.


Flink's code there is fairly simple: build a configuration instance via Settings, then create a Node from it.

4.1.1 Quick start example

Let's first implement a quick-start example in Java. Create the class flink.embedded.elasticsearch.examples.LocalNodeQuickStartExample with the following content:

package flink.embedded.elasticsearch.examples;

import org.elasticsearch.common.network.NetworkModule;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.NodeValidationException;
import org.elasticsearch.node.internal.InternalSettingsPreparer;
import org.elasticsearch.transport.Netty4Plugin;

import java.util.Collections;

/**
 * Quick start example for a local ES node
 *
 * @author shirukai
 */
public class LocalNodeQuickStartExample {
    private static class PluginNode extends Node {
        public PluginNode(Settings settings) {
            super(InternalSettingsPreparer.prepareEnvironment(settings, null), Collections.singletonList(Netty4Plugin.class));
        }
    }
    public static void main(String[] args) throws NodeValidationException, InterruptedException {
        String systemTempDir = System.getProperty("java.io.tmpdir");
        String esTempDir = systemTempDir+"/es";
        Settings settings = Settings.builder()
                .put("cluster.name", "test")
                .put("http.enabled", true)
                .put("path.home", systemTempDir)
                .put("path.data", esTempDir)
                .put(NetworkModule.HTTP_TYPE_KEY, Netty4Plugin.NETTY_HTTP_TRANSPORT_NAME)
                .put(NetworkModule.TRANSPORT_TYPE_KEY, Netty4Plugin.NETTY_TRANSPORT_NAME)
                .build();
        PluginNode node = new PluginNode(settings);
        node.start();
        // Block forever
        Thread.currentThread().join();
    }
}

Run the main method; the following exception should be thrown:

Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.elasticsearch.node.Node.<init>(Node.java:268)
	at flink.embedded.elasticsearch.examples.LocalNodeQuickStartExample$PluginNode.<init>(LocalNodeQuickStartExample.java:19)
	at flink.embedded.elasticsearch.examples.LocalNodeQuickStartExample.main(LocalNodeQuickStartExample.java:33)
Caused by: java.lang.IllegalStateException: Error finding the build shortHash. Stopping Elasticsearch now so it doesn't run in subtly broken ways. This is likely a build bug.
	at org.elasticsearch.Build.<clinit>(Build.java:62)
	... 3 more


This exception is caused by a dependency problem; we need to add the ES dependency to the pom file:

		<!-- Dependency for Elasticsearch 5.x Java Client -->
		<dependency>
			<groupId>org.elasticsearch.client</groupId>
			<artifactId>transport</artifactId>
			<version>${elasticsearch.version}</version>
		</dependency>

Note that this dependency must be placed before the flink-connector-elasticsearch5 dependency. Flink's ES connector bundles the ES classes and there is no way to exclude them, so the only option is to pull in the new ES dependency before it, and its version must match the one inside the connector. For elasticsearch5 the bundled ES version is 5.1.2.
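In other words, the dependency order in pom.xml should look roughly like this sketch (the version properties are the ones already used in this project's pom):

```xml
<!-- Must come first so its 5.1.2 classes take precedence on the classpath -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>${elasticsearch.version}</version>
</dependency>

<!-- The connector, which bundles its own copy of the ES classes, comes after -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-elasticsearch5_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
```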


If it still fails after adding the ES dependency, try clearing the corresponding cache in the local Maven repository: .m2/repository/org/apache/flink/flink-connector-elasticsearch5_2.11/*.
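A minimal shell sketch of clearing that cache, assuming the default ~/.m2 repository location:

```shell
# Default local Maven repository; adjust if you relocated it
rm -rf "$HOME/.m2/repository/org/apache/flink/flink-connector-elasticsearch5_2.11"
echo "connector cache cleared"
```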

Run the main method again; it should still throw the following exception:


Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/logging/log4j/Logger
	at org.elasticsearch.common.logging.Loggers.getLogger(Loggers.java:105)
	at org.elasticsearch.node.Node.<init>(Node.java:237)
	at flink.embedded.elasticsearch.examples.LocalNodeQuickStartExample$PluginNode.<init>(LocalNodeQuickStartExample.java:19)
	at flink.embedded.elasticsearch.examples.LocalNodeQuickStartExample.main(LocalNodeQuickStartExample.java:33)
Caused by: java.lang.ClassNotFoundException: org.apache.logging.log4j.Logger
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
	... 4 more

This is caused by missing Log4j2 dependencies. Flink's approach is to use log4j-to-slf4j to route Log4j2 calls to SLF4J; see http://logging.apache.org/log4j/2.x/log4j-to-slf4j/ for an introduction.

So we need to add the following dependency to the pom:

		<!--
			Elasticsearch 5.x uses Log4j2 and no longer detects logging implementations, making
			Log4j2 a strict dependency. The following is added so that the Log4j2 API in
			Elasticsearch 5.x is routed to SLF4J. This way, user projects can remain flexible
			in the logging implementation preferred.
		-->

		<dependency>
			<groupId>org.apache.logging.log4j</groupId>
			<artifactId>log4j-to-slf4j</artifactId>
			<version>2.7</version>
		</dependency>

Now it finally runs normally.


Send a request to check the cluster status: http://127.0.0.1:9200/_cluster/health?pretty=true
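For example, with curl (assuming the node exposes HTTP on the default port 9200); a healthy single-node cluster returns a response like the one below:

```shell
curl -s "http://127.0.0.1:9200/_cluster/health?pretty=true"
```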

{
    "cluster_name": "test",
    "status": "green",
    "timed_out": false,
    "number_of_nodes": 1,
    "number_of_data_nodes": 1,
    "active_primary_shards": 0,
    "active_shards": 0,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
    "delayed_unassigned_shards": 0,
    "number_of_pending_tasks": 0,
    "number_of_in_flight_fetch": 0,
    "task_max_waiting_in_queue_millis": 0,
    "active_shards_percent_as_number": 100
}

4.1.2 Wrap ElasticSearchLocalNode using the Builder pattern

Create a class named ElasticSearchLocalNode under flink.embedded.elasticsearch.examples. It mainly uses the builder pattern: construct the Settings, then build a Node instance from them.

The implementation is as follows:

package flink.embedded.elasticsearch.examples;

import org.elasticsearch.common.network.NetworkModule;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.internal.InternalSettingsPreparer;
import org.elasticsearch.transport.Netty4Plugin;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Comparator;

/**
 * A local node implemented by extending org.elasticsearch.node.Node
 *
 * @author shirukai
 */
public class ElasticSearchLocalNode extends Node {
    private static final Logger LOG = LoggerFactory.getLogger(ElasticSearchLocalNode.class);
    private final String tempDir;

    private ElasticSearchLocalNode(Settings preparedSettings, String tempDir) {
        super(InternalSettingsPreparer.prepareEnvironment(preparedSettings, null), Collections.singletonList(Netty4Plugin.class));
        this.tempDir = tempDir;
    }

    public static class Builder {
        private final Settings.Builder builder;
        private String tempDir;

        public Builder() {
            builder = Settings.builder();
            builder.put("network.host", "0.0.0.0");
            builder.put(NetworkModule.HTTP_TYPE_KEY, Netty4Plugin.NETTY_HTTP_TRANSPORT_NAME);
            builder.put(NetworkModule.TRANSPORT_TYPE_KEY, Netty4Plugin.NETTY_TRANSPORT_NAME);
            builder.put("http.enabled", true);
        }


        public ElasticSearchLocalNode.Builder put(String key, String value) {
            this.builder.put(key, value);
            return this;
        }

        public ElasticSearchLocalNode.Builder setClusterName(String name) {
            this.builder.put("cluster.name", name);
            return this;
        }

        public ElasticSearchLocalNode.Builder setTcpPort(int port) {
            this.builder.put("transport.tcp.port", port);
            return this;
        }

        public ElasticSearchLocalNode.Builder setTempDir(String tempDir) {
            this.tempDir = tempDir;
            this.builder.put("path.home", tempDir + "/home");
            this.builder.put("path.data", tempDir + "/data");
            return this;
        }

        public ElasticSearchLocalNode.Builder enableHttpCors(boolean enable) {
            this.builder.put("http.cors.enabled", enable);
            if (enable) {
                this.builder.put("http.cors.allow-origin", "*");
            }
            return this;
        }

        public ElasticSearchLocalNode build() {
            return new ElasticSearchLocalNode(this.builder.build(), tempDir);
        }
    }

    public void stop() {
        this.stop(false);
    }

    public void stop(boolean cleanDataDir) {
        try {
            // Always close the node; optionally wipe the temporary directory
            this.close();
            if (cleanDataDir && tempDir != null) {
                Files.walk(new File(tempDir).toPath())
                        .sorted(Comparator.reverseOrder())
                        .map(Path::toFile)
                        .forEach(File::delete);
            }
        } catch (IOException e) {
            LOG.error("Failed to stop elasticsearch local node.", e);
        }
    }
}

4.1.3 Unit-test ElasticSearchLocalNode

  1. Add the unit-test dependency

            <!-- dependencies for test -->
            <dependency>
                <groupId>junit</groupId>
                <artifactId>junit</artifactId>
                <version>4.12</version>
                <scope>test</scope>
            </dependency>
    
  2. Implement the unit test ElasticSearchLocalNodeTest

    package flink.embedded.elasticsearch.examples;
    
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.node.NodeValidationException;
    import org.junit.AfterClass;
    import org.junit.BeforeClass;
    import org.junit.Test;
    
    
    /**
     * Unit tests for ElasticSearchLocalNode
     *
     * @author shirukai
     */
    public class ElasticSearchLocalNodeTest {
        private static ElasticSearchLocalNode esNode;
    
        /**
         * Before the tests run, create the ES node instance bound to port 19300
         */
        @BeforeClass
        public static void prepare() throws NodeValidationException {
            esNode = new ElasticSearchLocalNode.Builder()
                    .setClusterName("test-es")
                    .setTcpPort(19300)
                    .setTempDir("data/es")
                    .build();
            esNode.start();
    
        }
    
        @Test
        public void addIndexTest() {
            People people = new People("xiaoming", 19, 15558800456L);
            esNode.client().prepareIndex()
                    .setIndex("people")
                    .setType("man")
                    .setId("1")
                    .setSource(people.toString())
                    .get();
        }
    
        @Test
        public void getIndexTest() throws InterruptedException {
            Thread.sleep(1000);
            SearchResponse response = esNode.client().prepareSearch("people").execute().actionGet();
            System.out.println(response.toString());
        }
    
        /**
         * After the tests run, stop the ES node
         */
        @AfterClass
        public static void shutdown() {
            if (null != esNode) {
                esNode.stop(true);
            }
        }
    
        public static class People {
            private String name;
            private int age;
            private long phone;
    
            public People(String name, int age, long phone) {
                this.name = name;
                this.age = age;
                this.phone = phone;
            }
    
            public String getName() {
                return name;
            }
    
            public void setName(String name) {
                this.name = name;
            }
    
            public int getAge() {
                return age;
            }
    
            public void setAge(int age) {
                this.age = age;
            }
    
            public long getPhone() {
                return phone;
            }
    
            public void setPhone(long phone) {
                this.phone = phone;
            }
    
            @Override
            public String toString() {
                return "{" +
                        "\"name\":\"" + name + "\"" +
                        ", \"age\":\"" + age + "\"" +
                        ", \"phone\":\"" + phone + "\"" +
                        "}";
            }
        }
    }
    
  3. Run the unit test; at this point the following error is reported


    Exception in thread "elasticsearch[nfllutF][clusterService#updateTask][T#1]" java.lang.NoClassDefFoundError: org/apache/logging/log4j/core/config/Configurator
    	at org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:149)
    	at org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:144)
    	at org.elasticsearch.index.SearchSlowLog.setLevel(SearchSlowLog.java:111)
    	at org.elasticsearch.index.SearchSlowLog.<init>(SearchSlowLog.java:106)
    	at org.elasticsearch.index.IndexModule.<init>(IndexModule.java:127)
    	at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:421)
    	at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:394)
    	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.execute(MetaDataCreateIndexService.java:352)
    	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45)
    	at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:581)
    	at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:920)
    	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458)
    	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238)
    	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201)
    	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    	at java.base/java.lang.Thread.run(Thread.java:834)
    Caused by: java.lang.ClassNotFoundException: org.apache.logging.log4j.core.config.Configurator
    	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
    	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
    	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
    	... 17 more
    

    Looking at the error log, log4j-related classes cannot be found. Flink's source mentions this case as well; we need to add the related dependencies to the pom:

            <!--
                Including Log4j2 dependencies for tests is required for the
                embedded Elasticsearch nodes used in tests to run correctly.
            -->
            <dependency>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-api</artifactId>
                <version>2.7</version>
            </dependency>
    
            <dependency>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-core</artifactId>
                <version>2.7</version>
                <scope>test</scope>
            </dependency>
    

    After adding the log4j dependencies and running again, it still fails:


    java.lang.ClassCastException: class org.apache.logging.slf4j.SLF4JLoggerContext cannot be cast to class org.apache.logging.log4j.core.LoggerContext (org.apache.logging.slf4j.SLF4JLoggerContext and org.apache.logging.log4j.core.LoggerContext are in unnamed module of loader 'app')
    
    	at org.apache.logging.log4j.core.LoggerContext.getContext(LoggerContext.java:187)
    	at org.apache.logging.log4j.core.config.Configurator.setLevel(Configurator.java:291)
    	at org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:149)
    	at org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:144)
    	at org.elasticsearch.index.SearchSlowLog.setLevel(SearchSlowLog.java:111)
    	at org.elasticsearch.index.SearchSlowLog.<init>(SearchSlowLog.java:106)
    	at org.elasticsearch.index.IndexModule.<init>(IndexModule.java:127)
    	at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:421)
    	at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:394)
    	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.execute(MetaDataCreateIndexService.java:352)
    	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45)
    	at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:581)
    	at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:920)
    	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458)
    	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238)
    	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201)
    	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    	at java.base/java.lang.Thread.run(Thread.java:834)
    

    This problem is interesting, and it is the one that cost me the most time. Earlier we hit java.lang.NoClassDefFoundError: org/apache/logging/log4j/Logger and solved it by adding the log4j-to-slf4j dependency, which routes Log4j usage to SLF4J. But ElasticSearch uses Log4j2 internally, and if we keep using log4j-to-slf4j we get the exception above. Flink's solution is to exclude log4j-to-slf4j at test time, using the maven-surefire-plugin to drop the dependency from the test classpath. Add the following plugin to the pom:

                <!--
                    For the tests, we need to exclude the Log4j2 to slf4j adapter dependency
                    and let Elasticsearch directly use Log4j2, otherwise the embedded Elasticsearch node
                    used in tests will fail to work.
    
                    In other words, the connector jar is routing Elasticsearch 5.x's Log4j2 API's to SLF4J,
                    but for the test builds, we still stick to directly using Log4j2.
                -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-surefire-plugin</artifactId>
                    <configuration>
                        <classpathDependencyExcludes>
                            <classpathDependencyExclude>org.apache.logging.log4j:log4j-to-slf4j</classpathDependencyExclude>
                        </classpathDependencyExcludes>
                    </configuration>
                </plugin>
    

    Start the unit test again:


    At this point the test cases pass and the inserted index can be queried back. However, only a single ERROR log line is visible and no other logs appear. The message says no log4j2 configuration file was found, so only ERROR-level logs are printed:

    ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
    

    Some readers may say: I do have log4j.properties under resources, why does it still complain? Look carefully: the message says no log4j2 configuration was found, with a 2. Log4j2 configuration files are typically named log4j2.xxx; see https://logging.apache.org/log4j/2.x/manual/configuration.html for the details. A simple configuration will do here: add a file named log4j2.xml under resources with the following content:

    <?xml version="1.0" encoding="UTF-8"?>
    <Configuration status="WARN">
        <Appenders>
            <Console name="Console" target="SYSTEM_OUT">
                <PatternLayout pattern="%d{HH:mm:ss,SSS} %-5p %-60c - %m%n"/>
            </Console>
        </Appenders>
        <Loggers>
            <Root level="INFO">
                <AppenderRef ref="Console"/>
            </Root>
        </Loggers>
    </Configuration>
    

    After adding the log configuration file, start it again and the logs are output normally.


4.2 Start an ElasticSearch service from the distribution archive

This section introduces the embedded-ES solution from https://github.com/allegro/embedded-elasticsearch. The principle is simple: Java unpacks the ES distribution archive, invokes the startup script to launch an ES instance, and removes the installation when done. In effect it automates the manual work of deploying and starting ES. The quick-start example below gives a feel for this approach.

4.2.1 Quick start example

  1. First add the dependency to the pom

            <!-- Embedded es following Github repository,https://github.com/allegro/embedded-elasticsearch -->
            <dependency>
                <groupId>pl.allegro.tech</groupId>
                <artifactId>embedded-elasticsearch</artifactId>
                <version>2.10.0</version>
            </dependency>
    
  2. Create the class flink.embedded.elasticsearch.examples.EmbeddedEsQuickStartExample with the following content:

    package flink.embedded.elasticsearch.examples;
    
    import pl.allegro.tech.embeddedelasticsearch.EmbeddedElastic;
    import pl.allegro.tech.embeddedelasticsearch.PopularProperties;
    
    import java.io.File;
    import java.io.IOException;
    
    /**
     * Embedded ES cluster using a third-party dependency
     * <p>Following Github repository</p>
     * <a href="https://github.com/allegro/embedded-elasticsearch">
     * https://github.com/allegro/embedded-elasticsearch</a>
     *
     * @author shirukai
     */
    public class EmbeddedEsQuickStartExample {
        public static void main(String[] args) throws IOException, InterruptedException {
            EmbeddedElastic esCluster = EmbeddedElastic.builder()
                    .withElasticVersion("5.6.16")
                    .withSetting(PopularProperties.TRANSPORT_TCP_PORT, 19300)
                    .withSetting(PopularProperties.CLUSTER_NAME, "test-es")
                    .withSetting("http.cors.enabled", true)
                    .withSetting("http.cors.allow-origin", "*")
                    // Download directory for the distribution archive
                    .withDownloadDirectory(new File("data"))
                    .build();
            esCluster.start();
            Thread.currentThread().join();
        }
    }
    
    
  3. Run the main method. You will see the log below: after starting, the program first downloads the archive matching the specified ES version from the official site.

21:50:35,087 INFO  pl.allegro.tech.embeddedelasticsearch.ElasticDownloader       - Downloading https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.zip to data/elasticsearch-5.6.16.zip ...

Downloading the archive can be slow. If you already have a local .zip archive, you can cancel the run and copy the local archive directly into the specified directory, such as the data directory configured in the code above. Important: after copying the archive into the directory, you must also create an empty marker file named elasticsearch-{version}.zip-downloaded, like this:

data
├── elasticsearch-5.6.16.zip
└── elasticsearch-5.6.16.zip-downloaded
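These steps can be sketched in the shell like so (the archive itself is assumed to come from a copy you already have locally):

```shell
mkdir -p data
# Copy your local archive into data/ first, for example:
#   cp ~/Downloads/elasticsearch-5.6.16.zip data/
# Then create the empty marker file so the download step is skipped:
touch data/elasticsearch-5.6.16.zip-downloaded
ls data
```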

Once the archive is downloaded or copied, run it again.


4.3 Start an ElasticSearch service with Testcontainers

Testcontainers is a Java testing library that can start a service instance in a Docker container; see https://www.testcontainers.org/ for details. This section shows how to start an ElasticSearch service with Testcontainers. Note that the prerequisite is a working Docker environment on the machine.

4.3.1 Quick start example

  1. Create the class flink.embedded.elasticsearch.examples.ContainersEsQuickStartExample with the following content:

    package flink.embedded.elasticsearch.examples;
    
    import org.testcontainers.elasticsearch.ElasticsearchContainer;
    
    import java.util.Collections;
    
    /**
     * Quick start example: starting ES with Testcontainers
     * <a href="https://www.testcontainers.org/modules/elasticsearch/">
     * https://www.testcontainers.org/modules/elasticsearch/</a>
     *
     * @author shirukai
     */
    public class ContainersEsQuickStartExample {
        public static void main(String[] args) throws InterruptedException {
            // Create the elasticsearch container.
            try (ElasticsearchContainer container = new ElasticsearchContainer("docker.elastic.co/elasticsearch/elasticsearch:5.6.16")) {
                // Disable x-pack
                container.setEnv(Collections.singletonList("xpack.security.enabled=false"));
    
                // Start the container. This step might take some time...
                container.start();
    
                // Add shutdown hook
                Runtime.getRuntime().addShutdownHook(new Thread(container::close));
    
                // Block for a while
                Thread.currentThread().join();
            }
    
        }
    }
    
    
  2. Make sure the Docker environment is available, then run the main method


    If the run fails while pulling the image, it is usually caused by network flakiness during the Docker pull; try pulling the image manually first:

    docker pull docker.elastic.co/elasticsearch/elasticsearch:5.6.16
    


Reference: https://stackoverflow.com/questions/41298467/how-to-start-elasticsearch-5-1-embedded-in-my-java-application

5 Unit testing the Flink job

Chapter 4 introduced three ways to start an embedded ElasticSearch service. In this chapter we get to the main topic: using embedded ES to unit-test a Flink job.

With the previous chapters as groundwork, this chapter is much simpler. We use the embedded Node approach to start an ES node instance for our unit test.

  1. Create the test class flink.embedded.elasticsearch.examples.WriteElasticSearchJobTest

    package flink.embedded.elasticsearch.examples;
    
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
    import org.apache.flink.test.util.MiniClusterWithClientResource;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.node.NodeValidationException;
    import org.junit.AfterClass;
    import org.junit.BeforeClass;
    import org.junit.ClassRule;
    import org.junit.Test;
    
    
    /**
     * Unit test for writing to ES
     *
     * @author shirukai
     */
    public class WriteElasticSearchJobTest {
        /**
         * Bind port for the ES node
         */
        private static final Integer ES_BIND_PORT = 19300;
        /**
         * Cluster name
         */
        private static final String ES_CLUSTER_NAME = "test-es";
        /**
         * Embedded ES node instance
         */
        private static ElasticSearchLocalNode esNode;
    
        /**
         * Create a Flink mini cluster. When several Flink jobs are tested,
         * this avoids spinning up a cluster for each one, and the mini cluster
         * also lets us customize the number of task managers and slots.
         */
        @ClassRule
        public static MiniClusterWithClientResource flinkMiniCluster =
                new MiniClusterWithClientResource(new MiniClusterResourceConfiguration
                        .Builder()
                        .setConfiguration(new Configuration())
                        .setNumberTaskManagers(1)
                        .setNumberSlotsPerTaskManager(1)
                        .build());
    
    
        @Test
        public void writeElasticSearchJobTest() throws InterruptedException {
            String[] args = new String[]{
                    "--es.cluster.nodes",
                    "127.0.0.1",
                    "--es.cluster.port",
                    ES_BIND_PORT.toString(),
                    "--es.cluster.name",
                    ES_CLUSTER_NAME
            };
            // Submit the Flink job
            WriteElasticSearchJob.main(args);
    
            Client esClient = esNode.client();
            Thread.sleep(1000);
            SearchResponse response = esClient.prepareSearch("events").execute().actionGet();
            System.out.println(response.toString());
        }
    
        /**
         * Before the test class runs, create the ES node instance
         *
         * @throws NodeValidationException e
         */
        @BeforeClass
        public static void prepare() throws NodeValidationException {
            esNode = new ElasticSearchLocalNode.Builder()
                    .setClusterName(ES_CLUSTER_NAME)
                    .setTcpPort(ES_BIND_PORT)
                    .setTempDir("data/es")
                    .build();
            esNode.start();
        }
    
        /**
         * After the test class runs, shut down the ES node instance
         */
        @AfterClass
        public static void shutdown() {
            if (esNode != null) {
                esNode.stop(true);
            }
        }
    }
    
    
  2. Run the unit test


6 Summary

Unit testing against an embedded ElasticSearch cluster does solve my problem nicely, and it also gave me a new way of thinking: tests of this kind can run against an embedded service, which pulled me out of the pit of trying every possible way to mock classes. A brief summary of the three approaches covered in this article:

  • Approach 1, start a single Node. Pros: no extra dependencies, and the built-in client can be used directly. Cons: many pitfalls (for example the logging issues), and the ES team advises against it; see https://github.com/elastic/elasticsearch/issues/19930
  • Approach 2, start ElasticSearch from the distribution archive. Pros: easy to get started, and the version can be pinned. Cons: requires a third-party dependency and the ES distribution archive
  • Approach 3, start ElasticSearch with Testcontainers. Pros: easy to get started, and the version can be pinned. Cons: requires a Docker environment

Personally I still lean toward the first approach; it has plenty of pitfalls, but once you have stepped through them they hold no fear. One last thought: when you hit a problem, think and read widely instead of tunneling down a single path.
The project code has been pushed to GitHub; questions and discussion are welcome.
