Using Hive External Tables Backed by Elasticsearch

Purpose:

1. Load Hive data into ES
2. Query ES data directly from Hive

Limitation

Because Hive column names are case-insensitive, es-hadoop converts all field names to lowercase by default.
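
If the ES index uses mixed-case field names, the `es.mapping.names` table property can map the lowercase Hive columns onto them. The following is a minimal sketch, assuming a hypothetical index `radio/artists_demo` whose fields are named `userName` and `userAge`; neither the index nor the table name comes from the setup described here.

```sql
-- Hypothetical table and index names, for illustration only.
CREATE EXTERNAL TABLE tmp.artists_case_demo (
  id        STRING,
  user_name STRING,   -- Hive treats column names as case-insensitive
  user_age  STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource'      = 'radio/artists_demo',
  'es.nodes'         = 'IP1:9200',
  -- hive_column:es_field pairs; the ES side keeps its original case
  'es.mapping.names' = 'user_name:userName, user_age:userAge'
);
```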

Steps

I. Configure the dependency jar

1. Temporary (current session only)

After starting the Hive CLI, run: ADD JAR /path/elasticsearch-hadoop-xxx.jar;
or
bin/hive --auxpath=/path/elasticsearch-hadoop-xxx.jar
or
bin/hive -hiveconf hive.aux.jars.path=/path/elasticsearch-hadoop-xxx.jar

2. Permanent
Add the following property to hive-site.xml:

<property>
  <name>hive.aux.jars.path</name>
  <value>/path/elasticsearch-hadoop-xxx.jar</value>
  <description>A comma separated list (with no spaces) of the jar files</description>
</property>

On CDH this can be configured directly: place the jar in the configured directory and restart HiveServer2. See: https://blog.csdn.net/qq_23146763/article/details/88897243

II. Create the external table in Hive

CREATE EXTERNAL TABLE tmp.artists (
id string,
user_name string,
age string) 
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' 
TBLPROPERTIES(
'es.resource' = 'radio/artists',
'es.index.auto.create' = 'true',
"es.mapping.id" = "id",
'es.mapping.names' = 'user_name:user_name_es, age:age_es',
'es.nodes' = 'IP1:9200,IP2:9200,IP3:9200',
"es.net.http.auth.user"="XXX",
"es.net.http.auth.pass"="XXX"
);

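es.mapping.names maps Hive columns to ES fields, written as hive_column:es_field pairs.
es.net.http.auth.user and es.net.http.auth.pass are the username and password for X-Pack security. They can be omitted if security is not enabled. If security is enabled and they are missing, the statement fails with the following error: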

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
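
The same message can also appear for the reason it names itself: the cluster is reachable only through a gateway or cloud endpoint, so node discovery fails. That is a different cause from missing credentials and is not part of the setup described here. As a sketch under that assumption, `es.nodes.wan.only` can be set alongside the credentials; the table name below is hypothetical and the host and account values are placeholders.

```sql
-- Hypothetical table; host and credentials are placeholders.
CREATE EXTERNAL TABLE tmp.artists_wan_demo (
  id        STRING,
  user_name STRING,
  age       STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource'           = 'radio/artists',
  'es.nodes'              = 'IP1:9200',
  -- Disable node discovery and connect only through the hosts in es.nodes
  'es.nodes.wan.only'     = 'true',
  'es.net.http.auth.user' = 'XXX',
  'es.net.http.auth.pass' = 'XXX'
);
```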

III. Load data

Insert rows from an existing table that already contains data:

INSERT OVERWRITE TABLE tmp.artists
SELECT personal_user_id as id, personal_user_name as user_name ,personal_age as age FROM xxx limit 100;

IV. View the data

View the data through Hive

[Image: query result in Hive]
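
A query along these lines, run against the tmp.artists table created in step II, reads the documents back through the storage handler. This is only a sketch; the LIMIT value is arbitrary.

```sql
-- Read a few documents back through the ES-backed table
SELECT id, user_name, age
FROM tmp.artists
LIMIT 10;

-- Count the documents visible through the table
SELECT COUNT(*) FROM tmp.artists;
```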

View the data in ES
[Image: documents in the ES index]

Official documentation: https://www.elastic.co/guide/en/elasticsearch/hadoop/5.5/hive.html
