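Purpose:
1. Import Hive data into ES
2. Query ES data directly from Hive
Limitations
Because Hive column names are case-insensitive, es-hadoop converts all field names to lowercase by default.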
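One way to work around the lowercasing, sketched here under assumptions, is to keep the Hive column in lowercase and point it at a mixed-case Elasticsearch field through es.mapping.names. The table name tmp.artists_camel, the index radio/artists_camel, and the field userName are hypothetical and only serve as an illustration; the node address is a placeholder.

```sql
-- Hypothetical table, index, and field names, for illustration only.
-- Hive sees the column as user_name; es.mapping.names points it at the
-- mixed-case Elasticsearch field userName.
CREATE EXTERNAL TABLE tmp.artists_camel (
  id string,
  user_name string)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource' = 'radio/artists_camel',
  'es.mapping.id' = 'id',
  'es.mapping.names' = 'user_name:userName',
  'es.nodes' = 'IP1:9200'
);
```

Whether this covers every case (nested fields, for example) should be checked against the es-hadoop version in use.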
Steps
I. Configure the dependency jar
1. Temporary (applies to the current session only)
After starting the Hive CLI, run: ADD JAR /path/elasticsearch-hadoop-xxx.jar;
or
bin/hive --auxpath=/path/elasticsearch-hadoop-xxx.jar
or
bin/hive -hiveconf hive.aux.jars.path=/path/elasticsearch-hadoop-xxx.jar
2. Permanent
Add the following property to hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>/path/elasticsearch-hadoop-xxx.jar</value>
<description>A comma separated list (with no spaces) of the jar files</description>
</property>
On CDH this can be configured directly: place the jar in the configured directory and restart HiveServer2. See: https://blog.csdn.net/qq_23146763/article/details/88897243
II. Create an external table in Hive
CREATE EXTERNAL TABLE tmp.artists (
id string,
user_name string,
age string)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
'es.resource' = 'radio/artists',
'es.index.auto.create' = 'true',
"es.mapping.id" = "id",
'es.mapping.names' = 'user_name:user_name_es, age:age_es',
'es.nodes' = 'IP1:9200,IP2:9200,IP3:9200',
"es.net.http.auth.user"="XXX",
"es.net.http.auth.pass"="XXX"
);
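es.mapping.names maps Hive column names to ES field names.
es.net.http.auth.user and es.net.http.auth.pass are the username and password for X-Pack security. They can be omitted if security is not enabled. If security is enabled and they are not set, the following error is reported: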
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
III. Load data
Insert from a table that already contains data:
INSERT OVERWRITE TABLE tmp.artists
SELECT personal_user_id as id, personal_user_name as user_name ,personal_age as age FROM xxx limit 100;
IV. View the data
Query through Hive
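As a minimal sketch, assuming the tmp.artists table created in step II and the data loaded in step III, the documents can be read back through Hive with ordinary queries:

```sql
-- Read a few documents back through the ES-backed external table.
SELECT id, user_name, age FROM tmp.artists LIMIT 10;

-- Count the documents visible through the table.
SELECT count(*) FROM tmp.artists;
```

Each query is executed against Elasticsearch at run time, so the results reflect the current state of the index rather than a copy stored in Hive.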
View the data in ES
Official documentation: https://www.elastic.co/guide/en/elasticsearch/hadoop/5.5/hive.html