Connecting to the Hive Database with the HiveServer2 JDBC Client

1. Introduction

  • Both HiveServer and HiveServer2 let remote clients operate on data in Hive without starting the CLI: clients written in a variety of languages, such as Java or Python, can submit requests to Hive and retrieve results. HiveServer itself was removed as of Hive 1.0.0, but it is still worth a brief mention here.

  • Both HiveServer and HiveServer2 are based on Thrift, although HiveServer is sometimes called the Thrift server while HiveServer2 is not. Why was HiveServer2 needed when HiveServer already existed? Because HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface HiveServer uses and cannot be fixed by modifying the HiveServer code. The server was therefore rewritten as HiveServer2 in Hive 0.11.0, which solves the problem. HiveServer2 supports multi-client concurrency and authentication, and provides better support for open-API clients such as JDBC and ODBC.

2. Starting hiveserver2

    [hadoop@wjxhadoop001 ~]$ cd /opt/software/hive/bin/
    [hadoop@wjxhadoop001 bin]$ hiveserver2
    which: no hbase in (/opt/software/hive/bin:/opt/software/hadoop/sbin:/opt/software/hadoop/bin:/opt/software/apache-maven-3.3.9/bin:/usr/java/jdk1.8.0_45/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hadoop/bin)

    The "which: no hbase" line is only a warning that HBase is not on the PATH; it can safely be ignored.
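Once hiveserver2 is up, it listens on its Thrift port (10000 by default). Before pointing a client at it, a plain TCP probe can confirm that the port is accepting connections. The sketch below is illustrative: it opens a local ServerSocket purely as a stand-in for a running hiveserver2 so it can run anywhere; against a real deployment you would probe host:10000 instead.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PortCheck {
    // Returns true if something is accepting TCP connections on host:port.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for hiveserver2: bind an arbitrary free local port.
        try (ServerSocket stub = new ServerSocket(0)) {
            System.out.println("listening: "
                    + isListening("localhost", stub.getLocalPort(), 1000));
        }
    }
}
```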

3. Connecting via JDBC

    [hadoop@wjxhadoop001 bin]$ ./beeline -u jdbc:hive2://localhost:10000/default -n hadoop

    which: no hbase in (/opt/software/hive/bin:/opt/software/hadoop/sbin:/opt/software/hadoop/bin:/opt/software/apache-maven-3.3.9/bin:/usr/java/jdk1.8.0_45/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hadoop/bin)
    scan complete in 4ms
    Connecting to jdbc:hive2://localhost:10000/default
    Connected to: Apache Hive (version 1.1.0-cdh5.7.0)
    Driver: Hive JDBC (version 1.1.0-cdh5.7.0)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    Beeline version 1.1.0-cdh5.7.0 by Apache Hive
    0: jdbc:hive2://localhost:10000/default>
  • Running SQL

    0: jdbc:hive2://localhost:10000/default> show databases;
    INFO  : Compiling command(queryId=hadoop_20180114082525_e8541a4a-e849-4017-9dab-ad5162fa74c1): show databases
    INFO  : Semantic Analysis Completed
    INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
    INFO  : Completed compiling command(queryId=hadoop_20180114082525_e8541a4a-e849-4017-9dab-ad5162fa74c1); Time taken: 0.478 seconds
    INFO  : Concurrency mode is disabled, not creating a lock manager
    INFO  : Executing command(queryId=hadoop_20180114082525_e8541a4a-e849-4017-9dab-ad5162fa74c1): show databases
    INFO  : Starting task [Stage-0:DDL] in serial mode
    INFO  : Completed executing command(queryId=hadoop_20180114082525_e8541a4a-e849-4017-9dab-ad5162fa74c1); Time taken: 0.135 seconds
    INFO  : OK
    +----------------+--+
    | database_name  |
    +----------------+--+
    | default        |
    +----------------+--+
    1 row selected
    
  • Writing a Java client

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class JdbcApp {
        private static String driverName = "org.apache.hive.jdbc.HiveDriver";

        public static void main(String[] args) throws Exception {
            try {
                Class.forName(driverName);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
            // Connect as the same user that beeline used above (-n hadoop);
            // the password can stay empty when authentication is not enabled.
            Connection con = DriverManager.getConnection(
                    "jdbc:hive2://192.168.137.200:10000/default", "hadoop", "");
            Statement stmt = con.createStatement();

            // Query the ename column from the emp table.
            String tableName = "emp";
            String sql = "select ename from " + tableName;
            System.out.println("Running: " + sql);
            ResultSet res = stmt.executeQuery(sql);
            while (res.next()) {
                System.out.println(res.getString(1));
            }

            // Describe the table: column name and type.
            sql = "describe " + tableName;
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            while (res.next()) {
                System.out.println(res.getString(1) + "\t" + res.getString(2));
            }

            res.close();
            stmt.close();
            con.close();
        }
    }
    

4. Default parameters

  • HiveServer2 can be configured through the hive-site.xml configuration file. The relevant parameters are:

    hive.server2.thrift.min.worker.threads – minimum number of worker threads; default 5.
    hive.server2.thrift.max.worker.threads – maximum number of worker threads; default 500.
    hive.server2.thrift.port – TCP port to listen on; default 10000.
    hive.server2.thrift.bind.host – host to bind the TCP socket to; default localhost.
    
  • Configuring the listen port and bind host

    sudo vi hive-site.xml
    <property>
      <name>hive.server2.thrift.port</name>
      <value>10000</value>
    </property>
    <property>
      <name>hive.server2.thrift.bind.host</name>
      <value>192.168.48.130</value>
    </property>
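These two properties map directly into the JDBC URL used by beeline and the Java client: jdbc:hive2://<bind.host>:<port>/<database>. A small helper, purely illustrative (the class and method names here are made up for this sketch), makes the mapping explicit:

```java
public class HiveUrl {
    // Builds a HiveServer2 JDBC URL from the hive-site.xml settings
    // hive.server2.thrift.bind.host and hive.server2.thrift.port,
    // plus the database to open.
    static String jdbcUrl(String bindHost, int port, String database) {
        return "jdbc:hive2://" + bindHost + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // With the values configured above:
        System.out.println(jdbcUrl("192.168.48.130", 10000, "default"));
        // -> jdbc:hive2://192.168.48.130:10000/default
    }
}
```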
    

From @若泽大数据
