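Hive stores its metadata in an RDBMS and can connect to that database in three modes:
1) Single User Mode: connects to an embedded Derby database; generally used for unit tests.
2) Multi User Mode: connects to a database over the network; this is the most commonly used mode.
3) Remote Server Mode: lets non-Java clients access the metastore database. A MetaStoreServer is started on the server side, and clients reach the metastore database through it using the Thrift protocol.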
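As a rough illustration of how the three modes differ in configuration, the sketch below maps each mode to the properties that typically distinguish it. The Derby URL and driver are Hive's stock defaults, and `hive.metastore.uris` is the property a Thrift client uses to locate the server. The host names `dbhost` and `metastorehost` and the port 9083 are placeholders assumed for this example, not values from these notes.

```python
from xml.sax.saxutils import escape

# Properties that usually distinguish the three metastore modes.
# Host names and the Thrift port are hypothetical placeholders.
METASTORE_MODES = {
    "single_user": {
        "hive.metastore.local": "true",
        "javax.jdo.option.ConnectionURL":
            "jdbc:derby:;databaseName=metastore_db;create=true",
        "javax.jdo.option.ConnectionDriverName":
            "org.apache.derby.jdbc.EmbeddedDriver",
    },
    "multi_user": {
        "hive.metastore.local": "true",
        "javax.jdo.option.ConnectionURL":
            "jdbc:mysql://dbhost:3306/hive?createDatabaseIfNotExist=true",
        "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    },
    "remote_server": {
        "hive.metastore.local": "false",
        "hive.metastore.uris": "thrift://metastorehost:9083",
    },
}


def to_property_xml(settings):
    """Render a dict of settings as Hive <property> elements."""
    blocks = []
    for name, value in settings.items():
        blocks.append(
            "<property>\n"
            f"  <name>{escape(name)}</name>\n"
            f"  <value>{escape(value)}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_property_xml(METASTORE_MODES["remote_server"]))
```

The rest of these notes set up the second mode, with MySQL as the database.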
By default Hive stores its metadata in Derby.
Here we switch the metastore to MySQL.
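1. Create a dedicated MySQL user for Hive
Replace 123.123.123.123 with the host that Hive connects from. The password must match javax.jdo.option.ConnectionPassword in step 2.
mysql> grant all privileges on *.* to hive@'123.123.123.123' identified by 'hivepasswd';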
mysql> flush privileges;
2. Edit the following settings in conf/hive-default.xml (site-specific overrides normally belong in conf/hive-site.xml)
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
  <description>controls whether to connect to remote metastore server or open a new metastore server in Hive Client JVM</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepasswd</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.zookeeper.quorum</name>
  <value>node1,node2,node3</value>
  <description>The list of zookeeper servers to talk to. This is only needed for read/write locks.</description>
</property>
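The datanucleus.fixedDatastore setting is mainly there to avoid errors caused by concurrent access, such as:
Caused by: com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Table 'dataoven_prod_hadoop.DELETEME1309959999747' doesn't exist
Note that with datanucleus.autoCreateSchema set to false, DataNucleus will not create the metastore tables, so they need to exist already (for example from an earlier run with the option enabled).
hive.support.concurrency turns on concurrency support; the read/write locks it relies on are kept in the ZooKeeper quorum listed in hive.zookeeper.quorum.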
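A quick way to catch a missing connection setting before starting Hive is to parse the configuration file and check for the four `javax.jdo.option.*` properties used above. The following is a minimal sketch using only the Python standard library. The helper names `read_properties` and `missing_settings` and the deliberately incomplete `SAMPLE` document are made up for this example, and it assumes the properties sit inside a single root element such as `<configuration>`.

```python
import xml.etree.ElementTree as ET

# Connection settings the MySQL-backed metastore needs (names from step 2).
REQUIRED = (
    "javax.jdo.option.ConnectionURL",
    "javax.jdo.option.ConnectionDriverName",
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword",
)


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hive config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_settings(props):
    """List the required settings that are absent or empty."""
    return [key for key in REQUIRED if not props.get(key)]


# Hypothetical, intentionally incomplete sample used to exercise the check.
SAMPLE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    # Reports the driver name and password as missing for SAMPLE.
    print(missing_settings(read_properties(SAMPLE)))
```

To check a real installation, pass the contents of the configuration file to `read_properties` instead of `SAMPLE`.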
3. Add the MySQL JDBC driver jar to Hive's lib directory
wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.11.tar.gz/from/http://mysql.he.net/
tar -xvzf mysql-connector-java-5.1.11.tar.gz
cp mysql-connector-java-5.1.11/*.jar /data/soft/hive/lib
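If Hive later fails to load `com.mysql.jdbc.Driver`, the first thing to check is whether the jar actually landed in the lib directory. A small sketch of that check follows. The function name `find_connector_jars` is invented here, and the file-name pattern assumes the jar keeps its `mysql-connector-java-` prefix.

```python
import glob
import os


def find_connector_jars(hive_lib_dir):
    """Return the MySQL Connector/J jars found in Hive's lib directory."""
    pattern = os.path.join(hive_lib_dir, "mysql-connector-java-*.jar")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # /data/soft/hive/lib is the target directory used in the cp command above.
    jars = find_connector_jars("/data/soft/hive/lib")
    print(jars if jars else "no MySQL connector jar found")
```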
4. Start Hive
bin/hive
hive> show tables;
When using MySQL as a metastore I see the error "com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes".
* This is a known limitation of MySQL 5.0 and UTF8 databases. One option is to use another character set, such as 'latin1', which is known to work.
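The cause is that some indexed columns in the metastore schema exceed MySQL's 767-byte key limit when the database uses UTF-8. Changing the character set of the metastore database (hive in the JDBC URL above) fixes it: alter database hive character set latin1;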
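The arithmetic behind the 767-byte error is easy to reproduce. MySQL's `utf8` character set reserves up to 3 bytes per character, while `latin1` uses 1, so an indexed column of 256 characters needs 768 bytes in `utf8` and only 256 in `latin1`. The sketch below works that out. The 256-character column is a hypothetical example, not a column taken from the metastore schema.

```python
INNODB_MAX_KEY_BYTES = 767  # limit quoted in the error message

# Maximum bytes per character for the two character sets discussed above.
MAX_BYTES_PER_CHAR = {"utf8": 3, "latin1": 1}


def index_key_bytes(varchar_length, charset):
    """Worst-case bytes needed to index a VARCHAR column of this length."""
    return varchar_length * MAX_BYTES_PER_CHAR[charset]


def fits_in_key(varchar_length, charset):
    """True if the full column fits within the key-length limit."""
    return index_key_bytes(varchar_length, charset) <= INNODB_MAX_KEY_BYTES


if __name__ == "__main__":
    for charset in ("utf8", "latin1"):
        # Hypothetical 256-character indexed column.
        print(charset, index_key_bytes(256, charset), fits_in_key(256, charset))
```

With `utf8` the largest fully indexable column is 255 characters (765 bytes), which is why switching the database to `latin1` makes the error go away.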
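Another error you may run into:
FAILED: Error in metadata: javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : Binary logging not possible. Message: Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEMENT'
With READ-COMMITTED isolation the binary log cannot be written in STATEMENT format, so switch it to MIXED with the command below (to keep the setting across restarts, also add binlog_format=MIXED under [mysqld] in /etc/my.cnf):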
set global binlog_format='MIXED';
5. When many Hive log-analysis jobs run at once, raise MySQL's default connection limit
Open /etc/my.cnf
Under [mysqld], add max_connections=1000
Restart the MySQL service: service mysqld restart
Verify the setting: mysql -uroot -p
mysql>show variables like '%max_connections%';
To see the current number of MySQL connections:
mysqladmin -uroot -p status
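The fields in the output are:
Uptime: how long mysqld has been running, in seconds.
Threads: the current number of connections.
Questions: the number of queries received since startup.
Slow queries: the number of slow queries.
Opens: the number of tables opened since startup.
Flush tables: the number of flush, refresh and reload commands executed since startup.
Open tables: the number of tables currently open.
Queries per second avg: the average number of queries per second.
To see both the server status and the list of active connections: mysqladmin -u root -p processlist status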
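For monitoring, the one-line output of `mysqladmin status` can be split into named numbers and compared with the max_connections value set above. This is a minimal sketch. The function names and the sample line with its values are invented for illustration, and the parser assumes the `Name: value` layout described in the field list.

```python
import re


def parse_mysqladmin_status(line):
    """Split one line of `mysqladmin status` output into a dict of numbers."""
    pairs = re.findall(r"([A-Za-z][A-Za-z ]*?):\s*([0-9.]+)", line)
    return {
        name.strip(): (float(value) if "." in value else int(value))
        for name, value in pairs
    }


def connection_usage(status, max_connections):
    """Fraction of the connection limit currently in use."""
    return status["Threads"] / max_connections


# Illustrative sample line; the numbers are made up.
SAMPLE = ("Uptime: 8  Threads: 1  Questions: 4  Slow queries: 0  Opens: 15  "
          "Flush tables: 1  Open tables: 9  Queries per second avg: 0.5")

if __name__ == "__main__":
    status = parse_mysqladmin_status(SAMPLE)
    print(status)
    print(connection_usage(status, 1000))
```

In practice the line would come from running `mysqladmin -uroot -p status` rather than from `SAMPLE`.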