First, set up a local Hadoop environment; see: http://blog.csdn.net/kunshan_shenbin/article/details/52933499
Download Hive 2.1.0, unpack it, and configure the Hive environment variables. Install MySQL locally and create a database named hive_db. Download the MySQL JDBC driver and place it in the lib directory under the Hive installation directory.
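Placing the driver jar could look like the sketch below; the jar filename/version and the $HIVE_HOME variable are assumptions, so adjust them to your own setup:

```shell
# Copy the MySQL JDBC driver into Hive's lib directory so the metastore
# can reach MySQL. The jar name below is an example; use whichever
# Connector/J version you actually downloaded.
cp mysql-connector-java-5.1.40-bin.jar "$HIVE_HOME/lib/"
```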
Modify the Hive configuration under hive/conf/:
1) Rename hive-default.xml.template to hive-default.xml
2) Create a new hive-site.xml with the following content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<!-- <value>/Users/bin.shen/BigData/apache-hive-2.1.0/warehouse</value> -->
<value>hdfs://localhost:8081/user/hive_local/warehouse</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive_db?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
</configuration>
Initialize the metastore database:
schematool -initSchema -dbType mysql
Start Hive:
Run hive on the command line.
For details, see: http://blog.itpub.net/30089851/viewspace-2074761/
OK, the preparation is done; now on to the main topic:
./hive
CREATE TABLE wordcount(name string,id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA INPATH 'output/part-r-00000' INTO TABLE wordcount;
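LOAD DATA INPATH moves the file from the given HDFS path (here, the MapReduce wordcount output) into the table's warehouse directory. The part-r-00000 file must match the table's tab-delimited schema, with the word in the name column and its count in the id column; a sketch of what such a file might contain (the actual words and counts depend on your input data):

```
hadoop	2
hello	3
world	1
```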
Query the wordcount table:
Count the number of distinct words in the wordcount table, along with their counts.
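The queries described above can be sketched in HiveQL as follows (the alias total is illustrative; the table and column names come from the CREATE TABLE statement earlier):

```sql
-- list the table contents; a plain SELECT * needs no MapReduce job
SELECT * FROM wordcount;

-- number of distinct words in the table
SELECT COUNT(DISTINCT name) FROM wordcount;

-- each word together with the total count recorded for it
SELECT name, SUM(id) AS total FROM wordcount GROUP BY name;
```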
From this result, we can see the conclusion stated earlier:
queries are executed via MapReduce (though not every query requires MapReduce; for example, select * from XXX does not).
Reference: http://blog.csdn.net/wangmuming/article/details/25226951