This post is the second part of the hands-on Hive project series.
1. Start Hive
- 1 Start the hiveserver2 service
[bigdata@hadoop002 hive]$ bin/hiveserver2
- 2 Start beeline
[bigdata@hadoop002 hive]$ bin/beeline
Beeline version 1.2.1 by Apache Hive
beeline>
- 3 Connect to hiveserver2
beeline> !connect jdbc:hive2://hadoop002:10000 (press Enter)
Connecting to jdbc:hive2://hadoop002:10000
Enter username for jdbc:hive2://hadoop002:10000: bigdata (press Enter)
Enter password for jdbc:hive2://hadoop002:10000: (leave empty and press Enter)
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop002:10000> create database guli;
0: jdbc:hive2://hadoop002:10000> use guli;
0: jdbc:hive2://hadoop002:10000> show tables;
+-----------+--+
| tab_name |
+-----------+--+
+-----------+--+
No rows selected (0.036 seconds)
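If the walkthrough is repeated on a cluster where the guli database already exists, the plain create database statement above fails. A minimal sketch of a rerunnable variant, assuming the same database name guli and a Hive version that provides current_database() (0.13 or later):

```sql
-- Create the database only if it is missing, so the setup can be rerun safely.
create database if not exists guli;

-- Switch the session to it.
use guli;

-- Confirm which database the session is currently using.
select current_database();
```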
2. Create the tables
2.1 Get the raw data (log data | ori tables)
- 1. Create user_text
create external table user_text(
uploader string,
videos int,
friends int)
row format delimited fields terminated by '\t'
collection items terminated by '&'
location '/guli/user';
// View the first five rows
0: jdbc:hive2://hadoop002:10000> select * from user_text limit 5;
- 2. Create video_text
// video table
create external table video_text(
videoId string,
uploader string,
age int,
category array<string>,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array<string>
)
row format delimited fields terminated by '\t'
collection items terminated by '&'
location '/guli/video_etc';
// Query
select * from video_text limit 5;
At this stage it is enough to get a rough look at the column types.
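Because category and relatedId are declared as array<string> and split on '&', they can be inspected element by element rather than only as a whole row. A small sketch, assuming the video_text table defined above; the aliases first_category and related_count are illustrative names, not part of the original:

```sql
-- List the column names and declared types of the raw table.
desc video_text;

-- Hive arrays are zero-based; size() returns the number of elements.
select videoId,
       category[0]     as first_category,
       size(relatedId) as related_count
from video_text
limit 5;
```

If first_category comes back NULL for every row, the collection delimiter in the table definition probably does not match the data files.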
2.2 Import the data into Hive for processing (create two ORC tables)
- 1. Create video_orc:
create table video_orc(
videoId string,
uploader string,
age int,
category array<string>,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array<string>
)
row format delimited fields terminated by '\t'
collection items terminated by '&'
stored as orc;
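If the table was created as an external table (desc formatted reports Table Type: EXTERNAL_TABLE),
run the following command to convert it to a managed table, then check that desc formatted reports Table Type: MANAGED_TABLE:
0: jdbc:hive2://hadoop002:10000> alter table video_orc set tblproperties("EXTERNAL"="FALSE");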
0: jdbc:hive2://hadoop002:10000> desc formatted video_orc;
- 2. Create user_orc
create table user_orc(
uploader string,
videos int,
friends int)
row format delimited fields terminated by '\t'
collection items terminated by '&'
stored as orc;
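ORC is a binary columnar format, so the text delimiter clauses in the definitions above do not determine how the data is laid out on disk. As a hedged sketch only, an ORC table can also be declared without them and with an explicit compression codec; the table name user_orc_snappy is hypothetical, and the choice of SNAPPY is an assumption (ORC defaults to ZLIB):

```sql
-- Hypothetical variant of user_orc: same columns, no text delimiters,
-- and SNAPPY compression set through the orc.compress table property.
create table if not exists user_orc_snappy(
  uploader string,
  videos   int,
  friends  int)
stored as orc
tblproperties("orc.compress"="SNAPPY");
```

SNAPPY usually trades a somewhat larger file for faster reads and writes; whether that is worthwhile depends on the workload.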
2.3 Insert data into the ORC tables
- 1. Insert data into user_orc
0: jdbc:hive2://hadoop002:10000> insert into user_orc select * from user_text;
- 2. Insert data into video_orc
0: jdbc:hive2://hadoop002:10000> insert into video_orc select * from video_text;
- 3. Check that the inserts succeeded
0: jdbc:hive2://hadoop002:10000> select * from user_orc limit 5;
0: jdbc:hive2://hadoop002:10000> select * from video_orc limit 5;
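Looking at five rows shows that data arrived, but not that all of it did. One possible extra check, assuming each insert was run exactly once, is to compare row counts between each raw table and its ORC copy:

```sql
-- Each pair of counts should match after a single load.
select count(*) from user_text;
select count(*) from user_orc;

select count(*) from video_text;
select count(*) from video_orc;

-- insert into appends, so running a load twice doubles the rows.
-- If a count is off for that reason, insert overwrite replaces the
-- table contents instead of appending to them.
insert overwrite table user_orc select * from user_text;
insert overwrite table video_orc select * from video_text;
```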
At this point, the data is loaded and ready for analysis.