Analyzing E-commerce User Order Behavior with Hive (Part 2)

Original post

2020-02-22 20:36

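Today we query the user log table with Hive. The log file has the following format (a header row followed by comma-separated records):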

user_id,item_id,cat_id,merchant_id,brand_id,month,day,action,age_range,gender,province
328862,323294,833,2882,2661,8,29,0,0,1,內蒙古
328862,844400,1271,2882,2661,8,29,0,1,1,山西
328862,575153,1271,2882,2661,8,29,0,2,1,山西
328862,996875,1271,2882,2661,8,29,0,1,1,內蒙古
328862,1086186,1271,1253,1049,8,29,0,0,2,浙江
328862,623866,1271,2882,2661,8,29,0,0,2,黑龍江
328862,542871,1467,2882,2661,8,29,0,5,2,四川
328862,536347,1095,883,1647,8,29,0,7,1,吉林
328862,364513,1271,2882,2661,8,29,0,1,2,貴州
328862,575153,1271,2882,2661,8,29,0,0,0,陝西

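For details on uploading the log data and metadata, see my earlier post: http://blog.csdn.net/cafebar123/article/details/74371463
The queries below explore the user behavior recorded in the log.
Create the database: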

create database hive;

Create the table:

CREATE TABLE hive.user_log(user_id INT,item_id INT,cat_id INT,merchant_id INT,brand_id INT,month STRING,day STRING,action INT,age_range INT,gender INT,province STRING) COMMENT 'Welcome to xmu dblab,Now create hive.user_log!' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION '/user/hive/user_log/user_log';
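The table is created as hive.user_log, but the queries that follow refer to user_log without the database prefix, so they only work when hive is the current database. A minimal sketch of how one might switch to it and confirm that the table and its columns are visible (this step is an addition, not part of the original walkthrough):

```sql
-- Make `hive` the current database so that `user_log` resolves without a prefix.
USE hive;

-- Confirm that the table exists in the current database.
SHOW TABLES;

-- List the columns and their types as Hive sees them.
DESCRIBE user_log;
```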

(1) Query 10 log records:

select * from user_log limit 10;

(2) For long or complex column names, you can use an alias:

select merchant_id as meri from user_log;

(3) Use a nested query (subquery):

select ul.meri from (select merchant_id as meri from user_log) as ul limit 10;

(4) Count the total number of rows:

select count(*) from user_log;

(5) Count distinct values (here, the number of distinct users):

select count(distinct user_id) from user_log;

(6) Use GROUP BY to count the records that are not duplicated, i.e. column combinations that occur exactly once:

select count(*) from (select user_id,item_id,cat_id,merchant_id,brand_id,action from user_log group by user_id,item_id,cat_id,merchant_id,brand_id,action having count(*)=1)a;
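As a complement to the query above, the same grouping can be used to count the combinations that do repeat. The following is only a sketch under the same assumptions (the same six grouping columns; the alias dup is arbitrary):

```sql
-- Count the column combinations that appear more than once in the log.
SELECT count(*)
FROM (
  SELECT user_id, item_id, cat_id, merchant_id, brand_id, action
  FROM user_log
  GROUP BY user_id, item_id, cat_id, merchant_id, brand_id, action
  HAVING count(*) > 1
) dup;
```

Adding this result to the result of the count(*)=1 query gives the number of distinct combinations overall.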

(7) Count how many users made a purchase on a given day (here, November 11):

select count(distinct user_id) from user_log where action=2 and month='11' and day='11';

action=2 means payment (a purchase); action=1 means the item was added to the shopping cart.
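The single-day count can be generalized to every day in the log by grouping on month and day. A minimal sketch, assuming action=2 marks a purchase as stated above (the alias buyers and the LIMIT of 10 are illustrative choices):

```sql
-- Number of distinct purchasing users per day, busiest days first.
SELECT month, day, count(DISTINCT user_id) AS buyers
FROM user_log
WHERE action = 2
GROUP BY month, day
ORDER BY buyers DESC
LIMIT 10;
```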

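(8) Find the ratio of purchases between the two genders on a given day. Run one count per gender code, restricted to purchase rows (action=2), and divide the two results:

select count(*) from user_log where gender=0 and action=2 and month='11' and day='11';

select count(*) from user_log where gender=1 and action=2 and month='11' and day='11';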
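The two counts can also be obtained in a single pass with conditional aggregation, which yields the ratio directly. This is a sketch rather than the original author's query: it restricts to purchase rows (action=2), and the column aliases are hypothetical. The post does not say which gender code is male and which is female, so the columns are named by code.

```sql
-- One scan: purchase rows per gender code on November 11, plus their ratio.
SELECT
  sum(CASE WHEN gender = 0 THEN 1 ELSE 0 END) AS gender0_cnt,
  sum(CASE WHEN gender = 1 THEN 1 ELSE 0 END) AS gender1_cnt,
  sum(CASE WHEN gender = 0 THEN 1 ELSE 0 END)
    / sum(CASE WHEN gender = 1 THEN 1 ELSE 0 END) AS ratio_0_to_1
FROM user_log
WHERE action = 2 AND month = '11' AND day = '11';
```

Note that this counts purchase records, not distinct users; replacing the sums with count(DISTINCT ...) over user_id would be needed for a per-user ratio.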

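(9) Find the users who made two or more purchases. As written, the query covers the whole log; add month, day, or item_id conditions to the WHERE clause to restrict it to a specific day or item:

select user_id from user_log where action=2 group by user_id having count(*)>1;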
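To restrict the repeat-buyer query to a single day and also see how many purchases each user made, the filter and an aggregate column can be added. A sketch, assuming November 11 as the day (the same date used in the earlier queries) and a hypothetical alias purchases:

```sql
-- Users with two or more purchase records on November 11, most active first.
SELECT user_id, count(*) AS purchases
FROM user_log
WHERE action = 2 AND month = '11' AND day = '11'
GROUP BY user_id
HAVING count(*) > 1
ORDER BY purchases DESC
LIMIT 10;
```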

(10) Count the number of purchases for each brand (the filter action=2 selects purchases, not page views):

select brand_id,count(*) from user_log where action=2 group by brand_id;
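To look at one brand across all kinds of actions rather than purchases only, the log can be grouped by action for that brand. A minimal sketch, using brand_id 2661 because it appears in the sample rows at the top of this post; only the meanings of action codes 1 and 2 are given here, so the other codes should be interpreted with the dataset documentation at hand:

```sql
-- Number of log events per action code for a single brand.
SELECT action, count(*) AS events
FROM user_log
WHERE brand_id = 2661
GROUP BY action;
```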

To be continued.

Reference: http://dblab.xmu.edu.cn/blog/1363-2/

