9(20) Returning Users This Week 20

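Chapter 11 Requirement 5: Returning Users This Week
Returning this week = active this week - new this week - active last week. In other words, a device counts as returning if it was active this week, was not newly added this week, and was not active last week.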
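The formula is a set difference over device ids, which a small shell sketch can make concrete. The function name `returning_devices` and the sample ids below are made up for illustration; they do not come from the tutorial's data.

```shell
# Hypothetical helper: prints ids that are active this week but are neither
# new this week nor active last week.
#   $1 = ids active this week, $2 = ids new this week, $3 = ids active last week
#   (each a space-separated list; ids must not contain spaces)
returning_devices() {
    for id in $1; do
        case " $2 $3 " in
            *" $id "*) ;;                 # new this week or active last week: skip
            *) printf '%s\n' "$id" ;;     # otherwise it is a returning device
        esac
    done
}

# Made-up sample data
returning_devices "m1 m2 m3 m4 m5" "m2" "m3 m4 m9"   # prints m1 and m5, one per line
```

The Hive query in 11.2 expresses the same exclusion with two left joins followed by `is null` filters.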
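11.1 DWS Layer
The daily active detail table dws_uv_detail_day serves as the DWS layer data. The import statement in 11.2 reads the weekly active detail table dws_uv_detail_wk and the new-device table dws_new_mid_day.
11.2 ADS Layer
1) Table creation statement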
hive (gmall)>
drop table if exists ads_back_count;
create external table ads_back_count(
    dt string COMMENT 'statistics date',
    wk_dt string COMMENT 'week containing the statistics date',
    wastage_count bigint COMMENT 'number of returning devices'
)
row format delimited fields terminated by '\t'
location '/warehouse/gmall/ads/ads_back_count';
2) Load data:
hive (gmall)>
insert into table ads_back_count
select
    '2019-02-20' dt,
    concat(date_add(next_day('2019-02-20','MO'),-7),'_',date_add(next_day('2019-02-20','MO'),-1)) wk_dt,
    count(*)
from
(
    select t1.mid_id
    from
    (
        -- devices active this week
        select mid_id
        from dws_uv_detail_wk
        where wk_dt=concat(date_add(next_day('2019-02-20','MO'),-7),'_',date_add(next_day('2019-02-20','MO'),-1))
    )t1
    left join
    (
        -- devices newly added this week
        select mid_id
        from dws_new_mid_day
        where create_date<=date_add(next_day('2019-02-20','MO'),-1) and create_date>=date_add(next_day('2019-02-20','MO'),-7)
    )t2
    on t1.mid_id=t2.mid_id
    left join
    (
        -- devices active last week
        select mid_id
        from dws_uv_detail_wk
        where wk_dt=concat(date_add(next_day('2019-02-20','MO'),-7*2),'_',date_add(next_day('2019-02-20','MO'),-7-1))
    )t3
    on t1.mid_id=t3.mid_id
    -- keep only devices that matched neither t2 nor t3
    where t2.mid_id is null and t3.mid_id is null
)t4;
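The wk_dt expressions rely on Hive's next_day returning the first Monday strictly after the given date, so subtracting 7 days gives the Monday of the current week and subtracting 1 day gives its Sunday. The sketch below reproduces the same week labels with GNU date so they can be checked outside Hive. The function name `week_label` is hypothetical, and GNU `date -d` is assumed (the tutorial's own script uses it as well).

```shell
# Hypothetical helper: prints the "<Monday>_<Sunday>" label of a week.
#   $1 = date in YYYY-MM-DD form
#   $2 = how many weeks to go back (0 = the week containing $1)
# Assumes GNU date (option -d with relative day offsets).
week_label() {
    dow=$(date -d "$1" +%u)                                  # ISO weekday: 1 = Monday ... 7 = Sunday
    start=$(date -d "$1 -$((dow - 1 + 7 * $2)) days" +%F)    # Monday of the target week
    end=$(date -d "$start +6 days" +%F)                      # Sunday of the target week
    printf '%s_%s\n' "$start" "$end"
}

week_label 2019-02-20 0   # 2019-02-18_2019-02-24 (this week, as used by t1)
week_label 2019-02-20 1   # 2019-02-11_2019-02-17 (last week, as used by t3)
```

Because 2019-02-20 is a Wednesday, the current week runs from Monday 2019-02-18 to Sunday 2019-02-24, which matches the `-7` and `-1` offsets in the query; the `-7*2` and `-7-1` offsets select the week before it.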
3) Query the result
hive (gmall)> select * from ads_back_count;
11.3 Writing the Script
1) Create the script in the /home/atguigu/bin directory on hadoop102
[atguigu@hadoop102 bin]$ vim ads_back_log.sh
Write the following content in the script:
#!/bin/bash

# Use the date passed as the first argument; otherwise default to yesterday
if [ -n "$1" ]; then
    do_date=$1
else
    do_date=$(date -d "-1 day" +%F)
fi

hive=/opt/module/hive/bin/hive
APP=gmall

echo "-----------Import date $do_date-----------"

sql="
insert into table "$APP".ads_back_count
select
    '$do_date' dt,
    concat(date_add(next_day('$do_date','MO'),-7),'_',date_add(next_day('$do_date','MO'),-1)) wk_dt,
    count(*)
from
(
    select t1.mid_id
    from
    (
        select mid_id
        from "$APP".dws_uv_detail_wk
        where wk_dt=concat(date_add(next_day('$do_date','MO'),-7),'_',date_add(next_day('$do_date','MO'),-1))
    )t1
    left join
    (
        select mid_id
        from "$APP".dws_new_mid_day
        where create_date<=date_add(next_day('$do_date','MO'),-1) and create_date>=date_add(next_day('$do_date','MO'),-7)
    )t2
    on t1.mid_id=t2.mid_id
    left join
    (
        select mid_id
        from "$APP".dws_uv_detail_wk
        where wk_dt=concat(date_add(next_day('$do_date','MO'),-7*2),'_',date_add(next_day('$do_date','MO'),-7-1))
    )t3
    on t1.mid_id=t3.mid_id
    where t2.mid_id is null and t3.mid_id is null
)t4;
"

$hive -e "$sql"
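The script builds its SQL inside a double-quoted shell string, and the quoting around the database name can look odd at first. The short sketch below, using a made-up one-line query, shows what the shell actually produces: the inner double quotes close and reopen the string around $APP, while the single quotes are plain characters inside a double-quoted string, so $do_date still expands.

```shell
APP=gmall
do_date=2019-02-20

# The string ends just before $APP and resumes right after it; the three
# pieces are concatenated into one word. The single quotes stay in the
# result, and $do_date is expanded because the outer quotes are double quotes.
sql="
select '$do_date' dt from "$APP".ads_back_count;
"

echo "$sql"
```

This prints `select '2019-02-20' dt from gmall.ads_back_count;` surrounded by blank lines. Echoing the string this way before passing it to `hive -e` is a cheap check that every variable was substituted as intended.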
2) Add execute permission to the script
[atguigu@hadoop102 bin]$ chmod 777 ads_back_log.sh
3) Run the script
[atguigu@hadoop102 module]$ ads_back_log.sh 2019-02-20
4) Query the result
hive (gmall)> select * from ads_back_count;
5) Script execution time
In production, the script is typically scheduled to run every Monday between 00:30 and 01:00.
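One way to get that schedule is a cron entry, sketched below as an assumption (the tutorial does not say which scheduler is used). The snippet also shows why a Monday run with no argument reports on the week that has just ended: the script defaults do_date to the previous day, which is a Sunday. The run date is chosen only for illustration, and GNU date is assumed.

```shell
# Hypothetical crontab entry (minute hour day-of-month month day-of-week command),
# running the script at 00:30 every Monday:
#   30 0 * * 1 /home/atguigu/bin/ads_back_log.sh

# What the script's default date works out to on such a run:
run_date="2019-02-25"                        # a Monday, chosen for illustration
do_date=$(date -d "$run_date -1 day" +%F)    # the previous day, a Sunday
echo "$do_date"                              # 2019-02-24
```

With do_date set to 2019-02-24, the query's wk_dt covers 2019-02-18 through 2019-02-24, so the full week is already available when the job runs.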
