Data preparation
cookie1,2015-04-10,1
cookie1,2015-04-11,5
cookie1,2015-04-12,7
cookie1,2015-04-13,3
cookie1,2015-04-14,2
cookie1,2015-04-15,4
cookie1,2015-04-16,4
Create the database and table
create database if not exists cookie;
use cookie;
drop table if exists cookie1;
create table cookie1(cookieid string, createtime string, pv int) row format delimited fields terminated by ',';
load data local inpath "/home/hadoop/cookie1.txt" into table cookie1;
select * from cookie1;
SUM function
select
cookieid,
createtime,
pv,
sum(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as pv1,
sum(pv) over (partition by cookieid order by createtime) as pv2,
sum(pv) over (partition by cookieid) as pv3,
sum(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as pv4,
sum(pv) over (partition by cookieid order by createtime rows between 3 preceding and 1 following) as pv5,
sum(pv) over (partition by cookieid order by createtime rows between current row and unbounded following) as pv6
from cookie1;
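Explanation
pv1: running total of pv within the partition, from the first row through the current row. For example, pv1 on the 11th = pv on the 10th + pv on the 11th; on the 12th = 10th + 11th + 12th.
pv2: same result as pv1 here, because each createtime appears only once in the partition.
pv3: the sum of all pv values in the partition (cookie1).
pv4: the current row plus the 3 preceding rows in the partition. For example, 11th = 10th + 11th; 12th = 10th + 11th + 12th; 13th = 10th + 11th + 12th + 13th; 14th = 11th + 12th + 13th + 14th.
pv5: the current row plus the 3 preceding rows and the 1 following row in the partition. For example, 14th = 11th + 12th + 13th + 14th + 15th = 5+7+3+2+4 = 21.
pv6: the current row plus all following rows in the partition. For example, 13th = 13th + 14th + 15th + 16th = 3+2+4+4 = 13; 14th = 14th + 15th + 16th = 2+4+4 = 10.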
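As a cross-check of the explanation above, the following sketch re-runs the pv4, pv5 and pv6 frames against the same cookie1 table and lists, in comments, the values worked out by hand from the seven sample rows. The alias names (last4, last4_next1, remaining) are illustrative only, and the listed values are hand-computed, not captured from a Hive session.

```sql
-- Cross-check of the pv4, pv5 and pv6 frames from the query above.
-- Alias names are illustrative, not part of the original query.
select
cookieid,
createtime,
pv,
sum(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as last4,
sum(pv) over (partition by cookieid order by createtime rows between 3 preceding and 1 following) as last4_next1,
sum(pv) over (partition by cookieid order by createtime rows between current row and unbounded following) as remaining
from cookie1
order by createtime;

-- Hand-computed values for the sample data:
-- createtime  pv  last4  last4_next1  remaining
-- 2015-04-10   1      1            6         26
-- 2015-04-11   5      6           13         25
-- 2015-04-12   7     13           16         20
-- 2015-04-13   3     16           18         13
-- 2015-04-14   2     17           21         10
-- 2015-04-15   4     16           20          8
-- 2015-04-16   4     13           13          4
```

Near the edges of the partition the frame simply contains fewer rows, which is why the first and last rows sum fewer values.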
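Keyword notes
If ROWS BETWEEN is omitted but ORDER BY is present, the frame defaults to everything from the start of the partition through the current row. Hive applies this default as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so rows that tie on the ORDER BY value are included together.
If ORDER BY is omitted as well, all values in the partition are summed.
The key is to understand ROWS BETWEEN, also called the WINDOW clause:
PRECEDING: rows before the current row
FOLLOWING: rows after the current row
CURRENT ROW: the current row
UNBOUNDED: no limit in that direction
UNBOUNDED PRECEDING: starting from the first row of the partition
UNBOUNDED FOLLOWING: extending to the last row of the partition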
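The default frame described above can also be written out explicitly. The sketch below assumes the same cookie1 table; the alias names are illustrative. It places the frame-less form next to the RANGE frame that Hive documents as its default when ORDER BY is present, and next to the ROWS frame used for pv1.

```sql
-- Default frame versus its explicit spellings (alias names are illustrative).
select
cookieid,
createtime,
pv,
-- no frame: with ORDER BY, Hive defaults to RANGE ... UNBOUNDED PRECEDING AND CURRENT ROW
sum(pv) over (partition by cookieid order by createtime) as pv_default,
-- the same default written out
sum(pv) over (partition by cookieid order by createtime range between unbounded preceding and current row) as pv_range,
-- row-based frame, as used for pv1
sum(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as pv_rows
from cookie1;
```

With the sample data every createtime is unique, so all three columns return the same running total. They would differ only if two rows in a partition shared a createtime: RANGE treats tied rows as peers and gives each of them the sum through the last tied row, while ROWS stops at the current physical row.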
AVG, MIN, and MAX are used in the same way as SUM.
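As an illustration for MIN and MAX, the following sketch applies the same window clauses to the cookie1 table. The alias names are hypothetical, and the values in the comments are worked out by hand from the sample rows, not taken from a Hive run.

```sql
-- MIN and MAX with the same window clauses as SUM (alias names are illustrative).
select
cookieid,
createtime,
pv,
min(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as min_so_far,
max(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as max_so_far,
max(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as max_last4
from cookie1
order by createtime;

-- Hand-computed values for the sample data:
-- createtime  pv  min_so_far  max_so_far  max_last4
-- 2015-04-10   1           1           1          1
-- 2015-04-11   5           1           5          5
-- 2015-04-12   7           1           7          7
-- 2015-04-13   3           1           7          7
-- 2015-04-14   2           1           7          7
-- 2015-04-15   4           1           7          7
-- 2015-04-16   4           1           7          4
```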
AVG function
select
cookieid,
createtime,
pv,
avg(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row) as pv1, -- from the start of the partition through the current row
avg(pv) over (partition by cookieid order by createtime) as pv2, -- no frame given: defaults to start through current row, same result as pv1
avg(pv) over (partition by cookieid) as pv3, -- all rows in the partition
avg(pv) over (partition by cookieid order by createtime rows between 3 preceding and current row) as pv4, -- current row + 3 preceding rows
avg(pv) over (partition by cookieid order by createtime rows between 3 preceding and 1 following) as pv5, -- current row + 3 preceding rows + 1 following row
avg(pv) over (partition by cookieid order by createtime rows between current row and unbounded following) as pv6 -- current row + all following rows
from cookie1;
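Because AVG over an int column yields fractional results, it can be convenient to round them for display. The sketch below wraps two of the frames above in round(); the alias names are illustrative, and the commented values are computed by hand from the sample rows (running sums 1, 6, 13, 16, 18, 22, 26 divided by the row count so far), not taken from a Hive run.

```sql
-- Running and whole-partition averages, rounded to 2 decimals (alias names are illustrative).
select
cookieid,
createtime,
pv,
round(avg(pv) over (partition by cookieid order by createtime rows between unbounded preceding and current row), 2) as avg_so_far,
round(avg(pv) over (partition by cookieid), 2) as avg_all
from cookie1
order by createtime;

-- Hand-computed values for the sample data:
-- createtime  pv  avg_so_far  avg_all
-- 2015-04-10   1        1.0     3.71
-- 2015-04-11   5        3.0     3.71
-- 2015-04-12   7        4.33    3.71
-- 2015-04-13   3        4.0     3.71
-- 2015-04-14   2        3.6     3.71
-- 2015-04-15   4        3.67    3.71
-- 2015-04-16   4        3.71    3.71
```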