Hive Case Study: Score Statistics

Requirements

Create a file named classrecord.txt with the following data:

No. Class Total
1 1603A 95 
2 1603B 85
3 1603C 75  
4 1603D 96 
5 1604F 94  
6 1604E 95 
7 1604K 91  
8 1604G 89 
9 1501A 79 
10 1502A 69 
11 1503A 59 
12 1504A 89  
13 1701A 99 
14 1702A 100 
15 1703A 65

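Create a partitioned Hive table named classrecord and load the data into it
Import the data above into the Hive table
Find the top three classes by total score
Find the top three classes in each cohort (16xx is cohort 16, 15xx is cohort 15, 17xx is cohort 17)
Create a Hive user-defined function (UDF) that assigns each class a grade level:
- 85-100: excellent class
- 75-84: good class
- 60-74: passing class
- anything else: failing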

Implementation

Create a partitioned Hive table named classrecord and load the data into it

Create a plain staging table and load the data

create table classrecord_tmp (id int, classname string, score int) row format delimited fields terminated by ' ' stored as textfile tblproperties('skip.header.line.count'='1');

load data local inpath '/home/test/hive-2.3.7/data/classrecord.txt' into table classrecord_tmp;
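Some data lines in classrecord.txt end with one or two spaces. The sketch below is a rough model of how the staging table is expected to read the file: it skips one header line, splits each line on a single space, and keeps the first three fields. It assumes that fields beyond the declared columns are simply ignored by Hive's delimited text format. `ClassRecordParseSketch` and its `Row` type are hypothetical names used only for illustration, not Hive code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hive: models skip.header.line.count = 1
// and "fields terminated by ' '" for the three declared columns.
class ClassRecordParseSketch {

    // One parsed row: id, class name, score.
    static final class Row {
        final int id;
        final String className;
        final int score;

        Row(int id, String className, int score) {
            this.id = id;
            this.className = className;
            this.score = score;
        }
    }

    static List<Row> parse(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        // Start at index 1 to skip the header line.
        for (int i = 1; i < lines.size(); i++) {
            // Limit -1 keeps trailing empty fields, so "95 " ends with "95", "".
            String[] fields = lines.get(i).split(" ", -1);
            // Assumption: only the first three fields map to columns; extras are dropped.
            rows.add(new Row(Integer.parseInt(fields[0]), fields[1], Integer.parseInt(fields[2])));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("No. Class Total", "1 1603A 95 ", "3 1603C 75  ");
        for (Row r : parse(lines)) {
            System.out.println(r.id + " " + r.className + " " + r.score);
        }
    }
}
```

Under that assumption the trailing spaces only produce extra empty fields, so the score column still parses as an integer.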

Create the partitioned table and insert the data with dynamic partitioning

create table classrecord (id int, classname string, score int) partitioned by (dt string) row format delimited fields terminated by ' ';

Import the data above into the Hive table

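Every partition column in this insert is dynamic, so under Hive's default strict mode the session must first be switched to nonstrict. The partition value dt is the first two characters of classname.

set hive.exec.dynamic.partition.mode=nonstrict;

insert into classrecord partition(dt) select *,substr(classname,1,2) dt from classrecord_tmp;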
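To make the effect of the dynamic partition insert concrete, here is a small sketch that groups the sample class names by the value computed for dt. It only illustrates which rows end up in which partition; `PartitionKeySketch` and its methods are hypothetical names, and nothing here touches Hive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: groups class names by the value that the
// INSERT statement computes for the dynamic partition column dt.
class PartitionKeySketch {

    // Same result as the substr call in the INSERT: the first two characters.
    static String partitionKey(String className) {
        return className.substring(0, 2);
    }

    static Map<String, List<String>> groupByPartition(List<String> classNames) {
        Map<String, List<String>> partitions = new TreeMap<>();
        for (String name : classNames) {
            partitions.computeIfAbsent(partitionKey(name), k -> new ArrayList<>()).add(name);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "1603A", "1603B", "1603C", "1603D", "1604F", "1604E", "1604K", "1604G",
            "1501A", "1502A", "1503A", "1504A", "1701A", "1702A", "1703A");
        // Prints one line per partition value, in ascending order of dt.
        groupByPartition(names).forEach((dt, classes) ->
            System.out.println("dt=" + dt + " -> " + classes));
    }
}
```

With the sample file this gives three partitions: dt=15 with four classes, dt=16 with eight, and dt=17 with three.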

Find the top three classes by total score

select * from classrecord order by score desc limit 3;

Find the top three classes in each cohort (16xx is cohort 16, 15xx is cohort 15, 17xx is cohort 17)

select * from (select *,row_number() over(partition by dt order by score desc) rn from classrecord)t where t.rn<=3;
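As a cross-check of the two ranking queries, the sketch below reproduces them in memory over the sample data: a descending sort with a limit for the overall top three, and the same sort applied inside each cohort for the per-cohort top three. It is a hypothetical model for checking results by hand, not a description of how Hive executes window functions; `TopThreeSketch` and `Record` are illustrative names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the two queries; not how Hive executes them.
class TopThreeSketch {

    static final class Record {
        final String className;
        final int score;

        Record(String className, int score) {
            this.className = className;
            this.score = score;
        }

        // Cohort = first two characters of the class name, like the dt partition.
        String cohort() {
            return className.substring(0, 2);
        }

        @Override
        public String toString() {
            return className + ":" + score;
        }
    }

    // The sample rows from classrecord.txt, in file order.
    static final List<Record> SAMPLE = Arrays.asList(
        new Record("1603A", 95), new Record("1603B", 85), new Record("1603C", 75),
        new Record("1603D", 96), new Record("1604F", 94), new Record("1604E", 95),
        new Record("1604K", 91), new Record("1604G", 89), new Record("1501A", 79),
        new Record("1502A", 69), new Record("1503A", 59), new Record("1504A", 89),
        new Record("1701A", 99), new Record("1702A", 100), new Record("1703A", 65));

    // Models: order by score desc limit n (ties keep their input order here).
    static List<Record> topN(List<Record> records, int n) {
        return records.stream()
            .sorted(Comparator.comparingInt((Record r) -> r.score).reversed())
            .limit(n)
            .collect(Collectors.toList());
    }

    // Models: row_number() over (partition by dt order by score desc) <= n.
    static Map<String, List<Record>> topNPerCohort(List<Record> records, int n) {
        Map<String, List<Record>> byCohort = records.stream()
            .collect(Collectors.groupingBy(Record::cohort, TreeMap::new, Collectors.toList()));
        byCohort.replaceAll((cohort, rows) -> topN(rows, n));
        return byCohort;
    }

    public static void main(String[] args) {
        System.out.println("overall: " + topN(SAMPLE, 3));
        topNPerCohort(SAMPLE, 3).forEach((cohort, rows) ->
            System.out.println("cohort " + cohort + ": " + rows));
    }
}
```

Cohort 16 contains two classes with a score of 95 (1603A and 1604E). Both land in the top three after 1603D, so the result set is stable, but Hive does not promise which of the two tied rows receives row number 2 and which receives 3.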

Create a Hive user-defined function that assigns each class a grade level

Add the dependency

<dependencies>
	<dependency>
		<groupId>org.apache.hive</groupId>
		<artifactId>hive-exec</artifactId>
		<version>2.3.7</version>
	</dependency>
</dependencies>

Write the UDF class

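package com.xxx.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;

// Maps a class's total score to a grade level.
public class LevelUDF extends UDF {
    public String evaluate(Integer score) {
        // A NULL column value arrives as null; return NULL instead of failing on unboxing.
        if (score == null) {
            return null;
        }
        String level = null;
        if (score >= 85 && score <= 100) {
            level = "Excellent";
        } else if (score >= 75 && score < 85) {
            level = "Good";
        } else if (score >= 60 && score < 75) {
            level = "Pass";
        } else {
            // Everything else, including scores above 100, counts as failing.
            level = "Fail";
        }
        return level;
    }
}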
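Before packaging the jar, it can help to check the boundary values of the grading rule locally. The sketch below is a stand-alone copy of the threshold logic that runs without hive-exec on the classpath. `LevelLogicSketch` is a hypothetical name, the English labels are stand-ins for whatever strings the UDF returns, and returning null for a null score is an assumption about the desired behavior.

```java
// Hypothetical stand-alone copy of the grading rule, so it can be run without
// hive-exec on the classpath. The label strings are illustrative.
class LevelLogicSketch {

    static String classify(Integer score) {
        if (score == null) {
            return null; // assumption: a NULL score yields a NULL level
        }
        if (score >= 85 && score <= 100) {
            return "Excellent";
        } else if (score >= 75 && score < 85) {
            return "Good";
        } else if (score >= 60 && score < 75) {
            return "Pass";
        }
        return "Fail"; // below 60, and also anything above 100
    }

    public static void main(String[] args) {
        // Boundary values on both sides of each threshold.
        int[] samples = {100, 85, 84, 75, 74, 60, 59, 101};
        for (int s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Note that a score above 100 falls through to the last branch, which matches the "anything else" rule in the requirements.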

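Build the jar

Upload it to the Linux host

Register the function (choose one of the two options)

Register a temporary function

In bin/hive, run the following statements:

Add the jar to the classpath (according to the author, this statement is required whether or not the jar is already in Hive's lib directory)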
```
add jar /home/test/hive-2.3.7/hive_demo-1.0-SNAPSHOT.jar;
```

Register the function

create temporary function get_level as 'com.xxx.hive.udf.LevelUDF';

List the functions
```
show functions;
```

Use the function

select *,get_level(score) from classrecord;

Drop the function
```
drop temporary function get_level;
```

Register a permanent function

If Hive is accessed through bin/hiveserver2, add the following property to conf/hive-site.xml:

<property>
     <name>hive.aux.jars.path</name>
     <value>file:///home/test/hive-2.3.7/hive_demo-1.0-SNAPSHOT.jar</value>
</property>

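If Hive is accessed through bin/hive, add the following line to conf/hive-env.sh (the first two jars in the list appear to come from the author's other projects; only hive_demo-1.0-SNAPSHOT.jar is needed for this example):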

export HIVE_AUX_JARS_PATH=/home/test/hive-2.3.7/lib/json-serde-1.3.8-jar-with-dependencies.jar,/home/test/hive-2.3.7/lib/app_logs_hive-1.0-SNAPSHOT.jar,/home/test/hive-2.3.7/hive_demo-1.0-SNAPSHOT.jar

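In bin/hive, register the permanent function. A permanent function belongs to the database in which it is registered: use and drop it from that database, or qualify its name with the database name when calling it from another one.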
```
create function get_level as 'com.xxx.hive.udf.LevelUDF';
```
To see the newly registered function, query the FUNCS table in the hive database in MySQL (the metastore):
```
mysql -uroot -pxxxx

use hive

select * from FUNCS;
```

Use the function

select *,get_level(score) from classrecord;

Drop the function
```
drop function get_level;
```
