1. Write the UDF class
import org.apache.hadoop.hive.ql.exec.UDF;

public class NewIP2Long extends UDF {

    // Packs a dotted-quad IPv4 string into a long, one byte per octet.
    public static long ip2long(String ip) {
        // Check for null before splitting; split() itself never returns null.
        if (ip == null) {
            return 0;
        }
        String[] ips = ip.split("[.]");
        long ipNum = 0;
        for (int i = 0; i < ips.length; i++) {
            // Shift the accumulated value left by 8 bits, then OR in the next octet.
            ipNum = ipNum << Byte.SIZE | Long.parseLong(ips[i]);
        }
        return ipNum;
    }

    // Called by Hive for each row; returns 0 for null or malformed input.
    public long evaluate(String ip) {
        if (ip == null || !ip.matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}")) {
            return 0;
        }
        try {
            return ip2long(ip);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // Quick local check outside Hive.
    public static void main(String[] argvs) {
        NewIP2Long ipl = new NewIP2Long();
        System.out.println(ip2long("112.64.106.238"));
        System.out.println(ipl.evaluate("58.35.186.62"));
    }
}
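The shift-and-OR loop above is the same as computing a*2^24 + b*2^16 + c*2^8 + d for an address a.b.c.d. The following is a minimal standalone sketch of that arithmetic and its inverse, with no Hive dependency. The class Ip2LongSketch and its methods toLong and toIp are hypothetical names used only for illustration, and two choices differ from the UDF above on purpose: octets above 255 are rejected (the UDF's regex accepts them), and invalid input returns -1 rather than 0 so that it cannot be confused with 0.0.0.0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the UDF above.
class Ip2LongSketch {
    private static final Pattern IPV4 =
        Pattern.compile("(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})");

    // Returns the numeric form of a dotted-quad IPv4 address, or -1 if invalid.
    static long toLong(String ip) {
        if (ip == null) {
            return -1;
        }
        Matcher m = IPV4.matcher(ip);
        if (!m.matches()) {
            return -1;
        }
        long result = 0;
        for (int i = 1; i <= 4; i++) {
            int octet = Integer.parseInt(m.group(i));
            if (octet > 255) {
                return -1; // assumption: out-of-range octets are treated as invalid
            }
            result = (result << 8) | octet;
        }
        return result;
    }

    // Inverse conversion: unpacks the four bytes back into dotted-quad form.
    static String toIp(long n) {
        return ((n >> 24) & 0xFF) + "." + ((n >> 16) & 0xFF) + "."
             + ((n >> 8) & 0xFF) + "." + (n & 0xFF);
    }

    public static void main(String[] args) {
        long n = toLong("112.64.106.238");
        System.out.println(n);                   // 1883269870
        System.out.println(toIp(n));             // 112.64.106.238
        System.out.println(toLong("999.1.1.1")); // -1
    }
}
```

Whether to return 0, -1, or NULL for bad input depends on how downstream queries filter the column; the sketch only shows one option.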
2. Compile it and package it as ip2long.jar.
3. Whenever you need the ip2long function, run the following (the jar path assumes the jar from step 2 was copied to /tmp):
    add jar /tmp/ip2long.jar;
    drop temporary function ip2long;
    create temporary function ip2long as 'NewIP2Long';
    select ip2long(ip) from XXX;
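With this approach you have to run add jar and create temporary function every time you use the function, which is tedious. It would be much nicer to compile the UDF into the Hive source itself.
Advanced: compiling a custom UDF into Hive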
Rebuilding Hive:
1) Copy the Java source file you wrote into ~/install/hive-0.8.1/src/ql/src/java/org/apache/hadoop/hive/ql/udf/
    cd ~/install/hive-0.8.1/src/ql/src/java/org/apache/hadoop/hive/ql/udf/
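2) Edit ~/install/hive-0.8.1/src/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java and add the import and the registerUDF call. The imported name must match the package and class name declared in the source file copied in step 1.
    import com.meilishuo.hive.udf.UDFIp2Long;
    registerUDF("ip2long", UDFIp2Long.class, false);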
3) In ~/install/hive-0.8.1/src, run ant -Dhadoop.version=1.0.1 package
    cd ~/install/hive-0.8.1/src
    ant -Dhadoop.version=1.0.1 package
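4) Replace the hive-exec jar. The newly built jar is under /hive-0.8.1/src/build/ql. Copy it into the Hive lib directory under a dated name, then point the hive-exec-0.8.1.jar symlink in /hadoop/hive/lib at the new file:
    cp hive-exec-0.8.1.jar /hadoop/hive/lib/hive-exec-0.8.1.jar.0628
    ln -s hive-exec-0.8.1.jar.0628 hive-exec-0.8.1.jar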
5) Restart the Hive service.
6) Test.
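Step 2 above works because registering a UDF amounts to mapping a SQL function name to an implementing class, whose evaluate method is then looked up by reflection. The sketch below is a rough mental model only and is not Hive's FunctionRegistry code: RegistrySketch, register, call, and the stand-in class Ip2LongUdf are all hypothetical names, and real Hive additionally handles type conversion, overload resolution, and operator flags.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of a name-to-class function registry.
class RegistrySketch {

    // Stand-in for a UDF class: it only needs a public evaluate method.
    public static class Ip2LongUdf {
        public long evaluate(String ip) {
            long n = 0;
            for (String part : ip.split("[.]")) {
                n = (n << 8) | Long.parseLong(part);
            }
            return n;
        }
    }

    private static final Map<String, Class<?>> FUNCTIONS = new HashMap<>();

    // Associates a function name (case-insensitive) with its implementing class.
    static void register(String name, Class<?> udfClass) {
        FUNCTIONS.put(name.toLowerCase(), udfClass);
    }

    // Looks up the class by name and invokes the first evaluate method
    // whose parameter count matches the supplied arguments.
    static Object call(String name, Object... args) {
        Class<?> udfClass = FUNCTIONS.get(name.toLowerCase());
        if (udfClass == null) {
            throw new IllegalArgumentException("unknown function: " + name);
        }
        try {
            Object udf = udfClass.getDeclaredConstructor().newInstance();
            for (Method m : udfClass.getMethods()) {
                if (m.getName().equals("evaluate")
                        && m.getParameterCount() == args.length) {
                    return m.invoke(udf, args);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
        throw new IllegalStateException("no matching evaluate method in " + udfClass.getName());
    }

    public static void main(String[] args) {
        register("ip2long", Ip2LongUdf.class);
        System.out.println(call("ip2long", "58.35.186.62")); // 975419966
    }
}
```

The value printed here, 975419966, is also what select ip2long('58.35.186.62') should return in step 6 if the rebuilt jar is in effect.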