Complete list of built-in functions in Hive 1.1.0

In Hive, run
show functions to list every function Hive supports
describe function xxx to see the definition of a specific function xxx

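The table below lists every function supported by Hive 1.1.0 together with its definition.
In practice only a small subset is used regularly; examples of the commonly used functions will be covered in detail separately.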

function description
! ! a - Logical not
!= a != b - Returns TRUE if a is not equal to b
% a % b - Returns the remainder when dividing a by b
& a & b - Bitwise and
* a * b - Multiplies a by b
+ a + b - Returns a+b
- a - b - Returns the difference a-b
/ a / b - Divide a by b
< a < b - Returns TRUE if a is less than b
<= a <= b - Returns TRUE if a is not greater than b
<=> a <=> b - Returns same result with EQUAL(=) operator for non-null operands, but returns TRUE if both are NULL, FALSE if one of the them is NULL
<> a <> b - Returns TRUE if a is not equal to b
= a = b - Returns TRUE if a equals b and false otherwise
== a == b - Returns TRUE if a equals b and false otherwise
> a > b - Returns TRUE if a is greater than b
>= a >= b - Returns TRUE if a is not smaller than b
^ a ^ b - Bitwise exclusive or
abs abs(x) - returns the absolute value of x
acos acos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise
add_months add_months(start_date, num_months) - Returns the date that is num_months after start_date.
and a and b - Logical and
array array(n0, n1…) - Creates an array with the given elements
array_contains array_contains(array, value) - Returns TRUE if the array contains value.
ascii ascii(str) - returns the numeric value of the first character of str
asin asin(x) - returns the arc sine of x if -1<=x<=1 or NULL otherwise
assert_true assert_true(condition) - Throw an exception if ‘condition’ is not true.
atan atan(x) - returns the arctangent (arctan) of x (the result is in radians)
avg avg(x) - Returns the mean of a set of numbers
base64 base64(bin) - Convert the argument from binary to a base 64 string
between between a [NOT] BETWEEN b AND c - evaluate if a is [not] in between b and c
bin bin(n) - returns n in binary
case CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END - When a = b, returns c; when a = d, return e; else return f
cbrt cbrt(double) - Returns the cube root of a double value.
ceil ceil(x) - Find the smallest integer not smaller than x
ceiling ceiling(x) - Find the smallest integer not smaller than x
coalesce coalesce(a1, a2, …) - Returns the first non-null argument
collect_list collect_list(x) - Returns a list of objects with duplicates
collect_set collect_set(x) - Returns a set of objects with duplicate elements eliminated
compute_stats compute_stats(x) - Returns the statistical summary of a set of primitive type values.
concat concat(str1, str2, … strN) - returns the concatenation of str1, str2, … strN or concat(bin1, bin2, … binN) - returns the concatenation of bytes in binary data bin1, bin2, … binN
concat_ws concat_ws(separator, [string | array(string)]+) - returns the concatenation of the strings separated by the separator.
context_ngrams context_ngrams(expr, array<string1, string2, …>, k, pf) estimates the top-k most frequent n-grams that fit into the specified context. The second parameter specifies a string of words that specify the positions of the n-gram elements, with a null value standing in for a ‘blank’ that must be filled by an n-gram element.
conv conv(num, from_base, to_base) - convert num from from_base to to_base
corr corr(x,y) - Returns the Pearson coefficient of correlation between a set of number pairs
cos cos(x) - returns the cosine of x (x is in radians)
count count(*) - Returns the total number of retrieved rows, including rows containing NULL values.
count(expr) - Returns the number of rows for which the supplied expression is non-NULL.
count(DISTINCT expr[, expr…]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL.
covar_pop covar_pop(x,y) - Returns the population covariance of a set of number pairs
covar_samp covar_samp(x,y) - Returns the sample covariance of a set of number pairs
crc32 crc32(str or bin) - Computes a cyclic redundancy check value for string or binary argument and returns bigint value.
create_union create_union(tag, obj1, obj2, obj3, …) - Creates a union with the object for given tag
cume_dist There is no documentation for function ‘cume_dist’
current_database current_database() - returns currently using database name
current_date current_date() - Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value.
current_timestamp current_timestamp() - Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value.
current_user current_user() - Returns current user name
date_add date_add(start_date, num_days) - Returns the date that is num_days after start_date.
date_format date_format(date/timestamp/string, fmt) - converts a date/timestamp/string to a value of string in the format specified by the date format fmt.
date_sub date_sub(start_date, num_days) - Returns the date that is num_days before start_date.
datediff datediff(date1, date2) - Returns the number of days between date1 and date2
day day(date) - Returns the day of the month of date
dayofmonth dayofmonth(date) - Returns the day of the month of date
dayofweek dayofweek(param) - Returns the day of the week of date/timestamp (1 = Sunday, 2 = Monday, …, 7 = Saturday)
decode decode(bin, str) - Decode the first argument using the second argument character set
degrees degrees(x) - Converts radians to degrees
dense_rank There is no documentation for function ‘dense_rank’
div a div b - Divide a by b rounded to the long integer
e e() - returns E
elt elt(n, str1, str2, …) - returns the n-th string
encode encode(str, str) - Encode the first argument using the second argument character set
ewah_bitmap ewah_bitmap(expr) - Returns an EWAH-compressed bitmap representation of a column.
ewah_bitmap_and ewah_bitmap_and(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise AND of two bitmaps.
ewah_bitmap_empty ewah_bitmap_empty(bitmap) - Predicate that tests whether an EWAH-compressed bitmap is all zeros
ewah_bitmap_or ewah_bitmap_or(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise OR of two bitmaps.
exp exp(x) - Returns e to the power of x
explode explode(a) - separates the elements of array a into multiple rows, or the elements of a map into multiple rows and columns
field field(str, str1, str2, …) - returns the index of str in the str1,str2,… list or 0 if not found
find_in_set find_in_set(str,str_array) - Returns the first occurrence of str in str_array where str_array is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument has any commas.
first_value There is no documentation for function ‘first_value’
floor floor(x) - Find the largest integer not greater than x
format_number format_number(X, D) - Formats the number X to a format like ‘#,###,###.##’, rounded to D decimal places, and returns the result as a string. If D is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL’s FORMAT
from_unixtime from_unixtime(unix_time, format) - returns unix_time in the specified format
from_utc_timestamp from_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is UTC and converts to given timezone (as of Hive 0.8.0)
get_json_object get_json_object(json_txt, path) - Extract a json object from path
greatest greatest(v1, v2, …) - Returns the greatest value in a list of values
hash hash(a1, a2, …) - Returns a hash value of the arguments
hex hex(n, bin, or str) - Convert the argument to hexadecimal
histogram_numeric histogram_numeric(expr, nb) - Computes a histogram on numeric ‘expr’ using nb bins.
hour hour(date) - Returns the hour of date
if IF(expr1,expr2,expr3) - If expr1 is TRUE (expr1 <> 0 and expr1 <> NULL) then IF() returns expr2; otherwise it returns expr3. IF() returns a numeric or string value, depending on the context in which it is used.
in test in(val1, val2…) - returns true if test equals any valN
in_file in_file(str, filename) - Returns true if str appears in the file
index index(a, n) - Returns the n-th element of a
initcap initcap(str) - Returns str, with the first letter of each word in uppercase, all other letters in lowercase. Words are delimited by white space.
inline inline(ARRAY(STRUCT()[, STRUCT()])) - explodes an array of structs into a table
instr instr(str, substr) - Returns the index of the first occurrence of substr in str
isnotnull isnotnull a - Returns true if a is not NULL and false otherwise
isnull isnull a - Returns true if a is NULL and false otherwise
java_method java_method(class,method[,arg1[,arg2…]]) calls method with reflection
json_tuple json_tuple(jsonStr, p1, p2, …, pn) - like get_json_object, but it takes multiple names and return a tuple. All the input parameters and output column types are string.
lag LAG (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LAG function is used to access data from a previous row.
last_day last_day(date) - Returns the last day of the month which the date belongs to.
last_value There is no documentation for function ‘last_value’
lcase lcase(str) - Returns str with all characters changed to lowercase
lead There is no documentation for function ‘lead’
least least(v1, v2, …) - Returns the least value in a list of values
length length(str | binary) - Returns the length of str or number of bytes in binary data
levenshtein levenshtein(str1, str2) - This function calculates the Levenshtein distance between two strings.
like like(str, pattern) - Checks if str matches pattern
ln ln(x) - Returns the natural logarithm of x
locate locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos
log log([b], x) - Returns the logarithm of x with base b
log10 log10(x) - Returns the logarithm of x with base 10
log2 log2(x) - Returns the logarithm of x with base 2
logged_in_user logged_in_user() - Returns logged in user name
lower lower(str) - Returns str with all characters changed to lowercase
lpad lpad(str, len, pad) - Returns str, left-padded with pad to a length of len
ltrim ltrim(str) - Removes the leading space characters from str
map map(key0, value0, key1, value1…) - Creates a map with the given key/value pairs
map_keys map_keys(map) - Returns an unordered array containing the keys of the input map.
map_values map_values(map) - Returns an unordered array containing the values of the input map.
matchpath There is no documentation for function ‘matchpath’
max max(expr) - Returns the maximum value of expr
md5 md5(str or bin) - Calculates an MD5 128-bit checksum for the string or binary.
min min(expr) - Returns the minimum value of expr
minute minute(date) - Returns the minute of date
month month(date) - Returns the month of date
months_between months_between(date1, date2) - returns number of months between dates date1 and date2
named_struct named_struct(name1, val1, name2, val2, …) - Creates a struct with the given field names and values
negative negative a - Returns -a
next_day next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated.
ngrams ngrams(expr, n, k, pf) - Estimates the top-k n-grams in rows that consist of sequences of strings, represented as arrays of strings, or arrays of arrays of strings. ‘pf’ is an optional precision factor that controls memory usage.
noop There is no documentation for function ‘noop’
noopstreaming There is no documentation for function ‘noopstreaming’
noopwithmap There is no documentation for function ‘noopwithmap’
noopwithmapstreaming There is no documentation for function ‘noopwithmapstreaming’
not not a - Logical not
ntile There is no documentation for function ‘ntile’
nvl nvl(value,default_value) - Returns default value if value is null else returns value
or a or b - Logical or
parse_url parse_url(url, partToExtract[, key]) - extracts a part from a URL
parse_url_tuple parse_url_tuple(url, partname1, partname2, …, partnameN) - extracts N (N>=1) parts from a URL.
percent_rank There is no documentation for function ‘percent_rank’
percentile percentile(expr, pc) - Returns the percentile(s) of expr at pc (range: [0,1]). pc can be a double or double array
percentile_approx percentile_approx(expr, pc, [nb]) - For very large data, computes an approximate percentile value from a histogram, using the optional argument [nb] as the number of histogram bins to use. A higher value of nb results in a more accurate approximation, at the cost of higher memory usage.
pi pi() - returns pi
pmod a pmod b - Compute the positive modulo
posexplode posexplode(a) - behaves like explode for arrays, but includes the position of items in the original array
positive positive a - Returns a
pow pow(x1, x2) - raise x1 to the power of x2
power power(x1, x2) - raise x1 to the power of x2
printf printf(String format, Obj… args) - function that can format strings according to printf-style format strings
radians radians(x) - Converts degrees to radians
rand rand([seed]) - Returns a pseudorandom number between 0 and 1
rank There is no documentation for function ‘rank’
reflect There is no documentation for function ‘reflect’
reflect2 There is no documentation for function ‘reflect2’
regexp str regexp regexp - Returns true if str matches regexp and false otherwise
regexp_extract regexp_extract(str, regexp[, idx]) - extracts a group that matches regexp
regexp_replace regexp_replace(str, regexp, rep) - replace all substrings of str that match regexp with rep
repeat repeat(str, n) - repeat str n times
reverse reverse(str) - reverse str
rlike str rlike regexp - Returns true if str matches regexp and false otherwise
round round(x[, d]) - round x to d decimal places
row_number There is no documentation for function ‘row_number’
rpad rpad(str, len, pad) - Returns str, right-padded with pad to a length of len
rtrim rtrim(str) - Removes the trailing space characters from str
second second(date) - Returns the second of date
sentences sentences(str, lang, country) - Splits str into arrays of sentences, where each sentence is an array of words. The ‘lang’ and ‘country’ arguments are optional, and if omitted, the default locale is used.
sha2 sha2(string/binary, len) - Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512).
sign sign(x) - returns the sign of x
sin sin(x) - returns the sine of x (x is in radians)
size size(a) - Returns the size of a
sort_array sort_array(array(obj1, obj2,…)) - Sorts the input array in ascending order according to the natural ordering of the array elements.
soundex soundex(string) - Returns soundex code of the string.
space space(n) - returns n spaces
split split(str, regex) - Splits str around occurrences that match regex
sqrt sqrt(x) - returns the square root of x
stack stack(n, cols…) - turns k columns into n rows of size k/n each
std std(x) - Returns the standard deviation of a set of numbers
stddev stddev(x) - Returns the standard deviation of a set of numbers
stddev_pop stddev_pop(x) - Returns the standard deviation of a set of numbers
stddev_samp stddev_samp(x) - Returns the sample standard deviation of a set of numbers
str_to_map str_to_map(text, delimiter1, delimiter2) - Creates a map by parsing text
struct struct(col1, col2, col3, …) - Creates a struct with the given field values
substr substr(str, pos[, len]) - returns the substring of str that starts at pos and is of length len, or substr(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
substring substring(str, pos[, len]) - returns the substring of str that starts at pos and is of length len, or substring(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
sum sum(x) - Returns the sum of a set of numbers
tan tan(x) - returns the tangent of x (x is in radians)
to_date to_date(expr) - Extracts the date part of the date or datetime expression expr
to_unix_timestamp to_unix_timestamp(date[, pattern]) - Returns the UNIX timestamp
to_utc_timestamp to_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is in given timezone and converts to UTC (as of Hive 0.8.0)
translate translate(input, from, to) - translates the input string by replacing the characters present in the from string with the corresponding characters in the to string
trim trim(str) - Removes the leading and trailing space characters from str
trunc trunc(date, fmt) - Returns date with the time portion of the day truncated to the unit specified by the format model fmt. If you omit fmt, then date is truncated to the nearest day. It currently supports only ‘MONTH’/‘MON’/‘MM’ and ‘YEAR’/‘YYYY’/‘YY’ as format.
ucase ucase(str) - Returns str with all characters changed to uppercase
unbase64 unbase64(str) - Convert the argument from a base 64 string to binary
unhex unhex(str) - Converts hexadecimal argument to binary
unix_timestamp unix_timestamp(date[, pattern]) - Converts the time to a number
upper upper(str) - Returns str with all characters changed to uppercase
uuid uuid() - Returns a universally unique identifier (UUID) string.
var_pop var_pop(x) - Returns the variance of a set of numbers
var_samp var_samp(x) - Returns the sample variance of a set of numbers
variance variance(x) - Returns the variance of a set of numbers
version version() - Returns the Hive build version string - includes base version and revision.
weekofyear weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.
when CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e
windowingtablefunction There is no documentation for function ‘windowingtablefunction’
xpath xpath(xml, xpath) - Returns a string array of values within xml nodes that match the xpath expression
xpath_boolean xpath_boolean(xml, xpath) - Evaluates a boolean xpath expression
xpath_double xpath_double(xml, xpath) - Returns a double value that matches the xpath expression
xpath_float xpath_float(xml, xpath) - Returns a float value that matches the xpath expression
xpath_int xpath_int(xml, xpath) - Returns an integer value that matches the xpath expression
xpath_long xpath_long(xml, xpath) - Returns a long value that matches the xpath expression
xpath_number xpath_number(xml, xpath) - Returns a double value that matches the xpath expression
xpath_short xpath_short(xml, xpath) - Returns a short value that matches the xpath expression
xpath_string xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression
year year(date) - Returns the year of date
| a | b - Bitwise or
~ ~ n - Bitwise not
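
A few entries in the table have NULL handling that is easy to misread from a one-line description. The sketch below models two of them in Python, the null-safe comparison `<=>` and `find_in_set`, purely from the wording in the table. It is an illustration under stated assumptions, not Hive's implementation: `None` stands in for SQL NULL, the helper names `null_safe_equal` and `find_in_set` are made up for this example, and the returned position is assumed to be 1-based (consistent with 0 meaning "not found").

```python
# Illustrative Python models of two table entries; None stands in for SQL NULL.
# These follow the descriptions in the table and are not Hive's own code.


def null_safe_equal(a, b):
    """Model of `a <=> b`: like `=`, but defined when operands are NULL."""
    if a is None and b is None:
        return True   # both NULL -> TRUE
    if a is None or b is None:
        return False  # exactly one NULL -> FALSE
    return a == b     # neither NULL -> same as `=`


def find_in_set(s, str_array):
    """Model of find_in_set(str, str_array) on a comma-delimited string.

    Assumes a 1-based position; returns 0 when s is absent or contains a
    comma, and None (NULL) when either argument is None.
    """
    if s is None or str_array is None:
        return None
    if "," in s:
        return 0
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0


print(null_safe_equal(None, None))                  # True
print(null_safe_equal(1, None))                     # False
print(find_in_set("ab", "abc,b,ab,c,def"))          # 3
print(find_in_set("a,b", "a,b,c"))                  # 0
```

In Hive itself, the authoritative behaviour can be checked with describe function on the entry in question and a quick query against literal values.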