Hive Guide (Part 2)

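Built-in Operators and Functions
   Built-in Operators
  •          Relational operators - the following operators compare their operands and produce a TRUE or FALSE value.
 A = B
 A != B
 A < B
 A <= B
 A > B
 A >= B
 A IS NULL
 A IS NOT NULL
 A LIKE B    strings
 A RLIKE B   strings
 A REGEXP B  strings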
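The difference between LIKE and RLIKE/REGEXP is easiest to see side by side. The Python sketch below is only a model, not Hive code: it assumes the usual SQL wildcard rules for LIKE (`_` matches one character, `%` matches any run of characters), treats RLIKE as a regular-expression search anywhere in the string, and uses Python's `re` module as a stand-in for Java regular expressions. The helper names `like` and `rlike` are made up for illustration, and escape characters are not handled.

```python
import re

def like(a, pattern):
    """Model of A LIKE B: '_' matches one character, '%' matches any run of characters."""
    regex = "".join(
        "." if ch == "_" else ".*" if ch == "%" else re.escape(ch)
        for ch in pattern
    )
    # LIKE must match the whole string, so fullmatch is used.
    return re.fullmatch(regex, a, flags=re.DOTALL) is not None

def rlike(a, pattern):
    """Model of A RLIKE B: true if the regex matches anywhere in the string."""
    return re.search(pattern, a) is not None
```

Under these assumptions, `like('foobar', 'foo%')` is true while `like('foobar', 'foo')` is false, whereas `rlike('foobar', 'oo')` is true because the pattern may match any substring.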
Arithmetic operators
   A + B   all number types
   A - B   all number types
   A * B   all number types
   A / B   all number types
   A % B   all number types
   A & B   all number types
   A | B   all number types
   A ^ B   all number types
   ~A      all number types
 
    Logical operators
        A AND B, A && B, A OR B, A || B, NOT A, !A
     
     Complex type operators
         A[n]    A is an array and n is an integer
         M[key]  M is a Map<K,V> and key has type K
         S.x     S is a struct
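The complex type operators behave much like indexing in a general-purpose language. The Python sketch below is an analogy, not Hive code: a list stands in for an array (Hive array indexes also start at 0), a dict for a Map<K,V>, and a named tuple for a struct. All sample values and field names are made up for illustration.

```python
from collections import namedtuple

# Hypothetical sample values standing in for Hive column values.
A = ["a", "b", "c"]                      # array, accessed as A[n]
M = {"k1": 10, "k2": 20}                 # map, accessed as M[key]
Point = namedtuple("Point", ["x", "y"])  # struct type with fields x and y
S = Point(x=1, y=2)                      # struct, accessed as S.x

first_element = A[0]    # element at integer index 0
mapped_value = M["k1"]  # value stored under key "k1"
field_x = S.x           # field x of the struct
```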
 
     Built-in functions
           
Return Type | Function Name (Signature) | Description
BIGINT | round(double a) | returns the rounded BIGINT value of the double
BIGINT | floor(double a) | returns the maximum BIGINT value that is equal to or less than the double
BIGINT | ceil(double a) | returns the minimum BIGINT value that is equal to or greater than the double
double | rand(), rand(int seed) | returns a random number (that changes from row to row). Specifying the seed makes the generated random number sequence deterministic.
string | concat(string A, string B, ...) | returns the string resulting from concatenating B after A. For example, concat('foo', 'bar') results in 'foobar'. This function accepts an arbitrary number of arguments and returns the concatenation of all of them.
string | substr(string A, int start) | returns the substring of A starting from the start position until the end of string A. For example, substr('foobar', 4) results in 'bar'
string | substr(string A, int start, int length) | returns the substring of A starting from the start position with the given length. For example, substr('foobar', 4, 2) results in 'ba'
string | upper(string A) | returns the string resulting from converting all characters of A to upper case. For example, upper('fOoBaR') results in 'FOOBAR'
string | ucase(string A) | same as upper
string | lower(string A) | returns the string resulting from converting all characters of A to lower case. For example, lower('fOoBaR') results in 'foobar'
string | lcase(string A) | same as lower
string | trim(string A) | returns the string resulting from trimming spaces from both ends of A. For example, trim(' foobar ') results in 'foobar'
string | ltrim(string A) | returns the string resulting from trimming spaces from the beginning (left-hand side) of A. For example, ltrim(' foobar ') results in 'foobar '
string | rtrim(string A) | returns the string resulting from trimming spaces from the end (right-hand side) of A. For example, rtrim(' foobar ') results in ' foobar'
string | regexp_replace(string A, string B, string C) | returns the string resulting from replacing all substrings in A that match the Java regular expression B (see Java regular expressions syntax) with C. For example, regexp_replace('foobar', 'oo|ar', '') returns 'fb'
int | size(Map<K,V>) | returns the number of elements in the map type
int | size(Array<T>) | returns the number of elements in the array type
value of <type> | cast(<expr> as <type>) | converts the result of the expression expr to <type>. For example, cast('1' as BIGINT) converts the string '1' to its integral representation. NULL is returned if the conversion does not succeed.
string | from_unixtime(int unixtime) | converts the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone, in the format "1970-01-01 00:00:00"
string | to_date(string timestamp) | returns the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01"
int | year(string date) | returns the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970
int | month(string date) | returns the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11
int | day(string date) | returns the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1
string | get_json_object(string json_string, string path) | extracts a JSON object from a JSON string based on the JSON path specified, and returns the JSON string of the extracted object. Returns NULL if the input JSON string is invalid.
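A few of the string and conversion functions have behavior that is easy to get wrong, notably the 1-based start position of substr and the NULL result of a failed cast. The Python sketch below models three of them so the documented examples can be checked by hand. It is not Hive code: the helper names are illustrative, only positive start positions are handled, None stands in for NULL, and Python's `re` module stands in for Java regular expressions, which differ in some details.

```python
import re

def substr(a, start, length=None):
    """Model of substr with the 1-based start position used in the examples."""
    begin = start - 1  # convert the 1-based position to Python's 0-based index
    if length is None:
        return a[begin:]
    return a[begin:begin + length]

def regexp_replace(a, pattern, replacement):
    """Model of regexp_replace: replace every match of pattern in a."""
    return re.sub(pattern, replacement, a)

def cast_bigint(expr):
    """Model of cast(<expr> as BIGINT): None (NULL) if the conversion fails."""
    try:
        return int(expr)
    except (TypeError, ValueError):
        return None
```

With these definitions, `substr('foobar', 4)` gives 'bar', `substr('foobar', 4, 2)` gives 'ba', and `regexp_replace('foobar', 'oo|ar', '')` gives 'fb', matching the examples listed for the functions.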
  • The following built-in aggregate functions are supported in Hive:
Return Type | Aggregation Function Name (Signature) | Description
BIGINT | count(*), count(expr), count(DISTINCT expr[, expr...]) | count(*) returns the total number of retrieved rows, including rows containing NULL values; count(expr) returns the number of rows for which the supplied expression is non-NULL; count(DISTINCT expr[, expr...]) returns the number of rows for which the supplied expression(s) are unique and non-NULL.
DOUBLE | sum(col), sum(DISTINCT col) | returns the sum of the elements in the group, or the sum of the distinct values of the column in the group
DOUBLE | avg(col), avg(DISTINCT col) | returns the average of the elements in the group, or the average of the distinct values of the column in the group
DOUBLE | min(col) | returns the minimum value of the column in the group
DOUBLE | max(col) | returns the maximum value of the column in the group
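The three forms of count differ only in how they treat NULLs and duplicates, and the DISTINCT variants of sum and avg drop repeated values before aggregating. The Python sketch below illustrates these rules on made-up data. It is a model of the descriptions above, not Hive code, with None standing in for NULL.

```python
# Hypothetical values of one column within a group; None stands in for NULL.
rows = ["a", "b", None, "a", None]

count_star = len(rows)                                     # count(*): every row, NULLs included
count_expr = sum(1 for v in rows if v is not None)         # count(expr): non-NULL rows only
count_distinct = len({v for v in rows if v is not None})   # count(DISTINCT expr): unique non-NULL values

# Hypothetical numeric column without NULLs.
values = [1.0, 2.0, 2.0]

sum_all = sum(values)                                # sum(col)
sum_distinct = sum(set(values))                      # sum(DISTINCT col)
avg_all = sum(values) / len(values)                  # avg(col)
avg_distinct = sum(set(values)) / len(set(values))   # avg(DISTINCT col)
```

Here count(*) is 5, count(expr) is 3, and count(DISTINCT expr) is 2; the sum drops from 5.0 to 3.0 once the duplicate 2.0 is removed.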


 
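Language Capabilities
    The Hive query language provides basic SQL-like operations that work on tables or partitions. These operations provide the ability to:
  •     Filter rows from a table using a where clause.
  •     Select specific columns from a table using a select clause.
  •     Perform equi-joins between two tables.
  •     Evaluate aggregations over group by columns.
  •     Store the results of a query into another table.
  •     Download the contents of a table to a local directory.
  •     Store the results of a query in a Hadoop DFS directory.
  •     Manage tables and partitions.
  •     Plug in custom scripts, in the language of your choice, for custom map/reduce jobs.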
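The first few capabilities (where, select, group by) compose into a simple pipeline. The Python sketch below walks through that pipeline on an in-memory list of rows to show what each step does to the data. It is a conceptual model only, not how Hive executes queries, and the table name, column names, and values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical table: each dict is one row.
page_views = [
    {"user": "u1", "country": "US", "views": 3},
    {"user": "u2", "country": "DE", "views": 5},
    {"user": "u3", "country": "US", "views": 4},
    {"user": "u4", "country": "FR", "views": 0},
]

# where: keep only the rows that satisfy the condition.
filtered = [r for r in page_views if r["views"] > 0]

# select: keep only the named columns.
projected = [{"country": r["country"], "views": r["views"]} for r in filtered]

# group by country, aggregating with sum(views).
totals = defaultdict(int)
for r in projected:
    totals[r["country"]] += r["views"]
result = dict(totals)
```

The final `result` maps each remaining country to its total, which is the shape of output a group by query with a sum aggregate would produce.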
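  Usage and Examples
       The following examples highlight some salient features. A detailed set of query test cases can be found at Hive Query Test Cases, and the corresponding results at Query Test Case Results.
  
     Creating Tables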
            
  