Common Spark SQL Built-in Functions

Strings:

1. concat: concatenates strings

concat(str1, str2, ..., strN) - Returns the concatenation of str1, str2, ..., strN.

Examples:> SELECT concat('Spark', 'SQL');  SparkSQL

2. concat_ws: concatenates strings with a separator inserted between them

concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings separated by sep.

Examples:> SELECT concat_ws(' ', 'Spark', 'SQL');  Spark SQL

3. decode: decodes binary data into a string using the given character set

decode(bin, charset) - Decodes the first argument using the second argument character set.

Examples: > SELECT decode(encode('abc', 'utf-8'), 'utf-8');   abc

4. encode: encodes a string into binary using the given character set

encode(str, charset) - Encodes the first argument using the second argument character set.

Examples: > SELECT encode('abc', 'utf-8');   abc

5. format_string/printf: formats a string printf-style

format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.

Examples:> SELECT format_string("Hello World %d %s", 100, "days");  Hello World 100 days

6. initcap: capitalizes the first letter of each word and lowercases the rest; lower converts everything to lowercase, upper to uppercase

initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.

Examples:> SELECT initcap('sPark sql');  Spark Sql

7. length: returns the length of a string (trailing spaces are counted)

Examples:> SELECT length('Spark SQL ');  10

8. levenshtein: edit distance (the minimum number of single-character edits needed to turn one string into the other)

levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings.

Examples:> SELECT levenshtein('kitten', 'sitting');   3
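To make the distance above less abstract, here is a minimal pure-Python sketch of the classic dynamic-programming edit distance. It is not Spark's implementation; the function name simply mirrors the SQL function, and it should agree with the documented example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The three edits here are k→s, e→i, and appending g.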

9. lpad: returns a string of fixed length, left-padded with the given characters if it is too short; rpad pads on the right

lpad(str, len, pad) - Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.

Examples:> SELECT lpad('hi', 5, '??');   ???hi
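The padding and truncation rule described above can be sketched in plain Python. These helpers are illustrative only, not Spark code; the empty-pad branch in particular is an assumption about edge-case behavior.

```python
def _padding(fill: int, pad: str) -> str:
    """Repeat pad and cut it to exactly `fill` characters."""
    return (pad * (fill // len(pad) + 1))[:fill]


def lpad(s: str, length: int, pad: str) -> str:
    if length <= len(s):
        return s[:max(length, 0)]  # longer input is shortened to `length`
    if not pad:
        return s                   # assumption: nothing to pad with
    return _padding(length - len(s), pad) + s


def rpad(s: str, length: int, pad: str) -> str:
    if length <= len(s):
        return s[:max(length, 0)]
    if not pad:
        return s
    return s + _padding(length - len(s), pad)


print(lpad("hi", 5, "??"))  # ???hi
print(lpad("hi", 1, "??"))  # h
print(rpad("hi", 5, "??"))  # hi???
```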

10. ltrim: removes leading spaces or the specified leading characters; rtrim removes them from the right, trim from both sides

ltrim(str) - Removes the leading space characters from str.

ltrim(trimStr, str) - Removes leading characters of str for as long as they appear in trimStr.

Examples:

> SELECT ltrim('    SparkSQL   ');   SparkSQL
> SELECT ltrim('Sp', 'SSparkSQLS');   arkSQLS
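The second form treats trimStr as a set of characters rather than a prefix, which is the same idea as Python's `str.lstrip`. A small sketch (note the argument order here is string first, unlike the SQL form shown above):

```python
def ltrim(s: str, trim_chars: str = " ") -> str:
    """Drop leading characters for as long as they are in trim_chars."""
    return s.lstrip(trim_chars)


def rtrim(s: str, trim_chars: str = " ") -> str:
    return s.rstrip(trim_chars)


def trim(s: str, trim_chars: str = " ") -> str:
    return s.strip(trim_chars)


print(repr(ltrim("    SparkSQL   ")))  # 'SparkSQL   '
print(ltrim("SSparkSQLS", "Sp"))       # arkSQLS
```

In the second call, S, S and p are removed, and trimming stops at a because it is not in the set.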

11. regexp_extract: extracts the substring matched by a regex group; regexp_replace: replaces regex matches

Examples:> SELECT regexp_extract('100-200', '(\d+)-(\d+)', 1);   100

Examples: > SELECT regexp_replace('100-200', '(\d+)', 'num');   num-num
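For comparison, the same two operations with Python's `re` module. This is only an analogy: Spark uses Java regular expressions, whose syntax differs from Python's in places (for example, group references in the replacement string), so treat the helpers below as hypothetical stand-ins.

```python
import re


def regexp_extract(s: str, pattern: str, idx: int) -> str:
    """Return the idx-th group of the first match, or '' if there is none."""
    m = re.search(pattern, s)
    if m is None:
        return ""
    return m.group(idx) or ""


def regexp_replace(s: str, pattern: str, replacement: str) -> str:
    """Replace every match of pattern in s."""
    return re.sub(pattern, replacement, s)


print(regexp_extract("100-200", r"(\d+)-(\d+)", 1))  # 100
print(regexp_extract("100-200", r"(\d+)-(\d+)", 2))  # 200
print(regexp_replace("100-200", r"(\d+)", "num"))    # num-num
```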

12. repeat: repeats the given string n times

Examples: > SELECT repeat('123', 2);  123123

13. instr/locate: returns the position of a substring within a string

instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.

Examples:> SELECT instr('SparkSQL', 'SQL');  6

Examples:> SELECT locate('bar', 'foobarbar');   4

14. space: produces n spaces, e.g. to prepend them to a string

space(n) - Returns a string consisting of n spaces.

Examples:> SELECT concat(space(2), '1');  1

15. split: splits a string around matches of a regex

split(str, regex) - Splits str around occurrences that match regex.

Examples:> SELECT split('oneAtwoBthreeC', '[ABC]');      ["one","two","three",""]
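The trailing empty string in the result above comes from the final C: the delimiter matches at the very end, leaving an empty piece after it. Python's `re.split` happens to behave the same way for this input, so it serves as a quick illustration (an analogy, not Spark itself):

```python
import re

parts = re.split(r"[ABC]", "oneAtwoBthreeC")
print(parts)  # ['one', 'two', 'three', '']

# Without a delimiter at the end there is no empty last element.
print(re.split(r"[ABC]", "oneAtwoBthree"))  # ['one', 'two', 'three']
```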

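16. substr: extracts a substring (1-based position, negative positions count from the end); substring_index: returns the part of the string before the n-th occurrence of a delimiter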

Examples:

> SELECT substr('Spark SQL', 5);  k SQL
> SELECT substr('Spark SQL', -3);  SQL
> SELECT substr('Spark SQL', 5, 1);   k
> SELECT substring_index('www.apache.org', '.', 2);   www.apache
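The indexing rules in these examples can be sketched in Python. Both helpers are illustrative re-implementations, and the handling of out-of-range positions and of an empty delimiter is an assumption, not something the examples above confirm.

```python
from typing import Optional


def substr(s: str, pos: int, length: Optional[int] = None) -> str:
    """1-based start; a negative pos counts back from the end of s."""
    if pos > 0:
        start = pos - 1
    elif pos < 0:
        start = len(s) + pos
    else:
        start = 0
    end = len(s) if length is None else start + length
    return s[max(start, 0):max(end, 0)]  # assumption: clamp at the string start


def substring_index(s: str, delim: str, count: int) -> str:
    """Text before the count-th delim; a negative count works from the right."""
    if not delim or count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])


print(substr("Spark SQL", 5))                     # k SQL
print(substr("Spark SQL", -3))                    # SQL
print(substr("Spark SQL", 5, 1))                  # k
print(substring_index("www.apache.org", ".", 2))  # www.apache
print(substring_index("www.apache.org", ".", -1)) # org
```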

17. translate: replaces each character found in the second argument with the character at the same position in the third

Examples: > SELECT translate('AaBbCc', 'abc', '123');   A1B2C3

18.get_json_object

get_json_object(json_txt, path) - Extracts a json object from path.

Examples:> SELECT get_json_object('{"a":"b"}', '$.a');  b

19.unhex

unhex(expr) - Converts hexadecimal expr to binary.

Examples:> SELECT decode(unhex('537061726B2053514C'), 'UTF-8');   Spark SQL
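The relationship between encode, decode and unhex is the usual text/bytes round trip. A short Python sketch of the same steps, where the `unhex` helper is a hypothetical stand-in for the SQL function:

```python
def unhex(hex_text: str) -> bytes:
    """Turn a hexadecimal string into the raw bytes it represents."""
    return bytes.fromhex(hex_text)


encoded = "abc".encode("utf-8")     # text -> bytes, like encode('abc', 'utf-8')
round_trip = encoded.decode("utf-8")  # bytes -> text, like decode(..., 'utf-8')
decoded = unhex("537061726B2053514C").decode("utf-8")

print(encoded)     # b'abc'
print(round_trip)  # abc
print(decoded)     # Spark SQL
```

Each pair of hex digits is one byte: 53 is S, 70 is p, 20 is a space, and so on.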

20.to_json

to_json(expr[, options]) - Returns a json string with a given struct value

Examples:

> SELECT to_json(named_struct('a', 1, 'b', 2));   {"a":1,"b":2}

> SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));   {"time":"26/08/2015"}

> SELECT to_json(array(named_struct('a', 1, 'b', 2)));   [{"a":1,"b":2}]

> SELECT to_json(map('a', named_struct('b', 1)));  {"a":{"b":1}}

> SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));   {"[1]":{"b":2}}

> SELECT to_json(map('a', 1));  {"a":1}

> SELECT to_json(array((map('a', 1))));  [{"a":1}]

Dates and times:

I. Getting the current time

1. current_date: returns the current date

2018-04-09

2. current_timestamp/now(): returns the current timestamp

2018-04-09 15:20:49.247

II. Extracting fields from dates and times

1. year, month, day/dayofmonth, hour, minute, second

Examples:> SELECT day('2009-07-30'); 30

2. dayofweek (1 = Sunday, 2 = Monday, ..., 7 = Saturday), dayofyear

Examples:> SELECT dayofweek('2009-07-30');   5

Since: 2.3.0
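The Sunday-first numbering differs from the ISO convention (Monday = 1, Sunday = 7) used by Python's `isoweekday`. A small sketch of the conversion, with helper names chosen to mirror the SQL functions:

```python
from datetime import date


def dayofweek(d: date) -> int:
    """1 = Sunday, 2 = Monday, ..., 7 = Saturday."""
    return d.isoweekday() % 7 + 1


def dayofyear(d: date) -> int:
    return d.timetuple().tm_yday


print(dayofweek(date(2009, 7, 30)))  # 5 (a Thursday)
print(dayofyear(date(2016, 4, 9)))   # 100
```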

3.weekofyear

weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.

Examples:> SELECT weekofyear('2008-02-20');   8
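The rule quoted above (weeks start on Monday, week 1 is the first week with more than 3 days) is the ISO 8601 week definition, which Python exposes through `isocalendar`. A sketch under that reading:

```python
from datetime import date


def weekofyear(d: date) -> int:
    """ISO 8601 week number of d."""
    return d.isocalendar()[1]


print(weekofyear(date(2008, 2, 20)))  # 8

# Early January can still belong to the last week of the previous year.
print(weekofyear(date(2016, 1, 1)))   # 53
```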

4. trunc: truncates a date to the given unit; the truncated parts are reset to 01

The second argument is one of ["year", "yyyy", "yy", "mon", "month", "mm"]

Examples:

> SELECT trunc('2009-02-12', 'MM');
 2009-02-01
> SELECT trunc('2015-10-27', 'YEAR');
 2015-01-01

5. date_trunc(fmt, ts): truncates a timestamp to the unit given by fmt, one of ["YEAR", "YYYY", "YY", "MON", "MONTH", "MM", "DAY", "DD", "HOUR", "MINUTE", "SECOND", "WEEK", "QUARTER"]

Examples:> SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');  2015-03-05T09:00:00

Since: 2.3.0

6. date_format: converts a date or timestamp to a string in the given format

Examples:> SELECT date_format('2016-04-08', 'y');    2016

III. Date and time conversion

1. unix_timestamp: returns the Unix timestamp of the current time, or of the given time

Examples:

> SELECT unix_timestamp();  1476884637
> SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');   1460041200
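The number returned for a date string depends on the time zone in which the string is interpreted, so the same query can give different values on differently configured clusters. The Python sketch below uses a hypothetical helper with a fixed UTC offset to show this; the value in the example above matches midnight at UTC+9, which is an inference from the arithmetic rather than something the source states.

```python
from datetime import datetime, timedelta, timezone


def unix_timestamp(text: str, utc_offset_hours: int) -> int:
    """Seconds since 1970-01-01 00:00:00 UTC for a 'yyyy-MM-dd' date taken
    as midnight in a fixed-offset zone (hypothetical helper)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local_midnight = datetime.strptime(text, "%Y-%m-%d").replace(tzinfo=tz)
    return int(local_midnight.timestamp())


print(unix_timestamp("2016-04-08", 0))  # 1460073600
print(unix_timestamp("2016-04-08", 9))  # 1460041200
```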

2. from_unixtime: converts a Unix timestamp to a time string in the given format; to_unix_timestamp: converts a time string to a Unix timestamp

Examples:

> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');  1970-01-01 00:00:00
> SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');  1460041200

3. to_date/date: converts a string to a date; to_timestamp (Since: 2.2.0) converts a string to a timestamp

> SELECT to_date('2009-07-30 04:17:52');  2009-07-30
> SELECT to_date('2016-12-31', 'yyyy-MM-dd');   2016-12-31
> SELECT to_timestamp('2016-12-31 00:12:00');   2016-12-31 00:12:00

4. quarter: returns the quarter of the year for a date (range 1 to 4)

Examples:> SELECT quarter('2016-08-31');  3

IV. Date and time arithmetic

1. months_between: number of months between two dates

months_between(timestamp1, timestamp2) - Returns number of months between timestamp1 and timestamp2.

Examples:> SELECT months_between('1997-02-28 10:30:00', '1996-10-30');  3.94959677
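The fractional result can be reproduced by hand. The sketch below assumes the commonly described rule: a whole number when both dates fall on the same day of the month or are both month ends, otherwise the leftover days and time of day are divided by a 31-day month and the result is rounded to 8 decimals. Treat it as an approximation of the behavior, not as Spark's code.

```python
import calendar
from datetime import datetime


def _is_month_end(t: datetime) -> bool:
    return t.day == calendar.monthrange(t.year, t.month)[1]


def months_between(t1: datetime, t2: datetime) -> float:
    months = (t1.year - t2.year) * 12 + (t1.month - t2.month)
    if t1.day == t2.day or (_is_month_end(t1) and _is_month_end(t2)):
        return float(months)
    secs1 = t1.hour * 3600 + t1.minute * 60 + t1.second
    secs2 = t2.hour * 3600 + t2.minute * 60 + t2.second
    day_diff = (t1.day - t2.day) + (secs1 - secs2) / 86400
    return round(months + day_diff / 31, 8)  # assumption: 31-day months


result = months_between(datetime(1997, 2, 28, 10, 30), datetime(1996, 10, 30))
print(result)
```

For the example above: 4 whole months, minus 2 days, plus 10.5 hours, which gives 4 - 1.5625 / 31 ≈ 3.94959677.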

2. add_months: returns the date n months after the given date

Examples:> SELECT add_months('2016-08-31', 1);  2016-09-30
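The example lands on 2016-09-30 because September has no 31st, so the day is clamped to the end of the target month. A Python sketch of that clamping follows; it is a simplified model that only covers the clamping shown in the example.

```python
import calendar
from datetime import date


def add_months(d: date, n: int) -> date:
    """Shift d by n months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + n
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last))


print(add_months(date(2016, 8, 31), 1))   # 2016-09-30
print(add_months(date(2016, 1, 15), -2))  # 2015-11-15
```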

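3. last_day(date): the last day of the month containing date; next_day(start_date, day_of_week): the first date after start_date that falls on day_of_week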

Examples:

> SELECT last_day('2009-01-12');  2009-01-31
> SELECT next_day('2015-01-14', 'TU');  2015-01-20
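Both functions are simple calendar lookups. A Python sketch, where the two-letter day table is an assumption about the accepted abbreviations:

```python
import calendar
from datetime import date, timedelta

# Assumption: days are identified by their first two letters.
_DAYS = {"MO": 0, "TU": 1, "WE": 2, "TH": 3, "FR": 4, "SA": 5, "SU": 6}


def last_day(d: date) -> date:
    """Last day of the month that contains d."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


def next_day(d: date, day_of_week: str) -> date:
    """First date strictly after d that falls on day_of_week."""
    target = _DAYS[day_of_week.upper()[:2]]
    ahead = (target - d.weekday() - 1) % 7 + 1  # always between 1 and 7
    return d + timedelta(days=ahead)


print(last_day(date(2009, 1, 12)))        # 2009-01-31
print(next_day(date(2015, 1, 14), "TU"))  # 2015-01-20
```

2015-01-14 is a Wednesday, so the next Tuesday is six days later.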

4. date_add, date_sub (subtracts days)

date_add(start_date, num_days) - Returns the date that is num_days after start_date.

Examples:

> SELECT date_add('2016-07-30', 1);  2016-07-31

5. datediff (number of days between two dates)

datediff(endDate, startDate) - Returns the number of days from startDate to endDate.

Examples:> SELECT datediff('2009-07-31', '2009-07-30'); 1

6. Working with UTC time

to_utc_timestamp

to_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.

Examples:> SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');  2016-08-30 15:00:00

from_utc_timestamp

from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

Examples:> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');  2016-08-31 09:00:00
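The two functions are inverses: one subtracts the zone's UTC offset, the other adds it. The sketch below models Asia/Seoul as a fixed UTC+9 offset, which is a simplifying assumption; a real zone lookup would also handle daylight saving rules.

```python
from datetime import datetime, timedelta

SEOUL = timedelta(hours=9)   # assumption: Asia/Seoul as a fixed UTC+9 offset
GMT_PLUS_1 = timedelta(hours=1)


def to_utc_timestamp(ts: datetime, offset: timedelta) -> datetime:
    """Read ts as local time in the zone and express it in UTC."""
    return ts - offset


def from_utc_timestamp(ts: datetime, offset: timedelta) -> datetime:
    """Read ts as UTC and express it as local time in the zone."""
    return ts + offset


print(to_utc_timestamp(datetime(2016, 8, 31), SEOUL))    # 2016-08-30 15:00:00
print(from_utc_timestamp(datetime(2016, 8, 31), SEOUL))  # 2016-08-31 09:00:00
print(to_utc_timestamp(datetime(2017, 7, 14, 2, 40), GMT_PLUS_1))  # 2017-07-14 01:40:00
```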