The escape character \ (pitfalls in Hive + shell and Java): in regular expressions the escape character is written as a double backslash, and split's delimiter is also parsed as a regex

The escape character

An escape character escapes the character that follows it, so that a character with special meaning is treated as an ordinary one, or an ordinary character is given a special meaning.
It appears in just about every language: Java, Python, SQL, Hive, shell, and so on.
For example, in SQL the sequences
        "\""    
        "\'"
        "\t"
        "\n"
are output directly as
        "
        '
        a tab
        a newline

Typical uses of the escape character

"\"轉義字符放到字符前面,如java和python輸出內容用雙引號標識,雙引號中可以用轉義字符\進行轉義輸出,比如輸出雙引號
java中 system.out.print("\"")
python中 print "\""
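A minimal, self-contained Java sketch of the same idea (the class name and sample strings are made up for illustration):

    public class EscapeBasics {
        public static void main(String[] args) {
            System.out.println("He said \"hi\"");   // \"  -> a literal double quote
            System.out.println("col1\tcol2");       // \t  -> a tab
            System.out.println("line1\nline2");     // \n  -> a newline
            System.out.println("C:\\temp");         // \\  -> a single backslash
        }
    }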

A special case: escaping the escape character itself

The special case is the escape character escaping itself: in Java, for instance, you sometimes need two escape characters "\\", or even four "\\\\".

1) The two cases in Java
    regular-expression matching and String's split function
    In both cases, when the pattern string contains the escape character "\", the escape character itself has to be escaped first, i.e. you need two escape characters "\\". (Java parses the string literal first, and the regex / split machinery then parses the result again.)
    To match a literal backslash "\" you even need four escape characters "\\\\": the Java compiler first reduces the literal to the two characters "\\", and the regex / split parser then reduces those to a single "\", which is the character actually matched.
    In other words, two rounds of parsing are needed before a single backslash "\" appears in the pattern, and only then can it escape the character that follows it. Both cases are sketched below.
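A minimal runnable sketch of both cases (the class name and sample strings are made up for illustration):

    public class JavaEscapeDemo {
        public static void main(String[] args) {
            // Case 1: split on a literal '|'. The source literal "\\|" becomes the
            // 2-character pattern \| , which the regex engine reads as a literal '|'.
            String line = "a|b|c";
            System.out.println(line.split("\\|")[1]);        // prints: b

            // Case 2: split on a literal backslash. The source literal "\\\\" becomes
            // the 2-character pattern \\ , i.e. one escaped backslash for the regex engine.
            String path = "a\\b\\c";                         // the actual string is a\b\c
            System.out.println(path.split("\\\\")[2]);       // prints: c

            // The same doubling applies to java.util.regex directly:
            System.out.println("1+2".matches("\\d\\+\\d"));  // prints: true
        }
    }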

2) split and regular expressions in Hive
    Hive is written in Java, so the same two cases also need the two characters "\\". Take a query using split as an example:
    select 
    ad,
    '月資費類型' as feature,
    (CASE subscriptionfee_id
        when '0' then '無'
        when '1' then '[0,50)'
        when '2' then '[50,100]'
        when '3' then '[100,150]'
        when '4' then '[150,200)'
        when '5' then '>=200'
        else 'error_data' 
    END) as feature_detail,
    1 as type
    from mengniubi.dianxin_user_tags
    union all
    select 
    ad,
    '愛好分佈' as feature,
    split(new_interest,'\\|')[1] as feature_detail,
    2 as type
    from mengniubi.dianxin_user_tags
    lateral view explode(interests) AllInterests as new_interest
    union all
    select 
    ad,
    '商品瀏覽' as feature,
    split(products,'\\|')[0] as feature_detail,
    4 as type
    from mengniubi.dianxin_user_tags
    lateral view explode(split(product_view_cates,',')) AllProducts as products
In this code, if the delimiter were the backslash "\" itself, four escape characters "\\\\" would be needed, i.e.
    split(products,'\\\\')[0] as feature_detail,
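Hive's split ultimately uses Java's regex engine, and the SQL string literal goes through Hive's own unescaping first (the two-level parsing described above). A rough Java sketch of what finally reaches the regex engine in the '\\\\' case (the class name and sample data are made up):

    public class HiveBackslashDelimiter {
        public static void main(String[] args) {
            // The Hive literal '\\\\' is unescaped by Hive's SQL parser to the two
            // characters \\ , which the regex engine then reads as one literal backslash.
            String regexSeenByEngine = "\\\\";            // Java literal for the 2-char pattern \\
            String products = "shoes\\hats\\bags";        // hypothetical field value: shoes\hats\bags
            System.out.println(products.split(regexSeenByEngine)[0]);   // prints: shoes
        }
    }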

3) Running Hive statements from a shell script
The shell has escape characters of its own and processes them itself.
So when a Hive statement is executed inside a shell script, it is first unescaped by the shell and only then parsed by Hive, which adds yet another level of escaping.
If the Hive statement above were pasted into a shell script unchanged, it would fail: the shell consumes one level of backslashes and passes "\|" on to Hive; after Hive parses that escape character in the string literal, split no longer receives a usable delimiter.
So remember that when a Hive statement runs from a shell script, the escape characters have to be doubled again. Hive processes the statement after the shell has unescaped it, and it only runs if the statement is still correct at that point. The code above looks like this in a shell script:

#!/bin/bash
##### execute hive sql for analyzing data #####
arg_count=$#
if [ $arg_count -lt 1 ];then
   echo "參數錯誤 [$*], Usage:$0 2015-08"
   exit 1
fi

if [ ! -d "$HIVE_HOME" ];then
   echo "HIVE_HOME not exists .. "
   exit 2
fi

month_arg=$1
echo "month : ${month_arg}"

echo "start ... "
########################  SQL EDIT AREA 1 BEGIN #####################################
msg="step1 t_bi_daily_ad_area_report .."
echo
echo
echo $msg
echo
echo
sql=$(cat <<!EOF

USE test_bi;
set mapred.queue.names=queue3;
SET mapred.reduce.tasks=14;

insert OVERWRITE table t_bi_figure_whole_network_report partition(month='${month_arg}')
select '-1' as brand,feature,feature_detail,type,count(ad) as count
from 
(
    select 
    ad,
    '月資費類型' as feature,
    (CASE subscriptionfee_id
        when '0' then '無'
        when '1' then '[0,50)'
        when '2' then '[50,100]'
        when '3' then '[100,150]'
        when '4' then '[150,200)'
        when '5' then '>=200'
        else 'error_data' 
    END) as feature_detail,
    1 as type
    from mengniubi.dianxin_user_tags
    union all
    select 
    ad,
    '愛好分佈' as feature,
    split(new_interest,'\\\\|')[1] as feature_detail,
    2 as type
    from mengniubi.dianxin_user_tags
    lateral view explode(interests) AllInterests as new_interest
    union all
    select 
    ad,
    '商品瀏覽' as feature,
    split(products,'\\\\|')[0] as feature_detail,
    4 as type
    from mengniubi.dianxin_user_tags
    lateral view explode(split(product_view_cates,',')) AllProducts as products

) t1
group by feature,feature_detail,type
union all
select '-1' as brand,
    '搜索關鍵字' as feature,
    search_word as feature_detail,
    2 as type,
    count(1) as count 
from mengniubi.dianxin_user_tags
lateral view explode(split(search_keywords,',')) AllKeyWords as search_word
where search_word is not null and search_word <> '' 
group by search_word
order by count desc
limit 1000;
!EOF)
########### execute begin ##########
echo $sql
$HIVE_HOME/bin/hive -e "$sql"
exitCode=$?
if [ $exitCode -ne 0 ];then
   echo "[ERROR] $msg"
   exit $exitCode
fi
########### execute end  ###########
########################  SQL EDIT AREA 1 END #######################################

In Hive regular expressions the escape character is a double backslash

In Hive's split function the delimiter is a regular expression

split(string str, string pat) 

Splits str around pat (pat is a regular expression).
The delimiter here is a regular expression. Splitting on a single ordinary character works as expected, e.g.
select 
    split(all,'~')
from
tb_pmp_log_all_lmj_tmp
limit 10
When the delimiter is the pipe '|', using it directly makes it the regex alternation operator with nothing on either side, so it matches the empty string and splits the field into individual characters, e.g.
    select 
        split(all,'|')
    from
    tb_pmp_log_all_lmj_tmp
    limit 10

Because the escape character in a Hive regular expression is written as a double backslash, a literal '|' becomes '\\|', e.g.

    select 
        split(all,'\\|')
    from
    tb_pmp_log_all_lmj_tmp
    limit 10

A bare '|+', on the other hand, is not a valid pattern and throws an error.
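A quick Java check of all three patterns (Hive's split delegates to the same regex engine; the class name and sample string are made up):

    import java.util.Arrays;
    import java.util.regex.Pattern;

    public class PipeSplitCheck {
        public static void main(String[] args) {
            String s = "ab|cd";
            // A bare "|" is alternation between two empty branches, matches the empty
            // string everywhere, and splits between every character (Java 8+ output).
            System.out.println(Arrays.toString(s.split("|")));    // [a, b, |, c, d]
            // "\\|" escapes the pipe, so it is matched literally.
            System.out.println(Arrays.toString(s.split("\\|")));  // [ab, cd]
            // "|+" is rejected: the '+' has nothing to repeat.
            try {
                Pattern.compile("|+");
            } catch (java.util.regex.PatternSyntaxException e) {
                System.out.println(e.getMessage());               // e.g. "Dangling meta character '+'"
            }
        }
    }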

When the delimiter is a combination of several characters, remember that inside the regex the double backslash \\ is the escape; writing just '\|' would be wrong.
To match the delimiter "|~|", for example:

select 
    split(all,'\\|~\\|')
from
tb_pmp_log_all_lmj_tmp
limit 10

Alternatively, put the characters into a [] character class (note that '[|~]+' matches any run of '|' and '~' characters, not only the exact sequence '|~|'):

select 
    split(all,'[|~]+')
from
tb_pmp_log_all_lmj_tmp
limit 10
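
A small Java comparison of the two patterns (class name and sample strings made up for illustration):

    import java.util.Arrays;

    public class MultiCharDelimiterCheck {
        public static void main(String[] args) {
            // Both patterns split a well-formed record the same way ...
            System.out.println(Arrays.toString("a|~|b|~|c".split("\\|~\\|")));  // [a, b, c]
            System.out.println(Arrays.toString("a|~|b|~|c".split("[|~]+")));    // [a, b, c]
            // ... but the character class also splits on other runs of '|' and '~'.
            System.out.println(Arrays.toString("a~~b".split("\\|~\\|")));       // [a~~b]
            System.out.println(Arrays.toString("a~~b".split("[|~]+")));         // [a, b]
        }
    }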