Learning OpenResty, Chapter 5: Common Lua Libraries, Part 2: JSON Libraries, Encoding Conversion, and String Processing

This article is reposted from https://blog.csdn.net/jinnianshilongnian/article/details/84703641. It is a great write-up; thanks to the original author for sharing!

JSON Libraries

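JSON is widely used for data transfer, so converting between Lua objects and JSON strings is a very common task. Lua has several JSON libraries; I have used cjson and dkjson. cjson parses strictly: Unicode escapes such as \u0020\u7eaf must be well-formed, and a malformed one such as \u002 makes parsing fail. dkjson is more lenient. You can of course also patch the cjson source to meet special requirements. I have not run into performance problems with dkjson, and it is what I currently use. One thing to watch closely: most JSON libraries support only UTF-8, so if your data is in another encoding such as GBK, convert it to UTF-8 before processing.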

1.1. test_cjson.lua

local cjson = require("cjson")
 
-- Lua object to JSON string
local obj = {
    id = 1,
    name = "zhangsan",
    age = nil,        -- a nil field is absent from the table, so it is not encoded
    is_male = false,
    hobby = {"film", "music", "read"}
}
 
local str = cjson.encode(obj)
ngx.say(str, "<br/>")
 
-- JSON string to Lua object
str = '{"hobby":["film","music","read"],"is_male":false,"name":"zhangsan","id":1,"age":null}'
local obj = cjson.decode(str)
 
ngx.say(obj.age, "<br/>")
ngx.say(obj.age == nil, "<br/>")
ngx.say(obj.age == cjson.null, "<br/>")
ngx.say(obj.hobby[1], "<br/>")
 
 
-- Circular reference
obj = {
   id = 1
}
obj.obj = obj
-- cjson.encode raises an error here: Cannot serialise, excessive nesting
--ngx.say(cjson.encode(obj), "<br/>")
local cjson_safe = require("cjson.safe")
-- cjson.safe returns nil (plus an error message) instead of raising
ngx.say(cjson_safe.encode(obj), "<br/>")

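A JSON null decodes to cjson.null, not to nil. Encoding a circular reference raises the error "Cannot serialise, excessive nesting". The default maximum nesting depth for encoding is 1000; it can be changed with cjson.encode_max_depth(), and a lower limit lets runaway nesting fail sooner. cjson.safe does not raise on error; it returns nil and an error message instead.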
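
Because a decoded null arrives as the cjson.null sentinel and not as nil, a check such as `if obj.age then` succeeds unexpectedly. Below is a minimal sketch of one way to normalize this. The helper name strip_null is an assumption for illustration (it is not part of cjson), and the demo uses a stand-in sentinel so that it runs without cjson installed.

```lua
-- Hypothetical helper (not part of cjson): recursively replaces a null
-- sentinel with nil so that ordinary truthiness checks work again.
local function strip_null(value, null)
    if value == null then
        return nil
    end
    if type(value) ~= "table" then
        return value
    end
    local cleaned = {}
    for k, v in pairs(value) do
        cleaned[k] = strip_null(v, null)
    end
    return cleaned
end

-- Stand-in for cjson.null so the sketch is self-contained.
local NULL = {}
local decoded = { id = 1, name = "zhangsan", age = NULL, hobby = { "film", NULL } }
local cleaned = strip_null(decoded, NULL)

print(cleaned.age == nil)   -- the sentinel has been replaced by nil
print(cleaned.hobby[1])     -- other values are copied unchanged
```

With cjson the call would look like strip_null(cjson.decode(str), cjson.null). Dropping a null from the middle of an array leaves a hole, so this approach suits object fields better than arrays.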

1.2. example.conf configuration

     location ~ /lua_cjson {
        default_type 'text/html';
        lua_code_cache on;
        content_by_lua_file /usr/example/lua/test_cjson.lua;
     }

1.3. Requesting a URL such as http://192.168.1.2/lua_cjson returns the following

{"hobby":["film","music","read"],"is_male":false,"name":"zhangsan","id":1}
null
false
true
film
nil

lua-cjson documentation: http://www.kyne.com.au/~mark/software/lua-cjson-manual.html

Next, let's look at dkjson.

2.1. Download the dkjson library

cd /usr/example/lualib/
wget "http://dkolf.de/src/dkjson-lua.fsl/raw/dkjson.lua?name=16cbc26080996d9da827df42cb0844a25518eeb3" -O dkjson.lua

2.2. test_dkjson.lua

local dkjson = require("dkjson")
 
-- Lua object to JSON string
local obj = {
    id = 1,
    name = "zhangsan",
    age = nil,
    is_male = false,
    hobby = {"film", "music", "read"}
}
 
local str = dkjson.encode(obj, {indent = true})
ngx.say(str, "<br/>")
 
-- JSON string to Lua object
str = '{"hobby":["film","music","read"],"is_male":false,"name":"zhangsan","id":1,"age":null}'
local obj, pos, err = dkjson.decode(str, 1, nil)
 
ngx.say(obj.age, "<br/>")
ngx.say(obj.age == nil, "<br/>")
ngx.say(obj.hobby[1], "<br/>")
 
-- Circular reference
obj = {
   id = 1
}
obj.obj = obj
-- dkjson.encode raises an error here: reference cycle
--ngx.say(dkjson.encode(obj), "<br/>")

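By default dkjson produces compact output on a single line; the {indent = true} option adds line breaks and indentation, which is why the HTML-rendered result below shows extra spaces. Unlike cjson, dkjson decodes a JSON null to nil.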

2.3. example.conf configuration

     location ~ /lua_dkjson {
        default_type 'text/html';
        lua_code_cache on;
        content_by_lua_file /usr/example/lua/test_dkjson.lua;
     }

2.4. Requesting a URL such as http://192.168.1.2/lua_dkjson returns the following

{ "hobby":["film","music","read"], "is_male":false, "name":"zhangsan", "id":1 }
nil
true
film

dkjson documentation: http://dkolf.de/src/dkjson-lua.fsl/home and http://dkolf.de/src/dkjson-lua.fsl/wiki?name=Documentation

Encoding Conversion

Most libraries support only UTF-8, so data in any other encoding has to be converted first. The most common conversion tool on Linux is iconv, and lua-iconv is a Lua binding for it.

lua-iconv can be installed in either of two ways:

On Ubuntu you can use:

apt-get install luarocks
luarocks install lua-iconv 
cp /usr/local/lib/lua/5.1/iconv.so  /usr/example/lualib/

Building from source, which requires gcc:

wget https://github.com/downloads/ittner/lua-iconv/lua-iconv-7.tar.gz
tar -xvf lua-iconv-7.tar.gz
cd lua-iconv-7
gcc -O2 -fPIC -I/usr/include/lua5.1 -c luaiconv.c -o luaiconv.o -I/usr/include
gcc -shared -o iconv.so -L/usr/local/lib luaiconv.o -L/usr/lib
cp iconv.so  /usr/example/lualib/

1. test_iconv.lua

ngx.say("中文")

The file must be saved as UTF-8 here: string literals in a Lua file are stored in whatever encoding the file itself uses.

2. example.conf configuration

     location ~ /lua_iconv {
        default_type 'text/html';
        charset gbk;
        lua_code_cache on;
        content_by_lua_file /usr/example/lua/test_iconv.lua;
     }

The charset directive tells the browser that the response is encoded as GBK.

3. Requesting http://192.168.1.2/lua_iconv shows garbled output, because the bytes sent are UTF-8 while the declared charset is GBK.

To fix this, the string in test_iconv.lua has to be transcoded:

local iconv = require("iconv")
local togbk = iconv.new("gbk", "utf-8")
local str, err = togbk:iconv("中文")
ngx.say(str)

After transcoding, the output is encoded as GBK. The usage is iconv.new(target encoding, source encoding).

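The conversion can return the following error values:

nil
Success, no error.
iconv.ERROR_NO_MEMORY
Out of memory.
iconv.ERROR_INVALID
The input contains an invalid character.
iconv.ERROR_INCOMPLETE
The input contains an incomplete character.
iconv.ERROR_FINALIZED
The converter has already been destroyed, for example by garbage collection.
iconv.ERROR_UNKNOWN
Unknown error.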

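iconv fails when it meets an invalid character or one that cannot be converted. To skip such characters instead of failing, create the converter like this: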

local togbk_ignore = iconv.new("GBK//IGNORE", "UTF-8")

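In practice, when converting from UTF-8 to GBK, you may also find characters that are in the GBK code table but still fail to convert. In that case the broader GB18030 encoding can be used as the target instead.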
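
The fallback described above can be sketched as a small helper. The function name convert_with_fallback and the stub converters are assumptions made for illustration. In OpenResty the list would hold real converters such as iconv.new("GBK", "UTF-8") and iconv.new("GB18030", "UTF-8"), whose iconv method returns the converted string plus an error code that is nil on success.

```lua
-- Tries each converter in order and returns the first successful result.
-- A converter only needs an :iconv(text) method returning (result, err),
-- with err == nil on success, which is the shape lua-iconv converters have.
local function convert_with_fallback(converters, text)
    local last_err
    for _, cd in ipairs(converters) do
        local out, err = cd:iconv(text)
        if err == nil then
            return out
        end
        last_err = err
    end
    return nil, last_err
end

-- Stub converters standing in for iconv.new(...) so the sketch runs anywhere.
-- The error value here is a placeholder string, not a real lua-iconv constant.
local failing = { iconv = function(self, text) return nil, "ERROR_INVALID" end }
local working = { iconv = function(self, text) return "<" .. text .. ">", nil end }

print(convert_with_fallback({ failing, working }, "abc"))  -- second converter succeeds
print(convert_with_fallback({ failing }, "abc"))           -- nil plus the last error
```

Keep in mind that the browser or consumer must then be told which encoding was actually produced, since the result of the fallback is GB18030 and not GBK.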

For more details see http://ittner.github.io/lua-iconv/

Bitwise Operations

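Lua had no bitwise operators before version 5.3 (Lua 5.2 only offered the bit32 library), so Lua 5.1 code needs a third-party library. LuaJIT, for example, provides the bit library.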

1. test_bit.lua

local bit = require("bit")
ngx.say(bit.lshift(1, 2))

lshift performs a left shift: 1 shifted left by 2 bits gives 4.

For the other bit operations see http://bitop.luajit.org/api.html. The Lua 5.3 bitwise operators are described at http://cloudwu.github.io/lua53doc/manual.html#3.4.2

Cache

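The ngx_lua module itself provides ngx.shared.DICT, a shared memory zone visible to all Worker processes, and an external store such as Redis can also serve as a cache. A third option is lua-resty-lrucache. Unlike ngx.shared.DICT, it is not shared across Workers: each Worker process holds its own copy of the cache. In my own use it also performed worse than ngx.shared.DICT. Its advantage is that it needs no global configuration.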

1. Create a cache module so that initialization happens only once:

vim /usr/example/lualib/mycache.lua

local lrucache = require("resty.lrucache")
-- Create the cache instance, specifying the maximum number of entries
local cache, err = lrucache.new(200)
if not cache then
   ngx.log(ngx.ERR, "create cache error : ", err)
end
local function set(key, value, ttlInSeconds)
    cache:set(key, value, ttlInSeconds)
end
local function get(key)
    return cache:get(key)
end
local _M = {
  set = set,
  get = get
}
 
return _M

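This relies on module caching: the module body runs only the first time it is required, so each Worker process initializes the cache instance exactly once.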
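
The "initialize once" behaviour comes from Lua's module cache: require runs a module body only on the first call and afterwards returns the value stored in package.loaded. Here is a minimal sketch in plain Lua. The module name demo_cache and its contents are hypothetical, and the module is registered through package.preload so that no extra file is needed.

```lua
local init_count = 0

-- Register a hypothetical module; the loader function plays the role of
-- the file body of a module such as mycache.lua.
package.preload["demo_cache"] = function()
    init_count = init_count + 1      -- runs once per Lua VM
    local store = {}
    return {
        set = function(key, value) store[key] = value end,
        get = function(key) return store[key] end,
    }
end

local first = require("demo_cache")
local second = require("demo_cache")   -- served from package.loaded

first.set("count", 1)
print(second.get("count"))   -- both names refer to the same instance
print(init_count)            -- the loader ran a single time
```

In OpenResty each Worker process has its own Lua VM, so this happens once per Worker, provided lua_code_cache is on.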

2. test_lrucache.lua

local mycache = require("mycache")
local count = mycache.get("count") or 0
count = count + 1
mycache.set("count", count, 10 * 60) -- TTL in seconds: 10 minutes
ngx.say(mycache.get("count"))

This can be used for things like counting visits, but the count is per Worker process only.

3. example.conf configuration

     location ~ /lua_lrucache {
        default_type 'text/html';
        lua_code_cache on;
        content_by_lua_file /usr/example/lua/test_lrucache.lua;
     }

Test by requesting a URL such as http://192.168.1.2/lua_lrucache.

For more details see https://github.com/openresty/lua-resty-lrucache

String Processing

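Before Lua 5.3 the standard library had no character-aware string functions: operations such as taking a substring or replacing text work on bytes. That is clearly not enough in practice, especially for text containing Chinese characters. Even Lua 5.3 only provides basic UTF-8 support.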
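
To make the byte-versus-character difference concrete, here is a minimal sketch in plain Lua. The helper name utf8_char_count is an assumption for illustration; it counts every byte that is not a UTF-8 continuation byte (0x80 to 0xBF) and assumes the input is valid UTF-8. The Chinese characters are written as byte escapes so the result does not depend on how the file is saved.

```lua
-- Counts characters in a valid UTF-8 string by counting the bytes that
-- are NOT continuation bytes (decimal 128 to 191).
local function utf8_char_count(s)
    local _, count = string.gsub(s, "[^\128-\191]", "")
    return count
end

-- "abc" followed by the UTF-8 bytes of the two characters 中文
local str = "abc\228\184\173\230\150\135"

print(#str)                    -- 9: the length operator counts bytes
print(utf8_char_count(str))    -- 5: three ASCII letters plus two characters
print(#string.sub(str, 1, 4))  -- 4 bytes, which cuts the first Chinese character apart
```

A dedicated UTF-8 library is still the better choice for real code, since this sketch does no validation and offers no character-aware substring.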

Lua UTF-8 library

https://github.com/starwing/luautf8

Installing with LuaRocks

# Make sure git is installed first
apt-get install git
luarocks install utf8
cp /usr/local/lib/lua/5.1/utf8.so  /usr/example/lualib/

Building from source

wget https://github.com/starwing/luautf8/archive/master.zip
unzip master.zip
cd luautf8-master/
gcc -O2 -fPIC -I/usr/include/lua5.1 -c utf8.c -o utf8.o -I/usr/include
gcc -shared -o utf8.so -L/usr/local/lib utf8.o -L/usr/lib

1. test_utf8.lua

local utf8 = require("utf8")
local str = "abc中文"
ngx.say("len : ", utf8.len(str), "<br/>")
ngx.say("sub : ", utf8.sub(str, 1, 4))

The file must be encoded as UTF-8. This example covers the two most common operations: string length and substring.

2. example.conf configuration

     location ~ /lua_utf8 {
        default_type 'text/html';
        lua_code_cache on;
        content_by_lua_file /usr/example/lua/test_utf8.lua;
     }

3. Requesting a URL such as http://192.168.1.2/lua_utf8 returns the following

len : 5
sub : abc中

Converting a string to Unicode escapes:

local bit = require("bit")
local bit_band = bit.band
local bit_bor = bit.bor
local bit_lshift = bit.lshift
local string_format = string.format
local string_byte = string.byte
local table_concat = table.concat
local function utf8_to_unicode(str)
    if not str or str == "" or str == ngx.null then
        return nil
    end
    local res, seq, val = {}, 0, nil
    for i = 1, #str do
        local c = string_byte(str, i)
        if seq == 0 then
            if val then
                res[#res + 1] = string_format("%04x", val)
            end
 
           seq = c < 0x80 and 1 or c < 0xE0 and 2 or c < 0xF0 and 3 or
                              c < 0xF8 and 4 or --c < 0xFC and 5 or c < 0xFE and 6 or
                              0
            if seq == 0 then
                ngx.log(ngx.ERR, 'invalid UTF-8 character sequence' .. ",,," .. tostring(str))
                return str
            end
 
            val = bit_band(c, 2 ^ (8 - seq) - 1)
        else
            val = bit_bor(bit_lshift(val, 6), bit_band(c, 0x3F))
        end
        seq = seq - 1
    end
    if val then
        res[#res + 1] = string_format("%04x", val)
    end
    if #res == 0 then
        return str
    end
    return "\\u" .. table_concat(res, "\\u")
end
 
ngx.say("utf8 to unicode : ", utf8_to_unicode("abc中文"), "<br/>")

The code above outputs utf8 to unicode : \u0061\u0062\u0063\u4e2d\u6587.

Trimming whitespace:

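-- local aliases used by the trim functions below
local string_find = string.find
local string_sub = string.sub
local function ltrim(s)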
    if not s then
        return s
    end
    local res = s
    local tmp = string_find(res, '%S')
    if not tmp then
        res = ''
    elseif tmp ~= 1 then
        res = string_sub(res, tmp)
    end
    return res
end
local function rtrim(s)
    if not s then
        return s
    end
    local res = s
    local tmp = string_find(res, '%S%s*$')
    if not tmp then
        res = ''
    elseif tmp ~= #res then
        res = string_sub(res, 1, tmp)
    end
    return res
end
local function trim(s)
    if not s then
        return s
    end
    local res1 = ltrim(s)
    local res2 = rtrim(res1)
    return res2
end
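
For comparison, the same trimming can be sketched with a single pattern capture. This is an alternative formulation and not the author's code; the name trim_simple is made up for the example.

```lua
-- "^%s*" skips leading whitespace, "(.-)" lazily captures the middle,
-- and "%s*$" consumes trailing whitespace up to the end of the string.
local function trim_simple(s)
    if not s then
        return s
    end
    -- the extra parentheses keep only the first return value
    return (string.match(s, "^%s*(.-)%s*$"))
end

print("[" .. trim_simple("  hello world \t\n") .. "]")  -- [hello world]
print("[" .. trim_simple("   ") .. "]")                 -- []
```

Like the functions above, it returns nil unchanged and turns a whitespace-only string into an empty string.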

Splitting a string:

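-- Splits szFullString on the literal separator szSeparator.
local function split(szFullString, szSeparator)
    -- an empty separator would never advance and loop forever
    assert(szSeparator ~= "", "separator must not be empty")
    local nFindStartIndex = 1
    local nSplitIndex = 1
    local nSplitArray = {}
    while true do
        -- the fourth argument (true) disables pattern matching, so separators
        -- such as "." or "%" are matched literally
        local nFindLastIndex = string.find(szFullString, szSeparator, nFindStartIndex, true)
        if not nFindLastIndex then
            nSplitArray[nSplitIndex] = string.sub(szFullString, nFindStartIndex, string.len(szFullString))
            break
        end
        nSplitArray[nSplitIndex] = string.sub(szFullString, nFindStartIndex, nFindLastIndex - 1)
        nFindStartIndex = nFindLastIndex + string.len(szSeparator)
        nSplitIndex = nSplitIndex + 1
    end
    return nSplitArray
end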

For example, split("a,b,c", ",") returns a table containing the three parts "a", "b", and "c".
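
Here is a short usage sketch showing what the returned table looks like, including the empty-field case. To keep the block self-contained it defines a compact stand-in named split_plain (a hypothetical name) that treats the separator as literal text; it is not the author's implementation.

```lua
local function split_plain(text, sep)
    assert(sep ~= "", "separator must not be empty")
    local parts, start = {}, 1
    while true do
        -- plain find (fourth argument true): no pattern matching
        local first, last = string.find(text, sep, start, true)
        if not first then
            parts[#parts + 1] = string.sub(text, start)
            break
        end
        parts[#parts + 1] = string.sub(text, start, first - 1)
        start = last + 1
    end
    return parts
end

local parts = split_plain("a,b,c", ",")
print(#parts, parts[1], parts[2], parts[3])

-- Adjacent separators produce an empty string, not a missing entry.
local with_gap = split_plain("a,,c", ",")
print(#with_gap, "[" .. with_gap[2] .. "]")

-- With plain matching, "." is a literal dot and not "any character".
print(table.concat(split_plain("192.168.1.2", "."), " "))
```

Whether empty fields should be kept or dropped depends on the data being parsed; this sketch keeps them so that field positions stay stable.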

That covers the basic string operations. The rest of the luautf8 API mirrors the standard Lua string API; see:

http://cloudwu.github.io/lua53doc/manual.html#6.4

http://cloudwu.github.io/lua53doc/manual.html#6.5

To work with GBK text, convert it to UTF-8 first, process it, and then convert the result back to GBK.
