Learning MXNet, Part 3: Symbol

Overview

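This section introduces Symbol in MXNet. Symbol is another core concept in MXNet and is available as mxnet.symbol, or mxnet.sym for short. A symbol represents a symbolic expression with one or more outputs. The expression is composed of operators, which can be as simple as matrix addition ("+") or as complex as a neural network layer (a convolution layer, for example). An operator can take several inputs, produce one or more outputs, and also hold internal state variables. A variable is either a free symbol waiting to be bound to a value, or an output of another symbol.

NDArray, the other key concept in MXNet, is similar to numpy and simple to use, so it is not covered separately; it is mentioned where it comes up in the code. This section also skips the storage part and the convolutional neural network part of the official tutorial, and some explanatory passages are not reproduced. See the official tutorial if you are interested.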

Main text

Basic operations

import mxnet as mx
# build a simple expression, a + b
a = mx.sym.Variable('a')
b = mx.sym.Variable('b')
# c is not given an explicit name, so MXNet generates a unique one automatically (not necessarily "c")
c = a + b

# the operations below resemble numpy
# element-wise multiplication
d = a * b
# matrix multiplication
e = mx.sym.dot(a, b)
# reshape
f = mx.sym.Reshape(d+e, shape=(1,4))
# broadcast
g = mx.sym.broadcast_to(f, shape=(2,4))

mx.viz.plot_network(symbol=g).view()

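Basic neural network

Beyond the basic operations above, Symbol also provides a set of neural network layers. The code below builds a two-layer fully connected network.

# the network is simple enough not to need a separate explanation; relu is the activation function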
# Output may vary
net = mx.sym.Variable('data')
net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=128)
net = mx.sym.Activation(data=net, name='relu1', act_type="relu")
net = mx.sym.FullyConnected(data=net, name='fc2', num_hidden=10)
net = mx.sym.SoftmaxOutput(data=net, name='out')
mx.viz.plot_network(net, shape={'data':(100,200)}).view()

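Combining multiple Symbols

To build a network with several kinds of output, use mxnet.sym.Group to combine multiple symbols, for example the softmax and linear regression outputs below.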

net = mx.sym.Variable('data')
fc1 = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=128)
net = mx.sym.Activation(data=fc1, name='relu1', act_type="relu")
out1 = mx.sym.SoftmaxOutput(data=net, name='softmax')
out2 = mx.sym.LinearRegressionOutput(data=net, name='regression')
group = mx.sym.Group([out1, out2])
group.list_outputs()

['softmax_output', 'regression_output']

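Differences between NDArray and Symbol

MXNet's Symbol and NDArray are in fact quite similar (and NDArray is similar to numpy). Both provide operations on multi-dimensional arrays, such as c = a + b. For a detailed comparison see the official documentation; a brief one follows.

NDArray computation is imperative: statements execute one after another, top to bottom, and every argument takes part in the current operation immediately, with the result passed on. Symbol is closer to describing the logic of a program: the computation is declared first and values are assigned to its arguments afterwards. The mechanism resembles regular expressions and SQL (the official site says "Examples in this category include regular expression and SQL"; I did not fully follow this remark).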
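The "declare first, assign values later" idea can be illustrated without MXNet at all. The following is only a toy sketch in plain Python: the classes Var and Op and their methods are hypothetical names made up for this illustration and are not how MXNet implements Symbol.

```python
# Illustrative only: a toy symbolic expression, NOT MXNet's implementation.

class Var:
    """A free variable that receives its value only at evaluation time."""
    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        return Op("add", self, other)

    def __mul__(self, other):
        return Op("mul", self, other)

    def list_arguments(self):
        return [self.name]

    def eval(self, **bindings):
        return bindings[self.name]


class Op(Var):
    """An operation node; building it records the computation without running it."""
    def __init__(self, kind, lhs, rhs):
        self.kind, self.lhs, self.rhs = kind, lhs, rhs

    def list_arguments(self):
        # Collect free variable names in first-seen order, without duplicates.
        names = []
        for child in (self.lhs, self.rhs):
            for name in child.list_arguments():
                if name not in names:
                    names.append(name)
        return names

    def eval(self, **bindings):
        x = self.lhs.eval(**bindings)
        y = self.rhs.eval(**bindings)
        return x + y if self.kind == "add" else x * y


# Imperative style (like NDArray): each line computes a value immediately.
a_val, b_val = 2.0, 3.0
c_val = a_val + b_val * a_val

# Symbolic style (like Symbol): declare the expression, supply values later.
a, b = Var("a"), Var("b")
c = a + b * a                   # nothing is computed here
print(c.list_arguments())       # the free variables of the expression
print(c.eval(a=2.0, b=3.0))     # evaluates to the same value as c_val
```

Because the whole expression exists as data before any value is supplied, a backend is free to inspect or rearrange it before running it, which is where the Symbol advantages listed in this section come from.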

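Advantages of NDArray:

Straightforward
Works easily with language features (such as loops and conditionals) and libraries (such as numpy)
Easy to debug step by step (a gripe: MXNet really is not easy to debug, whereas TensorBoard from TensorFlow next door looks quite good)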

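Advantages of Symbol:

Provides almost all the functionality of NDArray, such as +, *, sin, reshape
Provides neural network operators, such as convolution, activation functions, and BatchNorm (otherwise there would be no reason to use MXNet)
Automatic differentiation
Easy construction and manipulation of complex computations, such as deep neural networks
Easy saving and loading of parameters, and easy visualization
Easy for the backend to optimize computation and memory usage

The next section describes how to combine the two to develop a complete training program; this one stays focused on Symbol.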

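Manipulating Symbols

As noted above, an important difference from NDArray is that with Symbol the computation must be declared first, then given values, and only then evaluated. The following shows how to manipulate a Symbol directly. Note that this part is closely tied to module; skip it if that is not of interest.

Shape Inference

For every symbol, the inputs and outputs can be queried directly. The output shapes can also be inferred from given input shapes, which helps allocate memory more effectively.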

arg_name = c.list_arguments()  # get the names of the inputs
out_name = c.list_outputs()    # get the names of the outputs
arg_shape, out_shape, _ = c.infer_shape(a=(2,3), b=(2,3))  
{'input' : dict(zip(arg_name, arg_shape)), 
 'output' : dict(zip(out_name, out_shape))}

{'input': {'a': (2L, 3L), 'b': (2L, 3L)}, 'output': {'_plus0_output': (2L, 3L)}}
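To make the idea of shape inference concrete, here is a small hand-written sketch in plain Python. The three helper functions are hypothetical and written only for this illustration; they are not MXNet's infer_shape. They simply encode the usual shape rules for an element-wise operator, a 2-D matrix product, and a fully connected layer, assuming the layer stores its weight as (num_hidden, features).

```python
# Illustrative only: hand-written shape rules, NOT MXNet's infer_shape.

def infer_elementwise(lhs, rhs):
    """a + b and a * b need identical shapes; the output has that shape."""
    if lhs != rhs:
        raise ValueError("shape mismatch: %s vs %s" % (lhs, rhs))
    return lhs


def infer_dot(lhs, rhs):
    """2-D matrix product: (m, k) . (k, n) -> (m, n)."""
    if len(lhs) != 2 or len(rhs) != 2 or lhs[1] != rhs[0]:
        raise ValueError("cannot multiply %s by %s" % (lhs, rhs))
    return (lhs[0], rhs[1])


def infer_fully_connected(data, num_hidden):
    """(batch, features) -> (batch, num_hidden), plus parameter shapes.

    Assumes the weight is stored as (num_hidden, features).
    """
    batch, features = data
    return {
        "weight": (num_hidden, features),
        "bias": (num_hidden,),
        "output": (batch, num_hidden),
    }


# Shapes only; no array is ever allocated.
plus_shape = infer_elementwise((2, 3), (2, 3))
dot_shape = infer_dot((2, 3), (3, 4))
fc1 = infer_fully_connected((100, 200), num_hidden=128)
fc2 = infer_fully_connected(fc1["output"], num_hidden=10)
print(plus_shape, dot_shape, fc1["output"], fc2["output"])
```

Knowing every shape from the input shapes alone is what allows memory to be planned before any data is supplied.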

Bind with Data and Evaluate

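Symbol c declares the computation to run; to get a result, it has to be given values. The bind method takes the device context (cpu or gpu) and a dictionary mapping argument names to NDArrays, and returns an executor. The executor provides forward, which runs the computation, and outputs, which holds all the results.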

ex = c.bind(ctx=mx.cpu(), args={'a' : mx.nd.ones([2,3]), 
                                'b' : mx.nd.ones([2,3])})
ex.forward()
# parenthesized so the call works under both Python 2 and Python 3
print('number of outputs = %d\nthe first output = \n%s' % (
           len(ex.outputs), ex.outputs[0].asnumpy()))

number of outputs = 1
the first output = 
[[ 2.  2.  2.]
 [ 2.  2.  2.]]

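Custom Symbols

MXNet lets users define their own operators. This only requires defining the forward and backward computations, plus a few attribute query methods such as list_arguments and infer_shape.
By default, forward and backward receive NDArray arguments. To show MXNet's flexibility, though, the softmax layer here is implemented with numpy. Operators implemented with numpy can only run on the CPU, but numpy offers a rich set of functions, which makes it convenient for demonstration.

First define a subclass of mx.operator.CustomOp and implement forward and backward.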

import numpy as np

class Softmax(mx.operator.CustomOp):
    def forward(self, is_train, req, in_data, out_data, aux):
        x = in_data[0].asnumpy()     # convert NDArray to numpy.ndarray
        # subtract the row-wise max before exp for numerical stability
        y = np.exp(x - x.max(axis=1).reshape((x.shape[0], 1)))
        y /= y.sum(axis=1).reshape((x.shape[0], 1))
        # convert numpy.ndarray back to NDArray; CustomOp.assign writes it into
        # out_data[0] according to req ('null', 'write', 'inplace' or 'add')
        self.assign(out_data[0], req[0], mx.nd.array(y))

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        # np.int was removed in NumPy 1.24; the builtin int is equivalent
        label = in_data[1].asnumpy().ravel().astype(int)
        y = out_data[0].asnumpy()
        # gradient w.r.t. the input: softmax output minus the one-hot label
        y[np.arange(label.shape[0]), label] -= 1.0
        self.assign(in_grad[0], req[0], mx.nd.array(y))
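The backward pass above subtracts 1 from the softmax output at the label position. That is the gradient of the cross-entropy loss with respect to the softmax input. The sketch below, in plain Python with no MXNet, checks this numerically for a single row. All function names are hypothetical helpers written for this illustration, and the logits and label are arbitrary example values.

```python
import math


def softmax(row):
    """Numerically stable softmax of one row (max subtracted before exp)."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(row, label):
    """Loss for one row: negative log-probability of the true class."""
    return -math.log(softmax(row)[label])


def analytic_grad(row, label):
    """Same rule as the backward pass: softmax output minus one-hot label."""
    y = softmax(row)
    y[label] -= 1.0
    return y


def numeric_grad(row, label, eps=1e-6):
    """Central finite differences of the loss with respect to each input."""
    grad = []
    for i in range(len(row)):
        up = list(row)
        down = list(row)
        up[i] += eps
        down[i] -= eps
        diff = cross_entropy(up, label) - cross_entropy(down, label)
        grad.append(diff / (2 * eps))
    return grad


logits = [1.0, 2.0, 0.5]   # arbitrary example input
label = 1                  # arbitrary example class index
ga = analytic_grad(logits, label)
gn = numeric_grad(logits, label)
max_diff = max(abs(p - q) for p, q in zip(ga, gn))
print(max_diff)            # should be tiny if the two gradients agree
```

A finite-difference check like this is a cheap way to validate a hand-written backward before wiring it into a custom operator.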

Next, define a subclass of mx.operator.CustomOpProp, which is used to query the operator's properties.

# register this operator into MXNet by name "softmax"
@mx.operator.register("softmax")
class SoftmaxProp(mx.operator.CustomOpProp):
    def __init__(self):
        # softmax is a loss layer so we don't need gradient input
        # from layers above. 
        super(SoftmaxProp, self).__init__(need_top_grad=False)

    def list_arguments(self):
        return ['data', 'label']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        data_shape = in_shape[0]
        label_shape = (in_shape[0][0],)
        output_shape = in_shape[0]
        return [data_shape, label_shape], [output_shape], []

    def create_operator(self, ctx, shapes, dtypes):
        return Softmax()

Finally, the custom operator can be used. Note that it is referenced by the name under which it was registered above.

net = mx.symbol.Custom(data=prev_input, op_type='softmax')

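The official tutorial covers custom operators only briefly, and at this stage of learning this material is not essential, so it is skipped here.

This time a fair amount of the tutorial was translated, though some method names read oddly when translated into Chinese. My fundamentals are clearly still lacking /(ㄒoㄒ)/~~ Either way, criticism and discussion are welcome.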
