MATLAB Neural Networks (2): Practice in R

1. AMORE

1.1 newff

newff(n.neurons, learning.rate.global, momentum.global, error.criterium, Stao, hidden.layer, output.layer, method)

  • n.neurons: numeric vector containing the number of neurons in each layer. The first element is the number of input neurons, the last is the number of output neurons, and the remaining elements are the numbers of hidden-layer neurons;
  • learning.rate.global: learning rate at which every neuron is trained;
  • momentum.global: momentum for every neuron; only a few training methods need this parameter;
  • error.criterium: criterion used to measure how close the network's predictions are to the target values; one of the following (see the formulas after this list):

    "LMS":Least Mean Squares

    "LMLS":Least Mean Logarithm Squared

    "TAO":TAO Error

  • Stao: the Stao parameter when the previous item is "TAO"; unused for the other error criteria;
  • hidden.layer: activation function of the hidden-layer neurons; one of "purelin", "tansig", "sigmoid", "hardlim", or "custom", where "custom" means the user must define the neuron's f0 and f1 elements;
  • output.layer: activation function of the output-layer neurons;
  • method: preferred training method, one of:

    "ADAPTgd": adaptive gradient descent
    "ADAPTgdwm": adaptive gradient descent with momentum
    "BATCHgd": batch gradient descent
    "BATCHgdwm": batch gradient descent with momentum

1.2 train

train(net, P, T, Pval=NULL, Tval=NULL, error.criterium="LMS", report=TRUE, n.shows, show.step, Stao=NA, prob=NULL, n.threads=0L)

  • net: the neural network to be trained;
  • P: training-set inputs;
  • T: training-set targets;
  • Pval: validation-set inputs;
  • Tval: validation-set targets;
  • error.criterium: criterion used to measure the goodness of fit;
  • Stao: initial value of the Stao parameter for the TAO algorithm;
  • report: logical; whether to print training-progress information;
  • n.shows: total number of progress reports printed when the previous item is TRUE;
  • show.step: number of iterations the training runs before the function is allowed to print a report;
  • prob: vector of probabilities with which each sample is used when training with resampling;
  • n.threads: number of threads to use for the BATCH* training methods; if smaller than 1, it spawns NumberProcessors-1 threads, where NumberProcessors is the number of processors; ignored if no OpenMP library is found.

1.3 example

library(AMORE)
# P is the training input: a 1000 x 1 matrix of values in [-1, 1]
P <- matrix(sample(seq(-1,1,length=1000), 1000, replace=FALSE), ncol=1) 
# The network will try to approximate the target P^2
target <- P^2                                   
# We create a feedforward network, with two hidden layers.
# The first hidden layer has three neurons and the second has two neurons.
# The hidden layers have got Tansig activation functions and the output layer is Purelin.
net <- newff(n.neurons=c(1,3,2,1), learning.rate.global=1e-2, momentum.global=0.5,
        error.criterium="LMS", Stao=NA, hidden.layer="tansig", 
        output.layer="purelin", method="ADAPTgdwm")
result <- train(net, P, target, error.criterium="LMS", report=TRUE, show.step=100, n.shows=5 )
P_test <- matrix(sample(seq(-1,1,length=1000), 100, replace=FALSE), ncol=1)
target_test <- P_test^2
y_test <- sim(result$net, P_test)
plot(P_test,y_test-target_test)

index.show: 1 LMS 0.0893172434474773
index.show: 2 LMS 0.0892277761187557
index.show: 3 LMS 0.000380711026069436
index.show: 4 LMS 0.000155618390342181
index.show: 5 LMS 9.53881309223154e-05
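As a quick numeric check to go with the residual plot (a minimal addition using the objects created above; the exact value varies with the random draw):

# Mean squared error on the 100 held-out points
mean((y_test - target_test)^2)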

1.4 exercise

# Exercise: approximate the mean of the squared inputs with a 2-20-1 network
P <- matrix(rnorm(2000), ncol=2)
target <- apply(P^2, 1, mean)
net <- newff(n.neurons=c(2,20,1),
             learning.rate.global=1e-2,
             momentum.global=0.5,
             error.criterium="LMS", 
             Stao=NA, 
             hidden.layer="tansig",
             output.layer="purelin", 
             method="ADAPTgdwm")
result <- train(net, P, target, 
                error.criterium="LMS", 
                report=TRUE, 
                show.step=100, 
                n.shows=10)
# Evaluate on 100 fresh samples
P_test <- matrix(rnorm(200), ncol=2) 
target_test <- apply(P_test^2, 1, mean)
y_test <- sim(result$net, P_test)
plot(1:100, y_test - target_test)
# Mean absolute relative error on the test set
(mean(abs(y_test - target_test) / target_test))

index.show: 1 LMS 0.00334168377394762
index.show: 2 LMS 0.00234274830312042
index.show: 3 LMS 0.00153074744985914
index.show: 4 LMS 0.00107702325684643
index.show: 5 LMS 0.000808644341077497
index.show: 6 LMS 0.000618371026226565
index.show: 7 LMS 0.000464896136767987
index.show: 8 LMS 0.000367196186841271
index.show: 9 LMS 0.00030261519286547
index.show: 10 LMS 0.000256892157194439
[1] 0.02934992

2. RSNNS

2.1 mlp

mlp(x, y, size = c(5), maxit = 100, initFunc = "Randomize_Weights",
    initFuncParams = c(-0.3, 0.3), learnFunc = "Std_Backpropagation",
    learnFuncParams = c(0.2, 0), updateFunc = "Topological_Order",
    updateFuncParams = c(0), hiddenActFunc = "Act_Logistic",
    shufflePatterns = TRUE, linOut = FALSE,
    outputActFunc = if (linOut) "Act_Identity" else "Act_Logistic",
    inputsTest = NULL, targetsTest = NULL,
    pruneFunc = NULL, pruneFuncParams = NULL, ...)

  • x: a matrix of training data used as input to the network;
  • y: the corresponding target values;
  • size: number of units in the hidden layer(s); defaults to c(5); giving several values creates several hidden layers;
  • maxit: maximum number of iterations of the learning process; defaults to 100;
  • initFunc: initialization function to use; defaults to Randomize_Weights, i.e., random weights;
  • initFuncParams: parameters for the initialization function; default c(-0.3, 0.3);
  • learnFunc: learning function to use; defaults to Std_Backpropagation;
  • learnFuncParams: parameters for the learning function; default c(0.2, 0);
  • updateFunc: update function to use; defaults to Topological_Order;
  • updateFuncParams: parameters for the update function; default c(0);
  • hiddenActFunc: activation function of all hidden-layer units; defaults to Act_Logistic;
  • shufflePatterns: whether to shuffle the patterns; defaults to TRUE;
  • linOut: whether the output units use a linear activation function instead of a logistic one; defaults to FALSE;
  • inputsTest: a matrix of test data used as input to the network; defaults to NULL;
  • targetsTest: target values corresponding to the test inputs; defaults to NULL;
  • pruneFunc: pruning function to use; defaults to NULL;
  • pruneFuncParams: parameters for the pruning function; defaults to NULL.

2.2 example

Data preparation

library(RSNNS)
data(iris)

# shuffle the rows of the data set
iris <- iris[sample(1:nrow(iris),nrow(iris)),]

irisValues <- iris[,1:4]
irisTargets <- decodeClassLabels(iris[,5])
#irisTargets <- decodeClassLabels(iris[,5], valTrue=0.9, valFalse=0.1)
  • decodeClassLabels: decodes class labels from a numerical vector or factor into a binary (one-hot) matrix.
> head(irisTargets)
     setosa versicolor virginica
[1,]      1          0         0
[2,]      0          1         0
[3,]      0          0         1
[4,]      1          0         0
[5,]      0          0         1
[6,]      0          0         1  
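The reverse mapping, from the binary matrix back to integer class labels, is provided by encodeClassLabels, which reappears in the stricter test at the end of this section (a minimal sketch using the function's default winner-takes-all settings):

# Back from the 0/1 matrix to integer labels 1..3
head(encodeClassLabels(irisTargets))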

Sampling

iris <- splitForTrainingAndTest(irisValues, irisTargets, ratio=0.15)
iris <- normTrainingAndTestSet(iris)
  • splitForTrainingAndTest: splits the input and target values into a training and a test set. The test set is taken from the end of the data; if the data are to be shuffled, do so before calling this function.
  • normTrainingAndTestSet: normalizes the training and test set produced by splitForTrainingAndTest as follows: the inputsTrain member is normalized with normalizeData using the parameters given in type, and the normalization parameters obtained there are reused to normalize the inputsTest member. If dontNormTargets is not set, the targets are normalized in the same way. Normalizing the targets usually makes no sense for classification but is often a good idea for regression; the default is not to normalize target values.

> attr(iris$inputsTrain,"normParams")$type
[1] "norm"
> attr(iris$inputsTest,"normParams")$type
[1] "norm"

Training & testing

model <- mlp(iris$inputsTrain, iris$targetsTrain, 
             size=5, 
             learnFuncParams=c(0.1),
             maxit=50, 
             inputsTest=iris$inputsTest,
             targetsTest=iris$targetsTest)

summary(model)

SNNS network definition file V1.4-3D
generated at Tue Feb 18 16:06:54 2020

network name : RSNNS_untitled
source files :
no. of units : 12 (4+5+3 = 12 neurons)
no. of connections : 35 (4×5+5×3 = 35 weights)
no. of unit types : 0
no. of site types : 0


learning function : Std_Backpropagation
update function : Topological_Order


unit default section :

act | bias | st | subnet | layer | act func | out func
---------|----------|----|--------|-------|--------------|-------------
0.00000 | 0.00000 | i | 0 | 1 | Act_Logistic | Out_Identity
---------|----------|----|--------|-------|--------------|-------------


unit definition section : (lists the bias and other unit parameters)

no. | typeName | unitName | act | bias | st | position | act func | out func | sites
----|----------|-------------------|----------|----------|----|----------|--------------|----------|-------
1 | | Input_1 | 0.30243 | 0.27615 | i | 1,0,0 | Act_Identity | |
2 | | Input_2 | -0.55103 | 0.28670 | i | 2,0,0 | Act_Identity | |
3 | | Input_3 | 0.51752 | 0.05927 | i | 3,0,0 | Act_Identity | |
4 | | Input_4 | -0.00309 | 0.27664 | i | 4,0,0 | Act_Identity | |
5 | | Hidden_2_1 | 0.76338 | 1.91173 | h | 1,2,0 |||
6 | | Hidden_2_2 | 0.20872 | -2.48788 | h | 2,2,0 |||
7 | | Hidden_2_3 | 0.19751 | -0.35229 | h | 3,2,0 |||
8 | | Hidden_2_4 | 0.03343 | -1.45754 | h | 4,2,0 |||
9 | | Hidden_2_5 | 0.89784 | 0.82294 | h | 5,2,0 |||
10 | | Output_setosa | 0.05013 | -0.63914 | o | 1,4,0 |||
11 | | Output_versicolor | 0.82301 | -0.62233 | o | 2,4,0 |||
12 | | Output_virginica | 0.11542 | -1.02579 | o | 3,4,0 |||
----|----------|-------------------|----------|----------|----|----------|--------------|----------|-------


connection definition section : (lists the weight matrix)

target | site | source:weight
-------|------|---------------------------------------------------------------------------------------------------------------------
5 | | 4:-2.27555, 3:-1.59991, 2: 0.00019, 1: 0.26665
6 | | 4: 2.54355, 3: 1.83954, 2:-0.49918, 1:-0.21152
7 | | 4:-1.29117, 3:-0.90530, 2: 0.95187, 1:-0.20054
8 | | 4:-1.49130, 3:-1.54012, 2: 1.55497, 1:-0.85142
9 | | 4: 1.35965, 3: 1.07292, 2:-1.20123, 1: 0.45480
10 | | 9:-2.51547, 8: 2.26887, 7: 1.42713, 6:-2.24000, 5: 0.08607
11 | | 9: 1.31081, 8:-3.15358, 7:-1.09917, 6:-3.07740, 5: 2.55070
12 | | 9: 1.16730, 8:-2.03073, 7:-2.15568, 6: 2.90927, 5:-2.84570
-------|------|---------------------------------------------------------------------------------------------------------------------

> model

Class: mlp->rsnns
Number of inputs: 4
Number of outputs: 3
Maximal iterations: 50
Initialization function: Randomize_Weights
Initialization function parameters: -0.3 0.3
Learning function: Std_Backpropagation
Learning function parameters: 0.1
Update function: Topological_Order
Update function parameters: 0
Patterns are shuffled internally: TRUE
Compute error in every iteration: TRUE
Architecture Parameters:
$`size`
[1] 5

All members of model:
[1] "nInputs" "maxit"
[3] "initFunc" "initFuncParams"
[5] "learnFunc" "learnFuncParams"
[7] "updateFunc" "updateFuncParams"
[9] "shufflePatterns" "computeIterativeError"
[11] "snnsObject" "archParams"
[13] "IterativeFitError" "IterativeTestError"
[15] "fitted.values" "fittedTestValues"
[17] "nOutputs"
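The iterative errors listed among the model members can also be inspected directly (a small sketch; IterativeTestError is available here because inputsTest was passed to mlp):

# SSE on the training set over the last few iterations
tail(model$IterativeFitError)
# Corresponding SSE on the test set
tail(model$IterativeTestError)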

weightMatrix(model)
#extractNetInfo(model)
  Hidden_2_1 Hidden_2_2 Hidden_2_3 Hidden_2_4 Hidden_2_5 Output_setosa Output_versicolor Output_virginica
Input_1 0.2666532695 -0.2115191 -0.2005364 -0.8514175 0.4547978 0.00000000 0.000000 0.000000
Input_2 0.0001948424 -0.4991824 0.9518659 1.5549698 -1.2012329 0.00000000 0.000000 0.000000
Input_3 -1.5999110937 1.8395387 -0.9053020 -1.5401160 1.0729166 0.00000000 0.000000 0.000000
Input_4 -2.2755548954 2.5435514 -1.2911705 -1.4912956 1.3596520 0.00000000 0.000000 0.000000
Hidden_2_1 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 0.08606748 2.550698 -2.845704
Hidden_2_2 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 -2.24000168 -3.077400 2.909271
Hidden_2_3 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 1.42712903 -1.099168 -2.155685
Hidden_2_4 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 2.26887059 -3.153584 -2.030726
Hidden_2_5 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 -2.51547217 1.310814 1.167299
Output_setosa 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 0.00000000 0.000000 0.000000
Output_versicolor 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 0.00000000 0.000000 0.000000
Output_virginica 0.0000000000 0.0000000 0.0000000 0.0000000 0.0000000 0.00000000 0.000000 0.000000

Result analysis

Plot how the sum of squared errors (SSE) changes as the number of iterations increases.

plotIterativeError(model)

In the scatter plot of true versus predicted values, the black line is $y=x$ and the red line is a linear regression fit.

predictions <- predict(model,iris$inputsTest)

plotRegressionError(predictions[,2], iris$targetsTest[,2])

Print the confusion matrices and plot the ROC curves.

> confusionMatrix(iris$targetsTrain,fitted.values(model))
       predictions
targets  1  2  3
      1 42  0  0
      2  0 38  2
      3  0  1 44
> confusionMatrix(iris$targetsTest,predictions)
       predictions
targets 1 2 3
      1 8 0 0
      2 0 9 1
      3 0 0 5
par(mfrow=c(1,2))
plotROC(fitted.values(model)[,2], iris$targetsTrain[,2])
plotROC(predictions[,2], iris$targetsTest[,2])

A stricter test: if one of the three outputs is greater than 0.6 and the other two are less than 0.4, the sample is assigned to that class; otherwise it is labeled 0, meaning it cannot be decided.

> confusionMatrix(iris$targetsTrain, 
+                 encodeClassLabels(fitted.values(model),
+                                   method="402040", l=0.4, h=0.6))
       predictions
targets  0  1  2  3
      1  0 42  0  0
      2  4  0 36  0
      3  3  0  0 42
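For comparison, encodeClassLabels also supports its default winner-takes-all scheme (a brief sketch; with l=0 and h=0 each sample is simply assigned to the class with the highest activation):

confusionMatrix(iris$targetsTrain,
                encodeClassLabels(fitted.values(model),
                                  method="WTA", l=0, h=0))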

 
