I. Building the network

The MNIST dataset is used for the demonstration.

1. Import the required library and dataset

import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

2. Preprocess the dataset

The pixel data must be converted from integers to floating point.

x_train = tf.cast(x_train, tf.float32)
x_test = tf.cast(x_test, tf.float32)
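The cast above leaves pixel values in the 0 to 255 range. As a hedged aside (this step is not part of the original pipeline), inputs are often also scaled to [0, 1] before being fed to a recurrent network. A minimal NumPy sketch on made-up data, where `fake_images` is only a stand-in for `x_train`:

```python
import numpy as np

# Stand-in for a small batch of MNIST images (uint8, 28x28); not real data.
fake_images = np.random.default_rng(0).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Cast to float32 as in the text, then (assumed extra step) scale to [0, 1].
scaled = fake_images.astype(np.float32) / 255.0

print(scaled.dtype, scaled.shape)
```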

3. Slice, shuffle, and batch the dataset

dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(y_train.shape[0]).batch(64)
dataset_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)).shuffle(y_test.shape[0]).batch(y_test.shape[0])
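To make the batching concrete, the arithmetic below works out how many batches one pass over the training set produces. It assumes only the standard MNIST training split of 60000 images and the batch size of 64 used above.

```python
import math

num_train = 60000   # size of the MNIST training split
batch_size = 64     # value passed to .batch() above

num_batches = math.ceil(num_train / batch_size)
last_batch = num_train - (num_batches - 1) * batch_size

print(num_batches, last_batch)  # 938 32
```

The test set, by contrast, is placed in a single batch because its batch size equals its length.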

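4. Build the classifier

4.1 LSTM block

Each LSTM block consists of one LSTM layer (wrapped in Bidirectional in the code below) and one batch-normalization layer.
If the block is the last LSTM block in the classifier, its LSTM layer should output only the result of the final time step, that is, return_sequences=False.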

class LstmSection(tf.keras.Model):
    def __init__(self, units, final_layer=False):
        super().__init__()
        if final_layer:
            self.lstm = tf.keras.layers.Bidirectional(
                            tf.keras.layers.LSTM(units, activation='relu', return_sequences=False)
                        )
        else:
            self.lstm = tf.keras.layers.Bidirectional(
                            tf.keras.layers.LSTM(units, activation='relu', return_sequences=True)
                        )
        self.bn = tf.keras.layers.BatchNormalization()
        
    def call(self, inputs):
        x = self.lstm(inputs)
        x = self.bn(x)
        
        return x
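The effect of return_sequences on tensor shapes can be sketched without TensorFlow. The helper below is hypothetical (it is not part of the article's code) and assumes the default behaviour of Bidirectional, which concatenates the forward and backward outputs.

```python
def lstm_section_output_shape(batch, timesteps, units, final_layer):
    """Shape produced by one bidirectional LSTM block (hypothetical helper)."""
    features = 2 * units  # forward and backward outputs are concatenated
    if final_layer:
        return (batch, features)            # return_sequences=False
    return (batch, timesteps, features)     # return_sequences=True

# An MNIST image is read as 28 time steps of 28 features each.
print(lstm_section_output_shape(64, 28, 256, final_layer=False))  # (64, 28, 512)
print(lstm_section_output_shape(64, 28, 256, final_layer=True))   # (64, 512)
```

Only the final block drops the time axis, which is what allows the Dense blocks that follow to receive a 2-D input.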

4.2 Dense block

It is built in the same way as the LSTM block: each Dense block consists of one fully connected layer and one batch-normalization layer.

class DenseSection(tf.keras.Model):
    def __init__(self, units):
        super().__init__()
        self.dense = tf.keras.layers.Dense(units, activation='relu')
        self.bn = tf.keras.layers.BatchNormalization()
        
    def call(self, inputs):
        x = self.dense(inputs)
        x = self.bn(x)
        
        return x

4.3 Classifier

class Classifier(tf.keras.Model):
    def __init__(self,
                 num_lstm, lstm_units,
                  num_dense, dense_units):
        super().__init__()
        self.LSTM=[]
        if num_lstm == 1:
            self.LSTM.append(LstmSection(lstm_units[0], final_layer=True))
        else:
            for i in range(num_lstm-1):
                self.LSTM.append(LstmSection(lstm_units[i]))
            self.LSTM.append(LstmSection(lstm_units[num_lstm-1], final_layer=True))
        
        self.DENSE=[]
        for i in range(num_dense):
            self.DENSE.append(DenseSection(dense_units[i]))
        self.DENSE.append(tf.keras.layers.Dense(10, activation='softmax'))
    
    def call(self, inputs):
        x = inputs
        # Iterate over the layer lists directly; this does not rely on the
        # list wrapper exposing a .layers attribute.
        for layer in self.LSTM:
            x = layer(x)
        
        for layer in self.DENSE:
            x = layer(x)
        
        return x

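4.4 Set the parameters

These parameters form the chromosome used by the genetic algorithm. params is a list: params[0] is the number of LSTM layers; params[1] is a list with params[0] elements giving the number of units in each LSTM layer; params[2] is the number of fully connected layers; params[3] is a list with params[2] elements giving the number of units in each fully connected layer. For example:

params = [3, [256, 256, 512], 4, [256, 256, 128, 32]]

The classifier is then built from these values:

classifier = Classifier(params[0], params[1], params[2], params[3])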
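The structural constraint on a chromosome (each units list must be as long as the layer count that precedes it) can be checked with a few lines. `is_valid_chromosome` is a hypothetical helper introduced here for illustration; it is not part of the original code.

```python
def is_valid_chromosome(params):
    """Check the structural constraint described above (hypothetical helper)."""
    if len(params) != 4:
        return False
    num_lstm, lstm_units, num_dense, dense_units = params
    return len(lstm_units) == num_lstm and len(dense_units) == num_dense

print(is_valid_chromosome([3, [256, 256, 512], 4, [256, 256, 128, 32]]))  # True
print(is_valid_chromosome([3, [256, 256], 4, [256, 256, 128, 32]]))       # False
```

A check like this is handy after crossover and mutation, since both operators must preserve the constraint.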

5. Define the loss function

loss_obj_classifier = tf.keras.losses.CategoricalCrossentropy()
def loss_classifier(real, pred):
    l = loss_obj_classifier(real, pred)
    return l
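For intuition, categorical cross-entropy on one-hot labels reduces to the negative log of the probability assigned to the true class. The NumPy sketch below is a simplified re-implementation for illustration only; it omits the numerical clipping that Keras applies internally, and the sample values are made up.

```python
import numpy as np

def categorical_crossentropy(one_hot, probs):
    """Mean over the batch of -sum(one_hot * log(probs))."""
    return float(np.mean(-np.sum(one_hot * np.log(probs), axis=1)))

# One sample with three classes; the true class is index 1.
real = np.array([[0.0, 1.0, 0.0]])
pred = np.array([[0.1, 0.7, 0.2]])

loss = categorical_crossentropy(real, pred)
print(round(loss, 4))  # 0.3567
```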

6. Define the gradient-descent (training step) function

opt_classifier = tf.keras.optimizers.Adam()
def train_step_classifier(x, y):
    with tf.GradientTape() as tape:
        pred = classifier(x)
        real = tf.one_hot(y, depth=10)
        l = loss_classifier(real, pred)
    grad = tape.gradient(l, classifier.trainable_variables)
    opt_classifier.apply_gradients(zip(grad, classifier.trainable_variables))
    return l, tf.cast(tf.argmax(pred, axis=1), dtype=tf.int32), y

7. Training

epochs_classifier = 1
for epoch in range(epochs_classifier):
    for i, (feature, label) in enumerate(dataset_train):
        loss, pred_label, real_label = train_step_classifier(feature, label)
        if (i+1) % 100 == 0:
            print('Epoch {}, batch {}: loss {}'.format(epoch+1, i+1, loss))
    print('Loss after epoch {}: {}'.format(epoch+1, loss))

total_correct = 0
total_num = 0
for feature, label in dataset_test:
    prob = classifier(feature)
    pred = tf.argmax(prob, axis=1)
    pred = tf.cast(pred, tf.int32)
    correct = tf.equal(pred, tf.cast(label, tf.int32))  # MNIST labels load as uint8; tf.equal needs matching dtypes
    correct = tf.reduce_sum(tf.cast(correct, dtype=tf.int32))
    total_correct += int(correct)
    total_num += feature.shape[0]
acc = total_correct / total_num
print('Test accuracy: {}'.format(acc))

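Here, the test-set accuracy serves as the fitness of each chromosome in the genetic algorithm.
The model is now complete. We write the code above into a file named project.py and wrap the data loading and the training process in two functions, called as dataset_train, dataset_test = load() and acc = classify(dataset_train, dataset_test, params).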
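Because every call to classify trains a network, it can help to dry-run the genetic algorithm against a cheap stand-in first. The sketch below mirrors only the two function signatures named above; the bodies are placeholders, and the scoring formula is an arbitrary toy that has nothing to do with real accuracy.

```python
def load():
    """Stand-in for project.load(); returns placeholders instead of tf.data datasets."""
    return "train-placeholder", "test-placeholder"

def classify(dataset_train, dataset_test, params):
    """Stand-in for project.classify(); returns a made-up score in (0, 1].

    The real function would build a Classifier from params, train it,
    and return the test accuracy.
    """
    num_lstm, lstm_units, num_dense, dense_units = params
    total_units = sum(lstm_units) + sum(dense_units)
    return 1.0 / (1.0 + total_units / 1000.0)  # arbitrary toy formula

dataset_train, dataset_test = load()
acc = classify(dataset_train, dataset_test, [2, [64, 64], 1, [32]])
print(acc)
```

Swapping the real project.py back in requires no change to the genetic-algorithm code, since the call signatures are the same.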

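II. Genetic algorithm

For an introduction to the standard genetic algorithm, see my other article, "A detailed explanation of solving a maximization problem with a genetic algorithm (with Python code)".

1. Import the required libraries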

import numpy as np
import project
import copy

2. Set the parameters

DNA_SIZE = 4
POP_SIZE = 20
CROSS_RATE = 0.2
MUTATION_RATE = 0.1
N_GENERATIONS = 40

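DNA_SIZE: the number of genes on each chromosome, as described in section 4.4 of the network-building part.
POP_SIZE: the number of chromosomes in a population.
CROSS_RATE: the crossover rate.
MUTATION_RATE: the mutation rate.
N_GENERATIONS: the number of generations to evolve.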
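These settings determine the cost of the search: each fitness evaluation trains one network, and every chromosome is evaluated once per generation. A quick calculation, assuming no evaluations are skipped or cached:

```python
POP_SIZE = 20
N_GENERATIONS = 40

# Every chromosome in every generation is trained and evaluated once.
total_trainings = POP_SIZE * N_GENERATIONS
print(total_trainings)  # 800
```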

3. Load the data

dataset_train, dataset_test = project.load()

4. Fitness function

The fitness is the classification accuracy on the test set.

def get_fitness(params): 
    return project.classify(dataset_train, dataset_test, params)

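5. Selection function

After the whole population has been evaluated, chromosomes are sampled from the current generation, with replacement and with probability proportional to fitness, to form the next generation.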

def select(pop, fitness):
    new_pop = []
    idx = np.random.choice(np.arange(POP_SIZE), size=POP_SIZE, replace=True, p=fitness / fitness.sum())
    for each in idx:
        new_pop.append(pop[each])
    return new_pop
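The selection step is fitness-proportional (roulette-wheel) sampling. The sketch below reproduces it on four made-up accuracies; the seeded generator is used only so the sketch is repeatable and is not part of the original code.

```python
import numpy as np

fitness = np.array([0.90, 0.95, 0.60, 0.85])  # made-up accuracies
probs = fitness / fitness.sum()               # selection probabilities

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
idx = rng.choice(len(fitness), size=len(fitness), replace=True, p=probs)

print(probs.round(3))
print(idx)
```

Since sampling is done with replacement, a fit chromosome can appear several times in the next generation while a weak one may disappear.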

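6. Crossover function

During evolution, each chromosome has a chance to exchange genes at the same positions with another chromosome. In this project the first two genes are coupled, as are the last two: the number of elements in the second (fourth) gene equals the number stored in the first (third) gene. The first and second genes are therefore always exchanged together, and the same rule applies to the third and fourth genes.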

def crossover(parent, pop):
    if np.random.rand() < CROSS_RATE:
        # Pick the partner chromosome at random.
        chrome_selected = np.random.randint(0, POP_SIZE)
        # One flag per gene pair: (LSTM count, LSTM units) and (dense count, dense units).
        gene_selected = np.random.randint(0, 2, size=DNA_SIZE // 2).astype(bool)
        
        pop_copy = copy.deepcopy(pop)
        parent_cop = copy.deepcopy(parent)
        for i, point in enumerate(gene_selected):
            if point:
                parent_cop[2*i: 2*(i+1)] = pop_copy[chrome_selected][2*i: 2*(i+1)]
        return parent_cop
    
    else:
        return parent
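The pairwise exchange is easier to see with the randomness removed. In this sketch `swap_gene_pairs` is a hypothetical helper that takes the mask as an argument instead of drawing it at random; the two chromosomes are made up.

```python
import copy

def swap_gene_pairs(parent, partner, mask):
    """Copy the selected (count, units) pairs from partner into a copy of parent."""
    child = copy.deepcopy(parent)
    for i, take in enumerate(mask):
        if take:
            child[2 * i: 2 * (i + 1)] = copy.deepcopy(partner[2 * i: 2 * (i + 1)])
    return child

parent = [2, [64, 128], 1, [32]]
partner = [3, [256, 256, 512], 2, [100, 50]]

# mask[0] controls the LSTM pair, mask[1] the dense pair.
child = swap_gene_pairs(parent, partner, mask=[True, False])
print(child)  # [3, [256, 256, 512], 1, [32]]
```

The child takes its LSTM configuration from the partner and keeps the parent's dense configuration, so both length constraints still hold.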

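7. Mutation function

The mutation function has to be adapted in the same way:
If the second or fourth gene (the unit counts) mutates, all of its values are redrawn at random. If the first or third gene (the layer counts) mutates, a new layer count is drawn and compared with the old one: if they are equal, nothing changes; if the new count is larger, random unit counts are appended to the units gene for the new layers; if it is smaller, entries are removed from the end of the units gene until its length matches the new count.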

def mutate(child_ori):
    child = copy.deepcopy(child_ori)
    for point in range(DNA_SIZE):
        if np.random.rand() < MUTATION_RATE:
            if point == 1 or point == 3:
                # Units gene: redraw every unit count, keeping the number of layers.
                child[point] = list(np.random.randint(32, 257, size=len(child[point])))
            elif point == 0:
                # LSTM layer count: same range as the initial population (2 to 5).
                new_num = np.random.randint(2, 6)
                if new_num < child[point]:
                    for _ in range(child[point]-new_num):
                        child[point+1].pop()
                elif new_num > child[point]:
                    for _ in range(new_num-child[point]):
                        child[point+1].append(np.random.randint(32, 257))
                child[point] = new_num
            elif point == 2:
                # Dense layer count: same range as the initial population (1 to 5).
                new_num = np.random.randint(1, 6)
                if new_num < child[point]:
                    for _ in range(child[point]-new_num):
                        child[point+1].pop()
                elif new_num > child[point]:
                    for _ in range(new_num-child[point]):
                        child[point+1].append(np.random.randint(32, 257))
                child[point] = new_num
                
    return child
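The trimming and padding of the units gene is the core of the layer-count mutation. It can be isolated in a small function; `resize_units` is a hypothetical helper, and Python's random module stands in for NumPy here only to keep the sketch dependency-free.

```python
import random

def resize_units(units, new_num, rng):
    """Return a copy of units trimmed or padded to new_num entries (hypothetical helper)."""
    resized = list(units[:new_num])            # drop trailing entries if shrinking
    while len(resized) < new_num:              # append random sizes if growing
        resized.append(rng.randrange(32, 257))
    return resized

rng = random.Random(0)
print(resize_units([64, 128, 256], 2, rng))  # [64, 128]
grown = resize_units([64, 128], 4, rng)
print(len(grown), grown[:2])  # 4 [64, 128]
```

Existing layers keep their unit counts in both directions; only the tail of the list changes.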

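8. Generate the first generation

Assume that the number of LSTM layers is limited to between 2 and 5, the number of fully connected layers to between 1 and 5, and that the number of units in each layer lies in [32, 257).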

pop = []
for i in range(POP_SIZE):
    each_pop = []
    number_c = np.random.randint(2, 6)
    each_pop.append(number_c)
    units_c = np.random.randint(32, 257, size=(number_c, ))
    each_pop.append(list(units_c))

    number_d = np.random.randint(1, 6)
    each_pop.append(number_d)
    units_d = np.random.randint(32, 257, size=(number_d, ))
    each_pop.append(list(units_d))
    pop.append(each_pop)
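The same initialization can be expressed as a function that returns one chromosome, which makes the ranges explicit and easy to test. This is a sketch under the ranges stated above; `random_chromosome` and its keyword arguments are hypothetical names, and the values are converted to plain Python ints so that printed chromosomes are easy to read.

```python
import numpy as np

def random_chromosome(rng, lstm_range=(2, 6), dense_range=(1, 6), unit_range=(32, 257)):
    """Draw one chromosome; all ranges are half-open, as in np.random.randint."""
    num_lstm = int(rng.integers(*lstm_range))
    num_dense = int(rng.integers(*dense_range))
    lstm_units = [int(u) for u in rng.integers(*unit_range, size=num_lstm)]
    dense_units = [int(u) for u in rng.integers(*unit_range, size=num_dense)]
    return [num_lstm, lstm_units, num_dense, dense_units]

rng = np.random.default_rng(0)
population = [random_chromosome(rng) for _ in range(20)]
print(population[0])
```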

9. Optimization

for each_generation in range(N_GENERATIONS):
    fitness = np.zeros([POP_SIZE, ])
    for i in range(POP_SIZE):
        fitness[i] = get_fitness(pop[i])
        print('Generation %d, chromosome %d: fitness %f' % (each_generation+1, i+1, fitness[i]))
        print('Chromosome:', pop[i])
    print("Generation:", each_generation+1, "Most fitted DNA: ", pop[np.argmax(fitness)], "Fitness:", fitness[np.argmax(fitness)])
    pop = select(pop, fitness)
    pop_copy = copy.deepcopy(pop)
    for i, child in enumerate(pop):
        child_cross = crossover(child, pop_copy)
        # Mutate the crossover result; mutating `child` would discard the crossover.
        child_mutate = mutate(child_cross)
        pop[i] = child_mutate
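Selection with replacement tends to fill a generation with duplicate chromosomes, each of which would be retrained from scratch. One possible optimization, not used in the original code, is to cache fitness values by chromosome. The sketch assumes that a single training run is representative of a chromosome (training is stochastic, so this is a trade-off); `toy_fitness` is a placeholder for get_fitness.

```python
def chromosome_key(params):
    """Turn the nested-list chromosome into a hashable tuple."""
    num_lstm, lstm_units, num_dense, dense_units = params
    return (int(num_lstm), tuple(int(u) for u in lstm_units),
            int(num_dense), tuple(int(u) for u in dense_units))

fitness_cache = {}
stats = {"calls": 0}

def toy_fitness(params):
    """Placeholder for get_fitness(); counts how often it is really called."""
    stats["calls"] += 1
    return 0.5

def cached_fitness(params):
    """Evaluate a chromosome only the first time it is seen."""
    key = chromosome_key(params)
    if key not in fitness_cache:
        fitness_cache[key] = toy_fitness(params)
    return fitness_cache[key]

pop = [[2, [64, 64], 1, [32]], [2, [64, 64], 1, [32]], [3, [64, 64, 64], 1, [32]]]
scores = [cached_fitness(p) for p in pop]
print(stats["calls"])  # 2
```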

[Note] The code above uses copy.deepcopy() in several places; for details on what it does, see: "Six ways to copy a list in Python".
