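Neural Network Study Notes 40: Spring Festival Is Here, So Why Not Write Classical Poetry with an LSTM?
Preface
For whatever reason, I just feel like writing some classical poetry. Who isn't a bit of a "Zaun poet" at heart?
Overall approach
An LSTM can extract features from an input sequence and produce a prediction from them.
Today we will try using an LSTM to compose five-character poems (五言詩).
Prediction works like this:
The next character is predicted from the previous six characters (punctuation counts as a character).
“寒隨窮律變,” is used to predict “春”.
“隨窮律變,春” is used to predict “逐”.
An LSTM built this way can slide forward one character at a time and compose a whole poem step by step.
That is: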
- 寒隨窮律變, -> 春
- 隨窮律變,春 -> 逐
- 窮律變,春逐 -> 鳥
- 律變,春逐鳥 -> 聲
- 變,春逐鳥聲 -> 開
- ,春逐鳥聲開 -> 。
- ……
The final poem:
寒隨窮律變,春逐鳥聲開。初風飄帶柳,晚雪間花梅。碧林青舊竹,綠沼翠新苔。芝田初雁去,綺樹巧鶯來。
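The sliding window described above can be sketched in a few lines of plain Python. This is only an illustration on raw characters: the function name sliding_pairs is made up for this sketch, and the real pipeline works on character ids rather than on the characters themselves.

```python
def sliding_pairs(text, window=6):
    """Return (context, next_char) pairs, moving one character at a time."""
    pairs = []
    for start in range(len(text) - window):
        context = text[start:start + window]   # six consecutive characters
        target = text[start + window]          # the character that follows them
        pairs.append((context, target))
    return pairs


line = "寒隨窮律變,春逐鳥聲開。"
for context, target in sliding_pairs(line):
    print(context, "->", target)
```

For this 12-character couplet the loop yields six pairs, from “寒隨窮律變,” -> “春” up to “,春逐鳥聲開” -> “。”.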
GitHub download link and Bilibili link
Download link
https://github.com/bubbliiiing/poems-generator
Bilibili video link
https://www.bilibili.com/video/av85400912
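Code implementation
1. Data processing
a. Reading the poems and converting them to ids
The five-character poems are read from the txt file that stores the poems.
For each line read in, the character at index 5 of the text after the ':' is checked: if it is ',', the line is treated as a five-character poem.
All characters of the kept poems are then collected and counted, and the counts are sorted from high to low.
The most frequent characters (at most max_words of them) form the vocabulary; any character outside it is replaced with a space.
A character-to-id mapping and an id-to-character mapping are built.
Every poem is then converted to ids using the character-to-id mapping.
At this point each poem is represented by the ids of its characters, followed by the id of the end marker, for example: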
義髻拋河裏,黃裙逐水流。
[835, 2197, 1604, 210, 51, 0, 172, 2135, 406, 16, 78, 1, 2]
The implementation is as follows:
def load(poetry_file):
    def handle(line):
        # Append the end-of-poem marker
        return line + END_CHAR
    # Drop spaces and keep the text after ':' on every line
    poetrys = [line.strip().replace(' ', '').split(':')[1] for line in
               open(poetry_file, encoding='utf-8')]
    collect = []
    for poetry in poetrys:
        if len(poetry) <= 5 :
            continue
        # A comma at index 5 means the first line has five characters
        if poetry[5]==",":
            collect.append(handle(poetry))
    print(len(collect))
    poetrys = collect
    # Collect every character
    words = []
    for poetry in poetrys:
        words += [word for word in poetry]
    counter = collections.Counter(words)
    count_pairs = sorted(counter.items(), key=lambda x: -x[1])
    # All characters, sorted by frequency from high to low
    words, _ = zip(*count_pairs)
    # Keep the most frequent characters as the vocabulary; characters outside it are replaced by UNKNOWN_CHAR (a space)
    words_size = min(max_words, len(words))
    words = words[:words_size] + (UNKNOWN_CHAR,)
    # Total vocabulary size
    words_size = len(words)
    # Map characters to ids (the ids are expanded to one-hot vectors later)
    char2id_dict = {w: i for i, w in enumerate(words)}
    id2char_dict = {i: w for i, w in enumerate(words)}
    unknow_char = char2id_dict[UNKNOWN_CHAR]
    char2id = lambda char: char2id_dict.get(char, unknow_char)
    poetrys = sorted(poetrys, key=lambda line: len(line))
    # Convert every poem in the training set to a list of character ids
    poetrys_vector = [list(map(char2id, poetry)) for poetry in poetrys]
    return np.array(poetrys_vector),char2id_dict,id2char_dict
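To make the vocabulary step easier to follow, here is a toy version that runs on two tiny made-up "poems". The helper names build_vocab and encode are assumptions of this sketch and are not part of the repository; the idea (count, sort by frequency, truncate, append the unknown character, map unseen characters to it) follows the description above.

```python
import collections


def build_vocab(poems, max_words, unknown_char=" "):
    """Build char->id and id->char maps from the most frequent characters."""
    counter = collections.Counter(ch for poem in poems for ch in poem)
    # most_common() returns characters ordered by count, highest first
    words = [ch for ch, _ in counter.most_common(max_words)]
    words.append(unknown_char)  # the last id is reserved for unknown characters
    char2id = {ch: i for i, ch in enumerate(words)}
    id2char = {i: ch for i, ch in enumerate(words)}
    return char2id, id2char


def encode(poem, char2id, unknown_char=" "):
    """Convert a string to ids, sending unseen characters to the unknown id."""
    unk = char2id[unknown_char]
    return [char2id.get(ch, unk) for ch in poem]


# Toy data: 春 appears 4 times, 風 3 times, ',' twice, 雨 once
poems = ["春春春風,", "春風風雨,"]
char2id, id2char = build_vocab(poems, max_words=3)
print(char2id)
print(encode("春雨,", char2id))
```

With max_words=3 the character 雨 falls outside the vocabulary, so it is encoded with the id of the space character.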
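b. Converting the poems to the 6to1 form
get_6to1 converts a poem into the 6to1 form.
The x_data passed in is a single five-character poem (already converted to ids), for example: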
寒隨窮律變,春逐鳥聲開。初風飄帶柳,晚雪間花梅。碧林青舊竹,綠沼翠新苔。芝田初雁去,綺樹巧鶯來。
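The returned inputs is the collection of six-character windows, and the returned targets is the collection of single characters, each one to be predicted from the corresponding window.
def get_6to1(x_data,char2id_dict):
    inputs = []
    targets = []
    for index in range(len(x_data)):
        # Six consecutive ids form the input, the id that follows is the target
        x = x_data[index:(index+unit_sentence)]
        y = x_data[index+unit_sentence]
        # Stop once the window or the target reaches the end-of-poem marker
        if (char2id_dict[END_CHAR] in x) or y == char2id_dict[END_CHAR]:
            return np.array(inputs),np.array(targets)
        else:
            inputs.append(x)
            targets.append(y)
    return np.array(inputs),np.array(targets)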
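The id windows still have to be turned into one-hot arrays before they reach the network. Below is a minimal NumPy sketch of that step; the function name one_hot_pairs and the tiny vocabulary size are assumptions made for illustration, and it uses vectorised indexing in place of explicit loops.

```python
import numpy as np


def one_hot_pairs(inputs, targets, vocab_size):
    """One-hot encode id windows (n, window) and target ids (n,)."""
    inputs = np.asarray(inputs)
    targets = np.asarray(targets)
    n, window = inputs.shape
    x = np.zeros((n, window, vocab_size), dtype=np.float32)
    y = np.zeros((n, vocab_size), dtype=np.float32)
    # Set x[i, t, inputs[i, t]] = 1 for every sample i and time step t
    x[np.arange(n)[:, None], np.arange(window)[None, :], inputs] = 1.0
    # Set y[i, targets[i]] = 1 for every sample i
    y[np.arange(n), targets] = 1.0
    return x, y


windows = [[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 6]]
next_ids = [6, 7]
x, y = one_hot_pairs(windows, next_ids, vocab_size=8)
print(x.shape, y.shape)
```

Each sample becomes a (6, vocab_size) matrix with exactly one 1 per row, and each target becomes a vocab_size vector with a single 1.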
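2. Building the neural network
Building the network is very simple and takes only the few lines below.
The input dimension at each time step must be set to words_size, the number of distinct characters in the vocabulary, because every character is fed in as a one-hot vector.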
#-------------------------------#
# Build the neural network
#-------------------------------#
inputs = Input(shape=(None,words_size))
x = CuDNNLSTM(UNITS,return_sequences=True)(inputs)
x = Dropout(0.6)(x)
x = CuDNNLSTM(UNITS)(x)
x = Dropout(0.6)(x)
x = Dense(words_size, activation='softmax')(x)
model = Model(inputs,x)
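3. Poem prediction
The first six characters of a randomly chosen poem serve as the seed. The model then predicts forward one character at a time until the whole poem has been generated.
def predict_from_nothing(epoch,x_data,char2id_dict,id2char_dict,model):
    # During training, print a sample poem every epoch to show how learning is going
    print("\n#-----------------------Epoch {}-----------------------#".format(epoch))
    words_size = len(id2char_dict)
    # Seed: the first six characters of a randomly chosen poem
    index = np.random.randint(0, len(x_data))
    sentence = x_data[index][:unit_sentence]
    def _pred(text):
        # One-hot encode the last six characters
        temp = text[-unit_sentence:]
        x_pred = np.zeros((1, unit_sentence, words_size))
        for t, index in enumerate(temp):
            x_pred[0, t, index] = 1.
        preds = model.predict(x_pred)[0]
        # Sample the next character from the predicted distribution
        choice_id = np.random.choice(range(len(preds)),1,p=preds)
        # If the unknown character (a space) was drawn, redraw uniformly until a real character comes up
        if id2char_dict[choice_id[0]] == ' ':
            while id2char_dict[choice_id[0]] in [',','。',' ']:
                choice_id = np.random.randint(0,len(char2id_dict),1)
        return choice_id
    # 24 characters in total: four lines of five characters plus punctuation
    for i in range(24-unit_sentence):
        pred = _pred(sentence)
        sentence = np.append(sentence,pred)
    output = ""
    for i in range(len(sentence)):
        output = output + id2char_dict[sentence[i]]
    print(output)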
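The generation loop itself does not depend on Keras, so it can be tried out with a stand-in for the model. In the sketch below, generate and fake_predict are hypothetical names: fake_predict is a dummy "model" that always predicts (last id + 1) modulo the vocabulary size, and the fallback for unknown characters is left out to keep the example short.

```python
import numpy as np


def generate(seed_ids, predict_proba, total_len, window=6, rng=None):
    """Extend seed_ids one id at a time until total_len ids are available."""
    if rng is None:
        rng = np.random.default_rng()
    ids = list(seed_ids)
    while len(ids) < total_len:
        # Only the last `window` ids are shown to the model
        probs = predict_proba(ids[-window:])
        # Sample the next id according to the predicted distribution
        next_id = int(rng.choice(len(probs), p=probs))
        ids.append(next_id)
    return ids


def fake_predict(context, vocab_size=5):
    """Hypothetical stand-in for model.predict: all mass on (last id + 1) % vocab_size."""
    probs = np.zeros(vocab_size)
    probs[(context[-1] + 1) % vocab_size] = 1.0
    return probs


print(generate([0, 1, 2, 3, 4, 0], fake_predict, total_len=10))
```

Because the dummy distribution puts all probability on one id, the output is deterministic here; with a trained model the sampling step is what makes each generated poem different.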
Full code
The code is split across the following files:
1. poem_keras.py
import numpy as np
from keras.callbacks import TensorBoard, ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from keras.layers import CuDNNLSTM,Dense,Input,Softmax,Convolution1D,Embedding,Dropout
from keras.callbacks import LambdaCallback
from keras.optimizers import Adam
from keras.models import Model
from utils import load,get_batch,predict_from_nothing,predict_from_head
UNITS = 256
batch_size = 64
epochs = 50
poetry_file = 'poetry.txt'
# Load the data
x_data,char2id_dict,id2char_dict = load(poetry_file)
max_length = max([len(txt) for txt in x_data])
words_size = len(char2id_dict)
#-------------------------------#
# Build the neural network
#-------------------------------#
inputs = Input(shape=(None,words_size))
x = CuDNNLSTM(UNITS,return_sequences=True)(inputs)
x = Dropout(0.6)(x)
x = CuDNNLSTM(UNITS)(x)
x = Dropout(0.6)(x)
x = Dense(words_size, activation='softmax')(x)
model = Model(inputs,x)
#-------------------------------#
# Split into training and validation sets
#-------------------------------#
val_split = 0.1
np.random.seed(10101)
np.random.shuffle(x_data)
np.random.seed(None)
num_val = int(len(x_data)*val_split)
num_train = len(x_data) - num_val
#-------------------------------#
# Set up checkpoint saving (weights are written to logs/ after every epoch)
#-------------------------------#
checkpoint = ModelCheckpoint('logs/loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
    monitor='val_loss', save_weights_only=True, save_best_only=False, period=1)
#-------------------------------#
# Stage 1: train with a learning rate of 1e-3
#-------------------------------#
model.compile(optimizer=Adam(1e-3), loss='categorical_crossentropy',
              metrics=['accuracy'])
for i in range(epochs):
    # Print a sample poem before each epoch
    predict_from_nothing(i,x_data,char2id_dict,id2char_dict,model)
    # Validate on the held-out poems x_data[num_train:], not on the training poems
    model.fit_generator(get_batch(batch_size, x_data[:num_train], char2id_dict, id2char_dict),
            steps_per_epoch=max(1, num_train//batch_size),
            validation_data=get_batch(batch_size, x_data[num_train:], char2id_dict, id2char_dict),
            validation_steps=max(1, num_val//batch_size),
            epochs=1,
            initial_epoch=0,
            callbacks=[checkpoint])
#-------------------------------#
# Stage 2: lower the learning rate to 1e-4 and keep training
#-------------------------------#
model.compile(optimizer=Adam(1e-4), loss='categorical_crossentropy',
              metrics=['accuracy'])
for i in range(epochs):
    predict_from_nothing(i,x_data,char2id_dict,id2char_dict,model)
    model.fit_generator(get_batch(batch_size, x_data[:num_train], char2id_dict, id2char_dict),
            steps_per_epoch=max(1, num_train//batch_size),
            validation_data=get_batch(batch_size, x_data[num_train:], char2id_dict, id2char_dict),
            validation_steps=max(1, num_val//batch_size),
            epochs=1,
            initial_epoch=0,
            callbacks=[checkpoint])
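The split section of the script fixes a seed, shuffles the poems, and computes num_val and num_train. As a hedged illustration of how those two counts can carve the shuffled data into two disjoint parts, here is a small standalone sketch; split_train_val is a name invented for this example, and it shuffles indices with a local RandomState rather than shuffling the data in place.

```python
import numpy as np


def split_train_val(data, val_split=0.1, seed=10101):
    """Shuffle with a fixed seed and return disjoint (train, val) lists."""
    data = list(data)
    rng = np.random.RandomState(seed)      # local generator, global state untouched
    order = rng.permutation(len(data))     # shuffled indices
    num_val = int(len(data) * val_split)
    num_train = len(data) - num_val
    train = [data[i] for i in order[:num_train]]   # first num_train items
    val = [data[i] for i in order[num_train:]]     # remaining num_val items
    return train, val


train, val = split_train_val(range(20))
print(len(train), len(val))
```

Slicing with [:num_train] and [num_train:] guarantees that no poem ends up in both parts.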
2. utils.py
import numpy as np
import collections
END_CHAR = '\n'
UNKNOWN_CHAR = ' '
unit_sentence = 6
max_words = 3000
MIN_LENGTH = 10
def load(poetry_file):
    def handle(line):
        # Append the end-of-poem marker
        return line + END_CHAR
    # Drop spaces and keep the text after ':' on every line
    poetrys = [line.strip().replace(' ', '').split(':')[1] for line in
               open(poetry_file, encoding='utf-8')]
    collect = []
    for poetry in poetrys:
        if len(poetry) <= 5 :
            continue
        # A comma at index 5 means the first line has five characters
        if poetry[5]==",":
            collect.append(handle(poetry))
    print(len(collect))
    poetrys = collect
    # Collect every character
    words = []
    for poetry in poetrys:
        words += [word for word in poetry]
    counter = collections.Counter(words)
    count_pairs = sorted(counter.items(), key=lambda x: -x[1])
    # All characters, sorted by frequency from high to low
    words, _ = zip(*count_pairs)
    # Keep the most frequent characters as the vocabulary; characters outside it are replaced by UNKNOWN_CHAR (a space)
    words_size = min(max_words, len(words))
    words = words[:words_size] + (UNKNOWN_CHAR,)
    # Total vocabulary size
    words_size = len(words)
    # Map characters to ids (the ids are expanded to one-hot vectors later)
    char2id_dict = {w: i for i, w in enumerate(words)}
    id2char_dict = {i: w for i, w in enumerate(words)}
    unknow_char = char2id_dict[UNKNOWN_CHAR]
    char2id = lambda char: char2id_dict.get(char, unknow_char)
    poetrys = sorted(poetrys, key=lambda line: len(line))
    # Convert every poem in the training set to a list of character ids
    poetrys_vector = [list(map(char2id, poetry)) for poetry in poetrys]
    return np.array(poetrys_vector),char2id_dict,id2char_dict
def get_6to1(x_data,char2id_dict):
    inputs = []
    targets = []
    for index in range(len(x_data)):
        # Six consecutive ids form the input, the id that follows is the target
        x = x_data[index:(index+unit_sentence)]
        y = x_data[index+unit_sentence]
        # Stop once the window or the target reaches the end-of-poem marker
        if (char2id_dict[END_CHAR] in x) or y == char2id_dict[END_CHAR]:
            return np.array(inputs),np.array(targets)
        else:
            inputs.append(x)
            targets.append(y)
    return np.array(inputs),np.array(targets)
def get_batch(batch_size,x_data,char2id_dict,id2char_dict):
    n = len(x_data)
    batch_i = 0
    words_size = len(char2id_dict)
    while(True):
        one_hot_x_data = []
        one_hot_y_data = []
        for i in range(batch_size):
            batch_i = (batch_i+1)%n
            # Split one poem into (six ids -> next id) pairs
            inputs,targets = get_6to1(x_data[batch_i],char2id_dict)
            for j in range(len(inputs)):
                one_hot_x_data.append(inputs[j])
                one_hot_y_data.append(targets[j])
        # The number of pairs depends on the poem lengths, so recompute the actual batch size
        batch_size_after = len(one_hot_x_data)
        input_data = np.zeros(
            (batch_size_after, unit_sentence, words_size))
        target_data = np.zeros(
            (batch_size_after, words_size))
        for i, (input_text, target_text) in enumerate(zip(one_hot_x_data, one_hot_y_data)):
            # One-hot encode each of the six input characters
            for t, index in enumerate(input_text):
                input_data[i, t, index] = 1
            # One-hot encode the target character
            target_data[i, target_text] = 1.
        yield input_data,target_data
def predict_from_nothing(epoch,x_data,char2id_dict,id2char_dict,model):
    # During training, print a sample poem every epoch to show how learning is going
    print("\n#-----------------------Epoch {}-----------------------#".format(epoch))
    words_size = len(id2char_dict)
    # Seed: the first six characters of a randomly chosen poem
    index = np.random.randint(0, len(x_data))
    sentence = x_data[index][:unit_sentence]
    def _pred(text):
        # One-hot encode the last six characters
        temp = text[-unit_sentence:]
        x_pred = np.zeros((1, unit_sentence, words_size))
        for t, index in enumerate(temp):
            x_pred[0, t, index] = 1.
        preds = model.predict(x_pred)[0]
        # Sample the next character from the predicted distribution
        choice_id = np.random.choice(range(len(preds)),1,p=preds)
        # If the unknown character (a space) was drawn, redraw uniformly until a real character comes up
        if id2char_dict[choice_id[0]] == ' ':
            while id2char_dict[choice_id[0]] in [',','。',' ']:
                choice_id = np.random.randint(0,len(char2id_dict),1)
        return choice_id
    # 24 characters in total: four lines of five characters plus punctuation
    for i in range(24-unit_sentence):
        pred = _pred(sentence)
        sentence = np.append(sentence,pred)
    output = ""
    for i in range(len(sentence)):
        output = output + id2char_dict[sentence[i]]
    print(output)
def predict_from_head(epoch,name,x_data,char2id_dict,id2char_dict,model):
    # Generate an acrostic poem from the given characters
    # Pad the name to four characters with random non-punctuation characters
    if len(name) < 4:
        for i in range(4-len(name)):
            index = np.random.randint(0,len(char2id_dict))
            while id2char_dict[index] in [',','。',' ']:
                index = np.random.randint(0,len(char2id_dict))
            name += id2char_dict[index]
    origin_name = name
    name = list(name)
    # Replace characters missing from the vocabulary with random ones
    for i in range(len(name)):
        if name[i] not in char2id_dict:
            index = np.random.randint(0,len(char2id_dict))
            while id2char_dict[index] in [',','。',' ']:
                index = np.random.randint(0,len(char2id_dict))
            name[i] = id2char_dict[index]
    name = ''.join(name)
    words_size = len(char2id_dict)
    index = np.random.randint(0, len(x_data))
    # Initial input: the last unit_sentence-1 characters of a random poem (before the end marker) plus the first given character
    sentence = np.append(x_data[index][-unit_sentence:-1],char2id_dict[name[0]])
    def _pred(text):
        temp = text[-unit_sentence:]
        x_pred = np.zeros((1, unit_sentence, words_size))
        for t, index in enumerate(temp):
            x_pred[0, t, index] = 1.
        preds = model.predict(x_pred)[0]
        choice_id = np.random.choice(range(len(preds)),1,p=preds)
        # If the unknown character (a space) was drawn, redraw uniformly until a real character comes up
        if id2char_dict[choice_id[0]] == ' ':
            while id2char_dict[choice_id[0]] in [',','。',' ']:
                choice_id = np.random.randint(0,len(char2id_dict),1)
        return choice_id
    # First line: predict the five characters that follow the first head character
    for i in range(5):
        pred = _pred(sentence)
        sentence = np.append(sentence,pred)
    # Drop the seed and keep only the first line
    sentence = sentence[-unit_sentence:]
    # Remaining three lines: append the next head character, then predict five more
    for i in range(3):
        sentence = np.append(sentence,char2id_dict[name[i+1]])
        for _ in range(5):
            pred = _pred(sentence)
            sentence = np.append(sentence,pred)
    output = []
    for i in range(len(sentence)):
        output.append(id2char_dict[sentence[i]])
    # Put the original head characters back (they may have been replaced above)
    for i in range(4):
        output[i*6] = origin_name[i]
    output = ''.join(output)
    print(output)
3. Results
列闢鳴鸞至,恭登貫鳳韜。流川將合命,悠悠意從如。
Acrostic poem:
快樂 ("happy"), although the poem it produced doesn't seem very happy...
快塵浮老田,樂炭唯愛墳。號之二畝士,芳草再無魂。
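Since every line is six characters long including its punctuation, the head characters of an acrostic sit at positions 0, 6, 12 and 18. The small check below reads them back from the generated poem; heads_of is a helper name made up for this sketch.

```python
def heads_of(poem, line_len=6):
    """Return the first character of every line_len-character line."""
    return "".join(poem[i] for i in range(0, len(poem), line_len))


poem = "快塵浮老田,樂炭唯愛墳。號之二畝士,芳草再無魂。"
print(heads_of(poem))  # prints 快樂號芳
```

The first two heads are the requested 快樂. The last two (號 and 芳) are presumably the random characters that predict_from_head uses to pad a name shorter than four characters.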