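While building a speech recognition network with Keras (TensorFlow backend), using MFCC features as input, an LSTM network, and the CTC loss function, I ran into the following error: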
2018-07-02 11:32:45.861523: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at ctc_loss_op.cc:166 : Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 4 num_classes: 29 labels:
Traceback (most recent call last):
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 47, in <module>
ms.TrainModel(datapath, epoch = 50, batch_size = 8, save_step = 1000, filename= modelpath)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 215, in TrainModel
#self._model.fit_generator(yielddatas, save_step, nb_worker=2)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 2230, in fit_generator
class_weight=class_weight)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 1883, in train_on_batch
outputs = self.train_function(ins)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2482, in __call__
**self.session_kwargs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 4 num_classes: 29 labels:
[[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=false, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]
[[Node: training/SGD/gradients/ctc/CTCLoss_grad/mul/_155 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_1925_training/SGD/gradients/ctc/CTCLoss_grad/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
Caused by op 'ctc/CTCLoss', defined at:
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_launcher.py", line 91, in <module>
vspd.debug(filename, port_num, debug_id, debug_options, currentPid, run_as)
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_debugger.py", line 2625, in debug
exec_file(file, globals_obj)
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 119, in exec_file
exec_code(code, file, global_variables)
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 95, in exec_code
exec(code_obj, global_variables)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 44, in <module>
ms = ModelSpeech(datapath)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 40, in __init__
self._model = self.graves()
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 94, in graves
label_length])
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 619, in __call__
output = self.call(inputs, **kwargs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/layers/core.py", line 685, in call
return self.function(inputs, **arguments)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 60, in ctc_lambda_func
return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 3956, in ctc_batch_cost
sequence_length=input_length), 1)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 158, in ctc_loss
ignore_longer_outputs_than_inputs=ignore_longer_outputs_than_inputs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 285, in ctc_loss
name=name)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Saw a non-null label (index >= num_classes - 1) following a null label, batch: 4 num_classes: 29 labels:
[[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=false, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]
[[Node: training/SGD/gradients/ctc/CTCLoss_grad/mul/_155 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_1925_training/SGD/gradients/ctc/CTCLoss_grad/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
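The setup: each utterance is fed in as a sequence of MFCC features with shape (1600, 26), where n_mfcc=26 and 1600 is the maximum utterance length in frames; shorter utterances are zero-padded.
The labels have maxlength=64 and are likewise zero-padded when shorter.
Even so, the error above appeared.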
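The padding described above can be sketched in plain Python as follows. This is only an illustration under assumptions: the helper names `pad_features` and `pad_labels` are hypothetical and do not come from the ASRT_SpeechRecognition code, and real code would normally use NumPy arrays rather than lists.

```python
MAX_FRAMES = 1600    # longest utterance, in MFCC frames (from the setup above)
N_MFCC = 26          # MFCC coefficients per frame
MAX_LABEL_LEN = 64   # longest label sequence


def pad_features(frames, max_frames=MAX_FRAMES, n_mfcc=N_MFCC):
    """Zero-pad a list of MFCC frames to max_frames rows; also return the true length."""
    if len(frames) > max_frames:
        raise ValueError("utterance is longer than max_frames")
    padded = [list(frame) for frame in frames]
    padded += [[0.0] * n_mfcc for _ in range(max_frames - len(frames))]
    return padded, len(frames)


def pad_labels(labels, max_len=MAX_LABEL_LEN):
    """Zero-pad a label sequence to max_len; also return the true length."""
    if len(labels) > max_len:
        raise ValueError("label sequence is longer than max_len")
    return list(labels) + [0] * (max_len - len(labels)), len(labels)


# Hypothetical 3-frame utterance with 3 labels.
features, input_length = pad_features([[0.1] * N_MFCC for _ in range(3)])
labels, label_length = pad_labels([5, 12, 7])
```

The true lengths matter because `K.ctc_batch_cost` receives `input_length` and `label_length` alongside the padded tensors. Note that `input_length` has to count the time steps of `y_pred`, which equals the frame count only if the network does not downsample in time.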
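Because the network has input_dim=26 and output_dim=29, I changed ignore_longer_outputs_than_inputs to True. Running it again still failed with the same InvalidArgumentError, although the output was slightly shorter, so there was some hope and I kept debugging. (That flag only tells TensorFlow to skip samples whose label sequence is longer than the input sequence; it has nothing to do with the feature or class dimensions, so it was never likely to help here.)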
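To see which samples the ignore_longer_outputs_than_inputs flag would actually affect, one can compare the two length vectors directly. The sketch below is an assumption-laden illustration: the function name and the example lengths are made up, and it only checks the simple "label longer than input" condition.

```python
def longer_outputs_than_inputs(input_lengths, label_lengths):
    """Return batch indices whose label sequence is longer than the input sequence."""
    return [
        index
        for index, (n_steps, n_labels) in enumerate(zip(input_lengths, label_lengths))
        if n_labels > n_steps
    ]


# Hypothetical batch: only the second sample has more labels than time steps.
skipped = longer_outputs_than_inputs([1600, 40, 900], [64, 50, 12])
```

With inputs of up to 1600 frames and labels of at most 64, this list would normally be empty, which is consistent with the flag making no difference to the error.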
2018-07-02 11:38:30.403252: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at ctc_loss_op.cc:166 : Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 2 num_classes: 29 labels:
Traceback (most recent call last):
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 47, in <module>
ms.TrainModel(datapath, epoch = 50, batch_size = 8, save_step = 1000, filename= modelpath)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 216, in TrainModel
self._model.fit_generator(yielddatas, save_step)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 2230, in fit_generator
class_weight=class_weight)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 1883, in train_on_batch
outputs = self.train_function(ins)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2482, in __call__
**self.session_kwargs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 2 num_classes: 29 labels:
[[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]
Caused by op 'ctc/CTCLoss', defined at:
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_launcher.py", line 91, in <module>
vspd.debug(filename, port_num, debug_id, debug_options, currentPid, run_as)
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_debugger.py", line 2625, in debug
exec_file(file, globals_obj)
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 119, in exec_file
exec_code(code, file, global_variables)
File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 95, in exec_code
exec(code_obj, global_variables)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 44, in <module>
ms = ModelSpeech(datapath)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 40, in __init__
self._model = self.graves()
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 94, in graves
label_length])
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 619, in __call__
output = self.call(inputs, **kwargs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/layers/core.py", line 685, in call
return self.function(inputs, **arguments)
File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 60, in ctc_lambda_func
return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 3956, in ctc_batch_cost
sequence_length=input_length), 1)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 158, in ctc_loss
ignore_longer_outputs_than_inputs=ignore_longer_outputs_than_inputs)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 285, in ctc_loss
name=name)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Saw a non-null label (index >= num_classes - 1) following a null label, batch: 2 num_classes: 29 labels:
[[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]
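The error occurs because output_dim determines the last dimension of y_pred, that is, the number of predicted classes, and this must equal the total number of label classes plus one for the CTC blank. TensorFlow reserves the last index, num_classes - 1, for the blank, so every label must be smaller than num_classes - 1; with num_classes = 29, any label index of 28 or higher triggers the error.
I am doing Chinese recognition with 1421 pinyin classes. Adding one blank gives 1422, so setting output_dim to 1422 fixes the problem.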
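A quick way to catch this before training is to check the labels against the output dimension. The following is a minimal sketch, assuming labels are integer indices starting at 0 and that only the first label_length entries of each padded row are real; the function names and the sample batch are hypothetical.

```python
def required_output_dim(num_label_classes):
    """Output size needed for CTC: one unit per label class plus one for the blank."""
    return num_label_classes + 1


def find_out_of_range_labels(batch_labels, label_lengths, num_classes):
    """Return (batch_index, position, label) for labels outside [0, num_classes - 2]."""
    blank = num_classes - 1  # TensorFlow uses the last index as the CTC blank
    bad = []
    for batch_index, (labels, length) in enumerate(zip(batch_labels, label_lengths)):
        for position, label in enumerate(labels[:length]):  # skip the zero padding
            if label < 0 or label >= blank:
                bad.append((batch_index, position, label))
    return bad


# Hypothetical padded batch with pinyin indices up to 1420.
batch_labels = [[3, 17, 0, 0], [5, 1420, 28, 0]]
label_lengths = [2, 3]

too_small = find_out_of_range_labels(batch_labels, label_lengths, num_classes=29)
fixed = find_out_of_range_labels(
    batch_labels, label_lengths, num_classes=required_output_dim(1421)
)
```

With num_classes=29 the check reports the labels 1420 and 28, while with 1422 output units it reports nothing, matching the fix described above.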