keras ctc loss error: InvalidArgumentError: Saw a non-null label following a null label

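While building a speech recognition network with Keras (TensorFlow backend), using MFCC features as input, an LSTM network, and a CTC loss function, I ran into the following error: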

2018-07-02 11:32:45.861523: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at ctc_loss_op.cc:166 : Invalid argument: Saw a non-null label (index >= num_classes- 1) following a null label, batch: 4 num_classes: 29 labels:
Traceback (most recent call last):
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 47, in <module>
    ms.TrainModel(datapath, epoch = 50, batch_size = 8, save_step = 1000, filename= modelpath)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 215, in TrainModel
    #self._model.fit_generator(yielddatas, save_step, nb_worker=2)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 2230, in fit_generator
    class_weight=class_weight)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 1883, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2482, in __call__
    **self.session_kwargs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 4 num_classes: 29 labels:
         [[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=false, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]
         [[Node: training/SGD/gradients/ctc/CTCLoss_grad/mul/_155 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_1925_training/SGD/gradients/ctc/CTCLoss_grad/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op 'ctc/CTCLoss', defined at:
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_launcher.py", line 91, in <module>
    vspd.debug(filename, port_num, debug_id, debug_options, currentPid, run_as)
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_debugger.py", line 2625, in debug
    exec_file(file, globals_obj)
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 119, in exec_file
    exec_code(code, file, global_variables)
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 95, in exec_code
    exec(code_obj, global_variables)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 44, in <module>
    ms = ModelSpeech(datapath)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 40, in __init__
    self._model = self.graves()
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 94, in graves
    label_length])
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 619, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/layers/core.py", line 685, in call
    return self.function(inputs, **arguments)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 60, in ctc_lambda_func
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 3956, in ctc_batch_cost
    sequence_length=input_length), 1)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 158, in ctc_loss
    ignore_longer_outputs_than_inputs=ignore_longer_outputs_than_inputs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 285, in ctc_loss
    name=name)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Saw a non-null label (index >= num_classes - 1) following a null label, batch: 4 num_classes: 29 labels:
         [[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=false, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]
         [[Node: training/SGD/gradients/ctc/CTCLoss_grad/mul/_155 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_1925_training/SGD/gradients/ctc/CTCLoss_grad/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

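The setup: the input is an MFCC feature sequence of shape (1600, 26), with n_mfcc=26 and a maximum utterance length of 1600 frames; shorter utterances are zero-padded.

The labels have maxlength=64 and are also zero-padded when shorter.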
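
The padding described above can be sketched as follows. This is a minimal illustration, not the code from SpeechModel25.py: the helper names pad_features and pad_labels are hypothetical, and plain Python lists stand in for the arrays a real pipeline would use. The true lengths are returned alongside the padded data because K.ctc_batch_cost (see the traceback) takes input_length and label_length as separate arguments.

```python
MAX_FRAMES = 1600     # longest utterance, in frames (from the text)
N_MFCC = 26           # MFCC coefficients per frame (from the text)
MAX_LABEL_LEN = 64    # longest label sequence (from the text)


def pad_features(frames, max_frames=MAX_FRAMES, n_mfcc=N_MFCC):
    """Zero-pad a list of MFCC frames; return (padded_frames, true_length)."""
    if len(frames) > max_frames:
        raise ValueError("utterance longer than max_frames")
    padding = [[0.0] * n_mfcc for _ in range(max_frames - len(frames))]
    return list(frames) + padding, len(frames)


def pad_labels(labels, max_len=MAX_LABEL_LEN):
    """Zero-pad a list of label indices; return (padded_labels, true_length)."""
    if len(labels) > max_len:
        raise ValueError("label sequence longer than max_len")
    return list(labels) + [0] * (max_len - len(labels)), len(labels)


# Hypothetical example: a 3-frame utterance with a 2-symbol transcript.
features, input_length = pad_features([[1.0] * N_MFCC for _ in range(3)])
labels, label_length = pad_labels([5, 7])
```

Note that the value to pass as input_length is the number of time steps in y_pred, which equals the frame count only if the network does not downsample along the time axis; that is an assumption here.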

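Nevertheless, the error above appeared.

Since the network has inputdim=26 and outputdim=29, I changed ignore_longer_outputs_than_inputs to True. Running it again still failed, but the error output looked a little shorter, so there was hope. I kept debugging...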

2018-07-02 11:38:30.403252: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at ctc_loss_op.cc:166 : Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 2 num_classes: 29 labels:
Traceback (most recent call last):
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 47, in <module>
    ms.TrainModel(datapath, epoch = 50, batch_size = 8, save_step = 1000, filename= modelpath)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 216, in TrainModel
    self._model.fit_generator(yielddatas, save_step)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 2230, in fit_generator
    class_weight=class_weight)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 1883, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2482, in __call__
    **self.session_kwargs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 2 num_classes: 29 labels:
         [[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]

Caused by op 'ctc/CTCLoss', defined at:
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_launcher.py", line 91, in <module>
    vspd.debug(filename, port_num, debug_id, debug_options, currentPid, run_as)
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_debugger.py", line 2625, in debug
    exec_file(file, globals_obj)
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 119, in exec_file
    exec_code(code, file, global_variables)
  File "/home/chutz/.vscode/extensions/ms-python.python-2018.4.0/pythonFiles/PythonTools/visualstudio_py_util.py", line 95, in exec_code
    exec(code_obj, global_variables)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/train_mspeech.py", line 44, in <module>
    ms = ModelSpeech(datapath)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 40, in __init__
    self._model = self.graves()
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 94, in graves
    label_length])
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 619, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/layers/core.py", line 685, in call
    return self.function(inputs, **arguments)
  File "/media/chutz/000206BE0003636E/ASRT_SpeechRecognition/SpeechModel25.py", line 60, in ctc_lambda_func
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 3956, in ctc_batch_cost
    sequence_length=input_length), 1)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 158, in ctc_loss
    ignore_longer_outputs_than_inputs=ignore_longer_outputs_than_inputs)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 285, in ctc_loss
    name=name)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/chutz/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Saw a non-null label (index >= num_classes - 1) following a null label, batch: 2 num_classes: 29 labels:
         [[Node: ctc/CTCLoss = CTCLoss[_class=["loc:@training/SGD/gradients/ctc/CTCLoss_grad/mul"], ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ctc/Log/_123, ctc/ToInt64/_125, ctc/GatherNd, ctc/Squeeze_1/_127)]]

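This error occurs because output_dim determines the shape of y_pred, i.e. the number of predicted classes, and its value must equal the total number of label classes, counting the blank. With outputdim=29, any label index of 28 or higher is rejected, which is what "index >= num_classes - 1" in the message refers to.

I am doing Chinese recognition with 1421 pinyin classes; adding one blank class gives 1422, so setting output_dim to 1422 fixes the error.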
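
A quick check before training can catch this mismatch early. The sketch below is an illustration under stated assumptions, not part of the original project: the function names are hypothetical, labels are assumed to be zero-based integer indices, and the blank is assumed to occupy the last index (output_dim - 1), consistent with the "index >= num_classes - 1" wording of the error.

```python
def required_output_dim(label_seqs):
    """Smallest output_dim that fits every label index plus one blank class."""
    highest = max(idx for seq in label_seqs for idx in seq)
    # Indices 0..highest are real classes; one more slot is the CTC blank.
    return highest + 2


def out_of_range_labels(label_seqs, output_dim):
    """Return (sequence, position, index) for every label CTC would reject."""
    blank_index = output_dim - 1  # assumed: blank is the last class
    return [
        (seq_no, pos, idx)
        for seq_no, seq in enumerate(label_seqs)
        for pos, idx in enumerate(seq)
        if idx < 0 or idx >= blank_index
    ]


# Hypothetical unpadded label sequences drawn from 1421 pinyin classes (0..1420).
label_seqs = [[0, 5, 1420], [3, 27]]

needed = required_output_dim(label_seqs)
bad_with_29 = out_of_range_labels(label_seqs, 29)
bad_with_1422 = out_of_range_labels(label_seqs, 1422)
```

Here bad_with_29 flags the label 1420 in the first sequence, while bad_with_1422 is empty, matching the fix described above.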
