1. At the code's entry point, add:
import pdb
and
pdb.set_trace()
2. Start the code as usual
3. Useful pdb commands:
s -> step into (single-step)
pp -> pretty-print a variable's value
r -> continue until the current function returns
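A minimal sketch of the setup above (the `scale` function is a made-up stand-in for whatever code you want to stop in; the `set_trace()` call is commented out so the script also runs non-interactively):

```python
import pdb

# Uncomment the set_trace() line to drop into the debugger here,
# then use s (step into), pp <name> (pretty-print), r (return).
def scale(x):
    # pdb.set_trace()
    return x * 2

result = scale(3)
print(result)  # → 6
```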
1. ffmpeg conversion format:
f32le (raw 32-bit little-endian float samples)
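To make the f32le layout concrete, here is a small sketch (the sample values are made up) that packs two interleaved stereo frames as raw f32le bytes and decodes them back into per-frame [left, right] pairs, which is the shape the waveform below takes:

```python
import struct

# Two stereo frames, each [left, right].
frames = [[0.1, -0.2], [0.3, 0.4]]

# f32le: each sample is a 4-byte little-endian float, channels interleaved.
raw = b"".join(struct.pack("<2f", l, r) for l, r in frames)

n = len(raw) // 8  # 2 channels * 4 bytes per sample = 8 bytes per frame
decoded = [list(struct.unpack_from("<2f", raw, i * 8)) for i in range(n)]
print(n)  # → 2
```

Note that round-tripping through float32 loses a little precision versus Python's float64, which is why the decoded values only match to ~1e-7.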
2. The (waveform, sample_rate) data:
(array([[ 7.4898242e-05, 5.3574135e-05],
[ 7.9060490e-05, 8.2915823e-05],
[ 7.2390605e-05, 7.4393036e-05],
...,
[-2.2528782e-05, 3.8444487e-06],
[ 2.0584919e-05, -3.8877548e-05],
[-4.6448025e-05, 1.3482724e-05]], dtype=float32),
44100)
3. waveform:
array([[ 7.4898242e-05, 5.3574135e-05],
[ 7.9060490e-05, 8.2915823e-05],
[ 7.2390605e-05, 7.4393036e-05],
...,
[-2.2528782e-05, 3.8444487e-06],
[ 2.0584919e-05, -3.8877548e-05],
[-4.6448025e-05, 1.3482724e-05]], dtype=float32)
4. waveform.shape[-1] == 2, i.e. the waveform is stereo (two channels)
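A quick sketch of that shape check, using a two-row slice of the array above (plain lists here rather than a NumPy array, so `len(waveform[0])` plays the role of `waveform.shape[-1]`):

```python
# Last axis holds the channels: each frame is [left, right].
waveform = [[7.4898242e-05, 5.3574135e-05],
            [7.9060490e-05, 8.2915823e-05]]
n_channels = len(waveform[0])
print(n_channels == 2)  # → True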
5. params:
{'F': 1024,
'MWF': False,
'T': 512,
'batch_size': 4,
'frame_length': 4096,
'frame_step': 1024,
'instrument_list': ['vocals', 'accompaniment'],
'learning_rate': 0.0001,
'mask_extension': 'zeros',
'mix_name': 'mix',
'model': {'params': {}, 'type': 'unet.unet'},
'model_dir': 'pretrained_models/2stems',
'n_channels': 2,
'random_seed': 0,
'sample_rate': 44100,
'save_checkpoints_steps': 150,
'save_summary_steps': 5,
'separation_exponent': 2,
'throttle_secs': 300,
'train_csv': 'path/to/train.csv',
'train_max_steps': 1000000,
'training_cache': 'training_cache',
'validation_cache': 'validation_cache',
'validation_csv': 'path/to/test.csv'}
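The T, F, and n_channels entries above fix the network's input patch size: T=512 STFT frames by F=1024 frequency bins by 2 channels, which matches the (?, 512, 1024, 2) shape of the mix_spectrogram feed tensor shown later. A quick check:

```python
# Subset of the params dict above that determines the spectrogram patch shape.
params = {'T': 512, 'F': 1024, 'n_channels': 2}
patch_shape = (params['T'], params['F'], params['n_channels'])
print(patch_shape)  # → (512, 1024, 2)
```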
6. session_config (caps GPU memory use at 70% per process):
gpu_options {
per_process_gpu_memory_fraction: 0.7
}
def build_predict_model(self):
    """ Builder interface for creating model instance that aims to perform
    prediction / inference over given track. The output of such estimator
    will be a dictionary with a "<instrument>" key per separated instrument,
    associated to the estimated separated waveform of the instrument.

    :returns: An estimator for performing prediction.
    """
    self._build_stft_feature()
    output_dict = self._build_output_dict()
    output_waveform = self._build_output_waveform(output_dict)
    return tf.estimator.EstimatorSpec(
        tf.estimator.ModeKeys.PREDICT,
        predictions=output_waveform)
7. predictor:
SavedModelPredictor with feed tensors {'waveform': <tf.Tensor 'Placeholder:0' shape=(?, 2) dtype=float32>, 'mix_stft': <tf.Tensor 'transpose_1:0' shape=(?, 2049, 2) dtype=complex64>, 'mix_spectrogram': <tf.Tensor 'strided_slice_3:0' shape=(?, 512, 1024, 2) dtype=float32>, 'audio_id': <tf.Tensor 'Placeholder_1:0' shape=<unknown> dtype=string>} and fetch_tensors {'accompaniment': <tf.Tensor 'strided_slice_23:0' shape=(?, 2) dtype=float32>, 'vocals': <tf.Tensor 'strided_slice_13:0' shape=(?, 2) dtype=float32>, 'audio_id': <tf.Tensor 'Placeholder_1:0' shape=<unknown> dtype=string>}
8. Feed the waveform to the predictor:
predictor({
    'waveform': waveform,
    'audio_id': ''})
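Per the fetch tensors listed in step 7, the call returns a dict with one estimated waveform per instrument ('vocals', 'accompaniment') plus the echoed 'audio_id'. A sketch of consuming that result, where `fake_predictor` is a made-up stand-in for the real SavedModelPredictor (it just echoes the input instead of separating it):

```python
# Hypothetical stand-in: returns one waveform per instrument, mirroring
# the fetch-tensor keys of the real predictor.
def fake_predictor(feed):
    return {'vocals': feed['waveform'],
            'accompaniment': feed['waveform'],
            'audio_id': feed['audio_id']}

out = fake_predictor({'waveform': [[0.0, 0.0]], 'audio_id': ''})
print(sorted(out))  # → ['accompaniment', 'audio_id', 'vocals']
```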