When a Masking layer is used for speech utterances of variable length, an input dimension mismatch error is thrown. The model below, edited from test_keras.py, reproduces the error.
I think the problem is that Keras internally tries to apply the mask to the loss, which no longer has a time dimension (CTC returns a shape of (batch, 1)). You could add a layer at the end of your network that removes the mask (a layer whose `compute_mask` returns `None`).
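A minimal sketch of such a mask-removing layer, written against modern tf.keras (the layer name `RemoveMask` is my own; the original issue used an older Theano-backed Keras):

```python
import tensorflow as tf

class RemoveMask(tf.keras.layers.Layer):
    """Identity layer that stops mask propagation."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.supports_masking = True  # accept an incoming mask

    def call(self, inputs):
        return inputs  # pass activations through unchanged

    def compute_mask(self, inputs, mask=None):
        # Returning None strips the mask, so downstream ops
        # (e.g. the CTC loss) no longer try to apply it.
        return None
```

Placing this layer right before the loss keeps the mask available for everything upstream while preventing Keras from applying it to the (batch, 1)-shaped CTC output.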
Alternatively, implement the CTC loss as a layer and use the mask to compute the activation sequence lengths, while stripping the mask as described above.
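For this second approach, the key step is turning the propagated boolean mask into the per-utterance input lengths that CTC (e.g. `ctc_batch_cost`) expects. A framework-free sketch with NumPy (the helper name `mask_to_lengths` and the example batch are made up for illustration):

```python
import numpy as np

def mask_to_lengths(mask):
    """Given a boolean mask of shape (batch, time), return the
    per-utterance sequence lengths that CTC needs."""
    return mask.astype(np.int32).sum(axis=1)

# Hypothetical padded batch: two utterances of length 3 and 5,
# zero-padded on the right to frame_len = 5.
mask = np.array([[1, 1, 1, 0, 0],
                 [1, 1, 1, 1, 1]], dtype=bool)
print(mask_to_lengths(mask))  # → [3 5]
```

Inside a custom loss layer you would apply the same reduction to the mask tensor and feed the resulting lengths to the CTC cost, then return `None` from `compute_mask`.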
```python
model = Sequential()
model.add(Masking(mask_value=0., input_shape=(frame_len, nb_feat)))
model.add(LSTM(inner_dim, return_sequences=True))
model.add(BatchNormalization())
model.add(TimeDistributed(Dense(nb_output)))
```
```
ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 80, but the output's size on that axis is 16.
```
Please suggest how a Masking layer can be used together with CTC loss in Keras.