
CTC loss for test_keras.py throws error when Masking layer is used #13

Open
chandraprakash5 opened this issue Sep 5, 2016 · 1 comment


@chandraprakash5

When a Masking layer is used for speech utterances of variable length, an input dimension mismatch error is thrown. The following model, edited from test_keras.py, reproduces the error:

```python
model = Sequential()
model.add(Masking(mask_value=0., input_shape=(frame_len, nb_feat)))
model.add(LSTM(inner_dim, return_sequences=True))
model.add(BatchNormalization())
model.add(TimeDistributed(Dense(nb_output)))
```

```
ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 80, but the output's size on that axis is 16.
```

Please suggest how a Masking layer can be used with CTC loss in Keras.

@githubnemo
Contributor

I think the problem is that Keras internally tries to apply the mask to the loss, which no longer has a time dimension (CTC returns a shape of (batch, 1)). You could add a layer at the end of your network that removes the mask (a layer that returns None in compute_mask).
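A minimal sketch of such a mask-removing layer, written against the current tf.keras layer API (the class name `RemoveMask` is mine, not from Keras):

```python
# Hypothetical sketch: an identity layer that accepts a mask but does not
# propagate it, so Keras won't try to apply a time mask to the
# (batch, 1)-shaped CTC loss downstream.
from tensorflow.keras.layers import Layer


class RemoveMask(Layer):
    """Pass-through layer whose only job is to stop mask propagation."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.supports_masking = True  # accept an incoming mask without error

    def call(self, inputs, mask=None):
        return inputs  # activations pass through unchanged

    def compute_mask(self, inputs, mask=None):
        return None  # swallow the mask here
```

Appending `model.add(RemoveMask())` after the last TimeDistributed layer should keep the masked computation inside the network while preventing the mask from reaching the loss.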

Alternatively, implement the CTC loss as a layer and use the mask to compute the activation sequence lengths, while still discarding the mask itself as described above.
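The second suggestion needs the per-sample sequence lengths that CTC expects; these can be recovered from the boolean mask by counting the unmasked timesteps of each sample. A minimal NumPy sketch of that step (the helper name is mine, not part of Keras):

```python
# Hypothetical sketch: derive CTC sequence lengths from a (batch, time)
# boolean mask, as a CTC-loss layer would before calling the loss itself.
import numpy as np


def mask_to_lengths(mask):
    """Count unmasked timesteps per sample: (batch, time) bool -> (batch,) int."""
    return mask.astype(np.int32).sum(axis=1)
```

The same reduction works on tensors inside a custom layer (sum the mask cast to int over the time axis), and the resulting lengths feed the `input_length` argument of a CTC cost function.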
