Recipe for African Accented French #2813

johnjosephmorgan · 2018-11-01T20:04:54Z

This is a recipe to build an ASR system with the African Accented French corpus. It follows the mini_librispeech pattern.

danpovey · 2018-11-01T21:23:55Z

egs/yaounde_fr/s5/local/chain/tuning/run_tdnn_1a.sh

+# Num-params                 5270002
+
+
+#| model | dev tgsmall | test tgsmall | devtest tgsmall | dev tgmed | test tgmed | devtest tgmed | dev tglarge | test tglarge | devtest tglarge |


Thanks. Would you mind putting these GMM numbers in a RESULTS file instead? It would make it easier to find for those used to the usual structure. And please don't forget to include some kind of command in the RESULTS that would enable you to obtain those numbers (even if not in the exact same format). That will make it easier for others, after running it, to verify that their results are similar to yours.
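One way such a command might look is sketched below. Kaldi RESULTS files commonly pipe `grep WER` over each decode directory into `utils/best_wer.sh`; this version avoids that dependency so it can be run on its own. The function name `best_wer_per_dir` is hypothetical, and the sketch assumes Kaldi-style scoring output (files named `wer_*` containing `%WER <value> ...` lines) under paths without spaces.

```shell
#!/usr/bin/env bash
# Print the lowest-WER line found in each decode directory under a root directory.
# Assumption: scoring files are named wer_* and contain lines like
#   %WER 26.91 [ 1383 / 5138, 120 ins, 310 del, 953 sub ]
best_wer_per_dir() {
  local root=$1
  local dir
  for dir in "$root"/*/decode*; do
    [ -d "$dir" ] || continue
    # grep -H prefixes every match with its file name, so the WER value is
    # always the second space-separated field; sort on it numerically.
    grep -H '%WER' "$dir"/wer_* 2>/dev/null \
      | LC_ALL=C sort -t' ' -k2,2n \
      | head -n 1
  done
}
```

Calling `best_wer_per_dir exp` from the recipe directory would then print one line per decode directory, which can be pasted into RESULTS next to the command itself.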

… version on openslr.org. My scripts were using those new files. I changed my scripts to use the files that are currently on openslr.org. Later when we get transcripts for the answers, I will ask Yenda to update the corpus on openslr.org and I will update my scripts. I also changed to use iconv instead of uconv.


danpovey · 2018-11-18T02:22:51Z

egs/yaounde_fr/s5/RESULTS

+| tri2b | 33.02 | 19.01 | 3.85 | 33.26 | 26.36 | 9.91 | 31.48 | 21.27 | 5.53 |
+| tri3b | 26.91 | 18.85 | 3.49 | 25.90 | 24.51 | 8.51 | 23.83 | 20.01 | 4.37 |
+| chain tdnn-f | 24.02 | 17.20 | 1.96 | 22.30 | 33.66 | 16.17 | 20.14 | 18.69 | 3.33 | 
+| chain tdnn-f online | 24.21 | 17.23 | 1.96 | 22.26 | 33.72 | 16.14 | 19.10 | 32.07 | 14.74 |


Did something go wrong with the last 2 results on this line? The chain tdnnf-online with tglarge rescoring?

You don't seem to be getting as much improvement from the GMM to TDNN phase as I would normally expect.
Since there is more data, you could try increasing the bottleneck dimension from 96 to 128, reducing the l2-regularize values from 0.03 and 0.015 to 0.02 and 0.01, and reducing num-epochs from 20 to 15 (or maybe even 10, but test it).
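The suggested changes could be collected in one place as in the sketch below. All variable names are hypothetical and only loosely modeled on the mini_librispeech chain tuning scripts; the actual names in `run_tdnn_1a.sh` may differ. It also assumes that the larger l2-regularize value applies to the hidden layers and the smaller one to the output layers, which the comment does not state.

```shell
#!/usr/bin/env bash
# Hypothetical variable names; check them against the real tuning script.
bottleneck_dim=128   # was 96
l2_hidden=0.02       # was 0.03 (assumed: hidden layers)
l2_output=0.01       # was 0.015 (assumed: output layers)
num_epochs=15        # was 20; 10 is also worth testing

# Option strings in the key=value form used by xconfig layer definitions.
tdnnf_opts="l2-regularize=$l2_hidden"
output_opts="l2-regularize=$l2_output"

echo "bottleneck-dim=$bottleneck_dim $tdnnf_opts $output_opts num-epochs=$num_epochs"
```

Keeping the old values in comments makes it easier to compare the new WERs against the earlier run when the results are written up.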


stale · 2020-06-19T08:35:59Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2020-07-19T05:23:46Z

This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it.

stale · 2020-09-17T08:30:12Z

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

danpovey reviewed Nov 1, 2018


johnjosephmorgan added 5 commits November 1, 2018 18:46

Initial commit.

be8c443

Moved results to RESULTS file.

a4204c9

Corrected name of transcript file for yaounde answers training data.

845bd47

The transcript file for the yaounde answers that is currently on openslr.org is bad. The transcripts are actually the questions instead of the answers. I am including a temporary transcription that was obtained from a decoding run until we get the good transcripts.

413f0bc

danpovey reviewed Nov 18, 2018


johnjosephmorgan added 23 commits November 20, 2018 11:25

Add decoding with large lm rescoring for monophones?

055ca6e

Fixed bug. eq instead of == in string comparison.

dd49f32

Changed comments to echoes.

8f23a41

Do not exit after phonetisaurus align ? output.

439dfbd

Experiment with lower bottleneck dimension. Lowered from 128 to 96.

7fe5495

Merge.

071d7f4

Lower l2 regularization from 0.03 to 0.02.

75e3846

Added clauses to stage conditionals to allow small lm decoding exclusively.

688f768

Added clauses to stage conditionals to allow small lm decoding exclusively. Also removed mllt lda tri2b steps.

588d5b7

More fixing stage conditionals for small lm decoding only.

3efc399

Inserted a wait before decoding tri3b models with larger lms. The decoding depends on previous decoding?

35f23b5

Inserted a wait before decoding with larger LMs and enhanced lexicon.

8230717

Added another tuning script for trying to improve the chain model WER scores.

1ee6068

Updated experiment script.

77436d8

Updated experiment script.

76e9ed6

Updated experiment script.

b45cf38

Added a script to experiment with number of epochs.

0786dfc

Update experiment script.

38115e6

Corrected comparison. I ran it too early. Fewer epochs gave better results.

c615a70

Added current results. They look bad. Why are they so bad?

a485acf

Minor cleaning.

569eda1

Put tri2b models back.

7d5b1e3

Adding Yenda's scripts for kws.

682f49c

johnjosephmorgan added 8 commits November 30, 2018 20:38

Adding a script to do l2 regularization tuning experiments.

e1fe905

Adjusted number of leaves and gaussians after experimenting.

f49a67b

Add pruned lm modeling.

652f04b

I am simplifying the directory and file names in this recipe.

0f1824c

Simplifying.

c2946e5

Update.

1081fbd

Update.

ee5726a

No small medium or large LMs.

334ab41

stale bot added the stale label Jun 19, 2020

stale bot closed this Jul 19, 2020

kkm000 reopened this Jul 19, 2020

stale bot removed the stale label Jul 19, 2020

stale bot added the stale label Sep 17, 2020
