How to start using VAD? #198

Open
jhoelzl opened this issue Oct 11, 2016 · 4 comments

@jhoelzl

jhoelzl commented Oct 11, 2016

Hello,

I am interested in the performance of the VAD in this project. I generated the documentation and found the page "Building a voice activity detector (VAD)".

I went to alex/tools/vad and tried to run some scripts, but I have no training data. The documentation describes these folders:

data_vad_sil # a directory with only silence, noise data and its mlf file
data_voip_cs # a directory where CS data reside and its MLF (phoneme alignment)
data_voip_en # a directory where EN data reside and its MLF (phoneme alignment)
model_voip # a directory where all the resulting models are stored.

I suppose I have to create these folders in alex/tools/vad? But where can I find or download the appropriate audio and MLF files?

Regards,
Josef

@jurcicek
Member

Unfortunately, we do not have such data publicly available. You will have to
use your own.

Best regards,
Filip Jurcicek



@jhoelzl
Author

jhoelzl commented Oct 11, 2016

Hello @jurcicek, thanks for the reply.

I realized that I have to generate the MLF files using Kaldi or HTK.
So I ran the script train_voip_en.sh in alex/tools/kaldi/data_voip_en/ and set the KALDI_ROOT variable.

Then I added some required directories to model_voip_en and data_voip_en, such as dev, test, and train.
When I run train_voip_en.sh, I get some empty files:

model_voip_en
└───local
    ├───dev
    │       spk2gender
    │       spk2utt
    │       trans.txt
    │       utt2spk
    │       wav.scp
    ├───test
    │       ...
    └───train
            ...

and therefore the script stops:

Initializing set 'dev' output files
Initializing set 'test' output files
Initializing set 'train' output files
--- Distributing the file lists to train and (dev test x build0 build2) directories ...
utils/validate_data_dir.sh: empty file spk2utt

Can you tell me what these files are for?
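
For context, wav.scp, utt2spk and spk2utt are standard Kaldi data-directory tables: wav.scp maps each utterance ID to its audio, utt2spk maps each utterance to a speaker, and spk2utt is the inverse mapping. The Python sketch below shows how the three relate. It is not part of Alex; `build_kaldi_tables` and the example IDs and paths are hypothetical names used only for illustration. An empty spk2utt is consistent with the script having found no audio to list, although the thread does not confirm the cause.

```python
from collections import defaultdict


def build_kaldi_tables(utterances):
    """Build wav.scp, utt2spk and spk2utt lines from (utt_id, spk_id, wav_path) tuples.

    Hypothetical helper for illustration only; not part of Alex or Kaldi.
    Entries are sorted by utterance ID, as Kaldi's validation expects.
    """
    wav_scp = []
    utt2spk = []
    spk2utt = defaultdict(list)
    for utt_id, spk_id, wav_path in sorted(utterances):
        wav_scp.append(f"{utt_id} {wav_path}")   # <utt-id> <audio path>
        utt2spk.append(f"{utt_id} {spk_id}")     # <utt-id> <speaker-id>
        spk2utt[spk_id].append(utt_id)           # inverse of utt2spk
    spk_lines = [f"{spk} {' '.join(utts)}" for spk, utts in sorted(spk2utt.items())]
    return {"wav.scp": wav_scp, "utt2spk": utt2spk, "spk2utt": spk_lines}


# Example with made-up IDs and paths; utterance IDs are prefixed with the
# speaker ID so that both tables sort consistently.
tables = build_kaldi_tables([
    ("spkB-utt1", "spkB", "/data/b1.wav"),
    ("spkA-utt2", "spkA", "/data/a2.wav"),
    ("spkA-utt1", "spkA", "/data/a1.wav"),
])
for name, lines in tables.items():
    print(name)
    for line in lines:
        print("  " + line)
```

With an empty input list all three tables come out empty, which is the state the validation step rejects. trans.txt and spk2gender are not covered here; presumably they hold the transcriptions and the speaker genders, but their exact layout in this recipe is an assumption.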

@jurcicek
Member

You need train, dev, and test audio and their transcriptions. Given these, you
can train an acoustic model, which force-aligns the transcriptions and
generates the MLF file. Then you can train the VAD.

DISCLAIMER: This part of the code was last used about 3 years ago. It may
not even work.

Best regards,
Filip
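
To illustrate why a phoneme alignment is useful for VAD training, the sketch below turns an HTK-style MLF into per-frame speech/non-speech labels. It is not the Alex training code. The function name `mlf_to_frame_labels`, the silence labels ("sil", "sp") and the 10 ms frame shift are assumptions; the labels and frame shift actually used in Alex may differ.

```python
def mlf_to_frame_labels(mlf_text, silence_labels=("sil", "sp"), frame_shift_s=0.01):
    """Convert HTK MLF alignments to per-frame labels: 1 = speech, 0 = non-speech.

    Hypothetical helper; the silence label names and frame shift are assumptions.
    Returns a dict mapping each label-file pattern to a list of 0/1 ints.
    """
    htk_units_per_second = 10_000_000  # HTK times are given in 100 ns units
    units_per_frame = int(round(frame_shift_s * htk_units_per_second))
    labels = {}
    current = None
    for raw in mlf_text.splitlines():
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue                      # skip blanks and the MLF header
        if line.startswith('"'):
            current = line.strip('"')     # start of a new label file entry
            labels[current] = []
        elif line == ".":
            current = None                # end of the current entry
        elif current is not None:
            fields = line.split()         # <start> <end> <label> [extra fields]
            start, end, phone = int(fields[0]), int(fields[1]), fields[2]
            n_frames = (end - start) // units_per_frame
            is_speech = 0 if phone in silence_labels else 1
            labels[current].extend([is_speech] * n_frames)
    return labels


# Made-up alignment: 30 ms silence, 20 ms of the phone "h", 10 ms silence.
example = '#!MLF!#\n"*/utt1.lab"\n0 300000 sil\n300000 500000 h\n500000 600000 sil\n.\n'
print(mlf_to_frame_labels(example))
```

Frame labels of this kind could then be paired with acoustic features to train a speech/non-speech classifier; how Alex does this internally is not described in this thread.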



@cappelll

Hi Filip,

Say I've trained a VAD model: is there already a script that takes the model file and an audio file as input and outputs the VAD results?

Thanks
