-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using the code for tedlium2_v2 with merged apostroph #218
base: main
Are you sure you want to change the base?
Conversation
from i6_experiments.common.tools.sctk import compile_sctk | ||
|
||
RASR_BINARY_PATH = compile_rasr_binaries_i6mode( | ||
branch="apptainer_tf_2_8", configure_options=["--apptainer-patch=2023-05-08_tensorflow-2.8_v1"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is not your change, but I dont like merging something that will not work, since the branch does not exist anymore. I think we can just remove this and replace this by None and then let the assert do the work.
RASR_BINARY_PATH = compile_rasr_binaries_i6mode( | ||
branch="apptainer_tf_2_8", configure_options=["--apptainer-patch=2023-05-08_tensorflow-2.8_v1"] | ||
) # use most recent RASR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RASR_BINARY_PATH = compile_rasr_binaries_i6mode( | |
branch="apptainer_tf_2_8", configure_options=["--apptainer-patch=2023-05-08_tensorflow-2.8_v1"] | |
) # use most recent RASR | |
RASR_BINARY_PATH = None # TODO please set RASR |
SCTK_BINARY_PATH = compile_sctk(branch="v2.4.12") # use last published version | ||
SCTK_BINARY_PATH.hash_overwrite = "TEDLIUM2_DEFAULT_SCTK_BINARY_PATH" | ||
|
||
SRILM_PATH = tk.Path("/work/tools/users/luescher/srilm-1.7.3/bin/i686-m64/") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to RASR I think this needs to be either changed to some u22 path or just asserted as False
@@ -0,0 +1,423 @@ | |||
from i6_core.features.filterbank import filter_width_from_channels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will just hijack this line just to make sure: You talked about some possible changes to FSA and Silence etc. for GMM, are they already included here? Or is this for now a straight copy?
|
||
|
||
def run_tedlium2_common_baseline( | ||
alias_prefix="baselines/tedlium2/gmm/common_baseline", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this have the same prefix as the other baseline?
This would cause overwriting if both are enabled. But I would not mind keeping the "true" alias to this file.
from typing import Dict | ||
|
||
from i6_experiments.common.datasets.tedlium2.constants import CONCURRENT | ||
from i6_experiments.common.datasets.tedlium2_v2.corpus import get_corpus_object_dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file pushed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am waiting for the referenced PR to be merged.
|
||
from i6_experiments.common.datasets.tedlium2.constants import CONCURRENT | ||
from i6_experiments.common.datasets.tedlium2_v2.corpus import get_corpus_object_dict | ||
from i6_experiments.common.datasets.tedlium2_v2.lexicon import ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this pushed?
from collections import defaultdict | ||
from typing import Dict | ||
|
||
from i6_experiments.common.datasets.tedlium2.constants import CONCURRENT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I know this does not change, if we do a "full copy" shouldn't we for completeness also copy this? @JackTemaki opinions?
from i6_core.corpus.convert import CorpusToTxtJob | ||
from i6_core.lexicon.conversion import LexiconToWordListJob | ||
|
||
from i6_experiments.common.datasets.tedlium2_v2.corpus import get_bliss_corpus_dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for data.py, some files are not pushed?
d5d3ebc
to
cf5884a
Compare
This is the recipes version that uses the fix from here.