-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make-probes genbank file key error #131
Comments
The problem is that In our email correspondence I asked @arnoldbain to download the latest genbank file (which has this locus) and still get's the same issue. This locus does not have a "gene" name and the way mykrobe currently indexes genbank files relies on there being a gene name. For instance, this is the entry for the locus @arnoldbain is looking at
So it seems we need update our genbank indexing method @martinghunt |
As a work-around in the meantime @arnoldbain you could try adding a gene name to the annotation for Rv1979c in the latest genbank file. You should find it on line 41310 and 41313 in the latest genbank file. So it looks something like
|
I expect this is non-trivial to fix, and needs careful thought. If we allow more than one way to refer to the same variant (eg For now, I recommend using @mbhall88's workaround. |
Working fine with @mbhall88 suggestion, thanks guys |
We could just use a sort of hierarchy when we first try and use the gene tag and of that isn't present try and use a locus tag and if that isn't present, move on? That way we would always use rpoB instead of Rv0667 |
Need to handle when someone provides two variants in the same gene, but one of them specified by the gene name and the other specified by the locus tag. There are likely more edge cases. |
Yes, but that seems to be a corner case we could very easily say we don't handle yet (or may never handle as it seems insanse) |
Mykrobe make-probes doesnot seem to like protein locus tag (e.g. Rv1979c) but ok with the name tags like rpoB; below is the error;
mykrobe variants make-probes -t gene.txt -g H37Rv.gbk rH37Rv.fasta > probes.gene.fa
[mykrobe 2021-06-11T13:44:34 WARNING] Could not connect to database. Continuing without using genetic backgrounds
Traceback (most recent call last):
File "/home/arnold/miniconda3/bin/mykrobe", line 11, in
sys.exit(main())
File "/home/arnold/miniconda3/lib/python3.8/site-packages/mykrobe/cli.py", line 8, in main
args.func(parser, args)
File "/home/arnold/miniconda3/lib/python3.8/site-packages/mykrobe/parser.py", line 55, in run_subtool
run(parser, args)
File "/home/arnold/miniconda3/lib/python3.8/site-packages/mykrobe/cmds/makeprobes.py", line 69, in run
for var_name in aa2dna.get_variant_names(
File "/home/arnold/miniconda3/lib/python3.8/site-packages/mykrobe/annotation/genes/models.py", line 181, in get_variant_names
gene = self.get_gene(gene)
File "/home/arnold/miniconda3/lib/python3.8/site-packages/mykrobe/annotation/genes/models.py", line 237, in get_gene
return self.genbank[gene]
KeyError: 'Rv1979c'
The text was updated successfully, but these errors were encountered: