-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VariantUtils and TestVariantUtils #272
Conversation
8fd76c4
to
c44cd80
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Releasing some feedback you can work on in background - feel free to reply/reach out with any questions or clarifications
# Uncomment this if needed | ||
# self.health = get_health_page(key=self.creds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think good to remove at this point
dcicutils/variant_utils.py
Outdated
SEARCH_RARE_VARIANTS_BY_GENE = '/search/?samplegeno.samplegeno_role=proband&type=VariantSample\ | ||
&variant.csq_gnomadg_af_popmax.from=0&variant.csq_gnomadg_af_popmax.to=0.001\ | ||
&variant.genes.genes_most_severe_gene.display_title=' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer f-string for multi-line strings ie:
SEARCH_RARE_VARIANTS_BY_GENE = (f'/search/?samplegeno.samplegeno_role=proband&type=VariantSample'
f'&variant.csq_gnomadg_af_popmax.from=0&'
f'variant.csq_gnomadg_af_popmax.to=0.001&'
f'variant.genes.genes_most_severe_gene.display_title=')
dcicutils/variant_utils.py
Outdated
def sort_dict_in_descending_order(unsorted_dict): | ||
"""sorts dictionary in descending value order""" | ||
sorted_list = sorted(unsorted_dict.items(), key=lambda x: x[1], reverse=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort dict how? answer: by value (key=lambda x: x[1]), so better name is sort_dict_by_value_descending
, or better yet make the descending fact a default argument that can be changed so you can just have sort_dict_by_value
. You can also make this one line by calling dict
directly on line 41.
dcicutils/variant_utils.py
Outdated
return dict(sorted_list) | ||
|
||
def create_dict_of_mutations(self, gene): | ||
"""cretes dictionary of specified gene and 10+ occuring positions with their number of variants""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever you're creating a dict like this with specific structure, add info on the docstring on what the return value actually looks like.
return {gene: self.sort_dict_in_descending_order({k: v for k, v in mutation_dict.items() if v >= 10})} | ||
|
||
@staticmethod | ||
def return_json(file_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, it is useful to use type annotations ie:
def return_json(file_name: str) -> dict:
...
See here for more info. This applies to all functions in this file.
dcicutils/variant_utils.py
Outdated
def create_url(self, gene): | ||
"""returns a url to the variants at the most commonly mutated position of a gene""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create_url
isn't the best name I'd say... What you have in the docstring should probably be reflected in the name. As a general convention, for basic usage the function name should be as self explanatory as possible.
|
||
def create_url(self, gene): | ||
"""returns a url to the variants at the most commonly mutated position of a gene""" | ||
d = self.create_dict_from_json_file('10+sorted_msa_genes_and_mutations.json') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard-coded filename likely not desired
test/test_variant_utils.py
Outdated
expected_result = vu.SEARCH_RARE_VARIANTS_BY_GENE + mock_gene + ('&variant.POS.from={pos}\ | ||
&variant.POS.to={pos}&sort=-DP') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely to be much cleaner with f-strings ie:
expected_result = (f'{vu.SEARCH_RARE_VARIANTS_BY_GENE}{mock_gene}'
f'&variant.POS.from={pos}'
f'&variant.POS.to={pos}&sort=-DP')
It's also not clear this is working as desired, seems like you may want pos actually injected and that isn't happening as written I don't think?
expected_result = vu.SEARCH_RARE_VARIANTS_BY_GENE + mock_gene + ('&variant.POS.from={pos}\ | ||
&variant.POS.to={pos}&sort=-DP') | ||
assert result == expected_result | ||
mock_create_dict_from_json_file.assert_called_once_with('10+sorted_msa_genes_and_mutations.json') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded file path again - will break tests in the workflows
mock_create_dict_from_json_file.assert_called_once_with('10+sorted_msa_genes_and_mutations.json') | ||
|
||
@patch('dcicutils.variant_utils.VariantUtils.return_json') | ||
def test_create_list_of_als_park_genes(self, mock_return_json): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tests looks very similar to another above... See if you can figure out how to parametrize them together as one
Pull Request Test Coverage Report for Build 5894264393
💛 - Coveralls |
Functions that help filter down data from CGAP.