
Document or update split_vds_by_strata to more accurately reflect behavior #683

Open
mike-w-wilson opened this issue Mar 7, 2024 · 0 comments

Comments

@mike-w-wilson
Contributor

While working on v4.0, we created the gnomad_methods function split_vds_by_strata, which splits a VDS based on an expression. The desired behavior was to split the VDS while maintaining all alleles in each subset. This does not happen: the function uses Hail's vds.filter_samples, which unexpectedly removes all variants that are not present in the filtered sample subset, even when the remove_dead_alleles argument is left as False.

As it stands, the function's documentation does not state whether it maintains or removes dead alleles, only that it splits the VDS. We should consider updating the function so that removing or keeping dead alleles/variants is an option, and documenting that behavior.
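
To make the two behaviors concrete, here is a toy sketch in plain Python. It does not use Hail and does not reproduce the VDS data model or the current split_vds_by_strata implementation. The function name split_by_strata, the keep_absent_variants flag, and the dict-based call set are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def split_by_strata(genotypes, strata, keep_absent_variants=True):
    """Split a toy call set into one subset per stratum (hypothetical helper).

    genotypes: dict mapping variant -> {sample: non-ref allele count}
    strata: dict mapping sample -> stratum label
    keep_absent_variants: if False, drop variants that have no non-ref
        call among the samples of a given subset.
    """
    # Group sample IDs by their stratum label.
    samples_by_stratum = defaultdict(list)
    for sample, label in strata.items():
        samples_by_stratum[label].append(sample)

    subsets = {}
    for label, samples in samples_by_stratum.items():
        subset = {}
        for variant, calls in genotypes.items():
            # Restrict the calls to the samples in this stratum.
            kept_calls = {s: calls.get(s, 0) for s in samples}
            # Keep the variant unconditionally, or only if it is observed.
            if keep_absent_variants or any(n > 0 for n in kept_calls.values()):
                subset[variant] = kept_calls
        subsets[label] = subset
    return subsets


# Made-up example data: two variants, three samples, two strata.
genotypes = {
    "chr1:100:A:T": {"s1": 1, "s2": 0, "s3": 0},
    "chr1:200:G:C": {"s1": 0, "s2": 0, "s3": 2},
}
strata = {"s1": "XX", "s2": "XX", "s3": "XY"}

kept = split_by_strata(genotypes, strata, keep_absent_variants=True)
dropped = split_by_strata(genotypes, strata, keep_absent_variants=False)
```

With keep_absent_variants=True every subset retains both variants, which is the behavior we originally wanted. With keep_absent_variants=False each subset retains only the variants observed in its own samples, which is analogous to what we currently see.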
