merge function is crashing #13

Open
vicbp1 opened this issue Sep 21, 2024 · 6 comments


vicbp1 commented Sep 21, 2024

When running the following command:

$ibdtools merge -i ${chr}.sibd -m ${chr}.meta -o ${chr}.mibd -M 10
ibdtools merge options received:
--ibd_in: 19.sibd
--meta_in: 19.meta
--ibd_out: 19.mibd
--max_snp: 1
--max_cm: 0.6
--mem: 10

I am getting this error:

Error from ../src/../include/ibdmerger.hpp:168:

I am not sure what this error means, but when I increased the memory to 500, I received:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

Any thoughts?

Member

bguo068 commented Sep 22, 2024

Thank you for reporting this. I will first check whether the latest commit has already fixed the issue. If not, I will need your files for debugging. Could you share the path to the files used in this command?

Additionally, the merge process uses your VCF file, which should contain only biallelic sites. Could you also provide the path to the VCF file?

Author

vicbp1 commented Sep 22, 2024

Thanks!!!
I reported it today so that I would not forget.

Here are the paths:

IBD compressed files: /local/chib/toconnor_grp/TOPMed_analyses/IBD_analyses/IBDprocessing
HAPIBD results: /local/chib/toconnor_grp/TOPMed_analyses/IBD_analyses/hap-ibd_outputs
VCF files: /local/chib/toconnor_grp/TOPMed_analyses/MAF_filtered/freeze.10b.chr${chr}.phased.mac5.vcf.gz
Genetic Maps: /local/chib/toconnor_grp/victor/Public_data/INS_LDGH_data/hg38//IBD_analyses/genetic_maps/genmap_chr${chr}_space.txt

I did not see any flag for the VCF file in the merge command.

Thanks!

Member

bguo068 commented Sep 22, 2024

Oh, I meant the VCF file you used for "ibdtools encode: encode the IBD file, VCF file, and PLINK map file into binary format for better/quicker IO."

In the current implementation, ibdtools assumes all sites are phased and biallelic to achieve high memory compaction for storing genotype information. Do you know if the VCF file used in ibdtools encode contains unphased or multiallelic sites? I am checking /local/chib/toconnor_grp/TOPMed_analyses/MAF_filtered/freeze.10b.chr19.phased.mac5.vcf.gz, but it is a bit large and takes some time to finish. Could you confirm that this was the VCF file you used for ibdtools encode?
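One way to answer the phased/biallelic question without waiting on a full tool run is to scan the VCF text directly. The following is a minimal Python sketch, not part of ibdtools; the function names `scan_vcf_lines` and `scan_vcf_file` are hypothetical. It assumes a standard tab-separated VCF in which `GT` is the first FORMAT key, treats any comma in the ALT column as multiallelic, and treats any `/` genotype separator as unphased.

```python
import gzip
from typing import Dict, Iterable


def scan_vcf_lines(lines: Iterable[str]) -> Dict[str, int]:
    """Count records that are multiallelic or carry an unphased genotype."""
    counts = {"records": 0, "multiallelic": 0, "unphased": 0}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        counts["records"] += 1
        # ALT is column 5; a comma means more than one alternate allele.
        if "," in fields[4]:
            counts["multiallelic"] += 1
        # Samples start at column 10; GT is the first FORMAT key when present.
        if len(fields) > 9 and fields[8].split(":")[0] == "GT":
            if any("/" in sample.split(":")[0] for sample in fields[9:]):
                counts["unphased"] += 1
    return counts


def scan_vcf_file(path: str) -> Dict[str, int]:
    """Open a plain or gzip/bgzip-compressed VCF and scan it (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return scan_vcf_lines(handle)


# Tiny made-up example: one clean record, one multiallelic, one unphased.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    "19\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\n",
    "19\t200\t.\tC\tT,G\t.\tPASS\t.\tGT\t1|2\t0|0\n",
    "19\t300\t.\tG\tA\t.\tPASS\t.\tGT\t0/1\t0|0\n",
]
print(scan_vcf_lines(example))
```

A nonzero `multiallelic` or `unphased` count would suggest the file does not meet the assumptions described above. On a large file this still reads every record, so it is only a rough check.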

Author

vicbp1 commented Sep 22, 2024

Oh! :(
I included multiallelic sites since hap-ibd can handle them, so maybe that is the problem. I will run a test to check.
By the way, I moved from the sort function to the matrix function, assuming that only the merge step was problematic, and I got a similar error.

Thank you so much!

Member

bguo068 commented Sep 22, 2024

Got it. Yes, please try to use only biallelic sites if possible. The human genome has plenty of biallelic sites, which should suffice for IBD calling.

  • If multiallelic sites are necessary, we can consider relaxing the biallelic VCF restriction (note that refactoring the code may take some time).
  • If multiallelic sites are not necessary, I will add code in the ibdtools encode step to print an error message when encountering multiallelic VCF/BCF records.

Let me know your thoughts :)
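As an illustration of the two behaviours discussed in this comment (dropping non-biallelic records before encoding, or stopping with an error when one is found), here is a small Python sketch. It is not the ibdtools implementation; `is_biallelic` and `keep_biallelic` are hypothetical names, and "biallelic" is assumed to mean exactly one ALT allele that is not `.`.

```python
from typing import Iterable, Iterator


def is_biallelic(record: str) -> bool:
    """True when the ALT column (column 5) holds exactly one allele."""
    alt = record.split("\t")[4]
    return alt != "." and "," not in alt


def keep_biallelic(lines: Iterable[str], strict: bool = False) -> Iterator[str]:
    """Yield header lines and biallelic records.

    With strict=True, raise on the first non-biallelic record instead of
    silently dropping it.
    """
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("#") or is_biallelic(line):
            yield line
        elif strict:
            chrom, pos = line.split("\t")[:2]
            raise ValueError(f"non-biallelic record at {chrom}:{pos}")


# Made-up records: the site at position 200 has two ALT alleles.
records = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n",
    "19\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\n",
    "19\t200\t.\tC\tT,G\t.\tPASS\t.\tGT\t1|2\n",
]
filtered = list(keep_biallelic(records))
print(len(filtered))
```

If bcftools is available, `bcftools view -m2 -M2` applies the same restriction on allele count and is likely a better fit for files of this size.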

Member

bguo068 commented Sep 22, 2024

Regarding the matrix function you mentioned: could you also share the error message and the files used for it?
