BGEN support #37

prasunanand · 2018-06-19T13:31:07Z

In GEMMA, BGEN support was added in a PR.

However, there are no tests to validate the code, which I need in order to port it to faster_lmm_d.

I need to test with BGEN files containing 500k samples. I believe this would also be a great exercise for testing GPU support.
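As a starting point for such tests, here is a minimal sketch of a round-trip check on the BGEN header block. It assumes the layout described in the BGEN specification (a 4-byte offset, then header length, variant count, sample count, the `bgen` magic bytes, an optional free data area, and a 4-byte flags field, all little-endian). The function names `build_bgen_header` and `parse_bgen_header` are hypothetical and are not part of GEMMA or faster_lmm_d.

```python
import struct


def build_bgen_header(n_variants, n_samples, compression=1, layout=2,
                      has_sample_ids=False):
    """Build the first bytes of a synthetic BGEN file (hypothetical helper).

    Produces the 4-byte offset followed by a 20-byte header block with an
    empty free data area.
    """
    # Flags: bits 0-1 compression, bits 2-5 layout, bit 31 sample identifiers.
    flags = (compression & 0x3) | ((layout & 0xF) << 2) | (int(has_sample_ids) << 31)
    header_length = 20
    header = struct.pack("<III4sI", header_length, n_variants, n_samples,
                         b"bgen", flags)
    # The offset is counted from the fifth byte of the file; with no sample
    # identifier block, variant data would start right after the header.
    offset = header_length
    return struct.pack("<I", offset) + header


def parse_bgen_header(data):
    """Parse the offset and header block from the start of a BGEN file."""
    offset, header_length, n_variants, n_samples = struct.unpack_from("<IIII", data, 0)
    magic = data[16:20]
    # Readers should accept either 'bgen' or four zero bytes as the magic number.
    if magic not in (b"bgen", b"\x00\x00\x00\x00"):
        raise ValueError("not a BGEN file: bad magic bytes %r" % magic)
    # The flags are the last 4 bytes of the header block, which starts at byte 4.
    (flags,) = struct.unpack_from("<I", data, 4 + header_length - 4)
    return {
        "offset": offset,
        "header_length": header_length,
        "n_variants": n_variants,
        "n_samples": n_samples,
        "compression": flags & 0x3,
        "layout": (flags >> 2) & 0xF,
        "has_sample_ids": bool(flags >> 31),
    }


header_bytes = build_bgen_header(n_variants=3, n_samples=500000)
info = parse_bgen_header(header_bytes)
```

A synthetic header like this only exercises the parser; it says nothing about performance on a real 500k-sample file, which would still need an actual dataset.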

PS: This thread tracks the implementation of BGEN file support.


prasunanand · 2018-08-20T07:29:24Z

Might be helpful for future reference.

The following table compares the features of the various formats:

| Feature | PLINK binary | GEN | BGEN v1.1 | BGEN v1.2 / v1.3 | VCF | BCF |
| --- | --- | --- | --- | --- | --- | --- |
| Supports unphased genotype calls | ✓ | ✓* | ✓* | ✓ | ✓ | ✓ |
| Supports unphased genotype probabilities | | ✓ | ✓ | ✓ | ✓ | ✓ |
| Supports NULL/outlier probability e.g. NULL class from CHIAMO / GenoSNP | | ✓ | ✓ | | ✓ | ✓ |
| Supports non-diploid samples | | † | † | ✓ | ✓‡ | ✓‡ |
| Supports phased data? | | | | ✓ | ✓‡ | ✓‡ |
| Supports multi-allelic variants | | | | ✓ | ✓ | ✓ |
| Efficient representation? | ✓ | | ✓ | ✓ | | ✓ |

*Hard-called genotypes are converted to probabilities in GEN and BGEN v1.1.

†By convention, males on the X chromosome are stored as homozygote females in GEN and BGEN v1.1.

‡At the time of writing, the storage of genotype likelihoods and probabilities for non-diploid samples and/or phased data in VCF/BCF is not fully specified.
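To illustrate the two conventions noted for GEN and BGEN v1.1, here is a small sketch of how a hard call could be turned into a probability triple (P(AA), P(AB), P(BB)). The function names are hypothetical, and encoding a missing call as three zeros is an assumption based on common GEN usage, not something stated in this thread.

```python
def hard_call_to_probabilities(b_allele_count):
    """Convert a diploid hard call (count of B alleles: 0, 1 or 2) into
    a (P(AA), P(AB), P(BB)) triple. None is treated as a missing call and
    encoded as all zeros (assumed convention)."""
    if b_allele_count is None:
        return (0.0, 0.0, 0.0)
    if b_allele_count not in (0, 1, 2):
        raise ValueError("expected 0, 1, 2 or None, got %r" % (b_allele_count,))
    probs = [0.0, 0.0, 0.0]
    probs[b_allele_count] = 1.0  # all probability mass on the called genotype
    return tuple(probs)


def haploid_x_to_probabilities(allele):
    """Encode a male X-chromosome call (allele 0 = A, 1 = B) as the
    corresponding homozygote, following the convention in the footnote."""
    if allele not in (0, 1):
        raise ValueError("expected allele 0 or 1, got %r" % (allele,))
    return hard_call_to_probabilities(2 * allele)


het = hard_call_to_probabilities(1)
male_b = haploid_x_to_probabilities(1)
```

Note that this encoding loses the ploidy information: a male carrying B on the X chromosome becomes indistinguishable from a BB female, which is the limitation the table marks with †.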

Found this on http://www.well.ox.ac.uk/~gav/bgen_format/

pjotrp · 2018-08-21T18:36:21Z

It is also important how quickly a file format can be streamed for parallel processing. Binary formats typically do no better than compressed textual data here. I see that as premature optimization ;).

I suspect that for GEMMA we will end up with our own R/qtl2-based format and convert from one of the above.

Computing probabilities is something we like to control. Also, it is not a great idea to have GEMMA support multiple formats, for maintenance reasons. One format is enough. Conversion will be rapid, so we can pipe it in.
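A rough sketch of what a streaming converter could look like, assuming GEN-style text input with five leading columns (SNP ID, rsID, position, allele A, allele B) followed by three probabilities per sample. The function name `gen_to_dosages` and the output (expected B-allele dosage per sample) are purely illustrative; this is not the R/qtl2-based format mentioned above, which is not specified in this thread.

```python
import io


def gen_to_dosages(lines, n_leading=5):
    """Lazily convert GEN-style lines to (rsid, dosages) pairs.

    `lines` can be any iterable of text lines, for example sys.stdin,
    so the converter can sit in the middle of a shell pipeline.
    """
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        meta = fields[:n_leading]
        probs = [float(x) for x in fields[n_leading:]]
        if len(probs) % 3 != 0:
            raise ValueError("expected three probabilities per sample")
        # Expected number of B alleles: P(AB) + 2 * P(BB).
        dosages = [probs[i + 1] + 2.0 * probs[i + 2]
                   for i in range(0, len(probs), 3)]
        yield meta[1], dosages


# Small in-memory demonstration; a real pipeline would pass sys.stdin instead.
demo = io.StringIO("snp1 rs1 1000 A G 1 0 0 0 1 0 0 0 1\n")
converted = list(gen_to_dosages(demo))
```

Because the function is a generator, only one variant is held in memory at a time, so the same code works on a decompressed stream piped in from another process.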

prasunanand assigned pjotrp and prasunanand Jun 19, 2018
