Major feature release, adding support for population genetic statistics,
improved VCF output and many other features.
Note: Version 0.2.0 was skipped because of an error uploading to PyPI
which could not be undone.
Breaking changes
- Genotype arrays returned by TreeSequence.variants and
  TreeSequence.genotype_matrix have changed from unsigned 8 bit values
  to signed 8 bit values to accommodate missing data (see :issue:144 for
  discussion). Specifically, the dtype of the genotype arrays has changed
  from numpy "uint8" to "int8". This should not affect client code unless it
  depends specifically on the dtype of the returned numpy array.
- The VCF written by the write_vcf method is no longer compatible with
  previous versions, which had significant shortcomings. Position values are
  now rounded to the nearest integer by default, and REF and ALT values are
  derived from the actual allelic states (rather than always being A and T).
  Sample names are now of the form tsk_j for sample ID j. Most of the legacy
  behaviour can be recovered with new options, however.
- The positional parameter reference_sets in the
  genealogical_nearest_neighbours and mean_descendants TreeSequence methods
  has been renamed to sample_sets.
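The move from unsigned to signed 8 bit genotypes can be illustrated without numpy. The sketch below uses Python's standard array module as a stand-in for 8 bit numpy arrays, and it assumes that missing data is encoded as -1; the changelog itself does not state the sentinel value, and the variable names are purely illustrative.

```python
from array import array

MISSING = -1  # assumed sentinel for missing data; not stated in this changelog

# Signed 8 bit storage ("b") can hold the sentinel next to allele indexes.
signed_genotypes = array("b", [0, 1, MISSING])

# Unsigned 8 bit storage ("B") rejects negative values outright.
try:
    array("B", [0, 1, MISSING])
    unsigned_accepts_missing = True
except OverflowError:
    unsigned_accepts_missing = False

# The same bytes reinterpreted as unsigned read back as 255, so code that
# still assumes unsigned genotypes would mistake missing data for allele 255.
reinterpreted = array("B", signed_genotypes.tobytes())
```

Client code that only compares genotypes against small allele indexes is unaffected; code that casts or serialises the array with an explicit unsigned type is where the change would show up.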
New features
- Support for general windowed statistics. Includes implementations of
  diversity, divergence, segregating sites, Tajima's D, Fst, Patterson's F
  statistics, Y statistics, trait correlations and covariance, and
  k-dimensional allele frequency spectra (:user:petrelharp,
  :user:jeromekelleher, :user:molpopgen).
- Add the keep_unary option to simplify (:user:gtsambos). See :issue:1 and
  :pr:143.
- Add the map_ancestors method to TableCollection (:user:gtsambos). See
  :pr:175.
- Add the squash method to EdgeTable (:user:gtsambos). See :issue:59 and
  :pr:285.
- Add support for individuals to VCF output, and fix major issues with output
  format (:user:jeromekelleher). Position values are transformed in a much
  more straightforward manner and output has been generalised substantially.
  Adds individual_names and position_transform arguments. See :pr:286, and
  issues :issue:2, :issue:30 and :issue:73.
- Control height scale in SVG trees using 'tree_height_scale' and
  'max_tree_height' (:user:hyanwong, :user:jeromekelleher). See :issue:167,
  :pr:168. Various other improvements to tree drawing (:pr:235, :pr:241,
  :pr:242, :pr:252, :pr:259).
- Add Tree.max_root_time property (:user:hyanwong, :user:jeromekelleher).
  See :pr:170.
- Improved input checking on various methods taking numpy arrays as parameters
  (:user:hyanwong). See :issue:8 and :pr:185.
- Define the branch length over roots in trees to be zero (previously raise
- Implementation of the genealogical nearest neighbours statistic
  (:user:hyanwong, :user:jeromekelleher).
- New delete_intervals and keep_intervals methods for the TableCollection
  to allow slicing out of topology from specific intervals (:user:hyanwong,
  :user:andrewkern, :user:petrelharp, :user:jeromekelleher). See :pr:225 and
  :pr:261.
- Support for missing data via a topological definition
  (:user:jeromekelleher). See :issue:270 and :pr:272.
- Add ability to set columns directly in the Tables API
  (:user:jeromekelleher). See :issue:12 and :pr:307.
- Various documentation improvements from :user:brianzhang, :user:hyanwong,
  :user:petrelharp and :user:jeromekelleher.
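As a rough illustration of what a windowed statistic computes, the sketch below calculates diversity in windows directly from a genotype matrix. It is not the library's implementation: the function name windowed_site_diversity is hypothetical, diversity is taken here to mean the mean number of pairwise differences between samples per unit of sequence length, and missing data is assumed to be absent.

```python
from itertools import combinations


def windowed_site_diversity(positions, genotypes, windows):
    """Mean pairwise differences per unit length in each window.

    positions: site positions along the sequence.
    genotypes: one row per site, holding one allele index per sample.
    windows: increasing breakpoints; window i covers [windows[i], windows[i + 1]).
    """
    num_samples = len(genotypes[0])
    pairs = list(combinations(range(num_samples), 2))
    result = []
    for left, right in zip(windows[:-1], windows[1:]):
        total = 0.0
        for pos, row in zip(positions, genotypes):
            if left <= pos < right:
                # Fraction of sample pairs carrying different alleles at this site.
                differing = sum(row[a] != row[b] for a, b in pairs)
                total += differing / len(pairs)
        # Normalise by the window length so windows of different size compare.
        result.append(total / (right - left))
    return result


# Three sites, four samples, two windows of length 5.
example = windowed_site_diversity(
    positions=[1.0, 4.0, 7.0],
    genotypes=[[0, 1, 0, 1], [0, 0, 0, 1], [1, 1, 0, 0]],
    windows=[0, 5, 10],
)
```

Enumerating every pair of samples is quadratic in the sample count, which is acceptable for a small worked example but not for real data sets.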
Deprecated
- Deprecate Tree.length in favour of Tree.span (:user:hyanwong). See :pr:169.
- Deprecate TreeSequence.pairwise_diversity in favour of the new diversity
  method. See :issue:215, :pr:312.
Bugfixes
- Catch NaN and infinity values within tables (:user:hyanwong). See
  :issue:293 and :pr:294.