Python 0.4.0 #2058
benjeffery started this conversation in General
Major Python release
Breaking changes
- The `Tree.num_nodes` method is now deprecated with a warning, because it confusingly returns the number of nodes in the entire tree sequence rather than in the tree. Text summaries of trees (e.g. `str(tree)`) now return the number of nodes in the tree, not in the entire tree sequence. (@hyanwong, #1966, #1968)
- The CLI `info` command now gives more detailed information on the tree sequence. (@benjeffery, #1611)
- 64 bits are now used to store the sizes of ragged table columns such as metadata, allowing them to hold more data. This change is fully backwards and forwards compatible for all tree-sequences whose ragged column sizes fit into 32 bits. New tree-sequences with large offset arrays that require 64 bits will fail to load in previous versions with the error `_tskit.FileFormatError: An incompatible type for a column was found in the file`. (@jeromekelleher, #343, #1527, #1528, #1530, #1554, #1573, #1589, #1598, #1628, #1571, #1579, #1585, #1590, #1602, #1618, #1620, #1652)
- The Tree class now conceptually has an extra node, the "virtual root", whose children are the roots of the tree. The quintuply linked tree arrays (parent_array, left_child_array, right_child_array, left_sib_array and right_sib_array) all have one extra element. (@jeromekelleher, #1691, #1704)
- Tree traversal orders returned by the `nodes` method have changed when there are multiple roots. Previously orders were defined locally for each root, but they are now defined globally across all roots. (@jeromekelleher, #1704)
- Individuals are no longer guaranteed or required to be topologically sorted in a tree sequence. `TableCollection.sort` no longer sorts individuals. (@benjeffery, #1774, #1789)
- Metadata encoding errors now raise `MetadataEncodingError`. (@benjeffery, #1505, #1827)
- For `TreeSequence.samples`, all arguments after `population` are now keyword only. (@benjeffery, #1715, #1831)
- Remove the method `TreeSequence.to_nexus` and replace it with `TreeSequence.as_nexus`. As the old method was not generating standards-compliant output, it seems unlikely that it was used by anyone. Calls to `to_nexus` will result in a NotImplementedError, informing users of the change. See below for details on `as_nexus`.
- Change the default value for `missing_data_char` in the `TreeSequence.haplotypes` method from "-" to "N". This is a more idiomatic way to indicate missing data rather than a gap in an alignment. (@jeromekelleher, #1893, #1894)
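
The virtual root and the change in traversal order listed above can be pictured with a small pure-Python sketch. It does not use tskit itself: `build_linked_arrays`, `timeasc_per_root` and `timeasc_global` are hypothetical helpers written for illustration, and as a simplification every node without a parent is treated as a root. The time-ascending ordering is only one way in which a per-root and a global ordering can differ.

```python
NULL = -1  # same sentinel value tskit uses for "no node"


def build_linked_arrays(parent):
    """Build quintuply linked arrays with one extra slot for a virtual root.

    Simplification (an assumption of this sketch): every node whose parent
    is NULL is treated as a root.
    """
    n = len(parent)
    virtual_root = n  # the extra element sits at the end of every array
    parent_array = list(parent) + [NULL]
    left_child = [NULL] * (n + 1)
    right_child = [NULL] * (n + 1)
    left_sib = [NULL] * (n + 1)
    right_sib = [NULL] * (n + 1)
    for u in range(n):
        # Roots keep parent == NULL but are linked as children of the virtual root.
        p = virtual_root if parent[u] == NULL else parent[u]
        last = right_child[p]
        if last == NULL:
            left_child[p] = u
        else:
            right_sib[last] = u
            left_sib[u] = last
        right_child[p] = u
    return parent_array, left_child, right_child, left_sib, right_sib


def children(u, left_child, right_sib):
    """Yield the children of u from left to right."""
    v = left_child[u]
    while v != NULL:
        yield v
        v = right_sib[v]


def subtree(u, left_child, right_sib):
    """Return u and its descendants in preorder."""
    out = [u]
    for c in children(u, left_child, right_sib):
        out.extend(subtree(c, left_child, right_sib))
    return out


def timeasc_per_root(time, left_child, right_sib):
    """Order nodes by (time, id) separately within each root's subtree."""
    virtual_root = len(time)
    order = []
    for root in children(virtual_root, left_child, right_sib):
        nodes = subtree(root, left_child, right_sib)
        order.extend(sorted(nodes, key=lambda u: (time[u], u)))
    return order


def timeasc_global(time, left_child, right_sib):
    """Order all nodes by (time, id) across every root at once."""
    virtual_root = len(time)
    nodes = subtree(virtual_root, left_child, right_sib)[1:]  # drop the virtual root
    return sorted(nodes, key=lambda u: (time[u], u))


# Toy tree with two roots: node 4 (children 0, 1) and node 5 (children 2, 3).
parent = [4, 4, 5, 5, NULL, NULL]
time = [0, 0, 0, 0, 2, 1]
parent_array, left_child, right_child, left_sib, right_sib = build_linked_arrays(parent)

per_root_order = timeasc_per_root(time, left_child, right_sib)  # [0, 1, 4, 2, 3, 5]
global_order = timeasc_global(time, left_child, right_sib)      # [0, 1, 2, 3, 5, 4]
```

In this toy example each array has seven entries for six nodes, and the last slot (index 6) plays the role of the virtual root whose children are nodes 4 and 5.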
Features
- Add the `ibd_segments` method and associated classes to compute, summarise and store segments of identity by descent from a tree sequence. (@gtsambos, @jeromekelleher)
- Allow skipping of site and mutation tables in `TableCollection.sort`. (@benjeffery, #1475, #1826)
- Add `TableCollection.sort_individuals` to sort the individuals, as this is no longer done by the default sort. (@benjeffery, #1774, #1789)
- Add `__setitem__` to all tables, allowing single rows to be updated. For example: `tables.nodes[0] = tables.nodes[0].replace(flags=tskit.NODE_IS_SAMPLE)`. (@jeromekelleher, @benjeffery, #1545, #1600)
- Add a new parameter `time` to `TreeSequence.samples()`, allowing samples to be selected at a specific time point or within a time interval. (@mufernando, @petrelharp, #1692, #1700)
- Add `table.metadata_vector` to all table classes to allow easy extraction of a single metadata key into an array. (@petrelharp, #1676, #1690)
- Add `time_units` to `TreeSequence` to describe the units of the time dimension of the tree sequence. This is then used to generate an error if `time_units` is `uncalibrated` when using the branch lengths in statistics. (@benjeffery, #1644, #1760, #1832)
- Add the `virtual_root` property to the Tree class. (@jeromekelleher, #1704)
- Add the `num_edges` property to the Tree class. (@jeromekelleher, #1704)
- Improved performance for tree traversal methods in the `nodes` iterator. Roughly a 10X performance increase for "preorder", "postorder", "timeasc" and "timedesc". (@jeromekelleher, #1704)
- Substantial performance improvement for `Tree.total_branch_length`. (@jeromekelleher, #1794, #1799)
- Add the `discrete_genome` property to the TreeSequence class, which is true if all coordinates are discrete. (@jeromekelleher, #1144, #1819)
- Add a `random_nucleotides` function. (@jeromekelleher, #1825)
- Add the `TreeSequence.alignments` method. (@jeromekelleher, #1825)
- Add alignment export in the FASTA and nexus formats using the `TreeSequence.write_nexus` and `TreeSequence.write_fasta` methods. (@jeromekelleher, @hyanwong, #1894)
- Add the `discrete_time` property to the TreeSequence class, which is true if all time coordinates are discrete or unknown. (@benjeffery, #1839, #1890)
- Add the `skip_tables` option to `load` to support only loading top-level information from a file. Also add the `ignore_tables` option to `TableCollection.equals` and `TableCollection.assert_equals` to compare only top-level information. (@clwgg, #1882, #1854)
- Add the `skip_reference_sequence` option to `load`. Also add the `ignore_reference_sequence` option to `equals` to compare two table collections without comparing their reference sequence. (@clwgg, #2019, #1971)
- tskit now supports Python 3.10. (@benjeffery, #1895, #1949)
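
The row-update pattern from the `__setitem__` entry above (read a row, build a modified copy with `replace`, assign it back) can be sketched without tskit. `NodeRow` and `MiniNodeTable` below are hypothetical stand-ins, not tskit classes; the sketch assumes rows are immutable values, which is why the copy-and-assign step is needed.

```python
import dataclasses

NODE_IS_SAMPLE = 1  # mirrors the value of tskit.NODE_IS_SAMPLE


@dataclasses.dataclass(frozen=True)
class NodeRow:
    """Hypothetical immutable row with a replace() helper."""

    flags: int = 0
    time: float = 0.0

    def replace(self, **kwargs):
        # Return a modified copy; the original row is left untouched.
        return dataclasses.replace(self, **kwargs)


class MiniNodeTable:
    """Hypothetical column-oriented table supporting row get and set."""

    def __init__(self):
        self.flags = []
        self.time = []

    def add_row(self, flags=0, time=0.0):
        self.flags.append(flags)
        self.time.append(time)
        return len(self.flags) - 1  # ID of the new row

    def __len__(self):
        return len(self.flags)

    def __getitem__(self, index):
        # Rows are assembled on demand from the columns.
        return NodeRow(flags=self.flags[index], time=self.time[index])

    def __setitem__(self, index, row):
        # Write each field of the row back into its column.
        self.flags[index] = row.flags
        self.time[index] = row.time


nodes = MiniNodeTable()
nodes.add_row(flags=0, time=1.5)

# Same shape as the example in the release notes.
nodes[0] = nodes[0].replace(flags=NODE_IS_SAMPLE)
```

Only the `flags` column changes here; the `time` value of 1.5 is carried over by `replace`.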
Fixes
- `dump_tables` omitted individual parents. (@benjeffery, #1828, #1884)
- Add the `Tree.as_newick` method and deprecate `Tree.newick`. The `as_newick` method by default labels samples with the pattern `"n{node_id}"`, which is much more useful than the behaviour of `Tree.newick` (which mimics `ms` output). (@jeromekelleher, #1671, #1838)
- Add the `as_nexus` and `write_nexus` methods to the TreeSequence class, replacing the broken `to_nexus` method (see above). This uses the same sample labelling pattern as `as_newick`. (@jeetsukumaran, @jeromekelleher, #1785, #1835, #1836, #1838)
- `load_text` created additional populations even if the population table was specified, and didn't strip newlines from input text. (@hyanwong, #1909, #1910)
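
To make the `"n{node_id}"` labelling pattern mentioned for `as_newick` concrete, here is a minimal pure-Python sketch. `as_newick_sketch` and its parameters are hypothetical and do not reproduce tskit's implementation; it writes topology only (no branch lengths) and assumes the tree is given as a mapping from each internal node to its children.

```python
def as_newick_sketch(children, root, samples, node_labels=None):
    """Write a topology-only newick string, labelling samples as n{node_id}.

    `children` maps each internal node ID to a list of child IDs. Nodes
    without a label (here, the internal nodes) are written unlabelled.
    """
    if node_labels is None:
        node_labels = {u: f"n{u}" for u in samples}

    def visit(u):
        label = node_labels.get(u, "")
        kids = children.get(u, [])
        if not kids:
            return label
        return "(" + ",".join(visit(c) for c in kids) + ")" + label

    return visit(root) + ";"


# Toy tree: root 5 has children 4 and 2; node 4 has children 0 and 1.
toy_children = {4: [0, 1], 5: [4, 2]}

default_newick = as_newick_sketch(toy_children, root=5, samples=[0, 1, 2])
# default_newick == "((n0,n1),n2);"

custom_newick = as_newick_sketch(
    toy_children, root=5, samples=[0, 1, 2], node_labels={0: "A", 1: "B", 2: "C"}
)
# custom_newick == "((A,B),C);"
```

Labels built from node IDs keep a direct link between the text output and the node table.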
This discussion was created from the release Python 0.4.0.