CHANGES.txt

Release 3.12.0
--------------

New Features
^^^^^^^^^^^^

    - New infrastructure for metadata annotations: AnnotationSet and Annotation.
    - [NeXML] Full support of NeXML 0.9 metadata parsing and writing.
    - `get_from_url()` and `read_from_url()` methods now allow for reading of phylogenetic data from URL's.
    - Added GBIF interoperability module ("``dendropy.interop.gbif``").
    - Added GenBank interoperability module ("``dendropy.interop.genbank``").

Bug Fixes
^^^^^^^^^

    - Splits hash calculations now correctly deal with unrooted incomplete leaf-set trees.

Changes
^^^^^^^

    - Annotations/metadata API significantly changed. Affects not only parsing
      of NeXML-formatted data by NEXUS/NEWICK as well (metadata no longer
      stored in dictionaries, by AnnotationSet collections of Annotation
      objects).
    - [NEXUS/NEWICK] ``suppress_internal_node_taxa`` keyword now actually
      instantiates Taxon objects from internal node labels when 'False', but
      defaults to `True` for consistency with legacy behavior.
    - [NEXUS/NEWICK] Faster parsing of tree files.
    - NCBI module ("``dendropy.interop.ncbi``") has been deprecated. Use "``dendropy.interop.genbank``" instead.


Release 3.11.0
--------------

New Features
^^^^^^^^^^^^

    - New application script for concatenation of branch labels from across
      multiple input trees: `sumlabels.py`.
    - New interoperability class `dendropy.interop.seqgen.SeqGen`:
      wrapper for Seq-Gen integrated into library.
    - New interoperability function `dendropy.interop.muscle.muscle_align()`:
      wrapper for MUSCLE alignment.
    - New interoperability class `dendropy.interop.raxml.RaxmlRunner`:
      wrapper for RAxML.
    - `prune_taxa()` method added to `CharacterMatrix`.
    - Math modules moved to their own subpackage: `dendropy.mathlib`.
    - New module for matrix and vector computations: `dendropy.mathlib.linearalg`.
    - New module for statistical distance calculation: `dendropy.mathlib.distance`.
    - Family of Mahalanobis distance calculation functions in
      `dendropy.mathlib.distance`: `squared_mahalanobis`,
      `squared_mahalanobis_1d`, `mahalanobis`, `mahalanobis_1d`.
    - Better documentation and examples of geodispersal analysis of Lieberman
        and Eldredge (see extras/geodispersal subdirectory).

Release 3.10.1
--------------

Bug Fixes
^^^^^^^^^

    - Split orientation regression bug introduced in 3.10.0 fixed (affected SumTrees and other split comparisons when dealing with unrooted trees).

Release 3.10.0
--------------

New Features
^^^^^^^^^^^^

    - Support for writing preamble blocks in NEXUS format.
    - Support for writing character subsets in NEXUS.
    - Support for user customization of edge label formatting.
    - `find_missing_splits()` method added to `dendropy.Tree()`.
    - `concatenate()`, `concatenate_from_paths()`, and
      `concatenate_from_streams()` methods added to CharacterMatrix and derived
      classes: allows for creation of new CharacterMatrix or subtype from
      multiple character matrices or file sources. Sub-components or alignments
      will be stored as character subsets in the concatenated matrices (and can
      be re-exported as individual matrices using `export_character_subset()`.

Bug Fixes
^^^^^^^^^

    - Log probability of coalescent frames/trees now calculated correctly [PLEASE REVISIT PREVIOUS ANALYSIS IF YOU USED THIS FEATURE!].
    - Strict PHYLIP format now correct.


Release 3.9.0
-------------

New Features
^^^^^^^^^^^^

    - Phylogenetic independent contrasts (PIC) analysis can now be carried out using the ``dendropy.continuous.PhylogeneticIndependentContrasts`` class.
    - Simplified contained coalescent (gene tree in species tree) simulation.

Changes
^^^^^^^

    - Keyword arguments to ``as_string()``, ``write_to_path()``, ``write_to_file``, etc. methods have been tweaked to become more consistent for NEXUS and Newick formats. Previous keywords are still supported, but will be deprecated. The new set of keyword arguments supported can be seen in the :ref:``NEXUS and Newick writing customization <Customizing_Writing_NEXUS_and_Newick>`` section.
    - NEXUS and Newick formats now default to case-insensitive taxon labels; specify ``case_insensitive_taxon_labels=False`` for case-sensitivity.

Bug Fixes
^^^^^^^^^

    - Reading interleaved character matrices no longer results in the following block being skipped (NEXUS).
    - Caught OverflowError when calculating summary statistics.

Release 3.8.1
-------------

New Features
^^^^^^^^^^^^

    - SumTrees now allows ``e unweighted`` / ``edges=unweighted`` to strip edge length information from output trees.
    - NexusReader now processes "ASSUMPTIONS" and "CODONS" blocks in addtion to "SETS" blocks for character set information.

Changes
^^^^^^^

    - SumTrees now will not write edge lengths if none of the input trees have length information for that edge (previously, SumTrees would write "0.0" for the edge).

Bug Fixes
^^^^^^^^^

    - NexusWriter (and hence SumTrees) does not write an extra-semicolon after each tree statement (which confuses some applications, e.g. FigTree).


Release 3.8.0
-------------

SumTrees
^^^^^^^^

    - SumTrees can now summarize edge lengths on target trees in different ways:
        - ``-e mean-length``: sets the edge lengths of the target/consensus tree(s) to the mean of the lengths of the corresponding edges of the input trees.
        - ``-e median-length``: sets the edge lengths of the target/consensus tree(s) to the median of the lengths of the corresponding edges of the input trees.
        - ``-e median-age``: adjusts the edge lengths of the target/consensus tree(s) such that the node ages correspond to the median age of corresponding nodes of the input trees [requires rooted ultrametric trees].
        - ``-e mean-age``: adjusts the edge lengths of the target/consensus tree(s) such that the node ages correspond to the mean age of corresponding nodes of the input trees [requires rooted ultrametric trees].
    - SumTrees will now annotate nodes with summaries (mean, median, standard deviation, range, 95% highest posterior density, 5% and 95% quantiles, etc.) of the distribution of ages across across input trees if the trees are indicated to be ultrametric with the "``--ultrametric``" flag [requires rooted ultrametric trees].
    - SumTrees will now annotate edges with summaries (mean, median, standard deviation, range, 95% highest posterior density, 5% and 95% quantiles, etc.) of the distribution of edge lengths across across input trees.
    - SumTrees will now be interpret and handle tree weights if the ``--weighted-trees`` option is specified.
    - SumTrees will now report topology frequencies/probabilities, if given the ``--trprobs`` option.
    - SumTrees can now handle arbitrarily-large numbers of files.
    - SumTrees creates a rooted consensus tree if input trees are specified to be rooted using the ``--rooted-trees`` flag (or rooted trees are assumed as with the '--ultrametric' flag).

DendroPy Library
^^^^^^^^^^^^^^^^

        - Tree objects can now be rerooted at midpoint (see ``Tree.reroot_at_midpoint()``).
        - Annotations (i.e., attributes of Tree, Node or Edge objects that have had "annotate()" called on them) can now be written as metadata comments ("[&field=value]") when writing NEXUS/NEWICK format if the keyword argument ``annotations_as_comments`` is used.
        - When reading in NEXUS/NEWICK format trees, specifying ``extract_comment_metadata=True`` will result in metadata comments being pulled into dictionary, with keys being fieldnames and values being the field values.
        - When reading NEXUS format data, SETS blocks will be processed, and character sets parsed into the relevant CharacterDataMatrix.
        - Character sets (as, for example, parsed out of NEXUS SETS blocks: see above) can be exported as new CharacterDataMatrix objects, and be saved/manipulated/etc. independentally.
        - When writing in NEXUS or NEWICK formats, the ``write_item_comments`` keyword argument (``True`` or ``False``) can control whether extended comments associated with nodes on trees will be written or not.
        - TopologyCounter class added to dendropy.treesum: allows for tracking of topology frequencies.
        - ``treesplits.tree_from_splits()`` allows constructing of (topology-only) trees from a set of splits.
        - Most functionality that used to be 'dendropy.treemanip' has now been migrated as native methods of the dendropy.Tree class. 'dendropy.treemanip' will be deprecated.
        - Trees can now be pruned based on a list of taxa labels to remove or keep (previously, the methods would only accept lists of Taxon objects).

Release 3.7.2
-------------

Bug Fixes
^^^^^^^^^

    - "1.0" threshold special-cased to correctly recover strict consensus tre "1.0" threshold special-cased to correctly recover strict consensus tree.

Release 3.7.1
--------------

New Features
^^^^^^^^^^^^

    - Implementation of the 'General Sampling Approach' (Hartman et al. 2010: Sampling Trees from Evolutionary Models; Syst. Biol. 49, 465-476) method of simulating trees from the birth-death model.

Changes
^^^^^^^

    - Correct/consistent names for some probability functions.

Bug Fixes
^^^^^^^^^

    - Bug in confirming overwriting of output file when using SumTrees '-e'/'--split-edges' option.
    - Ancient and grizzled semi-fossilized reference to 'taxa_block' corrected to 'taxon_set'.

Release 3.7.0
-------------

Migrated to BSD-style license.

Release 3.6.1
-------------

Bug Fixes
^^^^^^^^^

    - SumTrees now works (in serial mode) under older Python versions (i.e. < 2.6).
    - Fixes for compatibility with Python 2.4.x.

Release 3.6.0
-------------

New Features
^^^^^^^^^^^^

    - SumTrees now can can run in parallel mode, with each input source tree file handled in a separate process. The maximum number of parallel processes can be specified by the '-m' or '--multiprocessing' flag::

        $ sumtrees.py -m 4 mb.run1.t mb.run2.t mb.run3.t mb.run4.t > consensus.tre
        $ sumtrees.py -multiprocessing=4 mb.run1.t mb.run2.t mb.run3.t mb.run4.t > consensus.tre

    - Tree comments now parsed and stored with trees (NEXUS and NEWICK formats), allowing for processing of annotations such as tree weights and calculated likelihoods.

Release 3.5.0
-------------

New Features
^^^^^^^^^^^^

    - Added new module for interacting with NCBI databases: ``dendropy.interop.ncbi``. Sequences can be downloaded individually or by specifying ranges. In addition, labels suitable for use in phylogenetic analyses can be automatically composed for each sequence. For example::

        >>> from dendropy.interop import ncbi
        >>> entrez = ncbi.Entrez(generate_labels=True, sort_taxa_by_label=True)
        >>> data1 = entrez.fetch_nucelotide_accessions(['EU105474', 'EU105476'])
        >>> data1.write_to_path('seqs1.nex', 'nexus')
        >>> data2 = entrez.fetch_nucleotide_accession_range(105474, 106045, prefix="EU")
        >>> data2.write_to_path('seqs2.nex', 'nexus')

      Note that unlike Python's native "``range``" command, here the last element **is included** in the range (i.e., a range specified as "*a*, *b*" => [a,b]), and thus ``entrez.fetch_nucleotide_accession_range(105474, 106045, prefix="EU")`` will result in the full range of sequences from "EU105474-106045" being retrieved.

    - Added "``beast-summary-tree``" schema specification to process BEAST annotated consensus trees::

        >>> import dendropy
        >>> tree = dendropy.Tree.get_from_path('pythonidae.beast.tre', 'beast-summary-tree')

      Each node on the resulting tree will have the following attributes: "``height``", "``height_median``", "``height_95hpd``", "``height_range``", "``length``", "``length_median``", "``length_95hpd``", "``length_range``", "``posterior'. Scalar values will be of ``float`` type, while ranges (e.g., "``height_95hpd``", "``height_range``", "``length_95hpd``", "``length_range``") will be two-element lists of ``float`` types.

    - Added ``ladderize()`` method, to order nodes in ascending (default) or descending (``ladderize(right=True)``) order.

Release 3.4.1
-------------

Changes
^^^^^^^

    - Bundled ``ez_setup.py`` updated to latest version.

Release 3.4.0
-------------

Changes
^^^^^^^

    - ``age`` is now a permanent attribute of ``Node`` objects from initialization (previously, this attribute was optionally added when ``add_ages_to_nodes()`` was called).
    - ``Tree.add_ages_to_nodes()`` replaced by ``Tree.calc_node_ages()``: functioning is similar, but no longer offers the option for client code to select attribute name for the node age; this is now fixed to ``age`` (see above).
    - NeXML namespace has been updated to use modern NeXML instance docs.
    - new ``ContainingTree`` class to manage operations with embedded trees (species tree with gene trees...)

Release 3.3.0
-------------

Changes
^^^^^^^

    * An exception is now thrown when branches with edge weights of "None" are encountered when calculating weighted Robinson-Foulds or Euclidean distances, EXCEPT if the edges are root edges (i.e., edges subtending the root node).
    * Deep-coalescence counting methods now moved to a distinct module: ``dendropy.reconcile``:

        - ``dendropy.reconcile.reconciliation_discordance``
        - ``dendropy.reconcile.monophyletic_partition_discordance``

Bug Fixes
^^^^^^^^^

    * ``TaxonSetPartition`` iterator now iterates over partitions without crashing.
    * When a ``Tree`` object is independently written, its ``TaxonSet`` no longer gets re-created.
    * Corrected variable reference when inferring ``TaxonSet`` of a ``Tree`` object.
    * Correct list comprehension when composing ``Node`` and ``Edge`` sets of a ``Tree`` object.

Release 3.2.2
-------------

Bug Fixes
^^^^^^^^^

    * ``Node.level()`` now works correctly.

Release 3.2.1
-------------

New Features
^^^^^^^^^^^^

    *  Taxon objects can now be accessed from a TaxonSet directly by their label::

        >>> taxonset = TaxonSet(['a', 'b', 'c'])
        >>> assert taxonset[0] is taxonset['a'] # True

Bug Fixes
^^^^^^^^^

    * Fixed taxon-pruning clean-up resulting in all leaves being removed (!).

    * Fixed improper value returned for latest commit date when running under Windows.


Release 3.2.0
-------------

New Features
^^^^^^^^^^^^

    * DendroPy is now backwards-compatible with Python 2.4.
    * Native ASCII tree plotting is available as the "``as_ascii_plot()``" method of a ``Tree`` object.
    * Some tree distance functions integrated as native methods of ``Tree``: ``symmetric_difference``, ``false_positives_and_negatives``, ``robinson_foulds_distance``, ``euclidean_distance``.
    * ``Tree()`` objects can now be cloned with a custom ``TaxonSet`` object by passing it to the constructor via the ``taxon_set`` keyword argument.
    * Interoperability with the ``ETE <http://ete.cgenomics.org/>``_ library via the ``dendropy.interop.ete`` module.

Release 3.1.3
-------------

New Features
^^^^^^^^^^^^

    * New character matrix type added: ``NucleotideCharacterMatrix`` (corresponding to NEXUS '``datatype=nucleotide``').
    * Symbol '``X``' added to DNA, RNA and Nucleotide alphabets.

Release 3.1.2
-------------

New Features
^^^^^^^^^^^^

    * '``--unrooted``' option added to SumTrees, forcing trees to be treated as unrooted (option '``--rooted``' is also still available, forcing trees to be treated as rooted).

Release 3.1.1
-------------

Bug Fixes
^^^^^^^^^

    * SumTrees now handles '``--rooted``' option correctly; previously unrooted trees were treated as rooted, leading to incorrect split supports when structurally-equivalent trees of different rotations were used.


Release 3.1
-----------

New Features
^^^^^^^^^^^^

    * FASTA format reading now accepts '``fasta``' as a schema specification::

        >>> import dendropy
        >>> cytb = dendropy.DnaCharacterMatrix.get_from_path("data.fas", "fasta")

      but requires supplemental keyword argument, '``data_type``' to specify type of data ('``dna``', '``rna``', '``protein``', etc.) when using ``DataSet``::

        >>> import dendropy
        >>> cytb_ds = dendropy.DataSet.get_from_path("data.fas", "fasta", data_type="dna")

    * Implemented full-featured PHYLIP reader and enhanced PHYLIP writer, for both strict and relaxed modes (PHYLIP reader requires 'data_type' keyword argument when using ``DataSet``)::

        >>> import dendropy
        >>> cytb = dendropy.DnaCharacterMatrix.get_from_path("data.dat", "phylip")
        >>> cytb.write_to_path("data2.dat", "phylip", strict=True)
        >>> cytb_ds = dendropy.DataSet.get_from_path("data2.dat", "phylip", data_type="dna", strict=True)

    * Setup is now zipsafe by default.

Bug Fixes
^^^^^^^^^

    * Fixed major bug in SumTrees when mapping support to target tree (support value did not get written to node label).
    * Fixed bug in handling trailing whitespaces in interleaved NEXUS data.
    * Fixed issue with encoding of splits when root has degree one.

Release 3.0
-----------

    * Initial release of Dendropy 3.