Arbor summary

This is a summary of the Arbor parts of the hackathon, including links and descriptions of Arbor tools for analyses that are associated with your group.

Arbor on Amazon Cloud for the hackathon

We have created three Arbor instances in the cloud for this workshop.

All of the functions discussed below are available on these instances and in the arbor_analyses folder of this GitHub repository. Eventually, all functions will be included in Arbor collections.

Beetles

  1. Method to pull two study trees from Open Tree and compare them: The new Arbor method pull and compare opentree studies fetches two study trees from the Open Tree of Life based on their study IDs. The workflow then compares the topologies of the two trees using the function phylo.diff from the distory R package.

[Images: inputs, treecompare]
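To illustrate what comparing two topologies means, here is a minimal Python sketch. It is not the Arbor method and not phylo.diff: it does not contact Open Tree, it assumes the two trees are already available as rooted Newick strings, and it simply lists the clades (sets of tip labels) found in one tree but not the other. The function names clades_of and compare_topologies are hypothetical.

```python
import re

def clades_of(newick):
    """Return (tips, clades) for a rooted Newick string.

    Each clade is a frozenset of the tip labels below one internal node.
    Branch lengths and internal node labels are ignored.
    """
    tokens = re.findall(r"[(),]|[^(),]+", newick.strip().rstrip(";"))
    stack, clades, tips = [], set(), set()
    prev = ""
    for tok in tokens:
        if tok == "(":
            stack.append([])              # start collecting tips for a new clade
        elif tok == ")":
            group = stack.pop()
            clades.add(frozenset(group))
            if stack:
                stack[-1].extend(group)   # the parent clade contains these tips too
        elif tok != ",":
            name = tok.split(":")[0].strip()
            # Text directly after ")" is an internal label or branch length.
            if prev != ")" and name:
                tips.add(name)
                if stack:
                    stack[-1].append(name)
        prev = tok
    clades.discard(frozenset(tips))       # the root clade carries no information
    return tips, clades

def compare_topologies(newick_a, newick_b):
    """List the clades that are unique to each of two rooted trees."""
    tips_a, clades_a = clades_of(newick_a)
    tips_b, clades_b = clades_of(newick_b)
    return {
        "shared_tips": tips_a & tips_b,
        "only_in_a": clades_a - clades_b,
        "only_in_b": clades_b - clades_a,
    }

# Toy trees, not real study trees.
result = compare_topologies("((A,B),(C,D));", "((A,C),(B,D));")
print(len(result["only_in_a"]) + len(result["only_in_b"]))  # prints 4
```

The count printed at the end is a simple clade-based distance: zero means the two rooted trees have identical topologies.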

  2. Workflow to mash together beetle tree and data matrix: The data matrix is at the genus level while the tree is at the species level. The Arbor workflow beetleDataTreeSmash matches the two by extracting genus names from the tree tip labels and then dropping all duplicate genera from the tree. We then assign placeholder branch lengths and reconstruct ancestral character states on the tree. This workflow includes a few new, generally useful Arbor functions for data processing.

[Images: inputs, inputs, acebeetles]
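The genus-matching step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the beetleDataTreeSmash code: we assume tip labels look like "Genus_species", that the first tip seen for each genus is the one kept, and that the matrix is a dictionary keyed by genus. All function names and the example labels are hypothetical.

```python
def genus_of(tip_label):
    """Return the genus part of a label such as 'Genus_species' (assumed format)."""
    return tip_label.replace("_", " ").split()[0]

def one_tip_per_genus(tip_labels):
    """Map each genus to the first tip label that carries it; later duplicates are dropped."""
    kept = {}
    for label in tip_labels:
        kept.setdefault(genus_of(label), label)
    return kept

def match_tree_to_matrix(tip_labels, matrix):
    """Pair each kept tip with its genus-level matrix row, skipping genera absent from the matrix."""
    kept = one_tip_per_genus(tip_labels)
    return {genus: (tip, matrix[genus]) for genus, tip in kept.items() if genus in matrix}

# Placeholder names and values, not the beetle data.
tips = ["Aus_bus", "Aus_cus", "Xus_yus", "Zus_wus"]
matrix = {"Aus": [1, 0], "Xus": [0, 1]}
matched = match_tree_to_matrix(tips, matrix)
```

Here matched keeps one tip for Aus and one for Xus; Zus_wus is left out because its genus has no row in the matrix.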

  3. Workflow to make cophylogeny plot of beetles and hosts: The team provided us with a tree of beetle genera and a tree of their hosts. The Arbor workflow phytoTangleTree matches the two trees by dropping an underscore character and extracting genus names. We then remove any rows from the matrix that don't match both trees, and plot using the function cophylo from phytools.

[Images: cophyloflow, inputs, cophylo]
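As a rough sketch of the row-filtering step, the Python below keeps only the beetle-host pairs whose names appear in both trees. It is not the phytoTangleTree code: we assume the matrix is a list of (beetle, host) pairs, and the underscore handling in genus_key is our guess at the name cleanup. All names and labels are hypothetical.

```python
def genus_key(label):
    """Normalize a label by turning underscores into spaces and keeping the first word."""
    parts = label.replace("_", " ").split()
    return parts[0] if parts else ""

def filter_associations(pairs, beetle_tips, host_tips):
    """Keep only (beetle, host) pairs where both names match a tip in their tree."""
    beetle_genera = {genus_key(t) for t in beetle_tips}
    host_genera = {genus_key(t) for t in host_tips}
    return [
        (beetle, host)
        for beetle, host in pairs
        if genus_key(beetle) in beetle_genera and genus_key(host) in host_genera
    ]

# Placeholder names, not the beetle or host data.
beetle_tips = ["Aus_bus", "Cus_dus"]
host_tips = ["Hostus_one", "Plantus_two"]
pairs = [("Aus", "Hostus"), ("Cus", "Missingus"), ("Eus", "Plantus")]
kept = filter_associations(pairs, beetle_tips, host_tips)
```

Only the first pair survives: the second has a host that is not in the host tree, and the third has a beetle that is not in the beetle tree.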

Catfishes

  1. Process catfish data matrix and make heatmap view

The Phenoscape team provided Arbor with sample-sized data matrices and matching small sample trees of a catfish family, which were extracted from Phenoscape and Morphobank to perform initial integration. The Arbor team incorporated two new open-source rendering packages, jheatmap for heatmaps and InCHlib for combined heatmap and phylogeny views, to allow exploration of the large phenotype matrices exported from Phenoscape. During the workshop, we used prototype visualizations to explore large trait presence/absence matrices (1250+ taxa, 200+ characters) and a second, multi-valued character attribute table of equivalent size. Renderings are provided below that represent interactive browser views (including a demonstration on the iPad).

[Image: heatmap]

[Image: iPad version of heatmap]

The catfish tree was rolled up to be viewable at the family level using some traits selected from the full matrix:

[Image: heatmap]
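One way to picture a family-level roll-up of a presence/absence matrix is sketched below in Python. The aggregation rule is our assumption and may differ from what was used at the workshop: a trait counts as present in a family if any member taxon has it. The function name, the taxon-to-family lookup, and the example values are all hypothetical.

```python
def roll_up_to_family(matrix, taxon_to_family, traits):
    """Summarize a taxon-level presence/absence matrix at the family level.

    matrix maps taxon -> {trait: 0 or 1}. A trait is scored 1 for a family
    if at least one of its taxa has it (assumed rule).
    """
    rolled = {}
    for taxon, row in matrix.items():
        family = taxon_to_family.get(taxon)
        if family is None:
            continue                      # taxon without a family assignment
        summary = rolled.setdefault(family, {trait: 0 for trait in traits})
        for trait in traits:
            summary[trait] = max(summary[trait], row.get(trait, 0))
    return rolled

# Placeholder taxa, families, and traits, not the catfish data.
matrix = {
    "taxon_1": {"trait_a": 1, "trait_b": 0, "trait_c": 1},
    "taxon_2": {"trait_a": 0, "trait_b": 0, "trait_c": 1},
    "taxon_3": {"trait_a": 0, "trait_b": 1, "trait_c": 0},
}
taxon_to_family = {"taxon_1": "Family_X", "taxon_2": "Family_X", "taxon_3": "Family_Y"}
rolled = roll_up_to_family(matrix, taxon_to_family, ["trait_a", "trait_b"])
```

Passing only a subset of trait names, as in the last line, mirrors selecting a few traits out of the full matrix before viewing.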

  2. Link catfish character data to images

The catfish trait data is currently curated in Morphobank. Images are associated with each specimen reported in the trait matrix. As this data is exported from Morphobank into Arbor, an association table of tip names to image names is created. A method was added to Arbor that embeds image URLs into the tree tips of an “annotated tree”. Arbor allows additional information to be added at any node, so the trait data is added at the tips. Additional trait information at the tips is used by visualization algorithms, such as phyloMap and PhyloPen.
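The idea of embedding image URLs at the tips can be sketched as follows in Python. This is not the Arbor method: we assume the tree is a nested dictionary with "name" and "children" keys, that the association table is a dictionary from tip name to image file name, and that URLs are formed by appending the file name to a base URL. The function name, field names, and the example.org address are hypothetical.

```python
def attach_image_urls(node, tip_to_image, base_url):
    """Add an 'image_url' field to every tip that has an entry in the association table.

    The tree is modified in place. Returns the number of tips annotated.
    """
    children = node.get("children", [])
    if not children:                      # a node without children is a tip
        image = tip_to_image.get(node["name"])
        if image is None:
            return 0
        node["image_url"] = base_url + image
        return 1
    return sum(attach_image_urls(child, tip_to_image, base_url) for child in children)

# Placeholder tree and association table, not the catfish data.
tree = {
    "name": "root",
    "children": [
        {"name": "tip_1"},
        {"name": "clade", "children": [{"name": "tip_2"}, {"name": "tip_3"}]},
    ],
}
tip_to_image = {"tip_1": "img_001.jpg", "tip_3": "img_003.jpg"}
annotated = attach_image_urls(tree, tip_to_image, "https://example.org/images/")
```

Tips with no entry in the table, such as tip_2 here, are left unannotated, so a missing image does not interrupt the traversal.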

  3. Display catfish tree and data in PhyloPen

PhyloPen, an interactive tree, matrix, and image renderer, was tested for the first time on the catfish data this weekend. PhyloPen is a native Java application that is a client of Arbor. We used Arbor to preprocess the tree and matrix so the user can browse the values of traits on the tree. This project was curated using Morphobank, and we are still in the process of establishing integration between Morphobank and Arbor for attaching images to the correct tree tips. Other data exports are working. Below is a screenshot of PhyloPen browsing the catfish tree, matrix, and associated tip image:

Barnacles

  1. Match tree and tip data, then reconstruct ancestral character states: The workflow aceBarnacles makes the barnacle tree ultrametric using PATHd8, then reconstructs ancestral character states on this ultrametric tree.