The applied topology visualiation (ApTViz) package contains functions designed to create data graphics from common TDA analyses and structures. Visualization functions rely on some combination of Plotly, JavaScript, Flask, and D3. Many of the colors used in these functions are from Fabio Crameri's colormaps [1].
Contact me for questions and comments.
These functions expect filtered simplicial complex or barcode data in a specific organizgion:
fsc_df
is a pandas data frame with one row per simplex and the following columns:
Required
-
cell_id
: An integer used to reference the simplex. This number should be unique to the simplex. -
dim
: An integer k denoting the dimension of the$k$ -simplex. -
nodes
: A list of integers denoting the nodes involved in the simplex. -
faces
: A list of cell ids indicating the faces of the simplex.
Semi-required
weight
: Float designating simplex weight. See Assumptions section for more information. This column can be ignored if only using the rank of simplices. See function flags for rank vs. weight.rank
: An integer indicating the simplex rank. See [2, 3] for more details. This column can be ignored if only using the weight of simplices. See function flags for rank vs. weight.
Optional
- <indicator_col>: Indicator column with entries = 0, 1. Examples include
is_maximal
, in which maximal simplices are marked with 1, andin_subcomplex
, in which simplices involved in a particular subcomplex are marked with 1.
bar_df
is a pandas data frame with one row per bar and the following columns:
Required
-
bar_id
: An integer used to reference the bar (persistent cycle). This number should be unique to the bar. -
bar_dim
: An integer k denoting the dimension of the persistent$k$ -cycle. -
bar_birth
: Filtration value at which bar is born. -
bar_death
: Filtration value at which bar dies.
Optional
rep
: A list ofcell_id
s corresponding to simplices in the associated representative. This column is only used in a subset of plotting functions.- <indicator_col>: Indicator column with entries = 0,1. Used to highlight bars with a certain property. Examples include
contains_node_of_interest
. - <continuous_prop>: Column containing a numeric property of each bar. Can be used to provide a color for bars instead
bar_dim
.
These functions are particular about data organization (see previous section), but they also make a few assumptions about the data.
-
Simplex weights are positive. Most of the code should work regardless, but at this time the functions have not been designed to handle negative weights, specifically.
-
Simplex weights are ranked highest to lowest. Following common edge-weighting schemes in the neuroscience and biology, we assume that the highest-weighted simplices are the strongest in the complex.
-
Simplex weights are unique. Most functions should work regardless, but currently the code is not designed to handle non-unique weights. This will be removed in a later version.
-
Simplex ranks are unique. Most functions should work regardless, but currently the code is not designed to handle non-unique ranks. This will be removed in a later version.
-
One representative per bar. Functions are prepared for exactly one representative per bar. If multiple representatives are needed, for example if we wanted to use all minimal generators, consider adding all of those simplex ids within the one representative list in the
rep
column.
Before running analyses, it can be helpful to gain a better understanding of the (filtered) simplicial complex itself. The exampples_filtered_simplicial_complex.ipynb
notebook illustrates the following functions:
fsc_histogram_by_dim(fsc_df, prop = "weight")
fsc_histogram_thresholded(fsc_df, prop = "dim", filter_on = "weight", threshold = 0, keep = "geq")
fsc_attribute_across_filtration(fsc_df, filtration_steps, filtration_col="weight")
If there is a particularly interesting set of nodes or simplices (for example, those involving default mode brain regions), see the examples_subcomplexes.ipynb
notebook for functions that compare properties of a subcomplex to the whole. Note requires an indicator column. This notebook demonstrates the following functions:
fsc_violin_compare_by_dim(fsc_df, indicator_col, prop = "weight")
- Faceted histogram using px.histogram.
The examples_ph_output.ipynb
notebook contains the following functions used to create persistence diagrams or barcode plots:
plot_pd(bar_df, axis_range)
plot_pd_faceted(bar_df, axis_range, col_wrap = 3)
plot_barcode(bar_df, axis_range)
plot_barcode_highlighted(bar_df, axis_range, indicator_col, shaded=False)
plot_barcode_continuous_highlight(bar_df, axis_range, continuous_prop)
All of the above functions return a plotly figure object that can be further modified.
The goal of this project is to reach beyond basic persistent homology charts into other areas of TDA, as well as creating novel methods for visualizing topological objects and TDA outputs. If you have ideas for the package, please reach out!
[1] Crameri, Fabio, Grace E. Shephard, and Philip J. Heron. "The misuse of colour in science communication." Nature communications 11.1 (2020): 1-10.
[2] Giusti, Chad, et al. "Clique topology reveals intrinsic geometric structure in neural correlations." Proceedings of the National Academy of Sciences 112.44 (2015): 13455-13460.
[3] Petri, Giovanni, et al. "Topological strata of weighted complex networks." PloS one 8.6 (2013): e66506.