From d3fb085d607a4a14b79a0a4dc15d2d341ae44f68 Mon Sep 17 00:00:00 2001 From: cmalinmayor Date: Fri, 9 Aug 2024 20:55:09 +0000 Subject: [PATCH] Commit from GitHub Actions (Build Notebooks) --- exercise.ipynb | 451 +++++++++++++++++++++++-------------------------- solution.ipynb | 445 +++++++++++++++++++++--------------------------- 2 files changed, 400 insertions(+), 496 deletions(-) diff --git a/exercise.ipynb b/exercise.ipynb index 634c0cc..b32d6b3 100644 --- a/exercise.ipynb +++ b/exercise.ipynb @@ -2,7 +2,7 @@ "cells": [ { "cell_type": "markdown", - "id": "27afaf05", + "id": "64f6e52c", "metadata": {}, "source": [ "# Exercise 9: Tracking-by-detection with an integer linear program (ILP)\n", @@ -45,7 +45,7 @@ }, { "cell_type": "markdown", - "id": "ce2ad255", + "id": "20d72889", "metadata": {}, "source": [ "## Import packages" @@ -54,7 +54,7 @@ { "cell_type": "code", "execution_count": null, - "id": "6bff5d24", + "id": "0fe8999b", "metadata": {}, "outputs": [], "source": [ @@ -67,7 +67,7 @@ { "cell_type": "code", "execution_count": null, - "id": "8469a37a", + "id": "70b45da1", "metadata": {}, "outputs": [], "source": [ @@ -86,6 +86,7 @@ "import zarr\n", "from motile_toolbox.visualization import to_napari_tracks_layer\n", "from motile_toolbox.candidate_graph import graph_to_nx\n", + "from motile_toolbox.visualization.napari_utils import assign_tracklet_ids\n", "import motile_plugin.widgets as plugin_widgets\n", "from motile_plugin.backend.motile_run import MotileRun\n", "from napari.layers import Tracks\n", @@ -102,7 +103,7 @@ }, { "cell_type": "markdown", - "id": "f4ee8217", + "id": "de6c9cf9", "metadata": {}, "source": [ "## Load the dataset and inspect it in napari" @@ -110,7 +111,7 @@ }, { "cell_type": "markdown", - "id": "869845de", + "id": "72be02eb", "metadata": {}, "source": [ "For this exercise we will be working with a fluorescence microscopy time-lapse of breast cancer cells with stained nuclei (SiR-DNA). 
It is similar to the dataset at https://zenodo.org/record/4034976#.YwZRCJPP1qt. The raw data, pre-computed segmentations, and detection probabilities are saved in a zarr, and the ground truth tracks are saved in a csv. The segmentation was generated with a pre-trained StarDist model, so there may be some segmentation errors which can affect the tracking process. The detection probabilities also come from StarDist, and are downsampled in x and y by 2 compared to the detections and raw data." @@ -118,7 +119,7 @@ }, { "cell_type": "markdown", - "id": "5aa6c023", + "id": "c37abd41", "metadata": {}, "source": [ "Here we load the raw image data, segmentation, and probabilities from the zarr, and view them in napari." ] }, { "cell_type": "code", "execution_count": null, - "id": "646169b5", + "id": "5f15cf64", "metadata": {}, "outputs": [], "source": [ @@ -140,7 +141,7 @@ }, { "cell_type": "markdown", - "id": "a2cd889b", + "id": "d9c9eff2", "metadata": {}, "source": [ "Let's use [napari](https://napari.org/tutorials/fundamentals/getting_started.html) to visualize the data. Napari is a wonderful viewer for imaging data that you can interact with in python, even directly out of jupyter notebooks. If you've never used napari, you might want to take a few minutes to go through [this tutorial](https://napari.org/stable/tutorials/fundamentals/viewer.html)."
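The factor-of-2 downsampling mentioned above matters whenever a coordinate measured on the raw data or segmentation is used to index the probability map. A minimal sketch, with plain Python lists standing in for arrays (the helper name and the toy volume are ours, not from the notebook):

```python
# Hypothetical helper: look up the detection probability for a centroid found
# in the full-resolution segmentation. The probability map is 2x smaller in
# y and x, so we integer-divide the spatial coordinates before indexing.
def prob_at_centroid(probabilities, t, y, x, scale=2):
    """Index a (t, y, x) probability volume that is `scale`x smaller in y and x."""
    return probabilities[t][int(y) // scale][int(x) // scale]

# Tiny fake 2-frame probability volume (each frame downsampled to 2x2)
probs = [
    [[0.1, 0.2], [0.3, 0.4]],  # t=0
    [[0.5, 0.6], [0.7, 0.8]],  # t=1
]
print(prob_at_centroid(probs, 0, 3, 1))  # full-res (y=3, x=1) -> probs[0][1][0] = 0.3
```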
@@ -149,7 +150,7 @@ { "cell_type": "code", "execution_count": null, - "id": "494a92b0", + "id": "79051d16", "metadata": {}, "outputs": [], "source": [ @@ -161,7 +162,7 @@ }, { "cell_type": "markdown", - "id": "10fa981d", + "id": "b2303fed", "metadata": {}, "source": [ "## Read in the ground truth graph\n", @@ -179,7 +180,7 @@ }, { "cell_type": "markdown", - "id": "59e5645b", + "id": "ed5a1655", "metadata": {}, "source": [ "\n", @@ -203,7 +204,7 @@ { "cell_type": "code", "execution_count": null, - "id": "5354470e", + "id": "b6f9ee99", "metadata": { "tags": [ "task" @@ -222,7 +223,7 @@ { "cell_type": "code", "execution_count": null, - "id": "5397dc1f", + "id": "ef45818a", "metadata": {}, "outputs": [], "source": [ @@ -242,28 +243,44 @@ }, { "cell_type": "markdown", - "id": "2d9fed23", + "id": "22a1f33c", "metadata": {}, "source": [ - "We can also use the helper function `to_napari_tracks_layer` to visualize the ground truth tracks in our napari viewer." + "Here we set up a napari widget for visualizing the tracking results. This is part of the motile napari plugin, not part of core napari.\n", + "If you get a napari error that the viewer window is closed, please re-run the previous visualization cell to re-open the viewer window." ] }, { "cell_type": "code", "execution_count": null, - "id": "5d3251d9", + "id": "6779781f", "metadata": {}, "outputs": [], "source": [ - "\n", "widget = plugin_widgets.TreeWidget(viewer)\n", "viewer.window.add_dock_widget(widget, name=\"Lineage View\", area=\"bottom\")" ] }, + { + "cell_type": "markdown", + "id": "b7cda55e", + "metadata": {}, + "source": [ + "Here we add a \"MotileRun\" to the napari tracking visualization widget (the \"view_controller\"). A MotileRun includes a name, a set of tracks, and a segmentation. 
The tracking visualization widget will add:\n", + "- a Points layer with the points in the tracks\n", + "- a Tracks layer to display the track history as a \"tail\" behind the point in the current time frame\n", + "- a Labels layer, if a segmentation was provided\n", + "- a Lineage View widget, which displays an abstract graph representation of all the solution tracks\n", + "\n", + "These views are synchronized such that every element is colored by the track ID of the element. Clicking on a node in the Lineage View will navigate to that cell in the data, and vice versa.\n", + "\n", + "Hint - if your screen is too small, you can \"pop out\" the lineage tree view into a separate window using the icon that looks like two boxes in the top left of the lineage tree view. You can also close the tree view with the x just above it, and open it again from the menu bar: Plugins -> Motile -> Lineage View (then re-run the below cell to add the data to the lineage view)." + ] + }, { "cell_type": "code", "execution_count": null, - "id": "0c5a17b1", + "id": "934c2cc9", "metadata": {}, "outputs": [], "source": [ @@ -277,7 +294,7 @@ }, { "cell_type": "markdown", - "id": "a1064a48", + "id": "a04d4215", "metadata": { "lines_to_next_cell": 2 }, @@ -294,7 +311,7 @@ }, { "cell_type": "markdown", - "id": "01c2aa7c", + "id": "a288a7e4", "metadata": {}, "source": [ "

Task 2: Extract candidate nodes from the predicted segmentations

\n", @@ -316,7 +333,7 @@ { "cell_type": "code", "execution_count": null, - "id": "c771392a", + "id": "b3387cb6", "metadata": { "tags": [ "task" @@ -350,7 +367,7 @@ { "cell_type": "code", "execution_count": null, - "id": "d9deb6ef", + "id": "eebf6235", "metadata": {}, "outputs": [], "source": [ @@ -370,7 +387,7 @@ }, { "cell_type": "markdown", - "id": "e6e74e1e", + "id": "f48d9da7", "metadata": {}, "source": [ "We can visualize our candidate points using the napari Points layer. You should see one point in the center of each segmentation when we display it using the below cell." @@ -379,7 +396,7 @@ { "cell_type": "code", "execution_count": null, - "id": "47504dad", + "id": "51739880", "metadata": {}, "outputs": [], "source": [ @@ -390,7 +407,7 @@ }, { "cell_type": "markdown", - "id": "af4431a4", + "id": "c49a77b3", "metadata": {}, "source": [ "### Adding Candidate Edges\n", @@ -403,7 +420,7 @@ { "cell_type": "code", "execution_count": null, - "id": "d10eccde", + "id": "a085e685", "metadata": {}, "outputs": [], "source": [ @@ -472,7 +489,7 @@ }, { "cell_type": "markdown", - "id": "9391e4a1", + "id": "253b78fe", "metadata": {}, "source": [ "Visualizing the candidate edges in napari is, unfortunately, not yet possible. However, we can print out the number of candidate nodes and edges, and compare it to the ground truth nodes and edgesedges. We should see that we have a few more candidate nodes than ground truth (due to false positive detections) and many more candidate edges than ground truth - our next step will be to use optimization to pick a subset of the candidate nodes and edges to generate our solution tracks." 
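The adjacent-frame edge construction can be sketched with the standard library alone. This is an illustrative stand-in, assuming nodes stored as `{id: (t, y, x)}` and a plain distance cutoff; the notebook builds the same structure on a networkx graph, where a KD-tree lookup scales better:

```python
import math

# Connect every detection in frame t to every detection in frame t+1 that lies
# within max_edge_distance. Node layout and the cutoff are assumptions for
# illustration, not the notebook's exact helper.
def candidate_edges(nodes, max_edge_distance):
    by_time = {}
    for node_id, (t, y, x) in nodes.items():
        by_time.setdefault(t, []).append((node_id, (y, x)))
    edges = []
    for t in sorted(by_time):
        for u, pu in by_time[t]:
            for v, pv in by_time.get(t + 1, []):
                if math.dist(pu, pv) <= max_edge_distance:
                    edges.append((u, v))
    return edges

nodes = {1: (0, 0.0, 0.0), 2: (0, 50.0, 50.0), 3: (1, 3.0, 4.0), 4: (1, 52.0, 50.0)}
print(candidate_edges(nodes, max_edge_distance=10))  # [(1, 3), (2, 4)]
```

Note how a generous cutoff produces many more candidate edges than true edges; the optimization step below is what prunes them.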
@@ -481,7 +498,7 @@ { "cell_type": "code", "execution_count": null, - "id": "1588427e", + "id": "61c29895", "metadata": {}, "outputs": [], "source": [ @@ -491,7 +508,7 @@ }, { "cell_type": "markdown", - "id": "e4000f51", + "id": "8da5f320", "metadata": {}, "source": [ "## Checkpoint 1\n", @@ -503,7 +520,7 @@ }, { "cell_type": "markdown", - "id": "d25b3d5b", + "id": "d559c9a1", "metadata": {}, "source": [ "## Setting Up the Tracking Optimization Problem" @@ -511,7 +528,7 @@ }, { "cell_type": "markdown", - "id": "fa12f2ff", + "id": "0101df33", "metadata": {}, "source": [ "As hinted earlier, our goal is to prune the candidate graph. More formally we want to find a graph $\\tilde{G}=(\\tilde{V}, \\tilde{E})$ whose vertices $\\tilde{V}$ are a subset of the candidate graph vertices $V$ and whose edges $\\tilde{E}$ are a subset of the candidate graph edges $E$.\n", @@ -528,7 +545,7 @@ }, { "cell_type": "markdown", - "id": "840742c0", + "id": "6b7907ba", "metadata": {}, "source": [ "## Task 3 - Basic tracking with motile\n", @@ -553,7 +570,7 @@ { "cell_type": "code", "execution_count": null, - "id": "195df8ce", + "id": "8ccea935", "metadata": { "tags": [ "task" @@ -581,7 +598,7 @@ }, { "cell_type": "markdown", - "id": "948f93f9", + "id": "ea4810b3", "metadata": {}, "source": [ "Here is a utility function to gauge some statistics of a solution." 
@@ -590,7 +607,7 @@ { "cell_type": "code", "execution_count": null, - "id": "b56ae1ff", + "id": "fbb338f9", "metadata": {}, "outputs": [], "source": [ @@ -600,7 +617,7 @@ }, { "cell_type": "markdown", - "id": "0db4497b", + "id": "0e1948c5", "metadata": {}, "source": [ "Here we actually run the optimization, and compare the found solution to the ground truth.\n", @@ -616,7 +633,7 @@ { "cell_type": "code", "execution_count": null, - "id": "7dad6f81", + "id": "7d6d6323", "metadata": {}, "outputs": [], "source": [ @@ -631,7 +648,7 @@ }, { "cell_type": "markdown", - "id": "b101edc7", + "id": "9ce61acd", "metadata": {}, "source": [ "If you haven't selected any nodes or edges in your solution, try adjusting your weight and/or constant values. Make sure you have some negative costs or selecting nothing will always be the best solution!" @@ -639,7 +656,7 @@ }, { "cell_type": "markdown", - "id": "680c9f61", + "id": "3416d3d1", "metadata": {}, "source": [ "
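The need for negative costs can be seen with a toy calculation. Motile's selection costs are linear (weight * attribute + constant, per the motile docs); the distances and constants below are invented:

```python
# Motile-style linear cost: each selected edge pays weight * attribute + constant.
def edge_cost(attribute_value, weight, constant):
    return weight * attribute_value + constant

# With a positive weight on distance and no negative constant, every edge has
# positive cost, so the cheapest solution is to select nothing (total cost 0).
distances = [5.0, 12.0, 30.0]
print([edge_cost(d, weight=1.0, constant=0.0) for d in distances])   # [5.0, 12.0, 30.0]

# A negative constant acts as a reward for continuing tracks: short edges now
# have negative cost and get selected, while long ones stay positive.
print([edge_cost(d, weight=1.0, constant=-20.0) for d in distances])  # [-15.0, -8.0, 10.0]
```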

Question 1: Interpret your results based on statistics

\n", @@ -651,15 +668,25 @@ }, { "cell_type": "markdown", - "id": "f7b794d2", + "id": "8f1c5422", + "metadata": {}, + "source": [ + "

Checkpoint 2

\n", + "We will discuss the exercise up to this point as a group shortly. If you reach this checkpoint early, you can go on to Checkpoint 3.\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "0ece0a24", "metadata": {}, "source": [ "## Visualize the Result\n", "Rather than just looking at printed statistics about our solution, let's visualize it in `napari`.\n", "\n", - "This involves two steps:\n", - "1. First, we can add a tracks layer with the solution graph.\n", - "2. Second, we can add another segmentation layer, where the segmentations are relabeled so that the same cell will be the same color over time. Cells will still change color at division.\n", + "Before we can create our MotileRun, we need to create an output segmentation from our solution. Our output segmentation differs from our input segmentation in two main ways:\n", + "1. Not all candidate nodes will be selected in our solution graph. We need to filter the masks corresponding to the un-selected candidate detections out of the output segmentation. \n", + "2. Segmentations will be relabeled so that the same cell will be the same label (and thus color) over time. Cells will still change label/color at division.\n", "\n", "Note that bad tracking results at this point does not mean that you implemented anything wrong! We still need to customize our costs and constraints to the task before we can get good results. As long as your pipeline selects something, and you can kind of interepret why it is going wrong, that is all that is needed at this point." ] @@ -667,32 +694,24 @@ { "cell_type": "code", "execution_count": null, - "id": "34701279", + "id": "38c587de", "metadata": {}, "outputs": [], "source": [ - "# recolor the segmentation\n", - "\n", - "from motile_toolbox.visualization.napari_utils import assign_tracklet_ids\n", "def relabel_segmentation(\n", " solution_nx_graph: nx.DiGraph,\n", " segmentation: np.ndarray,\n", ") -> np.ndarray:\n", - " \"\"\"Relabel a segmentation based on tracking results so that nodes in same\n", - " track share the same id. 
IDs do change at division.\n", + " \"\"\"Relabel a segmentation based on tracking results to get the output segmentation.\n", "\n", " Args:\n", " solution_nx_graph (nx.DiGraph): Networkx graph with the solution to use\n", - " for relabeling. Nodes not in graph will be removed from seg. Original\n", - " segmentation ids and hypothesis ids have to be stored in the graph so we\n", - " can map them back.\n", - " segmentation (np.ndarray): Original (potentially multi-hypothesis)\n", - " segmentation with dimensions (t,h,[z],y,x), where h is 1 for single\n", - " input segmentation.\n", + " for relabeling. Nodes not in graph will be removed from seg.\n", + " segmentation (np.ndarray): Original segmentation with dimensions (t,y,x)\n", "\n", " Returns:\n", " np.ndarray: Relabeled segmentation array where nodes in same track share same\n", - " id with shape (t,1,[z],y,x)\n", + " id with shape (t,y,x)\n", " \"\"\"\n", " assign_tracklet_ids(solution_nx_graph)\n", " tracked_masks = np.zeros_like(segmentation)\n", @@ -712,12 +731,12 @@ { "cell_type": "code", "execution_count": null, - "id": "bbf09ab4", + "id": "77a642e2", "metadata": {}, "outputs": [], "source": [ "basic_run = MotileRun(\n", - " run_name=\"basic_solution_test\",\n", + " run_name=\"basic_solution\",\n", " tracks=solution_graph,\n", " output_segmentation=np.expand_dims(solution_seg, axis=1) # need to add a dummy dimension to fit API\n", ")\n", @@ -727,7 +746,7 @@ }, { "cell_type": "markdown", - "id": "ac106e5b", + "id": "dc80ada1", "metadata": {}, "source": [ "
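The relabeling idea can be illustrated with a toy, array-free version: every node in the solution carries a track id, and the output masks reuse that id so one cell keeps one label over time. The data structures here ({node_id: (t, track_id)} and nested lists as label images) are invented stand-ins for the networkx graph and numpy arrays used above:

```python
# Rewrite each frame's mask so that a node's pixels take its track id.
def relabel_frames(frames, nodes):
    """frames: list of 2D label grids; nodes: {node_id: (t, track_id)}."""
    out = [[[0] * len(row) for row in frame] for frame in frames]
    for node_id, (t, track_id) in nodes.items():
        for y, row in enumerate(frames[t]):
            for x, old in enumerate(row):
                if old == node_id:
                    out[t][y][x] = track_id
    return out

frames = [[[0, 5], [5, 0]], [[0, 9], [9, 0]]]  # node 5 at t=0, node 9 at t=1
nodes = {5: (0, 1), 9: (1, 1)}                 # both belong to track 1
out = relabel_frames(frames, nodes)
print(out[0], out[1])  # the cell keeps label 1 in both frames
```

Unselected nodes simply never appear in `nodes`, so their masks are dropped from the output, matching point 1 above.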

Question 2: Interpret your results based on visualization

\n", @@ -739,7 +758,7 @@ }, { "cell_type": "markdown", - "id": "a3fa790b", + "id": "73ead5d8", "metadata": { "lines_to_next_cell": 2 }, @@ -750,19 +769,26 @@ "\n", "Additionally, we would also like to quantify this. We will use the package [`traccuracy`](https://traccuracy.readthedocs.io/en/latest/) to calculate some [standard metrics for cell tracking](http://celltrackingchallenge.net/evaluation-methodology/). For example, a high-level indicator for tracking performance is called TRA.\n", "\n", - "If you're interested in more detailed metrics, you can look at the false positive (FP) and false negative (FN) nodes, edges and division events." + "If you're interested in more detailed metrics, you can look at the false positive (FP) and false negative (FN) nodes, edges and division events.\n", + "\n", + "TODO: explain this better. Pick a few important metrics (TRA, FP Nodes/edges/divs, FN nodes/edges/divs, TP divs) and explain them. Also, make a table of those specific results that we can add \"runs\" to so we can compare results across runs without scrolling up and down." + ] + }, + { + "cell_type": "markdown", + "id": "c74010a3", + "metadata": {}, + "source": [ + "The metrics we want to compute require a ground truth segmentation. Since we do not have a ground truth segmentation, we can make one by drawing a circle around each ground truth detection. While not perfect, it will be good enough to match ground truth to predicted detections in order to compute metrics." 
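The circle-drawing step can be sketched without skimage: a pixel belongs to the disk when its squared distance to the center is at most the radius squared. The grid size and label are invented:

```python
# Stdlib stand-in for skimage.draw.disk: mark every pixel within `radius`
# of (cy, cx) with the given label.
def draw_disk(grid, cy, cx, radius, label):
    for y in range(len(grid)):
        for x in range(len(grid[0])):
            if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                grid[y][x] = label

frame = [[0] * 5 for _ in range(5)]
draw_disk(frame, cy=2, cx=2, radius=1, label=7)
print(frame[2])  # [0, 7, 7, 7, 0]
```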
] }, { "cell_type": "code", "execution_count": null, - "id": "006cb964", - "metadata": { - "lines_to_next_cell": 2 - }, + "id": "06297df6", + "metadata": {}, "outputs": [], "source": [ - "\n", "from skimage.draw import disk\n", "def make_gt_detections(data_shape, gt_tracks, radius):\n", " segmentation = np.zeros(data_shape, dtype=\"uint32\")\n", @@ -776,14 +802,13 @@ " segmentation[time][rr, cc] = node\n", " return segmentation\n", "\n", - "gt_dets = make_gt_detections(data_root[\"raw\"].shape, gt_tracks, 10)\n", - "# viewer.add_image(gt_dets)" + "gt_dets = make_gt_detections(data_root[\"raw\"].shape, gt_tracks, 10)" ] }, { "cell_type": "code", "execution_count": null, - "id": "2f6b791d", + "id": "4fb4bcb7", "metadata": {}, "outputs": [], "source": [ @@ -829,7 +854,7 @@ { "cell_type": "code", "execution_count": null, - "id": "0cd8fcda", + "id": "b49a7e73", "metadata": {}, "outputs": [], "source": [ @@ -838,7 +863,7 @@ }, { "cell_type": "markdown", - "id": "eb4a8e7b", + "id": "531c40f9", "metadata": {}, "source": [ "

Question 3: Interpret your results based on metrics

\n", @@ -850,19 +875,17 @@ }, { "cell_type": "markdown", - "id": "698f9149", + "id": "94dca515", "metadata": {}, "source": [ - "

Checkpoint 2

\n", - "We have set up and run a basic ILP to get tracks and visualized and evaluated the output. \n", - "\n", - "We will go over the code and discuss the answers to Questions 1, 2, and 3 together soon. If you have extra time, think about what kinds of improvements you could make to the costs and constraints to fix the issues that you are seeing. You can even try tuning your weights and constants, or adding or removing motile Costs and Constraints, and seeing how that changes the output.\n", + "

Checkpoint 3

\n", + "If you reach this checkpoint with extra time, think about what kinds of improvements you could make to the costs and constraints to fix the issues that you are seeing. You can try tuning your weights and constants, or adding or removing motile Costs and Constraints, and seeing how that changes the output. See how good you can make the results!\n", "
" ] }, { "cell_type": "markdown", - "id": "57ea6152", + "id": "9907c404", "metadata": {}, "source": [ "## Customizing the Tracking Task\n", @@ -872,137 +895,15 @@ "2. Change the structure of the candidate graph\n", "3. Add a new type of cost or constraint\n", "\n", - "The first way is the most common, and is quite flexible, so we will focus on two examples of this type of customization." + "The first way is the most common, and is quite flexible, so we will focus on an example of this type of customization." ] }, { "cell_type": "markdown", - "id": "ee0471d1", - "metadata": {}, - "source": [ - "## Task 4 - Add an appear cost, but not at the boundary\n", - "The Appear cost penalizes starting a new track, encouraging continuous tracks. However, you do not want to penalize tracks that appear in the first frame. In our case, we probably also do not want to penalize appearing at the \"bottom\" of the dataset. The built in Appear cost ([docs here](https://funkelab.github.io/motile/api.html#motile.costs.Appear_)) has an `ignore_attribute` argument, where if the node has that attribute and it evaluates to True, the Appear cost will not be paid for that node." - ] - }, - { - "cell_type": "markdown", - "id": "08c133dc", + "id": "2efc8ee1", "metadata": {}, "source": [ - "\n", - "

Task 4a: Add an ignore_appear attribute to the candidate graph

\n", - "

\n", - "For each node on our candidate graph, you should add an attribute `ignore_appear` that evaluates to True if the appear cost should NOT be paid for that node. For nodes that should pay the appear cost, you can either set the attribute to False, or not add the attribute. Nodes should NOT pay the appear cost if they are in the first time frame, or if they are within a certain distance to the \"bottom\" of the dataset. (You will need to determine the coordinate range that you consider the \"bottom\" of the dataset, perhaps by hovering over the napari image layer and seeing what the coordinate values are).\n", - "

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "79c01b01", - "metadata": { - "tags": [ - "task" - ] - }, - "outputs": [], - "source": [ - "def add_appear_ignore_attr(cand_graph):\n", - " for node in cand_graph.nodes():\n", - " ### YOUR CODE HERE ###\n", - " pass # delete this\n", - "\n", - "add_appear_ignore_attr(cand_graph)" - ] - }, - { - "cell_type": "markdown", - "id": "09df8fe1", - "metadata": {}, - "source": [ - "

Task 4b: Use the `ignore_appear` attribute in your solver pipeline

\n", - "

Copy your solver pipeline from above, and then adapt the Appear cost to use our new `ignore_appear` attribute. You may also want to adapt the Appear constant value. Then re-solve to see if performance improves.\n", - "

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "82873eaa", - "metadata": { - "tags": [ - "task" - ] - }, - "outputs": [], - "source": [ - "def solve_appear_optimization(cand_graph):\n", - " \"\"\"Set up and solve the network flow problem.\n", - "\n", - " Args:\n", - " graph (nx.DiGraph): The candidate graph.\n", - "\n", - " Returns:\n", - " nx.DiGraph: The networkx digraph with the selected solution tracks\n", - " \"\"\"\n", - "\n", - " cand_trackgraph = motile.TrackGraph(cand_graph, frame_attribute=\"t\")\n", - " solver = motile.Solver(cand_trackgraph)\n", - "\n", - " ### YOUR CODE HERE ###\n", - "\n", - " solver.solve()\n", - " solution_graph = graph_to_nx(solver.get_selected_subgraph())\n", - " return solution_graph" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c9d8bcf7", - "metadata": {}, - "outputs": [], - "source": [ - "solution_graph = solve_appear_optimization(cand_graph)\n", - "solution_seg = relabel_segmentation(solution_graph, segmentation)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bf3b499d", - "metadata": { - "lines_to_next_cell": 2 - }, - "outputs": [], - "source": [ - "appear_run = MotileRun(\n", - " run_name=\"appear_solution\",\n", - " tracks=solution_graph,\n", - " output_segmentation=np.expand_dims(solution_seg, axis=1) # need to add a dummy dimension to fit API\n", - ")\n", - "\n", - "widget.view_controller.update_napari_layers(appear_run, time_attr=\"t\", pos_attr=(\"x\", \"y\"))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "469b4b7b", - "metadata": {}, - "outputs": [], - "source": [ - "get_metrics(gt_tracks, gt_dets, solution_graph, solution_seg)" - ] - }, - { - "cell_type": "markdown", - "id": "d09bd609", - "metadata": {}, - "source": [ - "## Task 5 - Incorporating Known Direction of Motion\n", + "## Task 4 - Incorporating Known Direction of Motion\n", "\n", "So far, we have been using motile's EdgeDistance as an edge selection cost, 
which penalizes longer edges by computing the Euclidean distance between the endpoints. However, in our dataset we see a trend of upward motion in the cells, and the false detections at the top are not moving. If we penalize movement based on what we expect, rather than Euclidean distance, we can select more correct cells and penalize the non-moving artefacts at the same time.\n", " \n" @@ -1010,10 +911,10 @@ }, { "cell_type": "markdown", - "id": "5df286ff", + "id": "7867d480", "metadata": {}, "source": [ - "

Task 5a: Add a drift distance attribute

\n", + "

Task 4a: Add a drift distance attribute

\n", "

For this task, we need to determine the \"expected\" amount of motion, then add an attribute to our candidate edges that represents distance from the expected motion direction.

\n", "
" ] @@ -1021,7 +922,7 @@ { "cell_type": "code", "execution_count": null, - "id": "d4d1e4c4", + "id": "9198a496", "metadata": { "tags": [ "task" @@ -1044,10 +945,10 @@ }, { "cell_type": "markdown", - "id": "2c6e4553", + "id": "6bc50aff", "metadata": {}, "source": [ - "

Task 5b: Add a drift distance attribute

\n", + "

Task 4b: Use the `drift_dist` attribute in your solver pipeline

\n", "

Now, we set up yet another solving pipeline. This time, we will replace our EdgeDistance\n", "cost with an EdgeSelection cost using our new \"drift_dist\" attribute. The weight should be positive, since a higher distance from the expected drift should cost more, similar to our prior EdgeDistance cost. Also similarly, we need a negative constant to make sure that the overall cost of selecting tracks is negative.

\n", "
" @@ -1056,7 +957,7 @@ { "cell_type": "code", "execution_count": null, - "id": "3b8c40e0", + "id": "4814dfcd", "metadata": {}, "outputs": [], "source": [ @@ -1085,7 +986,7 @@ { "cell_type": "code", "execution_count": null, - "id": "533d50ac", + "id": "63f31351", "metadata": {}, "outputs": [], "source": [ @@ -1096,7 +997,7 @@ { "cell_type": "code", "execution_count": null, - "id": "7ad12973", + "id": "69f253e4", "metadata": {}, "outputs": [], "source": [ @@ -1112,7 +1013,7 @@ { "cell_type": "code", "execution_count": null, - "id": "cb7510f5", + "id": "43eab523", "metadata": {}, "outputs": [], "source": [ @@ -1121,10 +1022,9 @@ }, { "cell_type": "markdown", - "id": "8bbb315d", + "id": "dae0c95b", "metadata": {}, "source": [ - "## Checkpoint 4\n", "

Checkpoint 4

\n", "That is the end of the main exercise! If you have extra time, feel free to go onto the below bonus exercise to see how to learn the weights of your costs instead of setting them manually.\n", "
" @@ -1132,7 +1032,7 @@ }, { "cell_type": "markdown", - "id": "cbbdeae9", + "id": "766a8666", "metadata": {}, "source": [ "## Bonus: Learning the Weights" @@ -1140,14 +1040,18 @@ }, { "cell_type": "markdown", - "id": "3c812d0a", + "id": "af9baa27", "metadata": {}, - "source": [] + "source": [ + "Motile also provides the option to learn the best weights and constants using a (Structured Support Vector Machine)[https://en.wikipedia.org/wiki/Structured_support_vector_machine]. There is a tutorial on the motile documentation (here)[https://funkelab.github.io/motile/learning.html], but we will also walk you through an example below.\n", + "\n", + "We need some ground truth annotations on our candidate graph in order to learn the best weights. The next cell contains a function that matches our ground truth graph to our candidate graph using the predicted segmentations. The function checks for each ground truth node if it is inside one of our predicted segmentations. If it is, that candidate node is marked with attribute \"gt\" = True. Any unmatched candidate nodes have \"gt\" = False. We also annotate the edges in a similar fashion - if both endpoints of a GT edge are inside predicted segmentations, the corresponding candidate edge will have \"gt\" = True, while all other edges going out of that candidate node have \"gt\" = False." + ] }, { "cell_type": "code", "execution_count": null, - "id": "94e3a34a", + "id": "2dbf9fb1", "metadata": {}, "outputs": [], "source": [ @@ -1174,10 +1078,18 @@ " cand_graph.nodes[node][\"gt\"] = False" ] }, + { + "cell_type": "markdown", + "id": "6516a738", + "metadata": {}, + "source": [ + "The SSVM does not need dense ground truth - providing only some annotations frequently is sufficient to learn good weights, and is efficient for both computation time and annotation time. Below, we create a validation graph that spans the first three time frames, and annotate it with our ground truth." 
+ ] + }, { "cell_type": "code", "execution_count": null, - "id": "5c8d6b5b", + "id": "b7d09c22", "metadata": { "lines_to_next_cell": 2 }, @@ -1191,10 +1103,18 @@ "add_gt_annotations(gt_tracks, validation_graph, segmentation)" ] }, + { + "cell_type": "markdown", + "id": "3a705247", + "metadata": {}, + "source": [ + "Here we print the number of nodes and edges that have been annotated with True and False ground truth. It is important to provide negative/False annotations, as well as positive/True annotations, or the SSVM will try and select weights to pick everything." + ] + }, { "cell_type": "code", "execution_count": null, - "id": "73b71faa", + "id": "bfdef448", "metadata": {}, "outputs": [], "source": [ @@ -1207,17 +1127,43 @@ "print(f\"{len(gt_pos_edges) + len(gt_neg_edges)} annotated: {len(gt_pos_edges)} True, {len(gt_neg_edges)} False\")" ] }, + { + "cell_type": "markdown", + "id": "85ce3477", + "metadata": {}, + "source": [ + "

Bonus task: Add your best solver parameters

\n", + "

Now, similar to before, we make the solver by adding costs and constraints. You can copy your best set of costs and constraints from before. It does not matter what weights and constants you choose. However, this time we just return the solver, rather than actually solving.

\n", + "
" + ] + }, { "cell_type": "code", "execution_count": null, - "id": "f03d0087", - "metadata": {}, + "id": "ac2d00fc", + "metadata": { + "tags": [ + "task" + ] + }, "outputs": [], "source": [ - "import logging\n", + "def get_ssvm_solver(cand_graph):\n", "\n", - "logging.basicConfig(level=logging.INFO)\n", + " cand_trackgraph = motile.TrackGraph(cand_graph, frame_attribute=\"t\")\n", + " solver = motile.Solver(cand_trackgraph)\n", "\n", + " ### YOUR CODE HERE ###\n", + " return solver" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9e4380d5", + "metadata": {}, + "outputs": [], + "source": [ "def get_ssvm_solver(cand_graph):\n", "\n", " cand_trackgraph = motile.TrackGraph(cand_graph, frame_attribute=\"t\")\n", @@ -1226,7 +1172,6 @@ " solver.add_cost(\n", " motile.costs.EdgeSelection(weight=1.0, constant=-30, attribute=\"drift_dist\")\n", " )\n", - " solver.add_cost(motile.costs.Appear(constant=20, ignore_attribute=\"ignore_appear\"))\n", " solver.add_cost(motile.costs.Split(constant=20))\n", "\n", " solver.add_constraint(motile.constraints.MaxParents(1))\n", @@ -1234,10 +1179,20 @@ " return solver" ] }, + { + "cell_type": "markdown", + "id": "f5ced80f", + "metadata": {}, + "source": [ + "To fit the best weights, the solver will solve the ILP many times and slowly converge to the best set of weights in a structured manner. Running the cell below may take some time - we recommend getting a Gurobi license if you want to use this technique in your research, as it speeds up solving quite a bit.\n", + "\n", + "At the end, it will print the optimal weights, and you can compare them to the weights you found by trial and error." 
+ ] + }, { "cell_type": "code", "execution_count": null, - "id": "925ff81a", + "id": "a38be363", "metadata": {}, "outputs": [], "source": [ @@ -1247,10 +1202,18 @@ "optimal_weights" ] }, + { + "cell_type": "markdown", + "id": "484b0c68", + "metadata": {}, + "source": [ + "After we have our optimal weights, we need to solve with them on the full candidate graph." + ] + }, { "cell_type": "code", "execution_count": null, - "id": "58f02fb1", + "id": "dc9286ab", "metadata": { "lines_to_next_cell": 2 }, @@ -1266,10 +1229,18 @@ "solution_graph = get_ssvm_solution(cand_graph, optimal_weights)" ] }, + { + "cell_type": "markdown", + "id": "a7ee8784", + "metadata": {}, + "source": [ + "Finally, we can visualize and compute metrics on the solution found using the weights discovered by the SSVM." + ] + }, { "cell_type": "code", "execution_count": null, - "id": "ce459603", + "id": "53ddf188", "metadata": {}, "outputs": [], "source": [ @@ -1280,7 +1251,7 @@ { "cell_type": "code", "execution_count": null, - "id": "88221855", + "id": "e92d8674", "metadata": {}, "outputs": [], "source": [ @@ -1294,20 +1265,16 @@ ] }, { - "cell_type": "code", - "execution_count": null, - "id": "68efa31e", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7d79c0d9", + "cell_type": "markdown", + "id": "1b3c7848", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "

Bonus Question: Interpret SSVM results

\n", + "

\n", + "How do the results compare between the SSVM-discovered weights and your hand-crafted weights? What are the advantages and disadvantages of each approach in terms of (human or computer) time needed?\n", + "

\n", + "
\n" + ] } ], "metadata": { diff --git a/solution.ipynb b/solution.ipynb index f1106f3..eea9408 100644 --- a/solution.ipynb +++ b/solution.ipynb @@ -2,7 +2,7 @@ "cells": [ { "cell_type": "markdown", - "id": "27afaf05", + "id": "64f6e52c", "metadata": {}, "source": [ "# Exercise 9: Tracking-by-detection with an integer linear program (ILP)\n", @@ -45,7 +45,7 @@ }, { "cell_type": "markdown", - "id": "ce2ad255", + "id": "20d72889", "metadata": {}, "source": [ "## Import packages" @@ -54,7 +54,7 @@ { "cell_type": "code", "execution_count": null, - "id": "6bff5d24", + "id": "0fe8999b", "metadata": {}, "outputs": [], "source": [ @@ -67,7 +67,7 @@ { "cell_type": "code", "execution_count": null, - "id": "8469a37a", + "id": "70b45da1", "metadata": {}, "outputs": [], "source": [ @@ -86,6 +86,7 @@ "import zarr\n", "from motile_toolbox.visualization import to_napari_tracks_layer\n", "from motile_toolbox.candidate_graph import graph_to_nx\n", + "from motile_toolbox.visualization.napari_utils import assign_tracklet_ids\n", "import motile_plugin.widgets as plugin_widgets\n", "from motile_plugin.backend.motile_run import MotileRun\n", "from napari.layers import Tracks\n", @@ -102,7 +103,7 @@ }, { "cell_type": "markdown", - "id": "f4ee8217", + "id": "de6c9cf9", "metadata": {}, "source": [ "## Load the dataset and inspect it in napari" @@ -110,7 +111,7 @@ }, { "cell_type": "markdown", - "id": "869845de", + "id": "72be02eb", "metadata": {}, "source": [ "For this exercise we will be working with a fluorescence microscopy time-lapse of breast cancer cells with stained nuclei (SiR-DNA). It is similar to the dataset at https://zenodo.org/record/4034976#.YwZRCJPP1qt. The raw data, pre-computed segmentations, and detection probabilities are saved in a zarr, and the ground truth tracks are saved in a csv. The segmentation was generated with a pre-trained StartDist model, so there may be some segmentation errors which can affect the tracking process. 
The detection probabilities also come from StarDist, and are downsampled in x and y by 2 compared to the detections and raw data." @@ -118,7 +119,7 @@ }, { "cell_type": "markdown", - "id": "5aa6c023", + "id": "c37abd41", "metadata": {}, "source": [ "Here we load the raw image data, segmentation, and probabilities from the zarr, and view them in napari." @@ -127,7 +128,7 @@ { "cell_type": "code", "execution_count": null, - "id": "646169b5", + "id": "5f15cf64", "metadata": {}, "outputs": [], "source": [ @@ -140,7 +141,7 @@ }, { "cell_type": "markdown", - "id": "a2cd889b", + "id": "d9c9eff2", "metadata": {}, "source": [ "Let's use [napari](https://napari.org/tutorials/fundamentals/getting_started.html) to visualize the data. Napari is a wonderful viewer for imaging data that you can interact with in python, even directly out of jupyter notebooks. If you've never used napari, you might want to take a few minutes to go through [this tutorial](https://napari.org/stable/tutorials/fundamentals/viewer.html)." @@ -149,7 +150,7 @@ { "cell_type": "code", "execution_count": null, - "id": "494a92b0", + "id": "79051d16", "metadata": {}, "outputs": [], "source": [ @@ -161,7 +162,7 @@ }, { "cell_type": "markdown", - "id": "10fa981d", + "id": "b2303fed", "metadata": {}, "source": [ "## Read in the ground truth graph\n", @@ -179,7 +180,7 @@ }, { "cell_type": "markdown", - "id": "59e5645b", + "id": "ed5a1655", "metadata": {}, "source": [ "\n", @@ -203,7 +204,7 @@ { "cell_type": "code", "execution_count": null, - "id": "dce39690", + "id": "b5344669", "metadata": { "tags": [ "solution" @@ -234,7 +235,7 @@ { "cell_type": "code", "execution_count": null, - "id": "5397dc1f", + "id": "ef45818a", "metadata": {}, "outputs": [], "source": [ @@ -254,28 +255,44 @@ }, { "cell_type": "markdown", - "id": "2d9fed23", + "id": "22a1f33c", "metadata": {}, "source": [ - "We can also use the helper function `to_napari_tracks_layer` to visualize the ground truth tracks in our napari viewer." 
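The section above reads the ground truth tracks from a csv into a networkx graph. As a rough sketch of what that involves (the column names `id`, `t`, `x`, `y`, `parent_id` and the `-1` "no parent" convention are assumptions for illustration, not the actual schema of the exercise's csv):

```python
import csv
import io

import networkx as nx

def read_gt_tracks(csv_file) -> nx.DiGraph:
    """Build a directed graph with one node per detection and an edge
    from each parent detection to its child in the next frame."""
    graph = nx.DiGraph()
    for row in csv.DictReader(csv_file):
        node_id = int(row["id"])
        graph.add_node(node_id, t=int(row["t"]), x=float(row["x"]), y=float(row["y"]))
        parent = int(row["parent_id"])
        if parent != -1:  # assumed convention: -1 marks the start of a track
            graph.add_edge(parent, node_id)
    return graph

# tiny in-memory example with two linked detections
example = io.StringIO("id,t,x,y,parent_id\n1,0,10.0,20.0,-1\n2,1,11.0,21.5,1\n")
gt = read_gt_tracks(example)
```

The real exercise file may use different column names or a different root marker; adjust the sketch accordingly.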
+ "Here we set up a napari widget for visualizing the tracking results. This is part of the motile napari plugin, not part of core napari.\n", + "If you get a napari error that the viewer window is closed, please re-run the previous visualization cell to re-open the viewer window." ] }, { "cell_type": "code", "execution_count": null, - "id": "5d3251d9", + "id": "6779781f", "metadata": {}, "outputs": [], "source": [ - "\n", "widget = plugin_widgets.TreeWidget(viewer)\n", "viewer.window.add_dock_widget(widget, name=\"Lineage View\", area=\"bottom\")" ] }, + { + "cell_type": "markdown", + "id": "b7cda55e", + "metadata": {}, + "source": [ + "Here we add a \"MotileRun\" to the napari tracking visualization widget (the \"view_controller\"). A MotileRun includes a name, a set of tracks, and a segmentation. The tracking visualization widget will add:\n", + "- a Points layer with the points in the tracks\n", + "- a Tracks layer to display the track history as a \"tail\" behind the point in the current time frame\n", + "- a Labels layer, if a segmentation was provided\n", + "- a Lineage View widget, which displays an abstract graph representation of all the solution tracks\n", + "\n", + "These views are synchronized such that every element is colored by the track ID of the element. Clicking on a node in the Lineage View will navigate to that cell in the data, and vice versa.\n", + "\n", + "Hint - if your screen is too small, you can \"pop out\" the lineage tree view into a separate window using the icon that looks like two boxes in the top left of the lineage tree view. You can also close the tree view with the x just above it, and open it again from the menu bar: Plugins -> Motile -> Lineage View (then re-run the below cell to add the data to the lineage view)." 
+ ] + }, { "cell_type": "code", "execution_count": null, - "id": "0c5a17b1", + "id": "934c2cc9", "metadata": {}, "outputs": [], "source": [ @@ -289,7 +306,7 @@ }, { "cell_type": "markdown", - "id": "a1064a48", + "id": "a04d4215", "metadata": { "lines_to_next_cell": 2 }, @@ -306,7 +323,7 @@ }, { "cell_type": "markdown", - "id": "01c2aa7c", + "id": "a288a7e4", "metadata": {}, "source": [ "

### Task 2: Extract candidate nodes from the predicted segmentations

\n", @@ -328,7 +345,7 @@ { "cell_type": "code", "execution_count": null, - "id": "5411baa3", + "id": "c11baf37", "metadata": { "tags": [ "solution" @@ -368,7 +385,7 @@ { "cell_type": "code", "execution_count": null, - "id": "d9deb6ef", + "id": "eebf6235", "metadata": {}, "outputs": [], "source": [ @@ -388,7 +405,7 @@ }, { "cell_type": "markdown", - "id": "e6e74e1e", + "id": "f48d9da7", "metadata": {}, "source": [ "We can visualize our candidate points using the napari Points layer. You should see one point in the center of each segmentation when we display it using the below cell." @@ -397,7 +414,7 @@ { "cell_type": "code", "execution_count": null, - "id": "47504dad", + "id": "51739880", "metadata": {}, "outputs": [], "source": [ @@ -408,7 +425,7 @@ }, { "cell_type": "markdown", - "id": "af4431a4", + "id": "c49a77b3", "metadata": {}, "source": [ "### Adding Candidate Edges\n", @@ -421,7 +438,7 @@ { "cell_type": "code", "execution_count": null, - "id": "d10eccde", + "id": "a085e685", "metadata": {}, "outputs": [], "source": [ @@ -490,7 +507,7 @@ }, { "cell_type": "markdown", - "id": "9391e4a1", + "id": "253b78fe", "metadata": {}, "source": [ "Visualizing the candidate edges in napari is, unfortunately, not yet possible. However, we can print out the number of candidate nodes and edges, and compare it to the ground truth nodes and edgesedges. We should see that we have a few more candidate nodes than ground truth (due to false positive detections) and many more candidate edges than ground truth - our next step will be to use optimization to pick a subset of the candidate nodes and edges to generate our solution tracks." 
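The comparison described above boils down to counting nodes and edges in two networkx graphs. A toy sketch (the real `cand_graph` and ground truth graph come from the earlier cells; the graphs below are made-up stand-ins):

```python
import networkx as nx

def describe_graphs(cand_graph: nx.DiGraph, gt_graph: nx.DiGraph) -> str:
    """Summarize node and edge counts for the candidate and GT graphs."""
    return (
        f"Candidate graph: {cand_graph.number_of_nodes()} nodes, "
        f"{cand_graph.number_of_edges()} edges | "
        f"Ground truth: {gt_graph.number_of_nodes()} nodes, "
        f"{gt_graph.number_of_edges()} edges"
    )

# toy stand-ins: the candidate graph has an extra node and extra edges
cand = nx.DiGraph([(1, 2), (1, 3), (2, 4), (3, 4)])
gt = nx.DiGraph([(1, 2), (2, 4)])
print(describe_graphs(cand, gt))
```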
@@ -499,7 +516,7 @@ { "cell_type": "code", "execution_count": null, - "id": "1588427e", + "id": "61c29895", "metadata": {}, "outputs": [], "source": [ @@ -509,7 +526,7 @@ }, { "cell_type": "markdown", - "id": "e4000f51", + "id": "8da5f320", "metadata": {}, "source": [ "## Checkpoint 1\n", @@ -521,7 +538,7 @@ }, { "cell_type": "markdown", - "id": "d25b3d5b", + "id": "d559c9a1", "metadata": {}, "source": [ "## Setting Up the Tracking Optimization Problem" @@ -529,7 +546,7 @@ }, { "cell_type": "markdown", - "id": "fa12f2ff", + "id": "0101df33", "metadata": {}, "source": [ "As hinted earlier, our goal is to prune the candidate graph. More formally we want to find a graph $\\tilde{G}=(\\tilde{V}, \\tilde{E})$ whose vertices $\\tilde{V}$ are a subset of the candidate graph vertices $V$ and whose edges $\\tilde{E}$ are a subset of the candidate graph edges $E$.\n", @@ -546,7 +563,7 @@ }, { "cell_type": "markdown", - "id": "840742c0", + "id": "6b7907ba", "metadata": {}, "source": [ "## Task 3 - Basic tracking with motile\n", @@ -571,7 +588,7 @@ { "cell_type": "code", "execution_count": null, - "id": "bc1a047f", + "id": "5bb7f6ec", "metadata": { "tags": [ "solution" @@ -606,7 +623,7 @@ }, { "cell_type": "markdown", - "id": "948f93f9", + "id": "ea4810b3", "metadata": {}, "source": [ "Here is a utility function to gauge some statistics of a solution." 
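The statistics helper itself is defined in the exercise repository; as a hedged sketch, the kinds of numbers such a helper might report can be computed directly from the solution graph (the exact set of statistics is an assumption, not the actual helper):

```python
import networkx as nx

def solution_stats(solution: nx.DiGraph) -> dict:
    """Count basic structural properties of a solution tracking graph."""
    return {
        "nodes": solution.number_of_nodes(),
        "edges": solution.number_of_edges(),
        # a division is a cell with two children in the next frame
        "divisions": sum(1 for n in solution.nodes if solution.out_degree(n) == 2),
        "track_starts": sum(1 for n in solution.nodes if solution.in_degree(n) == 0),
        "track_ends": sum(1 for n in solution.nodes if solution.out_degree(n) == 0),
    }

toy = nx.DiGraph([(1, 2), (2, 3), (2, 4)])  # one track that divides at node 2
stats = solution_stats(toy)
```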
@@ -615,7 +632,7 @@ { "cell_type": "code", "execution_count": null, - "id": "b56ae1ff", + "id": "fbb338f9", "metadata": {}, "outputs": [], "source": [ @@ -625,7 +642,7 @@ }, { "cell_type": "markdown", - "id": "0db4497b", + "id": "0e1948c5", "metadata": {}, "source": [ "Here we actually run the optimization, and compare the found solution to the ground truth.\n", @@ -641,7 +658,7 @@ { "cell_type": "code", "execution_count": null, - "id": "7dad6f81", + "id": "7d6d6323", "metadata": {}, "outputs": [], "source": [ @@ -656,7 +673,7 @@ }, { "cell_type": "markdown", - "id": "b101edc7", + "id": "9ce61acd", "metadata": {}, "source": [ "If you haven't selected any nodes or edges in your solution, try adjusting your weight and/or constant values. Make sure you have some negative costs or selecting nothing will always be the best solution!" @@ -664,7 +681,7 @@ }, { "cell_type": "markdown", - "id": "680c9f61", + "id": "3416d3d1", "metadata": {}, "source": [ "

### Question 1: Interpret your results based on statistics

\n", @@ -676,15 +693,25 @@ }, { "cell_type": "markdown", - "id": "f7b794d2", + "id": "8f1c5422", + "metadata": {}, + "source": [ + "

### Checkpoint 2

\n", + "We will discuss the exercise up to this point as a group shortly. If you reach this checkpoint early, you can go on to Checkpoint 3.\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "0ece0a24", "metadata": {}, "source": [ "## Visualize the Result\n", "Rather than just looking at printed statistics about our solution, let's visualize it in `napari`.\n", "\n", - "This involves two steps:\n", - "1. First, we can add a tracks layer with the solution graph.\n", - "2. Second, we can add another segmentation layer, where the segmentations are relabeled so that the same cell will be the same color over time. Cells will still change color at division.\n", + "Before we can create our MotileRun, we need to create an output segmentation from our solution. Our output segmentation differs from our input segmentation in two main ways:\n", + "1. Not all candidate nodes will be selected in our solution graph. We need to filter the masks corresponding to the un-selected candidate detections out of the output segmentation. \n", + "2. Segmentations will be relabeled so that the same cell will be the same label (and thus color) over time. Cells will still change label/color at division.\n", "\n", "Note that bad tracking results at this point does not mean that you implemented anything wrong! We still need to customize our costs and constraints to the task before we can get good results. As long as your pipeline selects something, and you can kind of interepret why it is going wrong, that is all that is needed at this point." ] @@ -692,32 +719,24 @@ { "cell_type": "code", "execution_count": null, - "id": "34701279", + "id": "38c587de", "metadata": {}, "outputs": [], "source": [ - "# recolor the segmentation\n", - "\n", - "from motile_toolbox.visualization.napari_utils import assign_tracklet_ids\n", "def relabel_segmentation(\n", " solution_nx_graph: nx.DiGraph,\n", " segmentation: np.ndarray,\n", ") -> np.ndarray:\n", - " \"\"\"Relabel a segmentation based on tracking results so that nodes in same\n", - " track share the same id. 
IDs do change at division.\n", + " \"\"\"Relabel a segmentation based on tracking results to get the output segmentation.\n", "\n", " Args:\n", " solution_nx_graph (nx.DiGraph): Networkx graph with the solution to use\n", - " for relabeling. Nodes not in graph will be removed from seg. Original\n", - " segmentation ids and hypothesis ids have to be stored in the graph so we\n", - " can map them back.\n", - " segmentation (np.ndarray): Original (potentially multi-hypothesis)\n", - " segmentation with dimensions (t,h,[z],y,x), where h is 1 for single\n", - " input segmentation.\n", + " for relabeling. Nodes not in graph will be removed from seg.\n", + " segmentation (np.ndarray): Original segmentation with dimensions (t,y,x)\n", "\n", " Returns:\n", " np.ndarray: Relabeled segmentation array where nodes in same track share same\n", - " id with shape (t,1,[z],y,x)\n", + " id with shape (t,y,x)\n", " \"\"\"\n", " assign_tracklet_ids(solution_nx_graph)\n", " tracked_masks = np.zeros_like(segmentation)\n", @@ -737,12 +756,12 @@ { "cell_type": "code", "execution_count": null, - "id": "bbf09ab4", + "id": "77a642e2", "metadata": {}, "outputs": [], "source": [ "basic_run = MotileRun(\n", - " run_name=\"basic_solution_test\",\n", + " run_name=\"basic_solution\",\n", " tracks=solution_graph,\n", " output_segmentation=np.expand_dims(solution_seg, axis=1) # need to add a dummy dimension to fit API\n", ")\n", @@ -752,7 +771,7 @@ }, { "cell_type": "markdown", - "id": "ac106e5b", + "id": "dc80ada1", "metadata": {}, "source": [ "

### Question 2: Interpret your results based on visualization

\n", @@ -764,7 +783,7 @@ }, { "cell_type": "markdown", - "id": "a3fa790b", + "id": "73ead5d8", "metadata": { "lines_to_next_cell": 2 }, @@ -775,19 +794,26 @@ "\n", "Additionally, we would also like to quantify this. We will use the package [`traccuracy`](https://traccuracy.readthedocs.io/en/latest/) to calculate some [standard metrics for cell tracking](http://celltrackingchallenge.net/evaluation-methodology/). For example, a high-level indicator for tracking performance is called TRA.\n", "\n", - "If you're interested in more detailed metrics, you can look at the false positive (FP) and false negative (FN) nodes, edges and division events." + "If you're interested in more detailed metrics, you can look at the false positive (FP) and false negative (FN) nodes, edges and division events.\n", + "\n", + "TODO: explain this better. Pick a few important metrics (TRA, FP Nodes/edges/divs, FN nodes/edges/divs, TP divs) and explain them. Also, make a table of those specific results that we can add \"runs\" to so we can compare results across runs without scrolling up and down." + ] + }, + { + "cell_type": "markdown", + "id": "c74010a3", + "metadata": {}, + "source": [ + "The metrics we want to compute require a ground truth segmentation. Since we do not have a ground truth segmentation, we can make one by drawing a circle around each ground truth detection. While not perfect, it will be good enough to match ground truth to predicted detections in order to compute metrics." 
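The matching itself is handled by `traccuracy`, but the underlying idea is an overlap test between a ground-truth disk and a predicted mask. A minimal numpy illustration of such an intersection-over-union check (this is just a sketch of the concept, not the traccuracy implementation):

```python
import numpy as np

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection over union of two boolean masks."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection / union) if union > 0 else 0.0

# two 4x4 squares that overlap in a 2x2 patch
a = np.zeros((10, 10), dtype=bool)
b = np.zeros((10, 10), dtype=bool)
a[2:6, 2:6] = True
b[4:8, 4:8] = True
print(iou(a, b))  # 4 / 28 ≈ 0.143
```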
] }, { "cell_type": "code", "execution_count": null, - "id": "006cb964", - "metadata": { - "lines_to_next_cell": 2 - }, + "id": "06297df6", + "metadata": {}, "outputs": [], "source": [ - "\n", "from skimage.draw import disk\n", "def make_gt_detections(data_shape, gt_tracks, radius):\n", " segmentation = np.zeros(data_shape, dtype=\"uint32\")\n", @@ -801,14 +827,13 @@ " segmentation[time][rr, cc] = node\n", " return segmentation\n", "\n", - "gt_dets = make_gt_detections(data_root[\"raw\"].shape, gt_tracks, 10)\n", - "# viewer.add_image(gt_dets)" + "gt_dets = make_gt_detections(data_root[\"raw\"].shape, gt_tracks, 10)" ] }, { "cell_type": "code", "execution_count": null, - "id": "2f6b791d", + "id": "4fb4bcb7", "metadata": {}, "outputs": [], "source": [ @@ -854,7 +879,7 @@ { "cell_type": "code", "execution_count": null, - "id": "0cd8fcda", + "id": "b49a7e73", "metadata": {}, "outputs": [], "source": [ @@ -863,7 +888,7 @@ }, { "cell_type": "markdown", - "id": "eb4a8e7b", + "id": "531c40f9", "metadata": {}, "source": [ "

### Question 3: Interpret your results based on metrics

\n", @@ -875,19 +900,17 @@ }, { "cell_type": "markdown", - "id": "698f9149", + "id": "94dca515", "metadata": {}, "source": [ - "

### Checkpoint 2

\n", - "We have set up and run a basic ILP to get tracks and visualized and evaluated the output. \n", - "\n", - "We will go over the code and discuss the answers to Questions 1, 2, and 3 together soon. If you have extra time, think about what kinds of improvements you could make to the costs and constraints to fix the issues that you are seeing. You can even try tuning your weights and constants, or adding or removing motile Costs and Constraints, and seeing how that changes the output.\n", + "

### Checkpoint 3

\n", + "If you reach this checkpoint with extra time, think about what kinds of improvements you could make to the costs and constraints to fix the issues that you are seeing. You can try tuning your weights and constants, or adding or removing motile Costs and Constraints, and seeing how that changes the output. See how good you can make the results!\n", "
" ] }, { "cell_type": "markdown", - "id": "57ea6152", + "id": "9907c404", "metadata": {}, "source": [ "## Customizing the Tracking Task\n", @@ -897,145 +920,15 @@ "2. Change the structure of the candidate graph\n", "3. Add a new type of cost or constraint\n", "\n", - "The first way is the most common, and is quite flexible, so we will focus on two examples of this type of customization." - ] - }, - { - "cell_type": "markdown", - "id": "ee0471d1", - "metadata": {}, - "source": [ - "## Task 4 - Add an appear cost, but not at the boundary\n", - "The Appear cost penalizes starting a new track, encouraging continuous tracks. However, you do not want to penalize tracks that appear in the first frame. In our case, we probably also do not want to penalize appearing at the \"bottom\" of the dataset. The built in Appear cost ([docs here](https://funkelab.github.io/motile/api.html#motile.costs.Appear_)) has an `ignore_attribute` argument, where if the node has that attribute and it evaluates to True, the Appear cost will not be paid for that node." - ] - }, - { - "cell_type": "markdown", - "id": "08c133dc", - "metadata": {}, - "source": [ - "\n", - "

### Task 4a: Add an ignore_appear attribute to the candidate graph

\n", - "

\n", - "For each node on our candidate graph, you should add an attribute `ignore_appear` that evaluates to True if the appear cost should NOT be paid for that node. For nodes that should pay the appear cost, you can either set the attribute to False, or not add the attribute. Nodes should NOT pay the appear cost if they are in the first time frame, or if they are within a certain distance to the \"bottom\" of the dataset. (You will need to determine the coordinate range that you consider the \"bottom\" of the dataset, perhaps by hovering over the napari image layer and seeing what the coordinate values are).\n", - "

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6c2fddef", - "metadata": { - "tags": [ - "solution" - ] - }, - "outputs": [], - "source": [ - "def add_appear_ignore_attr(cand_graph):\n", - " for node in cand_graph.nodes():\n", - " time = cand_graph.nodes[node][\"t\"]\n", - " pos_x = cand_graph.nodes[node][\"x\"]\n", - " if time == 0 or pos_x >= 710:\n", - " cand_graph.nodes[node][\"ignore_appear\"] = True\n", - "\n", - "add_appear_ignore_attr(cand_graph)" - ] - }, - { - "cell_type": "markdown", - "id": "09df8fe1", - "metadata": {}, - "source": [ - "

### Task 4b: Use the `ignore_appear` attribute in your solver pipeline

\n", - "

Copy your solver pipeline from above, and then adapt the Appear cost to use our new `ignore_appear` attribute. You may also want to adapt the Appear constant value. Then re-solve to see if performance improves.\n", - "

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "aebf2af0", - "metadata": { - "tags": [ - "solution" - ] - }, - "outputs": [], - "source": [ - "def solve_appear_optimization(cand_graph):\n", - " \"\"\"Set up and solve the network flow problem.\n", - "\n", - " Args:\n", - " graph (nx.DiGraph): The candidate graph.\n", - "\n", - " Returns:\n", - " nx.DiGraph: The networkx digraph with the selected solution tracks\n", - " \"\"\"\n", - "\n", - " cand_trackgraph = motile.TrackGraph(cand_graph, frame_attribute=\"t\")\n", - " solver = motile.Solver(cand_trackgraph)\n", - "\n", - " solver.add_cost(\n", - " motile.costs.EdgeDistance(weight=1, constant=-20, position_attribute=(\"x\", \"y\"))\n", - " )\n", - " solver.add_cost(motile.costs.Appear(constant=10, ignore_attribute=\"ignore_appear\"))\n", - "\n", - " solver.add_constraint(motile.constraints.MaxParents(1))\n", - " solver.add_constraint(motile.constraints.MaxChildren(2))\n", - "\n", - " solver.solve()\n", - " solution_graph = graph_to_nx(solver.get_selected_subgraph())\n", - " return solution_graph" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c9d8bcf7", - "metadata": {}, - "outputs": [], - "source": [ - "solution_graph = solve_appear_optimization(cand_graph)\n", - "solution_seg = relabel_segmentation(solution_graph, segmentation)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bf3b499d", - "metadata": { - "lines_to_next_cell": 2 - }, - "outputs": [], - "source": [ - "appear_run = MotileRun(\n", - " run_name=\"appear_solution\",\n", - " tracks=solution_graph,\n", - " output_segmentation=np.expand_dims(solution_seg, axis=1) # need to add a dummy dimension to fit API\n", - ")\n", - "\n", - "widget.view_controller.update_napari_layers(appear_run, time_attr=\"t\", pos_attr=(\"x\", \"y\"))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "469b4b7b", - "metadata": {}, - "outputs": [], - "source": [ - "get_metrics(gt_tracks, 
gt_dets, solution_graph, solution_seg)" + "The first way is the most common, and is quite flexible, so we will focus on an example of this type of customization." ] }, { "cell_type": "markdown", - "id": "d09bd609", + "id": "2efc8ee1", "metadata": {}, "source": [ - "## Task 5 - Incorporating Known Direction of Motion\n", + "## Task 4 - Incorporating Known Direction of Motion\n", "\n", "So far, we have been using motile's EdgeDistance as an edge selection cost, which penalizes longer edges by computing the Euclidean distance between the endpoints. However, in our dataset we see a trend of upward motion in the cells, and the false detections at the top are not moving. If we penalize movement based on what we expect, rather than Euclidean distance, we can select more correct cells and penalize the non-moving artefacts at the same time.\n", " \n" @@ -1043,10 +936,10 @@ }, { "cell_type": "markdown", - "id": "5df286ff", + "id": "7867d480", "metadata": {}, "source": [ - "

### Task 5a: Add a drift distance attribute

\n", + "

### Task 4a: Add a drift distance attribute

\n", "

For this task, we need to determine the \"expected\" amount of motion, then add an attribute to our candidate edges that represents distance from the expected motion direction.

\n", "
" ] @@ -1054,7 +947,7 @@ { "cell_type": "code", "execution_count": null, - "id": "43f6ce9d", + "id": "68e48933", "metadata": { "tags": [ "solution" @@ -1080,10 +973,10 @@ }, { "cell_type": "markdown", - "id": "2c6e4553", + "id": "6bc50aff", "metadata": {}, "source": [ - "

### Task 5b: Add a drift distance attribute

\n", + "

### Task 4b: Use the drift distance attribute in your solver pipeline

\n", "

Now, we set up yet another solving pipeline. This time, we will replace our EdgeDistance\n", "cost with an EdgeSelection cost using our new \"drift_dist\" attribute. The weight should be positive, since a higher distance from the expected drift should cost more, similar to our prior EdgeDistance cost. Also similarly, we need a negative constant to make sure that the overall cost of selecting tracks is negative.

\n", "
" @@ -1092,7 +985,7 @@ { "cell_type": "code", "execution_count": null, - "id": "3b8c40e0", + "id": "4814dfcd", "metadata": {}, "outputs": [], "source": [ @@ -1121,7 +1014,7 @@ { "cell_type": "code", "execution_count": null, - "id": "12b8df7d", + "id": "3fbfd6e4", "metadata": { "tags": [ "solution" @@ -1145,8 +1038,6 @@ " solver.add_cost(\n", " motile.costs.EdgeSelection(weight=1.0, constant=-30, attribute=\"drift_dist\")\n", " )\n", - " solver.add_cost(motile.costs.Appear(constant=100, ignore_attribute=\"ignore_appear\"))\n", - " solver.add_cost(motile.costs.Split(constant=20))\n", "\n", " solver.add_constraint(motile.constraints.MaxParents(1))\n", " solver.add_constraint(motile.constraints.MaxChildren(2))\n", @@ -1159,7 +1050,7 @@ { "cell_type": "code", "execution_count": null, - "id": "533d50ac", + "id": "63f31351", "metadata": {}, "outputs": [], "source": [ @@ -1170,7 +1061,7 @@ { "cell_type": "code", "execution_count": null, - "id": "7ad12973", + "id": "69f253e4", "metadata": {}, "outputs": [], "source": [ @@ -1186,7 +1077,7 @@ { "cell_type": "code", "execution_count": null, - "id": "cb7510f5", + "id": "43eab523", "metadata": {}, "outputs": [], "source": [ @@ -1195,10 +1086,9 @@ }, { "cell_type": "markdown", - "id": "8bbb315d", + "id": "dae0c95b", "metadata": {}, "source": [ - "## Checkpoint 4\n", "

### Checkpoint 4

\n", "That is the end of the main exercise! If you have extra time, feel free to go onto the below bonus exercise to see how to learn the weights of your costs instead of setting them manually.\n", "
"
   ]
  },
  {
   "cell_type": "markdown",
-   "id": "cbbdeae9",
+   "id": "766a8666",
   "metadata": {},
   "source": [
    "## Bonus: Learning the Weights"
   ]
  },
  {
   "cell_type": "markdown",
-   "id": "3c812d0a",
+   "id": "af9baa27",
   "metadata": {},
-   "source": []
+   "source": [
+    "Motile also provides the option to learn the best weights and constants using a [Structured Support Vector Machine](https://en.wikipedia.org/wiki/Structured_support_vector_machine). There is a tutorial on the motile documentation [here](https://funkelab.github.io/motile/learning.html), but we will also walk you through an example below.\n",
+    "\n",
+    "We need some ground truth annotations on our candidate graph in order to learn the best weights. The next cell contains a function that matches our ground truth graph to our candidate graph using the predicted segmentations. The function checks for each ground truth node if it is inside one of our predicted segmentations. If it is, that candidate node is marked with attribute \"gt\" = True. Any unmatched candidate nodes have \"gt\" = False. We also annotate the edges in a similar fashion - if both endpoints of a GT edge are inside predicted segmentations, the corresponding candidate edge will have \"gt\" = True, while all other edges going out of that candidate node have \"gt\" = False."
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": null,
-   "id": "94e3a34a",
+   "id": "2dbf9fb1",
   "metadata": {},
   "outputs": [],
   "source": [
@@ -1248,10 +1142,18 @@
     "            cand_graph.nodes[node][\"gt\"] = False"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "6516a738",
+   "metadata": {},
+   "source": [
+    "The SSVM does not need dense ground truth - providing only a few annotations is frequently sufficient to learn good weights, and is efficient for both computation time and annotation time. Below, we create a validation graph that spans the first three time frames, and annotate it with our ground truth."
+ ] + }, { "cell_type": "code", "execution_count": null, - "id": "5c8d6b5b", + "id": "b7d09c22", "metadata": { "lines_to_next_cell": 2 }, @@ -1265,10 +1167,18 @@ "add_gt_annotations(gt_tracks, validation_graph, segmentation)" ] }, + { + "cell_type": "markdown", + "id": "3a705247", + "metadata": {}, + "source": [ + "Here we print the number of nodes and edges that have been annotated with True and False ground truth. It is important to provide negative/False annotations, as well as positive/True annotations, or the SSVM will try and select weights to pick everything." + ] + }, { "cell_type": "code", "execution_count": null, - "id": "73b71faa", + "id": "bfdef448", "metadata": {}, "outputs": [], "source": [ @@ -1281,17 +1191,23 @@ "print(f\"{len(gt_pos_edges) + len(gt_neg_edges)} annotated: {len(gt_pos_edges)} True, {len(gt_neg_edges)} False\")" ] }, + { + "cell_type": "markdown", + "id": "85ce3477", + "metadata": {}, + "source": [ + "

### Bonus task: Add your best solver parameters

\n", + "

Now, similar to before, we make the solver by adding costs and constraints. You can copy your best set of costs and constraints from before. It does not matter what weights and constants you choose. However, this time we just return the solver, rather than actually solving.

\n", + "
" + ] + }, { "cell_type": "code", "execution_count": null, - "id": "f03d0087", + "id": "9e4380d5", "metadata": {}, "outputs": [], "source": [ - "import logging\n", - "\n", - "logging.basicConfig(level=logging.INFO)\n", - "\n", "def get_ssvm_solver(cand_graph):\n", "\n", " cand_trackgraph = motile.TrackGraph(cand_graph, frame_attribute=\"t\")\n", @@ -1300,7 +1216,6 @@ " solver.add_cost(\n", " motile.costs.EdgeSelection(weight=1.0, constant=-30, attribute=\"drift_dist\")\n", " )\n", - " solver.add_cost(motile.costs.Appear(constant=20, ignore_attribute=\"ignore_appear\"))\n", " solver.add_cost(motile.costs.Split(constant=20))\n", "\n", " solver.add_constraint(motile.constraints.MaxParents(1))\n", @@ -1308,10 +1223,20 @@ " return solver" ] }, + { + "cell_type": "markdown", + "id": "f5ced80f", + "metadata": {}, + "source": [ + "To fit the best weights, the solver will solve the ILP many times and slowly converge to the best set of weights in a structured manner. Running the cell below may take some time - we recommend getting a Gurobi license if you want to use this technique in your research, as it speeds up solving quite a bit.\n", + "\n", + "At the end, it will print the optimal weights, and you can compare them to the weights you found by trial and error." + ] + }, { "cell_type": "code", "execution_count": null, - "id": "925ff81a", + "id": "a38be363", "metadata": {}, "outputs": [], "source": [ @@ -1321,10 +1246,18 @@ "optimal_weights" ] }, + { + "cell_type": "markdown", + "id": "484b0c68", + "metadata": {}, + "source": [ + "After we have our optimal weights, we need to solve with them on the full candidate graph." 
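To build intuition for what the weight fitting is doing, here is a toy illustration. This is NOT the motile API or its SSVM algorithm: it brute-forces a tiny parameter grid, where structured learning converges far more cleverly, but it shows the objective, picking cost parameters whose induced selection agrees with the ground truth annotations. All numbers below are made up.

```python
import itertools

# (feature value, ground-truth label) per candidate edge - made-up numbers
edges = [(2.0, True), (3.0, True), (15.0, False), (20.0, False)]

def disagreements(weight: float, constant: float) -> int:
    """Count edges where the induced selection disagrees with the annotation.

    An edge is 'selected' when its cost (weight * feature + constant) is negative.
    """
    return sum((weight * feat + constant < 0) != gt for feat, gt in edges)

# brute-force search over a tiny, hypothetical parameter grid
best = min(
    itertools.product([0.5, 1.0, 2.0], [-40, -10, -5]),
    key=lambda wc: disagreements(*wc),
)
```

With these toy numbers, the search finds a (weight, constant) pair that separates the annotated-True edges (small feature values) from the annotated-False ones, which is the same trade-off the SSVM resolves on the real candidate graph.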
+ ] + }, { "cell_type": "code", "execution_count": null, - "id": "58f02fb1", + "id": "dc9286ab", "metadata": { "lines_to_next_cell": 2 }, @@ -1340,10 +1273,18 @@ "solution_graph = get_ssvm_solution(cand_graph, optimal_weights)" ] }, + { + "cell_type": "markdown", + "id": "a7ee8784", + "metadata": {}, + "source": [ + "Finally, we can visualize and compute metrics on the solution found using the weights discovered by the SSVM." + ] + }, { "cell_type": "code", "execution_count": null, - "id": "ce459603", + "id": "53ddf188", "metadata": {}, "outputs": [], "source": [ @@ -1354,7 +1295,7 @@ { "cell_type": "code", "execution_count": null, - "id": "88221855", + "id": "e92d8674", "metadata": {}, "outputs": [], "source": [ @@ -1368,20 +1309,16 @@ ] }, { - "cell_type": "code", - "execution_count": null, - "id": "68efa31e", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7d79c0d9", + "cell_type": "markdown", + "id": "1b3c7848", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "

### Bonus Question: Interpret SSVM results

\n", + "

\n", + "How do the results compare between the SSVM-discovered weights and your hand-crafted weights? What are the advantages and disadvantages of each approach in terms of (human or computer) time needed?\n", + "

\n", + "
\n" + ] } ], "metadata": {