Contig Clustering Testing #51

mgottl04 · 2016-09-12T18:36:15Z

I previously tested contig ordering on CAD11 using cluster::pam with 23 and 36 clusters. The calculated F-score for both pam runs was worse than that of the first pass of contiBAIT's current algorithm.
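For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of one common way to score a clustering against known labels: a pairwise F-score. It is an assumption that this matches the F-score used above (the exact definition is not stated here), and `pairwise_f_score` is a hypothetical helper, not part of contiBAIT or the cluster package.

```python
from itertools import combinations


def pairwise_f_score(true_labels, predicted_labels):
    """Pairwise F-score: every pair of items is one binary decision.

    A pair is a true positive when both labelings put the two items
    in the same group. Assumed definition, not taken from contiBAIT.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = predicted_labels[i] == predicted_labels[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1  # clustered together, but should be apart
        elif same_true:
            fn += 1  # should be together, but were split

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

The score depends only on which items are grouped together, not on the cluster names, so the 23-cluster and 36-cluster runs can be compared against the same reference labels.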

Before discarding the changes altogether, I would like to test on a harder-to-cluster data set. This would require data from @rareaquaticbadger that does not cluster well. I believe the TDEV data should fit the bill, but I do not have the input to the first call of clusterContigs for this data set.

I don't expect cluster::pam to be better at this point, but it would be nice to confirm on a messier data set. I feel like contiBAIT's Three Star process is pretty similar to doing one iteration of k-means with a sliding number of clusters. Given @matthewborkowski's performance improvements to clustering, it should be possible to do multiple iterations after the first pass in a fashion similar to k-means.
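To make the "multiple iterations" idea concrete, here is a rough sketch of k-means-style refinement starting from an existing assignment. It is not contiBAIT code: `refine_clusters` is a hypothetical name, each contig is assumed to be a plain numeric vector, and squared Euclidean distance stands in for whatever similarity measure the real first pass uses.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine_clusters(points, labels, max_iterations=10):
    """Refine an initial clustering with k-means-style iterations.

    `labels` is the assignment from a first pass (assumed input).
    Each round recomputes cluster means, then reassigns every point
    to its nearest mean, stopping when nothing changes.
    """
    labels = list(labels)
    for _ in range(max_iterations):
        # Update step: mean vector of each non-empty cluster.
        members = {}
        for point, label in zip(points, labels):
            members.setdefault(label, []).append(point)
        centroids = {
            label: [sum(column) / len(rows) for column in zip(*rows)]
            for label, rows in members.items()
        }

        # Assignment step: move each point to its nearest centroid.
        new_labels = [
            min(centroids, key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
    return labels


# Toy example: the last point starts in the wrong cluster.
points = [[0, 0], [0, 1], [10, 10], [10, 11], [1, 0]]
refined = refine_clusters(points, [0, 0, 1, 1, 1])
```

In this sketch the number of clusters can only stay the same or shrink (a cluster that loses all its members disappears), so it would complement the first pass rather than replace the sliding cluster count.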

oneillkza added the enhancement label Mar 14, 2018
