Contig Clustering Testing #51

Open
mgottl04 opened this issue Sep 12, 2016 · 0 comments

I previously tested contig ordering on CAD11 using cluster::pam with 23 and 36 clusters. The F-score calculated for both pam runs was worse than that of the first pass of contiBAIT's current algorithm.
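
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of one common clustering F-score, the pairwise F-score. It is an assumption that this definition is comparable to the one used above, since the issue does not say how the F-score was computed. The function name and the toy labels are hypothetical, and this is Python rather than contiBAIT's R code.

```python
from itertools import combinations


def pairwise_f_score(truth, predicted):
    """Pairwise F-score between a reference labeling and a clustering.

    Each pair of items is a "positive" if the clustering puts both items
    in the same cluster, and it is "true" if the reference agrees.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = predicted[i] == predicted[j]
        if same_pred and same_truth:
            tp += 1  # pair correctly grouped together
        elif same_pred:
            fp += 1  # pair grouped together but should be apart
        elif same_truth:
            fn += 1  # pair split apart but should be together

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: four contigs from two chromosomes.
truth = ["chr1", "chr1", "chr2", "chr2"]
perfect = pairwise_f_score(truth, [0, 0, 1, 1])
merged = pairwise_f_score(truth, [0, 0, 0, 1])
```

The score only depends on which items share a cluster, so cluster names do not need to match the reference labels, which makes it usable when comparing runs with different numbers of clusters.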

Before discarding the changes altogether, I would like to test on a data set that is harder to cluster. This would require data from @rareaquaticbadger that does not cluster well. I believe the TDEV data should fit the bill, but I do not have the input to the first call of clusterContigs from this set.

I don't expect cluster::pam to be better at this point, but it would be nice to confirm on a messier data set. I feel like contiBAIT's Three Star process is pretty similar to doing one iteration of k-means with a sliding number of clusters. Given @matthewborkowski's performance improvements to clustering, it should be possible to do multiple iterations after the first pass in a fashion similar to k-means.
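
To make the "multiple iterations after the first pass" idea concrete, here is a rough sketch of k-means-style refinement starting from an existing assignment. This is not contiBAIT's implementation: the numeric rows, the squared Euclidean distance, and the function name `refine_clusters` are all assumptions made for illustration, and it is written in Python rather than R.

```python
def refine_clusters(rows, labels, max_iter=10):
    """Refine an initial clustering with k-means-style reassignment.

    rows   -- list of equal-length numeric vectors, one per item
    labels -- initial cluster label per item (e.g. from a first pass)
    """
    labels = list(labels)
    for _ in range(max_iter):
        # Group rows by their current label.
        groups = {}
        for row, lab in zip(rows, labels):
            groups.setdefault(lab, []).append(row)

        # Centroid of each cluster is the column-wise mean of its members.
        centroids = {
            lab: [sum(col) / len(members) for col in zip(*members)]
            for lab, members in groups.items()
        }

        # Reassign every row to the nearest centroid. A cluster that
        # loses all of its members drops out, so the count can shrink.
        new_labels = [
            min(
                centroids,
                key=lambda lab: sum(
                    (x - c) ** 2 for x, c in zip(row, centroids[lab])
                ),
            )
            for row in rows
        ]

        if new_labels == labels:
            break  # assignment is stable
        labels = new_labels
    return labels


# Hypothetical toy data: the last row starts in the wrong cluster.
rows = [[0, 0], [0, 1], [10, 10], [10, 11], [0, 0.5]]
refined = refine_clusters(rows, [0, 0, 1, 1, 1])
```

Each pass costs one centroid computation plus one distance evaluation per item per cluster, so whether several passes are practical depends on how fast the distance step is.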
