Add ordinations (MDS and PCA) #230

Merged
merged 9 commits into from
Oct 21, 2019
kescobo
Contributor

@kescobo kescobo commented May 22, 2019

EDIT: Hold off on merging until there is a tagged version of MultivariateStats that includes JuliaStats/MultivariateStats.jl#85.

Unfortunately, the interface is not super generic across ordinations, but here at least is PCA:

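```julia
using MultivariateStats, RDatasets, StatsPlots

# Load iris and take the four numeric columns as a matrix
iris = dataset("datasets", "iris")
Xtr = convert(Matrix, iris[:, 1:4])
# Fit a PCA with at most two output dimensions
M = fit(PCA, Xtr; maxoutdim=2)

# Plot the fitted model, grouping points by species
plot(M, group=iris.Species)
```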

[Image: pca1]

This and PCoA are the only ones I'm familiar with. I could try some others, but I'm not confident I'd do them correctly.

@kescobo
Contributor Author

kescobo commented May 22, 2019

Maybe don't merge until we get an answer about the aspect ratio thing. Also, you might want to revert that previous one, since it might break things if folks don't have MultivariateStats.jl master. I can put the MDS (and any other ordinations you think I should try) here.

@mkborregaard
Member

Good point - this is awesome!

@mkborregaard mkborregaard self-assigned this May 22, 2019
@mkborregaard
Member

I've reverted the MDS PR, so if you rebase this and keep that code when resolving the conflict it should work cleanly I guess?

@kescobo
Contributor Author

kescobo commented Jun 21, 2019

Link to discourse post about aspect ratio.

@kescobo
Contributor Author

kescobo commented Jun 21, 2019

I'm unclear if I should stop using projection for PCA as well.

@kescobo
Contributor Author

kescobo commented Jun 21, 2019

According to my labmate Siyuan, the axes of the PCA should also be scaled, but I'm running into issues. Something is clearly wrong with using the projection: a quick Google search for PCA on iris turns up plots that all have the opposite orientation to my example above (and so does the MultivariateStats example):

[Image: Screen Shot 2019-06-21 at 3 34 54 PM]

@mkborregaard
Member

There's been some discussion on rows vs columns for input data in MultivariateStats - is this related to that?

@kescobo
Contributor Author

kescobo commented Jun 22, 2019

It's related, but it's not only that. At the moment, one can get the scaled axes by using transform(pca, matrix), but not from the PCA type alone. At least not as far as I can tell.
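For anyone following along, here is a minimal sketch of that distinction. It uses only Julia's standard library, not MultivariateStats itself, and the toy matrix and every variable name are made up for illustration. It assumes the column-per-observation layout, which as far as I understand is what MultivariateStats expects. The point is that the eigenvectors alone are unit-length directions, while the plotted coordinates only exist once the data is projected onto them.

```julia
using LinearAlgebra, Statistics

# Hypothetical toy data: 3 variables (rows) x 6 observations (columns)
X = [2.0 4.0 6.0 8.0 10.0 12.0;
     1.0 3.0 2.0 5.0  4.0  6.0;
     0.5 0.1 0.4 0.2  0.6  0.3]

μ  = mean(X, dims=2)                 # per-variable means
Xc = X .- μ                          # centered data
C  = Xc * Xc' / (size(X, 2) - 1)     # sample covariance matrix (3x3)

# Eigendecomposition; sort so the largest eigenvalue comes first
E     = eigen(Symmetric(C))
order = sortperm(E.values, rev=true)
λ     = E.values[order][1:2]         # variances of the first two components
P     = E.vectors[:, order][:, 1:2]  # unit-length eigenvectors (3x2)

# Coordinates of the observations: these require the data, not just P
scores = P' * Xc                     # 2x6

# The spread of each score axis equals the matching eigenvalue
@assert vec(var(scores, dims=2)) ≈ λ
```

So a plot recipe that only sees `P` has directions but no per-sample coordinates and no axis scale, which matches what I'm running into.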

@mkborregaard
Member

I've reached the point where I'm happy to accept anything you find best :-)

@kescobo
Contributor Author

kescobo commented Jun 22, 2019

Lol, K. I think it's going to require MultivariateStats exposing eigenvectors of the PCA. Since we need to wait for a release before this gets merged anyway, I'll just wait until then before attempting to fix. Will let you know.

@asinghvi17
Member

Bump :)

@kescobo
Contributor Author

kescobo commented Aug 15, 2019

@asinghvi17 I think you need to bump MultivariateStats - this depends on a PR that's merged in master over there but doesn't have a release yet.

@kescobo
Contributor Author

kescobo commented Sep 13, 2019

The change to MDS has finally been merged, so this can move forward. Unfortunately, I'm stuck with PCA. The MDS works great:

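```julia
using MultivariateStats, RDatasets, StatsPlots

# Load iris and take the four numeric columns as a matrix
iris = dataset("datasets", "iris")
X = convert(Matrix, iris[:, 1:4])
# Fit MDS on the transposed matrix (one column per sample)
M = fit(MDS, X'; maxoutdim=2)

# Plot the fitted model, grouping points by species
plot(M, group=iris.Species)
```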

[Image: mds]

Unfortunately, PCA isn't working so well:

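```julia
using MultivariateStats, RDatasets, StatsPlots

# Load iris and take the four numeric columns as a matrix
iris = dataset("datasets", "iris")
X = convert(Matrix, iris[:, 1:4])
# Fit a PCA with at most two output dimensions
M = fit(PCA, X; maxoutdim=2)

# Plot the fitted model, grouping points by species
plot(M, group=iris.Species)
```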

[Image: pca]

I think this is because the eigenvectors (acquired using transform(pca)) are supposed to be scaled by the eigenvalues, which I don't think come along for the ride in the PCA object.
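In symbols, under the usual textbook PCA definitions (this is my own notation, not anything taken from the MultivariateStats source), with observations $x_i$, mean $\mu$, and eigenvectors in the columns of $P$:

$$
C = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\mu)(x_i-\mu)^\top = P \Lambda P^\top,
\qquad
y_i = P^\top (x_i-\mu),
\qquad
\operatorname{Cov}(y) = P^\top C P = \Lambda
$$

So score axis $k$ has variance $\lambda_k$, while the column $p_k$ on its own has unit length and carries no scale. If I have this right, a scaled version of the eigenvector would be something like $p_k\sqrt{\lambda_k}$, which needs the eigenvalues as well as $P$.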

I'm wondering if this PR should go back to being just about MDS, and leave PCA for someone a bit more knowledgeable about linear algebra...

@kescobo
Contributor Author

kescobo commented Oct 19, 2019

I think this is ready to go as well. I removed the PCA recipe, but will refactor once JuliaStats/MultivariateStats.jl#109 is worked out.

@mkborregaard
Member

Thanks @kescobo. And I see we went with aspect_ratio = 1 in the end 😂

@mkborregaard mkborregaard merged commit e8e074c into JuliaPlots:master Oct 21, 2019
@kescobo
Contributor Author

kescobo commented Oct 21, 2019

Yeah, you bullied me into it 😛
