
likelihood gradient acceleration computing from Sigma matrix #29

Merged: 44 commits, Dec 9, 2021

Conversation

@YuuuXie (Collaborator) commented Dec 9, 2021

This pull request adds a function that computes the likelihood gradient from the Sigma matrix expression instead of from Qff. It needs to be merged after #27.


The compute_likelihood_gradient method constructs the Qff matrix and inverts it via QR decomposition, at a cost of N_full^3. In contrast, compute_likelihood_stable uses the Sigma matrix, which avoids forming Qff.

I added the method compute_likelihood_gradient_stable, which computes the gradient of compute_likelihood_stable. The most expensive step when using compute_likelihood_gradient_stable is now the QR decomposition of the matrix A to obtain Sigma, which costs N_full * N_sparse^2. The speedup and memory saving are therefore both (N_full / N_sparse)^2.
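To make the two cost regimes concrete, here is a minimal NumPy sketch (illustrative only, not the flare implementation; the kernel, problem sizes, and variable names are assumptions) that evaluates the same sparse-GP log marginal likelihood once by forming and factoring Qff directly and once through the Sigma matrix via the Woodbury and determinant identities. flare obtains Sigma from a QR decomposition of the stacked matrix A for numerical stability; the sketch uses direct solves for brevity.

```python
import numpy as np

# Illustrative sketch only: route (1) forms Qff explicitly, O(N_full^3) time and
# O(N_full^2) memory; route (2) only ever factors an N_sparse x N_sparse matrix,
# so the heavy work scales as O(N_full * N_sparse^2).

rng = np.random.default_rng(0)
n_full, n_sparse = 1200, 60
X = rng.standard_normal((n_full, 3))       # "full" training points (assumed data)
Z = X[:n_sparse]                           # sparse (inducing) environments
y = rng.standard_normal(n_full)            # labels
sigma_n = 0.1                              # label noise

def rbf(A, B):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2)

Kuu = rbf(Z, Z) + 1e-8 * np.eye(n_sparse)  # jitter for numerical stability
Kuf = rbf(Z, X)

# Route 1: build Qff = Kfu Kuu^{-1} Kuf + sigma_n^2 I and factor it.
Qff = Kuf.T @ np.linalg.solve(Kuu, Kuf) + sigma_n**2 * np.eye(n_full)
L = np.linalg.cholesky(Qff)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # Qff^{-1} y
like_qff = -0.5 * (y @ alpha
                   + 2.0 * np.log(np.diag(L)).sum()
                   + n_full * np.log(2.0 * np.pi))

# Route 2: work with Sigma^{-1} = Kuu + Kuf Kfu / sigma_n^2 (N_sparse x N_sparse)
# and recover the quadratic term and log-determinant of Qff without forming it.
Sigma_inv = Kuu + (Kuf @ Kuf.T) / sigma_n**2
Kfy = Kuf @ y
quad = (y @ y - Kfy @ np.linalg.solve(Sigma_inv, Kfy) / sigma_n**2) / sigma_n**2
logdet = (np.linalg.slogdet(Sigma_inv)[1]
          - np.linalg.slogdet(Kuu)[1]
          + 2.0 * n_full * np.log(sigma_n))
like_sigma = -0.5 * (quad + logdet + n_full * np.log(2.0 * np.pi))

assert np.isclose(like_qff, like_sigma)    # identical up to round-off
```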

Notes:

For the NormalizedDotProduct kernel, the computational cost can be reduced a bit further by exploiting the fact that Kuf_grad differs from Kuf only by a constant factor. For this kernel, some results can therefore be precomputed, stored, and reused throughout training instead of being recomputed; for this reason, precompute... options have been added to some functions. This part of the code might be reorganized later for cleanliness. A minimal sketch of the idea is given below.
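The sketch below is illustrative only (not the flare API; the kernel form, power, and names are assumptions): since the derivative of Kuf with respect to the signal variance is just a rescaling of Kuf, any product that is linear in Kuf_grad can be cached once per training-set update and rescaled at each hyperparameter step.

```python
import numpy as np

# Illustrative precompute trick for a normalized-dot-product kernel
#   k(x, y) = sigma**2 * (x.y / (|x| |y|))**p,
# whose gradient w.r.t. sigma is dK/dsigma = (2 / sigma) * K, i.e. Kuf_grad is
# only a constant multiple of Kuf.

def normalized_dot_kernel(A, B, sigma, power=2):
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    Bn = B / np.linalg.norm(B, axis=1, keepdims=True)
    return sigma**2 * (An @ Bn.T) ** power

rng = np.random.default_rng(1)
Z = rng.standard_normal((20, 5))           # sparse environments (assumed data)
X = rng.standard_normal((200, 5))          # full training points
sigma = 1.5

Kuf = normalized_dot_kernel(Z, X, sigma)
Kuf_grad = (2.0 / sigma) * Kuf             # no new kernel evaluation needed

# A term like Kuf_grad @ Kuf.T can then be obtained by rescaling a cached product:
cached = Kuf @ Kuf.T                        # precomputed once, reused every step
assert np.allclose(Kuf_grad @ Kuf.T, (2.0 / sigma) * cached)
```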

Below is a timing report from Cameron (on the warm & melt Au dataset), where tot_time is the total time for hyperparameter optimization. Each frame selects 5 sparse environments.

The comparison is not exact, because different training sizes require different numbers of optimization iterations, but the speedup is still significant.

Todo

The likelihood scales with n_labels; maybe we should normalize it by n_labels? A rough sketch of what that could look like is below.
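For concreteness, such a normalization (purely hypothetical here, not part of this PR; the variable names are placeholders) would just be:

```python
# Hypothetical normalization, not implemented in this PR: divide the log
# likelihood and its gradient by the number of labels so that values are
# comparable across training-set sizes.
like_normalized = like / n_labels
like_grad_normalized = like_grad / n_labels
```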

@YuuuXie YuuuXie merged commit b940d39 into master Dec 9, 2021
@YuuuXie YuuuXie deleted the likegrad branch December 13, 2021 15:20