Changes from Cole Trapnell #110
Conversation
…can be used e.g. in bootstrap/jackknife
…arnings about ineffiecient access
…ass for PLNnetworkfit
Some very minor comments which need quick explanation before approval.
@@ -105,7 +105,7 @@ PLN_param <- function(
  Omega = NULL,
  config_post = list(),
  config_optim = list(),
- inception = NULL # pretrained PLNfit used as initialization
+ inception = NULL # pretrained PLNfit used as initialization,
Why the additional comma if it's the last element of the list?
oops :)
# CHECK_ME_TORCH_GPU
# This appears to be in torch_gpu only. The commented out line below is
# in both PLNmodels/master and PLNmodels/dev.
myPLN <- switch(control$covariance,
Not sure why one would need to init with a diagonal or spherical PLN covariance model, when the goal is to fit a PLNnetwork model, that is, one with sparse precision/covariance matrix... But if used for completeness/compatibility, that is fine.
I have exactly the same question. Diagonal covariance matrix would result in an empty graph (with only isolated nodes).
@maddyduran and I seem to remember issues when initializing PLNnetwork when you have fewer samples than species - if you use "fixed" there's a call to solve() that can throw an exception. We figured that "spherical" or diagonal would avoid this (skipping that call) and move on to estimating a penalized inverse covariance matrix even with a spherical or diagonal init.
Thank you very much for the nice PR and many changes, including the brand new jackknife estimator for the B matrix.
Like @jchiquet I have a few comments before approving the PR. The main one concerns the changes to PLNnetwork, which I'm not sure are so useful: since a PLNnetworkfit is always an object of class PLNfit_fixedcov, there is limited use in passing a different variance structure in the control list (and it increases the potential for mistakes).
@@ -107,6 +108,11 @@ trace <- function(x) sum(diag(x))
  x
}

.logfactorial_torch <- function(n){
  n[n == 0] <- 1 ## 0! = 1!
  n*torch_log(n) - n + torch_log(8*torch_pow(n,3) + 4*torch_pow(n,2) + n + 1/30)/6 + log(pi)/2
For .logfactorial_torch(), shouldn't the final term log(pi)/2 be torch_log(pi)/2?
Yes, though I'm not actually sure what would happen under the hood here in terms of evaluation.
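As a sanity check on the formula under discussion, here is a base-R sketch (no torch; the function name is mine, not the package's) of the same Ramanujan-style approximation, compared against R's exact lfactorial(). Since pi is a plain scalar, log(pi)/2 is a numeric constant either way; the question above is only whether it should be evaluated as a tensor operation.

```r
# Base-R version of the Ramanujan-style approximation used by
# .logfactorial_torch(); illustrative only.
logfactorial_approx <- function(n) {
  n[n == 0] <- 1  # 0! = 1!, so substituting n = 1 is exact
  n * log(n) - n +
    log(8 * n^3 + 4 * n^2 + n + 1 / 30) / 6 +
    log(pi) / 2
}

# Compare against the exact value; the worst case is n = 1
max(abs(logfactorial_approx(0:20) - lfactorial(0:20)))
```

The absolute error is already below 1e-3 at n = 1 and shrinks rapidly with n, which is why the approximation is safe for Poisson counts.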
.5 * torch_sum(torch_mm(params$M, params$Omega) * params$M + S2 * torch_diag(params$Omega), dim = 2)
Ji_tmp = Ji_tmp$cpu()
Ji_tmp = as.numeric(Ji_tmp)
Ji <- .5 * self$p - rowSums(.logfactorial(as.matrix(data$Y$cpu()))) + Ji_tmp
Why not use .logfactorial_torch() instead of .logfactorial()? Because the computation has already moved back to the CPU at this point?
Good question. Perhaps it would be better to defer that step so we can use .logfactorial_torch() instead.
@@ -156,7 +159,7 @@ PLNfit <- R6Class(

  ## Check for convergence
  if (delta_f < config$ftol_rel) status <- 3
- if (delta_x < config$xtol_rel) status <- 4
+ #if (delta_x < config$xtol_rel) status <- 4
Is there a reason to remove the convergence check on the parameter values and keep only the one on the ELBO value? This will speed up the algorithm (fewer conditions to satisfy) but may cause the nlopt and torch implementations to diverge in the results they produce (not a bad thing per se, but something we need to be aware of).
I defer to you on that - we have been generally happy with larger default values of ftol_rel, triggering convergence on status 3 most of the time.
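To make the trade-off concrete, here is a minimal sketch of how the two stopping rules interact. The names follow the nlopt-style ftol_rel / xtol_rel conventions and the status codes mentioned above; this is an illustration, not the package's actual optimization loop.

```r
# Illustrative convergence test: status 3 = objective (ELBO) stalled,
# status 4 = parameters stalled, 0 = keep iterating.
convergence_status <- function(f_old, f_new, x_old, x_new,
                               ftol_rel = 1e-8, xtol_rel = 1e-6) {
  delta_f <- abs(f_new - f_old) / abs(f_old)
  delta_x <- max(abs(x_new - x_old)) / max(abs(x_old))
  if (delta_f < ftol_rel) return(3)
  if (delta_x < xtol_rel) return(4)
  0
}

convergence_status(100, 100 + 1e-7, 1, 2)             # objective stalled: 3
convergence_status(100, 90, c(1, 1), c(1, 1 + 1e-8))  # parameters stalled: 4
```

Dropping the delta_x test removes one way to stop, so runs that previously ended with status 4 keep iterating until the ELBO itself stalls.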
@@ -217,6 +220,54 @@ PLNfit <- R6Class(
  invisible(list(var_B = var_B, var_Omega = var_Omega))
},

compute_vcov_from_resamples = function(resamples){
  # compute the covariance of the parameters
  get_cov_mat = function(data, cell_group) {
I may be mistaken, but get_cov_mat() appears to be defined but never used anywhere in compute_vcov_from_resamples(). Is it necessary?
Vestigial code from debugging - can be removed
- nullModel <- nullModelPoisson(self$responses, self$covariates, self$offsets, self$weights)
#' @param config_post a list for controlling the post-treatment.
postTreatment = function(config_post, config_optim) {
+ #nullModel <- nullModelPoisson(self$responses, self$covariates, self$offsets, self$weights)
There is no longer a comparison to a null Poisson model in the general post-treatment of PLN families. Is it to improve speed, or because it's not required for the post-treatments?
In our applications, the call to glm.fit inside of nullModelPoisson() can sometimes throw an exception, which ruins a perfectly good PLN fit! Since it seemed to us that nullModelPoisson() was something one only needed to do when (optionally) computing the approximate R2, we thought this call was superfluous.
variance_jackknife = function(Y, X, O, w, config = config_default_nlopt) {
- jacks <- future.apply::future_lapply(seq_len(self$n), function(i) {
+ jacks <- lapply(seq_len(self$n), function(i) {
I have no strong opinion on this one, as @jchiquet was the one who wrote this part, but is there a reason to prefer lapply to future_lapply (one less dependency? simpler to use?). A nice thing about future_lapply is that it is backend-agnostic and can be used with several parallelization paradigms.
Ah, yes, agree it would be wonderful to keep this. The issue is that on machines that use OpenBLAS with a multithreaded backend, using future can deadlock the session. A workaround is to wrap calls to future with something like this:
old_omp_num_threads = as.numeric(Sys.getenv("OMP_NUM_THREADS"))
if (is.na(old_omp_num_threads)) {
  old_omp_num_threads = 1
}
RhpcBLASctl::omp_set_num_threads(1)
old_blas_num_threads = as.numeric(Sys.getenv("OPENBLAS_NUM_THREADS"))
if (is.na(old_blas_num_threads)) {
  old_blas_num_threads = 1
}
RhpcBLASctl::blas_set_num_threads(1)
Then you do work with future and then:
RhpcBLASctl::omp_set_num_threads(old_omp_num_threads)
RhpcBLASctl::blas_set_num_threads(old_blas_num_threads)
We didn't add this because we didn't want to add a new dependency on RhpcBLASctl to the package, but you could do so if you want to be able to do linear algebra inside of functions called by future.
data <- list(Y = Y[resample, , drop = FALSE],
             X = X[resample, , drop = FALSE],
             O = O[resample, , drop = FALSE],
             w = w[resample])
#print (config$torch_device)
#print (config)
if (config$algorithm %in% c("RPROP", "RMSPROP", "ADAM", "ADAGRAD")) # hack, to know if we're doing torch or not
Should we add a backend element (set to the appropriate value) to config_default_nlopt and config_default_torch, so that they are self-aware and we can use if (config$backend == "torch") {...}?
That would be better. It would also be good to be able to specify the torch device (e.g. "mps", "cuda", etc.).
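A minimal sketch of that suggestion. The list names mirror the package's config_default_nlopt / config_default_torch, but their contents here are illustrative assumptions, not the actual defaults.

```r
# Self-describing configs: the backend (and, for torch, the device) is stated
# explicitly instead of being inferred from the algorithm name.
config_default_nlopt <- list(backend = "nlopt", algorithm = "CCSAQ")
config_default_torch <- list(backend = "torch", algorithm = "RPROP",
                             device = "cpu")  # or "cuda", "mps", ...

optimizer_branch <- function(config) {
  if (config$backend == "torch") "torch path" else "nlopt path"
}

optimizer_branch(config_default_torch)
```

This replaces the fragile membership test on algorithm names with a single explicit field, and leaves room for new torch algorithms without touching the dispatch logic.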
args <- list(data = data,
             params = list(B = private$B, M = matrix(0,self$n,self$p), S = private$S[resample, ]),
             config = config)
if (config$algorithm %in% c("RPROP", "RMSPROP", "ADAM", "ADAGRAD")) # hack, to know if we're doing torch or not
Same as previous comment
@@ -72,18 +73,25 @@ PLNnetwork <- function(formula, data, subset, weights, penalties = NULL, control
#' @seealso [PLN_param()]
#' @export
PLNnetwork_param <- function(
- backend = "nlopt",
+ backend = c("nlopt", "torch"),
  covariance = c("fixed", "spherical", "diagonal"),
I'm mixed about this: to avoid potential problems, it would be better to allow only covariance = "fixed" (especially since f801ca6 reverts back to PLNnetworkfit <- R6Class(inherit = PLNfit_fixedcov)).
@maddyduran and I discussed this and we think the use case for covariance = "spherical" or "diagonal" in PLNnetwork is when you have fewer samples than you have species. IIRC, the issue was the call to solve() in the "fixed" optimize() function.
@mahendra-mariadassou thanks for the question. I replied to this question in the wrong place yesterday, but I think the answer is that the use case for covariance = "spherical" or "diagonal" in PLNnetwork is when you have fewer samples than species. IIRC, the issue was the call to solve() in the "fixed" optimize() function. If n < p, that solve can throw an exception, never making it to the sparse estimation (which can deal with the n < p case, at least insofar as returning an answer).
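The n < p failure mode described above is easy to reproduce outside the package: with fewer samples than variables, the empirical covariance is rank-deficient, so a direct solve() fails before any sparse estimation could take over. A minimal sketch (synthetic data, not the package's internals):

```r
# n = 5 samples, p = 10 variables: cov(Y) has rank at most n - 1 = 4,
# so the 10 x 10 covariance is singular and cannot be inverted directly.
set.seed(1)
Y <- matrix(rpois(5 * 10, lambda = 3), nrow = 5, ncol = 10)
S <- cov(Y)
res <- try(solve(S), silent = TRUE)
inherits(res, "try-error")  # TRUE: the inversion fails
```

A penalized (e.g. graphical-lasso-style) precision estimate sidesteps this because the penalty regularizes the problem, which is consistent with the rationale given above for allowing a spherical or diagonal initialization.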
Hi, we made another pass with @jchiquet and made a few suggestions on a new tweaks branch. Since I'm not a GitHub expert, I branched tweaks from PLN-team:master, pulled cole-trapnell-lab:master into it, and then pushed a few commits (so that it has our suggestions on top of your PR). Here are the suggestions:
You can pull PLN-team:tweaks into your master and make additional changes if you want; this should automatically update this PR.
To complete @mahendra-mariadassou's answer regarding lapply vs future_lapply: I suggest, as Mahendra said, that we open a specific issue since it concerns all parallelizable actions. Like Cole, I had noted that using future with a multi-threaded BLAS/LAPACK library (such as OpenBLAS) could have detrimental consequences. I therefore suggest defining a lapply wrapper which, depending on the architecture in place, dispatches to a classic or multicore lapply; we should check whether future is capable of this (via its 'sequential' or 'multicore' plans). This is now referenced as Issue #111.
Hi PLN-team. Thanks for creating this. We have pulled this branch into our fork, merged in the changes from the main fork's master, and run our tests. All looks great to us!
Hi @ctrapnell. You probably received an automatic email from GitHub, but since I didn't use a standard workflow, just in case: I merged all your changes (amended with our changes in PLN-team:tweaks) into master, which automatically closed the PR. Thanks again for the PR, the various changes, and more generally your interest in PLNmodels!
Thank you all very much. I'm preparing a version for CRAN today, now that Mahendra has done most of the work. |
@ctrapnell, @maddyduran and @brgew we'd like to add you to the list of package contributors, do you agree? If so, I'll need an e-mail address for each of you (I get [email protected], [email protected] and [email protected], is this correct?)
Thank you @brgew for your answer! I assume then that @maddyduran and @ctrapnell, who actually contributed to the PR, would like to be added as contributors. |
There are essentially three groups of changes (in order of most important to least important for our lab):