Precisely, Kmax is the maximum number of factors added during the initialization step, where we fit flash with a point-Laplace prior. We then apply a nonnegative transform that splits each factor into two. Because we always get a nonnegative intercept/baseline factor at this stage, at most 2*Kmax - 1 factors enter the next step, where we fit flash with a GB prior and improve the fit using backfit.
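To make the counting concrete, here is a minimal arithmetic sketch of that upper bound. The function name `max_factors_entering_gb_fit` is hypothetical and is not part of flash or any package; it only encodes the 2*Kmax - 1 figure stated above.

```python
def max_factors_entering_gb_fit(kmax: int) -> int:
    """Upper bound on the number of factors passed to the GB-prior fit.

    Hypothetical helper for illustration only: it restates the
    2*Kmax - 1 bound described above, where the initialization adds
    at most Kmax factors, each is split into two, and the split always
    yields one nonnegative intercept/baseline factor.
    """
    if kmax < 1:
        raise ValueError("kmax must be a positive integer")
    return 2 * kmax - 1


if __name__ == "__main__":
    for kmax in (1, 5, 10):
        print(kmax, max_factors_entering_gb_fit(kmax))
```

This is only a bound on what enters the GB-prior fit; it says nothing about how many factors the initialization actually adds for a given dataset.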
After fitting flash with the GB prior, there is an additional step that filters out those k's for which l_k and \tilde{l}_k are not consistent (by default, those with correlation < 0.8), so the function output can contain fewer than 2*Kmax - 1 factors.
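The filtering rule can be sketched roughly as follows. This is an illustration, not the package's implementation: the names `pearson` and `consistent_factors` are hypothetical, Pearson correlation is assumed as the correlation measure, and treating a constant column as having correlation 0 is my own choice for the sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumption for this sketch: a constant sequence gets correlation 0.0.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def consistent_factors(l_cols, l_tilde_cols, threshold=0.8):
    """Return the indices k whose l_k and l~_k correlate at >= threshold.

    l_cols and l_tilde_cols are lists of columns (one list per factor).
    Factors with correlation below the threshold are dropped, mirroring
    the "correlation < 0.8" rule described above.
    """
    return [
        k
        for k, (l_k, lt_k) in enumerate(zip(l_cols, l_tilde_cols))
        if pearson(l_k, lt_k) >= threshold
    ]


if __name__ == "__main__":
    l_cols = [[0, 1, 2, 3], [1, 0, 1, 0]]
    l_tilde_cols = [[0, 1, 2, 4], [0, 1, 0, 1]]
    print(consistent_factors(l_cols, l_tilde_cols))
```

In the toy input, the first pair of columns is nearly collinear and is kept, while the second pair is perfectly anti-correlated and is dropped.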
If the underlying structure in the data requires far fewer than 2*Kmax - 1 factors to explain, we will find many k's for which l_k and \tilde{l}_k are not consistent, and these will be removed during this step. On the other hand, if the specified Kmax is smaller than needed, this final step will filter out no or very few factors, leaving almost all of the 2*Kmax - 1 factors in the function output.
So the resulting number of factors will be at most 2*Kmax - 1, but it can be much smaller than that. The difference depends on the relationship between the specified Kmax and the underlying "true" number of factors needed to explain the data structure (again, this is based entirely on my empirical experience with real datasets). I think I made it clear that Kmax should be interpreted as an approximation of the final K we will get; maybe we can also say that the final K is up to 2*Kmax - 1?
From @YushaLiu: