-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redesign of MCSE #63
Redesign of MCSE #63
Conversation
Pull Request Test Coverage Report for Build 3939266728Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Codecov ReportBase: 95.76% // Head: 96.16% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #63 +/- ##
==========================================
+ Coverage 95.76% 96.16% +0.40%
==========================================
Files 10 10
Lines 732 757 +25
==========================================
+ Hits 701 728 +27
+ Misses 31 29 -2
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
src/mcse.jl
Outdated
T = promote_type(eltype(x), typeof(zero(eltype(x)) / 1)) | ||
values = similar(x, T, (axes(x, 3),)) | ||
for (i, xi) in zip(eachindex(values), eachslice(x; dims=3)) | ||
values[i] = _mcse_sbm(f, vec(xi); batch_size=batch_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't seen in the literature a study of how best to combine MCMC chains when estimating MCSE with SBM. I benchmarked a few alternatives and found this one underestimated the MCSE the least: arviz-devs/arviz#1974 (comment)
for size in (51, 75, 100, 153) | ||
@test_logs (:warn,) mcse(samples; method=:bm, size=size) | ||
@testset "mcse.jl" begin | ||
@testset "estimand is within interval defined by MCSE estimate" begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For mean
, std
, and quantile
, these tests are somewhat redundant with those for ess_rhat
, since the only difference is that for ess_rhat
we compute the exact asymptotic variance of the estimator since we know the stationary distribution, whereas for mcse
we estimate the asymptotic variance of the estimator. If we want to speed up our test suite, it may be better to remove the corresponding ess_rhat
tests and use these as integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opted to speed up our test suite by decreasing the number of replicates (params) for each suite to 100 instead of 1,000.
src/mcse.jl
Outdated
SBM tends to underestimate the MCSE, especially for highly autocorrelated chains. | ||
SBM should only be used as a fallbeck when a specific [`mcse`](@ref) method for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This begs the question whether we should include mcse_sbm
at all. On the one hand it's useful to have a fallback, and I don't know of any other general-purpose methods for estimating MCSE. On the other hand, we would prefer a fallback that errs on the side of overestimating MCSE instead of overestimating.
I lean towards including it with this caveat clearly documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we only want it to be used as a fallback, maybe we do not even want to add another function for it but also call it mcse
? And just document there that it is used as a fallback but one should be aware of its limitations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, it's useful to have it as a standalone method so if we want, we can compare its results with those of the specific estimators. But we can make the function internal and copy the documentation to mcse
.
@@ -480,7 +482,7 @@ function _expectand_proxy(::typeof(StatsBase.mad), x) | |||
return _expectand_proxy(Statistics.median, x_folded) | |||
end | |||
function _expectand_proxy(f::Base.Fix2{typeof(Statistics.quantile),<:Real}, x) | |||
y = similar(x, Bool) | |||
y = similar(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Necessary because otherwise our transformation discards the type of x
, so a Float32
eltype x
will get a Float64
ESS/R-hat/MCSE.
improves the quality of the estimates and reduces random failures
src/mcse.jl
Outdated
SBM tends to underestimate the MCSE, especially for highly autocorrelated chains. | ||
SBM should only be used as a fallbeck when a specific [`mcse`](@ref) method for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we only want it to be used as a fallback, maybe we do not even want to add another function for it but also call it mcse
? And just document there that it is used as a fallback but one should be aware of its limitations.
Co-authored-by: David Widmann <[email protected]>
Following #39 (comment) and #53, this PR redesigns
mcse
:mcse
now requires the user provide an estimator to make it less ambiguous what is estimated.mcse
have been removed and replaced with 3d array methods similar toess_rhat
, which also allow for eltypes to includeMissing
. This allows MCSEs to be computed for many parameters with one function call and also allows for improved ESS estimates when multiple chains are providedmean
,std
, andquantile
using the ESS methodsmean
. From the Geyer reference in the docstrings, IMSE is better than IPSE, and both are better than batch-means, which tends to underestimate MCSE more severely. IMSE is used in theess_rhat(::typeof(mean), x)
call used in the newess_rhat
implementations.Marked as a draft for now until I add tests.