-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redesign of MCSE #63
Merged
Merged
Redesign of MCSE #63
Changes from all commits
Commits
Show all changes
49 commits
Select commit
Hold shift + click to select a range
2d82222
Add mcse_sbm
sethaxen e4b067b
Update description of `estimator`
sethaxen 34e0221
Add specialized estimators for mean, std, and quantile
sethaxen 6839cd7
Remove vector methods, defaulting to sbm
sethaxen 7dbda2d
Update docstring
sethaxen b93ce5b
Fix bugs
sethaxen 790a99b
Update docstrings
sethaxen 9a1d3d5
Update docstring
sethaxen c0d5a94
Move helper functions to own file
sethaxen cf908af
Rearrange tests
sethaxen 93d121e
Update mcse tests
sethaxen fe99356
Export mcse_sbm
sethaxen d12648b
Increment minor version number with DEV suffix
sethaxen 465afac
Merge branch 'main' into mcseupdate
sethaxen 7dac85e
Increment docs and tests version numbers
sethaxen 9407369
Add additional citation
sethaxen f95a066
Merge branch 'main' into mcseupdate
sethaxen 88b6c41
Update diagnostics to use new mcse
sethaxen 899711e
Increase tolerance of mcse tests
sethaxen 01b8dbc
Increase tolerance more
sethaxen 60d6441
Add mcse_sbm to docs
sethaxen 2441bcb
Skip high autocorrelation tests for mcse_sbm
sethaxen 8e3b06a
Note underestimation for SBM
sethaxen e7ca85a
Merge branch 'main' into mcseupdate
sethaxen 2af67e9
Update src/mcse.jl
sethaxen da4ed63
Merge branch 'main' into mcseupdate
sethaxen b7cd495
Don't enforce type
sethaxen 4d55716
Document kwargs passed to mcse
sethaxen d9f6734
Cross-link mcse and ess_rhat docstrings
sethaxen 1c48266
Document derivation of mcse for std
sethaxen f072b9e
Test type-inferrability of ess_rhat
sethaxen bb47887
Make sure ess_rhat for quantiles not promoted
sethaxen 7f61907
Make sure ess_rhat for median type-inferrable
sethaxen a03cc2a
Implement specific method for median
sethaxen bac8a3c
Return missing if any are missing
sethaxen 652b86f
Add mcse tests
sethaxen 8dbae84
Decrease the number of checks
sethaxen d9aff61
Make ESS/MCSE for median with with Union{Missing,Real}
sethaxen cced4be
Make _fold_around_median type-inferrable
sethaxen ce9d427
Increase tolerance for exhaustive tests
sethaxen 787a05f
Fix _fold_around_median
sethaxen 34f3771
Fix count of checks
sethaxen d10740a
Increase the number of draws
sethaxen cf09e4e
Apply suggestions from code review
sethaxen 39d74a9
Make sure heideldiag and gewekediag preserve input type
sethaxen a575fc8
Consistently use first and last for ess_rhat
sethaxen b4eea8d
Copy comment to _fold_around_median
sethaxen 3738347
Make mcse_sbm an internal function
sethaxen 69fe0a1
Update tests
sethaxen File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,72 +1,129 @@ | ||
const normcdf1 = 0.8413447460685429 # StatsFuns.normcdf(1) | ||
const normcdfn1 = 0.15865525393145705 # StatsFuns.normcdf(-1) | ||
|
||
""" | ||
mcse(x::AbstractVector{<:Real}; method::Symbol=:imse, kwargs...) | ||
mcse(estimator, samples::AbstractArray{<:Union{Missing,Real}}; kwargs...) | ||
|
||
Estimate the Monte Carlo standard errors (MCSE) of the `estimator` applied to `samples` of | ||
shape `(draws, chains, parameters)`. | ||
|
||
See also: [`ess_rhat`](@ref) | ||
|
||
## Estimators | ||
|
||
`estimator` must accept a vector of the same `eltype` as `samples` and return a real estimate. | ||
|
||
Compute the Monte Carlo standard error (MCSE) of samples `x`. | ||
The optional argument `method` describes how the errors are estimated. Possible options are: | ||
For the following estimators, the effective sample size [`ess_rhat`](@ref) and an estimate | ||
of the asymptotic variance are used to compute the MCSE, and `kwargs` are forwarded to | ||
`ess_rhat`: | ||
- `Statistics.mean` | ||
- `Statistics.median` | ||
- `Statistics.std` | ||
- `Base.Fix2(Statistics.quantile, p::Real)` | ||
|
||
- `:bm` for batch means [^Glynn1991] | ||
- `:imse` initial monotone sequence estimator [^Geyer1992] | ||
- `:ipse` initial positive sequence estimator [^Geyer1992] | ||
For other estimators, the subsampling bootstrap method (SBM)[^FlegalJones2011][^Flegal2012] | ||
is used as a fallback, and the only accepted `kwargs` are `batch_size`, which indicates the | ||
size of the overlapping batches used to estimate the MCSE, defaulting to | ||
`floor(Int, sqrt(draws * chains))`. Note that SBM tends to underestimate the MCSE, | ||
especially for highly autocorrelated chains. One should verify that autocorrelation is low | ||
by checking the bulk- and tail-[`ess_rhat`](@ref) values. | ||
|
||
[^Glynn1991]: Glynn, P. W., & Whitt, W. (1991). Estimating the asymptotic variance with batch means. Operations Research Letters, 10(8), 431-435. | ||
[^FlegalJones2011]: Flegal JM, Jones GL. (2011) Implementing MCMC: estimating with confidence. | ||
Handbook of Markov Chain Monte Carlo. pp. 175-97. | ||
[pdf](http://faculty.ucr.edu/~jflegal/EstimatingWithConfidence.pdf) | ||
[^Flegal2012]: Flegal JM. (2012) Applicability of subsampling bootstrap methods in Markov chain Monte Carlo. | ||
Monte Carlo and Quasi-Monte Carlo Methods 2010. pp. 363-72. | ||
doi: [10.1007/978-3-642-27440-4_18](https://doi.org/10.1007/978-3-642-27440-4_18) | ||
|
||
[^Geyer1992]: Geyer, C. J. (1992). Practical Markov Chain Monte Carlo. Statistical Science, 473-483. | ||
""" | ||
function mcse(x::AbstractVector{<:Real}; method::Symbol=:imse, kwargs...) | ||
return if method === :bm | ||
mcse_bm(x; kwargs...) | ||
elseif method === :imse | ||
mcse_imse(x) | ||
elseif method === :ipse | ||
mcse_ipse(x) | ||
else | ||
throw(ArgumentError("unsupported MCSE method $method")) | ||
mcse(f, x::AbstractArray{<:Union{Missing,Real},3}; kwargs...) = _mcse_sbm(f, x; kwargs...) | ||
function mcse( | ||
::typeof(Statistics.mean), samples::AbstractArray{<:Union{Missing,Real},3}; kwargs... | ||
) | ||
S = first(ess_rhat(Statistics.mean, samples; kwargs...)) | ||
return dropdims(Statistics.std(samples; dims=(1, 2)); dims=(1, 2)) ./ sqrt.(S) | ||
end | ||
function mcse( | ||
::typeof(Statistics.std), samples::AbstractArray{<:Union{Missing,Real},3}; kwargs... | ||
) | ||
x = (samples .- Statistics.mean(samples; dims=(1, 2))) .^ 2 # expectand proxy | ||
S = first(ess_rhat(Statistics.mean, x; kwargs...)) | ||
# asymptotic variance of sample variance estimate is Var[var] = E[μ₄] - E[var]², | ||
# where μ₄ is the 4th central moment | ||
# by the delta method, Var[std] = Var[var] / 4E[var] = (E[μ₄]/E[var] - E[var])/4, | ||
# See e.g. Chapter 3 of Van der Vaart, AW. (200) Asymptotic statistics. Vol. 3. | ||
mean_var = dropdims(Statistics.mean(x; dims=(1, 2)); dims=(1, 2)) | ||
mean_moment4 = dropdims(Statistics.mean(abs2, x; dims=(1, 2)); dims=(1, 2)) | ||
return @. sqrt((mean_moment4 / mean_var - mean_var) / S) / 2 | ||
end | ||
function mcse( | ||
f::Base.Fix2{typeof(Statistics.quantile),<:Real}, | ||
samples::AbstractArray{<:Union{Missing,Real},3}; | ||
kwargs..., | ||
) | ||
p = f.x | ||
S = first(ess_rhat(f, samples; kwargs...)) | ||
T = eltype(S) | ||
R = promote_type(eltype(samples), typeof(oneunit(eltype(samples)) / sqrt(oneunit(T)))) | ||
values = similar(S, R) | ||
for (i, xi, Si) in zip(eachindex(values), eachslice(samples; dims=3), S) | ||
values[i] = _mcse_quantile(vec(xi), p, Si) | ||
end | ||
return values | ||
end | ||
function mcse( | ||
::typeof(Statistics.median), samples::AbstractArray{<:Union{Missing,Real},3}; kwargs... | ||
) | ||
S = first(ess_rhat(Statistics.median, samples; kwargs...)) | ||
T = eltype(S) | ||
R = promote_type(eltype(samples), typeof(oneunit(eltype(samples)) / sqrt(oneunit(T)))) | ||
values = similar(S, R) | ||
for (i, xi, Si) in zip(eachindex(values), eachslice(samples; dims=3), S) | ||
values[i] = _mcse_quantile(vec(xi), 1//2, Si) | ||
end | ||
return values | ||
end | ||
|
||
function mcse_bm(x::AbstractVector{<:Real}; size::Int=floor(Int, sqrt(length(x)))) | ||
n = length(x) | ||
m = min(div(n, 2), size) | ||
m == size || @warn "batch size was reduced to $m" | ||
mcse = StatsBase.sem(Statistics.mean(@view(x[(i + 1):(i + m)])) for i in 0:m:(n - m)) | ||
return mcse | ||
function _mcse_quantile(x, p, Seff) | ||
Seff === missing && return missing | ||
S = length(x) | ||
# quantile error distribution is asymptotically normal; estimate σ (mcse) with 2 | ||
# quadrature points: xl and xu, chosen as quantiles so that xu - xl = 2σ | ||
# compute quantiles of error distribution in probability space (i.e. quantiles passed through CDF) | ||
# Beta(α,β) is the approximate error distribution of quantile estimates | ||
α = Seff * p + 1 | ||
β = Seff * (1 - p) + 1 | ||
prob_x_upper = StatsFuns.betainvcdf(α, β, normcdf1) | ||
prob_x_lower = StatsFuns.betainvcdf(α, β, normcdfn1) | ||
# use inverse ECDF to get quantiles in quantile (x) space | ||
l = max(floor(Int, prob_x_lower * S), 1) | ||
u = min(ceil(Int, prob_x_upper * S), S) | ||
iperm = partialsortperm(x, l:u) # sort as little of x as possible | ||
xl = x[first(iperm)] | ||
xu = x[last(iperm)] | ||
# estimate mcse from quantiles | ||
return (xu - xl) / 2 | ||
end | ||
|
||
function mcse_imse(x::AbstractVector{<:Real}) | ||
n = length(x) | ||
lags = [0, 1] | ||
ghat = StatsBase.autocov(x, lags) | ||
Ghat = sum(ghat) | ||
@inbounds value = Ghat + ghat[2] | ||
@inbounds for i in 2:2:(n - 2) | ||
lags[1] = i | ||
lags[2] = i + 1 | ||
StatsBase.autocov!(ghat, x, lags) | ||
Ghat = min(Ghat, sum(ghat)) | ||
Ghat > 0 || break | ||
value += 2 * Ghat | ||
function _mcse_sbm( | ||
f, | ||
x::AbstractArray{<:Union{Missing,Real},3}; | ||
batch_size::Int=floor(Int, sqrt(size(x, 1) * size(x, 2))), | ||
) | ||
T = promote_type(eltype(x), typeof(zero(eltype(x)) / 1)) | ||
values = similar(x, T, (axes(x, 3),)) | ||
for (i, xi) in zip(eachindex(values), eachslice(x; dims=3)) | ||
values[i] = _mcse_sbm(f, vec(xi), batch_size) | ||
end | ||
|
||
mcse = sqrt(value / n) | ||
|
||
return mcse | ||
return values | ||
end | ||
|
||
function mcse_ipse(x::AbstractVector{<:Real}) | ||
function _mcse_sbm(f, x, batch_size) | ||
any(x -> x === missing, x) && return missing | ||
n = length(x) | ||
lags = [0, 1] | ||
ghat = StatsBase.autocov(x, lags) | ||
@inbounds value = ghat[1] + 2 * ghat[2] | ||
@inbounds for i in 2:2:(n - 2) | ||
lags[1] = i | ||
lags[2] = i + 1 | ||
StatsBase.autocov!(ghat, x, lags) | ||
Ghat = sum(ghat) | ||
Ghat > 0 || break | ||
value += 2 * Ghat | ||
end | ||
|
||
mcse = sqrt(value / n) | ||
|
||
return mcse | ||
i1 = firstindex(x) | ||
v = Statistics.var( | ||
f(view(x, i:(i + batch_size - 1))) for i in i1:(i1 + n - batch_size); | ||
corrected=false, | ||
) | ||
return sqrt(v * (batch_size//n)) | ||
end |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Necessary because otherwise our transformation discards the type of
x
, so aFloat32
eltypex
will get aFloat64
ESS/R-hat/MCSE.