-
-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update Documentation, readme and docstrings #79
Conversation
I still cannot figure why format check fail, I tried JuliaFormatter every possible ways! |
The formatter just had an update that changed some of the formatting. I'll update the format in a separate commit. |
``` | ||
|
||
!!! warning | ||
The method `generate_design_matrices(n, d, sampler, R::NoRand, num_mats, T = Float64)` is an ad hoc way to produce a Design Matrix. Indeed, it creates a deterministic point set in dimension `d × num_mats` and splits it into `num_mats` point set of dimension `d`. The resulting sequences have no QMC guarantees. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Is there a better way to do this which guarantees the good properties for QMC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! That is the point of all the randomization methods (RQMC).
I thought of adding this illustration to the design_matrices
page.
Even with a relatively small case d = 6
and a nice function F_test,
one can observe that sometime the method proposed by Saltelli and implemented with NoRand()
produce from time to time very far away estimate. With M=500
realizations one can observe that the distribution of estimate from Saltelli has heavy tails while you would hope for something looking like a Normal distribution.
On the other hand, RQMC is a theoretically grounded way to do (but it can be more computationally expensive)
using QuasiMonteCarlo, Random, StatsBase
using Plots
gr() # plotly() # if you wanna zoom and all
Random.seed!(1234)
m = 13
d = 6
b = 2
N = b^m # Number of points
pad = m+4 # Can also choose something as `2m` to get "better" randomization
M = 500 # nb of realization
# "Saltelli independent" sequences
Saltelli = QuasiMonteCarlo.generate_design_matrices(N, d, SobolSample(R = NoRand()), M)
# True i.i.d sequences
RQMC = QuasiMonteCarlo.generate_design_matrices(N, d, SobolSample(R = OwenScramble(base = b, pad = pad)), M)
F_test(x) = prod(x)
μ_exact = 1/2^d # = ∫ F_test(x) dᵈx
μ_Saltelli = [mean(F_test(c) for c in eachcol(X)) for X in Saltelli]
μ_RQMC = [mean(F_test(c) for c in eachcol(X)) for X in RQMC]
mean(μ_Saltelli .- μ_exact) # -2.309473096176987e-5
mean(μ_RQMC .- μ_exact) # -6.113472379543662e-7
std(μ_Saltelli .- μ_exact) # 0.0002962911429492179
std(μ_RQMC .- μ_exact) # 2.408945952759537e-5
stephist(μ_Saltelli .- μ_exact, label = "Saltelli", norm = :pdf) # not close at all from being asymptotically Normal
stephist!(μ_RQMC .- μ_exact, label = "RQMC", norm = :pdf)
vline!([0], label = "true", c = :black, s = :dot)
# yaxis!(:log10, ylims = (1e2, Inf)) # logscale can help or not to vizualize
# savefig("Saltelli_vs_RQMC.png")
|
||
!!! warning | ||
The method `generate_design_matrices(n, d, sampler, R::NoRand, num_mats, T = Float64)` is an ad hoc way to produce a Design Matrix. Indeed, it creates a deterministic point set in dimension `d × num_mats` and splits it into `num_mats` point set of dimension `d`. The resulting sequences have no QMC guarantees. | ||
This seems to have been proposed in Section 5.1 of [*Saltelli, A. et al. (2010)*](https://d1wqtxts1xzle7.cloudfront.net/76482087/PUBLISHED_PAPER-libre.pdf?1639660270=&response-content-disposition=inline%3B+filename%3DVariance_based_sensitivity_analysis_of_m.pdf&Expires=1685981169&Signature=aim5tHldlkb0ewZ9-gSMZsW2F1b88tLvV8euV1FpD61UYrE1mLR3RDERut0BsHNbcibjKQnF1JlsZ8mtEx~E1~eI3A~SOSySbpQllIpbhu556pFGUvD3GV5M6ghwa-5QMDP3-aQczBzflR721N4PCVJgqfmV-y94pkijQYvHSvZaPKb-tsoS8TVxE6H31Ptk4u662H61ofKzXR5JCHv0740qkQ0hORH~GqXOt8s7yQMVWYswZT4pWGSkJ9EehEQHCLo2uDVW-YSopwlSaaMRz~~0O~hkGAVE8sC~TAB7b5KnUgtNXl0jYTfGNTYO4GNJo1XhmHwj~Og~sBLDIXDxsg__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA)[^1] to do uncertainty quantification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is the source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very puzzled on why this is so much used in the UQ community. Some notations are not up-to-date with the current QMC community, and the review cited dates from 1988.
This phenomenon is somehow related to what is explained in the excellent paper On Dropping the First Sobol’ Point.
Basically, points are generated to have good low discrepancy properties all together, so one cannot play with half the sequence or dimension or remove even a point and hope to keep the good properties.
For what is done there this is even more clear since it is known Sobol points are not too good in some higher dimensions.
IMO, these practices should be progressively removed from QMC software. For example, in #75 something similar to Saltelli is proposed (+ the dropping zero story).
To nuance my words, for the use case M=2
of sensitivity analysis this is probably not as bad as I show for small enough cases.
``` | ||
|
||
## Design Matrices | ||
- `MatousekScramble` a.k.a. Linear Matrix Scramble is what people use in practice. Indeed, the observed performances are similar to `OwenScramble` for a lesser numerical cost. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a quick description of this one?
This is amazingly clear. Thanks for putting the time in. I left some small comments, but they are small so I'm going to merge and try the new formatter. |
I made a few update to documentations like
sample
.I hope this help!
BTW, I believe we should remove
RandomSample
sometime calledUniformSample
. Instead, we can directly useUniform
insample
.