
Update documentation, README, and docstrings #79

Merged: 15 commits on Jun 8, 2023

Conversation

@dmetivie (Contributor) commented Jun 7, 2023

I made a few updates to the documentation:

  • Fixed the docstring call of `sample`.
  • Adjusted some docstrings to match the rebase of PR Refactor code #65.
  • Separated matrix generation into a new page for clarity.
  • Added an example of QMC vs MC on the documentation home page (this seems like a must-have for a QMC package).
  • Changed the docs `Project.toml` to use Documenter.jl code block examples.

I hope this helps!

BTW, I believe we should remove `RandomSample`, sometimes called `UniformSample`. Instead, we can directly use `Uniform` in `sample`.
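To make the suggestion concrete, here is a minimal hypothetical sketch of what it could look like; the `Uniform()`-in-`sample` call illustrates the proposed usage (not necessarily the package's current API), and the sizes are arbitrary:

```julia
using QuasiMonteCarlo, Distributions

# Current way: a dedicated sampler type for plain pseudo-random points.
x_rand = QuasiMonteCarlo.sample(128, 3, RandomSample())

# Proposed alternative: pass a Distributions.jl distribution directly,
# making a separate RandomSample / UniformSample type unnecessary.
x_unif = QuasiMonteCarlo.sample(128, 3, Uniform())
```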

@dmetivie (Contributor Author) commented Jun 8, 2023

I still cannot figure out why the format check fails; I have tried JuliaFormatter every possible way!
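For reference, the formatter can be run locally like this (a sketch; which style the repository pins, e.g. via a `.JuliaFormatter.toml`, is an assumption on my part):

```julia
using JuliaFormatter

# Format the whole repository in place; the style (e.g. SciML style) is usually
# picked up from the repo's .JuliaFormatter.toml. Results can differ between
# JuliaFormatter versions, which is the mismatch described in this thread.
format(".")
```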

@ChrisRackauckas (Member)

The formatter just had an update that changed some of the formatting. I'll update the format in a separate commit.

```
!!! warning
    The method `generate_design_matrices(n, d, sampler, R::NoRand, num_mats, T = Float64)` is an ad hoc way to produce a Design Matrix. Indeed, it creates a deterministic point set in dimension `d × num_mats` and splits it into `num_mats` point sets of dimension `d`. The resulting sequences have no QMC guarantees.
```
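To make the warning concrete, here is a minimal sketch of the splitting scheme it describes (my own illustration, not the package's actual implementation):

```julia
using QuasiMonteCarlo

n, d, num_mats = 2^10, 6, 2

# One deterministic point set of dimension d * num_mats, returned as a
# (d * num_mats) × n matrix with points as columns.
big = QuasiMonteCarlo.sample(n, d * num_mats, SobolSample())

# Split the coordinates into num_mats blocks of d rows. Each block is still
# deterministic and the blocks are not independent randomizations of each
# other, hence no QMC/RQMC error guarantees across blocks.
mats = [big[(i - 1) * d + 1:i * d, :] for i in 1:num_mats]
```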
Member:

Interesting. Is there a better way to do this which guarantees the good properties for QMC?

Contributor Author:

Yes! That is the point of all the randomization methods (RQMC).

I thought of adding this illustration to the design_matrices page.
Even in a relatively small case, `d = 6` with a nice function `F_test`, one can observe that the method proposed by Saltelli and implemented with `NoRand()` produces, from time to time, estimates that are very far off. With `M = 500` realizations, the distribution of the Saltelli estimates has heavy tails, whereas one would hope for something looking like a Normal distribution.
RQMC, on the other hand, is a theoretically grounded way to do this (though it can be more computationally expensive).

```julia
using QuasiMonteCarlo, Random, StatsBase
using Plots
gr() # plotly() # for interactive zooming
Random.seed!(1234)
m = 13
d = 6
b = 2
N = b^m # number of points
pad = m + 4 # can also choose something like 2m to get a "better" randomization
M = 500 # number of realizations

# "Saltelli independent" sequences
Saltelli = QuasiMonteCarlo.generate_design_matrices(N, d, SobolSample(R = NoRand()), M)
# Truly i.i.d. (independently randomized) sequences
RQMC = QuasiMonteCarlo.generate_design_matrices(N, d, SobolSample(R = OwenScramble(base = b, pad = pad)), M)

F_test(x) = prod(x)
μ_exact = 1 / 2^d # = ∫ F_test(x) dᵈx

μ_Saltelli = [mean(F_test(c) for c in eachcol(X)) for X in Saltelli]
μ_RQMC = [mean(F_test(c) for c in eachcol(X)) for X in RQMC]

mean(μ_Saltelli .- μ_exact) # -2.309473096176987e-5
mean(μ_RQMC .- μ_exact) # -6.113472379543662e-7

std(μ_Saltelli .- μ_exact) # 0.0002962911429492179
std(μ_RQMC .- μ_exact) # 2.408945952759537e-5

stephist(μ_Saltelli .- μ_exact, label = "Saltelli", norm = :pdf) # far from asymptotically Normal
stephist!(μ_RQMC .- μ_exact, label = "RQMC", norm = :pdf)
vline!([0], label = "true", c = :black, s = :dot)
# yaxis!(:log10, ylims = (1e2, Inf)) # a log scale may (or may not) help visualize
# savefig("Saltelli_vs_RQMC.png")
```

[Figure: Saltelli_vs_RQMC.png, histogram of estimation errors for Saltelli (NoRand) vs RQMC]


```
!!! warning
    The method `generate_design_matrices(n, d, sampler, R::NoRand, num_mats, T = Float64)` is an ad hoc way to produce a Design Matrix. Indeed, it creates a deterministic point set in dimension `d × num_mats` and splits it into `num_mats` point sets of dimension `d`. The resulting sequences have no QMC guarantees.
    This seems to have been proposed in Section 5.1 of *Saltelli, A. et al. (2010)*[^1] to do uncertainty quantification.
```
Member:

Yes, that is the source.

Contributor Author (@dmetivie, Jun 8, 2023):

I am very puzzled as to why this is so widely used in the UQ community. Some of the notation is not up to date with the current QMC community, and the review cited dates from 1988.
This phenomenon is somewhat related to what is explained in the excellent paper On Dropping the First Sobol' Point.
Basically, the points are generated so that they have good low-discrepancy properties all together; one cannot take half the sequence, play with the dimensions, or remove even a single point and hope to keep those good properties.
For what is done here this is even clearer, since it is known that Sobol' points are not very good in some higher dimensions.

[Image: excerpt from Owen's book]

IMO, these practices should progressively be removed from QMC software. For example, in #75 something similar to Saltelli's approach is proposed (plus the "dropping the zero point" story).

To nuance my words: for the M = 2 use case of sensitivity analysis, this is probably not as bad as what I show, at least for small enough cases.
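A hedged numerical illustration of the "whole point set or nothing" argument (my own sketch, not from the PR): compare the integration error of a full 2^m-point Sobol' set with that of an arbitrary truncation of the same sequence.

```julia
using QuasiMonteCarlo, Statistics

d = 6
F_test(x) = prod(x)
μ_exact = 1 / 2^d

N = 2^13
X = QuasiMonteCarlo.sample(N, d, SobolSample())  # full 2^13-point Sobol' set, d × N

err_full = abs(mean(F_test(c) for c in eachcol(X)) - μ_exact)

# Keep only the first 5000 points (an arbitrary, non-power-of-two truncation).
# The low-discrepancy structure is a property of the complete 2^m-point set,
# so this truncated set typically gives a noticeably worse estimate.
err_trunc = abs(mean(F_test(c) for c in eachcol(X[:, 1:5000])) - μ_exact)
```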


```
## Design Matrices
- `MatousekScramble` a.k.a. Linear Matrix Scramble is what people use in practice. Indeed, the observed performance is similar to `OwenScramble`, for a lower computational cost.
```
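As a usage-level sketch (based on the randomization API already shown in this thread; the parameter values are my own picks), the two scramblings plug into the sampler in the same way:

```julia
using QuasiMonteCarlo

b, m, d = 2, 13, 6
N = b^m
pad = 32  # number of scrambled digits, an illustrative choice

# Owen (nested uniform) scrambling: the reference randomization, more costly.
x_owen = QuasiMonteCarlo.sample(N, d, SobolSample(R = OwenScramble(base = b, pad = pad)))

# Matousek scrambling (Linear Matrix Scramble): similar observed accuracy, cheaper.
x_lms = QuasiMonteCarlo.sample(N, d, SobolSample(R = MatousekScramble(base = b, pad = pad)))
```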
Member:
Is there a quick description of this one?

@ChrisRackauckas (Member)

This is amazingly clear. Thanks for putting the time in. I left some comments, but they are small, so I'm going to merge and try the new formatter.
