update Documentation, readme and docstrings #79

Merged · 15 commits · Jun 8, 2023
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@
/dev/

.vscode/
docs/Manifest.toml
49 changes: 29 additions & 20 deletions README.md
@@ -1,3 +1,4 @@

# QuasiMonteCarlo.jl

[![Join the chat at https://julialang.zulipchat.com #sciml-bridged](https://img.shields.io/static/v1?label=Zulip&message=chat&color=9558b2&labelColor=389826)](https://julialang.zulipchat.com/#narrow/stream/279055-sciml-bridged)
@@ -28,12 +29,12 @@ ub = [1.0,20.0]
n = 5
d = 2

s = QuasiMonteCarlo.sample(n,lb,ub,GridSample([0.1,0.5]))
s = QuasiMonteCarlo.sample(n,lb,ub,UniformSample())
s = QuasiMonteCarlo.sample(n,lb,ub,GridSample())
s = QuasiMonteCarlo.sample(n,lb,ub,Uniform())
s = QuasiMonteCarlo.sample(n,lb,ub,SobolSample())
s = QuasiMonteCarlo.sample(n,lb,ub,LatinHypercubeSample())
s = QuasiMonteCarlo.sample(n,lb,ub,LatticeRuleSample())
s = QuasiMonteCarlo.sample(n,lb,ub,HaltonSample([10,3], false))
s = QuasiMonteCarlo.sample(n,lb,ub,HaltonSample())
```

The output `s` is a matrix, so one can use things like `@uview` from
@@ -53,11 +54,18 @@ Everything has the same interface:
A = QuasiMonteCarlo.sample(n,lb,ub,sample_method)
```

or to generate points directly in the unit box $[0,1]^d$

```julia
A = QuasiMonteCarlo.sample(n,d,sample_method) # = QuasiMonteCarlo.sample(n,zeros(d),ones(d),sample_method)
```

where:

- `n` is the number of points to sample.
- `lb` is the lower bound for each variable. The length determines the dimensionality.
- `ub` is the upper bound.
- `d` is the dimension of the unit box.
- `sample_method` is the quasi-Monte Carlo sampling strategy.

Additionally, there is a helper function for generating design matrices:
@@ -75,13 +83,13 @@ all sampled from the same low-discrepancy sequence.
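
As a quick illustration of that helper, here is a minimal sketch. The four-argument call mirrors the usage shown later in `docs/src/design_matrix.md`; the point counts and the scrambling parameters are illustrative assumptions, not prescriptions.

```julia
using QuasiMonteCarlo

n, d = 16, 3
# Four independent d × n design matrices, all drawn from a randomized Sobol' sequence.
As = QuasiMonteCarlo.generate_design_matrices(n, d,
    SobolSample(R = OwenScramble(base = 2, pad = 16)), 4)
length(As)   # 4
size(As[1])  # (3, 16)
```
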
Sampling methods `SamplingAlgorithm` are divided into two subtypes

- `DeterministicSamplingAlgorithm`
- `GridSample(dx)` where the grid is given by `lb:dx[i]:ub` in the ith direction.
- `GridSample` for samples on a regular grid.
- `SobolSample` for the Sobol' sequence.
- `FaureSample` for the Faure sequence.
- `LatticeRuleSample` for a randomly-shifted rank-1 lattice rule.
- `HaltonSample(base)` where `base[i]` is the base in the ith direction.
- `HaltonSample` for the Halton sequence.
- `GoldenSample` for a Golden Ratio sequence.
- `KroneckerSample(alpha, s0)` for a Kronecker sequence, where alpha is an length-`d` vector of irrational numbers (often `sqrt(d`)) and `s0` is a length-`d` seed vector (often `0`).
- `KroneckerSample(alpha, s0)` for a Kronecker sequence, where alpha is a length-`d` vector of irrational numbers (often `sqrt(d)`) and `s0` is a length-`d` seed vector (often `0`).
- `RandomSamplingAlgorithm`
- `UniformSample` for uniformly distributed random numbers.
- `LatinHypercubeSample` for a Latin Hypercube.
@@ -115,24 +123,25 @@ end

## Randomization of QMC sequences

Most of the previous methods are deterministic i.e. `sample(n, d, Sampler()::DeterministicSamplingAlgorithm)` always produces the same sequence $X = (X_1, \dots, X_n)$.
The API to randomize sequence is either
- Directly use `QuasiMonteCarlo.sample(n, d, DeterministicSamplingAlgorithm(R = RandomizationMethod()))` or `sample(n, lb, up, DeterministicSamplingAlgorithm(R = RandomizationMethod()))`
- Or given any matrix $X$ ($d\times n$) of $n$ points all in dimension $d$ in $[0,1]^d$ one can do `randomize(x, ::RandomizationMethod())`
Most of the previous methods are deterministic, i.e. `sample(n, d, Sampler()::DeterministicSamplingAlgorithm)` always produces the same sequence $X = (X_1, \dots, X_n)$.
There are two ways to obtain a randomized sequence:

- Either directly use `QuasiMonteCarlo.sample(n, d, DeterministicSamplingAlgorithm(R = RandomizationMethod()))` or `sample(n, lb, ub, DeterministicSamplingAlgorithm(R = RandomizationMethod()))`.
- Or, given $n$ $d$-dimensional points, all in $[0,1]^d$, one can call `randomize(X, ::RandomizationMethod())`, where $X$ is a $d\times n$ matrix.
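
A minimal sketch of both entry points follows; the sampler and randomization names are the ones used elsewhere in this README, while the sizes and the `base`/`pad` values are illustrative assumptions.

```julia
using QuasiMonteCarlo

n, d = 128, 3

# (1) Let the sampler randomize for you via its `R` field.
x_rqmc = QuasiMonteCarlo.sample(n, d, SobolSample(R = OwenScramble(base = 2, pad = 16)))

# (2) Randomize an existing d × n point set in [0,1]^d after the fact.
x_det  = QuasiMonteCarlo.sample(n, d, SobolSample())
x_post = QuasiMonteCarlo.randomize(x_det, MatousekScramble(base = 2, pad = 16))
```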

There are the following randomization methods:
The currently available randomization methods are:

- Scrambling methods `ScramblingMethods(base, pad, rng)` where `base` is the base used to scramble and `pad` the number of bits in `b`-ary decomposition.
- Scrambling methods `ScramblingMethods(b, pad, rng)` where `b` is the base used to scramble and `pad` the number of bits in `b`-ary decomposition.
`pad` is generally chosen as $\gtrsim \log_b(n)$.
The implemented `ScramblingMethods` are
- `DigitalShift`
- `MatousekScramble` a.k.a Linear Matrix Scramble.
- `OwenScramble` a.k.a Nested Uniform Scramble is the most understood theoretically but is more costly to operate.
- `Shift(rng)` a.k.a. Cranley Patterson Rotation.
- `DigitalShift`
- `MatousekScramble` a.k.a. Linear Matrix Scramble.
- `OwenScramble` a.k.a. Nested Uniform Scramble is the most understood theoretically, but is more costly to operate.
- `Shift(rng)` a.k.a. Cranley Patterson Rotation.
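
For instance, a hedged sketch of how `b` and `pad` are typically chosen (the specific numbers are assumptions, not prescriptions):

```julia
using QuasiMonteCarlo

n, d = 2^10, 4
b = 2                           # scrambling base, matching the base of the Sobol' net
pad = ceil(Int, log(b, n)) + 6  # pad ≳ log_b(n); extra digits deepen the randomization

x = QuasiMonteCarlo.sample(n, d, SobolSample(R = MatousekScramble(base = b, pad = pad)))
```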

For numerous independent randomizations, use `generate_design_matrices(n, d, ::DeterministicSamplingAlgorithm, ::RandomizationMethod, num_mats)`, where `num_mats` is the number of independent matrices `X` generated.

### Example
### Randomization Example

Randomization of a Faure sequence with various methods.

@@ -144,7 +153,7 @@ b = QuasiMonteCarlo.nextprime(d)
N = b^m # Number of points
pad = m # Length of the b-ary decomposition number = sum(y[k]/b^k for k in 1:pad)

# Unrandomized low discrepency sequence
# Unrandomized (deterministic) low discrepancy sequence
x_faure = QuasiMonteCarlo.sample(N, d, FaureSample())

# Randomized version
@@ -165,7 +174,7 @@ Plot the resulting sequences along dimensions `1` and `3`.

```julia
d1 = 1
d2 = 3 # you can try every combination of dimension (d1, d2)
d2 = 3 # you can try every combination of dimensions (d1, d2)
sequences = [x_uniform, x_faure, x_shift, x_digital_shift, x_lms, x_nus]
names = ["Uniform", "Faure (deterministic)", "Shift", "Digital Shift", "Matousek Scramble", "Owen Scramble"]
p = [plot(thickness_scaling=1.5, aspect_ratio=:equal) for i in sequences]
@@ -184,4 +193,4 @@ end
plot(p..., size=(1500, 900))
```

![Different randomize methods of the same initial set of points](img/various_randomization.svg)
![Different randomize methods of the same initial set of points](img/various_randomization.svg)
4 changes: 3 additions & 1 deletion docs/Project.toml
@@ -1,6 +1,8 @@
[deps]
QuasiMonteCarlo = "8a4e6c94-4038-4cdc-81c3-7e6ffdb2a71b"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
QuasiMonteCarlo = "8a4e6c94-4038-4cdc-81c3-7e6ffdb2a71b"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[compat]
Documenter = "0.27"
1 change: 1 addition & 0 deletions docs/pages.jl
@@ -4,4 +4,5 @@ pages = [
"Home" => "index.md",
"samplers.md",
"randomization.md",
"design_matrix.md",
]
33 changes: 33 additions & 0 deletions docs/src/design_matrix.md
@@ -0,0 +1,33 @@

# [Design Matrices](@id DesignMatrices)

## API

It is often convenient to generate multiple independent sequences for error estimation (uncertainty quantification).
The resulting sequences can be stored in what is often called a design matrix.
In this package, this is achieved with the `generate_design_matrices(n, d, ::DeterministicSamplingAlgorithm, ::RandomizationMethod, num_mats)` function. `num_mats` is the number of independent realizations. The result is a vector of length `num_mats` whose elements are design matrices.

```@docs
QuasiMonteCarlo.generate_design_matrices
```

!!! warning
The method `generate_design_matrices(n, d, sampler, R::NoRand, num_mats, T = Float64)` is an ad hoc way to produce a design matrix. Indeed, it creates a deterministic point set in dimension `d × num_mats` and splits it into `num_mats` point sets of dimension `d`. The resulting sequences have no QMC guarantees.
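
A conceptual sketch of what that construction amounts to (hypothetical illustration of the idea, not the package's internal code):

```julia
using QuasiMonteCarlo

n, d, num_mats = 8, 2, 3
# One deterministic point set in dimension d * num_mats ...
big = QuasiMonteCarlo.sample(n, d * num_mats, SobolSample())
# ... sliced into num_mats point sets of dimension d. Each slice taken on its
# own is no longer a digital net, which is why the QMC guarantees are lost.
mats = [big[(i - 1) * d + 1:i * d, :] for i in 1:num_mats]
```
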
**Member:**
Interesting. Is there a better way to do this which guarantees the good properties for QMC?

**Contributor Author:**
Yes! That is the point of all the randomization methods (RQMC).

I thought of adding this illustration to the design_matrices page.
Even with a relatively small case, d = 6 and a well-behaved test function F_test, one can observe that the method proposed by Saltelli and implemented with NoRand() produces, from time to time, estimates that are very far off. With M = 500 realizations, the distribution of the Saltelli estimates has heavy tails, whereas one would hope for something close to a Normal distribution.
On the other hand, RQMC is a theoretically grounded way to do this (though it can be more computationally expensive):

```julia
using QuasiMonteCarlo, Random, StatsBase
using Plots
gr() # use plotly() if you want to zoom and interact
Random.seed!(1234)
m = 13
d = 6
b = 2
N = b^m # Number of points
pad = m+4 # Can also choose something like `2m` to get "better" randomization
M = 500 # Number of realizations

# "Saltelli independent" sequences
Saltelli = QuasiMonteCarlo.generate_design_matrices(N, d, SobolSample(R = NoRand()), M)
# Truly independent (randomized) sequences
RQMC = QuasiMonteCarlo.generate_design_matrices(N, d, SobolSample(R = OwenScramble(base = b, pad = pad)), M)

F_test(x) = prod(x)
μ_exact = 1/2^d # = ∫ F_test(x) dᵈx

μ_Saltelli = [mean(F_test(c) for c in eachcol(X)) for X in Saltelli]
μ_RQMC = [mean(F_test(c) for c in eachcol(X)) for X in RQMC]

mean(μ_Saltelli .- μ_exact) # -2.309473096176987e-5
mean(μ_RQMC .- μ_exact) # -6.113472379543662e-7

std(μ_Saltelli .- μ_exact) # 0.0002962911429492179
std(μ_RQMC .- μ_exact) # 2.408945952759537e-5

stephist(μ_Saltelli .- μ_exact, label = "Saltelli", norm = :pdf) # not close to being asymptotically Normal
stephist!(μ_RQMC .- μ_exact, label = "RQMC", norm = :pdf)
vline!([0], label = "true", c = :black, s = :dot)
# yaxis!(:log10, ylims = (1e2, Inf)) # a log scale may or may not help visualization
# savefig("Saltelli_vs_RQMC.png")
```

*(attached figure `Saltelli_vs_RQMC`: histograms of the estimator error μ̂ − μ for Saltelli vs RQMC)*

This seems to have been proposed in Section 5.1 of [*Saltelli, A. et al. (2010)*](https://d1wqtxts1xzle7.cloudfront.net/76482087/PUBLISHED_PAPER-libre.pdf?1639660270=&response-content-disposition=inline%3B+filename%3DVariance_based_sensitivity_analysis_of_m.pdf&Expires=1685981169&Signature=aim5tHldlkb0ewZ9-gSMZsW2F1b88tLvV8euV1FpD61UYrE1mLR3RDERut0BsHNbcibjKQnF1JlsZ8mtEx~E1~eI3A~SOSySbpQllIpbhu556pFGUvD3GV5M6ghwa-5QMDP3-aQczBzflR721N4PCVJgqfmV-y94pkijQYvHSvZaPKb-tsoS8TVxE6H31Ptk4u662H61ofKzXR5JCHv0740qkQ0hORH~GqXOt8s7yQMVWYswZT4pWGSkJ9EehEQHCLo2uDVW-YSopwlSaaMRz~~0O~hkGAVE8sC~TAB7b5KnUgtNXl0jYTfGNTYO4GNJo1XhmHwj~Og~sBLDIXDxsg__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA)[^1] to do uncertainty quantification.
**Member:**
Yes, that is the source.

**Contributor Author (@dmetivie, Jun 8, 2023):**
I am very puzzled as to why this is so widely used in the UQ community. Some of the notation is not up to date with the current QMC literature, and the review cited dates from 1988.
This phenomenon is related to what is explained in the excellent paper On Dropping the First Sobol’ Point.
Basically, the points are constructed so that the full set has good low-discrepancy properties; one cannot take half the sequence, drop dimensions, or remove even a single point and hope to keep those properties.
For the construction used here this is even clearer, since Sobol' points are known to behave poorly in some higher dimensions.

*(excerpt from Owen's book omitted)*

IMO, these practices should be progressively removed from QMC software. For example, in #75 something similar to the Saltelli construction is proposed (plus the story of dropping the zero point).

To nuance my words: for the sensitivity-analysis use case with M = 2, this is probably not as bad as what I show here for small cases.


[^1]: Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., & Tarantola, S. (2010). Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Computer physics communications, 181(2), 259-270.

## Example

```@example 2
using QuasiMonteCarlo, Random
Random.seed!(1234)
m = 4
d = 3
b = QuasiMonteCarlo.nextprime(d)
N = b^m # Number of points
pad = m # Can also choose something as `2m` to get "better" randomization

# 5 independent Randomized Faure sequences
QuasiMonteCarlo.generate_design_matrices(N, d, FaureSample(R = OwenScramble(base = b, pad = m)), 5)
```
81 changes: 73 additions & 8 deletions docs/src/index.md
Original file line number Diff line number Diff line change
@@ -12,7 +12,9 @@ using Pkg
Pkg.add("QuasiMonteCarlo")
```

## Example
## Get Started

### Basic API

```julia
using QuasiMonteCarlo, Distributions
@@ -21,12 +23,12 @@ ub = [1.0,20.0]
n = 5
d = 2

s = QuasiMonteCarlo.sample(n,lb,ub,GridSample([0.1,0.5]))
s = QuasiMonteCarlo.sample(n,lb,ub,UniformSample())
s = QuasiMonteCarlo.sample(n,lb,ub,GridSample())
s = QuasiMonteCarlo.sample(n,lb,ub,Uniform())
s = QuasiMonteCarlo.sample(n,lb,ub,SobolSample())
s = QuasiMonteCarlo.sample(n,lb,ub,LatinHypercubeSample())
s = QuasiMonteCarlo.sample(n,lb,ub,LatticeRuleSample())
s = QuasiMonteCarlo.sample(n,lb,ub,HaltonSample([10,3]))
s = QuasiMonteCarlo.sample(n,lb,ub,HaltonSample())
```

The output `s` is a matrix, so one can use things like `@uview` from
@@ -38,6 +40,54 @@ using UnsafeArrays
@uview s[:,i]
```

### MC vs QMC

We illustrate the gain of QMC methods over plain Monte Carlo using the 5-dimensional example from Section 15.9 in the [book by A. Owen](https://artowen.su.domains/mc/qmcstuff.pdf).
```@example MCvsQMC; continued = true
f₁(𝐱) = prod(1 + √(12)/5*(xⱼ - 1/2) for xⱼ ∈ 𝐱)
μ_exact = 1 # = ∫ f₁(𝐱) d⁵𝐱
```

One can estimate the integral $\mu$ using plain Monte Carlo, Quasi-Monte Carlo, or Randomized Quasi-Monte Carlo. See the other sections of this documentation for more information on the functions used in this example.

```@example MCvsQMC
using QuasiMonteCarlo, Random, Distributions
using Plots; default(fontfamily="Computer Modern")
Random.seed!(1234)
d = 5 # Dimension (= prime base for Faure net)
b = 2 # Base for Sobol net
m_max = 19
m_max_Faure = 8
N = b^m_max

# Generate sequences
seq_MC = QuasiMonteCarlo.sample(N, d, Uniform()) # Monte Carlo i.i.d Uniform sampling
seq_QMC_Sobol = QuasiMonteCarlo.sample(N, d, SobolSample()) # Sobol net
seq_RQMC_Sobol = QuasiMonteCarlo.sample(N, d, SobolSample(R = OwenScramble(base = b, pad = 32))) # Randomized version of Sobol net
seq_RQMC_Faure = QuasiMonteCarlo.sample(d^m_max_Faure, d, FaureSample(R = OwenScramble(base = d, pad = 32))) # Randomized version of Faure net

# Estimate the integral for different n with different estimator μ̂ₙ
μ_MC = [mean(f₁(x) for x in eachcol(seq_MC[:, 1:b^m])) for m in 1:m_max]
μ_QMC_Sobol = [mean(f₁(x) for x in eachcol(seq_QMC_Sobol[:, 1:b^m])) for m in 1:m_max]
μ_RQMC_Sobol = [mean(f₁(x) for x in eachcol(seq_RQMC_Sobol[:, 1:b^m])) for m in 1:m_max]
μ_RQMC_Faure = [mean(f₁(x) for x in eachcol(seq_RQMC_Faure[:, 1:d^m])) for m in 1:m_max_Faure]

# Plot the error |μ̂-μ| vs n
plot(b.^(1:m_max), abs.(μ_MC .- μ_exact), label="MC")
plot!(b.^(1:m_max), abs.(μ_QMC_Sobol .- μ_exact), label="QMC Sobol")
plot!(b.^(1:m_max), abs.(μ_RQMC_Sobol .- μ_exact), label="RQMC Sobol")
plot!(d .^(1:m_max_Faure), abs.(μ_RQMC_Faure .- μ_exact), label="RQMC Faure")
plot!(n -> n^(-1/2), b.^(1:m_max), c = :black, s = :dot, label = "n^(-1/2)")
plot!(n -> n^(-3/2), b.^(1:m_max), c = :black, s = :dash, label = "n^(-3/2)")
# n^(-3/2) is the theoretical scaling for scrambled nets e.g. Theorem 17.5. in https://artowen.su.domains/mc/qmcstuff.pdf
xlims!(1, 1e6)
ylims!(1e-9, 1)
xaxis!(:log10)
yaxis!(:log10)
xlabel!("n", legend = :bottomleft)
ylabel!("|μ̂-μ|")
```

## Adding a new sampling method

Adding a new sampling method is a two-step process:
@@ -62,68 +112,83 @@ function sample(n,lb,ub,::NewAmazingSamplingAlgorithm)
end
end
```
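
To make the two steps concrete, here is a minimal hedged sketch. The struct layout, the supertype choice, and the uniform-sampling body are assumptions for illustration; they are not the collapsed snippet from this diff.

```julia
using QuasiMonteCarlo, Random

# Step 1: define the new algorithm type.
struct NewAmazingSamplingAlgorithm{RNG <: Random.AbstractRNG} <: QuasiMonteCarlo.RandomSamplingAlgorithm
    rng::RNG
end
NewAmazingSamplingAlgorithm() = NewAmazingSamplingAlgorithm(Random.default_rng())

# Step 2: add a method to `QuasiMonteCarlo.sample` dispatching on that type.
function QuasiMonteCarlo.sample(n, lb, ub, s::NewAmazingSamplingAlgorithm)
    d = length(lb)
    x = rand(s.rng, d, n)         # points in [0,1]^d, one column per point
    return lb .+ (ub .- lb) .* x  # rescale to the box [lb, ub]; output is d × n
end
```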

## Contributing

- Please refer to the
[SciML ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://github.com/SciML/ColPrac/blob/master/README.md)
for guidance on PRs, issues, and other matters relating to contributing to SciML.
- There are a few community forums:
- the #diffeq-bridged channel in the [Julia Slack](https://julialang.org/slack/)
- [JuliaDiffEq](https://gitter.im/JuliaDiffEq/Lobby) on Gitter
- on the [Julia Discourse forums](https://discourse.julialang.org)
- see also [SciML Community page](https://sciml.ai/community/)
- the #diffeq-bridged channel in the [Julia Slack](https://julialang.org/slack/)
- [JuliaDiffEq](https://gitter.im/JuliaDiffEq/Lobby) on Gitter
- on the [Julia Discourse forums](https://discourse.julialang.org)
- see also [SciML Community page](https://sciml.ai/community/)

## Reproducibility

```@raw html
<details><summary>The documentation of this SciML package was built using these direct dependencies,</summary>
```

```@example
using Pkg # hide
Pkg.status() # hide
```

```@raw html
</details>
```

```@raw html
<details><summary>and using this machine and Julia version.</summary>
```

```@example
using InteractiveUtils # hide
versioninfo() # hide
```

```@raw html
</details>
```

```@raw html
<details><summary>A more complete overview of all dependencies and their versions is also provided.</summary>
```

```@example
using Pkg # hide
Pkg.status(;mode = PKGMODE_MANIFEST) # hide
```

```@raw html
</details>
```

```@raw html
You can also download the
<a href="
```

```@eval
using TOML
version = TOML.parse(read("../../Project.toml",String))["version"]
name = TOML.parse(read("../../Project.toml",String))["name"]
link = "https://github.com/SciML/"*name*".jl/tree/gh-pages/v"*version*"/assets/Manifest.toml"
```

```@raw html
">manifest</a> file and the
<a href="
```

```@eval
using TOML
version = TOML.parse(read("../../Project.toml",String))["version"]
name = TOML.parse(read("../../Project.toml",String))["name"]
link = "https://github.com/SciML/"*name*".jl/tree/gh-pages/v"*version*"/assets/Project.toml"
```

```@raw html
">project</a> file.
```