Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PAWN method #69

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

ConnectedSystems
Copy link

As in title - this PR adds the PAWN moment-independent sensitivity analysis method to this package.

Two additional dependencies are required: HypothesisTests (which provides the Kolmogorov-Smirnov test) and Logging.
The Logging package is used to hide some warnings that appear from HypothesisTests when ties occur in the outputs (which may be quite frequent).

I did not implement sample(), and not sure if it's necessary.

Usage example:

julia> using Distributions, DataStructures, GlobalSensitivityAnalysis

julia> data = PAWNData(
       params=OrderedDict(:x1 => Uniform(-3.14159265359, 3.14159265359),
               :x2 => Uniform(-3.14159265359, 3.14159265359),
               :x3 => Uniform(-3.14159265359, 3.14159265359)),
       N=100,
       S=10)

julia> Y = rand(100)

# Generating some dummy data for the example
julia> analyze(data, rand(100, 3), Y)
Calculating indices for 3 parameters ... 100%|██████████████████████████████████████████| Time: 0:00:00        
Dict{Symbol, Vector{Float64}} with 6 entries:
  :median => [0.723627, 0.753778, 0.633174]
  :max    => [0.904534, 1.2965, 1.08544]
  :std    => [0.249479, 0.387074, 0.299123]
  :mean   => [0.66446, 0.711057, 0.646303]
  :cv     => [0.375462, 0.544364, 0.462822]
  :min    => [0.0, 0.0, 0.0]

Unrelated to this PR, but I think relying on Dictionaries as the return type results in some annoyances (e.g., accessing/slicing across results) and performance issues however minor they may be in practice.

I would suggest one of the below be investigated and adopted:

@lrennels
Copy link
Owner

lrennels commented Jan 6, 2023

@ConnectedSystems thank you for this contribution! I will take a look sometime next week, this was certainly on my list so I'm excited to see it here.

Per the return type I would tend to agree, and think using NamedArrays would be good modification to this package. I'll add an Issue for that and hope to get to it soon.

@ConnectedSystems
Copy link
Author

Glad to be service!

FYI, I've been working with NamedArrays recently and while it supports both dimension names and axis names in one package, it relies on Dicts underneath to do so, which isn't ideal for performance particularly when the number of elements is small.

While in practice it may not matter, NamedDims + AxisKeys may be the better option.

@ConnectedSystems
Copy link
Author

PS: Recent set of changes fixes an error, where I was not including the upper most bin-edge in the calculations.

I've also adjusted to return an OrderedDict so that the results are in the intuitive order:

OrderedDict{Symbol, Vector{Float64}} with 6 entries:
  :min    => [0.0, 0.0, 0.0]
  :mean   => [0.623764, 0.647307, 0.670555]
  :median => [0.603023, 0.693476, 0.723627]
  :max    => [1.08544, 1.05999, 0.994987]
  :std    => [0.288987, 0.282678, 0.279635]
  :cv     => [0.463296, 0.436699, 0.41702]

@lrennels
Copy link
Owner

lrennels commented Jan 6, 2023

Ah, now that I re-read what I wrote I was thinking of using Named Tuples, which is what I'm more comfortable with from previous work, but am open to the discussion for sure. I think I used Dictionaries originally just to hew closely to the Python SALib implementation, though that was several years ago so I've learned more since then about Julia!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants