
Change discrete distributions to standard #3192

Draft
wants to merge 1 commit into base: cornu/random/continuous_to_standard

Conversation

alkino (Member) commented Nov 9, 2024

Discrete:

  • Binomial
  • Discrete Uniform
  • Poisson
Computation for binomial
KS test 2 samples from old to new
D=0.0024800000000000377, pvalue=0.9174075702167388
Generating graph...
Over

Computation for discunif
KS test 2 samples from old to new
D=0.0030100000000000127, pvalue=0.7542898827264148
Generating graph...
Over

Computation for poisson
KS test 2 samples from old to new
D=0.00269999999999998, pvalue=0.8582934803778028
Generating graph...
Over
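As a side note, the KS test assumes a continuous CDF, so for these discrete distributions a chi-square goodness-of-fit test is a common complement. A minimal sketch (not part of the PR's scripts; numpy stands in for h.Random, parameters match binomial(20, .5) above):

```python
import numpy as np
from scipy import stats

# Chi-square goodness-of-fit for a binomial(20, 0.5) sample.
rng = np.random.default_rng(1)
n, p, nrun = 20, 0.5, 100_000
sample = rng.binomial(n, p, nrun)

observed = np.bincount(sample, minlength=n + 1)
expected = stats.binom.pmf(np.arange(n + 1), n, p) * nrun

# Merge low-count tail bins so every expected count is >= 5
mask = expected >= 5
obs = np.append(observed[mask], observed[~mask].sum())
exp = np.append(expected[mask], expected[~mask].sum())

# Rescale so observed and expected totals match exactly
chi2, pvalue = stats.chisquare(obs, f_exp=exp * obs.sum() / exp.sum())
print(f"chi2={chi2:.3f}, pvalue={pvalue:.3f}")
```

The same pattern would apply to discunif and poisson by swapping in stats.randint or stats.poisson for the expected counts.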

Script to generate values:

from neuron import h
import pickle

r = h.Random()

nrun = int(1e5)

def generate_data(name, *args):
    fun = getattr(r, name)
    fun(*args)
    hist = []
    for i in range(nrun):
        j = r.repick()
        hist.append(j)
    with open(f"{name}.data", "wb") as fh:  # renamed from `h` to avoid shadowing neuron's h
        pickle.dump(hist, fh)

# Discrete
# generate_data("binomial", 20, .5)
# generate_data("discunif", 0, 10)
# generate_data("poisson", 3)

# Continuous
generate_data("negexp", 0.5)
generate_data("normal", -1, .5)
generate_data("lognormal", 5, 2) # mean = 5, variance = 2
generate_data("uniform", 0, 2)
generate_data("erlang", 5, 1)
generate_data("weibull", 5, 1.5)

# Not implemented
# generate_data("geometric", .8)
# generate_data("hypergeo", 10, 150)
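A quick sanity check on the sampled moments can be run independently of NEURON (numpy stands in for h.Random here; the parameters mirror the generate_data calls above):

```python
import numpy as np

rng = np.random.default_rng(0)
nrun = int(1e5)

# (sample, theoretical mean, theoretical variance)
checks = {
    "binomial": (rng.binomial(20, 0.5, nrun), 10.0, 5.0),  # n*p, n*p*(1-p)
    "discunif": (rng.integers(0, 11, nrun), 5.0, 10.0),    # (a+b)/2, ((b-a+1)^2-1)/12
    "poisson": (rng.poisson(3, nrun), 3.0, 3.0),           # lambda, lambda
}

for name, (sample, mean, var) in checks.items():
    print(f"{name}: mean={sample.mean():.3f} (expect {mean}), "
          f"var={sample.var():.3f} (expect {var})")
```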

Script to generate graphs:

import pickle
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
import os.path
from math import pow, sqrt, log, exp

stats_name = {
        'erlang': {'fun': stats.erlang, 'args': [25, 0, 1/5]},
        'lognormal': {'fun': stats.lognorm, 'args': [sqrt(log(2/(5*5) + 1)), 0, 5*5/sqrt(2+5*5)]},
        'negexp': {'fun': stats.expon, 'args': [0, 0.5]},
        'normal': {'fun': stats.norm, 'args': [-1, sqrt(.5)]},
        'uniform': {'fun': stats.uniform, 'args': [0, 2]},
        'weibull': {'fun': stats.weibull_min, 'args': [5, 0, pow(1.5, 1/5)]},
        }
def plot(name):
    print(f"Computation for {name}")
    with open(f'old/{name}.data', 'rb') as f:
        old_data = pickle.load(f)
    with open(f'new/{name}.data', 'rb') as f:
        new_data = pickle.load(f)

    if name in stats_name:
        fun = stats_name[name]["fun"]
        args = stats_name[name]["args"]
        print("KS test, 2 samples")
        res = stats.ks_2samp(old_data, new_data)
        print(res)
        print(f"KS test 1 sample from old compare to {name}")
        res = stats.ks_1samp(x=old_data, cdf=fun.cdf, args=args)
        print(res)
        print(f"KS test 1 sample from new compare to {name}")
        res = stats.ks_1samp(x=new_data, cdf=fun.cdf, args=args)
        print(res)
        if name in ('negexp', 'uniform'):
            old_loc, old_scale = fun.fit(old_data, floc=0)  # floc=0 fixes the location at 0, matching the generated parameters
            new_loc, new_scale = fun.fit(new_data, floc=0)

            print(f"Fitted old parameters: scale={old_scale}")
            print(f"Fitted new parameters: scale={new_scale}")
        elif name == 'normal':
            old_loc, old_scale = fun.fit(old_data)  # no floc here: the generated mean is -1, not 0
            new_loc, new_scale = fun.fit(new_data)

            print(f"Fitted old parameters: loc={old_loc}, scale={old_scale}")
            print(f"Fitted new parameters: loc={new_loc}, scale={new_scale}")
        else:
            old_shape, old_loc, old_scale = fun.fit(old_data, floc=0)  # floc=0 fixes the location at 0, e.g. lognormal is defined on (0, inf)
            new_shape, new_loc, new_scale = fun.fit(new_data, floc=0)

            print(f"Fitted old parameters: scale={old_scale}, loc={old_loc}, shape={old_shape}")
            print(f"Fitted new parameters: scale={new_scale}, loc={new_loc}, shape={new_shape}")

        NUM_BINS = 2000  # Increase this number for smaller bins
    else:
        NUM_BINS = 'auto'

    print("Generating graph...")
    plt.hist(old_data, bins=NUM_BINS, density=True, alpha=0.6, color='b')
    plt.hist(new_data, bins=NUM_BINS, density=True, alpha=0.6, color='g')
    plt.title(f"Histogram of Data and Fitted {name} Distribution")
    plt.savefig(f"{name}_comparison.png")
    plt.clf()
    print("Over")

if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        plot(sys.argv[1])
    else:
        for n in stats_name.keys():
            if os.path.isfile(os.path.join("old", f"{n}.data")) and os.path.isfile(os.path.join("new", f"{n}.data")):
                plot(n)


@alkino alkino changed the base branch from cornu/random/cpp11 to master November 11, 2024 15:04
@alkino alkino marked this pull request as ready for review November 11, 2024 15:04


cattabiani (Member) commented Nov 12, 2024

The graphs are good and everything, but eyeballing graphs does not give an objective evaluation.

If you want to be more thorough, I suggest using a "goodness of fit" test. For steps, for example, we used the Kolmogorov–Smirnov test; scipy has a nice suite of these in Python.

They all work in this way:

  • Hypothesis: "my sample comes from this distribution"
  • Try to prove that this hypothesis is improbable: run the test
  • Compare the p-value to a threshold you deem acceptable, usually 5%
  • Failing to reject the hypothesis means the sample is consistent with that distribution

It is a matter of a few more lines of Python. If you are not sure how to do it, asking ChatGPT to "add the KS test matching the distribution X" with your current code appended can help you further with the details.
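The steps above can be sketched in a few lines (samples are drawn with numpy here purely for illustration; the PR's actual samples come from h.Random):

```python
import numpy as np
from scipy import stats

# Two samples that should come from the same distribution
rng = np.random.default_rng(42)
old = rng.normal(-1, np.sqrt(0.5), 100_000)
new = rng.normal(-1, np.sqrt(0.5), 100_000)

# Two-sample KS test: small D and large p-value mean we fail to reject
res = stats.ks_2samp(old, new)
print(f"D={res.statistic:.4f}, pvalue={res.pvalue:.4f}")

# Compare the p-value to the chosen 5% threshold
print("consistent" if res.pvalue > 0.05 else "reject")
```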


@alkino alkino changed the title Use binomial from C++ stdlib Change most of distributions to use standard library Nov 13, 2024

@alkino alkino changed the title Change most of distributions to use standard library Change discrete distributions to standard Nov 15, 2024
@alkino alkino marked this pull request as draft November 15, 2024 12:13
@alkino alkino changed the base branch from master to cornu/random/continuous_to_standard November 15, 2024 13:37
