Generic functions for PySpark experiment analysis and design
pip install git+https://github.com/sdaza/experiment-utils.git
The DataFrame df is expected to be a PySpark DataFrame. If you pass a pandas DataFrame instead, it is converted automatically.
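# Assumption: ExperimentAnalyzer is importable from the package top level, like PowerSim
from experiment_utils import ExperimentAnalyzer

# covariates is a list of covariate column names in df, defined beforehand
analyzer = ExperimentAnalyzer(
    df,
    treatment_col="treatment",
    outcomes=["registrations", "visits"],
    covariates=covariates,
    experiment_identifier=["campaign_key"],
    adjustment=None,
)
analyzer.get_effects()  # run the analysis
analyzer.results  # inspect the estimated effects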
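For intuition about what an unadjusted comparison (adjustment=None) of one outcome might look like, here is a minimal sketch in plain Python. It is not the library's implementation: the function name unadjusted_effect, the normal-approximation confidence interval, and the toy values are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def unadjusted_effect(treated, control, alpha=0.05):
    """Difference in means with a normal-approximation confidence interval.

    Hypothetical helper for illustration; not part of experiment_utils.
    """
    diff = mean(treated) - mean(control)
    # Standard error with unequal variances (sample variances of each group)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return {"effect": diff, "stderr": se, "ci": (diff - z * se, diff + z * se)}


# Toy outcome values, made up for illustration only
treated = [1, 0, 1, 1, 0, 1, 1, 0]
control = [0, 0, 1, 0, 1, 0, 0, 0]
result = unadjusted_effect(treated, control)
print(result)
```

With real experiments the groups would come from the treatment column of df, and the analyzer handles several outcomes and experiments at once.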
from experiment_utils import PowerSim
p = PowerSim(metric='proportion', relative_effect=False,
             variants=1, nsim=1000, alpha=0.05, alternative='two-tailed')
p.get_power(baseline=[0.33], effect=[0.03], sample_size=[3000])
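As a rough sanity check on a simulated power estimate, the same scenario can be approximated analytically. The sketch below uses the normal approximation for a two-sided two-proportion z-test with the standard library only. It is not how PowerSim works internally, and it assumes that effect is an absolute difference (matching relative_effect=False) and that sample_size means observations per group. The function name approx_power_two_proportions is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_proportions(baseline, effect, n_per_group, alpha=0.05):
    """Approximate two-sided power of a two-proportion z-test.

    Assumes `effect` is an absolute difference and that both groups
    have `n_per_group` observations.
    """
    p_control = baseline
    p_treatment = baseline + effect
    # Standard error of the difference in proportions under the alternative
    se = sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_treatment * (1 - p_treatment) / n_per_group
    )
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z_effect = abs(effect) / se
    # Probability of rejecting in either tail
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


power = approx_power_two_proportions(baseline=0.33, effect=0.03, n_per_group=3000)
print(f"approximate power: {power:.3f}")
```

A simulated estimate with nsim=1000 will fluctuate around a value like this from run to run, so small differences between the two are expected.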