Updating the non-hierarchical worksheets to make sure they run
kaybenleroll committed Jun 6, 2023
1 parent 96fbf40 commit 9337725
Showing 11 changed files with 8,234 additions and 97 deletions.
12 changes: 6 additions & 6 deletions construct_longsynth_fixed_pnbd_models.html

Large diffs are not rendered by default.

29 changes: 29 additions & 0 deletions construct_longsynth_onehier_pnbd_models.qmd
@@ -115,6 +115,7 @@ customer_transactions_tbl |> glimpse()

## Load Derived Data

<<<<<<< Updated upstream
```{r write_data_disk}
#| echo: TRUE
@@ -126,12 +127,40 @@ fit_1000_data_tbl <- read_rds("data/fit_1000_longframe_data_tbl.rds")
fit_10000_data_tbl <- read_rds("data/fit_10000_longframe_data_tbl.rds")
customer_summarystats_tbl <- read_rds("data/customer_summarystats_longframe_tbl.rds")
=======
```{r load_modelling_data}
#| echo: TRUE
id_1000 <- read_rds("data/longsynth_id_1000.rds")
id_5000 <- read_rds("data/longsynth_id_5000.rds")
id_10000 <- read_rds("data/longsynth_id_10000.rds")
fit_1000_data_tbl <- read_rds("data/longsynth_fit_1000_data_tbl.rds")
fit_10000_data_tbl <- read_rds("data/longsynth_fit_10000_data_tbl.rds")
customer_fit_stats_tbl <- fit_1000_data_tbl
customer_summarystats_tbl <- read_rds("data/longsynth_customer_summarystats_tbl.rds")
>>>>>>> Stashed changes
obs_fitdata_tbl <- read_rds("data/longsynth_obs_fitdata_tbl.rds")
obs_validdata_tbl <- read_rds("data/longsynth_obs_validdata_tbl.rds")
```


<<<<<<< Updated upstream
=======
Finally, we set the directories in which we save our Stan code and the
model outputs.

```{r setup_workbook_parameters}
#| echo: TRUE
stan_modeldir <- "stan_models"
stan_codedir <- "stan_code"
```


>>>>>>> Stashed changes
# Fit First Hierarchical Lambda Model

164 changes: 98 additions & 66 deletions construct_onlineretail_fixed_pnbd_models.html

Large diffs are not rendered by default.

38 changes: 27 additions & 11 deletions construct_onlineretail_fixed_pnbd_models.qmd
@@ -133,7 +133,6 @@ ggplot(plot_tbl, aes(x = tnx_timestamp, y = customer_id)) +
```



## Construct Datasets

Having loaded the synthetic data we need to construct a number of datasets of
@@ -980,19 +979,19 @@ calculate_simulation_statistics <- function(file_rds) {
```{r load_model_assessment_data}
#| echo: TRUE
-obs_fit_customer_count <- customer_fit_stats_tbl |>
-filter(x > 0) |>
+obs_fit_customer_count <- obs_fitdata_tbl |>
+filter(tnx_count > 0) |>
nrow()
-obs_valid_customer_count <- customer_valid_stats_tbl |>
+obs_valid_customer_count <- obs_validdata_tbl |>
filter(tnx_count > 0) |>
nrow()
-obs_fit_total_count <- customer_fit_stats_tbl |>
-pull(x) |>
+obs_fit_total_count <- obs_fitdata_tbl |>
+pull(tnx_count) |>
sum()
-obs_valid_total_count <- customer_valid_stats_tbl |>
+obs_valid_total_count <- obs_validdata_tbl |>
pull(tnx_count) |>
sum()
@@ -1004,19 +1003,30 @@ obs_stats_tbl <- tribble(
"valid", "simtnx_count", obs_valid_total_count
)
model_assess_tbl <- dir_ls("data", regexp = "pnbd_onlineretail_.*_assess") |>
enframe(name = NULL, value = "file_path") |>
filter(str_detect(file_path, "_assess_model_", negate = TRUE)) |>
mutate(
model_label = str_replace(file_path, "data/pnbd_onlineretail_(.*?)_assess_.*", "\\1"),
assess_type = if_else(str_detect(file_path, "_assess_fit_"), "fit", "valid"),
-sim_data = map(file_path, calculate_simulation_statistics)
+sim_data = map(
+file_path, calculate_simulation_statistics,
+.progress = "calculate_simulation_statistics"
+)
)
model_assess_tbl |> glimpse()
```

Having constructed the simulation summary statistics, we now reshape our
data to aid our model assessment.


```{r calculate_model_assess_summary_stats}
#| echo: TRUE
model_assess_summstat_tbl <- model_assess_tbl |>
select(model_label, assess_type, sim_data) |>
unnest(sim_data) |>
@@ -1034,9 +1044,15 @@ model_assess_summstat_tbl <- model_assess_tbl |>
p75 = quantile(value, 0.75),
p90 = quantile(value, 0.90)
)
model_assess_summstat_tbl |> glimpse()
```


We now use this data to construct model comparison plots for the different
models we have fit.


```{r construct_model_comparison_plot}
#| echo: TRUE
4,030 changes: 4,030 additions & 0 deletions exploring_longsynth_data.html

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions exploring_online_retail_transactions.html
@@ -7,9 +7,9 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">

<meta name="author" content="Mick Cooney [email protected]">
-<meta name="dcterms.date" content="2023-05-18">
+<meta name="dcterms.date" content="2023-06-06">

-<title>Construct Non-Hierarchical P/NBD Model for Long Timeframe Synthetic Data</title>
+<title>Exploring the Online Retail Transaction Data</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -5331,7 +5331,7 @@ <h2 id="toc-title">Table of contents</h2>

<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
-<h1 class="title">Construct Non-Hierarchical P/NBD Model for Long Timeframe Synthetic Data</h1>
+<h1 class="title">Exploring the Online Retail Transaction Data</h1>
</div>


@@ -5348,7 +5348,7 @@ <h1 class="title">Construct Non-Hierarchical P/NBD Model for Long Timeframe Synt
<div>
<div class="quarto-title-meta-heading">Published</div>
<div class="quarto-title-meta-contents">
-<p class="date">May 18, 2023</p>
+<p class="date">June 6, 2023</p>
</div>
</div>

@@ -7702,7 +7702,7 @@ <h1 data-number="11"><span class="header-section-number">11</span> R Environment
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Dublin
-date 2023-05-18
+date 2023-06-06
pandoc 2.19.2 @ /usr/local/bin/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
2 changes: 1 addition & 1 deletion exploring_online_retail_transactions.qmd
@@ -1,5 +1,5 @@
---
-title: "Construct Non-Hierarchical P/NBD Model for Long Timeframe Synthetic Data"
+title: "Exploring the Online Retail Transaction Data"
author: "Mick Cooney <[email protected]>"
date: "Last updated: `r format(Sys.time(), '%B %d, %Y')`"
editor: source