Render for correct output in weigh() (#281)
* Render for correct output in `weigh()`

* Make sure we don't run into this again in the future
juliasilge authored Jan 28, 2025
1 parent 6dff2cc commit 5836908
Showing 2 changed files with 23 additions and 13 deletions.
30 changes: 20 additions & 10 deletions README.Rmd
@@ -11,6 +11,16 @@ knitr::opts_chunk$set(
fig.path = "man/figures/README-",
out.width = "100%"
)
# `devtools::build_readme()` evaluates in `callr::r_safe()`, which causes issues
# with butcher's memory profiling. Neither RStudio's Knit button nor
# `rmarkdown::render()` has this issue (#280).
if (identical(Sys.getenv("CALLR_IS_RUNNING"), "true")) {
  rlang::abort(c(
    "Build this README with `rmarkdown::render()` rather than `devtools::build_readme()`.",
    "See tidymodels/butcher#280 for more info."
  ))
}
```
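
The guard in this hunk depends only on the `CALLR_IS_RUNNING` environment variable, so the same check can be sketched in base R. This is a minimal illustration, not the README's code: it uses `stop()` in place of `rlang::abort()`, and the helper names `running_under_callr()` and `guard_readme_build()` are hypothetical and not part of butcher.

```r
# Hypothetical helper: TRUE when the CALLR_IS_RUNNING variable checked in
# the README chunk is set to "true" for the current R process.
running_under_callr <- function() {
  identical(Sys.getenv("CALLR_IS_RUNNING"), "true")
}

# Hypothetical helper: fail early with the same advice the README gives,
# otherwise return TRUE invisibly so rendering can continue.
guard_readme_build <- function() {
  if (running_under_callr()) {
    stop(
      "Build this README with `rmarkdown::render()` rather than ",
      "`devtools::build_readme()`.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}
```

Calling `guard_readme_build()` at the top of a setup chunk would stop the render before any memory profiling output is produced in the wrong kind of session.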

# butcher <a href="https://butcher.tidymodels.org"><img src="man/figures/logo.png" align="right" height="138" alt="butcher website" /></a>
@@ -29,7 +39,7 @@ Modeling or machine learning in R can result in fitted model objects that take u
1. Heavy usage of formulas and closures that capture the enclosing environment in model training
2. Lack of selectivity in the construction of the model object itself

As a result, fitted model objects contain components that are often redundant and not required for post-fit estimation activities. The butcher package provides tooling to "axe" parts of the fitted output that are no longer needed, without sacrificing prediction functionality from the original model object.
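
The first cause above, formulas capturing their enclosing environment, can be seen with base R alone. The following is a small sketch; the names `make_formula()` and `big_vector` are made up for illustration and do not come from butcher.

```r
# A formula created inside a function keeps a reference to that function's
# evaluation environment, including objects the formula never mentions.
make_formula <- function() {
  big_vector <- runif(1e6)  # unrelated to the formula, but captured anyway
  y ~ x
}

f <- make_formula()
captured <- environment(f)

# The unrelated vector is still reachable through the formula.
exists("big_vector", envir = captured, inherits = FALSE)
length(get("big_vector", envir = captured))
```

Any object that stores such a formula, including a fitted model, keeps `big_vector` alive for as long as the object exists.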

## Installation

@@ -46,33 +56,33 @@ Or install the development version from [GitHub](https://github.com/):
pak::pak("tidymodels/butcher")
```

## Butchering

As an example, let's wrap an `lm` model so it contains a lot of unnecessary stuff:

```{r example}
library(butcher)
our_model <- function() {
  some_junk_in_the_environment <- runif(1e6) # we didn't know about
  lm(mpg ~ ., data = mtcars)
}
```

This object is unnecessarily large:

```{r}
library(lobstr)
obj_size(our_model())
```

When, in fact, it should only be:

```{r}
small_lm <- lm(mpg ~ ., data = mtcars)
obj_size(small_lm)
```
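
As a rough cross-check that does not need lobstr, the extra weight can be traced to the environment attached to the model's `terms` component. This sketch assumes serialized length is an acceptable proxy for size; it is not how `lobstr::obj_size()` measures objects, and `serialized_size()` is a hypothetical helper.

```r
# Hypothetical helper: number of bytes produced when serializing an object.
# Environments reachable from the object (other than global, base, and
# namespace environments) are written out in full.
serialized_size <- function(x) length(serialize(x, connection = NULL))

our_model <- function() {
  some_junk_in_the_environment <- runif(1e6) # we didn't know about
  lm(mpg ~ ., data = mtcars)
}
big_lm <- our_model()

# The terms component carries the function's evaluation environment,
# which still holds the junk vector.
terms_env <- attr(big_lm$terms, ".Environment")
ls(terms_env)

# One million doubles are about 8 MB, so the serialized model exceeds that.
serialized_size(big_lm)
```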

To understand which part of our original model object is taking up the most memory, we leverage the `weigh()` function:

```{r}
big_lm <- our_model()
@@ -113,8 +123,8 @@ Check out the `vignette("available-axe-methods")` to see butcher's current cover

1. Run `new_model_butcher(model_class = "your_object", package_name = "your_package")`
2. Use butcher helper functions `weigh()` and `locate()` to decide what to axe
3. Finalize edits to `R/your_object.R` and `tests/testthat/test-your_object.R`
4. Make a pull request!

## Contributing

6 changes: 3 additions & 3 deletions README.md
@@ -82,7 +82,7 @@ weigh(big_lm)
#> # A tibble: 25 × 2
#> object size
#> <chr> <dbl>
- #> 1 terms 8.05
+ #> 1 terms 8.01
#> 2 qr.qr 0.00666
#> 3 residuals 0.00286
#> 4 fitted.values 0.00286
@@ -102,7 +102,7 @@ remove the (mostly) extraneous component, we can use `butcher()`:

``` r
cleaned_lm <- butcher(big_lm, verbose = TRUE)
- #> ✔ Memory released: 8.03 MB
+ #> ✔ Memory released: 8.00 MB
#> ✖ Disabled: `print()`, `summary()`, and `fitted()`
```

@@ -133,7 +133,7 @@ weigh(small_lm)
#> # A tibble: 25 × 2
#> object size
#> <chr> <dbl>
- #> 1 terms 8.06
+ #> 1 terms 0.00763
#> 2 qr.qr 0.00666
#> 3 residuals 0.00286
#> 4 fitted.values 0.00286