diff --git a/DESCRIPTION b/DESCRIPTION index 8223372..2009624 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -17,20 +17,21 @@ Imports: gridExtra, nortest, stats, - utils + utils, + xplorerr Suggests: covr, descriptr, knitr, rmarkdown, testthat, - vdiffr, - xplorerr + vdiffr License: MIT + file LICENSE URL: https://olsrr.rsquaredacademy.com/, https://github.com/rsquaredacademy/olsrr BugReports: https://github.com/rsquaredacademy/olsrr/issues Encoding: UTF-8 LazyData: true VignetteBuilder: knitr -RoxygenNote: 7.1.1 +Roxygen: list(markdown = TRUE) +RoxygenNote: 7.2.3 Config/testthat/edition: 3 diff --git a/NEWS.md b/NEWS.md index 74764b9..700dc7a 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,10 +1,22 @@ -# olsrr 0.5.3.9000 +# olsrr 0.6.0 + +This is a minor release for bug fixes and other enhancements. ## New Features -- Force variables in/out in variable selection procedures -- Hierarchical selection -- Variable selection using r-squared and adjusted r-squared +- hierarchical selection can be enables when using `p` values as variable selection metric + +## Enhancements + +- force variables to be included or excluded from the model at all stages of variable selection +- Variable selection methods allow use of the following metrics: + - p value + - akaike information criterion (aic) + - schwarz bayesian criterion (sbc) + - sawa bayesian criterion (sbic) + - r-square + - adjusted r-square +- Choose threshold for determining influential observations in `ols_plot_dffits()` ## Bug Fixes diff --git a/README.Rmd b/README.Rmd index 1097fb0..7877f2a 100644 --- a/README.Rmd +++ b/README.Rmd @@ -16,9 +16,8 @@ knitr::opts_chunk$set( [](https://cran.r-project.org/package=olsrr) -[](https://cran.r-project.org/web/checks/check_results_olsrr.html) [](https://github.com/rsquaredacademy/olsrr/actions) -[](https://app.codecov.io/github/rsquaredacademy/olsrr?branch=master) [](https://CRAN.R-project.org/package=olsrr) [](https://lifecycle.r-lib.org/articles/stages.html) 
[](https://cran.r-project.org/package=olsrr) +[](https://app.codecov.io/github/rsquaredacademy/olsrr?branch=master) ## Overview @@ -41,8 +40,8 @@ The olsrr package provides following tools for building OLS regression models us install.packages("olsrr") # Install development version from GitHub -# install.packages("devtools") -devtools::install_github("rsquaredacademy/olsrr") +# install.packages("pak") +pak::pak("rsquaredacademy/olsrr") ``` ## Articles @@ -56,8 +55,6 @@ devtools::install_github("rsquaredacademy/olsrr") ## Usage -olsrr uses consistent prefix `ols_` for easy tab completion. - ```{r, echo=FALSE, message=FALSE} library(olsrr) library(dplyr) @@ -67,58 +64,14 @@ library(nortest) library(goftest) ``` -olsrr is built with the aim of helping those users who are new to the R language. If you know how to -write a `formula` or build models using `lm`, you will find olsrr very useful. Most of the functions -use an object of class `lm` as input. So you just need to build a model using `lm` and then pass it onto -the functions in olsrr. Below is a quick demo: +olsrr uses consistent prefix `ols_` for easy tab completion. If you know how to write a `formula` or build models using `lm`, you will find olsrr very useful. Most of the functions use an object of class `lm` as input. So you just need to build a model using `lm` and then pass it onto the functions in olsrr. Below is +a quick demo: #### Regression ```{r regress} -ols_regress(mpg ~ disp + hp + wt + qsec, data = mtcars) -``` - -#### Stepwise Regression - -Build regression model from a set of candidate predictor variables by entering and removing predictors based on -p values, in a stepwise manner until there is no variable left to enter or remove any more. 
- -#### Variable Selection - -```{r stepwise1} -# stepwise regression -model <- lm(y ~ ., data = surgical) -ols_step_both_p(model) -``` - -#### Stepwise AIC Backward Regression - -Build regression model from a set of candidate predictor variables by removing predictors based on -Akaike Information Criteria, in a stepwise manner until there is no variable left to remove any more. - -##### Variable Selection - -```{r stepaicb1} -# stepwise aic backward regression -model <- lm(y ~ ., data = surgical) -k <- ols_step_backward_aic(model) -k -``` - -#### Breusch Pagan Test - -Breusch Pagan test is used to test for herteroskedasticity (non-constant error variance). It tests whether the variance of the errors from a regression is dependent on the values of the independent variables. It is a $\chi^{2}$ test. - -```{r bp1} -model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars) -ols_test_breusch_pagan(model) -``` - -#### Collinearity Diagnostics - -```{r colldiag} model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars) -ols_coll_diag(model) +ols_regress(model) ``` ## Getting Help diff --git a/README.md b/README.md index a972342..6f8cce1 100644 --- a/README.md +++ b/README.md @@ -6,16 +6,10 @@ [](https://cran.r-project.org/package=olsrr) -[](https://cran.r-project.org/web/checks/check_results_olsrr.html) [](https://github.com/rsquaredacademy/olsrr/actions) [](https://app.codecov.io/github/rsquaredacademy/olsrr?branch=master) -[](https://CRAN.R-project.org/package=olsrr) -[](https://lifecycle.r-lib.org/articles/stages.html) -[](https://cran.r-project.org/package=olsrr) ## Overview @@ -39,8 +33,8 @@ models using R: install.packages("olsrr") # Install development version from GitHub -# install.packages("devtools") -devtools::install_github("rsquaredacademy/olsrr") +# install.packages("pak") +pak::pak("rsquaredacademy/olsrr") ``` ## Articles @@ -59,19 +53,17 @@ devtools::install_github("rsquaredacademy/olsrr") ## Usage -olsrr uses consistent prefix `ols_` for easy tab completion. 
- -olsrr is built with the aim of helping those users who are new to the R -language. If you know how to write a `formula` or build models using -`lm`, you will find olsrr very useful. Most of the functions use an -object of class `lm` as input. So you just need to build a model using -`lm` and then pass it onto the functions in olsrr. Below is a quick -demo: +olsrr uses consistent prefix `ols_` for easy tab completion. If you know +how to write a `formula` or build models using `lm`, you will find olsrr +very useful. Most of the functions use an object of class `lm` as input. +So you just need to build a model using `lm` and then pass it onto the +functions in olsrr. Below is a quick demo: #### Regression ``` r -ols_regress(mpg ~ disp + hp + wt + qsec, data = mtcars) +model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars) +ols_regress(model) #> Model Summary #> --------------------------------------------------------------- #> R 0.914 RMSE 2.409 @@ -108,195 +100,6 @@ ols_regress(mpg ~ disp + hp + wt + qsec, data = mtcars) #> ---------------------------------------------------------------------------------------- ``` -#### Stepwise Regression - -Build regression model from a set of candidate predictor variables by -entering and removing predictors based on p values, in a stepwise manner -until there is no variable left to enter or remove any more. - -#### Variable Selection - -``` r -# stepwise regression -model <- lm(y ~ ., data = surgical) -ols_step_both_p(model) -#> -#> -#> Stepwise Summary -#> ------------------------------------------------------------------------------ -#> Step Variable AIC SBC SBIC R2 Adj. 
R2 -#> ------------------------------------------------------------------------------ -#> 0 Base Model 802.606 806.584 646.794 0.00000 0.00000 -#> 1 liver_test (+) 771.875 777.842 616.009 0.45454 0.44405 -#> 2 alc_heavy (+) 761.439 769.395 605.506 0.56674 0.54975 -#> 3 enzyme_test (+) 750.509 760.454 595.297 0.65900 0.63854 -#> 4 pindex (+) 735.715 747.649 582.943 0.75015 0.72975 -#> 5 bcs (+) 730.620 744.543 579.638 0.78091 0.75808 -#> ------------------------------------------------------------------------------ -#> -#> Final Model Output -#> ------------------ -#> -#> Model Summary -#> ------------------------------------------------------------------- -#> R 0.884 RMSE 184.276 -#> R-Squared 0.781 MSE 38202.426 -#> Adj. R-Squared 0.758 Coef. Var 27.839 -#> Pred R-Squared 0.700 AIC 730.620 -#> MAE 137.656 SBC 744.543 -#> ------------------------------------------------------------------- -#> RMSE: Root Mean Square Error -#> MSE: Mean Square Error -#> MAE: Mean Absolute Error -#> AIC: Akaike Information Criteria -#> SBC: Schwarz Bayesian Criteria -#> -#> ANOVA -#> ----------------------------------------------------------------------- -#> Sum of -#> Squares DF Mean Square F Sig. -#> ----------------------------------------------------------------------- -#> Regression 6535804.090 5 1307160.818 34.217 0.0000 -#> Residual 1833716.447 48 38202.426 -#> Total 8369520.537 53 -#> ----------------------------------------------------------------------- -#> -#> Parameter Estimates -#> ------------------------------------------------------------------------------------------------ -#> model Beta Std. Error Std. 
Beta t Sig lower upper -#> ------------------------------------------------------------------------------------------------ -#> (Intercept) -1178.330 208.682 -5.647 0.000 -1597.914 -758.746 -#> liver_test 58.064 40.144 0.156 1.446 0.155 -22.652 138.779 -#> alc_heavy 317.848 71.634 0.314 4.437 0.000 173.818 461.878 -#> enzyme_test 9.748 1.656 0.521 5.887 0.000 6.419 13.077 -#> pindex 8.924 1.808 0.380 4.935 0.000 5.288 12.559 -#> bcs 59.864 23.060 0.241 2.596 0.012 13.498 106.230 -#> ------------------------------------------------------------------------------------------------ -``` - -#### Stepwise AIC Backward Regression - -Build regression model from a set of candidate predictor variables by -removing predictors based on Akaike Information Criteria, in a stepwise -manner until there is no variable left to remove any more. - -##### Variable Selection - -``` r -# stepwise aic backward regression -model <- lm(y ~ ., data = surgical) -k <- ols_step_backward_aic(model) -k -#> -#> -#> Stepwise Summary -#> ------------------------------------------------------------------------- -#> Step Variable AIC SBC SBIC R2 Adj. R2 -#> ------------------------------------------------------------------------- -#> 0 Full Model 736.390 756.280 586.665 0.78184 0.74305 -#> 1 alc_mod 734.407 752.308 583.884 0.78177 0.74856 -#> 2 gender 732.494 748.406 581.290 0.78142 0.75351 -#> 3 age 730.620 744.543 578.844 0.78091 0.75808 -#> ------------------------------------------------------------------------- -#> -#> Final Model Output -#> ------------------ -#> -#> Model Summary -#> ------------------------------------------------------------------- -#> R 0.884 RMSE 184.276 -#> R-Squared 0.781 MSE 38202.426 -#> Adj. R-Squared 0.758 Coef. 
Var 27.839 -#> Pred R-Squared 0.700 AIC 730.620 -#> MAE 137.656 SBC 744.543 -#> ------------------------------------------------------------------- -#> RMSE: Root Mean Square Error -#> MSE: Mean Square Error -#> MAE: Mean Absolute Error -#> AIC: Akaike Information Criteria -#> SBC: Schwarz Bayesian Criteria -#> -#> ANOVA -#> ----------------------------------------------------------------------- -#> Sum of -#> Squares DF Mean Square F Sig. -#> ----------------------------------------------------------------------- -#> Regression 6535804.090 5 1307160.818 34.217 0.0000 -#> Residual 1833716.447 48 38202.426 -#> Total 8369520.537 53 -#> ----------------------------------------------------------------------- -#> -#> Parameter Estimates -#> ------------------------------------------------------------------------------------------------ -#> model Beta Std. Error Std. Beta t Sig lower upper -#> ------------------------------------------------------------------------------------------------ -#> (Intercept) -1178.330 208.682 -5.647 0.000 -1597.914 -758.746 -#> bcs 59.864 23.060 0.241 2.596 0.012 13.498 106.230 -#> pindex 8.924 1.808 0.380 4.935 0.000 5.288 12.559 -#> enzyme_test 9.748 1.656 0.521 5.887 0.000 6.419 13.077 -#> liver_test 58.064 40.144 0.156 1.446 0.155 -22.652 138.779 -#> alc_heavy 317.848 71.634 0.314 4.437 0.000 173.818 461.878 -#> ------------------------------------------------------------------------------------------------ -``` - -#### Breusch Pagan Test - -Breusch Pagan test is used to test for herteroskedasticity (non-constant -error variance). It tests whether the variance of the errors from a -regression is dependent on the values of the independent variables. It -is a $\chi^{2}$ test. 
- -``` r -model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars) -ols_test_breusch_pagan(model) -#> -#> Breusch Pagan Test for Heteroskedasticity -#> ----------------------------------------- -#> Ho: the variance is constant -#> Ha: the variance is not constant -#> -#> Data -#> ------------------------------- -#> Response : mpg -#> Variables: fitted values of mpg -#> -#> Test Summary -#> --------------------------- -#> DF = 1 -#> Chi2 = 1.429672 -#> Prob > Chi2 = 0.231818 -``` - -#### Collinearity Diagnostics - -``` r -model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars) -ols_coll_diag(model) -#> Tolerance and Variance Inflation Factor -#> --------------------------------------- -#> Variables Tolerance VIF -#> 1 disp 0.1252279 7.985439 -#> 2 hp 0.1935450 5.166758 -#> 3 wt 0.1445726 6.916942 -#> 4 qsec 0.3191708 3.133119 -#> -#> -#> Eigenvalue and Condition Index -#> ------------------------------ -#> Eigenvalue Condition Index intercept disp hp wt -#> 1 4.721487187 1.000000 0.000123237 0.001132468 0.001413094 0.0005253393 -#> 2 0.216562203 4.669260 0.002617424 0.036811051 0.027751289 0.0002096014 -#> 3 0.050416837 9.677242 0.001656551 0.120881424 0.392366164 0.0377028008 -#> 4 0.010104757 21.616057 0.025805998 0.777260487 0.059594623 0.7017528428 -#> 5 0.001429017 57.480524 0.969796790 0.063914571 0.518874831 0.2598094157 -#> qsec -#> 1 0.0001277169 -#> 2 0.0046789491 -#> 3 0.0001952599 -#> 4 0.0024577686 -#> 5 0.9925403056 -``` - ## Getting Help If you encounter a bug, please file a minimal reproducible example using diff --git a/cran-comments.md b/cran-comments.md index 108b5a3..e4fa3ce 100644 --- a/cran-comments.md +++ b/cran-comments.md @@ -1,10 +1,10 @@ -## Test environments -* local Windows 10, R 3.6.2 -* ubuntu 12.04 (on travis-ci), R 3.5.3, R 3.6.2, R-devel -* win-builder (devel and release) - ## R CMD check results 0 errors | 0 warnings | 0 note +## revdepcheck results + +We checked 4 reverse dependencies, comparing R CMD check results across CRAN and 
dev versions of this package. + * We saw 0 new problems + * We failed to check 0 packages diff --git a/docs/news/index.html b/docs/news/index.html index b57fe91..13a5992 100644 --- a/docs/news/index.html +++ b/docs/news/index.html @@ -1,5 +1,5 @@ -
ols_plot_dffits()
+ols_test_outlier() does not find any outliers, it returns largest positive residual instead of largest absolute residual (#177)
ols_step_all_possible() with Model created from dynamic function leads to "Error in eval(model$call$data) . . . not found" (#176)