Nested cross-validation with spatial block CV and cluster CV #555

Open
gregor-fausto opened this issue Nov 21, 2024 · 0 comments

I'm setting up a nested cross-validation that uses spatial block cross-validation for the outer folds and cluster cross-validation for the inner folds. Passing the spatial block folds to rsample::nested_cv() together with a clustering_cv() call for the inner resamples produces the following error:

library(spatialsample)
library(rsample)
folds <- spatialsample::spatial_block_cv(boston_canopy, v = 3)
rsample::nested_cv(
  boston_canopy,
  outside = folds,
  inside = rsample::clustering_cv(
    v = 3,
    repeats = 1,
    vars = c('mean_temp', 'mean_heat_index_morning')
  )
)
#> Error in `map()`:
#> ℹ In index: 1.
#> Caused by error in `dist()`:
#> ! 'list' object cannot be coerced to type 'double'

The behavior that I'm trying to achieve is shown here:

library(spatialsample)
library(rsample)
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
folds <- spatialsample::spatial_block_cv(boston_canopy, v = 3)
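# Build the inner resamples by hand, one set per outer fold
tmp <- list()
for (j in seq_len(nrow(folds))) {
  # Analysis set of outer fold j, with the sf geometry column dropped
  x <- rsample::analysis(folds$splits[[j]]) %>% sf::st_drop_geometry()
  tmp[[j]] <- rsample::clustering_cv(
    data = x,
    v = 3,
    repeats = 1,
    vars = c('mean_temp', 'mean_heat_index_morning')
  )
}
folds$inner_resamples <- tmp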
folds
#> #  3-fold spatial block cross-validation 
#> # A tibble: 3 × 3
#>   splits            id    inner_resamples   
#>   <list>            <chr> <list>            
#> 1 <split [458/224]> Fold1 <clt_fold [3 × 2]>
#> 2 <split [469/213]> Fold2 <clt_fold [3 × 2]>
#> 3 <split [437/245]> Fold3 <clt_fold [3 × 2]>

This example uses the boston_canopy data from the spatialsample package. I've opened the issue in rsample because the error appears to come from how rsample::nested_cv() handles the output of spatial_block_cv().
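
One possible explanation, offered as an assumption and not a confirmed diagnosis: the only difference between the failing call and the working loop is sf::st_drop_geometry(), and sf objects keep their geometry list-column when subset by column. If the clustering variables are selected from the sf object without dropping geometry, that list-column could be what reaches dist(). The sketch below only checks the subsetting behaviour; it does not inspect rsample internals, and the names kept and dropped are illustrative.

```r
library(spatialsample)
library(sf)

vars <- c("mean_temp", "mean_heat_index_morning")

# Column subsetting on an sf object: the result is still an sf object,
# so the geometry list-column travels along with the selected variables.
kept <- boston_canopy[vars]
print(class(kept))
print(names(kept))

# After dropping geometry, only the requested numeric columns remain.
dropped <- sf::st_drop_geometry(boston_canopy)[vars]
print(class(dropped))
print(names(dropped))
```

If that assumption holds, it would be consistent with the loop above working only because geometry is dropped before clustering_cv() is called.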
