Refactor convert_input to Perform tasks via helper function #3338

Open
wants to merge 40 commits into
base: develop
614b8f9
Shift functions to check for missing files
Sweetdevil144 Jul 18, 2024
838af61
Update CHANGELOG
Sweetdevil144 Jul 18, 2024
d5e8d24
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Jul 25, 2024
f22b962
Remove unutilized variables from convert_input
Sweetdevil144 Jul 25, 2024
d884203
Update logger statements in convert_input
Sweetdevil144 Jul 25, 2024
68d9516
Added separate function to check machine info
Sweetdevil144 Jul 25, 2024
5208b02
Update input args to get machine info
Sweetdevil144 Jul 25, 2024
f570646
Correct roxygen documentations
Sweetdevil144 Jul 25, 2024
e479c46
Update tests
Sweetdevil144 Jul 25, 2024
d9e911b
Merge branch 'PecanProject:develop' into gsoc/convert-input
Sweetdevil144 Jul 31, 2024
ed581f7
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Jul 31, 2024
63ac964
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Aug 5, 2024
0f9ac13
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Aug 9, 2024
b98617f
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Aug 12, 2024
4b771d3
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Aug 14, 2024
63f270f
Refactor extra variables in `run.meta.analysis`
Sweetdevil144 Aug 14, 2024
dbb7a6d
Merge branch 'PecanProject:develop' into gsoc/convert-input
Sweetdevil144 Aug 16, 2024
74003d9
get existing machine info using helper function
Sweetdevil144 Aug 21, 2024
95fb810
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Aug 28, 2024
2bcb7c4
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Aug 31, 2024
fcae9bd
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Sep 3, 2024
c8e8a02
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Sep 5, 2024
766174f
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Sep 15, 2024
d9074df
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Sep 21, 2024
94c92f0
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Oct 7, 2024
a578be2
Applied changes as suggested by @infotroph
Sweetdevil144 Oct 9, 2024
293a68b
Minor review changes
Sweetdevil144 Oct 9, 2024
f7f6926
Update base/db/R/get.machine.info.R
Sweetdevil144 Oct 9, 2024
8f820b0
Apply suggestions from code review
Sweetdevil144 Oct 9, 2024
4548744
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Nov 20, 2024
ffc0971
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Nov 21, 2024
9384343
Update machine host to remove duplicate code
Sweetdevil144 Dec 2, 2024
6dc63f7
Merge branch 'PecanProject:develop' into gsoc/convert-input
Sweetdevil144 Dec 22, 2024
04d7835
Update naming
Abhinav-IITBHU Dec 23, 2024
f5edb7f
Merge branch 'develop' into gsoc/convert-input
Sweetdevil144 Jan 16, 2025
3379def
Merge branch 'PecanProject:develop' into gsoc/convert-input
Sweetdevil144 Jan 25, 2025
e9a95ee
Update documentations wrt comments by @mdietze
Sweetdevil144 Jan 27, 2025
525e05f
Update check_missing_files.R
Sweetdevil144 Jan 27, 2025
f82fc4b
Update add_database_entries.R
Sweetdevil144 Jan 27, 2025
5ac6413
Renamed `add_database_entries` and Updated documentations
Sweetdevil144 Jan 27, 2025
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,5 @@
# Change Log

All notable changes are kept in this file. All changes made should be added to the section called
`Unreleased`. Once a new release is made this file will be updated to create a new `Unreleased`
section for the next release.
@@ -9,6 +10,8 @@ For more information about this file see also [Keep a Changelog](http://keepacha

### Added

- Refactor `convert_input` to perform tasks via helper function. Subtask of [#3307](https://github.com/PecanProject/pecan/issues/3307)

### Fixed
- updated github action to build docker images

56 changes: 56 additions & 0 deletions base/db/R/check_missing_files.R
@@ -0,0 +1,56 @@
#' Check for Missing or Empty Files in Conversion Results
#'
#' This function inspects the file paths in a list of data frames (typically produced by a download or conversion routine) to ensure that each file is present and non-empty. Specifically, it checks whether any file path is missing or has a file size of zero, and logs an error if such files are detected. It also normalizes `existing.input` and `existing.dbfile` so that each is returned as a list of data frames.
#'
#' @param result A list of data frames containing file information. Each data frame is expected to have a column named `file` with absolute file paths created by a data-conversion or download function. For example, this might be the structure returned by a "download_X" or "met2model_X" function when invoked via [convert_input()].
#' @param existing.input A data frame or list of data frames (possibly zero rows) representing input records in the BETY `inputs` table that match (or partially match) the data being added. This is converted to a list of data frames if it is not already.
#' @param existing.dbfile A data frame or list of data frames (possibly zero rows) representing dbfile records in the BETY `dbfiles` table that match (or partially match) the data being added. This is also converted to a list of data frames if it is not already.
#'
#' @return An unnamed list of length two containing, in order:
#' \itemize{
#' \item A list of data frames for `existing.input`
#' \item A list of data frames for `existing.dbfile`
#' }
#'
#' @details
#' The function calculates the file size for each file specified in the `result` data frames. If any file does not exist (its size is `NA`) or has a size of zero, the function raises a fatal error (via [PEcAn.logger::logger.severe]) indicating that an expected file is either nonexistent or empty. If no such issues are found, it wraps `existing.input` and `existing.dbfile` in a list whenever they are supplied as bare data frames, for consistent downstream usage; any other value (including `NULL`) is returned unchanged.
#'
#' @author Betsy Cowdery, Michael Dietze, Ankur Desai, Tony Gardella, Luke Dramko

check_missing_files <- function(result, existing.input = NULL, existing.dbfile = NULL) {
  # Compute the size of every file listed in `result`.
  # file.size() returns NA for a path that does not exist.
  result_sizes <- purrr::map_dfr(
    result,
    ~ dplyr::mutate(
      .,
      file_size = purrr::map_dbl(file, file.size),
      missing = is.na(file_size),
      empty = file_size == 0
    )
  )

  if (any(result_sizes$missing) || any(result_sizes$empty)) {
    PEcAn.logger::logger.severe(
      "Requested Processing produced empty files or Nonexistent files:\n",
      # Select by name rather than by position so the log does not depend on
      # how many columns the conversion function returned.
      log_format_df(result_sizes[, c("file", "file_size", "missing", "empty")]),
      "\n Table of results printed above.",
      wrap = FALSE
    )
  }

  # Wrap in a list for consistent processing later
  if (is.data.frame(existing.input)) {
    existing.input <- list(existing.input)
  }

  if (is.data.frame(existing.dbfile)) {
    existing.dbfile <- list(existing.dbfile)
  }
  return(list(existing.input, existing.dbfile))
}

log_format_df <- function(df) {
  formatted_df <- rbind(colnames(df), format(df))
  formatted_text <- purrr::reduce(formatted_df, paste, sep = " ")
  paste(formatted_text, collapse = "\n")
}
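
To make the check easier to follow, here is a minimal base-R sketch of the same idea. It is not the PR's code: `flag_missing_or_empty` and the temporary file names are hypothetical, introduced only for illustration, and the sketch avoids `purrr`, `dplyr` and `PEcAn.logger` so it can run on its own. It relies on `file.size()` returning `NA` for a path that does not exist. Unlike the function in the diff, it reports `empty` as `FALSE` rather than `NA` for a file that is missing.

```r
# Hypothetical helper (not part of the PR): flag missing or empty files
# in a list of data frames that each have a `file` column.
flag_missing_or_empty <- function(result) {
  checked <- lapply(result, function(df) {
    df$file_size <- file.size(df$file)            # NA when the file does not exist
    df$missing <- is.na(df$file_size)
    df$empty <- !df$missing & df$file_size == 0   # FALSE (not NA) for missing files
    df
  })
  do.call(rbind, checked)
}

# Build three test paths: one with content, one empty, one never created.
tmp <- tempfile("check_missing_")
dir.create(tmp)
ok_file <- file.path(tmp, "ok.nc")
empty_file <- file.path(tmp, "empty.nc")
gone_file <- file.path(tmp, "gone.nc")
writeLines("data", ok_file)
invisible(file.create(empty_file))

result <- list(
  data.frame(file = c(ok_file, empty_file), stringsAsFactors = FALSE),
  data.frame(file = gone_file, stringsAsFactors = FALSE)
)

flags <- flag_missing_or_empty(result)
print(flags[, c("file_size", "missing", "empty")])

# The same normalization step as in the diff: wrap a bare data frame in a list.
existing.input <- data.frame(id = 1L)
if (is.data.frame(existing.input)) {
  existing.input <- list(existing.input)
}
```

Because `check_missing_files()` returns an unnamed list, a caller would presumably read the two elements by position (`[[1]]` for `existing.input`, `[[2]]` for `existing.dbfile`).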