dataduck

An R package providing tools for data bucketing with lookup tables, aggregation, and validation, optimized for performance with databases like DuckDB.

Unit Testing

Tests are located in the tests/testthat folder. To run all tests:

library("devtools")
devtools::test()
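
For orientation, here is a minimal sketch of what a file in tests/testthat might look like. The file name `test-mean-by-group.R` and the helper `mean_by_group()` are hypothetical and are not part of dataduck; they only illustrate the usual testthat layout.

```r
# Hypothetical contents of tests/testthat/test-mean-by-group.R
library(testthat)
library(dplyr)

# Hypothetical helper, defined here only so the sketch is self-contained.
mean_by_group <- function(data, group_col, value_col) {
  data |>
    group_by(.data[[group_col]]) |>
    summarize(average = mean(.data[[value_col]], na.rm = TRUE)) |>
    ungroup()
}

test_that("mean_by_group averages values within each group", {
  input <- tibble(group = c("a", "a", "b"), value = c(1, 3, 10))
  result <- mean_by_group(input, "group", "value")

  expect_equal(result$group, c("a", "b"))
  expect_equal(result$average, c(2, 10))
})
```

To run only the test files whose names match a pattern, `devtools::test()` accepts a `filter` argument, for example `devtools::test(filter = "mean-by-group")` for the hypothetical file above.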

Contribution Guidelines

Variable naming conventions

1. Data Variables: In this R package, variables often refer to data that is either:

Stored in R memory as a tibble.
Stored on a DuckDB server as a pointer to a database table.

These two data types are handled differently, and failing to distinguish between them may lead to subtle bugs. To prevent this:

Variables referring to an in-memory tibble are suffixed with _tb.
Variables referring to an on-server database table are suffixed with _db.

Example:

library(duckdb)
library(dplyr)

# `iris` held in memory as a tibble is named `iris_tb`
iris_tb <- as_tibble(iris)

# `iris` as a pointer to a database table is named `iris_db`.
# The table itself, inside the connection, is still just called `iris`.
con <- dbConnect(duckdb::duckdb(), ":memory:")
dbWriteTable(con, "iris", iris_tb, overwrite = TRUE)
iris_db <- tbl(con, "iris")
dbDisconnect(con)
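
To illustrate why the suffixes matter, the sketch below shows a few ways the two kinds of object behave differently. It assumes the dbplyr package is installed (dplyr needs it for `tbl()` on a database connection); the variable names `n_tb`, `n_db` and `iris_collected_tb` are made up for this example.

```r
library(duckdb)
library(dplyr)

iris_tb <- as_tibble(iris)

con <- dbConnect(duckdb::duckdb(), ":memory:")
dbWriteTable(con, "iris", iris_tb, overwrite = TRUE)
iris_db <- tbl(con, "iris")

# A tibble knows its row count. A lazy database table does not,
# because no query has been run yet.
n_tb <- nrow(iris_tb)  # 150
n_db <- nrow(iris_db)  # NA

# The two objects can be told apart by class.
is_lazy_tb <- inherits(iris_tb, "tbl_lazy")  # FALSE
is_lazy_db <- inherits(iris_db, "tbl_lazy")  # TRUE

# collect() runs the query and pulls the result into memory,
# so the result gets the _tb suffix.
iris_collected_tb <- collect(iris_db)

dbDisconnect(con)
```

Code that calls `nrow()` on a `_db` object without collecting first gets `NA` rather than an error, which is one example of the kind of subtle bug the naming convention is meant to make visible.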

2. Column name arguments: Many functions have arguments that are strings representing column names in the data. To clearly indicate that an argument represents a column name, we suffix it with _col.

Example:

library(dplyr)

# Average `value_col` within each group of `group_col`.
# Both arguments are strings naming columns in `data`.
calculate_group_average <- function(data, group_col, value_col) {
  data |>
    group_by(.data[[group_col]]) |>
    summarize(average = mean(.data[[value_col]], na.rm = TRUE)) |>
    ungroup()
}

# Example usage
calculate_group_average(
  data = iris_tb,
  group_col = "Species",
  value_col = "Petal.Width"
)
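
Because the `_col` arguments are plain strings, the same function can in principle be applied to either kind of data variable. The sketch below repeats the function so that it runs on its own, and assumes dbplyr is installed; the result names `avg_tb`, `avg_db` and `avg_from_db_tb` are illustrative only.

```r
library(duckdb)
library(dplyr)

calculate_group_average <- function(data, group_col, value_col) {
  data |>
    group_by(.data[[group_col]]) |>
    summarize(average = mean(.data[[value_col]], na.rm = TRUE)) |>
    ungroup()
}

iris_tb <- as_tibble(iris)

con <- dbConnect(duckdb::duckdb(), ":memory:")
dbWriteTable(con, "iris", iris_tb, overwrite = TRUE)
iris_db <- tbl(con, "iris")

# In-memory input gives an in-memory result.
avg_tb <- calculate_group_average(iris_tb, "Species", "Petal.Width")

# Database input gives a lazy result: still a _db object.
avg_db <- calculate_group_average(iris_db, "Species", "Petal.Width")

# Collect before disconnecting to obtain a tibble.
avg_from_db_tb <- collect(avg_db)

dbDisconnect(con)
```

Note that the suffix follows the object that comes back: passing a `_db` variable in yields a `_db` result until `collect()` is called.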
