Reworked cs_runner + added FEMA 100 + bathymetry #132

Closed
wants to merge 65 commits into from
Commits
0092719
updated transects and cross section pts runners with new hydrofabric3…
anguswg-ucsb Dec 6, 2023
16c6637
small cleanups
anguswg-ucsb Dec 6, 2023
ddecfff
Updating config and downlaod_nextgen runner scripts to be more flexib…
anguswg-ucsb Dec 9, 2023
2567b41
small cleanups to cs_runner/config.R, removing old code
anguswg-ucsb Dec 11, 2023
67d33bb
updated transect and cross section runners to allow transects to be u…
anguswg-ucsb Feb 26, 2024
a5d6e05
added COLLECT_META flag before setting the meta_path variable in cs r…
anguswg-ucsb Feb 26, 2024
ad88c27
updated cross section points generation to better rectify flat cross…
anguswg-ucsb Feb 28, 2024
7bf195e
added download script for FEMA 100 year flood fgb files, and related …
anguswg-ucsb Mar 19, 2024
28be701
cleaned up 01_transects and 02_cs_pts runners to reflect newest metho…
anguswg-ucsb Mar 21, 2024
adeee65
cleaned up download_fema100.R script and replaced glue::glue() with p…
anguswg-ucsb Mar 21, 2024
bfa3c9b
fixed aws s3 copy command to specify the aws_profile to use
anguswg-ucsb Mar 21, 2024
6b8bfe7
working on new runner for injecting ML outputs to dem cross sections
anguswg-ucsb Apr 6, 2024
ae7ae6d
updating inject ml runner to use updated hydrofabric3D functionality
anguswg-ucsb Apr 8, 2024
70eccc3
updated runners/cs_runner to include machine learning estimated width…
anguswg-ucsb Apr 11, 2024
4e40b00
super minor cleanups
anguswg-ucsb Apr 18, 2024
e4c2558
random cleanup
anguswg-ucsb Apr 23, 2024
0b290ce
basic layout for fema processing
anguswg-ucsb Apr 23, 2024
f781526
added code to simplify, dissolve, and explode fema geometries via map…
anguswg-ucsb Apr 25, 2024
d49133e
added code to apply clean_geometries from hydrofab
anguswg-ucsb Apr 26, 2024
633109b
created a preprocess_fema.R script in cs_runner/ that takes FEMA FGBs…
anguswg-ucsb Apr 30, 2024
703fadf
partitioning fema floodplains by vpu code
anguswg-ucsb May 1, 2024
5d36e1e
small minor cleanups
anguswg-ucsb May 1, 2024
59a8d0a
random cleanups
anguswg-ucsb May 2, 2024
d8742c1
added ogr2ogr commands to merge together all femas within a VPU into …
anguswg-ucsb May 6, 2024
4d4d057
work in progress on extending transects according to fema polygons
anguswg-ucsb May 6, 2024
89242b2
small changes to dir structure for final merged fema polygons
anguswg-ucsb May 7, 2024
c8a1ff1
one more round of dissolving and exploding
anguswg-ucsb May 8, 2024
b34d3b3
added code for splitting up transects to check within FEMA relations …
anguswg-ucsb May 10, 2024
75e93bd
continued work on generating extended transect lines from fema files
anguswg-ucsb May 13, 2024
6056c85
first cut of extensions across an entire vpu, added code to determine…
anguswg-ucsb May 14, 2024
1e51123
changed extensions to check that they only intersect a flowline only a…
anguswg-ucsb May 15, 2024
f9bf310
continuing improvement of transect extensions using FEMA
anguswg-ucsb May 16, 2024
3e58767
starting to cleanup scripts and make it so transects will try to be e…
anguswg-ucsb May 17, 2024
9e4405b
added a generic transect line checker that makes sure that a transect…
anguswg-ucsb May 21, 2024
568925b
slowly putting all the code into individual functions for extending t…
anguswg-ucsb May 22, 2024
248d540
small stuff
anguswg-ucsb May 23, 2024
f4f7832
random cleanups
anguswg-ucsb May 23, 2024
b81e477
small cleanups as im migrating fema functions to hydrofabric3D
anguswg-ucsb May 29, 2024
7914a45
updating cs_runner/01_transects.R to use FEMA VPU polygons to extend …
anguswg-ucsb Jun 13, 2024
1ba56d4
reworked final step in producing fema geometries to better resolve in…
anguswg-ucsb Jul 11, 2024
fcedf07
reworking processing of fema FGBs to improve transect extensions
anguswg-ucsb Jul 12, 2024
e1c6621
wip on removing holes but not losing many polygons
anguswg-ucsb Jul 12, 2024
42ad438
replaced preprocess fema with new version which better resolves inter…
anguswg-ucsb Jul 16, 2024
e7e8023
cleaned up and deleted old scripts, finalized FEMA polygon simplifica…
anguswg-ucsb Jul 17, 2024
5b449f7
moved variables for running partition_fema_by_vpu.R and renamed proce…
anguswg-ucsb Jul 17, 2024
c81675c
small cleanups
anguswg-ucsb Jul 18, 2024
64f5b15
small cleanups and set fema simplification to 1% but to keep all shapes
anguswg-ucsb Jul 18, 2024
16cd482
removed extra 00_fema.R file
anguswg-ucsb Jul 18, 2024
1b5fb76
updated gitignore
anguswg-ucsb Jul 18, 2024
72b3e88
updated final step for processing fema polygons to use mapshaper inst…
anguswg-ucsb Jul 19, 2024
8477da0
changes to fema processing steps, using mapshaper clean arg and remov…
anguswg-ucsb Jul 22, 2024
550c85c
testing out new fema simplification amounts...
anguswg-ucsb Jul 26, 2024
b99bbd9
small cleanups
anguswg-ucsb Jul 31, 2024
92fad96
huge overhaul of variable declarations and path variable creations for…
anguswg-ucsb Aug 2, 2024
6c6b7a5
removed duplicate S3_TRANSECTS_DIR variable declaration
anguswg-ucsb Aug 2, 2024
967f40b
updated 02_cs_pts.R to use new ID based hydrofabric3D updates
anguswg-ucsb Aug 26, 2024
1998c42
added crosswalk_id parameter to classify_points function in ml_inject…
anguswg-ucsb Aug 26, 2024
3ac563d
setup new independent file (domain_with_fema.R) for generating a fema i…
anguswg-ucsb Sep 25, 2024
f8ff3d4
updated new domain runner to use new ML inputs and added a loop for r…
anguswg-ucsb Sep 27, 2024
3b61de2
made a temp bathy.R file that ill delete tomorrow that was for runnin…
anguswg-ucsb Oct 1, 2024
7c8ad10
removed excess code from new_domain.R cs runner file
anguswg-ucsb Oct 8, 2024
3441b88
updated inject ml script to use new version of hydrofabric3D
anguswg-ucsb Oct 16, 2024
22cabf3
slowly moving all cs runner files and code over to a new folder for c…
anguswg-ucsb Nov 15, 2024
8c09498
building new cs_runner format in cs_runner2 to replace first version …
anguswg-ucsb Nov 18, 2024
83bb22c
finished 01_transects script and adding extensions script and more ut…
anguswg-ucsb Nov 20, 2024
2 changes: 1 addition & 1 deletion .gitignore
@@ -12,4 +12,4 @@ vignettes/tutorial
check
runners/secret
runners/data
in-progress
in-progress
266 changes: 179 additions & 87 deletions runners/cs_runner/01_transects.R
@@ -1,148 +1,240 @@
# Generate the flowlines layer for the final cross_sections_<VPU>.gpkg for each VPU
# source("runners/cs_runner/config.R")
source("runners/cs_runner/config.R")
source("runners/cs_runner/utils.R")

# # load libraries
# library(terrainSliceR)
# # # # load libraries
# library(hydrofabric3D)
# library(dplyr)
# library(sf)
# install.packages("devtools")

# name of S3 bucket
s3_bucket <- "s3://lynker-spatial/"

# transect bucket prefix
transects_prefix <- paste0(s3_bucket, "v20/3D/transects/")
# # transect bucket prefix
# S3_TRANSECTS_DIR <- paste0(LYNKER_SPATIAL_HF_S3_URI, VERSION, "/3D/transects/")

# paths to nextgen datasets and model attribute parquet files
nextgen_files <- list.files(nextgen_dir, full.names = FALSE)
model_attr_files <- list.files(model_attr_dir, full.names = FALSE)

# string to fill in "cs_source" column in output datasets
net_source <- "terrainSliceR"
NEXTGEN_FILES <- list.files(NEXTGEN_DIR, full.names = FALSE)
# model_attr_files <- list.files(MODEL_ATTR_DIR, full.names = FALSE)

# ensure the files are in the same order and matched up by VPU
path_df <- align_files_by_vpu(
x = nextgen_files,
y = model_attr_files,
base = base_dir
x = NEXTGEN_FILES,
y = NEXTGEN_FILES,
base = BASE_DIR
)

# loop over each VPU and generate cross sections, then save locally and upload to S3 bucket
for(i in 1:nrow(path_df)) {

# nextgen file and full path
nextgen_file <- path_df$x[i]
nextgen_path <- paste0(nextgen_dir, nextgen_file)
nextgen_path <- paste0(NEXTGEN_DIR, nextgen_file)

# model attributes file and full path
model_attr_file <- path_df$y[i]
model_attr_path <- paste0(model_attr_dir, model_attr_file)
vpu <- path_df$vpu[i]

message("Creating VPU ", path_df$vpu[i], " transects:\n - flowpaths: '", nextgen_file, "'\n - model attributes: '", model_attr_file, "'")
# Get FEMA by VPU directory and files for current VPU
fema_vpu_dir <- paste0(FEMA_VPU_SUBFOLDERS[grepl(paste0("VPU_", vpu), basename(FEMA_VPU_SUBFOLDERS))])
# fema_vpu_dir <- paste0(FEMA_VPU_SUBFOLDERS[grepl(paste0("VPU_", vpu), basename(FEMA_VPU_SUBFOLDERS))], "/merged")

vpu_fema_files <- list.files(fema_vpu_dir, full.names = TRUE)
vpu_fema_file <- vpu_fema_files[grepl(paste0(vpu, "_output.gpkg"), vpu_fema_files)]

message("Creating VPU ", vpu, " transects:",
"\n - flowpaths: '",
nextgen_file, "'",
"\n - FEMA polygons: '",
basename(vpu_fema_file), "'"
)

# message("Creating VPU ", path_df$vpu[i], " transects:\n - flowpaths: '", nextgen_file, "'\n - model attributes: '", model_attr_file, "'")
# sf::write_sf(
# dplyr::slice(dplyr::filter(flines, order == 2), 2),
# "/Users/anguswatters/Desktop/example_flowline.gpkg"
# )

# read in nextgen data
flines <- sf::read_sf(nextgen_path, layer = "flowpaths")

# model attributes
model_attrs <- arrow::read_parquet(model_attr_path)

# join flowlines with model attributes
flines <- dplyr::left_join(
flines,
dplyr::select(
model_attrs,
id, eTW
),
by = "id"
)

# calculate bankfull width
flines <-
flines %>%
dplyr::mutate(
bf_width = 11 * eTW
) %>%
dplyr::mutate( # if there are any NAs, use exp(0.700 + 0.365* log(tot_drainage_areasqkm)) equation to calculate bf_width
bf_width = dplyr::case_when(
is.na(bf_width) ~ exp(0.700 + 0.365* log(tot_drainage_areasqkm)),
TRUE ~ bf_width
)
) %>%
hydrofabric3D::add_powerlaw_bankful_width(
total_drainage_area_sqkm_col = "tot_drainage_areasqkm",
min_bf_width = 50
) %>%
dplyr::select(
hy_id = id,
lengthkm,
tot_drainage_areasqkm,
bf_width,
mainstem,
geometry = geom
)

# dplyr::mutate(
# bf_width = exp(0.700 + 0.365* log(tot_drainage_areasqkm))
# ) %>%
# dplyr::select(
# hy_id = id,
# lengthkm,
# tot_drainage_areasqkm,
# bf_width,
# mainstem,
# geometry = geom
# )

# flines$bf_width <- ifelse(is.na(flines$bf_width), exp(0.700 + 0.365* log(flines$tot_drainage_areasqkm)), flines$bf_width)

time1 <- Sys.time()
# system.time({
# create transect lines
transects <- terrainSliceR::cut_cross_sections(
net = flines, # flowlines network
id = "hy_id", # Unique feature ID
cs_widths = pmax(50, flines$bf_width), # cross section width of each "id" linestring ("hy_id")
num = 10, # number of cross sections per "id" linestring ("hy_id")
smooth = TRUE, # smooth lines
densify = 3, # densify linestring points
rm_self_intersect = TRUE, # remove self intersecting transects
fix_braids = FALSE, # whether to fix braided flowlines or not
#### Arguments used for when fix_braids = TRUE
# terminal_id = NULL,
# braid_threshold = NULL,
# version = 2,
# braid_method = "comid",
# precision = 1,
add = TRUE # whether to add back the original data
)
# })


# create transect lines
transects <- hydrofabric3D::cut_cross_sections(
net = flines, # flowlines network
crosswalk_id = "hy_id", # Unique feature ID
cs_widths = flines$bf_width, # cross section width of each "id" linestring ("hy_id")
# cs_widths = pmax(50, flines$bf_width * 11), # cross section width of each "id" linestring ("hy_id")
# cs_widths = pmax(50, flines$bf_width), # cross section width of each "id" linestring ("hy_id")
num = 10, # number of cross sections per "id" linestring ("hy_id")
smooth = TRUE, # smooth lines
densify = 3, # densify linestring points
rm_self_intersect = TRUE, # remove self intersecting transects
fix_braids = FALSE, # whether to fix braided flowlines or not
#### Arguments used for when fix_braids = TRUE # TODO: these methods need revision in hydrofabric3D to allow for more flexible processing for data that is NOT COMID based (i.e. hy_id)
# terminal_id = NULL,
# braid_threshold = NULL,
# version = 2,
# braid_method = "comid",
# precision = 1,
add = TRUE # whether to add back the original data
)

gc()

time2 <- Sys.time()
time_diff <- round(as.numeric(time2 - time1 ), 2)

message("\n\n ---> Transects processed in ", time_diff)

# name of file and path to save transects gpkg to
out_file <- paste0("nextgen_", path_df$vpu[i], "_transects.gpkg")
out_path <- paste0(transects_dir, out_file)

message("Saving transects to:\n - filepath: '", out_path, "'")
out_path <- paste0(TRANSECTS_DIR, out_file)

# add cs_source column and keep just the desired columns to save and upload to S3
transects <-
# add cs_source column and rename cs_widths to cs_lengthm
transects <-
transects %>%
dplyr::mutate(
cs_source = net_source
) %>%
cs_source = CS_SOURCE
)

# ---------------------------------------------------------------------
# --- Extend transects out to FEMA 100yr floodplains
# ---------------------------------------------------------------------
message("Reading in FEMA polygons...")

# fema polygons and transect lines
fema <- sf::read_sf(vpu_fema_file)

message("Simplifying FEMA polygons...")
message(" - Number of geoms BEFORE simplifying: ", nrow(fema))

# TODO: this should be a function argument OR removed; we probably shouldn't forcibly and silently simplify the input polygons without the user knowing
# keep 1% of the original points for speed
fema <- rmapshaper::ms_simplify(fema, keep_shapes = T, keep = 0.01, sys = TRUE, sys_mem = 16)
# fema <- rmapshaper::ms_simplify(fema, keep_shapes = T, keep = 0.1, sys = TRUE, sys_mem = 16)

message(" - Number of geoms AFTER simplifying: ", nrow(fema))
message("Extending transects out to FEMA 100yr floodplain polygon boundaries - (", Sys.time(), ")")

transects <-
transects %>%
dplyr::left_join(
dplyr::select(sf::st_drop_geometry(flines),
hy_id,
mainstem
),
by = "hy_id"
)

# TODO: make sure this 3000m extension distance is appropriate across VPUs
# TODO: also got to make sure that this will be feasible on memory on the larger VPUs...
transects <- hydrofabric3D::extend_transects_to_polygons(
transect_lines = transects,
polygons = fema,
flowlines = flines,
crosswalk_id = "hy_id",
grouping_id = "mainstem",
max_extension_distance = 3000
)

message("FEMA extensions complete! - ( ", Sys.time(), " )")

transects <- dplyr::select(transects, -tmp_id)
transects <- hydrofabric3D::add_tmp_id(transects)

transects <-
transects %>%
# dplyr::select(-cs_lengthm) %>%
# dplyr::mutate(is_fema_extended = left_is_extended | right_is_extended) %>%
dplyr::select(
hy_id,
hy_id,
cs_id,
cs_lengthm,
# cs_lengthm = new_cs_lengthm,
cs_source,
cs_id,
cs_measure,
cs_lengthm = cs_widths,
geometry
# is_extended,
# is_fema_extended,
# geometry = geom
)

# save flowlines to out_path (lynker-spatial/01_transects/transects_<VPU num>.gpkg)

gc()

# # ---------------------------------------------------------------------
message("Saving transects to:\n - filepath: '", out_path, "'")

# save transects with only columns to be uploaded to S3 (lynker-spatial/01_transects/transects_<VPU num>.gpkg)
sf::write_sf(
transects,
# save dataset with only subset of columns to upload to S3
dplyr::select(transects,
hy_id,
cs_source,
cs_id,
cs_measure,
cs_lengthm,
# sinuosity,
geometry
),
out_path
)
)

# command to copy transects geopackage to S3
if (!is.null(aws_profile)) {
copy_to_s3 <- paste0("aws s3 cp ", out_path, " ", transects_prefix, out_file,
ifelse(is.null(aws_profile), "", paste0(" --profile ", aws_profile)))
} else {
copy_to_s3 <- paste0("aws s3 cp ", out_path, " ", transects_prefix, out_file)
}
copy_to_s3 <- paste0("aws s3 cp ", out_path, " ", S3_TRANSECTS_DIR, out_file,
ifelse(is.null(AWS_PROFILE), "", paste0(" --profile ", AWS_PROFILE))
)


message("Copy VPU ", path_df$vpu[i], " transects to S3:\n - S3 copy command:\n'",
copy_to_s3,
"'\n==========================")

system(copy_to_s3, intern = TRUE)

message("Overwriting local copy of transects to include 'is_extended' column...\n==========================")

# Overwrite transects with additional columns for development purposes (is_extended) to have a local copy of dataset with information about extensions
sf::write_sf(
dplyr::select(
dplyr::mutate(transects, is_extended = FALSE),
hy_id,
cs_source,
cs_id,
cs_measure,
cs_lengthm,
# sinuosity,
is_extended,
geometry
),
out_path
)

rm(fema, transects, flines)
gc()
}
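
The bankfull width logic in this diff can be hard to follow because old and new lines are interleaved. As a rough illustration only, the base R sketch below applies the power-law expression that appears in the diff, `exp(0.700 + 0.365 * log(tot_drainage_areasqkm))`, and then enforces a minimum width of 50. The helper name `powerlaw_bf_width()` and the sample data are hypothetical, and it is an assumption (not confirmed by this diff) that `hydrofabric3D::add_powerlaw_bankful_width()` with `min_bf_width = 50` behaves the same way.

```r
# Hypothetical helper, NOT part of hydrofabric3D: estimates bankfull width
# from total drainage area using the power-law expression shown in the diff,
# then floors the result at a minimum width.
powerlaw_bf_width <- function(drainage_area_sqkm, min_bf_width = 50) {
  # power-law relation copied from the diff
  bf_width <- exp(0.700 + 0.365 * log(drainage_area_sqkm))

  # ASSUMPTION: min_bf_width acts as a lower bound on the estimate
  pmax(min_bf_width, bf_width)
}

# Made-up flowline attributes for illustration only
flines <- data.frame(
  hy_id                 = c("wb-1", "wb-2", "wb-3"),
  tot_drainage_areasqkm = c(1, 500, 250000)
)

flines$bf_width <- powerlaw_bf_width(flines$tot_drainage_areasqkm)

print(flines)
```

With the floor in place, small catchments all receive the minimum width and only very large drainage areas exceed it, which is worth keeping in mind when reading the `cs_widths = flines$bf_width` argument passed to `cut_cross_sections()` above.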