Code to rescue (scrape) data from the National Eutrophication Survey archival PDF.
Until magick can handle local adaptive thresholding, this package requires you to be able to call the ImageMagick convert command with system().
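One way to confirm this requirement before installing is to look for convert on your PATH. The sketch below is only an illustration: check_tool is a hypothetical helper, not part of nesR, and it assumes a POSIX shell.

```shell
# check_tool is a hypothetical helper, not part of nesR.
# It reports whether a command is on PATH and fails if it is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# convert is the ImageMagick command the package calls through system().
check_tool convert || echo "Install ImageMagick so that convert is on PATH." >&2
```

If the check reports convert as missing, calls made through system() from R will most likely fail in the same way.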
You can install nesR from GitHub with:
# install.packages("devtools")
devtools::install_github("jsta/nesR")
library(nesR)
nes_file <- system.file("extdata/national-eutrophication-survey_477.PDF",
package = "nesR")
res <- nes_get(nes_file, 89)
parse_nes(res)
#> Warning in read_ocr_dt(strsplit(nut_txt, " ")[[5]], section_name = "nuts"):
#> The following nuts positions may have bad OCR: 3
#> state name county storet_code lake_type
#> 1 NEVADA LAKE MEAD CLARK. NY; MOHAVE. Az 3201 IMPOUNDMENT
#> drainage_area surface_area mean_depth total_inflow retention_time
#> 1 434601.8 <NA> 59.1 377.34 3.5
#> alkalinity conductivity sechhi tp po4 tin tn p_pnt_source_muni
#> 1 136 815 5.9 0.016 0.005 0.34 0.55 322055
#> p_pnt_source_industrial p_pnt_source_septic p_nonpnt_source p_total
#> 1 <NA> <NA> 10 3370770
#> n_pnt_source_muni n_pnt_source_industrial n_pnt_source_septic
#> 1 <NA> 6426644444 375
#> n_nonpnt_source n_total p_total_out p_percent_retention
#> 1 <NA> 26880405 247325 93
#> p_surface_area_loading n_total_out n_percent_retention
#> 1 6.23 <NA> 55
#> n_surface_area_loading
#> 1 45.3
As written, building the NES database requires GNU Make and the ability to run R commands using the Rscript command-line utility (in other words, it does not work on Windows).
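A quick way to check both prerequisites is sketched below. is_gnu_make is a hypothetical helper, not part of this repository; it relies on GNU Make printing a version banner that begins with "GNU Make", and the gmake suggestion assumes a system where GNU Make is installed under that name.

```shell
# is_gnu_make is a hypothetical helper, not part of this repository.
# It succeeds only if the given command (default: make) reports itself as GNU Make.
is_gnu_make() {
  "${1:-make}" --version 2>/dev/null | head -n 1 | grep -q '^GNU Make'
}

is_gnu_make || echo "make is missing or is not GNU Make; try gmake." >&2
command -v Rscript >/dev/null 2>&1 || echo "Rscript is not on PATH." >&2
```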
make PDFSOURCE=474 all
make PDFSOURCE=475 all
make PDFSOURCE=476 all
make PDFSOURCE=477 all
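The four invocations above can also be written as a loop. This is a minimal sketch rather than part of the repository: build_all is a hypothetical wrapper, and the MAKE override is an assumption added for systems where GNU Make has a different name.

```shell
# build_all is a hypothetical wrapper, not part of this repository.
# It runs the "all" target once per archival PDF and stops at the first failure.
# Set MAKE to use a different make binary (for example MAKE=gmake).
build_all() {
  for src in 474 475 476 477; do
    "${MAKE:-make}" PDFSOURCE="$src" all || return 1
  done
}
```

Calling build_all from the directory that contains the Makefile should be equivalent to running the four commands in order.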